ARGUS — ArgusLabs

API Reference

Python API reference for ARGUS classes and methods.

ArgusWatcher

The main class for instrumenting and monitoring pipeline executions. Import from the top-level package:

python
from argus import ArgusWatcher

Constructor parameters

strict: bool

Raise an exception during finalize() if any detector fires.

Default: False

investigate: bool | "always"

Run forensic analysis on detected failures. "always" runs forensics even without detections.

Default: True

max_field_size: int

Maximum characters per captured state field before truncation.

Default: 50_000

redact_keys: list[str] | None

Field names to redact from traces. Supports glob patterns (e.g., "*.secret").

Default: None
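
To show how a glob such as "*.secret" selects field names, the sketch below applies Python's standard fnmatch module to dotted field paths. It illustrates glob semantics only: how ARGUS names nested fields and what it writes in place of a redacted value are not specified here, so the redact helper and the "[REDACTED]" placeholder are hypothetical.

```python
from fnmatch import fnmatchcase


def redact(fields: dict, patterns: list[str]) -> dict:
    """Return a copy of `fields` with the values of matching keys masked.

    Hypothetical helper for illustration; not part of the ARGUS API.
    """
    return {
        key: "[REDACTED]" if any(fnmatchcase(key, p) for p in patterns) else value
        for key, value in fields.items()
    }


# Assumed flat, dotted field names; ARGUS may name nested fields differently.
state = {"user.name": "ada", "user.secret": "hunter2", "api_key": "abc123"}
masked = redact(state, ["*.secret", "api_key"])
```

Here "*.secret" masks user.secret, the literal pattern "api_key" masks api_key, and user.name is left untouched.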

validators: dict | None

Custom validation functions keyed by field name. Each function receives the field's value and returns a bool: fn(value) → bool.

Default: None
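
A validators dict might look like the sketch below. The field names ("answer", "confidence") and the failing_fields helper are made up for illustration. The helper only mimics the fn(value) → bool contract locally and says nothing about how ARGUS itself applies validators.

```python
def non_empty_text(value) -> bool:
    # True only for strings with at least one non-whitespace character.
    return isinstance(value, str) and bool(value.strip())


def valid_score(value) -> bool:
    # True only for numbers in the closed interval [0, 1].
    return isinstance(value, (int, float)) and 0.0 <= value <= 1.0


# Keys are field names; values follow the fn(value) -> bool contract.
validators = {"answer": non_empty_text, "confidence": valid_score}


def failing_fields(state: dict, checks: dict) -> list[str]:
    """Hypothetical helper: names of fields in `state` whose validator returns False."""
    return [
        name
        for name, check in checks.items()
        if name in state and not check(state[name])
    ]
```

A dict like this would then be passed to the constructor as validators=validators, assuming the constructor parameters are accepted as keyword arguments.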

persist_state: bool

Save full state at each step for replay capability.

Default: True

record_http: bool

Record HTTP requests made during execution for mocked replay.

Default: False

semantic_judge: bool

Enable LLM-as-judge semantic evaluation.

Default: False

judge_model: str

Model to use for semantic judging.

Default: "gpt-4o"

Methods

.watch()

Instrument a graph for monitoring. Must be called before graph.compile().

python
watcher.watch(graph: StateGraph) -> None

Parameters

graph: StateGraph

The LangGraph StateGraph instance to instrument. ARGUS patches the graph's node callbacks to intercept execution data.

Call watch() before graph.compile(). If you compile first, ARGUS won't be able to instrument the nodes.
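
The ordering requirement can be sketched end to end as follows. This is a minimal illustration, not a canonical setup: the State schema and answer_node are invented for the example, the ArgusWatcher parameters are assumed to be keyword arguments, and both packages are imported inside the function so the snippet still loads where they are not installed.

```python
from typing import TypedDict


class State(TypedDict):
    question: str
    answer: str


def answer_node(state: State) -> dict:
    # Toy node: returns a partial state update, as LangGraph nodes do.
    return {"answer": state["question"].upper()}


def run_watched_pipeline(question: str):
    # Deferred imports: requires the argus and langgraph packages at call time.
    from argus import ArgusWatcher
    from langgraph.graph import END, StateGraph

    watcher = ArgusWatcher(strict=False)  # assumption: keyword arguments

    graph = StateGraph(State)
    graph.add_node("answer", answer_node)
    graph.set_entry_point("answer")
    graph.add_edge("answer", END)

    watcher.watch(graph)   # instrument first ...
    app = graph.compile()  # ... then compile

    app.invoke({"question": question, "answer": ""})
    return watcher.finalize()  # returns the completed Trace
```

Swapping the watch() and compile() lines is the mistake this section warns about.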

.finalize()

Complete the trace, run all detection layers, execute forensic analysis if needed, and persist results to storage.

python
watcher.finalize() -> Trace

Returns the completed Trace object. If strict=True and any detections fire, raises DetectionError after storing the trace.
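
Because the trace is stored before DetectionError is raised, a caller in strict mode may still want to inspect it. The sketch below is one possible pattern under two assumptions: that get_trace() remains usable after finalize() has raised, and that the caller passes in the DetectionError class, whose import path is not given in this reference. The finalize_safely name is hypothetical.

```python
def finalize_safely(watcher, detection_error):
    """Finalize the trace and return (trace, detections_raised).

    `detection_error` is the DetectionError class, passed in by the caller.
    Hypothetical helper for illustration; not part of the ARGUS API.
    """
    try:
        return watcher.finalize(), False
    except detection_error:
        # Assumption: the stored trace is still retrievable after the raise.
        return watcher.get_trace(), True
```

With strict=False the except branch should never run, since finalize() then returns the Trace even when detectors fire.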

.get_trace()

Retrieve the trace after finalization. Returns the same object as finalize().

python
trace = watcher.get_trace()  # returns a Trace

# Trace properties
trace.id                # str — unique trace identifier
trace.status            # "ok" | "warning" | "failed"
trace.duration_ms       # int — total execution time
trace.steps             # list[TraceStep]
trace.detections        # list[Detection]
trace.forensics         # Forensics | None
trace.summary           # str — human-readable summary

Data Models

python
# TraceStep — one node execution
class TraceStep:
    id: str
    step_number: int
    node_name: str
    input_state: dict
    output_state: dict
    duration_ms: int
    timestamp: datetime

# Detection — one detected issue
class Detection:
    id: str
    layer: str            # "statistical" | "semantic" | "behavioral" | "structural"
    severity: str         # "info" | "warning" | "critical"
    message: str
    details: dict
    step_id: str          # which step triggered this

# Forensics — root cause analysis
class Forensics:
    root_cause_step: str  # step ID of the root cause
    explanation: str      # human-readable explanation
    causal_chain: list    # ordered list of steps from cause to symptom
    detection_ids: list   # which detections this explains
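
As a sketch of how these models fit together, the helper below joins each critical Detection to the TraceStep named by its step_id. It relies only on fields listed above. The critical_steps name is hypothetical, and the demo objects are SimpleNamespace stand-ins rather than real ARGUS instances.

```python
from types import SimpleNamespace


def critical_steps(trace) -> list[tuple[str, str]]:
    """Return (node_name, message) for each critical detection, in detection order."""
    steps_by_id = {step.id: step for step in trace.steps}
    return [
        (steps_by_id[d.step_id].node_name, d.message)
        for d in trace.detections
        if d.severity == "critical" and d.step_id in steps_by_id
    ]


# Stand-in objects carrying only the documented fields used above.
demo_trace = SimpleNamespace(
    steps=[
        SimpleNamespace(id="s1", node_name="retrieve"),
        SimpleNamespace(id="s2", node_name="answer"),
    ],
    detections=[
        SimpleNamespace(step_id="s2", severity="critical", message="empty answer"),
        SimpleNamespace(step_id="s1", severity="info", message="slow retrieval"),
    ],
)
print(critical_steps(demo_trace))  # [('answer', 'empty answer')]
```

The same function should work on a Trace returned by finalize() or get_trace(), since it reads only steps, detections, step_id, severity, message, id and node_name.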

Type hints

All data models are fully typed with Python type hints. If you're using an IDE with type checking, you'll get autocomplete and type validation throughout.