ARGUS — ArgusLabs

Core Concepts

Understand Watchers, Detectors, Traces, and Forensics — the four primitives of ARGUS.

Watchers

A Watcher is the core instrumentation primitive. It attaches to your pipeline graph and records everything that happens during execution — node inputs, outputs, state transitions, timing, and tool calls.

The Watcher doesn't modify your pipeline's behavior. It's a passive observer that hooks into execution callbacks. Your pipeline runs exactly as it would without ARGUS — the Watcher just records what happened.
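As a rough illustration of what "passive observer" means here, the sketch below wraps a plain Python callable and records its input, output, and timing while passing the result through untouched. `RecordingObserver` and `observe` are hypothetical names invented for this example. It does not show how ArgusWatcher hooks into a graph internally.

```python
import time
from typing import Any, Callable


class RecordingObserver:
    """Hypothetical stand-in for a passive observer (not the ArgusWatcher implementation)."""

    def __init__(self) -> None:
        self.events: list[dict[str, Any]] = []

    def observe(self, name: str, fn: Callable[[Any], Any]) -> Callable[[Any], Any]:
        """Return a wrapper that records each call to fn and returns fn's result unchanged."""

        def wrapped(state: Any) -> Any:
            start = time.perf_counter()
            result = fn(state)  # the node runs exactly as it would unobserved
            self.events.append({
                "node": name,
                "input": state,
                "output": result,
                "seconds": time.perf_counter() - start,
            })
            return result  # passed through without modification

        return wrapped


def double(x: int) -> int:
    return x * 2


observer = RecordingObserver()
observed_double = observer.observe("double", double)
```

Calling `observed_double(21)` returns the same value as `double(21)`, and the only side effect is one new entry in `observer.events`.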

python
from argus import ArgusWatcher

watcher = ArgusWatcher(
    strict=False,           # don't halt on detection
    investigate=True,       # run root cause analysis
    persist_state=True,     # save state for replay
)
watcher.watch(graph)

Each Watcher instance tracks one execution run. For concurrent pipelines, create separate Watcher instances per run.
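The one-instance-per-run rule can be pictured with a small stand-in. In the sketch below, `RunRecorder` and `run_pipeline` are hypothetical names that take the place of a Watcher and a pipeline. The point is only that the recorder is created inside each run, so concurrent runs never share recording state.

```python
from concurrent.futures import ThreadPoolExecutor


class RunRecorder:
    """Hypothetical per-run recorder standing in for a Watcher instance."""

    def __init__(self, run_id: str) -> None:
        self.run_id = run_id
        self.steps: list[str] = []

    def record(self, step: str) -> None:
        self.steps.append(step)


def run_pipeline(run_id: str) -> RunRecorder:
    recorder = RunRecorder(run_id)  # created inside the run, never shared across runs
    for step in ("load", "transform", "respond"):
        recorder.record(f"{run_id}:{step}")
    return recorder


# Three concurrent runs, each with its own recorder.
with ThreadPoolExecutor(max_workers=3) as pool:
    recorders = list(pool.map(run_pipeline, ["run-a", "run-b", "run-c"]))
```

Each recorder in `recorders` holds only the steps of its own run, which is the property a shared instance could not guarantee.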

Detectors

Detectors are the analysis engines that examine a trace after execution. ARGUS runs four detection layers, each looking for different categories of failure:

  1. Statistical: anomalies in timing, output length, token counts, and other numerical signals
  2. Semantic: meaning drift, relevance loss, and hallucination patterns, detected using embedding similarity and LLM-as-judge
  3. Behavioral: unexpected node transitions, infinite loops, skipped steps, and state corruption
  4. Structural: schema violations, missing required fields, type mismatches, and contract breaches

Detectors run automatically when you call watcher.finalize(). You can configure sensitivity thresholds and enable/disable individual layers through the configuration.
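To make the idea of layers, thresholds, and per-layer toggles concrete, here is a minimal sketch with two toy layers: a z-score check on step duration (statistical) and a required-field check (structural). All names (`statistical_layer`, `structural_layer`, `run_detectors`, `z_threshold`) are assumptions for illustration. They are not ARGUS configuration keys, and the real detectors are presumably far more involved.

```python
from statistics import mean, stdev
from typing import Callable

Step = dict  # assumed shape: {"node": str, "seconds": float, "output": dict}


def statistical_layer(steps: list[Step], z_threshold: float = 2.0) -> list[str]:
    """Flag steps whose duration lies more than z_threshold sample standard deviations above the mean."""
    durations = [s["seconds"] for s in steps]
    if len(durations) < 3:
        return []  # too few samples for a meaningful spread
    mu, sigma = mean(durations), stdev(durations)
    if sigma == 0:
        return []
    return [f"slow step: {s['node']}" for s in steps
            if (s["seconds"] - mu) / sigma > z_threshold]


def structural_layer(steps: list[Step], required: tuple = ("answer",)) -> list[str]:
    """Flag steps whose output is missing a required field."""
    return [f"missing field '{name}' in {s['node']}"
            for s in steps for name in required if name not in s["output"]]


LAYERS: dict[str, Callable[[list[Step]], list[str]]] = {
    "statistical": statistical_layer,
    "structural": structural_layer,
}


def run_detectors(steps: list[Step], enabled: set[str]) -> dict[str, list[str]]:
    """Run only the enabled layers and collect their findings by layer name."""
    return {name: layer(steps) for name, layer in LAYERS.items() if name in enabled}


# Nine ordinary steps plus one slow step with an empty output.
steps = [{"node": f"n{i}", "seconds": 0.1, "output": {"answer": "ok"}} for i in range(9)]
steps.append({"node": "n9", "seconds": 5.0, "output": {}})
findings = run_detectors(steps, enabled={"statistical", "structural"})
```

In this toy setup, lowering `z_threshold` makes the statistical layer more sensitive, and removing a name from `enabled` switches that layer off.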

Traces

A Trace is the complete record of a single pipeline execution. It contains:

  • Every node that executed, in order
  • Input and output state at each node
  • Wall-clock and CPU timing per step
  • Tool calls and their results
  • Detection results from all four layers
  • Forensic analysis if failures were detected

bash
# View the latest trace
argus trace --last

# View a specific trace by ID
argus trace abc123

# List all traces
argus trace --list

Traces are stored locally in SQLite by default. See Storage for details on schema and export options.
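One way to picture the contents of a Trace listed earlier in this section is as a pair of dataclasses. The field and class names below (`StepRecord`, `TraceRecord`, `wall_seconds`, and so on) are hypothetical and chosen for readability. They are not the stored schema, which is covered on the Storage page.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class StepRecord:
    """One executed node: its state before and after, timing, and tool calls."""
    node: str
    input_state: dict[str, Any]
    output_state: dict[str, Any]
    wall_seconds: float
    cpu_seconds: float
    tool_calls: list[dict[str, Any]] = field(default_factory=list)


@dataclass
class TraceRecord:
    """Illustrative model of a single pipeline execution."""
    trace_id: str
    steps: list[StepRecord] = field(default_factory=list)  # in execution order
    detections: dict[str, list[str]] = field(default_factory=dict)  # keyed by layer name
    forensics: Optional[dict[str, Any]] = None  # set only if failures were detected

    def node_order(self) -> list[str]:
        return [step.node for step in self.steps]


trace = TraceRecord(
    trace_id="abc123",
    steps=[
        StepRecord("retrieve", {"query": "q"}, {"docs": ["d1"]}, 0.12, 0.03),
        StepRecord("answer", {"docs": ["d1"]}, {"text": "..."}, 0.80, 0.05),
    ],
)
```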

Forensics

When detectors flag a failure, the Forensics engine kicks in. It traces the failure backward through the execution graph to find the root cause — which node, which input, which state transition caused the downstream degradation.

Forensic analysis answers three questions:

  • What failed? — the specific detection that fired and what it found
  • Where did it fail? — the node and step in the execution graph
  • Why did it fail? — the causal chain from the root cause to the observed symptom
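The backward walk can be sketched for the simplest case, a linear trace. Starting from the step where a detection fired, the function below moves backward while the preceding output still looks unhealthy, then reports the what, where, and why. `trace_back` and `is_healthy` are assumptions made for this example. A real execution graph may branch, and this is not a description of the Forensics engine's actual algorithm.

```python
from typing import Callable


def trace_back(steps: list[dict], failed_index: int,
               is_healthy: Callable[[dict], bool]) -> dict:
    """Find the earliest step in the unbroken run of unhealthy outputs ending at failed_index."""
    root = failed_index
    while root > 0 and not is_healthy(steps[root - 1]["output"]):
        root -= 1
    return {
        "what": f"unhealthy output at {steps[failed_index]['node']}",
        "where": steps[root]["node"],
        "why": [s["node"] for s in steps[root:failed_index + 1]],  # root cause to symptom
    }


# Toy trace: retrieval returns no documents, and every later step inherits the problem.
steps = [
    {"node": "parse", "output": {"docs": ["seed"]}},
    {"node": "retrieve", "output": {"docs": []}},
    {"node": "rank", "output": {"docs": []}},
    {"node": "answer", "output": {"docs": []}},
]
report = trace_back(steps, failed_index=3, is_healthy=lambda out: bool(out.get("docs")))
```

Here the symptom appears at `answer`, but the walk attributes the root cause to `retrieve`, the first step whose output went bad.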

Investigate mode

Forensics runs only when investigate=True (the default), and then only after a detection fires. Set investigate="always" to run forensic analysis even when no detections fire, which is useful for catching near-misses.
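Read as a decision rule, the investigate setting behaves roughly like the function below. `should_run_forensics` is a hypothetical helper written for this explanation, and it assumes that investigate=False disables forensics entirely, which the text implies but does not state outright.

```python
from typing import Union


def should_run_forensics(investigate: Union[bool, str], detections_fired: bool) -> bool:
    """Sketch of the investigate modes: "always", True (the default), or False."""
    if investigate == "always":
        return True  # run even when nothing fired, to surface near-misses
    if investigate is True:
        return detections_fired  # default: only after a detection
    return False  # assumption: False turns forensics off
```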

How They Connect

Figure: Watcher → Trace → Detectors → Forensics, the ARGUS pipeline. The Watcher instruments the pipeline, the Trace captures execution, Detectors analyze the trace, and Forensics explains failures.

The flow is linear and deterministic:

  1. You create a Watcher and attach it to your graph
  2. Your pipeline runs normally — the Watcher records a Trace
  3. You call finalize(), and the Detectors analyze the trace
  4. If failures are found, Forensics traces back to the root cause
  5. You view results via CLI, UI, or programmatic API