Watchers | ARGUS Docs

Overview

ArgusWatcher is the object you interact with. It hooks into your graph, records every node's execution, runs detectors, and produces traces. Use one watcher per run.

Basic Usage

There are three ways to set up a watcher. Pick whichever fits your code:

python

from argus import ArgusWatcher

# Option A — pass graph to constructor (recommended)
watcher = ArgusWatcher(graph)
app = graph.compile()
result = app.invoke(initial_state)

# Option B — separate watch call
watcher = ArgusWatcher()
watcher.watch(graph)
app = graph.compile()
result = app.invoke(initial_state)

# Option C — after compile
watcher = ArgusWatcher()
app = graph.compile(checkpointer=memory)  # memory: a checkpointer you created earlier
app = watcher.watch_compiled(app)
result = app.invoke(initial_state)

Runs are saved automatically for linear and fan-out graphs. Cyclic graphs need a manual watcher.finalize() call.
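For cyclic graphs, one way to make sure the manual finalize() call is not skipped is to wrap the run in a small context manager. This is a minimal sketch: the `finalized` helper and the `FakeWatcher` stand-in are hypothetical names used only so the example runs without ARGUS installed. Whether finalize() is meant to be called after a failed run is not stated in these docs, so treat that part as an assumption.

```python
from contextlib import contextmanager


@contextmanager
def finalized(watcher):
    """Yield the watcher and call watcher.finalize() on exit, even on error."""
    try:
        yield watcher
    finally:
        watcher.finalize()


class FakeWatcher:
    """Stand-in for a watcher; it only counts finalize() calls."""

    def __init__(self):
        self.finalize_calls = 0

    def finalize(self):
        self.finalize_calls += 1


fake = FakeWatcher()
with finalized(fake):
    pass  # with a real watcher, app.invoke(initial_state) would go here

failing = FakeWatcher()
try:
    with finalized(failing):
        raise ValueError("node failed")
except ValueError:
    pass  # the error still propagates; finalize() ran first
```

With a real watcher, the body of the `with` block would hold the app.invoke(initial_state) call from the options above.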

Parameters

All parameters are optional. Values passed to the constructor override those from the config file and environment variables.

Core

graph: StateGraph

LangGraph graph to monitor. If passed, watch() is called automatically.

Default: None

max_field_size: int

Max characters per captured state field. Fields exceeding this get truncated.

Default: 50_000
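To make the truncation behavior concrete, here is a rough sketch of a per-field character cap. It is not ARGUS's implementation: `truncate_field` and `capture_state` are hypothetical helpers, and the sketch assumes that only string fields are cut and that no truncation marker is appended, neither of which is confirmed by these docs.

```python
def truncate_field(value, max_field_size=50_000):
    """Cut string values to at most max_field_size characters."""
    if isinstance(value, str) and len(value) > max_field_size:
        return value[:max_field_size]
    return value  # assumption: non-string values are left untouched


def capture_state(state, max_field_size=50_000):
    """Apply the cap to every field of a state dict."""
    return {
        key: truncate_field(value, max_field_size)
        for key, value in state.items()
    }


captured = capture_state({"doc": "x" * 120, "label": "yes"}, max_field_size=100)
print(len(captured["doc"]))  # 100
print(captured["label"])     # yes
```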

strict: bool

Raise an exception if any detector fires. Use in CI/CD to fail builds on quality regressions.

Default: False

investigate: bool | "always"

LLM root-cause investigation. True = on failure only, "always" = every node, False = off.

Default: True
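The three investigate modes can be read as a small decision rule. The sketch below is illustrative only: `should_investigate` is a hypothetical function, and what counts as a node "failure" is assumed to be decided elsewhere.

```python
def should_investigate(investigate, node_failed):
    """Decide whether to investigate a node under the three documented modes."""
    if investigate == "always":
        return True  # every node, failed or not
    return bool(investigate) and node_failed  # True: failures only; False: never


print(should_investigate(True, node_failed=True))       # True
print(should_investigate(True, node_failed=False))      # False
print(should_investigate("always", node_failed=False))  # True
print(should_investigate(False, node_failed=True))      # False
```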

Security

redact_keys: set[str]

Field names to redact from stored outputs (e.g. {"password", "api_key"}).

Default: None
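As an illustration of what key-based redaction might look like, the sketch below replaces the values of matching keys in a node output. The `redact` helper and the "[REDACTED]" placeholder are hypothetical, and the recursion into nested dicts and lists is an assumption; these docs do not say whether ARGUS matches nested keys.

```python
REDACTED = "[REDACTED]"  # hypothetical placeholder value


def redact(output, redact_keys):
    """Return a copy of output with values of matching keys replaced."""
    if isinstance(output, dict):
        return {
            key: REDACTED if key in redact_keys else redact(value, redact_keys)
            for key, value in output.items()
        }
    if isinstance(output, list):
        return [redact(item, redact_keys) for item in output]
    return output


raw = {
    "user": "ann",
    "api_key": "sk-123",
    "nested": {"token": "t", "items": [{"password": "p", "id": 1}]},
}
clean = redact(raw, {"api_key", "token", "password"})
print(clean["api_key"])  # [REDACTED]
print(clean["user"])     # ann
```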

validators: dict

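Per-node semantic validators, keyed by node name. Use "*" as the key to run a validator on every node. Each validator is a callable that takes the node's output and returns a (bool, str) tuple; in the example below, the bool is the pass condition and the string is the message reported when it fails.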

Default: None

python

# Validators — catch semantic failures
watcher = ArgusWatcher(graph, validators={
    "classify": lambda o: (o.get("label") in ["yes", "no"], "unexpected label"),
    "*":        lambda o: ("error" not in o, "error key present"),
})
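To show how such a dict could be evaluated, here is a minimal sketch that applies the node-specific validator and the "*" validator to one output and collects failure messages. `run_validators` is a hypothetical helper, not part of ARGUS, and the order in which validators run is an assumption.

```python
def run_validators(node_name, output, validators):
    """Return failure messages for one node's output."""
    failures = []
    for key in (node_name, "*"):  # node-specific validator first, then wildcard
        validator = validators.get(key)
        if validator is None:
            continue
        ok, message = validator(output)
        if not ok:
            failures.append(f"{node_name}: {message}")
    return failures


validators = {
    "classify": lambda o: (o.get("label") in ["yes", "no"], "unexpected label"),
    "*": lambda o: ("error" not in o, "error key present"),
}

bad = run_validators("classify", {"label": "maybe", "error": "timeout"}, validators)
good = run_validators("classify", {"label": "yes"}, validators)
print(bad)   # ['classify: unexpected label', 'classify: error key present']
print(good)  # []
```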

"*" runs on every node.

Replay & Eval

persist_state: bool

Save run records to .argus/runs/. Set to False for ephemeral monitoring.

Default: True

record_http: bool

Record all external HTTP/API calls for deterministic replay.

Default: True

semantic_judge: bool

LLM-powered quality judge on every node output. Requires OPENAI_API_KEY.

Default: False

judge_model: str

Model for the semantic judge and investigation.

Default: "gpt-4o"

python

# Full example with multiple options
watcher = ArgusWatcher(
    graph,
    semantic_judge=True,
    judge_model="gpt-4o-mini",
    strict=True,
    record_http=True,
    redact_keys={"api_key", "token"},
    validators={
        "summarize": lambda o: (len(o.get("summary", "")) > 10, "Summary too short"),
    },
)

Cost

semantic_judge sends node outputs to an LLM, which adds API cost proportional to the number of nodes. Use it in staging and CI rather than on every production run.
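One way to follow this advice is to derive the constructor options from the deployment environment. The sketch below is an assumption-laden example: `watcher_options` and the environment names are hypothetical, and only the `semantic_judge`, `judge_model`, and `strict` parameter names come from this page.

```python
def watcher_options(environment):
    """Build ArgusWatcher keyword arguments for a given environment name."""
    judged = environment in {"staging", "ci"}  # keep the judge out of production
    options = {
        "semantic_judge": judged,
        "strict": environment == "ci",  # fail builds only in CI
    }
    if judged:
        options["judge_model"] = "gpt-4o-mini"
    return options


print(watcher_options("ci"))
print(watcher_options("production"))
# With ARGUS installed: watcher = ArgusWatcher(graph, **watcher_options("ci"))
```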

Lifecycle

A Watcher goes through four phases:

Created — constructor called, parameters loaded
Watching — graph instrumented, ready for execution
Recording — pipeline running, capturing node data
Finalized — detectors run, forensics generated, trace stored

Single-use

A Watcher instance is single-use. After finalize, create a new Watcher for the next run. Calling watch() on a finalized Watcher raises WatcherStateError.
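The four phases and the single-use rule can be pictured as a small state machine. This is a sketch of the documented behavior, not ARGUS's code: `Phase`, `LifecycleSketch`, and `start_run` are hypothetical names, and the `WatcherStateError` class here is a local stand-in for the error named above.

```python
from enum import Enum


class Phase(Enum):
    CREATED = "created"
    WATCHING = "watching"
    RECORDING = "recording"
    FINALIZED = "finalized"


class WatcherStateError(RuntimeError):
    """Local stand-in for the error the docs mention."""


class LifecycleSketch:
    def __init__(self):
        self.phase = Phase.CREATED  # constructor called, parameters loaded

    def watch(self):
        if self.phase is Phase.FINALIZED:
            raise WatcherStateError("watcher is finalized; create a new one")
        self.phase = Phase.WATCHING  # graph instrumented

    def start_run(self):
        self.phase = Phase.RECORDING  # pipeline running

    def finalize(self):
        self.phase = Phase.FINALIZED  # detectors run, trace stored


first = LifecycleSketch()
first.watch()
first.start_run()
first.finalize()

try:
    first.watch()  # reusing a finalized watcher
    reuse_rejected = False
except WatcherStateError:
    reuse_rejected = True

second = LifecycleSketch()  # the next run gets a fresh instance
second.watch()
```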