Gate Pipeline

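Every agent action passes through four numbered gates in sequence, with Gate 2B added for delegation actions in multi-agent setups. Evaluation is fail-fast: if any gate returns BLOCK or ESCALATE, later gates are not evaluated. The exception is Gate 4, which handles ESCALATE verdicts from earlier gates.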
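
The fail-fast behavior can be sketched as a loop that stops at the first non-ALLOW verdict. This is a minimal illustration, not the product's implementation: `Verdict`, `run_pipeline`, and the stand-in gate functions are hypothetical names, and the Gate 4 handling of ESCALATE verdicts is not modeled here.

```python
from enum import Enum
from typing import Callable, Iterable, Optional


class Verdict(Enum):
    ALLOW = "ALLOW"
    BLOCK = "BLOCK"
    ESCALATE = "ESCALATE"


# A gate takes an action (here a plain dict) and returns a Verdict.
Gate = Callable[[dict], Verdict]


def run_pipeline(
    action: dict, gates: Iterable[tuple[str, Gate]]
) -> tuple[Verdict, Optional[str]]:
    """Evaluate gates in order, stopping at the first non-ALLOW verdict.

    Returns the verdict and the name of the gate that produced it
    (None when every gate allowed the action).
    """
    for name, gate in gates:
        verdict = gate(action)
        if verdict is not Verdict.ALLOW:
            return verdict, name  # fail-fast: later gates never run
    return Verdict.ALLOW, None


# Stand-in gates that record whether they were called (hypothetical).
calls: list[str] = []


def allow_gate(action: dict) -> Verdict:
    calls.append("allow_gate")
    return Verdict.ALLOW


def block_gate(action: dict) -> Verdict:
    calls.append("block_gate")
    return Verdict.BLOCK


def never_reached(action: dict) -> Verdict:
    calls.append("never_reached")
    return Verdict.ALLOW


result = run_pipeline(
    {"tool": "sql.execute"},
    [("gate1", allow_gate), ("gate2", block_gate), ("gate3", never_reached)],
)
```

In this run the second gate blocks, so `never_reached` is not called and `result` carries the name of the gate that produced the verdict.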

Gate 1 — Fast Reject

Type: Deterministic
Latency: < 1ms

Hardcoded blocks checked before any LLM or vector call. Includes:

Actions on explicitly forbidden targets (configured at org level)
Irreversible actions above a configured row threshold
Actions matching forbidden action patterns (e.g. DROP TABLE, rm -rf)

Gate 1 returns BLOCK immediately if any rule matches. No LLM involvement.
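
As a rough sketch of what such deterministic rules might look like, the following checks the three rule families listed above. The config shape, field names (`target`, `irreversible`, `rows_affected`, `command`), and all default values are assumptions made for illustration; the source does not specify them.

```python
import re
from dataclasses import dataclass, field


@dataclass
class FastRejectConfig:
    # All values below are illustrative, not product defaults.
    forbidden_targets: set[str] = field(default_factory=lambda: {"prod.payments"})
    irreversible_row_threshold: int = 1000
    forbidden_patterns: list[re.Pattern] = field(
        default_factory=lambda: [
            re.compile(r"\bDROP\s+TABLE\b", re.IGNORECASE),
            re.compile(r"\brm\s+-rf\b"),
        ]
    )


def fast_reject(action: dict, cfg: FastRejectConfig) -> str:
    """Return "BLOCK" if any hardcoded rule matches, else "ALLOW"."""
    # Rule 1: explicitly forbidden targets.
    if action.get("target") in cfg.forbidden_targets:
        return "BLOCK"
    # Rule 2: irreversible actions above the row threshold.
    if action.get("irreversible") and action.get("rows_affected", 0) > cfg.irreversible_row_threshold:
        return "BLOCK"
    # Rule 3: forbidden action patterns in the command text.
    command = action.get("command", "")
    if any(p.search(command) for p in cfg.forbidden_patterns):
        return "BLOCK"
    return "ALLOW"
```

Because every rule is a set lookup, an integer comparison, or a precompiled regex match, a check like this involves no network or model call.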

Gate 2 — Semantic Check

Type: Semantic + LLM
Latency: 5–50ms

Three-layer evaluation:

Layer 1 — Embedding similarity

The action (tool name + parameters) is embedded and compared against a library of known harmful patterns using cosine similarity.

Score < 0.3: pass
Score 0.3–0.7: continue to Layer 2
Score > 0.7: BLOCK
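
The thresholds above can be illustrated with a small pure-Python sketch. It assumes the score is the highest cosine similarity against any vector in the pattern library, which the source does not state; the function names and the toy two-dimensional vectors are hypothetical, and a real deployment would use an embedding model and a vector store.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def layer1_decision(
    action_vec: list[float], harmful_library: list[list[float]]
) -> tuple[str, float]:
    """Map the best similarity score onto the three documented bands."""
    # Assumption: the score is the maximum similarity over the library.
    score = max(
        (cosine_similarity(action_vec, v) for v in harmful_library), default=0.0
    )
    if score > 0.7:
        return "BLOCK", score
    if score < 0.3:
        return "PASS", score
    return "CONTINUE", score  # 0.3 to 0.7 inclusive: hand off to Layer 2


library = [[1.0, 0.0]]  # toy "harmful pattern" embedding
decision_same = layer1_decision([1.0, 0.0], library)
decision_orthogonal = layer1_decision([0.0, 1.0], library)
decision_between = layer1_decision([1.0, 2.0], library)
```

Here the identical vector lands in the BLOCK band, the orthogonal one passes, and `[1.0, 2.0]` (similarity about 0.45) continues to the next layer.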

Layer 2 — Consequence scoring

Deterministic scoring based on:

Action reversibility (irreversible = higher score)
aggregateRowsAffected (cumulative across session)
Action type classification (DATABASE_WRITE, FILE_DELETE, API_CALL, etc.)
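
The source lists the inputs to consequence scoring but not how they are combined, so the following is only one plausible shape. Every weight, the log scaling of row counts, and the function name `consequence_score` are assumptions chosen for illustration.

```python
import math

# Hypothetical weights; the source names the inputs, not the formula.
ACTION_TYPE_WEIGHT = {
    "DATABASE_WRITE": 0.4,
    "FILE_DELETE": 0.6,
    "API_CALL": 0.2,
}
IRREVERSIBLE_PENALTY = 0.3


def consequence_score(
    action_type: str, reversible: bool, aggregate_rows_affected: int
) -> float:
    """Deterministic score in [0, 1]; higher means more consequential."""
    base = ACTION_TYPE_WEIGHT.get(action_type, 0.1)
    penalty = 0.0 if reversible else IRREVERSIBLE_PENALTY
    # Log scale so small and very large row counts stay distinguishable,
    # capped so volume alone cannot dominate the score.
    volume = min(math.log10(aggregate_rows_affected + 1) / 10.0, 0.3)
    return min(base + penalty + volume, 1.0)


# aggregateRowsAffected is cumulative, so the caller keeps a running total.
session_rows = 0
scores = []
for rows in (10, 500, 20000):
    session_rows += rows
    scores.append(consequence_score("DATABASE_WRITE", True, session_rows))
```

Because the row count accumulates across the session, the same kind of write scores higher later in a session than at the start, even if each individual action is small.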

Layer 3 — Causal LLM

For actions that pass Layer 1 and Layer 2, a causal LLM evaluates whether the action could violate any semantic constraint extracted from the agent's system prompt. It returns a confidence score and its reasoning.

Gate 2B — Topology Check (multi-agent)

Type: Deterministic
Applies to: Delegation actions only

When an orchestrator delegates to a sub-agent, Gate 2B checks:

Is the sub-agent in the orchestrator's registered delegation whitelist?
Is the orchestrator in the sub-agent's allowed-caller list?

Any topology violation returns BLOCK immediately. See Multi-Agent.
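
The two-sided check can be sketched with plain dictionaries. The registry shapes and the agent names below are hypothetical; the source does not describe how whitelists are stored.

```python
def topology_check(
    orchestrator: str,
    sub_agent: str,
    delegation_whitelist: dict[str, set[str]],
    allowed_callers: dict[str, set[str]],
) -> str:
    """Return "ALLOW" only if both sides of the delegation agree."""
    # Orchestrator side: is this sub-agent one it may delegate to?
    delegate_ok = sub_agent in delegation_whitelist.get(orchestrator, set())
    # Sub-agent side: is this orchestrator one it accepts calls from?
    caller_ok = orchestrator in allowed_callers.get(sub_agent, set())
    return "ALLOW" if delegate_ok and caller_ok else "BLOCK"


# Hypothetical registries for illustration.
whitelist = {"planner": {"sql-agent", "mail-agent"}}
callers = {"sql-agent": {"planner"}, "mail-agent": {"support-bot"}}

ok = topology_check("planner", "sql-agent", whitelist, callers)
one_sided = topology_check("planner", "mail-agent", whitelist, callers)
unknown = topology_check("rogue", "sql-agent", whitelist, callers)
```

Requiring agreement from both registries means a misconfigured or compromised orchestrator cannot grant itself access to a sub-agent just by editing its own whitelist.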

Gate 3 — Policy RAG

Type: Policy retrieval + LLM evaluation
Latency: 10–80ms

Retrieves the most relevant policies from the org's PolicyRAG vector store, using the action as the query. Each retrieved policy is then evaluated against the action: deterministically for policies with explicit conditions, or via LLM for semantic policies.

Policy sources retrieved:

Explicit policies with condition expressions (deterministic evaluation)
Semantic policies from uploaded documents (LLM evaluation)
Compliance framework policies (GDPR, HIPAA, SOC2)

Returns BLOCK, ESCALATE, or ALLOW with the matching policy and reasoning.
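
For the deterministic branch only, evaluation of explicit policies might look like the sketch below. Retrieval and LLM evaluation are left out, and the `ExplicitPolicy` shape, the example policies, and the rule that BLOCK outranks ESCALATE when several policies match are all assumptions not confirmed by the source.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ExplicitPolicy:
    name: str
    condition: Callable[[dict], bool]  # True when the policy applies to the action
    verdict: str                       # "BLOCK" or "ESCALATE"


# Assumed precedence when several policies match.
SEVERITY = {"ALLOW": 0, "ESCALATE": 1, "BLOCK": 2}


def evaluate_policies(
    action: dict, retrieved: list[ExplicitPolicy]
) -> tuple[str, Optional[str]]:
    """Return the most severe matching verdict and the policy that produced it."""
    verdict, matched = "ALLOW", None
    for policy in retrieved:
        if policy.condition(action) and SEVERITY[policy.verdict] > SEVERITY[verdict]:
            verdict, matched = policy.verdict, policy.name
    return verdict, matched


# Hypothetical policies standing in for what retrieval would return.
policies = [
    ExplicitPolicy("large-export-review", lambda a: a.get("rows", 0) > 10_000, "ESCALATE"),
    ExplicitPolicy("no-pii-export", lambda a: a.get("contains_pii", False), "BLOCK"),
]

small = evaluate_policies({"rows": 50}, policies)
large = evaluate_policies({"rows": 50_000}, policies)
pii = evaluate_policies({"rows": 50_000, "contains_pii": True}, policies)
```

Returning the policy name alongside the verdict mirrors the documented behavior of reporting the matching policy with the decision.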

Gate 4 — Tier Auth

Type: RBAC + escalation
Applies to: ESCALATE verdicts from earlier gates

Checks the action's tier classification against the agent's registered tier:

Tier	Risk level	Default behavior
T1	Low / read-only	ALLOW
T2	Medium / reversible write	ALLOW with logging
T3	High / irreversible	ESCALATE
T4	Critical	ESCALATE + require VP approval

Human escalations are routed via Slack and the dashboard. The agent SDK polls for the decision with a configurable timeout.
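
The tier defaults and the polling loop can be sketched as follows. This is not the SDK's API: `poll_for_decision`, its parameters, and the choice to return `None` on timeout are assumptions, since the source does not say what the SDK does when the timeout expires.

```python
import time
from typing import Callable, Optional

# Default behavior per tier, taken from the table above.
TIER_DEFAULTS = {
    "T1": "ALLOW",
    "T2": "ALLOW",     # with logging
    "T3": "ESCALATE",
    "T4": "ESCALATE",  # plus VP approval
}


def poll_for_decision(
    fetch: Callable[[], Optional[str]],
    timeout_s: float,
    interval_s: float = 1.0,
) -> Optional[str]:
    """Call fetch() until it returns a decision or the timeout elapses.

    fetch is a hypothetical callable returning None while review is pending.
    Returns None on timeout (an assumption; real timeout handling may differ).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        decision = fetch()
        if decision is not None:
            return decision
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval_s)


# Simulated reviewer: pending twice, then approves.
responses = iter([None, None, "ALLOW"])
approved = poll_for_decision(lambda: next(responses), timeout_s=5, interval_s=0)
timed_out = poll_for_decision(lambda: None, timeout_s=0, interval_s=0)
```

`time.monotonic()` is used for the deadline so that system clock adjustments during a long human review do not shorten or extend the wait.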

Verdict types

Verdict	Meaning	Agent behavior
`ALLOW`	Action authorized	Tool executes normally
`BLOCK`	Action forbidden	SDK returns block reason to agent
`ESCALATE`	Awaiting human review	SDK waits, agent pauses

Shadow mode

In shadow mode, the full gate pipeline runs but the verdict returned to the agent is always ALLOW. The real verdict is stored as shadowOutcome on the ledger entry. Use this to observe what would be blocked before enforcing.
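
A minimal sketch of the shadow-mode wrapper follows, assuming the ledger entry is a mutable dict. Only the `shadowOutcome` field name comes from the text above; the function name and the other ledger fields are hypothetical.

```python
def apply_shadow_mode(real_verdict: str, shadow: bool, ledger_entry: dict) -> str:
    """Return the verdict the agent sees, recording the real one in shadow mode."""
    if shadow:
        ledger_entry["shadowOutcome"] = real_verdict  # what would have happened
        return "ALLOW"                                # what the agent is told
    return real_verdict


entry = {"action": "sql.execute"}
seen_by_agent = apply_shadow_mode("BLOCK", shadow=True, ledger_entry=entry)
enforced = apply_shadow_mode("BLOCK", shadow=False, ledger_entry={})
```

Querying ledger entries whose `shadowOutcome` is not ALLOW would then show which actions enforcement would have stopped.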
