Gate Pipeline
How every agent action is evaluated — the four gates in detail.
Every agent action passes through four gates in sequence. Evaluation is fail-fast: if any gate returns BLOCK or ESCALATE, the remaining evaluation gates are skipped. ESCALATE verdicts are then handled by Gate 4 (Tier Auth).
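The fail-fast behavior can be pictured with a minimal sketch. The `Verdict` enum, the `Gate` type, and `run_pipeline` are hypothetical names chosen for illustration, not part of the SDK, and the hand-off of ESCALATE verdicts to Gate 4 is deliberately left out.

```python
from enum import Enum
from typing import Callable, Iterable, Tuple


class Verdict(Enum):
    ALLOW = "ALLOW"
    BLOCK = "BLOCK"
    ESCALATE = "ESCALATE"


# Hypothetical gate signature: takes an action dict, returns a Verdict.
Gate = Callable[[dict], Verdict]


def run_pipeline(action: dict, gates: Iterable[Gate]) -> Tuple[Verdict, int]:
    """Run gates in order and stop at the first non-ALLOW verdict.

    Returns the verdict and the number of gates that were evaluated.
    """
    evaluated = 0
    for gate in gates:
        evaluated += 1
        verdict = gate(action)
        if verdict is not Verdict.ALLOW:
            return verdict, evaluated  # fail-fast: later gates never run
    return Verdict.ALLOW, evaluated
```

Ordering the cheap deterministic gate first means most forbidden actions are rejected before any embedding or LLM call is made.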
Gate 1 — Fast Reject
Type: Deterministic
Latency: < 1ms
Hardcoded blocks checked before any LLM or vector call. Includes:
- Actions on explicitly forbidden targets (configured at org level)
- Irreversible actions above a configured row threshold
- Actions matching forbidden action patterns (e.g. DROP TABLE, rm -rf)
Gate 1 returns BLOCK immediately if any rule matches. No LLM involvement.
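As an illustration of the three rule types, here is a rough sketch of a deterministic fast-reject check. The `FastRejectRules` structure, the action field names (`target`, `irreversible`, `rows_affected`, `command`), and the default threshold of 1000 rows are all assumptions made for this example, not the actual configuration schema.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class FastRejectRules:
    # All fields are hypothetical stand-ins for org-level configuration.
    forbidden_targets: Set[str] = field(default_factory=set)
    max_irreversible_rows: int = 1000  # illustrative threshold only
    forbidden_patterns: List["re.Pattern"] = field(default_factory=lambda: [
        re.compile(r"\bDROP\s+TABLE\b", re.IGNORECASE),
        re.compile(r"\brm\s+-rf\b"),
    ])


def fast_reject(action: dict, rules: FastRejectRules) -> Optional[str]:
    """Return a block reason if any rule matches, otherwise None."""
    if action.get("target") in rules.forbidden_targets:
        return "forbidden target"
    if action.get("irreversible") and action.get("rows_affected", 0) > rules.max_irreversible_rows:
        return "irreversible action above row threshold"
    command = action.get("command", "")
    if any(p.search(command) for p in rules.forbidden_patterns):
        return "forbidden action pattern"
    return None
```

Everything here is a set lookup, an integer comparison, or a precompiled regex match, which is consistent with a sub-millisecond budget.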
Gate 2 — Semantic Check
Type: Semantic + LLM
Latency: 5–50ms
Three-layer evaluation:
Layer 1 — Embedding similarity
The action (tool name + parameters) is embedded and compared against a library of known harmful patterns using cosine similarity.
- Score < 0.3: pass
- Score 0.3–0.7: continue to Layer 2
- Score > 0.7: BLOCK
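The thresholds above can be sketched as follows. This is a simplified illustration: embeddings are plain lists of floats, and taking the maximum similarity over the pattern library is an assumption, since the aggregation rule is not specified here. The function names and the "PASS" / "CONTINUE" labels are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def layer1_decision(
    action_vec: Sequence[float],
    harmful_vecs: List[Sequence[float]],
    low: float = 0.3,
    high: float = 0.7,
) -> Tuple[str, float]:
    """Map the best match against the harmful-pattern library to a Layer 1 outcome."""
    # Assumption: the score is the highest similarity to any known pattern.
    score = max((cosine_similarity(action_vec, h) for h in harmful_vecs), default=0.0)
    if score > high:
        return "BLOCK", score
    if score < low:
        return "PASS", score
    return "CONTINUE", score  # ambiguous band: hand off to Layer 2
```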
Layer 2 — Consequence scoring
Deterministic scoring based on:
- Action reversibility (irreversible = higher score)
- aggregateRowsAffected (cumulative across session)
- Action type classification (DATABASE_WRITE, FILE_DELETE, API_CALL, etc.)
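To make the three inputs concrete, here is a rough sketch of a deterministic consequence score. The weights, the `row_scale` normalizer, and the action field names are invented for illustration and are not the real scoring formula. The point of the sketch is that rows affected accumulate per session, so many small writes can score as high as one large write.

```python
from dataclasses import dataclass

# Hypothetical per-type base weights, for illustration only.
TYPE_WEIGHTS = {"DATABASE_WRITE": 0.3, "FILE_DELETE": 0.4, "API_CALL": 0.1}


@dataclass
class SessionState:
    aggregate_rows_affected: int = 0  # cumulative across the session


def consequence_score(action: dict, session: SessionState, row_scale: int = 10_000) -> float:
    """Return a score in [0, 1]; higher means more consequential."""
    # Update the session total first so this action counts toward its own score.
    session.aggregate_rows_affected += action.get("rows_affected", 0)

    score = TYPE_WEIGHTS.get(action["type"], 0.2)  # assumed default for unknown types
    if not action.get("reversible", True):
        score += 0.3  # irreversible actions score higher
    score += 0.3 * min(session.aggregate_rows_affected / row_scale, 1.0)
    return min(score, 1.0)
```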
Layer 3 — Causal LLM
For actions that pass Layer 1 and Layer 2, a causal LLM evaluates whether the action could violate any semantic constraint extracted from the agent's system prompt. It returns a confidence score and its reasoning.
Gate 2B — Topology Check (multi-agent)
Type: Deterministic
Applies to: Delegation actions only
When an orchestrator delegates to a sub-agent, Gate 2B checks:
- Is the sub-agent in the orchestrator's registered delegation whitelist?
- Is the orchestrator in the sub-agent's allowed-caller list?
Any topology violation returns BLOCK immediately. See Multi-Agent.
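The two checks are symmetric: both sides must have registered the relationship. A minimal sketch, assuming the whitelist and allowed-caller list are available as dictionaries keyed by agent ID (the function name and data shapes are hypothetical):

```python
from typing import Dict, Set


def topology_check(
    orchestrator: str,
    sub_agent: str,
    delegation_whitelist: Dict[str, Set[str]],
    allowed_callers: Dict[str, Set[str]],
) -> str:
    """Return "BLOCK" unless both sides of the delegation are registered."""
    # Check 1: the orchestrator is permitted to delegate to this sub-agent.
    if sub_agent not in delegation_whitelist.get(orchestrator, set()):
        return "BLOCK"
    # Check 2: the sub-agent accepts calls from this orchestrator.
    if orchestrator not in allowed_callers.get(sub_agent, set()):
        return "BLOCK"
    return "ALLOW"
```

Unknown agents default to an empty set, so an unregistered orchestrator or sub-agent is blocked rather than allowed.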
Gate 3 — Policy RAG
Type: Policy retrieval + LLM evaluation
Latency: 10–80ms
Retrieves the most relevant policies from the org's PolicyRAG vector store using the action as the query. Evaluates each retrieved policy against the action deterministically (for explicit conditions) or via LLM (for semantic policies).
Policy sources retrieved:
- Explicit policies with condition expressions (deterministic evaluation)
- Semantic policies from uploaded documents (LLM evaluation)
- Compliance framework policies (GDPR, HIPAA, SOC2)
Returns BLOCK, ESCALATE, or ALLOW with the matching policy and reasoning.
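The split between deterministic and LLM evaluation might look roughly like this. The sketch starts after retrieval, with the relevant policies already in hand. The `Policy` shape, the `llm_matches` callback, and the rule that the most severe matching verdict wins (BLOCK over ESCALATE over ALLOW) are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

SEVERITY = {"ALLOW": 0, "ESCALATE": 1, "BLOCK": 2}


@dataclass
class Policy:
    name: str
    verdict: str  # verdict to apply when the policy matches
    # Explicit policies carry a condition; semantic policies leave it as None.
    condition: Optional[Callable[[dict], bool]] = None


def evaluate_policies(
    action: dict,
    policies: List[Policy],
    llm_matches: Callable[[Policy, dict], bool],
) -> Tuple[str, Optional[str]]:
    """Return (verdict, name of the matching policy or None)."""
    result: Tuple[str, Optional[str]] = ("ALLOW", None)
    for policy in policies:
        if policy.condition is not None:
            matched = policy.condition(action)     # deterministic path
        else:
            matched = llm_matches(policy, action)  # semantic path (LLM call)
        if matched and SEVERITY[policy.verdict] > SEVERITY[result[0]]:
            result = (policy.verdict, policy.name)
    return result
```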
Gate 4 — Tier Auth
Type: RBAC + escalation
Applies to: ESCALATE verdicts from earlier gates
Checks the action's tier classification against the agent's registered tier:
| Tier | Risk level | Default behavior |
|---|---|---|
| T1 | Low / read-only | ALLOW |
| T2 | Medium / reversible write | ALLOW with logging |
| T3 | High / irreversible | ESCALATE |
| T4 | Critical | ESCALATE + require VP approval |
Human escalations are routed via Slack and dashboard. The agent SDK polls for the decision with a configurable timeout.
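The polling behavior can be sketched as a simple loop with a deadline. The function name, its parameters, and the choice to return `None` on timeout are assumptions. What the SDK actually does when the timeout expires is not described here.

```python
import time
from typing import Callable, Optional


def poll_for_decision(
    fetch_decision: Callable[[], Optional[str]],
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Poll until a reviewer decision arrives or the timeout elapses.

    fetch_decision is a hypothetical callback that returns the decision,
    or None while the escalation is still pending.
    """
    deadline = clock() + timeout_s
    while True:
        decision = fetch_decision()
        if decision is not None:
            return decision
        if clock() >= deadline:
            return None  # timed out; the caller decides how to proceed
        sleep(interval_s)
```

Using a monotonic clock keeps the timeout correct even if the system wall clock is adjusted while the agent is paused.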
Verdict types
| Verdict | Meaning | Agent behavior |
|---|---|---|
| ALLOW | Action authorized | Tool executes normally |
| BLOCK | Action forbidden | SDK returns block reason to agent |
| ESCALATE | Awaiting human review | SDK waits, agent pauses |
Shadow mode
In shadow mode, the full gate pipeline runs but the verdict returned to the
agent is always ALLOW. The real verdict is stored as shadowOutcome on the
ledger entry. Use this to observe what would be blocked before enforcing.
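A minimal sketch of the shadow-mode substitution, assuming the ledger entry is a mutable dictionary and the function name is hypothetical. Only the `shadowOutcome` field name comes from the description above.

```python
def apply_shadow_mode(real_verdict: str, ledger_entry: dict, shadow_mode: bool) -> str:
    """Return the verdict the agent should see, recording the real one if shadowed."""
    if shadow_mode:
        # The pipeline still ran in full; only the returned verdict is overridden.
        ledger_entry["shadowOutcome"] = real_verdict
        return "ALLOW"
    return real_verdict
```

For example, with shadow mode on, a BLOCK verdict is returned to the agent as ALLOW while the ledger entry records `shadowOutcome` as BLOCK.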