Human in the loop
Human in the loop (HITL) is the policy engine outcome where a rule’s on_fail is escalate. Instead of rejecting or approving the mandate at the edge, Sill holds the transaction: it persists the verified draft, returns a 202 with an escalation_id, and lists the held order on the dashboard review queue. No payment authorization, no connector mutation, and no settled audit row occur until a human reviewer with the Owner or Admin role resolves it. On approve, the held draft re-enters the same settlement consumer a normal approved mandate uses; on reject or timeout, a signed escalated_rejected audit record is written and no charge ever runs.
The flow
```mermaid
sequenceDiagram
    autonumber
    participant Agent
    participant Edge as Sill edge
    participant Origin as Sill origin
    participant Dash as Dashboard
    participant Rail as Stripe
    Agent->>Edge: POST /v1/m/:site_key/mandate (signed)
    Edge->>Edge: verify, evaluate policy (escalate)
    Edge->>Origin: enqueue sanitized draft
    Origin-->>Edge: escalation_id
    Edge-->>Agent: 202 escalated, returns escalation_id
    Note over Origin,Rail: Held. No charge. No connector call.
    Dash->>Origin: GET /v1/escalations?status=pending
    Origin-->>Dash: queue items (non-PII summary)
    alt Approve
        Dash->>Origin: POST /v1/escalations/:id/resolve (approve)
        Origin->>Rail: authorize charge
        Origin->>Origin: signed audit record (escalated_approved)
    else Reject or Timeout
        Dash->>Origin: POST /v1/escalations/:id/resolve (reject)
        Origin->>Origin: signed audit record (escalated_rejected)
    end
```
When a rule escalates
A rule lives inside the active policy for a site. Each rule has a category, a DSL expression, and an action_on_match of either reject or escalate. The shipped HITL rule is r07 (HITL on destructive actions); other rules, for example a spend ceiling such as r05, can also be configured to escalate rather than reject. See policy engine for the rule catalog and the evaluation order.
When a rule with action_on_match: 'escalate' matches a verified mandate, the edge:
- Mints an escalation_id of the form esc_….
- Captures the sanitized transactional draft the approve path would have enqueued: the policy-evaluated payload, with the transit-only buyer block stripped and AEAD-encrypted into a separate vault frame (buyer PII never sits cleartext at rest).
- Persists the escalation row through an account-scoped, RLS-isolated insert at the origin.
- Returns HTTP 202 with { outcome: "escalated", escalation_id, evaluated_rule_id }.
No payment is authorized. No connector mutation runs. The held transaction is durable and idempotent — a re-submitted equivalent mandate returns the existing handle rather than minting a new one.
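The hold-and-return behaviour can be sketched roughly as below. This is a minimal illustration, not Sill's implementation: the in-memory `_held` dict stands in for the origin's escalation table, the function names are hypothetical, and keying idempotency on (site_id, mandate_id) is an assumption about what "equivalent mandate" means.

```python
import secrets

# Hypothetical in-memory stand-in for the origin's escalation table.
_held: dict[tuple[str, str], str] = {}


def mint_escalation_id() -> str:
    """Return an opaque handle: 'esc_' followed by 26 lowercase hex characters."""
    return "esc_" + secrets.token_hex(13)  # 13 random bytes -> 26 hex chars


def hold_mandate(site_id: str, mandate_id: str, rule_id: str) -> dict:
    """Hold a verified mandate and build the 202 body.

    A re-submitted equivalent mandate gets the existing handle back
    instead of a newly minted one (assumed key: site_id + mandate_id).
    """
    key = (site_id, mandate_id)
    if key not in _held:
        _held[key] = mint_escalation_id()
    return {
        "outcome": "escalated",
        "escalation_id": _held[key],
        "evaluated_rule_id": rule_id,
    }
```

Nothing in this sketch touches a payment rail or a connector, which mirrors the guarantee above: holding only records the draft and hands back a handle.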
Example: holding a mandate
```sh
curl -sS -i -X POST "https://edge.sill.so/v1/m/$SITE_KEY/mandate" \
  -H 'content-type: application/json' \
  --data-binary @/tmp/mandate.json
```

```http
HTTP/2 202
content-type: application/json

{
  "outcome": "escalated",
  "mandate_id": "mnd_01K8YQ7LZQ7K6QB9TXV0F8Y2J1",
  "escalation_id": "esc_7a1c3f8b2e9d6a4c5b8e0f1a2c",
  "evaluated_at": "2026-06-22T14:21:08.412Z",
  "evaluated_rule_id": "r05_escalate_smoke"
}
```

The escalation_id is an opaque, prefixed handle (esc_ followed by 26 lowercase hex characters). It is the only identifier the resolve API accepts. (Values shown are illustrative.)
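On the agent side, a client might classify the response along these lines. This is a sketch under assumptions: `parse_mandate_response` and its return shape are hypothetical, and only the 202 status, the `outcome` field, and the documented id format come from the text above.

```python
import json
import re

# Documented shape: "esc_" followed by 26 lowercase hex characters.
ESCALATION_ID_RE = re.compile(r"esc_[0-9a-f]{26}")


def parse_mandate_response(status: int, body: str) -> dict:
    """Report whether the mandate was held, and under which handle."""
    payload = json.loads(body)
    if status == 202 and payload.get("outcome") == "escalated":
        escalation_id = payload.get("escalation_id", "")
        if not ESCALATION_ID_RE.fullmatch(escalation_id):
            raise ValueError(f"unexpected escalation_id: {escalation_id!r}")
        return {
            "held": True,
            "escalation_id": escalation_id,
            "rule": payload.get("evaluated_rule_id"),
        }
    # Any other status or outcome is not a hold; handle it elsewhere.
    return {"held": False, "escalation_id": None, "rule": None}
```

Treating the handle as opaque beyond this format check keeps the client from depending on how the id is generated.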
The review queue
The dashboard’s review queue lists pending escalations for the active account, scoped per-site, with role-gated actions:
- Owner / Admin — see the queue and resolve.
- Reviewer — see the queue, read-only.
- Viewer — no access to the queue.
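The role gating in this list amounts to a small permission table. A minimal sketch, assuming hypothetical action names `list` and `resolve` and lowercase role keys; unknown roles are denied, in keeping with the fail-closed posture described on this page:

```python
# Role -> allowed queue actions, transcribed from the list above.
PERMISSIONS: dict[str, set[str]] = {
    "owner": {"list", "resolve"},
    "admin": {"list", "resolve"},
    "reviewer": {"list"},
    "viewer": set(),
}


def can(role: str, action: str) -> bool:
    """Return True only if the role is known and explicitly grants the action."""
    return action in PERMISSIONS.get(role.lower(), set())
```

A check like this belongs on the server; a UI that merely hides the resolve button does not enforce anything.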
Each row carries a deliberately minimal non-PII summary: action, merchant, SKU or line items, currency, the agent-signed max_amount cap, the quoted_total, and the rule id that escalated. No buyer email, phone, or shipping address is in the summary. Buyer detail can be revealed through a separate, access-logged endpoint that decrypts the vaulted frame and writes a durable witness row to the admin audit log before the response leaves the server.
Pending escalations as the Owner sees them in the dashboard: each row shows the rule that escalated, the held order summary, the timeout clock, and the resolve actions.
Listing pending escalations
```sh
curl -sS "https://api.sill.so/v1/escalations?status=pending" \
  -H "cookie: $SILL_SESSION"
```

```json
{
  "items": [
    {
      "escalation_id": "esc_7a1c3f8b2e9d6a4c5b8e0f1a2c",
      "mandate_id": "mnd_01K8YQ7LZQ7K6QB9TXV0F8Y2J1",
      "rule_id": "r05_escalate_smoke",
      "status": "pending",
      "created_at": "2026-06-22T14:21:08.412Z",
      "timeout_at": "2026-06-22T15:21:08.412Z",
      "summary": {
        "action": "place_order",
        "merchant": "SmokeTest",
        "sku": "sku_smoke",
        "max_amount": 2.00,
        "quoted_total": 0.50,
        "currency": "USD",
        "failed_rule_id": "r05_escalate_smoke"
      }
    }
  ]
}
```

Pre-flight quote for Shopify carts
For escalations on a Shopify rail, the dashboard can request a projected settlement: Shopify-computed tax and shipping, produced by the same calculate call the rail runs at settlement, with zero mutations. The response includes within_ceiling; if it is false, an approval would fail the pre-create ceiling check (amount_exceeds_mandate). Non-Shopify rails return unavailable: rail_not_shopify, and the dashboard falls back to the held summary.
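A reviewer-facing client could branch on the quote response roughly as follows. The response shape is an assumption: only the `within_ceiling` and `unavailable: rail_not_shopify` fields are named above, and `review_hint` and its return labels are hypothetical.

```python
def review_hint(quote: dict) -> str:
    """Map a pre-flight quote response to what the review screen should show."""
    if quote.get("unavailable") == "rail_not_shopify":
        # No projection for this rail: fall back to the held summary.
        return "show_held_summary"
    if quote.get("within_ceiling") is False:
        # An approve would fail the pre-create ceiling check.
        return "warn_amount_exceeds_mandate"
    if quote.get("within_ceiling") is True:
        return "show_projected_settlement"
    # Unrecognised shape: do not guess, show only what is already held.
    return "show_held_summary"
```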
Resolution: approve, reject, timeout
A pending escalation has exactly three exit paths. Every exit writes exactly one signed audit record linked back to the escalation by id.
Approve
```sh
curl -sS -X POST "https://api.sill.so/v1/escalations/$ESC_ID/resolve" \
  -H "cookie: $SILL_SESSION" \
  -H 'content-type: application/json' \
  -d '{"decision":"approve"}'
```

```json
{
  "escalation_id": "esc_7a1c3f8b2e9d6a4c5b8e0f1a2c",
  "status": "approved",
  "decision": "escalated_approved"
}
```

On approve the resolver:
- Atomically flips the escalation row pending → approved (under FOR UPDATE; a concurrent resolver gets a 409 conflict).
- Decrypts the vaulted buyer frame (AAD bound to the row’s escalation_id) and re-attaches the buyer block to the draft.
- Enqueues the draft tagged escalation_resolution: { escalation_id, decision: 'escalated_approved', resolved_by }.
- Hands off to the settlement consumer, which authorizes the charge through the existing rail dispatcher (the same code path a normal approved mandate uses) and writes a single audit record with decision = 'escalated_approved', Merkle-chained and linked by escalation_id.
If the connected site is in Stripe live mode, an approve charges a real card for real money. The dashboard surfaces the mode prominently on the resolve action.
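The flip-then-enqueue part of the approve path can be sketched as below. This is an illustration under assumptions: `row` is a plain dict standing in for the locked database row, `enqueue` is a hypothetical queue producer, the buyer decrypt step is omitted, and the manual rollback stands in for what a database transaction would do.

```python
from typing import Callable


class Conflict(Exception):
    """Corresponds to the 409 conflict response."""


class SettlementEnqueueFailed(Exception):
    """Corresponds to the 503 settlement_enqueue_failed response."""


def approve(row: dict, draft: dict, resolved_by: str,
            enqueue: Callable[[dict], None]) -> dict:
    """Flip pending -> approved and enqueue the tagged draft, or leave the row pending."""
    if row["status"] != "pending":
        raise Conflict(row["status"])
    row["status"] = "approved"
    message = {
        **draft,
        "escalation_resolution": {
            "escalation_id": row["escalation_id"],
            "decision": "escalated_approved",
            "resolved_by": resolved_by,
        },
    }
    try:
        enqueue(message)
    except Exception as exc:
        # Undo the flip so the approve can be retried; no partial state.
        row["status"] = "pending"
        raise SettlementEnqueueFailed(str(exc)) from exc
    return message
```

The charge itself never happens here. It is left to the settlement consumer, so an approved escalation settles through the same path as any other approved mandate.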
Reject
```sh
curl -sS -X POST "https://api.sill.so/v1/escalations/$ESC_ID/resolve" \
  -H "cookie: $SILL_SESSION" \
  -H 'content-type: application/json' \
  -d '{"decision":"reject"}'
```

```json
{
  "escalation_id": "esc_7a1c3f8b2e9d6a4c5b8e0f1a2c",
  "status": "rejected",
  "decision": "escalated_rejected"
}
```

The resolver flips the row to rejected and writes a signed escalated_rejected audit record. No buyer decrypt. No queue push. No rail call. No charge.
Timeout (fail-closed)
Each escalation carries a timeout_at deadline. A cron job that runs every 60 seconds sweeps pending rows past their deadline, flips them to timed_out, and writes a signed escalated_rejected audit record through the same consumer path, with a 'timeout_cron' sentinel as resolved_by. Timeout never settles: if a human does not approve in time, the transaction is treated as rejected. This is the safe default for a money path.
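One pass of such a sweep might look like the sketch below. It assumes rows are dicts with timezone-aware `timeout_at` values and returns unsigned audit payloads; signing, chaining, and the real storage layer are out of scope, and `sweep_timeouts` is a hypothetical name.

```python
from datetime import datetime, timezone
from typing import Optional


def sweep_timeouts(rows: list[dict], now: Optional[datetime] = None) -> list[dict]:
    """Flip overdue pending rows to timed_out and return the audit payloads to write."""
    now = now or datetime.now(timezone.utc)
    audit_records = []
    for row in rows:
        # Only pending rows strictly past their deadline are touched.
        if row["status"] == "pending" and row["timeout_at"] < now:
            row["status"] = "timed_out"
            audit_records.append({
                "escalation_id": row["escalation_id"],
                "decision": "escalated_rejected",
                "resolved_by": "timeout_cron",  # sentinel, not a human reviewer
            })
    return audit_records
```

There is deliberately no branch that produces escalated_approved: the sweep can only reject.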
What gets signed, and what doesn’t
| Object | Signed | Notes |
|---|---|---|
| Inbound mandate | yes | ed25519 over JCS — verified at the edge before escalation runs. |
| Escalation row | no | An operational record; the load-bearing signed object is the audit row at resolution. |
| Approved audit record | yes | decision = escalated_approved, Merkle-chained, exportable as part of the audit envelope. |
| Rejected audit record | yes | decision = escalated_rejected, identical chain semantics. |
There is exactly one signed audit record per escalation lifecycle. The append-only chain therefore distinguishes a normal approved charge from an escalated_approved one — the signed record carries the operator decision permanently. The signing key and verification recipe are the same for every signed surface Sill publishes; see verify a signature and the public JWKS.
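To make the idea of a chained, append-only record concrete, here is a deliberately simplified sketch. It is not Sill's audit format: it uses a plain linear SHA-256 hash chain in place of the Merkle chaining mentioned above, omits the Ed25519 signature entirely, and approximates JCS with sorted-key compact JSON, which only matches RFC 8785 for simple values such as the strings used here. All names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def canonical(record: dict) -> bytes:
    """Rough stand-in for JCS: sorted keys, no insignificant whitespace."""
    return json.dumps(record, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def append_record(chain: list[dict], escalation_id: str, decision: str) -> dict:
    """Append a record whose hash covers its content and the previous hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    body = {"escalation_id": escalation_id, "decision": decision,
            "prev_hash": prev_hash}
    record = {**body, "hash": hashlib.sha256(canonical(body)).hexdigest()}
    chain.append(record)
    return record


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every hash and link; any edited record breaks the chain."""
    prev_hash = GENESIS
    for record in chain:
        body = {k: v for k, v in record.items() if k != "hash"}
        digest = hashlib.sha256(canonical(body)).hexdigest()
        if record["prev_hash"] != prev_hash or digest != record["hash"]:
            return False
        prev_hash = record["hash"]
    return True
```

Because the decision value is part of the hashed body, changing escalated_approved to anything else after the fact is detectable, which is the property the paragraph above relies on.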
Failure modes and idempotency
The resolve endpoint is idempotent and exposes a small set of recoverable failures the dashboard surfaces directly:
- 409 conflict: another admin already resolved this escalation. The dashboard refetches and reconciles its view.
- 503 settlement_enqueue_failed: the queue push failed on an approve. The escalation stays pending and the action is re-runnable. No partial state.
- 404 not_found: escalation id missing or filtered by tenant RLS. Cross-account access is impossible by construction.
- Double-approve: re-POSTing approve on the same id returns 200 without a second charge; a partial-unique index on (site_id, mandate_id) reclassifies the duplicate inside the consumer.
- Conflict-direction: POSTing reject on an already-approved escalation returns 409.
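The replay and conflict-direction cases can be read as a small decision table. The sketch below covers sequential requests only; it does not model the concurrent race under the row lock, a missing id (404), or a failed enqueue (503). Two of its branches are assumptions rather than documented behaviour: that a repeated reject also returns 200, and that any decision on a timed_out row returns 409.

```python
def resolve_status(current: str, decision: str) -> int:
    """Return the HTTP status for a resolve request given the row's current status."""
    target = {"approve": "approved", "reject": "rejected"}[decision]
    if current == "pending":
        return 200  # first resolution
    if current == target:
        return 200  # replay of the same decision, no second effect
    return 409      # row already resolved the other way (or timed out: assumed)
```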
Security posture
- Account isolation: the escalation table and both endpoints are scoped under row-level security. One merchant cannot see, list, or resolve another’s escalation.
- Role gating: list is Owner / Admin / Reviewer; resolve is Owner / Admin only. Enforced server-side, not just in the UI.
- Buyer PII at rest: the stored draft is captured without its buyer block; the buyer is AEAD-encrypted into a separate vault frame with AAD bound to the escalation id. Cleartext buyer data is never persisted on the escalation row.
- Fail-closed everywhere: timeout rejects, a missing producer leaves the row pending, an encrypter failure drops the buyer block rather than persisting it cleartext, and the consumer drops duplicates on the audit chain.
- Channel: resolution happens in the dashboard. A digest email of pending escalations is sent to the account’s Owner / Admin / Reviewer users; Slack routing is on the roadmap.
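The buyer-PII and fail-closed points combine into one small pattern, sketched here under assumptions: `encrypt(plaintext, aad)` is a hypothetical AEAD callable supplied by the caller, the draft is a dict with an optional `buyer` key, and `split_for_storage` is an illustrative name rather than anything from Sill's code.

```python
import json
from typing import Callable, Optional


def split_for_storage(
    draft: dict,
    escalation_id: str,
    encrypt: Callable[[bytes, bytes], bytes],
) -> tuple[dict, Optional[bytes]]:
    """Return (draft without buyer, encrypted buyer frame or None)."""
    stored = {k: v for k, v in draft.items() if k != "buyer"}
    buyer = draft.get("buyer")
    if buyer is None:
        return stored, None
    try:
        # The escalation id is passed as associated data, binding the frame to its row.
        frame = encrypt(json.dumps(buyer).encode("utf-8"),
                        escalation_id.encode("utf-8"))
    except Exception:
        # Fail closed: drop the buyer block rather than persist it in cleartext.
        frame = None
    return stored, frame
```

Whatever happens inside `encrypt`, the stored draft never contains the buyer block.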
Common questions
Does an escalation ever cost the agent or merchant money?
No. Until an Owner or Admin explicitly approves, no payment authorization is sent to the merchant’s processor. Reject and timeout never charge.
Can the escalation be approved automatically?
No. The cron only resolves timed-out rows, and only as escalated_rejected. The approve path requires a real human session with the Owner or Admin role.
Does an approval re-verify the original mandate signature?
The mandate was verified at the edge before the escalation was created. The resolver re-uses the captured, sanitized draft, so it does not re-hit the nonce store or re-run policy. The signed audit record at resolution attests to the operator decision; the original signed mandate remains in the audit chain as the upstream evidence.
Is the agent told about the resolution?
The agent receives the held 202 synchronously. The current release does not push the eventual resolution back to the agent; the operator’s decision lands in the audit chain and the merchant’s connected rail. Webhook-style resolution callbacks are on the roadmap.
Where are escalations exported?
Resolved escalations appear in the audit log and bundle export with their escalated_approved or escalated_rejected decision, linked to the original escalation id.
See also
- Policy engine: how rules with on_fail: escalate are configured.
- Guardrails: the dashboard surface for managing the active policy.
- Signed mandates: the inbound object an escalation holds.
- Payments: the settlement path an approve re-enters.
- Refunds: refund escalations carry a server-resolved original-order panel.
- Audit envelope: what the resolution record contains.
- Verify a signature: independent verification of any signed Sill record.
- Transactional overview: the honest bounds of the Phase 2 live-rail scope.
External references:
- Google Agent Payments Protocol (AP2) — the mandate model Sill’s intent layer is compatible with.
- Model Context Protocol — the discovery transport that surfaces escalating skills to clients.
- RFC 8785 — JSON Canonicalization Scheme — the canonicalization used in every Sill signing input.
- RFC 8032 — EdDSA / Ed25519 — the signature scheme on every Sill audit record.
- OWASP Top 10 for LLM Applications — LLM06 Excessive Agency, the class of risk HITL escalation directly mitigates.
- OWASP Top 10 for Agentic Applications — ASI07 (delegation) and related agentic risks.
- NIST AI Risk Management Framework — the Measure and Manage functions the audit record supports.