Operations & Data · Draft
Human-in-the-Loop, Approvals, and Versioning
Policy statement
Automation is the default. Humans intervene for judgment, compliance, access, and exceptions — never for routine tasks that agents can perform within guardrails.
Approval types
| ID | Type | Required approver | Blocks |
|---|---|---|---|
| A1 | Media plan approval | Planner | Execution |
| A2 | Launch confirmation | Operator / Planner | First go-live |
| A3 | Spend guardrail breach | Operator + optional client | Budget change |
| A4 | Access / BM / DNS | Admin / client | Onboarding completion |
| A5 | Plan revise / replan approval | Planner + client if budget increases | New plan version vN+1 |
| A6 | Manual override | Admin | Reconciliation |
| A7 | Compliance / policy | Planner + legal flag | Creative launch |
| A8 | Agent loop exhausted (red flag) | Operator / admin | Same run_id auto-retry; downstream mutations on that task |
| A9 | Cost Guard tripped (≥3× estimate) | Operator / admin | All LLM calls on run_id; downstream agent work |
A8 is raised when QC correction, tool-call, orchestrator retry, or global per-run_id loop limits are hit without resolution — see Loop limits.
A9 is raised by the deterministic Cost Guard service (not an agent) when actual token spend on a run_id reaches 3× the pre-computed estimate — see Cost Guard.
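The A9 trip condition is a plain comparison with no model in the loop, which a short sketch can make concrete. This is only an illustration, not the Cost Guard service itself: the function name and the `estimated_usd` / `actual_usd` parameters are hypothetical, and it assumes spend is tracked in USD per run_id.

```python
from decimal import Decimal

TRIP_MULTIPLIER = Decimal("3.0")  # policy value: trip at >= 3x the estimate


def cost_guard_tripped(estimated_usd: Decimal, actual_usd: Decimal,
                       multiplier: Decimal = TRIP_MULTIPLIER) -> bool:
    """Return True once actual spend has reached multiplier x the estimate."""
    if estimated_usd <= 0:
        # Assumption: a run without a positive estimate is a caller error,
        # because there is no baseline to compare against.
        raise ValueError("estimated_usd must be positive")
    return actual_usd >= estimated_usd * multiplier
```

Using `Decimal` rather than floats keeps the boundary case (exactly 3× the estimate) deterministic, which matters because the comparison is inclusive.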
Dashboard requirements (Phase 0 spec)
Every human touch must appear in the Human Touch Dashboard (operator surface — not System Ops) with:
- Unique ticket ID
- Tenant, vertical, platform
- Requesting agent or service
- Linked `plan_version` or `change_set_id`
- Diff preview (structured JSON + human summary)
- Approve / Reject / Request changes actions
- Comment thread
- SLA timer
- Red flag (A8 / A9) items: distinct visual priority, never buried in the generic inbox. A8 shows loop type and attempt count; A9 shows an estimated vs actual USD summary. The full token/QC drill-down links to the System Ops Dashboard.
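The required fields above could be carried by a single ticket record. The sketch below is one possible shape under stated assumptions: the class name, field names, and the rule that at least one of `plan_version` or `change_set_id` must be set are illustrative, not part of the spec.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional

RED_FLAG_TYPES = {"A8", "A9"}  # types that need distinct visual priority


@dataclass
class HumanTouchTicket:
    ticket_id: str
    approval_type: str            # "A1" .. "A9"
    tenant: str
    vertical: str
    platform: str
    requested_by: str             # requesting agent or service
    diff_json: dict               # structured diff preview
    diff_summary: str             # human-readable summary of the diff
    sla_deadline: datetime
    plan_version: Optional[int] = None
    change_set_id: Optional[str] = None
    comments: list = field(default_factory=list)

    def __post_init__(self) -> None:
        # Assumption: every ticket links to at least one of the two references.
        if self.plan_version is None and self.change_set_id is None:
            raise ValueError("ticket must link a plan_version or change_set_id")

    @property
    def is_red_flag(self) -> bool:
        return self.approval_type in RED_FLAG_TYPES

    def sla_remaining(self, now: datetime) -> timedelta:
        """Time left on the SLA timer; negative once the deadline has passed."""
        return self.sla_deadline - now
```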
Versioning rules
Media plans
- Monotonically increasing integer version per tenant
- Immutable approved payloads
- Superseded plans remain queryable
Campaign state
- `execution_manifest` links platform IDs → `plan_version` (live state; may drift until revise)
- `optimization_log` / `change_set_id` append-only per applied change (input to plan revise)
- Plan revise creates `vN+1` type `revise`; replan creates type `replan`; both require A5 approval
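The media plan rules (per-tenant integer versions, immutable approved payloads, superseded plans still queryable) can be illustrated with a small in-memory store. This is a sketch under assumptions: `PlanStore` and its methods are hypothetical names, the `"initial"` kind is invented for the first version, and a real implementation would sit on a database and run only after A5 approval.

```python
import copy
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class PlanVersion:
    tenant: str
    version: int
    kind: str       # "revise" or "replan"; "initial" is an assumed label for v1
    payload: dict


class PlanStore:
    def __init__(self) -> None:
        self._plans: Dict[str, List[PlanVersion]] = {}

    def approve(self, tenant: str, kind: str, payload: dict) -> int:
        """Store an approved plan as vN+1 for the tenant and return N+1."""
        history = self._plans.setdefault(tenant, [])
        version = len(history) + 1
        # Deep-copy so later edits to the caller's dict cannot alter the record.
        history.append(PlanVersion(tenant, version, kind, copy.deepcopy(payload)))
        return version

    def get(self, tenant: str, version: int) -> dict:
        """Return a copy of any version, including superseded ones."""
        return copy.deepcopy(self._plans[tenant][version - 1].payload)

    def latest_version(self, tenant: str) -> int:
        return len(self._plans[tenant])
```

Returning copies from `get` is what keeps approved payloads effectively immutable here; nothing is ever removed from the per-tenant history, so older versions stay readable after a revise or replan.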
Audit log
Append-only:
timestamp, actor_type, actor_id, action, resource, plan_version, payload_hash
Retention: minimum 7 years for financial audit (TBD with legal).
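An audit entry with the fields listed above might be built as follows. This is a minimal sketch: the spec does not name a hash algorithm, so SHA-256 over canonical JSON is an assumption, as are the helper names `payload_hash`, `audit_entry`, and `AuditLog`.

```python
import hashlib
import json
from datetime import datetime, timezone


def payload_hash(payload: dict) -> str:
    """Hash a payload via canonical JSON so key order does not change the digest."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def audit_entry(actor_type, actor_id, action, resource, plan_version, payload, now=None):
    ts = now or datetime.now(timezone.utc)
    return {
        "timestamp": ts.isoformat(),
        "actor_type": actor_type,     # e.g. "human", "agent", "service"
        "actor_id": actor_id,
        "action": action,
        "resource": resource,
        "plan_version": plan_version,
        "payload_hash": payload_hash(payload),
    }


class AuditLog:
    """Append-only in memory: exposes no update or delete operation."""

    def __init__(self) -> None:
        self._entries = []

    def append(self, entry: dict) -> None:
        self._entries.append(dict(entry))

    def entries(self) -> tuple:
        # Hand back copies so readers cannot rewrite history.
        return tuple(dict(e) for e in self._entries)
```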
Escalation
Client-visible approvals
Optional client portal actions:
- Approve budget increases
- Approve creative themes
- Reject proposed replans
Internal ops retain veto on compliance.
Configuration per tenant
approval_policy:
  first_launch_requires_human: true
  auto_approve_optimization_under_pct: 10
  client_must_approve_budget_increase: true
  vertical_compliance_profile: health_strict
loop_policy: # hard ceilings; tenant may be stricter, not looser
  qc_correction_max: 2
  tool_rounds_max: 5
  orchestrator_retry_max: 1
  global_steps_per_run_id_max: 8
  on_exhaustion: red_flag_a8 # never infinite_retry
cost_guard_policy:
  trip_multiplier: 3.0
  block_on_trip: true
qc_threshold_policy:
  success_floor: 0.80
  window_hours: 24
  min_sample_global: 20
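The "stricter, not looser" rule for `loop_policy` could be enforced when tenant configuration is loaded. The sketch below clamps tenant values to the hard ceilings; clamping rather than rejecting a looser value is an assumption, and the function names and counter semantics (attempts already used per limit) are hypothetical.

```python
from typing import Dict

# Hard ceilings copied from the loop_policy values above.
HARD_CEILINGS: Dict[str, int] = {
    "qc_correction_max": 2,
    "tool_rounds_max": 5,
    "orchestrator_retry_max": 1,
    "global_steps_per_run_id_max": 8,
}


def effective_loop_policy(tenant_overrides: Dict[str, int]) -> Dict[str, int]:
    """Apply tenant overrides, never allowing a value above its hard ceiling."""
    policy = {}
    for key, ceiling in HARD_CEILINGS.items():
        requested = tenant_overrides.get(key, ceiling)
        if requested < 0:
            raise ValueError(f"{key} must be non-negative")
        policy[key] = min(requested, ceiling)
    return policy


def next_action(used: Dict[str, int], policy: Dict[str, int]) -> str:
    """Decide whether another attempt is allowed for this run_id.

    `used` holds the attempts already consumed for each limit.
    """
    for key, limit in policy.items():
        if used.get(key, 0) >= limit:
            return "red_flag_a8"   # on_exhaustion: never retry indefinitely
    return "retry"
```

For example, a tenant that asks for `tool_rounds_max: 3` gets 3, while a tenant that asks for `qc_correction_max: 10` is held at the ceiling of 2.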
Metrics (ops)
- Mean time to approve by type
- % changes auto-approved
- QC first-pass rate by `main_task_id` / agent / model (from `agent_qc_results`)
- Top `failure_codes` and correction-loop depth (from `agent_qc_loops`)
- A8 rate correlated with QC fail patterns (same `run_id` trace)
- % task/subtask grains below 80% success floor (alert rate, time-to-recovery)
- Rollback rate post-approval
- Manual override count (target: decreasing over time)
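The success-floor metric could be evaluated per task/subtask grain roughly as follows. This is a sketch only: it assumes `min_sample_global` is the minimum number of QC results required in the window before the floor is evaluated at all, and the function name and input shape (a list of timestamp and pass/fail pairs) are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple


def below_success_floor(results: Iterable[Tuple[datetime, bool]], now: datetime,
                        success_floor: float = 0.80, window_hours: int = 24,
                        min_sample: int = 20) -> Optional[bool]:
    """Return True if the windowed pass rate is under the floor.

    Returns None when the window holds too few results to judge.
    """
    cutoff = now - timedelta(hours=window_hours)
    recent = [passed for ts, passed in results if cutoff <= ts <= now]
    if len(recent) < min_sample:
        return None
    pass_rate = sum(recent) / len(recent)
    return pass_rate < success_floor
```

Returning `None` for thin samples keeps a grain with only a handful of runs from raising an alert on one or two failures.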