Operations & Data · Draft

Human-in-the-Loop, Approvals, and Versioning

Created 9 Jun 2026 · Updated 11 Jun 2026

Latest change: Publish Dossier site and full doc pack to GitHub

Draft document: the deep-dive spec is incomplete, and content will be updated before and during build. Do not treat as signed-off implementation detail.

Policy statement

Automation is the default. Humans intervene for judgment, compliance, access, and exceptions — never for routine tasks that agents can perform within guardrails.

Approval types

ID | Type                              | Required approver             | Blocks
A1 | Media plan approval               | Planner                       | Execution
A2 | Launch confirmation               | Operator / Planner            | First go-live
A3 | Spend guardrail breach            | Operator + optional client    | Budget change
A4 | Access / BM / DNS                 | Admin / client                | Onboarding completion
A5 | Plan revise / replan approval     | Planner + client if budget up | New plan version vN+1
A6 | Manual override                   | Admin                         | Reconciliation
A7 | Compliance / policy               | Planner + legal flag          | Creative launch
A8 | Agent loop exhausted (red flag)   | Operator / admin              | Same run_id auto-retry; downstream mutations on that task
A9 | Cost Guard tripped (≥3× estimate) | Operator / admin              | All LLM calls on run_id; downstream agent work

A8 is raised when QC correction, tool-call, orchestrator retry, or global per-run_id loop limits are hit without resolution — see Loop limits.
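
The A8 trigger can be sketched as a simple ceiling check. This is a minimal illustration, not the build's implementation: the counter names, the `exhausted_loops` and `should_raise_a8` helpers, and the reading of "hit" as "attempt count has reached the maximum" are assumptions; the ceiling values are the ones listed under loop_policy in this document.

```python
# Hard ceilings taken from the loop_policy block in this document.
# The short key names are hypothetical.
LOOP_CEILINGS = {
    "qc_correction": 2,
    "tool_rounds": 5,
    "orchestrator_retry": 1,
    "global_steps": 8,
}


def exhausted_loops(counters, ceilings=LOOP_CEILINGS):
    """Return the loop types whose attempt count has reached its ceiling.

    Assumption: each counter holds the number of attempts already used.
    """
    return [name for name, limit in ceilings.items()
            if counters.get(name, 0) >= limit]


def should_raise_a8(counters, resolved):
    """A8 applies only when a ceiling is reached and the task is unresolved."""
    return (not resolved) and bool(exhausted_loops(counters))
```

For example, `should_raise_a8({"tool_rounds": 5}, resolved=False)` is true, while the same counters with `resolved=True` raise nothing.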

A9 is raised by the deterministic Cost Guard service (not an agent) when actual token spend on a run_id reaches 3× the pre-computed estimate (trip_multiplier: 3.0); see Cost Guard.
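
Because the Cost Guard is deterministic, its trip condition reduces to one comparison. The sketch below assumes spend is tracked in USD per run_id and uses the ≥3× threshold from the approval table and the trip_multiplier setting; the function name and signature are hypothetical.

```python
def cost_guard_tripped(estimated_usd, actual_usd, trip_multiplier=3.0):
    """Return True when actual spend has reached trip_multiplier x estimate.

    Hypothetical helper: the document specifies the 3x threshold, not this API.
    """
    if estimated_usd <= 0:
        # Without a positive estimate the ratio is meaningless; fail loudly.
        raise ValueError("estimated_usd must be positive")
    return actual_usd >= trip_multiplier * estimated_usd
```

With an estimate of 0.50 USD, spend of 1.49 USD does not trip the guard and spend of 1.50 USD does.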

Dashboard requirements (Phase 0 spec)

Every human touch must appear in the Human Touch Dashboard (operator surface — not System Ops) with:

  • Unique ticket ID
  • Tenant, vertical, platform
  • Requesting agent or service
  • Linked plan_version or change_set_id
  • Diff preview (structured JSON + human summary)
  • Approve / Reject / Request changes actions
  • Comment thread
  • SLA timer
  • Red-flag items (A8 / A9): distinct visual priority; must not be buried in the generic inbox. A8 shows loop type and attempt count; A9 shows an estimated vs actual USD summary. Full token/QC drill-down → System Ops Dashboard
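
The required fields above map naturally onto a single ticket record. The following is a rough sketch only: the class name, field names, and the `inbox_order` rule (red flags first, then earliest SLA deadline) are assumptions made for illustration, not part of the Phase 0 spec.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

RED_FLAG_TYPES = {"A8", "A9"}  # red-flag approval types from the table above


@dataclass
class HumanTouchTicket:
    ticket_id: str          # unique ticket ID
    approval_type: str      # "A1" .. "A9"
    tenant: str
    vertical: str
    platform: str
    requested_by: str       # requesting agent or service
    linked_ref: str         # plan_version or change_set_id
    diff_json: dict         # structured diff preview
    diff_summary: str       # human-readable summary of the diff
    created_at: datetime
    sla: timedelta
    comments: list = field(default_factory=list)

    @property
    def is_red_flag(self):
        return self.approval_type in RED_FLAG_TYPES

    @property
    def sla_deadline(self):
        return self.created_at + self.sla


def inbox_order(tickets):
    """Sort red-flag tickets first, then by earliest SLA deadline."""
    return sorted(tickets, key=lambda t: (not t.is_red_flag, t.sla_deadline))
```

Sorting on `(not is_red_flag, sla_deadline)` is one way to keep A8 / A9 items at the top of the inbox regardless of how recently they were raised.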

Versioning rules

Media plans

  • Monotonic integer version per tenant
  • Immutable approved payloads
  • Superseded plans remain queryable
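
The three media-plan rules can be illustrated with a small in-memory store. This is a sketch under stated assumptions: `PlanStore` and its methods are hypothetical names, versions are assumed to start at 1, and a real implementation would sit on a database rather than a dict.

```python
import copy


class PlanStore:
    """In-memory illustration of per-tenant plan versioning."""

    def __init__(self):
        self._plans = {}  # tenant -> list of approved payloads, index = version - 1

    def approve(self, tenant, payload):
        """Store an approved payload and return its new version number."""
        versions = self._plans.setdefault(tenant, [])
        # Deep copy so later mutation of the caller's dict cannot alter history.
        versions.append(copy.deepcopy(payload))
        return len(versions)  # monotonic: 1, 2, 3, ... per tenant

    def get(self, tenant, version):
        """Return a copy of any version, including superseded ones."""
        versions = self._plans.get(tenant, [])
        if not 1 <= version <= len(versions):
            raise KeyError((tenant, version))
        return copy.deepcopy(versions[version - 1])

    def latest_version(self, tenant):
        return len(self._plans.get(tenant, []))
```

Returning copies from `get` means callers cannot modify an approved payload in place, and older versions stay readable after a newer one is approved.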

Campaign state

  • execution_manifest links platform IDs → plan_version (live state; may drift until revise)
  • optimization_log / change_set_id append-only per applied change (input to plan revise)
  • Plan revise creates vN+1 with type revise; replan creates vN+1 with type replan. Both require A5 approval

Audit log

Append-only:

timestamp, actor_type, actor_id, action, resource, plan_version, payload_hash

Retention: minimum 7 years for financial audit (TBD with legal).
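
As an illustration of the record shape, the sketch below builds an entry with the seven fields listed above. The hashing scheme (SHA-256 over canonical JSON with sorted keys) and the `AuditLog` class are assumptions; the document names payload_hash but does not specify how it is computed or where the log is stored.

```python
import hashlib
import json
from datetime import datetime, timezone


def payload_hash(payload):
    """SHA-256 of a canonical JSON encoding (assumed scheme, sorted keys)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def audit_entry(actor_type, actor_id, action, resource, plan_version, payload):
    """Build one audit record with the fields listed in this section."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor_type": actor_type,
        "actor_id": actor_id,
        "action": action,
        "resource": resource,
        "plan_version": plan_version,
        "payload_hash": payload_hash(payload),
    }


class AuditLog:
    """Append-only by construction: no update or delete methods are exposed."""

    def __init__(self):
        self._entries = []

    def append(self, entry):
        self._entries.append(dict(entry))

    def entries(self):
        # Return copies so readers cannot rewrite history.
        return tuple(dict(e) for e in self._entries)
```

Sorting keys before hashing makes the hash independent of dict ordering, so the same payload always produces the same payload_hash.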

Escalation

Approval pending → SLA breached?
  • No → Wait
  • Yes → Notify lead + client → Auto policy?
      • Pause spend → Pause campaigns
      • None → Continue waiting
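
One reading of the escalation flow is: while the SLA holds, wait; once it is breached, notify the lead and client, then either pause campaigns (auto policy "pause spend") or keep waiting (no auto policy). The sketch below encodes that reading. The function name, the step labels, and the `"pause_spend"` policy string are hypothetical.

```python
from typing import List, Optional


def escalation_steps(sla_breached: bool, auto_policy: Optional[str]) -> List[str]:
    """Return the ordered actions for a pending approval.

    Assumed policy values: "pause_spend" or None (no automatic action).
    """
    if not sla_breached:
        return ["wait"]
    steps = ["notify_lead_and_client"]
    if auto_policy == "pause_spend":
        steps.append("pause_campaigns")
    else:
        steps.append("continue_waiting")
    return steps
```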

Client-visible approvals

Optional client portal actions:

  • Approve budget increases
  • Approve creative themes
  • Reject proposed replans

Internal ops retain veto on compliance.

Configuration per tenant

approval_policy:
  first_launch_requires_human: true
  auto_approve_optimization_under_pct: 10
  client_must_approve_budget_increase: true
  vertical_compliance_profile: health_strict

loop_policy:  # hard ceilings — tenant may be stricter, not looser
  qc_correction_max: 2
  tool_rounds_max: 5
  orchestrator_retry_max: 1
  global_steps_per_run_id_max: 8
  on_exhaustion: red_flag_a8  # never infinite_retry

cost_guard_policy:
  trip_multiplier: 3.0
  block_on_trip: true

qc_threshold_policy:
  success_floor: 0.80
  window_hours: 24
  min_sample_global: 20
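
Two rules in this configuration lend themselves to a short sketch: tenant loop limits may only tighten the hard ceilings, and optimization changes under the configured percentage can be auto-approved. Both helpers are hypothetical, and the interpretation of auto_approve_optimization_under_pct as "absolute change strictly below the threshold" is an assumption.

```python
# Hard ceilings from the loop_policy block above.
GLOBAL_LOOP_CEILINGS = {
    "qc_correction_max": 2,
    "tool_rounds_max": 5,
    "orchestrator_retry_max": 1,
    "global_steps_per_run_id_max": 8,
}


def effective_loop_policy(tenant_overrides):
    """Clamp tenant values to the ceilings: stricter is kept, looser is capped."""
    policy = {}
    for key, ceiling in GLOBAL_LOOP_CEILINGS.items():
        requested = tenant_overrides.get(key, ceiling)
        policy[key] = max(0, min(requested, ceiling))
    return policy


def can_auto_approve(change_pct, under_pct=10):
    """Assumed rule: auto-approve only when |change| is strictly under the threshold."""
    return abs(change_pct) < under_pct
```

A tenant that sets tool_rounds_max to 3 keeps 3, while a tenant that asks for qc_correction_max of 10 is capped at 2.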

Metrics (ops)

  • Mean time to approve by type
  • % changes auto-approved
  • QC first-pass rate by main_task_id / agent / model (from agent_qc_results)
  • Top failure_codes and correction-loop depth (from agent_qc_loops)
  • A8 rate correlated with QC fail patterns (same run_id trace)
  • % task/subtask grains below 80% success floor (alert rate, time-to-recovery)
  • Rollback rate post-approval
  • Manual override count (target: decreasing over time)
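
Two of these metrics are sketched below. The input shapes are assumptions: approval records are taken to be (approval_type, seconds_to_approve) pairs, and min_sample_global is read as the minimum number of results in the window before the 80% success floor is evaluated.

```python
from collections import defaultdict
from statistics import mean


def mean_time_to_approve(records):
    """records: iterable of (approval_type, seconds_to_approve) pairs."""
    by_type = defaultdict(list)
    for approval_type, seconds in records:
        by_type[approval_type].append(seconds)
    return {t: mean(values) for t, values in by_type.items()}


def below_success_floor(passed, total, floor=0.80, min_sample=20):
    """Flag a grain only when it has enough samples and its pass rate is under the floor."""
    if total < min_sample:
        return False  # too few samples in the window to judge
    return passed / total < floor
```

With the defaults, 15 passes out of 20 (75%) is flagged, 16 out of 20 (exactly 80%) is not, and 1 out of 5 is ignored because the sample is too small.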