Human Control Plane

Purpose

Not every step should be fully autonomous. The human control plane is where Kobi operators and planners review, approve, reject, or override agent actions, with a full audit trail and version linkage.

This document covers the Human Touch Dashboard only: business operations, approvals, and client-workflow surfaces. System health, logs, and engineering statistics live in a separate System Ops Dashboard behind IAP (plus VPN, recommended for prod) and are not mixed into operator views.

Two surfaces (do not merge)

Surface	Audience	Access	Contains
Human Touch Dashboard	Operators, planners, admins, auditors	SSO + business RBAC; WAF on public endpoints	Approvals, plan diffs, tenant timeline, tracking health signals, manual overrides
System Ops Dashboard	Engineering, SRE, system admins	IAP (required) + VPN (recommended for prod)	Logs, system status, QC/Cost Guard statistics, run telemetry, infra health

Principle: operators see what to decide (diff, SLA, summary). System Ops users see why it failed (token rows, failure-code heatmaps, playbook versions, infra errors). Deep links from Human Touch tickets into System Ops are allowed for users with both roles, but BQ/GCS explorers are never embedded in the operator inbox.

Design goals

No silent human work: if a human touches a platform manually, the action must be logged or migrated into the system.
Single approval queue: all pending human touches sit in one inbox, filterable by tenant, platform, and urgency.
Version binding: every approval references a plan_version or change_set_id.
Rollback: approved changes that caused issues can be reverted to the prior version where platforms allow.
No engineering noise: operators are not exposed to raw logs, model statistics, or infra panels.
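
The version-binding goal can be enforced at the point where an approval record is created. The following is a minimal sketch, not the platform's actual schema: the `Approval` class, its field names, and the decision strings are assumptions for illustration. Only the rule that an approval references a plan_version or change_set_id, and that a rejection carries a reason, comes from this document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Approval:
    """Hypothetical approval record; field names are illustrative."""
    approver: str
    decision: str  # assumed values: "approve", "reject", "request_changes"
    plan_version: Optional[int] = None
    change_set_id: Optional[str] = None
    reason: Optional[str] = None

    def __post_init__(self) -> None:
        # Version binding: at least one reference must be present.
        if self.plan_version is None and self.change_set_id is None:
            raise ValueError("approval must reference a plan_version or change_set_id")
        # The inbox lists "Reject (with reason)", so a rejection needs one.
        if self.decision == "reject" and not self.reason:
            raise ValueError("reject requires a reason")
```

Validating in the constructor means an unbound approval cannot exist in memory at all, which is one way to keep the audit trail free of orphaned decisions.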

Human touch categories

Category	Examples	Default policy
Access & trust	BM partner invite, Kobi-entity business verify (PRE), client domain verify (guide), agency billing (PRE)	Kobi ops / legal for PRE; client DNS optional, with client steps in the onboarding client portal
Plan approval	New media plan, replan, budget reallocation > threshold	Always human
Launch	First campaign go-live per tenant	Human confirm (configurable auto after first)
Spend guardrail breach	Budget +20%, new geo	Human approve
Compliance	Health claims, school enrollment copy	Human + optional legal flag
Exception	Manual fix as fallback after an API failure	Human, with post-hoc entry form
Red flag (A8)	Agent loop exhausted — QC/tool/retry/global cap hit	Always human; blocks auto-retry until resolved
Red flag (A9)	Cost Guard tripped — actual spend ≥3× estimate	Always human; all LLM calls on `run_id` blocked
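
The A9 trip condition in the table above reduces to a ratio check. A minimal sketch follows, assuming estimated and actual spend are tracked in USD per run_id. The `RunCost` type and both function names are hypothetical; only the 3× threshold comes from the table.

```python
from dataclasses import dataclass

TRIP_RATIO = 3.0  # from the A9 row: actual spend >= 3x estimate


@dataclass(frozen=True)
class RunCost:
    """Hypothetical per-run cost record."""
    run_id: str
    estimated_usd: float
    actual_usd: float


def trip_ratio(cost: RunCost) -> float:
    """Actual spend divided by the estimate.

    Assumption: a zero estimate with any spend counts as an unbounded ratio.
    """
    if cost.estimated_usd <= 0:
        return float("inf") if cost.actual_usd > 0 else 0.0
    return cost.actual_usd / cost.estimated_usd


def cost_guard_tripped(cost: RunCost) -> bool:
    """True when the run should raise an A9 red flag."""
    return trip_ratio(cost) >= TRIP_RATIO
```

The same ratio is what the A9 summary in the approval inbox would display next to the estimated and actual USD figures.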

Human Touch Dashboard views

1. Approval inbox

Pending items sorted by SLA; red flags (A8 / A9) pinned above standard approvals
Fields: tenant, vertical, agent, requested action summary, diff preview, plan version
Actions: Approve, Reject (with reason), Request changes
A8 summary: loop type, attempt count, last QC failure reason (one line) — link to System Ops trace for engineers
A9 summary: estimated vs actual USD, trip ratio — link to System Ops token breakdown
A8 actions: Resolve (authorize new run_id), Manual fix, Admin override (A6), Cancel task; no "retry in place" on the same run_id
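
The inbox ordering described above (red flags pinned first, then by SLA) can be expressed as a two-part sort key. This is a sketch under assumptions: `InboxItem`, its fields, and the kind strings other than A8 and A9 are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List

RED_FLAG_KINDS = {"A8", "A9"}


@dataclass(frozen=True)
class InboxItem:
    """Hypothetical inbox row; only the fields needed for ordering."""
    item_id: str
    kind: str          # "A8", "A9", or an assumed label such as "plan_approval"
    tenant: str
    sla_due: datetime  # when the SLA for this item expires


def sort_inbox(items: Iterable[InboxItem]) -> List[InboxItem]:
    """Red flags first, then earliest SLA deadline within each group."""
    # False sorts before True, so red flags (not-in-set == False) come first.
    return sorted(items, key=lambda i: (i.kind not in RED_FLAG_KINDS, i.sla_due))
```

Because Python's sort is stable, items with identical deadlines keep their arrival order, which avoids rows jumping around between refreshes.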

2. Red flag queue

Operational view of open A8 / A9 tickets — not a statistics console.

Open tickets across tenants; SLA and assignee
Filters: loop type (QC / tool / timeout / global), cost trip, agent, platform, age
Actions: resolve, escalate to engineering, cancel
No BigQuery explorers, QC leaderboards, or Cost Guard ledgers here — those are System Ops

3. Tenant timeline

Chronological audit: onboarding steps, plans, launches, optimizations, reports
Filter by platform and actor (agent vs human)
Business-readable events only; raw run_id traces open in System Ops

4. Plan diff viewer

Side-by-side comparison of plan versions:

Channel budget split
Target KPIs
Campaign structure summary
Tracking dependencies

5. Tracking health

GA4 event volume, tag coverage, CAPI match rates (client-workflow signals)
Blocks launch/optimization when red
Detailed tag-debug and server-side logs → System Ops

6. Manual override form

When ops must act outside agents:

Record platform, action, reason, ticket ID
The system then schedules a reconciliation job to sync state
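
A minimal sketch of the override form as a record plus a reconciliation queue. The four required fields come from the list above; the `ManualOverride` and `OverrideLog` names, the timestamp field, and the in-memory queue standing in for a job scheduler are all assumptions.

```python
from collections import deque
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class ManualOverride:
    """Hypothetical record of an action taken outside the agents."""
    platform: str
    action: str
    reason: str
    ticket_id: str
    # Assumed extra field: when the entry was recorded (UTC).
    recorded_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class OverrideLog:
    """In-memory stand-in for the audit log and the reconciliation scheduler."""

    REQUIRED = ("platform", "action", "reason", "ticket_id")

    def __init__(self) -> None:
        self.entries = []
        self.reconcile_queue = deque()

    def record(self, override: ManualOverride) -> ManualOverride:
        # Reject incomplete forms so no manual work goes unlogged or half-logged.
        missing = [name for name in self.REQUIRED if not getattr(override, name).strip()]
        if missing:
            raise ValueError(f"missing required fields: {', '.join(missing)}")
        self.entries.append(override)
        # Queue a reconciliation job keyed by platform and ticket.
        self.reconcile_queue.append((override.platform, override.ticket_id))
        return override
```

Recording and queueing happen in the same call so that a logged override always has a matching reconciliation job.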

Approval workflow

Roles (RBAC — Human Touch)

Role	Permissions
Viewer	Read reports and timeline
Operator	Approve routine optimizations within policy; resolve A8/A9 with playbook
Planner	Approve/create media plans
Admin	Access grants, tenant config, guardrail edits
Auditor	Read-only audit export

System Ops roles (system_developer, system_admin, sre) are defined in the System Ops Dashboard document. They belong to a separate IAM group, which may overlap with Admin for senior ops.
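
The role table above maps naturally onto a role-to-permission lookup. The sketch below is illustrative only: the permission strings are hypothetical names derived from the table, and it assumes roles do not inherit from each other, which this document does not state either way.

```python
from enum import Enum
from typing import Iterable


class Role(Enum):
    VIEWER = "viewer"
    OPERATOR = "operator"
    PLANNER = "planner"
    ADMIN = "admin"
    AUDITOR = "auditor"


# Hypothetical permission names, one set per row of the role table.
PERMISSIONS = {
    Role.VIEWER: {"read_reports", "read_timeline"},
    Role.OPERATOR: {"approve_routine_optimization", "resolve_red_flag"},
    Role.PLANNER: {"approve_media_plan", "create_media_plan"},
    Role.ADMIN: {"grant_access", "edit_tenant_config", "edit_guardrails"},
    Role.AUDITOR: {"export_audit"},
}


def can(roles: Iterable[Role], permission: str) -> bool:
    """True if any of the user's roles grants the permission (no inheritance assumed)."""
    return any(permission in PERMISSIONS[role] for role in roles)
```

A user holding several roles gets the union of their permissions, so a planner who should also read the timeline would be assigned both Planner and Viewer under this assumption.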

Versioning model

MediaPlan
  id: uuid
  tenant_id
  version: integer (monotonic)
  status: draft | pending_approval | approved | superseded
  approved_by, approved_at
  payload: channels, budgets, structures, rules

Campaign execution stores plan_version on every platform mutation for traceability.
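
The versioning model can be sketched as typed records. The field names and status values follow the schema above; the `approve` and `mutation_for` helpers, the `PlatformMutation` type, and the rule that only approved plans may drive mutations are assumptions for illustration.

```python
import uuid
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class PlanStatus(Enum):
    DRAFT = "draft"
    PENDING_APPROVAL = "pending_approval"
    APPROVED = "approved"
    SUPERSEDED = "superseded"


@dataclass(frozen=True)
class MediaPlan:
    id: uuid.UUID
    tenant_id: str
    version: int  # monotonic per plan
    status: PlanStatus
    payload: dict  # channels, budgets, structures, rules
    approved_by: Optional[str] = None
    approved_at: Optional[datetime] = None


@dataclass(frozen=True)
class PlatformMutation:
    """Hypothetical record of one change pushed to an ad platform."""
    platform: str
    action: str
    plan_id: uuid.UUID
    plan_version: int


def approve(plan: MediaPlan, approver: str) -> MediaPlan:
    """Return an approved copy; only pending plans can be approved."""
    if plan.status is not PlanStatus.PENDING_APPROVAL:
        raise ValueError(f"cannot approve plan in status {plan.status.value}")
    return replace(plan, status=PlanStatus.APPROVED, approved_by=approver,
                   approved_at=datetime.now(timezone.utc))


def mutation_for(plan: MediaPlan, platform: str, action: str) -> PlatformMutation:
    """Stamp a mutation with the plan version it executes (assumes approved plans only)."""
    if plan.status is not PlanStatus.APPROVED:
        raise ValueError("mutations require an approved plan")
    return PlatformMutation(platform, action, plan.id, plan.version)
```

Building mutations only through `mutation_for` makes it impossible to emit a platform change without a plan_version attached.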

SLAs (planning defaults)

Touch type	Target response
Launch approval	4 business hours
Plan approval	1 business day
Access / verification	2 business days
Optimization over threshold	4 business hours
Red flag (A8 / A9)	4 business hours
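
SLA targets expressed in business hours need a calendar rule to become deadlines. The sketch below assumes a Monday to Friday week, a 09:00 to 17:00 business day, one business day equal to 8 business hours, naive local datetimes, and no holiday calendar. None of these are specified in this document, and the touch-type keys are hypothetical names for the table rows.

```python
from datetime import datetime, time, timedelta

# Assumed business calendar: Mon-Fri, 09:00-17:00 (8 hours per day).
BUSINESS_START = time(9, 0)
BUSINESS_END = time(17, 0)

# Targets from the SLA table, converted to business hours.
SLA_BUSINESS_HOURS = {
    "launch_approval": 4,
    "plan_approval": 8,          # 1 business day
    "access_verification": 16,   # 2 business days
    "optimization_over_threshold": 4,
    "red_flag": 4,
}


def _next_business_moment(t: datetime) -> datetime:
    """Move t forward to the nearest moment inside business hours."""
    while True:
        if t.weekday() >= 5 or t.time() >= BUSINESS_END:
            # Weekend or after close: jump to the next day's opening.
            t = datetime.combine(t.date() + timedelta(days=1), BUSINESS_START)
        elif t.time() < BUSINESS_START:
            t = datetime.combine(t.date(), BUSINESS_START)
        else:
            return t


def sla_due(opened_at: datetime, touch_type: str) -> datetime:
    """Deadline after consuming the SLA budget in business hours only."""
    remaining = timedelta(hours=SLA_BUSINESS_HOURS[touch_type])
    t = _next_business_moment(opened_at)
    while True:
        end_of_day = datetime.combine(t.date(), BUSINESS_END)
        available = end_of_day - t
        if remaining <= available:
            return t + remaining
        remaining -= available
        t = _next_business_moment(end_of_day)
```

For example, under these assumptions a launch approval opened on a Friday at 15:00 uses two hours that day and the remaining two on Monday morning, so it falls due Monday at 11:00.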

Related documents

System Ops Dashboard: logs, statistics, system health (IAP / VPN)
05-human-in-the-loop.md: extended approval policy
07-security-access-governance.md: access tiers
04-lifecycle/plan-update.md