Architecture · Draft

Human Control Plane

Created 9 Jun 2026 · Updated 11 Jun 2026

Latest change: Publish Dossier site and full doc pack to GitHub

Draft document: deep-dive spec incomplete; content will be updated before and during build. Do not treat as signed-off implementation detail.

Purpose

Not every step should be fully autonomous. The human control plane is where Kobi operators and planners review, approve, reject, or override agent actions — with full audit and version linkage.

This document covers the Human Touch Dashboard only: business operations, approvals, and client-workflow surfaces. System health, logs, and engineering statistics live in a separate System Ops Dashboard behind IAP and/or VPN — not mixed into operator views.

Two surfaces (do not merge)

| Surface | Audience | Access | Contains |
| --- | --- | --- | --- |
| Human Touch Dashboard | Operators, planners, admins, auditors | SSO + business RBAC; WAF on public endpoints | Approvals, plan diffs, tenant timeline, tracking health signals, manual overrides |
| System Ops Dashboard | Engineering, SRE, system admins | IAP (required) + VPN (recommended for prod) | Logs, system status, QC/Cost Guard statistics, run telemetry, infra health |

Diagram (two groups). Human Touch, business RBAC: approval inbox, plan diff / timeline, tracking health. System Ops, IAP + VPN: logs and traces, QC statistics, Cost Guard ledger, system status. Other nodes: Orchestrator, A8 / A9 tickets. Edge labels: summary only, drill-down.

Principle: operators see what to decide (diff, SLA, summary). System users see why it failed (token rows, failure-code heatmaps, playbook versions, infra errors). Deep links from Human Touch tickets into System Ops are allowed for users with both roles — never embed BQ/GCS explorers in the operator inbox.

Design goals

  1. No silent human work — If a human touches a platform manually, it must be logged or migrated into the system.
  2. Single approval queue — All pending human touches in one inbox, filterable by tenant, platform, urgency.
  3. Version binding — Every approval references a plan_version or change_set_id.
  4. Rollback — Approved changes that caused issues can be reverted to prior version where platforms allow.
  5. No engineering noise — Operators are not exposed to raw logs, model statistics, or infra panels.

Human touch categories

| Category | Examples | Default policy |
| --- | --- | --- |
| Access & trust | BM partner invite, Kobi-entity business verify (PRE), client domain verify (guide), agency billing (PRE) | Kobi ops / legal for PRE; client DNS optional (client steps in onboarding client portal) |
| Plan approval | New media plan, replan, budget reallocation > threshold | Always human |
| Launch | First campaign go-live per tenant | Human confirm (configurable auto after first) |
| Spend guardrail breach | Budget +20%, new geo | Human approve |
| Compliance | Health claims, school enrollment copy | Human + optional legal flag |
| Exception | API failure fallback (manual fix) | Human with post-hoc entry form |
| Red flag (A8) | Agent loop exhausted: QC/tool/retry/global cap hit | Always human; blocks auto-retry until resolved |
| Red flag (A9) | Cost Guard tripped: actual spend ≥3× estimate | Always human; all LLM calls on run_id blocked |
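
The two numeric triggers in this table (budget +20% and actual spend ≥3× estimate) can be expressed as simple checks. This is a minimal sketch, not the platform's implementation: the function and constant names are hypothetical, and treating "+20%" as "20% or more" and a missing estimate as a trip are assumptions.

```python
# Assumed thresholds, read off the categories table; all names are hypothetical.
COST_GUARD_TRIP_RATIO = 3.0    # A9: actual spend >= 3x estimate
BUDGET_INCREASE_LIMIT = 0.20   # spend guardrail: budget +20% (assumed inclusive)


def cost_guard_tripped(estimated_usd: float, actual_usd: float) -> bool:
    """Return True when actual spend reaches 3x the estimate (A9 red flag)."""
    if estimated_usd <= 0:
        # Assumption: with no usable estimate, any spend goes to a human.
        return actual_usd > 0
    return actual_usd / estimated_usd >= COST_GUARD_TRIP_RATIO


def budget_needs_approval(current_budget: float, proposed_budget: float) -> bool:
    """Return True when a budget increase reaches the 20% guardrail."""
    if current_budget <= 0:
        # Assumption: any spend on a previously unfunded item needs approval.
        return proposed_budget > 0
    increase = (proposed_budget - current_budget) / current_budget
    return increase >= BUDGET_INCREASE_LIMIT
```

A tripped check would open a ticket in the approval inbox rather than let the agent proceed; the ticket creation itself is out of scope for this sketch.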

Human Touch Dashboard views

1. Approval inbox

  • Pending items sorted by SLA; red flags (A8 / A9) pinned above standard approvals
  • Fields: tenant, vertical, agent, requested action summary, diff preview, plan version
  • Actions: Approve, Reject (with reason), Request changes
  • A8 summary: loop type, attempt count, last QC failure reason (one line) — link to System Ops trace for engineers
  • A9 summary: estimated vs actual USD, trip ratio — link to System Ops token breakdown
  • A8 actions: Resolve (authorize new run_id), Manual fix, Admin override (A6), Cancel task — no "retry in place" on same run_id
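
The inbox ordering (red flags pinned, everything else by SLA) could be sketched as a two-part sort key. The `InboxItem` type and its field names are hypothetical, chosen to mirror the fields listed above.

```python
from dataclasses import dataclass
from datetime import datetime

RED_FLAG_KINDS = {"A8", "A9"}  # ticket kinds pinned above standard approvals


@dataclass(frozen=True)
class InboxItem:
    item_id: str
    tenant: str
    kind: str            # e.g. "plan_approval", "launch", "A8", "A9"
    sla_due: datetime    # deadline derived from the SLA for this touch type
    plan_version: int


def sort_inbox(items: list[InboxItem]) -> list[InboxItem]:
    """Red flags first, then earliest SLA deadline within each group."""
    # False sorts before True, so red flags (not in == False) come first.
    return sorted(items, key=lambda i: (i.kind not in RED_FLAG_KINDS, i.sla_due))
```

Because `sorted` is stable, items with the same kind group and deadline keep their arrival order.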

2. Red flag queue

Operational view of open A8 / A9 tickets — not a statistics console.

  • Open tickets across tenants; SLA and assignee
  • Filters: loop type (QC / tool / timeout / global), cost trip, agent, platform, age
  • Actions: resolve, escalate to engineering, cancel
  • No BigQuery explorers, QC leaderboards, or Cost Guard ledgers here — those are System Ops

3. Tenant timeline

  • Chronological audit: onboarding steps, plans, launches, optimizations, reports
  • Filter by platform and actor (agent vs human)
  • Business-readable events only; raw run_id traces open in System Ops

4. Plan diff viewer

Side-by-side comparison of plan versions:

  • Channel budget split
  • Target KPIs
  • Campaign structure summary
  • Tracking dependencies
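
For the channel budget split, the comparison behind the viewer might look like the sketch below. It assumes each plan version exposes budgets as a flat channel-to-amount mapping, which the source does not specify; `diff_channel_budgets` is a hypothetical name.

```python
from typing import Optional


def diff_channel_budgets(
    old: dict[str, float], new: dict[str, float]
) -> dict[str, dict[str, Optional[float]]]:
    """Return only the channels whose budget changed between two versions.

    A channel missing from one side is reported with None on that side.
    """
    changes: dict[str, dict[str, Optional[float]]] = {}
    for channel in sorted(old.keys() | new.keys()):
        before = old.get(channel)
        after = new.get(channel)
        if before == after:
            continue  # unchanged channels are left out of the diff
        changes[channel] = {
            "before": before,
            "after": after,
            "delta": (after or 0.0) - (before or 0.0),
        }
    return changes
```

For example, moving from `{"search": 5000, "social": 3000}` to `{"search": 6000, "video": 1000}` reports a change for all three channels, with `social` shown as removed and `video` as added.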

5. Tracking health

  • GA4 event volume, tag coverage, CAPI match rates (client-workflow signals)
  • Blocks launch/optimization when red
  • Detailed tag-debug and server-side logs → System Ops

6. Manual override form

When ops must act outside agents:

  • Record platform, action, reason, ticket ID
  • System schedules reconciliation job to sync state
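
A sketch of the record the form might capture, with the four required fields enforced and a reconciliation job queued on save. The class, function, and queue shapes are hypothetical; plain lists stand in for the audit store and job scheduler.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class ManualOverride:
    platform: str
    action: str
    reason: str
    ticket_id: str
    recorded_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def __post_init__(self) -> None:
        # All four form fields are mandatory: no silent human work.
        for name in ("platform", "action", "reason", "ticket_id"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} is required")


def record_override(
    override: ManualOverride, audit_log: list, reconciliation_queue: list
) -> None:
    """Log the override, then queue a job to sync platform state."""
    audit_log.append(override)
    reconciliation_queue.append(
        {"platform": override.platform, "ticket_id": override.ticket_id}
    )
```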

Approval workflow

Agent proposes change → Approval Queue → Human reviews diff, then:

  • Approve → Execution Service applies
  • Reject → Agent notified with reason
  • Request changes → Plan Agent revises

Audit log: immutable entry.
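
A sketch of how the three review outcomes might be routed, with one audit entry written per decision. The mapping of each outcome to its next step is inferred from the diagram labels, and every name here (`Decision`, `AuditEntry`, `decide`) is hypothetical. A frozen dataclass stands in for an immutable audit row.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    APPROVE = "approve"
    REJECT = "reject"
    REQUEST_CHANGES = "request_changes"


# Assumed routing, inferred from the workflow diagram labels.
NEXT_STEP = {
    Decision.APPROVE: "execution_service_applies",
    Decision.REJECT: "agent_notified_with_reason",
    Decision.REQUEST_CHANGES: "plan_agent_revises",
}


@dataclass(frozen=True)  # frozen: an entry cannot be mutated once written
class AuditEntry:
    change_set_id: str
    reviewer: str
    decision: Decision
    next_step: str
    reason: Optional[str]


def decide(
    change_set_id: str,
    reviewer: str,
    decision: Decision,
    audit_log: list,
    reason: Optional[str] = None,
) -> str:
    """Record a review decision and return the step that should run next."""
    if not change_set_id:
        # Version binding: every approval must reference a change set.
        raise ValueError("change_set_id is required")
    if decision is Decision.REJECT and not reason:
        raise ValueError("a rejection needs a reason")
    entry = AuditEntry(change_set_id, reviewer, decision, NEXT_STEP[decision], reason)
    audit_log.append(entry)
    return entry.next_step
```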

Roles (RBAC — Human Touch)

| Role | Permissions |
| --- | --- |
| Viewer | Read reports and timeline |
| Operator | Approve routine optimizations within policy; resolve A8/A9 with playbook |
| Planner | Approve/create media plans |
| Admin | Access grants, tenant config, guardrail edits |
| Auditor | Read-only audit export |
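
The role table could be encoded as a permission map with a single lookup. The permission strings are hypothetical labels for the rows above, and the sketch assumes roles do not inherit from each other, since the source does not define a hierarchy.

```python
# Hypothetical permission names, one set per Human Touch role.
ROLE_PERMISSIONS: dict[str, set[str]] = {
    "viewer": {"read_reports", "read_timeline"},
    "operator": {"approve_routine_optimization", "resolve_red_flag"},
    "planner": {"approve_media_plan", "create_media_plan"},
    "admin": {"grant_access", "edit_tenant_config", "edit_guardrails"},
    "auditor": {"export_audit"},
}


def is_allowed(roles: set[str], permission: str) -> bool:
    """True if any of the user's roles grants the permission."""
    # Unknown roles grant nothing rather than raising.
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)
```

A user holding several roles gets the union of their permissions, so `is_allowed({"viewer", "planner"}, "approve_media_plan")` is true while a lone viewer is refused.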

System Ops roles (system_developer, system_admin, sre) are defined in System Ops Dashboard and sit in a separate IAM group; senior ops staff may hold a System Ops role alongside Admin.

Versioning model

MediaPlan
  id: uuid
  tenant_id
  version: integer (monotonic)
  status: draft | pending_approval | approved | superseded
  approved_by, approved_at
  payload: channels, budgets, structures, rules

Campaign execution stores plan_version on every platform mutation for traceability.
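
The MediaPlan shape above could be sketched as an immutable record with helpers for revising and approving. Several points are assumptions: that `id` stays the same across versions of one plan, that `tenant_id` is a string, and that approval is only valid from `pending_approval`. The helper names are hypothetical.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Any, Optional
from uuid import UUID, uuid4

STATUSES = ("draft", "pending_approval", "approved", "superseded")


@dataclass(frozen=True)
class MediaPlan:
    id: UUID
    tenant_id: str
    version: int
    status: str
    payload: dict[str, Any]  # channels, budgets, structures, rules
    approved_by: Optional[str] = None
    approved_at: Optional[datetime] = None

    def __post_init__(self) -> None:
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status}")
        if self.version < 1:
            raise ValueError("version starts at 1")


def new_plan(tenant_id: str, payload: dict[str, Any]) -> MediaPlan:
    """Create version 1 of a plan as a draft."""
    return MediaPlan(uuid4(), tenant_id, 1, "draft", payload)


def revise(plan: MediaPlan, payload: dict[str, Any]) -> MediaPlan:
    """Create the next draft; the version increases by exactly one."""
    return MediaPlan(plan.id, plan.tenant_id, plan.version + 1, "draft", payload)


def approve(plan: MediaPlan, approver: str) -> MediaPlan:
    """Return an approved copy; only pending plans can be approved."""
    if plan.status != "pending_approval":
        raise ValueError("only pending_approval plans can be approved")
    return replace(
        plan,
        status="approved",
        approved_by=approver,
        approved_at=datetime.now(timezone.utc),
    )
```

Marking the previously approved version as `superseded` would happen in the same transaction as `approve`; that step is omitted here.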

SLAs (planning defaults)

| Touch type | Target response |
| --- | --- |
| Launch approval | 4 business hours |
| Plan approval | 1 business day |
| Access / verification | 2 business days |
| Optimization over threshold | 4 business hours |
| Red flag (A8 / A9) | 4 business hours |
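
Turning these targets into a due time needs a definition of business hours, which the source does not give. The sketch below assumes Monday to Friday, 09:00 to 17:00, no holidays, one business day equal to 8 business hours, and naive local datetimes. All names are hypothetical.

```python
from datetime import datetime, time, timedelta

DAY_START = time(9, 0)   # assumed start of the business day
DAY_END = time(17, 0)    # assumed end of the business day

# SLA targets in business hours; 1 business day is assumed to be 8 hours.
SLA_BUSINESS_HOURS = {
    "launch_approval": 4,
    "plan_approval": 8,
    "access_verification": 16,
    "optimization_over_threshold": 4,
    "red_flag": 4,
}


def _next_business_start(moment: datetime) -> datetime:
    """Move `moment` forward to the next instant inside business hours."""
    while True:
        next_morning = datetime.combine(moment.date() + timedelta(days=1), DAY_START)
        if moment.weekday() >= 5:            # Saturday or Sunday
            moment = next_morning
        elif moment.time() < DAY_START:      # before opening
            moment = datetime.combine(moment.date(), DAY_START)
        elif moment.time() >= DAY_END:       # at or after closing
            moment = next_morning
        else:
            return moment


def sla_due(touch_type: str, opened_at: datetime) -> datetime:
    """Return the SLA deadline, counting only business hours."""
    remaining = timedelta(hours=SLA_BUSINESS_HOURS[touch_type])
    cursor = _next_business_start(opened_at)
    while True:
        end_of_day = datetime.combine(cursor.date(), DAY_END)
        available = end_of_day - cursor
        if remaining <= available:
            return cursor + remaining
        remaining -= available               # use up today, continue tomorrow
        cursor = _next_business_start(end_of_day)
```

Under these assumptions, a launch approval opened on a Friday at 15:00 uses two hours that day and is due the following Monday at 11:00.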