Architecture · Draft
System Overview
Purpose
This document describes the high-level architecture of the Kobi Digital Ads module. Diagrams and GCP references are illustrative; the tech stack is locked to all-TypeScript (ADR 0001).
Logical architecture
Core principles
- Agency-owned accounts — All ad accounts live under Kobi/agency hierarchy; clients never receive admin on raw platforms.
- GA4 as source of truth — Cross-channel measurement and optimization signals anchor on GA4; platform metrics are secondary checks.
- Agents execute, humans approve — Automation by default; explicit approval gates for spend, structural changes, and compliance-sensitive actions.
- Everything versioned — Media plans, campaign configs, and approval decisions carry version IDs and audit trails.
- Connector isolation — Each platform is a bounded connector; core domain logic does not embed platform-specific APIs.
- Event-driven — State changes (onboarding complete, plan approved, campaign live) propagate via async events for loose coupling.
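The connector-isolation and event-driven principles above can be sketched together as follows. This is a minimal illustration, not the module's actual code: `EventBus`, `PlatformConnector`, `FakeMetaConnector`, `wireExecution`, and the event names are all hypothetical, and the in-process bus dispatches synchronously as a stand-in for a real asynchronous queue.

```typescript
// Sketch only: every name below is hypothetical, not taken from the Kobi codebase.
type DomainEvent =
  | { type: "onboarding.complete"; tenantId: string }
  | { type: "plan.approved"; tenantId: string; planVersion: string }
  | { type: "campaign.live"; tenantId: string; campaignId: string };

type Handler = (event: DomainEvent) => void;

// In-process stand-in for a real queue; dispatch here is synchronous.
class EventBus {
  private handlers = new Map<string, Handler[]>();

  subscribe(type: DomainEvent["type"], handler: Handler): void {
    const list = this.handlers.get(type) ?? [];
    list.push(handler);
    this.handlers.set(type, list);
  }

  publish(event: DomainEvent): void {
    for (const handler of this.handlers.get(event.type) ?? []) handler(event);
  }
}

// The only surface the core domain sees; platform API details stay behind it.
interface PlatformConnector {
  readonly platform: string;
  createCampaign(tenantId: string, planVersion: string): string;
}

// Fake connector: a real one would call the platform API here.
class FakeMetaConnector implements PlatformConnector {
  readonly platform = "meta";
  createCampaign(tenantId: string, planVersion: string): string {
    return `${this.platform}:${tenantId}:${planVersion}`;
  }
}

// Core wiring: react to "plan.approved", then announce "campaign.live" per connector.
function wireExecution(bus: EventBus, connectors: PlatformConnector[]): void {
  bus.subscribe("plan.approved", (event) => {
    if (event.type !== "plan.approved") return;
    for (const connector of connectors) {
      const campaignId = connector.createCampaign(event.tenantId, event.planVersion);
      bus.publish({ type: "campaign.live", tenantId: event.tenantId, campaignId });
    }
  });
}
```

Because the core only depends on the `PlatformConnector` interface and on event types, adding a platform in this sketch means registering another connector rather than touching the wiring.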
Major components
| Component | Responsibility |
|---|---|
| Orchestrator | Routes work across lifecycle stages; enforces prerequisites (e.g. tracking live before spend) |
| Onboarding Service | Account provisioning, Business Manager (BM) linking, verifications, Merchant Center setup |
| Planning Service | Media plan CRUD, budget allocation, channel mix, versioning |
| Execution Service | Campaign/ad group/ad creation, launch, pause, budget application |
| Optimization Service | Rules + ML/agent-driven bid/budget/audience/creative changes within guardrails |
| Reporting Service | Aggregates GA4 + platform + CRM; client and ops views |
| Feed Management | Product/catalog feeds, validation, sync schedules |
| Conversion Tracking | Tag deployment spec, server-side forwarding, Conversions API (CAPI) and offline conversion pipelines |
| Human Control Plane | Queue of pending touches, approval UI, rollback |
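As a rough illustration of how the Orchestrator might enforce prerequisites such as "tracking live before spend", the sketch below checks a per-stage list of required flags. The stage names, the `TenantState` flags, and the `PREREQUISITES` map are assumptions made for the example, not the documented lifecycle.

```typescript
// Sketch only: stage names, flags, and the prerequisite map are hypothetical.
type Stage = "planning" | "execution" | "optimization";

interface TenantState {
  onboardingComplete: boolean;
  trackingLive: boolean;
  planApproved: boolean;
  campaignsLive: boolean;
}

type Flag = keyof TenantState;

type GateResult =
  | { allowed: true }
  | { allowed: false; missing: Flag[] };

// Flags that must be true before a tenant may enter each stage.
const PREREQUISITES: Record<Stage, Flag[]> = {
  planning: ["onboardingComplete"],
  execution: ["onboardingComplete", "trackingLive", "planApproved"],
  optimization: ["campaignsLive"],
};

// Returns the unmet prerequisites so the caller can surface them to a human.
function canEnter(stage: Stage, state: TenantState): GateResult {
  const missing = PREREQUISITES[stage].filter((flag) => !state[flag]);
  return missing.length === 0 ? { allowed: true } : { allowed: false, missing };
}
```

Returning the list of missing flags, rather than a bare boolean, lets the gate feed a pending-touch queue with a concrete reason for the block.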
Data flows (summary)
Multi-tenancy
Each client tenant has:
- Isolated configuration (vertical, budgets, brand rules)
- Mapped platform account IDs (stored in tenant registry; credentials in secret manager)
- Separate approval policies (e.g. school vs clinic compliance rules)
- Row-level isolation in warehouse and audit logs
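One way to picture the tenant registry described above is a record that holds account IDs and secret references side by side, with credentials themselves kept out of the row. The field names, the `TenantRegistry` class, and the `rowsForTenant` helper are hypothetical; the secret reference strings are placeholders in the shape of a secret-manager resource path.

```typescript
// Sketch only: field names and helper names are assumptions for illustration.
type Platform = "google" | "meta" | "tiktok";

interface TenantRecord {
  tenantId: string;
  vertical: string; // e.g. "school" or "clinic"
  approvalPolicyId: string; // per-tenant approval policy
  platformAccounts: Partial<Record<Platform, string>>; // mapped platform account IDs
  secretRefs: Partial<Record<Platform, string>>; // pointers into the secret manager, never raw credentials
}

class TenantRegistry {
  private rows = new Map<string, TenantRecord>();

  add(record: TenantRecord): void {
    if (this.rows.has(record.tenantId)) {
      throw new Error(`duplicate tenant ${record.tenantId}`);
    }
    this.rows.set(record.tenantId, record);
  }

  accountFor(tenantId: string, platform: Platform): string | undefined {
    return this.rows.get(tenantId)?.platformAccounts[platform];
  }
}

// Row-level isolation: reads are filtered by tenantId before leaving the data layer.
function rowsForTenant<T extends { tenantId: string }>(rows: T[], tenantId: string): T[] {
  return rows.filter((row) => row.tenantId === tenantId);
}
```

In a warehouse the same filter would normally be enforced by the query layer or a row-access policy; the in-memory helper only shows the intent.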
Scale tiers and global deployment
The architecture is scale-out by design — not a monolith that must be rewritten at 50 or 500 clients.
| Tier | Tenants | Software posture | Platform / ops prerequisites |
|---|---|---|---|
| Pilot | 1–5 | Single region; scale-to-zero services | Sandbox + Basic API tiers |
| Growth | ~50 | Same stack; quota-aware job queues | Meta 2-Tier; Google Basic/Standard path |
| Scale | ~200 | Tenant sharding in registry; connector worker pools | Full Meta tier; multi-MCC if needed |
| Stretch | ~500+ | Regional shards + multi-parent BM/MCC sharding | Credit headroom, ops runbooks, residency pins |
What already scales (no redesign needed)
- Event-driven connectors — Pub/Sub + Cloud Run Jobs absorb API call volume; back-pressure via queues, not bigger VMs.
- Tenant registry — every client is a row + secret refs; adding tenant #500 is data, not a new deployment model.
- Platform isolation — Google/Meta/TikTok connectors are bounded; horizontal workers per platform.
- LLM cost control — Cost Guard + model router prevent runaway inference as tenant count grows.
- Client-facing latency — first-party tag relay uses Cloudflare anycast edge globally (when relay SKU is enabled).
What we add as client count grows (evolution, not rewrite)
| Trigger | Addition |
|---|---|
| ~100 Meta tenants | Multi-parent BM sharding (blast-radius isolation) |
| Global client mix + residency contracts | Regional control-plane shard per geography (EU, MENA, …), keyed by each tenant's `home_region` |
| High API volume | Per-platform rate-limit scheduler (token bucket per tenant + global ceiling) |
| 24/7 approval SLAs | More automation + policy; optional follow-the-sun ops — not duplicate stacks per timezone |
| Heavy reporting | BigQuery partition pruning by tenant_id; Batch API for scheduled reports |
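The per-platform rate-limit scheduler in the table (token bucket per tenant plus a global ceiling) could look roughly like the sketch below. Class names, capacities, and the explicit `nowMs` clock parameter are assumptions chosen to keep the example deterministic; real limits would come from each platform's quota.

```typescript
// Sketch only: names and numbers are illustrative, not platform quotas.
class TokenBucket {
  private capacity: number;
  private refillPerSec: number;
  private tokens: number;
  private lastRefillMs: number;

  constructor(capacity: number, refillPerSec: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.tokens = capacity; // buckets start full
    this.lastRefillMs = nowMs;
  }

  // Adds tokens for the elapsed time, capped at capacity, and returns the balance.
  available(nowMs: number): number {
    const elapsedSec = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefillMs = nowMs;
    return this.tokens;
  }

  take(nowMs: number): boolean {
    if (this.available(nowMs) < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

class PlatformRateLimiter {
  private perTenant = new Map<string, TokenBucket>();
  private globalBucket: TokenBucket;
  private tenantCapacity: number;
  private tenantRefillPerSec: number;

  constructor(
    tenantCapacity: number,
    tenantRefillPerSec: number,
    globalCapacity: number,
    globalRefillPerSec: number,
    nowMs: number,
  ) {
    this.tenantCapacity = tenantCapacity;
    this.tenantRefillPerSec = tenantRefillPerSec;
    this.globalBucket = new TokenBucket(globalCapacity, globalRefillPerSec, nowMs);
  }

  // A call proceeds only if both the tenant bucket and the global bucket have a token.
  tryAcquire(tenantId: string, nowMs: number): boolean {
    let bucket = this.perTenant.get(tenantId);
    if (!bucket) {
      bucket = new TokenBucket(this.tenantCapacity, this.tenantRefillPerSec, nowMs);
      this.perTenant.set(tenantId, bucket);
    }
    // Check both first so a refused call does not consume a token from either bucket.
    if (bucket.available(nowMs) < 1 || this.globalBucket.available(nowMs) < 1) return false;
    bucket.take(nowMs);
    this.globalBucket.take(nowMs);
    return true;
  }
}
```

Checking both buckets before taking from either keeps a busy tenant from draining the shared ceiling with calls that would have been refused anyway. A refused call would typically be requeued rather than dropped.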
Global clients — what "worldwide" means
- Campaigns target the client's local ad market (country/region on each platform) — this is normal agency operations, not a special Kobi feature.
- Control plane (orchestrator, HITL, connectors) can start in one GCP region (`europe-west1` by default) and serve clients globally via platform APIs.
- Data residency is tenant-configurable in the target state: pin warehouse + secrets to EU (or other) when contracts require; not every tenant needs their own region on day one.
- Not in MVP: active-active multi-region failover for the control plane — document as Phase 3+ when revenue or SLAs justify it.
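The placement rules above can be summarised in a small resolver: the control plane stays in the single default region, and only tenants with a residency requirement get their data pinned elsewhere. This is a sketch under assumptions: the geography keys, the `REGIONAL_SHARDS` mapping, the `residencyRequired` flag, and `resolvePlacement` are hypothetical, and only `home_region` and the `europe-west1` default come from the text.

```typescript
// Sketch only: geography keys and the region mapping are illustrative assumptions.
const DEFAULT_REGION = "europe-west1";

const REGIONAL_SHARDS: Record<string, string> = {
  EU: "europe-west1",
  MENA: "me-central1",
};

interface TenantPlacementInput {
  tenantId: string;
  home_region?: string; // geography key, e.g. "EU"
  residencyRequired: boolean; // true when a contract pins where data may live
}

interface Placement {
  controlPlaneRegion: string; // single region in this sketch
  dataRegion: string; // where warehouse data and secrets are pinned
}

function resolvePlacement(tenant: TenantPlacementInput): Placement {
  if (!tenant.residencyRequired) {
    return { controlPlaneRegion: DEFAULT_REGION, dataRegion: DEFAULT_REGION };
  }
  const shard: string | undefined = tenant.home_region
    ? REGIONAL_SHARDS[tenant.home_region]
    : undefined;
  // Fail closed: a pinned tenant without a known region must not fall back silently.
  if (!shard) {
    throw new Error(`no data region configured for tenant ${tenant.tenantId}`);
  }
  return { controlPlaneRegion: DEFAULT_REGION, dataRegion: shard };
}
```

Throwing for a pinned tenant with no mapped region is a deliberate choice in this sketch, since a silent fallback to the default region could breach a residency contract.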