ADR 0001: Tech Stack — TypeScript (Accepted)

Status

Accepted — 11 Jun 2026. Decision owned by the founder (delegated by the parent/VC). Supersedes the prior "TBD/Open" state. Revisit only at the trigger points in Evolution path.

Decision: Option A — all-TypeScript, strict mode, single language across connectors, API, orchestration, and dashboard.

Context

Kobi Digital Ads requires:

API services for platform connectors and domain logic (Google Ads, Meta, TikTok, DV360, GA4, CRM)
Agent orchestration — implemented as a deterministic state machine with a non-AI Cost Guard and a thin LLM planner/QC (not autonomous research agents)
Human-in-the-loop dashboard and client portal
Event-driven architecture
Analytics warehouse integration (BigQuery)
Deployment on Google Cloud (Cloud Run)

Decisive operating reality: the code is authored by AI agents in Cursor (Composer for most work, Claude models for hard tasks) and reviewed at a distance by a non-engineer founder — not hand-written by a Python or TS specialist. Models are hosted (Vertex/Gemini/Claude); there is no in-house ML or self-hosted LLM planned in the mid term. The system must scale to many tenants worldwide over time.

Decision drivers (reframed for this reality)

Driver	Weight	Note
AI-authoring + non-engineer review fit	High	Replaces "team familiarity" — the "team" is the model + a reviewer who doesn't read code line-by-line
Type safety / fail-loud correctness	High	The compiler substitutes for code-level review
Dashboard velocity	High	Next.js HITL UI + client portal
Single mental model / low ops surface	High	Solo operator; one runtime, one pipeline
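Google Cloud / Ads ecosystem libraries	Medium	Node/TS clients exist for the Google Cloud services used (Pub/Sub, BigQuery, Secret Manager); the Google Ads API has no Google-maintained Node.js client library, so that connector needs a community library or direct REST calls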
Agent/ML ecosystem	Low (today)	Orchestration is deterministic; LLMs are API-hosted

Why TypeScript wins for an AI-written, reviewed-at-a-distance codebase

Compiler as safety net: when a model hallucinates a field or passes the wrong shape, tsc fails the build before anything runs; in Python without an enforced type checker, the same mistake surfaces only at runtime, possibly in production. The reviewer cannot catch that by eye; the compiler can. The compiler only checks shapes declared in code, so data arriving from outside still needs the runtime validation required by the guardrails.
Tighter agent feedback loop: Cursor feeds type and lint errors back to the model, which self-corrects. A typed, compiled language gives the agent concrete, machine-checkable errors to fix instead of relying on tests alone.
One language end-to-end: connectors → domain → BFF → dashboard share one type system and one mental model; there is no Python↔TS serialization seam for the AI to get subtly wrong.
ML advantage is moot today: the strongest reason to pick Python (its ML ecosystem) does not apply while optimization is rules plus a thin LLM layer and all models are hosted.
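
To make the "compiler as safety net" point concrete, here is a minimal sketch. The `CampaignBudget` interface and its field names are hypothetical and chosen only for illustration; they are not taken from any platform API.

```typescript
// Hypothetical shape of a campaign budget record (illustrative field names).
interface CampaignBudget {
  campaignId: string;
  dailyBudgetMicros: number;
}

// Micros are millionths of a currency unit.
function toDailyBudgetUnits(record: CampaignBudget): number {
  return record.dailyBudgetMicros / 1_000_000;
}

const sampleBudget: CampaignBudget = {
  campaignId: "c-123",
  dailyBudgetMicros: 25_000_000,
};

// If a model invents a field, the build fails before anything runs.
// Uncommenting the next line makes tsc report:
//   error TS2339: Property 'dailyBudget' does not exist on type 'CampaignBudget'.
// const wrong = sampleBudget.dailyBudget;

console.log(toDailyBudgetUnits(sampleBudget)); // 25
```

The same typo in untyped code would evaluate to `undefined` and propagate silently, which is the failure mode a reviewer at a distance is least likely to spot.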

Options considered

Option A — all-TypeScript (chosen)

API + connectors: Fastify (or NestJS) with Zod validation
Dashboard + portal: Next.js
Orchestration: custom deterministic state machine (already specified)
Pros: single language, compiler-enforced correctness, best AI-authoring loop, strong Cloud Run + Next.js fit
Cons: weaker ML/data ecosystem than Python — not relevant until in-house ML exists
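
As an illustration of what "deterministic state machine with a non-AI Cost Guard" can look like in TypeScript, here is a minimal sketch. The state names, event names, and the spend cap are all assumptions made for this example; the project's already-specified state machine may differ in every detail.

```typescript
// Hypothetical states and events; not the project's specified machine.
type RunState =
  | "planned"
  | "cost_checked"
  | "awaiting_approval"
  | "approved"
  | "executed"
  | "rejected";

type RunEvent =
  | { type: "COST_CHECK"; projectedSpendMicros: number }
  | { type: "REQUEST_APPROVAL" }
  | { type: "APPROVE" }
  | { type: "REJECT" }
  | { type: "EXECUTE" };

const DAILY_CAP_MICROS = 50_000_000; // assumed cap, for illustration only

// Cost Guard: a pure rule with no model call, so it always gives the same answer.
function costGuardAllows(projectedSpendMicros: number): boolean {
  return (
    Number.isFinite(projectedSpendMicros) &&
    projectedSpendMicros >= 0 &&
    projectedSpendMicros <= DAILY_CAP_MICROS
  );
}

// Every legal transition is listed explicitly; anything else fails loudly.
function transition(state: RunState, ev: RunEvent): RunState {
  switch (state) {
    case "planned":
      if (ev.type === "COST_CHECK") {
        return costGuardAllows(ev.projectedSpendMicros) ? "cost_checked" : "rejected";
      }
      break;
    case "cost_checked":
      if (ev.type === "REQUEST_APPROVAL") return "awaiting_approval";
      break;
    case "awaiting_approval":
      if (ev.type === "APPROVE") return "approved";
      if (ev.type === "REJECT") return "rejected";
      break;
    case "approved":
      if (ev.type === "EXECUTE") return "executed";
      break;
    case "executed":
    case "rejected":
      break; // terminal states accept no events
  }
  throw new Error(`Illegal transition: ${state} + ${ev.type}`);
}

// Replays a list of events from the initial state.
function runAll(events: RunEvent[]): RunState {
  let state: RunState = "planned";
  for (const ev of events) state = transition(state, ev);
  return state;
}

console.log(
  runAll([
    { type: "COST_CHECK", projectedSpendMicros: 10_000_000 },
    { type: "REQUEST_APPROVAL" },
    { type: "APPROVE" },
    { type: "EXECUTE" },
  ]),
); // executed
console.log(runAll([{ type: "COST_CHECK", projectedSpendMicros: 90_000_000 }])); // rejected
```

Because execution is reachable only through the cost check and the human approval step, an LLM planner in this sketch can propose events but cannot skip either gate.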

Option B — Python

Strong agent/ML ecosystem and BigQuery tooling, but dynamic typing gives AI-written code far less compile-time safety, and a separate Next.js dashboard re-introduces a split stack anyway.

Option C — Hybrid (Python agents + TS dashboard)

Best-tool-per-job at production scale with a team, but two runtimes/pipelines and a serialization seam are pure tax for a solo, AI-authored pilot — and the seam is exactly where AI introduces subtle bugs.

Consequences — engineering guardrails (mandatory)

Because code is AI-authored and not read line-by-line, guardrails replace code review. Enforced via the Cursor rule .cursor/rules/engineering-standards.mdc:

Strict TypeScript (strict: true, no implicit any, no unjustified @ts-ignore).
Runtime validation at every boundary (Zod) — validate all platform API responses, LLM outputs, inbound requests, and env vars; never trust an external shape.
Types generated from schemas (OpenAPI/event schemas) as the single source of truth.
Self-verification before "done" — the agent must run typecheck + lint + tests + build and leave them green; never ship red.
Tests + CI are the reviewer — green CI is a hard gate for any deploy.
Small, boundaried, stateless services over Pub/Sub; idempotent, retryable workers; per-tenant isolation on every query.
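
The boundary-validation guardrail can be sketched as follows. To keep the example runnable without dependencies it uses a hand-written parser as a stand-in; in the codebase a Zod schema would play this role. The `CampaignRow` shape, the `parseCampaignRow` and `requireEnv` helpers, and the `GCP_PROJECT` variable name are hypothetical.

```typescript
// Hypothetical response shape for a campaign read; not a real platform payload.
interface CampaignRow {
  id: string;
  name: string;
  dailyBudgetMicros: number;
}

type ParseResult<T> = { ok: true; value: T } | { ok: false; error: string };

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Treat the external payload as unknown and prove its shape before use.
function parseCampaignRow(input: unknown): ParseResult<CampaignRow> {
  if (!isRecord(input)) return { ok: false, error: "expected an object" };
  const { id, name, dailyBudgetMicros } = input;
  if (typeof id !== "string" || id.length === 0) {
    return { ok: false, error: "id must be a non-empty string" };
  }
  if (typeof name !== "string") {
    return { ok: false, error: "name must be a string" };
  }
  if (
    typeof dailyBudgetMicros !== "number" ||
    !Number.isInteger(dailyBudgetMicros) ||
    dailyBudgetMicros < 0
  ) {
    return { ok: false, error: "dailyBudgetMicros must be a non-negative integer" };
  }
  return { ok: true, value: { id, name, dailyBudgetMicros } };
}

// Env vars are a boundary too: fail loudly at startup, not on first use.
function requireEnv(env: Record<string, string | undefined>, key: string): string {
  const value = env[key];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required env var: ${key}`);
  }
  return value;
}

const goodRow = parseCampaignRow({ id: "c-1", name: "Brand", dailyBudgetMicros: 5_000_000 });
const badRow = parseCampaignRow({ id: "c-1", name: "Brand", dailyBudget: 5 });
console.log(goodRow.ok, badRow.ok); // true false
```

Returning a result object rather than throwing lets the caller decide whether a malformed platform response is retried, skipped, or escalated, while a missing env var stops the service immediately.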

Infrastructure-as-Code — deferred (realistic for a solo pilot)

Do not adopt Terraform yet. For a solo founder with a handful of resources, Terraform's state/locking overhead is a liability, not a safeguard.

Now: idempotent gcloud provisioning scripts checked into the repo ("IaC lite") — reproducible, reviewable, AI-writable, no state to corrupt.
Adopt Terraform at a trigger: going multi-region (data-residency replication) or onboarding a second engineer — whichever comes first.

Worldwide scale — what actually drives it (not the language)

Language is second-order. Global tenant scale is decided by architecture: stateless, queue-driven work; per-tenant isolation (including Meta child-BM-per-tenant); multi-region deployment and data residency (S8); platform rate-limit management; and the database choice (Cloud SQL/Postgres now; evaluate Spanner only if the workload outgrows a regional Postgres deployment). TypeScript on Cloud Run scales horizontally with request load and does not constrain any of this.
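
Two of these architectural properties, idempotent queue workers and per-tenant isolation, can be sketched in a few lines. Pub/Sub delivers at least once by default, so a handler has to tolerate redelivery. The message shape, the `TenantScopedBudgetStore` class, and `handleBudgetChange` are hypothetical, and the in-memory maps stand in for Postgres tables that would be accessed asynchronously in a real service.

```typescript
// Hypothetical message envelope; redeliveries reuse the same messageId.
interface BudgetChangeMessage {
  messageId: string;
  tenantId: string;
  campaignId: string;
  newDailyBudgetMicros: number;
}

// In-memory stand-in for tenant-scoped tables (illustration only).
class TenantScopedBudgetStore {
  private readonly budgets = new Map<string, number>();
  private readonly processed = new Set<string>();

  // Every key includes tenantId, so no lookup can cross tenants.
  private key(tenantId: string, id: string): string {
    return JSON.stringify([tenantId, id]);
  }

  hasProcessed(tenantId: string, messageId: string): boolean {
    return this.processed.has(this.key(tenantId, messageId));
  }

  markProcessed(tenantId: string, messageId: string): void {
    this.processed.add(this.key(tenantId, messageId));
  }

  setBudget(tenantId: string, campaignId: string, micros: number): void {
    this.budgets.set(this.key(tenantId, campaignId), micros);
  }

  getBudget(tenantId: string, campaignId: string): number | undefined {
    return this.budgets.get(this.key(tenantId, campaignId));
  }
}

function handleBudgetChange(
  store: TenantScopedBudgetStore,
  msg: BudgetChangeMessage,
): "applied" | "duplicate" {
  if (store.hasProcessed(msg.tenantId, msg.messageId)) return "duplicate";
  // The write sets an absolute value, so repeating it after a crash is harmless.
  // With a real database, the write and the mark would share one transaction.
  store.setBudget(msg.tenantId, msg.campaignId, msg.newDailyBudgetMicros);
  store.markProcessed(msg.tenantId, msg.messageId);
  return "applied";
}

const budgetStore = new TenantScopedBudgetStore();
const change: BudgetChangeMessage = {
  messageId: "m-1",
  tenantId: "t1",
  campaignId: "c1",
  newDailyBudgetMicros: 7_000_000,
};

console.log(handleBudgetChange(budgetStore, change)); // applied
console.log(handleBudgetChange(budgetStore, change)); // duplicate
console.log(budgetStore.getBudget("t2", "c1")); // undefined
```

The worker holds no state of its own between messages, which is what lets Cloud Run add or remove instances freely; correctness lives in the store and the message id.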

Evolution path

Add an isolated Python worker (behind Pub/Sub + schemas, not a rewrite) only when in-house ML is justified — i.e., enough clients/data to warrant custom models — or sustained budget pressure justifies self-hosted LLMs. Until then, stay single-language.

GCP mapping

Concern	Service
Compute	Cloud Run
Events	Pub/Sub
Warehouse	BigQuery
Secrets	Secret Manager
Database	Cloud SQL / Postgres (revisit Spanner at global scale)
IaC	`gcloud` scripts now → Terraform at multi-region / 2nd engineer

First build step

Monorepo scaffold (all-TS) + one vertical slice: Google Ads mutate + GA4 read, with strict config, Zod boundaries, and CI gates wired from commit #1.

References

System overview
GCP deployment topology
Engineering standards: .cursor/rules/engineering-standards.mdc (enforced for all AI-authored code)