
Projects

One serious thing at a time, plus the experiments that feed it. Most projects are open about their state — if something is alpha, it says alpha.

Flagship · in development

AI Control Plane for Enterprise Agents

A governance and observability runtime for agentic systems in regulated environments. Brokers every tool call through a signed policy verdict, captures dual model/runtime traces, and surfaces uncertain decisions to a human-review queue before any side effects materialise.
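A minimal sketch of the brokering idea, not the project's actual code: each tool call gets an HMAC-signed verdict, and only a verified "allow" runs the tool. Every name here (`Verdict`, `Broker`, `evaluate`, the 0.8 confidence threshold, the tool allow-list, the signing key) is a hypothetical chosen for illustration.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass, field
from typing import Any, Callable

# Hypothetical key; a real deployment would load it from a secret store.
SIGNING_KEY = b"demo-key-not-for-production"


@dataclass(frozen=True)
class Verdict:
    decision: str     # "allow", "deny", or "review"
    tool: str
    args_digest: str  # SHA-256 of the canonicalised arguments
    signature: str    # HMAC-SHA256 over decision, tool, and digest


def _digest(args: dict[str, Any]) -> str:
    return hashlib.sha256(json.dumps(args, sort_keys=True).encode()).hexdigest()


def _sign(decision: str, tool: str, args_digest: str) -> str:
    msg = f"{decision}|{tool}|{args_digest}".encode()
    return hmac.new(SIGNING_KEY, msg, hashlib.sha256).hexdigest()


def evaluate(tool: str, args: dict[str, Any], confidence: float) -> Verdict:
    """Toy policy (assumed): deny unknown tools, send low-confidence calls to review."""
    allowed = {"search_docs", "send_email"}
    if tool not in allowed:
        decision = "deny"
    elif confidence < 0.8:
        decision = "review"
    else:
        decision = "allow"
    digest = _digest(args)
    return Verdict(decision, tool, digest, _sign(decision, tool, digest))


def verify(verdict: Verdict) -> bool:
    expected = _sign(verdict.decision, verdict.tool, verdict.args_digest)
    return hmac.compare_digest(verdict.signature, expected)


@dataclass
class Broker:
    tools: dict[str, Callable[..., Any]]
    review_queue: list[tuple[str, dict[str, Any]]] = field(default_factory=list)

    def call(self, tool: str, args: dict[str, Any], confidence: float) -> Any:
        verdict = evaluate(tool, args, confidence)
        # Refuse to act on a verdict that fails verification or covers other arguments.
        if not verify(verdict) or verdict.args_digest != _digest(args):
            raise PermissionError("invalid verdict")
        if verdict.decision == "deny":
            raise PermissionError(f"{tool} denied by policy")
        if verdict.decision == "review":
            # Park the call for a human; the tool is not executed.
            self.review_queue.append((tool, args))
            return None
        return self.tools[tool](**args)
```

The point of the shape is that the side effect sits behind the verdict check: a call that is denied or queued for review never reaches the tool function.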

Status: Scaffolding (v0.1 milestone: agent registry + execution tracing)
Stack: Python · FastAPI · LangGraph · Postgres · Kafka · OpenTelemetry · Temporal · Next.js
Role: Architect & sole maintainer (for now).
Public repo: coming soon
Origin: The single flagship of a deliberate 12–18 month transformation from CTO exploring AI to AI systems architect operating in the field.

Experiment

Eval harness for retrieval-grounded agents

Open benchmark scoring agent answers against a frozen corpus with adversarial paraphrase and recency probes.

Stack: Python · DuckDB
Status: Planned

Tool

Trace replay CLI

Re-runs a stored agent session against a different model and produces a structured diff of the divergence.
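A rough sketch of what replay-and-diff could look like, since the tool is still planned. It assumes a session is a list of steps with `input` and `output` fields and that a model is just a callable; `StepDiff`, `replay`, and `diff_sessions` are hypothetical names, not the CLI's interface.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Assumed step shape: {"input": ..., "output": ...}
Step = dict[str, Any]


@dataclass
class StepDiff:
    index: int      # position of the step in the session
    field: str      # which field diverged
    recorded: Any   # value in the stored session
    replayed: Any   # value produced by the other model


def replay(session: list[Step], model: Callable[[Any], Any]) -> list[Step]:
    """Feed each recorded input to the other model and collect its outputs."""
    return [{"input": s["input"], "output": model(s["input"])} for s in session]


def diff_sessions(recorded: list[Step], replayed: list[Step]) -> list[StepDiff]:
    """Compare two sessions step by step and list every field that differs."""
    diffs: list[StepDiff] = []
    for i, (old, new) in enumerate(zip(recorded, replayed)):
        for key in sorted(old.keys() | new.keys()):
            if old.get(key) != new.get(key):
                diffs.append(StepDiff(i, key, old.get(key), new.get(key)))
    return diffs
```

This replays each step independently against the recorded input. A real replay would likely need to decide whether later steps see the recorded context or the new model's own earlier outputs.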

Stack: Python · TypeScript
Status: Planned

Side

Latency atlas

Continuous measurement of frontier-model p50/p95/p99 latency by region, prompt size, and tool-use depth.
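To make the measurement concrete, here is a small sketch of the percentile roll-up, written in Python for illustration even though the project itself is pencilled in for Rust and ClickHouse. The sample tuple layout and the `summarise` helper are assumptions.

```python
from collections import defaultdict
from statistics import quantiles

# Assumed sample layout: (region, prompt_bucket, tool_depth, latency_ms)
Sample = tuple[str, int, int, float]


def summarise(samples: list[Sample]) -> dict[tuple[str, int, int], dict[str, float]]:
    """Group latencies by (region, prompt_bucket, tool_depth) and report p50/p95/p99."""
    groups: dict[tuple[str, int, int], list[float]] = defaultdict(list)
    for region, prompt_bucket, tool_depth, latency_ms in samples:
        groups[(region, prompt_bucket, tool_depth)].append(latency_ms)

    summary: dict[tuple[str, int, int], dict[str, float]] = {}
    for key, values in groups.items():
        if len(values) < 2:
            continue  # quantiles() needs at least two data points
        # n=100 yields 99 cut points; index k-1 is the k-th percentile.
        cuts = quantiles(values, n=100, method="inclusive")
        summary[key] = {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}
    return summary
```

Tail percentiles from small groups are noisy, so a real pipeline would probably also enforce a minimum sample count per group before publishing p99.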

Stack: Rust · ClickHouse
Status: Idea