BranchLab
Local-first replay engine for AI agents — rewind any run, branch into counterfactual 'what-if' timelines, simulate policy changes, and produce evidence-grade audit reports.
I build AI agents. I also advise execs on whether they should.
Agentic AI engineering, multi-agent orchestration, and enterprise strategy. 20+ years bridging the gap between deep tech and the boardroom.
20+ years in tech across consumer, industrial, TMT, and public sectors — UK and US. For most of that, my job was translating what engineers build into something a CFO cares about. New revenue. Lower costs. Better experiences.
Then I stopped just advising and started building. Agentic systems with Claude Code and Codex. Multi-agent orchestration. Autonomous triage engines. I've broken enough prototypes at 11pm to know what works versus what looks good in a demo.
The market is full of people who can talk about AI and people who can build it. I do both. TEDx speaker. Three DOI-registered open-source standards. 20+ public repos and growing fast.
Focus Areas
Open Source
From agent orchestration to observability tooling — building at the intersection of AI capability and production readiness.
Featured
Local-first replay engine for AI agents — rewind any run, branch into counterfactual 'what-if' timelines, simulate policy changes, and produce evidence-grade audit reports.
Multi-agent control room — shared world model, causal dependency graph, energy-aware scheduling, and fair baton arbitration for coordinated agent teams.
Cryptographic verification for AI agent runs — Ed25519 signatures, Merkle audit trees, and tamper-evident bundles that prove exactly what an agent did.
Memory control plane for AI agents — inspect, edit, replay, and govern agent memory across sessions with full observability and policy enforcement.
Standards-first control plane for agentic browser actions across Gemini Computer Use, AWS Bedrock AgentCore, and ChatGPT Atlas.
Observability for Anthropic Skills — see which skill was intended, files referenced, policy approvals, and token/latency shifts.
Governing Model Context Protocol (MCP) servers with policies, budgets, and verifiable provenance.
End-to-end toolkit for observing agentic systems — detecting emergent anomalies in near real time and benchmarking detector performance.
Multi-provider agent orchestration sandbox — OpenAI, AWS Bedrock, Google Vertex with policy-based approvals and cryptographic audit trails.
GitHub App and Action enforcing policy, provenance, and budget controls across autonomous agent runs. CI/CD guards for AI.
Agentic system that overhears Teams conversations, detects actionable threads, and drafts remediation artefacts with full observability.
Safety-first auditor for Agent Skills — examine any skill bundle before enabling it with repeatable, standards-aligned checks.
Enterprise-grade scaffold-engineering workbench — demonstrates that the orchestration code around an LLM, not the model alone, dominates outcomes in quality, reliability, and cost.
Research-grade harness for stress-testing MCP servers across multiple agent runtimes with reproducible task suites.
Local-first analysis cockpit for code/docs corpora using Recursive Language Models — deep corpus intelligence without sending data to the cloud.
Policy-first manifest and supply-chain manager for agent skills — declarative governance for what runs, where, and under what constraints.
AI agent containment research — sandboxed execution boundaries, capability restriction, and escape-path analysis for autonomous agents.
Geospatial AI governance research — FEMA NRI, BigQuery, Earth Engine, and satellite imagery with auditable, policy-enforced outputs.
Intelligent emotion detection for TTS — Mistral 7B analyzes context and sarcasm to generate emotionally aware speech.
Reproducible benchmark quantifying how progressive disclosure inside Agent Skills affects latency, context load, and output quality.
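The tamper-evident run bundles mentioned above rest on a standard construction: hash each event in an agent's run log into a Merkle tree, then sign only the root. Any edit to any recorded event changes the root, so one signature covers the whole run. A minimal sketch in Python with illustrative event names — this shows the general technique, not the project's actual bundle format:

```python
import hashlib

def merkle_root(leaves: list[bytes]) -> bytes:
    """Fold a list of run events into a single Merkle root hash."""
    if not leaves:
        return hashlib.sha256(b"").digest()
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:  # duplicate the last node on odd-sized levels
            level.append(level[-1])
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]

# An agent run recorded as an append-only event log (names illustrative).
events = [b"tool_call:search", b"tool_result:3 hits", b"final_answer:done"]
root = merkle_root(events)

# Tampering with any single event yields a different root, so signing the
# root (e.g. with Ed25519) makes the entire run tamper-evident.
tampered = merkle_root(
    [b"tool_call:search", b"tool_result:0 hits", b"final_answer:done"]
)
assert root != tampered
```

In practice the root would be signed with an Ed25519 key and shipped alongside the event log, letting a verifier recompute the root and check the signature without trusting the agent runtime.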
Live Activity
Standards & Research
Contributing to the foundations of trustworthy AI — standards, RFCs, and research that shape how agents operate safely at scale.
A provenance and integrity standard for the MCP ecosystem. Cryptographically signed manifests that bind MCP server releases to immutable tool metadata, enabling automated trust verification and preventing tool poisoning in AI agent supply chains.
Deterministic bundle identity, content attestation format, and verification tooling for agent skill bundles. Minimal, reproducible, and supply-chain friendly — enabling trust at the skill layer.
Machine-readable security advisories for MCP tools. Defines a JSON format, feed index, and trust model for registries, hosts, and gateways to automatically block, warn, or remediate vulnerable tools.
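To make the advisory idea concrete: a registry or gateway ingests a JSON feed of advisories and maps each installed tool version to an action. The sketch below is purely illustrative — the field names and version-matching rule are assumptions for demonstration, not the published schema:

```python
import json

# Hypothetical advisory record; field names are illustrative, not the
# standard's actual schema.
advisory = {
    "id": "MCPSEC-2025-0001",
    "tool": "example/file-reader",
    "affected_versions": "<1.4.0",
    "severity": "high",
    "recommended_action": "block",  # one of: block | warn | remediate
}

def action_for(advisory: dict, installed_version: str) -> str:
    """Decide what a host or gateway should do for an installed tool."""
    ceiling = advisory["affected_versions"].lstrip("<")
    affected = tuple(map(int, installed_version.split("."))) < tuple(
        map(int, ceiling.split("."))
    )
    return advisory["recommended_action"] if affected else "allow"

# A feed is just a JSON list of advisory records plus an index.
feed = json.dumps([advisory], indent=2)

print(action_for(advisory, "1.3.2"))  # affected  -> block
print(action_for(advisory, "1.4.0"))  # patched   -> allow
```

The point of a machine-readable format is exactly this: hosts can evaluate advisories automatically at install or invocation time instead of relying on humans reading prose bulletins.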