I build AI agents. I also advise execs on whether they should.
Agentic AI engineering, multi-agent orchestration, and enterprise strategy. 20+ years bridging the gap between deep tech and the boardroom.
20+ years in tech across consumer, industrial, TMT, and public sectors — UK and US. For most of that, my job was translating what engineers build into something a CFO cares about. New revenue. Lower costs. Better experiences.
Then I stopped just advising and started building. Agentic systems with Claude Code and Codex. Multi-agent orchestration. Autonomous triage engines. I've broken enough prototypes at 11pm to know what works versus what looks good in a demo.
The market is full of people who can talk about AI, and people who can build it. I do both. TEDx speaker. Three DOI-registered open-source standards. 20+ public repos and growing fast.
Focus Areas
Open Source
From agent orchestration to observability tooling — building at the intersection of AI capability and production readiness.
Agent Director: cinematic trace debugger for AI agents — playback, replay, diff, and debug agent runs like you're editing video.
Standards-first control plane for agentic browser actions across Gemini Computer Use, AWS Bedrock AgentCore, and ChatGPT Atlas.
Multi-provider agent orchestration sandbox — OpenAI, AWS Bedrock, and Google Vertex, with policy-based approvals and cryptographic audit trails (sketched below, after the project list).
Observability for Anthropic Skills — see the intended skill, the files referenced, the policy approvals, and token/latency shifts.
End-to-end toolkit for observing agentic systems — detecting emergent anomalies in near real time and benchmarking detector performance.
Research-grade harness for stress-testing MCP servers across multiple agent runtimes with reproducible task suites.
Governing Model Context Protocol (MCP) servers with policies, budgets, and verifiable provenance.
GitHub App and Action enforcing policy, provenance, and budget controls across autonomous agent runs. CI/CD guards for AI.
Safety-first auditor for Agent Skills — examine any skill bundle before enabling it with repeatable, standards-aligned checks.
Agentic system that overhears Teams conversations, detects actionable threads, and drafts remediation artefacts with full observability.
Geospatial AI governance research — FEMA NRI, BigQuery Earth Engine, and satellite imagery with auditable, policy-enforced outputs.
Intelligent emotion detection for TTS — Mistral 7B analyzes context and sarcasm to generate emotionally aware speech.
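Several of the projects above, the orchestration sandbox and the CI/CD guard in particular, share one control-plane pattern: gate risky actions behind policy, cap spend, and record every decision in a tamper-evident log. The sketch below is a minimal illustration of that pattern under stated assumptions, not code from any of these repos; the action names, the budget figure, and the AuditLog class are all hypothetical.

```python
import hashlib
import json
import time

# Hypothetical policy: these action names and the budget cap are illustrative only.
APPROVAL_REQUIRED = {"browser.purchase", "repo.force_push"}
BUDGET_USD = 5.00


class AuditLog:
    """Append-only log; each entry embeds the hash of the previous entry,
    so later tampering breaks the chain and is detectable."""

    def __init__(self):
        self.entries = []

    def append(self, event: dict) -> dict:
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = {"ts": time.time(), "prev": prev_hash, "event": event}
        digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        entry = dict(body, hash=digest)
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash and confirm the chain links line up."""
        prev_hash = "0" * 64
        for entry in self.entries:
            body = {"ts": entry["ts"], "prev": entry["prev"], "event": entry["event"]}
            digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
            if entry["prev"] != prev_hash or entry["hash"] != digest:
                return False
            prev_hash = entry["hash"]
        return True


def guard(action: str, cost_usd: float, approved: bool, spent_usd: float, log: AuditLog) -> float:
    """Allow an agent action only if policy and budget permit; log the decision either way."""
    if action in APPROVAL_REQUIRED and not approved:
        log.append({"action": action, "decision": "blocked", "reason": "approval required"})
        raise PermissionError(f"{action} needs an explicit human approval")
    if spent_usd + cost_usd > BUDGET_USD:
        log.append({"action": action, "decision": "blocked", "reason": "budget exceeded"})
        raise PermissionError(f"{action} would exceed the ${BUDGET_USD:.2f} run budget")
    log.append({"action": action, "decision": "allowed", "cost_usd": cost_usd})
    return spent_usd + cost_usd
```

A real control plane would also sign entries and persist them outside the agent's reach; the point here is only the shape of the policy/budget/audit loop.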
Standards & Research
Contributing to the foundations of trustworthy AI — standards, RFCs, and research that shape how agents operate safely at scale.
A provenance and integrity standard for the MCP ecosystem. Cryptographically signed manifests that bind MCP server releases to immutable tool metadata, enabling automated trust verification and preventing tool poisoning in AI agent supply chains.
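For orientation, here is a minimal sketch of what verifying such a manifest could look like, using Ed25519 from the `cryptography` package. The manifest fields and the canonical-JSON convention shown are assumptions for illustration, not the standard's actual wire format.

```python
import hashlib
import json

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey


def canonical_bytes(obj: dict) -> bytes:
    # Assumed canonicalisation: sorted keys, compact separators.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()


def verify_manifest(manifest: dict, signature: bytes, publisher_key: bytes) -> str:
    """Check the publisher's signature over the manifest, then return a digest
    that a registry could pin against the corresponding server release."""
    Ed25519PublicKey.from_public_bytes(publisher_key).verify(
        signature, canonical_bytes(manifest)
    )  # raises InvalidSignature if the manifest was altered after signing
    return hashlib.sha256(canonical_bytes(manifest)).hexdigest()


def is_trusted(manifest: dict, signature: bytes, publisher_key: bytes) -> bool:
    try:
        verify_manifest(manifest, signature, publisher_key)
        return True
    except InvalidSignature:
        return False


# Hypothetical manifest shape: binds a server release to immutable tool metadata.
example_manifest = {
    "server": "example-mcp-server",
    "release": "1.4.2",
    "tools": [{"name": "search_docs", "description": "Search indexed documentation."}],
}
```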
Deterministic bundle identity, content attestation format, and verification tooling for agent skill bundles. Minimal, reproducible, and supply-chain friendly — enabling trust at the skill layer.
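As a rough illustration of "deterministic bundle identity", the sketch below derives a single digest for a skill bundle by hashing file paths and contents in sorted order, so the same files always yield the same identity. The construction is an assumption for illustration, not the spec's defined algorithm.

```python
import hashlib
from pathlib import Path


def bundle_digest(bundle_dir: str) -> str:
    """Deterministic identity for a skill bundle: hash each file's relative
    path, size, and content in sorted path order, so filesystem ordering and
    timestamps never affect the result."""
    root = Path(bundle_dir)
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        data = path.read_bytes()
        rel = path.relative_to(root).as_posix()
        digest.update(f"{rel}\n{len(data)}\n".encode())  # header keeps entries unambiguous
        digest.update(data)
    return digest.hexdigest()
```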
Machine-readable security advisories for MCP tools. Defines a JSON format, feed index, and trust model for registries, hosts, and gateways to automatically block, warn, or remediate vulnerable tools.
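To make the idea concrete, here is a hedged sketch of how a host or gateway might consume such a feed. The advisory fields and the severity-to-action mapping are invented for illustration; the actual JSON format and trust model are defined by the standard itself.

```python
# Hypothetical advisory entries; the real format is defined by the standard.
ADVISORIES = [
    {
        "id": "MCPSA-0001",
        "tool": "example-mcp-server/search_docs",
        "affected_versions": ["1.4.0", "1.4.1"],
        "severity": "high",
    },
]


def evaluate_tool(tool_id: str, version: str, advisories=ADVISORIES) -> str:
    """Return 'block', 'warn', or 'allow' for a tool the host is about to load."""
    for adv in advisories:
        if adv["tool"] == tool_id and version in adv["affected_versions"]:
            return "block" if adv["severity"] in {"critical", "high"} else "warn"
    return "allow"


assert evaluate_tool("example-mcp-server/search_docs", "1.4.1") == "block"
assert evaluate_tool("example-mcp-server/search_docs", "1.4.2") == "allow"
```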