Jason Lovell · Austin, Texas

I build AI agents. I also advise execs on whether they should.

I work where executive problem-framing meets agent architecture, hands-on build, governance, and proof-of-value. The public trail is specs, white papers, benchmarks, and small systems that test what should actually ship.

Currently: Agent Boundary Assurance · PAISL · nanoassembly

About

Strategy, architecture, and hands-on AI build. That combination is the point.

I started in mobile before smartphones, when the consumer-tech frontier moved one new category at a time. Since then I have worked across most of those categories: mobile, IoT, wearables, tablets, VR and AR, connected home, smart audio. The arc tends to repeat: loud enthusiasm, then a small number of teams quietly figuring out what is real and shipping it.

I have spent years with leadership teams across sectors and most C-suite seats. UK first, then the US. Most serious AI conversations end up at the same question: which of this is real, and where does it create value worth shipping? Before that, I ran my own innovation consultancy. That is where AI caught me.

AlphaGo, OpenAI's Dota 2 work, and the AlphaGo Zero release pulled me in during 2017. I have read the field since: Attention Is All You Need, the GPT-2 staged release, AlphaFold, InstructGPT and RLHF before ChatGPT became a verb, the agent wave, the current scaling-versus-inference-time argument. I am not new here.

A few years ago I stopped just advising and started building properly. Agentic systems with Claude Code, Codex, and Cursor. Multi-agent orchestration. Replay engines, signed evidence objects, local boundary harnesses, vulnerable benchmark corpora. I run the models worth testing, fine-tune what warrants it, and test coding agents, plugins, and MCP servers on real work. Enough prototypes broken at 11pm to know what works and what only looks good in a demo.

What I care about now is the gap between what agentic AI can do, what an organization should trust, and what people will actually use. That gap is where strategy, architecture, build quality, governance, and judgement have to meet. It is also why most of what I publish is open source.

Where I am focused now

  • Standards: Authoring TBOM, SBA, TSA, AAC, and Agent Boundary Assurance. Supply-chain and evidence primitives for agentic systems.
  • Engineering: Replay engines, signed evidence objects, local boundary harnesses, vulnerable benchmark corpora.
  • Advisory: Helping leaders turn ambiguous AI ambition into workflows, controls, prototypes, and deployment decisions.
  • Reading: Mech interp, test-time compute, world models, scalable oversight. See the threads section.

Open source

Agent assurance, boundary evidence, and small research labs.

Four public pillars first: agent-boundary assurance, signed release evidence, vulnerable fixtures, and a compact interpretability sequence. The rest stays close by, but not at the same volume.

Featured

Governance · TypeScript

PAISL / Agent Boundary Assurance

Local-first agent-boundary benchmark and June 2026 white paper defining Agent Boundary Assurance for personal and enterprise AI agents.

Agent boundaries · PAISL · Evidence records

Governance · New · Python

Agent Assurance Case

Draft specification and reference verifier for a portable, signed evidence object that helps decide whether an agentic workflow is ready to ship.

Spec · Ed25519 · Offline verifier

Governance · New · Python

DVAAC

Benchmark fixtures for agent-asset scanners and AAC release assurance. Vulnerable and clean cases, paired with expected findings and verifier-ready evidence.

Benchmark · Fixtures · Assurance

June 2026 deployment stack labs

These are compact supporting artifacts, not the main show: policy layers, skill observability, MCP portability, agent commerce, budget controls, and multi-provider routing.

Tooling

Governance · TypeScript

ProofPack

Tamper-evident receipt and audit-pack system for agent and tool runs. Ed25519 signatures, Merkle trees, and portable proof bundles.

jlov7/ProofPack

Observability · TypeScript

BranchLab

Local-first replay engine for agent runs. Rewind a trace, branch into counterfactual timelines, and simulate policy changes before deployment.

jlov7/branchlab

Orchestration · Python

Baton Studio

Multi-agent control room with a shared world model, causal dependency graph, energy-aware scheduling, and fair baton arbitration.

jlov7/baton-studio

Orchestration · Python

Meta Memory Studio

Memory control plane for agents. Inspect, evolve, and govern what agents remember across sessions.

jlov7/meta-memory-studio

Orchestration · TypeScript

Agent Director

Trace debugger for AI agents. Scrub through execution, replay specific steps, and diff runs side by side.

jlov7/agent-director

Research · TypeScript

Scaffold Arena

Workbench for measuring how scaffold choices change LLM quality, reliability, cost, latency, and failure profile.

jlov7/scaffold-arena

Observability · Python

AMDM

Toolkit for observing agentic systems, detecting emergent anomalies, and benchmarking detector behaviour against synthetic scenarios.

jlov7/AMDM

Lab · 21 R&D experiments
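
The ProofPack entry above mentions Merkle trees for tamper-evident receipts. As a rough illustration of that idea only, and not ProofPack's actual code or format, here is a minimal sketch. The function name `merkle_root`, the use of SHA-256, the 0x00/0x01 domain-separation prefixes, the choice to pair an odd trailing node with itself, and the sample receipts are all assumptions made for the example.

```python
import hashlib


def _h(data: bytes) -> bytes:
    """SHA-256 digest helper (hash choice is an assumption for this sketch)."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Return the Merkle root over a non-empty list of leaf payloads."""
    if not leaves:
        raise ValueError("need at least one leaf")
    # Prefix leaves with 0x00 and interior nodes with 0x01 so a leaf can
    # never be reinterpreted as a pair of child hashes.
    level = [_h(b"\x00" + leaf) for leaf in leaves]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            left = level[i]
            # Assumption: an odd trailing node is paired with itself.
            right = level[i + 1] if i + 1 < len(level) else left
            nxt.append(_h(b"\x01" + left + right))
        level = nxt
    return level[0]


# Hypothetical receipts for three tool calls in one agent run.
receipts = [
    b'{"step":1,"tool":"search"}',
    b'{"step":2,"tool":"write_file"}',
    b'{"step":3,"tool":"send"}',
]
root = merkle_root(receipts)

# Changing any single receipt changes the root, which is what makes
# the bundle tamper-evident once the root is signed.
tampered = merkle_root(
    [receipts[0], b'{"step":2,"tool":"delete_file"}', receipts[2]]
)
print(root.hex())
print(root != tampered)
```

In a design like this, only the root needs a signature: an auditor who recomputes it from the receipts can tell whether any entry was altered, added, or dropped.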

Threads I track

What I read when I am not building, and where it shows up in the work.

Most agentic-AI work needs help from outside the LLM stack. The list below is the current reading list, ordered roughly by how much it is shaping what I am shipping: boundary evidence, release assurance, interpretability, world models, and practical deployment controls.

  • Agent boundaries

    The PAISL white paper defines Agent Boundary Assurance: evidence for what an agent accessed, remembered, transformed, sent, blocked, or executed across local and enterprise settings.

    ABA white paper
  • Personal AI

    PAISL treats the benchmark object as the run itself: scenario, data items, consent state, boundary decisions, tool trace, egress record, scorecard, and failure cases.

    PAISL
  • Scalable oversight

    Most agentic releases now exceed what one reviewer can verify by hand. AAC and DVAAC put the release decision into signed evidence and deterministic fixture checks.

    AAC and DVAAC
  • Mechanistic interpretability

    nanocircuits, nanofeatures, and nanoassembly ask the same practical question at three levels: when does a circuit claim beat the strongest cheap baseline?

    nanoassembly
  • Subliminal learning

    GhostTrace measures behavioural half-life under recursive self-distillation. The toy result is supported; the local LLM tier is negative boundary evidence, not a recursive LLM claim.

    GhostTrace
  • Inference-time compute

    Test-time scaling, process reward models, and search-at-inference change what a release decision means. AAC's verifier recomputes outcomes without an LLM in the loop.

    AAC verifier
  • World models

    nanoAWM is the small symbolic lab: learned consequence simulation in MiniOS, not production safety validation. Baton Studio is the multi-agent control-room side of the same interest.

    nanoAWM
  • Verifier-first measurement

    ProofKern keeps the negative result instead of sanding it down: four MLX-relative kernel wins become zero cross-framework wins against torch.compile on the same GPU.

    ProofKern
  • Faithful chain-of-thought

    If CoT stops being monitorable under heavy RL training, a lot of current oversight assumptions fall over. Tracking the faithfulness and process-supervision literature closely.

  • Formal methods

    SMT solvers (Z3) and deterministic verifiers as a route to release decisions an auditor can check offline. Where neuro-symbolic verification becomes practical for agent systems.

    AAC verifier
  • Conformal prediction

    Conformal calibration and selective classification for LLM outputs and abstention. Active interest, no published artifact yet.

  • Knowledge graphs

    KG-augmented retrieval and corpus intelligence inside scaffold engineering work.

    Scaffold Arena
  • Geospatial AI

    FEMA NRI, BigQuery Earth Engine, and satellite imagery for compliance-aware geospatial underwriting.

    TerraRisk Agent
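
The conformal prediction thread above has no published artifact, so the following is only a generic sketch of split conformal calibration with abstention, not code from any of the projects listed. It assumes a classifier that returns per-label probabilities and uses 1 minus the probability as the nonconformity score. The function names, the labels, and the calibration scores are all hypothetical.

```python
import math


def conformal_threshold(cal_scores: list[float], alpha: float) -> float:
    """Split-conformal threshold from held-out nonconformity scores."""
    n = len(cal_scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: accept every label.
        return math.inf
    return sorted(cal_scores)[k - 1]


def prediction_set(probs: dict[str, float], qhat: float) -> set[str]:
    """Keep each label whose nonconformity score (1 - p) is within qhat."""
    return {label for label, p in probs.items() if 1.0 - p <= qhat}


def answer_or_abstain(probs: dict[str, float], qhat: float):
    """Answer only when exactly one label survives; otherwise abstain."""
    labels = prediction_set(probs, qhat)
    return next(iter(labels)) if len(labels) == 1 else None


# Hypothetical calibration scores: 1 - p(true label) on held-out examples.
cal_scores = [0.05, 0.10, 0.12, 0.20, 0.25, 0.30, 0.35, 0.40, 0.90]
qhat = conformal_threshold(cal_scores, alpha=0.2)

confident = {"approve": 0.85, "escalate": 0.10, "reject": 0.05}
unsure = {"approve": 0.45, "escalate": 0.40, "reject": 0.15}

print(qhat)
print(answer_or_abstain(confident, qhat))
print(answer_or_abstain(unsure, qhat))
```

The coverage guarantee behind this construction holds only when calibration and test examples are exchangeable, which is the part that needs care for LLM outputs under distribution shift.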

Standards

Boundary evidence, supply-chain specs, and release assurance.

The work below is independent and artifact-backed. Some items are draft specs; the new Agent Boundary Assurance paper is a white paper. The common thread is simple: agent supply chains, tool use, and release decisions should leave evidence that someone else can inspect.

  • White paper · 2026

    Agent Boundary Assurance / PAISL

    Independent white paper defining an evidence discipline for local and enterprise agents: what they accessed, remembered, transformed, sent, blocked, or executed. It connects PAISL, TBOM, SBA, TSA, and ABER into a reviewable boundary-assurance model.

  • Standard · 2026

    TBOM: Tool Bill of Materials

    A provenance and integrity standard for the MCP ecosystem. Cryptographically signed manifests that bind MCP server releases to immutable tool metadata, supporting automated trust verification and reducing the surface for tool poisoning in AI agent supply chains.

  • Standard · 2026

    SBA: Skill Bundle Attestation

    Deterministic bundle identity, content attestation format, and verification tooling for agent skill bundles. Minimal, reproducible, and supply-chain friendly. Adds trust at the skill layer.

  • Standard · 2026

    TSA: Tool Security Advisory

    Machine-readable security advisories for MCP tools. Defines a JSON format, a feed index, and a trust model so registries, hosts, and gateways can automatically block, warn, or remediate vulnerable tools.

  • Standard · 2026

    AAC: Agent Assurance Case

    Portable, signed, audit-grade evidence object for agentic AI release assurance. Binds inventory, detector coverage, findings, policy decisions, release conditions, compliance evidence, a deterministic verdict, and an Ed25519 signature. Verifiable offline.
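
The SBA entry above describes deterministic bundle identity. One common way to get that property is to hash files in a fixed order, so the same contents always produce the same identifier. The sketch below illustrates the general technique only; it is not the SBA format. The function name `bundle_digest`, the `sha256:` prefix, the length-prefixed path encoding, and the sample file names are assumptions for the example.

```python
import hashlib


def bundle_digest(files: dict[str, bytes]) -> str:
    """Hash a bundle so identical contents always yield the same identity."""
    outer = hashlib.sha256()
    # Sorting paths removes any dependence on filesystem or insertion order.
    for path in sorted(files):
        path_bytes = path.encode("utf-8")
        # Length-prefix the path so ("ab", "c...") and ("a", "bc...")
        # cannot produce the same byte stream.
        outer.update(len(path_bytes).to_bytes(4, "big"))
        outer.update(path_bytes)
        # Hash each file separately so every entry has a fixed width.
        outer.update(hashlib.sha256(files[path]).digest())
    return "sha256:" + outer.hexdigest()


# Hypothetical skill bundle, built twice with different insertion order.
bundle_a = {"SKILL.md": b"# Summarize\n", "run.py": b"print('ok')\n"}
bundle_b = {"run.py": b"print('ok')\n", "SKILL.md": b"# Summarize\n"}
modified = {"SKILL.md": b"# Summarize\n", "run.py": b"print('pwned')\n"}

print(bundle_digest(bundle_a) == bundle_digest(bundle_b))
print(bundle_digest(bundle_a) == bundle_digest(modified))
```

A signature over an identifier like this would then attest to the exact bundle contents, independent of how or where the bundle was packed.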

Contact

Worth a conversation?

Maybe you are working out whether agentic AI is real for your situation, building something and want another set of eyes on it, or hiring for a senior AI role. Any of those is fair.

Replies usually inside a working day.

Jason Lovell · 2026 · Built with Next.js on Vercel