Agentic systems · guardrails · real stakes

I don't demo agents. I build agentic systems for enterprise.

I'm a Data Collection Lead on a robotics program who designs and runs coordinator-and-specialist agent systems with guardrails built in. I operate them against real stakes: a car-rental business whose back office runs on a team of agents I built, live markets through read-only samplers, and production data pipelines. The interesting part is not what an agent can do; it is what it is allowed to do.


Claude Certified Architect (Foundations) · Anthropic

Open to: AI and agent engineering, system design, technical program management, and data-collection programs for autonomous vehicles and robotics.

The safety floor

Anyone can make an agent act. The engineering is making it stop. Every system I build runs under a permission model: routine work is allowed, consequential work waits for a human, and a few things can never happen at all.

Permission model · Illustrative

Sample actions an agent might attempt, and the verdict each one gets. Allowed runs on its own, Needs approval pauses for a human, Blocked is a floor the system cannot cross.

An illustration of the guardrail model my systems run under, not a live or connected system. No real action is taken here.
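
A minimal sketch of how a three-verdict permission model like this could be expressed in Python. The action names, the `POLICY` table, and the choice to send unrecognized actions to a human are all assumptions made for illustration; this is not the code behind any system described on this page.

```python
from enum import Enum


class Verdict(Enum):
    ALLOWED = "allowed"
    NEEDS_APPROVAL = "needs approval"
    BLOCKED = "blocked"


# Hypothetical actions, chosen only to show one example per verdict.
POLICY = {
    "draft_reply": Verdict.ALLOWED,
    "issue_refund": Verdict.NEEDS_APPROVAL,
    "move_funds": Verdict.BLOCKED,
}


def check(action: str) -> Verdict:
    """Look up the verdict; anything not listed waits for a human (assumed default)."""
    return POLICY.get(action, Verdict.NEEDS_APPROVAL)


def run(action: str, approved_by_human: bool = False) -> str:
    """Apply the verdict. Human approval never lifts a blocked action."""
    verdict = check(action)
    if verdict is Verdict.BLOCKED:
        return "refused"
    if verdict is Verdict.NEEDS_APPROVAL and not approved_by_human:
        return "paused"
    return "executed"
```

The point of the sketch is the ordering in `run`: the blocked check comes first, so approval can release a paused action but cannot cross the floor.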

Agent-driven delivery

The most honest proof of the approach is the site you are reading. It was designed, built, rebuilt, tested, and deployed by orchestrating my own agents, with a human keeping the decisions and the agents on the legwork.

An ideation think-tank proposes directions in parallel, specialist build agents implement, an auditor gate reviews every wave and returns pass, warn, or block, and persona-based testers stress the result. The loop runs to a written definition of done. One operator covers ground that traditionally spans several roles: design, front-end build, QA, accessibility review, and copy.

Parallel specialists: Independent work runs at the same time instead of one model doing everything in sequence.
Auditor-gated waves: Every wave is reviewed and returns pass, warn, or block before the next one starts.
Persona testing built in: Persona-based testers click through the result and loop findings back until it meets the bar.
One-operator leverage: A single person orchestrates the work that usually needs a small multi-role team.
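
The wave-and-gate loop above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the orchestration engine itself: the `audit` rules, the string outputs, and the decision to let a "warn" continue while a "block" halts the run are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

Task = Callable[[], str]


def audit(outputs: list[str]) -> str:
    """Toy auditor: block on a flagged failure, warn on a TODO, otherwise pass."""
    if any("FAIL" in out for out in outputs):
        return "block"
    if any("TODO" in out for out in outputs):
        return "warn"
    return "pass"


def run_waves(waves: list[list[Task]]) -> list[str]:
    """Run each wave's tasks in parallel, gate the wave, stop at the first block."""
    verdicts = []
    with ThreadPoolExecutor() as pool:
        for wave in waves:
            # Tasks inside one wave are independent, so they run concurrently.
            outputs = list(pool.map(lambda task: task(), wave))
            verdict = audit(outputs)
            verdicts.append(verdict)
            if verdict == "block":
                break  # later waves never start
    return verdicts
```

Calling `run_waves` with a list of waves, each a list of zero-argument callables, returns one verdict per wave that actually ran.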

Estimate: What this approach saves, assumptions shown

A hand-built equivalent at this level of polish, interactivity, accessibility, and QA is realistically one to two weeks of focused solo work, or a small multi-person sprint. Orchestrating agents compressed that to about two days of part-time work, with the build-and-QA loop running in hours rather than weeks.

Assumptions: a freelance web rate of roughly $75 to $150 per hour, and a one-to-two-week hand-built baseline for the same scope. On those assumptions the saved effort lands on the order of several thousand dollars, delivered by one operator. This is an estimate with its assumptions shown, not a measured invoice or a stopwatch result. Ranges are deliberate.
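
For readers who want to check the arithmetic, here is a small Python sketch. It adds two assumptions that are not stated above: a 40-hour working week, and roughly 8 hours of operator time for "two days of part-time work". Change either and the range moves.

```python
def saved_range(rate_low, rate_high, base_hours_low, base_hours_high, agent_hours):
    """Return (low, high) dollars saved: baseline hours minus operator hours, times rate."""
    low = (base_hours_low - agent_hours) * rate_low
    high = (base_hours_high - agent_hours) * rate_high
    return low, high


# $75 to $150 per hour; one to two weeks at an assumed 40 hours per week;
# an assumed 8 hours of operator time on the agent-driven build.
low, high = saved_range(75, 150, 40, 80, 8)
print(f"${low:,} to ${high:,}")  # $2,400 to $10,800
```

Under those inputs the result sits in the "several thousand dollars" range quoted above, and it remains an estimate rather than a measurement.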

How this site was built, and what that approach saves →

Inside the work

For the people who want to look closer. The Lab is an interactive model of the orchestration engine I build: set the autonomy dials, press Run, and watch a real deterministic loop dispatch parallel agents, gate each wave, catch a contradiction, and stop at the safety floor. The control logic runs in your browser; the agent outputs are illustrative.

Open the Lab →

The thinking behind it

Every system on this site runs on the same five principles: how the loop knows it is done, what it is allowed to touch, how tools are designed, how output is verified, and what survives at scale. Each one is stated plainly, tied to real work, and paired with the failure it exists to prevent.

See the principles I build by →

Selected work

Six systems, each carrying the same through-line: state the capability, then state the leash. The two production agents lead because they are the only ones with real measured outcomes.

Production

First-pass labeling agent

An agent that auto-buckets incoming unlabeled data into first-pass categories, so human labelers open a triaged queue instead of raw input, and a person confirms every call.

+25% data grading output

Read the case study →

Production

RCA auto-remediation agent

An agent that reads root-cause-analysis tickets and runs scoped terminal commands to resolve known failure modes, autonomous only inside a known, safe, reversible remediation envelope.

+30% data uptime

Read the case study →

Multi-agent system

Vantage OS

A Claude Code plugin where a coordinator routes every request to the right skill or sub-agent, then gates real-world actions through a QA filter and a permissions-tier system before they reach the human.

10 skills, 8 agents, public

Read the case study →

See all work →

Built in production

Two agents I designed and shipped on a humanoid robotics data program, working with node-cluster and egocentric-tracking data. Real production systems with real outcomes, each keeping a human on the decision and the agent on the legwork.

25% increase in data grading output, from the first-pass labeling agent. Read the case study →

30% increase in data uptime, from the RCA auto-remediation agent. Read the case study →

Production results from professional work on a humanoid robotics data program. De-identified, with no public repo, so there is nothing to link here.

Work with me

If you are hiring for agent engineering, system design, or technical program management, I am glad to talk. The fastest path is email, and the code is on GitHub.

