Every engineering team is deploying AI coding agents. Almost none of them can tell you what those agents did last Tuesday.
I help teams ship AI agent infrastructure with progressive autonomy controls, full audit trails, and multi-model orchestration. I built the reference implementation and run it in production every day.
I've built and operated multi-agent infrastructure across four AI models in production. Here's how that experience translates to your team.
Fixed-price deep dive into your AI agent deployment. Permission models, audit gaps, blast radius analysis, and a concrete roadmap for progressive autonomy. Delivered in 2 weeks.
You get a written report and a 90-minute walkthrough.
Embedded technical leadership for teams adopting AI-augmented development. Architecture decisions, agent governance policy, team enablement, and hands-on implementation.
Typically 3-6 month engagements.
Hands-on implementation of agent infrastructure in your stack. Audit trails, permission engines, multi-model routing, worktree isolation. I build it with your team, not for them.
2-4 week sprints, knowledge transfer included.
This isn't a pitch deck. Everything described here is running right now. The mesh is watching file changes. Agents are working in parallel worktrees. The research flywheel just deposited new findings. This site is a window into a live system.
Three trust tiers: hold (human-triggered), review (agent executes, human approves), auto (full autonomy). Trust earned per agent, per project, per task type. Start supervised, earn trust incrementally.
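In sketch form, trust resolution is just a scoped policy lookup. A minimal Python illustration (names like `TrustTier` and `resolve_tier` are hypothetical, not the production API):

```python
from enum import Enum

class TrustTier(Enum):
    HOLD = "hold"      # human triggers every action
    REVIEW = "review"  # agent executes, human approves
    AUTO = "auto"      # full autonomy

# Illustrative policy table: trust scoped per agent, per project, per task type.
POLICY = {
    ("claude", "billing-service", "docs"): TrustTier.AUTO,
    ("claude", "billing-service", "refactor"): TrustTier.REVIEW,
    ("gemini", "billing-service", "refactor"): TrustTier.HOLD,
}

def resolve_tier(agent: str, project: str, task_type: str) -> TrustTier:
    # Unknown combinations start supervised; trust is earned incrementally.
    return POLICY.get((agent, project, task_type), TrustTier.HOLD)
```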
Every agent action is an immutable event. File changes, commits, task completions, voice commands — all correlated and queryable. Current state is always a projection over the event stream.
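A minimal sketch of the projection idea, with hypothetical event kinds and payload fields:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    kind: str      # e.g. "file_changed", "commit", "task_completed"
    agent: str
    payload: dict  # appended once, never mutated

def project(events: list[Event]) -> dict:
    # Current state is a fold over the immutable event stream;
    # replaying the same events always yields the same state.
    state: dict = {"files": {}, "completed_tasks": []}
    for e in events:
        if e.kind == "file_changed":
            state["files"][e.payload["path"]] = e.agent
        elif e.kind == "task_completed":
            state["completed_tasks"].append(e.payload["task_id"])
    return state
```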
Agents don't just read files and write code. They capture conversations from ChatGPT, Claude.ai, and Gemini. They remote-control browser tabs for automated research. They query the DOM and take screenshots. The entire surface area of your digital work is observable.
Claude, Gemini, Codex, and local models work simultaneously in isolated git worktrees. Three-tier routing: explicit tags, keyword heuristics, LLM triage. Each agent gets the right task for its strengths.
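A simplified sketch of that routing order; the keyword rules and the `llm_triage` fallback are illustrative, not the production heuristics:

```python
def llm_triage(task: str) -> str:
    # Placeholder: in a real system a small LLM classifies the task;
    # here we just default to a general-purpose model.
    return "claude"

def route(task: str, tags: set[str]) -> str:
    # Tier 1: explicit tags win outright.
    for model in ("claude", "gemini", "codex", "local"):
        if model in tags:
            return model
    # Tier 2: cheap keyword heuristics.
    lowered = task.lower()
    if "refactor" in lowered or "test" in lowered:
        return "codex"
    if "research" in lowered or "summarize" in lowered:
        return "gemini"
    # Tier 3: ambiguous tasks fall through to LLM triage.
    return llm_triage(task)
```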
Every file save is a CRDT operation. Concurrent multi-agent edits merge automatically without conflicts. Operations replay in causal order. No manual conflict resolution needed.
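As a toy illustration of why replicas converge (a production system would use a full CRDT library; this last-writer-wins map only shows deterministic causal replay):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Op:
    clock: int    # logical timestamp establishing causal order
    agent: str    # deterministic tiebreaker for concurrent ops
    path: str
    content: str

def merge(ours: list[Op], theirs: list[Op]) -> dict[str, str]:
    # Replay the union of both agents' operations in causal order.
    # Ties break on agent id, so every replica converges to the same
    # file state with no manual conflict resolution.
    files: dict[str, str] = {}
    for op in sorted(set(ours) | set(theirs), key=lambda o: (o.clock, o.agent)):
        files[op.path] = op.content
    return files
```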
Email, Slack, GitHub notifications, government portals — all indexed into one searchable layer. Pattern-matching rules auto-label, score importance, create tasks, and flag signals for human review. Your agents can search every channel you use. Replies stay native.
Voice transcription, file watching, CRDT versioning, and the browser bridge daemon all run locally. GPU-accelerated Whisper on an RTX 3090. Your code never leaves your machine.
The system I consult on is the system I operate. Every recommendation comes from running infrastructure, not whitepapers.
An autonomous knowledge production system that seeds open questions, collects multi-model research in parallel, synthesizes findings, and recursively generates deeper follow-up questions. It doesn't just search — it thinks.
Every topic is researched by Claude, Gemini, and Codex independently. Synthesis identifies areas of agreement, disagreement, and novel connections. No single-model bias.
Readiness scoring is pure Python — fully reproducible. Minimum deposit count, author diversity, source variety, and quality thresholds must all pass before synthesis fires.
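A sketch of what such a gate looks like; the thresholds and field names here are illustrative, not the production values:

```python
from dataclasses import dataclass

@dataclass
class Deposit:
    author: str    # which model contributed the finding
    source: str    # e.g. "web", "arxiv", "internal"
    quality: float

# Illustrative thresholds.
MIN_DEPOSITS, MIN_AUTHORS, MIN_SOURCES, MIN_QUALITY = 6, 3, 2, 0.7

def ready_for_synthesis(deposits: list[Deposit]) -> bool:
    # Every gate must pass before synthesis fires. No LLM judgment
    # is involved, so the decision is fully reproducible.
    return (
        len(deposits) >= MIN_DEPOSITS
        and len({d.author for d in deposits}) >= MIN_AUTHORS
        and len({d.source for d in deposits}) >= MIN_SOURCES
        and all(d.quality >= MIN_QUALITY for d in deposits)
    )
```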
Each synthesis extracts 3-5 follow-up questions that spawn new topics. Abstract unknowns progressively resolve into concrete, actionable research. The system gets smarter every cycle.
An arXiv scanner filters 100+ papers daily against the system's active agenda — open unknowns, core claims, and project goals. Relevant papers are automatically deposited into the matching research topic, bridging cutting-edge academic work with internal priorities. The flywheel doesn't just produce knowledge; it consumes the state of the art.
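A simplified sketch of agenda matching, using keyword overlap where the real scanner presumably uses richer signals such as embeddings:

```python
def match_to_agenda(abstract: str, agenda: dict[str, set[str]]) -> str | None:
    # `agenda` maps topic -> terms drawn from open unknowns, core claims,
    # and project goals. Returns the best-covered topic, or None.
    words = set(abstract.lower().split())
    best_topic, best_hits = None, 0
    for topic, terms in agenda.items():
        hits = len(words & terms)
        if hits > best_hits:
            best_topic, best_hits = topic, hits
    return best_topic if best_hits >= 2 else None  # threshold is illustrative
```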
The control plane is the core. These are verticals built on top of it.
AI overlays for code & field work
Contextual AI projected into your line of sight. Developer HUD shows agent status, build pipelines, and voice commands. Field mode handles construction checklists, part identification, crew translation, and compliance documentation. Built on real agent infrastructure, not mockups.
Trade lineage & impact analytics
The only platform tracking complete trade trees and computing cumulative games-played impact across asset descendants. Real-time ingestion from NHL APIs with AI-powered sprite analysis for goal-level event extraction. Content-ready analytics for creators, teams, and betting platforms.
Unified message ingestion & routing
Every important message is buried in a different channel — email, Slack, GitHub, government portals, banking alerts. Signal Hub indexes them all into one searchable layer with automated routing rules. Pattern-match on sender, domain, or content to auto-label, set importance, create tasks, or flag for human review. Your agents can search across every channel. Replies stay in the native app.
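A sketch of what one routing rule might look like; the field names and matching logic are illustrative:

```python
from dataclasses import dataclass

@dataclass
class Rule:
    # Match on any combination of sender, domain, or content substring.
    sender: str | None = None
    domain: str | None = None
    contains: str | None = None
    label: str | None = None
    importance: int = 0
    create_task: bool = False

def apply_rules(msg: dict, rules: list[Rule]) -> list[str]:
    actions = []
    for r in rules:
        if r.sender and msg["sender"] != r.sender:
            continue
        if r.domain and not msg["sender"].endswith("@" + r.domain):
            continue
        if r.contains and r.contains not in msg["body"]:
            continue
        if r.label:
            actions.append(f"label:{r.label}")
        if r.importance:
            actions.append(f"importance:{r.importance}")
        if r.create_task:
            actions.append("create_task")
    return actions

# Example: flag anything from a government domain and open a task.
rules = [Rule(domain="canada.ca", label="government", importance=2, create_task=True)]
print(apply_rules({"sender": "noreply@canada.ca", "body": "Benefit notice"}, rules))
# -> ['label:government', 'importance:2', 'create_task']
```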
On-device STT/TTS for agent workflows
Local speech-to-text (faster-whisper large-v3 on RTX 3090) and text-to-speech (Piper) running entirely on-device. Sub-second transcription, persistent GPU-resident model, integrated into editors, AI agents, and the SandoLens voice pipeline. Your voice never leaves your machine.
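The STT side follows the standard faster-whisper usage pattern; this sketch assumes a local `clip.wav` and leaves out the Piper TTS half:

```python
from faster_whisper import WhisperModel

# Load once and keep the model resident on the GPU between requests.
model = WhisperModel("large-v3", device="cuda", compute_type="float16")

segments, info = model.transcribe("clip.wav", beam_size=5)
print(f"Detected language: {info.language} (p={info.language_probability:.2f})")
for seg in segments:
    print(f"[{seg.start:.2f}s -> {seg.end:.2f}s] {seg.text}")
```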
The infrastructure is grounded in active research. Theory that ships, not theory that sits.
Formal framework for agents earning autonomy incrementally. Three modes per task: hold, review, auto. Trust earned per agent, per project, per task type. Addresses the governance bottleneck blocking enterprise AI adoption.
Existing systems treat trust as binary. This is a three-tier workflow with formal trust accumulation.
Five-level LLM-driven memory pyramid: raw events, narrative summaries, project context, strategic synthesis, permanent model. Each level is actively consolidated and queryable. Running in production, tracking 100K+ events.
Combines active consolidation with temporal hierarchy. More scalable than vector databases, more intelligent than static embeddings.
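In sketch form, with illustrative level names and the LLM consolidation call stubbed out as `summarize`:

```python
from enum import IntEnum

class MemoryLevel(IntEnum):
    RAW_EVENTS = 1  # immutable activity events
    NARRATIVE = 2   # LLM-written summaries of recent activity
    PROJECT = 3     # consolidated per-project context
    STRATEGIC = 4   # cross-project synthesis
    PERMANENT = 5   # long-lived model of the operator and goals

def consolidate(store: dict, level: MemoryLevel, summarize) -> None:
    # Roll the entries at `level` up into one entry at the next level.
    # Every level stays queryable; consolidation adds, never deletes.
    if level < MemoryLevel.PERMANENT and store[level]:
        store[MemoryLevel(level + 1)].append(summarize(store[level]))
```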
Tests encode beliefs. Specs encode intent. Code encodes reality. When all three diverge, which one is wrong? Measure drift as confidence scores. Predict incidents that traditional alerting misses.
Existing observability measures latency and error rates. Drift scoring measures epistemic health.
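A toy sketch of pairwise drift scoring; the Jaccard metric and the assertion sets are stand-ins for whatever extraction the real system uses:

```python
def drift_report(tests: set[str], specs: set[str], code: set[str]) -> dict:
    # Each set holds normalized behavior assertions pulled from tests,
    # specs, and code. 1.0 means full agreement; a falling score flags
    # epistemic drift before anything pages.
    def jaccard(a: set[str], b: set[str]) -> float:
        return len(a & b) / len(a | b) if a | b else 1.0

    return {
        "tests_vs_specs": jaccard(tests, specs),
        "specs_vs_code": jaccard(specs, code),
        "tests_vs_code": jaccard(tests, code),
    }
```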
Every file save is a CRDT operation. Concurrent multi-agent edits merge automatically without conflicts. Operations replay in causal order. Makes multi-agent collaboration safe at the file level.
No existing multi-agent system handles file-level concurrency this way. Similar to Automerge but integrated into activity tracking.
Sandolab is the applied research lab of Taylor Sando. Two decades spanning neuroscience research, HCI publications, full-stack product engineering (SkipTheDishes, acquired by JustEat), regulated fintech (lead engineer), and now AI agent infrastructure.
Everything on this site is running in production. The system I consult on is the system I operate. Every recommendation comes from running infrastructure, not whitepapers.
Based in Winnipeg, MB.