sandolab.xyz

Your AI Agents Need Guardrails and Audit Trails

Every engineering team is deploying AI coding agents. Almost none of them can tell you what those agents did last Tuesday.

I help teams ship AI agent infrastructure with progressive autonomy controls, full audit trails, and multi-model orchestration. I built the reference implementation and run it in production every day.

100K+ Agent Events Tracked Daily
4 AI Models Orchestrated
60 Research Topics Active
0 Unaudited Agent Actions

Services

I've built and operated multi-agent infrastructure across four AI models in production. Here's how that experience translates to your team.

Architecture Audit

$3,500

Fixed-price deep dive into your AI agent deployment. Permission models, audit gaps, blast radius analysis, and a concrete roadmap for progressive autonomy. Delivered in 2 weeks.

You get a written report and a 90-minute walkthrough.

Fractional CTO

$10-15K/mo

Embedded technical leadership for teams adopting AI-augmented development. Architecture decisions, agent governance policy, team enablement, and hands-on implementation.

Typically 3-6 month engagements.

Integration Sprint

Scoped

Hands-on implementation of agent infrastructure in your stack. Audit trails, permission engines, multi-model routing, worktree isolation. I build it with your team, not for them.

2-4 week sprints, knowledge transfer included.

Why me

SkipTheDishes
Early-stage engineer. Company acquired by JustEat Takeaway for $200M.
CTO, Regulated Fintech
Led engineering for a rent deposit insurance platform through North Forge. XState workflows, audit infrastructure, GCP.
10+ Years Full Stack
React, Go, TypeScript, Node, Postgres, MongoDB, GCP, Kubernetes. From MVPs to production at scale.
Published Researcher
ACM APGV, IEEE PacificVis. B.Sc. Psychology (neuroscience), B.Sc. Computer Science (HCI). Two decades studying how agents form and update models.
Running in Production

The Control Plane

This isn't a pitch deck. Everything described here is running right now. The mesh is watching file changes. Agents are working in parallel worktrees. The research flywheel just deposited new findings. This site is a window into a live system.

PERCEPTION: SandoLens AR, Browser Bridge, Console Web UI, Spacemacs, CLI
ORCHESTRATION: Task Router, Director Service, OCC Permissions, Worktree Manager
DATA & INTELLIGENCE: Mesh (SQLite + FTS5), CRDT Versions, Signal Hub, Voice Pipeline
AGENT FLEET: Claude, Gemini, Codex, Local LLM (Ollama)

Progressive Autonomy

Three trust tiers: hold (human-triggered), review (agent executes, human approves), auto (full autonomy). Trust earned per agent, per project, per task type. Start supervised, earn trust incrementally.
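
A minimal sketch of how per-agent, per-project, per-task-type trust could be tracked. The `TrustLedger` name, the consecutive-approval promotion rule, and the demote-on-rejection rule are illustrative assumptions, not the production policy.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Tier(IntEnum):
    HOLD = 0    # human triggers every action
    REVIEW = 1  # agent executes, human approves
    AUTO = 2    # full autonomy


@dataclass
class TrustLedger:
    # Hypothetical policy: promote one tier after this many consecutive approvals.
    promote_after: int = 5
    _tiers: dict = field(default_factory=dict)
    _streaks: dict = field(default_factory=dict)

    def tier(self, agent: str, project: str, task_type: str) -> Tier:
        # Unknown (agent, project, task type) combinations start fully supervised.
        return self._tiers.get((agent, project, task_type), Tier.HOLD)

    def record(self, agent: str, project: str, task_type: str, approved: bool) -> Tier:
        key = (agent, project, task_type)
        current = self.tier(*key)
        if not approved:
            # Assumed policy: a rejection resets the streak and drops one tier.
            self._streaks[key] = 0
            self._tiers[key] = Tier(max(current - 1, Tier.HOLD))
            return self._tiers[key]
        self._streaks[key] = self._streaks.get(key, 0) + 1
        if self._streaks[key] >= self.promote_after and current < Tier.AUTO:
            self._tiers[key] = Tier(current + 1)
            self._streaks[key] = 0
        else:
            self._tiers[key] = current
        return self._tiers[key]


ledger = TrustLedger(promote_after=2)
for _ in range(2):
    ledger.record("claude", "billing-api", "refactor", approved=True)
print(ledger.tier("claude", "billing-api", "refactor").name)   # REVIEW
print(ledger.tier("claude", "billing-api", "migration").name)  # HOLD
```

Keying trust on the full triple means an agent that has earned review-level trust for refactors in one project still starts at hold for a different task type.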

Full Audit Trail

Every agent action is an immutable event. File changes, commits, task completions, voice commands — all correlated and queryable. Current state is always a projection over the event stream.
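
To make the "state is a projection" idea concrete, here is a small event-sourcing sketch. The `Event` fields, the event kinds, and the `correlation_id` scheme are assumptions for illustration; the real schema is not described on this page.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)  # frozen: an event cannot be mutated after creation
class Event:
    seq: int
    agent: str
    kind: str            # e.g. "file_changed", "task_started", "task_completed"
    subject: str         # file path, commit hash, or task id
    correlation_id: str  # ties together events from one unit of work


def project_task_status(events: Iterable[Event]) -> dict[str, str]:
    """Fold the event stream into a current-state view: task id -> status."""
    status: dict[str, str] = {}
    for ev in sorted(events, key=lambda e: e.seq):
        if ev.kind == "task_started":
            status[ev.subject] = "in_progress"
        elif ev.kind == "task_completed":
            status[ev.subject] = "done"
    return status


def trace(events: Iterable[Event], correlation_id: str) -> list[Event]:
    """Return every event tied to one unit of work, in sequence order."""
    matching = (e for e in events if e.correlation_id == correlation_id)
    return sorted(matching, key=lambda e: e.seq)


log = [
    Event(1, "codex", "task_started", "T-1", "c-1"),
    Event(2, "codex", "file_changed", "src/app.py", "c-1"),
    Event(3, "codex", "task_completed", "T-1", "c-1"),
    Event(4, "gemini", "task_started", "T-2", "c-2"),
]
print(project_task_status(log))              # {'T-1': 'done', 'T-2': 'in_progress'}
print([e.kind for e in trace(log, "c-1")])   # ['task_started', 'file_changed', 'task_completed']
```

Because nothing is updated in place, "what did the agent do last Tuesday" becomes a filter over the log rather than a forensic exercise.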

Browser-Native Agent Perception

Agents don't just read files and write code. They capture conversations from ChatGPT, Claude.ai, and Gemini. They remote-control browser tabs for automated research. They query DOM and take screenshots. The entire surface area of your digital work is observable.

Multi-Model Parallel Routing

Claude, Gemini, Codex, and local models work simultaneously in isolated git worktrees. Three-tier routing: explicit tags, keyword heuristics, LLM triage. Each agent gets the right task for its strengths.
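
The three routing tiers can be sketched as a simple fall-through. The `@agent` tag syntax, the keyword table, and the `triage` callable (a stand-in for an LLM call) are all hypothetical.

```python
import re
from typing import Callable, Optional

AGENTS = ("claude", "gemini", "codex", "local")

# Hypothetical keyword table; a real deployment would tune this.
KEYWORDS = {
    "refactor": "claude",
    "research": "gemini",
    "test": "codex",
    "summarize": "local",
}


def route(task: str, triage: Optional[Callable[[str], str]] = None) -> tuple[str, str]:
    """Return (agent, tier_used) for a task description."""
    # Tier 1: an explicit tag such as "@gemini" always wins.
    tag = re.search(r"@(\w+)", task)
    if tag and tag.group(1).lower() in AGENTS:
        return tag.group(1).lower(), "tag"
    # Tier 2: first keyword that starts a word in the task text.
    lowered = task.lower()
    for word, agent in KEYWORDS.items():
        if re.search(rf"\b{re.escape(word)}", lowered):
            return agent, "keyword"
    # Tier 3: defer to an LLM triage callable supplied by the caller.
    if triage is not None:
        choice = triage(task)
        if choice in AGENTS:
            return choice, "triage"
    # Assumed fallback when no tier produces an answer.
    return "local", "default"


print(route("@gemini compare these two papers"))               # ('gemini', 'tag')
print(route("Refactor the auth module"))                       # ('claude', 'keyword')
print(route("Why is prod slow?", triage=lambda t: "codex"))    # ('codex', 'triage')
```

Ordering the tiers from cheapest to most expensive means the LLM triage call only runs when tags and keywords both fail to decide.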

CRDT File Versioning

Every file save is a CRDT operation. Concurrent multi-agent edits merge automatically without conflicts. Operations replay in causal order. No manual conflict resolution needed.
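
This page does not say which CRDT is used, so the sketch below substitutes one of the simplest: a last-writer-wins map over file regions, ordered by Lamport timestamp. The `Op` fields and the per-function granularity are assumptions. Note that last-writer-wins converges deterministically but keeps only one of two concurrent writes to the same region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    lamport: int   # logical clock; causally later ops always carry a higher value
    replica: str   # agent id, used as a deterministic tie-breaker
    key: str       # region of the file, e.g. a function name (hypothetical granularity)
    value: str


def merge(a: frozenset, b: frozenset) -> frozenset:
    """State-based merge: set union is commutative, associative, and idempotent."""
    return a | b


def materialize(ops: frozenset) -> dict:
    """Replay ops in (lamport, replica) order; the last writer wins per key."""
    state = {}
    for op in sorted(ops, key=lambda o: (o.lamport, o.replica)):
        state[op.key] = op.value
    return state


claude_ops = frozenset({
    Op(1, "claude", "parse()", "v1"),
    Op(2, "claude", "parse()", "v2"),
})
codex_ops = frozenset({
    Op(1, "codex", "render()", "r1"),
    Op(2, "codex", "parse()", "v2-codex"),
})

left = materialize(merge(claude_ops, codex_ops))
right = materialize(merge(codex_ops, claude_ops))
print(left == right)  # True: merge order does not matter
print(left)           # {'parse()': 'v2-codex', 'render()': 'r1'}
```

Sorting by Lamport timestamp gives a replay order consistent with causality, and the replica id breaks ties between concurrent operations the same way on every machine.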

Unified Signal Ingestion

Email, Slack, GitHub notifications, government portals — all indexed into one searchable layer. Pattern-matching rules auto-label, score importance, create tasks, and flag signals for human review. Your agents can search every channel you use. Replies stay native.
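
A rough sketch of what pattern-matching signal rules might look like. The `Signal` and `Rule` shapes, the 0-10 importance scale, and the example addresses are invented for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Signal:
    channel: str
    sender: str
    body: str
    labels: list = field(default_factory=list)
    importance: int = 0
    needs_review: bool = False


@dataclass(frozen=True)
class Rule:
    field_name: str   # which Signal attribute to match: "sender", "channel", or "body"
    pattern: str      # regular expression
    label: str
    importance: int = 0
    create_task: bool = False
    flag_for_review: bool = False


def apply_rules(signal: Signal, rules: list) -> list:
    """Label and score the signal in place; return titles of tasks to create."""
    tasks = []
    for rule in rules:
        if re.search(rule.pattern, getattr(signal, rule.field_name), re.IGNORECASE):
            signal.labels.append(rule.label)
            signal.importance = max(signal.importance, rule.importance)
            signal.needs_review = signal.needs_review or rule.flag_for_review
            if rule.create_task:
                tasks.append(f"[{rule.label}] {signal.body[:60]}")
    return tasks


rules = [
    Rule("sender", r"@example\.gov$", "government", importance=9,
         create_task=True, flag_for_review=True),
    Rule("body", r"\bCI failed\b", "build", importance=5),
]
msg = Signal("email", "notices@example.gov", "Your filing is due on Friday")
print(apply_rules(msg, rules))                          # ['[government] Your filing is due on Friday']
print(msg.labels, msg.importance, msg.needs_review)     # ['government'] 9 True
```

The rules only annotate and route; nothing here sends a reply, which matches the "replies stay native" constraint.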

On-Device First

Voice transcription, file watching, CRDT versioning, and the browser bridge daemon all run locally. GPU-accelerated Whisper runs on an RTX 3090. Your code never leaves your machine.

The system I consult on is the system I operate. Every recommendation comes from running infrastructure, not whitepapers.

Self-Correcting Knowledge Engine

The Research Flywheel

An autonomous knowledge production system that seeds open questions, collects multi-model research in parallel, synthesizes findings, and recursively generates deeper follow-up questions. It doesn't just search—it thinks.

1. Seed: Open unknowns from model.org become research topics
2. Collect: Claude, Gemini & Codex research each topic in parallel
3. Synthesize: Deposits merged into coherent narratives with confidence scores
4. Spawn: Follow-up questions extracted, approved, and re-queued
Seed → Collect → Evaluate → Synthesize → Follow-up → Approve → Seed
60 Active Research Topics
188 Multi-Model Deposits
16 Completed Syntheses
3 Models Per Topic

Multi-Model Triangulation

Every topic is researched by Claude, Gemini, and Codex independently. Synthesis identifies areas of agreement, disagreement, and novel connections. No single-model bias.
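
One way to picture triangulation, assuming each model's findings have already been normalized into canonical propositions with a true/false stance (in practice that normalization is the hard part and is not shown). The `triangulate` function and the sample propositions are hypothetical.

```python
from collections import defaultdict


def triangulate(findings: dict) -> dict:
    """findings maps model name -> {proposition: stance (True/False)}."""
    stances = defaultdict(dict)
    for model, claims in findings.items():
        for prop, stance in claims.items():
            stances[prop][model] = stance

    result = {"agreement": [], "disagreement": [], "novel": []}
    for prop, by_model in sorted(stances.items()):
        if len(by_model) == 1:
            # Only one model raised this point at all.
            result["novel"].append(prop)
        elif len(set(by_model.values())) == 1:
            # Every model that addressed it took the same stance.
            result["agreement"].append(prop)
        else:
            result["disagreement"].append(prop)
    return result


findings = {
    "claude": {"crdt merges are commutative": True, "lww loses concurrent edits": True},
    "gemini": {"crdt merges are commutative": True, "lww loses concurrent edits": False},
    "codex": {"crdt merges are commutative": True, "vector clocks grow with replicas": True},
}
print(triangulate(findings))
```

Keeping the single-model points in their own bucket lets a synthesis step treat them as leads to verify rather than as established findings.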

Deterministic Quality Gates

Readiness scoring is pure Python — fully reproducible. Minimum deposit count, author diversity, source variety, and quality thresholds must all pass before synthesis fires.
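
The gates described above could look roughly like this. The `Deposit` fields and every threshold value are placeholders, not the production settings; the point is that the check is a pure function with no model calls, so the same inputs always give the same answer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deposit:
    author: str     # which model wrote it
    source: str     # e.g. domain or venue of the cited material
    quality: float  # 0.0 to 1.0, assumed to be scored upstream


@dataclass(frozen=True)
class Thresholds:
    # All values are placeholders chosen for the example.
    min_deposits: int = 3
    min_authors: int = 2
    min_sources: int = 2
    min_mean_quality: float = 0.6


def readiness(deposits: list, t: Thresholds = Thresholds()) -> dict:
    """Return each gate's result so a failing topic shows which gate blocked it."""
    mean_q = sum(d.quality for d in deposits) / len(deposits) if deposits else 0.0
    return {
        "deposit_count": len(deposits) >= t.min_deposits,
        "author_diversity": len({d.author for d in deposits}) >= t.min_authors,
        "source_variety": len({d.source for d in deposits}) >= t.min_sources,
        "quality": mean_q >= t.min_mean_quality,
    }


def ready(deposits: list, t: Thresholds = Thresholds()) -> bool:
    """Synthesis may fire only when every gate passes."""
    return all(readiness(deposits, t).values())


deposits = [
    Deposit("claude", "arxiv.org", 0.8),
    Deposit("gemini", "acm.org", 0.7),
    Deposit("codex", "arxiv.org", 0.5),
]
print(ready(deposits))                    # True
print(readiness(deposits[:2]))            # deposit_count fails with only two deposits
```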

Recursive Depth

Each synthesis extracts 3-5 follow-up questions that spawn new topics. Abstract unknowns progressively resolve into concrete, actionable research. The system gets smarter every cycle.

Goal-Directed Literature Recognition

An arXiv scanner filters 100+ papers daily against the system's active agenda—open unknowns, core claims, and project goals. Relevant papers are automatically deposited into the matching research topic, bridging cutting-edge academic work with internal priorities. The flywheel doesn't just produce knowledge; it consumes the state of the art.

Also Building

The control plane is the core. These are verticals built on top of it.

Augmented Reality

SandoLens

AI overlays for code & field work

Contextual AI projected into your line of sight. Developer HUD shows agent status, build pipelines, and voice commands. Field mode handles construction checklists, part identification, crew translation, and compliance documentation. Built on real agent infrastructure, not mockups.

Developer HUD · Voice-to-Agent Pipeline · Construction Site Logistics · Remote Expert Mode
View SandoLens →
Sports Analytics

Hockeypedia

Trade lineage & impact analytics

The only platform tracking complete trade trees and computing cumulative games-played impact across asset descendants. Real-time ingestion from NHL APIs with AI-powered sprite analysis for goal-level event extraction. Content-ready analytics for creators, teams, and betting platforms.

Trade Tree Visualization · Asset Value Tracking · Sprite Event Analysis · Content-Ready Outputs
Visit Hockeypedia →
Signal Intelligence

Signal Hub

Unified message ingestion & routing

Every important message is buried in a different channel — email, Slack, GitHub, government portals, banking alerts. Signal Hub indexes them all into one searchable layer with automated routing rules. Pattern-match on sender, domain, or content to auto-label, set importance, create tasks, or flag for human review. Your agents can search across every channel. Replies stay in the native app.

Unified Cross-Channel Search · Automated Signal Rules · Importance Scoring · Task Auto-Creation · On-Device Only
In Development
Voice & Input

Voice Tools

On-device STT/TTS for agent workflows

Local speech-to-text (faster-whisper large-v3 on RTX 3090) and text-to-speech (Piper) running entirely on-device. Sub-second transcription, persistent GPU-resident model, integrated into editors, AI agents, and the SandoLens voice pipeline. Your voice never leaves your machine.

GPU-Accelerated Whisper · Sub-Second Transcription · Editor Integration · Agent Voice Commands
Running in Production

Research Depth

The infrastructure is grounded in active research. Theory that ships, not theory that sits.

Progressive Autonomy Trust Calculus

Agent Governance

Formal framework for agents earning autonomy incrementally. Three modes per task: hold, review, auto. Trust earned per agent, per project, per task type. Addresses the governance bottleneck blocking enterprise AI adoption.

Existing systems have binary trust. This is a three-tier workflow with formal trust accumulation.

Hierarchical Memory Consolidation

Agent Memory

Five-level LLM-driven memory pyramid: raw events, narrative summaries, project context, strategic synthesis, permanent model. Each level is actively consolidated and queryable. Running in production tracking 100K+ events.
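
A toy version of the five-level pyramid, to show the consolidation mechanics only. The `MemoryPyramid` class, the fixed `fan_in` trigger, and the `summarize` callable (standing in for an LLM call) are assumptions; the real consolidation triggers are not described here.

```python
from typing import Callable

LEVELS = ("raw_events", "narrative_summaries", "project_context",
          "strategic_synthesis", "permanent_model")


class MemoryPyramid:
    def __init__(self, summarize: Callable[[list], str], fan_in: int = 3):
        self.summarize = summarize  # stand-in for an LLM summarization call
        self.fan_in = fan_in        # hypothetical: items folded into one summary
        self.levels = {name: [] for name in LEVELS}
        self._pending = {name: [] for name in LEVELS}

    def add_event(self, text: str) -> None:
        self._push(0, text)

    def _push(self, i: int, item: str) -> None:
        name = LEVELS[i]
        self.levels[name].append(item)
        self._pending[name].append(item)
        # Once enough unconsolidated items pile up, fold them into the level above.
        if i + 1 < len(LEVELS) and len(self._pending[name]) >= self.fan_in:
            batch, self._pending[name] = self._pending[name], []
            self._push(i + 1, self.summarize(batch))

    def query(self, level: str, needle: str) -> list:
        """Every level stays queryable, not just the top one."""
        return [x for x in self.levels[level] if needle.lower() in x.lower()]


pyramid = MemoryPyramid(summarize=lambda items: " + ".join(items), fan_in=2)
for e in ["edited auth.py", "ran tests", "committed fix", "opened PR"]:
    pyramid.add_event(e)

print(len(pyramid.levels["raw_events"]))      # 4
print(pyramid.levels["narrative_summaries"])  # ['edited auth.py + ran tests', 'committed fix + opened PR']
print(pyramid.levels["project_context"])      # ['edited auth.py + ran tests + committed fix + opened PR']
```

Raw events are never discarded in this sketch, so a summary at any level can be traced back to the events beneath it.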

Combines active consolidation with temporal hierarchy. More scalable than vector databases, more intelligent than static embeddings.

Belief Drift in Long-Running Agent Memory

Agent Reliability

Tests encode beliefs. Specs encode intent. Code encodes reality. When all three diverge, which one is wrong? Measure drift as confidence scores. Predict incidents that traditional alerting misses.
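
As a deliberately simple illustration of drift scoring, suppose each behaviour has a value according to the tests, the spec, and the code. The metric below (share of behaviours where all three agree, plus a majority vote to name the likely outlier) is an assumption made for this example, not the scoring used in production.

```python
from collections import Counter


def drift_report(behaviours: dict) -> dict:
    """behaviours maps name -> {"tests": value, "spec": value, "code": value}."""
    suspects = {}
    aligned = 0
    for name, views in behaviours.items():
        value, votes = Counter(views.values()).most_common(1)[0]
        if votes == len(views):
            aligned += 1
        elif votes > 1:
            # Two sources agree; the remaining one is the likely outlier.
            suspects[name] = [src for src, v in views.items() if v != value]
        else:
            # All three diverge: no majority, so every source is suspect.
            suspects[name] = sorted(views)
    confidence = aligned / len(behaviours) if behaviours else 1.0
    return {"confidence": confidence, "suspects": suspects}


report = drift_report({
    "refund_window_days": {"tests": 30, "spec": 30, "code": 30},
    "max_retries": {"tests": 3, "spec": 5, "code": 3},
    "timeout_s": {"tests": 10, "spec": 20, "code": 30},
})
print(round(report["confidence"], 2))  # 0.33
print(report["suspects"])              # {'max_retries': ['spec'], 'timeout_s': ['code', 'spec', 'tests']}
```

A falling confidence score can surface long before any latency or error-rate alert fires, because nothing has actually broken yet.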

Existing observability measures latency and error rates. Drift scoring measures epistemic health.

CRDT-Based File Versioning

Distributed Systems

Every file save is a CRDT operation. Concurrent multi-agent edits merge automatically without conflicts. Operations replay in causal order. Makes multi-agent collaboration safe at the file level.

No existing multi-agent system handles file-level concurrency this way. Similar to Automerge but integrated into activity tracking.

About

Sandolab is the applied research lab of Taylor Sando. Two decades spanning neuroscience research, HCI publications, full-stack product engineering (SkipTheDishes, acquired by JustEat), regulated fintech (lead engineer), and now AI agent infrastructure.

Everything on this site is running in production.

Based in Winnipeg, MB.