AI Everywhere

AI Orchestration · Distributed Systems · Multi-Agent AI

What if AI assistants weren't confined to a single browser tab or one machine? What if you could orchestrate multiple AI systems - across different hardware, running different models - all working together on your problems?

I've been experimenting with two approaches that challenge the conventional "one AI, one chat" paradigm. Together, they represent a fundamental shift in how we think about AI assistance: not as singular tools, but as orchestrated ecosystems.

Breaking Free from Browser Constraints

The first limitation I wanted to overcome was simple: why should AI live in a browser? Every conversation trapped in a web interface, every context limited by what you can paste into a chat box, every tool confined to what a web app can access.

The terminal offers something fundamentally different. Your files are there. Your git repositories. Your entire development environment. And critically - you can run multiple AI tools simultaneously.

3Brains: Multi-AI Terminal Orchestration

The first experiment, 3Brains, asks a straightforward question: what if you could use Claude, Gemini, and local LLMs together in the same workspace?

Using tmux (a terminal multiplexer), I created a layout where multiple AI systems work side by side:

  • Claude as the orchestrator - Managing overall workflow, creating files, coordinating operations
  • Gemini for agent management - Building specialized agents for specific tasks
  • Ollama for local research - Running open-source LLMs entirely on your hardware, no API calls, complete privacy
  • Real-time monitoring - Watching all three systems work simultaneously

The groundbreaking insight: each AI can see what the others are doing. They share the same files, the same context documents (CLAUDE.md, GEMINI.md, AGENTS.md), and the same project directory. Claude can delegate research to Ollama. Gemini can create agents that leverage Claude's capabilities. All visible in split panes.
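
To make the layout concrete, here's a minimal sketch of a launcher. The real setup is a bash script; this version uses Python's subprocess purely for illustration, and the session name, project directory, pane commands, and Ollama model are all assumptions:

import os
import subprocess

def tmux(*args):
    # Thin wrapper around the tmux CLI
    subprocess.run(["tmux", *args], check=True)

SESSION = "3brains"                                 # assumed session name
WORKDIR = os.path.expanduser("~/projects/3brains")  # assumed shared project directory

# One window, three panes: Claude on the left, Gemini and Ollama stacked on the right
tmux("new-session", "-d", "-s", SESSION, "-c", WORKDIR)
tmux("split-window", "-h", "-t", SESSION, "-c", WORKDIR)
tmux("split-window", "-v", "-t", SESSION, "-c", WORKDIR)

# Start one AI CLI per pane; every pane starts in the same project directory
tmux("send-keys", "-t", f"{SESSION}:0.0", "claude", "C-m")
tmux("send-keys", "-t", f"{SESSION}:0.1", "gemini", "C-m")
tmux("send-keys", "-t", f"{SESSION}:0.2", "ollama run llama3", "C-m")

tmux("attach-session", "-t", SESSION)

Once the panes are up, the shared CLAUDE.md, GEMINI.md, and AGENTS.md files in the project directory are what let each assistant see what the others are doing.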

Why This Matters

Different AI models have different strengths. Claude excels at complex reasoning and code generation. Gemini has Google Search integration and strong agent creation. Local LLMs offer privacy and zero cost. By orchestrating them together, you get specialized expertise for each aspect of your work.

And critically - you're no longer at the mercy of API rate limits or per-token costs. Heavy research? Route it to your local Ollama instance. Need Google integration? Send that to Gemini. Complex orchestration? Claude handles it.

Multi-Instance Claude: Distributed AI Across Hardware

The second experiment tackles a different challenge: what if you could run the same AI on multiple machines and coordinate them?

I built a distributed task queue system that lets multiple Claude instances - running on different Raspberry Pis, different servers, different hardware entirely - work together as a coordinated team.

The architecture is elegantly simple:

  • Shared filesystem queue - Tasks written as JSON files to NFS-mounted storage
  • File-based locking - Atomic operations prevent race conditions
  • Target-specific or opportunistic - Route tasks to specific machines or let any available worker claim them
  • Complete audit trail - Every operation logged with timestamps and host tracking
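
To make that concrete, here's a minimal sketch of what submission can look like: a JSON task written to a temp file, then atomically renamed into a shared pending directory. The paths and field names are illustrative assumptions, not the exact format my library uses.

import json
import os
import time
import uuid
from pathlib import Path

QUEUE_DIR = Path("/mnt/nfs/claude-queue/pending")  # assumed NFS-mounted queue directory

def submit_task(prompt, target_host=None):
    task = {
        "id": uuid.uuid4().hex,
        "prompt": prompt,
        "target_host": target_host,        # None means any available worker may claim it
        "submitted_at": time.time(),
        "submitted_by": os.uname().nodename,
    }
    # Write to a temp file first, then rename: the rename is atomic, so workers
    # never see a half-written task file.
    tmp = QUEUE_DIR / f".{task['id']}.tmp"
    tmp.write_text(json.dumps(task))
    os.replace(tmp, QUEUE_DIR / f"{task['id']}.json")
    return task["id"]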

The Key Innovation

The breakthrough was discovering the --dangerously-skip-permissions flag in the Claude CLI. It allows Claude instances to run non-interactively, receiving tasks via the queue and executing them autonomously. Without it, remote Claude instances would block waiting for user approval - making automation impossible.

With this flag, suddenly you have true multi-agent AI:

from claude_queue import ClaudeQueue

q = ClaudeQueue()

# Distribute system checks across your infrastructure
for host in ["rpi01", "rpi05", "server01"]:
    q.submit_task(
        "Check disk usage and report anomalies",
        target_host=host
    )

Each machine picks up its task, Claude executes it autonomously, and results flow back through the shared filesystem. No complex networking. No APIs. Just files, locks, and atomic operations.
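
The worker side is just as small. Here's a sketch of the claim-and-run loop, assuming the directory layout from the submission sketch above and the Claude CLI's non-interactive print mode (claude -p); the actual worker scripts may differ in detail:

import fcntl
import json
import socket
import subprocess
from pathlib import Path

PENDING = Path("/mnt/nfs/claude-queue/pending")
DONE = Path("/mnt/nfs/claude-queue/done")
HOST = socket.gethostname()

def claim_and_run_one():
    for task_file in sorted(PENDING.glob("*.json")):
        with open(task_file, "r+") as f:
            try:
                # fcntl-style lock, non-blocking: if another worker holds it, move on
                fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except OSError:
                continue
            task = json.load(f)
            if task.get("target_host") not in (None, HOST):
                continue  # targeted at a different machine
            # Run Claude autonomously; without the flag it would block on approval prompts
            result = subprocess.run(
                ["claude", "-p", task["prompt"], "--dangerously-skip-permissions"],
                capture_output=True, text=True,
            )
            (DONE / task_file.name).write_text(
                json.dumps({"task": task, "host": HOST, "output": result.stdout})
            )
            task_file.unlink()
            return True
    return False  # nothing to do this pass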

What Makes This Groundbreaking

Both experiments share a common insight: AI systems don't need to be complex to be powerful.

Simplicity Over Complexity

The distributed queue doesn't use message brokers, databases, or complex networking protocols. It uses files. NFS mounts. File locking with fcntl. Tools that have worked reliably for decades.

The multi-AI terminal doesn't use custom protocols or APIs. It uses tmux - a tool from 2007. Shell scripts. Symlinked context files. Dead simple, completely reliable.

Coordination Without Centralization

There's no central coordinator in the distributed system. No orchestrator managing worker nodes. Just workers checking a shared queue and claiming tasks. Beautifully decentralized.

The multi-AI terminal has Claude as a soft orchestrator, but each AI operates independently. They can work in parallel, delegate to each other, or operate completely separately.

Accessibility to Individuals

Perhaps most importantly: this level of multi-agent AI coordination is accessible to anyone.

You don't need enterprise infrastructure. A few Raspberry Pis (I'm using Pi 4s and Pi 5s with 8GB RAM) connected via NFS. Or just your laptop with tmux installed. That's it.

No cloud orchestration platforms. No Kubernetes clusters. No complex deployment pipelines. Just Python, bash scripts, and AI CLI tools.

Lessons Learned

Building these systems taught me several crucial lessons:

Start Simple, Then Iterate

The distributed Claude system started with direct SSH commands. It worked. Only then did I add the queue for more sophisticated coordination. Proving the concept before adding complexity prevented over-engineering.

Local Privacy Matters

Running Ollama locally for research tasks means sensitive data never leaves your network. Combine this with the distributed system, and you can route privacy-sensitive tasks to local-only workers.
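
In practice that routing is just a targeted submission. A hypothetical example using the same queue API as before (the host name is invented):

from claude_queue import ClaudeQueue

q = ClaudeQueue()

# Keep the sensitive prompt pinned to a machine that only does local inference
q.submit_task(
    "Summarize the private notes in ./research/ and flag anything unusual",
    target_host="rpi-airgap01",  # hypothetical local-only worker
)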

File-Based Coordination Scales

Everyone worries about race conditions and synchronization in distributed systems. File locking with atomic rename operations (write to temp file, then rename) solves this elegantly. NFS handles the distribution.

Specialization Beats Generalization

Having Claude do everything is less effective than having Claude orchestrate, Gemini handle agents, and Ollama do research. Each AI playing to its strengths produces better results than any single AI attempting everything.

The Future: Swarm Intelligence

What these experiments point toward is something bigger: swarm intelligence for AI assistants.

Imagine combining both approaches:

  • Multiple machines running 3Brains setups
  • Each machine with its own Claude, Gemini, and Ollama instances
  • All coordinated through the distributed queue
  • Tasks flowing to the most appropriate AI on the most appropriate hardware

Need heavy computation? Route to the server with GPU access. Need privacy? Route to the air-gapped machine. Need Google integration? Route to the Gemini instance. Need local inference at zero cost? Route to any Ollama instance.
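
The routing layer itself could be a handful of lines on top of the same queue. A speculative sketch - the capability tags and host names are invented for illustration:

from claude_queue import ClaudeQueue

# Hypothetical capability map: which hosts offer which kind of work
CAPABILITIES = {
    "gpu": ["server01"],
    "private": ["rpi-airgap01"],
    "google": ["workstation01"],   # host whose 3Brains pane runs the Gemini CLI
    "local": ["rpi01", "rpi05"],   # hosts running Ollama
}

def route(q: ClaudeQueue, prompt: str, needs: str):
    # Send the task to the first host advertising the needed capability,
    # or leave it untargeted so any available worker can claim it
    hosts = CAPABILITIES.get(needs, [])
    return q.submit_task(prompt, target_host=hosts[0] if hosts else None)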

And it all coordinates through simple files on a shared filesystem.

Why This Changes Everything

These experiments demonstrate that you can own your AI infrastructure. Not rent it through APIs. Not depend on cloud platforms. Own it.

Your context lives in files on your disk. Your AI instances run on your hardware (or hardware you control). Your orchestration uses open protocols and tools. Your conversations persist across sessions. Your costs are predictable (hardware + electricity, not per-token API fees).

This is the vision NetworkChuck outlined in his terminal AI videos, but taken further. Not just one AI in a terminal, but many AIs. Not just on one machine, but distributed across your infrastructure.

Getting Started

Both systems are straightforward to set up:

3Brains requires:

  • tmux installed (apt-get install tmux)
  • At least one AI CLI tool (Claude, Gemini, or Ollama)
  • A bash script to launch the layout

Multi-Instance Claude requires:

  • Shared NFS mount across your machines
  • Claude installed on each participating machine
  • The queue library and worker scripts (pure Python)
  • Passwordless SSH between machines

Total setup time for either: under an hour.

The Broader Insight

What makes these experiments significant isn't the specific implementations - it's what they demonstrate about the future of AI assistance.

We've been conditioned to think of AI as monolithic services accessed through web interfaces. One conversation. One model. One provider. Rate-limited. API-metered. Context-constrained.

But AI doesn't have to work that way. AI can be:

  • Distributed - Running where you need it, on hardware you control
  • Specialized - Different models for different tasks
  • Coordinated - Multiple instances working together
  • Persistent - Context maintained across sessions in files you own
  • Private - Sensitive work routed to local-only models
  • Accessible - Built with simple tools anyone can deploy

This is AI everywhere. Not locked in browsers. Not limited to single instances. Not controlled by platforms.

AI distributed across your infrastructure, coordinated by simple protocols, specialized for different tasks, and entirely under your control.

The tools already exist. The infrastructure is straightforward. The only limit is imagination.