An honest look at building a multi-agent AI system that turned one automation developer into a small team. 18 specialised agents, a structured knowledge base, and the architecture that makes it work.
The automation work was getting more complex. The old approach stopped working.
Using Gemini or Claude in the browser. Copy-pasting context. Explaining the project from scratch every conversation. Losing track of decisions made last week. Repeating mistakes because the AI forgot the lesson learnt.
Projects were getting bigger. 12+ workflow ecosystems. Multiple APIs. Complex business logic. The context was too scattered, too big. I was spending more time re-explaining than building. I had become the bottleneck.
Build an AI coworker. One that knows the codebase, the conventions, the past mistakes. One that can fill knowledge gaps, pick up where I hit my limits, and push well past them. A force multiplier, not just a chat interface.
This had to work with existing tools. No custom infrastructure. No team to maintain it. Just me, an AI, and whatever I could build in the gaps between client work. It had to be practical, not theoretical.
Every component was chosen for a specific reason. Here's what I use and why.
Core reasoning engine
Terminal-based interface
Persistent knowledge base
Agent specs and docs
Agent scripts
Version control
The browser interface resets context every conversation. A CLI tool running locally has access to the entire file system. It can read documentation, execute scripts, and maintain context across sessions.
Claude writes better code and works natively with OpenCode. More importantly, the Pro and Max plans give a fixed monthly cost instead of unpredictable API bills that scale with usage.
Vector databases add complexity and cost. For a single-user system, simple file-based knowledge just works. The AI reads markdown files directly. No embedding, no retrieval tuning, no infrastructure.
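To make that concrete, here's a minimal sketch of file-based retrieval, assuming a hypothetical knowledge/ directory of markdown files; the paths and the keyword filter are illustrative, not the actual implementation:

```python
from pathlib import Path

# Hypothetical layout: knowledge/ holds plain markdown files,
# e.g. knowledge/conventions/naming-standards.md
KNOWLEDGE_DIR = Path("knowledge")

def load_context(topic: str) -> str:
    """Collect every markdown file whose name or contents mention the topic.

    No embeddings, no vector store: a recursive glob and a substring
    check are enough when one person curates a few hundred files.
    """
    relevant = []
    for md_file in sorted(KNOWLEDGE_DIR.rglob("*.md")):
        text = md_file.read_text(encoding="utf-8")
        if topic.lower() in md_file.name.lower() or topic.lower() in text.lower():
            relevant.append(f"<!-- {md_file} -->\n{text}")
    return "\n\n".join(relevant)

# The combined text is handed to the model as session context.
print(load_context("naming"))
```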
One giant prompt trying to do everything fails. Specialised agents with focused responsibilities perform better. An agent that only does database work knows SQL deeply. An agent that only deploys workflows knows the n8n API inside out.
Each agent has a specific role. Together they cover the full development lifecycle.
When I give the copilot a task, the Orchestrator (Agent 00) analyses it and routes to the right specialist. Need a new workflow? Agent 05. Need to edit an existing one? Agent 13. Need database changes? Agent 15. Each agent has its own specification file that defines exactly what it does and how.
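In spirit, the routing is just a lookup from task type to specialist. A toy sketch follows; agents 05, 13, and 15 come from the description above, while the keywords and fallback are invented for illustration. In the real system the Orchestrator is an LLM agent reading a spec file, not a keyword match:

```python
# Toy routing table. Agents 05, 13 and 15 are taken from the article;
# the keywords and the fallback are illustrative only.
ROUTES = {
    "new workflow": "Agent 05 - generates new workflow JSON",
    "edit workflow": "Agent 13 - edits existing live workflows",
    "database": "Agent 15 - SQL operations on Supabase/PostgreSQL",
}

def route(task: str) -> str:
    """Agent 00 in miniature: match the task against known specialisms,
    otherwise ask for clarification instead of guessing."""
    task_lower = task.lower()
    for keyword, agent in ROUTES.items():
        if keyword in task_lower:
            return agent
    return "Agent 00 - needs clarification before routing"

print(route("Build a new workflow that syncs invoices to the CRM"))
```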
Analyses tasks and routes to specialists. The traffic controller for the whole system.
Requirements gathering and scoping. Creates project briefs from vague ideas.
Technical architecture design. Turns requirements into workflow blueprints.
Optimises AI prompts for workflow nodes. Makes AI steps reliable.
External API integration. Handles authentication, endpoints, error handling.
Generates new workflow JSON from architecture specs.
Deploys workflows to the n8n instance via API.
Creates and updates project documentation in Notion.
Git operations. Commits, branches, version management.
Workflow health checks and alerting.
Automated quality validation. Catches errors before deployment.
Edits existing live workflows via the n8n API.
Drafts emails and messages in my writing style.
SQL operations on Supabase/PostgreSQL.
Repository operations, issues, PRs.
Creates single-file HTML tools and dashboards.
The copilot doesn't just have tools. It has context. Everything I've learnt, documented.
The knowledge base is organised into categories. When the copilot needs to know how to name a file, it reads the naming standards. When it needs to integrate with an API, it reads the API documentation. When it makes a mistake, I document it in LESSONS-LEARNT.md so it never happens again.
JavaScript best practices for n8n nodes. Python standards (PEP 8, type hints). Common patterns and anti-patterns specific to workflow development.
File naming standards by type. Directory structure rules. Workflow versioning policy. The copilot never asks "where should I put this?" It knows.
Project operations guide. Agent spawning guide. Session management protocol. When I figure out the best way to do something, I document it so the copilot always does it that way.
29 markdown files covering every API I use. Endpoints, authentication, rate limits, common gotchas. The copilot can write API integrations without me explaining the API.
Every time something breaks in a non-obvious way, I document it. The copilot reads this file. It knows that comment syntax matters in n8n code nodes. It knows about rate limit gotchas. It learns from my mistakes.
New AI sessions start with a mandatory onboarding checklist. The copilot must read 8 core documents before taking any action. This ensures it never creates files in the wrong location, uses incorrect naming, or forgets the conventions. It takes about 5 seconds and prevents hours of rework.
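The idea is simple enough to sketch. The file names below are hypothetical (apart from LESSONS-LEARNT.md), and in the real system the checklist lives in the agent instructions rather than a script:

```python
from pathlib import Path

# Hypothetical core documents; the real list has 8 entries with other names.
ONBOARDING_DOCS = [
    "knowledge/conventions/naming-standards.md",
    "knowledge/conventions/directory-structure.md",
    "knowledge/LESSONS-LEARNT.md",
]

def onboard() -> str:
    """Refuse to start a session until every core document is present and read."""
    missing = [doc for doc in ONBOARDING_DOCS if not Path(doc).exists()]
    if missing:
        raise FileNotFoundError(f"Onboarding incomplete, missing: {missing}")
    return "\n\n".join(
        Path(doc).read_text(encoding="utf-8") for doc in ONBOARDING_DOCS
    )
```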
From vague request to deployed solution.
I describe what I need, provide context, and point to relevant files. The copilot asks clarifying questions to fully understand the requirement before proceeding.
The Orchestrator determines this needs: Project Planner (requirements), Project Structurer (architecture), n8n Specialist (build), and Deployment (push live).
Project brief gets created. Architecture gets designed. The copilot asks clarifying questions if needed. I approve the plan before building starts.
The n8n Specialist creates the workflow JSON following all conventions. It knows the naming standards, the structure requirements, the error handling patterns.
The Workflow Critic checks for disconnected nodes, missing error handling, incorrect patterns. Catches issues before they hit production.
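One of those checks is mechanical enough to sketch. Assuming the usual n8n export shape (a "nodes" array plus a "connections" object keyed by the source node's name), a disconnected-node check looks roughly like this; the real critic works through its spec, not this script:

```python
import json

def disconnected_nodes(workflow_json: str) -> list[str]:
    """Return nodes that neither send nor receive a connection.

    Assumes the common n8n export shape: a "nodes" list and a
    "connections" dict keyed by source node name, where each output
    type (e.g. "main") holds lists of {"node": target, ...} links.
    """
    wf = json.loads(workflow_json)
    names = {node["name"] for node in wf.get("nodes", [])}
    connected = set(wf.get("connections", {}))        # nodes with outgoing links
    for outputs in wf.get("connections", {}).values():
        for output_type in outputs.values():           # e.g. "main"
            for branch in output_type:                  # each output index
                for link in branch:
                    connected.add(link["node"])         # nodes with incoming links
    return sorted(names - connected)
```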
The workflow gets deployed via API. Git commit gets created. Documentation gets updated. All automatic.
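A minimal sketch of that last step, assuming the n8n public API (a workflows endpoint under /api/v1 with an X-N8N-API-KEY header) and the requests library; the instance URL, key handling, and commit message are placeholders:

```python
import json
import subprocess

import requests

N8N_URL = "https://n8n.example.com"   # placeholder instance URL
API_KEY = "..."                        # read from an env var in practice

def deploy(workflow_path: str) -> str:
    """Push a workflow JSON file to n8n, then commit it to git."""
    with open(workflow_path, encoding="utf-8") as f:
        workflow = json.load(f)

    resp = requests.post(
        f"{N8N_URL}/api/v1/workflows",
        headers={"X-N8N-API-KEY": API_KEY},
        json=workflow,
        timeout=30,
    )
    resp.raise_for_status()
    workflow_id = resp.json()["id"]

    subprocess.run(["git", "add", workflow_path], check=True)
    subprocess.run(
        ["git", "commit", "-m", f"Deploy workflow {workflow_id}"], check=True
    )
    return workflow_id
```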
Building an AI copilot taught me more about software development than the AI itself did.
The AI is only as good as its context. Every hour spent documenting conventions, patterns, and lessons learnt pays dividends. The knowledge base is the real asset. The AI is just the interface to it.
One agent trying to do everything fails. Eighteen agents, each doing one thing well, succeed. The cognitive load of context-switching is real for AI too. Keep agents focused.
When the copilot makes a mistake, don't just fix it. Document why it happened and how to prevent it. The LESSONS-LEARNT.md file is one of the most valuable files in the system.
Strict naming conventions, file locations, and coding standards seem bureaucratic until you have 100+ workflows. Then they're the only thing keeping the system navigable.
Mandatory onboarding protocols, quality gates, self-critique steps. The AI will confidently do the wrong thing if you let it. Build in checkpoints.
I wasted time on vector databases and custom tooling before realising simple markdown files work better. Start with the simplest thing that could work. You can add complexity later if needed.
The copilot multiplies my capabilities; it doesn't replace my judgement. I test everything, guide it when it goes off track, and sometimes do things myself and explain why. I make suggestions, review outputs, and handle client communication. It's a partnership, not delegation.
Every project adds to the knowledge base. Every mistake becomes documentation. Six months in, the copilot handles tasks that would have broken it on day one.
Everything I wish someone had told me before starting. The lessons that cost me time.
I needed multiple AI models, not just Claude. OpenCode is free, open source, has a clean interface, and lets you swap models easily. This saved me when Anthropic randomly banned my account (a known bug affecting many developers). No response to appeals. Had to create a new account and now I keep it to one device only.
Claude through OpenCode has a 200k token context window; Gemini in the browser has 1M. Space runs out fast. Agents don't consume your session tokens, which helps, but the real reason is consistency. Agents do exactly the same thing every time, based on their prompt and knowledge base. The chat AI doesn't.
Within days you'll have 10+ agents. You won't remember their names, IDs, or numbers. The Orchestrator's sole purpose is to point the AI to the correct agent and enforce usage. Now I just say "run via Agent 0" and the right specialist gets called.
I asked it to create and spawn agents. It did a good job for simple tasks, but complex ones kept getting stuck. The missing piece was context. I fell into the classic trap of expecting the AI to know things because it's "smart".
I don't write my own prompts anymore. The Prompt Engineer agent handles everything AI-related: OpenCode commands, agent specifications, deep research prompts, workflow AI node prompts, documentation. If it involves instructing an AI, this agent writes it better and faster than I ever could. Here's the loop: I brief it, it generates what I need, I test and refine. This is how the agent was improved too: I researched advanced prompt styles, fed the research back, and the agent upgraded itself. Self-healing loop. Agents fixing agents.
I was repeating myself every session. Pointing to the same docs. Explaining the same structure. Now /start reads every directory, the documentation, and the knowledge base, knows the agents, and waits for my command. /refresh is a lighter version for mid-session resets.
The session-close command handles the other end: it summarises the session, saves the summary under the correct project folder, checks for misplaced files, queries n8n for updated workflows that need pushing to GitHub, and checks whether agents or the knowledge base have changed and need committing. One command, full sync.
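The n8n part of that sync is the most mechanical piece. A sketch, assuming the public API's workflow listing returns a data array with an updatedAt timestamp per workflow; the last-sync bookkeeping and URLs are invented for illustration:

```python
from datetime import datetime, timezone

import requests

N8N_URL = "https://n8n.example.com"   # placeholder
API_KEY = "..."                        # placeholder

def workflows_changed_since(last_sync: datetime) -> list[dict]:
    """List workflows edited in n8n after the last GitHub push."""
    resp = requests.get(
        f"{N8N_URL}/api/v1/workflows",
        headers={"X-N8N-API-KEY": API_KEY},
        timeout=30,
    )
    resp.raise_for_status()
    changed = []
    for wf in resp.json()["data"]:
        updated = datetime.fromisoformat(wf["updatedAt"].replace("Z", "+00:00"))
        if updated > last_sync:
            changed.append({"id": wf["id"], "name": wf["name"]})
    return changed

print(workflows_changed_since(datetime(2024, 1, 1, tzinfo=timezone.utc)))
```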
Each agent connects to my tech stack. The n8n agent has API access to all workflows. The database agent can query, edit, delete, and create records directly. They gather their own context without me copy-pasting. Recently I ran a full database audit: structure, speed, data quality. It identified issues I didn't know existed, fixed them, made the database 5x faster with proper indexing, and flagged workflow problems based on data stuck in certain statuses. This is what happens when specialists have access to their tools.
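The indexing part of that audit can be approximated with PostgreSQL's own statistics. A sketch using psycopg2 and pg_stat_user_tables; the connection string and thresholds are placeholders, and the real database agent works through its spec rather than this exact query:

```python
import psycopg2

# Placeholder DSN; in practice this comes from the environment.
DSN = "postgresql://user:password@db.example.com:5432/postgres"

# Tables scanned sequentially far more often than via an index are the
# usual candidates for new indexes.
AUDIT_QUERY = """
    SELECT relname, seq_scan, idx_scan, n_live_tup
    FROM pg_stat_user_tables
    WHERE seq_scan > 10 * COALESCE(idx_scan, 0)
      AND n_live_tup > 10000
    ORDER BY seq_scan DESC;
"""

with psycopg2.connect(DSN) as conn:
    with conn.cursor() as cur:
        cur.execute(AUDIT_QUERY)
        for table, seq_scan, idx_scan, rows in cur.fetchall():
            print(f"{table}: {seq_scan} seq scans vs {idx_scan} index scans "
                  f"over {rows} rows - consider an index on its filter columns")
```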
The principles here apply to any complex automation project. Let's talk about what's possible for yours.