We Built 200 Agents, Then Consolidated to 33 — Here's What We Learned
Over the past year, we’ve built 200+ agent definitions, then consolidated to 33 active agents, 178 skills, 155 hooks, 30 triggers, and 74 prompt files, all living inside a single .claude/ directory. This is what we learned.
The Architecture That Emerged
We didn’t plan to build a giant roster of agents. We started with 7. But as we deployed RelayLaunch for client engagements, patterns emerged that demanded specialization. And then we learned that most of that specialization belongs in skills, not agents.
The current structure:
.claude/
├── agents/ # 33 active agents (consolidated from 200+)
├── skills/ # 178 skills (composable procedural knowledge)
├── hooks/ # 155 hooks (auto-format, brand-scan, quality-gate, etc.)
├── triggers/ # 30 triggers (scheduled and event-driven activation)
└── memory/ # Tiered memory (index + topic files)
Lesson 1: Skills Beat Agents
This is the biggest lesson. We stopped building new agents and started building skills instead.
An agent is a persona with instructions. A skill is a procedure with tools. The difference matters because:
- Agents live in the context window permanently. 200+ of them would exhaust any token budget.
- Skills load on demand. Only a skill's metadata sits in context until the agent needs the full procedure.
- Skills can include scripts, executables, and code that the agent runs
- Skills are shareable, versionable, and composable
We now build skills by default. Agents are only created when a distinct persona and authority level are needed.
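To make the "metadata first, body on demand" idea concrete, here is a minimal sketch of a lazy skill registry. It is not how our runtime is implemented: the Skill and SkillRegistry classes, the one-file-per-skill layout, and the two example skills are all hypothetical, chosen only to show the loading pattern.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional
import tempfile


@dataclass
class Skill:
    """A skill whose full body stays on disk until it is requested (hypothetical)."""
    name: str
    description: str   # short metadata, always visible to the agent
    path: Path         # assumed location of the full procedure
    _body: Optional[str] = field(default=None, repr=False)

    def load(self) -> str:
        # Read the full procedure on first use only, then cache it.
        if self._body is None:
            self._body = self.path.read_text(encoding="utf-8")
        return self._body


class SkillRegistry:
    def __init__(self) -> None:
        self._skills: dict[str, Skill] = {}

    def register(self, skill: Skill) -> None:
        self._skills[skill.name] = skill

    def metadata(self) -> str:
        # The only text that would go into every context window.
        return "\n".join(f"{s.name}: {s.description}" for s in self._skills.values())

    def use(self, name: str) -> str:
        return self._skills[name].load()

    def loaded(self) -> list[str]:
        return [s.name for s in self._skills.values() if s._body is not None]


# Demo with two made-up skills written to a temporary directory.
skills_dir = Path(tempfile.mkdtemp())
registry = SkillRegistry()
for name, description in [
    ("invoice-followup", "Chase overdue invoices politely"),
    ("slot-filling", "Fill cancelled appointment slots"),
]:
    path = skills_dir / f"{name}.md"
    path.write_text(f"Full procedure for {name} goes here.", encoding="utf-8")
    registry.register(Skill(name, description, path))

print(registry.metadata())           # cheap: names and one-line descriptions
body = registry.use("slot-filling")  # the full body is read only at this point
print(registry.loaded())
```

The context cost grows with the number of skills actually used in a session, not with the number of skills installed.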
Lesson 2: Progressive Disclosure Saves Tokens
Our CLAUDE.md file grew to 400+ lines. Performance dropped. Instructions got ignored. Sound familiar?
The fix: tiered memory loading.
- L0 Abstract (~100 tokens): Always loaded. Routing, awareness, quick checks.
- L1 Overview (~500-2K tokens): Loaded on agent activation. Planning, navigation.
- L2 Detail (unlimited): Loaded only when the agent needs specifics. Execution.
I.S.A., our Chief of Staff agent, routes using L0 only. That’s roughly 97% fewer tokens than loading everything into every context window.
The practical implementation: our .claude/memory/index.md file is always loaded (it’s the directory). Individual memory files (product-decisions.md, client-patterns.md, etc.) are loaded on demand.
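The tiers can be sketched as a small loader. This is an illustration under stated assumptions, not our actual code: the Tier enum, the build_context function, and the convention that a topic file holds an overview and a detail section separated by a "---" line are all invented for the example. Only index.md and product-decisions.md are names from our setup, and the file contents here are placeholders.

```python
from enum import IntEnum
from pathlib import Path
from typing import Sequence
import tempfile


class Tier(IntEnum):
    L0 = 0  # abstract: always loaded
    L1 = 1  # overview: loaded on agent activation
    L2 = 2  # detail: loaded only when specifics are needed


def build_context(memory_dir: Path, tier: Tier, topics: Sequence[str] = ()) -> str:
    """Assemble memory text for a given tier (hypothetical helper)."""
    # L0: the index is always part of the context.
    parts = [(memory_dir / "index.md").read_text(encoding="utf-8")]
    if tier >= Tier.L1:
        for topic in topics:
            text = (memory_dir / f"{topic}.md").read_text(encoding="utf-8")
            # Assumed file layout: overview, a '---' line, then detail.
            overview, _, detail = text.partition("\n---\n")
            parts.append(overview)
            if tier >= Tier.L2:
                parts.append(detail)
    return "\n\n".join(parts)


# Demo with placeholder content in a temporary directory.
memory_dir = Path(tempfile.mkdtemp())
(memory_dir / "index.md").write_text(
    "product-decisions: pricing and roadmap calls", encoding="utf-8"
)
(memory_dir / "product-decisions.md").write_text(
    "Overview: placeholder summary of decisions.\n---\n"
    "Detail: placeholder full decision log.",
    encoding="utf-8",
)

routing = build_context(memory_dir, Tier.L0)
planning = build_context(memory_dir, Tier.L1, ["product-decisions"])
execution = build_context(memory_dir, Tier.L2, ["product-decisions"])
print(len(routing), len(planning), len(execution))  # sizes grow with each tier
```

A router that only ever calls build_context with Tier.L0 never pays for topic files at all, which is where the savings come from.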
Lesson 3: Hooks Are Your Quality Floor
Of the 155 hooks, three run on every file edit:
- auto-format.sh: consistent code formatting
- brand-color-scan.sh: enforces our brand color system (Obsidian, Amber, White, Zinc)
- quality-gate.sh: catches accessibility, performance, and SEO issues
These run silently. When they pass, you don’t even notice them. When they catch something, they block the change before it ships. This is the quality floor that makes 23 buyer-facing specialists safe to operate at scale.
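The blocking behavior can be sketched as a small runner. This post does not cover how the hooks are wired into the editor, so treat the following as an assumption-laden illustration: run_hooks, edit_allowed, and HookResult are hypothetical names, a non-zero exit code is assumed to mean "fail", and the demo uses inline Python one-liners as stand-ins for the real shell scripts.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class HookResult:
    name: str
    passed: bool
    output: str


def run_hooks(hooks: dict[str, list[str]], file_path: str) -> list[HookResult]:
    """Run each hook command with the edited file path as its last argument."""
    results = []
    for name, command in hooks.items():
        proc = subprocess.run(command + [file_path], capture_output=True, text=True)
        # Assumption: exit code 0 means the hook passed.
        passed = proc.returncode == 0
        results.append(HookResult(name, passed, (proc.stdout + proc.stderr).strip()))
    return results


def edit_allowed(results: list[HookResult]) -> bool:
    # A single failing hook blocks the change.
    return all(r.passed for r in results)


# Stand-in hooks so the sketch runs anywhere. In a real setup each command
# might look like ["bash", "<path to>/auto-format.sh"] (path is an assumption).
demo_hooks = {
    "auto-format": [sys.executable, "-c", "import sys; sys.exit(0)"],
    "brand-color-scan": [
        sys.executable,
        "-c",
        "import sys; print('off-palette color in', sys.argv[1]); sys.exit(1)",
    ],
}

results = run_hooks(demo_hooks, "landing.css")
for r in results:
    if not r.passed:
        print(f"{r.name} blocked the edit: {r.output}")
print("edit allowed:", edit_allowed(results))
```

Passing hooks print nothing in this sketch, which mirrors the "you don't even notice them" behavior described above.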
Lesson 4: Agents Need to Dream
After 20+ sessions, our agent memory files were full of noise and contradictions. The session-by-session “automemory” approach kept adding without ever consolidating.
Our fix: AutoDream, a memory consolidation system inspired by human REM sleep.
Between sessions, the system:
- Reviews all memory files for staleness
- Searches recent sessions for corrections and new decisions
- Merges, deduplicates, and resolves contradictions
- Prunes entries older than 90 days with no recent references
The result: agents on day 30 of working with us are dramatically better than agents on day 1. The learning actually compounds.
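Two of the consolidation steps, deduplication and the 90-day prune, are simple enough to sketch. The MemoryEntry shape and the consolidate function are hypothetical, and "the newest write wins" is a simplifying assumption standing in for contradiction resolution. The entries in the demo are made up.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class MemoryEntry:
    key: str               # what the entry is about
    value: str
    written: date
    last_referenced: date


def consolidate(entries: list[MemoryEntry], today: date,
                max_age_days: int = 90) -> list[MemoryEntry]:
    """Merge entries per key, then prune stale, unreferenced ones."""
    cutoff = today - timedelta(days=max_age_days)
    latest: dict[str, MemoryEntry] = {}
    for entry in entries:
        current = latest.get(entry.key)
        if current is None:
            latest[entry.key] = entry
            continue
        # Assumption: on conflict, the most recently written value wins.
        newer, older = (entry, current) if entry.written > current.written else (current, entry)
        latest[entry.key] = MemoryEntry(
            newer.key,
            newer.value,
            newer.written,
            max(newer.last_referenced, older.last_referenced),
        )
    # Prune only entries that are both old and not referenced since the cutoff.
    return [
        e for e in latest.values()
        if not (e.written < cutoff and e.last_referenced < cutoff)
    ]


# Made-up entries for illustration.
today = date(2025, 6, 1)
entries = [
    MemoryEntry("brand-accent", "Amber", date(2025, 1, 10), date(2025, 5, 20)),
    MemoryEntry("report-format", "PDF", date(2025, 3, 1), date(2025, 3, 1)),
    MemoryEntry("report-format", "Markdown", date(2025, 5, 15), date(2025, 5, 15)),
    MemoryEntry("old-promo", "spring discount", date(2025, 1, 5), date(2025, 1, 6)),
]

kept = consolidate(entries, today)
for e in kept:
    print(e.key, "->", e.value)
```

In the demo, the old but still referenced brand entry survives, the contradictory report format collapses to the newer value, and the unreferenced promo is dropped.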
Lesson 5: Rivalries Produce Better Output Than Agreement
Our 16 Room directors have intentional rivalries. Sales vs Operations. Growth vs Quality. Content vs Engineering.
When we first built the system, we tried to make agents collaborate harmoniously. The output was mediocre, a consensus of the lowest common denominator.
When we introduced rivalries, the output quality jumped. Why? Because disagreement surfaces blind spots. A recommendation that survives challenge from Sales, Operations, AND Quality is a recommendation you can trust.
Lesson 6: Code Is the Universal Interface
After building agents for 20+ different domains (wellness spas, professional services, fitness studios, e-commerce), we realized the underlying agent is more universal than we thought.
The model can:
- Pull data via API calls
- Organize it in the file system
- Analyze it with Python
- Synthesize insights in any output format
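The four steps above can be sketched end to end in a few lines. Everything specific here is hypothetical: fetch_bookings is a stub standing in for a real API call, the booking numbers are invented, and the file names are arbitrary. The point is only that the same pull, store, analyze, report shape applies regardless of vertical.

```python
import json
import statistics
import tempfile
from pathlib import Path


def fetch_bookings() -> list[dict]:
    # Stub for an API call. A real agent would hit the client's booking system.
    return [
        {"day": "Mon", "booked": 12, "capacity": 20},
        {"day": "Tue", "booked": 5, "capacity": 20},
        {"day": "Wed", "booked": 16, "capacity": 20},
    ]


def run_pipeline(workdir: Path) -> str:
    # 1. Pull data and 2. organize it in the file system.
    raw_path = workdir / "raw" / "bookings.json"
    raw_path.parent.mkdir(parents=True, exist_ok=True)
    raw_path.write_text(json.dumps(fetch_bookings(), indent=2), encoding="utf-8")

    # 3. Analyze it with Python.
    rows = json.loads(raw_path.read_text(encoding="utf-8"))
    utilization = {r["day"]: r["booked"] / r["capacity"] for r in rows}
    mean_util = statistics.mean(list(utilization.values()))
    slowest = min(utilization, key=utilization.get)

    # 4. Synthesize the result in the requested output format.
    report = (
        f"Mean utilization: {mean_util:.0%}\n"
        f"Slowest day: {slowest} ({utilization[slowest]:.0%})\n"
    )
    (workdir / "report.md").write_text(report, encoding="utf-8")
    return report


workdir = Path(tempfile.mkdtemp())
report = run_pipeline(workdir)
print(report)
```

Swapping the stub and the metric changes the domain. The scaffolding around them stays the same.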
The core scaffolding is just bash and file system access. The differentiation is in the skills: the procedural knowledge that tells the agent how to do domain-specific work.
This means deploying to a new vertical is primarily a skills problem, not an agent-building problem.
What We’d Do Differently
- Start with skills, not agents. We over-indexed on agent personas early on.
- Implement tiered memory from day one. The CLAUDE.md bloat was avoidable.
- Add hooks before the first deployment. Quality gates are too important to add later.
- Build the dream system earlier. Memory consolidation should have been built alongside automemory.
Try It Yourself
RelayLaunch is available in four ways:
- Free Ops Scan: Score your operations in 60 seconds
- Starter ($149/mo): Morning Brief, client recovery, slot filling — flat monthly after proof
- Pro ($299/mo): Full 16-room coverage, all integrations
- Team ($999/mo): 5 team seats, full platform access
And if you’re a developer, Relay Wire lets you integrate our council system into your own applications via API.