
We Built 200 Agents, Then Consolidated to 33 — Here's What We Learned

March 25, 2026 · Victor David Medina · 4 min read · AI Operations

Over the past year, we built 200+ agent definitions, then consolidated them into 33 active agents, 178 skills, 155 hooks, 30 triggers, and 74 prompt files, all living inside a single .claude/ directory. This is what we learned.

The Architecture That Emerged

We didn’t plan to build a giant roster of agents. We started with 7. But as we deployed RelayLaunch for client engagements, patterns emerged that demanded specialization. And then we learned that most of that specialization belongs in skills, not agents.

The current structure:

.claude/
├── agents/     # 33 active agents (consolidated from 200+)
├── skills/     # 178 skills (composable procedural knowledge)
├── hooks/      # 155 hooks (auto-format, brand-scan, quality-gate, etc.)
├── triggers/   # 30 triggers (scheduled and event-driven activation)
└── memory/     # Tiered memory (index + topic files)

Lesson 1: Skills Beat Agents

This is the biggest lesson. We stopped building new agents and started building skills instead.

An agent is a persona with instructions. A skill is a procedure with tools. The difference matters because:

Agents live in the context window permanently. Loading 200+ of them would blow any token budget.
Skills load on demand. Only a skill's metadata is shown until the agent needs the skill itself.
Skills can include scripts, executables, and code that the agent runs.
Skills are shareable, versionable, and composable.

We now build skills by default. Agents are only created when a distinct persona and authority level are needed.
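To make the on-demand idea concrete, here is a minimal sketch of a metadata-only skill index. It assumes each skill lives in its own folder with a SKILL.md file that starts with a small `key: value` front matter block; that layout, and the names SkillStub, read_front_matter, and index_skills, are illustrative assumptions rather than a description of our actual loader.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SkillStub:
    """Cheap metadata that is always visible to the agent (hypothetical type)."""
    name: str
    description: str
    path: Path

    def load_body(self) -> str:
        """Return the full procedure; call this only when the skill is invoked."""
        text = self.path.read_text(encoding="utf-8")
        if text.startswith("---"):
            # Drop the front matter block and keep the instructions.
            parts = text.split("---", 2)
            if len(parts) == 3:
                return parts[2].strip()
        return text.strip()


def read_front_matter(path: Path) -> dict[str, str]:
    """Parse simple `key: value` lines between the first two `---` markers."""
    meta: dict[str, str] = {}
    lines = path.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return meta
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def index_skills(skills_dir: Path) -> list[SkillStub]:
    """Build the metadata-only index; skill bodies stay out of the context."""
    stubs = []
    for skill_file in sorted(skills_dir.glob("*/SKILL.md")):
        meta = read_front_matter(skill_file)
        stubs.append(SkillStub(
            name=meta.get("name", skill_file.parent.name),
            description=meta.get("description", ""),
            path=skill_file,
        ))
    return stubs
```

In this sketch only the name and description of each skill would be placed in the prompt; load_body() is called for the one skill the agent actually selects.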

Lesson 2: Progressive Disclosure Saves Tokens

Our CLAUDE.md file grew to 400+ lines. Performance dropped. Instructions got ignored. Sound familiar?

The fix: tiered memory loading.

L0 Abstract (~100 tokens): Always loaded. Routing, awareness, quick checks.
L1 Overview (~500-2K tokens): Loaded on agent activation. Planning, navigation.
L2 Detail (unlimited): Loaded only when the agent needs specifics. Execution.

I.S.A., our Chief of Staff agent, routes using L0 only. That is roughly a 97% token saving compared with loading everything into every context window.

The practical implementation: our .claude/memory/index.md file is always loaded (it’s the directory). Individual memory files (product-decisions.md, client-patterns.md, etc.) are loaded on demand.
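A minimal sketch of that index-plus-on-demand pattern, assuming the memory directory holds an index.md and plain-text topic files. The TieredMemory class and the rough_tokens helper are hypothetical names for illustration, and the 4-characters-per-token estimate is a crude stand-in for a real tokenizer.

```python
from pathlib import Path


class TieredMemory:
    """Always holds the index (L0); reads topic files only on request."""

    def __init__(self, memory_dir: Path, index_name: str = "index.md"):
        self.memory_dir = memory_dir
        # L0: the small directory-style index, loaded once up front.
        self.index = (memory_dir / index_name).read_text(encoding="utf-8")
        self._loaded: dict[str, str] = {}  # topic files pulled in so far

    def load_topic(self, filename: str) -> str:
        """Load one topic file on demand and cache it for this session."""
        if filename not in self._loaded:
            path = self.memory_dir / filename
            self._loaded[filename] = path.read_text(encoding="utf-8")
        return self._loaded[filename]

    def context(self) -> str:
        """Text that would be placed in the context window right now."""
        return "\n\n".join([self.index, *self._loaded.values()])


def rough_tokens(text: str) -> int:
    """Crude estimate (about 4 characters per token); not a real tokenizer."""
    return max(1, len(text) // 4)
```

Comparing rough_tokens(memory.context()) before and after a load_topic() call is one way to see what each topic file costs before deciding whether it belongs in a higher tier.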

Lesson 3: Hooks Are Your Quality Floor

Three of our hooks run on every file edit:

auto-format.sh: consistent code formatting
brand-color-scan.sh: enforces our brand color system (Obsidian, Amber, White, Zinc)
quality-gate.sh: catches accessibility, performance, and SEO issues

These run silently. When they pass, you don’t even notice them. When they catch something, they block the change before it ships. This is the quality floor that makes 23 buyer-facing specialists safe to operate at scale.
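The pass-silently, block-on-failure behavior can be sketched as a small runner. This is an illustration only: it assumes each hook is a shell script that takes the edited file path as its argument and signals a problem with a non-zero exit code. The run_hooks and edit_allowed functions are hypothetical, not the mechanism Claude Code itself uses to wire up hooks.

```python
import subprocess
from pathlib import Path

# Script names from the list above; the runner around them is an assumption.
HOOKS = ("auto-format.sh", "brand-color-scan.sh", "quality-gate.sh")


def run_hooks(edited_file: Path, hooks_dir: Path,
              hooks: tuple[str, ...] = HOOKS) -> list[str]:
    """Run each hook against the edited file; return the names that failed."""
    failures = []
    for name in hooks:
        result = subprocess.run(
            ["sh", str(hooks_dir / name), str(edited_file)],
            capture_output=True,
            text=True,
        )
        if result.returncode != 0:
            failures.append(name)
            # Stay silent on success; surface the hook's message on failure.
            print(f"[{name}] blocked {edited_file.name}: {result.stderr.strip()}")
    return failures


def edit_allowed(edited_file: Path, hooks_dir: Path) -> bool:
    """An edit ships only if every hook exits cleanly."""
    return not run_hooks(edited_file, hooks_dir)
```

Running every hook rather than stopping at the first failure is a design choice in this sketch: the author of the change sees all blocking issues in one pass.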

Lesson 4: Agents Need to Dream

After 20+ sessions, our agent memory files were full of noise and contradictions. The session-by-session “automemory” approach kept adding without ever consolidating.

Our fix: AutoDream, a memory consolidation system inspired by human REM sleep.

Between sessions, the system:

Reviews all memory files for staleness
Searches recent sessions for corrections and new decisions
Merges, deduplicates, and resolves contradictions
Prunes entries older than 90 days with no recent references

The result: agents on day 30 of working with us are dramatically better than agents on day 1. The learning actually compounds.
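The consolidation pass can be sketched in a few lines. This is a simplified illustration, not AutoDream itself: it assumes each memory entry carries a topic, a creation date, and a last-referenced date, treats "no recent references" as "not referenced within the same 90-day window", and resolves contradictions by letting the newest entry per topic win. MemoryEntry and consolidate are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class MemoryEntry:
    topic: str             # what the entry is about, e.g. "invoice-tone"
    text: str              # the remembered decision or correction
    created: date
    last_referenced: date


def consolidate(entries: list[MemoryEntry], today: date,
                max_age_days: int = 90) -> list[MemoryEntry]:
    """Prune stale entries, then keep only the newest entry per topic."""
    cutoff = today - timedelta(days=max_age_days)
    newest_by_topic: dict[str, MemoryEntry] = {}
    for entry in entries:
        # Prune: created before the cutoff and not referenced since.
        if entry.created < cutoff and entry.last_referenced < cutoff:
            continue
        # Merge, dedupe, resolve: for one topic, the most recent entry wins.
        current = newest_by_topic.get(entry.topic)
        if current is None or entry.created > current.created:
            newest_by_topic[entry.topic] = entry
    return sorted(newest_by_topic.values(), key=lambda e: e.topic)
```

A real consolidation step would need something smarter than "newest wins" to detect contradictions between differently worded entries; the sketch only shows where that logic would sit.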

Lesson 5: Rivalries Produce Better Output Than Agreement

Our 16 Room directors have intentional rivalries. Sales vs Operations. Growth vs Quality. Content vs Engineering.

When we first built the system, we tried to make agents collaborate harmoniously. The output was mediocre: a lowest-common-denominator consensus.

When we introduced rivalries, the output quality jumped. Why? Because disagreement surfaces blind spots. A recommendation that survives challenge from Sales, Operations, AND Quality is a recommendation you can trust.

Lesson 6: Code Is the Universal Interface

After building agents for 20+ different domains (wellness spas, professional services, fitness studios, e-commerce), we realized the underlying agent is more universal than we thought.

The model can:

Pull data via API calls
Organize it in the file system
Analyze it with Python
Synthesize insights in any output format

The core scaffolding is just bash and file system access. The differentiation is in the skills: the procedural knowledge that tells the agent how to do domain-specific work.

This means deploying to a new vertical is primarily a skills problem, not an agent-building problem.
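The four steps above (pull, organize, analyze, synthesize) can be sketched as one small pipeline. Everything specific here is an assumption for illustration: the fetch callable stands in for any API call, and the bookings.json file, the revenue field, and the report format are made up rather than taken from a real client deployment.

```python
import json
from pathlib import Path
from statistics import mean
from typing import Callable


def run_pipeline(fetch: Callable[[], list[dict]], workdir: Path) -> str:
    """Pull records, store them on disk, analyze them, and write a report."""
    # 1. Pull: `fetch` is a placeholder for any API call returning records.
    records = fetch()

    # 2. Organize: persist the raw data so later steps can re-read it.
    raw_path = workdir / "raw" / "bookings.json"
    raw_path.parent.mkdir(parents=True, exist_ok=True)
    raw_path.write_text(json.dumps(records, indent=2), encoding="utf-8")

    # 3. Analyze: a plain-Python aggregate over the stored file.
    stored = json.loads(raw_path.read_text(encoding="utf-8"))
    average = mean(r["revenue"] for r in stored)

    # 4. Synthesize: write a short report in the desired output format.
    report = f"Bookings: {len(stored)}\nAverage revenue: {average:.2f}\n"
    (workdir / "report.md").write_text(report, encoding="utf-8")
    return report
```

Nothing in the scaffolding is vertical-specific; the field names and the analysis step are the part a skill would supply.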

What We’d Do Differently

Start with skills, not agents. We over-indexed on agent personas early on.
Implement tiered memory from day one. The CLAUDE.md bloat was avoidable.
Add hooks before the first deployment. Quality gates are too important to add later.
Build the dream system earlier. Memory consolidation should have been built alongside automemory.

Try It Yourself

RelayLaunch is available in four ways:

Free Ops Scan: Score your operations in 60 seconds
Starter ($149/mo): Morning Brief, client recovery, slot filling — flat monthly after proof
Pro ($299/mo): Full 16-room coverage, all integrations
Team ($999/mo): 5 team seats, full platform access

And if you’re a developer, Relay Wire lets you integrate our council system into your own applications via API.