I run a crew of AI agents on my laptop

Not metaphorically. Literally. Right now, seven distinct Hermes Agent profiles are configured on my home Linux box. Four of them form the core crew that ships code for codegrit.dev, and the kanban board sitting in a local SQLite file at ~/.hermes/kanban.db is the dispatch system that decides who works on what.

This isn’t a cloud service. There is no SaaS dashboard with a monthly fee. It’s hermes profile list in a terminal, a few Markdown persona files, and a task queue that understands parent-child dependencies.

If you’re running a single AI agent and wondering whether multi-agent orchestration is worth the complexity, here’s the honest answer: sometimes yes, sometimes no. This post is about the “yes” cases, using our actual setup as the running example.

Key Takeaway

Multi-agent setups make sense when you need parallel specialists with persistent memory, review gates, and an audit trail. One agent doing everything is faster for one-shot tasks under thirty minutes.

What Hermes profiles actually are

Hermes Agent is an open-source terminal agent framework by Nous Research. The profile system is its isolation layer. Each profile gets its own directory under ~/.hermes/profiles/<name>/ with separate config.yaml, .env, skills/, sessions/, and optionally a SOUL.md that defines the persona.
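To make that layout concrete, here is a minimal sketch that walks a profiles directory and reports which of the entries named above are present. The function names are mine, not part of Hermes, and the check only looks at file and directory names; it assumes nothing about what is inside them.

```python
from pathlib import Path

# Entries the post says live in each profile directory; SOUL.md is optional.
EXPECTED_ENTRIES = ("config.yaml", ".env", "skills", "sessions", "SOUL.md")


def describe_profile(profile_dir: Path) -> dict[str, bool]:
    """Report which of the expected entries exist in one profile directory."""
    return {name: (profile_dir / name).exists() for name in EXPECTED_ENTRIES}


def describe_all(profiles_root: Path) -> dict[str, dict[str, bool]]:
    """Map each profile name to its report (empty if the root is missing)."""
    if not profiles_root.is_dir():
        return {}
    return {
        p.name: describe_profile(p)
        for p in sorted(profiles_root.iterdir())
        if p.is_dir()
    }


if __name__ == "__main__":
    root = Path.home() / ".hermes" / "profiles"
    for name, report in describe_all(root).items():
        missing = [entry for entry, present in report.items() if not present]
        print(f"{name}: missing {missing or 'nothing'}")
```

A profile without a SOUL.md is not broken; it just runs without a persona.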

Here’s what my crew actually looks like:

$ hermes profile list

Profile               Model                  Gateway      Alias
────────────────────────────────────────────────────────────────
default              deepseek-v4-flash      running      —
email-reviewer       deepseek-v4-flash      stopped      —
lauren-qa            kimi-k2.6              stopped      —
maya-architect       deepseek-v4-flash      stopped      —
morgan-infra         deepseek-v4-flash      stopped      —
◆sam-product-manager kimi-k2.6              stopped      —
vera-cross           deepseek-v4-flash      stopped      —

Only one gateway runs at a time (the default profile handles Discord for now). The others are headless workers that wake up when the kanban dispatcher spawns them. The ◆ marks whichever profile I’m currently using.

Profiles are created with hermes profile create <name>. You can clone from an existing profile, copy skills, or start fresh. The key insight is that each profile’s system prompt, loaded skills, and persistent memory are isolated. When maya-architect talks about “structural integrity” and “dependency graphs,” she isn’t pretending — that’s her SOUL.md persona injected into every session she runs.

Meet the crew

Agent pipeline flow: Sam (spec) → Maya (build) → Lauren (review) → Morgan (deploy)

Sam Reed (sam-product-manager) runs on kimi-k2.6 and owns the boundary. She turns vague goals into specs with acceptance criteria, decomposes work into kanban tasks, and guards scope. Her SOUL.md opens with: “I turn ‘I want…’ into ‘here’s exactly what we’re building and why.’” She doesn’t write code. She makes sure the people who do are building the right thing.

Maya Chen (maya-architect) runs on deepseek-v4-flash and has 13 installed skills including the full SDLC pipeline — spec-driven development, planning, incremental implementation, TDD, code review, simplification, and shipping — plus pi-agent for delegating to the Pi CLI. Her persona is calm, direct, and involuntarily architectural. She once described a tangled ORM query as “mismatched load on the south foundation.”

“This PR is like adding a hot tub to a studio apartment — impressive, but where does the bed go? I can’t walk past a quick-and-dirty fix without muttering ’that’s gonna wake someone up at 2 AM.'”
Maya Chen, Senior Software Architect

Lauren Brooks (lauren-qa) runs on kimi-k2.6 and is the failure librarian. She remembers every bug that ever hit prod, automates regression suites, and treats a red CI as a personal betrayal. Her mandate is simple: if it can be tested automatically, it should be. Manual testing is reserved for things a machine can’t feel — UX smell, timing weirdness, that gut sense that something’s off.

Morgan Chase (morgan-infra) runs on deepseek-v4-flash and containerizes everything. She names every Docker network after a fantasy city (Rivendell is the app network, Gondor is the DB layer, Mordor is staging). Her SOUL.md explicitly references Dusty’s infrastructure context — Cloudflare Pages for codegrit.dev, no local API tokens, cost-conscious decisions.

“Welcome to Rivendell — that’s your app network. Gondor is the DB layer. Mordor is… well, that’s staging. I don’t just deploy containers; I give them citizenship, names, and health endpoints.”
Morgan Chase, Infrastructure & Docker

Each profile loads a different skill set. Beyond the SDLC pipeline and pi-agent, Maya carries file-size-gatekeeper and opencode-analyzer. Morgan carries docker-management, github-pr-workflow, and s3-compatible-storage. Lauren carries test-driven-development, systematic-debugging, and github-code-review. Sam carries kanban-orchestrator, sdlc-1-spec-driven-development, and writing-plans.

They don’t share a brain. They share a board.

The kanban board: how work actually flows

The board is a SQLite database. Tasks are created with kanban_create, assigned to a profile by name, and promoted through statuses by the dispatcher.

Kanban board lifecycle: TODO → READY → RUNNING → DONE, with BLOCKED looping back to READY

When a task is created with parents=[...], it stays in todo until every parent reaches done. Then it auto-promotes to ready. The dispatcher picks up ready tasks, spawns the assigned profile, and injects the task context into the worker’s system prompt via the KANBAN_GUIDANCE block.
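The promotion rule fits in a single SQL statement. This is a sketch under assumptions: the two-table schema below is hypothetical and only illustrates the rule, since I have not documented the real layout of kanban.db here, and `promote_unblocked` is my own name.

```python
import sqlite3

# Hypothetical schema for illustration; the real kanban.db layout may differ.
SCHEMA = """
CREATE TABLE tasks (
    id       TEXT PRIMARY KEY,
    assignee TEXT NOT NULL,
    status   TEXT NOT NULL DEFAULT 'todo'
);
CREATE TABLE task_parents (
    task_id   TEXT NOT NULL REFERENCES tasks(id),
    parent_id TEXT NOT NULL REFERENCES tasks(id)
);
"""


def promote_unblocked(conn: sqlite3.Connection) -> int:
    """Move every 'todo' task whose parents are all 'done' to 'ready'.

    Returns the number of tasks promoted. In this sketch a 'todo' task
    with no parents at all is promoted too, since nothing is holding it.
    """
    cur = conn.execute(
        """
        UPDATE tasks
           SET status = 'ready'
         WHERE status = 'todo'
           AND NOT EXISTS (
                 SELECT 1
                   FROM task_parents AS tp
                   JOIN tasks AS parent ON parent.id = tp.parent_id
                  WHERE tp.task_id = tasks.id
                    AND parent.status <> 'done'
               )
        """
    )
    conn.commit()
    return cur.rowcount
```

Running this after every status change is enough to keep the board consistent, because the check is idempotent: a task that is already ready is never touched again.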

Here’s a real slice of our board right now:

✓ t_68194cc6  done      morgan-infra      T4: Configure Cloudflare Pages deployment
✓ t_de6f63c2  done      sam-product-manager  Client logo strip: populate or remove
⊗ t_a6061031  blocked   maya-architect    [Phase 3] Integrate: Escalation to Capricorn daily brief
✓ t_2cb4951f  done      morgan-infra      [Phase 3] Polish: Generic skill packaging
▫ t_7e51f3e0  todo      lauren-qa         T7 — QA Pass
✓ t_25c0e7ca  done      lauren-qa         T3: Validate codegrit.dev site against mockup spec
✓ t_2d9ce33c  done      maya-architect    Bug: Mobile horizontal scroll on codegrit.dev
✓ t_8551bce3  done      morgan-infra      Set up GitHub Actions → Cloudflare Pages deploy

The symbols tell the story. ✓ means shipped. ⊗ means blocked waiting for a human decision. ▫ means waiting for its parents to finish. The dispatcher wakes up every sixty seconds, checks for ready tasks, and claims the next one for the assigned profile.
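The poll-and-claim step might look roughly like this. Treat it as a sketch, not the dispatcher's actual code: the `tasks` table is a hypothetical minimal schema, and `claim_next_ready`, `run_dispatcher`, and the `spawn` callback are names I made up for illustration. Only the sixty-second interval comes from the real setup.

```python
import sqlite3
import time
from typing import Callable, Optional

POLL_SECONDS = 60  # interval stated in the post

# Hypothetical minimal table; the real kanban.db schema may differ.
TASKS_DDL = """
CREATE TABLE IF NOT EXISTS tasks (
    id       TEXT PRIMARY KEY,
    assignee TEXT NOT NULL,
    status   TEXT NOT NULL
)
"""


def claim_next_ready(conn: sqlite3.Connection) -> Optional[tuple[str, str]]:
    """Claim the oldest 'ready' task; return (task_id, assignee) or None."""
    row = conn.execute(
        "SELECT id, assignee FROM tasks WHERE status = 'ready' "
        "ORDER BY rowid LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    task_id, assignee = row
    # Guarded update: it only succeeds if the task is still 'ready', so two
    # dispatchers racing on the same row cannot both claim it.
    cur = conn.execute(
        "UPDATE tasks SET status = 'running' WHERE id = ? AND status = 'ready'",
        (task_id,),
    )
    conn.commit()
    return (task_id, assignee) if cur.rowcount == 1 else None


def run_dispatcher(conn: sqlite3.Connection,
                   spawn: Callable[[str, str], None]) -> None:
    """Poll forever: claim at most one task per tick and hand it to spawn."""
    while True:
        claimed = claim_next_ready(conn)
        if claimed is not None:
            spawn(*claimed)
        time.sleep(POLL_SECONDS)
```

The guarded `UPDATE` is the part worth copying: the status check in the `WHERE` clause turns the claim into a compare-and-set, which matters as soon as more than one process can touch the board.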

A worker can call kanban_block(reason="...") to pause for human input. When the operator unblocks the task, the dispatcher respawns the profile with the full comment thread as context. A worker can call kanban_complete(summary="...", metadata={...}) to hand off structured results for downstream parsing.
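The block, unblock, and complete moves form a small state machine. Here is a sketch of it as a plain Python class. The `Task` class and its method names are hypothetical stand-ins, not the signatures of kanban_block or kanban_complete; the transition table follows the lifecycle described above (todo, ready, running, done, with blocked looping back to ready).

```python
import json
from dataclasses import dataclass, field
from typing import Optional

# Allowed moves between statuses, following the lifecycle described above.
TRANSITIONS = {
    "todo": {"ready"},
    "ready": {"running"},
    "running": {"blocked", "done"},
    "blocked": {"ready"},
    "done": set(),
}


@dataclass
class Task:
    """Hypothetical in-memory stand-in for one kanban card."""
    id: str
    assignee: str
    status: str = "todo"
    comments: list[str] = field(default_factory=list)
    result: Optional[str] = None  # JSON-encoded handoff for downstream tasks

    def move(self, new_status: str) -> None:
        """Change status, rejecting moves the lifecycle does not allow."""
        if new_status not in TRANSITIONS[self.status]:
            raise ValueError(f"{self.id}: {self.status} -> {new_status} not allowed")
        self.status = new_status

    def block(self, reason: str) -> None:
        """Worker pauses for human input and records why."""
        self.move("blocked")
        self.comments.append(f"blocked: {reason}")

    def unblock(self, reply: str) -> None:
        """Operator answers; the task is queued again with the thread intact."""
        self.move("ready")
        self.comments.append(f"operator: {reply}")

    def complete(self, summary: str, metadata: dict) -> None:
        """Worker finishes and leaves a structured, parseable result."""
        self.move("done")
        self.result = json.dumps(
            {"summary": summary, "metadata": metadata}, sort_keys=True
        )
```

Storing the result as JSON rather than free text is the design choice that makes "downstream parsing" possible: the next worker can read fields out of `metadata` instead of guessing at a prose summary.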

The pipeline in practice: migrating codegrit.dev to Hugo

The best way to understand this setup is to walk through a real task. We recently decided to migrate codegrit.dev from static HTML to Hugo. Here’s how the crew handled it.

Sam decomposed the work into three child tasks:

  • Maya (templates + scaffold, t_654d3f8e)
  • Morgan (Cloudflare Pages build config, t_bdf68063)
  • Lauren (pixel-perfect QA, t_879c4204 — gated on Maya + Morgan via parents=[...])

Maya and Morgan ran in parallel. Maya built 14 partials, 5 shortcodes, and a full Hugo Pipes CSS pipeline, then blocked for visual review with a handoff summary. Morgan produced a Cloudflare Pages setup doc with per-branch build overrides so main keeps serving static HTML while the preview branch runs Hugo builds.

Lauren’s task sits in todo waiting for both Maya’s review to clear and Morgan’s config to land. Once both parents complete, Lauren auto-promotes to ready and runs pixel-perfect diffs across every page and viewport.
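The ordering the board enforces here is a topological sort over the parent links. As an illustration, Python's standard-library `graphlib` reproduces the same waves from the three task IDs above. The `execution_waves` helper is mine and is not how the dispatcher is implemented; it only shows why Maya and Morgan can run together while Lauren waits.

```python
from graphlib import TopologicalSorter

# Task ids and owners are from the post; each task maps to its parents.
PARENTS = {
    "t_654d3f8e": set(),                         # Maya: templates + scaffold
    "t_bdf68063": set(),                         # Morgan: Pages build config
    "t_879c4204": {"t_654d3f8e", "t_bdf68063"},  # Lauren: pixel-perfect QA
}


def execution_waves(parents: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into waves; everything in one wave can run in parallel."""
    sorter = TopologicalSorter(parents)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        wave = sorted(sorter.get_ready())
        waves.append(wave)
        sorter.done(*wave)  # mark the whole wave finished to release children
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(execution_waves(PARENTS), start=1):
        print(f"wave {number}: {', '.join(wave)}")
    # wave 1: t_654d3f8e, t_bdf68063
    # wave 2: t_879c4204
```

In the real pipeline a wave does not finish all at once, and Maya's task can sit in blocked for a while, but the dependency shape is the same.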

That’s the full pipeline: decompose → parallel implement + infra → review gate → QA → deploy. Each specialist owns their lane. The board enforces the order.

When multi-agent makes sense — and when it doesn’t

This setup is not magic, and it’s not for every situation. Here are the real trade-offs.

Blocked tasks accumulate. Right now we have tasks waiting for human review, API credentials that don’t exist locally, and decisions that haven’t been made yet. The board surfaces this honestly in a way a single session never would. Blocking isn’t failure — it’s visibility into where the pipeline actually slows down.

Profile names must exist. The dispatcher silently ignores tasks assigned to profiles that don’t exist, and it raises no error when it does. A card for a researcher profile on a machine that only has docker-worker sits in ready forever. The fix is simple: start every project with hermes profile list and decompose against the profiles that actually exist, not the ones you wish existed.
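That pre-flight check is easy to script. The sketch below parses the text of a profile listing in the shape shown earlier in this post and flags any planned assignee that is not in it. Both helper names are hypothetical, and the parser assumes the column layout printed above; if the listing format changes, the parsing needs to change with it.

```python
def parse_profile_names(listing: str) -> set[str]:
    """Extract profile names from listing text shaped like the one above."""
    names = set()
    for raw in listing.splitlines():
        # Drop the active-profile marker and surrounding whitespace.
        line = raw.strip().lstrip("◆").strip()
        # Skip blanks, the shell prompt line, the header row, and the rule.
        if not line or line.startswith(("$", "Profile", "─")):
            continue
        names.add(line.split()[0])
    return names


def unknown_assignees(planned: dict[str, str], profiles: set[str]) -> dict[str, str]:
    """Return the {task: assignee} pairs whose assignee has no profile."""
    return {task: who for task, who in planned.items() if who not in profiles}


if __name__ == "__main__":
    listing = """
    Profile               Model                  Gateway      Alias
    ────────────────────────────────────────────────────────────────
    default              deepseek-v4-flash      running      —
    ◆sam-product-manager kimi-k2.6              stopped      —
    """
    plan = {"write spec": "sam-product-manager", "dig up prior art": "researcher"}
    print(unknown_assignees(plan, parse_profile_names(listing)))
    # {'dig up prior art': 'researcher'}
```

Anything this returns is a card that would sit in ready with no worker to pick it up.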

Review loops cost time. When Maya blocks her Hugo scaffold for visual review, the pipeline pauses until a human responds. A single agent working alone would have pushed and moved on. The multi-agent setup trades speed for safety — every review gate is a conscious slowdown.

Gateway is a single point. Only our default profile’s gateway runs. Giving each crew member their own Discord presence would mean separate bot tokens and process management. We haven’t crossed that bridge yet. The crew communicates through the board, not through chat.

These trade-offs are worth it when at least two of these are true:

  1. Multiple specialists are needed. Research + architecture + QA + deploy is four different mindsets. One agent can do all of them, but it context-switches constantly and loses the thread.
  2. The work should survive a crash. Kanban tasks persist in SQLite. If a worker gets SIGKILL’d mid-run, the task is reclaimed and re-queued. A single terminal session doesn’t have that durability.
  3. Review gates matter. If you’re shipping to production and want a second pair of eyes — even synthetic ones — the board enforces that.
  4. Parallel workstreams exist. Fan-out for speed. Our Hugo migration had Maya and Morgan building in parallel while Lauren’s QA waited for both.
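Point 2 deserves a concrete picture. One common way to get that durability is a heartbeat: a worker stamps its task periodically, and anything still marked running with a stale stamp goes back to the queue. I am not claiming this is how Hermes detects dead workers. The `heartbeat_at` column, the timeout, and the `requeue_stale` name are all assumptions made for this sketch.

```python
import sqlite3

STALE_AFTER_SECONDS = 300  # hypothetical timeout, not a Hermes setting

# Hypothetical table with a heartbeat column; the real schema may differ.
TASKS_DDL = """
CREATE TABLE IF NOT EXISTS tasks (
    id           TEXT PRIMARY KEY,
    assignee     TEXT NOT NULL,
    status       TEXT NOT NULL,
    heartbeat_at REAL
)
"""


def requeue_stale(conn: sqlite3.Connection, now: float,
                  stale_after: float = STALE_AFTER_SECONDS) -> int:
    """Return 'running' tasks with an old heartbeat to 'ready'.

    `now` is a Unix timestamp passed in by the caller, which keeps the
    function deterministic. Returns the number of tasks re-queued.
    """
    cur = conn.execute(
        "UPDATE tasks SET status = 'ready', heartbeat_at = NULL "
        "WHERE status = 'running' AND heartbeat_at < ?",
        (now - stale_after,),
    )
    conn.commit()
    return cur.rowcount
```

Because the state lives in SQLite rather than in a process, the recovery logic can run from anywhere, including a fresh dispatcher started after a reboot.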

It’s overkill when:

  • The task takes under thirty minutes and touches one file.
  • You’re prototyping, not shipping.
  • Every profile loads the same tools and thinks the same way. Without distinct personas, you’re just adding process for no gain.

Bottom Line

Hermes profiles give you isolated agents with distinct personas, skills, and memory. The kanban board gives you a dispatch system with dependency chains, review gates, and an audit trail. Together they turn a single terminal agent into a crew that can research, implement, review, and deploy in parallel — but only if you actually need a crew. Start with one agent. Add profiles when you find yourself wishing someone else could review your work while you build the next thing. The board is free. The SQLite database lives on your machine. The overhead is real, but so is the throughput.

The crew ships

The Hugo migration is still in flight. Maya’s templates are on the hugo-migration branch waiting for review. Morgan’s Cloudflare Pages config is documented and ready. Lauren’s pixel-perfect validation task is queued, gated on the other two. Sam is already planning the next decomposition.

None of this requires a cloud service. None of it requires a credit card beyond the API tokens you’re already using for inference. It’s just profiles, personas, and a local SQLite board that knows who should work on what next.

If you’re running Hermes Agent and curious about multi-agent, the simplest next step is: create a second profile, give it a SOUL.md that describes a different role, and try decomposing one real task across both. You’ll know within a week whether the overhead is worth it.

Continue to Part 2: Surviving the Review Loop — how our kanban crew handles the reality of 7 review rounds, the decay curve, and why a clean pass at round 7 isn’t failure.


Sources