Multi-Agent AI Coding Workflow: Git Worktrees That Scale

Running two AI agents against the same repository is not doubling your output — it’s scheduling a race condition. Both agents read the same file state, make independent decisions, and write back to the same paths. The second write wins. The first agent’s work disappears without an error, without a warning, gone.

This is the infrastructure problem that most multi-agent AI coding workflow tutorials skip entirely. They teach you which tool to use; they don’t teach you why parallel agents silently destroy each other, or how git worktrees provide the isolation primitive that makes parallel work safe. This post covers both — plus the coordination contract, decomposition discipline, and merge strategy your team needs to run a repeatable multi-agent setup that holds up on a production codebase.

Why Parallel AI Agents Silently Destroy Each Other

The failure modes from shared working directories compound quietly:

  • Silent overwrites: Agent B writes `src/auth/token.ts` after Agent A already rewrote it. One version survives. You may not notice until tests fail in a way that’s impossible to bisect.
  • Divergent git history: Both agents commit. One rebases cleanly; the other produces conflicts referencing code neither human wrote.
  • State inconsistency: Agent A refactors a function signature. Agent B, still working from the original state, calls the old signature in new code. The error only surfaces at merge time — far from its cause.

A METR randomized controlled trial (July 2025) found that experienced open-source developers using AI tools took 19% longer to complete tasks than without them — largely due to the overhead of reviewing and debugging AI-generated code that interacted unexpectedly. Parallelism amplifies this overhead when isolation is absent.

The fix is not more careful prompting. It’s structural.

Git Worktrees 101: The Isolation Primitive That Makes Parallel Agents Safe

A git worktree is a separate checked-out copy of your repository that shares the same `.git` directory. Each worktree has its own working directory and its own branch, but they all read from the same object store. Changes in one worktree are completely invisible to another until you explicitly merge them.

This is the isolation primitive that makes parallel agents safe.

```bash
# Create a new worktree on a feature branch
git worktree add ../my-repo-feature-auth feature/auth-refactor

# List all active worktrees
git worktree list

# Remove a worktree after the branch is merged
git worktree remove ../my-repo-feature-auth
```

The binding rule — the one that prevents every common multi-agent failure mode — is:

One task → one branch → one worktree → one agent.

Deviate from this and you’re back to race conditions. Follow it and you get true file-level isolation with zero networking overhead — no Docker, no VM provisioning. The isolation is at the git layer, which means it works identically whether your agent is Claude Code, Cursor, Aider, Codex CLI, or any other CLI-based tool.

One real operational cost to know upfront: each worktree is a full checkout of your working tree on disk, plus whatever dependencies and build artifacts accumulate inside it. In one 20-minute session on a ~2 GB codebase, automatic worktree creation consumed 9.82 GB of disk space, according to reports from the Cursor community forum. Plan your disk budget before you spin up six agents.
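Before adding another agent, it helps to make that cost visible. A minimal audit, assuming GNU or BSD `du` and git 2.7+ (for `worktree list --porcelain`):

```bash
# List each active worktree with its current disk footprint.
git worktree list --porcelain | awk '/^worktree /{print $2}' |
  while read -r path; do du -sh "$path"; done
```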

Step-by-Step: Setting Up a Multi-Agent Worktree Environment

This setup is tool-agnostic. The agent doesn’t know or care that it’s running in a worktree — it sees a normal repository checkout.

Script the worktree lifecycle

Don’t create worktrees manually each time. Commit a small shell script to your repo so every engineer runs the same lifecycle:

```bash
#!/usr/bin/env bash
# scripts/agent-worktree.sh
# Usage: ./scripts/agent-worktree.sh <branch-name>
set -e

BRANCH="${1:?usage: $0 <branch-name>}"
WORKTREE_PATH="../$(basename "$PWD")-${BRANCH}"

git fetch origin
git worktree add -b "$BRANCH" "$WORKTREE_PATH" origin/main

echo "Worktree ready at: $WORKTREE_PATH"
echo "Next: cd $WORKTREE_PATH && launch your agent"
```

Launch your agent inside the worktree

Navigate to the worktree directory and launch your agent from there. It sees a clean, isolated checkout:

```bash
cd ../my-repo-feature-auth
claude   # or: cursor, aider, codex — your choice
```

Automate cleanup after merge

```bash
#!/usr/bin/env bash
# scripts/cleanup-worktree.sh
# Usage: ./scripts/cleanup-worktree.sh <branch-name>
set -e

BRANCH="${1:?usage: $0 <branch-name>}"
WORKTREE_PATH="../$(basename "$PWD")-${BRANCH}"

git worktree remove "$WORKTREE_PATH"
git branch -d "$BRANCH"

echo "Worktree and branch removed."
```

Keep a manifest

Maintain a simple `WORKTREES.md` tracking active agent sessions:

| Worktree | Branch | Agent | Task | Status |
|---|---|---|---|---|
| `../repo-feature-auth` | `feature/auth-refactor` | Claude Code | Auth token refresh | In progress |
| `../repo-fix-api-timeout` | `fix/api-timeout` | Aider | Fix 30s timeout | In review |

This prevents two engineers from accidentally launching agents against the same scope — a coordination failure the tooling will not catch for you.
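A few lines of shell can enforce that check before launch. This is a sketch, assuming the manifest table format above; `manifest_has_branch` is a hypothetical helper, not part of any tool:

```bash
# Refuse to launch an agent if the branch is already claimed in the manifest.
# Matches the backtick-wrapped branch cell in the WORKTREES.md table.
manifest_has_branch() {
  grep -q "\`$1\`" WORKTREES.md 2>/dev/null
}

if manifest_has_branch "feature/auth-refactor"; then
  echo "Refusing to launch: feature/auth-refactor already has an active session." >&2
fi
```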

Writing an AGENTS.md That Coordinates Agent Behavior

AGENTS.md is not a README. It is a coordination contract — a file that every modern agent reads on startup, declaring what it owns, what it must never touch, and how this codebase operates.

More than 60,000 open-source projects now use AGENTS.md, stewarded by the Agentic AI Foundation under the Linux Foundation. An analysis of over 2,500 repositories found that the single most effective constraint was explicit and simple: `"Never commit secrets"`. Prohibition, not convention, is what agents respond to reliably.

A minimal AGENTS.md that works covers four areas:

File ownership and prohibited zones:

```markdown
## File Ownership

- `db/migrations/` — NEVER modify. Migrations are human-authored only.
- `vendor/` — NEVER modify.
- `.env`, `.env.*` — NEVER read or write. Never commit secrets.
- `src/payments/` — Do not modify unless your task explicitly targets this module.
```

Build and test commands:

```markdown
## Build & Test

- Install: `npm ci`
- Build: `npm run build`
- Test: `npm test`
- Lint: `npm run lint`

Always run lint and tests before marking a task complete.
```

Code conventions (naming, patterns, error handling, export style).

Agent coordination rules (one branch per task, no new dependencies without documenting them, open a draft PR on start).
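One way to keep that contract honest is a pre-session lint. The sketch below assumes the four section names match your AGENTS.md headings; adjust them to your own file:

```bash
# Warn when AGENTS.md is missing one of the four areas above.
# Section names here are assumptions, not a standard.
for section in "File Ownership" "Build & Test" "Code conventions" "Agent coordination"; do
  grep -qi "$section" AGENTS.md 2>/dev/null || echo "AGENTS.md missing: $section"
done
```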

Peer-reviewed research presented at ICSE 2026 confirmed this: incorporating architectural documentation into agent context produces measurable gains in functional correctness, architectural conformance, and code modularity. AGENTS.md is how you deliver that context at scale, consistently, across every agent session.

Spec-Driven Task Decomposition: Breaking Work Into Agent-Safe Units

Worktree isolation prevents file-level collisions. But if two agents are both tasked with “improve the checkout flow,” they’ll still conflict — because neither task was scoped to be independent.

Spec-driven task decomposition is the prerequisite that determines whether your parallel agents will work in parallel, or appear to while creating future merge problems. Four spec-driven development tools have collectively accumulated 137,000+ GitHub stars by early 2026, indicating rapid ecosystem convergence around this pattern. Write a detailed spec first; then decompose it into bounded agent tasks.

The task independence test

Before assigning work to a parallel agent, apply three checks:

  1. File exclusivity: Does this task write to any file that another concurrent task also writes to? If yes, the tasks are not independent.
  2. Interface stability: Does this task change a function signature, API contract, or data schema that another task depends on? If yes, serialize them — don’t parallelize.
  3. Bounded scope: Can you state, in one sentence, exactly which directories and files this task modifies? If not, the task is too large.
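Check 1 is mechanical enough to automate. A sketch, assuming each task's scope is written to a file with one path per line (a convention invented here, not a standard):

```bash
# Print any path that appears in both tasks' scope files.
# Non-empty output means the tasks fail the file-exclusivity check.
scope_overlap() {
  comm -12 <(sort -u "$1") <(sort -u "$2")
}
# Usage: [ -z "$(scope_overlap task-a.scope task-b.scope)" ] || echo "NOT independent"
```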

A decomposition template

```
Task: [One sentence description]
Branch: feature/<task-name>
Files in scope: [explicit list]
Files out of scope: [adjacent files the agent must not touch]
Input contract: [what must be stable before this task starts]
Output contract: [what this task must produce]
Success criteria: [how to verify completion]
```

Tasks that pass the independence test can go to parallel agents with confidence. Tasks that fail should be serialized or decomposed further.

Merge Strategies: Getting Clean, Reviewable PRs Out of Parallel Agent Branches

Parallelism creates a merge bottleneck if you don’t plan for it. Five agents finishing simultaneously doesn’t help if review takes five times as long.

Open draft PRs immediately

Have each agent open a draft PR as soon as it starts. This makes in-progress work visible and lets you spot scope overlap before it becomes a conflict.

Choose your review gate

Two patterns work well:

Lead-agent merge pattern: One orchestrator agent reviews and merges the output of the others, resolving conflicts with context about all branches, then merging to main in dependency order. Faster. Works when decomposition was clean.

Human review gate: Each agent PR goes to a human reviewer before merge. More overhead, higher confidence. Required for production-critical or domain-logic changes.

Neither is universally correct. Let the risk profile of the change drive the choice.

Use visual diffs to catch AI drift

Agents over-refactor. A task scoped to “fix the timeout bug” may come back restructuring error handling across four files. Visual diffs — PR diff views with file-by-file navigation — let you catch this without reading every line. Treat any PR touching files outside its declared scope as an automatic review escalation.
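That escalation rule can be partly automated. The sketch below assumes a hypothetical `task.scope` file listing the declared path prefixes, one per line:

```bash
# List changed files that don't start with any declared scope prefix.
# Reads changed filenames from stdin; scope prefixes come from the file in $1.
scope_violations() {
  while read -r file; do
    ok=false
    while read -r prefix; do
      case "$file" in "$prefix"*) ok=true ;; esac
    done < "$1"
    $ok || echo "$file"
  done
}
# Example: git diff --name-only origin/main...HEAD | scope_violations task.scope
```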

The Limits You Need to Know (Disk, Ports, Databases, and Ceilings)

Worktrees isolate the filesystem. They do not isolate everything:

  • Shared ports: If Agent A’s integration tests spin up on port 3000, Agent B’s tests will collide. Use dynamic port assignment or per-worktree port offsets.
  • Shared databases: Agents sharing a local database produce interleaved test data. Use separate database names per worktree, or a test schema factory.
  • Docker daemon: Multiple agents building images share the same daemon. Usually fine, but problematic if agents are modifying Dockerfiles in parallel.
  • Package caches: `npm ci` benefits from a shared cache; `node_modules` inside each worktree is isolated by default.
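For the ports problem, a stable per-worktree offset can be derived from the worktree's directory name, so each agent's test server gets its own port without coordination. A sketch (the base port and offset range are arbitrary choices):

```bash
# Derive a deterministic port from the current worktree directory name.
worktree_port() {
  local base=3000 hash
  hash=$(basename "$PWD" | cksum | awk '{print $1}')
  echo $((base + hash % 1000))   # stable value in 3000-3999
}

PORT=$(worktree_port)   # export before running integration tests
```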

The practical agent ceiling

The productive ceiling for parallel agents is 5–7 concurrent on a modern laptop before rate limits, disk consumption, and merge review overhead cancel out the throughput gains.

The math is simple: if each worktree consumes ~5 GB on a 2 GB codebase, six agents consume 30+ GB. Add API rate limits and the review time for six simultaneous PRs, and you’ve rebuilt the bottleneck at a different layer. Start with two or three parallel agents. Measure. Scale when you have evidence.
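Made explicit as a quick pre-flight calculation (the ~5 GB figure is the estimate from above, not a measurement of your repo):

```bash
# Rough disk budget for a planned agent count.
agents=6
per_worktree_gb=5
echo "Estimated worktree disk: $((agents * per_worktree_gb)) GB"
```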

Zooming out: 51% of professional developers now use AI tools daily, and 84% are using or planning to use them — yet only 17% of developers using AI agents agree they’ve improved team collaboration (Stack Overflow Developer Survey 2025). That gap exists because most teams adopted the tools without the infrastructure layer. That’s what this workflow addresses.

Putting It All Together: Multi-Agent AI Coding Workflow Checklist

Here’s the full multi-agent AI coding workflow distilled into a checklist your team can run every session:

Before you start:

  • [ ] Write or update the spec. Decompose into independent tasks using the task independence test.
  • [ ] Verify AGENTS.md is current — correct build commands, accurate prohibited zones, up-to-date conventions.
  • [ ] Check disk space. Budget ~5 GB per worktree beyond base repo size.

For each task:

  • [ ] Run `scripts/agent-worktree.sh <branch-name>` to create the isolated environment.
  • [ ] Launch the agent inside the worktree directory.
  • [ ] Agent opens a draft PR immediately.
  • [ ] Update the WORKTREES.md manifest.

During agent work:

  • [ ] Monitor for agents touching files outside their declared scope.
  • [ ] Verify runtime isolation: isolated ports and database schemas.

At merge time:

  • [ ] Review PR diffs visually. Flag out-of-scope file changes for escalation.
  • [ ] Apply lead-agent or human review gate based on change risk.
  • [ ] Merge in dependency order (shared abstractions before consumers).
  • [ ] Run `scripts/cleanup-worktree.sh <branch-name>` after successful merge.

Conclusion

The multi-agent AI coding workflow that scales isn’t about running more agents — it’s about the infrastructure layer that keeps them from overwriting each other’s work. Git worktrees provide the isolation primitive. AGENTS.md provides the coordination contract. Spec-driven decomposition ensures your tasks are genuinely independent before you treat them as parallel. A disciplined merge strategy converts parallel agent output into reviewable, trustworthy PRs.

When these four components work together, parallelism delivers what it promises. When any one is missing, you get harder-to-debug failures than single-agent work would have produced.

Start small: one spec, two worktrees, two agents, a tight AGENTS.md. Measure the review overhead. Then scale — with evidence, not optimism.

If you’re rolling this out for your team, commit the lifecycle scripts and AGENTS.md template from this post to your repo today and share the WORKTREES.md manifest as a shared coordination artifact. The setup takes under an hour. The merge conflicts it prevents will save considerably more.
