spec-agent is the contract that won't let it. A deterministic gate, project rules it actually applies, and durable learning — for Claude Code, Copilot, Cursor & Codex. It doesn't replace your agent. It tells it no.

    npx @marcusbarcelos/spec-agent init --id my-project --agents claude

A deterministic checkpoint between your agent finishing and the code landing, not a prompt. Watch it block a real bug, hand it back, and pass only once it's fixed.

    - const slug = name.trim().replace(/\s+/g, "-") // "a  b" → "a-b"  ✗ collapses runs
    + const slug = name.trim().replace(/\s/g, "-")  // "a  b" → "a--b" ✓

SPEC-AGENT VERDICT: PASSED — done means verified

Without spec-agent, that first diff ships — the agent thought it was done. Run it yourself in 30s: examples/idempotency-demo.
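
The difference between the two regexes in the diff only shows up when the input contains a run of whitespace. The sketch below assumes an input with two consecutive spaces; the function names `slugCollapsing` and `slugPerChar` are hypothetical and exist only to compare the two versions side by side.

```javascript
// Hypothetical helper names; only the two regexes come from the diff above.

// Blocked version: \s+ matches a whole run of whitespace and emits one dash.
function slugCollapsing(name) {
  return name.trim().replace(/\s+/g, "-");
}

// Passing version: \s matches one whitespace character at a time,
// so every whitespace character becomes its own dash.
function slugPerChar(name) {
  return name.trim().replace(/\s/g, "-");
}

const input = "a  b"; // assumption: two spaces between "a" and "b"
console.log(slugCollapsing(input)); // a-b
console.log(slugPerChar(input));    // a--b
```

With a single space between the words both versions return the same slug, which is why a test that exercises a run of spaces is what separates them.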
Why a gate beats a better prompt →

AI agents are fast. They are also confidently wrong.

An agent without a gate is an overconfident junior.

When the base model is already capable, prompting alone barely moves the needle. The leverage is governance — block work that's objectively wrong, and teach the model your invariants so it applies them instead of re-deriving them wrong.
spec-agent is the quality contract between your repository and your coding agent. The automatic tech lead that says "No. This doesn't pass."
A deterministic Stop hook that blocks "done" while lint, typecheck, or tests fail on the files that were touched. It catches the error the model can't see in itself.
Turns a recurring mistake into an imperative, reusable rule the model actually applies — where, without it, it knows the rule and breaks it anyway.
Token discipline in prompt, tool input and output. A code graph instead of re-reading files.
Multi-perspective review reserved for high-risk, ambiguous calls — DB migrations, contract breaks, security, architecture. Not for everyday turns.
spec-agent verify runs your gate and prints a verdict that reads like a code review — green in CI, blocking on a PR, non-zero exit when blocked.

    # .github/workflows/spec-agent.yml
    - run: npx @marcusbarcelos/spec-agent verify

    SPEC-AGENT VERDICT: BLOCKED
    ✗ domain contract    node --test
      idempotency invariant failed: duplicate ledger entry for sale-1
    Blocked: 1 check(s) failed. Fix and re-run.
    Run summary: 1 check · 1 blocked · 0 passed · 127ms

The checks live in .spec/manifest.yaml — your tests, lint, typecheck, any command. This is where spec-agent stops being an assistant and becomes the guardian of the repo.
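
The general shape of such a gate (run each configured command, block if any exits non-zero, derive an exit code from the verdict) can be sketched in a few lines of Node. This is an illustration under stated assumptions, not spec-agent's implementation: the `checks` array, its field names, and `runGate` are hypothetical, and the real schema of `.spec/manifest.yaml` is not shown in this document. The two stand-in commands simply exit 0 and 1 so the sketch runs anywhere Node is installed.

```javascript
const { spawnSync } = require("node:child_process");

// Hypothetical check list. In a real project these would be your lint,
// typecheck, and test commands; the manifest format is NOT reproduced here.
const checks = [
  { name: "always-passes", command: process.execPath, args: ["-e", "process.exit(0)"] },
  { name: "always-fails", command: process.execPath, args: ["-e", "process.exit(1)"] },
];

// Run every check synchronously and collect pass/fail from the exit status.
function runGate(checkList) {
  const results = checkList.map((check) => {
    const proc = spawnSync(check.command, check.args, { encoding: "utf8" });
    return { name: check.name, passed: proc.status === 0 };
  });
  const blocked = results.filter((r) => !r.passed);
  return { verdict: blocked.length === 0 ? "PASSED" : "BLOCKED", results, blocked };
}

const report = runGate(checks);
console.log(`VERDICT: ${report.verdict}`);
for (const r of report.results) {
  console.log(`${r.passed ? "✓" : "✗"} ${r.name}`);
}

// A real gate would hand this to process.exit() so CI fails when blocked.
const exitCode = report.verdict === "PASSED" ? 0 : 1;
console.log(`exit code would be ${exitCode}`);
```

The point of the design is that the verdict depends only on exit statuses, so the same inputs always produce the same result regardless of what the agent believes about its own work.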
Runnable demo: examples/idempotency-demo — a commission ledger that must be idempotent by sale_id.
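
For readers unfamiliar with the invariant, here is a minimal sketch of what "idempotent by sale_id" means. It is not the code from the demo: `CommissionLedger`, its methods, and the amounts are hypothetical, chosen only to show that recording the same sale twice leaves a single entry.

```javascript
// Hypothetical in-memory ledger; names and amounts are illustrative only.
class CommissionLedger {
  constructor() {
    this.entries = new Map(); // keyed by sale id, so a sale can appear at most once
  }

  // Record a commission. A repeated call with the same saleId is a no-op
  // and returns the entry that already exists.
  record(saleId, amountCents) {
    if (!this.entries.has(saleId)) {
      this.entries.set(saleId, { saleId, amountCents });
    }
    return this.entries.get(saleId);
  }

  total() {
    let sum = 0;
    for (const entry of this.entries.values()) sum += entry.amountCents;
    return sum;
  }
}

const ledger = new CommissionLedger();
ledger.record("sale-1", 1500);
ledger.record("sale-1", 1500); // e.g. a retried request for the same sale

console.log(ledger.entries.size); // 1
console.log(ledger.total());      // 1500
```

An implementation that appended to an array instead would report two entries and a total of 3000 here, which is the kind of duplicate a domain check on the ledger is meant to catch.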
A small, tamper-isolated benchmark — the agent never sees the checkers. A method plus a first signal, not proof. Small N, stated openly.
| finding | signal |
|---|---|
| The gate is the win | Recovered an objective failure the model shipped — targeted tasks went 80% → 100% via the fix-loop. Prompt rules alone moved ~0 on a capable model. |
| Durable knowledge changes behavior | A learned project rule flipped a wrong answer to right — the model knew the rule and violated it without the skill. |
| Council's niche is calibration | On ambiguous-but-sound trade-offs it never false-blocked (0/4), where a single pass did. Real, but narrow. |
Full method & caveats: RESULTS · SKILLFORGE · COUNCIL. Small N — indicative, not statistical proof.
Will it work with my agent? →

Full harness on Claude Code; on other agents it runs degraded-but-functional, with the gaps written down in the manifest's loss_report.

| capability | Claude Code | other agents |
|---|---|---|
| verification gate | Stop hook | git pre-commit / CI |
| durable learning | native skills + memory | .spec/learning/ |
| multi-agent (council) | native subagents | single-thread simulation |
| code graph | graphify CLI | graphify CLI |
Agent-specific enhancers (claude-mem, superpowers, rtk, graphify) are optional — never dependencies.

    # scaffold .spec/ + adapters into the current repo
    npx @marcusbarcelos/spec-agent init --id my-project --agents claude,agents-md

    # re-project adapters when the engine evolves (never touches your durable state)
    npx @marcusbarcelos/spec-agent sync

    # run the gate yourself (CI / PR): verdict + non-zero exit if blocked
    npx @marcusbarcelos/spec-agent verify

Requires Node ≥ 20. The harness runs inside the coding agent you already use.