
Evidence-Bound Development Workflow

Agent-driven development with enforced quality gates, adversarial reviews, and telemetry as a first-class concern.


How to Start Working

Say “next task” — Claude handles the rest:

    You:    "next task"
    Claude: "Next task: [ARCH-2] Decompose execute_ask(). Confirm, or pick another?"
    You:    "confirm"
    Claude: → pulls main → creates branch → spawns researcher → presents brief
    You:    "looks good"
    Claude: → spawns coder → TDD implementation → verifier → skeptic → PR

That’s it. One prompt to start, one confirmation to approve the approach. Claude orchestrates all 6 subagents automatically.


Architecture: Agents + Hooks + Commands

Subagents (.claude/agents/) — Isolated, Model-Appropriate

    Main Conversation (opus): talks to you, picks tasks, spawns agents
    ├── researcher (sonnet)   ← "what should we build?"
    ├── coder (opus)          ← "build it with TDD"
    ├── eval-writer (sonnet)  ← "write the eval first" (LLM/retrieval only)
    ├── verifier (haiku)      ← "did it pass?"
    ├── skeptic (sonnet)      ← "is it safe?"
    └── doc-sync (haiku)      ← "are docs current?"

| Agent | Model | What it does | When |
| --- | --- | --- | --- |
| researcher | sonnet | Gathers context, patterns, risks. Returns structured brief. | Before any FR/NFR (NON-NEGOTIABLE) |
| coder | opus | Implements with TDD. Follows project invariants. Returns files + tests. | After research brief approved |
| eval-writer | sonnet | Writes failing eval before AI behavior changes. | When touching retrieval/evidence/policy/verification |
| verifier | haiku | Runs lint + types + tests + telemetry grep. Returns pass/fail table. | After implementation |
| skeptic | sonnet | Adversarial review: AI failures, data leakage, security, telemetry audit. | Before PR merge (NON-NEGOTIABLE) |
| doc-sync | haiku | Checks if docs are stale after code changes. | Before PR, parallel with skeptic |

Hooks (.claude/settings.json) — Automatic, Can’t Skip

| Hook | Trigger | Blocks? |
| --- | --- | --- |
| Branch protection | git push to main | Yes (use feature branches) |
| Pre-commit gates | git commit | Yes (ruff + mypy + pytest must pass) |
| Pre-push adversarial scan | git push | Yes (agent scans diff for security issues) |
| DB safety | DELETE, alembic | Yes (requires explicit approval) |
| Post-edit lint | Edit/Write .py | No (informational) |
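
The branch-protection row can be pictured as a small decision function. The sketch below is illustrative only: `is_push_to_main` is a hypothetical helper, its matching rules are an assumption, and the way such a check would be wired into .claude/settings.json is not shown in this document.

```python
import shlex


def is_push_to_main(command: str, current_branch: str) -> bool:
    """Return True if a shell command looks like a git push targeting main.

    Hypothetical helper: the real hook's matching rules are not documented here.
    Flags that take a separate value (e.g. "-o value") are not handled.
    """
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: not something we can parse as a push.
        return False
    if len(tokens) < 2 or tokens[0] != "git" or tokens[1] != "push":
        return False
    # Positional arguments after "git push", ignoring flags such as --force.
    args = [t for t in tokens[2:] if not t.startswith("-")]
    if len(args) >= 2:
        # git push <remote> <refspec>...: check each refspec's destination.
        for refspec in args[1:]:
            dest = refspec.split(":")[-1].lstrip("+")
            if dest in ("main", "refs/heads/main"):
                return True
        return False
    # No explicit refspec: assume the current branch is what gets pushed.
    return current_branch == "main"
```

A hook built on this idea would block the command when the function returns True and let it through otherwise.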

Commands (.claude/commands/) — Lightweight Utilities

| Command | What it does |
| --- | --- |
| /wsstatus | Quick STATUS.md update |
| /wsmistake | Log a mistake to CLAUDE.md |

The Flow

    HOW A FEATURE GETS BUILT
    ─────────────────────────

    1. git checkout -b feat/TASK-ID-description

    ┌──────────────────────────────────┐
    │ researcher (sonnet)              │ ← returns structured brief
    │ • reads REQUIREMENTS.md          │
    │ • finds patterns in codebase     │
    │ • identifies risks + invariants  │
    └──────────────────────────────────┘

    You approve the approach

    ┌──────────────────────────────────────────┐
    │ coder (opus)                             │
    │ • receives research brief                │
    │ • writes test first (RED)                │
    │ • implements (GREEN)                     │
    │ • spawns eval-writer if LLM code         │
    │ • returns files changed + test results   │
    └──────────────────────────────────────────┘

    ┌──────────────────────────────────┐
    │ verifier (haiku)                 │ ← lint + types + tests + telemetry check
    └──────────────────────────────────┘

    ┌──────────────────────────────────────────┐
    │ skeptic (sonnet) ← adversarial review    │
    │ doc-sync (haiku) ← doc drift check       │ IN PARALLEL
    └──────────────────────────────────────────┘

    git commit    ← [hook: ruff + mypy + pytest BLOCK on fail]
    git push      ← [hook: blocks main + adversarial scan]
    gh pr create  ← PR to main
    merge         ← auto-deploys API + Web + Docs
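
The gating behavior in the flow above, where a failed stage prevents later stages from running, can be sketched as follows. This is a hypothetical illustration: `run_gated` and the callables passed to it are stand-ins, not the actual agent definitions, and skeptic and doc-sync are listed sequentially here even though the workflow runs them in parallel.

```python
from typing import Callable

# Stage order taken from the flow above (parallel stages flattened for simplicity).
STAGE_ORDER = ["researcher", "coder", "verifier", "skeptic", "doc-sync"]


def run_gated(checks: dict[str, Callable[[], bool]]) -> list[tuple[str, str]]:
    """Run each stage in order and stop at the first one that fails."""
    results: list[tuple[str, str]] = []
    for stage in STAGE_ORDER:
        passed = checks[stage]()
        results.append((stage, "pass" if passed else "fail"))
        if not passed:
            break  # later stages never run once a gate fails
    return results


# Example: the verifier fails, so skeptic and doc-sync are never invoked.
example_checks = {stage: (lambda: True) for stage in STAGE_ORDER}
example_checks["verifier"] = lambda: False
example_results = run_gated(example_checks)
```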

Git Flow

main is protected. All changes go through feature branches + PRs.

    main (production, no direct push)
      PR required, skeptic review must pass
    feat/TASK-ID-description (where work happens)
      git checkout -b feat/ARCH-2-decompose-ask

Branch naming: feat/TASK-ID-desc, fix/TASK-ID-desc, chore/desc
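
A branch-name check for this convention might look like the sketch below. It assumes task IDs have the shape of the `ARCH-2` example (uppercase letters, a hyphen, a number) and that descriptions are lowercase kebab-case; neither rule is stated in this document, and `is_valid_branch` is a hypothetical helper.

```python
import re

# Assumption: descriptions are lowercase kebab-case, e.g. "decompose-ask".
_DESC = r"[a-z0-9]+(?:-[a-z0-9]+)*"
# Assumption: task IDs look like "ARCH-2". chore/ branches carry no task ID.
BRANCH_RE = re.compile(rf"(?:(?:feat|fix)/[A-Z]+-\d+-{_DESC}|chore/{_DESC})")


def is_valid_branch(name: str) -> bool:
    """Return True if the branch name follows feat/, fix/, or chore/ naming."""
    return BRANCH_RE.fullmatch(name) is not None
```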

Auto-deploy on merge to main:

  • API → Azure Container Apps (via GitHub Actions)
  • Frontend → Vercel
  • Docs → knowledge.bound.legal (Nextra on Vercel)

Telemetry Invariant

Telemetry is a first-class concern, enforced at 3 levels:

| Level | Agent/Hook | What it checks |
| --- | --- | --- |
| Implementation | coder | Every LLM call uses traced_llm_call(). Every request calls record_telemetry(). All @_observe have capture_input=False. |
| Verification | verifier | Greps for raw httpx.post/httpx.get calls that bypass telemetry wrapper. |
| Review | skeptic | Reviews for missing @_observe decorators, missing record_telemetry(), PII in logs. |

    Every LLM call → traced_llm_call() wrapper
    Every request  → record_telemetry() (including refusals)
    Every @observe → capture_input=False, capture_output=False (PII safety)
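
The verifier's grep for raw `httpx.post`/`httpx.get` calls could be approximated in Python as below. This is a minimal sketch, not the verifier's actual implementation: `find_raw_httpx_calls` and `scan_tree` are hypothetical names, and the comment stripping is deliberately naive.

```python
import re
from pathlib import Path

# Matches direct calls such as httpx.post(...) or httpx.get(...).
RAW_HTTPX = re.compile(r"\bhttpx\.(?:post|get)\s*\(")


def find_raw_httpx_calls(source: str, filename: str = "<string>") -> list[tuple[str, int, str]]:
    """Return (filename, line number, line) for each raw httpx call in source."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        # Naive comment stripping: also cuts at a '#' inside a string literal.
        code = line.split("#", 1)[0]
        if RAW_HTTPX.search(code):
            hits.append((filename, lineno, line.strip()))
    return hits


def scan_tree(root: Path) -> list[tuple[str, int, str]]:
    """Scan every .py file under root and collect raw httpx calls."""
    hits = []
    for path in sorted(root.rglob("*.py")):
        hits.extend(find_raw_httpx_calls(path.read_text(encoding="utf-8"), str(path)))
    return hits
```

In practice the module that implements the telemetry wrapper would presumably need to be excluded from the scan, since it is the one place where a direct HTTP call is expected.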

Published Documentation — knowledge.bound.legal

Source of truth: docs/*.md in the repo. Published via Nextra on Vercel.

    docs/*.md (edit these)
        ↓ sync-docs.sh (copies on build)
    apps/docs/content/*.mdx (gitignored, generated)
        ↓ Nextra v4
    knowledge.bound.legal (static site)

To update: Edit docs/, commit, push. Auto-deploys.

Local preview: cd apps/docs && npm run dev
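
The copy step in the pipeline above can be pictured with the Python sketch below. The real sync-docs.sh is a shell script whose contents are not shown here, so this only illustrates the described behavior (copying `docs/*.md` to `.mdx` files in the generated content directory); `sync_docs` is a hypothetical name.

```python
import shutil
from pathlib import Path


def sync_docs(src_dir: Path, dest_dir: Path) -> list[Path]:
    """Copy every *.md file in src_dir to dest_dir with an .mdx extension."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for md in sorted(src_dir.glob("*.md")):
        target = dest_dir / (md.stem + ".mdx")
        shutil.copyfile(md, target)  # content is copied unchanged
        written.append(target)
    return written
```

Called as `sync_docs(Path("docs"), Path("apps/docs/content"))`, it would regenerate the gitignored content directory from the source of truth.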


Key Files

| File | Purpose |
| --- | --- |
| STATUS.md | Current phase, Now/Next/Done |
| REQUIREMENTS.md | FRs/NFRs with acceptance criteria |
| CLAUDE.md | AI assistant rules + auto-trigger protocol |
| CHECKPOINT.md | Autonomous work log |
| .claude/agents/ | 6 subagent definitions |
| .claude/commands/ | 2 utility commands |
| .claude/settings.json | 5 enforced hooks |

Autonomous Work Mode

When the user says “work on this, I’ll check back”:

  1. Read STATUS.md → identify tasks
  2. For each task: researcher → coder → verifier → skeptic → doc-sync
  3. Log to CHECKPOINT.md after each task
  4. Stop conditions:
    • Test failures after 2 fix attempts
    • Need to modify policy.py or evidence.py
    • Architecture decision needed
    • Ambiguous requirement
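
The stop conditions above can be expressed as a single check that the autonomous loop consults after each step. The sketch below is an assumption about how that might be modeled: `TaskState` and `stop_reason` are hypothetical names, and only the four conditions and the two protected file names come from the list above.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional

PROTECTED_FILES = {"policy.py", "evidence.py"}
MAX_FIX_ATTEMPTS = 2


@dataclass
class TaskState:
    """Hypothetical snapshot of one task during autonomous work."""
    failed_fix_attempts: int = 0
    files_to_modify: tuple = ()
    needs_architecture_decision: bool = False
    requirement_ambiguous: bool = False


def stop_reason(state: TaskState) -> Optional[str]:
    """Return why autonomous work must stop, or None if it may continue."""
    if state.failed_fix_attempts >= MAX_FIX_ATTEMPTS:
        return "tests still failing after 2 fix attempts"
    # Compare by file name so "app/policy.py" also counts as protected.
    touched = sorted({PurePosixPath(f).name for f in state.files_to_modify} & PROTECTED_FILES)
    if touched:
        return "needs to modify protected file(s): " + ", ".join(touched)
    if state.needs_architecture_decision:
        return "architecture decision needed"
    if state.requirement_ambiguous:
        return "ambiguous requirement"
    return None
```

A non-None result would be the natural thing to record in CHECKPOINT.md before handing control back to the user.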

NON-NEGOTIABLE Rules

| Rule | Enforcement |
| --- | --- |
| researcher before implementation | CLAUDE.md auto-trigger |
| skeptic before PR merge | CLAUDE.md auto-trigger |
| Tests pass before commit | Pre-commit hook (blocks) |
| No direct push to main | Branch protection hook (blocks) |
| Telemetry on all LLM calls | verifier grep + skeptic audit |
| No PII in logs | skeptic audit |