ValidationForge · Rules · 9 Enforcement Rules

9 Rules.
Zero exceptions.

VF rules are installed to .claude/rules/ via /forge-install-rules and loaded automatically by Claude Code. They are not suggestions — they are enforcement constraints that hooks and agents check at every step.

ValidationForge Rules

install (bash)
# Install to project-level .claude/rules/
/forge-install-rules

# Install to global ~/.claude/rules/ (all projects)
/forge-install-rules --global

# Verify installation
ls .claude/rules/ | grep vf-
# vf-validation-discipline.md
# vf-evidence-management.md
# vf-platform-detection.md
# vf-execution-workflow.md
# vf-team-validation.md
# vf-benchmarking.md
# vf-forge-execution.md
# vf-forge-team-orchestration.md
# vf-consensus-engine.md
01
Validation Discipline
validation-discipline.md

The no-mock mandate. Never create test files, mocks, stubs, or test doubles. Always build and run the real system. Every PASS/FAIL verdict must cite specific evidence — screenshots, API responses, build logs — that a skeptical reviewer can open and read.

  • Build passing ≠ feature working
  • Type-check clean ≠ UI rendering correctly
  • No lint errors ≠ user journey functional
enforcement pattern
❌ /validate-sweep → verdict PASS (no evidence files)
✓ /validate-sweep → e2e-evidence/user-login/step-01.png (82 KB)
✓ /validate-sweep → e2e-evidence/user-login/step-02-cookie.txt (1 KB)
✓ /validate-sweep → verdict PASS (5/5 criteria, 9 files cited)
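The enforcement pattern above can be sketched as a small gate: a PASS verdict is rejected unless the journey directory holds at least one non-empty evidence file. This is an illustrative sketch, not VF's actual hook; `check_verdict` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Minimal sketch: refuse a PASS verdict when the journey's evidence
# directory is empty or contains only zero-byte files.
check_verdict() {
  local journey_dir="$1" valid
  # Count non-empty files only: zero-byte files are INVALID evidence.
  valid=$(find "$journey_dir" -type f -size +0c 2>/dev/null | wc -l | tr -d ' ')
  if [ "$valid" -eq 0 ]; then
    echo "FAIL: no valid evidence in $journey_dir"
    return 1
  fi
  echo "PASS: $valid evidence file(s) cited"
}
```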
02
Execution Workflow
execution-workflow.md

The 7-phase pipeline: RESEARCH → PLAN → PREFLIGHT → EXECUTE → ANALYZE → VERDICT → SHIP. Every validation run walks these seven phases in order. A build that succeeds at PREFLIGHT does not imply the feature works; that is established at EXECUTE, against the real running system, with evidence captured at every step.

  • RESEARCH: standards, applicable criteria (WCAG, HIG, security)
  • PREFLIGHT is a hard gate — failure halts the run
  • EXECUTE runs against the REAL system, not a mock
  • VERDICT cites evidence paths, not build output
pipeline trace
> /validate-sweep
[00] RESEARCH → WCAG 2.1 AA applicable
[01] PLAN → 3 journeys, 12 PASS criteria
[02] PREFLIGHT → build 2.1s ✓ :3000 reachable ✓
[03] EXECUTE → Playwright, 47 evidence files
[04] ANALYZE → 0 FAILs, skip
[05] VERDICT → 3/3 PASS (e2e-evidence/report.md)
[06] SHIP → production-readiness-audit ✓
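The hard-gate behavior in the trace above can be sketched as a strictly sequential runner: phases execute in order and the run halts at the first failure. The stub phase functions are placeholders for the real work (PREFLIGHT, for instance, would build and probe the dev server), not VF's implementation.

```shell
#!/usr/bin/env bash
# Minimal sketch of the hard gate: phases run strictly in order and
# the run halts at the first failure.
for p in RESEARCH PLAN PREFLIGHT EXECUTE ANALYZE VERDICT SHIP; do
  eval "phase_$p() { :; }"   # stub: passes by default; override to test
done

run_pipeline() {
  local i=0 phase
  for phase in RESEARCH PLAN PREFLIGHT EXECUTE ANALYZE VERDICT SHIP; do
    if ! "phase_$phase"; then
      printf '[%02d] %s FAILED: run halted\n' "$i" "$phase"
      return 1
    fi
    printf '[%02d] %s ok\n' "$i" "$phase"
    i=$((i + 1))
  done
}
```

Overriding `phase_PREFLIGHT` to fail shows the gate: EXECUTE and everything after it never runs.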
03
Evidence Management
evidence-management.md

Directory structure, naming convention, quality standards, and retention policy for all validation evidence. Evidence follows a structured lifecycle: captured at execution time, FAIL evidence preserved indefinitely, and passing-run evidence eligible for cleanup only after the configured retention window (default 30 days).

  • File names: step-{NN}-{action}-{result}.{ext}
  • Every journey dir needs evidence-inventory.txt with byte counts
  • 0-byte files are INVALID evidence — enforced by hook
  • Archive before purge: tar -czf before any cleanup
directory structure
e2e-evidence/
  user-login/
    step-01-navigate-login.png     78 KB  ✓
    step-02-fill-credentials.png   61 KB  ✓
    step-03-dashboard-rendered.png 91 KB  ✓
    step-04-session-cookie.txt      1 KB  ✓
    evidence-inventory.txt         612 B  ✓
    step-05-api-response.json       0 B   ✗ INVALID
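The inventory and zero-byte rules above can be sketched as one pass over a journey directory: write `evidence-inventory.txt` with byte counts, and fail if any file is zero bytes. `write_inventory` is illustrative, not a VF tool.

```shell
#!/usr/bin/env bash
# Minimal sketch: record byte counts per evidence file and flag
# zero-byte files as INVALID; a single INVALID entry fails the check.
write_inventory() {
  local dir="$1" f size status
  : > "$dir/evidence-inventory.txt"
  for f in "$dir"/step-*; do
    [ -f "$f" ] || continue
    size=$(wc -c < "$f" | tr -d ' ')
    status="OK"
    if [ "$size" -eq 0 ]; then status="INVALID"; fi
    printf '%s %s bytes %s\n' "$(basename "$f")" "$size" "$status" \
      >> "$dir/evidence-inventory.txt"
  done
  # A single zero-byte file invalidates the journey's evidence.
  ! grep -q INVALID "$dir/evidence-inventory.txt"
}
```

Pairing this with an archive step (`tar -czf` before any purge) would match the retention bullets above.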
04
Platform Detection
platform-detection.md

Detection priority order and platform-specific validation toolchain routing. VF detects iOS before React Native before Flutter before CLI before API before Web. Each platform maps to a specific start command, validation tool, and evidence method.

  • iOS: .xcodeproj, .xcworkspace, *.swift, Package.swift
  • Web: package.json + framework config (next.config, vite.config)
  • HIGH confidence required before starting sweep
  • LOW confidence → ask user to confirm platform
detection output
> /vf-setup
[detect] Next.js 15 (App Router) — HIGH confidence
indicators: next.config.ts, app/ dir, 'next' in package.json
platform: web
tool: playwright-validation
evidence: screenshots + DOM snapshots + network logs
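The priority order can be sketched as a first-match-wins chain of indicator checks. The indicator lists here are abbreviated (CLI and API detection are omitted), and `detect_platform` is a hypothetical helper, not VF's detector.

```shell
#!/usr/bin/env bash
# Minimal sketch: check platform indicators in priority order;
# the first match wins (iOS before React Native before Flutter
# before Web). No match means LOW confidence: ask the user.
detect_platform() {
  local root="$1"
  if compgen -G "$root/*.xcodeproj" >/dev/null || \
     compgen -G "$root/*.xcworkspace" >/dev/null; then
    echo ios; return
  fi
  if [ -f "$root/package.json" ] && grep -q react-native "$root/package.json"; then
    echo react-native; return
  fi
  if [ -f "$root/pubspec.yaml" ]; then echo flutter; return; fi
  if [ -f "$root/package.json" ]; then echo web; return; fi
  echo unknown   # LOW confidence: ask the user to confirm
}
```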
05
Team Validation
team-validation.md

Multi-agent validation coordination. Each validator owns its evidence directory exclusively. Validators execute in dependency-aware waves: DB → API → Web/iOS (parallel in Wave 3). A FAIL in any wave marks all dependent validators BLOCKED — not FAIL. Blocked validators are never spawned.

  • One validator per platform, max 5 per run
  • No validator writes to another validator's directory
  • Lead orchestrator does NOT write evidence
  • BLOCKED ≠ FAIL: blocked means upstream broke, never ran
wave execution
Wave 1: DB validator → PASS (schema clean)
Wave 2: API validator → PASS (after Wave 1 PASS)
Wave 3: Web validator → PASS ↳ parallel
Wave 3: iOS validator → PASS ↳ parallel
Wave 4: Verdict writer → synthesizes all evidence
Report: e2e-evidence/report.md
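The SPAWN-or-BLOCK decision in the wave trace can be sketched as a dependency check per validator: a validator spawns only when every upstream dependency PASSed, and is otherwise recorded as BLOCKED, never FAIL. This is a sketch (bash 4+ for associative arrays); the tables are illustrative state, not a VF API.

```shell
#!/usr/bin/env bash
# Minimal sketch: gate a validator on its upstream results.
declare -A DEPS=( [db]="" [api]="db" [web]="api" [ios]="api" )
declare -A STATUS=( [db]="" [api]="" [web]="" [ios]="" )
gate() {
  local v="$1" d
  for d in ${DEPS[$v]}; do
    if [ "${STATUS[$d]}" != "PASS" ]; then
      STATUS[$v]="BLOCKED"   # upstream broke: BLOCKED, never FAIL
      echo "BLOCK $v (deps: $d=${STATUS[$d]})"
      return 1
    fi
  done
  echo "SPAWN $v"
}
```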
06
Benchmarking
benchmarking.md

Metric collection, integrity rules, and comparative analysis for /validate-benchmark. All metrics are calculated from actual evidence files and timestamps — never estimated. Coverage = validated / total; evidence quality = files examined / files claimed. History appended, never overwritten.

  • Coverage 35%: validated journeys / total discoverable features
  • Evidence Quality 30%: citations, observation quality, verdict rigor
  • Enforcement 25%: hooks on, no test files, no mocks, rules active
  • Speed 10%: wall clock time relative to project size
benchmark output
> /validate-benchmark
Coverage 35/35 (3/3 journeys)
Evidence Quality 27/30 (47 files, 0 zero-byte)
Enforcement 25/25 (strict + hooks)
Speed 8/10 (9m 14s vs median 7m 30s)
Total Score 95/100 Grade: A Trend: +3 vs run #9
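The weighted total above can be sketched as simple arithmetic: each component ratio (0..1) is scaled by its weight (35, 30, 25, 10) and summed. Real runs derive the ratios from evidence files and timestamps; the grade bands below are an assumption, since the rule text does not spell them out.

```shell
#!/usr/bin/env bash
# Minimal sketch of the weighted score and an assumed grade mapping.
score() {  # args: coverage quality enforcement speed (ratios 0..1)
  awk -v c="$1" -v q="$2" -v e="$3" -v s="$4" \
      'BEGIN { printf "%.0f\n", c*35 + q*30 + e*25 + s*10 }'
}
grade() {  # assumed bands: >=90 A, >=80 B, >=70 C, else D
  awk -v t="$1" 'BEGIN { print (t >= 90 ? "A" : t >= 80 ? "B" : t >= 70 ? "C" : "D") }'
}
```

With the trace's ratios (3/3 journeys, 27/30 quality, full enforcement, 8/10 speed), `score 1 0.9 1 0.8` reproduces the 95/100 total.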
07
Forge Execution
forge-execution.md

Phase gate protocol, fix-loop discipline, and state persistence for the FORGE engine. Forge commands treat the pipeline as strictly sequential with hard gates — preflight failure halts the run completely. Each fix attempt must target a DIFFERENT root cause. Evidence is never reused across attempts.

  • Max 3 fix attempts per journey. 4th attempt = UNFIXABLE
  • Each attempt writes to forge-attempt-N/ — fresh evidence only
  • Retrying the same hypothesis counts as a failed attempt
  • State persists to .validationforge/forge-state.json for resume
fix loop trace
> /validate-fix
[fix] Attempt 1: missing NEXTAUTH_URL → partial fix
[fix] Attempt 2: SameSite=Strict → regressed pc-04
[fix] Attempt 3: redirect URL allowlist → no change
[fix] 3 attempts exhausted → UNFIXABLE
See: e2e-evidence/user-signup/UNFIXABLE.md
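The fix-loop discipline can be sketched as a bounded loop: at most 3 attempts per journey, each writing to a fresh `forge-attempt-N/` directory, and a repeated hypothesis burns an attempt without producing new evidence. In a real run each attempt would apply a fix and re-validate; in this sketch every attempt is assumed to fail, so the loop ends UNFIXABLE.

```shell
#!/usr/bin/env bash
# Minimal sketch of fix-loop discipline (illustrative, not forge code).
fix_loop() {
  local journey_dir="$1"; shift
  local n=1 hyp seen=" "
  for hyp in "$@"; do
    if [ "$n" -gt 3 ]; then break; fi
    case "$seen" in
      *" $hyp "*)  # same hypothesis again still counts as an attempt
        echo "attempt $n: '$hyp' already tried (counts as failed)" ;;
      *)
        mkdir -p "$journey_dir/forge-attempt-$n"   # fresh evidence only
        echo "attempt $n: $hyp" ;;
    esac
    seen="$seen$hyp "
    n=$((n + 1))
  done
  if [ "$n" -gt 3 ]; then echo "UNFIXABLE: 3 attempts exhausted"; fi
}
```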
08
Forge Team Orchestration
forge-team-orchestration.md

Validator assignment, evidence ownership, verdict synthesis, and failure-blocking rules for /forge-team runs. Each validator receives ONLY its platform's journeys. FAIL propagates: every downstream platform is marked BLOCKED. The verdict writer spawns only after ALL validators complete.

  • One validator per platform, maximum 5 per run
  • FAIL propagates to all downstream platforms
  • Partial verdicts are forbidden — wait for ALL validators
  • Verdict writer reads every evidence file, not just inventories
blocking propagation
SPAWN DB — deps: none
DB FAIL → API, Web, iOS all BLOCKED
BLOCK API — deps: DB=FAIL
BLOCK Web — deps: API=BLOCKED
BLOCK iOS — deps: API=BLOCKED
Verdict: DB=FAIL, API=BLOCKED, Web=BLOCKED, iOS=BLOCKED
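The no-partial-verdicts rule can be sketched as a completeness check: the verdict writer refuses to synthesize until every validator has reached a terminal status (PASS, FAIL, or BLOCKED). This is a sketch (bash 4+); `RESULTS` and `write_verdict` are illustrative names, not a VF API.

```shell
#!/usr/bin/env bash
# Minimal sketch: synthesize only when ALL validators are terminal.
declare -A RESULTS
write_verdict() {
  local v s line=""
  for v in db api web ios; do
    s="${RESULTS[$v]:-}"
    case "$s" in
      PASS|FAIL|BLOCKED) line="$line$v=$s " ;;
      *) echo "WAIT: $v not finished"; return 1 ;;
    esac
  done
  echo "Verdict: ${line% }"
}
```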
09
Consensus Engine
consensus-engine.md

Execution-time agreement gate for high-stakes features. N independent validators (default 3) run the same journey list blind to each other's verdicts. The synthesizer applies synthesis states (UNANIMOUS_PASS / MAJORITY_PASS / SPLIT) and computes a confidence tier (HIGH / MEDIUM / LOW). SPLIT is a real outcome — the synthesizer never invents agreement.

  • HIGH confidence requires unanimity (agreement_ratio = 1.0)
  • MEDIUM: ≥⅔ PASS after disagreement analysis resolves it
  • LOW / SPLIT: escalate to human, never silently downgrade
  • Evidence directories are exclusive — cross-writes invalidate the run
synthesis trace
validator-1 → PASS (4m 12s)
validator-2 → FAIL pc-04 (stale browser state)
validator-3 → PASS (4m 05s)
[synthesizer] MAJORITY_PASS → disagreement protocol
[synthesizer] root cause: contradictory evidence (case b)
[synthesizer] re-run v2 pc-04 only → PASS
[synthesizer] UNANIMOUS_PASS (3/3) → HIGH confidence
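The synthesis states and tiers can be sketched as a vote count: unanimity yields HIGH confidence, a ≥2/3 PASS majority yields MAJORITY_PASS at MEDIUM, and anything less is SPLIT at LOW. This sketch omits the disagreement-analysis and targeted re-run steps shown in the trace; it only maps verdicts to a state.

```shell
#!/usr/bin/env bash
# Minimal sketch: map N blind verdicts to a synthesis state + tier.
synthesize() {
  local pass=0 total=0 v
  for v in "$@"; do
    total=$((total + 1))
    if [ "$v" = "PASS" ]; then pass=$((pass + 1)); fi
  done
  if [ "$pass" -eq "$total" ]; then
    echo "UNANIMOUS_PASS ($pass/$total) HIGH"
  elif [ $((pass * 3)) -ge $((total * 2)) ]; then
    echo "MAJORITY_PASS ($pass/$total) MEDIUM"
  else
    echo "SPLIT ($pass/$total) LOW"   # escalate to a human, never downgrade silently
  fi
}
```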