v1.5 · v2.0 · Roadmap
Three engines on a single enforcement philosophy: VALIDATE ships first, CONSENSUS raises the confidence bar, FORGE closes the fix loop. Every milestone has a concrete exit gate — cited evidence, not a feeling.
FORGE (v2.0) builds code → VALIDATE (v1.0, shipped) proves it works → CONSENSUS (v1.5) confirms agreement
The evidence-based functional validation engine. It refuses to let a feature ship without cited proof that it works against the real system. No mocks. No test files. No "it compiled" verdicts.
| Inventory | Count | Notes |
|---|---|---|
| Skills | 51 | Platform validators, quality gates, orchestration, analysis |
| Commands | 19 | /validate, /validate-fix, /validate-sweep, /validate-team, forge suite |
| Hooks | 7 | block-test-files, completion-claim-validator, validation-not-compilation … |
| Agents | 7 | platform-detector, evidence-capturer, verdict-writer, sweep-controller … |
| Rules | 9 | Installed to .claude/rules/vf-* via /forge-install-rules |
    Journey J2: User can delete account
    Verdict: PASS
    Evidence:
      step-01-navigate-to-settings.png (23 KB) — Settings page rendered
      step-02-click-delete-button.png (21 KB) — Confirmation dialog visible
      step-03-confirm-deletion.json (1.2 KB) — DELETE /users/42 → 204
      step-04-redirect-to-login.png (19 KB) — User landed at /login
      evidence-inventory.txt (414 B)
The execution-time agreement gate. Where VALIDATE proves the system works once, CONSENSUS requires at least two independent validators to agree that it works. A single validator's verdict can carry that validator's blind spots; for high-stakes features, CONSENSUS reduces that risk by requiring independent agreement.
    agreement_ratio = max(pass_count, fail_count) / total_validators

    confidence = HIGH    if agreement_ratio == 1.0   ← unanimous
                 MEDIUM  if agreement_ratio >= 2/3   ← after disagreement analysis resolves
                 LOW     if agreement_ratio <  2/3   ← split; unresolved

Confidence degrades monotonically. Evidence quality cannot substitute for agreement. HIGH requires unanimity regardless of how compelling one validator's evidence is.
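The thresholds above can be sketched in a few lines of Python. This is an illustrative sketch, not the CONSENSUS implementation: the `Confidence` enum and `confidence` function are hypothetical names, every validator is assumed to vote either PASS or FAIL, and the disagreement-analysis step that precedes a MEDIUM rating is not modeled. The 2/3 comparison is done in integers to avoid floating-point rounding at the boundary.

```python
from enum import IntEnum


class Confidence(IntEnum):
    """Hypothetical enum; ordered so that a lower value means weaker confidence."""
    LOW = 0
    MEDIUM = 1
    HIGH = 2


def confidence(pass_count: int, fail_count: int) -> Confidence:
    """Map validator votes to a confidence level using the stated thresholds."""
    # Assumption: total_validators == pass_count + fail_count (no abstentions).
    total = pass_count + fail_count
    if total < 2:
        raise ValueError("CONSENSUS needs at least 2 validators")

    majority = max(pass_count, fail_count)
    if majority == total:            # agreement_ratio == 1.0 (unanimous)
        return Confidence.HIGH
    if 3 * majority >= 2 * total:    # agreement_ratio >= 2/3, in integer arithmetic
        return Confidence.MEDIUM
    return Confidence.LOW            # agreement_ratio < 2/3 (split)
```

Under these assumptions, a unanimous FAIL is also rated HIGH: the level measures how strongly the validators agree, not which verdict they agree on.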
    Feature: Payment processing — refund flow
    Validators: 3

    Per-journey synthesis:
      J1  Refund happy path         UNANIMOUS_PASS  HIGH
      J2  Partial refund            UNANIMOUS_PASS  HIGH
      J3  Refund after dispute      MAJORITY_PASS   MEDIUM  (1 dissent → interpretation gap)
      J4  Double-refund prevention  SPLIT           LOW     → DISAGREEMENT_UNRESOLVED

    Overall: DISAGREEMENT_UNRESOLVED (weakest journey governs)
    Action:  escalate J4 to human; do not ship.
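The "weakest journey governs" rule in the report above amounts to taking a minimum over per-journey confidence levels. The sketch below shows one way to express that. `CONFIDENCE_ORDER` and `weakest_journey` are hypothetical names, and the input is assumed to be a plain mapping from journey ID to confidence label.

```python
# Assumed ordering of the three confidence labels, weakest first.
CONFIDENCE_ORDER = {"LOW": 0, "MEDIUM": 1, "HIGH": 2}


def weakest_journey(journeys: dict[str, str]) -> tuple[str, str]:
    """Return the (journey_id, confidence) pair that governs the overall result."""
    if not journeys:
        raise ValueError("at least one journey is required")
    # min() keeps the first entry on ties, so the earliest weakest journey is reported.
    return min(journeys.items(), key=lambda item: CONFIDENCE_ORDER[item[1]])


# The four journeys from the refund-flow report.
report = {"J1": "HIGH", "J2": "HIGH", "J3": "MEDIUM", "J4": "LOW"}
governing = weakest_journey(report)
```

Returning the journey ID alongside the level makes it straightforward to name the journey that needs escalation, which is J4 in the report.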
The autonomous fix-and-revalidate loop. FORGE closes the gap between "validation found a FAIL" and "validation found a FAIL AND fixed it." It makes up to three attempts, each with fresh evidence and a different root-cause hypothesis.
1. READ the FAIL verdict and cited evidence
2. TRACE to specific source code (file:line)
3. HYPOTHESIZE one root cause (single function/line named)
4. APPLY minimal fix targeting that cause
5. RE-VALIDATE the failed journey
6. IF FAIL persists → document WHY this hypothesis failed → move to next hypothesis
7. IF 3 attempts exhausted → mark UNFIXABLE → log all attempted causes → continue
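The control flow of steps 3 through 7 can be sketched as a bounded loop. This is a minimal illustration under stated assumptions, not FORGE's code: `Hypothesis`, `ForgeResult`, and `forge_loop` are hypothetical names, the fix and the re-validation are passed in as plain callables, and evidence capture is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List

MAX_ATTEMPTS = 3  # the attempt limit stated in step 7


@dataclass
class Hypothesis:
    cause: str                     # one named root cause, e.g. a function or file:line
    apply_fix: Callable[[], None]  # applies the minimal fix for that cause


@dataclass
class ForgeResult:
    status: str                    # "PASS" or "UNFIXABLE"
    attempts: int
    failed_causes: List[str] = field(default_factory=list)


def forge_loop(hypotheses: List[Hypothesis],
               revalidate: Callable[[], bool]) -> ForgeResult:
    """Try one hypothesis per attempt; stop at the first PASS or after the limit."""
    failed: List[str] = []
    for attempt, hyp in enumerate(hypotheses[:MAX_ATTEMPTS], start=1):
        if hyp.cause in failed:
            raise ValueError("each attempt needs a different root cause")
        hyp.apply_fix()            # step 4: minimal fix for this cause only
        if revalidate():           # step 5: re-run the failed journey
            return ForgeResult("PASS", attempt, failed)
        failed.append(hyp.cause)   # step 6: record the hypothesis that did not hold
    # Step 7: out of attempts (or out of hypotheses); report every cause tried.
    return ForgeResult("UNFIXABLE", len(failed), failed)
```

The caller receives the list of failed causes in both outcomes, so a PASS on a later attempt still carries the record of the hypotheses that did not hold.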
    Journey J2: Login flow

    Attempt 1:
      Hypothesis — missing null check in auth.ts:45
      Fix        — added guard; re-validated
      Result     — FAIL (different error surfaced)

    Attempt 2:
      Hypothesis — session cookie not set on redirect
      Fix        — added Set-Cookie header in callback handler
      Result     — PASS

    Final: J2 PASS after 2 attempts
    Evidence: forge-attempt-1/, forge-attempt-2/
Researched but not scoped to a specific version. Ships when the preceding engine lands clean.
Clarity on scope is part of the contract. These items are explicitly out of scope.