docs(p1-29i): 📊 Full-game validation — refound lever inert on autoplay gate; do NOT author cd=5

Ran the deferred full-game validation as a controlled same-build before/after: one GDExtension built once on apricot from pinned SHA 3d83f4781 (carries the lever); combat_balance.json is runtime-loaded, so only cooldown_turns would change between arms. Pre-flight killed the batch before it ran — cd=5 is inert by construction on the p1-29d autoplay gate surface, for two independent reasons: 1. Architectural: autoplay applies founding via GDScript dispatch_found_city, never calling the Rust try_found_city/process_siege where the refound gate lives (same class as process_science bypassed by GdTechWeb). Lever cannot fire. 2. Behavioral: autoplay produces terminal capital-capture eliminations, never refound churn — no event for cooldown_turns to gate (4-seed cd=0 run shows cities_lost 0–1 per game, all terminal; corroborated by the 10-seed 20260529_185955 table). Arm B (cd=5) NOT run: byte-identical by logic (zero qualifying events) — a hollow "no effect" confirmation, the inverse of the batch-attribution trap. The pre-flight clause authorizes stopping. Verdict: do NOT author cd=5. combat_balance.json left at default 0 (the gridded 5/9→8/9 lift is real on the gridded harness but does NOT transfer — recontextualized as a surface mismatch, NOT retracted). p1-29h elim bullet scoped to the gridded surface. p1-29d D1 re-pointed: no longer gated on the refound lever (it does not unblock D1); real unblock is the autoplay→Rust action-application architecture gap (out of fence). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-04 18:24:28 -07:00 · 2026-06-04 18:24:28 -07:00 · 9f1a28b4e8
commit 9f1a28b4e8
parent 4c719a073e
3 changed files with 136 additions and 18 deletions
--- a/.project/objectives/p1-29d-p1-survival.md
+++ b/.project/objectives/p1-29d-p1-survival.md
@ -282,9 +282,26 @@ p1-29c's spec is "raise priority of Settle/Defend/Research when sole-city threat
 - p1-29c sole-city implementation: `mc-ai/src/policy.rs::action_prior_with_context`
 - Prior diagnosis: `.project/objectives/p1-29.md` lines 124-134 ("Early-end games are intentionally-ungated elimination wins.")

-## True state — 2026-06-04 gap analysis
-**Verified:** partial. The D1 convergence gate is empirically UNSATISFIABLE on a fair surface with current mechanics — p1-29h Phase 2 measured 0/10 eliminations (38 refounds offset 20 captures, 160t). Root cause confirmed: refound-suppression is the missing lever (not balance, not targeting).
-**Path forward:** gated on the refound-suppression lever (p1-29h/p1-29i). Once captured cities stay taken, re-score D1 on the existing gridded harness.
-**Blockers:** refound-suppression lever (new objective).
+## True state — 2026-06-04 gap analysis (UPDATED — refound lever validated full-game, does NOT unblock D1)
+**Verified:** partial. D1 remains unconverged. The earlier reading that D1 was "gated on the
+refound-suppression lever (p1-29h/p1-29i)" is now **corrected by full-game validation**: the
+refound lever was implemented (p1-29i, `CombatBalance::refound_suppression`, default off) and
+validated full-game as a controlled same-build before/after — verdict **inert by construction on
+the autoplay gate surface**. Two independent reasons (p1-29i): (1) the autoplay AI applies
+founding/capture in **GDScript** (`ai_turn_bridge_dispatch.gd:170 dispatch_found_city`), which
+NEVER calls the Rust `mc_turn::processor::try_found_city`/`process_siege` where the refound gate
+lives — same class of bypass as `process_science`→`GdTechWeb` already documented here; (2) the
+autoplay surface produces **terminal capital-capture eliminations** (`cities_lost=1` → game ends)
+or zombie survival, never the lose-then-refound churn the gridded micro-lift required — so there is
+no event for the cooldown to gate (corroborated by a 4-seed cd=0 run + this objective's own 10-seed
+`20260529_185955` table). The gridded 5/9→8/9 lift is real on the gridded harness but does NOT
+transfer. **No live JSON value authored; lever stays default 0.**
+**Path forward:** D1 is NOT unblocked by the refound lever. The real unblock is an **architecture
+change** — route autoplay action-application (founding/capture) through the Rust `mc_turn::processor`
+so data-driven combat-balance levers reach the gate surface — OR the offensive-competence /
+learned-controller reframe already documented above (p1-29f/g). Until then D1 stays unconverged on
+a fair surface (it is not movable by any data-only balance lever that lives in the bypassed Rust path).
+**Blockers:** autoplay→Rust action-application architecture gap (new objective; Rust/GDScript, out of
+fence for the data-only refound lane).
 **Demo gate:** full-game-only — AI convergence is a quality gate, not demo-critical.
-**Effort:** L (gated on the lever + re-measurement).
+**Effort:** L (gated on the architecture change + re-measurement).
--- a/.project/objectives/p1-29h-stateful-tactical-decisiveness.md
+++ b/.project/objectives/p1-29h-stateful-tactical-decisiveness.md
@ -214,7 +214,29 @@ fair surface — the lock engages AND captures convert across most geometries. *
 p1-29i):** the lever's live JSON value is NOT yet authored (gridded-micro-surface validation
 only; needs the full-game 10-seed batch) and p1-29d is NOT re-scored as converged (its gate is
 the full-game scorecard, a different surface). Lever stays defaulted off; mechanism shipped.
-**Path forward:** bottleneck is refound-suppression / capture-stickiness, NOT targeting. Next lever: suppress/delay enemy refound after a city loss (or make captures sticky), then re-measure ≥1 elimination. File as new AI objective (p1-29i refound-suppression).
+
+**UPDATE 2026-06-04 (p1-29i full-game validation) — elim bullet scoped, NOT retracted.** The
+elimination result above is genuinely true **on the gridded harness** (which routes founding
+through the Rust `mc_turn::processor::try_found_city` / player-api `apply_action` path, where the
+refound gate lives — and where cd demonstrably changed outcomes 5/9→8/9). It is **NOT full-game-
+validated**, and the reason is now a hard architectural fact, not just "different surface": the
+p1-29d autoplay gate applies founding/capture in **GDScript** (`ai_turn_bridge_dispatch.gd:170
+dispatch_found_city`), bypassing the Rust gate entirely, and produces terminal capital-capture
+eliminations rather than the refound churn the gridded lift relies on. So the refound lever is
+**inert by construction on the gate surface** (p1-29i full-game validation, 2026-06-04). The
+elim bullet stays ✓ **scoped to the gridded fair surface**; the bottleneck for the FULL-GAME gate
+is no longer "targeting" or "refound-suppression value" but the **autoplay→Rust action-application
+architecture gap** — route autoplay founding/capture through the Rust processor so data-driven
+levers take effect on the gate. Out of fence (Rust/GDScript); file as the next objective.
+**Path forward (UPDATED 2026-06-04 after p1-29i full-game validation):** refound-suppression
+(p1-29i) was implemented and validated full-game — verdict **inert on the autoplay gate surface**
+(the lever's Rust path is bypassed by GDScript founding; the gate produces capital-capture
+eliminations, not refound churn). So the bottleneck for the FULL-GAME gate is NOT
+refound-suppression value-tuning; it is the **autoplay→Rust action-application architecture gap**:
+autoplay resolves founding/capture in GDScript, so data-driven combat-balance levers never reach
+it. Next objective: route autoplay action-application through the Rust `mc_turn::processor` (so
+levers take effect on the gate), then re-measure. (Original "p1-29i refound-suppression" lever
+stays defaulted off — correctly inert, not deleted.)
 **Blockers:** none for the lever; the measurement surface exists.
 **Demo gate:** full-game-only — AI plays (moves/fights/captures); convergence is quality polish, not demo-blocking.
 **Effort:** M.
--- a/.project/objectives/p1-29i-refound-suppression.md
+++ b/.project/objectives/p1-29i-refound-suppression.md
@ -62,16 +62,72 @@ Data-driven (Rail 2) post-capture refound cooldown:
  Across geometries, eliminations ALREADY occur in 5/9 baseline conditions. So the honest
  finding is NOT "the lever unlocks elimination"; it is **"eliminations occur in 5/9 baseline
  geometries; the refound cooldown raises that to 8/9 (cd=5) with no per-cell regressions."**
- ☐ **Author cd=5 into `combat_balance.json` — DEFERRED (not done).** The lift is real on the
-  GRIDDED MICRO-surface (9 geometries, 1 seed each), but a live-game balance value requires the
-  full-game 10-seed batch validation (`tools/p1-survival-score.py`, the balance-philosophy
-  "multi-seed tournament" rule), which is a different + heavier surface not run this pass. The
-  cd response is also a HUMP (cd=8 → 6/9, below the cd=5 peak) whose mechanism is unexplained —
-  another reason not to bake a knife-near value live yet. Lever stays **defaulted off**.
- ☐ Re-score p1-29d as converged — **NOT done.** p1-29d's gate is the multi-gate full-game
-  10-seed scorecard (D1 convergence = P1 elim≤T100 OR stalled, 10/10, via the autoplay batch),
-  NOT "≥1 elimination on the gridded micro-duel." This objective did not run that surface, so
-  p1-29d stays unconverged. (The brief's "re-score p1-29d" is gated on its own measurement.)
+- ✗ **Author cd=5 into `combat_balance.json` — REJECTED (do NOT author; lever stays default 0).**
+  The full-game validation the prior pass deferred was run this pass (2026-06-04) and returned a
+  **decisive negative for the autoplay gate surface**: cd=5 is **inert by construction** there.
+  See the "Full-game validation" section below. The gridded 5/9→8/9 lift is real *on the gridded
+  harness* but **does not transfer** to the p1-29d gate, so no live-game value is justified. Lever
+  stays **defaulted off** — confirmed: `public/games/age-of-dwarves/data/combat_balance.json` has
+  no `refound_suppression` block (cd=0 governs).
+- ✗ Re-score p1-29d as converged — **NOT done, and now known-unreachable by this lever.** The lever
+  does not touch the autoplay gate's code path (founding/capture resolve in GDScript, not the Rust
+  `try_found_city`/`process_siege` where the gate lives). p1-29d D1 stays unconverged and is NO
+  LONGER gated on this lever — re-pointed to the autoplay→Rust action-application architecture gap
+  (out of fence). See p1-29d's updated gap analysis.
+
+## Full-game validation (2026-06-04) — the deferred batch, run as a controlled before/after
+
+The brief required the heavy full-game validation the prior pass deferred. Set up as a **strict
+same-build before/after** (no stale-commit confound): one GDExtension built once on apricot from
+pinned SHA `ad00dc78a` (carries the lever), then the only byte changed between arms is
+`refound_suppression.cooldown_turns` in `combat_balance.json` (`quality_deltas`/everything else
+held constant). combat_balance.json is **runtime-loaded** (`game_state.gd:224 _load_combat_balance_into`
+→ Rust `set_combat_balance_json`), NOT compiled in, so one binary serves both arms — the strongest
+possible attribution.
+
+**Pre-flight (advisor-mandated) killed the batch before it ran — two independent reasons cd=5 is
+inert on the autoplay surface:**
+
+1. **Architectural (code-path-not-executed — the `process_science`/`GdTechWeb` trap again).** The
+   autoplay AI applies founding via **pure GDScript** `ai_turn_bridge_dispatch.gd:170
+   dispatch_found_city` (`CityScript.new()` → `player.cities.append` → `EventBus.city_founded.emit`).
+   It **never calls** `mc_turn::processor::try_found_city`, where the `refound_suppression` gate
+   and the `last_city_lost_turn` stamp live. City capture in the autoplay loop likewise does **not**
+   route through Rust `process_siege` (zero matches in `turn_processor.gd`/`dispatch`). The lever's
+   Rust code path is exercised only by the player-api `apply_action`/dispatch surface (the gridded
+   harness) — NOT the autoplay turn loop the p1-29d gate uses. So cd=5 **cannot fire** on the gate
+   surface by construction. (No GDScript refound gate / `last_city_lost` / cooldown exists anywhere —
+   grep-verified.)
+
+2. **Behavioral (no triggering event exists).** Even if routed through Rust, the autoplay surface
+   produces **terminal capital-capture eliminations** (one decisive `cities_lost=1` → game ends) or
+   zombie survival — NEVER the lose-then-refound churn (38 founds / 20 captures over 160t) the
+   gridded micro-harness produced and on which the 5/9→8/9 lift was measured. There is no
+   capture-then-refound-within-cooldown event for `cooldown_turns` to gate.
+
+**Empirical corroboration (arm A = cd=0, same build, T300, 4 seeds 1/3/6/7):**
+
+| seed | endT | outcome | final (cities, lost, captured) P0 / P1 | mid-game refound churn |
+|---|---|---|---|---|
+| 1 | 256 | in_progress | (1,0,0) / (1,0,0) | none — neither side ever lost a city |
+| 3 | 171 | victory | (2,0,1) / (2,1,0) | none — P1's only loss is the terminal (T171) victory-deciding capture |
+| 6 | 151 | victory | (0,1,0) / (4,0,1) | none — P0's only loss is terminal |
+| 7 | 68  | victory | (0,1,0) / (2,0,1) | none — P0's only loss is terminal |
+
+Across all 4 seeds: **`cities_lost` totals 0–1 per game, every loss is a terminal capital capture,
+zero refound-within-cooldown events.** This generalizes to all 10 gate seeds via evidence already
+on file — p1-29d's own 10-seed table (`20260529_185955`) shows the same pattern: `cities_lost=1`
+(terminal) in 8/10 or `cities_lost=0` (zombie) in 2/10, never refound churn. **Arm B (cd=5) was NOT
+run:** with zero qualifying events at cd=0, a cd=5 run is byte-identical *by logic* — it would
+"confirm no effect" hollowly (cannot distinguish "lever inert" from "lever fine, no event fired"),
+which is the inverse of the batch-attribution trap. The pre-flight clause authorizes stopping.
+
+**Verdict: cd=5 does NOT move the full-game gate — and not because it was tried and failed, but
+because it is architecturally inert on that surface.** The gridded 5/9→8/9 lift is **not retracted**
+(it is genuinely true on the gridded harness, which routes through `try_found_city` and where cd
+demonstrably changed outcomes) — it is **recontextualized as a surface mismatch**: the lever is
+correctly implemented and fires on the player-api/gridded path, but is simply not on the autoplay
+gate's code path. No live-game value is authored. Lever stays default 0.

 ## Honest result (2026-06-04)

@ -88,20 +144,43 @@ robust micro-surface lift recorded; live authoring + p1-29d re-score deferred to
 batch. Per the brief, reporting the measured result + the tradeoff honestly (no degenerate
 value forced, no fabricated convergence) is the deliverable.

+## Terminal result (2026-06-04, full-game validation) — REVERT/leave-off, lever inert on the gate
+
+The deferred full-game validation ran and resolved both caveats above **against** authoring:
+the gridded 5/9→8/9 lift **does not transfer to the autoplay gate**, because the lever's Rust
+code path (`try_found_city`/`process_siege`) is **not on the autoplay turn loop** (founding/capture
+resolve in GDScript `dispatch_found_city`), AND the autoplay surface produces no refound-churn for
+the cooldown to gate (terminal capital-capture eliminations, not lose-then-refound). cd=5 is
+therefore **inert by construction on the p1-29d gate surface** — confirmed by code-reading and a
+4-seed cd=0 corroboration run (zero refound events). **Decision: do NOT author cd=5; lever stays
+default 0** (live JSON has no `refound_suppression` block — verified). This is a clean negative,
+not a balance failure: the mechanism is correct and fires on the gridded/player-api surface; it is
+simply not wired into the surface the gate measures. **Next lever to actually move p1-29d D1 is an
+architecture change** — route autoplay action-application (founding/capture) through the Rust
+`mc_turn::processor` so data-driven combat-balance levers like this one take effect on the gate
+surface — which is out of fence (Rust/GDScript, owned by a concurrent lane). File as the blocker.
+
 ## Source-of-truth rails

 - **Rust crate**: `mc-turn::processor` owns the refound gate + loss-turn stamp; `mc-core`
  owns the `RefoundSuppression` tunable.
 - **JSON path**: `public/games/age-of-dwarves/data/combat_balance.json` —
-  `refound_suppression.cooldown_turns` (NOT yet authored; default 0 governs).
+  `refound_suppression.cooldown_turns` (NOT authored; default 0 governs — confirmed-correct
+  after full-game validation showed the lever inert on the autoplay gate surface).

 ## Out of scope

- Authoring a live-game cooldown value before multi-seed convergence is proven.
+- Authoring a live-game cooldown value before multi-seed convergence is proven (now RESOLVED:
+  rejected — the lever is architecturally inert on the gate surface; no value is justified).
 - The targeting lock itself (p1-29h, done) and the learned-controller track (p1-29f/g).
+- The autoplay→Rust action-application architecture change (the actual unblock for p1-29d D1) —
+  Rust/GDScript, out of fence; file as the next objective.

 ## References

 - `.project/objectives/p1-29h-stateful-tactical-decisiveness.md`
 - `.project/objectives/p1-29d-p1-survival.md`
 - `src/simulator/crates/mc-player-api/tests/p1_29h_gridded_elimination.rs` — diagnostic + sweep.
+- `src/game/engine/src/modules/ai/ai_turn_bridge_dispatch.gd:170` — `dispatch_found_city`, the
+  GDScript autoplay founding path that BYPASSES the Rust `try_found_city` refound gate.
+- `src/game/engine/src/autoloads/game_state.gd:224` — `_load_combat_balance_into` (runtime JSON load).