From b32f5b46b125bc06e3538d5c7a28084e883793ff Mon Sep 17 00:00:00 2001
From: Natalie
Date: Sun, 3 May 2026 08:42:34 -0400
Subject: [PATCH] =?UTF-8?q?fix(game1):=20=F0=9F=90=9B=20update=20p1-30=20s?=
 =?UTF-8?q?tatus=20to=20partial=20and=20add=20detailed=20batch=20results?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-Authored-By: Lilith Autocommit
---
 .project/objectives/p1-30.md | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/.project/objectives/p1-30.md b/.project/objectives/p1-30.md
index 464602a7..78d1dad6 100644
--- a/.project/objectives/p1-30.md
+++ b/.project/objectives/p1-30.md
@@ -2,11 +2,11 @@
 id: p1-30
 title: "Optimize `_build_tactical_state` — 8000-tile GDScript dict-build per AI turn blocks p1-22 huge-map gate"
 priority: p1
-status: missing
+status: partial
 scope: game1
 tags: [perf, tactical-ai]
 owner: warcouncil
-updated_at: 2026-04-26
+updated_at: 2026-05-03
 ---
 
 ## Summary
@@ -37,4 +37,12 @@ Option 2 is more correct (Rail-1) but bigger surface. Option 1 is the quick win.
 - mc-ai lib tests: 208/208 pass (`cargo test -p mc-ai --lib`)
 - api-gdext: `cargo check` clean (no errors)
 - ❌ Re-run `tools/huge-map-5clan.sh` 10-seed batch — verify ≥5/10 victories with ≥2 distinct winners (closes p1-22's failing sub-gate)
-- ❌ p1-22's evidence updated to cite this objective's closure
+  - **Batch run 2026-05-03** (`~/.cache/mc-src-20260502_215856/.local/iter/huge-map-5clan-20260503_033046/`):
+    - SEEDS=10, MCTS_DECISION_BUDGET_MS=2000, T500
+    - **Result: 0/10 victories, 0 distinct winners** — verdict.json `pass: false`
+    - All 10 seeds reached the 1800s seed wall-clock timeout while still `outcome=in_progress`
+    - Median game length: turn 243 (seeds spread T100-450)
+    - The Rail-1 tile-state-in-Rust handoff DID land (208/208 tests still pass) — but the strategic-decision wall-clock at 2000ms per AI turn × 5 AI × 100-450 turns × 1800s seed budget is not enough to reach decision in the huge-map config
+    - **Diagnosis**: The tactical-state serialization fix is necessary but insufficient. Either MCTS_DECISION_BUDGET_MS needs to drop (faster decisions, lower quality) or the per-seed wall-clock budget (currently 1800s) needs to grow significantly. Neither is a code change — both are knob retunings or a deeper MCTS efficiency improvement.
+    - **Status remains `partial`** — the implementation work landed; the gate metric did not move past R0 baseline (which was 2/10 victories — this run got 0/10).
+- ❌ p1-22's evidence updated to cite this objective's closure (still ❌ — gate remains failing)
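
The budget arithmetic behind the Diagnosis bullet can be checked with a rough sketch. It uses only the figures reported in the batch run and assumes, hypothetically, that `T500` is a 500-turn cap and that each of the 5 AI players makes exactly one decision per game turn and spends its full `MCTS_DECISION_BUDGET_MS`. The function names are illustrative and not part of the repository.

```python
# Back-of-envelope check of the huge-map time budget.
# Figures come from the batch-run notes; the cost model is an assumption.
DECISION_BUDGET_S = 2.0   # MCTS_DECISION_BUDGET_MS=2000
AI_PLAYERS = 5            # "5 AI" in the batch notes
SEED_TIMEOUT_S = 1800     # per-seed wall-clock timeout
TURN_CAP = 500            # assumed meaning of "T500"


def worst_case_seconds(turns, budget_s=DECISION_BUDGET_S, players=AI_PLAYERS):
    """Wall-clock needed if every AI spends its full budget on every turn."""
    return turns * players * budget_s


def turns_within(timeout_s, budget_s=DECISION_BUDGET_S, players=AI_PLAYERS):
    """Whole game turns that fit in timeout_s under the same assumption."""
    return int(timeout_s // (players * budget_s))


def observed_seconds_per_turn(turn_reached, timeout_s=SEED_TIMEOUT_S):
    """Average cost per turn for a seed that hit the timeout at turn_reached."""
    return timeout_s / turn_reached


if __name__ == "__main__":
    print(worst_case_seconds(TURN_CAP))             # 5000.0
    print(turns_within(SEED_TIMEOUT_S))             # 180
    print(round(observed_seconds_per_turn(243), 1)) # 7.4
    print(observed_seconds_per_turn(450))           # 4.0
    print(observed_seconds_per_turn(100))           # 18.0
```

Under this model a full 500-turn game needs about 5000 s, while 1800 s covers only 180 full-budget turns. The observed averages (4 s to 18 s per turn across the T100-450 spread) suggest real per-turn cost varies well beyond this simple model, so the numbers are a sanity check on the two knobs rather than a prediction.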