diff --git a/.project/objectives/README.md b/.project/objectives/README.md
index 4afee45e..b27b2dd1 100644
--- a/.project/objectives/README.md
+++ b/.project/objectives/README.md
@@ -10,8 +10,8 @@
 
 | Status | Count |
 |---|---|
-| ✅ done | 27 |
-| 🟡 partial | 14 |
+| ✅ done | 26 |
+| 🟡 partial | 15 |
 | 🔴 stub | 0 |
 | ❌ missing | 0 |
 | ⚫ oos | 4 |
@@ -21,7 +21,7 @@
 
 | ID | Status | Title | Owner | Updated |
 |---|---|---|---|---|
-| [p0-01](p0-01-mcts-wiring.md) | ✅ done | Wire MCTS into gameplay AI | [warcouncil](../team-leads/warcouncil.md) | 2026-04-17 |
+| [p0-01](p0-01-mcts-wiring.md) | 🟡 partial | Wire MCTS into gameplay AI | [warcouncil](../team-leads/warcouncil.md) | 2026-04-17 |
 | [p0-02](p0-02-clan-personalities.md) | 🟡 partial | Five AI clan personalities drive distinct playstyles | [warcouncil](../team-leads/warcouncil.md) | 2026-04-17 |
 | [p0-03](p0-03-pvp-in-turn.md) | ✅ done | PvP combat resolved inside the authoritative turn processor | — | 2026-04-17 |
 | [p0-04](p0-04-wonder-tracking.md) | ✅ done | World wonder tracking in PlayerState and score victory | — | 2026-04-17 |
diff --git a/.project/objectives/p0-01-mcts-wiring.md b/.project/objectives/p0-01-mcts-wiring.md
index 52c26567..12da9783 100644
--- a/.project/objectives/p0-01-mcts-wiring.md
+++ b/.project/objectives/p0-01-mcts-wiring.md
@@ -2,7 +2,7 @@
 id: p0-01
 title: Wire MCTS into gameplay AI
 priority: p0
-status: done
+status: partial
 scope: game1
 owner: warcouncil
 updated_at: 2026-04-17
@@ -12,19 +12,34 @@ evidence:
   - src/game/engine/src/modules/ai/ai_turn_bridge.gd
   - src/game/engine/src/modules/ai/simple_heuristic_ai.gd
   - src/game/engine/tests/unit/ai/test_ai_turn_bridge_mcts.gd
+  - .local/iter/p0-01-run1/
+  - .local/iter/p0-01-run2/
 ---
 
 ## Summary
 
 `GdMcTreeController` (Rust GDExtension) is the unconditional AI driver. `AiTurnBridge.run()` always calls `_apply_mcts_strategic_override()` — no feature flag, no silent fallback. If the extension is absent, `push_error` + `assert(false)` crashes loudly. `SimpleHeuristicAi` handles tactical decisions (movement, combat) after MCTS sets the strategic directive.
 
+**Status: `partial` — not `done`.** Three independent batches (2026-04-17 parallel-agent `mcts_unconditional_20260417_092532` at T155 median TTV, warcouncil `p0-01-run1` at T124, `p0-01-run2` at T126) all land median TTV well below the 200–350 acceptance band. The victory-rate and determinism bullets pass; the TTV band bullet does not. Per CLAUDE.md Objective Status Integrity (`## Acceptance` bullets must all be demonstrably true for `done`), this stays `partial` until the TTV regression is understood.
+
+## Evidence of gap
+
+- **Parallel batch 2026-04-17 `mcts_unconditional_20260417_092532`**: 8/10 victories, domination TTVs at T78, T92, T143, T155, score seeds at T299×4. Median T155 — 45 turns (22%) below the 200 floor.
+- **Warcouncil A5 run1 `.local/iter/p0-01-run1/`**: 9/10 victories (8 human wins idx=0, 1 AI win idx=1 on seed 4). TTVs: T81, T103, T115, T124, T126, T225, T299, T299, T299. Median T124 — 76 turns (38%) below the 200 floor.
+- **Warcouncil A5 run2 `.local/iter/p0-01-run2/`**: 9/10 victories. TTVs: T75, T114, T126, T129, T187, T216, T265, T299, T299. Median T126.
+- **End-to-end non-determinism discovered during A5 runs**: same-seed Run1↔Run2 outcome deltas up to 61 turns (e.g. seed 5: T126→T187). `tools/determinism-compare.py` reports 0/10 seeds pass, 9956 total divergences. First integer divergence appears ~T10 in combat outcomes (`total_combats=2 vs 1` on seed 3). Initial game state (`meta.json` except `start_stamp`) is identical, so divergence originates in the turn processor during game execution. **Out of warcouncil scope — surfaced here as p1-09 forensics.** Raw data in `.local/iter/p0-01-run{1,2}/`; report at `.local/iter/p0-01-determinism-report.txt`.
+
 ## Acceptance
 
 - ✓ `AiTurnBridge` ALWAYS delegates to MCTS — no fallback, no feature flag. `AI_USE_MCTS` env var removed 2026-04-17. If `GdMcTreeController` is absent, `push_error` + `assert(false)` crashes — no silent heuristic substitute. `SimpleHeuristicAi` lives on only as the tactical executor after MCTS sets direction.
-- ✓ 10-seed T300 unconditional batch (2026-04-17, `.local/batches/mcts_unconditional_20260417_092532`): **victory rate 8/10 = 80%** (target ≥50% ✓). p0 wins 7/10, p1 wins 1/10 (seed 10), incomplete/no-stats 2/10 (seeds 4, 8). TTV: domination seeds at T78, T92, T143, T155; score seeds at T299×4. Median TTV ≈T155.
-- ✓ Determinism preserved — GUT test 7 in `test_ai_turn_bridge_mcts.gd` asserts same seed → same directive across repeated runs.
+- ✓ Victory rate ≥50%: parallel batch 8/10 (80%), warcouncil run1 9/10 (90%), warcouncil run2 9/10 (90%). All three batches clear the 50% gate comfortably.
+- ✗ **Median TTV in the 200–350 band**: parallel batch T155, warcouncil run1 T124, warcouncil run2 T126. All three fall below the floor. The gate is NOT met. This is an AI-balance concern — games end too quickly, suggesting one player snowballs or opponents fold — not an AI-correctness concern.
+- ✓ Determinism preserved *at the MCTS directive level* — GUT test 7 in `test_ai_turn_bridge_mcts.gd` asserts same seed → same directive across repeated runs. (End-to-end game determinism is p1-09's acceptance, not p0-01's. Findings under "Evidence of gap" above.)
+
+**Remaining to reach done**: Understand and cite the TTV-below-band regression. Either (a) demonstrate a tuning change that lands median TTV in 200–350 across a 10-seed batch, or (b) explicitly renegotiate the band with the project owner and document the renegotiation here.
 
 ## Non-goals
 
 - Replacing `SimpleHeuristicAi` for tactical decisions (movement, combat remain heuristic).
 - Per-clan weight variation (that's `p0-02`).
+- End-to-end game-run determinism (that's `p1-09`).
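
The TTV-band arithmetic quoted in the objective (for example "45 turns (22%) below the 200 floor") can be reproduced with a small check. The following is a minimal sketch, not code from the repository: `ttv_gate` and `reported_medians` are hypothetical names, the band is assumed to be inclusive at both ends, and the medians are taken as reported in the objective file rather than recomputed from the per-seed TTV lists.

```python
# Hypothetical helper for illustration only; nothing here exists in the repo.
TTV_BAND = (200, 350)  # acceptance band quoted in the objective (assumed inclusive)


def ttv_gate(median_ttv, band=TTV_BAND):
    """Return (passes, shortfall_turns, shortfall_pct) for a median TTV.

    shortfall_* measure how far the median sits below the band floor;
    both are 0 when the median is at or above the floor.
    """
    floor, ceiling = band
    passes = floor <= median_ttv <= ceiling
    shortfall = max(0, floor - median_ttv)
    return passes, shortfall, 100.0 * shortfall / floor


# Median TTVs exactly as reported in the objective file (not recomputed).
reported_medians = {
    "mcts_unconditional_20260417_092532": 155,
    "p0-01-run1": 124,
    "p0-01-run2": 126,
}

results = {name: ttv_gate(m) for name, m in reported_medians.items()}

for name, (passes, turns, pct) in results.items():
    print(f"{name}: pass={passes}, {turns} turns ({pct:.1f}%) below floor")
```

Under these assumptions all three batches fail the gate, which matches the `✗` bullet; a real check would presumably read the medians from the batch output rather than hard-coding them.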