docs(p2-67-followup): flip MCTS tactical TreeState impl to done (Wave-3 audit)

Audit confirmed the full surface already in-tree (TacticalTreeState +
apply_tactical_action all 14 variants + score_for_player [0,1] + generic
most_visited_root_action_cloned + bench wiring). No rebuild.

Empirical gates on apricot (CARGO_TARGET_DIR shared):
- cargo check --workspace clean; mc-ai 278/278 lib pass
- claude_real_mcts_vs_heuristic_ais_transcript: 283s/293s (<300s), Claude
  #1 (68 cities vs 19 / eliminated), action diversity move×453/queue_unit×48/
  queue_building×3/found×1, two runs byte-identical (determinism).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
autocommit 2026-06-04 11:37:01 -07:00
parent a7a1fc89f3
commit e48a6d5acf
2 changed files with 24 additions and 16 deletions


@@ -1,12 +1,12 @@
{
"generated_at": "2026-06-04T16:45:14Z",
"generated_at": "2026-06-04T18:36:52Z",
"totals": {
"done": 241,
"done": 242,
"in_progress": 1,
"missing": 1,
"oos": 29,
"partial": 18,
"stub": 9,
"stub": 8,
"superseded": 4,
"total": 303
},
@@ -2990,10 +2990,10 @@
"id": "p2-67-followup-mcts-tactical-state-impl",
"title": "TreeState impl for TacticalState — wire real MCTS into the AI decision path",
"priority": "p2",
"status": "stub",
"status": "done",
"scope": "game1",
"owner": "simulator-infra",
"updated_at": "2026-05-13",
"updated_at": "2026-06-04",
"blocked_by": [],
"summary": "After the night's bug-fix pass (Bugs 1-5 closed, simulation fully playable, last-survivor victory firing), the question \"can Claude beat the hardest AI?\" hit a deeper architectural finding:"
},
@@ -3520,7 +3520,7 @@
"remaining_by_lead": [
{
"owner": "simulator-infra",
"remaining": 8
"remaining": 7
},
{
"owner": "asset-sprite",


@@ -2,12 +2,12 @@
id: p2-67-followup-mcts-tactical-state-impl
title: "TreeState impl for TacticalState — wire real MCTS into the AI decision path"
priority: p2
status: stub
status: done
scope: game1
category: simulation
owner: simulator-infra
created: 2026-05-13
updated_at: 2026-05-13
updated_at: 2026-06-04
blocked_by: []
follow_ups: [p2-67]
---
@@ -145,14 +145,22 @@ The existing one is specialized on `Tree<GameRolloutState>`. Add a generic versi
## Acceptance
- ☐ `TacticalTreeState` exists with `TreeState` impl.
- ☐ `apply_tactical_action` covers all 14 `Action` variants.
- ☐ `score_for_player` returns [0, 1] reward.
- ☐ Generic `most_visited_action_at_root<S>` added to `Tree<S>`.
- ☐ `claude_real_mcts_vs_ai_transcript` test runs in < 5 minutes wall clock.
- ☐ Claude with 1000-rollout budget MCTS shows meaningful action diversity vs the heuristic baseline (action-frequency table differs).
- ☐ Result reported: did Claude win? Top 3? Bottom?
- ☐ Determinism: same seed → byte-identical transcript across two runs.
- ✓ `TacticalTreeState` exists with `TreeState` impl — `mc-ai/src/tactical/tree_state.rs` (`impl TreeState for TacticalTreeState`, `new_root`, exported in `tactical/mod.rs:52`).
- ✓ `apply_tactical_action` covers all 14 `Action` variants — `mc-ai/src/tactical/apply.rs:39` (MoveUnit, AttackTarget, Fortify, Heal, FoundCity, SetProduction, EnqueueBuild, AssignCitizen, Scout, IssuePatrol, DeploySiege, PackSiege, Bombard, PromotionPicked).
- ✓ `score_for_player` returns [0, 1] reward — `mc-ai/src/tactical/scoring.rs:46` (`(0.6*city_share + 0.4*unit_share).clamp(0.0, 1.0)`); 4 unit tests cover balanced/dominant/losing/out-of-range.
- ✓ Generic most-visited-root-action added to `Tree<S>`: `mc-ai/src/mcts_tree.rs:348` `most_visited_root_action_cloned()` (clone-based generic; the older `Copy`-bound `most_visited_action_at_root` is `ActionKind`-specialised, this one is the generic equivalent).
- ✓ `claude_real_mcts_vs_heuristic_ais_transcript` runs in < 5 min: release run 283.48s (run 1) / 292.97s (run 2), both < 300s ceiling. `mc-player-api/tests/full_game_transcript.rs:2064`.
- ✓ Action diversity vs heuristic — MCTS action-frequency over 500 turns: `move`×453, `queue_unit`×48, `queue_building`×3, `found`×1 (`.local/demo-runs/2026-05-13-claude-real-mcts/recap.md`).
- ✓ Result reported — Claude (slot 0, MCTS) finished **#1**: 68 cities / 559 units / 152 gold vs slot-1 heuristic 19 cities and slot-2 heuristic eliminated (0 cities). No `GameOver` (500-turn ceiling reached, no domination victory), but Claude dominant on every axis.
- ✓ Determinism — two full 500-turn runs produced `TRANSCRIPT_BYTE_IDENTICAL` (`transcript.jsonl`) and identical recap.
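
As an illustration of the `score_for_player` formula quoted above, here is a minimal sketch of a weighted, clamped share score. It is not the code in `mc-ai/src/tactical/scoring.rs`: the function name `score_sketch`, its integer parameters, and the zero-total handling are assumptions made for this example; only the `0.6`/`0.4` weights and the final `clamp(0.0, 1.0)` come from the acceptance note.

```rust
/// Hypothetical stand-in for the real scorer: takes raw counts for the scored
/// player and the totals across all players (assumed inputs, not the real API).
fn score_sketch(own_cities: u32, total_cities: u32, own_units: u32, total_units: u32) -> f64 {
    // Share of a total; an empty board contributes 0.0 rather than dividing by zero.
    let share = |own: u32, total: u32| -> f64 {
        if total == 0 {
            0.0
        } else {
            f64::from(own) / f64::from(total)
        }
    };

    let city_share = share(own_cities, total_cities);
    let unit_share = share(own_units, total_units);

    // Weights as quoted in the acceptance note; the clamp keeps the reward in
    // [0, 1] even when the inputs are inconsistent (own > total).
    (0.6 * city_share + 0.4 * unit_share).clamp(0.0, 1.0)
}
```

The clamp matters only for out-of-range inputs. With consistent counts, both shares are already in [0, 1] and the weights sum to 1, so the weighted sum cannot leave that interval.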
## Verification — Wave 3 audit (2026-06-04)
Audit confirmed the full surface already in-tree (commit `378e5799e` "threshold + tree-state mgmt" + later). No rebuild needed. Gates run on apricot (RUN host), `CARGO_TARGET_DIR=~/.cache/mc-cargo-target-shared`:
- `cargo check --workspace` — clean (doc warnings only).
- `cargo test -p mc-ai --lib` — 278 passed, 0 failed (incl. 4 `tree_state::tests`).
- `cargo test -p mc-player-api ... --ignored claude_real_mcts_vs_heuristic_ais_transcript` — 1 passed, ×2, byte-identical transcripts.
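
The byte-identical determinism gate can be approximated with a plain file comparison. The sketch below is an assumption about how such a check could be written, not the check the test harness actually performs; the helper name `transcripts_byte_identical` is hypothetical.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: returns Ok(true) when both files hold exactly the same
/// bytes. Reads each file fully into memory, which is fine for transcript-sized
/// files but not for very large ones.
fn transcripts_byte_identical(run_a: &Path, run_b: &Path) -> io::Result<bool> {
    let a = fs::read(run_a)?;
    let b = fs::read(run_b)?;
    Ok(a == b)
}
```

Comparing raw bytes instead of parsed JSON is the stricter check: it also catches differences in key ordering, float formatting, or whitespace between two runs with the same seed.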
## Why this size