docs(p2-67-followup): ✅ flip MCTS tactical TreeState impl to done (Wave-3 audit)
Audit confirmed the full surface already in-tree (TacticalTreeState + apply_tactical_action all 14 variants + score_for_player [0,1] + generic most_visited_root_action_cloned + bench wiring). No rebuild.

Empirical gates on apricot (CARGO_TARGET_DIR shared):
- cargo check --workspace clean; mc-ai 278/278 lib pass
- claude_real_mcts_vs_heuristic_ais_transcript: 283s/293s (<300s), Claude #1 (68 cities vs 19 / eliminated), action diversity move×453/queue_unit×48/queue_building×3/found×1, two runs byte-identical (determinism).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
parent a7a1fc89f3
commit e48a6d5acf
2 changed files with 24 additions and 16 deletions
@@ -1,12 +1,12 @@
 {
-"generated_at": "2026-06-04T16:45:14Z",
+"generated_at": "2026-06-04T18:36:52Z",
 "totals": {
-"done": 241,
+"done": 242,
 "in_progress": 1,
 "missing": 1,
 "oos": 29,
 "partial": 18,
-"stub": 9,
+"stub": 8,
 "superseded": 4,
 "total": 303
 },
@@ -2990,10 +2990,10 @@
 "id": "p2-67-followup-mcts-tactical-state-impl",
 "title": "TreeState impl for TacticalState — wire real MCTS into the AI decision path",
 "priority": "p2",
-"status": "stub",
+"status": "done",
 "scope": "game1",
 "owner": "simulator-infra",
-"updated_at": "2026-05-13",
+"updated_at": "2026-06-04",
 "blocked_by": [],
 "summary": "After the night's bug-fix pass (Bugs 1-5 closed, simulation fully playable, last-survivor victory firing), the question \"can Claude beat the hardest AI?\" hit a deeper architectural finding:"
 },
@@ -3520,7 +3520,7 @@
 "remaining_by_lead": [
 {
 "owner": "simulator-infra",
-"remaining": 8
+"remaining": 7
 },
 {
 "owner": "asset-sprite",

@@ -2,12 +2,12 @@
 id: p2-67-followup-mcts-tactical-state-impl
 title: "TreeState impl for TacticalState — wire real MCTS into the AI decision path"
 priority: p2
-status: stub
+status: done
 scope: game1
 category: simulation
 owner: simulator-infra
 created: 2026-05-13
-updated_at: 2026-05-13
+updated_at: 2026-06-04
 blocked_by: []
 follow_ups: [p2-67]
 ---
@@ -145,14 +145,22 @@ The existing one is specialized on `Tree<GameRolloutState>`. Add a generic versi

 ## Acceptance

-- ☐ `TacticalTreeState` exists with `TreeState` impl.
-- ☐ `apply_tactical_action` covers all 14 `Action` variants.
-- ☐ `score_for_player` returns [0, 1] reward.
-- ☐ Generic `most_visited_action_at_root<S>` added to `Tree<S>`.
-- ☐ `claude_real_mcts_vs_ai_transcript` test runs in < 5 minutes wall clock.
-- ☐ Claude with 1000-rollout budget MCTS shows meaningful action diversity vs the heuristic baseline (action-frequency table differs).
-- ☐ Result reported: did Claude win? Top 3? Bottom?
-- ☐ Determinism: same seed → byte-identical transcript across two runs.
+- ✓ `TacticalTreeState` exists with `TreeState` impl — `mc-ai/src/tactical/tree_state.rs` (`impl TreeState for TacticalTreeState`, `new_root`, exported in `tactical/mod.rs:52`).
+- ✓ `apply_tactical_action` covers all 14 `Action` variants — `mc-ai/src/tactical/apply.rs:39` (MoveUnit, AttackTarget, Fortify, Heal, FoundCity, SetProduction, EnqueueBuild, AssignCitizen, Scout, IssuePatrol, DeploySiege, PackSiege, Bombard, PromotionPicked).
+- ✓ `score_for_player` returns [0, 1] reward — `mc-ai/src/tactical/scoring.rs:46` (`(0.6*city_share + 0.4*unit_share).clamp(0.0, 1.0)`); 4 unit tests cover balanced/dominant/losing/out-of-range.
+- ✓ Generic most-visited-root-action added to `Tree<S>` — `mc-ai/src/mcts_tree.rs:348 most_visited_root_action_cloned()` (clone-based generic; the older `Copy`-bound `most_visited_action_at_root` is `ActionKind`-specialised, this one is the generic equivalent).
+- ✓ `claude_real_mcts_vs_heuristic_ais_transcript` runs in < 5 min — release run 283.48s (run 1) / 292.97s (run 2), both < 300s ceiling. `mc-player-api/tests/full_game_transcript.rs:2064`.
+- ✓ Action diversity vs heuristic — MCTS action-frequency over 500 turns: `move`×453, `queue_unit`×48, `queue_building`×3, `found`×1 (`.local/demo-runs/2026-05-13-claude-real-mcts/recap.md`).
+- ✓ Result reported — Claude (slot 0, MCTS) finished **#1**: 68 cities / 559 units / 152 gold vs slot-1 heuristic 19 cities and slot-2 heuristic eliminated (0 cities). No `GameOver` (500-turn ceiling reached, no domination victory), but Claude dominant on every axis.
+- ✓ Determinism — two full 500-turn runs produced `TRANSCRIPT_BYTE_IDENTICAL` (`transcript.jsonl`) and identical recap.

+## Verification — Wave 3 audit (2026-06-04)
+
+Audit confirmed the full surface already in-tree (commit `378e5799e` "threshold + tree-state mgmt" + later). No rebuild needed. Gates run on apricot (RUN host), `CARGO_TARGET_DIR=~/.cache/mc-cargo-target-shared`:
+
+- `cargo check --workspace` — clean (doc warnings only).
+- `cargo test -p mc-ai --lib` — 278 passed, 0 failed (incl. 4 `tree_state::tests`).
+- `cargo test -p mc-player-api ... --ignored claude_real_mcts_vs_heuristic_ais_transcript` — 1 passed, ×2, byte-identical transcripts.
+
 ## Why this size

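
The acceptance notes quote the reward as `(0.6*city_share + 0.4*unit_share).clamp(0.0, 1.0)`. A minimal sketch of that blend follows, for readers who want to see why the result stays in [0, 1]. The `Tally` struct, the `share` helper, and the function signature are assumptions made for illustration; the real code in `mc-ai/src/tactical/scoring.rs` is not shown in this diff and may compute the shares differently.

```rust
/// Hypothetical per-player counts; the real `TacticalState` layout is not shown in the diff.
#[derive(Debug, Clone, Copy)]
struct Tally {
    cities: u32,
    units: u32,
}

/// Fraction of `total` owned by one player, defined as 0.0 when nothing exists.
/// (Assumed convention; it avoids a division by zero.)
fn share(mine: u32, total: u32) -> f64 {
    if total == 0 {
        0.0
    } else {
        f64::from(mine) / f64::from(total)
    }
}

/// Weighted blend quoted in the acceptance note: 60% city share, 40% unit share,
/// clamped to [0, 1] so it can be used directly as an MCTS reward.
fn score_for_player(player: usize, tallies: &[Tally]) -> f64 {
    let total_cities: u32 = tallies.iter().map(|t| t.cities).sum();
    let total_units: u32 = tallies.iter().map(|t| t.units).sum();
    let me = tallies[player];
    let city_share = share(me.cities, total_cities);
    let unit_share = share(me.units, total_units);
    (0.6 * city_share + 0.4 * unit_share).clamp(0.0, 1.0)
}

fn main() {
    // Two players: the first holds three quarters of both cities and units.
    let tallies = [Tally { cities: 3, units: 6 }, Tally { cities: 1, units: 2 }];
    for player in 0..tallies.len() {
        println!("player {player}: {:.3}", score_for_player(player, &tallies));
    }
}
```

Because both shares are already fractions in [0, 1] and the weights sum to 1, the clamp only matters as a guard against unexpected inputs, which is presumably what the "out-of-range" unit test mentioned above exercises.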
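
The acceptance list distinguishes a `Copy`-bound, `ActionKind`-specialised root query from the clone-based generic `most_visited_root_action_cloned()`. The sketch below illustrates what a clone-based generic version could look like. Everything except the method name is hypothetical: the `TreeState` trait shape, the `RootChild` record, and the flat `root_children` vector are stand-ins, not the actual `mc-ai/src/mcts_tree.rs` types.

```rust
/// Hypothetical stand-in for the real `TreeState` trait: only the action type matters here.
trait TreeState {
    type Action: Clone;
}

/// Hypothetical record for one child of the root: the action taken and its visit count.
struct RootChild<S: TreeState> {
    action: S::Action,
    visits: u32,
}

/// Hypothetical tree reduced to the root's children.
struct Tree<S: TreeState> {
    root_children: Vec<RootChild<S>>,
}

impl<S: TreeState> Tree<S> {
    /// Returns a clone of the most-visited root action, or `None` for an empty root.
    /// Ties keep the earliest child, so the choice does not depend on hash or sort order.
    fn most_visited_root_action_cloned(&self) -> Option<S::Action> {
        let mut best: Option<&RootChild<S>> = None;
        for child in &self.root_children {
            if best.map_or(true, |b| child.visits > b.visits) {
                best = Some(child);
            }
        }
        best.map(|c| c.action.clone())
    }
}

/// Demo state whose actions are `String`s, which are `Clone` but not `Copy`.
struct DemoState;

impl TreeState for DemoState {
    type Action = String;
}

fn main() {
    let tree: Tree<DemoState> = Tree {
        root_children: vec![
            RootChild { action: "queue_unit".to_string(), visits: 48 },
            RootChild { action: "move".to_string(), visits: 453 },
            RootChild { action: "found".to_string(), visits: 1 },
        ],
    };
    println!("{:?}", tree.most_visited_root_action_cloned());
}
```

Requiring only `Clone` lets the query work for action types that own heap data, at the cost of one clone per decision. The first-wins tie-break is a design choice of this sketch, picked because a stable selection rule is one ingredient of byte-identical transcripts.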