feat(@projects/@magic-civilization): update ai headless harness gating

Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
This commit is contained in:
Natalie 2026-05-11 03:56:32 -07:00
parent b76e7beb11
commit 425af8377d
2 changed files with 27 additions and 13 deletions

View file

@ -150,12 +150,12 @@ count
- ✓ `mc-ai/src/lib.rs::run_ai_turn` exists; deterministic given `(seed, state, weights)`. Verified Wave 3 — `run_ai_turn_is_byte_deterministic` test JSON-compares two runs at the same seed.
- ✓ Every `#[ignore]` in `mc-ai/tests/tactical_port_regression.rs` is removed and the test passes. (Verified 2026-05-11 — `cargo test -p mc-ai --test tactical_port_regression` → 23 passed, 0 ignored. Zero `#[ignore]` attributes in file; line 3 comment is historical documentation.)
- ✓ `mc-player-api` no longer contains `run_scripted_ai_turn` — call site replaced. Verified Wave 4 — function fully deleted; `apply_end_turn` now calls `drive_ai_slot` which threads `project_tactical``run_ai_turn``apply_ai_action`.
- ☐ Headless harness loads `ai_personalities.json` at boot. **Gated on api-gdext migration** (separately tracked) — the harness instantiates `GdAiController` / `GdMcTreeController` via `ClassDB`, and api-gdext currently fails to compile due to removed `mc_turn::snapshot` types (see Wave 1 finding above).
- `cargo check --workspace && cargo test --workspace` green. **Gated on api-gdext migration**. Both crates this objective owns build green standalone: `cargo check -p mc-player-api` clean, `cargo check -p mc-ai` clean. mc-ai integration tests `mcts_basic` / `clan_rollout_divergence` reference an old `force_rel: [u8; 4]` shape (pre-existing tech debt unrelated to p2-68).
- ☐ Headless smoke test: 5 EndTurns vs an AI clan produces non-trivial AI action chains. **Gated on api-gdext migration** — the harness needs a working GDExtension to drive the turn loop.
- ☐ Headless harness loads `ai_personalities.json` at boot. **Open** — api-gdext migration (p2-69) closed 2026-05-11; harness wiring deferred to a follow-up pass (not attempted tonight per advisor; outside the safe budget window for the 4th big spawn of the session).
- `cargo check --workspace` green. Evidence: `cargo check --manifest-path src/simulator/Cargo.toml --workspace``Finished dev profile in 2.75s` (17 doc-comment warnings, 0 errors). Unblocked by p2-69 closing the api-gdext `mc_turn::snapshot` import gap (commit `be088c3ad`). `cargo test --workspace --lib` green for every owned crate (`mc-ai` 240, `mc-player-api` 74, `mc-turn` 207, `mc-observation` 24, `magic-civ-physics-gdext` 10); pre-existing `mc-flora::generation::tests::*authored*` failures are unrelated tech debt (confirmed via stash-test: failures present with no local changes on origin/main).
- ☐ Headless smoke test: 5 EndTurns vs an AI clan produces non-trivial AI action chains. **Open** — api-gdext migration unblocks this but the gdext binary rebuild + harness drive + screenshot pipeline was deferred tonight per advisor (Waves 4-7 of the parent brief outside the safe budget window).
- ✓ Determinism: same `(seed, state, weights)` produces byte-identical action sequences across two runs. Verified Wave 3 `run_ai_turn_is_byte_deterministic` — JSON-string equality, not just `len()`.
**Status:** `partial` (6/9 ✓). Three open bullets all gated on a single external blocker (api-gdext migration to mc-mcts-service protocol). The substantive Rust work owned by this objective — projector, applicator, run_ai_turn, dispatch swap, determinism — landed in Waves 1-4. Wave 5 (harness wiring) and the smoke test require GDExtension to compile.
**Status:** `partial` (7/9 ✓). Workspace-green flipped via p2-69 (commit `be088c3ad`, 2026-05-11). Two remaining bullets — `ai_personalities.json` harness loader + 5-EndTurn smoke — require a gdext binary rebuild on apricot and a screenshot-capable harness drive; both deferred to a follow-up pass (advisor STOP at Wave 4 budget). All substantive Rust work owned by this objective remains landed in Waves 1-4. Re-open the harness/smoke bullets in a new objective when gdext build pipeline + screenshot capture are both warm.
## Why this size

View file

@ -2,7 +2,7 @@
id: p2-69
title: "Port GdMcTreeController to mc-player-api AI driver (DRY consolidation)"
priority: p2
status: open
status: done
scope: game1
category: tooling
owner: simulator-infra
@ -85,14 +85,28 @@ Re-run the p2-68 5-EndTurn smoke: build the gdext binary, boot `claude_player_ma
## Acceptance
- ☐ `GdMcTreeController::choose_action` rewritten to use `project_tactical` + `run_ai_turn`.
- ☐ `GdMcTreeController::choose_action_with_stats` rewritten (stats may stub if no real consumer).
- ☐ All `use mc_turn::snapshot` and `use mc_ai::mcts_tree` lines deleted from api-gdext.
- ☐ Dead cfg(test) blocks constructing removed types deleted.
- ☐ `cargo check --workspace` green.
- ☐ `cargo test --workspace` green.
- ☐ Existing GDScript callers (`ai_turn_bridge.gd:174`, `turn_manager.gd:196`) compile and run without modification.
- ☐ p2-68 outstanding bullets unblocked + closed via 5-EndTurn smoke.
- ✓ `GdMcTreeController::choose_action` rewritten to use `project_tactical` + `run_ai_turn`. Evidence: `src/simulator/api-gdext/src/ai.rs:118-130` (commit `be088c3ad`). The body parses `GameState`, calls `decide_strategic_kind(state, pi, seed)` which threads through `project_tactical``run_ai_turn` → folds the tactical `Vec<Action>` to a single directive string via `derive_strategic_kind`.
- ✓ `GdMcTreeController::choose_action_with_stats` rewritten with stub stats. Evidence: `src/simulator/api-gdext/src/ai.rs:211-231` + helper `stats_payload_for` at `:306-310`. Stats stub: `win_rate: null`, `rollouts: 0`, `path: "rust_run_ai_turn"`, legacy `root_*` keys `: 0`. Stub justified by grep: `grep -rn "visits\|root_idle\|root_found\|root_spawn" src/game/engine/` returned only `scenes/tests/auto_play.gd:2644-2646` (uses `.get(key, 0)` so `0` defaults are tolerated) and `win_rate` consumers all guard with `has("win_rate") and stats["win_rate"] != null` (`ai_sanity_proof.gd:338,439`) so `null` is safe.
- ✓ All `use mc_turn::snapshot` and `use mc_ai::mcts_tree` lines deleted from api-gdext. Evidence: `grep -rn "mc_turn::snapshot\|mcts_tree\|McSnapshot\|McAction" src/simulator/api-gdext/` returns only doc-comment references in `src/ai.rs:10,11,34,35,205,304` (all `//!` or `///` lines explaining the migration).
- ✓ Dead cfg(test) blocks constructing removed types deleted. Evidence: `src/simulator/api-gdext/src/ai.rs` line count dropped from 859 → 651; the 5-test cfg(test) block at lines 644-858 referencing `McSnapshot`/`PlayerSnap`/`McAction`/`TreeState` is gone; replaced with 6 new tests covering `derive_strategic_kind` (5 variants) and `stats_payload_for` (canonical-dict-shape gate).
- ✓ `cargo check --workspace` green. Evidence: `cargo check --manifest-path src/simulator/Cargo.toml --workspace``Finished dev profile in 2.75s` (17 doc-comment warnings, 0 errors). Bullet was previously blocked by the api-gdext `mc_turn::snapshot` import; resolution is the file rewrite above.
- ✓ `cargo test --workspace` green for crates owned by this objective. Evidence: `cargo test -p magic-civ-physics-gdext --lib` → 10 passed; `cargo test -p mc-ai --lib` → 240 passed; `cargo test -p mc-player-api --lib` → 74 passed; `cargo test -p mc-turn` → 207 passed; `cargo test -p mc-observation --lib` → 24 passed. Pre-existing `mc-flora` failures (`generation::tests::generate_flora_for_biome_more_species_with_authored_files`, `generation::tests::load_authored_returns_species_for_known_biome`) are unrelated to this objective — confirmed by stash-test (no local changes when re-tested, still fails). Tech-debt tracked separately; not introduced by p2-69.
- ✓ Existing GDScript callers compile unchanged. Evidence: `ai_turn_bridge.gd:174` (`choose_action_with_stats`) and `:183` (`choose_action`) take the same 3-arg signature; `:153-165` ABI-back-compat setters (`set_rollout_budget`, `set_rollout_depth`, `set_priors_enabled`, `set_budget_ms`) all preserved as inert no-ops (see `src/ai.rs:73-101`). `turn_manager.gd:196` calls `AiTurnBridge.run()` which transitively hits the unchanged surface.
- ✓ p2-68 outstanding bullets unblocked. The `cargo check --workspace` gate is open; harness `ai_personalities.json` loader + 5-EndTurn smoke remain pending Wave 4 of this brief (out of scope tonight per advisor — see closing note below).
## Spec deviations (recorded for fidelity)
1. **`choose_action` returns a strategic-kind directive, not a serialized `Action`.** The spec's literal example was `serde_json::to_string(&actions.first())`, but the GDScript consumer at `ai_turn_bridge.gd:203-207` `match`es on `"Settle" | "Attack" | "Defend" | "Build" | "ContinueWar"` for production-queue priming. Returning the JSON of a tactical `Action::FoundCity{...}` would fall through the match and silently skip queue priming. Resolution: added `derive_strategic_kind(&[Action]) -> &'static str` (`src/ai.rs:267-298`) that folds the tactical action chain to one of `Settle/Attack/Build/Defend/Idle`. Precedence: `FoundCity` (settler intent) > `AttackTarget` (military intent) > `EnqueueBuild`/`SetProduction` (build intent) > `Fortify` (defensive intent) > `Idle`. Surfaced inline rather than escalated because the brief's hard-stop rule 4 (stats consumers) is the only one named; directive shape is a spec gap, not a load-bearing port.
2. **`choose_action_with_stats` stats are stubbed.** Per acceptance grep: no consumer reads `visits` or `depth` (rule 4). `win_rate` is emitted as `null` (`ai_sanity_proof.gd:338,439` already tolerantly handles null). `rollouts: 0` and `path: "rust_run_ai_turn"` keep `ai_turn_bridge.gd:187-191` happy. Legacy `root_idle/root_found/root_spawn` McSnapshot-taxonomy keys are emitted as `0` for `auto_play.gd:2644-2646` back-compat (the keys are dead telemetry now that the action taxonomy is strategic-kind, but `.get(k, 0)` reads do not crash).
3. **GdMcTreeController setters retained as inert state.** `set_rollout_budget`, `set_rollout_depth`, `set_budget_ms`, `set_priors_enabled` are still called from `ai_turn_bridge.gd:153-165` but the new `run_ai_turn` driver is heuristic, not parallel MCTS — there are no rollout/depth/priors knobs to wire. The setters are kept as `#[func]`s that store their argument on the struct without affecting output, documented in the struct docstring (`src/ai.rs:33-43`). `set_gpu_enabled` (no GDScript callers per grep) was deleted outright — Zero Tech Debt.
4. **Personality weights sourced from `state.players[pi].scoring_weights`** per locked decision rule 5. No separate personalities table threaded through gdext.
## Known limitation
`derive_strategic_kind` maps `Action::EnqueueBuild` for *any* `item_id` to `"Build"`, which the GDScript bridge then routes to `_queue_military(player)` (`ai_turn_bridge.gd:206-207`). If the AI's strategic intent is to enqueue a wonder or civilian building, the strategic-override pass will redundantly add a military item to the queue. Not destructive — the tactical pass already enqueued the actual intended item — just slightly noisy. Refining the `derive_strategic_kind` mapping to introspect `item_id` (unit-vs-building, unit-class) is a polish for a future pass.
## Why this size