feat(@projects/@magic-civilization): ✨ add phase-13 stop criteria & render path blockers

Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-11 09:55:39 -07:00 · 2026-05-11 09:55:39 -07:00 · f948d2968e
commit f948d2968e
parent 560f99484b
2 changed files with 246 additions and 1 deletions
--- a/.project/objectives/p2-67-claude-player-api.md
+++ b/.project/objectives/p2-67-claude-player-api.md
@@ -639,6 +639,251 @@ that cite the precise schema mismatch blocking them.
 - `src/simulator/api-gdext/src/lib.rs` — `GdGameState::init`
  updated for new `trade_ledger` field.

+## 2026-05-12 — Phase 13 STOP (demo would surface degenerate AI; render path absent)
+
+Per brief hard-stop rule: "Claude-vs-AI demo produces no AI activity in
+any 5-turn block → STOP, document, exit (signals a regression somewhere
+in the AI driver)."
+
+### Evidence
+
+5-EndTurn smoke at Phase-11 commit `ff7198346` (3-player, seed=42):
+
+```
+turn 0 → slot 1 actions_applied=1, slot 2 actions_applied=1
+turn 1 → slot 1 actions_applied=0, slot 2 actions_applied=0
+turn 2 → slot 1 actions_applied=0, slot 2 actions_applied=0
+turn 3 → slot 1 actions_applied=0, slot 2 actions_applied=0
+turn 4 → slot 1 actions_applied=0, slot 2 actions_applied=0
+```
+
+A 25-turn run would produce identical zero-activity blocks for turns
+5-9, 10-14, 15-19, 20-24. The hard-stop fires multiple times. Driving
+the demo would produce a video of Claude playing solitaire while the
+AI sits motionless — not the "Claude vs production AI" promise.
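The hard-stop rule quoted above can be checked mechanically. The following is a minimal sketch, not the harness's actual code: `inactive_blocks` and `should_hard_stop` are hypothetical names, and it assumes the per-slot `actions_applied` counters have already been summed into one value per turn.

```rust
/// Returns the index of every `block_len`-turn block in which the AI applied
/// no actions at all. `per_turn_actions[t]` is assumed to be the sum of
/// `actions_applied` over all AI slots on turn `t` (hypothetical aggregation).
/// A trailing partial block is checked like a full one.
pub fn inactive_blocks(per_turn_actions: &[u32], block_len: usize) -> Vec<usize> {
    assert!(block_len > 0, "block length must be positive");
    per_turn_actions
        .chunks(block_len)
        .enumerate()
        .filter(|(_, block)| block.iter().all(|&n| n == 0))
        .map(|(idx, _)| idx)
        .collect()
}

/// The hard-stop rule fires as soon as any block shows zero AI activity.
pub fn should_hard_stop(per_turn_actions: &[u32], block_len: usize) -> bool {
    !inactive_blocks(per_turn_actions, block_len).is_empty()
}
```

Fed the smoke pattern extended to 25 turns (2 actions on turn 0, then zeros), the sketch flags blocks 1 through 4, which correspond to turns 5-9, 10-14, 15-19 and 20-24.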
+
+### Independent blocker — render path
+
+Phase 13 also requires "Capture screenshots every 5 turns". The
+current headless harness (`claude_player_main.gd`) is JSON-Lines only
+— no scene tree, no TileMap, no camera. Production proof scenes
+(`gameplay_arc_proof.tscn` etc.) render from `GameState` autoload,
+not from a `GdPlayerApi`-held state. There is no path today that
+takes the JSON state loaded via `GdPlayerApi.load_state_json` and renders
+it visually.
+
+Wiring this requires either:
+
+1. **Render bridge** — extract the proof-scene rendering pipeline into
+   a function that takes a `GdGameState` instance (not the
+   autoload), so the harness can pass its bootstrapped + ticked state
+   for capture.
+2. **Two-process orchestration** — one process drives the JSON pump,
+   another reads its events and replays them into a renderable scene
+   on the side.
+
+Either is its own objective with its own surface area.
+
+### What WAS validated this session
+
+- MCP install path is well understood: the brief's command is
+  `cd tooling/claude-player-mcp && npm install`, followed by adding
+  `magic-civ` to `.mcp.json`. Both steps take <5 minutes when the rest
+  of the pipeline is warm. Not attempted now per the parent hard-stop.
+- The MCP server itself (`tooling/claude-player-mcp/`) was shipped in
+  the 2026-05-10 Phase 4 work and is wire-stable.
+
+### What unblocks Phase 13
+
+The dependencies of Phases 12 and 13 overlap:
+
+- AI projector enrichment (so AI produces non-trivial action chains
+  past turn 0 → demo isn't degenerate).
+- Render bridge from `GdPlayerApi` state to a scene (so screenshots
+  capture real game state).
+
+When both land, Phase 13 is a single afternoon: `npm install`, edit
+`.mcp.json`, drive a 25-turn run via the MCP, capture per-5-turn
+screenshots into `.local/demo-runs/<stamp>/`, write the recap.md.
+
+### Status
+
+p2-67 stays `partial`. Phases 0-11 landed; Phases 12 + 13 deferred
+behind two follow-ups (`pX-bench-projector-enrichment`,
+`pX-render-bridge-gdplayerapi`). Re-open Phase 13 when both follow-ups
+close.
+
+## 2026-05-12 — Phase 12 STOP (ObservationStore API surface mismatch)
+
+Hard-stop triggered per brief rule: "ObservationStore API surface mismatch
+with what the projector needs → STOP, document, exit (don't paper over
+with a parallel observation store in mc-player-api)."
+
+### What the brief assumed
+
+`mc_observation::ObservationStore` lookups answer the question
+"is tile (col, row) visible to player P at the current turn?" so the
+projector can mark each `TileView` as visible / fogged / hidden.
+
+### What `ObservationStore` actually is
+
+A per-player CLIMATE / WEATHER observation history for the Chronicle
+UI. `src/simulator/crates/mc-observation/src/store.rs:8-90`:
+
+- `TurnObservation { turn, tile_indices, records }` — climate snapshot
+  (temperature, moisture, wind, succession_progress) of every tile
+  visible *at recording time* for that turn. Sparse on visible tiles
+  only.
+- `ObservationStore::record_turn(turn, grid, visible_tile_indices)`
+  takes a pre-computed list of visible tile indices — meaning the
+  visibility calculation lives somewhere OTHER than `mc-observation`.
+- `ObservationStore::get_turn(turn) -> Option<&TurnObservation>`
+  returns historical climate, not a "right now this tile is visible"
+  lookup.
+
+There is no `is_visible(player, col, row, turn) -> bool` API. The
+store's public surface (`write_turn_frame_buffers`,
+`write_latest_known_frame_buffers`, `unlock_lens`, `set_recording_gate`,
+…) is shaped for the Chronicle UI's climate ribbon — not for
+gameplay fog of war.
+
+### Why papering over would be wrong
+
+Per Rust SoT rail + brief's hard-stop: building a parallel "current
+visibility per player" calculation inside `mc-player-api/projection.rs`
+would duplicate the visibility logic that has to also live wherever
+`ObservationStore::record_turn`'s `visible_tile_indices` argument is
+computed (likely GDScript Vision.gd or a Rust port thereof). That's
+exactly the duplication the rail forbids.
+
+### What's actually needed
+
+Either:
+
+1. **`mc-vision` crate** (or similar) that owns "compute current visible
+   tile set for player P given GameState" as the single source of
+   truth. Both `ObservationStore::record_turn` callers and the
+   projector pull from this. Includes a `Visibility { Hidden, Fogged,
+   Visible }` query for any (player, tile, turn) tuple.
+
+2. **Widen `ObservationStore`** to include current visibility lookups
+   alongside the climate history. Doable but mixes concerns — climate
+   recording is one job, gameplay fog is another.
+
+The honest path is option 1. Surface area is moderate: walk all
+P-owned units + cities, compute hex-distance ≤ vision_radius per
+unit/city, union into a `HashSet<(col, row)>`, expose a `Visibility`
+enum that says "Visible if in current set, Fogged if in any prior
+set, Hidden otherwise."
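A minimal sketch of the option-1 surface described above. Everything here is an assumption made for illustration: the `Observer` struct, the function names, and the use of axial hex coordinates for `(col, row)` (an offset layout would need a conversion before the distance formula applies). It is not the `mc-vision` API, which does not exist yet.

```rust
use std::collections::HashSet;

/// Three-state fog result, as named in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Visibility {
    Hidden,
    Fogged,
    Visible,
}

/// Tile coordinate, assumed to be axial (q, r) for this sketch.
pub type Tile = (i32, i32);

/// Hypothetical stand-in for a P-owned unit or city.
pub struct Observer {
    pub pos: Tile,
    pub vision_radius: i32,
}

/// Hex distance in axial coordinates.
pub fn hex_distance(a: Tile, b: Tile) -> i32 {
    let dq = a.0 - b.0;
    let dr = a.1 - b.1;
    (dq.abs() + dr.abs() + (dq + dr).abs()) / 2
}

/// Union of all tiles within `vision_radius` of any observer, clipped to the map.
pub fn visible_set(observers: &[Observer], width: i32, height: i32) -> HashSet<Tile> {
    let mut set = HashSet::new();
    for o in observers {
        for col in 0..width {
            for row in 0..height {
                if hex_distance(o.pos, (col, row)) <= o.vision_radius {
                    set.insert((col, row));
                }
            }
        }
    }
    set
}

/// Visible if in the current set, Fogged if in any prior set, Hidden otherwise.
pub fn classify(tile: Tile, current: &HashSet<Tile>, ever_seen: &HashSet<Tile>) -> Visibility {
    if current.contains(&tile) {
        Visibility::Visible
    } else if ever_seen.contains(&tile) {
        Visibility::Fogged
    } else {
        Visibility::Hidden
    }
}
```

The brute-force scan over the whole map keeps the sketch short; a real crate would more likely enumerate only the tiles inside each observer's radius.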
+
+### Why Phase 12 stays open until then
+
+The projector currently uses strict-redaction fog (own-player-only).
+Without per-tile vision data, **all** enemy tiles are hidden, which
+matches "Hidden if never seen." The current behaviour is correct for
+"player who has never explored anywhere" — degenerate but not wrong.
+The wrong-ness only matters once units have moved and explored, and
+that path is also blocked by the AI behavioural-inertness gap from
+Phase 11's notes (units don't move past spawn). Fix in order:
+
+1. AI projector enrichment so units actually move and explore.
+2. `mc-vision` crate so fog has meaningful current/last-seen state.
+3. Phase 12 projection rework on top of (1) + (2).
+
+### Status
+
+p2-67 stays `partial`. Phases 0-11 landed. Phase 12 deferred behind
+the `mc-vision` follow-up objective. Phase 13 (MCP install + 25-turn
+demo + screenshot bundle) is also held. It does not depend on fog
+correctness, but driving a 25-turn Claude run against an AI that
+goes inert from turn 1 onward produces a degenerate demo (Claude
+moves, AI sits). Phase 13 unblocks alongside the AI projector
+enrichment.
+
+## 2026-05-12 — Phase 11 landed (TurnProcessor::step ticking)
+
+p2-68 closed all of Phase 10 (production AI driver replaces scripted heuristic).
+Phase 11 wires `TurnProcessor::step` into `apply_end_turn` between the AI
+loop and the closing `TurnStarted` emit so production, growth, research,
+founding, pending_move_requests, and fauna encounters all drain per turn.
+
+### Shipped
+
+- **`mc_turn::processor::TurnProcessor::step` now owns per-turn unit refresh.**
+  `src/simulator/crates/mc-turn/src/processor.rs:528-535` — added
+  `crate::refresh_units(state)` at end-of-step. Single source of truth per
+  the DRY rule locked in Phase 9; the dispatch-level `refresh_units` call
+  is deleted in the same patch.
+- **`mc_player_api::dispatch::apply_end_turn` runs `step` after the AI loop.**
+  `src/simulator/crates/mc-player-api/src/dispatch.rs:258-281` — constructs
+  `TurnProcessor::new(u32::MAX)` (advisory `max_turns`; victory_config
+  overrides when present), calls `step(state)`, extends the response
+  `events` vec with translated processor events. The dispatch's
+  `state.turn = state.turn.saturating_add(1)` and `refresh_units(state)`
+  call sites are both deleted — `step` owns turn increment + unit refresh.
+- **`translate_processor_events` translator** at
+  `dispatch.rs:295-368`. Maps 5 `mc_replay::TurnEvent` variants to
+  `wire::Event`: `TechResearched`, `WonderBuilt`, `CityFounded`,
+  `CityCaptured`, `GameOver`. `ClanId(u32)` is sourced from
+  `processor.rs:910` as `pi as u32` so the clan→player mapping is
+  `id.0 as PlayerId` with no separate table needed. Variants without a
+  direct wire counterpart (AmbientEncounterFired, UnitKilled, War/Peace,
+  Era, Leader, ClanEliminated, UnitCaptured, UnitRansomOffered,
+  CivilianDestroyed) are listed in an explicit drop arm so adding a new
+  `TurnEvent` variant forces a compile-time decision.
+- **Cargo dep `mc-replay`** added to `mc-player-api/Cargo.toml`.
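The "explicit drop arm" pattern from the translator bullet can be sketched as follows. The enums here are cut-down, hypothetical stand-ins (the real `mc_replay::TurnEvent` and `wire::Event` have more variants and different fields), and `PlayerId = u8` is an assumption. The point is only that a `match` without a wildcard arm stops compiling when a new variant is added.

```rust
/// Assumed width; the source only shows the cast `id.0 as PlayerId`.
pub type PlayerId = u8;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ClanId(pub u32);

/// Hypothetical subset of processor-side events.
pub enum TurnEvent {
    TechResearched { clan: ClanId, tech: String },
    CityFounded { clan: ClanId, name: String },
    GameOver { winner: ClanId },
    UnitKilled { clan: ClanId },
    ClanEliminated { clan: ClanId },
}

/// Hypothetical subset of wire-side events.
#[derive(Debug, PartialEq, Eq)]
pub enum Event {
    TechResearched { player: PlayerId, tech: String },
    CityFounded { player: PlayerId, name: String },
    GameOver { winner: PlayerId },
}

/// Clan index doubles as player index, so no lookup table is needed.
fn player_of(id: ClanId) -> PlayerId {
    id.0 as PlayerId
}

pub fn translate(ev: TurnEvent) -> Option<Event> {
    match ev {
        TurnEvent::TechResearched { clan, tech } => {
            Some(Event::TechResearched { player: player_of(clan), tech })
        }
        TurnEvent::CityFounded { clan, name } => {
            Some(Event::CityFounded { player: player_of(clan), name })
        }
        TurnEvent::GameOver { winner } => Some(Event::GameOver { winner: player_of(winner) }),
        // Explicit drop arm: every untranslated variant is named, and there is
        // no `_` wildcard, so a new `TurnEvent` variant is a compile error here.
        TurnEvent::UnitKilled { .. } | TurnEvent::ClanEliminated { .. } => None,
    }
}

/// Translates a batch, silently skipping dropped variants.
pub fn translate_all(events: Vec<TurnEvent>) -> Vec<Event> {
    events.into_iter().filter_map(translate).collect()
}
```

Listing dropped variants by name costs a line per variant, but it turns "forgot to handle the new event" from a silent runtime gap into a build failure.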
+
+### Tests + gate
+
+- `cargo test -p mc-player-api --lib`: 77 passed (was 74, +3 new):
+  - `end_turn_ticks_city_food_growth_via_turn_processor` — 2-turn
+    food accumulation crosses growth threshold (pop 1 → 2).
+  - `end_turn_completes_queued_unit_via_turn_processor` — city with
+    `production_stored=100` + `Queueable::Unit{dwarf_warrior}` spawns a
+    unit after one EndTurn (`player.units.len()` grows).
+  - `end_turn_refreshes_unit_movement_via_turn_processor` — unit with
+    `movement_remaining=0` and `base_moves=32` refreshes to 32 after step.
+- `cargo test -p mc-turn --lib`: 207/207 still green (no regression
+  from adding `refresh_units` to end-of-step).
+- `cargo check --workspace`: clean (17 pre-existing doc-comment warnings).
+
+### Smoke confirms ticking
+
+Re-ran the 3-player apricot smoke at the Phase-11 commit (gdext rebuild +
+class-cache refresh + 5 EndTurns). Claude's view across turns 0..5
+showed visible state advancement:
+
+- `food_stored`: 0 → 2 → 4 → 6 → 8 → 10 (net +2/turn)
+- `gold`: 60 → 68 → 76 → 84 → 92 → 100 (+8/turn)
+- `unit_count`: 3 → 3 → 4 → 5 → 6 → 6 (production threshold spawns)
+- `science_per_turn`: 0 → 42 (strategic_axes kicked in post-step)
+
+This is the direct, observable consequence of Phase 11. Pre-Phase-11
+smokes showed every field static across all 5 turns.
+
+### Honest finding — AI side still inert (separate from Phase 11)
+
+Same smoke surfaces `actions_applied=0` on the AI side (slots 1+2) for
+turns 1-4 despite Phase 11 wiring step. Turn 0 still produces 1 action
+per slot (the founding pass).
+
+This contradicts the p2-68 Wave-final hypothesis ("the bench doesn't tick,
+that's why the AI sees nothing to do"). Wave-final was partially wrong:
+the bench DOES tick visibly for Claude. The AI's inertness is a deeper
+issue — `decide_tactical_actions` on the bench projection bottoms out
+after the founding pass because:
+- `unit_catalog` is empty in the bench-projector (p2-68 Wave 1
+  documented limitation),
+- `(food, prod, gold)` per-tile yields are zero in the bench projection,
+- the unit move queue is empty because the AI projector has no per-tile
+  cost data.
+
+Phase 11 closes the "step doesn't tick" issue. The "AI is behaviourally
+inert past turn 0" issue is its own follow-up. Recommendation: open a
+new objective `pX-bench-projector-enrichment` to widen `project_tactical`
+with unit_catalog + per-tile yields + movement-cost data so
+`decide_tactical_actions` has a non-degenerate search space on the bench.
+
 ## 2026-05-11 — Phase 10 STOP (structural blocker; documented)

 Phase 10 cannot land as a thin dispatch swap. Per the user's
--- a/.project/objectives/p2-68-mc-ai-headless-turn-driver.md
+++ b/.project/objectives/p2-68-mc-ai-headless-turn-driver.md
@@ -152,7 +152,7 @@ count
 - ✓ `mc-player-api` no longer contains `run_scripted_ai_turn` — call site replaced. Verified Wave 4 — function fully deleted; `apply_end_turn` now calls `drive_ai_slot` which threads `project_tactical` → `run_ai_turn` → `apply_ai_action`.
 - ✓ Headless harness loads `ai_personalities.json` at boot. Verified Wave-final 2026-05-11 — `claude_player_main.gd::_apply_ai_personalities` reads `res://public/games/age-of-dwarves/data/ai_personalities.json` once via `FileAccess`, parses for clan key list, deterministic slot→clan mapping (`clan_ids[ai_index % count]` over sorted clan ids), and calls new `GdGameState::set_player_personality_json(slot, clan_id, json)` per AI slot. The setter delegates JSON parsing into `mc_core::ScoringWeights::from_personality_json` (single SoT). 3-player smoke output evidence: `{"clan_id":"blackhammer","slot":1,"type":"ai_personality_assigned"}{"clan_id":"deepforge","slot":2,"type":"ai_personality_assigned"}`. Commit `2de1880db`.
 - ✓ `cargo check --workspace` green. Evidence: `cargo check --manifest-path src/simulator/Cargo.toml --workspace` → `Finished dev profile in 2.75s` (17 doc-comment warnings, 0 errors). Unblocked by p2-69 closing the api-gdext `mc_turn::snapshot` import gap (commit `be088c3ad`). `cargo test --workspace --lib` green for every owned crate (`mc-ai` 240, `mc-player-api` 74, `mc-turn` 207, `mc-observation` 24, `magic-civ-physics-gdext` 10); pre-existing `mc-flora::generation::tests::*authored*` failures are unrelated tech debt (confirmed via stash-test: failures present with no local changes on origin/main).
- ⚠ Headless smoke test: 5 EndTurns vs an AI clan produces non-trivial AI action chains. **Plumbing verified; action-depth limited to turn 0 pending p2-67 Phase 11.** Wave-final 2026-05-11 — 2-player smoke (Claude vs blackhammer) and 3-player smoke (Claude vs blackhammer + deepforge) on apricot (gdext build `2de1880db` + class-cache pre-pass): turn 0 AI driver produced `actions_applied=1` per slot; turns 1-4 produced `actions_applied=0`. AI plumbing is end-to-end correct (`drive_ai_slot` → `project_tactical` → `run_ai_turn` with personality-shaped `ScoringWeights` → `apply_ai_action`). The action-depth ceiling is a downstream constraint: bench `GameState` does NOT tick production / research / unit refresh between EndTurns (`TurnProcessor::step` is not wired into `apply_end_turn` — owned by p2-67 Phase 11), so the AI sees the same idle bench state turn-over-turn with no new opportunities after the first founding pass. Output captured at `apricot:/tmp/wave1-smoke-output.txt` + `wave1-smoke-3p-output.txt`. **Personality variation:** both clans produced the same chain length on a fixed seed, which is consistent with `decide_tactical_actions` bottoming out on the bench's empty unit-catalog + zero-yield grid; the seed-derivation helper already proves byte-determinism in `mc-ai/tests/run_ai_turn_is_byte_deterministic`. Flips to ✓ when Phase 11 wires `TurnProcessor::step` (next wave) and turns 2-4 produce non-zero chains as a downstream consequence.
+- ⚠ Headless smoke test: 5 EndTurns vs an AI clan produces non-trivial AI action chains. **Plumbing verified end-to-end; AI behavioural inertness past turn 0 is a separate (deeper) issue.** Re-run 2026-05-12 at p2-67 Phase 11 commit `ff7198346` (TurnProcessor::step now ticks every EndTurn): Claude's view advances visibly turn-over-turn (food_stored 0→10, gold 60→100, unit_count 3→6, science_per_turn 0→42 — recorded at `apricot:/tmp/wave2-smoke-output.txt`), confirming the original Wave-final diagnosis ("bench isn't ticking, AI sees the same state every turn") was partially wrong. AI side still emits `actions_applied=1` on turn 0 per slot and `=0` on turns 1-4. Root cause is NOT a step-ticking gap; it is the bench projector's degenerate search space — `project_tactical` populates an empty unit_catalog, zero per-tile yields, and no move-cost data, so `decide_tactical_actions` returns empty after the turn-0 founding pass. Open follow-up `pX-bench-projector-enrichment` to widen the projector. Flips to ✓ once the projector has enough surface for `decide_tactical_actions` to find work past turn 0.
 - ✓ Determinism: same `(seed, state, weights)` produces byte-identical action sequences across two runs. Verified Wave 3 `run_ai_turn_is_byte_deterministic` — JSON-string equality, not just `len()`.
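The deterministic slot→clan mapping from the harness-loader bullet (`clan_ids[ai_index % count]` over sorted clan ids) can be sketched as below. The real logic lives in GDScript inside `claude_player_main.gd`; this Rust version is only an illustration. `assign_clans` is a hypothetical name, and deriving `ai_index` from the position of each slot in the AI-slot list is an assumption.

```rust
/// Maps each AI slot to a clan id by indexing into the sorted clan list and
/// wrapping around when there are more AI slots than clans.
pub fn assign_clans(clan_ids: &[&str], ai_slots: &[u32]) -> Vec<(u32, String)> {
    // Sorting first makes the result independent of JSON key order.
    let mut sorted: Vec<&str> = clan_ids.to_vec();
    sorted.sort_unstable();
    sorted.dedup();
    if sorted.is_empty() {
        return Vec::new();
    }
    ai_slots
        .iter()
        .enumerate()
        .map(|(ai_index, &slot)| (slot, sorted[ai_index % sorted.len()].to_string()))
        .collect()
}
```

With only `blackhammer` and `deepforge` as input, slots 1 and 2 receive those clans in that order regardless of the order the keys were read in, which is consistent with the smoke output quoted in the bullet.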

 **Status:** `partial` (8/9 ✓, 1 ⚠ conditional on p2-67 Phase 11). Harness loader bullet flipped ✓ via Wave-final 2026-05-11 (commit `2de1880db`, gdext rebuild on apricot, 2-player + 3-player smoke runs both emit `ai_personality_assigned` notification per AI slot with deterministic clan mapping). 5-EndTurn smoke bullet flips to ⚠ — plumbing verified, action depth limited to turn 0 because `TurnProcessor::step` is not wired into `apply_end_turn` (owned by p2-67 Phase 11). Will flip to ✓ when Phase 11 lands as a downstream consequence — the same smoke harness will then surface a multi-turn chain.