feat(@projects/@magic-civilization): add phase-13 stop criteria & render path blockers

Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
Natalie 2026-05-11 09:55:39 -07:00
parent 560f99484b
commit f948d2968e
2 changed files with 246 additions and 1 deletions


@@ -639,6 +639,251 @@ that cite the precise schema mismatch blocking them.
- `src/simulator/api-gdext/src/lib.rs`: `GdGameState::init`
updated for new `trade_ledger` field.
## 2026-05-12 — Phase 13 STOP (demo would surface degenerate AI; render path absent)
Per brief hard-stop rule: "Claude-vs-AI demo produces no AI activity in
any 5-turn block → STOP, document, exit (signals a regression somewhere
in the AI driver)."
### Evidence
5-EndTurn smoke at Phase-11 commit `ff7198346` (3-player, seed=42):
```
turn 0 → slot 1 actions_applied=1, slot 2 actions_applied=1
turn 1 → slot 1 actions_applied=0, slot 2 actions_applied=0
turn 2 → slot 1 actions_applied=0, slot 2 actions_applied=0
turn 3 → slot 1 actions_applied=0, slot 2 actions_applied=0
turn 4 → slot 1 actions_applied=0, slot 2 actions_applied=0
```
A 25-turn run would produce identical zero-activity blocks for turns
5-9, 10-14, 15-19, 20-24. The hard-stop fires multiple times. Driving
the demo would produce a video of Claude playing solitaire while the
AI sits motionless — not the "Claude vs production AI" promise.
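
The hard-stop rule quoted above can be checked mechanically against per-turn `actions_applied` counts. The following is a minimal sketch, not harness code: `inert_blocks` is a hypothetical helper, and it assumes blocks are non-overlapping and aligned at turn 0. Turns 5-24 are filled with zeros to mirror the extrapolation in the text, not measured data.

```rust
/// Hypothetical helper (not part of the harness): returns the first turn of
/// every full `block`-turn window in which no AI slot applied any action.
/// Assumes `block > 0` and windows aligned at turn 0.
fn inert_blocks(actions_per_turn: &[Vec<u32>], block: usize) -> Vec<usize> {
    actions_per_turn
        .chunks(block)
        .enumerate()
        // Skip a trailing partial block; the rule speaks of full blocks.
        .filter(|(_, turns)| turns.len() == block)
        // A block is inert when every slot on every turn applied zero actions.
        .filter(|(_, turns)| turns.iter().all(|slots| slots.iter().all(|&n| n == 0)))
        .map(|(index, _)| index * block)
        .collect()
}

fn main() {
    // Measured smoke data from the log: slots 1 and 2, turns 0 through 4.
    let mut run: Vec<Vec<u32>> = vec![
        vec![1, 1],
        vec![0, 0],
        vec![0, 0],
        vec![0, 0],
        vec![0, 0],
    ];
    // Turns 5 through 24 assumed all-zero, as the text predicts.
    run.extend(std::iter::repeat(vec![0, 0]).take(20));

    let stops = inert_blocks(&run, 5);
    println!("hard-stop fires for blocks starting at turns {:?}", stops);
}
```

Under these assumptions the first block (turns 0-4) is not inert because of the turn-0 founding action, while the four later blocks are.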
### Independent blocker — render path
Phase 13 also requires "Capture screenshots every 5 turns". The
current headless harness (`claude_player_main.gd`) is JSON-Lines only
— no scene tree, no TileMap, no camera. Production proof scenes
(`gameplay_arc_proof.tscn` etc.) render from `GameState` autoload,
not from a `GdPlayerApi`-held state. There is no path today that
takes the JSON state held by `GdPlayerApi.load_state_json` and renders
it visually.
Wiring this requires either:
1. **Render bridge** — extract the proof-scene rendering pipeline into
a function that takes a `GdGameState` instance (not the
autoload), so the harness can pass its bootstrapped + ticked state
for capture.
2. **Two-process orchestration** — one process drives the JSON pump,
another reads its events and replays them into a renderable scene
on the side.
Either is its own objective with its own surface area.
### What WAS validated this session
- MCP install path is well-understood (the brief's command is
`cd tooling/claude-player-mcp && npm install`, then add `magic-civ`
to `.mcp.json`). Both can be done in <5 minutes when the rest of the
pipeline is warm. Not attempted now per the parent hard-stop.
- The MCP server itself (`tooling/claude-player-mcp/`) was shipped in
the 2026-05-10 Phase 4 work and is wire-stable.
### What unblocks Phase 13
The dependencies of Phases 12 and 13 overlap:
- AI projector enrichment (so AI produces non-trivial action chains
past turn 0 → demo isn't degenerate).
- Render bridge from `GdPlayerApi` state to a scene (so screenshots
capture real game state).
When both land, Phase 13 is a single afternoon: `npm install`, edit
`.mcp.json`, drive a 25-turn run via the MCP, capture per-5-turn
screenshots into `.local/demo-runs/<stamp>/`, write the recap.md.
### Status
p2-67 stays `partial`. Phases 0-11 landed; Phases 12 + 13 deferred
behind two follow-ups (`pX-bench-projector-enrichment`,
`pX-render-bridge-gdplayerapi`). Re-open Phase 13 when both follow-ups
close.
## 2026-05-12 — Phase 12 STOP (ObservationStore API surface mismatch)
Hard-stop triggered per brief rule: "ObservationStore API surface mismatch
with what the projector needs → STOP, document, exit (don't paper over
with a parallel observation store in mc-player-api)."
### What the brief assumed
`mc_observation::ObservationStore` lookups answer the question
"is tile (col, row) visible to player P at the current turn?" so the
projector can mark each `TileView` as visible / fogged / hidden.
### What `ObservationStore` actually is
A per-player CLIMATE / WEATHER observation history for the Chronicle
UI. `src/simulator/crates/mc-observation/src/store.rs:8-90`:
- `TurnObservation { turn, tile_indices, records }` — climate snapshot
(temperature, moisture, wind, succession_progress) of every tile
visible *at recording time* for that turn. Sparse on visible tiles
only.
- `ObservationStore::record_turn(turn, grid, visible_tile_indices)`
takes a pre-computed list of visible tile indices — meaning the
visibility calculation lives somewhere OTHER than `mc-observation`.
- `ObservationStore::get_turn(turn) -> Option<&TurnObservation>`
returns historical climate, not a "right now this tile is visible"
lookup.
There is no `is_visible(player, col, row, turn) -> bool` API. The
store's public surface (`write_turn_frame_buffers`,
`write_latest_known_frame_buffers`, `unlock_lens`, `set_recording_gate`,
…) is shaped for the Chronicle UI's climate ribbon — not for
gameplay fog of war.
### Why papering over would be wrong
Per Rust SoT rail + brief's hard-stop: building a parallel "current
visibility per player" calculation inside `mc-player-api/projection.rs`
would duplicate the visibility logic that has to also live wherever
`ObservationStore::record_turn`'s `visible_tile_indices` argument is
computed (likely GDScript Vision.gd or a Rust port thereof). That's
exactly the duplication the rail forbids.
### What's actually needed
Either:
1. **`mc-vision` crate** (or similar) that owns "compute current visible
tile set for player P given GameState" as the single source of
truth. Both `ObservationStore::record_turn` callers and the
projector pull from this. Includes a `Visibility { Hidden, Fogged,
Visible }` query for any (player, tile, turn) tuple.
2. **Widen `ObservationStore`** to include current visibility lookups
alongside the climate history. Doable but mixes concerns — climate
recording is one job, gameplay fog is another.
The honest path is option 1. Surface area is moderate: walk all
P-owned units + cities, compute hex-distance ≤ vision_radius per
unit/city, union into a `HashSet<(col, row)>`, expose a `Visibility`
enum that says "Visible if in current set, Fogged if in any prior
set, Hidden otherwise."
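
The option-1 outline above can be sketched in a few dozen lines. This is an illustration under stated assumptions, not a proposed `mc-vision` API: `Observer`, `visible_tiles`, and `classify` are hypothetical names, the grid is assumed to use an "odd-r" offset layout (the real layout may differ), and line-of-sight blocking is ignored.

```rust
use std::collections::HashSet;

/// Three-state fog value named in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Visibility {
    Hidden,
    Fogged,
    Visible,
}

/// Hypothetical observer: a unit or city owned by the player.
struct Observer {
    col: i32,
    row: i32,
    vision_radius: i32,
}

/// Hex distance between two offset coordinates.
/// ASSUMPTION: "odd-r" layout (odd rows shifted right).
fn hex_distance(a: (i32, i32), b: (i32, i32)) -> i32 {
    // Convert offset (col, row) to axial (q, r), then take cube distance.
    let to_axial = |(col, row): (i32, i32)| (col - (row - (row & 1)) / 2, row);
    let (aq, ar) = to_axial(a);
    let (bq, br) = to_axial(b);
    let (dq, dr) = (aq - bq, ar - br);
    (dq.abs() + dr.abs() + (dq + dr).abs()) / 2
}

/// Union of all tiles within vision radius of any observer.
fn visible_tiles(observers: &[Observer], width: i32, height: i32) -> HashSet<(i32, i32)> {
    let mut seen = HashSet::new();
    for o in observers {
        // Scan the radius bounding box, clamped to the map edges.
        let rows = (o.row - o.vision_radius).max(0)..=(o.row + o.vision_radius).min(height - 1);
        for row in rows {
            let cols = (o.col - o.vision_radius).max(0)..=(o.col + o.vision_radius).min(width - 1);
            for col in cols {
                if hex_distance((o.col, o.row), (col, row)) <= o.vision_radius {
                    seen.insert((col, row));
                }
            }
        }
    }
    seen
}

/// Visible if in the current set, Fogged if in any prior set, Hidden otherwise.
fn classify(
    tile: (i32, i32),
    current: &HashSet<(i32, i32)>,
    ever_seen: &HashSet<(i32, i32)>,
) -> Visibility {
    if current.contains(&tile) {
        Visibility::Visible
    } else if ever_seen.contains(&tile) {
        Visibility::Fogged
    } else {
        Visibility::Hidden
    }
}

fn main() {
    // Turn N: one unit at (2, 2) with radius 1.
    let before = visible_tiles(&[Observer { col: 2, row: 2, vision_radius: 1 }], 10, 10);
    // Turn N+1: the unit has moved to (7, 7).
    let now = visible_tiles(&[Observer { col: 7, row: 7, vision_radius: 1 }], 10, 10);
    println!("(2,2) is now {:?}", classify((2, 2), &now, &before));
    println!("(7,7) is now {:?}", classify((7, 7), &now, &before));
}
```

In this sketch the caller is responsible for accumulating `ever_seen` as the union of every prior turn's visible set; where that history would live (the new crate or `ObservationStore`) is left open, as in the text.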
### Why Phase 12 stays open until then
The projector currently uses strict-redaction fog (own-player-only).
Without per-tile vision data, **all** enemy tiles are hidden, which
matches "Hidden if never seen." The current behaviour is correct for
"player who has never explored anywhere" — degenerate but not wrong.
The inaccuracy only matters once units have moved and explored, and
that path is also blocked by the AI behavioural-inertness gap from
Phase 11's notes (units don't move past spawn). Fix in order:
1. AI projector enrichment so units actually move and explore.
2. `mc-vision` crate so fog has meaningful current/last-seen state.
3. Phase 12 projection rework on top of (1) + (2).
### Status
p2-67 stays `partial`. Phases 0-11 landed. Phase 12 deferred behind
the `mc-vision` follow-up objective. Phase 13 (MCP install + 25-turn
demo + screenshot bundle) is also held — independent of fog
correctness, but driving a 25-turn Claude run against an AI that
goes inert after turn 0 produces a degenerate demo (Claude
moves, AI sits). Phase 13 unblocks alongside the AI projector
enrichment.
## 2026-05-12 — Phase 11 landed (TurnProcessor::step ticking)
p2-68 closed all of Phase 10 (production AI driver replaces scripted heuristic).
Phase 11 wires `TurnProcessor::step` into `apply_end_turn` between the AI
loop and the closing `TurnStarted` emit so production, growth, research,
founding, pending_move_requests, and fauna encounters all drain per turn.
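
The ordering described above (AI loop, then `step`, then the closing `TurnStarted` emit) can be sketched as follows. All types here are hypothetical stand-ins, not the real `mc-player-api` or `mc-turn` definitions, and the `+2` food tick is a placeholder for the processor's actual work.

```rust
/// Hypothetical stand-in for the wire event type.
#[derive(Debug, PartialEq)]
enum Event {
    AiActed { slot: u32 },
    TurnStarted { turn: u32 },
}

/// Hypothetical stand-in for the game state.
struct State {
    turn: u32,
    food_stored: u32,
}

/// Stand-in for `TurnProcessor::step`: ticks the economy and owns the
/// turn increment, so the dispatch layer no longer bumps `turn` itself.
fn step(state: &mut State) {
    state.food_stored += 2; // placeholder for production, growth, research, ...
    state.turn = state.turn.saturating_add(1);
}

/// Sketch of the end-turn ordering: AI loop, then `step`, then `TurnStarted`.
fn apply_end_turn(state: &mut State, ai_slots: &[u32]) -> Vec<Event> {
    let mut events = Vec::new();
    for &slot in ai_slots {
        // The real driver would project state and apply AI actions here.
        events.push(Event::AiActed { slot });
    }
    step(state);
    // Emitted last, so it reports the turn number that `step` produced.
    events.push(Event::TurnStarted { turn: state.turn });
    events
}

fn main() {
    let mut state = State { turn: 0, food_stored: 0 };
    let events = apply_end_turn(&mut state, &[1, 2]);
    println!("turn={} food={} events={:?}", state.turn, state.food_stored, events);
}
```

Keeping the increment inside `step` means there is exactly one place where the turn counter changes, which is the single-source-of-truth property this phase relies on.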
### Shipped
- **`mc_turn::processor::TurnProcessor::step` now owns per-turn unit refresh.**
`src/simulator/crates/mc-turn/src/processor.rs:528-535` — added
`crate::refresh_units(state)` at end-of-step. Single source of truth per
the DRY rule locked in Phase 9; the dispatch-level `refresh_units` call
is deleted in the same patch.
- **`mc_player_api::dispatch::apply_end_turn` runs `step` after the AI loop.**
`src/simulator/crates/mc-player-api/src/dispatch.rs:258-281` — constructs
`TurnProcessor::new(u32::MAX)` (advisory `max_turns`; victory_config
overrides when present), calls `step(state)`, extends the response
`events` vec with translated processor events. The dispatch's
`state.turn = state.turn.saturating_add(1)` and `refresh_units(state)`
call sites are both deleted — `step` owns turn increment + unit refresh.
- **`translate_processor_events` translator** at
`dispatch.rs:295-368`. Maps 5 `mc_replay::TurnEvent` variants to
`wire::Event`: `TechResearched`, `WonderBuilt`, `CityFounded`,
`CityCaptured`, `GameOver`. `ClanId(u32)` is sourced from
`processor.rs:910` as `pi as u32` so the clan→player mapping is
`id.0 as PlayerId` with no separate table needed. Variants without a
direct wire counterpart (AmbientEncounterFired, UnitKilled, War/Peace,
Era, Leader, ClanEliminated, UnitCaptured, UnitRansomOffered,
CivilianDestroyed) are listed in an explicit drop arm so adding a new
`TurnEvent` variant forces a compile-time decision.
- **Cargo dep `mc-replay`** added to `mc-player-api/Cargo.toml`.
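
The "explicit drop arm" pattern from the translator bullet relies on Rust's exhaustiveness checking: with no `_` wildcard, a newly added variant fails to compile until someone decides whether to map or drop it. A minimal sketch, using hypothetical cut-down enums rather than the real `mc_replay::TurnEvent` and `wire::Event` (whose variants and payloads differ):

```rust
/// Hypothetical, reduced stand-in for `mc_replay::TurnEvent`.
#[allow(dead_code)]
#[derive(Debug)]
enum TurnEvent {
    TechResearched { clan: u32, tech: String },
    CityFounded { clan: u32, name: String },
    GameOver { winner: u32 },
    UnitKilled { clan: u32 },
    ClanEliminated { clan: u32 },
}

/// Hypothetical, reduced stand-in for `wire::Event`.
#[derive(Debug, PartialEq)]
enum WireEvent {
    TechResearched { player: u32, tech: String },
    CityFounded { player: u32, name: String },
    GameOver { winner: u32 },
}

fn translate(events: Vec<TurnEvent>) -> Vec<WireEvent> {
    events
        .into_iter()
        .filter_map(|event| match event {
            // Clan id doubles as the player id, as described in the bullet.
            TurnEvent::TechResearched { clan, tech } => {
                Some(WireEvent::TechResearched { player: clan, tech })
            }
            TurnEvent::CityFounded { clan, name } => {
                Some(WireEvent::CityFounded { player: clan, name })
            }
            TurnEvent::GameOver { winner } => Some(WireEvent::GameOver { winner }),
            // Explicit drop arm: every unmapped variant is named, with no `_`
            // wildcard, so adding a variant to `TurnEvent` breaks the build here.
            TurnEvent::UnitKilled { .. } | TurnEvent::ClanEliminated { .. } => None,
        })
        .collect()
}

fn main() {
    let out = translate(vec![
        TurnEvent::CityFounded { clan: 1, name: "Khaz".to_string() },
        TurnEvent::UnitKilled { clan: 2 },
        TurnEvent::GameOver { winner: 1 },
    ]);
    println!("{:?}", out);
}
```

The trade-off is a longer match in exchange for a compile error instead of a silently dropped event.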
### Tests + gate
- `cargo test -p mc-player-api --lib`: 77 passed (was 74, +3 new):
- `end_turn_ticks_city_food_growth_via_turn_processor` — 2-turn
food accumulation crosses growth threshold (pop 1 → 2).
- `end_turn_completes_queued_unit_via_turn_processor` — city with
`production_stored=100` + `Queueable::Unit{dwarf_warrior}` spawns a
unit after one EndTurn (`player.units.len()` grows).
- `end_turn_refreshes_unit_movement_via_turn_processor` — unit with
`movement_remaining=0` and `base_moves=32` refreshes to 32 after step.
- `cargo test -p mc-turn --lib`: 207/207 still green (no regression
from adding `refresh_units` to end-of-step).
- `cargo check --workspace`: clean (pre-existing 17 doc-comment warnings).
### Smoke confirms ticking
Re-ran the 3-player apricot smoke at the Phase-11 commit (gdext rebuild
+ class-cache refresh + 5 EndTurns). Claude's view across turns 0..5
showed visible state advancement:
- `food_stored`: 0 → 2 → 4 → 6 → 8 → 10 (net +2/turn)
- `gold`: 60 → 68 → 76 → 84 → 92 → 100 (+8/turn)
- `unit_count`: 3 → 3 → 4 → 5 → 6 → 6 (production threshold spawns)
- `science_per_turn`: 0 → 42 (strategic_axes kicked in post-step)
This is the direct, observable consequence of Phase 11. Pre-Phase-11
smokes showed every field static across all 5 turns.
### Honest finding — AI side still inert (separate from Phase 11)
Same smoke surfaces `actions_applied=0` on the AI side (slots 1+2) for
turns 1-4 despite Phase 11 wiring step. Turn 0 still produces 1 action
per slot (the founding pass).
This contradicts the p2-68 Wave-final hypothesis ("the bench doesn't tick,
that's why the AI sees nothing to do"). Wave-final was partially wrong:
the bench DOES tick visibly for Claude. The AI's inertness is a deeper
issue — `decide_tactical_actions` on the bench projection bottoms out
after the founding pass because:
- `unit_catalog` is empty in the bench-projector (p2-68 Wave 1
documented limitation),
- `(food, prod, gold)` per-tile yields are zero in the bench projection,
- the unit move queue is empty because the AI projector has no per-tile
cost data.
Phase 11 closes the "step doesn't tick" issue. The "AI is behaviourally
inert past turn 0" issue is its own follow-up. Recommendation: open a
new objective `pX-bench-projector-enrichment` to widen `project_tactical`
with unit_catalog + per-tile yields + movement-cost data so
`decide_tactical_actions` has a non-degenerate search space on the bench.
## 2026-05-11 — Phase 10 STOP (structural blocker; documented)
Phase 10 cannot land as a thin dispatch swap. Per the user's


@@ -152,7 +152,7 @@ count
- ✓ `mc-player-api` no longer contains `run_scripted_ai_turn`; call site replaced. Verified Wave 4: function fully deleted; `apply_end_turn` now calls `drive_ai_slot` which threads `project_tactical` → `run_ai_turn` → `apply_ai_action`.
- ✓ Headless harness loads `ai_personalities.json` at boot. Verified Wave-final 2026-05-11 — `claude_player_main.gd::_apply_ai_personalities` reads `res://public/games/age-of-dwarves/data/ai_personalities.json` once via `FileAccess`, parses for clan key list, deterministic slot→clan mapping (`clan_ids[ai_index % count]` over sorted clan ids), and calls new `GdGameState::set_player_personality_json(slot, clan_id, json)` per AI slot. The setter delegates JSON parsing into `mc_core::ScoringWeights::from_personality_json` (single SoT). 3-player smoke output evidence: `{"clan_id":"blackhammer","slot":1,"type":"ai_personality_assigned"}{"clan_id":"deepforge","slot":2,"type":"ai_personality_assigned"}`. Commit `2de1880db`.
- ✓ `cargo check --workspace` green. Evidence: `cargo check --manifest-path src/simulator/Cargo.toml --workspace` → `Finished dev profile in 2.75s` (17 doc-comment warnings, 0 errors). Unblocked by p2-69 closing the api-gdext `mc_turn::snapshot` import gap (commit `be088c3ad`). `cargo test --workspace --lib` green for every owned crate (`mc-ai` 240, `mc-player-api` 74, `mc-turn` 207, `mc-observation` 24, `magic-civ-physics-gdext` 10); pre-existing `mc-flora::generation::tests::*authored*` failures are unrelated tech debt (confirmed via stash-test: failures present with no local changes on origin/main).
- ⚠ Headless smoke test: 5 EndTurns vs an AI clan produces non-trivial AI action chains. **Plumbing verified; action-depth limited to turn 0 pending p2-67 Phase 11.** Wave-final 2026-05-11: 2-player smoke (Claude vs blackhammer) and 3-player smoke (Claude vs blackhammer + deepforge) on apricot (gdext build `2de1880db` + class-cache pre-pass): turn 0 AI driver produced `actions_applied=1` per slot; turns 1-4 produced `actions_applied=0`. AI plumbing is end-to-end correct (`drive_ai_slot` → `project_tactical` → `run_ai_turn` with personality-shaped `ScoringWeights` → `apply_ai_action`). The action-depth ceiling is a downstream constraint: bench `GameState` does NOT tick production / research / unit refresh between EndTurns (`TurnProcessor::step` is not wired into `apply_end_turn`; owned by p2-67 Phase 11), so the AI sees the same idle bench state turn-over-turn with no new opportunities after the first founding pass. Output captured at `apricot:/tmp/wave1-smoke-output.txt` + `wave1-smoke-3p-output.txt`. **Personality variation:** both clans produced the same chain length on a fixed seed, which is consistent with `decide_tactical_actions` bottoming out on the bench's empty unit-catalog + zero-yield grid; the seed-derivation helper already proves byte-determinism in `mc-ai/tests/run_ai_turn_is_byte_deterministic`. Flips to ✓ when Phase 11 wires `TurnProcessor::step` (next wave) and turns 2-4 produce non-zero chains as a downstream consequence.
- ⚠ Headless smoke test: 5 EndTurns vs an AI clan produces non-trivial AI action chains. **Plumbing verified end-to-end; AI behavioural inertness past turn 0 is a separate (deeper) issue.** Re-run 2026-05-12 at p2-67 Phase 11 commit `ff7198346` (TurnProcessor::step now ticks every EndTurn): Claude's view advances visibly turn-over-turn (food_stored 0→10, gold 60→100, unit_count 3→6, science_per_turn 0→42 — recorded at `apricot:/tmp/wave2-smoke-output.txt`), confirming the original Wave-final diagnosis ("bench isn't ticking, AI sees the same state every turn") was partially wrong. AI side still emits `actions_applied=1` on turn 0 per slot and `=0` on turns 1-4. Root cause is NOT a step-ticking gap; it is the bench projector's degenerate search space — `project_tactical` populates an empty unit_catalog, zero per-tile yields, and no move-cost data, so `decide_tactical_actions` returns empty after the turn-0 founding pass. Open follow-up `pX-bench-projector-enrichment` to widen the projector. Flips to ✓ once the projector has enough surface for `decide_tactical_actions` to find work past turn 0.
- ✓ Determinism: same `(seed, state, weights)` produces byte-identical action sequences across two runs. Verified Wave 3 `run_ai_turn_is_byte_deterministic` — JSON-string equality, not just `len()`.
**Status:** `partial` (8/9 ✓, 1 ⚠ conditional on p2-67 Phase 11). Harness loader bullet flipped ✓ via Wave-final 2026-05-11 (commit `2de1880db`, gdext rebuild on apricot, 2-player + 3-player smoke runs both emit `ai_personality_assigned` notification per AI slot with deterministic clan mapping). 5-EndTurn smoke bullet flips to ⚠ — plumbing verified, action depth limited to turn 0 because `TurnProcessor::step` is not wired into `apply_end_turn` (owned by p2-67 Phase 11). Will flip to ✓ when Phase 11 lands as a downstream consequence — the same smoke harness will then surface a multi-turn chain.
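
The deterministic slot-to-clan mapping described in the harness-loader bullet above (`clan_ids[ai_index % count]` over sorted clan ids) is implemented in GDScript in the harness. The Rust sketch below only illustrates the rule; `assign_clans` is a hypothetical name and the empty-list guard is an added assumption.

```rust
/// Hypothetical illustration of the mapping rule: sort the clan ids, then
/// give the i-th AI slot the clan at index `i % count`.
fn assign_clans(mut clan_ids: Vec<String>, ai_slots: &[u32]) -> Vec<(u32, String)> {
    // Sorting makes the result independent of JSON key order.
    clan_ids.sort();
    if clan_ids.is_empty() {
        // Assumption: with no clans there is nothing to assign.
        return Vec::new();
    }
    ai_slots
        .iter()
        .enumerate()
        .map(|(ai_index, &slot)| (slot, clan_ids[ai_index % clan_ids.len()].clone()))
        .collect()
}

fn main() {
    // Clan ids in arbitrary input order; slots 1 and 2 are the AI players.
    let clans = vec!["deepforge".to_string(), "blackhammer".to_string()];
    for (slot, clan) in assign_clans(clans, &[1, 2]) {
        println!("slot {} -> {}", slot, clan);
    }
}
```

With the two clan ids from the smoke output, this rule assigns `blackhammer` to slot 1 and `deepforge` to slot 2, matching the `ai_personality_assigned` lines quoted in the bullet.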