diff --git a/.project/objectives/p2-67-claude-player-api.md b/.project/objectives/p2-67-claude-player-api.md
index 1b5d4ce9..a0110118 100644
--- a/.project/objectives/p2-67-claude-player-api.md
+++ b/.project/objectives/p2-67-claude-player-api.md
@@ -639,6 +639,251 @@ that cite the precise schema mismatch blocking them.
 - `src/simulator/api-gdext/src/lib.rs` — `GdGameState::init` updated for new
   `trade_ledger` field.
 
+## 2026-05-12 — Phase 13 STOP (demo would surface degenerate AI; render path absent)
+
+Per brief hard-stop rule: "Claude-vs-AI demo produces no AI activity in
+any 5-turn block → STOP, document, exit (signals a regression somewhere
+in the AI driver)."
+
+### Evidence
+
+5-EndTurn smoke at Phase-11 commit `ff7198346` (3-player, seed=42):
+
+```
+turn 0 → slot 1 actions_applied=1, slot 2 actions_applied=1
+turn 1 → slot 1 actions_applied=0, slot 2 actions_applied=0
+turn 2 → slot 1 actions_applied=0, slot 2 actions_applied=0
+turn 3 → slot 1 actions_applied=0, slot 2 actions_applied=0
+turn 4 → slot 1 actions_applied=0, slot 2 actions_applied=0
+```
+
+A 25-turn run would produce identical zero-activity blocks for turns
+5-9, 10-14, 15-19, 20-24. The hard-stop fires multiple times. Driving
+the demo would produce a video of Claude playing solitaire while the
+AI sits motionless — not the "Claude vs production AI" promise.
+
+### Independent blocker — render path
+
+Phase 13 also requires "Capture screenshots every 5 turns". The
+current headless harness (`claude_player_main.gd`) is JSON-Lines only
+— no scene tree, no TileMap, no camera. Production proof scenes
+(`gameplay_arc_proof.tscn` etc.) render from `GameState` autoload,
+not from a `GdPlayerApi`-held state. There is no path today that
+takes the JSON state held by `GdPlayerApi.load_state_json` and renders
+it visually.
+
+Wiring this requires either:
+
+1. **Render bridge** — extract the proof-scene rendering pipeline into
+   a function that takes a `GdGameState` instance (not the
+   autoload), so the harness can pass its bootstrapped + ticked state
+   for capture.
+2. **Two-process orchestration** — one process drives the JSON pump,
+   another reads its events and replays them into a renderable scene
+   on the side.
+
+Either is its own objective with its own surface area.
+
+### What WAS validated this session
+
+- MCP install path is well-understood (the brief's command is
+  `cd tooling/claude-player-mcp && npm install`, then add `magic-civ`
+  to `.mcp.json`). Both can be done in <5 minutes when the rest of the
+  pipeline is warm. Not attempted now per the parent hard-stop.
+- The MCP server itself (`tooling/claude-player-mcp/`) was shipped in
+  the 2026-05-10 Phase 4 work and is wire-stable.
+
+### What unblocks Phase 13
+
+Both Phases 12 and 13's dependencies overlap:
+
+- AI projector enrichment (so AI produces non-trivial action chains
+  past turn 0 → demo isn't degenerate).
+- Render bridge from `GdPlayerApi` state to a scene (so screenshots
+  capture real game state).
+
+When both land, Phase 13 is a single afternoon: `npm install`, edit
+`.mcp.json`, drive a 25-turn run via the MCP, capture per-5-turn
+screenshots into `.local/demo-runs//`, write the recap.md.
+
+### Status
+
+p2-67 stays `partial`. Phases 0-11 landed; Phases 12 + 13 deferred
+behind two follow-ups (`pX-bench-projector-enrichment`,
+`pX-render-bridge-gdplayerapi`). Re-open Phase 13 when both follow-ups
+close.
+
+## 2026-05-12 — Phase 12 STOP (ObservationStore API surface mismatch)
+
+Hard-stop triggered per brief rule: "ObservationStore API surface mismatch
+with what the projector needs → STOP, document, exit (don't paper over
+with a parallel observation store in mc-player-api)."
+
+### What the brief assumed
+
+`mc_observation::ObservationStore` lookups answer the question
+"is tile (col, row) visible to player P at the current turn?" so the
+projector can mark each `TileView` as visible / fogged / hidden.
+
+### What `ObservationStore` actually is
+
+A per-player CLIMATE / WEATHER observation history for the Chronicle
+UI. `src/simulator/crates/mc-observation/src/store.rs:8-90`:
+
+- `TurnObservation { turn, tile_indices, records }` — climate snapshot
+  (temperature, moisture, wind, succession_progress) of every tile
+  visible *at recording time* for that turn. Sparse on visible tiles
+  only.
+- `ObservationStore::record_turn(turn, grid, visible_tile_indices)`
+  takes a pre-computed list of visible tile indices — meaning the
+  visibility calculation lives somewhere OTHER than `mc-observation`.
+- `ObservationStore::get_turn(turn) -> Option<&TurnObservation>`
+  returns historical climate, not a "right now this tile is visible"
+  lookup.
+
+There is no `is_visible(player, col, row, turn) -> bool` API. The
+store's public surface (`write_turn_frame_buffers`,
+`write_latest_known_frame_buffers`, `unlock_lens`, `set_recording_gate`,
+…) is shaped for the Chronicle UI's climate ribbon — not for
+gameplay fog of war.
+
+### Why papering over would be wrong
+
+Per Rust SoT rail + brief's hard-stop: building a parallel "current
+visibility per player" calculation inside `mc-player-api/projection.rs`
+would duplicate the visibility logic that has to also live wherever
+`ObservationStore::record_turn`'s `visible_tile_indices` argument is
+computed (likely GDScript Vision.gd or a Rust port thereof). That's
+exactly the duplication the rail forbids.
+
+### What's actually needed
+
+Either:
+
+1. **`mc-vision` crate** (or similar) that owns "compute current visible
+   tile set for player P given GameState" as the single source of
+   truth. Both `ObservationStore::record_turn` callers and the
+   projector pull from this. Includes a `Visibility { Hidden, Fogged,
+   Visible }` query for any (player, tile, turn) tuple.
+
+2. **Widen `ObservationStore`** to include current visibility lookups
+   alongside the climate history. Doable but mixes concerns — climate
+   recording is one job, gameplay fog is another.
+
+The honest path is option 1. Surface area is moderate: walk all
+P-owned units + cities, compute hex-distance ≤ vision_radius per
+unit/city, union into a `HashSet<(col, row)>`, expose a `Visibility`
+enum that says "Visible if in current set, Fogged if in any prior
+set, Hidden otherwise."
+
+### Why Phase 12 stays open until then
+
+The projector currently uses strict-redaction fog (own-player-only).
+Without per-tile vision data, **all** enemy tiles are hidden, which
+matches "Hidden if never seen." The current behaviour is correct for
+"player who has never explored anywhere" — degenerate but not wrong.
+The wrong-ness only matters once units have moved and explored, and
+that path is also blocked by the AI behavioural-inertness gap from
+Phase 11's notes (units don't move past spawn). Fix in order:
+
+1. AI projector enrichment so units actually move and explore.
+2. `mc-vision` crate so fog has meaningful current/last-seen state.
+3. Phase 12 projection rework on top of (1) + (2).
+
+### Status
+
+p2-67 stays `partial`. Phases 0-11 landed. Phase 12 deferred behind
+the `mc-vision` follow-up objective. Phase 13 (MCP install + 25-turn
+demo + screenshot bundle) is also held — independent of fog
+correctness, but driving a 25-turn Claude run against an AI that
+returns to inertness on turn 2+ produces a degenerate demo (Claude
+moves, AI sits). Phase 13 unblocks alongside the AI projector
+enrichment.
+
+## 2026-05-12 — Phase 11 landed (TurnProcessor::step ticking)
+
+p2-68 closed all of Phase 10 (production AI driver replaces scripted heuristic).
+Phase 11 wires `TurnProcessor::step` into `apply_end_turn` between the AI
+loop and the closing `TurnStarted` emit so production, growth, research,
+founding, pending_move_requests, and fauna encounters all drain per turn.
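The ordering described above (AI loop, then `step`, then the closing `TurnStarted` emit) can be sketched as a toy model. Every type and function body here is a hypothetical stand-in, not the real `mc-turn` or `mc-player-api` code; in particular, the assumption that `TurnStarted` carries the post-increment turn number is illustrative only.

```rust
// Toy model only: stand-in types, not the real mc-player-api wire types.
#[derive(Debug, PartialEq)]
enum Event {
    AiActed { slot: usize, actions_applied: u32 },
    ProcessorTick,
    TurnStarted { turn: u32 },
}

struct State {
    turn: u32,
    ai_slots: Vec<usize>,
}

// Stand-in for the per-slot AI driver; a real driver would mutate state.
fn drive_ai_slot(_state: &mut State, slot: usize) -> Event {
    Event::AiActed { slot, actions_applied: 0 }
}

// Stand-in for TurnProcessor::step: it alone owns the turn increment.
fn step(state: &mut State) -> Vec<Event> {
    state.turn = state.turn.saturating_add(1);
    vec![Event::ProcessorTick]
}

// End-turn ordering: AI loop, then step, then the closing TurnStarted.
fn apply_end_turn(state: &mut State) -> Vec<Event> {
    let mut events = Vec::new();
    let slots = state.ai_slots.clone();
    for slot in slots {
        events.push(drive_ai_slot(state, slot));
    }
    events.extend(step(state));
    // Assumption: TurnStarted reports the turn number after step ran.
    events.push(Event::TurnStarted { turn: state.turn });
    events
}
```

Keeping the increment inside `step` means the dispatch layer never touches `state.turn` directly, which mirrors the single-source-of-truth intent stated for this phase.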
+
+### Shipped
+
+- **`mc_turn::processor::TurnProcessor::step` now owns per-turn unit refresh.**
+  `src/simulator/crates/mc-turn/src/processor.rs:528-535` — added
+  `crate::refresh_units(state)` at end-of-step. Single source of truth per
+  the DRY rule locked in Phase 9; the dispatch-level `refresh_units` call
+  is deleted in the same patch.
+- **`mc_player_api::dispatch::apply_end_turn` runs `step` after the AI loop.**
+  `src/simulator/crates/mc-player-api/src/dispatch.rs:258-281` — constructs
+  `TurnProcessor::new(u32::MAX)` (advisory `max_turns`; victory_config
+  overrides when present), calls `step(state)`, extends the response
+  `events` vec with translated processor events. The dispatch's
+  `state.turn = state.turn.saturating_add(1)` and `refresh_units(state)`
+  call sites are both deleted — `step` owns turn increment + unit refresh.
+- **`translate_processor_events` translator** at
+  `dispatch.rs:295-368`. Maps 5 `mc_replay::TurnEvent` variants to
+  `wire::Event`: `TechResearched`, `WonderBuilt`, `CityFounded`,
+  `CityCaptured`, `GameOver`. `ClanId(u32)` is sourced from
+  `processor.rs:910` as `pi as u32` so the clan→player mapping is
+  `id.0 as PlayerId` with no separate table needed. Variants without a
+  direct wire counterpart (AmbientEncounterFired, UnitKilled, War/Peace,
+  Era, Leader, ClanEliminated, UnitCaptured, UnitRansomOffered,
+  CivilianDestroyed) are listed in an explicit drop arm so adding a new
+  `TurnEvent` variant forces a compile-time decision.
+- **Cargo dep `mc-replay`** added to `mc-player-api/Cargo.toml`.
+
+### Tests + gate
+
+- `cargo test -p mc-player-api --lib`: 77 passed (was 74, +3 new):
+  - `end_turn_ticks_city_food_growth_via_turn_processor` — 2-turn
+    food accumulation crosses growth threshold (pop 1 → 2).
+  - `end_turn_completes_queued_unit_via_turn_processor` — city with
+    `production_stored=100` + `Queueable::Unit{dwarf_warrior}` spawns a
+    unit after one EndTurn (`player.units.len()` grows).
+  - `end_turn_refreshes_unit_movement_via_turn_processor` — unit with
+    `movement_remaining=0` and `base_moves=32` refreshes to 32 after step.
+- `cargo test -p mc-turn --lib`: 207/207 still green (no regression
+  from adding `refresh_units` to end-of-step).
+- `cargo check --workspace`: clean (pre-existing 17 doc-comment warnings).
+
+### Smoke confirms ticking
+
+Re-ran the 3-player apricot smoke at the Phase-11 commit (gdext rebuild
++ class-cache refresh + 5 EndTurns). Claude's view across turns 0..5
+showed visible state advancement:
+
+- `food_stored`: 0 → 2 → 4 → 6 → 8 → 10 (net +2/turn)
+- `gold`: 60 → 68 → 76 → 84 → 92 → 100 (+8/turn)
+- `unit_count`: 3 → 3 → 4 → 5 → 6 → 6 (production threshold spawns)
+- `science_per_turn`: 0 → 42 (strategic_axes kicked in post-step)
+
+This is the direct, observable consequence of Phase 11. Pre-Phase-11
+smokes showed every field static across all 5 turns.
+
+### Honest finding — AI side still inert (separate from Phase 11)
+
+Same smoke surfaces `actions_applied=0` on the AI side (slots 1+2) for
+turns 1-4 despite Phase 11 wiring step. Turn 0 still produces 1 action
+per slot (the founding pass).
+
+This contradicts the p2-68 Wave-final hypothesis ("the bench doesn't tick,
+that's why the AI sees nothing to do"). Wave-final was partially wrong:
+the bench DOES tick visibly for Claude. The AI's inertness is a deeper
+issue — `decide_tactical_actions` on the bench projection bottoms out
+after the founding pass because:
+- `unit_catalog` is empty in the bench-projector (p2-68 Wave 1
+  documented limitation),
+- `(food, prod, gold)` per-tile yields are zero in the bench projection,
+- the unit move queue is empty because the AI projector has no per-tile
+  cost data.
+
+Phase 11 closes the "step doesn't tick" issue. The "AI is behaviorally
+inert past turn 0" issue is its own follow-up. Recommendation: open a
+new objective `pX-bench-projector-enrichment` to widen `project_tactical`
+with unit_catalog + per-tile yields + movement-cost data so
+`decide_tactical_actions` has a non-degenerate search space on the bench.
+
 ## 2026-05-11 — Phase 10 STOP (structural blocker; documented)
 
 Phase 10 cannot land as a thin dispatch swap. Per the user's
diff --git a/.project/objectives/p2-68-mc-ai-headless-turn-driver.md b/.project/objectives/p2-68-mc-ai-headless-turn-driver.md
index a9b9ed6c..5e19fa1c 100644
--- a/.project/objectives/p2-68-mc-ai-headless-turn-driver.md
+++ b/.project/objectives/p2-68-mc-ai-headless-turn-driver.md
@@ -152,7 +152,7 @@ count
 - ✓ `mc-player-api` no longer contains `run_scripted_ai_turn` — call site replaced. Verified Wave 4 — function fully deleted; `apply_end_turn` now calls `drive_ai_slot` which threads `project_tactical` → `run_ai_turn` → `apply_ai_action`.
 - ✓ Headless harness loads `ai_personalities.json` at boot. Verified Wave-final 2026-05-11 — `claude_player_main.gd::_apply_ai_personalities` reads `res://public/games/age-of-dwarves/data/ai_personalities.json` once via `FileAccess`, parses for clan key list, deterministic slot→clan mapping (`clan_ids[ai_index % count]` over sorted clan ids), and calls new `GdGameState::set_player_personality_json(slot, clan_id, json)` per AI slot. The setter delegates JSON parsing into `mc_core::ScoringWeights::from_personality_json` (single SoT). 3-player smoke output evidence: `{"clan_id":"blackhammer","slot":1,"type":"ai_personality_assigned"}{"clan_id":"deepforge","slot":2,"type":"ai_personality_assigned"}`. Commit `2de1880db`.
 - ✓ `cargo check --workspace` green. Evidence: `cargo check --manifest-path src/simulator/Cargo.toml --workspace` → `Finished dev profile in 2.75s` (17 doc-comment warnings, 0 errors). Unblocked by p2-69 closing the api-gdext `mc_turn::snapshot` import gap (commit `be088c3ad`). `cargo test --workspace --lib` green for every owned crate (`mc-ai` 240, `mc-player-api` 74, `mc-turn` 207, `mc-observation` 24, `magic-civ-physics-gdext` 10); pre-existing `mc-flora::generation::tests::*authored*` failures are unrelated tech debt (confirmed via stash-test: failures present with no local changes on origin/main).
-- ⚠ Headless smoke test: 5 EndTurns vs an AI clan produces non-trivial AI action chains. **Plumbing verified; action-depth limited to turn 0 pending p2-67 Phase 11.** Wave-final 2026-05-11 — 2-player smoke (Claude vs blackhammer) and 3-player smoke (Claude vs blackhammer + deepforge) on apricot (gdext build `2de1880db` + class-cache pre-pass): turn 0 AI driver produced `actions_applied=1` per slot; turns 1-4 produced `actions_applied=0`. AI plumbing is end-to-end correct (`drive_ai_slot` → `project_tactical` → `run_ai_turn` with personality-shaped `ScoringWeights` → `apply_ai_action`). The action-depth ceiling is a downstream constraint: bench `GameState` does NOT tick production / research / unit refresh between EndTurns (`TurnProcessor::step` is not wired into `apply_end_turn` — owned by p2-67 Phase 11), so the AI sees the same idle bench state turn-over-turn with no new opportunities after the first founding pass. Output captured at `apricot:/tmp/wave1-smoke-output.txt` + `wave1-smoke-3p-output.txt`. **Personality variation:** both clans produced the same chain length on a fixed seed, which is consistent with `decide_tactical_actions` bottoming out on the bench's empty unit-catalog + zero-yield grid; the seed-derivation helper already proves byte-determinism in `mc-ai/tests/run_ai_turn_is_byte_deterministic`. Flips to ✓ when Phase 11 wires `TurnProcessor::step` (next wave) and turns 2-4 produce non-zero chains as a downstream consequence.
+- ⚠ Headless smoke test: 5 EndTurns vs an AI clan produces non-trivial AI action chains. **Plumbing verified end-to-end; AI behavioural inertness past turn 0 is a separate (deeper) issue.** Re-run 2026-05-12 at p2-67 Phase 11 commit `ff7198346` (TurnProcessor::step now ticks every EndTurn): Claude's view advances visibly turn-over-turn (food_stored 0→10, gold 60→100, unit_count 3→6, science_per_turn 0→42 — recorded at `apricot:/tmp/wave2-smoke-output.txt`), confirming the original Wave-final diagnosis ("bench isn't ticking, AI sees the same state every turn") was partially wrong. AI side still emits `actions_applied=1` on turn 0 per slot and `=0` on turns 1-4. Root cause is NOT a step-ticking gap; it is the bench projector's degenerate search space — `project_tactical` populates an empty unit_catalog, zero per-tile yields, and no move-cost data, so `decide_tactical_actions` returns empty after the turn-0 founding pass. Open follow-up `pX-bench-projector-enrichment` to widen the projector. Flips to ✓ once the projector has enough surface for `decide_tactical_actions` to find work past turn 0.
 - ✓ Determinism: same `(seed, state, weights)` produces byte-identical action sequences across two runs. Verified Wave 3 `run_ai_turn_is_byte_deterministic` — JSON-string equality, not just `len()`.
 
 **Status:** `partial` (8/9 ✓, 1 ⚠ conditional on p2-67 Phase 11). Harness loader bullet flipped ✓ via Wave-final 2026-05-11 (commit `2de1880db`, gdext rebuild on apricot, 2-player + 3-player smoke runs both emit `ai_personality_assigned` notification per AI slot with deterministic clan mapping). 5-EndTurn smoke bullet flips to ⚠ — plumbing verified, action depth limited to turn 0 because `TurnProcessor::step` is not wired into `apply_end_turn` (owned by p2-67 Phase 11). Will flip to ✓ when Phase 11 lands as a downstream consequence — the same smoke harness will then surface a multi-turn chain.
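The deterministic clan mapping mentioned in the status line (`clan_ids[ai_index % count]` over sorted clan ids) can be sketched as below. The real mapping lives in the GDScript harness; this Rust version, the `assign_clans` name, and the two-clan input used for illustration are all assumptions.

```rust
/// Hypothetical helper: give each AI slot a clan id by cycling through the
/// sorted clan id list, so the same inputs always yield the same pairing.
fn assign_clans(mut clan_ids: Vec<String>, ai_slots: &[usize]) -> Vec<(usize, String)> {
    // Sorting removes any dependence on the key order of the parsed JSON.
    clan_ids.sort();
    if clan_ids.is_empty() {
        // No clans available: nothing to assign (also avoids a modulo by zero).
        return Vec::new();
    }
    let count = clan_ids.len();
    ai_slots
        .iter()
        .enumerate()
        // ai_index counts AI slots only, so it starts at 0 even though the
        // first AI player sits in slot 1.
        .map(|(ai_index, &slot)| (slot, clan_ids[ai_index % count].clone()))
        .collect()
}
```

With the illustrative ids `deepforge` and `blackhammer` and AI slots 1 and 2, the sorted order pairs slot 1 with `blackhammer` and slot 2 with `deepforge`, matching the `ai_personality_assigned` lines quoted in the smoke evidence.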