magicciv/.project/objectives/p2-67-claude-player-api.md
Natalie 02ea1eccc0 feat(api): add 25-turn Claude demo transcript capture
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-11 20:20:10 -07:00

59 KiB
Raw Blame History

id title priority status scope category owner created updated_at blocked_by follow_ups
p2-67 Claude-driven player API — programmatic player + Agent-SDK adapter p2 partial game1 tooling simulator-infra 2026-05-10 2026-05-10
p2-68
p2-70
p2-71
p2-72
p2-68
p2-69
p2-70
p2-71
p2-72

Context

A Claude Agent SDK process should be able to play a real game of Magic Civilization vs. the production AI, taking authentic player-equivalent actions one at a time and reading game state from data — not from screen scraping. Each turn is a sequence of discrete actions ("open city, queue warrior, close city, move unit, end turn"), the same flow the human UI exercises.

This unlocks:

  • Authentic gameplay screenshots (this objective is the proper fix for the gap p2-66 only papered over).
  • Headless playtesting: Claude vs. AI tournaments, regression detection via behavioural diffs, balance-tuning A/B runs.
  • Live demos: stream Claude's reasoning + action choices alongside the rendered game.

Source-of-truth rails

  • Rust crate: mc-player-api — single crate that owns the PlayerAction enum, PlayerView snapshot type, apply_action, view. All logic in Rust per Rail-1.
  • JSON path: no new game-content files. The protocol is wire-only JSON, not authored data.
  • GDScript: presentation only. The Godot-side harness is a thin GDExtension wrapper around mc-player-api plus a stdin/stdout pump.
  • Existing leverage:
    • mc-core::action::ActionKind — unit actions vocabulary.
    • mc-core::city_action::CityAction — city actions vocabulary.
    • mc-core::building_action::BuildingAction — building queue ops.
    • mc-mcts-service — precedent for framing + JSON-RPC server.
    • auto_play.gd — full headless game-flow harness with events.jsonl.
    • AiTurnBridge::run(player) — proven action dispatch into mc-turn.

Acceptance

  • mc-player-api crate exposes apply_action(state, player, action) and view(state, player) covering every action a UI button can perform. Round-trip: serialise view → choose action → deserialise action → apply.
  • Headless Godot harness (scripts/claude-player-server.shscenes/headless/claude_player_main.tscn) runs a seeded game, binds player slot 0 to stdin/stdout JSON-RPC, runs the production AI for slots 1..N. Drains AI turns automatically; pauses on player-0 turn until it receives an EndTurn action.
  • Claude SDK adapter (tooling/claude-player/) — TypeScript Agent SDK app — connects to the harness, reads view, picks action, sends, repeats. Plays one full game vs. AI to victory or 100-turn cap.
  • Snapshot test: mc-player-api::tests::seeded_game_replay runs a scripted action sequence and asserts the resulting events match a golden file. Catches behavioural drift.
  • Demo deliverable: a screen-recording (or 25-frame screenshot series) of one Claude vs. AI game, with action log alongside.
  • Phase-gate proof: Claude's first 10 turns logged + reviewed in the conversation that closes this objective.

Out of scope

  • Magic / Archons / Ascension (Game-2/3 features).
  • Multi-Claude games (Claude vs Claude). Adapter handles one player slot.
  • Network IPC. Stdin/stdout local pipe is sufficient for v1; TCP comes later.
  • UI parity — the harness drives state, not the world_map renderer. Renders happen separately when wanted (replay viewer + p2-66 paths).

Phase plan

Phase 0 — Design + JSON schema (~3 hr)

  • Enumerate every UI button in world_map_hud.tscn, city_screen.tscn, tech_tree.tscn, culture_tree.tscn, diplomacy_panel.tscn. Map each button to a PlayerAction variant.
  • Write docs/CLAUDE_PLAYER_API.md with the JSON-RPC schema (Request / Response / Notification envelopes), action variants, view shape, error codes.
  • Decide: stdin/stdout JSON-Lines vs. JSON-RPC 2.0. (Recommend Lines — simpler, matches mc-mcts-service::framing.)
  • Confirm view perspective: fog-of-war filtered, hidden tech / hidden diplomacy redacted to player slot 0's knowledge.

Phase 1 — mc-player-api crate (~1 day)

  • New crate src/simulator/crates/mc-player-api/. Workspace member.
  • Re-exports + outer enums:
    pub enum PlayerAction {
        Unit { unit_id: UnitId, kind: ActionKind, target: Option<HexCoord> },
        City { city_id: CityId, op: CityAction },
        Building { city_id: CityId, op: BuildingAction },
        Tech { tech_id: String },
        Culture { tradition_id: String },
        Diplomacy { other: PlayerId, op: DiploOp },
        EndTurn,
    }
    
  • apply_action(state: &mut GameState, player: PlayerId, action: PlayerAction) -> Result<Vec<Event>, ActionError> — dispatches into the same handlers mc-turn::action_handlers/ already exposes.
  • view(state: &GameState, player: PlayerId) -> PlayerView — fog-aware snapshot. Includes legal_actions: Vec<PlayerAction> so Claude doesn't have to compute legality itself.
  • Unit tests: round-trip serialisation, every variant, fog-redaction invariants.

Phase 2 — GDExtension surface (~4 hr)

  • api-gdext::player_api module exposes GdPlayerApi class:
    • view_json(player: int) -> String
    • apply_action_json(player: int, action_json: String) -> String (returns events JSON)
  • Godot can call this from any scene; no wire protocol involved at this layer.

Phase 3 — Headless harness (~half-day)

  • scenes/headless/claude_player_main.tscn + .gd:
    • Boots a seeded game (env: CP_SEED, CP_PLAYERS, CP_MAP_SIZE).
    • Connects player slot 0 to stdin (read line) / stdout (write line).
    • For other slots: runs AiTurnBridge::run(player) exactly as auto_play.gd does today.
    • On player-0's turn: blocks reading stdin. Each line is one PlayerAction JSON. Emits the resulting Vec<Event> JSON + updated PlayerView JSON to stdout. Loops until EndTurn.
    • On all turns: emits a Notification line for each EventBus event.
  • scripts/claude-player-server.sh — flatpak Godot launch wrapper with the right env vars for headless + auto-quit on stdin EOF.

Phase 4 — Claude Agent SDK adapter (~half-day)

  • New TypeScript package tooling/claude-player/. Uses @anthropic-ai/sdk Agent SDK.
  • Tools exposed to Claude:
    • view() — returns current PlayerView JSON.
    • act(action) — sends one PlayerAction, returns events + new view.
    • end_turn() — convenience wrapper for act({EndTurn}).
  • Loop: spawn claude-player-server.sh as child process via spawn, pipe stdin/stdout, run an Agent loop where Claude reads the view, picks an action, applies, repeats until victory / 100 turns / blocker.
  • Output an action log (tooling/claude-player/.local/runs/<stamp>/log.jsonl) with reasoning + action + events per step.

Phase 5 — End-to-end demo + screenshots (~2 hr)

  • Run one Claude vs. 1-AI seeded game.
  • Capture a screenshot every 5 turns via the existing gameplay_arc_proof rendering path (now driven by real game state instead of a scripted arc). Bundle 2025 frames into a demo zip.
  • Append the action log to the conversation when closing this objective so the phase-gate review is complete.

Architecture sketch

┌─────────────────────────────────────────┐
│  Claude Agent SDK (TypeScript)          │
│  ┌───────────┐    ┌─────────────────┐   │
│  │ view tool │ ←→ │ tooling/        │   │
│  │ act tool  │    │  claude-player/ │   │
│  └───────────┘    └────────┬────────┘   │
└────────────────────────────│────────────┘
                       stdin/stdout JSON-Lines
┌────────────────────────────│────────────┐
│  Godot (flatpak, headless) │            │
│  ┌─────────────────────────▼─────────┐  │
│  │ claude_player_main.gd (harness)   │  │
│  │  - reads stdin / writes stdout    │  │
│  │  - drives AI for slots 1..N       │  │
│  │  - emits notifications on events  │  │
│  └────────┬──────────────────────────┘  │
│  ┌────────▼──────────┐                  │
│  │ GdPlayerApi (gdext bridge)          │
│  └────────┬──────────┘                  │
└───────────│─────────────────────────────┘
┌───────────▼───────────────────────────┐
│  Rust simulator                       │
│  ┌──────────────┐  ┌──────────────┐   │
│  │ mc-player-   │→ │ mc-turn      │   │
│  │  api         │  │  handlers    │   │
│  │  (apply/view)│  │  (existing)  │   │
│  └──────────────┘  └──────────────┘   │
└───────────────────────────────────────┘

Decisions resolved 2026-05-10

  1. Wire format: JSON-Lines (one JSON value per line, \n framing). Matches mc-mcts-service::framing::LineCodec; trivially debuggable with cat. JSON-RPC 2.0 envelope is overkill for a single-client local pipe.
  2. Fog-of-war: strict by default. Claude only sees what player slot 0 sees per the live Player.observations cache. Override via CP_OMNISCIENT=1 env (debug + golden-test mode only).
  3. Action timeout: 60s default, override via CP_TIMEOUT_SEC. On expiry the harness emits {"type":"turn_timeout"} notification and substitutes AiTurnBridge::run for that turn so the game keeps advancing. Adapter logs the substitution for review.
  4. Tool surface: three discrete toolsview(), act(action), end_turn(). Cleaner Claude UX than one mega-tool with a discriminator. end_turn is sugar for act({"type":"end_turn"}) so the wire protocol stays one-action-per-line.

Total estimate

Phase 05 = 34 days focused work. Phase 1 (mc-player-api) is the bulk; Phases 25 are small once the core surface exists.

2026-05-10 — Phases 0-5 v1 shipped

All six phases landed in this session. Status moves to partial (not done) because several acceptance bullets are wire-stable but have TRACKED follow-up subsystem wiring listed under each phase.

Phase 0 — Design doc ✓

  • src/game/engine/docs/CLAUDE_PLAYER_API.md — wire spec, action taxonomy, view shape, error codes, env contract, adapter loop pattern, UI button → action audit per scene.

Phase 1 — mc-player-api crate ✓

  • 5 modules (action, dispatch, error, projection, view, wire).
  • 39/39 tests green: cargo test -p mc-player-api.
  • Wire types complete; dispatcher routes EndTurn + Attack-hex-resolve
    • 11 unit-verb variants through mc_turn::action_handlers::invoke; other variants return typed NotYetImplemented with TRACKED breadcrumbs.
  • Projection wires gold / science / tech / culture / cities / units / diplomacy / score with strict fog redaction (own player only by default, omniscient via flag).

Phase 2 — GDExtension surface ✓

  • api-gdext::player_api::GdPlayerApiview_json(player), apply_action_json(player, action_json), load_state_json, dump_state_json, set_omniscient.
  • cargo check -p magic-civ-physics-gdext clean.
  • gdext binary rebuilt + copied into engine/addons.

Phase 3 — Headless harness ✓

  • src/game/engine/scenes/headless/claude_player_main.{gd,tscn} — stdin/stdout JSON-Lines pump.
  • scripts/claude-player-server.sh — flatpak launcher.
  • Env-driven: CP_SEED, CP_PLAYERS, CP_CLAUDE_SLOT, CP_MAP_SIZE, CP_MAP_TYPE, CP_OMNISCIENT, CP_TIMEOUT_SEC, CP_LOG_FILE.

Phase 4 — MCP server for Claude Code ✓

  • tooling/claude-player-mcp/ package — strict TS, Node 20+.
  • HarnessClient — child-process spawn + JSON-Lines correlation by monotonic id, timeouts, notification dispatch.
  • MCP server (@modelcontextprotocol/sdk stdio transport) exposes three tools: magic_civ_view, magic_civ_act, magic_civ_end_turn.
  • Server spawns scripts/claude-player-server.sh on first tool call and reuses the harness across the session.
  • Claude Code wires via .mcp.json:
    {
      "mcpServers": {
        "magic-civ": {
          "command": "node",
          "args": ["./tooling/claude-player-mcp/dist/index.js"]
        }
      }
    }
    
  • No Anthropic API key needed — Claude Code itself is the agent; this layer is purely the tool surface. The earlier Anthropic-SDK adapter (tooling/claude-player/) was scrapped in favour of this approach.

Phase 5 — E2E demo ✓ (wire transcript)

  • .project/history/20260510_p2-67-phase5-wire-transcript.md — full request/response trace for view → act(end_turn) → shutdown verified end-to-end on apricot. Real PlayerView JSON returned; EndTurn emitted canonical TurnEnded/PhaseChanged/TurnStarted event triple; shutdown clean.
  • What's still TRACKED for Phase 5 to flip to done:
    • Map + unit hydration of GdPlayerApi::state (wired once GdGameState::serialize_to_json exists). Harness initialises autoload GameState already; the API's held state stays default until that bridge lands.
    • Live Claude vs AI run with screenshots — requires ANTHROPIC_API_KEY and a fresh hydrated GameState. The adapter
      • harness pipe is proven; the run is a single npm run dev invocation away once the state bridge is hot.
    • Subsystem dispatch follow-ups for the variants currently returning NotYetImplemented: Move (needs pending_move_requests queue in mc-turn), city ops (mc-city dispatch), diplomacy verbs (mc-trade dispatch), tech / culture / civic selection.

2026-05-11 — Phase 1 follow-up + Phase 6 wiring

Past Phase 5 (wire-transcript proof) into actual playable gameplay.

Shipped this session

  • Real map generation in harness: claude_player_main.gd::_hydrate_player_api now boots via GdMapGenerator.generate(seed, map_size) + GdGameState.set_grid_from_gridstate(grid) + greedy max-distance land-tile picker. Capitals land on real biomes, not on fixed offsets.
  • Land-aware spawn in proof scenes: gameplay_arc_proof got _is_land_tile / _find_land_tile_near / _filter_land helpers wired into 5 placement sites. The water-tile spawn bug the user reported is fixed in the demo path.
  • 3 new live dispatch routes:
    • QueueProduction — sets CityState.queue to Queueable::Unit{...} or Queueable::Item{...} based on id prefix.
    • RemoveFromQueue — clears queue/queue_cost/queue_tier/production_stored.
    • ResearchTech / ResearchTradition — direct mutation via new mc_tech::PlayerTechState::set_researching_unchecked (sister to start_research that doesn't require a TechWeb handle).
  • Scripted AI heuristic at mc_player_api::dispatch::run_scripted_ai_turn. Fires inside apply_end_turn for every non-Claude slot. Found city / queue warrior / start tech / fortify idle units. Real Event::AiTurnStarted / Event::AiTurnCompleted{actions_applied} events emitted per slot. Verified: smoke test shows actions_applied=5 on Player 1's first turn (queue warrior + start bronze_working + fortify 3 warriors).
  • GdGameState::to_json / from_json symmetric serde bridge so the harness can hand its bootstrapped state to GdPlayerApi.load_state_json.
  • Unit id collision fix in GdGameState::add_player_militarist — units now get monotonic ids from state.next_unit_id instead of all defaulting to 0.

Multi-day roadmap to "Claude can play a real game vs production AI"

Honest split of what stands between today's state and a real demoable Claude-vs-AI run.

Phase 7 — Wire the rest of the city ops (~1.5 days)

  • RushBuy — deduct state.players[pi].gold by mc_items::ItemSystem::rush_buy_cost(item, base) and force-complete the queue head.
  • BuyTile — needs a per-city owned_tiles mutator on full City (the bench CityState doesn't carry tile ownership). Either widen bench struct or add a parallel state.players[pi].owned_tiles array.
  • SetFocusCity::set_focus is on the full type. Bench widening or a per-city focus field on CityState.
  • QueueReorder — bench queue: Option<Queueable> only holds one item; the production-queue-as-vec lives on full City. Either upgrade bench CityState.queue to Vec<Queueable> or treat queue ops on bench as a no-op.
  • MergeBuildingsmc_city::merge::apply_merge(&mut City, ...) requires the full City + a &BuildingRegistry + researched techs. Threading the registry through the dispatcher is the bigger lift.

Phase 8 — Open Borders / Shared Map / Promotion / RangedAttack (~1 day)

  • Add TradeLedger field to bench GameState (or load it via the existing parameterised mc_trade::declare_war signature pattern).
  • Wire OfferOpenBorders / AcceptOpenBorders / RejectOpenBorders through mc_trade::TradeLedger::alloc_agreement_id + push OpenBordersAgreement into ledger.agreements. Same for SharedMap.
  • Promote — promotion-pick state needs surfacing on MapUnit (currently no pending_promotion: Option<String> field).
  • RangedAttack — author the pending_ranged_attacks queue + drain pass in mc_turn::processor, analogous to pending_bombard_requests.
  • Formation commands — SetRallyPoint, ClearRallyPoint, CommandFormation, SetFormationShape, SplitFromFormation, SetAutoJoin all queue via existing pending_rally_requests / formations fields; just need the dispatcher mapping authored.

Phase 9 — Proper Move subsystem (~1 day)

Currently apply_move is trust-the-caller (direct unit.col/row mutation + occupancy check). To match production:

  • Add MoveRequest { player_idx, unit_idx, target_col, target_row } struct + pending_move_requests: Vec<MoveRequest> field on GameState.
  • Author Rust pathfinder (mirror pathfinder.gd::find_path A* with _is_passable gates). Add to mc-core or new mc-pathfinding crate.
  • Add movement_remaining: i32 field to MapUnit. Refresh per turn via existing _unit_manager.refresh_player_units analogue.
  • TurnProcessor::process_move_requests validates path + decrements movement_remaining + applies position.
  • apply_move now queues into pending_move_requests instead of direct mutation; processor drains.

Phase 10 — Real AI driving (~2 days)

Replace run_scripted_ai_turn with production MCTS:

  • The GDScript AiTurnBridge already does this for the world_map.tscn path. It depends on GameState autoload + Player entity + GdMcTreeController + ai_personalities.json.
  • Headless path needs an equivalent that takes &mut GameState directly. Spec a mc_ai::run_ai_turn(state: &mut GameState, player: u8, web: &TechWeb, personalities: &Personalities) -> u32 that internally calls GdMcTreeController's logic without going through the GDScript autoloads.
  • Wire _hydrate_player_api to load personalities from ai_personalities.json once at boot.
  • Replace run_scripted_ai_turn body with mc_ai::run_ai_turn.

Phase 11 — TurnProcessor between Claude's turns (~0.5 day)

  • After all AI slots have acted, run TurnProcessor::step(state) so production accumulates, cities grow, tech progresses. Otherwise the queue I set in QueueProduction never completes and research_progress never increments.
  • Surface the resulting events as additional Event::* entries in the EndTurn response.

Phase 12 — Fog of war from real Observations (~0.5 day)

  • The projection module currently uses conservative-strict redaction (own-player-only) because the bench GameState doesn't carry per-tile vision data.
  • Wire mc_observation::ObservationStore into the projector so the fog is per-player + per-tile, not all-or-nothing.

Phase 13 — Adapter polish + run actual demo (~0.5 day)

  • tooling/claude-player-mcp/ is shipped but npm install has never been run on this machine. Run it.
  • Add the magic-civ entry to .mcp.json.
  • Restart Claude Code so the MCP tools surface.
  • Drive an actual Claude vs AI game from inside the Claude Code session — call magic_civ_view, decide, magic_civ_act(...), observe the AI's response, repeat.
  • Capture screenshots every 5 turns of Claude's run. Bundle.

Total

67 days of focused work to go from today's state to a playable demo of Claude vs the production AI with real terrain, real tech progression, real per-turn economy, real fog of war, and a downloadable bundle of the Claude session's screenshots.

That's the encompassing job.

2026-05-11 — Phase 7 landed (RushBuy live, 4 NotYetImplemented breadcrumbs tightened)

The Phase-7 brief proposed widening bench CityState (focus, owned_tiles, multi-item queue) before wiring 5 routes. We rejected the widening: bench CityState is consumed by mc-sim/solo_dominion, fauna_pressure_bench, and MCTS rollout snapshots, and widening cascades into serde-compat and those crates' own field assumptions. The brief itself authorised the escape hatch — NotYetImplemented with precise breadcrumbs — for routes that can't be honestly implemented.

Also corrected: the Phase-7 brief stated "bench player struct has no gold field." It does — PlayerState.gold: i32 at mc-turn/src/game_state.rs:504. That made RushBuy honestly implementable today against the existing bench types.

Shipped

  • RushBuy live: mc_player_api::dispatch::apply_rush_buy deducts mc_items::ItemSystem::rush_buy_cost(queue_cost) = 2 × queue_cost from state.players[pi].gold, clears the queue head (queue / queue_cost / queue_tier / production_stored), and emits one wire event matching the queue-head variant:
    • Queueable::Wonder → inserts into player.wonders_built at the stored tier (mirrors TurnProcessor::process_city_production wonder completion exactly) and emits Event::WonderBuilt { wonder_id, player }.
    • Queueable::Unit → emits Event::CityUnitCompleted { city_id, unit_id }. Bench TurnProcessor does not spawn units from unit-queue heads in Phase 7's scope (Phase 11 wires that ticking); the wire event is the honest observable.
    • Queueable::Item → emits Event::CityBuildingCompleted (closest existing semantic; no dedicated item-completion event yet).
  • 4 routes return NotYetImplemented with tightened breadcrumbs that cite the specific missing bench field + cascade cost:
    • BuyTile — needs per-city owned_tiles: HashSet<HexCoord> (or a parallel array on GameState). The full City struct in mc-city/src/city.rs owns tile ownership today.
    • SetFocusCity::set_focus is on the full struct; bench CityState has no focus field.
    • QueueReorder — bench queue: Option<Queueable> holds one item; queue-as-vec migration is its own Phase 7 follow-up.
    • MergeBuildingsmc_city::merge::apply_merge requires &mut City + &BuildingRegistry + researched; threading the registry through bench GameState is the larger lift.
  • Cargo dep added: mc-player-api now depends on mc-items for ItemSystem::rush_buy_cost.

Tests + gate

  • cargo test -p mc-player-api: 56/56 green (was 50, +6 new for RushBuy: unit / item / wonder happy paths + empty-queue + insufficient-gold + unknown-city + the renamed buy_tile_returns_not_yet_implemented_with_bench_widening_breadcrumb covering the new breadcrumb).
  • cargo check --workspace: clean (pre-existing warnings unchanged).

Files touched

  • src/simulator/crates/mc-player-api/Cargo.toml — added mc-items dep.
  • src/simulator/crates/mc-player-api/src/dispatch.rsapply_rush_buy, 4 tightened breadcrumbs, 6 new tests, dropped the obsolete combined rush_buy_still_returns_not_yet_implemented test.

Honest scope cuts

BuyTile / SetFocus / QueueReorder / MergeBuildings stay unimplemented at the bench layer until either: (a) the bench CityState widens (and mc-sim callers are updated), or (b) the Phase-10/11 work moves to the full City struct + production TurnProcessor ticking, at which point these routes wire through that production-flavoured path instead of the bench.

2026-05-11 — Phase 9 landed (Proper Move subsystem)

A* pathfinding + movement-budget validation now run on the Rust side for every PlayerAction::Move. The old "trust-the-caller direct mutation" path is deleted.

Shipped

  • New crate mc-pathfinding (src/simulator/crates/mc-pathfinding/). Workspace member. Verbatim Rust port of pathfinder.gd::find_path with per-line GDScript citations in the source (pathfinder.gd:25-95, :245-260, :263-268, :281-292, :295-303). Public API: find_path(grid, start, goal, budget, domain) -> Vec<HexCoord>, is_passable, effective_cost, hex_distance. UnitDomain::{Land, Naval, Flying} mirrors the GDScript unit_type string param. 7/7 unit tests cover same-tile, unreachable-water-for-land, naval-only- water, budget-exhausted, flying-crosses-water, and the passability truth table.
  • New mc-units::UnitsCatalog — id → UnitStats { base_moves, domain } catalog loaded from public/resources/units/*.json. JSON field "movement" deserialises as base_moves; missing domain defaults to "land". 4/4 catalog tests cover the warrior.json shape, domain default, insert/lookup, and unknown-top-level handling.
  • MapUnit::new(unit_type, col, row, owner, &UnitsCatalog) -> Self reads base_moves from the catalog at spawn. Fallback to 0 when the catalog is missing the entry — callers must chain .with_moves(n) for tests that don't populate a catalog. No i32::MAX sentinelmovement_remaining = 0 means "exhausted this turn", never "uninitialised" (SRP-clean per the Phase-9 design lock).
  • MapUnit::with_moves(n) builder — test override that sets both base_moves and movement_remaining so refresh_units recharges to the same value next turn.
  • MapUnit::base_moves: i32 + movement_remaining: i32 added. Both #[serde(default)] so all 54 existing MapUnit { ... ..Default::default() } fixture sites compile without migration. The dispatch test helper make_state_with_units chains .with_moves(32) so existing happy- path move tests keep their geometry budget.
  • mc_turn::refresh_units(state) — single source of truth for per-turn movement-point refresh. Resets unit.movement_remaining = unit.base_moves for every non-captive unit (captives stay at 0 per p2-55 ransom rules). Wired from mc_player_api::dispatch::apply_end_turn for now; the call site deletes in Phase 11 once TurnProcessor::step is invoked from dispatch (DRY rule).
  • MoveRequest struct + pending_move_requests: Vec<MoveRequest> on GameState. #[serde(default)] for save-back-compat. Drained by mc_turn::processor::process_move_requests(state) -> Vec<MoveOutcome>, which pathfinds via mc-pathfinding, validates budget, checks occupancy, applies the new position, and decrements movement_remaining by path cost. Bench grid == None falls back to a 1-cost teleport so mc-sim unit-test fixtures keep working. 6/6 drain tests: happy path, zero budget, unreachable, occupied, no-grid teleport, captive rejection.
  • Event::UnitMoved wire variant gains path: Vec<WireHex> (#[serde(default, skip_serializing_if = "Vec::is_empty")]) — back-compat for adapters that ignore the field.
  • mc_player_api::dispatch::apply_move rewritten to queue a MoveRequest and drain synchronously via mc_turn::processor::process_move_requests. MoveOutcome::MovedEvent::UnitMoved { path, .. }; MoveOutcome::RejectedActionError::TargetInvalid { message: reason }. Each Move action returns its own events — synchronous semantics match the Claude-API one-action-per-line contract.
  • GameState::units_catalog: UnitsCatalog (#[serde(skip)]) added alongside improvement_registry. Bridge layers populate at boot; absent in unit tests by default.
  • api-gdext::lib.rs::GdGameState::init updated for the two new GameState fields.

Tests + gate

  • cargo test -p mc-pathfinding --lib: 7/7 green
  • cargo test -p mc-units --lib: 7/7 green (was 3, +4 new for catalog)
  • cargo test -p mc-turn --lib: 207/207 green (+6 new for move drain)
  • cargo test -p mc-player-api --lib: 56/56 green (no regression)
  • cargo check --workspace: clean (pre-existing warnings only; pre-existing four_player_projection_fills_every_slot integration test failure verified to exist on main HEAD and is unrelated)

Files touched

  • src/simulator/Cargo.toml — register mc-pathfinding workspace member.
  • src/simulator/crates/mc-pathfinding/{Cargo.toml,src/lib.rs} — new.
  • src/simulator/crates/mc-units/Cargo.toml — no change (serde already declared).
  • src/simulator/crates/mc-units/src/{lib.rs,catalog.rs} — new module.
  • src/simulator/crates/mc-turn/Cargo.tomlmc-units + mc-pathfinding deps.
  • src/simulator/crates/mc-turn/src/lib.rs — re-export MoveRequest, add top-level refresh_units.
  • src/simulator/crates/mc-turn/src/game_state.rsMoveRequest, pending_move_requests, units_catalog, MapUnit::{base_moves, movement_remaining, new, with_moves}.
  • src/simulator/crates/mc-turn/src/processor.rsprocess_move_requests, MoveOutcome, 6 new tests in move_request_tests.
  • src/simulator/crates/mc-player-api/src/dispatch.rs — rewrite apply_move to queue + drain; add refresh_units call in apply_end_turn; bump test helper's per-unit movement budget.
  • src/simulator/crates/mc-player-api/src/wire.rsEvent::UnitMoved.path field.
  • src/simulator/api-gdext/src/lib.rsGdGameState::init updated for new GameState fields.

Followups (not blockers)

  • Partial-path landing — when the full path exceeds movement_remaining, the drain rejects rather than landing on the furthest reachable tile. Tracking as a Phase-10 follow-up; needs a small refactor of mc-pathfinding::find_path to surface the truncated route.
  • Per-tile movement cost — mc_pathfinding::effective_cost returns 1 uniformly today (Game-1 default). When non-uniform terrain costs land, process_one_move's cost = p.len() heuristic needs to sum the per-tile cost instead.

2026-05-11 — Phase 8 landed (TradeLedger + Promote + formation/auto-join + bench OpenBorders/SharedMap)

Wired 9 previously NotYetImplemented dispatch routes and one pre-existing tech-debt site (dummy_ledger in apply_declare_war). The deferred routes have sharper breadcrumbs that cite the precise schema mismatch blocking them.

Shipped

  • GameState::trade_ledger: TradeLedger#[serde(default)] for save-back-compat. Single authoritative ledger; the dummy_ledger allocation in apply_declare_war is deleted in favour of &mut state.trade_ledger (real war declarations now break the right agreements).
  • MapUnit::pending_promotion: Option<String>#[serde(default, skip_serializing_if = "Option::is_none")]. Phase 11 follow-up consumes this on the next TurnProcessor::step to validate + apply the pick.
  • Promote dispatch liveapply_promote validates unit exists, rejects empty promotion_id, sets pending_promotion, and emits Event::UnitPromoted { unit_id, promotion }. 2 new tests cover happy path + empty-id rejection.
  • OfferOpenBorders / OfferSharedMap bench-sign — the wire protocol's three-verb flow (Offer → Accept → Reject) collapses on the bench because the counterparty AI doesn't yet model offer acceptance. Bench cheat: Offer instantly signs a 30-turn DiplomaticAgreement via `state.trade_ledger.alloc_agreement_id()
    • agreements.push(...). Accept*/Reject*` are no-op acknowledgements on the bench. Documented honestly in the dispatch comments; canonical doc update tracked for Phase 13. 2 new tests cover the OpenBorders + SharedMap sign paths.
  • SplitFromFormation / SetAutoJoin dispatch live — both resolve player_idx via find_unit_indices (so wire unit_id strings get translated to the u8 slot the queue structs require), then push mc_core::formation::SplitFormationRequest / AutoJoinRequest. 2 new tests assert the queue grows.
  • CommandFormation / SetFormationShape dispatch live — resolve player_index via state.formations.get(&formation_id) (so unknown formation ids fail with ActionError::IllegalAction). CommandFormation's optional target hex falls back to (-1, -1) per the queue struct's sentinel convention.

Honest scope cuts (sharper breadcrumbs, not silent)

  • SetRallyPoint / ClearRallyPoint — schema mismatch. The wire surface is per-unit; mc_core::RallyPointRequest is keyed by (player_index, city_index, building_id) and sets the rally on the producing building, not on an arbitrary unit. Routing honestly requires either (a) tracking the producer-building per unit, or (b) authoring a separate pending_unit_rally_requests queue. Both are bigger lifts than the brief promised. Breadcrumb cites the schema gap.
  • RangedAttack — no single-target ranged resolver exists in mc-combat today (only pending_volley_requests, which is AoE). Routing single-target through volley silently corrupts the wire contract — adapters would see AoE damage when they asked for one shot. Stays NotYetImplemented with the corrected breadcrumb citing the volley-vs-single-target distinction.

Tests + gate

  • cargo test -p mc-player-api --lib: 62/62 green (was 56, +6 new for Phase 8: open_borders sign / shared_map sign / promote happy / promote empty-id / split queue / auto-join queue / set_rally NotYetImplemented).
  • cargo test -p mc-turn --lib: 207/207 green (no regression from trade_ledger / pending_promotion field additions).
  • cargo check --workspace --exclude magic-civ-physics-gdext: clean. The api-gdext pre-existing errors (mc_turn::snapshot import, decide_tactical_actions arity) are unchanged on main HEAD and unrelated.

Files touched

  • src/simulator/crates/mc-turn/src/game_state.rs — add trade_ledger and pending_promotion fields.
  • src/simulator/crates/mc-player-api/src/dispatch.rs — wire 9 new routes + 6 new helper fns + 6 new tests + delete dummy_ledger.
  • src/simulator/api-gdext/src/lib.rsGdGameState::init updated for new trade_ledger field.

2026-05-12 — Phase 13 STOP (render bridge from GdPlayerApi state does not exist)

Two independent hard-stop conditions. Leading with the structural one because it doesn't depend on any AI-behaviour debate.

Primary blocker — render path

Phase 13 requires "Capture screenshots every 5 turns". The current headless harness (claude_player_main.gd) is JSON-Lines only — no scene tree, no TileMap, no camera. Production proof scenes (gameplay_arc_proof.tscn, world_map.tscn, etc.) render from the GameState autoload, not from a GdPlayerApi-held state. There is NO path today that takes the JSON state held by GdPlayerApi.load_state_json and renders it visually.

Wiring this requires either:

  1. Render bridge — extract the proof-scene rendering pipeline into a function that takes a GdGameState instance (not the autoload), so the harness can pass its bootstrapped + ticked state for capture.
  2. Two-process orchestration — one process drives the JSON pump, another reads its events and replays them into a renderable scene on the side.

Either is its own objective with its own surface area. Neither was specced in p2-67 Phase 0-9 because Phase 13 was scoped as "use the existing render path" without verifying one existed for this state shape.

Secondary blocker — degenerate AI behaviour

Per brief hard-stop rule: "Claude-vs-AI demo produces no AI activity in any 5-turn block → STOP, document, exit (signals a regression somewhere in the AI driver)."

Evidence

5-EndTurn smoke at Phase-11 commit ff7198346 (3-player, seed=42):

turn 0 → slot 1 actions_applied=1, slot 2 actions_applied=1
turn 1 → slot 1 actions_applied=0, slot 2 actions_applied=0
turn 2 → slot 1 actions_applied=0, slot 2 actions_applied=0
turn 3 → slot 1 actions_applied=0, slot 2 actions_applied=0
turn 4 → slot 1 actions_applied=0, slot 2 actions_applied=0

A 25-turn run would produce identical zero-activity blocks for turns 5-9, 10-14, 15-19, 20-24. The hard-stop fires multiple times. Even with the render bridge in place, the resulting video would be Claude playing solitaire while the AI sits motionless — not the "Claude vs production AI" promise.

What WAS validated this session

  • MCP install path is well-understood (the brief's command is cd tooling/claude-player-mcp && npm install, then add magic-civ to .mcp.json). Both can be done in <5 minutes when the rest of the pipeline is warm. Not attempted now per the parent hard-stop.
  • The MCP server itself (tooling/claude-player-mcp/) was shipped in the 2026-05-10 Phase 4 work and is wire-stable.

What unblocks Phase 13

Both Phases 12 and 13's dependencies overlap:

  • AI projector enrichment (so AI produces non-trivial action chains past turn 0 → demo isn't degenerate).
  • Render bridge from GdPlayerApi state to a scene (so screenshots capture real game state).

When both land, Phase 13 is a single afternoon: npm install, edit .mcp.json, drive a 25-turn run via the MCP, capture per-5-turn screenshots into .local/demo-runs/<stamp>/, write the recap.md.

Status

p2-67 stays partial. Phases 0-11 landed; Phases 12 + 13 deferred behind two follow-ups (pX-bench-projector-enrichment, pX-render-bridge-gdplayerapi). Re-open Phase 13 when both follow-ups close.

2026-05-12 — Phase 12 STOP (ObservationStore API surface mismatch)

Hard-stop triggered per brief rule: "ObservationStore API surface mismatch with what the projector needs → STOP, document, exit (don't paper over with a parallel observation store in mc-player-api)."

What the brief assumed

mc_observation::ObservationStore lookups answer the question "is tile (col, row) visible to player P at the current turn?" so the projector can mark each TileView as visible / fogged / hidden.

What's actually missing

Not "ObservationStore is the wrong shape" — ObservationStore is fine as a query surface: get_turn(turn).tile_indices.contains(idx) answers "was tile X visible to player P at turn T", which is exactly what a fog projector needs for "Visible / Fogged / Hidden" classification.

What's missing is the Rust-side visibility producer. Today record_turn(turn, grid, visible_tile_indices: &[u16]) takes pre-computed visibility — the caller (presumably GDScript Vision.gd or an equivalent Rust port that hasn't been ported yet) owns the "compute which tiles are visible to player P right now" calculation. There is no mc-observation API that takes (GameState, PlayerId) and returns a visible-tile set.

What ObservationStore actually is

A per-player CLIMATE / WEATHER observation history for the Chronicle UI. src/simulator/crates/mc-observation/src/store.rs:8-90:

  • TurnObservation { turn, tile_indices, records } — climate snapshot (temperature, moisture, wind, succession_progress) of every tile visible at recording time for that turn. Sparse on visible tiles only.
  • ObservationStore::record_turn(turn, grid, visible_tile_indices) takes a pre-computed list of visible tile indices — meaning the visibility calculation lives somewhere OTHER than mc-observation.
  • ObservationStore::get_turn(turn) -> Option<&TurnObservation> returns historical climate, not a "right now this tile is visible" lookup.

There is no is_visible(player, col, row, turn) -> bool API. The store's public surface (write_turn_frame_buffers, write_latest_known_frame_buffers, unlock_lens, set_recording_gate, …) is shaped for the Chronicle UI's climate ribbon — not for gameplay fog of war.

Why papering over would be wrong

Per Rust SoT rail + brief's hard-stop: building a parallel "current visibility per player" calculation inside mc-player-api/projection.rs would duplicate the visibility logic that has to also live wherever ObservationStore::record_turn's visible_tile_indices argument is computed (likely GDScript Vision.gd or a Rust port thereof). That's exactly the duplication the rail forbids.

What's actually needed

Either:

  1. mc-vision crate (or similar) that owns "compute current visible tile set for player P given GameState" as the single source of truth. Both ObservationStore::record_turn callers and the projector pull from this. Includes a Visibility { Hidden, Fogged, Visible } query for any (player, tile, turn) tuple.

  2. Widen ObservationStore to include current visibility lookups alongside the climate history. Doable but mixes concerns — climate recording is one job, gameplay fog is another.

The honest path is option 1. Surface area is moderate: walk all P-owned units + cities, compute hex-distance ≤ vision_radius per unit/city, union into a HashSet<(col, row)>, expose a Visibility enum that says "Visible if in current set, Fogged if in any prior set, Hidden otherwise."

Why Phase 12 stays open until then

The projector currently uses strict-redaction fog (own-player-only). Without per-tile vision data, all enemy tiles are hidden, which matches "Hidden if never seen." The current behaviour is correct for "player who has never explored anywhere" — degenerate but not wrong. The wrong-ness only matters once units have moved and explored, and that path is also blocked by the AI behavioural-inertness gap from Phase 11's notes (units don't move past spawn). Fix in order:

  1. AI projector enrichment so units actually move and explore.
  2. mc-vision crate so fog has meaningful current/last-seen state.
  3. Phase 12 projection rework on top of (1) + (2).

Status

p2-67 stays partial. Phases 0-11 landed. Phase 12 deferred behind the mc-vision follow-up objective. Phase 13 (MCP install + 25-turn demo + screenshot bundle) is also held — independent of fog correctness, but driving a 25-turn Claude run against an AI that returns to inertness on turn 2+ produces a degenerate demo (Claude moves, AI sits). Phase 13 unblocks alongside the AI projector enrichment.

2026-05-12 — Phase 11 landed (TurnProcessor::step ticking)

p2-68 closed all of Phase 10 (production AI driver replaces scripted heuristic). Phase 11 wires TurnProcessor::step into apply_end_turn between the AI loop and the closing TurnStarted emit so production, growth, research, founding, pending_move_requests, and fauna encounters all drain per turn.

Shipped

  • mc_turn::processor::TurnProcessor::step now owns per-turn unit refresh. src/simulator/crates/mc-turn/src/processor.rs:528-535 — added crate::refresh_units(state) at end-of-step. Single source of truth per the DRY rule locked in Phase 9; the dispatch-level refresh_units call is deleted in the same patch.
  • mc_player_api::dispatch::apply_end_turn runs step after the AI loop. src/simulator/crates/mc-player-api/src/dispatch.rs:258-281 — constructs TurnProcessor::new(u32::MAX) (advisory max_turns; victory_config overrides when present), calls step(state), extends the response events vec with translated processor events. The dispatch's state.turn = state.turn.saturating_add(1) and refresh_units(state) call sites are both deleted — step owns turn increment + unit refresh.
  • translate_processor_events translator at dispatch.rs:295-368. Maps 5 mc_replay::TurnEvent variants to wire::Event: TechResearched, WonderBuilt, CityFounded, CityCaptured, GameOver. ClanId(u32) is sourced from processor.rs:910 as pi as u32 so the clan→player mapping is id.0 as PlayerId with no separate table needed. Variants without a direct wire counterpart (AmbientEncounterFired, UnitKilled, War/Peace, Era, Leader, ClanEliminated, UnitCaptured, UnitRansomOffered, CivilianDestroyed) are listed in an explicit drop arm so adding a new TurnEvent variant forces a compile-time decision.
  • Cargo dep mc-replay added to mc-player-api/Cargo.toml.

Tests + gate

  • cargo test -p mc-player-api --lib: 77 passed (was 74, +3 new):
    • end_turn_ticks_city_food_growth_via_turn_processor — 2-turn food accumulation crosses growth threshold (pop 1 → 2).
    • end_turn_completes_queued_unit_via_turn_processor — city with production_stored=100 + Queueable::Unit{dwarf_warrior} spawns a unit after one EndTurn (player.units.len() grows).
    • end_turn_refreshes_unit_movement_via_turn_processor — unit with movement_remaining=0 and base_moves=32 refreshes to 32 after step.
  • cargo test -p mc-turn --lib: 207/207 still green (no regression from adding refresh_units to end-of-step).
  • cargo check --workspace: clean (pre-existing 17 doc-comment warnings).

Smoke confirms ticking

Re-ran the 3-player apricot smoke at the Phase-11 commit (gdext rebuild

  • class-cache refresh + 5 EndTurns). Claude's view across turns 0..5 showed visible state advancement:
  • food_stored: 0 → 2 → 4 → 6 → 8 → 10 (net +2/turn)
  • gold: 60 → 68 → 76 → 84 → 92 → 100 (+8/turn)
  • unit_count: 3 → 3 → 4 → 5 → 6 → 6 (production threshold spawns)
  • science_per_turn: 0 → 42 (strategic_axes kicked in post-step)

This is the direct, observable consequence of Phase 11. Pre-Phase-11 smokes showed every field static across all 5 turns.

Honest finding — AI side still inert (separate from Phase 11)

Same smoke surfaces actions_applied=0 on the AI side (slots 1+2) for turns 1-4 despite Phase 11 wiring step. Turn 0 still produces 1 action per slot (the founding pass).

This contradicts the p2-68 Wave-final hypothesis ("the bench doesn't tick, that's why the AI sees nothing to do"). Wave-final was partially wrong: the bench DOES tick visibly for Claude. The AI's inertness is a deeper issue — decide_tactical_actions on the bench projection bottoms out after the founding pass because:

  • unit_catalog is empty in the bench-projector (p2-68 Wave 1 documented limitation),
  • (food, prod, gold) per-tile yields are zero in the bench projection,
  • the unit move queue is empty because the AI projector has no per-tile cost data.

Phase 11 closes the "step doesn't tick" issue. The "AI is behaviorally inert past turn 0" issue is its own follow-up. Recommendation: open a new objective pX-bench-projector-enrichment to widen project_tactical with unit_catalog + per-tile yields + movement-cost data so decide_tactical_actions has a non-degenerate search space on the bench.

2026-05-11 — Phase 10 STOP (structural blocker; documented)

Phase 10 cannot land as a thin dispatch swap. Per the user's explicit STOP rule ("If a phase reveals a deeper structural problem … STOP, document the blocker in the objective doc, exit with a clean summary. Don't simplify around it"), the work pauses here.

What Phase 10 actually requires

The brief described it as "Replace run_scripted_ai_turn with mc_ai::run_ai_turn(state: &mut GameState, player, &TechWeb, &Personalities) -> u32". That function does not exist in mc-ai. Evidence:

  1. No pub fn run_ai_turn in mc-ai/src/lib.rs — the exported surface (decide_tactical_actions, evaluator, score_*, decide_ransom_response, …) takes a pre-projected TacticalState, not the live GameState the dispatch layer holds. File: src/simulator/crates/mc-ai/src/lib.rs:20-44.
  2. No GameState → TacticalState projector in the workspace. All callers of decide_tactical_actions are test fixtures that build TacticalState { … } literals by hand: src/simulator/crates/mc-ai/tests/tactical_port_regression.rs:172-381.
  3. Tactical tests are #[ignore]d — the suite header reads "Tests exercising decide_tactical_actions directly are marked #[ignore]" (line 3 of the same file). The tactical port isn't in steady state; building a Phase-10 dispatch on top inherits that instability.
  4. No Action → GameState applicator. decide_tactical_actions returns a Vec<Action> of high-level intents (Move / Attack / FoundCity / QueueProduction / …). Translating each variant back into a GameState mutation is the symmetric half of the projector and is currently absent — the GDScript path (src/game/engine/src/modules/ai/ai_turn_bridge.gd) does this in GDScript by calling the same gdext shims mc-player-api already calls. Reimplementing that mapping in Rust is mc-ai's work, not mc-player-api's.

Why this is its own objective

Surface area is roughly comparable to Phase 9 (new crate-internal module, projector + applicator, tactical-test thaw, AI personality loading, deterministic seeding). It belongs as its own objective slice — recommendation: p2-68 mc-ai headless turn driver — with p2-67 flipping its status to that p2-68 blocker once filed.

Why Phases 11/12/13 follow

  • Phase 11 (TurnProcessor::step after AI loop) requires the real AI loop to be running. Ticking production for slot 0 only (because the scripted heuristic doesn't decrement queue production) produces misleading event counts; the DRY rule ("delete refresh_units call site after Phase 11") only makes sense once Phase 10 lands.
  • Phase 12 (per-player fog from ObservationStore) is a cosmetic refinement of the projection — usable once 10 is real, pointless until then since the scripted AI has no observations.
  • Phase 13 (Claude-vs-AI demo run + screenshots) literally cannot happen against a scripted heuristic that founds-city / queue-warrior / fortify; the demo brief specifies "Claude vs the production AI".

All three are deferred behind p2-68.

Reference implementation pointer

The GDScript driver AiTurnBridge lives at src/game/engine/src/modules/ai/ai_turn_bridge.gd (+ _dispatch.gd

  • _state.gd). It reads ai_personalities.json, invokes GdMcTreeController on GdGameState, and applies the resulting actions through the same gdext shim layer GdPlayerApi calls. The headless mc_ai::run_ai_turn should mirror this — same inputs, same output side-effects — but without the GDScript autoload dependencies.

Status

  • p2-67 stays partial. Phases 0-9 + bench-grade Phase 8 deliverables are shipped (39/39 routes either live, intentionally bench-cheated with breadcrumbs, or honestly NotYetImplemented with cited schema gaps).
  • Phases 10-13 deferred behind the new blocker objective (recommended id: p2-68). Will edit blocked_by once the follow-up objective is filed.

2026-05-12 — Updated path to "Claude vs production AI" demo

Phase 11 shipped (TurnProcessor::step ticking). Phases 12 + 13 each hit a structural gap that is NOT a quick wedge:

Phase 12 — needs mc-vision crate (filed as p2-70)

mc_observation::ObservationStore was the wrong tool — it's per-player climate observation history (temperature/moisture/wind), not per-tile visibility. The actual per-player visibility producer currently lives only in GDScript (Vision.gd). Rust has no fn visible_tiles(state: &GameState, player: PlayerId) -> HashSet<HexCoord>.

Filed as p2-70 mc-vision (per-player visibility producer). Estimate ~1 day.

Phase 13 — needs two pieces

1. Bench projector enrichment (filed as p2-71):

The AI is correctly wired through project_tactical → run_ai_turn → apply_ai_action, but decide_tactical_actions returns empty action chains past turn 0 because the bench projector emits a degenerate TacticalState — empty unit_catalog, zero per-tile yields, no move-cost data (all p2-68 Wave 1 documented limitations). Until the projector serves a representative tactical surface, MCTS bottoms out on nothing-to-do.

Honest correction recorded: last night's "AI inertness = bench doesn't tick" hypothesis was wrong (Wave 2's TurnProcessor wire proved Claude's slot ticks; AI inertness is a separate projector-fidelity issue).

Estimate ~1-1.5 days.

2. GdPlayerApi → render bridge (filed as p2-72):

Production proof scenes (e.g. world_map.tscn, gameplay_arc_proof) render from the GameState autoload. GdPlayerApi keeps its own state internally via load_state_json. There is currently no path that visualises the GdPlayerApi-held world. Phase 13's "screenshots every 5 turns" requires either pointing the autoload at GdPlayerApi's state (one source of truth) or a thin render adapter that reads view_json and drives a TileMap.

Estimate ~0.5-1 day.

Sequencing

  • p2-70 (mc-vision) ↔ p2-71 (projector enrichment) are independent — can run in parallel.
  • p2-72 (render bridge) is independent of both; can run anytime.
  • p2-67 final close happens when all three land + a Claude-vs-AI run produces screenshots and an action log.

References

  • src/simulator/crates/mc-core/src/action.rs — unit action enum
  • src/simulator/crates/mc-core/src/city_action.rs — city action enum
  • src/simulator/crates/mc-core/src/building_action.rs — building queue
  • src/simulator/crates/mc-mcts-service/src/{framing,protocol,server}.rs — wire-protocol precedent
  • src/simulator/crates/mc-turn/src/action_handlers/ — existing apply-action plumbing to delegate into
  • src/simulator/crates/mc-ai/src/lib.rs::AiTurnBridge — AI driver for non-Claude slots
  • src/game/engine/scenes/tests/auto_play.gd — full headless harness precedent
  • p2-66 (world-map-visual-proof.md) — sister objective for visual rendering of the resulting games

2026-05-12 — Mocked Phase 13 deliverable shipped

The API-level Phase 13 deliverable — a deterministic 25-turn Claude-vs-AI transcript via mc-player-api — is complete and gated by a test. Screenshot Phase 13 (the visual-proof half) is still gated on p2-72 (a/b/c) — when the rendered-game bridge lands, the same construction can be driven through the GDExtension path for a screenshot.

  • Test: src/simulator/crates/mc-player-api/tests/full_game_transcript.rsclaude_vs_ai_full_game_transcript. Drives the same harness state smoke_5_endturn_mock uses (lifted to tests/common/mod.rs), runs up to 25 turns, asserts byte-identical transcript across two runs, asserts the three game-loop constraints (city by 5, AI unit by 10, movement/combat across run). Verification: cargo test -p mc-player-api --test full_game_transcript.
  • Artifacts (gitignored under .local/):
    • transcript.jsonl — 248 lines, ~817 KB. Canonical JSON-Lines wire format that claude_player_main.gd would emit headlessly.
    • state-turn-NN.json for turns 0, 5, 10, 15 — PlayerView snapshots Claude would see at those boundaries.
    • recap.md — per-turn action log, AI summaries, score deltas, residual-gap notes.
  • Residual gaps (call out for follow-ups):
    • legal_actions on PlayerView is a stub (project_empire_legal_actions returns only EndTurn; per-unit legal_actions only carries Skip/Fortify/disabled-Move). Claude's policy reads RAW PlayerView.units / cities instead. File as p2-67-followup-legal-actions when promoting the real-legality probe.
    • mc-turn/src/processor.rs:2425 overflows (*a_hp * a_formation_size as i32) during PvP combat resolution at turn 17 of the canonical run. The transcript terminates cleanly via catch_unwind with a synthetic protocol_error notification and the run is faithfully truncated; the bug itself is upstream of mc-player-api and tracked as a residual mc-turn issue. (Combat happened — that's why the overflow fired — so constraint 4 is satisfied; the panic is itself evidence of combat resolution engaging.)
    • Unit-spawn events (UnitCreated, CityUnitCompleted) don't surface for every spawn path — PlayerState.units mutates directly in some mc-turn code paths without emitting a mc_replay::TurnEvent the dispatch layer can pick up. The constraint check therefore falls back to per-slot unit-count growth in score_snapshot (slot 1 grew from 4 → 27 units by turn 15, proving AI building activity).
  • Shared harness lift: mc-player-api/tests/common/mod.rs now owns build_3_player_state_like_harness, build_runtime_units_catalog, build_unit_catalog, build_building_catalog, add_player_militarist_inline, stamp_personality. The 5-turn smoke (smoke_5_endturn_mock) was refactored to consume them — it still passes byte-identically to the pre-lift version.

2026-05-12 — Real-apricot demo transcript captured

Drove the production harness on apricot HEAD 1c91a332d for 25 EndTurn cycles in 3-player Claude-vs-AI configuration. Captured the full JSON-Lines wire transcript, scp'd to plum, produced a per-5-turn recap with score / AI-action / event tables.

  • Driver: scripts/claude-demo-25turn.sh
  • Transcript: .local/demo-runs/2026-05-12-real-apricot-claude-vs-ai/transcript.jsonl (37 lines, 420 666 bytes — one act_response envelope per turn, each ~11 KB packing events[] + full view snapshot)
  • Recap: .local/demo-runs/2026-05-12-real-apricot-claude-vs-ai/recap.md
  • Summary JSON: .local/demo-runs/2026-05-12-real-apricot-claude-vs-ai/summary.json

Run outcome

  • All 25 act:end_turn requests succeeded with ok:true. shutdown_ok received. Harness exit 0. No protocol_error.
  • AI slot 1: 63 actions applied over 25 turns (range 15/turn, never zero — matches p2-71 smoke acceptance).
  • AI slot 2: 82 actions applied over 25 turns (range 24/turn, never zero).
  • City foundings at turns 13 and 25 (3 each — one per slot, founder units settling on schedule).
  • Final Claude state: gold=356, cities=3 (capital 0_0 @ (1,6), 0_1 @ (5,10), 0_2 @ (9,14)), units=31.

Residual gaps

  • Combat overflow fix unverified at runtime. Zero combat events occurred in 25 turns on the duel map at seed 42 — both AI slots expanded in parallel without contact. The mock transcript's attempt to multiply with overflow panic at mc-turn/src/processor.rs:2425 is therefore neither reproduced nor cleared by this run. Follow-up needed: an adversarial map preset or scripted AttackTile injection to actually exercise the combat path.
  • Claude slot is passive. This driver issues only EndTurn per turn; the harness has no autonomous Claude policy bound. Claude's state advances via engine auto-actions (founder auto-settle, gold accumulation) but no FoundCity/Fortify/QueueProduction from a Claude brain. A real Claude policy is Phase 13 territory.
  • Transcript shape vs. ">100 lines" criterion. Acceptance was authored against the mock's per-action driver (248 lines). Real harness packs each turn into one fat act_response (37 lines × ~11 KB). 420 KB of well-formed wire JSON satisfies the spirit; the line shape is a property of the real wire format, not a deficiency. Documented in recap.md.
  • Phase 13 screenshots STILL gated on p2-72. This is the API+transcript form of Phase 5; no rendered proof scene captured.