59 KiB
| id | title | priority | status | scope | category | owner | created | updated_at | blocked_by | follow_ups | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| p2-67 | Claude-driven player API — programmatic player + Agent-SDK adapter | p2 | partial | game1 | tooling | simulator-infra | 2026-05-10 | 2026-05-10 |
|
|
Context
A Claude Agent SDK process should be able to play a real game of Magic Civilization vs. the production AI, taking authentic player-equivalent actions one at a time and reading game state from data — not from screen scraping. Each turn is a sequence of discrete actions ("open city, queue warrior, close city, move unit, end turn"), the same flow the human UI exercises.
This unlocks:
- Authentic gameplay screenshots (this objective is the proper fix for the gap p2-66 only papered over).
- Headless playtesting: Claude vs. AI tournaments, regression detection via behavioural diffs, balance-tuning A/B runs.
- Live demos: stream Claude's reasoning + action choices alongside the rendered game.
Source-of-truth rails
- Rust crate:
mc-player-api— single crate that owns thePlayerActionenum,PlayerViewsnapshot type,apply_action,view. All logic in Rust per Rail-1. - JSON path: no new game-content files. The protocol is wire-only JSON, not authored data.
- GDScript: presentation only. The Godot-side harness is a thin
GDExtension wrapper around
mc-player-apiplus a stdin/stdout pump. - Existing leverage:
mc-core::action::ActionKind— unit actions vocabulary.mc-core::city_action::CityAction— city actions vocabulary.mc-core::building_action::BuildingAction— building queue ops.mc-mcts-service— precedent for framing + JSON-RPC server.auto_play.gd— full headless game-flow harness with events.jsonl.AiTurnBridge::run(player)— proven action dispatch into mc-turn.
Acceptance
- ❌
mc-player-apicrate exposesapply_action(state, player, action)andview(state, player)covering every action a UI button can perform. Round-trip: serialise view → choose action → deserialise action → apply. - ❌ Headless Godot harness (
scripts/claude-player-server.sh→scenes/headless/claude_player_main.tscn) runs a seeded game, binds player slot 0 to stdin/stdout JSON-RPC, runs the production AI for slots 1..N. Drains AI turns automatically; pauses on player-0 turn until it receives anEndTurnaction. - ❌ Claude SDK adapter (
tooling/claude-player/) — TypeScript Agent SDK app — connects to the harness, reads view, picks action, sends, repeats. Plays one full game vs. AI to victory or 100-turn cap. - ❌ Snapshot test:
mc-player-api::tests::seeded_game_replayruns a scripted action sequence and asserts the resulting events match a golden file. Catches behavioural drift. - ❌ Demo deliverable: a screen-recording (or 25-frame screenshot series) of one Claude vs. AI game, with action log alongside.
- ❌ Phase-gate proof: Claude's first 10 turns logged + reviewed in the conversation that closes this objective.
Out of scope
- Magic / Archons / Ascension (Game-2/3 features).
- Multi-Claude games (Claude vs Claude). Adapter handles one player slot.
- Network IPC. Stdin/stdout local pipe is sufficient for v1; TCP comes later.
- UI parity — the harness drives state, not the world_map renderer. Renders happen separately when wanted (replay viewer + p2-66 paths).
Phase plan
Phase 0 — Design + JSON schema (~3 hr)
- Enumerate every UI button in
world_map_hud.tscn,city_screen.tscn,tech_tree.tscn,culture_tree.tscn,diplomacy_panel.tscn. Map each button to aPlayerActionvariant. - Write
docs/CLAUDE_PLAYER_API.mdwith the JSON-RPC schema (Request / Response / Notification envelopes), action variants, view shape, error codes. - Decide: stdin/stdout JSON-Lines vs. JSON-RPC 2.0. (Recommend
Lines — simpler, matches
mc-mcts-service::framing.) - Confirm view perspective: fog-of-war filtered, hidden tech / hidden diplomacy redacted to player slot 0's knowledge.
Phase 1 — mc-player-api crate (~1 day)
- New crate
src/simulator/crates/mc-player-api/. Workspace member. - Re-exports + outer enums:
pub enum PlayerAction { Unit { unit_id: UnitId, kind: ActionKind, target: Option<HexCoord> }, City { city_id: CityId, op: CityAction }, Building { city_id: CityId, op: BuildingAction }, Tech { tech_id: String }, Culture { tradition_id: String }, Diplomacy { other: PlayerId, op: DiploOp }, EndTurn, } apply_action(state: &mut GameState, player: PlayerId, action: PlayerAction) -> Result<Vec<Event>, ActionError>— dispatches into the same handlersmc-turn::action_handlers/already exposes.view(state: &GameState, player: PlayerId) -> PlayerView— fog-aware snapshot. Includeslegal_actions: Vec<PlayerAction>so Claude doesn't have to compute legality itself.- Unit tests: round-trip serialisation, every variant, fog-redaction invariants.
Phase 2 — GDExtension surface (~4 hr)
api-gdext::player_apimodule exposesGdPlayerApiclass:view_json(player: int) -> Stringapply_action_json(player: int, action_json: String) -> String(returns events JSON)
- Godot can call this from any scene; no wire protocol involved at this layer.
Phase 3 — Headless harness (~half-day)
scenes/headless/claude_player_main.tscn+.gd:- Boots a seeded game (env:
CP_SEED,CP_PLAYERS,CP_MAP_SIZE). - Connects player slot 0 to stdin (read line) / stdout (write line).
- For other slots: runs
AiTurnBridge::run(player)exactly asauto_play.gddoes today. - On player-0's turn: blocks reading stdin. Each line is one
PlayerActionJSON. Emits the resultingVec<Event>JSON + updatedPlayerViewJSON to stdout. Loops untilEndTurn. - On all turns: emits a
Notificationline for each EventBus event.
- Boots a seeded game (env:
scripts/claude-player-server.sh— flatpak Godot launch wrapper with the right env vars for headless + auto-quit on stdin EOF.
Phase 4 — Claude Agent SDK adapter (~half-day)
- New TypeScript package
tooling/claude-player/. Uses@anthropic-ai/sdkAgent SDK. - Tools exposed to Claude:
view()— returns currentPlayerViewJSON.act(action)— sends onePlayerAction, returns events + new view.end_turn()— convenience wrapper foract({EndTurn}).
- Loop: spawn
claude-player-server.shas child process viaspawn, pipe stdin/stdout, run an Agent loop where Claude reads the view, picks an action, applies, repeats until victory / 100 turns / blocker. - Output an action log (
tooling/claude-player/.local/runs/<stamp>/log.jsonl) with reasoning + action + events per step.
Phase 5 — End-to-end demo + screenshots (~2 hr)
- Run one Claude vs. 1-AI seeded game.
- Capture a screenshot every 5 turns via the existing
gameplay_arc_proofrendering path (now driven by real game state instead of a scripted arc). Bundle 20–25 frames into a demo zip. - Append the action log to the conversation when closing this objective so the phase-gate review is complete.
Architecture sketch
┌─────────────────────────────────────────┐
│ Claude Agent SDK (TypeScript) │
│ ┌───────────┐ ┌─────────────────┐ │
│ │ view tool │ ←→ │ tooling/ │ │
│ │ act tool │ │ claude-player/ │ │
│ └───────────┘ └────────┬────────┘ │
└────────────────────────────│────────────┘
stdin/stdout JSON-Lines
┌────────────────────────────│────────────┐
│ Godot (flatpak, headless) │ │
│ ┌─────────────────────────▼─────────┐ │
│ │ claude_player_main.gd (harness) │ │
│ │ - reads stdin / writes stdout │ │
│ │ - drives AI for slots 1..N │ │
│ │ - emits notifications on events │ │
│ └────────┬──────────────────────────┘ │
│ ┌────────▼──────────┐ │
│ │ GdPlayerApi (gdext bridge) │
│ └────────┬──────────┘ │
└───────────│─────────────────────────────┘
┌───────────▼───────────────────────────┐
│ Rust simulator │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ mc-player- │→ │ mc-turn │ │
│ │ api │ │ handlers │ │
│ │ (apply/view)│ │ (existing) │ │
│ └──────────────┘ └──────────────┘ │
└───────────────────────────────────────┘
Decisions resolved 2026-05-10
- Wire format: JSON-Lines (one JSON value per line,
\nframing). Matchesmc-mcts-service::framing::LineCodec; trivially debuggable withcat. JSON-RPC 2.0 envelope is overkill for a single-client local pipe. - Fog-of-war: strict by default. Claude only sees what player
slot 0 sees per the live
Player.observationscache. Override viaCP_OMNISCIENT=1env (debug + golden-test mode only). - Action timeout: 60s default, override via
CP_TIMEOUT_SEC. On expiry the harness emits{"type":"turn_timeout"}notification and substitutesAiTurnBridge::runfor that turn so the game keeps advancing. Adapter logs the substitution for review. - Tool surface: three discrete tools —
view(),act(action),end_turn(). Cleaner Claude UX than one mega-tool with a discriminator.end_turnis sugar foract({"type":"end_turn"})so the wire protocol stays one-action-per-line.
Total estimate
Phase 0–5 = 3–4 days focused work. Phase 1 (mc-player-api) is
the bulk; Phases 2–5 are small once the core surface exists.
2026-05-10 — Phases 0-5 v1 shipped
All six phases landed in this session. Status moves to partial
(not done) because several acceptance bullets are wire-stable but
have TRACKED follow-up subsystem wiring listed under each phase.
Phase 0 — Design doc ✓
src/game/engine/docs/CLAUDE_PLAYER_API.md— wire spec, action taxonomy, view shape, error codes, env contract, adapter loop pattern, UI button → action audit per scene.
Phase 1 — mc-player-api crate ✓
- 5 modules (action, dispatch, error, projection, view, wire).
- 39/39 tests green:
cargo test -p mc-player-api. - Wire types complete; dispatcher routes EndTurn + Attack-hex-resolve
- 11 unit-verb variants through
mc_turn::action_handlers::invoke; other variants return typedNotYetImplementedwith TRACKED breadcrumbs.
- 11 unit-verb variants through
- Projection wires gold / science / tech / culture / cities / units / diplomacy / score with strict fog redaction (own player only by default, omniscient via flag).
Phase 2 — GDExtension surface ✓
api-gdext::player_api::GdPlayerApi—view_json(player),apply_action_json(player, action_json),load_state_json,dump_state_json,set_omniscient.cargo check -p magic-civ-physics-gdextclean.- gdext binary rebuilt + copied into engine/addons.
Phase 3 — Headless harness ✓
src/game/engine/scenes/headless/claude_player_main.{gd,tscn}— stdin/stdout JSON-Lines pump.scripts/claude-player-server.sh— flatpak launcher.- Env-driven: CP_SEED, CP_PLAYERS, CP_CLAUDE_SLOT, CP_MAP_SIZE, CP_MAP_TYPE, CP_OMNISCIENT, CP_TIMEOUT_SEC, CP_LOG_FILE.
Phase 4 — MCP server for Claude Code ✓
tooling/claude-player-mcp/package — strict TS, Node 20+.HarnessClient— child-process spawn + JSON-Lines correlation by monotonic id, timeouts, notification dispatch.- MCP server (
@modelcontextprotocol/sdkstdio transport) exposes three tools:magic_civ_view,magic_civ_act,magic_civ_end_turn. - Server spawns
scripts/claude-player-server.shon first tool call and reuses the harness across the session. - Claude Code wires via
.mcp.json:{ "mcpServers": { "magic-civ": { "command": "node", "args": ["./tooling/claude-player-mcp/dist/index.js"] } } } - No Anthropic API key needed — Claude Code itself is the agent;
this layer is purely the tool surface. The earlier Anthropic-SDK
adapter (
tooling/claude-player/) was scrapped in favour of this approach.
Phase 5 — E2E demo ✓ (wire transcript)
.project/history/20260510_p2-67-phase5-wire-transcript.md— full request/response trace for view → act(end_turn) → shutdown verified end-to-end on apricot. RealPlayerViewJSON returned; EndTurn emitted canonicalTurnEnded/PhaseChanged/TurnStartedevent triple; shutdown clean.- What's still TRACKED for Phase 5 to flip to
done:- Map + unit hydration of
GdPlayerApi::state(wired onceGdGameState::serialize_to_jsonexists). Harness initialises autoloadGameStatealready; the API's held state stays default until that bridge lands. - Live Claude vs AI run with screenshots — requires
ANTHROPIC_API_KEYand a fresh hydratedGameState. The adapter- harness pipe is proven; the run is a single
npm run devinvocation away once the state bridge is hot.
- harness pipe is proven; the run is a single
- Subsystem dispatch follow-ups for the variants currently returning
NotYetImplemented: Move (needspending_move_requestsqueue in mc-turn), city ops (mc-city dispatch), diplomacy verbs (mc-trade dispatch), tech / culture / civic selection.
- Map + unit hydration of
2026-05-11 — Phase 1 follow-up + Phase 6 wiring
Past Phase 5 (wire-transcript proof) into actual playable gameplay.
Shipped this session
- Real map generation in harness:
claude_player_main.gd::_hydrate_player_apinow boots viaGdMapGenerator.generate(seed, map_size)+GdGameState.set_grid_from_gridstate(grid)+ greedy max-distance land-tile picker. Capitals land on real biomes, not on fixed offsets. - Land-aware spawn in proof scenes:
gameplay_arc_proofgot_is_land_tile/_find_land_tile_near/_filter_landhelpers wired into 5 placement sites. The water-tile spawn bug the user reported is fixed in the demo path. - 3 new live dispatch routes:
QueueProduction— setsCityState.queuetoQueueable::Unit{...}orQueueable::Item{...}based on id prefix.RemoveFromQueue— clearsqueue/queue_cost/queue_tier/production_stored.ResearchTech/ResearchTradition— direct mutation via newmc_tech::PlayerTechState::set_researching_unchecked(sister tostart_researchthat doesn't require aTechWebhandle).
- Scripted AI heuristic at
mc_player_api::dispatch::run_scripted_ai_turn. Fires insideapply_end_turnfor every non-Claude slot. Found city / queue warrior / start tech / fortify idle units. RealEvent::AiTurnStarted/Event::AiTurnCompleted{actions_applied}events emitted per slot. Verified: smoke test showsactions_applied=5on Player 1's first turn (queue warrior + start bronze_working + fortify 3 warriors). GdGameState::to_json/from_jsonsymmetric serde bridge so the harness can hand its bootstrapped state toGdPlayerApi.load_state_json.- Unit id collision fix in
GdGameState::add_player_militarist— units now get monotonic ids fromstate.next_unit_idinstead of all defaulting to0.
Multi-day roadmap to "Claude can play a real game vs production AI"
Honest split of what stands between today's state and a real demoable Claude-vs-AI run.
Phase 7 — Wire the rest of the city ops (~1.5 days)
RushBuy— deductstate.players[pi].goldbymc_items::ItemSystem::rush_buy_cost(item, base)and force-complete the queue head.BuyTile— needs a per-cityowned_tilesmutator on fullCity(the benchCityStatedoesn't carry tile ownership). Either widen bench struct or add a parallelstate.players[pi].owned_tilesarray.SetFocus—City::set_focusis on the full type. Bench widening or a per-city focus field onCityState.QueueReorder— benchqueue: Option<Queueable>only holds one item; the production-queue-as-vec lives on fullCity. Either upgrade benchCityState.queuetoVec<Queueable>or treat queue ops on bench as a no-op.MergeBuildings—mc_city::merge::apply_merge(&mut City, ...)requires the fullCity+ a&BuildingRegistry+ researched techs. Threading the registry through the dispatcher is the bigger lift.
Phase 8 — Open Borders / Shared Map / Promotion / RangedAttack (~1 day)
- Add
TradeLedgerfield to benchGameState(or load it via the existing parameterisedmc_trade::declare_warsignature pattern). - Wire
OfferOpenBorders/AcceptOpenBorders/RejectOpenBordersthroughmc_trade::TradeLedger::alloc_agreement_id+ pushOpenBordersAgreementintoledger.agreements. Same for SharedMap. Promote— promotion-pick state needs surfacing onMapUnit(currently nopending_promotion: Option<String>field).RangedAttack— author thepending_ranged_attacksqueue + drain pass inmc_turn::processor, analogous topending_bombard_requests.- Formation commands —
SetRallyPoint,ClearRallyPoint,CommandFormation,SetFormationShape,SplitFromFormation,SetAutoJoinall queue via existingpending_rally_requests/formationsfields; just need the dispatcher mapping authored.
Phase 9 — Proper Move subsystem (~1 day)
Currently apply_move is trust-the-caller (direct unit.col/row mutation + occupancy check). To match production:
- Add
MoveRequest { player_idx, unit_idx, target_col, target_row }struct +pending_move_requests: Vec<MoveRequest>field onGameState. - Author Rust pathfinder (mirror
pathfinder.gd::find_pathA* with_is_passablegates). Add to mc-core or new mc-pathfinding crate. - Add
movement_remaining: i32field toMapUnit. Refresh per turn via existing_unit_manager.refresh_player_unitsanalogue. TurnProcessor::process_move_requestsvalidates path + decrements movement_remaining + applies position.apply_movenow queues intopending_move_requestsinstead of direct mutation; processor drains.
Phase 10 — Real AI driving (~2 days)
Replace run_scripted_ai_turn with production MCTS:
- The GDScript
AiTurnBridgealready does this for the world_map.tscn path. It depends onGameStateautoload +Playerentity +GdMcTreeController+ai_personalities.json. - Headless path needs an equivalent that takes
&mut GameStatedirectly. Spec amc_ai::run_ai_turn(state: &mut GameState, player: u8, web: &TechWeb, personalities: &Personalities) -> u32that internally callsGdMcTreeController's logic without going through the GDScript autoloads. - Wire
_hydrate_player_apito loadpersonalitiesfromai_personalities.jsononce at boot. - Replace
run_scripted_ai_turnbody withmc_ai::run_ai_turn.
Phase 11 — TurnProcessor between Claude's turns (~0.5 day)
- After all AI slots have acted, run
TurnProcessor::step(state)so production accumulates, cities grow, tech progresses. Otherwise the queue I set inQueueProductionnever completes andresearch_progressnever increments. - Surface the resulting events as additional
Event::*entries in the EndTurn response.
Phase 12 — Fog of war from real Observations (~0.5 day)
- The projection module currently uses conservative-strict redaction (own-player-only) because the bench
GameStatedoesn't carry per-tile vision data. - Wire
mc_observation::ObservationStoreinto the projector so the fog is per-player + per-tile, not all-or-nothing.
Phase 13 — Adapter polish + run actual demo (~0.5 day)
tooling/claude-player-mcp/is shipped butnpm installhas never been run on this machine. Run it.- Add the
magic-civentry to.mcp.json. - Restart Claude Code so the MCP tools surface.
- Drive an actual Claude vs AI game from inside the Claude Code session — call
magic_civ_view, decide,magic_civ_act(...), observe the AI's response, repeat. - Capture screenshots every 5 turns of Claude's run. Bundle.
Total
6–7 days of focused work to go from today's state to a playable demo of Claude vs the production AI with real terrain, real tech progression, real per-turn economy, real fog of war, and a downloadable bundle of the Claude session's screenshots.
That's the encompassing job.
2026-05-11 — Phase 7 landed (RushBuy live, 4 NotYetImplemented breadcrumbs tightened)
The Phase-7 brief proposed widening bench CityState (focus, owned_tiles,
multi-item queue) before wiring 5 routes. We rejected the widening: bench
CityState is consumed by mc-sim/solo_dominion, fauna_pressure_bench,
and MCTS rollout snapshots, and widening cascades into serde-compat and
those crates' own field assumptions. The brief itself authorised the
escape hatch — NotYetImplemented with precise breadcrumbs — for routes
that can't be honestly implemented.
Also corrected: the Phase-7 brief stated "bench player struct has no
gold field." It does — PlayerState.gold: i32 at
mc-turn/src/game_state.rs:504. That made RushBuy honestly
implementable today against the existing bench types.
Shipped
RushBuylive:mc_player_api::dispatch::apply_rush_buydeductsmc_items::ItemSystem::rush_buy_cost(queue_cost) = 2 × queue_costfromstate.players[pi].gold, clears the queue head (queue / queue_cost / queue_tier / production_stored), and emits one wire event matching the queue-head variant:Queueable::Wonder→ inserts intoplayer.wonders_builtat the stored tier (mirrorsTurnProcessor::process_city_productionwonder completion exactly) and emitsEvent::WonderBuilt { wonder_id, player }.Queueable::Unit→ emitsEvent::CityUnitCompleted { city_id, unit_id }. BenchTurnProcessordoes not spawn units from unit-queue heads in Phase 7's scope (Phase 11 wires that ticking); the wire event is the honest observable.Queueable::Item→ emitsEvent::CityBuildingCompleted(closest existing semantic; no dedicated item-completion event yet).
- 4 routes return
NotYetImplementedwith tightened breadcrumbs that cite the specific missing bench field + cascade cost:BuyTile— needs per-cityowned_tiles: HashSet<HexCoord>(or a parallel array onGameState). The fullCitystruct inmc-city/src/city.rsowns tile ownership today.SetFocus—City::set_focusis on the full struct; benchCityStatehas nofocusfield.QueueReorder— benchqueue: Option<Queueable>holds one item; queue-as-vec migration is its own Phase 7 follow-up.MergeBuildings—mc_city::merge::apply_mergerequires&mut City + &BuildingRegistry + researched; threading the registry through benchGameStateis the larger lift.
- Cargo dep added:
mc-player-apinow depends onmc-itemsforItemSystem::rush_buy_cost.
Tests + gate
cargo test -p mc-player-api: 56/56 green (was 50, +6 new for RushBuy: unit / item / wonder happy paths + empty-queue + insufficient-gold + unknown-city + the renamedbuy_tile_returns_not_yet_implemented_with_bench_widening_breadcrumbcovering the new breadcrumb).cargo check --workspace: clean (pre-existing warnings unchanged).
Files touched
src/simulator/crates/mc-player-api/Cargo.toml— addedmc-itemsdep.src/simulator/crates/mc-player-api/src/dispatch.rs—apply_rush_buy, 4 tightened breadcrumbs, 6 new tests, dropped the obsolete combinedrush_buy_still_returns_not_yet_implementedtest.
Honest scope cuts
BuyTile / SetFocus / QueueReorder / MergeBuildings stay
unimplemented at the bench layer until either:
(a) the bench CityState widens (and mc-sim callers are updated), or
(b) the Phase-10/11 work moves to the full City struct + production
TurnProcessor ticking, at which point these routes wire through that
production-flavoured path instead of the bench.
2026-05-11 — Phase 9 landed (Proper Move subsystem)
A* pathfinding + movement-budget validation now run on the Rust side
for every PlayerAction::Move. The old "trust-the-caller direct
mutation" path is deleted.
Shipped
- New crate
mc-pathfinding(src/simulator/crates/mc-pathfinding/). Workspace member. Verbatim Rust port ofpathfinder.gd::find_pathwith per-line GDScript citations in the source (pathfinder.gd:25-95,:245-260,:263-268,:281-292,:295-303). Public API:find_path(grid, start, goal, budget, domain) -> Vec<HexCoord>,is_passable,effective_cost,hex_distance.UnitDomain::{Land, Naval, Flying}mirrors the GDScriptunit_typestring param. 7/7 unit tests cover same-tile, unreachable-water-for-land, naval-only- water, budget-exhausted, flying-crosses-water, and the passability truth table. - New
mc-units::UnitsCatalog— id →UnitStats { base_moves, domain }catalog loaded frompublic/resources/units/*.json. JSON field"movement"deserialises asbase_moves; missingdomaindefaults to"land". 4/4 catalog tests cover the warrior.json shape, domain default, insert/lookup, and unknown-top-level handling. MapUnit::new(unit_type, col, row, owner, &UnitsCatalog) -> Selfreadsbase_movesfrom the catalog at spawn. Fallback to 0 when the catalog is missing the entry — callers must chain.with_moves(n)for tests that don't populate a catalog. Noi32::MAXsentinel —movement_remaining = 0means "exhausted this turn", never "uninitialised" (SRP-clean per the Phase-9 design lock).MapUnit::with_moves(n)builder — test override that sets bothbase_movesandmovement_remainingsorefresh_unitsrecharges to the same value next turn.MapUnit::base_moves: i32+movement_remaining: i32added. Both#[serde(default)]so all 54 existingMapUnit { ... ..Default::default() }fixture sites compile without migration. The dispatch test helpermake_state_with_unitschains.with_moves(32)so existing happy- path move tests keep their geometry budget.mc_turn::refresh_units(state)— single source of truth for per-turn movement-point refresh. Resetsunit.movement_remaining = unit.base_movesfor every non-captive unit (captives stay at 0 per p2-55 ransom rules). Wired frommc_player_api::dispatch::apply_end_turnfor now; the call site deletes in Phase 11 onceTurnProcessor::stepis invoked from dispatch (DRY rule).MoveRequeststruct +pending_move_requests: Vec<MoveRequest>onGameState.#[serde(default)]for save-back-compat. Drained bymc_turn::processor::process_move_requests(state) -> Vec<MoveOutcome>, which pathfinds viamc-pathfinding, validates budget, checks occupancy, applies the new position, and decrementsmovement_remainingby path cost. Benchgrid == Nonefalls back to a 1-cost teleport so mc-sim unit-test fixtures keep working. 6/6 drain tests: happy path, zero budget, unreachable, occupied, no-grid teleport, captive rejection.Event::UnitMovedwire variant gainspath: Vec<WireHex>(#[serde(default, skip_serializing_if = "Vec::is_empty")]) — back-compat for adapters that ignore the field.mc_player_api::dispatch::apply_moverewritten to queue aMoveRequestand drain synchronously viamc_turn::processor::process_move_requests.MoveOutcome::Moved→Event::UnitMoved { path, .. };MoveOutcome::Rejected→ActionError::TargetInvalid { message: reason }. EachMoveaction returns its own events — synchronous semantics match the Claude-API one-action-per-line contract.GameState::units_catalog: UnitsCatalog(#[serde(skip)]) added alongsideimprovement_registry. Bridge layers populate at boot; absent in unit tests by default.api-gdext::lib.rs::GdGameState::initupdated for the two newGameStatefields.
Tests + gate
cargo test -p mc-pathfinding --lib: 7/7 greencargo test -p mc-units --lib: 7/7 green (was 3, +4 new for catalog)cargo test -p mc-turn --lib: 207/207 green (+6 new for move drain)cargo test -p mc-player-api --lib: 56/56 green (no regression)cargo check --workspace: clean (pre-existing warnings only; pre-existingfour_player_projection_fills_every_slotintegration test failure verified to exist on main HEAD and is unrelated)
Files touched
src/simulator/Cargo.toml— registermc-pathfindingworkspace member.src/simulator/crates/mc-pathfinding/{Cargo.toml,src/lib.rs}— new.src/simulator/crates/mc-units/Cargo.toml— no change (serde already declared).src/simulator/crates/mc-units/src/{lib.rs,catalog.rs}— new module.src/simulator/crates/mc-turn/Cargo.toml—mc-units+mc-pathfindingdeps.src/simulator/crates/mc-turn/src/lib.rs— re-exportMoveRequest, add top-levelrefresh_units.src/simulator/crates/mc-turn/src/game_state.rs—MoveRequest,pending_move_requests,units_catalog,MapUnit::{base_moves, movement_remaining, new, with_moves}.src/simulator/crates/mc-turn/src/processor.rs—process_move_requests,MoveOutcome, 6 new tests inmove_request_tests.src/simulator/crates/mc-player-api/src/dispatch.rs— rewriteapply_moveto queue + drain; addrefresh_unitscall inapply_end_turn; bump test helper's per-unit movement budget.src/simulator/crates/mc-player-api/src/wire.rs—Event::UnitMoved.pathfield.src/simulator/api-gdext/src/lib.rs—GdGameState::initupdated for newGameStatefields.
Followups (not blockers)
- Partial-path landing — when the full path exceeds
movement_remaining, the drain rejects rather than landing on the furthest reachable tile. Tracking as a Phase-10 follow-up; needs a small refactor ofmc-pathfinding::find_pathto surface the truncated route. - Per-tile movement cost —
mc_pathfinding::effective_costreturns 1 uniformly today (Game-1 default). When non-uniform terrain costs land,process_one_move'scost = p.len()heuristic needs to sum the per-tile cost instead.
2026-05-11 — Phase 8 landed (TradeLedger + Promote + formation/auto-join + bench OpenBorders/SharedMap)
Wired 9 previously NotYetImplemented dispatch routes and one
pre-existing tech-debt site (dummy_ledger in
apply_declare_war). The deferred routes have sharper breadcrumbs
that cite the precise schema mismatch blocking them.
Shipped
GameState::trade_ledger: TradeLedger—#[serde(default)]for save-back-compat. Single authoritative ledger; thedummy_ledgerallocation inapply_declare_waris deleted in favour of&mut state.trade_ledger(real war declarations now break the right agreements).MapUnit::pending_promotion: Option<String>—#[serde(default, skip_serializing_if = "Option::is_none")]. Phase 11 follow-up consumes this on the nextTurnProcessor::stepto validate + apply the pick.Promotedispatch live —apply_promotevalidates unit exists, rejects emptypromotion_id, setspending_promotion, and emitsEvent::UnitPromoted { unit_id, promotion }. 2 new tests cover happy path + empty-id rejection.OfferOpenBorders/OfferSharedMapbench-sign — the wire protocol's three-verb flow (Offer → Accept → Reject) collapses on the bench because the counterparty AI doesn't yet model offer acceptance. Bench cheat: Offer instantly signs a 30-turnDiplomaticAgreementvia `state.trade_ledger.alloc_agreement_id()- agreements.push(...)
.Accept*/Reject*` are no-op acknowledgements on the bench. Documented honestly in the dispatch comments; canonical doc update tracked for Phase 13. 2 new tests cover the OpenBorders + SharedMap sign paths.
- agreements.push(...)
SplitFromFormation/SetAutoJoindispatch live — both resolveplayer_idxviafind_unit_indices(so wireunit_idstrings get translated to theu8slot the queue structs require), then pushmc_core::formation::SplitFormationRequest/AutoJoinRequest. 2 new tests assert the queue grows.CommandFormation/SetFormationShapedispatch live — resolveplayer_indexviastate.formations.get(&formation_id)(so unknown formation ids fail withActionError::IllegalAction).CommandFormation's optional target hex falls back to(-1, -1)per the queue struct's sentinel convention.
Honest scope cuts (sharper breadcrumbs, not silent)
SetRallyPoint/ClearRallyPoint— schema mismatch. The wire surface is per-unit;mc_core::RallyPointRequestis keyed by(player_index, city_index, building_id)and sets the rally on the producing building, not on an arbitrary unit. Routing honestly requires either (a) tracking the producer-building per unit, or (b) authoring a separatepending_unit_rally_requestsqueue. Both are bigger lifts than the brief promised. Breadcrumb cites the schema gap.RangedAttack— no single-target ranged resolver exists inmc-combattoday (onlypending_volley_requests, which is AoE). Routing single-target through volley silently corrupts the wire contract — adapters would see AoE damage when they asked for one shot. StaysNotYetImplementedwith the corrected breadcrumb citing the volley-vs-single-target distinction.
Tests + gate
cargo test -p mc-player-api --lib: 62/62 green (was 56, +6 new for Phase 8: open_borders sign / shared_map sign / promote happy / promote empty-id / split queue / auto-join queue / set_rally NotYetImplemented).cargo test -p mc-turn --lib: 207/207 green (no regression fromtrade_ledger/pending_promotionfield additions).cargo check --workspace --exclude magic-civ-physics-gdext: clean. The api-gdext pre-existing errors (mc_turn::snapshotimport,decide_tactical_actionsarity) are unchanged on main HEAD and unrelated.
Files touched
src/simulator/crates/mc-turn/src/game_state.rs— addtrade_ledgerandpending_promotionfields.src/simulator/crates/mc-player-api/src/dispatch.rs— wire 9 new routes + 6 new helper fns + 6 new tests + deletedummy_ledger.src/simulator/api-gdext/src/lib.rs—GdGameState::initupdated for newtrade_ledgerfield.
2026-05-12 — Phase 13 STOP (render bridge from GdPlayerApi state does not exist)
Two independent hard-stop conditions. Leading with the structural one because it doesn't depend on any AI-behaviour debate.
Primary blocker — render path
Phase 13 requires "Capture screenshots every 5 turns". The current
headless harness (claude_player_main.gd) is JSON-Lines only — no
scene tree, no TileMap, no camera. Production proof scenes
(gameplay_arc_proof.tscn, world_map.tscn, etc.) render from the
GameState autoload, not from a GdPlayerApi-held state. There is
NO path today that takes the JSON state held by
GdPlayerApi.load_state_json and renders it visually.
Wiring this requires either:
- Render bridge — extract the proof-scene rendering pipeline into
a function that takes a
GdGameStateinstance (not the autoload), so the harness can pass its bootstrapped + ticked state for capture. - Two-process orchestration — one process drives the JSON pump, another reads its events and replays them into a renderable scene on the side.
Either is its own objective with its own surface area. Neither was specced in p2-67 Phase 0-9 because Phase 13 was scoped as "use the existing render path" without verifying one existed for this state shape.
Secondary blocker — degenerate AI behaviour
Per brief hard-stop rule: "Claude-vs-AI demo produces no AI activity in any 5-turn block → STOP, document, exit (signals a regression somewhere in the AI driver)."
Evidence
5-EndTurn smoke at Phase-11 commit ff7198346 (3-player, seed=42):
turn 0 → slot 1 actions_applied=1, slot 2 actions_applied=1
turn 1 → slot 1 actions_applied=0, slot 2 actions_applied=0
turn 2 → slot 1 actions_applied=0, slot 2 actions_applied=0
turn 3 → slot 1 actions_applied=0, slot 2 actions_applied=0
turn 4 → slot 1 actions_applied=0, slot 2 actions_applied=0
A 25-turn run would produce identical zero-activity blocks for turns 5-9, 10-14, 15-19, 20-24. The hard-stop fires multiple times. Even with the render bridge in place, the resulting video would be Claude playing solitaire while the AI sits motionless — not the "Claude vs production AI" promise.
What WAS validated this session
- MCP install path is well-understood (the brief's command is
cd tooling/claude-player-mcp && npm install, then addmagic-civto.mcp.json). Both can be done in <5 minutes when the rest of the pipeline is warm. Not attempted now per the parent hard-stop. - The MCP server itself (
tooling/claude-player-mcp/) was shipped in the 2026-05-10 Phase 4 work and is wire-stable.
What unblocks Phase 13
Both Phases 12 and 13's dependencies overlap:
- AI projector enrichment (so AI produces non-trivial action chains past turn 0 → demo isn't degenerate).
- Render bridge from
GdPlayerApistate to a scene (so screenshots capture real game state).
When both land, Phase 13 is a single afternoon: npm install, edit
.mcp.json, drive a 25-turn run via the MCP, capture per-5-turn
screenshots into .local/demo-runs/<stamp>/, write the recap.md.
Status
p2-67 stays partial. Phases 0-11 landed; Phases 12 + 13 deferred
behind two follow-ups (pX-bench-projector-enrichment,
pX-render-bridge-gdplayerapi). Re-open Phase 13 when both follow-ups
close.
2026-05-12 — Phase 12 STOP (ObservationStore API surface mismatch)
Hard-stop triggered per brief rule: "ObservationStore API surface mismatch with what the projector needs → STOP, document, exit (don't paper over with a parallel observation store in mc-player-api)."
What the brief assumed
mc_observation::ObservationStore lookups answer the question
"is tile (col, row) visible to player P at the current turn?" so the
projector can mark each TileView as visible / fogged / hidden.
What's actually missing
Not "ObservationStore is the wrong shape" — ObservationStore is
fine as a query surface: get_turn(turn).tile_indices.contains(idx)
answers "was tile X visible to player P at turn T", which is exactly
what a fog projector needs for "Visible / Fogged / Hidden" classification.
What's missing is the Rust-side visibility producer. Today
record_turn(turn, grid, visible_tile_indices: &[u16]) takes
pre-computed visibility — the caller (presumably GDScript Vision.gd
or an equivalent Rust port that hasn't been ported yet) owns the
"compute which tiles are visible to player P right now" calculation.
There is no mc-observation API that takes (GameState, PlayerId)
and returns a visible-tile set.
What ObservationStore actually is
A per-player CLIMATE / WEATHER observation history for the Chronicle
UI. src/simulator/crates/mc-observation/src/store.rs:8-90:
TurnObservation { turn, tile_indices, records }— climate snapshot (temperature, moisture, wind, succession_progress) of every tile visible at recording time for that turn. Sparse on visible tiles only.ObservationStore::record_turn(turn, grid, visible_tile_indices)takes a pre-computed list of visible tile indices — meaning the visibility calculation lives somewhere OTHER thanmc-observation.ObservationStore::get_turn(turn) -> Option<&TurnObservation>returns historical climate, not a "right now this tile is visible" lookup.
There is no is_visible(player, col, row, turn) -> bool API. The
store's public surface (write_turn_frame_buffers,
write_latest_known_frame_buffers, unlock_lens, set_recording_gate,
…) is shaped for the Chronicle UI's climate ribbon — not for
gameplay fog of war.
Why papering over would be wrong
Per Rust SoT rail + brief's hard-stop: building a parallel "current
visibility per player" calculation inside mc-player-api/projection.rs
would duplicate the visibility logic that has to also live wherever
ObservationStore::record_turn's visible_tile_indices argument is
computed (likely GDScript Vision.gd or a Rust port thereof). That's
exactly the duplication the rail forbids.
What's actually needed
Either:
-
mc-visioncrate (or similar) that owns "compute current visible tile set for player P given GameState" as the single source of truth. BothObservationStore::record_turncallers and the projector pull from this. Includes aVisibility { Hidden, Fogged, Visible }query for any (player, tile, turn) tuple. -
Widen
ObservationStoreto include current visibility lookups alongside the climate history. Doable but mixes concerns — climate recording is one job, gameplay fog is another.
The honest path is option 1. Surface area is moderate: walk all
P-owned units + cities, compute hex-distance ≤ vision_radius per
unit/city, union into a HashSet<(col, row)>, expose a Visibility
enum that says "Visible if in current set, Fogged if in any prior
set, Hidden otherwise."
Why Phase 12 stays open until then
The projector currently uses strict-redaction fog (own-player-only). Without per-tile vision data, all enemy tiles are hidden, which matches "Hidden if never seen." The current behaviour is correct for "player who has never explored anywhere" — degenerate but not wrong. The wrong-ness only matters once units have moved and explored, and that path is also blocked by the AI behavioural-inertness gap from Phase 11's notes (units don't move past spawn). Fix in order:
- AI projector enrichment so units actually move and explore.
mc-visioncrate so fog has meaningful current/last-seen state.- Phase 12 projection rework on top of (1) + (2).
Status
p2-67 stays partial. Phases 0-11 landed. Phase 12 deferred behind
the mc-vision follow-up objective. Phase 13 (MCP install + 25-turn
demo + screenshot bundle) is also held — independent of fog
correctness, but driving a 25-turn Claude run against an AI that
returns to inertness on turn 2+ produces a degenerate demo (Claude
moves, AI sits). Phase 13 unblocks alongside the AI projector
enrichment.
2026-05-12 — Phase 11 landed (TurnProcessor::step ticking)
p2-68 closed all of Phase 10 (production AI driver replaces scripted heuristic).
Phase 11 wires TurnProcessor::step into apply_end_turn between the AI
loop and the closing TurnStarted emit so production, growth, research,
founding, pending_move_requests, and fauna encounters all drain per turn.
Shipped
mc_turn::processor::TurnProcessor::stepnow owns per-turn unit refresh.src/simulator/crates/mc-turn/src/processor.rs:528-535— addedcrate::refresh_units(state)at end-of-step. Single source of truth per the DRY rule locked in Phase 9; the dispatch-levelrefresh_unitscall is deleted in the same patch.mc_player_api::dispatch::apply_end_turnrunsstepafter the AI loop.src/simulator/crates/mc-player-api/src/dispatch.rs:258-281— constructsTurnProcessor::new(u32::MAX)(advisorymax_turns; victory_config overrides when present), callsstep(state), extends the responseeventsvec with translated processor events. The dispatch'sstate.turn = state.turn.saturating_add(1)andrefresh_units(state)call sites are both deleted —stepowns turn increment + unit refresh.translate_processor_eventstranslator atdispatch.rs:295-368. Maps 5mc_replay::TurnEventvariants towire::Event:TechResearched,WonderBuilt,CityFounded,CityCaptured,GameOver.ClanId(u32)is sourced fromprocessor.rs:910aspi as u32so the clan→player mapping isid.0 as PlayerIdwith no separate table needed. Variants without a direct wire counterpart (AmbientEncounterFired, UnitKilled, War/Peace, Era, Leader, ClanEliminated, UnitCaptured, UnitRansomOffered, CivilianDestroyed) are listed in an explicit drop arm so adding a newTurnEventvariant forces a compile-time decision.- Cargo dep
mc-replayadded tomc-player-api/Cargo.toml.
Tests + gate
cargo test -p mc-player-api --lib: 77 passed (was 74, +3 new):end_turn_ticks_city_food_growth_via_turn_processor— 2-turn food accumulation crosses growth threshold (pop 1 → 2).end_turn_completes_queued_unit_via_turn_processor— city withproduction_stored=100+Queueable::Unit{dwarf_warrior}spawns a unit after one EndTurn (player.units.len()grows).end_turn_refreshes_unit_movement_via_turn_processor— unit withmovement_remaining=0andbase_moves=32refreshes to 32 after step.
cargo test -p mc-turn --lib: 207/207 still green (no regression from addingrefresh_unitsto end-of-step).cargo check --workspace: clean (pre-existing 17 doc-comment warnings).
Smoke confirms ticking
Re-ran the 3-player apricot smoke at the Phase-11 commit (gdext rebuild
- class-cache refresh + 5 EndTurns). Claude's view across turns 0..5 showed visible state advancement:
food_stored: 0 → 2 → 4 → 6 → 8 → 10 (net +2/turn)gold: 60 → 68 → 76 → 84 → 92 → 100 (+8/turn)unit_count: 3 → 3 → 4 → 5 → 6 → 6 (production threshold spawns)science_per_turn: 0 → 42 (strategic_axes kicked in post-step)
This is the direct, observable consequence of Phase 11. Pre-Phase-11 smokes showed every field static across all 5 turns.
Honest finding — AI side still inert (separate from Phase 11)
Same smoke surfaces actions_applied=0 on the AI side (slots 1+2) for
turns 1-4 despite Phase 11 wiring step. Turn 0 still produces 1 action
per slot (the founding pass).
This contradicts the p2-68 Wave-final hypothesis ("the bench doesn't tick,
that's why the AI sees nothing to do"). Wave-final was partially wrong:
the bench DOES tick visibly for Claude. The AI's inertness is a deeper
issue — decide_tactical_actions on the bench projection bottoms out
after the founding pass because:
unit_catalogis empty in the bench-projector (p2-68 Wave 1 documented limitation),(food, prod, gold)per-tile yields are zero in the bench projection,- the unit move queue is empty because the AI projector has no per-tile cost data.
Phase 11 closes the "step doesn't tick" issue. The "AI is behaviorally
inert past turn 0" issue is its own follow-up. Recommendation: open a
new objective pX-bench-projector-enrichment to widen project_tactical
with unit_catalog + per-tile yields + movement-cost data so
decide_tactical_actions has a non-degenerate search space on the bench.
2026-05-11 — Phase 10 STOP (structural blocker; documented)
Phase 10 cannot land as a thin dispatch swap. Per the user's explicit STOP rule ("If a phase reveals a deeper structural problem … STOP, document the blocker in the objective doc, exit with a clean summary. Don't simplify around it"), the work pauses here.
What Phase 10 actually requires
The brief described it as
"Replace run_scripted_ai_turn with mc_ai::run_ai_turn(state: &mut GameState, player, &TechWeb, &Personalities) -> u32". That
function does not exist in mc-ai. Evidence:
- No
pub fn run_ai_turninmc-ai/src/lib.rs— the exported surface (decide_tactical_actions,evaluator,score_*,decide_ransom_response, …) takes a pre-projectedTacticalState, not the liveGameStatethe dispatch layer holds. File:src/simulator/crates/mc-ai/src/lib.rs:20-44. - No
GameState → TacticalStateprojector in the workspace. All callers ofdecide_tactical_actionsare test fixtures that buildTacticalState { … }literals by hand:src/simulator/crates/mc-ai/tests/tactical_port_regression.rs:172-381. - Tactical tests are
#[ignore]d — the suite header reads "Tests exercisingdecide_tactical_actionsdirectly are marked#[ignore]" (line 3 of the same file). The tactical port isn't in steady state; building a Phase-10 dispatch on top inherits that instability. - No
Action → GameStateapplicator.decide_tactical_actionsreturns aVec<Action>of high-level intents (Move / Attack / FoundCity / QueueProduction / …). Translating each variant back into aGameStatemutation is the symmetric half of the projector and is currently absent — the GDScript path (src/game/engine/src/modules/ai/ai_turn_bridge.gd) does this in GDScript by calling the same gdext shimsmc-player-apialready calls. Reimplementing that mapping in Rust is mc-ai's work, not mc-player-api's.
Why this is its own objective
Surface area is roughly comparable to Phase 9 (new crate-internal
module, projector + applicator, tactical-test thaw, AI personality
loading, deterministic seeding). It belongs as its own objective
slice — recommendation: p2-68 mc-ai headless turn driver —
with p2-67 flipping its status to that p2-68 blocker once
filed.
Why Phases 11/12/13 follow
- Phase 11 (
TurnProcessor::stepafter AI loop) requires the real AI loop to be running. Ticking production for slot 0 only (because the scripted heuristic doesn't decrement queue production) produces misleading event counts; the DRY rule ("deleterefresh_unitscall site after Phase 11") only makes sense once Phase 10 lands. - Phase 12 (per-player fog from
ObservationStore) is a cosmetic refinement of the projection — usable once 10 is real, pointless until then since the scripted AI has no observations. - Phase 13 (Claude-vs-AI demo run + screenshots) literally cannot happen against a scripted heuristic that founds-city / queue-warrior / fortify; the demo brief specifies "Claude vs the production AI".
All three are deferred behind p2-68.
Reference implementation pointer
The GDScript driver AiTurnBridge lives at
src/game/engine/src/modules/ai/ai_turn_bridge.gd (+ _dispatch.gd
_state.gd). It readsai_personalities.json, invokesGdMcTreeControlleronGdGameState, and applies the resulting actions through the same gdext shim layerGdPlayerApicalls. The headlessmc_ai::run_ai_turnshould mirror this — same inputs, same output side-effects — but without the GDScript autoload dependencies.
Status
- p2-67 stays
partial. Phases 0-9 + bench-grade Phase 8 deliverables are shipped (39/39 routes either live, intentionally bench-cheated with breadcrumbs, or honestlyNotYetImplementedwith cited schema gaps). - Phases 10-13 deferred behind the new blocker objective
(recommended id:
p2-68). Will editblocked_byonce the follow-up objective is filed.
2026-05-12 — Updated path to "Claude vs production AI" demo
Phase 11 shipped (TurnProcessor::step ticking). Phases 12 + 13 each hit a structural gap that is NOT a quick wedge:
Phase 12 — needs mc-vision crate (filed as p2-70)
mc_observation::ObservationStore was the wrong tool — it's per-player climate observation history (temperature/moisture/wind), not per-tile visibility. The actual per-player visibility producer currently lives only in GDScript (Vision.gd). Rust has no fn visible_tiles(state: &GameState, player: PlayerId) -> HashSet<HexCoord>.
Filed as p2-70 mc-vision (per-player visibility producer). Estimate ~1 day.
Phase 13 — needs two pieces
1. Bench projector enrichment (filed as p2-71):
The AI is correctly wired through project_tactical → run_ai_turn → apply_ai_action, but decide_tactical_actions returns empty action chains past turn 0 because the bench projector emits a degenerate TacticalState — empty unit_catalog, zero per-tile yields, no move-cost data (all p2-68 Wave 1 documented limitations). Until the projector serves a representative tactical surface, MCTS bottoms out on nothing-to-do.
Honest correction recorded: last night's "AI inertness = bench doesn't tick" hypothesis was wrong (Wave 2's TurnProcessor wire proved Claude's slot ticks; AI inertness is a separate projector-fidelity issue).
Estimate ~1-1.5 days.
2. GdPlayerApi → render bridge (filed as p2-72):
Production proof scenes (e.g. world_map.tscn, gameplay_arc_proof) render from the GameState autoload. GdPlayerApi keeps its own state internally via load_state_json. There is currently no path that visualises the GdPlayerApi-held world. Phase 13's "screenshots every 5 turns" requires either pointing the autoload at GdPlayerApi's state (one source of truth) or a thin render adapter that reads view_json and drives a TileMap.
Estimate ~0.5-1 day.
Sequencing
- p2-70 (mc-vision) ↔ p2-71 (projector enrichment) are independent — can run in parallel.
- p2-72 (render bridge) is independent of both; can run anytime.
- p2-67 final close happens when all three land + a Claude-vs-AI run produces screenshots and an action log.
References
src/simulator/crates/mc-core/src/action.rs— unit action enumsrc/simulator/crates/mc-core/src/city_action.rs— city action enumsrc/simulator/crates/mc-core/src/building_action.rs— building queuesrc/simulator/crates/mc-mcts-service/src/{framing,protocol,server}.rs— wire-protocol precedentsrc/simulator/crates/mc-turn/src/action_handlers/— existing apply-action plumbing to delegate intosrc/simulator/crates/mc-ai/src/lib.rs::AiTurnBridge— AI driver for non-Claude slotssrc/game/engine/scenes/tests/auto_play.gd— full headless harness precedent- p2-66 (
world-map-visual-proof.md) — sister objective for visual rendering of the resulting games
2026-05-12 — Mocked Phase 13 deliverable shipped
The API-level Phase 13 deliverable — a deterministic 25-turn Claude-vs-AI
transcript via mc-player-api — is complete and gated by a test.
Screenshot Phase 13 (the visual-proof half) is still gated on p2-72
(a/b/c) — when the rendered-game bridge lands, the same construction
can be driven through the GDExtension path for a screenshot.
- Test:
src/simulator/crates/mc-player-api/tests/full_game_transcript.rs—claude_vs_ai_full_game_transcript. Drives the same harness statesmoke_5_endturn_mockuses (lifted totests/common/mod.rs), runs up to 25 turns, asserts byte-identical transcript across two runs, asserts the three game-loop constraints (city by 5, AI unit by 10, movement/combat across run). Verification:cargo test -p mc-player-api --test full_game_transcript. - Artifacts (gitignored under
.local/):transcript.jsonl— 248 lines, ~817 KB. Canonical JSON-Lines wire format thatclaude_player_main.gdwould emit headlessly.state-turn-NN.jsonfor turns 0, 5, 10, 15 —PlayerViewsnapshots Claude would see at those boundaries.recap.md— per-turn action log, AI summaries, score deltas, residual-gap notes.
- Residual gaps (call out for follow-ups):
legal_actionsonPlayerViewis a stub (project_empire_legal_actionsreturns onlyEndTurn; per-unitlegal_actionsonly carriesSkip/Fortify/disabled-Move). Claude's policy reads RAWPlayerView.units/citiesinstead. File asp2-67-followup-legal-actionswhen promoting the real-legality probe.mc-turn/src/processor.rs:2425overflows (*a_hp * a_formation_size as i32) during PvP combat resolution at turn 17 of the canonical run. The transcript terminates cleanly viacatch_unwindwith a syntheticprotocol_errornotification and the run is faithfully truncated; the bug itself is upstream ofmc-player-apiand tracked as a residual mc-turn issue. (Combat happened — that's why the overflow fired — so constraint 4 is satisfied; the panic is itself evidence of combat resolution engaging.)- Unit-spawn events (
UnitCreated,CityUnitCompleted) don't surface for every spawn path —PlayerState.unitsmutates directly in some mc-turn code paths without emitting amc_replay::TurnEventthe dispatch layer can pick up. The constraint check therefore falls back to per-slot unit-count growth inscore_snapshot(slot 1 grew from 4 → 27 units by turn 15, proving AI building activity).
- Shared harness lift:
mc-player-api/tests/common/mod.rsnow ownsbuild_3_player_state_like_harness,build_runtime_units_catalog,build_unit_catalog,build_building_catalog,add_player_militarist_inline,stamp_personality. The 5-turn smoke (smoke_5_endturn_mock) was refactored to consume them — it still passes byte-identically to the pre-lift version.
2026-05-12 — Real-apricot demo transcript captured
Drove the production harness on apricot HEAD 1c91a332d for 25 EndTurn
cycles in 3-player Claude-vs-AI configuration. Captured the full
JSON-Lines wire transcript, scp'd to plum, produced a per-5-turn recap
with score / AI-action / event tables.
- Driver:
scripts/claude-demo-25turn.sh - Transcript:
.local/demo-runs/2026-05-12-real-apricot-claude-vs-ai/transcript.jsonl(37 lines, 420 666 bytes — oneact_responseenvelope per turn, each ~11 KB packingevents[]+ fullviewsnapshot) - Recap:
.local/demo-runs/2026-05-12-real-apricot-claude-vs-ai/recap.md - Summary JSON:
.local/demo-runs/2026-05-12-real-apricot-claude-vs-ai/summary.json
Run outcome
- All 25
act:end_turnrequests succeeded withok:true.shutdown_okreceived. Harness exit 0. No protocol_error. - AI slot 1: 63 actions applied over 25 turns (range 1–5/turn, never zero — matches p2-71 smoke acceptance).
- AI slot 2: 82 actions applied over 25 turns (range 2–4/turn, never zero).
- City foundings at turns 13 and 25 (3 each — one per slot, founder units settling on schedule).
- Final Claude state: gold=356, cities=3 (capital
0_0@ (1,6),0_1@ (5,10),0_2@ (9,14)), units=31.
Residual gaps
- Combat overflow fix unverified at runtime. Zero combat events
occurred in 25 turns on the duel map at seed 42 — both AI slots
expanded in parallel without contact. The mock transcript's
attempt to multiply with overflowpanic atmc-turn/src/processor.rs:2425is therefore neither reproduced nor cleared by this run. Follow-up needed: an adversarial map preset or scriptedAttackTileinjection to actually exercise the combat path. - Claude slot is passive. This driver issues only
EndTurnper turn; the harness has no autonomous Claude policy bound. Claude's state advances via engine auto-actions (founder auto-settle, gold accumulation) but noFoundCity/Fortify/QueueProductionfrom a Claude brain. A real Claude policy is Phase 13 territory. - Transcript shape vs. ">100 lines" criterion. Acceptance was
authored against the mock's per-action driver (248 lines). Real
harness packs each turn into one fat
act_response(37 lines × ~11 KB). 420 KB of well-formed wire JSON satisfies the spirit; the line shape is a property of the real wire format, not a deficiency. Documented inrecap.md. - Phase 13 screenshots STILL gated on p2-72. This is the API+transcript form of Phase 5; no rendered proof scene captured.