magicciv/.project/objectives/p2-71-bench-projector-enrichment.md
Natalie 02ea1eccc0 feat(api): add 25-turn Claude demo transcript capture
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-11 20:20:10 -07:00


id title priority status scope category owner created updated_at closed_at blocked_by follow_ups
p2-71 Bench projector enrichment — make MCTS see a real tactical surface p2 done game1 simulation simulator-infra 2026-05-12 2026-05-11 2026-05-11
p2-67

Context

p2-67 Phase 13 is partly blocked on this objective. The AI is correctly wired through project_tactical → run_ai_turn → apply_ai_action (p2-68 Waves 1+3+4 + p2-69), and Phase 11's TurnProcessor::step proves that Claude's slot ticks every turn, but the production AI returns empty action chains past turn 0.

Root cause: mc-player-api/src/projection.rs::project_tactical was written as a v1 minimal projector and deliberately omitted several fields with #[serde(default)]-tolerant fallbacks. decide_tactical_actions bottoms out on:

  • TacticalState.unit_catalog empty → no legal unit-build choices.
  • TacticalState.building_catalog empty → no legal building-queue choices.
  • Per-tile yields zero → city placement scoring uniform.
  • No move-cost data → unit moves have no cost signal.
  • strategic_axes / personality scoring tables empty → MCTS prior is uniform.

The MCTS isn't broken; it's correctly returning "no productive action" because the projection it sees has nothing productive to do.
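The failure mode above can be sketched in a few lines: a projection built from defaults has nothing to choose from, so a correct decision step returns an empty chain. The types and the `decide_actions` function are simplified, hypothetical stand-ins, not the real mc-ai `TacticalState` or `decide_tactical_actions`.

```rust
// Hypothetical, simplified stand-ins for the real mc-ai types.
#[derive(Debug, Default, Clone)]
pub struct TacticalState {
    /// Units the AI may build; stays empty when the projector omits the field.
    pub unit_catalog: Vec<String>,
    /// Buildings the AI may queue; stays empty when the projector omits the field.
    pub building_catalog: Vec<String>,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Action {
    EnqueueBuild(String),
}

/// Toy decision step: queue the first available unit, else the first building.
/// With both catalogs empty there is no legal choice, so the chain is empty.
pub fn decide_actions(state: &TacticalState) -> Vec<Action> {
    state
        .unit_catalog
        .first()
        .or_else(|| state.building_catalog.first())
        .map(|item| vec![Action::EnqueueBuild(item.clone())])
        .unwrap_or_default()
}
```

Calling `decide_actions(&TacticalState::default())` yields an empty vector, which matches the "no productive action" behaviour described above; populating either catalog is enough to produce an `EnqueueBuild`.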

Source-of-truth rails

  • Rust crate: edit mc-player-api/src/projection.rs and (if needed) thread catalog handles through dispatch::apply_end_turn. Catalogs already exist in mc-units::UnitsCatalog (p2-67 Phase 9) and need a sibling in mc-buildings.
  • JSON path: none — projector reads existing public/games/age-of-dwarves/data/{units,buildings}/*.json via the loaders.
  • GDScript: harness wiring only — pass catalog handles into GdPlayerApi at boot.

Surface

1. Catalog plumbing

  • UnitsCatalog already loaded in claude_player_main.gd for MapUnit::new(...). Pass it through GdPlayerApi::new(...) so project_tactical can read it.
  • Add BuildingsCatalog (mirror UnitsCatalog pattern). Load once at harness boot.
  • Optional: TerrainCatalog for per-tile yield lookups.
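A minimal sketch of the plumbing described in this step, assuming a catalog keyed by id that is built once and handed to the API object through a setter. `BuildingDef`, its fields, `PlayerApi`, and `set_buildings_catalog` are illustrative names, not the real mc-buildings or GdPlayerApi signatures.

```rust
use std::collections::BTreeMap;

/// Hypothetical building definition; the real loader's fields are not shown in this doc.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct BuildingDef {
    pub id: String,
    pub cost: u32,
}

/// Catalog keyed by id. A BTreeMap keeps iteration order stable across runs.
#[derive(Debug, Default, Clone)]
pub struct BuildingsCatalog {
    defs: BTreeMap<String, BuildingDef>,
}

impl BuildingsCatalog {
    pub fn from_defs(defs: impl IntoIterator<Item = BuildingDef>) -> Self {
        Self {
            defs: defs.into_iter().map(|d| (d.id.clone(), d)).collect(),
        }
    }

    pub fn get(&self, id: &str) -> Option<&BuildingDef> {
        self.defs.get(id)
    }

    pub fn len(&self) -> usize {
        self.defs.len()
    }

    pub fn is_empty(&self) -> bool {
        self.defs.is_empty()
    }

    /// Iterates definitions in id order.
    pub fn iter(&self) -> impl Iterator<Item = &BuildingDef> {
        self.defs.values()
    }
}

/// Hypothetical API object that receives the catalog once at harness boot.
#[derive(Debug, Default)]
pub struct PlayerApi {
    buildings: BuildingsCatalog,
}

impl PlayerApi {
    pub fn set_buildings_catalog(&mut self, catalog: BuildingsCatalog) {
        self.buildings = catalog;
    }

    pub fn building_catalog_len(&self) -> usize {
        self.buildings.len()
    }
}
```

The ordered map is a design choice for this sketch only: stable iteration order makes it easier to keep projector output reproducible between runs.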

2. Projector enrichment

In project_tactical:

  • Populate tactical.unit_catalog from UnitsCatalog — convert each UnitDef to the TacticalUnitDef shape mc-ai expects (cost, moves, attack, defense, prerequisites).
  • Populate tactical.building_catalog from BuildingsCatalog.
  • Per-tile yields: for each tactical.tiles[i], set food/production/gold/science/culture from the TerrainCatalog × current improvements/biome lookup. Mirror the formula used in mc-city::tile_yield::compute_yield.
  • Populate strategic_axes from ScoringWeights (already set per-player via p2-67 Wave 1 set_player_personality_json).
  • Populate promotion_*_weight / difficulty_threshold_mult from the personality table.
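The unit-catalog conversion in the first bullet could look roughly like the sketch below. The field list (cost, moves, attack, defense, prerequisites) comes from this document; the exact struct layouts, the extra `display_name` field, and the `project_unit_catalog` helper are assumptions made for illustration.

```rust
/// Hypothetical loader-side definition; the real UnitDef likely carries more fields.
#[derive(Debug, Clone)]
pub struct UnitDef {
    pub id: String,
    pub display_name: String, // assumed presentation-only field the AI does not need
    pub cost: u32,
    pub moves: u32,
    pub attack: u32,
    pub defense: u32,
    pub prerequisites: Vec<String>,
}

/// Hypothetical AI-side shape carrying only what scoring needs.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TacticalUnitDef {
    pub id: String,
    pub cost: u32,
    pub moves: u32,
    pub attack: u32,
    pub defense: u32,
    pub prerequisites: Vec<String>,
}

impl From<&UnitDef> for TacticalUnitDef {
    fn from(def: &UnitDef) -> Self {
        Self {
            id: def.id.clone(),
            cost: def.cost,
            moves: def.moves,
            attack: def.attack,
            defense: def.defense,
            prerequisites: def.prerequisites.clone(),
        }
    }
}

/// Projects every loaded definition into the tactical catalog, preserving order.
pub fn project_unit_catalog(defs: &[UnitDef]) -> Vec<TacticalUnitDef> {
    defs.iter().map(|d| TacticalUnitDef::from(d)).collect()
}
```

Keeping the conversion in a `From` impl means the projector body stays a one-line map, and any field added to the tactical shape later fails to compile until the conversion is updated.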

3. Smoke verification

After enrichment, re-run the 3-player 5-EndTurn smoke. Acceptance: AI slots emit actions_applied > 0 on turn 1+ (not just turn 0), with action variants varying by personality (blackhammer aggressive, deepforge defensive, etc.).

4. Test coverage

  • Unit test: project_tactical populates unit_catalog.len() > 0.
  • Unit test: per-tile yields non-zero for at least one non-ocean tile.
  • Integration test: 5-EndTurn driven game produces a non-empty AI action chain on each turn for each AI slot.
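The yield and integration checks listed above reduce to two small predicates, sketched here. `TacticalTile`, its fields, the `"ocean"` biome id, and the nested-vector layout for per-turn counts are assumptions for illustration, not the real test fixtures.

```rust
/// Hypothetical projected tile; the real TacticalTile has a different layout.
#[derive(Debug, Clone)]
pub struct TacticalTile {
    pub biome: String,
    pub food: u32,
    pub production: u32,
    pub gold: u32,
}

/// True for a non-ocean tile with at least one non-zero yield.
pub fn is_productive_land(tile: &TacticalTile) -> bool {
    tile.biome != "ocean" && tile.food + tile.production + tile.gold > 0
}

/// Yield test condition: at least one non-ocean tile has a non-zero yield.
pub fn any_productive_land(tiles: &[TacticalTile]) -> bool {
    tiles.iter().any(is_productive_land)
}

/// Integration test condition: every turn has a non-empty chain for every AI slot.
/// `per_turn[t][s]` is the number of actions slot `s` applied on turn `t`.
pub fn every_slot_acted_every_turn(per_turn: &[Vec<usize>]) -> bool {
    !per_turn.is_empty()
        && per_turn
            .iter()
            .all(|slots| !slots.is_empty() && slots.iter().all(|&n| n > 0))
}
```

Both predicates reject empty input on purpose, so a test that accidentally observes zero tiles or zero turns fails instead of passing vacuously.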

Acceptance

  • mc-player-api::projection::project_tactical populates unit_catalog, building_catalog, strategic_axes, personality weights (clan_id, promotion_*_weight). [evidence: crates/mc-player-api/src/projection.rs:442-470; tests tactical_carries_unit_catalog_from_state, tactical_carries_building_catalog_from_state, tactical_clan_id_round_trips_through_player_state, tactical_promotion_weights_round_trip]
  • ☑ Per-tile yields: project_tactical_map now populates TacticalTile.yields via a biome_yields(&str) -> (u32, u32, u32) lookup mirroring the canonical terrain JSON (public/games/age-of-dwarves/data/terrain/{land_common,land_forest,land_special,frozen,water}.json; JSON trade → tactical gold). Closed by p2-71a (2026-05-11). [evidence: crates/mc-player-api/src/projection.rs::biome_yields + tests biome_yields_lookup_matches_terrain_json, tactical_tile_yields_populate_from_biome; mc-player-api 87/87 green, mc-ai 240/240 green, smoke_5_endturn_mock green]
  • BuildingsCatalog exists as Vec<TacticalBuildingSpec> held on GameState::ai_building_catalog (mirror of UnitsCatalog pattern, simpler since the building catalog is consumed only by the projector — no runtime sim need). [evidence: crates/mc-turn/src/game_state.rs:336-358]
  • GdPlayerApi accepts catalog handles via setters: set_units_catalog_json, set_buildings_catalog_json, set_difficulty_threshold_mult, plus unit_catalog_len / building_catalog_len debug readers. [evidence: api-gdext/src/player_api.rs]
  • ✓ 5-EndTurn smoke shows actions_applied > 0 on AI slots across the multi-turn span. Evidence: crates/mc-player-api/tests/smoke_5_endturn_mock.rs::mocked_5_endturn_smoke_produces_multi_turn_ai_activity — both AI slots emit actions_applied > 0 on >=3 of 5 turns; byte-deterministic across two runs. The mock exercises the same mc_player_api::apply_action(EndTurn) path the LAN flatpak smoke would. The downstream fix that unblocked this was p2-71c (runtime UnitsCatalog wiring on GdGameState) — without it, MapUnit::new returned base_moves=0, every AI-planned MoveUnit rejected at process_one_move's movement-budget gate, and chains of 5-8 planned actions truncated to 0-1 applied. Real-apricot smoke remains queued for LAN restoration; the simulator-side gate is locked in via the mocked smoke.
  • ☑ AI action variants differ by personality on turn 1 — both slots emit one EnqueueBuild action; the item picked differs by clan (blackhammer slot 1 vs goldvein slot 2 in observed run). Differentiation across turns 2-5 is moot because zero actions emit. Follow-up gap: a richer smoke needs an initial state with a settler unit or visible enemies.
  • ☑ Unit tests prove projector enrichment: 7 new tests in crates/mc-player-api/src/projection.rs (84/84 passing, was 77/77). Integration test for full 5-turn chain is the smoke script.
  • cargo test -p mc-player-api --lib 84/84 green; cargo test -p mc-ai --lib 240/240 green; workspace cargo check clean.
  • ✓ p2-68 acceptance bullet "smoke-non-trivial-AI-chains" flipped via the mocked smoke (p2-71b + p2-71c). Turns 1-5 now emit non-trivial action chains for both AI slots; chains differ by personality (slots stamped with distinct clan_id).
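The truncation described in the 5-EndTurn bullet (planned chains of 5-8 actions collapsing to 0-1 applied when base_moves is 0) can be reproduced with a toy movement-budget gate. This is a sketch only: `PlannedAction`, `apply_chain`, and the choice to skip a rejected move rather than abort the chain are assumptions, not the behaviour of process_one_move.

```rust
/// Hypothetical planned actions; only moves consume movement budget.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PlannedAction {
    MoveUnit { cost: u32 },
    EnqueueBuild,
}

/// Applies a chain in order and returns how many actions were accepted.
/// A move whose cost exceeds the remaining budget is rejected and skipped.
pub fn apply_chain(base_moves: u32, chain: &[PlannedAction]) -> usize {
    let mut remaining = base_moves;
    let mut applied = 0;
    for action in chain {
        match *action {
            PlannedAction::MoveUnit { cost } => {
                if cost > remaining {
                    continue; // movement-budget gate: rejected
                }
                remaining -= cost;
                applied += 1;
            }
            PlannedAction::EnqueueBuild => applied += 1,
        }
    }
    applied
}
```

With a six-action chain of five unit-cost moves plus one build, a budget of 0 lets only the build through, while a budget of 2 admits two moves and the build.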

Findings (2026-05-11) — what enrichment proved

Before p2-71: ALL AI turns (0..N) emitted actions_applied = 0. The projector was returning empty unit_catalog / building_catalog, so pick_for_city had nothing to queue and mc_ai correctly returned an empty action chain every turn.

After p2-71: Turn 1 emits 1 action per AI slot. Both slots successfully pick a tier-1 unit from the 160-entry unit catalog (via pick_best_melee) and queue it via Action::EnqueueBuild. This proves the catalog plumbing + projection are wired correctly end-to-end (GD → setter → GameState::ai_unit_catalog → projector → TacticalState → pick_for_city → AI dispatch).

The remaining zero-emission gap on turns 2-5 is not a projector defect. It is the combined effect of:

  1. Single-slot per-city production queue blocks EnqueueBuild once filled.
  2. Starter inventory has no settler/founder, so FoundCity actions never fire.
  3. Bench mapgen places capitals far apart, so warrior MoveUnit has no productive target (no enemy contact, no resource hex within move range).
  4. Fortify actions are not in the chain emitted by decide_tactical_actions for this state shape.

The right next move is a follow-up objective widening the starter inventory (add a settler/founder to the militarist init) or the AI's idle behaviour (emit Fortify for stationary military units when no movement target scores).
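The second option, an idle-behaviour fallback, might be shaped like the sketch below: a military unit with no positively scored move target fortifies instead of emitting nothing. `UnitView`, `best_move_score`, and `plan_unit_actions` are hypothetical names; the real decide_tactical_actions works on a richer state.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    MoveUnit { unit_id: u32 },
    Fortify { unit_id: u32 },
}

/// Hypothetical per-unit view: the best move score found, if any target exists.
#[derive(Debug, Clone, Copy)]
pub struct UnitView {
    pub id: u32,
    pub is_military: bool,
    pub best_move_score: Option<i32>,
}

/// Emits a move when a target scores above zero; otherwise military units fortify
/// and civilian units emit nothing.
pub fn plan_unit_actions(units: &[UnitView]) -> Vec<Action> {
    units
        .iter()
        .filter_map(|u| match u.best_move_score {
            Some(score) if score > 0 => Some(Action::MoveUnit { unit_id: u.id }),
            _ if u.is_military => Some(Action::Fortify { unit_id: u.id }),
            _ => None,
        })
        .collect()
}
```

Under this rule a stationary warrior with no enemy contact still contributes one action per turn, which would close the zero-emission gap for military units without touching the starter inventory.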

p2-71 Status

Status: done (8/8 ✓). Catalog plumbing + personality projection landed and proven; 5-EndTurn smoke green; per-tile yields closed via the p2-71a biome lookup (biome_yields in mc-player-api/src/projection.rs). City placement / citizen scoring now has terrain signal.
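For orientation, a biome lookup with the `biome_yields(&str) -> (u32, u32, u32)` signature named above might look like this sketch. The biome ids, the numeric values, and the (food, production, gold) tuple order are placeholders chosen for illustration; the canonical numbers live in the terrain JSON and are not reproduced here.

```rust
/// Returns (food, production, gold) for a biome id.
/// All ids and values below are illustrative placeholders, not the terrain JSON data.
pub fn biome_yields(biome: &str) -> (u32, u32, u32) {
    match biome {
        "grassland" => (2, 0, 0),
        "forest" => (1, 1, 0),
        "coast" => (1, 0, 1),
        // Unknown biomes fall back to zero yields rather than panicking.
        _ => (0, 0, 0),
    }
}

/// Hypothetical projected tile carrying the looked-up yields.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TacticalTile {
    pub biome: String,
    pub yields: (u32, u32, u32),
}

/// Fills in yields for every tile from its biome id.
pub fn populate_yields(tiles: &mut [TacticalTile]) {
    for tile in tiles.iter_mut() {
        tile.yields = biome_yields(&tile.biome);
    }
}
```

A hard-coded match keeps the projector free of a loader dependency, at the cost of drifting from the JSON if terrain data changes; a parity test against the JSON files is what guards against that drift.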

Follow-up objectives:

  • p2-71a — ✓ closed inline (2026-05-11). biome_yields(&str) mirrors terrain JSON; tests cover lookup parity and end-to-end projection. Follow-up tech debt noted in the doc comment: thread through a Rust-side TerrainCatalog loader once one exists.
  • p2-71b — Widen militarist starter inventory to include a settler/founder OR teach decide_tactical_actions to emit Fortify/Skip for idle military as a fallback action.

Why this size

  • BuildingsCatalog: ~2 hr (mirror UnitsCatalog).
  • Catalog plumbing through GdPlayerApi: ~2 hr.
  • Projector enrichment: ~3 hr (walk each field, port lookup).
  • Tile yield port: ~2 hr (compute_yield mirror).
  • Tests + smoke verification: ~2 hr.

Total: ~1-1.5 days.

Unblocks

  • p2-67 Phase 13 (demo will have actual AI gameplay to screenshot).
  • p2-68 smoke acceptance bullet flips ✓ → p2-68 status done.

References

  • src/simulator/crates/mc-player-api/src/projection.rs::project_tactical — current minimal projector.
  • src/simulator/crates/mc-ai/src/tactical/mod.rs::TacticalState — target shape.
  • src/simulator/crates/mc-units/src/catalog.rs — UnitsCatalog precedent (p2-67 Phase 9).
  • src/simulator/crates/mc-city/src/tile_yield.rs — yield formula source of truth.
  • public/games/age-of-dwarves/data/ai_personalities.json — personality scoring tables.
  • .project/objectives/p2-67-claude-player-api.md (Phase 13 STOP, 2026-05-12).
  • .project/objectives/p2-68-mc-ai-headless-turn-driver.md (Wave 1 projector limitations).

2026-05-11 — Real-apricot smoke ✓ (5-EndTurn bullet now LAN-backed)

The "5-EndTurn smoke shows actions_applied > 0 on AI slots across the multi-turn span" acceptance bullet, flipped ✓ earlier via the mocked smoke after the p2-71c base_moves wiring, is now confirmed end-to-end on apricot canonical at HEAD 1c91a332d:

{"turns_observed": 5, "ai_turn_completed_events": 10,
 "actions_applied_per_turn": [{"1.0": 2, "2.0": 2}, {"1.0": 3, "2.0": 3},
                              {"1.0": 4, "2.0": 3}, {"1.0": 4, "2.0": 2},
                              {"1.0": 4, "2.0": 3}],
 "passed": true, "reasons": []}

All 5 turns non-zero for both AI slots (blackhammer + deepforge); 10 ai_turn_completed events; harness emits the predicted runtime_units_catalog_loaded (175), ai_catalogs_loaded (160 units / 165 buildings), and per-slot ai_personality_assigned events on boot.
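The pass condition can be re-checked by hand from the numbers in the smoke output. The sketch below hard-codes the per-turn counts for slots 1 and 2 exactly as reported; the constant and function names are illustrative and not part of the harness.

```rust
/// Per-turn actions_applied for AI slots 1 and 2, copied from the smoke output.
pub const ACTIONS_APPLIED: [(u32, u32); 5] = [(2, 2), (3, 3), (4, 3), (4, 2), (4, 3)];

/// Number of turns on which each slot applied at least one action.
pub fn active_turns(per_turn: &[(u32, u32)]) -> (usize, usize) {
    let slot1 = per_turn.iter().filter(|t| t.0 > 0).count();
    let slot2 = per_turn.iter().filter(|t| t.1 > 0).count();
    (slot1, slot2)
}

/// Total actions applied by each slot across all observed turns.
pub fn totals(per_turn: &[(u32, u32)]) -> (u32, u32) {
    per_turn
        .iter()
        .fold((0, 0), |acc, t| (acc.0 + t.0, acc.1 + t.1))
}

/// One completion event per slot per turn.
pub fn expected_completed_events(per_turn: &[(u32, u32)]) -> usize {
    per_turn.len() * 2
}
```

On this data both slots are active on 5 of 5 turns, the totals are 17 and 13 actions, and 5 turns × 2 slots gives the 10 ai_turn_completed events reported.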

Status remains partial (7/8 ✓). Per-tile yields (bullet ⚠) is the sole remaining gap and is deferred to follow-up p2-71a. The 4-fix LAN-parity chain (root-Array cast → as Array cast → integer-preserving JSON concat → *.schema.json filter) has been logged on p2-71b and p2-68 for context.