From 703ff9abb8d023b0e37bf5608b3230cd296a4372 Mon Sep 17 00:00:00 2001
From: Natalie
Date: Mon, 11 May 2026 20:09:56 -0700
Subject: [PATCH] =?UTF-8?q?feat(@projects/@magic-civilization):=20?=
 =?UTF-8?q?=E2=9C=A8=20validate=20ai=20headless=20turn=20driver=20smoke=20?=
 =?UTF-8?q?tests?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-Authored-By: Lilith Autocommit
---
 .../p2-68-mc-ai-headless-turn-driver.md       | 29 +++++++++++++++++++
 .../p2-71-bench-projector-enrichment.md       | 16 ++++++++++
 .../p2-71b-militarist-starter-widening.md     | 28 ++++++++++++++++++
 3 files changed, 73 insertions(+)

diff --git a/.project/objectives/p2-68-mc-ai-headless-turn-driver.md b/.project/objectives/p2-68-mc-ai-headless-turn-driver.md
index fecd7058..4461967c 100644
--- a/.project/objectives/p2-68-mc-ai-headless-turn-driver.md
+++ b/.project/objectives/p2-68-mc-ai-headless-turn-driver.md
@@ -393,3 +393,32 @@ without any additional p2-68 work.
 - Inspected `mc-ai/tests/tactical_port_regression.rs` — **zero `#[ignore]` attributes remain** (line 3 is historical documentation only). All 23 tests pass on apricot canonical at current `origin/main`. Acceptance bullet 4 ("Every `#[ignore]` in tactical_port_regression.rs is removed and the test passes") **satisfied without work** — recording here so the bullet is checkable.
 - Evidence: `ssh apricot "cd ~/Code/project-buildspace/magic-civilization/src/simulator && cargo test -p mc-ai --test tactical_port_regression"` → `test result: ok. 23 passed; 0 failed; 0 ignored`.
 - Pre-existing workspace break observed: `src/simulator/api-gdext/src/ai.rs:25` imports `mc_turn::snapshot::{McAction, McSnapshot}` (module no longer exists in `mc-turn`), and `:631`/`:651` call `decide_tactical_actions` + `McSnapshot` constructors with old signatures. `cargo check --workspace` therefore fails.
**Outside p2-68 scope** — the dead-import + signature mismatch in api-gdext is tracked separately (the snapshot module was removed in an earlier MCTS refactor; the cfg(test) tests in api-gdext that constructed `McSnapshot/PlayerSnap` were never updated). Workaround in effect: this objective verifies `-p mc-ai` + `-p mc-player-api` standalone, which both build green. Workspace-green acceptance bullet **deferred** with this caveat noted.
+
+## 2026-05-11 — Real-apricot LAN smoke confirms the mocked guarantee
+
+`scripts/claude-smoke-5endturn.sh` on apricot canonical at HEAD `1c91a332d` (after the `*.schema.json` harvester filter fix landed in `claude_player_main.gd`):
+
+```
+{"turns_observed": 5, "ai_turn_completed_events": 10,
+ "actions_applied_per_turn": [{"1.0": 2, "2.0": 2}, {"1.0": 3, "2.0": 3},
+                              {"1.0": 4, "2.0": 3}, {"1.0": 4, "2.0": 2},
+                              {"1.0": 4, "2.0": 3}],
+ "passed": true, "reasons": []}
+```
+
+Harness boot events on stdout (the four predicted lines):
+
+```
+{"type":"runtime_units_catalog_loaded","units":175}
+{"clan_id":"blackhammer","slot":1,"type":"ai_personality_assigned"}
+{"clan_id":"deepforge","slot":2,"type":"ai_personality_assigned"}
+{"buildings":165,"difficulty_threshold_mult":1.0,"type":"ai_catalogs_loaded","units":160}
+```
+
+The "Headless smoke test: 5 EndTurns vs an AI clan produces non-trivial AI action chains" acceptance bullet — flipped earlier via the mocked smoke (`crates/mc-player-api/tests/smoke_5_endturn_mock.rs`) — is now also backed by the LAN flatpak harness on apricot, end-to-end through GDExtension. Status stays `done`; this entry documents the real-hardware confirmation that was queued for network restoration.
+
+The 4-attempt fix chain that unblocked LAN parity:
+1. `c3298eb52` — root-Array JSON cast removed.
+2. `3a5ec46be` — `as Array` cast on Dictionary value removed.
+3. `647df39dd` — raw-JSON concat preserves integers across the GDExtension boundary.
+4.
`1c91a332d` — harvester skips `*.schema.json` descriptors (this confirmation).
diff --git a/.project/objectives/p2-71-bench-projector-enrichment.md b/.project/objectives/p2-71-bench-projector-enrichment.md
index adda8e18..d48a68a9 100644
--- a/.project/objectives/p2-71-bench-projector-enrichment.md
+++ b/.project/objectives/p2-71-bench-projector-enrichment.md
@@ -118,3 +118,19 @@ Follow-up objectives (will be filed separately):
 - `public/games/age-of-dwarves/data/ai_personalities.json` — personality scoring tables.
 - `.project/objectives/p2-67-claude-player-api.md` (Phase 13 STOP, 2026-05-12).
 - `.project/objectives/p2-68-mc-ai-headless-turn-driver.md` (Wave 1 projector limitations).
+
+## 2026-05-11 — Real-apricot smoke ✓ (5-EndTurn bullet now LAN-backed)
+
+The "5-EndTurn smoke shows `actions_applied > 0` on AI slots across the multi-turn span" acceptance bullet, flipped ✓ earlier via the mocked smoke after the p2-71c base_moves wiring, is now confirmed end-to-end on apricot canonical at HEAD `1c91a332d`:
+
+```
+{"turns_observed": 5, "ai_turn_completed_events": 10,
+ "actions_applied_per_turn": [{"1.0": 2, "2.0": 2}, {"1.0": 3, "2.0": 3},
+                              {"1.0": 4, "2.0": 3}, {"1.0": 4, "2.0": 2},
+                              {"1.0": 4, "2.0": 3}],
+ "passed": true, "reasons": []}
+```
+
+All 5 turns non-zero for both AI slots (blackhammer + deepforge); 10 `ai_turn_completed` events; harness emits the predicted `runtime_units_catalog_loaded` (175), `ai_catalogs_loaded` (160 units / 165 buildings), and per-slot `ai_personality_assigned` events on boot.
+
+**Status remains `partial` (7/8 ✓).** Per-tile yields (bullet ⚠) is the sole remaining gap and is deferred to follow-up p2-71a. The 4-fix LAN-parity chain (root-Array cast → `as Array` cast → integer-preserving JSON concat → `*.schema.json` filter) has been logged on p2-71b and p2-68 for context.
diff --git a/.project/objectives/p2-71b-militarist-starter-widening.md b/.project/objectives/p2-71b-militarist-starter-widening.md
index da13dfd5..e3173e11 100644
--- a/.project/objectives/p2-71b-militarist-starter-widening.md
+++ b/.project/objectives/p2-71b-militarist-starter-widening.md
@@ -150,3 +150,31 @@ Real-apricot LAN smoke remains queued for network restoration but the simulator-
 - `public/games/age-of-dwarves/data/units/*.json` — settler/founder unit definitions.
 - `scripts/claude-smoke-5endturn.sh` — verification smoke.
 - `.project/objectives/p2-71-bench-projector-enrichment.md` (findings section).
+
+## 2026-05-11 — Real-apricot smoke ✓ (full LAN round-trip)
+
+Schema-filter fix (`1c91a332d`) closed the last harness blocker: `_apply_runtime_units_catalog` was globbing `*.json` and pulling `*.schema.json` descriptors into the catalog payload, which the Rust deserializer correctly rejected (`missing field 'id'`). The fourth-attempt smoke after that fix passed end-to-end on apricot canonical:
+
+```
+apricot HEAD: 1c91a332d fix(@projects/@magic-civilization): 🐛 skip schema files in unit catalog
+build: bash src/simulator/build-gdext.sh x86_64-unknown-linux-gnu → libmagic_civ_physics.x86_64.so copied
+smoke: CP_PLAYERS=3 CP_CLAUDE_SLOT=0 scripts/claude-smoke-5endturn.sh
+verdict: {"turns_observed": 5, "ai_turn_completed_events": 10,
+          "actions_applied_per_turn": [{"1.0": 2, "2.0": 2},
+                                       {"1.0": 3, "2.0": 3},
+                                       {"1.0": 4, "2.0": 3},
+                                       {"1.0": 4, "2.0": 2},
+                                       {"1.0": 4, "2.0": 3}],
+          "passed": true, "reasons": []}
+```
+
+Catalog + personality events on harness stdout:
+
+```
+{"type":"runtime_units_catalog_loaded","units":175}
+{"clan_id":"blackhammer","slot":1,"type":"ai_personality_assigned"}
+{"clan_id":"deepforge","slot":2,"type":"ai_personality_assigned"}
+{"buildings":165,"difficulty_threshold_mult":1.0,"type":"ai_catalogs_loaded","units":160}
+```
+
+Acceptance bar (≥3 of 5 turns with `actions_applied > 0` per AI slot) **exceeded**: all 5 turns
non-zero for both slots, 10 `ai_turn_completed` events total, and per-slot action counts rise from 2 on turn 1 to 3-4 on later turns (slot 2 dips back to 2 on turn 4). This is consistent with the bench projector + p2-71c `base_moves` wiring permitting `MoveUnit` to clear the movement-budget gate. The mocked-smoke guarantee from earlier in this file is now backed by real-hardware confirmation.
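
The acceptance bar above (≥3 of 5 turns with `actions_applied > 0` per AI slot) can be re-checked mechanically against the recorded verdict. The following is a minimal sketch, not the logic in `scripts/claude-smoke-5endturn.sh`: the helper names `active_turns_per_slot` and `meets_bar` are hypothetical, and the verdict JSON is copied from the smoke output above.

```python
import json

# Verdict copied from the smoke output recorded above.
VERDICT = json.loads("""
{"turns_observed": 5, "ai_turn_completed_events": 10,
 "actions_applied_per_turn": [{"1.0": 2, "2.0": 2}, {"1.0": 3, "2.0": 3},
                              {"1.0": 4, "2.0": 3}, {"1.0": 4, "2.0": 2},
                              {"1.0": 4, "2.0": 3}],
 "passed": true, "reasons": []}
""")


def active_turns_per_slot(verdict):
    """Hypothetical helper: per slot key, count turns with actions_applied > 0."""
    counts = {}
    for turn in verdict["actions_applied_per_turn"]:
        for slot, applied in turn.items():
            counts[slot] = counts.get(slot, 0) + (1 if applied > 0 else 0)
    return counts


def meets_bar(verdict, min_active_turns=3):
    """Hypothetical helper: True if every observed slot reaches the bar."""
    counts = active_turns_per_slot(verdict)
    # An empty verdict (no slots observed) is treated as a failure.
    return bool(counts) and all(n >= min_active_turns for n in counts.values())


if __name__ == "__main__":
    print(active_turns_per_slot(VERDICT))
    print(meets_bar(VERDICT))
```

Slot keys are kept as the strings the verdict emits (`"1.0"`, `"2.0"`) rather than coerced to integers, so the check does not depend on how slot ids are serialized.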