diff --git a/.project/objectives/README.md b/.project/objectives/README.md
index a6263a74..021d8eeb 100644
--- a/.project/objectives/README.md
+++ b/.project/objectives/README.md
@@ -14,11 +14,11 @@
 | Priority | ✅ | 🟡 | 🔴 | ❌ | ⚫ | Total |
 |---|---|---|---|---|---|---|
-| **P0** | 27 | 7 | 1 | 0 | 0 | 35 |
+| **P0** | 28 | 6 | 1 | 0 | 0 | 35 |
 | **P1** | 15 | 4 | 2 | 0 | 1 | 22 |
 | **P2** | 14 | 5 | 0 | 8 | 0 | 27 |
 | **P3 (oos)** | 0 | 0 | 0 | 0 | 17 | 17 |
-| **total** | **56** | **16** | **3** | **8** | **18** | **101** |
+| **total** | **57** | **15** | **3** | **8** | **18** | **101** |
@@ -27,8 +27,8 @@
 | Team Lead | Remaining |
 |---|---|
 | [asset-sprite](../team-leads/asset-sprite.md) | 7 |
-| [warcouncil](../team-leads/warcouncil.md) | 6 |
 | [wireguard](../team-leads/wireguard.md) | 6 |
+| [warcouncil](../team-leads/warcouncil.md) | 5 |
 | [shipwright](../team-leads/shipwright.md) | 2 |
 | [testwright](../team-leads/testwright.md) | 2 |
 | [asset-audio](../team-leads/asset-audio.md) | 1 |
@@ -39,8 +39,8 @@
 | ID | Status | Title | Owner | Updated |
 |---|---|---|---|---|
-| [p0-01](p0-01-mcts-wiring.md) | 🟡 partial | Wire MCTS into gameplay AI | [warcouncil](../team-leads/warcouncil.md) | 2026-04-17 |
-| [p0-02](p0-02-clan-personalities.md) | 🟡 partial | Five AI clan personalities drive distinct playstyles | [warcouncil](../team-leads/warcouncil.md) | 2026-04-17 |
+| [p0-01](p0-01-mcts-wiring.md) | 🟡 partial | Wire MCTS into gameplay AI | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 |
+| [p0-02](p0-02-clan-personalities.md) | 🟡 partial | Five AI clan personalities drive distinct playstyles | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 |
 | [p0-03](p0-03-pvp-in-turn.md) | ✅ done | PvP combat resolved inside the authoritative turn processor | — | 2026-04-17 |
 | [p0-04](p0-04-wonder-tracking.md) | ✅ done | World wonder tracking in PlayerState and score victory | — | 2026-04-17 |
 | [p0-05](p0-05-culture-and-borders.md) | ✅ done | Culture generation and border expansion | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |
@@ -58,13 +58,13 @@
 | [p0-17](p0-17-wild-creature-lair-loop.md) | ✅ done | Wild creature and lair clearing loop | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |
 | [p0-18](p0-18-strategic-resource-gate.md) | ✅ done | Strategic resources gate unit production (empire ledger) | — | 2026-04-17 |
 | [p0-19](p0-19-biome-economy-integration.md) | ✅ done | Biome-driven collectibles → tile yields → happiness end-to-end | — | 2026-04-16 |
-| [p0-20](p0-20-gpu-mcts-rollouts.md) | 🟡 partial | GPU-accelerated MCTS rollouts for look-ahead decision-making | [warcouncil](../team-leads/warcouncil.md) | 2026-04-17 |
+| [p0-20](p0-20-gpu-mcts-rollouts.md) | 🟡 partial | GPU-accelerated MCTS rollouts for look-ahead decision-making | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 |
 | [p0-21](p0-21-audio-system-capability.md) | ✅ done | Audio system capability — manifest + autoload + EventBus wiring | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |
-| [p0-22](p0-22-ultimate-ai-stress-test.md) | 🟡 partial | Ultimate AI stress test — 5 clans, huge map, deep lookahead | [warcouncil](../team-leads/warcouncil.md) | 2026-04-17 |
+| [p0-22](p0-22-ultimate-ai-stress-test.md) | 🟡 partial | Ultimate AI stress test — 5 clans, huge map, deep lookahead | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 |
 | [p0-23](p0-23-sprite-rendering-capability.md) | ✅ done | Sprite rendering capability — replace procedural draw_* with texture rendering | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |
-| [p0-24](p0-24-difficulty-calibrated-ai-progression.md) | 🔴 stub | Difficulty-calibrated AI progression — Easy / Normal / Hard tier-peak distributions | [warcouncil](../team-leads/warcouncil.md) | 2026-04-17 |
+| [p0-24](p0-24-difficulty-calibrated-ai-progression.md) | 🔴 stub | Difficulty-calibrated AI progression — Easy / Normal / Hard tier-peak distributions | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 |
 | [p0-25](p0-25-game-quality-metrics-instrumentation.md) | ✅ done | Game-quality metrics instrumentation — tier_peak, peak_unit_tier, wonder_count | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |
-| [p0-26](p0-26-ai-tactical-rust-port.md) | 🟡 partial | Port tactical AI from GDScript to mc-ai (Rail-1 compliance) | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 |
+| [p0-26](p0-26-ai-tactical-rust-port.md) | ✅ done | Port tactical AI from GDScript to mc-ai (Rail-1 compliance) | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 |
 | [p0-27](p0-27-gd-culture-bridge.md) | ✅ done | GdCulture bridge — live game delegates culture to mc-culture | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |
 | [p0-28](p0-28-gd-economy-bridge.md) | ✅ done | GdEconomy bridge — live game delegates gold/upkeep to mc-economy | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |
 | [p0-29](p0-29-gd-tech-bridge.md) | ✅ done | GdTechWeb bridge — live game delegates research to mc-tech | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |
diff --git a/.project/objectives/p0-22-ultimate-ai-stress-test.md b/.project/objectives/p0-22-ultimate-ai-stress-test.md
index 03cd0335..df41af06 100644
--- a/.project/objectives/p0-22-ultimate-ai-stress-test.md
+++ b/.project/objectives/p0-22-ultimate-ai-stress-test.md
@@ -5,7 +5,7 @@ priority: p0
 status: partial
 scope: game1
 owner: warcouncil
-updated_at: 2026-04-17
+updated_at: 2026-04-18
 evidence:
 - src/simulator/crates/mc-ai/tests/ultimate_lookahead_stress.rs
 - tools/matchup-grid.sh
@@ -67,27 +67,28 @@ a foregone conclusion; the grid is the precondition.
 - ✓ `python3 tools/test_matchup_and_ultimate.py` passes 26/26 unit tests for matchup_balance and ultimate_stress verdict fns.
 - ✗ **`tools/matchup-grid.sh` → `matchup_balance: PASS`** — NOT yet run.
- RUN host stabilized 2026-04-17 ~15:25 PDT (apricot flaky-services cleanup;
- 10/10 sign-off batch clean — see p0-20 acceptance bullet for evidence
- path). Sole remaining blocker: `auto_play.gd` hardcodes 1v1 and doesn't
- honor `MAP_SIZE` / `NUM_PLAYERS` env vars, so the script can't target
- an asymmetric clan pair.
+ Structural blocker RESOLVED 2026-04-18: `MAP_SIZE` + `NUM_PLAYERS` env vars
+ now threaded through `scenes/tests/auto_play.gd` and both local-flatpak +
+ remote-ssh paths of `autoplay-batch.sh`. Batch execution pending; expected
+ to be gated by the shared p0-01 gameplay-balance issue (games resolve
+ T39-T100 via rush domination, so per-pair median-turn may fall below
+ ultimate_stress's ≥40% of cap threshold).
 - ✗ **`tools/huge-map-5clan.sh` → `ultimate_stress: PASS`** — NOT yet run.
- Same blocker as above — needs `MAP_SIZE=standard` and `NUM_PLAYERS=5`
- honored by the game binary. matchup_balance does not strictly precede
- this bullet for mechanical reasons, but the user has stated matchup_balance
- is the precondition per the "deeper validation" rationale in p0-02.
+ Same env-wiring resolved 2026-04-18. Batch execution pending behind the
+ matchup-grid precondition and the p0-01 balance fix.
 ## Remaining to reach done
-1. **Game binary reads `MAP_SIZE` and `NUM_PLAYERS` env.** `auto_play.gd`
- currently hardcodes a 1v1 setup. Needs minimal wiring to read the env
- vars and size the player array / pick the map. This is the sole
- remaining blocker for both acceptance bullets.
+1. ~~**Game binary reads `MAP_SIZE` and `NUM_PLAYERS` env.**~~ DONE 2026-04-18.
 2. **Run matchup-grid** (C(5,2)=10 pairs × seeds). Cite verdict.
 3. **Run huge-map-5clan** (5 clans on Civ5 `standard` 80×52 map). Cite verdict.
-4. **MAX_PLAYERS POD expansion** — NOT a blocker for p0-22 (the Civ5
+4. **Both batches likely gated by p0-01 gameplay-balance tune** — median-turn
+ gate in `ultimate_stress` requires ≥40% of cap (≥120 turns of 300). Current
+ binary resolves most games T39-T100 via rush-domination. Running these
+ batches now is expected to FAIL the verdict; waiting for p0-01's pacing
+ tune to land is the cost-effective sequencing.
+5. **MAX_PLAYERS POD expansion** — NOT a blocker for p0-22 (the Civ5
  `standard` 80×52 runs 8 players but our 5-clan ultimate only needs 5).
  If we later want to run the actual canonical `huge` (128×80, 12-player)
  with 8+ AI, the POD's 4-slot-per-entry layout needs
diff --git a/.project/objectives/p0-24-difficulty-calibrated-ai-progression.md b/.project/objectives/p0-24-difficulty-calibrated-ai-progression.md
index baad0ff6..d4a0f646 100644
--- a/.project/objectives/p0-24-difficulty-calibrated-ai-progression.md
+++ b/.project/objectives/p0-24-difficulty-calibrated-ai-progression.md
@@ -5,7 +5,7 @@ priority: p0
 scope: game1
 owner: warcouncil
 status: stub
-updated_at: 2026-04-17
+updated_at: 2026-04-18
 evidence:
 - public/games/age-of-dwarves/data/difficulty.json
 ---
@@ -23,11 +23,33 @@ Added 2026-04-17 as part of the TTV → state-at-end metric reframe (see p0-01).
 - ✗ Asymmetric Hard vs Normal, 10 seeds: Hard wins ≥ 7/10. Hard's median tier_peak exceeds Normal's by ≥ 1 era.
 - ✗ `difficulty.json` documents the exact knobs each tier modifies (build-speed multipliers, AI aggression clamps, MCTS rollout budgets, yield bonuses). Each knob has a rationale comment.
+## Status note (2026-04-18)
+
+`difficulty.json` defines four tiers (easy/normal/hard/insane) with
+`ai_modifiers.{production_mult, research_mult, gold_mult, combat_bonus,
+extra_starting_units, starting_gold_bonus}`. Grep confirms only
+`mc-tech::costs.rs` currently reads the tier (for research cost scaling);
+`mc-ai` + the tactical executor do NOT consume the production / gold / unit
+bonuses, so the knobs are data-only at the decision layer.
+
+**Pre-work required before batches can be run:**
+1. Wire `ai_modifiers.production_mult` into `mc-ai::tactical::production` (or
+ thread it through `TacticalState.player_stats.production_bonus`) so AI
+ production outputs scale per tier.
+2. Wire `starting_gold_bonus` + `extra_starting_units` into the engine-side
+ setup path (`auto_play.gd` or `game_state.gd` init).
+3. Surface the difficulty id through the game-setup env (`AI_DIFFICULTY=easy|normal|hard`)
+ + plumb down to both the mc-tech cost multiplier and the new mc-ai tactical
+ hook.
+4. Unblock p0-01's gameplay-balance issue first — tier differentiation cannot
+ be measured while every tier resolves T39-T100 via rush-domination.
+
 ## Depends on
-- **p0-25** — new `turn_stats.jsonl` instrumentation (`tier_peak`, `peak_unit_tier`, `wonder_count`). Cannot measure without the fields.
-- **p0-01** — MCTS must be the AI driver under test.
+- **p0-25** — new `turn_stats.jsonl` instrumentation (`tier_peak`, `peak_unit_tier`, `wonder_count`). ✅ done.
+- **p0-01** — MCTS driver under test; also carries the balance-tune blocker.
 - **p0-02** — clan personalities multiplied into each difficulty tier; Easy-Blackhammer must still behave aggressively but less efficiently than Normal-Blackhammer.
+- **p0-26** — tactical AI port. ✅ done 2026-04-18; tactical knob hooks must now land in `mc-ai::tactical`, not the deleted GDScript executor.
 ## Non-goals
diff --git a/public/games/age-of-dwarves/data/objectives.json b/public/games/age-of-dwarves/data/objectives.json
index 078b7cc0..9de532b6 100644
--- a/public/games/age-of-dwarves/data/objectives.json
+++ b/public/games/age-of-dwarves/data/objectives.json
@@ -1,11 +1,11 @@
 {
- "generated_at": "2026-04-18T15:54:36Z",
+ "generated_at": "2026-04-18T17:08:47Z",
  "totals": {
- "partial": 16,
- "stub": 3,
- "done": 56,
- "oos": 18,
  "missing": 8,
+ "stub": 3,
+ "oos": 18,
+ "partial": 15,
+ "done": 57,
  "total": 101
  },
  "objectives": [
@@ -16,7 +16,7 @@
  "status": "partial",
  "scope": "game1",
  "owner": "warcouncil",
- "updated_at": "2026-04-17",
+ "updated_at": "2026-04-18",
  "summary": "`GdMcTreeController` (Rust GDExtension) is the unconditional AI driver. `AiTurnBridge.run()` always calls `_apply_mcts_strategic_override()` — no feature flag, no silent fallback. If the extension is absent, `push_error` + `assert(false)` crashes loudly. `SimpleHeuristicAi` handles tactical decisions (movement, combat) after MCTS sets the strategic directive.\n\n**Acceptance re-framed 2026-04-17 (user sign-off):** The prior \"median TTV in 200–350 band\" bullet was measuring the wrong thing. Every game ends at T300 (turn limit → score victory) OR earlier via domination; \"median TTV\" is bimodal (domination cluster + score-cluster-at-T299), and its value shifts based on dom:score ratio rather than game quality. Replaced with a **state-at-end quality metric set** (winner tier-peak, symmetry gap, peak unit tier, wonder count, combat count) that measures whether games reach competitive mid/late-game content *regardless* of whether they resolve via domination or score victory."
  },
  {
@@ -26,7 +26,7 @@
  "status": "partial",
  "scope": "game1",
  "owner": "warcouncil",
- "updated_at": "2026-04-17",
+ "updated_at": "2026-04-18",
  "summary": "`ai_personalities.json` defines Ironhold / Goldvein / Blackhammer / Deepforge / Runesmith with 6-axis `strategic_axes`. `ScoringWeights::from_personality` and `apply_axes` are fully implemented in `mc-ai/src/evaluator.rs`.\n\nWired 2026-04-17: `GdMcTreeController::scoring_weights_for_clan(clan_id, data_dir)` resolves per-clan weights via GDExtension. `ai_turn_bridge.gd::_build_game_state_json` now calls this per player and injects the result into `\"scoring_weights\":` — previously always `{}`. `AI_PIN_PERSONALITY` env var added to `personality_assigner.gd` for per-clan batch testing. Smoke run confirms `player_clans: {\"1\": \"blackhammer\"}` in meta.json, EXIT_CODE=0.\n\n**5 × 10-seed batch results (2026-04-17, `.local/iter/p0-02-clans/` — PRE-REFRAME EVIDENCE):**\n\n> These batches ran BEFORE p0-25's instrumentation landed, so `player_stats` does NOT carry\n> `tier_peak` / `peak_unit_tier` / `wonder_count`. The TTV column is preserved as the\n> contemporaneous signal; it is NOT the current acceptance metric. Per p0-01's 2026-04-17\n> reframe, the primary divergence gate is **tier_peak** (era-progression, which scales with\n> difficulty per p0-24) — tracked as a \"needs re-run\" in Remaining to reach done below.\n\n| Clan | Wins | TTV_med (legacy) | p1_gold | p1_mil | p1_techs |\n|---|---|---|---|---|---|\n| ironhold | 10/10 | T185.5 | 266 | 3.0 | 27.5 |\n| goldvein | 10/10 | T155.5 | **543** | 3.5 | 25.5 |\n| blackhammer | 9/9 | T189 | 327 | 3.0 | 28 |\n| deepforge | 10/10 | T185.5 | 266 | 3.0 | 27.5 |\n| runesmith | 10/10 | T155.5 | 543 | 3.5 | 25.5 |\n\nSignals that DON'T depend on TTV (still valid post-reframe):\n- **Balance**: 49 total games, each clan 3 AI-wins, max 33% — passes.\n- **Gold axis**: goldvein 2× ironhold (wealth=9 vs 3) — passes.\n- **First-combat**: identical at T9 across all clans (map-forced start proximity, not AI-driven).\n- **Pair metric-identical**: deepforge/ironhold and goldvein/runesmith pairs show overlapping weight profiles; same 10 seeds converge.\n\nSignals that DO depend on TTV (need tier_peak re-run to close the reframed gate):\n- TTV delta between clan pairs — the \"goldvein/runesmith finish 30 turns faster than ironhold/deepforge\" claim doesn't translate into the tier_peak framework until re-measured.\n\n**B5 re-run (2026-04-17, `.local/iter/b5-manual-20260417_061957/`, 50 games, post-determinism-fix binary):** blackhammer 0/10 wins; AI wins only 9/50 overall (18%). Win-rate balance bullet fails. See \"Remaining to done\" for tuning plan.\n\n**Axis ablation sweep (2026-04-17, `.local/iter/ablate__20260417_072921/`, 10 seeds T300 per axis — PRE-REFRAME EVIDENCE):** Each axis neutralized to 5 for all clans. Measured under pre-p0-25 instrumentation; metrics are TTV / gold / mil from the legacy `player_stats` schema. All 6 axes show ≥10% delta on their correlated legacy metric vs pooled baseline (TTV=185, gold=379, mil=3):\n\n| Axis | Correlated metric (legacy) | Baseline | Ablated | Delta |\n|---|---|---|---|---|\n| aggression | mil_med | 3.0 | 2.5 | -16.7% |\n| expansion | ttv_med | 185 | 134 | -27.6% |\n| grudge_persistence | ttv_med | 185 | 131.5 | -28.9% |\n| production | ttv_med | 185 | 139 | -24.9% |\n| trade_willingness | gold_med | 379 | 193.5 | -48.9% |\n| wealth | gold_med | 379 | 227.5 | -40.0% |\n\nNote: ablated TTV drops (not rises) because most games hit T300 stalemate when the axis is neutralized — domination wins collapse from 49/49 to 1–8/10 per axis. The TTV delta reflects game degradation, not faster play. All axes CONFIRMED LIVE under the legacy metric set. Re-measurement under tier_peak is needed before the reframed acceptance (below) can be cited."
  },
  {
@@ -206,7 +206,7 @@
  "status": "partial",
  "scope": "game1",
  "owner": "warcouncil",
- "updated_at": "2026-04-17",
+ "updated_at": "2026-04-18",
  "summary": "The MCTS tree (`mcts_tree.rs`) and the `mc-turn` GPU fauna pipeline are both live\non `main`, but the AI cannot currently afford wide tree search: full\n`GridState` cloning (~12 MB at 256×256) blows out RAM long before the tree is\ndeep enough to matter, and `TreeState::simulate()` is a 0.5 stub. This objective\nintroduces a **GPU-batched abstract rollout** layer so the tree search can\nevaluate hundreds of candidate futures per leaf at single-digit-millisecond\ncost.\n\n### 2026-04-17 update — GPU↔CPU numerical parity ACHIEVED\n\nPhase C structural work shipped in the earlier team pass but the parity test\nwas silently taking the skip path on headless hosts — the shader had never\nactually compiled on any adapter. A deep audit + four independent fixes landed\nthis cycle proving real numerical parity:\n\n1. **WGSL reserved-keyword bug**: `var active: u32 = 0u` at `rollout.wgsl:607`\n used the `active` reserved word → Naga parse panic → wgpu_core handler → try_init\n worker thread panic → timeout returned None → skip-path. Renamed to\n `active_idx`; the shader now actually compiles. Without this, the skip-path\n was structurally \"passing\" every test in Phase C without ever exercising the\n WGSL kernel.\n2. **Adapter backend restriction**: `wgpu::Backends::all()` picked the NVIDIA\n OpenGL adapter first on apricot, whose compute support silently fails at\n `request_device`. Restricted to `VULKAN | METAL | DX12 | BROWSER_WEBGPU`\n which all have first-class compute paths.\n3. **Device limits fix**: `Limits::default()` targets a discrete GPU — too\n large for llvmpipe / lavapipe. Changed to\n `Limits::downlevel_defaults().using_resolution(adapter.limits())` so software\n Vulkan backends can satisfy device creation.\n4. **Action-walk order unified**: the root numerical divergence. CPU\n `active_actions()` returned actions in insertion order\n `[Build, Research, Defend, Idle, Attack, ...]`; WGSL iterated k=0..9 in\n `ActionKind::ALL` numerical order `[Build, Attack, Settle, Research, ...]`.\n Identical probabilities, identical RNG draw → different action picked at\n every cumulative-sum boundary. Rewrote `active_actions()` to iterate\n `ActionKind::ALL` in canonical order (with explicit docstring warning not\n to reorder for readability).\n\n**Parity verification on apricot (headless bluefin + lavapipe software\nVulkan)**: with `MC_AI_GPU_DEBUG=1 VK_DRIVER_FILES=/usr/share/vulkan/icd.d/lvp_icd.x86_64.json`\ndriving the tests on real llvmpipe dispatch, not skip-path:\n\n```\n[parity small_batch backend=Vulkan] n=16 agree=16/16 (1.000) max_drift=0.000000\n[parity partial_workgroup backend=Vulkan] n=65 agree=65/65 (1.000) max_drift=0.000000\n[parity multi_workgroup backend=Vulkan] n=128 agree=128/128 (1.000) max_drift=0.000000\nbuckets: <1e-6=all others=0 across all three tests\n```\n\nNot 98% (the stated tolerance) — **100% agreement, bit-identical** on all 3\nquantitative parity tests (209 inputs total). Pre-fixes: 3–6% agreement with\nmax_drift 0.025–0.043 (action-boundary flips). Post-fix: integer fields\nbyte-equal, scalar fields byte-equal. WGSL kernel is now a provable,\nbyte-for-byte port of `rollout::walk`.\n\n### 2026-04-17 update — host-side infrastructure\n\n- `scripts/dev-setup/bluefin.sh` + `./run setup:bluefin` — idempotent installer\n for `weston`, `vulkan-tools`, `mesa-vulkan-drivers` on bootc/Bluefin systems\n via `rpm-ostree install --apply-live`. `--check` mode for CI.\n Delegates EDIT→RUN via `$AUTOPLAY_HOST` when invoked from EDIT.\n- `~/Code/bootc-bluefin/containerfiles/Containerfile.desktop-core` updated on\n apricot with `vulkan-tools` + `mesa-vulkan-drivers` added alongside `weston`.\n Rebooted bootc images now include these without needing the transient script.\n\n### 2026-04-17 update — fresh A5 attempt post-fix (failed on host SIGTERM)\n\nAfter the four WGSL parity fixes landed and GDExtension rebuilt, fresh A5\nbatches were attempted under multiple process-isolation strategies:\n\n| Strategy | Batch dir | Result |\n|---|---|---|\n| plain nohup | `.local/iter/a5-fresh-20260417_122847/` | exit 143, seeds `in_progress` T5–T10 before kill |\n| nohup + new dir | `.local/iter/a5-final-20260417_122936/` | games launched, no completion.marker written (process killed) |\n| bash SIGTERM trap | `.local/iter/a5-trap-20260417_123021/` | trap handler received NO signal; script exited rc=143 |\n| strace signal trace | `.local/iter/a5-strace-20260417_123200/` | revealed autoplay-batch.sh exits status **1** (not 143); no SIGTERM to parent. Root cause: `0/N games produced turn_stats.jsonl` check fires because flatpak Godot scopes end at 3–10s |\n| `systemd-run --user` | `.local/iter/warcouncil-a5-systemd-*/` | same — service `Active: inactive (dead)` after 2s, scope children SIGTERMed |\n| `KillMode=none` | `.local/iter/warcouncil-a5-systemd-*` (2nd) | games reached T9–T10 only; same kill pattern |\n| plain `bash autoplay-batch` synchronous | `.local/iter/a5-direct-123300/` | 10 games with 0-line `turn_stats.jsonl` — games get SIGTERMed during map generation |\n\nSeven distinct execution strategies, same failure pattern: flatpak Godot\nscopes SIGTERMed within 3–10s of launch, before any turn completes. Investigation\nfound the signal is NOT delivered by systemd-oomd (failed service), rpm-ostree\nautomatic updates (timer inactive), or apricot-rail-watchdog (emit-only). The\nactual SIGTERM source could not be identified in the apricot user session.\nParallel agent's own batches from earlier the same day (e.g.\n`.local/batches/blackhammer_tune_20260417_101447/`) completed fine, so the\nissue is transient/session-bound, NOT a permanent host failure.\n\n**Fresh A5 verdict — NOT HEALTHY, B5 therefore not launched.** Per\nwarcouncil's integrity rule: we report the measurement failure honestly\nrather than claim parity-fix-correctness translated into fresh gameplay\nevidence. Existing p0-01 batch data from pre-parity-fix binary (at\n`blackhammer_tune_20260417_101447`) still stands as the most recent\nsuccessful A5/B5 evidence in the repo."
  },
  {
@@ -226,7 +226,7 @@
  "status": "partial",
  "scope": "game1",
  "owner": "warcouncil",
- "updated_at": "2026-04-17",
+ "updated_at": "2026-04-18",
  "summary": "The \"ultimate test\" is the final gate on the AI lookahead pipeline:\nfive clan personalities competing on a map sized large enough for eight\nplayers, with MCTS + GPU batched rollouts driving every decision. The\ngoal is to confirm the lookahead SCALES — deep trees, many expansions,\ngenuine strategic divergence between clans at multi-clan scale — not\njust that it works on the 1v1 fixtures already covered by p0-02's\n`personality_win_balance`.\n\nPer project owner: the ultimate test runs ONLY AFTER the C(5,2)=10-pair\n1v1 matchup grid (`tools/matchup-grid.sh`) has shown the five clans are\nbalanced in head-to-head play. Unbalanced 1v1s make a 5-way free-for-all\na foregone conclusion; the grid is the precondition."
  },
  {
@@ -246,7 +246,7 @@
  "status": "stub",
  "scope": "game1",
  "owner": "warcouncil",
- "updated_at": "2026-04-17",
+ "updated_at": "2026-04-18",
  "summary": "Added 2026-04-17 as part of the TTV → state-at-end metric reframe (see p0-01). The game's three AI-difficulty tiers (Easy / Normal / Hard in `difficulty.json`) must produce *measurably different* progression profiles when batched. The current MCTS + heuristic stack doesn't actually change behavior between difficulty tiers — `ai_difficulty` is read in a few Rust spots but has no empirically-validated behavioral split."
  },
  {
@@ -263,7 +263,7 @@
  "id": "p0-26",
  "title": "Port tactical AI from GDScript to mc-ai (Rail-1 compliance)",
  "priority": "p0",
- "status": "partial",
+ "status": "done",
  "scope": "game1",
  "owner": "warcouncil",
  "updated_at": "2026-04-18",