diff --git a/.project/objectives/README.md b/.project/objectives/README.md
index a6263a74..021d8eeb 100644
--- a/.project/objectives/README.md
+++ b/.project/objectives/README.md
@@ -14,11 +14,11 @@
| Priority | ✅ | 🟡 | 🔴 | ❌ | ⚫ | Total |
|---|---|---|---|---|---|---|
-| **P0** | 27 | 7 | 1 | 0 | 0 | 35 |
+| **P0** | 28 | 6 | 1 | 0 | 0 | 35 |
| **P1** | 15 | 4 | 2 | 0 | 1 | 22 |
| **P2** | 14 | 5 | 0 | 8 | 0 | 27 |
| **P3 (oos)** | 0 | 0 | 0 | 0 | 17 | 17 |
-| **total** | **56** | **16** | **3** | **8** | **18** | **101** |
+| **total** | **57** | **15** | **3** | **8** | **18** | **101** |
@@ -27,8 +27,8 @@
| Team Lead | Remaining |
|---|---|
| [asset-sprite](../team-leads/asset-sprite.md) | 7 |
-| [warcouncil](../team-leads/warcouncil.md) | 6 |
| [wireguard](../team-leads/wireguard.md) | 6 |
+| [warcouncil](../team-leads/warcouncil.md) | 5 |
| [shipwright](../team-leads/shipwright.md) | 2 |
| [testwright](../team-leads/testwright.md) | 2 |
| [asset-audio](../team-leads/asset-audio.md) | 1 |
@@ -39,8 +39,8 @@
| ID | Status | Title | Owner | Updated |
|---|---|---|---|---|
-| [p0-01](p0-01-mcts-wiring.md) | 🟡 partial | Wire MCTS into gameplay AI | [warcouncil](../team-leads/warcouncil.md) | 2026-04-17 |
-| [p0-02](p0-02-clan-personalities.md) | 🟡 partial | Five AI clan personalities drive distinct playstyles | [warcouncil](../team-leads/warcouncil.md) | 2026-04-17 |
+| [p0-01](p0-01-mcts-wiring.md) | 🟡 partial | Wire MCTS into gameplay AI | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 |
+| [p0-02](p0-02-clan-personalities.md) | 🟡 partial | Five AI clan personalities drive distinct playstyles | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 |
| [p0-03](p0-03-pvp-in-turn.md) | ✅ done | PvP combat resolved inside the authoritative turn processor | - | 2026-04-17 |
| [p0-04](p0-04-wonder-tracking.md) | ✅ done | World wonder tracking in PlayerState and score victory | - | 2026-04-17 |
| [p0-05](p0-05-culture-and-borders.md) | ✅ done | Culture generation and border expansion | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |
@@ -58,13 +58,13 @@
| [p0-17](p0-17-wild-creature-lair-loop.md) | ✅ done | Wild creature and lair clearing loop | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |
| [p0-18](p0-18-strategic-resource-gate.md) | ✅ done | Strategic resources gate unit production (empire ledger) | - | 2026-04-17 |
| [p0-19](p0-19-biome-economy-integration.md) | ✅ done | Biome-driven collectibles → tile yields → happiness end-to-end | - | 2026-04-16 |
-| [p0-20](p0-20-gpu-mcts-rollouts.md) | 🟡 partial | GPU-accelerated MCTS rollouts for look-ahead decision-making | [warcouncil](../team-leads/warcouncil.md) | 2026-04-17 |
+| [p0-20](p0-20-gpu-mcts-rollouts.md) | 🟡 partial | GPU-accelerated MCTS rollouts for look-ahead decision-making | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 |
| [p0-21](p0-21-audio-system-capability.md) | ✅ done | Audio system capability - manifest + autoload + EventBus wiring | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |
-| [p0-22](p0-22-ultimate-ai-stress-test.md) | 🟡 partial | Ultimate AI stress test - 5 clans, huge map, deep lookahead | [warcouncil](../team-leads/warcouncil.md) | 2026-04-17 |
+| [p0-22](p0-22-ultimate-ai-stress-test.md) | 🟡 partial | Ultimate AI stress test - 5 clans, huge map, deep lookahead | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 |
| [p0-23](p0-23-sprite-rendering-capability.md) | ✅ done | Sprite rendering capability - replace procedural draw_* with texture rendering | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |
-| [p0-24](p0-24-difficulty-calibrated-ai-progression.md) | 🔴 stub | Difficulty-calibrated AI progression - Easy / Normal / Hard tier-peak distributions | [warcouncil](../team-leads/warcouncil.md) | 2026-04-17 |
+| [p0-24](p0-24-difficulty-calibrated-ai-progression.md) | 🔴 stub | Difficulty-calibrated AI progression - Easy / Normal / Hard tier-peak distributions | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 |
| [p0-25](p0-25-game-quality-metrics-instrumentation.md) | ✅ done | Game-quality metrics instrumentation - tier_peak, peak_unit_tier, wonder_count | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |
-| [p0-26](p0-26-ai-tactical-rust-port.md) | 🟡 partial | Port tactical AI from GDScript to mc-ai (Rail-1 compliance) | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 |
+| [p0-26](p0-26-ai-tactical-rust-port.md) | ✅ done | Port tactical AI from GDScript to mc-ai (Rail-1 compliance) | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 |
| [p0-27](p0-27-gd-culture-bridge.md) | ✅ done | GdCulture bridge - live game delegates culture to mc-culture | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |
| [p0-28](p0-28-gd-economy-bridge.md) | ✅ done | GdEconomy bridge - live game delegates gold/upkeep to mc-economy | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |
| [p0-29](p0-29-gd-tech-bridge.md) | ✅ done | GdTechWeb bridge - live game delegates research to mc-tech | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |
diff --git a/.project/objectives/p0-22-ultimate-ai-stress-test.md b/.project/objectives/p0-22-ultimate-ai-stress-test.md
index 03cd0335..df41af06 100644
--- a/.project/objectives/p0-22-ultimate-ai-stress-test.md
+++ b/.project/objectives/p0-22-ultimate-ai-stress-test.md
@@ -5,7 +5,7 @@ priority: p0
status: partial
scope: game1
owner: warcouncil
-updated_at: 2026-04-17
+updated_at: 2026-04-18
evidence:
- src/simulator/crates/mc-ai/tests/ultimate_lookahead_stress.rs
- tools/matchup-grid.sh
@@ -67,27 +67,28 @@ a foregone conclusion; the grid is the precondition.
- β `python3 tools/test_matchup_and_ultimate.py` passes 26/26
unit tests for matchup_balance and ultimate_stress verdict fns.
- β **`tools/matchup-grid.sh` β `matchup_balance: PASS`** β NOT yet run.
- RUN host stabilized 2026-04-17 ~15:25 PDT (apricot flaky-services cleanup;
- 10/10 sign-off batch clean β see p0-20 acceptance bullet for evidence
- path). Sole remaining blocker: `auto_play.gd` hardcodes 1v1 and doesn't
- honor `MAP_SIZE` / `NUM_PLAYERS` env vars, so the script can't target
- an asymmetric clan pair.
+ Structural blocker RESOLVED 2026-04-18: `MAP_SIZE` + `NUM_PLAYERS` env vars
+ now threaded through `scenes/tests/auto_play.gd` and both local-flatpak +
+ remote-ssh paths of `autoplay-batch.sh`. Batch execution pending; expected
+ to be gated by the shared p0-01 gameplay-balance issue (games resolve
+ T39-T100 via rush domination, so per-pair median-turn may fall below
+ ultimate_stress's ≥40% of cap threshold).
- β **`tools/huge-map-5clan.sh` β `ultimate_stress: PASS`** β NOT yet run.
- Same blocker as above β needs `MAP_SIZE=standard` and `NUM_PLAYERS=5`
- honored by the game binary. matchup_balance does not strictly precede
- this bullet for mechanical reasons, but the user has stated matchup_balance
- is the precondition per the "deeper validation" rationale in p0-02.
+ Same env-wiring resolved 2026-04-18. Batch execution pending behind the
+ matchup-grid precondition and the p0-01 balance fix.
## Remaining to reach done
-1. **Game binary reads `MAP_SIZE` and `NUM_PLAYERS` env.** `auto_play.gd`
- currently hardcodes a 1v1 setup. Needs minimal wiring to read the env
- vars and size the player array / pick the map. This is the sole
- remaining blocker for both acceptance bullets.
+1. ~~**Game binary reads `MAP_SIZE` and `NUM_PLAYERS` env.**~~ DONE 2026-04-18.
2. **Run matchup-grid** (C(5,2)=10 pairs × seeds). Cite verdict.
3. **Run huge-map-5clan** (5 clans on Civ5 `standard` 80×52 map).
Cite verdict.
-4. **MAX_PLAYERS POD expansion** - NOT a blocker for p0-22 (the Civ5
+4. **Both batches likely gated by p0-01 gameplay-balance tune** - median-turn
+ gate in `ultimate_stress` requires ≥40% of cap (≥120 turns of 300). Current
+ binary resolves most games T39-T100 via rush-domination. Running these
+ batches now is expected to FAIL the verdict; waiting for p0-01's pacing
+ tune to land is the cost-effective sequencing.
+5. **MAX_PLAYERS POD expansion** - NOT a blocker for p0-22 (the Civ5
`standard` 80×52 runs 8 players but our 5-clan ultimate only needs
5). If we later want to run the actual canonical `huge` (128×80,
12-player) with 8+ AI, the POD's 4-slot-per-entry layout needs
diff --git a/.project/objectives/p0-24-difficulty-calibrated-ai-progression.md b/.project/objectives/p0-24-difficulty-calibrated-ai-progression.md
index baad0ff6..d4a0f646 100644
--- a/.project/objectives/p0-24-difficulty-calibrated-ai-progression.md
+++ b/.project/objectives/p0-24-difficulty-calibrated-ai-progression.md
@@ -5,7 +5,7 @@ priority: p0
scope: game1
owner: warcouncil
status: stub
-updated_at: 2026-04-17
+updated_at: 2026-04-18
evidence:
- public/games/age-of-dwarves/data/difficulty.json
---
@@ -23,11 +23,33 @@ Added 2026-04-17 as part of the TTV → state-at-end metric reframe (see p0-01).
- β Asymmetric Hard vs Normal, 10 seeds: Hard wins ≥ 7/10. Hard's median tier_peak exceeds Normal's by ≥ 1 era.
- β `difficulty.json` documents the exact knobs each tier modifies (build-speed multipliers, AI aggression clamps, MCTS rollout budgets, yield bonuses). Each knob has a rationale comment.
+## Status note (2026-04-18)
+
+`difficulty.json` defines four tiers (easy/normal/hard/insane) with
+`ai_modifiers.{production_mult, research_mult, gold_mult, combat_bonus,
+extra_starting_units, starting_gold_bonus}`. Grep confirms only
+`mc-tech::costs.rs` currently reads the tier (for research cost scaling);
+`mc-ai` + the tactical executor do NOT consume the production / gold / unit
+bonuses, so the knobs are data-only at the decision layer.
+
+**Pre-work required before batches can be run:**
+1. Wire `ai_modifiers.production_mult` into `mc-ai::tactical::production` (or
+ thread it through `TacticalState.player_stats.production_bonus`) so AI
+ production outputs scale per tier.
+2. Wire `starting_gold_bonus` + `extra_starting_units` into the engine-side
+ setup path (`auto_play.gd` or `game_state.gd` init).
+3. Surface the difficulty id through the game-setup env (`AI_DIFFICULTY=easy|normal|hard`)
+ + plumb down to both the mc-tech cost multiplier and the new mc-ai tactical
+ hook.
+4. Unblock p0-01's gameplay-balance issue first - tier differentiation cannot
+ be measured while every tier resolves T39-T100 via rush-domination.
+
## Depends on
-- **p0-25** - new `turn_stats.jsonl` instrumentation (`tier_peak`, `peak_unit_tier`, `wonder_count`). Cannot measure without the fields.
-- **p0-01** - MCTS must be the AI driver under test.
+- **p0-25** - new `turn_stats.jsonl` instrumentation (`tier_peak`, `peak_unit_tier`, `wonder_count`). ✅ done.
+- **p0-01** - MCTS driver under test; also carries the balance-tune blocker.
- **p0-02** - clan personalities multiplied into each difficulty tier; Easy-Blackhammer must still behave aggressively but less efficiently than Normal-Blackhammer.
+- **p0-26** - tactical AI port. ✅ done 2026-04-18; tactical knob hooks must now land in `mc-ai::tactical`, not the deleted GDScript executor.
## Non-goals
diff --git a/public/games/age-of-dwarves/data/objectives.json b/public/games/age-of-dwarves/data/objectives.json
index 078b7cc0..9de532b6 100644
--- a/public/games/age-of-dwarves/data/objectives.json
+++ b/public/games/age-of-dwarves/data/objectives.json
@@ -1,11 +1,11 @@
{
- "generated_at": "2026-04-18T15:54:36Z",
+ "generated_at": "2026-04-18T17:08:47Z",
"totals": {
- "partial": 16,
- "stub": 3,
- "done": 56,
- "oos": 18,
"missing": 8,
+ "stub": 3,
+ "oos": 18,
+ "partial": 15,
+ "done": 57,
"total": 101
},
"objectives": [
@@ -16,7 +16,7 @@
"status": "partial",
"scope": "game1",
"owner": "warcouncil",
- "updated_at": "2026-04-17",
+ "updated_at": "2026-04-18",
"summary": "`GdMcTreeController` (Rust GDExtension) is the unconditional AI driver. `AiTurnBridge.run()` always calls `_apply_mcts_strategic_override()` - no feature flag, no silent fallback. If the extension is absent, `push_error` + `assert(false)` crashes loudly. `SimpleHeuristicAi` handles tactical decisions (movement, combat) after MCTS sets the strategic directive.\n\n**Acceptance re-framed 2026-04-17 (user sign-off):** The prior \"median TTV in 200–350 band\" bullet was measuring the wrong thing. Every game ends at T300 (turn limit → score victory) OR earlier via domination; \"median TTV\" is bimodal (domination cluster + score-cluster-at-T299), and its value shifts based on dom:score ratio rather than game quality. Replaced with a **state-at-end quality metric set** (winner tier-peak, symmetry gap, peak unit tier, wonder count, combat count) that measures whether games reach competitive mid/late-game content *regardless* of whether they resolve via domination or score victory."
},
{
@@ -26,7 +26,7 @@
"status": "partial",
"scope": "game1",
"owner": "warcouncil",
- "updated_at": "2026-04-17",
+ "updated_at": "2026-04-18",
"summary": "`ai_personalities.json` defines Ironhold / Goldvein / Blackhammer / Deepforge / Runesmith with 6-axis `strategic_axes`. `ScoringWeights::from_personality` and `apply_axes` are fully implemented in `mc-ai/src/evaluator.rs`.\n\nWired 2026-04-17: `GdMcTreeController::scoring_weights_for_clan(clan_id, data_dir)` resolves per-clan weights via GDExtension. `ai_turn_bridge.gd::_build_game_state_json` now calls this per player and injects the result into `\"scoring_weights\":` - previously always `{}`. `AI_PIN_PERSONALITY` env var added to `personality_assigner.gd` for per-clan batch testing. Smoke run confirms `player_clans: {\"1\": \"blackhammer\"}` in meta.json, EXIT_CODE=0.\n\n**5 × 10-seed batch results (2026-04-17, `.local/iter/p0-02-clans/` - PRE-REFRAME EVIDENCE):**\n\n> These batches ran BEFORE p0-25's instrumentation landed, so `player_stats` does NOT carry\n> `tier_peak` / `peak_unit_tier` / `wonder_count`. The TTV column is preserved as the\n> contemporaneous signal; it is NOT the current acceptance metric. Per p0-01's 2026-04-17\n> reframe, the primary divergence gate is **tier_peak** (era-progression, which scales with\n> difficulty per p0-24) - tracked as a \"needs re-run\" in Remaining to reach done below.\n\n| Clan | Wins | TTV_med (legacy) | p1_gold | p1_mil | p1_techs |\n|---|---|---|---|---|---|\n| ironhold | 10/10 | T185.5 | 266 | 3.0 | 27.5 |\n| goldvein | 10/10 | T155.5 | **543** | 3.5 | 25.5 |\n| blackhammer | 9/9 | T189 | 327 | 3.0 | 28 |\n| deepforge | 10/10 | T185.5 | 266 | 3.0 | 27.5 |\n| runesmith | 10/10 | T155.5 | 543 | 3.5 | 25.5 |\n\nSignals that DON'T depend on TTV (still valid post-reframe):\n- **Balance**: 49 total games, each clan 3 AI-wins, max 33% → passes.\n- **Gold axis**: goldvein 2× ironhold (wealth=9 vs 3) → passes.\n- **First-combat**: identical at T9 across all clans (map-forced start proximity, not AI-driven).\n- **Pair metric-identical**: deepforge/ironhold and goldvein/runesmith pairs show overlapping weight profiles; same 10 seeds converge.\n\nSignals that DO depend on TTV (need tier_peak re-run to close the reframed gate):\n- TTV delta between clan pairs - the \"goldvein/runesmith finish 30 turns faster than ironhold/deepforge\" claim doesn't translate into the tier_peak framework until re-measured.\n\n**B5 re-run (2026-04-17, `.local/iter/b5-manual-20260417_061957/`, 50 games, post-determinism-fix binary):** blackhammer 0/10 wins; AI wins only 9/50 overall (18%). Win-rate balance bullet fails. See \"Remaining to done\" for tuning plan.\n\n**Axis ablation sweep (2026-04-17, `.local/iter/ablate__20260417_072921/`, 10 seeds T300 per axis - PRE-REFRAME EVIDENCE):** Each axis neutralized to 5 for all clans. Measured under pre-p0-25 instrumentation; metrics are TTV / gold / mil from the legacy `player_stats` schema. All 6 axes show ≥10% delta on their correlated legacy metric vs pooled baseline (TTV=185, gold=379, mil=3):\n\n| Axis | Correlated metric (legacy) | Baseline | Ablated | Delta |\n|---|---|---|---|---|\n| aggression | mil_med | 3.0 | 2.5 | -16.7% |\n| expansion | ttv_med | 185 | 134 | -27.6% |\n| grudge_persistence | ttv_med | 185 | 131.5 | -28.9% |\n| production | ttv_med | 185 | 139 | -24.9% |\n| trade_willingness | gold_med | 379 | 193.5 | -48.9% |\n| wealth | gold_med | 379 | 227.5 | -40.0% |\n\nNote: ablated TTV drops (not rises) because most games hit T300 stalemate when the axis is neutralized - domination wins collapse from 49/49 to 1–8/10 per axis. The TTV delta reflects game degradation, not faster play. All axes CONFIRMED LIVE under the legacy metric set. Re-measurement under tier_peak is needed before the reframed acceptance (below) can be cited."
},
{
@@ -206,7 +206,7 @@
"status": "partial",
"scope": "game1",
"owner": "warcouncil",
- "updated_at": "2026-04-17",
+ "updated_at": "2026-04-18",
"summary": "The MCTS tree (`mcts_tree.rs`) and the `mc-turn` GPU fauna pipeline are both live\non `main`, but the AI cannot currently afford wide tree search: full\n`GridState` cloning (~12 MB at 256×256) blows out RAM long before the tree is\ndeep enough to matter, and `TreeState::simulate()` is a 0.5 stub. This objective\nintroduces a **GPU-batched abstract rollout** layer so the tree search can\nevaluate hundreds of candidate futures per leaf at single-digit-millisecond\ncost.\n\n### 2026-04-17 update - GPU↔CPU numerical parity ACHIEVED\n\nPhase C structural work shipped in the earlier team pass but the parity test\nwas silently taking the skip path on headless hosts - the shader had never\nactually compiled on any adapter. A deep audit + four independent fixes landed\nthis cycle proving real numerical parity:\n\n1. **WGSL reserved-keyword bug**: `var active: u32 = 0u` at `rollout.wgsl:607`\n used the `active` reserved word → Naga parse panic → wgpu_core handler → try_init\n worker thread panic → timeout returned None → skip-path. Renamed to\n `active_idx`; the shader now actually compiles. Without this, the skip-path\n was structurally \"passing\" every test in Phase C without ever exercising the\n WGSL kernel.\n2. **Adapter backend restriction**: `wgpu::Backends::all()` picked the NVIDIA\n OpenGL adapter first on apricot, whose compute support silently fails at\n `request_device`. Restricted to `VULKAN | METAL | DX12 | BROWSER_WEBGPU`\n which all have first-class compute paths.\n3. **Device limits fix**: `Limits::default()` targets a discrete GPU - too\n large for llvmpipe / lavapipe. Changed to\n `Limits::downlevel_defaults().using_resolution(adapter.limits())` so software\n Vulkan backends can satisfy device creation.\n4. **Action-walk order unified**: the root numerical divergence. CPU\n `active_actions()` returned actions in insertion order\n `[Build, Research, Defend, Idle, Attack, ...]`; WGSL iterated k=0..9 in\n `ActionKind::ALL` numerical order `[Build, Attack, Settle, Research, ...]`.\n Identical probabilities, identical RNG draw → different action picked at\n every cumulative-sum boundary. Rewrote `active_actions()` to iterate\n `ActionKind::ALL` in canonical order (with explicit docstring warning not\n to reorder for readability).\n\n**Parity verification on apricot (headless bluefin + lavapipe software\nVulkan)**: with `MC_AI_GPU_DEBUG=1 VK_DRIVER_FILES=/usr/share/vulkan/icd.d/lvp_icd.x86_64.json`\ndriving the tests on real llvmpipe dispatch, not skip-path:\n\n```\n[parity small_batch backend=Vulkan] n=16 agree=16/16 (1.000) max_drift=0.000000\n[parity partial_workgroup backend=Vulkan] n=65 agree=65/65 (1.000) max_drift=0.000000\n[parity multi_workgroup backend=Vulkan] n=128 agree=128/128 (1.000) max_drift=0.000000\nbuckets: <1e-6=all others=0 across all three tests\n```\n\nNot 98% (the stated tolerance) - **100% agreement, bit-identical** on all 3\nquantitative parity tests (209 inputs total). Pre-fixes: 3–6% agreement with\nmax_drift 0.025–0.043 (action-boundary flips). Post-fix: integer fields\nbyte-equal, scalar fields byte-equal. WGSL kernel is now a provable,\nbyte-for-byte port of `rollout::walk`.\n\n### 2026-04-17 update - host-side infrastructure\n\n- `scripts/dev-setup/bluefin.sh` + `./run setup:bluefin` - idempotent installer\n for `weston`, `vulkan-tools`, `mesa-vulkan-drivers` on bootc/Bluefin systems\n via `rpm-ostree install --apply-live`. `--check` mode for CI.\n Delegates EDIT→RUN via `$AUTOPLAY_HOST` when invoked from EDIT.\n- `~/Code/bootc-bluefin/containerfiles/Containerfile.desktop-core` updated on\n apricot with `vulkan-tools` + `mesa-vulkan-drivers` added alongside `weston`.\n Rebooted bootc images now include these without needing the transient script.\n\n### 2026-04-17 update - fresh A5 attempt post-fix (failed on host SIGTERM)\n\nAfter the four WGSL parity fixes landed and GDExtension rebuilt, fresh A5\nbatches were attempted under multiple process-isolation strategies:\n\n| Strategy | Batch dir | Result |\n|---|---|---|\n| plain nohup | `.local/iter/a5-fresh-20260417_122847/` | exit 143, seeds `in_progress` T5–T10 before kill |\n| nohup + new dir | `.local/iter/a5-final-20260417_122936/` | games launched, no completion.marker written (process killed) |\n| bash SIGTERM trap | `.local/iter/a5-trap-20260417_123021/` | trap handler received NO signal; script exited rc=143 |\n| strace signal trace | `.local/iter/a5-strace-20260417_123200/` | revealed autoplay-batch.sh exits status **1** (not 143); no SIGTERM to parent. Root cause: `0/N games produced turn_stats.jsonl` check fires because flatpak Godot scopes end at 3–10s |\n| `systemd-run --user` | `.local/iter/warcouncil-a5-systemd-*/` | same - service `Active: inactive (dead)` after 2s, scope children SIGTERMed |\n| `KillMode=none` | `.local/iter/warcouncil-a5-systemd-*` (2nd) | games reached T9–T10 only; same kill pattern |\n| plain `bash autoplay-batch` synchronous | `.local/iter/a5-direct-123300/` | 10 games with 0-line `turn_stats.jsonl` - games get SIGTERMed during map generation |\n\nSeven distinct execution strategies, same failure pattern: flatpak Godot\nscopes SIGTERMed within 3–10s of launch, before any turn completes. Investigation\nfound the signal is NOT delivered by systemd-oomd (failed service), rpm-ostree\nautomatic updates (timer inactive), or apricot-rail-watchdog (emit-only). The\nactual SIGTERM source could not be identified in the apricot user session.\nParallel agent's own batches from earlier the same day (e.g.\n`.local/batches/blackhammer_tune_20260417_101447/`) completed fine, so the\nissue is transient/session-bound, NOT a permanent host failure.\n\n**Fresh A5 verdict - NOT HEALTHY, B5 therefore not launched.** Per\nwarcouncil's integrity rule: we report the measurement failure honestly\nrather than claim parity-fix-correctness translated into fresh gameplay\nevidence. Existing p0-01 batch data from pre-parity-fix binary (at\n`blackhammer_tune_20260417_101447`) still stands as the most recent\nsuccessful A5/B5 evidence in the repo."
},
{
@@ -226,7 +226,7 @@
"status": "partial",
"scope": "game1",
"owner": "warcouncil",
- "updated_at": "2026-04-17",
+ "updated_at": "2026-04-18",
"summary": "The \"ultimate test\" is the final gate on the AI lookahead pipeline:\nfive clan personalities competing on a map sized large enough for eight\nplayers, with MCTS + GPU batched rollouts driving every decision. The\ngoal is to confirm the lookahead SCALES - deep trees, many expansions,\ngenuine strategic divergence between clans at multi-clan scale - not\njust that it works on the 1v1 fixtures already covered by p0-02's\n`personality_win_balance`.\n\nPer project owner: the ultimate test runs ONLY AFTER the C(5,2)=10-pair\n1v1 matchup grid (`tools/matchup-grid.sh`) has shown the five clans are\nbalanced in head-to-head play. Unbalanced 1v1s make a 5-way free-for-all\na foregone conclusion; the grid is the precondition."
},
{
@@ -246,7 +246,7 @@
"status": "stub",
"scope": "game1",
"owner": "warcouncil",
- "updated_at": "2026-04-17",
+ "updated_at": "2026-04-18",
"summary": "Added 2026-04-17 as part of the TTV → state-at-end metric reframe (see p0-01). The game's three AI-difficulty tiers (Easy / Normal / Hard in `difficulty.json`) must produce *measurably different* progression profiles when batched. The current MCTS + heuristic stack doesn't actually change behavior between difficulty tiers - `ai_difficulty` is read in a few Rust spots but has no empirically-validated behavioral split."
},
{
@@ -263,7 +263,7 @@
"id": "p0-26",
"title": "Port tactical AI from GDScript to mc-ai (Rail-1 compliance)",
"priority": "p0",
- "status": "partial",
+ "status": "done",
"scope": "game1",
"owner": "warcouncil",
"updated_at": "2026-04-18",