fix(@projects/@magic-civilization): 🐛 update stress-test objective date

Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
Natalie 2026-04-19 15:53:32 -07:00
parent fb2f800677
commit d2dd264027
3 changed files with 28 additions and 25 deletions


@@ -40,7 +40,7 @@
| ID | Status | Title | Owner | Updated |
|---|---|---|---|---|
| [p0-01](p0-01-mcts-wiring.md) | 🟡 partial | Wire MCTS into gameplay AI | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 |
| [p0-02](p0-02-clan-personalities.md) | 🟡 partial | Five AI clan personalities drive distinct playstyles | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 |
| [p0-02](p0-02-clan-personalities.md) | 🟡 partial | Five AI clan personalities drive distinct playstyles | [warcouncil](../team-leads/warcouncil.md) | 2026-04-19 |
| [p0-03](p0-03-pvp-in-turn.md) | ✅ done | PvP combat resolved inside the authoritative turn processor | — | 2026-04-17 |
| [p0-04](p0-04-wonder-tracking.md) | ✅ done | World wonder tracking in PlayerState and score victory | — | 2026-04-17 |
| [p0-05](p0-05-culture-and-borders.md) | ✅ done | Culture generation and border expansion | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |
@@ -58,9 +58,9 @@
| [p0-17](p0-17-wild-creature-lair-loop.md) | ✅ done | Wild creature and lair clearing loop | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |
| [p0-18](p0-18-strategic-resource-gate.md) | ✅ done | Strategic resources gate unit production (empire ledger) | — | 2026-04-17 |
| [p0-19](p0-19-biome-economy-integration.md) | ✅ done | Biome-driven collectibles → tile yields → happiness end-to-end | — | 2026-04-16 |
| [p0-20](p0-20-gpu-mcts-rollouts.md) | 🟡 partial | GPU-accelerated MCTS rollouts for look-ahead decision-making | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 |
| [p0-20](p0-20-gpu-mcts-rollouts.md) | 🟡 partial | GPU-accelerated MCTS rollouts for look-ahead decision-making | [warcouncil](../team-leads/warcouncil.md) | 2026-04-19 |
| [p0-21](p0-21-audio-system-capability.md) | ✅ done | Audio system capability — manifest + autoload + EventBus wiring | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |
| [p0-22](p0-22-ultimate-ai-stress-test.md) | 🟡 partial | Ultimate AI stress test — 5 clans, huge map, deep lookahead | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 |
| [p0-22](p0-22-ultimate-ai-stress-test.md) | 🟡 partial | Ultimate AI stress test — 5 clans, huge map, deep lookahead | [warcouncil](../team-leads/warcouncil.md) | 2026-04-19 |
| [p0-23](p0-23-sprite-rendering-capability.md) | ✅ done | Sprite rendering capability — replace procedural draw_* with texture rendering | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |
| [p0-24](p0-24-difficulty-calibrated-ai-progression.md) | ✅ done | Difficulty-calibrated AI progression — Easy / Normal / Hard tier-peak distributions | [warcouncil](../team-leads/warcouncil.md) | 2026-04-19 |
| [p0-25](p0-25-game-quality-metrics-instrumentation.md) | ✅ done | Game-quality metrics instrumentation — tier_peak, peak_unit_tier, wonder_count | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |


@@ -5,7 +5,7 @@ priority: p0
status: partial
scope: game1
owner: warcouncil
updated_at: 2026-04-18
updated_at: 2026-04-19
evidence:
- src/simulator/crates/mc-ai/tests/ultimate_lookahead_stress.rs
- tools/matchup-grid.sh
@@ -66,13 +66,17 @@ a foregone conclusion; the grid is the precondition.
- `ai_personalities.json` still exports exactly 5 canonical clans
- ✓ `python3 tools/test_matchup_and_ultimate.py` passes 26/26
unit tests for matchup_balance and ultimate_stress verdict fns.
- ✗ **`tools/matchup-grid.sh` → `matchup_balance: PASS`** — NOT yet run.
Structural blocker RESOLVED 2026-04-18: `MAP_SIZE` + `NUM_PLAYERS` env vars
now threaded through `scenes/tests/auto_play.gd` and both local-flatpak +
remote-ssh paths of `autoplay-batch.sh`. Batch execution pending; expected
to be gated by the shared p0-01 gameplay-balance issue (games resolve
T39-T100 via rush domination, so per-pair median-turn may fall below
ultimate_stress's ≥40% of cap threshold).
- 🟡 **`tools/matchup-grid.sh` → `matchup_balance: PASS`** — IN PROGRESS 2026-04-19.
Batch `matchup-grid-20260419_000018` (5 seeds/pair, T300, `AI_USE_MCTS=true`):
**7/10 pairs complete** with exit=0 — ironhold_vs_goldvein, ironhold_vs_blackhammer,
ironhold_vs_deepforge, ironhold_vs_runesmith, goldvein_vs_blackhammer,
goldvein_vs_deepforge, goldvein_vs_runesmith. Remaining: blackhammer_vs_deepforge,
blackhammer_vs_runesmith, deepforge_vs_runesmith. Batch interrupted twice by
apricot OOM hard-poweroff (PARALLEL=16 Godot instances → memory spike during
simultaneous init). Fix landed: `LAUNCH_COOLDOWN` env var added to
`tools/autoplay-batch.sh` — staggers game launches N seconds apart to prevent
simultaneous peak-init memory pressure. Resume with `LAUNCH_COOLDOWN=15 PARALLEL=8`.
Verdict pending full 10/10 completion.
- ✗ **`tools/huge-map-5clan.sh` → `ultimate_stress: PASS`** — NOT yet run.
Same env-wiring resolved 2026-04-18. Batch execution pending behind the
matchup-grid precondition and the p0-01 balance fix.
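Review note: the grid progress reported in the matchup-grid bullet can be cross-checked with a short Python sketch. It enumerates the C(5,2)=10 pairings and subtracts the seven pairs reported complete. The clan order is inferred from the pair names in the note, and the `grid_pairs` / `remaining_pairs` helpers are hypothetical names used for illustration; nothing here is taken from `tools/matchup-grid.sh` itself.

```python
from itertools import combinations

# Assumption: clan order inferred from the pair names in the status note.
CLANS = ["ironhold", "goldvein", "blackhammer", "deepforge", "runesmith"]

# Pairs the 2026-04-19 note reports as complete with exit=0.
COMPLETED = {
    "ironhold_vs_goldvein",
    "ironhold_vs_blackhammer",
    "ironhold_vs_deepforge",
    "ironhold_vs_runesmith",
    "goldvein_vs_blackhammer",
    "goldvein_vs_deepforge",
    "goldvein_vs_runesmith",
}


def grid_pairs(clans):
    """Return every unordered pairing as 'a_vs_b', following list order."""
    return [f"{a}_vs_{b}" for a, b in combinations(clans, 2)]


def remaining_pairs(clans, completed):
    """Return the pairings that are not yet in the completed set."""
    return [pair for pair in grid_pairs(clans) if pair not in completed]


if __name__ == "__main__":
    all_pairs = grid_pairs(CLANS)
    todo = remaining_pairs(CLANS, COMPLETED)
    print(f"{len(all_pairs) - len(todo)}/{len(all_pairs)} pairs complete")
    for pair in todo:
        print("remaining:", pair)
```

Under these assumptions the remaining list matches the three pairs named in the note, so a resumed batch only needs to cover those.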
@ -80,14 +84,13 @@ a foregone conclusion; the grid is the precondition.
## Remaining to reach done
1. ~~**Game binary reads `MAP_SIZE` and `NUM_PLAYERS` env.**~~ DONE 2026-04-18.
2. **Run matchup-grid** (C(5,2)=10 pairs × seeds). Cite verdict.
2. **Complete matchup-grid** — 7/10 pairs done. Resume with `LAUNCH_COOLDOWN=15 PARALLEL=8`
to avoid OOM. Run `checklist-report.py matchup_balance` across full grid dir once 10/10 done.
3. **Run huge-map-5clan** (5 clans on Civ5 `standard` 80×52 map).
Cite verdict.
4. **Both batches likely gated by p0-01 gameplay-balance tune** — median-turn
gate in `ultimate_stress` requires ≥40% of cap (≥120 turns of 300). Current
binary resolves most games T39-T100 via rush-domination. Running these
batches now is expected to FAIL the verdict; waiting for p0-01's pacing
tune to land is the cost-effective sequencing.
Cite verdict. Blocked on matchup-grid PASS.
4. **Median-turn gate concern**: `ultimate_stress` requires ≥40% of cap (≥120T of 300).
Post-p0-37+p0-39+tempo-bump binary now runs to median T192 — this gate should PASS.
The tier_peak and peak_unit_tier gates may still fail (gated by game-systems/data scope).
5. **MAX_PLAYERS POD expansion** — NOT a blocker for p0-22 (the Civ5
`standard` 80×52 runs 8 players but our 5-clan ultimate only needs
5). If we later want to run the actual canonical `huge` (128×80,

File diff suppressed because one or more lines are too long
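
Review note on the median-turn gate in item 4 of the p0-22 remaining-work list: the threshold is simple arithmetic, since 40% of a 300-turn cap is 120 turns. Below is a minimal Python sketch, assuming the gate compares the median final turn of a batch against that threshold. The function name, its parameters, and both sample turn lists are hypothetical; they are not taken from `checklist-report.py` or from real batch output.

```python
from statistics import median


def median_turn_gate(final_turns, turn_cap=300, min_percent=40):
    """Check whether the median final turn reaches min_percent of turn_cap.

    Returns (passed, median_turn, threshold). Hypothetical helper; the real
    verdict function may apply different rules.
    """
    threshold = turn_cap * min_percent / 100  # 300 * 40 / 100 = 120.0
    median_turn = median(final_turns)
    return median_turn >= threshold, median_turn, threshold


if __name__ == "__main__":
    # Illustrative only: games ending in the T39-T100 range cited for the old binary.
    rush_games = [39, 55, 72, 88, 100]
    # Illustrative only: a batch whose median equals the T192 cited for the new binary.
    paced_games = [150, 180, 192, 210, 240]

    for label, turns in (("rush", rush_games), ("paced", paced_games)):
        passed, med, threshold = median_turn_gate(turns)
        verdict = "PASS" if passed else "FAIL"
        print(f"{label}: median T{med} vs threshold T{threshold:g} -> {verdict}")
```

A batch with median T192 clears the 120-turn threshold with room to spare, which is consistent with the expectation that this gate should pass while the tier_peak and peak_unit_tier gates remain open questions.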