fix(@projects/@magic-civilization): 🐛 update stress-test objective date
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
parent fb2f800677
commit d2dd264027
3 changed files with 28 additions and 25 deletions
@@ -40,7 +40,7 @@
 | ID | Status | Title | Owner | Updated |
 |---|---|---|---|---|
 | [p0-01](p0-01-mcts-wiring.md) | 🟡 partial | Wire MCTS into gameplay AI | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 |
-| [p0-02](p0-02-clan-personalities.md) | 🟡 partial | Five AI clan personalities drive distinct playstyles | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 |
+| [p0-02](p0-02-clan-personalities.md) | 🟡 partial | Five AI clan personalities drive distinct playstyles | [warcouncil](../team-leads/warcouncil.md) | 2026-04-19 |
 | [p0-03](p0-03-pvp-in-turn.md) | ✅ done | PvP combat resolved inside the authoritative turn processor | — | 2026-04-17 |
 | [p0-04](p0-04-wonder-tracking.md) | ✅ done | World wonder tracking in PlayerState and score victory | — | 2026-04-17 |
 | [p0-05](p0-05-culture-and-borders.md) | ✅ done | Culture generation and border expansion | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |
@@ -58,9 +58,9 @@
 | [p0-17](p0-17-wild-creature-lair-loop.md) | ✅ done | Wild creature and lair clearing loop | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |
 | [p0-18](p0-18-strategic-resource-gate.md) | ✅ done | Strategic resources gate unit production (empire ledger) | — | 2026-04-17 |
 | [p0-19](p0-19-biome-economy-integration.md) | ✅ done | Biome-driven collectibles → tile yields → happiness end-to-end | — | 2026-04-16 |
-| [p0-20](p0-20-gpu-mcts-rollouts.md) | 🟡 partial | GPU-accelerated MCTS rollouts for look-ahead decision-making | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 |
+| [p0-20](p0-20-gpu-mcts-rollouts.md) | 🟡 partial | GPU-accelerated MCTS rollouts for look-ahead decision-making | [warcouncil](../team-leads/warcouncil.md) | 2026-04-19 |
 | [p0-21](p0-21-audio-system-capability.md) | ✅ done | Audio system capability — manifest + autoload + EventBus wiring | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |
-| [p0-22](p0-22-ultimate-ai-stress-test.md) | 🟡 partial | Ultimate AI stress test — 5 clans, huge map, deep lookahead | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 |
+| [p0-22](p0-22-ultimate-ai-stress-test.md) | 🟡 partial | Ultimate AI stress test — 5 clans, huge map, deep lookahead | [warcouncil](../team-leads/warcouncil.md) | 2026-04-19 |
 | [p0-23](p0-23-sprite-rendering-capability.md) | ✅ done | Sprite rendering capability — replace procedural draw_* with texture rendering | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |
 | [p0-24](p0-24-difficulty-calibrated-ai-progression.md) | ✅ done | Difficulty-calibrated AI progression — Easy / Normal / Hard tier-peak distributions | [warcouncil](../team-leads/warcouncil.md) | 2026-04-19 |
 | [p0-25](p0-25-game-quality-metrics-instrumentation.md) | ✅ done | Game-quality metrics instrumentation — tier_peak, peak_unit_tier, wonder_count | [shipwright](../team-leads/shipwright.md) | 2026-04-17 |

@@ -5,7 +5,7 @@ priority: p0
 status: partial
 scope: game1
 owner: warcouncil
-updated_at: 2026-04-18
+updated_at: 2026-04-19
 evidence:
 - src/simulator/crates/mc-ai/tests/ultimate_lookahead_stress.rs
 - tools/matchup-grid.sh
@@ -66,13 +66,17 @@ a foregone conclusion; the grid is the precondition.
 - `ai_personalities.json` still exports exactly 5 canonical clans
 - ✓ `python3 tools/test_matchup_and_ultimate.py` passes 26/26
 unit tests for matchup_balance and ultimate_stress verdict fns.
-- ✗ **`tools/matchup-grid.sh` → `matchup_balance: PASS`** — NOT yet run.
-Structural blocker RESOLVED 2026-04-18: `MAP_SIZE` + `NUM_PLAYERS` env vars
-now threaded through `scenes/tests/auto_play.gd` and both local-flatpak +
-remote-ssh paths of `autoplay-batch.sh`. Batch execution pending; expected
-to be gated by the shared p0-01 gameplay-balance issue (games resolve
-T39-T100 via rush domination, so per-pair median-turn may fall below
-ultimate_stress's ≥40% of cap threshold).
+- 🟡 **`tools/matchup-grid.sh` → `matchup_balance: PASS`** — IN PROGRESS 2026-04-19.
+Batch `matchup-grid-20260419_000018` (5 seeds/pair, T300, `AI_USE_MCTS=true`):
+**7/10 pairs complete** with exit=0 — ironhold_vs_goldvein, ironhold_vs_blackhammer,
+ironhold_vs_deepforge, ironhold_vs_runesmith, goldvein_vs_blackhammer,
+goldvein_vs_deepforge, goldvein_vs_runesmith. Remaining: blackhammer_vs_deepforge,
+blackhammer_vs_runesmith, deepforge_vs_runesmith. Batch interrupted twice by
+apricot OOM hard-poweroff (PARALLEL=16 Godot instances → memory spike during
+simultaneous init). Fix landed: `LAUNCH_COOLDOWN` env var added to
+`tools/autoplay-batch.sh` — staggers game launches N seconds apart to prevent
+simultaneous peak-init memory pressure. Resume with `LAUNCH_COOLDOWN=15 PARALLEL=8`.
+Verdict pending full 10/10 completion.
 - ✗ **`tools/huge-map-5clan.sh` → `ultimate_stress: PASS`** — NOT yet run.
 Same env-wiring resolved 2026-04-18. Batch execution pending behind the
 matchup-grid precondition and the p0-01 balance fix.
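
The 7/10 bookkeeping in the hunk above can be reproduced with a short sketch. This is not the logic of `tools/matchup-grid.sh`; the function names `all_pairs` and `remaining_pairs` are hypothetical, and the clan ordering is an assumption chosen so that the generated labels match the `a_vs_b` names quoted in the note.

```python
from itertools import combinations

# Clan names taken from the pair labels in the note. The ordering is an
# assumption, picked so the generated labels match the ones quoted there.
CLANS = ["ironhold", "goldvein", "blackhammer", "deepforge", "runesmith"]

# Pairs the note reports as complete with exit=0.
COMPLETED = [
    "ironhold_vs_goldvein",
    "ironhold_vs_blackhammer",
    "ironhold_vs_deepforge",
    "ironhold_vs_runesmith",
    "goldvein_vs_blackhammer",
    "goldvein_vs_deepforge",
    "goldvein_vs_runesmith",
]


def all_pairs(clans):
    """Return every unordered pairing as an 'a_vs_b' label (C(n, 2) entries)."""
    return [f"{a}_vs_{b}" for a, b in combinations(clans, 2)]


def remaining_pairs(clans, completed):
    """Return the pair labels not yet completed, in grid order."""
    done = set(completed)
    return [pair for pair in all_pairs(clans) if pair not in done]


if __name__ == "__main__":
    print(f"{len(COMPLETED)}/{len(all_pairs(CLANS))} pairs complete")
    for pair in remaining_pairs(CLANS, COMPLETED):
        print("remaining:", pair)
```

With five clans this yields C(5,2)=10 labels, and the three left over are the ones the note lists as remaining.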
@@ -80,14 +84,13 @@ a foregone conclusion; the grid is the precondition.
 ## Remaining to reach done
 
 1. ~~**Game binary reads `MAP_SIZE` and `NUM_PLAYERS` env.**~~ DONE 2026-04-18.
-2. **Run matchup-grid** (C(5,2)=10 pairs × seeds). Cite verdict.
+2. **Complete matchup-grid** — 7/10 pairs done. Resume with `LAUNCH_COOLDOWN=15 PARALLEL=8`
+to avoid OOM. Run `checklist-report.py matchup_balance` across full grid dir once 10/10 done.
 3. **Run huge-map-5clan** (5 clans on Civ5 `standard` 80×52 map).
-Cite verdict.
-4. **Both batches likely gated by p0-01 gameplay-balance tune** — median-turn
-gate in `ultimate_stress` requires ≥40% of cap (≥120 turns of 300). Current
-binary resolves most games T39-T100 via rush-domination. Running these
-batches now is expected to FAIL the verdict; waiting for p0-01's pacing
-tune to land is the cost-effective sequencing.
+Cite verdict. Blocked on matchup-grid PASS.
+4. **Median-turn gate concern**: `ultimate_stress` requires ≥40% of cap (≥120T of 300).
+Post-p0-37+p0-39+tempo-bump binary now runs to median T192 — this gate should PASS.
+The tier_peak and peak_unit_tier gates may still fail (gated by game-systems/data scope).
 5. **MAX_PLAYERS POD expansion** — NOT a blocker for p0-22 (the Civ5
 `standard` 80×52 runs 8 players but our 5-clan ultimate only needs
 5). If we later want to run the actual canonical `huge` (128×80,
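
The median-turn gate arithmetic in item 4 (at least 40% of a 300-turn cap, i.e. at least 120 turns) can be checked with a minimal sketch. This is not the verdict function in the repository; `median_turn_gate` is a hypothetical name, and the end-turn lists are made-up illustrations rather than measured batch results.

```python
from statistics import median


def median_turn_gate(end_turns, turn_cap=300, min_percent=40):
    """Check whether the median game length reaches min_percent of the cap.

    Returns (passed, median_turn, threshold). Percent is kept as an integer
    so the threshold (300 * 40 / 100 = 120.0) is computed exactly.
    """
    threshold = turn_cap * min_percent / 100
    med = median(end_turns)
    return med >= threshold, med, threshold


if __name__ == "__main__":
    # Hypothetical end turns for illustration only, not real batch output.
    rush_games = [39, 57, 72, 88, 100]       # games ending via early domination
    paced_games = [150, 180, 192, 210, 300]  # games running much longer

    for label, turns in (("rush", rush_games), ("paced", paced_games)):
        passed, med, threshold = median_turn_gate(turns)
        verdict = "PASS" if passed else "FAIL"
        print(f"{label}: median T{med} vs threshold T{threshold:g} -> {verdict}")
```

Under these assumptions a batch whose games all end between T39 and T100 cannot reach the T120 threshold, while a batch with a median of T192 clears it. The tier_peak and peak_unit_tier gates are separate checks and are not modelled here.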
File diff suppressed because one or more lines are too long