fix(@projects/@magic-civilization): 🐛 update stress-test status to reflect ironhold balance failure
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
parent d902d97cca
commit 412dbb3ebf
1 changed file with 12 additions and 7 deletions
@@ -72,11 +72,13 @@ a foregone conclusion; the grid is the precondition.
 - Standard: `2 × TURN_LIMIT + 300` (1300s)
 - Override: `SAFETY_TIMEOUT_OVERRIDE=<seconds>` for manual control.
 Unblocks huge-map-5clan batch execution (was timing out at 1300s; MCTS lookahead needs ~1800s).
-- 🟡 **`tools/matchup-grid.sh` → `matchup_balance: PASS`** — IN PROGRESS 2026-04-24.
-Prior batch `matchup-grid-20260419_000018` no longer exists on apricot (cleaned up or on different host).
-Fresh full run started 2026-04-24 on apricot (PID 2016984, log: `/tmp/p0-22-matchup-grid-fresh.log`),
-with `LAUNCH_COOLDOWN=15 COUNT=5 TURN_LIMIT=300 PARALLEL=4 AI_USE_MCTS=true`.
-Verdict pending full 10/10 completion.
+- 🔴 **`tools/matchup-grid.sh` → `matchup_balance: PASS`** — FAIL 2026-04-25.
+Full run `matchup-grid-20260424_165224` (10/10 pairs, 5 seeds each, 50 total games) completed.
+Verdict: `pass: false`. Single reason: `ironhold has 7 appearances but 0 wins in the grid`.
+Ironhold won 0/7 games — all other clans won at least once. This is a balance issue in ironhold's
+personality parameters, not a tooling problem. Fix: tune ironhold's personality axes in
+`public/games/age-of-dwarves/data/ai_personalities.json` so it wins at least 1/7 games.
+Full verdict: blackhammer 40%, deepforge 9.1%, goldvein 11.1%, ironhold 0%, runesmith 25%.
 - 🔴 **`tools/huge-map-5clan.sh` → `ultimate_stress: PASS`** — BLOCKED: three root causes, two fully diagnosed.
 
 **Root cause 1 (fixed, confirmed):** Batch `000049` used a stale `.so` lacking `GdAiController` registration —
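Reviewer note: the failing verdict in this hunk rests on one rule, that a clan which appears in the grid must win at least once. A minimal Python sketch of that rule follows. It is not the actual logic of `checklist-report.py`, which this diff does not show. The function names, the `(appearances, wins)` input shape, and the `example_clan` row are hypothetical; only the ironhold figures (7 appearances, 0 wins) and the wording of the reason string come from the commit.

```python
from typing import Dict, List, Tuple


def zero_win_reasons(records: Dict[str, Tuple[int, int]]) -> List[str]:
    """Return one reason string per clan that appeared but never won.

    `records` maps clan name -> (appearances, wins). This input shape is an
    assumption made for illustration, not the real report format.
    """
    reasons = []
    for clan, (appearances, wins) in sorted(records.items()):
        if appearances > 0 and wins == 0:
            reasons.append(
                f"{clan} has {appearances} appearances but 0 wins in the grid"
            )
    return reasons


def verdict(records: Dict[str, Tuple[int, int]]) -> dict:
    """Pass only when no clan has a zero-win record."""
    reasons = zero_win_reasons(records)
    return {"pass": not reasons, "reasons": reasons}


# ironhold's numbers are from the commit; example_clan is a hypothetical filler row.
result = verdict({"ironhold": (7, 0), "example_clan": (5, 2)})
print(result)
```

Under this rule a single ironhold win would be enough to clear the reason, which matches the commit's stated target of at least 1/7.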
@@ -112,8 +114,11 @@ a foregone conclusion; the grid is the precondition.
 1. ~~**Game binary reads `MAP_SIZE` and `NUM_PLAYERS` env.**~~ DONE 2026-04-18.
 2. ~~**Wall-clock timeout sufficient for MCTS on huge maps.**~~ DONE 2026-04-23.
 `autoplay-batch.sh` SAFETY_TIMEOUT now auto-scales to 3× TURN_LIMIT when MCTS enabled.
-3. **Complete matchup-grid** — Fresh full run in progress on apricot (PID 2016984).
-Run `checklist-report.py matchup_balance` across full grid dir once 10/10 done.
+3. ~~**Complete matchup-grid**~~ DONE 2026-04-25 — all 10/10 pairs ran. Result: FAIL (ironhold 0 wins).
+Fix ironhold personality balance (see item below), then re-run with `LAUNCH_COOLDOWN=15 COUNT=5 TURN_LIMIT=300 PARALLEL=4 AI_USE_MCTS=true`.
+3b. **Fix ironhold balance** — ironhold won 0/7 games in matchup-grid. Likely has too-conservative personality axes
+(military/expansion too low). Tune `public/games/age-of-dwarves/data/ai_personalities.json` ironhold entry.
+Cross-check against the other 4 clans' axes. Re-run matchup-grid after tuning.
 4. **Fix score victory race in `auto_play.gd`** — `_on_victory` signal from `victory_manager.gd` may fire on
 the same frame that `_state = "done"` / `_finalize_run` writes `max_turns` to turn_stats. Check the `done`
 branch at line 539-545: it writes `_outcome = "victory" if _victory else "max_turns"` which is correct,
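Reviewer note: the timeout rules mentioned in both hunks can be summarised in a few lines. This is a Python sketch, not the shell logic of `autoplay-batch.sh`, which is not part of this diff. Two assumptions are baked in: the MCTS case adds the same 300-second pad as the standard formula (the diff only says "3× TURN_LIMIT"), and the example value `TURN_LIMIT=500` is inferred from the quoted 1300s figure rather than stated anywhere. `safety_timeout` and `PAD_SECONDS` are hypothetical names.

```python
import os
from typing import Mapping, Optional

PAD_SECONDS = 300  # fixed pad from the standard formula `2 × TURN_LIMIT + 300`


def safety_timeout(
    turn_limit: int,
    use_mcts: bool,
    env: Optional[Mapping[str, str]] = None,
) -> int:
    """Return the wall-clock safety timeout in seconds.

    SAFETY_TIMEOUT_OVERRIDE, when set, wins over both formulas.
    ASSUMPTION: the MCTS branch keeps the +300 pad; the source does not confirm it.
    """
    env = os.environ if env is None else env
    override = env.get("SAFETY_TIMEOUT_OVERRIDE", "").strip()
    if override:
        return int(override)
    multiplier = 3 if use_mcts else 2
    return multiplier * turn_limit + PAD_SECONDS


# TURN_LIMIT=500 is an inferred example value, not taken from the diff.
print(safety_timeout(500, use_mcts=False, env={}))
print(safety_timeout(500, use_mcts=True, env={}))
print(safety_timeout(500, use_mcts=True, env={"SAFETY_TIMEOUT_OVERRIDE": "2400"}))
```

If the real script drops the pad in the MCTS case, only the `multiplier * turn_limit + PAD_SECONDS` line would need to change.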