fix(@projects/@magic-civilization): 🐛 update hung seed observation in batch progress log

Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
Natalie 2026-04-25 18:24:12 -07:00
parent 5677d9ebc6
commit d65d897915


@@ -100,22 +100,38 @@ After all three batches land successfully and status updates are committed, run
- **Correct binary hash** (deployed to addon, confirmed `set_budget_ms` registered): `0d127464096539475ae7fd9786eab8af545aeec4a39900234c3a4dfe5e9f07d7`
- **TODO for build infra**: `build-gdext.sh:find_build` should use `$CARGO_TARGET_DIR` when set, not hardcode `.local/build/rust/`
### Step 2: p1-22 batch — IN PROGRESS (third attempt)
### Step 2: p1-22 batch — IN PROGRESS (third attempt, hung seed observed)
History:
1. **First attempt** (`p1-22-budget-20260425_174041`) — aborted, stale binary, `set_budget_ms` errors on every AI turn. Results invalid.
2. **Second attempt** (`p1-22-budget-20260425_180000`) via tmux session `p1-22-batch2` — tmux server died mid-run on multi-tenant apricot before any seeds completed. Seeds 1+2 ran in isolation; orchestrator gone, no further seeds dispatched. Killed and discarded.
2. **Second attempt** (`p1-22-budget-20260425_180000`) via tmux session `p1-22-batch2` — tmux server died mid-run on multi-tenant apricot. Killed and discarded.
3. **Third attempt (current)**: `nohup` launch (no tmux), parent PID 1237118 on apricot. Output dir `.local/iter/p1-22-budget-20260425_180742`. Started 2026-04-25 18:07 PDT.
- Pass criteria: ≥5/10 victories, ≥2 distinct winners
- Monitor: `/tmp/p122_status.py`, polled every 5 min from local
- Status flag: `/tmp/p1-22-done.flag` written by `/tmp/p1-22-launch.sh` after `huge-map-5clan.sh` exits
- Background task `bv7obsvf5` polls until done flag exists or parent PID dies; emits one notification
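A minimal sketch of the polling logic described in the bullets above, assuming it runs on apricot itself so the flag file and parent PID are visible locally. The helper names `pid_alive` and `wait_for_batch` are hypothetical and are not taken from the actual background task. `os.kill(pid, 0)` only probes for the process and delivers no signal.

```python
import os
import time


def pid_alive(pid: int) -> bool:
    """Return True if a process with this PID exists (signal 0 is a probe only)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def wait_for_batch(flag_path, parent_pid, poll_s=300.0,
                   exists=os.path.exists, alive=pid_alive, sleep=time.sleep):
    """Block until the done flag appears or the parent dies; return which happened.

    The `exists`, `alive`, and `sleep` hooks are injectable so the loop can be
    exercised without real files or processes.
    """
    while True:
        if exists(flag_path):
            return "done"
        if not alive(parent_pid):
            # Re-check the flag: the launcher may have written it just before exiting.
            return "done" if exists(flag_path) else "parent-died"
        sleep(poll_s)
```

With the values from this step, a call would look like `wait_for_batch("/tmp/p1-22-done.flag", 1237118)`. The function returns exactly once, which matches the single-notification behaviour described above.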
**Mid-run observation (18:20 PDT, ~13 min elapsed):**
- Seed 1 progressing healthy: T184, 3.93 s/turn, ~22 min ETA to T500
- **Seed 2 HUNG**: stuck at T97 with wcs=219s for 8+ minutes. game.log mtime stale, godot-bin at 101% CPU sustained but no log writes
- This matches the pre-fix slow-seed pattern in p1-22.md:22 ("some maps produce game states where each MCTS decision takes 30-60+ seconds")
- The MCTS budget caps MCTS only; tactical AI / formation handling / combat resolution are NOT bounded by `MCTS_DECISION_BUDGET_MS`
- Will hit 3600s safety timeout ~19:08 PDT, autoplay-batch.sh records timeout outcome, dispatches seeds 3+4
- **Implication**: each hung seed costs 1 hr of wall clock. If seeds 4, 6, 8, 10 also hang, batch alone consumes 5+ hours.
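The wall-clock figures in the observation above can be checked with a short calculation. This is a sketch under two assumptions inferred from the log rather than confirmed by it: seeds are dispatched in waves of two, and the next wave starts only after both seeds of the current wave finish or time out. The function names are hypothetical.

```python
def remaining_eta_s(current_turn, turn_limit, seconds_per_turn):
    """Seconds left for a healthy seed at a steady per-turn cost."""
    return (turn_limit - current_turn) * seconds_per_turn


def batch_wall_clock_s(seed_durations_s, parallel):
    """Total batch time when seeds run in waves of `parallel`.

    Assumption: each wave lasts as long as its slowest seed.
    """
    total = 0.0
    for start in range(0, len(seed_durations_s), parallel):
        total += max(seed_durations_s[start:start + parallel])
    return total


TIMEOUT_S = 3600          # safety timeout per seed, from the log
HEALTHY_S = 500 * 3.93    # a full T500 run at seed 1's observed pace

# Seed 1 at T184 and 3.93 s/turn: roughly 21 minutes left to reach T500.
seed1_eta_min = remaining_eta_s(184, 500, 3.93) / 60

# Scenario from the log: every even-numbered seed hangs until the timeout.
durations = [TIMEOUT_S if seed % 2 == 0 else HEALTHY_S for seed in range(1, 11)]
batch_hours = batch_wall_clock_s(durations, parallel=2) / 3600  # 5.0 hours
```

Under these assumptions every wave contains one hung seed, so each wave costs the full 3600 s and the ten-seed batch takes five hours, consistent with the "5+ hours" estimate once dispatch overhead is added.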
Lessons learned (recorded for future coordinators):
- **Don't use tmux on multi-tenant apricot for long-running batches** — server can die unrelated to your job. Use `nohup ... </dev/null >log 2>&1 & disown` instead.
- **`build-gdext.sh` ignores `CARGO_TARGET_DIR`** — see Step 1 ISSUE FOUND. After running with `CARGO_TARGET_DIR=/tmp/...`, always `sha256sum` the deployed addon `.so` against the build's `.so` and `cp` manually if they differ.
- **Monitor inline-python via SSH gives stale/cached results** — write the script to a file on apricot first, then `ssh apricot python3 /tmp/script.py`. Direct evaluation through nested quoting in monitor commands produces unreliable readings.
- **Don't use tmux on multi-tenant apricot for long-running batches** — server can die unrelated to your job. Use `nohup ... </dev/null >log 2>&1 & disown`.
- **`build-gdext.sh` ignores `CARGO_TARGET_DIR`** — see Step 1 ISSUE FOUND. After running with `CARGO_TARGET_DIR=/tmp/...`, always `sha256sum` the deployed addon `.so` against the build's `.so` and `cp` manually if they differ. Build script needs `find_build()` patched to honor `$CARGO_TARGET_DIR` first.
- **Don't use the Monitor tool for long batch polling on apricot** — produces fabricated/stale events at irregular intervals. Use Bash `run_in_background` with an `until` loop for single completion notification.
- **autoplay-report.py schema rejects new MCTS instrumentation fields**: `tools/schemas/autoplay/turn-stats-line.json` patched to allow optional `mcts_action`, `mcts_root_found`, `mcts_root_idle`, `mcts_root_spawn` properties. Synced to apricot.
- **p1-22 budget is incomplete** — caps MCTS but not tactical/formation. Hung seed 2 demonstrates the gap. Follow-up work needed.
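The `sha256sum`-then-`cp` check from the `build-gdext.sh` lesson can also be scripted. The following is a minimal sketch, not the project's tooling: `sha256_of` and `sync_if_stale` are hypothetical helpers, and the caller supplies the paths to the built and deployed `.so` files, since their exact locations are not given here.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hex SHA-256 of a file, read in chunks so large binaries stay out of memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sync_if_stale(built, deployed):
    """Copy `built` over `deployed` when their hashes differ.

    Returns True if a copy was performed, False if the deployed file was
    already identical to the build output.
    """
    if Path(deployed).exists() and sha256_of(built) == sha256_of(deployed):
        return False
    shutil.copy2(built, deployed)
    return True
```

Comparing the resulting digest against the known-good hash recorded in Step 1 gives an extra confirmation that the deployed addon is the intended build.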
### Steps 3-4: p0-02 + p0-01 — PENDING p1-22 completion
### Step 3: p0-02 launcher — STAGED at `/tmp/p0-02-launch.sh` (apricot)
Wraps `tools/matchup-grid.sh` with `COUNT=10 TURN_LIMIT=300 PARALLEL=2`. Writes `/tmp/p0-02-done.flag` on completion. Launch with `nohup /tmp/p0-02-launch.sh </dev/null >/tmp/p0-02-nohup.out 2>&1 & disown` AFTER p1-22 passes.
### Step 4: p0-01 launcher — STAGED at `/tmp/p0-01-launch.sh` (apricot)
Wraps `tools/autoplay-batch.sh 10 300` with `PARALLEL=10`. Writes `/tmp/p0-01-done.flag` on completion. Launch AFTER p0-02 completes.
---