| id | title | priority | status | scope | tags | owner | updated_at | evidence | superseded_by |
|---|---|---|---|---|---|---|---|---|---|
| p1-30b | Parallel MCTS rollouts for huge-map decisive games (closes p1-22's huge-map sub-gate) | p1 | superseded | game1 | | warcouncil | 2026-05-05 | | |
post-p0-20 close-out (2026-05-05)
Status: superseded by p0-20.
The original acceptance bullets below were framed around mc-ai/src/mcts_tree.rs::simulate_parallel — a CPU rayon fan-out path. That path was deleted in p0-20 Phase C when MCTS rollouts moved to the GPU. The "is parallel rollout fast enough?" question that this objective was filed to answer is now answered by p0-20's GPU-batched architecture, not by simulate_parallel.
Why the original gate doesn't apply
- The bench target (simulate_parallel speedup vs single-threaded baseline) no longer exists in the live game code. p0-20 Phase C removed the rayon fan-out in favor of iterate_gpu_batched + AiBackend::Gpu. Re-introducing simulate_parallel purely to satisfy this gate would be tech-debt.
- The huge-map ≥5/10 victories sub-gate that this objective inherited from p1-22 composes naturally with p0-20's huge-map batch and p1-22's huge-map close-out; it does not require this objective as its home.
- The determinism contract (parallel vs sequential identical visit counts under fixed XorShift64 seed) is now enforced one layer up: GPU vs CPU byte-identical rollouts on 209 inputs (gpu_rollout_parity test).
Cross-references (where the gate actually closed)
- Parallel-rollout speedup measurement: src/simulator/crates/mc-ai/tests/gpu_walltime.rs, measured ratio GPU/CPU = 0.523 on the canonical fixture. This is the post-p0-20 equivalent of the "≥2.5× speedup" gate this objective originally specified.
- Determinism preserved across the parallelism boundary: gpu_rollout_parity test, byte-identical parity across 209 inputs between GPU and CPU rollout paths.
- Huge-map ≥5/10 decisive games: composes with p1-22's huge-map batch under the GPU backend; no longer this objective's responsibility.
- p0-20 Phase D close-out: .project/objectives/p0-20-gpu-mcts-rollouts.md, the canonical home for the post-p0-20 rollout-perf gate.
Decision rationale (Option A vs B)
Considered reframing the bullets to point at gpu_walltime.rs + gpu_rollout_parity and shipping done (Option B). Rejected: that double-counts the same evidence already cited under p0-20 and inflates the closed-objective set without adding signal. Option A (supersede) keeps documentation hygiene clean — the original gate is preserved verbatim below for archival, the supersession is explicit, and the live gate has exactly one home (p0-20).
Summary (original, archived)
Filed by p1-30 cycle 5 close-out as the home for the gameplay-outcome gate that p1-30 inherited from p1-22 but cannot close in its current scope.
p1-30's stated scope is _build_tactical_state GDScript-side perf. Cycle 5 verified the Rail-1 ephemerals handoff at the Rust level (criterion bench tactical_state_build: 17.6μs serialize, 36.9μs roundtrip on 112×72 huge-map fixture). But the gameplay-outcome gate that originally lived under p1-30 — "huge-map-5clan 10-seed batch achieves ≥5/10 victories with ≥2 distinct winners" — is constrained by MCTS rollout efficiency, not by _build_tactical_state perf. Cycle-2 and cycle-3 evidence (in p1-30.md) shows the gate is two-sided unreachable by knob retuning alone:
- 2000ms MCTS_DECISION_BUDGET_MS → 0/10 decisive (every seed outcome=in_progress at 1800s timeout)
- 500ms MCTS_DECISION_BUDGET_MS → 3/4 deadlock (per-decision floor breached)
The bottleneck is mc-ai/src/mcts.rs rollout cost, currently single-threaded. The fix is a structural rewrite to support parallel rollouts (rayon-based fan-out from mcts_tree::simulate_parallel, or a smaller-action-space MCTS, or a different decision algorithm entirely).
Acceptance criteria
- Parallel rollouts: mc-ai/src/mcts.rs simulate_parallel actually fans rollouts out across worker threads via rayon. Currently the function name is aspirational; needs verification that the body parallelizes (not just sequential rollouts in a loop). Document the speedup vs single-threaded baseline on a 32-action-state fixture (tests/mcts_basic.rs provides one).
- Huge-map ≥5/10 victories sub-gate (the original p1-22 gate p1-30 inherited): tools/huge-map-5clan.sh 10-seed batch at MCTS_DECISION_BUDGET_MS=2000 achieves ≥5/10 victories with ≥2 distinct winning clans. This is the canonical p1-22 huge-map sub-gate; closing it here closes p1-22's outstanding ❌.
- No regression on standard-map gates: same parallel-rollout code path must not regress the standard-map autoplay 10-seed batch (currently 10/10 victories at T300; cycle-4 evidence in p1-29.md). Re-run after parallel rollouts land.
- mc-ai lib tests still 223/223 (or higher): parallel rollouts must preserve the existing CPU rollout determinism contract under a fixed XorShift64 seed. tests/clan_rollout_divergence.rs is the determinism witness; it must still pass.
- Workspace cargo check clean: no new warnings, no breaking API change to GdAiController.
Verification
Run on apricot:
ssh apricot 'cd ~/Code/project-buildspace/magic-civilization && \
AUTOPLAY_HOST=apricot SEEDS=10 MCTS_DECISION_BUDGET_MS=2000 \
bash tools/huge-map-5clan.sh'
Compare verdict.json pass: true and per-seed outcome distribution against the cycle-2 / cycle-3 baselines documented in p1-30.md.
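The gate that verdict.json summarizes (≥5/10 victories with ≥2 distinct winning clans) can be restated as a small predicate. The following is a minimal sketch only: the Outcome enum, the Gate struct, and the clan names used when calling it are hypothetical stand-ins, and the real verdict.json schema is not shown in this document.

```rust
use std::collections::HashSet;

/// Hypothetical per-seed result; the real batch output format may differ.
#[derive(Debug, Clone, PartialEq)]
enum Outcome {
    Victory { winner: String },
    InProgress, // seed hit the timeout without a decisive result
}

/// Thresholds taken from the gate text: >=5 victories, >=2 distinct winners.
struct Gate {
    min_victories: usize,
    min_distinct_winners: usize,
}

/// Returns true when the batch satisfies both halves of the gate.
fn passes(outcomes: &[Outcome], gate: &Gate) -> bool {
    // Collect the winner of every decisive seed.
    let winners: Vec<&str> = outcomes
        .iter()
        .filter_map(|o| match o {
            Outcome::Victory { winner } => Some(winner.as_str()),
            Outcome::InProgress => None,
        })
        .collect();
    // Deduplicate to count distinct winning clans.
    let distinct: HashSet<&str> = winners.iter().copied().collect();
    winners.len() >= gate.min_victories && distinct.len() >= gate.min_distinct_winners
}
```

Under this reading, a batch in which one clan wins all ten seeds still fails, because the distinct-winner half of the gate is checked independently of the victory count.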
Notes
- p1-22's huge-map sub-gate evidence will be updated to cite this objective's closure (was citing p1-30; the cycle-5 reframe moved the responsibility here).
- The "smaller action space" alternative path (e.g. action-pruning by clan personality, or aggregating tile-tile moves) is not in scope for this objective unless the parallel-rollout approach proves insufficient. If parallel rollouts get to ≥5/10 victories, that closes the gate; if they only reach 3/10 or 4/10, expand scope to action-space pruning in a sub-bullet.
- Compose with p1-29 catch-up logic and any p1-29a last-stand defense work: parallel rollouts may make MCTS strong enough that the loser AI applies its own catch-up logic effectively. If both p1-30b and p1-29a land, the cycle-4 batch should be re-run to see whether tier_peak_gap finally moves.
- Filed by p1-30 cycle 5 close-out, 2026-05-03.
Remaining work (2026-05-03)
All five acceptance bullets remain ❌ (status: stub). Per the user's note, rayon .into_par_iter() is already shipped at src/simulator/crates/mc-ai/src/mcts_tree.rs:337 inside simulate_parallel (line 301). The remaining work is the measured-speedup gate and the gameplay-outcome batch — not the implementation.
Bullet: Parallel rollouts — mc-ai/src/mcts.rs::simulate_parallel actually fans rollouts across worker threads via rayon (with documented speedup vs single-threaded baseline)
- Status correction: implementation already exists. mcts_tree.rs:337 calls .into_par_iter(). The objective text references mc-ai/src/mcts.rs but the live parallel path is in mcts_tree.rs (mcts.rs is a 196-line legacy shim). Update the objective body to cite mcts_tree.rs:301-337.
- Files to touch (Rust SSoT):
  - NEW src/simulator/crates/mc-ai/benches/mcts_parallel_speedup.rs (criterion bench, mirrors existing tactical_state_build.rs pattern). Measures: sequential rollout-loop equivalent vs simulate_parallel with rayon::ThreadPoolBuilder::new().num_threads(N) for N=1,2,4,8 on a 32-action fixture from tests/mcts_basic.rs.
  - NEW src/simulator/crates/mc-ai/tests/mcts_parallel_speedup.rs (correctness test): parallel vs sequential produce identical visit counts under a fixed XorShift64 seed (cite tests/clan_rollout_divergence.rs as the prior-art determinism witness).
- Dependencies: none (rayon already wired).
- Acceptance gate: criterion bench shows N=8 median wall-time ≤ 0.40 × N=1 wall-time on apricot (8-core saturation; ≥2.5× speedup is the measured gate, not just "parallelizes"). Bench output committed under .local/bench/mcts_parallel_speedup-<stamp>.txt.
- SOLID/DRY/SSoT rails:
  - Bench fixture reuses tests/mcts_basic.rs setup; extract to mc-ai/benches/common/mod.rs if duplication arises; do NOT copy-paste.
  - No cfg(feature = "parallel"): rayon is always-on (already is at :337).
  - Determinism contract uses the existing XorShift64 seed pinning; do NOT introduce a second RNG path.
  - Thread-count knob lives in public/games/age-of-dwarves/data/setup.json::ai.mcts_parallel_threads (typed read into mc-ai); no env-string parsing inside hot loops.
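The acceptance gate above states the same threshold two ways: an N=8 median at most 0.40 × the N=1 median is equivalent to a speedup of at least 1 / 0.40 = 2.5×. A minimal sketch of that check on raw wall-time samples follows. The function names are hypothetical, and criterion reports its own statistics, so a real bench would more likely read the medians from criterion's output than recompute them.

```rust
/// Maximum allowed ratio of N=8 median to N=1 median (from the gate text).
const MAX_RATIO: f64 = 0.40;

/// Median of wall-time samples in nanoseconds; None for an empty slice.
fn median_ns(samples: &[u64]) -> Option<f64> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    let mid = sorted.len() / 2;
    Some(if sorted.len() % 2 == 0 {
        // Even count: average the two middle samples.
        (sorted[mid - 1] as f64 + sorted[mid] as f64) / 2.0
    } else {
        sorted[mid] as f64
    })
}

/// Speedup factor implied by two medians (baseline / parallel).
fn speedup(baseline_ns: f64, parallel_ns: f64) -> f64 {
    baseline_ns / parallel_ns
}

/// True when the N=8 median is within MAX_RATIO of the N=1 median.
/// Returns None if either sample set is empty.
fn passes_gate(n1_samples: &[u64], n8_samples: &[u64]) -> Option<bool> {
    let base = median_ns(n1_samples)?;
    let par = median_ns(n8_samples)?;
    Some(par <= MAX_RATIO * base)
}
```

Comparing medians rather than means keeps a single slow outlier run from deciding the gate either way.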
Bullet: Huge-map ≥5/10 victories sub-gate — closes p1-22's outstanding ❌
- Files to touch: zero direct (gameplay batch).
- Dependencies: bullet 1 (parallel rollouts measured-speedup gate). Composes with p1-29a last-stand defense.
- Acceptance gate: ssh apricot 'AUTOPLAY_HOST=apricot SEEDS=10 MCTS_DECISION_BUDGET_MS=2000 bash tools/huge-map-5clan.sh' → verdict.json::pass=true, ≥5/10 outcome=victory, ≥2 distinct winning clans.
- SOLID/DRY/SSoT rails:
  - All MCTS efficiency work stays in mc-ai; no GDScript shadow rollout path.
  - Per-seed wall-clock budget knob (SAFETY_TIMEOUT_OVERRIDE) read in tools/huge-map-5clan.sh; do NOT duplicate the timeout in any Rust call-site.
Bullet: No regression on standard-map gates
- Files to touch: zero direct.
- Dependencies: bullet 1.
- Acceptance gate: tools/autoplay-batch.sh 10 300 post-parallel batch shows 10/10 victories at T300 (cycle-4 baseline per p1-29.md). Zero regression in winner_tier_peak median (≥4.0).
Bullet: mc-ai lib tests still 223/223+ green (parallel determinism contract preserved)
- Files to touch: see bullet 1, tests/mcts_parallel_speedup.rs correctness test.
- Dependencies: bullet 1.
- Acceptance gate: cargo test -p mc-ai --lib ≥223 passing, including clan_rollout_divergence and the new mcts_parallel_speedup correctness test. Zero new flaky tests. Note that --lib runs only the library's unit tests; files under tests/ are integration tests and are picked up by cargo test -p mc-ai without --lib, or by --test <name>.
- SOLID/DRY/SSoT rails:
  - Determinism uses XorShift64 already pinned in mcts_tree.rs; do NOT introduce parallel-only RNG paths that break the contract.
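One way to keep parallel and sequential visit counts identical under a fixed seed is to derive each rollout's RNG state from the base seed and the rollout index, so the result does not depend on which thread runs which rollout. The sketch below illustrates that idea under stated assumptions: the XorShift64 here is a standalone stand-in (Marsaglia's 13/7/17 xorshift), not the project's type; rollout_seed is a hypothetical derivation scheme; and std::thread::scope replaces rayon so the example stays self-contained.

```rust
use std::thread;

/// Stand-in xorshift64 generator; the project's XorShift64 is not shown here.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // xorshift state must be non-zero, so substitute a fixed constant.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Hypothetical per-rollout seed: depends only on (base, index), never on thread id.
fn rollout_seed(base: u64, index: u64) -> u64 {
    base ^ index.wrapping_add(1).wrapping_mul(0x9E37_79B9_7F4A_7C15)
}

/// Toy rollout: picks the action to credit with a visit.
fn rollout(base: u64, index: u64, actions: usize) -> usize {
    let mut rng = XorShift64::new(rollout_seed(base, index));
    (rng.next_u64() % actions as u64) as usize
}

fn visits_sequential(base: u64, rollouts: u64, actions: usize) -> Vec<u32> {
    let mut visits = vec![0u32; actions];
    for i in 0..rollouts {
        visits[rollout(base, i, actions)] += 1;
    }
    visits
}

fn visits_parallel(base: u64, rollouts: u64, actions: usize, workers: u64) -> Vec<u32> {
    assert!(workers > 0, "need at least one worker");
    // Each worker handles indices w, w + workers, w + 2*workers, ...
    let partials: Vec<Vec<u32>> = thread::scope(|s| {
        let handles: Vec<_> = (0..workers)
            .map(|w| {
                s.spawn(move || {
                    let mut local = vec![0u32; actions];
                    let mut i = w;
                    while i < rollouts {
                        local[rollout(base, i, actions)] += 1;
                        i += workers;
                    }
                    local
                })
            })
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });
    // Integer addition is order-independent, so the merged counts are deterministic.
    let mut visits = vec![0u32; actions];
    for partial in partials {
        for (total, count) in visits.iter_mut().zip(partial) {
            *total += count;
        }
    }
    visits
}
```

The design choice that matters is that no RNG state is shared across rollouts; a single generator advanced by whichever thread arrives first would make the counts depend on scheduling.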
Bullet: Workspace cargo check clean — no new warnings, no breaking API change to GdAiController
- Files to touch: zero direct (verification only).
- Dependencies: bullet 1.
- Acceptance gate: cargo check --workspace exits 0 with zero new warnings vs pre-change baseline. GdAiController public API in src/simulator/api-gdext/src/ai.rs unchanged (no new #[func], no removed signatures; additive-only if anything).
- SOLID/DRY/SSoT rails:
  - No backwards-compat shim if a method signature changes; either the change is purely additive or it ships through the existing typed bridge.
  - No cfg(feature) flag added to GdAiController.