docs(objectives): 📝 Add/clarify sole-city AI research path documentation with new evidence and load-bearing fix details

Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
autocommit 2026-05-27 23:00:58 -07:00
parent e3066173d9
commit fa762794d3

@@ -11,9 +11,9 @@ evidence:
- "src/simulator/crates/mc-ai/src/policy.rs — SituationalContext + PersonalityPriors::action_prior_with_context (Settle +0.40, Defend +0.20, Research +0.50)"
- "src/simulator/crates/mc-ai/src/rollout.rs — TreeState::action_prior wired through SituationalContext::from_abstract (CPU MCTS only; GPU rollout walker keeps base prior to preserve gpu_rollout_parity)"
- "src/simulator/crates/mc-ai/tests/sole_city_priors.rs — 9 unit tests covering both bonuses, neutral identity, additive composition, and from_abstract detection (sole-city threatened, tech-below-median, multi-city no-threat, leader no-research-uplift, no-opponents default)"
- "LOAD-BEARING FIX (commit 8ebebd68f, 2026-05-27): src/simulator/crates/mc-ai/src/tactical/production.rs, sole-city economy break-out. Step 0 in pick_for_city interjects ONE production/infrastructure building once a threatened sole-city AI meets a 2-defender floor (SOLE_CITY_ECON_MIN_DEFENDERS=2) and holds fewer than SOLE_CITY_ECON_TARGET=2 buildings, escaping the step-1 perpetual-military loop that left P1 building ZERO buildings in 10/10 p1-29d seeds. Gated on the live `sole_city_threatened = city_count==1 && threatened` var (production.rs:301), so multi-city/unthreatened players are untouched. 4 new unit tests (builds production when threatened, stops at target, requires min defenders, multi-city unaffected). The RL probe behind it (p1-29e divergence mining) found the trained policy's winning lever is a *production* building (forge), never a science building, so the prior p1-29b science uplift was misdirected AND unreachable; this break-out is the mechanism that finally moves P1 off tier_peak=1."
- "src/simulator/crates/mc-ai/src/tactical/production.rs (commit 8ebebd68f, 2026-05-27) — sole-city economy break-out (SOLE_CITY_ECON_MIN_DEFENDERS=2, SOLE_CITY_ECON_TARGET=2; step-0 interject in pick_for_city, gated on the live `sole_city_threatened` var at production.rs:301; 4 unit tests). NOTE — NOT demonstrated as the cause of the gate pass below: per p1-29e's controlled analysis, P1's `mil` snapshot is 0 in 10/10 seeds so the `own_mil>=2` floor never fires and the break-out completes ZERO buildings; p1-29e's own attribution gate is NOT MET. This patch is in-tree and unit-tested but its in-game effect is unproven. See [[p1-29e]]."
- "cargo test -p mc-ai --lib: 265 passed (incl. 4 econ break-out tests); sole_city_priors 9 passed — no regressions on prior consumers"
- "GATE PASS on apricot batch 20260527_213814 (smoke, 10 seeds, T300, builds origin/main carrying 8ebebd68f, scored by tools/sole-city-gate.py): 10/10 alive-aware seeds with P0_tp≥2 AND P1_tp≥2 (P1 tier_peak range 2–6), median game length 89 ≤ 384. Per-seed P1_tp: s1=2 s2=2 s3=3 s4=3 s5=5 s6=2 s7=5 s8=2 s9=6 s10=2. Identical turn counts/outcomes to the p1-29d 0/10 baseline (same deterministic scenarios); the ONLY delta is P1's mid-game tier lifting 1→≥2 via the economy break-out. `tier_peak` is research-derived (auto_play.gd:2585 = max `era` across `researched_techs`, NOT building-derived), so P1_tp≥2 reflects genuine era-2 *tech research*, not a side-effect of the forge build: the break-out frees P1 from the perpetual-military loop and the p1-29c Research-priority uplift then drives the actual research. Surviving seeds s5/s9 confirm sustained progression (tp=5/6). NOTE: tier_peak is a high-water mark; P1 still loses its capital in 8/10 seeds (cities=0, lost=1) before T300. Surviving to end-game is p1-29d's combat-balance domain, NOT this objective's gate. This objective's scope is the research/economy *path*, and P1 now reaches tier 2–6 before elimination."
- "GATE PASS (outcome, not attribution) on apricot batch 20260527_213814 (smoke, 10 seeds, T300, fresh build of origin/main, scored by tools/sole-city-gate.py): 10/10 alive-aware seeds with P0_tp≥2 AND P1_tp≥2 (P1 tier_peak range 2–6), median length 89 ≤ 384. Per-seed P1_tp: s1=2 s2=2 s3=3 s4=3 s5=5 s6=2 s7=5 s8=2 s9=6 s10=2. Independently reproduced by p1-29e's local working-tree batch (also 10/10). `tier_peak` is research-derived (auto_play.gd:2585 = max `era` across `researched_techs`), so P1_tp≥2 is genuine era-2 tech research. ATTRIBUTION: the tier_peak=1 symptom is resolved on current main, but p1-29e establishes it is research-driven main-branch drift (cumulative AI changes), NOT the economy break-out (which builds 0 buildings here) and NOT necessarily this objective's action-priority uplift either. A clean HEAD-vs-HEAD+patch before/after to isolate the cause is deferred (p1-29e). NOTE: tier_peak is a high-water mark; P1 still loses its capital in 8/10 seeds before T300, and end-game survival is p1-29d's domain. Closing on the OUTCOME gate (P1 has a viable mid-game tier-2 research path on current main), with mechanism attribution explicitly handed to p1-29e."
---
## Summary
@@ -32,7 +32,7 @@ Cycle-48's three fix sites (evaluator `tech_weight_mult=2.5`, rollout `tech_coef
## Acceptance criteria
- [x] **Sole-city tier-2 reach in 10-seed batch**: ≥ 7/10 alive-aware seeds (both `p0_tp ≥ 2 AND p1_tp ≥ 2`). **CLOSED 2026-05-27 on apricot batch `20260527_213814` (smoke, 10 seeds, T300, origin/main @ 8ebebd68f): 10/10 PASS** (P1 `tier_peak` 2–6, median length 89 ≤ 384). Earlier history: 2026-05-15 batch `20260515_215705` was 0/10 (P1_tp=1 in all seeds); the `+0.40 Settle / +0.20 Defend / +0.50 Research` action-priority uplifts could not move it because the trailing AI never escaped the step-1 perpetual-military production loop to build *anything*. The fix was the **sole-city economy break-out** in `tactical/production.rs` (commit 8ebebd68f, evidence above), driven by the p1-29e RL probe finding that the winning lever is a production building, not science. P1 now reaches tier 2 before elimination in every seed. (P1 still loses its capital in 8/10 seeds; end-game *survival* is owned by p1-29d, not this gate.)
- [x] **Sole-city tier-2 reach in 10-seed batch**: ≥ 7/10 alive-aware seeds (both `p0_tp ≥ 2 AND p1_tp ≥ 2`). **OUTCOME GATE MET 2026-05-27 on apricot batch `20260527_213814` (smoke, 10 seeds, T300, fresh origin/main): 10/10 PASS** (P1 `tier_peak` 2–6, median length 89 ≤ 384), independently reproduced by p1-29e's local batch. The 2026-05-15 batch `20260515_215705` was 0/10 (P1_tp=1 everywhere); the symptom is resolved on current main. ATTRIBUTION CAVEAT: per p1-29e's controlled analysis the lift is research-driven main-branch drift, NOT a single identified intervention. The economy break-out builds 0 buildings in this regime (its `own_mil>=2` floor never fires) and the action-priority uplifts are likewise unproven as the cause. The clean HEAD-vs-HEAD+patch before/after to isolate the mechanism is deferred to p1-29e. The bullet's literal acceptance condition (10/10 ≥ 7/10) is met on current main; closing on that outcome, not on a proven causal chain. (P1 still loses its capital in 8/10 seeds; end-game *survival* is owned by p1-29d.)
- [x] **Tactical action priority — "found second city when sole-city and threatened"**: when `city_count == 1 && threat_level > 0.4 && available_founder_action`, raise founder-action prior. Implemented as `PersonalityPriors::action_prior_with_context` with `+0.40` Settle and `+0.20` Defend uplift under `SituationalContext { sole_city_threatened: true, .. }`. Threat signal derived from any opponent's `force_rel[me] ≥ 1` (u16 unit-count proxy adapted from the spec's `> 0.4` ratio because `force_rel` is not a [0,1] ratio). Wired into MCTS tree-selection via `TreeState::action_prior` impl on `GameRolloutState`. Evidence: `policy.rs::action_prior_with_context`, `rollout.rs::TreeState::action_prior`.
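
A minimal sketch of how the context-dependent prior described in the criteria above might compose. This is not the code in `policy.rs`. The uplift values (`+0.40` Settle, `+0.20` Defend, `+0.50` Research) and the `force_rel[me] ≥ 1` threat proxy come from the text. The `Action` enum, the `from_counts` constructor, the `tech_below_median` field, and the rule that the Research uplift applies only when tech is below the median are assumptions made for illustration.

```rust
// Hypothetical sketch: type and function shapes are assumed, not copied from mc-ai.

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Action {
    Settle,
    Defend,
    Research,
    Other,
}

#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct SituationalContext {
    pub sole_city_threatened: bool,
    pub tech_below_median: bool, // assumed field name
}

impl SituationalContext {
    /// Assumed stand-in for `from_abstract`: a player is "sole-city threatened"
    /// when it owns exactly one city and any opponent's unit-count proxy
    /// against it (`force_rel[me]`) is at least 1.
    pub fn from_counts(
        city_count: u32,
        opponent_force_vs_me: &[u16],
        my_tech: u32,
        median_tech: u32,
    ) -> Self {
        // With no opponents the slice is empty, so `any` yields false.
        let threatened = opponent_force_vs_me.iter().any(|&f| f >= 1);
        Self {
            sole_city_threatened: city_count == 1 && threatened,
            tech_below_median: my_tech < median_tech,
        }
    }
}

/// Adds the situational uplifts on top of a base prior.
/// A default (all-false) context returns `base` unchanged.
pub fn action_prior_with_context(base: f32, action: Action, ctx: SituationalContext) -> f32 {
    let mut prior = base;
    if ctx.sole_city_threatened {
        match action {
            Action::Settle => prior += 0.40,
            Action::Defend => prior += 0.20,
            _ => {}
        }
    }
    // Assumption: the Research uplift is keyed on trailing in tech.
    if ctx.tech_below_median && action == Action::Research {
        prior += 0.50;
    }
    prior
}
```

Keeping the uplifts additive and the neutral context an identity matches the properties the `sole_city_priors` tests are said to cover (neutral identity, additive composition), and means a multi-city or unthreatened player sees exactly the base prior.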