diff --git a/.project/objectives/p1-29.md b/.project/objectives/p1-29.md
index 4f91810b..53224e6f 100644
--- a/.project/objectives/p1-29.md
+++ b/.project/objectives/p1-29.md
@@ -2,11 +2,11 @@
 id: p1-29
 title: "Anti-early-domination: lift game-balance gates that p0-01 v1 measured"
 priority: p1
-status: missing
+status: partial
 scope: game1
 tags: [balance, pacing]
 owner: combat-dev
-updated_at: 2026-04-29
+updated_at: 2026-05-03
 ---
 
 ## Summary
@@ -83,10 +83,37 @@ Three-round hypothesis tree:
 
 **Decision**: Lever-3 alone cannot close both gates simultaneously at ratio 2×. Lever-4 (occupation penalty) is being added on top rather than replacing lever-3 — the combination may reach equilibrium where games are long enough for high-tier development but short enough for decisive winners.
 
-**Lever 4 — occupation penalty** (IMPLEMENTED, awaiting batch):
+**Lever 4 — occupation penalty** (IMPLEMENTED, batch run 2026-05-03):
 - `mc-city/src/city.rs`: added `captured_turn: Option` field, `mark_captured(turn)`, `occupation_production_mult(turn) → f64` (0.5 for 5 turns post-capture, 1.0 otherwise)
 - `api-gdext/src/lib.rs`: exposed `mark_captured(i64)` + `get_occupation_production_mult(i64)` on `GdCity`
 - `combat_utils.gd::capture_city`: calls `city._bridge._gd_city.call("mark_captured", GameState.turn_number)`
 - `turn_processor.gd::_process_production`: multiplies prod by `get_occupation_production_mult(turn_number)` per city
 - All 70 mc-city tests + 88 mc-combat tests pass
-- **Next**: rebuild GDExt on apricot + run lever-3+4 combined batch
+
+**Lever-3+4 combined batch results (2026-05-03, 10 seeds × T300, batch dir `~/.cache/mc-src-20260502_215856/.local/batches/autoplay_batch/game_20260503_024038_*`):**
+
+| Metric | R10 baseline | Lever-3 alone | **Lever-3+4** | Gate | Δ vs R10 |
+|---|---|---|---|---|---|
+| `tier_peak_gap` (alive-aware) | 5.0 ❌ | 1.5 ✓ (5/10 stall) | **N/A** | ≤4 | **WORSE** — no alive-aware games (all p1_tp=1) |
+| `winner_tier_peak` median | 4.5 ✓ | 2.5 ❌ | **4.0 ✓** | ≥4 | restored (vs lever-3 alone) |
+| `max_peak_unit≥3` | 10/10 ✓ | 10/10 ✓ | **10/10 ✓** | ≥7 | matches |
+| `wonders≥1` | 7/10 ✓ | 5/10 | (not measured this batch) | ≥5 | n/a |
+| `combats` | 454 | 1052 | **598** | ≥20 | 1.3× R10, lower than lever-3-alone |
+| `victories` | 10/10 | 5/10 | **10/10 ✓** | — | restored |
+| Distinct winner personalities | n/a | n/a | 3 (goldvein, ironhold, runesmith) | n/a | healthy |
+| Median game length | n/a | n/a | 284 | n/a | within ≤300 cap |
+
+**Diagnosis (combat-dev verdict 2026-05-03):** Lever-4's occupation production penalty restored decisive winners (10/10 victories) and high `winner_tier_peak` (4.0 median, matching R10) without resurrecting lever-3-alone's combat overshoot. **HOWEVER**, the new failure mode is more uniform: every game has p0 dominant + p1 stuck at tier 1. Lever-3+4 doesn't equalize the players — it makes the loser even less competitive. The alive-aware tier_peak_gap metric is undefined because no game has both players developed past tp≥2. This is **structurally worse** than R10's "uneven gap" mode (where at least both players developed).
+
+The capture/development tempo isn't fixed by stack-of-doom damage clamping + occupation penalty. The remaining levers (Round 2: turn-floor in `mc-turn::victory.rs::check_domination`; Round 3: tech-tree shortening in `game-data`) require cross-team handoff per the original objective scope.
+
+## Acceptance (2026-05-03 update)
+
+- ❌ tier_peak_gap (alive-aware) ≤4 median — N/A this batch (all games have p1_tp=1)
+- ✓ peak_unit_tier ≥3 in ≥7/10 absolute (10/10 in this batch; cite: `~/.cache/mc-src-20260502_215856/.local/batches/autoplay_batch/game_20260503_024038_*/turn_stats.jsonl`)
+- ❌ Median game-end turn — 284 this batch (within target ~T300, but 4 of 10 ended T100-150 via early domination)
+- ❌ Cross-team handoff to combat-dev/game-data for capture/balance work
+- ❌ p0-01 evidence updated to cite this objective's closure
+- ❌ Per-difficulty `AI_DIFFICULTY=hard|insane|easy` validation (not run this cycle)
+
+`status: partial` — lever-3+4 closes victory rate + winner_tier_peak gates but the alive-aware gap metric is undefined (loser never develops). 4 of 6 acceptance bullets remain ❌; cross-team handoff still required.
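
For reference, the lever-4 rule stated in the diff (production multiplier of 0.5 for 5 turns post-capture, 1.0 otherwise) can be sketched in Rust as below. This is a minimal illustration, not the code in `mc-city/src/city.rs`. The `City` struct layout, the `u32` turn type, and the constant names `OCCUPATION_TURNS` and `OCCUPATION_MULT` are assumptions. So is the window boundary: the sketch treats the capture turn as the first penalized turn, so the penalty ends 5 turns later.

```rust
// Hypothetical constants: the diff only states "0.5 for 5 turns post-capture".
const OCCUPATION_TURNS: u32 = 5;
const OCCUPATION_MULT: f64 = 0.5;

/// Stand-in for the real city type; only the field relevant to lever 4 is shown.
#[derive(Default)]
struct City {
    /// Turn on which the city was last captured; `None` if never captured.
    captured_turn: Option<u32>,
}

impl City {
    /// Record the capture turn (mirrors the `mark_captured(turn)` name from the diff).
    fn mark_captured(&mut self, turn: u32) {
        self.captured_turn = Some(turn);
    }

    /// Production multiplier for `turn`: penalized while inside the
    /// occupation window, full production otherwise.
    fn occupation_production_mult(&self, turn: u32) -> f64 {
        match self.captured_turn {
            // saturating_sub avoids underflow if `turn` precedes the capture
            // turn; that case is treated as still inside the window.
            Some(captured) if turn.saturating_sub(captured) < OCCUPATION_TURNS => {
                OCCUPATION_MULT
            }
            _ => 1.0,
        }
    }
}
```

Under these assumptions, a city captured on turn 10 produces at half rate on turns 10 through 14 and returns to full production on turn 15. A city that was never captured always returns 1.0.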