magicciv/.project/objectives/p1-29.md
2026-04-26 19:55:00 -07:00

| id | title | priority | status | scope | tags | owner | updated_at |
|---|---|---|---|---|---|---|---|
| p1-29 | Anti-early-domination: lift game-balance gates that p0-01 v1 measured | p1 | missing | game1 | balance, pacing | warcouncil | 2026-04-26 |

Summary

Split out from p0-01's original v1 sub-gates, which the AI-layer cycles (1, 2, 3) could not move because they measure emergent game-balance dynamics, not AI quality. p0-01 was closed as done on 2026-04-26 against Gate v2 (3/5 v1 sub-gates pass cleanly: tier_peak=4, wonders 7/10, combats=255). The 2 v1 sub-gates that v2 reframed away need a real owner:

  • tier_peak_gap ≤ 4 median: in games that end with a surviving pair, one player monopolizes tech (tp=6) while the other stagnates (tp=0), giving gap=5-6. The gap holds even with the alive-aware metric. Root cause: capture/combat dynamics let one player snowball without the other catching up. The loser stays alive but undeveloped.
  • peak_unit_tier ≥ 3 in ≥7/10 games absolute: currently 5/10. 4 of the 5 failures are early-domination games (T48-T121) that end before tier-3 tech has unlocked. The AI does deploy tier-3 units when they are available (80% of seeds reaching tp ≥3 also reach unit ≥3), but games end before tier-3 unlocks in half the seeds.
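
The two gates above can be expressed as a small check over per-game results. This is a minimal sketch, not the project's tooling: the record keys `alive_tier_peaks` and `peak_unit_tier` and the function name `gate_summary` are hypothetical, and the alive-aware filter is modeled simply as "at least two players survive".

```python
from statistics import median

def gate_summary(games, gap_limit=4, unit_tier_min=3, games_needed=7):
    """Evaluate both sub-gates over a batch of per-game records.

    Each record is a dict with two HYPOTHETICAL keys:
      alive_tier_peaks: tier_peak of every player still alive at game end
      peak_unit_tier:   highest unit tier fielded by anyone in the game
    """
    # Alive-aware gap: only games with at least two survivors contribute.
    gaps = [
        max(g["alive_tier_peaks"]) - min(g["alive_tier_peaks"])
        for g in games
        if len(g["alive_tier_peaks"]) >= 2
    ]
    median_gap = median(gaps) if gaps else None

    # Absolute count: every game is included, no game-state filter.
    unit_hits = sum(1 for g in games if g["peak_unit_tier"] >= unit_tier_min)

    return {
        "median_gap": median_gap,
        "gap_pass": median_gap is not None and median_gap <= gap_limit,
        "unit_hits": unit_hits,
        "unit_pass": unit_hits >= games_needed,
    }
```

A game where the loser was eliminated contributes to the unit-tier count but not to the gap median, which is why the two gates can disagree on the same batch.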

Cycle-3 attempted multiple AI-layer levers and confirmed they DON'T move these gates:

  • Tactical DOMINANCE_FACTOR bump (production.rs 1.25→2.0): no effect on outcome
  • Tactical dominance lerp bump (thresholds.rs 1.5→2.0/2.5 baseline): caused REGRESSION on tier_peak (faster opportunist wins)
  • Both were reverted because the strategic MCTS doesn't pick attack actions: it only picks SpawnUnit/FoundCity/Idle, per the action_prior in mc-turn/src/snapshot.rs:204-214. The capture/development tempo is governed by mc-turn capture mechanics + mc-economy growth rates, NOT by AI scoring weights.

Real levers (cross-team scope):

  • mc-combat / mc-turn capture mechanics: increase city HP, lengthen siege duration, add capital-recapture cost, weaken early-rush combat math.
  • mc-economy growth rates: faster baseline tech research, lower tier-3 prereq cost, give players tech catch-up bonus when behind.
  • mc-turn turn-limit floor: refuse to award domination victory before T150 (force games to mid-game minimum).

Pick one or compose multiple. Each requires the corresponding team-lead's involvement.
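
As an illustration of the turn-limit floor lever, the rule can be modeled in a few lines. This is a Python model of the idea only, not the mc-turn Rust code: `award_domination`, `deferred_games`, and the `domination_met` flag are hypothetical names, and the floor value is a parameter.

```python
def award_domination(turn, domination_met, floor_turn=150):
    """Award a domination victory only once the floor turn is reached.

    domination_met is whatever the existing victory check reports;
    this model does not assume how that check is computed.
    """
    return domination_met and turn >= floor_turn

def deferred_games(end_turns, floor_turn=150):
    """Return the end turns that a floor would have pushed later."""
    return [t for t in end_turns if t < floor_turn]
```

With a floor of T150, a game that currently ends at T48 or T121 would keep running. The floor does not on its own guarantee that the trailing player catches up in the extra turns.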

User-stated targets (2026-04-26)

User clarified the intended game-feel envelope:

  • Game length: ~T300 typical, ≤T500 cap. Currently 50% of games end T48-T200 via early domination. The lower bound (T300 typical) is the binding constraint.
  • Hard/Insane AI should reach tier_peak ≥ 10 (top of era ladder) by T200. Today only the longest games (T408-T500) reach tier_peak=10 and only Normal difficulty has been measured at all (no Easy/Hard/Insane batches in current corpus).

These targets compose with the structural gates below. Even if the gap/unit-tier gates pass at Normal, the difficulty calibration must be re-run to validate Hard/Insane reach the targets.
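
The game-length envelope can be checked against a batch's end turns with a sketch like the following. The function name `length_envelope` and its output keys are hypothetical, and the 200-turn early cutoff is taken from the T48-T200 range quoted above.

```python
from statistics import median

def length_envelope(end_turns, cap=500, early_cutoff=200):
    """Summarize a batch of game-end turns against the stated envelope."""
    return {
        # Compare against the ~T300 typical target.
        "median_end": median(end_turns),
        # Share of games ending at or before the early cutoff.
        "early_share": sum(t <= early_cutoff for t in end_turns) / len(end_turns),
        # Count of games that ran past the hard cap (should be zero).
        "over_cap": sum(t > cap for t in end_turns),
    }
```

For example, the made-up batch `[48, 121, 150, 180, 200, 300, 408, 450, 500, 500]` (illustrative values, not measured data) gives a median of 250, an early share of 0.5, and no games over the cap.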

Plan to make it happen: see experiments/p1-29-tier10-by-t200.md

Three-round hypothesis tree:

  1. Round 1 (H1, in-scope warcouncil): bump difficulty.json::insane.research_mult 1.4 → 3.0 and hard.research_mult 1.2 → 2.0. Data-only. Validate by running AI_DIFFICULTY=hard|insane tools/autoplay-batch.sh 10 500 once per difficulty. If the median winner reaches tier_peak ≥ 10 at any turn, H1 is partially confirmed; if it does so by T200, H1 is fully confirmed. This is the NEXT ACTION.
  2. Round 2 (H1 + H2, cross-team to combat-dev): if games still end T48-T200 via early domination, add turn-floor in mc-turn::victory.rs::check_domination (skip domination check before T100).
  3. Round 3 (H3, cross-team to game-data): if Rounds 1+2 still insufficient, shorten the tech tree (audit which mid-tier techs can be merged or skipped).
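
The Round 1 confirmation rule can be sketched as a classifier over per-game results. This is one reading of the rule, under stated assumptions: the input is the first turn at which each game's winner reached tier_peak ≥ 10 (or `None` if never), games that never reach it count as infinitely late, and the name `classify_h1` is hypothetical.

```python
import math
from statistics import median

def classify_h1(first_turn_at_tier10, target_turn=200):
    """Classify H1 from per-game time-to-tier-10 for the winner.

    first_turn_at_tier10: one entry per game, an int turn or None.
    """
    # Treat "never reached" as +inf so it sorts after every real turn.
    turns = [math.inf if t is None else t for t in first_turn_at_tier10]
    med = median(turns)
    if med <= target_turn:
        return "fully confirmed"
    if med < math.inf:
        return "partially confirmed"
    return "not confirmed"
```

In a 10-game batch the median averages the 5th and 6th values, so under this reading at least 6 winners must reach tier 10 before H1 counts as even partially confirmed.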

Acceptance

  • 10-seed tools/autoplay-batch.sh 10 300 Normal-Normal batch shows median tier_peak_gap (alive-aware, both alive players developed past tp ≥2) ≤ 4
  • Same batch shows ≥7/10 games reach peak_unit_tier ≥ 3 absolute (no game-state filter)
  • Median game-end turn shifts from current ~T150 toward T300 typical, ≤T500 cap per user 2026-04-26 directive
  • Cross-team handoff exists in .project/handoffs/ documenting which team-lead owns the capture/balance change
  • p0-01's evidence updated to cite this objective's closure as the source of v1-style symmetry/unit-tier gate satisfaction
  • Per-difficulty validation: AI_DIFFICULTY=hard tools/autoplay-batch.sh 10 500 shows the median winner reaching tier_peak ≥ 10 by T200 (per user directive). AI_DIFFICULTY=insane shows the same or stronger progression. AI_DIFFICULTY=easy shows clearly weaker progression. To measure, use tools/time-to-peak-unit.py and a new tools/time-to-tier-peak.py (the analogous metric for tier_peak rather than unit tier).
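
The core of the proposed tools/time-to-tier-peak.py could look like the sketch below. The input format is an assumption: a sequence of `(turn, tier_peak)` samples for one player, which may not match what the autoplay logs actually emit, and the function name is hypothetical.

```python
def time_to_tier_peak(samples, threshold=10):
    """Return the first turn at which tier_peak reached the threshold.

    samples: iterable of (turn, tier_peak) pairs, in any order
             (ASSUMED format, not the real log schema).
    Returns None if the threshold was never reached.
    """
    hits = [turn for turn, tier_peak in samples if tier_peak >= threshold]
    return min(hits) if hits else None
```

For a winner sampled as `[(50, 3), (120, 7), (190, 10), (250, 10)]` this returns 190, which would meet the T200 target. Returning `None` rather than a sentinel turn keeps games that never reach the threshold distinguishable when medians are computed across a batch.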