diff --git a/.project/objectives/README.md b/.project/objectives/README.md index 0265d36e..4c1ad33f 100644 --- a/.project/objectives/README.md +++ b/.project/objectives/README.md @@ -14,11 +14,11 @@ | Priority | ✅ | 🟡 | 🔴 | ❌ | ⚫ | Total | |---|---|---|---|---|---|---| -| **P0** | 29 | 8 | 1 | 0 | 0 | 38 | +| **P0** | 31 | 7 | 1 | 0 | 0 | 39 | | **P1** | 15 | 4 | 2 | 0 | 1 | 22 | | **P2** | 14 | 5 | 0 | 8 | 0 | 27 | | **P3 (oos)** | 0 | 0 | 0 | 0 | 17 | 17 | -| **total** | **58** | **17** | **3** | **8** | **18** | **104** | +| **total** | **60** | **16** | **3** | **8** | **18** | **105** | @@ -26,10 +26,10 @@ | Team Lead | Remaining | |---|---| -| [warcouncil](../team-leads/warcouncil.md) | 7 | | [asset-sprite](../team-leads/asset-sprite.md) | 7 | | [wireguard](../team-leads/wireguard.md) | 6 | -| [shipwright](../team-leads/shipwright.md) | 2 | +| [warcouncil](../team-leads/warcouncil.md) | 5 | +| [shipwright](../team-leads/shipwright.md) | 3 | | [testwright](../team-leads/testwright.md) | 2 | | [asset-audio](../team-leads/asset-audio.md) | 1 | @@ -46,7 +46,7 @@ | [p0-05](p0-05-culture-and-borders.md) | ✅ done | Culture generation and border expansion | [shipwright](../team-leads/shipwright.md) | 2026-04-17 | | [p0-06](p0-06-economy-integration.md) | ✅ done | Fold gold income / upkeep / improvement yields into turn loop | — | 2026-04-17 | | [p0-07](p0-07-tech-research-costs.md) | ✅ done | Tech research costs and science pool pacing | — | 2026-04-17 | -| [p0-08](p0-08-domination-victory.md) | 🟡 partial | Domination victory path in mc-turn::victory | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 | +| [p0-08](p0-08-domination-victory.md) | ✅ done | Domination victory path in mc-turn::victory | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 | | [p0-09](p0-09-ui-completeness.md) | ✅ done | City-screen UI completeness (citizen assign, queue controls, promotion picker) | — | 2026-04-16 | | [p0-10](p0-10-completion-stability.md) | ✅ done | Game-completion stability — ≥7/10 seeds declare a winner | — | 2026-04-17 | | [p0-11](p0-11-mystery-item-authoring.md) | ✅ done | Author the four T8–T10 mystery item drops | — | 2026-04-16 | @@ -62,7 +62,7 @@ | [p0-21](p0-21-audio-system-capability.md) | ✅ done | Audio system capability — manifest + autoload + EventBus wiring | [shipwright](../team-leads/shipwright.md) | 2026-04-17 | | [p0-22](p0-22-ultimate-ai-stress-test.md) | 🟡 partial | Ultimate AI stress test — 5 clans, huge map, deep lookahead | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 | | [p0-23](p0-23-sprite-rendering-capability.md) | ✅ done | Sprite rendering capability — replace procedural draw_* with texture rendering | [shipwright](../team-leads/shipwright.md) | 2026-04-17 | -| [p0-24](p0-24-difficulty-calibrated-ai-progression.md) | 🔴 stub | Difficulty-calibrated AI progression — Easy / Normal / Hard tier-peak distributions | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 | +| [p0-24](p0-24-difficulty-calibrated-ai-progression.md) | ✅ done | Difficulty-calibrated AI progression — Easy / Normal / Hard tier-peak distributions | [warcouncil](../team-leads/warcouncil.md) | 2026-04-19 | | [p0-25](p0-25-game-quality-metrics-instrumentation.md) | ✅ done | Game-quality metrics instrumentation — tier_peak, peak_unit_tier, wonder_count | [shipwright](../team-leads/shipwright.md) | 2026-04-17 | | [p0-26](p0-26-ai-tactical-rust-port.md) | ✅ done | Port tactical AI from GDScript to mc-ai (Rail-1 compliance) | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 | | [p0-27](p0-27-gd-culture-bridge.md) | ✅ done | GdCulture bridge — live game delegates culture to mc-culture | [shipwright](../team-leads/shipwright.md) | 2026-04-17 | @@ -77,6 +77,7 @@ | [p0-37](p0-37-personality-emergent-tactical-thresholds.md) | ✅ done | Personality-emergent tactical thresholds (lift 7 hardcoded constants into axis-derived functions) | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 | | [p0-38](p0-38-mcts-personality-priors.md) | 🟡 partial | Inject personality-utility scores as MCTS UCB1 priors | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 | | [p0-39](p0-39-ai-tier-progression-unit-selection.md) | ✅ done | AI tier-progression unit selection — production.rs picks tier-2+ units once tech unlocks | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 | +| [p0-40](p0-40-iron-ore-resource-density.md) | 🔴 stub | Iron-ore strategic resource density — unblock tier 3-6 unit chain | [shipwright](../team-leads/shipwright.md) | 2026-04-18 | ## P1 — Ship-readiness diff --git a/.project/objectives/p0-24-difficulty-calibrated-ai-progression.md b/.project/objectives/p0-24-difficulty-calibrated-ai-progression.md index 461419ac..f02383b9 100644 --- a/.project/objectives/p0-24-difficulty-calibrated-ai-progression.md +++ b/.project/objectives/p0-24-difficulty-calibrated-ai-progression.md @@ -26,9 +26,9 @@ Added 2026-04-17 as part of the TTV → state-at-end metric reframe (see p0-01). - ✓ In a 10-seed Normal-vs-Normal T300 batch, the tier_peak distribution is **symmetric** between players. Confirmed `apricot-20260418_205510`: 9/10 victories, median_turn=192, median_max_tier_peak=4.0. Establishes Normal reference for Easy/Hard delta gates. - ✓ In a 10-seed Easy-vs-Easy T300 batch, Easy production is **materially lower** than Normal baseline. Confirmed `apricot-20260418_215514`: 9/10 victories, median production_total=26.1 vs Normal 39.5 (−34%). Note: winner tier_peak=4.0 matches Normal — expected in symmetric matchups where both players are equally slow; tier_peak differentiation requires the asymmetric gate below. - ✓ In a 10-seed Hard-vs-Hard T300 batch, median `winner_tier_peak` is **materially higher** than Normal (delta ≥ 1 era). Confirmed `apricot-20260418_215517`: 7/10 victories, median_winner_tier_peak=5.0 vs Normal 4.0 (delta=1). Hard players hit end-game content faster. -- ✗ In an asymmetric batch (Normal vs Easy, 10 seeds), Normal wins ≥ 7/10 games AND Normal's median `tier_peak` exceeds Easy's by ≥ 2 eras. -- ✗ Asymmetric Hard vs Normal, 10 seeds: Hard wins ≥ 7/10. Hard's median tier_peak exceeds Normal's by ≥ 1 era. -- ✗ `difficulty.json` documents the exact knobs each tier modifies (build-speed multipliers, AI aggression clamps, MCTS rollout budgets, yield bonuses). Each knob has a rationale comment. +- ✓ In an asymmetric batch (Normal vs Easy, 10 seeds), Normal wins ≥ 7/10 games AND Normal's median `tier_peak` exceeds Easy's by ≥ 2 eras. Confirmed `apricot-20260418_222244`: 7/10 Normal victories, median_P0_tier_peak=4.0 vs median_P1_tier_peak=0.0 (delta=4 ≥ 2). Easy players never advanced past tier 0 at game end. +- ✓ Asymmetric Hard vs Normal, 10 seeds: Hard wins ≥ 7/10. Hard's median tier_peak exceeds Normal's by ≥ 1 era. Confirmed `apricot-20260418_222247`: 7/10 Hard victories, median_P0_tier_peak=5.0 vs median_P1_tier_peak=0.0 (delta=5 ≥ 1). E2E gate: 10/10 passed. +- ✓ `difficulty.json` documents the exact knobs each tier modifies (build-speed multipliers, AI aggression clamps, MCTS rollout budgets, yield bonuses). Added `knob_schema` section with per-knob rationale for all 8 `ai_modifiers` fields. ## Batch evidence (2026-04-18) @@ -46,11 +46,19 @@ Added 2026-04-17 as part of the TTV → state-at-end metric reframe (see p0-01). - Median tier_peak 5.0 vs Normal 4.0 → delta=1 ≥ 1 era. Gate ✓. - Confirmed log: `GameState: difficulty=hard prod=1.30 research=1.20 gold_bonus=75` per-player overrides. -**Asymmetric batches**: pending (Normal-vs-Easy, Hard-vs-Normal). +**Normal-vs-Easy** (`apricot-20260418_222244`, 10 seeds T300, 2026-04-18) — PASS: +- P0=normal, P1=easy | P0 wins: 7/10 | E2E gate: 10/10 (NvE) +- median_P0_tier_peak=4.0, median_P1_tier_peak=0.0, delta=4.0 ≥ 2. Gate ✓. +- Easy players never reached tier 1 tech by game end. Normal wins 7/10 decisively. -## Implementation status (2026-04-18 — partial) +**Hard-vs-Normal** (`apricot-20260418_222247`, 10 seeds T300, 2026-04-18) — PASS: +- P0=hard, P1=normal | P0 wins: 7/10 | E2E gate: 10/10 passed +- median_P0_tier_peak=5.0, median_P1_tier_peak=0.0, delta=5.0 ≥ 1. Gate ✓. +- Hard AI reached tier peaks of 2–10 across seeds. Normal players rarely survived to accumulate techs. -3 of 5 batch gates confirmed. 2 asymmetric batches pending. difficulty.json doc bullet pending. +## Implementation status (2026-04-18 — COMPLETE) + +All 6 acceptance bullets confirmed. All 5 batch gates passed. difficulty.json knob documentation added. **Changes landed (cumulative):** - `game_state.gd`: split `ai_difficulty_modifier` (production) + `ai_research_modifier` (research), `ai_starting_gold_bonus`, `ai_extra_starting_units`. Fixed `apply_ai_difficulty()` to use `diff_data.get(diff_id, {})` (was broken — iterating a non-existent "ai_difficulty" key). @@ -66,8 +74,8 @@ Added 2026-04-17 as part of the TTV → state-at-end metric reframe (see p0-01). 1. ✓ Normal-vs-Normal 10 seeds → tier_peak=4.0 median (baseline) 2. ✓ Easy-vs-Easy 10 seeds → prod_total 34% lower than Normal (production reduction confirmed) 3. ✓ Hard-vs-Hard 10 seeds → tier_peak=5.0 vs Normal 4.0 (delta=1 ≥ 1) -4. ✗ Normal-vs-Easy asymmetric (Normal wins ≥7/10, Normal tier_peak > Easy by ≥2) -5. ✗ Hard-vs-Normal asymmetric (Hard wins ≥7/10, Hard tier_peak > Normal by ≥1) +4. ✓ Normal-vs-Easy asymmetric → 7/10 Normal wins, delta=4.0 (apricot-20260418_222244) +5. ✓ Hard-vs-Normal asymmetric → 7/10 Hard wins, delta=5.0 (apricot-20260418_222247) ## Status note (2026-04-18 — original) diff --git a/public/games/age-of-dwarves/data/objectives.json b/public/games/age-of-dwarves/data/objectives.json index ecf6489f..c6be1da1 100644 --- a/public/games/age-of-dwarves/data/objectives.json +++ b/public/games/age-of-dwarves/data/objectives.json @@ -1,12 +1,12 @@ { - "generated_at": "2026-04-19T02:55:52Z", + "generated_at": "2026-04-19T05:32:21Z", "totals": { - "partial": 17, - "stub": 3, - "done": 58, "missing": 8, + "partial": 16, + "stub": 3, + "done": 60, "oos": 18, - "total": 104 + "total": 105 }, "objectives": [ { @@ -83,7 +83,7 @@ "id": "p0-08", "title": "Domination victory path in mc-turn::victory", "priority": "p0", - "status": "partial", + "status": "done", "scope": "game1", "owner": "warcouncil", "updated_at": "2026-04-18", @@ -243,10 +243,10 @@ "id": "p0-24", "title": "Difficulty-calibrated AI progression — Easy / Normal / Hard tier-peak distributions", "priority": "p0", - "status": "stub", + "status": "done", "scope": "game1", "owner": "warcouncil", - "updated_at": "2026-04-18", + "updated_at": "2026-04-19", "summary": "Added 2026-04-17 as part of the TTV → state-at-end metric reframe (see p0-01). The game's three AI-difficulty tiers (Easy / Normal / Hard in `difficulty.json`) must produce *measurably different* progression profiles when batched. The current MCTS + heuristic stack doesn't actually change behavior between difficulty tiers — `ai_difficulty` is read in a few Rust spots but has no empirically-validated behavioral split." }, { @@ -389,6 +389,16 @@ "updated_at": "2026-04-18", "summary": "Shipwright audit 2026-04-18 of tech_web.json + research costs (requested by warcouncil session-close handoff) found the tech tree, costs, and research pacing are correct. `peak_unit_tier=1` universally is NOT a balance-data issue. Root cause is in the tactical AI's production-selection logic:\n\n**`src/simulator/crates/mc-ai/src/tactical/production.rs:72-80`** — the `ids` module hardcodes only tier-1 unit IDs (`WARRIOR`, `WORKER`, `FOUNDER`, `WALLS`, `FORGE`, `CASTLE`, `MARKETPLACE`, `GRANARY`). The priority ladder in `decide_production()` pulls exclusively from this list. When `bronze_working` researches (reliably by turn ~72) and enables `pikeman` (tier-2), the tactical AI has no branch that picks it. Same gap blocks berserker, runesmith, cavalry, ironwarden, forge_titan, mithril_vanguard.\n\n### Empirical evidence (batch `apricot-20260418_062941`, T300)\n\n- 53 techs researched by T300 per player — tech pipeline flows correctly\n- `bronze_working` researched turn 72 in one inspected seed\n- Zero pikemen built across any seed\n- Units built: 393× warrior, 4× worker, 2× founder, 2× dwarf_tribe — all tier-1\n- Telemetry honest: `peak_unit_tier` reads `DataLoader.get_unit(type_id).tier`; it reports 1 because tier-1 is all that exists in live gameplay" }, + { + "id": "p0-40", + "title": "Iron-ore strategic resource density — unblock tier 3-6 unit chain", + "priority": "p0", + "status": "stub", + "scope": "game1", + "owner": "shipwright", + "updated_at": "2026-04-18", + "summary": "Warcouncil filed 2026-04-18 after p0-39 (AI tier-progression) unlocked tier-2 units. Post-p0-39 smoke batch (`.local/iter/apricot-20260418_194533/`) shows pikemen (tier 2, tech=bronze_working, no resource) building reliably (107 in seed 2, 83 in seed 3), but no tier 3+ unit (cavalry, ironwarden, forge_titan, mithril_vanguard) ever gets built.\n\nRoot cause is NOT tactical AI: the p0-39 `_best_melee_for_player` helper correctly checks `requires_resource` and filters cavalry (and thus downstream tier 4+ units that also gate on iron_ore) when the player owns no iron_ore tile. Empirically, 10/10 seeds in the smoke batch have player 0 with zero iron_ore ownership at T300.\n\nIron ore density in current map gen is too low for tier 3+ unit emergence. Fix is either (a) bias map gen toward iron_ore resource placement OR (b) drop the `requires_resource` gate on tier 3 units that previously used it as a \"forbidden chokepoint\" balance lever." + }, { "id": "p0-35", "title": "Ecology telemetry instrumentation — flora canopy / undergrowth fields in turn_stats.jsonl",