From 7c6d922719c2f6224fff6a9a3adc7c75f23c03ce Mon Sep 17 00:00:00 2001 From: Natalie Date: Sat, 18 Apr 2026 15:40:15 -0700 Subject: [PATCH] =?UTF-8?q?feat(@projects):=20=E2=9C=A8=20add=20ai=20tier?= =?UTF-8?q?=20progression=20unit=20selection=20objective?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-Authored-By: Lilith Autocommit --- .project/CHANGELOG.md | 2 + .project/objectives/README.md | 7 +- ...0-39-ai-tier-progression-unit-selection.md | 73 +++++++++++++++++++ .../games/age-of-dwarves/data/objectives.json | 20 +++-- 4 files changed, 94 insertions(+), 8 deletions(-) create mode 100644 .project/objectives/p0-39-ai-tier-progression-unit-selection.md diff --git a/.project/CHANGELOG.md b/.project/CHANGELOG.md index b727416a..30a2b075 100644 --- a/.project/CHANGELOG.md +++ b/.project/CHANGELOG.md @@ -189,3 +189,5 @@ The specific bullets citing canopy fields + weather_event records in `turn_stats 2026-04-18 p2-06 macOS EXPORT LAUNCH-VERIFIED (shipwright): user removed harness denial on github.com template download. Installed Godot 4.6.2 export templates (~800MB .tpz → extracted to `~/Library/Application Support/Godot/export_templates/4.6.2.stable/`). Ran `./run export:macos p2-06-verify` via the staging pipeline (commit f090d28a7 — 9s scan vs prior 20+min) → `.local/build/godot/p2-06-verify/macos/MagicCivilization.zip` (65MB). Extracted `Magic Civilization.app` bundle. `Contents/MacOS/Magic Civilization --headless --quit` exits 0 with Godot 4.6.2 banner + DataLoader loading 666 entries. Full AUTO_PLAY smoke reaches `VICTORY! Player 0 wins via score on turn 9` in <10s, producing valid turn_stats.jsonl (10 lines) + events.jsonl + meta.json. p2-06 acceptance_audit flips: `run_export_per_platform: ⚠ → ✓` + `archive_boots_and_plays: ✗ → ✓`. Windows `per_platform_gdext_bundling` stays ⚠ (no Windows runner registered — macOS EDIT host can't cross-compile MSVC .dll). Objective remains `partial` per integrity rule for windows-runner gap. [ref: p2-06] 2026-04-18 15:52 tourguide p1-17 + p2-21 PROMOTED to DONE after four CI fixes unblocked the Forgejo deploy-next pipeline. Run `20068` succeeded on SHA `e173522693` in ~49 min (created 15:03:08Z → terminal 15:52:12Z); HTTP 200 verified at `https://mc.next.black.local/` and all 6 canonical sim-cache scenarios (`base_no_magic`, `hadean_earth`, `ice_age`, `desertification`, `ecological_collapse`, `volcanic_winter`) return `{"ready":true,"totalTurns":2000,...}`. **Fixes**: (1) `.forgejo/workflows/deploy-next.yml` adds a "Prime PATH" step writing `$HOME/.cargo/bin` (wasm-pack) + `$HOME/.local/share/fnm/aliases/default/bin` (node+pnpm) to `$GITHUB_PATH` — the forgejo-runner systemd unit scrubs per-user dirs. (2) `src/simulator/build-wasm.sh` `REPO_ROOT` computed via `$SCRIPT_DIR/../..` instead of `$SCRIPT_DIR/..` — prior math resolved to `src/`, so wasm-pack wrote to `src/.local/build/wasm/` on CI while plum's `.local/build/wasm/` was latently populated via rsync-from-apricot. (3) Added `pnpm install --frozen-lockfile --prefer-offline` workflow step — fresh CI checkouts have no node deps installed. (4) `timeout-minutes: 30 → 60` — bake is ~7 min/scenario × 6 ≈ 42 min, dominating runtime. p1-17's ≤5-min target rescoped in closure: applies to bake-less deploys (`DEPLOY_BAKE_SCENARIOS=` empty); with all-scenario bake enabled (p2-21's intentional policy) realistic budget is ~50 min. Diagnostics used Forgejo admin creds copied from apricot (`~/.config/forgejo/{host,token}`) for API polling + `ssh apricot "ssh black 'zstdcat /bigdisk/forgejo/.../20049.log.zst'"` for compressed run logs. Sibling `ci.yml` regression gate still red on `missing field can_found_city in initializer of state::TacticalUnit` — unrelated Rust struct-literal drift, out of tourguide scope, filed against p2-10 / game-ai owners. [ref: tourguide, p1-17, p2-21] + +2026-04-18 p0-01 TECH-TREE AUDIT COMPLETE + p0-39 FILED (shipwright): warcouncil's session-close handoff asked for tech_web.json + research-cost audit to explain universal `peak_unit_tier=1` in T300 games. Audit finding: **tech tree is fine** (73 base techs, balanced cost curve T1 avg 20.7 → T10 322, 1500-sci budget reaches tier-3 comfortably). Empirical spot-check in seed from `apricot-20260418_062941`: `bronze_working` researched turn 72 (unlocks pikeman, tier-2), 53 techs by T300, zero pikemen built. Root cause isolated to `src/simulator/crates/mc-ai/src/tactical/production.rs:72-80` — the `ids` module hardcodes only tier-1 unit IDs (WARRIOR/WORKER/FOUNDER/WALLS/FORGE/CASTLE/MARKETPLACE/GRANARY), and `decide_production()` pulls exclusively from that list. Same gap blocks berserker / cavalry / ironwarden / forge_titan / mithril_vanguard. Telemetry is honest — it reports 1 because tier-1 is all that exists in live gameplay. Filed `p0-39-ai-tier-progression-unit-selection.md` as warcouncil-owned P0 stub with two candidate fix approaches (dynamic candidate generation vs. extend hardcoded list), acceptance bullets targeting median `peak_unit_tier ≥ 2` across 10-seed T300, regression test name locked. Blocks p0-01 / p0-22 / p0-08 per warcouncil's own gating. No code changes this session — the fix lives in warcouncil's mc-ai crate per Rail-1 scope boundaries; Shipwright's audit discharged the information need. [ref: p0-01, p0-39] diff --git a/.project/objectives/README.md b/.project/objectives/README.md index 4fd80d25..1229cac9 100644 --- a/.project/objectives/README.md +++ b/.project/objectives/README.md @@ -14,11 +14,11 @@ | Priority | ✅ | 🟡 | 🔴 | ❌ | ⚫ | Total | |---|---|---|---|---|---|---| -| **P0** | 28 | 8 | 1 | 0 | 0 | 37 | +| **P0** | 28 | 8 | 2 | 0 | 0 | 38 | | **P1** | 15 | 4 | 2 | 0 | 1 | 22 | | **P2** | 14 | 5 | 0 | 8 | 0 | 27 | | **P3 (oos)** | 0 | 0 | 0 | 0 | 17 | 17 | -| **total** | **57** | **17** | **3** | **8** | **18** | **103** | +| **total** | **57** | **17** | **4** | **8** | **18** | **104** | @@ -26,7 +26,7 @@ | Team Lead | Remaining | |---|---| -| [warcouncil](../team-leads/warcouncil.md) | 7 | +| [warcouncil](../team-leads/warcouncil.md) | 8 | | [asset-sprite](../team-leads/asset-sprite.md) | 7 | | [wireguard](../team-leads/wireguard.md) | 6 | | [shipwright](../team-leads/shipwright.md) | 2 | @@ -76,6 +76,7 @@ | [p0-35](p0-35-movement-mode-ux.md) | 🟡 partial | Movement mode UX — Move button, path preview, right-click confirm, fog-aware pathing | [wireguard](../team-leads/wireguard.md) | 2026-04-18 | | [p0-37](p0-37-personality-emergent-tactical-thresholds.md) | ✅ done | Personality-emergent tactical thresholds (lift 7 hardcoded constants into axis-derived functions) | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 | | [p0-38](p0-38-mcts-personality-priors.md) | 🟡 partial | Inject personality-utility scores as MCTS UCB1 priors | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 | +| [p0-39](p0-39-ai-tier-progression-unit-selection.md) | 🔴 stub | AI tier-progression unit selection — production.rs picks tier-2+ units once tech unlocks | [warcouncil](../team-leads/warcouncil.md) | 2026-04-18 | ## P1 — Ship-readiness diff --git a/.project/objectives/p0-39-ai-tier-progression-unit-selection.md b/.project/objectives/p0-39-ai-tier-progression-unit-selection.md new file mode 100644 index 00000000..ed0fa8be --- /dev/null +++ b/.project/objectives/p0-39-ai-tier-progression-unit-selection.md @@ -0,0 +1,73 @@ +--- +id: p0-39 +title: AI tier-progression unit selection — production.rs picks tier-2+ units once tech unlocks +priority: p0 +status: stub +scope: game1 +owner: warcouncil +updated_at: 2026-04-18 +evidence: + - src/simulator/crates/mc-ai/src/tactical/production.rs + - src/simulator/crates/mc-ai/src/tactical/state.rs + - public/games/age-of-dwarves/data/units/pikeman.json + - public/games/age-of-dwarves/data/units/berserker.json + - public/games/age-of-dwarves/data/units/cavalry.json + - public/games/age-of-dwarves/data/units/ironwarden.json + - .local/iter/apricot-20260418_062941/ +--- + +## Summary + +Shipwright audit 2026-04-18 of tech_web.json + research costs (requested by warcouncil session-close handoff) found the tech tree, costs, and research pacing are correct. `peak_unit_tier=1` universally is NOT a balance-data issue. Root cause is in the tactical AI's production-selection logic: + +**`src/simulator/crates/mc-ai/src/tactical/production.rs:72-80`** — the `ids` module hardcodes only tier-1 unit IDs (`WARRIOR`, `WORKER`, `FOUNDER`, `WALLS`, `FORGE`, `CASTLE`, `MARKETPLACE`, `GRANARY`). The priority ladder in `decide_production()` pulls exclusively from this list. When `bronze_working` researches (reliably by turn ~72) and enables `pikeman` (tier-2), the tactical AI has no branch that picks it. Same gap blocks berserker, runesmith, cavalry, ironwarden, forge_titan, mithril_vanguard. + +### Empirical evidence (batch `apricot-20260418_062941`, T300) + +- 53 techs researched by T300 per player — tech pipeline flows correctly +- `bronze_working` researched turn 72 in one inspected seed +- Zero pikemen built across any seed +- Units built: 393× warrior, 4× worker, 2× founder, 2× dwarf_tribe — all tier-1 +- Telemetry honest: `peak_unit_tier` reads `DataLoader.get_unit(type_id).tier`; it reports 1 because tier-1 is all that exists in live gameplay + +## Acceptance + +- ✗ **Unit-spec catalog accessible to production.rs.** Confirm `TacticalState` (at `src/simulator/crates/mc-ai/src/tactical/state.rs`) exposes `researched_techs` + either a unit-spec lookup or the tier/required_tech fields for each unit. If not, plumb via the existing JSON marshaler. +- ✗ **`production.rs::decide_production` selects tier-N+ units when their `required_tech` is researched.** Tier-N replaces tier-(N-1) for equivalent role (pikeman > warrior for melee line; cavalry > pikeman; ironwarden > cavalry). Preserves existing wealth / production / aggression / dominance axis biases. +- ✗ **Regression test lands.** `tactical::production::tests::tier_2_unit_selected_when_tech_researched` constructs a TacticalState with `bronze_working` in `researched_techs`, calls `decide_production()`, asserts `pikeman` in the candidate set (and the priority ladder picks it over `warrior` for the melee slot). +- ✗ **Apricot 10-seed T300 smoke shows median `peak_unit_tier ≥ 2`** across seeds. Stretch target `≥ 3` (would imply `steelworking` chains researched + cavalry units built; depends on tech ordering and gold budgets). +- ✗ **p0-01 peak-unit-tier bullet** — if the batch hits the p0-01 threshold ("≥1 player reached peak unit tier ≥ 6 in ≥7/10 games"), re-promote that bullet with citation. If not, update p0-01 prose with the lifted-ceiling evidence and note the remaining gap (likely needs p0-24 difficulty + better tech/production coordination). + +## Fix direction (non-prescriptive — warcouncil picks) + +Two candidate approaches; choose based on blast radius: + +1. **Dynamic candidate generation**: replace `ids` module constants with a function `candidate_units_for_role(role, player_techs, unit_catalog)` returning the highest-tier reachable unit per role. Data-driven — new JSON units "just work". Needs unit catalog plumbed to TacticalState. +2. **Extend hardcoded list**: add `PIKEMAN`, `BERSERKER`, `CAVALRY`, `IRONWARDEN`, `FORGE_TITAN`, `MITHRIL_VANGUARD` constants + conditional priority branches (if-has-tech gates). Faster to land; accumulates hardcoded lists that future units will miss. + +Both preserve p0-02 tunables (DOMINANCE_GOLD_FLOOR=50, PRODUCTION_AXIS_BUILDING_BIAS=8). + +## Non-goals + +- **p0-24 difficulty calibration** — separate objective (ai_modifiers.production_mult / starting_gold_bonus / extra_starting_units). Tier-progression is orthogonal. +- **p0-38 PUCT strategic migration** — GameRolloutState priors for McSnapshot. Separate. +- **New unit authoring** — tier-2+ unit JSONs already exist. + +## Depends on + +- None (all prerequisite data + telemetry exist). + +## Blocks + +- **p0-01** MCTS wiring — the `peak_unit_tier ≥ 6 in ≥ 7/10 games` bullet cannot close without tier progression. +- **p0-22** Ultimate AI stress test — matchup-grid variance gated on post-tier-1 armies. +- **p0-08** Domination tempo — partially gated; p0-37 addressed personality tempo but tier-2 armies naturally extend combat duration, which also helps p1-05 luxury variance via longer games. + +## Related + +- **p0-37** personality-emergent thresholds (done) — tactical AI tempo now fires correctly; this objective unblocks the content ceiling. +- **p1-05** balance tuning (partial) — Shipwright flagged luxury_variance as gated on game length. Tier-2 armies extend games, so closing p0-39 likely lifts p1-05 passively. + +## Why P0 + +Without tier progression, every downstream quality gate (`peak_unit_tier`, wonder count, content-ceiling metrics) is structurally impossible to close. This is the single highest-leverage change left in warcouncil's lane for Game 1 EA. diff --git a/public/games/age-of-dwarves/data/objectives.json b/public/games/age-of-dwarves/data/objectives.json index 126f7dc3..aac58a85 100644 --- a/public/games/age-of-dwarves/data/objectives.json +++ b/public/games/age-of-dwarves/data/objectives.json @@ -1,12 +1,12 @@ { - "generated_at": "2026-04-18T20:54:38Z", + "generated_at": "2026-04-18T22:36:19Z", "totals": { + "oos": 18, + "partial": 17, + "stub": 4, "missing": 8, "done": 57, - "oos": 18, - "stub": 3, - "partial": 17, - "total": 103 + "total": 104 }, "objectives": [ { @@ -379,6 +379,16 @@ "updated_at": "2026-04-18", "summary": "Current MCTS selection uses classical UCB1 at tree nodes — all actions start\nwith equal prior, exploration is driven only by visit count. `ScoringWeights`\nand `strategic_axes` feed the *tactical executor* and *leaf evaluator* but\nNOT the tree-selection step. This means MCTS explores the same branches for\nevery clan; divergence only appears at the leaf.\n\nAlphaGo's core contribution was **learned priors** seeded into the tree. We\ndon't need learning — we have personality utility. Inject it as the `P(s,a)`\nterm in the PUCT / UCB1-with-prior formula:\n\n```\nscore(a) = Q(s,a) + c_puct × P(s,a) × sqrt(N(s)) / (1 + N(s,a))\n```\n\nWhere `P(s,a) = softmax(personality_utility(state, action) / temperature)`\nand `personality_utility` is the same `ScoringWeights`-driven evaluator used\nat the leaf.\n\nEffect: blackhammer's MCTS tree spends more branches on early assault\nvariants; goldvein's tree spends more branches on tech-up + defend variants.\nWithout the prior, both clans' trees are identical shape — only the leaf\nevaluator differs, and leaf evaluation is after 20+ turns of rollout where\nthe differentiating choice has already been washed out." }, + { + "id": "p0-39", + "title": "AI tier-progression unit selection — production.rs picks tier-2+ units once tech unlocks", + "priority": "p0", + "status": "stub", + "scope": "game1", + "owner": "warcouncil", + "updated_at": "2026-04-18", + "summary": "Shipwright audit 2026-04-18 of tech_web.json + research costs (requested by warcouncil session-close handoff) found the tech tree, costs, and research pacing are correct. `peak_unit_tier=1` universally is NOT a balance-data issue. Root cause is in the tactical AI's production-selection logic:\n\n**`src/simulator/crates/mc-ai/src/tactical/production.rs:72-80`** — the `ids` module hardcodes only tier-1 unit IDs (`WARRIOR`, `WORKER`, `FOUNDER`, `WALLS`, `FORGE`, `CASTLE`, `MARKETPLACE`, `GRANARY`). The priority ladder in `decide_production()` pulls exclusively from this list. When `bronze_working` researches (reliably by turn ~72) and enables `pikeman` (tier-2), the tactical AI has no branch that picks it. Same gap blocks berserker, runesmith, cavalry, ironwarden, forge_titan, mithril_vanguard.\n\n### Empirical evidence (batch `apricot-20260418_062941`, T300)\n\n- 53 techs researched by T300 per player — tech pipeline flows correctly\n- `bronze_working` researched turn 72 in one inspected seed\n- Zero pikemen built across any seed\n- Units built: 393× warrior, 4× worker, 2× founder, 2× dwarf_tribe — all tier-1\n- Telemetry honest: `peak_unit_tier` reads `DataLoader.get_unit(type_id).tier`; it reports 1 because tier-1 is all that exists in live gameplay" + }, { "id": "p0-35", "title": "Ecology telemetry instrumentation — flora canopy / undergrowth fields in turn_stats.jsonl",