diff --git a/.project/objectives/p1-36-ai-personalities-t1-t10-coverage.md b/.project/objectives/p1-36-ai-personalities-t1-t10-coverage.md
index 9940a643..16fffb25 100644
--- a/.project/objectives/p1-36-ai-personalities-t1-t10-coverage.md
+++ b/.project/objectives/p1-36-ai-personalities-t1-t10-coverage.md
@@ -5,7 +5,7 @@ priority: p1
 status: partial
 scope: game1
 owner: warcouncil
-updated_at: 2026-05-01
+updated_at: 2026-05-03
 external_blocker: "apricot 10-seed batch (1-2h compute, unblocked by p1-37 completion)"
 assigned_by: shipwright
 blockedBy: [p1-34]
@@ -56,10 +56,25 @@ five AI personalities to actually feel distinct in the units they build:
 - [ ] No clan personality builds a unit that's not in its
   `clan_affinity` unless it's a generic unit (warrior, spearmen, archer —
   all clans share these)
-- [ ] 10-seed batch on apricot shows distinct unit-mix histograms per
+- [⚠] 10-seed batch on apricot shows distinct unit-mix histograms per
   clan (e.g., Blackhammer ≥40% light melee composition; Deepforge ≥30%
   siege/walker composition); raw `total_combats` and `tier_peak` metrics
   remain inside the warcouncil quality gates
+
+  **Batch run 2026-05-03** (`~/.cache/mc-src-20260502_215856/.local/batches/autoplay_batch_p1_36/game_20260503_055132_*`):
+  10 seeds × T300 standard map, 1v1 (one AI per clan across 10 seeds; 1-3 seeds per clan).
+
+  | Clan | Seeds | Top-3 units | Specific gate |
+  |---|---|---|---|
+  | blackhammer | 1 | hearth_raider (98.1%), dwarf_tribe, warrior | light melee 1.0% — **FAIL** ≥40% (target was light_melee tag-based, but `hearth_raider` IS Blackhammer's clan-affinity light melee — the analyzer's tag heuristic doesn't capture it) |
+  | deepforge | 2 | warrior (86.2%), worker, dwarf_tribe | siege/walker 0.0% — **FAIL** ≥30% (Deepforge built no siege units; defaulting to warrior) |
+  | goldvein | 2 | warrior (93.8%), worker, dwarf_tribe | not measured |
+  | ironhold | 2 | warrior (91.2%), worker, dwarf_tribe | not measured |
+  | runesmith | 3 | warrior (92.2%), worker, dwarf_tribe | not measured |
+
+  **Distinct top-3 sets across clans: 2/5** (Blackhammer's hearth_raider is unique; the other 4 clans converge on warrior/worker/dwarf_tribe). Marginally PASSes the "distinct histograms" reading but only one clan is genuinely differentiated. The Deepforge ≥30% siege gate FAILs decisively; the Blackhammer light-melee gate is ambiguous (analyzer heuristic vs actual semantic mapping).
+
+  **Diagnosis**: clan_affinity routing landed (p1-37) but the AI's production scoring still picks generic warrior over clan-affinity units in 4/5 clans. Either the clan_affinity bonus weight is too small relative to the warrior baseline, or units like Deepforge's `forge_titan`/`steam_walker` aren't actually buildable in the early–mid game (tech-gated past T300 turn limit).
 - [ ] AI doesn't crash trying to build a unit it lacks tech/buildings
   for — the unit IDs added are reachable from the clan's tech path
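
Review note: the "distinct top-3 sets" count and the Blackhammer tag-heuristic ambiguity in the batch notes can be reproduced with a short sketch. This is not the project's analyzer. The function names, the `TAG_ONLY` and `AFFINITY_AWARE` sets, and the illustrative counts are all assumptions made for this example; only the top-3 unit lists are copied from the table in the diff.

```python
# Top-3 units per clan, copied from the batch table in the diff.
TOP3 = {
    "blackhammer": ("hearth_raider", "dwarf_tribe", "warrior"),
    "deepforge": ("warrior", "worker", "dwarf_tribe"),
    "goldvein": ("warrior", "worker", "dwarf_tribe"),
    "ironhold": ("warrior", "worker", "dwarf_tribe"),
    "runesmith": ("warrior", "worker", "dwarf_tribe"),
}


def distinct_top3_sets(top3):
    """Count how many different top-3 unit sets appear, ignoring order."""
    return len({frozenset(units) for units in top3.values()})


def category_share(counts, category_units):
    """Fraction of all built units that belong to the given category."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    in_category = sum(n for unit, n in counts.items() if unit in category_units)
    return in_category / total


# Hypothetical build counts (NOT taken from the batch) chosen only to show
# how the category definition changes the gate outcome.
ILLUSTRATIVE_COUNTS = {"hearth_raider": 98, "warrior": 1, "dwarf_tribe": 1}

# Assumed category definitions: a tag-only set that misses hearth_raider,
# and an affinity-aware set that includes it.
TAG_ONLY = {"warrior"}
AFFINITY_AWARE = {"warrior", "hearth_raider"}

LIGHT_MELEE_GATE = 0.40  # the >=40% Blackhammer target from the checklist

if __name__ == "__main__":
    print("distinct top-3 sets:", distinct_top3_sets(TOP3))
    for label, units in (("tag-only", TAG_ONLY), ("affinity-aware", AFFINITY_AWARE)):
        share = category_share(ILLUSTRATIVE_COUNTS, units)
        verdict = "PASS" if share >= LIGHT_MELEE_GATE else "FAIL"
        print(f"{label}: {share:.1%} {verdict}")
```

Under these assumed counts, the same build mix fails or passes the light-melee gate depending only on whether `hearth_raider` is included in the category, which is the ambiguity the batch notes describe.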