docs(@projects): 📝 add iteration_log.md batch notes

Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
This commit is contained in:
Natalie 2026-04-16 16:40:02 -07:00
parent 1a31596a94
commit 8f311c8a61

View file

@ -59,3 +59,5 @@
- worker improvements min 8 (was 0 — task #29 + #46 fixes cascaded)
Remaining 2 FAILS: loot_dropped 0 (wilds not engaging this batch — variance), both-players-T100 1/3 (persistent structural). Stop criterion needs FULL 14/14 for 2 consecutive batches — we're 2 short.
2026-04-16 15:42 BATCH 12 (confirmation): IDENTICAL to batch 11 — 12 PASS / 2 FAIL, same per-seed numbers. 2 consecutive batches at 12/14. Confirms: (a) determinism fix (task #17) works perfectly — byte-identical runs, (b) wild-aggro-dev's fix hadn't propagated to apricot before batch 12 started OR batch uses same seed=1/2/3. Remaining fails: loot_dropped 0 (wild aggression fix pending) + both-players-T100 1/3 (structural — seed 1 p1 economy). Stop criterion: needs FULL 14/14 — we're at 12/14 persistently. May need one more AI adjustment for the T100 gap + wild aggression deploy + re-batch.
2026-04-16 16:32 BATCH THOROUGH (10-seed T300 parallel, stamp 20260416_162509): **PARALLEL WRAPPER SHIPPED** (PARALLEL=10 env var, 10 seeds in 7min wall-clock vs ~50min serial). Broader sample reveals gaps hidden by 3-seed: victories 4/10 (40%, below 50-80% target; was 2/3=67% in pop-growth 3-seed); median p0_pop_peak 25 (below 30 target; was 32 in 3-seed). 6/10 stalemate at max_turns. Median TTV 300 dragged up by stalemates. Combats 308 ✅, 0 invariants ✅. Winning seeds: 2(T215), 3(T242), 7(T291), 8(T150). Stalemate seeds: 1, 4, 5, 6, 9, 10. Root cause hypothesis: AI strategic depth (heuristic doesn't close games when ahead); MCTS wiring is the tier-1 fix. (team-lead post batch thorough)
2026-04-16 16:32 SLOT STATE: 3/5 active (#26 prodqueue-ui-dev, #28 ttv-v2-dev, #46 t100-ai-dev). pop-growth-dev2 retired (#61 complete, pop_peak median 26→32 in 3-seed, fell to 25 in 10-seed — variance-revealed). 2 free slots held — no spawn meets STEP 4 criteria (game-ai dupe, combat already tuned to diminishing returns, MCTS wiring >50 lines awaits user approval).