docs(simulation-report): 📝 Update scenario comparison documentation with new simulation data

Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-04-09 00:25:43 -07:00 · 2026-04-09 00:25:43 -07:00 · 0196390f01
commit 0196390f01
parent d10ab5f27c
1 changed files with 86 additions and 98 deletions
--- a/.project/simulation-report/scenarios/comparison.md
+++ b/.project/simulation-report/scenarios/comparison.md
@@ -1,18 +1,20 @@
 # Scenario Sweep Comparison

-*Iter 6+30 ecology · map scaled by player count · 50,000 ticks · 500 turns · seed=42 · 588 named species*
+*Post-iter-7o sweep (2026-04-09) -- map scaled by player count -- 50,000 ticks -- 500 turns -- seed=42 -- 589 named species*

 ## Summary Table

-| Scenario | Map | Lairs | Encounters | Deaths | T7-T10 KR | T4-T6 KR | Verdict |
-|----------|-----|-------|------------|--------|-----------|----------|---------|
-| 0AI | 48×48 | 69 | 0 | 0 | — | — | Ecology baseline ✓ |
-| 1AI | 48×48 | 69 | 270 | 187 | 73% ⚠ | 7% ✗ | Solo hardest |
-| 2AI | 64×64 | 159 | 431 | 254 | 71% ✓ | 17% ✓ | Both viable ✓ |
-| 3AI | 80×80 | 316 | 920 | 670 | 81% ✗ | 13% ✓ | Balance miss |
-| 4AI | 96×96 | 447 | 1,245 | 801 | 73% ⚠ | 11% ✓ | All viable, edge |
+| Scenario | Map | Lairs | Encounters | Deaths | T7-T10 KR | T4-T6 KR | Overall KR | Elapsed |
+|----------|-----|-------|------------|--------|-----------|----------|------------|---------|
+| 0AI | 48x48 | 168 | 0 | 0 | -- | -- | -- | 564.0s |
+| 1AI | 48x48 | 168 | 9,222 | 5,211 | 62% pass | 27% pass | 56.5% | 641.7s |
+| 2AI | 64x64 | 282 | 19,618 | 11,087 | 60% pass | 20% pass | 56.5% | 870.8s |
+| 3AI | 80x80 | 484 | 24,855 | 14,459 | 63% pass | 22% pass | 58.2% | 1,107.5s |
+| 4AI | 96x96 | 672 | 31,136 | 18,334 | 63% pass | 24% pass | 58.9% | 1,641.4s |

-Map scaling formula: `48 + 16 × (n_players − 1)`, targeting ~2,300 tiles per player.
+**10/10 balance targets met.** First sweep where all scenarios pass all criteria cleanly.
+
+Map scaling formula: `48 + 16 * (n_players - 1)`, targeting ~2,300 tiles per player.

 ## Player Final States by Scenario

@@ -21,137 +23,123 @@ Map scaling formula: `48 + 16 × (n_players − 1)`, targeting ~2,300 tiles per
 | Profile | 1AI | 2AI | 3AI | 4AI |
 |---------|-----|-----|-----|-----|
 | Militarist | 17 | 17 | 17 | 17 |
-| Expansionist | — | 28 | 28 | 28 |
-| Merchant | — | — | 17 | 17 |
-| Scientist | — | — | — | 21 |
+| Expansionist | -- | 26 | 26 | 26 |
+| Merchant | -- | -- | 17 | 17 |
+| Scientist | -- | -- | -- | 20 |

-City count is identical across scenarios for each profile — city founding is city-count-capped, and the caps are the same regardless of map size or player count.
+City count is identical across scenarios for each profile -- city founding is rate-capped, not territory-capped.

 ### Unit Count at Turn 500

 | Profile | 1AI | 2AI | 3AI | 4AI |
 |---------|-----|-----|-----|-----|
-| Militarist | 112 | 205 | 152 | 130 |
-| Expansionist | — | 343 | 178 | 207 |
-| Merchant | — | — | 109 | 153 |
-| Scientist | — | — | — | 145 |
+| Militarist | 232 | 216 | 170 | 125 |
+| Expansionist | -- | 422 | 599 | 611 |
+| Merchant | -- | -- | 184 | 199 |
+| Scientist | -- | -- | -- | 279 |

-**Unit count drops on large maps despite identical city count.** The map-size scaling mismatch is visible here: 2AI (64×64) produces larger final armies than 3AI (80×80) or 4AI (96×96) for the same profiles, because the 64×64 map happens to meet balance targets while the larger maps produce higher per-unit attrition than the production system was calibrated for.
+The Expansionist's army grows with map size: 422 (64x64) to 599 (80x80) to 611 (96x96). Larger maps give the Expansionist more room to expand cities before hitting the cap, and more cities feeding the production pipeline before encounter pressure asserts itself. The Militarist's army shrinks with map size (232 to 125) -- the aggressive exploration profile encounters more lairs per unit as the map grows.

 ### Gold at Turn 500

 | Profile | 1AI | 2AI | 3AI | 4AI |
 |---------|-----|-----|-----|-----|
-| Militarist | 25,142 | 25,142 | 25,142 | 25,142 |
-| Expansionist | — | 43,166 | 43,166 | 43,166 |
-| Merchant | — | — | 25,592 | 25,592 |
-| Scientist | — | — | — | 36,728 |
+| Militarist | 27,244 | 27,244 | 27,244 | 27,244 |
+| Expansionist | -- | 45,560 | 45,560 | 45,560 |
+| Merchant | -- | -- | 68,020 | 68,020 |
+| Scientist | -- | -- | -- | 33,700 |

-**Gold is fully deterministic by profile** — the economic engine is not affected by map size or other players. Every profile always ends at the same gold regardless of which scenario it runs in.
+**Gold is fully deterministic by profile** -- the economic engine is not affected by map size, other players, or ecological conditions.

-## Encounter Scaling: Saturation vs Proportional
+## T7-T10 Kill Rate Convergence

-The most important structural finding of the sweep.
+The most significant structural finding of the sweep.

-### 48×48 — Saturation Ceiling
+| Scenario | Map | T7-T10 Kill Rate | Iter 7d |
+|----------|-----|------------------|---------|
+| 1AI | 48x48 | 62% | 73% |
+| 2AI | 64x64 | 60% | 71% |
+| 3AI | 80x80 | 63% | 81% |
+| 4AI | 96x96 | 63% | 73% |

-| Players | Encounters |
-|---------|------------|
-| 1 | 270 |
-| 2 (if 48×48) | ~245 |
-| 3 (if 48×48) | ~269 |
-| 4 (if 48×48) | ~267 |
+The T7-T10 kill rate converges to 60-63% across all map sizes. The spread is 3 percentage points. In iter 7d the spread was 10 points (71-81%). The tier-kill curve is now well-calibrated and stable regardless of map scale.

-On a fixed 48×48 map, encounter count plateaus at 265-270 per 500-turn game regardless of player count beyond 1. The 69 lairs and map area are the ceiling; more civilizations redistribute the same encounters without generating new ones.
+## T4-T6 Kill Rate Scaling

-### Scaled Maps — No Saturation
+| Scenario | T4-T6 Encounters | T4-T6 Deaths | Kill Rate | Iter 7d |
+|----------|-----------------|--------------|-----------|---------|
+| 0AI | 0 | 0 | -- | -- |
+| 1AI | 1,393 | 373 | 27% | 7% |
+| 2AI | 1,621 | 320 | 20% | 17% |
+| 3AI | 2,961 | 649 | 22% | 13% |
+| 4AI | 3,428 | 833 | 24% | 11% |

-| Players | Map | Lairs | Encounters | Enc/Lair |
-|---------|-----|-------|------------|----------|
-| 1 | 48×48 | 69 | 270 | 3.9 |
-| 2 | 64×64 | 159 | 431 | 2.7 |
-| 3 | 80×80 | 316 | 920 | 2.9 |
-| 4 | 96×96 | 447 | 1,245 | 2.8 |
+T4-T6 kill rates do not follow a monotonic trend with player count: 27% (1AI), 20% (2AI), 22% (3AI), 24% (4AI). Every multi-player scenario sits below the solo rate, consistent with a slight dilution effect as mid-tier encounters are spread across more armies. All values are inside the 10-30% window, with 1AI at 27% the closest to the ceiling.

-The per-lair encounter rate is stable at ~2.8-3.9 across all map sizes. More lairs on more tiles with sufficient civilization coverage produces proportionally more encounters. Map-scaled scenarios don't hit a ceiling.
+The 1AI T4-T6 rate jumped from 7% (under the 10% floor in iter 7d) to 27% -- the most dramatic improvement in the sweep.

-**Design implication:** The 1AI solo scenario is the odd one out — same map as 0AI (48×48) but with 270 encounters / 69 lairs = 3.9 enc/lair, higher than larger maps. Solo players trigger encounters more efficiently because a single army covers the full map without spreading risk across multiple formations.
+## Encounter Scaling

-## T7-T10 Kill Rate by Scenario
+| Scenario | Map | Lairs | Encounters | Enc/Lair | Deaths/Lair |
+|----------|-----|-------|------------|----------|-------------|
+| 1AI | 48x48 | 168 | 9,222 | 54.9 | 31.0 |
+| 2AI | 64x64 | 282 | 19,618 | 69.6 | 39.3 |
+| 3AI | 80x80 | 484 | 24,855 | 51.4 | 29.9 |
+| 4AI | 96x96 | 672 | 31,136 | 46.3 | 27.3 |

-| Players | Map | T7-T10 Kill Rate | Target |
-|---------|-----|-----------------|--------|
-| 1 | 48×48 | 73% | ⚠ +3% |
-| 2 | 64×64 | 71% | ✓ (edge) |
-| 3 | 80×80 | 81% | ✗ +11% |
-| 4 | 96×96 | 73% | ⚠ +3% |
-
-**2AI on 64×64 is the only scenario that fully meets all balance targets.** 1AI and 4AI are 3% over ceiling (marginal miss). 3AI on 80×80 is 11% over ceiling (clear miss).
-
-**Root cause:** Combat parameters were calibrated for 48×48. On larger maps, lair density grows with map area (205-447 T7-T10 lairs vs ~58 on 48×48) but unit production rate stays constant. Players traverse 3-5× more dangerous territory per city founded without a corresponding production increase.
-
-**Why 4AI is better than 3AI despite being larger:** The 4th player adds a fourth army absorbing a quarter of the encounter budget. Per-player exposure drops. The 96×96 map's additional area also provides more lair-avoidance routing options proportionally.
-
-## T4-T6 Kill Rate — Small Sample Resolution
-
-| Scenario | T4-T6 Encounters | T4-T6 Deaths | Kill Rate |
-|----------|-----------------|--------------|-----------|
-| 0AI | 0 | 0 | — |
-| 1AI | 42 | 3 | 7.1% ✗ |
-| 2AI | 94 | 16 | 17.0% ✓ |
-| 3AI | 109 | 14 | 12.8% ✓ |
-| 4AI | 180 | 20 | 11.1% ✓ |
-
-**The T4-T6 small-sample problem is resolved by map scaling.** 1AI on 48×48 produces only 42 mid-tier encounters — too few for confident statistics. Scaled maps produce 94-180 mid-tier encounters, all landing in the 10-30% target window. The mid-tier balance is correct; the 48×48 solo scenario just can't prove it statistically.
+Per-lair encounter rate ranges from 46 to 70 across scenarios. The higher-density ecology drives far more encounters per lair than the iter 7d runs (2.7-3.9 per lair). This is the primary data quality improvement: 31,136 encounters in the 4AI scenario vs 1,245 in iter 7d.

 ## Ecology Stability

 Lair counts are stable across all scenarios:
- 0AI: 69 lairs (pristine)
- 1AI: 69 lairs (unchanged after 500 turns of solo civilization)
- 2AI: 159 lairs (identical to evolution output; no clearing)
- 3AI: 316 lairs (identical to evolution output; no clearing)
- 4AI: 447 lairs (identical to evolution output; no clearing)
+- 0AI: 168 lairs (pristine)
+- 1AI: 168 lairs (unchanged after 500 turns of solo civilization)
+- 2AI: 282 lairs (identical to evolution output; no clearing)
+- 3AI: 484 lairs (identical to evolution output; no clearing)
+- 4AI: 672 lairs (identical to evolution output; no clearing)

-No civilization cleared a single lair across 2,866 total encounters in the sweep. The apex predator network is permanent at the current encounter parameters — civilizations adapt around it, they don't displace it.
+No civilization cleared a single lair across 84,831 total encounters in the sweep. The apex predator network is permanent at the current encounter parameters.

 ## Named Species by Map Size

-| Species | 0AI | 1AI | 2AI | 3AI | 4AI |
-|---------|-----|-----|-----|-----|-----|
-| Rat Snake | T10 | T10 | T10 | T10 | T10 |
-| Yacare Caiman | T10 | T10 | T10 | T10 | T10 |
-| Broad-Snouted Caiman | T10 | T10 | T10 | T10 | T10 |
-| Spectacled Caiman | T10 | — | T10 | T10 | T10 |
-| Tropical Rat Snake | T10 | T10 | T10 | T10 | T10 |
+| Species | 0AI (48) | 1AI (48) | 2AI (64) | 3AI (80) | 4AI (96) |
+|---------|----------|----------|----------|----------|----------|
 | Dwarf Crocodile | T10 | T10 | T10 | T10 | T10 |
-| Black Caiman | T10 | — | — | T10 | T10 |
-| Nile Crocodile | — | — | T10 | — | — |
-| Mugger Crocodile | — | — | — | T10 | T10 |
-| Dhole | T8 | T8 | T6 | T8 | T8 |
-| Cave Lion | T7 | T7 | T7 | T7 | T7 |
-| Sand Boa | T5 | T5 | T5 | T5 | T5 |
+| Yacare Caiman | T10 | T10 | T10 | T10 | T10 |
+| Rat Snake | T10 | T10 | T10 | T10 | T10 |
+| Spectacled Caiman | T10 | T10 | T10 | T10 | T10 |
+| Dhole | T10 | T10 | T10 | T10 | T10 |
+| Tropical Rat Snake | T10 | T10 | T10 | T10 | T10 |
+| Black Caiman | T10 | T10 | T8 | T9 | T10 |
+| Broad-Snouted Caiman | T10 | T10 | T10 | T10 | T10 |
+| Jaguar | T10 | T10 | -- | -- | -- |
+| Mugger Crocodile | T10 | T10 | T10 | T10 | T10 |
+| Cave Lion | T9 | T9 | -- | T9 | T9 |
+| Sand Boa | T7 | T7 | T7 | T7 | T7 |
+| Nile Crocodile | T5 | T5 | -- | -- | -- |
+| Painted Wolf | -- | -- | T5 | -- | -- |

-Larger maps produce more crocodilian diversity. Black Caiman establishes on 80×80+. Mugger Crocodile establishes on 80×80+. Nile Crocodile appears uniquely on 64×64 — likely a habitat-geometry artifact of that specific map size.
+Species distribution varies by map size. Jaguar and Nile Crocodile appear only on 48x48. Painted Wolf appears only on 64x64. The 64x64 map has a distinct species roster compared to other sizes -- a habitat-geometry artifact of that specific map area.

-## Open Balance Issues
+## Previous Sweep Comparison (iter 7d)

-| Issue | Scenario | Severity | Fix Direction |
-|-------|----------|----------|---------------|
-| T7-T10 kill rate over 70% ceiling | 3AI (80×80), 4AI (96×96), 1AI (48×48) | 3-11% over | Scale unit production or reduce lair aggression radius with map size |
-| Solo (1AI) T4-T6 kill rate 7% | 1AI only | Below 10% floor | More T4-T6 mid-tier lairs at 48×48, or solo-specific encounter scaling |
-| Final army size drops on large maps | 3AI/4AI | Armies ~half 48×48 equivalent | Increase base production yield proportionally with map area |
+| Scenario | T7-T10 old | T7-T10 new | T4-T6 old | T4-T6 new | Encounters old | Encounters new |
+|----------|-----------|-----------|-----------|-----------|---------------|---------------|
+| 1AI | 73% over | 62% pass | 7% under | 27% pass | 270 | 9,222 |
+| 2AI | 71% edge | 60% pass | 17% pass | 20% pass | 431 | 19,618 |
+| 3AI | 81% FAIL | 63% pass | 13% pass | 22% pass | 920 | 24,855 |
+| 4AI | 73% over | 63% pass | 11% pass | 24% pass | 1,245 | 31,136 |

-**Summary:** 2AI on a scaled map is the cleanest scenario. 1AI inherits the 48×48 calibration limitation. 3AI/4AI need map-size-aware combat parameter scaling before large-map runs are fully balanced.
+Every scenario now passes both kill-rate targets. The encounter volume increase (25-46x) provides much better statistical confidence. The T7-T10 kill rate dropped 10-18 points across all scenarios. The T4-T6 kill rate rose 3-20 points, and the one scenario that was under the floor (1AI) is now inside the target window.

 ## Unified Story

 The scenario sweep tells a single coherent story:

-**The ecology does not scale with civilization count — but it does scale with map area.** On 48×48, the encounter ceiling caps regardless of player count. On scaled maps, more lairs on more tiles generate proportionally more encounters, and the per-lair encounter rate (~2.8/game) is stable.
+**The fauna system produces a self-regulating equilibrium.** As players expand and encounter more lairs, the kill rate stabilizes rather than running away. T7-T10 converges to 60-63% regardless of map size or player count. The ecology is dangerous but not punishing -- it shapes expansion without preventing it.

-**The 64×64 two-player scenario is the balance sweet spot** at current combat parameters. Both balance targets met. Both players viable. 431 encounters with full statistical coverage at T4-T6. The ecology dangerous but not punishing.
+**T4-T6 kill rates vary with player count but stay in range.** 27% solo, then 20%, 22%, and 24% at two, three, and four players. More armies dilute the per-army mid-tier encounter rate slightly, and all values remain inside the 10-30% window. Mid-tier lairs are a real threat at every player count.

-**Larger maps expose the calibration debt.** 80×80 and 96×96 are harder than intended because production doesn't scale with the territory players need to traverse. The fix is not in the ecology — the ecology is behaving correctly. The fix is in starting production yield or combat parameters, scaling with map configuration.
+**The balance is map-size-invariant.** The worst previous imbalance (3AI at 81%) now sits at 63%, inside the same 3-point band as every other scenario. The encounter probability scaling that was added between iter 7d and iter 7o eliminated the map-size dependence of kill rates. Combat stats no longer need per-map-size tuning.

-**The world is permanent.** Across 2,866 total encounters in the sweep, no civilization cleared a single lair. The Rat Snake that held territory on the 0AI map holds the same territory at turn 500 of the 4AI run. Different size world, different number of civilizations, same predator. Civilization adapts to the ecology, not the other way around.
+**The world is permanent.** Across 84,831 total encounters in the sweep, no civilization cleared a single lair. The Dwarf Crocodile that held territory on the 0AI map holds the same territory at turn 500 of the 4AI run. Civilization adapts to the ecology, not the other way around.
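
The ratio columns in the new tables (Overall KR, Enc/Lair, Deaths/Lair) and the map sizes can be re-derived from the raw counts. The following is a minimal sketch for doing so, assuming the counts are copied by hand from the Summary Table; the `SWEEP` dictionary, `map_side`, and `derived` are hypothetical names for illustration and are not part of the simulation code.

```python
# Raw counts copied from the new Summary Table in comparison.md.
# The dictionary layout and all names here are illustrative assumptions.
SWEEP = {
    "1AI": {"players": 1, "lairs": 168, "encounters": 9222, "deaths": 5211},
    "2AI": {"players": 2, "lairs": 282, "encounters": 19618, "deaths": 11087},
    "3AI": {"players": 3, "lairs": 484, "encounters": 24855, "deaths": 14459},
    "4AI": {"players": 4, "lairs": 672, "encounters": 31136, "deaths": 18334},
}


def map_side(n_players: int) -> int:
    """Map side length from the documented formula 48 + 16 * (n_players - 1)."""
    return 48 + 16 * (n_players - 1)


def derived(row: dict) -> dict:
    """Recompute the ratio columns for one scenario row."""
    side = map_side(row["players"])
    return {
        "map": f"{side}x{side}",
        "tiles_per_player": side * side / row["players"],
        "overall_kr_pct": round(100 * row["deaths"] / row["encounters"], 1),
        "enc_per_lair": round(row["encounters"] / row["lairs"], 1),
        "deaths_per_lair": round(row["deaths"] / row["lairs"], 1),
    }


if __name__ == "__main__":
    for name, row in SWEEP.items():
        print(name, derived(row))
    # Sum of encounters over the four AI scenarios (0AI contributes none).
    print("total encounters:", sum(r["encounters"] for r in SWEEP.values()))
```

Running a check like this against each sweep makes it easier to catch a ratio or range that has drifted from the raw counts before the report is committed.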