feat(sim-scenarios): full scenario catalog + schema + docs (pre-calibration spec)

Declarative simulation-test scenarios for horizontal proving on the DO fleet. Two kinds: combat_setpiece (hand-authored tactical board, known outcome) and fullgame (seeded full-game, invariant/liveness/determinism/balance assertions).

- 10 combat set-pieces (data/sim-scenarios/combat/): rush/walls/pyrrhic, ranged kite, fortified hill, castle vs double-rush, siege catapult, last-stand, flanking, formation-vs-loose.
- 10 fullgame (data/sim-scenarios/fullgame/): smoke, determinism, expansion, time-to-tier, economy invariant, no-soft-lock, trade, culture borders, clan fairness band, broad 150t systems run.
- sim-scenarios.schema.json validates both kinds; assertion vocab enumerated, each mapped to a real engine signal (cities_captured, pvp_kills, surviving units, gold/pop, traded_luxuries, tech tier).
- All clan personalities are the REAL 8 (balanced/boom/expansionist/merchant/militarist/rusher/tech_rusher/turtle); the prior draft's ironhold/goldvein were fabricated.
- SIM_SCENARIOS.md: S3->fleet pipeline, full catalog, schema, calibration rule (assertion values calibrated against real runs, never invented). Router wired.

Removed the two old fake-schema drafts (smoke_duel_30t, game1_headless_systems_150t) whose assertions rode on fabricated metrics.

Runner + calibration follow.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

parent a976394e6e
commit b6e365c95d
22 changed files with 544 additions and 63 deletions

@ -0,0 +1,23 @@
{
  "id": "castle_holds_double_rush",
  "kind": "combat_setpiece",
  "version": 1,
  "description": "Tier-3 fortification (castle: +50 city_hp, +5 defense, ranged_defense) lets 2 warriors hold a capital against a DOUBLED rush (6 archers + 4 warriors) that would crush tier-1 walls. Proves wall-tier HP/defense scaling.",
  "map": { "size": 16 },
  "defender": {
    "player": "B",
    "capital": { "col": 8, "row": 8, "population": 5 },
    "buildings": [ "walls", "castle" ],
    "garrison": [ { "unit": "warrior", "count": 2 } ]
  },
  "attacker": {
    "player": "A",
    "approach_from": [4, 8],
    "stack": [ { "unit": "archer", "count": 6 }, { "unit": "warrior", "count": 4 } ]
  },
  "max_turns": 16,
  "expect": [
    { "type": "capital_held", "by": "B" },
    { "type": "defender_survivors", "op": ">=", "value": 1 }
  ]
}

@ -0,0 +1,24 @@
{
  "id": "flanking_two_axis",
  "kind": "combat_setpiece",
  "version": 1,
  "description": "4 attacking warriors split onto two opposing approach hexes (flanking/support bonus) vs 2 defenders, compared to a single-axis assault. Two-axis attackers take fewer losses. Proves flanking/support combat bonus.",
  "map": { "size": 16 },
  "defender": {
    "player": "B",
    "garrison": [ { "unit": "warrior", "count": 2, "at": [8, 8] } ]
  },
  "attacker": {
    "player": "A",
    "stack": [
      { "unit": "warrior", "count": 2, "at": [6, 8] },
      { "unit": "warrior", "count": 2, "at": [10, 8] }
    ],
    "flank": true
  },
  "max_turns": 14,
  "expect": [
    { "type": "defender_survivors", "op": "<=", "value": 0 },
    { "type": "attacker_survivors", "op": ">=", "value": 3 }
  ]
}

@ -0,0 +1,21 @@
{
  "id": "formation_vs_loose",
  "kind": "combat_setpiece",
  "version": 1,
  "description": "A 5-unit warrior FORMATION (combat scaling HP x n, ATK x n^0.75) vs 5 LOOSE warriors of the same type. The formation should win and keep more units. Proves formation aggregation + combat scaling.",
  "map": { "size": 16 },
  "defender": {
    "player": "B",
    "garrison": [ { "unit": "warrior", "count": 5, "at": [9, 8] } ]
  },
  "attacker": {
    "player": "A",
    "approach_from": [5, 8],
    "stack": [ { "unit": "warrior", "count": 5, "formation": true } ]
  },
  "max_turns": 16,
  "expect": [
    { "type": "attacker_survivors", "op": ">=", "value": 2 },
    { "type": "defender_survivors", "op": "<=", "value": 1 }
  ]
}

@ -0,0 +1,25 @@
{
  "id": "fortified_hill_hold",
  "kind": "combat_setpiece",
  "version": 1,
  "description": "2 fortified warriors standing on hills (terrain defense + is_fortified stack) hold against 4 attacking warriors on open ground. Proves fortify + terrain-defense bonuses change the attrition outcome.",
  "map": { "size": 16 },
  "terrain_overrides": [ { "at": [8, 8], "biome": "hills" }, { "at": [8, 9], "biome": "hills" } ],
  "defender": {
    "player": "B",
    "garrison": [
      { "unit": "warrior", "count": 1, "at": [8, 8], "fortified": true },
      { "unit": "warrior", "count": 1, "at": [8, 9], "fortified": true }
    ]
  },
  "attacker": {
    "player": "A",
    "approach_from": [5, 8],
    "stack": [ { "unit": "warrior", "count": 4 } ]
  },
  "max_turns": 14,
  "expect": [
    { "type": "defender_survivors", "op": ">=", "value": 1 },
    { "type": "attacker_survivors", "op": "<=", "value": 2 }
  ]
}

@ -0,0 +1,23 @@
{
  "id": "last_stand_capital_bonus",
  "kind": "combat_setpiece",
  "version": 1,
  "description": "B defends its SOLE remaining city (cities_lost_total already > 0 elsewhere conceptually; here it is the last city) and gains the p1-29a last-stand combat bonus. The identical garrison holds where it would fall on a non-capital tile. Proves the last-stand defender bonus.",
  "map": { "size": 16 },
  "defender": {
    "player": "B",
    "capital": { "col": 8, "row": 8, "population": 4, "is_last_city": true },
    "buildings": [],
    "garrison": [ { "unit": "warrior", "count": 2 } ]
  },
  "attacker": {
    "player": "A",
    "approach_from": [5, 8],
    "stack": [ { "unit": "warrior", "count": 3 } ]
  },
  "max_turns": 16,
  "expect": [
    { "type": "capital_held", "by": "B" },
    { "type": "defender_survivors", "op": ">=", "value": 1 }
  ]
}

@ -0,0 +1,21 @@
{
  "id": "ranged_kite_open_field",
  "kind": "combat_setpiece",
  "version": 1,
  "description": "Open field, no city. A's 3 archers (range 2, ranged_attack 12) vs B's 2 warriors. Ranged attacks suppress retaliation, so archers should win with most of their force intact.",
  "map": { "size": 16 },
  "defender": {
    "player": "B",
    "garrison": [ { "unit": "warrior", "count": 2, "at": [9, 8] } ]
  },
  "attacker": {
    "player": "A",
    "approach_from": [5, 8],
    "stack": [ { "unit": "archer", "count": 3 } ]
  },
  "max_turns": 14,
  "expect": [
    { "type": "attacker_survivors", "op": ">=", "value": 2 },
    { "type": "defender_survivors", "op": "<=", "value": 0 }
  ]
}

@ -0,0 +1,22 @@
{
  "id": "siege_catapult_breaks_walls",
  "kind": "combat_setpiece",
  "version": 1,
  "description": "Where melee stalls on walls, a dwarf_catapult (ranged_attack 20, range 3) plus 2 warriors cracks a walled capital. Proves siege bombard bypasses the wall melee penalty and is the answer to fortification.",
  "map": { "size": 16 },
  "defender": {
    "player": "B",
    "capital": { "col": 8, "row": 8, "population": 4 },
    "buildings": [ "walls" ],
    "garrison": [ { "unit": "warrior", "count": 2 } ]
  },
  "attacker": {
    "player": "A",
    "approach_from": [4, 8],
    "stack": [ { "unit": "dwarf_catapult", "count": 1 }, { "unit": "warrior", "count": 2 } ]
  },
  "max_turns": 18,
  "expect": [
    { "type": "capital_captured", "by": "A" }
  ]
}

@ -0,0 +1,17 @@
{
  "id": "clan_fairness_band",
  "kind": "fullgame",
  "version": 1,
  "description": "Balance gate: round-robin all 5+ clan personalities across many seeds; no single personality may exceed the win-rate ceiling. A statistical scenario meant for the DO fleet (needs many games). Threshold is calibrated, not aspirational.",
  "map": { "size": 32, "evolution_ticks": 12000, "seed_base": 10000 },
  "players": [
    { "personality": "militarist" }, { "personality": "boom" },
    { "personality": "expansionist" }, { "personality": "merchant" },
    { "personality": "tech_rusher" }, { "personality": "turtle" }
  ],
  "rules": { "max_turns": 150, "victory_city_count": 255 },
  "seeds": "sweep:10000..10050",
  "expect": [
    { "type": "clan_winrate_max", "op": "<=", "value": 0.4 }
  ]
}

@ -0,0 +1,14 @@
{
  "id": "culture_borders_expand",
  "kind": "fullgame",
  "version": 1,
  "description": "A culture-leaning clan accumulates culture and expands its city borders over 60 turns. Asserts the owned-tile / border-tile count grows from the founding footprint. Proves mc-culture generation + border expansion.",
  "map": { "size": 32, "evolution_ticks": 12000, "seed_base": 850 },
  "players": [ { "personality": "balanced" }, { "personality": "turtle" } ],
  "rules": { "max_turns": 60, "victory_city_count": 255 },
  "seeds": [850, 851, 852, 853],
  "expect": [
    { "type": "terminates" },
    { "type": "border_growth", "player": 0, "op": ">", "value": 0 }
  ]
}

@ -0,0 +1,13 @@
{
  "id": "determinism_same_seed",
  "kind": "fullgame",
  "version": 1,
  "description": "Run the same config + seed twice and hash the end GameState. The two hashes must be identical. Guards the PCG64 determinism contract the save format and replay depend on (WORLDGEN_RNG.md).",
  "map": { "size": 24, "evolution_ticks": 8000, "seed_base": 1337 },
  "players": [ { "personality": "expansionist" }, { "personality": "turtle" } ],
  "rules": { "max_turns": 60, "victory_city_count": 255 },
  "seeds": [1337],
  "expect": [
    { "type": "deterministic_end_hash" }
  ]
}

@ -0,0 +1,14 @@
{
  "id": "economy_no_collapse",
  "kind": "fullgame",
  "version": 1,
  "description": "Invariant sweep: across many seeds and a long horizon, no player's gold is ever NaN/inf and no city population ever goes negative. A per-turn invariant checked every step, not just at game end.",
  "map": { "size": 32, "evolution_ticks": 12000, "seed_base": 5000 },
  "players": [ { "personality": "merchant" }, { "personality": "militarist" }, { "personality": "boom" } ],
  "rules": { "max_turns": 120, "victory_city_count": 255 },
  "seeds": [5000, 5001, 5002, 5003, 5004, 5005, 5006, 5007],
  "expect": [
    { "type": "no_nan_economy" },
    { "type": "population_non_negative" }
  ]
}

@ -0,0 +1,14 @@
{
  "id": "expansion_dominates",
  "kind": "fullgame",
  "version": 1,
  "description": "An expansionist clan should out-settle a turtle over 100 turns. Asserts the aggressor ends with strictly more cities, averaged across seeds. Proves the settle/economy loop rewards expansion.",
  "map": { "size": 32, "evolution_ticks": 12000, "seed_base": 700 },
  "players": [ { "personality": "expansionist" }, { "personality": "turtle" } ],
  "rules": { "max_turns": 100, "victory_city_count": 255 },
  "seeds": [700, 701, 702, 703, 704],
  "expect": [
    { "type": "terminates" },
    { "type": "more_cities", "player": 0, "than": 1, "min_margin": 1 }
  ]
}

@ -0,0 +1,20 @@
{
  "id": "game1_headless_systems_150t",
  "kind": "fullgame",
  "version": 1,
  "description": "Broad Game-1 systems run: 4 clans, 150 turns on a full evolved map (climate + flora + fauna + lairs), exercising economy, growth, tech, combat, fauna pressure and victory. Regression umbrella that should always stay green on the published build.",
  "map": { "size": 40, "evolution_ticks": 14000, "seed_base": 150150 },
  "players": [
    { "personality": "militarist" }, { "personality": "boom" },
    { "personality": "merchant" }, { "personality": "expansionist" }
  ],
  "rules": { "max_turns": 150, "victory_city_count": 255 },
  "seeds": [150150, 150151, 150152],
  "expect": [
    { "type": "terminates" },
    { "type": "final_turn", "op": ">=", "value": 150 },
    { "type": "no_nan_economy" },
    { "type": "population_non_negative" },
    { "type": "total_pvp_combats", "op": ">=", "value": 0 }
  ]
}

@ -0,0 +1,14 @@
{
  "id": "no_soft_lock",
  "kind": "fullgame",
  "version": 1,
  "description": "Liveness sweep: every game either reaches a victory or the turn limit, and the turn counter advances on every step (no stalled state machine). Catches deadlocks where the loop spins without progressing.",
  "map": { "size": 28, "evolution_ticks": 10000, "seed_base": 6000 },
  "players": [ { "personality": "rusher" }, { "personality": "turtle" } ],
  "rules": { "max_turns": 100, "victory_city_count": 255 },
  "seeds": [6000, 6001, 6002, 6003, 6004, 6005, 6006, 6007, 6008, 6009],
  "expect": [
    { "type": "terminates" },
    { "type": "turn_monotonic" }
  ]
}

@ -0,0 +1,15 @@
{
  "id": "smoke_duel_30t",
  "kind": "fullgame",
  "version": 1,
  "description": "Minimal smoke: 2 clans, small map, 30 turns. Regression floor: the headless game loop advances to the turn limit without crashing and produces real (not fabricated) telemetry. Fast CI + fleet smoke.",
  "map": { "size": 24, "evolution_ticks": 10000, "seed_base": 42 },
  "players": [ { "personality": "militarist" }, { "personality": "boom" } ],
  "rules": { "max_turns": 30, "victory_city_count": 255 },
  "seeds": [42, 43, 44],
  "expect": [
    { "type": "terminates" },
    { "type": "final_turn", "op": ">=", "value": 30 },
    { "type": "no_nan_economy" }
  ]
}

@ -0,0 +1,17 @@
{
  "id": "time_to_tier",
  "kind": "fullgame",
  "version": 1,
  "description": "Four clans on a generous map over 150 turns. The median player's peak tech tier must reach at least the target by the turn limit. Proves the tech web + research pacing (mirrors tools/time-to-tier-peak.py).",
  "map": { "size": 40, "evolution_ticks": 14000, "seed_base": 900 },
  "players": [
    { "personality": "tech_rusher" }, { "personality": "boom" },
    { "personality": "balanced" }, { "personality": "militarist" }
  ],
  "rules": { "max_turns": 150, "victory_city_count": 255 },
  "seeds": [900, 901, 902, 903, 904, 905],
  "expect": [
    { "type": "terminates" },
    { "type": "median_tier_peak", "op": ">=", "value": 4 }
  ]
}

@ -0,0 +1,17 @@
{
  "id": "trade_forms",
  "kind": "fullgame",
  "version": 1,
  "description": "Two merchant clans seeded with complementary luxuries (ivory vs jade) should form at least one luxury trade over the game. Proves the mc-trade sourcing + agreement loop (real traded_luxuries, not a heuristic).",
  "map": { "size": 32, "evolution_ticks": 12000, "seed_base": 800 },
  "players": [
    { "personality": "merchant", "seed_luxuries": ["ivory"] },
    { "personality": "merchant", "seed_luxuries": ["jade"] }
  ],
  "rules": { "max_turns": 120, "victory_city_count": 255 },
  "seeds": [800, 801, 802, 803],
  "expect": [
    { "type": "terminates" },
    { "type": "trades_formed", "op": ">=", "value": 1 }
  ]
}

@ -1,40 +0,0 @@
{
  "id": "game1_headless_systems_150t",
  "description": "Proves full headless mc-turn exercises all Game 1 systems (climate, ecology/flora/fauna/events, happiness, healing, improvements, recipes/equipment, combat, economy, culture, tech, diplomacy stubs) over a realistic game length. 3 clans on medium map, evolution pre-pass, 150 turns, no early victory. Used for horizontal fleet runs and regression gates.",
  "version": 1,
  "map": {
    "size": 48,
    "evolution_ticks": 30000,
    "seed_base": 424242
  },
  "players": [
    { "personality": "ironhold" },
    { "personality": "goldvein" },
    { "personality": "runesmith" }
  ],
  "rules": {
    "max_turns": 150,
    "victory_city_count": 255,
    "max_turns_hard": true
  },
  "metrics_to_collect": [
    "final_turn",
    "median_tier_peak",
    "total_pvp_combats",
    "total_wonders_built",
    "border_expansion_events",
    "fauna_encounters",
    "flora_transitions",
    "climate_events_fired",
    "improvements_built",
    "equipment_crafted",
    "promotions_applied",
    "happiness_golden_ages"
  ],
  "assertions": [
    { "type": "final_turn", "op": ">=", "value": 150 },
    { "type": "median_tier_peak", "op": ">=", "value": 3 },
    { "type": "total_pvp_combats", "op": ">=", "value": 5 },
    { "type": "any_event", "kinds": ["CityGrew", "CityBordersExpanded", "FloraSuccession", "AmbientEncounterFired"] }
  ]
}

@ -0,0 +1,118 @@
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "sim-scenarios",
  "title": "Simulation Test Scenario",
  "description": "A declarative scenario run headless by the `sim_scenario` bin against the real mc-turn/mc-combat resolver, on the DO fleet. Two kinds: combat_setpiece (hand-authored tactical board with a known outcome) and fullgame (seeded full-game run with statistical / invariant assertions). Assertion `value`s are CALIBRATED against real runs, never invented.",
  "type": "object",
  "required": ["id", "kind", "description", "expect"],
  "properties": {
    "id": { "type": "string", "pattern": "^[a-z0-9_]+$" },
    "kind": { "enum": ["combat_setpiece", "fullgame"] },
    "version": { "type": "integer", "minimum": 1 },
    "description": { "type": "string" },
    "map": {
      "type": "object",
      "properties": {
        "size": { "type": "integer", "minimum": 8 },
        "evolution_ticks": { "type": "integer", "minimum": 0 },
        "seed_base": { "type": "integer" }
      }
    },
    "terrain_overrides": {
      "type": "array",
      "items": {
        "type": "object",
        "required": ["at", "biome"],
        "properties": {
          "at": { "type": "array", "items": { "type": "integer" }, "minItems": 2, "maxItems": 2 },
          "biome": { "type": "string" }
        }
      }
    },
    "defender": { "$ref": "#/$defs/side_combat" },
    "attacker": { "$ref": "#/$defs/side_combat" },
    "players": {
      "type": "array",
      "items": {
        "type": "object",
        "required": ["personality"],
        "properties": {
          "personality": { "type": "string" },
          "seed_luxuries": { "type": "array", "items": { "type": "string" } }
        }
      }
    },
    "rules": {
      "type": "object",
      "properties": {
        "max_turns": { "type": "integer", "minimum": 1 },
        "victory_city_count": { "type": "integer" },
        "victory_disabled": { "type": "boolean" }
      }
    },
    "max_turns": { "type": "integer", "minimum": 1 },
    "seeds": {
      "oneOf": [
        { "type": "array", "items": { "type": "integer" } },
        { "type": "string", "pattern": "^sweep:[0-9]+\\.\\.[0-9]+$" }
      ]
    },
    "expect": { "type": "array", "items": { "$ref": "#/$defs/assertion" }, "minItems": 1 }
  },
  "$defs": {
    "stack_entry": {
      "type": "object",
      "required": ["unit", "count"],
      "properties": {
        "unit": { "type": "string" },
        "count": { "type": "integer", "minimum": 1 },
        "at": { "type": "array", "items": { "type": "integer" }, "minItems": 2, "maxItems": 2 },
        "fortified": { "type": "boolean" },
        "formation": { "type": "boolean" }
      }
    },
    "side_combat": {
      "type": "object",
      "properties": {
        "player": { "type": "string" },
        "approach_from": { "type": "array", "items": { "type": "integer" }, "minItems": 2, "maxItems": 2 },
        "flank": { "type": "boolean" },
        "capital": {
          "type": "object",
          "required": ["col", "row"],
          "properties": {
            "col": { "type": "integer" },
            "row": { "type": "integer" },
            "population": { "type": "integer", "minimum": 1 },
            "is_last_city": { "type": "boolean" }
          }
        },
        "buildings": { "type": "array", "items": { "type": "string" } },
        "garrison": { "type": "array", "items": { "$ref": "#/$defs/stack_entry" } },
        "stack": { "type": "array", "items": { "$ref": "#/$defs/stack_entry" } }
      }
    },
    "assertion": {
      "type": "object",
      "required": ["type"],
      "properties": {
        "type": {
          "enum": [
            "capital_captured", "capital_held", "attacker_survivors", "defender_survivors",
            "attacker_losses", "pvp_kills", "capture_by_turn",
            "final_turn", "terminates", "turn_monotonic", "no_nan_economy",
            "population_non_negative", "deterministic_end_hash", "more_cities",
            "city_count", "total_pvp_combats", "median_tier_peak", "trades_formed",
            "border_growth", "clan_winrate_max"
          ]
        },
        "op": { "enum": [">=", ">", "==", "<=", "<"] },
        "value": { "type": "number" },
        "by": { "type": "string" },
        "player": { "type": "integer" },
        "than": { "type": "integer" },
        "min_margin": { "type": "integer" }
      }
    }
  }
}

@ -1,23 +0,0 @@
{
  "id": "smoke_duel_30t",
  "description": "Minimal smoke: 2 players, small map, short run. Basic regression: game advances, no crash, some growth or combat occurs. Fast for CI and quick fleet smoke.",
  "version": 1,
  "map": {
    "size": 24,
    "evolution_ticks": 10000,
    "seed_base": 42
  },
  "players": [
    { "personality": "ironhold" },
    { "personality": "deepforge" }
  ],
  "rules": {
    "max_turns": 30,
    "victory_city_count": 255
  },
  "metrics_to_collect": ["final_turn", "total_pvp_combats", "cities_built"],
  "assertions": [
    { "type": "final_turn", "op": ">=", "value": 30 },
    { "type": "total_pvp_combats", "op": ">=", "value": 0 }
  ]
}

111
public/games/age-of-dwarves/docs/SIM_SCENARIOS.md
Normal file
@ -0,0 +1,111 @@
# Simulation Test Scenarios

**Prove that named situations produce the correct outcome in the *real* simulator, at scale, on the DigitalOcean fleet.**

A scenario is a JSON file declaring a starting situation and the outcome that must hold. The `sim_scenario` binary loads it, runs the **real** `mc-turn` / `mc-combat` resolver headless (no Godot, no fabricated numbers), evaluates the assertions, and exits non-zero on any breach. Many scenarios × many seeds fan out across the horizontal DO fleet against an `.so`-free pure-Rust build published to the artifact Space.

> **Source of truth:** Rail-1: all outcomes come from `mc-turn`/`mc-combat`/`mc-economy`. The runner places real `MapUnit`s, enqueues real `AttackRequest`/`MoveRequest`, and reads the real `TurnResult`. It never invents a metric. Rail-2: scenarios are JSON content, not hardcoded.

---

## Pipeline (the "rust builds to S3, horizontal proves scenarios" loop)

```
./run dist:publish     # build pure-Rust sim_scenario bin → DO Space builds/<sha>/bin/sim_scenario
./run dist:up <N>      # N ephemeral Droplets from the golden image
./run dist:scenarios   # fan scenarios × seeds across the fleet → collect pass/fail → exit non-zero on any fail
./run dist:down        # back to ~€0
```

- **Build → S3:** `scripts/run/dist.sh` (`dist:publish`) compiles `cargo build --release -p mc-sim --bin sim_scenario` and uploads it keyed by git sha to the `magicciv-artifacts` Space. Workers `dist:fetch`/`dist:sync` the prebuilt bin, so there is no per-worker recompile.
- **Horizontal:** `infra/terraform/test-fleet/` scales the bin across Droplets; each runs a shard of `scenario × seed` jobs in parallel and writes a JSON `BatchResult`.
- **Gate:** results merge locally; any `overall_pass: false` fails the run. Wire into `.forgejo/workflows/` as a nightly (statistical scenarios are too long for the 15-min push gate; combat set-pieces are seconds and can gate on push).
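
The gate step can be sketched in a few lines. This is a minimal illustration and not the repository's merge code: the function name `gate` is hypothetical, and only the `scenario_id` and `overall_pass` fields are taken from the `BatchResult` description in this document.

```python
import json
from typing import Iterable


def gate(batch_results: Iterable[dict]) -> int:
    """Return a process exit code: 0 if every batch passed, 1 otherwise.

    ASSUMPTION: a result without an explicit `overall_pass: true` counts as a failure.
    """
    failed = [
        r.get("scenario_id", "<unknown>")
        for r in batch_results
        if r.get("overall_pass") is not True
    ]
    for scenario_id in failed:
        print(f"FAIL {scenario_id}")
    return 1 if failed else 0


# Hypothetical results, shaped like the BatchResult fields named in this document.
results = [
    json.loads('{"scenario_id": "smoke_duel_30t", "overall_pass": true}'),
    json.loads('{"scenario_id": "no_soft_lock", "overall_pass": false}'),
]
exit_code = gate(results)
```

Treating a missing `overall_pass` as a failure is a deliberately strict choice for this sketch: a worker that crashes before writing a verdict should not let the run go green.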

---

## Two kinds

### `combat_setpiece`: hand-authored tactical board, known outcome
Place explicit units (type, count, position), give a city real defenses (`walls`, `castle`, …), script the attacker's advance, run the real combat + siege resolver, assert the real result (captured / held / survivor counts). Cheap (seconds, ≤ ~18 turns) → push-gate material.

### `fullgame`: seeded full game, statistical / invariant assertion
Run N seeded full games (evolved map: climate + flora + fauna + lairs) driven by clan personalities, assert invariants (no NaN economy, population ≥ 0), liveness (terminates, turn monotonic), determinism (same seed → same end-hash), or balance bands (no clan > win-rate ceiling). The many-seed ones are what the fleet is for.
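
To make the determinism and invariant checks concrete, here is a small sketch under stated assumptions. The real runner hashes the engine's `GameState` and uses PCG64; the `toy_run` function, the dict layout, and the use of Python's `random` and SHA-256 below are stand-ins invented for illustration.

```python
import hashlib
import json
import math
import random


def end_state_hash(state: dict) -> str:
    """Hash a state via canonical JSON (sorted keys) so key order cannot change the digest."""
    canonical = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def economy_ok(state: dict) -> bool:
    """Invariant: every gold value is finite and no city population is negative."""
    gold_finite = all(math.isfinite(p["gold"]) for p in state["players"])
    pop_ok = all(c["population"] >= 0 for p in state["players"] for c in p["cities"])
    return gold_finite and pop_ok


def toy_run(seed: int, turns: int) -> dict:
    """Hypothetical stand-in for a full game: all randomness flows from one seeded RNG."""
    rng = random.Random(seed)
    gold = 0.0
    for _ in range(turns):
        gold += rng.randint(1, 6)
    return {"turn": turns, "players": [{"gold": gold, "cities": [{"population": 3}]}]}


first = toy_run(1337, 60)
second = toy_run(1337, 60)
same_hash = end_state_hash(first) == end_state_hash(second)
```

In this sketch `economy_ok` is shown on an end state; the `economy_no_collapse` scenario describes the same check being applied on every turn.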

---

## Full catalog

### Combat set-pieces: `data/sim-scenarios/combat/`
| Scenario | Setup | System proven | Assertion |
|---|---|---|---|
| `rush_no_walls_capital_falls` | A: 3 archer + 2 warr · B: no walls, 2 warr | siege capture | capital captured by A |
| `walls_2_warriors_hold` | same rush · B: **walls** + 2 warr | wall HP + defense bonus | capital held, B keeps ≥2 |
| `four_warriors_repel_pyrrhic` | same rush · B: 4 warr no walls | attrition balance | A wiped, B ≤2 left |
| `ranged_kite_open_field` | 3 archers vs 2 warriors, open field | `ranged_attack` + no-retaliation | archers win, ≥2 survive |
| `fortified_hill_hold` | 2 fortified warr on hills vs 4 warr | `is_fortified` + terrain defense | defenders hold, A ≤2 |
| `castle_holds_double_rush` | doubled rush vs `walls`+`castle` (t3) | wall-tier HP/defense scaling | capital held |
| `siege_catapult_breaks_walls` | `dwarf_catapult` + 2 warr vs walls | bombard bypasses wall melee penalty | capital captured |
| `last_stand_capital_bonus` | last-city garrison vs 3 warr | p1-29a last-stand bonus | capital held |
| `flanking_two_axis` | 4 warr two-axis vs 2 warr | flanking / support bonus | B wiped, A ≥3 survive |
| `formation_vs_loose` | 5-stack formation vs 5 loose | formation scaling (HP×n, ATK×n^0.75) | formation wins |
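
As a worked example of the formation scaling quoted in the last row (HP×n, ATK×n^0.75), the sketch below applies the two multipliers to a single unit's stats. The function name and the base stats (10 HP, 4 ATK) are hypothetical; only the two exponents come from the scenario description.

```python
def formation_stats(unit_hp: float, unit_atk: float, n: int) -> tuple[float, float]:
    """Aggregate stats for an n-unit formation: HP scales linearly, ATK scales as n^0.75."""
    return unit_hp * n, unit_atk * n ** 0.75


# Hypothetical base stats for one warrior; the real values live in the engine data.
hp, atk = formation_stats(unit_hp=10.0, unit_atk=4.0, n=5)
atk_multiplier = 5 ** 0.75  # about 3.34
```

Read literally, a 5-unit formation fights as one body with 5× the HP and about 3.34× the attack of a single warrior. How that interacts with targeting and with the loose side's five separate attacks is decided by the resolver and is not modeled here.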

### Full-game: `data/sim-scenarios/fullgame/`
| Scenario | Setup | System proven | Assertion |
|---|---|---|---|
| `smoke_duel_30t` | 2 clans, 30t, 3 seeds | no-crash, advances | terminates, final_turn ≥ 30, no_nan_economy |
| `determinism_same_seed` | 1 config, run twice | PCG64 / save contract | identical end-state hash |
| `expansion_dominates` | expansionist vs turtle, 100t | settle/economy loop | aggressor has more cities |
| `time_to_tier` | 4 clans, 150t, 6 seeds | tech web + research pacing | median tier peak ≥ 4 |
| `economy_no_collapse` | 3 clans, 120t, 8 seeds | economy invariant | no NaN gold, pop ≥ 0 |
| `no_soft_lock` | 2 clans, 100t, 10 seeds | liveness | terminates, turn monotonic |
| `trade_forms` | 2 merchants, complementary luxuries | mc-trade loop | ≥1 trade formed |
| `culture_borders_expand` | culture clan, 60t | mc-culture + borders | border tiles grow |
| `clan_fairness_band` | 6 clans, seeds 10000..10050 | balance | no clan win-rate > 0.40 |
| `game1_headless_systems_150t` | 4 clans, 150t, broad | regression umbrella | terminates + invariants |
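
The balance assertion in `clan_fairness_band` can be read as "the highest per-personality win rate across all games stays at or below the ceiling". The sketch below computes that number under loudly stated assumptions: each game yields one winning personality or `None` when the turn limit is hit without a victor, and the denominator is the total number of games. Neither detail is confirmed by this document, and the winners list is made up.

```python
from collections import Counter
from typing import Optional, Sequence


def clan_winrate_max(winners: Sequence[Optional[str]], personalities: Sequence[str]) -> float:
    """Highest win rate among the given personalities (wins / total games)."""
    games = len(winners)
    if games == 0:
        raise ValueError("need at least one game to compute a win rate")
    wins = Counter(w for w in winners if w is not None)
    return max(wins.get(p, 0) / games for p in personalities)


# Made-up outcomes for ten games; None marks a game with no victor.
winners = ["boom"] * 4 + ["turtle"] * 3 + [None] * 3
rate = clan_winrate_max(winners, ["boom", "turtle", "merchant"])
within_band = rate <= 0.4
```

With only a handful of games the estimate is noisy, which is why the scenario sweeps a range of seeds on the fleet instead of running a few locally.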

---

## Schema

`data/sim-scenarios/sim-scenarios.schema.json` validates both kinds. Key fields:

- `id` (snake_case), `kind`, `description`, `expect[]`: required. `version` (integer ≥ 1) is optional in the schema.
- **combat_setpiece:** `map.size`, optional `terrain_overrides[]`, `defender` / `attacker` each a *side* (`capital` with `population`/`is_last_city`, `buildings[]`, `garrison[]`, `stack[]`, `approach_from`, `flank`), `max_turns`. A stack entry is `{unit, count, at?, fortified?, formation?}`.
- **fullgame:** `map.{size,evolution_ticks,seed_base}`, `players[].personality` (real ids: `balanced boom expansionist merchant militarist rusher tech_rusher turtle`), `rules.{max_turns,victory_city_count,victory_disabled}`, `seeds` (array or `"sweep:A..B"`).
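
A minimal pre-flight check for the fields above might look like the following. It is a sketch and not a replacement for validating against the JSON Schema: the function names are hypothetical, the required list and both regular expressions are copied from the schema in this commit, and treating `"sweep:A..B"` as an inclusive range is an assumption the document does not confirm.

```python
import re

REQUIRED = ("id", "kind", "description", "expect")  # the schema's top-level "required"
KINDS = ("combat_setpiece", "fullgame")
ID_RE = re.compile(r"[a-z0-9_]+")
SWEEP_RE = re.compile(r"sweep:([0-9]+)\.\.([0-9]+)")


def check_required(scenario: dict) -> list[str]:
    """Return human-readable problems; an empty list means the basic shape is fine."""
    errors = [f"missing field: {key}" for key in REQUIRED if key not in scenario]
    if "id" in scenario and not ID_RE.fullmatch(str(scenario["id"])):
        errors.append("id must match ^[a-z0-9_]+$")
    if "kind" in scenario and scenario["kind"] not in KINDS:
        errors.append("kind must be combat_setpiece or fullgame")
    return errors


def expand_seeds(seeds) -> list[int]:
    """Turn either seed form into an explicit list of integers."""
    if isinstance(seeds, list):
        return list(seeds)
    match = SWEEP_RE.fullmatch(seeds)
    if match is None:
        raise ValueError(f"not a seed list or sweep string: {seeds!r}")
    low, high = int(match.group(1)), int(match.group(2))
    return list(range(low, high + 1))  # ASSUMPTION: both endpoints are included


fairness_seeds = expand_seeds("sweep:10000..10050")
```

Under the inclusive reading, `clan_fairness_band` runs 51 games; under an exclusive upper bound it would be 50, so the runner's actual convention is worth confirming before calibrating the win-rate ceiling.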

### Assertion vocabulary

- **Combat:** `capital_captured{by}`, `capital_held{by}`, `attacker_survivors{op,value}`, `defender_survivors{op,value}`, `attacker_losses`, `pvp_kills`, `capture_by_turn`.
- **Full-game:** `final_turn`, `terminates`, `turn_monotonic`, `no_nan_economy`, `population_non_negative`, `deterministic_end_hash`, `more_cities{player,than,min_margin}`, `city_count`, `total_pvp_combats`, `median_tier_peak`, `trades_formed`, `border_growth{player}`, `clan_winrate_max`.

Every signal maps to a real engine field: `cities_captured`/`cities_lost_total`, `TurnResult.pvp_kills`, surviving `PlayerState.units`, `gold`/`CityState.population`, `traded_luxuries`, tech tier.
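
For the assertions that carry `{op, value}`, evaluation reduces to comparing one observed number against the threshold. A minimal sketch follows, assuming the runner has already extracted the observed value from the engine; the `holds` function is hypothetical, while the operator set is the schema's `op` enum.

```python
import operator

# The five comparison operators allowed by the schema's `op` enum.
OPS = {
    ">=": operator.ge,
    ">": operator.gt,
    "==": operator.eq,
    "<=": operator.le,
    "<": operator.lt,
}


def holds(assertion: dict, observed: float) -> bool:
    """True when `observed <op> value` is satisfied for an {op, value} assertion."""
    compare = OPS[assertion["op"]]
    return compare(observed, assertion["value"])


# Two assertions copied from fortified_hill_hold, checked against made-up observations.
defenders_ok = holds({"type": "defender_survivors", "op": ">=", "value": 1}, observed=2)
attackers_ok = holds({"type": "attacker_survivors", "op": "<=", "value": 2}, observed=3)
```

An unknown operator raises `KeyError` here, which matches the schema's stance that anything outside the enum is invalid.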

---

## The calibration rule (integrity)

**Assertion `value`s are calibrated against real runs, never invented.** The workflow for a new scenario:

1. Author the JSON with the *intended* narrative and a placeholder threshold.
2. Run it (`sim_scenario <file>`), observe the **actual** outcome from the real resolver.
3. If the outcome matches the narrative → lock the threshold to reality. If it does not (e.g. walls don't actually make defense "easy") → that's a **finding**: adjust force sizes or report the balance gap. Never tune the assertion to pass against a number the sim didn't produce.

A scenario that goes green against a fabricated metric is the bug, not the goal. This is exactly the failure of the pre-calibration draft (it spawned no units yet asserted on a `t % 7` "combat" counter).

---

## Running

```sh
# local (pure Rust, no Godot; runs on plum/mac natively)
cargo run -p mc-sim --release --bin sim_scenario -- \
  public/games/age-of-dwarves/data/sim-scenarios/combat/rush_no_walls_capital_falls.json

# override seeds for a fullgame sweep
SEEDS=900,901,902 cargo run -p mc-sim --release --bin sim_scenario -- \
  public/games/age-of-dwarves/data/sim-scenarios/fullgame/time_to_tier.json

# on the DO fleet
./run dist:publish && ./run dist:up 10 && ./run dist:scenarios && ./run dist:down
```

Output: a `BatchResult` JSON per scenario (`scenario_id`, per-seed results, `passed_seeds`, `overall_pass`) on stdout; non-zero exit on failure.

@ -67,6 +67,7 @@ Modules live at `.claude/instructions/<file>.md` (symlink resolves to `tooling/c
| Worldgen pipeline overview: full stage sequence, crate ownership, TileMeta field inventory | `public/games/age-of-dwarves/docs/terrain/WORLDGEN_PIPELINE.md` |
| AI architecture, training pipeline, encoder, AlphaZero search, self-play league: `learned:*` controllers, coverage matrix | `docs/ai-production.md` (engineering) + `docs/ai-roadmap.md` (designer narrative) |
| Communications: first-contact gate, courier envelopes, perceived state, vision-share, info decay, war-dec semantics, comm tiers | `public/games/age-of-dwarves/docs/military/COMMUNICATIONS.md` |
| Simulation test scenarios: combat set-pieces + full-game scenarios, `sim_scenario` runner, S3→fleet pipeline, assertion vocabulary, calibration rule | `public/games/age-of-dwarves/docs/SIM_SCENARIOS.md` |

Index: `.claude/instructions/README.md`.