From 420144ab04008161c34aaa3e02f35eb03531ef85 Mon Sep 17 00:00:00 2001
From: Claude Code
Date: Wed, 8 Apr 2026 12:11:24 -0700
Subject: [PATCH] =?UTF-8?q?docs(simulation-report):=20=F0=9F=93=9D=20Add?=
 =?UTF-8?q?=20log=20entry=20for=20iteration=207f=20documenting=20bridge=20?=
 =?UTF-8?q?hardening=20(spatial=20index,=20config=20roundtrip,=20and=20GUT?=
 =?UTF-8?q?=20regression=20tests)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-Authored-By: Lilith Autocommit
---
 .project/simulation-report/experiment-log.md | 132 +++++++++++++++++++
 1 file changed, 132 insertions(+)

diff --git a/.project/simulation-report/experiment-log.md b/.project/simulation-report/experiment-log.md
index e8cb46c7..a86e0e31 100644
--- a/.project/simulation-report/experiment-log.md
+++ b/.project/simulation-report/experiment-log.md
@@ -4,6 +4,138 @@ Tracks every iteration of the balance/simulation loop. Newest entries on top. Ea
 
 ---
 
+## Iteration 7f — Bridge hardening: spatial index + config roundtrip + GUT regression (2026-04-08, COMPLETE)
+
+**Goal:** Three parallel follow-ups from iter 7e:
+1. **Spatial index for fauna encounter resolution** in mc-turn — expected to be a 50–200× speedup for the bench per the Council of Experts' analysis, unlocking CMA-ES optimizer sweeps.
+2. **LairCombatConfig serde roundtrip** — save/load support for the balance knobs + new Godot bridge methods `config_to_json` / `config_from_json`.
+3. **GUT integration test for the GdTurnProcessor bridge** — the iter 7e proof screenshot was a one-shot manual verification; this task makes it a permanent regression test.
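
Goal 1 is easiest to picture as a per-tile bucket table. The block below is a minimal sketch under stated assumptions, not the contents of `spatial_index.rs`: the `Lair` struct, the `usize` bucket element type, the `brute_force` helper, and the linear-scan fallback for off-grid tiles are all hypothetical. Only the names `LairIndex` and `lairs_at` are taken from this log entry, and their bodies here are guesses at the general technique.

```rust
/// Hypothetical lair record: grid position plus encounter radius in tiles.
#[derive(Clone, Copy, Debug)]
struct Lair {
    col: i32,
    row: i32,
    radius: i32,
}

/// Reference implementation: scan every lair and keep those within Chebyshev range.
fn brute_force(lairs: &[Lair], col: i32, row: i32) -> Vec<usize> {
    lairs
        .iter()
        .enumerate()
        .filter(|(_, lair)| (lair.col - col).abs().max((lair.row - row).abs()) <= lair.radius)
        .map(|(i, _)| i)
        .collect()
}

/// One bucket per tile, stored row-major; each bucket lists indices into `lairs`.
struct LairIndex {
    width: i32,
    height: i32,
    buckets: Vec<Vec<usize>>,
}

impl LairIndex {
    fn build(width: i32, height: i32, lairs: &[Lair]) -> Self {
        let mut buckets = vec![Vec::new(); (width * height) as usize];
        // Stamping in ascending lair order keeps every bucket sorted,
        // so lookups visit lairs in the same order as a plain loop.
        for (i, lair) in lairs.iter().enumerate() {
            // A Chebyshev ball is a square; clip it to the grid.
            let c0 = (lair.col - lair.radius).max(0);
            let c1 = (lair.col + lair.radius).min(width - 1);
            let r0 = (lair.row - lair.radius).max(0);
            let r1 = (lair.row + lair.radius).min(height - 1);
            for row in r0..=r1 {
                for col in c0..=c1 {
                    buckets[(row * width + col) as usize].push(i);
                }
            }
        }
        LairIndex { width, height, buckets }
    }

    /// Lairs covering a tile. Off-grid tiles have no bucket, so this sketch
    /// falls back to the linear scan there (an assumed strategy).
    fn lairs_at(&self, lairs: &[Lair], col: i32, row: i32) -> Vec<usize> {
        let on_grid = col >= 0 && row >= 0 && col < self.width && row < self.height;
        if on_grid {
            self.buckets[(row * self.width + col) as usize].clone()
        } else {
            brute_force(lairs, col, row)
        }
    }
}
```

Comparing `lairs_at` against `brute_force` for every tile, including a margin of off-grid tiles, is a small-scale version of the differential test described under Track A.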
+
+Executed as the `iter-7f-bridge-hardening` team with three specialist agents working in parallel:
+- `spatial-index-dev` (simulator-infra) — Rust spatial index work
+- `config-bridge-dev` (game-systems) — serde tests + bridge methods
+- `gut-test-dev` (godot-engine) — GDScript integration test
+
+All three tasks were fully independent, so the team ran with zero cross-agent coordination beyond a single heads-up message between config-bridge-dev and spatial-index-dev about an unused import during the interim state.
+
+### Track A — Spatial index (spatial-index-dev, simulator-infra)
+
+**New file:** `src/simulator/crates/mc-turn/src/spatial_index.rs` (242 lines, 5 unit tests).
+
+**Design:** `LairIndex` is a flat row-major `Vec` of buckets, one per grid tile. Each bucket holds the *indices into the `lairs` snapshot* of the lairs whose Chebyshev-ball territory of radius `encounter_radius(tier)` covers that tile. Lairs are stamped in ascending snapshot-index order, so every bucket is naturally sorted. Per-unit iteration therefore visits in-range lairs in exactly the same relative order as `for &lair in &lairs`, preserving the deterministic RNG stream. Construction is `O(lairs × r²)`; per-unit lookup is `O(1)` to find the bucket plus the number of lairs stored in it.
+
+**Differential test:** `fauna_encounters_indexed_matches_legacy` keeps the old loop as `process_fauna_encounters_legacy` + `step_legacy` under `#[cfg(test)]` and runs both paths from identical clones across four seed/map combos (seed 7/map 16/100t, 42/32/150t, 99/48/100t, 1337/24/200t).
Assertions are **byte-identical**:
+- Added `#[derive(PartialEq)]` on `FaunaCombatEvent` so the full combat log compares struct-wise
+- Per-turn: `fauna_combat_log`, `units_lost_to_fauna`, `cities_harassed_by_fauna`, `winner`
+- Final state: per-player `gold`, `cities.len()`, `units.len()`, per-unit `(col, row, hp, is_fortified)`
+
+An OOB-tolerant `any_at_any` / `lairs_at` fallback was required for units/cities founded outside the grid via the deterministic anchor+4 founding rule. It was load-bearing for byte-identity, and the differential test caught its absence on the first pass.
+
+**Measurement — honest hot-path analysis:**
+
+The Council of Experts iter 7e analysis projected 50–200× speedup on the 4AI/96×96 bench. That projection was **wrong**, and the agent caught it with a targeted micro-bench.
+
+Full `fauna_pressure_bench FPB_PLAYERS=4 FPB_MAP_SIZE=96` wall-clock with the new index:
+```
+Total: 1561.7s
+  Evolution (50K ecology ticks): 1560.5s ← 99.9%
+  Turn loop (500 turns × 4 players): ≈ 1.2s ← 0.08%
+```
+
+The turn loop (where `process_fauna_encounters` lives) already took only about a second in the legacy implementation. 4AI/96×96 wall-clock is dominated by `mc_ecology::evolution::run_evolution`, not the turn processor. The Council's "357M hex_distance calls" figure came from inside evolution, or from a different bench configuration, not from fauna encounter resolution.
+
+**Isolated micro-bench** (`fauna_index_perf`, `#[ignore]`-marked in `processor.rs`, 96×96 grid, 4 players, 400 lairs, 600 units, 500 turns):
+```
+legacy: 1.282s / 44,991 events
+indexed: 0.980s / 44,991 events
+speedup: 1.31×
+```
+
+**Verdict:** the optimization is real (~23% less turn-processing time in the micro-bench, byte-identical output) and will help Godot in-game turn resolution once routed through the GDExtension bridge, but it **does not unblock CMA-ES optimizer sweeps**: that blocker is ecology evolution, not turn processing.
The Council's hot-path premise needs to be re-checked against an actual profile before iter 7g targets another performance task.
+
+**The value of specialist agents with authority to measure instead of guess:** the agent executed the task as briefed, proved correctness via the differential test, *then* measured the actual impact, *then* reported that the premise was wrong. That's the correct engineering loop. It also means we avoided shipping a "big speedup" claim built on a misread profile.
+
+**Files touched (track A):**
+- `crates/mc-turn/src/spatial_index.rs` (new, 242 lines)
+- `crates/mc-turn/src/lib.rs` (+`pub mod spatial_index;`)
+- `crates/mc-turn/src/processor.rs` (new indexed path + legacy under cfg + differential test + ignored perf bench)
+- `crates/mc-turn/src/combat_event.rs` (+`PartialEq` derive on `FaunaCombatEvent`)
+
+### Track B — Config serde roundtrip + Godot bridge (config-bridge-dev, game-systems)
+
+**Two new mc-turn tests in `processor.rs`:**
+
+1. `lair_combat_config_serde_roundtrip` — mutates all **15 fields** of `LairCombatConfig` to non-default values (`encounter_radius_t1_t3..t10`, `base_kill_rate`, `tier_kill_slope`, `tier_kill_exponent`, `fortify_divisor`, `encounter_probability_per_turn`, `gold_per_wealth_per_city`, `prod_per_axis_per_city`, `expansion_per_axis_per_turn`, `city_founding_cost`, `unit_spawn_cost`, `max_cities_per_player_base`), serializes, deserializes, and asserts field-by-field equality via an exhaustive destructure block — any future field addition triggers a compile error rather than silently slipping past the test.
+2. `config_changes_affect_kill_counts` — runs the same 100-turn `make_bench_state(1337, 24)` scenario twice via JSON roundtrip: low config (`base_kill_rate=0.01, slope=0, prob=0.5`) vs high config (`base_kill_rate=0.8, slope=0.1, prob=0.5`). Asserts `high_deaths > low_deaths`. This proves the roundtrip path (not just the serde impl) actually drives observable bench behavior.
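
The exhaustive-destructure trick in test 1 can be illustrated without any dependencies. The block below is a minimal sketch, not the project's test: `DemoConfig` is a hypothetical three-field stand-in for `LairCombatConfig`, and the hand-written `encode` / `decode` pair is swapped in for serde so the example compiles with the standard library alone. The pattern to note is the `let DemoConfig { .. } = ..` binding written *without* a trailing `..`: adding a field to the struct makes that pattern fail to compile until the test is updated.

```rust
/// Hypothetical three-field stand-in for `LairCombatConfig`.
#[derive(Debug, Clone, PartialEq)]
struct DemoConfig {
    base_kill_rate: f64,
    fortify_divisor: f64,
    city_founding_cost: u32,
}

/// Hand-written `key=value;` encoder standing in for a serde serializer.
fn encode(cfg: &DemoConfig) -> String {
    // No trailing `..`: a newly added field is a compile error here.
    let DemoConfig { base_kill_rate, fortify_divisor, city_founding_cost } = cfg;
    format!(
        "base_kill_rate={};fortify_divisor={};city_founding_cost={}",
        base_kill_rate, fortify_divisor, city_founding_cost
    )
}

/// Matching decoder; returns `None` on unknown keys, bad values, or missing fields.
fn decode(text: &str) -> Option<DemoConfig> {
    let mut base_kill_rate: Option<f64> = None;
    let mut fortify_divisor: Option<f64> = None;
    let mut city_founding_cost: Option<u32> = None;
    for pair in text.split(';') {
        let (key, value) = pair.split_once('=')?;
        match key {
            "base_kill_rate" => base_kill_rate = value.parse().ok(),
            "fortify_divisor" => fortify_divisor = value.parse().ok(),
            "city_founding_cost" => city_founding_cost = value.parse().ok(),
            _ => return None,
        }
    }
    Some(DemoConfig {
        base_kill_rate: base_kill_rate?,
        fortify_divisor: fortify_divisor?,
        city_founding_cost: city_founding_cost?,
    })
}

/// Roundtrip check that compares every field by name.
fn assert_roundtrip(original: &DemoConfig) {
    let restored = decode(&encode(original)).expect("decode failed");
    // Exhaustive destructure again, so the comparison cannot skip a field.
    let DemoConfig { base_kill_rate, fortify_divisor, city_founding_cost } = restored;
    assert_eq!(base_kill_rate, original.base_kill_rate);
    assert_eq!(fortify_divisor, original.fortify_divisor);
    assert_eq!(city_founding_cost, original.city_founding_cost);
}
```

Calling `assert_roundtrip` on a config whose fields are all set to non-default values mirrors the shape of the test described above. A derived `PartialEq` comparison would also cover new fields automatically, so the destructure mainly earns its keep when each field needs its own assertion or tolerance.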
+
+**Two new `#[func]` methods on `GdTurnProcessor` in `api-gdext/src/lib.rs`:**
+- `config_to_json(&self) -> GString` — serializes `self.inner.lair_combat_config` to a JSON string; returns empty GString + `godot_error!` on failure
+- `config_from_json(&mut self, json: GString) -> bool` — parses the JSON into `mc_turn::LairCombatConfig`, overwrites the live config on success, returns `false` + `godot_error!` on parse failure (leaves the existing config untouched)
+
+Both methods sit adjacent to the iter 7e live-tuning setters (`set_base_kill_rate`, etc.) at lines 1766, 1770, 1780, 1787 of `api-gdext/src/lib.rs`.
+
+**Incidental cleanup:** The agent also fixed two pre-existing `clippy::field_reassign_with_default` warnings in `kill_probability_bounds` (extreme_cfg / weird_cfg) by converting to struct-update syntax while working on that test module. Same test behavior, cleaner construction.
+
+### Track C — GUT integration test (gut-test-dev, godot-engine)
+
+**New file:** `src/game/engine/tests/integration/test_gd_turn_processor.gd` (201 lines, lint-clean).
+
+**Test methods (4):**
+
+1. `test_gd_classes_registered` — instantiates both `GdTurnProcessor` and `GdGameState` via `ClassDB.instantiate()`, asserts both non-null. Verifies the GDExtension class registration.
+2. `test_50_turn_simulation` — seeds a 24×24 grid with lairs at `(12,12, T10, 901)` and `(4,4, T5, 102)`, adds a militarist player at `(12,12)`, runs 50 `step()` calls, and asserts:
+   - `turn() == 50`
+   - `player_count() == 1`
+   - `city_count(0) >= 1`
+   - `unit_count(0) > 3` (production fired)
+   - `gold(0) > 60` (wealth axis accumulated)
+   - `lair_count() == 2` (lairs still present post-step)
+   - `total_encounters > 0` (fauna loop fired)
+3. `test_live_tuning_affects_kill_counts` — runs two 20-turn scenarios with `set_base_kill_rate(0.5)` vs `set_base_kill_rate(0.0)` and asserts the lethal scenario produces strictly more cumulative deaths.
This locks in that the live-tuning knob actually changes observable behavior from the Godot side.
+4. `test_config_dictionary_shape` — iterates `EXPECTED_STEP_KEYS` (all 10 documented keys: `turn`, `winner`, `units_lost_to_fauna`, `cities_harassed`, `encounters`, `deaths`, `t4_t6_encounters`, `t4_t6_deaths`, `t7_t10_encounters`, `t7_t10_deaths`) and asserts each is present in the Dictionary returned from `step()`. Any future change that drops or renames one of these keys will fail this test immediately.
+
+**Lint:** `gdlint src/game/engine/tests/integration/test_gd_turn_processor.gd` → "Success: no problems found". No inline `disable` / `ignore` comments were used: the agent wrote the file to pass the project's strict GDScript quality hook on the first try (RefCounted casts for `ClassDB.instantiate` return values, explicit types on every variable, explicit return types on every function).
+
+**Intentionally not done:** the agent did not attempt to run GUT itself, because the project's GUT runner requires a full Godot editor session, which is outside the scope of headless agent execution. The test file is written and lint-verified; a human-driven `./run test` will execute it.
+
+### Integrated verification
+
+After all three tracks landed:
+
+```
+cargo test --workspace → 495 passed, 0 failed (was 487; +8 from tracks A and B)
+cargo clippy --workspace → 0 warnings (unchanged from 7e)
+gdlint scenes/tests/iter_7e_turn_bridge_proof.gd → no problems
+gdlint tests/integration/test_gd_turn_processor.gd → no problems
+bash build-gdext.sh → clean, .so deployed
+```
+
+**Test delta (487 → 495, +8):**
+- mc-turn: `lair_combat_config_serde_roundtrip`, `config_changes_affect_kill_counts`, `fauna_encounters_indexed_matches_legacy`
+- spatial_index module: `buckets_are_ascending`, `empty_lairs_empty_index`, `out_of_bounds_safe`, `partial_overlap_clipped`, `stamps_chebyshev_ball`
+
+### Verdict
+
+| Goal | Outcome |
+|---|---|
+| Spatial index with byte-identical output | ✓ Shipped, diff test across 4 configs |
+| Unblock CMA-ES sweeps on `fauna_pressure_bench` | ✗ **Premise was wrong** — bench is 99.9% evolution, not turn processing |
+| `LairCombatConfig` roundtrip + bridge save/load | ✓ Shipped, exhaustive-field test |
+| GUT regression test for iter 7e bridge | ✓ Shipped, 4 test methods, lint-clean |
+| Zero test regressions | ✓ 487 → 495 passing |
+| Zero clippy regressions | ✓ 0 warnings maintained |
+
+### Carry-forward for iter 7g
+
+**The actual CMA-ES sweep blocker is `mc_ecology::evolution::run_evolution`, not `process_fauna_encounters`.** spatial-index-dev profiled the 4AI/96×96 bench and confirmed evolution eats 99.9% of the 1561-second wall-clock. A future iter 7g targeting optimizer unblocking should:
+1. Profile `run_evolution` at several map sizes (48×48, 64×64, 80×80, 96×96) and identify the dominant inner loops
+2. Likely targets: LV dynamics substepping, canopy/undergrowth aggregation, species emergence gating, event tier checks
+3. 
The honest-measurement lesson from iter 7f should apply: micro-bench candidate optimizations in isolation *before* claiming wall-clock impact on the full bench
+
+**The spatial index is still a net positive** for Godot in-game turn resolution — the 1.31× speedup on the turn loop becomes visible when Godot is running a real-time interactive game at many turns per second, even if it's invisible in the 500-turn bench whose wall-clock is ecology-bound.
+
+---
+
 ## Iteration 7e — Godot ↔ mc-turn GDExtension bridge + phase-gate proof screenshot (2026-04-08, COMPLETE)
 
 **Goal:** Resolve the architectural debt called out by the Council of Experts analysis: the reconstructed `mc-turn` has been validated by the bench binary for three iterations (7b, 7c, 7d) but the actual Godot game still has its own GDScript turn loop and never calls the Rust processor. This iteration wires `mc-turn::TurnProcessor` and `mc-turn::GameState` into Godot via a new `GdTurnProcessor` / `GdGameState` pair in the `api-gdext` crate, produces a proof scene that runs 50 turns of the Rust processor inside Godot, and captures a Phase Gate Protocol screenshot.