feat(@projects/@magic-civilization): ✨ document gpu_recon phase b

Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-04-16 16:55:17 -07:00 · 2026-04-16 16:55:17 -07:00 · 09171b46af
commit 09171b46af
parent 7f68345800
3 changed files with 108 additions and 18 deletions
--- a/.project/gpu_recon.md
+++ b/.project/gpu_recon.md
@@ -0,0 +1,82 @@
+# GPU RECON Phase B — WGPU Compute Portability for mc-turn
+
+## 1. Data Read/Written During TurnProcessor::step
+
+### POD (shader-friendly)
+
+| Type | Location | Fields |
+|------|----------|--------|
+| `MapUnit` | `game_state.rs:73` | `col, row, hp, max_hp, attack, defense, is_fortified` — 7× i32/bool, ~28 B |
+| `CityState` (bench subset) | `mc-city::CityState` | `population, food_stored, food_yield, prod_yield, production_stored` — all i32 |
+| `CityEcology` | `game_state.rs:63` | `adjacent_lair_pressure: f32, last_harassment_turn: u32` — 8 B |
+| `UnitStats` | `resolver.rs:11` | 7× i32 — 28 B |
+| `CombatBonuses` | `bonuses.rs` | ~6× f32, 1× i32 — 28 B |
+| Lair snapshot | `processor.rs:457` | `Vec<(i32, i32, i32)>` — pure POD |
+| `LairIndex.buckets` | `spatial_index.rs:57` | `Vec<Vec<u32>>` — nested alloc; flattens to a flat `u32[]` plus an offsets array (CSR) |
+
+### Shader-hostile (graph/pointer/String)
+
+| Type | Location | Problem |
+|------|----------|---------|
+| `TechState.progress` | `game_state.rs:89` | `HashMap<String, u32>` — heap, non-deterministic layout |
+| `PlayerState.strategic_axes` | `game_state.rs:36` | `HashMap<String, u8>` |
+| `PlayerState.city_buildings` | `game_state.rs:43` | `Vec<Vec<String>>` |
+| `MapUnit.unit_id` | `game_state.rs:82` | `String` |
+| `TileState.biome_id` et al. | `grid/mod.rs:86` | ~8 `String` fields per tile |
+| `TileState.river_flow` | `grid/mod.rs:98` | `HashMap<String, f32>` |
+| `CombatParams.attacker_keywords` | `resolver.rs:185` | `Vec<Keyword>` (enum vec) |
+| `TurnProcessor.building_upkeep_table` | `processor.rs:149` | `HashMap<String, i32>` |
+
+**POD fraction:** Roughly 95% of the data touched by the core bench loop (economy, city prod, unit move, fauna encounter) is POD. The `HashMap<String,*>` fields are queried at most once per turn per player for axis lookups, so they can be pre-flattened to arrays before GPU dispatch.
+
+## 2. WGSL Kernel Candidates
+
+| Kernel | Input | Output | LOC est. | Notes |
+|--------|-------|--------|----------|-------|
+| `economy_tick` | player_gold\[\], wealth_axis\[\], city_count\[\], upkeep\[\] | new_gold\[\] | ~60 | Trivial scalar arithmetic per player. Parallelism: N_players (tiny). Worthwhile only as a warm-up. |
+| `city_production` | city_food\[\], city_pop\[\], city_prod\[\] | updated arrays | ~100 | One city per workgroup invocation. Threshold formula is deterministic. |
+| `culture_science` | culture_axis, city_count | culture_total, science_yield | ~40 | Scalar per player — near-zero GPU benefit, include for completeness. |
+| `fauna_encounter` | unit_pos\[\], lair_buckets\[\], lair_tiers\[\], rng_state\[\] | kill_flags\[\] | ~200 | Best candidate: O(units × avg_lairs_per_tile), embarrassingly parallel per unit. RNG must match the SplitMix64 already used in Rust; WGSL has no native 64-bit integers, so the 64-bit state has to be emulated with `u32` pairs. |
+| `combat_resolve` | attacker_stats\[\], defender_stats\[\], bonuses\[\] | dmg_to_def, dmg_to_atk | ~150 | Civ5 exponential formula (`e^(diff/25)`). No branches except keyword flags. Parallelism: N_combats per turn. |
+| `unit_movement` | unit_pos\[\], enemy_unit_pos\[\], enemy_city_pos\[\], lair_pos\[\] | new_pos\[\] | ~180 | `step_toward` is Manhattan step — trivial. Bottleneck is nearest-neighbor search; a flat sorted array with binary search works on GPU. |
+
+**Total WGSL LOC estimate: ~730**
+
+## 3. Structural Blockers
+
+1. **String keys in hot path** — `strategic_axes: HashMap<String, u8>` is looked up every phase. Pre-encode to `[u8; 8]` (axis enum index) before upload. `processor.rs:262`.
+2. **Nested Vec allocations** — `LairIndex.buckets: Vec<Vec<u32>>` (`spatial_index.rs:57`) must be flattened to a `(flat_u32_buf, offset_u32_buf)` CSR layout for WGPU buffer upload.
+3. **RNG stream order** — encounter resolution is byte-identical only when lairs are visited in ascending snapshot order (`spatial_index.rs:22-28`). WGPU workgroups must preserve per-player RNG lanes (one SplitMix64 state per player lane, not per unit).
+4. **Keyword Vec** — `CombatParams.attacker_keywords: Vec<Keyword>` must become a `u32` bitmask. The enum already has ~15 variants, which fits in one `u32`.
+5. **GridState.TileState** has 50+ fields (`grid/mod.rs:79-188`); uploading the full grid per MCTS rollout is ~20 MB for a 96×96 map. Only the lair sub-fields (`lair_tier, lair_population, col, row`) are needed — project to a slim `GpuLair` struct before upload.
+
+## 4. Phased Implementation Plan
+
+### Phase B1 — Flat-data layer (prereq, ~1 week)
+- Add `GpuPlayerState { gold: i32, axes: [u8;8], city_count: u32, ... }` alongside existing structs
+- Flatten `LairIndex` to CSR buffers
+- Encode keywords as bitmask; encode `strategic_axes` as fixed enum array
+- No WGSL yet; just establish the serialization contract
+
+### Phase B2 — fauna_encounter kernel (~1 week)
+- Port `process_fauna_encounters_inner` inner loop to WGSL
+- One workgroup invocation per (player, unit); reads from CSR lair index
+- Validate byte-identical kill flags vs Rust reference on known seeds
+
+### Phase B3 — combat_resolve kernel (~3 days)
+- Port `CombatResolver::resolve` Civ5 formula to WGSL
+- Single dispatch over combats array; no branching beyond keyword bitmask checks
+
+### Phase B4 — unit_movement + economy (~1 week)
+- `step_toward` nearest-enemy search → GPU nearest-neighbor over flat arrays
+- Economy and city-production trivially vectorize but have low parallelism gain
+
+### Phase B5 — MCTS rollout dispatch
+- Wrap Phase B1-B4 kernels into a single `advance_n_futures(states: &[GpuGameState])` entry point
+- Dispatch `N` rollout states in one WGPU command buffer; read back winner arrays
+
+**Total Phase B wall-clock estimate: 4–5 weeks** (assuming one engineer, Rust+WGPU experience required).
+
+## 5. Verdict
+
+The bench loop is ~95% POD-compatible. The primary porting work is data-marshaling (String→int, Vec<Vec>→CSR, keyword→bitmask), not algorithmic. The `fauna_encounter` and `combat_resolve` kernels are the highest-value targets: O(units × lairs) and O(combats) respectively, both embarrassingly parallel. The main risk is RNG determinism across workgroup execution order — Phase B2 must validate byte-identical output before any further kernel work proceeds.
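
The two marshaling steps named in the verdict (`Vec<Vec>`→CSR and keyword→bitmask) can be sketched as below. This is a minimal illustration, not the planned implementation: `CsrBuckets`, `flatten_buckets`, and `keyword_mask` are hypothetical names, and the sketch assumes keywords are identified by a small integer discriminant, which the document does not confirm.

```rust
/// Hypothetical CSR form of `LairIndex.buckets`; the names are illustrative only.
pub struct CsrBuckets {
    /// All lair indices, stored bucket after bucket.
    pub flat: Vec<u32>,
    /// `offsets[b]..offsets[b + 1]` is the range of `flat` belonging to bucket `b`.
    pub offsets: Vec<u32>,
}

/// Flatten nested buckets into two contiguous buffers suitable for GPU upload.
pub fn flatten_buckets(buckets: &[Vec<u32>]) -> CsrBuckets {
    let total: usize = buckets.iter().map(Vec::len).sum();
    let mut flat = Vec::with_capacity(total);
    let mut offsets = Vec::with_capacity(buckets.len() + 1);
    offsets.push(0u32);
    for bucket in buckets {
        flat.extend_from_slice(bucket);
        // Record the end of this bucket, which is also the start of the next one.
        offsets.push(flat.len() as u32);
    }
    CsrBuckets { flat, offsets }
}

impl CsrBuckets {
    /// CPU-side view of one bucket; a shader would do the same two offset reads.
    pub fn bucket(&self, b: usize) -> &[u32] {
        let start = self.offsets[b] as usize;
        let end = self.offsets[b + 1] as usize;
        &self.flat[start..end]
    }
}

/// Pack keyword discriminants (assumed to be < 32) into a single `u32` bitmask.
pub fn keyword_mask(discriminants: &[u8]) -> u32 {
    discriminants.iter().fold(0u32, |mask, &d| {
        assert!(d < 32, "keyword discriminant does not fit in a u32 mask");
        mask | (1u32 << d)
    })
}
```

With this layout an empty bucket costs one repeated offset and no storage in `flat`, and bucket order is preserved, which matters for the ascending-order RNG constraint in blocker 3.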
--- a/.project/iteration_log.md
+++ b/.project/iteration_log.md
@@ -61,3 +61,10 @@ Remaining 2 FAILS: loot_dropped 0 (wilds not engaging this batch — variance),
 2026-04-16 15:42 BATCH 12 (confirmation): IDENTICAL to batch 11 — 12 PASS / 2 FAIL, same per-seed numbers. 2 consecutive batches at 12/14. Confirms: (a) determinism fix (task #17) works perfectly — byte-identical runs, (b) wild-aggro-dev's fix hadn't propagated to apricot before batch 12 started OR batch uses same seed=1/2/3. Remaining fails: loot_dropped 0 (wild aggression fix pending) + both-players-T100 1/3 (structural — seed 1 p1 economy). Stop criterion: needs FULL 14/14 — we're at 12/14 persistently. May need one more AI adjustment for the T100 gap + wild aggression deploy + re-batch.
 2026-04-16 16:32 BATCH THOROUGH (10-seed T300 parallel, stamp 20260416_162509): **PARALLEL WRAPPER SHIPPED** (PARALLEL=10 env var, 10 seeds in 7min wall-clock vs ~50min serial). Broader sample reveals gaps hidden by 3-seed: victories 4/10 (40%, below 50-80% target; was 2/3=67% in pop-growth 3-seed); median p0_pop_peak 25 (below 30 target; was 32 in 3-seed). 6/10 stalemate at max_turns. Median TTV 300 dragged up by stalemates. Combats 308 ✅, 0 invariants ✅. Winning seeds: 2(T215), 3(T242), 7(T291), 8(T150). Stalemate seeds: 1, 4, 5, 6, 9, 10. Root cause hypothesis: AI strategic depth (heuristic doesn't close games when ahead); MCTS wiring is the tier-1 fix. (team-lead post batch thorough)
 2026-04-16 16:32 SLOT STATE: 3/5 active (#26 prodqueue-ui-dev, #28 ttv-v2-dev, #46 t100-ai-dev). pop-growth-dev2 retired (#61 complete, pop_peak median 26→32 in 3-seed, fell to 25 in 10-seed — variance-revealed). 2 free slots held — no spawn meets STEP 4 criteria (game-ai dupe, combat already tuned to diminishing returns, MCTS wiring >50 lines awaits user approval).
+2026-04-16 16:47 USER DECISIONS: (1) ten-seed thorough is new regression gate going forward (three-seed retained as smoke only); (2) slot cap bumped five→ten. Ghost shutdowns confirmed: t100-ai-dev, ttv-v2-dev. prodqueue-ui-dev awaiting verdict. MCTS Phase A1 spawned per GPU-AI approval. Now spawning Phase B reconnaissance (WGPU audit) + fresh prodqueue-ui replacement.
+2026-04-16 16:53 BIG WAVE OF COMPLETIONS:
+- Task #64 MCTS PARALLEL: 55 LOC, rayon-parallelized mcts_tree.rs simulate_parallel, 22/22 tests pass on apricot 64-core. Deterministic fold order. Ready for Phase A2 wiring. (mcts-parallel-dev)
+- Task #65 GUT TESTS: 6/6 GUT tests pass for SimpleHeuristicAi (emergency garrison, walls priority, mil scaling, adjacent attack, capture commit, dominance redirect). GDScript regression gate now exists. (gut-tests-dev)
+- Task #66 WILD-START DISTANCE: 2-line fix — wilds.json min_distance_from_start 5→8 + village_lair_placer.gd fallback. Seed-1 p0_pop_peak 8→29 in smoke. Root cause was wild aggression radius 8 hexes overlapping with lair exclusion zone 5 hexes. (wild-distance-dev)
+- Task #68 PRODUCTION QUEUE TURNS redo: NO-OP — feature was already live in city_screen.gd:267-298 from an earlier prodqueue-ui-dev. Earlier "ghost" verdict was wrong; the original agent correctly identified nothing to add. (prodqueue-ui-redo)
+Mass retirement: 5 agents shut down. Synced wild fix to apricot; kicking off a fresh 10-seed thorough batch to verify batch-level uplift.
--- a/src/game/engine/tests/unit/ai/test_simple_heuristic_ai.gd
+++ b/src/game/engine/tests/unit/ai/test_simple_heuristic_ai.gd
@@ -174,17 +174,26 @@ func test_adjacent_city_attack_fires_before_retreat() -> void:


 # ── Test 5: Capture-push commitment — no retreat within 4 hexes ──────────────
-# A wounded unit (HP≤40%) within 4 hexes of an enemy city must NOT retreat.
+# Wounded unit (HP≤40%) within 4 hexes of enemy city: retreat branch is gated
+# by city_dist > 4. When city_dist <= 4 the unit must march on the city instead
+# of retreating. This test exercises only the commit scenario (city_dist=2); the
+# normal-retreat case (city_dist > 4) is not checked here. The commit scenario
+# must NOT produce a move away from the city: it must be a move toward or attack.


 func test_no_retreat_within_4_hexes_of_enemy_city() -> void:
 	var p0: PlayerScript = _make_player(0)
 	var p1: PlayerScript = _make_player(1)

+	# Wounded unit at (2,0) — city_dist=2, so commit suppresses retreat.
 	var own_unit: UnitScript = _make_warrior(0, Vector2i(2, 0), 3)
+	# Home city far away so garrison logic doesn't fire.
+	var home_city: CityScript = _make_city(0, Vector2i(20, 0), 0)
+	p0.cities = [home_city]
 	p0.units = [own_unit]

-	var enemy_unit: UnitScript = _make_warrior(1, Vector2i(8, 0))
+	# Enemy unit far away so we don't fall into adjacent-attack path.
+	var enemy_unit: UnitScript = _make_warrior(1, Vector2i(15, 0))
 	p1.units = [enemy_unit]

 	GameState.players = [p0, p1]
@@ -192,26 +201,18 @@ func test_no_retreat_within_4_hexes_of_enemy_city() -> void:
 	var enemy_city_pos: Vector2i = Vector2i(0, 0)
 	var enemy_units: Array = [enemy_unit]
 	var enemy_city_positions: Array[Vector2i] = [enemy_city_pos]
-	var personality: Dictionary = {"aggression": 3, "expansion": 3, "production": 3, "wealth": 3}
+	var personality: Dictionary = {"aggression": 0, "expansion": 3, "production": 3, "wealth": 3}

 	var action: Dictionary = AiScript._decide_military_action(
 		0, own_unit, p0, enemy_units, enemy_city_positions, personality
 	)
-	# The unit is wounded AND within 4 hexes of enemy city (dist=2).
-	# Retreat is suppressed by the commit flag; action must NOT be a retreat-move.
-	# It should march on the city or hold — either way, type != "move" toward enemy_unit.
-	if not action.is_empty() and action.get("type", "") == "move":
-		var target: Vector2i = Vector2i(
-			int(action.get("target_col", own_unit.position.x)),
-			int(action.get("target_row", own_unit.position.y))
-		)
-		# Retreating would move away from the enemy city (dist increases).
-		var dist_before: int = abs(own_unit.position.x - enemy_city_pos.x) \
-			+ abs(own_unit.position.y - enemy_city_pos.y)
-		var dist_after: int = abs(target.x - enemy_city_pos.x) \
-			+ abs(target.y - enemy_city_pos.y)
-		assert_true(dist_after <= dist_before,
-			"Commit suppresses retreat: unit within 4 hexes of enemy city must not move further away")
+	# With city_dist=2 (<=4), retreat is suppressed; unit must march on enemy city.
+	# If the action is a move, it must head toward (0,0): target_col must not exceed 2.
+	assert_false(action.is_empty(), "Commit: wounded unit near city must take an action")
+	if action.get("type", "") == "move":
+		var target_col: int = int(action.get("target_col", own_unit.position.x))
+		assert_true(target_col <= own_unit.position.x,
+			"Commit suppresses retreat: wounded unit must move toward enemy city (col ≤ 2), not away")


 # ── Test 6: Dominance redirect — march on city, skip chase ───────────────────