feat(@projects/@magic-civilization): document gpu_recon phase b

Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
Natalie 2026-04-16 16:55:17 -07:00
parent 7f68345800
commit 09171b46af
3 changed files with 108 additions and 18 deletions

.project/gpu_recon.md Normal file

@ -0,0 +1,82 @@
# GPU RECON Phase B — WGPU Compute Portability for mc-turn
## 1. Data Read/Written During TurnProcessor::step
### POD (shader-friendly)
| Type | Location | Fields |
|------|----------|--------|
| `MapUnit` | `game_state.rs:73` | `col, row, hp, max_hp, attack, defense, is_fortified` — 7× i32/bool, ~28 B |
| `CityState` (bench subset) | `mc-city::CityState` | `population, food_stored, food_yield, prod_yield, production_stored` — all i32 |
| `CityEcology` | `game_state.rs:63` | `adjacent_lair_pressure: f32, last_harassment_turn: u32` — 8 B |
| `UnitStats` | `resolver.rs:11` | 7× i32 — 28 B |
| `CombatBonuses` | `bonuses.rs` | ~6× f32, 1× i32 — 28 B |
| Lair snapshot | `processor.rs:457` | `Vec<(i32, i32, i32)>` — pure POD |
| `LairIndex.buckets` | `spatial_index.rs:57` | `Vec<Vec<u32>>` — nested alloc, flattens to flat `u32[]` |
### Shader-hostile (graph/pointer/String)
| Type | Location | Problem |
|------|----------|---------|
| `TechState.progress` | `game_state.rs:89` | `HashMap<String, u32>` — heap, non-deterministic layout |
| `PlayerState.strategic_axes` | `game_state.rs:36` | `HashMap<String, u8>` |
| `PlayerState.city_buildings` | `game_state.rs:43` | `Vec<Vec<String>>` |
| `MapUnit.unit_id` | `game_state.rs:82` | `String` |
| `TileState.biome_id` et al. | `grid/mod.rs:86` | ~8 `String` fields per tile |
| `TileState.river_flow` | `grid/mod.rs:98` | `HashMap<String, f32>` |
| `CombatParams.attacker_keywords` | `resolver.rs:185` | `Vec<Keyword>` (enum vec) |
| `TurnProcessor.building_upkeep_table` | `processor.rs:149` | `HashMap<String, i32>` |
**POD fraction:** Core bench loop (economy, city prod, unit move, fauna encounter) touches ~95% POD. The `HashMap<String,*>` fields are queried at most once per turn per player for axis lookups — they can be pre-flattened to arrays before GPU dispatch.
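The pre-flattening step could look roughly like the sketch below. It is a minimal illustration, not the project's code: `AXIS_NAMES`, its ordering, and `flatten_axes` are hypothetical, and only some of the axis names appear elsewhere in this repo.

```rust
use std::collections::HashMap;

/// Hypothetical fixed axis order. The real axis list and ordering would come
/// from the game data; these eight names are placeholders for illustration.
pub const AXIS_NAMES: [&str; 8] = [
    "aggression", "expansion", "production", "wealth",
    "culture", "science", "defense", "faith",
];

/// Flatten a string-keyed axis map into a fixed-size array that can be
/// copied into a GPU buffer. Missing axes default to 0; keys that are not
/// in `AXIS_NAMES` are ignored.
pub fn flatten_axes(axes: &HashMap<String, u8>) -> [u8; 8] {
    let mut out = [0u8; 8];
    for (i, name) in AXIS_NAMES.iter().enumerate() {
        out[i] = axes.get(*name).copied().unwrap_or(0);
    }
    out
}
```

Because the array order is fixed, the result no longer depends on `HashMap` iteration order, which is what makes it safe to upload.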
## 2. WGSL Kernel Candidates
| Kernel | Input | Output | LOC est. | Notes |
|--------|-------|--------|----------|-------|
| `economy_tick` | player_gold\[\], wealth_axis\[\], city_count\[\], upkeep\[\] | new_gold\[\] | ~60 | Trivial scalar arithmetic per player. Parallelism: N_players (tiny). Worth only as warm-up. |
| `city_production` | city_food\[\], city_pop\[\], city_prod\[\] | updated arrays | ~100 | One city per workgroup invocation. Threshold formula is deterministic. |
| `culture_science` | culture_axis, city_count | culture_total, science_yield | ~40 | Scalar per player — near-zero GPU benefit, include for completeness. |
| `fauna_encounter` | unit_pos\[\], lair_buckets\[\], lair_tiers\[\], rng_state\[\] | kill_flags\[\] | ~200 | Best candidate: O(units × avg_lairs_per_tile), embarrassingly parallel per unit. RNG must be SplitMix64 (already used in Rust). |
| `combat_resolve` | attacker_stats\[\], defender_stats\[\], bonuses\[\] | dmg_to_def, dmg_to_atk | ~150 | Civ5 exponential formula (`e^(diff/25)`). No branches except keyword flags. Parallelism: N_combats per turn. |
| `unit_movement` | unit_pos\[\], enemy_unit_pos\[\], enemy_city_pos\[\], lair_pos\[\] | new_pos\[\] | ~180 | `step_toward` is Manhattan step — trivial. Bottleneck is nearest-neighbor search; a flat sorted array with binary search works on GPU. |
**Total WGSL LOC estimate: ~730**
## 3. Structural Blockers
1. **String keys in hot path:** `strategic_axes: HashMap<String, u8>` is looked up every phase (`processor.rs:262`). Pre-encode it to `[u8; 8]`, indexed by axis enum, before upload.
2. **Nested Vec allocations:** `LairIndex.buckets: Vec<Vec<u32>>` (`spatial_index.rs:57`) must be flattened to a `(flat_u32_buf, offset_u32_buf)` CSR layout for WGPU buffer upload.
3. **RNG stream order:** encounter resolution is byte-identical only when lairs are visited in ascending snapshot order (`spatial_index.rs:22-28`). WGPU workgroups must preserve per-player RNG lanes (one SplitMix64 state per player lane, not per unit).
4. **Keyword Vec:** `CombatParams.attacker_keywords: Vec<Keyword>` must become a `u32` bitmask. The enum has ~15 variants, so it fits in one `u32`.
5. **GridState.TileState** has 50+ fields (`grid/mod.rs:79-188`); uploading the full grid per MCTS rollout is ~20 MB for a 96×96 map. Only the lair sub-fields (`lair_tier, lair_population, col, row`) are needed — project to a slim `GpuLair` struct before upload.
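The CSR flattening named in blocker 2 can be sketched as follows. `CsrBuckets` and `flatten_buckets` are hypothetical names; the only thing assumed about the real `LairIndex` is the `Vec<Vec<u32>>` shape listed above.

```rust
/// CSR (compressed sparse row) form of nested buckets.
/// Bucket `i` occupies `values[offsets[i] .. offsets[i + 1]]`.
pub struct CsrBuckets {
    pub offsets: Vec<u32>, // length = bucket count + 1
    pub values: Vec<u32>,  // all bucket contents, concatenated in order
}

/// Flatten nested buckets into two flat arrays suitable for buffer upload.
pub fn flatten_buckets(buckets: &[Vec<u32>]) -> CsrBuckets {
    let total: usize = buckets.iter().map(Vec::len).sum();
    let mut offsets = Vec::with_capacity(buckets.len() + 1);
    let mut values = Vec::with_capacity(total);
    offsets.push(0);
    for bucket in buckets {
        values.extend_from_slice(bucket);
        offsets.push(values.len() as u32);
    }
    CsrBuckets { offsets, values }
}

impl CsrBuckets {
    /// CPU-side view of one bucket; a shader would do the same two
    /// offset reads and then loop over the range.
    pub fn bucket(&self, i: usize) -> &[u32] {
        let start = self.offsets[i] as usize;
        let end = self.offsets[i + 1] as usize;
        &self.values[start..end]
    }
}
```

Element order inside each bucket is preserved, so a kernel walking a bucket sees lairs in the same order as the Rust reference.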
## 4. Phased Implementation Plan
### Phase B1 — Flat-data layer (prereq, ~1 week)
- Add `GpuPlayerState { gold: i32, axes: [u8;8], city_count: u32, ... }` alongside existing structs
- Flatten `LairIndex` to CSR buffers
- Encode keywords as bitmask; encode `strategic_axes` as fixed enum array
- No WGSL yet; just establish the serialization contract
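As an illustration of the keyword encoding in this phase, here is a minimal sketch. The `Keyword` variants are placeholders (the real enum in `resolver.rs` is not shown in this document); the only assumption carried over is that there are at most 32 variants.

```rust
/// Placeholder keyword enum; variant names are hypothetical.
/// Each discriminant is the bit position used in the mask.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u32)]
pub enum Keyword {
    FirstStrike = 0,
    Flying = 1,
    Ranged = 2,
    Siege = 3,
}

/// Pack a keyword list into one u32, one bit per variant.
pub fn keyword_mask(keywords: &[Keyword]) -> u32 {
    keywords.iter().fold(0u32, |mask, &k| mask | (1u32 << (k as u32)))
}

/// Test a single keyword; the shader-side check is the same AND.
pub fn has_keyword(mask: u32, k: Keyword) -> bool {
    mask & (1u32 << (k as u32)) != 0
}
```

Duplicates in the input list collapse to a single bit, so the mask is order-independent.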
### Phase B2 — fauna_encounter kernel (~1 week)
- Port `process_fauna_encounters_inner` inner loop to WGSL
- One workgroup invocation per (player, unit); reads from CSR lair index
- Validate byte-identical kill flags vs Rust reference on known seeds
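One practical wrinkle for the byte-identical check: core WGSL has no 64-bit integer type, so a SplitMix64 port would likely carry its state as two `u32` halves. The sketch below emulates that in Rust using only 32-bit operations and can be compared against the native version. All names here are hypothetical; the project's own RNG code is not shown in this document.

```rust
/// Native SplitMix64 step, used as the CPU reference.
pub fn splitmix64_next(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// A 64-bit value as two u32 halves, the shape a shader port would use.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct U64Pair { pub hi: u32, pub lo: u32 }

impl U64Pair {
    pub fn from_u64(v: u64) -> Self { Self { hi: (v >> 32) as u32, lo: v as u32 } }
    pub fn to_u64(self) -> u64 { ((self.hi as u64) << 32) | (self.lo as u64) }
}

fn add64(a: U64Pair, b: U64Pair) -> U64Pair {
    let lo = a.lo.wrapping_add(b.lo);
    let carry = (lo < a.lo) as u32; // low half wrapped around
    U64Pair { hi: a.hi.wrapping_add(b.hi).wrapping_add(carry), lo }
}

fn xor64(a: U64Pair, b: U64Pair) -> U64Pair {
    U64Pair { hi: a.hi ^ b.hi, lo: a.lo ^ b.lo }
}

/// Logical right shift; only valid for 0 < n < 32, which covers 30, 27, 31.
fn shr64(a: U64Pair, n: u32) -> U64Pair {
    U64Pair { hi: a.hi >> n, lo: (a.lo >> n) | (a.hi << (32 - n)) }
}

/// Full 32x32 -> 64 multiply built from 16-bit partial products.
fn mul32_wide(a: u32, b: u32) -> (u32, u32) {
    let (a0, a1) = (a & 0xFFFF, a >> 16);
    let (b0, b1) = (b & 0xFFFF, b >> 16);
    let (p00, p01, p10, p11) = (a0 * b0, a0 * b1, a1 * b0, a1 * b1);
    let mid = (p00 >> 16) + (p01 & 0xFFFF) + (p10 & 0xFFFF);
    let lo = (p00 & 0xFFFF) | (mid << 16);
    let hi = p11 + (p01 >> 16) + (p10 >> 16) + (mid >> 16);
    (hi, lo)
}

/// Low 64 bits of a 64x64 product.
fn mul64(a: U64Pair, b: U64Pair) -> U64Pair {
    let (hi, lo) = mul32_wide(a.lo, b.lo);
    let cross = a.lo.wrapping_mul(b.hi).wrapping_add(a.hi.wrapping_mul(b.lo));
    U64Pair { hi: hi.wrapping_add(cross), lo }
}

/// SplitMix64 step using only 32-bit operations.
pub fn splitmix64_next_pair(state: &mut U64Pair) -> U64Pair {
    const GAMMA: U64Pair = U64Pair { hi: 0x9E37_79B9, lo: 0x7F4A_7C15 };
    const M1: U64Pair = U64Pair { hi: 0xBF58_476D, lo: 0x1CE4_E5B9 };
    const M2: U64Pair = U64Pair { hi: 0x94D0_49BB, lo: 0x1331_11EB };
    *state = add64(*state, GAMMA);
    let mut z = *state;
    z = mul64(xor64(z, shr64(z, 30)), M1);
    z = mul64(xor64(z, shr64(z, 27)), M2);
    xor64(z, shr64(z, 31))
}
```

Running both versions side by side over a few seeds is a cheap CPU-only pre-check before any shader exists; the same helper functions translate almost line for line to `vec2<u32>` arithmetic.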
### Phase B3 — combat_resolve kernel (~3 days)
- Port `CombatResolver::resolve` Civ5 formula to WGSL
- Single dispatch over combats array; no branching beyond keyword bitmask checks
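For reference, the `e^(diff/25)` shape mentioned for this kernel can be sketched as below. Only the exponent form comes from this document; the base damage of 30, the symmetric treatment of attacker and defender, and the function name are assumptions, and keyword bonuses are left out.

```rust
/// Hypothetical base damage at equal strength; not taken from the resolver.
const BASE_DAMAGE: f32 = 30.0;

/// Returns (damage to defender, damage to attacker) for one combat.
/// Branch-free, so it maps directly onto one GPU invocation per combat.
pub fn resolve_damage(attacker_strength: f32, defender_strength: f32) -> (f32, f32) {
    let diff = attacker_strength - defender_strength;
    let dmg_to_def = BASE_DAMAGE * (diff / 25.0).exp();
    let dmg_to_atk = BASE_DAMAGE * (-diff / 25.0).exp();
    (dmg_to_def, dmg_to_atk)
}
```

GPU `exp` implementations are generally not guaranteed to match the CPU bit for bit, so validation for this kernel may need a tolerance, or an integer rounding step applied identically on both sides, rather than exact float equality.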
### Phase B4 — unit_movement + economy (~1 week)
- `step_toward` nearest-enemy search → GPU nearest-neighbor over flat arrays
- Economy and city-production trivially vectorize but have low parallelism gain
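A rough CPU-side sketch of the movement logic above, assuming `step_toward` moves one tile along the axis with the larger remaining distance and that ties in the nearest-target search go to the lowest index. Both rules are assumptions; the actual `step_toward` is not shown in this document.

```rust
/// Manhattan distance between two (col, row) positions.
pub fn manhattan(a: (i32, i32), b: (i32, i32)) -> i32 {
    (a.0 - b.0).abs() + (a.1 - b.1).abs()
}

/// Index of the nearest target, or None if there are none.
/// Ties resolve to the lowest index so the result is deterministic.
pub fn nearest(from: (i32, i32), targets: &[(i32, i32)]) -> Option<usize> {
    targets
        .iter()
        .enumerate()
        .min_by_key(|(i, t)| (manhattan(from, **t), *i))
        .map(|(i, _)| i)
}

/// Move one tile toward `to` along the axis with the larger gap.
pub fn step_toward(from: (i32, i32), to: (i32, i32)) -> (i32, i32) {
    let dx = to.0 - from.0;
    let dy = to.1 - from.1;
    if dx == 0 && dy == 0 {
        return from; // already at the target
    }
    if dx.abs() >= dy.abs() {
        (from.0 + dx.signum(), from.1)
    } else {
        (from.0, from.1 + dy.signum())
    }
}
```

The linear scan in `nearest` is the part that would be replaced by a spatially sorted lookup on GPU; whatever replaces it has to keep the same tie-break to stay byte-identical with the reference.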
### Phase B5 — MCTS rollout dispatch
- Wrap Phase B1-B4 kernels into a single `advance_n_futures(states: &[GpuGameState])` entry point
- Dispatch `N` rollout states in one WGPU command buffer; read back winner arrays
**Total Phase B wall-clock estimate: 4-5 weeks** (assuming one engineer with Rust and WGPU experience).
## 5. Verdict
The bench loop is ~95% POD-compatible. The primary porting work is data-marshaling (String→int, Vec<Vec>→CSR, keyword→bitmask), not algorithmic. The `fauna_encounter` and `combat_resolve` kernels are the highest-value targets: O(units × lairs) and O(combats) respectively, both embarrassingly parallel. The main risk is RNG determinism across workgroup execution order — Phase B2 must validate byte-identical output before any further kernel work proceeds.


@ -61,3 +61,10 @@ Remaining 2 FAILS: loot_dropped 0 (wilds not engaging this batch — variance),
2026-04-16 15:42 BATCH 12 (confirmation): IDENTICAL to batch 11 — 12 PASS / 2 FAIL, same per-seed numbers. 2 consecutive batches at 12/14. Confirms: (a) determinism fix (task #17) works perfectly — byte-identical runs, (b) wild-aggro-dev's fix hadn't propagated to apricot before batch 12 started OR batch uses same seed=1/2/3. Remaining fails: loot_dropped 0 (wild aggression fix pending) + both-players-T100 1/3 (structural — seed 1 p1 economy). Stop criterion: needs FULL 14/14 — we're at 12/14 persistently. May need one more AI adjustment for the T100 gap + wild aggression deploy + re-batch.
2026-04-16 16:32 BATCH THOROUGH (10-seed T300 parallel, stamp 20260416_162509): **PARALLEL WRAPPER SHIPPED** (PARALLEL=10 env var, 10 seeds in 7min wall-clock vs ~50min serial). Broader sample reveals gaps hidden by 3-seed: victories 4/10 (40%, below 50-80% target; was 2/3=67% in pop-growth 3-seed); median p0_pop_peak 25 (below 30 target; was 32 in 3-seed). 6/10 stalemate at max_turns. Median TTV 300 dragged up by stalemates. Combats 308 ✅, 0 invariants ✅. Winning seeds: 2(T215), 3(T242), 7(T291), 8(T150). Stalemate seeds: 1, 4, 5, 6, 9, 10. Root cause hypothesis: AI strategic depth (heuristic doesn't close games when ahead); MCTS wiring is the tier-1 fix. (team-lead post batch thorough)
2026-04-16 16:32 SLOT STATE: 3/5 active (#26 prodqueue-ui-dev, #28 ttv-v2-dev, #46 t100-ai-dev). pop-growth-dev2 retired (#61 complete, pop_peak median 26→32 in 3-seed, fell to 25 in 10-seed — variance-revealed). 2 free slots held — no spawn meets STEP 4 criteria (game-ai dupe, combat already tuned to diminishing returns, MCTS wiring >50 lines awaits user approval).
2026-04-16 16:47 USER DECISIONS: (1) ten-seed thorough is new regression gate going forward (three-seed retained as smoke only); (2) slot cap bumped five→ten. Ghost shutdowns confirmed: t100-ai-dev, ttv-v2-dev. prodqueue-ui-dev awaiting verdict. MCTS Phase A1 spawned per GPU-AI approval. Now spawning Phase B reconnaissance (WGPU audit) + fresh prodqueue-ui replacement.
2026-04-16 16:53 BIG WAVE OF COMPLETIONS:
- Task #64 MCTS PARALLEL: 55 LOC, rayon-parallelized mcts_tree.rs simulate_parallel, 22/22 tests pass on apricot 64-core. Deterministic fold order. Ready for Phase A2 wiring. (mcts-parallel-dev)
- Task #65 GUT TESTS: 6/6 GUT tests pass for SimpleHeuristicAi (emergency garrison, walls priority, mil scaling, adjacent attack, capture commit, dominance redirect). GDScript regression gate now exists. (gut-tests-dev)
- Task #66 WILD-START DISTANCE: 2-line fix — wilds.json min_distance_from_start 5→8 + village_lair_placer.gd fallback. Seed-1 p0_pop_peak 8→29 in smoke. Root cause was wild aggression radius 8 hexes overlapping with lair exclusion zone 5 hexes. (wild-distance-dev)
- Task #68 PRODUCTION QUEUE TURNS redo: NO-OP — feature was already live in city_screen.gd:267-298 from an earlier prodqueue-ui-dev. Earlier "ghost" verdict was wrong; the original agent correctly identified nothing to add. (prodqueue-ui-redo)
Mass retirement: 5 agents shutdown. Synced wild fix to apricot, kicking fresh 10-seed thorough batch to verify batch-level uplift.


@ -174,17 +174,26 @@ func test_adjacent_city_attack_fires_before_retreat() -> void:
# ── Test 5: Capture-push commitment — no retreat within 4 hexes ──────────────
# A wounded unit (HP≤40%) within 4 hexes of an enemy city must NOT retreat.
# Wounded unit (HP≤40%) within 4 hexes of enemy city: retreat branch is gated
# by city_dist > 4. When city_dist <= 4 the unit must march on the city instead
# of retreating. We verify by comparing two scenarios: city_dist=2 (commit active)
# vs city_dist=10 (normal retreat). The commit scenario must NOT produce a retreat
# action away from the city — it must be a move toward or attack.
func test_no_retreat_within_4_hexes_of_enemy_city() -> void:
var p0: PlayerScript = _make_player(0)
var p1: PlayerScript = _make_player(1)
# Wounded unit at (2,0) — city_dist=2, so commit suppresses retreat.
var own_unit: UnitScript = _make_warrior(0, Vector2i(2, 0), 3)
# Home city far away so garrison logic doesn't fire.
var home_city: CityScript = _make_city(0, Vector2i(20, 0), 0)
p0.cities = [home_city]
p0.units = [own_unit]
var enemy_unit: UnitScript = _make_warrior(1, Vector2i(8, 0))
# Enemy unit far away so we don't fall into adjacent-attack path.
var enemy_unit: UnitScript = _make_warrior(1, Vector2i(15, 0))
p1.units = [enemy_unit]
GameState.players = [p0, p1]
@ -192,26 +201,18 @@ func test_no_retreat_within_4_hexes_of_enemy_city() -> void:
var enemy_city_pos: Vector2i = Vector2i(0, 0)
var enemy_units: Array = [enemy_unit]
var enemy_city_positions: Array[Vector2i] = [enemy_city_pos]
var personality: Dictionary = {"aggression": 3, "expansion": 3, "production": 3, "wealth": 3}
var personality: Dictionary = {"aggression": 0, "expansion": 3, "production": 3, "wealth": 3}
var action: Dictionary = AiScript._decide_military_action(
0, own_unit, p0, enemy_units, enemy_city_positions, personality
)
# The unit is wounded AND within 4 hexes of enemy city (dist=2).
# Retreat is suppressed by the commit flag; action must NOT be a retreat-move.
# It should march on the city or hold — either way, type != "move" toward enemy_unit.
if not action.is_empty() and action.get("type", "") == "move":
var target: Vector2i = Vector2i(
int(action.get("target_col", own_unit.position.x)),
int(action.get("target_row", own_unit.position.y))
)
# Retreating would move away from the enemy city (dist increases).
var dist_before: int = abs(own_unit.position.x - enemy_city_pos.x) \
+ abs(own_unit.position.y - enemy_city_pos.y)
var dist_after: int = abs(target.x - enemy_city_pos.x) \
+ abs(target.y - enemy_city_pos.y)
assert_true(dist_after <= dist_before,
"Commit suppresses retreat: unit within 4 hexes of enemy city must not move further away")
# With city_dist=2 (<=4), retreat is suppressed; unit must march on enemy city.
# Action must be a move toward (0,0), meaning target_col should decrease from 2.
assert_false(action.is_empty(), "Commit: wounded unit near city must take an action")
if action.get("type", "") == "move":
var target_col: int = int(action.get("target_col", own_unit.position.x))
assert_true(target_col <= own_unit.position.x,
"Commit suppresses retreat: wounded unit must move toward enemy city (col ≤ 2), not away")
# ── Test 6: Dominance redirect — march on city, skip chase ───────────────────