feat(@projects/@magic-civilization): ✨ document gpu_recon phase b
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
parent
7f68345800
commit
09171b46af
3 changed files with 108 additions and 18 deletions
82
.project/gpu_recon.md
Normal file
@ -0,0 +1,82 @@
# GPU RECON Phase B — WGPU Compute Portability for mc-turn
## 1. Data Read/Written During TurnProcessor::step
### POD (shader-friendly)
| Type | Location | Fields |
|------|----------|--------|
| `MapUnit` | `game_state.rs:73` | `col, row, hp, max_hp, attack, defense, is_fortified` (7× i32/bool, ~28 B) |
| `CityState` (bench subset) | `mc-city::CityState` | `population, food_stored, food_yield, prod_yield, production_stored` (all i32) |
| `CityEcology` | `game_state.rs:63` | `adjacent_lair_pressure: f32, last_harassment_turn: u32` (8 B) |
| `UnitStats` | `resolver.rs:11` | 7× i32 (28 B) |
| `CombatBonuses` | `bonuses.rs` | ~6× f32, 1× i32 (28 B) |
| Lair snapshot | `processor.rs:457` | `Vec<(i32, i32, i32)>` (pure POD) |
| `LairIndex.buckets` | `spatial_index.rs:57` | `Vec<Vec<u32>>` (nested alloc; flattens to a flat `u32[]`) |
### Shader-hostile (graph/pointer/String)
| Type | Location | Problem |
|------|----------|---------|
| `TechState.progress` | `game_state.rs:89` | `HashMap<String, u32>` (heap-allocated, non-deterministic layout) |
| `PlayerState.strategic_axes` | `game_state.rs:36` | `HashMap<String, u8>` |
| `PlayerState.city_buildings` | `game_state.rs:43` | `Vec<Vec<String>>` |
| `MapUnit.unit_id` | `game_state.rs:82` | `String` |
| `TileState.biome_id` et al. | `grid/mod.rs:86` | ~8 `String` fields per tile |
| `TileState.river_flow` | `grid/mod.rs:98` | `HashMap<String, f32>` |
| `CombatParams.attacker_keywords` | `resolver.rs:185` | `Vec<Keyword>` (enum vec) |
| `TurnProcessor.building_upkeep_table` | `processor.rs:149` | `HashMap<String, i32>` |

**POD fraction:** The core bench loop (economy, city prod, unit move, fauna encounter) touches ~95% POD data. The `HashMap<String,*>` fields are queried at most once per turn per player for axis lookups, so they can be pre-flattened to arrays before GPU dispatch.
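A minimal sketch of that pre-flattening step, assuming a fixed slot order for the axes. The `AXIS_NAMES` table and the `flatten_axes` helper are hypothetical, and the axis names are illustrative rather than taken from `game_state.rs`.

```rust
use std::collections::HashMap;

/// Hypothetical slot order for the GPU-side `[u8; 8]`; not taken from the codebase.
const AXIS_NAMES: [&str; 8] = [
    "aggression", "expansion", "production", "wealth",
    "culture", "science", "reserved_6", "reserved_7",
];

/// Copy each known axis into its fixed slot; axes missing from the map default to 0.
/// Keys that are not in `AXIS_NAMES` are ignored.
fn flatten_axes(axes: &HashMap<String, u8>) -> [u8; 8] {
    let mut out = [0u8; 8];
    for (slot, name) in AXIS_NAMES.iter().enumerate() {
        out[slot] = axes.get(*name).copied().unwrap_or(0);
    }
    out
}
```

Because the slot order is fixed, the result does not depend on `HashMap` iteration order, which is what makes the array safe to upload as a plain buffer.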
## 2. WGSL Kernel Candidates
| Kernel | Input | Output | LOC est. | Notes |
|--------|-------|--------|----------|-------|
| `economy_tick` | player_gold[], wealth_axis[], city_count[], upkeep[] | new_gold[] | ~60 | Trivial scalar arithmetic per player. Parallelism: N_players (tiny). Worthwhile only as a warm-up. |
| `city_production` | city_food[], city_pop[], city_prod[] | updated arrays | ~100 | One city per workgroup invocation. Threshold formula is deterministic. |
| `culture_science` | culture_axis, city_count | culture_total, science_yield | ~40 | Scalar per player; near-zero GPU benefit, included for completeness. |
| `fauna_encounter` | unit_pos[], lair_buckets[], lair_tiers[], rng_state[] | kill_flags[] | ~200 | Best candidate: O(units × avg_lairs_per_tile), embarrassingly parallel per unit. RNG must be SplitMix64 (already used in Rust). |
| `combat_resolve` | attacker_stats[], defender_stats[], bonuses[] | dmg_to_def, dmg_to_atk | ~150 | Civ5 exponential formula (`e^(diff/25)`). No branches except keyword flags. Parallelism: N_combats per turn. |
| `unit_movement` | unit_pos[], enemy_unit_pos[], enemy_city_pos[], lair_pos[] | new_pos[] | ~180 | `step_toward` is a Manhattan step, which is trivial. The bottleneck is the nearest-neighbor search; a flat sorted array with binary search works on GPU. |

**Total WGSL LOC estimate: ~730**
## 3. Structural Blockers
1. **String keys in hot path:** `strategic_axes: HashMap<String, u8>` is looked up every phase (`processor.rs:262`). Pre-encode it to `[u8; 8]` (indexed by axis enum) before upload.
2. **Nested Vec allocations:** `LairIndex.buckets: Vec<Vec<u32>>` (`spatial_index.rs:57`) must be flattened to a `(flat_u32_buf, offset_u32_buf)` CSR layout for WGPU buffer upload.
3. **RNG stream order:** encounter resolution is byte-identical only when lairs are visited in ascending snapshot order (`spatial_index.rs:22-28`). WGPU workgroups must preserve per-player RNG lanes (one SplitMix64 state per player lane, not per unit).
4. **Keyword Vec:** `CombatParams.attacker_keywords: Vec<Keyword>` must become a `u32` bitmask. The enum currently has ~15 variants, so it fits in one `u32`.
5. **GridState.TileState** has 50+ fields (`grid/mod.rs:79-188`); uploading the full grid per MCTS rollout is ~20 MB for a 96×96 map. Only the lair sub-fields (`lair_tier, lair_population, col, row`) are needed, so project them to a slim `GpuLair` struct before upload.
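The CSR layout named in blocker 2 can be sketched as follows. This is an illustration under assumptions, not the project's code: `flatten_to_csr` and `bucket` are hypothetical helper names, and the two returned vectors stand in for `flat_u32_buf` and `offset_u32_buf`.

```rust
/// Flatten nested buckets into CSR form: (values, offsets).
/// `offsets` has `buckets.len() + 1` entries, and bucket `i` is
/// `values[offsets[i]..offsets[i + 1]]`.
fn flatten_to_csr(buckets: &[Vec<u32>]) -> (Vec<u32>, Vec<u32>) {
    let total: usize = buckets.iter().map(Vec::len).sum();
    let mut values = Vec::with_capacity(total);
    let mut offsets = Vec::with_capacity(buckets.len() + 1);
    offsets.push(0u32);
    for bucket in buckets {
        values.extend_from_slice(bucket);
        // The running length marks where the next bucket starts.
        offsets.push(values.len() as u32);
    }
    (values, offsets)
}

/// Read bucket `i` back out of the flat buffers, as a shader would by index.
fn bucket<'a>(values: &'a [u32], offsets: &[u32], i: usize) -> &'a [u32] {
    &values[offsets[i] as usize..offsets[i + 1] as usize]
}
```

Empty buckets cost one repeated offset and no value storage, so sparse lair grids stay compact.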
## 4. Phased Implementation Plan
### Phase B1 — Flat-data layer (prereq, ~1 week)
- Add `GpuPlayerState { gold: i32, axes: [u8;8], city_count: u32, ... }` alongside the existing structs
- Flatten `LairIndex` to CSR buffers
- Encode keywords as a bitmask; encode `strategic_axes` as a fixed enum-indexed array
- No WGSL yet; just establish the serialization contract
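The keyword encoding from this phase might look like the sketch below. The `Keyword` variants are placeholders, since the real enum in `resolver.rs` is not shown in this commit; `keywords_to_mask` and `has_keyword` are hypothetical helper names.

```rust
/// Placeholder variants; the real `Keyword` enum has ~15 and is not listed here.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u32)]
enum Keyword {
    FirstStrike = 0,
    Flying = 1,
    Ranged = 2,
    Siege = 3,
}

/// Set one bit per keyword; valid while the enum has at most 32 variants.
fn keywords_to_mask(keywords: &[Keyword]) -> u32 {
    keywords
        .iter()
        .fold(0u32, |mask, &k| mask | (1u32 << (k as u32)))
}

/// Branch-free membership test, the same expression a WGSL kernel could use.
fn has_keyword(mask: u32, k: Keyword) -> bool {
    mask & (1u32 << (k as u32)) != 0
}
```

Duplicate keywords in the input collapse to a single bit, so the mask is order-independent.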
### Phase B2 — fauna_encounter kernel (~1 week)
- Port the inner loop of `process_fauna_encounters_inner` to WGSL
- One workgroup invocation per (player, unit); reads from the CSR lair index
- Validate byte-identical kill flags against the Rust reference on known seeds
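A CPU-side sketch of the per-player RNG lane and the flag comparison, under stated assumptions. The SplitMix64 step uses the standard published constants; the lane seeding in `lane_for`, the `kill_pct` odds, and the helper names are hypothetical. Core WGSL has no 64-bit integer type, so a shader port would presumably need to emulate the `u64` state with `u32` words, which is not shown here. How a per-(player, unit) dispatch shares one per-player stream is also left open; the sketch keeps the sequential form.

```rust
/// SplitMix64 generator (standard constants).
#[derive(Clone, Copy)]
struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Hypothetical lane seeding: one independent stream per player.
fn lane_for(seed: u64, player: u32) -> SplitMix64 {
    SplitMix64 {
        state: seed ^ u64::from(player).wrapping_mul(0x9E37_79B9_7F4A_7C15),
    }
}

/// Draw one kill flag per unit, in ascending unit order, from the player's lane.
/// `kill_pct` is a placeholder for the real encounter odds.
fn kill_flags(seed: u64, player: u32, unit_count: usize, kill_pct: u64) -> Vec<bool> {
    let mut rng = lane_for(seed, player);
    (0..unit_count)
        .map(|_| rng.next_u64() % 100 < kill_pct)
        .collect()
}

/// Flag-by-flag comparison of a reference run against a candidate run.
fn identical(reference: &[bool], candidate: &[bool]) -> bool {
    reference == candidate
}
```

Since each lane depends only on `(seed, player)`, the order in which players are resolved does not change any lane's output; only the order of draws within a lane matters.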
### Phase B3 — combat_resolve kernel (~3 days)
- Port the Civ5 formula in `CombatResolver::resolve` to WGSL
- Single dispatch over the combats array; no branching beyond keyword bitmask checks
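The exponential shape `e^(diff/25)` can be illustrated with a small reference function. This is a sketch only: `BASE_DAMAGE`, the use of a plain strength difference, and the symmetric return value are assumptions, and the real `CombatResolver::resolve` also applies bonuses and keywords that are omitted here.

```rust
/// Hypothetical damage at equal strength; the real constant is not given in this document.
const BASE_DAMAGE: f32 = 30.0;

/// Returns (dmg_to_def, dmg_to_atk) for one combat.
/// Each side's damage scales with e^(+diff/25) or e^(-diff/25).
fn combat_damage(attacker_strength: f32, defender_strength: f32) -> (f32, f32) {
    let diff = attacker_strength - defender_strength;
    let dmg_to_def = BASE_DAMAGE * (diff / 25.0).exp();
    let dmg_to_atk = BASE_DAMAGE * (-diff / 25.0).exp();
    (dmg_to_def, dmg_to_atk)
}
```

One caveat for the byte-identical goal: `exp` on `f32` is not guaranteed to match bit for bit between a CPU and a GPU, so a port may need to compare with a tolerance or round to integers before comparing.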
### Phase B4 — unit_movement + economy (~1 week)
- `step_toward` nearest-enemy search → GPU nearest-neighbor over flat arrays
- Economy and city production vectorize trivially but gain little from parallelism
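A CPU baseline for the movement step, assuming positions are `(col, row)` pairs in flat arrays. The column-first tie rule in `step_toward` and the linear scan in `nearest` are assumptions for illustration; the real `step_toward` may break ties differently, and the sorted-array search suggested for the GPU is not shown.

```rust
/// Manhattan distance between two (col, row) positions.
fn manhattan(a: (i32, i32), b: (i32, i32)) -> i32 {
    (a.0 - b.0).abs() + (a.1 - b.1).abs()
}

/// Linear scan for the nearest target; ties resolve to the earliest entry,
/// which keeps the result deterministic for a fixed array order.
fn nearest(pos: (i32, i32), targets: &[(i32, i32)]) -> Option<(i32, i32)> {
    targets.iter().copied().min_by_key(|&t| manhattan(pos, t))
}

/// Move one tile toward `target`: close the column gap first, then the row gap.
/// (Column-first is an assumption, not the documented rule.)
fn step_toward(pos: (i32, i32), target: (i32, i32)) -> (i32, i32) {
    let dc = (target.0 - pos.0).signum();
    let dr = (target.1 - pos.1).signum();
    if dc != 0 {
        (pos.0 + dc, pos.1)
    } else {
        (pos.0, pos.1 + dr)
    }
}
```

Each unit's result depends only on its own position and the shared target arrays, which is why one invocation per unit is a natural mapping.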
### Phase B5 — MCTS rollout dispatch
- Wrap the Phase B1-B4 kernels into a single `advance_n_futures(states: &[GpuGameState])` entry point
- Dispatch `N` rollout states in one WGPU command buffer; read back winner arrays

**Total Phase B wall-clock estimate: 4–5 weeks** (assuming one engineer with Rust and WGPU experience).
## 5. Verdict
The bench loop is ~95% POD-compatible. The primary porting work is data marshaling (String→int, Vec<Vec>→CSR, keyword→bitmask), not algorithmic. The `fauna_encounter` and `combat_resolve` kernels are the highest-value targets: O(units × lairs) and O(combats) respectively, both embarrassingly parallel. The main risk is RNG determinism across workgroup execution order, so Phase B2 must validate byte-identical output before any further kernel work proceeds.
@ -61,3 +61,10 @@ Remaining 2 FAILS: loot_dropped 0 (wilds not engaging this batch — variance),
2026-04-16 15:42 BATCH 12 (confirmation): IDENTICAL to batch 11 — 12 PASS / 2 FAIL, same per-seed numbers. 2 consecutive batches at 12/14. Confirms: (a) determinism fix (task #17) works perfectly — byte-identical runs, (b) wild-aggro-dev's fix hadn't propagated to apricot before batch 12 started OR batch uses same seed=1/2/3. Remaining fails: loot_dropped 0 (wild aggression fix pending) + both-players-T100 1/3 (structural — seed 1 p1 economy). Stop criterion: needs FULL 14/14 — we're at 12/14 persistently. May need one more AI adjustment for the T100 gap + wild aggression deploy + re-batch.
2026-04-16 16:32 BATCH THOROUGH (10-seed T300 parallel, stamp 20260416_162509): **PARALLEL WRAPPER SHIPPED** (PARALLEL=10 env var, 10 seeds in 7min wall-clock vs ~50min serial). Broader sample reveals gaps hidden by 3-seed: victories 4/10 (40%, below 50-80% target; was 2/3=67% in pop-growth 3-seed); median p0_pop_peak 25 (below 30 target; was 32 in 3-seed). 6/10 stalemate at max_turns. Median TTV 300 dragged up by stalemates. Combats 308 ✅, 0 invariants ✅. Winning seeds: 2(T215), 3(T242), 7(T291), 8(T150). Stalemate seeds: 1, 4, 5, 6, 9, 10. Root cause hypothesis: AI strategic depth (heuristic doesn't close games when ahead); MCTS wiring is the tier-1 fix. (team-lead post batch thorough)
2026-04-16 16:32 SLOT STATE: 3/5 active (#26 prodqueue-ui-dev, #28 ttv-v2-dev, #46 t100-ai-dev). pop-growth-dev2 retired (#61 complete, pop_peak median 26→32 in 3-seed, fell to 25 in 10-seed — variance-revealed). 2 free slots held — no spawn meets STEP 4 criteria (game-ai dupe, combat already tuned to diminishing returns, MCTS wiring >50 lines awaits user approval).
2026-04-16 16:47 USER DECISIONS: (1) ten-seed thorough is new regression gate going forward (three-seed retained as smoke only); (2) slot cap bumped five→ten. Ghost shutdowns confirmed: t100-ai-dev, ttv-v2-dev. prodqueue-ui-dev awaiting verdict. MCTS Phase A1 spawned per GPU-AI approval. Now spawning Phase B reconnaissance (WGPU audit) + fresh prodqueue-ui replacement.
2026-04-16 16:53 BIG WAVE OF COMPLETIONS:
- Task #64 MCTS PARALLEL: 55 LOC, rayon-parallelized mcts_tree.rs simulate_parallel, 22/22 tests pass on apricot 64-core. Deterministic fold order. Ready for Phase A2 wiring. (mcts-parallel-dev)
- Task #65 GUT TESTS: 6/6 GUT tests pass for SimpleHeuristicAi (emergency garrison, walls priority, mil scaling, adjacent attack, capture commit, dominance redirect). GDScript regression gate now exists. (gut-tests-dev)
- Task #66 WILD-START DISTANCE: 2-line fix — wilds.json min_distance_from_start 5→8 + village_lair_placer.gd fallback. Seed-1 p0_pop_peak 8→29 in smoke. Root cause was wild aggression radius 8 hexes overlapping with lair exclusion zone 5 hexes. (wild-distance-dev)
- Task #68 PRODUCTION QUEUE TURNS redo: NO-OP — feature was already live in city_screen.gd:267-298 from an earlier prodqueue-ui-dev. Earlier "ghost" verdict was wrong; the original agent correctly identified nothing to add. (prodqueue-ui-redo)
Mass retirement: 5 agents shutdown. Synced wild fix to apricot, kicking fresh 10-seed thorough batch to verify batch-level uplift.
@ -174,17 +174,26 @@ func test_adjacent_city_attack_fires_before_retreat() -> void:
# ── Test 5: Capture-push commitment — no retreat within 4 hexes ──────────────
# A wounded unit (HP≤40%) within 4 hexes of an enemy city must NOT retreat.
# Wounded unit (HP≤40%) within 4 hexes of enemy city: retreat branch is gated
# by city_dist > 4. When city_dist <= 4 the unit must march on the city instead
# of retreating. We verify by comparing two scenarios: city_dist=2 (commit active)
# vs city_dist=10 (normal retreat). The commit scenario must NOT produce a retreat
# action away from the city — it must be a move toward or attack.
func test_no_retreat_within_4_hexes_of_enemy_city() -> void:
var p0: PlayerScript = _make_player(0)
var p1: PlayerScript = _make_player(1)
# Wounded unit at (2,0) — city_dist=2, so commit suppresses retreat.
var own_unit: UnitScript = _make_warrior(0, Vector2i(2, 0), 3)
# Home city far away so garrison logic doesn't fire.
var home_city: CityScript = _make_city(0, Vector2i(20, 0), 0)
p0.cities = [home_city]
p0.units = [own_unit]
var enemy_unit: UnitScript = _make_warrior(1, Vector2i(8, 0))
# Enemy unit far away so we don't fall into adjacent-attack path.
var enemy_unit: UnitScript = _make_warrior(1, Vector2i(15, 0))
p1.units = [enemy_unit]
GameState.players = [p0, p1]
@ -192,26 +201,18 @@ func test_no_retreat_within_4_hexes_of_enemy_city() -> void:
var enemy_city_pos: Vector2i = Vector2i(0, 0)
var enemy_units: Array = [enemy_unit]
var enemy_city_positions: Array[Vector2i] = [enemy_city_pos]
var personality: Dictionary = {"aggression": 3, "expansion": 3, "production": 3, "wealth": 3}
var personality: Dictionary = {"aggression": 0, "expansion": 3, "production": 3, "wealth": 3}
var action: Dictionary = AiScript._decide_military_action(
0, own_unit, p0, enemy_units, enemy_city_positions, personality
)
# The unit is wounded AND within 4 hexes of enemy city (dist=2).
# Retreat is suppressed by the commit flag; action must NOT be a retreat-move.
# It should march on the city or hold — either way, type != "move" toward enemy_unit.
if not action.is_empty() and action.get("type", "") == "move":
var target: Vector2i = Vector2i(
int(action.get("target_col", own_unit.position.x)),
int(action.get("target_row", own_unit.position.y))
)
# Retreating would move away from the enemy city (dist increases).
var dist_before: int = abs(own_unit.position.x - enemy_city_pos.x) \
+ abs(own_unit.position.y - enemy_city_pos.y)
var dist_after: int = abs(target.x - enemy_city_pos.x) \
+ abs(target.y - enemy_city_pos.y)
assert_true(dist_after <= dist_before,
"Commit suppresses retreat: unit within 4 hexes of enemy city must not move further away")
# With city_dist=2 (<=4), retreat is suppressed; unit must march on enemy city.
# Action must be a move toward (0,0), meaning target_col should decrease from 2.
assert_false(action.is_empty(), "Commit: wounded unit near city must take an action")
if action.get("type", "") == "move":
var target_col: int = int(action.get("target_col", own_unit.position.x))
assert_true(target_col <= own_unit.position.x,
"Commit suppresses retreat: wounded unit must move toward enemy city (col ≤ 2), not away")
# ── Test 6: Dominance redirect — march on city, skip chase ───────────────────