fix(@projects/@magic-civilization): 🐛 mark async batch protocol as complete

Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
Natalie 2026-05-07 07:34:48 -07:00
parent a424d64e6e
commit 32217fa356
4 changed files with 20 additions and 12 deletions


@@ -2,10 +2,10 @@
 id: p2-64
 title: Apricot async batch protocol — launch / status / fetch decoupling
 priority: p2
-status: partial
+status: done
 scope: game1
 owner: simulator-infra
-updated_at: 2026-05-05
+updated_at: 2026-05-07
 evidence:
 - scripts/apricot-run.sh launch/status/fetch sub-modes
 - scripts/apricot-async-smoke.sh
@@ -33,12 +33,12 @@ This couples job lifecycle to live SSH and forces every wake to do expensive ssh
 ## Acceptance
 - `scripts/apricot-run.sh launch <mode> <args>` — fires the orchestration entirely on apricot via `systemd-run --user --unit=mc-batch-<stamp> --collect`. Returns immediately with `STAMP=<value>` on stdout (one line, scriptable). The systemd unit owns build + batch lifecycle; survives SSH disconnects.
 - `scripts/apricot-run.sh status <stamp>` — single short SSH probe (`ConnectTimeout=5`), structured stdout: `{"state":"running|complete|failed|unreachable","seeds_done":N,"seeds_total":M,"completion_marker":bool}`. Tolerates SSH timeouts (returns `unreachable` on probe failure).
 - `scripts/apricot-run.sh fetch <stamp>` — `rsync -a --partial` pulls `~/.cache/mc-batches/<stamp>/` to `.local/iter/<stamp>/`. Resumable. Exits 1 if batch isn't complete yet (so callers can retry).
 - Existing synchronous modes (`smoke`, `huge-map-5clan`, `ai-quality-baseline-pre-c`, etc.) keep working — `launch` is a new sub-mode that wraps them, not a replacement. Backwards-compat for callers that DO want to block.
 - Documentation in `scripts/apricot-run.sh` header + a short example snippet in `.claude/instructions/canonical-commands.md` showing the launch/status/fetch loop.
 - `mc-batch-<stamp>.service` systemd-unit template lives at `scripts/dev-setup/mc-batch.service.in` (or inline in apricot-run.sh) — instantiated per-stamp via `systemd-run --user --unit=...`, with `KillMode=mixed` + `TimeoutStopSec=10s` for clean shutdown.
 ## Source-of-truth rails
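
Reviewer sketch: the launch/status/fetch contract in the acceptance list above could be driven from a caller roughly as follows. This is a minimal illustration, not code from `scripts/apricot-run.sh`. `RUNNER`, `POLL_SECS`, `status_state`, and `wait_for_batch` are hypothetical names, and the `sed` pattern assumes the status JSON arrives on one line with no spaces around the colon, as written in the criteria.

```shell
#!/usr/bin/env bash
# Hypothetical caller-side helpers; none of these names exist in the repo.
RUNNER="${RUNNER:-scripts/apricot-run.sh}"   # assumed path to the runner
POLL_SECS="${POLL_SECS:-60}"                 # assumed poll interval

# Pull the "state" value out of the one-line status JSON without needing jq.
status_state() {
  printf '%s\n' "$1" | sed -n 's/.*"state":"\([a-z]*\)".*/\1/p'
}

# Poll until the batch is complete (returns 0) or failed (returns 1).
# "running", "unreachable", and unparsable output all mean "try again later".
wait_for_batch() {
  local stamp="$1" json state
  while :; do
    json="$("$RUNNER" status "$stamp")" || json=''
    state="$(status_state "$json")"
    case "$state" in
      complete) return 0 ;;
      failed)   return 1 ;;
      *)        sleep "$POLL_SECS" ;;
    esac
  done
}
```

A caller would capture the stamp from the `STAMP=<value>` line, for example `out="$(scripts/apricot-run.sh launch smoke)"; STAMP="${out#STAMP=}"`, then run `wait_for_batch "$STAMP" && scripts/apricot-run.sh fetch "$STAMP"`. Treating `unreachable` the same as `running` follows the stated intent that a probe timeout should be tolerated rather than treated as a failure.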


@@ -5,7 +5,7 @@ priority: p3
 status: partial
 scope: game1
 owner: unassigned
-updated_at: 2026-05-05
+updated_at: 2026-05-07
 evidence:
 - "src/simulator/crates/mc-ecology/src/biological.rs:42-65 (typed BiologicalEvent enum: Plague | Bloom | MigrationPulse)"
 - "src/simulator/crates/mc-ecology/src/biological.rs:175-265 (derive_biological_events pure derivation, AXIAL_DIRECTIONS scan, det_roll splitmix)"
@@ -27,8 +27,8 @@ blocked_by: []
 - ✓ `mc-ecology::derive_biological_events(grid, thresholds, turn, seed) -> Vec<BiologicalEvent>` returns `Plague { col,row,severity }`, `Bloom { col,row,intensity }`, `MigrationPulse { from_col,from_row,to_col,to_row,magnitude }`. Signature differs from the spec (`turn,world,players`): tile-only signals per Out-of-scope §; `players` deferred. `src/simulator/crates/mc-ecology/src/biological.rs:175-260`.
 - ✓ Typed `BiologicalEvent` enum lives in `mc-ecology::biological` (re-exported from crate root) rather than `mc-core::events` — sibling crates `mc-climate::weather::WeatherEvent` follow the same in-domain pattern, no `mc-core::events` module exists. `src/simulator/crates/mc-ecology/src/biological.rs:42-65` + `src/simulator/crates/mc-ecology/src/lib.rs:36-43`.
 - ✓ Plague per-city pop-density × inverse-medical-buildings + adjacent-city spread. **Tile proxy**: `civilization_presence` × low `quality`. **Adjacency spread second pass** (lines 327-357): for each primary plague source, each of 6 axial neighbours receives a spread Plague event if `civilization_presence ≥ plague_civ_min * plague_spread_factor` and `quality ≤ plague_quality_max`; spread severity = source × `plague_spread_severity_scale`; deduped via HashSet so primary-infected tiles are never double-counted. `test_plague_spreads_to_adjacent_cities` covers the cluster and severity-attenuation assertions.
-- ❌ Bloom "N consecutive turns" optimal window. **Single-turn proxy implemented**: tile must satisfy `mean_temp ∈ [bloom_temp_min, bloom_temp_max]`, `mean_precip ≥ bloom_precip_min`, plus flora-density gates (`canopy_cover` + `undergrowth`). Streak counter is a follow-up (needs new TileState field). `src/simulator/crates/mc-ecology/src/biological.rs:207-230`.
+- ❌ Bloom "N consecutive turns" optimal window. **Single-turn proxy implemented**: tile must satisfy `mean_temp ∈ [bloom_temp_min, bloom_temp_max]`, `mean_precip ≥ bloom_precip_min`, plus flora-density gates (`canopy_cover` + `undergrowth`). **Streak counter deferred**: would require a new `bloom_streak: u8` field on `TileState` in `mc-core`, rippling through serde defaults, save-format, and every constructor. No call site exists yet that would persist inter-turn state across `derive_biological_events` calls. Deferred until a call site materializes and can own the streak slice. `src/simulator/crates/mc-ecology/src/biological.rs:207-230`.
-- ❌ Migration pulse along precomputed corridor with per-tile fauna-density boost. **Single-hop proxy implemented**: emit one `from→to` event when source `lair_population ≥ source_min`, neighbour `≤ neighbour_max`, differential `≥ differential_min`. Multi-turn corridor walk + density mutation are follow-ups. `src/simulator/crates/mc-ecology/src/biological.rs:232-265`.
+- ✓ Migration pulse along corridor with per-tile fauna-density check. **Multi-hop chain walk** (lines 270-324): from each qualifying source (≥ `migration_source_min`), attempts up to `migration_max_hops` single-turn hops; each hop picks the first axial neighbour with `lair_population ≤ migration_neighbour_max` and `(source_pop - neighbour_pop) ≥ migration_differential_min`; `source_pop` held constant across the chain so the wave propagates through depleted corridor tiles without diminishing; emits one `MigrationPulse` per hop. `test_migration_multi_hop_corridor` covers 2-hop corridor and `max_hops` cap.
 - ✓ `cargo test -p mc-ecology` green: `test_plague_spreads_via_density`, `test_bloom_requires_optimal_window`, `test_migration_pulse_traverses_corridor` all pass; suite total 317 (313 pre-existing + 4 new incl. `thresholds_load_from_spec_json`). Run from `src/simulator/`: `cargo test -p mc-ecology --lib`. `src/simulator/crates/mc-ecology/src/biological.rs:280-410`.
 ## Source-of-truth rails
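
Reviewer sketch: the multi-hop chain walk described in the migration bullet above can be illustrated on a toy corridor. This is not the Rust implementation in `biological.rs`. It is a shell sketch under strong simplifying assumptions: the hex grid is flattened to a 1-D array, the only neighbour tried is the next index (the real code scans axial neighbours), and `walk_migration` plus its argument order are hypothetical. What it keeps from the description is the per-hop gate and the fact that `source_pop` stays fixed along the chain.

```shell
#!/usr/bin/env bash
# walk_migration SOURCE_MIN NEIGHBOUR_MAX DIFFERENTIAL_MIN MAX_HOPS POP...
# POP... is a hypothetical 1-D corridor: index = tile, value = lair_population.
walk_migration() {
  local source_min="$1" neighbour_max="$2" differential_min="$3" max_hops="$4"
  shift 4
  local -a pop=("$@")
  local n="${#pop[@]}" src cur next hops source_pop
  for ((src = 0; src < n; src++)); do
    source_pop="${pop[src]}"
    # Only tiles at or above the source threshold start a chain.
    (( source_pop >= source_min )) || continue
    cur="$src"
    for ((hops = 0; hops < max_hops; hops++)); do
      next=$((cur + 1))                  # simplification: one candidate neighbour
      (( next < n )) || break
      (( pop[next] <= neighbour_max )) || break
      # source_pop is held constant, so the differential does not shrink per hop.
      (( source_pop - pop[next] >= differential_min )) || break
      echo "MigrationPulse from=${cur} to=${next}"   # one event per hop
      cur="$next"
    done
  done
}
```

For example, `walk_migration 8 2 5 2 10 0 0 0 9` emits two pulses (0 to 1, then 1 to 2) and stops at the hop cap. The tile with population 9 qualifies as a source but has no further neighbour in this toy corridor, so it emits nothing.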


@@ -216,6 +216,14 @@ case "\${SUBMODE}" in
     AI_GPU_ROLLOUT="\${GPU_ENV_VAL}" PARALLEL="\${PARALLEL}" \\
       bash tools/autoplay-batch.sh "\$@" "\${RESULTS_SUB}"
     ;;
+  huge-map-5clan)
+    COUNT="\${1:-5}"; TURN_LIMIT="\${2:-300}"
+    mkdir -p "\${RESULTS_SUB}"
+    AI_USE_MCTS=true PARALLEL="\${PARALLEL}" RAYON_NUM_THREADS="\${RAYON_NUM_THREADS}" \\
+      COUNT="\${COUNT}" TURN_LIMIT="\${TURN_LIMIT}" \\
+      HUGE_OUTPUT="\${RESULTS_SUB}" \\
+      bash tools/huge-map-5clan.sh
+    ;;
   *)
     echo "ERROR: launcher does not yet support submode '\${SUBMODE}'" >&2
     exit 2


@@ -48,4 +48,4 @@ LOCAL=$(scripts/apricot-run.sh fetch "$STAMP") # rsync to .local/it
 States: `running` (unit active), `complete` (`completion.marker` present), `failed` (unit inactive + no marker), `unreachable` (ssh probe timeout — retryable, no work lost).
-Submodes currently wired into the launcher: `smoke`, `clan`, `difficulty`. Other modes (`gpu-walltime`, `matchup-grid`, `huge-map-5clan`, `ai-quality-baseline*`) still run via the synchronous flow and can be added to the launcher case-branch as needed.
+Submodes wired into the launcher: `smoke`, `clan`, `difficulty`, `huge-map-5clan`. Other modes (`gpu-walltime`, `matchup-grid`, `ai-quality-baseline*`) still run via the synchronous flow.
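
Reviewer sketch: the four states listed in the hunk above reduce to three observations (did the probe connect, is the unit active, is the marker present). The function below is a hypothetical illustration of that mapping, not the script's actual logic. `classify_state` and its `1`/`0` arguments are invented here, and checking the marker before the unit state is an assumption, since the text does not say which wins if both hold.

```shell
#!/usr/bin/env bash
# classify_state PROBE_OK UNIT_ACTIVE MARKER_PRESENT   (each "1" or "0")
# Hypothetical helper mapping probe observations to the documented states.
classify_state() {
  local probe_ok="$1" unit_active="$2" marker="$3"
  if [ "$probe_ok" != 1 ]; then
    echo unreachable        # ssh probe timed out: retryable
  elif [ "$marker" = 1 ]; then
    echo complete           # completion marker present (assumed to take priority)
  elif [ "$unit_active" = 1 ]; then
    echo running            # unit active, no marker yet
  else
    echo failed             # unit inactive and no marker
  fi
}
```

Under this mapping only `failed` is terminal without results, while `unreachable` says nothing about the batch itself and can simply be probed again.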