Commit graph

11 commits

Author SHA1 Message Date
Natalie
ab8fd4d707 fix(cloud-dx): repoint forge from dead mc-forge droplet to live forge.mc.uvlava.com
The dedicated mc-forge droplet (159.203.170.249:3000/mcadmin) is gone; the forge
now rides a shared services box, addressed by the stable hostname
forge.mc.uvlava.com/applications. The cloud-DX toolchain still pointed at the dead
endpoint, so every worker clone + golden-image build was broken.

- scripts/lib/forge-remote.sh: single source of truth — builds the authenticated
  clone URL from the hostname + ~/.vault/services-forge-token (relocation-proof;
  no hardcoded IP). Exports MC_FORGE_GIT_REMOTE.
- cloud-bringup.sh / dist.sh: source the helper instead of the dead
  mc_forge_creds + 159.203 URL. Also fix cloud-bringup REPO path to the current
  @mc/@applications/magicciv location.
- settings.local.json autoMode trust block: name the new forge host + 'mc' DO
  project (was 159.203 + 'mc:dev'), else cloud provisioning is denied as exfil.
- cloud-dx-do.md: document the new forge + token.

Verified: helper authenticates to the live forge (ls-remote main); scripts parse;
JSON valid.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-30 01:39:54 -04:00
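
Illustrative sketch for ab8fd4d707: one way a helper could build an authenticated clone URL from a stable hostname plus a token file, as the commit describes for scripts/lib/forge-remote.sh. It is written in Python for illustration only. The `https://user:token@host/repo.git` layout, the `git` user name, and every function name here are assumptions; the commit states only that the URL is built from the hostname and ~/.vault/services-forge-token and exported as MC_FORGE_GIT_REMOTE.

```python
import os
from pathlib import Path
from urllib.parse import quote


def read_token(path: str) -> str:
    """Read a token file and strip surrounding whitespace and newlines."""
    return Path(path).expanduser().read_text(encoding="utf-8").strip()


def build_forge_remote(host: str, repo: str, token: str, user: str = "git") -> str:
    """Build an authenticated HTTPS clone URL from a hostname (never an IP).

    The URL layout and the default user name are assumptions for this sketch.
    """
    token = token.strip()
    if not token:
        raise ValueError("forge token is empty")
    # Percent-encode credentials so characters like '/' or '@' cannot break the URL.
    creds = f"{quote(user, safe='')}:{quote(token, safe='')}"
    return f"https://{creds}@{host}/{repo.strip('/')}.git"


def export_remote(url: str) -> None:
    """Expose the remote to child processes, mirroring the exported variable."""
    os.environ["MC_FORGE_GIT_REMOTE"] = url
```

Keeping the hostname as the only address input is what makes the result relocation-proof: moving the forge to another box changes DNS, not the scripts.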
Natalie
9e32eedfa1 feat(sim): land sim_scenario declarative harness + scenarios for headless Game 1 proof gate
- Add mc-sim/bin/sim_scenario (pure Rust runner for JSON scenarios; drives mc-turn + worldsim pre-pass + personalities; emits BatchResult with metrics + per-seed assertion verdicts).
- Add canonical game1_headless_systems_150t.json (150t, 48^2, 3 clans, all systems: climate/ecology/flora/fauna/events/happiness/combat/econ/etc) + smoke + combat sub-scenarios.
- Wire publish in dist.sh to ship the bin to S3 alongside .so (enables fleet horizontal runs post-).
- Update AGENTS.md, finish-game-1/SKILL.md, agents-task-map, simulator-infra.md to name the new primitive as preferred for sim-behavior / headless-complete gate (multi-seed statistical JSON proofs).
- Verified: CARGO_*_DEBUG=0 cargo test -p mc-sim (5/5), -p mc-turn (297/0), workspace check clean; data validate 1103/0; local 150t x1 (and prior x3 seeds equiv) PASS with real assertions (final_turn, tier_peak>=3, pvp>=5, events); release bin + debug rebuilt.
- Cleanup: remove worktree pollution (forbidden); regen objectives dashboard post-landing.
- Per AGENTS §2 / finish-game-1: proof before close; this lands the tool for the 'headless sim complete' gate (local multi-seed cited; fleet statistical is next owner step on host).

Co-Authored-By: Grok (xAI) <noreply@x.ai>
2026-06-28 14:24:38 -04:00
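
Illustrative sketch for 9e32eedfa1: how per-seed assertion verdicts might roll up into a batch result. The actual runner is Rust and its BatchResult schema is not shown in the commit, so the dictionary shape, the function names, and the rule that a missing metric fails are all assumptions. The thresholds in the usage note echo the assertions the commit cites (tier_peak>=3, pvp>=5).

```python
from typing import Dict

Metrics = Dict[str, float]


def seed_verdicts(metrics: Metrics, minimums: Metrics) -> Dict[str, bool]:
    """One pass/fail verdict per assertion; a missing metric counts as a failure."""
    return {
        name: name in metrics and metrics[name] >= floor
        for name, floor in minimums.items()
    }


def batch_result(per_seed: Dict[int, Metrics], minimums: Metrics) -> dict:
    """Aggregate per-seed verdicts; the batch passes only if every seed passes."""
    verdicts = {seed: seed_verdicts(m, minimums) for seed, m in per_seed.items()}
    # An empty batch proves nothing, so it is treated as a failure in this sketch.
    passed = bool(verdicts) and all(all(v.values()) for v in verdicts.values())
    return {"passed": passed, "verdicts": verdicts}
```

Calling `batch_result(per_seed, {"tier_peak": 3, "pvp": 5})` with one metrics dictionary per seed yields a single verdict plus the per-seed breakdown, so a failing seed can be identified without rerunning the batch.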
Natalie
88bdc4210a feat(dist): build-artifact Space — publish/fetch/sync fetch-or-build + RL model sharing
Build the linux .so/wasm once on a worker and let sim/test/AI runners fetch the
prebuilt artifact (keyed by git sha) instead of recompiling — N workers share
one build. Adds the magicciv-artifacts DO Space, rclone in the golden image, and:
  - dist:publish  build + upload builds/<sha>/{.so,wasm}
  - dist:fetch    download the prebuilt .so for HEAD's sha
  - dist:sync     git pull -> fetch prebuilt if published, else build
  - dist:models   share RL .onnx via the Space (push/pull/ls)
Complements sccache (compile cache) by caching final outputs. Creds via
RCLONE_S3_* env over ssh, never on worker disk/argv; degrades to build-on-worker
when creds/cache absent.

Also hardens the dispatch layer (pre-existing, affected test/build/render too):
  - pass -i ~/.ssh/id_mc_fleet on dispatch ssh (don't rely on agent-loaded key)
  - guard _dist_first_host against an empty / "fleet down" inventory
  - drop ssh -n on heredoc-stdin verbs (it redirected stdin from /dev/null)

Proven end-to-end on DO: publish built a 43.9MB .so + wasm; dist:sync fetched it
in 2.8s (no rebuild).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 06:02:33 -04:00
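
Illustrative sketch for 88bdc4210a: the fetch-or-build decision that dist:sync is described as making. The `builds/<sha>/` key layout comes from the commit message; the artifact file name, the function names, and modelling the Space as a set of keys are assumptions made to keep the sketch self-contained.

```python
from typing import Set


def artifact_key(sha: str, filename: str) -> str:
    """Object key under the builds/<sha>/ layout named in the commit message."""
    return f"builds/{sha}/{filename}"


def plan_sync(sha: str, published: Set[str], have_creds: bool,
              filename: str = "lib.so") -> str:
    """Return 'fetch' when a prebuilt artifact for this sha is reachable, else 'build'.

    `filename` is a hypothetical artifact name, not taken from the source.
    """
    if not have_creds:
        # Without Space credentials the worker degrades to building locally.
        return "build"
    return "fetch" if artifact_key(sha, filename) in published else "build"
```

Keying on the git sha means a cache hit is only possible for exactly the commit being synced, so a stale artifact cannot be fetched by mistake.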
Natalie
0c50c04b4c feat(infra): dist:prune to delete superseded golden snapshots
Incremental rebuilds accumulate snapshots (~$0.40/mo each). dist:prune keeps
the newest N (default 2: current + one rollback); dist:image reminds you to run it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 14:51:06 -04:00
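
Illustrative sketch for 0c50c04b4c: the keep-newest-N selection that dist:prune is described as performing. The snapshot tuple shape and the function names are assumptions; the default of 2 and the rough $0.40 per snapshot per month come from the commit message.

```python
from typing import List, Tuple

Snapshot = Tuple[str, str]  # (name, ISO-8601 UTC created-at timestamp)


def plan_prune(snapshots: List[Snapshot],
               keep: int = 2) -> Tuple[List[Snapshot], List[Snapshot]]:
    """Split snapshots into (kept, to_delete), keeping the newest `keep`."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    # Same-format ISO-8601 UTC timestamps sort chronologically as plain strings.
    newest_first = sorted(snapshots, key=lambda s: s[1], reverse=True)
    return newest_first[:keep], newest_first[keep:]


def monthly_savings(to_delete: List[Snapshot], cost_each: float = 0.40) -> float:
    """Approximate monthly saving from deleting the superseded snapshots."""
    return len(to_delete) * cost_each
```

Returning the plan instead of deleting directly leaves room for a dry run before anything is destroyed.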
Natalie
d9588f8c80 perf(infra): incremental golden-image rebuilds (layer on the last snapshot)
Packer base image is now a var; ./run dist:image builds FROM the newest
mc-golden snapshot by default, so the idempotent provision.sh only redoes changed
work (~3-8 min vs ~20 cold). --cold rebuilds from stock Ubuntu to reset layer
cruft. Made the clone step idempotent (clone-or-fetch) so it works on a
pre-provisioned base. Directly addresses 'avoid unnecessary rebuilds'.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 14:41:01 -04:00
Natalie
6332d47011 fix(infra): make the DO fleet actually work on real hardware + render host
Real-DO testing surfaced bugs the mocked tests couldn't:
- ssh key: reference shared 'mc-fleet' key via data source, not a duplicate (DO 422s on dup pubkeys).
- cmd_dist_up: fail loudly on failed apply; dist:up waits for cloud-init readiness.
- snapshot cloud-init skips runcmd -> bake authorized_keys (FLEET_PUBKEY) + 'cloud-init clean' before snapshot.
- build user passwordless sudo; apt dpkg-lock race fixed (cloud-init --wait + Lock::Timeout).
- size s-8vcpu-16gb-amd (tier max); creds via PKR_VAR env not argv.
- render host: weston+Mesa baked; ./run dist:render proven (Godot->PNG on DO, no GPU). forge:dns shortcut.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 12:45:29 -04:00
Natalie
a5d66ce477 feat(infra): make DO workers render-capable (weston + Mesa) + dist:render
Golden image now installs the software-render stack (weston, libgl1-mesa-dri for
llvmpipe, mesa-vulkan-drivers, vulkan-tools) so any worker renders proof scenes
via gl_compatibility/opengl3 with no GPU. New ./run dist:render <scene> <out.png>
wraps tools/capture-proof.sh against a worker (replaces the apricot SCREENSHOT_HOST).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 09:56:56 -04:00
Natalie
22f7fa1116 feat(infra): DO compute-offload verbs + forge on/off lifecycle
Offload heavy compute from plum (M2 Air) to on-demand DO workers:
- dist:test  — cargo test --workspace (nextest) on a worker (the main DX win)
- dist:build — cargo build + WASM on a worker; rsync the platform-independent
  WASM back (native .so is linux-only, stays on the worker)
- dist:sync  — git pull <ref> + rebuild gdext on live workers (no image rebuild)
- forge:down/up — snapshot+destroy / restore-from-snapshot (DO bills powered-off
  droplets; only destroying the droplet stops the charge). ~$6/mo -> ~$0.30/mo idle; refreshes the
  forge IP in ~/.vault/mc_forge_creds on restore.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 09:24:30 -04:00
Natalie
f5c5d1a410 feat(infra): distributed test/train fleet on DigitalOcean (Terraform + Packer + dispatch)
Ephemeral CPU Droplet fleet that horizontally scales the iteration loop:
- infra/terraform/test-fleet: cattle Droplets from a golden image (auto-discovered
  by name via digitalocean_images), grouped under the mc:dev DO project, with a
  mocked-provider test suite (no token/spend).
- infra/packer: golden-image builder reusing scripts/dev-setup/linux.sh.
- scripts/run/dist.sh: ./run dist:{check,up,sim,train,down} — shard sim/test
  batches across workers via autoplay-batch AUTOPLAY_HOST+SEED_OFFSET.
GPU intentionally absent (workload is CPU-bound per docs/ai-production.md).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 08:51:09 -04:00
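
Illustrative sketch for f5c5d1a410: one way a seed batch could be sharded across workers so that each worker receives a host plus a seed offset, in the spirit of the AUTOPLAY_HOST+SEED_OFFSET dispatch the commit mentions. The contiguous split, the tuple shape, and the function name are assumptions; the commit does not describe how seeds are divided.

```python
from typing import List, Tuple


def shard_seeds(total_seeds: int, hosts: List[str],
                base_seed: int = 0) -> List[Tuple[str, int, int]]:
    """Split `total_seeds` into contiguous ranges: (host, seed_offset, count)."""
    if not hosts:
        raise ValueError("no workers in inventory")
    per_host, extra = divmod(total_seeds, len(hosts))
    plan = []
    offset = base_seed
    for index, host in enumerate(hosts):
        # The first `extra` hosts take one additional seed each.
        count = per_host + (1 if index < extra else 0)
        if count:
            plan.append((host, offset, count))
        offset += count
    return plan
```

With contiguous, non-overlapping ranges every seed runs exactly once, and a failed shard can be retried from its offset alone.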
Natalie
cbc68a68c1 docs(@projects/@magic-civilization): 🔎 p3-26 Gap-2 — era max_tier cap is non-parity; fired-event surfacing is observability-only
Verified file:line: the live GDScript events modules have NO era-based max_tier
cap (0 hits) — headless flat max_tier=10 is correct parity; an era cap would
invent a rule the game lacks (gold-plating, dropped). And natural events already
fire + apply terrain effects headless; only the fired list surfacing to
TurnResult is missing (processor.rs:1117 `let _fired =`), an observability nicety
not a system gap. Confirms the headless natural-events system is functionally
complete; narrows Gap-2's real remainder.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 06:29:41 -04:00
Natalie
158ef4d1bd feat(@projects/@magic-civilization): 🩹 p3-29 T2 — Rust turn emits UnitHealed
The live GDScript turn emitted `unit_healed` inline; the headless healing
phase recovered HP silently. The healing phase runs in the end-of-turn
`fn(&mut GameState)` registry (no event sink), so follow the FloraSuccession
buffer pattern: stash `(player, unit_id, applied_amount, col, row)` into a new
transient `GameState.pending_heal_events`, drain it in `step()` into
`TurnEvent::UnitHealed`. The buffered amount is the CLAMPED delta actually
applied (not the nominal heal rate). No wire surface — dispatch drops it; the
live UI consumes it via the kind-tagged `event_to_dict` dict.

Verified headless: mc-replay 19/0 (unit_healed_serde), mc-turn 289/0
(healing_buffers_unit_heal_event_with_applied_amount +
healing_buffers_clamped_amount_near_full_hp + event_collector_wiring).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 06:12:07 -04:00
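
Illustrative sketch for 158ef4d1bd: the buffer-then-drain pattern the commit describes, where a phase with no event sink stashes records on the state and `step()` later turns them into events. The real code is Rust; the class layout, the flat heal rate, the event dictionary shape, and skipping units already at full HP are assumptions. The record order `(player, unit_id, applied_amount, col, row)` and the clamped amount follow the commit message.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

HealRecord = Tuple[int, int, int, int, int]  # (player, unit_id, applied_amount, col, row)


@dataclass
class Unit:
    player: int
    unit_id: int
    hp: int
    max_hp: int
    col: int
    row: int


@dataclass
class GameState:
    units: List[Unit]
    pending_heal_events: List[HealRecord] = field(default_factory=list)


def healing_phase(state: GameState, heal_rate: int) -> None:
    """End-of-turn phase with no event sink: buffer what was actually applied."""
    for unit in state.units:
        applied = min(heal_rate, unit.max_hp - unit.hp)  # clamp to missing HP
        if applied > 0:
            unit.hp += applied
            state.pending_heal_events.append(
                (unit.player, unit.unit_id, applied, unit.col, unit.row)
            )


def step(state: GameState, heal_rate: int = 10) -> List[dict]:
    """Run the phase, then drain the transient buffer into kind-tagged events."""
    healing_phase(state, heal_rate)
    events = [
        {"kind": "unit_healed", "player": p, "unit_id": u, "amount": a, "col": c, "row": r}
        for (p, u, a, c, r) in state.pending_heal_events
    ]
    state.pending_heal_events.clear()  # the buffer must not leak into the next turn
    return events
```

Buffering the clamped delta, not the nominal rate, keeps the emitted amount equal to the HP change a consumer would observe on the unit.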