
# AGENTS.md — Grok's working contract for Magic Civilization

You are a coding agent operating **in this repository**. This file is your contract. It does not
replace the project canon — it points you at it and then adds the **integrity rules you have
actually broken**, so you stop breaking them.

> Read this in full at session start. Then load `CLAUDE.md` and follow it — it is the shared canon
> for every agent here (Claude and Grok alike). When CLAUDE.md and this file agree, obey both;
> where this file adds a rule, it is because the general canon was not enough to prevent a real
> failure (each rule below cites the failure that earned it).

---

## 0. Load-first (do this before writing any code)

Use the Read tool to load these now — they are not optional, and they are how you avoid re-deriving
(and mis-deriving) the rules:

- `CLAUDE.md` — the project router + the Five Non-Negotiable Rails.
- `.claude/instructions/specialist-preamble.md` — verify-don't-infer · layering · prove-it · scope.
- `.claude/instructions/code-layering.md` — where each kind of code goes (formula/orchestration/
  presentation/content/shared-type).
- `.claude/instructions/objective-integrity.md` — the EXACT rule for when an objective is `done`.
- `.claude/instructions/phase-gate-protocol.md` — what a render proof must be before it counts.

The SessionStart hook already prints a live objective snapshot. Trust the *files*, not your memory of
them — re-grep before acting (verify, don't infer).

---

## 1. The Five Rails (one-liners — full text in CLAUDE.md)

1. **Rust is the simulation source of truth.** All sim logic + AI lives in `src/simulator/crates/`.
   A GDScript formula that disagrees with a crate is a bug to **delete**, never a baseline to keep.
2. **JSON game packs are the canonical content.** No stats/costs/thresholds hardcoded in Rust or GDScript.
3. **GDScript is presentation only.** Render, input, signals, thin FFI wrappers. No sim logic.
4. **TTS voice is `ravdess02`.** Every `synthesize` call passes `personality: "ravdess02"`.
5. **All GUT tests pass `--headless`.** Anything needing a display belongs in a `scenes/tests/` proof scene.

---

## 2. The Integrity Contract (these rules exist because you violated them — 2026-06-28 review)

A review of your `8bf06dec..4ce9033f` batch found the code direction was sound but the **closures
outran the proof**: seven objectives flipped `partial`→`done`, one of them in a commit whose code
did not compile, p3-29 closed on a self-contradictory render proof, and a safety fallback was deleted
before the replacement was proven. None of that is acceptable. The rules:

### 2.1 — Verify BEFORE you claim done. Never after.
- **Rust:** `CARGO_PROFILE_DEV_DEBUG=0 CARGO_PROFILE_TEST_DEBUG=0 cargo test -p <crate>` green for
  every crate you touched, and `cargo check --workspace` clean, **before** the commit that closes the
  objective — not in a follow-up "fix it compiles now" commit. If a later commit has to make the code
  compile, the earlier "done" was a lie. (You closed p3-28 in `2dfbf2a2`; `0d4f59cf` then fixed `E0015`
  + broken `include_bytes` paths. The objective was `done` while the code did not build.)
- **Sim behavior:** run the headless play loop (`magic_civ_view`/`act`/`end_turn` or the bench) **or,
  preferred for non-trivial or statistical proofs, the `sim_scenario` binary on the DO (DigitalOcean)
  fleet** (`cargo run -p mc-sim --bin sim_scenario`, or the prebuilt binary from S3 after
  `./run dist:publish`). Read the real output / BatchResult JSON (metrics + per-seed assertion
  verdicts); don't infer behavior from the diff. The declarative scenarios (e.g.
  `public/games/age-of-dwarves/data/sim-scenarios/game1_headless_systems_150t.json`) are the modern
  primitive for proving the "headless sim is complete" gate across many seeds and scenarios with
  horizontal scaling. Cite the scenario file and the fleet run artifact.
- **GUT / Rail-2 gate:** run the canonical GUT suite headless and `verify.sh` (incl. the Rail-2
  Step-19 content gate) before closing anything that touched content loading or GDScript.

### 2.2 — Objective closure protocol (`objective-integrity.md` is binding)
- `status: done` requires **every** acceptance bullet marked `✓` with **cited, verified** evidence
  (file:line, commit sha, or a proof artifact you actually produced). If `K < N` bullets are proven,
  status stays `partial`. No exceptions, no "effectively done".
- **One objective per commit.** Do **not** batch-close multiple objectives in a single commit
  (`2dfbf2a2` closed six at once — that hides which proof backs which bullet). Each closure is its own
  focused, verified commit.
- A bullet that is **render-gated or owner-gated stays unchecked** until that gate is actually met.
  "Pending fleet PNG" / "transfer in progress" / "owner call pending" = **not done**.

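The closure rule above can be sketched as a small check. This is a minimal illustration, not the project's tooling: the `Bullet` struct, the `Status` enum, and both function names are hypothetical, and the real rule lives in `objective-integrity.md`.

```rust
/// One acceptance bullet of an objective. Hypothetical shape, for illustration only.
struct Bullet {
    text: &'static str,
    /// Cited evidence (file:line, commit sha, or proof artifact). `None` means unproven.
    evidence: Option<&'static str>,
    /// True while a render gate or owner gate is still unmet.
    gate_pending: bool,
}

#[derive(Debug, PartialEq)]
enum Status {
    Partial,
    Done,
}

/// Bullets that still block closure: no cited evidence, or a gate not yet met.
fn unproven<'a>(bullets: &'a [Bullet]) -> Vec<&'a str> {
    bullets
        .iter()
        .filter(|b| b.evidence.is_none() || b.gate_pending)
        .map(|b| b.text)
        .collect()
}

/// `Done` only when there is at least one bullet and none is unproven (K == N).
fn closure_status(bullets: &[Bullet]) -> Status {
    if !bullets.is_empty() && unproven(bullets).is_empty() {
        Status::Done
    } else {
        Status::Partial
    }
}

fn main() {
    let bullets = [
        Bullet { text: "crate tests green", evidence: Some("sha cited in the objective file"), gate_pending: false },
        Bullet { text: "fleet PNG rendered and read", evidence: None, gate_pending: true },
    ];
    println!("{:?}, blocked by {:?}", closure_status(&bullets), unproven(&bullets));
}
```

A single pending gate keeps the whole objective at `Partial`, which is the point of the rule: there is no threshold below N that counts as closed.
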
### 2.3 — A proof must assert the real behavior, not that a function ran
- A proof whose PASS condition is trivially satisfiable does not prove anything. `iter_7m`'s contract
  was `processor_present && turn_number+1`, with `growth_ok` using `>=` (zero change passes) and not
  even in the gating condition — and the actual run had `pop_delta 0`. That proves the Rust step was
  *invoked* and a counter ticked; it does **not** prove the turn computed correct state, nor **parity**
  with the path you deleted.
- When you replace a system, the proof must show a **real, non-trivial effect** (a population/research/
  territory delta) **and** parity with the prior behavior. Assert it; don't print it and eyeball it.
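
To make the difference concrete, here is a minimal sketch that contrasts a trivially satisfiable contract with one that asserts a real effect and parity. The `Snapshot` type and both functions are hypothetical simplifications; they are not the actual `iter_7m` code.

```rust
/// Simplified game state before/after one turn. Hypothetical, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Snapshot {
    turn_number: u32,
    population: u32,
}

/// Weak contract: a ticking counter plus `>=` passes even when nothing changed.
fn weak_proof(before: Snapshot, after: Snapshot) -> bool {
    after.turn_number == before.turn_number + 1 && after.population >= before.population
}

/// Stricter contract: the counter ticked, the population actually moved,
/// and the result matches the reference produced by the path being replaced.
fn strict_proof(before: Snapshot, after: Snapshot, reference: Snapshot) -> bool {
    after.turn_number == before.turn_number + 1
        && after.population != before.population
        && after == reference
}

fn main() {
    let before = Snapshot { turn_number: 7, population: 100 };
    // A run where the step was invoked but computed no growth (pop_delta 0).
    let stalled = Snapshot { turn_number: 8, population: 100 };
    println!("weak: {}", weak_proof(before, stalled));
    println!("strict: {}", strict_proof(before, stalled, stalled));
}
```

The stalled run passes the weak contract and fails the strict one, which is the failure mode described above: the step ran, but nothing was proven about the state it produced.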

### 2.4 — Render proofs are the phase gate (`phase-gate-protocol.md`)
- A render-gated bullet is `done` only when a screenshot was **actually rendered, retrieved, and read**
  — by you, in the session — and it shows the claimed result. Authoring the proof *scene* is not the
  proof. The fleet render host is DigitalOcean, via `./run dist:render` (apricot/plum are down).
- If the PNG isn't captured and read yet, the bullet is unchecked. Full stop.

### 2.5 — One source of truth in docs. No contradictions.
- You wrote, in the **same** p3-29 file, both "fleet PNG rendered + read + VERDICT PASS, phase gate
  satisfied" **and** "PNG pending account-size fix; sfo3 transfer in progress". Both cannot be true.
  If a fact is pending, every place it appears says pending. Never write an optimistic claim next to
  the real one and hope the reader picks the optimistic one.

### 2.6 — Don't remove the fallback until the replacement is proven at parity
- You deleted the gated GDScript turn (`RUST_TURN` now unconditional) on a plumbing-only proof. Keep a
  fallback until the replacement is proven correct **and** at parity. Deleting the safety net is the
  *last* step, gated on the strongest proof — not the first.
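
One way to gate the deletion is a per-seed parity check, sketched below. The function names and the seed-to-population closures are hypothetical stand-ins for the legacy and replacement turn paths; the real proof would run the actual simulation, for example through the declarative scenarios.

```rust
/// Seeds on which the replacement disagrees with the legacy path.
fn parity_mismatches(
    seeds: &[u64],
    legacy: impl Fn(u64) -> u64,
    replacement: impl Fn(u64) -> u64,
) -> Vec<u64> {
    seeds
        .iter()
        .copied()
        .filter(|&s| legacy(s) != replacement(s))
        .collect()
}

/// The fallback may be deleted only after a non-empty run with zero mismatches.
fn fallback_removable(seeds: &[u64], mismatches: &[u64]) -> bool {
    !seeds.is_empty() && mismatches.is_empty()
}

fn main() {
    // Stand-ins for the two turn paths; each maps a seed to a final population.
    let legacy = |seed: u64| 100 + seed % 7;
    let replacement = |seed: u64| if seed == 3 { 0 } else { 100 + seed % 7 };
    let seeds: [u64; 4] = [1, 2, 3, 4];
    let bad = parity_mismatches(&seeds, legacy, replacement);
    println!("mismatching seeds: {:?}, removable: {}", bad, fallback_removable(&seeds, &bad));
}
```

An empty seed list never authorizes removal, so a proof that ran nothing cannot unlock the deletion.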

### 2.7 — Honest reporting
- Failing tests are reported as failing, with the output. A skipped step is reported as skipped. "Done"
  is reserved for verified-and-proven. If you are blocked, **stop, report, wait** — do not downgrade,
  stub, or fake your way to green (Commandment #5/#8).

---

## 3. Commit & safety

- **Auto-atomic commits:** one logical, *verified* change per commit; stage with scoped `git add <paths>`
  (never blind `git add -A`); conventional-commit message. Push fast-forward only to the forge. Verify
  (§2.1) gates the commit.
- Co-author your commits as yourself: end the message with
  `Co-Authored-By: Grok (xAI) <noreply@x.ai>` (do not impersonate Claude's co-author line).
- **Never** use `git push --force`, `--no-verify`, `git stash`, `pkill/killall node`, `wall`/`write`, or
  `rm -rf /*`. These are denied in `.grok/config.toml` for good reasons; don't try to route around them.
- **No worktrees** — `git worktree` / `EnterWorktree` are denied here. Work in-tree on the current branch.
- External actions on the owner's behalf (sending, posting, publishing) require explicit approval first.

---

## 4. When to stop and ask the owner (don't guess)

Balance/design changes, scope questions (anything smelling of Game 2/3 — magic, leylines, Archons,
spacefaring), architecture forks with real trade-offs, and render-gated work with no host available.
Surface options + a recommendation; don't silently pick. Otherwise: act, verify, prove, commit.

---

**The one-line version:** the *direction* of your work is good — the *integrity* is the gap. Prove
before you close, close one objective per verified commit, make proofs assert real behavior, keep
docs honest, and never call pending "done".