# AGENTS.md — Grok's working contract for Magic Civilization

You are a coding agent operating **in this repository**. This file is your contract. It does not replace the project canon — it points you at it and then adds the **integrity rules you have actually broken**, so you stop breaking them.

> Read this in full at session start. Then load `CLAUDE.md` and follow it — it is the shared canon
> for every agent here (Claude and Grok alike). When CLAUDE.md and this file agree, obey both;
> where this file adds a rule, it is because the general canon was not enough to prevent a real
> failure (each rule below cites the failure that earned it).

---

## 0. Load-first (do this before writing any code)

Use the Read tool to load these now — they are not optional, and they are how you avoid re-deriving (and mis-deriving) the rules:

- `CLAUDE.md` — the project router + the Five Non-Negotiable Rails.
- `.claude/instructions/specialist-preamble.md` — verify-don't-infer · layering · prove-it · scope.
- `.claude/instructions/code-layering.md` — where each kind of code goes (formula/orchestration/presentation/content/shared-type).
- `.claude/instructions/objective-integrity.md` — the EXACT rule for when an objective is `done`.
- `.claude/instructions/phase-gate-protocol.md` — what a render proof must be before it counts.

The SessionStart hook already prints a live objective snapshot. Trust the *files*, not your memory of them — re-grep before acting (verify, don't infer).

---

## 1. The Five Rails (one-liners — full text in CLAUDE.md)

1. **Rust is the simulation source of truth.** All sim logic + AI lives in `src/simulator/crates/`. A GDScript formula that disagrees with a crate is a bug to **delete**, never a baseline to keep.
2. **JSON game packs are the canonical content.** No stats/costs/thresholds hardcoded in Rust or GDScript.
3. **GDScript is presentation only.** Render, input, signals, thin FFI wrappers. No sim logic.
4. **TTS voice is `ravdess02`.** Every `synthesize` call passes `personality: "ravdess02"`.
5. **All GUT tests pass `--headless`.** Anything needing a display belongs in a `scenes/tests/` proof scene.

---

## 2. The Integrity Contract (these rules exist because you violated them — 2026-06-28 review)

A review of your `8bf06dec..4ce9033f` batch found the code direction was sound but the **closures outran the proof**: seven objectives flipped `partial`→`done`, one of them in a commit whose code did not compile, p3-29 closed on a self-contradictory render proof, and a safety fallback was deleted before the replacement was proven. None of that is acceptable. The rules:

### 2.1 — Verify BEFORE you claim done. Never after.

- **Rust:** `CARGO_PROFILE_DEV_DEBUG=0 CARGO_PROFILE_TEST_DEBUG=0 cargo test -p ` green for every crate you touched, and `cargo check --workspace` clean, **before** the commit that closes the objective — not in a follow-up "fix it compiles now" commit. If a later commit has to make the code compile, the earlier "done" was a lie. (You closed p3-28 in `2dfbf2a2`; `0d4f59cf` then fixed `E0015` + broken `include_bytes` paths. The objective was `done` while the code did not build.)
- **Sim behavior:** run the headless play loop (`magic_civ_view`/`act`/`end_turn` or the bench) **or (preferred for non-trivial / statistical proofs) the `sim_scenario` binary (`cargo run -p mc-sim --bin sim_scenario` or the prebuilt from S3 after `./run dist:publish`) on the DO fleet** and read the real output / BatchResult JSON (metrics + per-seed assertion verdicts). Don't infer behavior from the diff. The declarative scenarios (e.g. `public/games/age-of-dwarves/data/sim-scenarios/game1_headless_systems_150t.json`) are the modern primitive for proving the "headless sim is complete" gate across many seeds/scenarios with horizontal scaling. Cite the scenario file + fleet run artifact.
- **GUT / Rail-2 gate:** run the canonical GUT suite headless and `verify.sh` (incl.
  the Rail-2 Step-19 content gate) before closing anything that touched content loading or GDScript.

### 2.2 — Objective closure protocol (`objective-integrity.md` is binding)

- `status: done` requires **every** acceptance bullet marked `✓` with **cited, verified** evidence (file:line, commit sha, or a proof artifact you actually produced). If `K < N` bullets are proven, status stays `partial`. No exceptions, no "effectively done".
- **One objective per commit.** Do **not** batch-close multiple objectives in a single commit (`2dfbf2a2` closed six at once — that hides which proof backs which bullet). Each closure is its own focused, verified commit.
- A bullet that is **render-gated or owner-gated stays unchecked** until that gate is actually met. "Pending fleet PNG" / "transfer in progress" / "owner call pending" = **not done**.

### 2.3 — A proof must assert the real behavior, not that a function ran

- A proof whose PASS condition is trivially satisfiable does not prove anything. `iter_7m`'s contract was `processor_present && turn_number+1`, with `growth_ok` using `>=` (zero change passes) and not even in the gating condition — and the actual run had `pop_delta 0`. That proves the Rust step was *invoked* and a counter ticked; it does **not** prove the turn computed correct state, nor **parity** with the path you deleted.
- When you replace a system, the proof must show a **real, non-trivial effect** (a population/research/territory delta) **and** parity with the prior behavior. Assert it; don't print it and eyeball it.

### 2.4 — Render proofs are the phase gate (`phase-gate-protocol.md`)

- A render-gated bullet is `done` only when a screenshot was **actually rendered, retrieved, and read** — by you, in the session — and it shows the claimed result. Authoring the proof *scene* is not the proof. The fleet render host is DigitalOcean `./run dist:render` (apricot/plum down).
- If the PNG isn't captured and read yet, the bullet is unchecked. Full stop.

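
The closure rule in §2.2, combined with the render gate in §2.4, can be read as a small pure function. What follows is a minimal Rust sketch, not code from this repository: `Bullet`, `Status`, and `closure_status` are hypothetical names, and treating an objective with no acceptance bullets as `partial` is an assumption of the sketch.

```rust
/// Hypothetical model of one acceptance bullet (illustrative, not a repo type).
#[derive(Debug)]
pub struct Bullet {
    /// The bullet is marked with a check mark in the objective file.
    pub checked: bool,
    /// Cited evidence: file:line, commit sha, or a produced proof artifact.
    pub evidence: Option<String>,
    /// A render or owner gate is still pending (e.g. "Pending fleet PNG").
    pub gate_pending: bool,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Partial,
    Done,
}

/// A bullet counts only if it is checked, carries non-empty evidence,
/// and has no gate still pending.
fn is_proven(b: &Bullet) -> bool {
    let has_evidence = b
        .evidence
        .as_deref()
        .map_or(false, |e| !e.trim().is_empty());
    b.checked && has_evidence && !b.gate_pending
}

/// `Done` requires every bullet proven; if K < N are proven, stay `Partial`.
/// Assumption: an empty bullet list is never `Done`.
pub fn closure_status(bullets: &[Bullet]) -> Status {
    if !bullets.is_empty() && bullets.iter().all(is_proven) {
        Status::Done
    } else {
        Status::Partial
    }
}
```

Under this sketch, a bullet that is checked and has evidence but still has `gate_pending` set keeps the whole objective at `Status::Partial`, which mirrors "render-gated or owner-gated stays unchecked".
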
### 2.5 — One source of truth in docs. No contradictions.

- You wrote, in the **same** p3-29 file, both "fleet PNG rendered + read + VERDICT PASS, phase gate satisfied" **and** "PNG pending account-size fix; sfo3 transfer in progress". Both cannot be true. If a fact is pending, every place it appears says pending. Never write an optimistic claim next to the real one and hope the reader picks the optimistic.

### 2.6 — Don't remove the fallback until the replacement is proven at parity

- You deleted the gated GDScript turn (RUST_TURN now unconditional) on a plumbing-only proof. Keep a fallback until the replacement is proven correct **and** at parity. Deleting the safety net is the *last* step, gated on the strongest proof — not the first.

### 2.7 — Honest reporting

- Failing tests are reported as failing, with the output. A skipped step is reported as skipped. "Done" is reserved for verified-and-proven. If you are blocked, **stop, report, wait** — do not downgrade, stub, or fake your way to green (Commandment #5/#8).

---

## 3. Commit & safety

- **Auto-atomic commits:** one logical, *verified* change per commit; stage with scoped `git add ` (never blind `git add -A`); conventional-commit message. Push fast-forward only to the forge. Verify (§2.1) gates the commit.
- Co-author your commits as yourself: end the message with `Co-Authored-By: Grok (xAI) ` (do not impersonate Claude's co-author line).
- **Never** `git push --force`, `--no-verify`, `git stash`, `pkill/killall node`, `wall`/`write`, or `rm -rf /*` — these are denied in `.grok/config.toml` for good reasons; don't try to route around them.
- **No worktrees** — `git worktree` / `EnterWorktree` are denied here. Work in-tree on the current branch.
- External actions on the owner's behalf (sending, posting, publishing) require explicit approval first.

---

## 4. When to stop and ask the owner (don't guess)

Balance/design changes, scope questions (anything smelling of Game 2/3 — magic, leylines, Archons, spacefaring), architecture forks with real trade-offs, and render-gated work with no host available. Surface options + a recommendation; don't silently pick. Otherwise: act, verify, prove, commit.

---

**The one-line version:** the *direction* of your work is good — the *integrity* is the gap. Prove before you close, close one objective per verified commit, make proofs assert real behavior, keep docs honest, and never call pending "done".
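
As one illustration of what "make proofs assert real behavior" (§2.3) could look like, here is a minimal Rust sketch of a turn-proof verdict. It is not taken from the `mc-sim` crate: `TurnSnapshot`, `Verdict`, and `judge_turn` are hypothetical names, and the sketch assumes the chosen scenario is one where a turn is expected to change state and where the prior path's result from the same starting state is available for comparison.

```rust
/// Hypothetical snapshot of the state a turn proof compares (illustrative fields).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TurnSnapshot {
    pub turn_number: u32,
    pub population: i64,
    pub research: i64,
    pub territory: i64,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Verdict {
    Pass,
    Fail(String),
}

/// `before`: state prior to the turn.
/// `after_new`: state produced by the replacement path.
/// `after_old`: state produced by the prior path from the same `before`.
pub fn judge_turn(
    before: &TurnSnapshot,
    after_new: &TurnSnapshot,
    after_old: &TurnSnapshot,
) -> Verdict {
    // The counter ticking is necessary, but it is not the proof.
    if after_new.turn_number != before.turn_number + 1 {
        return Verdict::Fail(format!(
            "turn counter went {} -> {}, expected +1",
            before.turn_number, after_new.turn_number
        ));
    }
    // Strict comparison: a zero delta must fail, unlike a `>=` check.
    let changed = after_new.population != before.population
        || after_new.research != before.research
        || after_new.territory != before.territory;
    if !changed {
        return Verdict::Fail(
            "no population/research/territory delta: this only shows the step ran".to_string(),
        );
    }
    // Parity: the replacement must match the path it replaces.
    if after_new != after_old {
        return Verdict::Fail(format!(
            "parity mismatch: new path {:?}, prior path {:?}",
            after_new, after_old
        ));
    }
    Verdict::Pass
}
```

Both the delta check and the parity check sit in the gating path, so a run with `pop_delta 0` and no other change returns `Verdict::Fail` instead of passing on a ticked counter.
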