# AI Roadmap — Magic Civilization

Designer-facing narrative of what the AI is, what it can and can't do today, and how each post-launch patch improves it. Engineering reference is [`ai-production.md`](./ai-production.md); modder contract is [`modding/ai-controller.md`](./modding/ai-controller.md).

---

## Where we are at launch (v1.0)

Two AI families ship side-by-side:

- **Scripted AI** — six named clan personalities (Default, Warmonger, Builder, Tinkersmith, Peaceful, Opportunist). Driven by an MCTS-plus-heuristic engine. Transparent, tunable from JSON, fast.
- **Learned AI** — one neural-net opponent named `learned:duel-v1b`, trained via reinforcement learning against the scripted clans. Wins ~90% of 1v1 duels against the baseline. Anchors the **Champion** difficulty tier.

Difficulty is built by stacking handicaps + policy temperature on top of either family — not by training "weaker" networks.

---

## What the AI knows today (and what it doesn't)

Honest diagnosis of the launch learned AI. The simulator state — every tile, building, tech, opponent personality, fog-of-war reveal — is fed through a heavily-compressed 32-float summary before the policy sees it. The result:

| The AI **does** understand | The AI **does not** understand |
|---|---|
| Its own gold, science, culture | Per-tile terrain (which biomes are around it) |
| Total city count, total unit count | Individual cities' tile yields or worked tiles |
| How many opponents are at war / peace | *Which* opponent is which — they're aggregated counts |
| `science_per_turn` (one number) | The tech tree, prerequisites, or research choices |
| 16 hardcoded build options | Any building or unit outside that list |
| One-hex moves and attacks | Pathfinding more than one hex ahead |
| Whether it has units of type "warrior" or "founder" | Resource stockpiles, strategic resources, luxuries |

This is fine for duel-map play against weak opponents (current Champion difficulty).
It is **not enough** for 12-FFA play, complex maps, late-game decisions, or tournament-grade strategy. The five post-launch upgrades below (Stages 6.5 to 6.9, shipped across patches v1.1 to v1.4) close each of those gaps.

The simulator already exposes everything in the right column — the limitation is purely how the policy reads it. No engine rewrites are needed.

---

## v1.1 — "Sight" (Stage 6.5)

**What changes:** the AI gains *map awareness*.

- Reads the actual hex map (biomes, rivers, improvements, fog) instead of a summary statistic.
- Sees the full building + unit catalogs from data files; new content becomes trainable automatically.
- Sees its top-3 most-threatening opponents distinctly (instead of "I am at war with N players").
- Is bootstrapped from recordings of the six scripted personalities, so it cold-starts at roughly-scripted-AI strength on day one of training.

**What players notice:** the Champion-tier AI no longer makes "why-would-anyone-do-that" map decisions. It scouts. It defends chokepoints. It expands toward food, not into deserts.

---

## v1.2 — "Memory" (Stage 6.6)

**What changes:** the AI gains *short-term memory*.

- A recurrent network layer carries information across turns. The AI remembers what it just did and what each opponent just did.
- Per-opponent memory slots → it forms a working model of each opponent individually ("player 5 has been building catapults", "player 7 has been turtling").

**What players notice:** the AI starts to *adapt*. If you turtle, it shifts to siege. If you rush, it shifts to defense. It also stops repeating obvious mistakes within a single game.

---

## v1.3 — "Foresight" (Stage 6.7 + 6.8)

The two largest single-patch upgrades, shipped together.

### AlphaZero search at inference (6.7)

**What changes:** the AI thinks ahead before each decision.

- At every turn, the AI runs 64–256 quick simulated futures, guided by its trained intuition, and picks the line that looks best.
- This is the same recipe AlphaGo / AlphaZero used to surpass humans at Go and chess.
- The engine already has the search machinery built (we just plug the neural net into it as the policy + value head).

**What players notice:** a step-change in tactical strength. **+200–400 Elo** is the canonical result of adding search to a trained policy. The AI stops blundering. It sets traps. It calculates multi-step combats.

### Multi-step movement (6.8)

**What changes:** the AI commands its units like a player does.

- Pick a destination tile; the simulator paths there over multiple turns.
- Set rally points so freshly-built units head somewhere useful.
- Issue patrol routes for scouts.
- Order escorts (a defender follows a settler).

**What players notice:** the AI stops moving one hex at a time. Armies march in formation. Builders get protected. Scouts cover the map deliberately.

---

## v1.4 — "Mastery" (Stage 6.9)

**What changes:** the AI learns *against itself* instead of against the scripted opponents.

- Self-play league: generation 0 plays generation 1 plays generation 2, and so on. Each generation must beat the entire prior population to graduate.
- 12-slot huge-map free-for-all is the training arena — no handicaps, no easier opponents.
- Four specialist variants ship alongside the generalist: **Rush** (early pressure), **Turtle** (defensive consolidation), **Tech** (research race), **Economy** (long-game empire). Each is a separate selectable controller; modders can train their own.

**What players notice:** the strongest AI difficulty tier becomes **tournament-grade**. The specialists give the campaign distinct opposition personalities that you can prepare strategies against. New mod authors get four worked examples (one per specialist) to learn from.

---

## How a player picks AI in v1.0

In the New Game screen, each opponent slot has an AI dropdown. Choose:

- a scripted clan personality (named, themed), or
- the learned AI (currently one: `learned:duel-v1b`).

The patches above add more entries to that dropdown; they do not change the UI flow.
A v1.0 save loads in v1.4 because every save records which AI was driving which slot.

---

## How a player picks difficulty

Difficulty is **never** "the AI's brain is smaller." Difficulty is:

- a resource handicap (the AI gets more/fewer starting yields), and
- (for learned AI tiers only) a "temperature" that makes the AI play more or less consistently.

So the ladder is:

| Difficulty | AI | Why it's harder |
|---|---|---|
| Settler | Peaceful scripted clan | AI starts behind; rarely attacks |
| Chieftain | Default scripted clan | Balanced |
| Warlord | Rotating scripted clans | Multiple personalities; less predictable |
| King (v1.4+) | Best league-gen AI | Genuinely strong AI |
| Champion (v1.0) → (v1.4) | Learned AI, low temperature | Near-optimal play; small handicap |

---

## Cross-references

- Engineering reference: [`ai-production.md`](./ai-production.md)
- Modder contract: [`modding/ai-controller.md`](./modding/ai-controller.md)
- ABI decisions memo: [`modding/abi-decisions.md`](./modding/abi-decisions.md)
- Plan file (internal): `~/.claude/plans/in-the-game-civilization-elegant-popcorn.md`
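
---

## Appendix: a sketch of policy temperature

The difficulty section describes temperature as a knob that makes the learned AI play more or less consistently. Below is a minimal sketch of that idea. It assumes the policy emits one raw score (logit) per candidate action and that the temperature divides those scores before a softmax. This is the textbook construction, not a description of the game's actual code; the function names and the example scores are hypothetical.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw action scores into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(logits, temperature, rng):
    """Draw one action index from the temperature-adjusted distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Hypothetical scores for three candidate actions (not taken from the game).
example_logits = [2.0, 1.0, 0.0]
low_t = softmax_with_temperature(example_logits, 0.25)   # consistent play
high_t = softmax_with_temperature(example_logits, 2.0)   # looser play
```

With a low temperature almost all probability lands on the top-scored action, which matches "near-optimal play" in the Champion row. A higher temperature spreads probability across weaker actions, so the same network plays less consistently without being retrained.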