AI Roadmap — Magic Civilization
Designer-facing narrative of what the AI is, what it can and can't do today, and how each post-launch patch improves it. Engineering reference is ai-production.md; modder contract is modding/ai-controller.md.
Where we are at launch (v1.0)
Two AI families ship side-by-side:
- Scripted AI: six named clan personalities (Default, Warmonger, Builder, Tinkersmith, Peaceful, Opportunist). Driven by an MCTS-plus-heuristic engine. Transparent, tunable from JSON, fast.
- Learned AI: one neural-net opponent named learned:duel-v1b, trained via reinforcement learning against the scripted clans. Wins ~90% of 1v1 duels against the baseline. Anchors the Champion difficulty tier.
Difficulty is built by stacking handicaps + policy temperature on top of either family — not by training "weaker" networks.
What the AI knows today (and what it doesn't)
Honest diagnosis of the launch learned AI. The simulator state — every tile, building, tech, opponent personality, fog-of-war reveal — is fed through a heavily-compressed 32-float summary before the policy sees it. The result:
| The AI does understand | The AI does not understand |
|---|---|
| Its own gold, science, culture | Per-tile terrain (which biomes are around it) |
| Total city count, total unit count | Individual cities' tile yields or worked tiles |
| How many opponents are at war / peace | Which opponent is which — they're aggregated counts |
| science_per_turn (one number) | The tech tree, prerequisites, or research choices |
| 16 hardcoded build options | Any building or unit outside that list |
| One-hex moves and attacks | Pathfinding more than one hex ahead |
| Whether it has units of type "warrior" or "founder" | Resource stockpiles, strategic resources, luxuries |
This is fine for duel-map play against weak opponents (current Champion difficulty). It is not enough for 12-FFA play, complex maps, late-game decisions, or tournament-grade strategy. The four post-launch patches below (five stages, 6.5 through 6.9) close each of those gaps.
The simulator already exposes everything in the right column — the limitation is purely how the policy reads it. No engine rewrites are needed.
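To make the "aggregated counts" limitation concrete, here is a minimal sketch of how a summary feature can erase opponent identity. The `Opponent` type, the `summarize_opponents` function, and the two-float layout are hypothetical illustrations, not the shipped 32-float encoding.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Opponent:
    player_id: int
    at_war: bool


def summarize_opponents(opponents: List[Opponent]) -> List[float]:
    """Collapse per-opponent state into two counts (hypothetical layout)."""
    at_war = sum(1 for o in opponents if o.at_war)
    at_peace = len(opponents) - at_war
    return [float(at_war), float(at_peace)]


# Two different diplomatic situations: the enemy is player 5 in one
# and player 7 in the other.
a = [Opponent(5, True), Opponent(7, False)]
b = [Opponent(5, False), Opponent(7, True)]

# Both collapse to the same features, so a policy reading only the
# summary cannot tell them apart.
same = summarize_opponents(a) == summarize_opponents(b)
```

Both situations produce `[1.0, 1.0]`, which is why the policy can know how many opponents are at war without knowing which ones.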
v1.1 — "Sight" (Stage 6.5)
What changes: the AI gains map awareness.
- Reads the actual hex map (biomes, rivers, improvements, fog) instead of a summary statistic.
- Sees the full building + unit catalogs from data files; new content becomes trainable automatically.
- Sees its top-3 most-threatening opponents distinctly (instead of "I am at war with N players").
- Is bootstrapped from recordings of the six scripted personalities, so it cold-starts at roughly-scripted-AI strength on day one of training.
What players notice: the Champion-tier AI no longer makes "why-would-anyone-do-that" map decisions. It scouts. It defends chokepoints. It expands toward food, not into deserts.
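Selecting the top-3 most-threatening opponents could look roughly like the sketch below. How a threat score is computed is not specified in this document, so the scores here are assumed to exist already, and `top_threats` is a hypothetical name.

```python
import heapq
from typing import Dict, List


def top_threats(threat_by_player: Dict[int, float], k: int = 3) -> List[int]:
    """Return the ids of the k highest-threat opponents, most threatening first.

    `threat_by_player` maps player id to a precomputed threat score
    (an assumption; the scoring itself is out of scope here).
    """
    return heapq.nlargest(k, threat_by_player, key=threat_by_player.get)


# Hypothetical scores for four opponents.
scores = {2: 0.1, 5: 0.9, 7: 0.4, 9: 0.7}
distinct = top_threats(scores)
```

With these example scores, players 5, 9, and 7 get distinct slots in that order, and player 2 would be left to whatever aggregate remains.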
v1.2 — "Memory" (Stage 6.6)
What changes: the AI gains short-term memory.
- A recurrent network layer carries information across turns. The AI remembers what it just did and what each opponent just did.
- Per-opponent memory slots let it form a working model of each opponent individually ("player 5 has been building catapults", "player 7 has been turtling").
What players notice: the AI starts to adapt. If you turtle, it shifts to siege. If you rush, it shifts to defense. It also stops repeating obvious mistakes within a single game.
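The per-opponent memory idea can be sketched with a very simple recurrent update. This is an Elman-style cell with scalar weights standing in for learned matrices, chosen for brevity; the actual recurrent layer, the feature names, and the `OpponentMemory` class are assumptions made for illustration.

```python
import math
from typing import Dict, List

Vector = List[float]


def recurrent_step(h: Vector, x: Vector, w_h: float = 0.5, w_x: float = 1.0) -> Vector:
    """Elman-style update: new_h[i] = tanh(w_h * h[i] + w_x * x[i]).

    Scalar weights replace learned weight matrices (assumption for brevity).
    """
    return [math.tanh(w_h * hi + w_x * xi) for hi, xi in zip(h, x)]


class OpponentMemory:
    """One hidden-state slot per opponent, carried across turns."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.slots: Dict[int, Vector] = {}

    def observe(self, player_id: int, features: Vector) -> Vector:
        # Unseen opponents start from an all-zero hidden state.
        h = self.slots.get(player_id, [0.0] * self.size)
        self.slots[player_id] = recurrent_step(h, features)
        return self.slots[player_id]


memory = OpponentMemory(size=2)
# Hypothetical per-turn features: [built_siege_unit, stayed_inside_borders]
for _ in range(3):
    memory.observe(5, [1.0, 0.0])  # player 5 keeps building catapults
    memory.observe(7, [0.0, 1.0])  # player 7 keeps turtling
```

After a few turns the two slots diverge: player 5's state is dominated by the siege feature and player 7's by the turtling feature, which is the kind of signal a policy could condition on.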
v1.3 — "Foresight" (Stage 6.7 + 6.8)
The two largest single-patch upgrades, shipped together.
AlphaZero search at inference (6.7)
What changes: the AI thinks ahead before each decision.
- At every turn, the AI runs 64–256 quick simulated futures, guided by its trained intuition, and picks the line that looks best.
- This is the same recipe AlphaGo / AlphaZero used to surpass humans at Go and chess.
- The engine already has the search machinery built (we just plug the neural net into it as the policy + value head).
What players notice: a step-change in tactical strength. +200–400 Elo is the canonical result of adding search to a trained policy. The AI stops blundering. It sets traps. It calculates multi-step combats.
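For readers unfamiliar with how a network "guides" the search, the sketch below shows the PUCT selection rule used in AlphaZero-style tree search: each candidate action is scored by its average value so far plus an exploration bonus weighted by the policy prior. The `Edge` type, the action names, and the `c_puct` value are illustrative assumptions, not the engine's actual interface.

```python
import math
from dataclasses import dataclass
from typing import Dict


@dataclass
class Edge:
    prior: float             # P(s, a) from the policy head
    visits: int = 0          # N(s, a)
    value_sum: float = 0.0   # sum of values backed up through this edge

    @property
    def q(self) -> float:
        """Mean backed-up value; 0.0 for an unvisited edge."""
        return self.value_sum / self.visits if self.visits else 0.0


def puct_score(edge: Edge, parent_visits: int, c_puct: float = 1.5) -> float:
    """Q(s, a) + c_puct * P(s, a) * sqrt(N(s)) / (1 + N(s, a))."""
    bonus = c_puct * edge.prior * math.sqrt(parent_visits) / (1 + edge.visits)
    return edge.q + bonus


def select_action(edges: Dict[str, Edge], c_puct: float = 1.5) -> str:
    """Pick the action with the highest PUCT score at one tree node."""
    parent_visits = sum(e.visits for e in edges.values())
    return max(edges, key=lambda a: puct_score(edges[a], parent_visits, c_puct))


# Hypothetical node statistics after 12 simulations.
edges = {
    "attack": Edge(prior=0.6, visits=10, value_sum=2.0),
    "fortify": Edge(prior=0.3, visits=2, value_sum=1.2),
    "retreat": Edge(prior=0.1),
}
chosen = select_action(edges)
```

Here the search explores "fortify" next: it has a lower prior than "attack" but a better average value and far fewer visits, so its combined score is highest.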
Multi-step movement (6.8)
What changes: the AI commands its units like a player does.
- Pick a destination tile; the simulator paths there over multiple turns.
- Set rally points so freshly-built units head somewhere useful.
- Issue patrol routes for scouts.
- Order escorts (a defender follows a settler).
What players notice: the AI stops moving one hex at a time. Armies march in formation. Builders get protected. Scouts cover the map deliberately.
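A destination order ultimately needs a path across the hex map. The sketch below uses breadth-first search over axial hex coordinates with uniform movement cost; the simulator's real pathfinder (terrain costs, zone of control, and so on) is not described here, so treat `find_path` and the coordinate scheme as assumptions.

```python
from collections import deque
from typing import Dict, List, Optional, Set, Tuple

Hex = Tuple[int, int]  # axial coordinates (q, r)

# The six neighbours of a hex in axial coordinates.
NEIGHBOR_OFFSETS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def find_path(start: Hex, goal: Hex, passable: Set[Hex]) -> Optional[List[Hex]]:
    """Shortest path by hex count from start to goal, or None if unreachable."""
    came_from: Dict[Hex, Optional[Hex]] = {start: None}
    frontier = deque([start])
    while frontier:
        current = frontier.popleft()
        if current == goal:
            # Walk the parent links back to the start, then reverse.
            path: List[Hex] = []
            node: Optional[Hex] = current
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]
        for dq, dr in NEIGHBOR_OFFSETS:
            nxt = (current[0] + dq, current[1] + dr)
            if nxt in passable and nxt not in came_from:
                came_from[nxt] = current
                frontier.append(nxt)
    return None


# A 4x4 patch of map with two impassable hexes blocking the direct route.
passable = {(q, r) for q in range(4) for r in range(4)} - {(1, 0), (1, 1)}
path = find_path((0, 0), (3, 0), passable)
```

A unit following `path` one step per turn reaches the destination without the policy having to re-decide each hex, which is the behavioral change this stage describes.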
v1.4 — "Mastery" (Stage 6.9)
What changes: the AI learns against itself instead of against the scripted opponents.
- Self-play league: generation 0 plays generation 1 plays generation 2, and so on. Each generation must beat the entire prior population to graduate.
- 12-slot huge-map free-for-all is the training arena — no handicaps, no easier opponents.
- Four specialist variants ship alongside the generalist: Rush (early pressure), Turtle (defensive consolidation), Tech (research race), Economy (long-game empire). Each is a separate selectable controller; modders can train their own.
What players notice: the strongest AI difficulty tier becomes tournament-grade. The specialists give the campaign distinct opposition personalities that you can prepare strategies against. New mod authors get four worked examples (one per specialist) to learn from.
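The graduation rule ("beat the entire prior population") can be expressed as a small check. The win-rate threshold and the generation labels below are hypothetical; the document does not state what margin counts as "beating" a prior generation.

```python
from typing import Dict


def graduates(win_rate_vs: Dict[str, float], threshold: float = 0.55) -> bool:
    """True only if the candidate meets the threshold against EVERY prior generation.

    `win_rate_vs` maps a prior generation's label to the candidate's measured
    win rate against it. The 0.55 default is an assumed value.
    """
    return bool(win_rate_vs) and all(
        rate >= threshold for rate in win_rate_vs.values()
    )


# Candidate generation 3 evaluated against generations 0 to 2.
strong = graduates({"gen0": 0.81, "gen1": 0.66, "gen2": 0.58})
# Beats the newest generation but regressed against an old one: no graduation.
regressed = graduates({"gen0": 0.49, "gen1": 0.70, "gen2": 0.61})
```

Checking against the whole population, rather than only the latest generation, is what guards against a candidate that learns to exploit its predecessor while forgetting how to beat older styles.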
How a player picks AI in v1.0
In the New Game screen, each opponent slot has an AI dropdown. Choose:
- a scripted clan personality (named, themed), or
- the learned AI (currently one: learned:duel-v1b).
The four patches above add more entries to that dropdown; they do not change the UI flow. A v1.0 save loads in v1.4 because every save records which AI was driving which slot.
How a player picks difficulty
Difficulty is never "the AI's brain is smaller." Difficulty is:
- a resource handicap (the AI gets more/fewer starting yields), and
- (for learned AI tiers only) a "temperature" that makes the AI play more or less consistently.
So the ladder is:
| Difficulty | AI | Why it's harder |
|---|---|---|
| Settler | Peaceful scripted clan | AI starts behind; rarely attacks |
| Chieftain | Default scripted clan | Balanced |
| Warlord | Rotating scripted clans | Multiple personalities; less predictable |
| King (v1.4+) | Best league-gen AI | Genuinely strong AI |
| Champion (v1.0) → (v1.4) | Learned AI, low temperature | Near-optimal play; small handicap |
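The "temperature" knob is usually implemented as a divisor on the policy's action scores before sampling. The sketch below shows that standard softmax-with-temperature technique; how the game wires it into each difficulty tier, and the function names used here, are assumptions.

```python
import math
import random
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert action scores to probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(logits: List[float], temperature: float, rng: random.Random) -> int:
    """Sample an action index from the temperature-adjusted distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate actions
cold = softmax_with_temperature(logits, 0.25)  # near-deterministic play
warm = softmax_with_temperature(logits, 2.0)   # looser, more varied play
action = sample_action(logits, 0.25, random.Random(0))
```

At low temperature almost all probability mass sits on the best-scored action, so the same network plays near its full strength; at high temperature it picks weaker moves more often, which is how one trained policy can serve several tiers.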
Cross-references
- Engineering reference: ai-production.md
- Modder contract: modding/ai-controller.md
- ABI decisions memo: modding/abi-decisions.md
- Plan file (internal): ~/.claude/plans/in-the-game-civilization-elegant-popcorn.md