autocommit b8855ecdd2 docs(docs): 📝 Add patch-by-patch narrative for AI post-launch content series documentation

Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>

2026-05-26 02:21:13 -07:00


AI Roadmap — Magic Civilization

Designer-facing narrative of what the AI is, what it can and can't do today, and how each post-launch patch improves it. Engineering reference is ai-production.md; modder contract is modding/ai-controller.md.

Where we are at launch (v1.0)

Two AI families ship side-by-side:

Scripted AI: six named clan personalities (Default, Warmonger, Builder, Tinkersmith, Peaceful, Opportunist). Driven by an MCTS-plus-heuristic engine. Transparent, tunable from JSON, fast.
Learned AI: one neural-net opponent named learned:duel-v1b, trained via reinforcement learning against the scripted clans. Wins ~90% of 1v1 duels against the baseline. Anchors the Champion difficulty tier.

Difficulty is built by stacking handicaps + policy temperature on top of either family — not by training "weaker" networks.

What the AI knows today (and what it doesn't)

Honest diagnosis of the launch learned AI. The simulator state (every tile, building, tech, opponent personality, fog-of-war reveal) is fed through a heavily compressed 32-float summary before the policy sees it. The result:

The AI does understand	The AI does not understand
Its own gold, science, culture	Per-tile terrain (which biomes are around it)
Total city count, total unit count	Individual cities' tile yields or worked tiles
How many opponents are at war / peace	Which opponent is which — they're aggregated counts
`science_per_turn` (one number)	The tech tree, prerequisites, or research choices
16 hardcoded build options	Any building or unit outside that list
One-hex moves and attacks	Pathfinding more than one hex ahead
Whether it has units of type "warrior" or "founder"	Resource stockpiles, strategic resources, luxuries
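
To make the compression concrete, here is a minimal sketch of how a rich game state could be squashed into a fixed-length vector. The class and field names are assumptions made for illustration, not the shipped layout; the only detail taken from this document is the 32-float length.

```python
from dataclasses import dataclass, field

SUMMARY_SIZE = 32  # length of the summary vector described above


@dataclass
class EmpireState:
    # Hypothetical fields; the names are illustrative only.
    gold: float = 0.0
    science_per_turn: float = 0.0
    culture: float = 0.0
    city_count: int = 0
    unit_count: int = 0
    opponents_at_war: int = 0
    opponents_at_peace: int = 0
    tile_biomes: dict = field(default_factory=dict)  # (q, r) -> biome name


def summarize(state: EmpireState) -> list[float]:
    """Flatten the state into SUMMARY_SIZE floats, padding unused slots with zeros."""
    features = [
        state.gold,
        state.science_per_turn,
        state.culture,
        float(state.city_count),
        float(state.unit_count),
        float(state.opponents_at_war),
        float(state.opponents_at_peace),
    ]
    # Per-tile data such as tile_biomes is never copied into the vector,
    # so the policy cannot tell two different maps apart.
    return features + [0.0] * (SUMMARY_SIZE - len(features))
```

Two states that differ only in their terrain produce the same vector, which is the blind spot the right-hand column describes.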

This is fine for duel-map play against weak opponents (current Champion difficulty). It is not enough for 12-FFA play, complex maps, late-game decisions, or tournament-grade strategy. The four post-launch patches below (covering five stages, 6.5 through 6.9) close each of those gaps.

The simulator already exposes everything in the right column — the limitation is purely how the policy reads it. No engine rewrites are needed.

v1.1 — "Sight" (Stage 6.5)

What changes: the AI gains map awareness.

Reads the actual hex map (biomes, rivers, improvements, fog) instead of a summary statistic.
Sees the full building + unit catalogs from data files; new content becomes trainable automatically.
Sees its top-3 most-threatening opponents distinctly (instead of "I am at war with N players").
Is bootstrapped from recordings of the six scripted personalities, so it cold-starts at roughly-scripted-AI strength on day one of training.
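
As an illustration of seeing the top-3 most-threatening opponents distinctly, the sketch below ranks opponents by a threat score and keeps the three highest. The scoring formula and the Opponent fields are hypothetical; how the real observation ranks threats is not specified in this document.

```python
import heapq
from typing import NamedTuple


class Opponent(NamedTuple):
    # Hypothetical per-opponent facts the observation could expose.
    player_id: int
    military_strength: float
    distance_to_capital: int  # in hexes
    at_war: bool


def threat_score(opponent: Opponent) -> float:
    """Assumed heuristic: strong, nearby, hostile opponents score highest."""
    base = opponent.military_strength / (1 + opponent.distance_to_capital)
    return base * (2.0 if opponent.at_war else 1.0)


def top_threats(opponents, k=3):
    """Return the k opponents with the highest threat score, strongest first."""
    return heapq.nlargest(k, opponents, key=threat_score)
```

Each selected opponent would then get its own block of input features, rather than being folded into an aggregate count.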

What players notice: the Champion-tier AI no longer makes "why-would-anyone-do-that" map decisions. It scouts. It defends chokepoints. It expands toward food, not into deserts.

v1.2 — "Memory" (Stage 6.6)

What changes: the AI gains short-term memory.

A recurrent network layer carries information across turns. The AI remembers what it just did and what each opponent just did.
Per-opponent memory slots let it form a working model of each opponent individually ("player 5 has been building catapults", "player 7 has been turtling").
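
The idea of per-opponent memory slots can be sketched with a toy recurrent update. This is a deliberately simplified stand-in: it uses fixed scalar weights where a trained network would use learned weight matrices, and the slot size and class names are assumptions.

```python
import math

HIDDEN = 4  # size of each opponent's memory vector (illustrative)


def step(hidden, observation, w_h=0.5, w_x=1.0):
    """One recurrent update: blend the previous memory with the new observation."""
    return [math.tanh(w_h * h + w_x * x) for h, x in zip(hidden, observation)]


class OpponentMemory:
    """Keeps one hidden-state vector per opponent, updated once per turn."""

    def __init__(self):
        self.slots = {}  # player_id -> hidden state

    def observe(self, player_id, observation):
        previous = self.slots.get(player_id, [0.0] * HIDDEN)
        self.slots[player_id] = step(previous, observation)
        return self.slots[player_id]
```

Because each opponent has its own slot, repeated observations of one player (say, catapult construction in the first feature) accumulate there without touching the memory held for anyone else.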

What players notice: the AI starts to adapt. If you turtle, it shifts to siege. If you rush, it shifts to defense. It also stops repeating obvious mistakes within a single game.

v1.3 — "Foresight" (Stage 6.7 + 6.8)

The two largest single-patch upgrades, shipped together.

AlphaZero search at inference (6.7)

What changes: the AI thinks ahead before each decision.

At every turn, the AI runs 64–256 quick simulated futures, guided by its trained intuition, and picks the line that looks best.
This is the same recipe AlphaGo / AlphaZero used to surpass humans at Go and chess.
The engine already has the search machinery built (we just plug the neural net into it as the policy + value head).
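
For readers who want to see how the network's "intuition" guides the search, the sketch below shows the PUCT selection rule used in AlphaZero-style tree search: each candidate action is scored by its average simulated value plus an exploration bonus proportional to the policy prior. The class name, the c_puct value, and the action labels are assumptions; the engine's own search code may differ.

```python
import math
from dataclasses import dataclass


@dataclass
class Child:
    prior: float            # policy head's probability for this action
    visits: int = 0         # simulations that passed through this action
    value_sum: float = 0.0  # sum of value-head estimates from those simulations

    @property
    def mean_value(self) -> float:
        return self.value_sum / self.visits if self.visits else 0.0


def puct_score(child: Child, parent_visits: int, c_puct: float = 1.5) -> float:
    """Average value plus an exploration bonus that shrinks as visits grow."""
    bonus = c_puct * child.prior * math.sqrt(max(1, parent_visits)) / (1 + child.visits)
    return child.mean_value + bonus


def select_action(children: dict, c_puct: float = 1.5):
    """Pick the action whose child has the highest PUCT score."""
    parent_visits = sum(child.visits for child in children.values())
    return max(children, key=lambda a: puct_score(children[a], parent_visits, c_puct))
```

Running this selection once per simulated future, 64 to 256 times per decision, concentrates the simulations on the lines the network already thinks are promising while still trying unvisited ones.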

What players notice: a step-change in tactical strength. A gain of 200–400 Elo is the commonly cited result of adding search to a trained policy. The AI stops blundering. It sets traps. It calculates multi-step combats.

Multi-step movement (6.8)

What changes: the AI commands its units like a player does.

Pick a destination tile; the simulator paths there over multiple turns.
Set rally points so freshly-built units head somewhere useful.
Issue patrol routes for scouts.
Order escorts (a defender follows a settler).
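
A minimal sketch of what "pick a destination tile and path there" involves: breadth-first search over a hex grid in axial coordinates. It assumes every passable hex costs the same to enter, which the simulator's own pathfinder presumably does not; the function name and the passable-set representation are illustrative.

```python
from collections import deque

# The six neighbours of a hex in axial (q, r) coordinates.
HEX_DIRECTIONS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def find_path(start, goal, passable):
    """Return the shortest list of hexes from start to goal, or None if unreachable."""
    came_from = {start: None}
    frontier = deque([start])
    while frontier:
        current = frontier.popleft()
        if current == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        q, r = current
        for dq, dr in HEX_DIRECTIONS:
            neighbour = (q + dq, r + dr)
            if neighbour in passable and neighbour not in came_from:
                came_from[neighbour] = current
                frontier.append(neighbour)
    return None
```

With a path in hand, the policy only has to choose the destination; the unit then advances along the returned list over as many turns as its movement allows.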

What players notice: the AI stops moving one hex at a time. Armies march in formation. Builders get protected. Scouts cover the map deliberately.

v1.4 — "Mastery" (Stage 6.9)

What changes: the AI learns against itself instead of against the scripted opponents.

Self-play league: generation 0 plays generation 1 plays generation 2, and so on. Each generation must beat the entire prior population to graduate.
12-slot huge-map free-for-all is the training arena — no handicaps, no easier opponents.
Four specialist variants ship alongside the generalist: Rush (early pressure), Turtle (defensive consolidation), Tech (research race), Economy (long-game empire). Each is a separate selectable controller; modders can train their own.
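
The graduation rule can be sketched as a gate that a candidate generation must pass against every earlier member of the league. The win-rate threshold, the game count, and the head-to-head play_match callback are all assumptions for illustration; the actual arena described above is a 12-slot free-for-all, not a series of duels.

```python
def graduates(candidate, population, play_match, games=20, threshold=0.5):
    """Return True only if the candidate out-wins every incumbent.

    play_match(candidate, incumbent) is a hypothetical callback that plays
    one game and returns True when the candidate wins.
    """
    for incumbent in population:
        wins = sum(1 for _ in range(games) if play_match(candidate, incumbent))
        if wins / games <= threshold:
            return False  # failed against at least one prior generation
    return True
```

Requiring a win against the whole population, rather than only the latest generation, guards against a new generation that beats its immediate predecessor but has forgotten how to handle older strategies.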

What players notice: the strongest AI difficulty tier becomes tournament-grade. The specialists give the campaign distinct opposition personalities that you can prepare strategies against. New mod authors get four worked examples (one per specialist) to learn from.

How a player picks AI in v1.0

In the New Game screen, each opponent slot has an AI dropdown. Choose:

a scripted clan personality (named, themed), or
the learned AI (currently one: learned:duel-v1b).

The four patches above add more entries to that dropdown; they do not change the UI flow. A v1.0 save loads in v1.4 because every save records which AI was driving which slot.

How a player picks difficulty

Difficulty is never "the AI's brain is smaller." Difficulty is:

a resource handicap (the AI gets more/fewer starting yields), and
(for learned AI tiers only) a "temperature" that makes the AI play more or less consistently.
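
The temperature knob can be illustrated with a standard temperature-scaled softmax over the policy's action scores. This is a generic sketch of the technique, not the game's sampling code; the function names are assumptions.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw action scores into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(logits, temperature, rng=random):
    """Sample an action index according to the temperature-scaled probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]
```

At a low temperature nearly all the probability lands on the highest-scored action, so the AI plays its best move almost every time; at a high temperature the probabilities flatten and it picks weaker moves more often, using the same trained network throughout.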

So the ladder is:

Difficulty	AI	Why it's harder
Settler	Peaceful scripted clan	AI starts behind; rarely attacks
Chieftain	Default scripted clan	Balanced
Warlord	Rotating scripted clans	Multiple personalities; less predictable
King (v1.4+)	Best league-gen AI	Genuinely strong AI
Champion (v1.0) → (v1.4)	Learned AI, low temperature	Near-optimal play; small handicap

Cross-references

Engineering reference: ai-production.md
Modder contract: modding/ai-controller.md
ABI decisions memo: modding/abi-decisions.md
Plan file (internal): ~/.claude/plans/in-the-game-civilization-elegant-popcorn.md
