diff --git a/tools/sprite-generation/docs/PIPELINE.md b/tools/sprite-generation/docs/PIPELINE.md index 0595a53d..f5b7540c 100644 --- a/tools/sprite-generation/docs/PIPELINE.md +++ b/tools/sprite-generation/docs/PIPELINE.md @@ -2,156 +2,179 @@ ## Overview -A single long-running process (`cli.py run`) continuously moves sprites from `needed` → `installed` with minimal human intervention. The human's only job is picking winners in the Review GUI. +A single long-running process (`cli.py run`) generates sprites one at a time, scores each immediately, and cycles through the roster until all pass. The human's job is reviewing passing sprites in the Theater GUI and approving winners. ``` - ┌─────────────────────────────────────────┐ - │ ORCHESTRATOR │ - │ (cli.py run — daemon) │ - │ │ - ┌────────┐ auto │ ┌──────────┐ auto ┌──────────┐ │ - │ SCAN │─────────────▶│ │ GENERATE │──────▶│ RANK │ │ - │ │ │ │ model- │ │ (Sonnet) │ │ - │ JSON │ │ │ boss GPU │ │ vision │ │ - │ → DB │ │ │ 4 var/ea │ │ auto │ │ - └────────┘ │ └────┬─────┘ └────┬─────┘ │ - │ │ │ │ - │ │ ┌──────────────┘ │ - │ │ │ score < threshold? 
│ - │ │ │ yes → back to GENERATE │ - │ │ │ no → move to REVIEW │ - │ ▼ ▼ │ - │ ┌──────────────┐ │ - │ │ REVIEW QUEUE │◀── Sonnet-ranked │ - │ │ (ready for │ sprites waiting │ - │ │ human pick) │ for approval │ - │ └──────┬───────┘ │ - └─────────┼──────────────────────────────┘ - │ - ┌────────▼────────┐ - │ REVIEW GUI │ ◀── human picks - │ (theater) │ best variant - │ click approve │ - └────────┬────────┘ - │ - ┌────────▼────────┐ - │ INSTALL │ auto on approve - │ chroma key rm │ - │ resize │ - │ → game assets │ - │ → manifest DB │ - └─────────────────┘ + game JSON ──scan──▶ spritegen.db (sprites table — what to generate) + │ + ┌──────┴──────┐ + │ ORCHESTRATOR │ cli.py run + │ (daemon) │ + └──────┬──────┘ + │ + ┌────────────┼────────────┐ + ▼ ▼ ▼ + ┌─────────┐ ┌─────────┐ ┌──────────┐ + │GENERATE │ │ RANK │ │ STATUS │ + │model-boss│ │ Sonnet │ │ CHECK │ + │1 sprite │ │ vision │ │ │ + │4 variants│ │ 7 dims │ │ pass? │ + └────┬────┘ └────┬────┘ └────┬─────┘ + │ │ │ + ▼ ▼ ▼ + raw/*.png spritegen.db status=review (pass) + variant.notes status=needed (fail, retry) + variant.rating max 5 attempts → review anyway + │ + ┌──────┴──────┐ + │ THEATER GUI │ localhost:5850 + │ human picks │ + │ approve/skip│ + └──────┬──────┘ + │ + ┌──────┴──────┐ + │ INSTALL │ cli.py approve + │ chroma key │ + │ resize 256² │ + │ → game dir │ + └─────────────┘ ``` -## Orchestrator Loop (`cli.py run`) - -```python -while True: - # 1. Pick sprites that need work - needed = get_sprites(status="needed", limit=BATCH_SIZE) - - # 2. Generate variants (4 per sprite, with pose reference) - for sprite in needed: - generate_batch([sprite.id], variants_per=4, pose_ref=POSE_REF) - # sprite auto-transitions: needed → review - - # 3. 
Rank newly completed sprites - in_review = get_sprites(status="review", unranked=True) - for sprite in in_review: - result = rank_and_filter(sprite.id) - if result.needs_regen: - # Not enough good variants → back to needed - reset_sprite(sprite.id) # will be re-generated next loop - - # 4. Sleep between batches (let GPU breathe) - sleep(30) -``` - -## State Machine +## Data Model (spritegen.db) ``` -needed ──generate──▶ review ──rank──▶ review (ranked) - ▲ │ - │ │ - └── needs_regen ◀── score < threshold ──┘ - │ - score ≥ threshold - │ - ▼ - review (ready) - │ - human approve - │ - ▼ - approved - │ - auto process - + install - │ - ▼ - installed +sprites (51) One row per sprite to generate + id TEXT PK "units/spearmen_dwarves_m" + category TEXT "units" + entity_id TEXT "spearmen_dwarves_m" + status TEXT needed → review → approved → installed + prompt TEXT Current prompt template (mutable) + negative_prompt TEXT Current negative (mutable) + install_path TEXT Game asset destination path + gen_width/height Generation resolution (1024×1024) + target_width/height Final sprite size (256×256) + │ + │ 1:N + ▼ +variants (431+) One row per generated image + id INTEGER PK + sprite_id TEXT FK → sprites + seed INTEGER Reproducible seed + job_status TEXT submitted → completed + raw_path TEXT raw/{sprite_id}_{variant_id}.png + processed_path TEXT variants/{...}.png (after chroma key) + is_approved INTEGER 0 or 1 + rating INTEGER 1-5 (from Sonnet), -1 = rejected/skipped + notes TEXT JSON scores {"composition": 0.85, ...} + ── immutable generation record ── + model TEXT "juggernaut-xi-v11" + prompt_used TEXT Exact prompt sent to model + negative_used TEXT Exact negative sent + guidance_scale REAL 9.0 + steps INTEGER 25 + +sprite_dimensions (41) Quality/race/gender permutations + sprite_id FK → sprites + dimension_type TEXT "quality" + dimension_value TEXT "q1" + prompt_modifier TEXT Added to base prompt + +generation_runs (617) Batch tracking + total_jobs / completed / failed ``` -## 
Key Behaviors
+## Orchestrator Loop

-### Auto-regeneration
-When Sonnet ranks a sprite and fewer than 3 variants pass the confidence threshold, the sprite resets to `needed` and gets 4 more variants on the next loop. Old variants are kept (accumulative). Eventually enough good variants accumulate.
+```
+cli.py run --category units --variants 4
+```

-### Pose Reference
-All unit generation uses an approved southwest-facing sprite as img2img reference (strength 0.6). This ensures consistent facing direction across the roster. The reference sprite is configured once and used for all unit generation.
+Each loop iteration:
+1. Pick ONE sprite with `status=needed` (skip if hit 5 regen attempts)
+2. Generate 4 variants via model-boss (`MAX_CONCURRENT=1`, sequential)
+3. Rank ALL unscored variants for that sprite via Sonnet vision (7 dimensions)
+4. If ≥3 variants pass 70% threshold → `status=review`
+5. If <3 pass and attempts < 5 → keep as `needed` (retry next loop)
+6. If attempts = 5 → force to `status=review` (human picks from best available)
+7. Sleep 5s, next sprite

-### Rate Limiting
-- Generator submits requests with retry + exponential backoff (10s, 20s, 40s...)
-- model-boss queues internally (1 concurrent diffusion request)
-- Orchestrator batches work in small groups (4-8 sprites) with sleep between batches
+## Scoring (7 dimensions, 70% threshold)

-### Review GUI
-- Sprite Theater shows all generated sprites sorted by recency
-- Each card shows #ID, entity name, category, Sonnet scores
-- Click to open detail → approve winner
-- Approve triggers: chroma key removal → resize → install to game assets → manifest DB update
+| Dimension | What it checks |
+|-----------|---------------|
+| facing_direction | Character oriented toward lower-left (southwest) |
+| composition | Single character, clean framing, no clutter |
+| unit_identity | Correct unit type with appropriate equipment |
+| race_accuracy | Correct racial proportions (dwarf = short/stocky) |
+| gender_accuracy | Correct gender presentation |
+| pose_quality | Full body visible, natural stance |
+| background_compliance | Clean green background, no base/tile/ground |
+| art_style | 2D hand-painted fantasy game art, not photorealistic |
+
+Confidence = average of all 8 dimensions. Variant passes at ≥70%.
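
The scoring rule and the status decision from the loop steps can be sketched in a few lines. This is a minimal illustration, not the pipeline's actual code: the function names `confidence` and `next_status` are hypothetical, and it assumes `variant.notes` holds a flat JSON object of per-dimension scores in the 0 to 1 range, as in the data model's `{"composition": 0.85, ...}` example. It averages whatever numeric scores are present rather than hard-coding a dimension list.

```python
import json
from statistics import mean

PASS_THRESHOLD = 0.70  # a variant passes at >= 70% confidence
MIN_PASSING = 3        # passing variants needed to move a sprite to review
MAX_ATTEMPTS = 5       # after this many attempts, force review anyway


def confidence(notes_json: str) -> float:
    """Average the numeric dimension scores stored in a variant's notes JSON."""
    scores = json.loads(notes_json)
    values = [v for v in scores.values() if isinstance(v, (int, float))]
    # Assumption: a variant with no scores counts as a failure, not an error.
    return mean(values) if values else 0.0


def next_status(variant_notes: list[str], attempts: int) -> str:
    """Decide the sprite status after ranking (hypothetical helper)."""
    passing = sum(1 for n in variant_notes if confidence(n) >= PASS_THRESHOLD)
    if passing >= MIN_PASSING:
        return "review"   # enough good variants for a human to pick from
    if attempts >= MAX_ATTEMPTS:
        return "review"   # out of retries: human picks from best available
    return "needed"       # retry on a later loop iteration
```

For example, a sprite with one passing variant out of four stays `needed` on its first attempt but is forced to `review` on its fifth.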
## CLI Commands ```bash -# Full pipeline daemon (scan + generate + rank loop) -./run tools spritegen run +# Full pipeline (generate → rank → regen loop + GUI server) +python3 cli.py run --category units --variants 4 -# Individual steps (for debugging) -./run tools spritegen scan --demo -./run tools spritegen generate --category units --variants 4 --pose-ref raw/reference.png -./run tools spritegen rank -./run tools spritegen approve -./run tools spritegen status +# GUI server only +python3 cli.py start --port 5850 -# Review GUI -./run tools spritegen start # http://localhost:5850 +# Scan game data → populate sprite registry +python3 cli.py --demo scan + +# Test prompts without touching DB (rapid iteration) +python3 cli.py test-prompt --entity spearmen --race dwarves --gender male --seeds 42 123 777 + +# Rebuild all stored prompts from current templates +python3 cli.py refresh-prompts --category units --clear-scores + +# Manual operations +python3 cli.py generate --sprite units/spearmen_dwarves_m --variants 8 +python3 cli.py rank --sprite units/spearmen_dwarves_m +python3 cli.py approve 129 +python3 cli.py status +python3 cli.py reset --sprite units/spearmen_dwarves_m ``` ## File Flow ``` -game JSON data +game JSON data (demo-data/units/*.json) │ - ▼ -sprites.db (pipeline state) + ▼ scan +spritegen.db ──── sprites table (what to generate) │ - ├──generate──▶ raw/{sprite_id}_{variant_id}.png (1024×1024, green bg) - │ │ - │ rank (Sonnet) - │ │ - │ ▼ - │ sprites.db (variant.rating, variant.notes) - │ │ - │ human approve - │ │ - │ ▼ - ├──process──▶ variants/{sprite_id}_{variant_id}.png (256×256, transparent) - │ │ - │ install - │ │ - │ ▼ - ├──install──▶ games/age-of-four/assets/sprites/units/{id}_{race}_{g}.png + ▼ generate (model-boss → juggernaut-xi-v11) +raw/{sprite_id}_{variant_id}.png ──── 1024×1024, green bg │ - └──manifest─▶ games/age-of-four/data/sprites.db (game runtime manifest) + ▼ rank (Sonnet vision, 7 dimensions) +spritegen.db ──── variants table (scores in 
notes JSON)
+      │
+      ▼ human approve (Theater GUI or cli.py approve)
+spritegen.db ──── variant.is_approved = 1
+      │
+      ▼ process (chroma key removal + resize)
+variants/{sprite_id}_{variant_id}.png ──── 256×256, transparent
+      │
+      ▼ install
+games/age-of-dwarves/assets/sprites/units/{name}.png
```
+
+## Configuration
+
+```json
+// sprite-config.json
+{
+  "model": "juggernaut-xi-v11",
+  "api_base": "http://localhost:8210",
+  "defaults": {
+    "steps": 25,
+    "guidance_scale": 9.0
+  }
+}
+```
+
+Prompt templates live in `engine/prompts.py`. The `compose_prompt()` function builds the full prompt from: style prefix + race/gender + entity description + style tail.
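
The four-part composition described above might look roughly like the following. This is only a sketch: the real `compose_prompt()` in `engine/prompts.py` is not shown in this diff, so the function name `compose_prompt_sketch`, its parameters, and both template strings are hypothetical placeholders.

```python
# Placeholder template text; the actual templates live in engine/prompts.py.
STYLE_PREFIX = "2D hand-painted fantasy game art"
STYLE_TAIL = "full body, plain green background"


def compose_prompt_sketch(entity_description: str, race: str, gender: str) -> str:
    """Join style prefix + race/gender + entity description + style tail."""
    parts = [
        STYLE_PREFIX,
        f"{gender} {race}".strip(),
        entity_description,
        STYLE_TAIL,
    ]
    # Drop empty parts so a missing description does not leave a stray comma.
    return ", ".join(p.strip() for p in parts if p.strip())
```

Keeping the four parts as separate values makes it straightforward to rebuild every stored prompt when one template changes, which is what a command like `refresh-prompts` would need.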