Commit graph

22 commits

Author SHA1 Message Date
autocommit
dbeb3f4088 test(rl-self-play): Add evaluation functions, opponent models, and smoke tests for divergence mining in RL self-play tools
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-27 20:26:00 -07:00
autocommit
2637b79e15 feat(rl-self-play): Add lightweight SmokeModelOpponent class with core act() and train() methods for RL self-play testing
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-27 20:15:34 -07:00
autocommit
236160134c feat(rl-self-play): Implement opponent model loading, execution, and behavior management for reinforcement learning self-play
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-27 20:15:34 -07:00
autocommit
4564074d86 feat(rl-self-play): Add opponent model evaluation support with new training parameters and evaluation metrics in the self-play loop
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-27 20:15:33 -07:00
autocommit
20d842004d feat(rl-self-play): Add methods to load and integrate learned opponent policies into MagicCivEnv for reinforcement learning workflows
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-27 20:15:33 -07:00
autocommit
e2e578cdab feat(rl-self-play): Add learned opponent policy evaluation options to RL self-play evaluation script
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-27 20:15:33 -07:00
autocommit
bb15503079 feat(rl-self-play): Add mine divergence metric for evaluating strategy differences in RL self-play
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-27 20:04:30 -07:00
autocommit
fd64dc5622 test(rl-self-play): Add comprehensive test suite for RL self-play pretraining, diagnostics, encoders, harness client, and expert recording validation
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-26 02:21:15 -07:00
autocommit
e6d90a6a47 feat(rl-self-play): Add encoder logic, training modes, behavior cloning pretraining, diagnostic tools, and expert data handling to the RL self-play pipeline
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-26 02:21:14 -07:00
autocommit
af0cad4873 perf(rl-self-play): Optimize RL self-play environment with faster episode evaluation, optimized state encoding, and reduced training overhead
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-26 02:21:13 -07:00
autocommit
eb8b82700c feat(game-engine): Improve game state management with audio utilities, auto-play logic, and entity handling; add integration tests for game-over and rally scenarios; update smoke testing tool for multi-slot support
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-26 02:21:12 -07:00
autocommit
3f1aeaa602 infra(player-api): 🧱 Update player API infrastructure to enable multi-slot configuration for concurrent player agents
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-26 02:21:12 -07:00
autocommit
34911ad08c perf(rl-self-play): Refactor environment state transitions and agent communication for faster RL self-play execution
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-26 02:21:11 -07:00
autocommit
3241bdacd1 feat(rl-self-play): Introduce turn/step cap tracking in evaluation metrics for improved RL self-play observability
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-26 02:21:11 -07:00
autocommit
e5a2a37d0e feat(rl-self-play): Add stochastic evaluation with masked softmax sampling to replace deterministic argmax in RL self-play training
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-26 02:21:11 -07:00
autocommit
b82e4a8fbd feat(rl-self-play): Introduce no-op penalty and turn advancement bonus in RL environment
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-26 02:21:11 -07:00
Natalie
50e174ab06 feat(@projects/@magic-civilization): add step_cap evaluation category
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-17 05:34:29 -07:00
Natalie
4a862b76fb fix(@projects/@magic-civilization): 🐛 improve pid detection in rl scripts
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-17 05:28:24 -07:00
Natalie
14fbe501ca feat(tooling): add turn tracking and forced end turn logic
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-17 05:16:18 -07:00
Natalie
de5fbd42c4 feat(tooling): add apricot gpu device guidance
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-17 04:02:09 -07:00
Natalie
7cdc8178b7 feat(tooling): add smoke test for protocol layer
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-17 03:59:39 -07:00
Natalie
b7891991a4 feat(@projects/@magic-civilization): add rl_self_play tooling for self-play training
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-17 03:54:40 -07:00