autocommit | 55935afbd2 | refactor(rl-self-play): ♻️ Optimize ONNX export script for RL self-play model (p1_29f) to improve compatibility and performance (Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>) | 2026-06-02 22:59:04 -07:00
autocommit | dbeb3f4088 | test(rl-self-play): ✅ Add evaluation functions, opponent models, and smoke tests for divergence mining in RL self-play tools (Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>) | 2026-05-27 20:26:00 -07:00
autocommit | 2637b79e15 | feat(rl-self-play): ✨ Add lightweight SmokeModelOpponent class with core act() and train() methods for RL self-play testing (Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>) | 2026-05-27 20:15:34 -07:00
autocommit | 236160134c | feat(rl-self-play): ✨ Implement opponent model loading, execution, and behavior management for reinforcement learning self-play (Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>) | 2026-05-27 20:15:34 -07:00
autocommit | 4564074d86 | feat(rl-self-play): ✨ Add opponent model evaluation support with new training parameters and evaluation metrics in the self-play loop (Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>) | 2026-05-27 20:15:33 -07:00
autocommit | 20d842004d | feat(rl-self-play): ✨ Add methods to load and integrate learned opponent policies into MagicCivEnv for reinforcement learning workflows (Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>) | 2026-05-27 20:15:33 -07:00
autocommit | e2e578cdab | feat(rl-self-play): ✨ Add learned opponent policy evaluation options to RL self-play evaluation script (Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>) | 2026-05-27 20:15:33 -07:00
autocommit | bb15503079 | feat(rl-self-play): ✨ Add mine divergence metric for evaluating strategy differences in RL self-play (Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>) | 2026-05-27 20:04:30 -07:00
autocommit | fd64dc5622 | test(rl-self-play): ✅ Add comprehensive test suite for RL self-play pretraining, diagnostics, encoders, harness client, and expert recording validation (Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>) | 2026-05-26 02:21:15 -07:00
autocommit | e6d90a6a47 | feat(rl-self-play): ✨ Add encoder logic, training modes, behavior cloning pretraining, diagnostic tools, and expert data handling to the RL self-play pipeline (Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>) | 2026-05-26 02:21:14 -07:00
autocommit | af0cad4873 | perf(rl-self-play): ⚡ Optimize RL self-play environment with faster episode evaluation, optimized state encoding, and reduced training overhead (Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>) | 2026-05-26 02:21:13 -07:00
autocommit | eb8b82700c | feat(game-engine): ✨ Improve game state management with audio utilities, auto-play logic, and entity handling; add integration tests for game-over and rally scenarios; update smoke testing tool for multi-slot support (Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>) | 2026-05-26 02:21:12 -07:00
autocommit | 3f1aeaa602 | infra(player-api): 🧱 Update player API infrastructure to enable multi-slot configuration for concurrent player agents (Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>) | 2026-05-26 02:21:12 -07:00
autocommit | 34911ad08c | perf(rl-self-play): ⚡ Refactor environment state transitions and agent communication for faster RL self-play execution (Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>) | 2026-05-26 02:21:11 -07:00
autocommit | 3241bdacd1 | feat(rl-self-play): ✨ Introduce turn/step cap tracking in evaluation metrics for improved RL self-play observability (Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>) | 2026-05-26 02:21:11 -07:00
autocommit | e5a2a37d0e | feat(rl-self-play): ✨ Add stochastic evaluation with masked softmax sampling to replace deterministic argmax in RL self-play training (Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>) | 2026-05-26 02:21:11 -07:00
autocommit | b82e4a8fbd | feat(rl-self-play): ✨ Introduce no-op penalty and turn advancement bonus in RL environment (Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>) | 2026-05-26 02:21:11 -07:00
Natalie | 50e174ab06 | feat(@projects/@magic-civilization): ✨ add step_cap evaluation category (Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>) | 2026-05-17 05:34:29 -07:00
Natalie | 4a862b76fb | fix(@projects/@magic-civilization): 🐛 improve pid detection in rl scripts (Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>) | 2026-05-17 05:28:24 -07:00
Natalie | 14fbe501ca | feat(tooling): ✨ add turn tracking and forced end turn logic (Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>) | 2026-05-17 05:16:18 -07:00
Natalie | de5fbd42c4 | feat(tooling): ✨ add apricot gpu device guidance (Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>) | 2026-05-17 04:02:09 -07:00
Natalie | 7cdc8178b7 | feat(tooling): ✨ add smoke test for protocol layer (Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>) | 2026-05-17 03:59:39 -07:00
Natalie | b7891991a4 | feat(@projects/@magic-civilization): ✨ add rl_self_play tooling for self-play training (Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>) | 2026-05-17 03:54:40 -07:00