magicciv

Author	SHA1	Message	Date
autocommit	55935afbd2	refactor(rl-self-play): ♻️ Optimize ONNX export script for RL self-play model (p1_29f) to improve compatibility and performance Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>	2026-06-02 22:59:04 -07:00
autocommit	dbeb3f4088	test(rl-self-play): ✅ Add evaluation functions, opponent models, and smoke tests for divergence mining in RL self-play tools Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>	2026-05-27 20:26:00 -07:00
autocommit	2637b79e15	feat(rl-self-play): ✨ Add lightweight SmokeModelOpponent class with core act() and train() methods for RL self-play testing Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>	2026-05-27 20:15:34 -07:00
autocommit	236160134c	feat(rl-self-play): ✨ Implement opponent model loading, execution, and behavior management for reinforcement learning self-play Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>	2026-05-27 20:15:34 -07:00
autocommit	4564074d86	feat(rl-self-play): ✨ Add opponent model evaluation support with new training parameters and evaluation metrics in the self-play loop Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>	2026-05-27 20:15:33 -07:00
autocommit	20d842004d	feat(rl-self-play): ✨ Add methods to load and integrate learned opponent policies into MagicCivEnv for reinforcement learning workflows Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>	2026-05-27 20:15:33 -07:00
autocommit	e2e578cdab	feat(rl-self-play): ✨ Add learned opponent policy evaluation options to RL self-play evaluation script Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>	2026-05-27 20:15:33 -07:00
autocommit	bb15503079	feat(rl-self-play): ✨ Add mine divergence metric for evaluating strategy differences in RL self-play Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>	2026-05-27 20:04:30 -07:00
autocommit	fd64dc5622	test(rl-self-play): ✅ Add comprehensive test suite for RL self-play pretraining, diagnostics, encoders, harness client, and expert recording validation Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>	2026-05-26 02:21:15 -07:00
autocommit	e6d90a6a47	feat(rl-self-play): ✨ Add encoder logic, training modes, behavior cloning pretraining, diagnostic tools, and expert data handling to the RL self-play pipeline Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>	2026-05-26 02:21:14 -07:00
autocommit	af0cad4873	perf(rl-self-play): ⚡ Optimize RL self-play environment with faster episode evaluation, optimized state encoding, and reduced training overhead Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>	2026-05-26 02:21:13 -07:00
autocommit	eb8b82700c	feat(game-engine): ✨ Improve game state management with audio utilities, auto-play logic, and entity handling; add integration tests for game-over and rally scenarios; update smoke testing tool for multi-slot support Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>	2026-05-26 02:21:12 -07:00
autocommit	3f1aeaa602	infra(player-api): 🧱 Update player API infrastructure to enable multi-slot configuration for concurrent player agents Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>	2026-05-26 02:21:12 -07:00
autocommit	34911ad08c	perf(rl-self-play): ⚡ Refactor environment state transitions and agent communication for faster RL self-play execution Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>	2026-05-26 02:21:11 -07:00
autocommit	3241bdacd1	feat(rl-self-play): ✨ Introduce turn/step cap tracking in evaluation metrics for improved RL self-play observability Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>	2026-05-26 02:21:11 -07:00
autocommit	e5a2a37d0e	feat(rl-self-play): ✨ Add stochastic evaluation with masked softmax sampling to replace deterministic argmax in RL self-play training Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>	2026-05-26 02:21:11 -07:00
autocommit	b82e4a8fbd	feat(rl-self-play): ✨ Introduce no-op penalty and turn advancement bonus in RL environment Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>	2026-05-26 02:21:11 -07:00
Natalie	50e174ab06	feat(@projects/@magic-civilization): ✨ add step_cap evaluation category Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>	2026-05-17 05:34:29 -07:00
Natalie	4a862b76fb	fix(@projects/@magic-civilization): 🐛 improve pid detection in rl scripts Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>	2026-05-17 05:28:24 -07:00
Natalie	14fbe501ca	feat(tooling): ✨ add turn tracking and forced end turn logic Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>	2026-05-17 05:16:18 -07:00
Natalie	de5fbd42c4	feat(tooling): ✨ add apricot gpu device guidance Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>	2026-05-17 04:02:09 -07:00
Natalie	7cdc8178b7	feat(tooling): ✨ add smoke test for protocol layer Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>	2026-05-17 03:59:39 -07:00
Natalie	b7891991a4	feat(@projects/@magic-civilization): ✨ add rl_self_play tooling for self-play training Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>	2026-05-17 03:54:40 -07:00

23 commits