Commit Graph

2 Commits

Author SHA1 Message Date
Khalil Meftah 1f5487eea8 refactor: decouple policy from algorithm 2026-03-11 16:49:14 +01:00
Khalil Meftah 8d50be9faa refactor: RL stack refactoring — RLAlgorithm, RLTrainer, DataMixer, and SAC restructuring
- Add RLAlgorithm base class and RLAlgorithmConfig with draccus.ChoiceRegistry
- Add RLTrainer for unified training orchestration with iterator pattern
- Add DataMixer and OnlineOfflineMixer for online/offline data mixing
- Restructure SAC algorithm with batch iterator and factory pattern
- Add observation normalization pre/post processors
- Add comprehensive tests for all new components
2026-03-03 16:50:00 +01:00