lerobot/lerobot
KeWang1017 0ecf40d396 Refactor SACPolicy for improved action sampling and standard deviation handling
- Updated action selection to use distribution sampling and log probabilities for better stochastic behavior.
- Enhanced standard deviation clamping to prevent extreme values, ensuring stability in policy outputs.
- Cleaned up code by removing unnecessary comments and improving readability.

These changes aim to refine the SAC implementation, enhancing its robustness and performance during training and inference.
2025-03-28 17:18:24 +00:00