Commit Graph

529 Commits

Author SHA1 Message Date
Adil Zouitine 2051dd38fc [HIL-SERL] Review feedback modifications (#1112) 2025-05-15 15:24:41 +02:00
Eugene Mironov c7a3973653 [PORT HIL-SERL] Better unit tests coverage for SAC policy (#1074) 2025-05-14 16:41:36 +02:00
Michel Aractingi bbc6b7d841 Added comment on SE(3) in kinematics and nits in lerobot/envs/utils.py 2025-05-12 18:05:22 +02:00
Michel Aractingi b104f8b012 Added number of steps after success as parameter in config 2025-05-09 18:09:10 +02:00
Michel Aractingi 633edcb3af added names in record_dataset function of gym_manipulator 2025-05-07 13:58:24 +02:00
AdilZouitine adbf8bb85e Cleaning configs 2025-05-07 10:26:32 +02:00
Adil Zouitine ad132c9c39 [HIL SERL] Env management and add gym-hil (#1077)
Co-authored-by: Michel Aractingi <michel.aractingi@gmail.com>
2025-05-07 09:39:21 +02:00
Adil Zouitine 70d55c77e9 Merge branch 'main' into user/adil-zouitine/2025-1-7-port-hil-serl-new 2025-05-06 16:43:44 +02:00
Michel Aractingi 5998203a33 [Port HIL-SERL] Final fixes for reward classifier (#1067)
Co-authored-by: s1lent4gnt <kmeftah.khalil@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-05-05 11:33:09 +02:00
omahs 8cfab38824 Fix typos (#1070) 2025-05-05 10:35:32 +02:00
Eugene Mironov 6fa7df35df [PORT HIL-SERL] Add unit tests for SAC modeling (#999)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-05-05 09:27:42 +02:00
Caroline Pascal 6d723c45a9 feat(encoding): switching to PyAV for ffmpeg related tasks (#983) 2025-04-29 17:39:35 +02:00
Pepijn 42bf1e8b9d Update tutorial (#1021)
Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com>
2025-04-28 09:00:32 +02:00
AdilZouitine 4257fe5045 rename reward classifier 2025-04-25 18:38:52 +02:00
Michel Aractingi ea89b29fe5 checkout normalize.py to prev commit 2025-04-25 18:10:59 +02:00
AdilZouitine 50e9a8ed6a cleaning 2025-04-25 17:22:02 +02:00
Michel Aractingi bd4db8d747 [Port HIl-Serl] Refactor gym-manipulator (#1034) 2025-04-25 16:34:54 +02:00
AdilZouitine a8da4a347e Clean the code 2025-04-24 17:22:54 +02:00
AdilZouitine b8c2b0bb93 Clean the code and remove todo 2025-04-24 16:10:56 +02:00
Adil Zouitine c58b504a9e [HIL-SERL]Remove overstrict pre-commit modifications (#1028) 2025-04-24 13:48:52 +02:00
Adil Zouitine 299effe0f1 [HIL-SERL] Update CI to allow installation of prerelease versions for lerobot (#1018)
Co-authored-by: imstevenpmwork <steven.palma@huggingface.co>
2025-04-24 10:18:03 +02:00
AdilZouitine c5845ee203 Fix linter issue 2025-04-22 10:37:08 +02:00
AdilZouitine a7a51cfc9c Refactor SACPolicy and configuration to replace 'grasp_critic' terminology with 'discrete_critic'. Update related methods and comments for clarity and consistency in handling discrete actions. 2025-04-18 14:57:03 +00:00
Michel Aractingi c1ee25d9f7 nits in configuration classifier and control_robot 2025-04-18 16:18:13 +02:00
Michel Aractingi 9886520d33 Added option to add current readings to the state of the policy 2025-04-18 16:18:13 +02:00
Michel Aractingi 3b24ad3c84 Fixes for the reward classifier 2025-04-18 16:18:13 +02:00
AdilZouitine 54c3c6d684 Enhance MLP class in modeling_sac.py with detailed docstring and refactor layer construction for improved readability. Simplify layer addition logic by removing unnecessary conditions and ensuring consistent handling of activations and dropout. 2025-04-18 14:15:06 +00:00
pre-commit-ci[bot] fb92935601 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-04-18 13:33:37 +00:00
AdilZouitine dcd850feab Refactor SACObservationEncoder to improve modularity and readability. Split initialization into dedicated methods for image and state layers, and enhance caching logic for image features. Update forward method to streamline feature encoding and ensure proper normalization handling. 2025-04-18 15:10:22 +02:00
AdilZouitine 1ce368503d Refactor SACPolicy initialization by breaking down the constructor into smaller methods for normalization, encoders, critics, actor, and temperature setup. This enhances readability and maintainability. 2025-04-18 15:10:22 +02:00
AdilZouitine fb075a709d Refactor input and output normalization handling in SACPolicy for improved clarity and efficiency. Consolidate encoder initialization logic and remove redundant else statements. 2025-04-18 15:10:22 +02:00
AdilZouitine 3424644ecd Fix init temp
Co-authored-by: s1lent4gnt <kmeftah.khalil@gmail.com>
2025-04-18 15:10:22 +02:00
AdilZouitine c37936f2c9 Update log_std_min type to float in PolicyConfig for consistency 2025-04-18 15:10:22 +02:00
AdilZouitine c5382a450c fix caching
Co-authored-by: s1lent4gnt <kmeftah.khalil@gmail.com>
2025-04-18 15:10:22 +02:00
AdilZouitine 9e5f254db0 change the tanh distribution to match hil serl
Co-authored-by: s1lent4gnt <kmeftah.khalil@gmail.com>
2025-04-18 15:10:22 +02:00
AdilZouitine 8122721f6d match target entropy hil serl
Co-authored-by: s1lent4gnt <kmeftah.khalil@gmail.com>
2025-04-18 15:10:22 +02:00
AdilZouitine 5c352ae558 stick to hil serl nn architecture
Co-authored-by: s1lent4gnt <kmeftah.khalil@gmail.com>
2025-04-18 15:10:22 +02:00
AdilZouitine 9386892f8e Refactor modeling_sac and parameter handling for clarity and reusability.
Co-authored-by: s1lent4gnt <kmeftah.khalil@gmail.com>
2025-04-18 15:10:22 +02:00
AdilZouitine 267a837a2c fix encoder training 2025-04-18 15:10:22 +02:00
pre-commit-ci[bot] 28b595c651 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-04-18 15:10:22 +02:00
Michel Aractingi 9fd4c21d4d General fixes in code, removed delta action, fixed grasp penalty, added logic to put gripper reward in info 2025-04-18 15:10:22 +02:00
pre-commit-ci[bot] 02e1ed0bfb [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-04-18 15:10:22 +02:00
AdilZouitine e18274bc9a fix caching and dataset stats is optional 2025-04-18 15:10:22 +02:00
AdilZouitine 68c271ad25 Add rounding for safety 2025-04-18 15:10:22 +02:00
AdilZouitine 203315d378 fix sign issue 2025-04-18 15:10:22 +02:00
AdilZouitine d5a87f67cf Handle gripper penalty 2025-04-18 15:10:22 +02:00
AdilZouitine 8bcf41761d fix caching 2025-04-18 15:10:22 +02:00
AdilZouitine 7c2c67fc3c Enhance SAC configuration and replay buffer with asynchronous prefetching support
- Added async_prefetch parameter to SACConfig for improved buffer management.
- Implemented get_iterator method in ReplayBuffer to support asynchronous prefetching of batches.
- Updated learner_server to utilize the new iterator for online and offline sampling, enhancing training efficiency.
2025-04-18 15:10:22 +02:00
AdilZouitine 70130b9841 Enhance SACPolicy to support shared encoder and optimize action selection
- Cached encoder output in select_action method to reduce redundant computations.
- Updated action selection and grasp critic calls to utilize cached encoder features when available.
2025-04-18 15:10:22 +02:00
AdilZouitine 6167886472 Enhance SACPolicy and learner server for improved grasp critic integration
- Updated SACPolicy to conditionally compute grasp critic losses based on the presence of discrete actions.
- Refactored the forward method to handle grasp critic model selection and loss computation more clearly.
- Adjusted learner server to utilize optimized parameters for grasp critic during training.
- Improved action handling in the ManiskillMockGripperWrapper to accommodate both tuple and single action inputs.
2025-04-18 15:10:22 +02:00