Commit Graph

251 Commits

Author SHA1 Message Date
Michel Aractingi 867174c8bc feat(dataset-tools): add dataset utilities and example script
- Introduced dataset tools for LeRobotDataset, including functions for deleting episodes, splitting datasets, adding/removing features, and merging datasets.
- Added an example script demonstrating the usage of these utilities.
- Implemented comprehensive tests for all new functionalities to ensure reliability and correctness.
2025-08-13 01:45:49 +02:00
Michel Aractingi 267a753eda Merge branch 'main' into user/michel-aractingi/2025_06_30_dataset_v3 2025-08-13 01:39:32 +02:00
Adil Zouitine 88f7bf01c1 feat(pipeline): universal processor for LeRobot (#1431)
* Refactor observation preprocessing to use a modular pipeline system

- Introduced `RobotPipeline` and `ObservationProcessor` for handling observation transformations.
- Updated `preprocess_observation` to maintain backward compatibility while leveraging the new pipeline.
- Added tests for the new processing components and ensured they match the original functionality.
- Removed hardcoded logic in favor of a more flexible, composable architecture.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Refactor observation processing and improve modularity

- Updated `ObservationProcessor` to enhance the modular design for processing observations.
- Cleaned up imports and improved code readability by removing unnecessary lines and comments.
- Ensured backward compatibility while integrating new processing components.
- Added tests to validate the functionality of the updated processing architecture.

* Remove redundant tests for None observation and serialization methods in `test_observation_processor.py` to streamline the test suite and improve maintainability.

* Refactor processing architecture to use RobotProcessor

- Replaced instances of RobotPipeline with RobotProcessor across the codebase for improved modularity and clarity.
- Introduced ProcessorStepRegistry for better management of processing steps.
- Updated relevant documentation and tests to reflect the new processing structure.
- Enhanced the save/load functionality to support the new processor design.
- Added a model card template for RobotProcessor to facilitate sharing and documentation.

* Add RobotProcessor tutorial to documentation

- Introduced a new tutorial on using RobotProcessor for preprocessing robot data.
- Added a section in the table of contents for easy navigation to the new tutorial.
- The tutorial covers key concepts, real-world scenarios, and practical examples for effective use of the RobotProcessor pipeline.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add normalization processor and related components

- Introduced `NormalizationProcessor` to handle both observation normalization and action unnormalization.
- Added `ObservationNormalizer` and `ActionUnnormalizer` classes for specific normalization tasks.
- Updated `__init__.py` to include the new `NormalizationProcessor` in the module exports.
- Enhanced `ObservationProcessor` with registration in the `ProcessorStepRegistry` for better modularity.
- Created `RenameProcessor` for renaming keys in observations, improving flexibility in data processing.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Enhance processing architecture with new components

- Added `RenameProcessor` to facilitate key renaming in observations, improving data handling flexibility.
- Updated `__init__.py` to include `RenameProcessor` in module exports.
- Refactored `NormalizationProcessor` and `ObservationNormalizer` to use `rsplit` for better key handling.
- Introduced comprehensive tests for `NormalizationProcessor` and `RenameProcessor` to ensure functionality and robustness.

* chore (docs): add docstring for processor

* fix (test): test factory

* fix(test): policies

* Update tests/processor/test_observation_processor.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Adil Zouitine <adilzouitinegm@gmail.com>

* chore(test): add suggestion made by copilot regarding numpy test

* fix(test): import issue

* Refactor normalization components and update tests

- Renamed `ObservationNormalizer` to `NormalizerProcessor` and `ActionUnnormalizer` to `UnnormalizerProcessor` for clarity.
- Consolidated normalization logic for both observations and actions into `NormalizerProcessor` and `UnnormalizerProcessor`.
- Updated tests to reflect the new class names and ensure proper functionality of normalization and unnormalization processes.
- Enhanced handling of missing statistics in normalization processes.

* chore (docstrin):Improve docstring for NormalizerProcessor

* feat (device processor): Implement device processor

* chore (batch handling): Enhance processing components with batch conversion utilities

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix(test): linting issue

* chore (output format): improves output format

* chore (type): add typing for multiprocess envs

* feat (overrides): Implement support for loading processors with parameter overrides

- Added the ability to provide non-serializable objects when loading processors from saved configurations using the `overrides` parameter.
- Enhanced error handling for invalid override keys and instantiation errors.
- Updated documentation and examples to illustrate the usage of overrides for both registered and unregistered steps.
- Added comprehensive tests to validate the new functionality and ensure backward compatibility.

* chore(normalization): addressing comments from copilot

* chore(learner): nit comment from copilot

* feat(pipeline): Enhance step_through method to support both tuple and dict inputs

* refactor(pipeline): Simplify observation and padding data handling in batch transitions

* Apply suggestions from code review

Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com>
Signed-off-by: Adil Zouitine <adilzouitinegm@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refactor(pipeline): Introduce ComplementaryDataProcessor for handling complementary data in transitions

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refactor(pipeline): Transition from tuple to dictionary format for EnvTransition

- Updated the EnvTransition structure to use a dictionary format instead of a tuple, enhancing readability and maintainability.
- Replaced instances of TransitionIndex with TransitionKey for accessing transition components.
- Adjusted related processing functions and tests to accommodate the new dictionary format, ensuring consistent handling of transitions across the codebase.

* refactor(observation_processor): Improve observation processing by using constants and simplifying pixel handling

- Introduced constants for observation keys to enhance readability.
- Streamlined the handling of the "pixels" key by copying observations first and processing images more clearly.
- Updated the environment state and agent position assignments to use the new constants, improving maintainability.

* feat(pipeline): Add hook unregistration functionality and enhance documentation

- Implemented methods to unregister before, after, and reset hooks in the RobotProcessor class, allowing for more flexible hook management.
- Enhanced documentation to clarify hook execution semantics and the implications of modifying transitions within hooks.
- Added comprehensive tests to verify the correct behavior of hook registration and unregistration, including error handling for non-existent hooks.

* refactor(pipeline): Clarify hook behavior and improve documentation

- Updated the RobotProcessor class to ensure hooks are strictly for observation and do not modify transitions, enhancing clarity and maintainability.
- Refactored hook registration methods to reflect the new behavior, ensuring they accept only functions that do not return modified transitions.
- Enhanced documentation to clearly outline the purpose of hooks and their execution semantics.
- Added tests to verify that hooks are not executed during the step_through method while ensuring they function correctly during the __call__ method.

* feat(pipeline): Add __repr__ method to RobotProcessor for improved readability

- Implemented a __repr__ method in the RobotProcessor class to provide a clear string representation of the processor, including step names and optional parameters like name and seed.
- Added comprehensive tests to validate the __repr__ output for various scenarios, including empty processors, single and multiple steps, custom names, and seed values.
- Ensured that the representation handles long lists of steps with truncation for better readability.

* chore(pipeline): Move _CFG_NAME along other class member

* refactor(pipeline): Utilize get_safe_torch_device for device assignment

- Replaced direct torch.device instantiation with get_safe_torch_device to ensure safe device handling.
- This change enhances code readability and maintains consistency in device management across the RobotProcessor class.

* refactor(pipeline): Enhance state filename generation and profiling method

- Updated state filename generation to use the registry name when available, improving clarity in saved files.
- Modified the profile_steps method to include a warmup_runs parameter, allowing for more controlled performance profiling.
- Ensured consistent conditions during profiling by deep copying transitions for each run, enhancing accuracy in timing results.

* chore(doc): address pip install commant lerobot that not exist yet

* feat(pipeline): Enhance configuration filename handling and state file naming

- Introduced support for custom configuration filenames in the `save_pretrained` method, allowing users to specify a filename instead of the default.
- Improved state file naming to include step indices, preventing conflicts when multiple processors of the same type are saved.
- Added automatic detection for configuration files when loading from a directory, with error handling for multiple files.
- Updated tests to validate new features, including custom filenames and automatic config detection.

* refactor(pipeline): Improve state file naming conventions for clarity and uniqueness

- Enhanced state file naming to include the processor's sanitized name, ensuring uniqueness when multiple processors are saved in the same directory.
- Updated tests to reflect changes in state file naming, verifying that filenames now include the processor name and step indices to prevent conflicts.
- Added a new test to validate state file naming when using multiple processors, ensuring distinct filenames for each processor's state files.

* docs(pipeline): Add clarification for repo name sanitization process

* Feat/pipeline add feature contract (#1637)

* Add feature contract to pipelinestep and pipeline

* Add tests

* Add processor tests

* PR feedback

* encorperate pr feedback

* type in doc

* oops

* docs(pipeline): Clarify transition handling and hook behavior

- Updated documentation to specify that hooks always receive transitions in EnvTransition format, ensuring consistent behavior across input formats.
- Refactored the step_through method to yield only EnvTransition objects, regardless of the input format, and updated related tests to reflect this change.
- Enhanced test assertions to verify the structure of results and the correctness of processing steps.

* refactor(pipeline): Remove to() method for device management

- Eliminated the to() method from RobotProcessor, which was responsible for moving tensor states to specified devices.
- Removed associated unit tests that validated the functionality of the to() method across various scenarios.
- Streamlined the pipeline code by focusing on other device management strategies.

* refactor(pipeline): Remove model card generation and streamline processor methods

- Eliminated the _generate_model_card method from RobotProcessor, which was responsible for generating README.md files from a template.
- Updated save_pretrained method to remove model card generation, focusing on serialization of processor definitions and parameters.
- Added default implementations for get_config, state_dict, load_state_dict, reset, and feature_contract methods in various processor classes to enhance consistency and usability.

* refactor(observation): Streamline observation preprocessing and remove unused processor methods

- Updated the `preprocess_observation` function to enhance image handling and ensure proper tensor formatting.
- Removed the `RobotProcessor` and associated transition handling from the `rollout` function, simplifying the observation processing flow.
- Integrated direct calls to `preprocess_observation` for improved clarity and efficiency in the evaluation script.

* refactor(pipeline): Rename parameters for clarity and enhance save/load functionality

- Updated parameter names in the save_pretrained and from_pretrained methods for improved readability, changing destination_path to save_directory and source to pretrained_model_name_or_path.
- Enhanced the save_pretrained method to ensure directory creation and file handling is consistent with the new parameter names.
- Streamlined the loading process in from_pretrained to utilize loaded_config for better clarity and maintainability.

* refactor(pipeline): minor improvements (#1684)

* chore(pipeline): remove unused features + device torch + envtransition keys

* refactor(pipeline): ImageProcessor & StateProcessor are both implemented directly in VanillaObservationPRocessor

* refactor(pipeline): RenameProcessor now inherits from ObservationProcessor + remove unused code

* test(pipeline): fix broken test after refactors

* docs(pipeline): update docstrings VanillaObservationProcessor

* chore(pipeline): move None check to base pipeline classes

---------

Signed-off-by: Adil Zouitine <adilzouitinegm@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com>
Co-authored-by: Pepijn <138571049+pkooij@users.noreply.github.com>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
2025-08-06 16:11:04 +02:00
Francesco Capuano 90d3a99aa1 Fix policy construction (#1665)
* add: test to check proper construction with multiple features with STATE/ACTION type

* fix: robot and action state should match policy's expectations

* fix minor

Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>

---------

Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>
2025-08-04 21:49:51 +02:00
pre-commit-ci[bot] f94092c169 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-07-30 10:12:02 +00:00
Michel Aractingi 1c79e3dec1 Added mock context manager to tests in order to avoid calls to the hub for dummy datasets 2025-07-30 12:11:39 +02:00
Francesco Capuano 527ae8e557 Add variable-size test datasets (#1610)
* fix: dummy datasets can be written to multiple files in multiple folders based on arbitrary data size

* fix: writing atomic episodes to multiple files (maybe)

* fix: moving unused write dataset function to test code
2025-07-30 11:26:28 +02:00
Michel Aractingi 890b1e473d Merge branch 'main' into user/michel-aractingi/2025_06_30_dataset_v3 2025-07-30 00:43:53 +02:00
Caroline Pascal 664e069c3f docs/style: updating docs and deprecated links (#1584) 2025-07-28 12:55:47 +02:00
Eugene Mironov 989f3d05ba [Async Inference] Merge Protos & refactoring (#1480)
* Merge together proto files and refactor Async inference

* Fixup for Async inference

* Drop not reuqired changes

* Fix tests

* Drop old async files

* Drop chunk_size param

* Fix versions

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix wrong fix

Co-authored-by: Ben Zhang <ben.zhang@uwaterloo.ca>

* Fixup

---------

Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
Co-authored-by: Ben Zhang <ben.zhang@uwaterloo.ca>
Co-authored-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>
2025-07-23 11:30:01 +02:00
Michel Aractingi 218ebed3ef feat(convert_dataset_v21_to_v3) added the use of more efficient Dataset.from_parquet and concatenate_datasets 2025-07-22 17:27:41 +02:00
Steven Palma f5d6b5b3a7 test(cameras): skip depth test in rs camera for latest version (#1574)
* test(cameras): increase timeout in depth read for testing

* test(cameras): skip test_depth in realsense

---------

Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
2025-07-22 15:14:01 +02:00
Michel Aractingi 5ec70f704e removed check_timestamps_sync that is no longer used in the code,
removed tests in datasets related to check_timestamps_sync
added the use of `clear_episode_buffer` that was not used in `save_episode`
added the creation of the codebase_version tag that was missing in `slurm_upload`
2025-07-18 16:33:20 +02:00
Michel Aractingi e05d22cb7b Merge branch 'main' into user/michel-aractingi/2025_06_30_dataset_v3
Signed-off-by: Michel Aractingi <michel.aractingi@huggingface.co>
2025-07-18 16:33:18 +02:00
Juan Pizarro 7e9f955b40 fix(hil-serl): drain queue on get_last_item_from_queue (#1524)
* fix(hil-serl): drain queue on get_last_item_from_queue

* parametrize queue tests

* revert changes for Darwin

* revert parametrize queue tests

* add test_get_last_item_multiple_items_with_torch_queue

* update test_get_last_item_multiple_items_with_torch_queue

* update test_get_last_item_multiple_items_with_torch_queue
2025-07-17 17:01:48 +02:00
Steven Palma 378e1f0338 Update pre-commit-config.yaml + pyproject.toml + ceil rerun & transformer dependencies version (#1520)
* chore: update .gitignore

* chore: update pre-commit

* chore(deps): update pyproject

* fix(ci): multiple fixes

* chore: pre-commit apply

* chore: address review comments

* Update pyproject.toml

Co-authored-by: Ben Zhang <5977478+ben-z@users.noreply.github.com>
Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>

* chore(deps): add todo

---------

Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Ben Zhang <5977478+ben-z@users.noreply.github.com>
2025-07-17 14:30:20 +02:00
Michel Aractingi dfb1571bcf Added missing licenses (#1517)
* Added missing liscenses
2025-07-16 11:31:25 +02:00
Simon Alibert 724874e063 Fix tests (#1510) 2025-07-15 11:27:01 +02:00
Francesco Capuano 30c161006d Add Async Inference (#1196)
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
2025-07-10 10:39:11 +02:00
fracapuano 4e01f87a6e add: tests forcing new file creation 2025-07-08 13:38:01 +02:00
Michel Aractingi 9287c36f37 - Added missing license in the new scripts
- Added back legacy functions in conversion script of v2 to v21
 - Updated README description for dataset_v3
2025-07-06 22:29:05 +02:00
Michel Aractingi bee74c3eab Fix(tests) fix task index error in test_policies 2025-07-06 10:03:19 +02:00
Michel Aractingi 83bf24cc9a fix(tests) add features argument to load_nested_dataset 2025-07-05 10:16:29 +02:00
Michel Aractingi 3dbc3e60fb Added docstrings to aggregate, fix test_policies.py 2025-07-04 11:27:00 +02:00
Michel Aractingi 69b1f7b118 nit precommit 2025-07-02 18:20:01 +02:00
Michel Aractingi 66454a0fbf Remove more references to lerobot.common 2025-07-02 18:18:19 +02:00
Michel Aractingi 1c17419224 Reverted back files that were changed during the rebase 2025-07-02 17:26:34 +02:00
Michel Aractingi 9dde8829e6 style nit 2025-07-02 17:10:56 +02:00
Michel Aractingi 0f66bbe2f9 Migrate PR to new folder structure introduce on 1417 2025-07-02 17:10:26 +02:00
pre-commit-ci[bot] 6de5670912 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-07-02 11:52:34 +02:00
Michel Aractingi 5e39b4ce94 fix(tests)
- Updated `lerobot_dataset.py:add_frame` to take task as key in frame
- Updated `lerobot_dataset.py` to remove robot argument from `create` function of lerobotdataset and lerobotdatasetmetadata and directly take the features
- Update `test_datasets.py` to features from Mock robot
- Update all the usage of `add_frame` in the library
- Update `dataset_factories.py`; had issues with new argument order
- Raise ValueError when no task is provided (in `datasets/utils.py` validate func)
2025-07-02 11:51:56 +02:00
Michel Aractingi 6b482a93d6 fix(rebase) deleting media related to tutorials 2025-07-02 11:47:09 +02:00
Michel Aractingi d9b9cc80da fix(rebase) reverting files to main 2025-07-02 11:47:07 +02:00
fracapuano 01d0b7b102 fix: modularize tests to improve readability 2025-07-02 11:45:29 +02:00
fracapuano 848a494ff6 add: tests for aggregation code 2025-07-02 11:45:29 +02:00
fracapuano d4fbf6ef39 add: support for videos generation in datasets 2025-07-02 11:45:11 +02:00
Remi Cadene 58795d72c8 In tests: Add use_videos=False by default, Create mp4 file if True, then fix test_datasets and test_aggregate (all passing) 2025-07-02 11:44:21 +02:00
Remi Cadene ad1ad11eac fix hf_dataset.set_transform(hf_transform_to_torch) 2025-07-02 11:43:33 +02:00
Remi Cadene 01bc89b6f4 Merge remote-tracking branch 'origin/user/rcadene/2025_04_11_dataset_v3' into user/rcadene/2025_04_11_dataset_v3 2025-07-02 11:43:24 +02:00
Remi Cadene 8c43b3d05e Faster self.meta.episodes[...]
switch back to set_transform instead of set_format

Add video_files_size_in_mb

pre-commit run --all-files
2025-07-02 11:43:22 +02:00
Remi Cadene d4af22418b Fix unit tests 2025-07-02 11:42:52 +02:00
Remi Cadene 0a390de361 Merge remote-tracking branch 'origin/main' into user/rcadene/2025_04_11_dataset_v3 2025-07-02 11:41:53 +02:00
Simon Alibert d4ee470b00 Package folder structure (#1417)
* Move files

* Replace imports & paths

* Update relative paths

* Update doc symlinks

* Update instructions paths

* Fix imports

* Update grpc files

* Update more instructions

* Downgrade grpc-tools

* Update manifest

* Update more paths

* Update config paths

* Update CI paths

* Update bandit exclusions

* Remove walkthrough section
2025-07-01 16:34:46 +02:00
Pepijn 0b2285d1ec Feat: Improve hub integration (#1382)
* feat(policies): Initial setup to push policies to hub with tags and model card

* feat: add dataset that is used to train

* Add model template summary

* fix: Update link model_card template

* fix: remove print

* fix: change import name

* fix: add model summary in template

* fix: minor text

* fix: comments Lucain

* fix: feedback steven

* fix: restructure push to hub

* fix: remove unneeded changes

* fix: import

* fix: import 2

* Add MANIFEST.in

* fix: feedback pr

* Fix tests

* tests: Add smolvla end-to-end test

* Fix: smolvla test

* fix test name

* fix policy tests

* Add push to hub false policy tests

* Do push to hub cleaner

* fix(ci): add push_to_hub false in tests

---------

Co-authored-by: Steven Palma <steven.palma@huggingface.co>
2025-06-26 14:36:16 +02:00
Adil Zouitine d8079587a2 Port HIL SERL (#644)
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
Co-authored-by: Eugene Mironov <helper2424@gmail.com>
Co-authored-by: s1lent4gnt <kmeftah.khalil@gmail.com>
Co-authored-by: Ke Wang <superwk1017@gmail.com>
Co-authored-by: Yoel Chornton <yoel.chornton@gmail.com>
Co-authored-by: imstevenpmwork <steven.palma@huggingface.co>
Co-authored-by: Simon Alibert <simon.alibert@huggingface.co>
2025-06-13 13:15:47 +02:00
Simon Alibert 9e6f49f507 Fix test_teleoperate (#1216) 2025-06-06 09:38:37 +02:00
Simon Alibert e23b41e79a Hardware API redesign (#777)
Co-authored-by: Pepijn <138571049+pkooij@users.noreply.github.com>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Steven Palma <steven.palma@huggingface.co>
Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>
Co-authored-by: Pepijn <pepijn@huggingface.co>
2025-06-05 17:48:43 +02:00
mshukor bfd26eef5a Add SmolVLA (#1175)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: fracapuano <francesco.capuano@huggingface.co>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Dana Aubakirova <118912928+danaaubakirova@users.noreply.github.com>
Co-authored-by: Remi <remi.cadene@huggingface.co>
2025-06-03 17:11:50 +02:00
Adil Zouitine 0cf864870c [Fix] Unpin torch beyond 2.6.0 & torchcodec beyond 0.2.1 (#1127) 2025-05-28 16:54:20 +02:00
Remi Cadene 4acf99f622 pre-commit run --all-files 2025-04-21 09:34:19 +02:00