* refactor: RL stack refactoring — RLAlgorithm, RLTrainer, DataMixer, and SAC restructuring * chore: clarify torch.compile disabled note in SACAlgorithm * fix(teleop): keyboard EE teleop not registering special keys and losing intervention state Fixes #2345 Co-authored-by: jpizarrom <jpizarrom@gmail.com> * fix: remove leftover normalization calls from reward classifier predict_reward Fixes #2355 * fix: add thread synchronization to ReplayBuffer to prevent race condition between add() and sample() * refactor: update SACAlgorithm to pass action_dim to _init_critics and fix encoder reference * perf: remove redundant CPU→GPU→CPU transition move in learner * Fix: add kwargs in reward classifier __init__() * fix: include IS_INTERVENTION in complementary_info sent to learner for offline replay buffer * fix: add try/finally to control_loop to ensure image writer cleanup on exit * fix: use string key for IS_INTERVENTION in complementary_info to avoid torch.load serialization error * fix: skip tests that require grpc if not available * fix(tests): ensure tensor stats comparison accounts for reshaping in normalization tests * fix(tests): skip tests that require grpc if not available * refactor(rl): expose public API in rl/__init__ and use relative imports in sub-packages * fix(config): update vision encoder model name to lerobot/resnet10 * fix(sac): clarify torch.compile status * refactor(rl): update shutdown_event type hints from 'any' to 'Any' for consistency and clarity * refactor(sac): simplify optimizer return structure * perf(rl): use async iterators in OnlineOfflineMixer.get_iterator * refactor(sac): decouple algorithm hyperparameters from policy config * update losses names in tests * fix docstring * remove unused type alias * fix test for flat dict structure * refactor(policies): rename policies/sac → policies/gaussian_actor * refactor(rl/sac): consolidate hyperparameter ownership and clean up discrete critic * perf(observation_processor): add CUDA support for image processing * fix(rl): correctly wire HIL-SERL gripper penalty through processor pipeline (cherry picked from commit9c2af818ff) * fix(rl): add time limit processor to environment pipeline (cherry picked from commitcd105f65cb) * fix(rl): clarify discrete gripper action mapping in GripperVelocityToJoint for SO100 (cherry picked from commit494f469a2b) * fix(rl): update neutral gripper action (cherry picked from commit9c9064e5be) * fix(rl): merge environment and action-processor info in transition processing (cherry picked from commit30e1886b64) * fix(rl): mirror gym_manipulator in actor (cherry picked from commitd2a046dfc5) * fix(rl): postprocess action in actor (cherry picked from commitc2556439e5) * fix(rl): improve action processing for discrete and continuous actions (cherry picked from commitf887ab3f6a) * fix(rl): enhance intervention handling in actor and learner (cherry picked from commitef8bfffbd7) * Revert "perf(observation_processor): add CUDA support for image processing" This reverts commit38b88c414c. * refactor(rl): make algorithm a nested config so all SAC hyperparameters are JSON-addressable * refactor(rl): add make_algorithm_config function for RLAlgorithmConfig instantiation * refactor(rl): add type property to RLAlgorithmConfig for better clarity * refactor(rl): make RLAlgorithmConfig an abstract base class for better extensibility * refactor(tests): remove grpc import checks from test files for cleaner code * fix(tests): gate RL tests on the `datasets` extra * refactor: simplify docstrings for clarity and conciseness across multiple files * fix(rl): update gripper position key and handle action absence during reset * fix(rl): record pre-step observation so (obs, action, next.reward) align in gym_manipulator dataset * refactor: clean up import statements * chore: address reviewer comments * chore: improve visual stats reshaping logic and update docstring for clarity * refactor: enforce mandatory config_class and name attributes in RLAlgorithm * refactor: implement NotImplementedError for abstract methods in RLAlgorithm and DataMixer * refactor: replace build_algorithm with make_algorithm for SACAlgorithmConfig and update related tests * refactor: add require_package calls for grpcio and gym-hil in relevant modules * refactor(rl): move grpcio guards to runtime entry points * feat(rl): consolidate HIL-SERL checkpoint into HF-style components Make `RLAlgorithmConfig` and `RLAlgorithm` `HubMixin`s, add abstract `state_dict()` / `load_state_dict()` for critic ensemble, target nets and `log_alpha`, and persist them as a sibling `algorithm/` component next to `pretrained_model/`. Replace the pickled `training_state.pt` with an enriched `training_step.json` carrying `step` and `interaction_step`, so resume restores actor + critics + target nets + temperature + optimizers + RNG + counters from HF-standard files. * refactor(rl): move actor weight-sync wire format from policy to algorithm * refactor(rl): update type hints for learner and actor functions * refactor(rl): hoist grpcio guard to module top in actor/learner * chore(rl): manage import pattern in actor (#3564) * chore(rl): manage import pattern in actor * chore(rl): optional grpc imports in learner; quote grpc ServicerContext types --------- Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co> * update uv.lock * chore(doc): update doc --------- Co-authored-by: jpizarrom <jpizarrom@gmail.com> Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
Generating the documentation
To generate the documentation, you first have to build it. Several packages are necessary to build the doc, you can install them with the following command, at the root of the code repository:
pip install -e . -r docs-requirements.txt
You will also need nodejs. Please refer to their installation page
NOTE
You only need to generate the documentation to inspect it locally (if you're planning changes and want to
check how they look before committing for instance). You don't have to git commit the built documentation.
Building the documentation
Once you have setup the doc-builder and additional packages, you can generate the documentation by
typing the following command:
doc-builder build lerobot docs/source/ --build_dir ~/tmp/test-build
You can adapt the --build_dir to set any temporary folder that you prefer. This command will create it and generate
the MDX files that will be rendered as the documentation on the main website. You can inspect them in your favorite
Markdown editor.
Previewing the documentation
To preview the docs, first install the watchdog module with:
pip install watchdog
Then run the following command:
doc-builder preview lerobot docs/source/
The docs will be viewable at http://localhost:3000. You can also preview the docs once you have opened a PR. You will see a bot add a comment to a link where the documentation with your changes lives.
NOTE
The preview command only works with existing doc files. When you add a completely new file, you need to update _toctree.yml & restart preview command (ctrl-c to stop it & call doc-builder preview ... again).
Adding a new element to the navigation bar
Accepted files are Markdown (.md).
Create a file with its extension and put it in the source directory. You can then link it to the toc-tree by putting
the filename without the extension in the _toctree.yml file.
Renaming section headers and moving sections
It helps to keep the old links working when renaming the section header and/or moving sections from one document to another. This is because the old links are likely to be used in Issues, Forums, and Social media and it'd make for a much more superior user experience if users reading those months later could still easily navigate to the originally intended information.
Therefore, we simply keep a little map of moved sections at the end of the document where the original section was. The key is to preserve the original anchor.
So if you renamed a section from: "Section A" to "Section B", then you can add at the end of the file:
Sections that were moved:
[ <a href="#section-b">Section A</a><a id="section-a"></a> ]
and of course, if you moved it to another file, then:
Sections that were moved:
[ <a href="../new-file#section-b">Section A</a><a id="section-a"></a> ]
Use the relative style to link to the new file so that the versioned docs continue to work.
For an example of a rich moved sections set please see the very end of the transformers Trainer doc.
Adding a new tutorial
Adding a new tutorial or section is done in two steps:
- Add a new file under
./source. This file can either be ReStructuredText (.rst) or Markdown (.md). - Link that file in
./source/_toctree.ymlon the correct toc-tree.
Make sure to put your new file under the proper section. If you have a doubt, feel free to ask in a Github Issue or PR.
Writing source documentation
Values that should be put in code should either be surrounded by backticks: `like so`. Note that argument names
and objects like True, None or any strings should usually be put in code.
Writing a multi-line code block
Multi-line code blocks can be useful for displaying examples. They are done between two lines of three backticks as usual in Markdown:
```
# first line of code
# second line
# etc
```
Adding an image
Due to the rapidly growing repository, it is important to make sure that no files that would significantly weigh down the repository are added. This includes images, videos, and other non-text files. We prefer to leverage a hf.co hosted dataset like
the ones hosted on hf-internal-testing in which to place these files and reference
them by URL. We recommend putting them in the following dataset: huggingface/documentation-images.
If an external contribution, feel free to add the images to your PR and ask a Hugging Face member to migrate your images
to this dataset.