lerobot/README.md at 58cf176a77237fd8db3a44ced1ede4daad3b638b

mirror of https://github.com/huggingface/lerobot.git synced 2026-07-05 17:17:01 +00:00

Files

T

Steven Palma 708fa1d189 feat(policies): add Gr00t N1.7 policy (#3922 )

* Add GR00T N1.7 support

Add GR00T N1.7 policy configuration, checkpoint compatibility, processor parity, LIBERO documentation, and focused tests.

Co-authored-by: Ryan Halabi <ryhalabi@nvidia.com>

* Move Groot processor compatibility into Groot loader

* Restore GR00T Flash Attention install guidance

* Allow Groot fake RTC chunk prefetch

* Fix GR00T N1.7 RTC action decoding

* Trim GR00T N1.7 RTC chunks to valid horizon

* Ignore padded GR00T N1.7 RTC prefix rows

* removed n1.5 dependency

* removed remaining N1.5 traces

* groot: auto-enable LIBERO gripper action transform for libero_sim

GR00T N1.7 emits gripper in [0,1] but LIBERO expects [-1,1]. The decode
transform existed but was never auto-enabled for embodiment_tag=libero_sim,
so the policy scored 0% on LIBERO eval. Auto-set it in __post_init__ (still
overridable). LIBERO Spatial eval: 0% -> 98%.

* Reconnect GR00T relative action processors

* groot: remove dead N1.5 code (eagle2_hg_model, flow_matching_action_head, action_encoder)

N1.7 backbone is nvidia/Cosmos-Reason2-2B via Qwen3VLForConditionalGeneration,
not Eagle2 — eagle2_hg_model/ had zero refs outside its own dir.

GR00TN17ActionHead (groot_n1_7.py) re-implements MultiEmbodimentActionEncoder +
CategorySpecificLinear + swish + SinusoidalPositionalEncoding locally, so
flow_matching_action_head.py (N1.5 FlowmatchingActionHead) and its sole
dependency action_encoder.py are dead. Verified: no src/ or tests/ reference.

Removed (~2037 LOC):
- eagle2_hg_model/ (4 files, ~1575 LOC)
- action_head/flow_matching_action_head.py (408 LOC)
- action_head/action_encoder.py (54 LOC)

cross_attention_dit.py KEPT (DiT/AlternateVLDiT/SelfAttentionTransformer live in N1.7).

* groot: reuse lerobot get_device_from_parameters instead of inline lookup

modeling_groot.py duplicated next(self.parameters()).device twice. LeRobot
ships get_device_from_parameters in policies/utils.py (used by diffusion,
vqbet, tdmpc, gaussian_actor). Reuse it for consistency with the framework.

* groot: fix stale Eagle VLM docstring in processor (N1.7 uses Qwen3-VL backbone)

Addresses checker nit: processor_groot.py docstring still described the N1.5
Eagle VLM path with eagle_content/eagle_* keys that no longer exist in the code.

* test(groot): add N1.7 original-vs-LeRobot output parity test

Verifies the LeRobot GR00T N1.7 integration produces equivalent raw
action_pred to NVIDIA Isaac-GR00T for the same checkpoint, inputs, seed,
precision (fp32) and attention kernel (SDPA): max|diff|=8.9e-7 on the
libero_sim embodiment (GR00T-N1.7-LIBERO/libero_10).

The two impls pin incompatible transformers majors (orig 4.57.3 vs
LeRobot 5.x) and cannot share a process, so the original outputs + exact
collated inputs are produced out-of-process and loaded from an .npz. The
test skips on CI / when the checkpoint or artifact are absent.

* test(groot): parametrize N1.7 parity across all checkpoint embodiments

Generalize the original-vs-LeRobot N1.7 output-parity test from a single
libero_sim case to every embodiment tag in the checkpoint (libero_sim, oxe_droid,
real_g1, the real_r1_pro_sharpa family, and the xdof family). Inputs are built
generically from checkpoint metadata; the test discovers per-tag .npz artifacts
and runs one parametrized case each, loading the LeRobot model once via a fixture.

All 9 embodiments match the original to fp32 epsilon (max|diff| < 3e-6), confirming
the integration is correct across the model's full embodiment space and not overfit
to libero_sim.

* test(groot): self-contained parity test + in-repo producer + docs

- Rename test_groot_n1_7_vs_original.py -> test_groot_vs_original.py
- Make the test self-contained: producer script (dump_original_n1_7.py) now lives
  next to the test; default artifact dir is repo-relative
  (tests/policies/groot/artifacts/), overridable via GROOT_N1_7_PARITY_DIR. The
  test only reads artifacts and skips if absent -- it never creates external dirs.
- Heavy .npz artifacts (~6-9MB each) are gitignored and regenerated by the producer;
  never committed.
- Drop the verbose 'MULTIPLE EMBODIMENTS' docstring block (kept a one-line note).
- Document the parity procedure in the groot policy README (docs/source/policy_groot_README.md).
- Rename test fn test_groot_n1_7_get_action_parity -> test_groot_get_action_parity.

9/9 embodiments still pass (max|diff| < 3e-6, fp32 eps).

* docs(groot): drop WHY TWO ENVIRONMENTS block from parity test docstring

* test(groot): move parity producer into utils/ package

Mirror the tests/policies/pi0_pi05/utils convention: move dump_original_n1_7.py into
a tests/policies/groot/utils/ package (with __init__.py) and update all path
references in the test docstring/skip-message and the policy README.

* test(groot): adopt test_groot_lerobot for GR00T N1.7, drop N1.5

The test loaded MODEL_PATH='aractingi/bimanual-handover-groot-10k', an N1.5
checkpoint (config base_model_path=nvidia/GR00T-N1.5-3B, no model_version). On
load, model_version defaults to n1.7 while the base path infers n1.5, so the
version-consistency guard in GrootConfig.__post_init__ raised ValueError and both
test_lerobot_groot_inference and test_lerobot_groot_forward_pass failed. N1.5 is no
longer a supported model_version.

Adopt the test for N1.7:
- MODEL_PATH -> nvidia/GR00T-N1.7-3B (root-level sharded safetensors; loads via
  GrootPolicy.from_pretrained as a base N1.7 model).
- Embodiment tag 'gr1' (N1.5) -> 'gr1_unified' (valid N1.7 tag from the checkpoint
  embodiment_id.json), via a single EMBODIMENT_TAG constant.
- DUMMY_ACTION_HORIZON 16 -> 40 to match N1.7's native action-chunk size.
- Docstrings/labels updated to 'GR00T N1.7'.

Both tests run and pass on CUDA; full tests/policies/groot/ suite is
73 passed / 0 failed / 0 skipped.

* docs(groot): document the N1.5 removal and the N1.7 parity test

- groot.mdx: breaking-change warning and migration path (pin lerobot==0.5.1 to
  keep N1.5, or move to N1.7); the dead `huggingface-cli download` is replaced
  with `hf download`.
- policy_groot_README.md: N1.5 removal note, updated paper / model-card links,
  and the two-comparison (model parity + preprocessor parity) description of
  the original-vs-LeRobot test, including the raw-observation artifacts and
  recorded seed.

* fix(groot): N1.7 backbone loading and DiT parameter-count logging

- select_layer default tracks the N1.7-3B checkpoint value (16); real
  checkpoint loads still override it from config.json.
- get_backbone_cls recognizes Cosmos-Reason2 / Qwen3-VL backbones by name and
  warns (instead of silently assuming) when an unrecognized backbone is loaded
  only on the strength of backbone_model_type='qwen'.
- 'revision' pins the GR00T checkpoint repo only and is no longer forwarded
  into the unrelated backbone repo load; pin the backbone via
  transformers_loading_kwargs instead.
- DiT / SelfAttentionTransformer parameter counts go through logging.debug
  instead of print().

* fix(groot): N1.7 config defaults, N1.5 rejection, and processor/model runtime fixes

Covers the GR00T N1.7 source trio (configuration, processor, model wrapper).

Config:
- GrootConfig defaults are the N1.7 values; explicitly passed legacy N1.5-era
  values (chunk_size=50, max_state_dim=64, ...) are remapped with a warning
  instead of silently.
- action_decode_transform gains an 'auto' sentinel so an explicit 'none'
  opt-out wins over the libero_sim default and survives save/load round-trips.
- action_delta_indices is cached on the inputs that determine it.
- Legacy N1.5 checkpoints/configs (tokenizer_assets_repo, model_type/
  architectures/eagle backbone markers) are rejected with a single clear
  error pointing to lerobot==0.5.1.

Processor:
- GrootN17ActionDecodeStep handles the 2-D (B, D) actions delivered by sync
  select_action (relative eef/non-eef decode in eval/record flows).
- Postprocessor falls back to dataset stats when a raw checkpoint lacks the
  configured embodiment tag; raw-state cache is per-instance, not
  process-global; caller overrides (device, rename_map) are honored on the
  raw-checkpoint branch.
- Camera/modality-key mismatches warn (including the zero-match fallback);
  deprecated Qwen2VLImageProcessorFast replaced with Qwen2VLImageProcessor;
  removed N1.5 processor steps are stubbed to raise the removal guidance and
  the action-unpack step is re-registered as _v2.

Model:
- Flash-attention probe is diagnostic-only; forward raises on a missing loss;
  print() replaced with logging; N1.5 base-path mismatch includes the
  removal guidance.

* fix(groot): skip normalization overrides for training

* fix(groot): GPU/tensor N1.7 image preprocessing + resize to trained resolution

GR00T training was dataloader-bound (0->100->0 GPU-utilization sawtooth).
GrootN17VLMEncodeStep ran the Qwen3-VL image processor per frame on PIL images
on the single CPU main-loop thread, and that cost is timed inside dataloading_s
(preprocessor(batch) runs in the main process, not the dataloader workers), so
adding workers cannot hide it.

- Feed the torchvision-backed Qwen3-VL processor (C,H,W) uint8 tensors instead
  of a per-frame Image.fromarray PIL roundtrip, and run resize/normalize/patchify
  on config.device (GPU) when available. Bit-identical on CPU when no resize is
  configured; with a resize only the PIL->torchvision bicubic backend differs
  (<2/255 per pixel). The use_albumentations path stays PIL/cv2; reload on a box
  without the saved device falls back to CPU.

- Default image_target_size/crop to the N1.7 backbone's training geometry
  (256x256 / 230x230) when a checkpoint ships no image sizing (checkpoint_assets
  is None, e.g. finetuning nvidia/GR00T-N1.7-3B via repo-id with a new
  embodiment). Previously image_target_size=None disabled the resize, so
  full-resolution frames were patchified into ~4.7x more vision tokens than the
  model was trained on -- inflating dataloading_s (patchify) and update_s (VLM
  sequence) and skewing the input distribution. Checkpoints that pin their own
  sizing are honored; the default constants are shared with GR00T_N1_7_DEFAULTS.

Net: preprocessing leaves the CPU critical path and the VLM sees the resolution
it was trained on -- faster training/inference and a correct train/serve
distribution. Affects inference too (shared preprocessor); existing checkpoints
still load (backward compatible) but must be retrained to gain the benefits.

* refactor(groot): N1.7 style cleanup (utils, imports, flash-attn, config)

Mechanical refactor of the GR00T N1.7 policy to match the repo's architecture and
style standards. No change to policy algorithm/numerics; only UX/CLI and packaging
changes. Tests are intentionally left untouched (out of scope) and need updating
for the removed `model_version` field.

Cleanup & consolidation:
- Add `groot/utils.py` holding the pure, side-effect-free helpers (JSON I/O, value
  coercion, stat flattening, rot6d/SE3 math, language/batch prep) shared by the
  config and processor layers.
- Remove dead code: the unused `resolve_groot_n1_7_backbone_model` cache-resolver
  cluster, `GR00TN17Config.to_filtered_dict/json`, and the `_copy_default` wrapper.

Imports & execution guards:
- Hoist nested imports to module top; relative imports within the package, absolute
  for external modules. The version-gated Qwen3-VL classes import under the single
  `_transformers_available` guard (transformers is pinned >=5.4, which ships them).
- No import-time side effects: `_register_with_transformers()` now runs in
  `GR00TN17.__init__` (idempotent via `register(exist_ok=True)`), and the N1.5 step
  stubs register lazily before pipeline deserialization (idempotent via the
  registry, no run-once globals).
- Gate optional deps at the point of use with `require_package(..., extra="groot")`.

Dependencies & docs:
- Drop `flash-attn` (and its build-only dep `ninja`) from the `groot` extra; default
  to SDPA (numerically equivalent) with opt-in via `--policy.use_flash_attention`.
  Un-comment `lerobot[groot]` in the `all` extra and regenerate `uv.lock`.
- Rewrite the `groot.mdx` install section: flash-attn is a purely optional,
  user-managed optimization that LeRobot neither installs nor requires.

Config & CLI:
- Surface previously-frozen knobs on `GrootConfig` (plumbed into `GR00TN17Config`;
  no-ops at their defaults): inference — `num_inference_timesteps`, `rtc_ramp_rate`,
  `use_flash_attention`; fine-tuning — `tune_top_llm_layers` (partial-LLM tuning)
  and `tune_vlln` (previously hardwired to True).
- Convert the single-valued `model_version` and `n1_7_backbone_model` fields to
  internal constants.
- Keep `base_model_path`: it is NOT equivalent to `pretrained_path` (raw NVIDIA
  checkpoints have no LeRobot `type` field and load only via `base_model_path`) and
  is genuinely user-tunable.
- Keep the deprecated Isaac-GR00T/N1.5 fields (and the dead LoRA fields) as a
  back-compat block so a v0.5.1 N1.5 `config.json` still parses under draccus and is
  rejected with the friendly N1.5 removal message instead of an opaque decode error.

* Optimize GR00T N1.7 image preprocessing

* Remove PIL fallback from GR00T preprocessing

* Fix GROOT relative action training stats

* Address GROOT relative action review feedback

* Fix GROOT N1.7 relative action stats

* Fix GROOT relative action training stats

* Fix GROOT relative action padding and RTC leftovers

* Reset rollout state after robot episode end

* Revert "Reset rollout state after robot episode end"

This reverts commit 1322f45aec.

* Move GROOT relative stats out of train script

* Guard GR00T relative action stepwise decode

* Match GR00T N1.7 OSS preprocessing and relative actions

* Apply LIBERO action decode override after loading

* Format GR00T OSS parity changes

* chore(policies): add guards, warnings and comments + recover tests n1.5 check

* fix(style): pre-commit

* fix(ci): guard dependecy checks

* chore(groot): move cv2 to the top as its in the default install tag

* chore(policies): add explicit dataset dependecy to gr00t implementation

* fix(test): add guard

* fix(groot): make N1.7 letterbox opt-in

* feat(groot): activate checkpoint-configured N1.7 raw-state dropout during training

Isaac-GR00T applies dual state regularization during fine-tuning: raw-state
zeroing driven by the processor sidecar's state_dropout_prob (0.2 for the
inspected N1.7 checkpoint) plus encoded-feature dropout. Baseline LeRobot kept
the processor in deterministic mode, so the raw-state dropout never activated
(RCA Tier-2 contributor to the LeRobot-trained SO-101 failures).

- GrootN17PackInputsStep: runtime-only 'training' flag + state_dropout_prob;
  whole-sample state zeroing gated on torch.is_grad_enabled() so eval and
  no_grad validation paths are unaffected
- sidecar loader reads state_dropout_prob from processor_config.json
- state_dropout_prob serializes with the step; the training flag intentionally
  does not (reloaded pipelines default to eval, re-enabled only when processors
  are rebuilt with dataset_meta)
- _set_groot_preprocessor_training toggles any dataclass step exposing a
  'training' field on serialized-pipeline reloads

Verification: tests/policies/groot/test_groot_state_dropout.py (4 passed) on
RTX PRO 6000 / CUDA 13.3.

* fix(groot): align N1.7 fine-tuning optimizer/scheduler/precision with Isaac-GR00T

Evidence from the LeRobot-vs-OSS checkpoint comparison: the LeRobot/HF 8k
checkpoint's DiT moved only ~19% as far from base as the OSS-trained one
(0.0547 vs 0.285 relative L2) - undertrained because the scheduler decayed over
a hardcoded 10k steps regardless of --steps, on top of beta1/clip mismatches.

- AdamW betas (0.95, 0.999) -> (0.9, 0.999) and grad_clip_norm 10.0 -> 1.0
  (Isaac defaults)
- scheduler: hardcoded CosineDecayWithWarmup(10k decay, floor 10% peak) ->
  DiffuserSchedulerConfig HF cosine with ceil(max_steps * warmup_ratio) warmup,
  deriving num_training_steps from the outer --steps at runtime
- model_params_fp32 (default true): keep master weights in FP32 and compute
  under BF16 autocast like the native N1.7 recipe (fixes optimizer-update
  numerics vs pure-BF16 params)
- weight-decay grouping via transformers get_parameter_names: biases and norm
  parameters excluded from decay
- restore the TF4 lm_head/embedding weight tie so the unused Qwen LM head stays
  frozen and deduplicated in checkpoints
- action_mask kept in native dtype for the masked flow-matching loss
- drop_n_last_frames: exclude episode tails that cannot supply a complete
  action chunk (Isaac sampler behavior)

Verification: tests/policies/groot/test_groot_training_optim_contract.py
(7 passed) + remaining groot suite 11 passed/5 skipped on RTX PRO 6000 /
CUDA 13.3. Note: tests/policies/groot/test_groot_n1_7.py does not collect on
the base branch (pre-existing ImportError, fixed in PR #37).

* feat(groot): train-time random crop for N1.7 (eval keeps center crop)

Isaac-GR00T crops a random crop_fraction window during training and the
deterministic center window at eval, replaying the sampled window across all
camera views of a sample. This contract is unchanged since the N1.5 release
(gr00t/data/transform/video.py: "If mode is 'train', return a random crop
transform. If mode is 'eval', return a center crop transform.") and mirrors
LeRobot's own Diffusion/VQBeT crop_is_random pattern. The LeRobot N1.7 port
used the eval center crop for training too, so the fine-tuned projector/DiT
never sees frame borders and trains on a single fixed appearance point.

Scope: crop geometry ONLY - no color jitter, no new dependencies. The random
window is plain numpy slicing inside the existing cv2 eval transform:

- _transform_n1_7_image_for_vlm_albumentations gains crop_position=(y, x)
  fractions; None keeps the center crop byte-identical to before (verified
  by test)
- GrootN17VLMEncodeStep gains a runtime-only 'training' flag (never
  serialized; reloaded pipelines default to eval); training samples ONE
  window per sample and reuses it across (timestep, view) frames - Isaac's
  cross-view consistency
- gated on torch.is_grad_enabled() so no_grad validation and frozen-eval
  paths are unaffected
- wired via dataset_meta is not None in make_groot_pre_post_processors and
  the existing _set_groot_preprocessor_training on serialized reloads

Verification: tests/policies/groot/test_groot_train_random_crop.py (8 passed:
center-crop bit-exactness with crop_position=None, corner/center windows,
cross-view replay, train!=eval, no_grad gating, seed reproducibility,
serialization contract) + groot suite 23 passed / 5 skipped on RTX PRO 6000 /
CUDA 13.3.

* docs(groot): update Training & hardware Evaluation commands

Replace the multi-GPU accelerate-launch Training snippet with the current
single-command 'uv run lerobot-train' N1.7 recipe (relative actions excluding
gripper, bf16, flash attention, chunk/n_action_steps=16, bs64/20k steps).

Replace the bimanual 'Evaluate in your hardware setup' rollout example with the
SO-101 follower RTC 'uv run lerobot-rollout' command (strategy.type=base,
inference.type=rtc, wrist+front cameras, place-the-vial task).

Docs-only; no source/test changes.

* docs(groot): parameterize commands with env vars + fill LIBERO results

- Introduce BASE_MODEL / DATASET_ID / REPO_ID / JOB_NAME / OUTPUT_DIR env vars
  in the training command and reuse OUTPUT_DIR + BASE_MODEL in the rollout cmd.
- Fill the LIBERO benchmark table with GR00T-LeRobot success rates
  (Spatial 94%, Object 98%, Goal 93%, LIBERO 10/Long 90%; avg 93.75%),
  drop the OSS column and XX placeholders. LeRobot-focused.

* docs(groot): drop export block, reference env vars directly

Use $DATASET_ID / $BASE_MODEL / $REPO_ID / $OUTPUT_DIR / $JOB_NAME as
bare placeholders in the commands without concrete export assignments.

* docs(groot): keep BASE_MODEL export in training command

* docs(groot): use literal HF repo IDs for dataset/policy repo_id

Public-facing Hub references (--dataset.repo_id, --policy.repo_id) shown as
concrete IDs; local-only values ($OUTPUT_DIR, $JOB_NAME) stay as placeholders.

* docs(groot): add LIBERO training command example

* docs(groot): remove LIBERO checkpoints subdirectory section

* docs(groot): use $BASE_MODEL for base_model_path in LIBERO eval

* docs(groot): drop hf download step from LIBERO eval, fix intro

* docs(groot): restore suite checkpoint download intro sentence

* docs(groot): remove checkpoint download note above LIBERO eval

* docs(groot): update training and rollout commands with new parameters and dependencies

* Add sample so101 training command

* Remove sample so101 training command

* docs(groot): remove optional Flash Attention setup instructions and update base model path for evaluation

* docs(groot): update training command with  image transformation parameters

* docs(groot): add note on inference.queue_threshold value for stable inference

* chore(style): pre-commit gr00t

* docs(groot): update

* chore(policies): minor details

* fix(groot): license headers + test guards

* chore(policies): fix tests

* docs(groot): relative actions param doc

* chore(policy): address some of the AI review items

---------

Co-authored-by: Andrew Wrenn <awrenn@nvidia.com>
Co-authored-by: Ryan Halabi <ryhalabi@nvidia.com>
Co-authored-by: nv-sachdevkartik <ksachdev@nvidia.com>
Co-authored-by: groot-validation <groot-validation@localhost>
Co-authored-by: johnnynunez <johnnynuca14@gmail.com>
Co-authored-by: lbenhorin <lbenhorin@nvidia.com>

2026-07-03 21:15:09 +02:00

12 KiB

Raw Blame History

LeRobot aims to provide models, datasets, and tools for real-world robotics in PyTorch. The goal is to lower the barrier to entry so that everyone can contribute to and benefit from shared datasets and pretrained models.

🤗 A hardware-agnostic, Python-native interface that standardizes control across diverse platforms, from low-cost arms (SO-100) to humanoids.

🤗 A standardized, scalable LeRobotDataset format (Parquet + MP4 or images) hosted on the Hugging Face Hub, enabling efficient storage, streaming and visualization of massive robotic datasets.

🤗 State-of-the-art policies that have been shown to transfer to the real-world ready for training and deployment.

🤗 Comprehensive support for the open-source ecosystem to democratize physical AI.

Quick Start

LeRobot can be installed directly from PyPI.

pip install lerobot
lerobot-info

Important

For detailed installation guide, please see the Installation Documentation.

Robots & Control

LeRobot provides a unified Robot class interface that decouples control logic from hardware specifics. It supports a wide range of robots and teleoperation devices.

from lerobot.robots.myrobot import MyRobot

# Connect to a robot
robot = MyRobot(config=...)
robot.connect()

# Read observation and send action
obs = robot.get_observation()
action = model.select_action(obs)
robot.send_action(action)

Supported Hardware: SO100, LeKiwi, Koch, HopeJR, OMX, EarthRover, Reachy2, Gamepads, Keyboards, Phones, OpenARM, Unitree G1, reBot B601.

While these devices are natively integrated into the LeRobot codebase, the library is designed to be extensible. You can easily implement the Robot interface to utilize LeRobot's data collection, training, and visualization tools for your own custom robot.

For detailed hardware setup guides, see the Hardware Documentation.

LeRobot Dataset

To solve the data fragmentation problem in robotics, we utilize the LeRobotDataset format.

Structure: Synchronized MP4 videos (or images) for vision and Parquet files for state/action data.
HF Hub Integration: Explore thousands of robotics datasets on the Hugging Face Hub.
Tools: Seamlessly delete episodes, split by indices/fractions, add/remove features, and merge multiple datasets.

from lerobot.datasets.lerobot_dataset import LeRobotDataset

# Load a dataset from the Hub
dataset = LeRobotDataset("lerobot/aloha_mobile_cabinet")

# Access data (automatically handles video decoding)
episode_index=0
print(f"{dataset[episode_index]['action'].shape=}\n")

Learn more about it in the LeRobotDataset Documentation

SoTA Models

LeRobot implements state-of-the-art policies in pure PyTorch, covering Imitation Learning, Reinforcement Learning, and Vision-Language-Action (VLA) models, with more coming soon. It also provides you with the tools to instrument and inspect your training process.

Training a policy is as simple as running a script configuration:

lerobot-train \
  --policy.type=act \
  --dataset.repo_id=lerobot/aloha_mobile_cabinet

Category	Models
Imitation Learning	ACT, Diffusion, VQ-BeT, Multitask DiT Policy
Reinforcement Learning	HIL-SERL, TDMPC & QC-FQL (coming soon)
VLAs Models	Pi0, Pi0Fast, Pi0.5, GR00T N1.7, SmolVLA, XVLA, EO-1, MolmoAct2, WALL-OSS
World Models	VLA-JEPA (more coming soon)
Reward Models	SARM, TOPReward, Robometer

Similarly to the hardware, you can easily implement your own policy & leverage LeRobot's data collection, training, and visualization tools, and share your model to the HF Hub

For detailed policy setup guides, see the Policy Documentation. For GPU/RAM requirements and expected training time per policy, see the Compute Hardware Guide.

Inference & Evaluation

Evaluate your policies in simulation or on real hardware using the unified evaluation script. LeRobot supports standard benchmarks like LIBERO, MetaWorld and more to come.

# Evaluate a policy on the LIBERO benchmark
lerobot-eval \
  --policy.path=lerobot/pi0_libero_finetuned \
  --env.type=libero \
  --env.task=libero_object \
  --eval.n_episodes=10

Learn how to implement your own simulation environment or benchmark and distribute it from the HF Hub by following the EnvHub Documentation

Resources

Documentation: The complete guide to tutorials & API.
Chinese Tutorials: LeRobot+SO-ARM101中文教程-同济子豪兄 Detailed doc for assembling, teleoperate, dataset, train, deploy. Verified by Seed Studio and 5 global hackathon players.
Discord: Join the LeRobot server to discuss with the community.
X: Follow us on X to stay up-to-date with the latest developments.
Robot Learning Tutorial: A free, hands-on course to learn robot learning using LeRobot.
T-Shirt Folding Experiment: An end-to-end demonstration of folding t-shirts with LeRobot.
LeLab: A web interface for LeRobot — teleoperate, calibrate, record datasets, replay, and train your SO arm from the browser, no CLI required.

Citation

If you use LeRobot in your project, please cite the GitHub repository to acknowledge the ongoing development and contributors:

@misc{cadene2024lerobot,
    author = {Cadene, Remi and Alibert, Simon and Soare, Alexander and Gallouedec, Quentin and Zouitine, Adil and Palma, Steven and Kooijmans, Pepijn and Aractingi, Michel and Shukor, Mustafa and Aubakirova, Dana and Russi, Martino and Capuano, Francesco and Pascal, Caroline and Choghari, Jade and Meftah, Khalil and Ellerbach, Maxime and Moss, Jess and Wolf, Thomas},
    title = {LeRobot: State-of-the-art Machine Learning for Real-World Robotics in Pytorch},
    howpublished = "\url{https://github.com/huggingface/lerobot}",
    year = {2024}
}

If you are referencing our research or the academic paper, please also cite our ICLR publication:

ICLR 2026 Paper

@inproceedings{cadenelerobot,
  title={LeRobot: An Open-Source Library for End-to-End Robot Learning},
  author={Cadene, Remi and Alibert, Simon and Capuano, Francesco and Aractingi, Michel and Zouitine, Adil and Kooijmans, Pepijn and Choghari, Jade and Russi, Martino and Pascal, Caroline and Palma, Steven and Shukor, Mustafa and Moss, Jess and Soare, Alexander and Aubakirova, Dana and Lhoest, Quentin and Gallou\'edec, Quentin and Wolf, Thomas},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://arxiv.org/abs/2602.22818}
}

Contribute

We welcome contributions from everyone in the community! To get started, please read our CONTRIBUTING.md guide. Whether you're adding a new feature, improving documentation, or fixing a bug, your help and feedback are invaluable. We're incredibly excited about the future of open-source robotics and can't wait to work with you on what's next—thank you for your support!

_{Built by the LeRobot team at Hugging Face with ❤️}

12 KiB Raw Blame History