Commit Graph

1547 Commits

Author SHA1 Message Date
Maxime Ellerbach 105aeab1bc Merge remote-tracking branch 'origin/main' into worktree-lingbot-va-port
# Conflicts:
#	docs/source/_toctree.yml
#	src/lerobot/policies/factory.py
#	uv.lock
2026-07-02 14:15:09 +00:00
Caroline Pascal 7ae12124b0 fix(save codec options): making sure codec options are always set via set_if (#3910)
* fix(save codec options): making sure codec options are always safely set through `set_if`

* tests(update): updating tests
2026-07-02 15:29:14 +02:00
Caroline Pascal c746ca2df2 fix(depth unit): adding input depth unit storage in the dataset metadata (#3899)
* fix(depth unit): storing raw depth units in the dataset metadata for correct depth statistics and depth raw frames handling. The unit is stored as a string ("m","mm") under "depth_unit" at the same level as "is_depth_map". Unit is inferred from the depth frame type.

* feat(raw frame unit): adapting dataset reader so that raw depth frames are scaled according to the requested unit

* feat(stats units): rescaling stats when loading a dataset so that the stats are given in the requested unit

* tests(unit): adapting and extending depth tests to units manipulations

* chore(format): formatting code

* feat(warning): adding a warning when depth unit is not specified in the dataset

* chore(infer_depth_unit): moving the depth unit inference utility in a more accessible location

* feat(rerun unit): adding correct depth unit display for rerun (foxglove does not support units yet)

* feat(unit getter): adding a proper output_depth_unit getter to LeRobotDataset for cleaner integration

* fix(streaming dataset): extending support for depth units to streaming datasets

* test(rerun): fixing rerun tests
2026-07-02 11:53:13 +02:00
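The depth-unit handling described in #3899 (raw frames and stats rescaled to a requested unit) can be illustrated with a minimal sketch. The helper names `depth_scale`, `convert_depth` and `rescale_stats` are hypothetical and not taken from the LeRobot code; only the unit strings "m" and "mm" come from the commit message.

```python
import numpy as np

# Metres per unit. "m" and "mm" are the unit strings named in the commit message.
_METERS_PER_UNIT = {"m": 1.0, "mm": 1e-3}


def depth_scale(src_unit: str, dst_unit: str) -> float:
    """Multiplicative factor that converts values from src_unit to dst_unit."""
    return _METERS_PER_UNIT[src_unit] / _METERS_PER_UNIT[dst_unit]


def convert_depth(frame: np.ndarray, src_unit: str, dst_unit: str) -> np.ndarray:
    """Return the depth frame as float32, expressed in dst_unit."""
    return frame.astype(np.float32) * depth_scale(src_unit, dst_unit)


def rescale_stats(stats: dict, src_unit: str, dst_unit: str) -> dict:
    """Rescale unit-bearing statistics; other entries (e.g. counts) are left untouched.

    A pure unit change is a multiplication, so mean, std, min and max all
    scale by the same factor.
    """
    factor = depth_scale(src_unit, dst_unit)
    scaled_keys = {"mean", "std", "min", "max"}
    return {
        key: np.asarray(value, dtype=np.float64) * factor if key in scaled_keys else value
        for key, value in stats.items()
    }


raw_mm = np.array([[1500, 250]], dtype=np.uint16)
depth_m = convert_depth(raw_mm, "mm", "m")
stats_mm = rescale_stats({"mean": [2.0], "std": [0.5], "count": [10]}, "m", "mm")
```

Keeping the conversion factor in a single helper means frames and statistics cannot drift apart when a new unit is added to the table.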
Caroline Pascal b961d2a8c5 feat(libaom-av1): adding support for libaom-av1 codec (#3898) 2026-07-02 11:03:41 +02:00
Steven Palma 052d329470 feat(visualization): add foxglove support (#3902)
* Add Foxglove display mode for teleoperate

Add a --display_mode flag (rerun|foxglove) to lerobot-teleoperate. When set
to foxglove, stream observations/actions over a Foxglove WebSocket server:
images as RawImage/CompressedImage, scalars as typed JSON channels with
schemas generated from the feature names (sanitized so paths don't need
quoting). Adds a `foxglove` extra.

* Add Foxglove display mode to lerobot-record

Wire the --display_mode flag (rerun|foxglove) into lerobot-record, matching
lerobot-teleoperate: route init/log through the backend-agnostic dispatchers
and stop the visualization backend on exit.

* update foxglove-sdk to 0.25.1

* Use static lerobot.Scalars schema for Foxglove state topics

Replace the per-topic JSON schema derived from feature names with a single
static lerobot.Scalars schema: a scalars array of {label, value} objects. The
same schema fits any robot regardless of which observation/action features it
reports, and the label field lets Foxglove name each series automatically so
one filtered path plots every feature.

* add foxglove option to dataset viz

* Make Foxglove dataset playback loop the sole frame emitter

Address review: the listener no longer emits frames, it only mutates
playback state and queues a one-shot seek index that the playback loop
services. The loop is now the only caller of emit_frame, so concurrent
random access into the on-disk dataset / video decoder never overlaps.

Also remove the dead server_holder and tighten the _foxglove_safe_name
docstring to state what it does and why.

* Label Foxglove dataset scalars with feature dimension names

Use the dataset's per-dimension feature names (e.g. joint names) as the
Foxglove series labels for /observation/state and /action/state instead
of bare indices. LeRobot stores `names` inconsistently (flat list,
{category: [...]}, or {name: index}), so _feature_dim_names handles each
and falls back to indices on any unknown format or length mismatch.

* Make Foxglove server host bindable and refactor topic/channel handling

Pass display_ip through as the Foxglove WebSocket bind host (127.0.0.1
for local only, 0.0.0.0 for all interfaces) instead of always binding
locally. In lerobot-dataset-viz, fold the separate --port into --web-port
so one flag covers both the Rerun web viewer and the Foxglove server port.

Add a _foxglove_topic() helper and thread a per-topic channel cache
through the log helpers so dataset playback stays self-contained instead
of mutating the module-global cache. Promote SUCCESS to constants.py.

* feat(viz): add support for foxglove in rollout + add to viz tag

* fix(docs): remove misleading installation note

* fix(visualization): no duplicated prefix, consolidated norm + warnings log

* chore(viz): minor improvements

* refactor(viz): split files + autoplay + updated docs + added minimal tests

* fix(viz): right tags + warning

* feat(deprecated ws-port): removing rerun's deprecated ws-port parameter in dataset visualization

* chore(web ports): adding global variables for default foxglove/rerun web ports

* feat(depth): adding depth support to foxglove visualizer. Because of foxglove limitations (min and max values on RawImage cannot be set from the SDK), depth is normalized between [0,1] when a depth range is provided.

* fix(rerun depth range): making rerun depth range computation safe against missing stats

* chore(foxglove depth): make it simple, and make it work.

* fix(scaling): fixing depth frames scaling

---------

Co-authored-by: Roman Shtylman <roman@foxglove.dev>
Co-authored-by: Caroline Pascal <caroline8.pascal@gmail.com>
2026-07-01 18:39:32 +02:00
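The label resolution described for `_feature_dim_names` in #3902 (flat list, {category: [...]}, or {name: index}, with a fallback to indices) might look roughly like the sketch below. This is an illustration written from the commit message alone; the function name `feature_dim_names` and its exact fallback rules are assumptions, not the actual LeRobot implementation.

```python
def feature_dim_names(names, dim: int) -> list[str]:
    """Resolve per-dimension labels, falling back to "0", "1", ... when unsure."""
    fallback = [str(i) for i in range(dim)]

    if isinstance(names, (list, tuple)):
        # Flat list: one label per dimension.
        labels = [str(n) for n in names]
    elif isinstance(names, dict) and names:
        values = list(names.values())
        if all(isinstance(v, (list, tuple)) for v in values):
            # {category: [...]}: concatenate the groups in insertion order.
            labels = [str(n) for group in values for n in group]
        elif all(isinstance(v, int) and not isinstance(v, bool) for v in values):
            # {name: index}: order the names by their index.
            labels = [str(name) for name, _ in sorted(names.items(), key=lambda kv: kv[1])]
        else:
            return fallback  # unknown dict layout
    else:
        return fallback  # None or any other unknown format

    # A length mismatch also falls back to bare indices.
    return labels if len(labels) == dim else fallback


flat = feature_dim_names(["shoulder", "elbow"], 2)
grouped = feature_dim_names({"motors": ["shoulder", "elbow"]}, 2)
indexed = feature_dim_names({"elbow": 1, "shoulder": 0}, 2)
unknown = feature_dim_names(None, 3)
mismatch = feature_dim_names(["shoulder"], 2)
```

Returning the index fallback on any doubt keeps plotting functional even for datasets whose `names` metadata is malformed.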
Nicolas Rabault e623733861 perf(tests): cache draccus docstring extraction (#3903)
draccus re-parses each config class's source on every parse() to extract
field help text (~2.5s for TrainPipelineConfig). Memoize it for the test
session; the source is constant within a run.

Fast Tests test time: 664s -> 404s (-39%).
2026-07-01 17:05:43 +02:00
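The memoization idea behind #3903 can be sketched with `functools.lru_cache`. The function `slow_extract_help` below is a hypothetical stand-in for an expensive per-class source parse; it does not call draccus and is not how the actual patch is written.

```python
import functools

call_count = 0


def slow_extract_help(config_cls: type) -> dict[str, str]:
    """Stand-in for an expensive parse of a config class's source (hypothetical)."""
    global call_count
    call_count += 1
    fields = getattr(config_cls, "__annotations__", {})
    return {name: f"help for {name}" for name in fields}


# Classes are hashable, so the class object itself can serve as the cache key.
# The source of a class is constant within one test session, so caching is safe there.
cached_extract_help = functools.lru_cache(maxsize=None)(slow_extract_help)


class DemoConfig:
    steps: int
    lr: float


first = cached_extract_help(DemoConfig)
second = cached_extract_help(DemoConfig)  # served from the cache, no second parse
```

The cached call returns the same dictionary object each time, so callers should treat the result as read-only.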
Maxime Ellerbach 141c353206 feat(policies): Add FastWAM Policy (#3834)
* Add FastWAM policy

* Add FastWAM policy review updates

* big refactor to use models from diffusers and transformers

* changing reproducible results

* preparing for training, adding some temporary debug code as well to visualize model output

* re-parenting of some layers to enable proper zero-3 FSDP

* linting

* small fix for the preprocessor and padded images

* removing some preprocessors

* removing temporary debug code

* cleaning up

* updating uv lock after rebasing

* adding lazy imports

* linting

* fixing stale assertion

* make tokenizer/text-encoder model ids configurable + some nits

* moving and renaming files to have a cleaner file tree

* removed asserts from the model, added guard instead and completely removed useless asserts

* cleaning up imports

* removing is_main_process and custom logging logic

* removing unused / stale attention path, removing some of the stale forwards within wan/models

---------

Co-authored-by: ZibinDong <zibindong@outlook.com>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
2026-07-01 14:35:57 +02:00
Caroline Pascal 8414188db0 fix(datasets dependency): removing datasets dependency in pretrained.py (#3897) 2026-06-30 20:21:06 +02:00
Khalil Meftah 0da98afd63 Feat(robot): add MIT control mode to ReBot (#3778)
* fix(config): update joint limits for RebotB601Follower and RebotArm102Leader

* feat(config): add MIT control mode to ReBot

- Add configurable arm control mode (mit default, pos_vel fallback) with tunable mit_kp / mit_kd
- Add optional gripper control mode (force_pos default, mit optional) with gripper_mit_kp / gripper_mit_kd
- Update tests for MIT arm routing, gripper mode routing, and revised joint limits

* fix(robots): restore joint clipping and wrist_yaw fallback in ReBot B601 send_action

* feat(robot): increase gripper velocity and torque for rebot arm
2026-06-30 17:17:50 +02:00
Khalil Meftah 2f2b567951 Enable MolmoAct2 rollout on SO-100/101 with calibration correction (#3879)
* fix(rollout): improve visual feature mismatch error with --rename_map hint

* feat(policies): add joint frame transform and hardware deployment docs for MolmoAct2

Add MolmoAct2StateFrameTransformStep and MolmoAct2ActionFrameTransformStep
processor steps for cross-calibration compatibility on SO-100/101. Add
joint_signs and joint_offsets config fields. Add hardware deployment section
to molmoact2.mdx with camera naming convention, joint frame correction, and
safety guidance.

* chore(docs): address PR comment

* fix: address reviewer comments
2026-06-29 18:52:59 +02:00
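The commit message for #3879 names `joint_signs` and `joint_offsets` config fields but does not state the formula the transform steps apply. One plausible reading, shown here purely as an assumption, is a per-joint affine correction on the state with its inverse applied to actions. The function names `to_policy_frame` and `to_robot_frame` are hypothetical.

```python
import numpy as np


def to_policy_frame(q_robot: np.ndarray, signs: np.ndarray, offsets: np.ndarray) -> np.ndarray:
    """Map robot joint readings into the frame the policy was trained in (assumed form)."""
    return signs * q_robot + offsets


def to_robot_frame(q_policy: np.ndarray, signs: np.ndarray, offsets: np.ndarray) -> np.ndarray:
    """Inverse mapping for actions. Signs are assumed to be +1 or -1, so division is exact."""
    return (q_policy - offsets) / signs


signs = np.array([1.0, -1.0, 1.0])
offsets = np.array([0.0, 90.0, -5.0])
q_robot = np.array([10.0, 20.0, 30.0])

q_policy = to_policy_frame(q_robot, signs, offsets)
q_back = to_robot_frame(q_policy, signs, offsets)
```

Whatever the real formula is, the state and action transforms need to be exact inverses of each other, which is the property the round trip above checks.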
Maxime Ellerbach 18eee1b477 refactor(vla-jepa): removing gpu roundtrip (#3750)
* refactor(vla-jepa): removing gpu roundtrip for the preprocessing part

* major refactor of the forward pass and model input conversion

* linting

* addressing suggestions from reviews
* removing redundant state dtype conversion
* avoiding recreating the same tensor each forward pass
* api simplification of `_encode_qwen`
* avoiding useless video assembly during inference
* guard against video=None for the wm loss
2026-06-29 18:50:04 +02:00
Nicolas Rabault 5ac3b49a5f feat(train): run training remotely on HF Jobs via --job.target (#3856)
* feat(train): add JobConfig group, save_checkpoint_to_hub flag, Hub checkpoint helper

Introduce a JobConfig draccus group on TrainPipelineConfig (--job.target/image/
timeout/detach/tags) whose is_remote property gates remote dispatch, plus a
save_checkpoint_to_hub flag and validation. Add push_checkpoint_to_hub(), which
uploads a saved checkpoint directory to the model repo under checkpoints/<step>/
and creates the repo idempotently (private propagates from policy.private).

* feat(train): run training remotely on HF Jobs via --job.target

When --job.target names a GPU flavor, train() dispatches to lerobot.jobs.submit_to_hf
instead of training locally: it authenticates, ensures the dataset is on the Hub
(pushing a local-only one privately), serializes a pod-compatible train_config.json
(strips client-only fields, points at the model repo), submits via HfApi.run_job
with HF_TOKEN/WANDB_API_KEY secrets, then streams logs and finishes when the model
is pushed. Wires push_checkpoint_to_hub into the training loop behind
save_checkpoint_to_hub, and tags jobs/datasets/model with 'lerobot' + --job.tags.

* docs(train): document remote training on HF Jobs

* test(train): skip remote-dispatch tests without the dataset extra

The module imports lerobot.scripts.lerobot_train, which eagerly pulls in
lerobot.datasets (dataset extra). The base fast-test CI tier runs without
that extra, so collection failed there. Guard with pytest.importorskip,
matching the existing tests/scripts dataset-extra tests.

* refactor(jobs): hoist huggingface_hub imports to module level in hf.py

huggingface_hub is a core dependency, so the per-function dynamic imports
had no lazy-loading rationale. Move them to a single module-level import
and update test monkeypatch targets to lerobot.jobs.hf.* accordingly.

* refactor(jobs): build remote config dict via cfg.to_dict()

TrainPipelineConfig.to_dict() already returns the canonical draccus
encoding, so the StringIO + draccus.dump + json.loads round-trip was
redundant. Use it directly and drop the now-unused io/draccus imports.

* refactor(train): use module-level HfApi import in push_checkpoint_to_hub

huggingface_hub is a core dependency; the in-function import was
unnecessary. Move HfApi to a module-level import and point the test
monkeypatches at lerobot.common.train_utils.HfApi.

* refactor(configs): export JobConfig from the configs package

Re-export JobConfig in lerobot/configs/__init__.py so external callers
import it as `from lerobot.configs import JobConfig`, matching the other
config classes. Adapt the train script and test imports.

* refactor(jobs): check dataset presence with api.repo_exists

Replace the dataset_info try/except RepositoryNotFoundError dance with a
direct api.repo_exists(repo_id, repo_type="dataset") call, dropping the
httpx/RepositoryNotFoundError test scaffolding.

* chore(jobs): annotate ensure_dataset_available api param as HfApi

Add the missing HfApi type hint via a TYPE_CHECKING import.

* refactor(jobs): use HF_LEROBOT_HOME constant for the local cache root

Resolve the local dataset cache via lerobot.utils.constants.HF_LEROBOT_HOME
instead of re-reading the env var by hand, dropping the os/Path imports.
Tests now patch the imported constant and assert on a stable message
substring (the previous "neither" match only passed by accident, matching
the test name embedded in the pytest tmp_path).

* chore(jobs): guard LeRobotDataset import with require_package

Surface a clear "install lerobot[dataset]" error if the datasets extra
is missing, instead of a raw ImportError, before pushing a local dataset.

* docs(configs): clarify the is_remote_target/is_remote split

Add a comment explaining why JobConfig keeps both the staticmethod (tests
a raw target string from argv before a config exists) and the property
(accessor for an existing config instance).

* docs(train): note how to pin a pushed model version for inference

Document --policy.pretrained_revision alongside --policy.path so a
specific Hub-pushed checkpoint (once --save_checkpoint_to_hub has
committed several) can be selected for inference.

* test(jobs): skip dataset import guard in base-deps test

The fast test env installs base deps only, so require_package('datasets')
raised ImportError before the mocked lerobot.datasets import was reached.
Monkeypatch the guard to a no-op so the unit test exercises the upload logic.

* fix(jobs): address claude review findings on remote training

Resolve the claude[bot] review on #3856:

- Reject reward-model training under --job.target with a clear error instead
  of crashing on a None policy inside build_remote_config_file.
- Support --policy.path remote runs: validate() no longer requires repo_id for
  remote runs (it is auto-generated in submit_to_hf), and repo_id/push_to_hub
  are now set after validate() resolves the policy.
- Narrow the bare `except Exception` in _tail_logs/_poll_until_done to
  (OSError, httpx.HTTPError) so programming errors surface instead of being
  silently retried or counted as job failures.
- Install the SIGINT detach handler only on the main thread.
- Generate model repo timestamps in UTC.

* docs(jobs): document the model-pushed marker contract and orphaned repos

Follow-up to the claude[bot] review on #3856 (non-blocking observations):

- Cross-reference the "Model pushed to <url>" log line between its producer
  (PreTrainedPolicy.push_model_to_hub) and the remote-run consumer in
  submit_to_hf, noting the contract is an early-finish optimization that
  falls back to status polling if it drifts.
- Note in the HF Jobs guide that a failed remote run leaves its model repo
  on the Hub (it is not auto-deleted) and how to remove it.

* feat(train): tag each pushed checkpoint with its step

Address review feedback on #3856: pushing a checkpoint to the Hub now
also creates a tag named after the checkpoint step, so a checkpoint can
be recovered with --policy.pretrained_revision=<step> instead of having
to look up its commit sha.

* fix(jobs): hoist ensure_dataset_available to a module-level import

Addresses Caroline's review comment on PR #3856: the local import of
ensure_dataset_available inside submit_to_hf was vestigial. dataset.py
does not import hf.py, so there is no circular-import risk and no extra
load cost (its heavy deps stay lazy), so make it a top-level import.

* refactor(configs): untangle config_path/resume resolution in validate()

Split the re-parse HACK block in TrainPipelineConfig.validate() into focused
helpers (_resolve_pretrained_from_cli, _resolve_resume_checkpoint) that handle
the policy path, reward-model path, and resume config_path as separate,
readable units. Behavior-preserving.

* feat(train): resume training from a Hub checkpoint

Allow --config_path to be a Hub repo id when resuming, not only a local path.
The latest checkpoint under checkpoints/<step>/ is downloaded into a fresh local
run dir and resumed from there (optimizer, scheduler, RNG and data order
restored as for a local resume). TrainPipelineConfig.from_pretrained falls back
to the latest checkpoint's train_config.json when a repo has no root config
(an interrupted run that only pushed checkpoints). The download is skipped when
dispatching remotely so the executor (local machine or HF Jobs pod) performs it.

- add find_latest_hub_checkpoint (utils/hub) and resolve_resume_checkpoint
  (common/train_utils), the symmetric download counterpart to
  push_checkpoint_to_hub
- unit tests for both helpers and the from_pretrained fallback

* feat(jobs): resume a run on HF Jobs from a checkpoint

When --resume is set with a remote --job.target, submit_to_hf resumes from the
checkpoint repo instead of staging a fresh config. A Hub config_path is resumed
in place (its checkpoint config already targets that repo); a local config_path
has its checkpoint uploaded to a new private repo first and the run is forced to
push back to it. The pod command carries --job.target=local so the checkpoint's
saved job.target can't make the pod re-dispatch itself, and the user's CLI
overrides are forwarded so a remote resume matches the same local command.
ensure_dataset_available is hoisted before the resume/fresh branch since it
applies to both.

* docs(train): document resuming from a Hub checkpoint, locally and on jobs

Show that --config_path accepts a Hub repo id for --resume, and that adding
--job.target resumes on HF Jobs (uploading a local checkpoint/dataset first).

* fix(jobs): default remote job timeout to 2d instead of the platform default

HF Jobs applies its own short 30-minute timeout when none is sent, which
silently kills long training runs. Pass an explicit, generous 2d cap by
default; users can still override --job.timeout to fail fast or extend it.

* fix(jobs): drop --dataset.root on resume + restore keyboard-control docs

Address the latest Claude review on #3856:

- _build_resume_job no longer forwards --dataset.root to the pod (a
  host-local path it can't read); the fresh-run path already nulls it in
  build_remote_config_file, so this makes resume consistent. Add a unit
  test for _pod_forwarded_args covering the drop in both flag forms.
- Restore the display-independent keyboard-control docs (n/r/q letter
  equivalents + X11/Wayland/headless Tip) in il_robots.mdx that this
  branch was stale on relative to main (#3875).

* fix(jobs): handle str-typed job stage from huggingface_hub

inspect_job's status.stage is an enum (with .value) in some
huggingface_hub versions and a plain str in others. The poller
assumed the enum shape, raising "'str' object has no attribute
'value'" on resume for users on the str-returning version.

Read it via getattr(..., "value", ...) so both shapes work, and
parametrize the poll test over enum and str stages so the str case
is actually exercised (the old mock only ever simulated the enum).

* refactor(jobs): use relative import for ensure_dataset_available

* refactor(train): hoist submit_to_hf import to module top

The `from lerobot.jobs import submit_to_hf` was a function-local import in
train(); it pulls no heavy/optional deps and has no circular-import risk, so
move it to the top-level import block.

* refactor(train): hoist _remote_target_in_argv imports to module top

Move `import sys` and `from lerobot.configs import JobConfig` out of the
function body and into the top-level import block.

* refactor(utils): use relative import for sibling constants in hub.py

`from lerobot.utils.constants import CHECKPOINTS_DIR` was the odd one out in
utils/ — sibling modules there are imported relatively (.constants, .errors,
.utils, ...). Match that convention.

* refactor(jobs): hoist LeRobotDataset import, guard dataset extra at package init

Move the `from lerobot.datasets import LeRobotDataset` import to the top of
dataset.py and relocate the `require_package("datasets", extra="dataset")`
guard to the jobs package __init__, per review feedback.

* test(jobs): skip test_hf if datasets extra is missing

lerobot.configs.train pulls in datasets at import time, so the module
fails to collect without lerobot[dataset]. Guard with importorskip,
matching the convention in tests/training/test_multi_gpu.py.

* test(jobs): skip test_dataset if datasets extra is missing

tests/jobs/test_dataset.py imports lerobot.jobs.dataset, which triggers
the require_package("datasets") guard in lerobot/jobs/__init__.py at
import time. Without lerobot[dataset] the module fails to collect in the
base CI tier. Guard with importorskip, same as test_hf.py.
2026-06-29 17:59:33 +02:00
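The enum-or-str stage handling described in #3856 ("Read it via getattr(..., "value", ...)") can be shown in isolation. `DemoStage` and `stage_name` are hypothetical stand-ins; the real poller and the huggingface_hub stage type are not reproduced here.

```python
from enum import Enum


class DemoStage(Enum):
    """Hypothetical stand-in for a library enum describing a job stage."""

    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"


def stage_name(stage) -> str:
    """Return the stage as a plain string, whether it is an enum member or a str.

    Enum members expose `.value`; plain strings do not, so getattr falls back
    to the object itself.
    """
    return str(getattr(stage, "value", stage))


from_enum = stage_name(DemoStage.RUNNING)
from_str = stage_name("COMPLETED")
```

Normalizing at the point of reading keeps every later comparison working on plain strings, regardless of which library version produced the value.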
Caroline Pascal a5821a01a2 feat(dependencies): bump rerun-sdk to <0.34.0 (#3763)
* Update upper bound to latest rerun-sdk

* chore(update): update rerun logging to use the latest features

* chore(format): formatting code

* feat(features names and color): improving features names and display colors when replaying an episode

* feat(blueprints): switching to blueprints for backwards (and forward) compatibility

* feat(blueprints): switching to blueprints for backwards (and forward) compatibility

* feat(grid): Leveraging rerun's automatic grid arrangement for improved layout

* test(update): update tests

* chore(colors): removing unreliable colors

* chore(simplification): removing no longer needed reshape

* chore(imports): cleaning up imports

* fix(claude): claude reviews

* chore(dependencies): update rerun ceil version

* chore(scripts): recover comments

* chore(utils): add guard for blueprint

* fix(test): style check

* fix(deps): typo bound

---------

Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: ntjohnson1 <24689722+ntjohnson1@users.noreply.github.com>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Steven Palma <steven.palma@huggingface.co>
2026-06-29 17:28:06 +02:00
Caroline Pascal 3dd19d043e feat(depth maps): adding support for depth in LeRobot (#3644)
* feat(depth): add depth quantization helpers and tests

* feat(video): add ffv1 to supported codecs

* feat(depth): persist depth metadata

* feat(depth): extend quantization tools to better fit the encoding/decoding pipeline

* feat(depth): plumb DepthEncoderConfig through LeRobotDataset and DatasetWriter

* feat(depth): wire StreamingVideoEncoder + writer to depth encoder

* feat(depth): wire DatasetReader to decode_depth_frames

* feat(cameras/realsense): expose async depth in metric meters

* feat(features): route 2D camera shapes to observation.depth.<key>

* feat(robots/so_follower): emit + populate depth keys when use_depth

* feat(record): plumb DepthEncoderConfig through lerobot-record

* feat(viz): render depth observations as rr.DepthImage in Viridis

* feat(depth maps writer): adding support for raw depth maps recording with image writer

* chore(format): format code

* feat(depth shape): ensuring depth maps shape is always including the channel

* feat(is_depth): simplifying is_depth nested name + legacy support

* fix(stop_event): fixing stop_event race condition in camera classes

* fix(plumbing): fixing missing parts in the depth maps pipeline

* chore(typos): fixing typos

* test(fix): fixing existing tests to still work with latest features

* tests(depth): adding new tests for depth integration validation

* feat(pix_fmt channels): use PyAV to get the number of channels of pixel formats

* feat(refactor): refactor DepthEncoderConfig quantization pipeline, so that the methods do not live in the config class. Add pixel format - channels validation. Move the default pixel format for depth in the config file.

* fix(pre-commit): fixing mutable default value

* fix(info): fixing info metadata update when is_depth_map was set

* tests(typos): fixing typos in tests

* fix(realsense): fixing typo in realsense serial number

* fix(normalization): restricting 255 normalization to non depth/uint8 images only

* fix(typo): fixing typo

* fix(TIFF): add missing quantization and cleanup for TIFF files

* feat(batched dequantization): optimizing dequantize_depth for torch based batched dequantization

* feat(tools): adding depth support in LeRobotDataset edition tools

* test(aggregate): extending aggregation tests to depth frames

* test(cleaning): cleaning up tests

* fix(from_video_info): fixing early validation issue in from_video_info

* fix(typo): fixing typo

* fix(is_depth): adding missing docstrings and is_depth arguments in video decoding functions

Co-authored-by: Wensi (Vince) Ai <59036629+wensi-ai@users.noreply.github.com>

* fix(depth units): fixing depth units output for the realsense cameras

* feat(output unit): adding support for output unit specification at dataset reading/training time

Co-authored-by: Wensi (Vince) Ai <59036629+wensi-ai@users.noreply.github.com>

* test(depth): cleaning up depth tests

* test(depth encoding): updating and cleaning video/depth encoding tests

* chore(format): formatting code

* docs(depth): improving depth maps docs

* test(fix): fixing depth tests

* test(dataset tools): adding missing tests for new dataset edition tools features

* chore(format): formatting code

* fix(pyav check): fixing PyAV option validation for integer codec options by normalizing
numeric values before calling `is_integer()`

Co-authored-by: Wensi (Vince) Ai <59036629+wensi-ai@users.noreply.github.com>

* docs(mermaid): fixing mermaid diagram

* fix(rebase): rebase follow up corrections

* feat(dataset tools): adding missing docstrings and features for depth fill support in dataset edition tools

* docs(docstring): updating docstrings

* docs(dataset tools): updating docs

* fix(save images): fixing image saving in dataset tools

* fix(update video info): fixing update video info logic to match the recording and editing use cases

* test(reencode): fixing reencoding monkeypatch

* fix(review): add Claude review

* chore(format): format code

* fix(update video info): ditching the differentiated approaches for video info update - video info are always updated except for preserved keys.

* chore(rebase): fixing rebase merge conflicts

* test(visualization): fixing visualization tests

* feat(docstrings): adding explicit docstring for encoding parameters. Docstrings will now show up as description in the CLI --help.

* feat(mm as default): adding a global DEFAULT_DEPTH_UNIT variable setting mm as default depth unit

* fix(RGB <-> camera): renaming camera_encoder to rgb_encoder for clarity

* chore(TODO): removing deprecated TODO

* doc(write_u16_plane): improving docstrings for write_u16_plane

* feat(units): adding constants for depth frames units (m and mm)

* fix(spam): replacing spamming warning with a debug log

* feat(legacy metadata): adding automatic metadata update for legacy 'video.is_depth_map' feature

* fix(copy&reindex): fixing metadata reshaping for single channel frames

* fix(ImageNet): excluding depth frames from ImageNet stats

* fix(PyAV container seek): fixing initial PyAV container seek to be robust against codec choice

* feat(lerobot-dataset-viz): adding support for depth in lerobot-dataset-viz

* fix(compress): removing rerun compression for DepthImages

* fix(single channel squeeze): fixing single channel squeezing

* chore(format): format code

* fix(streaming): adding support for dequantization in streaming_dataset.py

* refactor(read depth): factorizing depth reading methods for realsense camera and adding support for depth-only usage

* chore(renaming): fixing missed RGBEncoderConfig renamings

* docs(renaming): reflecting renamings in a clearer way in the docs

* chore(annotation): excluding depth from the annotation pipeline

* feat(robots): adding depth support in compatible follower robots

* feat(LeSadKiwi): excluding LeKiwi from depth support (for now)

* chore(fail): removing misplaced file

* chore(fail): removing misplaced file

* fix(remove ffv1): removing ffv1 as it does not support MP4

* docs(cheat sheet): adding depth and video encoding to the cheat sheet

* fix(lossless): tuning depth encoding parameters for lossless depth storage

* test(fix): fixing failing tests

* depth(ZMQ): excluding ZMQ from depth support

* Revert "depth(ZMQ): excluding ZMQ from depth support"

This reverts commit b95cf4e4c2.

* fix(image transforms): excluding depth frames from images transforms

* fix(typo): typo

* fix(stats): fixing stats computation for depth frames

* fix(TIFF vs. pytorch): adding an extra uint16 to float32 conversion for depth maps stored as raw TIFF images

* fix(typos): fixing typos

* test(dtype): fixing stats computation typing tests

---------

Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Wensi (Vince) Ai <59036629+wensi-ai@users.noreply.github.com>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Wensi Ai <wsai@stanford.edu>
2026-06-27 14:21:21 +02:00
Khalil Meftah 6a788fbdb0 Add inline offline validation with train/eval split (#3824)
* refactor(training): rename eval_freq to env_eval_freq

- Rename eval_freq to env_eval_freq to distinguish sim environment evaluation from offline loss evaluation.

* feat(training): add inline offline validation with train/eval split

- Add eval_split config for balanced per-task holdout
- Add eval_steps for periodic inline eval loss computation
- Add max_eval_samples to cap eval cost

* fix(datasets): remap absolute indices in __getitem__ for filtered datasets

* fix(train): vectorize eval subset selection for max_eval_samples

* fix(datasets): Move the remapping into EpisodeAwareSampler via absolute_to_relative_idx

* fix(validation): add eval_split range check and eval_steps warning

Validate eval_split is in [0.0, 1.0) to prevent garbage splits from
out-of-range values. Raise when eval_steps > 0 but eval_split is 0.0
since no offline eval will run.

* fix(train): prepare eval dataloader with accelerator for multi-GPU

Prepare eval_dataloader through accelerator.prepare() so eval data is
sharded across ranks instead of duplicated. Reduce eval_loss across
ranks with mean reduction for consistent logging.

* fix(test): rename eval_freq to env_eval_freq for multi-GPU training
2026-06-25 15:31:24 +02:00
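The validation rules stated in #3824 (eval_split must lie in [0.0, 1.0), and eval_steps > 0 with eval_split of 0.0 is rejected) can be sketched as a standalone check. The function name `validate_eval_config` is hypothetical, and the error messages are illustrative rather than copied from the codebase.

```python
def validate_eval_config(eval_split: float, eval_steps: int) -> None:
    """Reject configurations that would produce a garbage split or a silent no-op."""
    if not 0.0 <= eval_split < 1.0:
        raise ValueError(f"eval_split must be in [0.0, 1.0), got {eval_split}")
    if eval_steps > 0 and eval_split == 0.0:
        # Periodic offline eval was requested, but no data is held out for it.
        raise ValueError("eval_steps > 0 requires eval_split > 0.0")


def is_valid(eval_split: float, eval_steps: int) -> bool:
    """Small helper for demonstration: True when the config passes validation."""
    try:
        validate_eval_config(eval_split, eval_steps)
    except ValueError:
        return False
    return True


results = {
    "holdout with periodic eval": is_valid(0.1, 100),
    "no holdout, no eval": is_valid(0.0, 0),
    "negative split": is_valid(-0.1, 0),
    "split of 1.0": is_valid(1.0, 0),
    "eval without holdout": is_valid(0.0, 10),
}
```

The upper bound is exclusive because a split of 1.0 would leave no training data at all.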
Khalil Meftah c3f180e115 refactor(policies): clean MolmoAct2 to follow EO1/TOPReward patterns (#3724)
Align the MolmoAct2 implementation with lerobot codebase conventions:

- Rename hf_model/ to molmoact2_hf_model/
- Slim config: move all I/O and runtime logic to modeling
- Remove blanket  from 8 vendored files, fix 66 lint issues
- Deduplicate _hf_token() and _resolve_checkpoint_location()
- Make huggingface_hub imports lazy
- Remove custom MolmoAct2CosineDecayWithWarmupSchedulerConfig, use base class
- Extract 13 static/classmethods from MolmoAct2Policy to free functions
- Replace print() with logger in vendored action_tokenizer
- Add module docstrings, class docstring, and key method docstrings
- Add module-level loggers to modeling and processor
- Fix docs: pip to uv install, deduplicate README symlink
- Remove shebangs from all files
2026-06-25 14:19:35 +02:00
Eric Chan 324086abc3 Update follower arm description in documentation (#3780)
Signed-off-by: Eric Chan <hazzelnut@pm.me>
2026-06-25 13:58:08 +02:00
Maxime Ellerbach a66c7761a5 adjusting test to match expected values 2026-06-25 11:28:32 +00:00
Maxime Ellerbach 14bd51f28f updating uv lock and linting 2026-06-25 11:23:35 +00:00
Steven Palma b4e454c0ff feat(utils): display-independent keyboard controls for recording (Wayland / headless / macOS) (#3875)
* feat(utils): headless keyboard control

* refactor(utils): consolidate keyboard listener creation

* fix(rollout): remove import require guard for pynput

---------

Co-authored-by: Leo Toff <leo@toff.dev>
Co-authored-by: Stefano Maestri <stefano.maestri@javalinux.it>
Co-authored-by: Sahil Chande <85823961+SahilChande@users.noreply.github.com>
Co-authored-by: Vinayak Agarwal <63502278+Vinayak-Agarwal-2004@users.noreply.github.com>
Co-authored-by: Abdul Rahim Mirani <abdulrahimmirani@gmail.com>
2026-06-25 10:58:39 +02:00
someone114514 508d18f8a1 Fix ACT policy type examples in docs (#3792) 2026-06-25 08:59:07 +02:00
Alexandre Edmond 536b9621b2 Fix pi0fast model id in docs (#3855) 2026-06-24 11:44:03 +02:00
Maxime Ellerbach 5ee83f17a1 applying fixes 2026-06-24 09:28:07 +00:00
Jiwen Cai 79d4976ae2 fix(deps): pin cmeel-urdfdom <5 and cmeel-tinyxml2 <11 in placo-dep (#3873)
placo pulls in pin (Pinocchio), whose binary wheels dlopen specific cmeel
sonames (liburdfdom_sensor.so.4.0, libtinyxml2.so.10) but declare only `>=`
floors on their cmeel packages. The 2026-05-21 major bumps (cmeel-urdfdom
6.0.0 -> .so.6, cmeel-tinyxml2 11.0.0 -> .so.11) ship newer sonames, so left
unpinned the resolver grabs them and `import placo` fails at load with
"liburdfdom_sensor.so.4.0: cannot open shared object file".

#3647 capped placo and hardened the kinematics import, but the guard only
defers the failure: constructing RobotKinematics still raises. Pin the cmeel
packages to the 4.x / 10.x ABI the placo/pin wheels are built against (there
is no cmeel-urdfdom 5.x; <5 selects 4.x). Regenerated uv.lock with uv 0.8.0
to match CI; the only resolution change is the two cmeel versions (plus a
deterministic decord platform-marker cascade from 4.0.1's wider wheel set).

Fixes #3755
2026-06-24 11:23:25 +02:00
Gangwei XU e50308789c fix(lingbot-va): align RoboTwin evaluation (#3784)
Thank you for the RoboTwin fix and alignment!
2026-06-23 17:34:40 +00:00
Pepijn b5d3a5a5d3 docs(lingbot_va): condense processor normalization comments
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 17:31:37 +00:00
Pepijn 6c1220b8f0 docs(lingbot_va): point checkpoint paths at the lerobot org
The LeRobot-format checkpoints moved from pepijn223/* to lerobot/* (libero_long,
robotwin, base). Update the eval/train --policy.path examples accordingly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 17:31:36 +00:00
Pepijn 3061ca6661 refactor(lingbot_va): use built-in UnnormalizerProcessorStep for actions
Replace the bespoke LingBotVAActionUnnormalizeStep with the standard
UnnormalizerProcessorStep in QUANTILES mode, which computes the identical
(action + 1) / 2 * (q99 - q01) + q01 mapping. The per-channel q01/q99 are stored
as the step's saved state (a safetensors file) and restored on load; a fresh build
has no action stats so the step is an identity passthrough.

The 3 Hub checkpoints (lerobot/lingbot_va_{libero_long,robotwin,base}) have been
re-uploaded with the new post-processor (policy_postprocessor.json +
*_unnormalizer_processor.safetensors); reloading from the Hub round-trips q01/q99.

- processor_lingbot_va.py: drop the custom step + registry; build the post-processor
  with UnnormalizerProcessorStep (explicit ACTION->QUANTILES norm_map so the
  preprocessor / training path is unchanged).
- tests: assert the built-in step is used, identity-when-no-stats, correct quantile
  unnormalization, and a save_pretrained/from_pretrained stats round-trip.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 17:31:35 +00:00
Pepijn 2a7b7ea744 docs(lingbot_va): trim provenance comments; default wan path to base repo
- configuration_lingbot_va.py: drop the "──" decorations and the
  "(from transformer/config.json)" note; default wan_pretrained_path to
  robbyant/lingbot-va-base (has the frozen vae/text_encoder/tokenizer subfolders).
- modeling_lingbot_va.py: remove the vendored-code banner and the
  "(upstream wan_va/...)" section-header provenance/dash decorations; condense the
  transformer-dtype comment to one line.

No code changes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 17:31:35 +00:00
Pepijn 50b20c5bf1 docs(lingbot_va): trim verbose comments
- configuration_lingbot_va.py: condense multi-line field comments to one-liners
  (keep the ── section headers).
- processor_lingbot_va.py: shorten the action-quantile explanation block.
- modeling_lingbot_va.py: drop the bare "# ----" separator rules, keeping the
  one-line section headers.

No code changes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 17:31:34 +00:00
Pepijn c764afb8ef refactor(lingbot_va): drop hardcoded action quantiles; source from checkpoint
The LIBERO/RoboTwin action (un)normalization quantiles were hardcoded as module
constants in processor_lingbot_va.py. They are already serialized into each
checkpoint's policy_postprocessor.json (via LingBotVAActionUnnormalizeStep.get_config)
and restored on load by PolicyProcessorPipeline.from_pretrained, so the constants are
dead at eval/load time for the released checkpoints (verified: libero_long/robotwin/base
all carry their quantiles on the Hub).

- Remove LIBERO_ACTION_Q01/Q99, ROBOTWIN_ACTION_Q01/Q99 and _default_action_quantiles.
- make_lingbot_va_pre_post_processors now defaults a fresh (unconverted) build to a
  neutral [-1, 1] mapping (identity rescale); real per-benchmark stats come from the
  saved checkpoint (or postprocessor_overrides), analogous to dataset-stats normalization.
- Update the config doc comment to point at the checkpoint as the source of truth.
- Tests: replace the LIBERO-default assertion with a neutral-default check, and add a
  save_pretrained/from_pretrained round-trip guard for the quantile serialization.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 17:31:33 +00:00
Pepijn fa875eafb7 Update pyproject.toml
Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>
2026-06-23 17:31:32 +00:00
Pepijn 54e4926312 Update pyproject.toml
Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>
2026-06-23 17:31:31 +00:00
Pepijn 2471c23af5 Update lingbot_va.mdx
Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>
2026-06-23 17:31:30 +00:00
pepijn223 5422c99682 docs(lingbot_va): document EEF action-channel schema + camera order
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-23 17:31:30 +00:00
pepijn223 5131e6aa37 fix(lingbot_va): CI quality gate + fast-test collection
- Add tests/policies/lingbot_va/__init__.py so the test files don't clash by basename
  with tests/policies/vla_jepa/* under pytest's default import mode (fast-test collection error).
- Fix vendored typos flagged by the typos hook (pach_scale->patch_scale, total_tolen->
  total_token_len, stablized->stabilized) and a mypy union-attr in RoboTwinEnv._read_eef_pose.
- Apply Prettier formatting to docs/source/lingbot_va.mdx.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-23 17:31:29 +00:00
pepijn223 98ee5cdc22 feat(lingbot_va): implement training / fine-tuning (flow-matching loss)
- Implement LingBotVAPolicy.forward(): dual-stream flow-matching training loss
  (latent + action, timestep-weighted, action-masked) ported from upstream train.py;
  VAE-encodes camera clips, UMT5-encodes the task, noises both streams, runs the
  block-causal flex-attention training pass (forward_train).
- training_loss_from_streams() core + _build_training_streams() data prep (action
  scatter into the 30-d space, multi-frame VAE encode incl. robotwin_tshape).
- get_optim_params returns only trainable transformer params (LoRA/PEFT friendly);
  VAE/UMT5 stay frozen. Training needs attn_mode='flex'.
- Add a tiny-config single-training-step test (forward->loss->backward->AdamW) and a
  Training/fine-tuning section in the docs.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-23 17:31:28 +00:00
pepijn223 b81909fc28 feat(lingbot_va): RoboTwin eef-pose eval, single-file model, Hub checkpoints
Make the LingBot-VA port runnable on both LIBERO and RoboTwin and clean up the
package to LeRobot conventions.

- Consolidate all vendored Wan2.2 model code (transformer, attention, VAE helpers,
  flow-matching scheduler, grid utils, flex-attention) into a single
  modeling_lingbot_va.py; remove the separate wan_*/schedulers modules.
- Move the fixed action (un)normalization quantiles out of the config and into the
  post-processor (LIBERO 7-DoF + RoboTwin 16-d eef); remove the conversion script in
  favour of ready-to-use LeRobot-format checkpoints on the Hub.
- Fixes found via on-sim validation: undo LIBERO's 180-degree image flip
  (image_hflip), encode obs as a multi-frame streaming-VAE clip, reset the streaming
  VAE cache between episodes, run the transformer in config.dtype, lazy-load frozen
  VAE/UMT5 by subfolder with the text encoder on CPU.
- RoboTwin: add an end-effector-pose action mode to RoboTwinEnv (16-d per-arm
  xyz+quat+gripper deltas composed onto the initial eef pose, executed via CuRobo IK)
  and the robotwin_tshape latent layout (full-res head + half-res wrists via a second
  streaming VAE) with the upstream RoboTwin action quantiles + camera mapping.
- Predicted-video saving works for both benchmarks; docs + tests updated.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-23 17:31:27 +00:00
Pepijn d600a52943 feat(policies): add LingBot-VA autoregressive video-action world model
Port the LingBot-VA policy (Wan2.2 dual-stream video+action world model) into
LeRobot, following the EO-1 / VLA-JEPA conventions. Covers inference, checkpoint
conversion, and predicted-video saving (training is deferred to a follow-up PR).

- Vendored Wan transformer/attention/flex/VAE/scheduler modules (key names preserved
  for near-identity conversion); torch SDPA default, flashattn/flex lazy-guarded.
- LingBotVAConfig (registered "lingbot_va") + processor with fixed-quantile action
  unnormalization; full dual-stream sampling loop with CFG, two flow-matching
  schedulers and KV cache, mapped onto select_action with observed-keyframe feedback.
- convert_lingbot_va_checkpoints.py (libero/robotwin variants): bundles the ~5B
  transformer, lazy-pulls the frozen VAE+UMT5 from the source repo.
- Predicted-video plumbing in lerobot_eval (predicted_frames_callback; opt-in via
  --policy.save_predicted_video) and ConstantWithWarmupSchedulerConfig.
- pyproject: widen diffusers-dep to <0.37, add lingbot_va + imageio-dep extras,
  add lingbot_va and (missing) eo1 to `all`.
- Factory + policies/__init__ wiring, docs page + toctree, and tests.

Note: the LIBERO success-rate correctness gate must be validated on a CUDA GPU
with the converted checkpoint.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 17:30:31 +00:00
Khalil Meftah 6f0ba4be38 Record eval rollouts as LeRobot datasets (#3825)
* feat(eval): record eval rollouts as raw LeRobot datasets

- Record raw env observations inline during rollout(), before
preprocess_observation() transforms them. Uses LeRobotDataset.create()
with add_frame()/save_episode().

- Supports vectorized envs: each env in the batch records independently,
with save_episode() called per env on termination. Each task gets its
own dataset under output_dir/recordings/{task_group}_{task_id}/.

Enabled via --eval.recording=true; disabled by default.

* fix(eval): use FeatureType enum comparison instead of string value

* refactor(eval): per-env datasets recording, no double reset

- Extract _infer_shape_from_obs() to reduce nesting in feature conversion
- Move dataset creation into rollout() using its own env.reset() observation,
  eliminating the extra reset in run_one()
- Replace deepcopy with _shallow_copy_obs() for raw observation stashing
- Support batch_size > 1: each parallel env records to its own dataset
  (single env skips the env_0/ nesting for simplicity)
- One-time warning for env_features keys missing from observations
- Pass recording_dir + env_features through the call chain instead of
  a pre-built recording_dataset object

* refactor(eval): remove shape inference and shallow copy helpers

* feat(eval): optionally push recorded eval datasets to the Hub

* fix(eval): address review comments

- Wrap rollout loop in try/finally so finalize() runs on crash/interrupt
- Guard push_to_hub with num_episodes > 0 to avoid pushing empty datasets
- Hoist loop-invariant multi_env and base_repo_id out of creation loop
2026-06-23 14:03:57 +02:00
Maxime Ellerbach 73782447f2 feat(train): FSDP checkpoint saving (#3810)
* feat(train): FSDP checkpoint saving

* adding docs for FSDP

* adding a test for the fsdp checkpoint path

* cleanup

* fixing final upload to hub

* refactored initial implementation to use torch fsdp api and adding new tests
2026-06-22 13:51:21 +02:00
Khalil Meftah 2d7a42011a fix(policies): support offline batch inference for ACT and Diffusion (#3822)
- Guard ACT's KL divergence computation against None latent params to
prevent crashes during eval when use_vae is set but the forward path
returns no VAE outputs.
- Add offline batch fallback to Diffusion's predict_action_chunk() so
it works with dataloader batches (empty queues) in addition to the
existing online rollout path (populated queues). This enables batched
action prediction for offline evaluation.
2026-06-21 11:48:45 +02:00
Khalil Meftah b06ad40888 feat(hub): add pretrained_revision to pin Hub model versions (#3820)
- Add pretrained_revision field to PreTrainedConfig (policies) and
RewardModelConfig (reward models), and thread it through make_policy(),
make_pre_post_processors(), and make_reward_model() so that weights and
processor configs can be loaded from a specific Hub commit, branch, or
tag. Defaults to None (latest version, preserving current behavior).
Dataset and env hub loading already supported revision pinning.

Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
2026-06-19 18:32:47 +02:00
Khalil Meftah b3d74f80f0 Fix batch wandb logging metrics and handle scalar stats (#3821)
* fix(logging): batch wandb metrics

- Batch all metrics into a single wandb.log() call instead of one per
key, reducing API overhead.

- Add support for list-valued metrics by expanding them to indexed keys (e.g.
metric_0, metric_1).

* fix(stats): handle scalar stats robustly

- Wrap cast_stats_to_numpy with np.atleast_1d to prevent 0-d arrays
from scalar stats causing shape mismatches downstream.

* fix(logging): remove unused list-valued metric expansion

---------

Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
2026-06-19 18:31:12 +02:00
Khalil Meftah 552b4c3563 Add third-party env plugin discovery (#3823)
* feat(envs): add env plugin discovery

- Add 'lerobot_env_' to third-party plugin discovery prefixes, completing
the plugin system for all component types (robots, cameras, teleoperators,
policies, and now environments). External packages named lerobot_env_*
can self-register EnvConfig subclasses on import, enabling --env.type=
resolution without lerobot code changes.
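
Prefix-based discovery of this kind can be sketched with the standard library alone. The names below (`PLUGIN_PREFIXES`, `find_plugin_names`, `import_plugins`) are assumptions made for illustration and are not taken from the lerobot source; only the `lerobot_env_` prefix comes from the message above.

```python
import importlib
import pkgutil

# Assumption: a tuple of prefixes; only "lerobot_env_" is named in the commit.
PLUGIN_PREFIXES = ("lerobot_env_",)


def find_plugin_names(module_names, prefixes=PLUGIN_PREFIXES):
    """Return the sorted module names that start with one of the prefixes."""
    return sorted(name for name in module_names if name.startswith(prefixes))


def import_plugins(prefixes=PLUGIN_PREFIXES):
    """Import every installed top-level module whose name matches a prefix.

    Importing is what lets a plugin run its own registration code.
    """
    installed = (module.name for module in pkgutil.iter_modules())
    return [importlib.import_module(name) for name in find_plugin_names(installed, prefixes)]
```

Keeping the name filter separate from the import step makes the matching rule easy to test without installing any plugin package.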

* feat(envs): add generic observation passthrough

- Add generic observation passthrough in preprocess_observation() for
unhandled ndarray/tensor keys, replacing the pattern of adding per-env
hardcoded key handlers. Extra keys are forwarded as observation.<key>
and can be shaped by env-specific ProcessorSteps via get_env_processors().

---------

Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
2026-06-19 18:30:00 +02:00
Nicolas Rabault 8bf6056d14 docs: add LeLab web interface to README (#3831) 2026-06-17 18:22:21 +02:00
Caroline Pascal da92db8fc0 fix(image transforms): cleaning up image_transforms implementation in LeRobotDataset (#3829) 2026-06-17 11:50:09 +02:00
Caroline Pascal 2b0834bcb8 fix(cameras): snapshot stop_event in read loops to avoid None deref (#3812)
* Do not set stop_event to None when stopping thread

* fix(cameras): snapshot stop_event in read loops to avoid None deref
The background read loops accessed self.stop_event repeatedly while
_stop_read_thread() can reassign it to None after join(). Reading the
attribute across the loop condition (and a mid-loop re-check) was a
time-of-check/time-of-use race: stop_event could flip to None between
the `is None` test and the `.is_set()` call, raising AttributeError on
the worker thread.
Snapshot self.stop_event into a local once, guard it, and loop on the
local Event. The Event object is thread-safe and lives for the thread's
lifetime; _stop_read_thread() always calls .set() before nulling the
attribute, so the local observes the stop and exits cleanly. This also
lets us drop the redundant pre-lock stop check.
Applies to OpenCVCamera, RealSenseCamera, and ZMQ camera.
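
The snapshot pattern described above can be illustrated with a small standalone class. `Reader` is a hypothetical stand-in for a camera, not the actual OpenCVCamera, RealSenseCamera, or ZMQ camera code; the point is only that the loop tests a local `Event` rather than re-reading the attribute.

```python
import threading
import time


class Reader:
    """Hypothetical stand-in for a camera with a background read thread."""

    def __init__(self):
        self.stop_event = None
        self.thread = None
        self.frames_read = 0

    def _read_loop(self):
        # Snapshot the attribute once. The local keeps referring to the
        # Event even if stop() later reassigns self.stop_event to None.
        stop_event = self.stop_event
        if stop_event is None:
            return
        while not stop_event.is_set():
            self.frames_read += 1  # placeholder for reading a frame
            time.sleep(0.001)

    def start(self):
        self.stop_event = threading.Event()
        self.thread = threading.Thread(target=self._read_loop, daemon=True)
        self.thread.start()

    def stop(self):
        # Always set() before nulling, so the worker's local sees the stop.
        if self.stop_event is not None:
            self.stop_event.set()
        if self.thread is not None:
            self.thread.join(timeout=2.0)
        self.thread = None
        self.stop_event = None
```

Because the worker never dereferences `self.stop_event` after the snapshot, there is no window in which it can call `.is_set()` on `None`.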

---------

Co-authored-by: Anes Benmerzoug <anes.benmerzoug@gmail.com>
2026-06-17 11:40:17 +02:00
Caroline Pascal 287c823f13 fix(features copy): adding deepcopy on LeRobot dataset features to avoid shallow copy leaks (#3826)
* fix(features copy): adding deepcopy on LeRobot dataset features to avoid shallow copy leaks

* tests(test): adding new test
2026-06-16 17:58:59 +02:00
Pepijn 58ccc01508 fix(datasets): enforce one parquet row group per episode in v3 data writes (#3807)
* fix(datasets): enforce one parquet row group per episode in v3 data writes

LeRobot v3 data shards must hold exactly one row group per episode so a
reader can fetch episode i with pq.ParquetFile(path).read_row_group(i)
(a byte-range read) instead of loading the whole shard. The recording
writer already does this (one write_table per episode); the aggregate
and lerobot-annotate re-write paths instead concatenated many episodes
and wrote them in one shot, collapsing the file to a single row group.

- io_utils: add write_table_one_row_group_per_episode (one ParquetWriter,
  one write_table per episode — same pattern as the recording writer);
  to_parquet_with_hf_images embeds images then writes per-episode row
  groups; to_parquet_one_row_group_per_episode wraps it for plain frames
- aggregate: route non-image data writes through the per-episode writer;
  leave the episodes-metadata parquet untouched (already one row/episode)
- annotate: rewrite shards via the per-episode writer instead of a single
  bulk pq.write_table
- tests: invariant coverage through the aggregate (image + video) and
  annotate paths

No change to on-disk schema, paths, naming, rollover thresholds, or
compression. Readers stay backward-compatible (old collapsed files load).

* Update src/lerobot/datasets/io_utils.py

Co-authored-by: Caroline Pascal <caroline8.pascal@gmail.com>
Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>

* Update src/lerobot/datasets/io_utils.py

Co-authored-by: Caroline Pascal <caroline8.pascal@gmail.com>
Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>

* fix(datasets): correct indentation and add strict= in row-group helper

The web-edited numpy version of write_table_one_row_group_per_episode had an
over-indented line (IndentationError, breaking pre-commit + test collection)
and a zip() without strict=. Fix both; behaviour unchanged.

---------

Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>
Co-authored-by: Caroline Pascal <caroline8.pascal@gmail.com>
2026-06-16 12:15:48 +02:00