Commit Graph

1542 Commits

Author SHA1 Message Date
CarolinePascal 8a5869b093 fix(from_video_info): fixing early validation issue in from_video_info 2026-06-22 17:17:46 +02:00
CarolinePascal 0109b6f1cd test(cleaning): cleaning up tests 2026-06-22 17:17:45 +02:00
CarolinePascal d3eb18c2ba test(aggregate): extending aggregation tests to depth frames 2026-06-22 17:17:45 +02:00
CarolinePascal bf4a00eb35 feat(tools): adding depth support in LeRobotDataset edition tools 2026-06-22 17:17:45 +02:00
CarolinePascal 2b8a1e0ce9 feat(batched dequantization): optimizing dequantize_depth for torch based batched dequantization 2026-06-22 17:17:45 +02:00
CarolinePascal 44453ae966 fix(TIFF): add missing quantization and cleanup for TIFF files 2026-06-22 17:17:45 +02:00
CarolinePascal 551e861750 fix(typo): fixing typo 2026-06-22 17:17:45 +02:00
CarolinePascal 67ee4cfbb2 fix(normalization): restricting 255 normalization to non depth/uint8 images only 2026-06-22 17:17:45 +02:00
CarolinePascal 722eab2d07 fix(realsense): fixing typo in realsense serial number 2026-06-22 17:17:45 +02:00
CarolinePascal 509622ce41 tests(typos): fixing typos in tests 2026-06-22 17:17:45 +02:00
CarolinePascal 454f9f7d43 fix(info): fixing info metadata update when is_depth_map was set 2026-06-22 17:17:45 +02:00
CarolinePascal f3db25adce fix(pre-commit): fixing mutable default value 2026-06-22 17:17:45 +02:00
CarolinePascal 765bfabbb9 feat(refactor): refactor DepthEncoderConfig quantization pipeline so that the methods do not live in the config class. Add pixel format / channels validation. Move the default pixel format for depth to the config file. 2026-06-22 17:17:45 +02:00
CarolinePascal 8f6dd077fa feat(pix_fmt channels): use PyAV to get the number of channels of a pixel format 2026-06-22 17:17:45 +02:00
CarolinePascal aaec453190 tests(depth): adding new tests for depth integration validation 2026-06-22 17:17:44 +02:00
CarolinePascal 04b841dfb5 test(fix): fixing existing tests to still work with latest features 2026-06-22 17:17:44 +02:00
CarolinePascal eca73edbe2 chore(typos): fixing typos 2026-06-22 17:17:44 +02:00
CarolinePascal 71b3bce19a fix(plumbing): fixing missing parts in the depth maps pipeline 2026-06-22 17:17:44 +02:00
CarolinePascal aae75346b4 fix(stop_event): fixing stop_event race condition in camera classes 2026-06-22 17:17:38 +02:00
CarolinePascal e55a883b0a feat(is_depth): simplifying is_depth nested name + legacy support 2026-06-22 14:44:51 +02:00
CarolinePascal edd2854a2c feat(depth shape): ensuring depth map shapes always include the channel dimension 2026-06-22 14:44:51 +02:00
CarolinePascal 520c19904f chore(format): format code 2026-06-22 14:44:51 +02:00
CarolinePascal 4566a8d285 feat(depth maps writer): adding support for raw depth maps recording with image writer 2026-06-22 14:44:51 +02:00
CarolinePascal 11aa5d7785 feat(viz): render depth observations as rr.DepthImage in Viridis 2026-06-22 14:44:51 +02:00
CarolinePascal b05df095f5 feat(record): plumb DepthEncoderConfig through lerobot-record 2026-06-22 14:44:51 +02:00
CarolinePascal 2ccf794c80 feat(robots/so_follower): emit + populate depth keys when use_depth 2026-06-22 14:44:51 +02:00
CarolinePascal 52321f7bef feat(features): route 2D camera shapes to observation.depth.<key> 2026-06-22 14:44:50 +02:00
CarolinePascal 443a4a7445 feat(cameras/realsense): expose async depth in metric meters 2026-06-22 14:44:50 +02:00
CarolinePascal e3afbca893 feat(depth): wire DatasetReader to decode_depth_frames 2026-06-22 14:44:48 +02:00
CarolinePascal 4374bb4368 feat(depth): wire StreamingVideoEncoder + writer to depth encoder 2026-06-22 14:21:53 +02:00
CarolinePascal 172cdee0cc feat(depth): plumb DepthEncoderConfig through LeRobotDataset and DatasetWriter 2026-06-22 14:21:53 +02:00
CarolinePascal ee87804cb8 feat(depth): extend quantization tools to better fit the encoding/decoding pipeline 2026-06-22 14:21:53 +02:00
CarolinePascal 0826676c5e feat(depth): persist depth metadata 2026-06-22 14:21:52 +02:00
CarolinePascal b802ec163d feat(video): add ffv1 to supported codecs 2026-06-22 14:21:52 +02:00
CarolinePascal 0509162456 feat(depth): add depth quantization helpers and tests 2026-06-22 14:21:52 +02:00
Maxime Ellerbach 73782447f2 feat(train): FSDP checkpoint saving (#3810)
* feat(train): FSDP checkpoint saving

* adding docs for FSDP

* adding a test for the fsdp checkpoint path

* cleanup

* fixing final upload to hub

* refactored initial implementation to use torch fsdp api and adding new tests
2026-06-22 13:51:21 +02:00
Khalil Meftah 2d7a42011a fix(policies): support offline batch inference for ACT and Diffusion (#3822)
- Guard ACT's KL divergence computation against None latent params to
prevent crashes during eval when use_vae is set but the forward path
returns no VAE outputs.
- Add offline batch fallback to Diffusion's predict_action_chunk() so
it works with dataloader batches (empty queues) in addition to the
existing online rollout path (populated queues). This enables batched
action prediction for offline evaluation.
2026-06-21 11:48:45 +02:00
Khalil Meftah b06ad40888 feat(hub): add pretrained_revision to pin Hub model versions (#3820)
- Add pretrained_revision field to PreTrainedConfig (policies) and
RewardModelConfig (reward models), and thread it through make_policy(),
make_pre_post_processors(), and make_reward_model() so that weights and
processor configs can be loaded from a specific Hub commit, branch, or
tag. Defaults to None (latest version, preserving current behavior).
Dataset and env hub loading already supported revision pinning.

Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
2026-06-19 18:32:47 +02:00
Khalil Meftah b3d74f80f0 Fix batch wandb logging metrics and handle scalar stats (#3821)
* fix(logging): batch wandb metrics

- Batch all metrics into a single wandb.log() call instead of one per
key, reducing API overhead.

- Add support for list-valued metrics by expanding them to indexed keys (e.g.
metric_0, metric_1).

* fix(stats): handle scalar stats robustly

- Wrap cast_stats_to_numpy with np.atleast_1d to prevent 0-d arrays
from scalar stats causing shape mismatches downstream.

* fix(logging): remove unused list-valued metric expansion

---------

Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
2026-06-19 18:31:12 +02:00
Khalil Meftah 552b4c3563 Add third-party env plugin discovery (#3823)
* feat(envs): add env plugin discovery

- Add 'lerobot_env_' to third-party plugin discovery prefixes, completing
the plugin system for all component types (robots, cameras, teleoperators,
policies, and now environments). External packages named lerobot_env_*
can self-register EnvConfig subclasses on import, enabling --env.type=
resolution without lerobot code changes.

* feat(envs): add generic observation passthrough

- Add generic observation passthrough in preprocess_observation() for
unhandled ndarray/tensor keys, replacing the pattern of adding per-env
hardcoded key handlers. Extra keys are forwarded as observation.<key>
and can be shaped by env-specific ProcessorSteps via get_env_processors().

---------

Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
2026-06-19 18:30:00 +02:00
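The prefix-based discovery described in 552b4c3563 could be sketched roughly as follows. This is a minimal illustration using only the standard library, not the actual lerobot implementation: `PLUGIN_PREFIXES`, `find_plugin_names` and `import_plugins` are hypothetical names, and only the `lerobot_env_` prefix comes from the commit message.

```python
import importlib
import pkgutil

# Only "lerobot_env_" is named in the commit message; the tuple is illustrative.
PLUGIN_PREFIXES = ("lerobot_env_",)


def find_plugin_names(prefixes=PLUGIN_PREFIXES):
    """Return the names of importable top-level modules matching a plugin prefix."""
    return sorted(
        module.name
        for module in pkgutil.iter_modules()
        if module.name.startswith(prefixes)
    )


def import_plugins(prefixes=PLUGIN_PREFIXES):
    """Import every matching module so that its import-time registration code runs."""
    imported = []
    for name in find_plugin_names(prefixes):
        importlib.import_module(name)
        imported.append(name)
    return imported
```

Under this assumption, an installed package named `lerobot_env_something` would be imported by `import_plugins()`, and any config subclass it registers at import time would then be available for lookup.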
Nicolas Rabault 8bf6056d14 docs: add LeLab web interface to README (#3831) 2026-06-17 18:22:21 +02:00
Caroline Pascal da92db8fc0 fix(image transforms): cleaning up image_transforms implementation in LeRobotDataset (#3829) 2026-06-17 11:50:09 +02:00
Caroline Pascal 2b0834bcb8 fix(cameras): snapshot stop_event in read loops to avoid None deref (#3812)
* Do not set stop_event to None when stopping thread

* fix(cameras): snapshot stop_event in read loops to avoid None deref
The background read loops accessed self.stop_event repeatedly while
_stop_read_thread() can reassign it to None after join(). Reading the
attribute across the loop condition (and a mid-loop re-check) was a
time-of-check/time-of-use race: stop_event could flip to None between
the `is None` test and the `.is_set()` call, raising AttributeError on
the worker thread.
Snapshot self.stop_event into a local once, guard it, and loop on the
local Event. The Event object is thread-safe and lives for the thread's
lifetime; _stop_read_thread() always calls .set() before nulling the
attribute, so the local observes the stop and exits cleanly. This also
lets us drop the redundant pre-lock stop check.
Applies to OpenCVCamera, RealSenseCamera, and ZMQ camera.

---------

Co-authored-by: Anes Benmerzoug <anes.benmerzoug@gmail.com>
2026-06-17 11:40:17 +02:00
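The snapshot pattern described in 2b0834bcb8 can be illustrated with a small standalone sketch. `Reader` is a hypothetical stand-in for a camera class, not the real OpenCVCamera or RealSenseCamera code. The read loop copies `self.stop_event` into a local once and loops on that local, so nulling the attribute later cannot cause an AttributeError on the worker thread.

```python
import threading
import time
from typing import Optional


class Reader:
    """Hypothetical stand-in for a camera class with a background read loop."""

    def __init__(self) -> None:
        self.stop_event: Optional[threading.Event] = None
        self.thread: Optional[threading.Thread] = None
        self.frames_read = 0

    def _read_loop(self) -> None:
        # Snapshot the attribute once. The local keeps a reference to the Event
        # even if self.stop_event is reassigned to None by the stopping thread.
        stop_event = self.stop_event
        if stop_event is None:
            return
        while not stop_event.is_set():
            self.frames_read += 1  # placeholder for reading a frame
            time.sleep(0.001)

    def start(self) -> None:
        self.stop_event = threading.Event()
        self.thread = threading.Thread(target=self._read_loop, daemon=True)
        self.thread.start()

    def stop(self) -> None:
        if self.stop_event is not None:
            self.stop_event.set()  # signal first ...
        if self.thread is not None:
            self.thread.join(timeout=2.0)
        self.thread = None
        self.stop_event = None  # ... then null the attribute


reader = Reader()
reader.start()
time.sleep(0.05)
reader.stop()
```

Because `stop()` always calls `.set()` before clearing the attribute, the worker observes the stop through its local reference and exits on its next loop check.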
Caroline Pascal 287c823f13 fix(features copy): adding deepcopy on LeRobot dataset features to avoid shallow copy leaks (#3826)
* fix(features copy): adding deepcopy on LeRobot dataset features to avoid shallow copy leaks

* tests(test): adding new test
2026-06-16 17:58:59 +02:00
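The shallow copy leak that 287c823f13 guards against can be reproduced in a few lines. The `features` mapping below is a made-up example whose keys and nested layout are assumptions for illustration, not the actual LeRobotDataset schema.

```python
import copy

# Hypothetical features mapping; the keys and nested layout are illustrative only.
features = {
    "observation.state": {"dtype": "float32", "shape": [6], "names": ["j1", "j2"]},
}

shallow = dict(features)        # copies the outer dict only
deep = copy.deepcopy(features)  # copies the nested dicts and lists too

# Mutating through the shallow copy leaks into the original ...
shallow["observation.state"]["names"].append("leak")
# ... while mutating the deep copy does not.
deep["observation.state"]["names"].append("isolated")
```

After running this, `features["observation.state"]["names"]` contains `"leak"` but not `"isolated"`, which is the kind of cross-contamination a deep copy avoids.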
Pepijn 58ccc01508 fix(datasets): enforce one parquet row group per episode in v3 data writes (#3807)
* fix(datasets): enforce one parquet row group per episode in v3 data writes

LeRobot v3 data shards must hold exactly one row group per episode so a
reader can fetch episode i with pq.ParquetFile(path).read_row_group(i)
(a byte-range read) instead of loading the whole shard. The recording
writer already does this (one write_table per episode); the aggregate
and lerobot-annotate re-write paths instead concatenated many episodes
and wrote them in one shot, collapsing the file to a single row group.

- io_utils: add write_table_one_row_group_per_episode (one ParquetWriter,
  one write_table per episode — same pattern as the recording writer);
  to_parquet_with_hf_images embeds images then writes per-episode row
  groups; to_parquet_one_row_group_per_episode wraps it for plain frames
- aggregate: route non-image data writes through the per-episode writer;
  leave the episodes-metadata parquet untouched (already one row/episode)
- annotate: rewrite shards via the per-episode writer instead of a single
  bulk pq.write_table
- tests: invariant coverage through the aggregate (image + video) and
  annotate paths

No change to on-disk schema, paths, naming, rollover thresholds, or
compression. Readers stay backward-compatible (old collapsed files load).

* Update src/lerobot/datasets/io_utils.py

Co-authored-by: Caroline Pascal <caroline8.pascal@gmail.com>
Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>

* Update src/lerobot/datasets/io_utils.py

Co-authored-by: Caroline Pascal <caroline8.pascal@gmail.com>
Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>

* fix(datasets): correct indentation and add strict= in row-group helper

The web-edited numpy version of write_table_one_row_group_per_episode had an
over-indented line (IndentationError, breaking pre-commit + test collection)
and a zip() without strict=. Fix both; behaviour unchanged.

---------

Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>
Co-authored-by: Caroline Pascal <caroline8.pascal@gmail.com>
2026-06-16 12:15:48 +02:00
Caroline Pascal 38327fdc84 fix(images/videos): fixing aggregate_pipeline_dataset_features to avoid unwanted images features deletion (#3783)
* fix(images/videos): fixing aggregate_pipeline_dataset_features to avoid unwanted images features deletion when videos are not used

* fix(docstrings): improving docstrings

Signed-off-by: Caroline Pascal <caroline8.pascal@gmail.com>

---------

Signed-off-by: Caroline Pascal <caroline8.pascal@gmail.com>
2026-06-15 17:55:52 +02:00
Steven Palma 9555efc02c chore(dependencies): update uv.lock (#3595)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-06-15 16:29:44 +02:00
Steven Palma d576c59afb refactor(robots): homogenize bi-manual setups implementations (#3772)
* chore(robots): homogenize bi setups

* feat(robots): split openarm mini into single and bi

* refactor(robots): mixin for bi classes

* docs: update docs
2026-06-15 16:28:54 +02:00
Altman 8515d456be fix(datasets): avoid uint8 overflow in image stats (#3697)
* fix(datasets): avoid uint8 overflow in image stats

* fix(datasets): promote stats batches dynamically
2026-06-13 12:09:43 +02:00
Mahbod 30790de178 feat(edit-dataset): add concatenate_videos opt-out to merge (#3663)
* feat(edit-dataset): add `concatenate_videos` opt-out to merge

When merging datasets, source mp4s are concatenated into shards capped at
`video_files_size_in_mb` (default 200 MB). This is great for dataloader
throughput but destroys per-episode (or per-source) video boundaries,
which is undesirable when you want to inspect, ship, or reuse the
individual mp4s.

Add a `concatenate_videos: bool = True` knob plumbed through
`MergeConfig` → `merge_datasets` → `aggregate_datasets` → `aggregate_videos`.
When False, each source mp4 is copied 1:1 to its own destination mp4 with
no re-muxing, so the merge preserves source video boundaries.

Usage:

    lerobot-edit-dataset \
        --new_repo_id user/merged \
        --operation.type=merge \
        --operation.repo_ids "['user/a', 'user/b']" \
        --operation.concatenate_videos=false

Defaults are unchanged; the dataloader path is unaffected because the
`episodes.parquet` `from_timestamp`/`to_timestamp` index keeps working
regardless of whether each mp4 holds one or many episodes.

* feat(edit-dataset): extend concatenate opt-out to data files

Following review, add a concatenate_data flag mirroring concatenate_videos,
threaded through MergeConfig, merge_datasets, aggregate_datasets, aggregate_data
and append_or_create_parquet_file. Metadata index files still always concatenate.

Also trim the verbose docstrings and comments since the names are
self-explanatory, and extend the existing merge test to cover data files.
2026-06-12 20:05:04 +02:00