lerobot

mirror of https://github.com/huggingface/lerobot.git synced 2026-06-27 05:07:15 +00:00

Author	SHA1	Message	Date
Pepijn	4dbe83d3bc	Merge remote-tracking branch 'origin/main' into feat/smolvla-on-steerable # Conflicts: # docs/source/annotation_pipeline.mdx # examples/annotations/run_hf_job.py # pyproject.toml # src/lerobot/annotations/steerable_pipeline/config.py # src/lerobot/annotations/steerable_pipeline/frames.py # src/lerobot/annotations/steerable_pipeline/modules/plan_subtasks_memory.py # src/lerobot/annotations/steerable_pipeline/vlm_client.py # src/lerobot/annotations/steerable_pipeline/writer.py # src/lerobot/datasets/__init__.py # src/lerobot/datasets/sampler.py # src/lerobot/scripts/lerobot_annotate.py # src/lerobot/scripts/lerobot_train.py # tests/annotations/test_frames.py # tests/annotations/test_modules.py # tests/annotations/test_writer.py # tests/datasets/test_sampler.py # tests/scripts/test_lerobot_annotate.py # uv.lock	2026-06-23 11:07:53 +02:00
Maxime Ellerbach	73782447f2	feat(train): FSDP checkpoint saving (#3810) * feat(train): FSDP checkpoint saving * adding docs for FSDP * adding a test for the fsdp checkpoint path * cleanup * fixing final upload to hub * refactored initial implementation to use torch fsdp api and adding new tests	2026-06-22 13:51:21 +02:00
Khalil Meftah	2d7a42011a	fix(policies): support offline batch inference for ACT and Diffusion (#3822) - Guard ACT's KL divergence computation against None latent params to prevent crashes during eval when use_vae is set but the forward path returns no VAE outputs. - Add offline batch fallback to Diffusion's predict_action_chunk() so it works with dataloader batches (empty queues) in addition to the existing online rollout path (populated queues). This enables batched action prediction for offline evaluation.	2026-06-21 11:48:45 +02:00
Khalil Meftah	b06ad40888	feat(hub): add pretrained_revision to pin Hub model versions (#3820) - Add pretrained_revision field to PreTrainedConfig (policies) and RewardModelConfig (reward models), and thread it through make_policy(), make_pre_post_processors(), and make_reward_model() so that weights and processor configs can be loaded from a specific Hub commit, branch, or tag. Defaults to None (latest version, preserving current behavior). Dataset and env hub loading already supported revision pinning. Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>	2026-06-19 18:32:47 +02:00
Khalil Meftah	b3d74f80f0	Fix batch wandb logging metrics and handle scalar stats (#3821) * fix(logging): batch wandb metrics - Batch all metrics into a single wandb.log() call instead of one per key, reducing API overhead. - Add support for list-valued metrics by expanding them to indexed keys (e.g. metric_0, metric_1). * fix(stats): handle scalar stats robustly - Wrap cast_stats_to_numpy with np.atleast_1d to prevent 0-d arrays from scalar stats causing shape mismatches downstream. * fix(logging): remove unused list-valued metric expansion --------- Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>	2026-06-19 18:31:12 +02:00
Khalil Meftah	552b4c3563	Add third-party env plugin discovery (#3823) * feat(envs): add env plugin discovery - Add 'lerobot_env_' to third-party plugin discovery prefixes, completing the plugin system for all component types (robots, cameras, teleoperators, policies, and now environments). External packages named lerobot_env_* can self-register EnvConfig subclasses on import, enabling --env.type= resolution without lerobot code changes. * feat(envs): add generic observation passthrough - Add generic observation passthrough in preprocess_observation() for unhandled ndarray/tensor keys, replacing the pattern of adding per-env hardcoded key handlers. Extra keys are forwarded as observation.<key> and can be shaped by env-specific ProcessorSteps via get_env_processors(). --------- Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>	2026-06-19 18:30:00 +02:00
Nicolas Rabault	8bf6056d14	docs: add LeLab web interface to README (#3831)	2026-06-17 18:22:21 +02:00
Caroline Pascal	da92db8fc0	fix(image transforms): cleaning up image_transforms implementation in LeRobotDataset (#3829)	2026-06-17 11:50:09 +02:00
Caroline Pascal	2b0834bcb8	fix(cameras): snapshot stop_event in read loops to avoid None deref (#3812) * Do not set stop_event to None when stopping thread * fix(cameras): snapshot stop_event in read loops to avoid None deref The background read loops accessed self.stop_event repeatedly while _stop_read_thread() can reassign it to None after join(). Reading the attribute across the loop condition (and a mid-loop re-check) was a time-of-check/time-of-use race: stop_event could flip to None between the `is None` test and the `.is_set()` call, raising AttributeError on the worker thread. Snapshot self.stop_event into a local once, guard it, and loop on the local Event. The Event object is thread-safe and lives for the thread's lifetime; _stop_read_thread() always calls .set() before nulling the attribute, so the local observes the stop and exits cleanly. This also lets us drop the redundant pre-lock stop check. Applies to OpenCVCamera, RealSenseCamera, and ZMQ camera. --------- Co-authored-by: Anes Benmerzoug <anes.benmerzoug@gmail.com>	2026-06-17 11:40:17 +02:00
Caroline Pascal	287c823f13	fix(features copy): adding deepcopy on LeRobot dataset features to avoid shallow copy leaks (#3826) * fix(features copy): adding deepcopy on LeRobot dataset features to avoid shallow copy leaks * tests(test): adding new test	2026-06-16 17:58:59 +02:00
Pepijn	58ccc01508	fix(datasets): enforce one parquet row group per episode in v3 data writes (#3807 ) * fix(datasets): enforce one parquet row group per episode in v3 data writes LeRobot v3 data shards must hold exactly one row group per episode so a reader can fetch episode i with pq.ParquetFile(path).read_row_group(i) (a byte-range read) instead of loading the whole shard. The recording writer already does this (one write_table per episode); the aggregate and lerobot-annotate re-write paths instead concatenated many episodes and wrote them in one shot, collapsing the file to a single row group. - io_utils: add write_table_one_row_group_per_episode (one ParquetWriter, one write_table per episode — same pattern as the recording writer); to_parquet_with_hf_images embeds images then writes per-episode row groups; to_parquet_one_row_group_per_episode wraps it for plain frames - aggregate: route non-image data writes through the per-episode writer; leave the episodes-metadata parquet untouched (already one row/episode) - annotate: rewrite shards via the per-episode writer instead of a single bulk pq.write_table - tests: invariant coverage through the aggregate (image + video) and annotate paths No change to on-disk schema, paths, naming, rollover thresholds, or compression. Readers stay backward-compatible (old collapsed files load). * Update src/lerobot/datasets/io_utils.py Co-authored-by: Caroline Pascal <caroline8.pascal@gmail.com> Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com> * Update src/lerobot/datasets/io_utils.py Co-authored-by: Caroline Pascal <caroline8.pascal@gmail.com> Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com> * fix(datasets): correct indentation and add strict= in row-group helper The web-edited numpy version of write_table_one_row_group_per_episode had an over-indented line (IndentationError, breaking pre-commit + test collection) and a zip() without strict=. Fix both; behaviour unchanged. 
--------- Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com> Co-authored-by: Caroline Pascal <caroline8.pascal@gmail.com>	2026-06-16 12:15:48 +02:00
Caroline Pascal	38327fdc84	fix(images/videos): fixing aggregate_pipeline_dataset_features to avoid unwanted images features deletion (#3783) * fix(images/videos): fixing aggregate_pipeline_dataset_features to avoid unwanted images features deletion when videos are not used * fix(docstrings): improving docstrings Signed-off-by: Caroline Pascal <caroline8.pascal@gmail.com> --------- Signed-off-by: Caroline Pascal <caroline8.pascal@gmail.com>	2026-06-15 17:55:52 +02:00
Steven Palma	9555efc02c	chore(dependencies): update uv.lock (#3595) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2026-06-15 16:29:44 +02:00
Steven Palma	d576c59afb	refactor(robots): homogenize bi-manual setups implementations (#3772) * chore(robots): homogenize bi setups * feat(robots): split openarm mini into single and bi * refactor(robots): mixin for bi classes * docs: update docs	2026-06-15 16:28:54 +02:00
pepijn223	3427499212	feat(pi052): condition low-level prompt on state + fix eval slowdown - Inject discretized proprioceptive state (256 bins, pi05 format) into low-level action-conditioning prompts in both training (PI052TextTokenizerStep) and eval (_with_low_level_subtask_prompt), matching the recipe's documented "[images, subtask, state]" intent. Higher-level subtask/memory text streams stay state-free. - Cache the loc-token tokenizer (_get_loc_tokenizer) instead of reloading it from disk on every _build_text_batch/select_message call (it ran twice per env per replan and dominated eval runtime). - Add a KV cache to select_message decode (bit-identical output to the recompute path) to avoid O(n^2) generation. Net: pi052 eval ~2.9 s/it -> ~0.1 s/it (~25x). Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-14 13:57:55 +02:00
Altman	8515d456be	fix(datasets): avoid uint8 overflow in image stats (#3697) * fix(datasets): avoid uint8 overflow in image stats * fix(datasets): promote stats batches dynamically	2026-06-13 12:09:43 +02:00
Mahbod	30790de178	feat(edit-dataset): add `concatenate_videos` opt-out to merge (#3663 ) * feat(edit-dataset): add `concatenate_videos` opt-out to merge When merging datasets, source mp4s are concatenated into shards capped at `video_files_size_in_mb` (default 200 MB). This is great for dataloader throughput but destroys per-episode (or per-source) video boundaries, which is undesirable when you want to inspect, ship, or reuse the individual mp4s. Add a `concatenate_videos: bool = True` knob plumbed through `MergeConfig` → `merge_datasets` → `aggregate_datasets` → `aggregate_videos`. When False, each source mp4 is copied 1:1 to its own destination mp4 with no re-muxing, so the merge preserves source video boundaries. Usage: lerobot-edit-dataset \ --new_repo_id user/merged \ --operation.type=merge \ --operation.repo_ids "['user/a', 'user/b']" \ --operation.concatenate_videos=false Defaults are unchanged; the dataloader path is unaffected because the `episodes.parquet` `from_timestamp`/`to_timestamp` index keeps working regardless of whether each mp4 holds one or many episodes. * feat(edit-dataset): extend concatenate opt-out to data files Following review, add a concatenate_data flag mirroring concatenate_videos, threaded through MergeConfig, merge_datasets, aggregate_datasets, aggregate_data and append_or_create_parquet_file. Metadata index files still always concatenate. Also trim the verbose docstrings and comments since the names are self-explanatory, and extend the existing merge test to cover data files.	2026-06-12 20:05:04 +02:00
Pepijn	cec8ee0be6	feat: language annotation pipeline (#3471 ) Steerable annotation pipeline (lerobot-annotate) that populates the language_persistent and language_events columns introduced in PR 1 (#3467) directly into data/chunk-/file-.parquet. This is PR 2 of the three-PR plan: PR 1 (Add extensive language support #3467): schema + DSL + rendering, base of this PR PR 2 (this PR): annotation pipeline writing into PR 1's columns PR 3: model with language prediction and runtime A VLM (Qwen-VL family, served on vLLM) watches each episode's video and emits grounded language annotations: subtasks, plans, memory, task rephrasings, interjections + speech, and per-camera VQA. The pipeline is built for production annotation at scale — single-camera grounding, embedded-frame inputs, a describe-then-segment grounding flow, and a deterministic full-episode coverage guarantee — informed by Scale's dense-captioning findings (representation > sampling, rules > reasoning, model capacity is the biggest lever, two-pass systems compound errors)	2026-06-12 15:12:33 +02:00
Nikodem Bartnik	02b315ab6a	Docs/model card improvements (#3634 ) * update policy deployment instruction with rollout * add port and fix formatting * add more base models to generate model card * updated and extended model descriptions * fix bug * improved and extended structure * exclude the templates from config * add images and visualize dataset button * add all policies we have docs for * remove policies without the docs * new fields, improved examples	2026-06-12 13:26:52 +02:00
Pepijn	234c768dfb	feat(datasets): deterministic, resumable shuffling for EpisodeAwareSampler (#3769 ) * fix(datasets): expose a generator on EpisodeAwareSampler for distributed shuffle sync In distributed training, accelerate can only synchronize the shuffle permutation across ranks when the sampler exposes a generator attribute. EpisodeAwareSampler shuffled via the global torch RNG, so disjoint batch shards relied on every rank's global CPU RNG staying in lockstep forever; any rank-asymmetric RNG consumption (e.g. eval rollouts on the main process only) silently desynced the permutations and ranks trained on overlapping/missing samples. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(train): seed sampler generator and gate dataset download per node - Pass a generator seeded with cfg.seed to EpisodeAwareSampler so accelerator.prepare registers it as the synchronized RNG and the shuffle order is reproducible. - Gate the initial make_dataset call on is_local_main_process instead of is_main_process: the global main process only exists on node 0, so on every other node all local ranks were downloading the dataset and building the Arrow cache concurrently. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * feat(datasets): add DeterministicEpisodeAwareSampler with O(1) memory and sample-exact resume Add a sampler that never materializes frame indices: it stores only per-episode boundaries (numpy, a few bytes per episode) and maps logical positions to frame indices on the fly with searchsorted. Shuffling uses a seeded Feistel permutation over [0, num_frames) (cycle-walking to the exact domain), so the data order is a pure function of (seed, epoch): - no RNG state to synchronize across distributed ranks, - constant memory and zero epoch-boundary cost at any dataset size, - O(1) seek to any position, enabling sample-exact resume. Opt in with --deterministic_sampler=true. 
On resume, lerobot-train maps the checkpointed step back to (epoch, start_index) via compute_sampler_state and continues at the exact sample where the run left off (up to accelerate's even_batches padding at epoch boundaries). The shuffle is pseudo-random rather than a true uniform permutation, the standard trade-off in large-scale training loaders. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * refactor(datasets): fold deterministic mode into EpisodeAwareSampler Instead of a parallel DeterministicEpisodeAwareSampler class, extend the existing EpisodeAwareSampler with a deterministic=True mode (seeded Feistel permutation, epoch auto-advance, state_dict/load_state_dict). The default mode is behavior-identical: same torch.randperm consumption and the same generator contract accelerate synchronizes; the O(N) Python index list is replaced by O(num_episodes) boundary arrays in both modes, with `indices` kept as a back-compat property. Passing a generator together with deterministic=True is rejected, and the state/seek methods raise outside deterministic mode. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * feat(train): enable deterministic_sampler by default Deterministic data order (sample-exact resume, no cross-rank RNG sync, O(1) sampler memory) is now the default for map-style training; set deterministic_sampler=false to restore the legacy RNG-based shuffle. Streaming datasets ignore the flag (the sampler path only applies to map-style datasets), replacing the previous hard validation error so streaming configs keep working with the new default. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * feat(datasets): default EpisodeAwareSampler to deterministic mode and trim comments deterministic=True is now the class default as well as the training default; the legacy RNG path requires an explicit deterministic=False (the train script's non-deterministic branch passes it). Docstrings and inline comments slimmed down across the changed files. 
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * test(sampler): drain resumed trillion-frame sampler via iter() to avoid list() prealloc list(sampler) calls PyObject_LengthHint -> __len__ (the full 10*12 epoch length) and preallocates that many slots before iterating, OOMing even though the resumed epoch only yields 3 frames. Collect through the iterator (no length hint) so the test exercises the real O(1) seek/drain instead of CPython's list growth heuristic. fix(datasets): guard Feistel cycle-walking loop against non-convergence Replace the unbounded while True in EpisodeAwareSampler._permute with a bounded for loop capped at _MAX_CYCLE_WALK_STEPS (100) and raise RuntimeError if the cycle-walk fails to land in [0, num_frames). The loop is expected to converge in <4 steps on the chosen power-of-two domain, so the bound is a safety net that should never trip in practice but prevents a pathological infinite loop. https://claude.ai/code/session_01HQ15tFrBsHYScjGWosEv22 * fix(datasets): make deterministic-sampler resume robust to world-size changes compute_sampler_state mapped a checkpointed step back to (epoch, start_index) using the current num_processes, but the number of sampler positions a step consumes scales with the world size that produced it. Resuming on a different GPU count therefore landed on the wrong epoch/offset, silently re-seeing or skipping data. Record num_processes in training_step.json at checkpoint time and feed the checkpoint's value into compute_sampler_state on resume, so the data order resumes at the right position regardless of the new world size. Warn when the world size changed (the global offset is correct, but per-rank sample-exactness needs the same topology). Old checkpoints without the field fall back to the current world size. Also document compute_sampler_state's assumptions explicitly: num_processes / batch_size must match the checkpointing run, and accelerate's even_batches=True padding is mirrored by the ceil(... 
/ num_processes) term. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Co-authored-by: Cursor <cursoragent@cursor.com> * style: apply ruff-format to lerobot_train.py Collapse the compute_sampler_state(...) call onto one line so the ruff-format pre-commit hook passes (fixes the failing CI check). Co-authored-by: Cursor <cursoragent@cursor.com> * refactor(datasets): use seeded torch.randperm instead of Feistel in EpisodeAwareSampler Drop the Feistel permutation (and its SplitMix64 hash / cycle-walking) in favor of a torch.randperm seeded from (seed, epoch). The deterministic mode keeps its key properties - data order is a pure function of (seed, epoch), so it reproduces on every rank with no global-RNG synchronization, and - state_dict / load_state_dict still resume sample-exactly, now by regenerating the epoch's permutation and slicing from the saved offset. Construction stays O(num_episodes) (only episode boundaries are stored, never a per-frame index list). The trade-off vs Feistel: the per-epoch shuffle is again O(num_frames) memory (the randperm tensor) and no longer O(1)-seekable, in exchange for ~30 fewer LOC and a truly uniform shuffle. Tests updated: the trillion-frame O(1) test is replaced with a boundary-storage check and a scale resume-exactness test. Co-authored-by: Cursor <cursoragent@cursor.com> * refactor(datasets): make EpisodeAwareSampler always deterministic With Feistel gone, deterministic and legacy modes were both just torch.randperm and the deterministic path strictly dominated (reproducible across ranks via the (seed, epoch) seed, no accelerate generator sync, resumable). 
Collapse to a single path and drop the redundant flag: - remove the `deterministic` and `generator` constructor args, `_iter_default`, and `_require_deterministic`; `set_epoch` / `state_dict` / `load_state_dict` are now unconditional - remove the `deterministic_sampler` train config field and the legacy generator branch in lerobot_train.py (non-streaming map datasets always use the sampler) - drop the now-obsolete generator/legacy tests Note: removes the `generator` kwarg from EpisodeAwareSampler (back-compat break vs main); the order is now a pure function of (seed, epoch), so no cross-rank RNG sync is needed. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(datasets): address sampler review (batch_size resume guard + docs) - Record batch_size in training_step.json alongside num_processes and feed the checkpoint's value into compute_sampler_state on resume; warn when it differs (per-rank sample-exactness needs the same batch size). - Document the set_epoch vs __iter__ auto-advance coupling on EpisodeAwareSampler (callers should rely on exactly one mechanism per run). - Note the broadened (reproducibility-breaking) sampler guard and the no-generator distributed sharding correctness in lerobot_train.py. - Add load_training_batch_size + parallel tests. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(train): download dataset once on the global main process Gate the training dataset download on the global is_main_process (download once to the shared dataset root, barrier, then every other rank reads the already-populated copy) instead of per-node is_local_main_process. LeRobotDataset skips its snapshot_download when try_load() succeeds, so no rank re-downloads. Assumes the dataset root / HF cache is on storage shared across nodes. 
Co-authored-by: Cursor <cursoragent@cursor.com> * chore(datasets): trim sampler comment and drop duplicate tests Remove the verbose dataloader-guard comment and the two EpisodeAwareSampler tests that duplicated existing validation/warning coverage (no coverage loss). Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Claude Fable 5 <noreply@anthropic.com> Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-12 11:47:16 +02:00
Caroline Pascal	0e9bd9e6fb	feat(trim): adding optional trimming option in reencode_video (#3779) * feat(trim): adding optional trimming option in reencode_video * tests(trim): add trimming test --------- Co-authored-by: Pepijn <138571049+pkooij@users.noreply.github.com>	2026-06-12 11:29:26 +02:00
Steven Palma	87242cfced	chore(dependecies): relax grpc-related bounds (#3777) Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>	2026-06-11 19:13:14 +02:00
Steven Palma	1edc83a0ef	feat(training): bump accelerate + use reduction types for tracked metrics in a multi rank setup (#3773) * feat(training): bump accelerate + use reduction types for tracked metrics in a multi rank setup * chore: address feedback	2026-06-11 19:07:28 +02:00
Steven Palma	6fbcf67249	chore: update readme (#3774) * chore: update readme * chore: update authors in project readme	2026-06-11 18:17:26 +02:00
Pepijn	41166b39fb	fix(train): synchronize EpisodeAwareSampler shuffling across ranks and gate dataset download per node (#3768 ) * fix(datasets): expose a generator on EpisodeAwareSampler for distributed shuffle sync In distributed training, accelerate can only synchronize the shuffle permutation across ranks when the sampler exposes a generator attribute. EpisodeAwareSampler shuffled via the global torch RNG, so disjoint batch shards relied on every rank's global CPU RNG staying in lockstep forever; any rank-asymmetric RNG consumption (e.g. eval rollouts on the main process only) silently desynced the permutations and ranks trained on overlapping/missing samples. * fix(train): seed sampler generator and gate dataset download per node - Pass a generator seeded with cfg.seed to EpisodeAwareSampler so accelerator.prepare registers it as the synchronized RNG and the shuffle order is reproducible. - Gate the initial make_dataset call on is_local_main_process instead of is_main_process: the global main process only exists on node 0, so on every other node all local ranks were downloading the dataset and building the Arrow cache concurrently.	2026-06-11 11:07:42 +02:00
Steven Palma	79c6821407	chore(dependecies): update mujoco transitives (#3756)	2026-06-10 12:58:55 +02:00
Steven Palma	507083249f	Revert "fix(pyproject): adding ceiling bound on mujoco (<3.9.0) (#3751)" (#3754) This reverts commit `bd22407d93`.	2026-06-10 10:38:42 +02:00
Caroline Pascal	bd22407d93	fix(pyproject): adding ceiling bound on mujoco (<3.9.0) (#3751) * fix(pyproject): adding ceiling bound on mujoco (<3.9.0) * chore(uv.lock): updating uv.lock * fix(linux): adding missing linux dependencies * chore(uv.lock): updating uv.lock	2026-06-09 23:31:43 +02:00
Adil Zouitine	49755a3d9e	feat(processor): Add in-memory processor pipeline serialization (#3732) * feat(processor): add in-memory pipeline serialization Expose processor pipeline config and tensor state without requiring temporary files, so processors can be transported, compared, or hashed directly in memory. * feat(processor): enhance DataProcessorPipeline with registry support - Added a new RegisteredLazyTensorStateStep for registry-based serialization tests. - Improved state filename handling in _get_state_filename method. - Refactored validation logic in _validate_loaded_config to simplify parameter types. - Updated tests to verify registry step functionality and ensure correct state loading. * refactor(processor): update state handling in DataProcessorPipeline - Introduced a new static method _get_state_key to derive in-memory state keys from serialized filenames. - Updated state_dict and load_state_dict methods to use suffixless state keys instead of filenames. - Adjusted related tests to reflect changes in state key handling, ensuring consistency in state management * fix(processor): update loaded_config argument description in DataProcessorPipeline - Clarified the documentation for the loaded_config parameter to indicate that it may be a non-dictionary value, enhancing understanding for future developers.	2026-06-08 11:27:24 +02:00
Pepijn	c5965d4971	Merge branch 'main' into feat/smolvla-on-steerable	2026-06-08 11:02:54 +02:00
Maxime Ellerbach	09808183ca	feat(rollout): adding episodic strategy (#3717) * feat(rollout): adding legacy strategy * adding legacy to existing tests * updating docs and docstring * changing misleading docstring Signed-off-by: Maxime Ellerbach <maxime@ellerbach.net> * adding extra guard like dagged with try except finally * Potential fix for pull request finding Signed-off-by: Maxime Ellerbach <maxime@ellerbach.net> * adding reset to initial position * moving smooth teleop handover to control_utils and adding this behavior to legacy strategy * reducing duration of the handover * * renaming to episodic * changing semantics of the docstring * fixing leader - follower handover disable torque * adding optional config to disable handover * wiring the smooth_leader_follower_handover config * renaming config smooth_leader_to_follower_handover --------- Signed-off-by: Maxime Ellerbach <maxime@ellerbach.net>	2026-06-06 00:32:38 +02:00
pepijn223	470fdd195d	fix(ema): default EMA decay to 0.99 Matches openpi's top-level default (ema_decay=0.99, ~last 100 steps). Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-05 16:10:00 +02:00
pepijn223	384feca91a	fix(ema): default EMAConfig.enable to False (opt-in) EMA was on by default, so every training run on the branch (incl. VLA-JEPA and other non-flow-matching policies) created a full fp32 shadow copy. EMA only benefits flow-matching/diffusion policies (pi0/pi05/pi052). Make it opt-in via --ema.enable=true; the pi05/pi052 recipes already pass that flag. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-05 16:09:08 +02:00
pepijn223	7b35af6eca	Merge remote-tracking branch 'origin/main' into feat/smolvla-on-steerable Co-authored-by: Cursor <cursoragent@cursor.com> # Conflicts: # uv.lock	2026-06-05 14:38:47 +02:00
pepijn223	aca02ff24c	fix(robocasa): align env state/action order to openpi/robocasa convention LeRobot's RoboCasaEnv used a divergent flat state/action layout vs the robocasa package (robocasa.utils.env_utils.convert_action) and the openpi robocasa pipeline. This scrambles I/O when using openpi-convention checkpoints (e.g. the JAX->PyTorch->LeRobot converted pi05 robocasa model: CloseFridge 20% -> 60% once both orders match openpi). - convert_action: ee_pos(3)+ee_rot(3)+gripper(1)+base_motion(4)+control_mode(1) - observation.state: ee_pos_rel(3)+ee_rot_rel(4)+base_pos(3)+base_rot(4)+gripper(2) Matches openpi examples/robocasa/main.py + RobocasaInputs ordering. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-05 13:47:43 +02:00
pepijn223	de7ba67556	style: drop decorative === comment banners from pi052 split Replace the === separator banners (against repo style) with plain comments. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-04 20:21:10 +02:00
pepijn223	c020c0d053	refactor(pi052): split pi05_backbone into pi_gemma + modeling_pi052 Eliminate the standalone pi052/pi05_backbone.py by distributing its contents: - Generic dual-expert transformer machinery -> lerobot/policies/pi_gemma.py (sdpa_attention_forward, compute_layer_complete, PaliGemmaWithExpertModel, get_gemma_config; the openpi width/depth config is renamed GemmaConfig -> GemmaVariantConfig to avoid clashing with transformers' GemmaConfig). These sit next to the existing PiGemma layer code they already depend on. - pi052-specific model + helpers -> pi052/modeling_pi052.py (PI05Pytorch, ActionSelectKwargs, make_att_2d_masks, pad_vector, resize_with_pad_torch, create_sinusoidal_pos_embedding, sample_beta, get_safe_dtype). DEFAULT_IMAGE_SIZE is duplicated as a plain constant in pi_gemma to avoid a pi_gemma -> pi05 import cycle. Additive to pi_gemma; pi0/pi05 unaffected. Verified bit-exact on pepijn223/pi052_robocasa_full (embed/predict/forward identical) and all 34 pi052 tests pass. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-04 20:18:18 +02:00
pepijn223	4cbd91a04e	chore: drop one-off bench/build/train scripts from the PR Remove development-only tooling that doesn't belong in the PR: - examples/benchmark/* (pi052 step/kernel benchmark slurm + harness) - examples/port_datasets/slurm_build_robocasa_composite_seen.py and src/lerobot/scripts/build_robocasa_composite_seen.py (composite_seen dataset build scripts) - scripts/build_episode_filter.py, scripts/build_robocasa_smoke.sh, scripts/train_pi052_human300_exclude_unannotated.sh None are imported by the library, tests, or entry points. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-04 20:05:25 +02:00
pepijn223	afe30630cc	test(pi052): repair stale-name CE tests for fused linear CE _fast_ce/_shifted_ce were renamed to _fast_lin_ce/_shifted_lin_ce and changed from logits-based to Liger fused-linear-CE (hidden @ lm_head_weightᵀ). Update the tests via thin adapters that pass an identity lm_head_weight (so the computed logits equal the provided ones), run on CUDA (Liger is GPU-only) and skip otherwise, and loosen the allclose tolerance to absorb GPU-vs-CPU float noise on the tiny losses. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-04 20:03:18 +02:00
pepijn223	a594ad7969	refactor(pi052): self-contained policy; revert pi0/pi05 to upstream main The smolvla branch had modified the shared pi0/pi05 modeling + pi05 config to support pi052 (SDPA attention, layernorm/lm_head handling, optimizer foreach/fused/lm_head_lr_scale, embedding scaling). Decouple pi052 instead: - Vendor the PI0.5 backbone (PaliGemmaWithExpertModel, PI05Pytorch, helpers) into pi052/pi05_backbone.py (verbatim copy, no PI05Policy). - Flatten PI052Policy to subclass PreTrainedPolicy directly (no longer PI05Policy); inline the needed PI05Policy methods. - Restore optimizer_foreach/fused + get_optimizer_preset on PI052Config. - Revert pi0, pi0_fast, pi05 modeling and configuration_pi05 to origin/main (byte-identical), so the shared policies carry no smolvla modifications. Behavior verified bit-exact on pepijn223/pi052_robocasa_full: embed_language_ tokens, predict_action_chunk, and the fused flow+text+FAST training loss are identical before/after (max_abs_diff=0). pi052 tests pass (pre-existing stale-name collection errors unchanged). Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-04 19:59:27 +02:00
Maxime Ellerbach	2e9cd87bbd	feat(policies): add VLA-JEPA (#3568) * first commit * feat(policies): add VLA-JEPA * feat(policies): add VLA-JEPA * support vla_jepa * (feat)policies: add VLA-JEPA * linting * adding deps to pyproject.toml * updating uv lock * adding guards to avoid needing transformers and diffusers for type checking and basic tests * fixing action and state dim * fix warnings with qwen processor kwargs * fixing wm_loss not propagating * adjusting obs steps, tubelet size to match original implementation * some more fixes to be closer to the original implementation * adding more tests to ensure good coverage * align VLA-JEPA architecture with original checkpoint - Remove stale `action_num_heads` / `action_attention_head_dim` config fields; DiT head dimensions are now always derived from the preset (DiT-B/L/test). - Add `num_target_vision_tokens` and `action_max_seq_len` config fields required by the action head's future-token embedding and positional embedding tables. - Fix default `qwen_model_name` to 2B (matches all released checkpoints). - Rename `ActionEncoder` attrs w1/w2/w3 → layer1/layer2/layer3 to match checkpoint key names; replace `nn.Sequential` decoder/state-encoder with `_MLP2` (layer1/layer2 naming). - Fix `VLAJEPAActionHead` to size ActionEncoder and StateEncoder at `inner_dim` (DiT input width) rather than `action_hidden_size` (DiT output width). - Rename `DiT.blocks` → `transformer_blocks` and `attn` → `attn1` to match checkpoint; add alternating cross/self attention (even blocks cross-attend to Qwen context, odd blocks self-attend). - Add `DiT-test` preset for unit tests. - Rewrite `ActionConditionedVideoPredictor` with explicit ViT-style blocks (`_PredictorBlock` with fused qkv) to match checkpoint structure; rename `encoder`/`norm`/`proj` → `predictor_blocks`/`predictor_norm`/`predictor_proj`. 
* propagate action_is_pad masking through VLA-JEPA policy pipeline Pass the `action_is_pad` tensor from the batch through to the action head so padded timesteps are excluded from the flow-matching loss. * update VLA-JEPA tests for arch changes and action_is_pad - Switch conftest to use `action_model_type="DiT-test"` now that `action_num_heads` / `action_attention_head_dim` have been removed. - Add action_head tests covering fully-padded loss (zero) and equivalence of action_is_pad=None vs all-zeros mask. - Remove obsolete `test_native_to_lerobot_wm_only` test. * add VLA-JEPA documentation Covers architecture overview, pretrained checkpoints, config reference, training/eval commands for LIBERO-10, and guidance on fine-tuning for single-camera datasets. * add one-shot script to convert ginwind/VLA-JEPA checkpoints to safetensors (will remove once migrated) * make default params more aligned with paper and pretrained models - adding possibility of freezing qwen backbone and world model - added tests for weight loading * trying out to re-init the action head to avoid pretraining dimension mismatch * allow different state dim and action dim * removing misleading future_action_window_size to just use chunk_size * lots of changes to make existing weights work, need to massively refactor the pre and post processing * refactoring into using pre and post processor * pre-commit cleanup * fixing doc defaults args Signed-off-by: Maxime Ellerbach <maxime@ellerbach.net> * addressing dtype zeros issue * adding guard for diffusers * fixing training and eval examples * trying to close success rate gap * fix qwen norm layer output libero eval is now as expected * adding instructions for different embodiment + fixing some tests * smol fix to avoid having default CPU device when training * fixing misconception about multiview / singleview handling * removing conversion script * adding licences * adding .mdx docs and shortening polivy_vla_jepa_README.md * removing useless pre-processor 
* cleanup * removing swish in favor of silu * adding configuration gripper index and threshold * fixing symlink --------- Signed-off-by: Maxime Ellerbach <maxime@ellerbach.net> Co-authored-by: ginwind <ginwind@mail.ustc.edu.cn>	2026-06-04 19:22:51 +02:00
pepijn223	8292548f0d	fix(pi052): stop double-scaling FAST/text token embeddings embed_language_tokens already applies Gemma's sqrt(hidden) normalizer (GemmaTextScaledWordEmbedding, transformers >=5.4.0). pi052 multiplied FAST action-token and autoregressive subtask-text embeddings by sqrt(emb_dim) on top of that, double-scaling them (~2048x). Remove the manual scaling so FAST and text tokens are single-scaled, consistent with the pi05 fix and OpenPI. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-04 18:31:41 +02:00
pepijn223	77cc35b932	fix(pi0,pi05,pi0_fast): stop double-scaling text embeddings transformers >=5.4.0 (PR #44432) makes Gemma's embed_tokens a GemmaTextScaledWordEmbedding that already multiplies token embeddings by sqrt(hidden_size). The manual `* sqrt(embed_dim)` applied on top therefore double-scaled text (~2048x instead of ~45x), breaking VLM alignment for models trained/run on stock transformers. Remove the manual scaling and rely on embed_tokens' internal normalizer (matches main #3603). Image features stay raw (un-normalized), as before. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-04 18:22:34 +02:00
pepijn223	f0757fc707	fix(pi0,pi0_fast): scale text embeddings by sqrt(embed_dim) to match OpenPI OpenPI (pi0 and pi0-FAST) multiplies language token embeddings by sqrt(embed_dim) — the Gemma embedder normalizer — before the transformer. LeRobot pi0/pi0_fast omitted it, leaving text tokens ~45x under-scaled relative to the residual stream (same class of bug as the pi05 image scaling). pi0: applied in embed_prefix's lang_embed_func. pi0_fast: applied inside embed_language_tokens so prompt, FAST action tokens, and autoregressive next-token embeds are all scaled consistently. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-04 18:14:27 +02:00
pepijn223	a48d4e32a1	fix(pi05): don't scale image features by sqrt(hidden_size) lerobot/pi05_base was trained in the OpenPI/big_vision regime where image (soft) tokens are NOT multiplied by the Gemma embedder normalizer (sqrt(hidden_size)) — only text tokens are. Scaling image features here over-scaled them ~45x, breaking the pretrained vision-language alignment and yielding ~0% closed-loop success on RoboCasa across all pi05 runs. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-04 17:20:34 +02:00
Pepijn	9596e3d53f	Merge remote-tracking branch 'origin/feat/smolvla-on-steerable' into feat/smolvla-on-steerable	2026-06-04 17:14:33 +02:00
Pepijn	0a6a799317	Merge feat/language-annotation-pipeline into feat/smolvla-on-steerable Bring the authoritative annotation pipeline from the annotation branch. The annotation surface is forced to EXACTLY match feat/language-annotation-pipeline (the annotation branch is the source of truth for annotation code), which also removes smolvla's stale copies: - deleted: steerable_pipeline/vocabulary.py, tests/annotations/test_vocabulary.py, prompts/module_0_vocabulary.txt, module_1_action_record.txt, module_3_vqa.txt, module_1_plan.txt, and the old module_* prompt names (now plan_/interjections_/vqa.txt). - synced: all of src/lerobot/annotations/, lerobot_annotate.py, examples/annotations/, tests/annotations/, datasets/language.py, tests/datasets/test_language.py, docs/annotation_pipeline.mdx. Non-annotation conflicts resolved by union (keeping both branches' intent): - pyproject.toml: keep smolvla's pi extra (+sentencepiece) and add the molmoact2 extra from main. - policies/factory.py: keep both dataset_repo_id (pi052 FAST tokenizer) and dataset_meta (both are referenced); union the policy-type docstring. - scripts/lerobot_train.py: keep smolvla's pi052 / use_relative_actions processor-rebuild block. - uv.lock: regenerated from the merged pyproject. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-04 17:13:36 +02:00
pepijn	e660a51e78	pi052(debug): drop misleading inference/parity dump from text preds The first-token parity check re-tokenized the decoded (stripped) inference string, so the leading-space SentencePiece variant always mismatched the training argmax — a false "DIVERGED" alarm. Remove the autoregressive inference print and parity comparison (and the now-dead per-sample select_message generation), keeping only the prompt, ground-truth target, and teacher-forced argmax accuracy. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-04 13:32:44 +00:00
Pepijn	cdd94a703f	annotate(config): tighten field comments to one line each Collapse the remaining multi-line field comments / docstrings in config.py to single lines (or two where a knob genuinely needs it), keeping the essential rationale. Comments only — no field or behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-04 15:12:31 +02:00
Pepijn	cd59c8b312	annotate: remove the action_record style/feature entirely Drop the optional structured per-subtask action records — not a feature we want to ship. * language.py: remove 'action_record' from CORE_STYLES + PERSISTENT_STYLES (and the matching assertion in tests/datasets/test_language.py). * config.py: delete ActionRecordsConfig (verb/grasp vocabularies, frames_per_subtask, emit_record_row) and the PlanConfig.action_records field. * plan_subtasks_memory.py: delete _extract_action_record and the run_episode block that emitted style='action_record' rows; drop the now-unused json / to_image_blocks imports. * remove the plan_action_record.txt prompt. * run_hf_job.py: drop the action_records comment. Verified: 40 tests pass; pre-commit (ruff, mypy, bandit) clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-04 14:40:34 +02:00


1849 Commits