- Batch all metrics into a single wandb.log() call instead of one per
key, reducing API overhead.
- Add support for list-valued metrics by expanding them to indexed keys (e.g.
metric_0, metric_1).
* feat(edit-dataset): add `concatenate_videos` opt-out to merge
When merging datasets, source mp4s are concatenated into shards capped at
`video_files_size_in_mb` (default 200 MB). This is great for dataloader
throughput but destroys per-episode (or per-source) video boundaries,
which is undesirable when you want to inspect, ship, or reuse the
individual mp4s.
Add a `concatenate_videos: bool = True` knob plumbed through
`MergeConfig` → `merge_datasets` → `aggregate_datasets` → `aggregate_videos`.
When False, each source mp4 is copied 1:1 to its own destination mp4 with
no re-muxing, so the merge preserves source video boundaries.
Usage:
lerobot-edit-dataset \
--new_repo_id user/merged \
--operation.type=merge \
--operation.repo_ids "['user/a', 'user/b']" \
--operation.concatenate_videos=false
Defaults are unchanged; the dataloader path is unaffected because the
`episodes.parquet` `from_timestamp`/`to_timestamp` index keeps working
regardless of whether each mp4 holds one or many episodes.
* feat(edit-dataset): extend concatenate opt-out to data files
Following review, add a concatenate_data flag mirroring concatenate_videos,
threaded through MergeConfig, merge_datasets, aggregate_datasets, aggregate_data
and append_or_create_parquet_file. Metadata index files still always concatenate.
Also trim the verbose docstrings and comments since the names are
self-explanatory, and extend the existing merge test to cover data files.
Steerable annotation pipeline (lerobot-annotate) that populates the language_persistent and language_events columns introduced in PR 1 (#3467) directly into data/chunk-*/file-*.parquet.
This is PR 2 of the three-PR plan:
PR 1 (Add extensive language support #3467): schema + DSL + rendering, base of this PR
PR 2 (this PR): annotation pipeline writing into PR 1's columns
PR 3: model with language prediction and runtime
A VLM (Qwen-VL family, served on vLLM) watches each episode's video and emits grounded language annotations: subtasks, plans, memory, task rephrasings, interjections + speech, and per-camera VQA. The pipeline is built for production annotation at scale — single-camera grounding, embedded-frame inputs, a describe-then-segment grounding flow, and a deterministic full-episode coverage guarantee — informed by Scale's dense-captioning findings (representation > sampling, rules > reasoning, model capacity is the biggest lever, two-pass systems compound errors)
* update policy deployment instruction with rollout
* add port and fix formatting
* add more base models to generate model card
* updated and extended model descriptions
* fix bug
* improved and extended structure
* exclude the templates from config
* add images and visualize dataset button
* add all policies we have docs for
* remove policies without the docs
* new fields, improved examples
* fix(datasets): expose a generator on EpisodeAwareSampler for distributed shuffle sync
In distributed training, accelerate can only synchronize the shuffle
permutation across ranks when the sampler exposes a generator attribute.
EpisodeAwareSampler shuffled via the global torch RNG, so disjoint batch
shards relied on every rank's global CPU RNG staying in lockstep forever;
any rank-asymmetric RNG consumption (e.g. eval rollouts on the main
process only) silently desynced the permutations and ranks trained on
overlapping/missing samples.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* fix(train): seed sampler generator and gate dataset download per node
- Pass a generator seeded with cfg.seed to EpisodeAwareSampler so
accelerator.prepare registers it as the synchronized RNG and the
shuffle order is reproducible.
- Gate the initial make_dataset call on is_local_main_process instead of
is_main_process: the global main process only exists on node 0, so on
every other node all local ranks were downloading the dataset and
building the Arrow cache concurrently.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* feat(datasets): add DeterministicEpisodeAwareSampler with O(1) memory and sample-exact resume
Add a sampler that never materializes frame indices: it stores only
per-episode boundaries (numpy, a few bytes per episode) and maps logical
positions to frame indices on the fly with searchsorted. Shuffling uses a
seeded Feistel permutation over [0, num_frames) (cycle-walking to the
exact domain), so the data order is a pure function of (seed, epoch):
- no RNG state to synchronize across distributed ranks,
- constant memory and zero epoch-boundary cost at any dataset size,
- O(1) seek to any position, enabling sample-exact resume.
Opt in with --deterministic_sampler=true. On resume, lerobot-train maps
the checkpointed step back to (epoch, start_index) via
compute_sampler_state and continues at the exact sample where the run
left off (up to accelerate's even_batches padding at epoch boundaries).
The shuffle is pseudo-random rather than a true uniform permutation, the
standard trade-off in large-scale training loaders.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* refactor(datasets): fold deterministic mode into EpisodeAwareSampler
Instead of a parallel DeterministicEpisodeAwareSampler class, extend the
existing EpisodeAwareSampler with a deterministic=True mode (seeded
Feistel permutation, epoch auto-advance, state_dict/load_state_dict).
The default mode is behavior-identical: same torch.randperm consumption
and the same generator contract accelerate synchronizes; the O(N) Python
index list is replaced by O(num_episodes) boundary arrays in both modes,
with `indices` kept as a back-compat property. Passing a generator
together with deterministic=True is rejected, and the state/seek methods
raise outside deterministic mode.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* feat(train): enable deterministic_sampler by default
Deterministic data order (sample-exact resume, no cross-rank RNG sync,
O(1) sampler memory) is now the default for map-style training; set
deterministic_sampler=false to restore the legacy RNG-based shuffle.
Streaming datasets ignore the flag (the sampler path only applies to
map-style datasets), replacing the previous hard validation error so
streaming configs keep working with the new default.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* feat(datasets): default EpisodeAwareSampler to deterministic mode and trim comments
deterministic=True is now the class default as well as the training
default; the legacy RNG path requires an explicit deterministic=False
(the train script's non-deterministic branch passes it). Docstrings and
inline comments slimmed down across the changed files.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* test(sampler): drain resumed trillion-frame sampler via iter() to avoid list() prealloc
list(sampler) calls PyObject_LengthHint -> __len__ (the full 10**12 epoch length) and
preallocates that many slots before iterating, OOMing even though the resumed epoch only
yields 3 frames. Collect through the iterator (no length hint) so the test exercises the
real O(1) seek/drain instead of CPython's list growth heuristic.
* fix(datasets): guard Feistel cycle-walking loop against non-convergence
Replace the unbounded while True in EpisodeAwareSampler._permute with a
bounded for loop capped at _MAX_CYCLE_WALK_STEPS (100) and raise
RuntimeError if the cycle-walk fails to land in [0, num_frames). The
loop is expected to converge in <4 steps on the chosen power-of-two
domain, so the bound is a safety net that should never trip in practice
but prevents a pathological infinite loop.
https://claude.ai/code/session_01HQ15tFrBsHYScjGWosEv22
* fix(datasets): make deterministic-sampler resume robust to world-size changes
compute_sampler_state mapped a checkpointed step back to (epoch, start_index)
using the *current* num_processes, but the number of sampler positions a step
consumes scales with the world size that produced it. Resuming on a different
GPU count therefore landed on the wrong epoch/offset, silently re-seeing or
skipping data.
Record num_processes in training_step.json at checkpoint time and feed the
checkpoint's value into compute_sampler_state on resume, so the data order
resumes at the right position regardless of the new world size. Warn when the
world size changed (the global offset is correct, but per-rank sample-exactness
needs the same topology). Old checkpoints without the field fall back to the
current world size.
Also document compute_sampler_state's assumptions explicitly: num_processes /
batch_size must match the checkpointing run, and accelerate's even_batches=True
padding is mirrored by the ceil(... / num_processes) term.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
* style: apply ruff-format to lerobot_train.py
Collapse the compute_sampler_state(...) call onto one line so the
ruff-format pre-commit hook passes (fixes the failing CI check).
Co-authored-by: Cursor <cursoragent@cursor.com>
* refactor(datasets): use seeded torch.randperm instead of Feistel in EpisodeAwareSampler
Drop the Feistel permutation (and its SplitMix64 hash / cycle-walking) in favor of a
torch.randperm seeded from (seed, epoch). The deterministic mode keeps its key properties
- data order is a pure function of (seed, epoch), so it reproduces on every rank with no
global-RNG synchronization, and
- state_dict / load_state_dict still resume sample-exactly, now by regenerating the epoch's
permutation and slicing from the saved offset.
Construction stays O(num_episodes) (only episode boundaries are stored, never a per-frame
index list). The trade-off vs Feistel: the per-epoch shuffle is again O(num_frames) memory
(the randperm tensor) and no longer O(1)-seekable, in exchange for ~30 fewer LOC and a truly
uniform shuffle. Tests updated: the trillion-frame O(1) test is replaced with a
boundary-storage check and a scale resume-exactness test.
Co-authored-by: Cursor <cursoragent@cursor.com>
* refactor(datasets): make EpisodeAwareSampler always deterministic
With Feistel gone, deterministic and legacy modes were both just torch.randperm and the
deterministic path strictly dominated (reproducible across ranks via the (seed, epoch) seed,
no accelerate generator sync, resumable). Collapse to a single path and drop the redundant
flag:
- remove the `deterministic` and `generator` constructor args, `_iter_default`, and
`_require_deterministic`; `set_epoch` / `state_dict` / `load_state_dict` are now unconditional
- remove the `deterministic_sampler` train config field and the legacy generator branch in
lerobot_train.py (non-streaming map datasets always use the sampler)
- drop the now-obsolete generator/legacy tests
Note: removes the `generator` kwarg from EpisodeAwareSampler (back-compat break vs main); the
order is now a pure function of (seed, epoch), so no cross-rank RNG sync is needed.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(datasets): address sampler review (batch_size resume guard + docs)
- Record batch_size in training_step.json alongside num_processes and feed
the checkpoint's value into compute_sampler_state on resume; warn when it
differs (per-rank sample-exactness needs the same batch size).
- Document the set_epoch vs __iter__ auto-advance coupling on EpisodeAwareSampler
(callers should rely on exactly one mechanism per run).
- Note the broadened (reproducibility-breaking) sampler guard and the no-generator
distributed sharding correctness in lerobot_train.py.
- Add load_training_batch_size + parallel tests.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(train): download dataset once on the global main process
Gate the training dataset download on the global is_main_process (download once to the
shared dataset root, barrier, then every other rank reads the already-populated copy)
instead of per-node is_local_main_process. LeRobotDataset skips its snapshot_download
when try_load() succeeds, so no rank re-downloads. Assumes the dataset root / HF cache is
on storage shared across nodes.
Co-authored-by: Cursor <cursoragent@cursor.com>
* chore(datasets): trim sampler comment and drop duplicate tests
Remove the verbose dataloader-guard comment and the two EpisodeAwareSampler tests
that duplicated existing validation/warning coverage (no coverage loss).
Co-authored-by: Cursor <cursoragent@cursor.com>
---------
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(datasets): expose a generator on EpisodeAwareSampler for distributed shuffle sync
In distributed training, accelerate can only synchronize the shuffle
permutation across ranks when the sampler exposes a generator attribute.
EpisodeAwareSampler shuffled via the global torch RNG, so disjoint batch
shards relied on every rank's global CPU RNG staying in lockstep forever;
any rank-asymmetric RNG consumption (e.g. eval rollouts on the main
process only) silently desynced the permutations and ranks trained on
overlapping/missing samples.
* fix(train): seed sampler generator and gate dataset download per node
- Pass a generator seeded with cfg.seed to EpisodeAwareSampler so
accelerator.prepare registers it as the synchronized RNG and the
shuffle order is reproducible.
- Gate the initial make_dataset call on is_local_main_process instead of
is_main_process: the global main process only exists on node 0, so on
every other node all local ranks were downloading the dataset and
building the Arrow cache concurrently.
* feat(processor): add in-memory pipeline serialization
Expose processor pipeline config and tensor state without requiring temporary files, so processors can be transported, compared, or hashed directly in memory.
* feat(processor): enhance DataProcessorPipeline with registry support
- Added a new RegisteredLazyTensorStateStep for registry-based serialization tests.
- Improved state filename handling in _get_state_filename method.
- Refactored validation logic in _validate_loaded_config to simplify parameter types.
- Updated tests to verify registry step functionality and ensure correct state loading.
* refactor(processor): update state handling in DataProcessorPipeline
- Introduced a new static method _get_state_key to derive in-memory state keys from serialized filenames.
- Updated state_dict and load_state_dict methods to use suffixless state keys instead of filenames.
- Adjusted related tests to reflect changes in state key handling, ensuring consistency in state management
* fix(processor): update loaded_config argument description in DataProcessorPipeline
- Clarified the documentation for the loaded_config parameter to indicate that it may be a non-dictionary value, enhancing understanding for future developers.
* first commit
* feat(policies): add VLA-JEPA
* feat(policies): add VLA-JEPA
* support vla_jepa
* (feat)policies: add VLA-JEPA
* linting
* adding deps to pyproject.toml
* updating uv lock
* adding guards to avoid needing transformers and diffusers for type checking and basic tests
* fixing action and state dim
* fix warnings with qwen processor kwargs
* fixing wm_loss not propagating
* adjusting obs steps, tublets size to match original implementation
* some more fixes to be closer to the original implem
* adding more tests to ensure good coverage
* align VLA-JEPA architecture with original checkpoint
- Remove stale `action_num_heads` / `action_attention_head_dim` config fields;
DiT head dimensions are now always derived from the preset (DiT-B/L/test).
- Add `num_target_vision_tokens` and `action_max_seq_len` config fields required
by the action head's future-token embedding and positional embedding tables.
- Fix default `qwen_model_name` to 2B (matches all released checkpoints).
- Rename `ActionEncoder` attrs w1/w2/w3 → layer1/layer2/layer3 to match
checkpoint key names; replace `nn.Sequential` decoder/state-encoder with
`_MLP2` (layer1/layer2 naming).
- Fix `VLAJEPAActionHead` to size ActionEncoder and StateEncoder at `inner_dim`
(DiT input width) rather than `action_hidden_size` (DiT output width).
- Rename `DiT.blocks` → `transformer_blocks` and `attn` → `attn1` to match
checkpoint; add alternating cross/self attention (even blocks cross-attend to
Qwen context, odd blocks self-attend).
- Add `DiT-test` preset for unit tests.
- Rewrite `ActionConditionedVideoPredictor` with explicit ViT-style blocks
(`_PredictorBlock` with fused qkv) to match checkpoint structure; rename
`encoder`/`norm`/`proj` → `predictor_blocks`/`predictor_norm`/`predictor_proj`.
* propagate action_is_pad masking through VLA-JEPA policy pipeline
Pass the `action_is_pad` tensor from the batch through to the action head
so padded timesteps are excluded from the flow-matching loss.
* update VLA-JEPA tests for arch changes and action_is_pad
- Switch conftest to use `action_model_type="DiT-test"` now that
`action_num_heads` / `action_attention_head_dim` have been removed.
- Add action_head tests covering fully-padded loss (zero) and equivalence
of action_is_pad=None vs all-zeros mask.
- Remove obsolete `test_native_to_lerobot_wm_only` test.
* add VLA-JEPA documentation
Covers architecture overview, pretrained checkpoints, config reference,
training/eval commands for LIBERO-10, and guidance on fine-tuning for
single-camera datasets.
* add one-shot script to convert ginwind/VLA-JEPA checkpoints to safetensors (will remove once migrated)
* make default params more aligned with paper and pretrained models
- adding possibility of freezing qwen backbone and world model
- added tests for weight loading
* trying out to re-init the action head to avoid pretraining dimension mismatch
* allow different state dim and action dim
* removing missleading future_action_window_size to just use chunk_size
* lots of changes to make existing weights work, need to massively refactor the pre and post processing
* refactoring into using pre and post processor
* pre-commit cleanup
* fixing doc defaults args
Signed-off-by: Maxime Ellerbach <maxime@ellerbach.net>
* adressing dtype zeros issue
* adding guard for diffusers
* fixing training and exal examples
* trying to close success rate gap
* fix qwen norm layer output libero eval is now as expected
* adding instructions for different embodiement + fixing some tests
* smol fix to avoid having default CPU device when training
* fixing misconception about multiview / singleview handling
* removing conversion script
* adding licences
* adding .mdx docs and shortening polivy_vla_jepa_README.md
* removing useless pre-processor
* cleanup
* removing swish in favor of silu
* adding configuration gripper index and threshold
* fixing simlink
---------
Signed-off-by: Maxime Ellerbach <maxime@ellerbach.net>
Co-authored-by: ginwind <ginwind@mail.ustc.edu.cn>
The docs pointed at src/lerobot/datasets/v30/, which does not exist.
Both scripts actually live in src/lerobot/scripts/:
- convert_dataset_v21_to_v30.py
- augment_dataset_quantile_stats.py
Updated the four references (one python -m module path and three
file-path invocations) to the correct location, matching each
script's own usage docstring.
* fix(train): enable relative action overrides for pretrained processors
Keep pretrained processor pipelines when use_relative_actions is enabled and
apply relative/absolute action processor settings through overrides. Rename the
relative action processor registry key to relative_actions_processor.
* fix(config): reject rename_map without pretrained checkpoint
Fail fast when rename_map is set during fresh initialization, since fresh
configs derive feature names from the current dataset and no rename is applied.
---------
Co-authored-by: Pepijn <138571049+pkooij@users.noreply.github.com>
* feat(rewards): add TOPReward reward model
* refactor(rewards): clean up TOPReward processor/model
* fix(rewards/topreward): add missing input keys mm_token_type_ids
* fix(rewards/topreward): fix pyproject extra typo and simplify processor (#3653)
Add lerobot[topreward] extra to all in
pyproject.toml, drop the redundant labels arg in scoring, and
collapse the dead-branch shape check in the encoder processor.
* optmize topreward input processing (#3660)
---------
Co-authored-by: Cole <91766445+jcoleharrison@users.noreply.github.com>
Co-authored-by: Haoming Song <haomingsong24@gmail.com>
PR #3145 added YAML support for policy.path but left two bugs:
1. extract_path_fields_from_config only deleted config_data[field] when
no sibling overrides existed. With siblings, the dict stayed in place
and draccus crashed decoding it as PreTrainedConfig (no 'type' key).
Sibling overrides go into _config_yaml_overrides and are applied later
by from_pretrained(), so the field can always be removed.
2. wrap() updated config_path_cli to the cleaned temp file path but
never propagated it to the draccus.parse fallback branch. cli_args
still contained --config_path=<original>, so draccus read the
original YAML with path: still present.
Tests passed because they (a) called extract_path_fields_from_config
directly and (b) included type: alongside path: in the YAML, sidestepping
both bugs.
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
* fix(deps): cap placo below 0.9.16 and harden kinematics import
placo 0.9.16 links against liburdfdom_sensor.so.4, which is unavailable
on Ubuntu 24.04 (noble ships urdfdom 3.x). Importing placo on that base
crashes with:
ImportError: liburdfdom_sensor.so.4.0: cannot open shared object file
This broke nightly Latest Deps tests (CPU and GPU) when the lockfile
upgrade picked placo 0.9.16, since lerobot.model.kinematics
unconditionally imports placo when _placo_available is true, and that
check (importlib.util.find_spec) cannot detect dlopen failures of
transitive shared libraries — so unrelated subsystems (RL actor,
gym_manipulator) became unimportable.
Two changes:
1. Pin placo to <0.9.16 in pyproject.toml + regenerate uv.lock
(0.9.16 → 0.9.15). Short-term unblock for nightly CI until system
urdfdom 4.x is broadly available.
2. Harden the import guard in src/lerobot/model/kinematics.py:
wrap 'import placo' in try/except ImportError so a missing
transitive .so no longer crashes module import. RobotKinematics
instantiation now raises an informative ImportError citing the
underlying dlopen failure via _raise_if_placo_unusable().
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(kinematics): hoist _placo_runtime_error to module scope for mypy
Mypy walks the TYPE_CHECKING branch in which the runtime else-block is
not executed, so _placo_runtime_error was only defined at runtime and
mypy reported 'Name "_placo_runtime_error" is not defined' on the
three references inside _raise_if_placo_unusable. Declare the symbol
unconditionally at module scope with a default of None; the runtime
import-failure branch still assigns to it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* style(kinematics): drop verbose comments around placo import guard
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(gr00t): sync with #3606 for fixing gr00t config crash
* fix(pi0&pi05): fix graph break caused by deepcopy of past_key_values in sample_actions
* fix(pi0&pi05): fix frequent recompile caused by compute_layer_complete
* feat(test): add compile test and benchamrk for pi0 and pi05
* feat(test): add comprehensive testing for pi0 and pi05. Including processor, forward, sample action, etc.
- Fixed broken API examples in Lerobot Imitation Learning Documentation
- Teleoperation with cameras improved by adding a fixed frequency in the loop (without it the cameras feed gets very slow)
- Wrapped record example script in main() to avoid problems on Mac
- Previously teleoperation example was using SO-ARM and teleoperation with cameras was using Koch. I changed it to use SO-ARM in all of the examples.
- Added section on how to train with HF Jobs - CLI and Python examples
- Replaced lerobot-record with lerobot-rollout in policies examples
VideoDecoderCache used an unbounded dict keyed on absolute path, with no
eviction in the standard LeRobotDataset path. With shuffled iteration over
datasets that have many distinct mp4 files, every DataLoader worker
accumulated one cached (VideoDecoder, fsspec file handle) pair per distinct
path it had ever touched. Per-entry cost is ~3-5 MB of host RAM plus one
open FD; at ~8 k entries this is roughly 30 GB per worker.
This was hit in the wild during a SmolVLA training run on a 4,195-episode
SO-101 dataset (8,390 mp4s, two cameras per episode). dmesg showed
anon-rss climbing to 34.9 GB on a single pt_data_worker before the OOM
killer fired ~30 min into training; with --num_workers=8 the per-worker
peak halved to 17.9 GB, which is the expected inverse-scaling signature
when the leak is per-decode and the workload is split across workers. The
working workaround on the affected platform was --dataset.video_backend=pyav,
because the pyav path opens/closes per call and never touches this cache.
Switch the backing store to an OrderedDict and evict LRU entries when the
cap is reached, closing the evicted file handle inside the lock so we do
not leak FDs either. Default cap is DEFAULT_DECODER_CACHE_SIZE = 100,
overridable via LEROBOT_VIDEO_DECODER_CACHE_SIZE or by passing max_size=
to the constructor; max_size=None restores the legacy unbounded behaviour
for callers that need it.
Validation on the original failing workload (decode_video_frames_torchcodec
called over real mp4s from the affected SO-101 dataset):
unbounded: 300 files -> +1087 MB host RSS, cache=300, still climbing
cap=50: 500 files -> +266 MB host RSS, cache=50, stable
cap=50: 2000 calls -> +312 MB host RSS, cache=50, stable
cap=100: 1000 calls -> +470 MB host RSS, cache=100, stable
Three independent seeded runs at cap=50 agreed to within 1% (263 / 266 /
265 MB delta), and the 2000-call multi-pass run shows RSS plateaus after
the cap is reached instead of drifting.
Tests in tests/datasets/test_video_decoder_cache.py cover:
default-is-bounded, size cap, LRU ordering, FD close on eviction, FD close
on clear(), cache-hit invariance, max_size=None fallback, and env-var
override. No regressions in test_video_encoding.py, test_streaming.py, or
test_dataset_reader.py (73 prior tests still pass alongside the 8 new ones).
* feat(utility): adding video re-encode utility
* feat(edit): adding a new lerobot-edit-dataset tool to re-encode all the videos of a dataset
* chore(format): formatting code
* chore(review): fix Claude reviews
* test(reencode dataset): adding missing test for reencode dataset
* Add extensive language support
* Address review: split persistent/event schemas, drop event timestamps
- recipe.py: derive _VALID_ROLES/_VALID_STREAMS from MessageRole/MessageStream Literals
- dataset_metadata.py: keep CODEBASE_VERSION at v3.0
- language.py: remove RESERVED_STYLES; split arrow/feature schemas into
persistent (with timestamp) and event (without timestamp); add docstrings
- language_render.py: events use frame-row timestamp implicitly; no
per-event timestamp filtering or sorting
- converters.py: drop unused subtask_key passthrough
- add docstrings to new public APIs (recipe, render_messages_processor, collate)
- update tests for split schemas; revert uv.lock
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Add docstrings to all new helpers; revert uv.lock
Covers private helpers in recipe.py, language.py, language_render.py,
and render_messages_processor.py. Also reverts uv.lock to main (it was
re-generated by `uv run` during local checks).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(language): add motion (persistent) and trace (event-only) styles
Promote the previously-reserved motion/trace styles to first-class core
styles. motion routes to language_persistent (it tracks robot state over
time); trace routes to language_events (single-moment annotations).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(language): per-camera tagging on view-dependent styles
Adds a nullable `camera` field to the language row struct (both persistent
and event variants) so view-dependent styles like `vqa` can carry which
`observation.images.*` view they were grounded against. Without this,
multi-camera datasets ended up with multiple `(vqa, role)` rows at the
same timestamp that the resolver could not disambiguate.
- `language.py`: add `camera` to PERSISTENT_ROW_FIELDS / EVENT_ROW_FIELDS,
to both Arrow struct types and the HF datasets feature mappings;
introduce VIEW_DEPENDENT_STYLES = {vqa, motion, trace} plus
`is_view_dependent_style` and `validate_camera_field` helpers (camera
required iff style is view-dependent).
- `language_render.py`: thread an optional `camera=` kwarg through every
resolver (`active_at`, `emitted_at`, `nth_prev`, `nth_next`) and through
`_matching_rows` / `_select_*`, so recipes can disambiguate per-camera
VQA with `emitted_at(t, style=vqa, role=assistant, camera=...)`.
Without a `camera` filter, multi-row matches keep raising the existing
ambiguity error — which is the desired behaviour on multi-camera data.
- `recipes/pi05_hirobot.yaml`: replace the single `ask_vqa` branch with
`ask_vqa_top` and `ask_vqa_wrist` per-camera sub-recipes (each carrying
the matching image block), keeping the original 0.20 budget and
documenting the customization point for datasets with different cameras.
- Tests: schema test asserts the new field order; new tests cover
`is_view_dependent_style`, `validate_camera_field` (both required and
forbidden directions), per-camera `emitted_at` filtering, and the
ambiguity error when two cameras emit `(vqa, assistant)` at the same
timestamp without a `camera=` filter. RenderMessagesStep + dataset
passthrough fixtures updated to include the new field.
- `docs/source/language_and_recipes.mdx`: document the `camera` field,
the per-camera resolver pattern, and the canonical recipe convention.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(language): drop motion from VIEW_DEPENDENT_STYLES
Motion primitives are described in robot-frame (joint / Cartesian) terms,
not pixel space, so they are camera-agnostic. Only `vqa` (event) and
`trace` (event, pixel-trajectory) are view-dependent.
The `camera` field stays on PERSISTENT_ROW_FIELDS for schema symmetry —
the validator, resolver, and HF feature mapping behave identically across
the two columns regardless of which styles populate `camera` today —
but persistent rows now always have `camera=None` in practice.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(language): task_aug style + automatic ${task} rephrasing rotation
Adds task-prompt diversity (Xiao 2022 / CAST) without touching
``meta/tasks.parquet`` or forcing recipes to opt in. The plan reserved
``task_aug`` as a future style; this lands it now.
- ``language.py``: add ``task_aug`` to ``CORE_STYLES`` and
``PERSISTENT_STYLES``. ``column_for_style("task_aug")`` returns
``language_persistent`` so PR 2 writers route it correctly.
- ``language_render.py``: ``_resolve_task`` now consults the persistent
slice for rows of ``style="task_aug", role="user"``. When any exist
it picks one deterministically by ``sample_idx`` (blake2b-keyed, not
Python's randomized hash) so an epoch sees every rephrasing of every
episode while the same sample still resolves identically across
reruns. Falls back to the canonical ``meta/tasks.parquet`` task when
no rephrasings are present, so existing datasets and unannotated runs
keep their behaviour. Explicit ``task=`` overrides still win.
- Tests: rephrasing coverage across samples, determinism on repeat
``sample_idx``, fallback when persistent has no ``task_aug`` rows,
and explicit override priority.
Recipes get this for free: any ``${task}`` placeholder rotates through
the available rephrasings. Recipes that want the literal canonical task
can override the binding.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(language): tool catalog in meta/info.json + LeRobotDatasetMetadata.tools
Stores OpenAI-style function schemas at ``meta/info.json["tools"]`` so
datasets can declare which tools are available (today: just ``say``;
tomorrow: per-dataset extensions). The ``DEFAULT_TOOLS`` constant
fills in for unannotated datasets so chat-template consumers don't
have to special-case anything.
Three pieces:
- ``language.py``: ``SAY_TOOL_SCHEMA`` and ``DEFAULT_TOOLS``
constants. Single source of truth — PR 2's writer and PR 3's
runtime tool registry will both import from here instead of
duplicating the dict.
- ``dataset_metadata.py``: ``LeRobotDatasetMetadata.tools`` property
reads ``info.json["tools"]`` and falls back to ``DEFAULT_TOOLS``.
Returns deep-copied dicts so callers can mutate the result safely.
- ``docs/source/tools.mdx``: spec page covering the catalog, per-row
invocations, and the three-step "how to add a new tool" workflow
(declare schema, implement, register). Linked from the docs
toctree under the Datasets section.
This lays the groundwork for PR 2's pipeline writing the catalog out
during annotation, and PR 3's ``src/lerobot/tools/`` package shipping
runnable implementations (one file per tool — first up:
``say.py`` wrapping Kyutai's pocket-tts).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Apply ruff and prettier formatting after merge
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(language): unify resolver dispatch and prune redundant test scaffolding
* Drop the unused `events` kwarg from `active_at`/`nth_prev`/`nth_next`;
only `emitted_at` actually consults events. The dispatcher in
`_resolve_spec` now passes events conditionally.
* Replace the dual `_persistent_sort_key`/`_event_sort_key` pair with a
single `_row_sort_key` and drop the `sort_key` parameter from
`_select_one`. Event rows lack `timestamp` (it is implicit in the
frame) and now default to `0.0` for sort purposes — the
`(style, role)` tiebreaker is unchanged.
* Inline `_select_latest` into `active_at` (its only caller).
* Collapse `emitted_at`'s dual-branch into one `_select_one` call.
* Tighten `_validate_persistent_resolver` to a single
`column_for_style(style) != LANGUAGE_PERSISTENT` check.
* Parameterize `test_per_camera_blend_renders_both_views` over the two
cameras and factor the sub-recipe builder into `_vqa_subrecipe` so
the test no longer hand-rolls two near-identical recipe blocks.
Net -98 LOC; behavior, public resolver names, and test expectations
unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(language): always raise on ambiguous resolver matches
`_select_one` previously skipped its ambiguity check whenever any of
`role`/`tool_name`/`camera` was set, on the assumption that the caller
had already pinned down a unique row. That left a real ambiguity hole
for VQA: with two cameras emitting `(vqa, assistant)` at the same
frame, `emitted_at(..., role="assistant")` silently picked the first
sorted row instead of telling the recipe to add `camera=...`. The
existing `test_emitted_at_raises_on_ambiguous_per_camera_vqa` test
already encoded the desired behavior.
Tighten the check: any time `len(rows) > 1` we now raise with the
selectors echoed back, so users see exactly which fields they passed
and that more is needed to disambiguate.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore: fix CI — collapse short ValueError to one line, refresh uv.lock
* `ruff format` on CI (newer version) wants the short `camera=None`
ValueError on a single line.
* `uv.lock` was stale relative to `pyproject.toml`'s `datasets>=4.7.0`
pin (and picked up upstream `s390x` marker fixes for cuda packages).
CI runs `uv sync --locked` which rejected the divergence.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(language): keep base install green — drop processor re-export, gate dataset-extra tests
`lerobot.processor` re-exported `RenderMessagesStep` at the package
level, so importing anything from `lerobot.processor` pulled in
`lerobot.datasets.language` → `lerobot.datasets/__init__.py` →
`require_package("datasets")`, which fails in the Tier 1 base install
that intentionally omits the `[dataset]` extra. The chain bricked
collection for unrelated suites (`tests/policies/pi0_pi05/...`,
`tests/envs/...`, etc.).
* Stop re-exporting `RenderMessagesStep` from `lerobot.processor`. The
only consumer (the test) already imports from the submodule.
Document the deliberate omission in the module docstring.
* Add `pytest.importorskip("datasets", ...)` (and `pandas` where
needed) at the top of the four PR-added tests that exercise the
language stack:
- tests/datasets/test_language.py
- tests/datasets/test_language_render.py
- tests/processor/test_render_messages_processor.py
- tests/utils/test_collate.py
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(language): address review — tools accessor, motion docs, conditional collate
* **`meta.tools` actually reads `info.json["tools"]`.** `DatasetInfo`
had no `tools` field, so `from_dict` silently dropped the key (it
warned about unknown fields then discarded them) and the property
always returned `DEFAULT_TOOLS`. Added `tools: list[dict] | None`
to the dataclass; `to_dict()` drops it when unset so existing
datasets keep a clean `info.json`. Fixed the accessor to read
`self.info.tools` (the previous `.get(...)` would have raised
AttributeError on the dataclass anyway). Added regression tests:
fallback when absent, round-trip from disk, and round-trip
through `DatasetInfo.from_dict` / `to_dict`.
* **`motion` is not view-dependent — fix the docs.** The mdx claimed
rows of style `motion` must carry `camera`, but `VIEW_DEPENDENT_STYLES
= {"vqa", "trace"}` and the validator agrees: motion primitives are
joint/Cartesian-frame, not pixel-space. Updated both call-out
paragraphs in `language_and_recipes.mdx`.
* **Conditional `collate_fn` swap.** Added `meta.has_language_columns`
and gate the `lerobot_collate_fn` swap in `lerobot_train.py` on it,
so non-language datasets keep PyTorch's `default_collate`. Also
added a pass-through test in `test_collate.py` that asserts on a
plain tensor batch the custom collate matches `default_collate`
key-for-key, plus a test for the `None`-sample drop path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* review: dedupe regex, centralize column names, harden collate, more tests
* **#2 — dedupe `_PLACEHOLDER_RE`.** The same regex was compiled in
`recipe.py` and `language_render.py`. Promote to module-level
`PLACEHOLDER_RE` in `recipe.py` (its primary owner — declares
template syntax) and import from `language_render.py`.
* **#3 — centralize language column names.** `io_utils.py` had
hardcoded `{"language_persistent", "language_events"}` literals at
two sites. Replace with `LANGUAGE_COLUMNS` import so a future column
rename can't silently desync.
* **#4 — defensive collate preserved-keys.** `lerobot_collate_fn`
silently filtered language fields from samples that didn't have
them, which would hand downstream consumers a preserved list
shorter than the tensor batch. Now: if any sample carries a key,
every sample in the batch must carry it; otherwise raise a
`ValueError` so the upstream rendering bug surfaces at the boundary.
* **#5 — `_scalar` rejects non-singleton lists.** Previously a zero-
or multi-element list fell through and triggered confusing
`float([])` errors downstream. Now raises `ValueError` with the
actual length.
* **#6 — refactor `_extract_complementary_data`.** Replace 11 lines
of `key = {... if ... else {}}` plus an 11-line splat dict with a
single `_COMPLEMENTARY_KEYS` tuple iterated once.
* **#7 — document `EXTENDED_STYLES`.** Was an empty `set()` with no
comment. Add a docstring explaining it's an intentional extension
point: downstream modules append project-local styles before
`column_for_style` is called.
* **#9 — `tools.mdx` notes the runtime layer is future work.** The
page referenced `src/lerobot/tools/`, `registry.py`, and
`get_tools(meta)` — none exist in this PR. Added a callout at the
start of "How to add your own tool" plus a note on the
implementations paragraph.
* **#10 — tests for YAML round-trip, malformed rows, blend
validation.** `test_recipe.py` grew from 1 case to 12 covering:
blend-or-messages exclusivity, target-turn requirement, blend
emptiness, weight presence/positivity, nested-blend rejection,
`from_dict` with nested blends, `from_yaml` / `load_recipe`
agreement, top-level non-mapping rejection. Added a malformed-row
test for `_normalize_rows` that asserts non-dict entries raise
`TypeError`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* review: emitted_at uses 0.1s tolerance; MessageTurn requires stream at construction
* **Float tolerance in `emitted_at` for persistent styles.** The
``_timestamp(row) == t`` exact-equality check silently missed any
caller that derived ``t`` arithmetically (e.g. ``frame_idx / fps``)
even though the parquet timestamp would only differ by ULPs. Added
``EMITTED_AT_TOLERANCE_S = 0.1`` and check ``abs(...) <= tolerance``
instead, with a docstring explaining why exact equality wasn't
enough and why 0.1 s is safe at typical 30–100 Hz control rates.
Test asserts the new behavior at half-window (matches) and
double-window (no match) using the constant so it stays in sync.
* **`MessageTurn.stream` is required at construction.** It was typed
``MessageStream | None = None`` so YAML could omit ``stream:`` and
pass the dataclass invariant — but ``_validate_rendered`` rejected
``None`` streams later, surfacing the error at the first sample
instead of at recipe load. Now ``__post_init__`` raises
``ValueError`` if ``stream`` is ``None``, with the list of valid
streams in the message. The redundant late-stage check in
``_validate_rendered`` is replaced with a one-line comment that
cites the upstream invariant. Test pins the new construction-time
rejection.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(tools): drop follow-up-PR references
Reword the two callouts in `tools.mdx` to describe the runtime layer
in present tense ("not part of the catalog layer shipped today",
"those modules don't yet exist in the tree") instead of pointing at a
specific follow-up PR. Keeps the doc honest about what works now
without coupling it to a particular release order.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* review: address CarolinePascal feedback
- language timestamps: float64 -> float32 to match LeRobotDataset frame
timestamps (Arrow struct + HF feature)
- dataset_metadata: hoist `.language` imports to module top — language.py
has no lerobot imports, so there is no circular-import risk
- dataset_metadata: add a `meta.tools` setter that persists the catalog to
info.json and reloads `meta.info`
- feature_utils: validate the `language` dtype instead of returning "" —
warn (non-fatal) when a non-empty value is written at record time
- centralize the scalar-unwrap helper as `lerobot.utils.utils.unwrap_scalar`,
shared by render_messages_processor and language_render
- docs: move `## Layer 2 — recipe anatomy` ahead of the resolver sections,
which describe recipe bindings rather than dataset layout
- language_render: note in EMITTED_AT_TOLERANCE_S that persistent rows change
on a human-action timescale, not the camera frame rate
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(robots): natively integrate Seeed Studio reBot B601-DM arm
Add first-class LeRobot support for the Seeed Studio reBot arm, replacing
the out-of-tree `lerobot-robot-seeed-b601` / `lerobot-teleoperator-rebot-arm-102`
plugin packages.
New devices:
- robot `rebot_b601_follower` — single-arm B601-DM follower (6-DOF + gripper,
Damiao CAN motors via `motorbridge`)
- robot `bi_rebot_b601_follower` — bimanual follower composing two single arms
- teleoperator `rebot_102_leader` — single-arm StarArm102 / reBot Arm 102 leader
(FashionStar UART servos via `motorbridge-smart-servo`)
- teleoperator `bi_rebot_102_leader` — bimanual leader composing two single arms
The bimanual variants reuse the single-arm classes and namespace each arm's
observation/action keys with `left_` / `right_` prefixes, so a bimanual
StarArm102 leader can teleoperate a bimanual reBot B601 follower.
Optional SDK imports are guarded; a `rebot` extra installs `motorbridge` and
`motorbridge-smart-servo`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: add reBot B601-DM calibration & dual-arm teleoperation guide
Add docs/source/rebot_b601.mdx covering single-arm and bimanual
calibration and teleoperation for the reBot B601-DM follower and
reBot Arm 102 leader, with zero-position reference images from the
Seeed Studio wiki. Register the page in the docs toctree.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: fix reBot B601 MDX build (move JSON example out of <Tip>)
The doc-builder parses `{...}` inside MDX component children as a
Svelte expression, so the joint_directions JSON example broke the
build. Move it into a top-level fenced code block.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: apply prettier formatting to reBot B601 page
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: remove duplicate colocated reBot B601 page
docs/source/rebot_b601.mdx is the canonical, toctree-registered page;
the colocated rebot_b601.md was a redundant thinner copy.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: clarify 6-DOF leader fallback comment in reBot B601 follower
Explain that holding wrist_yaw at zero is what lets a 6-DOF leader
(e.g. so100_leader / so101_leader) teleoperate the 7-DOF follower.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor: address Caroline's PR review on reBot B601 integration
- leader: remove _validate_config (no other lerobot device validates its
config; a key mismatch now surfaces as a plain KeyError)
- leader: simplify _round_to_valid_range to direct modular arithmetic
instead of a bidirectional search loop
- leader: inline the single-use _clamp helper
- follower & leader: write MotorCalibration range_min/range_max from the
configured joint_limits / joint_ranges instead of a fixed [-90, 90]
- docs: add a "Find the USB ports" section (lerobot-find-port) and move
the brltty/permissions tip there; link the OpenArm page for SocketCAN
adapter configuration
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Enhance documentation with Lance format details
Added information about Lance format and `lerobot-lancedb` package for multimodal AI datasets.
Signed-off-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
* chore(video backend): renaming codec into video_backend in get_safe_default_video_backend()
* feat(pyav utils): adding suport for PyAV encoding parameters validation
* feat(VideoEncoderConfig): creating a VideoEncoderConfig to encapsulate encoding parameters
* feat(VideoEncoderConfig): propagating the VideoEncoderConfig in the codebase
* chore(docs): updating the docs
* feat(metadata): adding encoding parameters in dataset metadata
* fix(concatenation compatibility): adding compatibility check when concatenating video files
* feat(VideoEncoderConfig init): making VideoEncoderConfig more robust and adaptable to multiple backends
* feat(pyav checks): making pyav parameters checks more robust
* chore(duplicate): removing duplicate get_codec_options definition
* test(existing): adapting existing tests
* test(new): adding new tests for encoding related features
* chore(format): fixing formatting issues
* chore(PyAV): cleaning up PyAV utils and encoding parameters checks to stick to the minimun required tooling.
* chore(format): formatting code
* chore(doctrings): updating docstrings
* fix(camera_encoder_config): Removing camera_encoder_config from LeRobotDataset, as it's only required in LeRobotDatasetWriter.
* feat(default values): applying a consistent naming convention for default RGB cameras video encoder parameters
* fix(rollout): propagating VideoEncoderConfig to the latest recording modes
* chore(format): formatting code, fixing error messages and variable names
* fix(arguments order): reverting changes in arguments order in StreamingVideoEncoder
* chore(relative imports): switching to relative local imports within lerobot.datasets
* test(artifacts): cleaning up artifacts for the video encoding tests
* chore(docs): updating docs
* chore(fromat): formatting code
* fix(imports): refactoring the file architecture to avoid circular imports. VideoEncoderConfig is now defined in lerobot.configs and lazily imports av at runtime.
* fix(typos): fixing typos and small mistakes
* test(factories): updating factories
* feat(aggregate): updating dataset aggregation procedure. Encoding tuning paramters (crf, g,...) are ignored for validation and changed to None in the aggregated dataset if incompatible.
* docs(typos): fixing typos
* fix(deletion): reverting unwanted deletion
* fix(typos): fixing multiple typos
* feat(codec options): passing codec options to lerobot_edit_dataset episode deletion tool
* typo(typo): typo
* fix(typos): fixing remaining typos
* chore(rename): renaming camera_encoder_config to camera_encoder
* docs(clean): cleaning and formating docs
* docs(dataset): addind details about datasets
* chore(format): formatting code
* docs(warning): adding warning regarding encoding parameters modification
* fix(re-encoding): removing inconsistent re-encoding option in lerobot_edit_dataset
* typos(typos): typos
* chore(format): resolving prettier issues
* fix(h264_nvenc): fixing crf handling for h264_nvenc
* docs(clean): removing too technical parts of the docs
* fix(imports): fixing imports at the __init__ level
* fix(imports): fixing not very pretty imports in video config file
* fix(config): add lora_alpha to PeftConfig
PeftConfig was missing the lora_alpha field, causing the PEFT library
to default to alpha=8 regardless of the LoRA rank, which dampens the
adaptation signal for high-rank adapters (e.g., r=128).
This adds lora_alpha: int | None = None to PeftConfig, allowing users
to specify --peft.lora_alpha <value> on the CLI.
Closes#3551
* fix(docs): add lora_alpha to peft training example + clarify scaling formula
- Add --peft.lora_alpha=64 to docs/source/peft_training.mdx example to
prevent new users from hitting the alpha=8 default dampening bug
- Clarify lora_alpha comment in default.py with scaling = lora_alpha / r
* docs: mention both --peft.r and --peft.lora_alpha in LoRA description
---------
Co-authored-by: Cheng Yin <yin@users.noreply.github.com>
* fix(config): support policy.path in YAML config files
policy.path was only handled via CLI args (filtered from sys.argv before
draccus, then retrieved in validate()). When specified in YAML, draccus
would crash because 'path' is not a valid field on PreTrainedConfig.
Extract path fields from the YAML/JSON config before draccus processes
it, store them in a module-level dict, and fall back to it in
get_path_arg() when the CLI doesn't have the path.
Fixes#2957
* fix(parser): preserve YAML policy overrides when loading from pretrained
When policy.path is set in YAML, validate() was calling from_pretrained
with only CLI overrides, discarding any YAML policy fields (e.g. lr,
batch_size) that draccus had already parsed. Fix by capturing the
remaining YAML fields as CLI-style args in _config_yaml_overrides and
merging them into the overrides passed to from_pretrained in train.py,
eval.py, and lerobot_record.py (CLI args still take precedence).
Also fix the NamedTemporaryFile SIM115 ruff warning and add types-PyYAML
to the mypy pre-commit hook.
* fix(parser): serialize bool/None values correctly in YAML policy overrides
Bool values from YAML configs (e.g. push_to_hub: true) were passed as
Python "True"/"False" strings instead of lowercase "true"/"false" that
draccus expects. Also skip None values to avoid passing "None" strings.
* revert: remove types-PyYAML from .pre-commit-config.yaml
* chore: fix quality check caused by untyped YAML import
Co-authored-by: masato-ka <jp6uzv@gmail.com>
Signed-off-by: Khalil Meftah <khalil.meftah@huggingface.co>
---------
Signed-off-by: Khalil Meftah <khalil.meftah@huggingface.co>
Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co>
Co-authored-by: masato-ka <jp6uzv@gmail.com>
* feat(episode filtering): adding support for episodes filtering at initialization time in LeRobotDataset
* test(tests): adding tests
* chore(format): formatting code
* feat(performance): improving implementation for better performances on big datasets
* chores(warning): improving warnings and errors for episodes filtering
* test(invalid key): adding test for invalid filtering key
* chore(format): formatting code