Recipes were over-commented (paper citations, history of removed
sub-recipes, inference-time loop walkthroughs). Stripped down to a
short header + a one-line note on the boundary-frame memory tail.
Also removed the ``_tool3`` diversity-knobs comment block in
``examples/annotation/run_hf_job.py`` — it was a personal note about
a since-merged experiment.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Recipe changes:
* action_execution now bundles the memory update as a second
assistant target gated on a new ``new_memory`` binding (fires
only at subtask-boundary frames). No "Completed subtask: X"
filler — the model emits the new subtask AND the updated
memory back-to-back in one prefix.
* user_interjection_response sub-recipe removed (current
datasets don't have interjection / say() annotations).
* Standalone memory_update sub-recipe removed (folded above).
* Weights rebalanced: action_execution 0.85, ask_vqa_top/wrist
0.075 each (sums to 1.0).
Runtime ``_msgs_for_memory`` updated to match the new
boundary-frame prompt layout.
Modeling:
* SmolVLA2Policy now fuses the flow + text losses into a SINGLE
backbone forward via ``_compute_fused_loss`` (one
vlm_with_expert pass with [prefix, suffix] embeds, then both
lm_head CE on lang slice + action_out_proj MSE on suffix).
Mirrors pi052's existing ``_compute_all_losses_fused`` —
saves one backbone pass per training step.
Examples:
* Removed the two training SLURM scaffolds; they were
out-of-date with the recipe refactor.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Match the working SmolVLA2 launch pattern so the two SLURM scripts
are interchangeable:
* literal NUM_PROCESSES / BATCH_SIZE / STEPS (no env-var defaults)
* STEPS=10000 to match the next SmolVLA2 run
* save_freq=$STEPS so only the final checkpoint is saved
* dropouts 0.1/0.1/0.1 (mild — matches the operator's iteration)
* flow_loss_weight / text_loss_weight come from the PI052Config
defaults (10.0 / 1.0 per Pi 0.5 paper §IV.D), no need to pass
them explicitly
Job name and policy_repo_id mirror the SmolVLA2 ``_tool-g2`` naming
so the two runs can be compared side-by-side in WandB.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pi 0.5 paper §IV.D Eq. (1) sets the loss balance to α=10 between text
CE and flow MSE: actions are the primary output and the flow head
should dominate the gradient signal. SmolVLA2 was defaulting both
weights to 1.0, which inverts that — text CE (~0.5-2.0 nats) ends up
larger than flow MSE (~0.1-1.0), so the action expert gets less
gradient than the LM head despite being the primary task.
Match the paper's split: text_loss_weight=1.0, flow_loss_weight=10.0.
Same as ``pi052`` (the new full reproduction policy).
Also pin the values explicitly in the SLURM launcher so the choice is
visible and overridable per-run rather than buried in the config
default.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New ``lerobot.policies.pi052`` (parallel to ``smolvla2``) that adds
text-prediction + hierarchical-inference on top of the existing π0.5
implementation. Mirrors the paper's §IV.D dual-head training:
L = H(text) + α * ‖ω - a - f_θ_action(...)‖², α = 10
Components:
* ``configuration_pi052.py`` thin PI05Config subclass; adds
recipe_path, text/flow loss weights
(default α=10 per paper), prompt
dropout knobs, ``unfreeze_lm_head``.
* ``text_processor_pi052.py`` PI052TextTokenizerStep — concatenates
rendered messages as ``Role: ...``
plain text (PaliGemma has no chat
template), tokenises with the
PaliGemma tokenizer, builds a label
mask covering supervised target
spans. Includes Pi 0.7 §V.E
per-component prompt dropout.
* ``processor_pi052.py`` make_pi052_pre_post_processors —
Rename + Batch + Relative +
Normalize + RenderMessagesStep +
PI052TextTokenizerStep + Device.
Falls back to π0.5's plain pipeline
when recipe_path is unset.
* ``modeling_pi052.py`` PI052Policy(PI05Policy) — re-enables
PaliGemma ``lm_head``, computes
text_loss via CE on the supervised
span, sums with flow_loss in
forward(), and adds select_message
for AR text generation at inference
(same surface as
SmolVLA2Policy.select_message so
SmolVLA2Runtime drives it unchanged).
Plus the supporting plumbing:
* recipe ``configs/recipes/pi052_hirobot.yaml`` — same Hi-Robot blend
as smolvla2_hirobot.yaml, with the same ``${subtask}`` /
``if_present`` supervision fix (current span at every frame, not
``${next_subtask}``).
* SLURM ``examples/training/pi052_hirobot.slurm`` — full training
command matching the SmolVLA2 launcher.
* factory registration: ``--policy.type=pi052`` resolves to
PI052Policy with the new processor.
Same multi-rate runtime (``lerobot.policies.smolvla2.inference``)
drives this policy too — both expose ``predict_action_chunk`` for the
action expert and ``select_message`` for the LM head.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After _tool-good (2000 steps, 0.50/0.50/0.20 dropout) the LM head's
distribution at position 0 shifted from EOS to subtask-vocabulary
tokens but emitted bag-of-words ("cube arm and") rather than well-
formed sentences. That's the expected mid-fine-tuning phase: token-
level supervision has landed, sequence-level grammar hasn't.
Two changes for the next retrain:
* STEPS=15000 (from 2000) — chat-pretrained backbones need O(10k+)
steps to walk their pretraining priors down far enough to commit
to the fine-tuned distribution structurally, not just at the
token level. _tool-g2's bag-of-words output proves the model is
on the right path; it just needs more gradient signal.
* plan/memory dropout 0.50 -> 0.30 — 0.50 was probably too
aggressive for a small dataset. Half the training samples had
crucial context missing, which slows down learning the full
conditional structure. 0.30 still regularises against prompt
leakage but lets the model learn proper grammar first; the
higher dropout can be revisited once the head is solid.
Subtask dropout stays at 0.20 since subtask isn't in the high-level
prompt anyway (recipe fix removed the "Current subtask:" message).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After the recipe fix (target=${subtask} at every frame) the model
can still reach low text_loss by reading the answer off the plan in
the prompt: at training the prompt contains the 6-step plan, and the
current subtask is one of those steps, so the model just learns
"active step N matches subtask N" and never needs to look at the
image. Symptom at inference: subtask string is set but never updates
because the model isn't really conditioning on the visual progress.
Drop plan and memory with p=0.50 each — half of training frames the
prompt is just "${task}" (constant for this dataset) + visual prefix,
which is the only place the answer can come from. Forces the LM head
to actually use vision.
``subtask_dropout`` stays at 0.20 because subtask isn't in the
high-level prompt anymore (recipe fix removed the "Current subtask:
X" message); the knob still affects other sub-recipes that reference
it as context.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Match the operator's current training command for the _tool6 retrain:
* default DATASET / POLICY_REPO_ID / JOB_NAME point at the tool6
iteration (super_poulain_full_tool3 → smolvla2_hirobot_super_poulain_tool6)
* STEPS default 2000 (short enough to iterate; bump to 10k for full)
* save_freq=$STEPS so the only checkpoint is the final one
* OUTPUT_DIR includes step count so successive runs don't clobber
* Drop the wider augmentation envelope I added earlier — back to
default ColorJitter ranges (brightness ±20% etc) since the
high_level_subtask recipe fix (current-subtask supervision) is
expected to fix the LM-head collapse on its own; the augmentation
is just the standard regulariser, not a load-bearing widener.
* prompt-dropout fractions stay at the original 0.15 / 0.15 / 0.20.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The tensor-level comparison between dry-run (dataset frame) and live-
robot inference proved the runtime is bug-free — same shape, dtype,
device, channel order, batch dim, and normalization on both paths.
The remaining variable: front-camera mean brightness was 0.26 live vs
0.39 on the dataset frame, ~33% darker. Training augmentation only
covered ±20% brightness, so the live scene sits just outside the
supervised envelope and the LM head collapses to its dominant prior.
Widen the augmentation knobs for the next retrain:
* brightness 0.8–1.2 → 0.5–1.6 (covers ~30% darker / 60% lighter)
* contrast 0.8–1.2 → 0.6–1.5
* saturation 0.5–1.5 → 0.3–1.7
* hue ±0.05 → ±0.10
* affine ±5°/±5% → ±15°/±15% (covers cube placement / camera drift)
* max_num_transforms 3 → 4
And bump prompt-component dropout (subtask 0.20 → 0.30) so the LM
can't lean on stale memorised plan/memory at inference.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two complementary regularisers to attack the
``text_loss=6e-6 = memorised one dataset`` failure mode that's
making the model collapse on real-robot input:
1. **Per-component prompt dropout** (Pi0.7 §V.E / plan's
``feat/pi05-prompt-dropout`` follow-up).
``SmolVLA2ChatTokenizerStep`` gains
``plan_dropout_prob`` / ``memory_dropout_prob`` /
``subtask_dropout_prob`` knobs (default 0.0 — opt-in). At training,
non-target messages whose rendered content starts with
``Plan:`` / ``Memory:`` / ``Current subtask:`` etc. are dropped
with their respective probability before tokenisation, with a
deterministic per-sample RNG keyed off the dataset ``index``.
``target_message_indices`` is re-mapped so the supervision still
lands on the right turn. Forces the model to handle missing
plan/memory/subtask context — directly attacks the real-robot
collapse where a stale or empty plan field puts the prompt OOD.
Surfaced on ``SmolVLA2Config`` as three floats so they're
``--policy.<knob>=<value>``-controllable from the train CLI;
plumbed through ``make_smolvla2_pre_post_processors``.
2. **Image augmentation** is already wired in lerobot via
``--dataset.image_transforms.enable=true`` (torchvision v2
ColorJitter + SharpnessJitter + RandomAffine, default 3 of 6
sampled per frame). No code change needed — just a CLI flag.
``examples/training/smolvla2_hirobot.slurm`` shows the full
training command with both enabled. Drop-in replacement for the
ad-hoc SLURM script Pepijn was using locally; same args, plus the
three dropout probs and the image-transforms flag.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Last bump combined ``module_3.K=3`` with ``vqa_emission_hz=2.0`` and
``executor.episode_parallelism=32``. With 2 cameras per dataset that
produced ~12× the original VQA call volume, all submitted concurrently.
Module 3 latency went from ~30s/phase to ~490s per episode, vLLM's
KV cache pegged at 94% with 800+ in-flight requests, and the
multimodal cache corrupted with ``AssertionError: Expected a cached
item for mm_hash='...'`` (a known vLLM bug under image-heavy
concurrency). Module 1 and 2 ran fine; Module 3 was the bottleneck.
Pull back the multipliers to land in a sustainable spot:
* module_3.K: 3 (kept) — three diverse questions per emission,
where the diversity actually helps the LM head.
* module_3.vqa_emission_hz: 2.0 → 1.0 — back to the original
emission rate. Net VQA volume is now ~3× original (K alone) on
a single camera, ~6× across both cameras — manageable.
* module_2.max_interjections_per_episode: 9 → 6 — still 2× the
default, fewer than the prior 3× to keep total request volume
in check.
* vlm.client_concurrency: 256 → 128 — gives vLLM headroom on the
multimodal request path so the mm_cache doesn't desync.
* executor.episode_parallelism: 32 → 16 — half the episodes
in flight at once, so peak vLLM load is ~half.
n_task_rephrasings stays at 30 (text-only, doesn't load the image
path) and vlm.temperature stays at 0.7. The diversity gains are
preserved; only the throughput knobs come down.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Following Pi0.7 §V (prompt expansion / diverse context conditioning),
push more atom variants per episode and higher VLM sampling
temperature so the training distribution has enough wording diversity
that the LM head is forced to use its parameters rather than memorise
specific (prompt, target) pairs.
Changes vs prior annotation pass:
* vlm.temperature: 0.2 (default) → 0.7 — every Module-1/2/3 call
now produces diverse phrasings; same prompt yields different
completions across emissions.
* module_1.n_task_rephrasings: 10 → 30 — three times as many
``task_aug`` rows in language_persistent. ``${task}`` already
rotates through them deterministically per sample_idx (see
``_resolve_task`` in language_render.py).
* module_2.max_interjections_per_episode: 3 (default) → 9 — more
``user_interjection_response`` training samples + more plan
refresh events.
* module_3.K: 1 → 3 — three VQA pairs per emission tick instead of
one. Combined with the hz bump below, ~6× more VQA samples.
* module_3.vqa_emission_hz: 1.0 → 2.0 — double the VQA emission
rate within each subtask span.
Pushes to a new hub repo (``_tool3``) so the working ``_tool2``
dataset stays intact for comparison. ``${task}`` already wired to
rotate through ``task_aug`` rows, so no renderer change needed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A ready-to-run example of launching the annotation pipeline on a
Hugging Face job (h200x2) with two vllm replicas serving
Qwen3.6-35B-A3B-FP8. Lives next to other end-to-end recipes under
examples/.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat: HIL data collection, RTC interpolator, and action queue improvements
- Add Human-in-the-Loop (HIL) data collection examples (sync + RTC)
- Add HIL data collection documentation
- Add ActionInterpolator for smoother policy control at higher rates
- Integrate interpolator into lerobot-record and eval_with_real_robot
- Add action queue clear() and get_processed_left_over() methods
- Add rtc/__init__.py for cleaner imports
* docs: expand Related Work section with paper summaries
* fix: only record dataset frames at original fps, not at interpolated rate
The interpolator speeds up robot control (e.g. 2x) but dataset frames
should still be recorded at the original fps. Interpolated-only
iterations now only send actions to the robot without writing to the
dataset.
* refactor: merge HIL sync and RTC scripts into single file with --rtc.enabled toggle
Combines hil_data_collection.py and hil_data_collection_rtc.py into one
script. RTC is toggled via --rtc.enabled=true (defaults to off for sync
inference). Deletes the separate hil_data_collection_rtc.py and updates
docs to reflect the single-script usage.
* test: add ActionInterpolator test suite (29 tests)
Covers constructor validation, passthrough (multiplier=1), 2x and 3x
interpolation with exact value checks, reset/episode boundaries,
control interval calculation, multi-dim actions, and simulated
control loop integration.
* test: add ActionQueue + ActionInterpolator integration tests
Verifies the interpolator doesn't interfere with RTC's leftover chunk
tracking: queue consumption rate matches base fps regardless of
multiplier, get_left_over/get_processed_left_over only change on
queue.get(), merge preserves smooth interpolation across chunks,
and interpolator reset is independent of queue state.
* feat: register SO follower/leader configs in HIL script
Adds SOFollowerRobotConfig and SOLeaderTeleopConfig imports so
SO100/SO101 robots can be used via --robot.type=so_follower
and --teleop.type=so_leader. Updates docs accordingly.
Made-with: Cursor
* docs: remove em dashes from HIL documentation
Made-with: Cursor
* refactor: rename examples/rac to examples/hil
Updates directory name and all references in docs and script docstrings.
Made-with: Cursor
* fix: encorperate pr feedback comments
* refactor(tests): enhance ActionInterpolator test structure and add detailed docstrings
* feedback pr and test fix
* fix(test): pass correct real_delay in interpolator delay test
The test was passing real_delay=0 and relying on _check_delays to
silently override it with the index-based diff. Now passes real_delay=3
to match the 3 actions consumed during the simulated inference period.
* fix pr feedback
* ordering
* update hil script
* fix
* default name
* fix(bi_openarm): use kw_only=True to fix dataclass field ordering
BiOpenArmFollowerConfig overrides `id` with a default, making it
positional in the child — non-default `left_arm_config` then follows a
default field, which Python dataclasses forbid. Adding kw_only=True
(matching the parent RobotConfig) removes positional constraints.
Made-with: Cursor
* style: format long line in hil_data_collection.py
Made-with: Cursor
* pr feedback
---------
Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co>
* Add option for pi family models to train with relative actions (relative to state)
* formatting
* add recomputation of stats and option to compute delta stats
* normalzie after delta conversion
* only recompute state for stats
* calulate chunk based stats
* sample 100k
* load from parquet
* sample 1m
* stats per chunck
* fix
* use quantiles
* stats for entire dataset
* fix
* max 1m frames
* compute before dist
* fix multi gpu processor bug
* Fix RTC with delta actions and OpenArms motor_type wiring
* feat: align pi0_fast delta actions with pi0/pi05 and add RTC integration tests
- Add delta_exclude_joints and action_feature_names to PI0FastConfig
- Move to_absolute_actions from modeling to processor pipeline for pi0_fast
- Add delta action detection and logging to eval_with_real_robot.py
- Add delta actions documentation to pi0 and pi05 READMEs
- Fix ruff lint issues in test_delta_actions.py
- Add test_rtc_delta_actions.py (24 tests) covering:
- ActionQueue with delta vs absolute actions
- RTC denoise step with delta leftovers
- Full pipeline roundtrip (delta → RTC → absolute)
- State rebasing approximation bounds
- Non-delta policy compatibility
- Multi-chunk consistency
* chore: clean up test comments, add OpenPI attribution, remove debug logging
- Replace decorative comment separators in test files with plain section headers
- Add attribution comments for 1e-6 epsilon in normalize_processor.py (from OpenPI)
- Remove debug logging blocks from lerobot_train.py
* refactor: extract compute_delta_action_stats into compute_stats.py
Move the ~70-line inline delta action stats block from lerobot_train.py
into a dedicated function in compute_stats.py, where all other stats
computation already lives. The training script now calls it in 6 lines.
* refactor: remove unused get_processed_left_over from ActionQueue
This method was never called outside of tests. Leftover actions for RTC
guidance are always retrieved via get_left_over() (delta/original space).
* revert: remove logging-only changes from eval_with_real_robot.py
The delta actions detection helper and log message added no functional
value — the script already handles delta policies correctly via the
processor pipeline.
* refactor: use ACTION/OBS_STATE constants instead of hardcoded strings
Replace hardcoded "action" and "observation.state" with ACTION and
OBS_STATE from utils.constants in compute_stats.py, dataset_tools.py,
and lerobot_train.py.
* style: remove stray blank lines in training loop
* refactor: move delta action stats to preprocessing step, remove on-the-fly computation
- Remove on-the-fly compute_delta_action_stats from lerobot_train.py
- Rewrite recompute_stats to delegate action stats to compute_delta_action_stats
(chunk-based sampling matching what the model sees during training)
- Add chunk_size parameter to recompute_stats for delta action computation
- Add delta actions documentation to pi0.mdx and pi05.mdx
* feat: add recompute_stats CLI operation to lerobot-edit-dataset
* fix(tests): relax quantile normalization test tolerance for 1e-6 epsilon
* chore: remove agents_memory/pr_details.md from repo
* refactor: rename delta actions to relative actions throughout
What OpenPI calls "DeltaActions" is actually UMI's "relative trajectory"
representation: each action in the chunk is an offset from the current
state, not from the previous action. This avoids error accumulation.
Renamed across all source, tests, docs, and CLI:
- DeltaActionsProcessorStep → RelativeActionsProcessorStep
- to_delta_actions → to_relative_actions
- use_delta_actions → use_relative_actions
- delta_exclude_joints → relative_exclude_joints
- compute_delta_action_stats → compute_relative_action_stats
- delta_action_processor.py → relative_action_processor.py
- test_delta_actions.py → test_relative_actions.py
Kept as-is: AbsoluteActionsProcessorStep (converts TO absolute),
registry ID "delta_actions_processor" (backward compat), and unrelated
delta references (IK pipeline, Robosuite, RA-BC metrics, gym envs).
* docs: add Action Representations guide
Dedicated page explaining absolute, relative, and delta actions with
numerical examples, joint vs EE space, and how to use kinematics
pipelines and the relative action processor. References UMI paper
(Chi et al., 2024) for the terminology.
* docs: remove redundant OpenPI naming note from action representations
* docs: remove opinionated OpenPI reference from delta actions section
* docs: replace ASCII diagram with UMI paper figure
* docs: remove OpenPI reference from action representations
* docs: use HF-hosted image instead of local asset
* docs: clarify figure attribution
* revert: restore original normalization epsilon behavior
The 1e-6 unconditional epsilon change perturbed all normalized values,
breaking backward compatibility tests. The original approach (1e-8 eps
for MEAN_STD, conditional torch.where for QUANTILES) already handles
division by zero correctly without affecting non-degenerate cases.
* fix: restore delta_action_processor.py used by phone/RL teleop
The rename commit incorrectly deleted delta_action_processor.py and
duplicated its classes into relative_action_processor.py. Restore the
original file and import from it instead.
* fix(processor): address PR #2970 review comments
- Remove shebang from relative_action_processor.py (library module, not script)
- Add device alignment in to_relative_actions/to_absolute_actions so _last_state
on CPU doesn't cause cross-device errors when actions are on CUDA
- Rename delta_step → relative_step in AbsoluteActionsProcessorStep for naming
consistency; update factory.py, all processor files, and tests
- Expand _reconnect_relative_absolute_steps docstring to explain why post-hoc
rewiring is needed after deserialization
- Fix off-by-one in compute_stats.py: sample_upper_bound = total_frames - chunk_size + 1
so last valid start index is included and total_frames == chunk_size is not rejected
- Remove redundant NOTE comment in processor_pi05.py (duplicated two lines below)
- Fix pi0_fast processor ordering: move relative_step before NormalizerProcessorStep
so normalizer sees delta actions (matching pi0/pi05); flip postprocessor to
unnormalize → absolute accordingly. Relative stats are now required for all pi models
- Revert use_relative_joint_actions_aloha → use_delta_joint_actions_aloha in
configuration_smolvla.py (preserve existing public API)
- Update action_representations.mdx: add missing joint to 6-DOF example, fix
'based on a figure', clarify pi family ordering, add RTC compatibility section
* update rtc link
* feat: compute relative action stats over full dataset with optional parallelism
Remove the 100k sample cap from compute_relative_action_stats and process
all valid chunks. Vectorize with numpy (pre-load actions/states, fancy
indexing + broadcasting) for a large speedup over the per-index HF dataset
loop. Add num_workers param for thread-based parallelism (numpy releases
the GIL). Update docs to show --push_to_hub for recompute_stats.
* style: apply ruff formatting to compute_stats.py
* testing on real robot
* style: fix ruff format and remove redundant .keys() calls
* refactor(dataset): split reader and writer
* chore(dataset): remove proxys
* refactor(dataset): better reader & writer encapsulation
* refactor(datasets): clean API + reduce leaky implementations
* refactor(dataset): API cleaning for writer, reader and meta
* refactor(dataset): expose writer & reader + other minor improvements
* refactor(dataset): improve teardown routine
* refactor(dataset): add hf_dataset property at the facade level
* chore(dataset): add init for datasset module
* docs(dataset): add docstrings for public API of the dataset classes
* tests(dataset): add tests for new classes
* fix(dataset): remove circular dependecy
* Add SLURM SARM progress annotation script.
Provide a standalone two-stage compute/aggregate pipeline for RA-BC progress generation so large datasets can be processed in parallel and optionally uploaded to the Hub.
Made-with: Cursor
* fix pr comments
* remove comments
* feat(cameras): add new read_latest() method
* fix(cameras): fix threading bug + clear state
* refactor(cameras): multiple improvements
* feat(camera): add context manager to camera base class
* chore(camera): slight modifications to opencv
* test(cameras): update opencv tests according to the changes
* refactor(cameras): reflect desing changes to realsense + deal with depth
* test(cameras): fix realsense tests accordingly to new changes
* refactor(cameras): update reachymini and zmq accordingly
* chore: wrap resource sensitive examples into a try/finally
* test(cameras): add test for new read_latest
* test(cameras): fix problem with image artifact in opencv tests
* test(cameras): fix test_read_latest_high_frequency expectations
* Apply suggestions from code review 1
Co-authored-by: Caroline Pascal <caroline8.pascal@gmail.com>
Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>
* chore(cameras): address feedback
* feat(cameras): add max_age_ms check in read_latest
* test(cameras): fix read_latest tests
* chore(redundancies): removing redundancies in Reachy 2 camera class
* fix(warmup): replacing the arbitrary time.sleep in by an actual warmup in the RealSense camera class
* chore(format): formatting latest changes
* chore(warning): adding a "to be implemented" warning for read_latest() in Camera base class
* chore(warning): making read_latest() warning message shorter and clearer
---------
Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Caroline Pascal <caroline8.pascal@gmail.com>
* feat(async_inference): server always sends CPU tensors, client handles device conversion
* fix:fix the type annotation of RawObservation in src/lerobot/async_inference/helpers.py
* update the import of robot_client
---------
Co-authored-by: Sato shinji <wwwsatoshinji@gmail.com>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: KB <kevin-brian.n-diaye@epita.fr>
This PR extends the integration of Unitree g1 with the LeRobot codebase. By converting robot state to a flat dict we can now record and replay episodes (example groot/holosoma scripts need to be adjusted as well). We also improve the simulation integration by calling .step @ _subscribe_motor_state instead of it running in a separate thread. We also add ZMQ camera to lerobot, streaming base64 images over json
* feat(robots): consolidates bi SO setups
* fix(robots): solve circular dependecy
* fix(robots): teleop & record working
* feat(robots): only one SO
* fix(utils): rename bi so
* fix(scripts): bi so import
* fix(rl): remove imports
* Add basic support for PEFT adapter methods
This changes adds support for training policies with much less parameters
by applying adapter methods such as LoRA on specific parts of the policies
and therefore possibly higher learning rates / batch sizes.
To make this as accessible as possible I thought it useful to provide
defaults for `target_modules` and `modules_to_save`. Currently only SmolVLA
has such defaults but when we agree that this change is useful I will set
out to generate more such defaults. While the user can override these
settings, they are expected to only change the peft_method, rank and init_type
parameters.
* Implement loading of PEFT adapters
Loading a PEFT adapter is currently done by initializing a policy with default config
and then applying the adapter on the resulting model. This has the obvious drawback
that any configurations done during training are not applied in the adapted model.
Currently the `use_peft` attribute of `PreTrainedConfig` is only set during loading
to signal the following code that it has to deal with a PEFT adapter. However
we could imagine a scenario where this is already set at training time and stored
alongside the adapter.
* Store policy config alongside PEFT checkpoint
Before this change the PEFT-wrapped policy did not save the policy's config
alongside the adapter config / weights which prevented us from changing the
policy config. Now the policy config is saved both in full training and PEFT
training.
This change makes loading the PEFT policy adapter much easier as well.
* Add default config for ACT
* Support targets like `all-linear`
* Formatting
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix failing tests
* Remove PEFT compatibility changes in config
We'll wait for the PEFT release that fixes this for good.
* Remove `use_peft` parameter from training script
Instead we make the PEFT config optional which has the same effect.
* Log adapter config to WandB
* Better documentation for CLI arguments
* Don't unload & merge the PEFT model
This can make things hard when using quantized layers (user expects quantized base layers with
unquantized adapters for example, merging defaults to upcast the layers leading to higher
memory).
* Correct way of identifying when to save config
* Add CLI end-to-end tests
Currently there don't seem to be any way to test the CLI commands.
Since this change mostly happens in those I thought it best to add
a way to test these commands end-to-end.
More integrated commands like `lerobot-record` need patching but
standalone commands like training seem to work fine.
* Update default targets
Removed ACT since it doesn't make sense to fine-tune ACT without having it pretrained beforehand.
SmolVLA and Pi0/0.5 are much more senseful targets.
* Clean up loading code
- Centralized instantiation of the PEFT wrapper in `make_policy` for inference
(e.g. in `lerobot-record`)
- Training a PEFT policy also sets `cfg.use_peft` so that all inference code loading
the policy can rely on that attribute to identify if PEFT loading is needed
- Modified RTC example to also include PEFT policies. Mostly because this is an example
I'm currently exploring.
* Make sure push_to_hub works
Since PEFT only wraps `push_to_hub` and not `push_model_to_hub`, the reference
to `self` in `policy.push_model_to_hub` is the unwrapped policy which, of course,
doesn't know anything about PEFT.
To make the upload process aware of PEFT, we pass the unwrapped policy down to
`push_model_to_hub` as a kwarg. This is not ideal but I think it is the best way
for now.
* formatting
* Warn when encountering from-scratch-training
* Revamp pretrained model loading
There were quite a few factors that convinced me that the status quo
is able to load pretrained models from the PEFT adapter config but
in fact that didn't work.
This commit fixes the following things:
- policies wrapped in PEFT will now have a `name_or_path` attribute
containing the name or path of the pretrained model we're fine-tuning
- we further assume that SmolVLA without `pretrained_path` and
`load_vlm_weights==False` must be an user-side error
- we assume that using PEFT on from-scratch-policies must be
an user-side-error
* Make it possible to unset policy features
This is necessary to train pre-trained policies on new datasets so that the
features are inferred from the new dataset and not from the pretrained
policy.
* Use correct loading for PEFT in RTC example
* Make it possible to use PeftModels in eval
* Add test checking that PEFT actually reduces params
* Adapt state/action projections instead of full-finetuning
There doesn't seem to be a benefit to fully fine-tune these layers
over just adapting them, so we do that instead.
* Disallow PEFT training on non-pretrained policies
At first I thought it would make sense to have this feature
in case you want to fine-tune a pre-trained section but in the
end it makes more trouble than it's worth.
It's still possible to allow this in the future when a concrete
need arises.
* Add basic documentation
* Formatting
* Add peft as extra dependency, mark tests
Fast tests currently fail because of the missing dependency.
* Fix pre-commit issues
* Add walx <> peft conflict for uv
* Exclude peft from pi install for now
---------
Co-authored-by: nemo <git@ningu.net>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Pepijn <138571049+pkooij@users.noreply.github.com>
* Add Real-Time Chunking (RTC) support for flow matching models
Implement Real-Time Chunking (RTC) for action chunking policies using flow
matching denoising. RTC enables smooth action transitions between consecutive
chunks by using prefix guidance during denoising.
Key features:
- RTCProcessor class with denoise_step method for RTC guidance
- Tracker system for debug tracking using time-based dictionary storage
- RTCDebugVisualizer with comprehensive visualization utilities
- Integration with SmolVLA policy for flow matching models
- Support for multiple prefix attention schedules (ZEROS, ONES, LINEAR, EXP)
- Configurable execution horizon and max guidance weight
- Example scripts for dataset evaluation and real-time control
Technical details:
- Uses autograd-based gradient computation for RTC corrections
- Time-based tracking eliminates duplicate step issues
- Proxy methods in RTCProcessor for cleaner API
- Full integration with LeRobot's policy and dataset systems
Files added/modified:
- src/lerobot/configs/types.py: Add RTCAttentionSchedule enum
- src/lerobot/policies/rtc/: Core RTC implementation
- configuration_rtc.py: RTC configuration
- modeling_rtc.py: RTCProcessor with denoise_step
- debug_handler.py: Tracker for debug information
- debug_visualizer.py: Visualization utilities
- src/lerobot/policies/smolvla/modeling_smolvla.py: RTC integration
- examples/rtc/: Example scripts and evaluation tools
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Alexander Soare <alexander.soare159@gmail.com>
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix rtc_config attribute access in SmolVLA
Use getattr() to safely check for rtc_config attribute existence
instead of direct attribute access. This fixes AttributeError when
loading policies without rtc_config in their config.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Alexander Soare <alexander.soare159@gmail.com>
Co-Authored-By: Claude <noreply@anthropic.com>
* fixup! Fix rtc_config attribute access in SmolVLA
* Add RTCConfig field to SmolVLAConfig
Add rtc_config as an optional field in SmolVLAConfig to properly
support Real-Time Chunking configuration. This replaces the previous
getattr() workarounds with direct attribute access, making the code
cleaner and more maintainable.
Changes:
- Import RTCConfig in configuration_smolvla.py
- Add rtc_config: RTCConfig | None = None field
- Revert getattr() calls to direct attribute access in modeling_smolvla.py
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Alexander Soare <alexander.soare159@gmail.com>
Co-Authored-By: Claude <noreply@anthropic.com>
* Refactor RTC enabled checks to use _rtc_enabled helper
Add _rtc_enabled() helper method in VLAFlowMatching class to simplify
and clean up RTC enabled checks throughout the code. This reduces
code duplication and improves readability.
Changes:
- Add _rtc_enabled() method in VLAFlowMatching
- Replace verbose rtc_config checks with _rtc_enabled() calls
- Maintain exact same functionality with cleaner code
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Alexander Soare <alexander.soare159@gmail.com>
Co-Authored-By: Claude <noreply@anthropic.com>
* Rename track_debug method to track
Simplify the method name from track_debug to just track for better
readability and consistency. The method already has clear documentation
about its debug tracking purpose.
Changes:
- Rename RTCProcessor.track_debug() to track()
- Update all call sites in modeling_smolvla.py and modeling_rtc.py
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Alexander Soare <alexander.soare159@gmail.com>
Co-Authored-By: Claude <noreply@anthropic.com>
* Use output_dir for saving all evaluation images
Update eval_dataset.py to save all comparison images to the
configured output_dir instead of the current directory. This provides
better organization and allows users to specify where outputs should be
saved.
Changes:
- Add os import at top level
- Create output_dir at start of run_evaluation()
- Save all comparison images to output_dir
- Remove duplicate os imports
- Update init_rtc_processor() docstring to be more concise
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Alexander Soare <alexander.soare159@gmail.com>
Co-Authored-By: Claude <noreply@anthropic.com>
* fixup! Use output_dir for saving all evaluation images
* Fix logging buffering and enable tracking when RTC config provided
- Add force=True to logging.basicConfig to override existing configuration
- Enable line buffering for stdout/stderr for real-time log output
- Modify init_rtc_processor to create processor when rtc_config exists
even if RTC is disabled, allowing tracking of denoising data
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Alexander Soare <alexander.soare159@gmail.com>
* Refactor SmolVLA plotting to use tracker data instead of local variables
Remove local tracking variables (correction, x1_t, error) from the
denoising loop and instead retrieve plotting data from the RTC tracker
after each denoise step. This makes the code cleaner and uses the
tracker as the single source of truth for debug/visualization data.
Changes:
- Remove initialization of correction, x1_t, error before denoising loop
- After each Euler step, retrieve most recent debug step from tracker
- Extract correction, x1_t, err from debug step for plotting
- Update tracking condition to use is_debug_enabled() method
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Alexander Soare <alexander.soare159@gmail.com>
* Move plotting logic from modeling_smolvla to eval_dataset script
Refactor to improve separation of concerns:
modeling_smolvla.py changes:
- Remove all plotting logic from sample_actions method
- Remove viz_xt_axs, viz_vt_axs, viz_x1t_axs parameters
- Remove matplotlib and RTCDebugVisualizer imports
- Remove viz_fig, viz_axs, denoise_step_counter instance variables
- Simplify denoising loop to only track data in rtc_processor
eval_dataset.py changes:
- Add _plot_denoising_steps_from_tracker helper method
- Retrieve debug steps from tracker after inference
- Plot x_t, v_t, x1_t, correction, and error from tracker data
- Enable debug tracking (cfg.rtc.debug = True) for visualization
- Remove viz axes parameters from predict_action_chunk calls
modeling_rtc.py changes:
- Remove v_t from track() call (handled by user change)
Benefits:
- Cleaner modeling code focused on inference
- Evaluation script owns all visualization logic
- Better separation of concerns
- Tracker is single source of truth for debug data
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Alexander Soare <alexander.soare159@gmail.com>
* Refactor plotting loging
* fixup! Refactor plotting loging
* Improve visualization: separate correction plot and fix axis scaling
Changes:
- Create separate figure for correction data instead of overlaying on v_t
- Add _rescale_axes helper method to properly scale all axes
- Add 10% margin to y-axis for better visualization
- Fix v_t chart vertical compression issue
Benefits:
- Clearer v_t plot without correction overlay
- Better axis scaling with proper margins
- Separate correction figure for focused analysis
- Improved readability of all denoising visualizations
Output files:
- denoising_xt_comparison.png (x_t trajectories)
- denoising_vt_comparison.png (v_t velocity - now cleaner)
- denoising_correction_comparison.png (NEW - separate corrections)
- denoising_x1t_comparison.png (x1_t state with error)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Alexander Soare <alexander.soare159@gmail.com>
* fixup! Improve visualization: separate correction plot and fix axis scaling
* fixup! fixup! Improve visualization: separate correction plot and fix axis scaling
* fixup! fixup! fixup! Improve visualization: separate correction plot and fix axis scaling
* Fix traacking
* Right kwargs for the policy
* Add tests for tracker
* Fix tests
* Drop not required methods
* Add torch compilation for eval_dataset
* delete policies
* Add matplotliv to dev
* fixup! Add matplotliv to dev
* Experiemnt with late detach
* Debug
* Fix compilation
* Add RTC to PI0
* Pi0
* Pi0 eval dataset
* fixup! Pi0 eval dataset
* Turn off compilation for pi0/pi05
* fixup! Turn off compilation for pi0/pi05
* fixup! fixup! Turn off compilation for pi0/pi05
* fixup! fixup! fixup! Turn off compilation for pi0/pi05
* fixup! fixup! fixup! fixup! Turn off compilation for pi0/pi05
* fixup! fixup! fixup! fixup! fixup! Turn off compilation for pi0/pi05
* Add workable flow
* Small fixes
* Add more tests
* Add validatio at the end
* Update README
* Silent validation
* Fix tests
* Add tests for modeling_rtc
* Add tests for flow matching models with RTC
* fixup! Add tests for flow matching models with RTC
* fixup! fixup! Add tests for flow matching models with RTC
* Add one more test
* fixup! Add one more test
* Fix test to use _rtc_enabled() instead of is_rtc_enabled()
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fixup! Fix test to use _rtc_enabled() instead of is_rtc_enabled()
* fixup! fixup! Fix test to use _rtc_enabled() instead of is_rtc_enabled()
* Add RTC initialization tests without config for PI0.5 and SmolVLA
Add test_pi05_rtc_initialization_without_rtc_config and
test_smolvla_rtc_initialization_without_rtc_config to verify that
policies can initialize without RTC config and that _rtc_enabled()
returns False in this case.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix PI0.5 init_rtc_processor to use getattr instead of direct model access
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix SmolVLA init_rtc_processor to use getattr instead of direct model access
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix PI0.5 RTC tests to use quantile stats (q01, q99) for normalization
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fixup! Fix PI0.5 RTC tests to use quantile stats (q01, q99) for normalization
* Fixup eval with real robot
* fixup! Fixup eval with real robot
* fixup! fixup! Fixup eval with real robot
* Extract simulator logic from eval_with real robot and add proper headers to files
* Update images
* Fix tests
* fixup! Fix tests
* add docs for rtc
* enhance doc and add images
* Fix instal instructions
---------
Co-authored-by: Ben Zhang <benzhangniu@gmail.com>
Co-authored-by: Alexander Soare <alexander.soare159@gmail.com>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
* (unscrewing things up) (#2288)
* fix: expose a function explicitly building a frame for inference
* fix: first make dataset frame, then make ready for inference
* fix: reducing reliance on lerobot record for policy's ouptuts too
* fix: encapsulating squeezing out + device handling from predict action
* fix: remove duplicated call to build_inference_frame and add a function to only perform data type handling (whole conversion is: keys matching + data type conversion)
* refactor(envs): add custom-observation-size (#2167)
* fix: add MockMotorBus to MockRobot
* rl: first drafts
* add: all components of HIL SERL
* fix: actor block works
* fix: less friction, less friction
* add: hil-serl complete example
* fix: dataset names
* fix: restructuring example folder
* fix: act works but found bug in how ACT works
* fix: same path for both pre and postprocessors
* fix: paths
* add: example usage for act
* add: using ACT example
* fix: training examples
* fix: using examples
* fix: camera index
* fix: rename workflows into tutorial so that the path of the files is lerobot/examples/tutorial/...
* fix: upload everything in one repo
* fix: model name
* fix: simplify model path
* add: VLAs example
---------
Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>
* fix: minor fix using named attributes
* fix: change model to act
* fix: named attributes for inference frame building
* fix: minor fixes to smolvla
* fix: small changes to pi0
* remove: old file that should have never been committed (ups sorry sorry)
---------
Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>
* make add_feature take multiple features at a time and rename to add_features
* - New function: modify_features that was a combination of remove features and add features.
- This function is important for when we want to add a feature and remove another so we can do it in one time to avoid copying and creating the dataset multiple times
* feat(dataset-tools): add dataset utilities and example script
- Introduced dataset tools for LeRobotDataset, including functions for deleting episodes, splitting datasets, adding/removing features, and merging datasets.
- Added an example script demonstrating the usage of these utilities.
- Implemented comprehensive tests for all new functionalities to ensure reliability and correctness.
* style fixes
* move example to dataset dir
* missing lisence
* fixes mostly path
* clean comments
* move tests to functions instead of class based
* - fix video editting, decode, delete frames and rencode video
- copy unchanged video and parquet files to avoid recreating the entire dataset
* Fortify tooling tests
* Fix type issue resulting from saving numpy arrays with shape 3,1,1
* added lerobot_edit_dataset
* - revert changes in examples
- remove hardcoded split names
* update comment
* fix comment
add lerobot-edit-dataset shortcut
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Michel Aractingi <michel.aractingi@huggingface.co>
* style nit after copilot review
* fix: bug in dataset root when editing the dataset in place (without setting new_repo_id
* Fix bug in aggregate.py when accumelating video timestamps; add tests to fortify aggregate videos
* Added missing output repo id
* migrate delete episode to using pyav instead of decoding, writing frames to disk and encoding again.
Co-authored-by: Caroline Pascal <caroline8.pascal@gmail.com>
* added modified suffix in case repo_id is not set in delete_episode
* adding docs for dataset tools
* bump av version and add back time_base assignment
* linter
* modified push_to_hub logic in lerobot_edit_dataset
* fix(progress bar): fixing the progress bar issue in dataset tools
* chore(concatenate): removing no longer needed concatenate_datasets usage
* fix(file sizes forwarding): forwarding files and chunk sizes in metadata info when splitting and aggregating datasets
* style fix
* refactor(aggregate): Fix video indexing and timestamp bugs in dataset merging
There were three critical bugs in aggregate.py that prevented correct dataset merging:
1. Video file indices: Changed from += to = assignment to correctly reference
merged video files
2. Video timestamps: Implemented per-source-file offset tracking to maintain
continuous timestamps when merging split datasets (was causing non-monotonic
timestamp warnings)
3. File rotation offsets: Store timestamp offsets after rotation decision to
prevent out-of-bounds frame access (was causing "Invalid frame index" errors
with small file size limits)
Changes:
- Updated update_meta_data() to apply per-source-file timestamp offsets
- Updated aggregate_videos() to track offsets correctly during file rotation
- Added get_video_duration_in_s import for duration calculation
* Improved docs for split dataset and added a check for the possible case that the split size results in zero episodes
* chore(docs): update merge documentation details
Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>
---------
Co-authored-by: CarolinePascal <caroline8.pascal@gmail.com>
Co-authored-by: Jack Vial <vialjack@gmail.com>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
* chore: replace hard-coded 'action' values with constants throughout all the source code
* chore(tests): replace hard-coded action values with constants throughout all the test code
* chore: replace hard-coded OBS values with constants throughout all the source code
* chore(tests): replace hard-coded OBS values with constants throughout all the test code