Commit Graph

1387 Commits

Author SHA1 Message Date
Steven Palma 6f4a96333e chore(docs): update contributing (#3387) 2026-04-15 11:02:37 +02:00
Steven Palma 9021d2d240 refactor(imports): enforce guard pattern (#3382)
* refactor(imports): enforce guard pattern

* fix(tests): skip reachy2 if not installed

* Address review feedback
2026-04-14 22:54:05 +02:00
Khalil Meftah 60e7d67cb8 fix: catch KeyboardInterrupt in safe_stop_image_writer to prevent corrupted frames (#3381) 2026-04-14 18:22:56 +02:00
Radu 1ede000bdd fix(rl): swap dict merge order to preserve teleop intervention flag (#3273)
Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co>
2026-04-14 16:20:54 +02:00
Khalil Meftah d57c58a532 fix: add thread synchronization to ReplayBuffer to prevent race condition between add() and sample() (#3372) 2026-04-14 13:16:45 +02:00
Matteo Tiezzi b3e76a92f2 fix(groot): compatibility fixes for gr00t in v0.5 (#3182)
* fix(groot): apply groot 0.5 fixes

* fix(groot): correct indentation and add tile count in Eagle25VL processor

* Fixed lint7/style
2026-04-14 13:09:18 +02:00
Khalil Meftah f5c801fd34 fix(test): add missing device placement in multi-task DiT tests (#3349) 2026-04-14 12:25:29 +02:00
Ethan Pronovost cff4bcf4a0 Update reward classifier training config (#3147)
Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co>
2026-04-14 11:28:49 +02:00
Maxime Ellerbach a656a982af fix(feetech): motor position readings overflow (#3373) 2026-04-13 22:39:58 +02:00
Pepijn 187b2167ed feat(ci): benchmark smoke tests with isolated Docker images (LIBERO + MetaWorld) (#3319)
* docs(benchmarks): add benchmark integration guide and standardize benchmark docs

Add a comprehensive guide for adding new benchmarks to LeRobot, and
refactor the existing LIBERO and Meta-World docs to follow the new
standardized template.



* refactor(envs): move dispatch logic from factory into EnvConfig subclasses

Replace hardcoded if/elif chains in factory.py with create_envs() and
get_env_processors() methods on EnvConfig. New benchmarks now only need
to register a config subclass — no factory.py edits required.

Net -23 lines: factory.py shrinks from ~200 to ~70 lines of logic.



* docs(benchmarks): clean up adding-benchmarks guide for clarity

Rewrite for simpler language, better structure, and easier navigation.
Move quick-reference table to the top, fold eval explanation into
architecture section, condense the doc template to a bulleted outline.



* fix link

* fix task count

* fix: enable SmolVLA eval on LIBERO with custom camera mappings

- Thread camera_name_mapping from LiberoEnv config through to gym envs
- Sync features_map with camera_name_mapping in LiberoEnv.__post_init__
- Fix render() to use first available camera instead of hardcoded "image"
- Handle non-dict final_info in rollout by falling back to info["is_success"]
- Add use_peft legacy field to SmolVLAConfig for checkpoint compat
- Add defaults to GR00TN15Config init=False fields for transformers 5.3



* fix: use direct AutoresetMode import for gymnasium compat



* fix: handle gymnasium < 1.0 without AutoresetMode



* refactor: revert policy changes, keep env-only camera mapping fixes

- Revert GR00T N1.5 default_factory/default changes (transformers compat)
- Revert SmolVLA use_peft legacy field
- Apply ruff formatting fixes
- camera_name_mapping stays entirely in env/eval layer (no policy changes)



* Update docs/source/env_processor.mdx

Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co>
Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>

* feat(envs): lazy env init + AsyncVectorEnv as default for n_envs > 1

LiberoEnv and MetaworldEnv previously allocated GPU resources (EGL context,
OpenGL framebuffer) in __init__, before AsyncVectorEnv's fork(). Worker
processes inherited stale GPU handles, causing EGL_BAD_CONTEXT crashes on
first render.

Fix: defer OffScreenRenderEnv / MT1 construction to _ensure_env(), called on
first reset() or step() inside the worker subprocess. Each worker creates its
own clean context after fork().

Also fixes lerobot_eval.py:170 (add_envs_task TODO): replace with
env.call("task") which works with both SyncVectorEnv and AsyncVectorEnv.

AsyncVectorEnv is now the default for n_envs > 1; auto-downgraded to
SyncVectorEnv when n_envs=1 (no benefit, less overhead).

Expected speedup: ~15-20x for LIBERO Spatial with batch_size=50.



* fix: close envs between tasks to prevent worker process accumulation

eval_policy_all never closed environments after each task completed,
causing AsyncVectorEnv worker processes to accumulate (N_tasks × n_envs).
This led to OOM, BrokenPipeError and EOFError on multi-task benchmarks.

Also fixes:
- AsyncVectorEnv compat in envs/utils.py (use get_attr/call instead of .envs)
- Tuple task handling in tokenizer_processor and lerobot_eval
- _LazyAsyncVectorEnv for deferred worker spawning in LIBERO



* fix(eval): use task_description instead of task for language conditioning

env.call("task") returns the LIBERO task name with underscores
(e.g. "pick_up_the_black_bowl_...") instead of the natural language
description ("pick up the black bowl ..."). The VLM tokenizes these
completely differently, causing 0.0 reward across all episodes.



* docs: update adding_benchmarks for async env changes

- Replace add_envs_task reference with env.call("task_description")
- Update use_async_envs default to True
- Add note about lazy GPU init for AsyncVectorEnv compatibility



* feat(eval): batch_size=auto + faster env loading

- batch_size=0 (default) auto-tunes based on CPU cores, capped by
  n_episodes and 64. Removes the need for users to guess the right
  value. The old batch_size > n_episodes error is replaced by silently
  clamping to n_episodes.
- _LazyAsyncVectorEnv accepts pre-computed spaces so only one temp env
  is created per suite (not per task). For libero_spatial (10 tasks)
  this avoids 9 redundant LiberoEnv instantiations during env setup.



* docs: add evaluation guide and update benchmarks doc

- New docs/source/evaluation.mdx covering lerobot-eval usage, batch_size
  auto-tuning, AsyncVectorEnv performance, tuning tips, output format,
  multi-task evaluation, and programmatic usage.
- Add evaluation page to _toctree.yml under Benchmarks section.
- Update adding_benchmarks.mdx to reference batch_size auto default and
  link to the evaluation guide.



* docs(evaluation): remove benchmark table, rename section header



* perf(eval): shared memory, observation passthrough, task prefetch

- AsyncVectorEnv now uses shared_memory=True for zero-copy observation transfer
- LiberoEnvConfig.gym_kwargs passes observation_height/width to the env
- eval_policy_all prefetches next task's workers while current task runs



* style: ruff format



* chore: revert env_processor.mdx changes (not part of this PR)



* ci(benchmarks): add isolated integration tests for libero and metaworld

Each benchmark gets its own Docker image (lerobot[libero] / lerobot[metaworld]
only) so incompatible dep trees cannot collide. A 1-episode smoke eval runs
per benchmark on GPU runners.



* ci(benchmarks): pin action hashes and use uv sync --locked



* ci(benchmarks): trigger only on envs/ or lerobot_eval.py changes



* fix(ci): set LIBERO_DATA_FOLDER to bypass interactive stdin prompt

libero/__init__.py calls input() to ask about a custom dataset path,
which raises EOFError when stdin is closed inside Docker. Setting
LIBERO_DATA_FOLDER skips the prompt entirely.



* docs(benchmarks): add CI smoke test step to adding_benchmarks guide



* fix(ci): pre-create libero config in Dockerfile to bypass stdin prompt

libero/__init__.py calls input() when ~/.libero/config.yaml is missing.
We write the config at image build time (without importing libero) so
the prompt never fires at runtime. Also trigger CI on pyproject.toml changes.



* fix(ci): use shell to create libero config instead of multiline python -c

The multiline RUN python -c "..." was being parsed as Dockerfile
instructions. Use printf to write ~/.libero/config.yaml directly.



* fix(ci): point libero config to bundled package init_files

The config was pointing to /tmp/libero_init which doesn't exist.
Use importlib.util.find_spec to locate the hf-libero package directory
and write paths to the actual bundled bddl_files/init_files/assets.



* fix(ci): add smolvla extra to benchmark Dockerfiles

num2words (required by SmolVLM processor) is declared in lerobot[smolvla],
not lerobot[libero/metaworld]. Install both extras together.



* fix(eval): render_frame covers _LazyAsyncVectorEnv

isinstance(env, AsyncVectorEnv) silently skipped _LazyAsyncVectorEnv,
causing video rendering to produce no frames on the default async path.
Switch to hasattr(env, "call") so any async-compatible env (including
_LazyAsyncVectorEnv) hits the call("render") branch.



* refactor(envs): remove unused _get_sub_env_attr helper

_get_sub_env_attr was defined but never called anywhere in the codebase.
_sub_env_has_attr (its sibling) is kept — it is actively used in utils.py.



* chore: apply prettier formatting to docs



* docs(env_processor): remove deprecated add_envs_task from pipeline example

add_envs_task is replaced by env.call("task_description") in this PR.
Remove it from the pipeline walkthrough and renumber the steps (8→7).



* refactor(envs): remove __del__ from _LazyAsyncVectorEnv

__del__ is unreliable as a cleanup mechanism. close() is already called
explicitly in the eval loop's finally block, so the finalizer is redundant.



* fix(eval): prefetch next task's workers after close to avoid GPU memory overlap

Previously, next task's AsyncVectorEnv workers were spawned while the
current task was still running, causing both tasks' GPU contexts to coexist.
Moving the prefetch start into the finally block (after env.close()) ensures
workers for task N+1 only spin up once task N has released GPU memory.



* refactor(envs): move _LazyAsyncVectorEnv to utils and apply to metaworld

_LazyAsyncVectorEnv lived in libero.py but metaworld had the same OOM
problem: all tasks' AsyncVectorEnv workers were spawned eagerly, wasting
GPU memory for tasks not yet running.

Move the class to envs/utils.py so both environments share it, then apply
the same is_async + lazy wrapping pattern in create_metaworld_envs.



* chore: remove out-of-scope benchmark/CI/docs files from PR

Benchmark CI workflow, Dockerfiles, benchmark docs, evaluation smoke-test
doc, and dispatch tests belong in a separate PR. Scope this PR to the
async env init changes only.



* chore: restore adding_benchmarks + test_dispatch, drop env_processor changes

- Restore docs/source/adding_benchmarks.mdx (belongs in this PR)
- Restore tests/envs/test_dispatch.py (belongs in this PR)
- Revert docs/source/env_processor.mdx to main (out of scope for this PR)



* docs(adding_benchmarks): remove CI smoke test step (coming in separate PR)

Step 7 (Dockerfile + benchmark_tests.yml CI job) and its table rows are
out of scope for this PR. The CI infrastructure will be added on top in a
follow-up PR.



* refactor(envs): remove unused add_envs_task

Replaced by env.call("task_description") in lerobot_eval.py. No callers
remain in the codebase.



* style: fix prettier formatting in env_processor.mdx



* fix(ci): use root container chmod to fix PermissionError on artifact dirs

Running chmod on the host doesn't propagate into Docker due to UID/SELinux
mismatch. Instead, spin up the image as root to mkdir+chmod from inside
the container before the eval run mounts the same path.



* fix(ci): re-chmod artifacts after eval to fix unreadable files

Files created by user_lerobot inside the eval container inherit a
restrictive umask, making them unreadable by the runner after the
container exits. Add a post-eval 'docker run --user root' chmod step
so upload-artifact can find the video files.



* feat(ci): add monthly schedule trigger for benchmark tests

Runs on the 1st of every month at 02:00 UTC in addition to the
existing push/PR and manual dispatch triggers.



* fix(ci): change benchmark schedule from monthly to weekly (every Monday)



* fix(ci): use docker cp instead of bind mounts for artifacts

Bind mounts on these runners don't surface container-written files on
the host path (likely DinD/socket-mount setup). Switch to named
containers + docker cp, which copies directly through the daemon and
lands files in the runner's accessible filesystem.



* fix(ci): write eval output to /tmp inside container

user_lerobot cannot create /artifacts at the container root.
Use /tmp/eval-artifacts (always writable) then docker cp it out.



* feat(ci): add parse_eval_metrics step to benchmark workflow

Adds scripts/ci/parse_eval_metrics.py and wires it into both Libero and
MetaWorld jobs so the dashboard can read pc_success, avg_sum_reward and
eval_s from the metrics artifact instead of relying on GitHub step timing.



* feat(ci): add Libero train+eval smoke test (1 step, eval_freq=1)

Runs accelerate launch --num_processes=1 lerobot-train with:
- steps=1, batch_size=1, dataset.episodes=[0] (episode 0 only)
- eval_freq=1 so the training loop triggers eval after step 1
- eval.n_episodes=1, eval.use_async_envs=false

Tests the full train→eval-within-training pipeline in the existing
libero-benchmark-libero:ci image (no extra Docker build cost).
Uploads eval video from /tmp/train-smoke/eval/ as libero-train-smoke-video.



* feat(ci): extract task descriptions and embed in metrics artifact

- Add scripts/ci/extract_task_descriptions.py: runs inside the benchmark
  Docker container (LIBERO/MetaWorld installed) after lerobot-eval and
  writes task_descriptions.json mapping task keys to NL instructions.
  LIBERO: uses libero.libero.benchmark to get suite.get_task(i).language.
  MetaWorld: formats task name as human-readable label.
- Call extraction at the end of each eval bash-c (|| true so never fatal).
- parse_eval_metrics.py reads task_descriptions.json and includes it in
  metrics.json so the health dashboard Space can label videos by task.



* fix(ci): call extract_task_descriptions.py after eval in benchmark jobs

The task descriptions were never populated in metrics.json because
extract_task_descriptions.py was never invoked. The script exists and
parse_eval_metrics.py already looks for its output — the call was
simply missing from the workflow.

Appends the extraction step to the existing bash -c block (runs inside
the container where libero/metaworld is installed) so task_descriptions.json
is written to the eval-artifacts dir before docker cp copies it out.



* fix(test): use SyncVectorEnv in test_base_create_envs

AsyncVectorEnv spawns new subprocesses that do not inherit the
in-process gym registration created by the test. Pass
use_async_envs=False since this test validates dispatch logic,
not async parallelism.



* perf(ci): split Dockerfile dep-install from source-copy for faster rebuilds

The dep-install layer (uv sync) now only depends on pyproject.toml,
uv.lock, and a minimal package stub — not the full src/ tree. Source
code changes only rebuild the final COPY layer (seconds, not minutes).

Also switch from type=local cache (lost on ephemeral runners) to
type=gha (persisted in GitHub Actions cache, shared across all runs).

Before: every src/ change → full uv sync rebuild (~8-10 min)
After:  src/-only change → cached dep layer, ~30s source copy



* fix(ci): add Docker Hub login to avoid pull rate limits

Anonymous pulls from Docker Hub are rate-limited to 100/6h, which
fails when multiple benchmark jobs pull nvidia/cuda in parallel.
Add docker/login-action step (conditional on DOCKERHUB_USERNAME var)
to authenticate and get 200 pulls/6h.

Setup: add DOCKERHUB_USERNAME as a repository variable and
DOCKERHUB_TOKEN as a repository secret in GitHub Settings.



* fix(ci): use existing DOCKERHUB_LEROBOT_USERNAME/PASSWORD secrets



* fix(ci): use env context for secrets check in step if-condition

Step-level 'if' cannot reference 'secrets' directly. Expose the
secret via an env var and check that instead.



* fix(ci): simplify Docker Hub login to match existing workflows

Drop the conditional guard — other workflows (docker_publish,
full_tests) call docker/login-action unconditionally.



* fix(ci): switch Docker cache from type=gha to type=registry

GHA cache is capped at 10GB per repo — a single CUDA + PyTorch +
benchmark image is ~8GB so the cache evicts before it's reused.

Switch to type=registry which pushes cache layers to Docker Hub
(huggingface/lerobot-benchmark-cache:{libero,metaworld}). No size
limit, layers persist until explicitly deleted, and shared across
all runners and branches.



* fix(ci): use GHCR for Docker layer cache (Docker Hub push denied)

Docker Hub CI token can't push to new repos. GHCR works out of the
box — GITHUB_TOKEN has automatic packages:write for the repo owner.

- Add GHCR login step (github.actor + GITHUB_TOKEN)
- Switch cache refs to ghcr.io/huggingface/lerobot/cache-benchmark
- Add packages:write at job level (not workflow, per zizmor)
- Keep Docker Hub login for pulling nvidia/cuda base image



* fix(ci): remove GHCR cache (org blocks GITHUB_TOKEN package writes)

The huggingface org restricts GHCR package creation via GITHUB_TOKEN,
causing 403 on cache export. Remove all registry caching and GHCR
login. The Dockerfile layer split (deps vs source) still helps when
the runner has a warm Docker daemon.

Also fix the metaworld job which had a stale conditional Docker Hub
login and was missing the GHCR login entirely.



* fix(ci): address PR review feedback for benchmark smoke tests

Security:
- Remove "Login to Hugging Face" step — it was a no-op (ephemeral
  --rm container) that exposed the HF token via CLI argument in
  docker inspect / /proc/*/cmdline. The eval step already
  re-authenticates via env var.

Functional:
- Remove feat/benchmark-ci from push trigger branches (won't exist
  post-merge).

Dockerfiles:
- Pin uv to 0.8.0 (was unpinned, fetching whatever latest ships).
- Add comment explaining the chmod +x ptxas workaround (Triton
  packaging bug — ships ptxas without execute bit).

Scripts:
- parse_eval_metrics.py: add note that it runs on bare host and must
  stay stdlib-only.
- parse_eval_metrics.py: add NaN guard for avg_sum_reward and eval_s
  (was only guarding pc_success).



* ci(benchmarks): trigger on PRs targeting feat/benchmark-ci

Benchmark PRs (robomme, libero-plus, robocerebra, robotwin) target
feat/benchmark-ci, not main. Without this, the workflow never runs
on those PRs.



* fix(docker): use uv pip install instead of uv sync (cross-extra conflict)

uv sync --locked validates the entire lockfile across all extras.
Since robomme depends on mani-skill which pins numpy<2.0, and the
base project requires numpy>=2.0, the full lockfile is unsatisfiable.

Switch to uv pip install -e ".[libero,smolvla]" which only resolves
the requested extras for the current Python version and platform,
avoiding the cross-extra numpy conflict entirely.



* chore: revert configs.py, factory.py, test_dispatch.py to main

These use_async_envs default changes belong to the async-vector-env
PR (#3274), not this CI PR. Restore to match origin/main.



* fix: address PR review feedback — broken link, NaN guard, zizmor tags, fork skip

- Remove broken Triton issue link from Dockerfile.benchmark.libero
- Add module-level _safe_int helper to guard n_episodes against NaN
- Move _safe_float to module level alongside _safe_int
- Add # zizmor: ignore[unpinned-uses] to all upload-artifact@v4 steps
- Add if: env.HF_USER_TOKEN != '' to Libero smoke eval for fork PRs



* fix(ci): add fork PR guard to train-smoke and MetaWorld eval steps

Add if: env.HF_USER_TOKEN != '' to the Libero train+eval smoke and
MetaWorld smoke eval steps so fork PRs without the secret skip gracefully.



* fix(ci): remove feat/benchmark-ci from PR trigger branches



* refactor(docker): rebase benchmark images on nightly lerobot-gpu

Use huggingface/lerobot-gpu:latest as base for both libero and metaworld
benchmark Dockerfiles instead of building from nvidia/cuda scratch. The
nightly image already has all extras installed via uv sync --extra all,
so we only need to overlay the PR source code (and libero asset setup).

This eliminates duplicated system dep installation, Python setup, uv
venv creation, and the Triton ptxas workaround from both files.

---------

Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>
Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co>
2026-04-13 21:24:01 +02:00
Jash Shah 9bd844a3b9 fix(rl): ensure queue and process cleanup on abnormal exit (#3063)
Wrap the main execution in actor_cli and start_learner_threads with
try/finally so that queues are closed and processes are joined even
when an unhandled exception occurs. Previously, exceptions in
act_with_policy or add_actor_information_and_train would skip all
cleanup code, leaking GPU/CPU resources.

Also sets the shutdown_event on exception so child processes exit
gracefully.

Fixes #3059

Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co>
2026-04-13 16:25:42 +02:00
Steven Palma df0763a2bc feat(dependencies): minimal default tag install (#3362) 2026-04-12 20:03:04 +02:00
Steven Palma 4d2361ef71 chore(dependencies): update uv.lock (#3361)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-04-12 16:41:15 +02:00
Steven Palma 3167fe9f08 chore(dependencies): update uv.lock (#3308)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-04-12 10:39:18 +02:00
Caroline Pascal d762f4bfe8 fix(dataset): adding metadata loading when reading from a dataset after writing (#3305)
* fix(one shot load): adding metadata loading when reading from a dataset after writing

* refactor(one shot load): move metadata reload to ensure_readable() on LeRobotDatasetMetadata

Move the metadata reload from DatasetReader.load_and_activate() to a new
public ensure_readable() method on LeRobotDatasetMetadata, called from
LeRobotDataset._ensure_reader(). This places lifecycle management in the
right layer: metadata owns its readiness check, the dataset orchestrates
the write-to-read transition, and the reader stays clean.

Also adds a regression test using delta_timestamps to exercise the
meta.episodes access path in the create -> write -> finalize -> read flow.

Co-authored-by: Steven Palma <imstevenpmwork@users.noreply.github.com>

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Steven Palma <imstevenpmwork@users.noreply.github.com>
2026-04-10 11:29:40 +02:00
Steven Palma 6799da35eb chore(ci): proper claude args workflow (#3338) 2026-04-09 16:20:01 +02:00
Steven Palma 3e34d550c8 fix(ci): pin claude-code-action to v1.0.88 (#3336) 2026-04-09 14:16:54 +02:00
hf-security-analysis[bot] 800449aa53 chore(security): update claude.yml (#3333)
* fix(security): remediate workflow vulnerability in .github/workflows/claude.yml

* fix(security): right AUTHOR_ASSOCIATION fetching

---------

Co-authored-by: hf-security-analysis[bot] <265538906+hf-security-analysis[bot]@users.noreply.github.com>
Co-authored-by: Steven Palma <steven.palma@huggingface.co>
2026-04-09 13:02:05 +02:00
Steven Palma 8645d71e56 feat(ci): add agent assitance workflow (#3332)
Co-authored-by: Pepijn <138571049+pkooij@users.noreply.github.com>
2026-04-09 12:06:25 +02:00
Pepijn 919184d6f8 feat(envs): lazy env init + AsyncVectorEnv as default for n_envs > 1 (#3274)
* docs(benchmarks): add benchmark integration guide and standardize benchmark docs

Add a comprehensive guide for adding new benchmarks to LeRobot, and
refactor the existing LIBERO and Meta-World docs to follow the new
standardized template.

Made-with: Cursor

* refactor(envs): move dispatch logic from factory into EnvConfig subclasses

Replace hardcoded if/elif chains in factory.py with create_envs() and
get_env_processors() methods on EnvConfig. New benchmarks now only need
to register a config subclass — no factory.py edits required.

Net -23 lines: factory.py shrinks from ~200 to ~70 lines of logic.

Made-with: Cursor

* docs(benchmarks): clean up adding-benchmarks guide for clarity

Rewrite for simpler language, better structure, and easier navigation.
Move quick-reference table to the top, fold eval explanation into
architecture section, condense the doc template to a bulleted outline.

Made-with: Cursor

* fix link

* fix task count

* fix: enable SmolVLA eval on LIBERO with custom camera mappings

- Thread camera_name_mapping from LiberoEnv config through to gym envs
- Sync features_map with camera_name_mapping in LiberoEnv.__post_init__
- Fix render() to use first available camera instead of hardcoded "image"
- Handle non-dict final_info in rollout by falling back to info["is_success"]
- Add use_peft legacy field to SmolVLAConfig for checkpoint compat
- Add defaults to GR00TN15Config init=False fields for transformers 5.3

Made-with: Cursor

* fix: use direct AutoresetMode import for gymnasium compat

Made-with: Cursor

* fix: handle gymnasium < 1.0 without AutoresetMode

Made-with: Cursor

* refactor: revert policy changes, keep env-only camera mapping fixes

- Revert GR00T N1.5 default_factory/default changes (transformers compat)
- Revert SmolVLA use_peft legacy field
- Apply ruff formatting fixes
- camera_name_mapping stays entirely in env/eval layer (no policy changes)

Made-with: Cursor

* Update docs/source/env_processor.mdx

Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co>
Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>

* feat(envs): lazy env init + AsyncVectorEnv as default for n_envs > 1

LiberoEnv and MetaworldEnv previously allocated GPU resources (EGL context,
OpenGL framebuffer) in __init__, before AsyncVectorEnv's fork(). Worker
processes inherited stale GPU handles, causing EGL_BAD_CONTEXT crashes on
first render.

Fix: defer OffScreenRenderEnv / MT1 construction to _ensure_env(), called on
first reset() or step() inside the worker subprocess. Each worker creates its
own clean context after fork().

Also fixes lerobot_eval.py:170 (add_envs_task TODO): replace with
env.call("task") which works with both SyncVectorEnv and AsyncVectorEnv.

AsyncVectorEnv is now the default for n_envs > 1; auto-downgraded to
SyncVectorEnv when n_envs=1 (no benefit, less overhead).

Expected speedup: ~15-20x for LIBERO Spatial with batch_size=50.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: close envs between tasks to prevent worker process accumulation

eval_policy_all never closed environments after each task completed,
causing AsyncVectorEnv worker processes to accumulate (N_tasks × n_envs).
This led to OOM, BrokenPipeError and EOFError on multi-task benchmarks.

Also fixes:
- AsyncVectorEnv compat in envs/utils.py (use get_attr/call instead of .envs)
- Tuple task handling in tokenizer_processor and lerobot_eval
- _LazyAsyncVectorEnv for deferred worker spawning in LIBERO

Made-with: Cursor

* fix(eval): use task_description instead of task for language conditioning

env.call("task") returns the LIBERO task name with underscores
(e.g. "pick_up_the_black_bowl_...") instead of the natural language
description ("pick up the black bowl ..."). The VLM tokenizes these
completely differently, causing 0.0 reward across all episodes.

Made-with: Cursor

* docs: update adding_benchmarks for async env changes

- Replace add_envs_task reference with env.call("task_description")
- Update use_async_envs default to True
- Add note about lazy GPU init for AsyncVectorEnv compatibility

Made-with: Cursor

* feat(eval): batch_size=auto + faster env loading

- batch_size=0 (default) auto-tunes based on CPU cores, capped by
  n_episodes and 64. Removes the need for users to guess the right
  value. The old batch_size > n_episodes error is replaced by silently
  clamping to n_episodes.
- _LazyAsyncVectorEnv accepts pre-computed spaces so only one temp env
  is created per suite (not per task). For libero_spatial (10 tasks)
  this avoids 9 redundant LiberoEnv instantiations during env setup.

Made-with: Cursor

* docs: add evaluation guide and update benchmarks doc

- New docs/source/evaluation.mdx covering lerobot-eval usage, batch_size
  auto-tuning, AsyncVectorEnv performance, tuning tips, output format,
  multi-task evaluation, and programmatic usage.
- Add evaluation page to _toctree.yml under Benchmarks section.
- Update adding_benchmarks.mdx to reference batch_size auto default and
  link to the evaluation guide.

Made-with: Cursor

* docs(evaluation): remove benchmark table, rename section header

Made-with: Cursor

* perf(eval): shared memory, observation passthrough, task prefetch

- AsyncVectorEnv now uses shared_memory=True for zero-copy observation transfer
- LiberoEnvConfig.gym_kwargs passes observation_height/width to the env
- eval_policy_all prefetches next task's workers while current task runs

Made-with: Cursor

* style: ruff format

Made-with: Cursor

* chore: revert env_processor.mdx changes (not part of this PR)

Made-with: Cursor

* ci(benchmarks): add isolated integration tests for libero and metaworld

Each benchmark gets its own Docker image (lerobot[libero] / lerobot[metaworld]
only) so incompatible dep trees cannot collide. A 1-episode smoke eval runs
per benchmark on GPU runners.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci(benchmarks): pin action hashes and use uv sync --locked

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci(benchmarks): trigger only on envs/ or lerobot_eval.py changes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): set LIBERO_DATA_FOLDER to bypass interactive stdin prompt

libero/__init__.py calls input() to ask about a custom dataset path,
which raises EOFError when stdin is closed inside Docker. Setting
LIBERO_DATA_FOLDER skips the prompt entirely.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(benchmarks): add CI smoke test step to adding_benchmarks guide

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): pre-create libero config in Dockerfile to bypass stdin prompt

libero/__init__.py calls input() when ~/.libero/config.yaml is missing.
We write the config at image build time (without importing libero) so
the prompt never fires at runtime. Also trigger CI on pyproject.toml changes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): use shell to create libero config instead of multiline python -c

The multiline RUN python -c "..." was being parsed as Dockerfile
instructions. Use printf to write ~/.libero/config.yaml directly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): point libero config to bundled package init_files

The config was pointing to /tmp/libero_init which doesn't exist.
Use importlib.util.find_spec to locate the hf-libero package directory
and write paths to the actual bundled bddl_files/init_files/assets.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add smolvla extra to benchmark Dockerfiles

num2words (required by SmolVLM processor) is declared in lerobot[smolvla],
not lerobot[libero/metaworld]. Install both extras together.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(eval): render_frame covers _LazyAsyncVectorEnv

isinstance(env, AsyncVectorEnv) silently skipped _LazyAsyncVectorEnv,
causing video rendering to produce no frames on the default async path.
Switch to hasattr(env, "call") so any async-compatible env (including
_LazyAsyncVectorEnv) hits the call("render") branch.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(envs): remove unused _get_sub_env_attr helper

_get_sub_env_attr was defined but never called anywhere in the codebase.
_sub_env_has_attr (its sibling) is kept — it is actively used in utils.py.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: apply prettier formatting to docs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(env_processor): remove deprecated add_envs_task from pipeline example

add_envs_task is replaced by env.call("task_description") in this PR.
Remove it from the pipeline walkthrough and renumber the steps (8→7).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(envs): remove __del__ from _LazyAsyncVectorEnv

__del__ is unreliable as a cleanup mechanism. close() is already called
explicitly in the eval loop's finally block, so the finalizer is redundant.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(eval): prefetch next task's workers after close to avoid GPU memory overlap

Previously, next task's AsyncVectorEnv workers were spawned while the
current task was still running, causing both tasks' GPU contexts to coexist.
Moving the prefetch start into the finally block (after env.close()) ensures
workers for task N+1 only spin up once task N has released GPU memory.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(envs): move _LazyAsyncVectorEnv to utils and apply to metaworld

_LazyAsyncVectorEnv lived in libero.py but metaworld had the same OOM
problem: all tasks' AsyncVectorEnv workers were spawned eagerly, wasting
GPU memory for tasks not yet running.

Move the class to envs/utils.py so both environments share it, then apply
the same is_async + lazy wrapping pattern in create_metaworld_envs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: remove out-of-scope benchmark/CI/docs files from PR

Benchmark CI workflow, Dockerfiles, benchmark docs, evaluation smoke-test
doc, and dispatch tests belong in a separate PR. Scope this PR to the
async env init changes only.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: restore adding_benchmarks + test_dispatch, drop env_processor changes

- Restore docs/source/adding_benchmarks.mdx (belongs in this PR)
- Restore tests/envs/test_dispatch.py (belongs in this PR)
- Revert docs/source/env_processor.mdx to main (out of scope for this PR)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(adding_benchmarks): remove CI smoke test step (coming in separate PR)

Step 7 (Dockerfile + benchmark_tests.yml CI job) and its table rows are
out of scope for this PR. The CI infrastructure will be added on top in a
follow-up PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(envs): remove unused add_envs_task

Replaced by env.call("task_description") in lerobot_eval.py. No callers
remain in the codebase.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: fix prettier formatting in env_processor.mdx

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(eval): catch AttributeError and NotImplementedError explicitly for task description

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(envs): use forkserver context and close envs in test to prevent deadlock

AsyncVectorEnv with default fork context leaks worker processes between
test_policy parametrized cases; subsequent env creation deadlocks because
new forked workers inherit stale pipe FDs from previous test's leaked workers.

- configs.py: pass context="forkserver" to AsyncVectorEnv (matches _LazyAsyncVectorEnv)
- test_policies.py: call close_envs(envs) at end of test_policy to clean up workers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(envs): default use_async_envs=False in create_envs and make_env

Tests that call make_env(n_envs=2) without passing use_async_envs were
getting AsyncVectorEnv, whose forked workers can't resolve gym namespaces
registered at runtime. Default to False (sync) so existing tests pass.

lerobot_eval.py explicitly passes cfg.eval.use_async_envs, so the CLI
async behaviour (controlled by EvalConfig.use_async_envs) is unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>
Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 10:29:20 +02:00
Pepijn 5de7aa5a4f refactor(envs): move benchmark dispatch into EnvConfig subclasses (#3272)
* docs(benchmarks): add benchmark integration guide and standardize benchmark docs

Add a comprehensive guide for adding new benchmarks to LeRobot, and
refactor the existing LIBERO and Meta-World docs to follow the new
standardized template.

* refactor(envs): move dispatch logic from factory into EnvConfig subclasses

Replace hardcoded if/elif chains in factory.py with create_envs() and
get_env_processors() methods on EnvConfig. New benchmarks now only need
to register a config subclass — no factory.py edits required.

Net -23 lines: factory.py shrinks from ~200 to ~70 lines of logic.

* docs(benchmarks): clean up adding-benchmarks guide for clarity

Rewrite for simpler language, better structure, and easier navigation.
Move quick-reference table to the top, fold eval explanation into
architecture section, condense the doc template to a bulleted outline.

* fix link

* fix task count

* fix(tests): fix 3 failing dispatch tests

- test_registry_all_types: skip non-EnvConfig stubs (e.g. TestPluginConfig)
- test_processors_delegation: use None instead of abstract PreTrainedConfig
- test_custom_get_env_processors_override: use DataProcessorPipeline for isinstance check (PolicyProcessorPipeline is a subscripted generic)

* fix: enable SmolVLA eval on LIBERO with custom camera mappings

- Thread camera_name_mapping from LiberoEnv config through to gym envs
- Sync features_map with camera_name_mapping in LiberoEnv.__post_init__
- Fix render() to use first available camera instead of hardcoded "image"
- Handle non-dict final_info in rollout by falling back to info["is_success"]
- Add use_peft legacy field to SmolVLAConfig for checkpoint compat
- Add defaults to GR00TN15Config init=False fields for transformers 5.3

Made-with: Cursor

* fix: use direct AutoresetMode import for gymnasium compat

Made-with: Cursor

* fix: handle gymnasium < 1.0 without AutoresetMode

Made-with: Cursor

* refactor: revert policy changes, keep env-only camera mapping fixes

- Revert GR00T N1.5 default_factory/default changes (transformers compat)
- Revert SmolVLA use_peft legacy field
- Apply ruff formatting fixes
- camera_name_mapping stays entirely in env/eval layer (no policy changes)

Made-with: Cursor

* Update docs/source/env_processor.mdx

Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co>
Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>

* Update docs/source/env_processor.mdx

Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co>
Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>

* Update docs/source/env_processor.mdx

Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co>
Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>

* fix(eval): raise RuntimeError for unsupported final_info format (Gymnasium < 1.0)

Made-with: Cursor

* style: fix markdown code fences in env_processor.mdx

Made-with: Cursor

* docs: remove duplicate code blocks in env_processor.mdx

Made-with: Cursor

* style: revert quadruple backticks to triple (prettier compat)

* docs(env_processor): add EnvConfig subclass step and policy_cfg examples

- Add missing '### 2. Update Your EnvConfig Subclass' section with
  get_env_processors() snippet
- Update factory usage example to show policy_cfg parameter and
  keyword-argument style for both SmolVLA and ACT cases

* docs(env_processor): rename step 2 and fix policy_cfg examples

- Rename '### 2. Update the Factory' → '### 2. Update Your EnvConfig Subclass'
- Update factory usage examples to use keyword-argument style with
  policy_cfg parameter for both SmolVLA and ACT cases

---------

Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>
Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co>
2026-04-08 17:48:58 +02:00
Steven Palma 4eecbad32b chore(dependencies): Bump lerobot to 0.5.2 (#3307)
* chore(dependencies): Bump lerobot to 0.5.2

* chore(dependecies): upgrade uv.lock
2026-04-07 17:17:33 +02:00
Pauline Bailly-Masson 1396b9fab7 🔒 Pin GitHub Actions to commit SHAs (#3265)
* 🔒 pin quality.yml actions to commit SHAs

* 🔒 pin fast_tests.yml actions to commit SHAs

* 🔒 pin full_tests.yml actions to commit SHAs

* 🔒 pin documentation.yml actions to commit SHAs

* 🔒 pin documentation-upload-pr.yml actions to commit SHAs

* 🔒 pin release.yml actions to commit SHAs

* 🔒 pin security.yml actions to commit SHAs

---------

Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
v0.5.1
2026-04-07 16:11:14 +02:00
Francesco Capuano 7c032f19fc feat(dataset): registering torchvision transforms (#3153)
* add: a flexible transformation registry

* fix: image transforms can be set both at init and after

* add: tests

* fix: take in review

* feat(datasets): add image transform setters

* fix: pre-commit

* fix: CI

---------

Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>
2026-04-07 15:59:11 +02:00
Anthony Chan e2f27bf71b Fix lerobot_train script without interpolation (#3281) 2026-04-07 15:50:18 +02:00
Steven Palma ea36a4a176 chore(docs): new badge for readme (#3303) 2026-04-07 10:47:03 +02:00
Steven Palma 399b3c9ba5 chore(dependencies): update uv.lock (#3302)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-04-07 09:49:00 +02:00
Steven Palma 913041e753 fix(ci): latest deps tests permissions (#3296)
* fix(ci): latest deps tests permissions

* fix(ci): force push dep update branch

* fix(ci): change secret for permissions & Ci trigger
2026-04-06 14:56:05 +02:00
Steven Palma 2b541ddd4c docs(ci): add readme for dockerfile (#3295) 2026-04-06 13:22:45 +02:00
Steven Palma 50a1e67e94 feat(ci): add uv.lock (#3292)
* feat(ci): add uv.lock

* feat(ci): use uv.lock in CI PR testing

* chore(ci): rename nightly to docker publish and test

* feat(ci): automated update of uv.lock + remove unbound check + docker images now use uv.lock

* fix(ci): add --force-with-lease + set -e for silent erros
2026-04-06 12:23:37 +02:00
Steven Palma d60a700d2b chore(policy): multi dit docs (#3285)
* docs(policy): add libero results multi task dit + remove readme in src code

* docs(policy): add hyperlink to doc file in src code

* chore(style): pre-commit
2026-04-05 21:23:13 +02:00
Steven Palma 8c3d4cf900 chore(docs): no policy readme in src code (#3286)
* chore(docs): move policies readme out of src code

* chore(docs): create symlink for policy readme
2026-04-05 19:25:38 +02:00
Caroline Pascal b6e60a6e30 chore(dependencies): bump minimum torch from 2.2.1 to 2.7 (#3156)
* feat(ffmpeg): updating ffmpeg verion to 8.X

* Revert "feat(ffmpeg): updating ffmpeg verion to 8.X"

This reverts commit bb0f03185c.

* chore(pyproject): updating pyproject to fit the minimally required version of torchcodec

* chore(docs): updating doc with specific instructions for ffmpeg/torchcodec installation

* fix(typo): reverting ceiling bound on pytorch to 2.11.0

* chore(format): removing empty line

* chore(typo): fixing typo

* chore(docs): adding warning in case of torchcodec/ffmpeg version mismatch

* chore(docs): applying comments

* chore(docs): adding uv commands for evdev on WSL

* fix(typo): fixing typo

* fix(typo): fixing typos again

* chore(ruff): format

* fix(evdev install): splitting evdev install instructions between conda and uv

* chore(ruff): format

---------

Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
2026-04-05 19:24:43 +02:00
Steven Palma 3596681d94 docs(policy): fix gr00t license docs (#3284) 2026-04-05 19:09:15 +02:00
Pepijn 4dbbcca496 docs(benchmarks): add benchmark integration guide and standardize benchmark docs (#3270)
* docs(benchmarks): add benchmark integration guide and standardize benchmark docs

Add a comprehensive guide for adding new benchmarks to LeRobot, and
refactor the existing LIBERO and Meta-World docs to follow the new
standardized template.

Made-with: Cursor

* docs(benchmarks): clean up adding-benchmarks guide for clarity

Rewrite for simpler language, better structure, and easier navigation.
Move quick-reference table to the top, fold eval explanation into
architecture section, condense the doc template to a bulleted outline.

Made-with: Cursor

* fix link

* fix task count

* Update docs/source/adding_benchmarks.mdx

Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co>
Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>

* Update docs/source/metaworld.mdx

Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co>
Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>

* Update docs/source/adding_benchmarks.mdx

Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co>
Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>

* Update docs/source/adding_benchmarks.mdx

Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co>
Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>

* Update docs/source/adding_benchmarks.mdx

Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co>
Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>

* docs(benchmarks): add verification checklist to adding-benchmarks guide

Made-with: Cursor

---------

Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>
Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co>
2026-04-03 14:44:53 +02:00
Pepijn 818892a38b feat(dagger): Add HIL/Dagger/HG-Dagger/RaC style data collection (#2833)
* feat: HIL data collection, RTC interpolator, and action queue improvements

- Add Human-in-the-Loop (HIL) data collection examples (sync + RTC)
- Add HIL data collection documentation
- Add ActionInterpolator for smoother policy control at higher rates
- Integrate interpolator into lerobot-record and eval_with_real_robot
- Add action queue clear() and get_processed_left_over() methods
- Add rtc/__init__.py for cleaner imports

* docs: expand Related Work section with paper summaries

* fix: only record dataset frames at original fps, not at interpolated rate

The interpolator speeds up robot control (e.g. 2x) but dataset frames
should still be recorded at the original fps. Interpolated-only
iterations now only send actions to the robot without writing to the
dataset.

* refactor: merge HIL sync and RTC scripts into single file with --rtc.enabled toggle

Combines hil_data_collection.py and hil_data_collection_rtc.py into one
script. RTC is toggled via --rtc.enabled=true (defaults to off for sync
inference). Deletes the separate hil_data_collection_rtc.py and updates
docs to reflect the single-script usage.

* test: add ActionInterpolator test suite (29 tests)

Covers constructor validation, passthrough (multiplier=1), 2x and 3x
interpolation with exact value checks, reset/episode boundaries,
control interval calculation, multi-dim actions, and simulated
control loop integration.

* test: add ActionQueue + ActionInterpolator integration tests

Verifies the interpolator doesn't interfere with RTC's leftover chunk
tracking: queue consumption rate matches base fps regardless of
multiplier, get_left_over/get_processed_left_over only change on
queue.get(), merge preserves smooth interpolation across chunks,
and interpolator reset is independent of queue state.

* feat: register SO follower/leader configs in HIL script

Adds SOFollowerRobotConfig and SOLeaderTeleopConfig imports so
SO100/SO101 robots can be used via --robot.type=so_follower
and --teleop.type=so_leader. Updates docs accordingly.

Made-with: Cursor

* docs: remove em dashes from HIL documentation

Made-with: Cursor

* refactor: rename examples/rac to examples/hil

Updates directory name and all references in docs and script docstrings.

Made-with: Cursor

* fix: encorperate pr feedback comments

* refactor(tests): enhance ActionInterpolator test structure and add detailed docstrings

* feedback pr and test fix

* fix(test): pass correct real_delay in interpolator delay test

The test was passing real_delay=0 and relying on _check_delays to
silently override it with the index-based diff. Now passes real_delay=3
to match the 3 actions consumed during the simulated inference period.


* fix pr feedback

* ordering

* update hil script

* fix

* default name

* fix(bi_openarm): use kw_only=True to fix dataclass field ordering

BiOpenArmFollowerConfig overrides `id` with a default, making it
positional in the child — non-default `left_arm_config` then follows a
default field, which Python dataclasses forbid. Adding kw_only=True
(matching the parent RobotConfig) removes positional constraints.

Made-with: Cursor

* style: format long line in hil_data_collection.py

Made-with: Cursor

* pr feedback

---------

Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co>
2026-04-02 19:53:59 +02:00
Pepijn 66fef25ded docs(toctree): add Benchmarks section for LIBERO and Meta-World (#3268)
* docs(toctree): add Benchmarks section for LIBERO and Meta-World

Move LIBERO and Meta-World pages out of the Simulation section into a
dedicated Benchmarks section so benchmark-specific docs are easier to
find and the Simulation section stays focused on environment hubs.

Made-with: Cursor

* docs(toctree): move IsaacLab Arena into Benchmarks section

Include NVIDIA IsaacLab Arena Environments alongside LIBERO and
Meta-World in the Benchmarks section.

Made-with: Cursor
2026-04-02 19:52:39 +02:00
Pepijn 2cf08b7a4b Add create reward visualization (#3155)
* Add create reward visualization and multimodal analysis tool

* add example for creating progress video for sarm

* nit

* precommit

* refactor: address review comments on create_progress_videos.py

- Add shebang and Apache 2.0 license header
- Replace hardcoded absolute OUTPUT_DIR with relative default (./progress_videos)
- Add argparse CLI (--repo-id, --episode, --camera-key, --output-dir, --gif)
- Wrap entrypoint in def main()
- Replace all print() with logging
- Use logging.error/warning instead of traceback.print_exc
- Release VideoCapture via try/finally; consolidate triple-open into single seek
- Eliminate intermediate clip file: seek directly via CAP_PROP_POS_MSEC
- Make MP4 the default output, GIF opt-in via --gif flag
- Add return types to all functions
- Add Args/Returns docstrings
- Use descriptive variable names throughout

Made-with: Cursor

* refactor: move create_progress_videos.py to examples/dataset/ for consistency

Made-with: Cursor

* refactor: address PR review comments on create_progress_videos.py

- Replace Unicode ellipsis and multiplication sign with ASCII equivalents
- Fix step numbering from 1-5 to 1-4 (only 4 actual steps)
- Move frame_width reading into convert_mp4_to_gif
- Remove unused text_height variable

Made-with: Cursor
2026-04-02 16:58:07 +02:00
Pepijn 15934d8d08 feat(policies): add relative action support for pi0, pi0.5, and pi0_fast (#2970)
* Add option for pi family models to train with relative actions (relative to state)

* formatting

* add recomputation of stats and option to compute delta stats

* normalzie after delta conversion

* only recompute state for stats

* calulate chunk based stats

* sample 100k

* load from parquet

* sample 1m

* stats per chunck

* fix

* use quantiles

* stats for entire dataset

* fix

* max 1m frames

* compute before dist

* fix multi gpu processor bug

* Fix RTC with delta actions and OpenArms motor_type wiring

* feat: align pi0_fast delta actions with pi0/pi05 and add RTC integration tests

- Add delta_exclude_joints and action_feature_names to PI0FastConfig
- Move to_absolute_actions from modeling to processor pipeline for pi0_fast
- Add delta action detection and logging to eval_with_real_robot.py
- Add delta actions documentation to pi0 and pi05 READMEs
- Fix ruff lint issues in test_delta_actions.py
- Add test_rtc_delta_actions.py (24 tests) covering:
  - ActionQueue with delta vs absolute actions
  - RTC denoise step with delta leftovers
  - Full pipeline roundtrip (delta → RTC → absolute)
  - State rebasing approximation bounds
  - Non-delta policy compatibility
  - Multi-chunk consistency

* chore: clean up test comments, add OpenPI attribution, remove debug logging

- Replace decorative comment separators in test files with plain section headers
- Add attribution comments for 1e-6 epsilon in normalize_processor.py (from OpenPI)
- Remove debug logging blocks from lerobot_train.py

* refactor: extract compute_delta_action_stats into compute_stats.py

Move the ~70-line inline delta action stats block from lerobot_train.py
into a dedicated function in compute_stats.py, where all other stats
computation already lives. The training script now calls it in 6 lines.

* refactor: remove unused get_processed_left_over from ActionQueue

This method was never called outside of tests. Leftover actions for RTC
guidance are always retrieved via get_left_over() (delta/original space).

* revert: remove logging-only changes from eval_with_real_robot.py

The delta actions detection helper and log message added no functional
value — the script already handles delta policies correctly via the
processor pipeline.

* refactor: use ACTION/OBS_STATE constants instead of hardcoded strings

Replace hardcoded "action" and "observation.state" with ACTION and
OBS_STATE from utils.constants in compute_stats.py, dataset_tools.py,
and lerobot_train.py.

* style: remove stray blank lines in training loop

* refactor: move delta action stats to preprocessing step, remove on-the-fly computation

- Remove on-the-fly compute_delta_action_stats from lerobot_train.py
- Rewrite recompute_stats to delegate action stats to compute_delta_action_stats
  (chunk-based sampling matching what the model sees during training)
- Add chunk_size parameter to recompute_stats for delta action computation
- Add delta actions documentation to pi0.mdx and pi05.mdx

* feat: add recompute_stats CLI operation to lerobot-edit-dataset

* fix(tests): relax quantile normalization test tolerance for 1e-6 epsilon

* chore: remove agents_memory/pr_details.md from repo

* refactor: rename delta actions to relative actions throughout

What OpenPI calls "DeltaActions" is actually UMI's "relative trajectory"
representation: each action in the chunk is an offset from the current
state, not from the previous action. This avoids error accumulation.

Renamed across all source, tests, docs, and CLI:
- DeltaActionsProcessorStep → RelativeActionsProcessorStep
- to_delta_actions → to_relative_actions
- use_delta_actions → use_relative_actions
- delta_exclude_joints → relative_exclude_joints
- compute_delta_action_stats → compute_relative_action_stats
- delta_action_processor.py → relative_action_processor.py
- test_delta_actions.py → test_relative_actions.py

Kept as-is: AbsoluteActionsProcessorStep (converts TO absolute),
registry ID "delta_actions_processor" (backward compat), and unrelated
delta references (IK pipeline, Robosuite, RA-BC metrics, gym envs).

* docs: add Action Representations guide

Dedicated page explaining absolute, relative, and delta actions with
numerical examples, joint vs EE space, and how to use kinematics
pipelines and the relative action processor. References UMI paper
(Chi et al., 2024) for the terminology.

* docs: remove redundant OpenPI naming note from action representations

* docs: remove opinionated OpenPI reference from delta actions section

* docs: replace ASCII diagram with UMI paper figure

* docs: remove OpenPI reference from action representations

* docs: use HF-hosted image instead of local asset

* docs: clarify figure attribution

* revert: restore original normalization epsilon behavior

The 1e-6 unconditional epsilon change perturbed all normalized values,
breaking backward compatibility tests. The original approach (1e-8 eps
for MEAN_STD, conditional torch.where for QUANTILES) already handles
division by zero correctly without affecting non-degenerate cases.

* fix: restore delta_action_processor.py used by phone/RL teleop

The rename commit incorrectly deleted delta_action_processor.py and
duplicated its classes into relative_action_processor.py. Restore the
original file and import from it instead.

* fix(processor): address PR #2970 review comments

- Remove shebang from relative_action_processor.py (library module, not script)
- Add device alignment in to_relative_actions/to_absolute_actions so _last_state
  on CPU doesn't cause cross-device errors when actions are on CUDA
- Rename delta_step → relative_step in AbsoluteActionsProcessorStep for naming
  consistency; update factory.py, all processor files, and tests
- Expand _reconnect_relative_absolute_steps docstring to explain why post-hoc
  rewiring is needed after deserialization
- Fix off-by-one in compute_stats.py: sample_upper_bound = total_frames - chunk_size + 1
  so last valid start index is included and total_frames == chunk_size is not rejected
- Remove redundant NOTE comment in processor_pi05.py (duplicated two lines below)
- Fix pi0_fast processor ordering: move relative_step before NormalizerProcessorStep
  so normalizer sees delta actions (matching pi0/pi05); flip postprocessor to
  unnormalize → absolute accordingly. Relative stats are now required for all pi models
- Revert use_relative_joint_actions_aloha → use_delta_joint_actions_aloha in
  configuration_smolvla.py (preserve existing public API)
- Update action_representations.mdx: add missing joint to 6-DOF example, fix
  'based on a figure', clarify pi family ordering, add RTC compatibility section

* update rtc link

* feat: compute relative action stats over full dataset with optional parallelism

Remove the 100k sample cap from compute_relative_action_stats and process
all valid chunks. Vectorize with numpy (pre-load actions/states, fancy
indexing + broadcasting) for a large speedup over the per-index HF dataset
loop. Add num_workers param for thread-based parallelism (numpy releases
the GIL). Update docs to show --push_to_hub for recompute_stats.

* style: apply ruff formatting to compute_stats.py

* testing on real robot

* style: fix ruff format and remove redundant .keys() calls
2026-04-01 12:59:12 +02:00
Jai Kumaar Ratadia 9300352876 Fix SO-101 assembly instruction order to match videos (#3242)
* Fix SO-101 assembly instruction order to match videos

Motor horn installation steps were listed after placing motors
into the housing, but the assembly videos show installing horns
first. Reordered steps to match the videos, which is also the
easier approach since horns are harder to attach once the motor
is seated. Added missing detail that bottom horns don't require
screws.

* Update docs/source/so101.mdx

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Jai Kumaar Ratadia <jaikumaarratadia@gmail.com>

---------

Signed-off-by: Jai Kumaar Ratadia <jaikumaarratadia@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Pepijn <138571049+pkooij@users.noreply.github.com>
2026-03-31 12:16:34 +02:00
Steven Palma 720cf8e3a0 Revert "fix(deps): breaking change from transformers 5.4.0" (#3249)
* Revert "fix(deps): breaking change from transformers 5.4.0 (#3231)"

This reverts commit 07502868e5.

* chore(dependecies): pin transformers to 5.3.0 temporarily
2026-03-30 19:11:41 +02:00
Steven Palma 5d4fdf5088 feat(scripts): add transformers version (#3248)
* feat(scripts): add transformers and torch version

* chore(scripts): remove pytorch
2026-03-30 16:33:17 +02:00
四七 3b185f7f9d fix(datasets): remove unreachable timestamp branch in add_frame (#3163)
* fix(datasets): remove unreachable timestamp branch in add_frame and document caller contract

- Remove dead code: frame.pop("timestamp") branch in add_frame() could never
  execute because validate_frame() raises ValueError for any DEFAULT_FEATURES
  key (including timestamp) before we reach that line.
- Expand add_frame() docstring: explicitly document that timestamp and
  frame_index must NOT be passed by the caller.
- Add explanatory comment in validate_frame(): clarifies why DEFAULT_FEATURES
  are excluded from expected_features, preventing future re-introduction of
  the dead branch.

The dead branch originated in #1200, which fixed a shape-(1,) mismatch for a
code path that was subsequently made unreachable by a refactor of validate_frame.

* chore(datasets): narrow PR scope

* fix(datasets): move add_frame timestamp cleanup to dataset_writer
2026-03-28 11:37:57 +01:00
Bryson Jones 2e069b1c47 Feature/add multitask diffusion transformer policy implementation (#2545)
* Add multitask diffusion transformer policy

Add multitask diffusion transformer policy

* expand the observation encoder to support differnt size encoders for vision and text

* add RoPE attention module as this is shown to help training dynamics and generation quality for DiTs

* update readme and citations for multitask dit policy

* remove dino vision encoder and simplify text and vision encoders by removing inheritance structure

* adjust factory comment

* update docstring for multitask dit policy processor file

* simplify config for multitask dit by merging and flattening everything, then adding comments to denote where some parameters are only used for specific objectives

* add references to the modeling file comments

* merge all modules files into the main modeling file

* add torch.no_grad decorators

* split up select action return statement

* remove redundant asserts

* add tutorial to training with multi_task_dit

* fix bugs when testing on hardware

* remove environment state conditioning

* update typo in test instruction comment

* add processor tests to multitask dit tests

* move policy to top of file

* use constants for indexing into batches and remove env state references

* remove the base classes since we don't need to be able to extend

* fix nit formatting in generate actions fcn

* reformat and clean up tutorial for multitask dit policy

* add more descriptions and depth to multitask dit tutorial

* note origins of each training objective

* rename config param for multiple vision encoders

* refactor code to perform task tokenization in the processor instead of in the modeling code for multitask dit

* add multitask dit to toc for docs

* add conditional transformers import to match all other policies that use transformers lib

* add test handling for multitask dit when transformers isnt available

* skip tests without transformers

* remove cropping of images smaller than the crop size

* add kwargs arg to multitask dit constructor

* add wallx dep conflict management for multitask dit policy

* use hyphens for cleanliness in pyproject.toml

* add conflict management to pyproject toml for pi conflict for mtdp as well

* update tests script to not use unnecessary uv sync call which resolves dependencies that do not need to run. This drastically reduces CI run time

* revert fast tests edits

* update docs and readme files, fixing some typos and adding multitask dit to readme

* chore(dependencies): upgrade transformers + hggingface-hub + peft + scipy

* chore(dependencies): bump pi0 family to transformers v5

* chore(dependencies): bump wall x to transformers v5

* chore(dependencies): bump gr00t to transformers v5

* chore(style): fix pre-commit

* fix(policy): xvla forced_bos_token missing

* test(rl): skip ci tests for resnet10

* Fix: full pi models support for transformer v5 (#2967)

* fix(pi): remove loss truncation

* fix(pi): remove state padding before tokenization

* fix(pi): fix image padding value

* fix from_pretrain

* add transformer v5 changes

* remove reference

* more fixes

* make it work

* add support for rest of pi family

* add pifast work

* more changes

* more changes

* more cleanup

* fix torch params

* dtype fix

* torch compile

* embed mismatch fix

* revert groot

* more nit fixes

* remove unused classes

* more fixes

* revert

* nit

* torch dtype warning fix

* but back dynamic renaming

* add tie embedding

---------

Co-authored-by: Yufei Sun <skieyfly@gmail.com>

* chore: fix XVLA in transformers v5 (#3006)

* test(policies): enable wall x CI testing

* style(test): pre-commit check

* style(test): pre-commit

---------

Signed-off-by: Bryson Jones <63133702+brysonjones@users.noreply.github.com>
Co-authored-by: Pepijn <138571049+pkooij@users.noreply.github.com>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Jade Choghari <chogharijade@gmail.com>
Co-authored-by: Yufei Sun <skieyfly@gmail.com>
Co-authored-by: Steven Palma <steven.palma@huggingface.co>
2026-03-28 00:41:26 +01:00
Steven Palma 4e45acca52 fix(dataset): use revision-safe Hub cache for downloaded datasets (#3233)
* refactor(dataset): enhance dataset root directory handling and introduce hub cache support

- Updated DatasetConfig and LeRobotDatasetMetadata to clarify root directory behavior and introduce a dedicated hub cache for downloads.
- Refactored LeRobotDataset and StreamingLeRobotDataset to utilize the new hub cache and improve directory management.
- Added tests to ensure correct behavior when using the hub cache and handling different revisions without a specified root directory.

* refactor(dataset): improve root directory handling in LeRobotDataset

- Updated LeRobotDataset to store the requested root path separately from the actual root path.
- Adjusted metadata loading to use the requested root, enhancing clarity and consistency in directory management.

* refactor(dataset): minor improvements for hub cache support

* chore(datasets): guard in resume + assertion test

---------

Co-authored-by: AdilZouitine <adilzouitinegm@gmail.com>
Co-authored-by: mickaelChen <mickael.chen.levinson@gmail.com>
2026-03-27 22:21:55 +01:00
Maxime Ellerbach 975d89b38d chore(docs): add more guidance to bring your own policies tutorial (#3230)
* chore(docs): add more guidance to bring your own policies tutorial

* removing normalization to avoid confusion with processors

* trailing whitespace

* Update docs/source/bring_your_own_policies.mdx

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Maxime Ellerbach <maxime@ellerbach.net>

* Update docs/source/bring_your_own_policies.mdx

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Maxime Ellerbach <maxime@ellerbach.net>

* adding get optim params and predict_action chunk

* removing extra quotes

---------

Signed-off-by: Maxime Ellerbach <maxime@ellerbach.net>
2026-03-27 21:25:37 +01:00
Maxime Ellerbach 07502868e5 fix(deps): breaking change from transformers 5.4.0 (#3231)
* fix(deps): breaking change from transformers 5.4.0

* Update src/lerobot/policies/xvla/modeling_florence2.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Maxime Ellerbach <maxime@ellerbach.net>

* Update src/lerobot/policies/wall_x/qwen_model/qwen2_5_vl_moe.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Maxime Ellerbach <maxime@ellerbach.net>

* removing dataclass

* bumping transformers 5.4.0

---------

Signed-off-by: Maxime Ellerbach <maxime@ellerbach.net>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-03-27 21:25:12 +01:00
Reece O'Mahoney aa9cc9bd43 fix(logging): suppress noisy httpx INFO logs (#3173)
Set httpx logger level to WARNING in init_logging to prevent
HTTP request traces from flooding the terminal during train and
eval scripts.

Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
2026-03-26 21:05:15 +01:00
Steven Palma 123495250b refactor(dataset): split LeRobotDataset into DatasetReader & DatasetWriter (+ API cleanup) (#3180)
* refactor(dataset): split reader and writer

* chore(dataset): remove proxys

* refactor(dataset): better reader & writer encapsulation

* refactor(datasets): clean API + reduce leaky implementations

* refactor(dataset): API cleaning for writer, reader and meta

* refactor(dataset): expose writer & reader + other minor improvements

* refactor(dataset): improve teardown routine

* refactor(dataset): add hf_dataset property at the facade level

* chore(dataset): add init for datasset module

* docs(dataset): add docstrings for public API of the dataset classes

* tests(dataset): add tests for new classes

* fix(dataset): remove circular dependecy
2026-03-26 19:09:25 +01:00
Jade Choghari 017ff73fbf chore(docs): add rename map and empty cam guide (#3065)
* add blog/guide

* add to tree

* chore(docs): rephrase rename_map docs for clarity and simplicity

---------

Co-authored-by: Steven Palma <steven.palma@huggingface.co>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
2026-03-23 13:57:53 -07:00