lerobot

mirror of https://github.com/huggingface/lerobot.git synced 2026-05-17 09:39:47 +00:00

Author	SHA1	Message	Date
Pepijn	51025c6cad	fix(robotwin): pin compatible curobo in benchmark image	2026-04-21 18:41:16 +02:00
Pepijn	05783401d3	Merge remote-tracking branch 'origin/feat/robotwin-benchmark' into feat/robotwin-benchmark	2026-04-20 17:31:28 +02:00
Pepijn	cfaeea6b1a	Merge branch 'main' into feat/robotwin-benchmark Resolves conflicts introduced by the RoboCasa365 benchmark merge on main (PR #3375) overlapping with this PR's benchmark CI scaffolding. - .github/workflows/benchmark_tests.yml: keep both the RoboTwin 2.0 job (from this branch) and the new RoboCasa365 job (from main). - docs/source/_toctree.yml: list both `robotwin` and `robocasa` pages under the "Benchmarks" section. - scripts/ci/extract_task_descriptions.py: keep both `_robotwin_descriptions` and `_robocasa_descriptions` helpers and wire them into `main()` alongside each other. Made-with: Cursor	2026-04-20 17:17:00 +02:00
Pepijn	e699e52388	feat(envs): add RoboCasa365 benchmark integration (#3375 ) * feat(envs): add RoboCasa365 benchmark integration Add RoboCasa365 (arXiv:2603.04356) as a new simulation benchmark with 365 everyday kitchen manipulation tasks across 2,500 diverse environments. New files: - src/lerobot/envs/robocasa.py: gym.Env wrapper with deferred env creation, flat 12D action / 16D state vectors, 3-camera support - docs/source/robocasa.mdx: user-facing documentation - docker/Dockerfile.benchmark.robocasa: CI benchmark image Modified files: - src/lerobot/envs/configs.py: RoboCasaEnv config (--env.type=robocasa) - pyproject.toml: robocasa optional dependency group - docs/source/_toctree.yml: sidebar entry - .github/workflows/benchmark_tests.yml: integration test job Refs: https://arxiv.org/abs/2603.04356, https://robocasa.ai Related: huggingface/lerobot#321 * fix(docker): use uv pip to install robocasa in benchmark image The huggingface/lerobot-gpu base image uses `uv` with a venv at /lerobot/.venv — `pip` is not on PATH, so `pip install` fails with "pip: not found". Switch to `uv pip install` which installs into the existing venv. Also drop the @v1.0.0 tag pin from the robocasa git URL since the upstream repo may not have that tag; use default branch instead. * fix(robocasa): editable install + switch to lerobot/smolvla_robocasa - pip install from git omits data files like box_links_assets.json (not declared in package_data). Clone and install editable so the source tree is used at runtime. - Download only tex + fixtures_lw asset types (smoke test doesn't need objaverse/aigen objects). Pipe 'y' to auto-accept download prompt. - Switch CI policy from pepijn223/smolvla_robocasa to lerobot/smolvla_robocasa. * fix(docker): re-install lerobot editably after COPY The nightly huggingface/lerobot-gpu image predates the RoboCasaEnv registration — so `lerobot-eval --env.type=robocasa` fails at argparse with "invalid choice" even after COPY . . overlays the new source. 
Force an editable reinstall so the venv picks up the current configs.py. * fix(ci): add rename_map for robocasa eval (image* -> camera) Policy lerobot/smolvla_robocasa expects observation.images.camera1/2/3, but RoboCasaEnv produces observation.images.image/image2/image3. fix(robocasa): override RoboCasaGymEnv default split (test -> all) RoboCasaGymEnv defaults split="test", but create_env only accepts {None, "all", "pretrain", "target"}, so the out-of-the-box default crashes with ValueError. Always pass "all" when split is None. * fix(docker): also download objs_lw (lightwheel objects) for robocasa Kitchen tasks (e.g. CloseFridge) reference lightwheel object meshes like Stool022/model.xml. fixtures_lw alone isn't enough — we also need objs_lw. Still skipping objaverse/aigen to keep image size down. Made-with: Cursor * feat(robocasa): raw camera names + benchmark-group task shortcuts Align the LeRobot env with RoboCasa's native conventions so policies trained on the upstream datasets don't need a --rename_map at eval time, and expose the standard task groups as first-class --env.task values. - Preserve raw RoboCasa camera names (e.g. robot0_agentview_left) as observation.images.<name> end-to-end. Drops camera_name_mapping and DEFAULT_CAMERA_NAME_MAPPING; features/features_map are now built dynamically from the parsed camera list. - Accept benchmark-group names as --env.task: atomic_seen, composite_seen, composite_unseen, pretrain50/100/200/300. Expanded lazily via robocasa.utils.dataset_registry and auto-sets the split ("target" \| "pretrain"). - Update CI smoke-eval rename_map to map raw cam names to the camera1/2/3 keys expected by lerobot/smolvla_robocasa. * docs(robocasa): single-task smolvla train+eval recipe on pepijn223/robocasa_CloseFridge - Rewrite observation section to use raw RoboCasa camera keys (observation.images.robot0_agentview_{left,right}, observation.images.robot0_eye_in_hand). 
- Add a "Training on a single task" section with a full smolvla training command on pepijn223/robocasa_CloseFridge, plus matching single-task eval command. - Document benchmark-group task shortcuts (atomic_seen, composite_seen, composite_unseen, pretrain50/100/200/300) as valid --env.task values. * fix(robocasa): restrict obj_registries to lightwheel by default CloseFridge (and most kitchen tasks) crashed at reset with `ValueError: Probabilities contain NaN` coming out of `sample_kitchen_object_helper`. RoboCasa's upstream default `obj_registries=("objaverse", "lightwheel")` normalizes per-registry candidate counts as probabilities; when a sampled category has zero mjcf paths in every configured registry (because the objaverse asset pack isn't on disk — ~30GB, skipped by our Docker build), the 0/0 divide yields NaNs and `rng.choice` raises. - Add `obj_registries: list[str] = ["lightwheel"]` to `RoboCasaEnv` config; thread it through `create_robocasa_envs`, `_make_env_fns`, and the gym.Env wrapper to the underlying `RoboCasaGymEnv` (which forwards to `create_env` → `robosuite.make` → kitchen env). - Default matches what `download_kitchen_assets --type objs_lw` actually ships, so the env works out of the box without a 30GB objaverse download. - Document the override (`--env.obj_registries='[objaverse,lightwheel]'`) for users who have downloaded the full asset set. * fix(docker): also download tex_generative for robocasa benchmark RoboCasa's lightwheel kitchen fixtures embed references to `generative_textures/wall/tex.png` directly in their MuJoCo XML, so `MjModel.from_xml_string` errors out at reset time with "No such file or directory" even when the env is constructed with `generative_textures=None`. The generative textures live under a separate asset registry key (`tex_generative`) in `download_kitchen_assets`, distinct from the base `tex` pack we were already fetching. - Add `tex_generative` to the download list so the fixture XMLs resolve. 
- Document the remaining omissions (objaverse/aigen, ~30GB) and how the runtime side pairs this with obj_registries=["lightwheel"] to avoid sampling from categories whose assets aren't on disk. ci(robocasa): smoke-eval 10 atomic tasks instead of 1 Broader coverage in the benchmark CI job: evaluate SmolVLA on ten fixture-centric atomic RoboCasa tasks (one episode each) instead of just CloseFridge. The tasks are all drawn from TARGET_TASKS.atomic_seen and selected to avoid object-manipulation categories that would require the objaverse/aigen asset packs (we only ship objs_lw in the Docker image, paired with obj_registries=["lightwheel"] on the runtime side). Tasks: CloseFridge, OpenCabinet, OpenDrawer, TurnOnMicrowave, TurnOffStove, CloseToasterOvenDoor, SlideDishwasherRack, TurnOnSinkFaucet, NavigateKitchen, TurnOnElectricKettle. `scripts/ci/parse_eval_metrics.py` already handles multi-task output via the `overall` key, so no parser changes needed. Bumped the metrics artifact's task label to `atomic_smoke_10` to reflect the grouping. * fix(pyproject): drop unresolvable robocasa extra robocasa's upstream setup.py hardcodes `lerobot==0.3.3` in install_requires. Exposing it as the `lerobot[robocasa]` extra made uv's dep resolver cycle: `lerobot[robocasa]` -> robocasa -> lerobot (a different version) -> unsolvable. This broke every `uv sync` — even invocations with an unrelated extra like `--extra test` — because uv validates the whole lockfile graph. - Remove the `robocasa` extra from pyproject.toml. Installation instructions in docs/source/robocasa.mdx now walk users through the manual `git clone` + `pip install --no-deps` flow, which matches what the Docker image already does and sidesteps the cyclic dep entirely. - Dockerfile: `uv pip install -e ~/robocasa --no-deps` so the shadowed lerobot==0.3.3 never lands in the image; install robocasa's actual runtime deps (numpy, numba, scipy, mujoco, tianshou, etc.) explicitly. 
* docs(robocasa): align page with adding_benchmarks template Rework docs/source/robocasa.mdx to follow the standard benchmark doc structure: intro + links + available tasks (with family breakdown and first-class benchmark-group shortcuts) + installation + eval + recommended episodes + policy I/O + training + reproducing results. - Fix the paper link (was pointing at a non-existent arxiv ID). - Surface lerobot/smolvla_robocasa and pepijn223/robocasa_CloseFridge in the top-of-page links so they're findable without reading the training section. - Add an explicit "Object registries" subsection explaining the `--env.obj_registries=[objaverse,lightwheel]` override path. - Add an explicit "Reproducing published results" section pointing at the CI smoke eval. * fix: integrate PR #3375 review feedback - envs(robocasa): hoist the duplicated `_parse_camera_names` helper out of `libero.py` and `robocasa.py` into `envs/utils.py` as the public `parse_camera_names`; call sites updated. - envs(robocasa): give each factory a distinct `episode_index` (`0..n_envs-1`) and derive a per-worker seed series in `reset()` so n_envs workers don't all roll the same scene under a shared outer seed. - envs(robocasa): drop the unused `*kwargs` on `_make_env`; declare `visualization_height` / `visualization_width` on both the wrapper and the `RoboCasaEnv` config + propagate via `gym_kwargs`. - envs(robocasa): emit `info["final_info"]` on termination (matching MetaWorld) so downstream vector-env auto-reset keeps the terminal task/success flags. - docs(robocasa): add `--rename_map` (robot0_agentview_left/ eye_in_hand/agentview_right → camera1/2/3) plus CI-parity flags to all three eval snippets. - docker(robocasa): pin robocasa + robosuite git SHAs and the pip dep versions (pygame, Pillow, opencv-python, pyyaml, pynput, tqdm, termcolor, imageio, h5py, lxml, hidapi, gymnasium) for reproducible benchmark images. 
- ci(robocasa): update the workflow comment — there is no `lerobot[robocasa]` extra; robocasa/robosuite are installed manually because upstream's `lerobot==0.3.3` pin shadows ours. docs(robocasa): add benchmark banner image * fix(envs): preserve AsyncVectorEnv metadata/unwrapped in lazy eval envs Port of #3416 onto this branch. Also threads the cached metadata through the RoboCasa factory so async eval on `--env.type=robocasa` keeps the same improvement. * fix: integrate PR #3375 review feedback (round 2) - envs(robocasa): when the caller passes `seed=None` to `reset()`, fall back to `self.episode_index` for the inner env seed so each worker still samples a distinct trajectory instead of all workers inheriting the same global RNG state. - envs(robocasa): replace the two module-level `print()` calls in `create_robocasa_envs` with `logger.info(...)` via a module-level `logger = logging.getLogger(__name__)`. - ci(robocasa): run `scripts/ci/extract_task_descriptions.py` after the eval so `metrics.json` carries per-task natural-language labels, matching LIBERO / MetaWorld / VLABench jobs. Added a `_robocasa_descriptions()` extractor that splits CamelCase task names into word-level labels keyed by `<task>_0`.	2026-04-20 17:10:53 +02:00
Haoming Song	b2765b39b8	Cache lazy async env metadata for eval (#3416) Co-authored-by: Pepijn <138571049+pkooij@users.noreply.github.com>	2026-04-20 15:33:13 +02:00
Pepijn	d3909da83a	Merge branch 'main' into feat/robotwin-benchmark	2026-04-20 15:28:45 +02:00
Pepijn	1157fb11e6	fix: integrate PR #3315 review feedback - envs(robotwin): default `observation_height/width` in `create_robotwin_envs` to `DEFAULT_CAMERA_H/W` (240/320) so they match the D435 dims baked into `task_config/demo_clean.yml`. - envs(robotwin): resolve `task_config/demo_clean.yml` via `CONFIGS_PATH` instead of a cwd-relative path; works regardless of where `lerobot-eval` is invoked. - envs(robotwin): replace `print()` calls in `create_robotwin_envs` with `logger.info(...)` (module-level `logger = logging.getLogger`). - envs(robotwin): use `_LazyAsyncVectorEnv` for the async path so async workers start lazily (matches LIBERO / RoboCasa / VLABench). - envs(robotwin): cast `agent_pos` space + joint-state output to float32 end-to-end (was mixed float64/float32). - envs(configs): use the existing `_make_vec_env_cls(use_async, n_envs)` helper in `RoboTwinEnvConfig.create_envs`; drop the `get_env_processors` override so RoboTwin uses the identity processor inherited from `EnvConfig`. - processor: delete `RoboTwinProcessorStep` — the float32 cast now happens in the wrapper itself, so the processor is redundant. - tests: drop the `TestRoboTwinProcessorStep` suite; update the mock obs fixture to use float32 `joint_action.vector`. - ci: hoist `ROBOTWIN_POLICY` and `ROBOTWIN_TASKS` to job-level env vars so the task list and policy aren't duplicated across eval / extract / parse steps. - docker: pin RoboTwin + CuRobo upstream clones to commit SHAs (`RoboTwin@0aeea2d6`, `curobo@ca941586`) for reproducibility. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 15:18:41 +02:00
Pepijn	777b808c70	ci: skip Docker Hub login step on fork PRs (#3417) On fork PRs, `secrets.DOCKERHUB_LEROBOT_*` expand to empty strings, which fails `docker/login-action@v3` with `Error: Username and password required` before any of the actual build/eval work runs. Gate the login step on the env-var expansion of the username so the step is skipped (not failed) when secrets are absent. On the main repo + maintainer-approved fork runs (`pull_request_review` path), the secrets resolve normally, the step runs, and image pulls get the authenticated Docker Hub rate limit. Scope: only `benchmark_tests.yml`, the lone benchmark workflow that triggers on `pull_request` from forks. `full_tests.yml` and `latest_deps_tests.yml` run under `pull_request_review` / schedule / workflow_dispatch, where secrets are already guaranteed. Context: surfaced on #3416 where an external contributor's PR failed at the login step before any test could run. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 15:14:35 +02:00
Pepijn	0fed8b45c2	ci: gate Docker Hub login on secret availability Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 14:27:06 +02:00
Pepijn	c9c6a6ae3d	fix(envs): preserve AsyncVectorEnv metadata/unwrapped in lazy eval envs Port of #3416 onto this branch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 14:05:39 +02:00
Pepijn	2cc147d946	test(robotwin): lower task-count floor from 60 to 50 ROBOTWIN_TASKS was trimmed to 50 tasks (see comment in `src/lerobot/envs/robotwin.py:48`), but the assertion still required ≥60, causing CI failures. Align the test with the current upstream task count. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 10:12:58 +02:00
Pepijn	f1ba581ec4	Merge branch 'main' into feat/robotwin-benchmark	2026-04-20 08:45:34 +01:00
Defalt	5c43fa1cce	fix(policies): replace deprecated torch.cuda.amp.autocast with torch.amp.autocast (#3167) Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>	2026-04-19 16:25:08 +02:00
k1000dai	3f16d98a9b	episods→episodes (#3410) Fixing typo	2026-04-19 12:58:06 +02:00
whats2000	52f508c51c	fix(dataset): cleanup_interrupted_episode wipes image temp dirs (#3405)	2026-04-19 12:04:24 +02:00
Steven Palma	a8b72d9615	feat(dataset): 2x faster dataloader via parallel decode, uint8 transport, and persistent workers (#3406) * feat(dataset): 2xfaster dataloader * fix(dataset): streaming return uint8 decode * fix(tests): adjust normalization step comparison * fix(dataset): with threadexecutor + False default * chore(dataset): make it a config * fix(test): account for uint8 in training path testing	2026-04-19 00:08:22 +02:00
Steven Palma	760220d532	chore(dependencies): update uv.lock (#3365) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2026-04-18 22:32:05 +02:00
Shu Jiuhe	a99943ca26	Improve loading performance in `_absolute_to_relative_idx` when remapping indices (#3279)	2026-04-18 19:28:50 +02:00
Cheng Yin	a9821af61b	fix(record): pass rename_map to make_policy in lerobot-record (#3240) * fix(record): pass rename_map to make_policy in lerobot-record Fixes #3181. The rename_map from dataset config was used for preprocessor construction but not passed to make_policy(), causing feature mismatch errors when camera key names differ between dataset and model config. make_policy() already accepts a rename_map parameter and uses it to skip visual feature consistency validation when remapping is active, but lerobot_record.py was not passing it through. * style: fix ruff format for ternary expression --------- Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>	2026-04-17 16:40:08 +02:00
Pepijn	46803e88bd	fix(robotwin): sync ROBOTWIN_TASKS + doc with upstream (50 tasks) The local ROBOTWIN_TASKS tuple drifted from upstream RoboTwin-Platform/RoboTwin. Users passing names like `close_laptop`, `close_microwave`, `dump_bin`, `place_block`, `pour_water`, `fold_cloth`, etc. got past our validator (the names were in the tuple) but then crashed inside robosuite with a confusing error, because those tasks don't exist in upstream `envs/`. - Replace ROBOTWIN_TASKS with a verbatim mirror of upstream's `envs/` directory: 50 tasks as of main (was 60 with many stale entries). Added a `gh api`-based one-liner comment so future bumps are mechanical. - Update the `60 tasks` claims in robotwin.mdx and RoboTwinEnvConfig's docstring to `50`. - Replace the stale example-task table in robotwin.mdx with ten upstream-confirmed examples, and flag `open_laptop` as temporarily broken (its `check_success()` uses `self.arm_tag` which is only set inside `play_once()`; eval-mode callers hit AttributeError). - Rebuild the "Full benchmark" command with the actual 50-task list, omitting `open_laptop`. Made-with: Cursor	2026-04-17 15:29:57 +01:00
Pepijn	9ead70f016	fix(ci): swap 4 broken RoboTwin tasks in smoke eval The smoke eval hit two upstream issues: - `open_laptop`: bug in OpenMOSS/RoboTwin main — `check_success()` uses `self.arm_tag`, but that attribute is only set inside `play_once()` (the scripted-expert path). During eval `take_action()` calls `check_success()` directly, hitting `AttributeError: 'open_laptop' object has no attribute 'arm_tag'`. - `close_laptop`, `close_microwave`, `place_block`: not present in upstream RoboTwin `envs/` at all — our ROBOTWIN_TASKS tuple drifted from upstream and these names leaked into CI. Replace the four broken tasks with upstream-confirmed equivalents that exist both in ROBOTWIN_TASKS and in RoboTwin's `envs/`: `adjust_bottle`, `lift_pot`, `stamp_seal`, `turn_switch`. New 10-task smoke set: beat_block_hammer, click_bell, handover_block, stack_blocks_two, click_alarmclock, open_microwave, adjust_bottle, lift_pot, stamp_seal, turn_switch. Made-with: Cursor	2026-04-17 15:18:20 +01:00
Pepijn	84bb033631	ci(robotwin): smoke-eval 10 tasks instead of 5 Broader coverage on the RoboTwin 2.0 benchmark CI job: bump the smoke eval from 5 tasks to 10 (one episode each). Added tasks are all drawn from ROBOTWIN_TASKS and mirror the shape/complexity of the existing set (simple single-object or single-fixture manipulations). Tasks now run: beat_block_hammer, click_bell, handover_block, open_laptop, stack_blocks_two, click_alarmclock, close_laptop, close_microwave, open_microwave, place_block. `parse_eval_metrics.py` reads `overall` for multi-task runs so no parser change is needed. Bumped the step name and the metrics label to reflect the 10-task layout. Made-with: Cursor	2026-04-17 13:40:12 +01:00
Steven Palma	d4a229444b	fix(ci): not fail when skipped (#3399)	2026-04-17 12:02:38 +02:00
Steven Palma	098ebb4d72	feat(ci): send slack notification if latest dependecy test is broken (#3398)	2026-04-17 11:28:24 +02:00
Pepijn	78201f3226	Merge branch 'main' into feat/robotwin-benchmark	2026-04-16 18:57:39 +02:00
Pepijn	4ccc4e9a66	fix(docs): use plain markdown image to fix MDX build Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 18:57:27 +02:00
Pepijn	fdbbc35cca	fix(docs): use correct RoboTwin 2.0 teaser image URL Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 18:54:12 +02:00
Pepijn	741a6d5246	fix(docs): correct RoboTwin 2.0 paper arxiv link Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 18:53:55 +02:00
Pepijn	a4102ee86d	fix: integrate PR #3315 review feedback - ci: add Docker Hub login step, add HF_USER_TOKEN guard on eval step - docker: tie patches to pinned versions with removal guidance, remove unnecessary HF_TOKEN for public dataset, fix hadolint warnings - docs: fix paper link to arxiv, add teaser image, fix camera names (4→3 cameras), fix observation dims (480x640→240x320) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 18:33:53 +02:00
Maxime Ellerbach	9bc2df80bb	chore(docs): adding a jupyter notebook that gives you ready-to-paste commands (#3395) * chore(docs): adding an example quickstart jupyter notebook that gives you ready-to-paste commands * some fixes in the commands * uv lock * Adding notebook to all Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> Signed-off-by: Maxime Ellerbach <maxime@ellerbach.net> * uv lock again --------- Signed-off-by: Maxime Ellerbach <maxime@ellerbach.net> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-04-16 17:53:35 +02:00
Remy	bd74f6733d	chore: bump doc-builder SHA for PR upload workflow (#3386)	2026-04-15 12:15:24 +02:00
Pepijn	ad9662b4a8	fix(robotwin): defer YAML lookup and realign tests with current API __init__ was eagerly calling _load_robotwin_setup_kwargs just to read head_camera_h/w from the YAML. That import (`from envs import CONFIGS_PATH`) required a real RoboTwin install, so constructing the env — and thus every test in tests/envs/test_robotwin.py — blew up with ModuleNotFoundError on fast-tests where RoboTwin isn't installed. Replace the eager lookup with DEFAULT_CAMERA_H/W constants (240×320, the D435 dims baked into task_config/demo_clean.yml). reset() still resolves the full setup_kwargs lazily — that's fine because reset() is only called inside the benchmark Docker image where RoboTwin is present. Also resync the test file with the current env API: - mock get_obs() as the real nested {"observation": {cam: {"rgb": …}}, "joint_action": {"vector": …}} shape - patch both _load_robotwin_task and _load_robotwin_setup_kwargs (_patch_load → _patch_runtime) - drop `front_camera` / `left_wrist` from assertions — aloha-agilex exposes head_camera + left_camera + right_camera, not those - black-frame test now uses left_camera as the missing camera - setup_demo call check loosened to the caller-provided seed/is_test bits (full kwargs include the YAML-derived blob) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 11:14:02 +02:00
Pepijn	f291d3bfa9	docs(robotwin): add robotwin to _toctree.yml under Benchmarks doc-builder's TOC integrity check was rejecting the branch because docs/source/robotwin.mdx existed but wasn't listed in _toctree.yml. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 11:06:34 +02:00
Pepijn	49186359b0	refactor(robotwin): rebase docker image on huggingface/lerobot-gpu Mirror the libero/metaworld/libero_plus/robomme pattern: start from the nightly GPU image (apt deps, python, uv, venv, lerobot[all] already there) and layer on only what RoboTwin 2.0 uniquely needs — cuda-nvcc + cuda-cudart-dev (CuRobo builds from source), Vulkan libs + NVIDIA ICD (SAPIEN renderer), sapien/mplib/open3d/pytorch3d/curobo installs, the mplib + sapien upstream patches, and the TianxingChen asset download. Drops ~90 lines of duplicated base setup (CUDA FROM, apt python, uv install, user creation, venv init, base lerobot install). 199 → 110. Also repoint the docs + env docstring dataset link from hxma/RoboTwin-LeRobot-v3.0 to the canonical lerobot/robotwin_unified. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 11:05:07 +02:00
Steven Palma	6f4a96333e	chore(docs): update contributing (#3387)	2026-04-15 11:02:37 +02:00
Pepijn	99792bb17b	ci: point benchmark eval checkpoints at the lerobot/ org mirrors pepijn223/smolvla_* → lerobot/smolvla_* across every benchmark job in this branch (libero, metaworld, and the per-branch benchmark). The checkpoints were mirrored into the lerobot/ org and that's the canonical location going forward. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 11:02:06 +02:00
Steven Palma	9021d2d240	refactor(imports): enforce guard pattern (#3382) * refactor(imports): enforce guard pattern * fix(tests): skip reachy2 if not installed * Address review feedback	2026-04-14 22:54:05 +02:00
Pepijn	e67ceb213d	feat(robotwin): eval 5 diverse tasks per CI run with NL descriptions Widen the smoke eval from a single task (beat_block_hammer) to five: click_bell, handover_block, open_laptop, stack_blocks_two on top of the original. Each gets its own rollout video in videos/<task>_0/ so the dashboard can surface visually distinct behaviours. extract_task_descriptions.py now has a RoboTwin branch that reads `description/task_instruction/<task>.json` (already shipped in the clone at /opt/robotwin) and pulls the `full_description` field. CI cds into the clone before invoking the script so the relative path resolves. parse_eval_metrics.py is invoked with the same 5-task list so the metrics.json embeds one entry per task. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 21:03:15 +02:00
Pepijn	793f52e360	fix(robotwin): install av-dep so lerobot_eval can write rollout MP4s write_video (utils/io_utils.py:53) lazily imports PyAV via require_package and raises silently inside the video-writing thread when the extra is not installed — so the eval itself succeeds with pc_success=100 but no MP4 ever lands in videos/, and the artifact upload reports "No files were found". Add av-dep to the install line (same pattern as the RoboMME image). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 20:24:03 +02:00
Pepijn	ae113e0d99	fix(robotwin): expose _max_episode_steps for lerobot_eval.rollout rollout() does `env.call("_max_episode_steps")` (lerobot_eval.py:157) to know when to stop stepping. LiberoEnv and MetaworldEnv set this attribute; RoboTwinEnv was tracking the limit under `episode_length` only, so the call raised AttributeError once CuRobo finished warming up. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 20:00:36 +02:00
Pepijn	61a0269560	fix(robotwin): align observation_space dims with D435 camera output lerobot_eval crashed in gym.vector's SyncVectorEnv.reset with: ValueError: Output array is the wrong shape because RoboTwinEnvConfig declared observation_space = (480, 640, 3) but task_config/demo_clean.yml specifies head_camera_type=D435, which renders (240, 320, 3). gym.vector.concatenate pre-allocates a buffer from the declared space, so the first np.stack raises on shape mismatch. Changes: - Config defaults now 240×320 (the D435 dims in _camera_config.yml), with a comment pointing at the source of truth. - RoboTwinEnv.__init__ accepts observation_height/width as Optional and falls back to setup_kwargs["head_camera_h/w"] so the env is self-consistent even if the config is not in sync. - Config camera_names / features_map use the actual aloha-agilex camera names (head_camera, left_camera, right_camera). Drops the stale "front_camera" and "left_wrist"/"right_wrist" entries that never matched anything RoboTwin exposes. - CI workflow's rename_map updated to match the new camera names. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 19:35:01 +02:00
Pepijn	c2160ca86e	refactor(robotwin): drop defensive dict guards, cache black fallback frame _get_obs was guarding every dict access with isinstance(..., dict) in case RoboTwin's get_obs returned something else — but the API contract (envs/_base_task.py:437) always returns a dict, so the guards were silently masking real failures behind plausible-looking zero observations. Drop them. Also: - Cache a single black fallback frame in __init__ instead of allocating a fresh np.zeros((H, W, 3), uint8) for every missing camera on every step — the "camera not exposed" set is static per env. - Only allocate the zero joint_state on the fallback path (not unconditionally before the real value overwrites it). - Replace .flatten() with .ravel() (no copy when already 1-D). - Fold the nested-dict schema comment and two identical torch.enable_grad() rationales into a single Autograd section in the class docstring. - Fix stale `left_wrist` camera name in the observation docstring. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 18:39:40 +02:00
Pepijn	f40b30202b	fix(robotwin): read nested get_obs() output and use aloha-agilex camera names RoboTwin's base_task.get_obs() returns a nested dict: {"observation": {cam: {"rgb": ..., "intrinsic_matrix": ...}}, "joint_action": {"left_arm": ..., "left_gripper": ..., "right_arm": ..., "right_gripper": ..., "vector": np.ndarray}, "endpose": {...}} Our _get_obs was reading raw["{cam}_rgb"] / raw["{cam}"] and raw["joint_action"] as if they were flat, so np.asarray(raw["joint_action"], dtype=float64) tripped on a dict and raised TypeError: float() argument must be a string or a real number, not 'dict' Fix: - Pull images from raw["observation"][cam]["rgb"] - Pull joint state from raw["joint_action"]["vector"] (the flat array) - Update the default camera tuple to (head_camera, left_camera, right_camera) to match RoboTwin's actual wrist-camera names (envs/camera/camera.py:135-151) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 18:25:08 +02:00
Khalil Meftah	60e7d67cb8	fix: catch KeyboardInterrupt in safe_stop_image_writer to prevent corrupted frames (#3381)	2026-04-14 18:22:56 +02:00
Pepijn	b06f134fe4	fix(robotwin): re-enable autograd for CuRobo planner warmup and take_action lerobot_eval wraps the full rollout in torch.no_grad() (lerobot_eval.py:566), but RoboTwin's setup_demo → load_robot → CuroboPlanner(...) runs motion_gen.warmup(), which invokes Newton's-method trajectory optimization. That optimizer calls cost.backward() internally, which raises RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn when autograd is disabled. take_action() hits the same planner path at every step. Wrap both setup_demo and take_action in torch.enable_grad() so CuRobo's optimizer can build its computation graph. Policy inference is unaffected — rollout()'s inner torch.inference_mode() block around select_action() is untouched, so we still don't allocate grad buffers during policy forward. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 17:39:21 +02:00
Pepijn	5558ea2207	feat(envs): add RoboTwin 2.0 benchmark integration - RoboTwinEnvConfig with 4-camera setup (head/front/left_wrist/right_wrist) - Docker image with SAPIEN, mplib, CuRobo, pytorch3d (Python 3.12) - CI workflow: 1-episode smoke eval with pepijn223/smolvla_robotwin - RoboTwinProcessorStep for state float32 casting - Camera rename_map: head_camera/front_camera/left_wrist -> camera1/2/3 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 16:47:44 +02:00
Radu	1ede000bdd	fix(rl): swap dict merge order to preserve teleop intervention flag (#3273) Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co>	2026-04-14 16:20:54 +02:00
Khalil Meftah	d57c58a532	fix: add thread synchronization to ReplayBuffer to prevent race condition between add() and sample() (#3372)	2026-04-14 13:16:45 +02:00
Matteo Tiezzi	b3e76a92f2	fix(groot): compatibility fixes for gr00t in v0.5 (#3182) * fix(groot): apply groot 0.5 fixes * fix(groot): correct indentation and add tile count in Eagle25VL processor * Fixed lint7/style	2026-04-14 13:09:18 +02:00
Khalil Meftah	f5c801fd34	fix(test): add missing device placement in multi-task DiT tests (#3349)	2026-04-14 12:25:29 +02:00
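
Commits 9ead70f016 and 46803e88bd describe why `open_laptop` fails during eval: `check_success()` reads `self.arm_tag`, but that attribute is only assigned inside `play_once()`, which the eval path never calls. The failure pattern can be reproduced with a toy class. `ToyTask` and `eval_step_fails` below are hypothetical names for illustration only, not RoboTwin code.

```python
class ToyTask:
    """Toy stand-in for a task class; NOT the RoboTwin implementation."""

    def play_once(self):
        # Scripted-expert path: the only place the attribute is assigned.
        self.arm_tag = "left"

    def check_success(self):
        # Reads an attribute that may never have been set.
        return self.arm_tag == "left"

    def take_action(self, action):
        # Eval path: goes straight to the success check.
        return self.check_success()


def eval_step_fails(task):
    """Return True if the eval path raises AttributeError."""
    try:
        task.take_action(action=None)
    except AttributeError:
        return True
    return False


if __name__ == "__main__":
    print("fresh task fails:", eval_step_fails(ToyTask()))
```

Assigning a default for the attribute in the constructor would be one way to avoid this class of error; whether upstream takes that route is not stated in the log.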


1430 Commits
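
Commit e699e52388 mentions a `_robocasa_descriptions()` extractor that splits CamelCase task names into word-level labels keyed by `<task>_0`. A minimal sketch of that idea follows. The function name `camel_case_labels`, the regex, and the lowercasing are assumptions made for illustration; the code in `scripts/ci/extract_task_descriptions.py` may differ.

```python
import re

# Hypothetical helper; not the repository's implementation.
# Zero-width boundary between a lowercase letter or digit and an uppercase letter.
_WORD_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")


def camel_case_labels(task_names):
    """Map '<task>_0' keys to space-separated, lowercased task names."""
    labels = {}
    for name in task_names:
        words = _WORD_BOUNDARY.sub(" ", name)
        labels[f"{name}_0"] = words.lower()
    return labels


if __name__ == "__main__":
    for key, label in camel_case_labels(["CloseFridge", "TurnOnSinkFaucet"]).items():
        print(key, "->", label)
```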