license + peft-dep + init groot + flat import layering utils dataset

fix fast tests
update fast ci tests
2026-05-12 23:29:52 +00:00 · 2026-04-12 16:43:24 +02:00 · 2026-04-12 14:46:23 +02:00 · 2026-04-12 14:11:50 +02:00 · 2026-04-12 13:52:45 +02:00 · 2026-04-12 12:19:26 +02:00
237 changed files with 4846 additions and 19865 deletions
@@ -2,6 +2,11 @@

 Short, imperative summary (e.g., "fix(robots): handle None in sensor parser"). See [CONTRIBUTING.md](../CONTRIBUTING.md) for PR conventions.

+## Type / Scope
+
+- **Type**: (Bug | Feature | Docs | Performance | Test | CI | Chore)
+- **Scope**: (optional — name of module or package affected)
+
 ## Summary / Motivation

 - One-paragraph description of what changes and why.
@@ -14,14 +19,28 @@ Short, imperative summary (e.g., "fix(robots): handle None in sensor parser"). S

 ## What changed

- Short, concrete bullets explaining the functional changes (how the behavior or output differs now).
+- Short, concrete bullets of the modifications (files/behaviour).
 - Short note if this introduces breaking changes and migration steps.

 ## How was this tested (or how to run locally)

- Tests added: list new tests or test files. `pytest -q tests/ -k <keyword>`
+- Tests added: list new tests or test files.
 - Manual checks / dataset runs performed.
- Instructions for the reviewer for reproducing with a quick example or CLI (if applicable)
+- Instructions for the reviewer
+
+Example:
+
+- Ran the relevant tests:
+
+  ```bash
+  pytest -q tests/ -k <keyword>
+  ```
+
+- Reproduce with a quick example or CLI (if applicable):
+
+  ```bash
+  lerobot-train --some.option=true
+  ```

 ## Checklist (required before merge)

@@ -29,7 +48,6 @@ Short, imperative summary (e.g., "fix(robots): handle None in sensor parser"). S
 - [ ] All tests pass locally (`pytest`)
 - [ ] Documentation updated
 - [ ] CI is green
- [ ] Community Review: I have reviewed another contributor's open PR and linked it here: # (insert PR number/link)

 ## Reviewer notes

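The template above asks for a short, imperative title such as "fix(robots): handle None in sensor parser". A minimal sketch of how such a title could be checked locally is shown below. The helper name `parse_pr_title` and the list of accepted type prefixes are hypothetical and chosen for illustration; they are not taken from CONTRIBUTING.md or from any tooling in this repository.

```python
import re

# Hypothetical pattern for titles shaped like
# "fix(robots): handle None in sensor parser".
# The accepted type prefixes are an assumption made for this sketch.
TITLE_RE = re.compile(
    r"^(?P<type>fix|feat|docs|perf|test|ci|chore)"  # change type
    r"(?:\((?P<scope>[a-z0-9_\-]+)\))?"             # optional (scope)
    r": (?P<summary>\S.*)$"                         # non-empty summary
)


def parse_pr_title(title: str):
    """Return (type, scope, summary), or None if the title does not match."""
    match = TITLE_RE.match(title.strip())
    if match is None:
        return None
    return match.group("type"), match.group("scope"), match.group("summary")
```

With this pattern the scope is optional, which mirrors the "Scope: (optional ...)" bullet in the template; a title without a recognised type prefix is rejected rather than guessed at.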
@@ -1,951 +0,0 @@
-# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-# Integration tests: build an isolated Docker image per benchmark and run a
-# 1-episode smoke eval. Each benchmark gets its own image so incompatible
-# dependency trees (e.g. hf-libero vs metaworld==3.0.0) can never collide.
-#
-# To add a new benchmark:
-#   1. Add docker/Dockerfile.benchmark.<name>  (install only lerobot[<name>])
-#   2. Copy one of the jobs below and adjust the image name and eval command.
-name: Benchmark Integration Tests
-
-on:
-  # Run manually from the Actions tab
-  workflow_dispatch:
-
-  # Run every Monday at 02:00 UTC.
-  schedule:
-    - cron: "0 2 * * 1"
-
-  push:
-    branches:
-      - main
-    paths:
-      - "src/lerobot/envs/**"
-      - "src/lerobot/scripts/lerobot_eval.py"
-      - "docker/Dockerfile.benchmark.*"
-      - ".github/workflows/benchmark_tests.yml"
-      - "pyproject.toml"
-
-  pull_request:
-    branches:
-      - main
-    paths:
-      - "src/lerobot/envs/**"
-      - "src/lerobot/scripts/lerobot_eval.py"
-      - "docker/Dockerfile.benchmark.*"
-      - ".github/workflows/benchmark_tests.yml"
-      - "pyproject.toml"
-
-permissions:
-  contents: read
-
-env:
-  UV_VERSION: "0.8.0"
-  PYTHON_VERSION: "3.12"
-
-# Cancel in-flight runs for the same branch/PR.
-concurrency:
-  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
-  cancel-in-progress: true
-
-jobs:
-  # ── LIBERO ────────────────────────────────────────────────────────────────
-  # Isolated image: lerobot[libero] only (hf-libero, dm-control, mujoco chain)
-  libero-integration-test:
-    name: Libero — build image + 1-episode eval
-    runs-on:
-      group: aws-g6-4xlarge-plus
-    env:
-      HF_USER_TOKEN: ${{ secrets.LEROBOT_HF_USER }}
-
-    steps:
-      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd  # v6.0.2
-        with:
-          persist-credentials: false
-          lfs: true
-
-      - name: Set up Docker Buildx
-        uses: docker/setup-buildx-action@v3 # zizmor: ignore[unpinned-uses]
-        with:
-          cache-binary: false
-
-      - name: Login to Docker Hub
-        if: ${{ env.DOCKERHUB_USERNAME != '' }}
-        uses: docker/login-action@v3 # zizmor: ignore[unpinned-uses]
-        with:
-          username: ${{ secrets.DOCKERHUB_LEROBOT_USERNAME }}
-          password: ${{ secrets.DOCKERHUB_LEROBOT_PASSWORD }}
-        env:
-          DOCKERHUB_USERNAME: ${{ secrets.DOCKERHUB_LEROBOT_USERNAME }}
-
-      # Build the benchmark-specific image. The Dockerfile separates dep-install
-      # from source-copy, so code-only changes skip the slow uv-sync layer
-      # when the runner has a warm Docker daemon cache.
-      - name: Build Libero benchmark image
-        uses: docker/build-push-action@v6 # zizmor: ignore[unpinned-uses]
-        with:
-          context: .
-          file: docker/Dockerfile.benchmark.libero
-          push: false
-          load: true
-          tags: lerobot-benchmark-libero:ci
-
-      - name: Run Libero smoke eval (1 episode)
-        if: env.HF_USER_TOKEN != ''
-        run: |
-          # Named container (no --rm) so we can docker cp artifacts out.
-          # Output to /tmp inside the container — /artifacts doesn't exist
-          # and user_lerobot cannot create root-level dirs.
-          docker run --name libero-eval --gpus all \
-            --shm-size=4g \
-            -e HF_HOME=/tmp/hf \
-            -e HF_USER_TOKEN="${HF_USER_TOKEN}" \
-            -e HF_HUB_DOWNLOAD_TIMEOUT=300 \
-            lerobot-benchmark-libero:ci \
-            bash -c "
-              hf auth login --token \"\$HF_USER_TOKEN\" --add-to-git-credential 2>/dev/null || true
-              lerobot-eval \
-                --policy.path=lerobot/smolvla_libero \
-                --env.type=libero \
-                --env.task=libero_spatial \
-                --eval.batch_size=1 \
-                --eval.n_episodes=1 \
-                --eval.use_async_envs=false \
-                --policy.device=cuda \
-                '--env.camera_name_mapping={\"agentview_image\": \"camera1\", \"robot0_eye_in_hand_image\": \"camera2\"}' \
-                --policy.empty_cameras=1 \
-                --output_dir=/tmp/eval-artifacts
-              python scripts/ci/extract_task_descriptions.py \
-                --env libero --task libero_spatial \
-                --output /tmp/eval-artifacts/task_descriptions.json
-            "
-
-      - name: Copy Libero artifacts from container
-        if: always()
-        run: |
-          mkdir -p /tmp/libero-artifacts
-          docker cp libero-eval:/tmp/eval-artifacts/. /tmp/libero-artifacts/ 2>/dev/null || true
-          docker rm -f libero-eval || true
-
-      - name: Parse Libero eval metrics
-        if: always()
-        run: |
-          python3 scripts/ci/parse_eval_metrics.py \
-            --artifacts-dir /tmp/libero-artifacts \
-            --env libero \
-            --task libero_spatial \
-            --policy lerobot/smolvla_libero
-
-      - name: Upload Libero rollout video
-        if: always()
-        uses: actions/upload-artifact@v4 # zizmor: ignore[unpinned-uses]
-        with:
-          name: libero-rollout-video
-          path: /tmp/libero-artifacts/videos/
-          if-no-files-found: warn
-
-      - name: Upload Libero eval metrics
-        if: always()
-        uses: actions/upload-artifact@v4 # zizmor: ignore[unpinned-uses]
-        with:
-          name: libero-metrics
-          path: /tmp/libero-artifacts/metrics.json
-          if-no-files-found: warn
-
-      # ── LIBERO TRAIN+EVAL SMOKE ──────────────────────────────────────────────
-      # Train SmolVLA for 1 step (batch_size=1, dataset episode 0 only) then
-      # immediately runs eval inside the training loop (eval_freq=1, 1 episode).
-      # Tests the full train→eval-within-training pipeline end-to-end.
-      - name: Run Libero train+eval smoke (1 step, eval_freq=1)
-        if: env.HF_USER_TOKEN != ''
-        run: |
-          docker run --name libero-train-smoke --gpus all \
-            --shm-size=4g \
-            -e HF_HOME=/tmp/hf \
-            -e HF_USER_TOKEN="${HF_USER_TOKEN}" \
-            -e HF_HUB_DOWNLOAD_TIMEOUT=300 \
-            lerobot-benchmark-libero:ci \
-            bash -c "
-              hf auth login --token \"\$HF_USER_TOKEN\" --add-to-git-credential 2>/dev/null || true
-              accelerate launch --num_processes=1 \$(which lerobot-train) \
-                --policy.path=lerobot/smolvla_base \
-                --policy.load_vlm_weights=true \
-                --policy.scheduler_decay_steps=25000 \
-                --policy.freeze_vision_encoder=false \
-                --policy.train_expert_only=false \
-                --dataset.repo_id=lerobot/libero \
-                --dataset.episodes=[0] \
-                --dataset.use_imagenet_stats=false \
-                --env.type=libero \
-                --env.task=libero_spatial \
-                '--env.camera_name_mapping={\"agentview_image\": \"camera1\", \"robot0_eye_in_hand_image\": \"camera2\"}' \
-                --policy.empty_cameras=1 \
-                --output_dir=/tmp/train-smoke \
-                --steps=1 \
-                --batch_size=1 \
-                --eval_freq=1 \
-                --eval.n_episodes=1 \
-                --eval.batch_size=1 \
-                --eval.use_async_envs=false \
-                --save_freq=1 \
-                --policy.push_to_hub=false \
-                '--rename_map={\"observation.images.image\": \"observation.images.camera1\", \"observation.images.image2\": \"observation.images.camera2\"}'
-            "
-
-      - name: Copy Libero train-smoke artifacts from container
-        if: always()
-        run: |
-          mkdir -p /tmp/libero-train-smoke-artifacts
-          docker cp libero-train-smoke:/tmp/train-smoke/. /tmp/libero-train-smoke-artifacts/ 2>/dev/null || true
-          docker rm -f libero-train-smoke || true
-
-      - name: Upload Libero train-smoke eval video
-        if: always()
-        uses: actions/upload-artifact@v4 # zizmor: ignore[unpinned-uses]
-        with:
-          name: libero-train-smoke-video
-          path: /tmp/libero-train-smoke-artifacts/eval/
-          if-no-files-found: warn
-
-  # ── METAWORLD ─────────────────────────────────────────────────────────────
-  # Isolated image: lerobot[metaworld] only (metaworld==3.0.0, mujoco>=3 chain)
-  metaworld-integration-test:
-    name: MetaWorld — build image + 1-episode eval
-    runs-on:
-      group: aws-g6-4xlarge-plus
-    env:
-      HF_USER_TOKEN: ${{ secrets.LEROBOT_HF_USER }}
-
-    steps:
-      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd  # v6.0.2
-        with:
-          persist-credentials: false
-          lfs: true
-
-      - name: Set up Docker Buildx
-        uses: docker/setup-buildx-action@v3 # zizmor: ignore[unpinned-uses]
-        with:
-          cache-binary: false
-
-      - name: Login to Docker Hub
-        if: ${{ env.DOCKERHUB_USERNAME != '' }}
-        uses: docker/login-action@v3 # zizmor: ignore[unpinned-uses]
-        with:
-          username: ${{ secrets.DOCKERHUB_LEROBOT_USERNAME }}
-          password: ${{ secrets.DOCKERHUB_LEROBOT_PASSWORD }}
-        env:
-          DOCKERHUB_USERNAME: ${{ secrets.DOCKERHUB_LEROBOT_USERNAME }}
-
-      - name: Build MetaWorld benchmark image
-        uses: docker/build-push-action@v6 # zizmor: ignore[unpinned-uses]
-        with:
-          context: .
-          file: docker/Dockerfile.benchmark.metaworld
-          push: false
-          load: true
-          tags: lerobot-benchmark-metaworld:ci
-
-      - name: Run MetaWorld smoke eval (1 episode)
-        if: env.HF_USER_TOKEN != ''
-        run: |
-          docker run --name metaworld-eval --gpus all \
-            --shm-size=4g \
-            -e HF_HOME=/tmp/hf \
-            -e HF_USER_TOKEN="${HF_USER_TOKEN}" \
-            -e HF_HUB_DOWNLOAD_TIMEOUT=300 \
-            lerobot-benchmark-metaworld:ci \
-            bash -c "
-              hf auth login --token \"\$HF_USER_TOKEN\" --add-to-git-credential 2>/dev/null || true
-              lerobot-eval \
-                --policy.path=lerobot/smolvla_metaworld \
-                --env.type=metaworld \
-                --env.task=metaworld-push-v3 \
-                --eval.batch_size=1 \
-                --eval.n_episodes=1 \
-                --eval.use_async_envs=false \
-                --policy.device=cuda \
-                '--rename_map={\"observation.image\": \"observation.images.camera1\"}' \
-                --policy.empty_cameras=2 \
-                --output_dir=/tmp/eval-artifacts
-              python scripts/ci/extract_task_descriptions.py \
-                --env metaworld --task metaworld-push-v3 \
-                --output /tmp/eval-artifacts/task_descriptions.json
-            "
-
-      - name: Copy MetaWorld artifacts from container
-        if: always()
-        run: |
-          mkdir -p /tmp/metaworld-artifacts
-          docker cp metaworld-eval:/tmp/eval-artifacts/. /tmp/metaworld-artifacts/ 2>/dev/null || true
-          docker rm -f metaworld-eval || true
-
-      - name: Parse MetaWorld eval metrics
-        if: always()
-        run: |
-          python3 scripts/ci/parse_eval_metrics.py \
-            --artifacts-dir /tmp/metaworld-artifacts \
-            --env metaworld \
-            --task metaworld-push-v3 \
-            --policy lerobot/smolvla_metaworld
-
-      - name: Upload MetaWorld rollout video
-        if: always()
-        uses: actions/upload-artifact@v4 # zizmor: ignore[unpinned-uses]
-        with:
-          name: metaworld-rollout-video
-          path: /tmp/metaworld-artifacts/videos/
-          if-no-files-found: warn
-
-      - name: Upload MetaWorld eval metrics
-        if: always()
-        uses: actions/upload-artifact@v4 # zizmor: ignore[unpinned-uses]
-        with:
-          name: metaworld-metrics
-          path: /tmp/metaworld-artifacts/metrics.json
-          if-no-files-found: warn
-
-  # ── ROBOTWIN 2.0 ──────────────────────────────────────────────────────────
-  # Isolated image: full RoboTwin 2.0 stack — SAPIEN, mplib, CuRobo,
-  # pytorch3d, + simulation assets (~4 GB).
-  # Build takes ~20 min on first run; subsequent runs hit the layer cache.
-  # Requires an NVIDIA GPU runner with CUDA 12.1 drivers.
-  robotwin-integration-test:
-    name: RoboTwin 2.0 — build image + 1-episode eval
-    runs-on:
-      group: aws-g6-4xlarge-plus
-    env:
-      HF_USER_TOKEN: ${{ secrets.LEROBOT_HF_USER }}
-      ROBOTWIN_POLICY: lerobot/smolvla_robotwin
-      ROBOTWIN_TASKS: beat_block_hammer,click_bell,handover_block,stack_blocks_two,click_alarmclock,open_microwave,adjust_bottle,lift_pot,stamp_seal,turn_switch
-
-    steps:
-      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd  # v6.0.2
-        with:
-          persist-credentials: false
-          lfs: true
-
-      - name: Set up Docker Buildx
-        uses: docker/setup-buildx-action@v3 # zizmor: ignore[unpinned-uses]
-        with:
-          cache-binary: false
-
-      - name: Login to Docker Hub
-        if: ${{ env.DOCKERHUB_USERNAME != '' }}
-        uses: docker/login-action@v3 # zizmor: ignore[unpinned-uses]
-        with:
-          username: ${{ secrets.DOCKERHUB_LEROBOT_USERNAME }}
-          password: ${{ secrets.DOCKERHUB_LEROBOT_PASSWORD }}
-        env:
-          DOCKERHUB_USERNAME: ${{ secrets.DOCKERHUB_LEROBOT_USERNAME }}
-
-      # Build the full-install image: SAPIEN, mplib, CuRobo, pytorch3d +
-      # simulation assets (~4 GB). Layer cache lives in the runner's local
-      # Docker daemon — reused across re-runs on the same machine.
-      - name: Build RoboTwin 2.0 benchmark image
-        uses: docker/build-push-action@v6 # zizmor: ignore[unpinned-uses]
-        with:
-          context: .
-          file: docker/Dockerfile.benchmark.robotwin
-          push: false
-          load: true
-          tags: lerobot-benchmark-robotwin:ci
-          cache-from: type=local,src=/tmp/.buildx-cache-robotwin
-          cache-to: type=local,dest=/tmp/.buildx-cache-robotwin,mode=max
-
-      - name: Run RoboTwin 2.0 smoke eval (10 tasks, 1 episode each)
-        if: env.HF_USER_TOKEN != ''
-        run: |
-          # Named container (no --rm) so we can docker cp artifacts out.
-          docker run --name robotwin-eval --gpus all \
-            --shm-size=4g \
-            -e HF_HOME=/tmp/hf \
-            -e HF_USER_TOKEN="${HF_USER_TOKEN}" \
-            -e ROBOTWIN_POLICY="${ROBOTWIN_POLICY}" \
-            -e ROBOTWIN_TASKS="${ROBOTWIN_TASKS}" \
-            lerobot-benchmark-robotwin:ci \
-            bash -c "
-              hf auth login --token \"\$HF_USER_TOKEN\" --add-to-git-credential 2>/dev/null || true
-              cd /opt/robotwin && lerobot-eval \
-                --policy.path=\"\$ROBOTWIN_POLICY\" \
-                --env.type=robotwin \
-                --env.task=\"\$ROBOTWIN_TASKS\" \
-                --env.max_parallel_tasks=5 \
-                --eval.batch_size=1 \
-                --eval.n_episodes=1 \
-                --eval.use_async_envs=false \
-                --policy.device=cuda \
-                '--rename_map={\"observation.images.head_camera\": \"observation.images.camera1\", \"observation.images.left_camera\": \"observation.images.camera2\", \"observation.images.right_camera\": \"observation.images.camera3\"}' \
-                --output_dir=/tmp/eval-artifacts
-              python /lerobot/scripts/ci/extract_task_descriptions.py \
-                --env robotwin \
-                --task \"\$ROBOTWIN_TASKS\" \
-                --output /tmp/eval-artifacts/task_descriptions.json
-            "
-
-      - name: Copy RoboTwin artifacts from container
-        if: always()
-        run: |
-          mkdir -p /tmp/robotwin-artifacts
-          docker cp robotwin-eval:/tmp/eval-artifacts/. /tmp/robotwin-artifacts/ 2>/dev/null || true
-          docker rm -f robotwin-eval || true
-
-      - name: Parse RoboTwin eval metrics
-        if: always()
-        run: |
-          python3 scripts/ci/parse_eval_metrics.py \
-            --artifacts-dir /tmp/robotwin-artifacts \
-            --env robotwin \
-            --task "${ROBOTWIN_TASKS}" \
-            --policy "${ROBOTWIN_POLICY}"
-
-      - name: Upload RoboTwin rollout video
-        if: always()
-        uses: actions/upload-artifact@v4
-        with:
-          name: robotwin-rollout-video
-          path: /tmp/robotwin-artifacts/videos/
-          if-no-files-found: warn
-
-      - name: Upload RoboTwin eval metrics
-        if: always()
-        uses: actions/upload-artifact@v4
-        with:
-          name: robotwin-metrics
-          path: /tmp/robotwin-artifacts/metrics.json
-          if-no-files-found: warn
-
-  # ── ROBOCASA365 ──────────────────────────────────────────────────────────
-  # Isolated image: robocasa + robosuite installed manually as editable
-  # clones (no `lerobot[robocasa]` extra — robocasa's setup.py pins
-  # `lerobot==0.3.3`, which would shadow this repo's lerobot).
-  robocasa-integration-test:
-    name: RoboCasa365 — build image + 1-episode eval
-    runs-on:
-      group: aws-g6-4xlarge-plus
-    env:
-      HF_USER_TOKEN: ${{ secrets.LEROBOT_HF_USER }}
-
-    steps:
-      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd  # v6.0.2
-        with:
-          persist-credentials: false
-          lfs: true
-
-      - name: Set up Docker Buildx
-        uses: docker/setup-buildx-action@v3 # zizmor: ignore[unpinned-uses]
-        with:
-          cache-binary: false
-
-      - name: Login to Docker Hub
-        if: ${{ env.DOCKERHUB_USERNAME != '' }}
-        uses: docker/login-action@v3 # zizmor: ignore[unpinned-uses]
-        with:
-          username: ${{ secrets.DOCKERHUB_LEROBOT_USERNAME }}
-          password: ${{ secrets.DOCKERHUB_LEROBOT_PASSWORD }}
-        env:
-          DOCKERHUB_USERNAME: ${{ secrets.DOCKERHUB_LEROBOT_USERNAME }}
-
-      - name: Build RoboCasa365 benchmark image
-        uses: docker/build-push-action@v6 # zizmor: ignore[unpinned-uses]
-        with:
-          context: .
-          file: docker/Dockerfile.benchmark.robocasa
-          push: false
-          load: true
-          tags: lerobot-benchmark-robocasa:ci
-
-      - name: Run RoboCasa365 smoke eval (10 atomic tasks, 1 episode each)
-        if: env.HF_USER_TOKEN != ''
-        run: |
-          docker run --name robocasa-eval --gpus all \
-            --shm-size=4g \
-            -e HF_HOME=/tmp/hf \
-            -e HF_USER_TOKEN="${HF_USER_TOKEN}" \
-            -e HF_HUB_DOWNLOAD_TIMEOUT=300 \
-            -e MUJOCO_GL=egl \
-            lerobot-benchmark-robocasa:ci \
-            bash -c "
-              hf auth login --token \"\$HF_USER_TOKEN\" --add-to-git-credential 2>/dev/null || true
-              lerobot-eval \
-                --policy.path=lerobot/smolvla_robocasa \
-                --env.type=robocasa \
-                --env.task=CloseFridge,OpenCabinet,OpenDrawer,TurnOnMicrowave,TurnOffStove,CloseToasterOvenDoor,SlideDishwasherRack,TurnOnSinkFaucet,NavigateKitchen,TurnOnElectricKettle \
-                --env.max_parallel_tasks=5 \
-                --eval.batch_size=1 \
-                --eval.n_episodes=1 \
-                --eval.use_async_envs=false \
-                --policy.device=cuda \
-                '--rename_map={\"observation.images.robot0_agentview_left\": \"observation.images.camera1\", \"observation.images.robot0_eye_in_hand\": \"observation.images.camera2\", \"observation.images.robot0_agentview_right\": \"observation.images.camera3\"}' \
-                --output_dir=/tmp/eval-artifacts
-              python scripts/ci/extract_task_descriptions.py \
-                --env robocasa \
-                --task CloseFridge,OpenCabinet,OpenDrawer,TurnOnMicrowave,TurnOffStove,CloseToasterOvenDoor,SlideDishwasherRack,TurnOnSinkFaucet,NavigateKitchen,TurnOnElectricKettle \
-                --output /tmp/eval-artifacts/task_descriptions.json
-            "
-
-      - name: Copy RoboCasa365 artifacts from container
-        if: always()
-        run: |
-          mkdir -p /tmp/robocasa-artifacts
-          docker cp robocasa-eval:/tmp/eval-artifacts/. /tmp/robocasa-artifacts/ 2>/dev/null || true
-          docker rm -f robocasa-eval || true
-
-      - name: Parse RoboCasa365 eval metrics
-        if: always()
-        run: |
-          python3 scripts/ci/parse_eval_metrics.py \
-            --artifacts-dir /tmp/robocasa-artifacts \
-            --env robocasa \
-            --task atomic_smoke_10 \
-            --policy lerobot/smolvla_robocasa
-
-      - name: Upload RoboCasa365 rollout video
-        if: always()
-        uses: actions/upload-artifact@v4 # zizmor: ignore[unpinned-uses]
-        with:
-          name: robocasa-rollout-video
-          path: /tmp/robocasa-artifacts/videos/
-          if-no-files-found: warn
-
-      - name: Upload RoboCasa365 eval metrics
-        if: always()
-        uses: actions/upload-artifact@v4 # zizmor: ignore[unpinned-uses]
-        with:
-          name: robocasa-metrics
-          path: /tmp/robocasa-artifacts/metrics.json
-          if-no-files-found: warn
-
-  # ── ROBOCEREBRA ───────────────────────────────────────────────────────────
-  # Reuses the LIBERO simulator (libero_10 suite) with RoboCerebra camera
-  # defaults (image/wrist_image). The image is layered on
-  # huggingface/lerobot-gpu, which already ships [libero] as part of [all].
-  robocerebra-integration-test:
-    name: RoboCerebra — build image + 1-episode eval
-    runs-on:
-      group: aws-g6-4xlarge-plus
-    env:
-      HF_USER_TOKEN: ${{ secrets.LEROBOT_HF_USER }}
-
-    steps:
-      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd  # v6.0.2
-        with:
-          persist-credentials: false
-          lfs: true
-
-      - name: Set up Docker Buildx
-        uses: docker/setup-buildx-action@v3 # zizmor: ignore[unpinned-uses]
-        with:
-          cache-binary: false
-
-      - name: Login to Docker Hub
-        if: ${{ env.DOCKERHUB_USERNAME != '' }}
-        uses: docker/login-action@v3 # zizmor: ignore[unpinned-uses]
-        with:
-          username: ${{ secrets.DOCKERHUB_LEROBOT_USERNAME }}
-          password: ${{ secrets.DOCKERHUB_LEROBOT_PASSWORD }}
-        env:
-          DOCKERHUB_USERNAME: ${{ secrets.DOCKERHUB_LEROBOT_USERNAME }}
-
-      - name: Build RoboCerebra benchmark image
-        uses: docker/build-push-action@v6 # zizmor: ignore[unpinned-uses]
-        with:
-          context: .
-          file: docker/Dockerfile.benchmark.robocerebra
-          push: false
-          load: true
-          tags: lerobot-benchmark-robocerebra:ci
-          cache-from: type=local,src=/tmp/.buildx-cache-robocerebra
-          cache-to: type=local,dest=/tmp/.buildx-cache-robocerebra,mode=max
-
-      - name: Run RoboCerebra smoke eval (1 episode)
-        if: env.HF_USER_TOKEN != ''
-        run: |
-          docker run --name robocerebra-eval --gpus all \
-            --shm-size=4g \
-            -e HF_HOME=/tmp/hf \
-            -e HF_USER_TOKEN="${HF_USER_TOKEN}" \
-            -e HF_HUB_DOWNLOAD_TIMEOUT=300 \
-            -e LIBERO_DATA_FOLDER=/tmp/libero_data \
-            lerobot-benchmark-robocerebra:ci \
-            bash -c "
-              hf auth login --token \"\$HF_USER_TOKEN\" --add-to-git-credential 2>/dev/null || true
-              lerobot-eval \
-                --policy.path=lerobot/smolvla_robocerebra \
-                --env.type=libero \
-                --env.task=libero_10 \
-                --env.fps=20 \
-                --env.obs_type=pixels_agent_pos \
-                --env.observation_height=256 \
-                --env.observation_width=256 \
-                '--env.camera_name_mapping={\"agentview_image\": \"image\", \"robot0_eye_in_hand_image\": \"wrist_image\"}' \
-                --eval.batch_size=1 \
-                --eval.n_episodes=1 \
-                --eval.use_async_envs=false \
-                --policy.device=cuda \
-                '--rename_map={\"observation.images.image\": \"observation.images.camera1\", \"observation.images.wrist_image\": \"observation.images.camera2\"}' \
-                --policy.empty_cameras=1 \
-                --output_dir=/tmp/eval-artifacts
-              python scripts/ci/extract_task_descriptions.py \
-                --env libero --task libero_10 \
-                --output /tmp/eval-artifacts/task_descriptions.json
-            "
-
-      - name: Copy RoboCerebra artifacts from container
-        if: always()
-        run: |
-          mkdir -p /tmp/robocerebra-artifacts
-          docker cp robocerebra-eval:/tmp/eval-artifacts/. /tmp/robocerebra-artifacts/ 2>/dev/null || true
-          docker rm -f robocerebra-eval || true
-
-      - name: Parse RoboCerebra eval metrics
-        if: always()
-        run: |
-          python3 scripts/ci/parse_eval_metrics.py \
-            --artifacts-dir /tmp/robocerebra-artifacts \
-            --env robocerebra \
-            --task libero_10 \
-            --policy lerobot/smolvla_robocerebra
-
-      - name: Upload RoboCerebra rollout video
-        if: always()
-        uses: actions/upload-artifact@v4 # zizmor: ignore[unpinned-uses]
-        with:
-          name: robocerebra-rollout-video
-          path: /tmp/robocerebra-artifacts/videos/
-          if-no-files-found: warn
-
-      - name: Upload RoboCerebra eval metrics
-        if: always()
-        uses: actions/upload-artifact@v4 # zizmor: ignore[unpinned-uses]
-        with:
-          name: robocerebra-metrics
-          path: /tmp/robocerebra-artifacts/metrics.json
-          if-no-files-found: warn
-
-  # ── ROBOMME ───────────────────────────────────────────────────────────────
-  # Isolated image: mani-skill/SAPIEN/Vulkan chain with gymnasium and numpy
-  # overrides (robomme can't be a pyproject extra due to numpy<2 pin).
-  robomme-integration-test:
-    name: RoboMME — build image + 1-episode eval
-    runs-on:
-      group: aws-g6-4xlarge-plus
-    env:
-      HF_USER_TOKEN: ${{ secrets.LEROBOT_HF_USER }}
-      ROBOMME_POLICY: lerobot/smolvla_robomme
-      ROBOMME_TASKS: PickXtimes,BinFill,StopCube,MoveCube,InsertPeg,SwingXtimes,VideoUnmask,ButtonUnmask,PickHighlight,PatternLock
-
-    steps:
-      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
-        with:
-          persist-credentials: false
-          lfs: true
-
-      - name: Set up Docker Buildx
-        uses: docker/setup-buildx-action@v3 # zizmor: ignore[unpinned-uses]
-        with:
-          cache-binary: false
-
-      - name: Login to Docker Hub
-        if: ${{ env.DOCKERHUB_USERNAME != '' }}
-        uses: docker/login-action@v3 # zizmor: ignore[unpinned-uses]
-        with:
-          username: ${{ secrets.DOCKERHUB_LEROBOT_USERNAME }}
-          password: ${{ secrets.DOCKERHUB_LEROBOT_PASSWORD }}
-        env:
-          DOCKERHUB_USERNAME: ${{ secrets.DOCKERHUB_LEROBOT_USERNAME }}
-
-      - name: Build RoboMME benchmark image
-        uses: docker/build-push-action@v6 # zizmor: ignore[unpinned-uses]
-        with:
-          context: .
-          file: docker/Dockerfile.benchmark.robomme
-          push: false
-          load: true
-          tags: lerobot-benchmark-robomme:ci
-
-      - name: Run RoboMME smoke eval (10 tasks, 1 episode each)
-        if: env.HF_USER_TOKEN != ''
-        run: |
-          docker run --name robomme-eval --gpus all \
-            --shm-size=4g \
-            -e HF_HOME=/tmp/hf \
-            -e HF_USER_TOKEN="${HF_USER_TOKEN}" \
-            -e HF_HUB_DOWNLOAD_TIMEOUT=300 \
-            -e ROBOMME_POLICY="${ROBOMME_POLICY}" \
-            -e ROBOMME_TASKS="${ROBOMME_TASKS}" \
-            lerobot-benchmark-robomme:ci \
-            bash -c "
-              hf auth login --token \"\$HF_USER_TOKEN\" --add-to-git-credential 2>/dev/null || true
-              lerobot-eval \
-                --policy.path=\"\$ROBOMME_POLICY\" \
-                --env.type=robomme \
-                --env.task=\"\$ROBOMME_TASKS\" \
-                --env.dataset_split=test \
-                --env.task_ids=[0] \
-                --env.max_parallel_tasks=5 \
-                --eval.batch_size=1 \
-                --eval.n_episodes=1 \
-                --eval.use_async_envs=false \
-                --policy.device=cuda \
-                '--rename_map={\"observation.images.image\": \"observation.images.camera1\", \"observation.images.wrist_image\": \"observation.images.camera2\"}' \
-                --policy.empty_cameras=3 \
-                --output_dir=/tmp/eval-artifacts
-              python scripts/ci/extract_task_descriptions.py \
-                --env robomme --task \"\$ROBOMME_TASKS\" \
-                --output /tmp/eval-artifacts/task_descriptions.json
-            "
-
-      - name: Copy RoboMME artifacts from container
-        if: always()
-        run: |
-          mkdir -p /tmp/robomme-artifacts
-          docker cp robomme-eval:/tmp/eval-artifacts/. /tmp/robomme-artifacts/ 2>/dev/null || true
-          docker rm -f robomme-eval || true
-
-      - name: Parse RoboMME eval metrics
-        if: always()
-        run: |
-          python3 scripts/ci/parse_eval_metrics.py \
-            --artifacts-dir /tmp/robomme-artifacts \
-            --env robomme \
-            --task "${ROBOMME_TASKS}" \
-            --policy "${ROBOMME_POLICY}"
-
-      - name: Upload RoboMME rollout video
-        if: always()
-        uses: actions/upload-artifact@v4 # zizmor: ignore[unpinned-uses]
-        with:
-          name: robomme-rollout-video
-          path: /tmp/robomme-artifacts/videos/
-          if-no-files-found: warn
-
-      - name: Upload RoboMME eval metrics
-        if: always()
-        uses: actions/upload-artifact@v4 # zizmor: ignore[unpinned-uses]
-        with:
-          name: robomme-metrics
-          path: /tmp/robomme-artifacts/metrics.json
-          if-no-files-found: warn
-
-  # ── LIBERO-plus ───────────────────────────────────────────────────────────
-  # Isolated image: LIBERO-plus fork cloned into /home/user_lerobot on top of
-  # huggingface/lerobot-gpu (see docker/Dockerfile.benchmark.libero_plus).
-  libero-plus-integration-test:
-    name: LIBERO-plus — build image + 1-episode eval
-    runs-on:
-      group: aws-g6-4xlarge-plus
-    env:
-      HF_USER_TOKEN: ${{ secrets.LEROBOT_HF_USER }}
-      LIBERO_PLUS_SUITE: libero_spatial
-      LIBERO_PLUS_POLICY: lerobot/smolvla_libero_plus
-      LIBERO_PLUS_TASK_IDS: "[0,100,260,500,1000,1500,2000,2400]"
-
-    steps:
-      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd  # v6.0.2
-        with:
-          persist-credentials: false
-          lfs: true
-
-      - name: Set up Docker Buildx
-        uses: docker/setup-buildx-action@v3 # zizmor: ignore[unpinned-uses]
-        with:
-          cache-binary: false
-
-      - name: Login to Docker Hub
-        if: ${{ env.DOCKERHUB_USERNAME != '' }}
-        uses: docker/login-action@v3 # zizmor: ignore[unpinned-uses]
-        with:
-          username: ${{ secrets.DOCKERHUB_LEROBOT_USERNAME }}
-          password: ${{ secrets.DOCKERHUB_LEROBOT_PASSWORD }}
-        env:
-          DOCKERHUB_USERNAME: ${{ secrets.DOCKERHUB_LEROBOT_USERNAME }}
-
-      - name: Build LIBERO-plus benchmark image
-        uses: docker/build-push-action@v6 # zizmor: ignore[unpinned-uses]
-        with:
-          context: .
-          file: docker/Dockerfile.benchmark.libero_plus
-          push: false
-          load: true
-          tags: lerobot-benchmark-libero-plus:ci
-          cache-from: type=local,src=/tmp/.buildx-cache-libero-plus
-          cache-to: type=local,dest=/tmp/.buildx-cache-libero-plus,mode=max
-
-      - name: Run LIBERO-plus smoke eval (1 episode)
-        if: env.HF_USER_TOKEN != ''
-        run: |
-          docker run --name libero-plus-eval --gpus all \
-            --shm-size=4g \
-            -e HF_HOME=/tmp/hf \
-            -e HF_USER_TOKEN="${HF_USER_TOKEN}" \
-            -e HF_HUB_DOWNLOAD_TIMEOUT=300 \
-            -e LIBERO_PLUS_SUITE="${LIBERO_PLUS_SUITE}" \
-            -e LIBERO_PLUS_POLICY="${LIBERO_PLUS_POLICY}" \
-            -e LIBERO_PLUS_TASK_IDS="${LIBERO_PLUS_TASK_IDS}" \
-            lerobot-benchmark-libero-plus:ci \
-            bash -c "
-              hf auth login --token \"\$HF_USER_TOKEN\" --add-to-git-credential 2>/dev/null || true
-              lerobot-eval \
-                --policy.path=\"\$LIBERO_PLUS_POLICY\" \
-                --env.type=libero_plus \
-                --env.task=\"\$LIBERO_PLUS_SUITE\" \
-                --env.task_ids=\"\$LIBERO_PLUS_TASK_IDS\" \
-                --env.max_parallel_tasks=5 \
-                --eval.batch_size=1 \
-                --eval.n_episodes=1 \
-                --eval.use_async_envs=false \
-                --policy.device=cuda \
-                '--env.camera_name_mapping={\"agentview_image\": \"camera1\", \"robot0_eye_in_hand_image\": \"camera2\"}' \
-                --policy.empty_cameras=1 \
-                --output_dir=/tmp/eval-artifacts
-              python scripts/ci/extract_task_descriptions.py \
-                --env libero_plus --task \"\$LIBERO_PLUS_SUITE\" \
-                --output /tmp/eval-artifacts/task_descriptions.json
-            "
-
-      - name: Copy LIBERO-plus artifacts from container
-        if: always()
-        run: |
-          mkdir -p /tmp/libero-plus-artifacts
-          docker cp libero-plus-eval:/tmp/eval-artifacts/. /tmp/libero-plus-artifacts/ 2>/dev/null || true
-          docker rm -f libero-plus-eval || true
-
-      - name: Parse LIBERO-plus eval metrics
-        if: always()
-        run: |
-          python3 scripts/ci/parse_eval_metrics.py \
-            --artifacts-dir /tmp/libero-plus-artifacts \
-            --env libero_plus \
-            --task "${LIBERO_PLUS_SUITE}" \
-            --policy "${LIBERO_PLUS_POLICY}"
-
-      - name: Upload LIBERO-plus rollout video
-        if: always()
-        uses: actions/upload-artifact@v4 # zizmor: ignore[unpinned-uses]
-        with:
-          name: libero-plus-rollout-video
-          path: /tmp/libero-plus-artifacts/videos/
-          if-no-files-found: warn
-
-      - name: Upload LIBERO-plus eval metrics
-        if: always()
-        uses: actions/upload-artifact@v4 # zizmor: ignore[unpinned-uses]
-        with:
-          name: libero-plus-metrics
-          path: /tmp/libero-plus-artifacts/metrics.json
-          if-no-files-found: warn
-
-  # ── VLABENCH ─────────────────────────────────────────────────────────────
-  # Isolated image: lerobot[vlabench] only (VLABench, mujoco==3.2.2, dm-control chain)
-  vlabench-integration-test:
-    name: VLABench — build image + 1-episode eval
-    runs-on:
-      group: aws-g6-4xlarge-plus
-    env:
-      HF_USER_TOKEN: ${{ secrets.LEROBOT_HF_USER }}
-
-    steps:
-      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd  # v6.0.2
-        with:
-          persist-credentials: false
-          lfs: true
-
-      - name: Set up Docker Buildx
-        uses: docker/setup-buildx-action@v3 # zizmor: ignore[unpinned-uses]
-        with:
-          cache-binary: false
-
-      - name: Login to Docker Hub
-        if: ${{ env.DOCKERHUB_USERNAME != '' }}
-        uses: docker/login-action@v3 # zizmor: ignore[unpinned-uses]
-        with:
-          username: ${{ secrets.DOCKERHUB_LEROBOT_USERNAME }}
-          password: ${{ secrets.DOCKERHUB_LEROBOT_PASSWORD }}
-        env:
-          DOCKERHUB_USERNAME: ${{ secrets.DOCKERHUB_LEROBOT_USERNAME }}
-
-      - name: Build VLABench benchmark image
-        uses: docker/build-push-action@v6 # zizmor: ignore[unpinned-uses]
-        with:
-          context: .
-          file: docker/Dockerfile.benchmark.vlabench
-          push: false
-          load: true
-          tags: lerobot-benchmark-vlabench:ci
-          build-args: |
-            VLABENCH_ASSETS_REPO=lerobot/vlabench-assets
-
-      - name: Run VLABench smoke eval (10 tasks, 1 episode each)
-        if: env.HF_USER_TOKEN != ''
-        run: |
-          docker run --name vlabench-eval --gpus all \
-            --shm-size=4g \
-            -e HF_HOME=/tmp/hf \
-            -e HF_USER_TOKEN="${HF_USER_TOKEN}" \
-            -e HF_HUB_DOWNLOAD_TIMEOUT=300 \
-            -e MUJOCO_GL=egl \
-            lerobot-benchmark-vlabench:ci \
-            bash -c "
-              hf auth login --token \"\$HF_USER_TOKEN\" --add-to-git-credential 2>/dev/null || true
-              lerobot-eval \
-                --policy.path=lerobot/smolvla_vlabench \
-                --env.type=vlabench \
-                --env.task=select_fruit,select_toy,select_book,select_painting,select_drink,select_ingredient,select_billiards,select_poker,add_condiment,insert_flower \
-                --env.episode_length=50 \
-                --env.max_parallel_tasks=5 \
-                --eval.batch_size=1 \
-                --eval.n_episodes=1 \
-                --eval.use_async_envs=false \
-                --policy.device=cuda \
-                '--rename_map={\"observation.images.image\": \"observation.images.camera1\", \"observation.images.second_image\": \"observation.images.camera2\", \"observation.images.wrist_image\": \"observation.images.camera3\"}' \
-                --output_dir=/tmp/eval-artifacts
-              python scripts/ci/extract_task_descriptions.py \
-                --env vlabench \
-                --task select_fruit,select_toy,select_book,select_painting,select_drink,select_ingredient,select_billiards,select_poker,add_condiment,insert_flower \
-                --output /tmp/eval-artifacts/task_descriptions.json
-            "
-
-      - name: Copy VLABench artifacts from container
-        if: always()
-        run: |
-          mkdir -p /tmp/vlabench-artifacts
-          docker cp vlabench-eval:/tmp/eval-artifacts/. /tmp/vlabench-artifacts/ 2>/dev/null || true
-          docker rm -f vlabench-eval || true
-
-      - name: Parse VLABench eval metrics
-        if: always()
-        run: |
-          python3 scripts/ci/parse_eval_metrics.py \
-            --artifacts-dir /tmp/vlabench-artifacts \
-            --env vlabench \
-            --task select_fruit,select_toy,select_book,select_painting,select_drink,select_ingredient,select_billiards,select_poker,add_condiment,insert_flower \
-            --policy lerobot/smolvla_vlabench
-
-      - name: Upload VLABench rollout video
-        if: always()
-        uses: actions/upload-artifact@v4 # zizmor: ignore[unpinned-uses]
-        with:
-          name: vlabench-rollout-video
-          path: /tmp/vlabench-artifacts/videos/
-          if-no-files-found: warn
-
-      - name: Upload VLABench eval metrics
-        if: always()
-        uses: actions/upload-artifact@v4 # zizmor: ignore[unpinned-uses]
-        with:
-          name: vlabench-metrics
-          path: /tmp/vlabench-artifacts/metrics.json
-          if-no-files-found: warn
@@ -33,7 +33,7 @@ jobs:
      github.event.workflow_run.event == 'pull_request' &&
      github.event.workflow_run.conclusion == 'success' &&
      github.repository == 'huggingface/lerobot'
-    uses: huggingface/doc-builder/.github/workflows/upload_pr_documentation.yml@2430c1ec91d04667414e2fa31ecfc36c153ea391  # main
+    uses: huggingface/doc-builder/.github/workflows/upload_pr_documentation.yml@90b4ee2c10b81b5c1a6367c4e6fc9e2fb510a7e3  # main
    with:
      package_name: lerobot
    secrets:
@@ -55,7 +55,7 @@ jobs:
      github.repository == 'huggingface/lerobot'
    permissions:
      contents: read
-    uses: huggingface/doc-builder/.github/workflows/build_main_documentation.yml@2430c1ec91d04667414e2fa31ecfc36c153ea391  # main
+    uses: huggingface/doc-builder/.github/workflows/build_main_documentation.yml@90b4ee2c10b81b5c1a6367c4e6fc9e2fb510a7e3  # main
    with:
      commit_sha: ${{ github.sha }}
      package: lerobot
@@ -78,7 +78,7 @@ jobs:
    permissions:
      contents: read
      pull-requests: write
-    uses: huggingface/doc-builder/.github/workflows/build_pr_documentation.yml@2430c1ec91d04667414e2fa31ecfc36c153ea391  # main
+    uses: huggingface/doc-builder/.github/workflows/build_pr_documentation.yml@90b4ee2c10b81b5c1a6367c4e6fc9e2fb510a7e3  # main
    with:
      commit_sha: ${{ github.event.pull_request.head.sha }}
      pr_number: ${{ github.event.number }}
@@ -217,24 +217,6 @@ jobs:
      - name: Run end-to-end tests
        run: make test-end-to-end

-  slack-notification:
-    name: Slack Notification
-    needs: [cpu-tests, gpu-tests, upgrade-lock]
-    if: always() && needs.upgrade-lock.outputs.changed == 'true'
-    runs-on: ubuntu-latest
-    permissions:
-      contents: read
-    env:
-      CI_SLACK_CHANNEL: ${{ secrets.CI_SLACK_CHANNEL }}
-    steps:
-      - name: Post to a Slack channel
-        uses: huggingface/hf-workflows/.github/actions/post-slack@a88e7fa2eaee28de5a4d6142381b1fb792349b67  # main
-        with:
-          slack_channel: ${{ env.CI_SLACK_CHANNEL }}
-          title: "Results of the latest dependency tests (CPU + GPU)"
-          status: ${{ (needs.cpu-tests.result == 'success' && needs.gpu-tests.result == 'success') && 'success' || 'failure' }}
-          slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
-
  # This job creates or updates a PR with the upgraded lockfile
  open-pr:
    name: Open PR
@@ -152,14 +152,13 @@ jobs:
            BASE_VERSION="${VERSION%%-*}"
            echo "Installing pre-release version $BASE_VERSION from TestPyPI..."
            uv pip install \
-              --torch-backend cpu \
              --index-url https://test.pypi.org/simple/ \
              --extra-index-url https://pypi.org/simple \
              --index-strategy unsafe-best-match \
               "lerobot[all]==$BASE_VERSION"
          else
            echo "Installing release version $VERSION from PyPI..."
-            uv pip install --torch-backend cpu "lerobot[all]==$VERSION"
+            uv pip install "lerobot[all]==$VERSION"
          fi
      - name: Check lerobot version
        run: uv run python -c "import lerobot; print(lerobot.__version__)"
@@ -19,19 +19,19 @@ on:
  workflow_dispatch:

  # Runs at 02:00
-  # schedule:
-  #   - cron: "0 2 * * *"
+  schedule:
+    - cron: "0 2 * * *"

 env:
  CLOSE_ISSUE_MESSAGE: >
-    This issue was closed because it has been stalled for 30 days with no activity.
+    This issue was closed because it has been stalled for 14 days with no activity.
    Feel free to reopen if it is still relevant, or to ping a collaborator if you have any questions.
  CLOSE_PR_MESSAGE: >
-    This PR was closed because it has been stalled for 30 days with no activity.
+    This PR was closed because it has been stalled for 21 days with no activity.
    Feel free to reopen if it is still relevant, or to ping a collaborator if you have any questions.
  WARN_ISSUE_MESSAGE: >
    This issue has been automatically marked as stale because it has not had
-    recent activity (1 year). It will be closed if no further activity occurs.
+    recent activity (6 months). It will be closed if no further activity occurs.
    Any change, comment or update to this issue will reset this count.
    Thank you for your contributions.
  WARN_PR_MESSAGE: >
@@ -59,10 +59,10 @@ jobs:
          stale-pr-label: stale
          exempt-issue-labels: never-stale
          exempt-pr-labels: never-stale
-          days-before-issue-stale: 365
-          days-before-issue-close: 30
+          days-before-issue-stale: 180
+          days-before-issue-close: 14
          days-before-pr-stale: 365
-          days-before-pr-close: 30
+          days-before-pr-close: 21
          delete-branch: true
          close-issue-message: ${{ env.CLOSE_ISSUE_MESSAGE }}
          close-pr-message: ${{ env.CLOSE_PR_MESSAGE }}
@@ -1,7 +1,5 @@
 This file provides guidance to AI agents when working with code in this repository.

-> **User-facing help → [`AGENT_GUIDE.md`](./AGENT_GUIDE.md)** (SO-101 setup, recording, picking a policy, training duration, eval — with copy-pasteable commands).
-
 ## Project Overview

 LeRobot is a PyTorch-based library for real-world robotics, providing datasets, pretrained policies, and tools for training, evaluation, data collection, and robot control. It integrates with Hugging Face Hub for model/dataset sharing.
@@ -1,412 +0,0 @@
-# AGENT_GUIDE.md — LeRobot Helper for AI Agents & Users
-
-This file is a practical, copy-paste-friendly companion for any AI agent (Cursor, Claude, ChatGPT, Codex, etc.) helping a user work with LeRobot. It complements [`AGENTS.md`](./AGENTS.md) (dev/contributor context) with **user-facing guidance**: how to start, what to train, how long, how to record, and how to calibrate an SO-101.
-
---
-
-## 1. Start here — ask the user first (MANDATORY)
-
-Before suggesting any command, an agent MUST ask the user at least these questions and wait for answers:
-
-1. **What's your goal?** (e.g. "teach my SO-101 to fold a cloth", "train a policy on an existing HF dataset", "contribute a PR", "understand the codebase")
-2. **What hardware do you have?**
-   - Robot: none / SO-100 / SO-101 / Koch / LeKiwi / Reachy / other
-   - Teleop: leader arm / phone / keyboard / gamepad / none
-   - Cameras: how many, resolution, fixed or moving?
-3. **What machine will you train on?**
-   - GPU model + VRAM (e.g. "laptop 3060 6 GB", "RTX 4090 24 GB", "A100 80 GB", "CPU only")
-   - OS: macOS / Linux / Windows
-4. **Skill level & time budget?** First time, some ML, experienced? Hours, days, a weekend?
-5. **Do you already have a dataset?** Yes (HF repo id?) / no / want to record one
-6. **How can I help right now?** (pick one concrete next step)
-
-Only after you have answers, propose a concrete path. If something is ambiguous, ask again rather than guessing. Bias toward **the simplest thing that works** for the user's hardware and goal.
-
---
-
-## 2. LeRobot in 60 seconds
-
-LeRobot = **datasets + policies + envs + robot control**, unified by a small set of strong abstractions.
-
- **`LeRobotDataset`** — episode-aware dataset (video or images + actions + state), loadable from the Hub or disk.
- **Policies** (`ACT`, `Diffusion`, `SmolVLA`, `π0`, `π0.5`, `Wall-X`, `X-VLA`, `VQ-BeT`, `TD-MPC`, …) — all inherit `PreTrainedPolicy` and can be pushed/pulled from the Hub.
- **Processors** — small composable transforms between dataset → policy → robot.
- **Envs** (sim) and **Robots** (real) — same action/observation contract so code swaps cleanly.
- **CLI** — `lerobot-record`, `lerobot-train`, `lerobot-eval`, `lerobot-teleoperate`, `lerobot-calibrate`, `lerobot-find-port`, `lerobot-setup-motors`, `lerobot-replay`.
-
-See [`AGENTS.md`](./AGENTS.md) for repo architecture.
-
---
-
-## 3. Quickstart paths (pick one)
-
-### Path A — "I have an SO-101 and want my first trained policy"
-
-Go to §4 (SO-101 end-to-end), then §5 (data tips), then §6 (pick a policy — likely **ACT**), then §7 (how long), then §8 (eval).
-
-### Path B — "No hardware, I want to train on an existing dataset"
-
-Skip §4. Pick a policy in §6, pick a duration in §7, then run `lerobot-train` per §4.9 with a Hub `--dataset.repo_id` and an `--env.type` for eval. Finish with §8.
-
-### Path C — "I just want to understand the codebase"
-
-Read §2 above, then `AGENTS.md` "Architecture", then open `src/lerobot/policies/act/` and `src/lerobot/datasets/lerobot_dataset.py` as canonical examples.
-
---
-
-## 4. SO-101 end-to-end cheat-sheet
-
-Full details in [`docs/source/so101.mdx`](./docs/source/so101.mdx) and [`docs/source/il_robots.mdx`](./docs/source/il_robots.mdx). Minimum commands in order. Confirm arms are assembled + powered before issuing.
-
-**4.1 Install**
-
-```bash
-pip install 'lerobot[feetech]'              # SO-100/SO-101 motor stack
-# pip install 'lerobot[all]'                # everything
-# pip install 'lerobot[aloha,pusht]'        # specific features
-# pip install 'lerobot[smolvla]'            # add SmolVLA deps
-git lfs install && git lfs pull
-hf auth login                               # required to push datasets/policies
-```
-
-Contributors can alternatively use `uv sync --locked --extra feetech` (see `AGENTS.md`).
-
-**4.2 Find USB ports** — run once per arm, unplug when prompted.
-
-```bash
-lerobot-find-port
-```
-
-macOS: `/dev/tty.usbmodem...`; Linux: `/dev/ttyACM0` (may need `sudo chmod 666 /dev/ttyACM0`).
-
-**4.3 Setup motor IDs & baudrate** (one-time, per arm)
-
-```bash
-lerobot-setup-motors --robot.type=so101_follower --robot.port=<FOLLOWER_PORT>
-lerobot-setup-motors --teleop.type=so101_leader  --teleop.port=<LEADER_PORT>
-```
-
-**4.4 Calibrate** — center all joints, press Enter, sweep each joint through its full range. The `id` is the calibration key — reuse it everywhere.
-
-```bash
-lerobot-calibrate --robot.type=so101_follower --robot.port=<FOLLOWER_PORT> --robot.id=my_follower
-lerobot-calibrate --teleop.type=so101_leader  --teleop.port=<LEADER_PORT>   --teleop.id=my_leader
-```
-
-**4.5 Teleoperate** (sanity check, no recording)
-
-```bash
-lerobot-teleoperate \
-  --robot.type=so101_follower --robot.port=<FOLLOWER_PORT> --robot.id=my_follower \
-  --teleop.type=so101_leader  --teleop.port=<LEADER_PORT>  --teleop.id=my_leader \
-  --robot.cameras="{ front: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
-  --display_data=true
-```
-
-> **Feetech timeout / comms error on SO-100 / SO-101?** Before touching software, check the **red motor LEDs** on the daisy chain.
->
-> - **All steady red, gripper → base chain** → wiring OK.
-> - **One or more motors dark / chain stops mid-way** → wiring issue: reseat the 3-pin cables, check the controller-board power supply, and make sure each motor is fully clicked in.
-> - **LEDs blinking** → the motor is in an **error state**: usually overload (forcing a joint past its limit) **or wrong power supply voltage**. SO-100 / SO-101 ship in two variants — a **5 V / 7.4 V** build and a **12 V** build — they are NOT interchangeable. Using a 12 V PSU on a 5 V / 7.4 V arm (or vice-versa) will trip this error; confirm your motor variant before powering up.
->
-> Most "timeout" errors are physical, not code.
-
-**4.6 Record a dataset** — keys: **→** next, **←** redo, **ESC** finish & upload.
-
-```bash
-HF_USER=$(NO_COLOR=1 hf auth whoami | awk -F': *' 'NR==1 {print $2}')
-
-lerobot-record \
-  --robot.type=so101_follower --robot.port=<FOLLOWER_PORT> --robot.id=my_follower \
-  --teleop.type=so101_leader  --teleop.port=<LEADER_PORT>  --teleop.id=my_leader \
-  --robot.cameras="{ front: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
-  --dataset.repo_id=${HF_USER}/my_task \
-  --dataset.single_task="<describe the task in one sentence>" \
-  --dataset.num_episodes=50 \
-  --dataset.episode_time_s=30 \
-  --dataset.reset_time_s=10 \
-  --display_data=true
-```
-
-**4.7 Visualize** — **always** do this before training. Look for missing frames, camera blur, unreachable targets, inconsistent object positions.
-After upload: https://huggingface.co/spaces/lerobot/visualize_dataset → paste `${HF_USER}/my_task`. Works for **any LeRobot-formatted Hub dataset** — use it to scout other datasets, inspect episode quality, or debug your own data before retraining.
-
-**4.8 Replay an episode** (sanity check)
-
-```bash
-lerobot-replay --robot.type=so101_follower --robot.port=<FOLLOWER_PORT> --robot.id=my_follower \
-  --dataset.repo_id=${HF_USER}/my_task --dataset.episode=0
-```
-
-**4.9 Train** (default: ACT — fastest, lowest memory). Apple silicon: `--policy.device=mps`. See §6/§7 for policy and duration.
-
-```bash
-lerobot-train \
-  --dataset.repo_id=${HF_USER}/my_task \
-  --policy.type=act \
-  --policy.device=cuda \
-  --output_dir=outputs/train/act_my_task \
-  --job_name=act_my_task \
-  --batch_size=8 \
-  --wandb.enable=true \
-  --policy.repo_id=${HF_USER}/act_my_task
-```
-
-**4.10 Evaluate on the real robot** — compare success rate to a teleoperated baseline.
-
-```bash
-lerobot-record \
-  --robot.type=so101_follower --robot.port=<FOLLOWER_PORT> --robot.id=my_follower \
-  --robot.cameras="{ front: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
-  --dataset.repo_id=${HF_USER}/eval_my_task \
-  --dataset.single_task="<same task description as training>" \
-  --dataset.num_episodes=10 \
-  --policy.path=${HF_USER}/act_my_task
-```
-
---
-
-## 5. Data collection tips (beginner → reliable policy)
-
-Good data beats clever models. Adopt these defaults and deviate only with evidence.
-
-### 5.1 Setup & ergonomics
-
- **Fix the rig and cameras** before touching the software. If the rig vibrates or the operator gets frustrated, fix that first — more bad data won't help.
- **Lighting matters more than resolution.** Diffuse, consistent light. Avoid moving shadows.
- **"Can you do the task from the camera view alone?"** If no, your cameras are wrong. Fix before recording.
- When available, enable **action interpolation** during rollouts for smoother trajectories.
-
-### 5.2 Practice before you record
-
- Do 5–10 demos without recording. Build a deliberate, repeatable strategy.
- Hesitant or inconsistent demos teach the model hesitation.
-
-### 5.3 Quality over speed
-
-Deliberate, high-quality execution beats fast sloppy runs. Optimize for speed only **after** strategy is dialed in — never trade quality for it.
-
-### 5.4 Consistency within and across episodes
-
-Same grasp, approach vector, and timing. Coherent strategies are much easier to learn than wildly varying movements.
-
-### 5.5 Start small, then extend (the golden rule)
-
- **First 50 episodes = constrained version** of the task: one object, fixed position, fixed camera setup, one operator.
- Train a quick ACT model. See what fails.
- **Then add diversity** along one axis at a time: more positions → more lighting → more objects → more operators.
- Don't try to collect the "perfect dataset" on day one. Iterate.
-
-### 5.6 Policy choice for beginners
-
- **Laptop / first time / want results fast → ACT.** Works surprisingly well, trains fast even on a laptop GPU.
- **Bigger GPU / language-conditioned / multi-task → SmolVLA.** Unfreezing the vision encoder (see §7) is a big win here.
- Defer π0 / π0.5 / Wall-X / X-VLA until you have a proven ACT baseline and a 20+ GB GPU.
-
-### 5.7 Recommended defaults for your first task
-
-| Setting          | Value                                                                                                                                                 |
-| ---------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- |
-| Episodes         | **50** to start, scale to 100–300 after first training                                                                                                |
-| Episode length   | 20–45 s (shorter is fine for grasp/place)                                                                                                             |
-| Reset time       | 10 s                                                                                                                                                  |
-| FPS              | 30                                                                                                                                                    |
-| Cameras          | **2 cameras recommended**: 1 fixed front + 1 wrist. Multi-view often outperforms single-view. A single fixed camera also works to keep things simple. |
-| Task description | Short, specific, action-phrased sentence                                                                                                              |
-
-### 5.8 Troubleshooting signal
-
- Policy fails at one specific stage → record 10–20 more episodes **targeting that stage**.
- Policy flaps / oscillates → likely inconsistent demos, or the policy needs more training; re-record the worst episodes (use **←** to redo).
- Policy ignores the object → camera framing or lighting issue, not a model issue.
-
-See also: [What makes a good dataset](https://huggingface.co/blog/lerobot-datasets#what-makes-a-good-dataset).
-
---
-
-## 6. Which policy should I train?
-
-Match the policy to the user's **GPU memory** and **time budget**. Numbers below come from an internal profiling run (one training update per policy). They are **indicative only** — see caveats.
-
-### 6.1 Profiling snapshot (indicative)
-
-All policies typically train for **5–10 epochs** (see §7).
-
-> **Human-facing version:** the [Compute Hardware Guide](./docs/source/hardware_guide.mdx) reuses the table below and adds a cloud-GPU tier guide and a Hugging Face Jobs pointer.
-
-| Policy      | Batch | Update (ms) | Peak GPU mem (GB) | Best for                                                                                         |
-| ----------- | ----: | ----------: | ----------------: | ------------------------------------------------------------------------------------------------ |
-| `act`       |     4 |    **83.9** |          **0.94** | First-time users, laptops, single-task. Fast and reliable.                                       |
-| `diffusion` |     4 |       168.6 |              4.94 | Multi-modal action distributions; needs mid-range GPU.                                           |
-| `smolvla`   |     1 |       357.8 |              3.93 | Language-conditioned, multi-task, small VLA. **Unfreeze vision encoder for big gains** (see §7). |
-| `xvla`      |     1 |       731.6 |             15.52 | Large VLA, multi-task.                                                                           |
-| `wall_x`    |     1 |       716.5 |             15.95 | Large VLA with world-model objective.                                                            |
-| `pi0`       |     1 |       940.3 |             15.50 | Strong large VLA baseline (Physical Intelligence).                                               |
-| `pi05`      |     1 |      1055.8 |             16.35 | Newer π policy; similar footprint to `pi0`.                                                      |
-
-**Critical caveats:**
-
- **Optimizer:** measured with **SGD**. LeRobot's default is **AdamW**, which keeps extra optimizer state → **peak memory will be noticeably higher** with the default, especially for `pi0`, `pi05`, `wall_x`, `xvla`.
- **Batch size:** the large policies were profiled at batch 1. In practice use a **larger batch** for stable training (see §7.4). Memory scales roughly linearly with batch.
-
-### 6.2 Decision rules
-
- **< 8 GB VRAM (laptop, 3060, M-series Mac):** → `act`. Maybe `diffusion` if you have ~6–8 GB free.
- **12–16 GB VRAM (4070/4080, A4000):** → `smolvla` with defaults, or `act`/`diffusion` with larger batch. `pi0`/`pi05`/`wall_x`/`xvla` feasible only with small batch + gradient accumulation.
- **24+ GB VRAM (3090/4090/A5000):** → any policy. Prefer `smolvla` (unfrozen) for multi-task; `act` for single-task grasp-and-place (still often the best ROI). You could also experiment with `pi0`, `pi05`, or `xvla`.
- **80 GB (A100/H100):** → any, with healthy batch. `pi05`, `xvla`, `wall_x` become comfortable.
- **CPU only:** → don't train here. Use Google Colab (see [`docs/source/notebooks.mdx`](./docs/source/notebooks.mdx)) or a rented GPU.
-
---
-
-## 7. How long should I train?
-
-Robotics imitation learning usually converges in a **few epochs over the dataset**, not hundreds of thousands of raw steps. Think **epochs first**, then translate to steps.
-
-### 7.1 Rule of thumb
-
- **Typical total: 5–10 epochs.** Start at 5, eval, then decide if more helps.
- Very small datasets (< 30 episodes) may want slightly more epochs — but first, **collect more data**.
- VLAs with a pretrained vision backbone typically need **fewer** epochs than training from scratch.
-
-### 7.2 Steps ↔ epochs conversion
-
-```
-total_frames     = sum of frames over all episodes      # e.g. 50 eps × 30 fps × 30 s ≈ 45,000
-steps_per_epoch  = ceil(total_frames / batch_size)
-total_steps      = epochs × steps_per_epoch
-```
-
-Examples for `--batch_size=8`:
-
-| Dataset size            |  Frames | Steps / epoch | 5 epochs | 10 epochs |
-| ----------------------- | ------: | ------------: | -------: | --------: |
-| 50 eps × 30 s @ 30 fps  |  45,000 |        ~5,625 |      28k |       56k |
-| 100 eps × 30 s @ 30 fps |  90,000 |       ~11,250 |      56k |      113k |
-| 300 eps × 30 s @ 30 fps | 270,000 |       ~33,750 |     169k |      338k |
-
-Pass the resulting total with `--steps=<N>`; eval at intermediate checkpoints (`outputs/train/.../checkpoints/`).
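
The conversion above can be sketched in a few lines of Python. This is a minimal sketch, not part of LeRobot: the function name `total_training_steps` and its parameters are hypothetical, and it assumes every episode has the same length, which real datasets rarely satisfy exactly.

```python
import math


def total_training_steps(num_episodes, episode_seconds, fps, batch_size, epochs):
    """Translate an epoch budget into a step count suitable for `--steps`.

    Assumption: all episodes last `episode_seconds`; for a real dataset,
    replace the product below with the actual frame count.
    """
    total_frames = num_episodes * episode_seconds * fps
    steps_per_epoch = math.ceil(total_frames / batch_size)
    return epochs * steps_per_epoch


if __name__ == "__main__":
    # First row of the table: 50 episodes, 30 s each, 30 fps, batch size 8.
    print(total_training_steps(50, 30, 30, batch_size=8, epochs=5))
```

For the first table row this gives 5,625 steps per epoch and 28,125 steps for 5 epochs, which the table rounds to 28k.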
-
-### 7.3 Per-policy starting points (single-task, ~50 episodes)
-
-| Policy         | Batch | Steps (first run) | Notes                                                             |
-| -------------- | ----: | ----------------: | ----------------------------------------------------------------- |
-| `act`          |  8–16 |           30k–80k | Usually converges under 50k for single-task.                      |
-| `diffusion`    |  8–16 |          80k–150k | Benefits from longer training than ACT.                           |
-| `smolvla`      |   4–8 |           30k–80k | Pretrained VLM → converges fast.                                  |
-| `pi0` / `pi05` |   1–4 |           30k–80k | Memory-bound; use gradient accumulation for effective batch ≥ 16! |
-
-### 7.4 Batch size guidance
-
- **Bigger batch is preferable** for stable gradients on teleop data.
- If GPU memory is the bottleneck, use **gradient accumulation** to raise _effective_ batch without raising peak memory.
- Scale **learning rate** gently with batch; most LeRobot defaults work fine for a 2–4× batch change.
-
-### 7.5 Scale LR schedule & checkpoints with `--steps`
-
-LeRobot's default schedulers (e.g. SmolVLA's cosine decay) use `scheduler_decay_steps=30_000`, which is sized for long training runs. When you shorten training (e.g. 5k–10k steps on a small dataset), **scale the scheduler down to match** — otherwise the LR stays near the peak and never decays. Same for checkpoint frequency.
-
-```bash
-lerobot-train ... \
-  --steps=5000 \
-  --policy.scheduler_decay_steps=5000 \
-  --save_freq=5000
-```
-
-Rule of thumb: set `scheduler_decay_steps ≈ steps`, and `save_freq` to whatever granularity you want for eval (e.g. every 1k–5k steps). Match `scheduler_warmup_steps` proportionally if your run is very short.
-
-### 7.6 SmolVLA: unfreeze the vision encoder for real gains
-
-SmolVLA ships with `freeze_vision_encoder=True`. Unfreezing usually **improves performance substantially** on specialized tasks, at the cost of more VRAM and slower steps. Enable with:
-
-```bash
-lerobot-train ... --policy.type=smolvla \
-  --policy.freeze_vision_encoder=false \
-  --policy.train_expert_only=false
-```
-
-### 7.7 Signals to stop / keep going
-
- Train loss plateaus → stop, save a Hub checkpoint.
- Train loss still dropping and you're under 10 epochs → keep going.
-
---
-
-## 8. Evaluation & benchmarks
-
-Two flavors of evaluation:
-
-### 8.1 Real-robot eval (SO-101, etc.)
-
-Reuse `lerobot-record` with `--policy.path` to run the trained policy on-robot and save the run as an eval dataset. Convention: prefix the dataset with `eval_`.
-
-```bash
-lerobot-record \
-  --robot.type=so101_follower --robot.port=<FOLLOWER_PORT> --robot.id=my_follower \
-  --robot.cameras="{ front: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
-  --dataset.repo_id=${HF_USER}/eval_my_task \
-  --dataset.single_task="<same task description used during training>" \
-  --dataset.num_episodes=10 \
-  --policy.path=${HF_USER}/act_my_task
-```
-
-Report success rate across episodes. Compare to a teleoperated baseline and to an earlier checkpoint to catch regressions.
-
-### 8.2 Sim-benchmark eval
-
-For policies trained on sim datasets (PushT, Aloha, LIBERO, MetaWorld, RoboCasa, …) use `lerobot-eval` against the matching `env.type`:
-
-```bash
-lerobot-eval \
-  --policy.path=${HF_USER}/diffusion_pusht \
-  --env.type=pusht \
-  --eval.n_episodes=50 \
-  --eval.batch_size=10 \
-  --policy.device=cuda
-```
-
- Use `--policy.path=outputs/train/.../checkpoints/<step>/pretrained_model` for local checkpoints.
- `--eval.n_episodes` should be ≥ 50 for a stable success-rate estimate.
- Available envs live in `src/lerobot/envs/`. See [`docs/source/libero.mdx`](./docs/source/libero.mdx), [`metaworld.mdx`](./docs/source/metaworld.mdx), [`robocasa.mdx`](./docs/source/robocasa.mdx), [`vlabench.mdx`](./docs/source/vlabench.mdx) for specific benchmarks.
- To add a new benchmark, see [`docs/source/adding_benchmarks.mdx`](./docs/source/adding_benchmarks.mdx) and [`envhub.mdx`](./docs/source/envhub.mdx).
-
-### 8.2b Dockerfiles for benchmark eval
-
-Benchmark envs have native dependencies that are painful to install locally. The repo ships **pre-baked Dockerfiles** for each supported benchmark — use these to run `lerobot-eval` in a reproducible environment:
-
-| Benchmark   | Dockerfile                                                                             |
-| ----------- | -------------------------------------------------------------------------------------- |
-| LIBERO      | [`docker/Dockerfile.benchmark.libero`](./docker/Dockerfile.benchmark.libero)           |
-| LIBERO+     | [`docker/Dockerfile.benchmark.libero_plus`](./docker/Dockerfile.benchmark.libero_plus) |
-| MetaWorld   | [`docker/Dockerfile.benchmark.metaworld`](./docker/Dockerfile.benchmark.metaworld)     |
-| RoboCasa    | [`docker/Dockerfile.benchmark.robocasa`](./docker/Dockerfile.benchmark.robocasa)       |
-| RoboCerebra | [`docker/Dockerfile.benchmark.robocerebra`](./docker/Dockerfile.benchmark.robocerebra) |
-| RoboMME     | [`docker/Dockerfile.benchmark.robomme`](./docker/Dockerfile.benchmark.robomme)         |
-| RoboTwin    | [`docker/Dockerfile.benchmark.robotwin`](./docker/Dockerfile.benchmark.robotwin)       |
-| VLABench    | [`docker/Dockerfile.benchmark.vlabench`](./docker/Dockerfile.benchmark.vlabench)       |
-
-Build and run (adapt to your benchmark):
-
-```bash
-docker build -f docker/Dockerfile.benchmark.robomme -t lerobot-bench-robomme .
-docker run --gpus all --rm -it \
-  -v $HOME/.cache/huggingface:/root/.cache/huggingface \
-  lerobot-bench-robomme \
-  lerobot-eval --policy.path=<your_policy> --env.type=<env> --eval.n_episodes=50
-```
-
-See [`docker/README.md`](./docker/README.md) for base-image details.
-
-### 8.3 Target success rates
-
-Single-task grasp-and-place with 50 clean episodes: ACT should reach **> 70% success** on the training configuration. Less → data problem (see §5), not model problem. Expect a drop when generalizing to new positions — scale episodes or diversity to recover.
-
---
-
-## 9. Further reading & resources
-
- **Getting started:** [`installation.mdx`](./docs/source/installation.mdx) · [`il_robots.mdx`](./docs/source/il_robots.mdx) · [What makes a good dataset](https://huggingface.co/blog/lerobot-datasets)
- **Per-policy docs:** browse [`docs/source/*.mdx`](./docs/source/) (policies, hardware, benchmarks, advanced training).
- **Community:** [Discord](https://discord.com/invite/s3KuuzsPFb) · [Hub `LeRobot` tag](https://huggingface.co/datasets?other=LeRobot) · [Dataset visualizer](https://huggingface.co/spaces/lerobot/visualize_dataset)
-
-> Keep this file current. If you learn a rule that would prevent a class of user mistakes, add it here and in [`AGENTS.md`](./AGENTS.md).
@@ -78,9 +78,6 @@ Use the templates for required fields and examples.
 - **Issues:** Follow the [ticket template](https://github.com/huggingface/lerobot/blob/main/.github/ISSUE_TEMPLATE/bug-report.yml).
 - **Pull requests:** Rebase on `upstream/main`, use a descriptive branch (don't work on `main`), run `pre-commit` and tests locally, and follow the [PR template](https://github.com/huggingface/lerobot/blob/main/.github/PULL_REQUEST_TEMPLATE.md).

-> [!IMPORTANT]
-> Community Review Policy: To help scale our efforts and foster a collaborative environment, we ask contributors to review at least one other person's open PR before their own receives attention. This shared responsibility multiplies our review capacity and helps everyone's code get merged faster!
-
-Once you have submitted your PR and completed a peer review, a member of the LeRobot team will review your contribution.
+One member of the LeRobot team will then review your contribution.

 Thank you for contributing to LeRobot!
@@ -1,4 +1,3 @@
 include src/lerobot/templates/lerobot_modelcard_template.md
-include src/lerobot/templates/lerobot_rewardmodel_modelcard_template.md
 include src/lerobot/datasets/card_template.md
 include src/lerobot/envs/metaworld_config.json
@@ -109,7 +109,7 @@ lerobot-train \

 As with hardware, you can easily implement your own policy, leverage LeRobot's data collection, training, and visualization tools, and share your model on the HF Hub.

-For detailed policy setup guides, see the [Policy Documentation](https://huggingface.co/docs/lerobot/bring_your_own_policies). For GPU/RAM requirements and expected training time per policy, see the [Compute Hardware Guide](https://huggingface.co/docs/lerobot/hardware_guide).
+For detailed policy setup guides, see the [Policy Documentation](https://huggingface.co/docs/lerobot/bring_your_own_policies).

 ## Inference & Evaluation

@@ -1,42 +0,0 @@
-# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-# Benchmark image for LIBERO integration tests.
-# Extends the nightly GPU image (which already has all extras installed)
-# with the PR's source code and LIBERO-specific asset setup.
-#
-# Build:  docker build -f docker/Dockerfile.benchmark.libero -t lerobot-benchmark-libero .
-# Run:    docker run --gpus all --rm lerobot-benchmark-libero lerobot-eval ...
-
-FROM huggingface/lerobot-gpu:latest
-
-# Pre-download lerobot/libero-assets from HF Hub so nothing is fetched at
-# runtime (which times out on CI). Point the libero config at the cached path.
-# libero/libero/__init__.py calls input() when ~/.libero/config.yaml is missing,
-# so we write the config before any libero import can happen.
-RUN LIBERO_DIR=$(python -c \
-      "import importlib.util, os; s=importlib.util.find_spec('libero'); \
-       print(os.path.join(os.path.dirname(s.origin), 'libero'))") && \
-    mkdir -p /home/user_lerobot/.libero && \
-    python -c "\
-from huggingface_hub import snapshot_download; \
-snapshot_download(repo_id='lerobot/libero-assets', repo_type='dataset', \
-                  local_dir='/home/user_lerobot/.libero/assets')" && \
-    printf "assets: /home/user_lerobot/.libero/assets\nbddl_files: ${LIBERO_DIR}/bddl_files\ndatasets: ${LIBERO_DIR}/../datasets\ninit_states: ${LIBERO_DIR}/init_files\n" \
-    > /home/user_lerobot/.libero/config.yaml
-
-# Overlay the PR's source code on top of the nightly image.
-COPY --chown=user_lerobot:user_lerobot . .
-
-CMD ["/bin/bash"]
@@ -1,84 +0,0 @@
-# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-# Benchmark image for LIBERO-plus integration tests.
-# Extends the nightly GPU image (which has lerobot[all]) with the LIBERO-plus
-# fork source + its 6.4 GB perturbation assets.
-#
-# Build:  docker build -f docker/Dockerfile.benchmark.libero_plus -t lerobot-benchmark-libero-plus .
-# Run:    docker run --gpus all --rm lerobot-benchmark-libero-plus lerobot-eval ...
-
-FROM huggingface/lerobot-gpu:latest
-ENV MUJOCO_GL=egl
-
-# unzip for the 6.4 GB assets.zip; the rest are LIBERO-plus build-time extras
-# (wand / ImageMagick / fontconfig) not in the nightly base.
-USER root
-RUN apt-get update \
-    && apt-get install -y --no-install-recommends \
-         unzip libexpat1 libfontconfig1-dev libmagickwand-dev \
-    && apt-get clean && rm -rf /var/lib/apt/lists/*
-USER user_lerobot
-
-# robosuite==1.4.1 is mandatory (the fork uses `single_arm_env` removed in
-# v1.5+). The rest are LIBERO-plus runtime deps pulled from its setup.py.
-# We install these explicitly instead of via the [libero_plus] extra because
-# the extra's `libero @ git+...` dep installs as a namespace package; we then
-# clone the fork and override it via PYTHONPATH below.
-RUN uv pip install --no-cache \
-        "robosuite==1.4.1" \
-        "bddl==1.0.1" \
-        "easydict==1.13" \
-        "mujoco==3.7.0" \
-        "matplotlib==3.10.8" \
-        "Wand==0.6.13" \
-        "scikit-image==0.25.2" \
-        "gym==0.26.2"
-
-# Clone LIBERO-plus and make it importable as `libero`. The nightly base has
-# hf-libero (10 tasks) preinstalled via lerobot[libero]; uninstall it so
-# Python resolves `import libero` to the 2402-task LIBERO-plus module instead.
-# Pinned to the current upstream main SHA so benchmark builds stay reproducible.
-ARG LIBERO_PLUS_SHA=4976dc3
-ENV LIBERO_PLUS_ROOT=/home/user_lerobot/libero-plus/libero/libero
-RUN git clone https://github.com/sylvestf/LIBERO-plus.git /home/user_lerobot/libero-plus \
-    && git -C /home/user_lerobot/libero-plus checkout ${LIBERO_PLUS_SHA} \
-    && cd /home/user_lerobot/libero-plus && uv pip install --no-cache --no-deps -e "." \
-    && (uv pip uninstall hf-libero 2>/dev/null || true)
-ENV PYTHONPATH="/home/user_lerobot/libero-plus:${PYTHONPATH}"
-
-# Perturbation textures/scenes: bddl_base_domain.py resolves XMLs via
-# DIR_PATH/../assets (package-relative, ignoring ~/.libero/config.yaml). All
-# 2402 tasks reference files that ship only in Sylvest/LIBERO-plus's
-# assets.zip (6.4 GB) under a deep author-internal prefix — extract and
-# flatten it under ${LIBERO_PLUS_ROOT}/assets.
-RUN python -c "\
-from huggingface_hub import hf_hub_download; \
-hf_hub_download(repo_id='Sylvest/LIBERO-plus', repo_type='dataset', \
-                filename='assets.zip', local_dir='/tmp/libero-plus-dl')" \
-    && unzip -q /tmp/libero-plus-dl/assets.zip -d /tmp/libero-plus-dl/extract \
-    && ASSETS_DIR=$(find /tmp/libero-plus-dl/extract -type d -name assets | head -1) \
-    && mv "${ASSETS_DIR}" ${LIBERO_PLUS_ROOT}/assets \
-    && rm -rf /tmp/libero-plus-dl
-
-# Point ~/.libero/config.yaml at the clone so LIBERO-plus's imports are
-# non-interactive (it calls input() when the config is missing).
-RUN mkdir -p /home/user_lerobot/.libero \
-    && printf "assets: ${LIBERO_PLUS_ROOT}/assets\nbddl_files: ${LIBERO_PLUS_ROOT}/bddl_files\ndatasets: ${LIBERO_PLUS_ROOT}/../datasets\ninit_states: ${LIBERO_PLUS_ROOT}/init_files\n" \
-       > /home/user_lerobot/.libero/config.yaml
-
-# Overlay the PR's source code on top of the nightly image.
-COPY --chown=user_lerobot:user_lerobot . .
-
-CMD ["/bin/bash"]
@@ -1,27 +0,0 @@
-# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-# Benchmark image for MetaWorld integration tests.
-# Extends the nightly GPU image (which already has all extras installed)
-# with the PR's source code.
-#
-# Build:  docker build -f docker/Dockerfile.benchmark.metaworld -t lerobot-benchmark-metaworld .
-# Run:    docker run --gpus all --rm lerobot-benchmark-metaworld lerobot-eval ...
-
-FROM huggingface/lerobot-gpu:latest
-
-# Overlay the PR's source code on top of the nightly image.
-COPY --chown=user_lerobot:user_lerobot . .
-
-CMD ["/bin/bash"]
@@ -1,71 +0,0 @@
-# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-# Benchmark image for RoboCasa365 integration tests.
-# Extends the nightly GPU image (which already has all extras installed)
-# with the PR's source code and RoboCasa-specific asset setup.
-#
-# Build:  docker build -f docker/Dockerfile.benchmark.robocasa -t lerobot-benchmark-robocasa .
-# Run:    docker run --gpus all --rm lerobot-benchmark-robocasa lerobot-eval ...
-
-FROM huggingface/lerobot-gpu:latest
-
-# Install robocasa + robosuite as editable clones. pip-installing from git
-# omits data files like robocasa/models/assets/box_links/box_links_assets.json
-# (not declared in package_data), which download_kitchen_assets needs at import.
-#
-# `--no-deps` on robocasa is deliberate: its setup.py pins `lerobot==0.3.3`
-# in install_requires, which would shadow the editable lerobot baked into
-# this image. We install robocasa's actual runtime deps explicitly instead.
-# Pinned SHAs for reproducible benchmark runs. Bump when you need an
-# upstream fix; don't rely on `main`/`master` drift.
-ARG ROBOCASA_SHA=56e355ccc64389dfc1b8a61a33b9127b975ba681
-ARG ROBOSUITE_SHA=aaa8b9b214ce8e77e82926d677b4d61d55e577ab
-RUN git clone https://github.com/robocasa/robocasa.git ~/robocasa && \
-    git -C ~/robocasa checkout ${ROBOCASA_SHA} && \
-    git clone https://github.com/ARISE-Initiative/robosuite.git ~/robosuite && \
-    git -C ~/robosuite checkout ${ROBOSUITE_SHA} && \
-    uv pip install --no-cache -e ~/robocasa --no-deps && \
-    uv pip install --no-cache -e ~/robosuite && \
-    uv pip install --no-cache \
-      "numpy==2.2.5" "numba==0.61.2" "scipy==1.15.3" "mujoco==3.3.1" \
-      "pygame==2.6.1" "Pillow==12.2.0" "opencv-python==4.13.0.92" \
-      "pyyaml==6.0.3" "pynput==1.8.1" "tqdm==4.67.3" "termcolor==3.3.0" \
-      "imageio==2.37.3" "h5py==3.16.0" "lxml==6.0.4" "hidapi==0.14.0.post4" \
-      "tianshou==0.4.10" "gymnasium==1.2.3"
-
-# Set up robocasa macros and download kitchen assets. We need:
-#   - tex              : base environment textures
-#   - tex_generative   : AI-generated textures; kitchen fixture XMLs embed
-#                        refs to generative_textures/wall/tex*.png
-#                        unconditionally, so MjModel.from_xml_string fails
-#                        at reset time without them (even if the env is
-#                        constructed with generative_textures=None).
-#   - fixtures_lw      : lightwheel kitchen fixtures (fridge, counters...)
-#   - objs_lw          : lightwheel object meshes (stools, misc props)
-# We skip the objaverse/aigen object packs (~30GB combined) by pairing
-# this with --env.obj_registries=["lightwheel"] on the lerobot side.
-# The download script prompts interactively, so pipe 'y' to auto-accept.
-RUN python -m robocasa.scripts.setup_macros && \
-    yes y | python -m robocasa.scripts.download_kitchen_assets \
-      --type tex tex_generative fixtures_lw objs_lw
-
-# Overlay the PR's source code on top of the nightly image.
-COPY --chown=user_lerobot:user_lerobot . .
-
-# Re-install lerobot editably so the new source (with RoboCasaEnv registration)
-# replaces the stale package baked into the nightly image.
-RUN uv pip install --no-cache --no-deps -e .
-
-CMD ["/bin/bash"]
@@ -1,43 +0,0 @@
-# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-# Benchmark image for RoboCerebra integration tests.
-# RoboCerebra reuses LIBERO's simulator (libero_10 suite) with a different
-# rename_map, so this image is identical to the LIBERO benchmark image —
-# extends the nightly GPU base with LIBERO assets + the PR's source code.
-#
-# Build:  docker build -f docker/Dockerfile.benchmark.robocerebra -t lerobot-benchmark-robocerebra .
-# Run:    docker run --gpus all --rm lerobot-benchmark-robocerebra lerobot-eval ...
-
-FROM huggingface/lerobot-gpu:latest
-
-# Pre-download lerobot/libero-assets from HF Hub so nothing is fetched at
-# runtime (which times out on CI). Point the libero config at the cached path.
-# libero/libero/__init__.py calls input() when ~/.libero/config.yaml is missing,
-# so we write the config before any libero import can happen.
-RUN LIBERO_DIR=$(python -c \
-      "import importlib.util, os; s=importlib.util.find_spec('libero'); \
-       print(os.path.join(os.path.dirname(s.origin), 'libero'))") && \
-    mkdir -p /home/user_lerobot/.libero && \
-    python -c "\
-from huggingface_hub import snapshot_download; \
-snapshot_download(repo_id='lerobot/libero-assets', repo_type='dataset', \
-                  local_dir='/home/user_lerobot/.libero/assets')" && \
-    printf "assets: /home/user_lerobot/.libero/assets\nbddl_files: ${LIBERO_DIR}/bddl_files\ndatasets: ${LIBERO_DIR}/../datasets\ninit_states: ${LIBERO_DIR}/init_files\n" \
-    > /home/user_lerobot/.libero/config.yaml
-
-# Overlay the PR's source code on top of the nightly image.
-COPY --chown=user_lerobot:user_lerobot . .
-
-CMD ["/bin/bash"]
@@ -1,56 +0,0 @@
-# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-# Benchmark image for RoboMME integration tests.
-# Extends the nightly GPU image (which has lerobot[all]) with Vulkan system
-# libs for ManiSkill/SAPIEN and the robomme extra. robomme isn't in [all]
-# because mani-skill hard-pins gymnasium==0.29.1 and numpy<2.0.0 which
-# conflict with lerobot's defaults; both are safe at runtime:
-#   - gymnasium 0.29.x has the same 5-tuple step() API as 1.x (since 0.26)
-#   - numpy 1.26.4 is API-compatible with lerobot's actual usage.
-#
-# Build:  docker build -f docker/Dockerfile.benchmark.robomme -t lerobot-benchmark-robomme .
-# Run:    docker run --gpus all --rm lerobot-benchmark-robomme lerobot-eval ...
-
-FROM huggingface/lerobot-gpu:latest
-
-# NVIDIA Container Toolkit: expose Vulkan driver capability for headless rendering.
-ENV NVIDIA_DRIVER_CAPABILITIES=all \
-    VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/nvidia_icd.json
-
-# ManiSkill/SAPIEN's renderer needs Vulkan, which isn't in the base image.
-USER root
-RUN apt-get update \
-    && apt-get install -y --no-install-recommends \
-         libvulkan1 libvulkan-dev mesa-vulkan-drivers \
-    && mkdir -p /usr/share/vulkan/icd.d \
-    && echo '{"file_format_version":"1.0.0","ICD":{"library_path":"libGLX_nvidia.so.0","api_version":"1.3.0"}}' \
-       > /usr/share/vulkan/icd.d/nvidia_icd.json \
-    && apt-get clean && rm -rf /var/lib/apt/lists/*
-USER user_lerobot
-
-# Install smolvla + av-dep via the PR's pyproject, then layer robomme on top
-# with gymnasium/numpy overrides. robomme isn't a pyproject extra because its
-# mani-skill pin conflicts with lerobot's base numpy>=2 (see pyproject.toml).
-COPY --chown=user_lerobot:user_lerobot setup.py pyproject.toml uv.lock README.md MANIFEST.in ./
-RUN printf 'gymnasium==0.29.1\nnumpy==1.26.4\n' > /tmp/robomme_override.txt \
-    && uv pip install --no-cache --override /tmp/robomme_override.txt \
-         -e ".[smolvla,av-dep]" \
-         "robomme @ git+https://github.com/RoboMME/robomme_benchmark.git@main" \
-    && python -c "import robomme; print('robomme import OK')"
-
-# Overlay the PR's source code on top of the nightly image.
-COPY --chown=user_lerobot:user_lerobot . .
-
-CMD ["/bin/bash"]
@@ -1,138 +0,0 @@
-# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-# Benchmark image for RoboTwin 2.0 integration tests.
-# Extends the nightly GPU image with the RoboTwin simulator stack:
-#   sapien/mplib/pytorch3d + NVlabs CuRobo + embodiments.zip + objects.zip
-# (~3.96 GB of assets; background_texture.zip ~11 GB skipped for smoke eval).
-#
-# Build: docker build -f docker/Dockerfile.benchmark.robotwin -t lerobot-benchmark-robotwin .
-# Run:   docker run --gpus all --rm lerobot-benchmark-robotwin \
-#            lerobot-eval --env.type=robotwin --env.task=beat_block_hammer ...
-
-FROM huggingface/lerobot-gpu:latest
-
-ENV NVIDIA_DRIVER_CAPABILITIES=all \
-    VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/nvidia_icd.json \
-    ROBOTWIN_ROOT=/opt/robotwin
-
-# The nightly base is a CUDA `-base` image (no compiler, no Vulkan loader). CuRobo's
-# `pip install -e .` runs nvcc, and SAPIEN renders via Vulkan — add both.
-USER root
-# Pinned upstream SHA for reproducible benchmark runs. Bump when we need
-# an upstream fix; don't rely on `main` drift.
-ARG ROBOTWIN_SHA=0aeea2d669c0f8516f4d5785f0aa33ba812c14b4
-RUN apt-get update \
-    && apt-get install -y --no-install-recommends \
-         cuda-nvcc-12-8 cuda-cudart-dev-12-8 \
-         libvulkan1 vulkan-tools \
-    && mkdir -p /usr/share/vulkan/icd.d \
-    && echo '{"file_format_version":"1.0.0","ICD":{"library_path":"libGLX_nvidia.so.0","api_version":"1.3.0"}}' \
-       > /usr/share/vulkan/icd.d/nvidia_icd.json \
-    && git clone https://github.com/RoboTwin-Platform/RoboTwin.git ${ROBOTWIN_ROOT} \
-    && git -C ${ROBOTWIN_ROOT} checkout ${ROBOTWIN_SHA} \
-    && chown -R user_lerobot:user_lerobot ${ROBOTWIN_ROOT} \
-    && apt-get clean && rm -rf /var/lib/apt/lists/*
-USER user_lerobot
-
-# RoboTwin runtime deps (av is already in the base via [av-dep]).
-RUN uv pip install --no-cache \
-        "sapien==3.0.0b1" "mplib==0.2.1" "transforms3d==0.4.2" "trimesh==4.4.3" \
-        "open3d==0.19.0" "imageio==2.34.2" termcolor zarr pydantic h5py
-
-# pytorch3d has no universal wheel; must be built from source (~10 min, cached).
-RUN uv pip install --no-cache --no-build-isolation \
-        "git+https://github.com/facebookresearch/pytorch3d.git@stable"
-
-# CuRobo — NVlabs motion generator; TORCH_CUDA_ARCH_LIST must be set or the
-# build aborts on an empty arch list. RoboTwin's own installer pins v0.7.8,
-# which still exposes the v1 API (`curobo.types.math`) that RoboTwin imports.
-ARG CUROBO_REF=v0.7.8
-RUN cd ${ROBOTWIN_ROOT}/envs \
-    && git clone --branch ${CUROBO_REF} --depth 1 https://github.com/NVlabs/curobo.git \
-    && cd curobo \
-    && TORCH_CUDA_ARCH_LIST="7.0;7.5;8.0;8.6;8.9;9.0" \
-       uv pip install -e . --no-build-isolation --no-cache
-
-# Upstream patches (mirror RoboTwin's script/_install.sh).
-# These patches target the exact versions pinned above; re-check when upgrading.
-# mplib==0.2.1: drop a broken `or collide` clause in planner.py.
-#   Safe to remove once mplib > 0.2.1 ships with the fix upstream.
-# sapien==3.0.0b1: fix URDF loader encoding + .srdf extension check.
-#   Safe to remove once sapien > 3.0.0b1 ships with the fix upstream.
-RUN python - <<'EOF'
-import pathlib, re, site
-for d in site.getsitepackages():
-    p = pathlib.Path(d) / "mplib" / "planner.py"
-    if p.exists():
-        p.write_text(re.sub(r"\bor collide\b", "", p.read_text(), count=1))
-        print(f"mplib patch applied: {p}")
-    p = pathlib.Path(d) / "sapien" / "wrapper" / "urdf_loader.py"
-    if p.exists():
-        src = p.read_text().replace(
-            "with open(srdf_path) as f:", 'with open(srdf_path, encoding="utf-8") as f:'
-        ).replace('"srdf"', '".srdf"')
-        p.write_text(src)
-        print(f"sapien patch applied: {p}")
-EOF
-
-# Simulation assets from TianxingChen/RoboTwin2.0: embodiments (~220 MB) +
-# objects (~3.74 GB). background_texture (~11 GB) is intentionally skipped.
-# The dataset is public — no auth token needed.
-RUN python - <<'EOF'
-import os, pathlib, zipfile
-from huggingface_hub import hf_hub_download
-
-assets_dir = pathlib.Path(os.environ["ROBOTWIN_ROOT"]) / "assets"
-assets_dir.mkdir(parents=True, exist_ok=True)
-for fname in ("embodiments.zip", "objects.zip"):
-    local = hf_hub_download(
-        repo_id="TianxingChen/RoboTwin2.0",
-        repo_type="dataset",
-        filename=fname,
-        local_dir=str(assets_dir),
-    )
-    with zipfile.ZipFile(local, "r") as z:
-        z.extractall(str(assets_dir))
-    pathlib.Path(local).unlink()
-EOF
-
-WORKDIR ${ROBOTWIN_ROOT}
-RUN python script/update_embodiment_config_path.py
-
-ENV PYTHONPATH="${ROBOTWIN_ROOT}"
-
-# Fail the image build early if the CuRobo package layout regresses. Importing
-# RoboTwin's planner here is too eager because CuRobo constructs CUDA-backed
-# defaults at import time, while Docker builds don't have access to an NVIDIA
-# driver.
-RUN python - <<'EOF'
-from pathlib import Path
-
-from curobo.types.math import Pose
-
-planner_src = Path("/opt/robotwin/envs/robot/planner.py").read_text()
-assert "from curobo.types.math import Pose as CuroboPose" in planner_src
-
-print("CuRobo import OK:", Pose.__name__)
-print("RoboTwin planner import references curobo.types.math")
-EOF
-
-# Return to the lerobot source directory (set by base image) before overlaying.
-WORKDIR /lerobot
-
-# Overlay the PR's source code on top of the nightly image.
-COPY --chown=user_lerobot:user_lerobot . .
-
-CMD ["/bin/bash"]
@@ -1,99 +0,0 @@
-# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-# Benchmark image for VLABench integration tests.
-# Extends the nightly GPU image with the PR's source code and VLABench setup.
-#
-# Build:  docker build -f docker/Dockerfile.benchmark.vlabench -t lerobot-benchmark-vlabench .
-# Run:    docker run --gpus all --rm lerobot-benchmark-vlabench lerobot-eval ...
-
-FROM huggingface/lerobot-gpu:latest
-
-# Install VLABench from GitHub (not on PyPI) and pin MuJoCo/dm-control.
-# Shallow-clone without submodule recursion (nested SSH-only submodules fail in CI).
-# Editable install (-e) because VLABench/utils/ has no __init__.py, so
-# find_packages() omits it from wheels; editable mode uses the source tree directly.
-# rrt-algorithms has the same packaging issue (rrt/ dir missing __init__.py).
-# Patch: constant.py calls os.listdir on ~100 asset/obj/meshes/* dirs at import
-# time. Guard the call so missing dirs return [] instead of crashing (in case
-# the asset download is partial).
-#
-# Pinned upstream SHAs for reproducible benchmark runs. Bump when you need
-# an upstream fix; don't rely on `main`/`develop` drift.
-ARG VLABENCH_SHA=cf588fe60c0c7282174fe979f5913170cfe69017
-ARG RRT_ALGORITHMS_SHA=e51d95ee489a225220d6ae2a764c4111f6ba7d85
-RUN git clone https://github.com/OpenMOSS/VLABench.git ~/VLABench && \
-    git -C ~/VLABench checkout ${VLABENCH_SHA} && \
-    git clone https://github.com/motion-planning/rrt-algorithms.git ~/rrt-algorithms && \
-    git -C ~/rrt-algorithms checkout ${RRT_ALGORITHMS_SHA} && \
-    python3 -c "\
-import pathlib; \
-p = pathlib.Path.home() / 'VLABench/VLABench/configs/constant.py'; \
-t = p.read_text(); \
-p.write_text(t.replace( \
-    'subdirs = os.listdir(xml_dir)', \
-    'if not os.path.isdir(xml_dir): return []\n    subdirs = os.listdir(xml_dir)'))" && \
-    uv pip install --no-cache -e ~/VLABench -e ~/rrt-algorithms \
-      mujoco==3.2.2 dm-control==1.0.22 \
-      open3d colorlog scikit-learn openai gdown
-
-# Download VLABench mesh assets. Task configs reference object meshes
-# (obj/meshes/fruit/, containers/basket/, tablewares/plates/, etc.); without
-# them the task builder picks from an empty mesh list and crashes with
-# IndexError at task-build time (random.choice([]) in config_manager.py).
-#
-# Preferred source: an HF Hub mirror. Set VLABENCH_ASSETS_REPO at build time
-# (e.g. --build-arg VLABENCH_ASSETS_REPO=lerobot/vlabench-assets) and we'll
-# snapshot_download the repo into VLABench's assets dir. This is the reliable
-# path for CI — Google Drive frequently returns HTTP 429 ("Too many users have
-# viewed or downloaded this file recently") on shared academic files.
-#
-# After download we *validate* that at least one XML exists under each
-# task-critical subtree and fail the build loudly if not. Silent-empty asset
-# dirs are the #1 cause of VLABench runtime crashes in CI, so we surface them
-# here rather than after a 10-minute eval build.
-#
-# Fallback: VLABench's own gdown-based script. Best-effort only.
-ARG VLABENCH_ASSETS_REPO=""
-RUN ASSETS_DIR="$HOME/VLABench/VLABench/assets" && \
-    if [ -n "${VLABENCH_ASSETS_REPO}" ]; then \
-        echo "Downloading VLABench assets from HF Hub: ${VLABENCH_ASSETS_REPO}" && \
-        uv pip install --no-cache "huggingface_hub[hf_xet]>=0.26" && \
-        python -c "from huggingface_hub import snapshot_download; \
-p = snapshot_download(repo_id='${VLABENCH_ASSETS_REPO}', repo_type='dataset', \
-    local_dir='${ASSETS_DIR}', allow_patterns=['obj/**', 'scenes/**']); \
-print('snapshot_download returned:', p)"; \
-    else \
-        echo "No VLABENCH_ASSETS_REPO set — falling back to gdown" && \
-        python ~/VLABench/scripts/download_assets.py --choice all; \
-    fi && \
-    python -c "\
-from pathlib import Path; \
-import sys; \
-root = Path('${ASSETS_DIR}'); \
-checks = ['obj/meshes/tablewares/plates', 'obj/meshes/containers/basket', 'obj/meshes/fruit', 'obj/meshes/containers/tray']; \
-failed = []; \
-print(f'Validating VLABench assets under {root}'); \
-[print(f'  {c}: {len(list((root/c).rglob(\"*.xml\")))} XMLs') for c in checks]; \
-[failed.append(c) for c in checks if not any((root/c).rglob('*.xml'))]; \
-sys.exit(f'Empty asset dirs (no *.xml): {failed}') if failed else print('All asset dirs populated.')"
-
-# Overlay the PR's source code on top of the nightly image.
-COPY --chown=user_lerobot:user_lerobot . .
-
-# Re-install lerobot editably so the new source (with VLABenchEnv registration
-# and updated obs handling) replaces the stale package baked into the nightly image.
-RUN uv pip install --no-cache --no-deps -e .
-
-CMD ["/bin/bash"]
@@ -18,8 +18,9 @@
 # docker build -f docker/Dockerfile.internal -t lerobot-internal .

 # Configure the base image for CI with GPU access
-ARG CUDA_VERSION=12.8.1
-ARG OS_VERSION=24.04
+# TODO(Steven): Bump these versions
+ARG CUDA_VERSION=12.4.1
+ARG OS_VERSION=22.04
 FROM nvidia/cuda:${CUDA_VERSION}-base-ubuntu${OS_VERSION}

 # Define Python version argument
@@ -35,13 +36,16 @@ ENV DEBIAN_FRONTEND=noninteractive \

 # Install Python, system dependencies, and uv (as root)
 RUN apt-get update && apt-get install -y --no-install-recommends \
-    build-essential git curl \
-    libglib2.0-0 libgl1 libegl1 ffmpeg \
+    software-properties-common build-essential git curl \
+    libglib2.0-0 libgl1-mesa-glx libegl1-mesa ffmpeg \
    libusb-1.0-0-dev speech-dispatcher libgeos-dev portaudio19-dev \
    cmake pkg-config ninja-build \
-    python${PYTHON_VERSION} \
-    python${PYTHON_VERSION}-venv \
-    python${PYTHON_VERSION}-dev \
+    && add-apt-repository -y ppa:deadsnakes/ppa \
+    && apt-get update \
+    && apt-get install -y --no-install-recommends \
+       python${PYTHON_VERSION} \
+       python${PYTHON_VERSION}-venv \
+       python${PYTHON_VERSION}-dev \
    && curl -LsSf https://astral.sh/uv/install.sh | sh \
    && mv /root/.local/bin/uv /usr/local/bin/uv \
    && useradd --create-home --shell /bin/bash user_lerobot \
@@ -3,14 +3,12 @@
    title: LeRobot
  - local: installation
    title: Installation
-  - local: cheat-sheet
-    title: Cheat sheet
  title: Get started
 - sections:
  - local: il_robots
    title: Imitation Learning for Robots
  - local: bring_your_own_policies
-    title: Adding a Policy
+    title: Bring Your Own Policies
  - local: integrate_hardware
    title: Bring Your Own Hardware
  - local: hilserl
@@ -26,12 +24,6 @@
  - local: rename_map
    title: Using Rename Map and Empty Cameras
  title: "Tutorials"
- sections:
-  - local: hardware_guide
-    title: Compute Hardware Guide
-  - local: torch_accelerators
-    title: PyTorch accelerators
-  title: "Compute & Hardware"
 - sections:
  - local: lerobot-dataset-v3
    title: Using LeRobotDataset
@@ -55,8 +47,6 @@
    title: π₀-FAST (Pi0Fast)
  - local: pi05
    title: π₀.₅ (Pi05)
-  - local: eo1
-    title: EO-1
  - local: groot
    title: NVIDIA GR00T N1.5
  - local: xvla
@@ -71,8 +61,6 @@
    title: SARM
  title: "Reward Models"
 - sections:
-  - local: inference
-    title: Policy Deployment (lerobot-rollout)
  - local: async
    title: Use Async Inference
  - local: rtc
@@ -89,22 +77,10 @@
    title: Adding a New Benchmark
  - local: libero
    title: LIBERO
-  - local: libero_plus
-    title: LIBERO-plus
  - local: metaworld
    title: Meta-World
-  - local: robotwin
-    title: RoboTwin 2.0
-  - local: robocasa
-    title: RoboCasa365
-  - local: robocerebra
-    title: RoboCerebra
-  - local: robomme
-    title: RoboMME
  - local: envhub_isaaclab_arena
    title: NVIDIA IsaacLab Arena Environments
-  - local: vlabench
-    title: VLABench
  title: "Benchmarks"
 - sections:
  - local: introduction_processors
@@ -150,6 +126,10 @@
  - local: cameras
    title: Cameras
  title: "Sensors"
+- sections:
+  - local: torch_accelerators
+    title: PyTorch accelerators
+  title: "Supported Hardware"
 - sections:
  - local: notebooks
    title: Notebooks
@@ -1,37 +1,60 @@
-# Adding a Policy
+# Bring Your Own Policies

-This guide walks you through implementing a custom policy and getting it to work with LeRobot's training, evaluation, and deployment tools. There are two paths:
+This tutorial explains how to integrate your own custom policy implementations into the LeRobot ecosystem, allowing you to leverage all LeRobot tools for training, evaluation, and deployment while using your own algorithms.

- **Plugin (out-of-tree)** — ship your policy as a standalone `lerobot_policy_*` package. Faster, no PR required, easy to iterate. Right for experimentation, internal use, or when you want to publish independently.
- **In-tree (contributed to LeRobot)** — land your policy directly in `src/lerobot/policies/`. Requires a PR, but makes your policy a first-class citizen of the library.
+## Step 1: Create a Policy Package

-The plugin route is usually the right starting point — promote to in-tree once the policy has stabilized and there's clear value in shipping it with the library.
+Your custom policy should be organized as an installable Python package following LeRobot's plugin conventions.

-Either way, the building blocks are the same: a configuration class, a policy class, and a processor factory. The first half of this guide covers those shared pieces; the second half covers the path-specific scaffolding ([Path A](#path-a-out-of-tree-plugin), [Path B](#path-b-contributing-in-tree)).
+### Package Structure

-A note on tone: robot-learning is an actively evolving field, and "what a policy looks like" can shift with each new architecture. The conventions described here exist because they let `lerobot-train` and `lerobot-eval` work uniformly across very different models. When a new policy genuinely doesn't fit them, raise it (in your PR, or an issue) — the conventions are not sacred.
+Create a package with the prefix `lerobot_policy_` (IMPORTANT!) followed by your policy name:

---
+```bash
+lerobot_policy_my_custom_policy/
+├── pyproject.toml
+└── src/
+    └── lerobot_policy_my_custom_policy/
+        ├── __init__.py
+        ├── configuration_my_custom_policy.py
+        ├── modeling_my_custom_policy.py
+        └── processor_my_custom_policy.py
+```

-## Anatomy of a policy
+### Package Configuration

-Three building blocks make up every policy. The names below use `my_policy` as a placeholder — replace with your policy's name. That name is load-bearing: it must match the string you pass to `@PreTrainedConfig.register_subclass`, the `MyPolicy.name` class attribute, and the `make_<name>_pre_post_processors` factory function (more on each below).
+Set up your `pyproject.toml`:

-### Configuration class
+```toml
+[project]
+name = "lerobot_policy_my_custom_policy"
+version = "0.1.0"
+dependencies = [
+    # your policy-specific dependencies
+]
+requires-python = ">= 3.12"

-Inherit from [`PreTrainedConfig`](https://github.com/huggingface/lerobot/blob/main/src/lerobot/configs/policies.py) and register your policy type. Here is a template — customize the parameters and methods as needed for your policy's architecture and training requirements.
+[build-system]
+requires = ["your-build-system"]  # replace with your build requirements
+build-backend = "your-build-backend"  # replace with your build backend
+```
+
+## Step 2: Define the Policy Configuration
+
+Create a configuration class that inherits from [`PreTrainedConfig`](https://github.com/huggingface/lerobot/blob/main/src/lerobot/configs/policies.py) and registers your policy type.
+Here is a template to get you started; customize the parameters and methods as needed for your policy's architecture and training requirements.

 ```python
-# configuration_my_policy.py
+# configuration_my_custom_policy.py
 from dataclasses import dataclass, field
 from lerobot.configs import PreTrainedConfig
 from lerobot.optim import AdamWConfig
 from lerobot.optim import CosineDecayWithWarmupSchedulerConfig

-@PreTrainedConfig.register_subclass("my_policy")
+@PreTrainedConfig.register_subclass("my_custom_policy")
@dataclass
-class MyPolicyConfig(PreTrainedConfig):
-    """Configuration class for MyPolicy.
+class MyCustomPolicyConfig(PreTrainedConfig):
+    """Configuration class for MyCustomPolicy.

    Args:
        n_obs_steps: Number of observation steps to use as input
@@ -54,20 +77,16 @@ class MyPolicyConfig(PreTrainedConfig):
            raise ValueError("n_action_steps cannot exceed horizon")

    def validate_features(self) -> None:
-        """Validate input/output feature compatibility.
-
-        Call this explicitly from your policy's __init__ — the base class does not.
-        """
+        """Validate input/output feature compatibility."""
        if not self.image_features:
-            raise ValueError("MyPolicy requires at least one image feature.")
+            raise ValueError("MyCustomPolicy requires at least one image feature.")
        if self.action_feature is None:
-            raise ValueError("MyPolicy requires 'action' in output_features.")
+            raise ValueError("MyCustomPolicy requires 'action' in output_features.")

    def get_optimizer_preset(self) -> AdamWConfig:
        return AdamWConfig(lr=self.optimizer_lr, weight_decay=self.optimizer_weight_decay)

    def get_scheduler_preset(self):
-        """Return a LRSchedulerConfig from lerobot.optim, or None."""
        return None

    @property
@@ -82,7 +101,8 @@ class MyPolicyConfig(PreTrainedConfig):

    @property
    def action_delta_indices(self) -> list[int]:
-        """Relative timestep offsets for the action chunk the dataset loader returns."""
+        """Relative timestep offsets for the action chunk the dataset loader returns.
+        """
        return list(range(self.horizon))

    @property
@@ -90,34 +110,32 @@ class MyPolicyConfig(PreTrainedConfig):
        return None
 ```

-The string you pass to `@register_subclass` must match `MyPolicy.name` (next section) and is what users supply as `--policy.type` on the CLI. Default to `AdamW` from `lerobot.optim` for `get_optimizer_preset` unless you genuinely need otherwise.
+## Step 3: Implement the Policy Class

-### Policy class
-
-Inherit from [`PreTrainedPolicy`](https://github.com/huggingface/lerobot/blob/main/src/lerobot/policies/pretrained.py) and set two class attributes — both are checked by `__init_subclass__`:
+Create your policy implementation by inheriting from [`PreTrainedPolicy`](https://github.com/huggingface/lerobot/blob/main/src/lerobot/policies/pretrained.py):

 ```python
-# modeling_my_policy.py
+# modeling_my_custom_policy.py
 import torch
 import torch.nn as nn
 from typing import Any

 from lerobot.policies import PreTrainedPolicy
 from lerobot.utils.constants import ACTION
-from .configuration_my_policy import MyPolicyConfig
+from .configuration_my_custom_policy import MyCustomPolicyConfig

-class MyPolicy(PreTrainedPolicy):
-    config_class = MyPolicyConfig  # must match the string in @register_subclass
-    name = "my_policy"
+class MyCustomPolicy(PreTrainedPolicy):
+    config_class = MyCustomPolicyConfig
+    name = "my_custom_policy"  # must match the string in @register_subclass

-    def __init__(self, config: MyPolicyConfig, dataset_stats: dict[str, Any] = None):
+    def __init__(self, config: MyCustomPolicyConfig, dataset_stats: dict[str, Any] | None = None):
        super().__init__(config, dataset_stats)
        config.validate_features()  # not called automatically by the base class
        self.config = config
        self.model = ...  # your nn.Module here

    def reset(self):
-        """Reset per-episode state. Called by lerobot-eval at the start of each episode."""
+        """Reset episode state."""
        ...

    def get_optim_params(self) -> dict:
@@ -129,51 +147,35 @@ class MyPolicy(PreTrainedPolicy):
        ...

    def select_action(self, batch: dict[str, torch.Tensor], **kwargs) -> torch.Tensor:
-        """Return a single action for the current timestep (called every step at inference)."""
+        """Return a single action for the current timestep (called at inference)."""
        ...

-    def forward(self, batch: dict[str, torch.Tensor]) -> tuple[torch.Tensor, dict | None]:
+    def forward(self, batch: dict[str, torch.Tensor]) -> dict[str, torch.Tensor]:
        """Compute the training loss.

-        Returns `(loss, output_dict)`. `output_dict` may be `None`; everything in it must be
-        logging-friendly Python natives (no tensors with gradients).
-
        `batch["action_is_pad"]` is a bool mask of shape (B, horizon) that marks
-        timesteps padded because the episode ended before `horizon` steps; you
+        timesteps padded because the episode ended before `horizon` steps. You
        can exclude those from your loss.
        """
        actions = batch[ACTION]
        action_is_pad = batch.get("action_is_pad")
        ...
-        return loss, {"some_loss_component": some_loss_component.item()}
+        return {"loss": ...}
 ```
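
The `action_is_pad` mask described in the `forward` docstring can be applied as in the minimal sketch below. It assumes a squared-error loss, and plain Python lists stand in for tensors so that it runs without PyTorch. The `masked_mse` helper is hypothetical and is not part of LeRobot.

```python
def masked_mse(pred, target, action_is_pad):
    """Mean squared error over non-padded timesteps only (hypothetical helper).

    pred, target: nested lists shaped (B, horizon, action_dim).
    action_is_pad: nested lists of bools shaped (B, horizon); True marks padding.
    """
    total, count = 0.0, 0
    for pred_seq, target_seq, pad_seq in zip(pred, target, action_is_pad):
        for pred_step, target_step, is_pad in zip(pred_seq, target_seq, pad_seq):
            if is_pad:
                continue  # padded timestep: contributes nothing to the loss
            for p, t in zip(pred_step, target_step):
                total += (p - t) ** 2
                count += 1
    # Guard against a batch that consists entirely of padding.
    return total / count if count else 0.0


# One episode, horizon of 3, action_dim of 2; the last timestep is padding.
pred = [[[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]]
target = [[[1.0, 0.0], [3.0, 2.0], [0.0, 0.0]]]
action_is_pad = [[False, False, True]]
loss = masked_mse(pred, target, action_is_pad)
```

With tensors, the same idea is usually expressed by multiplying the per-element loss by the inverted mask and dividing by the number of unmasked elements.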

-The methods called by the train/eval loops:
+## Step 4: Add Data Processors

-| Method                                                            | Used by           | What it does                                                                                                                                                                                                                                         |
-| ----------------------------------------------------------------- | ----------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `reset() -> None`                                                 | `lerobot-eval`    | Clear per-episode state at the start of each episode.                                                                                                                                                                                                |
-| `select_action(batch, **kwargs) -> Tensor`                        | `lerobot-eval`    | Return the next action `(B, action_dim)`. Called every step.                                                                                                                                                                                         |
-| `predict_action_chunk(batch, **kwargs) -> Tensor`                 | the policy itself | Return an action chunk `(B, chunk_size, action_dim)`. Currently abstract on the base class — raise `NotImplementedError` if your policy doesn't chunk.                                                                                               |
-| `forward(batch, reduction="mean") -> tuple[Tensor, dict \| None]` | `lerobot-train`   | Return `(loss, output_dict)`. Accept `reduction="none"` if you want to support per-sample weighting.                                                                                                                                                 |
-| `get_optim_params() -> dict`                                      | the optimizer     | Return `self.parameters()` for simple policies; return a named parameter dict for [multi-optimizer policies](https://github.com/huggingface/lerobot/blob/ecd38c50d7d15b4184cf42649ff1185ee2e11eeb/src/lerobot/policies/sac/modeling_sac.py#L61-L73). |
-| `update() -> None` _(optional)_                                   | `lerobot-train`   | Called after each optimizer step _if defined_. Use for EMA, target nets, replay buffers (TDMPC uses this).                                                                                                                                           |
-
-Batches are flat dictionaries keyed by the constants in [`lerobot.utils.constants`](https://github.com/huggingface/lerobot/blob/main/src/lerobot/utils/constants.py): `OBS_STATE` (`observation.state.<motor>`), `OBS_IMAGES` (`observation.images.<camera>`), `OBS_LANGUAGE`, `ACTION`, etc. Reuse the constants — don't invent new prefixes.
-
-### Processor functions
-
-LeRobot uses `PolicyProcessorPipeline`s to normalize inputs and de-normalize outputs around your policy. For a concrete reference, see [`processor_act.py`](https://github.com/huggingface/lerobot/blob/main/src/lerobot/policies/act/processor_act.py) or [`processor_diffusion.py`](https://github.com/huggingface/lerobot/blob/main/src/lerobot/policies/diffusion/processor_diffusion.py).
+Create processor functions that build the pre- and post-processing pipelines around your policy, for example to normalize inputs and de-normalize outputs. For a concrete reference, see [processor_act.py](https://github.com/huggingface/lerobot/blob/main/src/lerobot/policies/act/processor_act.py) or [processor_diffusion.py](https://github.com/huggingface/lerobot/blob/main/src/lerobot/policies/diffusion/processor_diffusion.py).

 ```python
-# processor_my_policy.py
+# processor_my_custom_policy.py
 from typing import Any
 import torch

 from lerobot.processor import PolicyAction, PolicyProcessorPipeline


-def make_my_policy_pre_post_processors(
+def make_my_custom_policy_pre_post_processors(
    config,
    dataset_stats: dict[str, dict[str, torch.Tensor]] | None = None,
 ) -> tuple[
@@ -185,48 +187,11 @@ def make_my_policy_pre_post_processors(
    return preprocessor, postprocessor
 ```

-**Important — function naming:** LeRobot discovers your processor by name. The function **must** be called `make_{policy_name}_pre_post_processors` (matching the string you passed to `@PreTrainedConfig.register_subclass`).
+**Important - function naming:** LeRobot discovers your processor by name. The function **must** be called `make_{policy_name}_pre_post_processors` (matching the string you passed to `@PreTrainedConfig.register_subclass`).
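
To illustrate why the function name matters, here is a minimal sketch of name-based lookup. It is not LeRobot's actual discovery code: `resolve_processor_factory` is a hypothetical helper, and a bare `types.ModuleType` stands in for an installed plugin package.

```python
import types


def resolve_processor_factory(module, policy_name):
    """Look up `make_<policy_name>_pre_post_processors` on `module` (hypothetical helper)."""
    factory_name = f"make_{policy_name}_pre_post_processors"
    factory = getattr(module, factory_name, None)
    if not callable(factory):
        raise AttributeError(
            f"{module.__name__!r} does not define {factory_name!r}; "
            "check that the function name matches the registered policy type"
        )
    return factory


# Stand-in for an installed plugin package exposing the factory by convention.
plugin = types.ModuleType("lerobot_policy_my_custom_policy")
plugin.make_my_custom_policy_pre_post_processors = (
    lambda config, dataset_stats=None: ("pre", "post")
)

factory = resolve_processor_factory(plugin, "my_custom_policy")
preprocessor, postprocessor = factory(config=None)
```

A lookup of this kind fails as soon as the function name and the registered policy type drift apart, which is why the two must be kept in sync.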

---
+## Step 5: Package Initialization

-## Path A: Out-of-tree plugin
-
-The fastest way to ship a policy: package it as a standalone Python distribution and install it alongside LeRobot. No PR required, you own the release cycle, and you can publish to PyPI under your own namespace.
-
-### Package structure
-
-Create a package with the prefix `lerobot_policy_` (IMPORTANT!) followed by your policy name:
-
-```bash
-lerobot_policy_my_policy/
-├── pyproject.toml
-└── src/
-    └── lerobot_policy_my_policy/
-        ├── __init__.py
-        ├── configuration_my_policy.py
-        ├── modeling_my_policy.py
-        └── processor_my_policy.py
-```
-
-### `pyproject.toml`
-
-```toml
-[project]
-name = "lerobot_policy_my_policy"
-version = "0.1.0"
-dependencies = [
-    # your policy-specific dependencies
-]
-requires-python = ">= 3.12"
-
-[build-system]
-build-backend = # your-build-backend
-requires = # your-build-system
-```
-
-### Package `__init__.py`
-
-Expose your classes in the package's `__init__.py` and guard against missing `lerobot`:
+Expose your classes in the package's `__init__.py`:

 ```python
 # __init__.py
@@ -239,148 +204,44 @@ except ImportError:
        "lerobot is not installed. Please install lerobot to use this policy package."
    )

-from .configuration_my_policy import MyPolicyConfig
-from .modeling_my_policy import MyPolicy
-from .processor_my_policy import make_my_policy_pre_post_processors
+from .configuration_my_custom_policy import MyCustomPolicyConfig
+from .modeling_my_custom_policy import MyCustomPolicy
+from .processor_my_custom_policy import make_my_custom_policy_pre_post_processors

 __all__ = [
-    "MyPolicyConfig",
-    "MyPolicy",
-    "make_my_policy_pre_post_processors",
+    "MyCustomPolicyConfig",
+    "MyCustomPolicy",
+    "make_my_custom_policy_pre_post_processors",
 ]
 ```

-### Install and use
+## Step 6: Installation and Usage
+
+### Install Your Policy Package

 ```bash
-cd lerobot_policy_my_policy
+cd lerobot_policy_my_custom_policy
 pip install -e .

 # Or install from PyPI if published
-pip install lerobot_policy_my_policy
+pip install lerobot_policy_my_custom_policy
 ```

+### Use Your Policy
+
 Once installed, your policy automatically integrates with LeRobot's training and evaluation tools:

 ```bash
 lerobot-train \
-    --policy.type my_policy \
+    --policy.type my_custom_policy \
    --env.type pusht \
    --steps 200000
 ```

---
-
-## Path B: Contributing in-tree
-
-When your policy has stabilized and there's clear value in shipping it with the library, you can land it directly in LeRobot. Read the general [contribution guide](./contributing) and the [PR template](https://github.com/huggingface/lerobot/blob/main/.github/PULL_REQUEST_TEMPLATE.md) first — that's where you'll find the testing/quality expectations every PR has to meet (`pre-commit run -a`, `pytest`, the community-review rule, etc.). What's below is the policy-specific layer on top of that.
-
-### In-tree layout
-
-```
-src/lerobot/policies/my_policy/
-├── __init__.py                    # re-exports config + modeling + processor factory
-├── configuration_my_policy.py     # MyPolicyConfig + @register_subclass
-├── modeling_my_policy.py          # MyPolicy(PreTrainedPolicy)
-├── processor_my_policy.py         # make_my_policy_pre_post_processors
-└── README.md                      # symlink → ../../../../docs/source/policy_my_policy_README.md
-```
-
-Two notes:
-
- The `README.md` next to the source is a **symlink** into `docs/source/policy_<name>_README.md` — the actual file lives under `docs/`. Existing policies (act, smolvla, diffusion, …) all do this; copy one of those symlinks. The policy README is conventionally minimal: paper link + BibTeX citation.
- The user-facing tutorial — what to install, how to train, hyperparameters, benchmark numbers — lives separately at `docs/source/<my_policy>.mdx` and is registered in `_toctree.yml` under "Policies".
-
-The file names are load-bearing: the factory does lazy imports by name, and the processor is discovered by the `make_<policy_name>_pre_post_processors` convention.
-
-### Wiring
-
-Three places need to know about your policy. All by name.
-
-1. **`policies/__init__.py`** — re-export `MyPolicyConfig` and add it to `__all__`. **Don't** re-export the modeling class; it loads lazily through the factory (so `import lerobot` stays fast).
-2. **`factory.py:get_policy_class`** — add a branch returning `MyPolicy` from a lazy import.
-3. **`factory.py:make_policy_config`** and **`factory.py:make_pre_post_processors`** — same idea, two more branches.
-
-Mirror an existing policy that's structurally similar to yours; the diff is small.
-
-### Heavy / optional dependencies
-
-Most policies need a heavy backbone (transformers, diffusers, a specific VLM SDK). The convention is **two-step gating**: a `TYPE_CHECKING`-guarded import at module top, and a `require_package` runtime check in the constructor. [`modeling_diffusion.py`](https://github.com/huggingface/lerobot/blob/main/src/lerobot/policies/diffusion/modeling_diffusion.py) is the canonical reference:
-
-```python
-from typing import TYPE_CHECKING
-from lerobot.utils.import_utils import _diffusers_available, require_package
-
-if TYPE_CHECKING or _diffusers_available:
-    from diffusers.schedulers.scheduling_ddim import DDIMScheduler
-else:
-    DDIMScheduler = None  # keeps the symbol bindable at import time
-
-class DiffusionPolicy(PreTrainedPolicy):
-    def __init__(self, config):
-        require_package("diffusers", extra="diffusion")
-        super().__init__(config)
-        ...
-```
-
-This way:
-
- `import lerobot.policies` keeps working without the extra installed (the symbol is just bound to `None`).
- Type checkers see the real symbol.
- Instantiating the policy without the extra raises a clear `ImportError` pointing at `pip install 'lerobot[diffusion]'`.
-
-Add a matching extra to [`pyproject.toml`](https://github.com/huggingface/lerobot/blob/main/pyproject.toml) `[project.optional-dependencies]` and include it in the `all` extra so `pip install 'lerobot[all]'` keeps installing everything.
-
-### Benchmarks and a published checkpoint
-
-A new policy is much easier to review — and far more useful — when it ships with a working checkpoint and at least one number you can reproduce.
-
-**Pick at least one in-tree benchmark.** LeRobot ships sim benchmarks with per-benchmark Docker images (LIBERO, LIBERO-plus, Meta-World, RoboTwin 2.0, RoboCasa365, RoboCerebra, RoboMME, VLABench and more). Pick the one that matches your policy's modality — VLAs usually go to LIBERO or VLABench; image-only BC to LIBERO or Meta-World. The full list lives under [Benchmarks](./libero) in the docs sidebar.
-
-**Push the checkpoint & processors** to the Hub under `lerobot/<policy>_<benchmark>` (or your namespace if you don't have write access; a maintainer can mirror it). Use `PreTrainedPolicy.push_model_to_hub` so the repo gets `config.json`, `model.safetensors`, and a model card.
-
-**Report results in your policy's MDX**, with the exact `lerobot-eval` command and hardware so anyone can re-run:
-
-```markdown
-## Results
-
-Evaluated on LIBERO with `lerobot/<policy>_libero`:
-
-| Suite          | Success rate | n_episodes |
-| -------------- | -----------: | ---------: |
-| libero_spatial |        87.5% |         50 |
-| libero_object  |        93.0% |         50 |
-| libero_goal    |        81.5% |         50 |
-| libero_10      |        62.0% |         50 |
-| **average**    |    **81.0%** |        200 |
-
-Reproduce: `lerobot-eval --policy.path=lerobot/<policy>_libero --env.type=libero --env.task=libero_spatial --eval.n_episodes=50` (1× A100 40 GB).
-```
-
-Use `n_episodes ≥ 50` per suite for stable success-rate estimates.
-
-If your policy is real-robot-only and no sim benchmark applies, swap the sim eval for: a public training dataset on the Hub, the `lerobot-train` command, the checkpoint, and a real-robot success rate over ≥10 episodes via `lerobot-rollout --policy.path=...`.
-
-### PR checklist
-
-The general expectations are in [`CONTRIBUTING.md`](https://github.com/huggingface/lerobot/blob/main/CONTRIBUTING.md) and the [PR template](https://github.com/huggingface/lerobot/blob/main/.github/PULL_REQUEST_TEMPLATE.md). On top of those, reviewers will look for:
-
- [ ] `MyPolicy` and `MyPolicyConfig` cover the surface above; `__init_subclass__` accepts the class.
- [ ] `factory.py` and `policies/__init__.py` are wired (lazy imports for modeling).
- [ ] `make_my_policy_pre_post_processors` follows the naming convention.
- [ ] Optional deps live behind a `[project.optional-dependencies]` extra and the `TYPE_CHECKING + require_package` guard.
- [ ] `tests/policies/` updated; backward-compat artifact committed & policy-specific tests.
- [ ] `src/lerobot/policies/<name>/README.md` symlinked into `docs/source/policy_<name>_README.md`; user-facing `docs/source/<name>.mdx` written and added to `_toctree.yml`.
- [ ] At least one reproducible benchmark eval in the policy MDX with a published checkpoint (sim benchmark, or real-robot dataset + checkpoint).
-
-The fastest way to get a clean PR is to copy the directory of the existing policy closest to yours, rename, and replace contents method by method. Don't wait until everything is polished — open a draft PR early and iterate with us; reviewers would much rather give feedback on a half-finished branch than a fully-merged one.
-
---
-
-## Examples and community contributions
+## Examples and Community Contributions

 Check out these example policy implementations:

- [DiTFlow Policy](https://github.com/danielsanjosepro/lerobot_policy_ditflow) — Diffusion Transformer policy with flow-matching objective. Try it out in this example: [DiTFlow Example](https://github.com/danielsanjosepro/test_lerobot_policy_ditflow)
+- [DiTFlow Policy](https://github.com/danielsanjosepro/lerobot_policy_ditflow) - Diffusion Transformer policy with flow-matching objective. Try it out in this example: [DiTFlow Example](https://github.com/danielsanjosepro/test_lerobot_policy_ditflow)

-Thanks for taking the time to bring a new policy into LeRobot. Every architecture that lands in `main` — and every plugin published by the community — makes the library a little more useful for the next person, and a little more representative of where robot learning is going. We're looking forward to seeing what you ship. 🤗
+Share your policy implementations with the community! 🤗
@@ -1,48 +0,0 @@
-# Cheat sheet
-All of the LeRobot commands in one place. If you forgot how to use a specific command or want to learn about a new one you can do it here.
-
-❗ For all of the commands listed below remember to change the ports/names/ids to your own values!
-
-> [!TIP]
-> Another great way to look at all the commands and configure them for your specific setup is to use this [Jupyter Notebook](https://github.com/huggingface/lerobot/blob/main/examples/notebooks/quickstart.ipynb).
-
-### Install
-
-### Useful tools
-###### Find port
-
-```bash
-lerobot-find-port
-```
-
-###### Find cameras
-```bash
-lerobot-find-cameras
-```
-
-### Calibration
-
-### Teleoperation
-
-### Recording a dataset
-
-### Training
-
-### Inference
-
-Inference means running the trained policy/model on a robot. For that we use ```lerobot-rollout```. You will need to provide a path to your policy. It can be a local path or a path on Hugging Face, for example "lerobot/folding_latest". Your camera configuration needs to match what was used when collecting the dataset. Duration is in seconds; if unspecified, it will run forever.
-
-> [!TIP]
-> If you are using the previous release V0.5.1 instead of ```lerobot-rollout``` you need to use ```lerobot-record```
-
-``` bash
-lerobot-rollout \
-  --strategy.type=base \
-  --policy.path=${HF_USER}/my_policy \
-  --robot.type=so101_follower \
-  --robot.port=/dev/ttyACM1 \
-  --robot.cameras="{ up: {type: opencv, index_or_path: /dev/video1, width: 640, height: 480, fps: 30}, side: {type: opencv, index_or_path: /dev/video5, width: 640, height: 480, fps: 30}}" \
-  --task="Put lego brick into the transparent box" \
-  --duration=60
-```
-
@@ -1,168 +0,0 @@
-# EO-1
-
-EO-1 is a **Vision-Language-Action policy for robot control**. The LeRobot implementation integrates EO-1 with the standard LeRobot training, evaluation, and processor interfaces.
-
-## Model Overview
-
-EO-1 uses a Qwen2.5-VL backbone for vision-language understanding and adds a continuous flow-matching action head for robot control. The policy formats each robot-control sample as a multimodal conversation: camera images are passed to Qwen2.5-VL, the robot state is represented with EO-1 state tokens, and the future action chunk is represented with EO-1 action tokens.
-
-<img
-  src="https://huggingface.co/datasets/HaomingSong/lerobot-documentation-images/resolve/main/lerobot/eo_pipeline.png"
-  alt="An overview of EO-1"
-  width="85%"
-/>
-
-During training, EO-1 learns to denoise continuous action chunks at the action-token positions. During inference, it samples an action chunk, returns continuous actions, and executes `n_action_steps` from the chunk before sampling again.
-
-### What the LeRobot Integration Covers
-
-- Standard `policy.type=eo1` configuration through LeRobot
-- Qwen2.5-VL image and text preprocessing through policy processors
-- Continuous flow-matching action prediction
-- Checkpoint save/load through LeRobot policy APIs
-- Training with `lerobot-train` and evaluation with `lerobot-eval`
-
-The broader EO-1 project also includes interleaved vision-text-action pretraining and multimodal reasoning workflows. This page focuses on the LeRobot robot-control policy path.
-
-## Installation Requirements
-
-1. Install LeRobot by following the [Installation Guide](./installation).
-2. Install EO-1 dependencies by running:
-
-   ```bash
-   pip install -e ".[eo1]"
-   ```
-
-3. If you want to train or evaluate on LIBERO, install the LIBERO dependencies too:
-
-   ```bash
-   pip install -e ".[eo1,libero]"
-   ```
-
-EO-1 can use the standard PyTorch scaled-dot-product attention backend through `policy.attn_implementation=sdpa`. If your environment has a compatible `flash_attn` installation, you can request `policy.attn_implementation=flash_attention_2`.
-
-## Data Requirements
-
-EO-1 expects a LeRobot dataset with:
-
-- At least one visual observation, for example `observation.images.image`
-- `observation.state`
-- `action`
-- A language task instruction through the dataset `task` field
-
-If your dataset uses different observation names, use `rename_map` to align them with the names expected by your training or evaluation setup.
-
-## Usage
-
-To use EO-1 in a LeRobot configuration, specify the policy type as:
-
-```python
-policy.type=eo1
-```
-
-By default, a new EO-1 policy initializes its backbone from:
-
-```python
-policy.vlm_base=Qwen/Qwen2.5-VL-3B-Instruct
-```
-
-Once a LeRobot-format EO-1 checkpoint is available, load it with:
-
-```python
-policy.path=your-org/your-eo1-checkpoint
-```
-
-## Training
-
-### Training Command Example
-
-```bash
-lerobot-train \
-  --dataset.repo_id=your_org/your_dataset \
-  --policy.type=eo1 \
-  --policy.vlm_base=Qwen/Qwen2.5-VL-3B-Instruct \
-  --policy.dtype=bfloat16 \
-  --policy.attn_implementation=sdpa \
-  --policy.gradient_checkpointing=false \
-  --output_dir=./outputs/eo1_training \
-  --job_name=eo1_training \
-  --steps=300000 \
-  --batch_size=16 \
-  --policy.device=cuda
-```
-
-### Key Training Parameters
-
-| Parameter                              | Default                       | Description                                                             |
-| -------------------------------------- | ----------------------------- | ----------------------------------------------------------------------- |
-| `policy.vlm_base`                      | `Qwen/Qwen2.5-VL-3B-Instruct` | Qwen2.5-VL checkpoint used to initialize a new policy                   |
-| `policy.dtype`                         | `auto`                        | Backbone dtype request: `auto`, `bfloat16`, or `float32`                |
-| `policy.attn_implementation`           | `None`                        | Optional Qwen attention backend, such as `sdpa`                         |
-| `policy.gradient_checkpointing`        | `false`                       | Reduces memory usage during training                                    |
-| `policy.chunk_size`                    | `8`                           | Number of future actions predicted per chunk                            |
-| `policy.n_action_steps`                | `8`                           | Number of actions consumed from a sampled chunk                         |
-| `policy.num_denoise_steps`             | `10`                          | Number of flow-matching denoising steps used during sampling            |
-| `policy.max_state_dim`                 | `32`                          | State padding dimension                                                 |
-| `policy.max_action_dim`                | `32`                          | Action padding dimension                                                |
-| `policy.force_fp32_autocast`           | `true`                        | Keeps the flow head in fp32 even when the backbone uses mixed precision |
-| `policy.supervise_padding_action_dims` | `true`                        | Controls whether padded action dimensions are supervised                |
-| `policy.supervise_padding_actions`     | `true`                        | Controls whether padded future action rows are supervised               |
-
-## Evaluation
-
-EO-1 can be evaluated through `lerobot-eval` once you have a LeRobot-format checkpoint:
-
-```bash
-lerobot-eval \
-  --policy.path=your-org/your-eo1-checkpoint \
-  --env.type=libero \
-  --env.task=libero_object \
-  --eval.batch_size=1 \
-  --eval.n_episodes=20
-```
-
-For datasets or environments whose camera names differ from the checkpoint configuration, pass a `rename_map`:
-
-```bash
-lerobot-eval \
-  --policy.path=your-org/your-eo1-checkpoint \
-  --env.type=libero \
-  --env.task=libero_object \
-  --rename_map='{"observation.images.image2":"observation.images.wrist_image"}'
-```
-
-## Configuration Notes
-
-### Image Processing
-
-EO-1 uses the Qwen2.5-VL processor. The `policy.image_min_pixels` and `policy.image_max_pixels` settings control the image resizing bounds before the visual tokens are passed into the backbone.
-
-### State and Action Dimensions
-
-The policy pads state and action vectors to `policy.max_state_dim` and `policy.max_action_dim` before the EO-1 flow head. Predictions are cropped back to the original action dimension before being returned by the policy.
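The pad-then-crop round trip described above can be sketched in a few lines. This is a minimal illustration, not EO-1's actual implementation: the helper names `pad_to_dim` and `crop_to_dim` are hypothetical, and zero-padding on plain Python lists is an assumption made for clarity.

```python
def pad_to_dim(vector, max_dim, pad_value=0.0):
    """Right-pad a vector to max_dim entries (hypothetical helper, zero-padding assumed)."""
    if len(vector) > max_dim:
        raise ValueError(f"vector has {len(vector)} dims, more than max_dim={max_dim}")
    return list(vector) + [pad_value] * (max_dim - len(vector))


def crop_to_dim(vector, original_dim):
    """Drop the padded tail so the result matches the robot's real action dimension."""
    return list(vector[:original_dim])


# A 6-DoF action padded to an assumed max_action_dim of 32, then cropped back.
action = [0.1, -0.2, 0.3, 0.0, 0.5, 1.0]
padded = pad_to_dim(action, max_dim=32)
restored = crop_to_dim(padded, original_dim=len(action))
print(len(padded), restored == action)
```

Padding to a fixed width lets one model head serve robots with different state and action sizes; the crop ensures callers never see the padded dimensions.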
-
-### Attention Backend
-
-Use `policy.attn_implementation=sdpa` for a portable setup. Use `flash_attention_2` only when `flash_attn` is installed and compatible with your environment.
-
-## References
-
-- [EO-1 project](https://github.com/EO-Robotics/EO1)
-- [EO-1 paper](https://arxiv.org/abs/2508.21112)
-- [Qwen2.5-VL-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct)
-
-## Citation
-
-```bibtex
-@article{eo1,
-  title={EO-1: Interleaved Vision-Text-Action Pretraining for General Robot Control},
-  author={Delin Qu and Haoming Song and Qizhi Chen and Zhaoqing Chen and Xianqiang Gao and Xinyi Ye and Qi Lv and Modi Shi and Guanghui Ren and Cheng Ruan and Maoqing Yao and Haoran Yang and Jiacheng Bao and Bin Zhao and Dong Wang},
-  journal={arXiv preprint},
-  year={2025},
-  url={https://arxiv.org/abs/2508.21112}
-}
-```
-
-## License
-
-This LeRobot integration follows the **Apache 2.0 License** used by LeRobot. Check the upstream EO-1 model and dataset pages for the licenses of released EO-1 checkpoints and data.
@@ -1,98 +0,0 @@
-# Compute HW Guide for LeRobot Training
-
-Rough sizing for training a LeRobot policy: how much VRAM each policy needs, what training time looks like, and where to run when local hardware isn't enough.
-
-The numbers below are **indicative** — order-of-magnitude figures for picking hardware, not exact predictions. Throughput depends heavily on dataset I/O, image resolution, batch size, and number of GPUs.
-
-## Memory by policy group
-
-Policies cluster by backbone size; the groupings below give a single VRAM envelope per group instead of repeating numbers per policy. Memory scales roughly linearly with batch size; AdamW (the LeRobot default) carries optimizer state that adds ~30–100% over a forward+backward pass alone.
-
-| Group      | Policies                                    | Peak VRAM (BS 8, AdamW) | Suitable starter GPUs             |
-| ---------- | ------------------------------------------- | ----------------------: | --------------------------------- |
-| Light BC   | `act`, `vqbet`, `tdmpc`                     |                  ~2–6GB | Laptop GPU (RTX 3060), L4, A10G   |
-| Diffusion  | `diffusion`, `multi_task_dit`               |                 ~8–14GB | RTX 4070+ / L4 / A10G             |
-| Small VLA  | `smolvla`                                   |                ~10–16GB | RTX 4080+ / L4 / A10G             |
-| Large VLA  | `pi0`, `pi0_fast`, `pi05`, `xvla`, `wall_x` |                ~24–40GB | A100 40 GB+ (24 GB tight at BS 1) |
-| Multimodal | `groot`, `eo1`                              |                ~24–40GB | A100 40 GB+                       |
-| RL         | `sac`                                       |             config-dep. | See [HIL-SERL guide](./hilserl)   |
-
-Memory-bound? Drop the batch size (~linear), use gradient accumulation to recover effective batch, or for SmolVLA leave `freeze_vision_encoder=True`.
-
-## Training time
-
-Robotics imitation learning typically converges in **5–10 epochs over the dataset**, not hundreds of thousands of raw steps. Once you know your epoch count, wall-clock is essentially:
-
-```text
-total_frames    = sum of frames over all episodes      # 50 ep × 30 fps × 30 s ≈ 45,000
-steps_per_epoch = ceil(total_frames / (num_gpus × batch_size))
-total_steps     = epochs × steps_per_epoch
-wall_clock      ≈ total_steps × per_step_time
-```
-
-Per-step time depends on the policy and the GPU. The numbers in the table below are anchors — pick the row closest to your setup and scale linearly with `total_steps` if you train longer or shorter.
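The arithmetic above can be wrapped in a small helper for quick estimates. This is a rough sketch: the function name `estimate_wall_clock` is hypothetical, and the 0.1 s per-step time is an assumed placeholder rather than a measured figure for any policy or GPU.

```python
import math


def estimate_wall_clock(total_frames, batch_size, num_gpus, epochs, per_step_time_s):
    """Return (total_steps, hours) using the steps-per-epoch formula above."""
    steps_per_epoch = math.ceil(total_frames / (num_gpus * batch_size))
    total_steps = epochs * steps_per_epoch
    hours = total_steps * per_step_time_s / 3600.0
    return total_steps, hours


# 50 episodes x 30 fps x 30 s = 45,000 frames, as in the example above.
total_frames = 50 * 30 * 30
# per_step_time_s=0.1 is an assumption; read the real value from your training logs.
steps, hours = estimate_wall_clock(
    total_frames, batch_size=8, num_gpus=1, epochs=5, per_step_time_s=0.1
)
print(f"{steps} steps, about {hours:.2f} h")
```

Replace `per_step_time_s` with the step time your own run reports after a few hundred steps; the estimate is only as good as that number.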
-
-### Common scenarios
-
-Indicative wall-clock for **5 epochs on a ~50-episode dataset (~45k frames at 30 fps × 30 s)**, default optimizer (AdamW), 640×480 images:
-
-| Setup                                | Policy         | Batch | Wall-clock |
-| ------------------------------------ | -------------- | ----- | ---------: |
-| Single RTX 4090 / RTX 3090 (24 GB)   | `act`          | 8     |  ~30–60min |
-| Single RTX 4090 / RTX 3090 (24 GB)   | `diffusion`    | 8     |      ~2–4h |
-| Single L4 / A10G (24 GB)             | `act`          | 8     |      ~1–2h |
-| Single L4 / A10G (24 GB)             | `smolvla`      | 4     |      ~3–6h |
-| Single A100 40 GB                    | `smolvla`      | 16    |      ~1–2h |
-| Single A100 40 GB                    | `pi0` / `pi05` | 4     |      ~4–8h |
-| 4× H100 80 GB cluster (`accelerate`) | `diffusion`    | 32    |  ~30–60min |
-| 4× H100 80 GB cluster (`accelerate`) | `smolvla`      | 32    |      ~1–2h |
-| Apple Silicon M1/M2/M3 Max (MPS)     | `act`          | 4     |     ~6–14h |
-
-These are order-of-magnitude figures. Real runs deviate by ±50% depending on image resolution, dataset I/O, dataloader threading, and exact GPU SKU. They are useful as "is this run going to take an hour or a day?" intuition, not as SLAs.
-
-### Multi-GPU matters a lot
-
-`accelerate launch --num_processes=N` is the easiest way to cut training time. Each optimizer step processes `N × batch_size` samples in roughly the same wall-clock as a single-GPU step, so 4 GPUs ≈ 4× speedup for compute-bound runs. See the [Multi GPU training](./multi_gpu_training) guide for the full setup.
-
-Reference data points on a 4×H100 80 GB cluster (`accelerate launch --num_processes=4`), 5000 steps, batch 32, AdamW, dataset [`imstevenpmwork/super_poulain_draft`](https://huggingface.co/datasets/imstevenpmwork/super_poulain_draft) (~50 episodes, ~640×480 images):
-
-| Policy      | Wall-clock | `update_s` | `dataloading_s` | GPU util | Notable flags                                                                                                                  |
-| ----------- | ---------- | ---------: | --------------: | -------- | ------------------------------------------------------------------------------------------------------------------------------ |
-| `diffusion` | 16m 17s    |      0.167 |           0.015 | ~90%     | defaults (training from scratch)                                                                                               |
-| `smolvla`   | 27m 49s    |      0.312 |           0.011 | ~80%     | `--policy.path=lerobot/smolvla_base`, `freeze_vision_encoder=false`, `train_expert_only=false`                                 |
-| `pi05`      | 3h 41m     |      2.548 |           0.014 | ~95%     | `--policy.pretrained_path=lerobot/pi05_base`, `gradient_checkpointing=true`, `dtype=bfloat16`, vision encoder + expert trained |
-
-The `dataloading_s` vs. `update_s` ratio is the diagnostic that matters: when `dataloading_s` approaches `update_s`, more GPUs stop helping — your dataloader is the bottleneck and you should look at `--num_workers`, image resolution, and disk speed before adding compute.
-
-### Schedule and checkpoints
-
-If you shorten training (e.g. 5k–10k steps on a small dataset), also shorten the LR schedule with `--policy.scheduler_decay_steps≈--steps`. Otherwise the LR stays near its peak and never decays. Same for `--save_freq`.
-
-## Where to run
-
-VRAM is the first filter. Within a tier, pick by budget and availability — the `$`–`$$$$` columns are relative; check current pricing on the provider you actually use.
-
-| Class                      | VRAM  | Tier   | Comfortable for                                             |
-| -------------------------- | ----- | ------ | ----------------------------------------------------------- |
-| RTX 3090 / 4090 (consumer) | 24 GB | `$`    | Light BC, Diffusion, SmolVLA. Tight for VLAs at batch 1.    |
-| L4 / A10G (cloud)          | 24 GB | `$–$$` | Same envelope; common on Google Cloud, RunPod, AWS `g5/g6`. |
-| A100 40 GB                 | 40 GB | `$$$`  | Any policy at reasonable batch sizes.                       |
-| A100 80 GB / H100 80 GB    | 80 GB | `$$$$` | Multi-GPU clusters; large batches for VLAs.                 |
-| **CPU only**               | —     | —      | Don't train. Use Colab or rent a GPU.                       |
-
-### Hugging Face Jobs
-
-[Hugging Face Jobs](https://huggingface.co/docs/hub/jobs) lets you run training on managed HF infrastructure, billed by the second. The repo publishes a ready-to-use image: **`huggingface/lerobot-gpu:latest`**, rebuilt **every night at 02:00 UTC from `main`** ([`docker_publish.yml`](https://github.com/huggingface/lerobot/blob/main/.github/workflows/docker_publish.yml)) — so it tracks the current state of the repo, not a tagged release.
-
-```bash
-hf jobs run --flavor a10g-large huggingface/lerobot-gpu:latest \
-  bash -c "nvidia-smi && lerobot-train \
-    --policy.type=act --dataset.repo_id=<USER>/<DATASET> \
-    --policy.repo_id=<USER>/act_<task> --batch_size=8 --steps=50000"
-```
-
-Notes:
-
-- The leading `nvidia-smi` is a quick sanity check that CUDA is visible inside the container — useful to fail fast if the flavor or driver mismatched.
-- The default Job timeout is 30 minutes; pass `--timeout 4h` (or longer) for real training.
-- `--flavor` maps onto the table above: `t4-small`/`t4-medium` (T4, ACT only), `l4x1`/`l4x4` (L4 24 GB), `a10g-small/large/largex2/largex4` (A10G 24 GB scaled out), `a100-large` (A100). For the current full catalogue + pricing see [https://huggingface.co/docs/hub/jobs](https://huggingface.co/docs/hub/jobs).
@@ -50,30 +50,30 @@ This process can be repeated iteratively: deploy, collect, fine-tune, repeat. Ea

 ### Teleoperator Requirements

-The `lerobot-rollout --strategy.type=dagger` mode requires **teleoperators with active motors** that can:
+The `examples/hil` HIL scripts require **teleoperators with active motors** that can:

 - Enable/disable torque programmatically
 - Move to target positions (to mirror the robot state when pausing)

-**Compatible teleoperators:**
+**Compatible teleoperators in the current `examples/hil` scripts:**

 - `openarm_mini` - OpenArm Mini
 - `so_leader` - SO100 / SO101 leader arm

 > [!IMPORTANT]
-> The provided commands default to `bi_openarm_follower` + `openarm_mini`.
+> The provided `examples/hil` commands default to `bi_openarm_follower` + `openarm_mini`.
 > `so_follower` + `so_leader` configs are also registered and can be used via CLI flags.

 ---

 ## Script

-Use `lerobot-rollout` with `--strategy.type=dagger` for HIL data collection. Select the inference backend with `--inference.type=sync|rtc`:
+A single script handles both synchronous and RTC-based inference. Toggle RTC with `--rtc.enabled=true`:

-| Mode                     | Flag                   | Models                |
-| ------------------------ | ---------------------- | --------------------- |
-| Standard (default)       | _(no flag needed)_     | ACT, Diffusion Policy |
-| Real-Time Chunking (RTC) | `--inference.type=rtc` | Pi0, Pi0.5, SmolVLA   |
+| Mode                     | Flag                 | Models                |
+| ------------------------ | -------------------- | --------------------- |
+| Standard (default)       | _(no flag needed)_   | ACT, Diffusion Policy |
+| Real-Time Chunking (RTC) | `--rtc.enabled=true` | Pi0, Pi0.5, SmolVLA   |

 ---

@@ -97,7 +97,7 @@ python src/lerobot/scripts/lerobot_train.py \
 **Standard inference (ACT, Diffusion Policy):**

 ```bash
-lerobot-rollout --strategy.type=dagger \
+python examples/hil/hil_data_collection.py \
    --robot.type=bi_openarm_follower \
    --robot.left_arm_config.port=can1 \
    --robot.left_arm_config.side=left \
@@ -108,10 +108,11 @@ lerobot-rollout --strategy.type=dagger \
    --teleop.port_left=/dev/ttyACM0 \
    --teleop.port_right=/dev/ttyACM1 \
    --policy.path=outputs/pretrain/checkpoints/last/pretrained_model \
-    --dataset.repo_id=your-username/rollout_hil_dataset \
+    --dataset.repo_id=your-username/hil-dataset \
    --dataset.single_task="Fold the T-shirt properly" \
    --dataset.fps=30 \
-    --strategy.num_episodes=50 \
+    --dataset.episode_time_s=1000 \
+    --dataset.num_episodes=50 \
    --interpolation_multiplier=2
 ```

@@ -120,11 +121,11 @@ lerobot-rollout --strategy.type=dagger \
 For models with high inference latency, enable RTC for smooth execution:

 ```bash
-lerobot-rollout --strategy.type=dagger \
-    --inference.type=rtc \
-    --inference.rtc.execution_horizon=20 \
-    --inference.rtc.max_guidance_weight=5.0 \
-    --inference.rtc.prefix_attention_schedule=LINEAR \
+python examples/hil/hil_data_collection.py \
+    --rtc.enabled=true \
+    --rtc.execution_horizon=20 \
+    --rtc.max_guidance_weight=5.0 \
+    --rtc.prefix_attention_schedule=LINEAR \
    --robot.type=bi_openarm_follower \
    --robot.left_arm_config.port=can1 \
    --robot.left_arm_config.side=left \
@@ -135,10 +136,11 @@ lerobot-rollout --strategy.type=dagger \
    --teleop.port_left=/dev/ttyACM0 \
    --teleop.port_right=/dev/ttyACM1 \
    --policy.path=outputs/pretrain/checkpoints/last/pretrained_model \
-    --dataset.repo_id=your-username/rollout_hil_rtc_dataset \
+    --dataset.repo_id=your-username/hil-rtc-dataset \
    --dataset.single_task="Fold the T-shirt properly" \
    --dataset.fps=30 \
-    --strategy.num_episodes=50 \
+    --dataset.episode_time_s=1000 \
+    --dataset.num_episodes=50 \
    --interpolation_multiplier=3
 ```

@@ -233,7 +235,7 @@ This HIL data collection approach builds on ideas from interactive imitation lea

 - **HG-DAgger** (Kelly et al., 2019) made this practical for robotics: a human expert monitors the robot and only intervenes when needed, rather than labeling every state. The gating between autonomous and human control is exactly the pause → takeover → return-to-policy loop used in the scripts here.

-- **RaC** (Hu et al., 2025) scales this loop to long-horizon tasks by explicitly decomposing interventions into **recovery** (teleoperating back to a good state) and **correction** (demonstrating the right behavior from there). This decomposition is the protocol followed by the DAgger strategy in `lerobot-rollout`.
+- **RaC** (Hu et al., 2025) scales this loop to long-horizon tasks by explicitly decomposing interventions into **recovery** (teleoperating back to a good state) and **correction** (demonstrating the right behavior from there). This decomposition is the protocol followed by the HIL scripts in `examples/hil`.

 - **π0.6/RECAP** (Physical Intelligence, 2025) applies the same iterative collect-and-finetune loop at scale with VLA models, showing that even large pretrained policies benefit substantially from targeted human corrections on their own failure modes. π0.6 is trained using RECAP.

@@ -685,10 +685,6 @@ Example configuration for training the [reward classifier](https://huggingface.c

 ```json
 {
-  "dataset": {
-    "repo_id": "hf_username/dataset_name",
-    "root": null
-  },
  "policy": {
    "type": "reward_classifier",
    "model_name": "helper2424/resnet10",
@@ -709,28 +705,8 @@ Example configuration for training the [reward classifier](https://huggingface.c
        "type": "VISUAL",
        "shape": [3, 128, 128]
      }
-    },
-    "push_to_hub": true,
-    "repo_id": "hf_username/model_repo"
-  },
-  "batch_size": 16,
-  "num_workers": 4,
-  "steps": 5000,
-  "log_freq": 10,
-  "eval_freq": 1000,
-  "save_freq": 1000,
-  "save_checkpoint": true,
-  "seed": 2,
-  "resume": false,
-  "optimizer": {
-    "grad_clip_norm": 10.0
-  },
-  "wandb": {
-    "enable": true,
-    "project": "reward-classifier",
-    "disable_artifact": false
-  },
-  "job_name": "reward-classifier"
+    }
+  }
 }
 ```

@@ -32,12 +32,6 @@ Once you’ve gathered enough trajectories, you’ll train a neural network to i

 If you run into any issues at any point, jump into our [Discord community](https://discord.com/invite/s3KuuzsPFb) for support.

-<Tip>
-
-Want to quickly get the right commands for your setup? The [quickstart notebook](https://github.com/huggingface/lerobot/blob/main/examples/notebooks/quickstart.ipynb) [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/lerobot/blob/main/examples/notebooks/quickstart.ipynb) lets you configure your robot once and generates all the commands below ready to paste.
-
-</Tip>
-
 ## Set up and Calibrate

 If you haven't yet set up and calibrated your robot and teleop device, please do so by following the robot-specific tutorial.
@@ -509,42 +503,121 @@ hf upload ${HF_USER}/act_so101_test${CKPT} \

 ## Run inference and evaluate your policy

-Use `lerobot-rollout` to deploy a trained policy on your robot. You can choose different strategies depending on your needs:
+Use [`lerobot-record`](https://github.com/huggingface/lerobot/blob/main/src/lerobot/scripts/lerobot_record.py) with a policy checkpoint as input to run inference and evaluate your policy. For instance, run this command or API example to run inference and record evaluation episodes:

 <hfoptions id="eval">
-<hfoption id="Base mode (no recording)">
+<hfoption id="Command">
 ```bash
-lerobot-rollout \
-  --strategy.type=base \
-  --policy.path=${HF_USER}/my_policy \
-  --robot.type=so100_follower \
-  --robot.port=/dev/ttyACM1 \
-  --robot.cameras="{ up: {type: opencv, index_or_path: /dev/video10, width: 640, height: 480, fps: 30}, side: {type: intelrealsense, serial_number_or_name: 233522074606, width: 640, height: 480, fps: 30}}" \
-  --task="Put lego brick into the transparent box" \
-  --duration=60
-```
-</hfoption>
-<hfoption id="Sentry mode (with recording)">
-```bash
-lerobot-rollout \
-  --strategy.type=sentry \
-  --strategy.upload_every_n_episodes=5 \
-  --policy.path=${HF_USER}/my_policy \
+lerobot-record  \
  --robot.type=so100_follower \
  --robot.port=/dev/ttyACM1 \
  --robot.cameras="{ up: {type: opencv, index_or_path: /dev/video10, width: 640, height: 480, fps: 30}, side: {type: intelrealsense, serial_number_or_name: 233522074606, width: 640, height: 480, fps: 30}}" \
+  --robot.id=my_awesome_follower_arm \
+  --display_data=false \
  --dataset.repo_id=${HF_USER}/eval_so100 \
  --dataset.single_task="Put lego brick into the transparent box" \
-  --duration=600
+  --dataset.streaming_encoding=true \
+  --dataset.encoder_threads=2 \
+  --policy.path=${HF_USER}/my_policy
+  # Optional flags (add them before the last line, each followed by a backslash):
+  # --dataset.vcodec=auto
+  # Teleop is optional, for teleoperating in between episodes:
+  # --teleop.type=so100_leader
+  # --teleop.port=/dev/ttyACM0
+  # --teleop.id=my_awesome_leader_arm
 ```
+</hfoption>
+<hfoption id="API example">
+
+<!-- prettier-ignore-start -->
+```python
+from lerobot.cameras.opencv import OpenCVCameraConfig
+from lerobot.datasets import LeRobotDataset
+from lerobot.utils.feature_utils import hw_to_dataset_features
+from lerobot.policies.act import ACTPolicy
+from lerobot.policies import make_pre_post_processors
+from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
+from lerobot.scripts.lerobot_record import record_loop
+from lerobot.common.control_utils import init_keyboard_listener
+from lerobot.utils.utils import log_say
+from lerobot.utils.visualization_utils import init_rerun
+
+
+NUM_EPISODES = 5
+FPS = 30
+EPISODE_TIME_SEC = 60
+TASK_DESCRIPTION = "My task description"
+HF_MODEL_ID = "<hf_username>/<model_repo_id>"
+HF_DATASET_ID = "<hf_username>/<eval_dataset_repo_id>"
+
+# Create the robot configuration
+camera_config = {"front": OpenCVCameraConfig(index_or_path=0, width=640, height=480, fps=FPS)}
+robot_config = SO100FollowerConfig(
+    port="/dev/tty.usbmodem58760434471", id="my_awesome_follower_arm", cameras=camera_config
+)
+
+# Initialize the robot
+robot = SO100Follower(robot_config)
+
+# Initialize the policy
+policy = ACTPolicy.from_pretrained(HF_MODEL_ID)
+
+# Configure the dataset features
+action_features = hw_to_dataset_features(robot.action_features, "action")
+obs_features = hw_to_dataset_features(robot.observation_features, "observation")
+dataset_features = {**action_features, **obs_features}
+
+# Create the dataset
+dataset = LeRobotDataset.create(
+    repo_id=HF_DATASET_ID,
+    fps=FPS,
+    features=dataset_features,
+    robot_type=robot.name,
+    use_videos=True,
+    image_writer_threads=4,
+)
+
+# Initialize the keyboard listener and rerun visualization
+_, events = init_keyboard_listener()
+init_rerun(session_name="recording")
+
+# Connect the robot
+robot.connect()
+
+preprocessor, postprocessor = make_pre_post_processors(
+    policy_cfg=policy,
+    pretrained_path=HF_MODEL_ID,
+    dataset_stats=dataset.meta.stats,
+)
+
+for episode_idx in range(NUM_EPISODES):
+    log_say(f"Running inference, recording eval episode {episode_idx + 1} of {NUM_EPISODES}")
+
+    # Run the policy inference loop
+    record_loop(
+        robot=robot,
+        events=events,
+        fps=FPS,
+        policy=policy,
+        preprocessor=preprocessor,
+        postprocessor=postprocessor,
+        dataset=dataset,
+        control_time_s=EPISODE_TIME_SEC,
+        single_task=TASK_DESCRIPTION,
+        display_data=True,
+    )
+
+    dataset.save_episode()
+
+# Clean up
+robot.disconnect()
+dataset.push_to_hub()
+```
+<!-- prettier-ignore-end -->
+
 </hfoption>
 </hfoptions>

-The `--strategy.type` flag selects the execution mode:
+As you can see, it's almost the same command as previously used to record your training dataset. Two things changed:

-- `base`: Autonomous rollout with no data recording (useful for quick evaluation)
-- `sentry`: Continuous recording with auto-upload (useful for large-scale evaluation)
-- `highlight`: Ring buffer recording with keystroke save (useful for capturing interesting events)
-- `dagger`: Human-in-the-loop data collection (see [HIL Data Collection](./hil_data_collection))
-
-All strategies support `--inference.type=rtc` for smooth execution with slow VLA models (Pi0, Pi0.5, SmolVLA).
+1. There is an additional `--policy.path` argument, which indicates the path to your policy checkpoint (e.g. `outputs/train/eval_act_so101_test/checkpoints/last/pretrained_model`). You can also use the model repository if you uploaded a model checkpoint to the hub (e.g. `${HF_USER}/act_so101_test`).
+2. The dataset name begins with `eval` to reflect that you are running inference (e.g. `${HF_USER}/eval_act_so101_test`).
@@ -1,261 +0,0 @@
-# Policy Deployment (lerobot-rollout)
-
-`lerobot-rollout` is the single CLI for deploying trained policies on real robots. It supports multiple execution strategies and inference backends, from quick evaluation to continuous recording and human-in-the-loop data collection.
-
-## Quick Start
-
-No extra dependencies are needed beyond your robot and policy extras.
-
-```bash
-lerobot-rollout \
-    --strategy.type=base \
-    --policy.path=lerobot/act_koch_real \
-    --robot.type=koch_follower \
-    --robot.port=/dev/ttyACM0 \
-    --task="pick up cube" \
-    --duration=30
-```
-
-This runs the policy for 30 seconds with no recording.
-
---
-
-## Strategies
-
-Select a strategy with `--strategy.type=<name>`. Each strategy defines a different control loop with its own recording and interaction semantics.
-
-### Base (`--strategy.type=base`)
-
-Autonomous policy execution with no data recording. Use this for quick evaluation, demos, or when you only need to observe the robot.
-
-```bash
-lerobot-rollout \
-    --strategy.type=base \
-    --policy.path=${HF_USER}/my_policy \
-    --robot.type=so100_follower \
-    --robot.port=/dev/ttyACM0 \
-    --robot.cameras="{ front: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
-    --task="Put lego brick into the box" \
-    --duration=60
-```
-
-| Flag             | Description                                            |
-| ---------------- | ------------------------------------------------------ |
-| `--duration`     | Run time in seconds (0 = infinite)                     |
-| `--task`         | Task description passed to the policy                  |
-| `--display_data` | Stream observations/actions to Rerun for visualization |
-
-### Sentry (`--strategy.type=sentry`)
-
-Continuous autonomous recording with periodic upload to the Hugging Face Hub. Episode boundaries are auto-computed from camera resolution and FPS so each saved episode produces a complete video file, keeping uploads efficient.
-
-Policy state (hidden state, RTC queue) persists across episode boundaries: the robot does not reset between episodes.
-
-```bash
-lerobot-rollout \
-    --strategy.type=sentry \
-    --strategy.upload_every_n_episodes=5 \
-    --policy.path=${HF_USER}/my_policy \
-    --robot.type=so100_follower \
-    --robot.port=/dev/ttyACM0 \
-    --robot.cameras="{ front: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
-    --dataset.repo_id=${HF_USER}/rollout_eval_data \
-    --dataset.single_task="Put lego brick into the box" \
-    --duration=3600
-```
-
-| Flag                                   | Description                                                 |
-| -------------------------------------- | ----------------------------------------------------------- |
-| `--strategy.upload_every_n_episodes`   | Push to Hub every N episodes (default: 5)                   |
-| `--strategy.target_video_file_size_mb` | Target video file size for episode rotation (default: auto) |
-| `--dataset.repo_id`                    | **Required.** Hub repository for the recorded dataset       |
-| `--dataset.push_to_hub`                | Whether to push to Hub on teardown (default: true)          |
-
-### Highlight (`--strategy.type=highlight`)
-
-Autonomous rollout with on-demand recording via a memory-bounded ring buffer. The robot runs continuously while the buffer captures the last N seconds of telemetry. Press the save key to flush the buffer and start live recording; press it again to save the episode.
-
-```bash
-lerobot-rollout \
-    --strategy.type=highlight \
-    --strategy.ring_buffer_seconds=30 \
-    --strategy.save_key=s \
-    --strategy.push_key=h \
-    --policy.path=${HF_USER}/my_policy \
-    --robot.type=koch_follower \
-    --robot.port=/dev/ttyACM0 \
-    --dataset.repo_id=${HF_USER}/rollout_highlight_data \
-    --dataset.single_task="Pick up the red cube"
-```
-
-**Keyboard controls:**
-
-| Key                | Action                                                   |
-| ------------------ | -------------------------------------------------------- |
-| `s` (configurable) | Start recording (flushes buffer) / stop and save episode |
-| `h` (configurable) | Push dataset to Hub                                      |
-| `ESC`              | Stop the session                                         |
-
-| Flag                                   | Description                                    |
-| -------------------------------------- | ---------------------------------------------- |
-| `--strategy.ring_buffer_seconds`       | Duration of buffered telemetry (default: 30)   |
-| `--strategy.ring_buffer_max_memory_mb` | Memory cap for the ring buffer (default: 2048) |
-| `--strategy.save_key`                  | Key to toggle recording (default: `s`)         |
-| `--strategy.push_key`                  | Key to push to Hub (default: `h`)              |
-
-### DAgger (`--strategy.type=dagger`)
-
-Human-in-the-loop data collection. Alternates between autonomous policy execution and human intervention via a teleoperator. Intervention frames are tagged with `intervention=True`. Requires a teleoperator (`--teleop.type`).
-
-See the [Human-In-the-Loop Data Collection](./hil_data_collection) guide for a detailed walkthrough.
-
-**Corrections-only mode** (default): Only human correction windows are recorded. Each correction becomes one episode.
-
-```bash
-lerobot-rollout \
-    --strategy.type=dagger \
-    --strategy.num_episodes=20 \
-    --policy.path=outputs/pretrain/checkpoints/last/pretrained_model \
-    --robot.type=bi_openarm_follower \
-    --teleop.type=openarm_mini \
-    --dataset.repo_id=${HF_USER}/rollout_hil_data \
-    --dataset.single_task="Fold the T-shirt"
-```
-
-**Continuous recording mode** (`--strategy.record_autonomous=true`): Both autonomous and correction frames are recorded with time-based episode rotation (same as Sentry).
-
-```bash
-lerobot-rollout \
-    --strategy.type=dagger \
-    --strategy.record_autonomous=true \
-    --strategy.num_episodes=50 \
-    --policy.path=${HF_USER}/my_policy \
-    --robot.type=so100_follower \
-    --robot.port=/dev/ttyACM0 \
-    --teleop.type=so101_leader \
-    --teleop.port=/dev/ttyACM1 \
-    --dataset.repo_id=${HF_USER}/rollout_dagger_data \
-    --dataset.single_task="Grasp the block"
-```
-
-**Keyboard controls** (default input device):
-
-| Key     | Action                                      |
-| ------- | ------------------------------------------- |
-| `Space` | Pause / resume policy execution             |
-| `Tab`   | Start / stop human correction               |
-| `Enter` | Push dataset to Hub (corrections-only mode) |
-| `ESC`   | Stop the session                            |
-
-Foot pedal input is also supported via `--strategy.input_device=pedal`. Configure pedal codes with `--strategy.pedal.*` flags.
-
-| Flag                                 | Description                                             |
-| ------------------------------------ | ------------------------------------------------------- |
-| `--strategy.num_episodes`            | Number of correction episodes to record (default: 10)   |
-| `--strategy.record_autonomous`       | Record autonomous frames too (default: false)           |
-| `--strategy.upload_every_n_episodes` | Push to Hub every N episodes (default: 5)               |
-| `--strategy.input_device`            | Input device: `keyboard` or `pedal` (default: keyboard) |
-| `--teleop.type`                      | **Required.** Teleoperator type                         |
-
---
-
-## Inference Backends
-
-Select a backend with `--inference.type=<name>`. All strategies work with both backends.
-
-### Sync (default)
-
-One policy call per control tick. The main loop blocks until the action is computed.
-
-Works with all policies. No extra flags needed.
-
-### Real-Time Chunking (`--inference.type=rtc`)
-
-A background thread produces action chunks asynchronously. The main control loop polls for the next ready action while the policy computes the next chunk in parallel.
-
-Use RTC with large, slow VLA models (Pi0, Pi0.5, SmolVLA) for smooth, continuous motion despite high inference latency.
-
-```bash
-lerobot-rollout \
-    --strategy.type=base \
-    --inference.type=rtc \
-    --inference.rtc.execution_horizon=10 \
-    --inference.rtc.max_guidance_weight=10.0 \
-    --policy.path=${HF_USER}/pi0_policy \
-    --robot.type=so100_follower \
-    --robot.port=/dev/ttyACM0 \
-    --robot.cameras="{ front: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
-    --task="Pick up the cube" \
-    --duration=60 \
-    --device=cuda
-```
-
-| Flag                                        | Description                                                    |
-| ------------------------------------------- | -------------------------------------------------------------- |
-| `--inference.rtc.execution_horizon`         | Steps to blend with previous chunk (default: varies by policy) |
-| `--inference.rtc.max_guidance_weight`       | Consistency enforcement strength (default: varies by policy)   |
-| `--inference.rtc.prefix_attention_schedule` | Blend schedule: `LINEAR`, `EXP`, `ONES`, `ZEROS`               |
-| `--inference.queue_threshold`               | Max queue size before backpressure (default: 30)               |
-
-See the [Real-Time Chunking](./rtc) guide for details on tuning RTC parameters.
-
---
-
-## Common Flags
-
-| Flag                              | Description                                                       | Default |
-| --------------------------------- | ----------------------------------------------------------------- | ------- |
-| `--policy.path`                   | **Required.** HF Hub model ID or local checkpoint path            | --      |
-| `--robot.type`                    | **Required.** Robot type (e.g. `so100_follower`, `koch_follower`) | --      |
-| `--robot.port`                    | Serial port for the robot                                         | --      |
-| `--robot.cameras`                 | Camera configuration (JSON dict)                                  | --      |
-| `--fps`                           | Control loop frequency                                            | 30      |
-| `--duration`                      | Run time in seconds (0 = infinite)                                | 0       |
-| `--device`                        | Torch device (`cpu`, `cuda`, `mps`)                               | auto    |
-| `--task`                          | Task description (used when no dataset is provided)               | --      |
-| `--display_data`                  | Stream telemetry to Rerun visualization                           | false   |
-| `--display_ip` / `--display_port` | Remote Rerun server address                                       | --      |
-| `--interpolation_multiplier`      | Action interpolation factor                                       | 1       |
-| `--use_torch_compile`             | Enable `torch.compile` for inference                              | false   |
-| `--resume`                        | Resume a previous recording session                               | false   |
-| `--play_sounds`                   | Vocal synthesis for events                                        | true    |
-
---
-
-## Programmatic Usage
-
-For custom deployments (e.g. with kinematics processors), use the rollout module API directly:
-
-```python
-from lerobot.rollout import BaseStrategyConfig, RolloutConfig, build_rollout_context
-from lerobot.rollout.inference import SyncInferenceConfig
-from lerobot.rollout.strategies import BaseStrategy
-from lerobot.utils.process import ProcessSignalHandler
-
-cfg = RolloutConfig(
-    robot=my_robot_config,
-    policy=my_policy_config,
-    strategy=BaseStrategyConfig(),
-    inference=SyncInferenceConfig(),
-    fps=30,
-    duration=60,
-    task="my task",
-)
-
-signal_handler = ProcessSignalHandler(use_threads=True)
-ctx = build_rollout_context(
-    cfg,
-    signal_handler.shutdown_event,
-    robot_action_processor=my_custom_action_processor,       # optional
-    robot_observation_processor=my_custom_obs_processor,     # optional
-)
-
-strategy = BaseStrategy(cfg.strategy)
-try:
-    strategy.setup(ctx)
-    strategy.run(ctx)
-finally:
-    strategy.teardown(ctx)
-```
-
-See `examples/so100_to_so100_EE/rollout.py` and `examples/phone_to_so100/rollout.py` for full examples with kinematics processors.
@@ -207,56 +207,6 @@ pip install 'lerobot[feetech]'        # Feetech motor support

 _Multiple extras can be combined (e.g., `.[core_scripts,pi,pusht]`). For a full list of available extras, refer to `pyproject.toml`._

-### PyTorch CUDA variant (Linux only)
-
-On Linux, the install path determines which CUDA wheel you get. macOS and Windows installs use the PyPI default (MPS / CPU / CUDA-Windows wheel respectively) and can skip this section.
-
-<!-- prettier-ignore-start -->
-
-<hfoptions id="cuda_variant">
-<hfoption id="uv-source">
-
-**Source install via `uv` (`uv sync` or `uv pip install -e .`)**
-
-`torch` and `torchvision` are pinned by the project to the **CUDA 12.8** PyTorch index (`https://download.pytorch.org/whl/cu128`, driver floor **570.86**) — covers Ampere/Ada/Hopper/Blackwell GPUs. No action needed for typical NVIDIA setups.
-
-To override for a different CUDA variant:
-
-```bash
-uv pip install --force-reinstall torch torchvision \
-    --index-url https://download.pytorch.org/whl/cu126   # older drivers; or cu130 for Blackwell on driver ≥ 580
-```
-
-</hfoption>
-<hfoption id="pip-conda">
-
-**Source install via `pip`/`conda`, or `pip install lerobot` from PyPI**
-
-The default PyPI `torch` wheel for Linux is currently a cu130-bundled wheel with a driver floor of **580.65**.
-
-To pick a specific CUDA variant:
-
-**Using `pip` or `conda`** — install torch first with an explicit index, then lerobot:
-
-```bash
-pip install --index-url https://download.pytorch.org/whl/cu128 torch torchvision
-pip install -e ".[all]"          # source
-# — or —
-pip install lerobot              # from PyPI
-```
-
-**Using `uv` to install from PyPI** — one-liner via `--torch-backend` (uv ≥ 0.6):
-
-```bash
-uv pip install --torch-backend cu128 lerobot
-```
-
-Supported values include `auto`, `cpu`, `cu126`, `cu128`, `cu129`, `cu130`, plus various `rocm*` and `xpu`. Swap as needed for your driver.
-
-</hfoption>
-</hfoptions>
-<!-- prettier-ignore-end -->
-
 ### Troubleshooting

 If you encounter build errors, you may need to install additional system dependencies: `cmake`, `build-essential`, and `ffmpeg libs`.
@@ -1,188 +0,0 @@
-# LIBERO-plus
-
-LIBERO-plus is a **robustness benchmark** for Vision-Language-Action (VLA) models built on top of [LIBERO](./libero). It systematically stress-tests policies by applying **seven independent perturbation dimensions** to the original LIBERO task set, exposing failure modes that standard benchmarks miss.
-
- Paper: [In-depth Robustness Analysis of Vision-Language-Action Models](https://arxiv.org/abs/2510.13626)
- GitHub: [sylvestf/LIBERO-plus](https://github.com/sylvestf/LIBERO-plus)
- Dataset: [lerobot/libero_plus](https://huggingface.co/datasets/lerobot/libero_plus)
-
-![An overview of the LIBERO-plus benchmark perturbation dimensions](https://github.com/sylvestf/LIBERO-plus/raw/main/static/images/libero-plus.jpg)
-
-## Perturbation dimensions
-
-LIBERO-plus creates ~10 000 task variants by perturbing each original LIBERO task along these axes:
-
-| Dimension             | What changes                                          |
-| --------------------- | ----------------------------------------------------- |
-| Objects layout        | Target position, presence of confounding objects      |
-| Camera viewpoints     | Camera position, orientation, field-of-view           |
-| Robot initial states  | Manipulator start pose                                |
-| Language instructions | LLM-rewritten task description (paraphrase / synonym) |
-| Light conditions      | Intensity, direction, color, shadow                   |
-| Background textures   | Scene surface and object appearance                   |
-| Sensor noise          | Photometric distortions and image degradation         |
-
-## Available task suites
-
-LIBERO-plus covers the same five suites as LIBERO:
-
-| Suite          | CLI name         | Tasks | Max steps | Description                                        |
-| -------------- | ---------------- | ----- | --------- | -------------------------------------------------- |
-| LIBERO-Spatial | `libero_spatial` | 10    | 280       | Tasks requiring reasoning about spatial relations  |
-| LIBERO-Object  | `libero_object`  | 10    | 280       | Tasks centered on manipulating different objects   |
-| LIBERO-Goal    | `libero_goal`    | 10    | 300       | Goal-conditioned tasks with changing targets       |
-| LIBERO-90      | `libero_90`      | 90    | 400       | Short-horizon tasks from the LIBERO-100 collection |
-| LIBERO-Long    | `libero_10`      | 10    | 520       | Long-horizon tasks from the LIBERO-100 collection  |
-
-<Tip warning={true}>
-  Installing LIBERO-plus **replaces** vanilla LIBERO — it uninstalls `hf-libero`
-  so that `import libero` resolves to the LIBERO-plus fork. You cannot have both
-  installed at the same time. To switch back to vanilla LIBERO, uninstall the
-  fork and reinstall with `pip install -e ".[libero]"`.
-</Tip>
-
-## Installation
-
-### System dependencies (Linux only)
-
-```bash
-sudo apt install libexpat1 libfontconfig1-dev libmagickwand-dev
-```
-
-### Python package
-
-```bash
-pip install -e ".[libero]" "robosuite==1.4.1" bddl easydict mujoco wand scikit-image gym
-git clone https://github.com/sylvestf/LIBERO-plus.git
-cd LIBERO-plus && pip install --no-deps -e .
-pip uninstall -y hf-libero  # so `import libero` resolves to the fork
-```
-
-LIBERO-plus is installed from its GitHub fork rather than a pyproject extra: the fork ships as a namespace package that pip can't handle as a regular dependency, so it must be cloned and installed in editable mode with `--no-deps`, as in the commands above. See `docker/Dockerfile.benchmark.libero_plus` for the canonical install. MuJoCo is required, so only Linux is supported.
-
-<Tip>
-Set the MuJoCo rendering backend before running evaluation:
-
-```bash
-export MUJOCO_GL=egl   # headless / HPC / cloud
-```
-
-</Tip>
-
-### Download LIBERO-plus assets
-
-LIBERO-plus ships its extended asset pack separately. Download `assets.zip` from the [Hugging Face dataset](https://huggingface.co/datasets/Sylvest/LIBERO-plus/tree/main) and extract it into the LIBERO-plus package directory:
-
-```bash
-# After installing the package, find where it was installed:
-python -c "import libero; print(libero.__file__)"
-# Then extract assets.zip into <package_root>/libero/assets/
-```
-
-## Evaluation
-
-### Default evaluation (recommended)
-
-Evaluate across the four standard suites (10 episodes per task):
-
-```bash
-lerobot-eval \
-  --policy.path="your-policy-id" \
-  --env.type=libero_plus \
-  --env.task=libero_spatial,libero_object,libero_goal,libero_10 \
-  --eval.batch_size=1 \
-  --eval.n_episodes=10 \
-  --env.max_parallel_tasks=1
-```
-
-### Single-suite evaluation
-
-Evaluate on one LIBERO-plus suite:
-
-```bash
-lerobot-eval \
-  --policy.path="your-policy-id" \
-  --env.type=libero_plus \
-  --env.task=libero_spatial \
-  --eval.batch_size=1 \
-  --eval.n_episodes=10
-```
-
- `--env.task` picks the suite (`libero_spatial`, `libero_object`, etc.).
- `--env.task_ids` restricts to specific task indices (`[0]`, `[1,2,3]`, etc.). Omit to run all tasks in the suite.
- `--eval.batch_size` controls how many environments run in parallel.
- `--eval.n_episodes` sets how many episodes to run per task.
-
-### Multi-suite evaluation
-
-Benchmark a policy across multiple suites at once by passing a comma-separated list:
-
-```bash
-lerobot-eval \
-  --policy.path="your-policy-id" \
-  --env.type=libero_plus \
-  --env.task=libero_spatial,libero_object \
-  --eval.batch_size=1 \
-  --eval.n_episodes=10
-```
-
-### Control mode
-
-LIBERO-plus supports two control modes — `relative` (default) and `absolute`. Different VLA checkpoints are trained with different action parameterizations, so make sure the mode matches your policy:
-
-```bash
--env.control_mode=relative   # or "absolute"
-```
-
-### Policy inputs and outputs
-
-**Observations:**
-
- `observation.state` — 8-dim proprioceptive features (eef position, axis-angle orientation, gripper qpos)
- `observation.images.image` — main camera view (`agentview_image`), HWC uint8
- `observation.images.image2` — wrist camera view (`robot0_eye_in_hand_image`), HWC uint8
-
-**Actions:**
-
- Continuous control in `Box(-1, 1, shape=(7,))` — 6D end-effector delta + 1D gripper
-
-### Recommended evaluation episodes
-
-For reproducible benchmarking, use **10 episodes per task** across all four standard suites (Spatial, Object, Goal, Long). This gives 400 total episodes and matches the protocol used for published results.
-
-## Training
-
-### Dataset
-
-A LeRobot-format training dataset for LIBERO-plus is available at:
-
- [lerobot/libero_plus](https://huggingface.co/datasets/lerobot/libero_plus)
-
-### Example training command
-
-```bash
-lerobot-train \
-    --policy.type=smolvla \
-    --policy.repo_id=${HF_USER}/smolvla_libero_plus \
-    --policy.load_vlm_weights=true \
-    --dataset.repo_id=lerobot/libero_plus \
-    --env.type=libero_plus \
-    --env.task=libero_spatial \
-    --output_dir=./outputs/ \
-    --steps=100000 \
-    --batch_size=4 \
-    --eval.batch_size=1 \
-    --eval.n_episodes=1 \
-    --eval_freq=1000
-```
-
-## Relationship to LIBERO
-
-LIBERO-plus is a drop-in extension of LIBERO:
-
- Same Python gym interface (`LiberoEnv`, `LiberoProcessorStep`)
- Same camera names and observation/action format
- Same task suite names
- Installs under the same `libero` Python package name (different GitHub repo)
-
-To use the original LIBERO benchmark, see [LIBERO](./libero) and use `--env.type=libero`.
@@ -61,6 +61,17 @@ lerobot-eval \
  --rename_map='{"observation.images.image": "observation.images.base_0_rgb", "observation.images.image2": "observation.images.left_wrist_0_rgb"}'
 ```

+### Recording
+
+`lerobot-record` also supports rename maps when running inference, nested under the dataset config:
+
+```bash
+lerobot-record \
+  --policy.path="<user>/smolVLA_finetuned" \
+  ... \
+  --dataset.rename_map='{"observation.images.glove2": "observation.images.image"}'
+```
+
 ## Alternative: edit the policy config directly

 If you always use the same dataset or environment, you can **edit the policy's `config.json`** so its observation keys match your data source. Then no rename map is needed.
@@ -94,10 +105,10 @@ XVLA-base has three visual inputs and `empty_cameras=0` by default. Your dataset

 ## Quick reference

-| Goal                                    | What to do                                                                  |
-| --------------------------------------- | --------------------------------------------------------------------------- |
-| Dataset keys ≠ policy keys              | `--rename_map='{"dataset_key": "policy_key", ...}'`                         |
-| Env keys ≠ policy keys (eval)           | `--rename_map='{"env_key": "policy_key", ...}'`                             |
-| Rollout with different keys (inference) | `--rename_map='{"source_key": "policy_key", ...}'`.                         |
-| Fewer cameras than policy expects       | `--policy.empty_cameras=N` (supported by PI0, PI05, PI0Fast, SmolVLA, XVLA) |
-| Avoid passing a rename map              | Edit the policy's `config.json` so its keys match your data source          |
+| Goal                                      | What to do                                                                  |
+| ----------------------------------------- | --------------------------------------------------------------------------- |
+| Dataset keys ≠ policy keys                | `--rename_map='{"dataset_key": "policy_key", ...}'`                         |
+| Env keys ≠ policy keys (eval)             | `--rename_map='{"env_key": "policy_key", ...}'`                             |
+| Recording with different keys (inference) | `--dataset.rename_map='{"source_key": "policy_key", ...}'`                  |
+| Fewer cameras than policy expects         | `--policy.empty_cameras=N` (supported by PI0, PI05, PI0Fast, SmolVLA, XVLA) |
+| Avoid passing a rename map                | Edit the policy's `config.json` so its keys match your data source          |
@@ -1,188 +0,0 @@
-# RoboCasa365
-
-[RoboCasa365](https://robocasa.ai) is a large-scale simulation framework for training and benchmarking **generalist robots** in everyday kitchen tasks. It ships 365 diverse manipulation tasks across 2,500 kitchen environments, 3,200+ object assets and 600+ hours of human demonstration data, on a PandaOmron 12-DOF mobile manipulator (Franka arm on a holonomic base).
-
- Paper: [RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots](https://arxiv.org/abs/2406.02523)
- GitHub: [robocasa/robocasa](https://github.com/robocasa/robocasa)
- Project website: [robocasa.ai](https://robocasa.ai)
- Pretrained policy: [`lerobot/smolvla_robocasa`](https://huggingface.co/lerobot/smolvla_robocasa)
- Single-task dataset (CloseFridge): [`pepijn223/robocasa_CloseFridge`](https://huggingface.co/datasets/pepijn223/robocasa_CloseFridge)
-
-<img
-  src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/robocasa-banner.webp"
-  alt="RoboCasa365 benchmark overview"
-  width="85%"
-/>
-
-## Available tasks
-
-RoboCasa365 organizes its 365 tasks into two families and three upstream benchmark groups that LeRobot exposes as first-class `--env.task` shortcuts:
-
-| Family    | Tasks | Description                                                                     |
-| --------- | ----- | ------------------------------------------------------------------------------- |
-| Atomic    | ~65   | Single-skill tasks: pick-and-place, door/drawer manipulation, appliance control |
-| Composite | ~300  | Multi-step tasks across 60+ categories: cooking, cleaning, organizing, etc.     |
-
-**Atomic task examples:** `CloseFridge`, `OpenDrawer`, `OpenCabinet`, `TurnOnMicrowave`, `TurnOffStove`, `NavigateKitchen`, `PickPlaceCounterToStove`.
-
-**Composite task categories:** baking, boiling, brewing, chopping, clearing table, defrosting food, loading dishwasher, making tea, microwaving food, washing dishes, and more.
-
-`--env.task` accepts three forms:
-
- a single task name (`CloseFridge`)
- a comma-separated list (`CloseFridge,OpenBlenderLid,PickPlaceCoffee`)
- a benchmark-group shortcut — `atomic_seen`, `composite_seen`, `composite_unseen`, `pretrain50`, `pretrain100`, `pretrain200`, `pretrain300` — which auto-expands to the upstream task list and auto-sets the dataset `split` (`target` or `pretrain`).
-
-## Installation
-
-RoboCasa and its dependency `robosuite` are not published on PyPI, and RoboCasa's own `setup.py` hardcodes `lerobot==0.3.3`, which conflicts with this repo's `lerobot`. LeRobot therefore does **not** expose a `robocasa` extra — install the two packages manually as editable clones (using `--no-deps` on `robocasa` to skip its shadowed `lerobot` pin):
-
-```bash
-# After following the standard LeRobot installation instructions.
-
-git clone https://github.com/robocasa/robocasa.git ~/robocasa
-git clone https://github.com/ARISE-Initiative/robosuite.git ~/robosuite
-pip install -e ~/robocasa --no-deps
-pip install -e ~/robosuite
-
-# Robocasa's runtime deps (the ones its setup.py would have pulled, minus
-# the bad lerobot pin).
-pip install numpy numba scipy mujoco pygame Pillow opencv-python \
-            pyyaml pynput tqdm termcolor imageio h5py lxml hidapi \
-            tianshou gymnasium
-
-python -m robocasa.scripts.setup_macros
-# Lightweight assets (lightwheel object meshes + textures). Enough for
-# the default env out of the box.
-python -m robocasa.scripts.download_kitchen_assets \
-  --type tex tex_generative fixtures_lw objs_lw
-# Optional: full objaverse/aigen registries (~30GB) for richer object
-# variety. Enable at eval time via --env.obj_registries (see below).
-# python -m robocasa.scripts.download_kitchen_assets --type objs_objaverse
-```
-
-<Tip>
-RoboCasa requires MuJoCo. Set the rendering backend before training or evaluation:
-
-```bash
-export MUJOCO_GL=egl  # for headless servers (HPC, cloud)
-```
-
-</Tip>
-
-### Object registries
-
-By default the env samples objects only from the `lightwheel` registry (what `--type objs_lw` ships), which avoids a `Probabilities contain NaN` crash when the objaverse / aigen packs aren't on disk. If you've downloaded the full asset set, enable the full registry at runtime:
-
-```bash
--env.obj_registries='[objaverse,lightwheel]'
-```
-
-## Evaluation
-
-All eval snippets below mirror the CI command (see `.github/workflows/benchmark_tests.yml`). The `--rename_map` argument maps RoboCasa's native camera keys (`robot0_agentview_left` / `robot0_eye_in_hand` / `robot0_agentview_right`) onto the three-camera (`camera1` / `camera2` / `camera3`) input layout the released `smolvla_robocasa` policy was trained on.
-
-### Single-task evaluation (recommended for quick iteration)
-
-```bash
-lerobot-eval \
-  --policy.path=lerobot/smolvla_robocasa \
-  --env.type=robocasa \
-  --env.task=CloseFridge \
-  --eval.batch_size=1 \
-  --eval.n_episodes=20 \
-  --eval.use_async_envs=false \
-  --policy.device=cuda \
-  '--rename_map={"observation.images.robot0_agentview_left": "observation.images.camera1", "observation.images.robot0_eye_in_hand": "observation.images.camera2", "observation.images.robot0_agentview_right": "observation.images.camera3"}'
-```
-
-### Multi-task evaluation
-
-Pass a comma-separated list of tasks:
-
-```bash
-lerobot-eval \
-  --policy.path=lerobot/smolvla_robocasa \
-  --env.type=robocasa \
-  --env.task=CloseFridge,OpenCabinet,OpenDrawer,TurnOnMicrowave,TurnOffStove \
-  --eval.batch_size=1 \
-  --eval.n_episodes=20 \
-  --eval.use_async_envs=false \
-  --policy.device=cuda \
-  '--rename_map={"observation.images.robot0_agentview_left": "observation.images.camera1", "observation.images.robot0_eye_in_hand": "observation.images.camera2", "observation.images.robot0_agentview_right": "observation.images.camera3"}'
-```
-
-### Benchmark-group evaluation
-
-Run an entire upstream group (e.g. all 18 `atomic_seen` tasks with `split=target`):
-
-```bash
-lerobot-eval \
-  --policy.path=lerobot/smolvla_robocasa \
-  --env.type=robocasa \
-  --env.task=atomic_seen \
-  --eval.batch_size=1 \
-  --eval.n_episodes=20 \
-  --eval.use_async_envs=false \
-  --policy.device=cuda \
-  '--rename_map={"observation.images.robot0_agentview_left": "observation.images.camera1", "observation.images.robot0_eye_in_hand": "observation.images.camera2", "observation.images.robot0_agentview_right": "observation.images.camera3"}'
-```
-
-### Recommended evaluation episodes
-
-**20 episodes per task** for reproducible benchmarking. Matches the protocol used in published results.
-
-## Policy inputs and outputs
-
-**Observations** (raw RoboCasa camera names are preserved verbatim):
-
- `observation.state` — 16-dim proprioceptive state (base position, base quaternion, relative end-effector position, relative end-effector quaternion, gripper qpos)
- `observation.images.robot0_agentview_left` — left agent view, 256×256 HWC uint8
- `observation.images.robot0_eye_in_hand` — wrist camera view, 256×256 HWC uint8
- `observation.images.robot0_agentview_right` — right agent view, 256×256 HWC uint8
-
-**Actions:**
-
- Continuous control in `Box(-1, 1, shape=(12,))` — base motion (4D) + control mode (1D) + end-effector position (3D) + end-effector rotation (3D) + gripper (1D).
-
-## Training
-
-### Single-task example
-
-A ready-to-use single-task dataset is on the Hub:
-[`pepijn223/robocasa_CloseFridge`](https://huggingface.co/datasets/pepijn223/robocasa_CloseFridge).
-
-Fine-tune a SmolVLA base on `CloseFridge`:
-
-```bash
-lerobot-train \
-  --policy.type=smolvla \
-  --policy.repo_id=${HF_USER}/smolvla_robocasa_CloseFridge \
-  --policy.load_vlm_weights=true \
-  --policy.push_to_hub=true \
-  --dataset.repo_id=pepijn223/robocasa_CloseFridge \
-  --env.type=robocasa \
-  --env.task=CloseFridge \
-  --output_dir=./outputs/smolvla_robocasa_CloseFridge \
-  --steps=100000 \
-  --batch_size=4 \
-  --eval_freq=5000 \
-  --eval.batch_size=1 \
-  --eval.n_episodes=5 \
-  --save_freq=10000
-```
-
-Evaluate the resulting checkpoint:
-
-```bash
-lerobot-eval \
-  --policy.path=${HF_USER}/smolvla_robocasa_CloseFridge \
-  --env.type=robocasa \
-  --env.task=CloseFridge \
-  --eval.batch_size=1 \
-  --eval.n_episodes=20
-```
-
-## Reproducing published results
-
-The released checkpoint [`lerobot/smolvla_robocasa`](https://huggingface.co/lerobot/smolvla_robocasa) is evaluated with the commands in the [Evaluation](#evaluation) section. CI runs a 10-atomic-task smoke eval (one episode each) on every PR touching the benchmark, picking fixture-centric tasks that don't require the objaverse asset pack.
@@ -1,99 +0,0 @@
-# RoboCerebra
-
-[RoboCerebra](https://robocerebra-project.github.io/) is a long-horizon manipulation benchmark that evaluates **high-level reasoning, planning, and memory** in VLAs. Episodes chain multiple sub-goals with language-grounded intermediate instructions, built on top of LIBERO's simulator stack (MuJoCo + robosuite, Franka Panda 7-DOF).
-
- Paper: [RoboCerebra: A Large-scale Benchmark for Long-horizon Robotic Manipulation Evaluation](https://arxiv.org/abs/2506.06677)
- Project website: [robocerebra-project.github.io](https://robocerebra-project.github.io/)
- Dataset: [`lerobot/robocerebra_unified`](https://huggingface.co/datasets/lerobot/robocerebra_unified) — LeRobot v3.0, 6,660 episodes / 571,116 frames at 20 fps, 1,728 language-grounded sub-tasks.
- Pretrained policy: [`lerobot/smolvla_robocerebra`](https://huggingface.co/lerobot/smolvla_robocerebra)
-
-## Available tasks
-
-RoboCerebra reuses LIBERO's simulator, so evaluation runs against the LIBERO `libero_10` long-horizon suite:
-
-| Suite     | CLI name    | Tasks | Description                                                   |
-| --------- | ----------- | ----- | ------------------------------------------------------------- |
-| LIBERO-10 | `libero_10` | 10    | Long-horizon kitchen/living room tasks chaining 3–6 sub-goals |
-
-Each RoboCerebra episode in the dataset is segmented into multiple sub-tasks with natural-language instructions, which the unified dataset exposes as independent supervision signals.
-
-## Installation
-
-RoboCerebra piggybacks on LIBERO, so the `libero` extra is all you need:
-
-```bash
-pip install -e ".[libero]"
-```
-
-<Tip>
-RoboCerebra requires Linux (MuJoCo / robosuite). Set the rendering backend before training or evaluation:
-
-```bash
-export MUJOCO_GL=egl  # for headless servers (HPC, cloud)
-```
-
-</Tip>
-
-## Evaluation
-
-RoboCerebra eval runs against LIBERO's `libero_10` suite with RoboCerebra's camera naming (`image` + `wrist_image`) and an extra empty-camera slot so a three-view-trained policy receives the expected input layout:
-
-```bash
-lerobot-eval \
-  --policy.path=lerobot/smolvla_robocerebra \
-  --env.type=libero \
-  --env.task=libero_10 \
-  --env.fps=20 \
-  --env.obs_type=pixels_agent_pos \
-  --env.observation_height=256 \
-  --env.observation_width=256 \
-  '--env.camera_name_mapping={"agentview_image": "image", "robot0_eye_in_hand_image": "wrist_image"}' \
-  --eval.batch_size=1 \
-  --eval.n_episodes=10 \
-  --eval.use_async_envs=false \
-  --policy.device=cuda \
-  '--rename_map={"observation.images.image": "observation.images.camera1", "observation.images.wrist_image": "observation.images.camera2"}' \
-  --policy.empty_cameras=1
-```
-
-### Recommended evaluation episodes
-
-**10 episodes per task** across the `libero_10` suite (100 total) for reproducible benchmarking. Matches the protocol used in the RoboCerebra paper.
-
-## Policy inputs and outputs
-
-**Observations:**
-
-- `observation.state` — 8-dim proprioceptive state (7 joint positions + gripper)
-- `observation.images.image` — third-person view, 256×256 HWC uint8
-- `observation.images.wrist_image` — wrist-mounted camera view, 256×256 HWC uint8
-
-**Actions:**
-
-- Continuous control in `Box(-1, 1, shape=(7,))` — end-effector delta (6D) + gripper (1D)
-
-## Training
-
-The unified dataset at [`lerobot/robocerebra_unified`](https://huggingface.co/datasets/lerobot/robocerebra_unified) exposes two RGB streams and language-grounded sub-task annotations:
-
-| Feature                          | Shape         | Description          |
-| -------------------------------- | ------------- | -------------------- |
-| `observation.images.image`       | (256, 256, 3) | Third-person view    |
-| `observation.images.wrist_image` | (256, 256, 3) | Wrist-mounted camera |
-| `observation.state`              | (8,)          | Joint pos + gripper  |
-| `action`                         | (7,)          | EEF delta + gripper  |
-
-Fine-tune a SmolVLA base on it:
-
-```bash
-lerobot-train \
-  --policy.path=lerobot/smolvla_base \
-  --dataset.repo_id=lerobot/robocerebra_unified \
-  --env.type=libero \
-  --env.task=libero_10 \
-  --output_dir=outputs/smolvla_robocerebra
-```
-
-## Reproducing published results
-
-The released checkpoint [`lerobot/smolvla_robocerebra`](https://huggingface.co/lerobot/smolvla_robocerebra) was trained on `lerobot/robocerebra_unified` and evaluated with the command in the [Evaluation](#evaluation) section. CI runs the same command with `--eval.n_episodes=1` as a smoke test on every PR touching the benchmark.
@@ -1,130 +0,0 @@
-# RoboMME
-
-[RoboMME](https://robomme.github.io) is a memory-augmented manipulation benchmark built on ManiSkill (SAPIEN). It evaluates a robot's ability to retain and use information across an episode — counting, object permanence, reference, and imitation.
-
-- **16 tasks** across 4 memory-skill suites
-- **1,600 training demos** (100 per task, 50 val, 50 test)
-- **Dataset**: [`lerobot/robomme`](https://huggingface.co/datasets/lerobot/robomme) — LeRobot v3.0, 768K frames at 10 fps
-- **Simulator**: ManiSkill / SAPIEN, Panda arm, Linux only
-
-![RoboMME benchmark tasks overview](https://cdn-thumbnails.huggingface.co/social-thumbnails/papers/2603.04639/gradient.png)
-
-## Tasks
-
-| Suite                             | Tasks                                                         |
-| --------------------------------- | ------------------------------------------------------------- |
-| **Counting** (temporal memory)    | BinFill, PickXtimes, SwingXtimes, StopCube                    |
-| **Permanence** (spatial memory)   | VideoUnmask, VideoUnmaskSwap, ButtonUnmask, ButtonUnmaskSwap  |
-| **Reference** (object memory)     | PickHighlight, VideoRepick, VideoPlaceButton, VideoPlaceOrder |
-| **Imitation** (procedural memory) | MoveCube, InsertPeg, PatternLock, RouteStick                  |
-
-## Installation
-
-> RoboMME requires **Linux** (ManiSkill/SAPIEN uses Vulkan rendering). Docker is recommended to isolate dependency conflicts.
-
-### Native (Linux)
-
-```bash
-pip install --override <(printf 'gymnasium==0.29.1\nnumpy==1.26.4\n') \
-  -e '.[smolvla,av-dep]' \
-  'robomme @ git+https://github.com/RoboMME/robomme_benchmark.git@main'
-```
-
-> **Dependency note**: `mani-skill` (pulled by `robomme`) pins `gymnasium==0.29.1` and `numpy<2.0.0`, which conflict with lerobot's base `numpy>=2.0.0`. That's why `robomme` is not a pyproject extra — use the override install above, or the Docker approach below to avoid conflicts entirely.
-
-### Docker (recommended)
-
-```bash
-# Build base image first (from repo root)
-docker build -f docker/Dockerfile.eval-base -t lerobot-eval-base .
-
-# Build RoboMME eval image (applies gymnasium + numpy pin overrides)
-docker build -f docker/Dockerfile.benchmark.robomme -t lerobot-robomme .
-```
-
-The `docker/Dockerfile.benchmark.robomme` image overrides `gymnasium==0.29.1` and `numpy==1.26.4` after lerobot's install. Both versions are runtime-safe for lerobot's actual API usage.
-
-## Running Evaluation
-
-### Default (single task, single episode)
-
-```bash
-lerobot-eval \
-    --policy.path=<your_policy_repo> \
-    --env.type=robomme \
-    --env.task=PickXtimes \
-    --env.dataset_split=test \
-    --env.task_ids=[0] \
-    --eval.batch_size=1 \
-    --eval.n_episodes=1
-```
-
-### Multi-task evaluation
-
-Evaluate multiple tasks in one run by comma-separating task names. Use `task_ids` to control which episodes are evaluated per task. Recommended: 50 episodes per task for the test split.
-
-```bash
-lerobot-eval \
-    --policy.path=<your_policy_repo> \
-    --env.type=robomme \
-    --env.task=PickXtimes,BinFill,StopCube,MoveCube,InsertPeg \
-    --env.dataset_split=test \
-    --env.task_ids=[0,1,2,3,4,5,6,7,8,9] \
-    --eval.batch_size=1 \
-    --eval.n_episodes=50
-```
-
-### Key CLI options for `env.type=robomme`
-
-| Option               | Default       | Description                                        |
-| -------------------- | ------------- | -------------------------------------------------- |
-| `env.task`           | `PickXtimes`  | Any of the 16 task names above (comma-separated)   |
-| `env.dataset_split`  | `test`        | `train`, `val`, or `test`                          |
-| `env.action_space`   | `joint_angle` | `joint_angle` (8-D) or `ee_pose` (7-D)             |
-| `env.episode_length` | `300`         | Max steps per episode                              |
-| `env.task_ids`       | `null`        | List of episode indices to evaluate (null = `[0]`) |
-
-## Dataset
-
-The dataset [`lerobot/robomme`](https://huggingface.co/datasets/lerobot/robomme) is in **LeRobot v3.0 format** and can be loaded directly:
-
-```python
-from lerobot.datasets.lerobot_dataset import LeRobotDataset
-
-dataset = LeRobotDataset("lerobot/robomme")
-```
-
-### Dataset features
-
-| Feature            | Shape         | Description                     |
-| ------------------ | ------------- | ------------------------------- |
-| `image`            | (256, 256, 3) | Front camera RGB                |
-| `wrist_image`      | (256, 256, 3) | Wrist camera RGB                |
-| `actions`          | (8,)          | Joint angles + gripper          |
-| `state`            | (8,)          | Joint positions + gripper state |
-| `simple_subgoal`   | str           | High-level language annotation  |
-| `grounded_subgoal` | str           | Grounded language annotation    |
-| `episode_index`    | int           | Episode ID                      |
-| `frame_index`      | int           | Frame within episode            |
-
-### Feature key alignment (training)
-
-The env wrapper exposes `pixels/image` and `pixels/wrist_image` as observation keys. The `features_map` in `RoboMMEEnv` maps these to `observation.images.image` and `observation.images.wrist_image` for the policy. State is exposed as `agent_pos` and maps to `observation.state`.
-
-The dataset's `image` and `wrist_image` columns already align with the policy input keys, so no renaming is needed when fine-tuning.
-
-## Action Spaces
-
-| Type          | Dim | Description                                               |
-| ------------- | --- | --------------------------------------------------------- |
-| `joint_angle` | 8   | 7 joint angles + 1 gripper (−1 closed, +1 open, absolute) |
-| `ee_pose`     | 7   | xyz + roll/pitch/yaw + gripper                            |
-
-Set via `--env.action_space=joint_angle` (default) or `--env.action_space=ee_pose`.
-
-## Platform Notes
-
-- **Linux only**: ManiSkill requires SAPIEN/Vulkan. macOS and Windows are not supported.
-- **GPU recommended**: Rendering is CPU-capable but slow; CUDA + Vulkan gives full speed.
-- **gymnasium / numpy conflict**: See installation note above. Docker image handles this automatically.
-- **ManiSkill fork**: `robomme` depends on a specific ManiSkill fork (`YinpeiDai/ManiSkill`), pulled in automatically via the `robomme` package.
@@ -1,223 +0,0 @@
-# RoboTwin 2.0
-
-RoboTwin 2.0 is a **large-scale dual-arm manipulation benchmark** built on the SAPIEN physics engine. It provides a standardized evaluation protocol for bimanual robotic policies across 50 tasks (as of upstream `main`) with strong domain randomization (clutter, lighting, background, tabletop height, and language instructions).
-
-- Paper: [RoboTwin 2.0: A Scalable Data Generator and Benchmark with Strong Domain Randomization for Robust Bimanual Robotic Manipulation](https://arxiv.org/abs/2506.18088)
-- GitHub: [RoboTwin-Platform/RoboTwin](https://github.com/RoboTwin-Platform/RoboTwin)
-- Leaderboard: [robotwin-platform.github.io/leaderboard](https://robotwin-platform.github.io/leaderboard)
-- Dataset: [lerobot/robotwin_unified](https://huggingface.co/datasets/lerobot/robotwin_unified)
-
-![RoboTwin 2.0 benchmark overview](https://www.aitntnews.com/pictures/2025/7/8/9a7f79cb-5ba9-11f0-8581-fa163e47d677.png)
-
-## Overview
-
-| Property      | Value                                                    |
-| ------------- | -------------------------------------------------------- |
-| Tasks         | 50 dual-arm manipulation tasks                           |
-| Robot         | Aloha-AgileX bimanual (14 DOF, 7 per arm)                |
-| Action space  | 14-dim joint-space, continuous in `[-1, 1]`              |
-| Cameras       | `head_camera`, `left_camera`, `right_camera`             |
-| Simulator     | SAPIEN (not MuJoCo)                                      |
-| Eval protocol | 100 episodes/task, 50 demo_clean demonstrations          |
-| Eval settings | **Easy** (`demo_clean`) and **Hard** (`demo_randomized`) |
-
-## Available tasks
-
-RoboTwin 2.0 ships 50 dual-arm manipulation tasks in its upstream `envs/` directory. The canonical list is the `ROBOTWIN_TASKS` tuple in `src/lerobot/envs/robotwin.py`, mirrored verbatim from the upstream repo. Example tasks:
-
-| Task                     | CLI name                 | Category          |
-| ------------------------ | ------------------------ | ----------------- |
-| Beat block with hammer   | `beat_block_hammer`      | Tool use          |
-| Click bell / alarm clock | `click_bell`             | Precision press   |
-| Stack blocks (2 / 3)     | `stack_blocks_two/three` | Stacking          |
-| Stack bowls (2 / 3)      | `stack_bowls_two/three`  | Stacking          |
-| Handover block / mic     | `handover_block`         | Bimanual coord.   |
-| Lift pot                 | `lift_pot`               | Bimanual lift     |
-| Shake bottle             | `shake_bottle`           | Continuous motion |
-| Turn switch              | `turn_switch`            | Articulated obj   |
-| Stamp seal               | `stamp_seal`             | Precision place   |
-| Scan object              | `scan_object`            | Mobile manip.     |
-
-Pass a comma-separated list to `--env.task` to run multiple tasks in a single eval sweep.
-
-<Tip warning={true}>
-  `open_laptop` is currently broken upstream (its `check_success()` uses
-  `self.arm_tag`, which is only set inside the scripted-expert `play_once()`
-  path and therefore unavailable during normal policy eval). Avoid it until the
-  upstream bug is fixed, or patch the task to default `self.arm_tag = "left"` in
-  `load_actors()`.
-</Tip>
-
-## Dataset
-
-The RoboTwin 2.0 dataset is available in **LeRobot v3.0 format** on the Hugging Face Hub:
-
-```
-lerobot/robotwin_unified
-```
-
-It contains over 100,000 pre-collected trajectories across all 50 tasks (79.6 GB, Apache 2.0 license). No format conversion is needed — it is already in the correct LeRobot v3.0 schema with video observations and action labels.
-
-You can load it directly with the HF Datasets library:
-
-```python
-from datasets import load_dataset
-
-ds = load_dataset("lerobot/robotwin_unified", split="train")
-```
-
-## Installation
-
-RoboTwin 2.0 requires **Linux** with an NVIDIA GPU (CUDA 12.1 recommended). Installation takes approximately 20 minutes.
-
-### 1. Create a conda environment
-
-```bash
-conda create -n robotwin python=3.10 -y
-conda activate robotwin
-```
-
-### 2. Install LeRobot
-
-```bash
-git clone https://github.com/huggingface/lerobot.git
-cd lerobot
-pip install -e "."
-```
-
-### 3. Install RoboTwin 2.0
-
-```bash
-git clone https://github.com/RoboTwin-Platform/RoboTwin.git
-cd RoboTwin
-bash script/_install.sh
-bash script/_download_assets.sh
-```
-
-The install script handles all Python dependencies including SAPIEN, CuRobo, mplib, and pytorch3d.
-
-<Tip warning={true}>
-If the automated install fails, install manually:
-
-```bash
-pip install -r requirements.txt
-pip install "git+https://github.com/facebookresearch/pytorch3d.git@stable"
-cd envs && git clone https://github.com/NVlabs/curobo.git && cd curobo
-pip install -e . --no-build-isolation
-```
-
-Then apply the required mplib fix: in `mplib/planner.py` line 807, remove `or collide` from the conditional.
-
-</Tip>
-
-### 4. Add RoboTwin to PYTHONPATH
-
-The RoboTwin task modules must be importable by LeRobot. From within the `RoboTwin/` directory:
-
-```bash
-export PYTHONPATH="${PYTHONPATH}:$(pwd)"
-```
-
-Add this to your shell profile to make it permanent.
-
-## Evaluation
-
-### Standard evaluation (recommended)
-
-Evaluate a policy on a single task with the official protocol (100 episodes):
-
-```bash
-lerobot-eval \
-  --policy.path="your-hf-policy-id" \
-  --env.type=robotwin \
-  --env.task=beat_block_hammer \
-  --eval.batch_size=1 \
-  --eval.n_episodes=100
-```
-
-### Single-task quick check
-
-```bash
-lerobot-eval \
-  --policy.path="your-hf-policy-id" \
-  --env.type=robotwin \
-  --env.task=beat_block_hammer \
-  --eval.batch_size=1 \
-  --eval.n_episodes=5
-```
-
-### Multi-task sweep
-
-Evaluate on several tasks in one run:
-
-```bash
-lerobot-eval \
-  --policy.path="your-hf-policy-id" \
-  --env.type=robotwin \
-  --env.task=beat_block_hammer,click_bell,handover_block,stack_blocks_two \
-  --eval.batch_size=1 \
-  --eval.n_episodes=100
-```
-
-### Full benchmark (all 50 tasks)
-
-```bash
-lerobot-eval \
-  --policy.path="your-hf-policy-id" \
-  --env.type=robotwin \
-  --env.task=adjust_bottle,beat_block_hammer,blocks_ranking_rgb,blocks_ranking_size,click_alarmclock,click_bell,dump_bin_bigbin,grab_roller,handover_block,handover_mic,hanging_mug,lift_pot,move_can_pot,move_pillbottle_pad,move_playingcard_away,move_stapler_pad,open_microwave,pick_diverse_bottles,pick_dual_bottles,place_a2b_left,place_a2b_right,place_bread_basket,place_bread_skillet,place_burger_fries,place_can_basket,place_cans_plasticbox,place_container_plate,place_dual_shoes,place_empty_cup,place_fan,place_mouse_pad,place_object_basket,place_object_scale,place_object_stand,place_phone_stand,place_shoe,press_stapler,put_bottles_dustbin,put_object_cabinet,rotate_qrcode,scan_object,shake_bottle,shake_bottle_horizontally,stack_blocks_three,stack_blocks_two,stack_bowls_three,stack_bowls_two,stamp_seal,turn_switch \
-  --eval.batch_size=1 \
-  --eval.n_episodes=100
-```
-
-<Tip>
-  `open_laptop` is intentionally omitted above because of the upstream
-  `self.arm_tag` bug (see the **Available tasks** section). Re-add it once the
-  upstream fix lands.
-</Tip>
-
-## Camera configuration
-
-By default, all three cameras are included:
-
-| Camera key     | Description                    |
-| -------------- | ------------------------------ |
-| `head_camera`  | Torso-mounted overhead view    |
-| `left_camera`  | Left arm wrist-mounted camera  |
-| `right_camera` | Right arm wrist-mounted camera |
-
-To use a subset of cameras, override `--env.camera_names`:
-
-```bash
-lerobot-eval \
-  --policy.path="your-hf-policy-id" \
-  --env.type=robotwin \
-  --env.task=beat_block_hammer \
-  --env.camera_names="head_camera,left_camera" \
-  --eval.batch_size=1 \
-  --eval.n_episodes=10
-```
-
-## Environment config reference
-
-Key parameters for `RoboTwinEnvConfig`:
-
-| Parameter            | Default                                  | Description                        |
-| -------------------- | ---------------------------------------- | ---------------------------------- |
-| `task`               | `"beat_block_hammer"`                    | Comma-separated task name(s)       |
-| `fps`                | `25`                                     | Simulation FPS                     |
-| `episode_length`     | `300`                                    | Max steps per episode              |
-| `obs_type`           | `"pixels_agent_pos"`                     | `"pixels"` or `"pixels_agent_pos"` |
-| `camera_names`       | `"head_camera,left_camera,right_camera"` | Comma-separated active cameras     |
-| `observation_height` | `240`                                    | Camera pixel height                |
-| `observation_width`  | `320`                                    | Camera pixel width                 |
-
-## Leaderboard submission
-
-Results can be submitted to the [RoboTwin 2.0 leaderboard](https://robotwin-platform.github.io/leaderboard). The official protocol requires:
-
-- Training on 50 `demo_clean` demonstrations per task
-- Evaluating 100 episodes per task
-- Reporting success rate separately for **Easy** (`demo_clean`) and **Hard** (`demo_randomized`) settings
-
-For submission instructions, refer to the [RoboTwin 2.0 documentation](https://robotwin-platform.github.io/doc/).
@@ -34,7 +34,7 @@ pip install -e ".[smolvla]"

 ### Using RTC with Pi0

-You can use `lerobot-rollout --strategy.type=base --inference.type=rtc` for RTC deployment on real robots.
+You can find a complete reference implementation in [eval_with_real_robot.py](examples/rtc/eval_with_real_robot.py).
 The snippet below provides a simplified pseudo-example of how RTC operates with Pi0 in your pipeline:

 ```python
@@ -137,12 +137,8 @@ The script generates a visualization of the denoising process, comparing standar
 ## Testing RTC with a Real Robot

 ```bash
-lerobot-rollout \
-    --strategy.type=base \
+python examples/rtc/eval_with_real_robot.py \
    --policy.path=${HF_USERNAME}/policy_repo_id \
-    --inference.type=rtc \
-    --inference.rtc.execution_horizon=10 \
-    --inference.rtc.max_guidance_weight=10.0 \
    --robot.type=so100_follower \
    --robot.port=/dev/tty.usbmodem58FA0834591 \
    --robot.cameras="{ gripper: {type: opencv, index_or_path: 1, width: 640, height: 480, fps: 30}, front: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
@@ -182,7 +178,7 @@ visualizer = RTCDebugVisualizer()
 # ... create plots
 ```

-See `examples/rtc/eval_dataset.py` for a complete example of offline RTC visualization.
+See `examples/rtc/eval_dataset.py` for a complete example of visualization.

 ## References

@@ -46,7 +46,7 @@ This ensures identical task states map to consistent progress values, even acros

 ## Inputs and Targets (What the new code expects)

-SARM is trained through its processor (`src/lerobot/rewards/sarm/processor_sarm.py`), which:
+SARM is trained through its processor (`src/lerobot/policies/sarm/processor_sarm.py`), which:

 - **Encodes** images and task text with CLIP (ViT-B/32) into `video_features` and `text_features`
 - **Pads/truncates** robot state into `state_features` (up to `max_state_dim`)
@@ -347,7 +347,7 @@ Use `compute_rabc_weights.py` with `--visualize-only` to visualize model predict
 <hfoption id="single_stage">

 ```bash
-python -m lerobot.rewards.sarm.compute_rabc_weights \
+python src/lerobot/policies/sarm/compute_rabc_weights.py \
  --dataset-repo-id your-username/your-dataset \
  --reward-model-path your-username/sarm-model \
  --visualize-only \
@@ -360,7 +360,7 @@ python -m lerobot.rewards.sarm.compute_rabc_weights \
 <hfoption id="dense_only">

 ```bash
-python -m lerobot.rewards.sarm.compute_rabc_weights \
+python src/lerobot/policies/sarm/compute_rabc_weights.py \
  --dataset-repo-id your-username/your-dataset \
  --reward-model-path your-username/sarm-model \
  --visualize-only \
@@ -373,7 +373,7 @@ python -m lerobot.rewards.sarm.compute_rabc_weights \
 <hfoption id="dual">

 ```bash
-python -m lerobot.rewards.sarm.compute_rabc_weights \
+python src/lerobot/policies/sarm/compute_rabc_weights.py \
  --dataset-repo-id your-username/your-dataset \
  --reward-model-path your-username/sarm-model \
  --visualize-only \
@@ -429,7 +429,7 @@ The weighting follows **Equations 8-9** from the paper:
 First, run the SARM model on all frames in your dataset to compute progress values:

 ```bash
-python -m lerobot.rewards.sarm.compute_rabc_weights \
+python src/lerobot/policies/sarm/compute_rabc_weights.py \
  --dataset-repo-id your-username/your-dataset \
  --reward-model-path your-username/sarm-model \
  --head-mode sparse \
@@ -465,15 +465,15 @@ This script:

 ### Step 5b: Train Policy with RA-BC

-Once you have the progress file, train your policy with RA-BC weighting. The progress file is auto-detected from the dataset path (`sarm_progress.parquet`) if not explicitly provided. Currently PI0, PI0.5 and SmolVLA are supported with RA-BC:
+Once you have the progress file, train your policy with RA-BC weighting. The progress file is auto-detected from the dataset path (`sarm_progress.parquet`). Currently PI0, PI0.5 and SmolVLA are supported with RA-BC:

 ```bash
 lerobot-train \
  --dataset.repo_id=your-username/your-dataset \
  --policy.type=pi0 \
-  --sample_weighting.type=rabc \
-  --sample_weighting.head_mode=sparse \
-  --sample_weighting.kappa=0.01 \
+  --use_rabc=true \
+  --rabc_head_mode=sparse \
+  --rabc_kappa=0.01 \
  --output_dir=outputs/train/policy_rabc \
  --batch_size=32 \
  --steps=40000
@@ -488,13 +488,12 @@ The training script automatically:

 **RA-BC Arguments:**

-| Argument                           | Description                                            | Default                 |
-| ---------------------------------- | ------------------------------------------------------ | ----------------------- |
-| `--sample_weighting.type`          | Weighting strategy type (`rabc` or `uniform`)          | `rabc`                  |
-| `--sample_weighting.progress_path` | Path to progress parquet file                          | `sarm_progress.parquet` |
-| `--sample_weighting.head_mode`     | Which SARM head's progress to use: `sparse` or `dense` | `sparse`                |
-| `--sample_weighting.kappa`         | Threshold κ for high-quality samples                   | `0.01`                  |
-| `--sample_weighting.epsilon`       | Small constant for numerical stability                 | `1e-6`                  |
+| Argument               | Description                                                | Default                            |
+| ---------------------- | ---------------------------------------------------------- | ---------------------------------- |
+| `--use_rabc`           | Enable RA-BC sample weighting                              | `false`                            |
+| `--rabc_progress_path` | Path to progress parquet file (auto-detected from dataset) | `sarm_progress.parquet` in dataset |
+| `--rabc_head_mode`     | Which SARM head's progress to use: `sparse` or `dense`     | `sparse`                           |
+| `--rabc_kappa`         | Threshold κ for high-quality samples                       | `0.01`                             |

 ### Tuning RA-BC Kappa

@@ -512,30 +511,30 @@ The `kappa` parameter is the threshold that determines which samples get full we

 Monitor these WandB metrics during training:

-| Metric                        | Healthy Range | Problem Indicator         |
-| ----------------------------- | ------------- | ------------------------- |
-| `sample_weight_mean_weight`   | 0.3 - 0.8     | ≈ 1.0 means kappa too low |
-| `sample_weighting/delta_mean` | > 0           | Should be positive        |
-| `sample_weighting/delta_std`  | > 0           | Variance in data quality  |
+| Metric             | Healthy Range | Problem Indicator         |
+| ------------------ | ------------- | ------------------------- |
+| `rabc_mean_weight` | 0.3 - 0.8     | ≈ 1.0 means kappa too low |
+| `rabc_delta_mean`  | > 0           | Should be positive        |
+| `rabc_delta_std`   | > 0           | Variance in data quality  |

-**If `sample_weight_mean_weight ≈ 1.0`:** Your kappa is too low. Most samples have `delta > kappa` and bypass the soft-weighting entirely. RA-BC becomes equivalent to vanilla BC.
+**If `rabc_mean_weight ≈ 1.0`:** Your kappa is too low. Most samples have `delta > kappa` and bypass the soft-weighting entirely. RA-BC becomes equivalent to vanilla BC.

 **Setting kappa based on your data:**

-The default `kappa=0.01` was tuned for the paper's T-shirt folding task (~90s episodes at 30fps). For your dataset, check the logged `sample_weighting/delta_mean` and `sample_weighting/delta_std`:
+The default `kappa=0.01` was tuned for the paper's T-shirt folding task (~90s episodes at 30fps). For your dataset, check the logged `rabc_delta_mean` and `rabc_delta_std`:

 ```
 # If delta_mean ≈ 0.03 and delta_std ≈ 0.02:
 # Most deltas fall in range [0.01, 0.05]

 # Option 1: Set kappa = delta_mean (medium selectivity)
---sample_weighting.kappa=0.03
+--rabc_kappa=0.03

 # Option 2: Set kappa = delta_mean + delta_std (high selectivity)
---sample_weighting.kappa=0.05
+--rabc_kappa=0.05

 # Option 3: Set kappa = delta_mean + 2*delta_std (very selective)
---sample_weighting.kappa=0.07
+--rabc_kappa=0.07
 ```

 **When RA-BC may not help:**
@@ -551,8 +550,8 @@ accelerate launch \
  src/lerobot/scripts/lerobot_train.py \
  --dataset.repo_id=your-username/your-dataset \
  --policy.type=pi0 \
-  --sample_weighting.type=rabc \
-  --sample_weighting.kappa=0.01 \
+  --use_rabc=true \
+  --rabc_kappa=0.01 \
  --output_dir=outputs/train/policy_rabc \
  --batch_size=32 \
  --steps=40000
@@ -577,7 +576,7 @@ accelerate launch \
 ### RA-BC

 1. **Train SARM first**: RA-BC quality depends entirely on SARM quality
-2. **Monitor `sample_weight_mean_weight`**: If it's ≈ 1.0, increase kappa (see [Tuning RA-BC Kappa](#tuning-ra-bc-kappa))
+2. **Monitor `rabc_mean_weight`**: If it's ≈ 1.0, increase kappa (see [Tuning RA-BC Kappa](#tuning-ra-bc-kappa))

 ---

@@ -274,8 +274,7 @@ python src/lerobot/scripts/lerobot_train.py \
 Once trained, we recommend deploying policies using inference-time RTC:

 ```bash
-lerobot-rollout \
-  --strategy.type=base \
+python examples/rtc/eval_with_real_robot.py \
  --policy.path=your-username/your-repo-id \
  --policy.device=cuda \
  --robot.type=unitree_g1 \
@@ -285,7 +284,7 @@ lerobot-rollout \
  --task="task_description" \
  --duration=1000 \
  --fps=30 \
-  --inference.type=rtc
+  --rtc.enabled=true
 ```

 ---
@@ -1,176 +0,0 @@
-# VLABench
-
-[VLABench](https://github.com/OpenMOSS/VLABench) is a large-scale benchmark for **language-conditioned robotic manipulation with long-horizon reasoning**. The upstream suite covers 100 task categories across 2,000+ objects and evaluates six dimensions of robot intelligence: mesh & texture understanding, spatial reasoning, world-knowledge transfer, semantic instruction comprehension, physical-law understanding, and long-horizon planning. Built on MuJoCo / dm_control with a Franka Panda 7-DOF arm. LeRobot exposes **43 of these tasks** through `--env.task` (21 primitives + 22 composites, see [Available tasks](#available-tasks) below).
-
-- Paper: [VLABench: A Large-Scale Benchmark for Language-Conditioned Robotics Manipulation with Long-Horizon Reasoning](https://arxiv.org/abs/2412.18194)
-- GitHub: [OpenMOSS/VLABench](https://github.com/OpenMOSS/VLABench)
-- Project website: [vlabench.github.io](https://vlabench.github.io)
-- Pretrained policy: [`lerobot/smolvla_vlabench`](https://huggingface.co/lerobot/smolvla_vlabench)
-
-<img
-  src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/vlabench.png"
-  alt="VLABench benchmark overview"
-  width="85%"
-/>
-
-## Available tasks
-
-VLABench ships two task suites covering **43 task categories** in LeRobot's `--env.task` surface:
-
-| Suite     | CLI name    | Tasks | Description                                                      |
-| --------- | ----------- | ----- | ---------------------------------------------------------------- |
-| Primitive | `primitive` | 21    | Single / few-skill combinations (select, insert, physics QA)     |
-| Composite | `composite` | 22    | Multi-step reasoning and long-horizon planning (cook, rearrange) |
-
-**Primitive tasks:** `select_fruit`, `select_toy`, `select_chemistry_tube`, `add_condiment`, `select_book`, `select_painting`, `select_drink`, `insert_flower`, `select_billiards`, `select_ingredient`, `select_mahjong`, `select_poker`, and physical-reasoning tasks (`density_qa`, `friction_qa`, `magnetism_qa`, `reflection_qa`, `simple_cuestick_usage`, `simple_seesaw_usage`, `sound_speed_qa`, `thermal_expansion_qa`, `weight_qa`).
-
-**Composite tasks:** `cluster_billiards`, `cluster_book`, `cluster_drink`, `cluster_toy`, `cook_dishes`, `cool_drink`, `find_unseen_object`, `get_coffee`, `hammer_nail`, `heat_food`, `make_juice`, `play_mahjong`, `play_math_game`, `play_poker`, `play_snooker`, `rearrange_book`, `rearrange_chemistry_tube`, `set_dining_table`, `set_study_table`, `store_food`, `take_chemistry_experiment`, `use_seesaw_complex`.
-
-`--env.task` accepts three forms:
-
-- a single task name (`select_fruit`)
-- a comma-separated list (`select_fruit,heat_food`)
-- a suite shortcut (`primitive`, `composite`, or `primitive,composite`)
-
-## Installation
-
-VLABench is **not on PyPI** — its only distribution is the [OpenMOSS/VLABench](https://github.com/OpenMOSS/VLABench) GitHub repo — so LeRobot does not expose a `vlabench` extra. Install it manually as an editable clone, alongside the MuJoCo / dm_control pins VLABench needs, then fetch the mesh assets:
-
-```bash
-# After following the standard LeRobot installation instructions.
-
-git clone https://github.com/OpenMOSS/VLABench.git ~/VLABench
-git clone https://github.com/motion-planning/rrt-algorithms.git ~/rrt-algorithms
-pip install -e ~/VLABench -e ~/rrt-algorithms
-pip install "mujoco==3.2.2" "dm-control==1.0.22" \
-            open3d colorlog scikit-learn openai gdown
-
-python ~/VLABench/scripts/download_assets.py
-```
-
-<Tip>
-VLABench requires Linux (`sys_platform == 'linux'`) and Python 3.10+. Set the MuJoCo rendering backend before running:
-
-```bash
-export MUJOCO_GL=egl  # for headless servers (HPC, cloud)
-```
-
-</Tip>
-
-## Evaluation
-
-All eval snippets below mirror the command CI runs (see `.github/workflows/benchmark_tests.yml`). The `--rename_map` argument maps VLABench's `image` / `second_image` / `wrist_image` camera keys onto the three-camera (`camera1` / `camera2` / `camera3`) input layout the released `smolvla_vlabench` policy was trained on.
-
-### Single-task evaluation (recommended for quick iteration)
-
-```bash
-lerobot-eval \
-  --policy.path=lerobot/smolvla_vlabench \
-  --env.type=vlabench \
-  --env.task=select_fruit \
-  --eval.batch_size=1 \
-  --eval.n_episodes=10 \
-  --eval.use_async_envs=false \
-  --policy.device=cuda \
-  '--rename_map={"observation.images.image": "observation.images.camera1", "observation.images.second_image": "observation.images.camera2", "observation.images.wrist_image": "observation.images.camera3"}'
-```
-
-### Multi-task evaluation
-
-Pass a comma-separated list of tasks:
-
-```bash
-lerobot-eval \
-  --policy.path=lerobot/smolvla_vlabench \
-  --env.type=vlabench \
-  --env.task=select_fruit,select_toy,add_condiment,heat_food \
-  --eval.batch_size=1 \
-  --eval.n_episodes=10 \
-  --eval.use_async_envs=false \
-  --policy.device=cuda \
-  '--rename_map={"observation.images.image": "observation.images.camera1", "observation.images.second_image": "observation.images.camera2", "observation.images.wrist_image": "observation.images.camera3"}'
-```
-
-### Suite-wide evaluation
-
-Run an entire suite (all 21 primitives or all 22 composites):
-
-```bash
-lerobot-eval \
-  --policy.path=lerobot/smolvla_vlabench \
-  --env.type=vlabench \
-  --env.task=primitive \
-  --eval.batch_size=1 \
-  --eval.n_episodes=10 \
-  --eval.use_async_envs=false \
-  --policy.device=cuda \
-  --env.max_parallel_tasks=1 \
-  '--rename_map={"observation.images.image": "observation.images.camera1", "observation.images.second_image": "observation.images.camera2", "observation.images.wrist_image": "observation.images.camera3"}'
-```
-
-Or both suites:
-
-```bash
-lerobot-eval \
-  --policy.path=lerobot/smolvla_vlabench \
-  --env.type=vlabench \
-  --env.task=primitive,composite \
-  --eval.batch_size=1 \
-  --eval.n_episodes=10 \
-  --eval.use_async_envs=false \
-  --policy.device=cuda \
-  --env.max_parallel_tasks=1 \
-  '--rename_map={"observation.images.image": "observation.images.camera1", "observation.images.second_image": "observation.images.camera2", "observation.images.wrist_image": "observation.images.camera3"}'
-```
-
-### Recommended evaluation episodes
-
-**10 episodes per task** for reproducible benchmarking (210 total for the full primitive suite, 220 for composite). Matches the protocol in the VLABench paper.
-
-## Policy inputs and outputs
-
-**Observations:**
-
- `observation.state` — 7-dim end-effector state (position xyz + Euler xyz + gripper)
- `observation.images.image` — front camera, 480×480 HWC uint8
- `observation.images.second_image` — second camera, 480×480 HWC uint8
- `observation.images.wrist_image` — wrist camera, 480×480 HWC uint8
-
-**Actions:**
-
- Continuous control in `Box(-1, 1, shape=(7,))` — 3D position + 3D Euler orientation + 1D gripper.
-
-## Training
-
-### Datasets
-
-Pre-collected VLABench datasets in LeRobot format on the Hub:
-
- [`VLABench/vlabench_primitive_ft_lerobot_video`](https://huggingface.co/datasets/VLABench/vlabench_primitive_ft_lerobot_video) — 5,000 episodes, 128 tasks, 480×480 images.
- [`VLABench/vlabench_composite_ft_lerobot_video`](https://huggingface.co/datasets/VLABench/vlabench_composite_ft_lerobot_video) — 5,977 episodes, 167 tasks, 224×224 images.
-
-### Example training command
-
-Fine-tune a SmolVLA base on the primitive suite:
-
-```bash
-lerobot-train \
-  --policy.type=smolvla \
-  --policy.repo_id=${HF_USER}/smolvla_vlabench_primitive \
-  --policy.load_vlm_weights=true \
-  --policy.push_to_hub=true \
-  --dataset.repo_id=VLABench/vlabench_primitive_ft_lerobot_video \
-  --env.type=vlabench \
-  --env.task=select_fruit \
-  --output_dir=./outputs/smolvla_vlabench_primitive \
-  --steps=100000 \
-  --batch_size=4 \
-  --eval_freq=5000 \
-  --eval.batch_size=1 \
-  --eval.n_episodes=1 \
-  --save_freq=10000
-```
-
-## Reproducing published results
-
-The released checkpoint [`lerobot/smolvla_vlabench`](https://huggingface.co/lerobot/smolvla_vlabench) was trained on the primitive-suite dataset above and is evaluated with the [Single-task](#single-task-evaluation-recommended-for-quick-iteration) / [Suite-wide](#suite-wide-evaluation) commands. CI runs a 10-primitive-task smoke eval (one episode each) on every PR touching the benchmark.
@@ -220,7 +220,7 @@ REAL_DIM = 12
 # Postprocessing: Trim 20D predictions to 12D for deployment
 ```

-See the [action_hub.py](https://github.com/huggingface/lerobot/blob/main/src/lerobot/policies/xvla/action_hub.py) implementation for details.
+See the [action_hub.py](/home/jade_choghari/robot/lerobot/src/lerobot/policies/xvla/action_hub.py) implementation for details.

 #### Auto Action Mode (Recommended)

@@ -519,9 +519,9 @@ If you use X-VLA in your research, please cite:

 - [X-VLA Paper](https://arxiv.org/pdf/2510.10274)
 - [LeRobot Documentation](https://github.com/huggingface/lerobot)
- [Action Registry Implementation](https://github.com/huggingface/lerobot/blob/main/src/lerobot/policies/xvla/action_hub.py)
- [Processor Implementation](https://github.com/huggingface/lerobot/blob/main/src/lerobot/policies/xvla/processor_xvla.py)
- [Model Configuration](https://github.com/huggingface/lerobot/blob/main/src/lerobot/policies/xvla/configuration_xvla.py)
+- [Action Registry Implementation](https://github.com/huggingface/lerobot/src/lerobot/policies/xvla/action_hub.py)
+- [Processor Implementation](https://github.com/huggingface/lerobot/src/lerobot/policies/xvla/processor_xvla.py)
+- [Model Configuration](https://github.com/huggingface/lerobot/src/lerobot/policies/xvla/configuration_xvla.py)

 ## Contributing

@@ -69,7 +69,7 @@ class ComputeProgressShards(PipelineStep):
        import torch
        from tqdm import tqdm

-        from lerobot.rewards.sarm.compute_rabc_weights import (
+        from lerobot.policies.sarm.compute_rabc_weights import (
            generate_all_frame_indices,
            interpolate_progress,
            load_sarm_resources,
@@ -0,0 +1,226 @@
+# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Shared utilities for Human-in-the-Loop data collection scripts."""
+
+import logging
+import time
+from dataclasses import dataclass, field
+from pathlib import Path
+
+from lerobot.common.control_utils import is_headless
+from lerobot.processor import (
+    IdentityProcessorStep,
+    RobotAction,
+    RobotObservation,
+    RobotProcessorPipeline,
+    observation_to_transition,
+    robot_action_observation_to_transition,
+    transition_to_observation,
+    transition_to_robot_action,
+)
+from lerobot.robots import Robot
+from lerobot.teleoperators import Teleoperator
+from lerobot.utils.robot_utils import precise_sleep
+
+logger = logging.getLogger(__name__)
+
+
+@dataclass
+class HILDatasetConfig:
+    repo_id: str
+    single_task: str
+    root: str | Path | None = None
+    fps: int = 30
+    episode_time_s: float = 120
+    num_episodes: int = 50
+    video: bool = True
+    push_to_hub: bool = True
+    private: bool = False
+    tags: list[str] | None = None
+    num_image_writer_processes: int = 0
+    num_image_writer_threads_per_camera: int = 4
+    video_encoding_batch_size: int = 1
+    vcodec: str = "auto"
+    streaming_encoding: bool = True
+    encoder_queue_maxsize: int = 30
+    encoder_threads: int | None = None
+    rename_map: dict[str, str] = field(default_factory=dict)
+
+
+def teleop_has_motor_control(teleop: Teleoperator) -> bool:
+    """Check if teleoperator has motor control capabilities."""
+    return all(hasattr(teleop, attr) for attr in ("enable_torque", "disable_torque", "write_goal_positions"))
+
+
+def teleop_disable_torque(teleop: Teleoperator) -> None:
+    """Disable teleop torque if supported."""
+    if hasattr(teleop, "disable_torque"):
+        teleop.disable_torque()
+
+
+def teleop_enable_torque(teleop: Teleoperator) -> None:
+    """Enable teleop torque if supported."""
+    if hasattr(teleop, "enable_torque"):
+        teleop.enable_torque()
+
+
+def teleop_smooth_move_to(teleop: Teleoperator, target_pos: dict, duration_s: float = 2.0, fps: int = 50):
+    """Smoothly move teleop to target position if motor control is available."""
+    if not teleop_has_motor_control(teleop):
+        logger.warning("Teleop does not support motor control - cannot mirror robot position")
+        return
+
+    teleop_enable_torque(teleop)
+    current = teleop.get_action()
+    steps = max(int(duration_s * fps), 1)
+
+    for step in range(steps + 1):
+        t = step / steps
+        interp = {}
+        for k in current:
+            if k in target_pos:
+                interp[k] = current[k] * (1 - t) + target_pos[k] * t
+            else:
+                interp[k] = current[k]
+        teleop.write_goal_positions(interp)
+        time.sleep(1 / fps)
+
+
+def init_keyboard_listener():
+    """Initialize keyboard listener with HIL controls."""
+    events = {
+        "exit_early": False,
+        "rerecord_episode": False,
+        "stop_recording": False,
+        "policy_paused": False,
+        "correction_active": False,
+        "resume_policy": False,
+        "in_reset": False,
+        "start_next_episode": False,
+    }
+
+    if is_headless():
+        logger.warning("Headless environment - keyboard controls unavailable")
+        return None, events
+
+    from pynput import keyboard
+
+    def on_press(key):
+        try:
+            if events["in_reset"]:
+                if key in [keyboard.Key.space, keyboard.Key.right]:
+                    logger.info("[HIL] Starting next episode...")
+                    events["start_next_episode"] = True
+                elif hasattr(key, "char") and key.char == "c":
+                    events["start_next_episode"] = True
+                elif key == keyboard.Key.esc:
+                    logger.info("[HIL] ESC - Stop recording, pushing to hub...")
+                    events["stop_recording"] = True
+                    events["start_next_episode"] = True
+            else:
+                if key == keyboard.Key.space:
+                    if not events["policy_paused"] and not events["correction_active"]:
+                        logger.info("[HIL] PAUSED - Press 'c' to take control or 'p' to resume policy")
+                        events["policy_paused"] = True
+                elif hasattr(key, "char") and key.char == "c":
+                    if events["policy_paused"] and not events["correction_active"]:
+                        logger.info("[HIL] Taking control...")
+                        events["start_next_episode"] = True
+                elif hasattr(key, "char") and key.char == "p":
+                    if events["policy_paused"] or events["correction_active"]:
+                        logger.info("[HIL] Resuming policy...")
+                        events["resume_policy"] = True
+                elif key == keyboard.Key.right:
+                    logger.info("[HIL] End episode")
+                    events["exit_early"] = True
+                elif key == keyboard.Key.left:
+                    logger.info("[HIL] Re-record episode")
+                    events["rerecord_episode"] = True
+                    events["exit_early"] = True
+                elif key == keyboard.Key.esc:
+                    logger.info("[HIL] ESC - Stop recording...")
+                    events["stop_recording"] = True
+                    events["exit_early"] = True
+        except Exception as e:
+            logger.warning("Key error: %s", e)
+
+    listener = keyboard.Listener(on_press=on_press)
+    listener.start()
+    return listener, events
+
+
+def make_identity_processors():
+    """Create identity processors for recording."""
+    teleop_proc = RobotProcessorPipeline[tuple[RobotAction, RobotObservation], RobotAction](
+        steps=[IdentityProcessorStep()],
+        to_transition=robot_action_observation_to_transition,
+        to_output=transition_to_robot_action,
+    )
+    obs_proc = RobotProcessorPipeline[RobotObservation, RobotObservation](
+        steps=[IdentityProcessorStep()],
+        to_transition=observation_to_transition,
+        to_output=transition_to_observation,
+    )
+    return teleop_proc, obs_proc
+
+
+def reset_loop(robot: Robot, teleop: Teleoperator, events: dict, fps: int):
+    """Reset period where human repositions environment."""
+    logger.info("[HIL] RESET")
+
+    events["in_reset"] = True
+    events["start_next_episode"] = False
+
+    obs = robot.get_observation()
+    robot_pos = {k: v for k, v in obs.items() if k.endswith(".pos") and k in robot.observation_features}
+    teleop_smooth_move_to(teleop, robot_pos, duration_s=2.0, fps=50)
+
+    logger.info("Press SPACE, → or 'c' to enable teleoperation (ESC to stop)")
+    while not events["start_next_episode"] and not events["stop_recording"]:
+        precise_sleep(0.05)
+
+    if events["stop_recording"]:
+        return
+
+    events["start_next_episode"] = False
+    teleop_disable_torque(teleop)
+    logger.info("Teleop enabled - press SPACE, → or 'c' to start episode (ESC to stop)")
+
+    while not events["start_next_episode"] and not events["stop_recording"]:
+        loop_start = time.perf_counter()
+        action = teleop.get_action()
+        robot.send_action(action)
+        precise_sleep(1 / fps - (time.perf_counter() - loop_start))
+
+    events["in_reset"] = False
+    events["start_next_episode"] = False
+    events["exit_early"] = False
+    events["policy_paused"] = False
+    events["correction_active"] = False
+    events["resume_policy"] = False
+
+
+def print_controls(rtc: bool = False):
+    """Print control instructions."""
+    mode = "Human-in-the-Loop Data Collection" + (" (RTC)" if rtc else "")
+    logger.info(
+        "%s\n  Controls:\n"
+        "    SPACE  - Pause policy\n"
+        "    c      - Take control\n"
+        "    p      - Resume policy after pause/correction\n"
+        "    →      - End episode\n"
+        "    ←      - Re-record episode\n"
+        "    ESC    - Stop and push to hub",
+        mode,
+    )
@@ -14,21 +14,17 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-import logging
-import time
-
-from lerobot.common.control_utils import init_keyboard_listener, predict_action
+from lerobot.common.control_utils import init_keyboard_listener
 from lerobot.datasets import LeRobotDataset
 from lerobot.policies import make_pre_post_processors
 from lerobot.policies.act import ACTPolicy
-from lerobot.policies.utils import make_robot_action
 from lerobot.processor import make_default_processors
 from lerobot.robots.lekiwi import LeKiwiClient, LeKiwiClientConfig
+from lerobot.scripts.lerobot_record import record_loop
 from lerobot.utils.constants import ACTION, OBS_STR
-from lerobot.utils.feature_utils import build_dataset_frame, hw_to_dataset_features
-from lerobot.utils.robot_utils import precise_sleep
+from lerobot.utils.feature_utils import hw_to_dataset_features
 from lerobot.utils.utils import log_say
-from lerobot.utils.visualization_utils import init_rerun, log_rerun_data
+from lerobot.utils.visualization_utils import init_rerun

 NUM_EPISODES = 2
 FPS = 30
@@ -39,9 +35,6 @@ HF_DATASET_ID = "<hf_username>/<eval_dataset_repo_id>"


 def main():
-    # NOTE: For production policy deployment, use `lerobot-rollout` CLI instead.
-    # This script provides a self-contained example for educational purposes.
-
    # Create the robot configuration & robot
    robot_config = LeKiwiClientConfig(remote_ip="172.18.134.136", id="lekiwi")

@@ -90,67 +83,43 @@ def main():
            raise ValueError("Robot is not connected!")

        print("Starting evaluate loop...")
-        control_interval = 1 / FPS
        recorded_episodes = 0
        while recorded_episodes < NUM_EPISODES and not events["stop_recording"]:
            log_say(f"Running inference, recording eval episode {recorded_episodes} of {NUM_EPISODES}")

-            # Inline evaluation loop: predict actions and send to robot
-            timestamp = 0
-            start_episode_t = time.perf_counter()
-            while timestamp < EPISODE_TIME_SEC:
-                start_loop_t = time.perf_counter()
-
-                if events["exit_early"]:
-                    events["exit_early"] = False
-                    break
-
-                # Get robot observation
-                obs = robot.get_observation()
-                obs_processed = robot_observation_processor(obs)
-                observation_frame = build_dataset_frame(dataset.features, obs_processed, prefix=OBS_STR)
-
-                # Predict action using the policy
-                action_tensor = predict_action(
-                    observation=observation_frame,
-                    policy=policy,
-                    device=policy.config.device,
-                    preprocessor=preprocessor,
-                    postprocessor=postprocessor,
-                    use_amp=policy.config.device.type == "cuda",
-                    task=TASK_DESCRIPTION,
-                    robot_type=robot.name,
-                )
-
-                # Convert policy output to robot action dict
-                action_values = make_robot_action(action_tensor, dataset.features)
-
-                # Process and send action to robot
-                robot_action_to_send = robot_action_processor((action_values, obs))
-                robot.send_action(robot_action_to_send)
-
-                # Write to dataset
-                action_frame = build_dataset_frame(dataset.features, action_values, prefix=ACTION)
-                frame = {**observation_frame, **action_frame, "task": TASK_DESCRIPTION}
-                dataset.add_frame(frame)
-
-                log_rerun_data(observation=obs_processed, action=action_values)
-
-                dt_s = time.perf_counter() - start_loop_t
-                sleep_time_s = control_interval - dt_s
-                if sleep_time_s < 0:
-                    logging.warning(
-                        f"Evaluate loop is running slower ({1 / dt_s:.1f} Hz) than the target FPS ({FPS} Hz)."
-                    )
-                precise_sleep(max(sleep_time_s, 0.0))
-                timestamp = time.perf_counter() - start_episode_t
+            # Main record loop
+            record_loop(
+                robot=robot,
+                events=events,
+                fps=FPS,
+                policy=policy,
+                preprocessor=preprocessor,  # Pass the pre and post policy processors
+                postprocessor=postprocessor,
+                dataset=dataset,
+                control_time_s=EPISODE_TIME_SEC,
+                single_task=TASK_DESCRIPTION,
+                display_data=True,
+                teleop_action_processor=teleop_action_processor,
+                robot_action_processor=robot_action_processor,
+                robot_observation_processor=robot_observation_processor,
+            )

            # Reset the environment if not stopping or re-recording
            if not events["stop_recording"] and (
                (recorded_episodes < NUM_EPISODES - 1) or events["rerecord_episode"]
            ):
                log_say("Reset the environment")
-                log_say("Waiting for environment reset, press right arrow key when ready...")
+                record_loop(
+                    robot=robot,
+                    events=events,
+                    fps=FPS,
+                    control_time_s=EPISODE_TIME_SEC,
+                    single_task=TASK_DESCRIPTION,
+                    display_data=True,
+                    teleop_action_processor=teleop_action_processor,
+                    robot_action_processor=robot_action_processor,
+                    robot_observation_processor=robot_observation_processor,
+                )

            if events["rerecord_episode"]:
                log_say("Re-record episode")
@@ -45,6 +45,9 @@ def main():
    leader_arm = SO100Leader(leader_arm_config)
    keyboard = KeyboardTeleop(keyboard_config)

+    # TODO(Steven): Update this example to use pipelines
+    teleop_action_processor, robot_action_processor, robot_observation_processor = make_default_processors()
+
    # Configure the dataset features
    action_features = hw_to_dataset_features(robot.action_features, ACTION)
    obs_features = hw_to_dataset_features(robot.observation_features, OBS_STR)
@@ -74,10 +77,6 @@ def main():
        if not robot.is_connected or not leader_arm.is_connected or not keyboard.is_connected:
            raise ValueError("Robot or teleop is not connected!")

-        teleop_action_processor, robot_action_processor, robot_observation_processor = (
-            make_default_processors()
-        )
-
        print("Starting record loop...")
        recorded_episodes = 0
        while recorded_episodes < NUM_EPISODES and not events["stop_recording"]:
@@ -88,14 +87,14 @@ def main():
                robot=robot,
                events=events,
                fps=FPS,
-                teleop_action_processor=teleop_action_processor,
-                robot_action_processor=robot_action_processor,
-                robot_observation_processor=robot_observation_processor,
                dataset=dataset,
                teleop=[leader_arm, keyboard],
                control_time_s=EPISODE_TIME_SEC,
                single_task=TASK_DESCRIPTION,
                display_data=True,
+                teleop_action_processor=teleop_action_processor,
+                robot_action_processor=robot_action_processor,
+                robot_observation_processor=robot_observation_processor,
            )

            # Reset the environment if not stopping or re-recording
@@ -107,13 +106,13 @@ def main():
                    robot=robot,
                    events=events,
                    fps=FPS,
-                    teleop_action_processor=teleop_action_processor,
-                    robot_action_processor=robot_action_processor,
-                    robot_observation_processor=robot_observation_processor,
                    teleop=[leader_arm, keyboard],
                    control_time_s=RESET_TIME_SEC,
                    single_task=TASK_DESCRIPTION,
                    display_data=True,
+                    teleop_action_processor=teleop_action_processor,
+                    robot_action_processor=robot_action_processor,
+                    robot_observation_processor=robot_observation_processor,
                )

            if events["rerecord_episode"]:
@@ -1,77 +0,0 @@
-# !/usr/bin/env python
-
-# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-"""Run a trained policy on LeKiwi without recording (base rollout).
-
-Uses the rollout engine's :class:`BaseStrategy` (autonomous execution,
-no dataset) with :class:`SyncInferenceConfig` (inline policy call per
-control tick).  For a CLI entry point with the same capabilities plus
-recording, upload, and human-in-the-loop variants, see ``lerobot-rollout``.
-"""
-
-from lerobot.configs import PreTrainedConfig
-from lerobot.robots.lekiwi import LeKiwiClientConfig
-from lerobot.rollout import BaseStrategyConfig, RolloutConfig, build_rollout_context
-from lerobot.rollout.inference import SyncInferenceConfig
-from lerobot.rollout.strategies import BaseStrategy
-from lerobot.utils.process import ProcessSignalHandler
-from lerobot.utils.utils import init_logging
-
-FPS = 30
-DURATION_SEC = 60
-TASK_DESCRIPTION = "My task description"
-HF_MODEL_ID = "<hf_username>/<model_repo_id>"
-
-
-def main():
-    init_logging()
-
-    # Robot: LeKiwi client — make sure lekiwi_host is already running on the robot.
-    robot_config = LeKiwiClientConfig(remote_ip="172.18.134.136", id="lekiwi")
-
-    # Policy: load the pretrained config.  ``pretrained_path`` is read downstream
-    # by ``build_rollout_context`` to reload the full model.
-    policy_config = PreTrainedConfig.from_pretrained(HF_MODEL_ID)
-    policy_config.pretrained_path = HF_MODEL_ID
-
-    # Assemble the rollout config: base strategy (no recording) + sync inference.
-    cfg = RolloutConfig(
-        robot=robot_config,
-        policy=policy_config,
-        strategy=BaseStrategyConfig(),
-        inference=SyncInferenceConfig(),
-        fps=FPS,
-        duration=DURATION_SEC,
-        task=TASK_DESCRIPTION,
-    )
-
-    # Graceful Ctrl-C: the strategy loop exits when shutdown_event is set.
-    signal_handler = ProcessSignalHandler(use_threads=True)
-
-    # Build the context (connects robot, loads policy, wires the inference strategy).
-    # No custom processors here — LeKiwi runs on raw joint features.
-    ctx = build_rollout_context(cfg, signal_handler.shutdown_event)
-
-    strategy = BaseStrategy(cfg.strategy)
-    try:
-        strategy.setup(ctx)
-        strategy.run(ctx)
-    finally:
-        strategy.teardown(ctx)
-
-
-if __name__ == "__main__":
-    main()
@@ -1,342 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "# 🤗 LeRobot Quickstart\n",
-    "\n",
-    "Calibration → teleoperation → data collection → training → evaluation.\n",
-    "\n",
-    "Install the required dependencies: `pip install -e .[notebook,dataset,training,viz,hardware]`.\n",
-    "\n",
-    "**How to use:**\n",
-    "1. Edit the **Configuration** cell with your settings.\n",
-    "2. Run all cells (`Run All`).\n",
-    "3. Each section prints a ready-to-paste terminal command - copy it and run it.\n",
-    "\n",
-    "Each setup is different, please refer to the [LeRobot documentation](https://huggingface.co/docs/lerobot/il_robots) for more details on each step and available options. <br>\n",
-    "Feel free to make this notebook your own and adapt it to your needs!"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "## Utils"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "def _cameras_arg(cameras: dict) -> str:\n",
-    "    if not cameras:\n",
-    "        return \"\"\n",
-    "    entries = [f\"{n}: {{{', '.join(f'{k}: {v}' for k, v in cfg.items())}}}\" for n, cfg in cameras.items()]\n",
-    "    return \"{ \" + \", \".join(entries) + \" }\"\n",
-    "\n",
-    "\n",
-    "def print_cmd(*parts: str) -> None:\n",
-    "    \"\"\"Print a shell command with line continuations, skipping empty parts.\"\"\"\n",
-    "    non_empty = [p for p in parts if p]\n",
-    "    print(\" \\\\\\n    \".join(non_empty))"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "## Configuration\n",
-    "\n",
-    "Edit this cell, then **Run All** to generate all commands below."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Robot (follower) - run `lerobot-find-port` to discover the port\n",
-    "ROBOT_TYPE = \"so101_follower\"\n",
-    "ROBOT_PORT = \"/dev/ttyACM0\"\n",
-    "ROBOT_ID = \"my_follower_arm\"\n",
-    "\n",
-    "# Teleop (leader) - run `lerobot-find-port` to discover the port\n",
-    "TELEOP_TYPE = \"so101_leader\"\n",
-    "TELEOP_PORT = \"/dev/ttyACM1\"\n",
-    "TELEOP_ID = \"my_leader_arm\"\n",
-    "\n",
-    "# Cameras - set to {} to disable\n",
-    "# Run `lerobot-find-cameras opencv` to list available cameras and their indices\n",
-    "CAMERAS = {\n",
-    "    \"top\": {\"type\": \"opencv\", \"index_or_path\": 2, \"width\": 640, \"height\": 480, \"fps\": 30},\n",
-    "    \"wrist\": {\"type\": \"opencv\", \"index_or_path\": 4, \"width\": 640, \"height\": 480, \"fps\": 30},\n",
-    "}\n",
-    "\n",
-    "# Dataset\n",
-    "HF_USER = \"your_hf_username\"  # `huggingface-cli whoami` to find your username\n",
-    "DATASET_NAME = \"my_so101_dataset\"\n",
-    "TASK_DESCRIPTION = \"pick and place the block\"\n",
-    "NUM_EPISODES = 10\n",
-    "\n",
-    "# Training\n",
-    "POLICY_TYPE = \"act\"  # act, diffusion, smolvla, ...\n",
-    "POLICY_DEVICE = \"cuda\"  # cuda / cpu / mps\n",
-    "TRAIN_STEPS = 10_000\n",
-    "SAVE_FREQ = 2_000\n",
-    "OUTPUT_DIR = f\"outputs/train/{DATASET_NAME}\"\n",
-    "\n",
-    "# Inference - Hub repo ID or local checkpoint path\n",
-    "# e.g. set to f\"{OUTPUT_DIR}/checkpoints/last\" to use a local checkpoint\n",
-    "POLICY_PATH = f\"{HF_USER}/{DATASET_NAME}_{POLICY_TYPE}\"\n",
-    "LAST_CHECKPOINT_PATH = f\"{OUTPUT_DIR}/checkpoints/last\"\n",
-    "\n",
-    "# Derived\n",
-    "DATASET_REPO_ID = f\"{HF_USER}/{DATASET_NAME}\"\n",
-    "DATASET_ROOT = f\"data/{DATASET_NAME}\"\n",
-    "POLICY_REPO_ID = f\"{HF_USER}/{DATASET_NAME}_{POLICY_TYPE}\"\n",
-    "EVAL_REPO_ID = f\"{HF_USER}/eval_{DATASET_NAME}\"\n",
-    "CAMERAS_ARG = _cameras_arg(CAMERAS)\n",
-    "CAMERAS_FLAG = f'--robot.cameras=\"{CAMERAS_ARG}\"' if CAMERAS_ARG else \"\"\n",
-    "\n",
-    "print(f\"Robot  : {ROBOT_TYPE} @ {ROBOT_PORT}\")\n",
-    "print(f\"Teleop : {TELEOP_TYPE} @ {TELEOP_PORT}\")\n",
-    "print(f\"Cameras: {list(CAMERAS) or 'none'}\")\n",
-    "print(f\"Dataset: {DATASET_REPO_ID} ({NUM_EPISODES} episodes) saved to {DATASET_ROOT}\")\n",
-    "print(f\"Policy : {POLICY_TYPE} -> {POLICY_REPO_ID}\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "## 1. Calibration\n",
-    "\n",
-    "Run once per arm before first use."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Follower\n",
-    "print_cmd(\n",
-    "    \"lerobot-calibrate\",\n",
-    "    f\"--robot.type={ROBOT_TYPE}\",\n",
-    "    f\"--robot.port={ROBOT_PORT}\",\n",
-    "    f\"--robot.id={ROBOT_ID}\",\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Leader\n",
-    "print_cmd(\n",
-    "    \"lerobot-calibrate\",\n",
-    "    f\"--teleop.type={TELEOP_TYPE}\",\n",
-    "    f\"--teleop.port={TELEOP_PORT}\",\n",
-    "    f\"--teleop.id={TELEOP_ID}\",\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "## 2. Teleoperation\n",
-    "\n",
-    "See the [teleoperation docs](https://huggingface.co/docs/lerobot/il_robots#teleoperate) and the [cameras guide](https://huggingface.co/docs/lerobot/cameras) for more options."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "print_cmd(\n",
-    "    \"lerobot-teleoperate\",\n",
-    "    f\"--robot.type={ROBOT_TYPE}\",\n",
-    "    f\"--robot.port={ROBOT_PORT}\",\n",
-    "    f\"--robot.id={ROBOT_ID}\",\n",
-    "    CAMERAS_FLAG,\n",
-    "    f\"--teleop.type={TELEOP_TYPE}\",\n",
-    "    f\"--teleop.port={TELEOP_PORT}\",\n",
-    "    f\"--teleop.id={TELEOP_ID}\",\n",
-    "    \"--display_data=true\",\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "## 3. Record Dataset\n",
-    "\n",
-    "See the [recording docs](https://huggingface.co/docs/lerobot/il_robots#record-a-dataset) for tips on gathering good data."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "print_cmd(\n",
-    "    \"lerobot-record\",\n",
-    "    f\"--robot.type={ROBOT_TYPE}\",\n",
-    "    f\"--robot.port={ROBOT_PORT}\",\n",
-    "    f\"--robot.id={ROBOT_ID}\",\n",
-    "    CAMERAS_FLAG,\n",
-    "    f\"--teleop.type={TELEOP_TYPE}\",\n",
-    "    f\"--teleop.port={TELEOP_PORT}\",\n",
-    "    f\"--teleop.id={TELEOP_ID}\",\n",
-    "    f\"--dataset.repo_id={DATASET_REPO_ID}\",\n",
-    "    f\"--dataset.num_episodes={NUM_EPISODES}\",\n",
-    "    f'--dataset.single_task=\"{TASK_DESCRIPTION}\"',\n",
-    "    \"--dataset.streaming_encoding=true\",\n",
-    "    \"--display_data=true\",\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Resume a previously interrupted recording session\n",
-    "print_cmd(\n",
-    "    \"lerobot-record\",\n",
-    "    f\"--robot.type={ROBOT_TYPE}\",\n",
-    "    f\"--robot.port={ROBOT_PORT}\",\n",
-    "    f\"--robot.id={ROBOT_ID}\",\n",
-    "    CAMERAS_FLAG,\n",
-    "    f\"--teleop.type={TELEOP_TYPE}\",\n",
-    "    f\"--teleop.port={TELEOP_PORT}\",\n",
-    "    f\"--teleop.id={TELEOP_ID}\",\n",
-    "    f\"--dataset.repo_id={DATASET_REPO_ID}\",\n",
-    "    f\"--dataset.root={DATASET_ROOT}\",\n",
-    "    f\"--dataset.num_episodes={NUM_EPISODES}\",\n",
-    "    f'--dataset.single_task=\"{TASK_DESCRIPTION}\"',\n",
-    "    \"--dataset.streaming_encoding=true\",\n",
-    "    \"--display_data=true\",\n",
-    "    \"--resume=true\",\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "## 4. Train Policy\n",
-    "\n",
-    "See the [training docs](https://huggingface.co/docs/lerobot/il_robots#train-a-policy) for configuration options and tips."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "print_cmd(\n",
-    "    \"lerobot-train\",\n",
-    "    f\"--dataset.repo_id={DATASET_REPO_ID}\",\n",
-    "    f\"--policy.type={POLICY_TYPE}\",\n",
-    "    f\"--policy.device={POLICY_DEVICE}\",\n",
-    "    f\"--policy.repo_id={POLICY_REPO_ID}\",\n",
-    "    f\"--output_dir={OUTPUT_DIR}\",\n",
-    "    f\"--steps={TRAIN_STEPS}\",\n",
-    "    f\"--save_freq={SAVE_FREQ}\",\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Resume a previously interrupted training session\n",
-    "print_cmd(\n",
-    "    \"lerobot-train\",\n",
-    "    f\"--config_path={LAST_CHECKPOINT_PATH}/pretrained_model/train_config.json\",\n",
-    "    \"--resume=true\",\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "## 5. Inference\n",
-    "\n",
-    "Uses `POLICY_PATH` from the Configuration cell (defaults to the Hub repo ID). You can also set it to `LAST_CHECKPOINT_PATH`.\n",
-    "\n",
-    "See the [inference docs](https://huggingface.co/docs/lerobot/il_robots#run-inference-and-evaluate-your-policy) for details."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "print_cmd(\n",
-    "    \"lerobot-record\",\n",
-    "    f\"--policy.path={POLICY_PATH}\",\n",
-    "    f\"--robot.type={ROBOT_TYPE}\",\n",
-    "    f\"--robot.port={ROBOT_PORT}\",\n",
-    "    f\"--robot.id={ROBOT_ID}\",\n",
-    "    CAMERAS_FLAG,\n",
-    "    f\"--teleop.type={TELEOP_TYPE}\",\n",
-    "    f\"--teleop.port={TELEOP_PORT}\",\n",
-    "    f\"--teleop.id={TELEOP_ID}\",\n",
-    "    f\"--dataset.repo_id={EVAL_REPO_ID}\",\n",
-    "    f\"--dataset.num_episodes={NUM_EPISODES}\",\n",
-    "    f'--dataset.single_task=\"{TASK_DESCRIPTION}\"',\n",
-    "    \"--dataset.streaming_encoding=true\",\n",
-    ")"
-   ]
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "lerobot (3.12.3)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.12.3"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}
@@ -1,136 +0,0 @@
-# OMX Follower — Cube Pick And Place Example
-
-This example shows what LeRobot can do on a physical setup.
-It is a work in progress, used internally at LeRobot and specific to our setup, but we hope it serves as a useful reference for the LeRobot APIs and CLIs.
-
-It includes an end-to-end example for the **OMX Follower** robot arm: record a cube pick-and-place dataset, train a policy, and deploy it autonomously.
-
-## Hardware
-
-| Component | Value                                |
-| --------- | ------------------------------------ |
-| Robot     | OMX Follower                         |
-| Cameras   | 2× OpenCV cameras (wrist + top-down) |
-
-## Scripts
-
-| Script                 | Purpose                                                         |
-| ---------------------- | --------------------------------------------------------------- |
-| `reset_environment.py` | Standalone utility: sweep workspace, grab cube, place cube      |
-| `record_grab.py`       | Automated data collection: reset → place → record grab episodes |
-
-## Setup
-
-Make sure LeRobot is installed in your environment (see [the installation guide](https://huggingface.co/docs/lerobot/installation)).
-
-Next, we will declare some environment variables for convenience. Adjust the camera indices and robot port to match your system configuration.
-
-```bash
-export ROBOT_PORT=/dev/ttyACM0
-export TELEOP_PORT=/dev/ttyACM1
-export HF_USERNAME=<your_hf_username>
-export ROBOT_CAMERAS="{ wrist: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30, fourcc: MJPG}, top: {type: opencv, index_or_path: 2, width: 640, height: 480, fps: 30, fourcc: MJPG} }"
-```
-
-## Step 1 — Collect Data
-
-```bash
-lerobot-record \
-    --robot.type=omx_follower \
-    --robot.port=$ROBOT_PORT \
-    --robot.id=omx_follower \
-    --robot.cameras="$ROBOT_CAMERAS" \
-    --teleop.type=omx_leader \
-    --teleop.port=$TELEOP_PORT \
-    --teleop.id=omx_leader \
-    --dataset.repo_id=$HF_USERNAME/omx_pickandplace \
-    --dataset.root=data/omx_pickandplace \
-    --dataset.num_episodes=50 \
-    --dataset.single_task="Pick the cube and place it in the blue square" \
-    --dataset.streaming_encoding=true \
-    --dataset.push_to_hub=true
-```
-
-### Bonus Auto-Collect script
-
-**Warning:** This script is specific to our setup and to the cube pick-and-place task; it is not a general-purpose data collection script. Note that it does not require a teleoperator.
-
-```bash
-python -m examples.omx.record_grab \
-    --robot.type=omx_follower \
-    --robot.port=$ROBOT_PORT \
-    --robot.id=omx_follower \
-    --robot.cameras="$ROBOT_CAMERAS" \
-    --dataset.repo_id=$HF_USERNAME/omx_pickandplace \
-    --dataset.root=data/omx_pickandplace \
-    --dataset.num_episodes=50 \
-    --dataset.single_task="Pick the cube and place it in the blue square" \
-    --dataset.streaming_encoding=true \
-    --dataset.push_to_hub=true
-```
-
-Each episode:
-
-1. The arm grabs the cube from the center of the workspace and places it at a random position.
-2. The arm returns to HOME.
-3. A targeted grab is recorded: HOME → approach raised → lower onto cube → grasp → lift → carry → drop → HOME.
-
-A dataset is already available at [`maximellerbach/omx_pickandplace`](https://huggingface.co/datasets/maximellerbach/omx_pickandplace), so you can skip straight to training.
-
-## Step 2 — Train
-
-To train a simple `ACT` policy on the collected dataset, you can use the `lerobot-train` CLI:
-
-```bash
-lerobot-train \
-    --dataset.repo_id=$HF_USERNAME/omx_pickandplace \
-    --policy.type=act \
-    --output_dir=outputs/train/omx_pickandplace_act \
-    --policy.device=cuda \
-    --policy.repo_id=$HF_USERNAME/omx_pickandplace_act \
-    --steps=20000 \
-    --wandb.enable=true
-```
-
-A pretrained `ACT` policy is already available at [`maximellerbach/omx_pickandplace_act`](https://huggingface.co/maximellerbach/omx_pickandplace_act).
-
-## Step 3 — Rollout
-
-Use the `lerobot-rollout` CLI with the `base` strategy:
-
-```bash
-lerobot-rollout \
-    --strategy.type=base \
-    --robot.type=omx_follower \
-    --robot.port=$ROBOT_PORT \
-    --robot.id=omx_follower \
-    --robot.cameras="$ROBOT_CAMERAS" \
-    --policy.path=$HF_USERNAME/omx_pickandplace_act
-```
-
-For continuous recording with automatic upload (sentry mode):
-
-```bash
-lerobot-rollout \
-    --strategy.type=sentry \
-    --strategy.upload_every_n_episodes=10 \
-    --robot.type=omx_follower \
-    --robot.port=$ROBOT_PORT \
-    --robot.id=omx_follower \
-    --robot.cameras="$ROBOT_CAMERAS" \
-    --policy.path=$HF_USERNAME/omx_pickandplace_act \
-    --dataset.repo_id=$HF_USERNAME/rollout_omx_pickandplace_act
-```
-
-## Environment Reset Utility
-
-These scripts are specific to this particular physical setup. They execute hardcoded sequences of actions on the robot to reset the environment, which is useful for data collection and evaluation. They are not general-purpose scripts.
-
-`reset_environment.py` can be run standalone to prepare the workspace:
-
-```bash
-# Grab cube + place it at a random position on the left side
-python -m examples.omx.reset_environment --port $ROBOT_PORT --mode grab_and_place
-```
-
-It also exposes `grab_cube(robot)` and `place_cube(robot)` for use in custom scripts.
@@ -1,422 +0,0 @@
-#!/usr/bin/env python3
-"""
-Auto-record grab episodes for the OMX robot arm.
-
-Each episode cycle:
-  1. grab_and_place  — grab cube from workspace center and place at a random (pan, reach) position
-  2. HOME            — return arm to home with gripper open
-  3. record_grab     — execute a targeted grab to the stored position while recording
-                       observations + actions to a LeRobotDataset
-
-Usage (run from repo root):
-    python -m examples.omx.record_grab \\
-        --robot.type=omx_follower \\
-        --robot.port=/dev/ttyACM0 \\
-        --robot.id=omx_follower \\
-        --robot.cameras="{ wrist: {type: opencv, index_or_path: 6, width: 640, height: 480, fps: 30, fourcc: MJPG}, top: {type: opencv, index_or_path: 4, width: 640, height: 480, fps: 30, fourcc: MJPG} }" \\
-        --dataset.repo_id=<hf_username>/<dataset_name> \\
-        --dataset.root=data/omx_grab \\
-        --dataset.num_episodes=50 \\
-        --dataset.single_task="Grab the cube" \\
-        --dataset.streaming_encoding=true
-"""
-
-import logging
-from dataclasses import dataclass
-from pprint import pformat
-
-import numpy as np
-
-from lerobot.cameras import CameraConfig  # noqa: F401
-from lerobot.cameras.opencv import OpenCVCameraConfig  # noqa: F401
-from lerobot.configs import parser
-from lerobot.configs.dataset import DatasetRecordConfig
-from lerobot.datasets import (
-    LeRobotDataset,
-    VideoEncodingManager,
-    aggregate_pipeline_dataset_features,
-    create_initial_features,
-)
-from lerobot.processor import make_default_processors
-from lerobot.robots import RobotConfig, make_robot_from_config
-from lerobot.robots.omx_follower import OmxFollower
-from lerobot.utils.constants import ACTION, OBS_STR
-from lerobot.utils.feature_utils import build_dataset_frame, combine_feature_dicts
-from lerobot.utils.robot_utils import precise_sleep
-
-from .reset_environment import (
-    APPROACH_SPEED,
-    GRIPPER_CLOSE_POS,
-    HOME_POSE,
-    PUSH_END_ELBOW_FLEX,
-    PUSH_END_SHOULDER_LIFT,
-    PUSH_START_ELBOW_FLEX,
-    PUSH_START_SHOULDER_LIFT,
-    array_to_pose,
-    grab_cube,
-    horizontal_wrist_flex,
-    move_to_pose,
-    place_cube,
-    pose_to_array,
-)
-
-# ── Grab-episode motion parameters ────────────────────────────────────────────
-
-# Shoulder-lift offset for the raised approach phase (subtracted from the target shoulder_lift, so the arm sits higher).
-GRAB_RAISE_SL_OFFSET = 20.0
-GRAB_LOWER_SPEED = 20.0
-RECORD_SPEED = 30.0
-
-# Pose the arm travels to after closing the gripper (cube held).
-GRAB_CARRY_POSE = {
-    "shoulder_pan.pos": -23.0,
-    "shoulder_lift.pos": 5.0,
-    "elbow_flex.pos": 18.0,
-    "wrist_flex.pos": -14.0,
-    "wrist_roll.pos": 0.0,
-    "gripper.pos": GRIPPER_CLOSE_POS,
-}
-
-# Per-joint jitter limits (degrees) applied to transit waypoints for human-like variation.
-# Cube-approach and carry poses are never jittered to preserve precision.
-_JITTER_LIMITS: dict[str, float] = {
-    "shoulder_pan.pos": 5.0,
-    "shoulder_lift.pos": 4.0,
-    "elbow_flex.pos": 4.0,
-    "wrist_flex.pos": 3.0,
-    "wrist_roll.pos": 2.0,
-    "gripper.pos": 0.0,
-}
-
-
-def _jitter_pose(pose: dict, rng: np.random.Generator) -> dict:
-    """Return a copy of pose with independent per-joint random perturbations."""
-    return {
-        k: v + rng.uniform(-_JITTER_LIMITS.get(k, 0.0), _JITTER_LIMITS.get(k, 0.0)) for k, v in pose.items()
-    }
-
-
-def _random_stuck_pose(rng: np.random.Generator) -> dict:
-    """Return a physically plausible stuck pose (failed grasp), gripper closed.
-
-    ef bounds are piecewise-linear in sl so the arm stays in a reachable,
-    table-safe envelope across the full sl range:
-      sl=-50 → ef ∈ [  0,  50]   (arm raised, can be bent forward)
-      sl=  0 → ef ∈ [-25,  25]   (mid reach)
-      sl= 30 → ef ∈ [-20,   0]   (arm extended, little room to flex)
-    wrist_flex is randomly offset from the horizontal value.
-    """
-    pan = float(rng.uniform(-5.0, 35.0))
-    sl = float(rng.uniform(-50.0, 30.0))
-
-    if sl <= 0.0:
-        alpha = (sl + 50.0) / 50.0  # 0 at sl=-50, 1 at sl=0
-        ef_lo = alpha * -25.0  # 0 → -25
-        ef_hi = 50.0 + alpha * -25.0  # 50 → 25
-    else:
-        alpha = sl / 30.0  # 0 at sl=0, 1 at sl=30
-        ef_lo = -25.0 + alpha * 5.0  # -25 → -20
-        ef_hi = 25.0 + alpha * -25.0  # 25 → 0
-
-    ef = float(rng.uniform(ef_lo, ef_hi))
-    wf = horizontal_wrist_flex(sl, ef) + float(rng.uniform(-15.0, 15.0))
-    return {
-        "shoulder_pan.pos": pan,
-        "shoulder_lift.pos": sl,
-        "elbow_flex.pos": ef,
-        "wrist_flex.pos": wf,
-        "wrist_roll.pos": float(rng.uniform(-15.0, 15.0)),
-        "gripper.pos": GRIPPER_CLOSE_POS,
-    }
-
-
-logger = logging.getLogger(__name__)
-
-
-@dataclass
-class OmxRecordGrabConfig:
-    robot: RobotConfig
-    dataset: DatasetRecordConfig
-    # Resume recording on an existing dataset.
-    resume: bool = False
-    # Probability that an episode starts from a random stuck pose (gripper closed) to
-    # generate recovery data. 0.0 = disabled, 1.0 = all episodes are recovery starts.
-    recovery_prob: float = 0.5
-
-
-def record_episode_spline(
-    robot: OmxFollower,
-    waypoints: list[dict],
-    speeds: list[float],
-    dataset: LeRobotDataset,
-    task: str,
-) -> None:
-    """Execute a Catmull-Rom-style spline through waypoints, recording each frame.
-
-    Segment durations are parameterized from the maximum absolute joint delta
-    between consecutive waypoints divided by the requested segment speed,
-    producing non-uniform timing in joint space. Interior tangents are derived
-    from the adjacent per-segment velocities, with clamped (zero-velocity)
-    endpoints so the arm starts and stops smoothly. Each segment is cubic
-    Hermite, giving C1 continuity at every waypoint.
-    """
-    pts = [pose_to_array(w) for w in waypoints]
-    n = len(pts)
-
-    # Steps and duration per segment
-    n_steps_list = []
-    timestamps = []
-    for i in range(n - 1):
-        max_dist = float(np.max(np.abs(pts[i + 1] - pts[i])))
-        ns = max(1, int(max_dist / speeds[i] * dataset.fps)) if max_dist >= 0.5 else 0
-        n_steps_list.append(ns)
-        timestamps.append(ns / dataset.fps)
-
-    # Velocity tangents (deg/sec) — clamped at endpoints, Catmull-Rom for interior
-    vels = [np.zeros_like(pts[0])]
-    for i in range(1, n - 1):
-        v_prev = (pts[i] - pts[i - 1]) / timestamps[i - 1] if timestamps[i - 1] > 0 else np.zeros_like(pts[0])
-        v_next = (pts[i + 1] - pts[i]) / timestamps[i] if timestamps[i] > 0 else np.zeros_like(pts[0])
-        vels.append(0.5 * (v_prev + v_next))
-    vels.append(np.zeros_like(pts[0]))
-
-    dt = 1.0 / dataset.fps
-    for seg in range(n - 1):
-        ns = n_steps_list[seg]
-        if ns == 0:
-            continue
-        p0, p1 = pts[seg], pts[seg + 1]
-        # Scale velocity (deg/sec) to t-space tangent (deg/t-unit, where t: 0→1 over ns steps)
-        m0 = vels[seg] * timestamps[seg]
-        m1 = vels[seg + 1] * timestamps[seg]
-
-        for step in range(1, ns + 1):
-            t = step / ns
-            h00 = 2 * t**3 - 3 * t**2 + 1
-            h10 = t**3 - 2 * t**2 + t
-            h01 = -2 * t**3 + 3 * t**2
-            h11 = t**3 - t**2
-            commanded = h00 * p0 + h10 * m0 + h01 * p1 + h11 * m1
-
-            action = array_to_pose(commanded)
-            robot.send_action(action)
-            obs = robot.get_observation()
-            obs_frame = build_dataset_frame(dataset.features, obs, prefix=OBS_STR)
-            action_frame = build_dataset_frame(dataset.features, action, prefix=ACTION)
-            dataset.add_frame({**obs_frame, **action_frame, "task": task})
-            precise_sleep(dt)
-
-
-def record_grab_episode(
-    robot: OmxFollower,
-    dataset: LeRobotDataset,
-    pan: float,
-    t: float,
-    task: str,
-    recovery_start: bool = False,
-) -> None:
-    """Execute a targeted grab to the stored (pan, t) position, recording every frame.
-
-    Normal sequence (initial HOME move is NOT recorded):
-      HOME → raised approach above cube → lower → close gripper
-           → raise [jittered] → retract [jittered] → GRAB_CARRY_POSE → drop → HOME
-
-    Recovery sequence (recovery_start=True): arm is moved to a random stuck pose
-    (gripper closed) without recording, then recording begins from there:
-      stuck_pose → raised approach above cube → [normal grab sequence from there]
-
-    All segments are joined by a Catmull-Rom spline (C1-continuous velocities).
-    """
-    sl = PUSH_START_SHOULDER_LIFT + t * (PUSH_END_SHOULDER_LIFT - PUSH_START_SHOULDER_LIFT)
-    ef = PUSH_START_ELBOW_FLEX + t * (PUSH_END_ELBOW_FLEX - PUSH_START_ELBOW_FLEX)
-    sl_raised = sl - GRAB_RAISE_SL_OFFSET
-    wf_horizontal = horizontal_wrist_flex(sl, ef)
-
-    rng = np.random.default_rng()
-
-    if recovery_start:
-        stuck_pose = _random_stuck_pose(rng)
-        logger.info(f"Recovery start: {stuck_pose}")
-        move_to_pose(robot, stuck_pose, APPROACH_SPEED)
-        first_waypoints = [stuck_pose]
-        first_speeds = []
-    else:
-        jittery_start = _jitter_pose(HOME_POSE, rng)
-        move_to_pose(robot, jittery_start, APPROACH_SPEED)
-        first_waypoints = [jittery_start]
-        first_speeds = []
-
-    waypoints = first_waypoints + [
-        {  # raised approach: arm above cube
-            "shoulder_pan.pos": pan,
-            "shoulder_lift.pos": sl_raised,
-            "elbow_flex.pos": ef,
-            "wrist_flex.pos": horizontal_wrist_flex(sl_raised, ef),
-            "wrist_roll.pos": 0.0,
-            "gripper.pos": 60.0,
-        },
-        {  # lower onto cube — no jitter: precision needed
-            "shoulder_pan.pos": pan,
-            "shoulder_lift.pos": sl,
-            "elbow_flex.pos": ef,
-            "wrist_flex.pos": wf_horizontal,
-            "wrist_roll.pos": 0.0,
-            "gripper.pos": 60.0,
-        },
-        {  # close gripper — no jitter: precision needed
-            "shoulder_pan.pos": pan,
-            "shoulder_lift.pos": sl,
-            "elbow_flex.pos": ef,
-            "wrist_flex.pos": wf_horizontal,
-            "wrist_roll.pos": 0.0,
-            "gripper.pos": GRIPPER_CLOSE_POS,
-        },
-        _jitter_pose(
-            {  # raise with cube
-                "shoulder_pan.pos": pan,
-                "shoulder_lift.pos": sl_raised,
-                "elbow_flex.pos": ef,
-                "wrist_flex.pos": horizontal_wrist_flex(sl_raised, ef),
-                "wrist_roll.pos": 0.0,
-                "gripper.pos": GRIPPER_CLOSE_POS,
-            },
-            rng,
-        ),
-        _jitter_pose(
-            {  # retract: fold arm toward HOME before sweeping to carry zone
-                "shoulder_pan.pos": pan * 0.25,
-                "shoulder_lift.pos": HOME_POSE["shoulder_lift.pos"] + 5.0,
-                "elbow_flex.pos": HOME_POSE["elbow_flex.pos"] - 5.0,
-                "wrist_flex.pos": 0.0,
-                "wrist_roll.pos": 0.0,
-                "gripper.pos": GRIPPER_CLOSE_POS,
-            },
-            rng,
-        ),
-        GRAB_CARRY_POSE,  # no jitter: target drop zone
-        {**GRAB_CARRY_POSE, "gripper.pos": 60.0},  # drop cube
-        HOME_POSE,
-    ]
-    speeds = first_speeds + [
-        RECORD_SPEED,  # (HOME →) raised approach
-        GRAB_LOWER_SPEED,  # raised approach → lower
-        GRAB_LOWER_SPEED,  # lower → close gripper
-        RECORD_SPEED,  # close gripper → raise
-        RECORD_SPEED,  # raise → retract
-        RECORD_SPEED,  # retract → carry pose
-        RECORD_SPEED,  # carry pose → drop
-        RECORD_SPEED,  # drop → HOME
-    ]
-
-    record_episode_spline(robot, waypoints, speeds, dataset, task)
-
-    # Dwell at HOME for ~0.5 s before next episode
-    home_action = build_dataset_frame(dataset.features, HOME_POSE, prefix=ACTION)
-    dt = 1.0 / dataset.fps
-    for _ in range(int(dataset.fps * 0.5)):
-        robot.send_action(HOME_POSE)
-        obs = robot.get_observation()
-        obs_frame = build_dataset_frame(dataset.features, obs, prefix=OBS_STR)
-        dataset.add_frame({**obs_frame, **home_action, "task": task})
-        precise_sleep(dt)
-
-
-@parser.wrap()
-def record_grab(cfg: OmxRecordGrabConfig) -> LeRobotDataset:
-    logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")
-    logger.info(pformat(cfg))
-
-    robot = make_robot_from_config(cfg.robot)
-    use_videos = cfg.dataset.video
-
-    teleop_action_processor, _, robot_obs_processor = make_default_processors()
-
-    dataset_features = combine_feature_dicts(
-        aggregate_pipeline_dataset_features(
-            pipeline=teleop_action_processor,
-            initial_features=create_initial_features(action=robot.action_features),
-            use_videos=use_videos,
-        ),
-        aggregate_pipeline_dataset_features(
-            pipeline=robot_obs_processor,
-            initial_features=create_initial_features(observation=robot.observation_features),
-            use_videos=use_videos,
-        ),
-    )
-
-    num_cameras = len(robot.cameras) if hasattr(robot, "cameras") else 0
-    dataset = None
-
-    try:
-        if cfg.resume:
-            dataset = LeRobotDataset.resume(
-                cfg.dataset.repo_id,
-                root=cfg.dataset.root,
-                streaming_encoding=cfg.dataset.streaming_encoding,
-                batch_encoding_size=cfg.dataset.video_encoding_batch_size,
-                vcodec=cfg.dataset.vcodec,
-                encoder_threads=cfg.dataset.encoder_threads,
-                image_writer_processes=cfg.dataset.num_image_writer_processes if num_cameras > 0 else 0,
-                image_writer_threads=cfg.dataset.num_image_writer_threads_per_camera * num_cameras
-                if num_cameras > 0
-                else 0,
-            )
-        else:
-            cfg.dataset.stamp_repo_id()
-            dataset = LeRobotDataset.create(
-                cfg.dataset.repo_id,
-                cfg.dataset.fps,
-                root=cfg.dataset.root,
-                robot_type=robot.name,
-                features=dataset_features,
-                use_videos=use_videos,
-                streaming_encoding=cfg.dataset.streaming_encoding,
-                batch_encoding_size=cfg.dataset.video_encoding_batch_size,
-                vcodec=cfg.dataset.vcodec,
-                encoder_threads=cfg.dataset.encoder_threads,
-                image_writer_processes=cfg.dataset.num_image_writer_processes if num_cameras > 0 else 0,
-                image_writer_threads=cfg.dataset.num_image_writer_threads_per_camera * num_cameras
-                if num_cameras > 0
-                else 0,
-            )
-
-        robot.connect(calibrate=True)
-
-        rng = np.random.default_rng()
-        with VideoEncodingManager(dataset):
-            for episode_idx in range(cfg.dataset.num_episodes):
-                logger.info(f"=== Episode {episode_idx + 1}/{cfg.dataset.num_episodes} ===")
-
-                logger.info("Step 1: grabbing and placing cube...")
-                grab_cube(robot)
-                pan, t = place_cube(robot)
-                logger.info(f"Cube placed at pan={pan:.1f}, reach={t:.2f}")
-
-                recovery_start = cfg.recovery_prob > 0 and float(rng.random()) < cfg.recovery_prob
-                logger.info(f"Step 2: recording {'recovery ' if recovery_start else ''}grab episode...")
-                record_grab_episode(
-                    robot,
-                    dataset,
-                    pan,
-                    t,
-                    cfg.dataset.single_task,
-                    recovery_start=recovery_start,
-                )
-
-                dataset.save_episode()
-                logger.info(f"Episode {episode_idx + 1} saved.")
-
-    finally:
-        if dataset:
-            dataset.finalize()
-        if robot.is_connected:
-            robot.disconnect()
-
-    if cfg.dataset.push_to_hub and dataset and dataset.num_episodes > 0:
-        dataset.push_to_hub(tags=cfg.dataset.tags, private=cfg.dataset.private)
-
-    return dataset
-
-
-if __name__ == "__main__":
-    record_grab()
@@ -1,267 +0,0 @@
-#!/usr/bin/env python3
-"""
-Auto-reset and cube-grab utility for the OMX robot arm.
-
-Provides:
-  - grab_cube(robot): sweep workspace, center cube, close gripper
-  - place_cube(robot): carry cube to a random position, release
-
-Standalone usage (run from repo root):
-    python -m examples.omx.reset_environment --port /dev/ttyACM1 --mode grab
-    python -m examples.omx.reset_environment --port /dev/ttyACM1 --mode grab_and_place
-
-Joint range: -100 to 100 for arm joints; gripper: 50 = closed, 80 = open.
-
-To read current joint values for calibration, add after robot.connect():
-    obs = robot.get_observation()
-    print({k: round(obs[k], 1) for k in JOINT_NAMES})
-    robot.disconnect(); raise SystemExit
-
-Parallel-to-ground IK: wrist_flex = WRIST_HORIZONTAL_OFFSET - shoulder_lift - elbow_flex.
-Linear interpolation preserves this constraint between any two poses that satisfy it.
-"""
-
-import argparse
-import logging
-
-import numpy as np
-
-from lerobot.robots.omx_follower import OmxFollower, OmxFollowerConfig
-from lerobot.robots.robot import Robot
-from lerobot.utils.robot_utils import precise_sleep
-
-logger = logging.getLogger(__name__)
-
-# ── Poses ─────────────────────────────────────────────────────────────────────
-
-HOME_POSE = {
-    "shoulder_pan.pos": 0.0,
-    "shoulder_lift.pos": -50.0,
-    "elbow_flex.pos": 50.0,
-    "wrist_flex.pos": 0.0,
-    "wrist_roll.pos": 0.0,
-    "gripper.pos": 60.0,
-}
-
-SWEEP_WAYPOINTS = [
-    {
-        "shoulder_pan.pos": -60.0,
-        "shoulder_lift.pos": 50.0,
-        "elbow_flex.pos": -60.0,
-        "wrist_flex.pos": -20.0,
-        "wrist_roll.pos": 0.0,
-        "gripper.pos": 60.0,
-    },
-    {
-        "shoulder_pan.pos": -30.0,
-        "shoulder_lift.pos": 50.0,
-        "elbow_flex.pos": -60.0,
-        "wrist_flex.pos": -5.0,
-        "wrist_roll.pos": 0.0,
-        "gripper.pos": 60.0,
-    },
-    {
-        "shoulder_pan.pos": 20.0,
-        "shoulder_lift.pos": 50.0,
-        "elbow_flex.pos": -55.0,
-        "wrist_flex.pos": -5.0,
-        "wrist_roll.pos": 0.0,
-        "gripper.pos": 60.0,
-    },
-]
-
-# ── Motion parameters ─────────────────────────────────────────────────────────
-
-CONTROL_HZ = 30
-APPROACH_SPEED = 50.0
-SWEEP_SPEED = 40.0
-
-# ── Grab-sequence parameters ──────────────────────────────────────────────────
-
-GRAB_PAN = 0.0
-SWEEP_LEFT_PAN = -60.0
-SWEEP_RIGHT_PAN = 60.0
-SWEEP_END_OFFSET = 5.0  # stop before center so the cube isn't pushed past GRAB_PAN
-SWEEP_END_PAN_RANGE = (15.0, 20.0)
-
-SWEEP_LOW_SHOULDER_LIFT = 50.0
-SWEEP_LOW_ELBOW_FLEX_START = -60.0
-SWEEP_LOW_ELBOW_FLEX_END = -55.0
-
-SWEEP_HIGH_WRIST_FLEX = -20.0  # wrist tilted up during high approach to clear obstacles
-
-PUSH_START_SHOULDER_LIFT = 0.0
-PUSH_START_ELBOW_FLEX = 45.0
-PUSH_END_SHOULDER_LIFT = 50.0
-PUSH_END_ELBOW_FLEX = -50.0
-# Subtracted from shoulder_lift during the push sweep to clear the platform surface.
-# Does not affect the grab-target interpolation in record_grab.py.
-PUSH_RAISE_OFFSET = 5.0
-
-WRIST_HORIZONTAL_OFFSET = 0.0  # tune if gripper tilts during push: + tilts nose up, - down
-GRIPPER_CLOSE_POS = 50.0
-
-PLACE_LEFT_PAN_RANGE = (5.0, 30.0)  # random pan range for cube placement on the left side
-PLACE_REACH_RANGE = (0.1, 0.7)  # 0 = arm retracted (PUSH_START), 1 = fully extended (PUSH_END)
-
-JOINT_NAMES = [
-    "shoulder_pan.pos",
-    "shoulder_lift.pos",
-    "elbow_flex.pos",
-    "wrist_flex.pos",
-    "wrist_roll.pos",
-    "gripper.pos",
-]
-
-# ── Helpers ───────────────────────────────────────────────────────────────────
-
-
-def pose_to_array(pose: dict) -> np.ndarray:
-    return np.array([pose[k] for k in JOINT_NAMES])
-
-
-def array_to_pose(arr: np.ndarray) -> dict:
-    return {k: float(arr[i]) for i, k in enumerate(JOINT_NAMES)}
-
-
-def horizontal_wrist_flex(shoulder_lift: float, elbow_flex: float) -> float:
-    return WRIST_HORIZONTAL_OFFSET - shoulder_lift - elbow_flex
-
-
-def _low_sweep_pose(pan: float, elbow_flex: float, wrist_flex: float | None = None) -> dict:
-    sl = SWEEP_LOW_SHOULDER_LIFT
-    return {
-        "shoulder_pan.pos": pan,
-        "shoulder_lift.pos": sl,
-        "elbow_flex.pos": elbow_flex,
-        "wrist_flex.pos": horizontal_wrist_flex(sl, elbow_flex) if wrist_flex is None else wrist_flex,
-        "wrist_roll.pos": 0.0,
-        "gripper.pos": 60.0,
-    }
-
-
-def _high_sweep_pose(pan: float) -> dict:
-    return {**HOME_POSE, "shoulder_pan.pos": pan, "wrist_flex.pos": SWEEP_HIGH_WRIST_FLEX}
-
-
-def _push_pose(shoulder_lift: float, elbow_flex: float, pan: float = GRAB_PAN, gripper: float = 70.0) -> dict:
-    return {
-        "shoulder_pan.pos": pan,
-        "shoulder_lift.pos": shoulder_lift,
-        "elbow_flex.pos": elbow_flex,
-        "wrist_flex.pos": horizontal_wrist_flex(shoulder_lift, elbow_flex),
-        "wrist_roll.pos": 0.0,
-        "gripper.pos": gripper,
-    }
-
-
-def move_to_pose(robot: Robot, target: dict, speed: float) -> None:
-    """Interpolate from current position to target at the given speed (units/s)."""
-    obs = robot.get_observation()
-    current = np.array([obs[k] for k in JOINT_NAMES])
-    goal = pose_to_array(target)
-
-    max_distance = float(np.max(np.abs(goal - current)))
-    if max_distance < 0.5:
-        return
-
-    n_steps = max(1, int(max_distance / speed * CONTROL_HZ))
-    dt = 1.0 / CONTROL_HZ
-    for step in range(1, n_steps + 1):
-        t = step / n_steps
-        robot.send_action(array_to_pose(current + t * (goal - current)))
-        precise_sleep(dt)
-
-
-# ── Sequences ─────────────────────────────────────────────────────────────────
-
-
-def grab_cube(robot: Robot) -> None:
-    """Left sweep → right sweep → extend arm parallel to ground → close gripper."""
-    move_to_pose(robot, HOME_POSE, APPROACH_SPEED)
-
-    for pan, end_pan in [
-        (SWEEP_LEFT_PAN, GRAB_PAN - SWEEP_END_OFFSET),
-        (SWEEP_RIGHT_PAN, GRAB_PAN + SWEEP_END_OFFSET),
-    ]:
-        logger.info(f"Sweeping {'left' if pan < 0 else 'right'} → center...")
-        move_to_pose(robot, _high_sweep_pose(pan), APPROACH_SPEED)
-        move_to_pose(
-            robot, _low_sweep_pose(pan, SWEEP_LOW_ELBOW_FLEX_START, wrist_flex=-20.0), APPROACH_SPEED
-        )
-        move_to_pose(robot, _low_sweep_pose(end_pan, SWEEP_LOW_ELBOW_FLEX_END, wrist_flex=0.0), SWEEP_SPEED)
-        move_to_pose(robot, HOME_POSE, APPROACH_SPEED)
-
-    logger.info("Extending to push cube into gripper...")
-    move_to_pose(
-        robot,
-        _push_pose(PUSH_START_SHOULDER_LIFT - PUSH_RAISE_OFFSET, PUSH_START_ELBOW_FLEX),
-        APPROACH_SPEED,
-    )
-    move_to_pose(
-        robot,
-        _push_pose(PUSH_END_SHOULDER_LIFT - PUSH_RAISE_OFFSET, PUSH_END_ELBOW_FLEX),
-        SWEEP_SPEED,
-    )
-
-    logger.info("Closing gripper...")
-    move_to_pose(
-        robot,
-        _push_pose(PUSH_END_SHOULDER_LIFT, PUSH_END_ELBOW_FLEX, gripper=GRIPPER_CLOSE_POS),
-        APPROACH_SPEED,
-    )
-
-    logger.info("Grab complete.")
-
-
-def place_cube(robot: Robot) -> tuple[float, float]:
-    """Carry the cube (gripper closed) to a random position on the left side, then release.
-
-    Returns:
-        (pan, t): pan angle and reach scalar [0, 1] of the placement position.
-    """
-    pan = float(np.random.uniform(*PLACE_LEFT_PAN_RANGE))
-    t = float(np.random.uniform(*PLACE_REACH_RANGE))
-    sl = PUSH_START_SHOULDER_LIFT + t * (PUSH_END_SHOULDER_LIFT - PUSH_START_SHOULDER_LIFT)
-    ef = PUSH_START_ELBOW_FLEX + t * (PUSH_END_ELBOW_FLEX - PUSH_START_ELBOW_FLEX)
-    logger.info(f"Placing cube at pan={pan:.1f}, reach={t:.2f}...")
-
-    move_to_pose(robot, {**HOME_POSE, "gripper.pos": GRIPPER_CLOSE_POS}, APPROACH_SPEED)
-    move_to_pose(
-        robot, {**HOME_POSE, "shoulder_pan.pos": pan, "gripper.pos": GRIPPER_CLOSE_POS}, APPROACH_SPEED
-    )
-    move_to_pose(robot, _push_pose(sl, ef, pan=pan, gripper=GRIPPER_CLOSE_POS), APPROACH_SPEED)
-    move_to_pose(robot, _push_pose(sl, ef, pan=pan, gripper=80.0), APPROACH_SPEED)
-    move_to_pose(robot, HOME_POSE, APPROACH_SPEED)
-    logger.info("Place complete.")
-    return pan, t
-
-
-# ── Entry point ───────────────────────────────────────────────────────────────
-
-
-def main():
-    parser = argparse.ArgumentParser(description="OMX arm reset / grab script")
-    parser.add_argument("--port", default="/dev/ttyACM1")
-    parser.add_argument("--robot_id", default="omx_follower")
-    parser.add_argument("--mode", choices=["grab", "grab_and_place"], default="grab_and_place")
-    args = parser.parse_args()
-
-    logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")
-
-    robot = OmxFollower(OmxFollowerConfig(port=args.port, id=args.robot_id))
-    robot.connect(calibrate=True)
-
-    try:
-        if args.mode == "grab":
-            grab_cube(robot)
-        elif args.mode == "grab_and_place":
-            grab_cube(robot)
-            place_cube(robot)
-
-    finally:
-        robot.disconnect()
-
-
-if __name__ == "__main__":
-    main()
@@ -14,17 +14,13 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-import logging
-import time
-
 from lerobot.cameras.opencv import OpenCVCameraConfig
-from lerobot.common.control_utils import init_keyboard_listener, predict_action
+from lerobot.common.control_utils import init_keyboard_listener
 from lerobot.configs import FeatureType, PolicyFeature
 from lerobot.datasets import LeRobotDataset, aggregate_pipeline_dataset_features, create_initial_features
 from lerobot.model.kinematics import RobotKinematics
 from lerobot.policies import make_pre_post_processors
 from lerobot.policies.act import ACTPolicy
-from lerobot.policies.utils import make_robot_action
 from lerobot.processor import (
    RobotProcessorPipeline,
    make_default_teleop_action_processor,
@@ -38,12 +34,11 @@ from lerobot.robots.so_follower.robot_kinematic_processor import (
    ForwardKinematicsJointsToEE,
    InverseKinematicsEEToJoints,
 )
+from lerobot.scripts.lerobot_record import record_loop
 from lerobot.types import RobotAction, RobotObservation
-from lerobot.utils.constants import ACTION, OBS_STR
-from lerobot.utils.feature_utils import build_dataset_frame, combine_feature_dicts
-from lerobot.utils.robot_utils import precise_sleep
+from lerobot.utils.feature_utils import combine_feature_dicts
 from lerobot.utils.utils import log_say
-from lerobot.utils.visualization_utils import init_rerun, log_rerun_data
+from lerobot.utils.visualization_utils import init_rerun

 NUM_EPISODES = 5
 FPS = 30
@@ -54,9 +49,6 @@ HF_DATASET_ID = "<hf_username>/<dataset_repo_id>"


 def main():
-    # NOTE: For production policy deployment, use `lerobot-rollout` CLI instead.
-    # This script provides a self-contained example for educational purposes.
-
    # Create the robot configuration & robot
    camera_config = {"front": OpenCVCameraConfig(index_or_path=0, width=640, height=480, fps=FPS)}
    robot_config = SO100FollowerConfig(
@@ -151,67 +143,43 @@ def main():
            raise ValueError("Robot is not connected!")

        print("Starting evaluate loop...")
-        control_interval = 1 / FPS
        episode_idx = 0
        for episode_idx in range(NUM_EPISODES):
            log_say(f"Running inference, recording eval episode {episode_idx + 1} of {NUM_EPISODES}")

-            # Inline evaluation loop: predict actions and send to robot
-            timestamp = 0
-            start_episode_t = time.perf_counter()
-            while timestamp < EPISODE_TIME_SEC:
-                start_loop_t = time.perf_counter()
-
-                if events["exit_early"]:
-                    events["exit_early"] = False
-                    break
-
-                # Get robot observation
-                obs = robot.get_observation()
-                obs_processed = robot_joints_to_ee_pose_processor(obs)
-                observation_frame = build_dataset_frame(dataset.features, obs_processed, prefix=OBS_STR)
-
-                # Predict action using the policy
-                action_tensor = predict_action(
-                    observation=observation_frame,
-                    policy=policy,
-                    device=policy.config.device,
-                    preprocessor=preprocessor,
-                    postprocessor=postprocessor,
-                    use_amp=policy.config.device.type == "cuda",
-                    task=TASK_DESCRIPTION,
-                    robot_type=robot.name,
-                )
-
-                # Convert policy output to robot action dict
-                action_values = make_robot_action(action_tensor, dataset.features)
-
-                # Process and send action to robot (EE -> joints via IK)
-                robot_action_to_send = robot_ee_to_joints_processor((action_values, obs))
-                robot.send_action(robot_action_to_send)
-
-                # Write to dataset
-                action_frame = build_dataset_frame(dataset.features, action_values, prefix=ACTION)
-                frame = {**observation_frame, **action_frame, "task": TASK_DESCRIPTION}
-                dataset.add_frame(frame)
-
-                log_rerun_data(observation=obs_processed, action=action_values)
-
-                dt_s = time.perf_counter() - start_loop_t
-                sleep_time_s = control_interval - dt_s
-                if sleep_time_s < 0:
-                    logging.warning(
-                        f"Evaluate loop is running slower ({1 / dt_s:.1f} Hz) than the target FPS ({FPS} Hz)."
-                    )
-                precise_sleep(max(sleep_time_s, 0.0))
-                timestamp = time.perf_counter() - start_episode_t
+            # Main record loop
+            record_loop(
+                robot=robot,
+                events=events,
+                fps=FPS,
+                policy=policy,
+                preprocessor=preprocessor,  # Pass the pre and post policy processors
+                postprocessor=postprocessor,
+                dataset=dataset,
+                control_time_s=EPISODE_TIME_SEC,
+                single_task=TASK_DESCRIPTION,
+                display_data=True,
+                teleop_action_processor=make_default_teleop_action_processor(),
+                robot_action_processor=robot_ee_to_joints_processor,
+                robot_observation_processor=robot_joints_to_ee_pose_processor,
+            )

            # Reset the environment if not stopping or re-recording
            if not events["stop_recording"] and (
                (episode_idx < NUM_EPISODES - 1) or events["rerecord_episode"]
            ):
                log_say("Reset the environment")
-                log_say("Waiting for environment reset, press right arrow key when ready...")
+                record_loop(
+                    robot=robot,
+                    events=events,
+                    fps=FPS,
+                    control_time_s=EPISODE_TIME_SEC,
+                    single_task=TASK_DESCRIPTION,
+                    display_data=True,
+                    teleop_action_processor=make_default_teleop_action_processor(),
+                    robot_action_processor=robot_ee_to_joints_processor,
+                    robot_observation_processor=robot_joints_to_ee_pose_processor,
+                )

            if events["rerecord_episode"]:
                log_say("Re-record episode")
@@ -222,6 +190,7 @@ def main():

            # Save episode
            dataset.save_episode()
+            episode_idx += 1
    finally:
        # Clean up
        log_say("Stop recording")
@@ -65,15 +65,14 @@ def main():
    robot = SO100Follower(robot_config)
    phone = Phone(teleop_config)

-    # NOTE: It is highly recommended to use the urdf in the SO-ARM100 repo:
-    #   https://github.com/TheRobotStudio/SO-ARM100/blob/main/Simulation/SO101/so101_new_calib.urdf
+    # NOTE: It is highly recommended to use the urdf in the SO-ARM100 repo: https://github.com/TheRobotStudio/SO-ARM100/blob/main/Simulation/SO101/so101_new_calib.urdf
    kinematics_solver = RobotKinematics(
        urdf_path="./SO101/so101_new_calib.urdf",
        target_frame_name="gripper_frame_link",
        joint_names=list(robot.bus.motors.keys()),
    )

-    # Build pipeline to convert phone action to EE action (with gripper velocity mapped to joint).
+    # Build pipeline to convert phone action to EE action
    phone_to_robot_ee_pose_processor = RobotProcessorPipeline[
        tuple[RobotAction, RobotObservation], RobotAction
    ](
@@ -95,7 +94,7 @@ def main():
        to_output=transition_to_robot_action,
    )

-    # Build pipeline to convert EE action to joints action (IK).
+    # Build pipeline to convert EE action to joints action
    robot_ee_to_joints_processor = RobotProcessorPipeline[tuple[RobotAction, RobotObservation], RobotAction](
        steps=[
            InverseKinematicsEEToJoints(
@@ -108,7 +107,7 @@ def main():
        to_output=transition_to_robot_action,
    )

-    # Build pipeline to convert joint observation to EE observation (FK).
+    # Build pipeline to convert joint observation to EE observation
    robot_joints_to_ee_pose = RobotProcessorPipeline[RobotObservation, RobotObservation](
        steps=[
            ForwardKinematicsJointsToEE(
@@ -119,12 +118,13 @@ def main():
        to_output=transition_to_observation,
    )

-    # Create the dataset, deriving features from the pipelines so the on-disk schema
-    # matches exactly what the pipelines produce at runtime.
+    # Create the dataset
    dataset = LeRobotDataset.create(
        repo_id=HF_REPO_ID,
        fps=FPS,
        features=combine_feature_dicts(
+            # Run the feature contract of the pipelines
+            # This tells you what the features will look like after the pipeline steps
            aggregate_pipeline_dataset_features(
                pipeline=phone_to_robot_ee_pose_processor,
                initial_features=create_initial_features(action=phone.action_features),
@@ -163,14 +163,14 @@ def main():
                robot=robot,
                events=events,
                fps=FPS,
-                teleop_action_processor=phone_to_robot_ee_pose_processor,
-                robot_action_processor=robot_ee_to_joints_processor,
-                robot_observation_processor=robot_joints_to_ee_pose,
                teleop=phone,
                dataset=dataset,
                control_time_s=EPISODE_TIME_SEC,
                single_task=TASK_DESCRIPTION,
                display_data=True,
+                teleop_action_processor=phone_to_robot_ee_pose_processor,
+                robot_action_processor=robot_ee_to_joints_processor,
+                robot_observation_processor=robot_joints_to_ee_pose,
            )

            # Reset the environment if not stopping or re-recording
@@ -182,13 +182,13 @@ def main():
                    robot=robot,
                    events=events,
                    fps=FPS,
-                    teleop_action_processor=phone_to_robot_ee_pose_processor,
-                    robot_action_processor=robot_ee_to_joints_processor,
-                    robot_observation_processor=robot_joints_to_ee_pose,
                    teleop=phone,
                    control_time_s=RESET_TIME_SEC,
                    single_task=TASK_DESCRIPTION,
                    display_data=True,
+                    teleop_action_processor=phone_to_robot_ee_pose_processor,
+                    robot_action_processor=robot_ee_to_joints_processor,
+                    robot_observation_processor=robot_joints_to_ee_pose,
                )

            if events["rerecord_episode"]:
@@ -1,126 +0,0 @@
-# !/usr/bin/env python
-
-# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-"""Run a trained EE-space policy on SO100 (phone-trained) without recording.
-
-Mirrors ``examples/so100_to_so100_EE/rollout.py`` — the model was trained
-with phone teleoperation in EE space, so at deployment we only need the
-joint↔EE conversion on the robot side; the phone is not used.
-
-Uses :class:`BaseStrategy` (no recording) + :class:`SyncInferenceConfig`
-(inline policy call).  For recording during rollout, switch to Sentry,
-Highlight, or DAgger via ``lerobot-rollout --strategy.type=...``.
-"""
-
-from lerobot.cameras.opencv import OpenCVCameraConfig
-from lerobot.configs import PreTrainedConfig
-from lerobot.model.kinematics import RobotKinematics
-from lerobot.processor import (
-    RobotProcessorPipeline,
-    observation_to_transition,
-    robot_action_observation_to_transition,
-    transition_to_observation,
-    transition_to_robot_action,
-)
-from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
-from lerobot.robots.so_follower.robot_kinematic_processor import (
-    ForwardKinematicsJointsToEE,
-    InverseKinematicsEEToJoints,
-)
-from lerobot.rollout import BaseStrategyConfig, RolloutConfig, build_rollout_context
-from lerobot.rollout.inference import SyncInferenceConfig
-from lerobot.rollout.strategies import BaseStrategy
-from lerobot.types import RobotAction, RobotObservation
-from lerobot.utils.process import ProcessSignalHandler
-from lerobot.utils.utils import init_logging
-
-FPS = 30
-DURATION_SEC = 60
-TASK_DESCRIPTION = "My task description"
-HF_MODEL_ID = "<hf_username>/<model_repo_id>"
-
-
-def main():
-    init_logging()
-
-    camera_config = {"front": OpenCVCameraConfig(index_or_path=0, width=640, height=480, fps=FPS)}
-    robot_config = SO100FollowerConfig(
-        port="/dev/tty.usbmodem58760434471",
-        id="my_awesome_follower_arm",
-        cameras=camera_config,
-        use_degrees=True,
-    )
-
-    # Peek at motor names once to build the kinematic solver.
-    temp_robot = SO100Follower(robot_config)
-    motor_names = list(temp_robot.bus.motors.keys())
-
-    kinematics_solver = RobotKinematics(
-        urdf_path="./SO101/so101_new_calib.urdf",
-        target_frame_name="gripper_frame_link",
-        joint_names=motor_names,
-    )
-
-    robot_joints_to_ee_pose_processor = RobotProcessorPipeline[RobotObservation, RobotObservation](
-        steps=[ForwardKinematicsJointsToEE(kinematics=kinematics_solver, motor_names=motor_names)],
-        to_transition=observation_to_transition,
-        to_output=transition_to_observation,
-    )
-
-    robot_ee_to_joints_processor = RobotProcessorPipeline[tuple[RobotAction, RobotObservation], RobotAction](
-        steps=[
-            InverseKinematicsEEToJoints(
-                kinematics=kinematics_solver,
-                motor_names=motor_names,
-                initial_guess_current_joints=True,
-            ),
-        ],
-        to_transition=robot_action_observation_to_transition,
-        to_output=transition_to_robot_action,
-    )
-
-    policy_config = PreTrainedConfig.from_pretrained(HF_MODEL_ID)
-    policy_config.pretrained_path = HF_MODEL_ID
-
-    cfg = RolloutConfig(
-        robot=robot_config,
-        policy=policy_config,
-        strategy=BaseStrategyConfig(),
-        inference=SyncInferenceConfig(),
-        fps=FPS,
-        duration=DURATION_SEC,
-        task=TASK_DESCRIPTION,
-    )
-
-    signal_handler = ProcessSignalHandler(use_threads=True)
-
-    ctx = build_rollout_context(
-        cfg,
-        signal_handler.shutdown_event,
-        robot_action_processor=robot_ee_to_joints_processor,
-        robot_observation_processor=robot_joints_to_ee_pose_processor,
-    )
-
-    strategy = BaseStrategy(cfg.strategy)
-    try:
-        strategy.setup(ctx)
-        strategy.run(ctx)
-    finally:
-        strategy.teardown(ctx)
-
-
-if __name__ == "__main__":
-    main()
@@ -0,0 +1,673 @@
+#!/usr/bin/env python
+
+# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""
+Demo script showing how to use Real-Time Chunking (RTC) with action chunking policies on real robots.
+
+This script demonstrates:
+1. Creating a robot and policy (SmolVLA, Pi0, etc.) with RTC
+2. Consuming actions from the policy while the robot executes
+3. Periodically requesting new action chunks in the background using threads
+4. Managing action buffers and timing for real-time operation
+
+For simulation environments, see eval_with_simulation.py
+
+Usage:
+    # Run on a real robot with RTC enabled
+    uv run examples/rtc/eval_with_real_robot.py \
+        --policy.path=<USER>/smolvla_check_rtc_last3 \
+        --policy.device=mps \
+        --rtc.enabled=true \
+        --rtc.execution_horizon=20 \
+        --robot.type=so100_follower \
+        --robot.port=/dev/tty.usbmodem58FA0834591 \
+        --robot.id=so100_follower \
+        --robot.cameras="{ gripper: {type: opencv, index_or_path: 1, width: 640, height: 480, fps: 30}, front: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
+        --task="Move green small object into the purple platform" \
+        --duration=120
+
+    # Run on a real robot with RTC disabled
+    uv run examples/rtc/eval_with_real_robot.py \
+        --policy.path=<USER>/smolvla_check_rtc_last3 \
+        --policy.device=mps \
+        --rtc.enabled=false \
+        --robot.type=so100_follower \
+        --robot.port=/dev/tty.usbmodem58FA0834591 \
+        --robot.id=so100_follower \
+        --robot.cameras="{ gripper: {type: opencv, index_or_path: 1, width: 640, height: 480, fps: 30}, front: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
+        --task="Move green small object into the purple platform" \
+        --duration=120
+
+    # Run on a real robot with RTC and a pi0.5 policy
+    uv run examples/rtc/eval_with_real_robot.py \
+        --policy.path=<USER>/pi05_check_rtc \
+        --policy.device=mps \
+        --rtc.enabled=true \
+        --rtc.execution_horizon=20 \
+        --robot.type=so100_follower \
+        --robot.port=/dev/tty.usbmodem58FA0834591 \
+        --robot.id=so100_follower \
+        --robot.cameras="{ gripper: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}, front: {type: opencv, index_or_path: 1, width: 640, height: 480, fps: 30}}" \
+        --task="Move green small object into the purple platform" \
+        --duration=120
+
+    # Run RTC with bi_openarm_follower (dual-arm OpenArms) and pi0.5 policy
+    python examples/rtc/eval_with_real_robot.py \
+        --policy.path=lerobot-data-collection/folding_final \
+        --robot.type=bi_openarm_follower \
+        --robot.cameras='{left_wrist: {type: opencv, index_or_path: "/dev/video4", width: 1280, height: 720, fps: 30}, base: {type: opencv, index_or_path: "/dev/video2", width: 640, height: 480, fps: 30}, right_wrist: {type: opencv, index_or_path: "/dev/video0", width: 1280, height: 720, fps: 30}}' \
+        --robot.left_arm_config.port=can0 \
+        --robot.left_arm_config.side=left \
+        --robot.left_arm_config.can_interface=socketcan \
+        --robot.left_arm_config.disable_torque_on_disconnect=true \
+        --robot.left_arm_config.max_relative_target=8.0 \
+        --robot.right_arm_config.port=can1 \
+        --robot.right_arm_config.side=right \
+        --robot.right_arm_config.can_interface=socketcan \
+        --robot.right_arm_config.disable_torque_on_disconnect=true \
+        --robot.right_arm_config.max_relative_target=8.0 \
+        --task="Fold the T-shirt properly" \
+        --fps=30 \
+        --duration=2000 \
+        --interpolation_multiplier=3 \
+        --rtc.enabled=true \
+        --rtc.execution_horizon=20 \
+        --rtc.max_guidance_weight=5.0 \
+        --rtc.prefix_attention_schedule=LINEAR \
+        --device=cuda
+"""
+
+import logging
+import math
+import sys
+import time
+import traceback
+from dataclasses import dataclass, field
+from threading import Event, Lock, Thread
+
+import torch
+from torch import Tensor
+
+from lerobot.cameras.opencv import OpenCVCameraConfig  # noqa: F401
+from lerobot.cameras.realsense import RealSenseCameraConfig  # noqa: F401
+from lerobot.cameras.zmq import ZMQCameraConfig  # noqa: F401
+from lerobot.configs import PreTrainedConfig, RTCAttentionSchedule, parser
+from lerobot.policies import get_policy_class, make_pre_post_processors
+from lerobot.policies.rtc import ActionInterpolator, ActionQueue, LatencyTracker, RTCConfig
+from lerobot.processor import (
+    NormalizerProcessorStep,
+    RelativeActionsProcessorStep,
+    TransitionKey,
+    create_transition,
+    make_default_robot_action_processor,
+    make_default_robot_observation_processor,
+    to_relative_actions,
+)
+from lerobot.rl.process import ProcessSignalHandler
+from lerobot.robots import (  # noqa: F401
+    Robot,
+    RobotConfig,
+    bi_openarm_follower,
+    bi_so_follower,
+    koch_follower,
+    so_follower,
+    unitree_g1,
+)
+from lerobot.robots.utils import make_robot_from_config
+from lerobot.utils.constants import OBS_IMAGES, OBS_STATE
+from lerobot.utils.feature_utils import build_dataset_frame, hw_to_dataset_features
+from lerobot.utils.hub import HubMixin
+from lerobot.utils.utils import init_logging
+
+logging.basicConfig(level=logging.INFO)
+logger = logging.getLogger(__name__)
+
+
+class RobotWrapper:
+    def __init__(self, robot: Robot):
+        self.robot = robot
+        self.lock = Lock()
+
+    def get_observation(self) -> dict[str, Tensor]:
+        with self.lock:
+            return self.robot.get_observation()
+
+    def send_action(self, action: Tensor):
+        with self.lock:
+            self.robot.send_action(action)
+
+    def observation_features(self) -> dict[str, type | tuple]:
+        with self.lock:
+            return self.robot.observation_features
+
+    def action_features(self) -> list[str]:
+        with self.lock:
+            return self.robot.action_features
+
+
+@dataclass
+class RTCDemoConfig(HubMixin):
+    """Configuration for RTC demo with action chunking policies and real robots."""
+
+    # Policy configuration
+    policy: PreTrainedConfig | None = None
+
+    # Robot configuration
+    robot: RobotConfig | None = None
+
+    # RTC configuration
+    rtc: RTCConfig = field(
+        default_factory=lambda: RTCConfig(
+            execution_horizon=10,
+            max_guidance_weight=1.0,
+            prefix_attention_schedule=RTCAttentionSchedule.EXP,
+        )
+    )
+
+    # Demo parameters
+    duration: float = 30.0  # Duration to run the demo (seconds)
+    fps: float = 10.0  # Action execution frequency (Hz)
+    interpolation_multiplier: int = 1  # Control rate multiplier (1=off, 2=2x, 3=3x)
+
+    # Compute device
+    device: str | None = None  # Device to run on (cuda, cpu, auto)
+
+    # Queue-size threshold: request a new chunk once this many actions (or fewer) remain queued.
+    # It should be higher than inference delay + execution horizon.
+    action_queue_size_to_get_new_actions: int = 30
+
+    # Task to execute
+    task: str = field(default="", metadata={"help": "Task to execute"})
+
+    # Torch compile configuration
+    use_torch_compile: bool = field(
+        default=False,
+        metadata={"help": "Use torch.compile for faster inference (PyTorch 2.0+)"},
+    )
+
+    torch_compile_backend: str = field(
+        default="inductor",
+        metadata={"help": "Backend for torch.compile (inductor, aot_eager, cudagraphs)"},
+    )
+
+    torch_compile_mode: str = field(
+        default="default",
+        metadata={"help": "Compilation mode (default, reduce-overhead, max-autotune)"},
+    )
+
+    torch_compile_disable_cudagraphs: bool = field(
+        default=True,
+        metadata={
+            "help": "Disable CUDA graphs in torch.compile. Required due to in-place tensor "
+            "operations in denoising loop (x_t += dt * v_t) which cause tensor aliasing issues."
+        },
+    )
+
+    def __post_init__(self):
+        # HACK: We parse the CLI args again here to get the pretrained path, if one was given.
+        policy_path = parser.get_path_arg("policy")
+        if policy_path:
+            cli_overrides = parser.get_cli_overrides("policy")
+            self.policy = PreTrainedConfig.from_pretrained(policy_path, cli_overrides=cli_overrides)
+            self.policy.pretrained_path = policy_path
+        else:
+            raise ValueError("Policy path is required")
+
+        # Validate that robot configuration is provided
+        if self.robot is None:
+            raise ValueError("Robot configuration must be provided")
+
+    @classmethod
+    def __get_path_fields__(cls) -> list[str]:
+        """This enables the parser to load config from the policy using `--policy.path=local/dir`"""
+        return ["policy"]
+
+
+def is_image_key(k: str) -> bool:
+    return k.startswith(OBS_IMAGES)
+
+
+def _reanchor_relative_rtc_prefix(
+    prev_actions_absolute: Tensor,
+    current_state: Tensor,
+    relative_step: RelativeActionsProcessorStep,
+    normalizer_step: NormalizerProcessorStep | None,
+    policy_device: torch.device | str,
+) -> Tensor:
+    """Convert absolute leftovers into model-space for relative-action RTC policies.
+
+    When a policy uses relative actions, the RTC prefix (leftover actions from
+    the previous chunk) is stored in absolute space. Before feeding it back to
+    the policy we need to re-express it relative to the *current* robot state
+    and then re-normalize.
+    """
+    state = current_state.detach().cpu()
+    if state.dim() == 1:
+        state = state.unsqueeze(0)
+
+    action_cpu = prev_actions_absolute.detach().cpu()
+    mask = relative_step._build_mask(action_cpu.shape[-1])
+    relative_actions = to_relative_actions(action_cpu, state, mask)
+
+    transition = create_transition(action=relative_actions)
+    if normalizer_step is not None:
+        transition = normalizer_step(transition)
+
+    return transition[TransitionKey.ACTION].to(policy_device)
+
+
+def get_actions(
+    policy,
+    robot: RobotWrapper,
+    robot_observation_processor,
+    action_queue: ActionQueue,
+    shutdown_event: Event,
+    cfg: RTCDemoConfig,
+):
+    """Thread function to request action chunks from the policy.
+
+    Args:
+        policy: The policy instance (SmolVLA, Pi0, etc.)
+        robot: The robot instance for getting observations
+        robot_observation_processor: Processor for raw robot observations
+        action_queue: Queue to put new action chunks
+        shutdown_event: Event to signal shutdown
+        cfg: Demo configuration
+    """
+    try:
+        logger.info("[GET_ACTIONS] Starting get actions thread")
+
+        latency_tracker = LatencyTracker()  # Track latency of action chunks
+        fps = cfg.fps
+        time_per_chunk = 1.0 / fps
+
+        # Keep only .pos joints + camera streams (the policy is assumed to be trained on positions),
+        # not the full pos/vel/torque state the robot exposes.
+        observation_features_hw = {
+            key: value
+            for key, value in robot.observation_features().items()
+            if key.endswith(".pos") or isinstance(value, tuple)
+        }
+
+        dataset_features = hw_to_dataset_features(observation_features_hw, "observation")
+        policy_device = policy.config.device
+
+        # Load preprocessor and postprocessor from pretrained files
+        # The stats are embedded in the processor .safetensors files
+        logger.info(f"[GET_ACTIONS] Loading preprocessor/postprocessor from {cfg.policy.pretrained_path}")
+
+        preprocessor, postprocessor = make_pre_post_processors(
+            policy_cfg=cfg.policy,
+            pretrained_path=cfg.policy.pretrained_path,
+            dataset_stats=None,  # Will load from pretrained processor files
+            preprocessor_overrides={
+                "device_processor": {"device": cfg.policy.device},
+            },
+        )
+
+        logger.info("[GET_ACTIONS] Preprocessor/postprocessor loaded successfully with embedded stats")
+
+        relative_step = next(
+            (s for s in preprocessor.steps if isinstance(s, RelativeActionsProcessorStep) and s.enabled),
+            None,
+        )
+        normalizer_step = next(
+            (s for s in preprocessor.steps if isinstance(s, NormalizerProcessorStep)),
+            None,
+        )
+        if relative_step is not None:
+            if relative_step.action_names is None:
+                cfg_names = getattr(cfg.policy, "action_feature_names", None)
+                if cfg_names:
+                    relative_step.action_names = list(cfg_names)
+                else:
+                    relative_step.action_names = [
+                        k for k in robot.robot.action_features if k.endswith(".pos")
+                    ]
+            logger.info("[GET_ACTIONS] Relative actions enabled: will re-anchor RTC prefix")
+
+        get_actions_threshold = cfg.action_queue_size_to_get_new_actions
+
+        if not cfg.rtc.enabled:
+            get_actions_threshold = 0
+
+        while not shutdown_event.is_set():
+            if action_queue.qsize() <= get_actions_threshold:
+                current_time = time.perf_counter()
+                action_index_before_inference = action_queue.get_action_index()
+                prev_actions = action_queue.get_left_over()
+
+                inference_latency = latency_tracker.max()
+                inference_delay = math.ceil(inference_latency / time_per_chunk)
+
+                obs = robot.get_observation()
+
+                # Apply robot observation processor
+                obs_processed = robot_observation_processor(obs)
+
+                obs_with_policy_features = build_dataset_frame(
+                    dataset_features, obs_processed, prefix="observation"
+                )
+
+                for name in obs_with_policy_features:
+                    obs_with_policy_features[name] = torch.from_numpy(obs_with_policy_features[name])
+                    if "image" in name:
+                        obs_with_policy_features[name] = (
+                            obs_with_policy_features[name].type(torch.float32) / 255
+                        )
+                        obs_with_policy_features[name] = (
+                            obs_with_policy_features[name].permute(2, 0, 1).contiguous()
+                        )
+                    obs_with_policy_features[name] = obs_with_policy_features[name].unsqueeze(0)
+                    obs_with_policy_features[name] = obs_with_policy_features[name].to(policy_device)
+
+                obs_with_policy_features["task"] = [cfg.task]  # Task should be a list, not a string!
+                obs_with_policy_features["robot_type"] = (
+                    robot.robot.name if hasattr(robot.robot, "name") else ""
+                )
+
+                preprocessed_obs = preprocessor(obs_with_policy_features)
+
+                # Re-anchor leftover actions for relative-action policies.
+                # We need the *postprocessed* (absolute) leftover, not the original
+                # (normalized/relative) one that get_left_over() returns.
+                if (
+                    prev_actions is not None
+                    and relative_step is not None
+                    and OBS_STATE in obs_with_policy_features
+                ):
+                    with action_queue.lock:
+                        if action_queue.queue is not None:
+                            prev_actions_abs = action_queue.queue[action_queue.last_index :].clone()
+                        else:
+                            prev_actions_abs = None
+                    if prev_actions_abs is not None and prev_actions_abs.numel() > 0:
+                        prev_actions = _reanchor_relative_rtc_prefix(
+                            prev_actions_absolute=prev_actions_abs,
+                            current_state=obs_with_policy_features[OBS_STATE],
+                            relative_step=relative_step,
+                            normalizer_step=normalizer_step,
+                            policy_device=policy_device,
+                        )
+
+                # Generate actions WITH RTC
+                actions = policy.predict_action_chunk(
+                    preprocessed_obs,
+                    inference_delay=inference_delay,
+                    prev_chunk_left_over=prev_actions,
+                )
+
+                # Store original actions (before postprocessing) for RTC
+                original_actions = actions.squeeze(0).clone()
+
+                postprocessed_actions = postprocessor(actions)
+
+                postprocessed_actions = postprocessed_actions.squeeze(0)
+
+                new_latency = time.perf_counter() - current_time
+                new_delay = math.ceil(new_latency / time_per_chunk)
+                latency_tracker.add(new_latency)
+
+                if cfg.action_queue_size_to_get_new_actions < cfg.rtc.execution_horizon + new_delay:
+                    logger.warning(
+                        "[GET_ACTIONS] cfg.action_queue_size_to_get_new_actions is too small; it should be at least inference delay + execution horizon."
+                    )
+
+                action_queue.merge(
+                    original_actions, postprocessed_actions, new_delay, action_index_before_inference
+                )
+            else:
+                # Small sleep to prevent busy waiting
+                time.sleep(0.1)
+
+        logger.info("[GET_ACTIONS] get actions thread shutting down")
+    except Exception as e:
+        logger.error(f"[GET_ACTIONS] Fatal exception in get_actions thread: {e}")
+        logger.error(traceback.format_exc())
+        sys.exit(1)
+
+
+def actor_control(
+    robot: RobotWrapper,
+    robot_action_processor,
+    action_queue: ActionQueue,
+    shutdown_event: Event,
+    cfg: RTCDemoConfig,
+):
+    """Thread function to execute actions on the robot.
+
+    Args:
+        robot: The robot instance
+        action_queue: Queue to get actions from
+        shutdown_event: Event to signal shutdown
+        cfg: Demo configuration
+    """
+    try:
+        logger.info("[ACTOR] Starting actor thread")
+
+        action_keys = [k for k in robot.action_features() if k.endswith(".pos")]
+
+        action_count = 0
+        interpolator = ActionInterpolator(multiplier=cfg.interpolation_multiplier)
+        action_interval = interpolator.get_control_interval(cfg.fps)
+
+        while not shutdown_event.is_set():
+            start_time = time.perf_counter()
+
+            if interpolator.needs_new_action():
+                new_action = action_queue.get()
+                if new_action is not None:
+                    interpolator.add(new_action.cpu())
+
+            action = interpolator.get()
+            if action is not None:
+                action = action.cpu()
+                action_dict = {key: action[i].item() for i, key in enumerate(action_keys)}
+                action_processed = robot_action_processor((action_dict, None))
+                robot.send_action(action_processed)
+                action_count += 1
+
+            dt_s = time.perf_counter() - start_time
+            time.sleep(max(0, (action_interval - dt_s) - 0.001))
+
+        logger.info(f"[ACTOR] Actor thread shutting down. Total actions executed: {action_count}")
+    except Exception as e:
+        logger.error(f"[ACTOR] Fatal exception in actor_control thread: {e}")
+        logger.error(traceback.format_exc())
+        sys.exit(1)
+
+
+def _apply_torch_compile(policy, cfg: RTCDemoConfig):
+    """Apply torch.compile to the policy's predict_action_chunk method.
+
+    Args:
+        policy: Policy instance to compile
+        cfg: Configuration containing torch compile settings
+
+    Returns:
+        Policy with compiled predict_action_chunk method
+    """
+
+    # PI models handle their own compilation
+    if policy.type == "pi05" or policy.type == "pi0":
+        return policy
+
+    try:
+        # Check if torch.compile is available (PyTorch 2.0+)
+        if not hasattr(torch, "compile"):
+            logger.warning(
+                "torch.compile is not available. Requires PyTorch 2.0+. "
+                f"Current version: {torch.__version__}. Skipping compilation."
+            )
+            return policy
+
+        logger.info("Applying torch.compile to predict_action_chunk...")
+        logger.info(f"  Backend: {cfg.torch_compile_backend}")
+        logger.info(f"  Mode: {cfg.torch_compile_mode}")
+        logger.info(f"  Disable CUDA graphs: {cfg.torch_compile_disable_cudagraphs}")
+
+        # Compile the predict_action_chunk method
+        # - CUDA graphs disabled to prevent tensor aliasing from in-place ops (x_t += dt * v_t)
+        compile_kwargs = {
+            "backend": cfg.torch_compile_backend,
+            "mode": cfg.torch_compile_mode,
+        }
+
+        # Disable CUDA graphs if requested (prevents tensor aliasing issues)
+        if cfg.torch_compile_disable_cudagraphs:
+            compile_kwargs["options"] = {"triton.cudagraphs": False}
+
+        original_method = policy.predict_action_chunk
+        compiled_method = torch.compile(original_method, **compile_kwargs)
+        policy.predict_action_chunk = compiled_method
+        logger.info("✓ Successfully compiled predict_action_chunk")
+
+    except Exception as e:
+        logger.error(f"Failed to apply torch.compile: {e}")
+        logger.warning("Continuing without torch.compile")
+
+    return policy
+
+
+@parser.wrap()
+def demo_cli(cfg: RTCDemoConfig):
+    """Main entry point for RTC demo with draccus configuration."""
+
+    # Initialize logging
+    init_logging()
+
+    logger.info(f"Using device: {cfg.device}")
+
+    # Setup signal handler for graceful shutdown
+    signal_handler = ProcessSignalHandler(use_threads=True, display_pid=False)
+    shutdown_event = signal_handler.shutdown_event
+
+    policy = None
+    robot = None
+    get_actions_thread = None
+    actor_thread = None
+
+    policy_class = get_policy_class(cfg.policy.type)
+
+    # Load config and set compile_model for pi0/pi05 models
+    config = PreTrainedConfig.from_pretrained(cfg.policy.pretrained_path)
+
+    if cfg.policy.type == "pi05" or cfg.policy.type == "pi0":
+        config.compile_model = cfg.use_torch_compile
+
+    if config.use_peft:
+        from peft import PeftConfig, PeftModel
+
+        peft_pretrained_path = cfg.policy.pretrained_path
+        peft_config = PeftConfig.from_pretrained(peft_pretrained_path)
+
+        policy = policy_class.from_pretrained(
+            pretrained_name_or_path=peft_config.base_model_name_or_path, config=config
+        )
+        policy = PeftModel.from_pretrained(policy, peft_pretrained_path, config=peft_config)
+    else:
+        policy = policy_class.from_pretrained(cfg.policy.pretrained_path, config=config)
+
+    # Turn on RTC
+    policy.config.rtc_config = cfg.rtc
+
+    # Init the RTC processor explicitly: by default, if RTC is disabled in the config,
+    # the processor won't be created.
+    policy.init_rtc_processor()
+
+    assert policy.name in ["smolvla", "pi05", "pi0"], "Only smolvla, pi05, and pi0 are supported for RTC"
+
+    policy = policy.to(cfg.device)
+    policy.eval()
+
+    # Apply torch.compile to predict_action_chunk method if enabled
+    if cfg.use_torch_compile:
+        policy = _apply_torch_compile(policy, cfg)
+
+    # Create robot
+    logger.info(f"Initializing robot: {cfg.robot.type}")
+    robot = make_robot_from_config(cfg.robot)
+    robot.connect()
+    robot_wrapper = RobotWrapper(robot)
+
+    # Create robot observation processor
+    robot_observation_processor = make_default_robot_observation_processor()
+    robot_action_processor = make_default_robot_action_processor()
+
+    # Create action queue for communication between threads
+    action_queue = ActionQueue(cfg.rtc)
+
+    # Start chunk requester thread
+    get_actions_thread = Thread(
+        target=get_actions,
+        args=(policy, robot_wrapper, robot_observation_processor, action_queue, shutdown_event, cfg),
+        daemon=True,
+        name="GetActions",
+    )
+    get_actions_thread.start()
+    logger.info("Started get actions thread")
+
+    # Start action executor thread
+    actor_thread = Thread(
+        target=actor_control,
+        args=(robot_wrapper, robot_action_processor, action_queue, shutdown_event, cfg),
+        daemon=True,
+        name="Actor",
+    )
+    actor_thread.start()
+    logger.info("Started actor thread")
+
+    logger.info("Main thread will stop the demo when the duration elapses")
+
+    # Main thread monitors for duration or shutdown
+    logger.info(f"Running demo for {cfg.duration} seconds...")
+    start_time = time.time()
+
+    while not shutdown_event.is_set() and (time.time() - start_time) < cfg.duration:
+        time.sleep(10)
+
+        # Log queue status periodically
+        if int(time.time() - start_time) % 5 == 0:
+            logger.info(f"[MAIN] Action queue size: {action_queue.qsize()}")
+
+        if time.time() - start_time > cfg.duration:
+            break
+
+    logger.info("Demo duration reached or shutdown requested")
+
+    # Signal shutdown
+    shutdown_event.set()
+
+    # Wait for threads to finish
+    if get_actions_thread and get_actions_thread.is_alive():
+        logger.info("Waiting for chunk requester thread to finish...")
+        get_actions_thread.join()
+
+    if actor_thread and actor_thread.is_alive():
+        logger.info("Waiting for action executor thread to finish...")
+        actor_thread.join()
+
+    # Cleanup robot
+    if robot:
+        robot.disconnect()
+        logger.info("Robot disconnected")
+
+    logger.info("Cleanup completed")
+
+
+if __name__ == "__main__":
+    demo_cli()
+    logging.info("RTC demo finished")
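Note on the latency bookkeeping in `get_actions` above: the loop converts a measured inference latency into a number of control steps and then warns when the refill threshold is smaller than the execution horizon plus that delay. The sketch below restates just that arithmetic in isolation. The helper names `steps_of_delay` and `queue_threshold_is_safe` are hypothetical and are not part of the script; they only mirror the `math.ceil(latency / time_per_chunk)` expression and the warning condition shown in the diff.

```python
import math


def steps_of_delay(latency_s: float, fps: float) -> int:
    """Control steps that elapse while one inference call runs.

    Mirrors `math.ceil(latency / time_per_chunk)` with `time_per_chunk = 1.0 / fps`.
    (Hypothetical helper, for illustration only.)
    """
    time_per_step = 1.0 / fps
    return math.ceil(latency_s / time_per_step)


def queue_threshold_is_safe(threshold: int, execution_horizon: int, delay_steps: int) -> bool:
    """True when the refill threshold covers the execution horizon plus the delay.

    This is the negation of the condition that triggers the warning in the loop.
    (Hypothetical helper, for illustration only.)
    """
    return threshold >= execution_horizon + delay_steps


if __name__ == "__main__":
    delay = steps_of_delay(latency_s=0.05, fps=30)  # 1.5 steps, rounded up
    print("delay steps:", delay)
    print("threshold ok:", queue_threshold_is_safe(10, execution_horizon=8, delay_steps=delay))
```

Rounding up is the conservative choice here: a partially elapsed control step still consumes one queued action, so the delay is never underestimated.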
@@ -14,17 +14,13 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-import logging
-import time
-
 from lerobot.cameras.opencv import OpenCVCameraConfig
-from lerobot.common.control_utils import init_keyboard_listener, predict_action
+from lerobot.common.control_utils import init_keyboard_listener
 from lerobot.configs import FeatureType, PolicyFeature
 from lerobot.datasets import LeRobotDataset, aggregate_pipeline_dataset_features, create_initial_features
 from lerobot.model.kinematics import RobotKinematics
 from lerobot.policies import make_pre_post_processors
 from lerobot.policies.act import ACTPolicy
-from lerobot.policies.utils import make_robot_action
 from lerobot.processor import (
    RobotProcessorPipeline,
    make_default_teleop_action_processor,
@@ -38,12 +34,11 @@ from lerobot.robots.so_follower.robot_kinematic_processor import (
    ForwardKinematicsJointsToEE,
    InverseKinematicsEEToJoints,
 )
+from lerobot.scripts.lerobot_record import record_loop
 from lerobot.types import RobotAction, RobotObservation
-from lerobot.utils.constants import ACTION, OBS_STR
-from lerobot.utils.feature_utils import build_dataset_frame, combine_feature_dicts
-from lerobot.utils.robot_utils import precise_sleep
+from lerobot.utils.feature_utils import combine_feature_dicts
 from lerobot.utils.utils import log_say
-from lerobot.utils.visualization_utils import init_rerun, log_rerun_data
+from lerobot.utils.visualization_utils import init_rerun

 NUM_EPISODES = 5
 FPS = 30
@@ -54,9 +49,6 @@ HF_DATASET_ID = "<hf_username>/<dataset_repo_id>"


 def main():
-    # NOTE: For production policy deployment, use `lerobot-rollout` CLI instead.
-    # This script provides a self-contained example for educational purposes.
-
    # Create the robot configuration & robot
    camera_config = {"front": OpenCVCameraConfig(index_or_path=0, width=640, height=480, fps=FPS)}
    robot_config = SO100FollowerConfig(
@@ -151,67 +143,43 @@ def main():
            raise ValueError("Robot is not connected!")

        print("Starting evaluate loop...")
-        control_interval = 1 / FPS
        episode_idx = 0
        for episode_idx in range(NUM_EPISODES):
            log_say(f"Running inference, recording eval episode {episode_idx + 1} of {NUM_EPISODES}")

-            # Inline evaluation loop: predict actions and send to robot
-            timestamp = 0
-            start_episode_t = time.perf_counter()
-            while timestamp < EPISODE_TIME_SEC:
-                start_loop_t = time.perf_counter()
-
-                if events["exit_early"]:
-                    events["exit_early"] = False
-                    break
-
-                # Get robot observation
-                obs = robot.get_observation()
-                obs_processed = robot_joints_to_ee_pose_processor(obs)
-                observation_frame = build_dataset_frame(dataset.features, obs_processed, prefix=OBS_STR)
-
-                # Predict action using the policy
-                action_tensor = predict_action(
-                    observation=observation_frame,
-                    policy=policy,
-                    device=policy.config.device,
-                    preprocessor=preprocessor,
-                    postprocessor=postprocessor,
-                    use_amp=policy.config.device.type == "cuda",
-                    task=TASK_DESCRIPTION,
-                    robot_type=robot.name,
-                )
-
-                # Convert policy output to robot action dict
-                action_values = make_robot_action(action_tensor, dataset.features)
-
-                # Process and send action to robot (EE -> joints via IK)
-                robot_action_to_send = robot_ee_to_joints_processor((action_values, obs))
-                robot.send_action(robot_action_to_send)
-
-                # Write to dataset
-                action_frame = build_dataset_frame(dataset.features, action_values, prefix=ACTION)
-                frame = {**observation_frame, **action_frame, "task": TASK_DESCRIPTION}
-                dataset.add_frame(frame)
-
-                log_rerun_data(observation=obs_processed, action=action_values)
-
-                dt_s = time.perf_counter() - start_loop_t
-                sleep_time_s = control_interval - dt_s
-                if sleep_time_s < 0:
-                    logging.warning(
-                        f"Evaluate loop is running slower ({1 / dt_s:.1f} Hz) than the target FPS ({FPS} Hz)."
-                    )
-                precise_sleep(max(sleep_time_s, 0.0))
-                timestamp = time.perf_counter() - start_episode_t
+            # Main record loop
+            record_loop(
+                robot=robot,
+                events=events,
+                fps=FPS,
+                policy=policy,
+                preprocessor=preprocessor,  # Pass the pre and post policy processors
+                postprocessor=postprocessor,
+                dataset=dataset,
+                control_time_s=EPISODE_TIME_SEC,
+                single_task=TASK_DESCRIPTION,
+                display_data=True,
+                teleop_action_processor=make_default_teleop_action_processor(),
+                robot_action_processor=robot_ee_to_joints_processor,
+                robot_observation_processor=robot_joints_to_ee_pose_processor,
+            )

            # Reset the environment if not stopping or re-recording
            if not events["stop_recording"] and (
                (episode_idx < NUM_EPISODES - 1) or events["rerecord_episode"]
            ):
                log_say("Reset the environment")
-                log_say("Waiting for environment reset, press right arrow key when ready...")
+                record_loop(
+                    robot=robot,
+                    events=events,
+                    fps=FPS,
+                    control_time_s=EPISODE_TIME_SEC,
+                    single_task=TASK_DESCRIPTION,
+                    display_data=True,
+                    teleop_action_processor=make_default_teleop_action_processor(),
+                    robot_action_processor=robot_ee_to_joints_processor,
+                    robot_observation_processor=robot_joints_to_ee_pose_processor,
+                )

            if events["rerecord_episode"]:
                log_say("Re-record episode")
@@ -222,6 +190,7 @@ def main():

            # Save episode
            dataset.save_episode()
+            episode_idx += 1
    finally:
        # Clean up
        log_say("Stop recording")
@@ -62,20 +62,21 @@ def main():
    follower = SO100Follower(follower_config)
    leader = SO100Leader(leader_config)

-    # NOTE: It is highly recommended to use the urdf in the SO-ARM100 repo:
-    #   https://github.com/TheRobotStudio/SO-ARM100/blob/main/Simulation/SO101/so101_new_calib.urdf
+    # NOTE: It is highly recommended to use the urdf in the SO-ARM100 repo: https://github.com/TheRobotStudio/SO-ARM100/blob/main/Simulation/SO101/so101_new_calib.urdf
    follower_kinematics_solver = RobotKinematics(
        urdf_path="./SO101/so101_new_calib.urdf",
        target_frame_name="gripper_frame_link",
        joint_names=list(follower.bus.motors.keys()),
    )
+
+    # NOTE: It is highly recommended to use the urdf in the SO-ARM100 repo: https://github.com/TheRobotStudio/SO-ARM100/blob/main/Simulation/SO101/so101_new_calib.urdf
    leader_kinematics_solver = RobotKinematics(
        urdf_path="./SO101/so101_new_calib.urdf",
        target_frame_name="gripper_frame_link",
        joint_names=list(leader.bus.motors.keys()),
    )

-    # Build pipeline to convert follower joints to EE observation.
+    # Build pipeline to convert follower joints to EE observation
    follower_joints_to_ee = RobotProcessorPipeline[RobotObservation, RobotObservation](
        steps=[
            ForwardKinematicsJointsToEE(
@@ -86,7 +87,7 @@ def main():
        to_output=transition_to_observation,
    )

-    # Build pipeline to convert leader joints to EE action.
+    # Build pipeline to convert leader joints to EE action
    leader_joints_to_ee = RobotProcessorPipeline[tuple[RobotAction, RobotObservation], RobotAction](
        steps=[
            ForwardKinematicsJointsToEE(
@@ -97,9 +98,9 @@ def main():
        to_output=transition_to_robot_action,
    )

-    # Build pipeline to convert EE action to follower joints (with safety bounds).
+    # Build pipeline to convert EE action to follower joints
    ee_to_follower_joints = RobotProcessorPipeline[tuple[RobotAction, RobotObservation], RobotAction](
-        steps=[
+        [
            EEBoundsAndSafety(
                end_effector_bounds={"min": [-1.0, -1.0, -1.0], "max": [1.0, 1.0, 1.0]},
                max_ee_step_m=0.10,
@@ -114,12 +115,13 @@ def main():
        to_output=transition_to_robot_action,
    )

-    # Create the dataset, deriving features from the pipelines so the on-disk schema
-    # matches exactly what the pipelines produce at runtime.
+    # Create the dataset
    dataset = LeRobotDataset.create(
        repo_id=HF_REPO_ID,
        fps=FPS,
        features=combine_feature_dicts(
+            # Run the feature contract of the pipelines.
+            # This tells you what the features will look like after the pipeline steps.
            aggregate_pipeline_dataset_features(
                pipeline=leader_joints_to_ee,
                initial_features=create_initial_features(action=leader.action_features),
@@ -142,7 +144,7 @@ def main():

    # Initialize the keyboard listener and rerun visualization
    listener, events = init_keyboard_listener()
-    init_rerun(session_name="recording_so100_ee")
+    init_rerun(session_name="recording_phone")

    try:
        if not leader.is_connected or not follower.is_connected:
@@ -158,14 +160,14 @@ def main():
                robot=follower,
                events=events,
                fps=FPS,
-                teleop_action_processor=leader_joints_to_ee,
-                robot_action_processor=ee_to_follower_joints,
-                robot_observation_processor=follower_joints_to_ee,
                teleop=leader,
                dataset=dataset,
                control_time_s=EPISODE_TIME_SEC,
                single_task=TASK_DESCRIPTION,
                display_data=True,
+                teleop_action_processor=leader_joints_to_ee,
+                robot_action_processor=ee_to_follower_joints,
+                robot_observation_processor=follower_joints_to_ee,
            )

            # Reset the environment if not stopping or re-recording
@@ -177,13 +179,13 @@ def main():
                    robot=follower,
                    events=events,
                    fps=FPS,
-                    teleop_action_processor=leader_joints_to_ee,
-                    robot_action_processor=ee_to_follower_joints,
-                    robot_observation_processor=follower_joints_to_ee,
                    teleop=leader,
                    control_time_s=RESET_TIME_SEC,
                    single_task=TASK_DESCRIPTION,
                    display_data=True,
+                    teleop_action_processor=leader_joints_to_ee,
+                    robot_action_processor=ee_to_follower_joints,
+                    robot_observation_processor=follower_joints_to_ee,
                )

            if events["rerecord_episode"]:
@@ -1,134 +0,0 @@
-# !/usr/bin/env python
-
-# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-"""Run a trained EE-space policy on SO100 without recording (base rollout).
-
-Uses the rollout engine's :class:`BaseStrategy` (autonomous execution,
-no dataset) with :class:`SyncInferenceConfig` (inline policy call per
-control tick).  The custom observation/action processors convert between
-joint space (robot hardware) and end-effector space (policy I/O) via
-forward/inverse kinematics.
-"""
-
-from lerobot.cameras.opencv import OpenCVCameraConfig
-from lerobot.configs import PreTrainedConfig
-from lerobot.model.kinematics import RobotKinematics
-from lerobot.processor import (
-    RobotProcessorPipeline,
-    observation_to_transition,
-    robot_action_observation_to_transition,
-    transition_to_observation,
-    transition_to_robot_action,
-)
-from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
-from lerobot.robots.so_follower.robot_kinematic_processor import (
-    ForwardKinematicsJointsToEE,
-    InverseKinematicsEEToJoints,
-)
-from lerobot.rollout import BaseStrategyConfig, RolloutConfig, build_rollout_context
-from lerobot.rollout.inference import SyncInferenceConfig
-from lerobot.rollout.strategies import BaseStrategy
-from lerobot.types import RobotAction, RobotObservation
-from lerobot.utils.process import ProcessSignalHandler
-from lerobot.utils.utils import init_logging
-
-FPS = 30
-DURATION_SEC = 60
-TASK_DESCRIPTION = "My task description"
-HF_MODEL_ID = "<hf_username>/<model_repo_id>"
-
-
-def main():
-    init_logging()
-
-    # Robot configuration — the rollout engine will connect it inside build_rollout_context.
-    camera_config = {"front": OpenCVCameraConfig(index_or_path=0, width=640, height=480, fps=FPS)}
-    robot_config = SO100FollowerConfig(
-        port="/dev/tty.usbmodem5A460814411",
-        id="my_awesome_follower_arm",
-        cameras=camera_config,
-        use_degrees=True,
-    )
-
-    # Kinematic solver: we need the motor-name list, so peek at the robot once.
-    # (The rollout engine owns the connected instance; we only use this for introspection.)
-    temp_robot = SO100Follower(robot_config)
-    motor_names = list(temp_robot.bus.motors.keys())
-
-    # NOTE: It is highly recommended to use the urdf in the SO-ARM100 repo:
-    #   https://github.com/TheRobotStudio/SO-ARM100/blob/main/Simulation/SO101/so101_new_calib.urdf
-    kinematics_solver = RobotKinematics(
-        urdf_path="./SO101/so101_new_calib.urdf",
-        target_frame_name="gripper_frame_link",
-        joint_names=motor_names,
-    )
-
-    # Joint-space observation → EE-space observation (consumed by the policy).
-    robot_joints_to_ee_pose_processor = RobotProcessorPipeline[RobotObservation, RobotObservation](
-        steps=[ForwardKinematicsJointsToEE(kinematics=kinematics_solver, motor_names=motor_names)],
-        to_transition=observation_to_transition,
-        to_output=transition_to_observation,
-    )
-
-    # EE-space action (produced by the policy) → joint-space action (sent to robot).
-    robot_ee_to_joints_processor = RobotProcessorPipeline[tuple[RobotAction, RobotObservation], RobotAction](
-        steps=[
-            InverseKinematicsEEToJoints(
-                kinematics=kinematics_solver,
-                motor_names=motor_names,
-                initial_guess_current_joints=True,
-            ),
-        ],
-        to_transition=robot_action_observation_to_transition,
-        to_output=transition_to_robot_action,
-    )
-
-    # Policy config (full model is loaded inside build_rollout_context).
-    policy_config = PreTrainedConfig.from_pretrained(HF_MODEL_ID)
-    policy_config.pretrained_path = HF_MODEL_ID
-
-    cfg = RolloutConfig(
-        robot=robot_config,
-        policy=policy_config,
-        strategy=BaseStrategyConfig(),
-        inference=SyncInferenceConfig(),
-        fps=FPS,
-        duration=DURATION_SEC,
-        task=TASK_DESCRIPTION,
-    )
-
-    signal_handler = ProcessSignalHandler(use_threads=True)
-
-    # Pass the EE kinematic processors via kwargs; the defaults (identity) would
-    # otherwise skip the joint↔EE conversion and the policy would receive the
-    # wrong observation/action space.
-    ctx = build_rollout_context(
-        cfg,
-        signal_handler.shutdown_event,
-        robot_action_processor=robot_ee_to_joints_processor,
-        robot_observation_processor=robot_joints_to_ee_pose_processor,
-    )
-
-    strategy = BaseStrategy(cfg.strategy)
-    try:
-        strategy.setup(ctx)
-        strategy.run(ctx)
-    finally:
-        strategy.teardown(ctx)
-
-
-if __name__ == "__main__":
-    main()
@@ -10,7 +10,7 @@ from lerobot.datasets import LeRobotDataset
 from lerobot.envs.configs import HILSerlProcessorConfig, HILSerlRobotEnvConfig
 from lerobot.policies import SACConfig
 from lerobot.policies.sac.modeling_sac import SACPolicy
-from lerobot.rewards.classifier.modeling_classifier import Classifier
+from lerobot.policies.sac.reward_model.modeling_classifier import Classifier
 from lerobot.rl.buffer import ReplayBuffer
 from lerobot.rl.gym_manipulator import make_robot_env
 from lerobot.robots.so_follower import SO100FollowerConfig
@@ -1,7 +1,7 @@
 import torch

 from lerobot.datasets import LeRobotDataset
-from lerobot.rewards import RewardClassifierConfig, make_reward_model, make_reward_pre_post_processors
+from lerobot.policies import RewardClassifierConfig, make_policy, make_pre_post_processors


 def main():
@@ -22,10 +22,10 @@ def main():
        model_name="microsoft/resnet-18",
    )

-    # Make reward model, preprocessor, and optimizer
-    reward_model = make_reward_model(config, dataset_stats=dataset.meta.stats)
-    optimizer = config.get_optimizer_preset().build(reward_model.parameters())
-    preprocessor, _ = make_reward_pre_post_processors(config, dataset_stats=dataset.meta.stats)
+    # Make policy, preprocessor, and optimizer
+    policy = make_policy(config, ds_meta=dataset.meta)
+    optimizer = config.get_optimizer_preset().build(policy.parameters())
+    preprocessor, _ = make_pre_post_processors(policy_cfg=config, dataset_stats=dataset.meta.stats)

    classifier_id = "<user>/reward_classifier_hil_serl_example"

@@ -42,7 +42,7 @@ def main():
            batch = preprocessor(batch)

            # Forward pass
-            loss, output_dict = reward_model.forward(batch)
+            loss, output_dict = policy.forward(batch)

            # Backward pass and optimization
            optimizer.zero_grad()
@@ -58,8 +58,8 @@ def main():

    print("Training finished!")

-    # You can now save the trained reward model.
-    reward_model.push_to_hub(classifier_id)
+    # You can now save the trained policy.
+    policy.push_to_hub(classifier_id)


 if __name__ == "__main__":
@@ -59,8 +59,8 @@ keywords = ["lerobot", "huggingface", "robotics",  "machine learning", "artifici

 dependencies = [
    # Core ML
-    "torch>=2.7,<2.12.0",
-    "torchvision>=0.22.0,<0.27.0",
+    "torch>=2.7,<2.11.0",
+    "torchvision>=0.22.0,<0.26.0",
    "numpy>=2.0.0,<2.3.0", # NOTE: Explicitly listing numpy helps the resolver converge faster. Upper bound imposed by opencv-python-headless.
    "opencv-python-headless>=4.9.0,<4.14.0",
    "Pillow>=10.0.0,<13.0.0",
@@ -99,7 +99,7 @@ dataset = [
    "pandas>=2.0.0,<3.0.0", # NOTE: Transitive dependency of datasets
    "pyarrow>=21.0.0,<30.0.0", # NOTE: Transitive dependency of datasets
    "lerobot[av-dep]",
-    "torchcodec>=0.3.0,<0.12.0; sys_platform != 'win32' and (sys_platform != 'linux' or (platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l')) and (sys_platform != 'darwin' or platform_machine != 'x86_64')", # NOTE: Windows support starts at version 0.7 (needs torch==2.8), ffmpeg>=8 support starts at version 0.8.1 (needs torch==2.9), system-wide ffmpeg support starts at version 0.10 (needs torch==2.10), 0.11 needs torch==2.11, 0.12 needs torch==2.12.
+    "torchcodec>=0.3.0,<0.11.0; sys_platform != 'win32' and (sys_platform != 'linux' or (platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l')) and (sys_platform != 'darwin' or platform_machine != 'x86_64')", # NOTE: Windows support starts at version 0.7 (needs torch==2.8), ffmpeg>=8 support starts at version 0.8.1 (needs torch==2.9), system-wide ffmpeg support starts at version 0.10 (needs torch==2.10).
    "jsonlines>=4.0.0,<5.0.0",
 ]
 training = [
@@ -108,9 +108,9 @@ training = [
    "wandb>=0.24.0,<0.25.0",
 ]
 hardware = [
-    "lerobot[pynput-dep]",
-    "lerobot[pyserial-dep]",
-    "lerobot[deepdiff-dep]",
+    "pynput>=1.7.8,<1.9.0",
+    "pyserial>=3.5,<4.0",
+    "deepdiff>=7.0.1,<9.0.0",
 ]
 viz = [
    "rerun-sdk>=0.24.0,<0.27.0",
@@ -128,7 +128,7 @@ dataset_viz = ["lerobot[dataset]", "lerobot[viz]"]
 av-dep = ["av>=15.0.0,<16.0.0"]
 pygame-dep = ["pygame>=2.5.1,<2.7.0"]
 placo-dep = ["placo>=0.9.6,<0.9.17"]
-transformers-dep = ["transformers>=5.4.0,<5.6.0"]
+transformers-dep = ["transformers==5.3.0"] # TODO(Steven): https://github.com/huggingface/lerobot/pull/3249
 grpcio-dep = ["grpcio==1.73.1", "protobuf>=6.31.1,<6.32.0"]
 can-dep = ["python-can>=4.2.0,<5.0.0"]
 peft-dep = ["peft>=0.18.0,<1.0.0"]
@@ -136,14 +136,10 @@ scipy-dep = ["scipy>=1.14.0,<2.0.0"]
 diffusers-dep = ["diffusers>=0.27.2,<0.36.0"]
 qwen-vl-utils-dep = ["qwen-vl-utils>=0.0.11,<0.1.0"]
 matplotlib-dep = ["matplotlib>=3.10.3,<4.0.0", "contourpy>=1.3.0,<2.0.0"] # NOTE: Explicitly listing contourpy helps the resolver converge faster.
-pyserial-dep = ["pyserial>=3.5,<4.0"]
-deepdiff-dep = ["deepdiff>=7.0.1,<9.0.0"]
-pynput-dep = ["pynput>=1.7.8,<1.9.0"]
-pyzmq-dep = ["pyzmq>=26.2.1,<28.0.0"]

 # Motors
-feetech = ["feetech-servo-sdk>=1.0.0,<2.0.0", "lerobot[pyserial-dep]", "lerobot[deepdiff-dep]"]
-dynamixel = ["dynamixel-sdk>=3.7.31,<3.9.0", "lerobot[pyserial-dep]", "lerobot[deepdiff-dep]"]
+feetech = ["feetech-servo-sdk>=1.0.0,<2.0.0"]
+dynamixel = ["dynamixel-sdk>=3.7.31,<3.9.0"]
 damiao = ["lerobot[can-dep]"]
 robstride = ["lerobot[can-dep]"]

@@ -151,11 +147,10 @@ robstride = ["lerobot[can-dep]"]
 openarms = ["lerobot[damiao]"]
 gamepad = ["lerobot[pygame-dep]", "hidapi>=0.14.0,<0.15.0"]
 hopejr = ["lerobot[feetech]", "lerobot[pygame-dep]"]
-lekiwi = ["lerobot[feetech]", "lerobot[pyzmq-dep]"]
+lekiwi = ["lerobot[feetech]", "pyzmq>=26.2.1,<28.0.0"]
 unitree_g1 = [
    # "unitree-sdk2==1.0.1",
-    "lerobot[pyzmq-dep]",
-    "lerobot[pyserial-dep]",
+    "pyzmq>=26.2.1,<28.0.0",
    "onnxruntime>=1.16.0,<2.0.0",
    "onnx>=1.16.0,<2.0.0",
    "meshcat>=0.3.0,<0.4.0",
@@ -194,7 +189,6 @@ groot = [
 ]
 sarm = ["lerobot[transformers-dep]", "pydantic>=2.0.0,<3.0.0", "faker>=33.0.0,<35.0.0", "lerobot[matplotlib-dep]", "lerobot[qwen-vl-utils-dep]"]
 xvla = ["lerobot[transformers-dep]"]
-eo1 = ["lerobot[transformers-dep]", "lerobot[qwen-vl-utils-dep]"]
 hilserl = ["lerobot[transformers-dep]", "gym-hil>=0.1.13,<0.2.0", "lerobot[grpcio-dep]", "lerobot[placo-dep]"]

 # Features
@@ -202,8 +196,7 @@ async = ["lerobot[grpcio-dep]", "lerobot[matplotlib-dep]"]
 peft = ["lerobot[transformers-dep]", "lerobot[peft-dep]"]

 # Development
-dev = ["pre-commit>=3.7.0,<5.0.0", "debugpy>=1.8.1,<1.9.0", "lerobot[grpcio-dep]", "grpcio-tools==1.73.1", "mypy>=1.19.1", "ruff>=0.14.1", "lerobot[notebook]"]
-notebook = ["jupyter>=1.0.0,<2.0.0", "ipykernel>=6.0.0,<7.0.0"]
+dev = ["pre-commit>=3.7.0,<5.0.0", "debugpy>=1.8.1,<1.9.0", "lerobot[grpcio-dep]", "grpcio-tools==1.73.1", "mypy>=1.19.1", "ruff>=0.14.1"]
 test = ["pytest>=8.1.0,<9.0.0", "pytest-timeout>=2.4.0,<3.0.0", "pytest-cov>=5.0.0,<8.0.0", "mock-serial>=0.0.1,<0.1.0 ; sys_platform != 'win32'"]
 video_benchmark = ["scikit-image>=0.23.2,<0.26.0", "pandas>=2.2.2,<2.4.0"]

@@ -213,20 +206,6 @@ aloha = ["lerobot[dataset]", "gym-aloha>=0.1.2,<0.2.0", "lerobot[scipy-dep]"]
 pusht = ["lerobot[dataset]", "gym-pusht>=0.1.5,<0.2.0", "pymunk>=6.6.0,<7.0.0"] # TODO: Fix pymunk version in gym-pusht instead
 libero = ["lerobot[dataset]", "lerobot[transformers-dep]", "hf-libero>=0.1.3,<0.2.0; sys_platform == 'linux'", "lerobot[scipy-dep]"]
 metaworld = ["lerobot[dataset]", "metaworld==3.0.0", "lerobot[scipy-dep]"]
-# NOTE: vlabench is NOT exposed as a `lerobot` extra. Its only distribution
-# is the OpenMOSS/VLABench GitHub repo (package name `VLABench`, no PyPI
-# release), so any `vlabench>=X` pip spec is unresolvable. Install it
-# manually alongside MuJoCo / dm-control — see docs/source/vlabench.mdx
-# for the recipe.
-# NOTE: robomme is NOT a pyproject extra — mani-skill hard-pins numpy<2
-# which conflicts with lerobot's numpy>=2 base pin, so the two trees can't
-# resolve into a single env. Install it only in the RoboMME Docker image
-# via `uv pip install --override` (see docker/Dockerfile.benchmark.robomme).
-# NOTE: robocasa is NOT exposed as a `lerobot` extra. Its setup.py pins
-# `lerobot==0.3.3` in install_requires, which cyclically shadows our own
-# workspace `lerobot` and makes the graph unsolvable under any resolver
-# (uv, pip). Install it manually alongside robosuite — see
-# docs/source/robocasa.mdx for the recipe.

 # All
 all = [
@@ -290,23 +269,8 @@ lerobot-find-joint-limits="lerobot.scripts.lerobot_find_joint_limits:main"
 lerobot-imgtransform-viz="lerobot.scripts.lerobot_imgtransform_viz:main"
 lerobot-edit-dataset="lerobot.scripts.lerobot_edit_dataset:main"
 lerobot-setup-can="lerobot.scripts.lerobot_setup_can:main"
-lerobot-rollout="lerobot.scripts.lerobot_rollout:main"

 # ---------------- Tool Configurations ----------------
-
-# cu128 wheels keep broad hardware reach; the driver floor is 570.86.
-# To use a different CUDA variant, reinstall torch with an explicit index, e.g.:
-#   uv pip install --force-reinstall torch torchvision \
-#       --index-url https://download.pytorch.org/whl/cu130
-[[tool.uv.index]]
-name = "pytorch-cu128"
-url = "https://download.pytorch.org/whl/cu128"
-explicit = true
-
-[tool.uv.sources]
-torch = [{ index = "pytorch-cu128", marker = "sys_platform == 'linux'" }]
-torchvision = [{ index = "pytorch-cu128", marker = "sys_platform == 'linux'" }]
-
 [tool.setuptools.package-data]
 lerobot = ["envs/*.json"]

@@ -1,207 +0,0 @@
-#!/usr/bin/env python3
-# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-"""Extract natural-language task descriptions for a benchmark suite.
-
-Runs inside the benchmark Docker container (where the env library is installed)
-immediately after lerobot-eval, writing a JSON file that parse_eval_metrics.py
-picks up and embeds in metrics.json.
-
-Output format: {"<suite>_<task_idx>": "<nl instruction>", ...}
-
-Usage:
-    python scripts/ci/extract_task_descriptions.py \\
-        --env libero --task libero_spatial \\
-        --output /tmp/eval-artifacts/task_descriptions.json
-"""
-
-from __future__ import annotations
-
-import argparse
-import json
-import re
-import sys
-from pathlib import Path
-
-# LIBERO-plus derives task.language by space-joining the perturbation-variant
-# filename (grab_language_from_filename in libero/libero/benchmark/__init__.py),
-# so non-_language_ variants inherit a trailing metadata blob like
-# "view 0 0 100 0 0 initstate 0 noise 45" or "add 16". Strip those tokens so
-# the description matches the base instruction used in the training dataset.
-_LIBERO_PERTURBATION_TAIL_RE = re.compile(
-    r"(?:\s(?:view|initstate|noise|add|tb|table|light|level)(?:\s\d+)+)+$"
-)
-
-
-def _strip_libero_perturbation_tail(instruction: str) -> str:
-    return _LIBERO_PERTURBATION_TAIL_RE.sub("", instruction).strip()
-
-
-def _libero_descriptions(task_suite: str) -> dict[str, str]:
-    from libero.libero import benchmark  # type: ignore[import-untyped]
-
-    suite_dict = benchmark.get_benchmark_dict()
-    if task_suite not in suite_dict:
-        print(
-            f"[extract_task_descriptions] Unknown LIBERO suite '{task_suite}'. "
-            f"Available: {list(suite_dict.keys())}",
-            file=sys.stderr,
-        )
-        return {}
-    suite = suite_dict[task_suite]()
-    return {
-        f"{task_suite}_{i}": _strip_libero_perturbation_tail(suite.get_task(i).language)
-        for i in range(suite.n_tasks)
-    }
-
-
-def _metaworld_descriptions(task_name: str) -> dict[str, str]:
-    # MetaWorld tasks don't expose a separate NL description attribute;
-    # use a cleaned version of the task name as the description.
-    label = task_name.removeprefix("metaworld-").replace("-", " ").strip()
-    return {f"{task_name}_0": label}
-
-
-def _robotwin_descriptions(task_names: str) -> dict[str, str]:
-    """Return descriptions for each requested RoboTwin task. Reads
-    `description/task_instruction/<task>.json` from the RoboTwin clone
-    (cwd is /opt/robotwin in CI). Falls back to the task name if missing."""
-    out: dict[str, str] = {}
-    root = Path("description/task_instruction")
-    for name in (t.strip() for t in task_names.split(",") if t.strip()):
-        desc_file = root / f"{name}.json"
-        desc = name.replace("_", " ")
-        if desc_file.is_file():
-            data = json.loads(desc_file.read_text())
-            full = data.get("full_description") or desc
-            # Strip the schema placeholders ({A}, {a}) — keep the sentence readable.
-            desc = full.replace("<", "").replace(">", "")
-        out[f"{name}_0"] = desc
-    return out
-
-
-def _robocasa_descriptions(task_spec: str) -> dict[str, str]:
-    """For each task in the comma-separated list, emit a cleaned-name label.
-
-    RoboCasa episodes carry their language instruction in the env's
-    `ep_meta['lang']`, populated per reset. Pulling it requires spinning
-    up the full kitchen env per task (~seconds each); we use the task
-    name as the key here and let the eval's episode info carry the
-    actual instruction.
-    """
-    out: dict[str, str] = {}
-    for task in (t.strip() for t in task_spec.split(",") if t.strip()):
-        # Split CamelCase into words: "CloseFridge" → "close fridge".
-        label = "".join(f" {c.lower()}" if c.isupper() else c for c in task).strip()
-        out[f"{task}_0"] = label or task
-    return out
-
-
-_ROBOMME_DESCRIPTIONS = {
-    "BinFill": "Fill the target bin with the correct number of cubes",
-    "PickXtimes": "Pick the indicated cube the specified number of times",
-    "SwingXtimes": "Swing the object the specified number of times",
-    "StopCube": "Grasp and stop the moving cube",
-    "VideoUnmask": "Pick the cube shown in the reference video",
-    "VideoUnmaskSwap": "Pick the cube matching the reference video after a swap",
-    "ButtonUnmask": "Press the button indicated by the reference",
-    "ButtonUnmaskSwap": "Press the correct button after objects are swapped",
-    "PickHighlight": "Pick the highlighted cube",
-    "VideoRepick": "Repick the cube shown in the reference video",
-    "VideoPlaceButton": "Place the cube on the button shown in the video",
-    "VideoPlaceOrder": "Place cubes in the order shown in the video",
-    "MoveCube": "Move the cube to the target location",
-    "InsertPeg": "Insert the peg into the target hole",
-    "PatternLock": "Unlock the pattern by pressing buttons in sequence",
-    "RouteStick": "Route the stick through the required waypoints",
-}
-
-
-def _robomme_descriptions(task_names: str, task_ids: list[int] | None = None) -> dict[str, str]:
-    """Return descriptions for each requested RoboMME task. Keys match the
-    video filename pattern `<task>_<task_id>` used by the eval script."""
-    if task_ids is None:
-        task_ids = [0]
-    out: dict[str, str] = {}
-    for name in (t.strip() for t in task_names.split(",") if t.strip()):
-        desc = _ROBOMME_DESCRIPTIONS.get(name, name)
-        for tid in task_ids:
-            out[f"{name}_{tid}"] = desc
-    return out
-
-
-def _vlabench_descriptions(task_spec: str) -> dict[str, str]:
-    """For each task in the comma-separated list, emit a cleaned-name label.
-
-    VLABench tasks carry language instructions on their dm_control task
-    object, but pulling them requires loading the full env per task
-    (~seconds each). The CI smoke-eval already captures the instruction
-    inside its episode info; this mapping is just enough to key
-    `metrics.json` by `<task>_0`.
-    """
-    out: dict[str, str] = {}
-    for task in (t.strip() for t in task_spec.split(",") if t.strip()):
-        out[f"{task}_0"] = task.replace("_", " ").strip()
-    return out
-
-
-def main() -> int:
-    parser = argparse.ArgumentParser(description=__doc__)
-    parser.add_argument("--env", required=True, help="Environment family (libero, metaworld, ...)")
-    parser.add_argument("--task", required=True, help="Task/suite name (e.g. libero_spatial)")
-    parser.add_argument(
-        "--task-ids",
-        type=str,
-        default=None,
-        help="Comma-separated task IDs (e.g. '0,1,2'). Default: [0]",
-    )
-    parser.add_argument("--output", required=True, help="Path to write task_descriptions.json")
-    args = parser.parse_args()
-
-    task_ids: list[int] | None = None
-    if args.task_ids:
-        task_ids = [int(x.strip()) for x in args.task_ids.split(",")]
-
-    descriptions: dict[str, str] = {}
-    try:
-        if args.env in ("libero", "libero_plus"):
-            descriptions = _libero_descriptions(args.task)
-        elif args.env == "metaworld":
-            descriptions = _metaworld_descriptions(args.task)
-        elif args.env == "robotwin":
-            descriptions = _robotwin_descriptions(args.task)
-        elif args.env == "robocasa":
-            descriptions = _robocasa_descriptions(args.task)
-        elif args.env == "robomme":
-            descriptions = _robomme_descriptions(args.task, task_ids=task_ids)
-        elif args.env == "vlabench":
-            descriptions = _vlabench_descriptions(args.task)
-        else:
-            print(
-                f"[extract_task_descriptions] No description extractor for env '{args.env}'.",
-                file=sys.stderr,
-            )
-    except Exception as exc:
-        print(f"[extract_task_descriptions] Warning: {exc}", file=sys.stderr)
-
-    out_path = Path(args.output)
-    out_path.parent.mkdir(parents=True, exist_ok=True)
-    out_path.write_text(json.dumps(descriptions, indent=2))
-    print(f"[extract_task_descriptions] {len(descriptions)} descriptions → {out_path}")
-    return 0
-
-
-if __name__ == "__main__":
-    sys.exit(main())
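
Review note: for readers who need the behaviour of the removed `_strip_libero_perturbation_tail` helper elsewhere, a minimal standalone sketch follows. The regex is copied from the deleted script. The function name `strip_perturbation_tail` and the sample instructions are illustrative assumptions, not taken from LIBERO-plus data.

```python
import re

# Pattern copied from the removed script: one or more trailing groups of
# "<keyword> <int> [<int> ...]" anchored at the end of the string.
PERTURBATION_TAIL_RE = re.compile(
    r"(?:\s(?:view|initstate|noise|add|tb|table|light|level)(?:\s\d+)+)+$"
)


def strip_perturbation_tail(instruction: str) -> str:
    """Drop the trailing perturbation metadata, keeping the base instruction."""
    return PERTURBATION_TAIL_RE.sub("", instruction).strip()


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    samples = [
        "pick up the black bowl view 0 0 100 0 0 initstate 0 noise 45",
        "put the bowl on the plate add 16",
        "put the mug on the table",  # keyword without digits: left untouched
    ]
    for text in samples:
        print(repr(strip_perturbation_tail(text)))
```

A keyword is only stripped when it is followed by at least one integer, so an instruction that merely ends in "table" or "light" is preserved.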
@@ -1,147 +0,0 @@
-#!/usr/bin/env python3
-# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-"""Parse lerobot-eval output into a small metrics.json artifact.
-
-Reads eval_info.json written by lerobot-eval --output_dir and extracts the
-key metrics needed by the health dashboard. Handles both single-task and
-multi-task eval output formats.
-
-NOTE: This script runs on the bare CI runner (not inside Docker), so it
-must use only Python stdlib modules. Do not add third-party imports.
-
-Usage:
-    python scripts/ci/parse_eval_metrics.py \\
-        --artifacts-dir /tmp/libero-artifacts \\
-        --env libero \\
-        --task libero_spatial \\
-        --policy pepijn223/smolvla_libero
-
-Writes <artifacts-dir>/metrics.json. The CI workflow then uploads this file
-as a GitHub Actions artifact named "<env>-metrics".
-"""
-
-from __future__ import annotations
-
-import argparse
-import json
-import math
-import sys
-from pathlib import Path
-
-
-def _safe_float(v: float | int | None) -> float | None:
-    if v is None:
-        return None
-    f = float(v)
-    return None if math.isnan(f) else f
-
-
-def _safe_int(v: float | int | None) -> int | None:
-    if v is None:
-        return None
-    f = float(v)
-    return None if math.isnan(f) else int(f)
-
-
-def _extract_metrics(info: dict) -> tuple[float | None, int | None, float | None, float | None]:
-    """Extract (pc_success, n_episodes, avg_sum_reward, eval_s) from eval_info.json.
-
-    Handles two output shapes:
-      - Single-task: {"aggregated": {"pc_success": 80.0, ...}}
-      - Multi-task:  {"overall": {"pc_success": 80.0, "n_episodes": 5, ...}}
-    """
-    for key in ("aggregated", "overall"):
-        if key not in info:
-            continue
-        agg = info[key]
-        pc = agg.get("pc_success")
-        n = agg.get("n_episodes")
-        reward = agg.get("avg_sum_reward")
-        eval_s = agg.get("eval_s")
-
-        if pc is not None and not math.isnan(pc):
-            return (
-                float(pc),
-                _safe_int(n),
-                _safe_float(reward),
-                _safe_float(eval_s),
-            )
-
-    return None, None, None, None
-
-
-def main() -> int:
-    parser = argparse.ArgumentParser(
-        description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter
-    )
-    parser.add_argument("--artifacts-dir", required=True, help="Path to the mounted artifacts volume")
-    parser.add_argument("--env", required=True, help="Environment name (e.g. libero)")
-    parser.add_argument("--task", required=True, help="Task name (e.g. libero_spatial)")
-    parser.add_argument("--policy", required=True, help="Policy hub path (e.g. pepijn223/smolvla_libero)")
-    args = parser.parse_args()
-
-    artifacts_dir = Path(args.artifacts_dir)
-    eval_info_path = artifacts_dir / "eval_info.json"
-
-    pc_success: float | None = None
-    n_episodes: int | None = None
-    avg_sum_reward: float | None = None
-    eval_s: float | None = None
-
-    if eval_info_path.exists():
-        try:
-            info = json.loads(eval_info_path.read_text())
-            pc_success, n_episodes, avg_sum_reward, eval_s = _extract_metrics(info)
-        except (json.JSONDecodeError, KeyError, TypeError) as exc:
-            print(f"[parse_eval_metrics] Warning: could not parse eval_info.json: {exc}", file=sys.stderr)
-    else:
-        print(
-            f"[parse_eval_metrics] Warning: {eval_info_path} not found — eval may have failed.",
-            file=sys.stderr,
-        )
-
-    task_descriptions: dict[str, str] = {}
-    task_desc_path = artifacts_dir / "task_descriptions.json"
-    if task_desc_path.exists():
-        try:
-            task_descriptions = json.loads(task_desc_path.read_text())
-        except json.JSONDecodeError as exc:
-            print(
-                f"[parse_eval_metrics] Warning: could not parse task_descriptions.json: {exc}",
-                file=sys.stderr,
-            )
-
-    metrics = {
-        "env": args.env,
-        "task": args.task,
-        "policy": args.policy,
-        "pc_success": pc_success,
-        "n_episodes": n_episodes,
-        "avg_sum_reward": avg_sum_reward,
-        "eval_s": eval_s,
-        "task_descriptions": task_descriptions,
-    }
-
-    out_path = artifacts_dir / "metrics.json"
-    out_path.write_text(json.dumps(metrics, indent=2))
-    print(f"[parse_eval_metrics] Written: {out_path}")
-    print(json.dumps(metrics, indent=2))
-
-    return 0
-
-
-if __name__ == "__main__":
-    sys.exit(main())
@@ -33,7 +33,7 @@ import cv2  # type: ignore  # TODO: add type stubs for OpenCV
 import numpy as np  # type: ignore  # TODO: add type stubs for numpy

 from lerobot.utils.decorators import check_if_not_connected
-from lerobot.utils.import_utils import _reachy2_sdk_available, require_package
+from lerobot.utils.import_utils import _reachy2_sdk_available

 if TYPE_CHECKING or _reachy2_sdk_available:
    from reachy2_sdk.media.camera import CameraView
@@ -76,7 +76,6 @@ class Reachy2Camera(Camera):
        Args:
            config: The configuration settings for the camera.
        """
-        require_package("reachy2_sdk", extra="reachy2")
        super().__init__(config)

        self.config = config
@@ -17,21 +17,18 @@ Provides the RealSenseCamera class for capturing frames from Intel RealSense cam
 """

 import logging
-import sys
 import time
 from threading import Event, Lock, Thread
-from typing import TYPE_CHECKING, Any
+from typing import Any

 import cv2  # type: ignore  # TODO: add type stubs for OpenCV
 import numpy as np  # type: ignore  # TODO: add type stubs for numpy
 from numpy.typing import NDArray  # type: ignore  # TODO: add type stubs for numpy.typing

-from lerobot.utils.import_utils import _pyrealsense2_available, require_package
-
-if TYPE_CHECKING or _pyrealsense2_available:
-    import pyrealsense2 as rs
-else:
-    rs = None
+try:
+    import pyrealsense2 as rs  # type: ignore  # TODO: add type stubs for pyrealsense2
+except Exception as e:
+    logging.info(f"Could not import realsense: {e}")

 from lerobot.utils.decorators import check_if_already_connected, check_if_not_connected
 from lerobot.utils.errors import DeviceNotConnectedError
@@ -42,7 +39,6 @@ from ..utils import get_cv2_rotation
 from .configuration_realsense import RealSenseCameraConfig

 logger = logging.getLogger(__name__)
-pkg_name = "pyrealsense2-macosx" if sys.platform == "darwin" else "pyrealsense2"


 class RealSenseCamera(Camera):
@@ -116,7 +112,7 @@ class RealSenseCamera(Camera):
        Args:
            config: The configuration settings for the camera.
        """
-        require_package(pkg_name, extra="intelrealsense", import_name="pyrealsense2")
+
        super().__init__(config)

        self.config = config
@@ -28,19 +28,12 @@ import json
 import logging
 import time
 from threading import Event, Lock, Thread
-from typing import TYPE_CHECKING, Any
+from typing import Any

 import cv2
 import numpy as np
 from numpy.typing import NDArray

-from lerobot.utils.import_utils import _zmq_available, require_package
-
-if TYPE_CHECKING or _zmq_available:
-    import zmq
-else:
-    zmq = None
-
 from lerobot.utils.decorators import check_if_already_connected, check_if_not_connected
 from lerobot.utils.errors import DeviceNotConnectedError

@@ -81,8 +74,8 @@ class ZMQCamera(Camera):
    """

    def __init__(self, config: ZMQCameraConfig):
-        require_package("pyzmq", extra="pyzmq-dep", import_name="zmq")
        super().__init__(config)
+        import zmq

        self.config = config
        self.server_address = config.server_address
@@ -124,6 +117,8 @@ class ZMQCamera(Camera):
        logger.info(f"Connecting to {self}...")

        try:
+            import zmq
+
            self.context = zmq.Context()
            self.socket = self.context.socket(zmq.SUB)
            self.socket.setsockopt_string(zmq.SUBSCRIBE, "")
@@ -185,8 +180,11 @@ class ZMQCamera(Camera):

        try:
            message = self.socket.recv_string()
-        except zmq.Again as e:
-            raise TimeoutError(f"{self} timeout after {self.timeout_ms}ms") from e
+        except Exception as e:
+            # zmq is lazy-imported in connect(), so check by name to avoid a top-level import
+            if type(e).__name__ == "Again":
+                raise TimeoutError(f"{self} timeout after {self.timeout_ms}ms") from e
+            raise

        # Decode JSON message
        data = json.loads(message)
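
Review note: the hunk above converts a timeout by matching the exception's class name instead of referencing `zmq.Again` directly. A minimal sketch of that pattern is shown here. It assumes nothing about pyzmq: `Again`, `FakeSocket`, and `read_message` are hypothetical stand-ins so the example runs without the library installed.

```python
class Again(Exception):
    """Stand-in for zmq.Again so the sketch runs without pyzmq."""


class FakeSocket:
    """Hypothetical socket whose receive call always times out."""

    def recv_string(self) -> str:
        raise Again("Resource temporarily unavailable")


def read_message(socket, timeout_ms: int) -> str:
    try:
        return socket.recv_string()
    except Exception as e:
        # Match by class name so the module needs no top-level zmq import.
        if type(e).__name__ == "Again":
            raise TimeoutError(f"timeout after {timeout_ms}ms") from e
        # Anything else propagates unchanged.
        raise


if __name__ == "__main__":
    try:
        read_message(FakeSocket(), timeout_ms=100)
    except TimeoutError as exc:
        print(f"caught: {exc}")
```

One trade-off worth noting: a name-based check also matches any unrelated exception class that happens to be called `Again`. Importing `zmq` locally inside the method and catching `zmq.Again` would avoid that while still keeping the import lazy.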
@@ -28,12 +28,6 @@ import numpy as np
 import torch

 from lerobot.policies import PreTrainedPolicy, prepare_observation_for_inference
-from lerobot.utils.import_utils import _deepdiff_available, require_package
-
-if TYPE_CHECKING or _deepdiff_available:
-    from deepdiff import DeepDiff
-else:
-    DeepDiff = None

 if TYPE_CHECKING:
    from lerobot.datasets import LeRobotDataset
@@ -223,7 +217,10 @@ def sanity_check_dataset_robot_compatibility(
    Raises:
        ValueError: If any of the checked metadata fields do not match.
    """
-    require_package("deepdiff", extra="deepdiff-dep")
+    from lerobot.utils.import_utils import require_package
+
+    require_package("deepdiff", extra="hardware")
+    from deepdiff import DeepDiff

    from lerobot.utils.constants import DEFAULT_FEATURES
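
Review note: the hunk above defers both the dependency check and the `deepdiff` import to call time. The sketch below illustrates that guard-then-import pattern in general terms. `require_package_sketch` is a hypothetical stand-in written for this note; it is not the implementation of `lerobot.utils.import_utils.require_package`, whose internals are not shown in this diff.

```python
import importlib.util


def require_package_sketch(import_name: str, extra: str) -> None:
    """Raise a descriptive ImportError if a top-level module cannot be found.

    Hypothetical helper for illustration; only top-level module names are supported.
    """
    if importlib.util.find_spec(import_name) is None:
        raise ImportError(
            f"'{import_name}' is required for this feature. "
            f"Install it with: pip install 'lerobot[{extra}]'"
        )


def compare(a: dict, b: dict) -> bool:
    # Check first, then import lazily, so importing this module never fails
    # just because the optional dependency is missing.
    require_package_sketch("json", extra="hardware")  # stdlib module used as a placeholder
    import json

    return json.dumps(a, sort_keys=True) == json.dumps(b, sort_keys=True)


if __name__ == "__main__":
    print(compare({"x": 1}, {"x": 1}))
```

The benefit of this ordering is that users without the optional extra get an actionable message at the point of use instead of a bare `ModuleNotFoundError` at import time.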

@@ -41,12 +41,8 @@ def cfg_to_group(
            return tag
        return tag[:max_tag_length]

-    if cfg.is_reward_model_training:
-        trainable_tag = f"reward_model:{cfg.reward_model.type}"
-    else:
-        trainable_tag = f"policy:{cfg.policy.type}"
    lst = [
-        trainable_tag,
+        f"policy:{cfg.policy.type}",
        f"seed:{cfg.seed}",
    ]
    if cfg.dataset is not None:
@@ -21,7 +21,6 @@ are intentionally NOT re-exported here to avoid circular dependencies
 Import them directly: ``from lerobot.configs.train import TrainPipelineConfig``
 """

-from .dataset import DatasetRecordConfig
 from .default import DatasetConfig, EvalConfig, PeftConfig, WandBConfig
 from .policies import PreTrainedConfig
 from .types import (
@@ -40,7 +39,6 @@ __all__ = [
    "PolicyFeature",
    "RTCAttentionSchedule",
    # Config classes
-    "DatasetRecordConfig",
    "DatasetConfig",
    "EvalConfig",
    "PeftConfig",
@@ -1,80 +0,0 @@
-# Copyright 2024 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-"""Shared dataset recording configuration used by both ``lerobot-record`` and ``lerobot-rollout``."""
-
-from dataclasses import dataclass
-from datetime import datetime
-from pathlib import Path
-
-
-@dataclass
-class DatasetRecordConfig:
-    # Dataset identifier. By convention it should match '{hf_username}/{dataset_name}' (e.g. `lerobot/test`).
-    repo_id: str = ""
-    # A short but accurate description of the task performed during the recording (e.g. "Pick the Lego block and drop it in the box on the right.")
-    single_task: str = ""
-    # Root directory where the dataset will be stored (e.g. 'dataset/path'). If None, defaults to $HF_LEROBOT_HOME/repo_id.
-    root: str | Path | None = None
-    # Limit the frames per second.
-    fps: int = 30
-    # Number of seconds for data recording for each episode.
-    episode_time_s: int | float = 60
-    # Number of seconds for resetting the environment after each episode.
-    reset_time_s: int | float = 60
-    # Number of episodes to record.
-    num_episodes: int = 50
-    # Encode frames in the dataset into video
-    video: bool = True
-    # Upload dataset to Hugging Face hub.
-    push_to_hub: bool = True
-    # Upload on private repository on the Hugging Face hub.
-    private: bool = False
-    # Add tags to your dataset on the hub.
-    tags: list[str] | None = None
-    # Number of subprocesses handling the saving of frames as PNG. Set to 0 to use threads only;
-    # set to ≥1 to use subprocesses, each using threads to write images. The best number of processes
-    # and threads depends on your system. We recommend 4 threads per camera with 0 processes.
-    # If fps is unstable, adjust the thread count. If still unstable, try using 1 or more subprocesses.
-    num_image_writer_processes: int = 0
-    # Number of threads writing the frames as png images on disk, per camera.
-    # Too many threads might cause unstable teleoperation fps due to main thread being blocked.
-    # Not enough threads might cause low camera fps.
-    num_image_writer_threads_per_camera: int = 4
-    # Number of episodes to record before batch encoding videos
-    # Set to 1 for immediate encoding (default behavior), or higher for batched encoding
-    video_encoding_batch_size: int = 1
-    # Video codec for encoding videos. Options: 'h264', 'hevc', 'libsvtav1', 'auto',
-    # or hardware-specific: 'h264_videotoolbox', 'h264_nvenc', 'h264_vaapi', 'h264_qsv'.
-    # Use 'auto' to auto-detect the best available hardware encoder.
-    vcodec: str = "libsvtav1"
-    # Enable streaming video encoding: encode frames in real-time during capture instead
-    # of writing PNG images first. Makes save_episode() near-instant. More info in the documentation: https://huggingface.co/docs/lerobot/streaming_video_encoding
-    streaming_encoding: bool = False
-    # Maximum number of frames to buffer per camera when using streaming encoding.
-    # ~1s buffer at 30fps. Provides backpressure if the encoder can't keep up.
-    encoder_queue_maxsize: int = 30
-    # Number of threads per encoder instance. None = auto (codec default).
-    # Lower values reduce CPU usage. Maps to 'lp' (via svtav1-params) for libsvtav1 and 'threads' for h264/hevc.
-    encoder_threads: int | None = None
-
-    def stamp_repo_id(self) -> None:
-        """Append a date-time tag to ``repo_id`` so each recording session gets a unique name.
-
-        Must be called explicitly at dataset *creation* time — not on resume,
-        where the existing ``repo_id`` (already stamped) must be preserved.
-        """
-        if self.repo_id:
-            timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
-            self.repo_id = f"{self.repo_id}_{timestamp}"
@@ -35,9 +35,6 @@ class DatasetConfig:
    revision: str | None = None
    use_imagenet_stats: bool = True
    video_backend: str = field(default_factory=get_safe_default_codec)
-    # When True, video frames are returned as uint8 tensors (0-255) instead of float32 (0.0-1.0).
-    # This reduces memory and speeds up DataLoader IPC. The training pipeline handles the conversion.
-    return_uint8: bool = False
    streaming: bool = False

    def __post_init__(self) -> None:
@@ -1,163 +0,0 @@
-# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-import abc
-import builtins
-import json
-import logging
-import os
-import tempfile
-from dataclasses import dataclass, field
-from pathlib import Path
-from typing import Any, TypeVar
-
-import draccus
-from huggingface_hub import hf_hub_download
-from huggingface_hub.constants import CONFIG_NAME
-from huggingface_hub.errors import HfHubHTTPError
-
-from lerobot.configs.types import PolicyFeature
-from lerobot.optim.optimizers import OptimizerConfig
-from lerobot.optim.schedulers import LRSchedulerConfig
-from lerobot.utils.device_utils import auto_select_torch_device, is_torch_device_available
-from lerobot.utils.hub import HubMixin
-
-T = TypeVar("T", bound="RewardModelConfig")
-logger = logging.getLogger(__name__)
-
-
-@dataclass
-class RewardModelConfig(draccus.ChoiceRegistry, HubMixin, abc.ABC):
-    """Base configuration for reward models.
-
-    Args:
-    input_features: A dictionary defining the PolicyFeature of the input data for the reward. The key represents
-        the input data name, and the value is PolicyFeature, which consists of FeatureType and shape attributes.
-    output_features: A dictionary defining the PolicyFeature of the output data for the reward. The key represents
-        the output data name, and the value is PolicyFeature, which consists of FeatureType and shape attributes.
-    """
-
-    # Reuses PolicyFeature
-    input_features: dict[str, PolicyFeature] = field(default_factory=dict)
-    output_features: dict[str, PolicyFeature] = field(default_factory=dict)
-
-    device: str | None = None
-
-    pretrained_path: str | None = None
-
-    push_to_hub: bool = False
-    repo_id: str | None = None
-
-    # Hub metadata
-    license: str | None = None
-    tags: list[str] | None = None
-    private: bool | None = None
-
-    def __post_init__(self) -> None:
-        if not self.device or not is_torch_device_available(self.device):
-            auto_device = auto_select_torch_device()
-            logger.warning(f"Device '{self.device}' is not available. Switching to '{auto_device}'.")
-            self.device = auto_device.type
-
-    @property
-    def type(self) -> str:
-        choice_name = self.get_choice_name(self.__class__)
-        if not isinstance(choice_name, str):
-            raise TypeError(f"Expected string from get_choice_name, got {type(choice_name)}")
-        return choice_name
-
-    @property
-    def observation_delta_indices(self) -> list | None:  # type: ignore[type-arg]
-        return None
-
-    @property
-    def action_delta_indices(self) -> list | None:  # type: ignore[type-arg]
-        return None
-
-    @property
-    def reward_delta_indices(self) -> list | None:  # type: ignore[type-arg]
-        return None
-
-    @abc.abstractmethod
-    def get_optimizer_preset(self) -> OptimizerConfig:
-        raise NotImplementedError
-
-    def get_scheduler_preset(self) -> LRSchedulerConfig | None:
-        return None
-
-    def validate_features(self) -> None:
-        pass
-
-    def _save_pretrained(self, save_directory: Path) -> None:
-        with open(save_directory / CONFIG_NAME, "w") as f, draccus.config_type("json"):
-            draccus.dump(self, f, indent=4)
-
-    @classmethod
-    def from_pretrained(
-        cls: builtins.type[T],
-        pretrained_name_or_path: str | Path,
-        *,
-        force_download: bool = False,
-        resume_download: bool | None = None,
-        proxies: dict[Any, Any] | None = None,
-        token: str | bool | None = None,
-        cache_dir: str | Path | None = None,
-        local_files_only: bool = False,
-        revision: str | None = None,
-        **reward_kwargs: Any,
-    ) -> T:
-        model_id = str(pretrained_name_or_path)
-        config_file: str | None = None
-        if Path(model_id).is_dir():
-            if CONFIG_NAME in os.listdir(model_id):
-                config_file = os.path.join(model_id, CONFIG_NAME)
-            else:
-                logger.error(f"{CONFIG_NAME} not found in {Path(model_id).resolve()}")
-        else:
-            try:
-                config_file = hf_hub_download(
-                    repo_id=model_id,
-                    filename=CONFIG_NAME,
-                    revision=revision,
-                    cache_dir=cache_dir,
-                    force_download=force_download,
-                    proxies=proxies,
-                    resume_download=resume_download,
-                    token=token,
-                    local_files_only=local_files_only,
-                )
-            except HfHubHTTPError as e:
-                raise FileNotFoundError(
-                    f"{CONFIG_NAME} not found on the HuggingFace Hub in {model_id}"
-                ) from e
-
-        if config_file is None:
-            raise FileNotFoundError(f"{CONFIG_NAME} not found in {model_id}")
-
-        # HACK: Parse the original config to get the config subclass, so that we can
-        # apply cli overrides.
-        with draccus.config_type("json"):
-            orig_config = draccus.parse(cls, config_file, args=[])
-
-        with open(config_file) as f:
-            config = json.load(f)
-
-        config.pop("type", None)
-        with tempfile.NamedTemporaryFile("w+", delete=False, suffix=".json") as f:
-            json.dump(config, f)
-            config_file = f.name
-
-        cli_overrides = reward_kwargs.pop("cli_overrides", [])
-        with draccus.config_type("json"):
-            return draccus.parse(orig_config.__class__, config_file, args=cli_overrides)
@@ -13,9 +13,7 @@
 # limitations under the License.
 import builtins
 import datetime as dt
-import json
 import os
-import tempfile
 from dataclasses import dataclass, field
 from pathlib import Path
 from typing import Any
@@ -28,57 +26,18 @@ from lerobot import envs
 from lerobot.configs import parser
 from lerobot.optim import LRSchedulerConfig, OptimizerConfig
 from lerobot.utils.hub import HubMixin
-from lerobot.utils.sample_weighting import SampleWeightingConfig

 from .default import DatasetConfig, EvalConfig, PeftConfig, WandBConfig
 from .policies import PreTrainedConfig
-from .rewards import RewardModelConfig

 TRAIN_CONFIG_NAME = "train_config.json"


-def _migrate_legacy_rabc_fields(config: dict[str, Any]) -> dict[str, Any] | None:
-    """Return migrated payload for legacy RA-BC fields, or None when no migration is needed."""
-    legacy_fields = (
-        "use_rabc",
-        "rabc_progress_path",
-        "rabc_kappa",
-        "rabc_epsilon",
-        "rabc_head_mode",
-    )
-    if not any(key in config for key in legacy_fields):
-        return None
-
-    migrated_config = dict(config)
-    use_rabc = bool(migrated_config.pop("use_rabc", False))
-    rabc_progress_path = migrated_config.pop("rabc_progress_path", None)
-    rabc_kappa = migrated_config.pop("rabc_kappa", None)
-    rabc_epsilon = migrated_config.pop("rabc_epsilon", None)
-    rabc_head_mode = migrated_config.pop("rabc_head_mode", None)
-
-    # New configs may already define sample_weighting explicitly. In that case,
-    # legacy fields are ignored after being stripped from the payload.
-    if migrated_config.get("sample_weighting") is None and use_rabc:
-        sample_weighting: dict[str, Any] = {"type": "rabc"}
-        if rabc_progress_path is not None:
-            sample_weighting["progress_path"] = rabc_progress_path
-        if rabc_kappa is not None:
-            sample_weighting["kappa"] = rabc_kappa
-        if rabc_epsilon is not None:
-            sample_weighting["epsilon"] = rabc_epsilon
-        if rabc_head_mode is not None:
-            sample_weighting["head_mode"] = rabc_head_mode
-        migrated_config["sample_weighting"] = sample_weighting
-
-    return migrated_config
-
-
 @dataclass
 class TrainPipelineConfig(HubMixin):
    dataset: DatasetConfig
    env: envs.EnvConfig | None = None
    policy: PreTrainedConfig | None = None
-    reward_model: RewardModelConfig | None = None
    # Set `dir` to where you would like to save all of the run outputs. If you run another training session
    # with the same value for `dir` its contents will be overwritten unless you set `resume` to true.
    output_dir: Path | None = None
@@ -97,8 +56,6 @@ class TrainPipelineConfig(HubMixin):
    # Number of workers for the dataloader.
    num_workers: int = 4
    batch_size: int = 8
-    prefetch_factor: int = 4
-    persistent_workers: bool = True
    steps: int = 100_000
    eval_freq: int = 20_000
    log_freq: int = 200
@@ -113,41 +70,27 @@ class TrainPipelineConfig(HubMixin):
    wandb: WandBConfig = field(default_factory=WandBConfig)
    peft: PeftConfig | None = None

-    # Sample weighting configuration (e.g., for RA-BC training)
-    sample_weighting: SampleWeightingConfig | None = None
+    # RA-BC (Reward-Aligned Behavior Cloning) parameters
+    use_rabc: bool = False  # Enable reward-weighted training
+    rabc_progress_path: str | None = None  # Path to precomputed SARM progress parquet file
+    rabc_kappa: float = 0.01  # Hard threshold for high-quality samples
+    rabc_epsilon: float = 1e-6  # Small constant for numerical stability
+    rabc_head_mode: str | None = "sparse"  # For dual-head models: "sparse" or "dense"

    # Rename map for the observation to override the image and state keys
    rename_map: dict[str, str] = field(default_factory=dict)
    checkpoint_path: Path | None = field(init=False, default=None)

-    @property
-    def is_reward_model_training(self) -> bool:
-        """True when the config targets a reward model rather than a policy."""
-        return self.reward_model is not None
-
-    @property
-    def trainable_config(self) -> PreTrainedConfig | RewardModelConfig:
-        """Return whichever config (policy or reward_model) is active."""
-        if self.is_reward_model_training:
-            return self.reward_model  # type: ignore[return-value]
-        return self.policy  # type: ignore[return-value]
-
    def validate(self) -> None:
        # HACK: We parse again the cli args here to get the pretrained paths if there was some.
        policy_path = parser.get_path_arg("policy")
-        reward_model_path = parser.get_path_arg("reward_model")
-
-        if reward_model_path:
-            cli_overrides = parser.get_cli_overrides("reward_model")
-            self.reward_model = RewardModelConfig.from_pretrained(
-                reward_model_path, cli_overrides=cli_overrides
-            )
-            self.reward_model.pretrained_path = str(Path(reward_model_path))
-        elif policy_path:
+        if policy_path:
+            # Only load the policy config
            cli_overrides = parser.get_cli_overrides("policy")
            self.policy = PreTrainedConfig.from_pretrained(policy_path, cli_overrides=cli_overrides)
            self.policy.pretrained_path = Path(policy_path)
        elif self.resume:
+            # The entire train config is already loaded, we just need to get the checkpoint dir
            config_path = parser.parse_arg("config_path")
            if not config_path:
                raise ValueError(
@@ -163,22 +106,18 @@ class TrainPipelineConfig(HubMixin):
            policy_dir = Path(config_path).parent
            if self.policy is not None:
                self.policy.pretrained_path = policy_dir
-            if self.reward_model is not None:
-                self.reward_model.pretrained_path = str(policy_dir)
            self.checkpoint_path = policy_dir.parent

-        if self.policy is None and self.reward_model is None:
+        if self.policy is None:
            raise ValueError(
-                "Neither policy nor reward_model is configured. "
-                "Please specify one with `--policy.path` or `--reward_model.path`."
+                "Policy is not configured. Please specify a pretrained policy with `--policy.path`."
            )

-        active_cfg = self.trainable_config
        if not self.job_name:
            if self.env is None:
-                self.job_name = f"{active_cfg.type}"
+                self.job_name = f"{self.policy.type}"
            else:
-                self.job_name = f"{self.env.type}_{active_cfg.type}"
+                self.job_name = f"{self.env.type}_{self.policy.type}"

        if not self.resume and isinstance(self.output_dir, Path) and self.output_dir.is_dir():
            raise FileExistsError(
@@ -196,16 +135,26 @@ class TrainPipelineConfig(HubMixin):
        if not self.use_policy_training_preset and (self.optimizer is None or self.scheduler is None):
            raise ValueError("Optimizer and Scheduler must be set when the policy presets are not used.")
        elif self.use_policy_training_preset and not self.resume:
-            self.optimizer = active_cfg.get_optimizer_preset()
-            self.scheduler = active_cfg.get_scheduler_preset()
+            self.optimizer = self.policy.get_optimizer_preset()
+            self.scheduler = self.policy.get_scheduler_preset()

-        if hasattr(active_cfg, "push_to_hub") and active_cfg.push_to_hub and not active_cfg.repo_id:
-            raise ValueError("'repo_id' argument missing. Please specify it to push the model to the hub.")
+        if self.policy.push_to_hub and not self.policy.repo_id:
+            raise ValueError(
+                "'policy.repo_id' argument missing. Please specify it to push the model to the hub."
+            )
+
+        if self.use_rabc and not self.rabc_progress_path:
+            # Auto-detect from dataset path
+            repo_id = self.dataset.repo_id
+            if self.dataset.root:
+                self.rabc_progress_path = str(Path(self.dataset.root) / "sarm_progress.parquet")
+            else:
+                self.rabc_progress_path = f"hf://datasets/{repo_id}/sarm_progress.parquet"

    @classmethod
    def __get_path_fields__(cls) -> list[str]:
-        """Keys for draccus pretrained-path loading."""
-        return ["policy", "reward_model"]
+        """This enables the parser to load config from the policy using `--policy.path=local/dir`"""
+        return ["policy"]

    def to_dict(self) -> dict[str, Any]:
        return draccus.encode(self)  # type: ignore[no-any-return]  # because of the third-party library draccus uses Any as the return type
@@ -256,17 +205,6 @@ class TrainPipelineConfig(HubMixin):
                ) from e

        cli_args = kwargs.pop("cli_args", [])
-        # Legacy RA-BC migration only applies to framework-saved checkpoints (always JSON).
-        # Hand-written YAML/TOML configs are expected to use the current sample_weighting schema.
-        if config_file is not None and config_file.endswith(".json"):
-            with open(config_file) as f:
-                config = json.load(f)
-            migrated_config = _migrate_legacy_rabc_fields(config)
-            if migrated_config is not None:
-                with tempfile.NamedTemporaryFile("w+", delete=False, suffix=".json") as f:
-                    json.dump(migrated_config, f)
-                    config_file = f.name
-
        with draccus.config_type("json"):
            return draccus.parse(cls, config_file, args=cli_args)

@@ -97,8 +97,8 @@ def update_data_df(df, src_meta, dst_meta):
        pd.DataFrame: Updated DataFrame with adjusted indices.
    """

-    df["episode_index"] = df["episode_index"] + dst_meta.info.total_episodes
-    df["index"] = df["index"] + dst_meta.info.total_frames
+    df["episode_index"] = df["episode_index"] + dst_meta.info["total_episodes"]
+    df["index"] = df["index"] + dst_meta.info["total_frames"]

    src_task_names = src_meta.tasks.index.take(df["task_index"].to_numpy())
    df["task_index"] = dst_meta.tasks.loc[src_task_names, "task_index"].to_numpy()
@@ -225,9 +225,9 @@ def update_meta_data(
        # Clean up temporary columns
        df = df.drop(columns=["_orig_chunk", "_orig_file"])

-    df["dataset_from_index"] = df["dataset_from_index"] + dst_meta.info.total_frames
-    df["dataset_to_index"] = df["dataset_to_index"] + dst_meta.info.total_frames
-    df["episode_index"] = df["episode_index"] + dst_meta.info.total_episodes
+    df["dataset_from_index"] = df["dataset_from_index"] + dst_meta.info["total_frames"]
+    df["dataset_to_index"] = df["dataset_to_index"] + dst_meta.info["total_frames"]
+    df["episode_index"] = df["episode_index"] + dst_meta.info["total_episodes"]

    return df

@@ -237,8 +237,8 @@ def aggregate_datasets(
    aggr_repo_id: str,
    roots: list[Path] | None = None,
    aggr_root: Path | None = None,
-    data_files_size_in_mb: int | None = None,
-    video_files_size_in_mb: int | None = None,
+    data_files_size_in_mb: float | None = None,
+    video_files_size_in_mb: float | None = None,
    chunk_size: int | None = None,
 ):
    """Aggregates multiple LeRobot datasets into a single unified dataset.
@@ -313,8 +313,8 @@ def aggregate_datasets(
        # to avoid interference between different source datasets
        data_idx.pop("src_to_dst", None)

-        dst_meta.info.total_episodes += src_meta.total_episodes
-        dst_meta.info.total_frames += src_meta.total_frames
+        dst_meta.info["total_episodes"] += src_meta.total_episodes
+        dst_meta.info["total_frames"] += src_meta.total_frames

    finalize_aggregation(dst_meta, all_metadata)
    logging.info("Aggregation complete.")
@@ -640,10 +640,14 @@ def finalize_aggregation(aggr_meta, all_metadata):
    write_tasks(aggr_meta.tasks, aggr_meta.root)

    logging.info("write info")
-    aggr_meta.info.total_tasks = len(aggr_meta.tasks)
-    aggr_meta.info.total_episodes = sum(m.total_episodes for m in all_metadata)
-    aggr_meta.info.total_frames = sum(m.total_frames for m in all_metadata)
-    aggr_meta.info.splits = {"train": f"0:{sum(m.total_episodes for m in all_metadata)}"}
+    aggr_meta.info.update(
+        {
+            "total_tasks": len(aggr_meta.tasks),
+            "total_episodes": sum(m.total_episodes for m in all_metadata),
+            "total_frames": sum(m.total_frames for m in all_metadata),
+            "splits": {"train": f"0:{sum(m.total_episodes for m in all_metadata)}"},
+        }
+    )
    write_info(aggr_meta.info, aggr_meta.root)

    logging.info("write stats")
@@ -37,11 +37,13 @@ from .io_utils import (
    load_subtasks,
    load_tasks,
    write_info,
+    write_json,
    write_stats,
    write_tasks,
 )
 from .utils import (
    DEFAULT_EPISODES_PATH,
+    INFO_PATH,
    check_version_compatibility,
    get_safe_version,
    has_legacy_hub_download_metadata,
@@ -226,7 +228,7 @@ class LeRobotDatasetMetadata:
    @property
    def _version(self) -> packaging.version.Version:
        """Codebase version used to create this dataset."""
-        return packaging.version.parse(self.info.codebase_version)
+        return packaging.version.parse(self.info["codebase_version"])

    def get_data_file_path(self, ep_index: int) -> Path:
        """Return the relative parquet file path for the given episode index.
@@ -281,27 +283,27 @@ class LeRobotDatasetMetadata:
    @property
    def data_path(self) -> str:
        """Formattable string for the parquet files."""
-        return self.info.data_path
+        return self.info["data_path"]

    @property
    def video_path(self) -> str | None:
        """Formattable string for the video files."""
-        return self.info.video_path
+        return self.info["video_path"]

    @property
    def robot_type(self) -> str | None:
        """Robot type used in recording this dataset."""
-        return self.info.robot_type
+        return self.info["robot_type"]

    @property
    def fps(self) -> int:
        """Frames per second used during data collection."""
-        return self.info.fps
+        return self.info["fps"]

    @property
    def features(self) -> dict[str, dict]:
        """All features contained in the dataset."""
-        return self.info.features
+        return self.info["features"]

    @property
    def image_keys(self) -> list[str]:
@@ -331,32 +333,32 @@ class LeRobotDatasetMetadata:
    @property
    def total_episodes(self) -> int:
        """Total number of episodes available."""
-        return self.info.total_episodes
+        return self.info["total_episodes"]

    @property
    def total_frames(self) -> int:
        """Total number of frames saved in this dataset."""
-        return self.info.total_frames
+        return self.info["total_frames"]

    @property
    def total_tasks(self) -> int:
        """Total number of different tasks performed in this dataset."""
-        return self.info.total_tasks
+        return self.info["total_tasks"]

    @property
    def chunks_size(self) -> int:
        """Max number of files per chunk."""
-        return self.info.chunks_size
+        return self.info["chunks_size"]

    @property
    def data_files_size_in_mb(self) -> int:
        """Max size of data file in mega bytes."""
-        return self.info.data_files_size_in_mb
+        return self.info["data_files_size_in_mb"]

    @property
    def video_files_size_in_mb(self) -> int:
        """Max size of video file in mega bytes."""
-        return self.info.video_files_size_in_mb
+        return self.info["video_files_size_in_mb"]

    def get_task_index(self, task: str) -> int | None:
        """
@@ -500,10 +502,10 @@ class LeRobotDatasetMetadata:
        self._save_episode_metadata(episode_dict)

        # Update info
-        self.info.total_episodes += 1
-        self.info.total_frames += episode_length
-        self.info.total_tasks = len(self.tasks)
-        self.info.splits = {"train": f"0:{self.info.total_episodes}"}
+        self.info["total_episodes"] += 1
+        self.info["total_frames"] += episode_length
+        self.info["total_tasks"] = len(self.tasks)
+        self.info["splits"] = {"train": f"0:{self.info['total_episodes']}"}

        write_info(self.info, self.root)

@@ -522,7 +524,7 @@ class LeRobotDatasetMetadata:
        for key in video_keys:
            if not self.features[key].get("info", None):
                video_path = self.root / self.video_path.format(video_key=key, chunk_index=0, file_index=0)
-                self.info.features[key]["info"] = get_video_info(video_path)
+                self.info["features"][key]["info"] = get_video_info(video_path)

    def update_chunk_settings(
        self,
@@ -544,17 +546,17 @@ class LeRobotDatasetMetadata:
        if chunks_size is not None:
            if chunks_size <= 0:
                raise ValueError(f"chunks_size must be positive, got {chunks_size}")
-            self.info.chunks_size = chunks_size
+            self.info["chunks_size"] = chunks_size

        if data_files_size_in_mb is not None:
            if data_files_size_in_mb <= 0:
                raise ValueError(f"data_files_size_in_mb must be positive, got {data_files_size_in_mb}")
-            self.info.data_files_size_in_mb = data_files_size_in_mb
+            self.info["data_files_size_in_mb"] = data_files_size_in_mb

        if video_files_size_in_mb is not None:
            if video_files_size_in_mb <= 0:
                raise ValueError(f"video_files_size_in_mb must be positive, got {video_files_size_in_mb}")
-            self.info.video_files_size_in_mb = video_files_size_in_mb
+            self.info["video_files_size_in_mb"] = video_files_size_in_mb

        # Update the info file on disk
        write_info(self.info, self.root)
@@ -651,7 +653,7 @@ class LeRobotDatasetMetadata:
                f"Features contain video keys {obj.video_keys}, but 'use_videos' is set to False. "
                "Either remove video features from the features dict, or set 'use_videos=True'."
            )
-        write_info(obj.info, obj.root)
+        write_json(obj.info, obj.root / INFO_PATH)
        obj.revision = None
        obj._pq_writer = None
        obj.latest_episode = None
@@ -16,7 +16,6 @@
 """Private reader component for LeRobotDataset. Handles random-access reading (HF dataset, delta indices, video decoding)."""

 from collections.abc import Callable
-from concurrent.futures import ThreadPoolExecutor
 from pathlib import Path

 import datasets
@@ -50,7 +49,6 @@ class DatasetReader:
        video_backend: str,
        delta_timestamps: dict[str, list[float]] | None,
        image_transforms: Callable | None,
-        return_uint8: bool = False,
    ):
        """Initialize the reader with metadata, filtering, and transform config.

@@ -75,7 +73,6 @@ class DatasetReader:
        self._tolerance_s = tolerance_s
        self._video_backend = video_backend
        self._image_transforms = image_transforms
-        self._return_uint8 = return_uint8

        self.hf_dataset: datasets.Dataset | None = None
        self._absolute_to_relative_idx: dict[int, int] | None = None
@@ -108,8 +105,10 @@ class DatasetReader:
        """Build absolute-to-relative index mapping from loaded hf_dataset."""
        self._absolute_to_relative_idx = None
        if self.episodes is not None and self.hf_dataset is not None:
-            indices = self.hf_dataset.data.column("index").to_numpy()
-            self._absolute_to_relative_idx = dict(zip(indices.tolist(), range(len(indices)), strict=True))
+            self._absolute_to_relative_idx = {
+                abs_idx.item() if isinstance(abs_idx, torch.Tensor) else abs_idx: rel_idx
+                for rel_idx, abs_idx in enumerate(self.hf_dataset["index"])
+            }

    @property
    def num_frames(self) -> int:
@@ -236,30 +235,16 @@ class DatasetReader:
        Segmentation Fault.
        """
        ep = self._meta.episodes[ep_idx]
-
-        def _decode_single(vid_key: str, query_ts: list[float]) -> tuple[str, torch.Tensor]:
+        item = {}
+        for vid_key, query_ts in query_timestamps.items():
            from_timestamp = ep[f"videos/{vid_key}/from_timestamp"]
            shifted_query_ts = [from_timestamp + ts for ts in query_ts]
+
            video_path = self.root / self._meta.get_video_file_path(ep_idx, vid_key)
-            frames = decode_video_frames(
-                video_path,
-                shifted_query_ts,
-                self._tolerance_s,
-                self._video_backend,
-                return_uint8=self._return_uint8,
-            )
-            return vid_key, frames.squeeze(0)
+            frames = decode_video_frames(video_path, shifted_query_ts, self._tolerance_s, self._video_backend)
+            item[vid_key] = frames.squeeze(0)

-        items = list(query_timestamps.items())
-
-        # Single camera: no threading overhead
-        if len(items) <= 1:
-            return {vid_key: _decode_single(vid_key, query_ts)[1] for vid_key, query_ts in items}
-
-        # Multi-camera: decode in parallel (video decoding releases the GIL)
-        with ThreadPoolExecutor(max_workers=len(items)) as pool:
-            futures = [pool.submit(_decode_single, k, ts) for k, ts in items]
-            return dict(f.result() for f in futures)
+        return item

    def get_item(self, idx) -> dict:
        """Core __getitem__ logic. Assumes hf_dataset is loaded.
@@ -897,10 +897,14 @@ def _copy_and_reindex_episodes_metadata(

    dst_meta.finalize()

-    dst_meta.info.total_episodes = len(episode_mapping)
-    dst_meta.info.total_frames = total_frames
-    dst_meta.info.total_tasks = len(dst_meta.tasks) if dst_meta.tasks is not None else 0
-    dst_meta.info.splits = {"train": f"0:{len(episode_mapping)}"}
+    dst_meta.info.update(
+        {
+            "total_episodes": len(episode_mapping),
+            "total_frames": total_frames,
+            "total_tasks": len(dst_meta.tasks) if dst_meta.tasks is not None else 0,
+            "splits": {"train": f"0:{len(episode_mapping)}"},
+        }
+    )
    write_info(dst_meta.info, dst_meta.root)

    if not all_stats:
@@ -1065,20 +1069,21 @@ def _copy_episodes_metadata_and_stats(
    if episodes_dir.exists():
        shutil.copytree(episodes_dir, dst_episodes_dir, dirs_exist_ok=True)

-    dst_meta.info.total_episodes = src_dataset.meta.total_episodes
-    dst_meta.info.total_frames = src_dataset.meta.total_frames
-    dst_meta.info.total_tasks = src_dataset.meta.total_tasks
-    # Preserve original splits if available, otherwise create default
-    dst_meta.info.splits = (
-        src_dataset.meta.info.splits
-        if src_dataset.meta.info.splits
-        else {"train": f"0:{src_dataset.meta.total_episodes}"}
+    dst_meta.info.update(
+        {
+            "total_episodes": src_dataset.meta.total_episodes,
+            "total_frames": src_dataset.meta.total_frames,
+            "total_tasks": src_dataset.meta.total_tasks,
+            "splits": src_dataset.meta.info.get("splits", {"train": f"0:{src_dataset.meta.total_episodes}"}),
+        }
    )

    if dst_meta.video_keys and src_dataset.meta.video_keys:
        for key in dst_meta.video_keys:
            if key in src_dataset.meta.features:
-                dst_meta.info.features[key]["info"] = src_dataset.meta.info.features[key].get("info", {})
+                dst_meta.info["features"][key]["info"] = src_dataset.meta.info["features"][key].get(
+                    "info", {}
+                )

    write_info(dst_meta.info, dst_meta.root)

@@ -1520,7 +1525,7 @@ def modify_tasks(
    write_tasks(new_task_df, root)

    # Update info.json
-    dataset.meta.info.total_tasks = len(unique_tasks)
+    dataset.meta.info["total_tasks"] = len(unique_tasks)
    write_info(dataset.meta.info, root)

    # Reload metadata to reflect changes
@@ -1853,10 +1858,10 @@ def convert_image_to_video_dataset(
        episodes_df.to_parquet(episodes_path, index=False)

        # Update metadata info
-        new_meta.info.total_episodes = len(episode_indices)
-        new_meta.info.total_frames = sum(ep["length"] for ep in all_episode_metadata.values())
-        new_meta.info.total_tasks = dataset.meta.total_tasks
-        new_meta.info.splits = {"train": f"0:{len(episode_indices)}"}
+        new_meta.info["total_episodes"] = len(episode_indices)
+        new_meta.info["total_frames"] = sum(ep["length"] for ep in all_episode_metadata.values())
+        new_meta.info["total_tasks"] = dataset.meta.total_tasks
+        new_meta.info["splits"] = {"train": f"0:{len(episode_indices)}"}

        # Update video info for all image keys (now videos)
        # We need to manually set video info since update_video_info() checks video_keys first
@@ -1865,7 +1870,7 @@ def convert_image_to_video_dataset(
                video_path = new_meta.root / new_meta.video_path.format(
                    video_key=img_key, chunk_index=0, file_index=0
                )
-                new_meta.info.features[img_key]["info"] = get_video_info(video_path)
+                new_meta.info["features"][img_key]["info"] = get_video_info(video_path)

        write_info(new_meta.info, new_meta.root)

@@ -597,7 +597,7 @@ class DatasetWriter:

    def cleanup_interrupted_episode(self, episode_index: int) -> None:
        """Remove temporary image directories for an interrupted episode."""
-        for key in self._meta.camera_keys:
+        for key in self._meta.video_keys:
            img_dir = self._get_image_file_path(
                episode_index=episode_index, image_key=key, frame_index=0
            ).parent
@@ -19,7 +19,6 @@ from pprint import pformat
 import torch

 from lerobot.configs import PreTrainedConfig
-from lerobot.configs.rewards import RewardModelConfig
 from lerobot.configs.train import TrainPipelineConfig
 from lerobot.transforms import ImageTransforms
 from lerobot.utils.constants import ACTION, IMAGENET_STATS, OBS_PREFIX, REWARD
@@ -31,14 +30,12 @@ from .streaming_dataset import StreamingLeRobotDataset


 def resolve_delta_timestamps(
-    cfg: PreTrainedConfig | RewardModelConfig, ds_meta: LeRobotDatasetMetadata
+    cfg: PreTrainedConfig, ds_meta: LeRobotDatasetMetadata
 ) -> dict[str, list] | None:
-    """Resolves delta_timestamps by reading from the 'delta_indices' properties of the config.
+    """Resolves delta_timestamps by reading from the 'delta_indices' properties of the PreTrainedConfig.

    Args:
-        cfg (PreTrainedConfig | RewardModelConfig): The config to read delta_indices from. Both
-            ``PreTrainedConfig`` and concrete ``RewardModelConfig`` subclasses expose the
-            ``{observation,action,reward}_delta_indices`` properties used below.
+        cfg (PreTrainedConfig): The PreTrainedConfig to read delta_indices from.
        ds_meta (LeRobotDatasetMetadata): The dataset from which features and fps are used to build
            delta_timestamps against.

@@ -85,7 +82,7 @@ def make_dataset(cfg: TrainPipelineConfig) -> LeRobotDataset | MultiLeRobotDatas
        ds_meta = LeRobotDatasetMetadata(
            cfg.dataset.repo_id, root=cfg.dataset.root, revision=cfg.dataset.revision
        )
-        delta_timestamps = resolve_delta_timestamps(cfg.trainable_config, ds_meta)
+        delta_timestamps = resolve_delta_timestamps(cfg.policy, ds_meta)
        if not cfg.dataset.streaming:
            dataset = LeRobotDataset(
                cfg.dataset.repo_id,
@@ -95,7 +92,6 @@ def make_dataset(cfg: TrainPipelineConfig) -> LeRobotDataset | MultiLeRobotDatas
                image_transforms=image_transforms,
                revision=cfg.dataset.revision,
                video_backend=cfg.dataset.video_backend,
-                return_uint8=True,
                tolerance_s=cfg.tolerance_s,
            )
        else:
@@ -108,7 +104,6 @@ def make_dataset(cfg: TrainPipelineConfig) -> LeRobotDataset | MultiLeRobotDatas
                revision=cfg.dataset.revision,
                max_num_shards=cfg.num_workers,
                tolerance_s=cfg.tolerance_s,
-                return_uint8=True,
            )
    else:
        raise NotImplementedError("The MultiLeRobotDataset isn't supported for now.")
@@ -28,7 +28,6 @@ from .utils import (
    DEFAULT_DATA_PATH,
    DEFAULT_VIDEO_FILE_SIZE_IN_MB,
    DEFAULT_VIDEO_PATH,
-    DatasetInfo,
 )


@@ -79,8 +78,8 @@ def create_empty_dataset_info(
    chunks_size: int | None = None,
    data_files_size_in_mb: int | None = None,
    video_files_size_in_mb: int | None = None,
-) -> DatasetInfo:
-    """Create a template ``DatasetInfo`` object for a new dataset's ``meta/info.json``.
+) -> dict:
+    """Create a template dictionary for a new dataset's `info.json`.

    Args:
        codebase_version (str): The version of the LeRobot codebase.
@@ -88,24 +87,25 @@ def create_empty_dataset_info(
        features (dict): The LeRobot features dictionary for the dataset.
        use_videos (bool): Whether the dataset will store videos.
        robot_type (str | None): The type of robot used, if any.
-        chunks_size (int | None): Max files per chunk directory. Defaults to ``DEFAULT_CHUNK_SIZE``.
-        data_files_size_in_mb (int | None): Max parquet file size in MB. Defaults to ``DEFAULT_DATA_FILE_SIZE_IN_MB``.
-        video_files_size_in_mb (int | None): Max video file size in MB. Defaults to ``DEFAULT_VIDEO_FILE_SIZE_IN_MB``.

    Returns:
-        DatasetInfo: A typed dataset information object with initial metadata.
+        dict: A dictionary with the initial dataset metadata.
    """
-    return DatasetInfo(
-        codebase_version=codebase_version,
-        fps=fps,
-        features=features,
-        robot_type=robot_type,
-        chunks_size=chunks_size or DEFAULT_CHUNK_SIZE,
-        data_files_size_in_mb=data_files_size_in_mb or DEFAULT_DATA_FILE_SIZE_IN_MB,
-        video_files_size_in_mb=video_files_size_in_mb or DEFAULT_VIDEO_FILE_SIZE_IN_MB,
-        data_path=DEFAULT_DATA_PATH,
-        video_path=DEFAULT_VIDEO_PATH if use_videos else None,
-    )
+    return {
+        "codebase_version": codebase_version,
+        "robot_type": robot_type,
+        "total_episodes": 0,
+        "total_frames": 0,
+        "total_tasks": 0,
+        "chunks_size": chunks_size or DEFAULT_CHUNK_SIZE,
+        "data_files_size_in_mb": data_files_size_in_mb or DEFAULT_DATA_FILE_SIZE_IN_MB,
+        "video_files_size_in_mb": video_files_size_in_mb or DEFAULT_VIDEO_FILE_SIZE_IN_MB,
+        "fps": fps,
+        "splits": {},
+        "data_path": DEFAULT_DATA_PATH,
+        "video_path": DEFAULT_VIDEO_PATH if use_videos else None,
+        "features": features,
+    }


 def check_delta_timestamps(
@@ -30,13 +30,13 @@ def safe_stop_image_writer(func):
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
-        except BaseException:
+        except Exception as e:
            dataset = kwargs.get("dataset")
            writer = getattr(dataset, "writer", None) if dataset else None
            if writer is not None and writer.image_writer is not None:
                logger.warning("Waiting for image writer to terminate...")
                writer.image_writer.stop()
-            raise
+            raise e

    return wrapper

@@ -39,7 +39,6 @@ from .utils import (
    EPISODES_DIR,
    INFO_PATH,
    STATS_PATH,
-    DatasetInfo,
    serialize_dict,
 )

@@ -116,21 +115,25 @@ def embed_images(dataset: datasets.Dataset) -> datasets.Dataset:
    return dataset


-def write_info(info: DatasetInfo, local_dir: Path) -> None:
-    write_json(info.to_dict(), local_dir / INFO_PATH)
+def write_info(info: dict, local_dir: Path) -> None:
+    write_json(info, local_dir / INFO_PATH)


-def load_info(local_dir: Path) -> DatasetInfo:
+def load_info(local_dir: Path) -> dict:
    """Load dataset info metadata from its standard file path.

+    Also converts shape lists to tuples for consistency.
+
    Args:
        local_dir (Path): The root directory of the dataset.

    Returns:
-        DatasetInfo: The typed dataset information object.
+        dict: The dataset information dictionary.
    """
-    raw = load_json(local_dir / INFO_PATH)
-    return DatasetInfo.from_dict(raw)
+    info = load_json(local_dir / INFO_PATH)
+    for ft in info["features"].values():
+        ft["shape"] = tuple(ft["shape"])
+    return info


 def write_stats(stats: dict, local_dir: Path) -> None:
@@ -56,7 +56,6 @@ class LeRobotDataset(torch.utils.data.Dataset):
        force_cache_sync: bool = False,
        download_videos: bool = True,
        video_backend: str | None = None,
-        return_uint8: bool = False,
        batch_encoding_size: int = 1,
        vcodec: str = "libsvtav1",
        streaming_encoding: bool = False,
@@ -203,7 +202,6 @@ class LeRobotDataset(torch.utils.data.Dataset):
        self.tolerance_s = tolerance_s
        self.revision = revision if revision else CODEBASE_VERSION
        self._video_backend = video_backend if video_backend else get_safe_default_codec()
-        self._return_uint8 = return_uint8
        self._batch_encoding_size = batch_encoding_size
        self._vcodec = resolve_vcodec(vcodec)
        self._encoder_threads = encoder_threads
@@ -227,7 +225,6 @@ class LeRobotDataset(torch.utils.data.Dataset):
            video_backend=self._video_backend,
            delta_timestamps=delta_timestamps,
            image_transforms=image_transforms,
-            return_uint8=self._return_uint8,
        )

        # Load actual data
@@ -291,7 +288,6 @@ class LeRobotDataset(torch.utils.data.Dataset):
                video_backend=self._video_backend,
                delta_timestamps=self.delta_timestamps,
                image_transforms=self.image_transforms,
-                return_uint8=self._return_uint8,
            )
        return self.reader

@@ -630,8 +626,6 @@ class LeRobotDataset(torch.utils.data.Dataset):
        streaming_encoding: bool = False,
        encoder_queue_maxsize: int = 30,
        encoder_threads: int | None = None,
-        video_files_size_in_mb: int | None = None,
-        data_files_size_in_mb: int | None = None,
    ) -> "LeRobotDataset":
        """Create a new LeRobotDataset from scratch for recording data.

@@ -679,8 +673,6 @@ class LeRobotDataset(torch.utils.data.Dataset):
            root=root,
            use_videos=use_videos,
            metadata_buffer_size=metadata_buffer_size,
-            video_files_size_in_mb=video_files_size_in_mb,
-            data_files_size_in_mb=data_files_size_in_mb,
        )
        obj.repo_id = obj.meta.repo_id
        obj._requested_root = obj.meta.root
@@ -691,7 +683,6 @@ class LeRobotDataset(torch.utils.data.Dataset):
        obj.delta_timestamps = None
        obj.episodes = None
        obj._video_backend = video_backend if video_backend is not None else get_safe_default_codec()
-        obj._return_uint8 = False
        obj._batch_encoding_size = batch_encoding_size
        obj._vcodec = vcodec
        obj._encoder_threads = encoder_threads
@@ -784,7 +775,6 @@ class LeRobotDataset(torch.utils.data.Dataset):
        obj.delta_timestamps = None
        obj.episodes = None
        obj._video_backend = video_backend if video_backend else get_safe_default_codec()
-        obj._return_uint8 = False
        obj._batch_encoding_size = batch_encoding_size
        obj._vcodec = vcodec
        obj._encoder_threads = encoder_threads
@@ -123,7 +123,7 @@ class MultiLeRobotDataset(torch.utils.data.Dataset):

        NOTE: For now, this relies on a check in __init__ to make sure all sub-datasets have the same info.
        """
-        return self._datasets[0].meta.info.fps
+        return self._datasets[0].meta.info["fps"]

    @property
    def video(self) -> bool:
@@ -133,7 +133,7 @@ class MultiLeRobotDataset(torch.utils.data.Dataset):

        NOTE: For now, this relies on a check in __init__ to make sure all sub-datasets have the same info.
        """
-        return len(self._datasets[0].meta.video_keys) > 0
+        return self._datasets[0].meta.info.get("video", False)

    @property
    def features(self) -> datasets.Features:
@@ -251,7 +251,6 @@ class StreamingLeRobotDataset(torch.utils.data.IterableDataset):
        seed: int = 42,
        rng: np.random.Generator | None = None,
        shuffle: bool = True,
-        return_uint8: bool = False,
    ):
        """Initialize a StreamingLeRobotDataset.

@@ -289,7 +288,6 @@ class StreamingLeRobotDataset(torch.utils.data.IterableDataset):

        self.streaming = streaming
        self.buffer_size = buffer_size
-        self._return_uint8 = return_uint8

        # We cache the video decoders to avoid re-initializing them at each frame (avoiding a ~10x slowdown)
        self.video_decoder_cache = None
@@ -434,7 +432,7 @@ class StreamingLeRobotDataset(torch.utils.data.IterableDataset):

    def _make_padding_camera_frame(self, camera_key: str):
        """Variable-shape padding frame for given camera keys, given in (H, W, C)"""
-        return torch.zeros(self.meta.info.features[camera_key]["shape"]).permute(-1, 0, 1)
+        return torch.zeros(self.meta.info["features"][camera_key]["shape"]).permute(-1, 0, 1)

    def _get_video_frame_padding_mask(
        self,
@@ -555,11 +553,7 @@ class StreamingLeRobotDataset(torch.utils.data.IterableDataset):
            root = self.meta.url_root if self.streaming and not self.streaming_from_local else self.root
            video_path = f"{root}/{self.meta.get_video_file_path(ep_idx, video_key)}"
            frames = decode_video_frames_torchcodec(
-                video_path,
-                query_ts,
-                self.tolerance_s,
-                decoder_cache=self.video_decoder_cache,
-                return_uint8=self._return_uint8,
+                video_path, query_ts, self.tolerance_s, decoder_cache=self.video_decoder_cache
            )

            item[video_key] = frames.squeeze(0) if len(query_ts) == 1 else frames
@@ -14,11 +14,9 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 import contextlib
-import dataclasses
 import importlib.resources
 import json
 import logging
-from dataclasses import dataclass, field
 from pathlib import Path

 import datasets
@@ -72,9 +70,6 @@ class ForwardCompatibilityError(CompatibilityError):
        super().__init__(message)


-logger = logging.getLogger(__name__)
-
-
 DEFAULT_CHUNK_SIZE = 1000  # Max number of files per chunk
 DEFAULT_DATA_FILE_SIZE_IN_MB = 100  # Max size per file
 DEFAULT_VIDEO_FILE_SIZE_IN_MB = 200  # Max size per file
@@ -99,123 +94,6 @@ LEGACY_EPISODES_STATS_PATH = "meta/episodes_stats.jsonl"
 LEGACY_TASKS_PATH = "meta/tasks.jsonl"


-@dataclass
-class DatasetInfo:
-    """Typed representation of the ``meta/info.json`` file for a LeRobot dataset.
-
-    Replaces the previously untyped ``dict`` returned by ``load_info()`` and
-    created by ``create_empty_dataset_info()``.  Using a dataclass provides
-    explicit field definitions, IDE auto-completion, and validation at
-    construction time.
-    """
-
-    codebase_version: str
-    fps: int
-    features: dict[str, dict]
-
-    # Episode / frame counters — start at zero for new datasets
-    total_episodes: int = 0
-    total_frames: int = 0
-    total_tasks: int = 0
-
-    # Storage settings
-    chunks_size: int = field(default=DEFAULT_CHUNK_SIZE)
-    data_files_size_in_mb: int = field(default=DEFAULT_DATA_FILE_SIZE_IN_MB)
-    video_files_size_in_mb: int = field(default=DEFAULT_VIDEO_FILE_SIZE_IN_MB)
-
-    # File path templates
-    data_path: str = field(default=DEFAULT_DATA_PATH)
-    video_path: str | None = field(default=DEFAULT_VIDEO_PATH)
-
-    # Optional metadata
-    robot_type: str | None = None
-    splits: dict[str, str] = field(default_factory=dict)
-
-    def __post_init__(self) -> None:
-        # Coerce feature shapes from list to tuple — JSON deserialisation
-        # returns lists, but the rest of the codebase expects tuples.
-        for ft in self.features.values():
-            if isinstance(ft.get("shape"), list):
-                ft["shape"] = tuple(ft["shape"])
-
-        if self.fps <= 0:
-            raise ValueError(f"fps must be positive, got {self.fps}")
-        if self.chunks_size <= 0:
-            raise ValueError(f"chunks_size must be positive, got {self.chunks_size}")
-        if self.data_files_size_in_mb <= 0:
-            raise ValueError(f"data_files_size_in_mb must be positive, got {self.data_files_size_in_mb}")
-        if self.video_files_size_in_mb <= 0:
-            raise ValueError(f"video_files_size_in_mb must be positive, got {self.video_files_size_in_mb}")
-
-    def to_dict(self) -> dict:
-        """Return a JSON-serialisable dict.
-
-        Converts tuple shapes back to lists so ``json.dump`` can handle them.
-        """
-        d = dataclasses.asdict(self)
-        for ft in d["features"].values():
-            if isinstance(ft.get("shape"), tuple):
-                ft["shape"] = list(ft["shape"])
-        return d
-
-    @classmethod
-    def from_dict(cls, data: dict) -> "DatasetInfo":
-        """Construct from a raw dict (e.g. loaded directly from JSON).
-
-        Unknown keys are ignored for forward compatibility with datasets that
-        carry additional fields (e.g. ``total_videos`` from v2.x). A warning is
-        logged when such fields are present.
-        """
-        known = {f.name for f in dataclasses.fields(cls)}
-        unknown = sorted(k for k in data if k not in known)
-        if unknown:
-            logger.warning(f"Unknown fields in DatasetInfo: {unknown}. These will be ignored.")
-        return cls(**{k: v for k, v in data.items() if k in known})
-
-    # ---------------------------------------------------------------------------
-    # Temporary dict-style compatibility layer
-    # Allows existing ``info["key"]`` call-sites to keep working without changes.
-    # Once all callers have been migrated to attribute access, remove these.
-    # ---------------------------------------------------------------------------
-    def __getitem__(self, key: str):
-        import warnings
-
-        warnings.warn(
-            f"Accessing DatasetInfo with dict-style syntax info['{key}'] is deprecated. "
-            f"Use attribute access info.{key} instead.",
-            DeprecationWarning,
-            stacklevel=2,
-        )
-        try:
-            return getattr(self, key)
-        except AttributeError as err:
-            raise KeyError(key) from err
-
-    def __setitem__(self, key: str, value) -> None:
-        import warnings
-
-        warnings.warn(
-            f"Setting DatasetInfo with dict-style syntax info['{key}'] = ... is deprecated. "
-            f"Use attribute assignment info.{key} = ... instead.",
-            DeprecationWarning,
-            stacklevel=2,
-        )
-        if not hasattr(self, key):
-            raise KeyError(f"DatasetInfo has no field '{key}'")
-        setattr(self, key, value)
-
-    def __contains__(self, key: str) -> bool:
-        """Check if a field exists (dict-like interface)."""
-        return hasattr(self, key)
-
-    def get(self, key: str, default=None):
-        """Get attribute value with default fallback (dict-like interface)."""
-        try:
-            return getattr(self, key)
-        except AttributeError:
-            return default
-
-
 def has_legacy_hub_download_metadata(root: Path) -> bool:
    """Return ``True`` when *root* looks like a legacy Hub ``local_dir`` mirror.

@@ -416,7 +294,7 @@ def create_branch(repo_id: str, *, branch: str, repo_type: str | None = None) ->

 def create_lerobot_dataset_card(
    tags: list | None = None,
-    dataset_info: DatasetInfo | None = None,
+    dataset_info: dict | None = None,
    **kwargs,
 ) -> DatasetCard:
    """Create a `DatasetCard` for a LeRobot dataset.
@@ -427,7 +305,7 @@ def create_lerobot_dataset_card(

    Args:
        tags (list | None): A list of tags to add to the dataset card.
-        dataset_info (DatasetInfo | None): The dataset's info object, which will
+        dataset_info (dict | None): The dataset's info dictionary, which will
            be displayed on the card.
        **kwargs: Additional keyword arguments to populate the card template.

@@ -440,7 +318,7 @@ def create_lerobot_dataset_card(
        card_tags += tags
    if dataset_info:
        dataset_structure = "[meta/info.json](meta/info.json):\n"
-        dataset_structure += f"```json\n{json.dumps(dataset_info.to_dict(), indent=4)}\n```\n"
+        dataset_structure += f"```json\n{json.dumps(dataset_info, indent=4)}\n```\n"
        kwargs = {**kwargs, "dataset_structure": dataset_structure}
    card_data = DatasetCardData(
        license=kwargs.get("license"),
@@ -123,7 +123,6 @@ def decode_video_frames(
    timestamps: list[float],
    tolerance_s: float,
    backend: str | None = None,
-    return_uint8: bool = False,
 ) -> torch.Tensor:
    """
    Decodes video frames using the specified backend.
@@ -132,23 +131,19 @@ def decode_video_frames(
        video_path (Path): Path to the video file.
        timestamps (list[float]): List of timestamps to extract frames.
        tolerance_s (float): Allowed deviation in seconds for frame retrieval.
-        backend (str, optional): Backend to use for decoding. Defaults to "torchcodec" when available in the platform; otherwise, defaults to "pyav".
-        return_uint8 (bool): If True, return raw uint8 frames without float32 normalization.
-            This reduces memory for DataLoader IPC; normalization can be done on GPU afterward.
+        backend (str, optional): Backend to use for decoding. Defaults to "torchcodec" when available in the platform; otherwise, defaults to "pyav".

    Returns:
-        torch.Tensor: Decoded frames (float32 in [0,1] by default, or uint8 if return_uint8=True).
+        torch.Tensor: Decoded frames.

    Currently supports torchcodec on cpu and pyav.
    """
    if backend is None:
        backend = get_safe_default_codec()
    if backend == "torchcodec":
-        return decode_video_frames_torchcodec(video_path, timestamps, tolerance_s, return_uint8=return_uint8)
+        return decode_video_frames_torchcodec(video_path, timestamps, tolerance_s)
    elif backend in ["pyav", "video_reader"]:
-        return decode_video_frames_torchvision(
-            video_path, timestamps, tolerance_s, backend, return_uint8=return_uint8
-        )
+        return decode_video_frames_torchvision(video_path, timestamps, tolerance_s, backend)
    else:
        raise ValueError(f"Unsupported video backend: {backend}")

@@ -159,7 +154,6 @@ def decode_video_frames_torchvision(
    tolerance_s: float,
    backend: str = "pyav",
    log_loaded_timestamps: bool = False,
-    return_uint8: bool = False,
 ) -> torch.Tensor:
    """Loads frames associated to the requested timestamps of a video

@@ -246,17 +240,14 @@ def decode_video_frames_torchvision(
    if log_loaded_timestamps:
        logger.info(f"{closest_ts=}")

+    # convert to the pytorch format which is float32 in [0,1] range (and channel first)
+    closest_frames = closest_frames.type(torch.float32) / 255
+
    if len(timestamps) != len(closest_frames):
        raise FrameTimestampError(
            f"Number of retrieved frames ({len(closest_frames)}) does not match "
            f"number of queried timestamps ({len(timestamps)})"
        )
-
-    if return_uint8:
-        return closest_frames
-
-    # convert to the pytorch format which is float32 in [0,1] range (and channel first)
-    closest_frames = closest_frames.type(torch.float32) / 255
    return closest_frames


@@ -282,11 +273,7 @@ class VideoDecoderCache:
        with self._lock:
            if video_path not in self._cache:
                file_handle = fsspec.open(video_path).__enter__()
-                try:
-                    decoder = VideoDecoder(file_handle, seek_mode="approximate")
-                except Exception:
-                    file_handle.close()
-                    raise
+                decoder = VideoDecoder(file_handle, seek_mode="approximate")
                self._cache[video_path] = (decoder, file_handle)

            return self._cache[video_path][0]
@@ -319,7 +306,6 @@ def decode_video_frames_torchcodec(
    tolerance_s: float,
    log_loaded_timestamps: bool = False,
    decoder_cache: VideoDecoderCache | None = None,
-    return_uint8: bool = False,
 ) -> torch.Tensor:
    """Loads frames associated with the requested timestamps of a video using torchcodec.

@@ -387,16 +373,14 @@ def decode_video_frames_torchcodec(
    if log_loaded_timestamps:
        logger.info(f"{closest_ts=}")

+    # convert to float32 in [0,1] range
+    closest_frames = (closest_frames / 255.0).type(torch.float32)
+
    if not len(timestamps) == len(closest_frames):
        raise FrameTimestampError(
            f"Retrieved timestamps differ from queried {set(closest_frames) - set(timestamps)}"
        )

-    if return_uint8:
-        return closest_frames
-
-    # convert to float32 in [0,1] range
-    closest_frames = (closest_frames / 255.0).type(torch.float32)
    return closest_frames


@@ -331,7 +331,6 @@ class LiberoEnv(EnvConfig):
    camera_name_mapping: dict[str, str] | None = None
    observation_height: int = 360
    observation_width: int = 360
-    is_libero_plus: bool = False
    features: dict[str, PolicyFeature] = field(
        default_factory=lambda: {
            ACTION: PolicyFeature(type=FeatureType.ACTION, shape=(7,)),
@@ -433,7 +432,6 @@ class LiberoEnv(EnvConfig):
            control_mode=self.control_mode,
            episode_length=self.episode_length,
            camera_name_mapping=self.camera_name_mapping,
-            is_libero_plus=self.is_libero_plus,
        )

    def get_env_processors(self):
@@ -498,146 +496,6 @@ class MetaworldEnv(EnvConfig):
        )


-@EnvConfig.register_subclass("robocasa")
-@dataclass
-class RoboCasaEnv(EnvConfig):
-    task: str = "CloseFridge"
-    fps: int = 20
-    episode_length: int = 1000
-    obs_type: str = "pixels_agent_pos"
-    render_mode: str = "rgb_array"
-    camera_name: str = "robot0_agentview_left,robot0_eye_in_hand,robot0_agentview_right"
-    observation_height: int = 256
-    observation_width: int = 256
-    visualization_height: int = 512
-    visualization_width: int = 512
-    split: str | None = None
-    # Object-mesh registries to sample from. Upstream default is
-    # ("objaverse", "lightwheel"), but objaverse is ~30GB and the CI image
-    # only ships the lightwheel pack. Override to include objaverse once
-    # you've run `python -m robocasa.scripts.download_kitchen_assets
-    # --type objaverse` locally.
-    obj_registries: list[str] = field(default_factory=lambda: ["lightwheel"])
-    features: dict[str, PolicyFeature] = field(
-        default_factory=lambda: {ACTION: PolicyFeature(type=FeatureType.ACTION, shape=(12,))}
-    )
-    features_map: dict[str, str] = field(default_factory=lambda: {ACTION: ACTION, "agent_pos": OBS_STATE})
-
-    def __post_init__(self):
-        if self.obs_type not in ("pixels", "pixels_agent_pos"):
-            raise ValueError(f"Unsupported obs_type: {self.obs_type}")
-
-        # Preserve raw RoboCasa camera names end-to-end (e.g.
-        # `observation.images.robot0_agentview_left`). This matches the
-        # naming convention used by the RoboCasa datasets on the Hub, so
-        # trained policies don't need a `--rename_map` at eval time.
-        cams = [c.strip() for c in self.camera_name.split(",") if c.strip()]
-        for cam in cams:
-            self.features[f"pixels/{cam}"] = PolicyFeature(
-                type=FeatureType.VISUAL,
-                shape=(self.observation_height, self.observation_width, 3),
-            )
-            self.features_map[f"pixels/{cam}"] = f"{OBS_IMAGES}.{cam}"
-
-        if self.obs_type == "pixels_agent_pos":
-            self.features["agent_pos"] = PolicyFeature(type=FeatureType.STATE, shape=(16,))
-
-    @property
-    def gym_kwargs(self) -> dict:
-        kwargs: dict[str, Any] = {
-            "obs_type": self.obs_type,
-            "render_mode": self.render_mode,
-            "observation_height": self.observation_height,
-            "observation_width": self.observation_width,
-            "visualization_height": self.visualization_height,
-            "visualization_width": self.visualization_width,
-        }
-        if self.split is not None:
-            kwargs["split"] = self.split
-        return kwargs
-
-    def create_envs(self, n_envs: int, use_async_envs: bool = False):
-        from .robocasa import create_robocasa_envs
-
-        if self.task is None:
-            raise ValueError("RoboCasaEnv requires a task to be specified")
-        env_cls = _make_vec_env_cls(use_async_envs, n_envs)
-        return create_robocasa_envs(
-            task=self.task,
-            n_envs=n_envs,
-            camera_name=self.camera_name,
-            gym_kwargs=self.gym_kwargs,
-            env_cls=env_cls,
-            episode_length=self.episode_length,
-            obj_registries=tuple(self.obj_registries),
-        )
-
-
-@EnvConfig.register_subclass("vlabench")
-@dataclass
-class VLABenchEnv(EnvConfig):
-    task: str = "select_fruit"
-    fps: int = 10
-    episode_length: int = 500
-    obs_type: str = "pixels_agent_pos"
-    render_mode: str = "rgb_array"
-    render_resolution: tuple[int, int] = (480, 480)
-    robot: str = "franka"
-    action_mode: str = "eef"
-    features: dict[str, PolicyFeature] = field(
-        default_factory=lambda: {
-            ACTION: PolicyFeature(type=FeatureType.ACTION, shape=(7,)),
-        }
-    )
-    features_map: dict[str, str] = field(
-        default_factory=lambda: {
-            ACTION: ACTION,
-            "agent_pos": OBS_STATE,
-            "pixels/image": f"{OBS_IMAGES}.image",
-            "pixels/second_image": f"{OBS_IMAGES}.second_image",
-            "pixels/wrist_image": f"{OBS_IMAGES}.wrist_image",
-        }
-    )
-
-    def __post_init__(self):
-        h, w = self.render_resolution
-        if self.obs_type == "pixels":
-            self.features["pixels/image"] = PolicyFeature(type=FeatureType.VISUAL, shape=(h, w, 3))
-            self.features["pixels/second_image"] = PolicyFeature(type=FeatureType.VISUAL, shape=(h, w, 3))
-            self.features["pixels/wrist_image"] = PolicyFeature(type=FeatureType.VISUAL, shape=(h, w, 3))
-        elif self.obs_type == "pixels_agent_pos":
-            self.features["pixels/image"] = PolicyFeature(type=FeatureType.VISUAL, shape=(h, w, 3))
-            self.features["pixels/second_image"] = PolicyFeature(type=FeatureType.VISUAL, shape=(h, w, 3))
-            self.features["pixels/wrist_image"] = PolicyFeature(type=FeatureType.VISUAL, shape=(h, w, 3))
-            self.features["agent_pos"] = PolicyFeature(type=FeatureType.STATE, shape=(7,))
-        else:
-            raise ValueError(f"Unsupported obs_type: {self.obs_type}")
-
-    @property
-    def gym_kwargs(self) -> dict:
-        return {
-            "obs_type": self.obs_type,
-            "render_mode": self.render_mode,
-            "render_resolution": self.render_resolution,
-            "robot": self.robot,
-            "max_episode_steps": self.episode_length,
-            "action_mode": self.action_mode,
-        }
-
-    def create_envs(self, n_envs: int, use_async_envs: bool = False):
-        from .vlabench import create_vlabench_envs
-
-        if self.task is None:
-            raise ValueError("VLABenchEnv requires a task to be specified")
-        env_cls = _make_vec_env_cls(use_async_envs, n_envs)
-        return create_vlabench_envs(
-            task=self.task,
-            n_envs=n_envs,
-            gym_kwargs=self.gym_kwargs,
-            env_cls=env_cls,
-        )
-
-
 @EnvConfig.register_subclass("isaaclab_arena")
 @dataclass
 class IsaaclabArenaEnv(HubEnvConfig):
@@ -716,171 +574,3 @@ class IsaaclabArenaEnv(HubEnvConfig):
            ),
            PolicyProcessorPipeline(steps=[]),
        )
-
-
-@EnvConfig.register_subclass("libero_plus")
-@dataclass
-class LiberoPlusEnv(LiberoEnv):
-    """Config for LIBERO-plus robustness benchmark evaluation.
-
-    LIBERO-plus extends LIBERO with 7 perturbation dimensions (camera viewpoints,
-    object layouts, robot initial states, language instructions, lighting, background
-    textures, sensor noise) producing ~10k task variants.
-
-    The gym interface is identical to LIBERO so this class reuses ``LiberoEnv``
-    entirely — only the registered name and default task suite differ.
-
-    Install: see docker/Dockerfile.benchmark.libero_plus — LIBERO-plus ships
-    as a namespace package from a git fork and must be cloned + PYTHONPATH'd
-    rather than installed as a pyproject extra.
-
-    See Also:
-        https://github.com/sylvestf/LIBERO-plus
-    """
-
-    task: str = "libero_spatial"
-    is_libero_plus: bool = True
-
-
-@EnvConfig.register_subclass("robotwin")
-@dataclass
-class RoboTwinEnvConfig(EnvConfig):
-    """Configuration for RoboTwin 2.0 benchmark environments.
-
-    RoboTwin 2.0 is a dual-arm manipulation benchmark with 50 tasks built on the
-    SAPIEN simulator. The robot is an Aloha-AgileX bimanual platform with 14 DOF
-    (7 per arm). All three cameras are enabled by default.
-
-    See: https://robotwin-platform.github.io
-    Dataset: https://huggingface.co/datasets/lerobot/robotwin_unified
-    """
-
-    task: str = "beat_block_hammer"  # single task or comma-separated list
-    fps: int = 25
-    episode_length: int = 300
-    obs_type: str = "pixels_agent_pos"
-    render_mode: str = "rgb_array"
-    # Available cameras from RoboTwin's aloha-agilex embodiment: head_camera
-    # (torso-mounted) + left_camera / right_camera (wrists).
-    camera_names: str = "head_camera,left_camera,right_camera"
-    # Match the D435 dims in task_config/demo_clean.yml (_camera_config.yml).
-    # Gym's vector-env concatenate pre-allocates buffers of this shape, so it
-    # must equal what SAPIEN actually renders.
-    observation_height: int = 240
-    observation_width: int = 320
-    features: dict[str, PolicyFeature] = field(
-        default_factory=lambda: {
-            ACTION: PolicyFeature(type=FeatureType.ACTION, shape=(14,)),
-        }
-    )
-    features_map: dict[str, str] = field(
-        default_factory=lambda: {
-            ACTION: ACTION,
-            "pixels/head_camera": f"{OBS_IMAGES}.head_camera",
-            "pixels/left_camera": f"{OBS_IMAGES}.left_camera",
-            "pixels/right_camera": f"{OBS_IMAGES}.right_camera",
-            "agent_pos": OBS_STATE,
-        }
-    )
-
-    def __post_init__(self):
-        cam_list = [c.strip() for c in self.camera_names.split(",") if c.strip()]
-        for cam in cam_list:
-            self.features[f"pixels/{cam}"] = PolicyFeature(
-                type=FeatureType.VISUAL,
-                shape=(self.observation_height, self.observation_width, 3),
-            )
-            # Keep features_map entry if already set (default_factory); add if missing.
-            key = f"pixels/{cam}"
-            if key not in self.features_map:
-                self.features_map[key] = f"{OBS_IMAGES}.{cam}"
-
-        if self.obs_type == "pixels_agent_pos":
-            self.features["agent_pos"] = PolicyFeature(
-                type=FeatureType.STATE,
-                shape=(14,),  # 14 DOF: 7 per arm
-            )
-        elif self.obs_type != "pixels":
-            raise ValueError(
-                f"Unsupported obs_type '{self.obs_type}'. "
-                "RoboTwinEnvConfig supports 'pixels' and 'pixels_agent_pos'."
-            )
-
-    @property
-    def gym_kwargs(self) -> dict:
-        return {}
-
-    def create_envs(self, n_envs: int, use_async_envs: bool = True):
-        from lerobot.envs.robotwin import create_robotwin_envs
-
-        if not self.task:
-            raise ValueError("RoboTwinEnvConfig requires `task` to be specified.")
-
-        env_cls = _make_vec_env_cls(use_async_envs, n_envs)
-        cam_list = [c.strip() for c in self.camera_names.split(",") if c.strip()]
-        return create_robotwin_envs(
-            task=self.task,
-            n_envs=n_envs,
-            env_cls=env_cls,
-            camera_names=cam_list,
-            observation_height=self.observation_height,
-            observation_width=self.observation_width,
-            episode_length=self.episode_length,
-        )
-
-
-@EnvConfig.register_subclass("robomme")
-@dataclass
-class RoboMMEEnv(EnvConfig):
-    """RoboMME memory-augmented manipulation benchmark (ManiSkill/SAPIEN).
-
-    16 tasks across 4 suites: Counting, Permanence, Reference, Imitation.
-    Dataset: lerobot/robomme (LeRobot v3.0, 1,600 episodes).
-    Benchmark: https://github.com/RoboMME/robomme_benchmark
-
-    Requires the `robomme` git package installed separately (Linux only);
-    see docker/Dockerfile.benchmark.robomme for the canonical install.
-    """
-
-    task: str = "PickXtimes"
-    fps: int = 10
-    episode_length: int = 300
-    action_space: str = "joint_angle"  # or "ee_pose" (7-D)
-    dataset_split: str = "test"  # "train" | "val" | "test"
-    task_ids: list[int] | None = None
-    features: dict[str, PolicyFeature] = field(default_factory=dict)
-    features_map: dict[str, str] = field(
-        default_factory=lambda: {
-            ACTION: ACTION,
-            "pixels/image": f"{OBS_IMAGES}.image",
-            "pixels/wrist_image": f"{OBS_IMAGES}.wrist_image",
-            "agent_pos": OBS_STATE,
-        }
-    )
-
-    def __post_init__(self):
-        action_dim = 8 if self.action_space == "joint_angle" else 7
-        self.features = {
-            ACTION: PolicyFeature(type=FeatureType.ACTION, shape=(action_dim,)),
-            "pixels/image": PolicyFeature(type=FeatureType.VISUAL, shape=(256, 256, 3)),
-            "pixels/wrist_image": PolicyFeature(type=FeatureType.VISUAL, shape=(256, 256, 3)),
-            "agent_pos": PolicyFeature(type=FeatureType.STATE, shape=(8,)),
-        }
-
-    @property
-    def gym_kwargs(self) -> dict:
-        return {}
-
-    def create_envs(self, n_envs: int, use_async_envs: bool = True):
-        from lerobot.envs.robomme import create_robomme_envs
-
-        env_cls = _make_vec_env_cls(use_async_envs, n_envs)
-        return create_robomme_envs(
-            task=self.task,
-            n_envs=n_envs,
-            action_space_type=self.action_space,
-            dataset=self.dataset_split,
-            episode_length=self.episode_length,
-            task_ids=self.task_ids,
-            env_cls=env_cls,
-        )
@@ -16,7 +16,6 @@
 from __future__ import annotations

 import os
-import re
 from collections import defaultdict
 from collections.abc import Callable, Iterable, Mapping, Sequence
 from functools import partial
@@ -32,7 +31,20 @@ from libero.libero.envs import OffScreenRenderEnv

 from lerobot.types import RobotObservation

-from .utils import _LazyAsyncVectorEnv, parse_camera_names
+from .utils import _LazyAsyncVectorEnv
+
+
+def _parse_camera_names(camera_name: str | Sequence[str]) -> list[str]:
+    """Normalize camera_name into a non-empty list of strings."""
+    if isinstance(camera_name, str):
+        cams = [c.strip() for c in camera_name.split(",") if c.strip()]
+    elif isinstance(camera_name, (list | tuple)):
+        cams = [str(c).strip() for c in camera_name if str(c).strip()]
+    else:
+        raise TypeError(f"camera_name must be str or sequence[str], got {type(camera_name).__name__}")
+    if not cams:
+        raise ValueError("camera_name resolved to an empty list.")
+    return cams


 def _get_suite(name: str) -> benchmark.Benchmark:
@@ -57,34 +69,14 @@ def _select_task_ids(total_tasks: int, task_ids: Iterable[int] | None) -> list[i
    return ids


-# LIBERO-plus perturbation variants encode the perturbation in the filename
-# but on disk only the base `.pruned_init` exists — strip the suffix to match
-# LIBERO-plus's own suite.get_task_init_states() (we reimplement it here so we
-# can pass weights_only=False for PyTorch 2.6+ numpy pickles).
-_LIBERO_PERTURBATION_SUFFIX_RE = re.compile(r"_(?:language|view|light)_[^.]*|_(?:table|tb)_\d+")
-
-
-def get_task_init_states(task_suite: Any, i: int, is_libero_plus: bool = False) -> np.ndarray:
-    task = task_suite.tasks[i]
-    filename = Path(task.init_states_file)
-    root = Path(get_libero_path("init_states"))
-
-    if not is_libero_plus:
-        init_states_path = root / task.problem_folder / filename.name
-        return torch.load(init_states_path, weights_only=False)  # nosec B614
-
-    # LIBERO-plus: `_add_` / `_level` variants store extra-object layouts under
-    # libero_newobj/ as a flat array that must be reshaped to (1, -1).
-    if "_add_" in filename.name or "_level" in filename.name:
-        init_states_path = root / "libero_newobj" / task.problem_folder / filename.name
-        init_states = torch.load(init_states_path, weights_only=False)  # nosec B614
-        return init_states.reshape(1, -1)
-
-    # LIBERO-plus perturbation variants encode the perturbation in the filename
-    # but on disk only the base `.pruned_init` exists — strip the suffix to match.
-    stripped = _LIBERO_PERTURBATION_SUFFIX_RE.sub("", filename.stem) + filename.suffix
-    init_states_path = root / task.problem_folder / stripped
-    return torch.load(init_states_path, weights_only=False)  # nosec B614
+def get_task_init_states(task_suite: Any, i: int) -> np.ndarray:
+    init_states_path = (
+        Path(get_libero_path("init_states"))
+        / task_suite.tasks[i].problem_folder
+        / task_suite.tasks[i].init_states_file
+    )
+    init_states = torch.load(init_states_path, weights_only=False)  # nosec B614
+    return init_states


 def get_libero_dummy_action():
@@ -126,11 +118,9 @@ class LiberoEnv(gym.Env):
        camera_name_mapping: dict[str, str] | None = None,
        num_steps_wait: int = 10,
        control_mode: str = "relative",
-        is_libero_plus: bool = False,
    ):
        super().__init__()
        self.task_id = task_id
-        self.is_libero_plus = is_libero_plus
        self.obs_type = obs_type
        self.render_mode = render_mode
        self.observation_width = observation_width
@@ -138,7 +128,7 @@ class LiberoEnv(gym.Env):
        self.visualization_width = visualization_width
        self.visualization_height = visualization_height
        self.init_states = init_states
-        self.camera_name = parse_camera_names(
+        self.camera_name = _parse_camera_names(
            camera_name
        )  # agentview_image (main) or robot0_eye_in_hand_image (wrist)

@@ -157,11 +147,7 @@ class LiberoEnv(gym.Env):
        self.episode_index = episode_index
        self.episode_length = episode_length
        # Load once and keep
-        self._init_states = (
-            get_task_init_states(task_suite, self.task_id, is_libero_plus=self.is_libero_plus)
-            if self.init_states
-            else None
-        )
+        self._init_states = get_task_init_states(task_suite, self.task_id) if self.init_states else None
        self._reset_stride = n_envs  # when performing a reset, append `_reset_stride` to `init_state_id`.

        self.init_state_id = self.episode_index  # tie each sub-env to a fixed init state
@@ -394,7 +380,6 @@ def _make_env_fns(
    gym_kwargs: Mapping[str, Any],
    control_mode: str,
    camera_name_mapping: dict[str, str] | None = None,
-    is_libero_plus: bool = False,
 ) -> list[Callable[[], LiberoEnv]]:
    """Build n_envs factory callables for a single (suite, task_id)."""

@@ -411,7 +396,6 @@ def _make_env_fns(
            n_envs=n_envs,
            control_mode=control_mode,
            camera_name_mapping=camera_name_mapping,
-            is_libero_plus=is_libero_plus,
            **local_kwargs,
        )

@@ -434,7 +418,6 @@ def create_libero_envs(
    control_mode: str = "relative",
    episode_length: int | None = None,
    camera_name_mapping: dict[str, str] | None = None,
-    is_libero_plus: bool = False,
 ) -> dict[str, dict[int, Any]]:
    """
    Create vectorized LIBERO environments with a consistent return shape.
@@ -454,7 +437,7 @@ def create_libero_envs(
    gym_kwargs = dict(gym_kwargs or {})
    task_ids_filter = gym_kwargs.pop("task_ids", None)  # optional: limit to specific tasks

-    camera_names = parse_camera_names(camera_name)
+    camera_names = _parse_camera_names(camera_name)
    suite_names = [s.strip() for s in str(task).split(",") if s.strip()]
    if not suite_names:
        raise ValueError("`task` must contain at least one LIBERO suite name.")
@@ -479,7 +462,6 @@ def create_libero_envs(
        # Probe once and reuse to avoid creating a temp env per task.
        cached_obs_space: spaces.Space | None = None
        cached_act_space: spaces.Space | None = None
-        cached_metadata: dict[str, Any] | None = None

        for tid in selected:
            fns = _make_env_fns(
@@ -493,14 +475,12 @@ def create_libero_envs(
                gym_kwargs=gym_kwargs,
                control_mode=control_mode,
                camera_name_mapping=camera_name_mapping,
-                is_libero_plus=is_libero_plus,
            )
            if is_async:
-                lazy = _LazyAsyncVectorEnv(fns, cached_obs_space, cached_act_space, cached_metadata)
+                lazy = _LazyAsyncVectorEnv(fns, cached_obs_space, cached_act_space)
                if cached_obs_space is None:
                    cached_obs_space = lazy.observation_space
                    cached_act_space = lazy.action_space
-                    cached_metadata = lazy.metadata
                out[suite_name][tid] = lazy
            else:
                out[suite_name][tid] = env_cls(fns)
@@ -311,7 +311,6 @@ def create_metaworld_envs(
    is_async = env_cls is gym.vector.AsyncVectorEnv
    cached_obs_space = None
    cached_act_space = None
-    cached_metadata = None
    out: dict[str, dict[int, Any]] = defaultdict(dict)

    for group in task_groups:
@@ -325,11 +324,10 @@ def create_metaworld_envs(
            fns = [(lambda tn=task_name: MetaworldEnv(task=tn, **gym_kwargs)) for _ in range(n_envs)]

            if is_async:
-                lazy = _LazyAsyncVectorEnv(fns, cached_obs_space, cached_act_space, cached_metadata)
+                lazy = _LazyAsyncVectorEnv(fns, cached_obs_space, cached_act_space)
                if cached_obs_space is None:
                    cached_obs_space = lazy.observation_space
                    cached_act_space = lazy.action_space
-                    cached_metadata = lazy.metadata
                out[group][tid] = lazy
            else:
                out[group][tid] = env_cls(fns)
@@ -1,425 +0,0 @@
-#!/usr/bin/env python
-
-# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-from __future__ import annotations
-
-import logging
-from collections import defaultdict
-from collections.abc import Callable, Sequence
-from functools import partial
-from typing import Any
-
-import gymnasium as gym
-import numpy as np
-from gymnasium import spaces
-
-from lerobot.types import RobotObservation
-
-from .utils import _LazyAsyncVectorEnv, parse_camera_names
-
-logger = logging.getLogger(__name__)
-
-# Dimensions for the flat action/state vectors used by the LeRobot wrapper.
-# These correspond to the PandaOmron robot in RoboCasa365.
-OBS_STATE_DIM = 16  # base_pos(3) + base_quat(4) + ee_pos_rel(3) + ee_quat_rel(4) + gripper_qpos(2)
-ACTION_DIM = 12  # base_motion(4) + control_mode(1) + ee_pos(3) + ee_rot(3) + gripper(1)
-ACTION_LOW = -1.0
-ACTION_HIGH = 1.0
-
-# Default PandaOmron cameras. We surface these raw names directly as
-# `observation.images.<name>` so the LeRobot dataset/policy keys match
-# RoboCasa's native convention (no implicit renaming).
-DEFAULT_CAMERAS = [
-    "robot0_agentview_left",
-    "robot0_eye_in_hand",
-    "robot0_agentview_right",
-]
-
-# Object-mesh registries to sample from. RoboCasa's upstream default is
-# ("objaverse", "lightwheel"), but the objaverse pack is huge (~30GB) and
-# most users — including our CI image — only download the lightwheel pack
-# (`--type objs_lw` in `download_kitchen_assets`). When a sampled object
-# category has zero candidates in every registry, robocasa crashes with
-# `ValueError: Probabilities contain NaN` (0/0 divide in the probability
-# normalization). Restricting to registries that are actually on disk
-# avoids the NaN and matches what the asset download provides.
-DEFAULT_OBJ_REGISTRIES: tuple[str, ...] = ("lightwheel",)
-
-# Task-group shortcuts accepted as `--env.task`. When the user passes one of
-# these names, we expand it to the upstream RoboCasa task list and auto-set
-# the dataset split. Individual task names (optionally comma-separated) still
-# take precedence; this only triggers on an exact group-name match.
-_TASK_GROUP_SPLITS = {
-    "atomic_seen": "target",
-    "composite_seen": "target",
-    "composite_unseen": "target",
-    "pretrain50": "pretrain",
-    "pretrain100": "pretrain",
-    "pretrain200": "pretrain",
-    "pretrain300": "pretrain",
-}
-
-
-def _resolve_tasks(task: str) -> tuple[list[str], str | None]:
-    """Resolve a `--env.task` value to (task_names, split_override).
-
-    If `task` is a known task-group name (e.g. `atomic_seen`, `pretrain100`),
-    expand it via `robocasa.utils.dataset_registry.{TARGET,PRETRAINING}_TASKS`
-    and return the matching split. Otherwise treat `task` as a single task or
-    comma-separated list and leave the split untouched (None).
-    """
-    key = task.strip()
-    if key in _TASK_GROUP_SPLITS:
-        from robocasa.utils.dataset_registry import PRETRAINING_TASKS, TARGET_TASKS
-
-        combined = {**TARGET_TASKS, **PRETRAINING_TASKS}
-        if key not in combined:
-            raise ValueError(
-                f"Task group '{key}' is not available in this version of robocasa. "
-                f"Known groups: {sorted(combined.keys())}."
-            )
-        return list(combined[key]), _TASK_GROUP_SPLITS[key]
-
-    names = [t.strip() for t in task.split(",") if t.strip()]
-    if not names:
-        raise ValueError("`task` must contain at least one RoboCasa task name.")
-    return names, None
-
-
-def convert_action(flat_action: np.ndarray) -> dict[str, Any]:
-    """Split a flat (12,) action vector into a RoboCasa action dict.
-
-    Layout: base_motion(4) + control_mode(1) + ee_pos(3) + ee_rot(3) + gripper(1)
-    """
-    return {
-        "action.base_motion": flat_action[0:4],
-        "action.control_mode": flat_action[4:5],
-        "action.end_effector_position": flat_action[5:8],
-        "action.end_effector_rotation": flat_action[8:11],
-        "action.gripper_close": flat_action[11:12],
-    }
-
-
-class RoboCasaEnv(gym.Env):
-    """LeRobot gym.Env wrapper for RoboCasa365 kitchen environments.
-
-    Wraps RoboCasaGymEnv from the robocasa package and converts its
-    dict-based observations and actions into the flat arrays LeRobot expects.
-    Raw RoboCasa camera names are preserved verbatim under `pixels/<cam>`.
-    """
-
-    metadata = {"render_modes": ["rgb_array"], "render_fps": 20}
-
-    def __init__(
-        self,
-        task: str,
-        camera_name: str | Sequence[str] = ",".join(DEFAULT_CAMERAS),
-        obs_type: str = "pixels_agent_pos",
-        render_mode: str = "rgb_array",
-        observation_width: int = 256,
-        observation_height: int = 256,
-        visualization_width: int = 512,
-        visualization_height: int = 512,
-        split: str | None = None,
-        episode_length: int | None = None,
-        obj_registries: Sequence[str] = DEFAULT_OBJ_REGISTRIES,
-        episode_index: int = 0,
-    ):
-        super().__init__()
-        self.task = task
-        self.obs_type = obs_type
-        self.render_mode = render_mode
-        self.observation_width = observation_width
-        self.observation_height = observation_height
-        self.visualization_width = visualization_width
-        self.visualization_height = visualization_height
-        self.split = split
-        self.obj_registries = tuple(obj_registries)
-        # Per-worker index (0..n_envs-1) used to spread the user-provided
-        # seed across factories so each sub-env explores a distinct layout
-        # even when the same seed is passed to `reset()`.
-        self.episode_index = int(episode_index)
-
-        self.camera_name = parse_camera_names(camera_name)
-
-        self._max_episode_steps = episode_length if episode_length is not None else 1000
-
-        # Deferred — created on first reset() inside the worker subprocess
-        # to avoid inheriting stale GPU/EGL contexts across fork().
-        self._env: Any = None
-        self.task_description = ""
-
-        images = {
-            cam: spaces.Box(
-                low=0,
-                high=255,
-                shape=(self.observation_height, self.observation_width, 3),
-                dtype=np.uint8,
-            )
-            for cam in self.camera_name
-        }
-
-        if self.obs_type == "pixels":
-            self.observation_space = spaces.Dict({"pixels": spaces.Dict(images)})
-        elif self.obs_type == "pixels_agent_pos":
-            self.observation_space = spaces.Dict(
-                {
-                    "pixels": spaces.Dict(images),
-                    "agent_pos": spaces.Box(
-                        low=-np.inf,
-                        high=np.inf,
-                        shape=(OBS_STATE_DIM,),
-                        dtype=np.float32,
-                    ),
-                }
-            )
-        else:
-            raise ValueError(f"Unsupported obs_type '{self.obs_type}'. Use 'pixels' or 'pixels_agent_pos'.")
-
-        self.action_space = spaces.Box(
-            low=ACTION_LOW,
-            high=ACTION_HIGH,
-            shape=(ACTION_DIM,),
-            dtype=np.float32,
-        )
-
-    def _ensure_env(self) -> None:
-        """Create the underlying RoboCasaGymEnv on first use.
-
-        Called inside the worker subprocess after fork(), so each worker gets
-        its own clean rendering context rather than inheriting a stale one from
-        the parent process (which causes crashes with AsyncVectorEnv).
-        """
-        if self._env is not None:
-            return
-        from robocasa.wrappers.gym_wrapper import RoboCasaGymEnv
-
-        # RoboCasaGymEnv defaults split="test", which create_env rejects
-        # (only None/"all"/"pretrain"/"target" are valid). Always pass a
-        # valid value so we don't hit that default. Extra kwargs are
-        # forwarded to the underlying kitchen env via create_env/robosuite.make.
-        self._env = RoboCasaGymEnv(
-            env_name=self.task,
-            camera_widths=self.observation_width,
-            camera_heights=self.observation_height,
-            split=self.split if self.split is not None else "all",
-            obj_registries=self.obj_registries,
-        )
-
-        ep_meta = self._env.env.get_ep_meta()
-        self.task_description = ep_meta.get("lang", self.task)
-
-    def _format_raw_obs(self, raw_obs: dict) -> RobotObservation:
-        """Convert RoboCasaGymEnv observation dict to LeRobot format."""
-        # RoboCasaGymEnv emits camera frames under "video.<cam>".
-        images = {cam: raw_obs[f"video.{cam}"] for cam in self.camera_name if f"video.{cam}" in raw_obs}
-
-        if self.obs_type == "pixels":
-            return {"pixels": images}
-
-        # `state.*` keys come from PandaOmronKeyConverter inside the wrapper.
-        agent_pos = np.concatenate(
-            [
-                raw_obs.get("state.base_position", np.zeros(3)),
-                raw_obs.get("state.base_rotation", np.zeros(4)),
-                raw_obs.get("state.end_effector_position_relative", np.zeros(3)),
-                raw_obs.get("state.end_effector_rotation_relative", np.zeros(4)),
-                raw_obs.get("state.gripper_qpos", np.zeros(2)),
-            ],
-            axis=-1,
-        ).astype(np.float32)
-
-        return {"pixels": images, "agent_pos": agent_pos}
-
-    def render(self) -> np.ndarray:
-        self._ensure_env()
-        assert self._env is not None
-        return self._env.render()
-
-    def reset(self, seed=None, **kwargs):
-        self._ensure_env()
-        assert self._env is not None
-        super().reset(seed=seed)
-        # Spread the seed across workers so n_envs factories don't all
-        # roll the same scene. With an explicit user seed we shift it by
-        # episode_index; with no seed we fall back to episode_index so
-        # each worker is still distinct rather than inheriting the same
-        # global RNG state.
-        worker_seed = seed + self.episode_index if seed is not None else self.episode_index
-        raw_obs, info = self._env.reset(seed=worker_seed)
-
-        ep_meta = self._env.env.get_ep_meta()
-        self.task_description = ep_meta.get("lang", self.task)
-
-        observation = self._format_raw_obs(raw_obs)
-        info = {"is_success": False}
-        return observation, info
-
-    def step(self, action: np.ndarray) -> tuple[RobotObservation, float, bool, bool, dict[str, Any]]:
-        self._ensure_env()
-        assert self._env is not None
-        if action.ndim != 1:
-            raise ValueError(
-                f"Expected action to be 1-D (shape (action_dim,)), "
-                f"but got shape {action.shape} with ndim={action.ndim}"
-            )
-
-        action_dict = convert_action(action)
-        raw_obs, reward, done, truncated, info = self._env.step(action_dict)
-
-        is_success = bool(info.get("success", False))
-        terminated = done or is_success
-        info.update({"task": self.task, "done": done, "is_success": is_success})
-
-        observation = self._format_raw_obs(raw_obs)
-        if terminated:
-            info["final_info"] = {
-                "task": self.task,
-                "done": bool(done),
-                "is_success": bool(is_success),
-            }
-            self.reset()
-
-        return observation, reward, terminated, truncated, info
-
-    def close(self):
-        if self._env is not None:
-            self._env.close()
-
-
-def _make_env_fns(
-    *,
-    task: str,
-    n_envs: int,
-    camera_names: list[str],
-    obs_type: str,
-    render_mode: str,
-    observation_width: int,
-    observation_height: int,
-    visualization_width: int,
-    visualization_height: int,
-    split: str | None,
-    episode_length: int | None,
-    obj_registries: Sequence[str],
-) -> list[Callable[[], RoboCasaEnv]]:
-    """Build n_envs factory callables for a single task.
-
-    Each factory carries a distinct ``episode_index`` (``0..n_envs-1``) so
-    ``RoboCasaEnv.reset()`` can derive a per-worker seed series from the
-    user-provided seed.
-    """
-
-    def _make_env(episode_index: int) -> RoboCasaEnv:
-        return RoboCasaEnv(
-            task=task,
-            camera_name=camera_names,
-            obs_type=obs_type,
-            render_mode=render_mode,
-            observation_width=observation_width,
-            observation_height=observation_height,
-            visualization_width=visualization_width,
-            visualization_height=visualization_height,
-            split=split,
-            episode_length=episode_length,
-            obj_registries=obj_registries,
-            episode_index=episode_index,
-        )
-
-    return [partial(_make_env, i) for i in range(n_envs)]
-
-
-def create_robocasa_envs(
-    task: str,
-    n_envs: int,
-    gym_kwargs: dict[str, Any] | None = None,
-    camera_name: str | Sequence[str] = ",".join(DEFAULT_CAMERAS),
-    env_cls: Callable[[Sequence[Callable[[], Any]]], Any] | None = None,
-    episode_length: int | None = None,
-    obj_registries: Sequence[str] = DEFAULT_OBJ_REGISTRIES,
-) -> dict[str, dict[int, Any]]:
-    """Create vectorized RoboCasa365 environments with a consistent return shape.
-
-    Returns:
-        dict[task_name][task_id] -> vec_env (env_cls([...]) with exactly n_envs factories)
-
-    `task` can be:
-      - a single task name (e.g. `CloseFridge`)
-      - a comma-separated list of task names (e.g. `CloseFridge,PickPlaceCoffee`)
-      - a benchmark-group shortcut (`atomic_seen`, `composite_seen`,
-        `composite_unseen`, `pretrain50`, `pretrain100`, `pretrain200`,
-        `pretrain300`), which auto-expands to the upstream task list and
-        auto-sets the dataset `split` ("target" or "pretrain").
-    """
-    if env_cls is None or not callable(env_cls):
-        raise ValueError("env_cls must be a callable that wraps a list of environment factory callables.")
-    if not isinstance(n_envs, int) or n_envs <= 0:
-        raise ValueError(f"n_envs must be a positive int; got {n_envs}.")
-
-    gym_kwargs = dict(gym_kwargs or {})
-    obs_type = gym_kwargs.pop("obs_type", "pixels_agent_pos")
-    render_mode = gym_kwargs.pop("render_mode", "rgb_array")
-    observation_width = gym_kwargs.pop("observation_width", 256)
-    observation_height = gym_kwargs.pop("observation_height", 256)
-    visualization_width = gym_kwargs.pop("visualization_width", 512)
-    visualization_height = gym_kwargs.pop("visualization_height", 512)
-    split = gym_kwargs.pop("split", None)
-
-    camera_names = parse_camera_names(camera_name)
-    task_names, group_split = _resolve_tasks(str(task))
-    if group_split is not None and split is None:
-        split = group_split
-
-    logger.info(
-        "Creating RoboCasa envs | tasks=%s | split=%s | n_envs(per task)=%d",
-        task_names,
-        split,
-        n_envs,
-    )
-
-    is_async = env_cls is gym.vector.AsyncVectorEnv
-
-    cached_obs_space: spaces.Space | None = None
-    cached_act_space: spaces.Space | None = None
-    cached_metadata: dict[str, Any] | None = None
-    out: dict[str, dict[int, Any]] = defaultdict(dict)
-
-    for task_name in task_names:
-        fns = _make_env_fns(
-            task=task_name,
-            n_envs=n_envs,
-            camera_names=camera_names,
-            obs_type=obs_type,
-            render_mode=render_mode,
-            observation_width=observation_width,
-            observation_height=observation_height,
-            visualization_width=visualization_width,
-            visualization_height=visualization_height,
-            split=split,
-            episode_length=episode_length,
-            obj_registries=obj_registries,
-        )
-
-        if is_async:
-            lazy = _LazyAsyncVectorEnv(fns, cached_obs_space, cached_act_space, cached_metadata)
-            if cached_obs_space is None:
-                cached_obs_space = lazy.observation_space
-                cached_act_space = lazy.action_space
-                cached_metadata = lazy.metadata
-            out[task_name][0] = lazy
-        else:
-            out[task_name][0] = env_cls(fns)
-        logger.info("Built vec env | task=%s | n_envs=%d", task_name, n_envs)
-
-    return {name: dict(task_map) for name, task_map in out.items()}
@@ -1,245 +0,0 @@
-"""RoboMME environment wrapper for LeRobot evaluation.
-
-Wraps the RoboMME ``BenchmarkEnvBuilder`` into a Gymnasium-compatible
-``VectorEnv`` suitable for ``lerobot_eval``.
-
-RoboMME tasks:
-  Counting:    BinFill, PickXtimes, SwingXtimes, StopCube
-  Permanence:  VideoUnmask, VideoUnmaskSwap, ButtonUnmask, ButtonUnmaskSwap
-  Reference:   PickHighlight, VideoRepick, VideoPlaceButton, VideoPlaceOrder
-  Imitation:   MoveCube, InsertPeg, PatternLock, RouteStick
-
-Dataset: lerobot/robomme (LeRobot v3.0, 1,600 episodes)
-Install: see docker/Dockerfile.benchmark.robomme  (Linux only — mani-skill vs numpy pin conflict)
-Benchmark: https://github.com/RoboMME/robomme_benchmark
-"""
-
-from __future__ import annotations
-
-from collections.abc import Callable, Sequence
-from functools import partial
-from typing import Any
-
-import gymnasium as gym
-import numpy as np
-from gymnasium import spaces
-
-from .utils import _LazyAsyncVectorEnv
-
-ROBOMME_TASKS = [
-    "BinFill",
-    "PickXtimes",
-    "SwingXtimes",
-    "StopCube",
-    "VideoUnmask",
-    "VideoUnmaskSwap",
-    "ButtonUnmask",
-    "ButtonUnmaskSwap",
-    "PickHighlight",
-    "VideoRepick",
-    "VideoPlaceButton",
-    "VideoPlaceOrder",
-    "MoveCube",
-    "InsertPeg",
-    "PatternLock",
-    "RouteStick",
-]
-
-
-class RoboMMEGymEnv(gym.Env):
-    """Thin Gymnasium wrapper around a single RoboMME episode env."""
-
-    metadata = {"render_modes": ["rgb_array"], "render_fps": 10}
-
-    def __init__(
-        self,
-        task: str = "PickXtimes",
-        action_space_type: str = "joint_angle",
-        dataset: str = "test",
-        episode_idx: int = 0,
-        max_steps: int = 300,
-    ):
-        super().__init__()
-        from robomme.env_record_wrapper import BenchmarkEnvBuilder
-
-        self._task = task
-        self._action_space_type = action_space_type
-        self._dataset = dataset
-        self._episode_idx = episode_idx
-        self._max_steps = max_steps
-        self._max_episode_steps = max_steps
-
-        self._builder = BenchmarkEnvBuilder(
-            env_id=task,
-            dataset=dataset,
-            action_space=action_space_type,
-            gui_render=False,
-            max_steps=max_steps,
-        )
-        self._env = None
-        self._last_raw_obs: dict | None = None
-
-        action_dim = 8 if action_space_type == "joint_angle" else 7
-        self.action_space = spaces.Box(low=-1.0, high=1.0, shape=(action_dim,), dtype=np.float32)
-        # `pixels` must be a nested Dict so `preprocess_observation()` in
-        # envs/utils.py picks it up and maps each camera to
-        # `observation.images.<cam>`. A flat layout (`pixels/image`,
-        # `pixels/wrist_image`) silently drops every image from the batch.
-        self.observation_space = spaces.Dict(
-            {
-                "pixels": spaces.Dict(
-                    {
-                        "image": spaces.Box(0, 255, shape=(256, 256, 3), dtype=np.uint8),
-                        "wrist_image": spaces.Box(0, 255, shape=(256, 256, 3), dtype=np.uint8),
-                    }
-                ),
-                "agent_pos": spaces.Box(-np.inf, np.inf, shape=(8,), dtype=np.float32),
-            }
-        )
-
-    def reset(self, *, seed=None, options=None):
-        super().reset(seed=seed)
-        self._env = self._builder.make_env_for_episode(
-            episode_idx=self._episode_idx,
-            max_steps=self._max_steps,
-        )
-        obs, info = self._env.reset()
-        self._last_raw_obs = obs
-        return self._convert_obs(obs), self._convert_info(info)
-
-    def step(self, action):
-        obs, reward, terminated, truncated, info = self._env.step(action)
-        self._last_raw_obs = obs
-
-        terminated_bool = bool(terminated.item()) if hasattr(terminated, "item") else bool(terminated)
-        truncated_bool = bool(truncated.item()) if hasattr(truncated, "item") else bool(truncated)
-
-        status = info.get("status", "ongoing")
-        is_success = status == "success"
-        conv_info = self._convert_info(info)
-        conv_info["is_success"] = is_success
-
-        return self._convert_obs(obs), float(reward), terminated_bool, truncated_bool, conv_info
-
-    def render(self) -> np.ndarray | None:
-        """Return the front camera image from the last observation for video recording."""
-        if self._last_raw_obs is None:
-            return np.zeros((256, 256, 3), dtype=np.uint8)
-        front = self._last_raw_obs.get("front_rgb_list")
-        if front is None:
-            return np.zeros((256, 256, 3), dtype=np.uint8)
-        frame = front[-1] if isinstance(front, list) else front
-        return np.asarray(frame, dtype=np.uint8)
-
-    def _convert_obs(self, obs: dict) -> dict:
-        front_rgb = (
-            obs["front_rgb_list"][-1] if isinstance(obs["front_rgb_list"], list) else obs["front_rgb_list"]
-        )
-        wrist_rgb = (
-            obs["wrist_rgb_list"][-1] if isinstance(obs["wrist_rgb_list"], list) else obs["wrist_rgb_list"]
-        )
-        joint_state = (
-            obs["joint_state_list"][-1]
-            if isinstance(obs["joint_state_list"], list)
-            else obs["joint_state_list"]
-        )
-        gripper_state = (
-            obs["gripper_state_list"][-1]
-            if isinstance(obs["gripper_state_list"], list)
-            else obs["gripper_state_list"]
-        )
-
-        front_rgb = np.asarray(front_rgb, dtype=np.uint8)
-        wrist_rgb = np.asarray(wrist_rgb, dtype=np.uint8)
-        joint = np.asarray(joint_state, dtype=np.float32).flatten()[:7]
-        gripper = np.asarray(gripper_state, dtype=np.float32).flatten()[:1]
-        state = np.concatenate([joint, gripper])
-
-        return {
-            "pixels": {"image": front_rgb, "wrist_image": wrist_rgb},
-            "agent_pos": state,
-        }
-
-    def _convert_info(self, info: dict) -> dict:
-        return {
-            "status": info.get("status", "ongoing"),
-            "task_goal": info.get("task_goal", ""),
-        }
-
-
-def _make_env_fns(
-    *,
-    task: str,
-    n_envs: int,
-    action_space_type: str,
-    dataset: str,
-    episode_length: int,
-    task_id: int,
-) -> list[Callable[[], RoboMMEGymEnv]]:
-    """Build n_envs factory callables for one RoboMME task id."""
-
-    def _make_one(episode_index: int) -> RoboMMEGymEnv:
-        return RoboMMEGymEnv(
-            task=task,
-            action_space_type=action_space_type,
-            dataset=dataset,
-            episode_idx=episode_index,
-            max_steps=episode_length,
-        )
-
-    return [partial(_make_one, task_id + i) for i in range(n_envs)]
-
-
-def create_robomme_envs(
-    task: str,
-    n_envs: int = 1,
-    action_space_type: str = "joint_angle",
-    dataset: str = "test",
-    episode_length: int = 300,
-    task_ids: list[int] | None = None,
-    env_cls: Callable[[Sequence[Callable[[], Any]]], Any] | None = None,
-) -> dict[str, dict[int, gym.vector.VectorEnv]]:
-    """Create vectorized RoboMME environments for evaluation.
-
-    `task` may be a single RoboMME task name (e.g. "PickXtimes") or a
-    comma-separated list (e.g. "PickXtimes,BinFill,StopCube"). Each task
-    becomes its own suite in the returned mapping.
-
-    Returns {suite_name: {task_id: VectorEnv}} matching lerobot's expected format.
-    """
-    if env_cls is None or not callable(env_cls):
-        raise ValueError("env_cls must be a callable that wraps a list of env factory callables.")
-    if not isinstance(n_envs, int) or n_envs <= 0:
-        raise ValueError(f"n_envs must be a positive int; got {n_envs}.")
-
-    if task_ids is None:
-        task_ids = [0]
-
-    task_names = [t.strip() for t in task.split(",") if t.strip()]
-    is_async = env_cls is gym.vector.AsyncVectorEnv
-    cached_obs_space: spaces.Space | None = None
-    cached_act_space: spaces.Space | None = None
-    cached_metadata: dict[str, Any] | None = None
-    out: dict[str, dict[int, gym.vector.VectorEnv]] = {}
-    for task_name in task_names:
-        envs_by_task: dict[int, gym.vector.VectorEnv] = {}
-        for task_id in task_ids:
-            fns = _make_env_fns(
-                task=task_name,
-                n_envs=n_envs,
-                action_space_type=action_space_type,
-                dataset=dataset,
-                episode_length=episode_length,
-                task_id=task_id,
-            )
-            if is_async:
-                lazy = _LazyAsyncVectorEnv(fns, cached_obs_space, cached_act_space, cached_metadata)
-                if cached_obs_space is None:
-                    cached_obs_space = lazy.observation_space
-                    cached_act_space = lazy.action_space
-                    cached_metadata = lazy.metadata
-                envs_by_task[task_id] = lazy
-            else:
-                envs_by_task[task_id] = env_cls(fns)
-        out[task_name] = envs_by_task
-    return out
@@ -1,488 +0,0 @@
-#!/usr/bin/env python
-
-# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-from __future__ import annotations
-
-import importlib
-import logging
-from collections import defaultdict
-from collections.abc import Callable, Sequence
-from functools import partial
-from typing import Any
-
-import gymnasium as gym
-import numpy as np
-import torch
-from gymnasium import spaces
-
-from lerobot.types import RobotObservation
-
-from .utils import _LazyAsyncVectorEnv
-
-logger = logging.getLogger(__name__)
-
-# Camera names as used by RoboTwin 2.0. The wrapper appends "_rgb" when looking
-# up keys in get_obs() output (e.g. "head_camera" → "head_camera_rgb").
-ROBOTWIN_CAMERA_NAMES: tuple[str, ...] = (
-    "head_camera",
-    "left_camera",
-    "right_camera",
-)
-
-ACTION_DIM = 14  # 7 DOF × 2 arms
-ACTION_LOW = -1.0
-ACTION_HIGH = 1.0
-DEFAULT_EPISODE_LENGTH = 300
-# D435 dims from task_config/_camera_config.yml (what demo_clean.yml selects).
-DEFAULT_CAMERA_H = 240
-DEFAULT_CAMERA_W = 320
-
-# Task list from RoboTwin 2.0's `envs/` directory — mirrors upstream exactly
-# (50 tasks as of main; earlier revisions had 60 with a different split).
-# Keep this in sync with:
-#   gh api /repos/RoboTwin-Platform/RoboTwin/contents/envs --paginate \
-#     | jq -r '.[].name' | grep -E '\.py$' | grep -v '^_' | sed 's/\.py$//'
-ROBOTWIN_TASKS: tuple[str, ...] = (
-    "adjust_bottle",
-    "beat_block_hammer",
-    "blocks_ranking_rgb",
-    "blocks_ranking_size",
-    "click_alarmclock",
-    "click_bell",
-    "dump_bin_bigbin",
-    "grab_roller",
-    "handover_block",
-    "handover_mic",
-    "hanging_mug",
-    "lift_pot",
-    "move_can_pot",
-    "move_pillbottle_pad",
-    "move_playingcard_away",
-    "move_stapler_pad",
-    "open_laptop",
-    "open_microwave",
-    "pick_diverse_bottles",
-    "pick_dual_bottles",
-    "place_a2b_left",
-    "place_a2b_right",
-    "place_bread_basket",
-    "place_bread_skillet",
-    "place_burger_fries",
-    "place_can_basket",
-    "place_cans_plasticbox",
-    "place_container_plate",
-    "place_dual_shoes",
-    "place_empty_cup",
-    "place_fan",
-    "place_mouse_pad",
-    "place_object_basket",
-    "place_object_scale",
-    "place_object_stand",
-    "place_phone_stand",
-    "place_shoe",
-    "press_stapler",
-    "put_bottles_dustbin",
-    "put_object_cabinet",
-    "rotate_qrcode",
-    "scan_object",
-    "shake_bottle",
-    "shake_bottle_horizontally",
-    "stack_blocks_three",
-    "stack_blocks_two",
-    "stack_bowls_three",
-    "stack_bowls_two",
-    "stamp_seal",
-    "turn_switch",
-)
-
-
-_ROBOTWIN_SETUP_CACHE: dict[str, dict[str, Any]] = {}
-
-
-def _load_robotwin_setup_kwargs(task_name: str) -> dict[str, Any]:
-    """Build the kwargs dict RoboTwin's setup_demo expects.
-
-    Mirrors the config loading done by RoboTwin's ``script/eval_policy.py``:
-    reads ``task_config/demo_clean.yml``, resolves the embodiment file from
-    ``_embodiment_config.yml``, loads the robot's own ``config.yml``, and
-    reads camera dimensions from ``_camera_config.yml``.
-
-    Uses ``aloha-agilex`` single-robot dual-arm by default (the only embodiment
-    used by beat_block_hammer and most smoke-test tasks).
-    """
-    if task_name in _ROBOTWIN_SETUP_CACHE:
-        return dict(_ROBOTWIN_SETUP_CACHE[task_name])
-
-    import os
-
-    import yaml  # type: ignore[import-untyped]
-    from envs import CONFIGS_PATH  # type: ignore[import-not-found]
-
-    task_config = "demo_clean"
-    with open(os.path.join(CONFIGS_PATH, f"{task_config}.yml"), encoding="utf-8") as f:
-        args = yaml.safe_load(f)
-
-    # Resolve embodiment — demo_clean.yml uses [aloha-agilex] (dual-arm single robot)
-    with open(os.path.join(CONFIGS_PATH, "_embodiment_config.yml"), encoding="utf-8") as f:
-        embodiment_types = yaml.safe_load(f)
-    embodiment = args.get("embodiment", ["aloha-agilex"])
-    if len(embodiment) == 1:
-        robot_file = embodiment_types[embodiment[0]]["file_path"]
-        args["left_robot_file"] = robot_file
-        args["right_robot_file"] = robot_file
-        args["dual_arm_embodied"] = True
-    elif len(embodiment) == 3:
-        args["left_robot_file"] = embodiment_types[embodiment[0]]["file_path"]
-        args["right_robot_file"] = embodiment_types[embodiment[1]]["file_path"]
-        args["embodiment_dis"] = embodiment[2]
-        args["dual_arm_embodied"] = False
-    else:
-        raise ValueError(f"embodiment must have 1 or 3 items, got {len(embodiment)}")
-
-    with open(os.path.join(args["left_robot_file"], "config.yml"), encoding="utf-8") as f:
-        args["left_embodiment_config"] = yaml.safe_load(f)
-    with open(os.path.join(args["right_robot_file"], "config.yml"), encoding="utf-8") as f:
-        args["right_embodiment_config"] = yaml.safe_load(f)
-
-    # Camera dimensions
-    with open(os.path.join(CONFIGS_PATH, "_camera_config.yml"), encoding="utf-8") as f:
-        camera_config = yaml.safe_load(f)
-    head_cam = args["camera"]["head_camera_type"]
-    args["head_camera_h"] = camera_config[head_cam]["h"]
-    args["head_camera_w"] = camera_config[head_cam]["w"]
-
-    # Headless overrides
-    args["render_freq"] = 0
-    args["task_name"] = task_name
-    args["task_config"] = task_config
-
-    _ROBOTWIN_SETUP_CACHE[task_name] = args
-    return dict(args)
-
-
-def _load_robotwin_task(task_name: str) -> type:
-    """Dynamically import and return a RoboTwin 2.0 task class.
-
-    RoboTwin tasks live in ``envs/<task_name>.py`` relative to the repository
-    root and are expected to be on ``sys.path`` after installation.
-    """
-    try:
-        module = importlib.import_module(f"envs.{task_name}")
-    except ModuleNotFoundError as e:
-        raise ModuleNotFoundError(
-            f"Could not import RoboTwin task '{task_name}'. "
-            "Ensure RoboTwin 2.0 is installed and its 'envs/' directory is on PYTHONPATH. "
-            "See the RoboTwin installation guide: https://robotwin-platform.github.io/doc/usage/robotwin-install.html"
-        ) from e
-    task_cls = getattr(module, task_name, None)
-    if task_cls is None:
-        raise AttributeError(f"Task class '{task_name}' not found in envs/{task_name}.py")
-    return task_cls
-
-
-class RoboTwinEnv(gym.Env):
-    """Gymnasium wrapper around a single RoboTwin 2.0 task.
-
-    RoboTwin uses a custom SAPIEN-based API (``setup_demo`` / ``get_obs`` /
-    ``take_action`` / ``check_success``) rather than the standard gym interface.
-    This class bridges that API to Gymnasium so that ``lerobot-eval`` can drive
-    RoboTwin exactly like LIBERO or Meta-World.
-
-    The underlying SAPIEN environment is created lazily on the first ``reset()``
-    call *inside the worker process*.  This is required for
-    ``gym.vector.AsyncVectorEnv`` compatibility: SAPIEN allocates EGL/GPU
-    contexts that must not be forked from the parent process.
-
-    Observations
-    ------------
-    The ``pixels`` dict uses the raw RoboTwin camera names as keys (e.g.
-    ``"head_camera"``, ``"left_camera"``). ``preprocess_observation`` in
-    ``envs/utils.py`` then converts these to ``observation.images.<cam>``.
-
-    Actions
-    -------
-    14-dim float32 array in ``[-1, 1]`` (joint-space, 7 DOF per arm).
-
-    Autograd
-    --------
-    ``setup_demo`` and ``take_action`` drive CuRobo's Newton trajectory
-    optimizer, which calls ``cost.backward()`` internally. lerobot_eval wraps
-    the rollout in ``torch.no_grad()``, so both call sites re-enable grad.
-    """
-
-    metadata = {"render_modes": ["rgb_array"], "render_fps": 25}
-
-    def __init__(
-        self,
-        task_name: str,
-        episode_index: int = 0,
-        n_envs: int = 1,
-        camera_names: Sequence[str] = ROBOTWIN_CAMERA_NAMES,
-        observation_height: int | None = None,
-        observation_width: int | None = None,
-        episode_length: int = DEFAULT_EPISODE_LENGTH,
-        render_mode: str = "rgb_array",
-    ):
-        super().__init__()
-        self.task_name = task_name
-        self.task = task_name  # used by add_envs_task() in utils.py
-        self.task_description = task_name.replace("_", " ")
-        self.episode_index = episode_index
-        self._reset_stride = n_envs
-        self.camera_names = list(camera_names)
-        # Default to D435 dims (the camera type baked into task_config/demo_clean.yml).
-        # The YAML-driven lookup is deferred to reset() so construction doesn't
-        # import RoboTwin's `envs` module — fast-tests run without RoboTwin installed.
-        self.observation_height = observation_height or DEFAULT_CAMERA_H
-        self.observation_width = observation_width or DEFAULT_CAMERA_W
-        self.episode_length = episode_length
-        self._max_episode_steps = episode_length  # lerobot_eval.rollout reads this
-        self.render_mode = render_mode
-
-        self._env: Any | None = None  # deferred — created on first reset() inside worker
-        self._step_count: int = 0
-        self._black_frame = np.zeros((self.observation_height, self.observation_width, 3), dtype=np.uint8)
-
-        image_spaces = {
-            cam: spaces.Box(
-                low=0,
-                high=255,
-                shape=(self.observation_height, self.observation_width, 3),
-                dtype=np.uint8,
-            )
-            for cam in self.camera_names
-        }
-        self.observation_space = spaces.Dict(
-            {
-                "pixels": spaces.Dict(image_spaces),
-                "agent_pos": spaces.Box(low=-np.inf, high=np.inf, shape=(ACTION_DIM,), dtype=np.float32),
-            }
-        )
-        self.action_space = spaces.Box(
-            low=ACTION_LOW, high=ACTION_HIGH, shape=(ACTION_DIM,), dtype=np.float32
-        )
-
-    def _ensure_env(self) -> None:
-        """Create the SAPIEN environment on first use.
-
-        Called inside the worker subprocess after fork(), so each worker gets
-        its own EGL/GPU context rather than inheriting a stale one from the
-        parent process (which causes crashes with AsyncVectorEnv).
-        """
-        if self._env is not None:
-            return
-        task_cls = _load_robotwin_task(self.task_name)
-        self._env = task_cls()
-
-    def _get_obs(self) -> RobotObservation:
-        assert self._env is not None, "_get_obs called before _ensure_env()"
-        raw = self._env.get_obs()
-        cameras_raw = raw.get("observation", {})
-
-        images: dict[str, np.ndarray] = {}
-        for cam in self.camera_names:
-            cam_data = cameras_raw.get(cam)
-            img = cam_data.get("rgb") if cam_data else None
-            if img is None:
-                images[cam] = self._black_frame
-                continue
-            img = np.asarray(img, dtype=np.uint8)
-            if img.ndim == 2:
-                img = np.stack([img, img, img], axis=-1)
-            elif img.shape[-1] != 3:
-                img = img[..., :3]
-            images[cam] = img
-
-        ja = raw.get("joint_action") or {}
-        vec = ja.get("vector")
-        if vec is not None:
-            arr = np.asarray(vec, dtype=np.float32).ravel()
-            joint_state = (
-                arr[:ACTION_DIM] if arr.size >= ACTION_DIM else np.zeros(ACTION_DIM, dtype=np.float32)
-            )
-        else:
-            joint_state = np.zeros(ACTION_DIM, dtype=np.float32)
-
-        return {"pixels": images, "agent_pos": joint_state}
-
-    def reset(self, seed: int | None = None, **kwargs) -> tuple[RobotObservation, dict]:
-        self._ensure_env()
-        super().reset(seed=seed)
-        assert self._env is not None  # set by _ensure_env() above
-
-        actual_seed = self.episode_index if seed is None else seed
-        setup_kwargs = _load_robotwin_setup_kwargs(self.task_name)
-        setup_kwargs.update(seed=actual_seed, is_test=True)
-        with torch.enable_grad():
-            self._env.setup_demo(**setup_kwargs)
-        self.episode_index += self._reset_stride
-        self._step_count = 0
-
-        obs = self._get_obs()
-        return obs, {"is_success": False, "task": self.task_name}
-
-    def step(self, action: np.ndarray) -> tuple[RobotObservation, float, bool, bool, dict[str, Any]]:
-        assert self._env is not None, "step() called before reset()"
-        if action.ndim != 1 or action.shape[0] != ACTION_DIM:
-            raise ValueError(f"Expected 1-D action of shape ({ACTION_DIM},), got {action.shape}")
-
-        with torch.enable_grad():
-            if hasattr(self._env, "take_action"):
-                self._env.take_action(action)
-            else:
-                self._env.step(action)
-
-        self._step_count += 1
-
-        is_success = bool(getattr(self._env, "eval_success", False))
-        if not is_success and hasattr(self._env, "check_success"):
-            is_success = bool(self._env.check_success())
-
-        obs = self._get_obs()
-        reward = float(is_success)
-        terminated = is_success
-        truncated = self._step_count >= self.episode_length
-
-        info: dict[str, Any] = {
-            "task": self.task_name,
-            "is_success": is_success,
-            "step": self._step_count,
-        }
-        if terminated or truncated:
-            info["final_info"] = {
-                "task": self.task_name,
-                "is_success": is_success,
-            }
-            self.reset()
-
-        return obs, reward, terminated, truncated, info
-
-    def render(self) -> np.ndarray:
-        self._ensure_env()
-        obs = self._get_obs()
-        # Prefer head camera for rendering; fall back to first available.
-        if "head_camera" in obs["pixels"]:
-            return obs["pixels"]["head_camera"]
-        return next(iter(obs["pixels"].values()))
-
-    def close(self) -> None:
-        if self._env is not None:
-            if hasattr(self._env, "close_env"):
-                import contextlib
-
-                with contextlib.suppress(TypeError):
-                    self._env.close_env()
-            self._env = None
-
-
-# ---- Multi-task factory --------------------------------------------------------
-
-
-def _make_env_fns(
-    *,
-    task_name: str,
-    n_envs: int,
-    camera_names: list[str],
-    observation_height: int,
-    observation_width: int,
-    episode_length: int,
-) -> list[Callable[[], RoboTwinEnv]]:
-    """Return n_envs factory callables for a single task."""
-
-    def _make_one(episode_index: int) -> RoboTwinEnv:
-        return RoboTwinEnv(
-            task_name=task_name,
-            episode_index=episode_index,
-            n_envs=n_envs,
-            camera_names=camera_names,
-            observation_height=observation_height,
-            observation_width=observation_width,
-            episode_length=episode_length,
-        )
-
-    return [partial(_make_one, i) for i in range(n_envs)]
-
-
-def create_robotwin_envs(
-    task: str,
-    n_envs: int,
-    env_cls: Callable[[Sequence[Callable[[], Any]]], Any] | None = None,
-    camera_names: Sequence[str] = ROBOTWIN_CAMERA_NAMES,
-    observation_height: int = DEFAULT_CAMERA_H,
-    observation_width: int = DEFAULT_CAMERA_W,
-    episode_length: int = DEFAULT_EPISODE_LENGTH,
-) -> dict[str, dict[int, Any]]:
-    """Create vectorized RoboTwin 2.0 environments.
-
-    Returns:
-        ``dict[task_name][0] -> VectorEnv`` — one entry per task, each wrapping
-        ``n_envs`` parallel rollouts.
-
-    Args:
-        task: Comma-separated list of task names (e.g. ``"beat_block_hammer"``
-            or ``"beat_block_hammer,click_bell"``).
-        n_envs: Number of parallel rollouts per task.
-        env_cls: Vector env constructor (e.g. ``gym.vector.AsyncVectorEnv``).
-        camera_names: Cameras to include in observations.
-        observation_height: Pixel height for all cameras.
-        observation_width: Pixel width for all cameras.
-        episode_length: Max steps before truncation.
-    """
-    if env_cls is None or not callable(env_cls):
-        raise ValueError("env_cls must be callable (e.g. gym.vector.AsyncVectorEnv).")
-    if not isinstance(n_envs, int) or n_envs <= 0:
-        raise ValueError(f"n_envs must be a positive int; got {n_envs}.")
-
-    task_names = [t.strip() for t in str(task).split(",") if t.strip()]
-    if not task_names:
-        raise ValueError("`task` must contain at least one RoboTwin task name.")
-
-    unknown = [t for t in task_names if t not in ROBOTWIN_TASKS]
-    if unknown:
-        raise ValueError(f"Unknown RoboTwin tasks: {unknown}. Available tasks: {sorted(ROBOTWIN_TASKS)}")
-
-    logger.info(
-        "Creating RoboTwin envs | tasks=%s | n_envs(per task)=%d",
-        task_names,
-        n_envs,
-    )
-
-    is_async = env_cls is gym.vector.AsyncVectorEnv
-    cached_obs_space: spaces.Space | None = None
-    cached_act_space: spaces.Space | None = None
-    cached_metadata: dict[str, Any] | None = None
-
-    out: dict[str, dict[int, Any]] = defaultdict(dict)
-    for task_name in task_names:
-        fns = _make_env_fns(
-            task_name=task_name,
-            n_envs=n_envs,
-            camera_names=list(camera_names),
-            observation_height=observation_height,
-            observation_width=observation_width,
-            episode_length=episode_length,
-        )
-        if is_async:
-            lazy = _LazyAsyncVectorEnv(fns, cached_obs_space, cached_act_space, cached_metadata)
-            if cached_obs_space is None:
-                cached_obs_space = lazy.observation_space
-                cached_act_space = lazy.action_space
-                cached_metadata = lazy.metadata
-            out[task_name][0] = lazy
-        else:
-            out[task_name][0] = env_cls(fns)
-        logger.info("Built vec env | task=%s | n_envs=%d", task_name, n_envs)
-
-    return {k: dict(v) for k, v in out.items()}
@@ -34,25 +34,6 @@ from lerobot.utils.utils import get_channel_first_image_shape
 from .configs import EnvConfig


-def parse_camera_names(camera_name: str | Sequence[str]) -> list[str]:
-    """Normalize ``camera_name`` into a non-empty list of strings.
-
-    Accepts a comma-separated string (``"cam_a,cam_b"``) or a sequence of
-    strings (tuples/lists). Whitespace is stripped; empty entries are
-    dropped. Raises ``TypeError`` for unsupported input types and
-    ``ValueError`` when the normalized list is empty.
-    """
-    if isinstance(camera_name, str):
-        cams = [c.strip() for c in camera_name.split(",") if c.strip()]
-    elif isinstance(camera_name, (list | tuple)):
-        cams = [str(c).strip() for c in camera_name if str(c).strip()]
-    else:
-        raise TypeError(f"camera_name must be str or sequence[str], got {type(camera_name).__name__}")
-    if not cams:
-        raise ValueError("camera_name resolved to an empty list.")
-    return cams
-
-
 def _convert_nested_dict(d):
    result = {}
    for k, v in d.items():
@@ -172,20 +153,17 @@ class _LazyAsyncVectorEnv:
        env_fns: list[Callable],
        observation_space=None,
        action_space=None,
-        metadata=None,
    ):
        self._env_fns = env_fns
        self._env: gym.vector.AsyncVectorEnv | None = None
        self.num_envs = len(env_fns)
-        if observation_space is not None and action_space is not None and metadata is not None:
+        if observation_space is not None and action_space is not None:
            self.observation_space = observation_space
            self.action_space = action_space
-            self.metadata = metadata
        else:
            tmp = env_fns[0]()
            self.observation_space = tmp.observation_space
            self.action_space = tmp.action_space
-            self.metadata = tmp.metadata
            tmp.close()
        self.single_observation_space = self.observation_space
        self.single_action_space = self.action_space
@@ -194,10 +172,6 @@ class _LazyAsyncVectorEnv:
        if self._env is None:
            self._env = gym.vector.AsyncVectorEnv(self._env_fns, context="forkserver", shared_memory=True)

-    @property
-    def unwrapped(self):
-        return self
-
    def reset(self, **kwargs):
        self._ensure()
        return self._env.reset(**kwargs)
@@ -1,589 +0,0 @@
-#!/usr/bin/env python
-
-# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-"""VLABench environment wrapper for LeRobot.
-
-VLABench is a large-scale benchmark for language-conditioned robotic manipulation
-with long-horizon reasoning, built on MuJoCo/dm_control.
-
- Paper: https://arxiv.org/abs/2412.18194
- GitHub: https://github.com/OpenMOSS/VLABench
- Website: https://vlabench.github.io
-"""
-
-from __future__ import annotations
-
-import contextlib
-import logging
-from collections import defaultdict
-from collections.abc import Callable, Sequence
-from typing import Any
-
-import cv2
-import gymnasium as gym
-import numpy as np
-from gymnasium import spaces
-from scipy.spatial.transform import Rotation
-
-from lerobot.types import RobotObservation
-
-from .utils import _LazyAsyncVectorEnv
-
-logger = logging.getLogger(__name__)
-
-ACTION_DIM = 7  # pos(3) + euler(3) + gripper(1)
-ACTION_LOW = np.array([-1.0, -1.0, -1.0, -1.0, -1.0, -1.0, 0.0], dtype=np.float32)
-ACTION_HIGH = np.array([1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], dtype=np.float32)
-
-# Default max episode steps per task type
-DEFAULT_MAX_EPISODE_STEPS = 500
-
-# VLABench task suites
-PRIMITIVE_TASKS = [
-    "select_fruit",
-    "select_toy",
-    "select_chemistry_tube",
-    "add_condiment",
-    "select_book",
-    "select_painting",
-    "select_drink",
-    "insert_flower",
-    "select_billiards",
-    "select_ingredient",
-    "select_mahjong",
-    "select_poker",
-    # Physical series
-    "density_qa",
-    "friction_qa",
-    "magnetism_qa",
-    "reflection_qa",
-    "simple_cuestick_usage",
-    "simple_seesaw_usage",
-    "sound_speed_qa",
-    "thermal_expansion_qa",
-    "weight_qa",
-]
-
-COMPOSITE_TASKS = [
-    "cluster_billiards",
-    "cluster_book",
-    "cluster_drink",
-    "cluster_toy",
-    "cook_dishes",
-    "cool_drink",
-    "find_unseen_object",
-    "get_coffee",
-    "hammer_nail",
-    "heat_food",
-    "make_juice",
-    "play_mahjong",
-    "play_math_game",
-    "play_poker",
-    "play_snooker",
-    "rearrange_book",
-    "rearrange_chemistry_tube",
-    "set_dining_table",
-    "set_study_table",
-    "store_food",
-    "take_chemistry_experiment",
-    "use_seesaw_complex",
-]
-
-SUITE_TASKS: dict[str, list[str]] = {
-    "primitive": PRIMITIVE_TASKS,
-    "composite": COMPOSITE_TASKS,
-}
-
-
-class VLABenchEnv(gym.Env):
-    """Gymnasium wrapper for VLABench environments.
-
-    Wraps the dm_control-based VLABench simulator behind a standard gym.Env interface.
-    Supports multiple cameras (front, second, wrist) and end-effector control.
-    """
-
-    metadata = {"render_modes": ["rgb_array"], "render_fps": 10}
-
-    def __init__(
-        self,
-        task: str = "select_fruit",
-        obs_type: str = "pixels_agent_pos",
-        render_mode: str = "rgb_array",
-        render_resolution: tuple[int, int] = (480, 480),
-        robot: str = "franka",
-        max_episode_steps: int = DEFAULT_MAX_EPISODE_STEPS,
-        action_mode: str = "eef",
-    ):
-        super().__init__()
-        self.task = task
-        self.obs_type = obs_type
-        self.render_mode = render_mode
-        self.render_resolution = render_resolution
-        self.robot = robot
-        self._max_episode_steps = max_episode_steps
-        self.action_mode = action_mode
-
-        # Deferred — created on first reset() inside worker subprocess to avoid
-        # inheriting stale GPU/EGL contexts when AsyncVectorEnv spawns workers.
-        # We never cache `env.physics`: dm_control exposes it as a weakref
-        # proxy that goes stale across resets (rebuilds the sim), so we always
-        # refetch it via `self._env.physics` at the call site.
-        self._env = None
-        self.task_description = ""  # populated on first reset
-        # Cached world-frame XYZ of the robot base link. The VLABench datasets
-        # log both `observation.state` positions and `actions` positions in
-        # robot-base frame (see VLABench/scripts/convert_to_lerobot.py which
-        # subtracts `robot_frame_pos` from ee_pos). The robot is attached at a
-        # fixed offset per task so this is safe to cache once per env build.
-        self._robot_base_xyz: np.ndarray | None = None
-
-        h, w = self.render_resolution
-
-        if self.obs_type == "state":
-            raise NotImplementedError(
-                "The 'state' observation type is not supported in VLABenchEnv. "
-                "Please use 'pixels' or 'pixels_agent_pos'."
-            )
-        elif self.obs_type == "pixels":
-            self.observation_space = spaces.Dict(
-                {
-                    "pixels": spaces.Dict(
-                        {
-                            "image": spaces.Box(low=0, high=255, shape=(h, w, 3), dtype=np.uint8),
-                            "second_image": spaces.Box(low=0, high=255, shape=(h, w, 3), dtype=np.uint8),
-                            "wrist_image": spaces.Box(low=0, high=255, shape=(h, w, 3), dtype=np.uint8),
-                        }
-                    ),
-                }
-            )
-        elif self.obs_type == "pixels_agent_pos":
-            self.observation_space = spaces.Dict(
-                {
-                    "pixels": spaces.Dict(
-                        {
-                            "image": spaces.Box(low=0, high=255, shape=(h, w, 3), dtype=np.uint8),
-                            "second_image": spaces.Box(low=0, high=255, shape=(h, w, 3), dtype=np.uint8),
-                            "wrist_image": spaces.Box(low=0, high=255, shape=(h, w, 3), dtype=np.uint8),
-                        }
-                    ),
-                    "agent_pos": spaces.Box(low=-np.inf, high=np.inf, shape=(7,), dtype=np.float64),
-                }
-            )
-        else:
-            raise ValueError(f"Unsupported obs_type: {self.obs_type}")
-
-        self.action_space = spaces.Box(low=ACTION_LOW, high=ACTION_HIGH, dtype=np.float32)
-
-    # Max attempts to rebuild the underlying env when MuJoCo throws
-    # `PhysicsError` (e.g. mjWARN_BADQACC) during VLABench's 20-step
-    # reset warm-up. Some random task/layout samples land in unstable
-    # initial configurations; re-sampling the layout almost always
-    # gives a stable one. A handful of upstream tasks (notably
-    # `select_mahjong`) have layout samplers that diverge often enough
-    # to need >>5 retries, so we pick a generous ceiling.
-    _ENSURE_ENV_MAX_ATTEMPTS = 20
-
-    def _ensure_env(self) -> None:
-        """Create the underlying VLABench env on first use.
-
-        Called inside the worker subprocess after fork(), so each worker gets
-        its own clean rendering context rather than inheriting a stale one from
-        the parent process (which causes crashes with AsyncVectorEnv).
-
-        Retries on `PhysicsError`: VLABench's `LM4ManipDMEnv.reset()` runs 20
-        warm-up `step()` calls while toggling gravity/fluids to let the scene
-        settle; for some random layouts MuJoCo's integrator diverges and
-        raises `mjWARN_BADQACC`. Re-sampling the layout almost always yields
-        a stable one, so we retry a number of times before giving up. Between
-        attempts we reseed NumPy's global RNG from OS entropy so the upstream
-        task sampler explores fresh initial states — without this, retries
-        can replay the same diverging configuration when the sampler is
-        deterministic given the current RNG state.
-        """
-        if self._env is not None:
-            return
-
-        import VLABench.robots  # noqa: F401  # type: ignore[import-untyped]
-        import VLABench.tasks  # noqa: F401  # type: ignore[import-untyped]
-        from dm_control.rl.control import PhysicsError  # type: ignore[import-untyped]
-        from VLABench.envs import load_env  # type: ignore[import-untyped]
-
-        h, w = self.render_resolution
-        last_exc: PhysicsError | None = None
-        for attempt in range(1, self._ENSURE_ENV_MAX_ATTEMPTS + 1):
-            try:
-                env = load_env(task=self.task, robot=self.robot, render_resolution=(h, w))
-                self._env = env
-                break
-            except PhysicsError as exc:
-                last_exc = exc
-                logger.warning(
-                    "PhysicsError on attempt %d/%d while building task '%s': %s. Retrying with fresh layout…",
-                    attempt,
-                    self._ENSURE_ENV_MAX_ATTEMPTS,
-                    self.task,
-                    exc,
-                )
-                np.random.seed(None)
-        if self._env is None:
-            assert last_exc is not None
-            raise RuntimeError(
-                f"VLABench task '{self.task}' failed to produce a stable "
-                f"initial layout after {self._ENSURE_ENV_MAX_ATTEMPTS} "
-                f"attempts. This task's upstream sampler diverges too "
-                f"often for the configured robot; consider removing it "
-                f"from the eval set. Last physics error: {last_exc}"
-            ) from last_exc
-
-        # Extract task description from the dm_control task
-        task_obj = self._env.task
-        if hasattr(task_obj, "task_description"):
-            self.task_description = task_obj.task_description
-        elif hasattr(task_obj, "language_instruction"):
-            self.task_description = task_obj.language_instruction
-        else:
-            self.task_description = self.task
-
-        # Cache robot base world position so `_build_ctrl_from_action` and
-        # `_get_obs` can translate between robot-frame (dataset) and
-        # world-frame (dm_control) without hitting physics every call.
-        try:
-            self._robot_base_xyz = np.asarray(self._env.get_robot_frame_position(), dtype=np.float64).reshape(
-                3
-            )
-        except Exception:
-            # Fallback to VLABench's default Franka base position.
-            self._robot_base_xyz = np.array([0.0, -0.4, 0.78], dtype=np.float64)
-
-    def _get_obs(self) -> dict:
-        """Get current observation from the environment."""
-        assert self._env is not None
-
-        obs = self._env.get_observation()
-        h, w = self.render_resolution
-
-        def _to_hwc3(arr: np.ndarray) -> np.ndarray:
-            """Coerce any camera array to the declared (h, w, 3) uint8 shape."""
-            a = np.asarray(arr)
-            # Drop a leading singleton batch dim if present.
-            while a.ndim > 3 and a.shape[0] == 1:
-                a = a[0]
-            if a.ndim == 3 and a.shape[0] in (1, 3, 4) and a.shape[-1] not in (1, 3, 4):
-                # CHW → HWC
-                a = np.transpose(a, (1, 2, 0))
-            if a.ndim == 2:
-                a = np.stack([a] * 3, axis=-1)
-            if a.ndim != 3:
-                return np.zeros((h, w, 3), dtype=np.uint8)
-            # Force 3 channels.
-            if a.shape[-1] == 1:
-                a = np.repeat(a, 3, axis=-1)
-            elif a.shape[-1] == 4:
-                a = a[..., :3]
-            elif a.shape[-1] != 3:
-                return np.zeros((h, w, 3), dtype=np.uint8)
-            if a.shape[:2] != (h, w):
-                a = cv2.resize(a, (w, h), interpolation=cv2.INTER_AREA)
-            return a.astype(np.uint8)
-
-        # Extract camera images — VLABench returns (n_cameras, C, H, W) or individual arrays
-        raw_frames: list[np.ndarray] = []
-        if "rgb" in obs:
-            rgb = obs["rgb"]
-            if isinstance(rgb, np.ndarray):
-                if rgb.ndim == 4:
-                    raw_frames = [rgb[i] for i in range(rgb.shape[0])]
-                elif rgb.ndim == 3:
-                    raw_frames = [rgb]
-
-        image_keys = ["image", "second_image", "wrist_image"]
-        images: dict[str, np.ndarray] = {}
-        for i, key in enumerate(image_keys):
-            if i < len(raw_frames):
-                images[key] = _to_hwc3(raw_frames[i])
-            else:
-                images[key] = np.zeros((h, w, 3), dtype=np.uint8)
-
-        # Convert VLABench's raw ee_state `[pos_world(3), quat_wxyz(4), open(1)]`
-        # to the dataset's observation.state layout `[pos_robot(3), euler_xyz(3),
-        # gripper(1)]`. See VLABench/scripts/convert_to_lerobot.py — positions
-        # are stored in robot-base frame and orientations as scipy extrinsic
-        # 'xyz' euler angles.
-        raw = np.asarray(obs.get("ee_state", np.zeros(8)), dtype=np.float64).ravel()
-        pos_world = raw[:3] if raw.size >= 3 else np.zeros(3, dtype=np.float64)
-        quat_wxyz = raw[3:7] if raw.size >= 7 else np.array([1.0, 0.0, 0.0, 0.0], dtype=np.float64)
-        gripper = float(raw[7]) if raw.size >= 8 else 0.0
-
-        base = self._robot_base_xyz if self._robot_base_xyz is not None else np.zeros(3, dtype=np.float64)
-        pos_robot = pos_world - base
-        euler_xyz = Rotation.from_quat([quat_wxyz[1], quat_wxyz[2], quat_wxyz[3], quat_wxyz[0]]).as_euler(
-            "xyz", degrees=False
-        )
-
-        ee_state = np.concatenate([pos_robot, euler_xyz, [gripper]]).astype(np.float64)
-
-        if self.obs_type == "pixels":
-            return {"pixels": images}
-        elif self.obs_type == "pixels_agent_pos":
-            return {
-                "pixels": images,
-                "agent_pos": ee_state.astype(np.float64),
-            }
-        else:
-            raise ValueError(f"Unknown obs_type: {self.obs_type}")
-
-    # ---- Action adaptation (EEF → joint ctrl) --------------------------------
-    #
-    # The HF vlabench datasets log 7D actions
-    # `[x, y, z (robot frame), rx, ry, rz (scipy extrinsic xyz), gripper]`,
-    # exactly matching VLABench's own eval pipeline (evaluator.base):
-    #   pos, euler, g = policy(...)
-    #   quat = euler_to_quaternion(*euler)      # extrinsic xyz -> wxyz
-    #   _, qpos = robot.get_qpos_from_ee_pos(physics, pos=pos + base, quat=quat)
-    #   env.step(np.concatenate([qpos, [g, g]]))
-    #
-    # VLABench's dm_control task writes `data.ctrl[:] = action` directly — for
-    # Franka that's 9 entries (7 arm joints + 2 gripper fingers). We mirror the
-    # above conversion so the policy's EEF commands actually drive the robot.
-
-    _FRANKA_FINGER_OPEN = 0.04  # qpos when gripper fully open
-
-    def _build_ctrl_from_action(self, action: np.ndarray, ctrl_dim: int) -> np.ndarray:
-        """Convert a 7D EEF action into the `ctrl_dim`-sized joint command vector.
-
-        For the Franka default (ctrl_dim=9): 7 arm joint qposes (via IK) +
-        2 gripper finger qposes (open/closed based on the gripper scalar).
-        If the action is already joint-space (shape matches ctrl_dim), pass
-        through.
-        """
-        if action.shape[0] == ctrl_dim:
-            return action.astype(np.float64, copy=False)
-
-        if action.shape[0] != 7:
-            # Unknown layout — fall back to zero-pad so the sim doesn't crash.
-            padded = np.zeros(ctrl_dim, dtype=np.float64)
-            padded[: min(action.shape[0], ctrl_dim)] = action[:ctrl_dim]
-            return padded
-
-        from dm_control.utils.inverse_kinematics import qpos_from_site_pose
-
-        # Action position is in robot-base frame (see convert_to_lerobot.py);
-        # dm_control's IK expects a world-frame target.
-        base = self._robot_base_xyz if self._robot_base_xyz is not None else np.zeros(3, dtype=np.float64)
-        pos_world = np.asarray(action[:3], dtype=np.float64) + base
-        rx, ry, rz = float(action[3]), float(action[4]), float(action[5])
-        gripper = float(np.clip(action[6], 0.0, 1.0))
-
-        # Dataset euler is scipy extrinsic 'xyz' (same as VLABench's
-        # `euler_to_quaternion`). scipy emits `[x, y, z, w]`; dm_control's IK
-        # and MuJoCo use `[w, x, y, z]`, so reorder.
-        qxyzw = Rotation.from_euler("xyz", [rx, ry, rz], degrees=False).as_quat()
-        quat = np.array([qxyzw[3], qxyzw[0], qxyzw[1], qxyzw[2]], dtype=np.float64)
-
-        assert self._env is not None
-        robot = self._env.task.robot
-        site_name = robot.end_effector_site.full_identifier
-
-        # inplace=False so IK doesn't mutate physics state mid-step — we only
-        # want the solved qpos. Fetch a fresh physics handle — caching it can
-        # yield a stale weakref after a reset.
-        ik_result = qpos_from_site_pose(
-            self._env.physics,
-            site_name=site_name,
-            target_pos=pos_world,
-            target_quat=quat,
-            inplace=False,
-            max_steps=100,
-        )
-        n_dof = robot.n_dof  # 7 for Franka
-        arm_qpos = ik_result.qpos[:n_dof]
-
-        # Dataset gripper convention: 1 = open (finger qpos = 0.04),
-        # 0 = closed (finger qpos = 0.0). See VLABench/scripts/convert_to_lerobot.py
-        # where `trajectory[i][-1] > 0.03` is encoded as `1`.
-        finger_qpos = gripper * self._FRANKA_FINGER_OPEN
-
-        ctrl = np.zeros(ctrl_dim, dtype=np.float64)
-        ctrl[:n_dof] = arm_qpos
-        # Remaining entries are gripper fingers (usually 2 for Franka).
-        ctrl[n_dof:] = finger_qpos
-        return ctrl
-
-    def reset(self, seed=None, **kwargs) -> tuple[RobotObservation, dict[str, Any]]:
-        self._ensure_env()
-        assert self._env is not None
-        super().reset(seed=seed)
-
-        if seed is not None:
-            self._seed_inner_env(int(self.np_random.integers(0, 2**31 - 1)))
-
-        self._env.reset()
-
-        observation = self._get_obs()
-        info = {"is_success": False}
-        return observation, info
-
-    def _seed_inner_env(self, seed: int) -> None:
-        """Propagate `seed` to the inner dm_control env. `Environment.reset()`
-        doesn't accept a seed, so we re-seed the task and environment
-        `RandomState`s directly. Best-effort: silently skipped when the
-        expected attributes are absent on a given VLABench version.
-        """
-        for owner_attr, rng_attr in (("task", "random"), (None, "_random_state")):
-            owner = getattr(self._env, owner_attr) if owner_attr else self._env
-            rng = getattr(owner, rng_attr, None)
-            rng_seed = getattr(rng, "seed", None)
-            if callable(rng_seed):
-                rng_seed(seed)
-
-    def step(self, action: np.ndarray) -> tuple[RobotObservation, float, bool, bool, dict[str, Any]]:
-        from dm_control.rl.control import PhysicsError  # type: ignore[import-untyped]
-
-        self._ensure_env()
-        assert self._env is not None
-
-        if action.ndim != 1:
-            raise ValueError(
-                f"Expected action to be 1-D (shape (action_dim,)), "
-                f"but got shape {action.shape} with ndim={action.ndim}"
-            )
-
-        if self.action_mode not in ("eef", "joint", "delta_eef"):
-            raise ValueError(f"Unknown action_mode: {self.action_mode}")
-
-        # Always refetch physics — dm_control returns a weakref proxy that can
-        # go stale across resets.
-        physics = self._env.physics
-        ctrl_dim = int(physics.data.ctrl.shape[0])
-        ctrl = self._build_ctrl_from_action(action, ctrl_dim)
-        try:
-            timestep = self._env.step(ctrl)
-        except PhysicsError as exc:
-            # Physics integrator diverged (e.g. mjWARN_BADQACC). Treat it as
-            # a graceful failed termination rather than a hard crash — the
-            # rest of the multi-task eval should still run.
-            logger.warning(
-                "PhysicsError during step on task '%s': %s. Terminating episode.",
-                self.task,
-                exc,
-            )
-            observation = self._get_obs()
-            info = {"task": self.task, "is_success": False, "physics_error": True}
-            # Drop the stale env so the next reset() rebuilds it cleanly.
-            with contextlib.suppress(Exception):
-                self._env.close()
-            self._env = None
-            return observation, 0.0, True, False, info
-
-        # Extract reward from dm_control timestep
-        reward = float(timestep.reward) if timestep.reward is not None else 0.0
-
-        # Check success via the task's termination condition
-        is_success = False
-        if hasattr(self._env, "task") and hasattr(self._env.task, "should_terminate_episode"):
-            is_success = bool(self._env.task.should_terminate_episode(self._env.physics))
-
-        terminated = is_success
-        truncated = False
-        info = {
-            "task": self.task,
-            "is_success": is_success,
-        }
-
-        observation = self._get_obs()
-
-        if terminated:
-            self.reset()
-
-        return observation, reward, terminated, truncated, info
-
-    def render(self) -> np.ndarray:
-        self._ensure_env()
-        obs = self._get_obs()
-        return obs["pixels"]["image"]
-
-    def close(self):
-        if self._env is not None:
-            self._env.close()
-            self._env = None
-
-
-# ---- Main API ----------------------------------------------------------------
-
-
-def create_vlabench_envs(
-    task: str,
-    n_envs: int,
-    gym_kwargs: dict[str, Any] | None = None,
-    env_cls: Callable[[Sequence[Callable[[], Any]]], Any] | None = None,
-) -> dict[str, dict[int, Any]]:
-    """
-    Create vectorized VLABench environments with a consistent return shape.
-
-    Returns:
-        dict[suite_name][task_id] -> vec_env (env_cls([...]) with exactly n_envs factories)
-
-    Notes:
-        - n_envs is the number of rollouts *per task*.
-        - `task` can be a suite name ("primitive", "composite"), a comma-separated list of
-          suite names, or individual task names (e.g. "select_fruit,heat_food").
-    """
-    if env_cls is None or not callable(env_cls):
-        raise ValueError("env_cls must be a callable that wraps a list of environment factory callables.")
-    if not isinstance(n_envs, int) or n_envs <= 0:
-        raise ValueError(f"n_envs must be a positive int; got {n_envs}.")
-
-    gym_kwargs = dict(gym_kwargs or {})
-    task_groups = [t.strip() for t in task.split(",") if t.strip()]
-    if not task_groups:
-        raise ValueError("`task` must contain at least one VLABench task or suite name.")
-
-    logger.info(
-        "Creating VLABench envs | task_groups=%s | n_envs(per task)=%d",
-        task_groups,
-        n_envs,
-    )
-
-    is_async = env_cls is gym.vector.AsyncVectorEnv
-    cached_obs_space = None
-    cached_act_space = None
-    cached_metadata = None
-    out: dict[str, dict[int, Any]] = defaultdict(dict)
-
-    for group in task_groups:
-        # Check if it's a suite name, otherwise treat as individual task
-        tasks = SUITE_TASKS.get(group, [group])
-
-        for tid, task_name in enumerate(tasks):
-            logger.info(
-                "Building vec env | group=%s | task_id=%d | task=%s",
-                group,
-                tid,
-                task_name,
-            )
-
-            fns = [(lambda tn=task_name: VLABenchEnv(task=tn, **gym_kwargs)) for _ in range(n_envs)]
-
-            if is_async:
-                lazy = _LazyAsyncVectorEnv(fns, cached_obs_space, cached_act_space, cached_metadata)
-                if cached_obs_space is None:
-                    cached_obs_space = lazy.observation_space
-                    cached_act_space = lazy.action_space
-                    cached_metadata = lazy.metadata
-                out[group][tid] = lazy
-            else:
-                out[group][tid] = env_cls(fns)
-
-    return {group: dict(task_map) for group, task_map in out.items()}
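
Reviewer note on the removed `VLABenchEnv`: both `_get_obs` and `_build_ctrl_from_action` reorder quaternion components by hand, because scipy's `Rotation` uses `[x, y, z, w]` while dm_control and MuJoCo use `[w, x, y, z]`. The reordering can be sketched with the standard library alone. The helper names below are hypothetical and do not appear in the diff.

```python
from typing import Sequence, Tuple

Quat = Tuple[float, float, float, float]


def wxyz_to_xyzw(q: Sequence[float]) -> Quat:
    """Scalar-first (MuJoCo / dm_control) -> scalar-last (scipy)."""
    w, x, y, z = q
    return (x, y, z, w)


def xyzw_to_wxyz(q: Sequence[float]) -> Quat:
    """Scalar-last (scipy) -> scalar-first (MuJoCo / dm_control)."""
    x, y, z, w = q
    return (w, x, y, z)


# The identity rotation in each convention.
identity_wxyz = (1.0, 0.0, 0.0, 0.0)
identity_xyzw = wxyz_to_xyzw(identity_wxyz)
```

Naming the two directions explicitly makes it harder to apply the wrong permutation than indexing inline, as the removed code did.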
@@ -12,19 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-from __future__ import annotations
-
-from typing import TYPE_CHECKING
-
 import numpy as np

-from lerobot.utils.import_utils import _placo_available, require_package
-
-if TYPE_CHECKING or _placo_available:
-    import placo  # type: ignore[import-not-found]
-else:
-    placo = None
-

 class RobotKinematics:
    """Robot kinematics using placo library for forward and inverse kinematics."""
@@ -43,7 +32,13 @@ class RobotKinematics:
            target_frame_name (str): Name of the end-effector frame in the URDF
            joint_names (list[str] | None): List of joint names to use for the kinematics solver
        """
-        require_package("placo", extra="placo-dep")
+        try:
+            import placo  # type: ignore[import-not-found] # C++ library with Python bindings, no type stubs available. TODO: Create stub file or request upstream typing support.
+        except ImportError as e:
+            raise ImportError(
+                "placo is required for RobotKinematics. "
+                "Please install the optional dependencies of `kinematics` in the package."
+            ) from e

        self.robot = placo.RobotWrapper(urdf_path)
        self.solver = placo.KinematicsSolver(self.robot)
@@ -24,7 +24,7 @@ from functools import cached_property
 from typing import TYPE_CHECKING, Any, TypedDict

 from lerobot.utils.decorators import check_if_already_connected, check_if_not_connected
-from lerobot.utils.import_utils import _can_available, require_package
+from lerobot.utils.import_utils import _can_available

 if TYPE_CHECKING or _can_available:
    import can
@@ -111,7 +111,6 @@ class DamiaoMotorsBus(MotorsBusBase):
            bitrate: Nominal bitrate in bps (default: 1000000 = 1 Mbps)
            data_bitrate: Data bitrate for CAN FD in bps (default: 5000000 = 5 Mbps), ignored if use_can_fd is False
        """
-        require_package("python-can", extra="damiao", import_name="can")
        super().__init__(port, motors, calibration)
        self.port = port
        self.can_interface = can_interface
@@ -216,14 +216,6 @@ class FeetechMotorsBus(SerialMotorsBus):
                self.write("Maximum_Acceleration", motor, maximum_acceleration)
            self.write("Acceleration", motor, acceleration)

-            # Clear bit 4 (0x10) of the Phase register (0x12) to set angle feedback mode to 0.
-            # This forces position readings to be in the range [0, resolution - 1] and prevents overflow or negative values.
-            # Only known to be necessary for the STS3215.
-            if self.motors[motor].model == "sts3215":
-                phase = self.read("Phase", motor, normalize=False)
-                if phase & 0x10:
-                    self.write("Phase", motor, phase & ~0x10)
-
    @property
    def is_calibrated(self) -> bool:
        motors_calibration = self.read_calibration()
@@ -356,8 +356,8 @@ class SerialMotorsBus(MotorsBusBase):
        motors: dict[str, Motor],
        calibration: dict[str, MotorCalibration] | None = None,
    ):
-        require_package("pyserial", extra="pyserial-dep", import_name="serial")
-        require_package("deepdiff", extra="deepdiff-dep")
+        require_package("pyserial", extra="hardware", import_name="serial")
+        require_package("deepdiff", extra="hardware")
        super().__init__(port, motors, calibration)

        self.port_handler: PortHandler
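
Reviewer note: the hunks above use two styles of optional-dependency guard, an inline `try`/`except ImportError` in `RobotKinematics` and `require_package(...)` in `SerialMotorsBus`. The body of `require_package` is not part of this diff, so the following is only a minimal sketch of how such a guard might behave, assuming it checks importability and raises with an install hint. The function name `require_package_sketch` and the error wording are hypothetical.

```python
import importlib.util
from typing import Optional


def require_package_sketch(package: str, extra: str, import_name: Optional[str] = None) -> None:
    """Raise ImportError with an install hint if the module cannot be found.

    `package` is the distribution name (e.g. "pyserial"); `import_name` is the
    module name when it differs from the distribution name (e.g. "serial").
    """
    module_name = import_name or package
    # find_spec returns None for a missing top-level module without importing it.
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"'{package}' is required but not installed. "
            f"Install the optional '{extra}' extra to use this feature."
        )


# A module that always exists passes silently.
require_package_sketch("json", extra="hardware")
```

Checking with `find_spec` keeps the guard cheap, since the heavy module is only imported where it is actually used.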
Author	SHA1	Message	Date
Steven Palma	62135d846f	license + peft-dep + init groot + flat import layering utils dataset	2026-04-12 16:43:24 +02:00
Steven Palma	718d2fc59d	fix fast tests	2026-04-12 14:46:23 +02:00
Steven Palma	8e75f61b31	update fast ci tests	2026-04-12 14:11:50 +02:00
Steven Palma	2bf33ccb98	fix leaking imports in minimal testing	2026-04-12 13:52:45 +02:00
Steven Palma	27292a3432	complete migration	2026-04-12 12:19:26 +02:00
Steven Palma	87528186c0	address minor review comments	2026-04-12 11:27:59 +02:00
Steven Palma	8ef4d78178	upgrade uv lock	2026-04-12 10:40:11 +02:00
Steven Palma	5ccf99b930	add explicit transitative deps	2026-04-12 10:20:44 +02:00
Steven Palma	1624fc1797	is_available checks centralized	2026-04-12 09:56:03 +02:00
Steven Palma	b132e2b5d6	docs and examples imports update	2026-04-12 09:43:13 +02:00
Steven Palma	89b4652de0	fix diffusion tests ci	2026-04-11 21:23:12 +02:00
Steven Palma	5940126fb5	fix test imports	2026-04-11 21:07:53 +02:00
Steven Palma	c9636bb53f	fix policy imports	2026-04-11 20:39:03 +02:00
Steven Palma	af0d72bd42	refactor import fixes	2026-04-11 18:02:59 +02:00
Steven Palma	d626964119	big imports refactor	2026-04-11 15:03:24 +02:00
Steven Palma	964acd0151	refactor: more changes	2026-04-11 11:13:15 +02:00
Steven Palma	4767f51971	Merge branch 'main' into feat/minimal_default_install	2026-04-10 20:57:38 +02:00
Steven Palma	4c39981908	refactor: minor improvements	2026-04-10 18:31:07 +02:00
Steven Palma	882a6b0965	refactor: several fixes	2026-04-10 15:35:31 +02:00
Steven Palma	e2381633cd	feat(dependecies): minimal default tag install	2026-04-10 14:22:13 +02:00