Merge branch 'main' into feat/behavior-1k

* refactor behaviour1k_lerobot_dataset.py
2026-05-12 23:29:52 +00:00 · 2025-12-04 18:50:56 +01:00 · 2025-11-03 13:28:31 +01:00 · 2025-11-03 12:23:12 +00:00 · 2025-10-30 18:14:09 +01:00 · 2025-10-30 18:12:50 +01:00
431 changed files with 9501 additions and 42323 deletions
@@ -12,83 +12,57 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-name: "🚀 Issue / Bug / Request"
-description: Report a bug, suggest an improvement, or ask a technical question.
+name: "\U0001F41B Bug Report"
+description: Submit a bug report to help us improve LeRobot
 body:
  - type: markdown
    attributes:
      value: |
-        ### Thanks for contributing to LeRobot! 🙌
-        Please choose the most relevant sections below. If this is a general "how-to" question, consider our [Discord](https://discord.gg/s3KuuzsPFb) for faster community support.
-
-  - type: dropdown
-    id: issue-type
-    attributes:
-      label: Ticket Type
-      description: What kind of ticket are you opening?
-      options:
-        - "🐛 Bug Report (Something isn't working)"
-        - "💡 Feature Request / Improvement"
-        - "❓ Technical Question"
-        - "🧹 Maintenance / Documentation"
-    validations:
-      required: true
+        Thanks for taking the time to submit a bug report! 🐛
+        If this is not a bug in the LeRobot library itself, but a general question about your code or the library, please use our [Discord](https://discord.gg/s3KuuzsPFb).

  - type: textarea
    id: system-info
    attributes:
-      label: Environment & System Info
-      description: |
-        For bugs or technical questions, please run `lerobot-info` and paste the output.
-        (Optional for feature requests).
+      label: System Info
+      description: Please share your LeRobot configuration by running `lerobot-info` (if installed) or `python -m lerobot.scripts.display_sys_info` (if not installed) and pasting the output below.
      render: Shell
-      placeholder: lerobot version, OS, python version, etc.
+      placeholder: lerobot version, OS, python version, numpy version, torch version, and lerobot's configuration
+    validations:
+      required: true
+
+  - type: checkboxes
+    id: information-scripts-examples
+    attributes:
+      label: Information
+      description: 'The problem arises when using:'
+      options:
+        - label: "One of the scripts in the examples/ folder of LeRobot"
+        - label: "My own task or dataset (give details below)"

  - type: textarea
-    id: description
+    id: reproduction
    validations:
      required: true
    attributes:
-      label: Description
+      label: Reproduction
      description: |
-        Provide a clear summary of the issue or your proposal.
-        - **Bugs:** What is happening?
-        - **Features:** What is the goal/use case?
-        - **Questions:** What are you trying to achieve?
+        If needed, provide a simple code sample that reproduces the problem you ran into. It can be a Colab link or just a code snippet.
+        Sharing error messages or stack traces could be useful as well!
+        Important! Use code tags to correctly format your code. See https://help.github.com/en/github/writing-on-github/creating-and-highlighting-code-blocks#syntax-highlighting
+        Try to avoid screenshots, as they are hard to read and don't allow copy-and-pasting.
+
      placeholder: |
-        A clear and concise description of the issue or suggestion.
+        Steps to reproduce the behavior:
+
+          1.
+          2.
+          3.

  - type: textarea
-    id: context-repro
+    id: expected-behavior
+    validations:
+      required: true
    attributes:
-      label: Context & Reproduction
-      description: |
-        Provide a code snippet, steps to reproduce a bug, or technical details about your proposal.
-        Please use code blocks for scripts and CLI commands.
-      placeholder: |
-        Steps to reproduce / Usage example:
-        1.
-        2.
-        3.
-
-  - type: textarea
-    id: logs
-    attributes:
-      label: Relevant logs or stack trace
-      description: If applicable, paste relevant error logs here.
-      render: Shell
-
-  - type: checkboxes
-    id: extras
-    attributes:
-      label: Checklist
-      options:
-        - label: I have searched existing tickets to ensure this isn't a duplicate.
-        - label: I am using the latest version of the `main` branch.
-        - label: I have verified this is not an environment-specific problem.
-
-  - type: textarea
-    id: workaround
-    attributes:
-      label: Additional Info / Workarounds
-      description: Anything else we should know? If you have a workaround, please share it!
+      label: Expected behavior
+      description: "A clear and concise description of what you would expect to happen."
@@ -1,55 +1,41 @@
-## Title
+## What this does

-Short, imperative summary (e.g., "fix(robots): handle None in sensor parser"). See [CONTRIBUTING.md](../CONTRIBUTING.md) for PR conventions.
+Explain what this PR does. Feel free to tag your PR with the appropriate label(s).

-## Type / Scope
+Examples:
+| Title | Label |
+|----------------------|-----------------|
+| Fixes #[issue] | (🐛 Bug) |
+| Adds new dataset | (🗃️ Dataset) |
+| Optimizes something | (⚡️ Performance) |

- **Type**: (Bug | Feature | Docs | Performance | Test | CI | Chore)
- **Scope**: (optional — name of module or package affected)
+## How it was tested

-## Summary / Motivation
+Explain/show how you tested your changes.

- One-paragraph description of what changes and why.
- Why this change is needed and any trade-offs or design notes.
+Examples:

-## Related issues
+- Added `test_something` in `tests/test_stuff.py`.
+- Added `new_feature` and checked that training converges with policy X on dataset/environment Y.
+- Optimized `some_function`; it now runs X times faster than before.

- Fixes / Closes: # (if any)
- Related: # (if any)
+## How to checkout & try? (for the reviewer)

-## What changed
+Provide a simple way for the reviewer to try out your changes.

- Short, concrete bullets of the modifications (files/behaviour).
- Short note if this introduces breaking changes and migration steps.
+Examples:

-## How was this tested (or how to run locally)
+```bash
+pytest -sx tests/test_stuff.py::test_something
+```

- Tests added: list new tests or test files.
- Manual checks / dataset runs performed.
- Instructions for the reviewer
+```bash
+lerobot-train --some.option=true
+```

-Example:
+## SECTION TO REMOVE BEFORE SUBMITTING YOUR PR

- Ran the relevant tests:
+**Note**: Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
+members/contributors who may be interested in your PR. Try to avoid tagging more than 3 people.

-  ```bash
-  pytest -q tests/ -k <keyword>
-  ```
-
- Reproduce with a quick example or CLI (if applicable):
-
-  ```bash
-  lerobot-train --some.option=true
-  ```
-
-## Checklist (required before merge)
-
- [ ] Linting/formatting run (`pre-commit run -a`)
- [ ] All tests pass locally (`pytest`)
- [ ] Documentation updated
- [ ] CI is green
-
-## Reviewer notes
-
- Anything the reviewer should focus on (performance, edge-cases, specific files) or general notes.
- Anyone in the community is free to review the PR.
+**Note**: Before submitting this PR, please read the [contributor guideline](https://github.com/huggingface/lerobot/blob/main/CONTRIBUTING.md#submitting-a-pull-request-pr).
@@ -1,69 +0,0 @@
-# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-CI:
-  - changed-files:
-      - any-glob-to-any-file:
-        - '.github/**'
-        - 'docker/**'
-
-github_actions:
-  - changed-files:
-      - any-glob-to-any-file: '.github/**'
-
-documentation:
-  - changed-files:
-      - any-glob-to-any-file:
-          - '**/*.md'
-          - '**/*.mdx'
-          - 'docs/**'
-
-examples:
-  - changed-files:
-      - any-glob-to-any-file: 'examples/**'
-
-tests:
-  - changed-files:
-      - any-glob-to-any-file: 'tests/**'
-
-sensors:
-  - changed-files:
-      - any-glob-to-any-file: 'src/lerobot/cameras/**'
-
-configuration:
-  - changed-files:
-      - any-glob-to-any-file: 'src/lerobot/configs/**'
-
-dataset:
-  - changed-files:
-      - any-glob-to-any-file: 'src/lerobot/datasets/**'
-
-evaluation:
-  - changed-files:
-      - any-glob-to-any-file: 'src/lerobot/envs/**'
-
-robots:
-  - changed-files:
-      - any-glob-to-any-file:
-          - 'src/lerobot/teleoperators/**'
-          - 'src/lerobot/robots/**'
-          - 'src/lerobot/motors/**'
-
-policies:
-  - changed-files:
-      - any-glob-to-any-file: 'src/lerobot/policies/**'
-
-processor:
-  - changed-files:
-      - any-glob-to-any-file: 'src/lerobot/processor/**'
@@ -31,8 +31,7 @@ jobs:
    name: Upload Preview and Comment
    if: >
      github.event.workflow_run.event == 'pull_request' &&
-      github.event.workflow_run.conclusion == 'success' &&
-      github.repository == 'huggingface/lerobot'
+      github.event.workflow_run.conclusion == 'success'
    uses: huggingface/doc-builder/.github/workflows/upload_pr_documentation.yml@main
    with:
      package_name: lerobot
@@ -18,11 +18,6 @@ name: Documentation
 on:
  # Allows running this workflow manually from the Actions tab
  workflow_dispatch:
-    inputs:
-      version:
-        description: 'Version tag (e.g. v0.1.2) - Leave empty for standard main build'
-        required: false
-        type: string

  # Triggers the workflow on push events to main for the docs folder
  push:
@@ -38,9 +33,6 @@ on:
    paths:
      - "docs/**"

-  release:
-    types: [published]
-
 # Ensures that only the latest commit for a PR or branch is built, canceling older runs.
 concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
@@ -50,22 +42,14 @@ jobs:
  # This job builds and deploys the official documentation.
  build_main_docs:
    name: Build Main Docs
-    if: >
-      (github.event_name == 'push' || github.event_name == 'workflow_dispatch' || github.event_name == 'release') &&
-      github.repository == 'huggingface/lerobot'
+    if: github.event_name == 'push' || github.event_name == 'workflow_dispatch'
    permissions:
      contents: read
    uses: huggingface/doc-builder/.github/workflows/build_main_documentation.yml@main
    with:
      commit_sha: ${{ github.sha }}
      package: lerobot
-      additional_args: >-
-        --not_python_module
-        ${{
-          (github.event_name == 'release' && format('--version {0}', github.event.release.tag_name)) ||
-          (inputs.version != '' && format('--version {0}', inputs.version)) ||
-          ''
-        }}
+      additional_args: --not_python_module
    secrets:
      token: ${{ secrets.HUGGINGFACE_PUSH }}
      hf_token: ${{ secrets.HF_DOC_BUILD_PUSH }}
@@ -74,7 +58,7 @@ jobs:
  # The result of this job triggers the 'Upload PR Documentation' workflow.
  build_pr_docs:
    name: Build PR Docs
-    if: github.event_name == 'pull_request' && github.repository == 'huggingface/lerobot'
+    if: github.event_name == 'pull_request'
    permissions:
      contents: read
      pull-requests: write
@@ -44,7 +44,8 @@ permissions:
 # Sets up the environment variables
 env:
  UV_VERSION: "0.8.0"
-  PYTHON_VERSION: "3.12"
+  PYTHON_VERSION: "3.10"
+  DOCKER_IMAGE_NAME: huggingface/lerobot-gpu

 # Ensures that only the latest commit for a PR or branch is built, canceling older runs.
 concurrency:
@@ -61,9 +62,8 @@ jobs:
      MUJOCO_GL: egl
      HF_HOME: /mnt/cache/.cache/huggingface
      HF_LEROBOT_HOME: /mnt/cache/.cache/huggingface/lerobot
-      HF_USER_TOKEN: ${{ secrets.LEROBOT_HF_USER }}
    steps:
-      - uses: actions/checkout@v6
+      - uses: actions/checkout@v4
        with:
          persist-credentials: false
          lfs: true
@@ -90,11 +90,5 @@ jobs:
      - name: Install lerobot with test extras
        run: uv sync --extra "test"

-      - name: Login to Hugging Face
-        if: env.HF_USER_TOKEN != ''
-        run: |
-          uv run hf auth login --token "$HF_USER_TOKEN" --add-to-git-credential
-          uv run hf auth whoami
-
      - name: Run pytest
        run: uv run pytest tests -vv --maxfail=10
@@ -37,7 +37,7 @@ permissions:
 # Sets up the environment variables
 env:
  UV_VERSION: "0.8.0"
-  PYTHON_VERSION: "3.12"
+  PYTHON_VERSION: "3.10"
  DOCKER_IMAGE_NAME: huggingface/lerobot-gpu

 # Ensures that only the latest action is built, canceling older runs.
@@ -60,9 +60,8 @@ jobs:
      MUJOCO_GL: egl
      HF_HOME: /mnt/cache/.cache/huggingface
      HF_LEROBOT_HOME: /mnt/cache/.cache/huggingface/lerobot
-      HF_USER_TOKEN: ${{ secrets.LEROBOT_HF_USER }}
    steps:
-      - uses: actions/checkout@v6
+      - uses: actions/checkout@v4
        with:
          lfs: true
          persist-credentials: false
@@ -86,13 +85,7 @@ jobs:
          python-version: ${{ env.PYTHON_VERSION }}

      - name: Install lerobot with all extras
-        run: uv sync --extra all # TODO(Steven): Make flash-attn optional
-
-      - name: Login to Hugging Face
-        if: env.HF_USER_TOKEN != ''
-        run: |
-          uv run hf auth login --token "$HF_USER_TOKEN" --add-to-git-credential
-          uv run hf auth whoami
+        run: uv sync --all-extras --no-extra groot # TODO(Steven): Make flash-attn optional

      - name: Run pytest (all extras)
        run: uv run pytest tests -vv --maxfail=10
@@ -108,11 +101,9 @@ jobs:
    runs-on:
      group: aws-general-8-plus
    if: |
-      github.repository == 'huggingface/lerobot' && (
-        (github.event_name == 'pull_request_review' && github.event.review.state == 'approved' && github.event.pull_request.head.repo.fork == false) ||
-        github.event_name == 'push' ||
-        github.event_name == 'workflow_dispatch'
-      )
+      (github.event_name == 'pull_request_review' && github.event.review.state == 'approved' && github.event.pull_request.head.repo.fork == false) ||
+      github.event_name == 'push' ||
+      github.event_name == 'workflow_dispatch'
    outputs:
      image_tag: ${{ steps.set_tag.outputs.image_tag }}
    env:
@@ -136,7 +127,7 @@ jobs:
          sudo apt-get update
          sudo apt-get install git-lfs
          git lfs install
-      - uses: actions/checkout@v6
+      - uses: actions/checkout@v4
        with:
          lfs: true
          persist-credentials: false
@@ -169,7 +160,6 @@ jobs:
      HF_LEROBOT_HOME: /home/user_lerobot/.cache/huggingface/lerobot
      TORCH_HOME: /home/user_lerobot/.cache/torch
      TRITON_CACHE_DIR: /home/user_lerobot/.cache/triton
-      HF_USER_TOKEN: ${{ secrets.LEROBOT_HF_USER }}
    container:
      image: ${{ needs.build-and-push-docker.outputs.image_tag }} # zizmor: ignore[unpinned-images]
      options: --gpus all --shm-size "16gb"
@@ -181,13 +171,6 @@ jobs:
        shell: bash
        working-directory: /lerobot
    steps:
-      - name: Login to Hugging Face
-        if: env.HF_USER_TOKEN != ''
-        run: |
-          hf auth login --token "$HF_USER_TOKEN" --add-to-git-credential
-          hf auth whoami
-      - name: Fix ptxas permissions
-        run: chmod +x /lerobot/.venv/lib/python3.12/site-packages/triton/backends/nvidia/bin/ptxas
      - name: Run pytest on GPU
        run: pytest tests -vv --maxfail=10
      - name: Run end-to-end tests
@@ -203,18 +186,15 @@ jobs:
    steps:
      - name: Get Docker Hub Token and Delete Image
        # zizmor: ignore[template-injection]
-        env:
-          DOCKERHUB_LEROBOT_USERNAME: ${{ secrets.DOCKERHUB_LEROBOT_USERNAME }}
-          DOCKERHUB_LEROBOT_PASSWORD: ${{ secrets.DOCKERHUB_LEROBOT_PASSWORD }}
-          IMAGE_FULL: ${{ needs.build-and-push-docker.outputs.image_tag }}
        run: |
-          IMAGE_NAME=$(echo "$IMAGE_FULL" | cut -d':' -f1)
-          IMAGE_TAG=$(echo "$IMAGE_FULL" | cut -d':' -f2-)
+          IMAGE_NAME=$(echo "${{ needs.build-and-push-docker.outputs.image_tag }}" | cut -d':' -f1)
+          IMAGE_TAG=$(echo "${{ needs.build-and-push-docker.outputs.image_tag }}" | cut -d':' -f2)
+
          echo "Attempting to delete image: $IMAGE_NAME:$IMAGE_TAG"

          TOKEN=$(curl -s -H "Content-Type: application/json" \
                       -X POST \
-                       -d "{\"username\": \"$DOCKERHUB_LEROBOT_USERNAME\", \"password\": \"$DOCKERHUB_LEROBOT_PASSWORD\"}" \
+                       -d '{"username": "${{ secrets.DOCKERHUB_LEROBOT_USERNAME }}", "password": "${{ secrets.DOCKERHUB_LEROBOT_PASSWORD }}"}' \
                       https://hub.docker.com/v2/users/login/ | jq -r .token)

          if [ "$TOKEN" == "null" ] || [ -z "$TOKEN" ]; then
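Review note: the cleanup step above splits the image reference on `:` with `cut`. The sketch below reproduces that split on a hypothetical reference (`huggingface/lerobot-gpu:pr-123` is an assumed value; the real one comes from the `build-and-push-docker` job output) and compares it with POSIX parameter expansion. For a reference with a single `:`, both approaches give the same result.

```shell
set -eu

# Hypothetical image reference, used only for illustration.
IMAGE_FULL="huggingface/lerobot-gpu:pr-123"

# Same split as the workflow step: field 1 is the repository, field 2 the tag.
IMAGE_NAME=$(echo "$IMAGE_FULL" | cut -d':' -f1)
IMAGE_TAG=$(echo "$IMAGE_FULL" | cut -d':' -f2)

# Alternative using parameter expansion (no pipeline, no cut).
NAME_PE="${IMAGE_FULL%%:*}"   # remove everything from the first ':' onward
TAG_PE="${IMAGE_FULL#*:}"     # remove everything up to and including the first ':'

echo "cut:       name=$IMAGE_NAME tag=$IMAGE_TAG"
echo "expansion: name=$NAME_PE tag=$TAG_PE"
```

Note that `-f2` keeps only the second field, whereas `-f2-` (the form being removed) and `${IMAGE_FULL#*:}` keep everything after the first `:`. The two only differ if the reference contains more than one colon.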
@@ -225,7 +205,7 @@ jobs:
          HTTP_RESPONSE=$(curl -s -o /dev/null -w "%{http_code}" \
                               -H "Authorization: JWT ${TOKEN}" \
                               -X DELETE \
-                               https://hub.docker.com/v2/repositories/${IMAGE_NAME}/tags/$IMAGE_TAG)
+                               https://hub.docker.com/v2/repositories/${IMAGE_NAME}/tags/${IMAGE_TAG}/)

          if [ "$HTTP_RESPONSE" -eq 204 ]; then
            echo "Successfully deleted Docker image tag: $IMAGE_NAME:$IMAGE_TAG"
@@ -1,77 +0,0 @@
-# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-# This workflow automatically labels issues based on their content.
-name: Issue Labeler
-on:
-  # Trigger on new issues and edits to existing issues
-  issues:
-    types: [opened, edited]
-
-permissions:
-  contents: read
-  issues: write
-
-jobs:
-  label-issue:
-    name: Auto Label Issue
-    runs-on: ubuntu-latest
-    if: github.repository == 'huggingface/lerobot'
-    steps:
-      - uses: actions/github-script@v8
-        with:
-          script: |
-            // Setup Input Text
-            const body = (context.payload.issue.body || '');
-            const title = (context.payload.issue.title || '');
-            const cleanBody = body.replace(/```[\s\S]*?```/g, '');
-            const text = `${title}\n${cleanBody}`.toLowerCase();
-            const labelsToAdd = new Set();
-            const matches = (re) => re.test(text);
-
-            // Keyword Heuristics
-
-            if (matches(/\b(bug|error|crash|exception)\b/i)) labelsToAdd.add('bug');
-            if (matches(/\b(new feature|enhancement|improvement|proposal|feature request)\b/i)) labelsToAdd.add('enhancement');
-            if (matches(/\b(question|how to|clarify|explain|how do i|help me|question about)\b/i)) labelsToAdd.add('question');
-            if (matches(/\b(documentation|docs?|readme|tutorial|wiki|typo|docstring)\b/i)) labelsToAdd.add('documentation');
-            if (matches(/\b(example|sample|demo|notebook)s?\b/i)) labelsToAdd.add('examples');
-            if (matches(/\b(datasets?|data loader|data augmentation|data preprocessing)\b/i)) labelsToAdd.add('dataset');
-            if (matches(/\b(mujoco|isaac|simulation|sim)\b/i)) labelsToAdd.add('simulation');
-            if (matches(/\b(train|training|optimizer|gradient|wandb|sac)\b/i)) labelsToAdd.add('training');
-            if (matches(/\b(rerun|plot|render|rendering|visualizer)/i)) labelsToAdd.add('visualization');
-            if (matches(/\b(cameras?|opencv|realsense|lidars?|sensors?|imus?|microphones?|rgbd|encoders?)\b/i)) labelsToAdd.add('sensors');
-            if (matches(/\b(urdf|actuators?|calibration|end-effector|kinematics)\b/i)) labelsToAdd.add('robots');
-            if (matches(/\b(teleop|teleoperator|controller|leader|follower|joystick|gamepad)\b/i)) labelsToAdd.add('teleoperators');
-            if (matches(/\b(policy|policies|model?)\b/i)) labelsToAdd.add('policies');
-            if (matches(/\b(processor|pipeline|preprocessor|postprocessor)s?\b/i)) labelsToAdd.add('processor');
-            if (matches(/\b(eval|evaluate|evaluation|metrics?|score|benchmarks?)\b/i)) labelsToAdd.add('evaluation');
-            if (matches(/\b(tests?|pytest|unittest|failing test)\b/i)) labelsToAdd.add('tests');
-            if (matches(/\b(ci|github actions?|github workflows?|gha|docker|pypi)\b/i)) labelsToAdd.add('CI');
-            if (matches(/\b(perf|latency|throughput|fps|speed|performance|slow|fast|slower|faster|memory usage)\b/i)) labelsToAdd.add('performance');
-            if (matches(/\b(dependency|dependencies|pip|install error|importerror|package not found|pyproject)\b/i)) labelsToAdd.add('dependencies');
-            if (matches(/\b(configuration|config|arguments?|input feature|dracuss)\b/i)) labelsToAdd.add('configuration');
-
-            // Apply Labels
-            const labels = Array.from(labelsToAdd).filter(Boolean);
-
-            if (labels.length > 0) {
-              console.log(`Adding labels: ${labels.join(', ')}`);
-              await github.rest.issues.addLabels({
-                owner: context.repo.owner,
-                repo: context.repo.repo,
-                issue_number: context.issue.number,
-                labels,
-              });
-            }
@@ -28,7 +28,7 @@ on:
 # Sets up the environment variables
 env:
  UV_VERSION: "0.8.0"
-  PYTHON_VERSION: "3.12"
+  PYTHON_VERSION: "3.10"
  DOCKER_IMAGE_NAME_CPU: huggingface/lerobot-cpu:latest
  DOCKER_IMAGE_NAME_GPU: huggingface/lerobot-gpu:latest

@@ -43,7 +43,6 @@ jobs:
    name: Build CPU Docker for Nightly
    runs-on:
      group: aws-general-8-plus
-    if: github.repository == 'huggingface/lerobot'
    outputs:
      image_tag: ${{ env.DOCKER_IMAGE_NAME_CPU }}
    steps:
@@ -52,7 +51,7 @@ jobs:
          sudo apt-get update
          sudo apt-get install git-lfs
          git lfs install
-      - uses: actions/checkout@v6
+      - uses: actions/checkout@v4
        with:
          lfs: true
          persist-credentials: false
@@ -78,7 +77,6 @@ jobs:
    name: Build GPU Docker for Nightly
    runs-on:
      group: aws-general-8-plus
-    if: github.repository == 'huggingface/lerobot'
    outputs:
      image_tag: ${{ env.DOCKER_IMAGE_NAME_GPU }}
    steps:
@@ -87,7 +85,7 @@ jobs:
          sudo apt-get update
          sudo apt-get install git-lfs
          git lfs install
-      - uses: actions/checkout@v6
+      - uses: actions/checkout@v4
        with:
          lfs: true
          persist-credentials: false
@@ -119,7 +117,6 @@ jobs:
      HF_LEROBOT_HOME: /home/user_lerobot/.cache/huggingface/lerobot
      TORCH_HOME: /home/user_lerobot/.cache/torch
      TRITON_CACHE_DIR: /home/user_lerobot/.cache/triton
-      HF_USER_TOKEN: ${{ secrets.LEROBOT_HF_USER }}
    container:
      image: ${{ needs.build-docker-cpu-nightly.outputs.image_tag }} # zizmor: ignore[unpinned-images]
      options: --shm-size "16gb"
@@ -131,11 +128,6 @@ jobs:
        shell: bash
        working-directory: /lerobot
    steps:
-      - name: Login to Hugging Face
-        if: env.HF_USER_TOKEN != ''
-        run: |
-          hf auth login --token "$HF_USER_TOKEN" --add-to-git-credential
-          hf auth whoami
      - name: Run pytest on CPU
        run: pytest tests -vv --maxfail=10
      - name: Run end-to-end tests
@@ -152,7 +144,6 @@ jobs:
      HF_LEROBOT_HOME: /home/user_lerobot/.cache/huggingface/lerobot
      TORCH_HOME: /home/user_lerobot/.cache/torch
      TRITON_CACHE_DIR: /home/user_lerobot/.cache/triton
-      HF_USER_TOKEN: ${{ secrets.LEROBOT_HF_USER }}
    container:
      image: ${{ needs.build-docker-gpu-nightly.outputs.image_tag }} # zizmor: ignore[unpinned-images]
      options: --gpus all --shm-size "16gb"
@@ -164,11 +155,6 @@ jobs:
        shell: bash
        working-directory: /lerobot
    steps:
-      - name: Login to Hugging Face
-        if: env.HF_USER_TOKEN != ''
-        run: |
-          hf auth login --token "$HF_USER_TOKEN" --add-to-git-credential
-          hf auth whoami
      - name: Run pytest on GPU
        run: pytest tests -vv --maxfail=10
      - name: Run end-to-end tests
@@ -186,7 +172,6 @@ jobs:
      TORCH_HOME: /home/user_lerobot/.cache/torch
      TRITON_CACHE_DIR: /home/user_lerobot/.cache/triton
      CUDA_VISIBLE_DEVICES: "0,1,2,3"
-      HF_USER_TOKEN: ${{ secrets.LEROBOT_HF_USER }}
    container:
      image: ${{ needs.build-docker-gpu-nightly.outputs.image_tag }} # zizmor: ignore[unpinned-images]
      options: --gpus all --shm-size "16gb"
@@ -198,15 +183,12 @@ jobs:
        shell: bash
        working-directory: /lerobot
    steps:
-      - name: Login to Hugging Face
-        if: env.HF_USER_TOKEN != ''
-        run: |
-          hf auth login --token "$HF_USER_TOKEN" --add-to-git-credential
-          hf auth whoami
      - name: Verify GPU availability
        run: |
          nvidia-smi
          python -c "import torch; print(f'PyTorch CUDA available: {torch.cuda.is_available()}'); print(f'Number of GPUs: {torch.cuda.device_count()}')"

      - name: Run multi-GPU training tests
-        run: pytest -vv tests/training/
+        # TODO(Steven): Investigate why motors tests are failing in multi-GPU setup
+        run: pytest tests -vv --maxfail=10 --ignore=tests/motors/
+        timeout-minutes: 10
@@ -1,39 +0,0 @@
-# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-# This workflow labels pull requests based on the files that were changed.
-name: Pull Request Labeler
-
-on:
-  # Allows labeling pull requests when they are opened or updated
-  # zizmor: ignore[dangerous-triggers] Needed to label PRs from forks
-  pull_request_target:
-    branches:
-      - main
-    types: [opened, synchronize, reopened, ready_for_review]
-
-permissions:
-  contents: read
-  pull-requests: write
-
-jobs:
-  triage:
-    name: Label PR
-    runs-on: ubuntu-latest
-    if: github.repository == 'huggingface/lerobot' && !github.event.pull_request.draft
-    steps:
-      - uses: actions/labeler@v6
-        with:
-          repo-token: ${{ secrets.GITHUB_TOKEN }}
-          sync-labels: true # Removes labels if files are removed from the PR
@@ -43,14 +43,14 @@ jobs:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
-        uses: actions/checkout@v6
+        uses: actions/checkout@v4
        with:
          persist-credentials: false

      - name: Set up Python
-        uses: actions/setup-python@v6
+        uses: actions/setup-python@v5
        with:
-          python-version: '3.12'
+          python-version: '3.10'

      - name: Run pre-commit hooks
        uses: pre-commit/action@v3.0.1 # zizmor: ignore[unpinned-uses]
@@ -22,14 +22,13 @@ on:
 # Sets up the environment variables
 env:
  UV_VERSION: "0.8.0"
-  PYTHON_VERSION: "3.12"
+  PYTHON_VERSION: "3.10"

 jobs:
  # This job builds the Python package and publishes it to PyPI
  build-and-publish:
    name: Build and publish Python distributions
    runs-on: ubuntu-latest
-    if: github.repository == 'huggingface/lerobot'
    outputs:
      version: ${{ steps.extract_info.outputs.tag_version }}
    permissions:
@@ -38,14 +37,14 @@ jobs:

    steps:
      - name: Checkout code
-        uses: actions/checkout@v6
+        uses: actions/checkout@v4
        with:
          persist-credentials: false

      - name: Set up Python
-        uses: actions/setup-python@v6
+        uses: actions/setup-python@v5
        with:
-          python-version: '3.12'
+          python-version: '3.10'

      - name: Extract Version
        id: extract_info
@@ -83,6 +82,14 @@ jobs:
            exit 1
          fi

+      - name: Remove Tags with Git dependencies
+        # TODO(Steven): Temporary patch to remove pi from PyPi 0.4.0 release due to its reliance on git dependencies.
+        run: |
+          echo "::info:: Checking for Git dependencies to remove from pyproject.toml..."
+          grep -E '@ git\+https|lerobot\[pi\]' pyproject.toml | sed 's/^/::warning:: Removing line: /' || true
+          sed -E -i '/@ git\+https|lerobot\[pi\]/d' pyproject.toml
+          echo "::info:: Git dependencies removed. Proceeding with build."
+
      - name: Install build dependencies
        run: python -m pip install build

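Review note: the "Remove Tags with Git dependencies" step deletes every line of `pyproject.toml` that matches `@ git\+https` or `lerobot\[pi\]`. A minimal sketch of that filter on a made-up fragment follows. The file contents and the `example` URL are assumptions for illustration, not the real `pyproject.toml`. The sketch writes to stdout instead of using `-i`, so it behaves the same with GNU and BSD `sed`.

```shell
set -eu

# Hypothetical pyproject.toml fragment written to a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
[project.optional-dependencies]
pi = ["transformers @ git+https://github.com/example/transformers.git"]
test = ["pytest>=8.1.0"]
all = [
    "lerobot[pi]",
    "lerobot[test]",
]
EOF

# Same extended-regex pattern as the workflow step: drop lines that pin a
# dependency to a Git URL or that reference the pi extra.
filtered=$(sed -E '/@ git\+https|lerobot\[pi\]/d' "$tmp")
rm -f "$tmp"

printf '%s\n' "$filtered"
```

Because the filter works on whole lines, it assumes each dependency sits on its own line. A multi-line entry, or two dependencies on one line, would not be handled cleanly.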
@@ -127,7 +134,7 @@ jobs:
    env:
      MUJOCO_GL: egl
    steps:
-      - uses: actions/checkout@v6
+      - uses: actions/checkout@v4
        with:
          lfs: true
          persist-credentials: false
@@ -169,3 +176,4 @@ jobs:

 # TODO(Steven): Publish draft/pre-release and to test pypi weekly
 # TODO(Steven): Separate build and publish job
+# TODO(Steven): Tag documentation with the same version as the package
@@ -43,7 +43,7 @@ jobs:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
-        uses: actions/checkout@v6 # zizmor: ignore[unpinned-uses]
+        uses: actions/checkout@v4 # zizmor: ignore[unpinned-uses]
        with:
          fetch-depth: 0
          persist-credentials: false
@@ -45,7 +45,6 @@ jobs:
  stale:
    name: Close Stale Issues and PRs
    runs-on: ubuntu-latest
-    if: github.repository == 'huggingface/lerobot'
    permissions:
      actions: write
      contents: write # only for delete-branch option
@@ -20,8 +20,8 @@ on:
  workflow_dispatch:

  # Run on the 1st and 15th of every month at 02:00 UTC
-  # schedule:
-  #  - cron: '0 2 1,15 * *'
+  schedule:
+    - cron: '0 2 1,15 * *'

 permissions:
  contents: read
@@ -29,7 +29,7 @@ permissions:
 # Sets up the environment variables
 env:
  UV_VERSION: "0.8.0"
-  PYTHON_VERSION: "3.12"
+  PYTHON_VERSION: "3.10"
  DOCKER_IMAGE_NAME: huggingface/lerobot-gpu:unbound

 # Ensures that only the latest action is built, canceling older runs.
@@ -43,14 +43,12 @@ jobs:
  full-tests:
    name: Full Unbound Tests
    runs-on: ubuntu-latest
-    if: github.repository == 'huggingface/lerobot'
    env:
      MUJOCO_GL: egl
      HF_HOME: /mnt/cache/.cache/huggingface
      HF_LEROBOT_HOME: /mnt/cache/.cache/huggingface/lerobot
-      HF_USER_TOKEN: ${{ secrets.LEROBOT_HF_USER }}
    steps:
-      - uses: actions/checkout@v6
+      - uses: actions/checkout@v4
        with:
          lfs: true
          persist-credentials: false
@@ -79,12 +77,8 @@ jobs:
          echo "Dependencies unbound:" && cat pyproject.toml

      - name: Install lerobot with all extras
-        run: uv sync --extra all # TODO(Steven): Make flash-attn optional
-      - name: Login to Hugging Face
-        if: env.HF_USER_TOKEN != ''
-        run: |
-          uv run hf auth login --token "$HF_USER_TOKEN" --add-to-git-credential
-          uv run hf auth whoami
+        run: uv sync --all-extras --no-extra groot # TODO(Steven): Make flash-attn optional
+
      - name: Run pytest (all extras)
        run: uv run pytest tests -vv

@@ -96,7 +90,6 @@ jobs:
    name: Build and Push Docker
    runs-on:
      group: aws-general-8-plus
-    if: github.repository == 'huggingface/lerobot'
    outputs:
      image_tag: ${{ env.DOCKER_IMAGE_NAME }}
    env:
@@ -107,7 +100,7 @@ jobs:
          sudo apt-get update
          sudo apt-get install git-lfs
          git lfs install
-      - uses: actions/checkout@v6
+      - uses: actions/checkout@v4
        with:
          lfs: true
          persist-credentials: false
@@ -142,7 +135,6 @@ jobs:
      HF_LEROBOT_HOME: /home/user_lerobot/.cache/huggingface/lerobot
      TORCH_HOME: /home/user_lerobot/.cache/torch
      TRITON_CACHE_DIR: /home/user_lerobot/.cache/triton
-      HF_USER_TOKEN: ${{ secrets.LEROBOT_HF_USER }}
    container:
      image: ${{ needs.build-and-push-docker.outputs.image_tag }} # zizmor: ignore[unpinned-images]
      options: --gpus all --shm-size "16gb"
@@ -154,11 +146,6 @@ jobs:
        shell: bash
        working-directory: /lerobot
    steps:
-      - name: Login to Hugging Face
-        if: env.HF_USER_TOKEN != ''
-        run: |
-          hf auth login --token "$HF_USER_TOKEN" --add-to-git-credential
-          hf auth whoami
      - name: Run pytest on GPU
        run: pytest tests -vv
      - name: Run end-to-end tests
@@ -174,19 +161,15 @@ jobs:
    steps:
      - name: Get Docker Hub Token and Delete Image
        # zizmor: ignore[template-injection]
-        env:
-          DOCKERHUB_LEROBOT_USERNAME: ${{ secrets.DOCKERHUB_LEROBOT_USERNAME }}
-          DOCKERHUB_LEROBOT_PASSWORD: ${{ secrets.DOCKERHUB_LEROBOT_PASSWORD }}
-          IMAGE_FULL: ${{ needs.build-and-push-docker.outputs.image_tag }}
        run: |
-          IMAGE_NAME=$(echo "$IMAGE_FULL" | cut -d':' -f1)
-          IMAGE_TAG=$(echo "$IMAGE_FULL" | cut -d':' -f2)
+          IMAGE_NAME=$(echo "${{ needs.build-and-push-docker.outputs.image_tag }}" | cut -d':' -f1)
+          IMAGE_TAG=$(echo "${{ needs.build-and-push-docker.outputs.image_tag }}" | cut -d':' -f2)

          echo "Attempting to delete image: $IMAGE_NAME:$IMAGE_TAG"

          TOKEN=$(curl -s -H "Content-Type: application/json" \
                       -X POST \
-                       -d "{\"username\": \"$DOCKERHUB_LEROBOT_USERNAME\", \"password\": \"$DOCKERHUB_LEROBOT_PASSWORD\"}" \
+                       -d '{"username": "${{ secrets.DOCKERHUB_LEROBOT_USERNAME }}", "password": "${{ secrets.DOCKERHUB_LEROBOT_PASSWORD }}"}' \
                       https://hub.docker.com/v2/users/login/ | jq -r .token)

          if [ "$TOKEN" == "null" ] || [ -z "$TOKEN" ]; then
@@ -197,7 +180,7 @@ jobs:
          HTTP_RESPONSE=$(curl -s -o /dev/null -w "%{http_code}" \
                               -H "Authorization: JWT ${TOKEN}" \
                               -X DELETE \
-                               https://hub.docker.com/v2/repositories/${IMAGE_NAME}/tags/$IMAGE_TAG)
+                               https://hub.docker.com/v2/repositories/${IMAGE_NAME}/tags/${IMAGE_TAG}/)

          if [ "$HTTP_RESPONSE" -eq 204 ]; then
            echo "Successfully deleted Docker image tag: $IMAGE_NAME:$IMAGE_TAG"
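
Reviewer note: the cleanup step above does three pieces of string handling in shell: splitting the image reference on `:`, assembling a JSON login body, and building the tag URL for the `DELETE` request. The sketch below mirrors that handling in Python without making any network call. The function names are hypothetical, and it assumes the simple `name:tag` form that `DOCKER_IMAGE_NAME` uses in this workflow. Serializing the body with `json.dumps` is a swapped-in technique, not what the workflow does; it is shown because it escapes quotes and backslashes in the credentials, which plain string interpolation does not.

```python
import json

HUB_API = "https://hub.docker.com/v2"  # base URL taken from the workflow step


def split_image_reference(image_full: str) -> tuple[str, str]:
    """Split 'name:tag' into (name, tag), like the two `cut -d':'` calls."""
    name, _, tag = image_full.partition(":")
    return name, tag


def login_payload(username: str, password: str) -> str:
    """Serialize the login body; json.dumps escapes quotes and backslashes."""
    return json.dumps({"username": username, "password": password})


def tag_url(image_name: str, image_tag: str) -> str:
    """Build the tag URL used by the DELETE request, with the trailing slash."""
    return f"{HUB_API}/repositories/{image_name}/tags/{image_tag}/"


if __name__ == "__main__":
    name, tag = split_image_reference("huggingface/lerobot-gpu:unbound")
    print(tag_url(name, tag))
    # Placeholder credentials, for illustration only.
    print(login_payload("example-user", 'pa"ss'))
```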
@@ -13,7 +13,7 @@
 # limitations under the License.

 default_language_version:
-    python: python3.12
+    python: python3.10

 exclude: "tests/artifacts/.*\\.safetensors$"

@@ -55,7 +55,7 @@ repos:
    rev: v3.21.0
    hooks:
    -   id: pyupgrade
-        args: [--py312-plus]
+        args: [--py310-plus]

  ##### Markdown Quality #####
  - repo: https://github.com/rbubley/mirrors-prettier
@@ -87,7 +87,7 @@ repos:
  # TODO(Steven): Uncomment when ready to use
  ##### Static Analysis & Typing #####
  - repo: https://github.com/pre-commit/mirrors-mypy
-    rev: v1.19.1
+    rev: v1.18.2
    hooks:
      - id: mypy
        args: [--config-file=pyproject.toml]
@@ -1,25 +0,0 @@
-# AI Usage Policy
-
-The LeRobot project welcomes contributions from everyone, and we have a few guidelines regarding AI usage to ensure high code quality, clear communication, and a healthy open-source ecosystem:
-
- **Please disclose significant AI assistance.** If you used AI tools (e.g., Copilot, Claude, Cursor, ChatGPT) to generate a substantial portion of your code or text, let us know in your PR description. Transparency helps us review your changes more effectively.
- **Own your code (The Human-in-the-Loop).** You must fully understand all the changes you are proposing. If you cannot explain what your AI-assisted code does or how it interacts with LeRobot's broader architecture, please take the time to learn and test it before submitting.
- **Keep issues and discussions focused.** You are welcome to use AI to help draft issues or PR descriptions, but please review and edit them carefully before posting. AI can often be overly verbose; trimming the noise and getting straight to the point helps our maintainers address your needs faster.
-
-Our core maintainers also use AI tools to aid their workflows, but they do so while bringing deep contextual knowledge of the LeRobot codebase to validate the output. We ask all contributors to apply that same level of rigor.
-
-## Remember the Human Maintainers
-
-Please remember that LeRobot is maintained by a dedicated team of humans.
-
-Every discussion, issue, and pull request is read and reviewed by real people. While AI tools can generate thousands of lines of code in seconds, reviewing that code still takes human time and energy. Submitting unverified or low-effort AI output puts an unfair burden on our maintainers.
-
-Today, the quality of the AI output still heavily depends on the developer driving the tool. We ask that you respect our maintainers' time by thoroughly vetting, testing, and refining your submissions.
-
-## AI is Welcome Here
-
-LeRobot operates at the cutting edge of AI and robotics, and many of our maintainers actively embrace AI coding assistants as valuable productivity tools. We are a pro-AI project!
-
-Our reason for having an AI policy is not an anti-AI stance. Rather, it exists to ensure that AI is used to enhance human contributions, not replace them with unverified noise. It's about how the tools are used, not the tools themselves.
-
-We value the unique human insight you bring to the LeRobot community. Let AI empower your workflow, but always let your own judgment take the wheel.
@@ -52,7 +52,7 @@ decisions when appropriate.

 This Code of Conduct applies within all community spaces, and also applies when
 an individual is officially representing the community in public spaces.
-Examples of representing our community include using an official e-mail address,
+Examples of representing our community include using an official email address,
 posting via an official social media account, or acting as an appointed
 representative at an online or offline event.

@@ -60,7 +60,7 @@ representative at an online or offline event.

 Instances of abusive, harassing, or otherwise unacceptable behavior may be
 reported to the community leaders responsible for enforcement at
-feedback@huggingface.co.
+[feedback@huggingface.co](mailto:feedback@huggingface.co).
 All complaints will be reviewed and investigated promptly and fairly.

 All community leaders are obligated to respect the privacy and security of the
@@ -1,83 +1,323 @@
-# How to contribute to 🤗 LeRobot
+# How to contribute to 🤗 LeRobot?

-Everyone is welcome to contribute, and we value everybody's contribution. Code is not the only way to help the community. Answering questions, helping others, reaching out, and improving the documentation are immensely valuable.
+Everyone is welcome to contribute, and we value everybody's contribution. Code
+is thus not the only way to help the community. Answering questions, helping
+others, reaching out and improving the documentation are immensely valuable to
+the community.

-Whichever way you choose to contribute, please be mindful to respect our [code of conduct](https://github.com/huggingface/lerobot/blob/main/CODE_OF_CONDUCT.md) and our [AI policy](https://github.com/huggingface/lerobot/blob/main/AI_POLICY.md).
+It also helps us if you spread the word: reference the library from blog posts
+on the awesome projects it made possible, shout out on Twitter when it has
+helped you, or simply ⭐️ the repo to say "thank you".

-## Ways to Contribute
+Whichever way you choose to contribute, please be mindful to respect our
+[code of conduct](https://github.com/huggingface/lerobot/blob/main/CODE_OF_CONDUCT.md).

-You can contribute in many ways:
+## You can contribute in so many ways!

- **Fixing issues:** Resolve bugs or improve existing code.
- **New features:** Develop new features.
- **Extend:** Implement new models/policies, robots, or simulation environments and upload datasets to the Hugging Face Hub.
- **Documentation:** Improve examples, guides, and docstrings.
- **Feedback:** Submit tickets related to bugs or desired new features.
+Some of the ways you can contribute to 🤗 LeRobot:

-If you are unsure where to start, join our [Discord Channel](https://discord.gg/q8Dzzpym3f).
+- Fixing outstanding issues with the existing code.
+- Implementing new models, datasets or simulation environments.
+- Contributing to the examples or to the documentation.
+- Submitting issues related to bugs or desired new features.

-## Development Setup
+Following the guides below, feel free to open issues and PRs and to coordinate your efforts with the community on our [Discord Channel](https://discord.gg/VjFz58wn3R). For specific inquiries, reach out to [Remi Cadene](mailto:remi.cadene@huggingface.co).

-To contribute code, you need to set up a development environment.
+If you are not sure how to contribute or want to know the next features we are working on, look at this project page: [LeRobot TODO](https://github.com/orgs/huggingface/projects/46)

-### 1. Fork and Clone
+## Submitting a new issue or feature request

-Fork the repository on GitHub, then clone your fork:
-
-```bash
-git clone https://github.com/<your-handle>/lerobot.git
-cd lerobot
-git remote add upstream https://github.com/huggingface/lerobot.git
-```
-
-### 2. Environment Installation
-
-Please follow our [Installation Guide](https://huggingface.co/docs/lerobot/installation) for the environment setup & installation from source.
-
-## Running Tests & Quality Checks
-
-### Code Style (Pre-commit)
-
-Install `pre-commit` hooks to run checks automatically before you commit:
-
-```bash
-pre-commit install
-```
-
-To run checks manually on all files:
-
-```bash
-pre-commit run --all-files
-```
-
-### Running Tests
-
-We use `pytest`. First, ensure you have test artifacts by installing **git-lfs**:
+Do your best to follow these guidelines when submitting an issue or a feature
+request. It will make it easier for us to come back to you quickly and with good
+feedback.
+
+### Did you find a bug?
+
+The 🤗 LeRobot library is robust and reliable thanks to the users who notify us of
+the problems they encounter. So thank you for reporting an issue.
+
+First, we would really appreciate it if you could **make sure the bug was not
+already reported** (use the search bar on GitHub under Issues).
+
+Did not find it? :( So we can act quickly on it, please follow these steps:
+
+- Include your **OS type and version**, the versions of **Python** and **PyTorch**.
+- A short, self-contained code snippet that allows us to reproduce the bug in
+  less than 30s.
+- The full traceback if an exception is raised.
+- Attach any other additional information, like screenshots, you think may help.
+
+### Do you want a new feature?
+
+A good feature request addresses the following points:
+
+1. Motivation first:
+
+- Is it related to a problem/frustration with the library? If so, please explain
+  why. Providing a code snippet that demonstrates the problem is best.
+- Is it related to something you would need for a project? We'd love to hear
+  about it!
+- Is it something you worked on and think could benefit the community?
+  Awesome! Tell us what problem it solved for you.
+
+2. Write a _paragraph_ describing the feature.
+3. Provide a **code snippet** that demonstrates its future use.
+4. In case this is related to a paper, please attach a link.
+5. Attach any additional information (drawings, screenshots, etc.) you think may help.
+
+If your issue is well written we're already 80% of the way there by the time you
+post it.
+
+## Adding new policies, datasets or environments
+
+Look at our implementations for [datasets](./src/lerobot/datasets/), [policies](./src/lerobot/policies/),
+environments ([aloha](https://github.com/huggingface/gym-aloha),
+[pusht](https://github.com/huggingface/gym-pusht))
+and follow the same API design.
+
+When implementing a new dataset loadable with LeRobotDataset, follow these steps:
+
+- Update `available_datasets_per_env` in `lerobot/__init__.py`
+
+When implementing a new environment (e.g. `gym_aloha`), follow these steps:
+
+- Update `available_tasks_per_env` and `available_datasets_per_env` in `lerobot/__init__.py`
+
+When implementing a new policy class (e.g. `DiffusionPolicy`), follow these steps:
+
+- Update `available_policies` and `available_policies_per_env` in `lerobot/__init__.py`
+- Set the required `name` class attribute.
+- Update variables in `tests/test_available.py` by importing your new Policy class
+
+## Submitting a pull request (PR)
+
+Before writing code, we strongly advise you to search through the existing PRs or
+issues to make sure that nobody is already working on the same thing. If you are
+unsure, it is always a good idea to open an issue to get some feedback.
+
+You will need basic `git` proficiency to be able to contribute to
+🤗 LeRobot. `git` is not the easiest tool to use but it has the greatest
+manual. Type `git --help` in a shell and enjoy. If you prefer books, [Pro
+Git](https://git-scm.com/book/en/v2) is a very good reference.
+
+Follow these steps to start contributing:
+
+1. Fork the [repository](https://github.com/huggingface/lerobot) by
+   clicking on the 'Fork' button on the repository's page. This creates a copy of the code
+   under your GitHub user account.
+
+2. Clone your fork to your local disk, and add the base repository as a remote. The following command
+   assumes you have your public SSH key uploaded to GitHub. See the following guide for more
+   [information](https://docs.github.com/en/repositories/creating-and-managing-repositories/cloning-a-repository).
+
+   ```bash
+   git clone git@github.com:<your Github handle>/lerobot.git
+   cd lerobot
+   git remote add upstream https://github.com/huggingface/lerobot.git
+   ```
+
+3. Create a new branch to hold your development changes, and do this for every new PR you work on.
+
+   Start by synchronizing your `main` branch with the `upstream/main` branch (more details in the [GitHub Docs](https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/syncing-a-fork)):
+
+   ```bash
+   git checkout main
+   git fetch upstream
+   git rebase upstream/main
+   ```
+
+   Once your `main` branch is synchronized, create a new branch from it:
+
+   ```bash
+   git checkout -b a-descriptive-name-for-my-changes
+   ```
+
+   🚨 **Do not** work on the `main` branch.
+
+4. For development, we advise using a tool like `poetry` or `uv` instead of just `pip` to easily track our dependencies.
+   Follow the instructions to [install poetry](https://python-poetry.org/docs/#installation) (use a version >=2.1.0) or to [install uv](https://docs.astral.sh/uv/getting-started/installation/#installation-methods) if you don't have one of them already.
+
+   Set up a development environment with conda:
+
+   ```bash
+   conda create -y -n lerobot-dev python=3.10 && conda activate lerobot-dev
+   ```
+
+   If you're using `uv`, it can manage python versions so you can instead do:
+
+   ```bash
+   uv venv --python 3.10 && source .venv/bin/activate
+   ```
+
+   To develop on 🤗 LeRobot, you will need to install at least the `dev` and `test` extras along with the core library:
+
+   using `poetry`
+
+   ```bash
+   poetry sync --extras "dev test"
+   ```
+
+   using `uv`
+
+   ```bash
+   uv sync --extra dev --extra test
+   ```
+
+   You can also install the project with all its dependencies (including environments):
+
+   using `poetry`
+
+   ```bash
+   poetry sync --all-extras
+   ```
+
+   using `uv`
+
+   ```bash
+   uv sync --all-extras
+   ```
+
+   > **Note:** If you don't install simulation environments with `--all-extras`, the tests that require them will be skipped when running the pytest suite locally. However, they _will_ be tested in the CI. In general, we advise you to install everything and test locally before pushing.
+
+   Whichever command you chose to install the project (e.g. `poetry sync --all-extras`), you should run it again when pulling code with an updated version of `pyproject.toml` and `poetry.lock` in order to synchronize your virtual environment with the new dependencies.
+
+   The equivalent of `pip install some-package` would just be:
+
+   using `poetry`
+
+   ```bash
+   poetry add some-package
+   ```
+
+   using `uv`
+
+   ```bash
+   uv add some-package
+   ```
+
+   When making changes to the poetry sections of the `pyproject.toml`, you should run the following command to lock dependencies.
+   using `poetry`
+
+   ```bash
+   poetry lock
+   ```
+
+   using `uv`
+
+   ```bash
+   uv lock
+   ```
+
+5. Develop the features on your branch.
+
+   As you work on the features, you should make sure that the test suite
+   passes. You should run the tests impacted by your changes like this (see
+   below an explanation regarding the environment variable):
+
+   ```bash
+   pytest tests/<TEST_TO_RUN>.py
+   ```
+
+6. Follow our style.
+
+   `lerobot` relies on `ruff` to format its source code
+   consistently. Set up [`pre-commit`](https://pre-commit.com/) to run these checks
+   automatically as Git commit hooks.
+
+   Install `pre-commit` hooks:
+
+   ```bash
+   pre-commit install
+   ```
+
+   You can run these hooks whenever you need on staged files with:
+
+   ```bash
+   pre-commit
+   ```
+
+   Once you're happy with your changes, add changed files using `git add` and
+   make a commit with `git commit` to record your changes locally:
+
+   ```bash
+   git add modified_file.py
+   git commit
+   ```
+
+   Note: if you have already committed changes with incorrect formatting, you can run:
+
+   ```bash
+   pre-commit run --all-files
+   ```
+
+   Please write [good commit messages](https://chris.beams.io/posts/git-commit/).
+
+   It is a good idea to sync your copy of the code with the original
+   repository regularly. This way you can quickly account for changes:
+
+   ```bash
+   git fetch upstream
+   git rebase upstream/main
+   ```
+
+   Push the changes to your account using:
+
+   ```bash
+   git push -u origin a-descriptive-name-for-my-changes
+   ```
+
+7. Once you are satisfied (**and the checklist below is happy too**), go to the
+   webpage of your fork on GitHub. Click on 'Pull request' to send your changes
+   to the project maintainers for review.
+
+8. It's ok if maintainers ask you for changes. It happens to core contributors
+   too! So everyone can see the changes in the Pull request, work in your local
+   branch and push the changes to your fork. They will automatically appear in
+   the pull request.
+
+### Checklist
+
+1. The title of your pull request should be a summary of its contribution;
+2. If your pull request addresses an issue, please mention the issue number in
+   the pull request description to make sure they are linked (and people
+   consulting the issue know you are working on it);
+3. To indicate a work in progress please prefix the title with `[WIP]`, or preferably mark
+   the PR as a draft PR. These are useful to avoid duplicated work, and to differentiate
+   it from PRs ready to be merged;
+4. Make sure existing tests pass;
+
+### Tests
+
+An extensive test suite is included to test the library behavior and several examples. Library tests can be found in the [tests folder](https://github.com/huggingface/lerobot/tree/main/tests).
+
+Install [git lfs](https://git-lfs.com/) to retrieve test artifacts (if you don't have it already).
+
+On Mac:

 ```bash
+brew install git-lfs
 git lfs install
+```
+
+On Ubuntu:
+
+```bash
+sudo apt-get install git-lfs
+git lfs install
+```
+
+Pull artifacts if they're not in [tests/artifacts](tests/artifacts):
+
+```bash
 git lfs pull
 ```

-Run the full suite (this may require extras installed):
+We use `pytest` in order to run the tests. From the root of the
+repository, here's how to run tests with `pytest` for the library:

 ```bash
-pytest -sv ./tests
+python -m pytest -sv ./tests
 ```

-Or run a specific test file during development:
-
-```bash
-pytest -sv tests/test_specific_feature.py
-```
-
-## Submitting Issues & Pull Requests
-
-Use the templates for required fields and examples.
-
- **Issues:** Follow the [ticket template](https://github.com/huggingface/lerobot/blob/main/.github/ISSUE_TEMPLATE/bug-report.yml).
- **Pull requests:** Rebase on `upstream/main`, use a descriptive branch (don't work on `main`), run `pre-commit` and tests locally, and follow the [PR template](https://github.com/huggingface/lerobot/blob/main/.github/PULL_REQUEST_TEMPLATE.md).
-
-One member of the LeRobot team will then review your contribution.
-
-Thank you for contributing to LeRobot!
+You can specify a smaller set of tests in order to test only the feature
+you're working on.
@@ -1,3 +1,2 @@
 include src/lerobot/templates/lerobot_modelcard_template.md
 include src/lerobot/datasets/card_template.md
-include src/lerobot/envs/metaworld_config.json
@@ -1,5 +1,7 @@
 <p align="center">
-  <img alt="LeRobot, Hugging Face Robotics Library" src="./media/readme/lerobot-logo-thumbnail.png" width="100%">
+  <img alt="LeRobot, Hugging Face Robotics Library" src="https://raw.githubusercontent.com/huggingface/lerobot/main/media/lerobot-logo-thumbnail.png" width="100%">
+  <br/>
+  <br/>
 </p>

 <div align="center">
@@ -10,132 +12,323 @@
 [![Status](https://img.shields.io/pypi/status/lerobot)](https://pypi.org/project/lerobot/)
 [![Version](https://img.shields.io/pypi/v/lerobot)](https://pypi.org/project/lerobot/)
 [![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-v2.1-ff69b4.svg)](https://github.com/huggingface/lerobot/blob/main/CODE_OF_CONDUCT.md)
-[![Discord](https://img.shields.io/badge/Discord-Join_Us-5865F2?style=flat&logo=discord&logoColor=white)](https://discord.gg/q8Dzzpym3f)
+[![Discord](https://dcbadge.vercel.app/api/server/C5P34WJ68S?style=flat)](https://discord.gg/s3KuuzsPFb)
+
+<!-- [![Coverage](https://codecov.io/gh/huggingface/lerobot/branch/main/graph/badge.svg?token=TODO)](https://codecov.io/gh/huggingface/lerobot) -->

 </div>

-**LeRobot** aims to provide models, datasets, and tools for real-world robotics in PyTorch. The goal is to lower the barrier to entry so that everyone can contribute to and benefit from shared datasets and pretrained models.
+<h2 align="center">
+    <p><a href="https://huggingface.co/docs/lerobot/hope_jr">
+        Build Your Own HopeJR Robot!</a></p>
+</h2>

-🤗 A hardware-agnostic, Python-native interface that standardizes control across diverse platforms, from low-cost arms (SO-100) to humanoids.
+<div align="center">
+  <img
+    src="https://raw.githubusercontent.com/huggingface/lerobot/main/media/hope_jr/hopejr.png"
+    alt="HopeJR robot"
+    title="HopeJR robot"
+    width="60%"
+  />

-🤗 A standardized, scalable LeRobotDataset format (Parquet + MP4 or images) hosted on the Hugging Face Hub, enabling efficient storage, streaming and visualization of massive robotic datasets.
+  <p><strong>Meet HopeJR – A humanoid robot arm and hand for dexterous manipulation!</strong></p>
+  <p>Control it with exoskeletons and gloves for precise hand movements.</p>
+  <p>Perfect for advanced manipulation tasks! 🤖</p>

-🤗 State-of-the-art policies that have been shown to transfer to the real-world ready for training and deployment.
+  <p><a href="https://huggingface.co/docs/lerobot/hope_jr">
+      See the full HopeJR tutorial here.</a></p>
+</div>

-🤗 Comprehensive support for the open-source ecosystem to democratize physical AI.
+<br/>

-## Quick Start
+<h2 align="center">
+    <p><a href="https://huggingface.co/docs/lerobot/so101">
+        Build Your Own SO-101 Robot!</a></p>
+</h2>

-LeRobot can be installed directly from PyPI.
+<div align="center">
+  <table>
+    <tr>
+      <td align="center"><img src="https://raw.githubusercontent.com/huggingface/lerobot/main/media/so101/so101.webp" alt="SO-101 follower arm" title="SO-101 follower arm" width="90%"/></td>
+      <td align="center"><img src="https://raw.githubusercontent.com/huggingface/lerobot/main/media/so101/so101-leader.webp" alt="SO-101 leader arm" title="SO-101 leader arm" width="90%"/></td>
+    </tr>
+  </table>
+
+  <p><strong>Meet the updated SO100, the SO-101 – Just €114 per arm!</strong></p>
+  <p>Train it in minutes with a few simple moves on your laptop.</p>
+  <p>Then sit back and watch your creation act autonomously! 🤯</p>
+
+  <p><a href="https://huggingface.co/docs/lerobot/so101">
+      See the full SO-101 tutorial here.</a></p>
+
+  <p>Want to take it to the next level? Make your SO-101 mobile by building LeKiwi!</p>
+  <p>Check out the <a href="https://huggingface.co/docs/lerobot/lekiwi">LeKiwi tutorial</a> and bring your robot to life on wheels.</p>
+
+  <img src="https://raw.githubusercontent.com/huggingface/lerobot/main/media/lekiwi/kiwi.webp" alt="LeKiwi mobile robot" title="LeKiwi mobile robot" width="50%">
+</div>
+
+<br/>
+
+<h3 align="center">
+    <p>LeRobot: State-of-the-art AI for real-world robotics</p>
+</h3>
+
+---
+
+🤗 LeRobot aims to provide models, datasets, and tools for real-world robotics in PyTorch. The goal is to lower the barrier to entry to robotics so that everyone can contribute and benefit from sharing datasets and pretrained models.
+
+🤗 LeRobot contains state-of-the-art approaches that have been shown to transfer to the real world, with a focus on imitation learning and reinforcement learning.
+
+🤗 LeRobot already provides a set of pretrained models, datasets with human-collected demonstrations, and simulation environments to get started without assembling a robot. In the coming weeks, the plan is to add more and more support for real-world robotics on the most affordable and capable robots out there.
+
+🤗 LeRobot hosts pretrained models and datasets on this Hugging Face community page: [huggingface.co/lerobot](https://huggingface.co/lerobot)
+
+#### Examples of pretrained models on simulation environments
+
+<table>
+  <tr>
+    <td><img src="https://raw.githubusercontent.com/huggingface/lerobot/main/media/gym/aloha_act.gif" width="100%" alt="ACT policy on ALOHA env"/></td>
+    <td><img src="https://raw.githubusercontent.com/huggingface/lerobot/main/media/gym/simxarm_tdmpc.gif" width="100%" alt="TDMPC policy on SimXArm env"/></td>
+    <td><img src="https://raw.githubusercontent.com/huggingface/lerobot/main/media/gym/pusht_diffusion.gif" width="100%" alt="Diffusion policy on PushT env"/></td>
+  </tr>
+  <tr>
+    <td align="center">ACT policy on ALOHA env</td>
+    <td align="center">TDMPC policy on SimXArm env</td>
+    <td align="center">Diffusion policy on PushT env</td>
+  </tr>
+</table>
+
+## Installation
+
+LeRobot works with Python 3.10+ and PyTorch 2.2+.
+
+### Environment Setup
+
+Create a virtual environment with Python 3.10 and activate it, e.g. with [`miniforge`](https://conda-forge.org/download/):
+
+```bash
+conda create -y -n lerobot python=3.10
+conda activate lerobot
+```
+
+When using `conda`, install `ffmpeg` in your environment:
+
+```bash
+conda install ffmpeg -c conda-forge
+```
+
+> **NOTE:** This usually installs `ffmpeg 7.X` for your platform compiled with the `libsvtav1` encoder. If `libsvtav1` is not supported (check supported encoders with `ffmpeg -encoders`), you can:
+>
+> - _[On any platform]_ Explicitly install `ffmpeg 7.X` using:
+>
+> ```bash
+> conda install ffmpeg=7.1.1 -c conda-forge
+> ```
+>
+> - _[On Linux only]_ Install [ffmpeg build dependencies](https://trac.ffmpeg.org/wiki/CompilationGuide/Ubuntu#GettheDependencies) and [compile ffmpeg from source with libsvtav1](https://trac.ffmpeg.org/wiki/CompilationGuide/Ubuntu#libsvtav1), then make sure the ffmpeg binary you use is the one from that install (check with `which ffmpeg`).
+
+### Install LeRobot 🤗
+
+#### From Source
+
+First, clone the repository and navigate into the directory:
+
+```bash
+git clone https://github.com/huggingface/lerobot.git
+cd lerobot
+```
+
+Then, install the library in editable mode. This is useful if you plan to contribute to the code.
+
+```bash
+pip install -e .
+```
+
+> **NOTE:** If you encounter build errors, you may need to install additional dependencies (`cmake`, `build-essential`, and `ffmpeg libs`). On Linux, run:
+> `sudo apt-get install cmake build-essential python3-dev pkg-config libavformat-dev libavcodec-dev libavdevice-dev libavutil-dev libswscale-dev libswresample-dev libavfilter-dev`. For other systems, see: [Compiling PyAV](https://pyav.org/docs/develop/overview/installation.html#bring-your-own-ffmpeg)
+
+For simulations, 🤗 LeRobot comes with gymnasium environments that can be installed as extras:
+
+- [aloha](https://github.com/huggingface/gym-aloha)
+- [xarm](https://github.com/huggingface/gym-xarm)
+- [pusht](https://github.com/huggingface/gym-pusht)
+
+For instance, to install 🤗 LeRobot with aloha and pusht, use:
+
+```bash
+pip install -e ".[aloha, pusht]"
+```
+
+### Installation from PyPI
+
+**Core Library:**
+Install the base package with:

 ```bash
 pip install lerobot
-lerobot-info
 ```

-> [!IMPORTANT]
-> For detailed installation guide, please see the [Installation Documentation](https://huggingface.co/docs/lerobot/installation).
+_This installs only the default dependencies._

-## Robots & Control
-
-<div align="center">
-  <img src="./media/readme/robots_control_video.webp" width="640px" alt="Reachy 2 Demo">
-</div>
-
-LeRobot provides a unified `Robot` class interface that decouples control logic from hardware specifics. It supports a wide range of robots and teleoperation devices.
-
-```python
-from lerobot.robots.myrobot import MyRobot
-
-# Connect to a robot
-robot = MyRobot(config=...)
-robot.connect()
-
-# Read observation and send action
-obs = robot.get_observation()
-action = model.select_action(obs)
-robot.send_action(action)
-```
-
-**Supported Hardware:** SO100, LeKiwi, Koch, HopeJR, OMX, EarthRover, Reachy2, Gamepads, Keyboards, Phones, OpenARM, Unitree G1.
-
-While these devices are natively integrated into the LeRobot codebase, the library is designed to be extensible. You can easily implement the Robot interface to utilize LeRobot's data collection, training, and visualization tools for your own custom robot.
-
-For detailed hardware setup guides, see the [Hardware Documentation](https://huggingface.co/docs/lerobot/integrate_hardware).
-
-## LeRobot Dataset
-
-To solve the data fragmentation problem in robotics, we utilize the **LeRobotDataset** format.
-
- **Structure:** Synchronized MP4 videos (or images) for vision and Parquet files for state/action data.
- **HF Hub Integration:** Explore thousands of robotics datasets on the [Hugging Face Hub](https://huggingface.co/lerobot).
- **Tools:** Seamlessly delete episodes, split by indices/fractions, add/remove features, and merge multiple datasets.
-
-```python
-from lerobot.datasets.lerobot_dataset import LeRobotDataset
-
-# Load a dataset from the Hub
-dataset = LeRobotDataset("lerobot/aloha_mobile_cabinet")
-
-# Access data (automatically handles video decoding)
-episode_index=0
-print(f"{dataset[episode_index]['action'].shape=}\n")
-```
-
-Learn more about it in the [LeRobotDataset Documentation](https://huggingface.co/docs/lerobot/lerobot-dataset-v3)
-
-## SoTA Models
-
-LeRobot implements state-of-the-art policies in pure PyTorch, covering Imitation Learning, Reinforcement Learning, and Vision-Language-Action (VLA) models, with more coming soon. It also provides you with the tools to instrument and inspect your training process.
-
-<p align="center">
-  <img alt="Gr00t Architecture" src="./media/readme/VLA_architecture.jpg" width="640px">
-</p>
-
-Training a policy is as simple as running a script configuration:
+**Extra Features:**
+To install additional functionality, use one of the following:

 ```bash
-lerobot-train \
-  --policy=act \
-  --dataset.repo_id=lerobot/aloha_mobile_cabinet
+pip install 'lerobot[all]'          # All available features
+pip install 'lerobot[aloha,pusht]'  # Specific features (Aloha & Pusht)
+pip install 'lerobot[feetech]'      # Feetech motor support
 ```

-| Category                   | Models                                                                                                                                                                                                       |
-| -------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
-| **Imitation Learning**     | [ACT](./docs/source/policy_act_README.md), [Diffusion](./docs/source/policy_diffusion_README.md), [VQ-BeT](./docs/source/policy_vqbet_README.md)                                                             |
-| **Reinforcement Learning** | [HIL-SERL](./docs/source/hilserl.mdx), [TDMPC](./docs/source/policy_tdmpc_README.md) & QC-FQL (coming soon)                                                                                                  |
-| **VLAs Models**            | [Pi0Fast](./docs/source/pi0fast.mdx), [Pi0.5](./docs/source/pi05.mdx), [GR00T N1.5](./docs/source/policy_groot_README.md), [SmolVLA](./docs/source/policy_smolvla_README.md), [XVLA](./docs/source/xvla.mdx) |
+_Replace `[...]` with your desired features._

-Similarly to the hardware, you can easily implement your own policy & leverage LeRobot's data collection, training, and visualization tools, and share your model to the HF Hub
+**Available Tags:**
+For a full list of optional dependencies, see:
+https://pypi.org/project/lerobot/

-For detailed policy setup guides, see the [Policy Documentation](https://huggingface.co/docs/lerobot/bring_your_own_policies).
+> [!NOTE]
+> For lerobot 0.4.0, installing the `pi` tag requires installing from source: `pip install "lerobot[pi]@git+https://github.com/huggingface/lerobot.git"`.
+>
+> This will be fixed in the next patch release.

-## Inference & Evaluation
+### Weights & Biases

-Evaluate your policies in simulation or on real hardware using the unified evaluation script. LeRobot supports standard benchmarks like **LIBERO**, **MetaWorld** and more to come.
+To use [Weights and Biases](https://docs.wandb.ai/quickstart) for experiment tracking, log in with

 ```bash
-# Evaluate a policy on the LIBERO benchmark
-lerobot-eval \
-  --policy.path=lerobot/pi0_libero_finetuned \
-  --env.type=libero \
-  --env.task=libero_object \
-  --eval.n_episodes=10
+wandb login
 ```

-Learn how to implement your own simulation environment or benchmark and distribute it from the HF Hub by following the [EnvHub Documentation](https://huggingface.co/docs/lerobot/envhub)
+Note: you will also need to enable WandB in the configuration (see below).

-## Resources
+### Visualize datasets

- **[Documentation](https://huggingface.co/docs/lerobot/index):** The complete guide to tutorials & API.
- **[Chinese Tutorials: LeRobot+SO-ARM101中文教程-同济子豪兄](https://zihao-ai.feishu.cn/wiki/space/7589642043471924447)** Detailed doc for assembling, teleoperate, dataset, train, deploy. Verified by Seed Studio and 5 global hackathon players.
- **[Discord](https://discord.gg/q8Dzzpym3f):** Join the `LeRobot` server to discuss with the community.
- **[X](https://x.com/LeRobotHF):** Follow us on X to stay up-to-date with the latest developments.
- **[Robot Learning Tutorial](https://huggingface.co/spaces/lerobot/robot-learning-tutorial):** A free, hands-on course to learn robot learning using LeRobot.
+Check out [example 1](https://github.com/huggingface/lerobot/blob/main/examples/dataset/load_lerobot_dataset.py), which illustrates how to use our dataset class to automatically download data from the Hugging Face Hub.
+
+You can also locally visualize episodes from a dataset on the hub by executing our script from the command line:
+
+```bash
+lerobot-dataset-viz \
+    --repo-id lerobot/pusht \
+    --episode-index 0
+```
+
+or from a dataset in a local folder with the `--root` option and `--mode local` (in the following case, the dataset will be searched for in `./my_local_data_dir/lerobot/pusht`):
+
+```bash
+lerobot-dataset-viz \
+    --repo-id lerobot/pusht \
+    --root ./my_local_data_dir \
+    --mode local \
+    --episode-index 0
+```
+
+It will open `rerun.io` and display the camera streams, robot states and actions, like this:
+
+https://github-production-user-asset-6210df.s3.amazonaws.com/4681518/328035972-fd46b787-b532-47e2-bb6f-fd536a55a7ed.mov?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAVCODYLSA53PQK4ZA%2F20240505%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240505T172924Z&X-Amz-Expires=300&X-Amz-Signature=d680b26c532eeaf80740f08af3320d22ad0b8a4e4da1bcc4f33142c15b509eda&X-Amz-SignedHeaders=host&actor_id=24889239&key_id=0&repo_id=748713144
+
+Our script can also visualize datasets stored on a remote server. See `lerobot-dataset-viz --help` for more instructions.
+
+### The `LeRobotDataset` format
+
+A dataset in `LeRobotDataset` format is very simple to use. It can be loaded from a repository on the Hugging Face hub or a local folder simply with e.g. `dataset = LeRobotDataset("lerobot/aloha_static_coffee")` and can be indexed into like any Hugging Face and PyTorch dataset. For instance `dataset[0]` will retrieve a single temporal frame from the dataset containing observation(s) and an action as PyTorch tensors ready to be fed to a model.
+
+A distinctive feature of `LeRobotDataset` is that, rather than retrieving a single frame by its index, we can retrieve several frames based on their temporal relationship with the indexed frame, by setting `delta_timestamps` to a dictionary that maps each feature key to a list of times relative to the indexed frame. For example, with `delta_timestamps = {"observation.image": [-1, -0.5, -0.2, 0]}` one can retrieve, for a given index, 4 frames: 3 "previous" frames taken 1 second, 0.5 seconds, and 0.2 seconds before the indexed frame, and the indexed frame itself (corresponding to the 0 entry). See the example [load_lerobot_dataset.py](https://github.com/huggingface/lerobot/blob/main/examples/dataset/load_lerobot_dataset.py) for more details on `delta_timestamps`.
+
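To make the `delta_timestamps` idea concrete, here is a minimal sketch of how relative times could map to relative frame offsets. It is an illustration only, not LeRobot's actual implementation: the helper name `delta_timestamps_to_offsets` is hypothetical, and the `fps=10` value is an assumption chosen so that every requested time falls exactly on a frame.

```python
def delta_timestamps_to_offsets(delta_timestamps, fps):
    """Convert relative times in seconds into relative frame offsets at a given fps.

    Hypothetical helper for illustration; not part of the LeRobot API.
    """
    return {
        key: [round(dt * fps) for dt in deltas]
        for key, deltas in delta_timestamps.items()
    }


# Same request as in the text above; fps=10 is an assumed recording rate.
delta_timestamps = {"observation.image": [-1, -0.5, -0.2, 0]}
offsets = delta_timestamps_to_offsets(delta_timestamps, fps=10)
print(offsets)  # {'observation.image': [-10, -5, -2, 0]}
```

At 10 frames per second, the frame one second earlier sits 10 positions before the indexed frame, and the `0` entry is the indexed frame itself.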
+Under the hood, the `LeRobotDataset` format makes use of several ways to serialize data, which can be useful to understand if you plan to work more closely with this format. We tried to make a flexible yet simple dataset format that would cover most types of features and specificities present in reinforcement learning and robotics, in simulation and in the real world, with a focus on cameras and robot states but easily extended to other types of sensory inputs as long as they can be represented by a tensor.
+
+Here are the important details and internal structure of a typical `LeRobotDataset` instantiated with `dataset = LeRobotDataset("lerobot/aloha_static_coffee")`. The exact features will change from dataset to dataset, but the main aspects will not:
+
+```
+dataset attributes:
+  ├ hf_dataset: a Hugging Face dataset (backed by Arrow/parquet). Typical features example:
+  │  ├ observation.images.cam_high (VideoFrame):
+  │  │   VideoFrame = {'path': path to an mp4 video, 'timestamp' (float32): timestamp in the video}
+  │  ├ observation.state (list of float32): positions of an arm's joints (for instance)
+  │  ... (more observations)
+  │  ├ action (list of float32): goal positions of an arm's joints (for instance)
+  │  ├ episode_index (int64): index of the episode for this sample
+  │  ├ frame_index (int64): index of the frame for this sample in the episode ; starts at 0 for each episode
+  │  ├ timestamp (float32): timestamp in the episode
+  │  ├ next.done (bool): indicates the end of an episode ; True for the last frame in each episode
+  │  └ index (int64): general index in the whole dataset
+  ├ meta: a LeRobotDatasetMetadata object containing:
+  │  ├ info: a dictionary of metadata on the dataset
+  │  │  ├ codebase_version (str): this is to keep track of the codebase version the dataset was created with
+  │  │  ├ fps (int): frames per second the dataset is recorded/synchronized to
+  │  │  ├ features (dict): all features contained in the dataset with their shapes and types
+  │  │  ├ total_episodes (int): total number of episodes in the dataset
+  │  │  ├ total_frames (int): total number of frames in the dataset
+  │  │  ├ robot_type (str): robot type used for recording
+  │  │  ├ data_path (str): formattable string for the parquet files
+  │  │  └ video_path (str): formattable string for the video files (if using videos)
+  │  ├ episodes: a DataFrame containing episode metadata with columns:
+  │  │  ├ episode_index (int): index of the episode
+  │  │  ├ tasks (list): list of tasks for this episode
+  │  │  ├ length (int): number of frames in this episode
+  │  │  ├ dataset_from_index (int): start index of this episode in the dataset
+  │  │  └ dataset_to_index (int): end index of this episode in the dataset
+  │  ├ stats: a dictionary of statistics (max, mean, min, std) for each feature in the dataset, for instance
+  │  │  ├ observation.images.front_cam: {'max': tensor with same number of dimensions (e.g. `(c, 1, 1)` for images, `(c,)` for states), etc.}
+  │  │  └ ...
+  │  └ tasks: a DataFrame containing task information with task names as index and task_index as values
+  ├ root (Path): local directory where the dataset is stored
+  ├ image_transforms (Callable): optional image transformations to apply to visual modalities
+  └ delta_timestamps (dict): optional delta timestamps for temporal queries
+```
+
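As a rough illustration of how the per-episode `dataset_from_index` and `dataset_to_index` columns listed above could be used, the sketch below lists the global frame indices belonging to one episode. The rows are made-up values, plain dictionaries stand in for the metadata DataFrame, and treating `dataset_to_index` as an exclusive end is an assumption that should be checked against a real dataset.

```python
# Made-up episode metadata rows mirroring the columns described above.
episodes = [
    {"episode_index": 0, "length": 3, "dataset_from_index": 0, "dataset_to_index": 3},
    {"episode_index": 1, "length": 2, "dataset_from_index": 3, "dataset_to_index": 5},
]


def episode_frame_indices(episode):
    """Return the global frame indices of one episode.

    Assumes `dataset_to_index` is exclusive (hypothetical convention).
    """
    return list(range(episode["dataset_from_index"], episode["dataset_to_index"]))


indices = episode_frame_indices(episodes[1])
print(indices)  # [3, 4]
```

Under that assumption, the number of indices returned matches the episode's `length` field, which gives a quick consistency check.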
+A `LeRobotDataset` is serialized using several widespread file formats for each of its parts, namely:
+
+- `hf_dataset` is stored using the Hugging Face datasets library serialization to parquet
+- videos are stored in mp4 format to save space
+- metadata are stored in plain json/jsonl files
+
+Datasets can be uploaded to and downloaded from the Hugging Face Hub seamlessly. To work on a local dataset, you can specify its location with the `root` argument if it's not in the default `~/.cache/huggingface/lerobot` location.
+
+#### Reproduce state-of-the-art (SOTA)
+
+We provide some pretrained policies on our [hub page](https://huggingface.co/lerobot) that can achieve state-of-the-art performance.
+You can reproduce their training by loading the config from their run. Simply running:
+
+```bash
+lerobot-train --config_path=lerobot/diffusion_pusht
+```
+
+reproduces SOTA results for Diffusion Policy on the PushT task.
+
+## Contribute
+
+If you would like to contribute to 🤗 LeRobot, please check out our [contribution guide](https://github.com/huggingface/lerobot/blob/main/CONTRIBUTING.md).
+
+### Add a pretrained policy
+
+Once you have trained a policy you may upload it to the Hugging Face hub using a hub id that looks like `${hf_user}/${repo_name}` (e.g. [lerobot/diffusion_pusht](https://huggingface.co/lerobot/diffusion_pusht)).
+
+You first need to find the checkpoint folder located inside your experiment directory (e.g. `outputs/train/2024-05-05/20-21-12_aloha_act_default/checkpoints/002500`). Within that there is a `pretrained_model` directory which should contain:
+
+- `config.json`: A serialized version of the policy configuration (following the policy's dataclass config).
+- `model.safetensors`: A set of `torch.nn.Module` parameters, saved in [Hugging Face Safetensors](https://huggingface.co/docs/safetensors/index) format.
+- `train_config.json`: A consolidated configuration containing all parameters used for training. The policy configuration should match `config.json` exactly. This is useful for anyone who wants to evaluate your policy or for reproducibility.
+
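Before uploading, it can help to confirm that the three files listed above are present. The following is a small sketch using only the standard library; the helper name `missing_checkpoint_files` is hypothetical, and a temporary directory stands in for a real `pretrained_model` folder.

```python
import tempfile
from pathlib import Path

# File names taken from the list above.
EXPECTED_FILES = ("config.json", "model.safetensors", "train_config.json")


def missing_checkpoint_files(pretrained_model_dir):
    """Return the expected file names that are absent from the directory."""
    directory = Path(pretrained_model_dir)
    return [name for name in EXPECTED_FILES if not (directory / name).is_file()]


# Demo on a temporary directory that deliberately lacks train_config.json.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "config.json").write_text("{}")
    (Path(tmp) / "model.safetensors").write_bytes(b"")
    missing = missing_checkpoint_files(tmp)
    print(missing)  # ['train_config.json']
```

An empty list means all three files exist; the check does not validate their contents.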
+To upload these to the hub, run the following:
+
+```bash
+huggingface-cli upload ${hf_user}/${repo_name} path/to/pretrained_model
+```
+
+See [lerobot_eval.py](https://github.com/huggingface/lerobot/blob/main/src/lerobot/scripts/lerobot_eval.py) for an example of how other people may use your policy.
+
+### Acknowledgment
+
+- The LeRobot team 🤗 for building SmolVLA [Paper](https://arxiv.org/abs/2506.01844), [Blog](https://huggingface.co/blog/smolvla).
+- Thanks to Tony Zhao, Zipeng Fu and colleagues for open sourcing ACT policy, ALOHA environments and datasets. Ours are adapted from [ALOHA](https://tonyzhaozh.github.io/aloha) and [Mobile ALOHA](https://mobile-aloha.github.io).
+- Thanks to Cheng Chi, Zhenjia Xu and colleagues for open sourcing Diffusion policy, Pusht environment and datasets, as well as UMI datasets. Ours are adapted from [Diffusion Policy](https://diffusion-policy.cs.columbia.edu) and [UMI Gripper](https://umi-gripper.github.io).
+- Thanks to Nicklas Hansen, Yunhai Feng and colleagues for open sourcing TDMPC policy, Simxarm environments and datasets. Ours are adapted from [TDMPC](https://github.com/nicklashansen/tdmpc) and [FOWM](https://www.yunhaifeng.com/FOWM).
+- Thanks to Antonio Loquercio and Ashish Kumar for their early support.
+- Thanks to [Seungjae (Jay) Lee](https://sjlee.cc/), [Mahi Shafiullah](https://mahis.life/) and colleagues for open sourcing [VQ-BeT](https://sjlee.cc/vq-bet/) policy and helping us adapt the codebase to our repository. The policy is adapted from [VQ-BeT repo](https://github.com/jayLEE0301/vq_bet_official).

 ## Citation

-If you use LeRobot in your project, please cite the GitHub repository to acknowledge the ongoing development and contributors:
+If you want, you can cite this work with:

 ```bibtex
@misc{cadene2024lerobot,
@@ -146,31 +339,6 @@ If you use LeRobot in your project, please cite the GitHub repository to acknowl
 }
 ```

-If you are referencing our research or the academic paper, please also cite our ICLR publication:
+## Star History

-<details>
-<summary><b>ICLR 2026 Paper</b></summary>
-
-```bibtex
-@inproceedings{cadenelerobot,
-  title={LeRobot: An Open-Source Library for End-to-End Robot Learning},
-  author={Cadene, Remi and Alibert, Simon and Capuano, Francesco and Aractingi, Michel and Zouitine, Adil and Kooijmans, Pepijn and Choghari, Jade and Russi, Martino and Pascal, Caroline and Palma, Steven and Shukor, Mustafa and Moss, Jess and Soare, Alexander and Aubakirova, Dana and Lhoest, Quentin and Gallou\'edec, Quentin and Wolf, Thomas},
-  booktitle={The Fourteenth International Conference on Learning Representations},
-  year={2026},
-  url={https://arxiv.org/abs/2602.22818}
-}
-```
-
-</details>
-
-## Contribute
-
-We welcome contributions from everyone in the community! To get started, please read our [CONTRIBUTING.md](https://github.com/huggingface/lerobot/blob/main/CONTRIBUTING.md) guide. Whether you're adding a new feature, improving documentation, or fixing a bug, your help and feedback are invaluable. We're incredibly excited about the future of open-source robotics and can't wait to work with you on what's next—thank you for your support!
-
-<p align="center">
-  <img alt="SO101 Video" src="./media/readme/so100_video.webp" width="640px">
-</p>
-
-<div align="center">
-<sub>Built by the <a href="https://huggingface.co/lerobot">LeRobot</a> team at <a href="https://huggingface.co">Hugging Face</a> with ❤️</sub>
-</div>
+[![Star History Chart](https://api.star-history.com/svg?repos=huggingface/lerobot&type=Timeline)](https://star-history.com/#huggingface/lerobot&Timeline)
@@ -1,48 +0,0 @@
-# Security Policy
-
-## Project Status & Philosophy
-
-`lerobot` has so far been primarily a research and prototyping tool, which is why deployment security hasn’t been a strong focus until now. As `lerobot` continues to be adopted and deployed in production, we are paying much closer attention to these kinds of issues.
-
-Fortunately, being an open-source project, the community can also help by reporting and fixing vulnerabilities. We appreciate your efforts to responsibly disclose your findings and will make every effort to acknowledge your contributions.
-
-## Reporting a Vulnerability
-
-To report a security issue, please use the GitHub Security Advisory ["Report a Vulnerability"](https://github.com/huggingface/lerobot/security/advisories/new) tab.
-
-The `lerobot` team will send a response indicating the next steps in handling your report. After the initial reply to your report, the security team will keep you informed of the progress towards a fix and full announcement, and may ask for additional information or guidance.
-
-#### Hugging Face Security Team
-
-Since this project is part of the Hugging Face ecosystem, feel free to submit vulnerability reports directly to: **[security@huggingface.co](mailto:security@huggingface.co)**. Someone from the HF security team will review the report and recommend next steps.
-
-#### Open Source Disclosures
-
-If reporting a vulnerability specific to the open-source codebase (and not the underlying Hub infrastructure), you may also use [Huntr](https://huntr.com), a vulnerability disclosure program for open source software.
-
-## Supported Versions
-
-Currently, we treat `lerobot` as a rolling release. We prioritize security updates for the latest available version (`main` branch).
-
-| Version  | Supported |
-| -------- | --------- |
-| Latest   | ✅        |
-| < Latest | ❌        |
-
-## Secure Usage Guidelines
-
-`lerobot` is tightly coupled to the Hugging Face Hub for sharing data and pretrained policies. When downloading artifacts uploaded by others, you expose yourself to risks. Please read below for recommendations to keep your runtime and robot environment safe.
-
-### Remote Artefacts (Weights & Policies)
-
-Models and policies uploaded to the Hugging Face Hub come in different formats. We heavily recommend uploading and downloading models in the [`safetensors`](https://github.com/huggingface/safetensors) format.
-
-`safetensors` was developed specifically to prevent arbitrary code execution on your system, which is critical when running software on physical hardware/robots.
-
-To avoid loading models from unsafe formats (e.g., `pickle`), you should ensure you are prioritizing `safetensors` files.
-
-### Remote Code
-
-Some models or environments on the Hub may require `trust_remote_code=True` to run custom architecture code.
-
-Please **always** verify the content of the modeling files when using this argument. We recommend setting a specific `revision` (commit hash) when loading remote code to ensure you protect yourself from unverified updates to the repository.
@@ -28,9 +28,9 @@ We don't expect the same optimal settings for a dataset of images from a simulat
 For these reasons, we run this benchmark on four representative datasets:

 - `lerobot/pusht_image`: (96 x 96 pixels) simulation with simple geometric shapes, fixed camera.
- `lerobot/aloha_mobile_shrimp_image`: (480 x 640 pixels) real-world indoor, moving camera.
- `lerobot/paris_street`: (720 x 1280 pixels) real-world outdoor, moving camera.
- `lerobot/kitchen`: (1080 x 1920 pixels) real-world indoor, fixed camera.
+- `aliberts/aloha_mobile_shrimp_image`: (480 x 640 pixels) real-world indoor, moving camera.
+- `aliberts/paris_street`: (720 x 1280 pixels) real-world outdoor, moving camera.
+- `aliberts/kitchen`: (1080 x 1920 pixels) real-world indoor, fixed camera.

 Note: The datasets used for this benchmark need to be image datasets, not video datasets.

@@ -179,7 +179,7 @@ python benchmark/video/run_video_benchmark.py \
    --output-dir outputs/video_benchmark \
    --repo-ids \
        lerobot/pusht_image \
-        lerobot/aloha_mobile_shrimp_image \
+        aliberts/aloha_mobile_shrimp_image \
    --vcodec libx264 libx265 \
    --pix-fmt yuv444p yuv420p \
    --g 2 20 None \
@@ -203,9 +203,9 @@ python benchmark/video/run_video_benchmark.py \
    --output-dir outputs/video_benchmark \
    --repo-ids \
        lerobot/pusht_image \
-        lerobot/aloha_mobile_shrimp_image \
-        lerobot/paris_street \
-        lerobot/kitchen \
+        aliberts/aloha_mobile_shrimp_image \
+        aliberts/paris_street \
+        aliberts/kitchen \
    --vcodec libx264 libx265 \
    --pix-fmt yuv444p yuv420p \
    --g 1 2 3 4 5 6 10 15 20 40 None \
@@ -221,9 +221,9 @@ python benchmark/video/run_video_benchmark.py \
    --output-dir outputs/video_benchmark \
    --repo-ids \
        lerobot/pusht_image \
-        lerobot/aloha_mobile_shrimp_image \
-        lerobot/paris_street \
-        lerobot/kitchen \
+        aliberts/aloha_mobile_shrimp_image \
+        aliberts/paris_street \
+        aliberts/kitchen \
    --vcodec libsvtav1 \
    --pix-fmt yuv420p \
    --g 1 2 3 4 5 6 10 15 20 40 None \
@@ -252,37 +252,37 @@ Since we're using av1 encoding, we're choosing the `pyav` decoder as `video_read

 These tables show the results for `g=2` and `crf=30`, using `timestamps-modes=6_frames` and `backend=pyav`

-| video_images_size_ratio           | vcodec     | pix_fmt |           |           |           |
-| --------------------------------- | ---------- | ------- | --------- | --------- | --------- |
-|                                   | libx264    |         | libx265   |           | libsvtav1 |
-| repo_id                           | yuv420p    | yuv444p | yuv420p   | yuv444p   | yuv420p   |
-| lerobot/pusht_image               | **16.97%** | 17.58%  | 18.57%    | 18.86%    | 22.06%    |
-| lerobot/aloha_mobile_shrimp_image | 2.14%      | 2.11%   | 1.38%     | **1.37%** | 5.59%     |
-| lerobot/paris_street              | 2.12%      | 2.13%   | **1.54%** | **1.54%** | 4.43%     |
-| lerobot/kitchen                   | 1.40%      | 1.39%   | **1.00%** | **1.00%** | 2.52%     |
+| video_images_size_ratio            | vcodec     | pix_fmt |           |           |           |
+| ---------------------------------- | ---------- | ------- | --------- | --------- | --------- |
+|                                    | libx264    |         | libx265   |           | libsvtav1 |
+| repo_id                            | yuv420p    | yuv444p | yuv420p   | yuv444p   | yuv420p   |
+| lerobot/pusht_image                | **16.97%** | 17.58%  | 18.57%    | 18.86%    | 22.06%    |
+| aliberts/aloha_mobile_shrimp_image | 2.14%      | 2.11%   | 1.38%     | **1.37%** | 5.59%     |
+| aliberts/paris_street              | 2.12%      | 2.13%   | **1.54%** | **1.54%** | 4.43%     |
+| aliberts/kitchen                   | 1.40%      | 1.39%   | **1.00%** | **1.00%** | 2.52%     |

-| video_images_load_time_ratio      | vcodec  | pix_fmt |          |         |           |
-| --------------------------------- | ------- | ------- | -------- | ------- | --------- |
-|                                   | libx264 |         | libx265  |         | libsvtav1 |
-| repo_id                           | yuv420p | yuv444p | yuv420p  | yuv444p | yuv420p   |
-| lerobot/pusht_image               | 6.45    | 5.19    | **1.90** | 2.12    | 2.47      |
-| lerobot/aloha_mobile_shrimp_image | 11.80   | 7.92    | 0.71     | 0.85    | **0.48**  |
-| lerobot/paris_street              | 2.21    | 2.05    | 0.36     | 0.49    | **0.30**  |
-| lerobot/kitchen                   | 1.46    | 1.46    | 0.28     | 0.51    | **0.26**  |
+| video_images_load_time_ratio       | vcodec  | pix_fmt |          |         |           |
+| ---------------------------------- | ------- | ------- | -------- | ------- | --------- |
+|                                    | libx264 |         | libx265  |         | libsvtav1 |
+| repo_id                            | yuv420p | yuv444p | yuv420p  | yuv444p | yuv420p   |
+| lerobot/pusht_image                | 6.45    | 5.19    | **1.90** | 2.12    | 2.47      |
+| aliberts/aloha_mobile_shrimp_image | 11.80   | 7.92    | 0.71     | 0.85    | **0.48**  |
+| aliberts/paris_street              | 2.21    | 2.05    | 0.36     | 0.49    | **0.30**  |
+| aliberts/kitchen                   | 1.46    | 1.46    | 0.28     | 0.51    | **0.26**  |

-|                                   |          | vcodec   | pix_fmt      |          |           |              |
-| --------------------------------- | -------- | -------- | ------------ | -------- | --------- | ------------ |
-|                                   |          | libx264  |              | libx265  |           | libsvtav1    |
-| repo_id                           | metric   | yuv420p  | yuv444p      | yuv420p  | yuv444p   | yuv420p      |
-| lerobot/pusht_image               | avg_mse  | 2.90E-04 | **2.03E-04** | 3.13E-04 | 2.29E-04  | 2.19E-04     |
-|                                   | avg_psnr | 35.44    | 37.07        | 35.49    | **37.30** | 37.20        |
-|                                   | avg_ssim | 98.28%   | **98.85%**   | 98.31%   | 98.84%    | 98.72%       |
-| lerobot/aloha_mobile_shrimp_image | avg_mse  | 2.76E-04 | 2.59E-04     | 3.17E-04 | 3.06E-04  | **1.30E-04** |
-|                                   | avg_psnr | 35.91    | 36.21        | 35.88    | 36.09     | **40.17**    |
-|                                   | avg_ssim | 95.19%   | 95.18%       | 95.00%   | 95.05%    | **97.73%**   |
-| lerobot/paris_street              | avg_mse  | 6.89E-04 | 6.70E-04     | 4.03E-03 | 4.02E-03  | **3.09E-04** |
-|                                   | avg_psnr | 33.48    | 33.68        | 32.05    | 32.15     | **35.40**    |
-|                                   | avg_ssim | 93.76%   | 93.75%       | 89.46%   | 89.46%    | **95.46%**   |
-| lerobot/kitchen                   | avg_mse  | 2.50E-04 | 2.24E-04     | 4.28E-04 | 4.18E-04  | **1.53E-04** |
-|                                   | avg_psnr | 36.73    | 37.33        | 36.56    | 36.75     | **39.12**    |
-|                                   | avg_ssim | 95.47%   | 95.58%       | 95.52%   | 95.53%    | **96.82%**   |
+|                                    |          | vcodec   | pix_fmt      |          |           |              |
+| ---------------------------------- | -------- | -------- | ------------ | -------- | --------- | ------------ |
+|                                    |          | libx264  |              | libx265  |           | libsvtav1    |
+| repo_id                            | metric   | yuv420p  | yuv444p      | yuv420p  | yuv444p   | yuv420p      |
+| lerobot/pusht_image                | avg_mse  | 2.90E-04 | **2.03E-04** | 3.13E-04 | 2.29E-04  | 2.19E-04     |
+|                                    | avg_psnr | 35.44    | 37.07        | 35.49    | **37.30** | 37.20        |
+|                                    | avg_ssim | 98.28%   | **98.85%**   | 98.31%   | 98.84%    | 98.72%       |
+| aliberts/aloha_mobile_shrimp_image | avg_mse  | 2.76E-04 | 2.59E-04     | 3.17E-04 | 3.06E-04  | **1.30E-04** |
+|                                    | avg_psnr | 35.91    | 36.21        | 35.88    | 36.09     | **40.17**    |
+|                                    | avg_ssim | 95.19%   | 95.18%       | 95.00%   | 95.05%    | **97.73%**   |
+| aliberts/paris_street              | avg_mse  | 6.89E-04 | 6.70E-04     | 4.03E-03 | 4.02E-03  | **3.09E-04** |
+|                                    | avg_psnr | 33.48    | 33.68        | 32.05    | 32.15     | **35.40**    |
+|                                    | avg_ssim | 93.76%   | 93.75%       | 89.46%   | 89.46%    | **95.46%**   |
+| aliberts/kitchen                   | avg_mse  | 2.50E-04 | 2.24E-04     | 4.28E-04 | 4.18E-04  | **1.53E-04** |
+|                                    | avg_psnr | 36.73    | 37.33        | 36.56    | 36.75     | **39.12**    |
+|                                    | avg_ssim | 95.47%   | 95.58%       | 95.52%   | 95.53%    | **96.82%**   |
@@ -24,7 +24,7 @@ ARG OS_VERSION=22.04
 FROM nvidia/cuda:${CUDA_VERSION}-base-ubuntu${OS_VERSION}

 # Define Python version argument
-ARG PYTHON_VERSION=3.12
+ARG PYTHON_VERSION=3.10

 # Configure environment variables
 ENV DEBIAN_FRONTEND=noninteractive \
@@ -73,7 +73,7 @@ ENV HOME=/home/user_lerobot \
 RUN uv venv --python python${PYTHON_VERSION}

 # Install Python dependencies for caching
-COPY --chown=user_lerobot:user_lerobot setup.py pyproject.toml README.md MANIFEST.in ./
+COPY --chown=user_lerobot:user_lerobot pyproject.toml README.md MANIFEST.in ./
 COPY --chown=user_lerobot:user_lerobot src/ src/

 ARG UNBOUND_DEPS=false
@@ -85,8 +85,6 @@ RUN if [ "$UNBOUND_DEPS" = "true" ]; then \

 RUN uv pip install --no-cache ".[all]"

-RUN chmod +x /lerobot/.venv/lib/python${PYTHON_VERSION}/site-packages/triton/backends/nvidia/bin/ptxas
-
 # Copy the rest of the application source code
 # Make sure to have the git-LFS files for testing
 COPY --chown=user_lerobot:user_lerobot . .
@@ -18,10 +18,8 @@
 # docker build -f docker/Dockerfile.user -t lerobot-user .
 # docker run -it --rm lerobot-user

-# With USB physical access : docker run -it --device=/dev/ -v /dev/:/dev/ --rm lerobot-user
-
 # Configure the base image
-ARG PYTHON_VERSION=3.12
+ARG PYTHON_VERSION=3.10
 FROM python:${PYTHON_VERSION}-slim

 # Configure environment variables
@@ -61,7 +59,7 @@ ENV HOME=/home/user_lerobot \
 RUN uv venv

 # Install Python dependencies for caching
-COPY --chown=user_lerobot:user_lerobot setup.py pyproject.toml README.md MANIFEST.in ./
+COPY --chown=user_lerobot:user_lerobot pyproject.toml README.md MANIFEST.in ./
 COPY --chown=user_lerobot:user_lerobot src/ src/

 ARG UNBOUND_DEPS=false
@@ -7,6 +7,8 @@
 - sections:
  - local: il_robots
    title: Imitation Learning for Robots
+  - local: cameras
+    title: Cameras
  - local: bring_your_own_policies
    title: Bring Your Own Policies
  - local: integrate_hardware
@@ -17,8 +19,6 @@
    title: Train RL in Simulation
  - local: multi_gpu_training
    title: Multi GPU training
-  - local: peft_training
-    title: Training with PEFT (e.g., LoRA)
  title: "Tutorials"
 - sections:
  - local: lerobot-dataset-v3
@@ -27,10 +27,6 @@
    title: Porting Large Datasets
  - local: using_dataset_tools
    title: Using the Dataset Tools
-  - local: dataset_subtask
-    title: Using Subtasks in the Dataset
-  - local: streaming_video_encoding
-    title: Streaming Video Encoding
  title: "Datasets"
 - sections:
  - local: act
@@ -39,21 +35,13 @@
    title: SmolVLA
  - local: pi0
    title: π₀ (Pi0)
-  - local: pi0fast
-    title: π₀-FAST (Pi0Fast)
  - local: pi05
    title: π₀.₅ (Pi05)
  - local: groot
    title: NVIDIA GR00T N1.5
  - local: xvla
    title: X-VLA
-  - local: walloss
-    title: WALL-OSS
  title: "Policies"
- sections:
-  - local: sarm
-    title: SARM
-  title: "Reward Models"
 - sections:
  - local: async
    title: Use Async Inference
@@ -65,8 +53,6 @@
    title: Environments from the Hub
  - local: envhub_leisaac
    title: Control & Train Robots in Sim (LeIsaac)
-  - local: envhub_isaaclab_arena
-    title: NVIDIA IsaacLab Arena Environments
  - local: libero
    title: Using Libero
  - local: metaworld
@@ -101,30 +87,16 @@
    title: Unitree G1
  - local: earthrover_mini_plus
    title: Earth Rover Mini
-  - local: omx
-    title: OMX
-  - local: openarm
-    title: OpenArm
  title: "Robots"
 - sections:
  - local: phone_teleop
    title: Phone
  title: "Teleoperators"
- sections:
-  - local: cameras
-    title: Cameras
-  title: "Sensors"
- sections:
-  - local: torch_accelerators
-    title: PyTorch accelerators
-  title: "Supported Hardware"
 - sections:
  - local: notebooks
    title: Notebooks
  - local: feetech
    title: Updating Feetech Firmware
-  - local: damiao
-    title: Damiao Motors and CAN Bus
  title: "Resources"
 - sections:
  - local: contributing
@@ -88,8 +88,5 @@ lerobot-record \
  --dataset.repo_id=${HF_USER}/eval_act_your_dataset \
  --dataset.num_episodes=10 \
  --dataset.single_task="Your task description" \
-  --dataset.streaming_encoding=true \
-  --dataset.encoder_threads=2 \
-  # --dataset.vcodec=auto \
  --policy.path=${HF_USER}/act_policy
 ```
@@ -48,7 +48,7 @@ python -m lerobot.async_inference.robot_client \
    --task="dummy" \ # POLICY: The task to run the policy on (`Fold my t-shirt`). Not necessarily defined for all policies, such as `act`
    --policy_type=your_policy_type \ # POLICY: the type of policy to run (smolvla, act, etc)
    --pretrained_name_or_path=user/model \ # POLICY: the model name/path on server to the checkpoint to run (e.g., lerobot/smolvla_base)
-    --policy_device=mps \ # POLICY: the device to run the policy on, on the server (cuda, mps, xpu, cpu)
+    --policy_device=mps \ # POLICY: the device to run the policy on, on the server
    --actions_per_chunk=50 \ # POLICY: the number of actions to output at once
    --chunk_size_threshold=0.5 \ # CLIENT: the threshold for the chunk size before sending a new observation to the server
    --aggregate_fn_name=weighted_average \ # CLIENT: the function to aggregate actions on overlapping portions
@@ -169,7 +169,7 @@ python -m lerobot.async_inference.robot_client \
 <!-- prettier-ignore-start -->
 ```python
 import threading
-from lerobot.robots.so_follower import SO100FollowerConfig
+from lerobot.robots.so100_follower import SO100FollowerConfig
 from lerobot.cameras.opencv.configuration_opencv import OpenCVCameraConfig
 from lerobot.async_inference.configs import RobotClientConfig
 from lerobot.async_inference.robot_client import RobotClient
@@ -195,7 +195,6 @@ client_cfg = RobotClientConfig(
    robot=robot_cfg,
    server_address="localhost:8080",
    policy_device="mps",
-    client_device="cpu",
    policy_type="smolvla",
    pretrained_name_or_path="<user>/smolvla_async",
    chunk_size_threshold=0.5,
@@ -32,7 +32,7 @@ version = "0.1.0"
 dependencies = [
    # your policy-specific dependencies
 ]
-requires-python = ">= 3.12"
+requires-python = ">= 3.11"

 [build-system]
 build-backend = # your-build-backend
@@ -82,7 +82,7 @@ Create your policy implementation by inheriting from LeRobot's base `PreTrainedP
 # modeling_my_custom_policy.py
 import torch
 import torch.nn as nn
-from typing import Any
+from typing import Dict, Any

 from lerobot.policies.pretrained import PreTrainedPolicy
 from .configuration_my_custom_policy import MyCustomPolicyConfig
@@ -91,7 +91,7 @@ class MyCustomPolicy(PreTrainedPolicy):
    config_class = MyCustomPolicyConfig
    name = "my_custom_policy"

-    def __init__(self, config: MyCustomPolicyConfig, dataset_stats: dict[str, Any] = None):
+    def __init__(self, config: MyCustomPolicyConfig, dataset_stats: Dict[str, Any] = None):
        super().__init__(config, dataset_stats)
        ...
 ```
@@ -102,7 +102,7 @@ Create processor functions:

 ```python
 # processor_my_custom_policy.py
-from typing import Any
+from typing import Dict, Any
 import torch


@@ -1,22 +1,12 @@
 # Cameras

-LeRobot offers multiple options for video capture:
+LeRobot offers multiple options for video capture, including phone cameras, built-in laptop cameras, external webcams, and Intel RealSense cameras. To efficiently record frames from most cameras, you can use either the `OpenCVCamera` or `RealSenseCamera` class. For additional compatibility details on the `OpenCVCamera` class, refer to the [Video I/O with OpenCV Overview](https://docs.opencv.org/4.x/d0/da7/videoio_overview.html).

-| Class             | Supported Cameras                   |
-| ----------------- | ----------------------------------- |
-| `OpenCVCamera`    | Phone, built-in laptop, USB webcams |
-| `ZMQCamera`       | Network-connected cameras           |
-| `RealSenseCamera` | Intel RealSense (with depth)        |
-| `Reachy2Camera`   | Reachy 2 robot cameras              |
+### Finding your camera

-> [!TIP]
-> For `OpenCVCamera` compatibility details, see the [Video I/O with OpenCV Overview](https://docs.opencv.org/4.x/d0/da7/videoio_overview.html).
+To instantiate a camera, you need a camera identifier. This identifier might change if you reboot your computer or re-plug your camera, a behavior mostly dependent on your operating system.

-### Find your camera
-
-Every camera requires a unique identifier to be instantiated, allowing you to distinguish between multiple connected devices.
-
-`OpenCVCamera` and `RealSenseCamera` support auto-discovery. Run the command below to list available devices and their identifiers. Note that these identifiers may change after rebooting your computer or re-plugging the camera, depending on your operating system.
+To find the camera indices of the cameras plugged into your system, run the following script:

 ```bash
 lerobot-find-cameras opencv # or realsense for Intel Realsense cameras
@@ -24,7 +14,7 @@ lerobot-find-cameras opencv # or realsense for Intel Realsense cameras

 The output will look something like this if you have two cameras connected:

-```bash
+```
 --- Detected Cameras ---
 Camera #0:
  Name: OpenCV Camera @ 0
@@ -43,37 +33,13 @@ Camera #0:
 > [!WARNING]
 > When using Intel RealSense cameras in `macOS`, you could get this [error](https://github.com/IntelRealSense/librealsense/issues/12307): `Error finding RealSense cameras: failed to set power state`, this can be solved by running the same command with `sudo` permissions. Note that using RealSense cameras in `macOS` is unstable.

-`ZMQCamera` and `Reachy2Camera` do not support auto-discovery. They must be configured manually by providing their network address and port or robot SDK settings.
+## Use Cameras

-## Use cameras
+Below are two examples demonstrating how to work with the API:

-### Frame access modes
-
-All camera classes implement three access modes for capturing frames:
-
-| Method                    | Behavior                                                                                                                                                   | Blocks?        | Best For                                 |
-| ------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------- | ---------------------------------------- |
-| `read()`                  | Waits for the camera hardware to return a frame. May block for a long time depending on the camera and SDK.                                                | Yes            | Simple scripts, sequential capture       |
-| `async_read(timeout_ms)`  | Returns the latest unconsumed frame from background thread. Blocks only if buffer is empty, up to `timeout_ms`. Raises `TimeoutError` if no frame arrives. | With a timeout | Control loops synchronized to camera FPS |
-| `read_latest(max_age_ms)` | Peeks at the most recent frame in buffer (may be stale). Raises `TimeoutError` if frame is older than `max_age_ms`.                                        | No             | UI visualization, logging, monitoring    |
-
-### Usage examples
-
-The following examples show how to use the camera API to configure and capture frames from different camera types.
-
- **Blocking and non-blocking frame capture** using an OpenCV-based camera
+- **Asynchronous frame capture** using an OpenCV-based camera
 - **Color and depth capture** using an Intel RealSense camera

-> [!WARNING]
-> Failing to cleanly disconnect cameras can cause resource leaks. Use the context manager protocol to ensure automatic cleanup:
->
-> ```python
-> with OpenCVCamera(config) as camera:
->     ...
-> ```
->
-> You can also call `connect()` and `disconnect()` manually, but always use a `finally` block for the latter.
-
 <hfoptions id="shell_restart">
 <hfoption id="Open CV Camera">

@@ -94,30 +60,16 @@ config = OpenCVCameraConfig(
 )

 # Instantiate and connect an `OpenCVCamera`, performing a warm-up read (default).
-with OpenCVCamera(config) as camera:
-
-    # Read a frame synchronously — blocks until hardware delivers a new frame
-    frame = camera.read()
-    print(f"read() call returned frame with shape:", frame.shape)
-
-    # Read a frame asynchronously with a timeout — returns the latest unconsumed frame or waits up to timeout_ms for a new one
-    try:
-        for i in range(10):
-            frame = camera.async_read(timeout_ms=200)
-            print(f"async_read call returned frame {i} with shape:", frame.shape)
-    except TimeoutError as e:
-        print(f"No frame received within timeout: {e}")
-
-    # Instantly return a frame - returns the most recent frame captured by the camera
-    try:
-        initial_frame = camera.read_latest(max_age_ms=1000)
-        for i in range(10):
-            frame = camera.read_latest(max_age_ms=1000)
-            print(f"read_latest call returned frame {i} with shape:", frame.shape)
-            print(f"Was a new frame received by the camera? {not (initial_frame == frame).any()}")
-    except TimeoutError as e:
-        print(f"Frame too old: {e}")
+camera = OpenCVCamera(config)
+camera.connect()

+# Read frames asynchronously in a loop via `async_read(timeout_ms)`
+try:
+    for i in range(10):
+        frame = camera.async_read(timeout_ms=200)
+        print(f"Async frame {i} shape:", frame.shape)
+finally:
+    camera.disconnect()
 ```
 <!-- prettier-ignore-end -->

@@ -159,10 +111,10 @@ finally:
 </hfoption>
 </hfoptions>

-## Use your phone's camera
+## Use your phone

 <hfoptions id="use phone">
-<hfoption id="iPhone & macOS">
+<hfoption id="Mac">

 To use your iPhone as a camera on macOS, enable the Continuity Camera feature:

@@ -172,49 +124,83 @@ To use your iPhone as a camera on macOS, enable the Continuity Camera feature:

 For more details, visit [Apple support](https://support.apple.com/en-gb/guide/mac-help/mchl77879b8a/mac).

+Your iPhone should be detected automatically when running the camera setup script in the next section.
+
 </hfoption>
-<hfoption id="OBS virtual camera">
+<hfoption id="Linux">

-If you want to use your phone as a camera using OBS, follow these steps to set up a virtual camera.
+If you want to use your phone as a camera on Linux, follow these steps to set up a virtual camera:

-1. _(Linux only) Install `v4l2loopback-dkms` and `v4l-utils`_. These packages create virtual camera devices and verify their settings. Install with:
+1. _Install `v4l2loopback-dkms` and `v4l-utils`_. Those packages are required to create virtual camera devices (`v4l2loopback`) and verify their settings with the `v4l2-ctl` utility from `v4l-utils`. Install them using:

-```bash
+<!-- prettier-ignore-start -->
+```bash
 sudo apt install v4l2loopback-dkms v4l-utils
 ```
+<!-- prettier-ignore-end -->

-2. _Install the [DroidCam app](https://droidcam.app) on your phone_. This app is available for both iOS and Android.
-3. _Download and install [OBS Studio](https://obsproject.com)_.
-4. _Download and install the [DroidCam OBS plugin](https://droidcam.app/obs)_.
-5. _Start OBS Studio_.
+2. _Install [DroidCam](https://droidcam.app) on your phone_. This app is available for both iOS and Android.
+3. _Install [OBS Studio](https://obsproject.com)_. This software will help you manage the camera feed. Install it using [Flatpak](https://flatpak.org):

-6. _Add your phone as a source_. Follow the instructions [here](https://droidcam.app/obs/usage). Be sure to set the resolution to `640x480` to avoid the watermarks.
-7. _Adjust resolution settings_. In OBS Studio, go to `File > Settings > Video` or `OBS > Preferences... > Video`. Change the `Base(Canvas) Resolution` and the `Output(Scaled) Resolution` to `640x480` by manually typing it.
+<!-- prettier-ignore-start -->
+```bash
+flatpak install flathub com.obsproject.Studio
+```
+<!-- prettier-ignore-end -->
+
+4. _Install the DroidCam OBS plugin_. This plugin integrates DroidCam with OBS Studio. Install it with:
+
+<!-- prettier-ignore-start -->
+```bash
+flatpak install flathub com.obsproject.Studio.Plugin.DroidCam
+```
+<!-- prettier-ignore-end -->
+
+5. _Start OBS Studio_. Launch with:
+
+<!-- prettier-ignore-start -->
+```bash
+flatpak run com.obsproject.Studio
+```
+<!-- prettier-ignore-end -->
+
+6. _Add your phone as a source_. Follow the instructions [here](https://droidcam.app/obs/usage). Be sure to set the resolution to `640x480`.
+7. _Adjust resolution settings_. In OBS Studio, go to `File > Settings > Video`. Change the `Base(Canvas) Resolution` and the `Output(Scaled) Resolution` to `640x480` by manually typing it in.
 8. _Start virtual camera_. In OBS Studio, follow the instructions [here](https://obsproject.com/kb/virtual-camera-guide).
-9. _Verify the virtual camera setup and resolution_.
-   - **Linux**: Use `v4l2-ctl` to list devices and check resolution:
-     ```bash
-     v4l2-ctl --list-devices  # find VirtualCam and note its /dev/videoX path
-     v4l2-ctl -d /dev/videoX --get-fmt-video  # replace with your VirtualCam path
-     ```
-     You should see `VirtualCam` listed and resolution `640x480`.
-   - **macOS**: Open Photo Booth or FaceTime and select "OBS Virtual Camera" as the input.
-   - **Windows**: The native Camera app doesn't support virtual cameras. Use a video conferencing app (Zoom, Teams) or run `lerobot-find-cameras opencv` directly to verify.
+9. _Verify the virtual camera setup_. Use `v4l2-ctl` to list the devices:

-<details>
-<summary><strong>Troubleshooting</strong></summary>
+<!-- prettier-ignore-start -->
+```bash
+v4l2-ctl --list-devices
+```
+<!-- prettier-ignore-end -->

-> The virtual camera resolution is incorrect.
+You should see an entry like:

-Delete the virtual camera source and recreate it. The resolution cannot be changed after creation.
+```
+VirtualCam (platform:v4l2loopback-000):
+/dev/video1
+```

-> Error reading frame in background thread for OpenCVCamera(X): OpenCVCamera(X) frame width=640 or height=480 do not match configured width=1920 or height=1080.
+10. _Check the camera resolution_. Use `v4l2-ctl` to ensure that the virtual camera output resolution is `640x480`. Change `/dev/video1` to the port of your virtual camera from the output of `v4l2-ctl --list-devices`.

-This error is caused by OBS Virtual Camera advertising a `1920x1080` resolution despite rescaling. The only fix for now is to comment out the width and height check in `_postprocess_image()`.
+<!-- prettier-ignore-start -->
+```bash
+v4l2-ctl -d /dev/video1 --get-fmt-video
+```
+<!-- prettier-ignore-end -->

-</details>
+You should see an entry like:
+
+```
+>>> Format Video Capture:
+>>>	Width/Height      : 640/480
+>>>	Pixel Format      : 'YUYV' (YUYV 4:2:2)
+```
+
+Troubleshooting: If the resolution is incorrect, delete the virtual camera port and set it up again, since the resolution cannot be changed after creation.
+
+If everything is set up correctly, you can proceed with the rest of the tutorial.

 </hfoption>
 </hfoptions>
-
-If everything is set up correctly, your phone will appear as a standard OpenCV camera and can be used with `OpenCVCamera`.
@@ -1,165 +0,0 @@
-# Damiao Motors and CAN Bus
-
-This guide covers setup and usage of Damiao motors with LeRobot via CAN bus communication.
-
-Currently, only Linux is supported, as the OpenArms CAN adapter only has drivers for Linux.
-
-## Linux CAN Setup
-
-Before using Damiao motors, you need to set up the CAN interface on your Linux system.
-
-### Install CAN Utilities
-
-```bash
-sudo apt-get install can-utils
-```
-
-### Configure CAN Interface (Manual)
-
-For standard CAN FD (recommended for OpenArms):
-
-```bash
-sudo ip link set can0 down
-sudo ip link set can0 type can bitrate 1000000 dbitrate 5000000 fd on
-sudo ip link set can0 up
-```
-
-For standard CAN (without FD):
-
-```bash
-sudo ip link set can0 down
-sudo ip link set can0 type can bitrate 1000000
-sudo ip link set can0 up
-```
-
-### Configure CAN Interface (Using LeRobot)
-
-LeRobot provides a utility script to setup and test CAN interfaces:
-
-```bash
-# Setup multiple interfaces (e.g., OpenArms Followers with 2 CAN buses)
-lerobot-setup-can --mode=setup --interfaces=can0,can1
-```
-
-## Debugging CAN Communication
-
-Use the built-in debug tools to test motor communication:
-
-```bash
-# Test motors on all interfaces
-lerobot-setup-can --mode=test --interfaces=can0,can1
-
-# Run speed/latency test
-lerobot-setup-can --mode=speed --interfaces=can0
-```
-
-The test mode will scan for motors (IDs 0x01-0x08) and report which ones respond. Example output:
-
-```
-can0: UP (CAN FD)
-  Motor 0x01 (joint_1): ✓ FOUND
-    → Response 0x11 [FD]: 00112233...
-  Motor 0x02 (joint_2): ✓ FOUND
-  Motor 0x03 (joint_3): ✗ No response
-  ...
-  Summary: 2/8 motors found
-```
-
-## Usage
-
-### Basic Setup
-
-```python
-from lerobot.motors import Motor
-from lerobot.motors.damiao import DamiaoMotorsBus
-
-# Define your motors with send/receive CAN IDs
-motors = {
-    "joint_1": Motor(id=0x01, motor_type_str="dm8009", recv_id=0x11),
-    "joint_2": Motor(id=0x02, motor_type_str="dm4340", recv_id=0x12),
-    "joint_3": Motor(id=0x03, motor_type_str="dm4310", recv_id=0x13),
-}
-
-# Create the bus
-bus = DamiaoMotorsBus(
-    port="can0",  # Linux socketcan interface
-    motors=motors,
-)
-
-# Connect
-bus.connect()
-```
-
-### Reading Motor States
-
-```python
-# Read single motor position (degrees)
-position = bus.read("Present_Position", "joint_1")
-
-# Read from multiple motors
-positions = bus.sync_read("Present_Position")  # All motors
-positions = bus.sync_read("Present_Position", ["joint_1", "joint_2"])
-
-# Read all states at once (position, velocity, torque)
-states = bus.sync_read_all_states()
-# Returns: {'joint_1': {'position': 45.2, 'velocity': 1.3, 'torque': 0.5}, ...}
-```
-
-### Writing Motor Commands
-
-```python
-# Enable torque
-bus.enable_torque()
-
-# Set goal position (degrees)
-bus.write("Goal_Position", "joint_1", 45.0)
-
-# Set positions for multiple motors
-bus.sync_write("Goal_Position", {
-    "joint_1": 45.0,
-    "joint_2": -30.0,
-    "joint_3": 90.0,
-})
-
-# Disable torque
-bus.disable_torque()
-```
-
-## Configuration Options
-
-| Parameter      | Default   | Description                                                 |
-| -------------- | --------- | ----------------------------------------------------------- |
-| `port`         | -         | CAN interface (`can0`) or serial port (`/dev/cu.usbmodem*`) |
-| `use_can_fd`   | `True`    | Enable CAN FD for higher data rates                         |
-| `bitrate`      | `1000000` | Nominal bitrate (1 Mbps)                                    |
-| `data_bitrate` | `5000000` | CAN FD data bitrate (5 Mbps)                                |
-
-## Motor Configuration
-
-Each motor requires:
-
- `id`: CAN ID for sending commands
- `motor_type`: One of the supported motor types (e.g., `"dm8009"`, `"dm4340"`)
- `recv_id`: CAN ID for receiving responses
-
-OpenArms default IDs follow the pattern: send ID `0x0N`, receive ID `0x1N` where N is the joint number.
-
-## Troubleshooting
-
-### No Response from Motors
-
-1. **Check power**
-2. **Verify CAN wiring**: Check CAN-H, CAN-L, and GND connections
-3. **Check motor IDs**: Use Damiao Debugging Tools to verify/configure IDs
-4. **Test CAN interface**: Run `candump can0` to see if messages are being received
-5. **Run diagnostics**: `lerobot-setup-can --mode=test --interfaces=can0`
-
-### Motor Timeout Parameter
-
-If motors were configured with timeout=0, they won't respond to commands. Use Damiao Debugging Tools to set a non-zero timeout value.
-
-### Verify CAN FD Status
-
-```bash
-ip -d link show can0 | grep fd
-```
@@ -1,278 +0,0 @@
-# Using Subtasks in LeRobot Datasets
-
-Subtask support in robotics datasets has proven effective in improving robot reasoning and understanding. Subtasks are particularly useful for:
-
- **Hierarchical policies**: Building policies that include subtask predictions to visualize robot reasoning in real time
- **Reward modeling**: Helping reward models understand task progression (e.g., SARM-style stage-aware reward models)
- **Task decomposition**: Breaking down complex manipulation tasks into atomic, interpretable steps
-
-LeRobotDataset now supports subtasks as part of its dataset structure, alongside tasks.
-
-## What are Subtasks?
-
-While a **task** describes the overall goal (e.g., "Pick up the apple and place it in the basket"), **subtasks** break down the execution into finer-grained steps:
-
-1. "Approach the apple"
-2. "Grasp the apple"
-3. "Lift the apple"
-4. "Move to basket"
-5. "Release the apple"
-
-Each frame in the dataset can be annotated with its corresponding subtask, enabling models to learn and predict these intermediate stages.
-
-<img
-  src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/subtask-asset.png"
-  alt="An overview of subtask annotation showing how frames are labeled with intermediate subtask stages"
-  width="80%"
-/>
-
-<p>
-  <em>Figure: Overview of subtask annotation.</em>
-</p>
-
-**Reference:** _Subtask-learning based for robot self-assembly in flexible collaborative assembly in manufacturing_, Original Article, Published: 19 April 2022.
-
-## Dataset Structure
-
-Subtask information is stored in the dataset metadata:
-
-```
-my-dataset/
-├── data/
-│   └── ...
-├── meta/
-│   ├── info.json
-│   ├── stats.json
-│   ├── tasks.parquet
-│   ├── subtasks.parquet      # Subtask index → subtask string mapping
-│   └── episodes/
-│       └── ...
-└── videos/
-    └── ...
-```
-
-### Subtasks Parquet File
-
-The `meta/subtasks.parquet` file maps subtask indices to their natural language descriptions:
-
-| subtask_index | subtask (index column) |
-| ------------- | ---------------------- |
-| 0             | "Approach the apple"   |
-| 1             | "Grasp the apple"      |
-| 2             | "Lift the apple"       |
-| ...           | ...                    |
-
-### Frame-Level Annotations
-
-Each frame in the dataset can include a `subtask_index` field that references the subtasks parquet file:
-
-```python
-# Example frame data in the parquet file
-{
-    "index": 42,
-    "timestamp": 1.4,
-    "episode_index": 0,
-    "task_index": 0,
-    "subtask_index": 2,  # References "Lift the apple"
-    "observation.state": [...],
-    "action": [...],
-}
-```
-
-## Annotating Datasets with Subtasks
-
-We provide a HuggingFace Space for easily annotating any LeRobotDataset with subtasks:
-
-**[https://huggingface.co/spaces/lerobot/annotate](https://huggingface.co/spaces/lerobot/annotate)**
-
-After completing your annotation:
-
-1. Click "Push to Hub" to upload your annotated dataset
-2. You can also run the annotation space locally by following the instructions at [github.com/huggingface/lerobot-annotate](https://github.com/huggingface/lerobot-annotate)
-
-## Loading Datasets with Subtasks
-
-When you load a dataset with subtask annotations, the subtask information is automatically available:
-
-```python
-from lerobot.datasets.lerobot_dataset import LeRobotDataset
-
-# Load a dataset with subtask annotations
-dataset = LeRobotDataset("jadechoghari/collect-fruit-annotated")
-
-# Access a sample
-sample = dataset[100]
-
-# The sample includes both task and subtask information
-print(sample["task"])        # "Collect the fruit"
-print(sample["subtask"])     # "Grasp the apple"
-print(sample["task_index"])  # tensor(0)
-print(sample["subtask_index"])  # tensor(2)
-```
-
-### Checking for Subtask Support
-
-You can check if a dataset has subtask annotations:
-
-```python
-# Check if subtasks are available
-has_subtasks = (
-    "subtask_index" in dataset.features
-    and dataset.meta.subtasks is not None
-)
-
-if has_subtasks:
-    print(f"Dataset has {len(dataset.meta.subtasks)} unique subtasks")
-    print("Subtasks:", list(dataset.meta.subtasks.index))
-```
-
-## Using Subtasks for Training
-
-### With the Tokenizer Processor
-
-The `TokenizerProcessor` automatically handles subtask tokenization for Vision-Language Action (VLA) models:
-
-```python
-from lerobot.processor.tokenizer_processor import TokenizerProcessor
-from lerobot.processor.pipeline import ProcessorPipeline
-
-# Create a tokenizer processor
-tokenizer_processor = TokenizerProcessor(
-    tokenizer_name_or_path="google/paligemma-3b-pt-224",
-    padding="max_length",
-    max_length=64,
-)
-
-# The processor will automatically tokenize subtasks if present in the batch
-# and add them to the observation under:
-# - "observation.subtask.tokens"
-# - "observation.subtask.attention_mask"
-```
-
-When subtasks are available in the batch, the tokenizer processor adds:
-
- `observation.subtask.tokens`: Tokenized subtask text
- `observation.subtask.attention_mask`: Attention mask for the subtask tokens
-
-### DataLoader with Subtasks
-
-```python
-import torch
-from lerobot.datasets.lerobot_dataset import LeRobotDataset
-
-dataset = LeRobotDataset("jadechoghari/collect-fruit-annotated")
-
-dataloader = torch.utils.data.DataLoader(
-    dataset,
-    batch_size=16,
-    shuffle=True,
-)
-
-for batch in dataloader:
-    # Access subtask information in the batch
-    subtasks = batch["subtask"]  # List of subtask strings
-    subtask_indices = batch["subtask_index"]  # Tensor of subtask indices
-
-    # Use for training hierarchical policies or reward models
-    print(f"Batch subtasks: {set(subtasks)}")
-```
-
-## Example Datasets with Subtask Annotations
-
-Try loading a dataset with subtask annotations:
-
-```python
-from lerobot.datasets.lerobot_dataset import LeRobotDataset
-
-# Example dataset with subtask annotations
-dataset = LeRobotDataset("jadechoghari/collect-fruit-annotated")
-
-# Explore the subtasks
-print("Available subtasks:")
-for subtask_name in dataset.meta.subtasks.index:
-    print(f"  - {subtask_name}")
-
-# Get subtask distribution
-subtask_counts = {}
-for i in range(len(dataset)):
-    sample = dataset[i]
-    subtask = sample["subtask"]
-    subtask_counts[subtask] = subtask_counts.get(subtask, 0) + 1
-
-print("\nSubtask distribution:")
-for subtask, count in sorted(subtask_counts.items(), key=lambda x: -x[1]):
-    print(f"  {subtask}: {count} frames")
-```
-
-## Use Cases
-
-### 1. Hierarchical Policy Training
-
-Train policies that predict both actions and current subtask:
-
-```python
-class HierarchicalPolicy(nn.Module):
-    def __init__(self, num_subtasks):
-        super().__init__()
-        self.action_head = nn.Linear(hidden_dim, action_dim)
-        self.subtask_head = nn.Linear(hidden_dim, num_subtasks)
-
-    def forward(self, observations):
-        features = self.encoder(observations)
-        actions = self.action_head(features)
-        subtask_logits = self.subtask_head(features)
-        return actions, subtask_logits
-```
-
-### 2. Stage-Aware Reward Modeling (SARM)
-
-Build reward models that understand task progression:
-
-```python
-# SARM predicts:
-# - Stage: Which subtask is being executed (discrete)
-# - Progress: How far along the subtask (continuous 0-1)
-
-class SARMRewardModel(nn.Module):
-    def forward(self, observations):
-        features = self.encoder(observations)
-        stage_logits = self.stage_classifier(features)
-        progress = self.progress_regressor(features)
-        return stage_logits, progress
-```
-
-### 3. Progress Visualization
-
-Monitor robot execution by tracking subtask progression:
-
-```python
-def visualize_execution(model, observations):
-    for t, obs in enumerate(observations):
-        action, subtask_logits = model(obs)
-        predicted_subtask = subtask_names[subtask_logits.argmax()]
-        print(f"t={t}: Executing '{predicted_subtask}'")
-```
-
-## API Reference
-
-### LeRobotDataset Properties
-
-| Property                    | Type                   | Description                                |
-| --------------------------- | ---------------------- | ------------------------------------------ |
-| `meta.subtasks`             | `pd.DataFrame \| None` | DataFrame mapping subtask names to indices |
-| `features["subtask_index"]` | `dict`                 | Feature spec for subtask_index if present  |
-
-### Sample Keys
-
-When subtasks are available, each sample includes:
-
-| Key             | Type           | Description                          |
-| --------------- | -------------- | ------------------------------------ |
-| `subtask_index` | `torch.Tensor` | Integer index of the current subtask |
-| `subtask`       | `str`          | Natural language subtask description |
-
-## Related Resources
-
- [SARM Paper](https://arxiv.org/pdf/2509.25358) - Stage-Aware Reward Modeling for Long Horizon Robot Manipulation
- [LeRobot Annotate Space](https://huggingface.co/spaces/lerobot/annotate) - Interactive annotation tool
- [LeRobotDataset v3.0](./lerobot-dataset-v3) - Dataset format documentation
@@ -1,11 +1,5 @@
 # EarthRover Mini Plus

-<img
-  src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/Earth_Rover_Mini_5_240c9adc-4f9e-44b7-982f-5d1dc24af1d8.png.webp"
-  alt="EarthRover Mini Plus"
-  width="70%"
-/>
-
 The EarthRover Mini Plus is a fully open source mobile robot that connects through the cloud using the Frodobots SDK. This lets you control the robot and record datasets for training AI models.

 ## What You Need
@@ -13,47 +7,28 @@ The EarthRover Mini Plus is a fully open source mobile robot that connects throu
 ### Hardware

 - EarthRover Mini robot
- Computer with Python 3.12 or newer
+- Computer with Python 3.10 or newer
 - Internet connection

 ### Setting Up the Frodobots SDK

-The robot needs the [Frodobots SDK](https://github.com/frodobots-org/earth-rovers-sdk) running on your computer. Here's how:
+The robot needs the [Frodobots SDK](https://github.com/Frodobots/earth-rovers-sdk) running on your computer. Here's how:

 1. Download and install the SDK:

 ```bash
-git clone https://github.com/frodobots-org/earth-rovers-sdk.git
+git clone https://github.com/Frodobots/earth-rovers-sdk.git
 cd earth-rovers-sdk
 pip install -r requirements.txt
 ```

-2. Save Credentials:
-
-Write your .env variables with the SDK API key and bot name provided by the Frodobots team.
-
-```bash
-SDK_API_TOKEN=your_sdk_api_token_here
-BOT_SLUG=your_bot_slug_here
-CHROME_EXECUTABLE_PATH=/path/to/chrome_or_chromium
-# Default value is MAP_ZOOM_LEVEL=18 https://wiki.openstreetmap.org/wiki/Zoom_levels
-MAP_ZOOM_LEVEL=18
-MISSION_SLUG=your_mission_slug_here
-# Image quality between 0.1 and 1.0 (default: 0.8)
-# Recommended: 0.8 for better performance
-IMAGE_QUALITY=0.8
-# Image format: jpeg, png or webp (default: png)
-# Recommended: jpeg for better performance and lower bandwidth usage
-IMAGE_FORMAT=jpeg
-```
-
-3. Start the SDK:
+2. Start the SDK:

 ```bash
 hypercorn main:app --reload
 ```

-4. Open your web browser and go to `http://localhost:8000`, then click "Join"
+3. Open your web browser and go to `http://localhost:8000`, then click "Join"

 The SDK gives you:

@@ -170,13 +145,13 @@ Once you can drive the robot well, you can start recording data to train AI mode
 We use Hugging Face to store your data online. First, log in with your token from [Hugging Face settings](https://huggingface.co/settings/tokens):

 ```bash
-hf auth login --token ${HUGGINGFACE_TOKEN} --add-to-git-credential
+huggingface-cli login --token ${HUGGINGFACE_TOKEN} --add-to-git-credential
 ```

 Store your Hugging Face username:

 ```bash
-HF_USER=$(hf auth whoami | awk -F': *' 'NR==1 {print $2}')
+HF_USER=$(huggingface-cli whoami | head -n 1)
 echo $HF_USER
 ```

@@ -185,16 +160,13 @@ echo $HF_USER
 Use the standard recording command:

 ```bash
-lerobot-record \
+python src/lerobot/scripts/lerobot_record.py \
    --robot.type=earthrover_mini_plus \
    --teleop.type=keyboard_rover \
    --dataset.repo_id=your_username/dataset_name \
    --dataset.num_episodes=2 \
    --dataset.fps=10 \
    --dataset.single_task="Navigate around obstacles" \
-    --dataset.streaming_encoding=true \
-    --dataset.encoder_threads=2 \
-    # --dataset.vcodec=auto \
    --display_data=true
 ```

@@ -204,26 +176,22 @@ Replace `your_username/dataset_name` with your Hugging Face username and a name

 Your dataset includes:

-**Your Actions (2 features)**:
+**Your Actions (2 things)**:

- `linear_velocity`: How much you moved forward/backward
- `angular_velocity`: How much you turned left/right
+- How much you moved forward/backward
+- How much you turned left/right

-**Robot Observations (24 features)**:
+**Robot Observations (12 things)**:

 - Front camera video
 - Rear camera video
 - Current speed
 - Battery level
- Orientation
- GPS (latitude, longitude, signal strength)
+- Which way the robot is facing
+- GPS location (latitude, longitude, signal strength)
 - Network signal strength
 - Vibration level
- Lamp state (on/off)
- Accelerometer (x, y, z)
- Gyroscope (x, y, z)
- Magnetometer (x, y, z)
- Wheel RPMs (4 wheels)
+- Lamp status (on/off)

 ### Where Your Data Goes

@@ -2,32 +2,14 @@

 The **EnvHub** feature allows you to load simulation environments directly from the Hugging Face Hub with a single line of code. This unlocks a powerful new model for collaboration: instead of environments being locked away inside monolithic libraries, anyone can publish custom environments and share them with the community.

-## What is EnvHub?
+## Overview

-EnvHub lets you create custom robotics simulation environments with your own robot models and scenarios, and make them easily usable by anyone through the LeRobot framework.
+With EnvHub, you can:

-EnvHub packages are stored on the Hugging Face Hub, and can be seamlessly pulled and used in your AI robotics projects through LeRobot with a single line of code.
-
-Thanks to EnvHub, you can:
-
-1. **Create and publish environments** to the Hugging Face Hub as Git repositories, and distribute complex physics simulations without packaging hassles
-2. **Load environments** dynamically, without installing them as packages
-3. **Version and track** environment changes using Git semantics
-4. **Discover** new simulation tasks shared by the community
-
-This design means you can go from discovering an interesting environment on the Hub to running experiments in seconds, or create your own custom robot and environment without worrying about dependency conflicts or complex installation procedures.
-
-When you create an EnvHub package, you can build anything you want inside it and use any simulation tool you like: this is your own space to play with. The only requirement is that the package contains an `env.py` file that defines the environment and allows LeRobot to load and use your EnvHub package.
-
-This `env.py` file needs to expose a small API so LeRobot can load and run it. In particular, you must provide a `make_env(n_envs: int = 1, use_async_envs: bool = False)` or `make_env(n_envs: int = 1, use_async_envs: bool = False, cfg: EnvConfig)` function, which is the main entry point for LeRobot. It should return one of:
-
- A `gym.vector.VectorEnv` (most common)
- A single `gym.Env` (will be automatically wrapped)
- A dict mapping `{suite_name: {task_id: VectorEnv}}` (for multi-task benchmarks)
-
-You can also pass an `EnvConfig` object to `make_env` to configure the environment (e.g. the number of environments, task, camera name, initial states, control mode, episode length, etc.).
-
-Finally, your environment must implement the standard `gym.vector.VectorEnv` interface so it works with LeRobot, including methods like `reset` and `step`.
+- Load environments from the Hub instantly
+- Share your custom simulation tasks with the community
+- Version control your environments using Git
+- Distribute complex physics simulations without packaging hassles

 ## Quick Start

@@ -47,6 +29,17 @@ env = make_env("lerobot/cartpole-env", trust_remote_code=True)
  hash for reproducibility and security.
 </Tip>

+## What is EnvHub?
+
+EnvHub is a framework that allows researchers and developers to:
+
+1. **Publish environments** to the Hugging Face Hub as Git repositories
+2. **Load environments** dynamically without installing them as packages
+3. **Version and track** environment changes using Git semantics
+4. **Discover** new simulation tasks shared by the community
+
+This design means you can go from discovering an interesting environment on the Hub to running experiments in seconds, without worrying about dependency conflicts or complex installation procedures.
+
 ## Repository Structure

 To make your environment loadable from the Hub, your repository must contain at minimum:
@@ -155,10 +148,10 @@ Upload your repository to Hugging Face:
 pip install huggingface_hub

 # Login to Hugging Face
-hf auth login
+huggingface-cli login

 # Create a new repository
-hf repo create my-org/my-custom-env
+huggingface-cli repo create my-custom-env --type space --org my-org

 # Initialize git and push
 git init
@@ -1,510 +0,0 @@
-# NVIDIA IsaacLab Arena & LeRobot
-
-LeRobot EnvHub now supports **GPU-accelerated simulation** with IsaacLab Arena for policy evaluation at scale.
-Train and evaluate imitation learning policies with high-fidelity simulation — all integrated into the LeRobot ecosystem.
-
-<img
-  src="https://huggingface.co/nvidia/isaaclab-arena-envs/resolve/main/assets/Gr1OpenMicrowaveEnvironment.png"
-  alt="IsaacLab Arena - GR1 Microwave Environment"
-  style={{ maxWidth: "100%", borderRadius: "8px", marginBottom: "1rem" }}
-/>
-
-[IsaacLab Arena](https://github.com/isaac-sim/IsaacLab-Arena) integrates with NVIDIA IsaacLab to provide:
-
- 🤖 **Humanoid embodiments**: GR1, G1, Galileo with various configurations
- 🎯 **Manipulation & loco-manipulation tasks**: Door opening, pick-and-place, button pressing, and more
- ⚡ **GPU-accelerated rollouts**: Parallel environment execution on NVIDIA GPUs
- 🖼️ **RTX Rendering**: Evaluate vision-based policies with realistic rendering, reflections and refractions
- 📦 **LeRobot-compatible datasets**: Ready for training with GR00T N1x, PI0, SmolVLA, ACT, and Diffusion policies
- 🔄 **EnvHub integration**: Load environments from HuggingFace EnvHub with one line
-
-## Installation
-
-### Prerequisites
-
-Hardware requirements are shared with Isaac Sim, and are detailed in [Isaac Sim Requirements](https://docs.isaacsim.omniverse.nvidia.com/5.1.0/installation/requirements.html).
-
- NVIDIA GPU with CUDA support
- NVIDIA driver compatible with IsaacSim 5.1.0
- Linux (Ubuntu 22.04 / 24.04)
-
-### Setup
-
-```bash
-# 1. Create conda environment
-conda create -y -n lerobot-arena python=3.11
-conda activate lerobot-arena
-conda install -y -c conda-forge ffmpeg=7.1.1
-
-# 2. Install Isaac Sim 5.1.0
-pip install "isaacsim[all,extscache]==5.1.0" --extra-index-url https://pypi.nvidia.com
-
-# Accept NVIDIA EULA (required)
-export ACCEPT_EULA=Y
-export PRIVACY_CONSENT=Y
-
-# 3. Install IsaacLab 2.3.0
-git clone https://github.com/isaac-sim/IsaacLab.git
-cd IsaacLab
-git checkout v2.3.0
-./isaaclab.sh -i
-cd ..
-
-# 4. Install IsaacLab Arena
-git clone https://github.com/isaac-sim/IsaacLab-Arena.git
-cd IsaacLab-Arena
-git checkout release/0.1.1
-pip install -e .
-cd ..
-
-
-# 5. Install LeRobot
-git clone https://github.com/huggingface/lerobot.git
-cd lerobot
-pip install -e .
-cd ..
-
-
-# 6. Install additional dependencies
-pip install onnxruntime==1.23.2 lightwheel-sdk==1.0.1 vuer[all]==0.0.70 qpsolvers==4.8.1
-pip install numpy==1.26.0 # Isaac Sim 5.1 depends on numpy==1.26.0, this will be fixed in next release
-```
-
-## Evaluating Policies
-
-### Pre-trained Policies
-
-The following trained policies are available:
-
-| Policy                      | Architecture | Task          | Link                                                                     |
-| :-------------------------- | :----------- | :------------ | :----------------------------------------------------------------------- |
-| pi05-arena-gr1-microwave    | PI0.5        | GR1 Microwave | [HuggingFace](https://huggingface.co/nvidia/pi05-arena-gr1-microwave)    |
-| smolvla-arena-gr1-microwave | SmolVLA      | GR1 Microwave | [HuggingFace](https://huggingface.co/nvidia/smolvla-arena-gr1-microwave) |
-
-### Evaluate SmolVLA
-
-```bash
-pip install -e ".[smolvla]"
-pip install numpy==1.26.0 # revert numpy to version 1.26
-```
-
-```bash
-lerobot-eval \
-    --policy.path=nvidia/smolvla-arena-gr1-microwave \
-    --env.type=isaaclab_arena \
-    --env.hub_path=nvidia/isaaclab-arena-envs \
-    --rename_map='{"observation.images.robot_pov_cam_rgb": "observation.images.robot_pov_cam"}' \
-    --policy.device=cuda \
-    --env.environment=gr1_microwave \
-    --env.embodiment=gr1_pink \
-    --env.object=mustard_bottle \
-    --env.headless=false \
-    --env.enable_cameras=true \
-    --env.video=true \
-    --env.video_length=10 \
-    --env.video_interval=15 \
-    --env.state_keys=robot_joint_pos \
-    --env.camera_keys=robot_pov_cam_rgb \
-    --trust_remote_code=True \
-    --eval.batch_size=1
-```
-
-### Evaluate PI0.5
-
-```bash
-pip install -e ".[pi]"
-pip install numpy==1.26.0 # revert numpy to version 1.26
-```
-
-<Tip>PI0.5 requires disabling torch compile for evaluation:</Tip>
-
-```bash
-TORCH_COMPILE_DISABLE=1 TORCHINDUCTOR_DISABLE=1 lerobot-eval \
-    --policy.path=nvidia/pi05-arena-gr1-microwave \
-    --env.type=isaaclab_arena \
-    --env.hub_path=nvidia/isaaclab-arena-envs \
-    --rename_map='{"observation.images.robot_pov_cam_rgb": "observation.images.robot_pov_cam"}' \
-    --policy.device=cuda \
-    --env.environment=gr1_microwave \
-    --env.embodiment=gr1_pink \
-    --env.object=mustard_bottle \
-    --env.headless=false \
-    --env.enable_cameras=true \
-    --env.video=true \
-    --env.video_length=15 \
-    --env.video_interval=15 \
-    --env.state_keys=robot_joint_pos \
-    --env.camera_keys=robot_pov_cam_rgb \
-    --trust_remote_code=True \
-    --eval.batch_size=1
-```
-
-<Tip>
-  To change the number of parallel environments, use the ```--eval.batch_size```
-  flag.
-</Tip>
-
-### What to Expect
-
-During evaluation, you will see a progress bar showing the running success rate:
-
-```
-Stepping through eval batches:   8%|██████▍    | 4/50 [00:45<08:06, 10.58s/it, running_success_rate=25.0%]
-```
-
-### Video Recording
-
-To enable video recording during evaluation, add the following flags to your command:
-
-```bash
--env.video=true \
--env.video_length=15 \
--env.video_interval=15
-```
-
-For more details on video recording, see the [IsaacLab Recording Documentation](https://isaac-sim.github.io/IsaacLab/main/source/how-to/record_video.html).
-
-<Tip>
-When running headless with `--env.headless=true`, you must also enable cameras explicitly for camera enabled environments:
-
-```bash
--env.headless=true --env.enable_cameras=true
-```
-
-</Tip>
-
-### Output Directory
-
-Evaluation videos are saved to the output directory with the following structure:
-
-```
-outputs/eval/<date>/<timestamp>_<env>_<policy>/videos/<task>_<env_id>/eval_episode_<n>.mp4
-```
-
-For example:
-
-```
-outputs/eval/2026-01-02/14-38-01_isaaclab_arena_smolvla/videos/gr1_microwave_0/eval_episode_0.mp4
-```
-
-## Training Policies
-
-To learn more about training policies with LeRobot, please refer to the training documentation:
-
- [SmolVLA](./smolvla)
- [Pi0.5](./pi05)
- [GR00T N1.5](./groot)
-
-Sample IsaacLab Arena datasets are available on HuggingFace Hub for experimentation:
-
-| Dataset                                                                                                   | Description                | Frames |
-| :-------------------------------------------------------------------------------------------------------- | :------------------------- | :----- |
-| [Arena-GR1-Manipulation-Task](https://huggingface.co/datasets/nvidia/Arena-GR1-Manipulation-Task-v3)      | GR1 microwave manipulation | ~4K    |
-| [Arena-G1-Loco-Manipulation-Task](https://huggingface.co/datasets/nvidia/Arena-G1-Loco-Manipulation-Task) | G1 loco-manipulation       | ~4K    |
-
-## Environment Configuration
-
-### Full Configuration Options
-
-```python
-from lerobot.envs.configs import IsaaclabArenaEnv
-
-config = IsaaclabArenaEnv(
-    # Environment selection
-    environment="gr1_microwave",      # Task environment
-    embodiment="gr1_pink",            # Robot embodiment
-    object="power_drill",             # Object to manipulate
-
-    # Simulation settings
-    episode_length=300,               # Max steps per episode
-    headless=True,                    # Run without GUI
-    device="cuda:0",                  # GPU device
-    seed=42,                          # Random seed
-
-    # Observation configuration
-    state_keys="robot_joint_pos",     # State observation keys (comma-separated)
-    camera_keys="robot_pov_cam_rgb",  # Camera observation keys (comma-separated)
-    state_dim=54,                     # Expected state dimension
-    action_dim=36,                    # Expected action dimension
-    camera_height=512,                # Camera image height
-    camera_width=512,                 # Camera image width
-    enable_cameras=True,              # Enable camera observations
-
-    # Video recording
-    video=False,                      # Enable video recording
-    video_length=100,                 # Frames per video
-    video_interval=200,               # Steps between recordings
-
-    # Advanced
-    mimic=False,                      # Enable mimic mode
-    teleop_device=None,               # Teleoperation device
-    disable_fabric=False,             # Disable fabric optimization
-    enable_pinocchio=True,            # Enable Pinocchio for IK
-)
-```
-
-### Using Environment Hub directly for advanced usage
-
-Create a file called `test_env_load_arena.py` or [download from the EnvHub](https://huggingface.co/nvidia/isaaclab-arena-envs/blob/main/tests/test_env_load_arena.py):
-
-```python
-import logging
-from dataclasses import asdict
-from pprint import pformat
-import torch
-import tqdm
-from lerobot.configs import parser
-from lerobot.configs.eval import EvalPipelineConfig
-
-
-@parser.wrap()
-def main(cfg: EvalPipelineConfig):
-    """Run random action rollout for IsaacLab Arena environment."""
-    logging.info(pformat(asdict(cfg)))
-
-    from lerobot.envs.factory import make_env
-
-    env_dict = make_env(
-        cfg.env,
-        n_envs=cfg.env.num_envs,
-        trust_remote_code=True,
-    )
-    env = next(iter(env_dict.values()))[0]
-    env.reset()
-    for _ in tqdm.tqdm(range(cfg.env.episode_length)):
-        with torch.inference_mode():
-            actions = env.action_space.sample()
-            obs, rewards, terminated, truncated, info = env.step(actions)
-            if terminated.any() or truncated.any():
-                obs, info = env.reset()
-    env.close()
-
-
-if __name__ == "__main__":
-    main()
-```
-
-Run with:
-
-```bash
-python test_env_load_arena.py \
-    --env.environment=g1_locomanip_pnp \
-    --env.embodiment=gr1_pink \
-    --env.object=cracker_box \
-    --env.num_envs=4 \
-    --env.enable_cameras=true \
-    --env.seed=1000 \
-    --env.video=true \
-    --env.video_length=10 \
-    --env.video_interval=15 \
-    --env.headless=false \
-    --env.hub_path=nvidia/isaaclab-arena-envs \
-    --env.type=isaaclab_arena
-```
-
-## Creating New Environments
-
-First create a new IsaacLab Arena environment by following the [IsaacLab Arena Documentation](https://isaac-sim.github.io/IsaacLab-Arena/release/0.1.1/index.html).
-
-Clone our EnvHub repo:
-
-```bash
-git clone https://huggingface.co/nvidia/isaaclab-arena-envs
-```
-
-Modify the `example_envs.yaml` file based on your new environment.
-[Upload](./envhub#step-3-upload-to-the-hub) your modified repo to HuggingFace EnvHub.
-
-<Tip>
-  Your IsaacLab Arena environment code must be locally available during
-  evaluation. Users can clone your environment repository separately, or you can
-  bundle the environment code and assets directly in your EnvHub repo.
-</Tip>
-
-Then, when evaluating, use your new environment:
-
-```bash
-lerobot-eval \
-    --env.hub_path=<your-env-hub-path>/isaaclab-arena-envs \
-    --env.environment=<your new environment> \
-    ...other flags...
-```
-
-We look forward to your contributions!
-
-## Troubleshooting
-
-### CUDA out of memory
-
-Reduce `batch_size` or use a GPU with more VRAM:
-
-```bash
--eval.batch_size=1
-```
-
-### EULA not accepted
-
-Set environment variables before running:
-
-```bash
-export ACCEPT_EULA=Y
-export PRIVACY_CONSENT=Y
-```
-
-### Video recording not working
-
-Enable cameras when running headless:
-
-```bash
--env.video=true --env.enable_cameras=true --env.headless=true
-```
-
-### Policy output dimension mismatch
-
-Ensure `action_dim` matches your policy:
-
-```bash
--env.action_dim=36
-```
-
-### libGLU.so.1 Errors during Isaac Sim initialization
-
-Ensure you have the following dependencies installed, this is likely to happen on headless machines.
-
-```bash
-sudo apt update && sudo apt install -y libglu1-mesa libxt6
-```
-
-## See Also
-
- [EnvHub Documentation](./envhub.mdx) - General EnvHub usage
- [IsaacLab Arena GitHub](https://github.com/isaac-sim/IsaacLab-Arena)
- [IsaacLab Documentation](https://isaac-sim.github.io/IsaacLab/)
-
-## Lightwheel LW-BenchHub
-
-[Lightwheel](https://www.lightwheel.ai) is bringing `Lightwheel-Libero-Tasks` and `Lightwheel-RoboCasa-Tasks` with 268 tasks to the LeRobot ecosystem.
-LW-BenchHub collects and generates large-scale datasets via teleoperation that comply with the LeRobot specification, enabling out-of-the-box training and evaluation workflows.
-With the unified interface provided by EnvHub, developers can quickly build end-to-end experimental pipelines.
-
-### Install
-
-Assuming you followed the [Installation](#installation) steps, you can install LW-BenchHub with:
-
-```bash
-conda install pinocchio -c conda-forge -y
-pip install numpy==1.26.0 # revert numpy to version 1.26
-
-sudo apt-get install git-lfs && git lfs install
-
-git clone https://github.com/LightwheelAI/lw_benchhub
-git lfs pull # Ensure LFS files (e.g., .usd assets) are downloaded
-
-cd lw_benchhub
-pip install -e .
-```
-
-For more detailed instructions, please refer to the [LW-BenchHub Documentation](https://docs.lightwheel.net/lw_benchhub/usage/Installation).
-
-### Lightwheel Tasks Dataset
-
-LW-BenchHub datasets are available on HuggingFace Hub:
-
-| Dataset                                                                                                       | Description             | Tasks | Frames |
-| :------------------------------------------------------------------------------------------------------------ | :---------------------- | :---- | :----- |
-| [Lightwheel-Tasks-X7S](https://huggingface.co/datasets/LightwheelAI/Lightwheel-Tasks-X7S)                     | X7S LIBERO and RoboCasa | 117   | ~10.3M |
-| [Lightwheel-Tasks-Double-Piper](https://huggingface.co/datasets/LightwheelAI/Lightwheel-Tasks-Double-Piper)   | Double-Piper LIBERO     | 130   | ~6.0M  |
-| [Lightwheel-Tasks-G1-Controller](https://huggingface.co/datasets/LightwheelAI/Lightwheel-Tasks-G1-Controller) | G1-Controller LIBERO    | 62    | ~2.7M  |
-| [Lightwheel-Tasks-G1-WBC](https://huggingface.co/datasets/LightwheelAI/Lightwheel-Tasks-G1-WBC)               | G1-WBC RoboCasa         | 32    | ~1.5M  |
-
-For training policies, refer to the [Training Policies](#training-policies) section.
-
-### Evaluating Policies
-
-#### Pre-trained Policies
-
-The following trained policies are available:
-
-| Policy                   | Architecture | Task                           | Layout     | Robot           | Link                                                                                  |
-| :----------------------- | :----------- | :----------------------------- | :--------- | :-------------- | :------------------------------------------------------------------------------------ |
-| smolvla-double-piper-pnp | SmolVLA      | L90K1PutTheBlackBowlOnThePlate | libero-1-1 | DoublePiper-Abs | [HuggingFace](https://huggingface.co/LightwheelAI/smolvla-double-piper-pnp/tree/main) |
-
-#### Evaluate SmolVLA
-
-```bash
-lerobot-eval \
-  --policy.path=LightwheelAI/smolvla-double-piper-pnp \
-  --env.type=isaaclab_arena \
-  --rename_map='{"observation.images.left_hand_camera_rgb": "observation.images.left_hand", "observation.images.right_hand_camera_rgb": "observation.images.right_hand", "observation.images.first_person_camera_rgb": "observation.images.first_person"}' \
-  --env.hub_path=LightwheelAI/lw_benchhub_env \
-  --env.kwargs='{"config_path": "configs/envhub/example.yml"}' \
-  --trust_remote_code=true \
-  --env.state_keys=joint_pos \
-  --env.action_dim=12 \
-  --env.camera_keys=left_hand_camera_rgb,right_hand_camera_rgb,first_person_camera_rgb \
-  --policy.device=cuda \
-  --eval.batch_size=10 \
-  --eval.n_episodes=100
-```
-
-### Environment Configuration
-
-Evaluation can be quickly launched by modifying the `robot`, `task`, and `layout` settings in the configuration file.
-
-#### Full Configuration Options
-
-```yml
-# =========================
-# Basic Settings
-# =========================
-disable_fabric: false
-device: cuda:0
-sensitivity: 1.0
-step_hz: 50
-enable_cameras: true
-execute_mode: eval
-episode_length_s: 20.0 # Episode length in seconds, increase if episodes timeout during eval
-
-# =========================
-# Robot Settings
-# =========================
-robot: DoublePiper-Abs # Robot type, DoublePiper-Abs, X7S-Abs, G1-Controller or G1-Controller-DecoupledWBC
-robot_scale: 1.0
-
-# =========================
-# Task & Scene Settings
-# =========================
-task: L90K1PutTheBlackBowlOnThePlate # Task name
-scene_backend: robocasa
-task_backend: robocasa
-debug_assets: null
-layout: libero-1-1 # Layout and style ID
-sources:
-  - objaverse
-  - lightwheel
-  - aigen_objs
-object_projects: []
-usd_simplify: false
-seed: 42
-
-# =========================
-# Object Placement Retry Settings
-# =========================
-max_scene_retry: 4
-max_object_placement_retry: 3
-
-resample_objects_placement_on_reset: true
-resample_robot_placement_on_reset: true
-
-# =========================
-# Replay Configuration Settings
-# =========================
-replay_cfgs:
-  add_camera_to_observation: true
-  render_resolution: [640, 480]
-```
-
-### See Also
-
- [LW-BenchHub GitHub](https://github.com/LightwheelAI/LW-BenchHub)
- [LW-BenchHub Documentation](https://docs.lightwheel.net/lw_benchhub/)
@@ -137,8 +137,7 @@ from lerobot.teleoperators import (  # noqa: F401
    Teleoperator,
    TeleoperatorConfig,
    make_teleoperator_from_config,
-    so_leader,
-    bi_so_leader,
+    so101_leader,
 )
 from lerobot.utils.robot_utils import precise_sleep
 from lerobot.utils.utils import init_logging
@@ -197,7 +196,7 @@ def teleop_loop(teleop: Teleoperator, env: gym.Env, fps: int):
            obs, info = env.reset()

        dt_s = time.perf_counter() - loop_start
-        precise_sleep(max(1 / fps - dt_s, 0.0))
+        precise_sleep(1 / fps - dt_s)
        loop_s = time.perf_counter() - loop_start
        print(f"\ntime: {loop_s * 1e3:.2f}ms ({1 / loop_s:.0f} Hz)")

@@ -223,7 +222,7 @@ def teleoperate(cfg: TeleoperateConfig):

 def main():
    teleoperate(TeleoperateConfig(
-        teleop=so_leader.SO101LeaderConfig(
+        teleop=so101_leader.SO101LeaderConfig(
            port="/dev/ttyACM0",
            id='leader',
            use_degrees=False,
@@ -12,12 +12,6 @@ Developers and researchers can post-train GR00T N1.5 with their own real or synt

 GR00T N1.5 (specifically the GR00T-N1.5-3B model) is built using pre-trained vision and language encoders. It utilizes a flow matching action transformer to model a chunk of actions, conditioned on vision, language, and proprioception.

-<img
-  src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/lerobot-groot-paper1%20(1).png"
-  alt="An overview of GR00T"
-  width="80%"
-/>
-
 Its strong performance comes from being trained on an expansive and diverse humanoid dataset, which includes:

 - Real captured data from robots.
@@ -109,7 +103,7 @@ Once you have trained your model using your parameters you can run inference in

 ```bash
 lerobot-record \
-  --robot.type=bi_so_follower \
+  --robot.type=bi_so100_follower \
  --robot.left_arm_port=/dev/ttyACM1 \
  --robot.right_arm_port=/dev/ttyACM0 \
  --robot.id=bimanual_follower \
@@ -120,12 +114,9 @@ lerobot-record \
  --display_data=true \
  --dataset.repo_id=<user>/eval_groot-bimanual  \
  --dataset.num_episodes=10 \
-  --dataset.single_task="Grab and handover the red cube to the other arm" \
-  --dataset.streaming_encoding=true \
-  --dataset.encoder_threads=2 \
-  # --dataset.vcodec=auto \
-  --policy.path=<user>/groot-bimanual \ # your trained model
-  --dataset.episode_time_s=30 \
+  --dataset.single_task="Grab and handover the red cube to the other arm" \
+  --policy.path=<user>/groot-bimanual \
+  --dataset.episode_time_s=30 \
  --dataset.reset_time_s=10
 ```

@@ -224,15 +224,12 @@ lerobot-record \
    --teleop.port=/dev/tty.usbmodem1201 \
    --teleop.id=right \
    --teleop.side=right \
-    --dataset.repo_id=<USER>/hand_record_test_with_video_data \
+    --dataset.repo_id=nepyope/hand_record_test_with_video_data \
    --dataset.single_task="Hand recording test with video data" \
    --dataset.num_episodes=1 \
    --dataset.episode_time_s=5 \
    --dataset.push_to_hub=true \
    --dataset.private=true \
-    --dataset.streaming_encoding=true \
-    --dataset.encoder_threads=2 \
-    # --dataset.vcodec=auto \
    --display_data=true
 ```

@@ -244,7 +241,7 @@ lerobot-replay \
    --robot.port=/dev/tty.usbmodem58760432281 \
    --robot.id=right \
    --robot.side=right \
-    --dataset.repo_id=<USER>/hand_record_test_with_camera \
+    --dataset.repo_id=nepyope/hand_record_test_with_camera \
    --dataset.episode=0
 ```

@@ -252,13 +249,13 @@ lerobot-replay \

 ```bash
 lerobot-train \
-  --dataset.repo_id=<USER>/hand_record_test_with_video_data \
+  --dataset.repo_id=nepyope/hand_record_test_with_video_data \
  --policy.type=act \
  --output_dir=outputs/train/hopejr_hand \
  --job_name=hopejr \
  --policy.device=mps \
  --wandb.enable=true \
-  --policy.repo_id=<USER>/hand_test_policy
+  --policy.repo_id=nepyope/hand_test_policy
 ```

 ### Evaluate
@@ -273,11 +270,8 @@ lerobot-record \
  --robot.side=right \
  --robot.cameras='{"main": {"type": "opencv", "index_or_path": 0, "width": 640, "height": 480, "fps": 30}}' \
  --display_data=false \
-  --dataset.repo_id=<USER>/eval_hopejr \
+  --dataset.repo_id=nepyope/eval_hopejr \
  --dataset.single_task="Evaluate hopejr hand policy" \
  --dataset.num_episodes=10 \
-  --dataset.streaming_encoding=true \
-  --dataset.encoder_threads=2 \
-  # --dataset.vcodec=auto \
  --policy.path=outputs/train/hopejr_hand/checkpoints/last/pretrained_model
 ```
@@ -58,8 +58,8 @@ lerobot-teleoperate \

 <!-- prettier-ignore-start -->
 ```python
-from lerobot.teleoperators.so_leader import SO101LeaderConfig, SO101Leader
-from lerobot.robots.so_follower import SO101FollowerConfig, SO101Follower
+from lerobot.teleoperators.so101_leader import SO101LeaderConfig, SO101Leader
+from lerobot.robots.so101_follower import SO101FollowerConfig, SO101Follower

 robot_config = SO101FollowerConfig(
    port="/dev/tty.usbmodem58760431541",
@@ -159,13 +159,13 @@ We use the Hugging Face hub features for uploading your dataset. If you haven't
 Add your token to the CLI by running this command:

 ```bash
-hf auth login --token ${HUGGINGFACE_TOKEN} --add-to-git-credential
+huggingface-cli login --token ${HUGGINGFACE_TOKEN} --add-to-git-credential
 ```

 Then store your Hugging Face repository name in a variable:

 ```bash
-HF_USER=$(NO_COLOR=1 hf auth whoami | awk -F': *' 'NR==1 {print $2}')
+HF_USER=$(hf auth whoami | head -n 1)
 echo $HF_USER
 ```

@@ -185,10 +185,7 @@ lerobot-record \
    --display_data=true \
    --dataset.repo_id=${HF_USER}/record-test \
    --dataset.num_episodes=5 \
-    --dataset.single_task="Grab the black cube" \
-    --dataset.streaming_encoding=true \
-    # --dataset.vcodec=auto \
-    --dataset.encoder_threads=2
+    --dataset.single_task="Grab the black cube"
 ```
 </hfoption>
 <hfoption id="API example">
@@ -198,14 +195,13 @@ lerobot-record \
 from lerobot.cameras.opencv.configuration_opencv import OpenCVCameraConfig
 from lerobot.datasets.lerobot_dataset import LeRobotDataset
 from lerobot.datasets.utils import hw_to_dataset_features
-from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
-from lerobot.teleoperators.so_leader.config_so100_leader import SO100LeaderConfig
-from lerobot.teleoperators.so_leader.so100_leader import SO100Leader
+from lerobot.robots.so100_follower import SO100Follower, SO100FollowerConfig
+from lerobot.teleoperators.so100_leader.config_so100_leader import SO100LeaderConfig
+from lerobot.teleoperators.so100_leader.so100_leader import SO100Leader
 from lerobot.utils.control_utils import init_keyboard_listener
 from lerobot.utils.utils import log_say
 from lerobot.utils.visualization_utils import init_rerun
-from lerobot.scripts.lerobot_record import record_loop
-from lerobot.processor import make_default_processors
+from lerobot.record import record_loop

 NUM_EPISODES = 5
 FPS = 30
@@ -213,19 +209,12 @@ EPISODE_TIME_SEC = 60
 RESET_TIME_SEC = 10
 TASK_DESCRIPTION = "My task description"

-# Create robot configuration
+# Create the robot and teleoperator configurations
+camera_config = {"front": OpenCVCameraConfig(index_or_path=0, width=640, height=480, fps=FPS)}
 robot_config = SO100FollowerConfig(
-    id="my_awesome_follower_arm",
-    cameras={
-        "front": OpenCVCameraConfig(index_or_path=0, width=640, height=480, fps=FPS) # Optional: fourcc="MJPG" for troubleshooting OpenCV async error.
-    },
-    port="/dev/tty.usbmodem58760434471",
-)
-
-teleop_config = SO100LeaderConfig(
-    id="my_awesome_leader_arm",
-    port="/dev/tty.usbmodem585A0077581",
+    port="/dev/tty.usbmodem58760434471", id="my_awesome_follower_arm", cameras=camera_config
 )
+teleop_config = SO100LeaderConfig(port="/dev/tty.usbmodem585A0077581", id="my_awesome_leader_arm")

 # Initialize the robot and teleoperator
 robot = SO100Follower(robot_config)
@@ -254,9 +243,6 @@ init_rerun(session_name="recording")
 robot.connect()
 teleop.connect()

-# Create the required processors
-teleop_action_processor, robot_action_processor, robot_observation_processor = make_default_processors()
-
 episode_idx = 0
 while episode_idx < NUM_EPISODES and not events["stop_recording"]:
    log_say(f"Recording episode {episode_idx + 1} of {NUM_EPISODES}")
@@ -265,9 +251,6 @@ while episode_idx < NUM_EPISODES and not events["stop_recording"]:
        robot=robot,
        events=events,
        fps=FPS,
-        teleop_action_processor=teleop_action_processor,
-        robot_action_processor=robot_action_processor,
-        robot_observation_processor=robot_observation_processor,
        teleop=teleop,
        dataset=dataset,
        control_time_s=EPISODE_TIME_SEC,
@@ -282,9 +265,6 @@ while episode_idx < NUM_EPISODES and not events["stop_recording"]:
            robot=robot,
            events=events,
            fps=FPS,
-            teleop_action_processor=teleop_action_processor,
-            robot_action_processor=robot_action_processor,
-            robot_observation_processor=robot_observation_processor,
            teleop=teleop,
            control_time_s=RESET_TIME_SEC,
            single_task=TASK_DESCRIPTION,
@@ -327,7 +307,7 @@ You can look for other LeRobot datasets on the hub by searching for `LeRobot` [t
 You can also push your local dataset to the Hub manually, running:

 ```bash
-hf upload ${HF_USER}/record-test ~/.cache/huggingface/lerobot/{repo-id} --repo-type dataset
+huggingface-cli upload ${HF_USER}/record-test ~/.cache/huggingface/lerobot/{repo-id} --repo-type dataset
 ```

 #### Record function
@@ -411,8 +391,8 @@ lerobot-replay \
 import time

 from lerobot.datasets.lerobot_dataset import LeRobotDataset
-from lerobot.robots.so_follower.config_so100_follower import SO100FollowerConfig
-from lerobot.robots.so_follower.so100_follower import SO100Follower
+from lerobot.robots.so100_follower.config_so100_follower import SO100FollowerConfig
+from lerobot.robots.so100_follower.so100_follower import SO100Follower
 from lerobot.utils.robot_utils import precise_sleep
 from lerobot.utils.utils import log_say

@@ -435,7 +415,7 @@ for idx in range(dataset.num_frames):
    }
    robot.send_action(action)

-    precise_sleep(max(1.0 / dataset.fps - (time.perf_counter() - t0), 0.0))
+    precise_sleep(1.0 / dataset.fps - (time.perf_counter() - t0))

 robot.disconnect()
 ```
@@ -491,7 +471,7 @@ If your local computer doesn't have a powerful GPU you could utilize Google Cola
 Once training is done, upload the latest checkpoint with:

 ```bash
-hf upload ${HF_USER}/act_so101_test \
+huggingface-cli upload ${HF_USER}/act_so101_test \
  outputs/train/act_so101_test/checkpoints/last/pretrained_model
 ```

@@ -499,7 +479,7 @@ You can also upload intermediate checkpoints with:

 ```bash
 CKPT=010000
-hf upload ${HF_USER}/act_so101_test${CKPT} \
+huggingface-cli upload ${HF_USER}/act_so101_test${CKPT} \
  outputs/train/act_so101_test/checkpoints/${CKPT}/pretrained_model
 ```

@@ -518,9 +498,6 @@ lerobot-record  \
  --display_data=false \
  --dataset.repo_id=${HF_USER}/eval_so100 \
  --dataset.single_task="Put lego brick into the transparent box" \
-  --dataset.streaming_encoding=true \
-  --dataset.encoder_threads=2 \
-  # --dataset.vcodec=auto \
  # <- Teleop optional if you want to teleoperate in between episodes \
  # --teleop.type=so100_leader \
  # --teleop.port=/dev/ttyACM0 \
@@ -537,8 +514,8 @@ from lerobot.datasets.lerobot_dataset import LeRobotDataset
 from lerobot.datasets.utils import hw_to_dataset_features
 from lerobot.policies.act.modeling_act import ACTPolicy
 from lerobot.policies.factory import make_pre_post_processors
-from lerobot.robots.so_follower.config_so100_follower import SO100FollowerConfig
-from lerobot.robots.so_follower.so100_follower import SO100Follower
+from lerobot.robots.so100_follower.config_so100_follower import SO100FollowerConfig
+from lerobot.robots.so100_follower.so100_follower import SO100Follower
 from lerobot.scripts.lerobot_record import record_loop
 from lerobot.utils.control_utils import init_keyboard_listener
 from lerobot.utils.utils import log_say
@@ -1,57 +1,30 @@
 # Installation

-This guide uses `conda` (via miniforge) to manage environments (recommended). If you prefer another environment manager (e.g. `uv`, `venv`), ensure you have Python >=3.12 and `ffmpeg` installed with the `libsvtav1` encoder, then skip ahead to [Environment Setup](#step-2-environment-setup).
-
-## Step 1 (`conda` only): Install [`miniforge`](https://conda-forge.org/download/)
+## Install [`miniforge`](https://conda-forge.org/download/)

 ```bash
 wget "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
 bash Miniforge3-$(uname)-$(uname -m).sh
 ```

-## Step 2: Environment Setup
+## Environment Setup

-Create a virtual environment with Python 3.12:
+Create a virtual environment with Python 3.10, using conda:

-<!-- prettier-ignore-start -->
-<hfoptions id="create_venv">
-<hfoption id="conda">
 ```bash
-conda create -y -n lerobot python=3.12
+conda create -y -n lerobot python=3.10
 ```
-</hfoption>
-<hfoption id="uv">
+
+Then activate your conda environment. You have to do this each time you open a shell to use lerobot:
+
 ```bash
-uv python install 3.12
-uv venv --python 3.12
-```
-</hfoption>
-</hfoptions>
-<!-- prettier-ignore-end -->
-
-Then activate your virtual environment, you have to do this each time you open a shell to use lerobot:
-
-<!-- prettier-ignore-start -->
-<hfoptions id="activate_venv">
-<hfoption id="conda">```bash
 conda activate lerobot
-```</hfoption>
-<hfoption id="uv">
-```bash
-# Linux/macOSsource
-source .venv/bin/activate
-# Windows PowerShell
-source .venv\Scripts\Activate.ps1
 ```
-</hfoption>
-</hfoptions>
-<!-- prettier-ignore-end -->

 When using `conda`, install `ffmpeg` in your environment:

 ```bash
 conda install ffmpeg -c conda-forge
-ffmpeg -version  # ffmpeg 8.X is not yet supported !
 ```

 > [!TIP]
@@ -65,17 +38,7 @@ ffmpeg -version  # ffmpeg 8.X is not yet supported !
 >
 > - _[On Linux only]_ If you want to bring your own ffmpeg: Install [ffmpeg build dependencies](https://trac.ffmpeg.org/wiki/CompilationGuide/Ubuntu#GettheDependencies) and [compile ffmpeg from source with libsvtav1](https://trac.ffmpeg.org/wiki/CompilationGuide/Ubuntu#libsvtav1), and make sure you use the corresponding ffmpeg binary to your install with `which ffmpeg`.

-> [!NOTE]
-> When installing LeRobot inside WSL (Windows Subsystem for Linux), make sure to install `evdev` with the following command:
->
-> ```bash
-> conda install evdev -c conda-forge
-> ```
-
-> [!IMPORTANT]
-> If you are using `uv` you will have to install `ffmpeg` system-wide (outside of the virtual environment). You rely on `uv` and `torchcodec` ability to dynamically link to the system `ffmpeg`.
-
-## Step 3: Install LeRobot 🤗
+## Install LeRobot 🤗

 ### From Source

@@ -88,45 +51,23 @@ cd lerobot

 Then, install the library in editable mode. This is useful if you plan to contribute to the code.

-<!-- prettier-ignore-start -->
-<hfoptions id="install_lerobot_src">
-<hfoption id="conda">
 ```bash
 pip install -e .
 ```
-</hfoption>
-<hfoption id="uv">
-```bash
-uv pip install -e .
-```
-</hfoption>
-</hfoptions>
-<!-- prettier-ignore-end -->

 ### Installation from PyPI

 **Core Library:**
 Install the base package with:

-<!-- prettier-ignore-start -->
-<hfoptions id="install_lerobot_pypi">
-<hfoption id="conda">
 ```bash
 pip install lerobot
 ```
-</hfoption>
-<hfoption id="uv">
-```bash
-uv pip install lerobot
-```
-</hfoption>
-</hfoptions>
-<!-- prettier-ignore-end -->

 _This installs only the default dependencies._

 **Extra Features:**
-To install additional functionality, use one of the following (If you are using `uv`, replace `pip install` with `uv pip install` in the commands below.):
+To install additional functionality, use one of the following:

 ```bash
 pip install 'lerobot[all]'          # All available features
@@ -140,10 +81,13 @@ _Replace `[...]` with your desired features._
 For a full list of optional dependencies, see:
 https://pypi.org/project/lerobot/

+> [!NOTE]
+> For lerobot 0.4.0, to install the `pi` extra you have to install it from the GitHub repository: `pip install "lerobot[pi]@git+https://github.com/huggingface/lerobot.git"`
+
 ### Troubleshooting

 If you encounter build errors, you may need to install additional dependencies: `cmake`, `build-essential`, and `ffmpeg libs`.
-To install these for Linux run:
+To install these for linux run:

 ```bash
 sudo apt-get install cmake build-essential python3-dev pkg-config libavformat-dev libavcodec-dev libavdevice-dev libavutil-dev libswscale-dev libswresample-dev libavfilter-dev
@@ -153,7 +97,7 @@ For other systems, see: [Compiling PyAV](https://pyav.org/docs/develop/overview/

 ## Optional dependencies

-LeRobot provides optional extras for specific functionalities. Multiple extras can be combined (e.g., `.[aloha,feetech]`). For all available extras, refer to `pyproject.toml`. If you are using `uv`, replace `pip install` with `uv pip install` in the commands below.
+LeRobot provides optional extras for specific functionalities. Multiple extras can be combined (e.g., `.[aloha,feetech]`). For all available extras, refer to `pyproject.toml`.

 ### Simulations

@@ -18,7 +18,7 @@ If you're using Feetech or Dynamixel motors, LeRobot provides built-in bus inter
 - [`DynamixelMotorsBus`](https://github.com/huggingface/lerobot/blob/main/src/lerobot/motors/dynamixel/dynamixel.py) – for controlling Dynamixel servos

 Please refer to the [`MotorsBus`](https://github.com/huggingface/lerobot/blob/main/src/lerobot/motors/motors_bus.py) abstract class to learn about its API.
-For a good example of how it can be used, you can have a look at our own [SO101 follower implementation](https://github.com/huggingface/lerobot/blob/main/src/lerobot/robots/so_follower/so101_follower/so101_follower.py)
+For a good example of how it can be used, you can have a look at our own [SO101 follower implementation](https://github.com/huggingface/lerobot/blob/main/src/lerobot/robots/so101_follower/so101_follower.py)

 Use these if compatible. Otherwise, you'll need to find or write a Python interface (not covered in this tutorial):

@@ -1,11 +1,5 @@
 # LeKiwi

-<img
-  src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/1740517739083.jpeg"
-  alt="LeKiwi"
-  width="70%"
-/>
-
 In the steps below, we explain how to assemble the LeKiwi mobile robot.

 ## Source the parts
@@ -210,7 +204,7 @@ lerobot-calibrate \

 <!-- prettier-ignore-start -->
 ```python
-from lerobot.teleoperators.so_leader import SO100LeaderConfig, SO100Leader
+from lerobot.teleoperators.so100_leader import SO100LeaderConfig, SO100Leader

 config = SO100LeaderConfig(
    port="/dev/tty.usbmodem58760431551",
@@ -279,13 +273,13 @@ We use the Hugging Face hub features for uploading your dataset. If you haven't
 Add your token to the CLI by running this command:

 ```bash
-hf auth login --token ${HUGGINGFACE_TOKEN} --add-to-git-credential
+huggingface-cli login --token ${HUGGINGFACE_TOKEN} --add-to-git-credential
 ```

 Then store your Hugging Face repository name in a variable:

 ```bash
-HF_USER=$(hf auth whoami | awk -F': *' 'NR==1 {print $2}')
+HF_USER=$(huggingface-cli whoami | head -n 1)
 echo $HF_USER
 ```

@@ -41,10 +41,7 @@ lerobot-record \
  --display_data=true \
  --dataset.repo_id=${HF_USER}/record-test \
  --dataset.num_episodes=5 \
-  --dataset.single_task="Grab the black cube" \
-  --dataset.streaming_encoding=true \
-  # --dataset.vcodec=auto \
-  --dataset.encoder_threads=2
+  --dataset.single_task="Grab the black cube"
 ```

 See the [recording guide](./il_robots#record-a-dataset) for more details.
@@ -42,7 +42,6 @@ lerobot-eval \
 ```

 - `--env.task` picks the suite (`libero_object`, `libero_spatial`, etc.).
-- `--env.task_ids` picks task ids to run (`[0]`, `[1,2,3]`, etc.). Omit this flag (or set it to `null`) to run all tasks in the suite.
 - `--eval.batch_size` controls how many environments run in parallel.
 - `--eval.n_episodes` sets how many episodes to run in total.

@@ -1,197 +0,0 @@
-## Order and Assemble the parts
-
-First, assemble the OMX hardware following the official assembly guide.
-
-OMX Assembly Guide: https://ai.robotis.com/omx/assembly_guide_omx.html
-
-OMX robots are shipped preconfigured from the factory. Motor IDs, communication parameters, and joint offsets are already set, so no additional motor setup or calibration is required before using LeRobot.
-
-## Install LeRobot 🤗
-
-To install LeRobot, follow our [Installation Guide](./installation)
-
-In addition to these instructions, you need to install the Dynamixel SDK:
-
-```bash
-pip install -e ".[dynamixel]"
-```
-
-## Connect the robot
-
-To find the port for each bus servo adapter, run this script:
-
-```bash
-lerobot-find-port
-```
-
-This command runs and when prompted, disconnect the USB cable from either the leader or follower arm and press Enter. The output will show 'The port of this MotorsBus is [port]'. This identifies the port for the disconnected arm. Repeat for the other arm to identify both ports.
-
-<hfoptions id="find_port">
-<hfoption id="Mac">
-
-Example output on macOS:
-
-```
-Finding all available ports for the MotorBus.
-['/dev/tty.usbmodem575E0032081', '/dev/tty.usbmodem575E0031751']
-Remove the USB cable from your MotorsBus and press Enter when done.
-
-[...Disconnect corresponding leader or follower arm and press Enter...]
-
-The port of this MotorsBus is /dev/tty.usbmodem575E0032081
-Reconnect the USB cable.
-```
-
-Where the found port is: `/dev/tty.usbmodem575E0032081` corresponding to your leader or follower arm.
-
-</hfoption>
-<hfoption id="Linux">
-
-On Linux, we strongly recommend using udev rules to assign persistent and human-readable device names to the OMX leader and follower arms. This avoids issues where device names such as ttyACM0 and ttyACM1 change when the robot is unplugged, replugged, or when the system is rebooted.
-
-#### 1. Find your device serial numbers
-
-You should have obtained the port numbers like ../../ttyACM? for the leader and follower using `lerobot-find-port`. You can match those results with the serial numbers using the `ls -l /dev/serial/by-id/` command.
-To create udev rules, you need the unique serial number for each OMX device. The easiest way is to list devices under:
-
-```bash
-ls -l /dev/serial/by-id/
-```
-
-You will see output similar to:
-
-```bash
-usb-ROBOTIS_OpenRB-150_228BDD7B503059384C2E3120FF0A2B19-if00 -> ../../ttyACM0
-usb-ROBOTIS_OpenRB-150_67E1ED68503059384C2E3120FF092234-if00 -> ../../ttyACM1
-```
-
-In each line, the serial number is the long string after `usb-ROBOTIS_OpenRB-150_` and before `-if00`.
-
-Follower serial: `228BDD7B503059384C2E3120FF0A2B19`
-
-Leader serial: `67E1ED68503059384C2E3120FF092234`
-
-#### 2. Create the udev rule
-
-Create a new udev rule file:
-
-```bash
-sudo nano /etc/udev/rules.d/99-omx.rules
-```
-
-Paste the following lines, replacing the serial numbers with the values you found above:
-
-```bash
-SUBSYSTEM=="tty", ATTRS{idVendor}=="0403", ATTRS{serial}=="228BDD7B503059384C2E3120FF0A2B19", SYMLINK+="omx_follower"
-SUBSYSTEM=="tty", ATTRS{idVendor}=="0403", ATTRS{serial}=="67E1ED68503059384C2E3120FF092234", SYMLINK+="omx_leader"
-```
-
-Save the file and reload udev rules:
-
-```bash
-sudo udevadm control --reload-rules
-sudo udevadm trigger
-```
-
-Now unplug and replug both devices once.
-
-#### 3. Verify the symlinks
-
-Check that the persistent device names exist:
-
-```bash
-ls -l /dev/omx_follower /dev/omx_leader
-```
-
-You should see them pointing to ttyACM\* devices:
-
-```bash
-/dev/omx_follower -> ttyACM*
-/dev/omx_leader   -> ttyACM*
-```
-
-These names remain stable across reboots and reconnections.
-
-</hfoption>
-</hfoptions>
-
-## Teleoperate
-
-After identifying the correct ports, you can directly teleoperate the follower arm using the leader arm.
-
-<hfoptions id="teleoperate">
-<hfoption id="Mac">
-
-### Teleoperate without camera
-
-```bash
-lerobot-teleoperate \
-  --robot.type=omx_follower \
-  --robot.port=<your_follower_port> \
-  --robot.id=omx_follower_arm \
-  --teleop.type=omx_leader \
-  --teleop.port=<your_leader_port> \
-  --teleop.id=omx_leader_arm
-```
-
-During teleoperation, motions of the leader arm are mirrored in real time by the follower arm. OMX is already preconfigured, teleoperation can begin immediately without any calibration steps.
-
-### Teleoperate with camera
-
-You can also enable camera input during teleoperation by providing a camera configuration for the follower arm.
-
-```bash
-lerobot-teleoperate \
-  --robot.type=omx_follower \
-  --robot.port=<your_follower_port> \
-  --robot.id=omx_follower_arm \
-  --robot.cameras="{front: {type: opencv, index_or_path: '/dev/video0', width: 640, height: 480, fps: 30}}" \
-  --teleop.type=omx_leader \
-  --teleop.port=<your_leader_port> \
-  --teleop.id=omx_leader_arm \
-  --display_data=true
-```
-
-When the camera is enabled, the camera stream is displayed in real time and synchronized with the robot state. This setup is useful for visual monitoring and can be reused later for demonstration recording and imitation learning.
-
-</hfoption>
-<hfoption id="Linux">
-
-### Teleoperate without camera
-
-```bash
-lerobot-teleoperate \
-  --robot.type=omx_follower \
-  --robot.port=/dev/omx_follower \
-  --robot.id=omx_follower_arm \
-  --teleop.type=omx_leader \
-  --teleop.port=/dev/omx_leader \
-  --teleop.id=omx_leader_arm
-```
-
-During teleoperation, motions of the leader arm are mirrored in real time by the follower arm. OMX is already preconfigured, teleoperation can begin immediately without any calibration steps.
-
-### Teleoperate with camera
-
-You can also enable camera input during teleoperation by providing a camera configuration for the follower arm.
-
-```bash
-lerobot-teleoperate \
-  --robot.type=omx_follower \
-  --robot.port=/dev/omx_follower \
-  --robot.id=omx_follower_arm \
-  --robot.cameras="{front: {type: opencv, index_or_path: '/dev/video0', width: 640, height: 480, fps: 30}}" \
-  --teleop.type=omx_leader \
-  --teleop.port=/dev/omx_leader \
-  --teleop.id=omx_leader_arm \
-  --display_data=true
-```
-
-When the camera is enabled, the camera stream is displayed in real time and synchronized with the robot state. This setup is useful for visual monitoring and can be reused later for demonstration recording and imitation learning.
-
-</hfoption>
-</hfoptions>
-
-Congrats 🎉, your robot is all set to learn a task on its own.
-
-> If you have any questions or need help, please reach out on [Discord](https://discord.com/invite/robotis).
@@ -1,276 +0,0 @@
-# OpenArm
-
-[OpenArm](https://openarm.dev) is an open-source 7DOF humanoid arm designed for physical AI research and deployment.
-
-To get your OpenArm, assembled or DIY, and join the global community, browse verified and certified manufacturers worldwide at [openarm.dev](https://openarm.dev).
-
-## What's Unique?
-
-- **Human-Scale Design**: OpenArm is designed with human-like proportions, scaled for a person around 160-165cm tall. This provides an optimal balance between practical reach and manageable inertia for safe, responsive operation.
-
-- **Safety-First Architecture**: Built with QDD backdrivable motors and high compliance, OpenArm prioritizes safe human-robot interaction while maintaining practical payload capabilities (6.0kg peak / 4.1kg nominal) for real-world tasks.
-
-- **Built for Durability**: Critical structural components use aluminum and stainless steel construction, ensuring robust performance for repetitive data collection and continuous research use.
-
-- **Fully Accessible & Buildable**: Every component, from CNC parts and 3D-printed casings to electrical wiring is designed to be purchasable and buildable by individual researchers and labs, with complete fabrication data provided.
-
-- **Practical & Affordable**: At $6,500 USD for a complete bimanual system, OpenArm delivers research-grade capabilities at a fraction of traditional humanoid robot costs.
-
-## Platform Requirements
-
-<Tip warning={true}>
-  **Linux Only**: OpenArm currently only works on Linux. The CAN bus USB adapter
-  does not have macOS drivers and has not been tested on Windows.
-</Tip>
-
-## Safety Guide
-
-Before operating OpenArm, please read the [official safety guide](https://docs.openarm.dev/getting-started/safety-guide). Key points:
-
-- **Secure installation**: Fasten the arm to a flat, stable surface with screws or clamps
-- **Safe distance**: Keep body parts and objects outside the range of motion during operation
-- **Protective equipment**: Always wear safety goggles; use additional PPE as needed
-- **Payload limits**: Do not exceed specified payload limits (6.0kg peak / 4.1kg nominal per arm)
-- **Emergency stop**: Know the location and operation of the emergency stop device
-- **Regular inspection**: Check for loose screws, damaged mechanical limits, unusual noises, and wiring damage
-
-## Hardware Setup
-
-Follow the official [OpenArm hardware documentation](https://docs.openarm.dev) for:
-
-- Bill of materials and sourcing
-- 3D printing instructions
-- Mechanical assembly
-- Electrical wiring
-
-The hardware repositories are available at [github.com/enactic/openarm](https://github.com/enactic/openarm).
-
-## CAN Bus Setup
-
-OpenArm uses CAN bus communication with Damiao motors. Once you have the CAN bus USB adapter plugged into your Linux PC, follow the [Damiao Motors and CAN Bus guide](./damiao) to configure the interface.
-
-Quick setup:
-
-```bash
-# Setup CAN interfaces
-lerobot-setup-can --mode=setup --interfaces=can0,can1
-
-# Test motor communication
-lerobot-setup-can --mode=test --interfaces=can0,can1
-```
-
-## Install LeRobot 🤗
-
-Follow our [Installation Guide](./installation), then install the Damiao motor support:
-
-```bash
-pip install -e ".[damiao]"
-```
-
-## Usage
-
-### Follower Arm (Robot)
-
-<hfoptions id="follower">
-<hfoption id="Command">
-
-```bash
-lerobot-calibrate \
-    --robot.type=openarm_follower \
-    --robot.port=can0 \
-    --robot.side=right \
-    --robot.id=my_openarm_follower
-```
-
-</hfoption>
-<hfoption id="API example">
-
-```python
-from lerobot.robots.openarm_follower import OpenArmFollower, OpenArmFollowerConfig
-
-config = OpenArmFollowerConfig(
-    port="can0",
-    side="right",  # or "left" for left arm
-    id="my_openarm_follower",
-)
-
-follower = OpenArmFollower(config)
-follower.connect()
-
-# Read current state
-obs = follower.get_observation()
-print(obs)
-
-# Send action (position in degrees)
-action = {
-    "joint_1.pos": 0.0,
-    "joint_2.pos": 0.0,
-    "joint_3.pos": 0.0,
-    "joint_4.pos": 45.0,
-    "joint_5.pos": 0.0,
-    "joint_6.pos": 0.0,
-    "joint_7.pos": 0.0,
-    "gripper.pos": 0.0,
-}
-follower.send_action(action)
-
-follower.disconnect()
-```
-
-</hfoption>
-</hfoptions>
-
-### Leader Arm (Teleoperator)
-
-The leader arm is used for teleoperation - manually moving it to control the follower arm.
-
-<hfoptions id="leader">
-<hfoption id="Command">
-
-```bash
-lerobot-calibrate \
-    --teleop.type=openarm_leader \
-    --teleop.port=can1 \
-    --teleop.id=my_openarm_leader
-```
-
-</hfoption>
-<hfoption id="API example">
-
-```python
-from lerobot.teleoperators.openarm_leader import OpenArmLeader, OpenArmLeaderConfig
-
-config = OpenArmLeaderConfig(
-    port="can1",
-    id="my_openarm_leader",
-    manual_control=True,  # Disable torque for manual movement
-)
-
-leader = OpenArmLeader(config)
-leader.connect()
-
-# Read current position (as action to send to follower)
-action = leader.get_action()
-print(action)
-
-leader.disconnect()
-```
-
-</hfoption>
-</hfoptions>
-
-### Teleoperation
-
-To teleoperate OpenArm with leader-follower control:
-
-```bash
-lerobot-teleoperate \
-    --robot.type=openarm_follower \
-    --robot.port=can0 \
-    --robot.side=right \
-    --robot.id=my_follower \
-    --teleop.type=openarm_leader \
-    --teleop.port=can1 \
-    --teleop.id=my_leader
-```
-
-### Bimanual Teleoperation
-
-To teleoperate a bimanual OpenArm setup with two leader and two follower arms:
-
-```bash
-lerobot-teleoperate \
-    --robot.type=bi_openarm_follower \
-    --robot.left_arm_config.port=can0 \
-    --robot.left_arm_config.side=left \
-    --robot.right_arm_config.port=can1 \
-    --robot.right_arm_config.side=right \
-    --robot.id=my_bimanual_follower \
-    --teleop.type=bi_openarm_leader \
-    --teleop.left_arm_config.port=can2 \
-    --teleop.right_arm_config.port=can3 \
-    --teleop.id=my_bimanual_leader
-```
-
-### Recording Data
-
-To record a dataset during teleoperation:
-
-```bash
-lerobot-record \
-    --robot.type=openarm_follower \
-    --robot.port=can0 \
-    --robot.side=right \
-    --robot.id=my_follower \
-    --teleop.type=openarm_leader \
-    --teleop.port=can1 \
-    --teleop.id=my_leader \
-    --repo-id=my_hf_username/my_openarm_dataset \
-    --fps=30 \
-    --num-episodes=10
-```
-
-## Configuration Options
-
-### Follower Configuration
-
-| Parameter             | Default   | Description                                                |
-| --------------------- | --------- | ---------------------------------------------------------- |
-| `port`                | -         | CAN interface (e.g., `can0`)                               |
-| `side`                | `None`    | Arm side: `"left"`, `"right"`, or `None` for custom limits |
-| `use_can_fd`          | `True`    | Enable CAN FD for higher data rates                        |
-| `can_bitrate`         | `1000000` | Nominal bitrate (1 Mbps)                                   |
-| `can_data_bitrate`    | `5000000` | CAN FD data bitrate (5 Mbps)                               |
-| `max_relative_target` | `None`    | Safety limit for relative target positions                 |
-| `position_kp`         | Per-joint | Position control proportional gains                        |
-| `position_kd`         | Per-joint | Position control derivative gains                          |
-
-### Leader Configuration
-
-| Parameter          | Default   | Description                         |
-| ------------------ | --------- | ----------------------------------- |
-| `port`             | -         | CAN interface (e.g., `can1`)        |
-| `manual_control`   | `True`    | Disable torque for manual movement  |
-| `use_can_fd`       | `True`    | Enable CAN FD for higher data rates |
-| `can_bitrate`      | `1000000` | Nominal bitrate (1 Mbps)            |
-| `can_data_bitrate` | `5000000` | CAN FD data bitrate (5 Mbps)        |
-
-## Motor Configuration
-
-OpenArm uses Damiao motors with the following default configuration:
-
-| Joint                       | Motor Type | Send ID | Recv ID |
-| --------------------------- | ---------- | ------- | ------- |
-| joint_1 (Shoulder pan)      | DM8009     | 0x01    | 0x11    |
-| joint_2 (Shoulder lift)     | DM8009     | 0x02    | 0x12    |
-| joint_3 (Shoulder rotation) | DM4340     | 0x03    | 0x13    |
-| joint_4 (Elbow flex)        | DM4340     | 0x04    | 0x14    |
-| joint_5 (Wrist roll)        | DM4310     | 0x05    | 0x15    |
-| joint_6 (Wrist pitch)       | DM4310     | 0x06    | 0x16    |
-| joint_7 (Wrist rotation)    | DM4310     | 0x07    | 0x17    |
-| gripper                     | DM4310     | 0x08    | 0x18    |
-
-## Troubleshooting
-
-### No Response from Motors
-
-1. Check power supply connections
-2. Verify CAN wiring (CAN-H, CAN-L, GND)
-3. Run diagnostics: `lerobot-setup-can --mode=test --interfaces=can0`
-4. See the [Damiao troubleshooting guide](./damiao#troubleshooting) for more details
-
-### CAN Interface Not Found
-
-Ensure the CAN interface is configured:
-
-```bash
-ip link show can0
-```
-
-## Resources
-
-- [OpenArm Website](https://openarm.dev)
-- [OpenArm Documentation](https://docs.openarm.dev)
-- [OpenArm GitHub](https://github.com/enactic/openarm)
-- [Safety Guide](https://docs.openarm.dev/getting-started/safety-guide)
-- [Damiao Motors and CAN Bus](./damiao)
@@ -1,62 +0,0 @@
-# Parameter efficient fine-tuning with 🤗 PEFT
-
-[🤗 PEFT](https://github.com/huggingface/peft) (Parameter-Efficient Fine-Tuning) is a library for efficiently adapting
-large pretrained models such as pre-trained policies (e.g., SmolVLA, π₀, ...) to new tasks without training all
-of the model's parameters while yielding comparable performance.
-
-Install the `lerobot[peft]` optional package to enable PEFT support.
-
-To read about all the possible methods of adaption, please refer to the [🤗 PEFT docs](https://huggingface.co/docs/peft/index).
-
-## Training SmolVLA
-
-In this section we'll show you how to train a pre-trained SmolVLA policy with PEFT on the libero dataset.
-For brevity we're only training on the `libero_spatial` subset. We will use `lerobot/smolvla_base` as the model
-to parameter efficiently fine-tune:
-
-```
-lerobot-train \
- --policy.path=lerobot/smolvla_base \
- --policy.repo_id=your_hub_name/my_libero_smolvla \
- --dataset.repo_id=HuggingFaceVLA/libero \
- --policy.output_features=null \
- --policy.input_features=null \
- --policy.optimizer_lr=1e-3 \
- --policy.scheduler_decay_lr=1e-4 \
- --env.type=libero \
- --env.task=libero_spatial \
- --steps=100000 \
- --batch_size=32 \
- --peft.method_type=LORA \
- --peft.r=64
-```
-
-Note the `--peft.method_type` parameter that let's you select which PEFT method to use. Here we use
-[LoRA](https://huggingface.co/docs/peft/main/en/package_reference/lora) (Low-Rank Adapter) which is probably the most
-popular fine-tuning method to date. Low-rank adaption means that we only fine-tune a matrix with comparably low rank
-instead of the full weight matrix. This rank can be specified using the `--peft.r` parameter. The higher the rank
-the closer you get to full fine-tuning
-
-There are more complex methods that have more parameters. These are not yet supported, feel free to raise an issue
-if you want to see a specific PEFT method supported.
-
-By default, PEFT will target the `q_proj` and `v_proj` layers of the LM expert in SmolVLA. It will also target the
-state and action projection matrices as they are most likely task-dependent. If you need to target different layers
-you can use `--peft.target_modules` to specify which layers to target. You can refer to the respective PEFT method's
-documentation to see what inputs are supported, (e.g., [LoRA's target_modules documentation](https://huggingface.co/docs/peft/main/en/package_reference/lora#peft.LoraConfig.target_modules)).
-Usually a list of suffixes or a regex are supported. For example, to target the MLPs of the `lm_expert` instead of
-the `q` and `v` projections, use:
-
-```
---peft.target_modules='(model\.vlm_with_expert\.lm_expert\..*\.(down|gate|up)_proj|.*\.(state_proj|action_in_proj|action_out_proj|action_time_mlp_in|action_time_mlp_out))'
-```
-
-In case you need to fully fine-tune a layer instead of just adapting it, you can supply a list of layer suffixes
-to the `--peft.full_training_modules` parameter:
-
-```
---peft.full_training_modules=["state_proj"]
-```
-
-The learning rate and the scheduled target learning rate can usually be scaled by a factor of 10 compared to the
-learning rate used for full fine-tuning (e.g., 1e-4 normal, so 1e-3 using LoRA).
@@ -44,7 +44,7 @@ Modify the examples to use `PhoneOS.IOS` or `PhoneOS.ANDROID` in `PhoneConfig`.

 Teleoperation example:

-```python
+```36:43:examples/phone_so100_teleop.py
 from lerobot.teleoperators.phone.config_phone import PhoneConfig, PhoneOS

 teleop_config = PhoneConfig(phone_os=PhoneOS.IOS)  # or PhoneOS.ANDROID
@@ -66,13 +66,12 @@ Run on of the examples scripts to teleoperate, record a dataset, replay a datase

 All scripts assume you configured your robot (e.g., SO-100 follower) and set the correct serial port.

-Additionally you need to **copy the URDF of the robot into the examples folder**. For the examples in this tutorial (using SO100/SO101), copy the `SO101` folder from the [SO-ARM100 repo](https://github.com/TheRobotStudio/SO-ARM100/blob/main/Simulation/SO101) into the `examples/phone_to_so100/` directory, so that the URDF file path becomes `examples/phone_to_so100/SO101/so101_new_calib.urdf`.
+Additionally, you need to **copy the URDF of the robot to the examples folder**. For the examples in this tutorial (using SO100/SO101), we highly recommend using the URDF from the [SO-ARM100 repo](https://github.com/TheRobotStudio/SO-ARM100/blob/main/Simulation/SO101/so101_new_calib.urdf).

 - Run this example to teleoperate:

  ```bash
-  cd examples/phone_to_so100
-  python teleoperate.py
+  python examples/phone_to_so100/teleoperate.py
  ```

 After running the example:
@@ -85,29 +84,26 @@ Additionally you can customize mapping or safety limits by editing the processor
 - Run this example to record a dataset, which saves absolute end effector observations and actions:

  ```bash
-  cd examples/phone_to_so100
-  python record.py
+  python examples/phone_to_so100/record.py
  ```

 - Run this example to replay recorded episodes:

  ```bash
-  cd examples/phone_to_so100
-  python replay.py
+  python examples/phone_to_so100/replay.py
  ```

 - Run this example to evaluate a pretrained policy:

  ```bash
-  cd examples/phone_to_so100
-  python evaluate.py
+  python examples/phone_to_so100/evaluate.py
  ```

 ### Important pipeline steps and options

 - Kinematics are used in multiple steps. We use [Placo](https://github.com/Rhoban/placo) which is a wrapper around Pinocchio for handling our kinematics. We construct the kinematics object by passing the robot's URDF and target frame. We set `target_frame_name` to the gripper frame.

-  ```python
+  ```examples/phone_to_so100/teleoperate.py
  kinematics_solver = RobotKinematics(
    urdf_path="./SO101/so101_new_calib.urdf",
    target_frame_name="gripper_frame_link",
@@ -118,7 +114,7 @@ Additionally you can customize mapping or safety limits by editing the processor

 - The `MapPhoneActionToRobotAction` step converts the calibrated phone pose and inputs into target deltas and gripper commands. The step outputs the following:

-  ```python
+  ```src/lerobot/teleoperators/phone/phone_processor.py
  action["enabled"] = enabled
        action["target_x"] = -pos[1] if enabled else 0.0
        action["target_y"] = pos[0] if enabled else 0.0
@@ -131,7 +127,7 @@ Additionally you can customize mapping or safety limits by editing the processor

 - The `EEReferenceAndDelta` step converts target deltas to an absolute desired EE pose, storing a reference on enable. The `end_effector_step_sizes` are the step sizes for the EE pose and can be modified to change the motion speed.

-  ```python
+  ```examples/phone_to_so100/teleoperate.py
  EEReferenceAndDelta(
      kinematics=kinematics_solver,
      end_effector_step_sizes={"x": 0.5, "y": 0.5, "z": 0.5},
@@ -142,7 +138,7 @@ Additionally you can customize mapping or safety limits by editing the processor

 - The `EEBoundsAndSafety` step clamps EE motion to a workspace and checks for large EE step jumps to ensure safety. The `end_effector_bounds` are the bounds for the EE pose and can be modified to change the workspace. The `max_ee_step_m` value is the per-step limit for the EE pose and can be modified to change the safety limits.

-  ```python
+  ```examples/phone_to_so100/teleoperate.py
  EEBoundsAndSafety(
      end_effector_bounds={"min": [-1.0, -1.0, -1.0], "max": [1.0, 1.0, 1.0]},
      max_ee_step_m=0.10,
@@ -151,7 +147,7 @@ Additionally you can customize mapping or safety limits by editing the processor

 - The `GripperVelocityToJoint` step turns a velocity‑like gripper input into absolute gripper position using the current measured state. The `speed_factor` is the factor by which the velocity is multiplied.

-  ```python
+  ```examples/phone_to_so100/teleoperate.py
  GripperVelocityToJoint(speed_factor=20.0)
  ```

@@ -161,7 +157,7 @@ We use different IK initial guesses in the kinematic steps. As initial guess eit

 - Closed loop (used in record/eval): sets `initial_guess_current_joints=True` so IK starts from the measured joints each frame.

-  ```python
+  ```examples/phone_to_so100/record.py
  InverseKinematicsEEToJoints(
      kinematics=kinematics_solver,
      motor_names=list(robot.bus.motors.keys()),
@@ -171,7 +167,7 @@ We use different IK initial guesses in the kinematic steps. As initial guess eit

 - Open loop (used in replay): sets `initial_guess_current_joints=False` so IK continues from the previous IK solution rather than the measured state. This preserves action stability when we replay without feedback.

-  ```python
+  ```examples/phone_to_so100/replay.py
  InverseKinematicsEEToJoints(
      kinematics=kinematics_solver,
      motor_names=list(robot.bus.motors.keys()),
@@ -6,12 +6,6 @@

 π₀ represents a breakthrough in robotics as the first general-purpose robot foundation model developed by [Physical Intelligence](https://www.physicalintelligence.company/blog/pi0). Unlike traditional robot programs that are narrow specialists programmed for repetitive motions, π₀ is designed to be a generalist policy that can understand visual inputs, interpret natural language instructions, and control a variety of different robots across diverse tasks.

-<img
-  src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/lerobot-pi0%20(1).png"
-  alt="An overview of Pi0"
-  width="85%"
-/>
-
 ### The Vision for Physical Intelligence

 As described by Physical Intelligence, while AI has achieved remarkable success in digital domains, from chess-playing to drug discovery, human intelligence still dramatically outpaces AI in the physical world. To paraphrase Moravec's paradox, winning a game of chess represents an "easy" problem for AI, but folding a shirt or cleaning up a table requires solving some of the most difficult engineering problems ever conceived. π₀ represents a first step toward developing artificial physical intelligence that enables users to simply ask robots to perform any task they want, just like they can with large language models.
@@ -34,6 +28,11 @@ As described by Physical Intelligence, while AI has achieved remarkable success
   pip install -e ".[pi]"
   ```

+   > [!NOTE]
+   > For lerobot 0.4.0, to install the `pi` extra you have to run: `pip install "lerobot[pi]@git+https://github.com/huggingface/lerobot.git"`.
+   >
+   > This will be fixed in the next patch release.
+
 ## Training Data and Capabilities

 π₀ is trained on the largest robot interaction dataset to date, combining three key data sources:
@@ -55,7 +54,7 @@ policy.type=pi0
 For training π₀, you can use the standard LeRobot training script with the appropriate configuration:

 ```bash
-lerobot-train \
+python src/lerobot/scripts/lerobot_train.py \
    --dataset.repo_id=your_dataset \
    --policy.type=pi0 \
    --output_dir=./outputs/pi0_training \
@@ -65,8 +64,6 @@ lerobot-train \
    --policy.compile_model=true \
    --policy.gradient_checkpointing=true \
    --policy.dtype=bfloat16 \
-    --policy.freeze_vision_encoder=false \
-    --policy.train_expert_only=false \
    --steps=3000 \
    --policy.device=cuda \
    --batch_size=32
@@ -82,15 +79,6 @@ lerobot-train \
  - [lerobot/pi0_base](https://huggingface.co/lerobot/pi0_base)
  - [lerobot/pi0_libero](https://huggingface.co/lerobot/pi0_libero) (specifically trained on the Libero dataset)

-### Training Parameters Explained
-
-| Parameter               | Default | Description                                 |
-| ----------------------- | ------- | ------------------------------------------- |
-| `freeze_vision_encoder` | `false` | Do not freeze the vision encoder            |
-| `train_expert_only`     | `false` | Do not freeze the VLM, train all parameters |
-
-**💡 Tip**: Setting `train_expert_only=true` freezes the VLM and trains only the action expert and projections, allowing finetuning with reduced memory usage.
-
 ## License

 This model follows the **Apache 2.0 License**, consistent with the original [OpenPI repository](https://github.com/Physical-Intelligence/openpi).
@@ -36,6 +36,11 @@ This diverse training mixture creates a "curriculum" that enables generalization
   pip install -e ".[pi]"
   ```

+   > [!NOTE]
+   > For lerobot 0.4.0, to install the `pi` extra you have to run: `pip install "lerobot[pi]@git+https://github.com/huggingface/lerobot.git"`.
+   >
+   > This will be fixed in the next patch release.
+
 ## Usage

 To use π₀.₅ in your LeRobot configuration, specify the policy type as:
@@ -51,7 +56,7 @@ policy.type=pi05
 Here's a complete training command for finetuning the base π₀.₅ model on your own dataset:

 ```bash
-lerobot-train \
+python src/lerobot/scripts/lerobot_train.py \
    --dataset.repo_id=your_dataset \
    --policy.type=pi05 \
    --output_dir=./outputs/pi05_training \
@@ -62,8 +67,6 @@ lerobot-train \
    --policy.gradient_checkpointing=true \
    --wandb.enable=true \
    --policy.dtype=bfloat16 \
-    --policy.freeze_vision_encoder=false \
-    --policy.train_expert_only=false \
    --steps=3000 \
    --policy.device=cuda \
    --batch_size=32
@@ -79,15 +82,6 @@ lerobot-train \
  - [lerobot/pi05_base](https://huggingface.co/lerobot/pi05_base)
  - [lerobot/pi05_libero](https://huggingface.co/lerobot/pi05_libero) (specifically trained on the Libero dataset)

-### Training Parameters Explained
-
-| Parameter               | Default | Description                                 |
-| ----------------------- | ------- | ------------------------------------------- |
-| `freeze_vision_encoder` | `false` | Do not freeze the vision encoder            |
-| `train_expert_only`     | `false` | Do not freeze the VLM, train all parameters |
-
-**💡 Tip**: Setting `train_expert_only=true` freezes the VLM and trains only the action expert and projections, allowing finetuning with reduced memory usage.
-
 If your dataset is not converted with `quantiles`, you can convert it with the following command:

 ```bash
@@ -1,241 +0,0 @@
-# π₀-FAST (Pi0-FAST)
-
-π₀-FAST is a **Vision-Language-Action model for general robot control** that uses autoregressive next-token prediction to model continuous robot actions.
-
-## Model Overview
-
-π₀-FAST combines the power of Vision-Language Models with a novel action tokenization approach called **FAST (Frequency-space Action Sequence Tokenization)**. This enables training autoregressive VLAs on highly dexterous tasks that are impossible with standard binning-based discretization, while training **up to 5x faster** than diffusion-based approaches like π₀.
-
-<img
-  src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/lerobot-pifast.png"
-  alt="An overview of Pi0-FAST"
-  width="85%"
-/>
-
-### Why FAST?
-
-Standard approaches for robot action tokenization use simple per-dimension, per-timestep binning schemes. While passable for simple behaviors, this rapidly breaks down for complex and dexterous skills that require precision and high-frequency control.
-
-FAST solves this by compressing action sequences using signal processing techniques, resulting in a dense sequence of action tokens that can be predicted autoregressively—just like language tokens.
-
-### How FAST Tokenization Works
-
-The FAST tokenizer compresses action sequences through the following steps:
-
-1. **Normalize**: Take a continuous action chunk of shape `(H, D)` where `H` is the horizon and `D` is the action dimension. Normalize using one of the supported normalization methods (Quantiles recommended to handle outliers).
-
-2. **Discrete Cosine Transform (DCT)**: Apply DCT (via scipy) to each action dimension separately. DCT is a compression algorithm commonly used in image and audio codecs (JPEG, MP3).
-
-3. **Quantization**: Round and remove insignificant coefficients for each action dimension, producing a sparse frequency matrix.
-
-4. **Flatten**: Flatten the matrix into a 1D vector, with low-frequency components first.
-
-5. **Byte Pair Encoding (BPE)**: Train a BPE tokenizer to compress the DCT coefficients into dense action tokens, typically achieving **10x compression** over prior tokenization approaches.
-
-This approach can transform **any existing VLM** into a VLA by training it to predict these FAST tokens.
-
-## Installation Requirements
-
-1. Install LeRobot by following our [Installation Guide](./installation).
-2. Install π₀-FAST dependencies by running:
-
-   ```bash
-   pip install -e ".[pi]"
-   ```
-
-## Training a Custom FAST Tokenizer
-
-You have two options for the FAST tokenizer:
-
-1. **Use the pre-trained tokenizer**: The `lerobot/fast-action-tokenizer` tokenizer was trained on 1M+ real robot action sequences and works as a general-purpose tokenizer.
-
-2. **Train your own tokenizer**: For maximum performance on your specific dataset, you can finetune the tokenizer on your own data.
-
-### Training Your Own Tokenizer
-
-```bash
-lerobot-train-tokenizer \
-    --repo_id "user/my-lerobot-dataset" \
-    --action_horizon 10 \
-    --encoded_dims "0:6" \
-    --vocab_size 1024 \
-    --scale 10.0 \
-    --normalization_mode QUANTILES \
-    --output_dir "./my_fast_tokenizer" \
-    --push_to_hub \
-    --hub_repo_id "username/my-action-tokenizer"
-```
-
-### Key Tokenizer Parameters
-
-| Parameter              | Description                                                                       | Default      |
-| ---------------------- | --------------------------------------------------------------------------------- | ------------ |
-| `--repo_id`            | LeRobot dataset repository ID                                                     | Required     |
-| `--action_horizon`     | Number of future actions in each chunk                                            | `10`         |
-| `--encoded_dims`       | Comma-separated dimension ranges to encode (e.g., `"0:6,7:23"`)                   | `"0:6,7:23"` |
-| `--vocab_size`         | BPE vocabulary size                                                               | `1024`       |
-| `--scale`              | DCT scaling factor for quantization                                               | `10.0`       |
-| `--normalization_mode` | Normalization mode (`MEAN_STD`, `MIN_MAX`, `QUANTILES`, `QUANTILE10`, `IDENTITY`) | `QUANTILES`  |
-| `--sample_fraction`    | Fraction of chunks to sample per episode                                          | `0.1`        |
-
-## Usage
-
-To use π₀-FAST in LeRobot, specify the policy type as:
-
-```python
-policy.type=pi0_fast
-```
-
-## Training
-
-For training π₀-FAST, you can use the LeRobot training script:
-
-```bash
-lerobot-train \
-    --dataset.repo_id=your_dataset \
-    --policy.type=pi0_fast \
-    --output_dir=./outputs/pi0fast_training \
-    --job_name=pi0fast_training \
-    --policy.pretrained_path=lerobot/pi0_fast_base \
-    --policy.dtype=bfloat16 \
-    --policy.gradient_checkpointing=true \
-    --policy.chunk_size=10 \
-    --policy.n_action_steps=10 \
-    --policy.max_action_tokens=256 \
-    --steps=100000 \
-    --batch_size=4 \
-    --policy.device=cuda
-```
-
-### Key Training Parameters
-
-| Parameter                              | Description                                        | Default                         |
-| -------------------------------------- | -------------------------------------------------- | ------------------------------- |
-| `--policy.gradient_checkpointing=true` | Reduces memory usage significantly during training | `false`                         |
-| `--policy.dtype=bfloat16`              | Use mixed precision training for efficiency        | `float32`                       |
-| `--policy.chunk_size`                  | Number of action steps to predict (action horizon) | `50`                            |
-| `--policy.n_action_steps`              | Number of action steps to execute                  | `50`                            |
-| `--policy.max_action_tokens`           | Maximum number of FAST tokens per action chunk     | `256`                           |
-| `--policy.action_tokenizer_name`       | FAST tokenizer to use                              | `lerobot/fast-action-tokenizer` |
-| `--policy.compile_model=true`          | Enable torch.compile for faster training           | `false`                         |
-
-## Inference
-
-### KV-Caching for Fast Inference
-
-π₀-FAST supports **KV-caching**, a widely used optimization in LLM inference. This caches the key-value pairs from the attention mechanism, avoiding redundant computation during autoregressive decoding.
-
-```python
-# KV-caching is enabled by default
-policy.use_kv_cache=true
-```
-
-### Inference Example
-
-```python
-from lerobot.policies.pi0_fast import PI0FastPolicy, PI0FastConfig
-
-# Load the policy
-policy = PI0FastPolicy.from_pretrained("your-model-path")
-
-# During inference
-actions = policy.predict_action_chunk(batch)
-```
-
-## Model Architecture
-
-π₀-FAST uses a PaliGemma-based architecture:
-
- **Vision Encoder**: SigLIP vision tower for image understanding
- **Language Model**: Gemma 2B for processing language instructions and predicting action tokens
-
-The model takes images, text instructions, and robot state as input, and outputs discrete FAST tokens that are decoded back to continuous actions.
-
-## Configuration Options
-
-| Parameter            | Description                                     | Default    |
-| -------------------- | ----------------------------------------------- | ---------- |
-| `paligemma_variant`  | VLM backbone variant (`gemma_300m`, `gemma_2b`) | `gemma_2b` |
-| `max_state_dim`      | Maximum state vector dimension (padded)         | `32`       |
-| `max_action_dim`     | Maximum action vector dimension (padded)        | `32`       |
-| `temperature`        | Sampling temperature (0.0 for greedy)           | `0.0`      |
-| `max_decoding_steps` | Maximum decoding steps                          | `256`      |
-| `use_kv_cache`       | Enable KV caching for faster inference          | `true`     |
-
-## Comparison with π₀
-
-| Feature               | π₀                        | π₀-FAST                      |
-| --------------------- | ------------------------- | ---------------------------- |
-| Action Representation | Flow Matching (Diffusion) | Autoregressive Tokens (FAST) |
-| Training Speed        | 1x                        | **5x faster**                |
-| Dexterity             | High                      | High                         |
-| Inference Method      | Iterative Denoising       | Autoregressive Decoding      |
-| KV-Caching            | N/A                       | Supported                    |
-
-## Reproducing π₀Fast results
-
-We reproduce the results of π₀Fast on the LIBERO benchmark using the LeRobot implementation. We take the LeRobot PiFast base model [lerobot/pi0fast-base](https://huggingface.co/lerobot/pi0fast-base) and finetune for an additional 40kk steps in bfloat16, with batch size of 256 on 8 H100 GPUs using the [HuggingFace LIBERO dataset](https://huggingface.co/datasets/HuggingFaceVLA/libero).
-
-The finetuned model can be found here:
-
- **π₀Fast LIBERO**: [lerobot/pi0fast-libero](https://huggingface.co/lerobot/pi0fast-libero)
-
-With the following training command:
-
-```bash
-lerobot-train \
-  --dataset.repo_id=lerobot/libero \
-  --output_dir=outputs/libero_pi0fast \
-  --job_name=libero_pi0fast \
-  --policy.path=lerobot/pi0fast_base \
-  --policy.dtype=bfloat16 \
-  --steps=100000 \
-  --save_freq=20000 \
-  --batch_size=4 \
-  --policy.device=cuda \
-  --policy.scheduler_warmup_steps=4000 \
-  --policy.scheduler_decay_steps=100000 \
-  --policy.scheduler_decay_lr=1e-5 \
-  --policy.gradient_checkpointing=true \
-  --policy.chunk_size=10 \
-  --policy.n_action_steps=10 \
-  --policy.max_action_tokens=256 \
-  --policy.empty_cameras=1 \
-```
-
-We then evaluate the finetuned model using the LeRobot LIBERO implementation, by running the following command:
-
-```bash
-tasks="libero_object,libero_spatial,libero_goal,libero_10"
-lerobot-eval \
-  --policy.path=lerobot/pi0fast-libero \
-  --policy.max_action_tokens=256 \
-  --env.type=libero \
-  --policy.gradient_checkpointing=false \
-  --env.task=${tasks} \
-  --eval.batch_size=1 \
-  --eval.n_episodes=1 \
-  --rename_map='{"observation.images.image":"observation.images.base_0_rgb","observation.images.image2":"observation.images.left_wrist_0_rgb"}'
-```
-
-**Note:** We set `n_action_steps=10`, similar to the original OpenPI implementation.
-
-### Results
-
-We obtain the following results on the LIBERO benchmark:
-
-| Model       | LIBERO Spatial | LIBERO Object | LIBERO Goal | LIBERO 10 | Average  |
-| ----------- | -------------- | ------------- | ----------- | --------- | -------- |
-| **π₀-fast** | 70.0           | 100.0         | 100.0       | 60.0      | **82.5** |
-
-The full evaluation output folder, including videos, is available [here](https://drive.google.com/drive/folders/1HXpwPTRm4hx6g1sF2P7OOqGG0TwPU7LQ?usp=sharing)
-
-## License
-
-This model follows the **Apache 2.0 License**, consistent with the original [OpenPI repository](https://github.com/Physical-Intelligence/openpi).
-
-## References
-
- [FAST: Efficient Robot Action Tokenization](https://www.physicalintelligence.company/research/fast) - Physical Intelligence Blog
- [OpenPI Repository](https://github.com/Physical-Intelligence/openpi) - Original implementation
- [FAST Tokenizer on Hugging Face](https://huggingface.co/physical-intelligence/fast) - Pre-trained tokenizer
@@ -1,45 +0,0 @@
-# WALL-OSS
-
-This repository contains the Hugging Face port of [**WALL-OSS**](https://x2robot.com/en/research/68bc2cde8497d7f238dde690), a Vision-Language-Action model for cross-embodiment robotic control based on Qwen2.5-VL with flow matching/FAST action prediction.
-
---
-
-## Model Overview
-
-| Feature            | Description                                           |
-| ------------------ | ----------------------------------------------------- |
-| Base Model         | Qwen2.5-VL (Vision-Language Model)                    |
-| Action Prediction  | Flow Matching (diffusion) or FAST (discrete tokens)   |
-| Architecture       | Mixture of Experts (MoE) with action-specific routing |
-| Multi-Modal Inputs | Vision (images/videos), Language, Proprioception      |
-
---
-
-## Additional Resources
-
-Paper: https://arxiv.org/pdf/2509.11766
-
-Official Repository: https://github.com/X-Square-Robot/wall-x
-
-Hugging Face: https://huggingface.co/x-square-robot
-
---
-
-## Citation
-
-If you use this work, please cite:
-
-```bibtex
-@article{zhai2025igniting,
-    title   = {Igniting VLMs Toward the Embodied Space},
-    author  = {Zhai, Andy and Liu, Brae and Fang, Bruno and Cai, Chalse and Ma, Ellie and Yin, Ethan and Wang, Hao and Zhou, Hugo and Wang, James and Shi, Lights and Liang, Lucy and Wang, Make and Wang, Qian and Gan, Roy and Yu, Ryan and Li, Shalfun and Liu, Starrick and Chen, Sylas and Chen, Vincent and Xu, Zach},
-    journal = {arXiv preprint arXiv:2509.11766},
-    year    = {2025}
-}
-```
-
---
-
-## License
-
-This model follows the **Apache 2.0 License**, consistent with the original [WallX repository](https://github.com/X-Square-Robot/wall-x).
@@ -30,7 +30,7 @@ Each of these pipelines handle different conversions between different action an

 Below is an example of the three pipelines that we use in the phone to SO-100 follower examples:

-```python
+```69:90:examples/phone_so100_record.py
 phone_to_robot_ee_pose_processor = RobotProcessorPipeline[RobotAction, RobotAction]( # teleop -> dataset action
    steps=[
        MapPhoneActionToRobotAction(platform=teleop_config.phone_os),
@@ -84,7 +84,7 @@ Dataset features are determined by the keys saved in the dataset. Each step can

 Below is an example of how we declare features with the `transform_features` method in the phone to SO-100 follower examples:

-```python
+```src/lerobot/robots/so100_follower/robot_kinematic_processor.py
    def transform_features(
        self, features: dict[PipelineFeatureType, dict[str, PolicyFeature]]
    ) -> dict[PipelineFeatureType, dict[str, PolicyFeature]]:
@@ -103,7 +103,7 @@ Here we declare what PolicyFeatures we modify in this step, so we know what feat

 Below is an example of how we aggregate and merge features in the phone to SO-100 record example:

-```python
+```121:145:examples/phone_so100_record.py
 features=combine_feature_dicts(
        # Run the feature contract of the pipelines
        # This tells you how the features would look like after the pipeline steps
@@ -38,7 +38,6 @@ docker run --rm -it \
  start_rviz:=true start_sdk_server:=true mujoco:=true
 ```

-> [!NOTE]
 > If MuJoCo runs slowly (low simulation frequency), append `-e LD_LIBRARY_PATH="/opt/host-libs:$LD_LIBRARY_PATH" \` to the previous command to improve performance:
 >
 > ```
@@ -142,7 +141,7 @@ If you choose this option but still want to use the VR teleoperation application
 First add reachy2 and reachy2_teleoperator to the imports of the record script. Then you can use the following command:

 ```bash
-lerobot-record \
+python -m lerobot.record \
    --robot.type=reachy2 \
    --robot.ip_address=192.168.0.200 \
    --robot.id=r2-0000 \
@@ -151,7 +150,6 @@ lerobot-record \
    --teleop.type=reachy2_teleoperator \
    --teleop.ip_address=192.168.0.200 \
    --teleop.with_mobile_base=false \
-    --robot.with_torso_camera=true \
    --dataset.repo_id=pollen_robotics/record_test \
    --dataset.single_task="Reachy 2 recording test" \
    --dataset.num_episodes=1 \
@@ -159,9 +157,6 @@ lerobot-record \
    --dataset.fps=15 \
    --dataset.push_to_hub=true \
    --dataset.private=true \
-    --dataset.streaming_encoding=true \
-    --dataset.encoder_threads=2 \
-    # --dataset.vcodec=auto \
    --display_data=true
 ```

@@ -170,7 +165,7 @@ lerobot-record \
 **Extended setup overview (all options included):**

 ```bash
-lerobot-record \
+python -m lerobot.record \
    --robot.type=reachy2 \
    --robot.ip_address=192.168.0.200 \
    --robot.use_external_commands=true \
@@ -182,8 +177,6 @@ lerobot-record \
    --robot.with_left_teleop_camera=true \
    --robot.with_right_teleop_camera=true \
    --robot.with_torso_camera=false \
-    --robot.camera_width=640 \
-    --robot.camera_height=480 \
    --robot.disable_torque_on_disconnect=false \
    --robot.max_relative_target=5.0 \
    --teleop.type=reachy2_teleoperator \
@@ -201,9 +194,6 @@ lerobot-record \
    --dataset.fps=15 \
    --dataset.push_to_hub=true \
    --dataset.private=true \
-    --dataset.streaming_encoding=true \
-    --dataset.encoder_threads=2 \
-    # --dataset.vcodec=auto \
    --display_data=true
 ```

@@ -222,10 +212,9 @@ Must be set to true if a compliant Reachy 2 is used to control another one.
 From our initial tests, recording **all** joints when only some are moving can reduce model quality with certain policies.
 To avoid this, you can exclude specific parts from recording and replay using:

-```bash
+````
 --robot.with_<part>=false
-```
-
+```,
 with `<part>` being one of: `mobile_base`, `l_arm`, `r_arm`, `neck`, `antennas`.
 It determines whether the corresponding part is recorded in the observations. Defaults to true if not set.

@@ -233,60 +222,49 @@ By default, **all parts are recorded**.

 The same per-part mechanism is available in `reachy2_teleoperator` as well.

-```bash
--teleop.with\_<part>
-```
+````

+--teleop.with\_<part>
+
+```
 with `<part>` being one of: `mobile_base`, `l_arm`, `r_arm`, `neck`, `antennas`.
 It determines whether the corresponding part is recorded in the actions. Defaults to true if not set.

 > **Important:** In a given session, the **enabled parts must match** on both the robot and the teleoperator.
-> For example, if the robot runs with `--robot.with_mobile_base=false`, the teleoperator must disable the same part `--teleoperator.with_mobile_base=false`.
+For example, if the robot runs with `--robot.with_mobile_base=false`, the teleoperator must disable the same part `--teleoperator.with_mobile_base=false`.

 ##### Use the relevant cameras

-You can do the same for **cameras**. Enable or disable each camera with default parameters using:
+You can do the same for **cameras**. By default, only the **teleoperation cameras** are recorded (both `left_teleop_camera` and `right_teleop_camera`). Enable or disable each camera with:

-```bash
--robot.with_left_teleop_camera=<true|false> \
--robot.with_right_teleop_camera=<true|false> \
+```
+
+--robot.with_left_teleop_camera=<true|false>
+--robot.with_right_teleop_camera=<true|false>
 --robot.with_torso_camera=<true|false>
-```

-By default, no camera is recorded, all camera arguments are set to `false`.
-If you want to, you can use custom `width` and `height` parameters for Reachy 2's cameras using the `--robot.camera_width` & `--robot.camera_height` argument:
+````

-```bash
--robot.camera_width=1920 \
--robot.camera_height=1080
-```
-
-This will change the resolution of all 3 default robot cameras (enabled by the above bool arguments).
-
-If you want, you can add additional cameras other than the ones in the robot as usual with:
-
-```bash
--robot.cameras="{ extra: {type: opencv, index_or_path: 42, width: 640, height: 480, fps: 30}}" \
-```

 ## Step 2: Replay

 Make sure the robot is configured with the same parts as the dataset:

 ```bash
-lerobot-replay \
+python -m lerobot.replay \
    --robot.type=reachy2 \
    --robot.ip_address=192.168.0.200 \
    --robot.use_external_commands=false \
    --robot.with_mobile_base=false \
    --dataset.repo_id=pollen_robotics/record_test \
    --dataset.episode=0
-```
+    --display_data=true
+````

 ## Step 3: Train

 ```bash
-lerobot-train \
+python -m lerobot.scripts.train \
  --dataset.repo_id=pollen_robotics/record_test \
  --policy.type=act \
  --output_dir=outputs/train/reachy2_test \
@@ -299,9 +277,10 @@ lerobot-train \
 ## Step 4: Evaluate

 ```bash
-lerobot-eval \
+python -m lerobot.record \
  --robot.type=reachy2 \
  --robot.ip_address=192.168.0.200 \
+  --display_data=false \
  --dataset.repo_id=pollen_robotics/eval_record_test \
  --dataset.single_task="Evaluate reachy2 policy" \
  --dataset.num_episodes=10 \
@@ -1,592 +0,0 @@
-# SARM: Stage-Aware Reward Modeling
-
-SARM (Stage-Aware Reward Modeling) is a video-based reward modeling framework for long-horizon robot manipulation tasks. This guide covers how to train SARM reward models and optionally use them with Reward-Aligned Behavior Cloning (RA-BC).
-
-**Paper**: [SARM: Stage-Aware Reward Modeling for Long Horizon Robot Manipulation](https://arxiv.org/abs/2509.25358)
-
-<img
-  src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/lerobot-sarm.png"
-  alt="An overview of SARM"
-  width="80%"
-/>
-
-## Why Reward Models?
-
-Standard behavior cloning treats all demonstration frames equally, but real-world robot datasets are messy. They contain hesitations, corrections, and variable-quality trajectories. Reward models solve this by learning a generalizable notion of **task progress** from demonstrations: given video frames and a task description, they predict how close the robot is to completing the task (0→1). This learned "progress signal" can be used in multiple ways, two promising applications are: (1) **weighted imitation learning** (RA-BC), where high-progress frames receive more weight during policy training, and (2) **reinforcement learning**, where the reward model provides dense rewards for online or offline policy improvement.
-
-## Overview
-
-SARM has the following features:
-
-1. **Stage-aware architecture**: Jointly predicts the high-level task stage and fine-grained progress within each stage
-2. **Subtask annotations**: Uses natural language subtask annotations to derive consistent progress labels
-3. **Temporal proportions**: Computes dataset-level priors (α̅\_k) for each subtask to normalize progress across variable-length demonstrations
-
-SARM trains on a compact **stage+tau** target for each frame:
-
- **stage**: integer stage index `k ∈ {0, ..., K-1}`
- **τ (tau)**: within-stage progress `τ ∈ [0, 1]`
- **target encoding**: `y = k + τ` (this is what the dataset processor produces)
-
-At inference time (and in downstream RA-BC), SARM converts the raw `k + τ` value into a **normalized progress** in `[0, 1]` using dataset-level **temporal proportions** `α̅_k` (stored in `meta/temporal_proportions_*.json`).
-
-This matches **Formula (2)** from the paper:
-
-```
-progress_t = P_{k-1} + α̅_k × τ_t
-```
-
-Where:
-
- `τ_t = (t - s_k) / (e_k - s_k)` is within-subtask normalized time
- `P_{k-1}` is cumulative prior (sum of previous subtask proportions)
- `α̅_k` is the temporal proportion for subtask k
-
-This ensures identical task states map to consistent progress values, even across demonstrations of different lengths.
-
-## Inputs and Targets (What the new code expects)
-
-SARM is trained through its processor (`src/lerobot/policies/sarm/processor_sarm.py`), which:
-
- **Encodes** images and task text with CLIP (ViT-B/32) into `video_features` and `text_features`
- **Pads/truncates** robot state into `state_features` (up to `max_state_dim`)
- **Builds targets** as `sparse_targets` (and `dense_targets` in `dense_only`/`dual`) using the stage+tau encoding `y = k + τ`
- **Masks rewind frames** using a per-sample `lengths` tensor (rewind is a training-time augmentation)
-
-At minimum, each training sample needs:
-
- `task` (string): task description
- `policy.image_key` images and `policy.state_key` states from the dataset
-
---
-
-## Annotation Modes
-
-You can choose from **3 annotation modes** that determine how progress labels are computed:
-
-| Mode           | Annotations Required | Heads                        | Use Case                                                     |
-| -------------- | -------------------- | ---------------------------- | ------------------------------------------------------------ |
-| `single_stage` | None                 | Sparse only                  | Simple tasks, quick experiments, no VLM needed               |
-| `dense_only`   | Dense (VLM)          | Dual (sparse auto-generated) | Detailed subtask tracking without defining high-level stages |
-| `dual`         | Sparse + Dense (VLM) | Dual                         | Full SARM paper setup with both granularities                |
-
-### Mode Details
-
-<hfoptions id="mode_explanation">
-<hfoption id="single_stage">
-
-**No annotations required.** The entire episode is treated as a single stage called `"task"`, and progress is linear from 0 to 1 over the episode duration.
-
- **Sparse head**: 1 stage ("task"), linear progress
- **Dense head**: Not used
- **Best for**: Simple tasks, quick experiments, or when VLM annotation is not available
-
-## Set Up Your Environment
-
-1. Install LeRobot by following our [Installation Guide](./installation).
-2. Install SARM dependencies by running:
-
-```bash
-pip install -e ".[sarm]"
-```
-
-Workflow:
-
-```
-1. Train SARM → 2. Visualize predictions → 3. (Optional) Train policy with RA-BC
-```
-
-</hfoption>
-<hfoption id="dense_only">
-
-**Only dense (fine-grained) annotations from a VLM.** The sparse head automatically uses a single `"task"` stage covering the full episode, while the dense head learns detailed subtask progression.
-
- **Sparse head**: 1 stage ("task"), linear progress (auto-generated)
- **Dense head**: Multiple fine-grained stages from VLM annotations
- **Best for**: When you want detailed subtask tracking but don't need to define high-level stages
-
-Workflow:
-
-```
-1. Annotate (dense) → 2. Verify → 3. Train SARM → 4. Visualize → 5. (Optional) Train policy with RA-BC
-```
-
-</hfoption>
-<hfoption id="dual">
-
-**Both sparse and dense annotations from VLM.** Full dual-head mode as described in the SARM paper, with both high-level (sparse) and fine-grained (dense) stage predictions.
-
- **Sparse head**: High-level stages from VLM annotations
- **Dense head**: Fine-grained stages from VLM annotations
- **Best for**: Complex multi-stage tasks where both granularities are useful
-
-Workflow:
-
-```
-1. Annotate (sparse+dense) → 2. Verify → 3. Train SARM → 4. Visualize → 5. (Optional) Train policy with RA-BC
-```
-
-</hfoption>
-</hfoptions>
-
---
-
-## Step 1: Subtask Annotation
-
-<hfoptions id="annotation_mode">
-<hfoption id="single_stage">
-
-**No annotation required!** Skip this step entirely. The model will use the episode's task description and compute linear progress automatically.
-
-</hfoption>
-<hfoption id="dense_only">
-
-Generate **dense (fine-grained) annotations only** using a VLM. The sparse stage will be auto-generated.
-
-```bash
-python src/lerobot/data_processing/sarm_annotations/subtask_annotation.py \
-  --repo-id your-username/your-dataset \
-  --dense-only \
-  --dense-subtasks "Bring robot arms up from starting position,Grab near side and do 1st fold,Grab side and do 2nd fold,Grab side and do 3rd fold to finish folding" \
-  --video-key observation.images.base \
-  --num-workers 4 \
-  --push-to-hub
-```
-
-**What gets saved:**
-
- `meta/temporal_proportions_sparse.json` - Auto-generated sparse proportions (`{"task": 1.0}`)
- `meta/temporal_proportions_dense.json` - Dense temporal proportions
- Per-episode columns in `episodes/*.parquet`:
-  - `dense_subtask_names`, `dense_subtask_start_frames`, `dense_subtask_end_frames`
-  - (also time-based columns: `dense_subtask_start_times`, `dense_subtask_end_times`)
-
-</hfoption>
-<hfoption id="dual">
-
-Generate **both sparse (high-level) and dense (fine-grained) annotations** using a VLM.
-
-```bash
-python src/lerobot/data_processing/sarm_annotations/subtask_annotation.py \
-  --repo-id your-username/your-dataset \
-  --sparse-subtasks "Bring arms up from starting position,Fold the towel (3 folds in total)" \
-  --dense-subtasks "Bring robot arms up from starting position,Grab near side and do 1st fold,Grab side and do 2nd fold,Grab side and do 3rd fold to finish folding" \
-  --video-key observation.images.base \
-  --num-workers 4 \
-  --push-to-hub
-```
-
-**What gets saved:**
-
- `meta/temporal_proportions_sparse.json` - Sparse temporal proportions
- `meta/temporal_proportions_dense.json` - Dense temporal proportions
- Per-episode columns in `episodes/*.parquet`:
-  - `sparse_subtask_names`, `sparse_subtask_start_frames`, `sparse_subtask_end_frames`
-  - `dense_subtask_names`, `dense_subtask_start_frames`, `dense_subtask_end_frames`
-  - (also time-based columns: `*_subtask_start_times`, `*_subtask_end_times`)
-
-</hfoption>
-</hfoptions>
-
-### Annotation Arguments
-
-| Argument               | Description                                                                     |
-| ---------------------- | ------------------------------------------------------------------------------- |
-| `--repo-id`            | HuggingFace dataset repository ID                                               |
-| `--sparse-subtasks`    | Comma-separated list of high-level subtask names                                |
-| `--dense-subtasks`     | Comma-separated list of fine-grained subtask names                              |
-| `--dense-only`         | Generate only dense annotations (auto-creates sparse "task" stage)              |
-| `--video-key`          | Camera/video key to use (e.g., `observation.images.top`)                        |
-| `--num-workers`        | Number of parallel GPU workers (default: 1)                                     |
-| `--episodes`           | Specific episode indices to annotate (default: all)                             |
-| `--skip-existing`      | Skip episodes that already have annotations                                     |
-| `--model`              | VLM model (default: `Qwen/Qwen3-VL-30B-A3B-Instruct`)                           |
-| `--num-visualizations` | Number of episodes to visualize after annotation (default: 5, set to 0 to skip) |
-
-> **Note**: After annotation completes, 5 episodes are automatically visualized by default. Use `--num-visualizations 0` to skip this step.
-
---
-
-## Step 2: Verify Annotations
-
-<hfoptions id="verify_mode">
-<hfoption id="single_stage">
-
-**No verification needed!** Skip this step.
-
-</hfoption>
-<hfoption id="dense_only">
-
-Visualize annotations using the `--visualize-only` flag:
-
-```bash
-python src/lerobot/data_processing/sarm_annotations/subtask_annotation.py \
-  --repo-id your-username/your-dataset \
-  --visualize-only \
-  --visualize-type dense \
-  --num-visualizations 5 \
-  --video-key observation.images.base \
-  --output-dir ./subtask_viz
-```
-
-</hfoption>
-<hfoption id="dual">
-
-Visualize annotations using the `--visualize-only` flag:
-
-```bash
-python src/lerobot/data_processing/sarm_annotations/subtask_annotation.py \
-  --repo-id your-username/your-dataset \
-  --visualize-only \
-  --visualize-type both \
-  --num-visualizations 5 \
-  --video-key observation.images.base \
-  --output-dir ./subtask_viz
-```
-
-</hfoption>
-</hfoptions>
-
-This generates visualizations showing video frames with subtask boundaries overlaid and a timeline of subtasks.
-
-### Visualization Arguments
-
-| Argument               | Description                                                    |
-| ---------------------- | -------------------------------------------------------------- |
-| `--visualize-only`     | Only visualize existing annotations (no generation)            |
-| `--num-visualizations` | Number of episodes to visualize (default: 5)                   |
-| `--visualize-type`     | Type of annotations to visualize: `sparse`, `dense`, or `both` |
-
-**Tip**: If annotations are inaccurate, adjust your subtask descriptions to be more specific and re-run.
-
---
-
-## Step 3: Train SARM
-
-<hfoptions id="train_mode">
-<hfoption id="single_stage">
-
-Train with **no annotations** - uses linear progress from 0 to 1:
-
-```bash
-lerobot-train \
-  --dataset.repo_id=your-username/your-dataset \
-  --policy.type=sarm \
-  --policy.annotation_mode=single_stage \
-  --policy.image_key=observation.images.base \
-  --output_dir=outputs/train/sarm_single \
-  --batch_size=32 \
-  --steps=5000 \
-  --wandb.enable=true \
-  --wandb.project=sarm \
-  --policy.repo_id=your-username/your-model-name
-```
-
-</hfoption>
-<hfoption id="dense_only">
-
-Train with **dense annotations only** (sparse auto-generated):
-
-```bash
-lerobot-train \
-  --dataset.repo_id=your-username/your-dataset \
-  --policy.type=sarm \
-  --policy.annotation_mode=dense_only \
-  --policy.image_key=observation.images.base \
-  --output_dir=outputs/train/sarm_dense \
-  --batch_size=32 \
-  --steps=5000 \
-  --wandb.enable=true \
-  --wandb.project=sarm \
-  --policy.repo_id=your-username/your-model-name
-```
-
-</hfoption>
-<hfoption id="dual">
-
-Train with **both sparse and dense annotations**:
-
-```bash
-lerobot-train \
-  --dataset.repo_id=your-username/your-dataset \
-  --policy.type=sarm \
-  --policy.annotation_mode=dual \
-  --policy.image_key=observation.images.base \
-  --output_dir=outputs/train/sarm_dual \
-  --batch_size=32 \
-  --steps=5000 \
-  --wandb.enable=true \
-  --wandb.project=sarm \
-  --policy.repo_id=your-username/your-model-name
-```
-
-</hfoption>
-</hfoptions>
-
-### Multi-GPU Training
-
-To train on multiple GPUs, launch the training script through `accelerate launch --multi_gpu --num_processes=4`, as in the RA-BC multi-GPU command later in this guide.
-
-### Training Arguments
-
-| Argument                   | Description                                                       | Default                  |
-| -------------------------- | ----------------------------------------------------------------- | ------------------------ |
-| `--policy.annotation_mode` | `single_stage`, `dense_only`, or `dual`                           | `single_stage`           |
-| `--policy.image_key`       | Camera key for images                                             | `observation.images.top` |
-| `--policy.state_key`       | Key for joint states                                              | `observation.state`      |
-| `--policy.n_obs_steps`     | Observation history steps (total obs frames = `n_obs_steps + 1`)  | `8`                      |
-| `--policy.frame_gap`       | Gap (in frames) between sampled observations (at 30 fps: 30 ≈ 1s) | `30`                     |
-
---
-
-## Step 4: Visualize Predictions
-
-Use `compute_rabc_weights.py` with `--visualize-only` to visualize model predictions (and, if available, annotation-derived targets) without writing a parquet file.
-
-<hfoptions id="viz_mode">
-<hfoption id="single_stage">
-
-```bash
-python src/lerobot/policies/sarm/compute_rabc_weights.py \
-  --dataset-repo-id your-username/your-dataset \
-  --reward-model-path your-username/sarm-model \
-  --visualize-only \
-  --num-visualizations 5 \
-  --head-mode sparse \
-  --output-dir ./sarm_viz
-```
-
-</hfoption>
-<hfoption id="dense_only">
-
-```bash
-python src/lerobot/policies/sarm/compute_rabc_weights.py \
-  --dataset-repo-id your-username/your-dataset \
-  --reward-model-path your-username/sarm-model \
-  --visualize-only \
-  --num-visualizations 5 \
-  --head-mode dense \
-  --output-dir ./sarm_viz
-```
-
-</hfoption>
-<hfoption id="dual">
-
-```bash
-python src/lerobot/policies/sarm/compute_rabc_weights.py \
-  --dataset-repo-id your-username/your-dataset \
-  --reward-model-path your-username/sarm-model \
-  --visualize-only \
-  --num-visualizations 5 \
-  --head-mode both \
-  --output-dir ./sarm_viz
-```
-
-</hfoption>
-</hfoptions>
-
-The visualization shows:
-
- **Progress plot**: Predicted progress (and optional annotation-derived “GT” when available and `--stride 1`)
- **Stage probabilities**: Stacked area plot of predicted stage probabilities
- **Sample frames**: Key frames from the episode with progress/stage labels
-
-### Visualization Arguments
-
-| Argument               | Description                                               |
-| ---------------------- | --------------------------------------------------------- |
-| `--visualize-only`     | Only visualize predictions (no RABC computation)          |
-| `--num-visualizations` | Number of episodes to visualize (default: 5)              |
-| `--head-mode`          | SARM head to use: `sparse`, `dense`, or `both`            |
-| `--stride`             | Compute every N frames, interpolate the rest (default: 1) |
-
---
-
-## Step 5 (Optional): Train Policy with RA-BC
-
-Reward-Aligned Behavior Cloning (RA-BC) uses the trained SARM model to weight training samples based on predicted progress improvement. This requires two steps:
-
-1. **Precompute progress values** for all frames using the trained SARM model
-2. **Train policy** with RA-BC weighting using the precomputed values
-
-### How RA-BC Works
-
-For each training sample, RA-BC computes the progress delta:
-
-```
-r_i = φ(o_{t+Δ}) - φ(o_t)
-```
-
-Where `φ` is the SARM progress prediction and `Δ` is the policy's `chunk_size`. Samples with positive progress (good demonstrations) get higher weights, while samples with negative or zero progress get down-weighted.
-
-The weighting follows **Equations 8-9** from the paper:
-
- **Soft weight**: `w̃_i = clip((r_i − (μ − 2σ)) / (4σ + ε), 0, 1)`
- **Final weight**: `w_i = 𝟙{r_i > κ} + 𝟙{0 ≤ r_i ≤ κ} × w̃_i`
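
The two weighting equations can be written out in a few lines of plain Python. This is a sketch under stated assumptions, not the training script's implementation: `rabc_weights` is a hypothetical name, and `μ` and `σ` are taken here to be the mean and (population) standard deviation of the deltas passed in, which the text above does not specify.

```python
import statistics


def rabc_weights(deltas: list[float], kappa: float = 0.01, eps: float = 1e-6) -> list[float]:
    """Compute RA-BC sample weights from progress deltas r_i."""
    # Assumption: mu and sigma are statistics of the supplied deltas.
    mu = statistics.fmean(deltas)
    sigma = statistics.pstdev(deltas)
    weights = []
    for r in deltas:
        # Soft weight: clip((r - (mu - 2*sigma)) / (4*sigma + eps), 0, 1)
        soft = min(max((r - (mu - 2 * sigma)) / (4 * sigma + eps), 0.0), 1.0)
        if r > kappa:
            w = 1.0   # clearly progressing: full weight
        elif r >= 0:
            w = soft  # small non-negative progress: soft weight
        else:
            w = 0.0   # negative progress: both indicators are zero
        weights.append(w)
    return weights


print(rabc_weights([-0.02, 0.0, 0.005, 0.03]))
```

Samples with negative progress drop out of the loss entirely, while samples above `kappa` are treated exactly as in plain behavior cloning.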
-
-### Step 5a: Compute SARM Progress Values
-
-First, run the SARM model on all frames in your dataset to compute progress values:
-
-```bash
-python src/lerobot/policies/sarm/compute_rabc_weights.py \
-  --dataset-repo-id your-username/your-dataset \
-  --reward-model-path your-username/sarm-model \
-  --head-mode sparse \
-  --num-visualizations 5 \
-  --push-to-hub
-```
-
-This script:
-
- Processes all frames and computes progress values
- Saves progress values to a parquet file next to the dataset on disk (defaults to `<dataset_root>/sarm_progress.parquet`)
- Generates visualizations of the first N episodes (default: 5)
-
-**Arguments:**
-
-| Argument               | Description                                                    | Default    |
-| ---------------------- | -------------------------------------------------------------- | ---------- |
-| `--reward-model-path`  | Path to trained SARM model                                     | (required) |
-| `--head-mode`          | SARM head to use: `sparse`, `dense`, or `both`                 | `sparse`   |
-| `--device`             | Device for inference                                           | `cuda`     |
-| `--visualize-only`     | Only visualize predictions (no RA-BC computation)              | `false`    |
-| `--num-visualizations` | Number of episodes to visualize (default: 5, set to 0 to skip) | `5`        |
-
-**Output format** (`sarm_progress.parquet`):
-
-| Column            | Description                                    |
-| ----------------- | ---------------------------------------------- |
-| `index`           | Global frame index in dataset                  |
-| `episode_index`   | Episode number                                 |
-| `frame_index`     | Local frame index within episode               |
-| `progress_sparse` | Sparse head progress value [0, 1]              |
-| `progress_dense`  | Dense head progress value [0, 1] (if computed) |
-
-### Step 5b: Train Policy with RA-BC
-
-Once you have the progress file, train your policy with RA-BC weighting. The progress file is auto-detected from the dataset path (`sarm_progress.parquet`). Currently PI0, PI0.5 and SmolVLA are supported with RA-BC:
-
-```bash
-lerobot-train \
-  --dataset.repo_id=your-username/your-dataset \
-  --policy.type=pi0 \
-  --use_rabc=true \
-  --rabc_head_mode=sparse \
-  --rabc_kappa=0.01 \
-  --output_dir=outputs/train/policy_rabc \
-  --batch_size=32 \
-  --steps=40000
-```
-
-The training script automatically:
-
- Loads the precomputed progress values from the parquet file
- Uses the policy's `chunk_size` to compute progress deltas (Δ)
- Computes sample weights based on progress improvement
- Applies weighted loss during training
-
-**RA-BC Arguments:**
-
-| Argument               | Description                                                | Default                            |
-| ---------------------- | ---------------------------------------------------------- | ---------------------------------- |
-| `--use_rabc`           | Enable RA-BC sample weighting                              | `false`                            |
-| `--rabc_progress_path` | Path to progress parquet file (auto-detected from dataset) | `sarm_progress.parquet` in dataset |
-| `--rabc_head_mode`     | Which SARM head's progress to use: `sparse` or `dense`     | `sparse`                           |
-| `--rabc_kappa`         | Threshold κ for high-quality samples                       | `0.01`                             |
-
-### Tuning RA-BC Kappa
-
-The `kappa` parameter is the threshold that determines which samples get full weight (w=1). Understanding how to tune it is critical for RA-BC to work effectively.
-
-**How the weighting works:**
-
-| Condition           | Weight                  |
-| ------------------- | ----------------------- |
-| `delta > kappa`     | 1.0 (hard threshold)    |
-| `0 ≤ delta ≤ kappa` | Soft weight from Eq. 8  |
-| `delta < 0`         | 0.0 (negative progress) |
-
-**Diagnosing kappa issues:**
-
-Monitor these WandB metrics during training:
-
-| Metric             | Healthy Range | Problem Indicator         |
-| ------------------ | ------------- | ------------------------- |
-| `rabc_mean_weight` | 0.3 - 0.8     | ≈ 1.0 means kappa too low |
-| `rabc_delta_mean`  | > 0           | Should be positive        |
-| `rabc_delta_std`   | > 0           | Variance in data quality  |
-
-**If `rabc_mean_weight ≈ 1.0`:** Your kappa is too low. Most samples have `delta > kappa` and bypass the soft-weighting entirely. RA-BC becomes equivalent to vanilla BC.
-
-**Setting kappa based on your data:**
-
-The default `kappa=0.01` was tuned for the paper's T-shirt folding task (~90s episodes at 30fps). For your dataset, check the logged `rabc_delta_mean` and `rabc_delta_std`:
-
-```
-# If delta_mean ≈ 0.03 and delta_std ≈ 0.02:
-# Most deltas fall in range [0.01, 0.05]
-
-# Option 1: Set kappa = delta_mean (medium selectivity)
--rabc_kappa=0.03
-
-# Option 2: Set kappa = delta_mean + delta_std (high selectivity)
--rabc_kappa=0.05
-
-# Option 3: Set kappa = delta_mean + 2*delta_std (very selective)
--rabc_kappa=0.07
-```
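
If you want to estimate these statistics before launching a training run, the deltas can be approximated offline from per-episode progress values. The sketch below is illustrative only: both function names are hypothetical, frames within `chunk_size` of the episode end are simply skipped (how the training script treats them is not covered here), and the three options mirror the ones listed above.

```python
import statistics


def progress_deltas(progress: list[float], chunk_size: int) -> list[float]:
    """r_t = phi(o_{t+chunk_size}) - phi(o_t) for the frames of one episode."""
    # Assumption: frames without a partner chunk_size steps ahead are skipped.
    return [progress[t + chunk_size] - progress[t] for t in range(len(progress) - chunk_size)]


def kappa_candidates(deltas: list[float]) -> dict[str, float]:
    """Candidate kappa values derived from the delta mean and standard deviation."""
    mean = statistics.fmean(deltas)
    std = statistics.pstdev(deltas)
    return {
        "medium": mean,                  # Option 1
        "high": mean + std,              # Option 2
        "very_selective": mean + 2 * std,  # Option 3
    }


episode_progress = [0.0, 0.1, 0.2, 0.4]  # toy values for a single episode
deltas = progress_deltas(episode_progress, chunk_size=1)
print(kappa_candidates(deltas))
```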
-
-**When RA-BC may not help:**
-
-If your dataset is already high quality (consistent progress across all demonstrations), RA-BC won't provide much benefit since there's nothing to filter.
-
-### Multi-GPU Training with RA-BC
-
-```bash
-accelerate launch \
-  --multi_gpu \
-  --num_processes=4 \
-  src/lerobot/scripts/lerobot_train.py \
-  --dataset.repo_id=your-username/your-dataset \
-  --policy.type=pi0 \
-  --use_rabc=true \
-  --rabc_kappa=0.01 \
-  --output_dir=outputs/train/policy_rabc \
-  --batch_size=32 \
-  --steps=40000
-```
-
---
-
-## Tips & Best Practices
-
-### Choosing a Mode
-
- **Start with `single_stage`** for quick experiments - no annotation overhead
- Use **`dense_only`** when you want detailed progress tracking but tasks don't have clear high-level stages
- Use **`dual`** for complex tasks where both coarse and fine-grained progress is meaningful
-
-### Annotation Quality
-
-1. **Be specific with subtask names**: Instead of "fold", use "grab near side and fold toward center"
-2. **Verify with visualization**: Always check a few episodes before training
-3. **Consistent naming**: Use the same subtask names across all episodes
-
-### RA-BC
-
-1. **Train SARM first**: RA-BC quality depends entirely on SARM quality
-2. **Monitor `rabc_mean_weight`**: If it's ≈ 1.0, increase kappa (see [Tuning RA-BC Kappa](#tuning-ra-bc-kappa))
-
---
-
-## Citation
-
-```bibtex
-@article{chen2025sarm,
-  title={SARM: Stage-Aware Reward Modeling for Long Horizon Robot Manipulation},
-  author={Chen, Qianzhong and Yu, Justin and Schwager, Mac and Abbeel, Pieter and Shentu, Yide and Wu, Philipp},
-  journal={arXiv preprint arXiv:2509.25358},
-  year={2025}
-}
-```
@@ -106,9 +106,6 @@ lerobot-record \
  --dataset.repo_id=${HF_USER}/eval_DATASET_NAME_test \  # <- This will be the dataset name on HF Hub
  --dataset.episode_time_s=50 \
  --dataset.num_episodes=10 \
-  --dataset.streaming_encoding=true \
-  --dataset.encoder_threads=2 \
-  # --dataset.vcodec=auto \
  # <- Teleop optional if you want to teleoperate in between episodes \
  # --teleop.type=so100_leader \
  # --teleop.port=/dev/ttyACM0 \
@@ -103,7 +103,7 @@ lerobot-setup-motors \

 <!-- prettier-ignore-start -->
 ```python
-from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
+from lerobot.robots.so100_follower import SO100Follower, SO100FollowerConfig

 config = SO100FollowerConfig(
    port="/dev/tty.usbmodem585A0076841",
@@ -177,7 +177,7 @@ lerobot-setup-motors \

 <!-- prettier-ignore-start -->
 ```python
-from lerobot.teleoperators.so_leader import SO100Leader, SO100LeaderConfig
+from lerobot.teleoperators.so100_leader import SO100Leader, SO100LeaderConfig

 config = SO100LeaderConfig(
    port="/dev/tty.usbmodem585A0076841",
@@ -579,7 +579,7 @@ lerobot-calibrate \

 <!-- prettier-ignore-start -->
 ```python
-from lerobot.robots.so_follower import SO100FollowerConfig, SO100Follower
+from lerobot.robots.so100_follower import SO100FollowerConfig, SO100Follower

 config = SO100FollowerConfig(
    port="/dev/tty.usbmodem585A0076891",
@@ -617,7 +617,7 @@ lerobot-calibrate \

 <!-- prettier-ignore-start -->
 ```python
-from lerobot.teleoperators.so_leader import SO100LeaderConfig, SO100Leader
+from lerobot.teleoperators.so100_leader import SO100LeaderConfig, SO100Leader

 config = SO100LeaderConfig(
    port="/dev/tty.usbmodem58760431551",
@@ -1,18 +1,5 @@
 # SO-101

-<div style="display: flex; align-items: center; gap: 10px;">
-  <img
-    src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/SO101_Follower.webp"
-    alt="SO-101"
-    width="60%"
-  />
-  <img
-    src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/SO101_Leader.webp"
-    alt="SO-101"
-    width="60%"
-  />
-</div>
-
 In the steps below, we explain how to assemble our flagship robot, the SO-101.

 ## Source the parts
@@ -138,7 +125,7 @@ lerobot-setup-motors \

 <!-- prettier-ignore-start -->
 ```python
-from lerobot.robots.so_follower import SO101Follower, SO101FollowerConfig
+from lerobot.robots.so101_follower import SO101Follower, SO101FollowerConfig

 config = SO101FollowerConfig(
    port="/dev/tty.usbmodem585A0076841",
@@ -214,7 +201,7 @@ lerobot-setup-motors \

 <!-- prettier-ignore-start -->
 ```python
-from lerobot.teleoperators.so_leader import SO101Leader, SO101LeaderConfig
+from lerobot.teleoperators.so101_leader import SO101Leader, SO101LeaderConfig

 config = SO101LeaderConfig(
    port="/dev/tty.usbmodem585A0076841",
@@ -377,7 +364,7 @@ lerobot-calibrate \

 <!-- prettier-ignore-start -->
 ```python
-from lerobot.robots.so_follower import SO101FollowerConfig, SO101Follower
+from lerobot.robots.so101_follower import SO101FollowerConfig, SO101Follower

 config = SO101FollowerConfig(
    port="/dev/tty.usbmodem585A0076891",
@@ -426,7 +413,7 @@ lerobot-calibrate \

 <!-- prettier-ignore-start -->
 ```python
-from lerobot.teleoperators.so_leader import SO101LeaderConfig, SO101Leader
+from lerobot.teleoperators.so101_leader import SO101LeaderConfig, SO101Leader

 config = SO101LeaderConfig(
    port="/dev/tty.usbmodem58760431551",
@@ -1,155 +0,0 @@
-# Streaming Video Encoding Guide
-
-## 1. Overview
-
-Streaming video encoding eliminates the traditional PNG round-trip during video dataset recording. Instead of:
-
-1. Capture frame -> write PNG to disk -> (at episode end) read PNGs -> encode to MP4 -> delete PNGs
-
-Frames can be encoded in real-time during capture:
-
-1. Capture frame -> queue to encoder thread -> encode to MP4 directly
-
-This makes `save_episode()` near-instant (the video is already encoded by the time the episode ends) and removes the blocking wait that previously occurred between episodes, especially with multiple cameras in long episodes.
-
-## 2. Tuning Parameters
-
-| Parameter               | CLI Flag                          | Type          | Default       | Description                                                       |
-| ----------------------- | --------------------------------- | ------------- | ------------- | ----------------------------------------------------------------- |
-| `streaming_encoding`    | `--dataset.streaming_encoding`    | `bool`        | `True`        | Enable real-time encoding during capture                          |
-| `vcodec`                | `--dataset.vcodec`                | `str`         | `"libsvtav1"` | Video codec. `"auto"` detects best HW encoder                     |
-| `encoder_threads`       | `--dataset.encoder_threads`       | `int \| None` | `None` (auto) | Threads per encoder instance. `None` lets the codec decide        |
-| `encoder_queue_maxsize` | `--dataset.encoder_queue_maxsize` | `int`         | `60`          | Max buffered frames per camera (~2s at 30fps). Consumes RAM       |
-
-## 3. Performance Considerations
-
-Streaming encoding means the CPU is encoding video **during** the capture loop, not after. This creates a CPU budget that must be shared between:
-
- **Control loop** (reading cameras, controlling the robot, writing non-video data)
- **Encoder threads** (one pool per camera)
- **Rerun visualization** (if enabled)
- **OS and other processes**
-
-### Resolution & Number of Cameras Impact
-
-| Setup                       | Throughput (px/sec) | CPU Encoding Load | Notes                          |
-| --------------------------- | ------------------- | ----------------- | ------------------------------ |
-| 2 cams x 640x480x3 @30fps   | 55M                 | Low               | Works on most systems          |
-| 2 cams x 1280x720x3 @30fps  | 165M                | Moderate          | Comfortable on modern systems  |
-| 2 cams x 1920x1080x3 @30fps | 373M                | High              | Requires powerful high-end CPU |
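
The throughput column is the product of camera count, resolution, channel count, and frame rate. The hypothetical helper below reproduces that arithmetic so you can estimate the load for your own setup; it only counts raw values per second and says nothing about how a particular codec will behave.

```python
def throughput(num_cameras: int, width: int, height: int, channels: int = 3, fps: int = 30) -> int:
    """Raw values per second that the encoders must consume."""
    return num_cameras * width * height * channels * fps


for width, height in [(640, 480), (1280, 720), (1920, 1080)]:
    values_per_sec = throughput(2, width, height)
    print(f"2 cams x {width}x{height}x3 @30fps: {values_per_sec / 1e6:.1f}M")
```

The table values appear to be these products truncated to whole millions.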
-
-### `encoder_threads` Tuning
-
-This parameter controls how many threads each encoder instance uses internally:
-
- **Higher values** (e.g., 4-5): Faster encoding, but uses more CPU cores per camera. Good for high-end systems with many cores.
- **Lower values** (e.g., 1-2): Less CPU per camera, freeing cores for capture and visualization. Good for low-res images and capable CPUs.
- **`None` (default)**: Lets the codec decide. Information available in the codec logs.
-
-### Backpressure and Frame Dropping
-
-Each camera has a bounded queue (`encoder_queue_maxsize`, default 60 frames). When the encoder can't keep up:
-
-1. The queue fills up (consuming RAM)
-2. New frames are **dropped** (not blocked) — the capture loop continues uninterrupted
-3. A warning is logged: `"Encoder queue full for {camera}, dropped N frame(s)"`
-4. At episode end, total dropped frames per camera are reported
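
The drop-instead-of-block behavior described above can be illustrated with a bounded `queue.Queue` from the standard library. This is a minimal sketch of the idea, not LeRobot's encoder code: the `FrameQueue` class and its methods are hypothetical.

```python
import queue


class FrameQueue:
    """Bounded per-camera queue that drops new frames when full."""

    def __init__(self, maxsize: int = 60):
        self._queue: queue.Queue = queue.Queue(maxsize=maxsize)
        self.dropped = 0  # total dropped frames, reported at episode end

    def push(self, frame) -> bool:
        """Called from the capture loop; never blocks."""
        try:
            self._queue.put_nowait(frame)
            return True
        except queue.Full:
            # Encoder is behind: drop the frame so capture keeps its timing.
            self.dropped += 1
            return False

    def pop(self):
        """Called from the encoder thread; raises queue.Empty if nothing is queued."""
        return self._queue.get_nowait()


frames = FrameQueue(maxsize=2)
accepted = [frames.push(i) for i in range(5)]
print(accepted, frames.dropped)
```

Dropping keeps the control loop's timing intact at the cost of missing frames in the recording, which is why the dropped-frame count is worth checking after each episode.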
-
-### Symptoms of Encoder Falling Behind
-
- **System feels laggy and freezes**: all CPUs are at 100%
- **Dropped frame warnings** in the log or lower frames/FPS than expected in the recorded dataset
- **Choppy robot movement**: If CPU is severely overloaded, even the capture loop may be affected
- **Accumulated rerun lag**: Visualization falls behind real-time
-
-## 4. Hardware-Accelerated Encoding
-
-### When to Use
-
-Use HW encoding when:
-
- CPU is the bottleneck (dropped frames, choppy robot, rerun lag)
- You have compatible hardware (GPU or dedicated encoder)
- You're recording at high throughput (high resolution or with many cameras)
-
-### Choosing a Codec
-
-| Codec                 | CPU Usage | File Size      | Quality | Notes                                                            |
-| --------------------- | --------- | -------------- | ------- | ---------------------------------------------------------------- |
-| `libsvtav1` (default) | High      | Smallest       | Best    | Default. Best compression but most CPU-intensive                 |
-| `h264`                | Medium    | ~30-50% larger | Good    | Software H.264. Lower CPU                                        |
-| HW encoders           | Very Low  | Largest        | Good    | Offloads to dedicated hardware. Best for CPU-constrained systems |
-
-### Available HW Encoders
-
-| Encoder             | Platform      | Hardware                                                                                         | CLI Value                            |
-| ------------------- | ------------- | ------------------------------------------------------------------------------------------------ | ------------------------------------ |
-| `h264_videotoolbox` | macOS         | Apple Silicon / Intel                                                                            | `--dataset.vcodec=h264_videotoolbox` |
-| `hevc_videotoolbox` | macOS         | Apple Silicon / Intel                                                                            | `--dataset.vcodec=hevc_videotoolbox` |
-| `h264_nvenc`        | Linux/Windows | NVIDIA GPU                                                                                       | `--dataset.vcodec=h264_nvenc`        |
-| `hevc_nvenc`        | Linux/Windows | NVIDIA GPU                                                                                       | `--dataset.vcodec=hevc_nvenc`        |
-| `h264_vaapi`        | Linux         | Intel/AMD GPU                                                                                    | `--dataset.vcodec=h264_vaapi`        |
-| `h264_qsv`          | Linux/Windows | Intel Quick Sync                                                                                 | `--dataset.vcodec=h264_qsv`          |
-| `auto`              | Any           | Probes the system for available HW encoders. Falls back to `libsvtav1` if no HW encoder is found | `--dataset.vcodec=auto`              |
-
-> [!NOTE]
-> In order to use the HW accelerated encoders you might need to upgrade your GPU drivers.
-
-> [!NOTE]
-> `libsvtav1` is the default because it provides the best training performance; other vcodecs can reduce CPU usage and be faster, but they typically produce larger files and may affect training time.
-
-## 5. Troubleshooting
-
-| Symptom                                                            | Likely Cause                                 | Fix                                                                                                                                                                                                                                                                                  |
-| ------------------------------------------------------------------ | -------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
-| System freezes or choppy robot movement or Rerun visualization lag | CPU starved (100% load usage)                | Close other apps, reduce encoding throughput, lower `encoder_threads`, use `h264`, use `display_data=False`. If the CPU continues to be at 100% then it might be insufficient for your setup, consider `--dataset.streaming_encoding=false` or HW encoding (`--dataset.vcodec=auto`) |
-| "Encoder queue full" warnings or dropped frames in dataset         | Encoder can't keep up (Queue overflow)       | If CPU is not at 100%: Increase `encoder_threads`, increase `encoder_queue_maxsize` or use HW encoding (`--dataset.vcodec=auto`).                                                                                                                                                    |
-| High RAM usage                                                     | Queue filling faster than encoding           | `encoder_threads` too low or CPU insufficient. Reduce `encoder_queue_maxsize` or use HW encoding                                                                                                                                                                                     |
-| Large video files                                                  | Using HW encoder or H.264                    | Expected trade-off. Switch to `libsvtav1` if CPU allows                                                                                                                                                                                                                              |
-| `save_episode()` still slow                                        | `streaming_encoding` is `False`              | Set `--dataset.streaming_encoding=true`                                                                                                                                                                                                                                              |
-| Encoder thread crash                                               | Codec not available or invalid settings      | Check `vcodec` is installed, try `--dataset.vcodec=auto`                                                                                                                                                                                                                             |
-| Recorded dataset is missing frames                                 | CPU/GPU starvation or occasional load spikes | If ~5% of frames are missing, your system is likely overloaded — follow the recommendations above. If fewer frames are missing (~2%), they are probably due to occasional transient load spikes (often at startup) and can be considered expected.                                   |
-
-## 6. Recommended Configurations
-
-These estimates are conservative; we recommend testing them on your setup—start with a low load and increase it gradually.
-
-### High-End Systems: modern 12+ cores (24+ threads)
-
-A throughput of ~250-500M px/sec should be comfortable on the CPU. For even better results, try HW encoding if available.
-
-```bash
-# 3camsx 1280x720x3 @30fps: Defaults work well. Optionally increase encoder parallelism.
-# 2camsx 1920x1080x3 @30fps: Defaults work well. Optionally increase encoder parallelism.
-lerobot-record --dataset.encoder_threads=5 ...
-
-# 3camsx 1920x1080x3 @30fps: Might require some tuning.
-```
-
-### Mid-Range Systems: modern 8+ cores (16+ threads) or Apple Silicon
-
-A throughput of ~80-300M px/sec should be possible on the CPU.
-
-```bash
-# 3camsx 640x480x3 @30fps: Defaults work well. Optionally decrease encoder parallelism.
-# 2camsx 1280x720x3 @30fps: Defaults work well. Optionally decrease encoder parallelism.
-lerobot-record --dataset.encoder_threads=2 ...
-
-# 2camsx 1920x1080x3 @30fps: Might require some tuning.
-```
-
-### Low-Resource Systems: modern 4+ cores (8+ threads) or Raspberry Pi 5
-
-On very constrained systems, streaming encoding may compete too heavily with the capture loop. Disabling it falls back to the PNG-based approach where encoding happens between episodes (blocking, but doesn't interfere with capture). Alternatively, record at a lower throughput to reduce both capture and encoding load. Consider also changing codec to `h264` and using batch encoding.
-
-```bash
-# 2camsx 640x480x3 @30fps: Requires some tuning.
-
-# Use H.264, disable streaming, consider batching encoding
-lerobot-record --dataset.vcodec=h264 --dataset.streaming_encoding=false ...
-```
-
-## 7. Closing note
-
-Performance ultimately depends on your exact setup: frames per second, resolution, CPU cores and load, available memory, episode length, and the encoder you choose. Always test with your target workload, stay mindful of your CPU and system capabilities, and tune `encoder_threads`, `encoder_queue_maxsize`, and
-`vcodec` accordingly. That said, a common practical configuration (for many applications) is three cameras at 640x480x3 @30fps; this usually runs fine with the default streaming video encoding settings on modern systems. Always verify that your recorded dataset is healthy by comparing the video duration to the CLI episode duration and confirming the row count equals FPS × CLI duration.
@@ -1,42 +0,0 @@
-# PyTorch accelerators
-
-LeRobot supports multiple hardware acceleration options for both training and inference.
-
-These options include:
-
-- **CPU**: CPU executes all computations, no dedicated accelerator is used
-- **CUDA**: acceleration with NVIDIA & AMD GPUs
-- **MPS**: acceleration with Apple Silicon GPUs
-- **XPU**: acceleration with Intel integrated and discrete GPUs
-
-## Getting Started
-
-To use a particular accelerator, install a suitable version of PyTorch.
-
-For the CPU, CUDA, and MPS backends, follow the instructions on the [PyTorch installation page](https://pytorch.org/get-started/locally).
-For the XPU backend, follow the instructions in the [PyTorch documentation](https://docs.pytorch.org/docs/stable/notes/get_start_xpu.html).
-
-### Verifying the installation
-
-After installation, verify that the accelerator is available by running:
-
-```python
-import torch
-print(torch.<backend_name>.is_available())  # <backend_name> is cuda, mps, or xpu
-```
-
-## How to run training or evaluation
-
-To select the desired accelerator, use the `--policy.device` flag when running `lerobot-train` or `lerobot-eval`. For example, to use MPS on Apple Silicon, run:
-
-```bash
-lerobot-train \
-    --policy.device=mps ...
-```
-
-```bash
-lerobot-eval \
-    --policy.device=mps ...
-```
-
-In most cases, however, the accelerator is detected automatically and the `policy.device` parameter can be omitted from CLI commands.
@@ -1,72 +1,22 @@
-# Unitree G1
+# Unitree G1 Robot Setup and Control

-<img
-  src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/unitree_thumbnail.jpg"
-  alt="Unitree G1 locomanipulation demo"
-  style={{ width: "100%" }}
-/>
+This guide covers the complete setup process for the Unitree G1 humanoid, from initial connection to running gr00t_wbc locomotion.

-The Unitree G1 humanoid is now supported in LeRobot! You can teleoperate, train locomanipulation policies, test in sim, and more. Both 29 and 23 DoF variants are supported.
+## About the Unitree G1
+
+We offer support for both 29 and 23 DOF G1. In this first PR we introduce:
+
+- **`unitree g1` robot class** for low-level communication with the humanoid
+- **ZMQ socket bridge** for remote communication over WiFi, allowing one to deploy policies remotely instead of over ethernet or directly on the Orin
+- **GR00T locomotion policy** for bipedal walking and balance

 ---

-## Part 1: Getting Started
+## Part 1: Connect to Robot over Ethernet

-### Install the Unitree SDK
+### Step 1: Configure Your Computer's Ethernet Interface

-Follow the [unitree_sdk2_python installation guide](https://github.com/unitreerobotics/unitree_sdk2_python#installation). Tested with `unitree_sdk2py==1.0.1` and `cyclonedds==0.10.2`:
-
-```bash
-conda create -y -n lerobot python=3.12
-conda activate lerobot
-git clone https://github.com/unitreerobotics/unitree_sdk2_python.git
-cd unitree_sdk2_python
-pip install -e .
-cd ..
-```
-
-### Install LeRobot
-
-```bash
-conda install ffmpeg -c conda-forge
-conda install -c conda-forge "pinocchio>=3.0.0,<4.0.0"
-git clone https://github.com/huggingface/lerobot.git
-cd lerobot
-pip install -e '.[unitree_g1]'
-```
-
-<Tip>
-  For now, pinocchio must be installed from conda-forge (not pip) to include the
-  CasADi bindings needed for arm IK.
-</Tip>
-
-### Test the Installation (Simulation)
-
-The simulation environment has its own dependencies. Check the Simulation environment dependencies: [Unitree G1 Mujoco EnvHub](https://huggingface.co/lerobot/unitree-g1-mujoco/tree/main).
-
-```bash
-pip install mujoco loguru msgpack msgpack-numpy
-```
-
-```bash
-lerobot-teleoperate \
-  --robot.type=unitree_g1 \
-  --robot.is_simulation=true \
-  --teleop.type=unitree_g1 \
-  --teleop.id=wbc_unitree \
-  --robot.cameras='{"global_view": {"type": "zmq", "server_address": "localhost", "port": 5555, "camera_name": "head_camera", "width": 640, "height": 480, "fps": 30, "warmup_s": 5}}' \
-  --display_data=true \
-  --robot.controller=GrootLocomotionController
-```
-
-This will launch a [MuJoCo sim instance](https://huggingface.co/lerobot/unitree-g1-mujoco/tree/main) for the G1. You can connect a gamepad to your machine before launching in order to control the robot's locomotion in sim. We support both [HolosomaLocomotionController](https://github.com/amazon-far/holosoma) and [GrootLocomotionController](https://github.com/NVlabs/GR00T-WholeBodyControl) via `--robot.controller`.
-
-- Press `9` to release the robot
-- Press `7` / `8` to increase / decrease waist height
-
-### Connect to the Physical Robot
-
-The G1's Ethernet IP is fixed at `192.168.123.164`. Your machine must have a static IP on the same subnet: `192.168.123.x` where `x ≠ 164`.
+Set a static IP on the same subnet as the robot:

 ```bash
 # Replace 'enp131s0' with your ethernet interface name (check with `ip a`)
@@ -75,228 +25,179 @@ sudo ip addr add 192.168.123.200/24 dev enp131s0
 sudo ip link set enp131s0 up
 ```

-### SSH into the Robot
+**Note**: The robot's Ethernet IP is fixed at `192.168.123.164`. Your computer must use `192.168.123.x` where x ≠ 164.
+
+### Step 2: SSH into the Robot

 ```bash
 ssh unitree@192.168.123.164
 # Password: 123
 ```

-### Share Internet via Ethernet
+You should now be connected to the robot's onboard computer.

-The G1 needs internet access to clone repos and install packages. Share your laptop's connection over Ethernet:
+---

-**On your laptop:**
-
-```bash
-sudo sysctl -w net.ipv4.ip_forward=1
-
-# Replace wlp132s0f0 with your WiFi interface name
-sudo iptables -t nat -A POSTROUTING -o wlp132s0f0 -s 192.168.123.0/24 -j MASQUERADE
-sudo iptables -A FORWARD -i wlp132s0f0 -o enp131s0 -m state --state RELATED,ESTABLISHED -j ACCEPT
-sudo iptables -A FORWARD -i enp131s0 -o wlp132s0f0 -j ACCEPT
-```
-
-**On the G1:**
-
-```bash
-sudo ip route del default 2>/dev/null || true
-sudo ip route add default via 192.168.123.200 dev eth0
-echo "nameserver 8.8.8.8" | sudo tee /etc/resolv.conf
-
-# Verify
-ping -c 3 8.8.8.8
-```
-
-### Install the Unitree SDK on the G1
-
-Follow the [unitree_sdk2_python installation guide](https://github.com/unitreerobotics/unitree_sdk2_python#installation):
-
-```bash
-conda create -y -n lerobot python=3.12
-conda activate lerobot
-git clone https://github.com/unitreerobotics/unitree_sdk2_python.git
-cd unitree_sdk2_python
-python -m pip install -e .
-cd ..
-```
-
-### Install LeRobot on the G1
-
-```bash
-git clone https://github.com/huggingface/lerobot.git
-cd lerobot
-conda install -c conda-forge "pinocchio>=3.0.0,<4.0.0"
-python -m pip install -e '.[unitree_g1]'
-```
-
-<Tip>
-  For now, pinocchio must be installed from conda-forge (not pip) to include the
-  CasADi bindings needed for arm IK.
-</Tip>
-
-### (Optional) Enable WiFi on the Robot
-
-For wireless SSH access, you can enable WiFi on the G1 (it's blocked by default):
+## Part 2: Enable WiFi on the Robot
+
+Once connected via Ethernet, follow these steps to enable WiFi:
+
+### Step 1: Enable WiFi Hardware

 ```bash
+# Unblock WiFi radio
+sudo rfkill unblock wifi
 sudo rfkill unblock all
+
+# Bring up WiFi interface
 sudo ip link set wlan0 up
+
+# Enable NetworkManager control
 sudo nmcli radio wifi on
 sudo nmcli device set wlan0 managed yes
 sudo systemctl restart NetworkManager
 ```

-**Connect to a WiFi network:**
+### Step 2: Enable Internet Forwarding
+
+**On your laptop:**

 ```bash
+# Enable IP forwarding
+sudo sysctl -w net.ipv4.ip_forward=1
+
+# Set up NAT (replace wlp132s0f0 with your WiFi interface)
+sudo iptables -t nat -A POSTROUTING -o wlp132s0f0 -s 192.168.123.0/24 -j MASQUERADE
+sudo iptables -A FORWARD -i wlp132s0f0 -o enp131s0 -m state --state RELATED,ESTABLISHED -j ACCEPT
+sudo iptables -A FORWARD -i enp131s0 -o wlp132s0f0 -j ACCEPT
+```
+
+**On the robot:**
+
+```bash
+# Add laptop as default gateway
+sudo ip route del default 2>/dev/null || true
+sudo ip route add default via 192.168.123.200 dev eth0
+echo "nameserver 8.8.8.8" | sudo tee /etc/resolv.conf
+
+# Test connection
+ping -c 3 8.8.8.8
+```
+
+### Step 3: Connect to WiFi Network
+
+```bash
+# List available networks
 nmcli device wifi list

+# Connect to your WiFi (example)
 sudo nmcli connection add type wifi ifname wlan0 con-name "YourNetwork" ssid "YourNetwork"
 sudo nmcli connection modify "YourNetwork" wifi-sec.key-mgmt wpa-psk
 sudo nmcli connection modify "YourNetwork" wifi-sec.psk "YourPassword"
 sudo nmcli connection modify "YourNetwork" connection.autoconnect yes
 sudo nmcli connection up "YourNetwork"

+# Check WiFi IP address
 ip a show wlan0
 ```

-You can then SSH over WiFi instead of Ethernet:
+### Step 4: SSH Over WiFi
+
+Once connected to WiFi, note the robot's IP address and disconnect the Ethernet cable. You can now SSH over WiFi:

 ```bash
-ssh unitree@<ROBOT_WIFI_IP>
+ssh unitree@<YOUR_ROBOT_IP>
 # Password: 123
 ```

----
-
-## Part 2: Teleoperation & Locomotion
-
-### Run the Robot Server
-
-On the robot (from `~/lerobot`):
-
-```bash
-cd ~/lerobot
-python src/lerobot/robots/unitree_g1/run_g1_server.py --camera
-```
-
-### Run the Locomotion Policy
-
-You can run the teleoperation client from your laptop over Ethernet, over WiFi (experimental), or directly on the robot itself. Mind potential latency introduced by your network.
-
-**From your laptop:**
-
-```bash
-lerobot-teleoperate \
-  --robot.type=unitree_g1 \
-  --robot.is_simulation=false \
-  --robot.robot_ip=<ROBOT_IP> \
-  --teleop.type=unitree_g1 \
-  --teleop.id=wbc_unitree \
-  --robot.cameras='{"global_view": {"type": "zmq", "server_address": "<ROBOT_IP>", "port": 5555, "camera_name": "head_camera", "width": 640, "height": 480, "fps": 30}}' \
-  --display_data=true \
-  --robot.controller=HolosomaLocomotionController
-```
-
-We support both [GrootLocomotionController](https://github.com/NVlabs/GR00T-WholeBodyControl) and [HolosomaLocomotionController](https://github.com/amazon-far/holosoma) via `--robot.controller`.
+Replace `<YOUR_ROBOT_IP>` with your robot's actual WiFi IP address (e.g., `172.18.129.215`).

 ---

-## Part 3: Loco-Manipulation with the Homunculus Exoskeleton
+## Part 3: Robot Server Setup

-We provide a loco-manipulation solution via the Homunculus Exoskeleton — an open-source 7 DoF exoskeleton for whole-body control. Check it out [here](https://github.com/nepyope/hmc_exo).
+### Step 1: Install LeRobot on the Orin

-### Calibrate
+SSH into the robot and install LeRobot:

 ```bash
-lerobot-calibrate \
-  --teleop.type=unitree_g1 \
-  --teleop.left_arm_config.port=/dev/ttyACM1 \
-  --teleop.right_arm_config.port=/dev/ttyACM0 \
-  --teleop.id=exo
+ssh unitree@<YOUR_ROBOT_IP>
+
+conda create -y -n lerobot python=3.10
+conda activate lerobot
+git clone https://github.com/huggingface/lerobot.git
+cd lerobot
+pip install -e '.[unitree_g1]'
+git clone https://github.com/unitreerobotics/unitree_sdk2_python.git
+(cd unitree_sdk2_python && pip install -e .)
 ```

-During calibration move each joint through its entire range. After fitting, move the joint in a neutral position and press `n` to advance.
+**Note**: The Unitree SDK requires CycloneDDS v0.10.2 to be installed. See the [Unitree SDK documentation](https://github.com/unitreerobotics/unitree_sdk2_python) for details.

-### Record a Dataset
+### Step 2: Run the Robot Server
+
+On the robot:

 ```bash
-lerobot-record \
-  --robot.type=unitree_g1 \
-  --robot.is_simulation=true \
-  --robot.cameras='{"global_view": {"type": "zmq", "server_address": "localhost", "port": 5555, "camera_name": "head_camera", "width": 640, "height": 480, "fps": 30}}' \
-  --teleop.type=unitree_g1 \
-  --teleop.left_arm_config.port=/dev/ttyACM1 \
-  --teleop.right_arm_config.port=/dev/ttyACM0 \
-  --teleop.id=exo \
-  --dataset.repo_id=your-username/dataset-name \
-  --dataset.single_task="Test" \
-  --dataset.num_episodes=2 \
-  --dataset.episode_time_s=5 \
-  --dataset.reset_time_s=5 \
-  --dataset.push_to_hub=true \
-  --dataset.streaming_encoding=true \
-  --dataset.encoder_threads=2
+python src/lerobot/robots/unitree_g1/run_g1_server.py
 ```

-> **Note:** Omit `--teleop.left_arm_config.port` and `--teleop.right_arm_config.port` if you're only using the joystick.
-
-Example dataset: [nepyope/unitree_box_move_blue_full](https://huggingface.co/datasets/nepyope/unitree_box_move_blue_full)
+**Important**: Keep this terminal running. The server must be active for remote control.

 ---

-## Part 4: Training & Inference
+## Part 4: Running GR00T Locomotion

-### Train
+With the robot server running, you can now control the robot from your laptop.
+
+### Step 1: Install LeRobot on your machine

 ```bash
-python src/lerobot/scripts/lerobot_train.py \
-  --dataset.repo_id=your-username/dataset-name  \
-  --policy.type=pi05 \
-  --output_dir=./outputs/pi05_training \
-  --job_name=pi05_training \
-  --policy.repo_id=your-username/your-repo-id \
-  --policy.pretrained_path=lerobot/pi05_base \
-  --policy.compile_model=true \
-  --policy.gradient_checkpointing=true \
-  --wandb.enable=true \
-  --policy.dtype=bfloat16 \
-  --policy.freeze_vision_encoder=false \
-  --policy.train_expert_only=false \
-  --steps=3000 \
-  --policy.device=cuda \
-  --batch_size=32
+conda create -y -n lerobot python=3.10
+conda activate lerobot
+git clone https://github.com/huggingface/lerobot.git
+cd lerobot
+pip install -e '.[unitree_g1]'
+git clone https://github.com/unitreerobotics/unitree_sdk2_python.git
+(cd unitree_sdk2_python && pip install -e .)
 ```

-### Inference with RTC
+### Step 2: Update Robot IP in Config

-Once trained, we recommend deploying policies using inference-time RTC:
+Edit the config file to match your robot's WiFi IP:
+
+```python
+# In src/lerobot/robots/unitree_g1/config_unitree_g1.py
+robot_ip: str = "<YOUR_ROBOT_IP>"  # Replace with your robot's WiFi IP.
+```
+
+**Note**: When running directly on the G1 (not remotely), set `robot_ip: str = "127.0.0.1"` instead.
+
+### Step 3: Run the Locomotion Policy

 ```bash
-python examples/rtc/eval_with_real_robot.py \
-  --policy.path=your-username/your-repo-id \
-  --policy.device=cuda \
-  --robot.type=unitree_g1 \
-  --robot.is_simulation=false \
-  --robot.controller=HolosomaLocomotionController \
-  --robot.cameras='{"global_view": {"type": "zmq", "server_address": "<ROBOT_IP>", "port": 5555, "camera_name": "head_camera", "width": 640, "height": 480, "fps": 30}}' \
-  --task="task_description" \
-  --duration=1000 \
-  --fps=30 \
-  --rtc.enabled=true
+# Run GR00T locomotion controller
+python examples/unitree_g1/gr00t_locomotion.py --repo-id "nepyope/GR00T-WholeBodyControl_g1"
 ```

+### Step 4: Control with Remote
+
+- **Left stick**: Forward/backward and left/right movement
+- **Right stick**: Rotation
+- **R1 button**: Raise waist height
+- **R2 button**: Lower waist height
+
+Press `Ctrl+C` to stop the policy.
+
 ---

 ## Additional Resources

 - [Unitree SDK Documentation](https://github.com/unitreerobotics/unitree_sdk2_python)
-- [GR00T-WholeBodyControl](https://github.com/NVlabs/GR00T-WholeBodyControl)
-- [Holosoma](https://github.com/amazon-far/holosoma)
+- [GR00T Policy Repository](https://huggingface.co/nepyope/GR00T-WholeBodyControl_g1)
 - [LeRobot Documentation](https://github.com/huggingface/lerobot)
-- [Unitree IL LeRobot](https://github.com/unitreerobotics/unitree_IL_lerobot)
+- [Unitree_IL_Lerobot](https://github.com/unitreerobotics/unitree_IL_lerobot)

 ---

-_Last updated: March 2026_
+_Last updated: December 2025_
@@ -11,15 +11,13 @@ LeRobot provides several utilities for manipulating datasets:
 3. **Merge Datasets** - Combine multiple datasets into one. The datasets must have identical features, and episodes are concatenated in the order specified in `repo_ids`
 4. **Add Features** - Add new features to a dataset
 5. **Remove Features** - Remove features from a dataset
-6. **Convert to Video** - Convert image-based datasets to video format for efficient storage
-7. **Show the Info of Datasets** - Show a summary of dataset information, such as the number of episodes.

 The core implementation is in `lerobot.datasets.dataset_tools`.
 An example script detailing how to use the tools API is available in `examples/dataset/use_dataset_tools.py`.

 ## Command-Line Tool: lerobot-edit-dataset

-`lerobot-edit-dataset` is a command-line script for editing datasets. It can be used to delete episodes, split datasets, merge datasets, add features, remove features, and convert image datasets to video format.
+`lerobot-edit-dataset` is a command-line script for editing datasets. It can be used to delete episodes, split datasets, merge datasets, add features, and remove features.

 Run `lerobot-edit-dataset --help` for more information on the configuration of each operation.

@@ -88,102 +86,9 @@ lerobot-edit-dataset \
    --operation.feature_names "['observation.images.top']"
 ```

-#### Convert to Video
-
-Convert an image-based dataset to video format, creating a new LeRobotDataset where images are stored as videos. This is useful for reducing storage requirements and improving data loading performance. The new dataset will have the exact same structure as the original, but with images encoded as MP4 videos in the proper LeRobot format.
-
-```bash
-# Local-only: Save to a custom output directory (no hub push)
-lerobot-edit-dataset \
-    --repo_id lerobot/pusht_image \
-    --operation.type convert_image_to_video \
-    --operation.output_dir /path/to/output/pusht_video
-
-# Save with new repo_id (local storage)
-lerobot-edit-dataset \
-    --repo_id lerobot/pusht_image \
-    --new_repo_id lerobot/pusht_video \
-    --operation.type convert_image_to_video
-
-# Convert and push to Hugging Face Hub
-lerobot-edit-dataset \
-    --repo_id lerobot/pusht_image \
-    --new_repo_id lerobot/pusht_video \
-    --operation.type convert_image_to_video \
-    --push_to_hub true
-
-# Convert with custom video codec and quality settings
-lerobot-edit-dataset \
-    --repo_id lerobot/pusht_image \
-    --operation.type convert_image_to_video \
-    --operation.output_dir outputs/pusht_video \
-    --operation.vcodec libsvtav1 \
-    --operation.pix_fmt yuv420p \
-    --operation.g 2 \
-    --operation.crf 30
-
-# Convert only specific episodes
-lerobot-edit-dataset \
-    --repo_id lerobot/pusht_image \
-    --operation.type convert_image_to_video \
-    --operation.output_dir outputs/pusht_video \
-    --operation.episode_indices "[0, 1, 2, 5, 10]"
-
-# Convert with multiple workers for parallel processing
-lerobot-edit-dataset \
-    --repo_id lerobot/pusht_image \
-    --operation.type convert_image_to_video \
-    --operation.output_dir outputs/pusht_video \
-    --operation.num_workers 8
-
-# For memory-constrained systems, limit the batch size:
-lerobot-edit-dataset \
-    --repo_id lerobot/pusht_image \
-    --operation.type convert_image_to_video \
-    --operation.max_episodes_per_batch 50 \
-    --operation.max_frames_per_batch 10000
-```
-
-**Parameters:**
-
-- `output_dir`: Custom output directory (optional - by default uses `new_repo_id` or `{repo_id}_video`)
-- `vcodec`: Video codec to use - options: `h264`, `hevc`, `libsvtav1` (default: `libsvtav1`)
-- `pix_fmt`: Pixel format - options: `yuv420p`, `yuv444p` (default: `yuv420p`)
-- `g`: Group of pictures (GOP) size - lower values give better quality but larger files (default: 2)
-- `crf`: Constant rate factor - lower values give better quality but larger files, 0 is lossless (default: 30)
-- `fast_decode`: Fast decode tuning option (default: 0)
-- `episode_indices`: List of specific episodes to convert (default: all episodes)
-- `num_workers`: Number of parallel workers for processing (default: 4)
-
-**Note:** The resulting dataset will be a proper LeRobotDataset with all cameras encoded as videos in the `videos/` directory, with parquet files containing only metadata (no raw image data). All episodes, stats, and tasks are preserved.
-
-### Show the information of datasets
-
-Show dataset information such as the number of episodes, the number of frames, and the file size.
-No changes are made to the dataset.
-
-```bash
-
-# Show dataset information without feature details
-lerobot-edit-dataset \
-    --repo_id lerobot/pusht_image \
-    --operation.type info \
-
-# Show dataset information with feature details
-lerobot-edit-dataset \
-    --repo_id lerobot/pusht_image \
-    --operation.type info \
-    --operation.show_features true
-
-```
-
-**Parameters:**
-
-- `show_features`: Whether to include feature details in the dataset information (default: `false`)
-
 ### Push to Hub

-Add the `--push_to_hub true` flag to any command to automatically upload the resulting dataset to the Hugging Face Hub:
+Add the `--push_to_hub` flag to any command to automatically upload the resulting dataset to the Hugging Face Hub:

 ```bash
 lerobot-edit-dataset \
@@ -191,45 +96,7 @@ lerobot-edit-dataset \
    --new_repo_id lerobot/pusht_after_deletion \
    --operation.type delete_episodes \
    --operation.episode_indices "[0, 2, 5]" \
-    --push_to_hub true
+    --push_to_hub
 ```

 There is also a tool for adding features to a dataset that is not yet covered in `lerobot-edit-dataset`.
-
-# Dataset Visualization
-
-## Online Visualization
-
-When you record a dataset using `lerobot`, it automatically uploads to the Hugging Face Hub unless you specify otherwise. To view the dataset online, use our **LeRobot Dataset Visualizer**, available at:
-https://huggingface.co/spaces/lerobot/visualize_dataset
-
-## Local Visualization
-
-You can also visualize episodes from a dataset locally using our command-line tool.
-
-**From the Hugging Face Hub:**
-
-```bash
-lerobot-dataset-viz \
-    --repo-id lerobot/pusht \
-    --episode-index 0
-```
-
-**From a local folder:**
-Add the `--root` option and set `--mode local`. For example, to search in `./my_local_data_dir/lerobot/pusht`:
-
-```bash
-lerobot-dataset-viz \
-    --repo-id lerobot/pusht \
-    --root ./my_local_data_dir \
-    --mode local \
-    --episode-index 0
-```
-
-Once executed, the tool opens `rerun.io` and displays the camera streams, robot states, and actions for the selected episode.
-
-For advanced usage—including visualizing datasets stored on a remote server—run:
-
-```bash
-lerobot-dataset-viz --help
-```
@@ -1,80 +0,0 @@
-# WALL-OSS
-
-WALL-OSS is an open-source foundation model for embodied intelligence, proposed by the [XSquare Robot](https://x2robot.com/en/research/68bc2cde8497d7f238dde690) team in 2025. The LeRobot implementation is adapted from their open-source [WallX](https://github.com/X-Square-Robot/wall-x) repository.
-
-X Square Robot’s WALL-OSS is now integrated into Hugging Face’s LeRobot ecosystem. This is an exciting collaborative project between the LeRobot and X Square Robot teams. You can now post-train, evaluate, and deploy WALL-OSS directly through LeRobot. With this, we’re aiming to make it easier for the open-source robotics community to customize and deploy WALL-OSS foundation models. Read and explore WALL-OSS [paper](https://arxiv.org/pdf/2509.11766) and [code](https://github.com/X-Square-Robot/wall-x).
-
-## Model Overview
-
-The WALL-OSS team is building the embodied foundation model to capture and compress the world's most valuable data: the continuous, high-fidelity stream of physical interaction. By creating a direct feedback loop between the model's decisions and the body's lived experience, the emergence of a truly generalizable intelligence is enabled—one that understands not just how the world works, but how to act effectively within it.
-
-<img
-  src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/walloss-lerobot-paper.png"
-  alt="An overview of WALL-OSS"
-  width="85%"
-/>
-
-Technically, WALL-OSS introduces a tightly coupled multimodal architecture (tightly-coupled MoE structure) that integrates both discrete and continuous action modeling strategies. Through a two-stage training pipeline (Inspiration → Integration), the model gradually unifies semantic reasoning and high-frequency action generation. Its core innovations include:
-
-- **Embodied perception–enhanced multimodal pretraining**: Large-scale training on unified vision–language–action data to strengthen spatial, causal, and manipulation understanding.
-- **Unified Cross-Level Chain-of-Thought (Uni-CoT)**: A single differentiable framework that unifies high-level instruction reasoning, sub-task decomposition, and fine-grained action synthesis, forming a continuous chain from “understanding” to “execution.”
-- **Mixture-of-Experts (MoE) action heads**: Dynamically activating experts depending on the task phase and modeling actions in discrete or continuous space to maintain stable VLM priors.
-- **Two-stage training paradigm**:
-  - **Inspiration stage**: Injecting discrete action priors to strengthen spatial understanding and semantic-action alignment.
-  - **Integration stage**: Using flow matching to achieve high-frequency continuous control.
-
-## Installation Requirements
-
-1. Install LeRobot by following our [Installation Guide](./installation).
-2. Install WallX dependencies by running:
-
-   ```bash
-   pip install -e ".[wallx]"
-   ```
-
-## Usage
-
-To use WallX in LeRobot, specify the policy type as:
-
-```python
-policy.type=wall_x
-```
-
-## Training
-
-For training WallX, you can use the standard LeRobot training script with the appropriate configuration:
-
-```bash
-lerobot-train \
-    --dataset.repo_id=your_dataset \
-    --policy.type=wall_x \
-    --output_dir=./outputs/wallx_training \
-    --job_name=wallx_training \
-    --policy.repo_id=your_repo_id \
-    --policy.pretrained_name_or_path=x-square-robot/wall-oss-flow \
-    --policy.prediction_mode=diffusion \
-    --policy.attn_implementation=eager \
-    --steps=3000 \
-    --policy.device=cuda \
-    --batch_size=32
-```
-
-### Training Arguments
-
-| Argument                       | Description                                                                                                                                                   |
-| ------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `--dataset.repo_id`            | The Hugging Face Hub repository ID for your training dataset (e.g., `lerobot/aloha_sim_insertion_human`)                                                      |
-| `--policy.type`                | Specifies using the WallX policy architecture                                                                                                                 |
-| `--output_dir`                 | Local directory where training checkpoints and logs will be saved                                                                                             |
-| `--job_name`                   | A name identifier for this training run (used in logging/tracking)                                                                                            |
-| `--policy.repo_id`             | Your Hugging Face Hub repo ID where the trained model will be pushed                                                                                          |
-| `--policy.pretrained_path`     | Path to pretrained WallX weights to initialize from (the official WALL-OSS checkpoint)                                                                        |
-| `--policy.prediction_mode`     | The action prediction strategy: `diffusion` or `fast` - `diffusion` uses iterative denoising for action generation, `fast` uses next token prediction instead |
-| `--policy.attn_implementation` | Attention implementation backend - `eager` uses standard PyTorch attention (alternatives include `flash_attention_2` or `sdpa`)                               |
-| `--steps`                      | Total number of training steps to run                                                                                                                         |
-| `--policy.device`              | Device to train on (`cuda` for GPU, `cpu` for CPU)                                                                                                            |
-| `--batch_size`                 | Number of samples per training batch                                                                                                                          |
-
-## License
-
-This model follows the **Apache 2.0 License**, consistent with the original [WallX repository](https://github.com/X-Square-Robot/wall-x).
@@ -24,7 +24,7 @@ Built from pure Transformer encoders, X-VLA scales naturally with model size and
  <img
    src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/xvla-architecture2.png"
    alt="XVLA Architecture 2"
-    style="width: 60%; height: auto;"
+    style="width: 32%; max-width: 450px; height: auto;"
  />
 </p>

@@ -120,7 +120,7 @@ Adapted for Google Robot platforms.

 ### Recommended Training Configuration

-When fine-tuning X-VLA for a new embodiment or task, we recommend not freezing the VLM, and also setting the `policy.dtype=bfloat16` to not hit OOM errors.
+When fine-tuning X-VLA for a new embodiment or task, we recommend the following freezing strategy:

 ```bash
 lerobot-train \
@@ -129,44 +129,42 @@ lerobot-train \
  --job_name=xvla_training \
  --policy.path="lerobot/xvla-base" \
  --policy.repo_id="HF_USER/xvla-your-robot" \
-  --policy.dtype=bfloat16 \
-  --policy.action_mode=auto \
-  --steps=20000 \
+  --steps=3000 \
  --policy.device=cuda \
-  --policy.freeze_vision_encoder=false \
-  --policy.freeze_language_encoder=false \
-  --policy.train_policy_transformer=true \
-  --policy.train_soft_prompts=true \
+  --policy.freeze_vision_encoder=True \
+  --policy.freeze_language_encoder=True \
+  --policy.train_policy_transformer=True \
+  --policy.train_soft_prompts=True \
+  --policy.action_mode=YOUR_ACTION_MODE
 ```

 ### Training Parameters Explained

-| Parameter                  | Default | Description                                    |
-| -------------------------- | ------- | ---------------------------------------------- |
-| `freeze_vision_encoder`    | `false` | Do not freeze the VLM vision encoder weights   |
-| `freeze_language_encoder`  | `false` | Do not freeze the VLM language encoder weights |
-| `train_policy_transformer` | `true`  | Allow policy transformer layers to train       |
-| `train_soft_prompts`       | `true`  | Allow soft prompts to train                    |
+| Parameter                  | Default | Description                              |
+| -------------------------- | ------- | ---------------------------------------- |
+| `freeze_vision_encoder`    | `True`  | Freeze the VLM vision encoder weights    |
+| `freeze_language_encoder`  | `True`  | Freeze the VLM language encoder weights  |
+| `train_policy_transformer` | `True`  | Allow policy transformer layers to train |
+| `train_soft_prompts`       | `True`  | Allow soft prompts to train              |

-**💡 Best Practice**: For Phase II adaptation to new embodiments, do not freeze the VLM encoders and also train the policy transformer and soft prompts.
+**💡 Best Practice**: For Phase II adaptation to new embodiments, freeze the VLM encoders and only train the policy transformer and soft prompts. This provides excellent sample efficiency with minimal compute.

 ### Example: Training on Bimanual Robot

 ```bash
 lerobot-train \
-  --dataset.repo_id=<USER>/bimanual-so100-handover-cube \
+  --dataset.repo_id=pepijn223/bimanual-so100-handover-cube \
  --output_dir=./outputs/xvla_bimanual \
  --job_name=xvla_so101_training \
  --policy.path="lerobot/xvla-base" \
-  --policy.dtype=bfloat16 \
  --policy.repo_id="YOUR_USERNAME/xvla-biso101" \
  --steps=3000 \
  --policy.device=cuda \
  --policy.action_mode=so101_bimanual \
-  --policy.freeze_vision_encoder=false \
-  --policy.freeze_language_encoder=false \
-  --policy.train_policy_transformer=true \
-  --policy.train_soft_prompts=true
+  --policy.freeze_vision_encoder=True \
+  --policy.freeze_language_encoder=True \
+  --policy.train_policy_transformer=True \
+  --policy.train_soft_prompts=True
 ```

 💡 **Best Performance:** If you have sufficient computational resources and want to achieve best X-VLA finetuning performance, you should follow the official finetuning strategy:
@@ -174,7 +172,71 @@ lerobot-train \
 **🔥 Full-finetune all components with a custom learning-rate scheme**

 To ensure stable optimization, the Vision-Language Model (VLM) must be trained with only 1/10 of the base learning rate, while all other components use the full LR.
-This LR ratio is crucial for achieving strong and stable finetuning performance. This is already done for you by default.
+This LR ratio is crucial for achieving strong and stable finetuning performance.
+To enable this behavior, you must:
+
+1. Implement a custom optimizer and register it in your training config
+
+```python
+from dataclasses import dataclass, asdict
+from lerobot.optim.optimizers import OptimizerConfig
+import torch
+
+@OptimizerConfig.register_subclass("xvla-adamw")
+@dataclass
+class XVLAAdamW(OptimizerConfig):
+    lr: float = 1e-4
+    betas: tuple[float, float] = (0.9, 0.99)
+    eps: float = 1e-8
+    weight_decay: float = 0.0
+    grad_clip_norm: float = 10.0
+
+    def build(self, params: dict) -> torch.optim.Optimizer:
+        """
+        Expect `named_parameters()` as input.
+        Apply lr = lr / 10 for all VLM-related parameters.
+        """
+        assert isinstance(params, dict), \
+            "Custom LR optimizer requires `named_parameters()` as inputs."
+        kwargs = asdict(self)
+        kwargs.pop("grad_clip_norm")
+        vlm_group, other_group = [], []
+        for name, p in params.items():
+            if not p.requires_grad:
+                continue
+            if "vlm" in name.lower():
+                vlm_group.append(p)
+            else:
+                other_group.append(p)
+
+        param_groups = [
+            {"params": vlm_group, "lr": self.lr * 0.1, "weight_decay": self.weight_decay * 0.1},
+            {"params": other_group, "lr": self.lr, "weight_decay": self.weight_decay},
+        ]
+
+        return torch.optim.AdamW(param_groups, **kwargs)
+```
+
+2. Modify X-VLA’s `get_optim_params` to return a dict of named parameters
+
+Replace:
+
+```python
+def get_optim_params(self) -> dict:
+    """Return only trainable parameters for optimization."""
+    return filter(lambda p: p.requires_grad, self.parameters())
+```
+
+with:
+
+```python
+def get_optim_params(self) -> dict:
+    """Return trainable named parameters as a name -> parameter dict."""
+    return {name: p for name, p in self.named_parameters() if p.requires_grad}
+```
+
+This ensures the optimizer receives a dict of named parameters, allowing it to correctly detect VLM modules and apply the 1/10 LR rule.
+
 ❕Note

 Completely matching the official reported performance may require an additional warm-up LR schedule for soft-prompts, which can bring minor improvements.
@@ -264,26 +326,6 @@ domain_id = 3

 The domain_id is automatically added to observations by the `XVLAAddDomainIdProcessorStep` in the preprocessing pipeline.

-The `lerobot/xvla-base` model has been trained on the following domain IDs. It is recommended to choose one that most resembles your robot/configuration:
-
-#### Fine-tuning Datasets
-
-| Dataset Name     | Domain ID |
-| ---------------- | --------- |
-| Bridge           | 0         |
-| RT1              | 1         |
-| Calvin           | 2         |
-| libero           | 3         |
-| widowx-air       | 4         |
-| AIR-AGILEX-HQ    | 5         |
-| robotwin2_abs_ee | 6         |
-| robotwin2_clean  | 6         |
-| robocasa-human   | 7         |
-| VLABench         | 8         |
-| AGIBOT-challenge | 9         |
-| AIR-AGILEX       | 10        |
-| AIRBOT           | 18        |
-
 ### 3. Processor Steps

 X-VLA requires specific preprocessing and postprocessing steps for proper operation.
@@ -22,7 +22,7 @@ lerobot-replay \
    --robot.type=so100_follower \
    --robot.port=/dev/tty.usbmodem58760431541 \
    --robot.id=black \
-    --dataset.repo_id=<USER>/record-test \
+    --dataset.repo_id=aliberts/record-test \
    --dataset.episode=2
 ```
 """
@@ -41,7 +41,8 @@ from lerobot.robots import (  # noqa: F401
    RobotConfig,
    koch_follower,
    make_robot_from_config,
-    so_follower,
+    so100_follower,
+    so101_follower,
 )
 from lerobot.utils.constants import ACTION
 from lerobot.utils.robot_utils import precise_sleep
@@ -57,7 +58,7 @@ class DatasetReplayConfig:
    repo_id: str
    # Episode to replay.
    episode: int
-    # Root directory where the dataset will be stored (e.g. 'dataset/path'). If None, defaults to $HF_LEROBOT_HOME/repo_id.
+    # Root directory where the dataset will be stored (e.g. 'dataset/path').
    root: str | Path | None = None
    # Limit the frames per second. By default, uses the policy fps.
    fps: int = 30
@@ -81,25 +82,24 @@ def replay(cfg: ReplayConfig):
    actions = dataset.hf_dataset.select_columns(ACTION)
    robot.connect()

-    try:
-        log_say("Replaying episode", cfg.play_sounds, blocking=True)
-        for idx in range(dataset.num_frames):
-            start_episode_t = time.perf_counter()
+    log_say("Replaying episode", cfg.play_sounds, blocking=True)
+    for idx in range(dataset.num_frames):
+        start_episode_t = time.perf_counter()

-            action_array = actions[idx][ACTION]
-            action = {}
-            for i, name in enumerate(dataset.features[ACTION]["names"]):
-                key = f"{name.removeprefix('main_')}.pos"
-                action[key] = action_array[i].item()
+        action_array = actions[idx][ACTION]
+        action = {}
+        for i, name in enumerate(dataset.features[ACTION]["names"]):
+            key = f"{name.removeprefix('main_')}.pos"
+            action[key] = action_array[i].item()

-            action["shoulder_lift.pos"] = -(action["shoulder_lift.pos"] - 90)
-            action["elbow_flex.pos"] -= 90
-            robot.send_action(action)
+        action["shoulder_lift.pos"] = -(action["shoulder_lift.pos"] - 90)
+        action["elbow_flex.pos"] -= 90
+        robot.send_action(action)

-            dt_s = time.perf_counter() - start_episode_t
-            precise_sleep(max(1 / dataset.fps - dt_s, 0.0))
-    finally:
-        robot.disconnect()
+        dt_s = time.perf_counter() - start_episode_t
+        precise_sleep(1 / dataset.fps - dt_s)
+
+    robot.disconnect()


 if __name__ == "__main__":
@@ -0,0 +1,464 @@
+#!/usr/bin/env python
+
+# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""
+BehaviorLeRobotDatasetV3: A wrapper around LeRobotDataset v3.0 for loading BEHAVIOR-1K data.
+
+This wrapper extends LeRobotDataset to support BEHAVIOR-1K specific features:
+- Modality and camera selection (rgb, depth, seg_instance_id)
+- Efficient chunk streaming mode with keyframe access
+- Additional BEHAVIOR-1K metadata (cam_rel_poses, task_info, etc.)
+"""
+
+import logging
+from collections.abc import Callable
+from pathlib import Path
+
+import datasets
+import numpy as np
+from behaviour_1k_constants import ROBOT_CAMERA_NAMES, ROBOT_TYPE
+from torch.utils.data import Dataset, get_worker_info
+
+from lerobot.datasets.lerobot_dataset import CODEBASE_VERSION, LeRobotDataset, LeRobotDatasetMetadata
+from lerobot.datasets.utils import (
+    check_delta_timestamps,
+    get_delta_indices,
+    get_safe_version,
+    hf_transform_to_torch,
+)
+from lerobot.datasets.video_utils import decode_video_frames, get_safe_default_codec
+from lerobot.utils.constants import HF_LEROBOT_HOME
+
+logger = logging.getLogger(__name__)
+
+
+class BehaviorLeRobotDatasetMetadata(LeRobotDatasetMetadata):
+    """
+    Extended metadata class for BEHAVIOR-1K datasets.
+
+    Adds support for:
+    - Modality and camera filtering
+    - Custom metainfo and annotation paths
+    """
+
+    def __init__(
+        self,
+        repo_id: str,
+        root: str | Path | None = None,
+        revision: str | None = None,
+        force_cache_sync: bool = False,
+        metadata_buffer_size: int = 10,
+        modalities: set[str] | None = None,
+        cameras: set[str] | None = None,
+    ):
+        self.modalities = set(modalities) if modalities else {"rgb", "depth", "seg_instance_id"}
+        self.camera_names = set(cameras) if cameras else {"head", "left_wrist", "right_wrist"}
+
+        assert self.modalities.issubset({"rgb", "depth", "seg_instance_id"}), (
+            f"Modalities must be subset of ['rgb', 'depth', 'seg_instance_id'], got {self.modalities}"
+        )
+
+        assert self.camera_names.issubset(set(ROBOT_CAMERA_NAMES[ROBOT_TYPE])), (
+            f"Camera names must be subset of {list(ROBOT_CAMERA_NAMES[ROBOT_TYPE])}, got {self.camera_names}"
+        )
+
+        super().__init__(repo_id, root, revision, force_cache_sync, metadata_buffer_size)
+
+    @property
+    def filtered_features(self) -> dict[str, dict]:
+        """Return only features matching selected modalities and cameras."""
+        features = {}
+        for name, feature_info in self.features.items():
+            if not name.startswith("observation.images."):
+                features[name] = feature_info
+                continue
+
+            parts = name.split(".")
+            if len(parts) >= 4:
+                modality = parts[2]
+                camera = parts[3]
+                if modality in self.modalities and camera in self.camera_names:
+                    features[name] = feature_info
+
+        return features
+
+    @property
+    def video_keys(self) -> list[str]:
+        """Return only video keys for selected modalities and cameras."""
+        all_video_keys = super().video_keys
+
+        filtered_keys = []
+        for key in all_video_keys:
+            parts = key.split(".")
+            if len(parts) >= 4:
+                modality = parts[2]
+                camera = parts[3]
+                if modality in self.modalities and camera in self.camera_names:
+                    filtered_keys.append(key)
+
+        return filtered_keys
+
+    def get_metainfo_path(self, ep_index: int) -> Path | None:
+        """Get path to episode metainfo file."""
+        if "metainfo_path" in self.info:
+            fpath = self.info["metainfo_path"].format(episode_index=ep_index)
+            return Path(fpath)
+        return None
+
+    def get_annotation_path(self, ep_index: int) -> Path | None:
+        """Get path to episode annotation file."""
+        if "annotation_path" in self.info:
+            fpath = self.info["annotation_path"].format(episode_index=ep_index)
+            return Path(fpath)
+        return None
+
+
+class BehaviorLeRobotDatasetV3(LeRobotDataset):
+    """
+    BEHAVIOR-1K wrapper for LeRobotDataset v3.0.
+
+    Each BEHAVIOR-1K dataset contains a single task (e.g., behavior1k-task0000).
+    See https://huggingface.co/collections/lerobot/behavior-1k for all available tasks.
+
+    Key features:
+    - Modality and camera selection
+    - Efficient chunk streaming with keyframe access (recommended for B1K with GOP=250)
+    - Support for BEHAVIOR-1K specific observations (cam_rel_poses, task_info, task_index)
+    """
+
+    def __init__(
+        self,
+        repo_id: str,
+        root: str | Path | None = None,
+        episodes: list[int] | None = None,
+        image_transforms: Callable | None = None,
+        delta_timestamps: dict[str, list[float]] | None = None,
+        tolerance_s: float = 1e-4,
+        revision: str | None = None,
+        force_cache_sync: bool = False,
+        download_videos: bool = True,
+        video_backend: str | None = None,
+        batch_encoding_size: int = 1,
+        # BEHAVIOR-1K specific arguments
+        modalities: list[str] | None = None,
+        cameras: list[str] | None = None,
+        check_timestamp_sync: bool = True,
+        chunk_streaming_using_keyframe: bool = True,
+        shuffle: bool = True,
+        seed: int = 42,
+    ):
+        """
+        Initialize BEHAVIOR-1K dataset.
+
+        Args:
+            repo_id: HuggingFace repository ID (e.g., "lerobot/behavior1k-task0000")
+            root: Local directory for dataset storage
+            episodes: List of episode indices to load (for train/val split)
+            image_transforms: Torchvision v2 transforms for images
+            delta_timestamps: Temporal offsets for history/future frames
+            tolerance_s: Tolerance for timestamp synchronization
+            revision: Git revision/branch to load
+            force_cache_sync: Force re-download from hub
+            download_videos: Whether to download video files
+            video_backend: Video decoder ('pyav' or 'torchcodec')
+            batch_encoding_size: Batch size for video encoding
+            modalities: List of modalities to load (None = all: rgb, depth, seg_instance_id)
+            cameras: List of cameras to load (None = all: head, left_wrist, right_wrist)
+            check_timestamp_sync: Verify timestamp synchronization (can be slow)
+            chunk_streaming_using_keyframe: Use keyframe-based streaming (STRONGLY RECOMMENDED for B1K)
+            shuffle: Shuffle chunks in streaming mode
+            seed: Random seed for shuffling
+        """
+        Dataset.__init__(self)
+
+        self.repo_id = repo_id
+        if root:
+            self.root = Path(root)
+        else:
+            dataset_name = repo_id.split("/")[-1] if "/" in repo_id else repo_id
+            self.root = HF_LEROBOT_HOME / dataset_name
+
+        self.image_transforms = image_transforms
+        self.delta_timestamps = delta_timestamps
+        self.tolerance_s = tolerance_s
+        self.revision = revision if revision else CODEBASE_VERSION
+        self.video_backend = video_backend if video_backend else get_safe_default_codec()
+        self.delta_indices = None
+        self.batch_encoding_size = batch_encoding_size
+        self.episodes_since_last_encoding = 0
+        self.seed = seed
+
+        self.image_writer = None
+        self.episode_buffer = None
+        self.writer = None
+        self.latest_episode = None
+        self._current_file_start_frame = None
+
+        self.root.mkdir(exist_ok=True, parents=True)
+
+        if modalities is None:
+            modalities = ["rgb", "depth", "seg_instance_id"]
+        if "seg_instance_id" in modalities:
+            assert chunk_streaming_using_keyframe, (
+                "For performance, seg_instance_id requires chunk_streaming_using_keyframe=True"
+            )
+        if "depth" in modalities:
+            assert self.video_backend == "pyav", "Depth videos require video_backend='pyav'"
+        if cameras is None:
+            cameras = ["head", "left_wrist", "right_wrist"]
+
+        self.meta = BehaviorLeRobotDatasetMetadata(
+            repo_id=self.repo_id,
+            root=self.root,
+            revision=self.revision,
+            force_cache_sync=force_cache_sync,
+            modalities=modalities,
+            cameras=cameras,
+        )
+
+        if episodes is not None:
+            self.episodes = sorted([i for i in episodes if i < len(self.meta.episodes)])
+        else:
+            self.episodes = list(range(len(self.meta.episodes)))
+
+        logger.info(f"Total episodes: {len(self.episodes)}")
+
+        self._chunk_streaming_using_keyframe = chunk_streaming_using_keyframe
+        if self._chunk_streaming_using_keyframe:
+            if not shuffle:
+                logger.warning("Chunk streaming enabled but shuffle=False. This may reduce randomness.")
+            self.chunks = self._get_keyframe_chunk_indices()
+            self.current_streaming_chunk_idx = None if shuffle else 0
+            self.current_streaming_frame_idx = None if shuffle else self.chunks[0][0] if self.chunks else 0
+            self.obs_loaders = {}
+            self._should_obs_loaders_reload = True
+
+        self._lazy_loading = False
+        self._recorded_frames = self.meta.total_frames
+        self._writer_closed_for_reading = False
+
+        try:
+            if force_cache_sync:
+                raise FileNotFoundError
+            self.hf_dataset = self.load_hf_dataset()
+        except (AssertionError, FileNotFoundError, NotADirectoryError):
+            self.revision = get_safe_version(self.repo_id, self.revision)
+            self.download_episodes(download_videos)
+            self.hf_dataset = self.load_hf_dataset()
+
+        if self.delta_timestamps is not None:
+            check_delta_timestamps(self.delta_timestamps, self.meta.fps, self.tolerance_s)
+            self.delta_indices = get_delta_indices(self.delta_timestamps, self.meta.fps)
+
+    @property
+    def fps(self) -> int:
+        """Frames per second."""
+        return self.meta.fps
+
+    @property
+    def features(self) -> dict:
+        """Dataset features (filtered by modalities/cameras)."""
+        return self.meta.filtered_features
+
+    @property
+    def num_episodes(self) -> int:
+        """Number of episodes."""
+        return len(self.episodes)
+
+    @property
+    def num_frames(self) -> int:
+        """Total number of frames."""
+        return len(self.hf_dataset)
+
+    def get_episodes_file_paths(self) -> list[str]:
+        """
+        Get download patterns for requested episodes.
+
+        Returns glob patterns for download rather than specific file paths.
+
+        Note: Unlike the base LeRobotDataset, this method cannot filter downloads to only
+        requested episodes because:
+        1. BEHAVIOR-1K episode indices are encoded (e.g., 10010 for task 1, episode 10)
+        2. Episodes are chunked across multiple parquet/video files
+        3. The parquet files are organized by chunk, not by episode
+
+        Therefore, we download full data/meta/video directories and rely on
+        `self.load_hf_dataset()` to filter to requested episodes from the loaded data.
+        """
+        allow_patterns = ["data/**", "meta/**"]
+
+        # Filter by modalities and cameras for video patterns
+        if len(self.meta.video_keys) > 0:
+            if len(self.meta.modalities) != 3 or len(self.meta.camera_names) != 3:
+                # Only download specific modality/camera combinations
+                for modality in self.meta.modalities:
+                    for camera in self.meta.camera_names:
+                        allow_patterns.append(f"**/observation.images.{modality}.{camera}/**")
+            else:
+                # Download all videos (no filtering needed)
+                allow_patterns.append("videos/**")
+
+        return allow_patterns
+
+    def download_episodes(self, download_videos: bool = True) -> None:
+        """
+        Download episodes with modality/camera filtering.
+
+        Follows the same pattern as base LeRobotDataset.download() but uses
+        get_episodes_file_paths() which returns patterns for modality/camera filtering.
+        """
+        ignore_patterns = None if download_videos else "videos/"
+        files = self.get_episodes_file_paths()
+        self.pull_from_repo(allow_patterns=files, ignore_patterns=ignore_patterns)
+
+    def pull_from_repo(
+        self,
+        allow_patterns: list[str] | str | None = None,
+        ignore_patterns: list[str] | str | None = None,
+    ) -> None:
+        """Pull dataset from HuggingFace Hub."""
+
+        from huggingface_hub import snapshot_download
+
+        logger.info(f"Pulling dataset {self.repo_id} from HuggingFace Hub...")
+        snapshot_download(
+            self.repo_id,
+            repo_type="dataset",
+            revision=self.revision,
+            local_dir=self.root,
+            allow_patterns=allow_patterns,
+            ignore_patterns=ignore_patterns,
+        )
+
+    def load_hf_dataset(self) -> datasets.Dataset:
+        """Load dataset from parquet files."""
+        from datasets import load_dataset
+
+        path = str(self.root / "data")
+        hf_dataset = load_dataset("parquet", data_dir=path, split="train")
+
+        hf_dataset.set_transform(hf_transform_to_torch)
+        return hf_dataset
+
+    def _get_keyframe_chunk_indices(self, chunk_size: int = 250) -> list[tuple[int, int, int]]:
+        """
+        Divide episodes into chunks based on GOP size (keyframe interval).
+
+        For BEHAVIOR-1K, GOP size is 250 frames for efficient storage.
+
+        Returns:
+            List of (start_index, end_index, local_start_index) tuples
+        """
+        chunks = []
+        offset = 0
+
+        for ep_array_idx in self.episodes:
+            # self.episodes contains array indices, so access directly
+            ep = self.meta.episodes[ep_array_idx]
+            length = ep["length"]
+            local_starts = list(range(0, length, chunk_size))
+            local_ends = local_starts[1:] + [length]
+
+            for local_start, local_end in zip(local_starts, local_ends, strict=True):
+                chunks.append((offset + local_start, offset + local_end, local_start))
+            offset += length
+
+        return chunks
+
+    def __getitem__(self, idx: int) -> dict:
+        """Get item by index, with optional chunk streaming."""
+        if not self._chunk_streaming_using_keyframe:
+            item = self.hf_dataset[idx]
+
+            for key in self.meta.video_keys:
+                if key in self.features:
+                    ep_idx = item["episode_index"].item()
+                    timestamp = item["timestamp"].item()
+                    video_path = self.root / self.meta.get_video_file_path(ep_idx, key)
+                    frames = decode_video_frames(
+                        video_path, [timestamp], self.tolerance_s, self.video_backend
+                    )
+                    item[key] = frames.squeeze(0)
+
+            if self.image_transforms is not None:
+                for key in self.features:
+                    if key.startswith("observation.images."):
+                        item[key] = self.image_transforms(item[key])
+
+            if "task_index" in item:
+                task_idx = item["task_index"].item()
+                try:
+                    item["task"] = self.meta.tasks.iloc[task_idx].name
+                except (IndexError, AttributeError):
+                    item["task"] = f"task_{task_idx}"
+
+            return item
+
+        return self._get_item_streaming(idx)
+
+    def _get_item_streaming(self, idx: int) -> dict:
+        """Get item in chunk streaming mode."""
+        if self.current_streaming_chunk_idx is None:
+            worker_info = get_worker_info()
+            worker_id = 0 if worker_info is None else worker_info.id
+            rng = np.random.default_rng(self.seed + worker_id)
+            rng.shuffle(self.chunks)
+            self.current_streaming_chunk_idx = rng.integers(0, len(self.chunks)).item()
+            self.current_streaming_frame_idx = self.chunks[self.current_streaming_chunk_idx][0]
+
+        if self.current_streaming_frame_idx >= self.chunks[self.current_streaming_chunk_idx][1]:
+            self.current_streaming_chunk_idx += 1
+            if self.current_streaming_chunk_idx >= len(self.chunks):
+                self.current_streaming_chunk_idx = 0
+            self.current_streaming_frame_idx = self.chunks[self.current_streaming_chunk_idx][0]
+            self._should_obs_loaders_reload = True
+
+        item = self.hf_dataset[self.current_streaming_frame_idx]
+        ep_idx = item["episode_index"].item()
+
+        if self._should_obs_loaders_reload:
+            for loader in self.obs_loaders.values():
+                if hasattr(loader, "close"):
+                    loader.close()
+            self.obs_loaders = {}
+            self.current_streaming_episode_idx = ep_idx
+            self._should_obs_loaders_reload = False
+
+        for key in self.meta.video_keys:
+            if key in self.features:
+                timestamp = item["timestamp"].item()
+                video_path = self.root / self.meta.get_video_file_path(ep_idx, key)
+                frames = decode_video_frames(video_path, [timestamp], self.tolerance_s, self.video_backend)
+                item[key] = frames.squeeze(0)
+
+        if self.image_transforms is not None:
+            for key in self.features:
+                if key.startswith("observation.images."):
+                    item[key] = self.image_transforms(item[key])
+
+        if "task_index" in item:
+            task_idx = item["task_index"].item()
+            try:
+                item["task"] = self.meta.tasks.iloc[task_idx].name
+            except (IndexError, AttributeError):
+                item["task"] = f"task_{task_idx}"
+
+        self.current_streaming_frame_idx += 1
+        return item
+
+    def __len__(self) -> int:
+        """Total number of frames."""
+        return len(self.hf_dataset)
@@ -0,0 +1,350 @@
+#!/usr/bin/env python
+
+# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+from collections import OrderedDict
+
+import numpy as np
+import torch as th
+
+ROBOT_TYPE = "R1Pro"
+FPS = 30
+
+ROBOT_CAMERA_NAMES = {
+    "A1": {
+        "external": "external::external_camera",
+        "wrist": "external::wrist_camera",
+    },
+    "R1Pro": {
+        "left_wrist": "robot_r1::robot_r1:left_realsense_link:Camera:0",
+        "right_wrist": "robot_r1::robot_r1:right_realsense_link:Camera:0",
+        "head": "robot_r1::robot_r1:zed_link:Camera:0",
+    },
+}
+
+# Camera resolutions and corresponding intrinsics
+HEAD_RESOLUTION = (720, 720)
+WRIST_RESOLUTION = (480, 480)
+# TODO: Fix A1
+CAMERA_INTRINSICS = {
+    "A1": {
+        "external": np.array(
+            [[306.0, 0.0, 360.0], [0.0, 306.0, 360.0], [0.0, 0.0, 1.0]], dtype=np.float32
+        ),  # 240x240
+        "wrist": np.array(
+            [[388.6639, 0.0, 240.0], [0.0, 388.6639, 240.0], [0.0, 0.0, 1.0]], dtype=np.float32
+        ),  # 240x240
+    },
+    "R1Pro": {
+        "head": np.array(
+            [[306.0, 0.0, 360.0], [0.0, 306.0, 360.0], [0.0, 0.0, 1.0]], dtype=np.float32
+        ),  # 720x720
+        "left_wrist": np.array(
+            [[388.6639, 0.0, 240.0], [0.0, 388.6639, 240.0], [0.0, 0.0, 1.0]], dtype=np.float32
+        ),  # 480x480
+        "right_wrist": np.array(
+            [[388.6639, 0.0, 240.0], [0.0, 388.6639, 240.0], [0.0, 0.0, 1.0]], dtype=np.float32
+        ),  # 480x480
+    },
+}
+
+
+# Dataset features for BEHAVIOR-1K LeRobotDataset v3.0
+BEHAVIOR_DATASET_FEATURES = {
+    # Actions
+    "action": {
+        "dtype": "float32",
+        "shape": (23,),  # 23-dimensional action space for R1Pro
+        "names": None,
+    },
+    # Proprioception
+    "observation.state": {
+        "dtype": "float32",
+        "shape": (256,),  # Full proprioception state
+        "names": None,
+    },
+    # Camera relative poses
+    "observation.cam_rel_poses": {
+        "dtype": "float32",
+        "shape": (21,),  # 3 cameras * 7 (pos + quat)
+        "names": None,
+    },
+    # Task information
+    "observation.task_info": {
+        "dtype": "float32",
+        "shape": (None,),  # Variable size
+        "names": None,
+    },
+    # RGB images
+    "observation.images.rgb.head": {
+        "dtype": "video",
+        "shape": [720, 720, 3],
+        "names": ["height", "width", "channels"],
+    },
+    "observation.images.rgb.left_wrist": {
+        "dtype": "video",
+        "shape": [480, 480, 3],
+        "names": ["height", "width", "channels"],
+    },
+    "observation.images.rgb.right_wrist": {
+        "dtype": "video",
+        "shape": [480, 480, 3],
+        "names": ["height", "width", "channels"],
+    },
+    # Depth images
+    "observation.images.depth.head": {
+        "dtype": "video",
+        "shape": [720, 720, 1],
+        "names": ["height", "width", "channels"],
+    },
+    "observation.images.depth.left_wrist": {
+        "dtype": "video",
+        "shape": [480, 480, 1],
+        "names": ["height", "width", "channels"],
+    },
+    "observation.images.depth.right_wrist": {
+        "dtype": "video",
+        "shape": [480, 480, 1],
+        "names": ["height", "width", "channels"],
+    },
+    # Segmentation instance ID images
+    "observation.images.seg_instance_id.head": {
+        "dtype": "video",
+        "shape": [720, 720, 1],
+        "names": ["height", "width", "channels"],
+    },
+    "observation.images.seg_instance_id.left_wrist": {
+        "dtype": "video",
+        "shape": [480, 480, 1],
+        "names": ["height", "width", "channels"],
+    },
+    "observation.images.seg_instance_id.right_wrist": {
+        "dtype": "video",
+        "shape": [480, 480, 1],
+        "names": ["height", "width", "channels"],
+    },
+}
+
+
+# Action indices
+ACTION_QPOS_INDICES = {
+    "A1": OrderedDict(
+        {
+            "arm": np.s_[0:6],
+            "gripper": np.s_[6:7],
+        }
+    ),
+    "R1Pro": OrderedDict(
+        {
+            "base": np.s_[0:3],
+            "torso": np.s_[3:7],
+            "left_arm": np.s_[7:14],
+            "left_gripper": np.s_[14:15],
+            "right_arm": np.s_[15:22],
+            "right_gripper": np.s_[22:23],
+        }
+    ),
+}
+
+
+# Proprioception configuration
+PROPRIOCEPTION_INDICES = {
+    "A1": OrderedDict(
+        {
+            "joint_qpos": np.s_[0:8],
+            "joint_qpos_sin": np.s_[8:16],
+            "joint_qpos_cos": np.s_[16:24],
+            "joint_qvel": np.s_[24:32],
+            "joint_qeffort": np.s_[32:40],
+            "eef_0_pos": np.s_[40:43],
+            "eef_0_quat": np.s_[43:47],
+            "grasp_0": np.s_[47:48],
+            "gripper_0_qpos": np.s_[48:50],
+            "gripper_0_qvel": np.s_[50:52],
+        }
+    ),
+    "R1Pro": OrderedDict(
+        {
+            "joint_qpos": np.s_[
+                0:28
+            ],  # Full robot joint positions; the first 6 are base joints, which are NOT allowed in standard track
+            "joint_qpos_sin": np.s_[
+                28:56
+            ],  # Sine of the joint positions; the first 6 are base joints, which are NOT allowed in standard track
+            "joint_qpos_cos": np.s_[
+                56:84
+            ],  # Cosine of the joint positions; the first 6 are base joints, which are NOT allowed in standard track
+            "joint_qvel": np.s_[84:112],
+            "joint_qeffort": np.s_[112:140],
+            "robot_pos": np.s_[140:143],  # Global pos, this is NOT allowed in standard track
+            "robot_ori_cos": np.s_[143:146],  # Global ori, this is NOT allowed in standard track
+            "robot_ori_sin": np.s_[146:149],  # Global ori, this is NOT allowed in standard track
+            "robot_2d_ori": np.s_[149:150],  # 2D global ori, this is NOT allowed in standard track
+            "robot_2d_ori_cos": np.s_[150:151],  # 2D global ori, this is NOT allowed in standard track
+            "robot_2d_ori_sin": np.s_[151:152],  # 2D global ori, this is NOT allowed in standard track
+            "robot_lin_vel": np.s_[152:155],
+            "robot_ang_vel": np.s_[155:158],
+            "arm_left_qpos": np.s_[158:165],
+            "arm_left_qpos_sin": np.s_[165:172],
+            "arm_left_qpos_cos": np.s_[172:179],
+            "arm_left_qvel": np.s_[179:186],
+            "eef_left_pos": np.s_[186:189],
+            "eef_left_quat": np.s_[189:193],
+            "gripper_left_qpos": np.s_[193:195],
+            "gripper_left_qvel": np.s_[195:197],
+            "arm_right_qpos": np.s_[197:204],
+            "arm_right_qpos_sin": np.s_[204:211],
+            "arm_right_qpos_cos": np.s_[211:218],
+            "arm_right_qvel": np.s_[218:225],
+            "eef_right_pos": np.s_[225:228],
+            "eef_right_quat": np.s_[228:232],
+            "gripper_right_qpos": np.s_[232:234],
+            "gripper_right_qvel": np.s_[234:236],
+            "trunk_qpos": np.s_[236:240],
+            "trunk_qvel": np.s_[240:244],
+            "base_qpos": np.s_[244:247],  # Base joint position, this is NOT allowed in standard track
+            "base_qpos_sin": np.s_[247:250],  # Base joint position, this is NOT allowed in standard track
+            "base_qpos_cos": np.s_[250:253],  # Base joint position, this is NOT allowed in standard track
+            "base_qvel": np.s_[253:256],
+        }
+    ),
+}
+
+# Per-part indices into the joint_qpos slice of the proprioception vector
+PROPRIO_QPOS_INDICES = {
+    "A1": OrderedDict(
+        {
+            "arm": np.s_[0:6],
+            "gripper": np.s_[6:8],
+        }
+    ),
+    "R1Pro": OrderedDict(
+        {
+            "torso": np.s_[6:10],
+            "left_arm": np.s_[10:24:2],
+            "right_arm": np.s_[11:24:2],
+            "left_gripper": np.s_[24:26],
+            "right_gripper": np.s_[26:28],
+        }
+    ),
+}
+
+
+# Joint limits (lower, upper)
+JOINT_RANGE = {
+    "A1": {
+        "arm": (
+            th.tensor([-2.8798, 0.0, -3.3161, -2.8798, -1.6581, -2.8798], dtype=th.float32),
+            th.tensor([2.8798, 3.1415, 0.0, 2.8798, 1.6581, 2.8798], dtype=th.float32),
+        ),
+        "gripper": (th.tensor([0.00], dtype=th.float32), th.tensor([0.03], dtype=th.float32)),
+    },
+    "R1Pro": {
+        "base": (
+            th.tensor([-0.75, -0.75, -1.0], dtype=th.float32),
+            th.tensor([0.75, 0.75, 1.0], dtype=th.float32),
+        ),
+        "torso": (
+            th.tensor([-1.1345, -2.7925, -1.8326, -3.0543], dtype=th.float32),
+            th.tensor([1.8326, 2.5307, 1.5708, 3.0543], dtype=th.float32),
+        ),
+        "left_arm": (
+            th.tensor([-4.4506, -0.1745, -2.3562, -2.0944, -2.3562, -1.0472, -1.5708], dtype=th.float32),
+            th.tensor([1.3090, 3.1416, 2.3562, 0.3491, 2.3562, 1.0472, 1.5708], dtype=th.float32),
+        ),
+        "left_gripper": (th.tensor([0.00], dtype=th.float32), th.tensor([0.05], dtype=th.float32)),
+        "right_arm": (
+            th.tensor([-4.4506, -3.1416, -2.3562, -2.0944, -2.3562, -1.0472, -1.5708], dtype=th.float32),
+            th.tensor([1.3090, 0.1745, 2.3562, 0.3491, 2.3562, 1.0472, 1.5708], dtype=th.float32),
+        ),
+        "right_gripper": (th.tensor([0.00], dtype=th.float32), th.tensor([0.05], dtype=th.float32)),
+    },
+}
+
+
+EEF_POSITION_RANGE = {
+    "A1": {
+        "0": (th.tensor([0.0, -0.7, 0.0], dtype=th.float32), th.tensor([0.7, 0.7, 0.7], dtype=th.float32)),
+    },
+    "R1Pro": {
+        "left": (
+            th.tensor([0.0, -0.65, 0.0], dtype=th.float32),
+            th.tensor([0.65, 0.65, 2.5], dtype=th.float32),
+        ),
+        "right": (
+            th.tensor([0.0, -0.65, 0.0], dtype=th.float32),
+            th.tensor([0.65, 0.65, 2.5], dtype=th.float32),
+        ),
+    },
+}
+
+
+TASK_NAMES_TO_INDICES = {
+    # B10
+    "turning_on_radio": 0,
+    "picking_up_trash": 1,
+    "putting_away_Halloween_decorations": 2,
+    "cleaning_up_plates_and_food": 3,
+    "can_meat": 4,
+    "setting_mousetraps": 5,
+    "hiding_Easter_eggs": 6,
+    "picking_up_toys": 7,
+    "rearranging_kitchen_furniture": 8,
+    "putting_up_Christmas_decorations_inside": 9,
+    # B20
+    "set_up_a_coffee_station_in_your_kitchen": 10,
+    "putting_dishes_away_after_cleaning": 11,
+    "preparing_lunch_box": 12,
+    "loading_the_car": 13,
+    "carrying_in_groceries": 14,
+    "bringing_in_wood": 15,
+    "moving_boxes_to_storage": 16,
+    "bringing_water": 17,
+    "tidying_bedroom": 18,
+    "outfit_a_basic_toolbox": 19,
+    # B30
+    "sorting_vegetables": 20,
+    "collecting_childrens_toys": 21,
+    "putting_shoes_on_rack": 22,
+    "boxing_books_up_for_storage": 23,
+    "storing_food": 24,
+    "clearing_food_from_table_into_fridge": 25,
+    "assembling_gift_baskets": 26,
+    "sorting_household_items": 27,
+    "getting_organized_for_work": 28,
+    "clean_up_your_desk": 29,
+    # B40
+    "setting_the_fire": 30,
+    "clean_boxing_gloves": 31,
+    "wash_a_baseball_cap": 32,
+    "wash_dog_toys": 33,
+    "hanging_pictures": 34,
+    "attach_a_camera_to_a_tripod": 35,
+    "clean_a_patio": 36,
+    "clean_a_trumpet": 37,
+    "spraying_for_bugs": 38,
+    "spraying_fruit_trees": 39,
+    # B50
+    "make_microwave_popcorn": 40,
+    "cook_cabbage": 41,
+    "chop_an_onion": 42,
+    "slicing_vegetables": 43,
+    "chopping_wood": 44,
+    "cook_hot_dogs": 45,
+    "cook_bacon": 46,
+    "freeze_pies": 47,
+    "canning_food": 48,
+    "make_pizza": 49,
+}
+TASK_INDICES_TO_NAMES = {v: k for k, v in TASK_NAMES_TO_INDICES.items()}
@@ -0,0 +1,605 @@
+#!/usr/bin/env python
+
+# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Convert Behavior Dataset to LeRobotDataset v3.0 format"""
+
+import argparse
+import json
+import logging
+import shutil
+from pathlib import Path
+
+import jsonlines
+import pandas as pd
+import pyarrow as pa
+import tqdm
+from datasets import Dataset, Features, Image
+
+from lerobot.datasets.compute_stats import aggregate_stats
+from lerobot.datasets.utils import (
+    DEFAULT_CHUNK_SIZE,
+    DEFAULT_DATA_FILE_SIZE_IN_MB,
+    DEFAULT_DATA_PATH,
+    DEFAULT_VIDEO_FILE_SIZE_IN_MB,
+    DEFAULT_VIDEO_PATH,
+    LEGACY_EPISODES_PATH,
+    LEGACY_EPISODES_STATS_PATH,
+    LEGACY_TASKS_PATH,
+    cast_stats_to_numpy,
+    flatten_dict,
+    get_file_size_in_mb,
+    get_parquet_file_size_in_mb,
+    get_parquet_num_frames,
+    load_info,
+    update_chunk_file_indices,
+    write_episodes,
+    write_info,
+    write_stats,
+    write_tasks,
+)
+from lerobot.datasets.video_utils import concatenate_video_files, get_video_duration_in_s
+from lerobot.utils.utils import init_logging
+
+# Script to convert a single task to v3.0
+# TASK = 1
+NEW_ROOT = Path("/fsx/jade_choghari/tmp/bb")
+
+
+def get_total_episodes_task(local_dir: Path, task_id: int, task_ranges: dict, step: int) -> int:
+    """
+    Calculates the total number of episodes for a single, specified task.
+    """
+    # Simply load the episodes for the task and count them.
+    episodes = legacy_load_episodes_task(
+        local_dir=local_dir, task_id=task_id, task_ranges=task_ranges, step=step
+    )
+    return len(episodes)
+
+
+NUM_CAMERAS = 9  # video streams per episode: 3 cameras x 3 modalities (rgb, depth, seg_instance_id)
+
+
+def get_total_frames_task(local_dir, meta_path, task_id: int, task_ranges: dict, step: int) -> int:
+    episodes_metadata = legacy_load_episodes_task(
+        local_dir=local_dir, task_id=task_id, task_ranges=task_ranges, step=step
+    )
+    total_frames = 0
+    # "length" holds the number of frames in the episode
+    for ep in episodes_metadata.values():
+        num_frames = ep["length"]
+        total_frames += int(num_frames)
+    return total_frames
+
+
+def convert_info(
+    root, new_root, data_file_size_in_mb, video_file_size_in_mb, meta_path, task_id: int, task_ranges, step
+):
+    info = load_info(root)
+    info["codebase_version"] = "v3.0"
+    del info["total_videos"]
+    info["data_files_size_in_mb"] = data_file_size_in_mb
+    info["video_files_size_in_mb"] = video_file_size_in_mb
+    info["data_path"] = DEFAULT_DATA_PATH
+    info["video_path"] = DEFAULT_VIDEO_PATH if info["video_path"] is not None else None
+    info["fps"] = int(info["fps"])
+    for key in info["features"]:
+        if info["features"][key]["dtype"] == "video":
+            # already has fps in video_info
+            continue
+        info["features"][key]["fps"] = info["fps"]
+
+    info["total_episodes"] = get_total_episodes_task(root, task_id, task_ranges, step)
+    info["total_videos"] = info["total_episodes"] * NUM_CAMERAS
+    info["total_frames"] = get_total_frames_task(root, meta_path, task_id, task_ranges, step)
+    info["total_tasks"] = 1
+    write_info(info, new_root)
+
+
+def load_jsonlines(fpath: Path) -> list[dict]:
+    with jsonlines.open(fpath, "r") as reader:
+        return list(reader)
+
+
+def legacy_load_tasks(local_dir: Path) -> tuple[dict, dict]:
+    tasks = load_jsonlines(local_dir / LEGACY_TASKS_PATH)
+    # Build {task_index: task} ordered by task_index, plus the inverse {task: task_index} mapping
+    tasks = {item["task_index"]: item["task"] for item in sorted(tasks, key=lambda x: x["task_index"])}
+    task_to_task_index = {task: task_index for task_index, task in tasks.items()}
+    return tasks, task_to_task_index
+
+
+def convert_tasks(root, new_root, task_id: int):
+    tasks, _ = legacy_load_tasks(root)
+    if task_id not in tasks:
+        raise ValueError(f"Task ID {task_id} not found in tasks (available: {list(tasks.keys())})")
+    tasks = {task_id: tasks[task_id]}
+    task_indices = tasks.keys()
+    task_strings = tasks.values()
+    df_tasks = pd.DataFrame({"task_index": task_indices}, index=task_strings)
+    write_tasks(df_tasks, new_root)
+
+
+def concat_data_files(paths_to_cat, new_root, chunk_idx, file_idx, image_keys):
+    # TODO(rcadene): to save RAM use Dataset.from_parquet(file) and concatenate_datasets
+    dataframes = [pd.read_parquet(file) for file in paths_to_cat]
+    # Concatenate all DataFrames along rows
+    concatenated_df = pd.concat(dataframes, ignore_index=True)
+
+    path = new_root / DEFAULT_DATA_PATH.format(chunk_index=chunk_idx, file_index=file_idx)
+    path.parent.mkdir(parents=True, exist_ok=True)
+    if len(image_keys) > 0:
+        schema = pa.Schema.from_pandas(concatenated_df)
+        features = Features.from_arrow_schema(schema)
+        for key in image_keys:
+            features[key] = Image()
+        schema = features.arrow_schema
+    else:
+        schema = None
+
+    concatenated_df.to_parquet(path, index=False, schema=schema)
+
+
+def get_image_keys(root):
+    info = load_info(root)
+    features = info["features"]
+    image_keys = [key for key, ft in features.items() if ft["dtype"] == "image"]
+    return image_keys
+
+
+def convert_data(root: Path, new_root: Path, data_file_size_in_mb: int, task_index: int):
+    task_dir_name = f"task-{task_index:04d}"  # zero-padded to 4 digits, e.g. task-0001
+    data_dir = root / "data" / task_dir_name
+    ep_paths = sorted(data_dir.glob("*.parquet"))
+    image_keys = get_image_keys(root)
+
+    ep_idx = 0
+    chunk_idx = 0
+    file_idx = 0
+    size_in_mb = 0
+    num_frames = 0
+    paths_to_cat = []
+    episodes_metadata = []
+
+    logging.info(f"Converting data files from {len(ep_paths)} episodes")
+
+    for ep_path in tqdm.tqdm(ep_paths, desc="convert data files"):
+        ep_size_in_mb = get_parquet_file_size_in_mb(ep_path)
+        ep_num_frames = get_parquet_num_frames(ep_path)
+        ep_metadata = {
+            "episode_index": ep_idx,
+            "data/chunk_index": chunk_idx,
+            "data/file_index": file_idx,
+            "dataset_from_index": num_frames,
+            "dataset_to_index": num_frames + ep_num_frames,
+        }
+        size_in_mb += ep_size_in_mb
+        num_frames += ep_num_frames
+        episodes_metadata.append(ep_metadata)
+        ep_idx += 1
+
+        if size_in_mb < data_file_size_in_mb:
+            paths_to_cat.append(ep_path)
+            continue
+
+        if paths_to_cat:
+            concat_data_files(paths_to_cat, new_root, chunk_idx, file_idx, image_keys)
+
+        # Reset for the next file
+        size_in_mb = ep_size_in_mb
+        paths_to_cat = [ep_path]
+
+        chunk_idx, file_idx = update_chunk_file_indices(chunk_idx, file_idx, DEFAULT_CHUNK_SIZE)
+
+    # Write remaining data if any
+    if paths_to_cat:
+        concat_data_files(paths_to_cat, new_root, chunk_idx, file_idx, image_keys)
+
+    return episodes_metadata
+
+
+def convert_videos_of_camera(
+    root: Path, new_root: Path, video_key: str, video_file_size_in_mb: int, task_index: int
+):
+    # Access old paths to mp4
+    # videos_dir = root / "videos"
+    # ep_paths = sorted(videos_dir.glob(f"*/{video_key}/*.mp4"))
+    task_dir_name = f"task-{task_index:04d}"  # zero-padded to 4 digits, e.g. task-0001
+    videos_dir = root / "videos" / task_dir_name / video_key
+    ep_paths = sorted(videos_dir.glob("*.mp4"))
+    logging.debug(f"ep_paths: {ep_paths}")
+    ep_idx = 0
+    chunk_idx = 0
+    file_idx = 0
+    size_in_mb = 0
+    duration_in_s = 0.0
+    paths_to_cat = []
+    episodes_metadata = []
+
+    for ep_path in tqdm.tqdm(ep_paths, desc=f"convert videos of {video_key}"):
+        ep_size_in_mb = get_file_size_in_mb(ep_path)
+        ep_duration_in_s = get_video_duration_in_s(ep_path)
+
+        # Check if adding this episode would exceed the limit
+        if size_in_mb + ep_size_in_mb >= video_file_size_in_mb and len(paths_to_cat) > 0:
+            # Size limit would be exceeded, save current accumulation WITHOUT this episode
+            concatenate_video_files(
+                paths_to_cat,
+                new_root
+                / DEFAULT_VIDEO_PATH.format(video_key=video_key, chunk_index=chunk_idx, file_index=file_idx),
+            )
+
+            # Update episodes metadata for the file we just saved
+            for i, _ in enumerate(paths_to_cat):
+                past_ep_idx = ep_idx - len(paths_to_cat) + i
+                episodes_metadata[past_ep_idx][f"videos/{video_key}/chunk_index"] = chunk_idx
+                episodes_metadata[past_ep_idx][f"videos/{video_key}/file_index"] = file_idx
+
+            # Move to next file and start fresh with current episode
+            chunk_idx, file_idx = update_chunk_file_indices(chunk_idx, file_idx, DEFAULT_CHUNK_SIZE)
+            size_in_mb = 0
+            duration_in_s = 0.0
+            paths_to_cat = []
+
+        # Add current episode metadata
+        ep_metadata = {
+            "episode_index": ep_idx,
+            f"videos/{video_key}/chunk_index": chunk_idx,  # Will be updated when file is saved
+            f"videos/{video_key}/file_index": file_idx,  # Will be updated when file is saved
+            f"videos/{video_key}/from_timestamp": duration_in_s,
+            f"videos/{video_key}/to_timestamp": duration_in_s + ep_duration_in_s,
+        }
+        episodes_metadata.append(ep_metadata)
+
+        # Add current episode to accumulation
+        paths_to_cat.append(ep_path)
+        size_in_mb += ep_size_in_mb
+        duration_in_s += ep_duration_in_s
+        ep_idx += 1
+
+    # Write remaining videos if any
+    if paths_to_cat:
+        concatenate_video_files(
+            paths_to_cat,
+            new_root
+            / DEFAULT_VIDEO_PATH.format(video_key=video_key, chunk_index=chunk_idx, file_index=file_idx),
+        )
+
+        # Update episodes metadata for the final file
+        for i, _ in enumerate(paths_to_cat):
+            past_ep_idx = ep_idx - len(paths_to_cat) + i
+            episodes_metadata[past_ep_idx][f"videos/{video_key}/chunk_index"] = chunk_idx
+            episodes_metadata[past_ep_idx][f"videos/{video_key}/file_index"] = file_idx
+
+    return episodes_metadata
+
+
+def get_video_keys(root):
+    info = load_info(root)
+    features = info["features"]
+    video_keys = [key for key, ft in features.items() if ft["dtype"] == "video"]
+    return video_keys
+
+
+def convert_videos(root: Path, new_root: Path, video_file_size_in_mb: int, task_id: int):
+    logging.info(f"Converting videos from {root} to {new_root}")
+
+    video_keys = get_video_keys(root)
+    if len(video_keys) == 0:
+        return None
+
+    video_keys = sorted(video_keys)
+
+    eps_metadata_per_cam = []
+    for camera in video_keys:
+        eps_metadata = convert_videos_of_camera(root, new_root, camera, video_file_size_in_mb, task_id)
+        eps_metadata_per_cam.append(eps_metadata)
+
+    num_eps_per_cam = [len(eps_cam_map) for eps_cam_map in eps_metadata_per_cam]
+    if len(set(num_eps_per_cam)) != 1:
+        raise ValueError(f"Not all cameras have the same number of episodes ({num_eps_per_cam}).")
+
+    episodes_metadata = []
+    num_cameras = len(video_keys)
+    num_episodes = num_eps_per_cam[0]
+    for ep_idx in tqdm.tqdm(range(num_episodes), desc="convert videos"):
+        # Sanity check
+        ep_ids = [eps_metadata_per_cam[cam_idx][ep_idx]["episode_index"] for cam_idx in range(num_cameras)]
+        ep_ids += [ep_idx]
+        if len(set(ep_ids)) != 1:
+            raise ValueError(f"All episode indices need to match ({ep_ids}).")
+
+        ep_dict = {}
+        for cam_idx in range(num_cameras):
+            ep_dict.update(eps_metadata_per_cam[cam_idx][ep_idx])
+        episodes_metadata.append(ep_dict)
+
+    return episodes_metadata
+
+
+def infer_task_episode_ranges(episodes_jsonl_path: Path) -> dict:
+    """
+    Parse the Behavior-1K episodes.jsonl metadata and infer contiguous episode ranges per unique task.
+    Returns a dict:
+      { task_id: { "task_string": ..., "ep_start": ..., "ep_end": ... } }
+    """
+    task_ranges = {}
+    task_id = 0
+    current_task_str = None
+    ep_start = None
+    ep_end = None
+
+    with open(episodes_jsonl_path) as f:
+        for line in f:
+            if not line.strip():
+                continue
+            ep = json.loads(line)
+            ep_idx = ep["episode_index"]
+            task_str = ep["tasks"][0] if ep["tasks"] else "UNKNOWN"
+
+            if current_task_str is None:
+                current_task_str = task_str
+                ep_start = ep_idx
+                ep_end = ep_idx
+            elif task_str == current_task_str:
+                ep_end = ep_idx
+            else:
+                # close previous task group
+                task_ranges[task_id] = {
+                    "task_string": current_task_str,
+                    "ep_start": ep_start,
+                    "ep_end": ep_end,
+                }
+                task_id += 1
+                # start new one
+                current_task_str = task_str
+                ep_start = ep_idx
+                ep_end = ep_idx
+
+    # store last task
+    if current_task_str is not None:
+        task_ranges[task_id] = {
+            "task_string": current_task_str,
+            "ep_start": ep_start,
+            "ep_end": ep_end,
+        }
+
+    return task_ranges
+
+
+def legacy_load_episodes_task(local_dir: Path, task_id: int, task_ranges: dict, step: int = 10) -> dict:
+    """
+    Load only the episodes belonging to a specific task, inferred automatically from episode ranges.
+
+    Args:
+        local_dir (Path): Root path containing legacy meta/episodes.jsonl
+        task_id (int): Which task to load (key from the inferred task_ranges dict)
+        task_ranges (dict): Mapping from infer_task_episode_ranges()
+        step (int): Episode index step (Behavior-1K = 10)
+    """
+    all_episodes = legacy_load_episodes(local_dir)
+
+    # get the range for this task
+    if task_id not in task_ranges:
+        raise ValueError(f"Task id {task_id} not found in task_ranges")
+
+    ep_start = task_ranges[task_id]["ep_start"]
+    ep_end = task_ranges[task_id]["ep_end"]
+
+    task_episode_indices = range(ep_start, ep_end + step, step)
+    return {i: all_episodes[i] for i in task_episode_indices if i in all_episodes}
+
+
+def legacy_load_episodes(local_dir: Path) -> dict:
+    episodes = load_jsonlines(local_dir / LEGACY_EPISODES_PATH)
+    return {item["episode_index"]: item for item in sorted(episodes, key=lambda x: x["episode_index"])}
+
+
+def legacy_load_episodes_stats(local_dir: Path) -> dict:
+    episodes_stats = load_jsonlines(local_dir / LEGACY_EPISODES_STATS_PATH)
+    return {
+        item["episode_index"]: cast_stats_to_numpy(item["stats"])
+        for item in sorted(episodes_stats, key=lambda x: x["episode_index"])
+    }
+
+
+def legacy_load_episodes_stats_task(local_dir: Path, task_id: int, task_ranges: dict, step: int = 10) -> dict:
+    all_stats = legacy_load_episodes_stats(local_dir)
+
+    if task_id not in task_ranges:
+        raise ValueError(f"Task id {task_id} not found in task_ranges")
+
+    ep_start = task_ranges[task_id]["ep_start"]
+    ep_end = task_ranges[task_id]["ep_end"]
+
+    task_episode_indices = range(ep_start, ep_end + step, step)
+    return {i: all_stats[i] for i in task_episode_indices if i in all_stats}
+
+
+def generate_episode_metadata_dict(
+    episodes_legacy_metadata, episodes_metadata, episodes_stats, episodes_videos=None
+):
+    num_episodes = len(episodes_metadata)
+    episodes_legacy_metadata_vals = list(episodes_legacy_metadata.values())
+    episodes_stats_vals = list(episodes_stats.values())
+    episodes_stats_keys = list(episodes_stats.keys())
+
+    for i in range(num_episodes):
+        ep_legacy_metadata = episodes_legacy_metadata_vals[i]
+        ep_metadata = episodes_metadata[i]
+        ep_stats = episodes_stats_vals[i]
+
+        ep_ids_set = {
+            ep_legacy_metadata["episode_index"],
+            ep_metadata["episode_index"],
+            episodes_stats_keys[i],
+        }
+
+        if episodes_videos is None:
+            ep_video = {}
+        else:
+            ep_video = episodes_videos[i]
+            ep_ids_set.add(ep_video["episode_index"])
+        # This check is skipped: legacy episode indices advance in steps of 10, whereas converted ones use a step of 1
+        # if len(ep_ids_set) != 1:
+        #     raise ValueError(f"Number of episodes is not the same ({ep_ids_set}).")
+
+        ep_dict = {**ep_metadata, **ep_video, **ep_legacy_metadata, **flatten_dict({"stats": ep_stats})}
+        ep_dict["meta/episodes/chunk_index"] = 0
+        ep_dict["meta/episodes/file_index"] = 0
+        yield ep_dict
+
+
+def convert_episodes_metadata(
+    root, new_root, episodes_metadata, task_id: int, task_ranges, episodes_video_metadata=None
+):
+    logging.info(f"Converting episodes metadata from {root} to {new_root}")
+
+    # filter by task
+    episodes_legacy_metadata = legacy_load_episodes_task(root, task_id=task_id, task_ranges=task_ranges)
+    episodes_stats = legacy_load_episodes_stats_task(root, task_id=task_id, task_ranges=task_ranges)
+
+    num_eps_set = {len(episodes_legacy_metadata), len(episodes_metadata)}
+    if episodes_video_metadata is not None:
+        num_eps_set.add(len(episodes_video_metadata))
+
+    if len(num_eps_set) != 1:
+        raise ValueError(f"Number of episodes is not the same ({num_eps_set}).")
+
+    ds_episodes = Dataset.from_generator(
+        lambda: generate_episode_metadata_dict(
+            episodes_legacy_metadata, episodes_metadata, episodes_stats, episodes_video_metadata
+        )
+    )
+    write_episodes(ds_episodes, new_root)
+
+    stats = aggregate_stats(list(episodes_stats.values()))
+    write_stats(stats, new_root)
+
+
+def convert_dataset_local(
+    data_path: Path,
+    new_repo: Path,
+    task_id: int,
+    data_file_size_in_mb: int = DEFAULT_DATA_FILE_SIZE_IN_MB,
+    video_file_size_in_mb: int = DEFAULT_VIDEO_FILE_SIZE_IN_MB,
+    force_conversion: bool = False,
+):
+    """
+    Convert a local dataset to v3.x format, task-by-task, without using the Hugging Face Hub.
+
+    Args:
+        data_path (Path): path to local dataset root (e.g. /fsx/.../2025-challenge-demos)
+        new_repo (Path): path where converted dataset will be written (e.g. /fsx/.../behavior1k_v3)
+        task_id (int): which task to convert (index)
+        data_file_size_in_mb (int): maximum size of each data file, in MB
+        video_file_size_in_mb (int): maximum size of each video file, in MB
+        force_conversion (bool): overwrite existing conversion if True
+    """
+
+    root = Path(data_path)
+    new_root = Path(new_repo)
+
+    # Clean up if needed
+    if new_root.exists() and force_conversion:
+        shutil.rmtree(new_root)
+    new_root.mkdir(parents=True, exist_ok=True)
+
+    print(f"🔹 Starting conversion for task {task_id}")
+    print(f"Input root: {root}")
+    print(f"Output root: {new_root}")
+    # Infer task episode ranges
+    episodes_meta_path = root / "meta" / "episodes.jsonl"
+    task_ranges = infer_task_episode_ranges(episodes_meta_path)
+    convert_info(
+        root,
+        new_root,
+        data_file_size_in_mb,
+        video_file_size_in_mb,
+        episodes_meta_path,
+        task_id,
+        task_ranges,
+        step=10,
+    )
+    convert_tasks(root, new_root, task_id)
+    episodes_metadata = convert_data(root, new_root, data_file_size_in_mb, task_index=task_id)
+    episodes_videos_metadata = convert_videos(root, new_root, video_file_size_in_mb, task_id=task_id)
+    convert_episodes_metadata(
+        root,
+        new_root,
+        episodes_metadata,
+        task_id=task_id,
+        task_ranges=task_ranges,
+        episodes_video_metadata=episodes_videos_metadata,
+    )
+
+    print(f"✅ Conversion complete for task {task_id}")
+    print(f"Converted dataset written to: {new_root}")
+
+
+if __name__ == "__main__":
+    import argparse
+    from pathlib import Path
+
+    init_logging()
+
+    parser = argparse.ArgumentParser(
+        description="Convert Behavior-1K tasks to LeRobot v3 format (local only)"
+    )
+    parser.add_argument(
+        "--data-path",
+        type=str,
+        required=True,
+        help="Path to the local Behavior-1K dataset (e.g. /fsx/francesco_capuano/.cache/behavior-1k/2025-challenge-demos)",
+    )
+    parser.add_argument(
+        "--new-repo",
+        type=str,
+        required=True,
+        help="Path to the output directory for the converted dataset",
+    )
+    parser.add_argument(
+        "--task-id",
+        type=int,
+        required=True,
+        help="Task index to convert (e.g. 0, 1, 2, ...)",
+    )
+    parser.add_argument(
+        "--data-file-size-in-mb",
+        type=int,
+        default=DEFAULT_DATA_FILE_SIZE_IN_MB,
+        help=f"Maximum size per data file, in MB (default: {DEFAULT_DATA_FILE_SIZE_IN_MB})",
+    )
+    parser.add_argument(
+        "--video-file-size-in-mb",
+        type=int,
+        default=DEFAULT_VIDEO_FILE_SIZE_IN_MB,
+        help=f"Maximum size per video file, in MB (default: {DEFAULT_VIDEO_FILE_SIZE_IN_MB})",
+    )
+    parser.add_argument(
+        "--force-conversion",
+        action="store_true",
+        help="Force overwrite of existing conversion output if present.",
+    )
+
+    args = parser.parse_args()
+
+    convert_dataset_local(
+        data_path=Path(args.data_path),
+        new_repo=Path(args.new_repo),
+        task_id=args.task_id,
+        data_file_size_in_mb=args.data_file_size_in_mb,
+        video_file_size_in_mb=args.video_file_size_in_mb,
+        force_conversion=args.force_conversion,
+    )
@@ -0,0 +1,130 @@
+#!/usr/bin/env python
+
+# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""
+Test script to verify BEHAVIOR-1K dataset loading with v3.0 wrapper.
+"""
+
+import argparse
+import logging
+
+from behavior_lerobot_dataset_v3 import BehaviorLeRobotDatasetV3
+
+from lerobot.utils.utils import init_logging
+
+init_logging()
+
+
+def load_behavior1k_dataset(repo_id, root):
+    """Test basic dataset loading."""
+    logging.info("=" * 80)
+    logging.info("Testing BEHAVIOR-1K dataset loading")
+    logging.info("=" * 80)
+
+    logging.info(f"\n1. Loading dataset with repo_id: {repo_id}")
+    dataset = BehaviorLeRobotDatasetV3(
+        repo_id=repo_id,
+        root=root,
+        modalities=["rgb"],
+        cameras=["head"],
+        chunk_streaming_using_keyframe=False,
+        check_timestamp_sync=False,
+    )
+
+    logging.info("\n2. Dataset loaded successfully!")
+    logging.info(f"   - Number of episodes: {dataset.num_episodes}")
+    logging.info(f"   - Number of frames: {dataset.num_frames}")
+    logging.info(f"   - FPS: {dataset.fps}")
+    logging.info(f"   - Features: {list(dataset.features)}")
+
+    return dataset
+
+
+def load_behavior1k_dataset_with_multiple_modalities(repo_id, root):
+    """Test loading multiple modalities and cameras."""
+    logging.info("\n" + "=" * 80)
+    logging.info(f"Testing multi-modality loading with repo_id: {repo_id}")
+    logging.info("=" * 80)
+
+    logging.info(f"\n1. Loading dataset with RGB + Depth with repo_id: {repo_id}")
+    dataset = BehaviorLeRobotDatasetV3(
+        repo_id=repo_id,
+        root=root,
+        modalities=["rgb", "depth"],
+        cameras=["head", "left_wrist", "right_wrist"],
+        chunk_streaming_using_keyframe=False,
+        check_timestamp_sync=False,
+        video_backend="pyav",
+    )
+
+    logging.info(f"\n2. Dataset loaded with modalities: {list(dataset.features)}")
+    logging.info(f"   - Total features: {len(dataset.features)}")
+
+    rgb_keys = [k for k in dataset.features if "rgb" in k]
+    depth_keys = [k for k in dataset.features if "depth" in k]
+    logging.info(f"   - RGB features: {rgb_keys}")
+    logging.info(f"   - Depth features: {depth_keys}")
+
+    logging.info("\n3. SUCCESS! Multi-modality loading works.")
+
+    return dataset
+
+
+def stream_behavior1k_dataset(repo_id, root):
+    """Test chunk streaming mode."""
+    logging.info("\n" + "=" * 80)
+    logging.info("Testing chunk streaming mode")
+    logging.info("=" * 80)
+
+    logging.info("\n1. Loading dataset with chunk streaming...")
+    dataset = BehaviorLeRobotDatasetV3(
+        repo_id=repo_id,
+        root=root,
+        modalities=["rgb"],
+        cameras=["head"],
+        chunk_streaming_using_keyframe=True,
+        shuffle=True,
+        seed=42,
+        check_timestamp_sync=False,
+    )
+
+    logging.info("\n2. Dataset loaded in streaming mode")
+    logging.info(f"   - Number of chunks: {len(dataset.chunks)}")
+    logging.info(f"   - First chunk range: {dataset.chunks[0]}")
+
+    logging.info("\n3. Testing frame access in streaming mode...")
+    for i in range(min(3, len(dataset))):
+        frame = dataset[i]
+        logging.info(
+            f"   - Frame {i}: episode_index={frame['episode_index'].item()}, "
+            f"task_index={frame['task_index'].item()}"
+        )
+
+    logging.info("\n4. SUCCESS! Chunk streaming works.")
+
+    return dataset
+
+
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--repo-id", type=str, default=None, help="Dataset repo id to load.")
+    parser.add_argument("--root", type=str, default=None, help="Local root directory of the dataset.")
+
+    args = parser.parse_args()
+
+    load_behavior1k_dataset(args.repo_id, args.root)
+    load_behavior1k_dataset_with_multiple_modalities(args.repo_id, args.root)
+    stream_behavior1k_dataset(args.repo_id, args.root)
@@ -32,8 +32,7 @@ import torch
 from huggingface_hub import HfApi

 import lerobot
-from lerobot.datasets.dataset_metadata import LeRobotDatasetMetadata
-from lerobot.datasets.lerobot_dataset import LeRobotDataset
+from lerobot.datasets.lerobot_dataset import LeRobotDataset, LeRobotDatasetMetadata


 def main():
@@ -1,490 +0,0 @@
-#!/usr/bin/env python
-
-# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-"""
-SLURM-distributed SARM RA-BC annotation pipeline.
-
-Computes SARM progress values for all frames in a dataset, distributed across
-SLURM workers, then merges the shards into a single sarm_progress.parquet.
-
-Two subcommands, each a separate SLURM submission:
-
-  compute    – N workers, each computes progress for a subset of episodes
-  aggregate  – 1 worker, merges N shards into sarm_progress.parquet, pushes to hub
-
-Usage:
-    python slurm_compute_rabc.py compute \\
-        --repo-id user/dataset --reward-model-path user/sarm_model \\
-        --stride 10 --device cpu --workers 50 --partition cpu
-
-    python slurm_compute_rabc.py aggregate \\
-        --repo-id user/dataset --reward-model-path user/sarm_model \\
-        --partition cpu --push-to-hub
-"""
-
-import argparse
-from pathlib import Path
-
-from datatrove.executor import LocalPipelineExecutor
-from datatrove.executor.slurm import SlurmPipelineExecutor
-from datatrove.pipeline.base import PipelineStep
-
-
-class ComputeProgressShards(PipelineStep):
-    """Each worker computes SARM progress for its assigned episodes."""
-
-    def __init__(
-        self, repo_id, reward_model_path, stride=1, head_mode="sparse", device="cpu", shard_dir="rabc_shards"
-    ):
-        super().__init__()
-        if stride < 1:
-            raise ValueError(f"stride must be >= 1, got {stride}")
-        self.repo_id = repo_id
-        self.reward_model_path = reward_model_path
-        self.stride = stride
-        self.head_mode = head_mode
-        self.device = device
-        self.shard_dir = shard_dir
-
-    def run(self, data=None, rank: int = 0, world_size: int = 1):
-        import logging
-        from pathlib import Path
-
-        import numpy as np
-        import pyarrow as pa
-        import pyarrow.parquet as pq
-        import torch
-        from tqdm import tqdm
-
-        from lerobot.policies.sarm.compute_rabc_weights import (
-            generate_all_frame_indices,
-            interpolate_progress,
-            load_sarm_resources,
-        )
-        from lerobot.utils.utils import init_logging
-
-        init_logging()
-
-        dataset, reward_model, preprocess = load_sarm_resources(
-            self.repo_id,
-            self.reward_model_path,
-            self.device,
-        )
-
-        if hasattr(preprocess, "eval"):
-            preprocess.eval()
-        for step in preprocess.steps:
-            if hasattr(step, "eval"):
-                step.eval()
-
-        image_key = reward_model.config.image_key
-        state_key = reward_model.config.state_key
-        frame_gap = reward_model.config.frame_gap
-        center_idx = reward_model.config.n_obs_steps // 2
-
-        dual_mode = reward_model.config.uses_dual_heads
-        compute_sparse = self.head_mode in ("sparse", "both") or not dual_mode
-        compute_dense = self.head_mode in ("dense", "both") and dual_mode
-
-        my_episodes = list(range(dataset.num_episodes))[rank::world_size]
-        if not my_episodes:
-            logging.info(f"Rank {rank}: no episodes assigned")
-            return
-        logging.info(f"Rank {rank}: {len(my_episodes)} / {dataset.num_episodes} episodes")
-
-        all_rows = []
-
-        for ep_idx in tqdm(my_episodes, desc=f"Rank {rank}"):
-            ep = dataset.meta.episodes[ep_idx]
-            ep_start, ep_end = ep["dataset_from_index"], ep["dataset_to_index"]
-            task = dataset[ep_start].get("task", "perform the task")
-
-            all_ep_indices = generate_all_frame_indices(ep_start, ep_end, frame_gap)
-            if self.stride > 1:
-                compute_indices = [i for i in all_ep_indices if (i - ep_start) % self.stride == 0]
-                if (ep_end - 1) not in compute_indices:
-                    compute_indices.append(ep_end - 1)
-                compute_indices = sorted(set(compute_indices))
-            else:
-                compute_indices = all_ep_indices
-
-            frame_results = {}
-            for qi in tqdm(compute_indices, desc=f"  Ep {ep_idx}", leave=False):
-                try:
-                    sample = dataset[qi]
-                    batch = {
-                        image_key: sample[image_key],
-                        "task": task,
-                        "index": qi,
-                        "episode_index": ep_idx,
-                    }
-                    if state_key in sample:
-                        batch[state_key] = sample[state_key]
-
-                    with torch.no_grad():
-                        processed = preprocess(batch)
-                        vf = processed["video_features"].to(self.device)
-                        tf = processed["text_features"].to(self.device)
-                        sf = processed.get("state_features")
-                        if sf is not None:
-                            sf = sf.to(self.device)
-                        lengths = processed.get("lengths")
-
-                        sparse_val = dense_val = np.nan
-                        if compute_sparse:
-                            r = reward_model.calculate_rewards(
-                                text_embeddings=tf,
-                                video_embeddings=vf,
-                                state_features=sf,
-                                lengths=lengths,
-                                return_all_frames=True,
-                                head_mode="sparse",
-                            )
-                            sparse_val = float(r[0, center_idx] if r.ndim == 2 else r[center_idx])
-                        if compute_dense:
-                            r = reward_model.calculate_rewards(
-                                text_embeddings=tf,
-                                video_embeddings=vf,
-                                state_features=sf,
-                                lengths=lengths,
-                                return_all_frames=True,
-                                head_mode="dense",
-                            )
-                            dense_val = float(r[0, center_idx] if r.ndim == 2 else r[center_idx])
-
-                        frame_results[qi] = (sparse_val, dense_val)
-                except Exception as e:
-                    logging.warning(f"Failed frame {qi}: {e}")
-
-            if not frame_results:
-                logging.warning(f"Episode {ep_idx}: all frames failed, skipping")
-                continue
-
-            # Interpolate to all frames in this episode
-            computed_idx = np.array(sorted(frame_results.keys()))
-            all_frame_arr = np.arange(ep_start, ep_end)
-
-            sparse_vals = np.array([frame_results[i][0] for i in computed_idx]) if compute_sparse else None
-            dense_vals = np.array([frame_results[i][1] for i in computed_idx]) if compute_dense else None
-
-            if self.stride > 1 and len(computed_idx) > 1:
-                if compute_sparse:
-                    sparse_vals = interpolate_progress(computed_idx, sparse_vals, all_frame_arr)
-                if compute_dense:
-                    dense_vals = interpolate_progress(computed_idx, dense_vals, all_frame_arr)
-                output_frames = all_frame_arr
-            else:
-                # Use only successfully computed frames to avoid indexing mismatch on failures
-                output_frames = computed_idx
-
-            for i, fi in enumerate(output_frames):
-                row = {"index": int(fi), "episode_index": ep_idx, "frame_index": int(fi - ep_start)}
-                if compute_sparse:
-                    row["progress_sparse"] = float(sparse_vals[i])
-                if compute_dense:
-                    row["progress_dense"] = float(dense_vals[i])
-                all_rows.append(row)
-
-        if all_rows:
-            import pandas as pd
-
-            df = pd.DataFrame(all_rows).sort_values("index").reset_index(drop=True)
-            table = pa.Table.from_pandas(df, preserve_index=False)
-            table = table.replace_schema_metadata({b"reward_model_path": self.reward_model_path.encode()})
-            shard_dir = Path(self.shard_dir)
-            shard_dir.mkdir(parents=True, exist_ok=True)
-            out = shard_dir / f"shard_{rank:05d}.parquet"
-            pq.write_table(table, out)
-            logging.info(f"Rank {rank}: saved {len(df)} rows to {out}")
-
-
-class AggregateProgress(PipelineStep):
-    """Merge all shard parquets into final sarm_progress.parquet."""
-
-    def __init__(self, repo_id, reward_model_path, shard_dir="rabc_shards", push_to_hub=False):
-        super().__init__()
-        self.repo_id = repo_id
-        self.reward_model_path = reward_model_path
-        self.shard_dir = shard_dir
-        self.push_to_hub = push_to_hub
-
-    def run(self, data=None, rank: int = 0, world_size: int = 1):
-        import datetime
-        import logging
-        import os
-        from pathlib import Path
-
-        import pandas as pd
-        import pyarrow as pa
-        import pyarrow.parquet as pq
-
-        from lerobot.datasets.lerobot_dataset import LeRobotDataset
-        from lerobot.utils.utils import init_logging
-
-        init_logging()
-        if rank != 0:
-            return
-
-        shard_dir = Path(self.shard_dir)
-        shards = sorted(shard_dir.glob("shard_*.parquet"))
-        if not shards:
-            raise FileNotFoundError(f"No shards found in {shard_dir}")
-
-        # Log shard modification time range to help detect stale files
-        mtimes = [os.path.getmtime(s) for s in shards]
-        oldest = datetime.datetime.fromtimestamp(min(mtimes)).isoformat(timespec="seconds")
-        newest = datetime.datetime.fromtimestamp(max(mtimes)).isoformat(timespec="seconds")
-        logging.info(f"Aggregating {len(shards)} shards (oldest: {oldest}, newest: {newest})")
-
-        df = pd.concat([pd.read_parquet(s) for s in shards], ignore_index=True)
-        df = df.sort_values("index").reset_index(drop=True)
-
-        table = pa.Table.from_pandas(df, preserve_index=False)
-        table = table.replace_schema_metadata({b"reward_model_path": self.reward_model_path.encode()})
-
-        temp_ds = LeRobotDataset(self.repo_id, download_videos=False)
-        out_path = Path(temp_ds.root) / "sarm_progress.parquet"
-        out_path.parent.mkdir(parents=True, exist_ok=True)
-        pq.write_table(table, out_path)
-        logging.info(f"Saved {len(df)} rows to {out_path}")
-
-        for col in ["progress_sparse", "progress_dense"]:
-            if col in df.columns:
-                v = df[col].dropna()
-                logging.info(
-                    f"{col}: mean={v.mean():.4f} std={v.std():.4f} min={v.min():.4f} max={v.max():.4f}"
-                )
-
-        if self.push_to_hub:
-            from huggingface_hub import HfApi
-
-            api = HfApi()
-            hub_path = "sarm_progress.parquet"
-            logging.info(f"Uploading to {self.repo_id}/{hub_path}")
-            api.upload_file(
-                path_or_fileobj=str(out_path),
-                path_in_repo=hub_path,
-                repo_id=self.repo_id,
-                repo_type="dataset",
-            )
-            logging.info(f"Uploaded: https://huggingface.co/datasets/{self.repo_id}/blob/main/{hub_path}")
-
-
-def make_compute_executor(
-    repo_id,
-    reward_model_path,
-    stride,
-    head_mode,
-    device,
-    shard_dir,
-    logs_dir,
-    job_name,
-    slurm,
-    workers,
-    partition,
-    cpus_per_task,
-    mem_per_cpu,
-):
-    kwargs = {
-        "pipeline": [
-            ComputeProgressShards(repo_id, reward_model_path, stride, head_mode, device, str(shard_dir)),
-        ],
-        "logging_dir": str(logs_dir / job_name),
-    }
-
-    if slurm:
-        kwargs.update(
-            {
-                "job_name": job_name,
-                "tasks": workers,
-                "workers": workers,
-                "time": "24:00:00",
-                "partition": partition,
-                "cpus_per_task": cpus_per_task,
-                "sbatch_args": {"mem-per-cpu": mem_per_cpu},
-            }
-        )
-        return SlurmPipelineExecutor(**kwargs)
-
-    kwargs.update({"tasks": workers, "workers": 1})
-    return LocalPipelineExecutor(**kwargs)
-
-
-def make_aggregate_executor(
-    repo_id,
-    reward_model_path,
-    shard_dir,
-    logs_dir,
-    job_name,
-    slurm,
-    partition,
-    cpus_per_task,
-    mem_per_cpu,
-    push_to_hub,
-):
-    kwargs = {
-        "pipeline": [
-            AggregateProgress(repo_id, reward_model_path, str(shard_dir), push_to_hub),
-        ],
-        "logging_dir": str(logs_dir / job_name),
-    }
-
-    if slurm:
-        kwargs.update(
-            {
-                "job_name": job_name,
-                "tasks": 1,
-                "workers": 1,
-                "time": "02:00:00",
-                "partition": partition,
-                "cpus_per_task": cpus_per_task,
-                "sbatch_args": {"mem-per-cpu": mem_per_cpu},
-            }
-        )
-        return SlurmPipelineExecutor(**kwargs)
-
-    kwargs.update({"tasks": 1, "workers": 1})
-    return LocalPipelineExecutor(**kwargs)
-
-
-def _add_shared_args(p):
-    p.add_argument(
-        "--repo-id",
-        type=str,
-        required=True,
-        help="Hugging Face repository identifier, e.g. 'user/dataset'.",
-    )
-    p.add_argument(
-        "--shard-dir",
-        type=Path,
-        default=Path("rabc_shards"),
-        help="Directory to read/write per-rank parquet shards.",
-    )
-    p.add_argument(
-        "--logs-dir",
-        type=Path,
-        default=Path("logs"),
-        help="Directory for datatrove logs.",
-    )
-    p.add_argument(
-        "--job-name",
-        type=str,
-        default=None,
-        help="SLURM job name (defaults to rabc_<subcommand>).",
-    )
-    p.add_argument(
-        "--slurm",
-        type=int,
-        default=1,
-        help="1 = submit via SLURM; 0 = run locally (useful for debugging).",
-    )
-    p.add_argument(
-        "--partition",
-        type=str,
-        default=None,
-        help="SLURM partition to submit to.",
-    )
-    p.add_argument(
-        "--cpus-per-task",
-        type=int,
-        default=4,
-        help="Number of CPUs per SLURM task.",
-    )
-    p.add_argument(
-        "--mem-per-cpu",
-        type=str,
-        default="4G",
-        help="Memory per CPU, e.g. '4G' or '1950M'.",
-    )
-
-
-def main():
-    parser = argparse.ArgumentParser(
-        description="SLURM-distributed SARM RA-BC annotation pipeline",
-        formatter_class=argparse.RawDescriptionHelpFormatter,
-    )
-    sub = parser.add_subparsers(dest="command", required=True)
-
-    # compute subcommand
-    cp = sub.add_parser(
-        "compute",
-        help="Distribute progress computation across SLURM workers.",
-    )
-    _add_shared_args(cp)
-    cp.add_argument(
-        "--reward-model-path",
-        type=str,
-        required=True,
-        help="Path or HF repo id of the SARM reward model.",
-    )
-    cp.add_argument(
-        "--stride",
-        type=int,
-        default=1,
-        help="Compute every Nth frame; intermediate frames are interpolated (must be >= 1).",
-    )
-    cp.add_argument(
-        "--head-mode",
-        type=str,
-        default="sparse",
-        choices=["sparse", "dense", "both"],
-        help="Which reward head(s) to compute.",
-    )
-    cp.add_argument(
-        "--device",
-        type=str,
-        default="cpu",
-        help="Device for reward model inference, e.g. 'cpu' or 'cuda'.",
-    )
-    cp.add_argument(
-        "--workers",
-        type=int,
-        default=50,
-        help="Number of parallel SLURM tasks (one shard per worker).",
-    )
-
-    # aggregate subcommand
-    ap = sub.add_parser(
-        "aggregate",
-        help="Merge per-rank shards into a single sarm_progress.parquet.",
-    )
-    _add_shared_args(ap)
-    ap.add_argument(
-        "--reward-model-path",
-        type=str,
-        required=True,
-        help="Path or HF repo id of the SARM reward model (stored in parquet metadata).",
-    )
-    ap.add_argument(
-        "--push-to-hub",
-        action="store_true",
-        help="Upload sarm_progress.parquet to the Hugging Face Hub after aggregation.",
-    )
-
-    args = parser.parse_args()
-    job_name = args.job_name or f"rabc_{args.command}"
-    kwargs = vars(args)
-    kwargs["slurm"] = kwargs.pop("slurm") == 1
-    kwargs["job_name"] = job_name
-    command = kwargs.pop("command")
-
-    executor = make_compute_executor(**kwargs) if command == "compute" else make_aggregate_executor(**kwargs)
-
-    executor.run()
-
-
-if __name__ == "__main__":
-    main()
@@ -1,717 +0,0 @@
-"""
-Action consistency analysis for imitation learning datasets.
-
-Two parallel analyses per dataset:
-  1. State-based: KNN in joint-state space → action chunk variance
-  2. Image-based: KNN in SigLIP embedding space → action chunk variance
-
-Comparing them reveals whether visual similarity and proprioceptive similarity
-agree on where the data is inconsistent — and images are what the policy
-primarily sees.
-"""
-
-import json
-from pathlib import Path
-
-import av
-import matplotlib.pyplot as plt
-import numpy as np
-import pandas as pd
-import torch
-from huggingface_hub import snapshot_download
-from matplotlib.colors import LinearSegmentedColormap
-from PIL import Image
-from scipy.spatial import cKDTree
-from transformers import AutoImageProcessor, AutoModel
-
-DATASETS = [
-    {"repo_id": "lerobot-data-collection/level2_final_quality3", "label": "HQ curated"},
-    {"repo_id": "lerobot-data-collection/level12_rac_2_2026-02-08_1", "label": "Full collection"},
-]
-OUTPUT_DIR = Path(__file__).resolve().parent / "outputs"
-OUTPUT_DIR.mkdir(exist_ok=True)
-
-MAX_FRAMES = 100_000
-K_NEIGHBORS = 50
-ACTION_CHUNK_SIZE = 30
-CAMERA_KEY = "observation.images.base"
-ENCODER_MODEL = "google/siglip-base-patch16-224"
-ENCODE_BATCH_SIZE = 512
-SEED = 42
-DPI = 150
-
-CONSISTENCY_CMAP = LinearSegmentedColormap.from_list(
-    "consistency", ["#0a2e0a", "#1a8e1a", "#88cc22", "#ffaa22", "#ff2222"]
-)
-
-# FK chains from OpenArm bimanual URDF (same as workspace_density.py).
-LEFT_CHAIN = [
-    ((-np.pi / 2, 0, 0), (0, 0.031, 0.698), None),
-    ((0, 0, 0), (0, 0, 0.0625), (0, 0, 1)),
-    ((-np.pi / 2, 0, 0), (-0.0301, 0, 0.06), (-1, 0, 0)),
-    ((0, 0, 0), (0.0301, 0, 0.06625), (0, 0, 1)),
-    ((0, 0, 0), (0, 0.0315, 0.15375), (0, 1, 0)),
-    ((0, 0, 0), (0, -0.0315, 0.0955), (0, 0, 1)),
-    ((0, 0, 0), (0.0375, 0, 0.1205), (1, 0, 0)),
-    ((0, 0, 0), (-0.0375, 0, 0), (0, -1, 0)),
-    ((0, 0, 0), (0, 0, 0.1001), None),
-    ((0, 0, 0), (0, 0, 0.08), None),
-]
-RIGHT_CHAIN = [
-    ((np.pi / 2, 0, 0), (0, -0.031, 0.698), None),
-    ((0, 0, 0), (0, 0, 0.0625), (0, 0, 1)),
-    ((np.pi / 2, 0, 0), (-0.0301, 0, 0.06), (-1, 0, 0)),
-    ((0, 0, 0), (0.0301, 0, 0.06625), (0, 0, 1)),
-    ((0, 0, 0), (0, 0.0315, 0.15375), (0, 1, 0)),
-    ((0, 0, 0), (0, -0.0315, 0.0955), (0, 0, 1)),
-    ((0, 0, 0), (0.0375, 0, 0.1205), (1, 0, 0)),
-    ((0, 0, 0), (-0.0375, 0, 0), (0, 1, 0)),
-    ((0, 0, 0), (0, 0, 0.1001), None),
-    ((0, 0, 0), (0, 0, 0.08), None),
-]
-
-
-# ── FK math ─────────────────────────────────────────────
-
-
-def _rot_x(a: float) -> np.ndarray:
-    c, s = np.cos(a), np.sin(a)
-    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])
-
-
-def _rot_y(a: float) -> np.ndarray:
-    c, s = np.cos(a), np.sin(a)
-    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
-
-
-def _rot_z(a: float) -> np.ndarray:
-    c, s = np.cos(a), np.sin(a)
-    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])
-
-
-def _tf(rpy: tuple, xyz: tuple) -> np.ndarray:
-    r, p, y = rpy
-    mat = np.eye(4)
-    mat[:3, :3] = _rot_z(y) @ _rot_y(p) @ _rot_x(r)
-    mat[:3, 3] = xyz
-    return mat
-
-
-def _batch_axis_rot(axis: tuple, angles: np.ndarray) -> np.ndarray:
-    n = len(angles)
-    ax = np.asarray(axis, dtype=np.float64)
-    ax = ax / np.linalg.norm(ax)
-    x, y, z = ax
-    c = np.cos(angles)
-    s = np.sin(angles)
-    t = 1 - c
-    rot = np.zeros((n, 4, 4))
-    rot[:, 0, 0] = t * x * x + c
-    rot[:, 0, 1] = t * x * y - s * z
-    rot[:, 0, 2] = t * x * z + s * y
-    rot[:, 1, 0] = t * x * y + s * z
-    rot[:, 1, 1] = t * y * y + c
-    rot[:, 1, 2] = t * y * z - s * x
-    rot[:, 2, 0] = t * x * z - s * y
-    rot[:, 2, 1] = t * y * z + s * x
-    rot[:, 2, 2] = t * z * z + c
-    rot[:, 3, 3] = 1.0
-    return rot
-
-
-def batch_fk(chain: list, joint_angles: np.ndarray) -> np.ndarray:
-    n = joint_angles.shape[0]
-    tf_batch = np.tile(np.eye(4), (n, 1, 1))
-    qi = 0
-    for rpy, xyz, axis in chain:
-        tf_batch = tf_batch @ _tf(rpy, xyz)
-        if axis is not None:
-            rot = _batch_axis_rot(axis, joint_angles[:, qi])
-            tf_batch = np.einsum("nij,njk->nik", tf_batch, rot)
-            qi += 1
-    return tf_batch[:, :3, 3]
-
-
-# ── Data helpers ────────────────────────────────────────
-
-
-def _flatten_names(obj: object) -> list[str]:
-    if isinstance(obj, dict):
-        out: list[str] = []
-        for v in obj.values():
-            out.extend(_flatten_names(v))
-        return out
-    if isinstance(obj, (list, tuple)):
-        out = []
-        for item in obj:
-            if isinstance(item, (list, tuple, dict)):
-                out.extend(_flatten_names(item))
-            else:
-                out.append(str(item))
-        return out
-    return [str(obj)]
-
-
-def _detect_and_convert(vals: np.ndarray) -> np.ndarray:
-    mx = np.max(np.abs(vals))
-    if mx > 360:
-        print(f"    Unit detection: servo ticks (max={mx:.0f})")
-        return (vals - 2048) / 2048 * np.pi
-    if mx > 6.3:
-        print(f"    Unit detection: degrees (max={mx:.1f})")
-        return np.deg2rad(vals)
-    print(f"    Unit detection: radians (max={mx:.3f})")
-    return vals.astype(np.float64)
-
-
-def _find_joint_indices(features: dict, state_col: str, n_dim: int) -> tuple[list[int], list[int]]:
-    feat = features.get("observation.state", features.get(state_col, {}))
-    names = _flatten_names(feat.get("names", []))
-    left_idx: list[int] = []
-    right_idx: list[int] = []
-    if names and len(names) == n_dim:
-        names_l = [n.lower() for n in names]
-        print(f"  Feature names: {names[:4]}…{names[-4:]}")
-        for j in range(1, 8):
-            for i, nm in enumerate(names_l):
-                if f"left_joint_{j}" in nm and i not in left_idx:
-                    left_idx.append(i)
-                    break
-            for i, nm in enumerate(names_l):
-                if f"right_joint_{j}" in nm and i not in right_idx:
-                    right_idx.append(i)
-                    break
-    if len(left_idx) == 7 and len(right_idx) == 7:
-        print(f"  Matched by name: left={left_idx} right={right_idx}")
-        return left_idx, right_idx
-    if n_dim >= 16:
-        print("  Falling back to positional: [0:7]=left, [8:15]=right")
-        return list(range(7)), list(range(8, 15))
-    if n_dim >= 14:
-        print("  Falling back to positional: [0:7]=left, [7:14]=right")
-        return list(range(7)), list(range(7, 14))
-    raise RuntimeError(f"State dim {n_dim} too small for bimanual 7-DOF robot")
-
-
-def download_data(repo_id: str, camera_key: str) -> Path:
-    print(f"  Downloading {repo_id} (parquet + {camera_key} videos) …")
-    return Path(
-        snapshot_download(
-            repo_id=repo_id,
-            repo_type="dataset",
-            allow_patterns=[
-                "meta/**",
-                "data/**",
-                f"videos/{camera_key}/**",
-            ],
-        )
-    )
-
-
-# ── Data loading ────────────────────────────────────────
-
-
-def _build_action_chunks(
-    actions: np.ndarray, episode_ids: np.ndarray, chunk_size: int
-) -> tuple[np.ndarray, np.ndarray]:
-    """
-    For each frame, concatenate the next chunk_size actions from the same episode.
-    Returns (action_chunks, valid_mask).
-    """
-    n = len(actions)
-    act_dim = actions.shape[1]
-    chunks = np.zeros((n, chunk_size * act_dim), dtype=np.float64)
-    valid = np.zeros(n, dtype=bool)
-
-    for i in range(n):
-        end = i + chunk_size
-        if end > n:
-            continue
-        if episode_ids[i] != episode_ids[end - 1]:
-            continue
-        chunks[i] = actions[i:end].ravel()
-        valid[i] = True
-
-    return chunks, valid
-
-
-def load_state_action_data(local: Path, max_frames: int, chunk_size: int, rng: np.random.Generator) -> dict:
-    """
-    Load observation.state and action, build action chunks, subsample, normalize.
-    Also returns the original row indices (`chosen_idx`) for video frame mapping.
-    """
-    info = json.loads((local / "meta" / "info.json").read_text())
-    features = info.get("features", {})
-
-    dfs = [pd.read_parquet(pq) for pq in sorted((local / "data").glob("**/*.parquet"))]
-    df = pd.concat(dfs, ignore_index=True)
-    n_total = len(df)
-    print(f"  Total frames: {n_total:,}")
-
-    state_col = next((c for c in df.columns if "observation.state" in c), None)
-    action_col = next((c for c in df.columns if c == "action"), None)
-    if state_col is None:
-        raise RuntimeError(f"No observation.state column. Available: {list(df.columns)}")
-    if action_col is None:
-        raise RuntimeError(f"No action column. Available: {list(df.columns)}")
-
-    ep_col = next((c for c in df.columns if c == "episode_index"), None)
-    if ep_col is None:
-        raise RuntimeError(f"No episode_index column. Available: {list(df.columns)}")
-
-    state_all = np.stack(df[state_col].values).astype(np.float64)
-    action_all = np.stack(df[action_col].values).astype(np.float64)
-    episode_all = df[ep_col].values.astype(np.int64)
-
-    n_dim = state_all.shape[1]
-    act_dim = action_all.shape[1]
-    print(f"  State dim: {n_dim}  Action dim: {act_dim}  Chunk size: {chunk_size}")
-    print(f"  Action chunk dim: {chunk_size * act_dim}")
-
-    left_idx, right_idx = _find_joint_indices(features, state_col, n_dim)
-
-    print("  Building action chunks …")
-    action_chunks, valid = _build_action_chunks(action_all, episode_all, chunk_size)
-    valid_idx = np.where(valid)[0]
-    print(f"  Valid frames (with full action chunk): {len(valid_idx):,} / {n_total:,}")
-
-    if len(valid_idx) > max_frames:
-        chosen = np.sort(rng.choice(valid_idx, max_frames, replace=False))
-    else:
-        chosen = valid_idx
-    print(f"  Using {len(chosen):,} frames")
-
-    state_raw = state_all[chosen]
-    action_raw = action_chunks[chosen]
-    episode_ids = episode_all[chosen]
-
-    state_mean = state_raw.mean(axis=0)
-    state_std = state_raw.std(axis=0)
-    state_std[state_std < 1e-8] = 1.0
-    state_norm = (state_raw - state_mean) / state_std
-
-    action_mean = action_raw.mean(axis=0)
-    action_std = action_raw.std(axis=0)
-    action_std[action_std < 1e-8] = 1.0
-    action_norm = (action_raw - action_mean) / action_std
-
-    return {
-        "state_raw": state_raw,
-        "state_norm": state_norm,
-        "action_raw": action_raw,
-        "action_norm": action_norm,
-        "episode_ids": episode_ids,
-        "episode_all": episode_all,
-        "left_joint_idx": left_idx,
-        "right_joint_idx": right_idx,
-        "n_total": n_total,
-        "chosen_idx": chosen,
-        "df": df,
-    }
-
-
-# ── Video → frame extraction ──────────────────────────────
-
-
-def build_video_lookup(local: Path, camera_key: str) -> dict:
-    """
-    Build a mapping from episode_index → {video_path, fps, from_ts}.
-    """
-    info = json.loads((local / "meta" / "info.json").read_text())
-    fps = info["fps"]
-    video_template = info.get(
-        "video_path",
-        "videos/{video_key}/chunk-{chunk_index:03d}/file-{file_index:03d}.mp4",
-    )
-
-    ep_rows = []
-    for pq in sorted((local / "meta" / "episodes").glob("**/*.parquet")):
-        ep_rows.append(pd.read_parquet(pq))
-    ep_df = pd.concat(ep_rows, ignore_index=True)
-
-    chunk_col = f"videos/{camera_key}/chunk_index"
-    file_col = f"videos/{camera_key}/file_index"
-    ts_from = f"videos/{camera_key}/from_timestamp"
-    if chunk_col not in ep_df.columns:
-        chunk_col = f"{camera_key}/chunk_index"
-        file_col = f"{camera_key}/file_index"
-        ts_from = f"{camera_key}/from_timestamp"
-
-    lookup: dict[int, dict] = {}
-    for _, row in ep_df.iterrows():
-        ci = int(row[chunk_col])
-        fi = int(row[file_col])
-        video_rel = video_template.format(video_key=camera_key, chunk_index=ci, file_index=fi)
-        lookup[int(row["episode_index"])] = {
-            "video_path": local / video_rel,
-            "from_ts": float(row[ts_from]),
-            "fps": fps,
-        }
-    return lookup
-
-
-def _decode_video_frames(video_path: str) -> list[np.ndarray]:
-    """Decode all frames from a video file using PyAV. Returns list of RGB arrays."""
-    container = av.open(video_path)
-    stream = container.streams.video[0]
-    stream.thread_type = "AUTO"
-    decoded = []
-    for frame in container.decode(stream):
-        decoded.append(frame.to_ndarray(format="rgb24"))
-    container.close()
-    return decoded
-
-
-def extract_frames(
-    chosen_idx: np.ndarray,
-    episode_all: np.ndarray,
-    video_lookup: dict,
-) -> list[np.ndarray | None]:
-    """
-    Extract RGB frames for each chosen global index using PyAV.
-    Returns list of (H, W, 3) RGB arrays (or None on failure).
-    """
-    unique_eps = np.unique(episode_all)
-    ep_start: dict[int, int] = {}
-    for ep in unique_eps:
-        ep_start[int(ep)] = int(np.where(episode_all == ep)[0][0])
-
-    # Build jobs: (output_index, video_path, local_frame_number)
-    jobs: list[tuple[int, str, int]] = []
-    for out_i, global_i in enumerate(chosen_idx):
-        ep = int(episode_all[global_i])
-        info = video_lookup.get(ep)
-        if info is None:
-            continue
-        local_frame = int(global_i - ep_start[ep]) + round(info["from_ts"] * info["fps"])
-        jobs.append((out_i, str(info["video_path"]), local_frame))
-
-    # Group by video file, decode each video once
-    from collections import defaultdict
-
-    video_jobs: dict[str, list[tuple[int, int]]] = defaultdict(list)
-    for out_i, vpath, local_frame in jobs:
-        video_jobs[vpath].append((out_i, local_frame))
-
-    frames: list[np.ndarray | None] = [None] * len(chosen_idx)
-    extracted = 0
-    n_videos = len(video_jobs)
-    for vi, (vpath, frame_requests) in enumerate(video_jobs.items()):
-        if not Path(vpath).exists():
-            continue
-        try:
-            decoded = _decode_video_frames(vpath)
-        except Exception as exc:
-            print(f"    Warning: failed to decode {Path(vpath).name}: {exc}")
-            continue
-        for out_i, local_frame in frame_requests:
-            if 0 <= local_frame < len(decoded):
-                frames[out_i] = decoded[local_frame]
-                extracted += 1
-        if (vi + 1) % 50 == 0 or (vi + 1) == n_videos:
-            print(f"    Decoded {vi + 1}/{n_videos} videos ({extracted:,} frames so far)")
-        del decoded
-
-    print(f"  Extracted {extracted:,} / {len(chosen_idx):,} frames from video")
-    return frames
-
-
-# ── SigLIP encoding ─────────────────────────────────────
-
-
-def encode_frames_siglip(
-    frames: list[np.ndarray | None],
-    model_name: str,
-    batch_size: int,
-    device: torch.device,
-) -> np.ndarray:
-    """
-    Encode RGB frames through SigLIP vision encoder.
-    Returns (N, embed_dim) float32 array. Frames that are None get a zero vector.
-    """
-    print(f"  Loading SigLIP model: {model_name} …")
-    processor = AutoImageProcessor.from_pretrained(model_name)
-    model = AutoModel.from_pretrained(model_name).to(device).eval()
-    embed_dim = model.config.vision_config.hidden_size
-
-    n = len(frames)
-    embeddings = np.zeros((n, embed_dim), dtype=np.float32)
-
-    valid_indices = [i for i, f in enumerate(frames) if f is not None]
-    print(f"  Encoding {len(valid_indices):,} valid frames in batches of {batch_size} …")
-
-    for batch_start in range(0, len(valid_indices), batch_size):
-        batch_idx = valid_indices[batch_start : batch_start + batch_size]
-        pil_images = [Image.fromarray(frames[i]) for i in batch_idx]
-
-        inputs = processor(images=pil_images, return_tensors="pt").to(device)
-        with torch.no_grad():
-            image_features = model.get_image_features(**inputs)
-        image_features = torch.nn.functional.normalize(image_features, dim=-1)
-        embeddings[batch_idx] = image_features.cpu().numpy()
-
-        done = min(batch_start + batch_size, len(valid_indices))
-        if done % (batch_size * 10) == 0 or done == len(valid_indices):
-            print(f"    {done:,} / {len(valid_indices):,} encoded")
-
-    del model, processor
-    torch.cuda.empty_cache()
-    return embeddings
-
-
-# ── KNN consistency ─────────────────────────────────────
-
-
-def compute_consistency(
-    features: np.ndarray,
-    action_norm: np.ndarray,
-    episode_ids: np.ndarray,
-    k: int,
-    label: str = "",
-) -> np.ndarray:
-    """
-    For each frame, find K nearest neighbors in feature space from other episodes.
-    Return per-frame action variance (mean across action dims).
-    """
-    n = len(features)
-    print(f"  Building KD-tree on {n:,} vectors ({label}) …")
-    tree = cKDTree(features)
-
-    k_query = min(k * 3, n - 1)
-    print(f"  Querying {k_query} neighbors per frame …")
-    _dists, indices = tree.query(features, k=k_query + 1)
-    indices = indices[:, 1:]
-
-    print(f"  Computing cross-episode action variance ({label}) …")
-    variance = np.zeros(n)
-    for i in range(n):
-        ep_i = episode_ids[i]
-        neighbors = indices[i]
-        cross_ep = neighbors[episode_ids[neighbors] != ep_i][:k]
-        if len(cross_ep) < 2:
-            variance[i] = 0.0
-            continue
-        neighbor_actions = action_norm[cross_ep]
-        variance[i] = np.mean(np.var(neighbor_actions, axis=0))
-
-    return variance
-
-
-# ── Visualization ───────────────────────────────────────
-
-
-def _style_ax(ax: plt.Axes) -> None:
-    ax.set_facecolor("#0d1117")
-    ax.tick_params(colors="#555", labelsize=8)
-    for spine in ax.spines.values():
-        spine.set_color("#333")
-
-
-def _plot_histogram(ax: plt.Axes, variance: np.ndarray, title: str, color: str) -> None:
-    _style_ax(ax)
-    median_var = np.median(variance)
-    mean_var = np.mean(variance)
-    nonzero = variance[variance > 0]
-    if len(nonzero) > 0:
-        bins = np.logspace(np.log10(nonzero.min().clip(1e-6)), np.log10(nonzero.max()), 60)
-        ax.hist(nonzero, bins=bins, color=color, alpha=0.8, edgecolor="#222")
-    ax.set_xscale("log")
-    ax.axvline(median_var, color="#ff6600", linewidth=2, label=f"median={median_var:.3f}")
-    ax.axvline(mean_var, color="#ff2222", linewidth=2, linestyle="--", label=f"mean={mean_var:.3f}")
-    ax.set_xlabel("Action variance (log scale)", color="#888", fontsize=10)
-    ax.set_ylabel("Frame count", color="#888", fontsize=10)
-    ax.set_title(title, color="white", fontsize=11, pad=10)
-    ax.legend(fontsize=8, facecolor="#1a1a2e", edgecolor="#333", labelcolor="white")
-
-
-def _plot_episode_curves(
-    ax: plt.Axes,
-    var_state: np.ndarray,
-    var_image: np.ndarray,
-    episode_ids: np.ndarray,
-    title: str,
-) -> None:
-    _style_ax(ax)
-    unique_eps = np.unique(episode_ids)
-
-    ep_means_s = np.array([var_state[episode_ids == ep].mean() for ep in unique_eps])
-    ep_means_i = np.array([var_image[episode_ids == ep].mean() for ep in unique_eps])
-
-    sorted_s = np.sort(ep_means_s)[::-1]
-    sorted_i = np.sort(ep_means_i)[::-1]
-    ep_x = np.arange(len(unique_eps))
-
-    ax.fill_between(ep_x, sorted_s, alpha=0.2, color="#4363d8")
-    ax.plot(ep_x, sorted_s, color="#4363d8", linewidth=1.2, label=f"State (med={np.median(ep_means_s):.3f})")
-    ax.fill_between(ep_x, sorted_i, alpha=0.2, color="#e6194b")
-    ax.plot(ep_x, sorted_i, color="#e6194b", linewidth=1.2, label=f"Image (med={np.median(ep_means_i):.3f})")
-
-    ax.set_xlabel("Episode rank (worst → best)", color="#888", fontsize=10)
-    ax.set_ylabel("Mean action variance", color="#888", fontsize=10)
-    ax.set_title(title, color="white", fontsize=11, pad=10)
-    ax.legend(fontsize=8, facecolor="#1a1a2e", edgecolor="#333", labelcolor="white")
-
-
-def _plot_heatmap(
-    ax: plt.Axes, fig: plt.Figure, tcp_xz: np.ndarray, variance: np.ndarray, title: str
-) -> None:
-    _style_ax(ax)
-    order = np.argsort(variance)
-    pts = tcp_xz[order]
-    var_sorted = variance[order]
-    vmin = np.percentile(variance[variance > 0], 5) if np.any(variance > 0) else 0
-    vmax = np.percentile(variance[variance > 0], 95) if np.any(variance > 0) else 1
-    sc = ax.scatter(
-        pts[:, 0],
-        pts[:, 1],
-        c=var_sorted,
-        cmap=CONSISTENCY_CMAP,
-        s=0.5,
-        alpha=0.6,
-        vmin=vmin,
-        vmax=vmax,
-        rasterized=True,
-    )
-    ax.set_xlabel("X (m)", color="#888", fontsize=10)
-    ax.set_ylabel("Z (m)", color="#888", fontsize=10)
-    ax.set_title(title, color="white", fontsize=11, pad=10)
-    ax.set_aspect("equal")
-    cbar = fig.colorbar(sc, ax=ax, shrink=0.8, pad=0.02)
-    cbar.set_label("Action variance", color="white", fontsize=9)
-    cbar.ax.tick_params(colors="#aaa", labelsize=7)
-
-
-def render(results: list[dict], out_path: Path) -> None:
-    """
-    4-row x N-column figure:
-      Row 0: State-based variance histogram
-      Row 1: Image-based variance histogram
-      Row 2: Per-episode curves (both overlaid)
-      Row 3: Spatial heatmap (image-based variance)
-    """
-    n_ds = len(results)
-    fig, axes = plt.subplots(4, n_ds, figsize=(9 * n_ds, 24), facecolor="#0d1117")
-    if n_ds == 1:
-        axes = axes[:, np.newaxis]
-
-    headline_parts = []
-    for col, r in enumerate(results):
-        label = r["label"]
-        var_s = r["var_state"]
-        var_i = r["var_image"]
-        tcp_xz = r["tcp_xz"]
-        episode_ids = r["episode_ids"]
-
-        med_s = np.median(var_s)
-        med_i = np.median(var_i)
-        headline_parts.append(f"{label}: state={med_s:.3f}, image={med_i:.3f}")
-
-        _plot_histogram(axes[0, col], var_s, f"{label}\nState-based variance (K={K_NEIGHBORS})", "#4363d8")
-        _plot_histogram(
-            axes[1, col], var_i, f"{label}\nImage-based variance (SigLIP, K={K_NEIGHBORS})", "#e6194b"
-        )
-        _plot_episode_curves(
-            axes[2, col],
-            var_s,
-            var_i,
-            episode_ids,
-            f"{label}\nPer-episode inconsistency ({len(np.unique(episode_ids)):,} episodes)",
-        )
-        _plot_heatmap(
-            axes[3, col],
-            fig,
-            tcp_xz,
-            var_i,
-            f"{label}\nImage-based variance by TCP position (XZ)",
-        )
-
-    fig.suptitle(
-        f"Action Consistency: State vs Image  (chunk={ACTION_CHUNK_SIZE}, K={K_NEIGHBORS})\n"
-        + "  |  ".join(headline_parts),
-        color="white",
-        fontsize=15,
-        y=0.99,
-    )
-    plt.tight_layout(rect=[0, 0, 1, 0.96])
-    plt.savefig(out_path, dpi=DPI, bbox_inches="tight", facecolor=fig.get_facecolor())
-    plt.close()
-    print(f"\n✓ Saved: {out_path}")
-
-
-# ── Main ────────────────────────────────────────────────
-
-
-def main() -> None:
-    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
-    print(f"Device: {device}")
-    rng = np.random.default_rng(SEED)
-    results = []
-
-    for ds in DATASETS:
-        repo_id, label = ds["repo_id"], ds["label"]
-        print(f"\n{'=' * 60}")
-        print(f"  {label}: {repo_id}")
-        print(f"{'=' * 60}")
-
-        local = download_data(repo_id, CAMERA_KEY)
-        data = load_state_action_data(local, MAX_FRAMES, ACTION_CHUNK_SIZE, rng)
-
-        # --- State-based KNN ---
-        var_state = compute_consistency(
-            data["state_norm"], data["action_norm"], data["episode_ids"], K_NEIGHBORS, "state"
-        )
-        print(
-            f"  State variance: median={np.median(var_state):.4f}  "
-            f"mean={np.mean(var_state):.4f}  p90={np.percentile(var_state, 90):.4f}"
-        )
-
-        # --- Image-based KNN ---
-        print("\n  Preparing image embeddings …")
-        video_lookup = build_video_lookup(local, CAMERA_KEY)
-        frames = extract_frames(data["chosen_idx"], data["episode_all"], video_lookup)
-        embeddings = encode_frames_siglip(frames, ENCODER_MODEL, ENCODE_BATCH_SIZE, device)
-        del frames  # free memory
-
-        var_image = compute_consistency(
-            embeddings, data["action_norm"], data["episode_ids"], K_NEIGHBORS, "image"
-        )
-        print(
-            f"  Image variance: median={np.median(var_image):.4f}  "
-            f"mean={np.mean(var_image):.4f}  p90={np.percentile(var_image, 90):.4f}"
-        )
-
-        # FK for spatial heatmap
-        print("  Computing FK for spatial heatmap …")
-        left_raw = data["state_raw"][:, data["left_joint_idx"]]
-        left_rad = _detect_and_convert(left_raw)
-        left_tcp = batch_fk(LEFT_CHAIN, left_rad)
-        tcp_xz = left_tcp[:, [0, 2]]
-
-        results.append(
-            {
-                "label": label,
-                "var_state": var_state,
-                "var_image": var_image,
-                "episode_ids": data["episode_ids"],
-                "tcp_xz": tcp_xz,
-                "n_total": data["n_total"],
-            }
-        )
-
-    out = OUTPUT_DIR / "action_consistency_comparison.jpg"
-    render(results, out)
-
-    # Save worst-episodes summary (image-based, since that's the stronger signal)
-    worst_summary = {}
-    for r in results:
-        unique_eps = np.unique(r["episode_ids"])
-        ep_means = {int(ep): float(r["var_image"][r["episode_ids"] == ep].mean()) for ep in unique_eps}
-        ranked = sorted(ep_means.items(), key=lambda x: x[1], reverse=True)[:50]
-        worst_summary[r["label"]] = [{"episode": ep, "mean_variance": v} for ep, v in ranked]
-    worst_path = OUTPUT_DIR / "action_consistency_worst_episodes.json"
-    worst_path.write_text(json.dumps(worst_summary, indent=2))
-    print(f"✓ Saved worst episodes: {worst_path}")
-
-
-if __name__ == "__main__":
-    main()
@@ -1,178 +0,0 @@
-"""
-Create a JPG grid of random frames sampled from a LeRobot video dataset.
-Downloads metadata + video chunks from HuggingFace, picks random frames,
-decodes them, and tiles into a single image.
-"""
-
-import json
-import random
-from pathlib import Path
-
-import cv2
-import numpy as np
-import pandas as pd
-from huggingface_hub import snapshot_download
-
-REPO_ID = "lerobot-data-collection/level2_final_quality3"
-CAMERA_KEY = "observation.images.base"
-GRID_COLS = 15
-GRID_ROWS = 10
-THUMB_WIDTH = 160
-OUTPUT_DIR = Path(__file__).resolve().parent / "outputs"
-OUTPUT_DIR.mkdir(exist_ok=True)
-SEED = 1
-
-
-def download_metadata(repo_id: str) -> Path:
-    """Download only metadata (no videos yet)."""
-    print(f"[1/3] Downloading metadata for {repo_id} …")
-    return Path(
-        snapshot_download(
-            repo_id=repo_id,
-            repo_type="dataset",
-            allow_patterns=["meta/**"],
-            ignore_patterns=["*.mp4"],
-        )
-    )
-
-
-def load_video_info(local: Path) -> tuple[str, list[dict], int]:
-    """Parse info.json and episode parquets. Returns (camera_key, episode_rows, fps)."""
-    info = json.loads((local / "meta" / "info.json").read_text())
-    fps = info["fps"]
-    features = info["features"]
-
-    video_keys = [k for k, v in features.items() if v.get("dtype") == "video"]
-    if not video_keys:
-        raise RuntimeError("No video keys found in dataset features")
-
-    if CAMERA_KEY is not None:
-        if CAMERA_KEY not in video_keys:
-            raise RuntimeError(f"CAMERA_KEY='{CAMERA_KEY}' not found. Available: {video_keys}")
-        cam = CAMERA_KEY
-    else:
-        cam = video_keys[0]
-    print(f"   camera='{cam}'  all_cams={video_keys}  fps={fps}")
-
-    ep_rows = []
-    for pq in sorted((local / "meta" / "episodes").glob("**/*.parquet")):
-        ep_rows.append(pd.read_parquet(pq))
-    ep_df = pd.concat(ep_rows, ignore_index=True)
-
-    video_template = info.get(
-        "video_path",
-        "videos/{video_key}/chunk-{chunk_index:03d}/file-{file_index:03d}.mp4",
-    )
-
-    chunk_col = f"videos/{cam}/chunk_index"
-    file_col = f"videos/{cam}/file_index"
-    ts_from = f"videos/{cam}/from_timestamp"
-    ts_to = f"videos/{cam}/to_timestamp"
-    if chunk_col not in ep_df.columns:
-        chunk_col = f"{cam}/chunk_index"
-        file_col = f"{cam}/file_index"
-        ts_from = f"{cam}/from_timestamp"
-        ts_to = f"{cam}/to_timestamp"
-
-    episodes = []
-    for _, row in ep_df.iterrows():
-        ci = int(row[chunk_col])
-        fi = int(row[file_col])
-        episodes.append(
-            {
-                "episode_index": int(row["episode_index"]),
-                "chunk_index": ci,
-                "file_index": fi,
-                "from_ts": float(row[ts_from]),
-                "to_ts": float(row[ts_to]),
-                "video_rel": video_template.format(video_key=cam, chunk_index=ci, file_index=fi),
-            }
-        )
-    return cam, episodes, fps
-
-
-def pick_random_frames(episodes: list[dict], fps: int, n: int, rng: random.Random) -> list[dict]:
-    """Pick up to n random (episode, timestamp) pairs (zero-length episodes are skipped), sorted by video file."""
-    picks = []
-    for _ in range(n):
-        ep = rng.choice(episodes)
-        duration = ep["to_ts"] - ep["from_ts"]
-        if duration <= 0:
-            continue
-        t = ep["from_ts"] + rng.random() * duration
-        picks.append({**ep, "seek_ts": t})
-    picks.sort(key=lambda p: (p["video_rel"], p["seek_ts"]))
-    return picks
-
-
-def download_video_files(repo_id: str, local: Path, picks: list[dict]) -> None:
-    """Download only the video files we need."""
-    needed = sorted({p["video_rel"] for p in picks})
-    print(f"[2/3] Downloading {len(needed)} video file(s) …")
-    snapshot_download(
-        repo_id=repo_id,
-        repo_type="dataset",
-        local_dir=str(local),
-        allow_patterns=needed,
-    )
-
-
-def extract_frame(video_path: Path, seek_ts: float) -> np.ndarray | None:
-    """Decode a single frame at the given timestamp."""
-    cap = cv2.VideoCapture(str(video_path))
-    cap.set(cv2.CAP_PROP_POS_MSEC, seek_ts * 1000.0)
-    ret, frame = cap.read()
-    cap.release()
-    return frame if ret else None
-
-
-def build_grid(frames: list[np.ndarray], cols: int, thumb_w: int) -> np.ndarray:
-    """Resize frames to uniform thumbnails and tile into a grid."""
-    if not frames:
-        raise RuntimeError("No frames decoded")
-
-    h0, w0 = frames[0].shape[:2]
-    thumb_h = int(thumb_w * h0 / w0)
-
-    thumbs = [cv2.resize(f, (thumb_w, thumb_h), interpolation=cv2.INTER_AREA) for f in frames]
-
-    rows = []
-    for i in range(0, len(thumbs), cols):
-        row_thumbs = thumbs[i : i + cols]
-        while len(row_thumbs) < cols:
-            row_thumbs.append(np.zeros_like(row_thumbs[0]))
-        rows.append(np.hstack(row_thumbs))
-    return np.vstack(rows)
-
-
-def main() -> None:
-    rng = random.Random(SEED)
-    n_frames = GRID_COLS * GRID_ROWS
-
-    local = download_metadata(REPO_ID)
-    cam, episodes, fps = load_video_info(local)
-    picks = pick_random_frames(episodes, fps, n_frames, rng)
-    download_video_files(REPO_ID, local, picks)
-
-    print(f"[3/3] Decoding {n_frames} frames …")
-    frames: list[np.ndarray] = []
-    for p in picks:
-        vp = local / p["video_rel"]
-        if not vp.exists():
-            print(f"   SKIP: {p['video_rel']} not found")
-            continue
-        frame = extract_frame(vp, p["seek_ts"])
-        if frame is not None:
-            frames.append(frame)
-
-    print(f"   Decoded {len(frames)}/{n_frames} frames")
-    grid = build_grid(frames, GRID_COLS, THUMB_WIDTH)
-
-    safe_name = REPO_ID.replace("/", "_")
-    out_path = OUTPUT_DIR / f"{safe_name}_grid_{GRID_COLS}x{GRID_ROWS}.jpg"
-    cv2.imwrite(str(out_path), grid, [cv2.IMWRITE_JPEG_QUALITY, 92])
-    print(f"\n✓ Saved: {out_path}  ({grid.shape[1]}×{grid.shape[0]})")
-
-
-if __name__ == "__main__":
-    main()
@@ -1,526 +0,0 @@
-"""
-Create MP4 videos with sarm_progress overlay for specified episodes.
-Downloads datasets from HuggingFace, extracts episode video + progress data,
-and draws the progress line directly on each frame (no panel, no axes).
-"""
-
-import json
-import subprocess
-from pathlib import Path
-
-import cv2
-import numpy as np
-import pandas as pd
-from huggingface_hub import snapshot_download
-
-DATASETS = [
-    {"repo_id": "lerobot-data-collection/level2_final_quality3", "episode": 250},
-]
-CAMERA_KEY = (
-    "observation.images.base"  # None = auto-select first camera, or set e.g. "observation.images.top"
-)
-OUTPUT_DIR = Path(__file__).resolve().parent / "outputs"
-OUTPUT_DIR.mkdir(exist_ok=True)
-
-# Progress line spans the full video height
-GRAPH_Y_TOP_FRAC = 0.01
-GRAPH_Y_BOT_FRAC = 0.99
-LINE_THICKNESS = 3
-SHADOW_THICKNESS = 6  # white edge thickness
-REF_ALPHA = 0.45  # opacity of the 1.0 reference line
-FILL_ALPHA = 0.55  # opacity of the grey fill under the line
-SCORE_FONT_SCALE = 0.8
-TASK_FONT_SCALE = 0.55
-
-
-def download_episode(repo_id: str, episode: int) -> Path:
-    """Download only the files needed for this episode."""
-    # We need: meta/, sarm_progress.parquet, and the relevant video/data chunks.
-    # We'll download meta + sarm first, then figure out chunks.
-    print(f"\n[1/5] Downloading metadata for {repo_id} …")
-    local = Path(
-        snapshot_download(
-            repo_id=repo_id,
-            repo_type="dataset",
-            allow_patterns=["meta/**", "sarm_progress.parquet"],
-            ignore_patterns=["*.mp4"],
-        )
-    )
-    return local
-
-
-def load_episode_meta(local: Path, episode: int) -> dict:
-    """Read info.json + episode-level parquet to get fps, video paths, timestamps."""
-    info = json.loads((local / "meta" / "info.json").read_text())
-    fps = info["fps"]
-    features = info["features"]
-
-    # Find video keys (keys whose dtype=="video")
-    video_keys = [k for k, v in features.items() if v.get("dtype") == "video"]
-    if not video_keys:
-        raise RuntimeError("No video keys found in dataset features")
-    if CAMERA_KEY is not None:
-        if CAMERA_KEY not in video_keys:
-            raise RuntimeError(f"CAMERA_KEY='{CAMERA_KEY}' not found. Available: {video_keys}")
-        first_cam = CAMERA_KEY
-    else:
-        first_cam = video_keys[0]
-    print(f"   fps={fps}  camera='{first_cam}'  all_cams={video_keys}")
-
-    # Load all episode-meta parquet files and find our episode
-    ep_rows = []
-    for pq in sorted((local / "meta" / "episodes").glob("**/*.parquet")):
-        df = pd.read_parquet(pq)
-        ep_rows.append(df)
-    ep_df = pd.concat(ep_rows, ignore_index=True)
-    row = ep_df[ep_df["episode_index"] == episode]
-    if row.empty:
-        raise RuntimeError(f"Episode {episode} not found in episode metadata")
-    row = row.iloc[0]
-
-    # Extract video chunk/file index for first camera
-    # Column names may or may not carry a 'videos/' prefix; try the prefixed form first
-    chunk_col = f"videos/{first_cam}/chunk_index"
-    file_col = f"videos/{first_cam}/file_index"
-    ts_col = f"videos/{first_cam}/from_timestamp"
-    to_col = f"videos/{first_cam}/to_timestamp"
-
-    # Some datasets use different column naming
-    if chunk_col not in row.index:
-        # Try without the 'videos/' prefix
-        chunk_col = f"{first_cam}/chunk_index"
-        file_col = f"{first_cam}/file_index"
-        ts_col = f"{first_cam}/from_timestamp"
-        to_col = f"{first_cam}/to_timestamp"
-    if chunk_col not in row.index:
-        raise RuntimeError(
-            f"Cannot find video metadata columns for {first_cam}.\nAvailable: {list(row.index)}"
-        )
-
-    chunk_idx = int(row[chunk_col])
-    file_idx = int(row[file_col])
-    from_ts = float(row[ts_col])
-    to_ts = float(row[to_col])
-
-    video_template = info.get(
-        "video_path", "videos/{video_key}/chunk-{chunk_index:03d}/file-{file_index:03d}.mp4"
-    )
-    video_rel = video_template.format(
-        video_key=first_cam,
-        chunk_index=chunk_idx,
-        file_index=file_idx,
-    )
-
-    # Load task name for this episode
-    # tasks.parquet uses the task string as the row index; task_index column holds the int id
-    task_name = ""
-    try:
-        # Prefer the 'tasks' list directly on the episode row
-        if "tasks" in row.index and row["tasks"] is not None:
-            tasks_val = row["tasks"]
-            if isinstance(tasks_val, (list, tuple, np.ndarray)) and len(tasks_val) > 0:
-                task_name = str(tasks_val[0])
-            else:
-                task_name = str(tasks_val).strip("[]'")
-        else:
-            tasks_pq = local / "meta" / "tasks.parquet"
-            if tasks_pq.exists():
-                tasks_df = pd.read_parquet(tasks_pq)
-                # Row index is the task string; task_index column is the int
-                task_idx = int(row.get("task_index", 0)) if "task_index" in row.index else 0
-                match = tasks_df[tasks_df["task_index"] == task_idx]
-                if not match.empty:
-                    task_name = str(match.index[0])
-        print(f"   Task name: '{task_name}'")
-    except Exception as e:
-        print(f"   WARNING: could not load task name: {e}")
-
-    return {
-        "fps": fps,
-        "first_cam": first_cam,
-        "video_rel": video_rel,
-        "chunk_index": chunk_idx,
-        "file_index": file_idx,
-        "from_ts": from_ts,
-        "to_ts": to_ts,
-        "task_name": task_name,
-    }
-
-
-def download_video(repo_id: str, local: Path, video_rel: str) -> Path:
-    """Download the specific video file if not already present."""
-    video_path = local / video_rel
-    if video_path.exists():
-        print(f"   Video already cached: {video_path}")
-        return video_path
-    print(f"[2/5] Downloading video file {video_rel} …")
-    snapshot_download(
-        repo_id=repo_id,
-        repo_type="dataset",
-        local_dir=str(local),
-        allow_patterns=[video_rel],
-    )
-    if not video_path.exists():
-        raise RuntimeError(f"Video not found after download: {video_path}")
-    return video_path
-
-
-def load_progress(local: Path, episode: int) -> np.ndarray | None:
-    """Load sarm_progress values for this episode. Returns sorted array of (frame_index, progress)."""
-    pq_path = local / "sarm_progress.parquet"
-    if not pq_path.exists():
-        print("   WARNING: sarm_progress.parquet not found, trying data parquet …")
-        return None
-    df = pd.read_parquet(pq_path)
-    print(f"   sarm_progress.parquet columns: {list(df.columns)}")
-    ep_df = df[df["episode_index"] == episode].copy()
-    if ep_df.empty:
-        print(f"   WARNING: No sarm_progress rows for episode {episode}")
-        return None
-    ep_df = ep_df.sort_values("frame_index")
-
-    # Prefer dense, fall back to sparse
-    if "progress_dense" in ep_df.columns and ep_df["progress_dense"].notna().any():
-        prog_col = "progress_dense"
-    elif "progress_sparse" in ep_df.columns:
-        prog_col = "progress_sparse"
-    else:
-        # Last resort: any column with 'progress' in the name
-        prog_cols = [c for c in ep_df.columns if "progress" in c.lower()]
-        if not prog_cols:
-            return None
-        prog_col = prog_cols[0]
-
-    print(f"   Using progress column: '{prog_col}'")
-    return ep_df[["frame_index", prog_col]].rename(columns={prog_col: "progress"}).values
-
-
-def extract_episode_clip(video_path: Path, from_ts: float, to_ts: float, out_path: Path) -> Path:
-    """Use ffmpeg to cut the episode segment from the combined video file."""
-    duration = to_ts - from_ts
-    print(f"[3/5] Extracting clip [{from_ts:.3f}s → {to_ts:.3f}s] ({duration:.2f}s) …")
-    cmd = [
-        "ffmpeg",
-        "-y",
-        "-ss",
-        str(from_ts),
-        "-i",
-        str(video_path),
-        "-t",
-        str(duration),
-        "-c:v",
-        "libx264",
-        "-preset",
-        "fast",
-        "-crf",
-        "18",
-        "-an",
-        str(out_path),
-    ]
-    result = subprocess.run(cmd, capture_output=True, text=True)
-    if result.returncode != 0:
-        raise RuntimeError(f"ffmpeg clip extraction failed:\n{result.stderr}")
-    return out_path
-
-
-def precompute_pixels(
-    progress_data: np.ndarray,
-    n_frames: int,
-    frame_w: int,
-    frame_h: int,
-) -> np.ndarray:
-    """
-    Map each progress sample to pixel coordinates.
-    Returns array of shape (N, 2) with (x, y) in pixel space.
-    x spans full video width; y maps progress [0,1] to graph band.
-    """
-    frame_indices = progress_data[:, 0].astype(float)
-    progress_vals = np.clip(progress_data[:, 1].astype(float), 0.0, 1.0)
-
-    y_top = int(frame_h * GRAPH_Y_TOP_FRAC)
-    y_bot = int(frame_h * GRAPH_Y_BOT_FRAC)
-    graph_h = y_bot - y_top
-
-    xs = (frame_indices / max(n_frames - 1, 1) * (frame_w - 1)).astype(int)
-    # progress=1 → y_top, progress=0 → y_bot
-    ys = (y_bot - progress_vals * graph_h).astype(int)
-
-    return np.stack([xs, ys], axis=1)  # (N, 2)
-
-
-def progress_color(t: float) -> tuple[int, int, int]:
-    """Interpolate BGR color red→green based on normalised position t in [0,1]."""
-    r = int(255 * (1.0 - t))
-    g = int(255 * t)
-    return (0, g, r)  # BGR
-
-
-def prerender_fill(
-    pixels: np.ndarray,
-    frame_w: int,
-    frame_h: int,
-) -> np.ndarray:
-    """Pre-render the full grey fill polygon under the curve as a BGRA image."""
-    y_bot = int(frame_h * GRAPH_Y_BOT_FRAC)
-    fill_img = np.zeros((frame_h, frame_w, 4), dtype=np.uint8)
-    poly = np.concatenate(
-        [
-            pixels,
-            [[pixels[-1][0], y_bot], [pixels[0][0], y_bot]],
-        ],
-        axis=0,
-    ).astype(np.int32)
-    cv2.fillPoly(fill_img, [poly], color=(128, 128, 128, int(255 * FILL_ALPHA)))
-    return fill_img
-
-
-def alpha_composite(base: np.ndarray, overlay_bgra: np.ndarray, x_max: int) -> None:
-    """Blend overlay onto base in-place, but only for x < x_max."""
-    if x_max <= 0:
-        return
-    roi_b = base[:, :x_max]
-    roi_o = overlay_bgra[:, :x_max]
-    alpha = roi_o[:, :, 3:4].astype(np.float32) / 255.0
-    roi_b[:] = np.clip(
-        roi_o[:, :, :3].astype(np.float32) * alpha + roi_b.astype(np.float32) * (1.0 - alpha),
-        0,
-        255,
-    ).astype(np.uint8)
-
-
-def draw_text_outlined(
-    frame: np.ndarray,
-    text: str,
-    pos: tuple[int, int],
-    font_scale: float,
-    thickness: int = 1,
-) -> None:
-    """Draw text with a dark outline for readability on any background."""
-    font = cv2.FONT_HERSHEY_SIMPLEX
-    cv2.putText(frame, text, pos, font, font_scale, (0, 0, 0), thickness + 2, cv2.LINE_AA)
-    cv2.putText(frame, text, pos, font, font_scale, (255, 255, 255), thickness, cv2.LINE_AA)
-
-
-def composite_video(
-    clip_path: Path,
-    progress_data: np.ndarray,
-    out_path: Path,
-    fps: float,
-    frame_h: int,
-    frame_w: int,
-    task_name: str = "",
-) -> Path:
-    """Read clip frames, draw gradient progress line with fill + labels, export as GIF."""
-    n_total = int(cv2.VideoCapture(str(clip_path)).get(cv2.CAP_PROP_FRAME_COUNT))
-    pixels = precompute_pixels(progress_data, n_total, frame_w, frame_h)
-
-    y_ref = int(frame_h * GRAPH_Y_TOP_FRAC)
-
-    # Pre-render fill polygon (line is drawn per-frame with live color)
-    fill_img = prerender_fill(pixels, frame_w, frame_h)
-
-    # 1.0 reference line overlay (full width, drawn once)
-    ref_img = np.zeros((frame_h, frame_w, 4), dtype=np.uint8)
-    cv2.line(ref_img, (0, y_ref), (frame_w - 1, y_ref), (200, 200, 200, int(255 * REF_ALPHA)), 1, cv2.LINE_AA)
-
-    frame_indices = progress_data[:, 0].astype(int)
-    progress_vals = progress_data[:, 1].astype(float)
-
-    print(f"[4/4] Compositing {n_total} frames …")
-    cap = cv2.VideoCapture(str(clip_path))
-    fourcc = cv2.VideoWriter_fourcc(*"mp4v")
-    tmp_path = out_path.parent / (out_path.stem + "_tmp.mp4")
-    writer = cv2.VideoWriter(str(tmp_path), fourcc, fps, (frame_w, frame_h))
-
-    fi = 0
-    while True:
-        ret, frame = cap.read()
-        if not ret:
-            break
-
-        n_drawn = int(np.searchsorted(frame_indices, fi, side="right"))
-        x_cur = int(pixels[min(n_drawn, len(pixels)) - 1][0]) + 1 if n_drawn > 0 else 0
-
-        # 1. reference line (full width, always)
-        alpha_composite(frame, ref_img, frame_w)
-
-        # 2. grey fill under curve up to current x
-        alpha_composite(frame, fill_img, x_cur)
-
-        # 3. progress line — single color that transitions red→green over time
-        if n_drawn >= 2:
-            t_cur = (n_drawn - 1) / max(len(progress_vals) - 1, 1)
-            line_col = progress_color(t_cur)
-            pts = pixels[:n_drawn].reshape(-1, 1, 2).astype(np.int32)
-            cv2.polylines(
-                frame,
-                [pts],
-                isClosed=False,
-                color=(255, 255, 255),
-                thickness=SHADOW_THICKNESS,
-                lineType=cv2.LINE_AA,
-            )
-            cv2.polylines(
-                frame, [pts], isClosed=False, color=line_col, thickness=LINE_THICKNESS, lineType=cv2.LINE_AA
-            )
-
-        # 4. score — bottom right
-        if n_drawn > 0:
-            score = float(progress_vals[min(n_drawn, len(progress_vals)) - 1])
-            score_text = f"{score:.2f}"
-            (tw, th), _ = cv2.getTextSize(score_text, cv2.FONT_HERSHEY_SIMPLEX, SCORE_FONT_SCALE, 2)
-            sx = frame_w - tw - 12
-            sy = frame_h - 12
-            # coloured score matching current gradient position
-            t_cur = (n_drawn - 1) / max(len(progress_vals) - 1, 1)
-            score_col = progress_color(t_cur)
-            cv2.putText(
-                frame,
-                score_text,
-                (sx, sy),
-                cv2.FONT_HERSHEY_SIMPLEX,
-                SCORE_FONT_SCALE,
-                (0, 0, 0),
-                4,
-                cv2.LINE_AA,
-            )
-            cv2.putText(
-                frame,
-                score_text,
-                (sx, sy),
-                cv2.FONT_HERSHEY_SIMPLEX,
-                SCORE_FONT_SCALE,
-                score_col,
-                2,
-                cv2.LINE_AA,
-            )
-
-        # 5. task name — top centre
-        if task_name:
-            (tw, _), _ = cv2.getTextSize(task_name, cv2.FONT_HERSHEY_SIMPLEX, TASK_FONT_SCALE, 1)
-            tx = max((frame_w - tw) // 2, 4)
-            draw_text_outlined(frame, task_name, (tx, 22), TASK_FONT_SCALE)
-
-        writer.write(frame)
-        fi += 1
-        if fi % 100 == 0:
-            print(f"   Frame {fi}/{n_total} …", end="\r")
-
-    cap.release()
-    writer.release()
-    print()
-
-    # Convert to GIF: full resolution, 10fps, 128-color diff palette (<40MB)
-    gif_path = out_path.with_suffix(".gif")
-    palette = out_path.parent / "_palette.png"
-    r1 = subprocess.run(  # nosec B607
-        [
-            "ffmpeg",
-            "-y",
-            "-i",
-            str(tmp_path),
-            "-vf",
-            f"fps=10,scale={frame_w}:-1:flags=lanczos,palettegen=max_colors=128:stats_mode=diff",
-            "-update",
-            "1",
-            str(palette),
-        ],
-        capture_output=True,
-        text=True,
-    )
-    if r1.returncode != 0:
-        print(f"   WARNING: palettegen failed:\n{r1.stderr[-500:]}")
-    r2 = subprocess.run(  # nosec B607
-        [
-            "ffmpeg",
-            "-y",
-            "-i",
-            str(tmp_path),
-            "-i",
-            str(palette),
-            "-filter_complex",
-            f"fps=10,scale={frame_w}:-1:flags=lanczos[v];[v][1:v]paletteuse=dither=bayer:bayer_scale=3",
-            str(gif_path),
-        ],
-        capture_output=True,
-        text=True,
-    )
-    if r2.returncode != 0:
-        print(f"   WARNING: gif encode failed:\n{r2.stderr[-500:]}")
-    tmp_path.unlink(missing_ok=True)
-    palette.unlink(missing_ok=True)
-    return gif_path
-
-
-def process_dataset(repo_id: str, episode: int):
-    safe_name = repo_id.replace("/", "_")
-    print(f"\n{'=' * 60}")
-    print(f"Processing: {repo_id}  |  episode {episode}")
-    print(f"{'=' * 60}")
-
-    # 1. Download metadata
-    local = download_episode(repo_id, episode)
-    print(f"   Local cache: {local}")
-
-    # 2. Read episode metadata
-    ep_meta = load_episode_meta(local, episode)
-    print(f"   Episode meta: {ep_meta}")
-
-    # 3. Download video file
-    video_path = download_video(repo_id, local, ep_meta["video_rel"])
-
-    # 4. Extract clip
-    clip_path = OUTPUT_DIR / f"{safe_name}_ep{episode}_clip.mp4"
-    extract_episode_clip(video_path, ep_meta["from_ts"], ep_meta["to_ts"], clip_path)
-
-    # 5. Load progress data
-    progress_data = load_progress(local, episode)
-    if progress_data is None:
-        print("   ERROR: Could not load sarm_progress data. Skipping overlay.")
-        return
-
-    n_progress = len(progress_data)
-    print(f"   Progress frames: {n_progress}")
-
-    # 6. Get clip dimensions
-    cap = cv2.VideoCapture(str(clip_path))
-    frame_w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
-    frame_h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
-    n_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
-    actual_fps = cap.get(cv2.CAP_PROP_FPS) or ep_meta["fps"]
-    cap.release()
-    print(f"   Clip: {frame_w}×{frame_h}  {n_frames} frames @ {actual_fps:.1f}fps")
-
-    # 7. Composite (draw line directly on frames)
-    out_path = OUTPUT_DIR / f"{safe_name}_ep{episode}_progress.mp4"
-    final = composite_video(
-        clip_path,
-        progress_data,
-        out_path,
-        actual_fps,
-        frame_h,
-        frame_w,
-        task_name=ep_meta.get("task_name", ""),
-    )
-    clip_path.unlink(missing_ok=True)
-    print(f"\n✓ Done: {final}")
-    return final
-
-
-if __name__ == "__main__":
-    results = []
-    for cfg in DATASETS:
-        try:
-            out = process_dataset(cfg["repo_id"], cfg["episode"])
-            if out:
-                results.append(out)
-        except Exception as e:
-            print(f"\nERROR processing {cfg['repo_id']}: {e}")
-            import traceback
-
-            traceback.print_exc()
-
-    print("\n" + "=" * 60)
-    print("Output files:")
-    for r in results:
-        print(f"  {r}")
@@ -1,496 +0,0 @@
-"""
-Visualize end-effector workspace density and trajectory clusters for OpenArm datasets.
-Downloads joint position data (no videos) from HuggingFace, computes forward
-kinematics per episode, clusters trajectories with K-means, and renders
-2D projections comparing dataset coverage and multimodality.
-"""
-
-import json
-from pathlib import Path
-
-import matplotlib.pyplot as plt
-import numpy as np
-import pandas as pd
-from huggingface_hub import snapshot_download
-from sklearn.cluster import KMeans
-
-DATASETS = [
-    {"repo_id": "lerobot-data-collection/level2_final_quality3", "label": "HQ curated"},
-    {"repo_id": "lerobot-data-collection/level12_rac_2_2026-02-08_1", "label": "Full collection"},
-]
-OUTPUT_DIR = Path(__file__).resolve().parent / "outputs"
-OUTPUT_DIR.mkdir(exist_ok=True)
-
-N_CLUSTERS = 10
-WAYPOINTS = 50
-SEED = 42
-DPI = 180
-
-CLUSTER_COLORS = [
-    "#e6194b",
-    "#3cb44b",
-    "#4363d8",
-    "#f58231",
-    "#911eb4",
-    "#42d4f4",
-    "#f032e6",
-    "#bfef45",
-    "#fabed4",
-    "#dcbeff",
-    "#9a6324",
-    "#fffac8",
-    "#800000",
-    "#aaffc3",
-    "#808000",
-    "#ffd8b1",
-    "#000075",
-    "#a9a9a9",
-]
-
-# FK chains extracted from OpenArm bimanual URDF.
-# Each entry: (rpy, xyz, revolute_axis_or_None).
-LEFT_CHAIN = [
-    ((-np.pi / 2, 0, 0), (0, 0.031, 0.698), None),
-    ((0, 0, 0), (0, 0, 0.0625), (0, 0, 1)),
-    ((-np.pi / 2, 0, 0), (-0.0301, 0, 0.06), (-1, 0, 0)),
-    ((0, 0, 0), (0.0301, 0, 0.06625), (0, 0, 1)),
-    ((0, 0, 0), (0, 0.0315, 0.15375), (0, 1, 0)),
-    ((0, 0, 0), (0, -0.0315, 0.0955), (0, 0, 1)),
-    ((0, 0, 0), (0.0375, 0, 0.1205), (1, 0, 0)),
-    ((0, 0, 0), (-0.0375, 0, 0), (0, -1, 0)),
-    ((0, 0, 0), (0, 0, 0.1001), None),
-    ((0, 0, 0), (0, 0, 0.08), None),
-]
-RIGHT_CHAIN = [
-    ((np.pi / 2, 0, 0), (0, -0.031, 0.698), None),
-    ((0, 0, 0), (0, 0, 0.0625), (0, 0, 1)),
-    ((np.pi / 2, 0, 0), (-0.0301, 0, 0.06), (-1, 0, 0)),
-    ((0, 0, 0), (0.0301, 0, 0.06625), (0, 0, 1)),
-    ((0, 0, 0), (0, 0.0315, 0.15375), (0, 1, 0)),
-    ((0, 0, 0), (0, -0.0315, 0.0955), (0, 0, 1)),
-    ((0, 0, 0), (0.0375, 0, 0.1205), (1, 0, 0)),
-    ((0, 0, 0), (-0.0375, 0, 0), (0, 1, 0)),
-    ((0, 0, 0), (0, 0, 0.1001), None),
-    ((0, 0, 0), (0, 0, 0.08), None),
-]
-
-
-# ── FK math ─────────────────────────────────────────────
-
-
-def _rot_x(a: float) -> np.ndarray:
-    c, s = np.cos(a), np.sin(a)
-    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])
-
-
-def _rot_y(a: float) -> np.ndarray:
-    c, s = np.cos(a), np.sin(a)
-    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
-
-
-def _rot_z(a: float) -> np.ndarray:
-    c, s = np.cos(a), np.sin(a)
-    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])
-
-
-def _tf(rpy: tuple, xyz: tuple) -> np.ndarray:
-    """Build a 4x4 homogeneous transform from URDF rpy + xyz."""
-    r, p, y = rpy
-    mat = np.eye(4)
-    mat[:3, :3] = _rot_z(y) @ _rot_y(p) @ _rot_x(r)
-    mat[:3, 3] = xyz
-    return mat
-
-
-def _batch_axis_rot(axis: tuple, angles: np.ndarray) -> np.ndarray:
-    """Batched Rodrigues rotation: (n,) angles around a fixed axis → (n, 4, 4)."""
-    n = len(angles)
-    ax = np.asarray(axis, dtype=np.float64)
-    ax = ax / np.linalg.norm(ax)
-    x, y, z = ax
-    c = np.cos(angles)
-    s = np.sin(angles)
-    t = 1 - c
-    rot = np.zeros((n, 4, 4))
-    rot[:, 0, 0] = t * x * x + c
-    rot[:, 0, 1] = t * x * y - s * z
-    rot[:, 0, 2] = t * x * z + s * y
-    rot[:, 1, 0] = t * x * y + s * z
-    rot[:, 1, 1] = t * y * y + c
-    rot[:, 1, 2] = t * y * z - s * x
-    rot[:, 2, 0] = t * x * z - s * y
-    rot[:, 2, 1] = t * y * z + s * x
-    rot[:, 2, 2] = t * z * z + c
-    rot[:, 3, 3] = 1.0
-    return rot
-
-
-def batch_fk(chain: list, joint_angles: np.ndarray) -> np.ndarray:
-    """Vectorized FK: (n, 7) radians → (n, 3) TCP positions in world frame."""
-    n = joint_angles.shape[0]
-    tf_batch = np.tile(np.eye(4), (n, 1, 1))
-    qi = 0
-    for rpy, xyz, axis in chain:
-        tf_batch = tf_batch @ _tf(rpy, xyz)
-        if axis is not None:
-            rot = _batch_axis_rot(axis, joint_angles[:, qi])
-            tf_batch = np.einsum("nij,njk->nik", tf_batch, rot)
-            qi += 1
-    return tf_batch[:, :3, 3]
-
-
-# ── Data loading ────────────────────────────────────────
-
-
-def _flatten_names(obj: object) -> list[str]:
-    """Recursively flatten a names structure (list, dict, or nested) into a flat string list."""
-    if isinstance(obj, dict):
-        out: list[str] = []
-        for v in obj.values():
-            out.extend(_flatten_names(v))
-        return out
-    if isinstance(obj, (list, tuple)):
-        out = []
-        for item in obj:
-            if isinstance(item, (list, tuple, dict)):
-                out.extend(_flatten_names(item))
-            else:
-                out.append(str(item))
-        return out
-    return [str(obj)]
-
-
-def _detect_and_convert(vals: np.ndarray) -> np.ndarray:
-    """Auto-detect servo ticks / degrees / radians and convert to radians."""
-    mx = np.max(np.abs(vals))
-    if mx > 360:
-        print(f"    Unit detection: servo ticks (max={mx:.0f})")
-        return (vals - 2048) / 2048 * np.pi
-    if mx > 6.3:
-        print(f"    Unit detection: degrees (max={mx:.1f})")
-        return np.deg2rad(vals)
-    print(f"    Unit detection: radians (max={mx:.3f})")
-    return vals.astype(np.float64)
-
-
-def _find_joint_indices(features: dict, state_col: str, n_dim: int) -> tuple[list[int], list[int]]:
-    """Try to find left/right joint indices from info.json feature names."""
-    feat = features.get("observation.state", features.get(state_col, {}))
-    names = _flatten_names(feat.get("names", []))
-
-    left_idx: list[int] = []
-    right_idx: list[int] = []
-    if names and len(names) == n_dim:
-        names_l = [n.lower() for n in names]
-        print(f"  Feature names: {names[:4]}…{names[-4:]}")
-        for j in range(1, 8):
-            for i, nm in enumerate(names_l):
-                if f"left_joint_{j}" in nm and i not in left_idx:
-                    left_idx.append(i)
-                    break
-            for i, nm in enumerate(names_l):
-                if f"right_joint_{j}" in nm and i not in right_idx:
-                    right_idx.append(i)
-                    break
-
-    if len(left_idx) == 7 and len(right_idx) == 7:
-        print(f"  Matched by name: left={left_idx} right={right_idx}")
-        return left_idx, right_idx
-    if n_dim >= 16:
-        print("  Falling back to positional: [0:7]=left, [8:15]=right")
-        return list(range(7)), list(range(8, 15))
-    if n_dim >= 14:
-        print("  Falling back to positional: [0:7]=left, [7:14]=right")
-        return list(range(7)), list(range(7, 14))
-    raise RuntimeError(f"State dim {n_dim} too small for bimanual 7-DOF robot")
-
-
-def download_data(repo_id: str) -> Path:
-    print(f"  Downloading {repo_id} (parquet only) …")
-    return Path(
-        snapshot_download(
-            repo_id=repo_id,
-            repo_type="dataset",
-            allow_patterns=["meta/**", "data/**"],
-            ignore_patterns=["*.mp4", "videos/**"],
-        )
-    )
-
-
-def resample_trajectory(traj: np.ndarray, n_waypoints: int) -> np.ndarray:
-    """Resample a (F, 3) trajectory to exactly n_waypoints via linear interpolation."""
-    f = traj.shape[0]
-    if f == n_waypoints:
-        return traj
-    old_t = np.linspace(0, 1, f)
-    new_t = np.linspace(0, 1, n_waypoints)
-    return np.column_stack([np.interp(new_t, old_t, traj[:, d]) for d in range(3)])
-
-
-def load_episode_trajectories(local: Path) -> list[dict]:
-    """
-    Load per-episode joint data, compute FK, return list of trajectory dicts.
-    Each dict: {"left_tcp": (F,3), "right_tcp": (F,3), "episode_index": int}.
-    Uses all episodes in the dataset for a fair comparison.
-    """
-    info = json.loads((local / "meta" / "info.json").read_text())
-    features = info.get("features", {})
-
-    dfs = [pd.read_parquet(pq) for pq in sorted((local / "data").glob("**/*.parquet"))]
-    df = pd.concat(dfs, ignore_index=True)
-    print(f"  Total frames: {len(df):,}")
-
-    state_col = next((c for c in df.columns if "observation.state" in c), None)
-    if state_col is None:
-        raise RuntimeError(f"No observation.state column. Available: {list(df.columns)}")
-
-    first = df[state_col].iloc[0]
-    if not hasattr(first, "__len__"):
-        raise RuntimeError(f"observation.state is scalar ({type(first)}), expected array")
-
-    state = np.stack(df[state_col].values).astype(np.float64)
-    n_dim = state.shape[1]
-    print(f"  State dim: {n_dim}  max|val|: {np.max(np.abs(state)):.1f}")
-
-    left_idx, right_idx = _find_joint_indices(features, state_col, n_dim)
-
-    ep_col = next((c for c in df.columns if c == "episode_index"), None)
-    if ep_col is None:
-        raise RuntimeError(f"No episode_index column. Available: {list(df.columns)}")
-
-    episode_ids = df[ep_col].values
-    unique_eps = np.unique(episode_ids)
-    print(f"  Episodes: {len(unique_eps):,}")
-
-    left_raw = state[:, left_idx]
-    right_raw = state[:, right_idx]
-    left_all = _detect_and_convert(left_raw)
-    right_all = _detect_and_convert(right_raw)
-
-    print("  Computing FK per episode …")
-    trajectories = []
-    for ep_id in unique_eps:
-        mask = episode_ids == ep_id
-        left_tcp = batch_fk(LEFT_CHAIN, left_all[mask])
-        right_tcp = batch_fk(RIGHT_CHAIN, right_all[mask])
-        if len(left_tcp) < 3:
-            continue
-        trajectories.append({"left_tcp": left_tcp, "right_tcp": right_tcp, "episode_index": int(ep_id)})
-
-    print(f"  Valid trajectories: {len(trajectories):,}")
-    return trajectories
-
-
-# ── Clustering ──────────────────────────────────────────
-
-
-def cluster_trajectories(
-    trajectories: list[dict], n_clusters: int, n_waypoints: int
-) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
-    """
-    K-means on resampled trajectory features.
-    Combines left+right TCP into a single feature vector per episode.
-    Returns (labels, centroid_trajs (k, waypoints, 6), spread_per_cluster (k,) in metres).
-    Spread = mean per-waypoint Euclidean distance from each trajectory to its centroid.
-    """
-    feat_vecs = []
-    for t in trajectories:
-        left_rs = resample_trajectory(t["left_tcp"], n_waypoints)
-        right_rs = resample_trajectory(t["right_tcp"], n_waypoints)
-        feat_vecs.append(np.concatenate([left_rs.ravel(), right_rs.ravel()]))
-    feat_matrix = np.array(feat_vecs)
-
-    k = min(n_clusters, len(feat_vecs))
-    km = KMeans(n_clusters=k, n_init=10, random_state=SEED)
-    labels = km.fit_predict(feat_matrix)
-
-    centroids_flat = km.cluster_centers_
-    centroid_trajs = np.zeros((k, n_waypoints, 6))
-    for ci in range(k):
-        left_flat = centroids_flat[ci, : n_waypoints * 3]
-        right_flat = centroids_flat[ci, n_waypoints * 3 :]
-        centroid_trajs[ci, :, :3] = left_flat.reshape(n_waypoints, 3)
-        centroid_trajs[ci, :, 3:] = right_flat.reshape(n_waypoints, 3)
-
-    # Mean per-waypoint distance to centroid (in metres) for each cluster
-    spread = np.zeros(k)
-    for ci in range(k):
-        members = np.where(labels == ci)[0]
-        if len(members) == 0:
-            continue
-        centroid_left = centroid_trajs[ci, :, :3]
-        centroid_right = centroid_trajs[ci, :, 3:]
-        dists = []
-        for mi in members:
-            t = trajectories[mi]
-            left_rs = resample_trajectory(t["left_tcp"], n_waypoints)
-            right_rs = resample_trajectory(t["right_tcp"], n_waypoints)
-            d_left = np.linalg.norm(left_rs - centroid_left, axis=1).mean()
-            d_right = np.linalg.norm(right_rs - centroid_right, axis=1).mean()
-            dists.append((d_left + d_right) / 2)
-        spread[ci] = np.mean(dists)
-
-    return labels, centroid_trajs, spread
-
-
-# ── Visualization ───────────────────────────────────────
-
-PROJ_VIEWS = [
-    ("XZ (side)", 0, 2, "X (m)", "Z (m)"),
-    ("XY (top)", 0, 1, "X (m)", "Y (m)"),
-    ("YZ (front)", 1, 2, "Y (m)", "Z (m)"),
-]
-
-
-def render(results: list[dict], out_path: Path) -> None:
-    """
-    One row per dataset × 3 projection columns (a 2 × 3 grid for two datasets).
-    Trajectory lines colored by cluster, centroid trajectories drawn thick.
-    """
-    n_ds = len(results)
-    n_proj = len(PROJ_VIEWS)
-    fig, axes = plt.subplots(n_ds, n_proj, figsize=(7 * n_proj, 7 * n_ds), facecolor="#0d1117")
-    if n_ds == 1:
-        axes = axes[np.newaxis, :]
-
-    for row, r in enumerate(results):
-        trajectories = r["trajectories"]
-        labels = r["labels"]
-        centroids = r["centroids"]
-        k = centroids.shape[0]
-
-        cluster_sizes = np.bincount(labels, minlength=k)
-        size_order = np.argsort(-cluster_sizes)
-        pcts = cluster_sizes / len(labels) * 100
-        spread = r["spread"]
-
-        for col, (view_name, dim_a, dim_b, xlabel, ylabel) in enumerate(PROJ_VIEWS):
-            ax = axes[row, col]
-            ax.set_facecolor("#0d1117")
-
-            for ti, traj in enumerate(trajectories):
-                color = CLUSTER_COLORS[labels[ti] % len(CLUSTER_COLORS)]
-                for tcp_key in ("left_tcp", "right_tcp"):
-                    pts = traj[tcp_key]
-                    ax.plot(pts[:, dim_a], pts[:, dim_b], color=color, alpha=0.12, linewidth=0.4)
-
-            for ci in range(k):
-                color = CLUSTER_COLORS[ci % len(CLUSTER_COLORS)]
-                left_c = centroids[ci, :, :3]
-                right_c = centroids[ci, :, 3:]
-                lw = 1.5 + 2.0 * cluster_sizes[ci] / cluster_sizes.max()
-                for c_pts in (left_c, right_c):
-                    ax.plot(
-                        c_pts[:, dim_a],
-                        c_pts[:, dim_b],
-                        color=color,
-                        linewidth=lw,
-                        alpha=0.95,
-                        zorder=10,
-                    )
-                    ax.plot(
-                        c_pts[0, dim_a],
-                        c_pts[0, dim_b],
-                        "o",
-                        color=color,
-                        markersize=4,
-                        zorder=11,
-                    )
-                    ax.plot(
-                        c_pts[-1, dim_a],
-                        c_pts[-1, dim_b],
-                        "s",
-                        color=color,
-                        markersize=4,
-                        zorder=11,
-                    )
-
-            ax.set_xlabel(xlabel, color="#888", fontsize=9)
-            ax.set_ylabel(ylabel, color="#888", fontsize=9)
-            ax.tick_params(colors="#555", labelsize=7)
-            for spine in ax.spines.values():
-                spine.set_color("#333")
-            ax.set_aspect("equal")
-
-            mean_spread_cm = np.average(spread, weights=cluster_sizes) * 100
-            if col == 0:
-                ax.set_title(
-                    f"{r['label']}  ({r['n_episodes']:,} episodes, {k} clusters, "
-                    f"avg spread {mean_spread_cm:.1f}cm)",
-                    color="white",
-                    fontsize=11,
-                    pad=10,
-                )
-            else:
-                ax.set_title(view_name, color="#aaa", fontsize=10, pad=8)
-
-        # Cluster size + spread legend on the rightmost panel
-        legend_ax = axes[row, -1]
-        for ci in size_order:
-            color = CLUSTER_COLORS[ci % len(CLUSTER_COLORS)]
-            spread_cm = spread[ci] * 100
-            label = f"C{ci}: {cluster_sizes[ci]} eps ({pcts[ci]:.0f}%) ±{spread_cm:.1f}cm"
-            legend_ax.plot([], [], color=color, linewidth=3, label=label)
-        legend_ax.legend(
-            loc="upper right",
-            fontsize=7,
-            frameon=True,
-            facecolor="#1a1a2e",
-            edgecolor="#333",
-            labelcolor="white",
-            handlelength=1.5,
-        )
-
-    fig.suptitle(
-        "End-Effector Trajectory Clusters (FK · K-means)",
-        color="white",
-        fontsize=16,
-        y=0.98,
-    )
-    plt.tight_layout(rect=[0, 0, 1, 0.95])
-    plt.savefig(out_path, dpi=DPI, bbox_inches="tight", facecolor=fig.get_facecolor())
-    plt.close()
-    print(f"\n✓ Saved: {out_path}")
-
-
-# ── Main ────────────────────────────────────────────────
-
-
-def main() -> None:
-    results = []
-
-    for ds in DATASETS:
-        repo_id, label = ds["repo_id"], ds["label"]
-        print(f"\n{'=' * 60}")
-        print(f"  {label}: {repo_id}")
-        print(f"{'=' * 60}")
-
-        local = download_data(repo_id)
-        trajectories = load_episode_trajectories(local)
-        labels, centroids, spread = cluster_trajectories(trajectories, N_CLUSTERS, WAYPOINTS)
-
-        cluster_sizes = np.bincount(labels, minlength=centroids.shape[0])
-        print(f"  Cluster sizes: {sorted(cluster_sizes, reverse=True)}")
-        for ci in np.argsort(-cluster_sizes):
-            print(
-                f"    C{ci}: {cluster_sizes[ci]} eps ({cluster_sizes[ci] / len(labels) * 100:.0f}%) "
-                f"spread ±{spread[ci] * 100:.1f}cm"
-            )
-
-        results.append(
-            {
-                "label": label,
-                "trajectories": trajectories,
-                "labels": labels,
-                "centroids": centroids,
-                "spread": spread,
-                "n_episodes": len(trajectories),
-            }
-        )
-
-    out = OUTPUT_DIR / "workspace_trajectory_clusters.jpg"
-    render(results, out)
-
-
-if __name__ == "__main__":
-    main()
@@ -14,8 +14,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-from lerobot.datasets.feature_utils import hw_to_dataset_features
 from lerobot.datasets.lerobot_dataset import LeRobotDataset
+from lerobot.datasets.utils import hw_to_dataset_features
 from lerobot.policies.act.modeling_act import ACTPolicy
 from lerobot.policies.factory import make_pre_post_processors
 from lerobot.processor import make_default_processors
@@ -78,24 +78,40 @@ def main():
    listener, events = init_keyboard_listener()
    init_rerun(session_name="lekiwi_evaluate")

-    try:
-        if not robot.is_connected:
-            raise ValueError("Robot is not connected!")
+    if not robot.is_connected:
+        raise ValueError("Robot is not connected!")

-        print("Starting evaluate loop...")
-        recorded_episodes = 0
-        while recorded_episodes < NUM_EPISODES and not events["stop_recording"]:
-            log_say(f"Running inference, recording eval episode {recorded_episodes} of {NUM_EPISODES}")
+    print("Starting evaluate loop...")
+    recorded_episodes = 0
+    while recorded_episodes < NUM_EPISODES and not events["stop_recording"]:
+        log_say(f"Running inference, recording eval episode {recorded_episodes} of {NUM_EPISODES}")

-            # Main record loop
+        # Main record loop
+        record_loop(
+            robot=robot,
+            events=events,
+            fps=FPS,
+            policy=policy,
+            preprocessor=preprocessor,  # Pass the pre and post policy processors
+            postprocessor=postprocessor,
+            dataset=dataset,
+            control_time_s=EPISODE_TIME_SEC,
+            single_task=TASK_DESCRIPTION,
+            display_data=True,
+            teleop_action_processor=teleop_action_processor,
+            robot_action_processor=robot_action_processor,
+            robot_observation_processor=robot_observation_processor,
+        )
+
+        # Reset the environment if not stopping or re-recording
+        if not events["stop_recording"] and (
+            (recorded_episodes < NUM_EPISODES - 1) or events["rerecord_episode"]
+        ):
+            log_say("Reset the environment")
            record_loop(
                robot=robot,
                events=events,
                fps=FPS,
-                policy=policy,
-                preprocessor=preprocessor,  # Pass the pre and post policy processors
-                postprocessor=postprocessor,
-                dataset=dataset,
                control_time_s=EPISODE_TIME_SEC,
                single_task=TASK_DESCRIPTION,
                display_data=True,
@@ -104,42 +120,24 @@ def main():
                robot_observation_processor=robot_observation_processor,
            )

-            # Reset the environment if not stopping or re-recording
-            if not events["stop_recording"] and (
-                (recorded_episodes < NUM_EPISODES - 1) or events["rerecord_episode"]
-            ):
-                log_say("Reset the environment")
-                record_loop(
-                    robot=robot,
-                    events=events,
-                    fps=FPS,
-                    control_time_s=EPISODE_TIME_SEC,
-                    single_task=TASK_DESCRIPTION,
-                    display_data=True,
-                    teleop_action_processor=teleop_action_processor,
-                    robot_action_processor=robot_action_processor,
-                    robot_observation_processor=robot_observation_processor,
-                )
+        if events["rerecord_episode"]:
+            log_say("Re-record episode")
+            events["rerecord_episode"] = False
+            events["exit_early"] = False
+            dataset.clear_episode_buffer()
+            continue

-            if events["rerecord_episode"]:
-                log_say("Re-record episode")
-                events["rerecord_episode"] = False
-                events["exit_early"] = False
-                dataset.clear_episode_buffer()
-                continue
+        # Save episode
+        dataset.save_episode()
+        recorded_episodes += 1

-            # Save episode
-            dataset.save_episode()
-            recorded_episodes += 1
+    # Clean up
+    log_say("Stop recording")
+    robot.disconnect()
+    listener.stop()

-    finally:
-        # Clean up
-        log_say("Stop recording")
-        robot.disconnect()
-        listener.stop()
-
-        dataset.finalize()
-        dataset.push_to_hub()
+    dataset.finalize()
+    dataset.push_to_hub()


 if __name__ == "__main__":
@@ -14,14 +14,14 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-from lerobot.datasets.feature_utils import hw_to_dataset_features
 from lerobot.datasets.lerobot_dataset import LeRobotDataset
+from lerobot.datasets.utils import hw_to_dataset_features
 from lerobot.processor import make_default_processors
 from lerobot.robots.lekiwi.config_lekiwi import LeKiwiClientConfig
 from lerobot.robots.lekiwi.lekiwi_client import LeKiwiClient
 from lerobot.scripts.lerobot_record import record_loop
 from lerobot.teleoperators.keyboard import KeyboardTeleop, KeyboardTeleopConfig
-from lerobot.teleoperators.so_leader import SO100Leader, SO100LeaderConfig
+from lerobot.teleoperators.so100_leader import SO100Leader, SO100LeaderConfig
 from lerobot.utils.constants import ACTION, OBS_STR
 from lerobot.utils.control_utils import init_keyboard_listener
 from lerobot.utils.utils import log_say
@@ -74,23 +74,40 @@ def main():
    listener, events = init_keyboard_listener()
    init_rerun(session_name="lekiwi_record")

-    try:
-        if not robot.is_connected or not leader_arm.is_connected or not keyboard.is_connected:
-            raise ValueError("Robot or teleop is not connected!")
+    if not robot.is_connected or not leader_arm.is_connected or not keyboard.is_connected:
+        raise ValueError("Robot or teleop is not connected!")

-        print("Starting record loop...")
-        recorded_episodes = 0
-        while recorded_episodes < NUM_EPISODES and not events["stop_recording"]:
-            log_say(f"Recording episode {recorded_episodes}")
+    print("Starting record loop...")
+    recorded_episodes = 0
+    while recorded_episodes < NUM_EPISODES and not events["stop_recording"]:
+        log_say(f"Recording episode {recorded_episodes}")

-            # Main record loop
+        # Main record loop
+        record_loop(
+            robot=robot,
+            events=events,
+            fps=FPS,
+            dataset=dataset,
+            teleop=[leader_arm, keyboard],
+            control_time_s=EPISODE_TIME_SEC,
+            single_task=TASK_DESCRIPTION,
+            display_data=True,
+            teleop_action_processor=teleop_action_processor,
+            robot_action_processor=robot_action_processor,
+            robot_observation_processor=robot_observation_processor,
+        )
+
+        # Reset the environment if not stopping or re-recording
+        if not events["stop_recording"] and (
+            (recorded_episodes < NUM_EPISODES - 1) or events["rerecord_episode"]
+        ):
+            log_say("Reset the environment")
            record_loop(
                robot=robot,
                events=events,
                fps=FPS,
-                dataset=dataset,
                teleop=[leader_arm, keyboard],
-                control_time_s=EPISODE_TIME_SEC,
+                control_time_s=RESET_TIME_SEC,
                single_task=TASK_DESCRIPTION,
                display_data=True,
                teleop_action_processor=teleop_action_processor,
@@ -98,44 +115,26 @@ def main():
                robot_observation_processor=robot_observation_processor,
            )

-            # Reset the environment if not stopping or re-recording
-            if not events["stop_recording"] and (
-                (recorded_episodes < NUM_EPISODES - 1) or events["rerecord_episode"]
-            ):
-                log_say("Reset the environment")
-                record_loop(
-                    robot=robot,
-                    events=events,
-                    fps=FPS,
-                    teleop=[leader_arm, keyboard],
-                    control_time_s=RESET_TIME_SEC,
-                    single_task=TASK_DESCRIPTION,
-                    display_data=True,
-                    teleop_action_processor=teleop_action_processor,
-                    robot_action_processor=robot_action_processor,
-                    robot_observation_processor=robot_observation_processor,
-                )
+        if events["rerecord_episode"]:
+            log_say("Re-record episode")
+            events["rerecord_episode"] = False
+            events["exit_early"] = False
+            dataset.clear_episode_buffer()
+            continue

-            if events["rerecord_episode"]:
-                log_say("Re-record episode")
-                events["rerecord_episode"] = False
-                events["exit_early"] = False
-                dataset.clear_episode_buffer()
-                continue
+        # Save episode
+        dataset.save_episode()
+        recorded_episodes += 1

-            # Save episode
-            dataset.save_episode()
-            recorded_episodes += 1
-    finally:
-        # Clean up
-        log_say("Stop recording")
-        robot.disconnect()
-        leader_arm.disconnect()
-        keyboard.disconnect()
-        listener.stop()
+    # Clean up
+    log_say("Stop recording")
+    robot.disconnect()
+    leader_arm.disconnect()
+    keyboard.disconnect()
+    listener.stop()

-        dataset.finalize()
-        dataset.push_to_hub()
+    dataset.finalize()
+    dataset.push_to_hub()


 if __name__ == "__main__":
@@ -42,27 +42,25 @@ def main():
    # Connect to the robot
    robot.connect()

-    try:
-        if not robot.is_connected:
-            raise ValueError("Robot is not connected!")
+    if not robot.is_connected:
+        raise ValueError("Robot is not connected!")

-        print("Starting replay loop...")
-        log_say(f"Replaying episode {EPISODE_IDX}")
-        for idx in range(len(episode_frames)):
-            t0 = time.perf_counter()
+    print("Starting replay loop...")
+    log_say(f"Replaying episode {EPISODE_IDX}")
+    for idx in range(len(episode_frames)):
+        t0 = time.perf_counter()

-            # Get recorded action from dataset
-            action = {
-                name: float(actions[idx][ACTION][i])
-                for i, name in enumerate(dataset.features[ACTION]["names"])
-            }
+        # Get recorded action from dataset
+        action = {
+            name: float(actions[idx][ACTION][i]) for i, name in enumerate(dataset.features[ACTION]["names"])
+        }

-            # Send action to robot
-            _ = robot.send_action(action)
+        # Send action to robot
+        _ = robot.send_action(action)

-            precise_sleep(max(1.0 / dataset.fps - (time.perf_counter() - t0), 0.0))
-    finally:
-        robot.disconnect()
+        precise_sleep(max(1.0 / dataset.fps - (time.perf_counter() - t0), 0.0))
+
+    robot.disconnect()


 if __name__ == "__main__":
@@ -18,7 +18,7 @@ import time

 from lerobot.robots.lekiwi import LeKiwiClient, LeKiwiClientConfig
 from lerobot.teleoperators.keyboard.teleop_keyboard import KeyboardTeleop, KeyboardTeleopConfig
-from lerobot.teleoperators.so_leader import SO100Leader, SO100LeaderConfig
+from lerobot.teleoperators.so100_leader import SO100Leader, SO100LeaderConfig
 from lerobot.utils.robot_utils import precise_sleep
 from lerobot.utils.visualization_utils import init_rerun, log_rerun_data

@@ -16,13 +16,15 @@

 from lerobot.cameras.opencv.configuration_opencv import OpenCVCameraConfig
 from lerobot.configs.types import FeatureType, PolicyFeature
-from lerobot.datasets.feature_utils import combine_feature_dicts
 from lerobot.datasets.lerobot_dataset import LeRobotDataset
 from lerobot.datasets.pipeline_features import aggregate_pipeline_dataset_features, create_initial_features
+from lerobot.datasets.utils import combine_feature_dicts
 from lerobot.model.kinematics import RobotKinematics
 from lerobot.policies.act.modeling_act import ACTPolicy
 from lerobot.policies.factory import make_pre_post_processors
 from lerobot.processor import (
+    RobotAction,
+    RobotObservation,
    RobotProcessorPipeline,
    make_default_teleop_action_processor,
 )
@@ -32,13 +34,13 @@ from lerobot.processor.converters import (
    transition_to_observation,
    transition_to_robot_action,
 )
-from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
-from lerobot.robots.so_follower.robot_kinematic_processor import (
+from lerobot.robots.so100_follower.config_so100_follower import SO100FollowerConfig
+from lerobot.robots.so100_follower.robot_kinematic_processor import (
    ForwardKinematicsJointsToEE,
    InverseKinematicsEEToJoints,
 )
+from lerobot.robots.so100_follower.so100_follower import SO100Follower
 from lerobot.scripts.lerobot_record import record_loop
-from lerobot.types import RobotAction, RobotObservation
 from lerobot.utils.control_utils import init_keyboard_listener
 from lerobot.utils.utils import log_say
 from lerobot.utils.visualization_utils import init_rerun
@@ -141,24 +143,38 @@ def main():
    listener, events = init_keyboard_listener()
    init_rerun(session_name="phone_so100_evaluate")

-    try:
-        if not robot.is_connected:
-            raise ValueError("Robot is not connected!")
+    if not robot.is_connected:
+        raise ValueError("Robot is not connected!")

-        print("Starting evaluate loop...")
-        episode_idx = 0
-        for episode_idx in range(NUM_EPISODES):
-            log_say(f"Running inference, recording eval episode {episode_idx + 1} of {NUM_EPISODES}")
+    print("Starting evaluate loop...")
+    episode_idx = 0
+    for episode_idx in range(NUM_EPISODES):
+        log_say(f"Running inference, recording eval episode {episode_idx + 1} of {NUM_EPISODES}")

-            # Main record loop
+        # Main record loop
+        record_loop(
+            robot=robot,
+            events=events,
+            fps=FPS,
+            policy=policy,
+            preprocessor=preprocessor,  # Pass the pre and post policy processors
+            postprocessor=postprocessor,
+            dataset=dataset,
+            control_time_s=EPISODE_TIME_SEC,
+            single_task=TASK_DESCRIPTION,
+            display_data=True,
+            teleop_action_processor=make_default_teleop_action_processor(),
+            robot_action_processor=robot_ee_to_joints_processor,
+            robot_observation_processor=robot_joints_to_ee_pose_processor,
+        )
+
+        # Reset the environment if not stopping or re-recording
+        if not events["stop_recording"] and ((episode_idx < NUM_EPISODES - 1) or events["rerecord_episode"]):
+            log_say("Reset the environment")
            record_loop(
                robot=robot,
                events=events,
                fps=FPS,
-                policy=policy,
-                preprocessor=preprocessor,  # Pass the pre and post policy processors
-                postprocessor=postprocessor,
-                dataset=dataset,
                control_time_s=EPISODE_TIME_SEC,
                single_task=TASK_DESCRIPTION,
                display_data=True,
@@ -167,41 +183,24 @@ def main():
                robot_observation_processor=robot_joints_to_ee_pose_processor,
            )

-            # Reset the environment if not stopping or re-recording
-            if not events["stop_recording"] and (
-                (episode_idx < NUM_EPISODES - 1) or events["rerecord_episode"]
-            ):
-                log_say("Reset the environment")
-                record_loop(
-                    robot=robot,
-                    events=events,
-                    fps=FPS,
-                    control_time_s=EPISODE_TIME_SEC,
-                    single_task=TASK_DESCRIPTION,
-                    display_data=True,
-                    teleop_action_processor=make_default_teleop_action_processor(),
-                    robot_action_processor=robot_ee_to_joints_processor,
-                    robot_observation_processor=robot_joints_to_ee_pose_processor,
-                )
+        if events["rerecord_episode"]:
+            log_say("Re-record episode")
+            events["rerecord_episode"] = False
+            events["exit_early"] = False
+            dataset.clear_episode_buffer()
+            continue

-            if events["rerecord_episode"]:
-                log_say("Re-record episode")
-                events["rerecord_episode"] = False
-                events["exit_early"] = False
-                dataset.clear_episode_buffer()
-                continue
+        # Save episode
+        dataset.save_episode()
+        episode_idx += 1

-            # Save episode
-            dataset.save_episode()
-            episode_idx += 1
-    finally:
-        # Clean up
-        log_say("Stop recording")
-        robot.disconnect()
-        listener.stop()
+    # Clean up
+    log_say("Stop recording")
+    robot.disconnect()
+    listener.stop()

-        dataset.finalize()
-        dataset.push_to_hub()
+    dataset.finalize()
+    dataset.push_to_hub()


 if __name__ == "__main__":
@@ -15,30 +15,30 @@
 # limitations under the License.

 from lerobot.cameras.opencv.configuration_opencv import OpenCVCameraConfig
-from lerobot.datasets.feature_utils import combine_feature_dicts
 from lerobot.datasets.lerobot_dataset import LeRobotDataset
 from lerobot.datasets.pipeline_features import aggregate_pipeline_dataset_features, create_initial_features
+from lerobot.datasets.utils import combine_feature_dicts
 from lerobot.model.kinematics import RobotKinematics
-from lerobot.processor import RobotProcessorPipeline
+from lerobot.processor import RobotAction, RobotObservation, RobotProcessorPipeline
 from lerobot.processor.converters import (
    observation_to_transition,
    robot_action_observation_to_transition,
    transition_to_observation,
    transition_to_robot_action,
 )
-from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
-from lerobot.robots.so_follower.robot_kinematic_processor import (
+from lerobot.robots.so100_follower.config_so100_follower import SO100FollowerConfig
+from lerobot.robots.so100_follower.robot_kinematic_processor import (
    EEBoundsAndSafety,
    EEReferenceAndDelta,
    ForwardKinematicsJointsToEE,
    GripperVelocityToJoint,
    InverseKinematicsEEToJoints,
 )
+from lerobot.robots.so100_follower.so100_follower import SO100Follower
 from lerobot.scripts.lerobot_record import record_loop
 from lerobot.teleoperators.phone.config_phone import PhoneConfig, PhoneOS
 from lerobot.teleoperators.phone.phone_processor import MapPhoneActionToRobotAction
 from lerobot.teleoperators.phone.teleop_phone import Phone
-from lerobot.types import RobotAction, RobotObservation
 from lerobot.utils.control_utils import init_keyboard_listener
 from lerobot.utils.utils import log_say
 from lerobot.utils.visualization_utils import init_rerun
@@ -150,23 +150,38 @@ def main():
    listener, events = init_keyboard_listener()
    init_rerun(session_name="phone_so100_record")

-    try:
-        if not robot.is_connected or not phone.is_connected:
-            raise ValueError("Robot or teleop is not connected!")
+    if not robot.is_connected or not phone.is_connected:
+        raise ValueError("Robot or teleop is not connected!")

-        print("Starting record loop. Move your phone to teleoperate the robot...")
-        episode_idx = 0
-        while episode_idx < NUM_EPISODES and not events["stop_recording"]:
-            log_say(f"Recording episode {episode_idx + 1} of {NUM_EPISODES}")
+    print("Starting record loop. Move your phone to teleoperate the robot...")
+    episode_idx = 0
+    while episode_idx < NUM_EPISODES and not events["stop_recording"]:
+        log_say(f"Recording episode {episode_idx + 1} of {NUM_EPISODES}")

-            # Main record loop
+        # Main record loop
+        record_loop(
+            robot=robot,
+            events=events,
+            fps=FPS,
+            teleop=phone,
+            dataset=dataset,
+            control_time_s=EPISODE_TIME_SEC,
+            single_task=TASK_DESCRIPTION,
+            display_data=True,
+            teleop_action_processor=phone_to_robot_ee_pose_processor,
+            robot_action_processor=robot_ee_to_joints_processor,
+            robot_observation_processor=robot_joints_to_ee_pose,
+        )
+
+        # Reset the environment if not stopping or re-recording
+        if not events["stop_recording"] and (episode_idx < NUM_EPISODES - 1 or events["rerecord_episode"]):
+            log_say("Reset the environment")
            record_loop(
                robot=robot,
                events=events,
                fps=FPS,
                teleop=phone,
-                dataset=dataset,
-                control_time_s=EPISODE_TIME_SEC,
+                control_time_s=RESET_TIME_SEC,
                single_task=TASK_DESCRIPTION,
                display_data=True,
                teleop_action_processor=phone_to_robot_ee_pose_processor,
@@ -174,43 +189,25 @@ def main():
                robot_observation_processor=robot_joints_to_ee_pose,
            )

-            # Reset the environment if not stopping or re-recording
-            if not events["stop_recording"] and (
-                episode_idx < NUM_EPISODES - 1 or events["rerecord_episode"]
-            ):
-                log_say("Reset the environment")
-                record_loop(
-                    robot=robot,
-                    events=events,
-                    fps=FPS,
-                    teleop=phone,
-                    control_time_s=RESET_TIME_SEC,
-                    single_task=TASK_DESCRIPTION,
-                    display_data=True,
-                    teleop_action_processor=phone_to_robot_ee_pose_processor,
-                    robot_action_processor=robot_ee_to_joints_processor,
-                    robot_observation_processor=robot_joints_to_ee_pose,
-                )
+        if events["rerecord_episode"]:
+            log_say("Re-recording episode")
+            events["rerecord_episode"] = False
+            events["exit_early"] = False
+            dataset.clear_episode_buffer()
+            continue

-            if events["rerecord_episode"]:
-                log_say("Re-recording episode")
-                events["rerecord_episode"] = False
-                events["exit_early"] = False
-                dataset.clear_episode_buffer()
-                continue
+        # Save episode
+        dataset.save_episode()
+        episode_idx += 1

-            # Save episode
-            dataset.save_episode()
-            episode_idx += 1
-    finally:
-        # Clean up
-        log_say("Stop recording")
-        robot.disconnect()
-        phone.disconnect()
-        listener.stop()
+    # Clean up
+    log_say("Stop recording")
+    robot.disconnect()
+    phone.disconnect()
+    listener.stop()

-        dataset.finalize()
-        dataset.push_to_hub()
+    dataset.finalize()
+    dataset.push_to_hub()


 if __name__ == "__main__":
@@ -18,16 +18,16 @@ import time

 from lerobot.datasets.lerobot_dataset import LeRobotDataset
 from lerobot.model.kinematics import RobotKinematics
-from lerobot.processor import RobotProcessorPipeline
+from lerobot.processor import RobotAction, RobotObservation, RobotProcessorPipeline
 from lerobot.processor.converters import (
    robot_action_observation_to_transition,
    transition_to_robot_action,
 )
-from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
-from lerobot.robots.so_follower.robot_kinematic_processor import (
+from lerobot.robots.so100_follower.config_so100_follower import SO100FollowerConfig
+from lerobot.robots.so100_follower.robot_kinematic_processor import (
    InverseKinematicsEEToJoints,
 )
-from lerobot.types import RobotAction, RobotObservation
+from lerobot.robots.so100_follower.so100_follower import SO100Follower
 from lerobot.utils.constants import ACTION
 from lerobot.utils.robot_utils import precise_sleep
 from lerobot.utils.utils import log_say
@@ -74,34 +74,32 @@ def main():
    # Connect to the robot
    robot.connect()

-    try:
-        if not robot.is_connected:
-            raise ValueError("Robot is not connected!")
+    if not robot.is_connected:
+        raise ValueError("Robot is not connected!")

-        print("Starting replay loop...")
-        log_say(f"Replaying episode {EPISODE_IDX}")
-        for idx in range(len(episode_frames)):
-            t0 = time.perf_counter()
+    print("Starting replay loop...")
+    log_say(f"Replaying episode {EPISODE_IDX}")
+    for idx in range(len(episode_frames)):
+        t0 = time.perf_counter()

-            # Get recorded action from dataset
-            ee_action = {
-                name: float(actions[idx][ACTION][i])
-                for i, name in enumerate(dataset.features[ACTION]["names"])
-            }
+        # Get recorded action from dataset
+        ee_action = {
+            name: float(actions[idx][ACTION][i]) for i, name in enumerate(dataset.features[ACTION]["names"])
+        }

-            # Get robot observation
-            robot_obs = robot.get_observation()
+        # Get robot observation
+        robot_obs = robot.get_observation()

-            # Dataset EE -> robot joints
-            joint_action = robot_ee_to_joints_processor((ee_action, robot_obs))
+        # Dataset EE -> robot joints
+        joint_action = robot_ee_to_joints_processor((ee_action, robot_obs))

-            # Send action to robot
-            _ = robot.send_action(joint_action)
+        # Send action to robot
+        _ = robot.send_action(joint_action)

-            precise_sleep(max(1.0 / dataset.fps - (time.perf_counter() - t0), 0.0))
-    finally:
-        # Clean up
-        robot.disconnect()
+        precise_sleep(max(1.0 / dataset.fps - (time.perf_counter() - t0), 0.0))
+
+    # Clean up
+    robot.disconnect()


 if __name__ == "__main__":
@@ -16,22 +16,22 @@
 import time

 from lerobot.model.kinematics import RobotKinematics
-from lerobot.processor import RobotProcessorPipeline
+from lerobot.processor import RobotAction, RobotObservation, RobotProcessorPipeline
 from lerobot.processor.converters import (
    robot_action_observation_to_transition,
    transition_to_robot_action,
 )
-from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
-from lerobot.robots.so_follower.robot_kinematic_processor import (
+from lerobot.robots.so100_follower.config_so100_follower import SO100FollowerConfig
+from lerobot.robots.so100_follower.robot_kinematic_processor import (
    EEBoundsAndSafety,
    EEReferenceAndDelta,
    GripperVelocityToJoint,
    InverseKinematicsEEToJoints,
 )
+from lerobot.robots.so100_follower.so100_follower import SO100Follower
 from lerobot.teleoperators.phone.config_phone import PhoneConfig, PhoneOS
 from lerobot.teleoperators.phone.phone_processor import MapPhoneActionToRobotAction
 from lerobot.teleoperators.phone.teleop_phone import Phone
-from lerobot.types import RobotAction, RobotObservation
 from lerobot.utils.robot_utils import precise_sleep
 from lerobot.utils.visualization_utils import init_rerun, log_rerun_data

@@ -22,8 +22,7 @@ from pathlib import Path
 import numpy as np
 import tensorflow_datasets as tfds

-from lerobot.datasets.dataset_metadata import LeRobotDatasetMetadata
-from lerobot.datasets.lerobot_dataset import LeRobotDataset
+from lerobot.datasets.lerobot_dataset import LeRobotDataset, LeRobotDatasetMetadata
 from lerobot.utils.utils import get_elapsed_time_in_days_hours_minutes_seconds

 DROID_SHARDS = 2048
@@ -26,7 +26,7 @@ from huggingface_hub import HfApi
 from huggingface_hub.constants import REPOCARD_NAME
 from port_droid import DROID_SHARDS

-from lerobot.datasets.dataset_metadata import CODEBASE_VERSION, LeRobotDatasetMetadata
+from lerobot.datasets.lerobot_dataset import CODEBASE_VERSION, LeRobotDatasetMetadata
 from lerobot.datasets.utils import create_lerobot_dataset_card
 from lerobot.utils.utils import init_logging

@@ -155,7 +155,7 @@ class UploadDataset(PipelineStep):
        from datasets.utils.tqdm import disable_progress_bars
        from huggingface_hub import CommitOperationAdd, preupload_lfs_files

-        from lerobot.datasets.dataset_metadata import LeRobotDatasetMetadata
+        from lerobot.datasets.lerobot_dataset import LeRobotDatasetMetadata
        from lerobot.utils.utils import init_logging

        init_logging()
@@ -27,8 +27,8 @@ measuring consistency and ground truth alignment.
 Usage:
    # Basic usage with smolvla policy
    uv run python examples/rtc/eval_dataset.py \
-        --policy.path=<USER>/smolvla_check_rtc_last3 \
-        --dataset.repo_id=<USER>/check_rtc \
+        --policy.path=helper2424/smolvla_check_rtc_last3 \
+        --dataset.repo_id=helper2424/check_rtc \
        --rtc.execution_horizon=8 \
        --device=mps \
        --rtc.max_guidance_weight=10.0 \
@@ -58,16 +58,16 @@ Usage:
        --device=cuda

    uv run python examples/rtc/eval_dataset.py \
-        --policy.path=<USER>/reuben_pi0 \
-        --dataset.repo_id=<USER>/so101_cube_in_cup \
+        --policy.path=lipsop/reuben_pi0 \
+        --dataset.repo_id=ReubenLim/so101_cube_in_cup \
        --rtc.execution_horizon=8 \
        --device=cuda

    # With torch.compile for faster inference (PyTorch 2.0+)
    # Note: CUDA graphs disabled by default due to in-place ops in denoising loop
    uv run python examples/rtc/eval_dataset.py \
-        --policy.path=<USER>/smolvla_check_rtc_last3 \
-        --dataset.repo_id=<USER>/check_rtc \
+        --policy.path=helper2424/smolvla_check_rtc_last3 \
+        --dataset.repo_id=helper2424/check_rtc \
        --rtc.execution_horizon=8 \
        --device=mps \
        --use_torch_compile=true \
@@ -75,8 +75,8 @@ Usage:

    # With torch.compile on CUDA (CUDA graphs disabled by default)
    uv run python examples/rtc/eval_dataset.py \
-        --policy.path=<USER>/smolvla_check_rtc_last3 \
-        --dataset.repo_id=<USER>/check_rtc \
+        --policy.path=helper2424/smolvla_check_rtc_last3 \
+        --dataset.repo_id=helper2424/check_rtc \
        --rtc.execution_horizon=8 \
        --device=cuda \
        --use_torch_compile=true \
@@ -84,8 +84,8 @@ Usage:

    # Enable CUDA graphs (advanced - may cause tensor aliasing errors)
    uv run python examples/rtc/eval_dataset.py \
-        --policy.path=<USER>/smolvla_check_rtc_last3 \
-        --dataset.repo_id=<USER>/check_rtc \
+        --policy.path=helper2424/smolvla_check_rtc_last3 \
+        --dataset.repo_id=helper2424/check_rtc \
        --use_torch_compile=true \
        --torch_compile_backend=inductor \
        --torch_compile_mode=max-autotune \
@@ -113,9 +113,8 @@ from lerobot.configs import parser
 from lerobot.configs.default import DatasetConfig
 from lerobot.configs.policies import PreTrainedConfig
 from lerobot.configs.types import RTCAttentionSchedule
-from lerobot.datasets.dataset_metadata import LeRobotDatasetMetadata
 from lerobot.datasets.factory import resolve_delta_timestamps
-from lerobot.datasets.lerobot_dataset import LeRobotDataset
+from lerobot.datasets.lerobot_dataset import LeRobotDataset, LeRobotDatasetMetadata
 from lerobot.policies.factory import get_policy_class, make_pre_post_processors
 from lerobot.policies.rtc.configuration_rtc import RTCConfig
 from lerobot.policies.rtc.debug_visualizer import RTCDebugVisualizer
@@ -28,7 +28,7 @@ For simulation environments, see eval_with_simulation.py
 Usage:
    # Run RTC with Real robot with RTC
    uv run examples/rtc/eval_with_real_robot.py \
-        --policy.path=<USER>/smolvla_check_rtc_last3 \
+        --policy.path=helper2424/smolvla_check_rtc_last3 \
        --policy.device=mps \
        --rtc.enabled=true \
        --rtc.execution_horizon=20 \
@@ -41,7 +41,7 @@ Usage:

    # Run RTC with Real robot without RTC
    uv run examples/rtc/eval_with_real_robot.py \
-        --policy.path=<USER>/smolvla_check_rtc_last3 \
+        --policy.path=helper2424/smolvla_check_rtc_last3 \
        --policy.device=mps \
        --rtc.enabled=false \
        --robot.type=so100_follower \
@@ -53,7 +53,7 @@ Usage:

    # Run RTC with Real robot with pi0.5 policy
    uv run examples/rtc/eval_with_real_robot.py \
-        --policy.path=<USER>/pi05_check_rtc \
+        --policy.path=helper2424/pi05_check_rtc \
        --policy.device=mps \
        --rtc.enabled=true \
        --rtc.execution_horizon=20 \
@@ -78,11 +78,10 @@ from torch import Tensor

 from lerobot.cameras.opencv.configuration_opencv import OpenCVCameraConfig  # noqa: F401
 from lerobot.cameras.realsense.configuration_realsense import RealSenseCameraConfig  # noqa: F401
-from lerobot.cameras.zmq.configuration_zmq import ZMQCameraConfig  # noqa: F401
 from lerobot.configs import parser
 from lerobot.configs.policies import PreTrainedConfig
 from lerobot.configs.types import RTCAttentionSchedule
-from lerobot.datasets.feature_utils import build_dataset_frame, hw_to_dataset_features
+from lerobot.datasets.utils import build_dataset_frame, hw_to_dataset_features
 from lerobot.policies.factory import get_policy_class, make_pre_post_processors
 from lerobot.policies.rtc.action_queue import ActionQueue
 from lerobot.policies.rtc.configuration_rtc import RTCConfig
@@ -95,10 +94,9 @@ from lerobot.rl.process import ProcessSignalHandler
 from lerobot.robots import (  # noqa: F401
    Robot,
    RobotConfig,
-    bi_so_follower,
    koch_follower,
-    so_follower,
-    unitree_g1,
+    so100_follower,
+    so101_follower,
 )
 from lerobot.robots.utils import make_robot_from_config
 from lerobot.utils.constants import OBS_IMAGES
@@ -457,18 +455,7 @@ def demo_cli(cfg: RTCDemoConfig):
    if cfg.policy.type == "pi05" or cfg.policy.type == "pi0":
        config.compile_model = cfg.use_torch_compile

-    if config.use_peft:
-        from peft import PeftConfig, PeftModel
-
-        peft_pretrained_path = cfg.policy.pretrained_path
-        peft_config = PeftConfig.from_pretrained(peft_pretrained_path)
-
-        policy = policy_class.from_pretrained(
-            pretrained_name_or_path=peft_config.base_model_name_or_path, config=config
-        )
-        policy = PeftModel.from_pretrained(policy, peft_pretrained_path, config=peft_config)
-    else:
-        policy = policy_class.from_pretrained(cfg.policy.pretrained_path, config=config)
+    policy = policy_class.from_pretrained(cfg.policy.pretrained_path, config=config)

    # Turn on RTC
    policy.config.rtc_config = cfg.rtc
@@ -16,13 +16,15 @@

 from lerobot.cameras.opencv.configuration_opencv import OpenCVCameraConfig
 from lerobot.configs.types import FeatureType, PolicyFeature
-from lerobot.datasets.feature_utils import combine_feature_dicts
 from lerobot.datasets.lerobot_dataset import LeRobotDataset
 from lerobot.datasets.pipeline_features import aggregate_pipeline_dataset_features, create_initial_features
+from lerobot.datasets.utils import combine_feature_dicts
 from lerobot.model.kinematics import RobotKinematics
 from lerobot.policies.act.modeling_act import ACTPolicy
 from lerobot.policies.factory import make_pre_post_processors
 from lerobot.processor import (
+    RobotAction,
+    RobotObservation,
    RobotProcessorPipeline,
    make_default_teleop_action_processor,
 )
@@ -32,13 +34,13 @@ from lerobot.processor.converters import (
    transition_to_observation,
    transition_to_robot_action,
 )
-from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
-from lerobot.robots.so_follower.robot_kinematic_processor import (
+from lerobot.robots.so100_follower.config_so100_follower import SO100FollowerConfig
+from lerobot.robots.so100_follower.robot_kinematic_processor import (
    ForwardKinematicsJointsToEE,
    InverseKinematicsEEToJoints,
 )
+from lerobot.robots.so100_follower.so100_follower import SO100Follower
 from lerobot.scripts.lerobot_record import record_loop
-from lerobot.types import RobotAction, RobotObservation
 from lerobot.utils.control_utils import init_keyboard_listener
 from lerobot.utils.utils import log_say
 from lerobot.utils.visualization_utils import init_rerun
@@ -141,24 +143,38 @@ def main():
    listener, events = init_keyboard_listener()
    init_rerun(session_name="so100_so100_evaluate")

-    try:
-        if not robot.is_connected:
-            raise ValueError("Robot is not connected!")
+    if not robot.is_connected:
+        raise ValueError("Robot is not connected!")

-        print("Starting evaluate loop...")
-        episode_idx = 0
-        for episode_idx in range(NUM_EPISODES):
-            log_say(f"Running inference, recording eval episode {episode_idx + 1} of {NUM_EPISODES}")
+    print("Starting evaluate loop...")
+    episode_idx = 0
+    for episode_idx in range(NUM_EPISODES):
+        log_say(f"Running inference, recording eval episode {episode_idx + 1} of {NUM_EPISODES}")

-            # Main record loop
+        # Main record loop
+        record_loop(
+            robot=robot,
+            events=events,
+            fps=FPS,
+            policy=policy,
+            preprocessor=preprocessor,  # Pass the pre and post policy processors
+            postprocessor=postprocessor,
+            dataset=dataset,
+            control_time_s=EPISODE_TIME_SEC,
+            single_task=TASK_DESCRIPTION,
+            display_data=True,
+            teleop_action_processor=make_default_teleop_action_processor(),
+            robot_action_processor=robot_ee_to_joints_processor,
+            robot_observation_processor=robot_joints_to_ee_pose_processor,
+        )
+
+        # Reset the environment if not stopping or re-recording
+        if not events["stop_recording"] and ((episode_idx < NUM_EPISODES - 1) or events["rerecord_episode"]):
+            log_say("Reset the environment")
            record_loop(
                robot=robot,
                events=events,
                fps=FPS,
-                policy=policy,
-                preprocessor=preprocessor,  # Pass the pre and post policy processors
-                postprocessor=postprocessor,
-                dataset=dataset,
                control_time_s=EPISODE_TIME_SEC,
                single_task=TASK_DESCRIPTION,
                display_data=True,
@@ -167,41 +183,24 @@ def main():
                robot_observation_processor=robot_joints_to_ee_pose_processor,
            )

-            # Reset the environment if not stopping or re-recording
-            if not events["stop_recording"] and (
-                (episode_idx < NUM_EPISODES - 1) or events["rerecord_episode"]
-            ):
-                log_say("Reset the environment")
-                record_loop(
-                    robot=robot,
-                    events=events,
-                    fps=FPS,
-                    control_time_s=EPISODE_TIME_SEC,
-                    single_task=TASK_DESCRIPTION,
-                    display_data=True,
-                    teleop_action_processor=make_default_teleop_action_processor(),
-                    robot_action_processor=robot_ee_to_joints_processor,
-                    robot_observation_processor=robot_joints_to_ee_pose_processor,
-                )
+        if events["rerecord_episode"]:
+            log_say("Re-record episode")
+            events["rerecord_episode"] = False
+            events["exit_early"] = False
+            dataset.clear_episode_buffer()
+            continue

-            if events["rerecord_episode"]:
-                log_say("Re-record episode")
-                events["rerecord_episode"] = False
-                events["exit_early"] = False
-                dataset.clear_episode_buffer()
-                continue
+        # Save episode
+        dataset.save_episode()
+        episode_idx += 1

-            # Save episode
-            dataset.save_episode()
-            episode_idx += 1
-    finally:
-        # Clean up
-        log_say("Stop recording")
-        robot.disconnect()
-        listener.stop()
+    # Clean up
+    log_say("Stop recording")
+    robot.disconnect()
+    listener.stop()

-        dataset.finalize()
-        dataset.push_to_hub()
+    dataset.finalize()
+    dataset.push_to_hub()


 if __name__ == "__main__":
@@ -16,26 +16,27 @@


 from lerobot.cameras.opencv.configuration_opencv import OpenCVCameraConfig
-from lerobot.datasets.feature_utils import combine_feature_dicts
 from lerobot.datasets.lerobot_dataset import LeRobotDataset
 from lerobot.datasets.pipeline_features import aggregate_pipeline_dataset_features, create_initial_features
+from lerobot.datasets.utils import combine_feature_dicts
 from lerobot.model.kinematics import RobotKinematics
-from lerobot.processor import RobotProcessorPipeline
+from lerobot.processor import RobotAction, RobotObservation, RobotProcessorPipeline
 from lerobot.processor.converters import (
    observation_to_transition,
    robot_action_observation_to_transition,
    transition_to_observation,
    transition_to_robot_action,
 )
-from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
-from lerobot.robots.so_follower.robot_kinematic_processor import (
+from lerobot.robots.so100_follower.config_so100_follower import SO100FollowerConfig
+from lerobot.robots.so100_follower.robot_kinematic_processor import (
    EEBoundsAndSafety,
    ForwardKinematicsJointsToEE,
    InverseKinematicsEEToJoints,
 )
+from lerobot.robots.so100_follower.so100_follower import SO100Follower
 from lerobot.scripts.lerobot_record import record_loop
-from lerobot.teleoperators.so_leader import SO100Leader, SO100LeaderConfig
-from lerobot.types import RobotAction, RobotObservation
+from lerobot.teleoperators.so100_leader.config_so100_leader import SO100LeaderConfig
+from lerobot.teleoperators.so100_leader.so100_leader import SO100Leader
 from lerobot.utils.control_utils import init_keyboard_listener
 from lerobot.utils.utils import log_say
 from lerobot.utils.visualization_utils import init_rerun
@@ -147,23 +148,38 @@ def main():
    listener, events = init_keyboard_listener()
    init_rerun(session_name="recording_phone")

-    try:
-        if not leader.is_connected or not follower.is_connected:
-            raise ValueError("Robot or teleop is not connected!")
+    if not leader.is_connected or not follower.is_connected:
+        raise ValueError("Robot or teleop is not connected!")

-        print("Starting record loop...")
-        episode_idx = 0
-        while episode_idx < NUM_EPISODES and not events["stop_recording"]:
-            log_say(f"Recording episode {episode_idx + 1} of {NUM_EPISODES}")
+    print("Starting record loop...")
+    episode_idx = 0
+    while episode_idx < NUM_EPISODES and not events["stop_recording"]:
+        log_say(f"Recording episode {episode_idx + 1} of {NUM_EPISODES}")

-            # Main record loop
+        # Main record loop
+        record_loop(
+            robot=follower,
+            events=events,
+            fps=FPS,
+            teleop=leader,
+            dataset=dataset,
+            control_time_s=EPISODE_TIME_SEC,
+            single_task=TASK_DESCRIPTION,
+            display_data=True,
+            teleop_action_processor=leader_joints_to_ee,
+            robot_action_processor=ee_to_follower_joints,
+            robot_observation_processor=follower_joints_to_ee,
+        )
+
+        # Reset the environment if not stopping or re-recording
+        if not events["stop_recording"] and (episode_idx < NUM_EPISODES - 1 or events["rerecord_episode"]):
+            log_say("Reset the environment")
            record_loop(
                robot=follower,
                events=events,
                fps=FPS,
                teleop=leader,
-                dataset=dataset,
-                control_time_s=EPISODE_TIME_SEC,
+                control_time_s=RESET_TIME_SEC,
                single_task=TASK_DESCRIPTION,
                display_data=True,
                teleop_action_processor=leader_joints_to_ee,
@@ -171,44 +187,25 @@ def main():
                robot_observation_processor=follower_joints_to_ee,
            )

-            # Reset the environment if not stopping or re-recording
-            if not events["stop_recording"] and (
-                episode_idx < NUM_EPISODES - 1 or events["rerecord_episode"]
-            ):
-                log_say("Reset the environment")
-                record_loop(
-                    robot=follower,
-                    events=events,
-                    fps=FPS,
-                    teleop=leader,
-                    control_time_s=RESET_TIME_SEC,
-                    single_task=TASK_DESCRIPTION,
-                    display_data=True,
-                    teleop_action_processor=leader_joints_to_ee,
-                    robot_action_processor=ee_to_follower_joints,
-                    robot_observation_processor=follower_joints_to_ee,
-                )
+        if events["rerecord_episode"]:
+            log_say("Re-recording episode")
+            events["rerecord_episode"] = False
+            events["exit_early"] = False
+            dataset.clear_episode_buffer()
+            continue

-            if events["rerecord_episode"]:
-                log_say("Re-recording episode")
-                events["rerecord_episode"] = False
-                events["exit_early"] = False
-                dataset.clear_episode_buffer()
-                continue
+        # Save episode
+        dataset.save_episode()
+        episode_idx += 1

-            # Save episode
-            dataset.save_episode()
-            episode_idx += 1
+    # Clean up
+    log_say("Stop recording")
+    leader.disconnect()
+    follower.disconnect()
+    listener.stop()

-    finally:
-        # Clean up
-        log_say("Stop recording")
-        leader.disconnect()
-        follower.disconnect()
-        listener.stop()
-
-        dataset.finalize()
-        dataset.push_to_hub()
+    dataset.finalize()
+    dataset.push_to_hub()


 if __name__ == "__main__":
@@ -19,16 +19,16 @@ import time

 from lerobot.datasets.lerobot_dataset import LeRobotDataset
 from lerobot.model.kinematics import RobotKinematics
-from lerobot.processor import RobotProcessorPipeline
+from lerobot.processor import RobotAction, RobotObservation, RobotProcessorPipeline
 from lerobot.processor.converters import (
    robot_action_observation_to_transition,
    transition_to_robot_action,
 )
-from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
-from lerobot.robots.so_follower.robot_kinematic_processor import (
+from lerobot.robots.so100_follower.config_so100_follower import SO100FollowerConfig
+from lerobot.robots.so100_follower.robot_kinematic_processor import (
    InverseKinematicsEEToJoints,
 )
-from lerobot.types import RobotAction, RobotObservation
+from lerobot.robots.so100_follower.so100_follower import SO100Follower
 from lerobot.utils.constants import ACTION
 from lerobot.utils.robot_utils import precise_sleep
 from lerobot.utils.utils import log_say
@@ -75,35 +75,32 @@ def main():
    # Connect to the robot
    robot.connect()

-    try:
-        if not robot.is_connected:
-            raise ValueError("Robot is not connected!")
+    if not robot.is_connected:
+        raise ValueError("Robot is not connected!")

-        print("Starting replay loop...")
-        log_say(f"Replaying episode {EPISODE_IDX}")
-        for idx in range(len(episode_frames)):
-            t0 = time.perf_counter()
+    print("Starting replay loop...")
+    log_say(f"Replaying episode {EPISODE_IDX}")
+    for idx in range(len(episode_frames)):
+        t0 = time.perf_counter()

-            # Get recorded action from dataset
-            ee_action = {
-                name: float(actions[idx][ACTION][i])
-                for i, name in enumerate(dataset.features[ACTION]["names"])
-            }
+        # Get recorded action from dataset
+        ee_action = {
+            name: float(actions[idx][ACTION][i]) for i, name in enumerate(dataset.features[ACTION]["names"])
+        }

-            # Get robot observation
-            robot_obs = robot.get_observation()
+        # Get robot observation
+        robot_obs = robot.get_observation()

-            # Dataset EE -> robot joints
-            joint_action = robot_ee_to_joints_processor((ee_action, robot_obs))
+        # Dataset EE -> robot joints
+        joint_action = robot_ee_to_joints_processor((ee_action, robot_obs))

-            # Send action to robot
-            _ = robot.send_action(joint_action)
+        # Send action to robot
+        _ = robot.send_action(joint_action)

-            precise_sleep(max(1.0 / dataset.fps - (time.perf_counter() - t0), 0.0))
+        precise_sleep(1.0 / dataset.fps - (time.perf_counter() - t0))

-    finally:
-        # Clean up
-        robot.disconnect()
+    # Clean up
+    robot.disconnect()


 if __name__ == "__main__":
@@ -17,20 +17,21 @@
 import time

 from lerobot.model.kinematics import RobotKinematics
-from lerobot.processor import RobotProcessorPipeline
+from lerobot.processor import RobotAction, RobotObservation, RobotProcessorPipeline
 from lerobot.processor.converters import (
    robot_action_observation_to_transition,
    robot_action_to_transition,
    transition_to_robot_action,
 )
-from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
-from lerobot.robots.so_follower.robot_kinematic_processor import (
+from lerobot.robots.so100_follower.config_so100_follower import SO100FollowerConfig
+from lerobot.robots.so100_follower.robot_kinematic_processor import (
    EEBoundsAndSafety,
    ForwardKinematicsJointsToEE,
    InverseKinematicsEEToJoints,
 )
-from lerobot.teleoperators.so_leader import SO100Leader, SO100LeaderConfig
-from lerobot.types import RobotAction, RobotObservation
+from lerobot.robots.so100_follower.so100_follower import SO100Follower
+from lerobot.teleoperators.so100_leader.config_so100_leader import SO100LeaderConfig
+from lerobot.teleoperators.so100_leader.so100_leader import SO100Leader
 from lerobot.utils.robot_utils import precise_sleep
 from lerobot.utils.visualization_utils import init_rerun, log_rerun_data

@@ -19,9 +19,8 @@ from pathlib import Path
 import torch

 from lerobot.configs.types import FeatureType
-from lerobot.datasets.dataset_metadata import LeRobotDatasetMetadata
-from lerobot.datasets.feature_utils import dataset_to_policy_features
-from lerobot.datasets.lerobot_dataset import LeRobotDataset
+from lerobot.datasets.lerobot_dataset import LeRobotDataset, LeRobotDatasetMetadata
+from lerobot.datasets.utils import dataset_to_policy_features
 from lerobot.policies.diffusion.configuration_diffusion import DiffusionConfig
 from lerobot.policies.diffusion.modeling_diffusion import DiffusionPolicy
 from lerobot.policies.factory import make_pre_post_processors
@@ -20,9 +20,9 @@ from pathlib import Path
 import torch

 from lerobot.configs.types import FeatureType
-from lerobot.datasets.dataset_metadata import LeRobotDatasetMetadata
-from lerobot.datasets.feature_utils import dataset_to_policy_features
+from lerobot.datasets.lerobot_dataset import LeRobotDatasetMetadata
 from lerobot.datasets.streaming_dataset import StreamingLeRobotDataset
+from lerobot.datasets.utils import dataset_to_policy_features
 from lerobot.policies.act.configuration_act import ACTConfig
 from lerobot.policies.act.modeling_act import ACTPolicy
 from lerobot.policies.factory import make_pre_post_processors
@@ -5,9 +5,8 @@ from pathlib import Path
 import torch

 from lerobot.configs.types import FeatureType
-from lerobot.datasets.dataset_metadata import LeRobotDatasetMetadata
-from lerobot.datasets.feature_utils import dataset_to_policy_features
-from lerobot.datasets.lerobot_dataset import LeRobotDataset
+from lerobot.datasets.lerobot_dataset import LeRobotDataset, LeRobotDatasetMetadata
+from lerobot.datasets.utils import dataset_to_policy_features
 from lerobot.policies.act.configuration_act import ACTConfig
 from lerobot.policies.act.modeling_act import ACTPolicy
 from lerobot.policies.factory import make_pre_post_processors
@@ -1,11 +1,12 @@
 import torch

 from lerobot.cameras.opencv.configuration_opencv import OpenCVCameraConfig
-from lerobot.datasets.dataset_metadata import LeRobotDatasetMetadata
+from lerobot.datasets.lerobot_dataset import LeRobotDatasetMetadata
 from lerobot.policies.act.modeling_act import ACTPolicy
 from lerobot.policies.factory import make_pre_post_processors
 from lerobot.policies.utils import build_inference_frame, make_robot_action
-from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
+from lerobot.robots.so100_follower.config_so100_follower import SO100FollowerConfig
+from lerobot.robots.so100_follower.so100_follower import SO100Follower

 MAX_EPISODES = 5
 MAX_STEPS_PER_EPISODE = 20
@@ -4,7 +4,7 @@ from lerobot.async_inference.configs import RobotClientConfig
 from lerobot.async_inference.helpers import visualize_action_queue_size
 from lerobot.async_inference.robot_client import RobotClient
 from lerobot.cameras.opencv.configuration_opencv import OpenCVCameraConfig
-from lerobot.robots.so_follower import SO100FollowerConfig
+from lerobot.robots.so100_follower import SO100FollowerConfig


 def main():
@@ -30,7 +30,6 @@ def main():
        robot=robot_cfg,
        server_address=server_address,
        policy_device="mps",
-        client_device="cpu",
        policy_type="act",
        pretrained_name_or_path="<user>/robot_learning_tutorial_act",
        chunk_size_threshold=0.5,  # g
@@ -5,9 +5,8 @@ from pathlib import Path
 import torch

 from lerobot.configs.types import FeatureType
-from lerobot.datasets.dataset_metadata import LeRobotDatasetMetadata
-from lerobot.datasets.feature_utils import dataset_to_policy_features
-from lerobot.datasets.lerobot_dataset import LeRobotDataset
+from lerobot.datasets.lerobot_dataset import LeRobotDataset, LeRobotDatasetMetadata
+from lerobot.datasets.utils import dataset_to_policy_features
 from lerobot.policies.diffusion.configuration_diffusion import DiffusionConfig
 from lerobot.policies.diffusion.modeling_diffusion import DiffusionPolicy
 from lerobot.policies.factory import make_pre_post_processors
@@ -1,11 +1,12 @@
 import torch

 from lerobot.cameras.opencv.configuration_opencv import OpenCVCameraConfig
-from lerobot.datasets.dataset_metadata import LeRobotDatasetMetadata
+from lerobot.datasets.lerobot_dataset import LeRobotDatasetMetadata
 from lerobot.policies.diffusion.modeling_diffusion import DiffusionPolicy
 from lerobot.policies.factory import make_pre_post_processors
 from lerobot.policies.utils import build_inference_frame, make_robot_action
-from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
+from lerobot.robots.so100_follower.config_so100_follower import SO100FollowerConfig
+from lerobot.robots.so100_follower.so100_follower import SO100Follower

 MAX_EPISODES = 5
 MAX_STEPS_PER_EPISODE = 20
@@ -1,11 +1,12 @@
 import torch

 from lerobot.cameras.opencv.configuration_opencv import OpenCVCameraConfig
-from lerobot.datasets.feature_utils import hw_to_dataset_features
+from lerobot.datasets.utils import hw_to_dataset_features
 from lerobot.policies.factory import make_pre_post_processors
 from lerobot.policies.pi0.modeling_pi0 import PI0Policy
 from lerobot.policies.utils import build_inference_frame, make_robot_action
-from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
+from lerobot.robots.so100_follower.config_so100_follower import SO100FollowerConfig
+from lerobot.robots.so100_follower.so100_follower import SO100Follower

 MAX_EPISODES = 5
 MAX_STEPS_PER_EPISODE = 20
@@ -6,16 +6,16 @@ from queue import Empty, Full
 import torch
 import torch.optim as optim

-from lerobot.datasets.feature_utils import hw_to_dataset_features
 from lerobot.datasets.lerobot_dataset import LeRobotDataset
+from lerobot.datasets.utils import hw_to_dataset_features
 from lerobot.envs.configs import HILSerlProcessorConfig, HILSerlRobotEnvConfig
 from lerobot.policies.sac.configuration_sac import SACConfig
 from lerobot.policies.sac.modeling_sac import SACPolicy
 from lerobot.policies.sac.reward_model.modeling_classifier import Classifier
 from lerobot.rl.buffer import ReplayBuffer
 from lerobot.rl.gym_manipulator import make_robot_env
-from lerobot.robots.so_follower import SO100FollowerConfig
-from lerobot.teleoperators.so_leader import SO100LeaderConfig
+from lerobot.robots.so100_follower import SO100FollowerConfig
+from lerobot.teleoperators.so100_leader import SO100LeaderConfig
 from lerobot.teleoperators.utils import TeleopEvents

 LOG_EVERY = 10
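
Review note on the replay hunk (`@@ -75,35 +75,32 @@`): the `+` side drops both the `try`/`finally` wrapper around the loop and the `max(..., 0.0)` clamp on the sleep duration. The sketch below illustrates, under stated assumptions, what those two removed pieces guard against. `FakeRobot`, `remaining_frame_time`, and `replay` are hypothetical names invented for this illustration and are not part of lerobot; `time.sleep` stands in for `precise_sleep`, whose handling of negative durations is not shown in this diff.

```python
import time


class FakeRobot:
    """Hypothetical stand-in for a robot; not part of lerobot."""

    def __init__(self):
        self.is_connected = False
        self.sent = []

    def connect(self):
        self.is_connected = True

    def send_action(self, action):
        self.sent.append(action)
        return action

    def disconnect(self):
        self.is_connected = False


def remaining_frame_time(fps, elapsed_s):
    """Time left in the current frame, clamped at zero when the step overran."""
    return max(1.0 / fps - elapsed_s, 0.0)


def replay(robot, actions, fps, fail_at=None):
    """Send actions at roughly `fps`, disconnecting even if a step raises."""
    robot.connect()
    try:
        for idx, action in enumerate(actions):
            t0 = time.perf_counter()
            if idx == fail_at:
                # Simulated mid-loop failure (assumption, for illustration only).
                raise RuntimeError("simulated failure mid-replay")
            robot.send_action(action)
            # Without the clamp, an overrunning step would yield a negative duration.
            time.sleep(remaining_frame_time(fps, time.perf_counter() - t0))
    finally:
        # Runs on normal completion and on exceptions alike.
        robot.disconnect()
```

With this structure, a failure partway through the loop still leaves the robot disconnected, and a step that takes longer than one frame period sleeps for zero seconds instead of passing a negative value to the sleep call.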
Author	SHA1	Message	Date
Jade Choghari	03cce79c88	Merge branch 'main' into feat/behavior-1k	2025-12-04 18:50:56 +01:00
Michel Aractingi	3918ab7882	Merge branch 'main' into feat/behavior-1k	2025-11-03 13:28:31 +01:00
Michel Aractingi	65b0e73ae4	* refactor behaviour1k_lerobot_dataset.py * add example scripts to load behaviour 1k data in `load_behaviour1k_dataset.py`	2025-11-03 12:23:12 +00:00
Jade Choghari	ca7c5fcdfe	remove tester	2025-10-30 18:14:09 +01:00
Jade Choghari	28f8098df4	fix style	2025-10-30 18:12:50 +01:00
Jade Choghari	db7d501281	remove comments	2025-10-30 18:12:06 +01:00
Jade Choghari	88380fe34e	update changes	2025-10-30 18:11:27 +01:00
Jade Choghari	154abfd233	update Signed-off-by: Jade Choghari <chogharijade@gmail.com>	2025-10-27 17:52:21 +01:00
Jade Choghari	dc14266762	add Signed-off-by: Jade Choghari <chogharijade@gmail.com>	2025-10-27 16:44:58 +01:00
Michel Aractingi	fd623e0cc5	Modify convert_to_lerobot_v3 script for behaviours dataset to take a single task id and create a dataset outof it	2025-10-24 17:06:21 +02:00
Michel Aractingi	a52e88d349	add scripts for convert behavior-1k to datasetv3	2025-10-24 14:17:30 +02:00