Compare commits

..

33 Commits

Author SHA1 Message Date
Michel Aractingi 08d2ed8015 lerobot dataset fix 2025-12-17 16:46:43 +01:00
Michel Aractingi 4bcd14b8de add evaluate_with_rtc script 2025-12-17 16:46:43 +01:00
Michel Aractingi c34935090d integrate delete button openarm UI (#2535)
* add visualize_dataset call from `lerobot_dataset_viz` in web record server

* add delete button

* fixes

* remove viz

* unused import
2025-12-17 16:46:43 +01:00
CarolinePascal 9cfd56587e fix(num processes) 2025-12-17 16:46:43 +01:00
Caroline Pascal ff8584a025 fix(os version)
Signed-off-by: Caroline Pascal <caroline8.pascal@gmail.com>
2025-12-17 16:46:43 +01:00
Caroline Pascal 6bc1e5186a fix(import os)
Signed-off-by: Caroline Pascal <caroline8.pascal@gmail.com>
2025-12-17 16:46:43 +01:00
Caroline Pascal 69dc8165ae fix(max workers)
Signed-off-by: Caroline Pascal <caroline8.pascal@gmail.com>
2025-12-17 16:46:42 +01:00
CarolinePascal 021bca2ad9 feat(multi-processes): adding support for multiprocess encoding 2025-12-17 16:46:42 +01:00
CarolinePascal 4e0ee0d643 feat(preset): adding encoding preset 2025-12-17 16:46:42 +01:00
croissant 0a8aa85871 ruse video datasets 2025-12-17 16:46:42 +01:00
croissant 76ddd8b948 use image datasets and change ui 2025-12-17 16:46:42 +01:00
croissant bf08733068 frontend set correct port openarms mini 2025-12-17 16:46:42 +01:00
croissant e38f56c071 add default mini arms 2025-12-17 16:46:41 +01:00
croissant 19fe69dac0 add improv openarm mini 2025-12-17 16:46:41 +01:00
pepijn kooijmans 14319ee608 add openarms mini 2025-12-17 16:46:41 +01:00
croissant 9b04fd25b6 cam res 2025-12-17 16:46:41 +01:00
Pepijn 40e98ba690 fix calibration of gripper and add max clip positions for openarm for safety 2025-12-17 16:46:41 +01:00
pepijn kooijmans 894d65d58a add openarms to setup motors 2025-12-17 16:46:41 +01:00
Pepijn f58d508df2 cleanuo 2025-12-17 16:46:40 +01:00
Pepijn e22b909e7c Add mini openarms to test 2025-12-17 16:46:40 +01:00
croissant 09f1673cbf add longer timeout 2025-12-17 16:46:40 +01:00
croissant 4744f99990 add timing debugging, foot pedal and eval script 2025-12-17 16:46:40 +01:00
croissant 01c1735739 add disable torque 2025-12-17 16:46:40 +01:00
croissant 6808a42455 add pid ramp 2025-12-17 16:46:40 +01:00
croissant fff719cb4f add web interface example 2025-12-17 16:46:39 +01:00
croissant e2c00f6ed8 speedup 2025-12-17 16:46:39 +01:00
croissant 0f90db23c5 add full bimanual gravity comp 2025-12-17 16:46:39 +01:00
Michel Aractingi 96b192f2ae Add gravity compensation to the openarms teleoperation (#2352)
* adding first attempt at gcompensation to open arms

* add teleop with gravity compensation script
2025-12-17 16:46:39 +01:00
Pepijn ecdc34a699 faster canbus 2025-12-17 16:46:39 +01:00
croissant fa6a2fb9b7 pos teleop 2025-12-17 16:46:39 +01:00
Pepijn b011643dc9 add tests and debug 2025-12-17 16:46:38 +01:00
Pepijn 30c10c1c6e Add damiao motors and open arm robot 2025-12-17 16:46:38 +01:00
Pepijn 56e2360072 add damiao 2025-12-17 16:46:38 +01:00
403 changed files with 15808 additions and 36422 deletions
+35 -61
View File
@@ -12,83 +12,57 @@
# See the License for the specific language governing permissions and
# limitations under the License.
name: "🚀 Issue / Bug / Request"
description: Report a bug, suggest an improvement, or ask a technical question.
name: "\U0001F41B Bug Report"
description: Submit a bug report to help us improve LeRobot
body:
- type: markdown
attributes:
value: |
### Thanks for contributing to LeRobot! 🙌
Please choose the most relevant sections below. If this is a general "how-to" question, consider our [Discord](https://discord.gg/s3KuuzsPFb) for faster community support.
- type: dropdown
id: issue-type
attributes:
label: Ticket Type
description: What kind of ticket are you opening?
options:
- "🐛 Bug Report (Something isn't working)"
- "💡 Feature Request / Improvement"
- "❓ Technical Question"
- "🧹 Maintenance / Documentation"
validations:
required: true
Thanks for taking the time to submit a bug report! 🐛
If this is not a bug related to the LeRobot library directly, but instead a general question about your code or the library specifically please use our [discord](https://discord.gg/s3KuuzsPFb).
- type: textarea
id: system-info
attributes:
label: Environment & System Info
description: |
For bugs or technical questions, please run `lerobot-info` and paste the output.
(Optional for feature requests).
label: System Info
description: Please share your LeRobot configuration by running `lerobot-info` (if installed) or `python -m lerobot.scripts.display_sys_info` (if not installed) and pasting the output below.
render: Shell
placeholder: lerobot version, OS, python version, etc.
placeholder: lerobot version, OS, python version, numpy version, torch version, and lerobot's configuration
validations:
required: true
- type: checkboxes
id: information-scripts-examples
attributes:
label: Information
description: 'The problem arises when using:'
options:
- label: "One of the scripts in the examples/ folder of LeRobot"
- label: "My own task or dataset (give details below)"
- type: textarea
id: description
id: reproduction
validations:
required: true
attributes:
label: Description
label: Reproduction
description: |
Provide a clear summary of the issue or your proposal.
- **Bugs:** What is happening?
- **Features:** What is the goal/use case?
- **Questions:** What are you trying to achieve?
If needed, provide a simple code sample that reproduces the problem you ran into. It can be a Colab link or just a code snippet.
Sharing error messages or stack traces could be useful as well!
Important! Use code tags to correctly format your code. See https://help.github.com/en/github/writing-on-github/creating-and-highlighting-code-blocks#syntax-highlighting
Try to avoid screenshots, as they are hard to read and don't allow copy-and-pasting.
placeholder: |
A clear and concise description of the issue or suggestion.
Steps to reproduce the behavior:
1.
2.
3.
- type: textarea
id: context-repro
id: expected-behavior
validations:
required: true
attributes:
label: Context & Reproduction
description: |
Provide a code snippet, steps to reproduce a bug, or technical details about your proposal.
Please use code blocks for scripts and CLI commands.
placeholder: |
Steps to reproduce / Usage example:
1.
2.
3.
- type: textarea
id: logs
attributes:
label: Relevant logs or stack trace
description: If applicable, paste relevant error logs here.
render: Shell
- type: checkboxes
id: extras
attributes:
label: Checklist
options:
- label: I have searched existing tickets to ensure this isn't a duplicate.
- label: I am using the latest version of the `main` branch.
- label: I have verified this is not an environment-specific problem.
- type: textarea
id: workaround
attributes:
label: Additional Info / Workarounds
description: Anything else we should know? If you have a workaround, please share it!
label: Expected behavior
description: "A clear and concise description of what you would expect to happen."
+27 -41
View File
@@ -1,55 +1,41 @@
## Title
## What this does
Short, imperative summary (e.g., "fix(robots): handle None in sensor parser"). See [CONTRIBUTING.md](../CONTRIBUTING.md) for PR conventions.
Explain what this PR does. Feel free to tag your PR with the appropriate label(s).
## Type / Scope
Examples:
| Title | Label |
|----------------------|-----------------|
| Fixes #[issue] | (🐛 Bug) |
| Adds new dataset | (🗃️ Dataset) |
| Optimizes something | (⚡️ Performance) |
- **Type**: (Bug | Feature | Docs | Performance | Test | CI | Chore)
- **Scope**: (optional — name of module or package affected)
## How it was tested
## Summary / Motivation
Explain/show how you tested your changes.
- One-paragraph description of what changes and why.
- Why this change is needed and any trade-offs or design notes.
Examples:
## Related issues
- Added `test_something` in `tests/test_stuff.py`.
- Added `new_feature` and checked that training converges with policy X on dataset/environment Y.
- Optimized `some_function`, it now runs X times faster than previously.
- Fixes / Closes: # (if any)
- Related: # (if any)
## How to checkout & try? (for the reviewer)
## What changed
Provide a simple way for the reviewer to try out your changes.
- Short, concrete bullets of the modifications (files/behaviour).
- Short note if this introduces breaking changes and migration steps.
Examples:
## How was this tested (or how to run locally)
```bash
pytest -sx tests/test_stuff.py::test_something
```
- Tests added: list new tests or test files.
- Manual checks / dataset runs performed.
- Instructions for the reviewer
```bash
lerobot-train --some.option=true
```
Example:
## SECTION TO REMOVE BEFORE SUBMITTING YOUR PR
- Ran the relevant tests:
**Note**: Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR. Try to avoid tagging more than 3 people.
```bash
pytest -q tests/ -k <keyword>
```
- Reproduce with a quick example or CLI (if applicable):
```bash
lerobot-train --some.option=true
```
## Checklist (required before merge)
- [ ] Linting/formatting run (`pre-commit run -a`)
- [ ] All tests pass locally (`pytest`)
- [ ] Documentation updated
- [ ] CI is green
## Reviewer notes
- Anything the reviewer should focus on (performance, edge-cases, specific files) or general notes.
- Anyone in the community is free to review the PR.
**Note**: Before submitting this PR, please read the [contributor guideline](https://github.com/huggingface/lerobot/blob/main/CONTRIBUTING.md#submitting-a-pull-request-pr).
-69
View File
@@ -1,69 +0,0 @@
# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
CI:
- changed-files:
- any-glob-to-any-file:
- '.github/**'
- 'docker/**'
github_actions:
- changed-files:
- any-glob-to-any-file: '.github/**'
documentation:
- changed-files:
- any-glob-to-any-file:
- '**/*.md'
- '**/*.mdx'
- 'docs/**'
examples:
- changed-files:
- any-glob-to-any-file: 'examples/**'
tests:
- changed-files:
- any-glob-to-any-file: 'tests/**'
sensors:
- changed-files:
- any-glob-to-any-file: 'src/lerobot/cameras/**'
configuration:
- changed-files:
- any-glob-to-any-file: 'src/lerobot/configs/**'
dataset:
- changed-files:
- any-glob-to-any-file: 'src/lerobot/datasets/**'
evaluation:
- changed-files:
- any-glob-to-any-file: 'src/lerobot/envs/**'
robots:
- changed-files:
- any-glob-to-any-file:
- 'src/lerobot/teleoperators/**'
- 'src/lerobot/robots/**'
- 'src/lerobot/motors/**'
policies:
- changed-files:
- any-glob-to-any-file: 'src/lerobot/policies/**'
processor:
- changed-files:
- any-glob-to-any-file: 'src/lerobot/processor/**'
+2 -16
View File
@@ -18,11 +18,6 @@ name: Documentation
on:
# Allows running this workflow manually from the Actions tab
workflow_dispatch:
inputs:
version:
description: 'Version tag (e.g. v0.1.2) - Leave empty for standard main build'
required: false
type: string
# Triggers the workflow on push events to main for the docs folder
push:
@@ -38,9 +33,6 @@ on:
paths:
- "docs/**"
release:
types: [published]
# Ensures that only the latest commit for a PR or branch is built, canceling older runs.
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
@@ -51,7 +43,7 @@ jobs:
build_main_docs:
name: Build Main Docs
if: >
(github.event_name == 'push' || github.event_name == 'workflow_dispatch' || github.event_name == 'release') &&
(github.event_name == 'push' || github.event_name == 'workflow_dispatch') &&
github.repository == 'huggingface/lerobot'
permissions:
contents: read
@@ -59,13 +51,7 @@ jobs:
with:
commit_sha: ${{ github.sha }}
package: lerobot
additional_args: >-
--not_python_module
${{
(github.event_name == 'release' && format('--version {0}', github.event.release.tag_name)) ||
(inputs.version != '' && format('--version {0}', inputs.version)) ||
''
}}
additional_args: --not_python_module
secrets:
token: ${{ secrets.HUGGINGFACE_PUSH }}
hf_token: ${{ secrets.HF_DOC_BUILD_PUSH }}
+2 -9
View File
@@ -44,7 +44,7 @@ permissions:
# Sets up the environment variables
env:
UV_VERSION: "0.8.0"
PYTHON_VERSION: "3.12"
PYTHON_VERSION: "3.10"
# Ensures that only the latest commit for a PR or branch is built, canceling older runs.
concurrency:
@@ -61,9 +61,8 @@ jobs:
MUJOCO_GL: egl
HF_HOME: /mnt/cache/.cache/huggingface
HF_LEROBOT_HOME: /mnt/cache/.cache/huggingface/lerobot
HF_USER_TOKEN: ${{ secrets.LEROBOT_HF_USER }}
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v4
with:
persist-credentials: false
lfs: true
@@ -90,11 +89,5 @@ jobs:
- name: Install lerobot with test extras
run: uv sync --extra "test"
- name: Login to Hugging Face
if: env.HF_USER_TOKEN != ''
run: |
uv run hf auth login --token "$HF_USER_TOKEN" --add-to-git-credential
uv run hf auth whoami
- name: Run pytest
run: uv run pytest tests -vv --maxfail=10
+12 -32
View File
@@ -37,7 +37,7 @@ permissions:
# Sets up the environment variables
env:
UV_VERSION: "0.8.0"
PYTHON_VERSION: "3.12"
PYTHON_VERSION: "3.10"
DOCKER_IMAGE_NAME: huggingface/lerobot-gpu
# Ensures that only the latest action is built, canceling older runs.
@@ -60,9 +60,8 @@ jobs:
MUJOCO_GL: egl
HF_HOME: /mnt/cache/.cache/huggingface
HF_LEROBOT_HOME: /mnt/cache/.cache/huggingface/lerobot
HF_USER_TOKEN: ${{ secrets.LEROBOT_HF_USER }}
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v4
with:
lfs: true
persist-credentials: false
@@ -86,13 +85,7 @@ jobs:
python-version: ${{ env.PYTHON_VERSION }}
- name: Install lerobot with all extras
run: uv sync --extra all # TODO(Steven): Make flash-attn optional
- name: Login to Hugging Face
if: env.HF_USER_TOKEN != ''
run: |
uv run hf auth login --token "$HF_USER_TOKEN" --add-to-git-credential
uv run hf auth whoami
run: uv sync --all-extras --no-extra groot # TODO(Steven): Make flash-attn optional
- name: Run pytest (all extras)
run: uv run pytest tests -vv --maxfail=10
@@ -108,11 +101,9 @@ jobs:
runs-on:
group: aws-general-8-plus
if: |
github.repository == 'huggingface/lerobot' && (
(github.event_name == 'pull_request_review' && github.event.review.state == 'approved' && github.event.pull_request.head.repo.fork == false) ||
github.event_name == 'push' ||
github.event_name == 'workflow_dispatch'
)
(github.event_name == 'pull_request_review' && github.event.review.state == 'approved' && github.event.pull_request.head.repo.fork == false) ||
github.event_name == 'push' ||
github.event_name == 'workflow_dispatch'
outputs:
image_tag: ${{ steps.set_tag.outputs.image_tag }}
env:
@@ -136,7 +127,7 @@ jobs:
sudo apt-get update
sudo apt-get install git-lfs
git lfs install
- uses: actions/checkout@v6
- uses: actions/checkout@v4
with:
lfs: true
persist-credentials: false
@@ -169,7 +160,6 @@ jobs:
HF_LEROBOT_HOME: /home/user_lerobot/.cache/huggingface/lerobot
TORCH_HOME: /home/user_lerobot/.cache/torch
TRITON_CACHE_DIR: /home/user_lerobot/.cache/triton
HF_USER_TOKEN: ${{ secrets.LEROBOT_HF_USER }}
container:
image: ${{ needs.build-and-push-docker.outputs.image_tag }} # zizmor: ignore[unpinned-images]
options: --gpus all --shm-size "16gb"
@@ -181,13 +171,6 @@ jobs:
shell: bash
working-directory: /lerobot
steps:
- name: Login to Hugging Face
if: env.HF_USER_TOKEN != ''
run: |
hf auth login --token "$HF_USER_TOKEN" --add-to-git-credential
hf auth whoami
- name: Fix ptxas permissions
run: chmod +x /lerobot/.venv/lib/python3.12/site-packages/triton/backends/nvidia/bin/ptxas
- name: Run pytest on GPU
run: pytest tests -vv --maxfail=10
- name: Run end-to-end tests
@@ -203,18 +186,15 @@ jobs:
steps:
- name: Get Docker Hub Token and Delete Image
# zizmor: ignore[template-injection]
env:
DOCKERHUB_LEROBOT_USERNAME: ${{ secrets.DOCKERHUB_LEROBOT_USERNAME }}
DOCKERHUB_LEROBOT_PASSWORD: ${{ secrets.DOCKERHUB_LEROBOT_PASSWORD }}
IMAGE_FULL: ${{ needs.build-and-push-docker.outputs.image_tag }}
run: |
IMAGE_NAME=$(echo "$IMAGE_FULL" | cut -d':' -f1)
IMAGE_TAG=$(echo "$IMAGE_FULL" | cut -d':' -f2-)
IMAGE_NAME=$(echo "${{ needs.build-and-push-docker.outputs.image_tag }}" | cut -d':' -f1)
IMAGE_TAG=$(echo "${{ needs.build-and-push-docker.outputs.image_tag }}" | cut -d':' -f2)
echo "Attempting to delete image: $IMAGE_NAME:$IMAGE_TAG"
TOKEN=$(curl -s -H "Content-Type: application/json" \
-X POST \
-d "{\"username\": \"$DOCKERHUB_LEROBOT_USERNAME\", \"password\": \"$DOCKERHUB_LEROBOT_PASSWORD\"}" \
-d '{"username": "${{ secrets.DOCKERHUB_LEROBOT_USERNAME }}", "password": "${{ secrets.DOCKERHUB_LEROBOT_PASSWORD }}"}' \
https://hub.docker.com/v2/users/login/ | jq -r .token)
if [ "$TOKEN" == "null" ] || [ -z "$TOKEN" ]; then
@@ -225,7 +205,7 @@ jobs:
HTTP_RESPONSE=$(curl -s -o /dev/null -w "%{http_code}" \
-H "Authorization: JWT ${TOKEN}" \
-X DELETE \
https://hub.docker.com/v2/repositories/${IMAGE_NAME}/tags/$IMAGE_TAG)
https://hub.docker.com/v2/repositories/${IMAGE_NAME}/tags/${IMAGE_TAG}/)
if [ "$HTTP_RESPONSE" -eq 204 ]; then
echo "Successfully deleted Docker image tag: $IMAGE_NAME:$IMAGE_TAG"
-77
View File
@@ -1,77 +0,0 @@
# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# This workflow automatically labels issues based on their content.
name: Issue Labeler
on:
# Trigger on new issues and edits to existing issues
issues:
types: [opened, edited]
permissions:
contents: read
issues: write
jobs:
label-issue:
name: Auto Label Issue
runs-on: ubuntu-latest
if: github.repository == 'huggingface/lerobot'
steps:
- uses: actions/github-script@v8
with:
script: |
// Setup Input Text
const body = (context.payload.issue.body || '');
const title = (context.payload.issue.title || '');
const cleanBody = body.replace(/```[\s\S]*?```/g, '');
const text = `${title}\n${cleanBody}`.toLowerCase();
const labelsToAdd = new Set();
const matches = (re) => re.test(text);
// Keyword Heuristics
if (matches(/\b(bug|error|crash|exception)\b/i)) labelsToAdd.add('bug');
if (matches(/\b(new feature|enhancement|improvement|proposal|feature request)\b/i)) labelsToAdd.add('enhancement');
if (matches(/\b(question|how to|clarify|explain|how do i|help me|question about)\b/i)) labelsToAdd.add('question');
if (matches(/\b(documentation|docs?|readme|tutorial|wiki|typo|docstring)\b/i)) labelsToAdd.add('documentation');
if (matches(/\b(example|sample|demo|notebook)s?\b/i)) labelsToAdd.add('examples');
if (matches(/\b(datasets?|data loader|data augmentation|data preprocessing)\b/i)) labelsToAdd.add('dataset');
if (matches(/\b(mujoco|isaac|simulation|sim)\b/i)) labelsToAdd.add('simulation');
if (matches(/\b(train|training|optimizer|gradient|wandb|sac)\b/i)) labelsToAdd.add('training');
if (matches(/\b(rerun|plot|render|rendering|visualizer)/i)) labelsToAdd.add('visualization');
if (matches(/\b(cameras?|opencv|realsense|lidars?|sensors?|imus?|microphones?|rgbd|encoders?)\b/i)) labelsToAdd.add('sensors');
if (matches(/\b(urdf|actuators?|calibration|end-effector|kinematics)\b/i)) labelsToAdd.add('robots');
if (matches(/\b(teleop|teleoperator|controller|leader|follower|joystick|gamepad)\b/i)) labelsToAdd.add('teleoperators');
if (matches(/\b(policy|policies|model?)\b/i)) labelsToAdd.add('policies');
if (matches(/\b(processor|pipeline|preprocessor|postprocessor)s?\b/i)) labelsToAdd.add('processor');
if (matches(/\b(eval|evaluate|evaluation|metrics?|score|benchmarks?)\b/i)) labelsToAdd.add('evaluation');
if (matches(/\b(tests?|pytest|unittest|failing test)\b/i)) labelsToAdd.add('tests');
if (matches(/\b(ci|github actions?|github workflows?|gha|docker|pypi)\b/i)) labelsToAdd.add('CI');
if (matches(/\b(perf|latency|throughput|fps|speed|performance|slow|fast|slower|faster|memory usage)\b/i)) labelsToAdd.add('performance');
if (matches(/\b(dependency|dependencies|pip|install error|importerror|package not found|pyproject)\b/i)) labelsToAdd.add('dependencies');
if (matches(/\b(configuration|config|arguments?|input feature|dracuss)\b/i)) labelsToAdd.add('configuration');
// Apply Labels
const labels = Array.from(labelsToAdd).filter(Boolean);
if (labels.length > 0) {
console.log(`Adding labels: ${labels.join(', ')}`);
await github.rest.issues.addLabels({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.issue.number,
labels,
});
}
+6 -22
View File
@@ -28,7 +28,7 @@ on:
# Sets up the environment variables
env:
UV_VERSION: "0.8.0"
PYTHON_VERSION: "3.12"
PYTHON_VERSION: "3.10"
DOCKER_IMAGE_NAME_CPU: huggingface/lerobot-cpu:latest
DOCKER_IMAGE_NAME_GPU: huggingface/lerobot-gpu:latest
@@ -52,7 +52,7 @@ jobs:
sudo apt-get update
sudo apt-get install git-lfs
git lfs install
- uses: actions/checkout@v6
- uses: actions/checkout@v4
with:
lfs: true
persist-credentials: false
@@ -87,7 +87,7 @@ jobs:
sudo apt-get update
sudo apt-get install git-lfs
git lfs install
- uses: actions/checkout@v6
- uses: actions/checkout@v4
with:
lfs: true
persist-credentials: false
@@ -119,7 +119,6 @@ jobs:
HF_LEROBOT_HOME: /home/user_lerobot/.cache/huggingface/lerobot
TORCH_HOME: /home/user_lerobot/.cache/torch
TRITON_CACHE_DIR: /home/user_lerobot/.cache/triton
HF_USER_TOKEN: ${{ secrets.LEROBOT_HF_USER }}
container:
image: ${{ needs.build-docker-cpu-nightly.outputs.image_tag }} # zizmor: ignore[unpinned-images]
options: --shm-size "16gb"
@@ -131,11 +130,6 @@ jobs:
shell: bash
working-directory: /lerobot
steps:
- name: Login to Hugging Face
if: env.HF_USER_TOKEN != ''
run: |
hf auth login --token "$HF_USER_TOKEN" --add-to-git-credential
hf auth whoami
- name: Run pytest on CPU
run: pytest tests -vv --maxfail=10
- name: Run end-to-end tests
@@ -152,7 +146,6 @@ jobs:
HF_LEROBOT_HOME: /home/user_lerobot/.cache/huggingface/lerobot
TORCH_HOME: /home/user_lerobot/.cache/torch
TRITON_CACHE_DIR: /home/user_lerobot/.cache/triton
HF_USER_TOKEN: ${{ secrets.LEROBOT_HF_USER }}
container:
image: ${{ needs.build-docker-gpu-nightly.outputs.image_tag }} # zizmor: ignore[unpinned-images]
options: --gpus all --shm-size "16gb"
@@ -164,11 +157,6 @@ jobs:
shell: bash
working-directory: /lerobot
steps:
- name: Login to Hugging Face
if: env.HF_USER_TOKEN != ''
run: |
hf auth login --token "$HF_USER_TOKEN" --add-to-git-credential
hf auth whoami
- name: Run pytest on GPU
run: pytest tests -vv --maxfail=10
- name: Run end-to-end tests
@@ -186,7 +174,6 @@ jobs:
TORCH_HOME: /home/user_lerobot/.cache/torch
TRITON_CACHE_DIR: /home/user_lerobot/.cache/triton
CUDA_VISIBLE_DEVICES: "0,1,2,3"
HF_USER_TOKEN: ${{ secrets.LEROBOT_HF_USER }}
container:
image: ${{ needs.build-docker-gpu-nightly.outputs.image_tag }} # zizmor: ignore[unpinned-images]
options: --gpus all --shm-size "16gb"
@@ -198,15 +185,12 @@ jobs:
shell: bash
working-directory: /lerobot
steps:
- name: Login to Hugging Face
if: env.HF_USER_TOKEN != ''
run: |
hf auth login --token "$HF_USER_TOKEN" --add-to-git-credential
hf auth whoami
- name: Verify GPU availability
run: |
nvidia-smi
python -c "import torch; print(f'PyTorch CUDA available: {torch.cuda.is_available()}'); print(f'Number of GPUs: {torch.cuda.device_count()}')"
- name: Run multi-GPU training tests
run: pytest -vv tests/training/
# TODO(Steven): Investigate why motors tests are failing in multi-GPU setup
run: pytest tests -vv --maxfail=10 --ignore=tests/motors/
timeout-minutes: 10
-39
View File
@@ -1,39 +0,0 @@
# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# This workflow labels pull requests based on the files that were changed.
name: Pull Request Labeler
on:
# Allows labeling pull requests when they are opened or updated
# zizmor: ignore[dangerous-triggers] Needed to label PRs from forks
pull_request_target:
branches:
- main
types: [opened, synchronize, reopened, ready_for_review]
permissions:
contents: read
pull-requests: write
jobs:
triage:
name: Label PR
runs-on: ubuntu-latest
if: github.repository == 'huggingface/lerobot' && !github.event.pull_request.draft
steps:
- uses: actions/labeler@v6
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
sync-labels: true # Removes labels if files are removed from the PR
+3 -3
View File
@@ -43,14 +43,14 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v6
uses: actions/checkout@v4
with:
persist-credentials: false
- name: Set up Python
uses: actions/setup-python@v6
uses: actions/setup-python@v5
with:
python-version: '3.12'
python-version: '3.10'
- name: Run pre-commit hooks
uses: pre-commit/action@v3.0.1 # zizmor: ignore[unpinned-uses]
+14 -5
View File
@@ -22,7 +22,7 @@ on:
# Sets up the environment variables
env:
UV_VERSION: "0.8.0"
PYTHON_VERSION: "3.12"
PYTHON_VERSION: "3.10"
jobs:
# This job builds the Python package and publishes it to PyPI
@@ -38,14 +38,14 @@ jobs:
steps:
- name: Checkout code
uses: actions/checkout@v6
uses: actions/checkout@v4
with:
persist-credentials: false
- name: Set up Python
uses: actions/setup-python@v6
uses: actions/setup-python@v5
with:
python-version: '3.12'
python-version: '3.10'
- name: Extract Version
id: extract_info
@@ -83,6 +83,14 @@ jobs:
exit 1
fi
- name: Remove Tags with Git dependencies
# TODO(Steven): Temporary patch to remove pi from PyPi 0.4.0 release due to its reliance on git dependencies.
run: |
echo "::info:: Checking for Git dependencies to remove from pyproject.toml..."
grep -E '@ git\+https|lerobot\[pi\]' pyproject.toml | sed 's/^/::warning:: Removing line: /' || true
sed -E -i '/@ git\+https|lerobot\[pi\]/d' pyproject.toml
echo "::info:: Git dependencies removed. Proceeding with build."
- name: Install build dependencies
run: python -m pip install build
@@ -127,7 +135,7 @@ jobs:
env:
MUJOCO_GL: egl
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v4
with:
lfs: true
persist-credentials: false
@@ -169,3 +177,4 @@ jobs:
# TODO(Steven): Publish draft/pre-release and to test pypi weekly
# TODO(Steven): Separate build and publish job
# TODO(Steven): Tag documentation with the same version as the package
+1 -1
View File
@@ -43,7 +43,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v6 # zizmor: ignore[unpinned-uses]
uses: actions/checkout@v4 # zizmor: ignore[unpinned-uses]
with:
fetch-depth: 0
persist-credentials: false
+11 -27
View File
@@ -20,8 +20,8 @@ on:
workflow_dispatch:
# Run on the 1st and 15th of every month at 09:00 UTC
# schedule:
# - cron: '0 2 1,15 * *'
schedule:
- cron: '0 2 1,15 * *'
permissions:
contents: read
@@ -29,7 +29,7 @@ permissions:
# Sets up the environment variables
env:
UV_VERSION: "0.8.0"
PYTHON_VERSION: "3.12"
PYTHON_VERSION: "3.10"
DOCKER_IMAGE_NAME: huggingface/lerobot-gpu:unbound
# Ensures that only the latest action is built, canceling older runs.
@@ -48,9 +48,8 @@ jobs:
MUJOCO_GL: egl
HF_HOME: /mnt/cache/.cache/huggingface
HF_LEROBOT_HOME: /mnt/cache/.cache/huggingface/lerobot
HF_USER_TOKEN: ${{ secrets.LEROBOT_HF_USER }}
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v4
with:
lfs: true
persist-credentials: false
@@ -79,12 +78,8 @@ jobs:
echo "Dependencies unbound:" && cat pyproject.toml
- name: Install lerobot with all extras
run: uv sync --extra all # TODO(Steven): Make flash-attn optional
- name: Login to Hugging Face
if: env.HF_USER_TOKEN != ''
run: |
uv run hf auth login --token "$HF_USER_TOKEN" --add-to-git-credential
uv run hf auth whoami
run: uv sync --all-extras --no-extra groot # TODO(Steven): Make flash-attn optional
- name: Run pytest (all extras)
run: uv run pytest tests -vv
@@ -96,7 +91,6 @@ jobs:
name: Build and Push Docker
runs-on:
group: aws-general-8-plus
if: github.repository == 'huggingface/lerobot'
outputs:
image_tag: ${{ env.DOCKER_IMAGE_NAME }}
env:
@@ -107,7 +101,7 @@ jobs:
sudo apt-get update
sudo apt-get install git-lfs
git lfs install
- uses: actions/checkout@v6
- uses: actions/checkout@v4
with:
lfs: true
persist-credentials: false
@@ -142,7 +136,6 @@ jobs:
HF_LEROBOT_HOME: /home/user_lerobot/.cache/huggingface/lerobot
TORCH_HOME: /home/user_lerobot/.cache/torch
TRITON_CACHE_DIR: /home/user_lerobot/.cache/triton
HF_USER_TOKEN: ${{ secrets.LEROBOT_HF_USER }}
container:
image: ${{ needs.build-and-push-docker.outputs.image_tag }} # zizmor: ignore[unpinned-images]
options: --gpus all --shm-size "16gb"
@@ -154,11 +147,6 @@ jobs:
shell: bash
working-directory: /lerobot
steps:
- name: Login to Hugging Face
if: env.HF_USER_TOKEN != ''
run: |
hf auth login --token "$HF_USER_TOKEN" --add-to-git-credential
hf auth whoami
- name: Run pytest on GPU
run: pytest tests -vv
- name: Run end-to-end tests
@@ -174,19 +162,15 @@ jobs:
steps:
- name: Get Docker Hub Token and Delete Image
# zizmor: ignore[template-injection]
env:
DOCKERHUB_LEROBOT_USERNAME: ${{ secrets.DOCKERHUB_LEROBOT_USERNAME }}
DOCKERHUB_LEROBOT_PASSWORD: ${{ secrets.DOCKERHUB_LEROBOT_PASSWORD }}
IMAGE_FULL: ${{ needs.build-and-push-docker.outputs.image_tag }}
run: |
IMAGE_NAME=$(echo "$IMAGE_FULL" | cut -d':' -f1)
IMAGE_TAG=$(echo "$IMAGE_FULL" | cut -d':' -f2)
IMAGE_NAME=$(echo "${{ needs.build-and-push-docker.outputs.image_tag }}" | cut -d':' -f1)
IMAGE_TAG=$(echo "${{ needs.build-and-push-docker.outputs.image_tag }}" | cut -d':' -f2)
echo "Attempting to delete image: $IMAGE_NAME:$IMAGE_TAG"
TOKEN=$(curl -s -H "Content-Type: application/json" \
-X POST \
-d "{\"username\": \"$DOCKERHUB_LEROBOT_USERNAME\", \"password\": \"$DOCKERHUB_LEROBOT_PASSWORD\"}" \
-d '{"username": "${{ secrets.DOCKERHUB_LEROBOT_USERNAME }}", "password": "${{ secrets.DOCKERHUB_LEROBOT_PASSWORD }}"}' \
https://hub.docker.com/v2/users/login/ | jq -r .token)
if [ "$TOKEN" == "null" ] || [ -z "$TOKEN" ]; then
@@ -197,7 +181,7 @@ jobs:
HTTP_RESPONSE=$(curl -s -o /dev/null -w "%{http_code}" \
-H "Authorization: JWT ${TOKEN}" \
-X DELETE \
https://hub.docker.com/v2/repositories/${IMAGE_NAME}/tags/$IMAGE_TAG)
https://hub.docker.com/v2/repositories/${IMAGE_NAME}/tags/${IMAGE_TAG}/)
if [ "$HTTP_RESPONSE" -eq 204 ]; then
echo "Successfully deleted Docker image tag: $IMAGE_NAME:$IMAGE_TAG"
+3 -3
View File
@@ -13,7 +13,7 @@
# limitations under the License.
default_language_version:
python: python3.12
python: python3.10
exclude: "tests/artifacts/.*\\.safetensors$"
@@ -55,7 +55,7 @@ repos:
rev: v3.21.0
hooks:
- id: pyupgrade
args: [--py312-plus]
args: [--py310-plus]
##### Markdown Quality #####
- repo: https://github.com/rbubley/mirrors-prettier
@@ -87,7 +87,7 @@ repos:
# TODO(Steven): Uncomment when ready to use
##### Static Analysis & Typing #####
- repo: https://github.com/pre-commit/mirrors-mypy
rev: v1.19.1
rev: v1.18.2
hooks:
- id: mypy
args: [--config-file=pyproject.toml]
-25
View File
@@ -1,25 +0,0 @@
# AI Usage Policy
The LeRobot project welcomes contributions from everyone, and we have a few guidelines regarding AI usage to ensure high code quality, clear communication, and a healthy open-source ecosystem:
- **Please disclose significant AI assistance.** If you used AI tools (e.g., Copilot, Claude, Cursor, ChatGPT) to generate a substantial portion of your code or text, let us know in your PR description. Transparency helps us review your changes more effectively.
- **Own your code (The Human-in-the-Loop).** You must fully understand all the changes you are proposing. If you cannot explain what your AI-assisted code does or how it interacts with LeRobot's broader architecture, please take the time to learn and test it before submitting.
- **Keep issues and discussions focused.** You are welcome to use AI to help draft issues or PR descriptions, but please review and edit them carefully before posting. AI can often be overly verbose; trimming the noise and getting straight to the point helps our maintainers address your needs faster.
Our core maintainers also use AI tools to aid their workflows, but they do so while bringing deep contextual knowledge of the LeRobot codebase to validate the output. We ask all contributors to apply that same level of rigor.
## Remember the Human Maintainers
Please remember that LeRobot is maintained by a dedicated team of humans.
Every discussion, issue, and pull request is read and reviewed by real people. While AI tools can generate thousands of lines of code in seconds, reviewing that code still takes human time and energy. Submitting unverified or low-effort AI output puts an unfair burden on our maintainers.
Today, the quality of the AI output still heavily depends on the developer driving the tool. We ask that you respect our maintainers' time by thoroughly vetting, testing, and refining your submissions.
## AI is Welcome Here
LeRobot operates at the cutting edge of AI and robotics, and many of our maintainers actively embrace AI coding assistants as valuable productivity tools. We are a pro-AI project!
Our reason for having an AI policy is not an anti-AI stance. Rather, it exists to ensure that AI is used to enhance human contributions, not replace them with unverified noise. It's about how the tools are used, not the tools themselves.
We value the unique human insight you bring to the LeRobot community. Let AI empower your workflow, but always let your own judgment take the wheel.
+2 -2
View File
@@ -52,7 +52,7 @@ decisions when appropriate.
This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official e-mail address,
Examples of representing our community include using an official email address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.
@@ -60,7 +60,7 @@ representative at an online or offline event.
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at
feedback@huggingface.co.
[feedback@huggingface.co](mailto:feedback@huggingface.co).
All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the
+303 -63
View File
@@ -1,83 +1,323 @@
# How to contribute to 🤗 LeRobot
# How to contribute to 🤗 LeRobot?
Everyone is welcome to contribute, and we value everybody's contribution. Code is not the only way to help the community. Answering questions, helping others, reaching out, and improving the documentation are immensely valuable.
Everyone is welcome to contribute, and we value everybody's contribution. Code
is thus not the only way to help the community. Answering questions, helping
others, reaching out and improving the documentations are immensely valuable to
the community.
Whichever way you choose to contribute, please be mindful to respect our [code of conduct](./CODE_OF_CONDUCT.md) and our [AI policy](./AI_POLICY.md).
It also helps us if you spread the word: reference the library from blog posts
on the awesome projects it made possible, shout out on Twitter when it has
helped you, or simply ⭐️ the repo to say "thank you".
## Ways to Contribute
Whichever way you choose to contribute, please be mindful to respect our
[code of conduct](https://github.com/huggingface/lerobot/blob/main/CODE_OF_CONDUCT.md).
You can contribute in many ways:
## You can contribute in so many ways!
- **Fixing issues:** Resolve bugs or improve existing code.
- **New features:** Develop new features.
- **Extend:** Implement new models/policies, robots, or simulation environments and upload datasets to the Hugging Face Hub.
- **Documentation:** Improve examples, guides, and docstrings.
- **Feedback:** Submit tickets related to bugs or desired new features.
Some of the ways you can contribute to 🤗 LeRobot:
If you are unsure where to start, join our [Discord Channel](https://discord.gg/q8Dzzpym3f).
- Fixing outstanding issues with the existing code.
- Implementing new models, datasets or simulation environments.
- Contributing to the examples or to the documentation.
- Submitting issues related to bugs or desired new features.
## Development Setup
Following the guides below, feel free to open issues and PRs and to coordinate your efforts with the community on our [Discord Channel](https://discord.gg/VjFz58wn3R). For specific inquiries, reach out to [Remi Cadene](mailto:remi.cadene@huggingface.co).
To contribute code, you need to set up a development environment.
If you are not sure how to contribute or want to know the next features we working on, look on this project page: [LeRobot TODO](https://github.com/orgs/huggingface/projects/46)
### 1. Fork and Clone
## Submitting a new issue or feature request
Fork the repository on GitHub, then clone your fork:
```bash
git clone https://github.com/<your-handle>/lerobot.git
cd lerobot
git remote add upstream https://github.com/huggingface/lerobot.git
```
### 2. Environment Installation
Please follow our [Installation Guide](./docs/source/installation.mdx) for the environment setup & installation from source.
## Running Tests & Quality Checks
### Code Style (Pre-commit)
Install `pre-commit` hooks to run checks automatically before you commit:
```bash
pre-commit install
```
To run checks manually on all files:
```bash
pre-commit run --all-files
```
### Running Tests
We use `pytest`. First, ensure you have test artifacts by installing **git-lfs**:
Do your best to follow these guidelines when submitting an issue or a feature
request. It will make it easier for us to come back to you quickly and with good
feedback.
### Did you find a bug?
The 🤗 LeRobot library is robust and reliable thanks to the users who notify us of
the problems they encounter. So thank you for reporting an issue.
First, we would really appreciate it if you could **make sure the bug was not
already reported** (use the search bar on Github under Issues).
Did not find it? :( So we can act quickly on it, please follow these steps:
- Include your **OS type and version**, the versions of **Python** and **PyTorch**.
- A short, self-contained, code snippet that allows us to reproduce the bug in
less than 30s.
- The full traceback if an exception is raised.
- Attach any other additional information, like screenshots, you think may help.
### Do you want a new feature?
A good feature request addresses the following points:
1. Motivation first:
- Is it related to a problem/frustration with the library? If so, please explain
why. Providing a code snippet that demonstrates the problem is best.
- Is it related to something you would need for a project? We'd love to hear
about it!
- Is it something you worked on and think could benefit the community?
Awesome! Tell us what problem it solved for you.
2. Write a _paragraph_ describing the feature.
3. Provide a **code snippet** that demonstrates its future use.
4. In case this is related to a paper, please attach a link.
5. Attach any additional information (drawings, screenshots, etc.) you think may help.
If your issue is well written we're already 80% of the way there by the time you
post it.
## Adding new policies, datasets or environments
Look at our implementations for [datasets](./src/lerobot/datasets/), [policies](./src/lerobot/policies/),
environments ([aloha](https://github.com/huggingface/gym-aloha),
[pusht](https://github.com/huggingface/gym-pusht))
and follow the same api design.
When implementing a new dataset loadable with LeRobotDataset follow these steps:
- Update `available_datasets_per_env` in `lerobot/__init__.py`
When implementing a new environment (e.g. `gym_aloha`), follow these steps:
- Update `available_tasks_per_env` and `available_datasets_per_env` in `lerobot/__init__.py`
When implementing a new policy class (e.g. `DiffusionPolicy`) follow these steps:
- Update `available_policies` and `available_policies_per_env`, in `lerobot/__init__.py`
- Set the required `name` class attribute.
- Update variables in `tests/test_available.py` by importing your new Policy class
## Submitting a pull request (PR)
Before writing code, we strongly advise you to search through the existing PRs or
issues to make sure that nobody is already working on the same thing. If you are
unsure, it is always a good idea to open an issue to get some feedback.
You will need basic `git` proficiency to be able to contribute to
🤗 LeRobot. `git` is not the easiest tool to use but it has the greatest
manual. Type `git --help` in a shell and enjoy. If you prefer books, [Pro
Git](https://git-scm.com/book/en/v2) is a very good reference.
Follow these steps to start contributing:
1. Fork the [repository](https://github.com/huggingface/lerobot) by
clicking on the 'Fork' button on the repository's page. This creates a copy of the code
under your GitHub user account.
2. Clone your fork to your local disk, and add the base repository as a remote. The following command
assumes you have your public SSH key uploaded to GitHub. See the following guide for more
[information](https://docs.github.com/en/repositories/creating-and-managing-repositories/cloning-a-repository).
```bash
git clone git@github.com:<your Github handle>/lerobot.git
cd lerobot
git remote add upstream https://github.com/huggingface/lerobot.git
```
3. Create a new branch to hold your development changes, and do this for every new PR you work on.
Start by synchronizing your `main` branch with the `upstream/main` branch (more details in the [GitHub Docs](https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/syncing-a-fork)):
```bash
git checkout main
git fetch upstream
git rebase upstream/main
```
Once your `main` branch is synchronized, create a new branch from it:
```bash
git checkout -b a-descriptive-name-for-my-changes
```
🚨 **Do not** work on the `main` branch.
4. for development, we advise to use a tool like `poetry` or `uv` instead of just `pip` to easily track our dependencies.
Follow the instructions to [install poetry](https://python-poetry.org/docs/#installation) (use a version >=2.1.0) or to [install uv](https://docs.astral.sh/uv/getting-started/installation/#installation-methods) if you don't have one of them already.
Set up a development environment with conda:
```bash
conda create -y -n lerobot-dev python=3.10 && conda activate lerobot-dev
```
If you're using `uv`, it can manage python versions so you can instead do:
```bash
uv venv --python 3.10 && source .venv/bin/activate
```
To develop on 🤗 LeRobot, you will at least need to install the `dev` and `test` extras dependencies along with the core library:
using `poetry`
```bash
poetry sync --extras "dev test"
```
using `uv`
```bash
uv sync --extra dev --extra test
```
You can also install the project with all its dependencies (including environments):
using `poetry`
```bash
poetry sync --all-extras
```
using `uv`
```bash
uv sync --all-extras
```
> **Note:** If you don't install simulation environments with `--all-extras`, the tests that require them will be skipped when running the pytest suite locally. However, they _will_ be tested in the CI. In general, we advise you to install everything and test locally before pushing.
Whichever command you chose to install the project (e.g. `poetry sync --all-extras`), you should run it again when pulling code with an updated version of `pyproject.toml` and `poetry.lock` in order to synchronize your virtual environment with the new dependencies.
The equivalent of `pip install some-package`, would just be:
using `poetry`
```bash
poetry add some-package
```
using `uv`
```bash
uv add some-package
```
When making changes to the poetry sections of the `pyproject.toml`, you should run the following command to lock dependencies.
using `poetry`
```bash
poetry lock
```
using `uv`
```bash
uv lock
```
5. Develop the features on your branch.
As you work on the features, you should make sure that the test suite
passes. You should run the tests impacted by your changes like this (see
below an explanation regarding the environment variable):
```bash
pytest tests/<TEST_TO_RUN>.py
```
6. Follow our style.
`lerobot` relies on `ruff` to format its source code
consistently. Set up [`pre-commit`](https://pre-commit.com/) to run these checks
automatically as Git commit hooks.
Install `pre-commit` hooks:
```bash
pre-commit install
```
You can run these hooks whenever you need on staged files with:
```bash
pre-commit
```
Once you're happy with your changes, add changed files using `git add` and
make a commit with `git commit` to record your changes locally:
```bash
git add modified_file.py
git commit
```
Note, if you already committed some changes that have a wrong formatting, you can use:
```bash
pre-commit run --all-files
```
Please write [good commit messages](https://chris.beams.io/posts/git-commit/).
It is a good idea to sync your copy of the code with the original
repository regularly. This way you can quickly account for changes:
```bash
git fetch upstream
git rebase upstream/main
```
Push the changes to your account using:
```bash
git push -u origin a-descriptive-name-for-my-changes
```
7. Once you are satisfied (**and the checklist below is happy too**), go to the
webpage of your fork on GitHub. Click on 'Pull request' to send your changes
to the project maintainers for review.
8. It's ok if maintainers ask you for changes. It happens to core contributors
too! So everyone can see the changes in the Pull request, work in your local
branch and push the changes to your fork. They will automatically appear in
the pull request.
### Checklist
1. The title of your pull request should be a summary of its contribution;
2. If your pull request addresses an issue, please mention the issue number in
the pull request description to make sure they are linked (and people
consulting the issue know you are working on it);
3. To indicate a work in progress please prefix the title with `[WIP]`, or preferably mark
the PR as a draft PR. These are useful to avoid duplicated work, and to differentiate
it from PRs ready to be merged;
4. Make sure existing tests pass;
### Tests
An extensive test suite is included to test the library behavior and several examples. Library tests can be found in the [tests folder](https://github.com/huggingface/lerobot/tree/main/tests).
Install [git lfs](https://git-lfs.com/) to retrieve test artifacts (if you don't have it already).
On Mac:
```bash
brew install git-lfs
git lfs install
```
On Ubuntu:
```bash
sudo apt-get install git-lfs
git lfs install
```
Pull artifacts if they're not in [tests/artifacts](tests/artifacts)
```bash
git lfs pull
```
Run the full suite (this may require extras installed):
We use `pytest` in order to run the tests. From the root of the
repository, here's how to run tests with `pytest` for the library:
```bash
pytest -sv ./tests
python -m pytest -sv ./tests
```
Or run a specific test file during development:
```bash
pytest -sv tests/test_specific_feature.py
```
## Submitting Issues & Pull Requests
Use the templates for required fields and examples.
- **Issues:** Follow the [ticket template](./.github/ISSUE_TEMPLATE/bug-report.yml).
- **Pull requests:** Rebase on `upstream/main`, use a descriptive branch (don't work on `main`), run `pre-commit` and tests locally, and follow the [PR template](./.github/PULL_REQUEST_TEMPLATE.md).
One member of the LeRobot team will then review your contribution.
Thank you for contributing to LeRobot!
You can specify a smaller set of tests in order to test only the feature
you're working on.
-1
View File
@@ -1,3 +1,2 @@
include src/lerobot/templates/lerobot_modelcard_template.md
include src/lerobot/datasets/card_template.md
include src/lerobot/envs/metaworld_config.json
+290 -105
View File
@@ -1,5 +1,7 @@
<p align="center">
<img alt="LeRobot, Hugging Face Robotics Library" src="./media/readme/lerobot-logo-thumbnail.png" width="100%">
<img alt="LeRobot, Hugging Face Robotics Library" src="https://raw.githubusercontent.com/huggingface/lerobot/main/media/lerobot-logo-thumbnail.png" width="100%">
<br/>
<br/>
</p>
<div align="center">
@@ -10,132 +12,323 @@
[![Status](https://img.shields.io/pypi/status/lerobot)](https://pypi.org/project/lerobot/)
[![Version](https://img.shields.io/pypi/v/lerobot)](https://pypi.org/project/lerobot/)
[![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-v2.1-ff69b4.svg)](https://github.com/huggingface/lerobot/blob/main/CODE_OF_CONDUCT.md)
[![Discord](https://img.shields.io/badge/Discord-Join_Us-5865F2?style=flat&logo=discord&logoColor=white)](https://discord.gg/q8Dzzpym3f)
[![Discord](https://dcbadge.vercel.app/api/server/C5P34WJ68S?style=flat)](https://discord.gg/s3KuuzsPFb)
<!-- [![Coverage](https://codecov.io/gh/huggingface/lerobot/branch/main/graph/badge.svg?token=TODO)](https://codecov.io/gh/huggingface/lerobot) -->
</div>
**LeRobot** aims to provide models, datasets, and tools for real-world robotics in PyTorch. The goal is to lower the barrier to entry so that everyone can contribute to and benefit from shared datasets and pretrained models.
<h2 align="center">
<p><a href="https://huggingface.co/docs/lerobot/hope_jr">
Build Your Own HopeJR Robot!</a></p>
</h2>
🤗 A hardware-agnostic, Python-native interface that standardizes control across diverse platforms, from low-cost arms (SO-100) to humanoids.
<div align="center">
<img
src="https://raw.githubusercontent.com/huggingface/lerobot/main/media/hope_jr/hopejr.png"
alt="HopeJR robot"
title="HopeJR robot"
width="60%"
/>
🤗 A standardized, scalable LeRobotDataset format (Parquet + MP4 or images) hosted on the Hugging Face Hub, enabling efficient storage, streaming and visualization of massive robotic datasets.
<p><strong>Meet HopeJR A humanoid robot arm and hand for dexterous manipulation!</strong></p>
<p>Control it with exoskeletons and gloves for precise hand movements.</p>
<p>Perfect for advanced manipulation tasks! 🤖</p>
🤗 State-of-the-art policies that have been shown to transfer to the real-world ready for training and deployment.
<p><a href="https://huggingface.co/docs/lerobot/hope_jr">
See the full HopeJR tutorial here.</a></p>
</div>
🤗 Comprehensive support for the open-source ecosystem to democratize physical AI.
<br/>
## Quick Start
<h2 align="center">
<p><a href="https://huggingface.co/docs/lerobot/so101">
Build Your Own SO-101 Robot!</a></p>
</h2>
LeRobot can be installed directly from PyPI.
<div align="center">
<table>
<tr>
<td align="center"><img src="https://raw.githubusercontent.com/huggingface/lerobot/main/media/so101/so101.webp" alt="SO-101 follower arm" title="SO-101 follower arm" width="90%"/></td>
<td align="center"><img src="https://raw.githubusercontent.com/huggingface/lerobot/main/media/so101/so101-leader.webp" alt="SO-101 leader arm" title="SO-101 leader arm" width="90%"/></td>
</tr>
</table>
<p><strong>Meet the updated SO100, the SO-101 Just €114 per arm!</strong></p>
<p>Train it in minutes with a few simple moves on your laptop.</p>
<p>Then sit back and watch your creation act autonomously! 🤯</p>
<p><a href="https://huggingface.co/docs/lerobot/so101">
See the full SO-101 tutorial here.</a></p>
<p>Want to take it to the next level? Make your SO-101 mobile by building LeKiwi!</p>
<p>Check out the <a href="https://huggingface.co/docs/lerobot/lekiwi">LeKiwi tutorial</a> and bring your robot to life on wheels.</p>
<img src="https://raw.githubusercontent.com/huggingface/lerobot/main/media/lekiwi/kiwi.webp" alt="LeKiwi mobile robot" title="LeKiwi mobile robot" width="50%">
</div>
<br/>
<h3 align="center">
<p>LeRobot: State-of-the-art AI for real-world robotics</p>
</h3>
---
🤗 LeRobot aims to provide models, datasets, and tools for real-world robotics in PyTorch. The goal is to lower the barrier to entry to robotics so that everyone can contribute and benefit from sharing datasets and pretrained models.
🤗 LeRobot contains state-of-the-art approaches that have been shown to transfer to the real-world with a focus on imitation learning and reinforcement learning.
🤗 LeRobot already provides a set of pretrained models, datasets with human collected demonstrations, and simulation environments to get started without assembling a robot. In the coming weeks, the plan is to add more and more support for real-world robotics on the most affordable and capable robots out there.
🤗 LeRobot hosts pretrained models and datasets on this Hugging Face community page: [huggingface.co/lerobot](https://huggingface.co/lerobot)
#### Examples of pretrained models on simulation environments
<table>
<tr>
<td><img src="https://raw.githubusercontent.com/huggingface/lerobot/main/media/gym/aloha_act.gif" width="100%" alt="ACT policy on ALOHA env"/></td>
<td><img src="https://raw.githubusercontent.com/huggingface/lerobot/main/media/gym/simxarm_tdmpc.gif" width="100%" alt="TDMPC policy on SimXArm env"/></td>
<td><img src="https://raw.githubusercontent.com/huggingface/lerobot/main/media/gym/pusht_diffusion.gif" width="100%" alt="Diffusion policy on PushT env"/></td>
</tr>
<tr>
<td align="center">ACT policy on ALOHA env</td>
<td align="center">TDMPC policy on SimXArm env</td>
<td align="center">Diffusion policy on PushT env</td>
</tr>
</table>
## Installation
LeRobot works with Python 3.10+ and PyTorch 2.2+.
### Environment Setup
Create a virtual environment with Python 3.10 and activate it, e.g. with [`miniforge`](https://conda-forge.org/download/):
```bash
conda create -y -n lerobot python=3.10
conda activate lerobot
```
When using `conda`, install `ffmpeg` in your environment:
```bash
conda install ffmpeg -c conda-forge
```
> **NOTE:** This usually installs `ffmpeg 7.X` for your platform compiled with the `libsvtav1` encoder. If `libsvtav1` is not supported (check supported encoders with `ffmpeg -encoders`), you can:
>
> - _[On any platform]_ Explicitly install `ffmpeg 7.X` using:
>
> ```bash
> conda install ffmpeg=7.1.1 -c conda-forge
> ```
>
> - _[On Linux only]_ Install [ffmpeg build dependencies](https://trac.ffmpeg.org/wiki/CompilationGuide/Ubuntu#GettheDependencies) and [compile ffmpeg from source with libsvtav1](https://trac.ffmpeg.org/wiki/CompilationGuide/Ubuntu#libsvtav1), and make sure you use the corresponding ffmpeg binary to your install with `which ffmpeg`.
### Install LeRobot 🤗
#### From Source
First, clone the repository and navigate into the directory:
```bash
git clone https://github.com/huggingface/lerobot.git
cd lerobot
```
Then, install the library in editable mode. This is useful if you plan to contribute to the code.
```bash
pip install -e .
```
> **NOTE:** If you encounter build errors, you may need to install additional dependencies (`cmake`, `build-essential`, and `ffmpeg libs`). On Linux, run:
> `sudo apt-get install cmake build-essential python3-dev pkg-config libavformat-dev libavcodec-dev libavdevice-dev libavutil-dev libswscale-dev libswresample-dev libavfilter-dev`. For other systems, see: [Compiling PyAV](https://pyav.org/docs/develop/overview/installation.html#bring-your-own-ffmpeg)
For simulations, 🤗 LeRobot comes with gymnasium environments that can be installed as extras:
- [aloha](https://github.com/huggingface/gym-aloha)
- [xarm](https://github.com/huggingface/gym-xarm)
- [pusht](https://github.com/huggingface/gym-pusht)
For instance, to install 🤗 LeRobot with aloha and pusht, use:
```bash
pip install -e ".[aloha, pusht]"
```
### Installation from PyPI
**Core Library:**
Install the base package with:
```bash
pip install lerobot
lerobot-info
```
> [!IMPORTANT]
> For detailed installation guide, please see the [Installation Documentation](https://huggingface.co/docs/lerobot/installation).
_This installs only the default dependencies._
## Robots & Control
<div align="center">
<img src="./media/readme/robots_control_video.webp" width="640px" alt="Reachy 2 Demo">
</div>
LeRobot provides a unified `Robot` class interface that decouples control logic from hardware specifics. It supports a wide range of robots and teleoperation devices.
```python
from lerobot.robots.myrobot import MyRobot
# Connect to a robot
robot = MyRobot(config=...)
robot.connect()
# Read observation and send action
obs = robot.get_observation()
action = model.select_action(obs)
robot.send_action(action)
```
**Supported Hardware:** SO100, LeKiwi, Koch, HopeJR, OMX, EarthRover, Reachy2, Gamepads, Keyboards, Phones, OpenARM, Unitree G1.
While these devices are natively integrated into the LeRobot codebase, the library is designed to be extensible. You can easily implement the Robot interface to utilize LeRobot's data collection, training, and visualization tools for your own custom robot.
For detailed hardware setup guides, see the [Hardware Documentation](https://huggingface.co/docs/lerobot/integrate_hardware).
## LeRobot Dataset
To solve the data fragmentation problem in robotics, we utilize the **LeRobotDataset** format.
- **Structure:** Synchronized MP4 videos (or images) for vision and Parquet files for state/action data.
- **HF Hub Integration:** Explore thousands of robotics datasets on the [Hugging Face Hub](https://huggingface.co/lerobot).
- **Tools:** Seamlessly delete episodes, split by indices/fractions, add/remove features, and merge multiple datasets.
```python
from lerobot.datasets.lerobot_dataset import LeRobotDataset
# Load a dataset from the Hub
dataset = LeRobotDataset("lerobot/aloha_mobile_cabinet")
# Access data (automatically handles video decoding)
episode_index=0
print(f"{dataset[episode_index]['action'].shape=}\n")
```
Learn more about it in the [LeRobotDataset Documentation](https://huggingface.co/docs/lerobot/lerobot-dataset-v3)
## SoTA Models
LeRobot implements state-of-the-art policies in pure PyTorch, covering Imitation Learning, Reinforcement Learning, and Vision-Language-Action (VLA) models, with more coming soon. It also provides you with the tools to instrument and inspect your training process.
<p align="center">
<img alt="Gr00t Architecture" src="./media/readme/VLA_architecture.jpg" width="640px">
</p>
Training a policy is as simple as running a script configuration:
**Extra Features:**
To install additional functionality, use one of the following:
```bash
lerobot-train \
--policy=act \
--dataset.repo_id=lerobot/aloha_mobile_cabinet
pip install 'lerobot[all]' # All available features
pip install 'lerobot[aloha,pusht]' # Specific features (Aloha & Pusht)
pip install 'lerobot[feetech]' # Feetech motor support
```
| Category | Models |
| -------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Imitation Learning** | [ACT](./docs/source/policy_act_README.md), [Diffusion](./docs/source/policy_diffusion_README.md), [VQ-BeT](./docs/source/policy_vqbet_README.md), [Multitask DiT Policy](./docs/source/policy_multi_task_dit_README.md) |
| **Reinforcement Learning** | [HIL-SERL](./docs/source/hilserl.mdx), [TDMPC](./docs/source/policy_tdmpc_README.md) & QC-FQL (coming soon) |
| **VLAs Models** | [Pi0Fast](./docs/source/pi0fast.mdx), [Pi0.5](./docs/source/pi05.mdx), [GR00T N1.5](./docs/source/policy_groot_README.md), [SmolVLA](./docs/source/policy_smolvla_README.md), [XVLA](./docs/source/xvla.mdx) |
_Replace `[...]` with your desired features._
Similarly to the hardware, you can easily implement your own policy & leverage LeRobot's data collection, training, and visualization tools, and share your model to the HF Hub
**Available Tags:**
For a full list of optional dependencies, see:
https://pypi.org/project/lerobot/
For detailed policy setup guides, see the [Policy Documentation](https://huggingface.co/docs/lerobot/bring_your_own_policies).
> [!NOTE]
> For lerobot 0.4.0, if you want to install pi tags, you will have to do: `pip install "lerobot[pi]@git+https://github.com/huggingface/lerobot.git"`.
>
> This will be solved in the next patch release
## Inference & Evaluation
### Weights & Biases
Evaluate your policies in simulation or on real hardware using the unified evaluation script. LeRobot supports standard benchmarks like **LIBERO**, **MetaWorld** and more to come.
To use [Weights and Biases](https://docs.wandb.ai/quickstart) for experiment tracking, log in with
```bash
# Evaluate a policy on the LIBERO benchmark
lerobot-eval \
--policy.path=lerobot/pi0_libero_finetuned \
--env.type=libero \
--env.task=libero_object \
--eval.n_episodes=10
wandb login
```
Learn how to implement your own simulation environment or benchmark and distribute it from the HF Hub by following the [EnvHub Documentation](https://huggingface.co/docs/lerobot/envhub)
(note: you will also need to enable WandB in the configuration. See below.)
## Resources
### Visualize datasets
- **[Documentation](https://huggingface.co/docs/lerobot/index):** The complete guide to tutorials & API.
- **[Chinese Tutorials: LeRobot+SO-ARM101中文教程-同济子豪兄](https://zihao-ai.feishu.cn/wiki/space/7589642043471924447)** Detailed doc for assembling, teleoperate, dataset, train, deploy. Verified by Seed Studio and 5 global hackathon players.
- **[Discord](https://discord.gg/q8Dzzpym3f):** Join the `LeRobot` server to discuss with the community.
- **[X](https://x.com/LeRobotHF):** Follow us on X to stay up-to-date with the latest developments.
- **[Robot Learning Tutorial](https://huggingface.co/spaces/lerobot/robot-learning-tutorial):** A free, hands-on course to learn robot learning using LeRobot.
Check out [example 1](https://github.com/huggingface/lerobot/blob/main/examples/dataset/load_lerobot_dataset.py) that illustrates how to use our dataset class which automatically downloads data from the Hugging Face hub.
You can also locally visualize episodes from a dataset on the hub by executing our script from the command line:
```bash
lerobot-dataset-viz \
--repo-id lerobot/pusht \
--episode-index 0
```
or from a dataset in a local folder with the `root` option and the `--mode local` (in the following case the dataset will be searched for in `./my_local_data_dir/lerobot/pusht`)
```bash
lerobot-dataset-viz \
--repo-id lerobot/pusht \
--root ./my_local_data_dir \
--mode local \
--episode-index 0
```
It will open `rerun.io` and display the camera streams, robot states and actions, like this:
https://github-production-user-asset-6210df.s3.amazonaws.com/4681518/328035972-fd46b787-b532-47e2-bb6f-fd536a55a7ed.mov?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAVCODYLSA53PQK4ZA%2F20240505%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240505T172924Z&X-Amz-Expires=300&X-Amz-Signature=d680b26c532eeaf80740f08af3320d22ad0b8a4e4da1bcc4f33142c15b509eda&X-Amz-SignedHeaders=host&actor_id=24889239&key_id=0&repo_id=748713144
Our script can also visualize datasets stored on a distant server. See `lerobot-dataset-viz --help` for more instructions.
### The `LeRobotDataset` format
A dataset in `LeRobotDataset` format is very simple to use. It can be loaded from a repository on the Hugging Face hub or a local folder simply with e.g. `dataset = LeRobotDataset("lerobot/aloha_static_coffee")` and can be indexed into like any Hugging Face and PyTorch dataset. For instance `dataset[0]` will retrieve a single temporal frame from the dataset containing observation(s) and an action as PyTorch tensors ready to be fed to a model.
A specificity of `LeRobotDataset` is that, rather than retrieving a single frame by its index, we can retrieve several frames based on their temporal relationship with the indexed frame, by setting `delta_timestamps` to a list of relative times with respect to the indexed frame. For example, with `delta_timestamps = {"observation.image": [-1, -0.5, -0.2, 0]}` one can retrieve, for a given index, 4 frames: 3 "previous" frames 1 second, 0.5 seconds, and 0.2 seconds before the indexed frame, and the indexed frame itself (corresponding to the 0 entry). See example [1_load_lerobot_dataset.py](https://github.com/huggingface/lerobot/blob/main/examples/dataset/load_lerobot_dataset.py) for more details on `delta_timestamps`.
Under the hood, the `LeRobotDataset` format makes use of several ways to serialize data which can be useful to understand if you plan to work more closely with this format. We tried to make a flexible yet simple dataset format that would cover most type of features and specificities present in reinforcement learning and robotics, in simulation and in real-world, with a focus on cameras and robot states but easily extended to other types of sensory inputs as long as they can be represented by a tensor.
Here are the important details and internal structure organization of a typical `LeRobotDataset` instantiated with `dataset = LeRobotDataset("lerobot/aloha_static_coffee")`. The exact features will change from dataset to dataset but not the main aspects:
```
dataset attributes:
├ hf_dataset: a Hugging Face dataset (backed by Arrow/parquet). Typical features example:
│ ├ observation.images.cam_high (VideoFrame):
│ │ VideoFrame = {'path': path to a mp4 video, 'timestamp' (float32): timestamp in the video}
│ ├ observation.state (list of float32): position of an arm joints (for instance)
│ ... (more observations)
│ ├ action (list of float32): goal position of an arm joints (for instance)
│ ├ episode_index (int64): index of the episode for this sample
│ ├ frame_index (int64): index of the frame for this sample in the episode ; starts at 0 for each episode
│ ├ timestamp (float32): timestamp in the episode
│ ├ next.done (bool): indicates the end of an episode ; True for the last frame in each episode
│ └ index (int64): general index in the whole dataset
├ meta: a LeRobotDatasetMetadata object containing:
│ ├ info: a dictionary of metadata on the dataset
│ │ ├ codebase_version (str): this is to keep track of the codebase version the dataset was created with
│ │ ├ fps (int): frame per second the dataset is recorded/synchronized to
│ │ ├ features (dict): all features contained in the dataset with their shapes and types
│ │ ├ total_episodes (int): total number of episodes in the dataset
│ │ ├ total_frames (int): total number of frames in the dataset
│ │ ├ robot_type (str): robot type used for recording
│ │ ├ data_path (str): formattable string for the parquet files
│ │ └ video_path (str): formattable string for the video files (if using videos)
│ ├ episodes: a DataFrame containing episode metadata with columns:
│ │ ├ episode_index (int): index of the episode
│ │ ├ tasks (list): list of tasks for this episode
│ │ ├ length (int): number of frames in this episode
│ │ ├ dataset_from_index (int): start index of this episode in the dataset
│ │ └ dataset_to_index (int): end index of this episode in the dataset
│ ├ stats: a dictionary of statistics (max, mean, min, std) for each feature in the dataset, for instance
│ │ ├ observation.images.front_cam: {'max': tensor with same number of dimensions (e.g. `(c, 1, 1)` for images, `(c,)` for states), etc.}
│ │ └ ...
│ └ tasks: a DataFrame containing task information with task names as index and task_index as values
├ root (Path): local directory where the dataset is stored
├ image_transforms (Callable): optional image transformations to apply to visual modalities
└ delta_timestamps (dict): optional delta timestamps for temporal queries
```
A `LeRobotDataset` is serialised using several widespread file formats for each of its parts, namely:
- hf_dataset stored using Hugging Face datasets library serialization to parquet
- videos are stored in mp4 format to save space
- metadata are stored in plain json/jsonl files
Dataset can be uploaded/downloaded from the HuggingFace hub seamlessly. To work on a local dataset, you can specify its location with the `root` argument if it's not in the default `~/.cache/huggingface/lerobot` location.
#### Reproduce state-of-the-art (SOTA)
We provide some pretrained policies on our [hub page](https://huggingface.co/lerobot) that can achieve state-of-the-art performances.
You can reproduce their training by loading the config from their run. Simply running:
```bash
lerobot-train --config_path=lerobot/diffusion_pusht
```
reproduces SOTA results for Diffusion Policy on the PushT task.
## Contribute
If you would like to contribute to 🤗 LeRobot, please check out our [contribution guide](https://github.com/huggingface/lerobot/blob/main/CONTRIBUTING.md).
### Add a pretrained policy
Once you have trained a policy you may upload it to the Hugging Face hub using a hub id that looks like `${hf_user}/${repo_name}` (e.g. [lerobot/diffusion_pusht](https://huggingface.co/lerobot/diffusion_pusht)).
You first need to find the checkpoint folder located inside your experiment directory (e.g. `outputs/train/2024-05-05/20-21-12_aloha_act_default/checkpoints/002500`). Within that there is a `pretrained_model` directory which should contain:
- `config.json`: A serialized version of the policy configuration (following the policy's dataclass config).
- `model.safetensors`: A set of `torch.nn.Module` parameters, saved in [Hugging Face Safetensors](https://huggingface.co/docs/safetensors/index) format.
- `train_config.json`: A consolidated configuration containing all parameters used for training. The policy configuration should match `config.json` exactly. This is useful for anyone who wants to evaluate your policy or for reproducibility.
To upload these to the hub, run the following:
```bash
huggingface-cli upload ${hf_user}/${repo_name} path/to/pretrained_model
```
See [lerobot_eval.py](https://github.com/huggingface/lerobot/blob/main/src/lerobot/scripts/lerobot_eval.py) for an example of how other people may use your policy.
### Acknowledgment
- The LeRobot team 🤗 for building SmolVLA [Paper](https://arxiv.org/abs/2506.01844), [Blog](https://huggingface.co/blog/smolvla).
- Thanks to Tony Zhao, Zipeng Fu and colleagues for open sourcing ACT policy, ALOHA environments and datasets. Ours are adapted from [ALOHA](https://tonyzhaozh.github.io/aloha) and [Mobile ALOHA](https://mobile-aloha.github.io).
- Thanks to Cheng Chi, Zhenjia Xu and colleagues for open sourcing Diffusion policy, Pusht environment and datasets, as well as UMI datasets. Ours are adapted from [Diffusion Policy](https://diffusion-policy.cs.columbia.edu) and [UMI Gripper](https://umi-gripper.github.io).
- Thanks to Nicklas Hansen, Yunhai Feng and colleagues for open sourcing TDMPC policy, Simxarm environments and datasets. Ours are adapted from [TDMPC](https://github.com/nicklashansen/tdmpc) and [FOWM](https://www.yunhaifeng.com/FOWM).
- Thanks to Antonio Loquercio and Ashish Kumar for their early support.
- Thanks to [Seungjae (Jay) Lee](https://sjlee.cc/), [Mahi Shafiullah](https://mahis.life/) and colleagues for open sourcing [VQ-BeT](https://sjlee.cc/vq-bet/) policy and helping us adapt the codebase to our repository. The policy is adapted from [VQ-BeT repo](https://github.com/jayLEE0301/vq_bet_official).
## Citation
If you use LeRobot in your research, please cite:
If you want, you can cite this work with:
```bibtex
@misc{cadene2024lerobot,
@@ -146,14 +339,6 @@ If you use LeRobot in your research, please cite:
}
```
## Contribute
## Star History
We welcome contributions from everyone in the community! To get started, please read our [CONTRIBUTING.md](./CONTRIBUTING.md) guide. Whether you're adding a new feature, improving documentation, or fixing a bug, your help and feedback are invaluable. We're incredibly excited about the future of open-source robotics and can't wait to work with you on what's next—thank you for your support!
<p align="center">
<img alt="SO101 Video" src="./media/readme/so100_video.webp" width="640px">
</p>
<div align="center">
<sub>Built by the <a href="https://huggingface.co/lerobot">LeRobot</a> team at <a href="https://huggingface.co">Hugging Face</a> with ❤️</sub>
</div>
[![Star History Chart](https://api.star-history.com/svg?repos=huggingface/lerobot&type=Timeline)](https://star-history.com/#huggingface/lerobot&Timeline)
-48
View File
@@ -1,48 +0,0 @@
# Security Policy
## Project Status & Philosophy
`lerobot` has so far been primarily a research and prototyping tool, which is why deployment security hasnt been a strong focus until now. As `lerobot` continues to be adopted and deployed in production, we are paying much closer attention to these kinds of issues.
Fortunately, being an open-source project, the community can also help by reporting and fixing vulnerabilities. We appreciate your efforts to responsibly disclose your findings and will make every effort to acknowledge your contributions.
## Reporting a Vulnerability
To report a security issue, please use the GitHub Security Advisory ["Report a Vulnerability"](https://github.com/huggingface/lerobot/security/advisories/new) tab.
The `lerobot` team will send a response indicating the next steps in handling your report. After the initial reply to your report, the security team will keep you informed of the progress towards a fix and full announcement, and may ask for additional information or guidance.
#### Hugging Face Security Team
Since this project is part of the Hugging Face ecosystem, feel free to submit vulnerability reports directly to: **[security@huggingface.co](mailto:security@huggingface.co)**. Someone from the HF security team will review the report and recommend next steps.
#### Open Source Disclosures
If reporting a vulnerability specific to the open-source codebase (and not the underlying Hub infrastructure), you may also use [Huntr](https://huntr.com), a vulnerability disclosure program for open source software.
## Supported Versions
Currently, we treat `lerobot` as a rolling release. We prioritize security updates for the latest available version (`main` branch).
| Version | Supported |
| -------- | --------- |
| Latest | ✅ |
| < Latest | ❌ |
## Secure Usage Guidelines
`lerobot` is tightly coupled to the Hugging Face Hub for sharing data and pretrained policies. When downloading artifacts uploaded by others, you expose yourself to risks. Please read below for recommendations to keep your runtime and robot environment safe.
### Remote Artefacts (Weights & Policies)
Models and policies uploaded to the Hugging Face Hub come in different formats. We heavily recommend uploading and downloading models in the [`safetensors`](https://github.com/huggingface/safetensors) format.
`safetensors` was developed specifically to prevent arbitrary code execution on your system, which is critical when running software on physical hardware/robots.
To avoid loading models from unsafe formats (e.g., `pickle`), you should ensure you are prioritizing `safetensors` files.
### Remote Code
Some models or environments on the Hub may require `trust_remote_code=True` to run custom architecture code.
Please **always** verify the content of the modeling files when using this argument. We recommend setting a specific `revision` (commit hash) when loading remote code to ensure you protect yourself from unverified updates to the repository.
+42 -42
View File
@@ -28,9 +28,9 @@ We don't expect the same optimal settings for a dataset of images from a simulat
For these reasons, we run this benchmark on four representative datasets:
- `lerobot/pusht_image`: (96 x 96 pixels) simulation with simple geometric shapes, fixed camera.
- `lerobot/aloha_mobile_shrimp_image`: (480 x 640 pixels) real-world indoor, moving camera.
- `lerobot/paris_street`: (720 x 1280 pixels) real-world outdoor, moving camera.
- `lerobot/kitchen`: (1080 x 1920 pixels) real-world indoor, fixed camera.
- `aliberts/aloha_mobile_shrimp_image`: (480 x 640 pixels) real-world indoor, moving camera.
- `aliberts/paris_street`: (720 x 1280 pixels) real-world outdoor, moving camera.
- `aliberts/kitchen`: (1080 x 1920 pixels) real-world indoor, fixed camera.
Note: The datasets used for this benchmark need to be image datasets, not video datasets.
@@ -179,7 +179,7 @@ python benchmark/video/run_video_benchmark.py \
--output-dir outputs/video_benchmark \
--repo-ids \
lerobot/pusht_image \
lerobot/aloha_mobile_shrimp_image \
aliberts/aloha_mobile_shrimp_image \
--vcodec libx264 libx265 \
--pix-fmt yuv444p yuv420p \
--g 2 20 None \
@@ -203,9 +203,9 @@ python benchmark/video/run_video_benchmark.py \
--output-dir outputs/video_benchmark \
--repo-ids \
lerobot/pusht_image \
lerobot/aloha_mobile_shrimp_image \
lerobot/paris_street \
lerobot/kitchen \
aliberts/aloha_mobile_shrimp_image \
aliberts/paris_street \
aliberts/kitchen \
--vcodec libx264 libx265 \
--pix-fmt yuv444p yuv420p \
--g 1 2 3 4 5 6 10 15 20 40 None \
@@ -221,9 +221,9 @@ python benchmark/video/run_video_benchmark.py \
--output-dir outputs/video_benchmark \
--repo-ids \
lerobot/pusht_image \
lerobot/aloha_mobile_shrimp_image \
lerobot/paris_street \
lerobot/kitchen \
aliberts/aloha_mobile_shrimp_image \
aliberts/paris_street \
aliberts/kitchen \
--vcodec libsvtav1 \
--pix-fmt yuv420p \
--g 1 2 3 4 5 6 10 15 20 40 None \
@@ -252,37 +252,37 @@ Since we're using av1 encoding, we're choosing the `pyav` decoder as `video_read
These tables show the results for `g=2` and `crf=30`, using `timestamps-modes=6_frames` and `backend=pyav`
| video_images_size_ratio | vcodec | pix_fmt | | | |
| --------------------------------- | ---------- | ------- | --------- | --------- | --------- |
| | libx264 | | libx265 | | libsvtav1 |
| repo_id | yuv420p | yuv444p | yuv420p | yuv444p | yuv420p |
| lerobot/pusht_image | **16.97%** | 17.58% | 18.57% | 18.86% | 22.06% |
| lerobot/aloha_mobile_shrimp_image | 2.14% | 2.11% | 1.38% | **1.37%** | 5.59% |
| lerobot/paris_street | 2.12% | 2.13% | **1.54%** | **1.54%** | 4.43% |
| lerobot/kitchen | 1.40% | 1.39% | **1.00%** | **1.00%** | 2.52% |
| video_images_size_ratio | vcodec | pix_fmt | | | |
| ---------------------------------- | ---------- | ------- | --------- | --------- | --------- |
| | libx264 | | libx265 | | libsvtav1 |
| repo_id | yuv420p | yuv444p | yuv420p | yuv444p | yuv420p |
| lerobot/pusht_image | **16.97%** | 17.58% | 18.57% | 18.86% | 22.06% |
| aliberts/aloha_mobile_shrimp_image | 2.14% | 2.11% | 1.38% | **1.37%** | 5.59% |
| aliberts/paris_street | 2.12% | 2.13% | **1.54%** | **1.54%** | 4.43% |
| aliberts/kitchen | 1.40% | 1.39% | **1.00%** | **1.00%** | 2.52% |
| video_images_load_time_ratio | vcodec | pix_fmt | | | |
| --------------------------------- | ------- | ------- | -------- | ------- | --------- |
| | libx264 | | libx265 | | libsvtav1 |
| repo_id | yuv420p | yuv444p | yuv420p | yuv444p | yuv420p |
| lerobot/pusht_image | 6.45 | 5.19 | **1.90** | 2.12 | 2.47 |
| lerobot/aloha_mobile_shrimp_image | 11.80 | 7.92 | 0.71 | 0.85 | **0.48** |
| lerobot/paris_street | 2.21 | 2.05 | 0.36 | 0.49 | **0.30** |
| lerobot/kitchen | 1.46 | 1.46 | 0.28 | 0.51 | **0.26** |
| video_images_load_time_ratio | vcodec | pix_fmt | | | |
| ---------------------------------- | ------- | ------- | -------- | ------- | --------- |
| | libx264 | | libx265 | | libsvtav1 |
| repo_id | yuv420p | yuv444p | yuv420p | yuv444p | yuv420p |
| lerobot/pusht_image | 6.45 | 5.19 | **1.90** | 2.12 | 2.47 |
| aliberts/aloha_mobile_shrimp_image | 11.80 | 7.92 | 0.71 | 0.85 | **0.48** |
| aliberts/paris_street | 2.21 | 2.05 | 0.36 | 0.49 | **0.30** |
| aliberts/kitchen | 1.46 | 1.46 | 0.28 | 0.51 | **0.26** |
| | | vcodec | pix_fmt | | | |
| --------------------------------- | -------- | -------- | ------------ | -------- | --------- | ------------ |
| | | libx264 | | libx265 | | libsvtav1 |
| repo_id | metric | yuv420p | yuv444p | yuv420p | yuv444p | yuv420p |
| lerobot/pusht_image | avg_mse | 2.90E-04 | **2.03E-04** | 3.13E-04 | 2.29E-04 | 2.19E-04 |
| | avg_psnr | 35.44 | 37.07 | 35.49 | **37.30** | 37.20 |
| | avg_ssim | 98.28% | **98.85%** | 98.31% | 98.84% | 98.72% |
| lerobot/aloha_mobile_shrimp_image | avg_mse | 2.76E-04 | 2.59E-04 | 3.17E-04 | 3.06E-04 | **1.30E-04** |
| | avg_psnr | 35.91 | 36.21 | 35.88 | 36.09 | **40.17** |
| | avg_ssim | 95.19% | 95.18% | 95.00% | 95.05% | **97.73%** |
| lerobot/paris_street | avg_mse | 6.89E-04 | 6.70E-04 | 4.03E-03 | 4.02E-03 | **3.09E-04** |
| | avg_psnr | 33.48 | 33.68 | 32.05 | 32.15 | **35.40** |
| | avg_ssim | 93.76% | 93.75% | 89.46% | 89.46% | **95.46%** |
| lerobot/kitchen | avg_mse | 2.50E-04 | 2.24E-04 | 4.28E-04 | 4.18E-04 | **1.53E-04** |
| | avg_psnr | 36.73 | 37.33 | 36.56 | 36.75 | **39.12** |
| | avg_ssim | 95.47% | 95.58% | 95.52% | 95.53% | **96.82%** |
| | | vcodec | pix_fmt | | | |
| ---------------------------------- | -------- | -------- | ------------ | -------- | --------- | ------------ |
| | | libx264 | | libx265 | | libsvtav1 |
| repo_id | metric | yuv420p | yuv444p | yuv420p | yuv444p | yuv420p |
| lerobot/pusht_image | avg_mse | 2.90E-04 | **2.03E-04** | 3.13E-04 | 2.29E-04 | 2.19E-04 |
| | avg_psnr | 35.44 | 37.07 | 35.49 | **37.30** | 37.20 |
| | avg_ssim | 98.28% | **98.85%** | 98.31% | 98.84% | 98.72% |
| aliberts/aloha_mobile_shrimp_image | avg_mse | 2.76E-04 | 2.59E-04 | 3.17E-04 | 3.06E-04 | **1.30E-04** |
| | avg_psnr | 35.91 | 36.21 | 35.88 | 36.09 | **40.17** |
| | avg_ssim | 95.19% | 95.18% | 95.00% | 95.05% | **97.73%** |
| aliberts/paris_street | avg_mse | 6.89E-04 | 6.70E-04 | 4.03E-03 | 4.02E-03 | **3.09E-04** |
| | avg_psnr | 33.48 | 33.68 | 32.05 | 32.15 | **35.40** |
| | avg_ssim | 93.76% | 93.75% | 89.46% | 89.46% | **95.46%** |
| aliberts/kitchen | avg_mse | 2.50E-04 | 2.24E-04 | 4.28E-04 | 4.18E-04 | **1.53E-04** |
| | avg_psnr | 36.73 | 37.33 | 36.56 | 36.75 | **39.12** |
| | avg_ssim | 95.47% | 95.58% | 95.52% | 95.53% | **96.82%** |
+2 -4
View File
@@ -24,7 +24,7 @@ ARG OS_VERSION=22.04
FROM nvidia/cuda:${CUDA_VERSION}-base-ubuntu${OS_VERSION}
# Define Python version argument
ARG PYTHON_VERSION=3.12
ARG PYTHON_VERSION=3.10
# Configure environment variables
ENV DEBIAN_FRONTEND=noninteractive \
@@ -73,7 +73,7 @@ ENV HOME=/home/user_lerobot \
RUN uv venv --python python${PYTHON_VERSION}
# Install Python dependencies for caching
COPY --chown=user_lerobot:user_lerobot setup.py pyproject.toml README.md MANIFEST.in ./
COPY --chown=user_lerobot:user_lerobot pyproject.toml README.md MANIFEST.in ./
COPY --chown=user_lerobot:user_lerobot src/ src/
ARG UNBOUND_DEPS=false
@@ -85,8 +85,6 @@ RUN if [ "$UNBOUND_DEPS" = "true" ]; then \
RUN uv pip install --no-cache ".[all]"
RUN chmod +x /lerobot/.venv/lib/python${PYTHON_VERSION}/site-packages/triton/backends/nvidia/bin/ptxas
# Copy the rest of the application source code
# Make sure to have the git-LFS files for testing
COPY --chown=user_lerobot:user_lerobot . .
+2 -2
View File
@@ -19,7 +19,7 @@
# docker run -it --rm lerobot-user
# Configure the base image
ARG PYTHON_VERSION=3.12
ARG PYTHON_VERSION=3.10
FROM python:${PYTHON_VERSION}-slim
# Configure environment variables
@@ -59,7 +59,7 @@ ENV HOME=/home/user_lerobot \
RUN uv venv
# Install Python dependencies for caching
COPY --chown=user_lerobot:user_lerobot setup.py pyproject.toml README.md MANIFEST.in ./
COPY --chown=user_lerobot:user_lerobot pyproject.toml README.md MANIFEST.in ./
COPY --chown=user_lerobot:user_lerobot src/ src/
ARG UNBOUND_DEPS=false
+2 -28
View File
@@ -7,6 +7,8 @@
- sections:
- local: il_robots
title: Imitation Learning for Robots
- local: cameras
title: Cameras
- local: bring_your_own_policies
title: Bring Your Own Policies
- local: integrate_hardware
@@ -17,8 +19,6 @@
title: Train RL in Simulation
- local: multi_gpu_training
title: Multi GPU training
- local: peft_training
title: Training with PEFT (e.g., LoRA)
title: "Tutorials"
- sections:
- local: lerobot-dataset-v3
@@ -27,10 +27,6 @@
title: Porting Large Datasets
- local: using_dataset_tools
title: Using the Dataset Tools
- local: dataset_subtask
title: Using Subtasks in the Dataset
- local: streaming_video_encoding
title: Streaming Video Encoding
title: "Datasets"
- sections:
- local: act
@@ -39,23 +35,13 @@
title: SmolVLA
- local: pi0
title: π₀ (Pi0)
- local: pi0fast
title: π₀-FAST (Pi0Fast)
- local: pi05
title: π₀.₅ (Pi05)
- local: groot
title: NVIDIA GR00T N1.5
- local: xvla
title: X-VLA
- local: multi_task_dit
title: Multitask DiT Policy
- local: walloss
title: WALL-OSS
title: "Policies"
- sections:
- local: sarm
title: SARM
title: "Reward Models"
- sections:
- local: async
title: Use Async Inference
@@ -67,8 +53,6 @@
title: Environments from the Hub
- local: envhub_leisaac
title: Control & Train Robots in Sim (LeIsaac)
- local: envhub_isaaclab_arena
title: NVIDIA IsaacLab Arena Environments
- local: libero
title: Using Libero
- local: metaworld
@@ -103,19 +87,11 @@
title: Unitree G1
- local: earthrover_mini_plus
title: Earth Rover Mini
- local: omx
title: OMX
- local: openarm
title: OpenArm
title: "Robots"
- sections:
- local: phone_teleop
title: Phone
title: "Teleoperators"
- sections:
- local: cameras
title: Cameras
title: "Sensors"
- sections:
- local: torch_accelerators
title: PyTorch accelerators
@@ -125,8 +101,6 @@
title: Notebooks
- local: feetech
title: Updating Feetech Firmware
- local: damiao
title: Damiao Motors and CAN Bus
title: "Resources"
- sections:
- local: contributing
-3
View File
@@ -88,8 +88,5 @@ lerobot-record \
--dataset.repo_id=${HF_USER}/eval_act_your_dataset \
--dataset.num_episodes=10 \
--dataset.single_task="Your task description" \
--dataset.streaming_encoding=true \
--dataset.encoder_threads=2 \
# --dataset.vcodec=auto \
--policy.path=${HF_USER}/act_policy
```
+2 -3
View File
@@ -48,7 +48,7 @@ python -m lerobot.async_inference.robot_client \
--task="dummy" \ # POLICY: The task to run the policy on (`Fold my t-shirt`). Not necessarily defined for all policies, such as `act`
--policy_type=your_policy_type \ # POLICY: the type of policy to run (smolvla, act, etc)
--pretrained_name_or_path=user/model \ # POLICY: the model name/path on server to the checkpoint to run (e.g., lerobot/smolvla_base)
--policy_device=mps \ # POLICY: the device to run the policy on, on the server (cuda, mps, xpu, cpu)
--policy_device=mps \ # POLICY: the device to run the policy on, on the server
--actions_per_chunk=50 \ # POLICY: the number of actions to output at once
--chunk_size_threshold=0.5 \ # CLIENT: the threshold for the chunk size before sending a new observation to the server
--aggregate_fn_name=weighted_average \ # CLIENT: the function to aggregate actions on overlapping portions
@@ -169,7 +169,7 @@ python -m lerobot.async_inference.robot_client \
<!-- prettier-ignore-start -->
```python
import threading
from lerobot.robots.so_follower import SO100FollowerConfig
from lerobot.robots.so100_follower import SO100FollowerConfig
from lerobot.cameras.opencv.configuration_opencv import OpenCVCameraConfig
from lerobot.async_inference.configs import RobotClientConfig
from lerobot.async_inference.robot_client import RobotClient
@@ -195,7 +195,6 @@ client_cfg = RobotClientConfig(
robot=robot_cfg,
server_address="localhost:8080",
policy_device="mps",
client_device="cpu",
policy_type="smolvla",
pretrained_name_or_path="<user>/smolvla_async",
chunk_size_threshold=0.5,
+4 -4
View File
@@ -32,7 +32,7 @@ version = "0.1.0"
dependencies = [
# your policy-specific dependencies
]
requires-python = ">= 3.12"
requires-python = ">= 3.11"
[build-system]
build-backend = # your-build-backend
@@ -82,7 +82,7 @@ Create your policy implementation by inheriting from LeRobot's base `PreTrainedP
# modeling_my_custom_policy.py
import torch
import torch.nn as nn
from typing import Any
from typing import Dict, Any
from lerobot.policies.pretrained import PreTrainedPolicy
from .configuration_my_custom_policy import MyCustomPolicyConfig
@@ -91,7 +91,7 @@ class MyCustomPolicy(PreTrainedPolicy):
config_class = MyCustomPolicyConfig
name = "my_custom_policy"
def __init__(self, config: MyCustomPolicyConfig, dataset_stats: dict[str, Any] = None):
def __init__(self, config: MyCustomPolicyConfig, dataset_stats: Dict[str, Any] = None):
super().__init__(config, dataset_stats)
...
```
@@ -102,7 +102,7 @@ Create processor functions:
```python
# processor_my_custom_policy.py
from typing import Any
from typing import Dict, Any
import torch
+81 -95
View File
@@ -1,22 +1,12 @@
# Cameras
LeRobot offers multiple options for video capture:
LeRobot offers multiple options for video capture, including phone cameras, built-in laptop cameras, external webcams, and Intel RealSense cameras. To efficiently record frames from most cameras, you can use either the `OpenCVCamera` or `RealSenseCamera` class. For additional compatibility details on the `OpenCVCamera` class, refer to the [Video I/O with OpenCV Overview](https://docs.opencv.org/4.x/d0/da7/videoio_overview.html).
| Class | Supported Cameras |
| ----------------- | ----------------------------------- |
| `OpenCVCamera` | Phone, built-in laptop, USB webcams |
| `ZMQCamera` | Network-connected cameras |
| `RealSenseCamera` | Intel RealSense (with depth) |
| `Reachy2Camera` | Reachy 2 robot cameras |
### Finding your camera
> [!TIP]
> For `OpenCVCamera` compatibility details, see the [Video I/O with OpenCV Overview](https://docs.opencv.org/4.x/d0/da7/videoio_overview.html).
To instantiate a camera, you need a camera identifier. This identifier might change if you reboot your computer or re-plug your camera, a behavior mostly dependant on your operating system.
### Find your camera
Every camera requires a unique identifier to be instantiated, allowing you to distinguish between multiple connected devices.
`OpenCVCamera` and `RealSenseCamera` support auto-discovery. Run the command below to list available devices and their identifiers. Note that these identifiers may change after rebooting your computer or re-plugging the camera, depending on your operating system.
To find the camera indices of the cameras plugged into your system, run the following script:
```bash
lerobot-find-cameras opencv # or realsense for Intel Realsense cameras
@@ -24,7 +14,7 @@ lerobot-find-cameras opencv # or realsense for Intel Realsense cameras
The output will look something like this if you have two cameras connected:
```bash
```
--- Detected Cameras ---
Camera #0:
Name: OpenCV Camera @ 0
@@ -43,37 +33,13 @@ Camera #0:
> [!WARNING]
> When using Intel RealSense cameras in `macOS`, you could get this [error](https://github.com/IntelRealSense/librealsense/issues/12307): `Error finding RealSense cameras: failed to set power state`, this can be solved by running the same command with `sudo` permissions. Note that using RealSense cameras in `macOS` is unstable.
`ZMQCamera` and `Reachy2Camera` do not support auto-discovery. They must be configured manually by providing their network address and port or robot SDK settings.
## Use Cameras
## Use cameras
Below are two examples, demonstrating how to work with the API.
### Frame access modes
All camera classes implement three access modes for capturing frames:
| Method | Behavior | Blocks? | Best For |
| ------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------- | ---------------------------------------- |
| `read()` | Waits for the camera hardware to return a frame. May block for a long time depending on the camera and SDK. | Yes | Simple scripts, sequential capture |
| `async_read(timeout_ms)` | Returns the latest unconsumed frame from background thread. Blocks only if buffer is empty, up to `timeout_ms`. Raises `TimeoutError` if no frame arrives. | With a timeout | Control loops synchronized to camera FPS |
| `read_latest(max_age_ms)` | Peeks at the most recent frame in buffer (may be stale). Raises `TimeoutError` if frame is older than `max_age_ms`. | No | UI visualization, logging, monitoring |
### Usage examples
The following examples show how to use the camera API to configure and capture frames from different camera types.
- **Blocking and non-blocking frame capture** using an OpenCV-based camera
- **Asynchronous frame capture** using an OpenCV-based camera
- **Color and depth capture** using an Intel RealSense camera
> [!WARNING]
> Failing to cleanly disconnect cameras can cause resource leaks. Use the context manager protocol to ensure automatic cleanup:
>
> ```python
> with OpenCVCamera(config) as camera:
> ...
> ```
>
> You can also call `connect()` and `disconnect()` manually, but always use a `finally` block for the latter.
<hfoptions id="shell_restart">
<hfoption id="Open CV Camera">
@@ -94,30 +60,16 @@ config = OpenCVCameraConfig(
)
# Instantiate and connect an `OpenCVCamera`, performing a warm-up read (default).
with OpenCVCamera(config) as camera:
# Read a frame synchronously — blocks until hardware delivers a new frame
frame = camera.read()
print(f"read() call returned frame with shape:", frame.shape)
# Read a frame asynchronously with a timeout — returns the latest unconsumed frame or waits up to timeout_ms for a new one
try:
for i in range(10):
frame = camera.async_read(timeout_ms=200)
print(f"async_read call returned frame {i} with shape:", frame.shape)
except TimeoutError as e:
print(f"No frame received within timeout: {e}")
# Instantly return a frame - returns the most recent frame captured by the camera
try:
initial_frame = camera.read_latest(max_age_ms=1000)
for i in range(10):
frame = camera.read_latest(max_age_ms=1000)
print(f"read_latest call returned frame {i} with shape:", frame.shape)
print(f"Was a new frame received by the camera? {not (initial_frame == frame).any()}")
except TimeoutError as e:
print(f"Frame too old: {e}")
camera = OpenCVCamera(config)
camera.connect()
# Read frames asynchronously in a loop via `async_read(timeout_ms)`
try:
for i in range(10):
frame = camera.async_read(timeout_ms=200)
print(f"Async frame {i} shape:", frame.shape)
finally:
camera.disconnect()
```
<!-- prettier-ignore-end -->
@@ -159,10 +111,10 @@ finally:
</hfoption>
</hfoptions>
## Use your phone's camera
## Use your phone
<hfoptions id="use phone">
<hfoption id="iPhone & macOS">
<hfoption id="Mac">
To use your iPhone as a camera on macOS, enable the Continuity Camera feature:
@@ -172,49 +124,83 @@ To use your iPhone as a camera on macOS, enable the Continuity Camera feature:
For more details, visit [Apple support](https://support.apple.com/en-gb/guide/mac-help/mchl77879b8a/mac).
Your iPhone should be detected automatically when running the camera setup script in the next section.
</hfoption>
<hfoption id="OBS virtual camera">
<hfoption id="Linux">
If you want to use your phone as a camera using OBS, follow these steps to set up a virtual camera.
If you want to use your phone as a camera on Linux, follow these steps to set up a virtual camera
1. _(Linux only) Install `v4l2loopback-dkms` and `v4l-utils`_. These packages create virtual camera devices and verify their settings. Install with:
1. _Install `v4l2loopback-dkms` and `v4l-utils`_. Those packages are required to create virtual camera devices (`v4l2loopback`) and verify their settings with the `v4l2-ctl` utility from `v4l-utils`. Install them using:
```bash
<!-- prettier-ignore-start -->
```python
sudo apt install v4l2loopback-dkms v4l-utils
```
<!-- prettier-ignore-end -->
2. _Install the [DroidCam app](https://droidcam.app) on your phone_. This app is available for both iOS and Android.
3. _Download and install [OBS Studio](https://obsproject.com)_.
4. _Download and install the [DroidCam OBS plugin](https://droidcam.app/obs)_.
5. _Start OBS Studio_.
2. _Install [DroidCam](https://droidcam.app) on your phone_. This app is available for both iOS and Android.
3. _Install [OBS Studio](https://obsproject.com)_. This software will help you manage the camera feed. Install it using [Flatpak](https://flatpak.org):
6. _Add your phone as a source_. Follow the instructions [here](https://droidcam.app/obs/usage). Be sure to set the resolution to `640x480` to avoid the watermarks.
7. _Adjust resolution settings_. In OBS Studio, go to `File > Settings > Video` or `OBS > Preferences... > Video`. Change the `Base(Canvas) Resolution` and the `Output(Scaled) Resolution` to `640x480` by manually typing it.
<!-- prettier-ignore-start -->
```python
flatpak install flathub com.obsproject.Studio
```
<!-- prettier-ignore-end -->
4. _Install the DroidCam OBS plugin_. This plugin integrates DroidCam with OBS Studio. Install it with:
<!-- prettier-ignore-start -->
```python
flatpak install flathub com.obsproject.Studio.Plugin.DroidCam
```
<!-- prettier-ignore-end -->
5. _Start OBS Studio_. Launch with:
<!-- prettier-ignore-start -->
```python
flatpak run com.obsproject.Studio
```
<!-- prettier-ignore-end -->
6. _Add your phone as a source_. Follow the instructions [here](https://droidcam.app/obs/usage). Be sure to set the resolution to `640x480`.
7. _Adjust resolution settings_. In OBS Studio, go to `File > Settings > Video`. Change the `Base(Canvas) Resolution` and the `Output(Scaled) Resolution` to `640x480` by manually typing it in.
8. _Start virtual camera_. In OBS Studio, follow the instructions [here](https://obsproject.com/kb/virtual-camera-guide).
9. _Verify the virtual camera setup and resolution_.
- **Linux**: Use `v4l2-ctl` to list devices and check resolution:
```bash
v4l2-ctl --list-devices # find VirtualCam and note its /dev/videoX path
v4l2-ctl -d /dev/videoX --get-fmt-video # replace with your VirtualCam path
```
You should see `VirtualCam` listed and resolution `640x480`.
- **macOS**: Open Photo Booth or FaceTime and select "OBS Virtual Camera" as the input.
- **Windows**: The native Camera app doesn't support virtual cameras. Use a video conferencing app (Zoom, Teams) or run `lerobot-find-cameras opencv` directly to verify.
9. _Verify the virtual camera setup_. Use `v4l2-ctl` to list the devices:
<details>
<summary><strong>Troubleshooting</strong></summary>
<!-- prettier-ignore-start -->
```python
v4l2-ctl --list-devices
```
<!-- prettier-ignore-end -->
> The virtual camera resolution is incorrect.
You should see an entry like:
Delete the virtual camera source and recreate it. The resolution cannot be changed after creation.
```
VirtualCam (platform:v4l2loopback-000):
/dev/video1
```
> Error reading frame in background thread for OpenCVCamera(X): OpenCVCamera(X) frame width=640 or height=480 do not match configured width=1920 or height=1080.
10. _Check the camera resolution_. Use `v4l2-ctl` to ensure that the virtual camera output resolution is `640x480`. Change `/dev/video1` to the port of your virtual camera from the output of `v4l2-ctl --list-devices`.
This error is caused by OBS Virtual Camera advertising a `1920x1080` resolution despite rescaling. The only fix for now is to comment out the width and height check in `_postprocess_image()`.
<!-- prettier-ignore-start -->
```python
v4l2-ctl -d /dev/video1 --get-fmt-video
```
<!-- prettier-ignore-end -->
</details>
You should see an entry like:
```
>>> Format Video Capture:
>>> Width/Height : 640/480
>>> Pixel Format : 'YUYV' (YUYV 4:2:2)
```
Troubleshooting: If the resolution is not correct you will have to delete the Virtual Camera port and try again as it cannot be changed.
If everything is set up correctly, you can proceed with the rest of the tutorial.
</hfoption>
</hfoptions>
If everything is set up correctly, your phone will appear as a standard OpenCV camera and can be used with `OpenCVCamera`.
-165
View File
@@ -1,165 +0,0 @@
# Damiao Motors and CAN Bus
This guide covers setup and usage of Damiao motors with LeRobot via CAN bus communication.
Currently, only Linux is supported, as the OpenArms CAN adapter only has drivers for Linux.
## Linux CAN Setup
Before using Damiao motors, you need to set up the CAN interface on your Linux system.
### Install CAN Utilities
```bash
sudo apt-get install can-utils
```
### Configure CAN Interface (Manual)
For standard CAN FD (recommended for OpenArms):
```bash
sudo ip link set can0 down
sudo ip link set can0 type can bitrate 1000000 dbitrate 5000000 fd on
sudo ip link set can0 up
```
For standard CAN (without FD):
```bash
sudo ip link set can0 down
sudo ip link set can0 type can bitrate 1000000
sudo ip link set can0 up
```
### Configure CAN Interface (Using LeRobot)
LeRobot provides a utility script to setup and test CAN interfaces:
```bash
# Setup multiple interfaces (e.g., OpenArms Followers with 2 CAN buses)
lerobot-setup-can --mode=setup --interfaces=can0,can1
```
## Debugging CAN Communication
Use the built-in debug tools to test motor communication:
```bash
# Test motors on all interfaces
lerobot-setup-can --mode=test --interfaces=can0,can1
# Run speed/latency test
lerobot-setup-can --mode=speed --interfaces=can0
```
The test mode will scan for motors (IDs 0x01-0x08) and report which ones respond. Example output:
```
can0: UP (CAN FD)
Motor 0x01 (joint_1): ✓ FOUND
→ Response 0x11 [FD]: 00112233...
Motor 0x02 (joint_2): ✓ FOUND
Motor 0x03 (joint_3): ✗ No response
...
Summary: 2/8 motors found
```
## Usage
### Basic Setup
```python
from lerobot.motors import Motor
from lerobot.motors.damiao import DamiaoMotorsBus
# Define your motors with send/receive CAN IDs
motors = {
"joint_1": Motor(id=0x01, motor_type_str="dm8009", recv_id=0x11),
"joint_2": Motor(id=0x02, motor_type_str="dm4340", recv_id=0x12),
"joint_3": Motor(id=0x03, motor_type_str="dm4310", recv_id=0x13),
}
# Create the bus
bus = DamiaoMotorsBus(
port="can0", # Linux socketcan interface
motors=motors,
)
# Connect
bus.connect()
```
### Reading Motor States
```python
# Read single motor position (degrees)
position = bus.read("Present_Position", "joint_1")
# Read from multiple motors
positions = bus.sync_read("Present_Position") # All motors
positions = bus.sync_read("Present_Position", ["joint_1", "joint_2"])
# Read all states at once (position, velocity, torque)
states = bus.sync_read_all_states()
# Returns: {'joint_1': {'position': 45.2, 'velocity': 1.3, 'torque': 0.5}, ...}
```
### Writing Motor Commands
```python
# Enable torque
bus.enable_torque()
# Set goal position (degrees)
bus.write("Goal_Position", "joint_1", 45.0)
# Set positions for multiple motors
bus.sync_write("Goal_Position", {
"joint_1": 45.0,
"joint_2": -30.0,
"joint_3": 90.0,
})
# Disable torque
bus.disable_torque()
```
## Configuration Options
| Parameter | Default | Description |
| -------------- | --------- | ----------------------------------------------------------- |
| `port` | - | CAN interface (`can0`) or serial port (`/dev/cu.usbmodem*`) |
| `use_can_fd` | `True` | Enable CAN FD for higher data rates |
| `bitrate` | `1000000` | Nominal bitrate (1 Mbps) |
| `data_bitrate` | `5000000` | CAN FD data bitrate (5 Mbps) |
## Motor Configuration
Each motor requires:
- `id`: CAN ID for sending commands
- `motor_type`: One of the supported motor types (e.g., `"dm8009"`, `"dm4340"`)
- `recv_id`: CAN ID for receiving responses
OpenArms default IDs follow the pattern: send ID `0x0N`, receive ID `0x1N` where N is the joint number.
## Troubleshooting
### No Response from Motors
1. **Check power**
2. **Verify CAN wiring**: Check CAN-H, CAN-L, and GND connections
3. **Check motor IDs**: Use Damiao Debugging Tools to verify/configure IDs
4. **Test CAN interface**: Run `candump can0` to see if messages are being received
5. **Run diagnostics**: `lerobot-setup-can --mode=test --interfaces=can0`
### Motor Timeout Parameter
If motors were configured with timeout=0, they won't respond to commands. Use Damiao Debugging Tools to set a non-zero timeout value.
### Verify CAN FD Status
```bash
ip -d link show can0 | grep fd
```
-278
View File
@@ -1,278 +0,0 @@
# Using Subtasks in LeRobot Datasets
Subtask support in robotics datasets has proven effective in improving robot reasoning and understanding. Subtasks are particularly useful for:
- **Hierarchical policies**: Building policies that include subtask predictions to visualize robot reasoning in real time
- **Reward modeling**: Helping reward models understand task progression (e.g., SARM-style stage-aware reward models)
- **Task decomposition**: Breaking down complex manipulation tasks into atomic, interpretable steps
LeRobotDataset now supports subtasks as part of its dataset structure, alongside tasks.
## What are Subtasks?
While a **task** describes the overall goal (e.g., "Pick up the apple and place it in the basket"), **subtasks** break down the execution into finer-grained steps:
1. "Approach the apple"
2. "Grasp the apple"
3. "Lift the apple"
4. "Move to basket"
5. "Release the apple"
Each frame in the dataset can be annotated with its corresponding subtask, enabling models to learn and predict these intermediate stages.
<img
src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/subtask-asset.png"
alt="An overview of subtask annotation showing how frames are labeled with intermediate subtask stages"
width="80%"
/>
<p>
<em>Figure: Overview of subtask annotation.</em>
</p>
**Reference:** _Subtask-learning based for robot self-assembly in flexible collaborative assembly in manufacturing_, Original Article, Published: 19 April 2022.
## Dataset Structure
Subtask information is stored in the dataset metadata:
```
my-dataset/
├── data/
│ └── ...
├── meta/
│ ├── info.json
│ ├── stats.json
│ ├── tasks.parquet
│ ├── subtasks.parquet # Subtask index → subtask string mapping
│ └── episodes/
│ └── ...
└── videos/
└── ...
```
### Subtasks Parquet File
The `meta/subtasks.parquet` file maps subtask indices to their natural language descriptions:
| subtask_index | subtask (index column) |
| ------------- | ---------------------- |
| 0 | "Approach the apple" |
| 1 | "Grasp the apple" |
| 2 | "Lift the apple" |
| ... | ... |
### Frame-Level Annotations
Each frame in the dataset can include a `subtask_index` field that references the subtasks parquet file:
```python
# Example frame data in the parquet file
{
"index": 42,
"timestamp": 1.4,
"episode_index": 0,
"task_index": 0,
"subtask_index": 2, # References "Lift the apple"
"observation.state": [...],
"action": [...],
}
```
## Annotating Datasets with Subtasks
We provide a HuggingFace Space for easily annotating any LeRobotDataset with subtasks:
**[https://huggingface.co/spaces/lerobot/annotate](https://huggingface.co/spaces/lerobot/annotate)**
After completing your annotation:
1. Click "Push to Hub" to upload your annotated dataset
2. You can also run the annotation space locally by following the instructions at [github.com/huggingface/lerobot-annotate](https://github.com/huggingface/lerobot-annotate)
## Loading Datasets with Subtasks
When you load a dataset with subtask annotations, the subtask information is automatically available:
```python
from lerobot.datasets.lerobot_dataset import LeRobotDataset
# Load a dataset with subtask annotations
dataset = LeRobotDataset("jadechoghari/collect-fruit-annotated")
# Access a sample
sample = dataset[100]
# The sample includes both task and subtask information
print(sample["task"]) # "Collect the fruit"
print(sample["subtask"]) # "Grasp the apple"
print(sample["task_index"]) # tensor(0)
print(sample["subtask_index"]) # tensor(2)
```
### Checking for Subtask Support
You can check if a dataset has subtask annotations:
```python
# Check if subtasks are available
has_subtasks = (
"subtask_index" in dataset.features
and dataset.meta.subtasks is not None
)
if has_subtasks:
print(f"Dataset has {len(dataset.meta.subtasks)} unique subtasks")
print("Subtasks:", list(dataset.meta.subtasks.index))
```
## Using Subtasks for Training
### With the Tokenizer Processor
The `TokenizerProcessor` automatically handles subtask tokenization for Vision-Language Action (VLA) models:
```python
from lerobot.processor.tokenizer_processor import TokenizerProcessor
from lerobot.processor.pipeline import ProcessorPipeline
# Create a tokenizer processor
tokenizer_processor = TokenizerProcessor(
tokenizer_name_or_path="google/paligemma-3b-pt-224",
padding="max_length",
max_length=64,
)
# The processor will automatically tokenize subtasks if present in the batch
# and add them to the observation under:
# - "observation.subtask.tokens"
# - "observation.subtask.attention_mask"
```
When subtasks are available in the batch, the tokenizer processor adds:
- `observation.subtask.tokens`: Tokenized subtask text
- `observation.subtask.attention_mask`: Attention mask for the subtask tokens
### DataLoader with Subtasks
```python
import torch
from lerobot.datasets.lerobot_dataset import LeRobotDataset
dataset = LeRobotDataset("jadechoghari/collect-fruit-annotated")
dataloader = torch.utils.data.DataLoader(
dataset,
batch_size=16,
shuffle=True,
)
for batch in dataloader:
# Access subtask information in the batch
subtasks = batch["subtask"] # List of subtask strings
subtask_indices = batch["subtask_index"] # Tensor of subtask indices
# Use for training hierarchical policies or reward models
print(f"Batch subtasks: {set(subtasks)}")
```
## Example Datasets with Subtask Annotations
Try loading a dataset with subtask annotations:
```python
from lerobot.datasets.lerobot_dataset import LeRobotDataset
# Example dataset with subtask annotations
dataset = LeRobotDataset("jadechoghari/collect-fruit-annotated")
# Explore the subtasks
print("Available subtasks:")
for subtask_name in dataset.meta.subtasks.index:
print(f" - {subtask_name}")
# Get subtask distribution
subtask_counts = {}
for i in range(len(dataset)):
sample = dataset[i]
subtask = sample["subtask"]
subtask_counts[subtask] = subtask_counts.get(subtask, 0) + 1
print("\nSubtask distribution:")
for subtask, count in sorted(subtask_counts.items(), key=lambda x: -x[1]):
print(f" {subtask}: {count} frames")
```
## Use Cases
### 1. Hierarchical Policy Training
Train policies that predict both actions and current subtask:
```python
class HierarchicalPolicy(nn.Module):
def __init__(self, num_subtasks):
super().__init__()
self.action_head = nn.Linear(hidden_dim, action_dim)
self.subtask_head = nn.Linear(hidden_dim, num_subtasks)
def forward(self, observations):
features = self.encoder(observations)
actions = self.action_head(features)
subtask_logits = self.subtask_head(features)
return actions, subtask_logits
```
### 2. Stage-Aware Reward Modeling (SARM)
Build reward models that understand task progression:
```python
# SARM predicts:
# - Stage: Which subtask is being executed (discrete)
# - Progress: How far along the subtask (continuous 0-1)
class SARMRewardModel(nn.Module):
def forward(self, observations):
features = self.encoder(observations)
stage_logits = self.stage_classifier(features)
progress = self.progress_regressor(features)
return stage_logits, progress
```
### 3. Progress Visualization
Monitor robot execution by tracking subtask progression:
```python
def visualize_execution(model, observations):
for t, obs in enumerate(observations):
action, subtask_logits = model(obs)
predicted_subtask = subtask_names[subtask_logits.argmax()]
print(f"t={t}: Executing '{predicted_subtask}'")
```
## API Reference
### LeRobotDataset Properties
| Property | Type | Description |
| --------------------------- | ---------------------- | ------------------------------------------ |
| `meta.subtasks` | `pd.DataFrame \| None` | DataFrame mapping subtask names to indices |
| `features["subtask_index"]` | `dict` | Feature spec for subtask_index if present |
### Sample Keys
When subtasks are available, each sample includes:
| Key | Type | Description |
| --------------- | -------------- | ------------------------------------ |
| `subtask_index` | `torch.Tensor` | Integer index of the current subtask |
| `subtask` | `str` | Natural language subtask description |
## Related Resources
- [SARM Paper](https://arxiv.org/pdf/2509.25358) - Stage-Aware Reward Modeling for Long Horizon Robot Manipulation
- [LeRobot Annotate Space](https://huggingface.co/spaces/lerobot/annotate) - Interactive annotation tool
- [LeRobotDataset v3.0](./lerobot-dataset-v3) - Dataset format documentation
+8 -36
View File
@@ -1,11 +1,5 @@
# EarthRover Mini Plus
<img
src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/Earth_Rover_Mini_5_240c9adc-4f9e-44b7-982f-5d1dc24af1d8.png.webp"
alt="EarthRover Mini Plus"
width="70%"
/>
The EarthRover Mini Plus is a fully open source mobile robot that connects through the cloud using the Frodobots SDK. This lets you control the robot and record datasets for training AI models.
## What You Need
@@ -13,47 +7,28 @@ The EarthRover Mini Plus is a fully open source mobile robot that connects throu
### Hardware
- EarthRover Mini robot
- Computer with Python 3.12 or newer
- Computer with Python 3.10 or newer
- Internet connection
### Setting Up the Frodobots SDK
The robot needs the [Frodobots SDK](https://github.com/frodobots-org/earth-rovers-sdk) running on your computer. Here's how:
The robot needs the [Frodobots SDK](https://github.com/Frodobots/earth-rovers-sdk) running on your computer. Here's how:
1. Download and install the SDK:
```bash
git clone https://github.com/frodobots-org/earth-rovers-sdk.git
git clone https://github.com/Frodobots/earth-rovers-sdk.git
cd earth-rovers-sdk
pip install -r requirements.txt
```
2. Save Credentials:
Write your .env variables with the SDK API key and bot name provided by the Frodobots team.
```bash
SDK_API_TOKEN=your_sdk_api_token_here
BOT_SLUG=your_bot_slug_here
CHROME_EXECUTABLE_PATH=/path/to/chrome_or_chromium
# Default value is MAP_ZOOM_LEVEL=18 https://wiki.openstreetmap.org/wiki/Zoom_levels
MAP_ZOOM_LEVEL=18
MISSION_SLUG=your_mission_slug_here
# Image quality between 0.1 and 1.0 (default: 0.8)
# Recommended: 0.8 for better performance
IMAGE_QUALITY=0.8
# Image format: jpeg, png or webp (default: png)
# Recommended: jpeg for better performance and lower bandwidth usage
IMAGE_FORMAT=jpeg
```
3. Start the SDK:
2. Start the SDK:
```bash
hypercorn main:app --reload
```
4. Open your web browser and go to `http://localhost:8000`, then click "Join"
3. Open your web browser and go to `http://localhost:8000`, then click "Join"
The SDK gives you:
@@ -170,13 +145,13 @@ Once you can drive the robot well, you can start recording data to train AI mode
We use Hugging Face to store your data online. First, log in with your token from [Hugging Face settings](https://huggingface.co/settings/tokens):
```bash
hf auth login --token ${HUGGINGFACE_TOKEN} --add-to-git-credential
huggingface-cli login --token ${HUGGINGFACE_TOKEN} --add-to-git-credential
```
Store your Hugging Face username:
```bash
HF_USER=$(hf auth whoami | awk -F': *' 'NR==1 {print $2}')
HF_USER=$(huggingface-cli whoami | head -n 1)
echo $HF_USER
```
@@ -185,16 +160,13 @@ echo $HF_USER
Use the standard recording command:
```bash
lerobot-record \
python src/lerobot/scripts/lerobot_record.py \
--robot.type=earthrover_mini_plus \
--teleop.type=keyboard_rover \
--dataset.repo_id=your_username/dataset_name \
--dataset.num_episodes=2 \
--dataset.fps=10 \
--dataset.single_task="Navigate around obstacles" \
--dataset.streaming_encoding=true \
--dataset.encoder_threads=2 \
# --dataset.vcodec=auto \
--display_data=true
```
+19 -26
View File
@@ -2,32 +2,14 @@
The **EnvHub** feature allows you to load simulation environments directly from the Hugging Face Hub with a single line of code. This unlocks a powerful new model for collaboration: instead of environments being locked away inside monolithic libraries, anyone can publish custom environments and share them with the community.
## What is EnvHub?
## Overview
EnvHub lets you create custom robotics simulation environments with your own robot models and scenarios, and make them easily usable by anyone through the LeRobot framework.
With EnvHub, you can:
EnvHub packages are stored on the Hugging Face Hub, and can be seamlessly pulled and used in your AI robotics projects through LeRobot with a single line of code.
Thanks to EnvHub, you can:
1. **Create and publish environments** to the Hugging Face Hub as Git repositories, and distribute complex physics simulations without packaging hassles
2. **Load environments** dynamically, without installing them as packages
3. **Version and track** environment changes using Git semantics
4. **Discover** new simulation tasks shared by the community
This design means you can go from discovering an interesting environment on the Hub to running experiments in seconds, or create your own custom robot and environment without worrying about dependency conflicts or complex installation procedures.
When you create an EnvHub package, you can build anything you want inside it and use any simulation tool you like: this is your own space to play with. The only requirement is that the package contains an `env.py` file that defines the environment and allows LeRobot to load and use your EnvHub package.
This `env.py` file needs to expose a small API so LeRobot can load and run it. In particular, you must provide a `make_env(n_envs: int = 1, use_async_envs: bool = False)` or `make_env(n_envs: int = 1, use_async_envs: bool = False, cfg: EnvConfig)` function, which is the main entry point for LeRobot. It should return one of:
- A `gym.vector.VectorEnv` (most common)
- A single `gym.Env` (will be automatically wrapped)
- A dict mapping `{suite_name: {task_id: VectorEnv}}` (for multi-task benchmarks)
You can also pass an `EnvConfig` object to `make_env` to configure the environment (e.g. the number of environments, task, camera name, initial states, control mode, episode length, etc.).
Finally, your environment must implement the standard `gym.vector.VectorEnv` interface so it works with LeRobot, including methods like `reset` and `step`.
- Load environments from the Hub instantly
- Share your custom simulation tasks with the community
- Version control your environments using Git
- Distribute complex physics simulations without packaging hassles
## Quick Start
@@ -47,6 +29,17 @@ env = make_env("lerobot/cartpole-env", trust_remote_code=True)
hash for reproducibility and security.
</Tip>
## What is EnvHub?
EnvHub is a framework that allows researchers and developers to:
1. **Publish environments** to the Hugging Face Hub as Git repositories
2. **Load environments** dynamically without installing them as packages
3. **Version and track** environment changes using Git semantics
4. **Discover** new simulation tasks shared by the community
This design means you can go from discovering an interesting environment on the Hub to running experiments in seconds, without worrying about dependency conflicts or complex installation procedures.
## Repository Structure
To make your environment loadable from the Hub, your repository must contain at minimum:
@@ -155,10 +148,10 @@ Upload your repository to Hugging Face:
pip install huggingface_hub
# Login to Hugging Face
hf auth login
huggingface-cli login
# Create a new repository
hf repo create my-org/my-custom-env
huggingface-cli repo create my-custom-env --type space --org my-org
# Initialize git and push
git init
-510
View File
@@ -1,510 +0,0 @@
# NVIDIA IsaacLab Arena & LeRobot
LeRobot EnvHub now supports **GPU-accelerated simulation** with IsaacLab Arena for policy evaluation at scale.
Train and evaluate imitation learning policies with high-fidelity simulation — all integrated into the LeRobot ecosystem.
<img
src="https://huggingface.co/nvidia/isaaclab-arena-envs/resolve/main/assets/Gr1OpenMicrowaveEnvironment.png"
alt="IsaacLab Arena - GR1 Microwave Environment"
style={{ maxWidth: "100%", borderRadius: "8px", marginBottom: "1rem" }}
/>
[IsaacLab Arena](https://github.com/isaac-sim/IsaacLab-Arena) integrates with NVIDIA IsaacLab to provide:
- 🤖 **Humanoid embodiments**: GR1, G1, Galileo with various configurations
- 🎯 **Manipulation & loco-manipulation tasks**: Door opening, pick-and-place, button pressing, and more
- ⚡ **GPU-accelerated rollouts**: Parallel environment execution on NVIDIA GPUs
- 🖼️ **RTX Rendering**: Evaluate vision-based policies with realistic rendering, reflections and refractions
- 📦 **LeRobot-compatible datasets**: Ready for training with GR00T N1x, PI0, SmolVLA, ACT, and Diffusion policies
- 🔄 **EnvHub integration**: Load environments from HuggingFace EnvHub with one line
## Installation
### Prerequisites
Hardware requirements are shared with Isaac Sim, and are detailed in [Isaac Sim Requirements](https://docs.isaacsim.omniverse.nvidia.com/5.1.0/installation/requirements.html).
- NVIDIA GPU with CUDA support
- NVIDIA driver compatible with IsaacSim 5.1.0
- Linux (Ubuntu 22.04 / 24.04)
### Setup
```bash
# 1. Create conda environment
conda create -y -n lerobot-arena python=3.11
conda activate lerobot-arena
conda install -y -c conda-forge ffmpeg=7.1.1
# 2. Install Isaac Sim 5.1.0
pip install "isaacsim[all,extscache]==5.1.0" --extra-index-url https://pypi.nvidia.com
# Accept NVIDIA EULA (required)
export ACCEPT_EULA=Y
export PRIVACY_CONSENT=Y
# 3. Install IsaacLab 2.3.0
git clone https://github.com/isaac-sim/IsaacLab.git
cd IsaacLab
git checkout v2.3.0
./isaaclab.sh -i
cd ..
# 4. Install IsaacLab Arena
git clone https://github.com/isaac-sim/IsaacLab-Arena.git
cd IsaacLab-Arena
git checkout release/0.1.1
pip install -e .
cd ..
# 5. Install LeRobot
git clone https://github.com/huggingface/lerobot.git
cd lerobot
pip install -e .
cd ..
# 6. Install additional dependencies
pip install onnxruntime==1.23.2 lightwheel-sdk==1.0.1 vuer[all]==0.0.70 qpsolvers==4.8.1
pip install numpy==1.26.0 # Isaac Sim 5.1 depends on numpy==1.26.0, this will be fixed in next release
```
## Evaluating Policies
### Pre-trained Policies
The following trained policies are available:
| Policy | Architecture | Task | Link |
| :-------------------------- | :----------- | :------------ | :----------------------------------------------------------------------- |
| pi05-arena-gr1-microwave | PI0.5 | GR1 Microwave | [HuggingFace](https://huggingface.co/nvidia/pi05-arena-gr1-microwave) |
| smolvla-arena-gr1-microwave | SmolVLA | GR1 Microwave | [HuggingFace](https://huggingface.co/nvidia/smolvla-arena-gr1-microwave) |
### Evaluate SmolVLA
```bash
pip install -e ".[smolvla]"
pip install numpy==1.26.0 # revert numpy to version 1.26
```
```bash
lerobot-eval \
--policy.path=nvidia/smolvla-arena-gr1-microwave \
--env.type=isaaclab_arena \
--env.hub_path=nvidia/isaaclab-arena-envs \
--rename_map='{"observation.images.robot_pov_cam_rgb": "observation.images.robot_pov_cam"}' \
--policy.device=cuda \
--env.environment=gr1_microwave \
--env.embodiment=gr1_pink \
--env.object=mustard_bottle \
--env.headless=false \
--env.enable_cameras=true \
--env.video=true \
--env.video_length=10 \
--env.video_interval=15 \
--env.state_keys=robot_joint_pos \
--env.camera_keys=robot_pov_cam_rgb \
--trust_remote_code=True \
--eval.batch_size=1
```
### Evaluate PI0.5
```bash
pip install -e ".[pi]"
pip install numpy==1.26.0 # revert numpy to version 1.26
```
<Tip>PI0.5 requires disabling torch compile for evaluation:</Tip>
```bash
TORCH_COMPILE_DISABLE=1 TORCHINDUCTOR_DISABLE=1 lerobot-eval \
--policy.path=nvidia/pi05-arena-gr1-microwave \
--env.type=isaaclab_arena \
--env.hub_path=nvidia/isaaclab-arena-envs \
--rename_map='{"observation.images.robot_pov_cam_rgb": "observation.images.robot_pov_cam"}' \
--policy.device=cuda \
--env.environment=gr1_microwave \
--env.embodiment=gr1_pink \
--env.object=mustard_bottle \
--env.headless=false \
--env.enable_cameras=true \
--env.video=true \
--env.video_length=15 \
--env.video_interval=15 \
--env.state_keys=robot_joint_pos \
--env.camera_keys=robot_pov_cam_rgb \
--trust_remote_code=True \
--eval.batch_size=1
```
<Tip>
To change the number of parallel environments, use the ```--eval.batch_size```
flag.
</Tip>
### What to Expect
During evaluation, you will see a progress bar showing the running success rate:
```
Stepping through eval batches: 8%|██████▍ | 4/50 [00:45<08:06, 10.58s/it, running_success_rate=25.0%]
```
### Video Recording
To enable video recording during evaluation, add the following flags to your command:
```bash
--env.video=true \
--env.video_length=15 \
--env.video_interval=15
```
For more details on video recording, see the [IsaacLab Recording Documentation](https://isaac-sim.github.io/IsaacLab/main/source/how-to/record_video.html).
<Tip>
When running headless with `--env.headless=true`, you must also enable cameras explicitly for camera enabled environments:
```bash
--env.headless=true --env.enable_cameras=true
```
</Tip>
### Output Directory
Evaluation videos are saved to the output directory with the following structure:
```
outputs/eval/<date>/<timestamp>_<env>_<policy>/videos/<task>_<env_id>/eval_episode_<n>.mp4
```
For example:
```
outputs/eval/2026-01-02/14-38-01_isaaclab_arena_smolvla/videos/gr1_microwave_0/eval_episode_0.mp4
```
## Training Policies
To learn more about training policies with LeRobot, please refer to the training documentation:
- [SmolVLA](./smolvla)
- [Pi0.5](./pi05)
- [GR00T N1.5](./groot)
Sample IsaacLab Arena datasets are available on HuggingFace Hub for experimentation:
| Dataset | Description | Frames |
| :-------------------------------------------------------------------------------------------------------- | :------------------------- | :----- |
| [Arena-GR1-Manipulation-Task](https://huggingface.co/datasets/nvidia/Arena-GR1-Manipulation-Task-v3) | GR1 microwave manipulation | ~4K |
| [Arena-G1-Loco-Manipulation-Task](https://huggingface.co/datasets/nvidia/Arena-G1-Loco-Manipulation-Task) | G1 loco-manipulation | ~4K |
## Environment Configuration
### Full Configuration Options
```python
from lerobot.envs.configs import IsaaclabArenaEnv
config = IsaaclabArenaEnv(
# Environment selection
environment="gr1_microwave", # Task environment
embodiment="gr1_pink", # Robot embodiment
object="power_drill", # Object to manipulate
# Simulation settings
episode_length=300, # Max steps per episode
headless=True, # Run without GUI
device="cuda:0", # GPU device
seed=42, # Random seed
# Observation configuration
state_keys="robot_joint_pos", # State observation keys (comma-separated)
camera_keys="robot_pov_cam_rgb", # Camera observation keys (comma-separated)
state_dim=54, # Expected state dimension
action_dim=36, # Expected action dimension
camera_height=512, # Camera image height
camera_width=512, # Camera image width
enable_cameras=True, # Enable camera observations
# Video recording
video=False, # Enable video recording
video_length=100, # Frames per video
video_interval=200, # Steps between recordings
# Advanced
mimic=False, # Enable mimic mode
teleop_device=None, # Teleoperation device
disable_fabric=False, # Disable fabric optimization
enable_pinocchio=True, # Enable Pinocchio for IK
)
```
### Using Environment Hub directly for advanced usage
Create a file called `test_env_load_arena.py` or [download from the EnvHub](https://huggingface.co/nvidia/isaaclab-arena-envs/blob/main/tests/test_env_load_arena.py):
```python
import logging
from dataclasses import asdict
from pprint import pformat
import torch
import tqdm
from lerobot.configs import parser
from lerobot.configs.eval import EvalPipelineConfig
@parser.wrap()
def main(cfg: EvalPipelineConfig):
"""Run random action rollout for IsaacLab Arena environment."""
logging.info(pformat(asdict(cfg)))
from lerobot.envs.factory import make_env
env_dict = make_env(
cfg.env,
n_envs=cfg.env.num_envs,
trust_remote_code=True,
)
env = next(iter(env_dict.values()))[0]
env.reset()
for _ in tqdm.tqdm(range(cfg.env.episode_length)):
with torch.inference_mode():
actions = env.action_space.sample()
obs, rewards, terminated, truncated, info = env.step(actions)
if terminated.any() or truncated.any():
obs, info = env.reset()
env.close()
if __name__ == "__main__":
main()
```
Run with:
```bash
python test_env_load_arena.py \
--env.environment=g1_locomanip_pnp \
--env.embodiment=gr1_pink \
--env.object=cracker_box \
--env.num_envs=4 \
--env.enable_cameras=true \
--env.seed=1000 \
--env.video=true \
--env.video_length=10 \
--env.video_interval=15 \
--env.headless=false \
--env.hub_path=nvidia/isaaclab-arena-envs \
--env.type=isaaclab_arena
```
## Creating New Environments
First create a new IsaacLab Arena environment by following the [IsaacLab Arena Documentation](https://isaac-sim.github.io/IsaacLab-Arena/release/0.1.1/index.html).
Clone our EnvHub repo:
```bash
git clone https://huggingface.co/nvidia/isaaclab-arena-envs
```
Modify the `example_envs.yaml` file based on your new environment.
[Upload](./envhub#step-3-upload-to-the-hub) your modified repo to HuggingFace EnvHub.
<Tip>
Your IsaacLab Arena environment code must be locally available during
evaluation. Users can clone your environment repository separately, or you can
bundle the environment code and assets directly in your EnvHub repo.
</Tip>
Then, when evaluating, use your new environment:
```bash
lerobot-eval \
--env.hub_path=<your-env-hub-path>/isaaclab-arena-envs \
--env.environment=<your new environment> \
...other flags...
```
We look forward to your contributions!
## Troubleshooting
### CUDA out of memory
Reduce `batch_size` or use a GPU with more VRAM:
```bash
--eval.batch_size=1
```
### EULA not accepted
Set environment variables before running:
```bash
export ACCEPT_EULA=Y
export PRIVACY_CONSENT=Y
```
### Video recording not working
Enable cameras when running headless:
```bash
--env.video=true --env.enable_cameras=true --env.headless=true
```
### Policy output dimension mismatch
Ensure `action_dim` matches your policy:
```bash
--env.action_dim=36
```
### libGLU.so.1 Errors during Isaac Sim initialization
Ensure you have the following dependencies installed, this is likely to happen on headless machines.
```bash
sudo apt update && sudo apt install -y libglu1-mesa libxt6
```
## See Also
- [EnvHub Documentation](./envhub.mdx) - General EnvHub usage
- [IsaacLab Arena GitHub](https://github.com/isaac-sim/IsaacLab-Arena)
- [IsaacLab Documentation](https://isaac-sim.github.io/IsaacLab/)
## Lightwheel LW-BenchHub
[Lightwheel](https://www.lightwheel.ai) is bringing `Lightwheel-Libero-Tasks` and `Lightwheel-RoboCasa-Tasks` with 268 tasks to the LeRobot ecosystem.
LW-BenchHub collects and generates large-scale datasets via teleoperation that comply with the LeRobot specification, enabling out-of-the-box training and evaluation workflows.
With the unified interface provided by EnvHub, developers can quickly build end-to-end experimental pipelines.
### Install
Assuming you followed the [Installation](#installation) steps, you can install LW-BenchHub with:
```bash
conda install pinocchio -c conda-forge -y
pip install numpy==1.26.0 # revert numpy to version 1.26
sudo apt-get install git-lfs && git lfs install
git clone https://github.com/LightwheelAI/lw_benchhub
git lfs pull # Ensure LFS files (e.g., .usd assets) are downloaded
cd lw_benchhub
pip install -e .
```
For more detailed instructions, please refer to the [LW-BenchHub Documentation](https://docs.lightwheel.net/lw_benchhub/usage/Installation).
### Lightwheel Tasks Dataset
LW-BenchHub datasets are available on HuggingFace Hub:
| Dataset | Description | Tasks | Frames |
| :------------------------------------------------------------------------------------------------------------ | :---------------------- | :---- | :----- |
| [Lightwheel-Tasks-X7S](https://huggingface.co/datasets/LightwheelAI/Lightwheel-Tasks-X7S) | X7S LIBERO and RoboCasa | 117 | ~10.3M |
| [Lightwheel-Tasks-Double-Piper](https://huggingface.co/datasets/LightwheelAI/Lightwheel-Tasks-Double-Piper) | Double-Piper LIBERO | 130 | ~6.0M |
| [Lightwheel-Tasks-G1-Controller](https://huggingface.co/datasets/LightwheelAI/Lightwheel-Tasks-G1-Controller) | G1-Controller LIBERO | 62 | ~2.7M |
| [Lightwheel-Tasks-G1-WBC](https://huggingface.co/datasets/LightwheelAI/Lightwheel-Tasks-G1-WBC) | G1-WBC RoboCasa | 32 | ~1.5M |
For training policies, refer to the [Training Policies](#training-policies) section.
### Evaluating Policies
#### Pre-trained Policies
The following trained policies are available:
| Policy | Architecture | Task | Layout | Robot | Link |
| :----------------------- | :----------- | :----------------------------- | :--------- | :-------------- | :------------------------------------------------------------------------------------ |
| smolvla-double-piper-pnp | SmolVLA | L90K1PutTheBlackBowlOnThePlate | libero-1-1 | DoublePiper-Abs | [HuggingFace](https://huggingface.co/LightwheelAI/smolvla-double-piper-pnp/tree/main) |
#### Evaluate SmolVLA
```bash
lerobot-eval \
--policy.path=LightwheelAI/smolvla-double-piper-pnp \
--env.type=isaaclab_arena \
--rename_map='{"observation.images.left_hand_camera_rgb": "observation.images.left_hand", "observation.images.right_hand_camera_rgb": "observation.images.right_hand", "observation.images.first_person_camera_rgb": "observation.images.first_person"}' \
--env.hub_path=LightwheelAI/lw_benchhub_env \
--env.kwargs='{"config_path": "configs/envhub/example.yml"}' \
--trust_remote_code=true \
--env.state_keys=joint_pos \
--env.action_dim=12 \
--env.camera_keys=left_hand_camera_rgb,right_hand_camera_rgb,first_person_camera_rgb \
--policy.device=cuda \
--eval.batch_size=10 \
--eval.n_episodes=100
```
### Environment Configuration
Evaluation can be quickly launched by modifying the `robot`, `task`, and `layout` settings in the configuration file.
#### Full Configuration Options
```yml
# =========================
# Basic Settings
# =========================
disable_fabric: false
device: cuda:0
sensitivity: 1.0
step_hz: 50
enable_cameras: true
execute_mode: eval
episode_length_s: 20.0 # Episode length in seconds, increase if episodes timeout during eval
# =========================
# Robot Settings
# =========================
robot: DoublePiper-Abs # Robot type, DoublePiper-Abs, X7S-Abs, G1-Controller or G1-Controller-DecoupledWBC
robot_scale: 1.0
# =========================
# Task & Scene Settings
# =========================
task: L90K1PutTheBlackBowlOnThePlate # Task name
scene_backend: robocasa
task_backend: robocasa
debug_assets: null
layout: libero-1-1 # Layout and style ID
sources:
- objaverse
- lightwheel
- aigen_objs
object_projects: []
usd_simplify: false
seed: 42
# =========================
# Object Placement Retry Settings
# =========================
max_scene_retry: 4
max_object_placement_retry: 3
resample_objects_placement_on_reset: true
resample_robot_placement_on_reset: true
# =========================
# Replay Configuration Settings
# =========================
replay_cfgs:
add_camera_to_observation: true
render_resolution: [640, 480]
```
### See Also
- [LW-BenchHub GitHub](https://github.com/LightwheelAI/LW-BenchHub)
- [LW-BenchHub Documentation](https://docs.lightwheel.net/lw_benchhub/)
+3 -4
View File
@@ -137,8 +137,7 @@ from lerobot.teleoperators import ( # noqa: F401
Teleoperator,
TeleoperatorConfig,
make_teleoperator_from_config,
so_leader,
bi_so_leader,
so101_leader,
)
from lerobot.utils.robot_utils import precise_sleep
from lerobot.utils.utils import init_logging
@@ -197,7 +196,7 @@ def teleop_loop(teleop: Teleoperator, env: gym.Env, fps: int):
obs, info = env.reset()
dt_s = time.perf_counter() - loop_start
precise_sleep(max(1 / fps - dt_s, 0.0))
precise_sleep(1 / fps - dt_s)
loop_s = time.perf_counter() - loop_start
print(f"\ntime: {loop_s * 1e3:.2f}ms ({1 / loop_s:.0f} Hz)")
@@ -223,7 +222,7 @@ def teleoperate(cfg: TeleoperateConfig):
def main():
teleoperate(TeleoperateConfig(
teleop=so_leader.SO101LeaderConfig(
teleop=so101_leader.SO101LeaderConfig(
port="/dev/ttyACM0",
id='leader',
use_degrees=False,
+4 -13
View File
@@ -12,12 +12,6 @@ Developers and researchers can post-train GR00T N1.5 with their own real or synt
GR00T N1.5 (specifically the GR00T-N1.5-3B model) is built using pre-trained vision and language encoders. It utilizes a flow matching action transformer to model a chunk of actions, conditioned on vision, language, and proprioception.
<img
src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/lerobot-groot-paper1%20(1).png"
alt="An overview of GR00T"
width="80%"
/>
Its strong performance comes from being trained on an expansive and diverse humanoid dataset, which includes:
- Real captured data from robots.
@@ -109,7 +103,7 @@ Once you have trained your model using your parameters you can run inference in
```bash
lerobot-record \
--robot.type=bi_so_follower \
--robot.type=bi_so100_follower \
--robot.left_arm_port=/dev/ttyACM1 \
--robot.right_arm_port=/dev/ttyACM0 \
--robot.id=bimanual_follower \
@@ -120,12 +114,9 @@ lerobot-record \
--display_data=true \
--dataset.repo_id=<user>/eval_groot-bimanual \
--dataset.num_episodes=10 \
--dataset.single_task="Grab and handover the red cube to the other arm" \
--dataset.streaming_encoding=true \
--dataset.encoder_threads=2 \
# --dataset.vcodec=auto \
--policy.path=<user>/groot-bimanual \ # your trained model
--dataset.episode_time_s=30 \
--dataset.single_task="Grab and handover the red cube to the other arm"
--policy.path=<user>/groot-bimanual # your trained model
--dataset.episode_time_s=30
--dataset.reset_time_s=10
```
+5 -11
View File
@@ -224,15 +224,12 @@ lerobot-record \
--teleop.port=/dev/tty.usbmodem1201 \
--teleop.id=right \
--teleop.side=right \
--dataset.repo_id=<USER>/hand_record_test_with_video_data \
--dataset.repo_id=nepyope/hand_record_test_with_video_data \
--dataset.single_task="Hand recording test with video data" \
--dataset.num_episodes=1 \
--dataset.episode_time_s=5 \
--dataset.push_to_hub=true \
--dataset.private=true \
--dataset.streaming_encoding=true \
--dataset.encoder_threads=2 \
# --dataset.vcodec=auto \
--display_data=true
```
@@ -244,7 +241,7 @@ lerobot-replay \
--robot.port=/dev/tty.usbmodem58760432281 \
--robot.id=right \
--robot.side=right \
--dataset.repo_id=<USER>/hand_record_test_with_camera \
--dataset.repo_id=nepyope/hand_record_test_with_camera \
--dataset.episode=0
```
@@ -252,13 +249,13 @@ lerobot-replay \
```bash
lerobot-train \
--dataset.repo_id=<USER>/hand_record_test_with_video_data \
--dataset.repo_id=nepyope/hand_record_test_with_video_data \
--policy.type=act \
--output_dir=outputs/train/hopejr_hand \
--job_name=hopejr \
--policy.device=mps \
--wandb.enable=true \
--policy.repo_id=<USER>/hand_test_policy
--policy.repo_id=nepyope/hand_test_policy
```
### Evaluate
@@ -273,11 +270,8 @@ lerobot-record \
--robot.side=right \
--robot.cameras='{"main": {"type": "opencv", "index_or_path": 0, "width": 640, "height": 480, "fps": 30}}' \
--display_data=false \
--dataset.repo_id=<USER>/eval_hopejr \
--dataset.repo_id=nepyope/eval_hopejr \
--dataset.single_task="Evaluate hopejr hand policy" \
--dataset.num_episodes=10 \
--dataset.streaming_encoding=true \
--dataset.encoder_threads=2 \
# --dataset.vcodec=auto \
--policy.path=outputs/train/hopejr_hand/checkpoints/last/pretrained_model
```
+16 -22
View File
@@ -58,8 +58,8 @@ lerobot-teleoperate \
<!-- prettier-ignore-start -->
```python
from lerobot.teleoperators.so_leader import SO101LeaderConfig, SO101Leader
from lerobot.robots.so_follower import SO101FollowerConfig, SO101Follower
from lerobot.teleoperators.so101_leader import SO101LeaderConfig, SO101Leader
from lerobot.robots.so101_follower import SO101FollowerConfig, SO101Follower
robot_config = SO101FollowerConfig(
port="/dev/tty.usbmodem58760431541",
@@ -159,13 +159,13 @@ We use the Hugging Face hub features for uploading your dataset. If you haven't
Add your token to the CLI by running this command:
```bash
hf auth login --token ${HUGGINGFACE_TOKEN} --add-to-git-credential
huggingface-cli login --token ${HUGGINGFACE_TOKEN} --add-to-git-credential
```
Then store your Hugging Face repository name in a variable:
```bash
HF_USER=$(hf auth whoami | awk -F': *' 'NR==1 {print $2}')
HF_USER=$(hf auth whoami | head -n 1)
echo $HF_USER
```
@@ -185,10 +185,7 @@ lerobot-record \
--display_data=true \
--dataset.repo_id=${HF_USER}/record-test \
--dataset.num_episodes=5 \
--dataset.single_task="Grab the black cube" \
--dataset.streaming_encoding=true \
# --dataset.vcodec=auto \
--dataset.encoder_threads=2
--dataset.single_task="Grab the black cube"
```
</hfoption>
<hfoption id="API example">
@@ -198,9 +195,9 @@ lerobot-record \
from lerobot.cameras.opencv.configuration_opencv import OpenCVCameraConfig
from lerobot.datasets.lerobot_dataset import LeRobotDataset
from lerobot.datasets.utils import hw_to_dataset_features
from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
from lerobot.teleoperators.so_leader.config_so100_leader import SO100LeaderConfig
from lerobot.teleoperators.so_leader.so100_leader import SO100Leader
from lerobot.robots.so100_follower import SO100Follower, SO100FollowerConfig
from lerobot.teleoperators.so100_leader.config_so100_leader import SO100LeaderConfig
from lerobot.teleoperators.so100_leader.so100_leader import SO100Leader
from lerobot.utils.control_utils import init_keyboard_listener
from lerobot.utils.utils import log_say
from lerobot.utils.visualization_utils import init_rerun
@@ -327,7 +324,7 @@ You can look for other LeRobot datasets on the hub by searching for `LeRobot` [t
You can also push your local dataset to the Hub manually, running:
```bash
hf upload ${HF_USER}/record-test ~/.cache/huggingface/lerobot/{repo-id} --repo-type dataset
huggingface-cli upload ${HF_USER}/record-test ~/.cache/huggingface/lerobot/{repo-id} --repo-type dataset
```
#### Record function
@@ -411,8 +408,8 @@ lerobot-replay \
import time
from lerobot.datasets.lerobot_dataset import LeRobotDataset
from lerobot.robots.so_follower.config_so100_follower import SO100FollowerConfig
from lerobot.robots.so_follower.so100_follower import SO100Follower
from lerobot.robots.so100_follower.config_so100_follower import SO100FollowerConfig
from lerobot.robots.so100_follower.so100_follower import SO100Follower
from lerobot.utils.robot_utils import precise_sleep
from lerobot.utils.utils import log_say
@@ -435,7 +432,7 @@ for idx in range(dataset.num_frames):
}
robot.send_action(action)
precise_sleep(max(1.0 / dataset.fps - (time.perf_counter() - t0), 0.0))
precise_sleep(1.0 / dataset.fps - (time.perf_counter() - t0))
robot.disconnect()
```
@@ -491,7 +488,7 @@ If your local computer doesn't have a powerful GPU you could utilize Google Cola
Once training is done, upload the latest checkpoint with:
```bash
hf upload ${HF_USER}/act_so101_test \
huggingface-cli upload ${HF_USER}/act_so101_test \
outputs/train/act_so101_test/checkpoints/last/pretrained_model
```
@@ -499,7 +496,7 @@ You can also upload intermediate checkpoints with:
```bash
CKPT=010000
hf upload ${HF_USER}/act_so101_test${CKPT} \
huggingface-cli upload ${HF_USER}/act_so101_test${CKPT} \
outputs/train/act_so101_test/checkpoints/${CKPT}/pretrained_model
```
@@ -518,9 +515,6 @@ lerobot-record \
--display_data=false \
--dataset.repo_id=${HF_USER}/eval_so100 \
--dataset.single_task="Put lego brick into the transparent box" \
--dataset.streaming_encoding=true \
--dataset.encoder_threads=2 \
# --dataset.vcodec=auto \
# <- Teleop optional if you want to teleoperate in between episodes \
# --teleop.type=so100_leader \
# --teleop.port=/dev/ttyACM0 \
@@ -537,8 +531,8 @@ from lerobot.datasets.lerobot_dataset import LeRobotDataset
from lerobot.datasets.utils import hw_to_dataset_features
from lerobot.policies.act.modeling_act import ACTPolicy
from lerobot.policies.factory import make_pre_post_processors
from lerobot.robots.so_follower.config_so100_follower import SO100FollowerConfig
from lerobot.robots.so_follower.so100_follower import SO100Follower
from lerobot.robots.so100_follower.config_so100_follower import SO100FollowerConfig
from lerobot.robots.so100_follower.so100_follower import SO100Follower
from lerobot.scripts.lerobot_record import record_loop
from lerobot.utils.control_utils import init_keyboard_listener
from lerobot.utils.utils import log_say
+8 -14
View File
@@ -1,20 +1,18 @@
# Installation
This guide uses conda (via miniforge) to manage environments. If you prefer another environment manager (e.g. `uv`, `venv`), ensure you have Python >=3.12 and ffmpeg installed with the `libsvtav1` encoder, then skip ahead to [Install LeRobot](#step-3-install-lerobot-).
## Step 1: Install [`miniforge`](https://conda-forge.org/download/)
## Install [`miniforge`](https://conda-forge.org/download/)
```bash
wget "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3-$(uname)-$(uname -m).sh
```
## Step 2: Environment Setup
## Environment Setup
Create a virtual environment with Python 3.12, using conda:
Create a virtual environment with Python 3.10, using conda:
```bash
conda create -y -n lerobot python=3.12
conda create -y -n lerobot python=3.10
```
Then activate your conda environment, you have to do this each time you open a shell to use lerobot:
@@ -40,14 +38,7 @@ conda install ffmpeg -c conda-forge
>
> - _[On Linux only]_ If you want to bring your own ffmpeg: Install [ffmpeg build dependencies](https://trac.ffmpeg.org/wiki/CompilationGuide/Ubuntu#GettheDependencies) and [compile ffmpeg from source with libsvtav1](https://trac.ffmpeg.org/wiki/CompilationGuide/Ubuntu#libsvtav1), and make sure you use the corresponding ffmpeg binary to your install with `which ffmpeg`.
> [!NOTE]
> When installing LeRobot inside WSL (Windows Subsystem for Linux), make sure to install `evdev` with the following command:
>
> ```bash
> conda install evdev -c conda-forge
> ```
## Step 3: Install LeRobot 🤗
## Install LeRobot 🤗
### From Source
@@ -90,6 +81,9 @@ _Replace `[...]` with your desired features._
For a full list of optional dependencies, see:
https://pypi.org/project/lerobot/
> [!NOTE]
> For lerobot 0.4.0, if you want to install pi, you will have to do: `pip install "lerobot[pi]@git+https://github.com/huggingface/lerobot.git"`
### Troubleshooting
If you encounter build errors, you may need to install additional dependencies: `cmake`, `build-essential`, and `ffmpeg libs`.
+1 -1
View File
@@ -18,7 +18,7 @@ If you're using Feetech or Dynamixel motors, LeRobot provides built-in bus inter
- [`DynamixelMotorsBus`](https://github.com/huggingface/lerobot/blob/main/src/lerobot/motors/dynamixel/dynamixel.py) for controlling Dynamixel servos
Please refer to the [`MotorsBus`](https://github.com/huggingface/lerobot/blob/main/src/lerobot/motors/motors_bus.py) abstract class to learn about its API.
For a good example of how it can be used, you can have a look at our own [SO101 follower implementation](https://github.com/huggingface/lerobot/blob/main/src/lerobot/robots/so_follower/so101_follower/so101_follower.py)
For a good example of how it can be used, you can have a look at our own [SO101 follower implementation](https://github.com/huggingface/lerobot/blob/main/src/lerobot/robots/so101_follower/so101_follower.py)
Use these if compatible. Otherwise, you'll need to find or write a Python interface (not covered in this tutorial):
+3 -9
View File
@@ -1,11 +1,5 @@
# LeKiwi
<img
src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/1740517739083.jpeg"
alt="LeKiwi"
width="70%"
/>
In the steps below, we explain how to assemble the LeKiwi mobile robot.
## Source the parts
@@ -210,7 +204,7 @@ lerobot-calibrate \
<!-- prettier-ignore-start -->
```python
from lerobot.teleoperators.so_leader import SO100LeaderConfig, SO100Leader
from lerobot.teleoperators.so100_leader import SO100LeaderConfig, SO100Leader
config = SO100LeaderConfig(
port="/dev/tty.usbmodem58760431551",
@@ -279,13 +273,13 @@ We use the Hugging Face hub features for uploading your dataset. If you haven't
Add your token to the CLI by running this command:
```bash
hf auth login --token ${HUGGINGFACE_TOKEN} --add-to-git-credential
huggingface-cli login --token ${HUGGINGFACE_TOKEN} --add-to-git-credential
```
Then store your Hugging Face repository name in a variable:
```bash
HF_USER=$(hf auth whoami | awk -F': *' 'NR==1 {print $2}')
HF_USER=$(huggingface-cli whoami | head -n 1)
echo $HF_USER
```
+1 -4
View File
@@ -41,10 +41,7 @@ lerobot-record \
--display_data=true \
--dataset.repo_id=${HF_USER}/record-test \
--dataset.num_episodes=5 \
--dataset.single_task="Grab the black cube" \
--dataset.streaming_encoding=true \
# --dataset.vcodec=auto \
--dataset.encoder_threads=2
--dataset.single_task="Grab the black cube"
```
See the [recording guide](./il_robots#record-a-dataset) for more details.
-1
View File
@@ -42,7 +42,6 @@ lerobot-eval \
```
- `--env.task` picks the suite (`libero_object`, `libero_spatial`, etc.).
- `--env.task_ids` picks task ids to run (`[0]`, `[1,2,3]`, etc.). Omit this flag (or set it to `null`) to run all tasks in the suite.
- `--eval.batch_size` controls how many environments run in parallel.
- `--eval.n_episodes` sets how many episodes to run in total.
-340
View File
@@ -1,340 +0,0 @@
# Multitask DiT Policy
Multitask Diffusion Transformer (DiT) Policy is an evolution of the original Diffusion Policy architecture, which leverages a large DiT with text and vision conditioning for multitask robot learning. This implementation supports both diffusion and flow matching objectives for action generation, enabling robots to perform diverse manipulation tasks conditioned on language instructions.
## Model Overview
The model uses:
- **CLIP Vision Encoder**: Processes RGB images from multiple camera views
- **CLIP Text Encoder**: Encodes language task instructions (frozen weights with learnable projection)
- **Diffusion Transformer**: Predicts action sequences conditioned on observations and language
- **Two Objectives**: Supports both diffusion (DDPM/DDIM) and flow matching for action generation
This model is exciting because you can achieve extremely high dexterity, competitive with multi-billion parameter
VLAs, with only ~450M parameters and significantly less training.
## Installation Requirements
Multitask DiT Policy has additional dependencies. Install it with:
```bash
pip install lerobot[multi_task_dit]
```
This will install all necessary dependencies including the HuggingFace Transformers library for CLIP models.
## Usage
To use Multitask DiT in your LeRobot configuration, specify the policy type as:
```python
policy.type=multi_task_dit
```
## Training
### Basic Training Command
Here's a complete training command for training Multitask DiT on your dataset:
```bash
lerobot-train \
--dataset.repo_id=YOUR_DATASET \
--output_dir=./outputs/multitask_dit_training \
--batch_size=32 \
--steps=5000 \
--save_freq=500 \
--log_freq=100 \
--policy.type=multi_task_dit \
--policy.device=cuda \
--policy.repo_id="HF_USER/multitask-dit-your-robot" \
--wandb.enable=true
```
### Recommended Hyperparameters and Dataset Details (30Hz Control Frequency)
For reliable performance, start with these suggested default hyperparameters:
```bash
lerobot-train \
--dataset.repo_id=YOUR_DATASET \
--output_dir=./outputs/mutitask_dit_training \
--batch_size=320 \
--steps=30000 \
--policy.type=multi_task_dit \
--policy.device=cuda \
--policy.horizon=32 \
--policy.n_action_steps=24 \
--policy.objective=diffusion \
--policy.noise_scheduler_type=DDPM \
--policy.num_train_timesteps=100 \
--policy.repo_id="HF_USER/multitask-dit-your-robot" \
--wandb.enable=true
```
**Key Parameters:**
- **Batch Size**: 192-320 - If you have access to a GPU that can support this, you will get the best training dynamics
- **Horizon**: 32 - number of action steps to predict, ~1.0 sec at 30Hz
- **n_action_steps**: 24 - ~0.8 seconds at 30Hz
- **Objective**: `diffusion` - start with diffusion and experiment with flow matching if generation quality is poor
- **Training Steps**: >30k steps recommended for a single task
### Training Configuration Parameters
#### Objective Selection
Choose between diffusion and flow matching:
```bash
# Diffusion objective (default)
--policy.objective=diffusion \
--policy.noise_scheduler_type=DDPM \ # or "DDIM"
--policy.num_train_timesteps=100 \
--policy.num_inference_steps=10 \ # For faster inference
--policy.beta_schedule=squaredcos_cap_v2 \ # Noise schedule type
--policy.prediction_type=epsilon \ # "epsilon" (predict noise) or "sample" (predict clean)
--policy.clip_sample=true \ # Clip samples during denoising
--policy.clip_sample_range=1.0 # Clipping range [-x, x]
# Flow matching objective
--policy.objective=flow_matching \
--policy.timestep_sampling_strategy=beta \ # or "uniform" | the beta sampling strategy performance appears much better in practice
--policy.num_integration_steps=100 \
--policy.integration_method=euler \ # or "rk4"
--policy.sigma_min=0.0 # Minimum noise in flow interpolation path
```
#### Transformer Architecture
Adjust model capacity based on dataset size:
```bash
# Small datasets (< 100 examples)
--policy.num_layers=4 \
--policy.hidden_dim=512 \
--policy.num_heads=8 # should ideally be hidden_dim // 64
# Medium datasets (100-5k examples) - default
--policy.num_layers=6 \
--policy.hidden_dim=512 \
--policy.num_heads=8 # should ideally be hidden_dim // 64
# Large datasets (> 5k examples)
--policy.num_layers=8 \
--policy.hidden_dim=512 \
--policy.num_heads=8 # should ideally be hidden_dim // 64
```
**Positional Encoding Options:**
The model supports two positional encoding methods for action sequences:
```bash
# Rotary Position Embedding (RoPE) - default, recommended
--policy.use_rope=true \
--policy.rope_base=10000.0 # Base frequency for RoPE
# Absolute positional encoding
--policy.use_positional_encoding=true # Disables RoPE when true
```
**Other Transformer Parameters:**
```bash
--policy.dropout=0.1 # Dropout rate for DiT blocks (0.0-1.0)
--policy.timestep_embed_dim=256 # Timestep embedding dimension
```
#### Vision Encoder Configuration
```bash
# Use different CLIP model for more expressivity at the cost of inference time
# experiment with larger or smaller models depending on the complexity of your tasks and size of dataset
--policy.vision_encoder_name=openai/clip-vit-large-patch14
# Use separate vision encoder per camera
# This may be useful when cameras have significantly different characteristics, but
# be wary of increased VRAM footprint.
--policy.use_separate_rgb_encoder_per_camera=true
# Image preprocessing
--policy.image_resize_shape=[XXX,YYY] \ # you may need to resize your images for inference speed ups
--policy.image_crop_shape=[224,224] \
--policy.image_crop_is_random=true # Random during training, center at inference
```
#### Text Encoder Configuration
```bash
# Use different CLIP text encoder model
# same as vision: experiment with larger or smaller models depending on the
# complexity of your tasks and size of dataset
--policy.text_encoder_name=openai/clip-vit-large-patch14
```
#### Learning Rate Configuration
The vision encoder uses a separate learning rate multiplier, where 1/10th is suggested to be the ideal staritng point:
```bash
--policy.optimizer_lr=2e-5 \
--policy.vision_encoder_lr_multiplier=0.1 # Vision encoder LR = 0.1 * optimizer_lr
```
### Training Tuning Guidelines
#### 1. Flow Matching with Beta Sampling
The original diffusion implementation here is based on the work described in [TRI's LBM paper](https://arxiv.org/abs/2507.05331)
Additionally, we have implemented a flow-matching objective, which is described at a high-level in [Boston Dynamics blog post](https://bostondynamics.com/blog/large-behavior-models-atlas-find-new-footing/).
Consider testing the flow-matching objective and evaluating performance differences for your task:
```bash
--policy.objective=flow_matching \
--policy.timestep_sampling_strategy=beta \
--policy.timestep_sampling_alpha=1.5 \
--policy.timestep_sampling_beta=1.0 \
--policy.timestep_sampling_s=0.999
```
This hasn't been shown to be a silver bullet across every user case, but it occasionally results in smoother and more consistent actions.
#### 2. Number of Transformer Layers
Match model capacity to your dataset size:
- **Small datasets** (< 100 examples): Reduce to 4 layers
- **Large datasets** (> 5k examples): Increase to 8 layers
#### 3. `horizon` Tuning
The model can be sensitive to the horizon you choose. Start with around a 1 second horizon based on your control frequency:
- **30 Hz frequency**: `horizon=30`
- **10 Hz frequency**: `horizon=10`
Then experiment with increasing from there. The horizon determines how far into the future the model predicts actions.
#### 4. `n_action_steps` Sensitivity
The model can also be very sensitive to `n_action_steps`. Start with it being around 0.8 seconds based on your control frequency and tune from there:
- **Lower values**: More reactive but potentially less stable for long-horizon tasks
- **Higher values**: Better for long-horizon execution but open-loop failures are limited in their recovery
### Inference Tuning
For faster inference, use DDIM with fewer sampling steps:
```bash
--policy.noise_scheduler_type=DDIM \
--policy.num_inference_steps=10
```
### Resuming Training
To resume training from a checkpoint:
```bash
lerobot-train \
--config_path=./outputs/mutitask_dit_training/checkpoints/last/pretrained_model/train_config.json \
--resume=true
```
The checkpoint directory should contain `model.safetensors` and `config.json` files (saved automatically during training). When resuming, the configuration is loaded from the checkpoint, so you don't need to specify other parameters.
## Common Failure Modes and Debugging
Training these models can be finicky. Here are common failure modes and debugging approaches:
### Idling / No Motion
The model may "collapse" during inference, resulting in static or no motion. This can occur when:
1. **Insufficient training data**: If you only have 20-50 examples, try to roughly double your dataset size. Once you have above 300 examples, if you're still seeing this, the task may be too complex.
2. **Multiple similar tasks**: When your dataset contains multiple similar tasks (e.g., picking up 2 different objects), the model may rely too heavily on language conditioning which might not be rich enough.
**Debugging tips:**
- Increase dataset size (double until you get to over 300 examples)
- Train for longer, up to 100k steps, even when the loss flatlines
- Check if the model is receiving proper language instructions or increase diversity of instruction
### Executing the Wrong Task
Sometimes the robot will completely ignore your instruction and perform some other task. This generally only happens if you have trained on multiple tasks.
**Potential causes:**
- Language instruction ambiguity
- Insufficient task-specific training data
- Model confusion between similar tasks in the multitask dataset
**Debugging tips:**
- Verify language instruction specificity, especially if descriptions are similar between multiple tasks
- Check task distribution in your training dataset and add weighting to the failing/ignored task
- Consider task-specific fine-tuning
### Training Instability
If training loss is unstable or diverging:
- Try adjusting learning rate between `1e-5` and `3e-4`
- Increase batch size if possible
- Check that your dataset normalization is correct
- Verify image preprocessing is working correctly
## Performance Considerations
### GPU Requirements
- **Inference**: At least an RTX 5070 Ti (or equivalent GPU) is recommended for reasonable speed performance
- **Training**: A GPU with enough VRAM to load batch sizes of >64 is ideal, which will vary depending on the number of image observations, etc
### Batch Size Recommendations
- **Minimum**: 64 (less than this may result in unstable training)
- **Recommended**: 256-320 (best performance, requires larger GPU)
## Example: Training on Custom Dataset
Here's a complete example training on a custom dataset:
```bash
lerobot-train \
--dataset.repo_id=YOUR_DATASET \
--output_dir=./outputs/mutitask_dit_training \
--batch_size=320 \
--steps=30000 \
--save_freq=1000 \
--log_freq=100 \
--eval_freq=1000 \
--policy.type=multi_task_dit \
--policy.device=cuda \
--policy.horizon=32 \
--policy.n_action_steps=24 \
--policy.objective=diffusion \
--policy.noise_scheduler_type=DDPM \
--policy.num_layers=6 \
--policy.hidden_dim=512 \
--policy.vision_encoder_name=openai/clip-vit-base-patch16 \
--policy.image_resize_shape=[320,240] \
--policy.image_crop_shape=[224,224] \
--policy.repo_id="HF_USER/multitask-dit-your-robot" \
--wandb.enable=true \
--wandb.project=multitask_dit
```
## References
For more details on the technical implementation and architecture, see:
- [A Careful Examination of Large Behavior Models for Multitask Dexterous Manipulation](https://arxiv.org/abs/2507.05331)
- [Large Behavior Models and Atlas Find New Footing](https://bostondynamics.com/blog/large-behavior-models-atlas-find-new-footing/)
- [Dissecting and Open-Sourcing Multitask Diffusion Transformer Policy](https://brysonkjones.substack.com/p/dissecting-and-open-sourcing-multitask-diffusion-transformer-policy)
-197
View File
@@ -1,197 +0,0 @@
## Order and Assemble the parts
First, assemble the OMX hardware following the official assembly guide.
OMX Assembly Guide: https://ai.robotis.com/omx/assembly_guide_omx.html
OMX robots are shipped preconfigured from the factory. Motor IDs, communication parameters, and joint offsets are already set, so no additional motor setup or calibration is required before using LeRobot.
## Install LeRobot 🤗
To install LeRobot, follow our [Installation Guide](./installation)
In addition to these instructions, you need to install the Dynamixel SDK:
```bash
pip install -e ".[dynamixel]"
```
## Connect the robot
To find the port for each bus servo adapter, run this script:
```bash
lerobot-find-port
```
This command runs and when prompted, disconnect the USB cable from either the leader or follower arm and press Enter. The output will show 'The port of this MotorsBus is [port]'. This identifies the port for the disconnected arm. Repeat for the other arm to identify both ports.
<hfoptions id="find_port">
<hfoption id="Mac">
Example output on macOS:
```
Finding all available ports for the MotorBus.
['/dev/tty.usbmodem575E0032081', '/dev/tty.usbmodem575E0031751']
Remove the USB cable from your MotorsBus and press Enter when done.
[...Disconnect corresponding leader or follower arm and press Enter...]
The port of this MotorsBus is /dev/tty.usbmodem575E0032081
Reconnect the USB cable.
```
Where the found port is: `/dev/tty.usbmodem575E0032081` corresponding to your leader or follower arm.
</hfoption>
<hfoption id="Linux">
On Linux, we strongly recommend using udev rules to assign persistent and human-readable device names to the OMX leader and follower arms. This avoids issues where device names such as ttyACM0 and ttyACM1 change when the robot is unplugged, replugged, or when the system is rebooted.
#### 1. Find your device serial numbers
You should have obtained the port numbers like ../../ttyACM? for the leader and follower using `lerobot-find-port`. You can match those results with the serial numbers using the `ls -l /dev/serial/by-id/` command.
To create udev rules, you need the unique serial number for each OMX device. The easiest way is to list devices under:
```bash
ls -l /dev/serial/by-id/
```
You will see output similar to:
```bash
usb-ROBOTIS_OpenRB-150_228BDD7B503059384C2E3120FF0A2B19-if00 -> ../../ttyACM0
usb-ROBOTIS_OpenRB-150_67E1ED68503059384C2E3120FF092234-if00 -> ../../ttyACM1
```
In each line, the serial number is the long string after `usb-ROBOTIS_OpenRB-150_` and before `-if00`.
Follower serial: `228BDD7B503059384C2E3120FF0A2B19`
Leader serial: `67E1ED68503059384C2E3120FF092234`
#### 2. Create the udev rule
Create a new udev rule file:
```bash
sudo nano /etc/udev/rules.d/99-omx.rules
```
Paste the following lines, replacing the serial numbers with the values you found above:
```bash
SUBSYSTEM=="tty", ATTRS{idVendor}=="0403", ATTRS{serial}=="228BDD7B503059384C2E3120FF0A2B19", SYMLINK+="omx_follower"
SUBSYSTEM=="tty", ATTRS{idVendor}=="0403", ATTRS{serial}=="67E1ED68503059384C2E3120FF092234", SYMLINK+="omx_leader"
```
Save the file and reload udev rules:
```bash
sudo udevadm control --reload-rules
sudo udevadm trigger
```
Now unplug and replug both devices once.
#### 3. Verify the symlinks
Check that the persistent device names exist:
```bash
ls -l /dev/omx_follower /dev/omx_leader
```
You should see them pointing to ttyACM\* devices:
```bash
/dev/omx_follower -> ttyACM*
/dev/omx_leader -> ttyACM*
```
These names remain stable across reboots and reconnections.
</hfoption>
</hfoptions>
## Teleoperate
After identifying the correct ports, you can directly teleoperate the follower arm using the leader arm.
<hfoptions id="teleoperate">
<hfoption id="Mac">
### Teleoperate without camera
```bash
lerobot-teleoperate \
--robot.type=omx_follower \
--robot.port=<your_follower_port> \
--robot.id=omx_follower_arm \
--teleop.type=omx_leader \
--teleop.port=<your_leader_port> \
--teleop.id=omx_leader_arm
```
During teleoperation, motions of the leader arm are mirrored in real time by the follower arm. OMX is already preconfigured, teleoperation can begin immediately without any calibration steps.
### Teleoperate with camera
You can also enable camera input during teleoperation by providing a camera configuration for the follower arm.
```bash
lerobot-teleoperate \
--robot.type=omx_follower \
--robot.port=<your_follower_port> \
--robot.id=omx_follower_arm \
--robot.cameras="{front: {type: opencv, index_or_path: '/dev/video0', width: 640, height: 480, fps: 30}}" \
--teleop.type=omx_leader \
--teleop.port=<your_leader_port> \
--teleop.id=omx_leader_arm \
--display_data=true
```
When the camera is enabled, the camera stream is displayed in real time and synchronized with the robot state. This setup is useful for visual monitoring and can be reused later for demonstration recording and imitation learning.
</hfoption>
<hfoption id="Linux">
### Teleoperate without camera
```bash
lerobot-teleoperate \
--robot.type=omx_follower \
--robot.port=/dev/omx_follower \
--robot.id=omx_follower_arm \
--teleop.type=omx_leader \
--teleop.port=/dev/omx_leader \
--teleop.id=omx_leader_arm
```
During teleoperation, motions of the leader arm are mirrored in real time by the follower arm. OMX is already preconfigured, teleoperation can begin immediately without any calibration steps.
### Teleoperate with camera
You can also enable camera input during teleoperation by providing a camera configuration for the follower arm.
```bash
lerobot-teleoperate \
--robot.type=omx_follower \
--robot.port=/dev/omx_follower \
--robot.id=omx_follower_arm \
--robot.cameras="{front: {type: opencv, index_or_path: '/dev/video0', width: 640, height: 480, fps: 30}}" \
--teleop.type=omx_leader \
--teleop.port=/dev/omx_leader \
--teleop.id=omx_leader_arm \
--display_data=true
```
When the camera is enabled, the camera stream is displayed in real time and synchronized with the robot state. This setup is useful for visual monitoring and can be reused later for demonstration recording and imitation learning.
</hfoption>
</hfoptions>
Congrats 🎉, your robot is all set to learn a task on its own.
> If you have any questions or need help, please reach out on [Discord](https://discord.com/invite/robotis).
-276
View File
@@ -1,276 +0,0 @@
# OpenArm
[OpenArm](https://openarm.dev) is an open-source 7DOF humanoid arm designed for physical AI research and deployment.
To get your OpenArm, assembled or DIY, and join the global community, browse verified and certified manufacturers worldwide at [openarm.dev](https://openarm.dev).
## What's Unique?
- **Human-Scale Design**: OpenArm is designed with human-like proportions, scaled for a person around 160-165cm tall. This provides an optimal balance between practical reach and manageable inertia for safe, responsive operation.
- **Safety-First Architecture**: Built with QDD backdrivable motors and high compliance, OpenArm prioritizes safe human-robot interaction while maintaining practical payload capabilities (6.0kg peak / 4.1kg nominal) for real-world tasks.
- **Built for Durability**: Critical structural components use aluminum and stainless steel construction, ensuring robust performance for repetitive data collection and continuous research use.
- **Fully Accessible & Buildable**: Every component, from CNC parts and 3D-printed casings to electrical wiring is designed to be purchasable and buildable by individual researchers and labs, with complete fabrication data provided.
- **Practical & Affordable**: At $6,500 USD for a complete bimanual system, OpenArm delivers research-grade capabilities at a fraction of traditional humanoid robot costs.
## Platform Requirements
<Tip warning={true}>
**Linux Only**: OpenArm currently only works on Linux. The CAN bus USB adapter
does not have macOS drivers and has not been tested on Windows.
</Tip>
## Safety Guide
Before operating OpenArm, please read the [official safety guide](https://docs.openarm.dev/getting-started/safety-guide). Key points:
- **Secure installation**: Fasten the arm to a flat, stable surface with screws or clamps
- **Safe distance**: Keep body parts and objects outside the range of motion during operation
- **Protective equipment**: Always wear safety goggles; use additional PPE as needed
- **Payload limits**: Do not exceed specified payload limits (6.0kg peak / 4.1kg nominal per arm)
- **Emergency stop**: Know the location and operation of the emergency stop device
- **Regular inspection**: Check for loose screws, damaged mechanical limits, unusual noises, and wiring damage
## Hardware Setup
Follow the official [OpenArm hardware documentation](https://docs.openarm.dev) for:
- Bill of materials and sourcing
- 3D printing instructions
- Mechanical assembly
- Electrical wiring
The hardware repositories are available at [github.com/enactic/openarm](https://github.com/enactic/openarm).
## CAN Bus Setup
OpenArm uses CAN bus communication with Damiao motors. Once you have the CAN bus USB adapter plugged into your Linux PC, follow the [Damiao Motors and CAN Bus guide](./damiao) to configure the interface.
Quick setup:
```bash
# Setup CAN interfaces
lerobot-setup-can --mode=setup --interfaces=can0,can1
# Test motor communication
lerobot-setup-can --mode=test --interfaces=can0,can1
```
## Install LeRobot 🤗
Follow our [Installation Guide](./installation), then install the Damiao motor support:
```bash
pip install -e ".[damiao]"
```
## Usage
### Follower Arm (Robot)
<hfoptions id="follower">
<hfoption id="Command">
```bash
lerobot-calibrate \
--robot.type=openarm_follower \
--robot.port=can0 \
--robot.side=right \
--robot.id=my_openarm_follower
```
</hfoption>
<hfoption id="API example">
```python
from lerobot.robots.openarm_follower import OpenArmFollower, OpenArmFollowerConfig
config = OpenArmFollowerConfig(
port="can0",
side="right", # or "left" for left arm
id="my_openarm_follower",
)
follower = OpenArmFollower(config)
follower.connect()
# Read current state
obs = follower.get_observation()
print(obs)
# Send action (position in degrees)
action = {
"joint_1.pos": 0.0,
"joint_2.pos": 0.0,
"joint_3.pos": 0.0,
"joint_4.pos": 45.0,
"joint_5.pos": 0.0,
"joint_6.pos": 0.0,
"joint_7.pos": 0.0,
"gripper.pos": 0.0,
}
follower.send_action(action)
follower.disconnect()
```
</hfoption>
</hfoptions>
### Leader Arm (Teleoperator)
The leader arm is used for teleoperation - manually moving it to control the follower arm.
<hfoptions id="leader">
<hfoption id="Command">
```bash
lerobot-calibrate \
--teleop.type=openarm_leader \
--teleop.port=can1 \
--teleop.id=my_openarm_leader
```
</hfoption>
<hfoption id="API example">
```python
from lerobot.teleoperators.openarm_leader import OpenArmLeader, OpenArmLeaderConfig
config = OpenArmLeaderConfig(
port="can1",
id="my_openarm_leader",
manual_control=True, # Disable torque for manual movement
)
leader = OpenArmLeader(config)
leader.connect()
# Read current position (as action to send to follower)
action = leader.get_action()
print(action)
leader.disconnect()
```
</hfoption>
</hfoptions>
### Teleoperation
To teleoperate OpenArm with leader-follower control:
```bash
lerobot-teleoperate \
--robot.type=openarm_follower \
--robot.port=can0 \
--robot.side=right \
--robot.id=my_follower \
--teleop.type=openarm_leader \
--teleop.port=can1 \
--teleop.id=my_leader
```
### Bimanual Teleoperation
To teleoperate a bimanual OpenArm setup with two leader and two follower arms:
```bash
lerobot-teleoperate \
--robot.type=bi_openarm_follower \
--robot.left_arm_config.port=can0 \
--robot.left_arm_config.side=left \
--robot.right_arm_config.port=can1 \
--robot.right_arm_config.side=right \
--robot.id=my_bimanual_follower \
--teleop.type=bi_openarm_leader \
--teleop.left_arm_config.port=can2 \
--teleop.right_arm_config.port=can3 \
--teleop.id=my_bimanual_leader
```
### Recording Data
To record a dataset during teleoperation:
```bash
lerobot-record \
--robot.type=openarm_follower \
--robot.port=can0 \
--robot.side=right \
--robot.id=my_follower \
--teleop.type=openarm_leader \
--teleop.port=can1 \
--teleop.id=my_leader \
--repo-id=my_hf_username/my_openarm_dataset \
--fps=30 \
--num-episodes=10
```
## Configuration Options
### Follower Configuration
| Parameter | Default | Description |
| --------------------- | --------- | ---------------------------------------------------------- |
| `port` | - | CAN interface (e.g., `can0`) |
| `side` | `None` | Arm side: `"left"`, `"right"`, or `None` for custom limits |
| `use_can_fd` | `True` | Enable CAN FD for higher data rates |
| `can_bitrate` | `1000000` | Nominal bitrate (1 Mbps) |
| `can_data_bitrate` | `5000000` | CAN FD data bitrate (5 Mbps) |
| `max_relative_target` | `None` | Safety limit for relative target positions |
| `position_kp` | Per-joint | Position control proportional gains |
| `position_kd` | Per-joint | Position control derivative gains |
### Leader Configuration
| Parameter | Default | Description |
| ------------------ | --------- | ----------------------------------- |
| `port` | - | CAN interface (e.g., `can1`) |
| `manual_control` | `True` | Disable torque for manual movement |
| `use_can_fd` | `True` | Enable CAN FD for higher data rates |
| `can_bitrate` | `1000000` | Nominal bitrate (1 Mbps) |
| `can_data_bitrate` | `5000000` | CAN FD data bitrate (5 Mbps) |
## Motor Configuration
OpenArm uses Damiao motors with the following default configuration:
| Joint | Motor Type | Send ID | Recv ID |
| --------------------------- | ---------- | ------- | ------- |
| joint_1 (Shoulder pan) | DM8009 | 0x01 | 0x11 |
| joint_2 (Shoulder lift) | DM8009 | 0x02 | 0x12 |
| joint_3 (Shoulder rotation) | DM4340 | 0x03 | 0x13 |
| joint_4 (Elbow flex) | DM4340 | 0x04 | 0x14 |
| joint_5 (Wrist roll) | DM4310 | 0x05 | 0x15 |
| joint_6 (Wrist pitch) | DM4310 | 0x06 | 0x16 |
| joint_7 (Wrist rotation) | DM4310 | 0x07 | 0x17 |
| gripper | DM4310 | 0x08 | 0x18 |
## Troubleshooting
### No Response from Motors
1. Check power supply connections
2. Verify CAN wiring (CAN-H, CAN-L, GND)
3. Run diagnostics: `lerobot-setup-can --mode=test --interfaces=can0`
4. See the [Damiao troubleshooting guide](./damiao#troubleshooting) for more details
### CAN Interface Not Found
Ensure the CAN interface is configured:
```bash
ip link show can0
```
## Resources
- [OpenArm Website](https://openarm.dev)
- [OpenArm Documentation](https://docs.openarm.dev)
- [OpenArm GitHub](https://github.com/enactic/openarm)
- [Safety Guide](https://docs.openarm.dev/getting-started/safety-guide)
- [Damiao Motors and CAN Bus](./damiao)
+328
View File
@@ -0,0 +1,328 @@
# OpenArms Robot
OpenArms is a 7 DOF robotic arm with a gripper, designed by [Enactic, Inc.](https://www.enactic.com/) It uses Damiao motors controlled via CAN bus communication and MIT control mode for smooth, precise motion.
## Hardware Overview
- **7 DOF per arm** (14 DOF total for dual arm setup)
- **1 gripper per arm** (2 grippers total)
- **Damiao motors** with 4 different types:
- **DM8009** (DM-J8009P-2EC) for shoulders (J1, J2) - high torque
- **DM4340** for shoulder rotation and elbow (J3, J4)
- **DM4310** (DM-J4310-2EC V1.1) for wrist (J5, J6, J7) and gripper (J8)
- **24V power supply** required
- **CAN interface device**:
- **Linux**: Any SocketCAN-compatible adapter
- **macOS**: CANable, PEAK PCAN-USB, or Kvaser USBcan
- Proper CAN wiring (CANH, CANL, 120Ω termination)
## Motor Configuration
Each arm has the following motor configuration based on the [OpenArm setup guide](https://docs.openarm.dev/software/setup/):
| Joint | Motor | Motor Type | Sender CAN ID | Receiver ID | Description |
|-------|-------|------------|---------------|-------------|-------------|
| J1 | joint_1 | DM8009 | 0x01 | 0x11 | Shoulder pan |
| J2 | joint_2 | DM8009 | 0x02 | 0x12 | Shoulder lift |
| J3 | joint_3 | DM4340 | 0x03 | 0x13 | Shoulder rotation |
| J4 | joint_4 | DM4340 | 0x04 | 0x14 | Elbow flex |
| J5 | joint_5 | DM4310 | 0x05 | 0x15 | Wrist roll |
| J6 | joint_6 | DM4310 | 0x06 | 0x16 | Wrist pitch |
| J7 | joint_7 | DM4310 | 0x07 | 0x17 | Wrist rotation |
| J8 | gripper | DM4310 | 0x08 | 0x18 | Gripper |
For dual arm setups, the left arm uses IDs 0x09-0x10 for joints 1-8 with the same motor types.
## Quick Start
```bash
# Install system dependencies
sudo apt install can-utils iproute2
# Install LeRobot with OpenArms support
pip install -e ".[openarms]"
```
## Setup Guide
### Step 1: Motor ID Configuration
**IMPORTANT**: Before using the robot, motors must be configured with the correct CAN IDs.
Refer to the [OpenArm Motor ID Configuration Guide](https://docs.openarm.dev/software/setup/motor-id) for detailed instructions using the Damiao Debugging Tools on Windows.
Key points:
- Each motor needs a unique **Sender CAN ID** (0x01-0x08)
- Each motor needs a unique **Receiver/Master ID** (0x11-0x18)
- Use the Damiao Debugging Tools to set these IDs
### Step 2: Setup CAN Interface
Configure your CAN interface as described in the [OpenArm CAN Setup Guide](https://docs.openarm.dev/software/setup/can-setup):
#### Linux (SocketCAN)
```bash
# Find your CAN interface
ip link show
# Configure can0, 1, 2, 3
sudo ip link set can0 down
sudo ip link set can0 type can bitrate 1000000
sudo ip link set can0 up
sudo ip link set can1 down
sudo ip link set can1 type can bitrate 1000000
sudo ip link set can1 up
sudo ip link set can2 down
sudo ip link set can2 type can bitrate 1000000
sudo ip link set can2 up
sudo ip link set can3 down
sudo ip link set can3 type can bitrate 1000000
sudo ip link set can3 up
# Verify configuration
ip link show can0
```
or run:
`examples/openarms/setup_can.sh`
### Testing canbus and motor connection
Please run this script to check if all motors can be found and to find your can-fd speed: `python examples/openarms/debug_can_communication.py`
## Usage
### Basic Setup
```python
from lerobot.robots.openarms import OpenArmsFollower
from lerobot.robots.openarms.config_openarms_follower import OpenArmsFollowerConfig
# Configure for dual arm setup
config = OpenArmsFollowerConfig(
port="can0",
can_interface="socketcan", # Or "auto" for auto-detection
id="openarms_dual",
is_dual_arm=True,
)
robot = OpenArmsFollower(config)
robot.connect()
```
### Calibration
On first use, you'll need to calibrate the robot:
```python
robot.calibrate()
```
The calibration process will:
1. Disable torque on all motors
2. Ask you to position arms in **hanging position with grippers closed**
3. Set this as the zero position
4. Ask you to move each joint through its full range
5. Record min/max positions for each joint
6. Save calibration to file
### Reading Observations
The robot provides comprehensive state information:
```python
observation = robot.get_observation()
# Observation includes for each motor:
# - {motor_name}.pos: Position in degrees
# - {motor_name}.vel: Velocity in degrees/second
# - {motor_name}.torque: Motor torque
# - {camera_name}: Camera images (if configured)
print(f"Right arm joint 1 position: {observation['right_joint_1.pos']:.1f}°")
print(f"Right arm joint 1 velocity: {observation['right_joint_1.vel']:.1f}°/s")
print(f"Right arm joint 1 torque: {observation['right_joint_1.torque']:.3f} N·m")
```
### Sending Actions
```python
# Send target positions (in degrees)
action = {
"right_joint_1.pos": 45.0,
"right_joint_2.pos": -30.0,
# ... all joints
"right_gripper.pos": 45.0, # Half-closed
}
actual_action = robot.send_action(action)
```
### Gripper Control
```python
# Open gripper
robot.open_gripper(arm="right")
# Close gripper
robot.close_gripper(arm="right")
```
## Safety Features
### 1. Maximum Relative Target
Limits how far a joint can move in a single command to prevent sudden movements:
```python
config = OpenArmsFollowerConfig(
port="can0",
# Limit all joints to 10 degrees per command
max_relative_target=10.0,
# Or set per-motor limits
max_relative_target={
"right_joint_1": 15.0, # Slower moving joint
"right_joint_2": 10.0,
"right_gripper": 5.0, # Very slow gripper
}
)
```
**How it works**: If current position is 50° and you command 80°, with `max_relative_target=10.0`, the robot will only move to 60° in that step.
### 2. Torque Limits
Control maximum torque output, especially important for grippers and teleoperation:
```python
config = OpenArmsFollowerConfig(
port="can0",
# Gripper torque limit (fraction of motor's max torque)
gripper_torque_limit=0.5, # 50% of max torque
)
```
Lower torque limits prevent damage when gripping delicate objects.
### 3. MIT Control Gains
Control responsiveness and stability via PID-like gains:
```python
config = OpenArmsFollowerConfig(
port="can0",
position_kp=10.0, # Position gain (higher = more responsive)
position_kd=0.5, # Velocity damping (higher = more damped)
)
```
**Guidelines**:
- **For following (robot)**: Higher gains for responsiveness
- `position_kp=10.0`, `position_kd=0.5`
- **For teleoperation (leader)**: Lower gains or disable torque for manual movement
- `manual_control=True` (torque disabled)
### 4. Velocity Limits
Velocity limits are enforced by the Damiao motors based on motor type. For DM4310:
- Max velocity: 30 rad/s ≈ 1718°/s
The motors will automatically limit velocity to safe values.
## Teleoperation
### Leader Arm Setup
The leader arm is moved manually (torque disabled) to generate commands:
```python
from lerobot.teleoperators.openarms import OpenArmsLeader
from lerobot.teleoperators.openarms.config_openarms_leader import OpenArmsLeaderConfig
config = OpenArmsLeaderConfig(
port="can1", # Separate CAN interface for leader
id="openarms_leader",
manual_control=True, # Torque disabled for manual movement
is_dual_arm=True,
)
leader = OpenArmsLeader(config)
leader.connect()
# Read current position as action
action = leader.get_action()
# action contains positions for all joints in degrees
```
### Safety Considerations for Teleoperation
1. **Use separate CAN interfaces** for leader and follower to avoid conflicts
2. **Enable max_relative_target** on follower to smooth abrupt movements
3. **Lower torque limits** on follower to prevent damage from tracking errors
4. **Test with one arm** before enabling dual arm teleoperation
5. **Have emergency stop** ready (power switch or CAN disable)
```python
# Recommended follower config for teleoperation
follower_config = OpenArmsFollowerConfig(
port="can0",
max_relative_target=5.0, # Small steps for smooth following
gripper_torque_limit=0.3, # Low torque for safety
position_kp=5.0, # Lower gains for gentler following
position_kd=0.3,
)
```
## Troubleshooting
### Motor Shaking/Unstable
- **Lower control gains**: Reduce `position_kp` and `position_kd`
- **Check calibration**: Re-run calibration procedure
- **Verify power**: Insufficient current can cause instability
- **Check mechanical**: Loose connections, binding, or damaged components
### CAN Bus Errors
```bash
# Check for errors
ip -s link show can0
# Reset CAN interface
sudo ip link set can0 down
sudo ip link set can0 up
```
### Control Mode
OpenArms uses **MIT control mode** which allows simultaneous control of:
- Position (degrees)
- Velocity (degrees/second)
- Torque (N·m)
- Position gain (Kp)
- Velocity damping (Kd)
### Communication
- **Protocol**: CAN 2.0 at 1 Mbps (or CAN-FD at 5 Mbps)
- **Frame format**: Standard 11-bit IDs
- **Update rate**: Typically 50-100 Hz depending on motor count
- **Latency**: ~10-20ms per motor command
## References
- [OpenArm Official Documentation](https://docs.openarm.dev/)
- [OpenArm Setup Guide](https://docs.openarm.dev/software/setup/)
- [Motor ID Configuration](https://docs.openarm.dev/software/setup/motor-id)
- [CAN Interface Setup](https://docs.openarm.dev/software/setup/can-setup)
- [Motor Communication Test](https://docs.openarm.dev/software/setup/configure-test)
- [Damiao Motor Documentation](https://wiki.seeedstudio.com/damiao_series/)
- [Enactic GitHub](https://github.com/enactic/openarm_can)
-62
View File
@@ -1,62 +0,0 @@
# Parameter efficient fine-tuning with 🤗 PEFT
[🤗 PEFT](https://github.com/huggingface/peft) (Parameter-Efficient Fine-Tuning) is a library for efficiently adapting
large pretrained models such as pre-trained policies (e.g., SmolVLA, π₀, ...) to new tasks without training all
of the model's parameters while yielding comparable performance.
Install the `lerobot[peft]` optional package to enable PEFT support.
To read about all the possible methods of adaption, please refer to the [🤗 PEFT docs](https://huggingface.co/docs/peft/index).
## Training SmolVLA
In this section we'll show you how to train a pre-trained SmolVLA policy with PEFT on the libero dataset.
For brevity we're only training on the `libero_spatial` subset. We will use `lerobot/smolvla_base` as the model
to parameter efficiently fine-tune:
```
lerobot-train \
--policy.path=lerobot/smolvla_base \
--policy.repo_id=your_hub_name/my_libero_smolvla \
--dataset.repo_id=HuggingFaceVLA/libero \
--policy.output_features=null \
--policy.input_features=null \
--policy.optimizer_lr=1e-3 \
--policy.scheduler_decay_lr=1e-4 \
--env.type=libero \
--env.task=libero_spatial \
--steps=100000 \
--batch_size=32 \
--peft.method_type=LORA \
--peft.r=64
```
Note the `--peft.method_type` parameter that let's you select which PEFT method to use. Here we use
[LoRA](https://huggingface.co/docs/peft/main/en/package_reference/lora) (Low-Rank Adapter) which is probably the most
popular fine-tuning method to date. Low-rank adaption means that we only fine-tune a matrix with comparably low rank
instead of the full weight matrix. This rank can be specified using the `--peft.r` parameter. The higher the rank
the closer you get to full fine-tuning
There are more complex methods that have more parameters. These are not yet supported, feel free to raise an issue
if you want to see a specific PEFT method supported.
By default, PEFT will target the `q_proj` and `v_proj` layers of the LM expert in SmolVLA. It will also target the
state and action projection matrices as they are most likely task-dependent. If you need to target different layers
you can use `--peft.target_modules` to specify which layers to target. You can refer to the respective PEFT method's
documentation to see what inputs are supported, (e.g., [LoRA's target_modules documentation](https://huggingface.co/docs/peft/main/en/package_reference/lora#peft.LoraConfig.target_modules)).
Usually a list of suffixes or a regex are supported. For example, to target the MLPs of the `lm_expert` instead of
the `q` and `v` projections, use:
```
--peft.target_modules='(model\.vlm_with_expert\.lm_expert\..*\.(down|gate|up)_proj|.*\.(state_proj|action_in_proj|action_out_proj|action_time_mlp_in|action_time_mlp_out))'
```
In case you need to fully fine-tune a layer instead of just adapting it, you can supply a list of layer suffixes
to the `--peft.full_training_modules` parameter:
```
--peft.full_training_modules=["state_proj"]
```
The learning rate and the scheduled target learning rate can usually be scaled by a factor of 10 compared to the
learning rate used for full fine-tuning (e.g., 1e-4 normal, so 1e-3 using LoRA).
+13 -17
View File
@@ -44,7 +44,7 @@ Modify the examples to use `PhoneOS.IOS` or `PhoneOS.ANDROID` in `PhoneConfig`.
Teleoperation example:
```python
```36:43:examples/phone_so100_teleop.py
from lerobot.teleoperators.phone.config_phone import PhoneConfig, PhoneOS
teleop_config = PhoneConfig(phone_os=PhoneOS.IOS) # or PhoneOS.ANDROID
@@ -66,13 +66,12 @@ Run on of the examples scripts to teleoperate, record a dataset, replay a datase
All scripts assume you configured your robot (e.g., SO-100 follower) and set the correct serial port.
Additionally you need to **copy the URDF of the robot into the examples folder**. For the examples in this tutorial (using SO100/SO101), copy the `SO101` folder from the [SO-ARM100 repo](https://github.com/TheRobotStudio/SO-ARM100/blob/main/Simulation/SO101) into the `examples/phone_to_so100/` directory, so that the URDF file path becomes `examples/phone_to_so100/SO101/so101_new_calib.urdf`.
Additionally you need to **copy the urdf of the robot to the examples folder**. For the examples in this tutorial (Using SO100/SO101) it is highly recommended to use the urdf in the [SO-ARM100 repo](https://github.com/TheRobotStudio/SO-ARM100/blob/main/Simulation/SO101/so101_new_calib.urdf)
- Run this example to teleoperate:
```bash
cd examples/phone_to_so100
python teleoperate.py
python examples/phone_to_so100/teleoperate.py
```
After running the example:
@@ -85,29 +84,26 @@ Additionally you can customize mapping or safety limits by editing the processor
- Run this example to record a dataset, which saves absolute end effector observations and actions:
```bash
cd examples/phone_to_so100
python record.py
python examples/phone_to_so100/record.py
```
- Run this example to replay recorded episodes:
```bash
cd examples/phone_to_so100
python replay.py
python examples/phone_to_so100/replay.py
```
- Run this example to evaluate a pretrained policy:
```bash
cd examples/phone_to_so100
python evaluate.py
python examples/phone_to_so100/evaluate.py
```
### Important pipeline steps and options
- Kinematics are used in multiple steps. We use [Placo](https://github.com/Rhoban/placo) which is a wrapper around Pinocchio for handling our kinematics. We construct the kinematics object by passing the robot's URDF and target frame. We set `target_frame_name` to the gripper frame.
```python
```examples/phone_to_so100/teleoperate.py
kinematics_solver = RobotKinematics(
urdf_path="./SO101/so101_new_calib.urdf",
target_frame_name="gripper_frame_link",
@@ -118,7 +114,7 @@ Additionally you can customize mapping or safety limits by editing the processor
- The `MapPhoneActionToRobotAction` step converts the calibrated phone pose and inputs into target deltas and gripper commands, below is shown what the step outputs.
```python
```src/lerobot/teleoperators/phone/phone_processor.py
action["enabled"] = enabled
action["target_x"] = -pos[1] if enabled else 0.0
action["target_y"] = pos[0] if enabled else 0.0
@@ -131,7 +127,7 @@ Additionally you can customize mapping or safety limits by editing the processor
- The `EEReferenceAndDelta` step converts target deltas to an absolute desired EE pose, storing a reference on enable, the `end_effector_step_sizes` are the step sizes for the EE pose and can be modified to change the motion speed.
```python
```examples/phone_to_so100/teleoperate.py
EEReferenceAndDelta(
kinematics=kinematics_solver,
end_effector_step_sizes={"x": 0.5, "y": 0.5, "z": 0.5},
@@ -142,7 +138,7 @@ Additionally you can customize mapping or safety limits by editing the processor
- The `EEBoundsAndSafety` step clamps EE motion to a workspace and checks for large ee step jumps to ensure safety. The `end_effector_bounds` are the bounds for the EE pose and can be modified to change the workspace. The `max_ee_step_m` are the step limits for the EE pose and can be modified to change the safety limits.
```python
```examples/phone_to_so100/teleoperate.py
EEBoundsAndSafety(
end_effector_bounds={"min": [-1.0, -1.0, -1.0], "max": [1.0, 1.0, 1.0]},
max_ee_step_m=0.10,
@@ -151,7 +147,7 @@ Additionally you can customize mapping or safety limits by editing the processor
- The `GripperVelocityToJoint` step turns a velocitylike gripper input into absolute gripper position using the current measured state. The `speed_factor` is the factor by which the velocity is multiplied.
```python
```examples/phone_to_so100/teleoperate.py
GripperVelocityToJoint(speed_factor=20.0)
```
@@ -161,7 +157,7 @@ We use different IK initial guesses in the kinematic steps. As initial guess eit
- Closed loop (used in record/eval): sets `initial_guess_current_joints=True` so IK starts from the measured joints each frame.
```python
```examples/phone_to_so100/record.py
InverseKinematicsEEToJoints(
kinematics=kinematics_solver,
motor_names=list(robot.bus.motors.keys()),
@@ -171,7 +167,7 @@ We use different IK initial guesses in the kinematic steps. As initial guess eit
- Open loop (used in replay): sets `initial_guess_current_joints=False` so IK continues from the previous IK solution rather than the measured state. This preserves action stability when we replay without feedback.
```python
```examples/phone_to_so100/replay.py
InverseKinematicsEEToJoints(
kinematics=kinematics_solver,
motor_names=list(robot.bus.motors.keys()),
+6 -18
View File
@@ -6,12 +6,6 @@
π₀ represents a breakthrough in robotics as the first general-purpose robot foundation model developed by [Physical Intelligence](https://www.physicalintelligence.company/blog/pi0). Unlike traditional robot programs that are narrow specialists programmed for repetitive motions, π₀ is designed to be a generalist policy that can understand visual inputs, interpret natural language instructions, and control a variety of different robots across diverse tasks.
<img
src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/lerobot-pi0%20(1).png"
alt="An overview of Pi0"
width="85%"
/>
### The Vision for Physical Intelligence
As described by Physical Intelligence, while AI has achieved remarkable success in digital domains, from chess-playing to drug discovery, human intelligence still dramatically outpaces AI in the physical world. To paraphrase Moravec's paradox, winning a game of chess represents an "easy" problem for AI, but folding a shirt or cleaning up a table requires solving some of the most difficult engineering problems ever conceived. π₀ represents a first step toward developing artificial physical intelligence that enables users to simply ask robots to perform any task they want, just like they can with large language models.
@@ -34,6 +28,11 @@ As described by Physical Intelligence, while AI has achieved remarkable success
pip install -e ".[pi]"
```
> [!NOTE]
> For lerobot 0.4.0, if you want to install pi tag, you will have to do: `pip install "lerobot[pi]@git+https://github.com/huggingface/lerobot.git"`.
>
> This will be solved in the next patch release
## Training Data and Capabilities
π₀ is trained on the largest robot interaction dataset to date, combining three key data sources:
@@ -55,7 +54,7 @@ policy.type=pi0
For training π₀, you can use the standard LeRobot training script with the appropriate configuration:
```bash
lerobot-train \
python src/lerobot/scripts/lerobot_train.py \
--dataset.repo_id=your_dataset \
--policy.type=pi0 \
--output_dir=./outputs/pi0_training \
@@ -65,8 +64,6 @@ lerobot-train \
--policy.compile_model=true \
--policy.gradient_checkpointing=true \
--policy.dtype=bfloat16 \
--policy.freeze_vision_encoder=false \
--policy.train_expert_only=false \
--steps=3000 \
--policy.device=cuda \
--batch_size=32
@@ -82,15 +79,6 @@ lerobot-train \
- [lerobot/pi0_base](https://huggingface.co/lerobot/pi0_base)
- [lerobot/pi0_libero](https://huggingface.co/lerobot/pi0_libero) (specifically trained on the Libero dataset)
### Training Parameters Explained
| Parameter | Default | Description |
| ----------------------- | ------- | ------------------------------------------- |
| `freeze_vision_encoder` | `false` | Do not freeze the vision encoder |
| `train_expert_only` | `false` | Do not freeze the VLM, train all parameters |
**💡 Tip**: Setting `train_expert_only=true` freezes the VLM and trains only the action expert and projections, allowing finetuning with reduced memory usage.
## License
This model follows the **Apache 2.0 License**, consistent with the original [OpenPI repository](https://github.com/Physical-Intelligence/openpi).
+6 -12
View File
@@ -36,6 +36,11 @@ This diverse training mixture creates a "curriculum" that enables generalization
pip install -e ".[pi]"
```
> [!NOTE]
> For lerobot 0.4.0, if you want to install pi tag, you will have to do: `pip install "lerobot[pi]@git+https://github.com/huggingface/lerobot.git"`.
>
> This will be solved in the next patch release
## Usage
To use π₀.₅ in your LeRobot configuration, specify the policy type as:
@@ -51,7 +56,7 @@ policy.type=pi05
Here's a complete training command for finetuning the base π₀.₅ model on your own dataset:
```bash
lerobot-train \
python src/lerobot/scripts/lerobot_train.py\
--dataset.repo_id=your_dataset \
--policy.type=pi05 \
--output_dir=./outputs/pi05_training \
@@ -62,8 +67,6 @@ lerobot-train \
--policy.gradient_checkpointing=true \
--wandb.enable=true \
--policy.dtype=bfloat16 \
--policy.freeze_vision_encoder=false \
--policy.train_expert_only=false \
--steps=3000 \
--policy.device=cuda \
--batch_size=32
@@ -79,15 +82,6 @@ lerobot-train \
- [lerobot/pi05_base](https://huggingface.co/lerobot/pi05_base)
- [lerobot/pi05_libero](https://huggingface.co/lerobot/pi05_libero) (specifically trained on the Libero dataset)
### Training Parameters Explained
| Parameter | Default | Description |
| ----------------------- | ------- | ------------------------------------------- |
| `freeze_vision_encoder` | `false` | Do not freeze the vision encoder |
| `train_expert_only` | `false` | Do not freeze the VLM, train all parameters |
**💡 Tip**: Setting `train_expert_only=true` freezes the VLM and trains only the action expert and projections, allowing finetuning with reduced memory usage.
If your dataset is not converted with `quantiles`, you can convert it with the following command:
```bash
-241
View File
@@ -1,241 +0,0 @@
# π₀-FAST (Pi0-FAST)
π₀-FAST is a **Vision-Language-Action model for general robot control** that uses autoregressive next-token prediction to model continuous robot actions.
## Model Overview
π₀-FAST combines the power of Vision-Language Models with a novel action tokenization approach called **FAST (Frequency-space Action Sequence Tokenization)**. This enables training autoregressive VLAs on highly dexterous tasks that are impossible with standard binning-based discretization, while training **up to 5x faster** than diffusion-based approaches like π₀.
<img
src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/lerobot-pifast.png"
alt="An overview of Pi0-FAST"
width="85%"
/>
### Why FAST?
Standard approaches for robot action tokenization use simple per-dimension, per-timestep binning schemes. While passable for simple behaviors, this rapidly breaks down for complex and dexterous skills that require precision and high-frequency control.
FAST solves this by compressing action sequences using signal processing techniques, resulting in a dense sequence of action tokens that can be predicted autoregressively—just like language tokens.
### How FAST Tokenization Works
The FAST tokenizer compresses action sequences through the following steps:
1. **Normalize**: Take a continuous action chunk of shape `(H, D)` where `H` is the horizon and `D` is the action dimension. Normalize using one of the supported normalization methods (Quantiles recommended to handle outliers).
2. **Discrete Cosine Transform (DCT)**: Apply DCT (via scipy) to each action dimension separately. DCT is a compression algorithm commonly used in image and audio codecs (JPEG, MP3).
3. **Quantization**: Round and remove insignificant coefficients for each action dimension, producing a sparse frequency matrix.
4. **Flatten**: Flatten the matrix into a 1D vector, with low-frequency components first.
5. **Byte Pair Encoding (BPE)**: Train a BPE tokenizer to compress the DCT coefficients into dense action tokens, typically achieving **10x compression** over prior tokenization approaches.
This approach can transform **any existing VLM** into a VLA by training it to predict these FAST tokens.
## Installation Requirements
1. Install LeRobot by following our [Installation Guide](./installation).
2. Install π₀-FAST dependencies by running:
```bash
pip install -e ".[pi]"
```
## Training a Custom FAST Tokenizer
You have two options for the FAST tokenizer:
1. **Use the pre-trained tokenizer**: The `lerobot/fast-action-tokenizer` tokenizer was trained on 1M+ real robot action sequences and works as a general-purpose tokenizer.
2. **Train your own tokenizer**: For maximum performance on your specific dataset, you can finetune the tokenizer on your own data.
### Training Your Own Tokenizer
```bash
lerobot-train-tokenizer \
--repo_id "user/my-lerobot-dataset" \
--action_horizon 10 \
--encoded_dims "0:6" \
--vocab_size 1024 \
--scale 10.0 \
--normalization_mode QUANTILES \
--output_dir "./my_fast_tokenizer" \
--push_to_hub \
--hub_repo_id "username/my-action-tokenizer"
```
### Key Tokenizer Parameters
| Parameter | Description | Default |
| ---------------------- | --------------------------------------------------------------------------------- | ------------ |
| `--repo_id` | LeRobot dataset repository ID | Required |
| `--action_horizon` | Number of future actions in each chunk | `10` |
| `--encoded_dims` | Comma-separated dimension ranges to encode (e.g., `"0:6,7:23"`) | `"0:6,7:23"` |
| `--vocab_size` | BPE vocabulary size | `1024` |
| `--scale` | DCT scaling factor for quantization | `10.0` |
| `--normalization_mode` | Normalization mode (`MEAN_STD`, `MIN_MAX`, `QUANTILES`, `QUANTILE10`, `IDENTITY`) | `QUANTILES` |
| `--sample_fraction` | Fraction of chunks to sample per episode | `0.1` |
## Usage
To use π₀-FAST in LeRobot, specify the policy type as:
```python
policy.type=pi0_fast
```
## Training
For training π₀-FAST, you can use the LeRobot training script:
```bash
lerobot-train \
--dataset.repo_id=your_dataset \
--policy.type=pi0_fast \
--output_dir=./outputs/pi0fast_training \
--job_name=pi0fast_training \
--policy.pretrained_path=lerobot/pi0_fast_base \
--policy.dtype=bfloat16 \
--policy.gradient_checkpointing=true \
--policy.chunk_size=10 \
--policy.n_action_steps=10 \
--policy.max_action_tokens=256 \
--steps=100000 \
--batch_size=4 \
--policy.device=cuda
```
### Key Training Parameters
| Parameter | Description | Default |
| -------------------------------------- | -------------------------------------------------- | ------------------------------- |
| `--policy.gradient_checkpointing=true` | Reduces memory usage significantly during training | `false` |
| `--policy.dtype=bfloat16` | Use mixed precision training for efficiency | `float32` |
| `--policy.chunk_size` | Number of action steps to predict (action horizon) | `50` |
| `--policy.n_action_steps` | Number of action steps to execute | `50` |
| `--policy.max_action_tokens` | Maximum number of FAST tokens per action chunk | `256` |
| `--policy.action_tokenizer_name` | FAST tokenizer to use | `lerobot/fast-action-tokenizer` |
| `--policy.compile_model=true` | Enable torch.compile for faster training | `false` |
## Inference
### KV-Caching for Fast Inference
π₀-FAST supports **KV-caching**, a widely used optimization in LLM inference. This caches the key-value pairs from the attention mechanism, avoiding redundant computation during autoregressive decoding.
```python
# KV-caching is enabled by default
policy.use_kv_cache=true
```
### Inference Example
```python
from lerobot.policies.pi0_fast import PI0FastPolicy, PI0FastConfig
# Load the policy
policy = PI0FastPolicy.from_pretrained("your-model-path")
# During inference
actions = policy.predict_action_chunk(batch)
```
## Model Architecture
π₀-FAST uses a PaliGemma-based architecture:
- **Vision Encoder**: SigLIP vision tower for image understanding
- **Language Model**: Gemma 2B for processing language instructions and predicting action tokens
The model takes images, text instructions, and robot state as input, and outputs discrete FAST tokens that are decoded back to continuous actions.
## Configuration Options
| Parameter | Description | Default |
| -------------------- | ----------------------------------------------- | ---------- |
| `paligemma_variant` | VLM backbone variant (`gemma_300m`, `gemma_2b`) | `gemma_2b` |
| `max_state_dim` | Maximum state vector dimension (padded) | `32` |
| `max_action_dim` | Maximum action vector dimension (padded) | `32` |
| `temperature` | Sampling temperature (0.0 for greedy) | `0.0` |
| `max_decoding_steps` | Maximum decoding steps | `256` |
| `use_kv_cache` | Enable KV caching for faster inference | `true` |
## Comparison with π₀
| Feature | π₀ | π₀-FAST |
| --------------------- | ------------------------- | ---------------------------- |
| Action Representation | Flow Matching (Diffusion) | Autoregressive Tokens (FAST) |
| Training Speed | 1x | **5x faster** |
| Dexterity | High | High |
| Inference Method | Iterative Denoising | Autoregressive Decoding |
| KV-Caching | N/A | Supported |
## Reproducing π₀Fast results
We reproduce the results of π₀Fast on the LIBERO benchmark using the LeRobot implementation. We take the LeRobot PiFast base model [lerobot/pi0fast-base](https://huggingface.co/lerobot/pi0fast-base) and finetune for an additional 40kk steps in bfloat16, with batch size of 256 on 8 H100 GPUs using the [HuggingFace LIBERO dataset](https://huggingface.co/datasets/HuggingFaceVLA/libero).
The finetuned model can be found here:
- **π₀Fast LIBERO**: [lerobot/pi0fast-libero](https://huggingface.co/lerobot/pi0fast-libero)
With the following training command:
```bash
lerobot-train \
--dataset.repo_id=lerobot/libero \
--output_dir=outputs/libero_pi0fast \
--job_name=libero_pi0fast \
--policy.path=lerobot/pi0fast_base \
--policy.dtype=bfloat16 \
--steps=100000 \
--save_freq=20000 \
--batch_size=4 \
--policy.device=cuda \
--policy.scheduler_warmup_steps=4000 \
--policy.scheduler_decay_steps=100000 \
--policy.scheduler_decay_lr=1e-5 \
--policy.gradient_checkpointing=true \
--policy.chunk_size=10 \
--policy.n_action_steps=10 \
--policy.max_action_tokens=256 \
--policy.empty_cameras=1 \
```
We then evaluate the finetuned model using the LeRobot LIBERO implementation, by running the following command:
```bash
tasks="libero_object,libero_spatial,libero_goal,libero_10"
lerobot-eval \
--policy.path=lerobot/pi0fast-libero \
--policy.max_action_tokens=256 \
--env.type=libero \
--policy.gradient_checkpointing=false \
--env.task=${tasks} \
--eval.batch_size=1 \
--eval.n_episodes=1 \
--rename_map='{"observation.images.image":"observation.images.base_0_rgb","observation.images.image2":"observation.images.left_wrist_0_rgb"}'
```
**Note:** We set `n_action_steps=10`, similar to the original OpenPI implementation.
### Results
We obtain the following results on the LIBERO benchmark:
| Model | LIBERO Spatial | LIBERO Object | LIBERO Goal | LIBERO 10 | Average |
| ----------- | -------------- | ------------- | ----------- | --------- | -------- |
| **π₀-fast** | 70.0 | 100.0 | 100.0 | 60.0 | **82.5** |
The full evaluation output folder, including videos, is available [here](https://drive.google.com/drive/folders/1HXpwPTRm4hx6g1sF2P7OOqGG0TwPU7LQ?usp=sharing)
## License
This model follows the **Apache 2.0 License**, consistent with the original [OpenPI repository](https://github.com/Physical-Intelligence/openpi).
## References
- [FAST: Efficient Robot Action Tokenization](https://www.physicalintelligence.company/research/fast) - Physical Intelligence Blog
- [OpenPI Repository](https://github.com/Physical-Intelligence/openpi) - Original implementation
- [FAST Tokenizer on Hugging Face](https://huggingface.co/physical-intelligence/fast) - Pre-trained tokenizer
@@ -1,37 +0,0 @@
# Multitask DiT Policy
## Citation
If you use this work, please cite the following works:
```bibtex
@misc{jones2025multitaskditpolicy,
author = {Bryson Jones},
title = {Dissecting and Open-Sourcing Multitask Diffusion Transformer Policy},
year = {2025},
url = {https://brysonkjones.substack.com/p/dissecting-and-open-sourcing-multitask-diffusion-transformer-policy},
note = {Blog post}
}
```
```bibtex
@misc{trilbmteam2025carefulexaminationlargebehaviormodels,
author = {TRI LBM Team},
title = {A Careful Examination of Large Behavior Models for Multitask Dexterous Manipulation},
year = {2025},
eprint = {arXiv:2507.05331},
archivePrefix = {arXiv},
primaryClass = {cs.RO},
url = {https://arxiv.org/abs/2507.05331}
}
```
```bibtex
@misc{bostondynamics2025largebehaviormodelsatlas,
author = {Boston Dynamics and TRI Research Team},
title = {Large Behavior Models and Atlas Find New Footing},
year = {2025},
url = {https://bostondynamics.com/blog/large-behavior-models-atlas-find-new-footing/},
note = {Blog post}
}
```
-45
View File
@@ -1,45 +0,0 @@
# WALL-OSS
This repository contains the Hugging Face port of [**WALL-OSS**](https://x2robot.com/en/research/68bc2cde8497d7f238dde690), a Vision-Language-Action model for cross-embodiment robotic control based on Qwen2.5-VL with flow matching/FAST action prediction.
---
## Model Overview
| Feature | Description |
| ------------------ | ----------------------------------------------------- |
| Base Model | Qwen2.5-VL (Vision-Language Model) |
| Action Prediction | Flow Matching (diffusion) or FAST (discrete tokens) |
| Architecture | Mixture of Experts (MoE) with action-specific routing |
| Multi-Modal Inputs | Vision (images/videos), Language, Proprioception |
---
## Additional Resources
Paper: https://arxiv.org/pdf/2509.11766
Official Repository: https://github.com/X-Square-Robot/wall-x
Hugging Face: https://huggingface.co/x-square-robot
---
## Citation
If you use this work, please cite:
```bibtex
@article{zhai2025igniting,
title = {Igniting VLMs Toward the Embodied Space},
author = {Zhai, Andy and Liu, Brae and Fang, Bruno and Cai, Chalse and Ma, Ellie and Yin, Ethan and Wang, Hao and Zhou, Hugo and Wang, James and Shi, Lights and Liang, Lucy and Wang, Make and Wang, Qian and Gan, Roy and Yu, Ryan and Li, Shalfun and Liu, Starrick and Chen, Sylas and Chen, Vincent and Xu, Zach},
journal = {arXiv preprint arXiv:2509.11766},
year = {2025}
}
```
---
## License
This model follows the **Apache 2.0 License**, consistent with the original [WallX repository](https://github.com/X-Square-Robot/wall-x).
+3 -3
View File
@@ -30,7 +30,7 @@ Each of these pipelines handle different conversions between different action an
Below is an example of the three pipelines that we use in the phone to SO-100 follower examples:
```python
```69:90:examples/phone_so100_record.py
phone_to_robot_ee_pose_processor = RobotProcessorPipeline[RobotAction, RobotAction]( # teleop -> dataset action
steps=[
MapPhoneActionToRobotAction(platform=teleop_config.phone_os),
@@ -84,7 +84,7 @@ Dataset features are determined by the keys saved in the dataset. Each step can
Below is and example of how we declare features with the `transform_features` method in the phone to SO-100 follower examples:
```python
```src/lerobot/robots/so100_follower/robot_kinematic_processor.py
def transform_features(
self, features: dict[PipelineFeatureType, dict[str, PolicyFeature]]
) -> dict[PipelineFeatureType, dict[str, PolicyFeature]]:
@@ -103,7 +103,7 @@ Here we declare what PolicyFeatures we modify in this step, so we know what feat
Below is an example of how we aggregate and merge features in the phone to SO-100 record example:
```python
```121:145:examples/phone_so100_record.py
features=combine_feature_dicts(
# Run the feature contract of the pipelines
# This tells you how the features would look like after the pipeline steps
+21 -42
View File
@@ -38,7 +38,6 @@ docker run --rm -it \
start_rviz:=true start_sdk_server:=true mujoco:=true
```
> [!NOTE]
> If MuJoCo runs slowly (low simulation frequency), append `-e LD_LIBRARY_PATH="/opt/host-libs:$LD_LIBRARY_PATH" \` to the previous command to improve performance:
>
> ```
@@ -142,7 +141,7 @@ If you choose this option but still want to use the VR teleoperation application
First add reachy2 and reachy2_teleoperator to the imports of the record script. Then you can use the following command:
```bash
lerobot-record \
python -m lerobot.record \
--robot.type=reachy2 \
--robot.ip_address=192.168.0.200 \
--robot.id=r2-0000 \
@@ -151,7 +150,6 @@ lerobot-record \
--teleop.type=reachy2_teleoperator \
--teleop.ip_address=192.168.0.200 \
--teleop.with_mobile_base=false \
--robot.with_torso_camera=true \
--dataset.repo_id=pollen_robotics/record_test \
--dataset.single_task="Reachy 2 recording test" \
--dataset.num_episodes=1 \
@@ -159,9 +157,6 @@ lerobot-record \
--dataset.fps=15 \
--dataset.push_to_hub=true \
--dataset.private=true \
--dataset.streaming_encoding=true \
--dataset.encoder_threads=2 \
# --dataset.vcodec=auto \
--display_data=true
```
@@ -170,7 +165,7 @@ lerobot-record \
**Extended setup overview (all options included):**
```bash
lerobot-record \
python -m lerobot.record \
--robot.type=reachy2 \
--robot.ip_address=192.168.0.200 \
--robot.use_external_commands=true \
@@ -182,8 +177,6 @@ lerobot-record \
--robot.with_left_teleop_camera=true \
--robot.with_right_teleop_camera=true \
--robot.with_torso_camera=false \
--robot.camera_width=640 \
--robot.camera_height=480 \
--robot.disable_torque_on_disconnect=false \
--robot.max_relative_target=5.0 \
--teleop.type=reachy2_teleoperator \
@@ -201,9 +194,6 @@ lerobot-record \
--dataset.fps=15 \
--dataset.push_to_hub=true \
--dataset.private=true \
--dataset.streaming_encoding=true \
--dataset.encoder_threads=2 \
# --dataset.vcodec=auto \
--display_data=true
```
@@ -222,10 +212,9 @@ Must be set to true if a compliant Reachy 2 is used to control another one.
From our initial tests, recording **all** joints when only some are moving can reduce model quality with certain policies.
To avoid this, you can exclude specific parts from recording and replay using:
```bash
````
--robot.with_<part>=false
```
```,
with `<part>` being one of : `mobile_base`, `l_arm`, `r_arm", `neck`, `antennas`.
It determine whether the corresponding part is recorded in the observations. True if not set.
@@ -233,60 +222,49 @@ By default, **all parts are recorded**.
The same per-part mechanism is available in `reachy2_teleoperator` as well.
```bash
--teleop.with\_<part>
```
````
--teleop.with\_<part>
```
with `<part>` being one of : `mobile_base`, `l_arm`, `r_arm", `neck`, `antennas`.
Determine whether the corresponding part is recorded in the actions. True if not set.
> **Important:** In a given session, the **enabled parts must match** on both the robot and the teleoperator.
> For example, if the robot runs with `--robot.with_mobile_base=false`, the teleoperator must disable the same part `--teleoperator.with_mobile_base=false`.
For example, if the robot runs with `--robot.with_mobile_base=false`, the teleoperator must disable the same part `--teleoperator.with_mobile_base=false`.
##### Use the relevant cameras
You can do the same for **cameras**. Enable or disable each camera with default parameters using:
You can do the same for **cameras**. By default, only the **teleoperation cameras** are recorded (both `left_teleop_camera` and `right_teleop_camera`). Enable or disable each camera with:
```bash
--robot.with_left_teleop_camera=<true|false> \
--robot.with_right_teleop_camera=<true|false> \
```
--robot.with_left_teleop_camera=<true|false>
--robot.with_right_teleop_camera=<true|false>
--robot.with_torso_camera=<true|false>
```
By default, no camera is recorded, all camera arguments are set to `false`.
If you want to, you can use custom `width` and `height` parameters for Reachy 2's cameras using the `--robot.camera_width` & `--robot.camera_height` argument:
````
```bash
--robot.camera_width=1920 \
--robot.camera_height=1080
```
This will change the resolution of all 3 default robot cameras (enabled by the above bool arguments).
If you want, you can add additional cameras other than the ones in the robot as usual with:
```bash
--robot.cameras="{ extra: {type: opencv, index_or_path: 42, width: 640, height: 480, fps: 30}}" \
```
## Step 2: Replay
Make sure the robot is configured with the same parts as the dataset:
```bash
lerobot-replay \
python -m lerobot.replay \
--robot.type=reachy2 \
--robot.ip_address=192.168.0.200 \
--robot.use_external_commands=false \
--robot.with_mobile_base=false \
--dataset.repo_id=pollen_robotics/record_test \
--dataset.episode=0
```
--display_data=true
````
## Step 3: Train
```bash
lerobot-train \
python -m lerobot.scripts.train \
--dataset.repo_id=pollen_robotics/record_test \
--policy.type=act \
--output_dir=outputs/train/reachy2_test \
@@ -299,9 +277,10 @@ lerobot-train \
## Step 4: Evaluate
```bash
lerobot-eval \
python -m lerobot.record \
--robot.type=reachy2 \
--robot.ip_address=192.168.0.200 \
--display_data=false \
--dataset.repo_id=pollen_robotics/eval_record_test \
--dataset.single_task="Evaluate reachy2 policy" \
--dataset.num_episodes=10 \
-592
View File
@@ -1,592 +0,0 @@
# SARM: Stage-Aware Reward Modeling
SARM (Stage-Aware Reward Modeling) is a video-based reward modeling framework for long-horizon robot manipulation tasks. This guide covers how to train SARM reward models and optionally use them with Reward-Aligned Behavior Cloning (RA-BC).
**Paper**: [SARM: Stage-Aware Reward Modeling for Long Horizon Robot Manipulation](https://arxiv.org/abs/2509.25358)
<img
src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/lerobot-sarm.png"
alt="An overview of SARM"
width="80%"
/>
## Why Reward Models?
Standard behavior cloning treats all demonstration frames equally, but real-world robot datasets are messy. They contain hesitations, corrections, and variable-quality trajectories. Reward models solve this by learning a generalizable notion of **task progress** from demonstrations: given video frames and a task description, they predict how close the robot is to completing the task (0→1). This learned "progress signal" can be used in multiple ways, two promising applications are: (1) **weighted imitation learning** (RA-BC), where high-progress frames receive more weight during policy training, and (2) **reinforcement learning**, where the reward model provides dense rewards for online or offline policy improvement.
## Overview
SARM has following features:
1. **Stage-aware architecture**: Jointly predicts the high-level task stage and fine-grained progress within each stage
2. **Subtask annotations**: Uses natural language subtask annotations to derive consistent progress labels
3. **Temporal proportions**: Computes dataset-level priors (α̅\_k) for each subtask to normalize progress across variable-length demonstrations
SARM trains on a compact **stage+tau** target for each frame:
- **stage**: integer stage index `k ∈ {0, ..., K-1}`
- **τ (tau)**: within-stage progress `τ ∈ [0, 1]`
- **target encoding**: `y = k + τ` (this is what the dataset processor produces)
At inference time (and in downstream RA-BC), SARM converts the raw `k + τ` value into a **normalized progress** in `[0, 1]` using dataset-level **temporal proportions** `α̅_k` (stored in `meta/temporal_proportions_*.json`).
This matches **Formula (2)** from the paper:
```
progress_t = P_{k-1} + α̅_k × τ_t
```
Where:
- `τ_t = (t - s_k) / (e_k - s_k)` is within-subtask normalized time
- `P_{k-1}` is cumulative prior (sum of previous subtask proportions)
- `α̅_k` is the temporal proportion for subtask k
This ensures identical task states map to consistent progress values, even across demonstrations of different lengths.
## Inputs and Targets (What the new code expects)
SARM is trained through its processor (`src/lerobot/policies/sarm/processor_sarm.py`), which:
- **Encodes** images and task text with CLIP (ViT-B/32) into `video_features` and `text_features`
- **Pads/truncates** robot state into `state_features` (up to `max_state_dim`)
- **Builds targets** as `sparse_targets` (and `dense_targets` in `dense_only`/`dual`) using the stage+tau encoding `y = k + τ`
- **Masks rewind frames** using a per-sample `lengths` tensor (rewind is a training-time augmentation)
At minimum, each training sample needs:
- `task` (string): task description
- `policy.image_key` images and `policy.state_key` states from the dataset
---
## Annotation Modes
You can choose from **3 annotation modes** that determine how progress labels are computed:
| Mode | Annotations Required | Heads | Use Case |
| -------------- | -------------------- | ---------------------------- | ------------------------------------------------------------ |
| `single_stage` | None | Sparse only | Simple tasks, quick experiments, no VLM needed |
| `dense_only` | Dense (VLM) | Dual (sparse auto-generated) | Detailed subtask tracking without defining high-level stages |
| `dual` | Sparse + Dense (VLM) | Dual | Full SARM paper setup with both granularities |
### Mode Details
<hfoptions id="mode_explanation">
<hfoption id="single_stage">
**No annotations required.** The entire episode is treated as a single stage called `"task"`, and progress is linear from 0 to 1 over the episode duration.
- **Sparse head**: 1 stage ("task"), linear progress
- **Dense head**: Not used
- **Best for**: Simple tasks, quick experiments, or when VLM annotation is not available
## Set Up Your Environment
1. Install LeRobot by following our [Installation Guide](./installation).
2. Install SARM dependencies by running:
```bash
pip install -e ".[sarm]"
```
Workflow:
```
1. Train SARM → 2. Visualize predictions → 3. (Optional) Train policy with RA-BC
```
</hfoption>
<hfoption id="dense_only">
**Only dense (fine-grained) annotations from a VLM.** The sparse head automatically uses a single `"task"` stage covering the full episode, while the dense head learns detailed subtask progression.
- **Sparse head**: 1 stage ("task"), linear progress (auto-generated)
- **Dense head**: Multiple fine-grained stages from VLM annotations
- **Best for**: When you want detailed subtask tracking but don't need to define high-level stages
Workflow:
```
1. Annotate (dense) → 2. Verify → 3. Train SARM → 4. Visualize → 5. (Optional) Train policy with RA-BC
```
</hfoption>
<hfoption id="dual">
**Both sparse and dense annotations from VLM.** Full dual-head mode as described in the SARM paper, with both high-level (sparse) and fine-grained (dense) stage predictions.
- **Sparse head**: High-level stages from VLM annotations
- **Dense head**: Fine-grained stages from VLM annotations
- **Best for**: Complex multi-stage tasks where both granularities are useful
Workflow:
```
1. Annotate (sparse+dense) → 2. Verify → 3. Train SARM → 4. Visualize → 5. (Optional) Train policy with RA-BC
```
</hfoption>
</hfoptions>
---
## Step 1: Subtask Annotation
<hfoptions id="annotation_mode">
<hfoption id="single_stage">
**No annotation required!** Skip this step entirely. The model will use the episode's task description and compute linear progress automatically.
</hfoption>
<hfoption id="dense_only">
Generate **dense (fine-grained) annotations only** using a VLM. The sparse stage will be auto-generated.
```bash
python src/lerobot/data_processing/sarm_annotations/subtask_annotation.py \
--repo-id your-username/your-dataset \
--dense-only \
--dense-subtasks "Bring robot arms up from starting position,Grab near side and do 1st fold,Grab side and do 2nd fold,Grab side and do 3rd fold to finish folding" \
--video-key observation.images.base \
--num-workers 4 \
--push-to-hub
```
**What gets saved:**
- `meta/temporal_proportions_sparse.json` - Auto-generated sparse proportions (`{"task": 1.0}`)
- `meta/temporal_proportions_dense.json` - Dense temporal proportions
- Per-episode columns in `episodes/*.parquet`:
- `dense_subtask_names`, `dense_subtask_start_frames`, `dense_subtask_end_frames`
- (also time-based columns: `dense_subtask_start_times`, `dense_subtask_end_times`)
</hfoption>
<hfoption id="dual">
Generate **both sparse (high-level) and dense (fine-grained) annotations** using a VLM.
```bash
python src/lerobot/data_processing/sarm_annotations/subtask_annotation.py \
--repo-id your-username/your-dataset \
--sparse-subtasks "Bring arms up from starting position,Fold the towel (3 folds in total)" \
--dense-subtasks "Bring robot arms up from starting position,Grab near side and do 1st fold,Grab side and do 2nd fold,Grab side and do 3rd fold to finish folding" \
--video-key observation.images.base \
--num-workers 4 \
--push-to-hub
```
**What gets saved:**
- `meta/temporal_proportions_sparse.json` - Sparse temporal proportions
- `meta/temporal_proportions_dense.json` - Dense temporal proportions
- Per-episode columns in `episodes/*.parquet`:
- `sparse_subtask_names`, `sparse_subtask_start_frames`, `sparse_subtask_end_frames`
- `dense_subtask_names`, `dense_subtask_start_frames`, `dense_subtask_end_frames`
- (also time-based columns: `*_subtask_start_times`, `*_subtask_end_times`)
</hfoption>
</hfoptions>
### Annotation Arguments
| Argument | Description |
| ---------------------- | ------------------------------------------------------------------------------- |
| `--repo-id` | HuggingFace dataset repository ID |
| `--sparse-subtasks` | Comma-separated list of high-level subtask names |
| `--dense-subtasks` | Comma-separated list of fine-grained subtask names |
| `--dense-only` | Generate only dense annotations (auto-creates sparse "task" stage) |
| `--video-key` | Camera/video key to use (e.g., `observation.images.top`) |
| `--num-workers` | Number of parallel GPU workers (default: 1) |
| `--episodes` | Specific episode indices to annotate (default: all) |
| `--skip-existing` | Skip episodes that already have annotations |
| `--model` | VLM model (default: `Qwen/Qwen3-VL-30B-A3B-Instruct`) |
| `--num-visualizations` | Number of episodes to visualize after annotation (default: 5, set to 0 to skip) |
> **Note**: After annotation completes, 5 episodes are automatically visualized by default. Use `--num-visualizations 0` to skip this step.
---
## Step 2: Verify Annotations
<hfoptions id="verify_mode">
<hfoption id="single_stage">
**No verification needed!** Skip this step.
</hfoption>
<hfoption id="dense_only">
Visualize annotations using the `--visualize-only` flag:
```bash
python src/lerobot/data_processing/sarm_annotations/subtask_annotation.py \
--repo-id your-username/your-dataset \
--visualize-only \
--visualize-type dense \
--num-visualizations 5 \
--video-key observation.images.base \
--output-dir ./subtask_viz
```
</hfoption>
<hfoption id="dual">
Visualize annotations using the `--visualize-only` flag:
```bash
python src/lerobot/data_processing/sarm_annotations/subtask_annotation.py \
--repo-id your-username/your-dataset \
--visualize-only \
--visualize-type both \
--num-visualizations 5 \
--video-key observation.images.base \
--output-dir ./subtask_viz
```
</hfoption>
</hfoptions>
This generates visualizations showing video frames with subtask boundaries overlaid and timeline of subtasks.
### Visualization Arguments
| Argument | Description |
| ---------------------- | -------------------------------------------------------------- |
| `--visualize-only` | Only visualize existing annotations (no generation) |
| `--num-visualizations` | Number of episodes to visualize (default: 5) |
| `--visualize-type` | Type of annotations to visualize: `sparse`, `dense`, or `both` |
**Tip**: If annotations are inaccurate, adjust your subtask descriptions to be more specific and re-run.
---
## Step 3: Train SARM
<hfoptions id="train_mode">
<hfoption id="single_stage">
Train with **no annotations** - uses linear progress from 0 to 1:
```bash
lerobot-train \
--dataset.repo_id=your-username/your-dataset \
--policy.type=sarm \
--policy.annotation_mode=single_stage \
--policy.image_key=observation.images.base \
--output_dir=outputs/train/sarm_single \
--batch_size=32 \
--steps=5000 \
--wandb.enable=true \
--wandb.project=sarm \
--policy.repo_id=your-username/your-model-name
```
</hfoption>
<hfoption id="dense_only">
Train with **dense annotations only** (sparse auto-generated):
```bash
lerobot-train \
--dataset.repo_id=your-username/your-dataset \
--policy.type=sarm \
--policy.annotation_mode=dense_only \
--policy.image_key=observation.images.base \
--output_dir=outputs/train/sarm_dense \
--batch_size=32 \
--steps=5000 \
--wandb.enable=true \
--wandb.project=sarm \
--policy.repo_id=your-username/your-model-name
```
</hfoption>
<hfoption id="dual">
Train with **both sparse and dense annotations**:
```bash
lerobot-train \
--dataset.repo_id=your-username/your-dataset \
--policy.type=sarm \
--policy.annotation_mode=dual \
--policy.image_key=observation.images.base \
--output_dir=outputs/train/sarm_dual \
--batch_size=32 \
--steps=5000 \
--wandb.enable=true \
--wandb.project=sarm \
--policy.repo_id=your-username/your-model-name
```
</hfoption>
</hfoptions>
### Multi-GPU Training
Add `accelerate launch --multi_gpu --num_processes=4` to use multiple GPUs for training.
### Training Arguments
| Argument | Description | Default |
| -------------------------- | ----------------------------------------------------------------- | ------------------------ |
| `--policy.annotation_mode` | `single_stage`, `dense_only`, or `dual` | `single_stage` |
| `--policy.image_key` | Camera key for images | `observation.images.top` |
| `--policy.state_key` | Key for joint states | `observation.state` |
| `--policy.n_obs_steps` | Observation history steps (total obs frames = `n_obs_steps + 1`) | `8` |
| `--policy.frame_gap` | Gap (in frames) between sampled observations (at 30 fps: 30 ≈ 1s) | `30` |
---
## Step 4: Visualize Predictions
Use `compute_rabc_weights.py` with `--visualize-only` to visualize model predictions (and, if available, annotation-derived targets) without writing a parquet file.
<hfoptions id="viz_mode">
<hfoption id="single_stage">
```bash
python src/lerobot/policies/sarm/compute_rabc_weights.py \
--dataset-repo-id your-username/your-dataset \
--reward-model-path your-username/sarm-model \
--visualize-only \
--num-visualizations 5 \
--head-mode sparse \
--output-dir ./sarm_viz
```
</hfoption>
<hfoption id="dense_only">
```bash
python src/lerobot/policies/sarm/compute_rabc_weights.py \
--dataset-repo-id your-username/your-dataset \
--reward-model-path your-username/sarm-model \
--visualize-only \
--num-visualizations 5 \
--head-mode dense \
--output-dir ./sarm_viz
```
</hfoption>
<hfoption id="dual">
```bash
python src/lerobot/policies/sarm/compute_rabc_weights.py \
--dataset-repo-id your-username/your-dataset \
--reward-model-path your-username/sarm-model \
--visualize-only \
--num-visualizations 5 \
--head-mode both \
--output-dir ./sarm_viz
```
</hfoption>
</hfoptions>
The visualization shows:
- **Progress plot**: Predicted progress (and optional annotation-derived “GT” when available and `--stride 1`)
- **Stage probabilities**: Stacked area plot of predicted stage probabilities
- **Sample frames**: Key frames from the episode with progress/stage labels
### Visualization Arguments
| Argument | Description |
| ---------------------- | --------------------------------------------------------- |
| `--visualize-only` | Only visualize predictions (no RABC computation) |
| `--num-visualizations` | Number of episodes to visualize (default: 5) |
| `--head-mode` | SARM head to use: `sparse`, `dense`, or `both` |
| `--stride` | Compute every N frames, interpolate the rest (default: 1) |
---
## Step 5 (Optional): Train Policy with RA-BC
Reward-Aligned Behavior Cloning (RA-BC) uses the trained SARM model to weight training samples based on predicted progress improvement. This requires two steps:
1. **Precompute progress values** for all frames using the trained SARM model
2. **Train policy** with RA-BC weighting using the precomputed values
### How RA-BC Works
For each training sample, RA-BC computes the progress delta:
```
r_i = φ(o_{t+Δ}) - φ(o_t)
```
Where `φ` is the SARM progress prediction and `Δ` is the policy's `chunk_size`. Samples with positive progress (good demonstrations) get higher weights, while samples with negative or zero progress get down-weighted.
The weighting follows **Equations 8-9** from the paper:
- **Soft weight**: `w̃_i = clip((r_i 2σ)) / (4σ + ε), 0, 1)`
- **Final weight**: `w_i = 𝟙{r_i > κ} + 𝟙{0 ≤ r_i ≤ κ} × w̃_i`
### Step 5a: Compute SARM Progress Values
First, run the SARM model on all frames in your dataset to compute progress values:
```bash
python src/lerobot/policies/sarm/compute_rabc_weights.py \
--dataset-repo-id your-username/your-dataset \
--reward-model-path your-username/sarm-model \
--head-mode sparse \
--num-visualizations 5 \
--push-to-hub
```
This script:
- Processes all frames and computes progress values
- Saves progress values to a parquet file next to the dataset on disk (defaults to `<dataset_root>/sarm_progress.parquet`)
- Generates visualizations of the first N episodes (default: 5)
**Arguments:**
| Argument | Description | Default |
| ---------------------- | -------------------------------------------------------------- | ---------- |
| `--reward-model-path` | Path to trained SARM model | (required) |
| `--head-mode` | SARM head to use: `sparse`, `dense`, or `both` | `sparse` |
| `--device` | Device for inference | `cuda` |
| `--visualize-only` | Only visualize predictions (no RA-BC computation) | `false` |
| `--num-visualizations` | Number of episodes to visualize (default: 5, set to 0 to skip) | `5` |
**Output format** (`sarm_progress.parquet`):
| Column | Description |
| ----------------- | ---------------------------------------------- |
| `index` | Global frame index in dataset |
| `episode_index` | Episode number |
| `frame_index` | Local frame index within episode |
| `progress_sparse` | Sparse head progress value [0, 1] |
| `progress_dense` | Dense head progress value [0, 1] (if computed) |
### Step 5b: Train Policy with RA-BC
Once you have the progress file, train your policy with RA-BC weighting. The progress file is auto-detected from the dataset path (`sarm_progress.parquet`). Currently PI0, PI0.5 and SmolVLA are supported with RA-BC:
```bash
lerobot-train \
--dataset.repo_id=your-username/your-dataset \
--policy.type=pi0 \
--use_rabc=true \
--rabc_head_mode=sparse \
--rabc_kappa=0.01 \
--output_dir=outputs/train/policy_rabc \
--batch_size=32 \
--steps=40000
```
The training script automatically:
- Loads the precomputed progress values from the parquet file
- Uses the policy's `chunk_size` to compute progress deltas (Δ)
- Computes sample weights based on progress improvement
- Applies weighted loss during training
**RA-BC Arguments:**
| Argument | Description | Default |
| ---------------------- | ---------------------------------------------------------- | ---------------------------------- |
| `--use_rabc` | Enable RA-BC sample weighting | `false` |
| `--rabc_progress_path` | Path to progress parquet file (auto-detected from dataset) | `sarm_progress.parquet` in dataset |
| `--rabc_head_mode` | Which SARM head's progress to use: `sparse` or `dense` | `sparse` |
| `--rabc_kappa` | Threshold κ for high-quality samples | `0.01` |
### Tuning RA-BC Kappa
The `kappa` parameter is the threshold that determines which samples get full weight (w=1). Understanding how to tune it is critical for RA-BC to work effectively.
**How the weighting works:**
| Condition | Weight |
| ------------------- | ----------------------- |
| `delta > kappa` | 1.0 (hard threshold) |
| `0 ≤ delta ≤ kappa` | Soft weight from Eq. 8 |
| `delta < 0` | 0.0 (negative progress) |
**Diagnosing kappa issues:**
Monitor these WandB metrics during training:
| Metric | Healthy Range | Problem Indicator |
| ------------------ | ------------- | ------------------------- |
| `rabc_mean_weight` | 0.3 - 0.8 | ≈ 1.0 means kappa too low |
| `rabc_delta_mean` | > 0 | Should be positive |
| `rabc_delta_std` | > 0 | Variance in data quality |
**If `rabc_mean_weight ≈ 1.0`:** Your kappa is too low. Most samples have `delta > kappa` and bypass the soft-weighting entirely. RA-BC becomes equivalent to vanilla BC.
**Setting kappa based on your data:**
The default `kappa=0.01` was tuned for the paper's T-shirt folding task (~90s episodes at 30fps). For your dataset, check the logged `rabc_delta_mean` and `rabc_delta_std`:
```
# If delta_mean ≈ 0.03 and delta_std ≈ 0.02:
# Most deltas fall in range [0.01, 0.05]
# Option 1: Set kappa = delta_mean (medium selectivity)
--rabc_kappa=0.03
# Option 2: Set kappa = delta_mean + delta_std (high selectivity)
--rabc_kappa=0.05
# Option 3: Set kappa = delta_mean + 2*delta_std (very selective)
--rabc_kappa=0.07
```
**When RA-BC may not help:**
If your dataset is already high quality (consistent progress across all demonstrations), RA-BC won't provide much benefit since there's nothing to filter.
### Multi-GPU Training with RA-BC
```bash
accelerate launch \
--multi_gpu \
--num_processes=4 \
src/lerobot/scripts/lerobot_train.py \
--dataset.repo_id=your-username/your-dataset \
--policy.type=pi0 \
--use_rabc=true \
--rabc_kappa=0.01 \
--output_dir=outputs/train/policy_rabc \
--batch_size=32 \
--steps=40000
```
---
## Tips & Best Practices
### Choosing a Mode
- **Start with `single_stage`** for quick experiments - no annotation overhead
- Use **`dense_only`** when you want detailed progress tracking but tasks don't have clear high-level stages
- Use **`dual`** for complex tasks where both coarse and fine-grained progress is meaningful
### Annotation Quality
1. **Be specific with subtask names**: Instead of "fold", use "grab near side and fold toward center"
2. **Verify with visualization**: Always check a few episodes before training
3. **Consistent naming**: Use the same subtask names across all episodes
### RA-BC
1. **Train SARM first**: RA-BC quality depends entirely on SARM quality
2. **Monitor `rabc_mean_weight`**: If it's ≈ 1.0, increase kappa (see [Tuning RA-BC Kappa](#tuning-ra-bc-kappa))
---
## Citation
```bibtex
@article{chen2025sarm,
title={SARM: Stage-Aware Reward Modeling for Long Horizon Robot Manipulation},
author={Chen, Qianzhong and Yu, Justin and Schwager, Mac and Abbeel, Pieter and Shentu, Yide and Wu, Philipp},
journal={arXiv preprint arXiv:2509.25358},
year={2025}
}
```
-3
View File
@@ -106,9 +106,6 @@ lerobot-record \
--dataset.repo_id=${HF_USER}/eval_DATASET_NAME_test \ # <- This will be the dataset name on HF Hub
--dataset.episode_time_s=50 \
--dataset.num_episodes=10 \
--dataset.streaming_encoding=true \
--dataset.encoder_threads=2 \
# --dataset.vcodec=auto \
# <- Teleop optional if you want to teleoperate in between episodes \
# --teleop.type=so100_leader \
# --teleop.port=/dev/ttyACM0 \
+4 -4
View File
@@ -103,7 +103,7 @@ lerobot-setup-motors \
<!-- prettier-ignore-start -->
```python
from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
from lerobot.robots.so100_follower import SO100Follower, SO100FollowerConfig
config = SO100FollowerConfig(
port="/dev/tty.usbmodem585A0076841",
@@ -177,7 +177,7 @@ lerobot-setup-motors \
<!-- prettier-ignore-start -->
```python
from lerobot.teleoperators.so_leader import SO100Leader, SO100LeaderConfig
from lerobot.teleoperators.so100_leader import SO100Leader, SO100LeaderConfig
config = SO100LeaderConfig(
port="/dev/tty.usbmodem585A0076841",
@@ -579,7 +579,7 @@ lerobot-calibrate \
<!-- prettier-ignore-start -->
```python
from lerobot.robots.so_follower import SO100FollowerConfig, SO100Follower
from lerobot.robots.so100_follower import SO100FollowerConfig, SO100Follower
config = SO100FollowerConfig(
port="/dev/tty.usbmodem585A0076891",
@@ -617,7 +617,7 @@ lerobot-calibrate \
<!-- prettier-ignore-start -->
```python
from lerobot.teleoperators.so_leader import SO100LeaderConfig, SO100Leader
from lerobot.teleoperators.so100_leader import SO100LeaderConfig, SO100Leader
config = SO100LeaderConfig(
port="/dev/tty.usbmodem58760431551",
+4 -17
View File
@@ -1,18 +1,5 @@
# SO-101
<div style="display: flex; align-items: center; gap: 10px;">
<img
src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/SO101_Follower.webp"
alt="SO-101"
width="60%"
/>
<img
src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/SO101_Leader.webp"
alt="SO-101"
width="60%"
/>
</div>
In the steps below, we explain how to assemble our flagship robot, the SO-101.
## Source the parts
@@ -138,7 +125,7 @@ lerobot-setup-motors \
<!-- prettier-ignore-start -->
```python
from lerobot.robots.so_follower import SO101Follower, SO101FollowerConfig
from lerobot.robots.so101_follower import SO101Follower, SO101FollowerConfig
config = SO101FollowerConfig(
port="/dev/tty.usbmodem585A0076841",
@@ -214,7 +201,7 @@ lerobot-setup-motors \
<!-- prettier-ignore-start -->
```python
from lerobot.teleoperators.so_leader import SO101Leader, SO101LeaderConfig
from lerobot.teleoperators.so101_leader import SO101Leader, SO101LeaderConfig
config = SO101LeaderConfig(
port="/dev/tty.usbmodem585A0076841",
@@ -377,7 +364,7 @@ lerobot-calibrate \
<!-- prettier-ignore-start -->
```python
from lerobot.robots.so_follower import SO101FollowerConfig, SO101Follower
from lerobot.robots.so101_follower import SO101FollowerConfig, SO101Follower
config = SO101FollowerConfig(
port="/dev/tty.usbmodem585A0076891",
@@ -426,7 +413,7 @@ lerobot-calibrate \
<!-- prettier-ignore-start -->
```python
from lerobot.teleoperators.so_leader import SO101LeaderConfig, SO101Leader
from lerobot.teleoperators.so101_leader import SO101LeaderConfig, SO101Leader
config = SO101LeaderConfig(
port="/dev/tty.usbmodem58760431551",
-155
View File
@@ -1,155 +0,0 @@
# Streaming Video Encoding Guide
## 1. Overview
Streaming video encoding eliminates the traditional PNG round-trip during video dataset recording. Instead of:
1. Capture frame -> write PNG to disk -> (at episode end) read PNG's -> encode to MP4 -> delete PNG's
Frames can be encoded in real-time during capture:
1. Capture frame -> queue to encoder thread -> encode to MP4 directly
This makes `save_episode()` near-instant (the video is already encoded by the time the episode ends) and removes the blocking wait that previously occurred between episodes, especially with multiple cameras in long episodes.
## 2. Tuning Parameters
| Parameter | CLI Flag | Type | Default | Description |
| ----------------------- | --------------------------------- | ------------- | ------------- | ----------------------------------------------------------------- |
| `streaming_encoding` | `--dataset.streaming_encoding` | `bool` | `True` | Enable real-time encoding during capture |
| `vcodec` | `--dataset.vcodec` | `str` | `"libsvtav1"` | Video codec. `"auto"` detects best HW encoder |
| `encoder_threads` | `--dataset.encoder_threads` | `int \| None` | `None` (auto) | Threads per encoder instance. `None` will leave the vcoded decide |
| `encoder_queue_maxsize` | `--dataset.encoder_queue_maxsize` | `int` | `60` | Max buffered frames per camera (~2s at 30fps). Consumes RAM |
## 3. Performance Considerations
Streaming encoding means the CPU is encoding video **during** the capture loop, not after. This creates a CPU budget that must be shared between:
- **Control loop** (reading cameras, control the robot, writing non-video data)
- **Encoder threads** (one pool per camera)
- **Rerun visualization** (if enabled)
- **OS and other processes**
### Resolution & Number of Cameras Impact
| Setup | Throughput (px/sec) | CPU Encoding Load | Notes |
| ------------------------- | ------------------- | ----------------- | ------------------------------ |
| 2camsx 640x480x3 @30fps | 55M | Low | Works on most systems |
| 2camsx 1280x720x3 @30fps | 165M | Moderate | Comfortable on modern systems |
| 2camsx 1920x1080x3 @30fps | 373M | High | Requires powerful high-end CPU |
### `encoder_threads` Tuning
This parameter controls how many threads each encoder instance uses internally:
- **Higher values** (e.g., 4-5): Faster encoding, but uses more CPU cores per camera. Good for high-end systems with many cores.
- **Lower values** (e.g., 1-2): Less CPU per camera, freeing cores for capture and visualization. Good for low-res images and capable CPUs.
- **`None` (default)**: Lets the codec decide. Information available in the codec logs.
### Backpressure and Frame Dropping
Each camera has a bounded queue (`encoder_queue_maxsize`, default 60 frames). When the encoder can't keep up:
1. The queue fills up (consuming RAM)
2. New frames are **dropped** (not blocked) — the capture loop continues uninterrupted
3. A warning is logged: `"Encoder queue full for {camera}, dropped N frame(s)"`
4. At episode end, total dropped frames per camera are reported
### Symptoms of Encoder Falling Behind
- **System feels laggy and freezes**: all CPUs are at 100%
- **Dropped frame warnings** in the log or lower frames/FPS than expected in the recorded dataset
- **Choppy robot movement**: If CPU is severely overloaded, even the capture loop may be affected
- **Accumulated rerun lag**: Visualization falls behind real-time
## 4. Hardware-Accelerated Encoding
### When to Use
Use HW encoding when:
- CPU is the bottleneck (dropped frames, choppy robot, rerun lag)
- You have compatible hardware (GPU or dedicated encoder)
- You're recording at high throughput (high resolution or with many cameras)
### Choosing a Codec
| Codec | CPU Usage | File Size | Quality | Notes |
| --------------------- | --------- | -------------- | ------- | ---------------------------------------------------------------- |
| `libsvtav1` (default) | High | Smallest | Best | Default. Best compression but most CPU-intensive |
| `h264` | Medium | ~30-50% larger | Good | Software H.264. Lower CPU |
| HW encoders | Very Low | Largest | Good | Offloads to dedicated hardware. Best for CPU-constrained systems |
### Available HW Encoders
| Encoder | Platform | Hardware | CLI Value |
| ------------------- | ------------- | ------------------------------------------------------------------------------------------------ | ------------------------------------ |
| `h264_videotoolbox` | macOS | Apple Silicon / Intel | `--dataset.vcodec=h264_videotoolbox` |
| `hevc_videotoolbox` | macOS | Apple Silicon / Intel | `--dataset.vcodec=hevc_videotoolbox` |
| `h264_nvenc` | Linux/Windows | NVIDIA GPU | `--dataset.vcodec=h264_nvenc` |
| `hevc_nvenc` | Linux/Windows | NVIDIA GPU | `--dataset.vcodec=hevc_nvenc` |
| `h264_vaapi` | Linux | Intel/AMD GPU | `--dataset.vcodec=h264_vaapi` |
| `h264_qsv` | Linux/Windows | Intel Quick Sync | `--dataset.vcodec=h264_qsv` |
| `auto` | Any | Probes the system for available HW encoders. Falls back to `libsvtav1` if no HW encoder is found | `--dataset.vcodec=auto` |
> [!NOTE]
> In order to use the HW accelerated encoders you might need to upgrade your GPU drivers.
> [!NOTE]
> `libsvtav1` is the default because it provides the best training performance; other vcodecs can reduce CPU usage and be faster, but they typically produce larger files and may affect training time.
## 5. Troubleshooting
| Symptom | Likely Cause | Fix |
| ------------------------------------------------------------------ | -------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| System freezes or choppy robot movement or Rerun visualization lag | CPU starved (100% load usage) | Close other apps, reduce encoding throughput, lower `encoder_threads`, use `h264`, use `display_data=False`. If the CPU continues to be at 100% then it might be insufficient for your setup, consider `--dataset.streaming_encoding=false` or HW encoding (`--dataset.vcodec=auto`) |
| "Encoder queue full" warnings or dropped frames in dataset | Encoder can't keep up (Queue overflow) | If CPU is not at 100%: Increase `encoder_threads`, increase `encoder_queue_maxsize` or use HW encoding (`--dataset.vcodec=auto`). |
| High RAM usage | Queue filling faster than encoding | `encoder_threads` too low or CPU insufficient. Reduce `encoder_queue_maxsize` or use HW encoding |
| Large video files | Using HW encoder or H.264 | Expected trade-off. Switch to `libsvtav1` if CPU allows |
| `save_episode()` still slow | `streaming_encoding` is `False` | Set `--dataset.streaming_encoding=true` |
| Encoder thread crash | Codec not available or invalid settings | Check `vcodec` is installed, try `--dataset.vcodec=auto` |
| Recorded dataset is missing frames | CPU/GPU starvation or occasional load spikes | If ~5% of frames are missing, your system is likely overloaded — follow the recommendations above. If fewer frames are missing (~2%), they are probably due to occasional transient load spikes (often at startup) and can be considered expected. |
## 6. Recommended Configurations
These estimates are conservative; we recommend testing them on your setup—start with a low load and increase it gradually.
### High-End Systems: modern 12+ cores (24+ threads)
A throughput between ~250-500M px/sec should be comfortable in CPU. For even better results try HW encoding if available.
```bash
# 3camsx 1280x720x3 @30fps: Defaults work well. Optionally increase encoder parallelism.
# 2camsx 1920x1080x3 @30fps: Defaults work well. Optionally increase encoder parallelism.
lerobot-record --dataset.encoder_threads=5 ...
# 3camsx 1920x1080x3 @30fps: Might require some tuning.
```
### Mid-Range Systems: modern 8+ cores (16+ threads) or Apple Silicon
A throughput between ~80-300M px/sec should be possible in CPU.
```bash
# 3camsx 640x480x3 @30fps: Defaults work well. Optionally decrease encoder parallelism.
# 2camsx 1280x720x3 @30fps: Defaults work well. Optionally decrease encoder parallelism.
lerobot-record --dataset.encoder_threads=2 ...
# 2camsx 1920x1080x3 @30fps: Might require some tuning.
```
### Low-Resource Systems: modern 4+ cores (8+ threads) or Raspberry Pi 5
On very constrained systems, streaming encoding may compete too heavily with the capture loop. Disabling it falls back to the PNG-based approach where encoding happens between episodes (blocking, but doesn't interfere with capture). Alternatively, record at a lower throughput to reduce both capture and encoding load. Consider also changing codec to `h264` and using batch encoding.
```bash
# 2camsx 640x480x3 @30fps: Requires some tuning.
# Use H.264, disable streaming, consider batching encoding
lerobot-record --dataset.vcodec=h264 --dataset.streaming_encoding=false ...
```
## 7. Closing note
Performance ultimately depends on your exact setup — frames-per-second, resolution, CPU cores and load, available memory, episode length, and the encoder you choose. Always test with your target workload, be mindful about your CPU & system capabilities and tune `encoder_threads`, `encoder_queue_maxsize`, and
`vcodec` reasonably. That said, a common practical configuration (for many applications) is three cameras at 640×480x3 @30fps; this usually runs fine with the default streaming video encoding settings in modern systems. Always verify your recorded dataset is healthy by comparing the video duration to the CLI episode duration and confirming the row count equals FPS × CLI duration.
+33 -132
View File
@@ -1,21 +1,21 @@
# Unitree G1
# Unitree G1 Robot Setup and Control
This guide covers the complete setup process for the Unitree G1 humanoid, from initial connection to running gr00t_wbc locomotion.
## About
## About the Unitree G1
We support both 29 and 23 DOF G1 EDU version. We introduce:
We offer support for both 29 and 23 DOF G1. We introduce:
- **`unitree g1` robot class, handling low level read/write from/to the humanoid**
- **ZMQ socket bridge** for remote communication and camera streaming, allowing for remote policy deployment over wlan, eth or directly on the robot
- **Locomotion policies** from NVIDIA gr00t and Amazon FAR Holosoma
- **Simulation mode** for testing policies without the physical robot in mujoco
- **`unitree g1` robot class, handling low level communication with the humanoid**
- **ZMQ socket bridge** for remote communication over WiFi, allowing one to deploy policies remotely instead of over ethernet or directly on the Orin
- **GR00T locomotion policy** for bipedal walking and balance
- **MuJoCo simulation mode** for testing policies without the physical robot
---
## Connection guide
## Part 1: Connect to Robot over Ethernet
### Step 1: Configure Ethernet Interface
### Step 1: Configure Your Computer's Ethernet Interface
Set a static IP on the same subnet as the robot:
@@ -26,7 +26,7 @@ sudo ip addr add 192.168.123.200/24 dev enp131s0
sudo ip link set enp131s0 up
```
**Note**: The G1's Ethernet IP is fixed at `192.168.123.164`. Your computer must use `192.168.123.x` with x ≠ 164.
**Note**: The robot's Ethernet IP is fixed at `192.168.123.164`. Your computer must use `192.168.123.x` where x ≠ 164.
### Step 2: SSH into the Robot
@@ -35,24 +35,25 @@ ssh unitree@192.168.123.164
# Password: 123
```
You should now be connected to the G1's Orin.
You should now be connected to the robot's onboard computer.
---
## Part 2: Enable WiFi on the Robot
Wlan0 is disabled by default on the G1. To enable it:
Once connected via Ethernet, follow these steps to enable WiFi:
### Step 1: Enable WiFi Hardware
```bash
# Unblock WiFi radio
sudo rfkill unblock wifi
sudo rfkill unblock all
# Bring up wlan0
# Bring up WiFi interface
sudo ip link set wlan0 up
# Enable NetworkManager control of wlan0
# Enable NetworkManager control
sudo nmcli radio wifi on
sudo nmcli device set wlan0 managed yes
sudo systemctl restart NetworkManager
@@ -72,7 +73,7 @@ sudo iptables -A FORWARD -i wlp132s0f0 -o enp131s0 -m state --state RELATED,ESTA
sudo iptables -A FORWARD -i enp131s0 -o wlp132s0f0 -j ACCEPT
```
**On the G1:**
**On the robot:**
```bash
# Add laptop as default gateway
@@ -110,7 +111,7 @@ ssh unitree@<YOUR_ROBOT_IP>
# Password: 123
```
Replace `<YOUR_ROBOT_IP>` with your robot's actual WiFi IP address.
Replace `<YOUR_ROBOT_IP>` with your robot's actual WiFi IP address (e.g., `172.18.129.215`).
---
@@ -123,7 +124,7 @@ SSH into the robot and install LeRobot:
```bash
ssh unitree@<YOUR_ROBOT_IP>
conda create -y -n lerobot python=3.12
conda create -y -n lerobot python=3.10
conda activate lerobot
git clone https://github.com/huggingface/lerobot.git
cd lerobot
@@ -146,14 +147,14 @@ python src/lerobot/robots/unitree_g1/run_g1_server.py
---
## Part 4: Controlling the robot
## Part 4: Running GR00T Locomotion
With the robot server running, you can now control the robot remotely. Let's launch a locomotion policy
With the robot server running, you can now control the robot from your laptop.
### Step 1: Install LeRobot on your machine
```bash
conda create -y -n lerobot python=3.12
conda create -y -n lerobot python=3.10
conda activate lerobot
git clone https://github.com/huggingface/lerobot.git
cd lerobot
@@ -171,134 +172,34 @@ Edit the config file to match your robot's WiFi IP:
robot_ip: str = "<YOUR_ROBOT_IP>" # Replace with your robot's WiFi IP.
```
**Note**: When running directly on the G1 (not remotely), set `robot_ip: str = "127.0.0.1"` instead.
### Step 3: Run the Locomotion Policy
```bash
# Run GR00T locomotion controller
python examples/unitree_g1/gr00t_locomotion.py --repo-id "nepyope/GR00T-WholeBodyControl_g1"
# Run Holosoma locomotion controller
python examples/unitree_g1/holosoma_locomotion.py
```
### Step 4: Control with Remote
- **Left stick**: Forward/backward and left/right movement
- **Right stick**: Rotation
- **R1 button**: Raise waist height
- **R2 button**: Lower waist height
Press `Ctrl+C` to stop the policy.
---
## Running in Simulation Mode (MuJoCo)
## Extra: Running in Simulation Mode (MuJoCo)
You can test policies before deploying on the physical robot using MuJoCo simulation. Set `is_simulation=True` in config or pass `--robot.is_simulation=true` via CLI.
### Calibrate Exoskeleton Teleoperator
```bash
lerobot-calibrate \
--teleop.type=unitree_g1 \
--teleop.left_arm_config.port=/dev/ttyACM1 \
--teleop.right_arm_config.port=/dev/ttyACM0 \
--teleop.id=exo
```
### Teleoperate in Simulation
```bash
lerobot-teleoperate \
--robot.type=unitree_g1 \
--robot.is_simulation=true \
--teleop.type=unitree_g1 \
--teleop.left_arm_config.port=/dev/ttyACM1 \
--teleop.right_arm_config.port=/dev/ttyACM0 \
--teleop.id=exo \
--fps=100
```
### Record Dataset in Simulation
```bash
lerobot-record \
--robot.type=unitree_g1 \
--robot.is_simulation=true \
--robot.cameras='{"global_view": {"type": "zmq", "server_address": "localhost", "port": 5555, "camera_name": "head_camera", "width": 640, "height": 480, "fps": 30}}' \
--teleop.type=unitree_g1 \
--teleop.left_arm_config.port=/dev/ttyACM1 \
--teleop.right_arm_config.port=/dev/ttyACM0 \
--teleop.id=exo \
--dataset.repo_id=your-username/dataset-name \
--dataset.single_task="Test" \
--dataset.num_episodes=2 \
--dataset.episode_time_s=5 \
--dataset.reset_time_s=5 \
--dataset.push_to_hub=true \
--dataset.streaming_encoding=true \
# --dataset.vcodec=auto \
--dataset.encoder_threads=2
```
Example simulation dataset: [nepyope/teleop_test_sim](https://huggingface.co/datasets/nepyope/teleop_test_sim)
---
## Running on Real Robot
Once the robot server is running on the G1 (see Part 3), you can teleoperate and record on the real robot.
### Start the Camera Server
On the robot, start the ZMQ image server:
```bash
python src/lerobot/cameras/zmq/image_server.py
```
Keep this running in a separate terminal for camera streaming during recording.
### Teleoperate Real Robot
```bash
lerobot-teleoperate \
--robot.type=unitree_g1 \
--robot.is_simulation=false \
--teleop.type=unitree_g1 \
--teleop.left_arm_config.port=/dev/ttyACM1 \
--teleop.right_arm_config.port=/dev/ttyACM0 \
--teleop.id=exo \
--fps=100
```
### Record Dataset on Real Robot
```bash
lerobot-record \
--robot.type=unitree_g1 \
--robot.is_simulation=false \
--robot.cameras='{"global_view": {"type": "zmq", "server_address": "172.18.129.215", "port": 5555, "camera_name": "head_camera", "width": 640, "height": 480, "fps": 30}}' \
--teleop.type=unitree_g1 \
--teleop.left_arm_config.port=/dev/ttyACM1 \
--teleop.right_arm_config.port=/dev/ttyACM0 \
--teleop.id=exo \
--dataset.repo_id=your-username/dataset-name \
--dataset.single_task="Test" \
--dataset.num_episodes=2 \
--dataset.episode_time_s=5 \
--dataset.reset_time_s=5 \
--dataset.push_to_hub=true \
--dataset.streaming_encoding=true \
# --dataset.vcodec=auto \
--dataset.encoder_threads=2
```
**Note**: Update `server_address` to match your robot's camera server IP.
Example real robot dataset: [nepyope/teleop_test_real](https://huggingface.co/datasets/nepyope/teleop_test_real)
---
You can now test and develop policies without a physical robot using MuJoCo. to do so set `is_simulation=True` in config.
## Additional Resources
- [Unitree SDK Documentation](https://github.com/unitreerobotics/unitree_sdk2_python)
- [GR00T-WholeBodyControl](https://github.com/NVlabs/GR00T-WholeBodyControl)
- [Holosoma](https://github.com/amazon-far/holosoma)
- [GR00T Policy Repository](https://huggingface.co/nepyope/GR00T-WholeBodyControl_g1)
- [LeRobot Documentation](https://github.com/huggingface/lerobot)
- [Unitree_IL_Lerobot](https://github.com/unitreerobotics/unitree_IL_lerobot)
+6 -38
View File
@@ -12,7 +12,6 @@ LeRobot provides several utilities for manipulating datasets:
4. **Add Features** - Add new features to a dataset
5. **Remove Features** - Remove features from a dataset
6. **Convert to Video** - Convert image-based datasets to video format for efficient storage
7. **Show the Info of Datasets** - Show the summary of datasets information such as number of episode etc.
The core implementation is in `lerobot.datasets.dataset_tools`.
An example script detailing how to use the tools API is available in `examples/dataset/use_dataset_tools.py`.
@@ -96,26 +95,26 @@ Convert an image-based dataset to video format, creating a new LeRobotDataset wh
# Local-only: Save to a custom output directory (no hub push)
lerobot-edit-dataset \
--repo_id lerobot/pusht_image \
--operation.type convert_image_to_video \
--operation.type convert_to_video \
--operation.output_dir /path/to/output/pusht_video
# Save with new repo_id (local storage)
lerobot-edit-dataset \
--repo_id lerobot/pusht_image \
--new_repo_id lerobot/pusht_video \
--operation.type convert_image_to_video
--operation.type convert_to_video
# Convert and push to Hugging Face Hub
lerobot-edit-dataset \
--repo_id lerobot/pusht_image \
--new_repo_id lerobot/pusht_video \
--operation.type convert_image_to_video \
--operation.type convert_to_video \
--push_to_hub true
# Convert with custom video codec and quality settings
lerobot-edit-dataset \
--repo_id lerobot/pusht_image \
--operation.type convert_image_to_video \
--operation.type convert_to_video \
--operation.output_dir outputs/pusht_video \
--operation.vcodec libsvtav1 \
--operation.pix_fmt yuv420p \
@@ -125,23 +124,16 @@ lerobot-edit-dataset \
# Convert only specific episodes
lerobot-edit-dataset \
--repo_id lerobot/pusht_image \
--operation.type convert_image_to_video \
--operation.type convert_to_video \
--operation.output_dir outputs/pusht_video \
--operation.episode_indices "[0, 1, 2, 5, 10]"
# Convert with multiple workers for parallel processing
lerobot-edit-dataset \
--repo_id lerobot/pusht_image \
--operation.type convert_image_to_video \
--operation.type convert_to_video \
--operation.output_dir outputs/pusht_video \
--operation.num_workers 8
# For memory-constrained systems, users can now specify limits:
lerobot-edit-dataset \
--repo_id lerobot/pusht_image \
--operation.type convert_to_video \
--operation.max_episodes_per_batch 50 \
--operation.max_frames_per_batch 10000
```
**Parameters:**
@@ -157,30 +149,6 @@ lerobot-edit-dataset \
**Note:** The resulting dataset will be a proper LeRobotDataset with all cameras encoded as videos in the `videos/` directory, with parquet files containing only metadata (no raw image data). All episodes, stats, and tasks are preserved.
### Show the information of datasets
Show the information of datasets such as number of episode, number of frame, File size and so on.
No change will be made to the dataset
```bash
# Show dataset information without feature details
lerobot-edit-dataset \
--repo_id lerobot/pusht_image \
--operation.type info \
# Show dataset information with feature details
lerobot-edit-dataset \
--repo_id lerobot/pusht_image \
--operation.type info \
--operation.show_features true
```
**Parameters:**
- `parameters`: The flag to control show or no show dataset information with feature details.(default=false)
### Push to Hub
Add the `--push_to_hub true` flag to any command to automatically upload the resulting dataset to the Hugging Face Hub:
-80
View File
@@ -1,80 +0,0 @@
# WALL-OSS
WALL-OSS is an open-source foundation model for embodied intelligence, proposed by the [XSquare Robot](https://x2robot.com/en/research/68bc2cde8497d7f238dde690) team in 2025. The LeRobot implementation is adapted from their open-source [WallX](https://github.com/X-Square-Robot/wall-x) repository.
X Square Robots WALL-OSS is now integrated into Hugging Faces LeRobot ecosystem. This is an exciting collaborative project between the LeRobot and X Square Robot teams. You can now post-train, evaluate, and deploy WALL-OSS directly through LeRobot. With this, were aiming to make it easier for the open-source robotics community to customize and deploy WALL-OSS foundation models. Read and explore WALL-OSS [paper](https://arxiv.org/pdf/2509.11766) and [code](https://github.com/X-Square-Robot/wall-x).
## Model Overview
The WALL-OSS team is building the embodied foundation model to capture and compress the world's most valuable data: the continuous, high-fidelity stream of physical interaction. By creating a direct feedback loop between the model's decisions and the body's lived experience, the emergence of a truly generalizable intelligence is enabled—one that understands not just how the world works, but how to act effectively within it.
<img
src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/walloss-lerobot-paper.png"
alt="An overview of WALL-OSS"
width="85%"
/>
Technically, WALL-OSS introduces a tightly coupled multimodal architecture (tightly-coupled MoE structure) that integrates both discrete and continuous action modeling strategies. Through a two-stage training pipeline (Inspiration → Integration), the model gradually unifies semantic reasoning and high-frequency action generation. Its core innovations include:
- **Embodied perceptionenhanced multimodal pretraining**: Large-scale training on unified visionlanguageaction data to strengthen spatial, causal, and manipulation understanding.
- **Unified Cross-Level Chain-of-Thought (Uni-CoT)**: A single differentiable framework that unifies high-level instruction reasoning, sub-task decomposition, and fine-grained action synthesis, forming a continuous chain from “understanding” to “execution.”
- **Mixture-of-Experts (MoE) action heads**: Dynamically activating experts depending on the task phase and modeling actions in discrete or continuous space to maintain stable VLM priors.
- **Two-stage training paradigm**:
- **Inspiration stage**: Injecting discrete action priors to strengthen spatial understanding and semantic-action alignment.
- **Integration stage**: Using flow matching to achieve high-frequency continuous control.
## Installation Requirements
1. Install LeRobot by following our [Installation Guide](./installation).
2. Install WallX dependencies by running:
```bash
pip install -e ".[wallx]"
```
## Usage
To use WallX in LeRobot, specify the policy type as:
```python
policy.type=wall_x
```
## Training
For training WallX, you can use the standard LeRobot training script with the appropriate configuration:
```bash
lerobot-train \
--dataset.repo_id=your_dataset \
--policy.type=wall_x \
--output_dir=./outputs/wallx_training \
--job_name=wallx_training \
--policy.repo_id=your_repo_id \
--policy.pretrained_name_or_path=x-square-robot/wall-oss-flow \
--policy.prediction_mode=diffusion \
--policy.attn_implementation=eager \
--steps=3000 \
--policy.device=cuda \
--batch_size=32
```
### Training Arguments
| Argument | Description |
| ------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `--dataset.repo_id` | The Hugging Face Hub repository ID for your training dataset (e.g., `lerobot/aloha_sim_insertion_human`) |
| `--policy.type` | Specifies using the WallX policy architecture |
| `--output_dir` | Local directory where training checkpoints and logs will be saved |
| `--job_name` | A name identifier for this training run (used in logging/tracking) |
| `--policy.repo_id` | Your Hugging Face Hub repo ID where the trained model will be pushed |
| `--policy.pretrained_path` | Path to pretrained WallX weights to initialize from (the official WALL-OSS checkpoint) |
| `--policy.prediction_mode` | The action prediction strategy: `diffusion` or `fast` - `diffusion` uses iterative denoising for action generation, `fast` uses next token prediction instead |
| `--policy.attn_implementation` | Attention implementation backend - `eager` uses standard PyTorch attention (alternatives include `flash_attention_2` or `sdpa`) |
| `--steps` | Total number of training steps to run |
| `--policy.device` | Device to train on (`cuda` for GPU, `cpu` for CPU) |
| `--batch_size` | Number of samples per training batch |
## License
This model follows the **Apache 2.0 License**, consistent with the original [WallX repository](https://github.com/X-Square-Robot/wall-x).
+1 -1
View File
@@ -154,7 +154,7 @@ lerobot-train \
```bash
lerobot-train \
--dataset.repo_id=<USER>/bimanual-so100-handover-cube \
--dataset.repo_id=pepijn223/bimanual-so100-handover-cube \
--output_dir=./outputs/xvla_bimanual \
--job_name=xvla_so101_training \
--policy.path="lerobot/xvla-base" \
+19 -19
View File
@@ -22,7 +22,7 @@ lerobot-replay \
--robot.type=so100_follower \
--robot.port=/dev/tty.usbmodem58760431541 \
--robot.id=black \
--dataset.repo_id=<USER>/record-test \
--dataset.repo_id=aliberts/record-test \
--dataset.episode=2
```
"""
@@ -41,7 +41,8 @@ from lerobot.robots import ( # noqa: F401
RobotConfig,
koch_follower,
make_robot_from_config,
so_follower,
so100_follower,
so101_follower,
)
from lerobot.utils.constants import ACTION
from lerobot.utils.robot_utils import precise_sleep
@@ -57,7 +58,7 @@ class DatasetReplayConfig:
repo_id: str
# Episode to replay.
episode: int
# Root directory where the dataset will be stored (e.g. 'dataset/path'). If None, defaults to $HF_LEROBOT_HOME/repo_id.
# Root directory where the dataset will be stored (e.g. 'dataset/path').
root: str | Path | None = None
# Limit the frames per second. By default, uses the policy fps.
fps: int = 30
@@ -81,25 +82,24 @@ def replay(cfg: ReplayConfig):
actions = dataset.hf_dataset.select_columns(ACTION)
robot.connect()
try:
log_say("Replaying episode", cfg.play_sounds, blocking=True)
for idx in range(dataset.num_frames):
start_episode_t = time.perf_counter()
log_say("Replaying episode", cfg.play_sounds, blocking=True)
for idx in range(dataset.num_frames):
start_episode_t = time.perf_counter()
action_array = actions[idx][ACTION]
action = {}
for i, name in enumerate(dataset.features[ACTION]["names"]):
key = f"{name.removeprefix('main_')}.pos"
action[key] = action_array[i].item()
action_array = actions[idx][ACTION]
action = {}
for i, name in enumerate(dataset.features[ACTION]["names"]):
key = f"{name.removeprefix('main_')}.pos"
action[key] = action_array[i].item()
action["shoulder_lift.pos"] = -(action["shoulder_lift.pos"] - 90)
action["elbow_flex.pos"] -= 90
robot.send_action(action)
action["shoulder_lift.pos"] = -(action["shoulder_lift.pos"] - 90)
action["elbow_flex.pos"] -= 90
robot.send_action(action)
dt_s = time.perf_counter() - start_episode_t
precise_sleep(max(1 / dataset.fps - dt_s, 0.0))
finally:
robot.disconnect()
dt_s = time.perf_counter() - start_episode_t
precise_sleep(1 / dataset.fps - dt_s)
robot.disconnect()
if __name__ == "__main__":
-490
View File
@@ -1,490 +0,0 @@
#!/usr/bin/env python
# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
SLURM-distributed SARM RA-BC annotation pipeline.
Computes SARM progress values for all frames in a dataset, distributed across
SLURM workers, then merges the shards into a single sarm_progress.parquet.
Two subcommands, each a separate SLURM submission:
compute N workers, each computes progress for a subset of episodes
aggregate 1 worker, merges N shards into sarm_progress.parquet, pushes to hub
Usage:
python slurm_compute_rabc.py compute \\
--repo-id user/dataset --reward-model-path user/sarm_model \\
--stride 10 --device cpu --workers 50 --partition cpu
python slurm_compute_rabc.py aggregate \\
--repo-id user/dataset --reward-model-path user/sarm_model \\
--partition cpu --push-to-hub
"""
import argparse
from pathlib import Path
from datatrove.executor import LocalPipelineExecutor
from datatrove.executor.slurm import SlurmPipelineExecutor
from datatrove.pipeline.base import PipelineStep
class ComputeProgressShards(PipelineStep):
"""Each worker computes SARM progress for its assigned episodes."""
def __init__(
self, repo_id, reward_model_path, stride=1, head_mode="sparse", device="cpu", shard_dir="rabc_shards"
):
super().__init__()
if stride < 1:
raise ValueError(f"stride must be >= 1, got {stride}")
self.repo_id = repo_id
self.reward_model_path = reward_model_path
self.stride = stride
self.head_mode = head_mode
self.device = device
self.shard_dir = shard_dir
def run(self, data=None, rank: int = 0, world_size: int = 1):
import logging
from pathlib import Path
import numpy as np
import pyarrow as pa
import pyarrow.parquet as pq
import torch
from tqdm import tqdm
from lerobot.policies.sarm.compute_rabc_weights import (
generate_all_frame_indices,
interpolate_progress,
load_sarm_resources,
)
from lerobot.utils.utils import init_logging
init_logging()
dataset, reward_model, preprocess = load_sarm_resources(
self.repo_id,
self.reward_model_path,
self.device,
)
if hasattr(preprocess, "eval"):
preprocess.eval()
for step in preprocess.steps:
if hasattr(step, "eval"):
step.eval()
image_key = reward_model.config.image_key
state_key = reward_model.config.state_key
frame_gap = reward_model.config.frame_gap
center_idx = reward_model.config.n_obs_steps // 2
dual_mode = reward_model.config.uses_dual_heads
compute_sparse = self.head_mode in ("sparse", "both") or not dual_mode
compute_dense = self.head_mode in ("dense", "both") and dual_mode
my_episodes = list(range(dataset.num_episodes))[rank::world_size]
if not my_episodes:
logging.info(f"Rank {rank}: no episodes assigned")
return
logging.info(f"Rank {rank}: {len(my_episodes)} / {dataset.num_episodes} episodes")
all_rows = []
for ep_idx in tqdm(my_episodes, desc=f"Rank {rank}"):
ep = dataset.meta.episodes[ep_idx]
ep_start, ep_end = ep["dataset_from_index"], ep["dataset_to_index"]
task = dataset[ep_start].get("task", "perform the task")
all_ep_indices = generate_all_frame_indices(ep_start, ep_end, frame_gap)
if self.stride > 1:
compute_indices = [i for i in all_ep_indices if (i - ep_start) % self.stride == 0]
if (ep_end - 1) not in compute_indices:
compute_indices.append(ep_end - 1)
compute_indices = sorted(set(compute_indices))
else:
compute_indices = all_ep_indices
frame_results = {}
for qi in tqdm(compute_indices, desc=f" Ep {ep_idx}", leave=False):
try:
sample = dataset[qi]
batch = {
image_key: sample[image_key],
"task": task,
"index": qi,
"episode_index": ep_idx,
}
if state_key in sample:
batch[state_key] = sample[state_key]
with torch.no_grad():
processed = preprocess(batch)
vf = processed["video_features"].to(self.device)
tf = processed["text_features"].to(self.device)
sf = processed.get("state_features")
if sf is not None:
sf = sf.to(self.device)
lengths = processed.get("lengths")
sparse_val = dense_val = np.nan
if compute_sparse:
r = reward_model.calculate_rewards(
text_embeddings=tf,
video_embeddings=vf,
state_features=sf,
lengths=lengths,
return_all_frames=True,
head_mode="sparse",
)
sparse_val = float(r[0, center_idx] if r.ndim == 2 else r[center_idx])
if compute_dense:
r = reward_model.calculate_rewards(
text_embeddings=tf,
video_embeddings=vf,
state_features=sf,
lengths=lengths,
return_all_frames=True,
head_mode="dense",
)
dense_val = float(r[0, center_idx] if r.ndim == 2 else r[center_idx])
frame_results[qi] = (sparse_val, dense_val)
except Exception as e:
logging.warning(f"Failed frame {qi}: {e}")
if not frame_results:
logging.warning(f"Episode {ep_idx}: all frames failed, skipping")
continue
# Interpolate to all frames in this episode
computed_idx = np.array(sorted(frame_results.keys()))
all_frame_arr = np.arange(ep_start, ep_end)
sparse_vals = np.array([frame_results[i][0] for i in computed_idx]) if compute_sparse else None
dense_vals = np.array([frame_results[i][1] for i in computed_idx]) if compute_dense else None
if self.stride > 1 and len(computed_idx) > 1:
if compute_sparse:
sparse_vals = interpolate_progress(computed_idx, sparse_vals, all_frame_arr)
if compute_dense:
dense_vals = interpolate_progress(computed_idx, dense_vals, all_frame_arr)
output_frames = all_frame_arr
else:
# Use only successfully computed frames to avoid indexing mismatch on failures
output_frames = computed_idx
for i, fi in enumerate(output_frames):
row = {"index": int(fi), "episode_index": ep_idx, "frame_index": int(fi - ep_start)}
if compute_sparse:
row["progress_sparse"] = float(sparse_vals[i])
if compute_dense:
row["progress_dense"] = float(dense_vals[i])
all_rows.append(row)
if all_rows:
import pandas as pd
df = pd.DataFrame(all_rows).sort_values("index").reset_index(drop=True)
table = pa.Table.from_pandas(df, preserve_index=False)
table = table.replace_schema_metadata({b"reward_model_path": self.reward_model_path.encode()})
shard_dir = Path(self.shard_dir)
shard_dir.mkdir(parents=True, exist_ok=True)
out = shard_dir / f"shard_{rank:05d}.parquet"
pq.write_table(table, out)
logging.info(f"Rank {rank}: saved {len(df)} rows to {out}")
class AggregateProgress(PipelineStep):
"""Merge all shard parquets into final sarm_progress.parquet."""
def __init__(self, repo_id, reward_model_path, shard_dir="rabc_shards", push_to_hub=False):
super().__init__()
self.repo_id = repo_id
self.reward_model_path = reward_model_path
self.shard_dir = shard_dir
self.push_to_hub = push_to_hub
def run(self, data=None, rank: int = 0, world_size: int = 1):
import datetime
import logging
import os
from pathlib import Path
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq
from lerobot.datasets.lerobot_dataset import LeRobotDataset
from lerobot.utils.utils import init_logging
init_logging()
if rank != 0:
return
shard_dir = Path(self.shard_dir)
shards = sorted(shard_dir.glob("shard_*.parquet"))
if not shards:
raise FileNotFoundError(f"No shards found in {shard_dir}")
# Log shard modification time range to help detect stale files
mtimes = [os.path.getmtime(s) for s in shards]
oldest = datetime.datetime.fromtimestamp(min(mtimes)).isoformat(timespec="seconds")
newest = datetime.datetime.fromtimestamp(max(mtimes)).isoformat(timespec="seconds")
logging.info(f"Aggregating {len(shards)} shards (oldest: {oldest}, newest: {newest})")
df = pd.concat([pd.read_parquet(s) for s in shards], ignore_index=True)
df = df.sort_values("index").reset_index(drop=True)
table = pa.Table.from_pandas(df, preserve_index=False)
table = table.replace_schema_metadata({b"reward_model_path": self.reward_model_path.encode()})
temp_ds = LeRobotDataset(self.repo_id, download_videos=False)
out_path = Path(temp_ds.root) / "sarm_progress.parquet"
out_path.parent.mkdir(parents=True, exist_ok=True)
pq.write_table(table, out_path)
logging.info(f"Saved {len(df)} rows to {out_path}")
for col in ["progress_sparse", "progress_dense"]:
if col in df.columns:
v = df[col].dropna()
logging.info(
f"{col}: mean={v.mean():.4f} std={v.std():.4f} min={v.min():.4f} max={v.max():.4f}"
)
if self.push_to_hub:
from huggingface_hub import HfApi
api = HfApi()
hub_path = "sarm_progress.parquet"
logging.info(f"Uploading to {self.repo_id}/{hub_path}")
api.upload_file(
path_or_fileobj=str(out_path),
path_in_repo=hub_path,
repo_id=self.repo_id,
repo_type="dataset",
)
logging.info(f"Uploaded: https://huggingface.co/datasets/{self.repo_id}/blob/main/{hub_path}")
def make_compute_executor(
repo_id,
reward_model_path,
stride,
head_mode,
device,
shard_dir,
logs_dir,
job_name,
slurm,
workers,
partition,
cpus_per_task,
mem_per_cpu,
):
kwargs = {
"pipeline": [
ComputeProgressShards(repo_id, reward_model_path, stride, head_mode, device, str(shard_dir)),
],
"logging_dir": str(logs_dir / job_name),
}
if slurm:
kwargs.update(
{
"job_name": job_name,
"tasks": workers,
"workers": workers,
"time": "24:00:00",
"partition": partition,
"cpus_per_task": cpus_per_task,
"sbatch_args": {"mem-per-cpu": mem_per_cpu},
}
)
return SlurmPipelineExecutor(**kwargs)
kwargs.update({"tasks": workers, "workers": 1})
return LocalPipelineExecutor(**kwargs)
def make_aggregate_executor(
repo_id,
reward_model_path,
shard_dir,
logs_dir,
job_name,
slurm,
partition,
cpus_per_task,
mem_per_cpu,
push_to_hub,
):
kwargs = {
"pipeline": [
AggregateProgress(repo_id, reward_model_path, str(shard_dir), push_to_hub),
],
"logging_dir": str(logs_dir / job_name),
}
if slurm:
kwargs.update(
{
"job_name": job_name,
"tasks": 1,
"workers": 1,
"time": "02:00:00",
"partition": partition,
"cpus_per_task": cpus_per_task,
"sbatch_args": {"mem-per-cpu": mem_per_cpu},
}
)
return SlurmPipelineExecutor(**kwargs)
kwargs.update({"tasks": 1, "workers": 1})
return LocalPipelineExecutor(**kwargs)
def _add_shared_args(p):
p.add_argument(
"--repo-id",
type=str,
required=True,
help="Hugging Face repository identifier, e.g. 'user/dataset'.",
)
p.add_argument(
"--shard-dir",
type=Path,
default=Path("rabc_shards"),
help="Directory to read/write per-rank parquet shards.",
)
p.add_argument(
"--logs-dir",
type=Path,
default=Path("logs"),
help="Directory for datatrove logs.",
)
p.add_argument(
"--job-name",
type=str,
default=None,
help="SLURM job name (defaults to rabc_<subcommand>).",
)
p.add_argument(
"--slurm",
type=int,
default=1,
help="1 = submit via SLURM; 0 = run locally (useful for debugging).",
)
p.add_argument(
"--partition",
type=str,
default=None,
help="SLURM partition to submit to.",
)
p.add_argument(
"--cpus-per-task",
type=int,
default=4,
help="Number of CPUs per SLURM task.",
)
p.add_argument(
"--mem-per-cpu",
type=str,
default="4G",
help="Memory per CPU, e.g. '4G' or '1950M'.",
)
def main():
parser = argparse.ArgumentParser(
description="SLURM-distributed SARM RA-BC annotation pipeline",
formatter_class=argparse.RawDescriptionHelpFormatter,
)
sub = parser.add_subparsers(dest="command", required=True)
# compute subcommand
cp = sub.add_parser(
"compute",
help="Distribute progress computation across SLURM workers.",
)
_add_shared_args(cp)
cp.add_argument(
"--reward-model-path",
type=str,
required=True,
help="Path or HF repo id of the SARM reward model.",
)
cp.add_argument(
"--stride",
type=int,
default=1,
help="Compute every Nth frame; intermediate frames are interpolated (must be >= 1).",
)
cp.add_argument(
"--head-mode",
type=str,
default="sparse",
choices=["sparse", "dense", "both"],
help="Which reward head(s) to compute.",
)
cp.add_argument(
"--device",
type=str,
default="cpu",
help="Device for reward model inference, e.g. 'cpu' or 'cuda'.",
)
cp.add_argument(
"--workers",
type=int,
default=50,
help="Number of parallel SLURM tasks (one shard per worker).",
)
# aggregate subcommand
ap = sub.add_parser(
"aggregate",
help="Merge per-rank shards into a single sarm_progress.parquet.",
)
_add_shared_args(ap)
ap.add_argument(
"--reward-model-path",
type=str,
required=True,
help="Path or HF repo id of the SARM reward model (stored in parquet metadata).",
)
ap.add_argument(
"--push-to-hub",
action="store_true",
help="Upload sarm_progress.parquet to the Hugging Face Hub after aggregation.",
)
args = parser.parse_args()
job_name = args.job_name or f"rabc_{args.command}"
kwargs = vars(args)
kwargs["slurm"] = kwargs.pop("slurm") == 1
kwargs["job_name"] = job_name
command = kwargs.pop("command")
executor = make_compute_executor(**kwargs) if command == "compute" else make_aggregate_executor(**kwargs)
executor.run()
if __name__ == "__main__":
main()
+43 -45
View File
@@ -78,24 +78,40 @@ def main():
listener, events = init_keyboard_listener()
init_rerun(session_name="lekiwi_evaluate")
try:
if not robot.is_connected:
raise ValueError("Robot is not connected!")
if not robot.is_connected:
raise ValueError("Robot is not connected!")
print("Starting evaluate loop...")
recorded_episodes = 0
while recorded_episodes < NUM_EPISODES and not events["stop_recording"]:
log_say(f"Running inference, recording eval episode {recorded_episodes} of {NUM_EPISODES}")
print("Starting evaluate loop...")
recorded_episodes = 0
while recorded_episodes < NUM_EPISODES and not events["stop_recording"]:
log_say(f"Running inference, recording eval episode {recorded_episodes} of {NUM_EPISODES}")
# Main record loop
# Main record loop
record_loop(
robot=robot,
events=events,
fps=FPS,
policy=policy,
preprocessor=preprocessor, # Pass the pre and post policy processors
postprocessor=postprocessor,
dataset=dataset,
control_time_s=EPISODE_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=teleop_action_processor,
robot_action_processor=robot_action_processor,
robot_observation_processor=robot_observation_processor,
)
# Reset the environment if not stopping or re-recording
if not events["stop_recording"] and (
(recorded_episodes < NUM_EPISODES - 1) or events["rerecord_episode"]
):
log_say("Reset the environment")
record_loop(
robot=robot,
events=events,
fps=FPS,
policy=policy,
preprocessor=preprocessor, # Pass the pre and post policy processors
postprocessor=postprocessor,
dataset=dataset,
control_time_s=EPISODE_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
@@ -104,42 +120,24 @@ def main():
robot_observation_processor=robot_observation_processor,
)
# Reset the environment if not stopping or re-recording
if not events["stop_recording"] and (
(recorded_episodes < NUM_EPISODES - 1) or events["rerecord_episode"]
):
log_say("Reset the environment")
record_loop(
robot=robot,
events=events,
fps=FPS,
control_time_s=EPISODE_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=teleop_action_processor,
robot_action_processor=robot_action_processor,
robot_observation_processor=robot_observation_processor,
)
if events["rerecord_episode"]:
log_say("Re-record episode")
events["rerecord_episode"] = False
events["exit_early"] = False
dataset.clear_episode_buffer()
continue
if events["rerecord_episode"]:
log_say("Re-record episode")
events["rerecord_episode"] = False
events["exit_early"] = False
dataset.clear_episode_buffer()
continue
# Save episode
dataset.save_episode()
recorded_episodes += 1
# Save episode
dataset.save_episode()
recorded_episodes += 1
# Clean up
log_say("Stop recording")
robot.disconnect()
listener.stop()
finally:
# Clean up
log_say("Stop recording")
robot.disconnect()
listener.stop()
dataset.finalize()
dataset.push_to_hub()
dataset.finalize()
dataset.push_to_hub()
if __name__ == "__main__":
+45 -46
View File
@@ -21,7 +21,7 @@ from lerobot.robots.lekiwi.config_lekiwi import LeKiwiClientConfig
from lerobot.robots.lekiwi.lekiwi_client import LeKiwiClient
from lerobot.scripts.lerobot_record import record_loop
from lerobot.teleoperators.keyboard import KeyboardTeleop, KeyboardTeleopConfig
from lerobot.teleoperators.so_leader import SO100Leader, SO100LeaderConfig
from lerobot.teleoperators.so100_leader import SO100Leader, SO100LeaderConfig
from lerobot.utils.constants import ACTION, OBS_STR
from lerobot.utils.control_utils import init_keyboard_listener
from lerobot.utils.utils import log_say
@@ -74,23 +74,40 @@ def main():
listener, events = init_keyboard_listener()
init_rerun(session_name="lekiwi_record")
try:
if not robot.is_connected or not leader_arm.is_connected or not keyboard.is_connected:
raise ValueError("Robot or teleop is not connected!")
if not robot.is_connected or not leader_arm.is_connected or not keyboard.is_connected:
raise ValueError("Robot or teleop is not connected!")
print("Starting record loop...")
recorded_episodes = 0
while recorded_episodes < NUM_EPISODES and not events["stop_recording"]:
log_say(f"Recording episode {recorded_episodes}")
print("Starting record loop...")
recorded_episodes = 0
while recorded_episodes < NUM_EPISODES and not events["stop_recording"]:
log_say(f"Recording episode {recorded_episodes}")
# Main record loop
# Main record loop
record_loop(
robot=robot,
events=events,
fps=FPS,
dataset=dataset,
teleop=[leader_arm, keyboard],
control_time_s=EPISODE_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=teleop_action_processor,
robot_action_processor=robot_action_processor,
robot_observation_processor=robot_observation_processor,
)
# Reset the environment if not stopping or re-recording
if not events["stop_recording"] and (
(recorded_episodes < NUM_EPISODES - 1) or events["rerecord_episode"]
):
log_say("Reset the environment")
record_loop(
robot=robot,
events=events,
fps=FPS,
dataset=dataset,
teleop=[leader_arm, keyboard],
control_time_s=EPISODE_TIME_SEC,
control_time_s=RESET_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=teleop_action_processor,
@@ -98,44 +115,26 @@ def main():
robot_observation_processor=robot_observation_processor,
)
# Reset the environment if not stopping or re-recording
if not events["stop_recording"] and (
(recorded_episodes < NUM_EPISODES - 1) or events["rerecord_episode"]
):
log_say("Reset the environment")
record_loop(
robot=robot,
events=events,
fps=FPS,
teleop=[leader_arm, keyboard],
control_time_s=RESET_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=teleop_action_processor,
robot_action_processor=robot_action_processor,
robot_observation_processor=robot_observation_processor,
)
if events["rerecord_episode"]:
log_say("Re-record episode")
events["rerecord_episode"] = False
events["exit_early"] = False
dataset.clear_episode_buffer()
continue
if events["rerecord_episode"]:
log_say("Re-record episode")
events["rerecord_episode"] = False
events["exit_early"] = False
dataset.clear_episode_buffer()
continue
# Save episode
dataset.save_episode()
recorded_episodes += 1
# Save episode
dataset.save_episode()
recorded_episodes += 1
finally:
# Clean up
log_say("Stop recording")
robot.disconnect()
leader_arm.disconnect()
keyboard.disconnect()
listener.stop()
# Clean up
log_say("Stop recording")
robot.disconnect()
leader_arm.disconnect()
keyboard.disconnect()
listener.stop()
dataset.finalize()
dataset.push_to_hub()
dataset.finalize()
dataset.push_to_hub()
if __name__ == "__main__":
+15 -17
View File
@@ -42,27 +42,25 @@ def main():
# Connect to the robot
robot.connect()
try:
if not robot.is_connected:
raise ValueError("Robot is not connected!")
if not robot.is_connected:
raise ValueError("Robot is not connected!")
print("Starting replay loop...")
log_say(f"Replaying episode {EPISODE_IDX}")
for idx in range(len(episode_frames)):
t0 = time.perf_counter()
print("Starting replay loop...")
log_say(f"Replaying episode {EPISODE_IDX}")
for idx in range(len(episode_frames)):
t0 = time.perf_counter()
# Get recorded action from dataset
action = {
name: float(actions[idx][ACTION][i])
for i, name in enumerate(dataset.features[ACTION]["names"])
}
# Get recorded action from dataset
action = {
name: float(actions[idx][ACTION][i]) for i, name in enumerate(dataset.features[ACTION]["names"])
}
# Send action to robot
_ = robot.send_action(action)
# Send action to robot
_ = robot.send_action(action)
precise_sleep(max(1.0 / dataset.fps - (time.perf_counter() - t0), 0.0))
finally:
robot.disconnect()
precise_sleep(max(1.0 / dataset.fps - (time.perf_counter() - t0), 0.0))
robot.disconnect()
if __name__ == "__main__":
+1 -1
View File
@@ -18,7 +18,7 @@ import time
from lerobot.robots.lekiwi import LeKiwiClient, LeKiwiClientConfig
from lerobot.teleoperators.keyboard.teleop_keyboard import KeyboardTeleop, KeyboardTeleopConfig
from lerobot.teleoperators.so_leader import SO100Leader, SO100LeaderConfig
from lerobot.teleoperators.so100_leader import SO100Leader, SO100LeaderConfig
from lerobot.utils.robot_utils import precise_sleep
from lerobot.utils.visualization_utils import init_rerun, log_rerun_data
@@ -0,0 +1,416 @@
#!/usr/bin/env python3
"""
Comprehensive debug script for OpenArms CAN FD communication.
Tests all 4 CAN interfaces with CAN FD support.
"""
import can
import time
import sys
import subprocess
def check_can_interface(port):
"""Check if CAN interface is UP and configured."""
try:
result = subprocess.run(['ip', 'link', 'show', port],
capture_output=True, text=True)
if result.returncode != 0:
return False, "Interface not found", None
output = result.stdout
if 'UP' not in output:
return False, "Interface is DOWN", None
# Check if CAN FD is enabled
is_fd = 'fd on' in output.lower() or 'canfd' in output.lower()
return True, "Interface is UP", is_fd
except FileNotFoundError:
return None, "Cannot check (ip command not found)", None
def test_motor_on_interface(bus, motor_id, timeout=2.0, use_fd=False):
"""
Test a single motor and return all responses.
Returns:
list of (arbitration_id, data) tuples for all responses received
"""
# Send enable command
enable_msg = can.Message(
arbitration_id=motor_id,
data=[0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFC],
is_extended_id=False,
is_fd=use_fd
)
try:
bus.send(enable_msg)
except Exception as e:
return None, f"Send error: {e}"
# Listen for responses
responses = []
start_time = time.time()
while time.time() - start_time < timeout:
msg = bus.recv(timeout=0.1)
if msg:
responses.append((msg.arbitration_id, msg.data, msg.is_fd if hasattr(msg, 'is_fd') else False))
# Send disable command
disable_msg = can.Message(
arbitration_id=motor_id,
data=[0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFD],
is_extended_id=False,
is_fd=use_fd
)
try:
bus.send(disable_msg)
except:
pass
return responses, None
def test_interface(port, interface_type="socketcan", use_can_fd=True):
"""Test all 8 motors on a single CAN interface."""
results = {
'interface': port,
'status': None,
'is_fd': use_can_fd,
'motors': {}
}
# Check interface status
status_ok, status_msg, interface_has_fd = check_can_interface(port)
if interface_has_fd is not None:
results['interface_fd_enabled'] = interface_has_fd
if use_can_fd and not interface_has_fd:
status_msg += " (CAN FD NOT enabled on interface!)"
elif interface_has_fd:
status_msg += " (CAN FD enabled)"
results['status'] = status_msg
if status_ok is False:
return results
# Try to connect
try:
if use_can_fd:
print(f" Connecting to {port} with CAN FD (1 Mbps / 5 Mbps)...")
bus = can.interface.Bus(
channel=port,
interface=interface_type,
bitrate=1000000,
data_bitrate=5000000,
fd=True
)
else:
print(f" Connecting to {port} with CAN 2.0 (1 Mbps)...")
bus = can.interface.Bus(
channel=port,
interface=interface_type,
bitrate=1000000
)
except Exception as e:
results['status'] = f"Connection failed: {e}"
return results
try:
# Clear any pending messages
while bus.recv(timeout=0.01):
pass
# Test each motor (0x01 to 0x08)
for motor_id in range(0x01, 0x09):
responses, error = test_motor_on_interface(bus, motor_id, timeout=1.0, use_fd=use_can_fd)
if error:
results['motors'][motor_id] = {'error': error}
elif responses:
results['motors'][motor_id] = {
'found': True,
'responses': responses
}
else:
results['motors'][motor_id] = {
'found': False,
'responses': []
}
time.sleep(0.05) # Small delay between motors
finally:
bus.shutdown()
return results
def print_results(all_results):
"""Print formatted results for all interfaces."""
print("SUMMARY - Motors Found on Each Interface")
motor_names = {
0x01: "joint_1 (Shoulder pan)",
0x02: "joint_2 (Shoulder lift)",
0x03: "joint_3 (Shoulder rotation)",
0x04: "joint_4 (Elbow flex)",
0x05: "joint_5 (Wrist roll)",
0x06: "joint_6 (Wrist pitch)",
0x07: "joint_7 (Wrist rotation)",
0x08: "gripper",
}
total_found = 0
for result in all_results:
interface = result['interface']
status = result['status']
print(f"{interface}: {status}")
if result.get('is_fd'):
print(f" Mode: CAN FD")
else:
print(f" Mode: CAN 2.0")
if 'Connection failed' in status or 'DOWN' in status:
print(f" ⚠ Cannot test {interface}")
continue
motors_found = 0
for motor_id in range(0x01, 0x09):
motor_data = result['motors'].get(motor_id, {})
motor_name = motor_names.get(motor_id, "Unknown")
if motor_data.get('error'):
print(f" Motor 0x{motor_id:02X} ({motor_name}): ✗ {motor_data['error']}")
elif motor_data.get('found'):
motors_found += 1
total_found += 1
responses = motor_data['responses']
print(f" Motor 0x{motor_id:02X} ({motor_name}): ✓ FOUND")
for resp_id, data, is_fd in responses:
data_hex = data.hex()
fd_flag = " [FD]" if is_fd else " [2.0]"
print(f" → Response from 0x{resp_id:02X}{fd_flag}: {data_hex}")
else:
print(f" Motor 0x{motor_id:02X} ({motor_name}): ✗ No response")
print(f"\n Summary: {motors_found}/8 motors found on {interface}")
# Overall summary
print("OVERALL SUMMARY")
print(f"Total motors found across all interfaces: {total_found}")
# Analyze configuration
print("DIAGNOSIS")
for result in all_results:
interface = result['interface']
motors_found = sum(1 for m in result['motors'].values() if m.get('found'))
if motors_found == 0:
print(f"\n{interface}: NO MOTORS FOUND")
print(" Possible issues:")
print(" 1. CAN FD mode mismatch (interface vs motor configuration)")
print(" 2. Missing 120Ω termination resistors at BOTH cable ends")
print(" 3. Motor timeout parameter set incorrectly (should NOT be 0)")
print(" 4. CANH/CANL wiring issue")
print(" 5. Cable too long (>40m for CAN FD at 5Mbps)")
# Check FD mismatch
if result.get('is_fd') and not result.get('interface_fd_enabled'):
print(" ⚠️ CRITICAL: Trying CAN FD but interface NOT configured for FD!")
print(f" Fix: sudo ip link set {interface} type can bitrate 1000000 dbitrate 5000000 fd on")
elif motors_found < 8:
print(f"\n{interface}: Only {motors_found}/8 motors responding")
print(" Check power and connections for missing motors")
else:
print(f"\n{interface}: All 8 motors responding correctly!")
# Check for unexpected response IDs
print("RESPONSE ID ANALYSIS")
for result in all_results:
interface = result['interface']
unexpected = []
for motor_id, motor_data in result['motors'].items():
if motor_data.get('found'):
expected_id = motor_id + 0x10
actual_ids = [resp[0] for resp in motor_data['responses']]
if expected_id not in actual_ids:
unexpected.append((motor_id, actual_ids))
if unexpected:
print(f"\n{interface}: Unexpected response IDs detected")
for motor_id, actual_ids in unexpected:
expected_id = motor_id + 0x10
print(f" Motor 0x{motor_id:02X}: Expected 0x{expected_id:02X}, "
f"got {[f'0x{id:02X}' for id in actual_ids]}")
print(" → Motor Master IDs need reconfiguration")
else:
motors_found = sum(1 for m in result['motors'].values() if m.get('found'))
if motors_found > 0:
print(f"\n{interface}: All responding motors use correct IDs")
def test_communication_speed(interface, motor_id, num_iterations=100):
"""
Test communication speed with a motor.
Returns:
tuple: (hz, avg_latency_ms) or (None, None) if test failed
"""
try:
# Connect to interface
bus = can.interface.Bus(
channel=interface,
interface="socketcan",
bitrate=1000000,
data_bitrate=5000000,
fd=True
)
# Send refresh commands and measure round-trip time
latencies = []
successful = 0
for _ in range(num_iterations):
start = time.perf_counter()
# Send enable command (lightweight operation)
enable_msg = can.Message(
arbitration_id=motor_id,
data=[0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFC],
is_extended_id=False,
is_fd=True
)
bus.send(enable_msg)
# Wait for response
msg = bus.recv(timeout=0.1)
if msg:
latency = (time.perf_counter() - start) * 1000 # Convert to ms
latencies.append(latency)
successful += 1
bus.shutdown()
if successful > 0:
avg_latency = sum(latencies) / len(latencies)
hz = 1000.0 / avg_latency if avg_latency > 0 else 0
return hz, avg_latency
return None, None
except Exception as e:
print(f" Speed test error: {e}")
return None, None
def main():
"""Main function to test all CAN interfaces with CAN FD."""
print("\nThis will test all 4 CAN interfaces (can0-can3) with CAN FD")
print("Testing motors 0x01-0x08 on each interface")
print()
print("Make sure:")
print(" ✓ Motors are powered (24V)")
print(" ✓ CAN interfaces configured with FD mode:")
print(" ./examples/openarms/setup_can.sh")
print(" ✓ Motor 'timeout' parameter NOT set to 0 (use Damiao tools)")
print(" ✓ CAN wiring includes 120Ω termination at BOTH ends")
print()
input("Press ENTER to start testing...")
# Test all 4 interfaces with CAN FD
all_results = []
for i in range(4):
interface = f"can{i}"
print(f"Testing {interface}...")
result = test_interface(interface, use_can_fd=True)
all_results.append(result)
# Quick status
if 'Connection failed' in result['status'] or 'DOWN' in result['status']:
print(f"{interface}: {result['status']}")
else:
motors_found = sum(1 for m in result['motors'].values() if m.get('found'))
print(f" {interface}: {motors_found}/8 motors found")
time.sleep(0.2)
# Print detailed results
print_results(all_results)
print("Testing Complete!")
all_found = sum(sum(1 for m in r['motors'].values() if m.get('found')) for r in all_results)
if all_found == 0:
print("\n⚠️ CRITICAL: No motors found on any interface!")
print("\nTop issues to check:")
print(" 1. Motor 'timeout' parameter (use Damiao tools to set > 0)")
print(" 2. CAN FD not enabled (run ./examples/openarms/setup_can.sh)")
print(" 3. Missing termination resistors")
print("\nTry:")
print(" a) Check motor parameters with Damiao Debugging Tools")
print(" b) Verify CAN FD is enabled: ip -d link show can0 | grep fd")
print(" c) Run setup script: ./examples/openarms/setup_can.sh")
else:
# Run speed test on interfaces with motors
print("COMMUNICATION SPEED TEST")
print("\nTesting maximum communication frequency...")
for result in all_results:
interface = result['interface']
# Find first responding motor
responding_motor = None
for motor_id, motor_data in result['motors'].items():
if motor_data.get('found'):
responding_motor = motor_id
break
if responding_motor:
print(f"\n{interface}: Testing with motor 0x{responding_motor:02X}...")
hz, latency = test_communication_speed(interface, responding_motor, num_iterations=100)
if hz:
print(f" ✓ Max frequency: {hz:.1f} Hz")
print(f" ✓ Avg latency: {latency:.2f} ms")
print(f" ✓ Commands per second: ~{int(hz)}")
else:
print(f" ✗ Speed test failed")
else:
print(f"\n{interface}: No motors found, skipping speed test")
print()
if __name__ == "__main__":
try:
main()
except KeyboardInterrupt:
print("\n\nTesting interrupted by user.")
sys.exit(1)
except Exception as e:
print(f"\nUnexpected error: {e}")
import traceback
traceback.print_exc()
sys.exit(1)
+360
View File
@@ -0,0 +1,360 @@
#!/usr/bin/env python
# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
OpenArms Policy Evaluation
Evaluates a trained policy on the OpenArms robot by running inference and recording
the evaluation episodes to a dataset. Supports optional leader arm for manual resets.
Example usage:
python examples/openarms/evaluate.py
"""
import time
from pathlib import Path
from lerobot.cameras.opencv.configuration_opencv import OpenCVCameraConfig
from lerobot.configs.policies import PreTrainedConfig
from lerobot.datasets.lerobot_dataset import LeRobotDataset
from lerobot.datasets.pipeline_features import aggregate_pipeline_dataset_features, create_initial_features
from lerobot.datasets.utils import combine_feature_dicts
from lerobot.policies.factory import make_policy, make_pre_post_processors
from lerobot.processor import make_default_processors
from lerobot.robots.openarms.config_openarms_follower import OpenArmsFollowerConfig
from lerobot.robots.openarms.openarms_follower import OpenArmsFollower
from lerobot.scripts.lerobot_record import record_loop
from lerobot.teleoperators.openarms.config_openarms_leader import OpenArmsLeaderConfig
from lerobot.teleoperators.openarms.openarms_leader import OpenArmsLeader
from lerobot.utils.control_utils import init_keyboard_listener
from lerobot.utils.utils import log_say
from lerobot.utils.visualization_utils import init_rerun
HF_MODEL_ID = "lerobot-data-collection/three-folds-pi0" # TODO: Replace with your trained model
HF_EVAL_DATASET_ID = "lerobot-data-collection/three-folds-pi0_eval7" # TODO: Replace with your eval dataset name
TASK_DESCRIPTION = "three-folds-dataset" # TODO: Replace with your task, this should match!!
NUM_EPISODES = 1
FPS = 30
EPISODE_TIME_SEC = 300
RESET_TIME_SEC = 60
# Robot CAN interfaces
FOLLOWER_LEFT_PORT = "can0"
FOLLOWER_RIGHT_PORT = "can1"
# If enabled, you can manually reset the environment between evaluation episodes
USE_LEADER_FOR_RESETS = True # Set to False if you don't want to use leader
LEADER_LEFT_PORT = "can2"
LEADER_RIGHT_PORT = "can3"
# Camera configuration
CAMERA_CONFIG = {
"left_wrist": OpenCVCameraConfig(index_or_path="/dev/video5", width=640, height=480, fps=FPS),
"right_wrist": OpenCVCameraConfig(index_or_path="/dev/video1", width=640, height=480, fps=FPS),
"base": OpenCVCameraConfig(index_or_path="/dev/video3", width=640, height=480, fps=FPS),
}
def main():
"""Main evaluation function."""
print("OpenArms Policy Evaluation")
print(f"\nModel: {HF_MODEL_ID}")
print(f"Evaluation Dataset: {HF_EVAL_DATASET_ID}")
print(f"Task: {TASK_DESCRIPTION}")
print(f"Episodes: {NUM_EPISODES}")
print(f"Episode Duration: {EPISODE_TIME_SEC}s")
print(f"Reset Duration: {RESET_TIME_SEC}s")
print(f"Use Leader for Resets: {USE_LEADER_FOR_RESETS}")
follower_config = OpenArmsFollowerConfig(
port_left=FOLLOWER_LEFT_PORT,
port_right=FOLLOWER_RIGHT_PORT,
can_interface="socketcan",
id="openarms_follower",
disable_torque_on_disconnect=True,
max_relative_target=10.0,
cameras=CAMERA_CONFIG,
)
follower = OpenArmsFollower(follower_config)
follower.connect(calibrate=False)
if not follower.is_connected:
raise RuntimeError("Follower robot failed to connect!")
leader = None
if USE_LEADER_FOR_RESETS:
leader_config = OpenArmsLeaderConfig(
port_left=LEADER_LEFT_PORT,
port_right=LEADER_RIGHT_PORT,
can_interface="socketcan",
id="openarms_leader",
manual_control=False, # Enable torque control for gravity compensation
)
leader = OpenArmsLeader(leader_config)
leader.connect(calibrate=False)
if not leader.is_connected:
raise RuntimeError("Leader robot failed to connect!")
# Enable gravity compensation
if leader.pin_robot is not None:
leader.bus_right.enable_torque()
leader.bus_left.enable_torque()
time.sleep(0.1)
print(f"Leader connected with gravity compensation ({LEADER_LEFT_PORT}, {LEADER_RIGHT_PORT})")
else:
print(f"Leader connected but gravity compensation unavailable (no URDF)")
# Build default processors for action and observation
teleop_action_processor, robot_action_processor, robot_observation_processor = make_default_processors()
# Build dataset features from robot features and processors
# For actions, only include positions (no velocity or torque)
action_features_hw = {}
for key, value in follower.action_features.items():
if key.endswith(".pos"):
action_features_hw[key] = value
dataset_features = combine_feature_dicts(
aggregate_pipeline_dataset_features(
pipeline=teleop_action_processor,
initial_features=create_initial_features(action=action_features_hw),
use_videos=True,
),
aggregate_pipeline_dataset_features(
pipeline=robot_observation_processor,
initial_features=create_initial_features(observation=follower.observation_features),
use_videos=True,
),
)
# Check if dataset already exists
dataset_path = Path.home() / ".cache" / "huggingface" / "lerobot" / HF_EVAL_DATASET_ID
if dataset_path.exists():
print(f"Evaluation dataset already exists at: {dataset_path}")
print("This will append new episodes to the existing dataset.")
choice = input(" Continue? (y/n): ").strip().lower()
if choice != 'y':
print(" Aborting evaluation.")
follower.disconnect()
if leader:
leader.disconnect()
return
# Create dataset
dataset = LeRobotDataset.create(
repo_id=HF_EVAL_DATASET_ID,
fps=FPS,
features=dataset_features,
robot_type=follower.name,
use_videos=True,
image_writer_processes=0,
image_writer_threads=12,
)
# Load policy config from pretrained model and create policy using factory
policy_config = PreTrainedConfig.from_pretrained(HF_MODEL_ID)
policy_config.pretrained_path = HF_MODEL_ID
policy = make_policy(policy_config, ds_meta=dataset.meta)
preprocessor, postprocessor = make_pre_post_processors(
policy_cfg=policy.config,
pretrained_path=HF_MODEL_ID,
dataset_stats=dataset.meta.stats,
preprocessor_overrides={
"device_processor": {"device": str(policy.config.device)}
},
)
print(f"\nRunning evaluation...")
# Initialize keyboard listener and visualization
listener, events = init_keyboard_listener()
init_rerun(session_name="openarms_evaluation")
episode_idx = 0
try:
while episode_idx < NUM_EPISODES and not events["stop_recording"]:
log_say(f"Evaluating episode {episode_idx + 1} of {NUM_EPISODES}")
print(f"\nRunning inference for episode {episode_idx + 1}...")
# Run inference with policy
record_loop(
robot=follower,
events=events,
fps=FPS,
teleop_action_processor=teleop_action_processor,
robot_action_processor=robot_action_processor,
robot_observation_processor=robot_observation_processor,
policy=policy,
preprocessor=preprocessor,
postprocessor=postprocessor,
dataset=dataset,
control_time_s=EPISODE_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
)
# Handle re-recording
if events["rerecord_episode"]:
log_say("Re-recording episode")
events["rerecord_episode"] = False
events["exit_early"] = False
dataset.clear_episode_buffer()
continue
# Save episode
if dataset.episode_buffer is not None and dataset.episode_buffer.get("size", 0) > 0:
print(f"Saving episode {episode_idx + 1} ({dataset.episode_buffer['size']} frames)...")
dataset.save_episode()
episode_idx += 1
# Reset environment between episodes (if not last episode)
if not events["stop_recording"] and episode_idx < NUM_EPISODES:
if USE_LEADER_FOR_RESETS and leader:
log_say("Reset the environment using leader arms")
print(f"\nManual reset period ({RESET_TIME_SEC}s)...")
# Use leader for manual reset with gravity compensation
import numpy as np
dt = 1 / FPS
reset_start_time = time.perf_counter()
while time.perf_counter() - reset_start_time < RESET_TIME_SEC:
if events["exit_early"] or events["stop_recording"]:
break
loop_start = time.perf_counter()
# Get leader state
leader_action = leader.get_action()
# Extract positions and velocities
leader_positions_deg = {}
leader_velocities_deg_per_sec = {}
for motor in leader.bus_right.motors:
pos_key = f"right_{motor}.pos"
vel_key = f"right_{motor}.vel"
if pos_key in leader_action:
leader_positions_deg[f"right_{motor}"] = leader_action[pos_key]
if vel_key in leader_action:
leader_velocities_deg_per_sec[f"right_{motor}"] = leader_action[vel_key]
for motor in leader.bus_left.motors:
pos_key = f"left_{motor}.pos"
vel_key = f"left_{motor}.vel"
if pos_key in leader_action:
leader_positions_deg[f"left_{motor}"] = leader_action[pos_key]
if vel_key in leader_action:
leader_velocities_deg_per_sec[f"left_{motor}"] = leader_action[vel_key]
# Calculate gravity and friction torques
leader_positions_rad = {k: np.deg2rad(v) for k, v in leader_positions_deg.items()}
leader_gravity_torques_nm = leader._gravity_from_q(leader_positions_rad)
leader_velocities_rad_per_sec = {k: np.deg2rad(v) for k, v in leader_velocities_deg_per_sec.items()}
leader_friction_torques_nm = leader._friction_from_velocity(
leader_velocities_rad_per_sec,
friction_scale=1.0
)
# Combine torques
leader_total_torques_nm = {}
for motor_name in leader_gravity_torques_nm:
gravity = leader_gravity_torques_nm.get(motor_name, 0.0)
friction = leader_friction_torques_nm.get(motor_name, 0.0)
leader_total_torques_nm[motor_name] = gravity + friction
# Apply compensation
for motor in leader.bus_right.motors:
full_name = f"right_{motor}"
position = leader_positions_deg.get(full_name, 0.0)
torque = leader_total_torques_nm.get(full_name, 0.0)
kd = leader.get_damping_kd(motor)
leader.bus_right._mit_control(
motor=motor, kp=0.0, kd=kd,
position_degrees=position,
velocity_deg_per_sec=0.0,
torque=torque,
)
for motor in leader.bus_left.motors:
full_name = f"left_{motor}"
position = leader_positions_deg.get(full_name, 0.0)
torque = leader_total_torques_nm.get(full_name, 0.0)
kd = leader.get_damping_kd(motor)
leader.bus_left._mit_control(
motor=motor, kp=0.0, kd=kd,
position_degrees=position,
velocity_deg_per_sec=0.0,
torque=torque,
)
# Send leader positions to follower
follower_action = {}
for joint in leader_positions_deg.keys():
pos_key = f"{joint}.pos"
if pos_key in leader_action:
follower_action[pos_key] = leader_action[pos_key]
if follower_action:
follower.send_action(follower_action)
# Maintain loop rate
loop_duration = time.perf_counter() - loop_start
sleep_time = dt - loop_duration
if sleep_time > 0:
time.sleep(sleep_time)
print("Reset complete")
else:
log_say("Waiting for manual reset")
print(f"Manually reset the environment and press ENTER to continue")
input("Press ENTER when ready...")
print(f"Evaluation complete! {episode_idx} episodes recorded")
log_say("Evaluation complete", blocking=True)
except KeyboardInterrupt:
print("\n\nEvaluation interrupted by user")
finally:
if leader:
leader.bus_right.disable_torque()
leader.bus_left.disable_torque()
time.sleep(0.1)
leader.disconnect()
follower.disconnect()
if listener is not None:
listener.stop()
dataset.finalize()
print("\nUploading to Hugging Face Hub...")
dataset.push_to_hub(private=True)
if __name__ == "__main__":
main()
+653
View File
@@ -0,0 +1,653 @@
#!/usr/bin/env python
# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
OpenArms Policy Evaluation with Real-Time Chunking (RTC)
Evaluates a trained policy on the OpenArms robot using RTC for smooth, continuous motion.
RTC enables large flow-matching policies (Pi0, Pi0.5, SmolVLA) to produce reactive motion
despite high inference latency by asynchronously generating action chunks.
Features:
- Thread-based asynchronous action generation and execution
- RTC for smooth transitions between action chunks
- Dataset recording for evaluation episodes
Example usage:
python examples/openarms/evaluate_with_rtc.py
# With custom RTC parameters
python examples/openarms/evaluate_with_rtc.py \
--rtc.execution_horizon=12 \
--rtc.max_guidance_weight=10.0
"""
import logging
import math
import sys
import time
import traceback
from dataclasses import dataclass, field
from pathlib import Path
from threading import Event, Lock, Thread
import torch
from torch import Tensor
from lerobot.cameras.opencv.configuration_opencv import OpenCVCameraConfig
from lerobot.configs import parser
from lerobot.configs.policies import PreTrainedConfig
from lerobot.configs.types import RTCAttentionSchedule
from lerobot.datasets.lerobot_dataset import LeRobotDataset
from lerobot.datasets.pipeline_features import aggregate_pipeline_dataset_features, create_initial_features
from lerobot.datasets.utils import build_dataset_frame, combine_feature_dicts, hw_to_dataset_features
from lerobot.policies.factory import get_policy_class, make_pre_post_processors
from lerobot.policies.rtc.action_queue import ActionQueue
from lerobot.policies.rtc.configuration_rtc import RTCConfig
from lerobot.policies.rtc.latency_tracker import LatencyTracker
from lerobot.processor import make_default_processors
from lerobot.rl.process import ProcessSignalHandler
from lerobot.robots.openarms.config_openarms_follower import OpenArmsFollowerConfig
from lerobot.robots.openarms.openarms_follower import OpenArmsFollower
from lerobot.utils.hub import HubMixin
from lerobot.utils.utils import init_logging, log_say
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
# ============================================================================
# Default Configuration Constants
# ============================================================================
DEFAULT_HF_MODEL_ID = "lerobot-data-collection/three-folds-pi0"
DEFAULT_HF_EVAL_DATASET_ID = "lerobot-data-collection/three-folds-pi0_eval_rtc"
DEFAULT_TASK_DESCRIPTION = "three-folds-dataset"
DEFAULT_NUM_EPISODES = 1
DEFAULT_FPS = 30
DEFAULT_EPISODE_TIME_SEC = 300
DEFAULT_RESET_TIME_SEC = 60
DEFAULT_FOLLOWER_LEFT_PORT = "can0"
DEFAULT_FOLLOWER_RIGHT_PORT = "can1"
DEFAULT_CAMERA_CONFIG = {
"left_wrist": OpenCVCameraConfig(index_or_path="/dev/video5", width=640, height=480, fps=DEFAULT_FPS),
"right_wrist": OpenCVCameraConfig(index_or_path="/dev/video1", width=640, height=480, fps=DEFAULT_FPS),
"base": OpenCVCameraConfig(index_or_path="/dev/video3", width=640, height=480, fps=DEFAULT_FPS),
}
# ============================================================================
# Thread-Safe Robot Wrapper
# ============================================================================
class RobotWrapper:
"""Thread-safe wrapper for robot operations."""
def __init__(self, robot: OpenArmsFollower):
self.robot = robot
self.lock = Lock()
def get_observation(self) -> dict[str, Tensor]:
with self.lock:
return self.robot.get_observation()
def send_action(self, action: dict) -> None:
with self.lock:
self.robot.send_action(action)
@property
def observation_features(self) -> dict:
with self.lock:
return self.robot.observation_features
@property
def action_features(self) -> dict:
with self.lock:
return self.robot.action_features
@property
def name(self) -> str:
return self.robot.name
# ============================================================================
# Configuration
# ============================================================================
@dataclass
class OpenArmsRTCEvalConfig(HubMixin):
"""Configuration for OpenArms evaluation with RTC."""
policy: PreTrainedConfig | None = None
rtc: RTCConfig = field(
default_factory=lambda: RTCConfig(
enabled=True,
execution_horizon=10,
max_guidance_weight=10.0,
prefix_attention_schedule=RTCAttentionSchedule.EXP,
)
)
model_id: str = DEFAULT_HF_MODEL_ID
eval_dataset_id: str = DEFAULT_HF_EVAL_DATASET_ID
task: str = DEFAULT_TASK_DESCRIPTION
num_episodes: int = DEFAULT_NUM_EPISODES
fps: float = DEFAULT_FPS
episode_time_sec: float = DEFAULT_EPISODE_TIME_SEC
reset_time_sec: float = DEFAULT_RESET_TIME_SEC
follower_left_port: str = DEFAULT_FOLLOWER_LEFT_PORT
follower_right_port: str = DEFAULT_FOLLOWER_RIGHT_PORT
device: str = "cuda"
# Should be higher than inference_delay + execution_horizon
action_queue_size_to_get_new_actions: int = 30
record_dataset: bool = True
push_to_hub: bool = True
use_torch_compile: bool = False
torch_compile_backend: str = "inductor"
torch_compile_mode: str = "default"
torch_compile_disable_cudagraphs: bool = True
def __post_init__(self):
policy_path = parser.get_path_arg("policy")
if policy_path:
cli_overrides = parser.get_cli_overrides("policy")
self.policy = PreTrainedConfig.from_pretrained(policy_path, cli_overrides=cli_overrides)
self.policy.pretrained_path = policy_path
self.model_id = policy_path
elif self.model_id:
self.policy = PreTrainedConfig.from_pretrained(self.model_id)
self.policy.pretrained_path = self.model_id
@classmethod
def __get_path_fields__(cls) -> list[str]:
return ["policy"]
# ============================================================================
# Action Generation Thread
# ============================================================================
def get_actions_thread(
policy,
robot: RobotWrapper,
robot_observation_processor,
action_queue: ActionQueue,
shutdown_event: Event,
cfg: OpenArmsRTCEvalConfig,
episode_active: Event,
):
"""Thread function to asynchronously generate action chunks from the policy."""
try:
logger.info("[GET_ACTIONS] Starting action generation thread")
latency_tracker = LatencyTracker()
time_per_chunk = 1.0 / cfg.fps
hw_features = hw_to_dataset_features(robot.observation_features, "observation")
policy_device = policy.config.device
logger.info(f"[GET_ACTIONS] Loading preprocessor/postprocessor from {cfg.policy.pretrained_path}")
preprocessor, postprocessor = make_pre_post_processors(
policy_cfg=cfg.policy,
pretrained_path=cfg.policy.pretrained_path,
dataset_stats=None,
preprocessor_overrides={
"device_processor": {"device": cfg.device},
},
)
logger.info("[GET_ACTIONS] Preprocessor/postprocessor loaded successfully")
get_actions_threshold = cfg.action_queue_size_to_get_new_actions
if not cfg.rtc.enabled:
get_actions_threshold = 0
while not shutdown_event.is_set():
if not episode_active.is_set():
time.sleep(0.01)
continue
if action_queue.qsize() <= get_actions_threshold:
current_time = time.perf_counter()
action_index_before_inference = action_queue.get_action_index()
prev_actions = action_queue.get_left_over()
inference_latency = latency_tracker.max()
inference_delay = math.ceil(inference_latency / time_per_chunk) if inference_latency else 0
obs = robot.get_observation()
obs_processed = robot_observation_processor(obs)
obs_with_policy_features = build_dataset_frame(
hw_features, obs_processed, prefix="observation"
)
for name in obs_with_policy_features:
obs_with_policy_features[name] = torch.from_numpy(obs_with_policy_features[name])
if "image" in name:
obs_with_policy_features[name] = (
obs_with_policy_features[name].type(torch.float32) / 255
)
obs_with_policy_features[name] = (
obs_with_policy_features[name].permute(2, 0, 1).contiguous()
)
obs_with_policy_features[name] = obs_with_policy_features[name].unsqueeze(0)
obs_with_policy_features[name] = obs_with_policy_features[name].to(policy_device)
obs_with_policy_features["task"] = [cfg.task]
obs_with_policy_features["robot_type"] = robot.name
preprocessed_obs = preprocessor(obs_with_policy_features)
actions = policy.predict_action_chunk(
preprocessed_obs,
inference_delay=inference_delay,
prev_chunk_left_over=prev_actions,
)
original_actions = actions.squeeze(0).clone()
postprocessed_actions = postprocessor(actions).squeeze(0)
new_latency = time.perf_counter() - current_time
new_delay = math.ceil(new_latency / time_per_chunk)
latency_tracker.add(new_latency)
if cfg.action_queue_size_to_get_new_actions < cfg.rtc.execution_horizon + new_delay:
logger.warning(
"[GET_ACTIONS] action_queue_size_to_get_new_actions too small. "
"Should be higher than inference delay + execution horizon."
)
action_queue.merge(
original_actions, postprocessed_actions, new_delay, action_index_before_inference
)
logger.debug(
f"[GET_ACTIONS] Generated chunk, latency={new_latency:.3f}s, "
f"delay={new_delay}, queue_size={action_queue.qsize()}"
)
else:
time.sleep(0.01)
logger.info("[GET_ACTIONS] Action generation thread shutting down")
except Exception as e:
logger.error(f"[GET_ACTIONS] Fatal exception: {e}")
logger.error(traceback.format_exc())
shutdown_event.set()
sys.exit(1)
# ============================================================================
# Action Execution Thread
# ============================================================================
def actor_thread(
robot: RobotWrapper,
robot_action_processor,
action_queue: ActionQueue,
shutdown_event: Event,
cfg: OpenArmsRTCEvalConfig,
episode_active: Event,
dataset: LeRobotDataset | None,
dataset_lock: Lock,
teleop_action_processor,
robot_observation_processor,
):
"""Thread function to execute actions on the robot."""
try:
logger.info("[ACTOR] Starting actor thread")
action_count = 0
action_interval = 1.0 / cfg.fps
action_keys = [k for k in robot.action_features.keys() if k.endswith(".pos")]
while not shutdown_event.is_set():
if not episode_active.is_set():
time.sleep(0.01)
continue
start_time = time.perf_counter()
action = action_queue.get()
if action is not None:
action = action.cpu()
action_dict = {}
for i, key in enumerate(action_keys):
if i < len(action):
action_dict[key] = action[i].item()
action_processed = robot_action_processor((action_dict, None))
robot.send_action(action_processed)
if cfg.record_dataset and dataset is not None:
with dataset_lock:
obs = robot.get_observation()
obs_processed = robot_observation_processor(obs)
action_for_dataset = teleop_action_processor((action_dict, None))
frame = {}
for key, value in obs_processed.items():
frame[f"observation.{key}"] = value
for key, value in action_for_dataset.items():
frame[f"action.{key}"] = value
frame["task"] = cfg.task
dataset.add_frame(frame)
action_count += 1
dt_s = time.perf_counter() - start_time
sleep_time = max(0, action_interval - dt_s - 0.001)
if sleep_time > 0:
time.sleep(sleep_time)
logger.info(f"[ACTOR] Actor thread shutting down. Total actions executed: {action_count}")
except Exception as e:
logger.error(f"[ACTOR] Fatal exception: {e}")
logger.error(traceback.format_exc())
shutdown_event.set()
sys.exit(1)
# ============================================================================
# Main Evaluation Function
# ============================================================================
def _apply_torch_compile(policy, cfg: OpenArmsRTCEvalConfig):
"""Apply torch.compile to the policy's predict_action_chunk method."""
if policy.name in ["pi05", "pi0"]:
return policy
try:
if not hasattr(torch, "compile"):
logger.warning(
f"torch.compile not available. Requires PyTorch 2.0+. "
f"Current version: {torch.__version__}. Skipping compilation."
)
return policy
logger.info("Applying torch.compile to predict_action_chunk...")
compile_kwargs = {
"backend": cfg.torch_compile_backend,
"mode": cfg.torch_compile_mode,
}
if cfg.torch_compile_disable_cudagraphs:
compile_kwargs["options"] = {"triton.cudagraphs": False}
original_method = policy.predict_action_chunk
compiled_method = torch.compile(original_method, **compile_kwargs)
policy.predict_action_chunk = compiled_method
logger.info("Successfully compiled predict_action_chunk")
except Exception as e:
logger.error(f"Failed to apply torch.compile: {e}")
logger.warning("Continuing without torch.compile")
return policy
@parser.wrap()
def main(cfg: OpenArmsRTCEvalConfig):
"""Main evaluation function with RTC."""
init_logging()
print("=" * 60)
print("OpenArms Policy Evaluation with RTC")
print("=" * 60)
print(f"\nModel: {cfg.model_id}")
print(f"Evaluation Dataset: {cfg.eval_dataset_id}")
print(f"Task: {cfg.task}")
print(f"Episodes: {cfg.num_episodes}")
print(f"Episode Duration: {cfg.episode_time_sec}s")
print(f"RTC Enabled: {cfg.rtc.enabled}")
print(f"RTC Execution Horizon: {cfg.rtc.execution_horizon}")
print(f"RTC Max Guidance Weight: {cfg.rtc.max_guidance_weight}")
print(f"Device: {cfg.device}")
print("=" * 60)
signal_handler = ProcessSignalHandler(use_threads=True, display_pid=False)
shutdown_event = signal_handler.shutdown_event
episode_active = Event()
# Initialize Robot
follower_config = OpenArmsFollowerConfig(
port_left=cfg.follower_left_port,
port_right=cfg.follower_right_port,
can_interface="socketcan",
id="openarms_follower",
disable_torque_on_disconnect=True,
max_relative_target=10.0,
cameras=DEFAULT_CAMERA_CONFIG,
)
follower = OpenArmsFollower(follower_config)
follower.connect(calibrate=False)
if not follower.is_connected:
raise RuntimeError("Follower robot failed to connect!")
robot = RobotWrapper(follower)
logger.info("Follower robot connected")
# Build Processors and Dataset Features
teleop_action_processor, robot_action_processor, robot_observation_processor = make_default_processors()
action_features_hw = {}
for key, value in follower.action_features.items():
if key.endswith(".pos"):
action_features_hw[key] = value
dataset_features = combine_feature_dicts(
aggregate_pipeline_dataset_features(
pipeline=teleop_action_processor,
initial_features=create_initial_features(action=action_features_hw),
use_videos=True,
),
aggregate_pipeline_dataset_features(
pipeline=robot_observation_processor,
initial_features=create_initial_features(observation=follower.observation_features),
use_videos=True,
),
)
# Create or Load Dataset
dataset = None
dataset_lock = Lock()
if cfg.record_dataset:
dataset_path = Path.home() / ".cache" / "huggingface" / "lerobot" / cfg.eval_dataset_id
if dataset_path.exists():
logger.info(f"Evaluation dataset exists at: {dataset_path}")
logger.info("New episodes will be appended.")
choice = input("Continue? (y/n): ").strip().lower()
if choice != "y":
logger.info("Aborting evaluation.")
follower.disconnect()
return
dataset = LeRobotDataset.create(
repo_id=cfg.eval_dataset_id,
fps=int(cfg.fps),
features=dataset_features,
robot_type=follower.name,
use_videos=True,
image_writer_processes=0,
image_writer_threads=12,
)
logger.info(f"Dataset created: {cfg.eval_dataset_id}")
# Load Policy
logger.info(f"Loading policy from: {cfg.model_id}")
policy_class = get_policy_class(cfg.policy.type)
config = PreTrainedConfig.from_pretrained(cfg.policy.pretrained_path)
if cfg.policy.type in ["pi05", "pi0"]:
config.compile_model = cfg.use_torch_compile
policy = policy_class.from_pretrained(cfg.policy.pretrained_path, config=config)
policy.config.rtc_config = cfg.rtc
policy.init_rtc_processor()
assert policy.name in ["smolvla", "pi05", "pi0"], "Only smolvla, pi05, and pi0 are supported for RTC"
policy = policy.to(cfg.device)
policy.eval()
if cfg.use_torch_compile:
policy = _apply_torch_compile(policy, cfg)
logger.info(f"Policy loaded: {policy.name}")
# Create Action Queue and Start Threads
action_queue = ActionQueue(cfg.rtc)
get_actions_t = Thread(
target=get_actions_thread,
args=(
policy,
robot,
robot_observation_processor,
action_queue,
shutdown_event,
cfg,
episode_active,
),
daemon=True,
name="GetActions",
)
get_actions_t.start()
logger.info("Started action generation thread")
actor_t = Thread(
target=actor_thread,
args=(
robot,
robot_action_processor,
action_queue,
shutdown_event,
cfg,
episode_active,
dataset,
dataset_lock,
teleop_action_processor,
robot_observation_processor,
),
daemon=True,
name="Actor",
)
actor_t.start()
logger.info("Started actor thread")
# Run Evaluation Episodes
episode_idx = 0
try:
while episode_idx < cfg.num_episodes and not shutdown_event.is_set():
log_say(f"Evaluating episode {episode_idx + 1} of {cfg.num_episodes}")
logger.info(f"\n{'='*40}")
logger.info(f"Episode {episode_idx + 1} / {cfg.num_episodes}")
logger.info(f"{'='*40}")
action_queue = ActionQueue(cfg.rtc)
episode_active.set()
episode_start_time = time.time()
while (time.time() - episode_start_time) < cfg.episode_time_sec:
if shutdown_event.is_set():
break
elapsed = time.time() - episode_start_time
if int(elapsed) % 10 == 0 and int(elapsed) > 0:
logger.info(
f"[MAIN] Episode progress: {elapsed:.0f}/{cfg.episode_time_sec}s, "
f"queue_size={action_queue.qsize()}"
)
time.sleep(0.5)
episode_active.clear()
logger.info(f"Episode {episode_idx + 1} completed")
if cfg.record_dataset and dataset is not None:
with dataset_lock:
if dataset.episode_buffer is not None and dataset.episode_buffer.get("size", 0) > 0:
logger.info(
f"Saving episode {episode_idx + 1} "
f"({dataset.episode_buffer['size']} frames)"
)
dataset.save_episode()
episode_idx += 1
# Manual reset between episodes
if not shutdown_event.is_set() and episode_idx < cfg.num_episodes:
log_say("Waiting for manual reset")
logger.info("Manually reset the environment and press ENTER to continue")
input("Press ENTER when ready...")
logger.info(f"Evaluation complete! {episode_idx} episodes recorded")
log_say("Evaluation complete", blocking=True)
except KeyboardInterrupt:
logger.info("\n\nEvaluation interrupted by user")
finally:
shutdown_event.set()
episode_active.clear()
if get_actions_t.is_alive():
logger.info("Waiting for action generation thread to finish...")
get_actions_t.join(timeout=5.0)
if actor_t.is_alive():
logger.info("Waiting for actor thread to finish...")
actor_t.join(timeout=5.0)
follower.disconnect()
logger.info("Follower disconnected")
if cfg.record_dataset and dataset is not None:
dataset.finalize()
if cfg.push_to_hub:
logger.info("Uploading to Hugging Face Hub...")
dataset.push_to_hub(private=True)
logger.info("Cleanup completed")
if __name__ == "__main__":
main()
+216
View File
@@ -0,0 +1,216 @@
import time
import numpy as np
from lerobot.robots.openarms.openarms_follower import OpenArmsFollower
from lerobot.robots.openarms.config_openarms_follower import OpenArmsFollowerConfig
# Friction model parameters from OpenArms config/follower.yaml
# τ_fric(ω) = Fo + Fv·ω + Fc·tanh(k·ω)
# For 8 motors: [joint_1, joint_2, joint_3, joint_4, joint_5, joint_6, joint_7, gripper]
FRICTION_PARAMS = {
"Fc": [0.306, 0.306, 0.40, 0.166, 0.050, 0.093, 0.172, 0.0512], # Coulomb friction [Nm]
"k": [28.417, 28.417, 29.065, 130.038, 151.771, 242.287, 7.888, 4.000], # tanh steepness
"Fv": [0.063, 0.0630, 0.604, 0.813, 0.029, 0.072, 0.084, 0.084], # Viscous friction [Nm·s/rad]
"Fo": [0.088, 0.088, 0.008, -0.058, 0.005, 0.009, -0.059, -0.050], # Offset torque [Nm]
}
# Constants from OpenArms C++ implementation
AMP_TMP = 1.0
COEF_TMP = 0.1
FRICTION_SCALE = 1.0 # OpenArms C++ uses 0.3 factor in unilateral mode
DAMPING_KD = [0.5, 0.5, 0.5, 0.5, 0.1, 0.1, 0.1, 0.1] # Damping gains for stability
def compute_friction_torque(velocity_rad_per_sec: float, motor_index: int) -> float:
"""
Compute friction torque for a single motor using the tanh friction model.
Args:
velocity_rad_per_sec: Angular velocity in rad/s
motor_index: Index of the motor (0-7)
Returns:
Friction torque in N·m (scaled for stability)
"""
Fc = FRICTION_PARAMS["Fc"][motor_index]
k = FRICTION_PARAMS["k"][motor_index]
Fv = FRICTION_PARAMS["Fv"][motor_index]
Fo = FRICTION_PARAMS["Fo"][motor_index]
# Friction model: τ_fric = amp * Fc * tanh(coef * k * ω) + Fv * ω + Fo
friction_torque = (
AMP_TMP * Fc * np.tanh(COEF_TMP * k * velocity_rad_per_sec) +
Fv * velocity_rad_per_sec +
Fo
)
# Scale down friction compensation for stability at lower control rates
# (OpenArms C++ uses 0.3 factor in unilateral mode)!!
friction_torque *= FRICTION_SCALE
return friction_torque
def main() -> None:
config = OpenArmsFollowerConfig(
port_left="can0",
port_right="can1",
can_interface="socketcan",
id="openarms_follower",
disable_torque_on_disconnect=True,
max_relative_target=5.0,
)
print("Initializing robot...")
follower = OpenArmsFollower(config)
follower.connect(calibrate=True)
print(f"Applying friction compensation")
print(" 1. Support the arm before starting")
print(" 2. The arm will be held in place by friction compensation")
print(" 3. You should be able to move it with gentle force")
print("\nPress ENTER when ready to start...")
input()
print(f"✓ Motors enabled")
print("\nStarting friction compensation loop...")
print("Press Ctrl+C to stop\n")
loop_times = []
last_print_time = time.perf_counter()
# Motor name to index mapping
motor_name_to_index = {
"joint_1": 0,
"joint_2": 1,
"joint_3": 2,
"joint_4": 3,
"joint_5": 4,
"joint_6": 5,
"joint_7": 6,
"gripper": 7,
}
try:
while True:
loop_start = time.perf_counter()
# Get current joint positions and velocities from robot
obs = follower.get_observation()
# Extract velocities in degrees per second
velocities_deg_per_sec = {}
positions_deg = {}
for motor in follower.bus_right.motors:
vel_key = f"right_{motor}.vel"
pos_key = f"right_{motor}.pos"
if vel_key in obs:
velocities_deg_per_sec[f"right_{motor}"] = obs[vel_key]
if pos_key in obs:
positions_deg[f"right_{motor}"] = obs[pos_key]
for motor in follower.bus_left.motors:
vel_key = f"left_{motor}.vel"
pos_key = f"left_{motor}.pos"
if vel_key in obs:
velocities_deg_per_sec[f"left_{motor}"] = obs[vel_key]
if pos_key in obs:
positions_deg[f"left_{motor}"] = obs[pos_key]
# Convert velocities to rad/s and compute friction torques
friction_torques_nm = {}
for motor_full_name, velocity_deg_per_sec in velocities_deg_per_sec.items():
# Extract motor name without arm prefix
if motor_full_name.startswith("right_"):
motor_name = motor_full_name.removeprefix("right_")
elif motor_full_name.startswith("left_"):
motor_name = motor_full_name.removeprefix("left_")
else:
continue
# Get motor index for friction parameters
motor_index = motor_name_to_index.get(motor_name, 0)
# Convert velocity to rad/s
velocity_rad_per_sec = np.deg2rad(velocity_deg_per_sec)
# Compute friction torque
friction_torque = compute_friction_torque(velocity_rad_per_sec, motor_index)
friction_torques_nm[motor_full_name] = friction_torque
# Apply friction compensation to right arm (all joints INCLUDING gripper)
for motor in follower.bus_right.motors:
full_name = f"right_{motor}"
position = positions_deg.get(full_name, 0.0)
torque = friction_torques_nm.get(full_name, 0.0)
# Get motor index for damping gain
motor_index = motor_name_to_index.get(motor, 0)
kd = DAMPING_KD[motor_index]
# Send MIT control command with friction compensation + damping
follower.bus_right._mit_control(
motor=motor,
kp=0.0, # No position control
kd=kd, # Add damping for stability
position_degrees=position,
velocity_deg_per_sec=0.0,
torque=torque
)
# Apply friction compensation to left arm (all joints INCLUDING gripper)
for motor in follower.bus_left.motors:
full_name = f"left_{motor}"
position = positions_deg.get(full_name, 0.0)
torque = friction_torques_nm.get(full_name, 0.0)
# Get motor index for damping gain
motor_index = motor_name_to_index.get(motor, 0)
kd = DAMPING_KD[motor_index]
# Send MIT control command with friction compensation + damping
follower.bus_left._mit_control(
motor=motor,
kp=0.0, # No position control
kd=kd, # Add damping for stability
position_degrees=position,
velocity_deg_per_sec=0.0,
torque=torque
)
# Measure loop time
loop_end = time.perf_counter()
loop_time = loop_end - loop_start
loop_times.append(loop_time)
# Print status every 2 seconds
if loop_end - last_print_time >= 2.0:
if loop_times:
avg_time = sum(loop_times) / len(loop_times)
current_hz = 1.0 / avg_time if avg_time > 0 else 0
print(f"{current_hz:.1f} Hz")
loop_times = []
last_print_time = loop_end
time.sleep(0.001)
except KeyboardInterrupt:
print("\n\nStopping friction compensation...")
finally:
print("\nDisabling all motors and disconnecting...")
follower.bus_right.disable_torque()
follower.bus_left.disable_torque()
time.sleep(0.1)
follower.disconnect()
print("✓ Safe shutdown complete")
if __name__ == "__main__":
main()
+142
View File
@@ -0,0 +1,142 @@
import time
import numpy as np
import pinocchio as pin
from os.path import join, dirname, exists, expanduser
from lerobot.robots.openarms.openarms_follower import OpenArmsFollower
from lerobot.robots.openarms.config_openarms_follower import OpenArmsFollowerConfig
def main() -> None:
config = OpenArmsFollowerConfig(
port_left="can0",
port_right="can1",
can_interface="socketcan",
id="openarms_follower",
disable_torque_on_disconnect=True,
max_relative_target=5.0,
)
print("Initializing robot...")
follower = OpenArmsFollower(config)
follower.connect(calibrate=True)
# Load URDF for Pinocchio dynamics
urdf_path = "/home/croissant/Documents/openarm_description/openarm_bimanual_pybullet.urdf"
pin_robot = pin.RobotWrapper.BuildFromURDF(urdf_path, dirname(urdf_path))
pin_robot.data = pin_robot.model.createData()
print(f"✓ Loaded Pinocchio model with {pin_robot.nq} DoFs")
follower.pin_robot = pin_robot
print(f"Applying gravity compensation")
print(" 1. Support the arm before starting")
print(" 2. The arm will be held in place by gravity compensation")
print(" 3. You should be able to move it with gentle force")
print("\nPress ENTER when ready to start...")
input()
print(f"✓ Motors enabled")
print("\nStarting gravity compensation loop...")
print("Press Ctrl+C to stop\n")
loop_times = []
last_print_time = time.perf_counter()
try:
while True:
loop_start = time.perf_counter()
# Get current joint positions from robot
obs = follower.get_observation()
# Extract positions in degrees
positions_deg = {}
for motor in follower.bus_right.motors:
key = f"right_{motor}.pos"
if key in obs:
positions_deg[f"right_{motor}"] = obs[key]
for motor in follower.bus_left.motors:
key = f"left_{motor}.pos"
if key in obs:
positions_deg[f"left_{motor}"] = obs[key]
# Convert to radians and calculate gravity torques
# Use the built-in method from OpenArmsFollower
positions_rad = {k: np.deg2rad(v) for k, v in positions_deg.items()}
torques_nm = follower._gravity_from_q(positions_rad)
# Apply gravity compensation to right arm (all joints except gripper)
for motor in follower.bus_right.motors:
if motor == "gripper":
continue # Skip gripper
full_name = f"right_{motor}"
position = positions_deg.get(full_name, 0.0)
torque = torques_nm.get(full_name, 0.0)
# Send MIT control command with gravity compensation torque
follower.bus_right._mit_control(
motor=motor,
kp=0.0, # No position control
kd=0.0, # No velocity damping
position_degrees=position,
velocity_deg_per_sec=0.0,
torque=torque
)
# Apply gravity compensation to left arm (all joints except gripper)
for motor in follower.bus_left.motors:
if motor == "gripper":
continue # Skip gripper
full_name = f"left_{motor}"
position = positions_deg.get(full_name, 0.0)
torque = torques_nm.get(full_name, 0.0)
# Send MIT control command with gravity compensation torque
follower.bus_left._mit_control(
motor=motor,
kp=0.0, # No position control
kd=0.0, # No velocity damping
position_degrees=position,
velocity_deg_per_sec=0.0,
torque=torque
)
# Measure loop time
loop_end = time.perf_counter()
loop_time = loop_end - loop_start
loop_times.append(loop_time)
# Print status every 2 seconds
if loop_end - last_print_time >= 2.0:
if loop_times:
avg_time = sum(loop_times) / len(loop_times)
current_hz = 1.0 / avg_time if avg_time > 0 else 0
print(f"{current_hz:.1f} Hz ({avg_time*1000:.1f} ms)")
loop_times = []
last_print_time = loop_end
time.sleep(0.005)
except KeyboardInterrupt:
print("\n\nStopping gravity compensation...")
finally:
print("\nDisabling all motors and disconnecting...")
follower.bus_right.disable_torque()
follower.bus_left.disable_torque()
time.sleep(0.1)
follower.disconnect()
print("✓ Safe shutdown complete")
if __name__ == "__main__":
main()
@@ -0,0 +1,395 @@
"""
OpenArms Dataset Recording with Gravity + Friction Compensation
Records a dataset using OpenArms follower robot with leader teleoperator.
Leader arms have gravity and friction compensation for weightless, easy movement.
Includes 3 cameras: left wrist, right wrist, and base camera.
Uses the same compensation approach as teleop_with_compensation.py
"""
import shutil
import time
from pathlib import Path
import numpy as np
from lerobot.cameras.opencv.configuration_opencv import OpenCVCameraConfig
from lerobot.datasets.lerobot_dataset import LeRobotDataset
from lerobot.datasets.utils import build_dataset_frame, hw_to_dataset_features
from lerobot.robots.openarms.config_openarms_follower import OpenArmsFollowerConfig
from lerobot.robots.openarms.openarms_follower import OpenArmsFollower
from lerobot.teleoperators.openarms.config_openarms_leader import OpenArmsLeaderConfig
from lerobot.teleoperators.openarms.openarms_leader import OpenArmsLeader
from lerobot.utils.control_utils import init_keyboard_listener
from lerobot.utils.utils import log_say
from lerobot.utils.visualization_utils import init_rerun, log_rerun_data
# Recording parameters
NUM_EPISODES = 1
FPS = 30
EPISODE_TIME_SEC = 600
RESET_TIME_SEC = 120
TASK_DESCRIPTION = "OpenArms task description"
# Friction compensation scale factor (1.0 = full, 0.3 = 30% for stability)
FRICTION_SCALE = 1.0
def record_loop_with_compensation(
robot,
leader,
events,
fps,
dataset,
dataset_features,
control_time_s,
single_task,
display_data=True,
):
"""
Custom record loop that applies gravity + friction compensation to leader.
Based on record_loop but with integrated compensation.
"""
dt = 1 / fps
episode_start_time = time.perf_counter()
# All joints (both arms)
all_joints = []
for motor in leader.bus_right.motors:
all_joints.append(f"right_{motor}")
for motor in leader.bus_left.motors:
all_joints.append(f"left_{motor}")
while True:
loop_start = time.perf_counter()
elapsed = loop_start - episode_start_time
# Check if we should exit
if elapsed >= control_time_s or events["exit_early"] or events["stop_recording"]:
break
# Get leader state
leader_action = leader.get_action()
# Extract positions and velocities in degrees
leader_positions_deg = {}
leader_velocities_deg_per_sec = {}
for motor in leader.bus_right.motors:
pos_key = f"right_{motor}.pos"
vel_key = f"right_{motor}.vel"
if pos_key in leader_action:
leader_positions_deg[f"right_{motor}"] = leader_action[pos_key]
if vel_key in leader_action:
leader_velocities_deg_per_sec[f"right_{motor}"] = leader_action[vel_key]
for motor in leader.bus_left.motors:
pos_key = f"left_{motor}.pos"
vel_key = f"left_{motor}.vel"
if pos_key in leader_action:
leader_positions_deg[f"left_{motor}"] = leader_action[pos_key]
if vel_key in leader_action:
leader_velocities_deg_per_sec[f"left_{motor}"] = leader_action[vel_key]
# Calculate gravity torques for leader using built-in method
leader_positions_rad = {k: np.deg2rad(v) for k, v in leader_positions_deg.items()}
leader_gravity_torques_nm = leader._gravity_from_q(leader_positions_rad)
# Calculate friction torques for leader using built-in method
leader_velocities_rad_per_sec = {k: np.deg2rad(v) for k, v in leader_velocities_deg_per_sec.items()}
leader_friction_torques_nm = leader._friction_from_velocity(
leader_velocities_rad_per_sec,
friction_scale=FRICTION_SCALE
)
# Combine gravity + friction torques
leader_total_torques_nm = {}
for motor_name in leader_gravity_torques_nm:
gravity = leader_gravity_torques_nm.get(motor_name, 0.0)
friction = leader_friction_torques_nm.get(motor_name, 0.0)
leader_total_torques_nm[motor_name] = gravity + friction
# Apply gravity + friction compensation to leader RIGHT arm (all joints including gripper)
for motor in leader.bus_right.motors:
full_name = f"right_{motor}"
position = leader_positions_deg.get(full_name, 0.0)
torque = leader_total_torques_nm.get(full_name, 0.0)
# Get damping gain for stability
kd = leader.get_damping_kd(motor)
leader.bus_right._mit_control(
motor=motor,
kp=0.0,
kd=kd, # Add damping for stability
position_degrees=position,
velocity_deg_per_sec=0.0,
torque=torque,
)
# Apply gravity + friction compensation to leader LEFT arm (all joints including gripper)
for motor in leader.bus_left.motors:
full_name = f"left_{motor}"
position = leader_positions_deg.get(full_name, 0.0)
torque = leader_total_torques_nm.get(full_name, 0.0)
# Get damping gain for stability
kd = leader.get_damping_kd(motor)
leader.bus_left._mit_control(
motor=motor,
kp=0.0,
kd=kd, # Add damping for stability
position_degrees=position,
velocity_deg_per_sec=0.0,
torque=torque,
)
# Send leader positions to follower (both arms)
follower_action = {}
for joint in all_joints:
pos_key = f"{joint}.pos"
if pos_key in leader_action:
follower_action[pos_key] = leader_action[pos_key]
# Send action to robot
if follower_action:
robot.send_action(follower_action)
# Get observation from robot (includes camera images)
observation = robot.get_observation()
# Add to dataset if we have a dataset
if dataset is not None:
# Build properly formatted observation frame
obs_frame = build_dataset_frame(dataset_features, observation, prefix="observation")
# Build properly formatted action frame (keep .pos suffix - it matches the feature names)
action_frame = build_dataset_frame(dataset_features, follower_action, prefix="action")
# Combine into single frame
frame = {**obs_frame, **action_frame}
# Add metadata (task is required, timestamp will be auto-calculated by add_frame)
frame["task"] = single_task
dataset.add_frame(frame)
# Display data if requested
if display_data:
log_rerun_data(observation=observation, action=follower_action)
# Maintain loop rate
loop_duration = time.perf_counter() - loop_start
sleep_time = dt - loop_duration
if sleep_time > 0:
time.sleep(sleep_time)
def main():
"""Main recording loop with gravity compensation."""
print("=" * 70)
print("OpenArms Dataset Recording with Compensation")
print("=" * 70)
# Create camera configurations (3 cameras: left wrist, right wrist, base)
# Using actual device paths found by lerobot-find-cameras opencv
camera_config = {
"left_wrist": OpenCVCameraConfig(index_or_path="/dev/video0", width=640, height=480, fps=FPS),
"right_wrist": OpenCVCameraConfig(index_or_path="/dev/video1", width=640, height=480, fps=FPS),
"base": OpenCVCameraConfig(index_or_path="/dev/video7", width=640, height=480, fps=FPS),
}
# Configure follower robot with cameras
follower_config = OpenArmsFollowerConfig(
port_left="can2",
port_right="can3",
can_interface="socketcan",
id="openarms_follower",
disable_torque_on_disconnect=True,
max_relative_target=10.0,
cameras=camera_config,
)
# Configure leader teleoperator (no cameras needed)
leader_config = OpenArmsLeaderConfig(
port_left="can0",
port_right="can1",
can_interface="socketcan",
id="openarms_leader",
manual_control=False, # Enable torque control for gravity compensation
)
# Initialize robot and teleoperator
print("\nInitializing devices...")
follower = OpenArmsFollower(follower_config)
leader = OpenArmsLeader(leader_config)
# Connect devices
print("Connecting and calibrating...")
follower.connect(calibrate=True)
leader.connect(calibrate=True)
# Verify URDF is loaded for gravity compensation
if leader.pin_robot is None:
raise RuntimeError("URDF model not loaded on leader. Gravity compensation not available.")
# Configure the dataset features
# For actions, we only want to record positions (not velocity or torque)
action_features_hw = {}
for key, value in follower.action_features.items():
if key.endswith(".pos"):
action_features_hw[key] = value
action_features = hw_to_dataset_features(action_features_hw, "action")
obs_features = hw_to_dataset_features(follower.observation_features, "observation")
dataset_features = {**action_features, **obs_features}
# Create the dataset
print("\nCreating dataset...")
repo_id = "<hf_username>/<dataset_repo_id>" # TODO: Replace with your Hugging Face repo
# Check if dataset already exists and prompt user
dataset_path = Path.home() / ".cache" / "huggingface" / "lerobot" / repo_id
while dataset_path.exists():
print(f"\nDataset already exists at: {dataset_path}")
print("\nOptions:")
print(" 1. Overwrite existing dataset")
print(" 2. Use a different name")
print(" 3. Abort")
choice = input("\nEnter your choice (1/2/3): ").strip()
if choice == '1':
print(f"Removing existing dataset...")
shutil.rmtree(dataset_path)
print("✓ Existing dataset removed")
break
elif choice == '2':
print("\nCurrent repo_id:", repo_id)
new_repo_id = input("Enter new repo_id (format: <username>/<dataset_name>): ").strip()
if new_repo_id and '/' in new_repo_id:
repo_id = new_repo_id
dataset_path = Path.home() / ".cache" / "huggingface" / "lerobot" / repo_id
print(f"✓ Using new repo_id: {repo_id}")
# Loop will continue if this new path also exists
else:
print("Invalid repo_id format. Please use format: <username>/<dataset_name>")
elif choice == '3':
print("Aborting. Please remove the existing dataset manually or restart with a different repo_id.")
follower.disconnect()
leader.disconnect()
return
else:
print("Invalid choice. Please enter 1, 2, or 3.")
dataset = LeRobotDataset.create(
repo_id=repo_id,
fps=FPS,
features=dataset_features,
robot_type=follower.name,
use_videos=True,
image_writer_threads=4,
)
# Initialize keyboard listener and visualization
_, events = init_keyboard_listener()
init_rerun(session_name="openarms_recording")
# Enable motors on both leader arms for gravity compensation
leader.bus_right.enable_torque()
leader.bus_left.enable_torque()
time.sleep(0.1)
print("\n" + "=" * 70)
print(f"Recording {NUM_EPISODES} episodes")
print(f"Task: {TASK_DESCRIPTION}")
print("=" * 70)
print("\nLeader BOTH arms: Gravity + Friction comp | Follower BOTH arms: Teleop")
print("\nKeyboard controls:")
print(" - Press 'q' to stop recording")
print(" - Press 'r' to re-record current episode")
print("=" * 70)
episode_idx = 0
try:
while episode_idx < NUM_EPISODES and not events["stop_recording"]:
log_say(f"Recording episode {episode_idx + 1} of {NUM_EPISODES}")
# Record episode with compensation active
record_loop_with_compensation(
robot=follower,
leader=leader,
events=events,
fps=FPS,
dataset=dataset,
dataset_features=dataset_features,
control_time_s=EPISODE_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
)
# Reset the environment if not stopping or re-recording
if not events["stop_recording"] and (episode_idx < NUM_EPISODES - 1 or events["rerecord_episode"]):
log_say("Reset the environment")
record_loop_with_compensation(
robot=follower,
leader=leader,
events=events,
fps=FPS,
dataset=None, # Don't save reset period
dataset_features=dataset_features,
control_time_s=RESET_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
)
# Handle re-recording
if events["rerecord_episode"]:
log_say("Re-recording episode")
events["rerecord_episode"] = False
events["exit_early"] = False
dataset.clear_episode_buffer()
continue
# Only save episode if frames were recorded
if dataset.episode_buffer is not None and dataset.episode_buffer["size"] > 0:
dataset.save_episode()
episode_idx += 1
else:
log_say("No frames recorded, skipping episode save")
# Clear the empty buffer
dataset.episode_buffer = None
except KeyboardInterrupt:
print("\n\nStopping recording...")
finally:
# Clean up
log_say("Stop recording")
try:
leader.bus_right.disable_torque()
leader.bus_left.disable_torque()
time.sleep(0.1)
leader.disconnect()
follower.disconnect()
print("✓ Shutdown complete")
except Exception as e:
print(f"Shutdown error: {e}")
# Upload dataset
print("\nUploading dataset to Hugging Face Hub...")
try:
dataset.push_to_hub()
print("✓ Dataset uploaded successfully")
except Exception as e:
print(f"Warning: Failed to upload dataset: {e}")
print("You can manually upload later using: dataset.push_to_hub()")
print("✓ Recording complete!")
if __name__ == "__main__":
main()
+166
View File
@@ -0,0 +1,166 @@
#!/usr/bin/env python
# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
OpenArms Dataset Replay Example
Replays position actions from a recorded dataset on an OpenArms follower robot.
Only position commands (ending with .pos) are replayed, not velocity or torque.
Example usage:
python examples/openarms/replay.py
"""
import time
from lerobot.datasets.lerobot_dataset import LeRobotDataset
from lerobot.robots.openarms.config_openarms_follower import OpenArmsFollowerConfig
from lerobot.robots.openarms.openarms_follower import OpenArmsFollower
from lerobot.utils.constants import ACTION
from lerobot.utils.robot_utils import busy_wait
from lerobot.utils.utils import log_say
# Configuration
EPISODE_IDX = 0
DATASET_REPO_ID = "lerobot-data-collection/replay-this-2025-11-02-17-58" # TODO: Replace with your dataset
DATASET_ROOT = None # Use default cache location, or specify custom path
# Robot configuration - adjust these to match your setup
ROBOT_CONFIG = OpenArmsFollowerConfig(
port_left="can2", # CAN interface for left arm
port_right="can3", # CAN interface for right arm
can_interface="socketcan",
id="openarms_follower",
disable_torque_on_disconnect=True,
max_relative_target=10.0, # Safety limit: max degrees to move per step
)
def main():
"""Main replay function."""
print("=" * 70)
print("OpenArms Dataset Replay")
print("=" * 70)
print(f"\nDataset: {DATASET_REPO_ID}")
print(f"Episode: {EPISODE_IDX}")
print(f"Robot: {ROBOT_CONFIG.id}")
print(f" Left arm: {ROBOT_CONFIG.port_left}")
print(f" Right arm: {ROBOT_CONFIG.port_right}")
print("\n" + "=" * 70)
# Initialize the robot
print("\n[1/3] Initializing robot...")
robot = OpenArmsFollower(ROBOT_CONFIG)
# Load the dataset
print(f"\n[2/3] Loading dataset '{DATASET_REPO_ID}'...")
dataset = LeRobotDataset(
DATASET_REPO_ID,
root=DATASET_ROOT,
episodes=[EPISODE_IDX]
)
# Filter dataset to only include frames from the specified episode
# (required for dataset V3.0 where episodes are chunked)
episode_frames = dataset.hf_dataset.filter(
lambda x: x["episode_index"] == EPISODE_IDX
)
if len(episode_frames) == 0:
raise ValueError(
f"No frames found for episode {EPISODE_IDX} in dataset {DATASET_REPO_ID}"
)
print(f" Found {len(episode_frames)} frames in episode {EPISODE_IDX}")
# Extract action features from dataset
action_features = dataset.features.get(ACTION, {})
action_names = action_features.get("names", [])
# Filter to only position actions (ending with .pos)
position_action_names = [name for name in action_names if name.endswith(".pos")]
if not position_action_names:
raise ValueError(
f"No position actions found in dataset. Action names: {action_names}"
)
print(f" Found {len(position_action_names)} position actions to replay")
print(f" Actions: {', '.join(position_action_names[:5])}{'...' if len(position_action_names) > 5 else ''}")
# Select only action columns from dataset
actions = episode_frames.select_columns(ACTION)
# Connect to the robot
print(f"\n[3/3] Connecting to robot...")
robot.connect(calibrate=False) # Skip calibration for replay
if not robot.is_connected:
raise RuntimeError("Robot failed to connect!")
print("\n" + "=" * 70)
print("Ready to replay!")
print("=" * 70)
print("\nThe robot will replay the recorded positions.")
print("Press Ctrl+C to stop at any time.\n")
input("Press ENTER to start replaying...")
# Replay loop
log_say(f"Replaying episode {EPISODE_IDX}", blocking=True)
try:
for idx in range(len(episode_frames)):
loop_start = time.perf_counter()
# Extract action array from dataset
action_array = actions[idx][ACTION]
# Build action dictionary, but only include position actions
action = {}
for i, name in enumerate(action_names):
# Only include position actions (ending with .pos)
if name.endswith(".pos"):
action[name] = float(action_array[i])
# Send action to robot
robot.send_action(action)
# Maintain replay rate (use dataset fps)
loop_duration = time.perf_counter() - loop_start
dt_s = 1.0 / dataset.fps - loop_duration
busy_wait(dt_s)
# Progress indicator every 100 frames
if (idx + 1) % 100 == 0:
progress = (idx + 1) / len(episode_frames) * 100
print(f"Progress: {idx + 1}/{len(episode_frames)} frames ({progress:.1f}%)")
print(f"\n✓ Successfully replayed {len(episode_frames)} frames")
log_say("Replay complete", blocking=True)
except KeyboardInterrupt:
print("\n\nReplay interrupted by user")
finally:
# Disconnect robot
print("\nDisconnecting robot...")
robot.disconnect()
print("✓ Replay complete!")
if __name__ == "__main__":
main()
+73
View File
@@ -0,0 +1,73 @@
#!/bin/bash
# Setup all OpenArms CAN interfaces with CAN FD
set -e
echo "=========================================="
echo "OpenArms CAN FD Interface Setup"
echo "=========================================="
echo ""
echo "Mode: CAN FD"
echo " - Nominal bitrate: 1 Mbps"
echo " - Data bitrate: 5 Mbps"
echo ""
echo "Configuring interfaces can0, can1, can2, can3..."
echo ""
# Configure each CAN interface with CAN FD
for i in 0 1 2 3; do
interface="can$i"
# Check if interface exists
if ! ip link show "$interface" &> /dev/null; then
echo "$interface: Not found, skipping"
continue
fi
# Bring down interface
sudo ip link set "$interface" down 2>/dev/null
# Configure CAN FD mode
sudo ip link set "$interface" type can \
bitrate 1000000 \
dbitrate 5000000 \
fd on
# Bring up interface
sudo ip link set "$interface" up
# Verify configuration
if ip link show "$interface" | grep -q "UP"; then
echo "$interface: Configured and UP"
else
echo "$interface: Failed to bring UP"
fi
done
echo ""
echo "=========================================="
echo "Verification"
echo "=========================================="
echo ""
# Show detailed status for each interface
for i in 0 1 2 3; do
interface="can$i"
if ip link show "$interface" &> /dev/null; then
echo "$interface:"
# Show key parameters
ip -d link show "$interface" | grep -E "can|state|bitrate|dbitrate" | head -3
echo ""
fi
done
echo "=========================================="
echo "Setup Complete!"
echo "=========================================="
echo ""
echo "All interfaces configured for CAN FD mode"
echo ""
echo "Next steps:"
echo " 1. Test motors: python debug_can_communication.py"
echo " 2. Run teleoperation: python examples/openarms/teleop.py"
echo ""
+148
View File
@@ -0,0 +1,148 @@
"""
OpenArms Teleoperation Example - Full Dual Arms
This script demonstrates teleoperation of OpenArms follower robot using an OpenArms leader arm.
It first calibrates both devices, then enters a teleoperation loop for both arms.
"""
import time
from lerobot.robots.openarms.openarms_follower import OpenArmsFollower
from lerobot.robots.openarms.config_openarms_follower import OpenArmsFollowerConfig
from lerobot.teleoperators.openarms.openarms_leader import OpenArmsLeader
from lerobot.teleoperators.openarms.config_openarms_leader import OpenArmsLeaderConfig
follower_config = OpenArmsFollowerConfig(
port_left="can2", # CAN interface for follower left arm
port_right="can3", # CAN interface for follower right arm
can_interface="socketcan", # Linux SocketCAN
id="openarms_follower",
disable_torque_on_disconnect=True,
max_relative_target=5.0, # Safety limit
)
leader_config = OpenArmsLeaderConfig(
port_left="can0", # CAN interface for leader left arm
port_right="can1", # CAN interface for leader right arm
can_interface="socketcan", # Linux SocketCAN
id="openarms_leader",
manual_control=True, # Enable manual control (torque disabled)
)
print("=" * 60)
print("OpenArms Teleoperation - Full Dual Arms")
print("=" * 60)
# Initialize devices
print("\n[1/4] Initializing devices...")
follower = OpenArmsFollower(follower_config)
leader = OpenArmsLeader(leader_config)
# Connect and calibrate follower
print("\n[2/4] Connecting and calibrating follower robot...")
print("Note: If you have existing calibration, just press ENTER to use it.")
follower.connect(calibrate=True)
# Connect and calibrate leader
print("\n[3/4] Connecting and calibrating leader arm...")
print("Note: The leader arm will have torque disabled for manual control.")
leader.connect(calibrate=True)
# Wait for user to be ready
print("\n[4/4] Ready for teleoperation!")
print("\nBoth arms will be controlled (16 motors total):")
print(" RIGHT ARM: joints 1-7 + gripper")
print(" LEFT ARM: joints 1-7 + gripper")
print("\nPress ENTER to start teleoperation...")
input()
print("\nTeleoperation started! Move both leader arms.")
print("Press Ctrl+C to stop.\n")
# All joints for both arms (16 motors total)
all_joints = [
# Right arm
"right_joint_1",
"right_joint_2",
"right_joint_3",
"right_joint_4",
"right_joint_5",
"right_joint_6",
"right_joint_7",
"right_gripper",
# Left arm
"left_joint_1",
"left_joint_2",
"left_joint_3",
"left_joint_4",
"left_joint_5",
"left_joint_6",
"left_joint_7",
"left_gripper",
]
# Performance monitoring
loop_times = []
start_time = time.perf_counter()
last_print_time = start_time
try:
while True:
loop_start = time.perf_counter()
# Get action from leader
leader_action = leader.get_action()
# Filter to only position data for all joints (both arms)
joint_action = {}
for joint in all_joints:
pos_key = f"{joint}.pos"
if pos_key in leader_action:
joint_action[pos_key] = leader_action[pos_key]
# Send action to follower (both arms)
if joint_action:
follower.send_action(joint_action)
# Measure loop time
loop_end = time.perf_counter()
loop_time = loop_end - loop_start
loop_times.append(loop_time)
# Print stats every 2 seconds
if loop_end - last_print_time >= 2.0:
if loop_times:
avg_time = sum(loop_times) / len(loop_times)
current_hz = 1.0 / avg_time if avg_time > 0 else 0
min_time = min(loop_times)
max_time = max(loop_times)
max_hz = 1.0 / min_time if min_time > 0 else 0
min_hz = 1.0 / max_time if max_time > 0 else 0
print(f"[Hz Stats] Avg: {current_hz:.1f} Hz | "
f"Range: {min_hz:.1f}-{max_hz:.1f} Hz | "
f"Avg loop time: {avg_time*1000:.1f} ms")
# Reset for next measurement window
loop_times = []
last_print_time = loop_end
except KeyboardInterrupt:
print("\n\nStopping teleoperation...")
finally:
# Disconnect devices
print("Disconnecting devices...")
try:
follower.disconnect()
except Exception as e:
print(f"Error disconnecting follower: {e}")
try:
leader.disconnect()
except Exception as e:
print(f"Error disconnecting leader: {e}")
print("Done!")
+197
View File
@@ -0,0 +1,197 @@
"""
OpenArms Mini Teleoperation Example
This script demonstrates teleoperation of an OpenArms follower robot using
an OpenArms Mini leader (Feetech-based) with dual arms (16 motors total).
The OpenArms Mini has:
- Right arm: 8 motors (joint_1 to joint_7 + gripper)
- Left arm: 8 motors (joint_1 to joint_7 + gripper)
Note on gripper normalization:
- OpenArms Mini gripper: 0-100 scale (0=closed, 100=open)
- OpenArms follower gripper: degrees (0=closed, -65=open)
- This script automatically converts between the two ranges
"""
import time
import os
import sys
from lerobot.robots.openarms.openarms_follower import OpenArmsFollower
from lerobot.robots.openarms.config_openarms_follower import OpenArmsFollowerConfig
from lerobot.teleoperators.openarms_mini.openarms_mini import OpenArmsMini
from lerobot.teleoperators.openarms_mini.config_openarms_mini import OpenArmsMiniConfig
from lerobot.utils.robot_utils import busy_wait
# Target control frequency
TARGET_FPS = 30
# Configure the OpenArms follower (Damiao motors on CAN bus)
follower_config = OpenArmsFollowerConfig(
port_left="can0", # CAN interface for follower left arm
port_right="can1", # CAN interface for follower right arm
can_interface="socketcan", # Linux SocketCAN
id="openarms_follower",
disable_torque_on_disconnect=True,
max_relative_target=10.0, # Safety limit (degrees per step)
)
# Configure the OpenArms Mini leader (Feetech motors on serial)
leader_config = OpenArmsMiniConfig(
port_right="/dev/ttyACM0", # Serial port for right arm
port_left="/dev/ttyACM1", # Serial port for left arm
id="openarms_mini",
use_degrees=True,
)
print("OpenArms Mini → OpenArms Follower Teleoperation")
# Initialize devices
follower = OpenArmsFollower(follower_config)
leader = OpenArmsMini(leader_config)
# Connect and calibrate follower
print("Note: If you have existing calibration, just press ENTER to use it.")
follower.connect(calibrate=True)
# Connect and calibrate leader
print("Note: The leader arms will have torque disabled for manual control.")
leader.connect(calibrate=True)
print("\nPress ENTER to start teleoperation...")
input()
print("Press Ctrl+C to stop.\n")
# All joints for both arms (16 motors total)
all_joints = [
# Right arm
"right_joint_1",
"right_joint_2",
"right_joint_3",
"right_joint_4",
"right_joint_5",
"right_joint_6",
"right_joint_7",
"right_gripper",
# Left arm
"left_joint_1",
"left_joint_2",
"left_joint_3",
"left_joint_4",
"left_joint_5",
"left_joint_6",
"left_joint_7",
"left_gripper",
]
# Performance monitoring
loop_times = []
avg_loop_time = 0.0
min_loop_time = float('inf')
max_loop_time = 0.0
stats_update_interval = 1.0 # Update stats every 1 second
last_stats_update = time.perf_counter()
SWAPPED_JOINTS = {
"right_joint_6": "right_joint_7",
"right_joint_7": "right_joint_6",
"left_joint_6": "left_joint_7",
"left_joint_7": "left_joint_6",
}
try:
while True:
loop_start = time.perf_counter()
# Get actions and observations
leader_action = leader.get_action()
follower_obs = follower.get_observation()
joint_action = {}
for joint in all_joints:
leader_key = f"{joint}.pos"
# Determine which follower joint this leader joint controls
follower_joint = SWAPPED_JOINTS.get(joint, joint)
follower_key = f"{follower_joint}.pos"
# Get leader position (default 0 if missing)
pos = leader_action.get(leader_key, 0.0)
# Convert gripper values: Mini uses 0-100, OpenArms uses 0 to -65 degrees
if "gripper" in joint:
# Map 0-100 (Mini) to 0 to -65 (OpenArms)
# 0 (closed) -> 0°, 100 (open) -> -65°
pos = (pos / 100.0) * -65.0
# Store in action dict for follower
joint_action[follower_key] = pos
follower.send_action(joint_action)
# Loop timing
loop_end = time.perf_counter()
loop_time = loop_end - loop_start
loop_times.append(loop_time)
# Update stats periodically
current_time = time.perf_counter()
if current_time - last_stats_update >= stats_update_interval:
if loop_times:
avg_loop_time = sum(loop_times) / len(loop_times)
min_loop_time = min(loop_times)
max_loop_time = max(loop_times)
loop_times = []
last_stats_update = current_time
# Display everything
sys.stdout.write("\033[H\033[J") # Clear screen
# Show timing stats at the top
if avg_loop_time > 0:
avg_hz = 1.0 / avg_loop_time
min_hz = 1.0 / max_loop_time if max_loop_time > 0 else 0
max_hz = 1.0 / min_loop_time if min_loop_time > 0 and min_loop_time < float('inf') else 0
print(f"[Performance] Target: {TARGET_FPS} Hz | Avg: {avg_hz:.1f} Hz | Range: {min_hz:.1f}-{max_hz:.1f} Hz | Loop: {avg_loop_time*1000:.1f} ms\n")
else:
print(f"[Performance] Target: {TARGET_FPS} Hz | Measuring...\n")
# Show joint positions
print(f"{'Joint':<20} {'Leader':>15} {'Follower':>15}")
print(f"{'':20} {'(0-100/deg)':>15} {'(deg)':>15}")
print("-" * 52)
for joint in all_joints:
leader_key = f"{joint}.pos"
follower_joint = SWAPPED_JOINTS.get(joint, joint)
follower_key = f"{follower_joint}.pos"
leader_pos = leader_action.get(leader_key, 0.0)
follower_pos = follower_obs.get(follower_key, 0.0)
print(f"{joint:<20} {leader_pos:>15.2f} {follower_pos:>15.2f}")
# Smart sleep to maintain target FPS
dt_s = time.perf_counter() - loop_start
busy_wait(max(0, 1.0 / TARGET_FPS - dt_s))
except KeyboardInterrupt:
print("\n\nStopping teleoperation...")
finally:
# Disconnect devices
print("Disconnecting devices...")
try:
follower.disconnect()
except Exception as e:
print(f"Error disconnecting follower: {e}")
try:
leader.disconnect()
except Exception as e:
print(f"Error disconnecting leader: {e}")
print("Done!")
+202
View File
@@ -0,0 +1,202 @@
"""
OpenArms Teleoperation with Gravity + Friction Compensation
Leader arms (both LEFT and RIGHT): Gravity + Friction compensation (weightless, easy to move)
Follower arms (both LEFT and RIGHT): Mirror leader movements
Uses the URDF file from the lerobot repository.
"""
import time
import numpy as np
from lerobot.robots.openarms.config_openarms_follower import OpenArmsFollowerConfig
from lerobot.robots.openarms.openarms_follower import OpenArmsFollower
from lerobot.teleoperators.openarms.config_openarms_leader import OpenArmsLeaderConfig
from lerobot.teleoperators.openarms.openarms_leader import OpenArmsLeader
# Friction compensation scale factor (1.0 = full, 0.3 = 30% for stability)
FRICTION_SCALE = 1.0
def main():
"""Main teleoperation loop with gravity compensation"""
print("=" * 70)
print("OpenArms Teleoperation with Gravity Compensation")
print("=" * 70)
# Configuration
follower_config = OpenArmsFollowerConfig(
port_left="can2",
port_right="can3",
can_interface="socketcan",
id="openarms_follower",
disable_torque_on_disconnect=True,
max_relative_target=10.0,
)
leader_config = OpenArmsLeaderConfig(
port_left="can0",
port_right="can1",
can_interface="socketcan",
id="openarms_leader",
manual_control=False, # Enable torque control for gravity compensation
)
# Initialize and connect
print("\nInitializing devices...")
follower = OpenArmsFollower(follower_config)
leader = OpenArmsLeader(leader_config)
follower.connect()
leader.connect()
# URDF is automatically loaded in the leader constructor
if leader.pin_robot is None:
raise RuntimeError("URDF model not loaded on leader. Gravity compensation not available.")
print("\nLeader BOTH arms: Gravity + Friction comp | Follower BOTH arms: Teleop")
print("Press ENTER to start...")
input()
# Enable motors on both leader arms for gravity compensation
leader.bus_right.enable_torque()
leader.bus_left.enable_torque()
time.sleep(0.1)
print("Press Ctrl+C to stop\n")
# Main control loop
loop_times = []
last_print_time = time.perf_counter()
# All joints (both arms)
all_joints = []
for motor in leader.bus_right.motors:
all_joints.append(f"right_{motor}")
for motor in leader.bus_left.motors:
all_joints.append(f"left_{motor}")
try:
while True:
loop_start = time.perf_counter()
# Get leader state
leader_action = leader.get_action()
# Extract positions and velocities in degrees
leader_positions_deg = {}
leader_velocities_deg_per_sec = {}
for motor in leader.bus_right.motors:
pos_key = f"right_{motor}.pos"
vel_key = f"right_{motor}.vel"
if pos_key in leader_action:
leader_positions_deg[f"right_{motor}"] = leader_action[pos_key]
if vel_key in leader_action:
leader_velocities_deg_per_sec[f"right_{motor}"] = leader_action[vel_key]
for motor in leader.bus_left.motors:
pos_key = f"left_{motor}.pos"
vel_key = f"left_{motor}.vel"
if pos_key in leader_action:
leader_positions_deg[f"left_{motor}"] = leader_action[pos_key]
if vel_key in leader_action:
leader_velocities_deg_per_sec[f"left_{motor}"] = leader_action[vel_key]
# Calculate gravity torques for leader using built-in method
leader_positions_rad = {k: np.deg2rad(v) for k, v in leader_positions_deg.items()}
leader_gravity_torques_nm = leader._gravity_from_q(leader_positions_rad)
# Calculate friction torques for leader using built-in method
leader_velocities_rad_per_sec = {k: np.deg2rad(v) for k, v in leader_velocities_deg_per_sec.items()}
leader_friction_torques_nm = leader._friction_from_velocity(
leader_velocities_rad_per_sec,
friction_scale=FRICTION_SCALE
)
# Combine gravity + friction torques
leader_total_torques_nm = {}
for motor_name in leader_gravity_torques_nm:
gravity = leader_gravity_torques_nm.get(motor_name, 0.0)
friction = leader_friction_torques_nm.get(motor_name, 0.0)
leader_total_torques_nm[motor_name] = gravity + friction
# Apply gravity + friction compensation to leader RIGHT arm (all joints including gripper)
for motor in leader.bus_right.motors:
full_name = f"right_{motor}"
position = leader_positions_deg.get(full_name, 0.0)
torque = leader_total_torques_nm.get(full_name, 0.0)
# Get damping gain for stability
kd = leader.get_damping_kd(motor)
leader.bus_right._mit_control(
motor=motor,
kp=0.0,
kd=kd, # Add damping for stability
position_degrees=position,
velocity_deg_per_sec=0.0,
torque=torque,
)
# Apply gravity + friction compensation to leader LEFT arm (all joints including gripper)
for motor in leader.bus_left.motors:
full_name = f"left_{motor}"
position = leader_positions_deg.get(full_name, 0.0)
torque = leader_total_torques_nm.get(full_name, 0.0)
# Get damping gain for stability
kd = leader.get_damping_kd(motor)
leader.bus_left._mit_control(
motor=motor,
kp=0.0,
kd=kd, # Add damping for stability
position_degrees=position,
velocity_deg_per_sec=0.0,
torque=torque,
)
# Send leader positions to follower (both arms)
follower_action = {}
for joint in all_joints:
pos_key = f"{joint}.pos"
if pos_key in leader_action:
follower_action[pos_key] = leader_action[pos_key]
if follower_action:
follower.send_action(follower_action)
# Performance monitoring
loop_end = time.perf_counter()
loop_time = loop_end - loop_start
loop_times.append(loop_time)
if loop_end - last_print_time >= 2.0:
if loop_times:
avg_time = sum(loop_times) / len(loop_times)
current_hz = 1.0 / avg_time if avg_time > 0 else 0
print(f"{current_hz:.1f} Hz ({avg_time*1000:.1f} ms)")
loop_times = []
last_print_time = loop_end
except KeyboardInterrupt:
print("\n\nStopping...")
finally:
try:
leader.bus_right.disable_torque()
leader.bus_left.disable_torque()
time.sleep(0.1)
leader.disconnect()
follower.disconnect()
print("✓ Shutdown complete")
except Exception as e:
print(f"Shutdown error: {e}")
if __name__ == "__main__":
main()
+745
View File
@@ -0,0 +1,745 @@
body {
margin: 0;
padding: 0;
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, sans-serif;
background: #f5f5f5;
}
main {
min-height: 100vh;
padding: 2rem;
}
header {
text-align: center;
margin-bottom: 2rem;
}
h1 {
font-size: 2rem;
font-weight: 600;
color: #333;
margin: 0;
}
h2 {
font-size: 1.25rem;
font-weight: 600;
color: #333;
margin: 0 0 1rem 0;
}
h3 {
font-size: 0.875rem;
font-weight: 600;
color: #666;
margin: 0 0 0.5rem 0;
text-transform: uppercase;
letter-spacing: 0.5px;
}
.container {
max-width: 1920px;
margin: 0 auto;
display: grid;
grid-template-columns: minmax(500px, 600px) 1fr;
gap: 2rem;
align-items: start;
}
/* Left column container */
.left-column {
display: flex;
flex-direction: column;
gap: 1.5rem;
}
/* Right column container */
.right-column {
display: flex;
flex-direction: column;
gap: 1.5rem;
}
/* Responsive: Stack on smaller screens */
@media (max-width: 1200px) {
.container {
grid-template-columns: 1fr;
}
}
.panel {
background: white;
border-radius: 8px;
padding: 1.5rem;
box-shadow: 0 1px 3px rgba(0,0,0,0.1);
}
.config-panel {
border: 2px solid #e5e7eb;
}
.config-header {
display: flex;
justify-content: space-between;
align-items: center;
cursor: pointer;
user-select: none;
padding: 0.5rem 0;
}
.config-header:hover {
opacity: 0.7;
}
.toggle-icon {
font-size: 1rem;
color: #6b7280;
transition: transform 0.2s;
}
.config-content {
margin-top: 1rem;
padding-top: 1rem;
border-top: 1px solid #e5e7eb;
}
.robot-setup {
margin-bottom: 0.5rem;
}
.robot-status {
display: flex;
align-items: center;
justify-content: space-between;
padding: 1rem;
border-radius: 6px;
font-weight: 500;
gap: 1rem;
}
.robot-status.ready {
background: linear-gradient(135deg, #d1fae5 0%, #a7f3d0 100%);
color: #065f46;
border: 1px solid #10b981;
}
.robot-status.not-ready {
background: linear-gradient(135deg, #fef3c7 0%, #fde68a 100%);
color: #92400e;
border: 1px solid #f59e0b;
}
.btn-setup {
background: #10b981;
color: white;
border: none;
padding: 0.5rem 1rem;
border-radius: 4px;
font-size: 0.875rem;
font-weight: 500;
cursor: pointer;
transition: background 0.2s;
}
.btn-setup:hover:not(:disabled) {
background: #059669;
}
.btn-setup:disabled {
background: #d1d5db;
cursor: not-allowed;
}
.btn-zero {
background: #8b5cf6;
color: white;
border: none;
padding: 0.5rem 1rem;
border-radius: 4px;
font-size: 0.875rem;
font-weight: 500;
cursor: pointer;
transition: background 0.2s;
}
.btn-zero:hover:not(:disabled) {
background: #7c3aed;
}
.btn-zero:disabled {
background: #d1d5db;
cursor: not-allowed;
}
.zero-position-section {
margin-top: 1rem;
padding-top: 1rem;
border-top: 1px solid #e5e7eb;
}
.btn-zero-large {
width: 100%;
background: #8b5cf6;
color: white;
border: none;
padding: 0.875rem 1.5rem;
border-radius: 8px;
font-size: 1rem;
font-weight: 600;
cursor: pointer;
transition: all 0.2s;
box-shadow: 0 2px 4px rgba(139, 92, 246, 0.2);
}
.btn-zero-large:hover:not(:disabled) {
background: #7c3aed;
box-shadow: 0 4px 8px rgba(139, 92, 246, 0.3);
transform: translateY(-1px);
}
.btn-zero-large:disabled {
background: #d1d5db;
cursor: not-allowed;
box-shadow: none;
transform: none;
}
.delete-episode-section {
margin-top: 1rem;
padding-top: 1rem;
border-top: 1px solid #e5e7eb;
}
.btn-delete {
width: 100%;
background: #ef4444;
color: white;
border: none;
padding: 0.875rem 1.5rem;
border-radius: 8px;
font-size: 1rem;
font-weight: 600;
cursor: pointer;
transition: all 0.2s;
box-shadow: 0 2px 4px rgba(239, 68, 68, 0.2);
}
.btn-delete:hover:not(:disabled) {
background: #dc2626;
box-shadow: 0 4px 8px rgba(239, 68, 68, 0.3);
transform: translateY(-1px);
}
.btn-delete:disabled {
background: #d1d5db;
cursor: not-allowed;
box-shadow: none;
transform: none;
}
.delete-info {
margin-top: 0.5rem;
font-size: 0.875rem;
color: #666;
text-align: center;
font-style: italic;
}
.btn-disconnect {
background: #ef4444;
color: white;
border: none;
padding: 0.5rem 1rem;
border-radius: 4px;
font-size: 0.875rem;
font-weight: 500;
cursor: pointer;
transition: background 0.2s;
}
.btn-disconnect:hover {
background: #dc2626;
}
.btn-refresh {
background: #3b82f6;
color: white;
border: none;
padding: 0.4rem 0.8rem;
border-radius: 4px;
font-size: 0.75rem;
font-weight: 500;
cursor: pointer;
transition: background 0.2s;
}
.btn-refresh:hover:not(:disabled) {
background: #2563eb;
}
.btn-refresh:disabled {
background: #d1d5db;
cursor: not-allowed;
}
.control-panel {
border: 2px solid #10b981;
}
.status-banner {
display: flex;
align-items: center;
gap: 1rem;
padding: 1rem 1.5rem;
border-radius: 6px;
margin-bottom: 1.5rem;
font-weight: 500;
font-size: 0.95rem;
}
.status-banner.initializing {
background: linear-gradient(135deg, #dbeafe 0%, #bfdbfe 100%);
color: #1e40af;
border-left: 4px solid #3b82f6;
}
.status-banner.encoding {
background: linear-gradient(135deg, #fef3c7 0%, #fde68a 100%);
color: #92400e;
border-left: 4px solid #f59e0b;
}
.status-banner.uploading {
background: linear-gradient(135deg, #e0e7ff 0%, #c7d2fe 100%);
color: #3730a3;
border-left: 4px solid #6366f1;
}
.status-banner.success {
background: linear-gradient(135deg, #d1fae5 0%, #a7f3d0 100%);
color: #065f46;
border-left: 4px solid #10b981;
}
.status-banner.warning {
background: linear-gradient(135deg, #fee2e2 0%, #fecaca 100%);
color: #991b1b;
border-left: 4px solid #ef4444;
}
.spinner {
width: 20px;
height: 20px;
border: 3px solid rgba(0, 0, 0, 0.1);
border-top-color: currentColor;
border-radius: 50%;
animation: spin 0.8s linear infinite;
}
@keyframes spin {
to { transform: rotate(360deg); }
}
.control-horizontal {
display: flex;
flex-direction: column;
gap: 1.5rem;
}
.control-left {
display: flex;
flex-direction: column;
gap: 1rem;
}
.control-right {
display: flex;
align-items: center;
justify-content: center;
}
.input-group {
display: flex;
gap: 0.5rem;
margin-bottom: 0;
}
input[type="text"] {
flex: 1;
padding: 0.75rem;
border: 1px solid #ddd;
border-radius: 4px;
font-size: 1rem;
}
input[type="text"]:disabled {
background: #f5f5f5;
cursor: not-allowed;
}
input[type="text"]:focus {
outline: none;
border-color: #10b981;
}
button {
padding: 0.75rem 1.5rem;
border: none;
border-radius: 4px;
font-size: 1rem;
font-weight: 500;
cursor: pointer;
transition: all 0.2s;
}
.btn-set-task {
background: #3b82f6;
color: white;
min-width: 120px;
}
.btn-set-task:hover:not(:disabled) {
background: #2563eb;
}
.btn-set-task:disabled {
background: #d1d5db;
cursor: not-allowed;
}
.btn-start {
background: #10b981;
color: white;
}
.btn-start:hover:not(:disabled) {
background: #059669;
}
.btn-start:disabled {
background: #d1d5db;
cursor: not-allowed;
}
.btn-stop {
background: #ef4444;
color: white;
}
.btn-stop:hover {
background: #dc2626;
}
.btn-reset {
padding: 0.5rem 1rem;
background: #6b7280;
color: white;
font-size: 0.875rem;
}
.btn-reset:hover {
background: #4b5563;
}
.status {
display: flex;
align-items: center;
gap: 0.75rem;
padding: 1rem;
border-radius: 4px;
margin-bottom: 1rem;
}
.status.recording {
background: #fee2e2;
color: #991b1b;
}
.status.recording.recording-active {
display: flex;
flex-direction: column;
gap: 1rem;
background: #dc2626;
color: white;
padding: 1.5rem;
border: 4px solid #991b1b;
box-shadow: 0 4px 12px rgba(220, 38, 38, 0.4);
font-weight: 700;
font-size: 1rem;
}
.status.recording.recording-active .indicator {
width: 20px;
height: 20px;
background: #fef2f2;
animation: pulse-strong 1s ease-in-out infinite;
}
@keyframes pulse-strong {
0%, 100% {
opacity: 1;
transform: scale(1);
}
50% {
opacity: 0.7;
transform: scale(1.1);
}
}
.status.recording.recording-active .time-display {
display: flex;
flex-direction: column;
gap: 0.5rem;
font-size: 1.5rem;
font-weight: 700;
color: white;
}
.fps-display {
font-size: 1rem;
font-weight: 500;
opacity: 0.95;
}
.fps-warning {
color: #fef2f2;
animation: pulse-warning 1s ease-in-out infinite;
}
@keyframes pulse-warning {
0%, 100% { opacity: 1; }
50% { opacity: 0.5; }
}
.status.recording.recording-active .btn-stop {
align-self: stretch;
}
.ramp-up-countdown {
display: flex;
justify-content: center;
margin-bottom: 1rem;
}
.countdown-box {
display: flex;
flex-direction: column;
align-items: center;
justify-content: center;
padding: 2rem 3rem;
background: linear-gradient(135deg, #fef3c7 0%, #fde68a 100%);
border: 4px solid #f59e0b;
border-radius: 16px;
box-shadow: 0 6px 20px rgba(245, 158, 11, 0.4);
min-width: 280px;
animation: pulse-warm 1.5s ease-in-out infinite;
}
@keyframes pulse-warm {
0%, 100% {
box-shadow: 0 6px 20px rgba(245, 158, 11, 0.4);
}
50% {
box-shadow: 0 6px 25px rgba(245, 158, 11, 0.6);
}
}
.countdown-label {
font-size: 1rem;
color: #92400e;
text-transform: uppercase;
letter-spacing: 1.5px;
font-weight: 800;
margin-bottom: 1rem;
text-align: center;
}
.countdown-value {
font-size: 4.5rem;
font-weight: 900;
color: #d97706;
font-family: 'Courier New', monospace;
line-height: 1;
text-shadow: 2px 2px 6px rgba(0, 0, 0, 0.15);
margin-bottom: 0.5rem;
}
.countdown-subtitle {
font-size: 0.875rem;
color: #78350f;
font-weight: 600;
font-style: italic;
text-align: center;
margin-top: 0.5rem;
}
.status.idle {
background: #f3f4f6;
color: #374151;
}
.indicator {
width: 12px;
height: 12px;
border-radius: 50%;
background: #ef4444;
animation: pulse 1.5s ease-in-out infinite;
}
@keyframes pulse {
0%, 100% { opacity: 1; }
50% { opacity: 0.5; }
}
.counter {
display: flex;
flex-direction: column;
align-items: center;
gap: 0.75rem;
padding: 1.5rem;
background: linear-gradient(135deg, #f9fafb 0%, #f3f4f6 100%);
border-radius: 8px;
border: 2px solid #e5e7eb;
min-width: 200px;
}
.counter-label {
font-size: 0.75rem;
color: #6b7280;
text-transform: uppercase;
letter-spacing: 0.5px;
font-weight: 600;
}
.counter-value {
font-size: 3rem;
font-weight: 700;
color: #10b981;
line-height: 1;
}
.time-display {
font-size: 1.5rem;
font-weight: 600;
font-family: 'Courier New', monospace;
}
.error-box {
padding: 1rem;
background: #fee2e2;
color: #991b1b;
border-radius: 4px;
border-left: 4px solid #ef4444;
font-size: 0.875rem;
}
.config-section {
margin-bottom: 1.5rem;
}
.config-section:last-child {
margin-bottom: 0;
}
.config-grid {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(200px, 1fr));
gap: 1rem;
}
label {
display: flex;
flex-direction: column;
gap: 0.5rem;
font-size: 0.875rem;
color: #374151;
font-weight: 500;
}
select {
padding: 0.5rem;
border: 1px solid #ddd;
border-radius: 4px;
font-size: 0.875rem;
background: white;
}
select:disabled {
background: #f5f5f5;
cursor: not-allowed;
}
/* Camera Layout */
.camera-layout {
display: flex;
flex-direction: column;
gap: 1.5rem;
}
.camera-base {
width: 100%;
}
.camera-wrist-container {
display: grid;
grid-template-columns: repeat(2, 1fr);
gap: 1.5rem;
}
.camera-wrist {
width: 100%;
}
.camera {
border: 1px solid #e5e7eb;
border-radius: 4px;
overflow: hidden;
}
.camera h3 {
padding: 0.75rem;
background: #f9fafb;
border-bottom: 1px solid #e5e7eb;
margin: 0;
}
.camera img {
width: 100%;
height: auto;
display: block;
background: #000;
min-height: 300px;
object-fit: cover;
}
.camera-placeholder {
text-align: center;
padding: 4rem 2rem;
background: #f9fafb;
border-radius: 4px;
border: 2px dashed #d1d5db;
}
.camera-placeholder p {
margin: 0.5rem 0;
font-size: 1rem;
color: #6b7280;
}
.camera-placeholder p:first-child {
font-size: 1.25rem;
font-weight: 500;
color: #374151;
}
.hint {
margin-top: 0.5rem;
font-size: 0.75rem;
color: #6b7280;
display: flex;
align-items: center;
gap: 0.5rem;
flex-wrap: wrap;
}
+857
View File
@@ -0,0 +1,857 @@
import { useState, useEffect, useCallback, useRef } from 'react';
import './App.css';
const API_BASE = 'http://localhost:8000/api';
function App() {
// State
const [task, setTask] = useState('');
const [isRecording, setIsRecording] = useState(false);
const [isInitializing, setIsInitializing] = useState(false);
const [isEncoding, setIsEncoding] = useState(false);
const [isUploading, setIsUploading] = useState(false);
const [robotsReady, setRobotsReady] = useState(false);
const [elapsedTime, setElapsedTime] = useState(0);
const [currentFps, setCurrentFps] = useState(0);
const [loopFps, setLoopFps] = useState(0);
const [episodeCount, setEpisodeCount] = useState(0);
const [error, setError] = useState(null);
const [statusMessage, setStatusMessage] = useState('Ready');
const [uploadStatus, setUploadStatus] = useState(null);
const [rampUpRemaining, setRampUpRemaining] = useState(0);
const [movingToZero, setMovingToZero] = useState(false);
const [configExpanded, setConfigExpanded] = useState(false);
const [latestRepoId, setLatestRepoId] = useState(null);
// Configuration
const [config, setConfig] = useState({
leader_type: 'openarms', // 'openarms' or 'openarms_mini'
leader_left: 'can0',
leader_right: 'can1',
follower_left: 'can2',
follower_right: 'can3',
left_wrist: '/dev/video0',
right_wrist: '/dev/video1',
base: '/dev/video4'
});
// Available options
const [availableCameras, setAvailableCameras] = useState([]);
const [availableUsbPorts, setAvailableUsbPorts] = useState([]);
const canInterfaces = ['can0', 'can1', 'can2', 'can3'];
const statusIntervalRef = useRef(null);
const hasInitializedRef = useRef(false);
const loadConfig = () => {
try {
const saved = localStorage.getItem('openarms_config');
if (saved) {
const loadedConfig = JSON.parse(saved);
setConfig(prev => ({ ...prev, ...loadedConfig }));
}
} catch (e) {
console.error('Load config error:', e);
}
};
const saveConfig = (newConfig) => {
try {
localStorage.setItem('openarms_config', JSON.stringify(newConfig || config));
} catch (e) {
console.error('Save config error:', e);
}
};
// Fetch status periodically
const fetchStatus = async () => {
try {
const response = await fetch(`${API_BASE}/status`);
const data = await response.json();
setIsRecording(data.is_recording);
setIsInitializing(data.is_initializing);
setIsEncoding(data.is_encoding);
setIsUploading(data.is_uploading);
setRobotsReady(data.robots_ready);
setElapsedTime(data.elapsed_time);
setCurrentFps(data.current_fps || 0);
setLoopFps(data.loop_fps || 0);
setEpisodeCount(data.episode_count);
setError(data.error);
setStatusMessage(data.status_message || 'Ready');
setUploadStatus(data.upload_status);
setRampUpRemaining(data.ramp_up_remaining || 0);
setMovingToZero(data.moving_to_zero || false);
// Track the latest repo_id from the backend
if (data.latest_repo_id) {
setLatestRepoId(data.latest_repo_id);
}
if (data.config) {
// Only merge server config if we don't have a saved config (first load)
if (!localStorage.getItem('openarms_config')) {
setConfig(prev => {
const merged = { ...data.config, ...prev };
localStorage.setItem('openarms_config', JSON.stringify(merged));
return merged;
});
}
}
} catch (e) {
console.error('Failed to fetch status:', e);
}
};
const setupRobots = async () => {
// Show warning to verify camera positions
const confirmed = window.confirm(
'⚠️ IMPORTANT: Before connecting robots, please verify:\n\n' +
'📹 Check that cameras are correctly positioned:\n' +
' • LEFT wrist camera is actually on the LEFT arm\n' +
' • RIGHT wrist camera is actually on the RIGHT arm\n' +
' • BASE camera is actually the BASE/overhead camera\n\n' +
'Incorrect camera positioning will result in invalid training data!\n\n' +
'Click OK to continue with robot setup, or Cancel to review configuration.'
);
if (!confirmed) {
return; // User cancelled, don't proceed
}
setError(null);
try {
const response = await fetch(`${API_BASE}/robots/setup`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(config)
});
if (!response.ok) {
const data = await response.json();
throw new Error(data.detail || 'Failed to setup robots');
}
await response.json();
saveConfig(config);
} catch (e) {
setError(`Robot setup failed: ${e.message}`);
}
};
// Disconnect robots
const disconnectRobots = async () => {
try {
await fetch(`${API_BASE}/robots/disconnect`, { method: 'POST' });
setRobotsReady(false);
} catch (e) {
console.error('Failed to disconnect robots:', e);
}
};
// Discover cameras
const discoverCameras = async () => {
try {
const response = await fetch(`${API_BASE}/cameras/discover`);
const data = await response.json();
const cameras = data.cameras || [];
setAvailableCameras(cameras);
// Get list of valid camera IDs
const validCameraIds = cameras.map(cam => String(cam.id));
// Auto-fix config if current values are invalid or not set
const updated = { ...config };
let changed = false;
// Auto-fix invalid camera config
if (!config.left_wrist || !validCameraIds.includes(config.left_wrist)) {
if (cameras.length >= 1) {
updated.left_wrist = String(cameras[0].id);
changed = true;
}
}
if (!config.right_wrist || !validCameraIds.includes(config.right_wrist)) {
if (cameras.length >= 2) {
updated.right_wrist = String(cameras[1].id);
changed = true;
}
}
if (!config.base || !validCameraIds.includes(config.base)) {
if (cameras.length >= 3) {
updated.base = String(cameras[2].id);
changed = true;
}
}
if (changed) {
setConfig(updated);
saveConfig(updated);
}
if (cameras.length === 0) {
setError('No cameras detected! Please connect cameras and refresh.');
}
} catch (e) {
console.error('Failed to discover cameras:', e);
setError(`Camera discovery failed: ${e.message}`);
}
};
// Discover USB ports
const discoverUsbPorts = async () => {
try {
const response = await fetch(`${API_BASE}/usb/discover`);
const data = await response.json();
const ports = data.ports || [];
setAvailableUsbPorts(ports);
// Auto-fix config if OpenArms Mini is selected and ports are invalid
if (config.leader_type === 'openarms_mini') {
const updated = { ...config };
let changed = false;
if (ports.length >= 1 && !ports.includes(config.leader_left)) {
updated.leader_left = ports[0];
changed = true;
}
if (ports.length >= 2 && !ports.includes(config.leader_right)) {
updated.leader_right = ports[1];
changed = true;
}
if (changed) {
setConfig(updated);
saveConfig(updated);
}
}
if (ports.length === 0) {
console.warn('No USB ports detected for OpenArms Mini');
}
} catch (e) {
console.error('Failed to discover USB ports:', e);
}
};
// Set task only (for pedal use)
const setTaskOnly = async () => {
if (!task.trim()) {
setError('Please enter a task description');
return;
}
setError(null);
try {
const response = await fetch(`${API_BASE}/recording/set-task`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ task, ...config })
});
if (!response.ok) {
const data = await response.json();
throw new Error(data.detail || 'Failed to set task');
}
const result = await response.json();
setStatusMessage(result.message || `Task set: ${task}`);
saveConfig(config);
// Clear success message after 3 seconds
setTimeout(() => {
if (!isRecording && !isInitializing) {
setStatusMessage('Ready');
}
}, 3000);
} catch (e) {
setError(e.message);
}
};
// Start recording
const startRecording = async () => {
if (!task.trim()) {
setError('Please enter a task description');
return;
}
setError(null);
try {
const response = await fetch(`${API_BASE}/recording/start`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ task, ...config })
});
if (!response.ok) {
const data = await response.json();
throw new Error(data.detail || 'Failed to start recording');
}
await response.json();
saveConfig(config);
} catch (e) {
setError(e.message);
}
};
// Stop recording
const stopRecording = async () => {
try {
const response = await fetch(`${API_BASE}/recording/stop`, {
method: 'POST'
});
if (!response.ok) {
const data = await response.json();
throw new Error(data.detail || 'Failed to stop recording');
}
const data = await response.json();
setError(null);
// Update latest repo_id after recording
if (data.dataset_name) {
setLatestRepoId(`lerobot-data-collection/${data.dataset_name}`);
}
} catch (e) {
setError(e.message);
}
};
const deleteLatestEpisode = async () => {
if (!latestRepoId) {
setError('No episode to delete');
return;
}
const confirmed = window.confirm(
`WARNING: This will permanently delete the repository:\n\n${latestRepoId}\n\nThis action cannot be undone. Continue?`
);
if (!confirmed) {
return;
}
try {
const response = await fetch(`${API_BASE}/recording/delete-latest`, { method: 'POST' });
if (!response.ok) {
const data = await response.json();
throw new Error(data.detail || 'Failed to delete episode');
}
const data = await response.json();
setLatestRepoId(null);
setEpisodeCount(Math.max(0, episodeCount - 1));
setStatusMessage(`Deleted: ${data.deleted_repo}`);
setTimeout(() => {
if (!isRecording && !isInitializing) {
setStatusMessage('Ready');
}
}, 3000);
} catch (e) {
setError(`Delete failed: ${e.message}`);
}
};
// Reset counter
const resetCounter = async () => {
try {
await fetch(`${API_BASE}/counter/reset`, { method: 'POST' });
setEpisodeCount(0);
} catch (e) {
console.error('Failed to reset counter:', e);
}
};
// Move robot to zero position
const moveToZero = async () => {
setError(null);
try {
const response = await fetch(`${API_BASE}/robots/move-to-zero`, { method: 'POST' });
if (!response.ok) {
const data = await response.json();
throw new Error(data.detail || 'Failed to move to zero position');
}
await response.json();
} catch (e) {
setError(`Move to zero failed: ${e.message}`);
}
};
// Format time as MM:SS
const formatTime = (seconds) => {
const mins = Math.floor(seconds / 60);
const secs = Math.floor(seconds % 60);
return `${mins.toString().padStart(2, '0')}:${secs.toString().padStart(2, '0')}`;
};
// Update config and save
const updateConfig = (key, value) => {
const updated = { ...config, [key]: value };
setConfig(updated);
saveConfig(updated);
};
// Initialize on mount only
useEffect(() => {
// Prevent double-initialization in development
if (hasInitializedRef.current) {
return;
}
hasInitializedRef.current = true;
loadConfig();
discoverCameras();
discoverUsbPorts();
fetchStatus();
statusIntervalRef.current = setInterval(fetchStatus, 1000);
return () => {
if (statusIntervalRef.current) {
clearInterval(statusIntervalRef.current);
}
};
// eslint-disable-next-line react-hooks/exhaustive-deps
}, []); // Run only once on mount
// Discover USB ports when leader type changes to Mini
useEffect(() => {
if (config.leader_type === 'openarms_mini') {
discoverUsbPorts();
}
// eslint-disable-next-line react-hooks/exhaustive-deps
}, [config.leader_type]);
return (
<main>
<header>
<h1>OpenArms Recording</h1>
</header>
<div className="container">
{/* Left Column: Configuration and Recording Control */}
<div className="left-column">
{/* Configuration Panel */}
<section className="panel config-panel">
<div
className="config-header"
onClick={() => setConfigExpanded(!configExpanded)}
role="button"
tabIndex={0}
onKeyDown={(e) => e.key === 'Enter' && setConfigExpanded(!configExpanded)}
>
<h2> Configuration</h2>
<span className="toggle-icon">{configExpanded ? '▼' : '▶'}</span>
</div>
{configExpanded && (
<div className="config-content">
{/* Robot Setup */}
<div className="config-section">
<h3>🤖 Robot Setup</h3>
<div className="robot-setup">
{robotsReady ? (
<div className="robot-status ready">
<span> Robots Ready - Recording will start instantly</span>
<button onClick={disconnectRobots} className="btn-disconnect">
Disconnect Robots
</button>
</div>
) : (
<div className="robot-status not-ready">
<span> Robots not initialized - Recording will take ~10 seconds</span>
<button
onClick={setupRobots}
disabled={isRecording || isInitializing}
className="btn-setup"
>
🚀 Setup Robots
</button>
</div>
)}
</div>
</div>
{/* Leader Type Selection */}
<div className="config-section">
<h3>🎮 Leader Type</h3>
<div className="config-grid">
<label style={{gridColumn: '1 / -1'}}>
Leader Arm Type
<select
value={config.leader_type}
onChange={(e) => updateConfig('leader_type', e.target.value)}
disabled={isRecording || robotsReady}
>
<option value="openarms">OpenArms (CAN Bus - Damiao Motors)</option>
<option value="openarms_mini">OpenArms Mini (USB - Feetech Motors)</option>
</select>
</label>
</div>
</div>
{/* Leader Interfaces (CAN or USB based on type) */}
<div className="config-section">
<div style={{ display: 'flex', justifyContent: 'space-between', alignItems: 'center', marginBottom: '0.5rem' }}>
<h3>
{config.leader_type === 'openarms_mini'
? `Leader Ports (USB/Serial) ${availableUsbPorts.length > 0 ? `(${availableUsbPorts.length} detected)` : ''}`
: 'Leader Interfaces (CAN)'}
</h3>
{config.leader_type === 'openarms_mini' && (
<button
onClick={discoverUsbPorts}
className="btn-refresh"
disabled={isRecording || robotsReady}
>
🔄 Refresh
</button>
)}
</div>
<div className="config-grid">
<label>
Leader Left
<select
value={config.leader_left}
onChange={(e) => updateConfig('leader_left', e.target.value)}
disabled={isRecording || robotsReady}
>
{config.leader_type === 'openarms_mini' ? (
availableUsbPorts.length > 0 ? (
availableUsbPorts.map((port) => (
<option key={port} value={port}>{port}</option>
))
) : (
<option value="">No USB ports detected</option>
)
) : (
canInterfaces.map((iface) => (
<option key={iface} value={iface}>{iface}</option>
))
)}
</select>
</label>
<label>
Leader Right
<select
value={config.leader_right}
onChange={(e) => updateConfig('leader_right', e.target.value)}
disabled={isRecording || robotsReady}
>
{config.leader_type === 'openarms_mini' ? (
availableUsbPorts.length > 0 ? (
availableUsbPorts.map((port) => (
<option key={port} value={port}>{port}</option>
))
) : (
<option value="">No USB ports detected</option>
)
) : (
canInterfaces.map((iface) => (
<option key={iface} value={iface}>{iface}</option>
))
)}
</select>
</label>
</div>
</div>
{/* Follower CAN Interfaces */}
<div className="config-section">
<h3>Follower Interfaces (CAN)</h3>
<div className="config-grid">
<label>
Follower Left
<select
value={config.follower_left}
onChange={(e) => updateConfig('follower_left', e.target.value)}
disabled={isRecording || robotsReady}
>
{canInterfaces.map((iface) => (
<option key={iface} value={iface}>{iface}</option>
))}
</select>
</label>
<label>
Follower Right
<select
value={config.follower_right}
onChange={(e) => updateConfig('follower_right', e.target.value)}
disabled={isRecording || robotsReady}
>
{canInterfaces.map((iface) => (
<option key={iface} value={iface}>{iface}</option>
))}
</select>
</label>
</div>
</div>
{/* Camera Configuration */}
<div className="config-section">
<div style={{ display: 'flex', justifyContent: 'space-between', alignItems: 'center', marginBottom: '0.5rem' }}>
<h3>Cameras {availableCameras.length > 0 && `(${availableCameras.length} detected)`}</h3>
<button
onClick={discoverCameras}
className="btn-refresh"
disabled={isRecording || robotsReady}
>
🔄 Refresh
</button>
</div>
<div className="config-grid">
<label>
Left Wrist
<select
value={config.left_wrist}
onChange={(e) => updateConfig('left_wrist', e.target.value)}
disabled={isRecording || robotsReady}
>
{availableCameras.map((cam) => (
<option key={cam.id} value={String(cam.id)}>
{cam.name || `Camera @ ${cam.id}`}
</option>
))}
</select>
</label>
<label>
Right Wrist
<select
value={config.right_wrist}
onChange={(e) => updateConfig('right_wrist', e.target.value)}
disabled={isRecording || robotsReady}
>
{availableCameras.map((cam) => (
<option key={cam.id} value={String(cam.id)}>
{cam.name || `Camera @ ${cam.id}`}
</option>
))}
</select>
</label>
<label>
Base Camera
<select
value={config.base}
onChange={(e) => updateConfig('base', e.target.value)}
disabled={isRecording || robotsReady}
>
{availableCameras.map((cam) => (
<option key={cam.id} value={String(cam.id)}>
{cam.name || `Camera @ ${cam.id}`}
</option>
))}
</select>
</label>
</div>
</div>
</div>
)}
</section>
{/* Control Panel */}
<section className="panel control-panel">
<h2>🎬 Recording Control</h2>
{/* Status Banner - Always show important statuses */}
{isInitializing && (
<div className="status-banner initializing">
<div className="spinner"></div>
<span>{statusMessage}</span>
</div>
)}
{isEncoding && (
<div className="status-banner encoding">
<div className="spinner"></div>
<span>📹 {statusMessage}</span>
</div>
)}
{isUploading && (
<div className="status-banner uploading">
<div className="spinner"></div>
<span> {statusMessage}</span>
</div>
)}
{uploadStatus && !isRecording && !isEncoding && !isUploading && (
<div className={`status-banner ${uploadStatus.startsWith('✓') ? 'success' : 'warning'}`}>
<span>{uploadStatus}</span>
</div>
)}
<div className="control-horizontal">
{/* Task Input and Status */}
<div className="control-left">
<div className="input-group">
<input
type="text"
value={task}
onChange={(e) => setTask(e.target.value)}
placeholder="Task description (e.g., 'pick and place')"
disabled={isRecording || isInitializing || isEncoding || isUploading}
onKeyPress={(e) => {
if (e.key === 'Enter' && robotsReady) {
setTaskOnly();
}
}}
/>
<button
onClick={setTaskOnly}
disabled={isRecording || isInitializing || isEncoding || isUploading || !robotsReady}
className="btn-set-task"
title={!robotsReady ? 'Please setup robots first' : 'Store task for pedal use (Enter key)'}
>
💾 Set Task
</button>
<button
onClick={startRecording}
disabled={isRecording || isInitializing || isEncoding || isUploading || !robotsReady}
className="btn-start"
title={!robotsReady ? 'Please setup robots first' : ''}
>
{isInitializing
? '⏳ Initializing...'
: isRecording
? '⏺ Recording...'
: robotsReady
? '⏺ Start Recording'
: '⏺ Setup Robots First'}
</button>
</div>
{/* Ramp-up Countdown */}
{isRecording && rampUpRemaining > 0 && (
<div className="ramp-up-countdown">
<div className="countdown-box">
<div className="countdown-label"> WARMING UP - PID RAMP-UP</div>
<div className="countdown-value">{rampUpRemaining.toFixed(1)}s</div>
<div className="countdown-subtitle">Recording will start automatically...</div>
</div>
</div>
)}
{/* Recording Status - Only show after ramp-up */}
{isRecording && rampUpRemaining <= 0 && (
<div className="status recording recording-active">
<div className="indicator"></div>
<div className="time-display">
<span>{formatTime(elapsedTime)}</span>
<span className="fps-display">
Loop: {loopFps.toFixed(1)} Hz
{loopFps > 0 && loopFps < 29 && <span className="fps-warning"> </span>}
</span>
<span className="fps-display">Recording: {currentFps.toFixed(1)} FPS</span>
</div>
<button onClick={stopRecording} className="btn-stop">
Stop
</button>
</div>
)}
</div>
{/* Episode Counter */}
<div className="control-right">
<div className="counter">
<div className="counter-label">Episodes Recorded</div>
<div className="counter-value">{episodeCount}</div>
<button onClick={resetCounter} className="btn-reset">
Reset
</button>
</div>
</div>
</div>
{/* Delete Latest Episode Button */}
{!isRecording && !isInitializing && latestRepoId && (
<div className="delete-episode-section">
<button
onClick={deleteLatestEpisode}
className="btn-delete"
title="Delete the latest recorded episode from HuggingFace Hub"
>
Delete Latest Episode
</button>
<div className="delete-info">Will delete: {latestRepoId}</div>
</div>
)}
{/* Move to Zero Button */}
{robotsReady && !isRecording && !isInitializing && (
<div className="zero-position-section">
<button
onClick={moveToZero}
disabled={movingToZero}
className="btn-zero-large"
title="Move both leader and follower robots to zero position (2s)"
>
{movingToZero ? '⏳ Moving to Zero Position...' : '🎯 Move to Zero Position (Leader + Follower)'}
</button>
</div>
)}
{/* Error Display */}
{error && (
<div className="error-box">
{error}
</div>
)}
</section>
</div>
{/* Right Column: Camera Feeds */}
<div className="right-column">
<section className="panel cameras">
<h2>📹 Camera Views</h2>
{robotsReady || isRecording || isInitializing ? (
<div className="camera-layout">
{/* Base camera - full width */}
<div className="camera camera-base">
<h3>Base Camera</h3>
<img src={`${API_BASE}/camera/stream/base`} alt="Base Camera" />
</div>
{/* Wrist cameras - side by side */}
<div className="camera-wrist-container">
<div className="camera camera-wrist">
<h3>Left Wrist</h3>
<img src={`${API_BASE}/camera/stream/left_wrist`} alt="Left Wrist Camera" />
</div>
<div className="camera camera-wrist">
<h3>Right Wrist</h3>
<img src={`${API_BASE}/camera/stream/right_wrist`} alt="Right Wrist Camera" />
</div>
</div>
</div>
) : (
<div className="camera-placeholder">
<p>📷 Camera feeds will appear when robots are set up</p>
<p className="hint">Click "Setup Robots" above to preview camera feeds</p>
</div>
)}
</section>
</div>
</div>
</main>
);
}
export default App;
+41
View File
@@ -0,0 +1,41 @@
# OpenArms Web Recording Interface
A web interface for recording OpenArms datasets.
## Installation
```bash
cd examples/openarms_web_interface
npm install
```
## Usage
**Start everything with one command:**
```bash
./launch.sh
```
This will:
- Start the FastAPI backend on port 8000
- Start the React frontend on port 5173
- Show live logs from both services
Then open your browser to: **http://localhost:5173**
**Stop with:** `Ctrl+C`
---
## Workflow
1. **Configure CAN interfaces** and **camera paths** in the dropdowns
2. Click **"Setup Robots"** to initialize (once at start)
3. Enter a **task description**
4. Click **"Start Recording"** to begin an episode
5. Click **"Stop Recording"** when done
6. Dataset is automatically encoded and uploaded to HuggingFace Hub as **private**
7. Repeat steps 3-6 for more episodes (no need to re-setup robots!)
---
@@ -0,0 +1,12 @@
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>OpenArms Recording Interface</title>
</head>
<body>
<div id="root"></div>
<script type="module" src="/main.jsx"></script>
</body>
</html>
+142
View File
@@ -0,0 +1,142 @@
#!/bin/bash
# OpenArms Web Interface Launcher
# Starts Rerun viewer, FastAPI backend, and React frontend
set -e
# Colors for output
GREEN='\033[0;32m'
BLUE='\033[0;34m'
YELLOW='\033[1;33m'
RED='\033[0;31m'
NC='\033[0m' # No Color
# Get script directory
SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
cd "$SCRIPT_DIR"
echo -e "${BLUE}╔════════════════════════════════════════╗${NC}"
echo -e "${BLUE}║ OpenArms Web Recording Interface ║${NC}"
echo -e "${BLUE}╚════════════════════════════════════════╝${NC}"
echo ""
# Function to cleanup on exit
cleanup() {
echo ""
echo -e "${YELLOW}Shutting down services...${NC}"
# Kill all child processes
pkill -P $$ 2>/dev/null || true
# Kill specific services by port
lsof -ti:8000 | xargs kill -9 2>/dev/null || true # Backend
lsof -ti:5173 | xargs kill -9 2>/dev/null || true # Frontend
lsof -ti:9876 | xargs kill -9 2>/dev/null || true # Rerun (if spawned)
echo -e "${GREEN}✓ Services stopped${NC}"
exit 0
}
# Register cleanup on script exit
trap cleanup EXIT INT TERM
# Check if required commands exist
command -v rerun >/dev/null 2>&1 || {
echo -e "${RED}✗ Error: 'rerun' not found. Please install: pip install rerun-sdk${NC}"
exit 1
}
command -v python >/dev/null 2>&1 || {
echo -e "${RED}✗ Error: 'python' not found${NC}"
exit 1
}
command -v npm >/dev/null 2>&1 || {
echo -e "${RED}✗ Error: 'npm' not found${NC}"
exit 1
}
# Check if node_modules exists
if [ ! -d "node_modules" ]; then
echo -e "${YELLOW}⚠ node_modules not found. Running npm install...${NC}"
npm install
echo -e "${GREEN}✓ Dependencies installed${NC}"
echo ""
fi
echo -e "${GREEN}Starting services...${NC}"
echo ""
# 1. Start FastAPI backend (Rerun will start when recording begins)
echo -e "${BLUE}[1/2]${NC} Starting FastAPI backend on port 8000..."
cd "$SCRIPT_DIR"
# Use Python from current environment (if lerobot env is active, it will use that)
# Otherwise, check if we need to use conda run
if [[ "$CONDA_DEFAULT_ENV" == "lerobot" ]]; then
# Already in lerobot environment
echo -e "${GREEN}✓ Using active lerobot environment${NC}"
PYTHON_CMD="python"
elif command -v conda >/dev/null 2>&1 && conda env list | grep -q "^lerobot "; then
# lerobot env exists but not active - use conda run
echo -e "${YELLOW}Using conda run with lerobot environment...${NC}"
PYTHON_CMD="conda run -n lerobot --no-capture-output python"
else
# Fall back to system python
echo -e "${YELLOW}⚠ Warning: lerobot environment not found, using system python${NC}"
PYTHON_CMD="python"
fi
$PYTHON_CMD web_record_server.py > /tmp/openarms_backend.log 2>&1 &
BACKEND_PID=$!
sleep 3
if ps -p $BACKEND_PID > /dev/null; then
echo -e "${GREEN}✓ Backend started${NC} (PID: $BACKEND_PID)"
echo -e " URL: ${BLUE}http://localhost:8000${NC}"
else
echo -e "${RED}✗ Failed to start backend${NC}"
echo -e "${YELLOW}Check logs: tail -f /tmp/openarms_backend.log${NC}"
exit 1
fi
echo ""
# 2. Start React frontend
echo -e "${BLUE}[2/2]${NC} Starting React frontend on port 5173..."
cd "$SCRIPT_DIR"
npm run dev > /tmp/openarms_frontend.log 2>&1 &
FRONTEND_PID=$!
sleep 3
if ps -p $FRONTEND_PID > /dev/null; then
echo -e "${GREEN}✓ Frontend started${NC} (PID: $FRONTEND_PID)"
echo -e " URL: ${BLUE}http://localhost:5173${NC}"
else
echo -e "${RED}✗ Failed to start frontend${NC}"
echo -e "${YELLOW}Check logs: tail -f /tmp/openarms_frontend.log${NC}"
exit 1
fi
echo ""
# Display status
echo -e "${GREEN}╔════════════════════════════════════════╗${NC}"
echo -e "${GREEN}║ All services running! 🚀 ║${NC}"
echo -e "${GREEN}╚════════════════════════════════════════╝${NC}"
echo ""
echo -e "🔧 ${BLUE}Backend:${NC} http://localhost:8000"
echo -e "🌐 ${BLUE}Frontend:${NC} http://localhost:5173"
echo -e "📊 ${BLUE}Rerun:${NC} Will spawn automatically when recording starts"
echo ""
echo -e "${YELLOW}Open your browser to:${NC} ${BLUE}http://localhost:5173${NC}"
echo ""
echo -e "${YELLOW}Logs:${NC}"
echo -e " • Backend: tail -f /tmp/openarms_backend.log"
echo -e " • Frontend: tail -f /tmp/openarms_frontend.log"
echo ""
echo -e "${RED}Press Ctrl+C to stop all services${NC}"
echo ""
# Keep script running and wait for any service to exit
wait
+7
View File
@@ -0,0 +1,7 @@
import { createRoot } from 'react-dom/client'
import App from './App.jsx'
createRoot(document.getElementById('root')).render(
<App />
)
File diff suppressed because it is too large Load Diff
@@ -0,0 +1,21 @@
{
"name": "openarms-web-interface",
"private": true,
"version": "0.0.0",
"type": "module",
"scripts": {
"dev": "vite",
"build": "vite build",
"preview": "vite preview"
},
"dependencies": {
"react": "^18.3.1",
"react-dom": "^18.3.1"
},
"devDependencies": {
"@types/react": "^18.3.12",
"@types/react-dom": "^18.3.1",
"@vitejs/plugin-react": "^4.3.4",
"vite": "^6.0.1"
}
}
@@ -0,0 +1,17 @@
import { defineConfig } from 'vite'
import react from '@vitejs/plugin-react'
// https://vite.dev/config/
export default defineConfig({
plugins: [react()],
server: {
port: 5173,
strictPort: false,
host: true,
open: false
},
build: {
outDir: 'dist',
sourcemap: true
}
})
File diff suppressed because it is too large Load Diff
+44 -46
View File
@@ -34,11 +34,12 @@ from lerobot.processor.converters import (
transition_to_observation,
transition_to_robot_action,
)
from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
from lerobot.robots.so_follower.robot_kinematic_processor import (
from lerobot.robots.so100_follower.config_so100_follower import SO100FollowerConfig
from lerobot.robots.so100_follower.robot_kinematic_processor import (
ForwardKinematicsJointsToEE,
InverseKinematicsEEToJoints,
)
from lerobot.robots.so100_follower.so100_follower import SO100Follower
from lerobot.scripts.lerobot_record import record_loop
from lerobot.utils.control_utils import init_keyboard_listener
from lerobot.utils.utils import log_say
@@ -142,24 +143,38 @@ def main():
listener, events = init_keyboard_listener()
init_rerun(session_name="phone_so100_evaluate")
try:
if not robot.is_connected:
raise ValueError("Robot is not connected!")
if not robot.is_connected:
raise ValueError("Robot is not connected!")
print("Starting evaluate loop...")
episode_idx = 0
for episode_idx in range(NUM_EPISODES):
log_say(f"Running inference, recording eval episode {episode_idx + 1} of {NUM_EPISODES}")
print("Starting evaluate loop...")
episode_idx = 0
for episode_idx in range(NUM_EPISODES):
log_say(f"Running inference, recording eval episode {episode_idx + 1} of {NUM_EPISODES}")
# Main record loop
# Main record loop
record_loop(
robot=robot,
events=events,
fps=FPS,
policy=policy,
preprocessor=preprocessor, # Pass the pre and post policy processors
postprocessor=postprocessor,
dataset=dataset,
control_time_s=EPISODE_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=make_default_teleop_action_processor(),
robot_action_processor=robot_ee_to_joints_processor,
robot_observation_processor=robot_joints_to_ee_pose_processor,
)
# Reset the environment if not stopping or re-recording
if not events["stop_recording"] and ((episode_idx < NUM_EPISODES - 1) or events["rerecord_episode"]):
log_say("Reset the environment")
record_loop(
robot=robot,
events=events,
fps=FPS,
policy=policy,
preprocessor=preprocessor, # Pass the pre and post policy processors
postprocessor=postprocessor,
dataset=dataset,
control_time_s=EPISODE_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
@@ -168,41 +183,24 @@ def main():
robot_observation_processor=robot_joints_to_ee_pose_processor,
)
# Reset the environment if not stopping or re-recording
if not events["stop_recording"] and (
(episode_idx < NUM_EPISODES - 1) or events["rerecord_episode"]
):
log_say("Reset the environment")
record_loop(
robot=robot,
events=events,
fps=FPS,
control_time_s=EPISODE_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=make_default_teleop_action_processor(),
robot_action_processor=robot_ee_to_joints_processor,
robot_observation_processor=robot_joints_to_ee_pose_processor,
)
if events["rerecord_episode"]:
log_say("Re-record episode")
events["rerecord_episode"] = False
events["exit_early"] = False
dataset.clear_episode_buffer()
continue
if events["rerecord_episode"]:
log_say("Re-record episode")
events["rerecord_episode"] = False
events["exit_early"] = False
dataset.clear_episode_buffer()
continue
# Save episode
dataset.save_episode()
episode_idx += 1
# Save episode
dataset.save_episode()
episode_idx += 1
finally:
# Clean up
log_say("Stop recording")
robot.disconnect()
listener.stop()
# Clean up
log_say("Stop recording")
robot.disconnect()
listener.stop()
dataset.finalize()
dataset.push_to_hub()
dataset.finalize()
dataset.push_to_hub()
if __name__ == "__main__":
+44 -46
View File
@@ -26,14 +26,15 @@ from lerobot.processor.converters import (
transition_to_observation,
transition_to_robot_action,
)
from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
from lerobot.robots.so_follower.robot_kinematic_processor import (
from lerobot.robots.so100_follower.config_so100_follower import SO100FollowerConfig
from lerobot.robots.so100_follower.robot_kinematic_processor import (
EEBoundsAndSafety,
EEReferenceAndDelta,
ForwardKinematicsJointsToEE,
GripperVelocityToJoint,
InverseKinematicsEEToJoints,
)
from lerobot.robots.so100_follower.so100_follower import SO100Follower
from lerobot.scripts.lerobot_record import record_loop
from lerobot.teleoperators.phone.config_phone import PhoneConfig, PhoneOS
from lerobot.teleoperators.phone.phone_processor import MapPhoneActionToRobotAction
@@ -149,23 +150,38 @@ def main():
listener, events = init_keyboard_listener()
init_rerun(session_name="phone_so100_record")
try:
if not robot.is_connected or not phone.is_connected:
raise ValueError("Robot or teleop is not connected!")
if not robot.is_connected or not phone.is_connected:
raise ValueError("Robot or teleop is not connected!")
print("Starting record loop. Move your phone to teleoperate the robot...")
episode_idx = 0
while episode_idx < NUM_EPISODES and not events["stop_recording"]:
log_say(f"Recording episode {episode_idx + 1} of {NUM_EPISODES}")
print("Starting record loop. Move your phone to teleoperate the robot...")
episode_idx = 0
while episode_idx < NUM_EPISODES and not events["stop_recording"]:
log_say(f"Recording episode {episode_idx + 1} of {NUM_EPISODES}")
# Main record loop
# Main record loop
record_loop(
robot=robot,
events=events,
fps=FPS,
teleop=phone,
dataset=dataset,
control_time_s=EPISODE_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=phone_to_robot_ee_pose_processor,
robot_action_processor=robot_ee_to_joints_processor,
robot_observation_processor=robot_joints_to_ee_pose,
)
# Reset the environment if not stopping or re-recording
if not events["stop_recording"] and (episode_idx < NUM_EPISODES - 1 or events["rerecord_episode"]):
log_say("Reset the environment")
record_loop(
robot=robot,
events=events,
fps=FPS,
teleop=phone,
dataset=dataset,
control_time_s=EPISODE_TIME_SEC,
control_time_s=RESET_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=phone_to_robot_ee_pose_processor,
@@ -173,43 +189,25 @@ def main():
robot_observation_processor=robot_joints_to_ee_pose,
)
# Reset the environment if not stopping or re-recording
if not events["stop_recording"] and (
episode_idx < NUM_EPISODES - 1 or events["rerecord_episode"]
):
log_say("Reset the environment")
record_loop(
robot=robot,
events=events,
fps=FPS,
teleop=phone,
control_time_s=RESET_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=phone_to_robot_ee_pose_processor,
robot_action_processor=robot_ee_to_joints_processor,
robot_observation_processor=robot_joints_to_ee_pose,
)
if events["rerecord_episode"]:
log_say("Re-recording episode")
events["rerecord_episode"] = False
events["exit_early"] = False
dataset.clear_episode_buffer()
continue
if events["rerecord_episode"]:
log_say("Re-recording episode")
events["rerecord_episode"] = False
events["exit_early"] = False
dataset.clear_episode_buffer()
continue
# Save episode
dataset.save_episode()
episode_idx += 1
# Save episode
dataset.save_episode()
episode_idx += 1
finally:
# Clean up
log_say("Stop recording")
robot.disconnect()
phone.disconnect()
listener.stop()
# Clean up
log_say("Stop recording")
robot.disconnect()
phone.disconnect()
listener.stop()
dataset.finalize()
dataset.push_to_hub()
dataset.finalize()
dataset.push_to_hub()
if __name__ == "__main__":
+23 -24
View File
@@ -23,10 +23,11 @@ from lerobot.processor.converters import (
robot_action_observation_to_transition,
transition_to_robot_action,
)
from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
from lerobot.robots.so_follower.robot_kinematic_processor import (
from lerobot.robots.so100_follower.config_so100_follower import SO100FollowerConfig
from lerobot.robots.so100_follower.robot_kinematic_processor import (
InverseKinematicsEEToJoints,
)
from lerobot.robots.so100_follower.so100_follower import SO100Follower
from lerobot.utils.constants import ACTION
from lerobot.utils.robot_utils import precise_sleep
from lerobot.utils.utils import log_say
@@ -73,34 +74,32 @@ def main():
# Connect to the robot
robot.connect()
try:
if not robot.is_connected:
raise ValueError("Robot is not connected!")
if not robot.is_connected:
raise ValueError("Robot is not connected!")
print("Starting replay loop...")
log_say(f"Replaying episode {EPISODE_IDX}")
for idx in range(len(episode_frames)):
t0 = time.perf_counter()
print("Starting replay loop...")
log_say(f"Replaying episode {EPISODE_IDX}")
for idx in range(len(episode_frames)):
t0 = time.perf_counter()
# Get recorded action from dataset
ee_action = {
name: float(actions[idx][ACTION][i])
for i, name in enumerate(dataset.features[ACTION]["names"])
}
# Get recorded action from dataset
ee_action = {
name: float(actions[idx][ACTION][i]) for i, name in enumerate(dataset.features[ACTION]["names"])
}
# Get robot observation
robot_obs = robot.get_observation()
# Get robot observation
robot_obs = robot.get_observation()
# Dataset EE -> robot joints
joint_action = robot_ee_to_joints_processor((ee_action, robot_obs))
# Dataset EE -> robot joints
joint_action = robot_ee_to_joints_processor((ee_action, robot_obs))
# Send action to robot
_ = robot.send_action(joint_action)
# Send action to robot
_ = robot.send_action(joint_action)
precise_sleep(max(1.0 / dataset.fps - (time.perf_counter() - t0), 0.0))
finally:
# Clean up
robot.disconnect()
precise_sleep(1.0 / dataset.fps - (time.perf_counter() - t0))
# Clean up
robot.disconnect()
if __name__ == "__main__":
+3 -2
View File
@@ -21,13 +21,14 @@ from lerobot.processor.converters import (
robot_action_observation_to_transition,
transition_to_robot_action,
)
from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
from lerobot.robots.so_follower.robot_kinematic_processor import (
from lerobot.robots.so100_follower.config_so100_follower import SO100FollowerConfig
from lerobot.robots.so100_follower.robot_kinematic_processor import (
EEBoundsAndSafety,
EEReferenceAndDelta,
GripperVelocityToJoint,
InverseKinematicsEEToJoints,
)
from lerobot.robots.so100_follower.so100_follower import SO100Follower
from lerobot.teleoperators.phone.config_phone import PhoneConfig, PhoneOS
from lerobot.teleoperators.phone.phone_processor import MapPhoneActionToRobotAction
from lerobot.teleoperators.phone.teleop_phone import Phone
+10 -10
View File
@@ -27,8 +27,8 @@ measuring consistency and ground truth alignment.
Usage:
# Basic usage with smolvla policy
uv run python examples/rtc/eval_dataset.py \
--policy.path=<USER>/smolvla_check_rtc_last3 \
--dataset.repo_id=<USER>/check_rtc \
--policy.path=helper2424/smolvla_check_rtc_last3 \
--dataset.repo_id=helper2424/check_rtc \
--rtc.execution_horizon=8 \
--device=mps \
--rtc.max_guidance_weight=10.0 \
@@ -58,16 +58,16 @@ Usage:
--device=cuda
uv run python examples/rtc/eval_dataset.py \
--policy.path=<USER>/reuben_pi0 \
--dataset.repo_id=<USER>/so101_cube_in_cup \
--policy.path=lipsop/reuben_pi0 \
--dataset.repo_id=ReubenLim/so101_cube_in_cup \
--rtc.execution_horizon=8 \
--device=cuda
# With torch.compile for faster inference (PyTorch 2.0+)
# Note: CUDA graphs disabled by default due to in-place ops in denoising loop
uv run python examples/rtc/eval_dataset.py \
--policy.path=<USER>/smolvla_check_rtc_last3 \
--dataset.repo_id=<USER>/check_rtc \
--policy.path=helper2424/smolvla_check_rtc_last3 \
--dataset.repo_id=helper2424/check_rtc \
--rtc.execution_horizon=8 \
--device=mps \
--use_torch_compile=true \
@@ -75,8 +75,8 @@ Usage:
# With torch.compile on CUDA (CUDA graphs disabled by default)
uv run python examples/rtc/eval_dataset.py \
--policy.path=<USER>/smolvla_check_rtc_last3 \
--dataset.repo_id=<USER>/check_rtc \
--policy.path=helper2424/smolvla_check_rtc_last3 \
--dataset.repo_id=helper2424/check_rtc \
--rtc.execution_horizon=8 \
--device=cuda \
--use_torch_compile=true \
@@ -84,8 +84,8 @@ Usage:
# Enable CUDA graphs (advanced - may cause tensor aliasing errors)
uv run python examples/rtc/eval_dataset.py \
--policy.path=<USER>/smolvla_check_rtc_last3 \
--dataset.repo_id=<USER>/check_rtc \
--policy.path=helper2424/smolvla_check_rtc_last3 \
--dataset.repo_id=helper2424/check_rtc \
--use_torch_compile=true \
--torch_compile_backend=inductor \
--torch_compile_mode=max-autotune \
+6 -17
View File
@@ -28,7 +28,7 @@ For simulation environments, see eval_with_simulation.py
Usage:
# Run RTC with Real robot with RTC
uv run examples/rtc/eval_with_real_robot.py \
--policy.path=<USER>/smolvla_check_rtc_last3 \
--policy.path=helper2424/smolvla_check_rtc_last3 \
--policy.device=mps \
--rtc.enabled=true \
--rtc.execution_horizon=20 \
@@ -41,7 +41,7 @@ Usage:
# Run RTC with Real robot without RTC
uv run examples/rtc/eval_with_real_robot.py \
--policy.path=<USER>/smolvla_check_rtc_last3 \
--policy.path=helper2424/smolvla_check_rtc_last3 \
--policy.device=mps \
--rtc.enabled=false \
--robot.type=so100_follower \
@@ -53,7 +53,7 @@ Usage:
# Run RTC with Real robot with pi0.5 policy
uv run examples/rtc/eval_with_real_robot.py \
--policy.path=<USER>/pi05_check_rtc \
--policy.path=helper2424/pi05_check_rtc \
--policy.device=mps \
--rtc.enabled=true \
--rtc.execution_horizon=20 \
@@ -94,9 +94,9 @@ from lerobot.rl.process import ProcessSignalHandler
from lerobot.robots import ( # noqa: F401
Robot,
RobotConfig,
bi_so_follower,
koch_follower,
so_follower,
so100_follower,
so101_follower,
)
from lerobot.robots.utils import make_robot_from_config
from lerobot.utils.constants import OBS_IMAGES
@@ -455,18 +455,7 @@ def demo_cli(cfg: RTCDemoConfig):
if cfg.policy.type == "pi05" or cfg.policy.type == "pi0":
config.compile_model = cfg.use_torch_compile
if config.use_peft:
from peft import PeftConfig, PeftModel
peft_pretrained_path = cfg.policy.pretrained_path
peft_config = PeftConfig.from_pretrained(peft_pretrained_path)
policy = policy_class.from_pretrained(
pretrained_name_or_path=peft_config.base_model_name_or_path, config=config
)
policy = PeftModel.from_pretrained(policy, peft_pretrained_path, config=peft_config)
else:
policy = policy_class.from_pretrained(cfg.policy.pretrained_path, config=config)
policy = policy_class.from_pretrained(cfg.policy.pretrained_path, config=config)
# Turn on RTC
policy.config.rtc_config = cfg.rtc
+44 -46
View File
@@ -34,11 +34,12 @@ from lerobot.processor.converters import (
transition_to_observation,
transition_to_robot_action,
)
from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
from lerobot.robots.so_follower.robot_kinematic_processor import (
from lerobot.robots.so100_follower.config_so100_follower import SO100FollowerConfig
from lerobot.robots.so100_follower.robot_kinematic_processor import (
ForwardKinematicsJointsToEE,
InverseKinematicsEEToJoints,
)
from lerobot.robots.so100_follower.so100_follower import SO100Follower
from lerobot.scripts.lerobot_record import record_loop
from lerobot.utils.control_utils import init_keyboard_listener
from lerobot.utils.utils import log_say
@@ -142,24 +143,38 @@ def main():
listener, events = init_keyboard_listener()
init_rerun(session_name="so100_so100_evaluate")
try:
if not robot.is_connected:
raise ValueError("Robot is not connected!")
if not robot.is_connected:
raise ValueError("Robot is not connected!")
print("Starting evaluate loop...")
episode_idx = 0
for episode_idx in range(NUM_EPISODES):
log_say(f"Running inference, recording eval episode {episode_idx + 1} of {NUM_EPISODES}")
print("Starting evaluate loop...")
episode_idx = 0
for episode_idx in range(NUM_EPISODES):
log_say(f"Running inference, recording eval episode {episode_idx + 1} of {NUM_EPISODES}")
# Main record loop
# Main record loop
record_loop(
robot=robot,
events=events,
fps=FPS,
policy=policy,
preprocessor=preprocessor, # Pass the pre and post policy processors
postprocessor=postprocessor,
dataset=dataset,
control_time_s=EPISODE_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=make_default_teleop_action_processor(),
robot_action_processor=robot_ee_to_joints_processor,
robot_observation_processor=robot_joints_to_ee_pose_processor,
)
# Reset the environment if not stopping or re-recording
if not events["stop_recording"] and ((episode_idx < NUM_EPISODES - 1) or events["rerecord_episode"]):
log_say("Reset the environment")
record_loop(
robot=robot,
events=events,
fps=FPS,
policy=policy,
preprocessor=preprocessor, # Pass the pre and post policy processors
postprocessor=postprocessor,
dataset=dataset,
control_time_s=EPISODE_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
@@ -168,41 +183,24 @@ def main():
robot_observation_processor=robot_joints_to_ee_pose_processor,
)
# Reset the environment if not stopping or re-recording
if not events["stop_recording"] and (
(episode_idx < NUM_EPISODES - 1) or events["rerecord_episode"]
):
log_say("Reset the environment")
record_loop(
robot=robot,
events=events,
fps=FPS,
control_time_s=EPISODE_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=make_default_teleop_action_processor(),
robot_action_processor=robot_ee_to_joints_processor,
robot_observation_processor=robot_joints_to_ee_pose_processor,
)
if events["rerecord_episode"]:
log_say("Re-record episode")
events["rerecord_episode"] = False
events["exit_early"] = False
dataset.clear_episode_buffer()
continue
if events["rerecord_episode"]:
log_say("Re-record episode")
events["rerecord_episode"] = False
events["exit_early"] = False
dataset.clear_episode_buffer()
continue
# Save episode
dataset.save_episode()
episode_idx += 1
# Save episode
dataset.save_episode()
episode_idx += 1
finally:
# Clean up
log_say("Stop recording")
robot.disconnect()
listener.stop()
# Clean up
log_say("Stop recording")
robot.disconnect()
listener.stop()
dataset.finalize()
dataset.push_to_hub()
dataset.finalize()
dataset.push_to_hub()
if __name__ == "__main__":
+46 -48
View File
@@ -27,14 +27,16 @@ from lerobot.processor.converters import (
transition_to_observation,
transition_to_robot_action,
)
from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
from lerobot.robots.so_follower.robot_kinematic_processor import (
from lerobot.robots.so100_follower.config_so100_follower import SO100FollowerConfig
from lerobot.robots.so100_follower.robot_kinematic_processor import (
EEBoundsAndSafety,
ForwardKinematicsJointsToEE,
InverseKinematicsEEToJoints,
)
from lerobot.robots.so100_follower.so100_follower import SO100Follower
from lerobot.scripts.lerobot_record import record_loop
from lerobot.teleoperators.so_leader import SO100Leader, SO100LeaderConfig
from lerobot.teleoperators.so100_leader.config_so100_leader import SO100LeaderConfig
from lerobot.teleoperators.so100_leader.so100_leader import SO100Leader
from lerobot.utils.control_utils import init_keyboard_listener
from lerobot.utils.utils import log_say
from lerobot.utils.visualization_utils import init_rerun
@@ -146,23 +148,38 @@ def main():
listener, events = init_keyboard_listener()
init_rerun(session_name="recording_phone")
try:
if not leader.is_connected or not follower.is_connected:
raise ValueError("Robot or teleop is not connected!")
if not leader.is_connected or not follower.is_connected:
raise ValueError("Robot or teleop is not connected!")
print("Starting record loop...")
episode_idx = 0
while episode_idx < NUM_EPISODES and not events["stop_recording"]:
log_say(f"Recording episode {episode_idx + 1} of {NUM_EPISODES}")
print("Starting record loop...")
episode_idx = 0
while episode_idx < NUM_EPISODES and not events["stop_recording"]:
log_say(f"Recording episode {episode_idx + 1} of {NUM_EPISODES}")
# Main record loop
# Main record loop
record_loop(
robot=follower,
events=events,
fps=FPS,
teleop=leader,
dataset=dataset,
control_time_s=EPISODE_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=leader_joints_to_ee,
robot_action_processor=ee_to_follower_joints,
robot_observation_processor=follower_joints_to_ee,
)
# Reset the environment if not stopping or re-recording
if not events["stop_recording"] and (episode_idx < NUM_EPISODES - 1 or events["rerecord_episode"]):
log_say("Reset the environment")
record_loop(
robot=follower,
events=events,
fps=FPS,
teleop=leader,
dataset=dataset,
control_time_s=EPISODE_TIME_SEC,
control_time_s=RESET_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=leader_joints_to_ee,
@@ -170,44 +187,25 @@ def main():
robot_observation_processor=follower_joints_to_ee,
)
# Reset the environment if not stopping or re-recording
if not events["stop_recording"] and (
episode_idx < NUM_EPISODES - 1 or events["rerecord_episode"]
):
log_say("Reset the environment")
record_loop(
robot=follower,
events=events,
fps=FPS,
teleop=leader,
control_time_s=RESET_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=leader_joints_to_ee,
robot_action_processor=ee_to_follower_joints,
robot_observation_processor=follower_joints_to_ee,
)
if events["rerecord_episode"]:
log_say("Re-recording episode")
events["rerecord_episode"] = False
events["exit_early"] = False
dataset.clear_episode_buffer()
continue
if events["rerecord_episode"]:
log_say("Re-recording episode")
events["rerecord_episode"] = False
events["exit_early"] = False
dataset.clear_episode_buffer()
continue
# Save episode
dataset.save_episode()
episode_idx += 1
# Save episode
dataset.save_episode()
episode_idx += 1
# Clean up
log_say("Stop recording")
leader.disconnect()
follower.disconnect()
listener.stop()
finally:
# Clean up
log_say("Stop recording")
leader.disconnect()
follower.disconnect()
listener.stop()
dataset.finalize()
dataset.push_to_hub()
dataset.finalize()
dataset.push_to_hub()
if __name__ == "__main__":
+22 -24
View File
@@ -24,10 +24,11 @@ from lerobot.processor.converters import (
robot_action_observation_to_transition,
transition_to_robot_action,
)
from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
from lerobot.robots.so_follower.robot_kinematic_processor import (
from lerobot.robots.so100_follower.config_so100_follower import SO100FollowerConfig
from lerobot.robots.so100_follower.robot_kinematic_processor import (
InverseKinematicsEEToJoints,
)
from lerobot.robots.so100_follower.so100_follower import SO100Follower
from lerobot.utils.constants import ACTION
from lerobot.utils.robot_utils import precise_sleep
from lerobot.utils.utils import log_say
@@ -74,35 +75,32 @@ def main():
# Connect to the robot
robot.connect()
try:
if not robot.is_connected:
raise ValueError("Robot is not connected!")
if not robot.is_connected:
raise ValueError("Robot is not connected!")
print("Starting replay loop...")
log_say(f"Replaying episode {EPISODE_IDX}")
for idx in range(len(episode_frames)):
t0 = time.perf_counter()
print("Starting replay loop...")
log_say(f"Replaying episode {EPISODE_IDX}")
for idx in range(len(episode_frames)):
t0 = time.perf_counter()
# Get recorded action from dataset
ee_action = {
name: float(actions[idx][ACTION][i])
for i, name in enumerate(dataset.features[ACTION]["names"])
}
# Get recorded action from dataset
ee_action = {
name: float(actions[idx][ACTION][i]) for i, name in enumerate(dataset.features[ACTION]["names"])
}
# Get robot observation
robot_obs = robot.get_observation()
# Get robot observation
robot_obs = robot.get_observation()
# Dataset EE -> robot joints
joint_action = robot_ee_to_joints_processor((ee_action, robot_obs))
# Dataset EE -> robot joints
joint_action = robot_ee_to_joints_processor((ee_action, robot_obs))
# Send action to robot
_ = robot.send_action(joint_action)
# Send action to robot
_ = robot.send_action(joint_action)
precise_sleep(max(1.0 / dataset.fps - (time.perf_counter() - t0), 0.0))
precise_sleep(1.0 / dataset.fps - (time.perf_counter() - t0))
finally:
# Clean up
robot.disconnect()
# Clean up
robot.disconnect()
if __name__ == "__main__":

Some files were not shown because too many files have changed in this diff Show More