mirror of
https://github.com/huggingface/lerobot.git
synced 2026-05-17 17:50:09 +00:00
e627d6442e
- Add replay.py script and replay_viewer.html for browser-based EE trajectory visualization from glannuzel/grabette-dataset - Add viewer.html for interactive URDF inspection - Move OpenArm URDF and meshes into openarm_follower/urdf/ - Add virtual EE target frame (openarm_right_ee_target) at 7cm from link7 - Adapt evaluate.py for single right-arm OpenArm with wrist camera - Update docs with replay viewer usage - Update openarm_follower config, driver, and kinematic processor Made-with: Cursor
200 lines
9.8 KiB
Plaintext
200 lines
9.8 KiB
Plaintext
# UMI Data with pi0 Relative EE Actions
|
||
|
||
This guide explains how to prepare a UMI-collected dataset for training a pi0 policy with relative end-effector (EE) actions, and how to run inference with the trained model.
|
||
|
||
**What we will do:**
|
||
|
||
1. Recompute dataset statistics for relative actions and state.
|
||
2. Train pi0 with `derive_state_from_action=true` (full UMI pipeline).
|
||
3. Evaluate the trained policy on a real robot.
|
||
|
||
## Background
|
||
|
||
[UMI (Universal Manipulation Interface)](https://umi-gripper.github.io) collects manipulation data with hand-held grippers, recovering 6-DoF EE poses via SLAM. UMI datasets stored in LeRobot format already contain `action` (absolute EE pose) and wrist-camera images. To train pi0 with relative actions, we need **relative action statistics** — so the normalizer sees `(action − state)` distributions.
|
||
|
||
### Why relative actions?
|
||
|
||
With relative actions, each action in a chunk is an **offset from the current state** rather than an absolute target:
|
||
|
||
```
|
||
relative_action[i] = absolute_action[t + i] − state[t]
|
||
```
|
||
|
||
This is the representation advocated by UMI (Chi et al., 2024). Compared to absolute actions it removes the need for a consistent global coordinate frame, and compared to delta actions (each step relative to the previous) it avoids error accumulation across the chunk. See the [Action Representations](action_representations) guide for a full comparison.
|
||
|
||
### Full UMI mode: `derive_state_from_action`
|
||
|
||
When `derive_state_from_action=true`, pi0 automatically derives `observation.state` from the action column on the fly — no separate state column or dataset conversion step needed. Under the hood:
|
||
|
||
- `action_delta_indices` extends to `[-1, 0, 1, ..., 49]` (one extra leading timestep).
|
||
- `DeriveStateFromActionStep` extracts `[action[t-1], action[t]]` as a 2-step state and strips the extra timestep from the action chunk.
|
||
- `RelativeActionsProcessorStep` converts actions to offsets from `state[t]`.
|
||
- `RelativeStateProcessorStep` converts the 2-step state to relative proprioception (velocity + zeros) and flattens.
|
||
|
||
This single flag implies `use_relative_state=true` and `state_obs_steps=2`.
|
||
|
||
During **inference**, state comes from the robot (via FK), so `DeriveStateFromActionStep` is a no-op. `RelativeStateProcessorStep` buffers the previous state and applies the same conversion automatically.
|
||
|
||
## Step 1: Recompute Stats
|
||
|
||
Use the built-in CLI to recompute dataset statistics for relative actions and derive-state-from-action:
|
||
|
||
```bash
|
||
lerobot-edit-dataset \
|
||
--repo_id <your_dataset> \
|
||
--operation.type recompute_stats \
|
||
--operation.relative_action true \
|
||
--operation.derive_state_from_action true \
|
||
--operation.chunk_size 50 \
|
||
--operation.relative_exclude_joints "['gripper']" \
|
||
--push_to_hub true
|
||
```
|
||
|
||
The `derive_state_from_action` flag tells `recompute_stats` to read from the action column (instead of `observation.state`) when computing relative state stats. It automatically enables `relative_state=true` and `state_obs_steps=2`.
|
||
|
||
The `relative_exclude_joints` parameter specifies joints that stay absolute. Gripper commands are typically binary or continuous open/close and don't benefit from relative encoding. Leave it as `"[]"` to convert all dimensions to relative.
|
||
|
||
## Step 2: Train
|
||
|
||
No custom training script is needed — standard `lerobot-train` handles everything:
|
||
|
||
```bash
|
||
lerobot-train \
|
||
--dataset.repo_id=<hf_username>/<dataset_repo_id> \
|
||
--policy.type=pi0 \
|
||
--policy.pretrained_path=lerobot/pi0_base \
|
||
--policy.derive_state_from_action=true \
|
||
--policy.use_relative_actions=true \
|
||
--policy.relative_exclude_joints='["gripper"]'
|
||
```
|
||
|
||
`derive_state_from_action=true` auto-enables `use_relative_state=true` and `state_obs_steps=2`.
|
||
|
||
Under the hood, the training pipeline:
|
||
|
||
- Loads relative action stats and relative state stats from the dataset's `meta/stats.json`.
|
||
- Extends `action_delta_indices` to `[-1, 0, 1, ..., 49]` to load one extra leading timestep.
|
||
- `DeriveStateFromActionStep` extracts the 2-step state from the action chunk and strips the extra timestep.
|
||
- `RelativeActionsProcessorStep` converts actions to offsets from `state[t]`.
|
||
- `RelativeStateProcessorStep` converts the 2-step state to relative offsets from the current timestep, then flattens.
|
||
- `NormalizerProcessorStep` normalizes everything.
|
||
- The model trains on normalized relative values.
|
||
|
||
See the [pi0 documentation](pi0) for all available training options.
|
||
|
||
## Step 3: Evaluate
|
||
|
||
The evaluation script in `examples/umi_pi0_relative_ee/evaluate.py` runs inference on a real robot (SO-100 with EE space):
|
||
|
||
```bash
|
||
python examples/umi_pi0_relative_ee/evaluate.py
|
||
```
|
||
|
||
Edit `HF_MODEL_ID`, `HF_DATASET_ID`, and robot configuration at the top of the file.
|
||
|
||
### Latency compensation
|
||
|
||
For real robot deployment, you may want to skip the first few steps of each predicted action chunk to compensate for system latency. Set `LATENCY_SKIP_STEPS` in the evaluate script:
|
||
|
||
```python
|
||
LATENCY_SKIP_STEPS = 0 # ceil(total_latency_ms / (1000 / FPS))
|
||
```
|
||
|
||
For example, at 10Hz with ~200ms total latency, set `LATENCY_SKIP_STEPS = 2`.
|
||
|
||
The inference flow uses pi0's built-in processor pipeline — no custom wrappers needed:
|
||
|
||
1. **Robot → FK** — Joint positions are converted to EE pose via `ForwardKinematicsJointsToEE`, producing `observation.state`.
|
||
2. **Preprocessor** — `DeriveStateFromActionStep` is a no-op (state comes from robot). `RelativeStateProcessorStep` buffers previous state, stacks, and converts to relative. `RelativeActionsProcessorStep` caches state. `NormalizerProcessorStep` normalizes.
|
||
3. **pi0 inference** — The model predicts a normalized relative action chunk.
|
||
4. **Postprocessor** — `UnnormalizerProcessorStep` unnormalizes, then `AbsoluteActionsProcessorStep` adds the cached state back to get absolute EE targets.
|
||
5. **IK → Robot** — `InverseKinematicsEEToJoints` converts absolute EE targets to joint commands.
|
||
|
||
## Replay Viewer
|
||
|
||
Before running on hardware, you can visualize any dataset episode in a browser-based 3D viewer. The viewer shows the EE trajectory overlaid on the OpenArm URDF model, making it easy to sanity-check recorded data or debug unexpected behavior.
|
||
|
||
### Quick start
|
||
|
||
```bash
|
||
python examples/umi_pi0_relative_ee/replay.py
|
||
```
|
||
|
||
This extracts the trajectory from episode 0 of the default dataset, starts a local HTTP server, and opens the viewer at [http://localhost:8765/replay_viewer.html](http://localhost:8765/replay_viewer.html).
|
||
|
||
### Options
|
||
|
||
| Flag | Default | Description |
|
||
| ----------- | ---------------------------- | ------------------------------------ |
|
||
| `--repo-id` | `glannuzel/grabette-dataset` | HuggingFace dataset repo to load |
|
||
| `--episode` | `0` | Episode index to replay |
|
||
| `--port` | `8765` | HTTP server port |
|
||
| `--force` | off | Re-extract trajectory even if cached |
|
||
|
||
Example with a different dataset and episode:
|
||
|
||
```bash
|
||
python examples/umi_pi0_relative_ee/replay.py \
|
||
--repo-id myuser/my-dataset \
|
||
--episode 3 \
|
||
--port 8766
|
||
```
|
||
|
||
### Viewer controls
|
||
|
||
The panel in the top-left corner shows live EE coordinates (x, y, z, ax, ay, az) and gripper state for the current frame. Below that are transport controls:
|
||
|
||
- **Play / Pause** — toggle automatic playback.
|
||
- **Step buttons** (◀ ▶) — advance or rewind one frame at a time.
|
||
- **Reset** (⟳) — jump back to frame 0.
|
||
- **Scrubber** — drag to seek to any frame.
|
||
- **Speed selector** — 0.25×, 0.5×, 1×, 2×, or 4× playback speed.
|
||
|
||
The 3D scene uses orbit controls — click and drag to rotate, scroll to zoom, right-click drag to pan.
|
||
|
||
### Color legend
|
||
|
||
| Color | Meaning |
|
||
| ------------------ | --------------------------------------------- |
|
||
| Red sphere | Current EE position |
|
||
| Yellow trail | Past trajectory (frames already visited) |
|
||
| Dark trail | Future trajectory (frames ahead) |
|
||
| Orange ring + axes | URDF `ee_target` frame (zero-joint reference) |
|
||
|
||
The trajectory is automatically re-centered so that frame 0 aligns with the robot's `openarm_right_ee_target` link in the zero-joint pose.
|
||
|
||
## How the Pieces Fit Together
|
||
|
||
```
|
||
Training (full UMI mode: derive_state_from_action=true):
|
||
DataLoader (action: B,51,D)
|
||
→ DeriveStateFromActionStep (state = action[:,:2,:], action = action[:,1:,:])
|
||
→ RelativeActionsProcessorStep (action -= state[:,-1,:])
|
||
→ RelativeStateProcessorStep (state offsets from current, flatten → B,2*D)
|
||
→ NormalizerProcessorStep → pi0 model
|
||
|
||
Inference:
|
||
robot joints → FK → observation.state (absolute EE)
|
||
↓
|
||
DeriveStateFromActionStep (no-op)
|
||
↓
|
||
RelativeActionsProcessorStep (caches state)
|
||
↓
|
||
RelativeStateProcessorStep (buffers prev, stacks, subtracts, flattens)
|
||
↓
|
||
NormalizerProcessorStep → pi0 model → relative action chunk
|
||
↓
|
||
UnnormalizerProcessorStep
|
||
↓
|
||
AbsoluteActionsProcessorStep (+ cached state → absolute EE)
|
||
↓
|
||
IK → joint targets → robot
|
||
```
|
||
|
||
## References
|
||
|
||
- [UMI: Universal Manipulation Interface](https://umi-gripper.github.io) — Chi et al., 2024. Defines relative trajectory actions.
|
||
- [Action Representations](action_representations) — LeRobot guide comparing absolute, relative, and delta actions.
|
||
- [pi0 documentation](pi0) — Full pi0 configuration including `use_relative_actions`.
|
||
- [`examples/so100_to_so100_EE/`](https://github.com/huggingface/lerobot/tree/main/examples/so100_to_so100_EE) — EE-space evaluation example this builds on.
|