# UMI Data with pi0 Relative EE Actions

This guide explains how to prepare a UMI-collected dataset for training a pi0 policy with relative end-effector (EE) actions, and how to run inference with the trained model.

**What we will do:**

1. Add `observation.state` to an existing UMI LeRobot dataset.
2. Recompute the dataset statistics in relative action space.
3. Train pi0 with `use_relative_actions=True`.
4. Evaluate the trained policy on a real robot.

## Background

[UMI (Universal Manipulation Interface)](https://umi-gripper.github.io) collects manipulation data with hand-held grippers, recovering 6-DoF EE poses via SLAM. UMI datasets stored in LeRobot format already contain `action` (absolute EE pose) and wrist-camera images. To train pi0 with relative actions, we need two additions:

1. **`observation.state`**: the current EE pose the policy conditions on.
2. **Relative action statistics**: so the normalizer sees `(action − state)` distributions.

### Why relative actions?

With relative actions, each action in a chunk is an **offset from the current state** rather than an absolute target:

```
relative_action[i] = absolute_action[t + i] − state[t]
```

This is the representation advocated by UMI (Chi et al., 2024). Compared to absolute actions, it removes the need for a consistent global coordinate frame. Compared to delta actions (each step relative to the previous one), it avoids error accumulation across the chunk. See the [Action Representations](action_representations) guide for a full comparison.

## State-Action Offset

UMI SLAM produces a single trajectory of EE poses stored as `action`.
We derive `observation.state` from the same trajectory with a configurable offset:

```
state[t] = action[t - offset]
```

| Offset | `state[t]`    | Meaning                                                          |
| ------ | ------------- | ---------------------------------------------------------------- |
| 0      | `action[t]`   | State and action are the same pose at time t                     |
| 1      | `action[t-1]` | State is the previous action: where the gripper already arrived  |

An offset of 1 is the typical UMI convention: at decision time the "current state" is where the gripper _already is_ (the result of the previous command), and the action is where it should go next. At episode boundaries where `t < offset`, we clamp to `action[0]`.

## Step 1: Add `observation.state`

pi0 with `use_relative_actions=True` needs `observation.state` in the dataset to compute `action - state` on the fly. The script in `examples/umi_pi0_relative_ee/convert_umi_dataset.py` adds it.

Edit the constants at the top:

```python
HF_DATASET_ID = "/"

# Option A: Copy an existing feature as observation.state
STATE_SOURCE_FEATURE = "observation.joints"  # or "observation.pose", etc.

# Option B: Derive from action with offset (set STATE_SOURCE_FEATURE = None)
STATE_SOURCE_FEATURE = None
STATE_ACTION_OFFSET = 1
```

**Choosing the state source:**

- If your dataset already has a feature in the same space as `action` (e.g. `observation.joints` for joint-space actions, or `observation.pose` for EE-space actions), set `STATE_SOURCE_FEATURE` to copy it.
- If your dataset only has a single trajectory (like raw UMI EE data where action = the EE poses), set `STATE_SOURCE_FEATURE = None` and use `STATE_ACTION_OFFSET` to derive state from the action column with a time offset.

`observation.state` **must have the same shape as `action`**, because the relative conversion computes `action - state` element-wise.

Then run:

```bash
python examples/umi_pi0_relative_ee/convert_umi_dataset.py
```

If your dataset already has `observation.state`, the script exits early and there is nothing to do.
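The offset rule and the element-wise `action - state` conversion described above can be illustrated with a small, dependency-free sketch. It is not the code in `convert_umi_dataset.py`: the function names `derive_state_from_action` and `to_relative_chunk` are hypothetical, frames are assumed to be stored contiguously per episode, and chunks that run past the end of an episode are not handled.

```python
from __future__ import annotations

from typing import Sequence


def derive_state_from_action(
    actions: Sequence[Sequence[float]],
    episode_index: Sequence[int],
    offset: int = 1,
) -> list[list[float]]:
    """state[t] = action[t - offset], clamped to the first frame of each episode."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    # First global frame index of every episode (assumes contiguous, ordered frames).
    episode_start: dict[int, int] = {}
    for t, ep in enumerate(episode_index):
        episode_start.setdefault(ep, t)
    return [
        list(actions[max(t - offset, episode_start[ep])])
        for t, ep in enumerate(episode_index)
    ]


def to_relative_chunk(
    actions: Sequence[Sequence[float]],
    state: Sequence[Sequence[float]],
    t: int,
    chunk_size: int,
    exclude_dims: Sequence[int] = (),
) -> list[list[float]]:
    """relative_action[i] = action[t + i] - state[t]; excluded dims stay absolute."""
    excluded = set(exclude_dims)
    return [
        [a if d in excluded else a - s for d, (a, s) in enumerate(zip(action, state[t]))]
        for action in actions[t : t + chunk_size]
    ]


# Toy data (hypothetical): each frame is (x, y, gripper), two episodes.
actions = [
    [1.0, 0.0, 1.0],
    [2.0, 0.0, 1.0],
    [4.0, 0.0, 0.0],
    [10.0, 0.0, 1.0],
    [12.0, 0.0, 0.0],
]
episode_index = [0, 0, 0, 1, 1]

state = derive_state_from_action(actions, episode_index, offset=1)
relative = to_relative_chunk(actions, state, t=1, chunk_size=2, exclude_dims=(2,))

print(state[3])  # [10.0, 0.0, 1.0]: the first frame of episode 1 clamps to its own action
print(relative)  # [[1.0, 0.0, 1.0], [3.0, 0.0, 0.0]]: gripper (dim 2) stays absolute
```

Because state and action are subtracted dimension by dimension, the sketch only makes sense when both have the same length, which is the shape requirement stated above.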
## Step 2: Recompute Relative Action Stats

Use the built-in CLI to recompute dataset statistics in relative space:

```bash
lerobot-edit-dataset \
    --repo_id \
    --operation.type recompute_stats \
    --operation.relative_action true \
    --operation.chunk_size 50 \
    --operation.relative_exclude_joints "['gripper']" \
    --push_to_hub true
```

The `relative_exclude_joints` parameter specifies joints that stay absolute. Gripper commands are typically binary or continuous open/close values and don't benefit from relative encoding. Leave it as `"[]"` to convert all dimensions to relative.

## Step 3: Train

No custom training script is needed. Standard `lerobot-train` handles everything:

```bash
lerobot-train \
    --dataset.repo_id=/ \
    --policy.type=pi0 \
    --policy.pretrained_path=lerobot/pi0_base \
    --policy.use_relative_actions=true \
    --policy.relative_exclude_joints='["gripper"]'
```

Under the hood, the training pipeline:

- Loads relative action stats from the dataset's `meta/stats.json`.
- Configures `RelativeActionsProcessorStep` in the preprocessor (absolute → relative before normalization).
- Trains the model on normalized relative action values.

See the [pi0 documentation](pi0) for all available training options.

## Step 4: Evaluate

The evaluation script in `examples/umi_pi0_relative_ee/evaluate.py` runs inference on a real robot (SO-100 with EE space):

```bash
python examples/umi_pi0_relative_ee/evaluate.py
```

Edit `HF_MODEL_ID`, `HF_DATASET_ID`, and the robot configuration at the top of the file.

The inference flow uses pi0's built-in processor pipeline, so no custom wrappers are needed:

1. **Robot → FK**: Joint positions are converted to an EE pose via `ForwardKinematicsJointsToEE`, producing `observation.state`.
2. **Preprocessor**: `RelativeActionsProcessorStep` caches the raw `observation.state`, then `NormalizerProcessorStep` normalizes everything.
3. **pi0 inference**: The model predicts a normalized relative action chunk.
4. **Postprocessor**: `UnnormalizerProcessorStep` unnormalizes, then `AbsoluteActionsProcessorStep` adds the cached state back to get absolute EE targets.
5. **IK → Robot**: `InverseKinematicsEEToJoints` converts absolute EE targets to joint commands.

## How the Pieces Fit Together

```
Training:
  dataset (absolute EE) → RelativeActionsProcessorStep → NormalizerProcessorStep → pi0 model

Inference:
  robot joints → FK → observation.state (absolute EE)
                        ↓
          RelativeActionsProcessorStep (caches state)
                        ↓
          NormalizerProcessorStep → pi0 model → relative action chunk
                        ↓
          UnnormalizerProcessorStep
                        ↓
          AbsoluteActionsProcessorStep (+ cached state → absolute EE)
                        ↓
          IK → joint targets → robot
```

## References

- [UMI: Universal Manipulation Interface](https://umi-gripper.github.io): Chi et al., 2024. Defines relative trajectory actions.
- [Action Representations](action_representations): LeRobot guide comparing absolute, relative, and delta actions.
- [pi0 documentation](pi0): Full pi0 configuration including `use_relative_actions`.
- [`examples/so100_to_so100_EE/`](https://github.com/huggingface/lerobot/tree/main/examples/so100_to_so100_EE): EE-space evaluation example this builds on.