add docs for rtc

Michel Aractingi
2025-11-18 17:46:49 +01:00
parent b7b0ac2456
commit 611159f8bb
6 changed files with 207 additions and 293 deletions
+6 -2
@@ -15,8 +15,6 @@
       title: Train a Robot with RL
     - local: hilserl_sim
       title: Train RL in Simulation
-    - local: async
-      title: Use Async Inference
     - local: multi_gpu_training
       title: Multi GPU training
   title: "Tutorials"
@@ -40,6 +38,12 @@
     - local: groot
       title: NVIDIA GR00T N1.5
   title: "Policies"
+- sections:
+    - local: async
+      title: Use Async Inference
+    - local: rtc
+      title: Real-Time Chunking (RTC)
+  title: "Inference"
 - sections:
     - local: envhub
       title: Environments from the Hub
+173
@@ -0,0 +1,173 @@
# Real-Time Chunking (RTC)
Real-Time Chunking (RTC) is an inference-time method that allows large, flow-matching based robotic policies, such as [Pi0](./pi0), [Pi0.5](./pi05), and [SmolVLA](./smolvla), to produce smooth, continuous, and reactive motion despite having high inference latency.
These policies generate chunks of future actions (e.g., 50 steps at a time) instead of single actions.
Because the models are large, producing a chunk takes many control steps, so the robot cannot simply wait for each new chunk without stalling.
Naively executing chunks leads to problems such as pauses, jerky transitions, or sudden changes in strategy whenever the next chunk arrives late or disagrees with the previously executed actions.
RTC solves this by asynchronously generating the next chunk while the robot continues executing the current one, and by guiding the new chunk so it aligns smoothly with the portion of the previous chunk that has already been executed.
## How RTC Works (simplified)
RTC lets the robot think ahead while it's still moving. When the robot is carrying out one chunk of actions, RTC starts creating the next chunk early.
But since the robot has already moved a bit by the time the new chunk is ready, RTC has to make sure the new chunk still lines up smoothly with what the robot is currently doing.
To do this, RTC treats the beginning of the new chunk like an inpainting or “fill-in-the-gaps” problem:
it gently adjusts the first part of the new chunk so it blends naturally with the robot's ongoing motion. The result is no pauses and no sudden jumps.
In technical terms, RTC adds a guidance term to the flow-matching denoising process that forces the overlapping timesteps of the new chunk to stay close to the executed portion of the previous chunk, typically using a soft transition mask.
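
To make the overlap concrete, the sketch below splits the indices of a new chunk into the three regions implied by the description above: a delay region that was already executed while inference ran, a blend region that is guided toward the previous chunk, and a free region. The helper name `split_chunk_regions` and the example numbers are illustrative assumptions, not part of the LeRobot API.

```python
def split_chunk_regions(chunk_size: int, inference_delay: int, execution_horizon: int):
    """Partition the indices of a new chunk into (delay, blend, free) ranges.

    Hypothetical helper for illustration only; not a LeRobot function.
    """
    if not 0 <= inference_delay <= execution_horizon <= chunk_size:
        raise ValueError("expected 0 <= inference_delay <= execution_horizon <= chunk_size")
    # Steps the robot executed from the previous chunk while the new one was computed
    delay = range(0, inference_delay)
    # Steps guided toward the previous chunk, weighted by the chosen schedule
    blend = range(inference_delay, execution_horizon)
    # Steps left to the new prediction without any constraint
    free = range(execution_horizon, chunk_size)
    return delay, blend, free


delay, blend, free = split_chunk_regions(chunk_size=50, inference_delay=4, execution_horizon=10)
print(list(delay), list(blend), len(free))
```

With a 50-step chunk, an inference delay of 4 and an execution horizon of 10, indices 0-3 fall in the delay region, 4-9 in the blend region, and the remaining 40 steps are unconstrained.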
## Quick Start
### Installation
RTC is built into LeRobot. Just install the policy dependencies you need:
```bash
# For Pi0 or Pi0.5
pip install -e ".[pi]"
# For SmolVLA
pip install -e ".[smolvla]"
```
### Using RTC with Pi0
Here's a minimal example of using RTC with Pi0:
```python
from lerobot.policies.pi0 import PI0Policy, PI0Config
from lerobot.configs.types import RTCAttentionSchedule
from lerobot.policies.rtc.configuration_rtc import RTCConfig

# Load Pi0 with RTC enabled
policy_cfg = PI0Config()

# Enable RTC
policy_cfg.rtc_config = RTCConfig(
    enabled=True,
    execution_horizon=10,  # How many steps to blend with previous chunk
    max_guidance_weight=10.0,  # How strongly to enforce consistency
    prefix_attention_schedule=RTCAttentionSchedule.LINEAR,  # Linear blend
)

# Load the policy
policy = PI0Policy.from_pretrained("lerobot/pi0_base", policy_cfg=policy_cfg, device="cuda")

# Now use predict_action_chunk with RTC parameters
execution_horizon = policy_cfg.rtc_config.execution_horizon
prev_chunk_left_over = None  # Will hold the leftover from previous chunk
inference_delay = 4  # How many steps of inference latency

# `num_steps`, `get_observation`, and `execute_actions` are placeholders for your own control loop
for step in range(num_steps):
    # Get observation from environment
    observation = get_observation()

    # Predict action chunk with RTC
    action_chunk = policy.predict_action_chunk(
        observation,
        inference_delay=inference_delay,
        prev_chunk_left_over=prev_chunk_left_over,
        execution_horizon=execution_horizon,
    )

    # Execute the first N actions
    execute_actions(action_chunk[:execution_horizon])

    # Save the rest for next iteration
    prev_chunk_left_over = action_chunk[inference_delay:]
```
## Key Parameters
`RTCConfig` has the following parameters to tune:
**`execution_horizon`**: How many timesteps from the previous chunk to maintain consistency with. Higher values mean smoother transitions but potentially less reactivity.
Typical values: 8-12 steps
```python
RTCConfig(execution_horizon=10)
```
**`max_guidance_weight`**: How strongly to enforce consistency with the previous chunk. Higher values give stronger smoothness but may over-constrain new predictions.
Typical values:
- Dataset evaluation: 10.0-100.0
- Real-time robot control: 1.0-10.0
**`prefix_attention_schedule`**: How to weight consistency across the overlap region.
- `LINEAR`: Linear decay from inference_delay to execution_horizon (recommended for getting started)
- `EXP`: Exponential decay (often performs better)
- `ONES`: Full weight across entire execution_horizon
- `ZEROS`: Binary (full weight up to inference_delay, then zero)
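
The four schedules can be pictured as per-timestep weights between 0 and 1. The sketch below is a minimal pure-Python illustration of the shapes described in the list above. The function name `prefix_weights`, the string schedule names, and the exact ramp and exponential formulas are assumptions made for illustration; LeRobot's implementation may differ in detail.

```python
import math


def prefix_weights(inference_delay: int, execution_horizon: int, chunk_size: int,
                   schedule: str = "linear") -> list[float]:
    """Per-timestep consistency weights for a new chunk (illustrative sketch only)."""
    end = execution_horizon
    start = min(inference_delay, end)
    weights = []
    for i in range(chunk_size):
        if i >= end:
            w = 0.0  # Past the execution horizon: no constraint
        elif schedule == "ones":
            w = 1.0  # Full weight across the whole execution horizon
        elif schedule == "zeros":
            w = 1.0 if i < start else 0.0  # Binary cut at the inference delay
        elif schedule in ("linear", "exp"):
            # Weight 1.0 before `start`, then a straight ramp that reaches 0 at `end`
            w = min(1.0, max(0.0, (start - 1 - i) / (end - start + 1) + 1))
            if schedule == "exp":
                # Reshape the ramp so it decays faster; endpoints 0 and 1 are preserved
                w = w * math.expm1(w) / (math.e - 1)
        else:
            raise ValueError(f"unknown schedule: {schedule}")
        weights.append(w)
    return weights


for name in ("linear", "exp", "ones", "zeros"):
    print(name, [round(w, 2) for w in prefix_weights(2, 6, 10, name)])
```

With `inference_delay=2`, `execution_horizon=6` and a chunk of 10 steps, the linear schedule in this sketch gives 1, 1, 0.8, 0.6, 0.4, 0.2 and then zeros; the exponential variant keeps the same endpoints but falls off more quickly in between.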
**`inference_delay`**: How many timesteps of inference latency your system has. This is passed to `predict_action_chunk()` rather than the config, since it may vary at runtime.
Typical values: 3-5 steps for dataset evaluation, dynamically calculated for real-time control
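
For real-time control, one plausible way to obtain `inference_delay` is to time an inference call and convert the measured latency into control steps. The helper below is a hypothetical sketch: the name `latency_to_delay_steps` and the choice to round up are assumptions, not LeRobot's implementation.

```python
import math


def latency_to_delay_steps(latency_s: float, fps: float) -> int:
    """Convert a measured inference latency (seconds) into whole control steps.

    Rounds up so the delay is never underestimated. Hypothetical helper.
    """
    if latency_s < 0 or fps <= 0:
        raise ValueError("latency_s must be >= 0 and fps must be > 0")
    return math.ceil(latency_s * fps)


# Example: 130 ms of latency at a 30 Hz control rate covers 3.9 steps, rounded up to 4
print(latency_to_delay_steps(0.130, 30))
```

Rounding up is the conservative choice here: underestimating the delay would leave steps unconstrained that the robot has in fact already executed.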
## Testing RTC Offline
Before running on a real robot, test RTC with dataset samples to visualize how it works:
```bash
python examples/rtc/eval_dataset.py \
--policy.path=lerobot/pi0_libero_finetuned \
--dataset.repo_id=HuggingFaceVLA/libero \
--rtc.execution_horizon=10 \
--rtc.max_guidance_weight=10.0 \
--device=cuda
```
## Testing RTC with a Real Robot
```bash
python examples/rtc/eval_with_real_robot.py \
--policy.path=${HF_USERNAME}/policy_repo_id \
--robot.type=so100_follower \
--robot.port=/dev/tty.usbmodem58FA0834591 \
--robot.cameras="{ gripper: {type: opencv, index_or_path: 1, width: 640, height: 480, fps: 30}, front: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
--task="Move green small object into the purple platform" \
--duration=120 \
--device=cuda
```
## How It Differs from the Async Inference in LeRobot
Both RTC and [async inference](./async) improve real-time robot control, but they solve different problems.
| Aspect | Async Inference | RTC |
| ------------- | -------------------------------------------------------------------------- | --------------------------------------------------- |
| **Problem** | Idle frames while waiting for inference | Discontinuities between action chunks |
| **Solution** | Decouple prediction from execution | Guide new chunks to continue smoothly from previous |
| **Benefit** | No waiting, continuous action | Smooth transitions, natural motion |
| **Best Used** | Large models with high inference latency                                   | Flow-matching based policies                        |
**Use both together** for maximum smoothness and reactivity!
## Advanced: Debug Tracking
RTC includes built-in debug tracking to help you understand what's happening during inference:
```python
# Enable debug tracking
policy_cfg.rtc_config.debug = True
policy_cfg.rtc_config.debug_maxlen = 100
# After inference, access debug data
debug_data = policy.rtc_processor.get_debug_data()
# Visualize denoising steps, corrections, etc.
from lerobot.policies.rtc.debug_visualizer import RTCDebugVisualizer
visualizer = RTCDebugVisualizer()
# ... create plots
```
See `examples/rtc/eval_dataset.py` for a complete example of visualization.
## References
- [Smooth-As-Butter Robot Policies](https://alexander-soare.github.io/robotics/2025/08/05/smooth-as-butter-robot-policies.html) - Excellent technical explanation with real robot results
- [Physical Intelligence - Real-Time Chunking](https://www.physicalintelligence.company/research/real_time_chunking) - Original paper and research
- [Kinetix RTC Implementation](https://github.com/Physical-Intelligence/real-time-chunking-kinetix) - Reference implementation from Physical Intelligence
-251
@@ -1,251 +0,0 @@
# Real-Time Chunking (RTC) Examples
This directory contains examples and evaluation scripts for Real-Time Chunking (RTC), a technique for improving action chunking policies in real-time robot control.
## Overview
Real-Time Chunking addresses the challenge of maintaining consistency and reactivity when using action chunking policies with non-negligible inference latency. It uses a guidance technique during diffusion sampling to blend new action predictions with previously planned actions.
**Key Benefits:**
- Maintains consistency between consecutive action chunks
- Reduces jitter and improves smoothness
- Adapts to inference delays dynamically
**Reference:** [Physical Intelligence - Real-Time Chunking](https://www.physicalintelligence.company/download/real_time_chunking.pdf)
## Scripts
### 1. `eval_dataset.py`
Offline evaluation on dataset samples with detailed visualization and validation.
**Features:**
- Compare RTC vs non-RTC predictions on two random dataset samples
- Validate RTC behavior (delay region, blend region, post-horizon region)
- Generate debug visualizations:
- Denoising step comparisons (x_t, v_t, x1_t, corrections)
- Final action predictions comparison
- Support for torch.compile() optimization
- Memory-efficient sequential policy loading for large models
**Usage:**
```bash
# Basic usage with SmolVLA policy
uv run python examples/rtc/eval_dataset.py \
--policy.path=helper2424/smolvla_check_rtc_last3 \
--dataset.repo_id=helper2424/check_rtc \
--rtc.execution_horizon=8 \
--device=mps \
--rtc.max_guidance_weight=10.0 \
--seed=10
# With Pi0.5 policy on CUDA
uv run python examples/rtc/eval_dataset.py \
--policy.path=lerobot/pi05_libero_finetuned \
--dataset.repo_id=HuggingFaceVLA/libero \
--rtc.execution_horizon=8 \
--device=cuda
# With Pi0 policy
uv run python examples/rtc/eval_dataset.py \
--policy.path=lerobot/pi0_libero_finetuned \
--dataset.repo_id=HuggingFaceVLA/libero \
--rtc.execution_horizon=8 \
--device=cuda
# With torch.compile for faster inference
uv run python examples/rtc/eval_dataset.py \
--policy.path=helper2424/smolvla_check_rtc_last3 \
--dataset.repo_id=helper2424/check_rtc \
--rtc.execution_horizon=8 \
--device=cuda \
--use_torch_compile=true \
--torch_compile_mode=max-autotune
# Enable CUDA graphs (advanced - may cause tensor aliasing errors)
uv run python examples/rtc/eval_dataset.py \
--policy.path=helper2424/smolvla_check_rtc_last3 \
--dataset.repo_id=helper2424/check_rtc \
--use_torch_compile=true \
--torch_compile_backend=inductor \
--torch_compile_mode=max-autotune \
--torch_compile_disable_cudagraphs=false
```
**Key Parameters:**
- `--policy.path`: Path to pretrained policy
- `--dataset.repo_id`: Dataset to evaluate on
- `--rtc.execution_horizon`: Number of steps to maintain consistency (default: 20)
- `--rtc.max_guidance_weight`: Maximum guidance weight (default: 10.0)
- `--rtc.prefix_attention_schedule`: Schedule type (ZEROS, ONES, LINEAR, EXP)
- `--inference_delay`: Inference delay for RTC (default: 4)
- `--seed`: Random seed for reproducibility (default: 42)
- `--output_dir`: Directory to save visualizations (default: rtc_debug_output)
- `--device`: Device to use (cuda, cpu, mps, auto)
- `--use_torch_compile`: Enable torch.compile() for faster inference
**Output:**
The script generates several visualization files in `rtc_debug_output/`:
- `denoising_xt_comparison.png` - Noisy state evolution during denoising
- `denoising_vt_comparison.png` - Velocity predictions during denoising
- `denoising_x1t_comparison.png` - Predicted final states during denoising
- `denoising_correction_comparison.png` - RTC guidance corrections applied
- `final_actions_comparison.png` - Final action predictions (prev_chunk, no_rtc, rtc)
The script also validates RTC behavior and reports:
- ✅ Delay region [0:inference_delay]: RTC = prev_chunk
- ✅ Blend region [inference_delay:execution_horizon]: prev_chunk ≤ RTC ≤ no_rtc
- ✅ Post-horizon [execution_horizon:]: RTC = no_rtc
### 2. `eval_with_real_robot.py`
Real-time evaluation on physical robots or simulation environments.
**Features:**
- Run policy with RTC on real robot or simulation
- Multi-threaded action execution and inference
- Action queue management with proper timing
- Latency tracking and adaptive inference delay
- Support for both robots and gym environments
- Support for torch.compile() optimization
**Usage:**
```bash
# With real robot
uv run python examples/rtc/eval_with_real_robot.py \
--policy.path=lerobot/smolvla_base \
--robot.type=so100 \
--task="pick up the cup" \
--duration=30.0
# With simulation environment
uv run python examples/rtc/eval_with_real_robot.py \
--policy.path=lerobot/smolvla_base \
--env.type=pusht \
--duration=60.0
# With policy compilation (CUDA only, not MPS)
uv run python examples/rtc/eval_with_real_robot.py \
--policy.path=lerobot/smolvla_base \
--robot.type=so100 \
--use_torch_compile=true \
--torch_compile_mode=max-autotune
```
**Key Parameters:**
- `--policy.path`: Path to pretrained policy
- `--robot.type` or `--env.type`: Robot or environment to use
- `--task`: Task description (for VLA models)
- `--rtc.execution_horizon`: Number of steps to maintain consistency (default: 10)
- `--rtc.max_guidance_weight`: Maximum guidance weight (default: 1.0)
- `--rtc.prefix_attention_schedule`: Schedule type (ZEROS, ONES, LINEAR, EXP)
- `--duration`: How long to run (seconds, default: 30.0)
- `--fps`: Action execution frequency (Hz, default: 10.0)
- `--action_queue_size_to_get_new_actions`: Queue size threshold to request new actions (default: 30)
- `--device`: Device to use (cuda, cpu, mps, auto)
- `--use_torch_compile`: Enable torch.compile() for faster inference
## Understanding RTC Parameters
### `execution_horizon`
Number of timesteps from previous chunk to maintain consistency with. Higher values mean more consistency but potentially less reactivity.
**Typical values:** 8-12 steps for dataset evaluation, 10 steps for real-time execution
### `max_guidance_weight`
Upper bound on guidance strength. Higher values give stronger consistency but may over-constrain new predictions.
**Typical values:**
- Dataset evaluation: 10.0-100.0 (can be higher for analysis)
- Real-time execution: 1.0-10.0 (more conservative)
### `prefix_attention_schedule`
How to weight consistency across the overlap region:
- `ZEROS`: Binary (full weight up to inference_delay, then zero)
- `ONES`: Full weight across entire execution_horizon
- `LINEAR`: Linear decay from inference_delay to execution_horizon
- `EXP`: Exponential decay (recommended)
**Recommended:** `EXP`
### `inference_delay`
Number of timesteps from the prefix to use for guidance. Typically calculated dynamically based on inference latency in real-time execution, but fixed for dataset evaluation.
**Typical values:** 3-5 steps for dataset evaluation
### `action_queue_size_to_get_new_actions` (real-time only)
Threshold for requesting new action chunks. Should be higher than `inference_delay + execution_horizon` to ensure smooth operation.
**Typical values:** 20-30 steps
## Validation Rules (Dataset Evaluation)
The dataset evaluation script validates that RTC behavior matches expectations:
1. **Delay Region [0:inference_delay]**: RTC actions should equal previous chunk
- Ensures consistency during the inference delay period
2. **Blend Region [inference_delay:execution_horizon]**: RTC should be between prev_chunk and no_rtc
- Smooth transition from previous plan to new predictions
3. **Post-Horizon [execution_horizon:]**: RTC should equal no_rtc
- Full adoption of new predictions after execution horizon
## Tips
1. **Start with dataset evaluation** (`eval_dataset.py`) to understand RTC behavior and tune parameters before running on robot
2. **Use visualizations** to debug unexpected behavior - check denoising steps and final actions
3. **Tune execution_horizon** based on your inference latency and action frequency
4. **Monitor validation output** - failures indicate potential implementation issues or misconfigured parameters
5. **Compare different schedules** - EXP usually works best but LINEAR can be more interpretable
## Troubleshooting
### Validation fails in delay region
- Check that `prev_chunk_left_over` is properly passed to the policy
- Verify RTC guidance is being applied during denoising
- Look at denoising visualizations to see where guidance diverges
### Validation fails in post-horizon region
- RTC and no_rtc use different noise - verify same noise is being used for comparison
- Check that weights are correctly zeroed out after execution horizon
- Review prefix_attention_schedule visualization
### Poor performance on real robot
- Increase `action_queue_size_to_get_new_actions` if you see warnings
- Reduce `max_guidance_weight` if robot is too conservative
- Try different `prefix_attention_schedule` values
- Enable torch.compile() for faster inference (CUDA only)
### Memory issues with large models
- The dataset evaluation script loads policies sequentially to minimize memory
- For real-time execution, only one policy is loaded
- Use smaller batch sizes if needed
## Related Documentation
- [RTC Implementation](../../src/lerobot/policies/rtc/modeling_rtc.py)
- [RTC Configuration](../../src/lerobot/policies/rtc/configuration_rtc.py)
- [Action Queue](../../src/lerobot/policies/rtc/action_queue.py)
- [Physical Intelligence Paper](https://www.physicalintelligence.company/download/real_time_chunking.pdf)
+1 -2
@@ -98,7 +98,6 @@ pygame-dep = ["pygame>=2.5.1,<2.7.0"]
 placo-dep = ["placo>=0.9.6,<0.10.0"]
 transformers-dep = ["transformers>=4.53.0,<5.0.0"]
 grpcio-dep = ["grpcio==1.73.1", "protobuf==6.31.0"] # TODO: Bumb dependency (compatible with wandb)
-matplotlib-dep = ["matplotlib>=3.10.3,<4.0.0"]
 # Motors
 feetech = ["feetech-servo-sdk>=1.0.0,<2.0.0"]
@@ -133,7 +132,7 @@ groot = [
 hilserl = ["lerobot[transformers-dep]", "gym-hil>=0.1.13,<0.2.0", "lerobot[grpcio-dep]", "lerobot[placo-dep]"]
 # Features
-async = ["lerobot[grpcio-dep]", "lerobot[matplotlib-dep]"]
+async = ["lerobot[grpcio-dep]", "matplotlib>=3.10.3,<4.0.0"]
 # Development
 dev = ["pre-commit>=3.7.0,<5.0.0", "debugpy>=1.8.1,<1.9.0", "lerobot[grpcio-dep]", "grpcio-tools==1.73.1"]
+27 -38
@@ -1,49 +1,38 @@
-# Real-Time Chunking (RTC) Module
-This module implements Real-Time Chunking and related adaptive inference techniques for robotics policies in LeRobot.
-## Overview
-Real-Time Chunking (RTC) addresses the challenge of real-time inference in action chunking policies by treating chunk generation as an inpainting problem. It strategically handles overlapping timesteps between action chunks using prefix attention mechanisms.
-It is particularly effective for handling long-horizon inference in robotics policies.
-## Integration with Policies
-RTC can be integrated with any policy that supports flow mathicng for chunking:
-- **SmolVLA**: Vision-language-action model with RTC support
-- **Pi0**: Action prediction model with adaptive chunking
-- **Pi05**: Action prediction model with adaptive chunking
-## Original Implementation
-This implementation is based on Physical Intelligence's Kinetix RTC:
-- [Original RTC implementation](https://github.com/Physical-Intelligence/real-time-chunking-kinetix/blob/main/src/model.py#L214)
-- [Kinetix GitHub Repository](https://github.com/Physical-Intelligence/real-time-chunking-kinetix)
-## References
-- [Real Time Chunking Paper](https://www.physicalintelligence.company/research/real_time_chunking)
-- [Physical Intelligence Kinetix](https://github.com/Physical-Intelligence/real-time-chunking-kinetix)
-## How to run
-### Check with data from the dataset
-```bash
-uv run python examples/rtc/eval_dataset.py \
-  --policy.path=helper2424/smolvla_check_rtc_last3 \
-  --dataset.repo_id=helper2424/check_rtc \
-  --rtc.execution_horizon=8 \
-  --device=mps \
-  --seed=42
-```
-This script will evaluate RTC on a data from a dataset and save the results to a file, u can check the results in the `rtc_debug_output` directory.
-The example output should look like this:
-![Flow Matching with RTC](./flow_matching.png)
-It shows how flow matching works with RTC and without it. The chart shows values of action predictions for each timestep. The colour shows the the generation progress. The blue ones - earlier timesteps, the yellow ones - later timesteps. The red line is the ground truth (previous action chunk).
+# Real-Time Chunking (RTC)
+This module contains the LeRobot implementation of **Real-Time Chunking (RTC)**, an inference-time technique for flow-matching based policies.
+**Note**: RTC is not a policy itself, but rather an inference enhancement that works with flow-matching based policies including [π₀](../pi0/), [π₀.₅](../pi05/), and [SmolVLA](../smolvla/).
+---
+## Citation
+If you use Real-Time Chunking in your work, please cite:
+```bibtex
+@misc{openpi2024,
+  author = {Physical Intelligence Lab},
+  title = {OpenPI: PyTorch Implementation of π0 and π0.5 Policies},
+  year = {2024},
+  publisher = {GitHub},
+  howpublished = {\url{https://github.com/Physical-Intelligence/openpi}},
+  license = {Apache-2.0}
+}
+@misc{black2025realtimeexecutionactionchunking,
+  title={Real-Time Execution of Action Chunking Flow Policies},
+  author={Kevin Black and Manuel Y. Galliker and Sergey Levine},
+  year={2025},
+  eprint={2506.07339},
+  archivePrefix={arXiv},
+  primaryClass={cs.RO},
+  url={https://arxiv.org/abs/2506.07339},
+}
+```
+---
+## License
+This implementation follows the **Apache 2.0 License**, consistent with the LeRobot project.
Binary file not shown.
