first iteration

This commit is contained in:
Steven Palma
2026-04-14 15:42:04 +02:00
parent cff4bcf4a0
commit 05a2604d6e
27 changed files with 2822 additions and 2751 deletions
+7 -7
View File
@@ -50,25 +50,25 @@ This process can be repeated iteratively: deploy, collect, fine-tune, repeat. Ea
### Teleoperator Requirements
The `examples/hil` HIL scripts require **teleoperators with active motors** that can:
The `lerobot-rollout --strategy.type=dagger` mode requires **teleoperators with active motors** that can:
- Enable/disable torque programmatically
- Move to target positions (to mirror the robot state when pausing)
**Compatible teleoperators in the current `examples/hil` scripts:**
**Compatible teleoperators:**
- `openarm_mini` - OpenArm Mini
- `so_leader` - SO100 / SO101 leader arm
> [!IMPORTANT]
> The provided `examples/hil` commands default to `bi_openarm_follower` + `openarm_mini`.
> The provided commands default to `bi_openarm_follower` + `openarm_mini`.
> `so_follower` + `so_leader` configs are also registered and can be used via CLI flags.
---
## Script
A single script handles both synchronous and RTC-based inference. Toggle RTC with `--rtc.enabled=true`:
Use `lerobot-rollout` with `--strategy.type=dagger` for HIL data collection. Toggle RTC with `--rtc.enabled=true`:
| Mode | Flag | Models |
| ------------------------ | -------------------- | --------------------- |
@@ -97,7 +97,7 @@ python src/lerobot/scripts/lerobot_train.py \
**Standard inference (ACT, Diffusion Policy):**
```bash
python examples/hil/hil_data_collection.py \
lerobot-rollout --strategy.type=dagger \
--robot.type=bi_openarm_follower \
--robot.left_arm_config.port=can1 \
--robot.left_arm_config.side=left \
@@ -121,7 +121,7 @@ python examples/hil/hil_data_collection.py \
For models with high inference latency, enable RTC for smooth execution:
```bash
python examples/hil/hil_data_collection.py \
lerobot-rollout --strategy.type=dagger \
--rtc.enabled=true \
--rtc.execution_horizon=20 \
--rtc.max_guidance_weight=5.0 \
@@ -235,7 +235,7 @@ This HIL data collection approach builds on ideas from interactive imitation lea
- **HG-DAgger** (Kelly et al., 2019) made this practical for robotics: a human expert monitors the robot and only intervenes when needed, rather than labeling every state. The gating between autonomous and human control is exactly the pause → takeover → return-to-policy loop used in the scripts here.
- **RaC** (Hu et al., 2025) scales this loop to long-horizon tasks by explicitly decomposing interventions into **recovery** (teleoperating back to a good state) and **correction** (demonstrating the right behavior from there). This decomposition is the protocol followed by the HIL scripts in `examples/hil`.
- **RaC** (Hu et al., 2025) scales this loop to long-horizon tasks by explicitly decomposing interventions into **recovery** (teleoperating back to a good state) and **correction** (demonstrating the right behavior from there). This decomposition is the protocol followed by the DAgger strategy in `lerobot-rollout`.
- **π0.6/RECAP** (Physical Intelligence, 2025) applies the same iterative collect-and-finetune loop at scale with VLA models, showing that even large pretrained policies benefit substantially from targeted human corrections on their own failure modes. π0.6 is trained using RECAP.
+27 -105
View File
@@ -503,121 +503,43 @@ hf upload ${HF_USER}/act_so101_test${CKPT} \
## Run inference and evaluate your policy
You can use the `record` script from [`lerobot-record`](https://github.com/huggingface/lerobot/blob/main/src/lerobot/scripts/lerobot_record.py) with a policy checkpoint as input, to run inference and evaluate your policy. For instance, run this command or API example to run inference and record 10 evaluation episodes:
Use `lerobot-rollout` to deploy a trained policy on your robot. You can choose different strategies depending on your needs:
<hfoptions id="eval">
<hfoption id="Command">
<hfoption id="Base mode (no recording)">
```bash
lerobot-record \
lerobot-rollout \
--strategy.type=base \
--policy.path=${HF_USER}/my_policy \
--robot.type=so100_follower \
--robot.port=/dev/ttyACM1 \
--robot.cameras="{ up: {type: opencv, index_or_path: /dev/video10, width: 640, height: 480, fps: 30}, side: {type: intelrealsense, serial_number_or_name: 233522074606, width: 640, height: 480, fps: 30}}" \
--robot.id=my_awesome_follower_arm \
--display_data=false \
--dataset.repo_id=${HF_USER}/eval_so100 \
--dataset.single_task="Put lego brick into the transparent box" \
--dataset.streaming_encoding=true \
--dataset.encoder_threads=2 \
# --dataset.vcodec=auto \
# <- Teleop optional if you want to teleoperate in between episodes \
# --teleop.type=so100_leader \
# --teleop.port=/dev/ttyACM0 \
# --teleop.id=my_awesome_leader_arm \
--policy.path=${HF_USER}/my_policy
--task="Put lego brick into the transparent box" \
--duration=60
```
</hfoption>
<hfoption id="API example">
<!-- prettier-ignore-start -->
```python
from lerobot.cameras.opencv import OpenCVCameraConfig
from lerobot.datasets import LeRobotDataset
from lerobot.utils.feature_utils import hw_to_dataset_features
from lerobot.policies.act import ACTPolicy
from lerobot.policies import make_pre_post_processors
from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
from lerobot.scripts.lerobot_record import record_loop
from lerobot.common.control_utils import init_keyboard_listener
from lerobot.utils.utils import log_say
from lerobot.utils.visualization_utils import init_rerun
NUM_EPISODES = 5
FPS = 30
EPISODE_TIME_SEC = 60
TASK_DESCRIPTION = "My task description"
HF_MODEL_ID = "<hf_username>/<model_repo_id>"
HF_DATASET_ID = "<hf_username>/<eval_dataset_repo_id>"
# Create the robot configuration
camera_config = {"front": OpenCVCameraConfig(index_or_path=0, width=640, height=480, fps=FPS)}
robot_config = SO100FollowerConfig(
port="/dev/tty.usbmodem58760434471", id="my_awesome_follower_arm", cameras=camera_config
)
# Initialize the robot
robot = SO100Follower(robot_config)
# Initialize the policy
policy = ACTPolicy.from_pretrained(HF_MODEL_ID)
# Configure the dataset features
action_features = hw_to_dataset_features(robot.action_features, "action")
obs_features = hw_to_dataset_features(robot.observation_features, "observation")
dataset_features = {**action_features, **obs_features}
# Create the dataset
dataset = LeRobotDataset.create(
repo_id=HF_DATASET_ID,
fps=FPS,
features=dataset_features,
robot_type=robot.name,
use_videos=True,
image_writer_threads=4,
)
# Initialize the keyboard listener and rerun visualization
_, events = init_keyboard_listener()
init_rerun(session_name="recording")
# Connect the robot
robot.connect()
preprocessor, postprocessor = make_pre_post_processors(
policy_cfg=policy,
pretrained_path=HF_MODEL_ID,
dataset_stats=dataset.meta.stats,
)
for episode_idx in range(NUM_EPISODES):
log_say(f"Running inference, recording eval episode {episode_idx + 1} of {NUM_EPISODES}")
# Run the policy inference loop
record_loop(
robot=robot,
events=events,
fps=FPS,
policy=policy,
preprocessor=preprocessor,
postprocessor=postprocessor,
dataset=dataset,
control_time_s=EPISODE_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
)
dataset.save_episode()
# Clean up
robot.disconnect()
dataset.push_to_hub()
<hfoption id="Sentry mode (with recording)">
```bash
lerobot-rollout \
--strategy.type=sentry \
--strategy.episode_duration_s=60 \
--strategy.upload_every_n_episodes=5 \
--policy.path=${HF_USER}/my_policy \
--robot.type=so100_follower \
--robot.port=/dev/ttyACM1 \
--robot.cameras="{ up: {type: opencv, index_or_path: /dev/video10, width: 640, height: 480, fps: 30}, side: {type: intelrealsense, serial_number_or_name: 233522074606, width: 640, height: 480, fps: 30}}" \
--dataset.repo_id=${HF_USER}/eval_so100 \
--dataset.single_task="Put lego brick into the transparent box" \
--duration=600
```
<!-- prettier-ignore-end -->
</hfoption>
</hfoptions>
As you can see, it's almost the same command as previously used to record your training dataset. Two things changed:
The `--strategy.type` flag selects the execution mode:
1. There is an additional `--control.policy.path` argument which indicates the path to your policy checkpoint with (e.g. `outputs/train/eval_act_so101_test/checkpoints/last/pretrained_model`). You can also use the model repository if you uploaded a model checkpoint to the hub (e.g. `${HF_USER}/act_so101_test`).
2. The name of dataset begins by `eval` to reflect that you are running inference (e.g. `${HF_USER}/eval_act_so101_test`).
- `base`: Autonomous rollout with no data recording (useful for quick evaluation)
- `sentry`: Continuous recording with auto-upload (useful for large-scale evaluation)
- `highlight`: Ring buffer recording with keystroke save (useful for capturing interesting events)
- `dagger`: Human-in-the-loop data collection (see [HIL Data Collection](./hil_data_collection))
All strategies support `--rtc.enabled=true` for smooth execution with slow VLA models (Pi0, Pi0.5, SmolVLA).
+7 -3
View File
@@ -34,7 +34,7 @@ pip install -e ".[smolvla]"
### Using RTC with Pi0
You can find a complete reference implementation in [eval_with_real_robot.py](examples/rtc/eval_with_real_robot.py).
You can use `lerobot-rollout --strategy.type=base --rtc.enabled=true` for RTC deployment on real robots.
The snippet below provides a simplified pseudo-example of how RTC operates with Pi0 in your pipeline:
```python
@@ -137,8 +137,12 @@ The script generates a visualization of the denoising process, comparing standar
## Testing RTC with a Real Robot
```bash
python examples/rtc/eval_with_real_robot.py \
lerobot-rollout \
--strategy.type=base \
--policy.path=${HF_USERNAME}/policy_repo_id \
--rtc.enabled=true \
--rtc.execution_horizon=10 \
--rtc.max_guidance_weight=10.0 \
--robot.type=so100_follower \
--robot.port=/dev/tty.usbmodem58FA0834591 \
--robot.cameras="{ gripper: {type: opencv, index_or_path: 1, width: 640, height: 480, fps: 30}, front: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
@@ -178,7 +182,7 @@ visualizer = RTCDebugVisualizer()
# ... create plots
```
See `examples/rtc/eval_dataset.py` for a complete example of visualization.
See `examples/rtc/eval_dataset.py` for a complete example of offline RTC visualization.
## References
File diff suppressed because it is too large Load Diff
-226
View File
@@ -1,226 +0,0 @@
# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Shared utilities for Human-in-the-Loop data collection scripts."""
import logging
import time
from dataclasses import dataclass, field
from pathlib import Path
from lerobot.common.control_utils import is_headless
from lerobot.processor import (
IdentityProcessorStep,
RobotAction,
RobotObservation,
RobotProcessorPipeline,
observation_to_transition,
robot_action_observation_to_transition,
transition_to_observation,
transition_to_robot_action,
)
from lerobot.robots import Robot
from lerobot.teleoperators import Teleoperator
from lerobot.utils.robot_utils import precise_sleep
logger = logging.getLogger(__name__)
@dataclass
class HILDatasetConfig:
repo_id: str
single_task: str
root: str | Path | None = None
fps: int = 30
episode_time_s: float = 120
num_episodes: int = 50
video: bool = True
push_to_hub: bool = True
private: bool = False
tags: list[str] | None = None
num_image_writer_processes: int = 0
num_image_writer_threads_per_camera: int = 4
video_encoding_batch_size: int = 1
vcodec: str = "auto"
streaming_encoding: bool = True
encoder_queue_maxsize: int = 30
encoder_threads: int | None = None
rename_map: dict[str, str] = field(default_factory=dict)
def teleop_has_motor_control(teleop: Teleoperator) -> bool:
"""Check if teleoperator has motor control capabilities."""
return all(hasattr(teleop, attr) for attr in ("enable_torque", "disable_torque", "write_goal_positions"))
def teleop_disable_torque(teleop: Teleoperator) -> None:
"""Disable teleop torque if supported."""
if hasattr(teleop, "disable_torque"):
teleop.disable_torque()
def teleop_enable_torque(teleop: Teleoperator) -> None:
"""Enable teleop torque if supported."""
if hasattr(teleop, "enable_torque"):
teleop.enable_torque()
def teleop_smooth_move_to(teleop: Teleoperator, target_pos: dict, duration_s: float = 2.0, fps: int = 50):
"""Smoothly move teleop to target position if motor control is available."""
if not teleop_has_motor_control(teleop):
logger.warning("Teleop does not support motor control - cannot mirror robot position")
return
teleop_enable_torque(teleop)
current = teleop.get_action()
steps = max(int(duration_s * fps), 1)
for step in range(steps + 1):
t = step / steps
interp = {}
for k in current:
if k in target_pos:
interp[k] = current[k] * (1 - t) + target_pos[k] * t
else:
interp[k] = current[k]
teleop.write_goal_positions(interp)
time.sleep(1 / fps)
def init_keyboard_listener():
"""Initialize keyboard listener with HIL controls."""
events = {
"exit_early": False,
"rerecord_episode": False,
"stop_recording": False,
"policy_paused": False,
"correction_active": False,
"resume_policy": False,
"in_reset": False,
"start_next_episode": False,
}
if is_headless():
logger.warning("Headless environment - keyboard controls unavailable")
return None, events
from pynput import keyboard
def on_press(key):
try:
if events["in_reset"]:
if key in [keyboard.Key.space, keyboard.Key.right]:
logger.info("[HIL] Starting next episode...")
events["start_next_episode"] = True
elif hasattr(key, "char") and key.char == "c":
events["start_next_episode"] = True
elif key == keyboard.Key.esc:
logger.info("[HIL] ESC - Stop recording, pushing to hub...")
events["stop_recording"] = True
events["start_next_episode"] = True
else:
if key == keyboard.Key.space:
if not events["policy_paused"] and not events["correction_active"]:
logger.info("[HIL] PAUSED - Press 'c' to take control or 'p' to resume policy")
events["policy_paused"] = True
elif hasattr(key, "char") and key.char == "c":
if events["policy_paused"] and not events["correction_active"]:
logger.info("[HIL] Taking control...")
events["start_next_episode"] = True
elif hasattr(key, "char") and key.char == "p":
if events["policy_paused"] or events["correction_active"]:
logger.info("[HIL] Resuming policy...")
events["resume_policy"] = True
elif key == keyboard.Key.right:
logger.info("[HIL] End episode")
events["exit_early"] = True
elif key == keyboard.Key.left:
logger.info("[HIL] Re-record episode")
events["rerecord_episode"] = True
events["exit_early"] = True
elif key == keyboard.Key.esc:
logger.info("[HIL] ESC - Stop recording...")
events["stop_recording"] = True
events["exit_early"] = True
except Exception as e:
logger.info(f"Key error: {e}")
listener = keyboard.Listener(on_press=on_press)
listener.start()
return listener, events
def make_identity_processors():
"""Create identity processors for recording."""
teleop_proc = RobotProcessorPipeline[tuple[RobotAction, RobotObservation], RobotAction](
steps=[IdentityProcessorStep()],
to_transition=robot_action_observation_to_transition,
to_output=transition_to_robot_action,
)
obs_proc = RobotProcessorPipeline[RobotObservation, RobotObservation](
steps=[IdentityProcessorStep()],
to_transition=observation_to_transition,
to_output=transition_to_observation,
)
return teleop_proc, obs_proc
def reset_loop(robot: Robot, teleop: Teleoperator, events: dict, fps: int):
"""Reset period where human repositions environment."""
logger.info("[HIL] RESET")
events["in_reset"] = True
events["start_next_episode"] = False
obs = robot.get_observation()
robot_pos = {k: v for k, v in obs.items() if k.endswith(".pos") and k in robot.observation_features}
teleop_smooth_move_to(teleop, robot_pos, duration_s=2.0, fps=50)
logger.info("Press any key to enable teleoperation")
while not events["start_next_episode"] and not events["stop_recording"]:
precise_sleep(0.05)
if events["stop_recording"]:
return
events["start_next_episode"] = False
teleop_disable_torque(teleop)
logger.info("Teleop enabled - press any key to start episode")
while not events["start_next_episode"] and not events["stop_recording"]:
loop_start = time.perf_counter()
action = teleop.get_action()
robot.send_action(action)
precise_sleep(1 / fps - (time.perf_counter() - loop_start))
events["in_reset"] = False
events["start_next_episode"] = False
events["exit_early"] = False
events["policy_paused"] = False
events["correction_active"] = False
events["resume_policy"] = False
def print_controls(rtc: bool = False):
"""Print control instructions."""
mode = "Human-in-the-Loop Data Collection" + (" (RTC)" if rtc else "")
logger.info(
"%s\n Controls:\n"
" SPACE - Pause policy\n"
" c - Take control\n"
" p - Resume policy after pause/correction\n"
" → - End episode\n"
" ESC - Stop and push to hub",
mode,
)
+62 -31
View File
@@ -14,17 +14,21 @@
# See the License for the specific language governing permissions and
# limitations under the License.
from lerobot.common.control_utils import init_keyboard_listener
import logging
import time
from lerobot.common.control_utils import init_keyboard_listener, predict_action
from lerobot.datasets import LeRobotDataset
from lerobot.policies import make_pre_post_processors
from lerobot.policies.act import ACTPolicy
from lerobot.policies.utils import make_robot_action
from lerobot.processor import make_default_processors
from lerobot.robots.lekiwi import LeKiwiClient, LeKiwiClientConfig
from lerobot.scripts.lerobot_record import record_loop
from lerobot.utils.constants import ACTION, OBS_STR
from lerobot.utils.feature_utils import hw_to_dataset_features
from lerobot.utils.feature_utils import build_dataset_frame, hw_to_dataset_features
from lerobot.utils.robot_utils import precise_sleep
from lerobot.utils.utils import log_say
from lerobot.utils.visualization_utils import init_rerun
from lerobot.utils.visualization_utils import init_rerun, log_rerun_data
NUM_EPISODES = 2
FPS = 30
@@ -35,6 +39,9 @@ HF_DATASET_ID = "<hf_username>/<eval_dataset_repo_id>"
def main():
# NOTE: For production policy deployment, use `lerobot-rollout` CLI instead.
# This script provides a self-contained example for educational purposes.
# Create the robot configuration & robot
robot_config = LeKiwiClientConfig(remote_ip="172.18.134.136", id="lekiwi")
@@ -83,43 +90,67 @@ def main():
raise ValueError("Robot is not connected!")
print("Starting evaluate loop...")
control_interval = 1 / FPS
recorded_episodes = 0
while recorded_episodes < NUM_EPISODES and not events["stop_recording"]:
log_say(f"Running inference, recording eval episode {recorded_episodes} of {NUM_EPISODES}")
# Main record loop
record_loop(
robot=robot,
events=events,
fps=FPS,
policy=policy,
preprocessor=preprocessor, # Pass the pre and post policy processors
postprocessor=postprocessor,
dataset=dataset,
control_time_s=EPISODE_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=teleop_action_processor,
robot_action_processor=robot_action_processor,
robot_observation_processor=robot_observation_processor,
)
# Inline evaluation loop: predict actions and send to robot
timestamp = 0
start_episode_t = time.perf_counter()
while timestamp < EPISODE_TIME_SEC:
start_loop_t = time.perf_counter()
if events["exit_early"]:
events["exit_early"] = False
break
# Get robot observation
obs = robot.get_observation()
obs_processed = robot_observation_processor(obs)
observation_frame = build_dataset_frame(dataset.features, obs_processed, prefix=OBS_STR)
# Predict action using the policy
action_tensor = predict_action(
observation=observation_frame,
policy=policy,
device=policy.config.device,
preprocessor=preprocessor,
postprocessor=postprocessor,
use_amp=policy.config.device.type == "cuda",
task=TASK_DESCRIPTION,
robot_type=robot.name,
)
# Convert policy output to robot action dict
action_values = make_robot_action(action_tensor, dataset.features)
# Process and send action to robot
robot_action_to_send = robot_action_processor((action_values, obs))
robot.send_action(robot_action_to_send)
# Write to dataset
action_frame = build_dataset_frame(dataset.features, action_values, prefix=ACTION)
frame = {**observation_frame, **action_frame, "task": TASK_DESCRIPTION}
dataset.add_frame(frame)
log_rerun_data(observation=obs_processed, action=action_values)
dt_s = time.perf_counter() - start_loop_t
sleep_time_s = control_interval - dt_s
if sleep_time_s < 0:
logging.warning(
f"Evaluate loop is running slower ({1 / dt_s:.1f} Hz) than the target FPS ({FPS} Hz)."
)
precise_sleep(max(sleep_time_s, 0.0))
timestamp = time.perf_counter() - start_episode_t
# Reset the environment if not stopping or re-recording
if not events["stop_recording"] and (
(recorded_episodes < NUM_EPISODES - 1) or events["rerecord_episode"]
):
log_say("Reset the environment")
record_loop(
robot=robot,
events=events,
fps=FPS,
control_time_s=EPISODE_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=teleop_action_processor,
robot_action_processor=robot_action_processor,
robot_observation_processor=robot_observation_processor,
)
log_say("Waiting for environment reset, press right arrow key when ready...")
if events["rerecord_episode"]:
log_say("Re-record episode")
+10 -9
View File
@@ -45,9 +45,6 @@ def main():
leader_arm = SO100Leader(leader_arm_config)
keyboard = KeyboardTeleop(keyboard_config)
# TODO(Steven): Update this example to use pipelines
teleop_action_processor, robot_action_processor, robot_observation_processor = make_default_processors()
# Configure the dataset features
action_features = hw_to_dataset_features(robot.action_features, ACTION)
obs_features = hw_to_dataset_features(robot.observation_features, OBS_STR)
@@ -77,6 +74,10 @@ def main():
if not robot.is_connected or not leader_arm.is_connected or not keyboard.is_connected:
raise ValueError("Robot or teleop is not connected!")
teleop_action_processor, robot_action_processor, robot_observation_processor = (
make_default_processors()
)
print("Starting record loop...")
recorded_episodes = 0
while recorded_episodes < NUM_EPISODES and not events["stop_recording"]:
@@ -87,14 +88,14 @@ def main():
robot=robot,
events=events,
fps=FPS,
teleop_action_processor=teleop_action_processor,
robot_action_processor=robot_action_processor,
robot_observation_processor=robot_observation_processor,
dataset=dataset,
teleop=[leader_arm, keyboard],
control_time_s=EPISODE_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=teleop_action_processor,
robot_action_processor=robot_action_processor,
robot_observation_processor=robot_observation_processor,
)
# Reset the environment if not stopping or re-recording
@@ -106,13 +107,13 @@ def main():
robot=robot,
events=events,
fps=FPS,
teleop_action_processor=teleop_action_processor,
robot_action_processor=robot_action_processor,
robot_observation_processor=robot_observation_processor,
teleop=[leader_arm, keyboard],
control_time_s=RESET_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=teleop_action_processor,
robot_action_processor=robot_action_processor,
robot_observation_processor=robot_observation_processor,
)
if events["rerecord_episode"]:
+63 -31
View File
@@ -14,13 +14,17 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import logging
import time
from lerobot.cameras.opencv import OpenCVCameraConfig
from lerobot.common.control_utils import init_keyboard_listener
from lerobot.common.control_utils import init_keyboard_listener, predict_action
from lerobot.configs import FeatureType, PolicyFeature
from lerobot.datasets import LeRobotDataset, aggregate_pipeline_dataset_features, create_initial_features
from lerobot.model.kinematics import RobotKinematics
from lerobot.policies import make_pre_post_processors
from lerobot.policies.act import ACTPolicy
from lerobot.policies.utils import make_robot_action
from lerobot.processor import (
RobotProcessorPipeline,
make_default_teleop_action_processor,
@@ -34,11 +38,12 @@ from lerobot.robots.so_follower.robot_kinematic_processor import (
ForwardKinematicsJointsToEE,
InverseKinematicsEEToJoints,
)
from lerobot.scripts.lerobot_record import record_loop
from lerobot.types import RobotAction, RobotObservation
from lerobot.utils.feature_utils import combine_feature_dicts
from lerobot.utils.constants import ACTION, OBS_STR
from lerobot.utils.feature_utils import build_dataset_frame, combine_feature_dicts
from lerobot.utils.robot_utils import precise_sleep
from lerobot.utils.utils import log_say
from lerobot.utils.visualization_utils import init_rerun
from lerobot.utils.visualization_utils import init_rerun, log_rerun_data
NUM_EPISODES = 5
FPS = 30
@@ -49,6 +54,9 @@ HF_DATASET_ID = "<hf_username>/<dataset_repo_id>"
def main():
# NOTE: For production policy deployment, use `lerobot-rollout` CLI instead.
# This script provides a self-contained example for educational purposes.
# Create the robot configuration & robot
camera_config = {"front": OpenCVCameraConfig(index_or_path=0, width=640, height=480, fps=FPS)}
robot_config = SO100FollowerConfig(
@@ -143,43 +151,67 @@ def main():
raise ValueError("Robot is not connected!")
print("Starting evaluate loop...")
control_interval = 1 / FPS
episode_idx = 0
for episode_idx in range(NUM_EPISODES):
log_say(f"Running inference, recording eval episode {episode_idx + 1} of {NUM_EPISODES}")
# Main record loop
record_loop(
robot=robot,
events=events,
fps=FPS,
policy=policy,
preprocessor=preprocessor, # Pass the pre and post policy processors
postprocessor=postprocessor,
dataset=dataset,
control_time_s=EPISODE_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=make_default_teleop_action_processor(),
robot_action_processor=robot_ee_to_joints_processor,
robot_observation_processor=robot_joints_to_ee_pose_processor,
)
# Inline evaluation loop: predict actions and send to robot
timestamp = 0
start_episode_t = time.perf_counter()
while timestamp < EPISODE_TIME_SEC:
start_loop_t = time.perf_counter()
if events["exit_early"]:
events["exit_early"] = False
break
# Get robot observation
obs = robot.get_observation()
obs_processed = robot_joints_to_ee_pose_processor(obs)
observation_frame = build_dataset_frame(dataset.features, obs_processed, prefix=OBS_STR)
# Predict action using the policy
action_tensor = predict_action(
observation=observation_frame,
policy=policy,
device=policy.config.device,
preprocessor=preprocessor,
postprocessor=postprocessor,
use_amp=policy.config.device.type == "cuda",
task=TASK_DESCRIPTION,
robot_type=robot.name,
)
# Convert policy output to robot action dict
action_values = make_robot_action(action_tensor, dataset.features)
# Process and send action to robot (EE -> joints via IK)
robot_action_to_send = robot_ee_to_joints_processor((action_values, obs))
robot.send_action(robot_action_to_send)
# Write to dataset
action_frame = build_dataset_frame(dataset.features, action_values, prefix=ACTION)
frame = {**observation_frame, **action_frame, "task": TASK_DESCRIPTION}
dataset.add_frame(frame)
log_rerun_data(observation=obs_processed, action=action_values)
dt_s = time.perf_counter() - start_loop_t
sleep_time_s = control_interval - dt_s
if sleep_time_s < 0:
logging.warning(
f"Evaluate loop is running slower ({1 / dt_s:.1f} Hz) than the target FPS ({FPS} Hz)."
)
precise_sleep(max(sleep_time_s, 0.0))
timestamp = time.perf_counter() - start_episode_t
# Reset the environment if not stopping or re-recording
if not events["stop_recording"] and (
(episode_idx < NUM_EPISODES - 1) or events["rerecord_episode"]
):
log_say("Reset the environment")
record_loop(
robot=robot,
events=events,
fps=FPS,
control_time_s=EPISODE_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=make_default_teleop_action_processor(),
robot_action_processor=robot_ee_to_joints_processor,
robot_observation_processor=robot_joints_to_ee_pose_processor,
)
log_say("Waiting for environment reset, press right arrow key when ready...")
if events["rerecord_episode"]:
log_say("Re-record episode")
+19 -91
View File
@@ -16,29 +16,14 @@
from lerobot.cameras.opencv import OpenCVCameraConfig
from lerobot.common.control_utils import init_keyboard_listener
from lerobot.datasets import LeRobotDataset, aggregate_pipeline_dataset_features, create_initial_features
from lerobot.model.kinematics import RobotKinematics
from lerobot.processor import (
RobotProcessorPipeline,
observation_to_transition,
robot_action_observation_to_transition,
transition_to_observation,
transition_to_robot_action,
)
from lerobot.datasets import LeRobotDataset
from lerobot.processor import make_default_processors
from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
from lerobot.robots.so_follower.robot_kinematic_processor import (
EEBoundsAndSafety,
EEReferenceAndDelta,
ForwardKinematicsJointsToEE,
GripperVelocityToJoint,
InverseKinematicsEEToJoints,
)
from lerobot.scripts.lerobot_record import record_loop
from lerobot.teleoperators.phone import Phone, PhoneConfig
from lerobot.teleoperators.phone.config_phone import PhoneOS
from lerobot.teleoperators.phone.phone_processor import MapPhoneActionToRobotAction
from lerobot.types import RobotAction, RobotObservation
from lerobot.utils.feature_utils import combine_feature_dicts
from lerobot.utils.constants import ACTION, OBS_STR
from lerobot.utils.feature_utils import hw_to_dataset_features
from lerobot.utils.utils import log_say
from lerobot.utils.visualization_utils import init_rerun
@@ -65,77 +50,16 @@ def main():
robot = SO100Follower(robot_config)
phone = Phone(teleop_config)
# NOTE: It is highly recommended to use the urdf in the SO-ARM100 repo: https://github.com/TheRobotStudio/SO-ARM100/blob/main/Simulation/SO101/so101_new_calib.urdf
kinematics_solver = RobotKinematics(
urdf_path="./SO101/so101_new_calib.urdf",
target_frame_name="gripper_frame_link",
joint_names=list(robot.bus.motors.keys()),
)
# Build pipeline to convert phone action to EE action
phone_to_robot_ee_pose_processor = RobotProcessorPipeline[
tuple[RobotAction, RobotObservation], RobotAction
](
steps=[
MapPhoneActionToRobotAction(platform=teleop_config.phone_os),
EEReferenceAndDelta(
kinematics=kinematics_solver,
end_effector_step_sizes={"x": 0.5, "y": 0.5, "z": 0.5},
motor_names=list(robot.bus.motors.keys()),
use_latched_reference=True,
),
EEBoundsAndSafety(
end_effector_bounds={"min": [-1.0, -1.0, -1.0], "max": [1.0, 1.0, 1.0]},
max_ee_step_m=0.20,
),
GripperVelocityToJoint(speed_factor=20.0),
],
to_transition=robot_action_observation_to_transition,
to_output=transition_to_robot_action,
)
# Build pipeline to convert EE action to joints action
robot_ee_to_joints_processor = RobotProcessorPipeline[tuple[RobotAction, RobotObservation], RobotAction](
steps=[
InverseKinematicsEEToJoints(
kinematics=kinematics_solver,
motor_names=list(robot.bus.motors.keys()),
initial_guess_current_joints=True,
),
],
to_transition=robot_action_observation_to_transition,
to_output=transition_to_robot_action,
)
# Build pipeline to convert joint observation to EE observation
robot_joints_to_ee_pose = RobotProcessorPipeline[RobotObservation, RobotObservation](
steps=[
ForwardKinematicsJointsToEE(
kinematics=kinematics_solver, motor_names=list(robot.bus.motors.keys())
)
],
to_transition=observation_to_transition,
to_output=transition_to_observation,
)
# Configure the dataset features
action_features = hw_to_dataset_features(robot.action_features, ACTION)
obs_features = hw_to_dataset_features(robot.observation_features, OBS_STR)
dataset_features = {**action_features, **obs_features}
# Create the dataset
dataset = LeRobotDataset.create(
repo_id=HF_REPO_ID,
fps=FPS,
features=combine_feature_dicts(
# Run the feature contract of the pipelines
# This tells you how the features would look like after the pipeline steps
aggregate_pipeline_dataset_features(
pipeline=phone_to_robot_ee_pose_processor,
initial_features=create_initial_features(action=phone.action_features),
use_videos=True,
),
aggregate_pipeline_dataset_features(
pipeline=robot_joints_to_ee_pose,
initial_features=create_initial_features(observation=robot.observation_features),
use_videos=True,
),
),
features=dataset_features,
robot_type=robot.name,
use_videos=True,
image_writer_threads=4,
@@ -153,6 +77,10 @@ def main():
if not robot.is_connected or not phone.is_connected:
raise ValueError("Robot or teleop is not connected!")
teleop_action_processor, robot_action_processor, robot_observation_processor = (
make_default_processors()
)
print("Starting record loop. Move your phone to teleoperate the robot...")
episode_idx = 0
while episode_idx < NUM_EPISODES and not events["stop_recording"]:
@@ -163,14 +91,14 @@ def main():
robot=robot,
events=events,
fps=FPS,
teleop_action_processor=teleop_action_processor,
robot_action_processor=robot_action_processor,
robot_observation_processor=robot_observation_processor,
teleop=phone,
dataset=dataset,
control_time_s=EPISODE_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=phone_to_robot_ee_pose_processor,
robot_action_processor=robot_ee_to_joints_processor,
robot_observation_processor=robot_joints_to_ee_pose,
)
# Reset the environment if not stopping or re-recording
@@ -182,13 +110,13 @@ def main():
robot=robot,
events=events,
fps=FPS,
teleop_action_processor=teleop_action_processor,
robot_action_processor=robot_action_processor,
robot_observation_processor=robot_observation_processor,
teleop=phone,
control_time_s=RESET_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=phone_to_robot_ee_pose_processor,
robot_action_processor=robot_ee_to_joints_processor,
robot_observation_processor=robot_joints_to_ee_pose,
)
if events["rerecord_episode"]:
-673
View File
@@ -1,673 +0,0 @@
#!/usr/bin/env python
# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
Demo script showing how to use Real-Time Chunking (RTC) with action chunking policies on real robots.
This script demonstrates:
1. Creating a robot and policy (SmolVLA, Pi0, etc.) with RTC
2. Consuming actions from the policy while the robot executes
3. Periodically requesting new action chunks in the background using threads
4. Managing action buffers and timing for real-time operation
For simulation environments, see eval_with_simulation.py
Usage:
# Run RTC with Real robot with RTC
uv run examples/rtc/eval_with_real_robot.py \
--policy.path=<USER>/smolvla_check_rtc_last3 \
--policy.device=mps \
--rtc.enabled=true \
--rtc.execution_horizon=20 \
--robot.type=so100_follower \
--robot.port=/dev/tty.usbmodem58FA0834591 \
--robot.id=so100_follower \
--robot.cameras="{ gripper: {type: opencv, index_or_path: 1, width: 640, height: 480, fps: 30}, front: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
--task="Move green small object into the purple platform" \
--duration=120
# Run RTC with Real robot without RTC
uv run examples/rtc/eval_with_real_robot.py \
--policy.path=<USER>/smolvla_check_rtc_last3 \
--policy.device=mps \
--rtc.enabled=false \
--robot.type=so100_follower \
--robot.port=/dev/tty.usbmodem58FA0834591 \
--robot.id=so100_follower \
--robot.cameras="{ gripper: {type: opencv, index_or_path: 1, width: 640, height: 480, fps: 30}, front: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
--task="Move green small object into the purple platform" \
--duration=120
# Run RTC with Real robot with pi0.5 policy
uv run examples/rtc/eval_with_real_robot.py \
--policy.path=<USER>/pi05_check_rtc \
--policy.device=mps \
--rtc.enabled=true \
--rtc.execution_horizon=20 \
--robot.type=so100_follower \
--robot.port=/dev/tty.usbmodem58FA0834591 \
--robot.id=so100_follower \
--robot.cameras="{ gripper: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}, front: {type: opencv, index_or_path: 1, width: 640, height: 480, fps: 30}}" \
--task="Move green small object into the purple platform" \
--duration=120
# Run RTC with bi_openarm_follower (dual-arm OpenArms) and pi0.5 policy
python examples/rtc/eval_with_real_robot.py \
--policy.path=lerobot-data-collection/folding_final \
--robot.type=bi_openarm_follower \
--robot.cameras='{left_wrist: {type: opencv, index_or_path: "/dev/video4", width: 1280, height: 720, fps: 30}, base: {type: opencv, index_or_path: "/dev/video2", width: 640, height: 480, fps: 30}, right_wrist: {type: opencv, index_or_path: "/dev/video0", width: 1280, height: 720, fps: 30}}' \
--robot.left_arm_config.port=can0 \
--robot.left_arm_config.side=left \
--robot.left_arm_config.can_interface=socketcan \
--robot.left_arm_config.disable_torque_on_disconnect=true \
--robot.left_arm_config.max_relative_target=8.0 \
--robot.right_arm_config.port=can1 \
--robot.right_arm_config.side=right \
--robot.right_arm_config.can_interface=socketcan \
--robot.right_arm_config.disable_torque_on_disconnect=true \
--robot.right_arm_config.max_relative_target=8.0 \
--task="Fold the T-shirt properly" \
--fps=30 \
--duration=2000 \
--interpolation_multiplier=3 \
--rtc.enabled=true \
--rtc.execution_horizon=20 \
--rtc.max_guidance_weight=5.0 \
--rtc.prefix_attention_schedule=LINEAR \
--device=cuda
"""
import logging
import math
import sys
import time
import traceback
from dataclasses import dataclass, field
from threading import Event, Lock, Thread
import torch
from torch import Tensor
from lerobot.cameras.opencv import OpenCVCameraConfig # noqa: F401
from lerobot.cameras.realsense import RealSenseCameraConfig # noqa: F401
from lerobot.cameras.zmq import ZMQCameraConfig # noqa: F401
from lerobot.configs import PreTrainedConfig, RTCAttentionSchedule, parser
from lerobot.policies import get_policy_class, make_pre_post_processors
from lerobot.policies.rtc import ActionInterpolator, ActionQueue, LatencyTracker, RTCConfig
from lerobot.processor import (
NormalizerProcessorStep,
RelativeActionsProcessorStep,
TransitionKey,
create_transition,
make_default_robot_action_processor,
make_default_robot_observation_processor,
to_relative_actions,
)
from lerobot.rl.process import ProcessSignalHandler
from lerobot.robots import ( # noqa: F401
Robot,
RobotConfig,
bi_openarm_follower,
bi_so_follower,
koch_follower,
so_follower,
unitree_g1,
)
from lerobot.robots.utils import make_robot_from_config
from lerobot.utils.constants import OBS_IMAGES, OBS_STATE
from lerobot.utils.feature_utils import build_dataset_frame, hw_to_dataset_features
from lerobot.utils.hub import HubMixin
from lerobot.utils.utils import init_logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
class RobotWrapper:
def __init__(self, robot: Robot):
self.robot = robot
self.lock = Lock()
def get_observation(self) -> dict[str, Tensor]:
with self.lock:
return self.robot.get_observation()
def send_action(self, action: Tensor):
with self.lock:
self.robot.send_action(action)
def observation_features(self) -> list[str]:
with self.lock:
return self.robot.observation_features
def action_features(self) -> list[str]:
with self.lock:
return self.robot.action_features
@dataclass
class RTCDemoConfig(HubMixin):
"""Configuration for RTC demo with action chunking policies and real robots."""
# Policy configuration
policy: PreTrainedConfig | None = None
# Robot configuration
robot: RobotConfig | None = None
# RTC configuration
rtc: RTCConfig = field(
default_factory=lambda: RTCConfig(
execution_horizon=10,
max_guidance_weight=1.0,
prefix_attention_schedule=RTCAttentionSchedule.EXP,
)
)
# Demo parameters
duration: float = 30.0 # Duration to run the demo (seconds)
fps: float = 10.0 # Action execution frequency (Hz)
interpolation_multiplier: int = 1 # Control rate multiplier (1=off, 2=2x, 3=3x)
# Compute device
device: str | None = None # Device to run on (cuda, cpu, auto)
# Get new actions horizon. The amount of executed steps after which will be requested new actions.
# It should be higher than inference delay + execution horizon.
action_queue_size_to_get_new_actions: int = 30
# Task to execute
task: str = field(default="", metadata={"help": "Task to execute"})
# Torch compile configuration
use_torch_compile: bool = field(
default=False,
metadata={"help": "Use torch.compile for faster inference (PyTorch 2.0+)"},
)
torch_compile_backend: str = field(
default="inductor",
metadata={"help": "Backend for torch.compile (inductor, aot_eager, cudagraphs)"},
)
torch_compile_mode: str = field(
default="default",
metadata={"help": "Compilation mode (default, reduce-overhead, max-autotune)"},
)
torch_compile_disable_cudagraphs: bool = field(
default=True,
metadata={
"help": "Disable CUDA graphs in torch.compile. Required due to in-place tensor "
"operations in denoising loop (x_t += dt * v_t) which cause tensor aliasing issues."
},
)
def __post_init__(self):
# HACK: We parse again the cli args here to get the pretrained path if there was one.
policy_path = parser.get_path_arg("policy")
if policy_path:
cli_overrides = parser.get_cli_overrides("policy")
self.policy = PreTrainedConfig.from_pretrained(policy_path, cli_overrides=cli_overrides)
self.policy.pretrained_path = policy_path
else:
raise ValueError("Policy path is required")
# Validate that robot configuration is provided
if self.robot is None:
raise ValueError("Robot configuration must be provided")
@classmethod
def __get_path_fields__(cls) -> list[str]:
"""This enables the parser to load config from the policy using `--policy.path=local/dir`"""
return ["policy"]
def is_image_key(k: str) -> bool:
return k.startswith(OBS_IMAGES)
def _reanchor_relative_rtc_prefix(
prev_actions_absolute: Tensor,
current_state: Tensor,
relative_step: RelativeActionsProcessorStep,
normalizer_step: NormalizerProcessorStep | None,
policy_device: torch.device | str,
) -> Tensor:
"""Convert absolute leftovers into model-space for relative-action RTC policies.
When a policy uses relative actions, the RTC prefix (leftover actions from
the previous chunk) is stored in absolute space. Before feeding it back to
the policy we need to re-express it relative to the *current* robot state
and then re-normalize.
"""
state = current_state.detach().cpu()
if state.dim() == 1:
state = state.unsqueeze(0)
action_cpu = prev_actions_absolute.detach().cpu()
mask = relative_step._build_mask(action_cpu.shape[-1])
relative_actions = to_relative_actions(action_cpu, state, mask)
transition = create_transition(action=relative_actions)
if normalizer_step is not None:
transition = normalizer_step(transition)
return transition[TransitionKey.ACTION].to(policy_device)
def get_actions(
policy,
robot: RobotWrapper,
robot_observation_processor,
action_queue: ActionQueue,
shutdown_event: Event,
cfg: RTCDemoConfig,
):
"""Thread function to request action chunks from the policy.
Args:
policy: The policy instance (SmolVLA, Pi0, etc.)
robot: The robot instance for getting observations
robot_observation_processor: Processor for raw robot observations
action_queue: Queue to put new action chunks
shutdown_event: Event to signal shutdown
cfg: Demo configuration
"""
try:
logger.info("[GET_ACTIONS] Starting get actions thread")
latency_tracker = LatencyTracker() # Track latency of action chunks
fps = cfg.fps
time_per_chunk = 1.0 / fps
# Only keep .pos joints + camera streams if the policy was trained on positions,
# not the full pos/vel/torque state the robot exposes.
observation_features_hw = {
key: value
for key, value in robot.observation_features().items()
if key.endswith(".pos") or isinstance(value, tuple)
}
dataset_features = hw_to_dataset_features(observation_features_hw, "observation")
policy_device = policy.config.device
# Load preprocessor and postprocessor from pretrained files
# The stats are embedded in the processor .safetensors files
logger.info(f"[GET_ACTIONS] Loading preprocessor/postprocessor from {cfg.policy.pretrained_path}")
preprocessor, postprocessor = make_pre_post_processors(
policy_cfg=cfg.policy,
pretrained_path=cfg.policy.pretrained_path,
dataset_stats=None, # Will load from pretrained processor files
preprocessor_overrides={
"device_processor": {"device": cfg.policy.device},
},
)
logger.info("[GET_ACTIONS] Preprocessor/postprocessor loaded successfully with embedded stats")
relative_step = next(
(s for s in preprocessor.steps if isinstance(s, RelativeActionsProcessorStep) and s.enabled),
None,
)
normalizer_step = next(
(s for s in preprocessor.steps if isinstance(s, NormalizerProcessorStep)),
None,
)
if relative_step is not None:
if relative_step.action_names is None:
cfg_names = getattr(cfg.policy, "action_feature_names", None)
if cfg_names:
relative_step.action_names = list(cfg_names)
else:
relative_step.action_names = [
k for k in robot.robot.action_features if k.endswith(".pos")
]
logger.info("[GET_ACTIONS] Relative actions enabled: will re-anchor RTC prefix")
get_actions_threshold = cfg.action_queue_size_to_get_new_actions
if not cfg.rtc.enabled:
get_actions_threshold = 0
while not shutdown_event.is_set():
if action_queue.qsize() <= get_actions_threshold:
current_time = time.perf_counter()
action_index_before_inference = action_queue.get_action_index()
prev_actions = action_queue.get_left_over()
inference_latency = latency_tracker.max()
inference_delay = math.ceil(inference_latency / time_per_chunk)
obs = robot.get_observation()
# Apply robot observation processor
obs_processed = robot_observation_processor(obs)
obs_with_policy_features = build_dataset_frame(
dataset_features, obs_processed, prefix="observation"
)
for name in obs_with_policy_features:
obs_with_policy_features[name] = torch.from_numpy(obs_with_policy_features[name])
if "image" in name:
obs_with_policy_features[name] = (
obs_with_policy_features[name].type(torch.float32) / 255
)
obs_with_policy_features[name] = (
obs_with_policy_features[name].permute(2, 0, 1).contiguous()
)
obs_with_policy_features[name] = obs_with_policy_features[name].unsqueeze(0)
obs_with_policy_features[name] = obs_with_policy_features[name].to(policy_device)
obs_with_policy_features["task"] = [cfg.task] # Task should be a list, not a string!
obs_with_policy_features["robot_type"] = (
robot.robot.name if hasattr(robot.robot, "name") else ""
)
preproceseded_obs = preprocessor(obs_with_policy_features)
# Re-anchor leftover actions for relative-action policies.
# We need the *postprocessed* (absolute) leftover, not the original
# (normalized/relative) one that get_left_over() returns.
if (
prev_actions is not None
and relative_step is not None
and OBS_STATE in obs_with_policy_features
):
with action_queue.lock:
if action_queue.queue is not None:
prev_actions_abs = action_queue.queue[action_queue.last_index :].clone()
else:
prev_actions_abs = None
if prev_actions_abs is not None and prev_actions_abs.numel() > 0:
prev_actions = _reanchor_relative_rtc_prefix(
prev_actions_absolute=prev_actions_abs,
current_state=obs_with_policy_features[OBS_STATE],
relative_step=relative_step,
normalizer_step=normalizer_step,
policy_device=policy_device,
)
# Generate actions WITH RTC
actions = policy.predict_action_chunk(
preproceseded_obs,
inference_delay=inference_delay,
prev_chunk_left_over=prev_actions,
)
# Store original actions (before postprocessing) for RTC
original_actions = actions.squeeze(0).clone()
postprocessed_actions = postprocessor(actions)
postprocessed_actions = postprocessed_actions.squeeze(0)
new_latency = time.perf_counter() - current_time
new_delay = math.ceil(new_latency / time_per_chunk)
latency_tracker.add(new_latency)
if cfg.action_queue_size_to_get_new_actions < cfg.rtc.execution_horizon + new_delay:
logger.warning(
"[GET_ACTIONS] cfg.action_queue_size_to_get_new_actions Too small, It should be higher than inference delay + execution horizon."
)
action_queue.merge(
original_actions, postprocessed_actions, new_delay, action_index_before_inference
)
else:
# Small sleep to prevent busy waiting
time.sleep(0.1)
logger.info("[GET_ACTIONS] get actions thread shutting down")
except Exception as e:
logger.error(f"[GET_ACTIONS] Fatal exception in get_actions thread: {e}")
logger.error(traceback.format_exc())
sys.exit(1)
def actor_control(
robot: RobotWrapper,
robot_action_processor,
action_queue: ActionQueue,
shutdown_event: Event,
cfg: RTCDemoConfig,
):
"""Thread function to execute actions on the robot.
Args:
robot: The robot instance
action_queue: Queue to get actions from
shutdown_event: Event to signal shutdown
cfg: Demo configuration
"""
try:
logger.info("[ACTOR] Starting actor thread")
action_keys = [k for k in robot.action_features() if k.endswith(".pos")]
action_count = 0
interpolator = ActionInterpolator(multiplier=cfg.interpolation_multiplier)
action_interval = interpolator.get_control_interval(cfg.fps)
while not shutdown_event.is_set():
start_time = time.perf_counter()
if interpolator.needs_new_action():
new_action = action_queue.get()
if new_action is not None:
interpolator.add(new_action.cpu())
action = interpolator.get()
if action is not None:
action = action.cpu()
action_dict = {key: action[i].item() for i, key in enumerate(action_keys)}
action_processed = robot_action_processor((action_dict, None))
robot.send_action(action_processed)
action_count += 1
dt_s = time.perf_counter() - start_time
time.sleep(max(0, (action_interval - dt_s) - 0.001))
logger.info(f"[ACTOR] Actor thread shutting down. Total actions executed: {action_count}")
except Exception as e:
logger.error(f"[ACTOR] Fatal exception in actor_control thread: {e}")
logger.error(traceback.format_exc())
sys.exit(1)
def _apply_torch_compile(policy, cfg: RTCDemoConfig):
"""Apply torch.compile to the policy's predict_action_chunk method.
Args:
policy: Policy instance to compile
cfg: Configuration containing torch compile settings
Returns:
Policy with compiled predict_action_chunk method
"""
# PI models handle their own compilation
if policy.type == "pi05" or policy.type == "pi0":
return policy
try:
# Check if torch.compile is available (PyTorch 2.0+)
if not hasattr(torch, "compile"):
logger.warning(
f"torch.compile is not available. Requires PyTorch 2.0+. "
f"Current version: {torch.__version__}. Skipping compilation."
)
return policy
logger.info("Applying torch.compile to predict_action_chunk...")
logger.info(f" Backend: {cfg.torch_compile_backend}")
logger.info(f" Mode: {cfg.torch_compile_mode}")
logger.info(f" Disable CUDA graphs: {cfg.torch_compile_disable_cudagraphs}")
# Compile the predict_action_chunk method
# - CUDA graphs disabled to prevent tensor aliasing from in-place ops (x_t += dt * v_t)
compile_kwargs = {
"backend": cfg.torch_compile_backend,
"mode": cfg.torch_compile_mode,
}
# Disable CUDA graphs if requested (prevents tensor aliasing issues)
if cfg.torch_compile_disable_cudagraphs:
compile_kwargs["options"] = {"triton.cudagraphs": False}
original_method = policy.predict_action_chunk
compiled_method = torch.compile(original_method, **compile_kwargs)
policy.predict_action_chunk = compiled_method
logger.info("✓ Successfully compiled predict_action_chunk")
except Exception as e:
logger.error(f"Failed to apply torch.compile: {e}")
logger.warning("Continuing without torch.compile")
return policy
@parser.wrap()
def demo_cli(cfg: RTCDemoConfig):
"""Main entry point for RTC demo with draccus configuration."""
# Initialize logging
init_logging()
logger.info(f"Using device: {cfg.device}")
# Setup signal handler for graceful shutdown
signal_handler = ProcessSignalHandler(use_threads=True, display_pid=False)
shutdown_event = signal_handler.shutdown_event
policy = None
robot = None
get_actions_thread = None
actor_thread = None
policy_class = get_policy_class(cfg.policy.type)
# Load config and set compile_model for pi0/pi05 models
config = PreTrainedConfig.from_pretrained(cfg.policy.pretrained_path)
if cfg.policy.type == "pi05" or cfg.policy.type == "pi0":
config.compile_model = cfg.use_torch_compile
if config.use_peft:
from peft import PeftConfig, PeftModel
peft_pretrained_path = cfg.policy.pretrained_path
peft_config = PeftConfig.from_pretrained(peft_pretrained_path)
policy = policy_class.from_pretrained(
pretrained_name_or_path=peft_config.base_model_name_or_path, config=config
)
policy = PeftModel.from_pretrained(policy, peft_pretrained_path, config=peft_config)
else:
policy = policy_class.from_pretrained(cfg.policy.pretrained_path, config=config)
# Turn on RTC
policy.config.rtc_config = cfg.rtc
# Init RTC processort, as by default if RTC disabled in the config
# The processor won't be created
policy.init_rtc_processor()
assert policy.name in ["smolvla", "pi05", "pi0"], "Only smolvla, pi05, and pi0 are supported for RTC"
policy = policy.to(cfg.device)
policy.eval()
# Apply torch.compile to predict_action_chunk method if enabled
if cfg.use_torch_compile:
policy = _apply_torch_compile(policy, cfg)
# Create robot
logger.info(f"Initializing robot: {cfg.robot.type}")
robot = make_robot_from_config(cfg.robot)
robot.connect()
robot_wrapper = RobotWrapper(robot)
# Create robot observation processor
robot_observation_processor = make_default_robot_observation_processor()
robot_action_processor = make_default_robot_action_processor()
# Create action queue for communication between threads
action_queue = ActionQueue(cfg.rtc)
# Start chunk requester thread
get_actions_thread = Thread(
target=get_actions,
args=(policy, robot_wrapper, robot_observation_processor, action_queue, shutdown_event, cfg),
daemon=True,
name="GetActions",
)
get_actions_thread.start()
logger.info("Started get actions thread")
# Start action executor thread
actor_thread = Thread(
target=actor_control,
args=(robot_wrapper, robot_action_processor, action_queue, shutdown_event, cfg),
daemon=True,
name="Actor",
)
actor_thread.start()
logger.info("Started actor thread")
logger.info("Started stop by duration thread")
# Main thread monitors for duration or shutdown
logger.info(f"Running demo for {cfg.duration} seconds...")
start_time = time.time()
while not shutdown_event.is_set() and (time.time() - start_time) < cfg.duration:
time.sleep(10)
# Log queue status periodically
if int(time.time() - start_time) % 5 == 0:
logger.info(f"[MAIN] Action queue size: {action_queue.qsize()}")
if time.time() - start_time > cfg.duration:
break
logger.info("Demo duration reached or shutdown requested")
# Signal shutdown
shutdown_event.set()
# Wait for threads to finish
if get_actions_thread and get_actions_thread.is_alive():
logger.info("Waiting for chunk requester thread to finish...")
get_actions_thread.join()
if actor_thread and actor_thread.is_alive():
logger.info("Waiting for action executor thread to finish...")
actor_thread.join()
# Cleanup robot
if robot:
robot.disconnect()
logger.info("Robot disconnected")
logger.info("Cleanup completed")
if __name__ == "__main__":
demo_cli()
logging.info("RTC demo finished")
+63 -31
View File
@@ -14,13 +14,17 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import logging
import time
from lerobot.cameras.opencv import OpenCVCameraConfig
from lerobot.common.control_utils import init_keyboard_listener
from lerobot.common.control_utils import init_keyboard_listener, predict_action
from lerobot.configs import FeatureType, PolicyFeature
from lerobot.datasets import LeRobotDataset, aggregate_pipeline_dataset_features, create_initial_features
from lerobot.model.kinematics import RobotKinematics
from lerobot.policies import make_pre_post_processors
from lerobot.policies.act import ACTPolicy
from lerobot.policies.utils import make_robot_action
from lerobot.processor import (
RobotProcessorPipeline,
make_default_teleop_action_processor,
@@ -34,11 +38,12 @@ from lerobot.robots.so_follower.robot_kinematic_processor import (
ForwardKinematicsJointsToEE,
InverseKinematicsEEToJoints,
)
from lerobot.scripts.lerobot_record import record_loop
from lerobot.types import RobotAction, RobotObservation
from lerobot.utils.feature_utils import combine_feature_dicts
from lerobot.utils.constants import ACTION, OBS_STR
from lerobot.utils.feature_utils import build_dataset_frame, combine_feature_dicts
from lerobot.utils.robot_utils import precise_sleep
from lerobot.utils.utils import log_say
from lerobot.utils.visualization_utils import init_rerun
from lerobot.utils.visualization_utils import init_rerun, log_rerun_data
NUM_EPISODES = 5
FPS = 30
@@ -49,6 +54,9 @@ HF_DATASET_ID = "<hf_username>/<dataset_repo_id>"
def main():
# NOTE: For production policy deployment, use `lerobot-rollout` CLI instead.
# This script provides a self-contained example for educational purposes.
# Create the robot configuration & robot
camera_config = {"front": OpenCVCameraConfig(index_or_path=0, width=640, height=480, fps=FPS)}
robot_config = SO100FollowerConfig(
@@ -143,43 +151,67 @@ def main():
raise ValueError("Robot is not connected!")
print("Starting evaluate loop...")
control_interval = 1 / FPS
episode_idx = 0
for episode_idx in range(NUM_EPISODES):
log_say(f"Running inference, recording eval episode {episode_idx + 1} of {NUM_EPISODES}")
# Main record loop
record_loop(
robot=robot,
events=events,
fps=FPS,
policy=policy,
preprocessor=preprocessor, # Pass the pre and post policy processors
postprocessor=postprocessor,
dataset=dataset,
control_time_s=EPISODE_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=make_default_teleop_action_processor(),
robot_action_processor=robot_ee_to_joints_processor,
robot_observation_processor=robot_joints_to_ee_pose_processor,
)
# Inline evaluation loop: predict actions and send to robot
timestamp = 0
start_episode_t = time.perf_counter()
while timestamp < EPISODE_TIME_SEC:
start_loop_t = time.perf_counter()
if events["exit_early"]:
events["exit_early"] = False
break
# Get robot observation
obs = robot.get_observation()
obs_processed = robot_joints_to_ee_pose_processor(obs)
observation_frame = build_dataset_frame(dataset.features, obs_processed, prefix=OBS_STR)
# Predict action using the policy
action_tensor = predict_action(
observation=observation_frame,
policy=policy,
device=policy.config.device,
preprocessor=preprocessor,
postprocessor=postprocessor,
use_amp=policy.config.device.type == "cuda",
task=TASK_DESCRIPTION,
robot_type=robot.name,
)
# Convert policy output to robot action dict
action_values = make_robot_action(action_tensor, dataset.features)
# Process and send action to robot (EE -> joints via IK)
robot_action_to_send = robot_ee_to_joints_processor((action_values, obs))
robot.send_action(robot_action_to_send)
# Write to dataset
action_frame = build_dataset_frame(dataset.features, action_values, prefix=ACTION)
frame = {**observation_frame, **action_frame, "task": TASK_DESCRIPTION}
dataset.add_frame(frame)
log_rerun_data(observation=obs_processed, action=action_values)
dt_s = time.perf_counter() - start_loop_t
sleep_time_s = control_interval - dt_s
if sleep_time_s < 0:
logging.warning(
f"Evaluate loop is running slower ({1 / dt_s:.1f} Hz) than the target FPS ({FPS} Hz)."
)
precise_sleep(max(sleep_time_s, 0.0))
timestamp = time.perf_counter() - start_episode_t
# Reset the environment if not stopping or re-recording
if not events["stop_recording"] and (
(episode_idx < NUM_EPISODES - 1) or events["rerecord_episode"]
):
log_say("Reset the environment")
record_loop(
robot=robot,
events=events,
fps=FPS,
control_time_s=EPISODE_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=make_default_teleop_action_processor(),
robot_action_processor=robot_ee_to_joints_processor,
robot_observation_processor=robot_joints_to_ee_pose_processor,
)
log_say("Waiting for environment reset, press right arrow key when ready...")
if events["rerecord_episode"]:
log_say("Re-record episode")
+19 -88
View File
@@ -17,25 +17,13 @@
from lerobot.cameras.opencv import OpenCVCameraConfig
from lerobot.common.control_utils import init_keyboard_listener
from lerobot.datasets import LeRobotDataset, aggregate_pipeline_dataset_features, create_initial_features
from lerobot.model.kinematics import RobotKinematics
from lerobot.processor import (
RobotProcessorPipeline,
observation_to_transition,
robot_action_observation_to_transition,
transition_to_observation,
transition_to_robot_action,
)
from lerobot.datasets import LeRobotDataset
from lerobot.processor import make_default_processors
from lerobot.robots.so_follower import SO100Follower, SO100FollowerConfig
from lerobot.robots.so_follower.robot_kinematic_processor import (
EEBoundsAndSafety,
ForwardKinematicsJointsToEE,
InverseKinematicsEEToJoints,
)
from lerobot.scripts.lerobot_record import record_loop
from lerobot.teleoperators.so_leader import SO100Leader, SO100LeaderConfig
from lerobot.types import RobotAction, RobotObservation
from lerobot.utils.feature_utils import combine_feature_dicts
from lerobot.utils.constants import ACTION, OBS_STR
from lerobot.utils.feature_utils import hw_to_dataset_features
from lerobot.utils.utils import log_say
from lerobot.utils.visualization_utils import init_rerun
@@ -62,77 +50,16 @@ def main():
follower = SO100Follower(follower_config)
leader = SO100Leader(leader_config)
# NOTE: It is highly recommended to use the urdf in the SO-ARM100 repo: https://github.com/TheRobotStudio/SO-ARM100/blob/main/Simulation/SO101/so101_new_calib.urdf
follower_kinematics_solver = RobotKinematics(
urdf_path="./SO101/so101_new_calib.urdf",
target_frame_name="gripper_frame_link",
joint_names=list(follower.bus.motors.keys()),
)
# NOTE: It is highly recommended to use the urdf in the SO-ARM100 repo: https://github.com/TheRobotStudio/SO-ARM100/blob/main/Simulation/SO101/so101_new_calib.urdf
leader_kinematics_solver = RobotKinematics(
urdf_path="./SO101/so101_new_calib.urdf",
target_frame_name="gripper_frame_link",
joint_names=list(leader.bus.motors.keys()),
)
# Build pipeline to convert follower joints to EE observation
follower_joints_to_ee = RobotProcessorPipeline[RobotObservation, RobotObservation](
steps=[
ForwardKinematicsJointsToEE(
kinematics=follower_kinematics_solver, motor_names=list(follower.bus.motors.keys())
),
],
to_transition=observation_to_transition,
to_output=transition_to_observation,
)
# Build pipeline to convert leader joints to EE action
leader_joints_to_ee = RobotProcessorPipeline[tuple[RobotAction, RobotObservation], RobotAction](
steps=[
ForwardKinematicsJointsToEE(
kinematics=leader_kinematics_solver, motor_names=list(leader.bus.motors.keys())
),
],
to_transition=robot_action_observation_to_transition,
to_output=transition_to_robot_action,
)
# Build pipeline to convert EE action to follower joints
ee_to_follower_joints = RobotProcessorPipeline[tuple[RobotAction, RobotObservation], RobotAction](
[
EEBoundsAndSafety(
end_effector_bounds={"min": [-1.0, -1.0, -1.0], "max": [1.0, 1.0, 1.0]},
max_ee_step_m=0.10,
),
InverseKinematicsEEToJoints(
kinematics=follower_kinematics_solver,
motor_names=list(follower.bus.motors.keys()),
initial_guess_current_joints=True,
),
],
to_transition=robot_action_observation_to_transition,
to_output=transition_to_robot_action,
)
# Configure the dataset features
action_features = hw_to_dataset_features(follower.action_features, ACTION)
obs_features = hw_to_dataset_features(follower.observation_features, OBS_STR)
dataset_features = {**action_features, **obs_features}
# Create the dataset
dataset = LeRobotDataset.create(
repo_id=HF_REPO_ID,
fps=FPS,
features=combine_feature_dicts(
# Run the feature contract of the pipelines
# This tells you how the features would look like after the pipeline steps
aggregate_pipeline_dataset_features(
pipeline=leader_joints_to_ee,
initial_features=create_initial_features(action=leader.action_features),
use_videos=True,
),
aggregate_pipeline_dataset_features(
pipeline=follower_joints_to_ee,
initial_features=create_initial_features(observation=follower.observation_features),
use_videos=True,
),
),
features=dataset_features,
robot_type=follower.name,
use_videos=True,
image_writer_threads=4,
@@ -150,6 +77,10 @@ def main():
if not leader.is_connected or not follower.is_connected:
raise ValueError("Robot or teleop is not connected!")
teleop_action_processor, robot_action_processor, robot_observation_processor = (
make_default_processors()
)
print("Starting record loop...")
episode_idx = 0
while episode_idx < NUM_EPISODES and not events["stop_recording"]:
@@ -160,14 +91,14 @@ def main():
robot=follower,
events=events,
fps=FPS,
teleop_action_processor=teleop_action_processor,
robot_action_processor=robot_action_processor,
robot_observation_processor=robot_observation_processor,
teleop=leader,
dataset=dataset,
control_time_s=EPISODE_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=leader_joints_to_ee,
robot_action_processor=ee_to_follower_joints,
robot_observation_processor=follower_joints_to_ee,
)
# Reset the environment if not stopping or re-recording
@@ -179,13 +110,13 @@ def main():
robot=follower,
events=events,
fps=FPS,
teleop_action_processor=teleop_action_processor,
robot_action_processor=robot_action_processor,
robot_observation_processor=robot_observation_processor,
teleop=leader,
control_time_s=RESET_TIME_SEC,
single_task=TASK_DESCRIPTION,
display_data=True,
teleop_action_processor=leader_joints_to_ee,
robot_action_processor=ee_to_follower_joints,
robot_observation_processor=follower_joints_to_ee,
)
if events["rerecord_episode"]:
+1
View File
@@ -269,6 +269,7 @@ lerobot-find-joint-limits="lerobot.scripts.lerobot_find_joint_limits:main"
lerobot-imgtransform-viz="lerobot.scripts.lerobot_imgtransform_viz:main"
lerobot-edit-dataset="lerobot.scripts.lerobot_edit_dataset:main"
lerobot-setup-can="lerobot.scripts.lerobot_setup_can:main"
lerobot-rollout="lerobot.scripts.lerobot_rollout:main"
# ---------------- Tool Configurations ----------------
[tool.setuptools.package-data]
+47
View File
@@ -0,0 +1,47 @@
# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Policy deployment engine with pluggable rollout strategies."""
from .configs import (
BaseStrategyConfig,
DAggerStrategyConfig,
HighlightStrategyConfig,
RolloutConfig,
RolloutDatasetConfig,
RolloutStrategyConfig,
SentryStrategyConfig,
)
from .context import RolloutContext, build_rollout_context
from .inference import InferenceEngine
from .ring_buffer import RolloutRingBuffer
from .robot_wrapper import ThreadSafeRobot
from .strategies import RolloutStrategy, create_strategy
__all__ = [
"BaseStrategyConfig",
"DAggerStrategyConfig",
"HighlightStrategyConfig",
"InferenceEngine",
"RolloutConfig",
"RolloutContext",
"RolloutDatasetConfig",
"RolloutRingBuffer",
"RolloutStrategy",
"RolloutStrategyConfig",
"SentryStrategyConfig",
"ThreadSafeRobot",
"build_rollout_context",
"create_strategy",
]
+213
View File
@@ -0,0 +1,213 @@
# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Configuration dataclasses for the rollout deployment engine."""
from __future__ import annotations
import abc
import logging
from dataclasses import dataclass, field
from pathlib import Path
import draccus
from lerobot.configs import PreTrainedConfig, parser
from lerobot.policies.rtc import RTCConfig
from lerobot.robots.config import RobotConfig
from lerobot.teleoperators.config import TeleoperatorConfig
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Strategy configs (polymorphic dispatch via draccus ChoiceRegistry)
# ---------------------------------------------------------------------------
@dataclass
class RolloutStrategyConfig(draccus.ChoiceRegistry, abc.ABC):
"""Abstract base for rollout strategy configurations.
Use ``--strategy.type=<name>`` on the CLI to select a strategy.
"""
@property
def type(self) -> str:
return self.get_choice_name(self.__class__)
@RolloutStrategyConfig.register_subclass("base")
@dataclass
class BaseStrategyConfig(RolloutStrategyConfig):
"""Autonomous rollout with no data recording."""
pass
@RolloutStrategyConfig.register_subclass("sentry")
@dataclass
class SentryStrategyConfig(RolloutStrategyConfig):
"""Continuous autonomous rollout with always-on recording.
Episodes are auto-rotated every ``episode_duration_s`` seconds and
uploaded in the background every ``upload_every_n_episodes`` episodes.
"""
episode_duration_s: float = 120.0
upload_every_n_episodes: int = 5
@RolloutStrategyConfig.register_subclass("highlight")
@dataclass
class HighlightStrategyConfig(RolloutStrategyConfig):
"""Autonomous rollout with on-demand recording via ring buffer.
A memory-bounded ring buffer continuously captures telemetry. When
the user presses the save key, the buffer contents are flushed to
the dataset and live recording continues until the key is pressed
again.
"""
ring_buffer_seconds: float = 30.0
ring_buffer_max_memory_mb: float = 2048.0
save_key: str = "s"
@RolloutStrategyConfig.register_subclass("dagger")
@dataclass
class DAggerStrategyConfig(RolloutStrategyConfig):
"""Human-in-the-loop data collection (DAgger / RaC).
Alternates between autonomous policy execution and human intervention.
Intervention frames are tagged with ``intervention=True``.
"""
episode_time_s: float = 120.0
num_episodes: int = 50
play_sounds: bool = True
calibrate: bool = False
log_hz: bool = True
hz_log_interval_s: float = 2.0
# ---------------------------------------------------------------------------
# Dataset recording config (shared across recording strategies)
# ---------------------------------------------------------------------------
@dataclass
class RolloutDatasetConfig:
"""Dataset configuration for rollout strategies that record data."""
repo_id: str = ""
single_task: str = ""
root: str | Path | None = None
fps: int = 30
video: bool = True
push_to_hub: bool = True
private: bool = False
tags: list[str] | None = None
num_image_writer_processes: int = 0
num_image_writer_threads_per_camera: int = 4
video_encoding_batch_size: int = 1
vcodec: str = "auto"
streaming_encoding: bool = True
encoder_queue_maxsize: int = 30
encoder_threads: int | None = None
rename_map: dict[str, str] = field(default_factory=dict)
# ---------------------------------------------------------------------------
# Top-level rollout config
# ---------------------------------------------------------------------------
@dataclass
class RolloutConfig:
"""Top-level configuration for the ``lerobot-rollout`` CLI.
Combines hardware, policy, strategy, and runtime settings. The
``__post_init__`` method performs fail-fast validation to reject
invalid flag combinations early.
"""
# Hardware
robot: RobotConfig | None = None
teleop: TeleoperatorConfig | None = None
# Policy (loaded from --policy.path via __post_init__)
policy: PreTrainedConfig | None = None
# Strategy (polymorphic: --strategy.type=base|sentry|highlight|dagger)
strategy: RolloutStrategyConfig = field(default_factory=BaseStrategyConfig)
# Inference backend
rtc: RTCConfig = field(default_factory=RTCConfig)
# Dataset (required for sentry, highlight, dagger; None for base)
dataset: RolloutDatasetConfig | None = None
# Runtime
fps: float = 30.0
duration: float = 0.0 # 0 = infinite (24/7 mode)
interpolation_multiplier: int = 1
device: str | None = None
task: str = ""
display_data: bool = False
resume: bool = False
# Torch compile
use_torch_compile: bool = False
torch_compile_backend: str = "inductor"
torch_compile_mode: str = "default"
compile_warmup_inferences: int = 2
def __post_init__(self):
# --- Policy loading (same pattern as existing scripts) ---
policy_path = parser.get_path_arg("policy")
if policy_path:
cli_overrides = parser.get_cli_overrides("policy")
self.policy = PreTrainedConfig.from_pretrained(policy_path, cli_overrides=cli_overrides)
self.policy.pretrained_path = policy_path
if self.policy is None:
raise ValueError("--policy.path is required for rollout")
if self.robot is None:
raise ValueError("--robot.type is required for rollout")
# --- Strategy-specific validation ---
if isinstance(self.strategy, DAggerStrategyConfig) and self.teleop is None:
raise ValueError("DAgger strategy requires --teleop.type to be set")
needs_dataset = isinstance(
self.strategy, (SentryStrategyConfig, HighlightStrategyConfig, DAggerStrategyConfig)
)
if needs_dataset and (self.dataset is None or not self.dataset.repo_id):
raise ValueError(f"{self.strategy.type} strategy requires --dataset.repo_id to be set")
if isinstance(self.strategy, BaseStrategyConfig) and self.dataset is not None:
raise ValueError(
"Base strategy does not record data. Use sentry, highlight, or dagger for recording."
)
# Sentry MUST use streaming encoding to avoid disk I/O blocking the control loop
if isinstance(self.strategy, SentryStrategyConfig) and self.dataset is not None:
if not self.dataset.streaming_encoding:
logger.warning("Sentry mode forces streaming_encoding=True")
self.dataset.streaming_encoding = True
@classmethod
def __get_path_fields__(cls) -> list[str]:
return ["policy"]
+234
View File
@@ -0,0 +1,234 @@
# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Rollout context: shared state created once before strategy dispatch."""
from __future__ import annotations
import logging
from dataclasses import dataclass, field
from threading import Event
import torch
from lerobot.configs import PreTrainedConfig
from lerobot.datasets import (
LeRobotDataset,
aggregate_pipeline_dataset_features,
create_initial_features,
)
from lerobot.policies import get_policy_class, make_pre_post_processors
from lerobot.policies.pretrained import PreTrainedPolicy
from lerobot.processor import (
PolicyProcessorPipeline,
RobotAction,
RobotObservation,
RobotProcessorPipeline,
make_default_processors,
rename_stats,
)
from lerobot.robots import Robot, make_robot_from_config
from lerobot.teleoperators import Teleoperator, make_teleoperator_from_config
from lerobot.utils.feature_utils import combine_feature_dicts, hw_to_dataset_features
from .configs import BaseStrategyConfig, RolloutConfig
from .robot_wrapper import ThreadSafeRobot
logger = logging.getLogger(__name__)
@dataclass
class RolloutContext:
"""Bundle of shared resources passed to every rollout strategy.
Built once by :func:`build_rollout_context` before strategy dispatch.
"""
cfg: RolloutConfig
robot: Robot
robot_wrapper: ThreadSafeRobot
teleop: Teleoperator | None
policy: PreTrainedPolicy
preprocessor: PolicyProcessorPipeline
postprocessor: PolicyProcessorPipeline
teleop_action_processor: RobotProcessorPipeline[tuple[RobotAction, RobotObservation], RobotAction]
robot_action_processor: RobotProcessorPipeline[tuple[RobotAction, RobotObservation], RobotAction]
robot_observation_processor: RobotProcessorPipeline[RobotObservation, RobotObservation]
dataset: LeRobotDataset | None
shutdown_event: Event = field(default_factory=Event)
dataset_features: dict = field(default_factory=dict)
action_keys: list[str] = field(default_factory=list)
hw_features: dict = field(default_factory=dict)
def build_rollout_context(cfg: RolloutConfig, shutdown_event: Event) -> RolloutContext:
"""Wire up hardware, policy, processors, and dataset from config.
This function performs all the one-time setup that every strategy
needs, keeping the strategy implementations lean.
"""
# --- Hardware ---
robot = make_robot_from_config(cfg.robot)
robot.connect()
robot_wrapper = ThreadSafeRobot(robot)
teleop = None
if cfg.teleop is not None:
teleop = make_teleoperator_from_config(cfg.teleop)
teleop.connect()
# --- Processors ---
teleop_action_processor, robot_action_processor, robot_observation_processor = make_default_processors()
# --- Policy ---
use_rtc = cfg.rtc.enabled
policy_class = get_policy_class(cfg.policy.type)
policy_config = PreTrainedConfig.from_pretrained(cfg.policy.pretrained_path)
# Set compile_model for pi0/pi05
if hasattr(policy_config, "compile_model"):
policy_config.compile_model = cfg.use_torch_compile
# Handle PEFT models
if policy_config.use_peft:
from peft import PeftConfig, PeftModel
peft_path = cfg.policy.pretrained_path
peft_config = PeftConfig.from_pretrained(peft_path)
policy = policy_class.from_pretrained(
pretrained_name_or_path=peft_config.base_model_name_or_path, config=policy_config
)
policy = PeftModel.from_pretrained(policy, peft_path, config=peft_config)
else:
policy = policy_class.from_pretrained(cfg.policy.pretrained_path, config=policy_config)
# Enable RTC on the policy
if use_rtc:
policy.config.rtc_config = cfg.rtc
if hasattr(policy, "init_rtc_processor"):
policy.init_rtc_processor()
policy = policy.to(cfg.device)
policy.eval()
# Apply torch.compile if requested (skip pi0/pi05 — they handle their own)
if cfg.use_torch_compile and policy.type not in ("pi0", "pi05"):
try:
if hasattr(torch, "compile"):
compile_kwargs = {
"backend": cfg.torch_compile_backend,
"mode": cfg.torch_compile_mode,
"options": {"triton.cudagraphs": False},
}
policy.predict_action_chunk = torch.compile(policy.predict_action_chunk, **compile_kwargs)
logger.info("torch.compile applied to predict_action_chunk")
except Exception as e:
logger.warning("Failed to apply torch.compile: %s", e)
# --- Observation features (filter to .pos joints + camera streams) ---
all_obs_features = robot.observation_features
observation_features_hw = {
k: v for k, v in all_obs_features.items() if k.endswith(".pos") or isinstance(v, tuple)
}
action_features_hw = {k: v for k, v in robot.action_features.items() if k.endswith(".pos")}
# Build dataset features
dataset_features = combine_feature_dicts(
aggregate_pipeline_dataset_features(
pipeline=teleop_action_processor,
initial_features=create_initial_features(action=action_features_hw),
use_videos=cfg.dataset.video if cfg.dataset else True,
),
aggregate_pipeline_dataset_features(
pipeline=robot_observation_processor,
initial_features=create_initial_features(observation=observation_features_hw),
use_videos=cfg.dataset.video if cfg.dataset else True,
),
)
hw_features = hw_to_dataset_features(observation_features_hw, "observation")
# Action keys
action_keys = [k for k in robot.action_features if k.endswith(".pos")]
# --- Dataset ---
dataset = None
if cfg.dataset is not None and not isinstance(cfg.strategy, BaseStrategyConfig):
if cfg.resume:
dataset = LeRobotDataset.resume(
cfg.dataset.repo_id,
root=cfg.dataset.root,
batch_encoding_size=cfg.dataset.video_encoding_batch_size,
vcodec=cfg.dataset.vcodec,
streaming_encoding=cfg.dataset.streaming_encoding,
encoder_queue_maxsize=cfg.dataset.encoder_queue_maxsize,
encoder_threads=cfg.dataset.encoder_threads,
image_writer_processes=cfg.dataset.num_image_writer_processes,
image_writer_threads=cfg.dataset.num_image_writer_threads_per_camera
* len(robot.cameras if hasattr(robot, "cameras") else []),
)
else:
dataset = LeRobotDataset.create(
cfg.dataset.repo_id,
cfg.dataset.fps,
root=cfg.dataset.root,
robot_type=robot.name,
features=dataset_features,
use_videos=cfg.dataset.video,
image_writer_processes=cfg.dataset.num_image_writer_processes,
image_writer_threads=cfg.dataset.num_image_writer_threads_per_camera
* len(robot.cameras if hasattr(robot, "cameras") else []),
batch_encoding_size=cfg.dataset.video_encoding_batch_size,
vcodec=cfg.dataset.vcodec,
streaming_encoding=cfg.dataset.streaming_encoding,
encoder_queue_maxsize=cfg.dataset.encoder_queue_maxsize,
encoder_threads=cfg.dataset.encoder_threads,
)
# --- Pre/post processors ---
dataset_stats = None
if dataset is not None:
dataset_stats = rename_stats(
dataset.meta.stats,
cfg.dataset.rename_map if cfg.dataset else {},
)
preprocessor, postprocessor = make_pre_post_processors(
policy_cfg=cfg.policy,
pretrained_path=cfg.policy.pretrained_path,
dataset_stats=dataset_stats,
preprocessor_overrides={
"device_processor": {"device": cfg.device or cfg.policy.device},
"rename_observations_processor": {"rename_map": cfg.dataset.rename_map if cfg.dataset else {}},
},
)
return RolloutContext(
cfg=cfg,
robot=robot,
robot_wrapper=robot_wrapper,
teleop=teleop,
policy=policy,
preprocessor=preprocessor,
postprocessor=postprocessor,
teleop_action_processor=teleop_action_processor,
robot_action_processor=robot_action_processor,
robot_observation_processor=robot_observation_processor,
dataset=dataset,
shutdown_event=shutdown_event,
dataset_features=dataset_features,
action_keys=action_keys,
hw_features=hw_features,
)
+431
View File
@@ -0,0 +1,431 @@
# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Unified inference engine supporting both synchronous and RTC backends.
The :class:`InferenceEngine` abstracts whether prediction happens inline
(sync) or in a background thread (RTC), so rollout strategies don't need
to branch on the inference backend.
"""
from __future__ import annotations
import logging
import math
import time
import traceback
from copy import copy
from threading import Event, Lock, Thread
from typing import Any
import torch
from lerobot.common.control_utils import prepare_observation_for_inference
from lerobot.policies.pretrained import PreTrainedPolicy
from lerobot.policies.rtc import ActionInterpolator, ActionQueue, LatencyTracker
from lerobot.policies.rtc.configuration_rtc import RTCConfig
from lerobot.processor import (
NormalizerProcessorStep,
PolicyProcessorPipeline,
RelativeActionsProcessorStep,
TransitionKey,
create_transition,
to_relative_actions,
)
from lerobot.utils.constants import OBS_STATE
from lerobot.utils.feature_utils import build_dataset_frame
from .robot_wrapper import ThreadSafeRobot
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# RTC helpers (extracted from examples/rtc and examples/hil)
# ---------------------------------------------------------------------------
def _reanchor_relative_rtc_prefix(
prev_actions_absolute: torch.Tensor,
current_state: torch.Tensor,
relative_step: RelativeActionsProcessorStep,
normalizer_step: NormalizerProcessorStep | None,
policy_device: torch.device | str,
) -> torch.Tensor:
"""Convert absolute leftover actions into model-space for relative-action RTC policies."""
state = current_state.detach().cpu()
if state.dim() == 1:
state = state.unsqueeze(0)
action_cpu = prev_actions_absolute.detach().cpu()
mask = relative_step._build_mask(action_cpu.shape[-1])
relative_actions = to_relative_actions(action_cpu, state, mask)
transition = create_transition(action=relative_actions)
if normalizer_step is not None:
transition = normalizer_step(transition)
return transition[TransitionKey.ACTION].to(policy_device)
def _normalize_prev_actions_length(prev_actions: torch.Tensor, target_steps: int) -> torch.Tensor:
"""Pad or truncate RTC prefix actions to a fixed length for stable compiled inference."""
if prev_actions.ndim != 2:
raise ValueError(f"Expected 2D [T, A] tensor, got shape={tuple(prev_actions.shape)}")
steps, action_dim = prev_actions.shape
if steps == target_steps:
return prev_actions
if steps > target_steps:
return prev_actions[:target_steps]
padded = torch.zeros((target_steps, action_dim), dtype=prev_actions.dtype, device=prev_actions.device)
padded[:steps] = prev_actions
return padded
def _resolve_action_key_order(
policy_action_names: list[str] | None, dataset_action_names: list[str]
) -> list[str]:
"""Choose action name ordering for mapping policy tensor outputs to robot action dicts."""
if not policy_action_names:
return dataset_action_names
policy_action_names = list(policy_action_names)
if len(policy_action_names) != len(dataset_action_names):
logger.warning(
"policy.action_feature_names length (%d) != dataset action dim (%d); using dataset order",
len(policy_action_names),
len(dataset_action_names),
)
return dataset_action_names
if set(dataset_action_names) != set(policy_action_names):
logger.warning("policy.action_feature_names keys don't match dataset; using dataset order")
return dataset_action_names
return policy_action_names
# ---------------------------------------------------------------------------
# InferenceEngine
# ---------------------------------------------------------------------------
class InferenceEngine:
"""Abstracts sync vs. RTC (async) inference for rollout strategies.
Parameters
----------
policy:
The loaded policy (already on device, in eval mode, with RTC
processor initialised if applicable).
preprocessor / postprocessor:
Policy processor pipelines.
robot_wrapper:
Thread-safe robot wrapper.
rtc_config:
RTC configuration. If ``rtc_config.enabled`` is False, the
engine operates in synchronous mode.
hw_features:
Dataset-level feature dict built from ``hw_to_dataset_features``.
action_keys:
Ordered list of action feature names.
task:
Task description string.
fps:
Control loop frequency.
device:
Torch device string.
interpolator:
Action interpolator (used only in RTC mode for the actor loop).
use_torch_compile:
Whether torch.compile warmup is needed.
compile_warmup_inferences:
Number of warmup inferences before live rollout.
"""
def __init__(
self,
policy: PreTrainedPolicy,
preprocessor: PolicyProcessorPipeline,
postprocessor: PolicyProcessorPipeline,
robot_wrapper: ThreadSafeRobot,
rtc_config: RTCConfig,
hw_features: dict,
action_keys: list[str],
task: str,
fps: float,
device: str | None,
interpolator: ActionInterpolator | None = None,
use_torch_compile: bool = False,
compile_warmup_inferences: int = 2,
) -> None:
self._policy = policy
self._preprocessor = preprocessor
self._postprocessor = postprocessor
self._robot = robot_wrapper
self._rtc_config = rtc_config
self._hw_features = hw_features
self._action_keys = action_keys
self._task = task
self._fps = fps
self._device = device or "cpu"
self._interpolator = interpolator
self._use_torch_compile = use_torch_compile
self._compile_warmup_inferences = compile_warmup_inferences
# RTC state
self._use_rtc = rtc_config.enabled
self._action_queue: ActionQueue | None = None
self._obs_holder: dict[str, Any] = {}
self._obs_lock = Lock()
self._policy_active = Event()
self._compile_warmup_done = Event()
self._shutdown_event = Event()
self._rtc_thread: Thread | None = None
if not self._use_torch_compile:
self._compile_warmup_done.set()
# Processor introspection for relative-action re-anchoring
self._relative_step = next(
(s for s in preprocessor.steps if isinstance(s, RelativeActionsProcessorStep) and s.enabled),
None,
)
self._normalizer_step = next(
(s for s in preprocessor.steps if isinstance(s, NormalizerProcessorStep)),
None,
)
if self._relative_step is not None:
if self._relative_step.action_names is None:
cfg_names = getattr(policy.config, "action_feature_names", None)
if cfg_names:
self._relative_step.action_names = list(cfg_names)
else:
self._relative_step.action_names = [
k for k in robot_wrapper.action_features if k.endswith(".pos")
]
logger.info("Relative actions enabled: RTC prefix will be re-anchored")
# ------------------------------------------------------------------
# Lifecycle
# ------------------------------------------------------------------
@property
def is_rtc(self) -> bool:
return self._use_rtc
@property
def action_queue(self) -> ActionQueue | None:
return self._action_queue
@property
def compile_warmup_done(self) -> Event:
return self._compile_warmup_done
def start(self) -> None:
"""Start the inference engine. Launches the RTC background thread if enabled."""
if self._use_rtc:
self._action_queue = ActionQueue(self._rtc_config)
self._obs_holder = {
"obs": None,
"robot_type": self._robot.robot_type,
}
self._shutdown_event.clear()
self._rtc_thread = Thread(
target=self._rtc_loop,
daemon=True,
name="RTCInference",
)
self._rtc_thread.start()
logger.info("RTC inference thread started")
def stop(self) -> None:
"""Signal the RTC thread to stop and wait for it."""
self._shutdown_event.set()
self._policy_active.clear()
if self._rtc_thread is not None and self._rtc_thread.is_alive():
self._rtc_thread.join(timeout=3.0)
self._rtc_thread = None
def pause(self) -> None:
"""Pause the RTC background thread (used during DAgger takeover)."""
self._policy_active.clear()
def resume(self) -> None:
"""Resume the RTC background thread."""
self._policy_active.set()
def reset(self) -> None:
"""Reset policy, processors, and action queue between episodes."""
self._policy.reset()
self._preprocessor.reset()
self._postprocessor.reset()
if self._use_rtc:
self._action_queue = ActionQueue(self._rtc_config)
if self._interpolator is not None:
self._interpolator.reset()
# ------------------------------------------------------------------
# Sync inference
# ------------------------------------------------------------------
def get_action_sync(self, obs_frame: dict) -> torch.Tensor:
"""Run synchronous single-step inference.
Parameters
----------
obs_frame:
Observation dict with numpy arrays (output of ``build_dataset_frame``).
Returns
-------
Action tensor on CPU.
"""
observation = copy(obs_frame)
policy_device = torch.device(self._device)
with (
torch.inference_mode(),
torch.autocast(device_type=policy_device.type)
if policy_device.type == "cuda" and self._policy.config.use_amp
else torch.inference_mode(),
):
observation = prepare_observation_for_inference(
observation, policy_device, self._task, self._robot.robot_type
)
observation = self._preprocessor(observation)
action = self._policy.select_action(observation)
action = self._postprocessor(action)
return action.squeeze(0).cpu()
# ------------------------------------------------------------------
# RTC: action consumption (called from main thread)
# ------------------------------------------------------------------
def consume_rtc_action(self) -> torch.Tensor | None:
"""Pop the next action from the RTC action queue. Returns None if empty."""
if self._action_queue is None:
return None
return self._action_queue.get()
def update_observation(self, obs_filtered: dict) -> None:
"""Push the latest observation to the shared holder for the RTC thread."""
with self._obs_lock:
self._obs_holder["obs"] = obs_filtered
# ------------------------------------------------------------------
# RTC: background inference thread
# ------------------------------------------------------------------
def _rtc_loop(self) -> None:
"""Background thread that generates action chunks via RTC."""
try:
latency_tracker = LatencyTracker()
time_per_chunk = 1.0 / self._fps
threshold = 30
policy_device = self._policy.config.device
warmup_required = max(1, self._compile_warmup_inferences) if self._use_torch_compile else 0
inference_count = 0
while not self._shutdown_event.is_set():
if not self._policy_active.is_set():
time.sleep(0.01)
continue
queue = self._action_queue
with self._obs_lock:
obs = self._obs_holder.get("obs")
if queue is None or obs is None:
time.sleep(0.01)
continue
if queue.qsize() <= threshold:
try:
current_time = time.perf_counter()
idx_before = queue.get_action_index()
prev_actions = queue.get_left_over()
latency = latency_tracker.max()
delay = math.ceil(latency / time_per_chunk) if latency else 0
# Build observation batch
obs_batch = build_dataset_frame(self._hw_features, obs, prefix="observation")
for name in obs_batch:
obs_batch[name] = torch.from_numpy(obs_batch[name])
if "image" in name:
obs_batch[name] = obs_batch[name].float() / 255
obs_batch[name] = obs_batch[name].permute(2, 0, 1).contiguous()
obs_batch[name] = obs_batch[name].unsqueeze(0).to(policy_device)
obs_batch["task"] = [self._task]
obs_batch["robot_type"] = self._obs_holder.get("robot_type", "unknown")
preprocessed = self._preprocessor(obs_batch)
# Re-anchor leftover for relative-action policies
if (
prev_actions is not None
and self._relative_step is not None
and OBS_STATE in obs_batch
):
prev_abs = queue.get_processed_left_over()
if prev_abs is not None and prev_abs.numel() > 0:
prev_actions = _reanchor_relative_rtc_prefix(
prev_actions_absolute=prev_abs,
current_state=obs_batch[OBS_STATE],
relative_step=self._relative_step,
normalizer_step=self._normalizer_step,
policy_device=policy_device,
)
if prev_actions is not None:
prev_actions = _normalize_prev_actions_length(
prev_actions, target_steps=self._rtc_config.execution_horizon
)
actions = self._policy.predict_action_chunk(
preprocessed, inference_delay=delay, prev_chunk_left_over=prev_actions
)
original = actions.squeeze(0).clone()
processed = self._postprocessor(actions).squeeze(0)
new_latency = time.perf_counter() - current_time
new_delay = math.ceil(new_latency / time_per_chunk)
inference_count += 1
is_warmup = self._use_torch_compile and inference_count <= warmup_required
if is_warmup:
latency_tracker.reset()
else:
latency_tracker.add(new_latency)
queue.merge(original, processed, new_delay, idx_before)
if (
is_warmup
and inference_count >= warmup_required
and not self._compile_warmup_done.is_set()
):
self._compile_warmup_done.set()
logger.info("Compile warmup complete (%d inferences)", inference_count)
logger.debug("RTC inference latency=%.2fs, queue=%d", new_latency, queue.qsize())
except Exception as e:
logger.error("RTC inference error: %s", e)
logger.debug(traceback.format_exc())
time.sleep(0.5)
else:
time.sleep(0.01)
except Exception as e:
logger.error("Fatal error in RTC thread: %s", e)
logger.error(traceback.format_exc())
+100
View File
@@ -0,0 +1,100 @@
# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Memory-bounded ring buffer for the Highlight Reel rollout strategy."""
from __future__ import annotations
from collections import deque
import numpy as np
class RolloutRingBuffer:
"""Fixed-capacity circular buffer for observation/action frames.
Stores the last *N* seconds of telemetry in memory, bounded by both
time (``max_frames``) and memory (``max_memory_bytes``). When either
limit is reached the oldest frames are evicted.
Parameters
----------
max_seconds:
Maximum duration of buffered telemetry.
max_memory_mb:
Hard memory cap in MiB. Frames are evicted when the estimated
total size exceeds this.
fps:
Frames per second used to convert ``max_seconds`` to a frame
count.
"""
def __init__(self, max_seconds: float = 30.0, max_memory_mb: float = 2048.0, fps: float = 30.0) -> None:
self._max_frames = int(max_seconds * fps)
self._max_bytes = int(max_memory_mb * 1024 * 1024)
self._buffer: deque[dict] = deque(maxlen=self._max_frames)
self._current_bytes: int = 0
# ------------------------------------------------------------------
# Public API
# ------------------------------------------------------------------
def append(self, frame: dict) -> None:
"""Add *frame* to the buffer, evicting the oldest if at capacity."""
frame_bytes = _estimate_frame_bytes(frame)
# Evict oldest frames until we are under the memory cap
while self._current_bytes + frame_bytes > self._max_bytes and self._buffer:
evicted = self._buffer.popleft()
self._current_bytes -= _estimate_frame_bytes(evicted)
self._buffer.append(frame)
self._current_bytes += frame_bytes
def drain(self) -> list[dict]:
"""Return all buffered frames and clear the buffer."""
frames = list(self._buffer)
self._buffer.clear()
self._current_bytes = 0
return frames
def clear(self) -> None:
"""Discard all buffered frames."""
self._buffer.clear()
self._current_bytes = 0
def __len__(self) -> int:
return len(self._buffer)
@property
def estimated_bytes(self) -> int:
return self._current_bytes
# ------------------------------------------------------------------
# Helpers
# ------------------------------------------------------------------
def _estimate_frame_bytes(frame: dict) -> int:
"""Rough byte estimate for a single frame dictionary."""
total = 0
for v in frame.values():
if isinstance(v, np.ndarray) or hasattr(v, "nbytes"):
total += v.nbytes
elif isinstance(v, (int, float)):
total += 8
elif isinstance(v, str) or isinstance(v, bytes):
total += len(v)
return max(total, 1) # avoid zero-size frames
+79
View File
@@ -0,0 +1,79 @@
# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Thread-safe robot wrapper for concurrent observation/action access."""
from __future__ import annotations
from threading import Lock
from typing import Any
from lerobot.robots import Robot
class ThreadSafeRobot:
"""Lock-protected wrapper around a :class:`Robot` for use with background threads.
When RTC inference runs in a background thread while the main loop
executes actions, both threads may access the robot concurrently.
This wrapper serialises ``get_observation`` and ``send_action`` calls.
Read-only properties are proxied without the lock since they don't
mutate hardware state.
"""
def __init__(self, robot: Robot) -> None:
self._robot = robot
self._lock = Lock()
# -- Lock-protected I/O --------------------------------------------------
def get_observation(self) -> dict[str, Any]:
with self._lock:
return self._robot.get_observation()
def send_action(self, action: dict[str, Any] | Any) -> Any:
with self._lock:
return self._robot.send_action(action)
# -- Read-only proxies (no lock needed) -----------------------------------
@property
def observation_features(self) -> dict:
return self._robot.observation_features
@property
def action_features(self) -> dict:
return self._robot.action_features
@property
def name(self) -> str:
return self._robot.name
@property
def robot_type(self) -> str:
return self._robot.robot_type
@property
def cameras(self):
return getattr(self._robot, "cameras", {})
@property
def is_connected(self) -> bool:
return self._robot.is_connected
@property
def inner(self) -> Robot:
"""Access the underlying robot (e.g. for connect/disconnect)."""
return self._robot
@@ -0,0 +1,77 @@
# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Rollout strategy ABC and factory."""
from __future__ import annotations
import abc
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from lerobot.rollout.configs import RolloutStrategyConfig
from lerobot.rollout.context import RolloutContext
class RolloutStrategy(abc.ABC):
"""Abstract base for rollout execution strategies.
Each concrete strategy implements a self-contained control loop with
its own recording/interaction semantics. Strategies are mutually
exclusive only one runs per session.
"""
def __init__(self, config: RolloutStrategyConfig) -> None:
self.config = config
@abc.abstractmethod
def setup(self, ctx: RolloutContext) -> None:
"""Strategy-specific initialisation (keyboard listeners, buffers, etc.)."""
@abc.abstractmethod
def run(self, ctx: RolloutContext) -> None:
"""Main rollout loop. Returns when shutdown is requested or duration expires."""
@abc.abstractmethod
def teardown(self, ctx: RolloutContext) -> None:
"""Cleanup: save dataset, stop threads, disconnect hardware."""
def create_strategy(config: RolloutStrategyConfig) -> RolloutStrategy:
"""Instantiate the appropriate strategy from a config object."""
from lerobot.rollout.configs import (
BaseStrategyConfig,
DAggerStrategyConfig,
HighlightStrategyConfig,
SentryStrategyConfig,
)
if isinstance(config, BaseStrategyConfig):
from .base import BaseStrategy
return BaseStrategy(config)
if isinstance(config, SentryStrategyConfig):
from .sentry import SentryStrategy
return SentryStrategy(config)
if isinstance(config, HighlightStrategyConfig):
from .highlight import HighlightStrategy
return HighlightStrategy(config)
if isinstance(config, DAggerStrategyConfig):
from .dagger import DAggerStrategy
return DAggerStrategy(config)
raise ValueError(f"Unknown strategy config type: {type(config).__name__}")
+141
View File
@@ -0,0 +1,141 @@
# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Base rollout strategy: autonomous policy execution with no data recording."""
from __future__ import annotations
import logging
import time
from lerobot.policies.rtc import ActionInterpolator
from lerobot.policies.utils import make_robot_action
from lerobot.utils.constants import OBS_STR
from lerobot.utils.feature_utils import build_dataset_frame
from lerobot.utils.robot_utils import precise_sleep
from ..context import RolloutContext
from ..inference import InferenceEngine, _resolve_action_key_order
from . import RolloutStrategy
logger = logging.getLogger(__name__)
class BaseStrategy(RolloutStrategy):
"""Autonomous policy rollout with no data recording.
Supports both synchronous and RTC inference backends via the
:class:`InferenceEngine`. All actions flow through the
``robot_action_processor`` pipeline before reaching the robot.
"""
def __init__(self, config):
super().__init__(config)
self._engine: InferenceEngine | None = None
def setup(self, ctx: RolloutContext) -> None:
interpolator = ActionInterpolator(multiplier=ctx.cfg.interpolation_multiplier)
self._engine = InferenceEngine(
policy=ctx.policy,
preprocessor=ctx.preprocessor,
postprocessor=ctx.postprocessor,
robot_wrapper=ctx.robot_wrapper,
rtc_config=ctx.cfg.rtc,
hw_features=ctx.hw_features,
action_keys=ctx.action_keys,
task=ctx.cfg.task,
fps=ctx.cfg.fps,
device=ctx.cfg.device,
interpolator=interpolator,
use_torch_compile=ctx.cfg.use_torch_compile,
compile_warmup_inferences=ctx.cfg.compile_warmup_inferences,
)
self._engine.start()
logger.info("Base strategy ready (rtc=%s)", self._engine.is_rtc)
def run(self, ctx: RolloutContext) -> None:
engine = self._engine
cfg = ctx.cfg
robot = ctx.robot_wrapper
action_keys = ctx.action_keys
interpolator = ActionInterpolator(multiplier=cfg.interpolation_multiplier)
control_interval = interpolator.get_control_interval(cfg.fps)
policy_action_names = getattr(cfg.policy, "action_feature_names", None)
ordered_keys = _resolve_action_key_order(
list(policy_action_names) if policy_action_names else None,
action_keys,
)
start_time = time.perf_counter()
warmup_flushed = False
if engine.is_rtc:
engine.resume()
while not ctx.shutdown_event.is_set():
loop_start = time.perf_counter()
if cfg.duration > 0 and (time.perf_counter() - start_time) >= cfg.duration:
break
obs = robot.get_observation()
obs_processed = ctx.robot_observation_processor(obs)
if engine.is_rtc:
engine.update_observation(obs_processed)
if cfg.use_torch_compile and not engine.compile_warmup_done.is_set():
dt = time.perf_counter() - loop_start
if (sleep_t := control_interval - dt) > 0:
precise_sleep(sleep_t)
continue
if cfg.use_torch_compile and not warmup_flushed:
engine.reset()
interpolator.reset()
warmup_flushed = True
if interpolator.needs_new_action():
action_tensor = engine.consume_rtc_action()
if action_tensor is not None:
interpolator.add(action_tensor.cpu())
interp = interpolator.get()
if interp is not None:
action_dict = {k: interp[i].item() for i, k in enumerate(ordered_keys) if i < len(interp)}
processed = ctx.robot_action_processor((action_dict, obs))
robot.send_action(processed)
else:
obs_frame = build_dataset_frame(ctx.dataset_features, obs_processed, prefix=OBS_STR)
action_tensor = engine.get_action_sync(obs_frame)
action_dict = make_robot_action(action_tensor, ctx.dataset_features)
processed = ctx.robot_action_processor((action_dict, obs))
robot.send_action(processed)
dt = time.perf_counter() - loop_start
if (sleep_t := control_interval - dt) > 0:
precise_sleep(sleep_t)
def teardown(self, ctx: RolloutContext) -> None:
if self._engine is not None:
self._engine.stop()
if ctx.robot.is_connected:
ctx.robot.disconnect()
if ctx.teleop is not None and ctx.teleop.is_connected:
ctx.teleop.disconnect()
logger.info("Base strategy teardown complete")
+550
View File
@@ -0,0 +1,550 @@
# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""DAgger rollout strategy: Human-in-the-Loop data collection.
Implements the RaC paradigm (Recovery and Correction) for interactive
imitation learning. Alternates between autonomous policy execution and
human intervention via teleoperator.
Keyboard Controls:
SPACE - Pause policy (robot holds position, no recording)
c - Take control (start correction, recording resumes)
p - Resume policy after pause/correction
-> - End episode (save and continue)
<- - Re-record episode
ESC - Stop recording and push to hub
"""
from __future__ import annotations
import logging
import time
from typing import Any
import torch
from lerobot.common.control_utils import is_headless, predict_action
from lerobot.datasets import VideoEncodingManager
from lerobot.policies.rtc import ActionInterpolator
from lerobot.policies.utils import make_robot_action
from lerobot.utils.constants import ACTION, OBS_STR
from lerobot.utils.device_utils import get_safe_torch_device
from lerobot.utils.feature_utils import build_dataset_frame
from lerobot.utils.robot_utils import precise_sleep
from lerobot.utils.utils import log_say
from ..configs import DAggerStrategyConfig
from ..context import RolloutContext
from ..inference import InferenceEngine, _resolve_action_key_order
from . import RolloutStrategy
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Teleoperator helpers (extracted from examples/hil/hil_utils.py)
# ---------------------------------------------------------------------------
def _teleop_has_motor_control(teleop) -> bool:
return all(hasattr(teleop, attr) for attr in ("enable_torque", "disable_torque", "write_goal_positions"))
def _teleop_disable_torque(teleop) -> None:
if hasattr(teleop, "disable_torque"):
teleop.disable_torque()
def _teleop_enable_torque(teleop) -> None:
if hasattr(teleop, "enable_torque"):
teleop.enable_torque()
def _teleop_smooth_move_to(teleop, target_pos: dict, duration_s: float = 2.0, fps: int = 50) -> None:
"""Smoothly move teleop to target position if motor control is available."""
if not _teleop_has_motor_control(teleop):
logger.warning("Teleop does not support motor control — cannot mirror robot position")
return
_teleop_enable_torque(teleop)
current = teleop.get_action()
steps = max(int(duration_s * fps), 1)
for step in range(steps + 1):
t = step / steps
interp = {}
for k in current:
if k in target_pos:
interp[k] = current[k] * (1 - t) + target_pos[k] * t
else:
interp[k] = current[k]
teleop.write_goal_positions(interp)
time.sleep(1 / fps)
def _reset_loop(robot, teleop, events: dict, fps: int) -> None:
"""Reset period where the human repositions the environment."""
logger.info("RESET — press any key to enable teleoperation")
events["in_reset"] = True
events["start_next_episode"] = False
obs = robot.get_observation()
robot_pos = {k: v for k, v in obs.items() if k.endswith(".pos") and k in robot.observation_features}
_teleop_smooth_move_to(teleop, robot_pos, duration_s=2.0, fps=50)
while not events["start_next_episode"] and not events["stop_recording"]:
precise_sleep(0.05)
if events["stop_recording"]:
return
events["start_next_episode"] = False
_teleop_disable_torque(teleop)
logger.info("Teleop enabled — press any key to start episode")
while not events["start_next_episode"] and not events["stop_recording"]:
loop_start = time.perf_counter()
action = teleop.get_action()
robot.send_action(action)
precise_sleep(1 / fps - (time.perf_counter() - loop_start))
events["in_reset"] = False
events["start_next_episode"] = False
events["exit_early"] = False
events["policy_paused"] = False
events["correction_active"] = False
events["resume_policy"] = False
def _init_dagger_keyboard():
"""Initialise keyboard listener with DAgger/HIL controls."""
events = {
"exit_early": False,
"rerecord_episode": False,
"stop_recording": False,
"policy_paused": False,
"correction_active": False,
"resume_policy": False,
"in_reset": False,
"start_next_episode": False,
}
if is_headless():
logger.warning("Headless environment — keyboard controls unavailable")
return None, events
from pynput import keyboard
def on_press(key):
try:
if events["in_reset"]:
if key in [keyboard.Key.space, keyboard.Key.right] or hasattr(key, "char") and key.char == "c":
events["start_next_episode"] = True
elif key == keyboard.Key.esc:
events["stop_recording"] = True
events["start_next_episode"] = True
else:
if key == keyboard.Key.space:
if not events["policy_paused"] and not events["correction_active"]:
logger.info("PAUSED — press 'c' to take control or 'p' to resume policy")
events["policy_paused"] = True
elif hasattr(key, "char") and key.char == "c":
if events["policy_paused"] and not events["correction_active"]:
logger.info("Taking control...")
events["start_next_episode"] = True
elif hasattr(key, "char") and key.char == "p":
if events["policy_paused"] or events["correction_active"]:
logger.info("Resuming policy...")
events["resume_policy"] = True
elif key == keyboard.Key.right:
logger.info("End episode")
events["exit_early"] = True
elif key == keyboard.Key.left:
logger.info("Re-record episode")
events["rerecord_episode"] = True
events["exit_early"] = True
elif key == keyboard.Key.esc:
logger.info("Stop recording...")
events["stop_recording"] = True
events["exit_early"] = True
except Exception as e:
logger.debug("Key error: %s", e)
listener = keyboard.Listener(on_press=on_press)
listener.start()
return listener, events
def _start_pedal_listener(events: dict) -> None:
"""Start foot pedal listener thread if evdev is available."""
import threading
try:
from evdev import InputDevice, categorize, ecodes
except ImportError:
return
pedal_device = "/dev/input/by-id/usb-PCsensor_FootSwitch-event-kbd"
def pedal_reader():
try:
dev = InputDevice(pedal_device)
logger.info("Pedal connected: %s", dev.name)
for ev in dev.read_loop():
if ev.type != ecodes.EV_KEY:
continue
key = categorize(ev)
code = key.keycode
if isinstance(code, (list, tuple)):
code = code[0]
if key.keystate != 1:
continue
if events["in_reset"]:
if code in ["KEY_A", "KEY_C"]:
events["start_next_episode"] = True
else:
if code not in ["KEY_A", "KEY_C"]:
continue
if events["correction_active"]:
events["resume_policy"] = True
elif events["policy_paused"]:
events["start_next_episode"] = True
else:
events["policy_paused"] = True
except (FileNotFoundError, PermissionError):
pass
except Exception as e:
logger.warning("Pedal error: %s", e)
threading.Thread(target=pedal_reader, daemon=True).start()
# ---------------------------------------------------------------------------
# DAgger Strategy
# ---------------------------------------------------------------------------
class DAggerStrategy(RolloutStrategy):
"""Human-in-the-Loop data collection with intervention tagging.
State machine:
AUTONOMOUS -> (SPACE) -> PAUSED -> ('c') -> TAKEOVER -> ('p') -> AUTONOMOUS
-> (->) -> save episode
Supports both synchronous and RTC inference backends.
All actions (policy and teleop) flow through the appropriate
processor pipelines, supporting EE-space recording.
"""
config: DAggerStrategyConfig
def __init__(self, config: DAggerStrategyConfig):
super().__init__(config)
self._engine: InferenceEngine | None = None
self._listener = None
self._events: dict[str, Any] = {}
def setup(self, ctx: RolloutContext) -> None:
interpolator = ActionInterpolator(multiplier=ctx.cfg.interpolation_multiplier)
self._engine = InferenceEngine(
policy=ctx.policy,
preprocessor=ctx.preprocessor,
postprocessor=ctx.postprocessor,
robot_wrapper=ctx.robot_wrapper,
rtc_config=ctx.cfg.rtc,
hw_features=ctx.hw_features,
action_keys=ctx.action_keys,
task=ctx.cfg.task,
fps=ctx.cfg.fps,
device=ctx.cfg.device,
interpolator=interpolator,
use_torch_compile=ctx.cfg.use_torch_compile,
compile_warmup_inferences=ctx.cfg.compile_warmup_inferences,
)
self._engine.start()
self._listener, self._events = _init_dagger_keyboard()
_start_pedal_listener(self._events)
logger.info(
"DAgger strategy ready (rtc=%s, episodes=%d, episode_time=%.0fs)",
self._engine.is_rtc,
self.config.num_episodes,
self.config.episode_time_s,
)
logger.info("Controls: SPACE=pause, c=take control, p=resume, ->=end, <-=redo, ESC=stop")
def run(self, ctx: RolloutContext) -> None:
engine = self._engine
dataset = ctx.dataset
events = self._events
teleop = ctx.teleop
with VideoEncodingManager(dataset):
try:
recorded = 0
while recorded < self.config.num_episodes and not events["stop_recording"]:
log_say(f"Episode {dataset.num_episodes}", self.config.play_sounds)
self._run_episode(ctx)
if events["rerecord_episode"]:
log_say("Re-recording", self.config.play_sounds)
events["rerecord_episode"] = False
events["exit_early"] = False
dataset.clear_episode_buffer()
continue
dataset.save_episode()
recorded += 1
if recorded < self.config.num_episodes and not events["stop_recording"]:
_reset_loop(ctx.robot_wrapper, teleop, events, int(ctx.cfg.fps))
finally:
try:
dataset.save_episode()
except Exception:
pass
def teardown(self, ctx: RolloutContext) -> None:
log_say("Stop recording", self.config.play_sounds, blocking=True)
if self._engine is not None:
self._engine.stop()
if self._listener is not None and not is_headless():
self._listener.stop()
if ctx.dataset is not None:
ctx.dataset.finalize()
if ctx.cfg.dataset and ctx.cfg.dataset.push_to_hub:
ctx.dataset.push_to_hub(
tags=ctx.cfg.dataset.tags,
private=ctx.cfg.dataset.private,
)
if ctx.robot.is_connected:
ctx.robot.disconnect()
if ctx.teleop is not None and ctx.teleop.is_connected:
ctx.teleop.disconnect()
logger.info("DAgger strategy teardown complete")
# ------------------------------------------------------------------
# Episode rollout (state machine)
# ------------------------------------------------------------------
def _run_episode(self, ctx: RolloutContext) -> None:
"""Run a single DAgger episode with the HIL state machine."""
engine = self._engine
cfg = ctx.cfg
robot = ctx.robot_wrapper
teleop = ctx.teleop
dataset = ctx.dataset
events = self._events
interpolator = ActionInterpolator(multiplier=cfg.interpolation_multiplier)
control_interval = interpolator.get_control_interval(cfg.fps)
stream_online = bool(cfg.dataset.streaming_encoding) if cfg.dataset else False
record_stride = max(1, cfg.interpolation_multiplier)
policy_action_names = getattr(cfg.policy, "action_feature_names", None)
ordered_keys = _resolve_action_key_order(
list(policy_action_names) if policy_action_names else None,
ctx.action_keys,
)
dataset_action_keys = list(dataset.features.get(ACTION, {}).get("names", ctx.action_keys))
engine.reset()
_teleop_disable_torque(teleop)
was_paused = False
waiting_for_takeover = False
last_action: dict[str, Any] | None = None
robot_action: dict[str, Any] = {}
frame_buffer: list[dict] = []
task_str = cfg.dataset.single_task if cfg.dataset else cfg.task
timestamp = 0.0
record_tick = 0
start_t = time.perf_counter()
warmup_flushed = False
if engine.is_rtc:
engine.resume()
while timestamp < self.config.episode_time_s:
loop_start = time.perf_counter()
if events["exit_early"]:
events["exit_early"] = False
events["policy_paused"] = False
events["correction_active"] = False
events["resume_policy"] = False
break
# --- Resume from pause/correction ---
if events["resume_policy"] and (
events["policy_paused"] or events["correction_active"] or waiting_for_takeover
):
events["resume_policy"] = False
events["start_next_episode"] = False
events["policy_paused"] = False
events["correction_active"] = False
waiting_for_takeover = False
was_paused = False
last_action = None
interpolator.reset()
engine.reset()
if engine.is_rtc:
engine.resume()
# --- Pause: align teleop to robot position ---
if events["policy_paused"] and not was_paused:
if engine.is_rtc:
engine.pause()
obs = robot.get_observation()
robot_pos = {
k: v for k, v in obs.items() if k.endswith(".pos") and k in robot.observation_features
}
_teleop_smooth_move_to(teleop, robot_pos, duration_s=2.0, fps=50)
events["start_next_episode"] = False
waiting_for_takeover = True
was_paused = True
interpolator.reset()
# --- Takeover: enable teleop control ---
if waiting_for_takeover and events["start_next_episode"]:
_teleop_disable_torque(teleop)
events["start_next_episode"] = False
events["correction_active"] = True
waiting_for_takeover = False
if engine.is_rtc:
engine.reset()
# --- Get observation ---
obs = robot.get_observation()
obs_processed = ctx.robot_observation_processor(obs)
obs_frame = build_dataset_frame(dataset.features, obs_processed, prefix=OBS_STR)
# --- CORRECTION: human teleop control ---
if events["correction_active"]:
teleop_action = teleop.get_action()
processed_teleop = ctx.teleop_action_processor((teleop_action, obs))
robot_action_to_send = ctx.robot_action_processor((processed_teleop, obs))
robot.send_action(robot_action_to_send)
action_frame = build_dataset_frame(dataset.features, processed_teleop, prefix=ACTION)
if record_tick % record_stride == 0:
frame = {**obs_frame, **action_frame, "task": task_str}
if stream_online:
dataset.add_frame(frame)
else:
frame_buffer.append(frame)
record_tick += 1
# --- PAUSED: hold position ---
elif waiting_for_takeover or events["policy_paused"]:
if last_action:
robot.send_action(last_action)
# --- AUTONOMOUS: policy control ---
else:
if engine.is_rtc:
engine.update_observation(obs_processed)
if cfg.use_torch_compile and not engine.compile_warmup_done.is_set():
dt = time.perf_counter() - loop_start
if (sleep_t := control_interval - dt) > 0:
precise_sleep(sleep_t)
timestamp = time.perf_counter() - start_t
continue
if cfg.use_torch_compile and not warmup_flushed:
engine.reset()
interpolator.reset()
warmup_flushed = True
if engine.is_rtc:
engine.resume()
if interpolator.needs_new_action():
action_tensor = engine.consume_rtc_action()
if action_tensor is not None:
interpolator.add(action_tensor.cpu())
interp = interpolator.get()
if interp is not None:
robot_action = {
k: interp[i].item() for i, k in enumerate(ordered_keys) if i < len(interp)
}
processed = ctx.robot_action_processor((robot_action, obs))
robot.send_action(processed)
last_action = processed
action_frame = build_dataset_frame(dataset.features, robot_action, prefix=ACTION)
if record_tick % record_stride == 0:
frame = {**obs_frame, **action_frame, "task": task_str}
if stream_online:
dataset.add_frame(frame)
else:
frame_buffer.append(frame)
record_tick += 1
else:
# Sync inference
if interpolator.needs_new_action():
device = get_safe_torch_device(cfg.device)
action_tensor = predict_action(
observation=obs_frame,
policy=ctx.policy,
device=device,
preprocessor=ctx.preprocessor,
postprocessor=ctx.postprocessor,
use_amp=ctx.policy.config.use_amp,
task=task_str,
robot_type=robot.robot_type,
)
robot_action = make_robot_action(action_tensor, dataset.features)
action_t = torch.tensor([robot_action[k] for k in dataset_action_keys])
interpolator.add(action_t)
interp = interpolator.get()
if interp is not None:
robot_action = {k: interp[i].item() for i, k in enumerate(dataset_action_keys)}
processed = ctx.robot_action_processor((robot_action, obs))
robot.send_action(processed)
last_action = processed
action_frame = build_dataset_frame(dataset.features, robot_action, prefix=ACTION)
if record_tick % record_stride == 0:
frame = {**obs_frame, **action_frame, "task": task_str}
if stream_online:
dataset.add_frame(frame)
else:
frame_buffer.append(frame)
record_tick += 1
dt = time.perf_counter() - loop_start
if (sleep_t := control_interval - dt) > 0:
precise_sleep(sleep_t)
timestamp = time.perf_counter() - start_t
# End of episode: flush any buffered frames
if engine.is_rtc:
engine.pause()
_teleop_disable_torque(teleop)
if not stream_online:
for frame in frame_buffer:
dataset.add_frame(frame)
+251
View File
@@ -0,0 +1,251 @@
# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Highlight Reel strategy: on-demand recording via ring buffer."""
from __future__ import annotations
import logging
import time
from lerobot.datasets import VideoEncodingManager
from lerobot.policies.rtc import ActionInterpolator
from lerobot.policies.utils import make_robot_action
from lerobot.utils.constants import ACTION, OBS_STR
from lerobot.utils.feature_utils import build_dataset_frame
from lerobot.utils.robot_utils import precise_sleep
from ..configs import HighlightStrategyConfig
from ..context import RolloutContext
from ..inference import InferenceEngine, _resolve_action_key_order
from ..ring_buffer import RolloutRingBuffer
from . import RolloutStrategy
logger = logging.getLogger(__name__)
class HighlightStrategy(RolloutStrategy):
"""Autonomous rollout with on-demand recording via ring buffer.
The robot runs autonomously while a memory-bounded ring buffer
captures continuous telemetry. When the user presses the save key:
1. The ring buffer is flushed to the dataset (last *Z* seconds).
2. Live recording continues until the save key is pressed again.
3. The episode is saved and the ring buffer resumes capturing.
All actions flow through ``robot_action_processor`` before reaching
the robot, supporting EE-space recording with joint-space robots.
"""
config: HighlightStrategyConfig
def __init__(self, config: HighlightStrategyConfig):
super().__init__(config)
self._engine: InferenceEngine | None = None
self._ring: RolloutRingBuffer | None = None
self._listener = None
self._save_requested = False
self._recording_live = False
def setup(self, ctx: RolloutContext) -> None:
interpolator = ActionInterpolator(multiplier=ctx.cfg.interpolation_multiplier)
self._engine = InferenceEngine(
policy=ctx.policy,
preprocessor=ctx.preprocessor,
postprocessor=ctx.postprocessor,
robot_wrapper=ctx.robot_wrapper,
rtc_config=ctx.cfg.rtc,
hw_features=ctx.hw_features,
action_keys=ctx.action_keys,
task=ctx.cfg.task,
fps=ctx.cfg.fps,
device=ctx.cfg.device,
interpolator=interpolator,
use_torch_compile=ctx.cfg.use_torch_compile,
compile_warmup_inferences=ctx.cfg.compile_warmup_inferences,
)
self._engine.start()
self._ring = RolloutRingBuffer(
max_seconds=self.config.ring_buffer_seconds,
max_memory_mb=self.config.ring_buffer_max_memory_mb,
fps=ctx.cfg.fps,
)
self._setup_keyboard()
logger.info(
"Highlight strategy ready (buffer=%.0fs, key='%s')",
self.config.ring_buffer_seconds,
self.config.save_key,
)
def run(self, ctx: RolloutContext) -> None:
engine = self._engine
cfg = ctx.cfg
robot = ctx.robot_wrapper
dataset = ctx.dataset
action_keys = ctx.action_keys
ring = self._ring
interpolator = ActionInterpolator(multiplier=cfg.interpolation_multiplier)
control_interval = interpolator.get_control_interval(cfg.fps)
policy_action_names = getattr(cfg.policy, "action_feature_names", None)
ordered_keys = _resolve_action_key_order(
list(policy_action_names) if policy_action_names else None,
action_keys,
)
if engine.is_rtc:
engine.resume()
start_time = time.perf_counter()
warmup_flushed = False
task_str = cfg.dataset.single_task if cfg.dataset else cfg.task
with VideoEncodingManager(dataset):
try:
while not ctx.shutdown_event.is_set():
loop_start = time.perf_counter()
if cfg.duration > 0 and (time.perf_counter() - start_time) >= cfg.duration:
break
obs = robot.get_observation()
obs_processed = ctx.robot_observation_processor(obs)
action_dict = None
if engine.is_rtc:
engine.update_observation(obs_processed)
if cfg.use_torch_compile and not engine.compile_warmup_done.is_set():
dt = time.perf_counter() - loop_start
if (sleep_t := control_interval - dt) > 0:
precise_sleep(sleep_t)
continue
if cfg.use_torch_compile and not warmup_flushed:
engine.reset()
interpolator.reset()
warmup_flushed = True
if interpolator.needs_new_action():
action_tensor = engine.consume_rtc_action()
if action_tensor is not None:
interpolator.add(action_tensor.cpu())
interp = interpolator.get()
if interp is not None:
action_dict = {
k: interp[i].item() for i, k in enumerate(ordered_keys) if i < len(interp)
}
processed = ctx.robot_action_processor((action_dict, obs))
robot.send_action(processed)
else:
obs_frame = build_dataset_frame(ctx.dataset_features, obs_processed, prefix=OBS_STR)
action_tensor = engine.get_action_sync(obs_frame)
action_dict = make_robot_action(action_tensor, ctx.dataset_features)
processed = ctx.robot_action_processor((action_dict, obs))
robot.send_action(processed)
# Build frame for ring buffer / live recording
if action_dict is not None:
obs_frame = build_dataset_frame(ctx.dataset_features, obs_processed, prefix=OBS_STR)
action_frame = build_dataset_frame(ctx.dataset_features, action_dict, prefix=ACTION)
frame = {**obs_frame, **action_frame, "task": task_str}
# Handle save key toggle
if self._save_requested:
self._save_requested = False
if not self._recording_live:
logger.info(
"Flushing ring buffer (%d frames) + starting live recording", len(ring)
)
for buffered_frame in ring.drain():
dataset.add_frame(buffered_frame)
self._recording_live = True
else:
dataset.add_frame(frame)
dataset.save_episode()
logger.info("Episode saved")
self._recording_live = False
engine.reset()
interpolator.reset()
if engine.is_rtc:
engine.resume()
if self._recording_live:
dataset.add_frame(frame)
else:
ring.append(frame)
dt = time.perf_counter() - loop_start
if (sleep_t := control_interval - dt) > 0:
precise_sleep(sleep_t)
finally:
if self._recording_live:
try:
dataset.save_episode()
except Exception:
pass
def teardown(self, ctx: RolloutContext) -> None:
if self._engine is not None:
self._engine.stop()
if self._listener is not None:
self._listener.stop()
if ctx.dataset is not None:
ctx.dataset.finalize()
if ctx.cfg.dataset and ctx.cfg.dataset.push_to_hub:
ctx.dataset.push_to_hub(
tags=ctx.cfg.dataset.tags,
private=ctx.cfg.dataset.private,
)
if ctx.robot.is_connected:
ctx.robot.disconnect()
if ctx.teleop is not None and ctx.teleop.is_connected:
ctx.teleop.disconnect()
logger.info("Highlight strategy teardown complete")
def _setup_keyboard(self) -> None:
"""Set up keyboard listener for the save key."""
from lerobot.common.control_utils import is_headless
if is_headless():
logger.warning("Headless environment — highlight save key unavailable")
return
try:
from pynput import keyboard
save_key = self.config.save_key
def on_press(key):
try:
if hasattr(key, "char") and key.char == save_key:
self._save_requested = True
elif key == keyboard.Key.esc:
self._save_requested = False
except Exception:
pass
self._listener = keyboard.Listener(on_press=on_press)
self._listener.start()
except ImportError:
logger.warning("pynput not available — keyboard listener disabled")
+227
View File
@@ -0,0 +1,227 @@
# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Sentry rollout strategy: continuous autonomous recording with auto-upload."""
from __future__ import annotations
import logging
import time
from threading import Thread
from lerobot.datasets import VideoEncodingManager
from lerobot.policies.rtc import ActionInterpolator
from lerobot.policies.utils import make_robot_action
from lerobot.utils.constants import ACTION, OBS_STR
from lerobot.utils.feature_utils import build_dataset_frame
from lerobot.utils.robot_utils import precise_sleep
from ..configs import SentryStrategyConfig
from ..context import RolloutContext
from ..inference import InferenceEngine, _resolve_action_key_order
from . import RolloutStrategy
logger = logging.getLogger(__name__)
class SentryStrategy(RolloutStrategy):
"""Continuous autonomous rollout with always-on recording.
Episodes are auto-rotated every ``episode_duration_s`` seconds.
The dataset is pushed to Hub in the background every
``upload_every_n_episodes`` episodes.
Requires ``streaming_encoding=True`` (enforced in config validation)
to prevent disk I/O from blocking the control loop.
All actions flow through ``robot_observation_processor`` (observations)
and ``robot_action_processor`` (actions) before reaching the robot,
supporting EE-space recording with joint-space robots.
"""
config: SentryStrategyConfig
def __init__(self, config: SentryStrategyConfig):
super().__init__(config)
self._engine: InferenceEngine | None = None
self._push_thread: Thread | None = None
def setup(self, ctx: RolloutContext) -> None:
interpolator = ActionInterpolator(multiplier=ctx.cfg.interpolation_multiplier)
self._engine = InferenceEngine(
policy=ctx.policy,
preprocessor=ctx.preprocessor,
postprocessor=ctx.postprocessor,
robot_wrapper=ctx.robot_wrapper,
rtc_config=ctx.cfg.rtc,
hw_features=ctx.hw_features,
action_keys=ctx.action_keys,
task=ctx.cfg.task,
fps=ctx.cfg.fps,
device=ctx.cfg.device,
interpolator=interpolator,
use_torch_compile=ctx.cfg.use_torch_compile,
compile_warmup_inferences=ctx.cfg.compile_warmup_inferences,
)
self._engine.start()
logger.info(
"Sentry strategy ready (episode_duration=%.0fs, upload_every=%d eps)",
self.config.episode_duration_s,
self.config.upload_every_n_episodes,
)
def run(self, ctx: RolloutContext) -> None:
engine = self._engine
cfg = ctx.cfg
robot = ctx.robot_wrapper
dataset = ctx.dataset
action_keys = ctx.action_keys
interpolator = ActionInterpolator(multiplier=cfg.interpolation_multiplier)
control_interval = interpolator.get_control_interval(cfg.fps)
policy_action_names = getattr(cfg.policy, "action_feature_names", None)
ordered_keys = _resolve_action_key_order(
list(policy_action_names) if policy_action_names else None,
action_keys,
)
if engine.is_rtc:
engine.resume()
start_time = time.perf_counter()
episode_start = time.perf_counter()
episodes_since_push = 0
warmup_flushed = False
task_str = cfg.dataset.single_task if cfg.dataset else cfg.task
with VideoEncodingManager(dataset):
try:
while not ctx.shutdown_event.is_set():
loop_start = time.perf_counter()
if cfg.duration > 0 and (time.perf_counter() - start_time) >= cfg.duration:
break
obs = robot.get_observation()
obs_processed = ctx.robot_observation_processor(obs)
action_dict = None
if engine.is_rtc:
engine.update_observation(obs_processed)
if cfg.use_torch_compile and not engine.compile_warmup_done.is_set():
dt = time.perf_counter() - loop_start
if (sleep_t := control_interval - dt) > 0:
precise_sleep(sleep_t)
continue
if cfg.use_torch_compile and not warmup_flushed:
engine.reset()
interpolator.reset()
warmup_flushed = True
if interpolator.needs_new_action():
action_tensor = engine.consume_rtc_action()
if action_tensor is not None:
interpolator.add(action_tensor.cpu())
interp = interpolator.get()
if interp is not None:
action_dict = {
k: interp[i].item() for i, k in enumerate(ordered_keys) if i < len(interp)
}
processed = ctx.robot_action_processor((action_dict, obs))
robot.send_action(processed)
else:
obs_frame = build_dataset_frame(ctx.dataset_features, obs_processed, prefix=OBS_STR)
action_tensor = engine.get_action_sync(obs_frame)
action_dict = make_robot_action(action_tensor, ctx.dataset_features)
processed = ctx.robot_action_processor((action_dict, obs))
robot.send_action(processed)
# Record frame
if action_dict is not None:
obs_frame = build_dataset_frame(ctx.dataset_features, obs_processed, prefix=OBS_STR)
action_frame = build_dataset_frame(ctx.dataset_features, action_dict, prefix=ACTION)
frame = {**obs_frame, **action_frame, "task": task_str}
dataset.add_frame(frame)
# Auto-rotate episodes
elapsed = time.perf_counter() - episode_start
if elapsed >= self.config.episode_duration_s:
dataset.save_episode()
episodes_since_push += 1
logger.info("Episode saved (total: %d)", dataset.num_episodes)
if episodes_since_push >= self.config.upload_every_n_episodes:
self._background_push(dataset, cfg)
episodes_since_push = 0
episode_start = time.perf_counter()
engine.reset()
interpolator.reset()
if engine.is_rtc:
engine.resume()
dt = time.perf_counter() - loop_start
if (sleep_t := control_interval - dt) > 0:
precise_sleep(sleep_t)
finally:
try:
dataset.save_episode()
except Exception:
pass
def teardown(self, ctx: RolloutContext) -> None:
if self._engine is not None:
self._engine.stop()
if ctx.dataset is not None:
ctx.dataset.finalize()
if ctx.cfg.dataset and ctx.cfg.dataset.push_to_hub:
ctx.dataset.push_to_hub(
tags=ctx.cfg.dataset.tags,
private=ctx.cfg.dataset.private,
)
if self._push_thread is not None and self._push_thread.is_alive():
self._push_thread.join(timeout=60)
if ctx.robot.is_connected:
ctx.robot.disconnect()
if ctx.teleop is not None and ctx.teleop.is_connected:
ctx.teleop.disconnect()
logger.info("Sentry strategy teardown complete")
def _background_push(self, dataset, cfg) -> None:
"""Push dataset to hub in a background thread (non-blocking)."""
if self._push_thread is not None and self._push_thread.is_alive():
logger.info("Previous push still in progress, skipping")
return
def _push():
try:
dataset.push_to_hub(
tags=cfg.dataset.tags if cfg.dataset else None,
private=cfg.dataset.private if cfg.dataset else False,
)
logger.info("Background push to hub complete")
except Exception as e:
logger.error("Background push failed: %s", e)
self._push_thread = Thread(target=_push, daemon=True)
self._push_thread.start()
+61 -190
View File
@@ -13,58 +13,54 @@
# limitations under the License.
"""
Records a dataset. Actions for the robot can be either generated by teleoperation or by a policy.
Records a dataset via teleoperation. This is a pure data-collection
tool no policy inference. For deploying trained policies, use
``lerobot-rollout`` instead.
Requires: pip install 'lerobot[core_scripts]' (includes dataset + hardware + viz extras)
Example:
```shell
lerobot-record \
--robot.type=so100_follower \
--robot.port=/dev/tty.usbmodem58760431541 \
--robot.cameras="{laptop: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
--robot.id=black \
--dataset.repo_id=<my_username>/<my_dataset_name> \
--dataset.num_episodes=2 \
--dataset.single_task="Grab the cube" \
--dataset.streaming_encoding=true \
--dataset.encoder_threads=2 \
lerobot-record \\
--robot.type=so100_follower \\
--robot.port=/dev/tty.usbmodem58760431541 \\
--robot.cameras="{laptop: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \\
--robot.id=black \\
--teleop.type=so100_leader \\
--teleop.port=/dev/tty.usbmodem58760431551 \\
--teleop.id=blue \\
--dataset.repo_id=<my_username>/<my_dataset_name> \\
--dataset.num_episodes=2 \\
--dataset.single_task="Grab the cube" \\
--dataset.streaming_encoding=true \\
--dataset.encoder_threads=2 \\
--display_data=true
# <- Optional: specify video codec (auto, h264, hevc, libsvtav1). Default is libsvtav1. \
# --dataset.vcodec=h264 \
# <- Teleop optional if you want to teleoperate to record or in between episodes with a policy \
# --teleop.type=so100_leader \
# --teleop.port=/dev/tty.usbmodem58760431551 \
# --teleop.id=blue \
# <- Policy optional if you want to record with a policy \
# --policy.path=${HF_USER}/my_policy \
```
Example recording with bimanual so100:
```shell
lerobot-record \
--robot.type=bi_so_follower \
--robot.left_arm_config.port=/dev/tty.usbmodem5A460822851 \
--robot.right_arm_config.port=/dev/tty.usbmodem5A460814411 \
--robot.id=bimanual_follower \
lerobot-record \\
--robot.type=bi_so_follower \\
--robot.left_arm_config.port=/dev/tty.usbmodem5A460822851 \\
--robot.right_arm_config.port=/dev/tty.usbmodem5A460814411 \\
--robot.id=bimanual_follower \\
--robot.left_arm_config.cameras='{
wrist: {"type": "opencv", "index_or_path": 1, "width": 640, "height": 480, "fps": 30},
top: {"type": "opencv", "index_or_path": 3, "width": 640, "height": 480, "fps": 30},
}' --robot.right_arm_config.cameras='{
wrist: {"type": "opencv", "index_or_path": 2, "width": 640, "height": 480, "fps": 30},
front: {"type": "opencv", "index_or_path": 4, "width": 640, "height": 480, "fps": 30},
}' \
--teleop.type=bi_so_leader \
--teleop.left_arm_config.port=/dev/tty.usbmodem5A460852721 \
--teleop.right_arm_config.port=/dev/tty.usbmodem5A460819811 \
--teleop.id=bimanual_leader \
--display_data=true \
--dataset.repo_id=${HF_USER}/bimanual-so-handover-cube \
--dataset.num_episodes=25 \
--dataset.single_task="Grab and handover the red cube to the other arm" \
--dataset.streaming_encoding=true \
# --dataset.vcodec=auto \
}' \\
--teleop.type=bi_so_leader \\
--teleop.left_arm_config.port=/dev/tty.usbmodem5A460852721 \\
--teleop.right_arm_config.port=/dev/tty.usbmodem5A460819811 \\
--teleop.id=bimanual_leader \\
--display_data=true \\
--dataset.repo_id=${HF_USER}/bimanual-so-handover-cube \\
--dataset.num_episodes=25 \\
--dataset.single_task="Grab and handover the red cube to the other arm" \\
--dataset.streaming_encoding=true \\
--dataset.encoder_threads=2
```
"""
@@ -74,9 +70,6 @@ import time
from dataclasses import asdict, dataclass, field
from pathlib import Path
from pprint import pformat
from typing import Any
import torch
from lerobot.cameras import CameraConfig # noqa: F401
from lerobot.cameras.opencv import OpenCVCameraConfig # noqa: F401
@@ -86,11 +79,9 @@ from lerobot.cameras.zmq import ZMQCameraConfig # noqa: F401
from lerobot.common.control_utils import (
init_keyboard_listener,
is_headless,
predict_action,
sanity_check_dataset_name,
sanity_check_dataset_robot_compatibility,
)
from lerobot.configs import PreTrainedConfig, parser
from lerobot.configs import parser
from lerobot.datasets import (
LeRobotDataset,
VideoEncodingManager,
@@ -98,21 +89,11 @@ from lerobot.datasets import (
create_initial_features,
safe_stop_image_writer,
)
from lerobot.policies import (
ActionInterpolator,
PreTrainedPolicy,
make_policy,
make_pre_post_processors,
make_robot_action,
)
from lerobot.processor import (
PolicyAction,
PolicyProcessorPipeline,
RobotAction,
RobotObservation,
RobotProcessorPipeline,
make_default_processors,
rename_stats,
)
from lerobot.robots import ( # noqa: F401
Robot,
@@ -146,7 +127,6 @@ from lerobot.teleoperators import ( # noqa: F401
)
from lerobot.teleoperators.keyboard import KeyboardTeleop
from lerobot.utils.constants import ACTION, OBS_STR
from lerobot.utils.device_utils import get_safe_torch_device
from lerobot.utils.feature_utils import build_dataset_frame, combine_feature_dicts
from lerobot.utils.import_utils import register_third_party_plugins
from lerobot.utils.robot_utils import precise_sleep
@@ -218,10 +198,8 @@ class DatasetRecordConfig:
class RecordConfig:
robot: RobotConfig
dataset: DatasetRecordConfig
# Whether to control the robot with a teleoperator
# Teleoperator to control the robot (required)
teleop: TeleoperatorConfig | None = None
# Whether to control the robot with a policy
policy: PreTrainedConfig | None = None
# Display all cameras on screen
display_data: bool = False
# Display data on a remote Rerun server
@@ -234,27 +212,14 @@ class RecordConfig:
play_sounds: bool = True
# Resume recording on an existing dataset.
resume: bool = False
# Action interpolation multiplier for smoother policy control (1=off, 2=2x, 3=3x)
# Only applies when using a policy (not teleop)
interpolation_multiplier: int = 1
def __post_init__(self):
# HACK: We parse again the cli args here to get the pretrained path if there was one.
policy_path = parser.get_path_arg("policy")
if policy_path:
cli_overrides = parser.get_cli_overrides("policy")
self.policy = PreTrainedConfig.from_pretrained(policy_path, cli_overrides=cli_overrides)
self.policy.pretrained_path = policy_path
if self.teleop is None and self.policy is None:
raise ValueError("Choose a policy, a teleoperator or both to control the robot")
@classmethod
def __get_path_fields__(cls) -> list[str]:
"""This enables the parser to load config from the policy using `--policy.path=local/dir`"""
return ["policy"]
if self.teleop is None:
raise ValueError(
"A teleoperator is required for recording. "
"Use --teleop.type=... to specify one. "
"For policy-based deployment, use lerobot-rollout instead."
)
""" --------------- record_loop() data flow --------------------------
@@ -264,18 +229,14 @@ class RecordConfig:
V
[ robot_observation_processor ] ---> processed_obs
V
.-----( ACTION LOGIC )------------------.
V V
[ From Teleoperator ] [ From Policy ]
| |
| [teleop.get_action] -> raw_action | [predict_action]
| | | |
| V | V
| [teleop_action_processor] | |
| | | |
'---> processed_teleop_action '---> processed_policy_action
| |
'-------------------------.-------------'
[ Teleoperator ]
|
| [teleop.get_action] -> raw_action
| |
| V
| [teleop_action_processor]
| |
'---> processed_teleop_action
V
[ robot_action_processor ] --> robot_action_to_send
V
@@ -303,13 +264,9 @@ def record_loop(
], # runs after robot
dataset: LeRobotDataset | None = None,
teleop: Teleoperator | list[Teleoperator] | None = None,
policy: PreTrainedPolicy | None = None,
preprocessor: PolicyProcessorPipeline[dict[str, Any], dict[str, Any]] | None = None,
postprocessor: PolicyProcessorPipeline[PolicyAction, PolicyAction] | None = None,
control_time_s: int | None = None,
single_task: str | None = None,
display_data: bool = False,
interpolator: ActionInterpolator | None = None,
display_compressed_images: bool = False,
):
if dataset is not None and dataset.fps != fps:
@@ -340,21 +297,7 @@ def record_loop(
"For multi-teleop, the list must contain exactly one KeyboardTeleop and one arm teleoperator. Currently only supported for LeKiwi robot."
)
# Reset policy and processor if they are provided
if policy is not None and preprocessor is not None and postprocessor is not None:
policy.reset()
preprocessor.reset()
postprocessor.reset()
# Reset interpolator if provided
if interpolator is not None:
interpolator.reset()
# Calculate control interval based on interpolation
use_interpolation = interpolator is not None and interpolator.enabled and policy is not None
control_interval = interpolator.get_control_interval(fps) if interpolator else 1 / fps
# Pre-compute action key order outside the hot loop — it won't change mid-episode.
action_keys = sorted(robot.action_features) if use_interpolation else []
control_interval = 1 / fps
no_action_count = 0
timestamp = 0
@@ -372,63 +315,11 @@ def record_loop(
# Applies a pipeline to the raw robot observation, default is IdentityProcessor
obs_processed = robot_observation_processor(obs)
if policy is not None or dataset is not None:
if dataset is not None:
observation_frame = build_dataset_frame(dataset.features, obs_processed, prefix=OBS_STR)
# Track whether this iteration should be recorded to the dataset.
# Interpolated-only iterations send actions to the robot but don't record frames,
# keeping the dataset at the original fps while the robot moves at the higher rate.
is_record_frame = True
# Get action from either policy or teleop
if policy is not None and preprocessor is not None and postprocessor is not None:
# With interpolation: only call policy when interpolator needs new action
if use_interpolation:
ran_inference = False
if interpolator.needs_new_action():
action_values = predict_action(
observation=observation_frame,
policy=policy,
device=get_safe_torch_device(policy.config.device),
preprocessor=preprocessor,
postprocessor=postprocessor,
use_amp=policy.config.use_amp,
task=single_task,
robot_type=robot.robot_type,
)
act_processed_policy = make_robot_action(action_values, dataset.features)
robot_action_to_send = robot_action_processor((act_processed_policy, obs))
action_tensor = torch.tensor([robot_action_to_send[k] for k in action_keys])
interpolator.add(action_tensor)
ran_inference = True
interp_action = interpolator.get()
if interp_action is not None:
robot_action_to_send = {k: interp_action[i].item() for i, k in enumerate(action_keys)}
action_values = robot_action_to_send
else:
continue
is_record_frame = ran_inference
else:
action_values = predict_action(
observation=observation_frame,
policy=policy,
device=get_safe_torch_device(policy.config.device),
preprocessor=preprocessor,
postprocessor=postprocessor,
use_amp=policy.config.use_amp,
task=single_task,
robot_type=robot.robot_type,
)
act_processed_policy: RobotAction = make_robot_action(action_values, dataset.features)
# Applies a pipeline to the action, default is IdentityProcessor
robot_action_to_send = robot_action_processor((act_processed_policy, obs))
action_values = robot_action_to_send
elif policy is None and isinstance(teleop, Teleoperator):
# Get action from teleop
if isinstance(teleop, Teleoperator):
act = teleop.get_action()
if robot.name == "unitree_g1":
teleop.send_feedback(obs)
@@ -438,7 +329,7 @@ def record_loop(
action_values = act_processed_teleop
robot_action_to_send = robot_action_processor((act_processed_teleop, obs))
elif policy is None and isinstance(teleop, list):
elif isinstance(teleop, list):
arm_action = teleop_arm.get_action()
arm_action = {f"arm_{k}": v for k, v in arm_action.items()}
keyboard_action = teleop_keyboard.get_action()
@@ -451,7 +342,7 @@ def record_loop(
no_action_count += 1
if no_action_count == 1 or no_action_count % 10 == 0:
logging.warning(
"No policy or teleoperator provided, skipping action generation. "
"No teleoperator provided, skipping action generation. "
"This is likely to happen when resetting the environment without a teleop device. "
"The robot won't be at its rest position at the start of the next episode."
)
@@ -463,8 +354,8 @@ def record_loop(
# TODO(steven, pepijn, adil): we should use a pipeline step to clip the action, so the sent action is the action that we input to the robot.
_sent_action = robot.send_action(robot_action_to_send)
# Write to dataset (only on real policy frames, not interpolated-only iterations)
if dataset is not None and is_record_frame:
# Write to dataset
if dataset is not None:
action_frame = build_dataset_frame(dataset.features, action_values, prefix=ACTION)
frame = {**observation_frame, **action_frame, "task": single_task}
dataset.add_frame(frame)
@@ -540,8 +431,12 @@ def record(cfg: RecordConfig) -> LeRobotDataset:
)
sanity_check_dataset_robot_compatibility(dataset, robot, cfg.dataset.fps, dataset_features)
else:
# Create empty dataset or load existing saved episodes
sanity_check_dataset_name(cfg.dataset.repo_id, cfg.policy)
# Reject eval_ prefix — for policy evaluation use lerobot-rollout
if cfg.dataset.repo_id.startswith("eval_"):
raise ValueError(
"Dataset names starting with 'eval_' are reserved for policy evaluation. "
"lerobot-record is for data collection only. Use lerobot-rollout for policy deployment."
)
dataset = LeRobotDataset.create(
cfg.dataset.repo_id,
cfg.dataset.fps,
@@ -558,26 +453,6 @@ def record(cfg: RecordConfig) -> LeRobotDataset:
encoder_threads=cfg.dataset.encoder_threads,
)
# Load pretrained policy
policy = None if cfg.policy is None else make_policy(cfg.policy, ds_meta=dataset.meta)
preprocessor = None
postprocessor = None
interpolator = None
if cfg.policy is not None:
preprocessor, postprocessor = make_pre_post_processors(
policy_cfg=cfg.policy,
pretrained_path=cfg.policy.pretrained_path,
dataset_stats=rename_stats(dataset.meta.stats, cfg.dataset.rename_map),
preprocessor_overrides={
"device_processor": {"device": cfg.policy.device},
"rename_observations_processor": {"rename_map": cfg.dataset.rename_map},
},
)
# Create interpolator for smoother policy control
if cfg.interpolation_multiplier > 1:
interpolator = ActionInterpolator(multiplier=cfg.interpolation_multiplier)
logging.info(f"Action interpolation enabled: {cfg.interpolation_multiplier}x control rate")
robot.connect()
if teleop is not None:
teleop.connect()
@@ -601,14 +476,10 @@ def record(cfg: RecordConfig) -> LeRobotDataset:
robot_action_processor=robot_action_processor,
robot_observation_processor=robot_observation_processor,
teleop=teleop,
policy=policy,
preprocessor=preprocessor,
postprocessor=postprocessor,
dataset=dataset,
control_time_s=cfg.dataset.episode_time_s,
single_task=cfg.dataset.single_task,
display_data=cfg.display_data,
interpolator=interpolator,
display_compressed_images=display_compressed_images,
)
+133
View File
@@ -0,0 +1,133 @@
#!/usr/bin/env python
# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Policy deployment engine with pluggable rollout strategies.
``lerobot-rollout`` is the single CLI for running trained policies on
real robots. It uses a **Strategy Pattern** to provide completely
isolated, mutually exclusive execution loops:
--strategy.type=base 24/7 autonomous rollout (no recording)
--strategy.type=sentry Continuous recording with auto-upload
--strategy.type=highlight Ring buffer + keystroke save
--strategy.type=dagger Human-in-the-loop (DAgger/RaC)
All strategies accept ``--rtc.enabled=true`` for asynchronous inference
with slow VLA models (Pi0, Pi0.5, SmolVLA).
Usage examples::
# Base mode (sync inference)
lerobot-rollout \\
--strategy.type=base \\
--policy.path=lerobot/act_koch_real \\
--robot.type=koch_follower \\
--task="pick up cube" --duration=30
# Base mode (RTC for slow VLAs)
lerobot-rollout \\
--strategy.type=base \\
--policy.path=lerobot/pi0_base \\
--rtc.enabled=true --rtc.execution_horizon=10 \\
--robot.type=so100_follower \\
--task="pick up cube" --duration=60
# Sentry mode (continuous recording)
lerobot-rollout \\
--strategy.type=sentry \\
--strategy.episode_duration_s=120 \\
--strategy.upload_every_n_episodes=5 \\
--policy.path=lerobot/pi0_base \\
--rtc.enabled=true \\
--robot.type=so100_follower \\
--dataset.repo_id=user/sentry-data \\
--dataset.single_task="patrol" --duration=3600
# DAgger mode (human-in-the-loop)
lerobot-rollout \\
--strategy.type=dagger \\
--policy.path=outputs/pretrain/checkpoints/last/pretrained_model \\
--robot.type=bi_openarm_follower \\
--teleop.type=openarm_mini \\
--dataset.repo_id=user/hil-data \\
--dataset.single_task="Fold the T-shirt"
"""
import logging
from lerobot.cameras.opencv import OpenCVCameraConfig # noqa: F401
from lerobot.cameras.realsense import RealSenseCameraConfig # noqa: F401
from lerobot.cameras.zmq import ZMQCameraConfig # noqa: F401
from lerobot.configs import parser
from lerobot.rl.process import ProcessSignalHandler
from lerobot.robots import ( # noqa: F401
bi_openarm_follower,
bi_so_follower,
koch_follower,
so_follower,
unitree_g1,
)
from lerobot.rollout.configs import RolloutConfig
from lerobot.rollout.context import build_rollout_context
from lerobot.rollout.strategies import create_strategy
from lerobot.teleoperators import ( # noqa: F401
bi_openarm_leader,
bi_so_leader,
koch_leader,
openarm_leader,
openarm_mini,
so_leader,
unitree_g1 as unitree_g1_teleop,
)
from lerobot.utils.utils import init_logging
logger = logging.getLogger(__name__)
@parser.wrap()
def rollout(cfg: RolloutConfig):
"""Main entry point for policy deployment."""
init_logging()
signal_handler = ProcessSignalHandler(use_threads=True, display_pid=False)
shutdown_event = signal_handler.shutdown_event
logger.info("Building rollout context...")
ctx = build_rollout_context(cfg, shutdown_event)
strategy = create_strategy(cfg.strategy)
logger.info("Strategy: %s", cfg.strategy.type)
try:
strategy.setup(ctx)
strategy.run(ctx)
except KeyboardInterrupt:
logger.info("Interrupted by user")
finally:
strategy.teardown(ctx)
logger.info("Rollout finished")
def main():
from lerobot.utils.import_utils import register_third_party_plugins
register_third_party_plugins()
rollout()
if __name__ == "__main__":
main()
-82
View File
@@ -24,10 +24,6 @@ def lerobot_train(args):
return run_command(cmd="lerobot-train", module="lerobot_train", args=args)
def lerobot_record(args):
return run_command(cmd="lerobot-record", module="lerobot_record", args=args)
def resolve_model_id_for_peft_training(policy_type):
"""PEFT training needs pretrained models, this finds the pretrained model of a policy type for PEFT training."""
if policy_type == "smolvla":
@@ -155,81 +151,3 @@ def test_peft_training_params_are_fewer(policy_type, tmp_path):
f"--output_dir={output_dir}",
]
)
class DummyRobot:
name = "dummy"
cameras = []
action_features = {"foo": 1.0, "bar": 2.0}
observation_features = {"obs1": 1.0, "obs2": 2.0}
is_connected = True
def connect(self, *args):
pass
def disconnect(self):
pass
def dummy_make_robot_from_config(*args, **kwargs):
return DummyRobot()
@pytest.mark.parametrize("policy_type", ["smolvla"])
@skip_if_package_missing("peft")
def test_peft_record_loads_policy(policy_type, tmp_path):
"""Train a policy with PEFT and attempt to load it with `lerobot-record`."""
from peft import PeftModel
output_dir = tmp_path / f"output_{policy_type}"
model_id = resolve_model_id_for_peft_training(policy_type)
lerobot_train(
[
f"--policy.path={model_id}",
"--policy.push_to_hub=false",
"--policy.input_features=null",
"--policy.output_features=null",
"--peft.method=LORA",
"--dataset.repo_id=lerobot/pusht",
"--dataset.episodes=[0, 1]",
"--steps=1",
f"--output_dir={output_dir}",
]
)
policy_dir = output_dir / "checkpoints" / "last" / "pretrained_model"
dataset_dir = tmp_path / "eval_pusht"
single_task = "move the table"
loaded_policy = None
def dummy_record_loop(*args, **kwargs):
nonlocal loaded_policy
if "dataset" not in kwargs:
return
dataset = kwargs["dataset"]
dataset.add_frame({"task": single_task})
loaded_policy = kwargs["policy"]
with (
patch("lerobot.scripts.lerobot_record.make_robot_from_config", dummy_make_robot_from_config),
# disable record loop since we're only interested in successful loading of the policy.
patch("lerobot.scripts.lerobot_record.record_loop", dummy_record_loop),
# disable speech output
patch("lerobot.utils.utils.say"),
):
lerobot_record(
[
f"--policy.path={policy_dir}",
"--robot.type=so101_follower",
"--robot.port=/dev/null",
"--dataset.repo_id=lerobot/eval_pusht",
f'--dataset.single_task="{single_task}"',
f"--dataset.root={dataset_dir}",
"--dataset.push_to_hub=false",
]
)
assert isinstance(loaded_policy, PeftModel)