[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci
This commit is contained in:
pre-commit-ci[bot]
2025-08-07 11:31:35 +00:00
committed by Adil Zouitine
parent a14af62ee3
commit 7124d471c1
@@ -3,16 +3,19 @@
In robotics, there's a fundamental mismatch between the data that robots and humans produce and what machine learning models expect. This creates several translation challenges:
**Raw Robot Data → Model Input:**
- Robots output raw sensor data (camera images, joint positions, force readings) that needs normalization, batching, and device placement before models can process it
- Language instructions from humans ("pick up the red cube") must be tokenized into numerical representations
- Different robots use different coordinate systems and units that need standardization
**Model Output → Robot Commands:**
- Models might output end-effector positions, but robots need joint-space commands
- Teleoperators (like gamepads) produce relative movements (delta positions), but robots expect absolute commands
- Model predictions are often normalized and need to be converted back to real-world scales
**Cross-Domain Translation:**
- Training data from one robot setup needs adaptation for deployment on different hardware
- Models trained with specific camera configurations must work with new camera arrangements
- Datasets with different naming conventions need harmonization
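Two of these translations can be sketched in plain PyTorch. This is an illustration of the idea only, not the LeRobot API; the helper names `delta_to_absolute` and `unnormalize` are hypothetical:

```python
import torch

# Hypothetical helpers for illustration - LeRobot ships its own
# processor steps for these translations.

def delta_to_absolute(current_joints: torch.Tensor, delta: torch.Tensor) -> torch.Tensor:
    """Turn a relative (delta) teleoperator command into an absolute joint target."""
    return current_joints + delta

def unnormalize(action: torch.Tensor, mean: torch.Tensor, std: torch.Tensor) -> torch.Tensor:
    """Map a normalized model output back to real-world units."""
    return action * std + mean

current = torch.tensor([0.5, -0.2])
target = delta_to_absolute(current, torch.tensor([0.1, 0.1]))  # absolute command
real = unnormalize(
    torch.tensor([0.0, 1.0]),
    mean=torch.tensor([0.3, 0.3]),
    std=torch.tensor([0.2, 0.2]),
)
```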
@@ -24,6 +27,7 @@ Processors are the data transformation backbone of LeRobot. They handle all the
## What are Processors?
In robotics, data comes in many forms - images from cameras, joint positions from sensors, text instructions from users, and more. Each type of data requires specific transformations before a model can use it effectively. Models need this data to be:
- **Normalized**: Scaled to appropriate ranges for neural network processing
- **Batched**: Organized with proper dimensions for batch processing
- **Tokenized**: Text converted to numerical representations
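The first two requirements can be shown in a few lines of plain PyTorch. This is a hand-rolled sketch of what "normalized" and "batched" mean; in practice the processor steps below handle this for you:

```python
import torch

# A raw camera frame as a robot might produce it: uint8 pixels in [0, 255].
image = torch.randint(0, 256, (3, 224, 224), dtype=torch.uint8)

normalized = image.float() / 255.0  # "Normalized": scale pixels into [0, 1]
batched = normalized.unsqueeze(0)   # "Batched": add a leading batch dimension
```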
@@ -63,6 +67,7 @@ transition: EnvTransition = {
```
Each key in the transition has a specific purpose:
- **OBSERVATION**: All sensor data (images, states, proprioception)
- **ACTION**: The action to execute or that was executed
- **REWARD**: Reinforcement learning signal
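A minimal transition can be assembled like the sketch below. To keep the example self-contained, the keys are stubbed with plain strings; in LeRobot you would use the `TransitionKey` enum members instead:

```python
import torch

# Stand-ins for TransitionKey.OBSERVATION, TransitionKey.ACTION, TransitionKey.REWARD
OBSERVATION, ACTION, REWARD = "observation", "action", "reward"

transition = {
    OBSERVATION: {"observation.state": torch.zeros(6)},  # sensor data
    ACTION: torch.zeros(4),                              # command to execute
    REWARD: 0.0,                                         # RL signal
}
```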
@@ -82,27 +87,27 @@ import torch
class MyProcessorStep:
    """Example processor step interface - all methods must be implemented."""

    def __call__(self, transition: EnvTransition) -> EnvTransition:
        """Transform the transition - this is the main processing logic."""
        raise NotImplementedError

    def feature_contract(self, features: dict[str, PolicyFeature]) -> dict[str, PolicyFeature]:
        """Declare how this step transforms feature shapes/types."""
        raise NotImplementedError

    def get_config(self) -> dict[str, Any]:
        """Return JSON-serializable configuration for saving/loading."""
        raise NotImplementedError

    def state_dict(self) -> dict[str, torch.Tensor]:
        """Return any learnable parameters (tensors only)."""
        raise NotImplementedError

    def load_state_dict(self, state: dict[str, torch.Tensor]) -> None:
        """Load learnable parameters from saved state."""
        raise NotImplementedError

    def reset(self) -> None:
        """Reset any internal state between episodes."""
        raise NotImplementedError
@@ -123,7 +128,7 @@ processor = RobotProcessor(
        step3  # Third transformation
    ],
    name="my_preprocessing_pipeline",
    # Optional: Custom converters for input/output formats
    to_transition=custom_batch_to_transition,  # How to convert batch dict → EnvTransition
    to_output=custom_transition_to_batch  # How to convert EnvTransition → output format
@@ -146,7 +151,6 @@ output = processor(transition) # Stays as EnvTransition throughout
The `to_transition` and `to_output` converters enable seamless integration with existing codebases.
By default, they handle the standard LeRobot batch format, but you can customize them for different data structures.
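A minimal sketch of what such converters might look like is below. The exact signatures and transition structure in LeRobot may differ; `custom_batch_to_transition` and `custom_transition_to_batch` here are hypothetical, operating on plain dicts to show the shape of the idea:

```python
# Hypothetical converters: map a flat batch dict into a grouped transition
# dict and back. Keys and structure are illustrative, not LeRobot's exact API.

def custom_batch_to_transition(batch: dict) -> dict:
    """Split a flat batch into observation/action groups."""
    return {
        "observation": {k: v for k, v in batch.items() if k.startswith("observation.")},
        "action": batch.get("action"),
    }

def custom_transition_to_batch(transition: dict) -> dict:
    """Flatten a transition back into a single batch dict."""
    batch = dict(transition["observation"])
    if transition.get("action") is not None:
        batch["action"] = transition["action"]
    return batch

batch = {"observation.state": [0.0], "action": [1.0]}
roundtrip = custom_transition_to_batch(custom_batch_to_transition(batch))
```

A round-trip through both converters should leave the batch unchanged, which is a useful property to test when writing your own.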
### Data Format Conversion
Different data sources have different formats, but processors need a unified `EnvTransition` structure internally.
@@ -191,7 +195,7 @@ processor = RobotProcessor(
## Common Processor Steps
LeRobot provides a rich set of pre-built processor steps for common transformations.
Let's explore each in detail:
### Data Normalization
@@ -351,6 +355,7 @@ Different datasets and models may use different naming conventions.
The `RenameProcessor` solves this mismatch:
**Why is this useful?**
- When loading a model trained on a different dataset with different key names
- When using foundation models that expect specific key naming conventions
- When standardizing datasets from different sources
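Conceptually, the renaming is just a key mapping over the observation dict. A self-contained sketch of the operation (outside of LeRobot's actual `RenameProcessor` class, with made-up key names):

```python
# What a rename step does to observation keys, in plain Python.
rename_map = {
    "observation.images.cam_high": "observation.images.top",
    "observation.images.cam_low": "observation.images.wrist",
}

observation = {"observation.images.cam_high": "img0", "observation.state": "s"}

# Keys in the map are renamed; everything else passes through untouched.
renamed = {rename_map.get(k, k): v for k, v in observation.items()}
```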
@@ -414,17 +419,17 @@ preprocessor = RobotProcessor(
"observation.images.wrist": "observation.images.camera1"
}
),
# 2. Add batch dimensions for model
ToBatchProcessor(),
# 3. Tokenize language instructions if present
TokenizerProcessor(
tokenizer_name="google/paligemma-3b-pt-224",
max_length=64,
task_key="task"
),
# 4. Normalize numerical data
NormalizerProcessor(
features=policy_features,
@@ -435,7 +440,7 @@ preprocessor = RobotProcessor(
            },
            stats=dataset.meta.stats
        ),
        # 5. Move to GPU and convert to half precision
        DeviceProcessor(
            device="cuda:0",
@@ -450,7 +455,7 @@ postprocessor = RobotProcessor(
    steps=[
        # 1. Move back to CPU for robot hardware
        DeviceProcessor(device="cpu"),
        # 2. Denormalize actions to original scale
        UnnormalizerProcessor(
            features=policy_features,
@@ -489,15 +494,15 @@ for epoch in range(num_epochs):
    for batch in dataloader:
        # Preprocess batch
        processed_batch = preprocessor(batch)

        # Forward pass - returns loss and optional metrics
        loss, metrics = model.forward(processed_batch)

        # Backward pass
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Log metrics if available
        if metrics:
            wandb.log(metrics)
@@ -515,7 +520,7 @@ preprocessor = RobotProcessor.from_pretrained(
config_filename="robot_preprocessor.json"
)
postprocessor = RobotProcessor.from_pretrained(
    "path/to/model",
    config_filename="robot_postprocessor.json"
)
@@ -534,34 +539,34 @@ with torch.no_grad():
    while not done:
        # Get observation from robot
        observation = robot.get_observation()

        # Build dataset-compatible frame
        observation_frame = build_dataset_frame(
            dataset.features,
            observation,
            prefix="observation"
        )

        # Add task instruction to complementary data
        observation_frame["task"] = "pick up the red cube"

        # Preprocess for model
        model_input = preprocessor(observation_frame)

        # Run policy
        raw_action = policy.select_action(model_input)

        # Postprocess action
        action_transition = {TransitionKey.ACTION: raw_action}
        processed = postprocessor(action_transition)
        action = processed[TransitionKey.ACTION]

        # Convert to robot action format
        robot_action = {
            key: action[i].item()
            for i, key in enumerate(robot.action_features)
        }

        # Execute on robot
        robot.send_action(robot_action)
```
@@ -632,10 +637,10 @@ processor = RobotProcessor.from_pretrained(
    overrides={
        # Change device for different hardware
        "device_processor": {"device": "cuda:1"},
        # Update statistics for new dataset
        "normalizer_processor": {"stats": new_dataset.meta.stats},
        # Provide non-serializable objects (like tokenizers)
        "tokenizer_processor": {"tokenizer": custom_tokenizer}
    }
@@ -669,16 +674,16 @@ from lerobot.processor.pipeline import ProcessorStepRegistry, ObservationProcess
@ProcessorStepRegistry.register("my_company/gaussian_noise")
class GaussianNoiseProcessor(ObservationProcessor):
"""Add Gaussian noise to observations for robustness training."""
noise_std: float = 0.01
training_only: bool = True
is_training: bool = True
def observation(self, observation):
"""Add noise to observation tensors."""
if not self.is_training and self.training_only:
return observation
noisy_obs = {}
for key, value in observation.items():
if isinstance(value, torch.Tensor) and "image" not in key:
@@ -687,9 +692,9 @@ class GaussianNoiseProcessor(ObservationProcessor):
                noisy_obs[key] = value + noise
            else:
                noisy_obs[key] = value
        return noisy_obs

    def get_config(self):
        return {
            "noise_std": self.noise_std,
@@ -716,15 +721,15 @@ from lerobot.processor import ActionProcessor
@ProcessorStepRegistry.register("my_company/action_clipper")
class ActionClipProcessor(ActionProcessor):
"""Clip actions to safe ranges."""
min_value: float = -1.0
max_value: float = 1.0
def action(self, action):
"""Process only the action component."""
# No need to handle transition dict - base class does it
return torch.clamp(action, self.min_value, self.max_value)
def get_config(self):
return {"min_value": self.min_value, "max_value": self.max_value}
```
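The core of this step is a single `torch.clamp` call, which you can verify in isolation before dropping the step into a pipeline. With the default range of `[-1.0, 1.0]`:

```python
import torch

# The clipping ActionClipProcessor performs, shown standalone with the
# default min_value=-1.0, max_value=1.0.
raw_action = torch.tensor([-2.0, 0.5, 3.0])
clipped = torch.clamp(raw_action, -1.0, 1.0)
```

In a pipeline, you would simply add `ActionClipProcessor()` to the `steps` list of a `RobotProcessor`, as in the earlier examples.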
@@ -776,7 +781,7 @@ Use `step_through()` for detailed debugging of the transformation pipeline:
# Inspect data at each transformation stage
for i, intermediate in enumerate(processor.step_through(data)):
    print(f"\n=== After step {i} ===")

    # Check observation shapes
    obs = intermediate.get(TransitionKey.OBSERVATION)
    if obs:
@@ -786,7 +791,7 @@ for i, intermediate in enumerate(processor.step_through(data)):
f"dtype={value.dtype}, "
f"device={value.device}, "
f"range=[{value.min():.3f}, {value.max():.3f}]")
# Check action if present
action = intermediate.get(TransitionKey.ACTION)
if action is not None and isinstance(action, torch.Tensor):
@@ -818,6 +823,7 @@ variant_processor = RobotProcessor(
### 1. Order Matters
The sequence of processors is crucial. Follow this general order:
```python
# Preprocessing: Raw → Model-ready
1. Rename (standardize keys)
@@ -851,6 +857,7 @@ print(ProcessorStepRegistry.list()) # See all registered processors
### 3. Common Pitfalls and Solutions
**Tensor Device Mismatch:**
```python
# Problem: RuntimeError: Expected all tensors on same device
# Solution: Ensure DeviceProcessor is in pipeline
@@ -863,6 +870,7 @@ preprocessor = RobotProcessor(
```
**Missing Statistics:**
```python
# Problem: NormalizerProcessor has no stats
# Solution 1: Compute stats from dataset
@@ -897,6 +905,6 @@ Processors are the unsung heroes of robotics pipelines, handling the critical tr
- Create custom transformations for specialized tasks
Remember: good preprocessing is often the difference between a model that works in theory
and one that works in practice!
The modular pipeline approach ensures your transformations are testable, reproducible,
and portable across different robots and environments.