Mirror of https://github.com/huggingface/lerobot.git, synced 2026-05-22 12:09:42 +00:00
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Committed by Adil Zouitine
Parent: a14af62ee3
Commit: 7124d471c1
@@ -3,16 +3,19 @@
In robotics, there's a fundamental mismatch between the data that robots and humans produce and what machine learning models expect. This creates several translation challenges:

**Raw Robot Data → Model Input:**

- Robots output raw sensor data (camera images, joint positions, force readings) that need normalization, batching, and device placement before models can process them
- Language instructions from humans ("pick up the red cube") must be tokenized into numerical representations
- Different robots use different coordinate systems and units that need standardization

**Model Output → Robot Commands:**

- Models might output end-effector positions, but robots need joint-space commands
- Teleoperators (like gamepads) produce relative movements (delta positions), but robots expect absolute commands
- Model predictions are often normalized and need to be converted back to real-world scales

**Cross-Domain Translation:**

- Training data from one robot setup needs adaptation for deployment on different hardware
- Models trained with specific camera configurations must work with new camera arrangements
- Datasets with different naming conventions need harmonization
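
Some of the translation steps listed above can be sketched in plain Python. This is a minimal illustration, not LeRobot code: the helper names `normalize`, `unnormalize`, and `delta_to_absolute`, as well as the statistics used, are assumptions made for this example.

```python
# Hypothetical helpers for illustration; none of these names come from LeRobot.

def normalize(values, mean, std, eps=1e-8):
    """Scale raw sensor readings to roughly zero mean and unit variance."""
    return [(v - m) / (s + eps) for v, m, s in zip(values, mean, std)]


def unnormalize(values, mean, std, eps=1e-8):
    """Invert normalize(): map model-scale values back to real-world units."""
    return [v * (s + eps) + m for v, m, s in zip(values, mean, std)]


def delta_to_absolute(current, delta):
    """Turn a relative (delta) teleoperator command into an absolute target."""
    return [c + d for c, d in zip(current, delta)]


joint_positions = [0.10, -0.50, 1.20]  # radians, as read from the robot
mean = [0.0, -0.4, 1.0]                # assumed dataset statistics
std = [0.5, 0.2, 0.4]

model_input = normalize(joint_positions, mean, std)    # raw data -> model input
recovered = unnormalize(model_input, mean, std)        # model scale -> real units
target = delta_to_absolute(joint_positions, [0.05, 0.0, -0.10])
```

Because `unnormalize` inverts `normalize`, the same statistics have to be available on both the input and output side of the model.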
@@ -24,6 +27,7 @@ Processors are the data transformation backbone of LeRobot. They handle all the
## What are Processors?

In robotics, data comes in many forms - images from cameras, joint positions from sensors, text instructions from users, and more. Each type of data requires specific transformations before a model can use it effectively. Models need this data to be:

- **Normalized**: Scaled to appropriate ranges for neural network processing
- **Batched**: Organized with proper dimensions for batch processing
- **Tokenized**: Text converted to numerical representations
@@ -63,6 +67,7 @@ transition: EnvTransition = {
```

Each key in the transition has a specific purpose:

- **OBSERVATION**: All sensor data (images, states, proprioception)
- **ACTION**: The action to execute or that was executed
- **REWARD**: Reinforcement learning signal
@@ -146,7 +151,6 @@ output = processor(transition) # Stays as EnvTransition throughout
The `to_transition` and `to_output` converters enable seamless integration with existing codebases.
By default, they handle the standard LeRobot batch format, but you can customize them for different data structures.

### Data Format Conversion

Different data sources have different formats, but processors need a unified `EnvTransition` structure internally.
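
As a rough sketch of what such a conversion involves, the following pair of functions maps a flat batch dictionary to a nested transition and back. The function names, the `observation.` key prefix, and the lowercase transition keys are assumptions for illustration; the actual `to_transition` and `to_output` signatures are not shown in this excerpt.

```python
from typing import Any


def batch_to_transition(batch: dict[str, Any]) -> dict[str, Any]:
    """Group flat batch keys into a nested, transition-like dictionary."""
    observation = {k: v for k, v in batch.items() if k.startswith("observation.")}
    return {
        "observation": observation,
        "action": batch.get("action"),
        "reward": batch.get("reward"),
    }


def transition_to_batch(transition: dict[str, Any]) -> dict[str, Any]:
    """Flatten the transition back into the original batch layout."""
    batch = dict(transition["observation"])
    for key in ("action", "reward"):
        # Skip entries that were absent from the original batch.
        if transition.get(key) is not None:
            batch[key] = transition[key]
    return batch


batch = {
    "observation.state": [0.1, 0.2],
    "observation.image": "img",  # placeholder for an image tensor
    "action": [0.0, 1.0],
}
transition = batch_to_transition(batch)
round_trip = transition_to_batch(transition)
```

Keeping the two directions symmetric means a pipeline can accept and return the caller's own format while working on one structure internally.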
@@ -351,6 +355,7 @@ Different datasets and models may use different naming conventions.
The `RenameProcessor` solves this mismatch:

**Why is this useful?**

- When loading a model trained on a different dataset with different key names
- When using foundation models that expect specific key naming conventions
- When standardizing datasets from different sources
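
The idea behind key renaming can be sketched with a plain dictionary mapping. This is an illustration only, not the `RenameProcessor` implementation; `rename_keys` and all key names below are hypothetical.

```python
def rename_keys(observation, rename_map):
    """Return a copy of `observation` with keys renamed per `rename_map`.

    Keys that do not appear in `rename_map` are kept unchanged.
    """
    return {rename_map.get(key, key): value for key, value in observation.items()}


# Keys as one dataset might store them (hypothetical names).
dataset_obs = {"top_cam": "image-bytes", "joint_pos": [0.1, 0.2], "timestamp": 3.5}

# Keys as a model might expect them (hypothetical names).
rename_map = {
    "top_cam": "observation.images.top",
    "joint_pos": "observation.state",
}

model_obs = rename_keys(dataset_obs, rename_map)
```

Returning a new dictionary rather than mutating the input keeps the original observation available to later steps that still rely on the old names.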
@@ -818,6 +823,7 @@ variant_processor = RobotProcessor(
### 1. Order Matters

The sequence of processors is crucial. Follow this general order:

```python
# Preprocessing: Raw → Model-ready
1. Rename (standardize keys)
@@ -851,6 +857,7 @@ print(ProcessorStepRegistry.list()) # See all registered processors
### 3. Common Pitfalls and Solutions

**Tensor Device Mismatch:**

```python
# Problem: RuntimeError: Expected all tensors on same device
# Solution: Ensure DeviceProcessor is in pipeline
@@ -863,6 +870,7 @@ preprocessor = RobotProcessor(
```

**Missing Statistics:**

```python
# Problem: NormalizerProcessor has no stats
# Solution 1: Compute stats from dataset