mirror of https://github.com/huggingface/lerobot.git (synced 2026-05-15 08:39:49 +00:00)
chore(docs): Processor doc (#1685)
* chore(docs): initialize doc
* Added script for the second part of the processor doc
* precommit style nit
* improved part 2 of processor guide
* Add comprehensive documentation for processors in robotics
  - Introduced a detailed guide on processors, covering their role in transforming raw robot data into model-ready inputs and vice versa.
  - Explained core concepts such as EnvTransition, ProcessorStep, and RobotProcessor, along with their functionalities.
  - Included examples of common processor steps like normalization, device management, batch processing, and text tokenization.
  - Provided insights on building complete pipelines, integrating processors into training loops, and saving/loading configurations.
  - Emphasized best practices and advanced features for effective usage of processors in robotics applications.
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
* feat(docs): Enhance introduction to processors with additional converter functions
  - Updated the introduction to processors documentation to include default batch-to-transition and transition-to-batch converters.
  - Added detailed descriptions and examples for new specialized converter functions: `to_transition_teleop_action`, `to_transition_robot_observation`, `to_output_robot_action`, and `to_dataset_frame`.
  - Improved clarity on how these converters facilitate integration with existing robotics applications.
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
* Improved doc implement_your_own_pipeline
  - Use normalization processor as default example
  - Add section on transform features
  - Add section on overrides
* Add phone docs and use pipeline for robots/teleop docs
* Fix typo in documentation for adapters in robots/teleop section
* Enhance documentation for processors with detailed explanations and examples
  - Updated the introduction to processors, clarifying the role of `EnvTransition` and `ProcessorStep`.
  - Introduced `DataProcessorPipeline` as a generic orchestrator for chaining processor steps.
  - Added comprehensive descriptions of new converter functions and their applications.
  - Improved clarity on type safety and the differences between `RobotProcessorPipeline` and `PolicyProcessorPipeline`.
  - Included examples for various processing scenarios, emphasizing best practices for data handling in robotics.
* Enhance documentation for processor migration and debugging
  - Added detailed sections on the migration of models to the new `PolicyProcessorPipeline` system, including breaking changes and migration scripts.
  - Introduced a comprehensive guide for debugging processor pipelines, covering common issues, step-by-step inspection, and runtime monitoring techniques.
  - Updated examples to reflect new usage patterns and best practices for processor implementation and error handling.
  - Clarified the role of various processor steps and their configurations in the context of robotics applications.

Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Pepijn <pepijn@huggingface.co>
```diff
@@ -24,9 +24,16 @@
   - local: smolvla
     title: Finetune SmolVLA
   title: "Policies"
 - sections:
   - local: introduction_processors
     title: Introduction to Robot Processors
   - local: implement_your_own_processor
     title: Implement your own processor
   - local: processors_robots_teleop
     title: Processors for Robots and Teleoperators
   title: "Robot Processors"
 - sections:
   - local: hope_jr
     title: Hope Jr
   - local: so101
     title: SO-101
   - local: so100
@@ -35,9 +42,15 @@
     title: Koch v1.1
   - local: lekiwi
     title: LeKiwi
   - local: hope_jr
     title: Hope Jr
   - local: reachy2
     title: Reachy 2
   title: "Robots"
 - sections:
   - local: phone_teleop
     title: Phone
   title: "Teleoperators"
 - sections:
   - local: notebooks
     title: Notebooks
```
# Backward compatibility

## Policy Normalization Migration (PR #1452)

**Breaking Change**: LeRobot policies no longer have built-in normalization layers embedded in their weights. Normalization is now handled by external `PolicyProcessorPipeline` components.

### What changed?

| | Before PR #1452 | After PR #1452 |
| -------------------------- | ------------------------------------------------ | ------------------------------------------------------------ |
| **Normalization Location** | Embedded in model weights (`normalize_inputs.*`) | External `PolicyProcessorPipeline` components |
| **Model State Dict** | Contains normalization statistics | **Clean weights only** - no normalization parameters |
| **Usage** | `policy(batch)` handles everything | `preprocessor(batch)` → `policy(...)` → `postprocessor(...)` |

### Impact on existing models

- Models trained **before** PR #1452 have normalization embedded in their weights
- These models need migration to work with the new `PolicyProcessorPipeline` system
- The migration extracts normalization statistics and creates separate processor pipelines

### Migrating old models

Use the migration script to convert models with embedded normalization:

```shell
python src/lerobot/processor/migrate_policy_normalization.py \
    --pretrained-path lerobot/act_aloha_sim_transfer_cube_human \
    --push-to-hub \
    --branch migrated
```

The script:

1. **Extracts** normalization statistics from model weights
2. **Creates** external preprocessor and postprocessor pipelines
3. **Removes** normalization layers from model weights
4. **Saves** clean model + processor pipelines
5. **Pushes** to Hub with automatic PR creation
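Conceptually, the weight-cleanup step amounts to splitting a checkpoint's state dict into clean model weights and the embedded normalization statistics. The sketch below is illustrative only, not the actual migration script; the key prefixes are assumptions based on the `normalize_inputs.*` naming above, and `split_normalization` is a made-up helper name:

```python
# Illustrative sketch (not the real migration script): separate embedded
# normalization statistics from model weights in a checkpoint state dict.
# Key prefixes are assumptions based on the `normalize_inputs.*` convention.
NORM_PREFIXES = ("normalize_inputs.", "normalize_targets.", "unnormalize_outputs.")

def split_normalization(state_dict):
    clean, stats = {}, {}
    for key, value in state_dict.items():
        target = stats if key.startswith(NORM_PREFIXES) else clean
        target[key] = value
    return clean, stats

checkpoint = {
    "backbone.layer1.weight": [0.1, 0.2],
    "normalize_inputs.observation_state.mean": [0.0],
    "normalize_inputs.observation_state.std": [1.0],
}
clean, stats = split_normalization(checkpoint)
print(sorted(clean))  # ['backbone.layer1.weight']
print(len(stats))     # 2
```

The clean weights would be saved as the new model checkpoint, while the extracted statistics seed the external preprocessor and postprocessor pipelines.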
### Using migrated models

```python
# New usage pattern (after migration)
from lerobot.policies.factory import make_policy, make_pre_post_processors

# Load model and processors separately
policy = make_policy(config, ds_meta=dataset.meta)
preprocessor, postprocessor = make_pre_post_processors(
    policy_cfg=config,
    dataset_stats=dataset.meta.stats,
)

# Process data through pipeline
processed_batch = preprocessor(raw_batch)
action = policy.select_action(processed_batch)
final_action = postprocessor(action)
```

## Hardware API redesign

PR [#777](https://github.com/huggingface/lerobot/pull/777) improves LeRobot calibration but is **not backward-compatible**. Below is an overview of what changed and how you can continue to work with datasets created before this pull request.
# Debug Your Processor Pipeline

Processor pipelines can be complex, especially when chaining multiple transformation steps. This guide provides debugging tools and techniques to help you identify issues, optimize performance, and understand data flow through your pipelines.

## Quick Debugging Checklist

When your pipeline isn't working as expected, check these common issues:

### 1. **Data Type Mismatches**

```python
# ❌ Problem: PolicyProcessorPipeline gets a RobotAction dict
policy_processor = PolicyProcessorPipeline[dict, dict](...)
robot_action = {"joint_1": 0.5}  # dict[str, Any]
result = policy_processor(robot_action)  # May fail!

# ✅ Solution: Use RobotProcessorPipeline for robot data
robot_processor = RobotProcessorPipeline[dict, dict](...)
result = robot_processor(robot_action)  # Works!
```

### 2. **Device Mismatches (Now More Subtle!)**

Modern device handling is sophisticated and can cause subtle issues:

```python
# ❌ Problem 1: Multi-GPU preservation vs movement
# DeviceProcessorStep(device="cuda:0") with data on cuda:1
device_processor = DeviceProcessorStep(device="cuda:0")
data_on_cuda1 = {"obs": torch.randn(10).cuda(1)}  # On cuda:1

# SUBTLE: Data stays on cuda:1 (preserved), doesn't move to cuda:0!
# This is intentional for Accelerate compatibility but can be confusing
result = device_processor(create_transition(observation=data_on_cuda1))
assert result[TransitionKey.OBSERVATION]["obs"].device.index == 1  # Still on cuda:1!

# ✅ Solution: Use CPU as an intermediate if you need a specific GPU
steps = [
    DeviceProcessorStep(device="cpu"),  # Force to CPU first
    DeviceProcessorStep(device="cuda:0"),  # Then to specific GPU
]

# ❌ Problem 2: Automatic dtype adaptation in normalization
normalizer = NormalizerProcessorStep(stats=stats, dtype=torch.float32)  # Configured as float32
input_data = torch.randn(10, dtype=torch.bfloat16)  # Input is bfloat16

# SUBTLE: Normalizer automatically adapts to bfloat16!
# This changes internal state and can affect reproducibility
result = normalizer(create_transition(observation={"obs": input_data}))
assert normalizer.dtype == torch.bfloat16  # Changed from float32!

# ✅ Solution: Explicitly control dtype flow
steps = [
    DeviceProcessorStep(device="cuda", float_dtype="float32"),  # Force consistent dtype
    NormalizerProcessorStep(stats=stats, dtype=torch.float32),  # Matching dtype
]

# ❌ Problem 3: Mixed precision with normalization statistics
# Statistics on CPU, input on GPU with a different dtype
normalizer = NormalizerProcessorStep(stats=cpu_stats, device="cpu")
gpu_input = torch.randn(10, dtype=torch.float16).cuda()

# SUBTLE: Statistics get moved/converted automatically during processing
# This can cause memory spikes and unexpected device transfers
```

### 3. **Missing Statistics**

```python
# ❌ Problem: NormalizerProcessorStep has no stats
# Solution 1: Load with dataset stats
processor = PolicyProcessorPipeline.from_pretrained(
    "model_path",
    overrides={"normalizer_processor": {"stats": dataset.meta.stats}}
)

# Solution 2: Compute stats from the dataset
from lerobot.datasets.compute_stats import compute_stats
stats = compute_stats(dataset)

# Solution 3: Hot-swap statistics at runtime (very useful!)
from lerobot.processor import hotswap_stats
new_processor = hotswap_stats(existing_processor, new_dataset.meta.stats)

# Hot-swap is powerful for:
# - Adapting trained models to new datasets
# - A/B testing different normalization statistics
# - Domain adaptation without retraining
```

## Step-by-Step Pipeline Inspection

Use `step_through()` to see exactly what happens at each transformation stage:

```python
# Inspect data at each transformation stage
for i, intermediate in enumerate(processor.step_through(data)):
    print(f"\n=== After step {i}: {processor.steps[i].__class__.__name__} ===")

    # Check observation shapes and statistics
    obs = intermediate.get(TransitionKey.OBSERVATION)
    if obs:
        for key, value in obs.items():
            if isinstance(value, torch.Tensor):
                print(f"{key}: shape={value.shape}, "
                      f"dtype={value.dtype}, "
                      f"device={value.device}")
                if value.numel() > 0:  # Avoid empty tensors
                    print(f"  range=[{value.min():.3f}, {value.max():.3f}], "
                          f"mean={value.mean():.3f}, std={value.std():.3f}")

    # Check action if present
    action = intermediate.get(TransitionKey.ACTION)
    if action is not None:
        if isinstance(action, torch.Tensor):
            print(f"action: shape={action.shape}, dtype={action.dtype}, device={action.device}")
            if action.numel() > 0:
                print(f"  range=[{action.min():.3f}, {action.max():.3f}]")
        elif isinstance(action, dict):
            print(f"action: dict with {len(action)} keys: {list(action.keys())}")

    # Check complementary data
    comp = intermediate.get(TransitionKey.COMPLEMENTARY_DATA)
    if comp:
        print(f"complementary_data: {list(comp.keys())}")
```

## Runtime Monitoring with Hooks

Add monitoring hooks without modifying your pipeline code:

```python
# Define monitoring hooks
def log_shapes(step_idx: int, transition: EnvTransition):
    """Log tensor shapes after each step."""
    obs = transition.get(TransitionKey.OBSERVATION)
    if obs:
        print(f"Step {step_idx} shapes:")
        for key, value in obs.items():
            if isinstance(value, torch.Tensor):
                print(f"  {key}: {value.shape}")

def check_nans(step_idx: int, transition: EnvTransition):
    """Check for NaN values."""
    obs = transition.get(TransitionKey.OBSERVATION)
    if obs:
        for key, value in obs.items():
            if isinstance(value, torch.Tensor) and torch.isnan(value).any():
                print(f"Warning: NaN detected in {key} at step {step_idx}")
def measure_performance(step_idx: int, transition: EnvTransition):
    """Measure processing time per step."""
    import time

    now = time.time()
    last = getattr(measure_performance, "last_time", None)
    if step_idx == 0 or last is None:
        measure_performance.last_time = now
    else:
        # The after-step hook for step N fires once step N has finished,
        # so the time since the previous hook is step N's duration.
        print(f"Step {step_idx} took {(now - last) * 1000:.2f}ms")
        measure_performance.last_time = now

# Register hooks
processor.register_after_step_hook(log_shapes)
processor.register_after_step_hook(check_nans)
processor.register_after_step_hook(measure_performance)

# Process data - hooks will be called after each step
output = processor(input_data)

# Remove hooks when done debugging
processor.unregister_after_step_hook(log_shapes)
processor.unregister_after_step_hook(check_nans)
processor.unregister_after_step_hook(measure_performance)
```

## Pipeline Testing and Validation

### Test Individual Steps

```python
# Test each step independently
test_transition = create_transition(
    observation={"observation.state": torch.randn(7)},
    action=torch.randn(4),
)

for i, step in enumerate(processor.steps):
    try:
        result = step(test_transition)
        print(f"✅ Step {i} ({step.__class__.__name__}) passed")
        test_transition = result  # Use output for next step
    except Exception as e:
        print(f"❌ Step {i} ({step.__class__.__name__}) failed: {e}")
        break
```

### Pipeline Slicing for Debugging

```python
# Test subsets of your pipeline
first_three_steps = processor[:3]  # Returns a new DataProcessorPipeline
middle_step = processor[2]  # Returns a single ProcessorStep

# Test partial pipeline
partial_output = first_three_steps(input_data)
print(f"After first 3 steps: {partial_output}")

# Test remaining steps
remaining_steps = processor[3:]
final_output = remaining_steps(partial_output)
```

### Create Test Variations

```python
# Create variations for A/B testing
variant_processor = RobotProcessorPipeline[dict, dict](
    steps=processor.steps[:-1] + [alternative_final_step],
    name="variant_pipeline",
)

# Compare outputs
original_output = processor(test_data)
variant_output = variant_processor(test_data)
```

## Performance Profiling

### Memory Usage Monitoring

```python
import torch

def memory_hook(step_idx: int, transition: EnvTransition):
    """Monitor GPU memory usage."""
    if torch.cuda.is_available():
        allocated = torch.cuda.memory_allocated() / 1024**3  # GB
        cached = torch.cuda.memory_reserved() / 1024**3  # GB
        print(f"Step {step_idx}: {allocated:.2f}GB allocated, {cached:.2f}GB cached")

processor.register_after_step_hook(memory_hook)
```

### Processing Time Analysis

```python
import time
from collections import defaultdict

class PerformanceProfiler:
    def __init__(self):
        self.step_times = defaultdict(list)
        self.start_time = None
    def __call__(self, step_idx: int, transition: EnvTransition):
        current_time = time.perf_counter()
        if self.start_time is not None:
            # The after-step hook for step `step_idx` fires once that step has
            # finished, so the elapsed time belongs to `steps[step_idx]`.
            step_name = processor.steps[step_idx].__class__.__name__
            elapsed = current_time - self.start_time
            self.step_times[step_name].append(elapsed)
        self.start_time = current_time

    def report(self):
        print("\n=== Performance Report ===")
        for step_name, times in self.step_times.items():
            avg_time = sum(times) / len(times) * 1000  # ms
            print(f"{step_name}: {avg_time:.2f}ms avg ({len(times)} calls)")

profiler = PerformanceProfiler()
processor.register_after_step_hook(profiler)

# Run your pipeline
for _ in range(100):
    output = processor(test_data)

# Get performance report
profiler.report()
```

## Common Issues and Solutions

### Issue: "Action should be a PolicyAction type"

```python
# Problem: Passing a dict to PolicyProcessorPipeline
action_dict = {"joint_1": 0.5}  # RobotAction
policy_processor(transition_with_dict_action)  # Fails!

# Solution 1: Use RobotProcessorPipeline instead
robot_processor = RobotProcessorPipeline[dict, dict](...)

# Solution 2: Convert to a tensor first
action_tensor = torch.tensor([0.5])  # PolicyAction
transition = create_transition(action=action_tensor)
```

### Issue: "Missing required keys in transition"

```python
# Problem: Incomplete transition
incomplete = {TransitionKey.OBSERVATION: {...}}  # Missing ACTION

# Solution: Use create_transition with defaults
complete = create_transition(
    observation={...},
    action=None,  # Explicit None is fine
    reward=0.0,  # Default values
    done=False,
)
```

### Issue: Normalization Statistics Not Found

```python
# Problem: Processor can't find normalization stats
normalizer = NormalizerProcessorStep(features=..., stats=None)  # No stats!

# Solution 1: Compute stats from the dataset
from lerobot.datasets.compute_stats import compute_stats
stats = compute_stats(dataset)
normalizer = NormalizerProcessorStep(features=..., stats=stats)

# Solution 2: Load with overrides
processor = PolicyProcessorPipeline.from_pretrained(
    "model_path",
    overrides={"normalizer_processor": {"stats": dataset.meta.stats}}
)

# Solution 3: Hot-swap statistics (powerful for domain adaptation!)
from lerobot.processor import hotswap_stats

# Load a model trained on dataset A
trained_processor = PolicyProcessorPipeline.from_pretrained("model_trained_on_dataset_A")

# Adapt to dataset B without retraining
adapted_processor = hotswap_stats(trained_processor, dataset_B.meta.stats)

# Now works with dataset B's data distribution!
```

### Issue: GPU Out of Memory

```python
# Problem: Large batches on GPU
# Solution: Use float16 and optimize step order
steps = [
    DeviceProcessorStep(device="cuda", float_dtype="float16"),  # Use half precision
    NormalizerProcessorStep(...),  # Normalize in half precision
]
```

## Debugging Complex Pipelines

### Phone Teleoperation Example

```python
# Debug a complex phone → robot pipeline
phone_pipeline = RobotProcessorPipeline[RobotAction, RobotAction](
    steps=[
        MapPhoneActionToRobotAction(platform=PhoneOS.IOS),
        AddRobotObservationAsComplimentaryData(robot=robot),
        EEReferenceAndDelta(kinematics=solver, ...),
        EEBoundsAndSafety(...),
        InverseKinematicsEEToJoints(...),
        GripperVelocityToJoint(...),
    ]
)

# Test with mock phone input
mock_phone_action = {
    "phone.pos": [0.1, 0.0, 0.0],
    "phone.rot": Rotation.identity(),
    "phone.enabled": True,
    "phone.raw_inputs": {"a3": 0.5},  # iOS button
}

# Step through to see the transformations
for i, result in enumerate(phone_pipeline.step_through(mock_phone_action)):
    action = result.get(TransitionKey.ACTION, {})
    print(f"\nStep {i} output keys: {list(action.keys())}")

    # Check specific transformations
    if i == 0:  # After MapPhoneActionToRobotAction
        assert "target_x" in action, "Phone mapping failed"
    elif i == 2:  # After EEReferenceAndDelta
        assert "ee.x" in action, "EE reference calculation failed"
    elif i == 4:  # After InverseKinematicsEEToJoints
        assert "shoulder_pan.pos" in action, "IK failed"
```

## Registry Debugging

```python
# List all available processors
from lerobot.processor import ProcessorStepRegistry

print("Available processors:")
for name in sorted(ProcessorStepRegistry.list()):
    cls = ProcessorStepRegistry.get(name)
    print(f"  {name}: {cls.__name__}")

# Check whether a processor is registered
if "my_custom_processor" in ProcessorStepRegistry.list():
    print("✅ Custom processor is registered")
else:
    print("❌ Custom processor not found - check @ProcessorStepRegistry.register()")
```

## Best Practices for Debugging

### 1. **Start Simple**

```python
# Test with minimal data first
minimal_transition = create_transition(
    observation={"observation.state": torch.randn(1, 7)},
    action=torch.randn(1, 4),
)
```

### 2. **Test Each Step Individually**

```python
# Don't test the whole pipeline at once
for step in processor.steps:
    try:
        output = step(test_transition)
        print(f"✅ {step.__class__.__name__} works")
    except Exception as e:
        print(f"❌ {step.__class__.__name__} failed: {e}")
        break
```

### 3. **Use Hooks for Continuous Monitoring**

```python
# Add permanent monitoring for production
def production_monitor(step_idx: int, transition: EnvTransition):
    """Log critical issues only."""
    obs = transition.get(TransitionKey.OBSERVATION)
    if obs:
        for key, value in obs.items():
            if isinstance(value, torch.Tensor):
                if torch.isnan(value).any():
                    print(f"🚨 NaN detected in {key} at step {step_idx}")
                if torch.isinf(value).any():
                    print(f"🚨 Inf detected in {key} at step {step_idx}")

processor.register_after_step_hook(production_monitor)
```

### 4. **Validate Feature Contracts**

```python
# Check that your pipeline produces the expected features
initial_features = {...}  # Your input features
output_features = processor.transform_features(initial_features)

print("Input features:", list(initial_features.keys()))
print("Output features:", list(output_features.keys()))

# Verify expected features exist
expected_keys = ["observation.state", "action"]
for key in expected_keys:
    if key not in output_features:
        print(f"❌ Missing expected feature: {key}")
```

## Troubleshooting Specific Processors

### Normalization Issues

```python
# Debug normalization problems
normalizer = NormalizerProcessorStep(...)

# Check statistics
print("Available stats:", list(normalizer._tensor_stats.keys()))
for key, stats in normalizer._tensor_stats.items():
    print(f"{key}: {list(stats.keys())}")
    for stat_name, tensor in stats.items():
        print(f"  {stat_name}: {tensor}")

# Test normalization manually
test_value = torch.tensor([1.0, 2.0, 3.0])
normalized = normalizer._apply_transform(test_value, "test_key", FeatureType.STATE)
print(f"Original: {test_value}")
print(f"Normalized: {normalized}")
```

### Tokenization Issues

```python
# Debug tokenizer problems
tokenizer_step = TokenizerProcessorStep(...)

# Test tokenization manually
test_transition = create_transition(
    complementary_data={"task": "pick up the red cube"}
)
tokenizer_step.transition = test_transition  # Set current transition
task = tokenizer_step.get_task(test_transition)
print(f"Extracted task: {task}")

if task:
    tokens = tokenizer_step._tokenize_text(task)
    print(f"Tokens: {tokens}")
```

### Device Transfer Issues (Advanced Debugging)

Modern device handling has subtle behaviors that can cause issues:

```python
# Debug the device processor with multi-GPU awareness
device_step = DeviceProcessorStep(device="cuda:0", float_dtype="float16")

print(f"Target device: {device_step.tensor_device}")
print(f"Non-blocking: {device_step.non_blocking}")
print(f"Target dtype: {device_step._target_float_dtype}")

# Test 1: GPU-to-GPU preservation (Accelerate compatibility)
tensor_on_cuda1 = torch.randn(10).cuda(1)
processed = device_step._process_tensor(tensor_on_cuda1)
print(f"cuda:1 → cuda:0 config: stays on cuda:{processed.device.index}")  # Stays on cuda:1!

# Test 2: CPU-to-GPU movement
cpu_tensor = torch.randn(10)
processed = device_step._process_tensor(cpu_tensor)
print(f"CPU → cuda:0 config: moves to cuda:{processed.device.index}")  # Moves to cuda:0

# Test 3: Automatic dtype adaptation in normalization
normalizer = NormalizerProcessorStep(stats=stats, dtype=torch.float32)
bfloat16_input = torch.randn(10, dtype=torch.bfloat16)

# Before processing
print(f"Normalizer dtype before: {normalizer.dtype}")
print(f"Stats dtype before: {list(normalizer._tensor_stats.values())[0]['mean'].dtype}")

# Process data
transition = create_transition(observation={"obs": bfloat16_input})
result = normalizer(transition)

# After processing - automatic adaptation!
print(f"Normalizer dtype after: {normalizer.dtype}")  # Changed to bfloat16!
print(f"Stats dtype after: {list(normalizer._tensor_stats.values())[0]['mean'].dtype}")  # bfloat16!
print(f"Output dtype: {result[TransitionKey.OBSERVATION]['obs'].dtype}")  # bfloat16
```

### Multi-GPU Debugging Patterns

```python
# Test multi-GPU behavior
def debug_multi_gpu_behavior():
    if torch.cuda.device_count() < 2:
        print("Need 2+ GPUs for this test")
        return

    processor = DeviceProcessorStep(device="cuda:0")

    # Test data on different devices
    test_cases = [
        ("CPU", torch.randn(5)),
        ("cuda:0", torch.randn(5).cuda(0)),
        ("cuda:1", torch.randn(5).cuda(1)),
    ]

    for name, tensor in test_cases:
        transition = create_transition(observation={"test": tensor})
        result = processor(transition)
        output_device = result[TransitionKey.OBSERVATION]["test"].device

        print(f"{name} input → {output_device} output")
        # Expected:
        # CPU input → cuda:0 output (moved)
        # cuda:0 input → cuda:0 output (preserved)
        # cuda:1 input → cuda:1 output (preserved, not moved!)

debug_multi_gpu_behavior()
```

## Pipeline Optimization

### Performance Bottleneck Detection

```python
import time
from collections import defaultdict

class DetailedProfiler:
    def __init__(self):
        self.step_times = defaultdict(list)
        self.memory_usage = defaultdict(list)
        self.step_start = None

    def before_step(self, step_idx: int, transition: EnvTransition):
        self.step_start = time.perf_counter()
        if torch.cuda.is_available():
            torch.cuda.synchronize()  # Ensure accurate timing

    def after_step(self, step_idx: int, transition: EnvTransition):
        if self.step_start is not None:
            elapsed = time.perf_counter() - self.step_start
            step_name = processor.steps[step_idx].__class__.__name__
            self.step_times[step_name].append(elapsed * 1000)  # ms

            if torch.cuda.is_available():
                memory_mb = torch.cuda.memory_allocated() / 1024**2
                self.memory_usage[step_name].append(memory_mb)

    def report(self):
        print("\n=== Detailed Performance Report ===")
        total_time = 0
        for step_name, times in self.step_times.items():
            avg_time = sum(times) / len(times)
            max_time = max(times)
            min_time = min(times)
            total_time += avg_time

            print(f"{step_name}:")
            print(f"  Time: {avg_time:.2f}ms avg, {min_time:.2f}-{max_time:.2f}ms range")

            if step_name in self.memory_usage:
                memory_vals = self.memory_usage[step_name]
                avg_memory = sum(memory_vals) / len(memory_vals)
                print(f"  Memory: {avg_memory:.1f}MB avg")

        print(f"\nTotal pipeline time: {total_time:.2f}ms")

profiler = DetailedProfiler()
processor.register_before_step_hook(profiler.before_step)
processor.register_after_step_hook(profiler.after_step)
```

### Memory Optimization

```python
# Optimize memory usage
def optimize_for_memory():
    return PolicyProcessorPipeline[dict, dict](
        steps=[
            # Use float16 early to save memory
            DeviceProcessorStep(device="cuda", float_dtype="float16"),

            # Normalize on GPU in half precision
            NormalizerProcessorStep(...),

            # Process in chunks if needed
            # ChunkProcessorStep(chunk_size=32),  # Custom step
        ]
    )
```

## Integration Testing

### End-to-End Robot Pipeline Test
```python
import numpy as np

# Test the complete robot control pipeline
def test_robot_pipeline():
    # Mock robot observation
    robot_obs = {
        "shoulder_pan.pos": 0.0,
        "shoulder_lift.pos": -90.0,
        "elbow_flex.pos": 90.0,
        "wrist_flex.pos": 0.0,
        "wrist_roll.pos": 0.0,
        "gripper.pos": 50.0,
        "camera_front": np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8),
    }

    # Test observation processing
    obs_processor = RobotProcessorPipeline[dict, EnvTransition](
        steps=[VanillaObservationProcessorStep()],
        to_transition=observation_to_transition,
        to_output=identity_transition,
    )

    processed_obs = obs_processor(robot_obs)
    print("✅ Observation processing works")

    # Test action processing
    mock_ee_action = {
        "ee.x": 0.5, "ee.y": 0.0, "ee.z": 0.3,
        "ee.wx": 0.0, "ee.wy": 0.0, "ee.wz": 0.0,
        "gripper": 0.1,
    }

    action_processor = RobotProcessorPipeline[dict, dict](
        steps=[
            # Your action processing steps
        ],
        to_transition=robot_action_to_transition,
        to_output=transition_to_robot_action,
    )

    joint_action = action_processor(mock_ee_action)
    print("✅ Action processing works")
    print(f"Joint commands: {list(joint_action.keys())}")

test_robot_pipeline()
```
|
||||
|
||||
## Error Recovery and Fallbacks

```python
# Robust pipeline with fallbacks
class RobustProcessor:
    def __init__(self, primary_processor, fallback_processor=None):
        self.primary = primary_processor
        self.fallback = fallback_processor or IdentityProcessorStep()

    def __call__(self, transition):
        try:
            return self.primary(transition)
        except Exception as e:
            print(f"⚠️ Primary processor failed: {e}")
            print("🔄 Using fallback processor")
            return self.fallback(transition)

# Usage
robust_pipeline = RobustProcessor(
    primary_processor=complex_pipeline,
    fallback_processor=simple_pipeline,
)
```

## Summary

Effective pipeline debugging involves:

1. **Step-by-step inspection** to understand data flow
2. **Runtime hooks** for continuous monitoring
3. **Individual step testing** to isolate issues
4. **Performance profiling** to identify bottlenecks
5. **Type validation** to catch data structure mismatches
6. **Fallback strategies** for robust deployment

Remember: Start simple, test incrementally, and use the rich debugging tools LeRobot provides to build reliable, high-performance processor pipelines!

# Implement your own Robot Processor

In this tutorial, you'll learn how to implement your own Robot Processor.
It begins by exploring the need for a custom processor, then walks through how to implement, configure, and serialize one, using the normalization and device processors as running examples. Finally, it lists all helper processors that ship with LeRobot.

## Why would you need a custom processor?

In most cases, when reading raw data from a sensor such as a camera or the robot's motor encoders,
you will need to process this data to transform it into a format compatible with the policies in LeRobot.
For example, raw images are encoded as `uint8` with values in the range `[0, 255]`.
To use these images with the policies, you need to cast them to `float32` and normalize them to the range `[0, 1]`.

For example, in LeRobot's `VanillaObservationProcessor`, raw images come from the environment as numpy arrays with `uint8` values in range `[0, 255]` and in channel-last format `(H, W, C)`. The processor transforms them into PyTorch tensors with `float32` values in range `[0, 1]` and channel-first format `(C, H, W)`:

```python
# Input: numpy array with shape (480, 640, 3) and dtype uint8
raw_image = env_observation["pixels"]  # Values in [0, 255]

# After processing: torch tensor with shape (1, 3, 480, 640) and dtype float32
processed_image = processor(transition)["observation"]["observation.image"]  # Values in [0, 1]
```
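
To make the dtype and layout change concrete, here is a dependency-free sketch using plain Python lists in place of arrays (`hwc_uint8_to_chw_float` is an illustrative helper, not a LeRobot API):

```python
# Convert a tiny "image" from channel-last uint8 values in [0, 255]
# to channel-first floats in [0, 1], mirroring the processor's job.
def hwc_uint8_to_chw_float(image):
    h, w, c = len(image), len(image[0]), len(image[0][0])
    # Transpose (H, W, C) -> (C, H, W) while scaling to [0, 1]
    return [
        [[image[y][x][ch] / 255.0 for x in range(w)] for y in range(h)]
        for ch in range(c)
    ]

tiny = [[[0, 128, 255], [255, 0, 0]]]  # shape (1, 2, 3)
chw = hwc_uint8_to_chw_float(tiny)
print(len(chw), len(chw[0]), len(chw[0][0]))  # 3 1 2
```

The real processor performs the same transposition and scaling with tensor operations, and additionally adds the batch dimension.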

On the other hand, when a model returns an action to be executed on the robot, you often have to post-process this action to make it compatible with the robot hardware.
For example, the model might return joint position values in the range `[-1, 1]`, which need to be scaled to the robot's minimum and maximum joint angle positions.

In LeRobot, this normalization workflow is handled by the `NormalizerProcessor` (for inputs) and the `UnnormalizerProcessor` (for outputs). These processors are heavily used by policies (e.g., Pi0, SmolVLA) and integrate tightly with the `RobotProcessor`'s `get_config`, `state_dict`, and `load_state_dict` APIs.

For instance, `UnnormalizerProcessor` converts model outputs in `[-1, 1]` back to actual robot joint ranges:

```python
# Input: model action with normalized values in [-1, 1]
normalized_action = torch.tensor([-0.5, 0.8, -1.0, 0.2])  # Model output

# After post-processing: real joint positions in robot's native ranges
# Example: joints range from [-180.0, 180.0]
real_action = unnormalizer(transition)["action"]
# Real action after post-processing: [-90., 144., -180., 36.]
```

The unnormalizer uses the dataset statistics to convert back:

```python
# For MIN_MAX normalization: action = (normalized + 1) * (max - min) / 2 + min
real_action = (normalized_action + 1) * (max_val - min_val) / 2 + min_val
```
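
You can verify the formula by hand. The plain-Python computation below reproduces the example values from the snippet above (joint range `[-180.0, 180.0]`):

```python
# MIN_MAX unnormalization: map [-1, 1] back to [min_val, max_val]
def unnormalize_min_max(value, min_val, max_val):
    return (value + 1) * (max_val - min_val) / 2 + min_val

normalized = [-0.5, 0.8, -1.0, 0.2]
real = [round(unnormalize_min_max(v, -180.0, 180.0), 6) for v in normalized]
print(real)  # [-90.0, 144.0, -180.0, 36.0]
```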

All these situations point to the need for a mechanism to preprocess the data before it is passed to the policies, and to post-process the actions that are returned before they are executed on the robot.

To that end, LeRobot provides a pipeline mechanism to implement a sequence of processing steps for the input data and the output action.

## How to implement your own processor?

We'll use the `DeviceProcessorStep` as our main example because it demonstrates essential processor patterns and the device/dtype awareness that's crucial for modern multi-GPU setups.

Prepare the sequence of processing steps necessary for your problem. A processor step is a class that implements the following methods:

- `__call__`: implements the processing step for the input transition.
- `get_config`: gets the configuration of the processor step.
- `state_dict`: gets the state of the processor step.
- `load_state_dict`: loads the state of the processor step.
- `reset`: resets the state of the processor step.
- `transform_features`: declares the modification to the feature space performed by the processor step.

### Implement the `__call__` method

The `__call__` method is the core of your processor step. It takes an `EnvTransition` and returns a modified `EnvTransition`. Here's how the `DeviceProcessorStep` works:

```python
from dataclasses import dataclass

import torch

from lerobot.processor import ProcessorStep, ProcessorStepRegistry
from lerobot.processor.core import EnvTransition, TransitionKey


@ProcessorStepRegistry.register("device_processor")
@dataclass
class DeviceProcessorStep(ProcessorStep):
    """Move tensors to specified device with optional dtype conversion."""

    device: str = "cpu"
    float_dtype: str | None = None

    def __post_init__(self):
        """Initialize device and dtype mappings."""
        self.tensor_device = torch.device(self.device)
        self.non_blocking = "cuda" in str(self.device)

        # Map string dtype to torch dtype
        if self.float_dtype is not None:
            dtype_mapping = {
                "float16": torch.float16, "half": torch.float16,
                "float32": torch.float32, "float": torch.float32,
                "bfloat16": torch.bfloat16,
            }
            self._target_float_dtype = dtype_mapping[self.float_dtype]
        else:
            self._target_float_dtype = None

    def __call__(self, transition: EnvTransition) -> EnvTransition:
        new_transition = transition.copy()

        # Process simple tensor keys
        for key in [TransitionKey.ACTION, TransitionKey.REWARD, TransitionKey.DONE, TransitionKey.TRUNCATED]:
            value = transition.get(key)
            if isinstance(value, torch.Tensor):
                new_transition[key] = self._process_tensor(value)

        # Process nested tensor dicts
        for key in [TransitionKey.OBSERVATION, TransitionKey.COMPLEMENTARY_DATA]:
            data_dict = transition.get(key)
            if data_dict is not None:
                new_data_dict = {
                    k: self._process_tensor(v) if isinstance(v, torch.Tensor) else v
                    for k, v in data_dict.items()
                }
                new_transition[key] = new_data_dict

        return new_transition

    def _process_tensor(self, tensor: torch.Tensor) -> torch.Tensor:
        """Move tensor to target device and convert dtype if needed."""
        # Smart device handling for multi-GPU compatibility
        if tensor.is_cuda and self.tensor_device.type == "cuda":
            # Both on GPU: preserve original GPU (Accelerate compatibility)
            target_device = tensor.device
        else:
            # CPU or different device types: use configured device
            target_device = self.tensor_device

        # Move if necessary
        if tensor.device != target_device:
            tensor = tensor.to(target_device, non_blocking=self.non_blocking)

        # Convert float dtype if specified
        if self._target_float_dtype is not None and tensor.is_floating_point():
            tensor = tensor.to(dtype=self._target_float_dtype)

        return tensor

    def get_config(self) -> dict:
        return {"device": self.device, "float_dtype": self.float_dtype}
```

See the full implementation in `src/lerobot/processor/device_processor.py` for complete details.

**Key principles:**

- **Always use `transition.copy()`** to avoid side effects
- **Handle both simple and nested tensors** systematically
- **Smart device handling**: Preserve GPU placement for Accelerate compatibility
- **Validate configurations** in `__post_init__()`
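
One caveat worth spelling out: `dict.copy()` is a shallow copy. The copied transition is a separate mapping, but nested dictionaries such as the observation are shared with the original, which is why the step builds a fresh `new_data_dict` instead of mutating in place. A quick dependency-free illustration:

```python
# Shallow-copy pitfall: mutating a nested dict through the copy
# also mutates the original; rebuilding the nested dict does not.
transition = {"observation": {"joint_1": 0.5}, "action": [0.1]}
bad = transition.copy()
bad["observation"]["joint_1"] = 99.0  # leaks into the original!
print(transition["observation"]["joint_1"])  # 99.0

transition = {"observation": {"joint_1": 0.5}, "action": [0.1]}
good = transition.copy()
good["observation"] = {**good["observation"], "joint_1": 99.0}  # rebuild instead
print(transition["observation"]["joint_1"])  # 0.5
```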

### Configuration and State Management

Processors support serialization through three methods that separate configuration from tensor state. This is especially important for normalization processors, which carry dataset statistics (tensors) in their state and hyperparameters in their config:

```python
from dataclasses import dataclass, field
from typing import Any

import torch

from lerobot.configs.types import FeatureType, NormalizationMode, PolicyFeature


@dataclass
class NormalizerProcessor:
    features: dict[str, PolicyFeature]
    norm_map: dict[FeatureType, NormalizationMode]
    eps: float = 1e-8
    _tensor_stats: dict[str, dict[str, torch.Tensor]] = field(default_factory=dict, init=False, repr=False)

    def get_config(self) -> dict[str, Any]:
        """JSON-serializable configuration (no tensors)."""
        return {
            "eps": self.eps,
            "features": {k: {"type": v.type.value, "shape": v.shape} for k, v in self.features.items()},
            "norm_map": {ft.value: nm.value for ft, nm in self.norm_map.items()},
        }

    def state_dict(self) -> dict[str, torch.Tensor]:
        """Tensor state only (e.g., dataset statistics)."""
        flat: dict[str, torch.Tensor] = {}
        for key, sub in self._tensor_stats.items():
            for stat_name, tensor in sub.items():
                flat[f"{key}.{stat_name}"] = tensor
        return flat

    def load_state_dict(self, state: dict[str, torch.Tensor]) -> None:
        """Restore tensor state at runtime."""
        self._tensor_stats.clear()
        for flat_key, tensor in state.items():
            key, stat_name = flat_key.rsplit(".", 1)
            self._tensor_stats.setdefault(key, {})[stat_name] = tensor
```

**Usage:**

```python
# Save (e.g., inside a policy)
config = processor.get_config()
tensors = processor.state_dict()

# Restore (e.g., loading a pretrained policy)
new_processor = NormalizerProcessor(**config)
new_processor.load_state_dict(tensors)
```
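
The flatten/unflatten convention behind `state_dict` and `load_state_dict` can be checked in isolation. A dependency-free sketch with plain floats standing in for tensors:

```python
# Flatten nested stats to "key.stat" entries, then restore them,
# using the same rsplit(".", 1) convention as the snippet above.
def flatten(stats):
    return {f"{key}.{name}": val for key, sub in stats.items() for name, val in sub.items()}

def unflatten(flat):
    nested = {}
    for flat_key, val in flat.items():
        key, name = flat_key.rsplit(".", 1)
        nested.setdefault(key, {})[name] = val
    return nested

stats = {"observation.state": {"min": -180.0, "max": 180.0}}
flat = flatten(stats)
print(flat)  # {'observation.state.min': -180.0, 'observation.state.max': 180.0}
print(unflatten(flat) == stats)  # True
```

Note that `rsplit(".", 1)` splits on the last dot, so feature keys that themselves contain dots (like `observation.state`) round-trip correctly.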

### Transform features

The `transform_features` method defines how your processor transforms feature names and shapes. This is crucial for policy configuration and debugging.

Normalization typically preserves feature keys and shapes, so `NormalizerProcessor.transform_features` returns the input features unchanged. When your processor renames or reshapes features, implement this method to expose the mapping to downstream components. For example, a simple rename processor:

```python
def transform_features(self, features: dict[str, PolicyFeature]) -> dict[str, PolicyFeature]:
    # Simple renaming
    if "pixels" in features:
        features["observation.image"] = features.pop("pixels")

    # Pattern-based renaming
    for key in list(features.keys()):
        if key.startswith("env_state."):
            suffix = key[len("env_state."):]
            features[f"observation.{suffix}"] = features.pop(key)

    return features
```
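
Because this is plain dictionary manipulation, you can sanity-check the logic without any LeRobot imports; here strings stand in for `PolicyFeature` values:

```python
# Same renaming logic as above, applied to a plain dict
def rename_features(features):
    if "pixels" in features:
        features["observation.image"] = features.pop("pixels")
    for key in list(features.keys()):
        if key.startswith("env_state."):
            suffix = key[len("env_state."):]
            features[f"observation.{suffix}"] = features.pop(key)
    return features

renamed = rename_features({"pixels": "img_feature", "env_state.block_pos": "state_feature"})
print(renamed)  # {'observation.image': 'img_feature', 'observation.block_pos': 'state_feature'}
```

Iterating over `list(features.keys())` is what makes it safe to pop and add keys while looping.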

**Key principles:**

- Use `features.pop(old_key)` to remove and return the old feature
- Use `features[new_key] = old_feature` to add the renamed feature
- Always return the modified features dictionary
- Document transformations clearly in the docstring

### Example of usage from the codebase

`transform_features` is used by `RobotProcessor` to derive the dataset/policy feature contract from an initial feature set by applying each step's transformation. You can see concrete examples in the codebase:

- Phone teleoperation record pipeline (`examples/phone_so100_record.py`): processors like `ForwardKinematicsJointsToEE`, `GripperVelocityToJoint`, and `EEBoundsAndSafety` implement `transform_features` to declare which action/observation keys should be materialized in the dataset.
- SO100 follower kinematics (`src/lerobot/robots/so100_follower/robot_kinematic_processor.py`): each processor's `transform_features` method adds or refines feature keys such as `observation.state.ee.{x,y,z,wx,wy,wz}` or `action.gripper.pos`.
- Rename and tokenizer processors (`src/lerobot/processor/rename_processor.py`, `src/lerobot/processor/tokenizer_processor.py`): demonstrate key renaming and adding language token features to the contract.

In practice, you will often aggregate features by running `DataProcessorPipeline.transform_features(...)` with your initial features to compute the final contract before recording or training.

## Helper Classes

LeRobot provides pre-built processor classes for common transformations. Below is a comprehensive list of registered processors in the codebase.

### Core processors (observations, actions, normalization)

- **`VanillaObservationProcessorStep`** (`observation_processor`): Images and state processing to LeRobot format.
- **`NormalizerProcessorStep`** (`normalizer_processor`): Normalize observations/actions (mean/std or min/max to [-1, 1]).
- **`UnnormalizerProcessorStep`** (`unnormalizer_processor`): Inverse of the normalizer for model outputs.
- **`DeviceProcessorStep`** (`device_processor`): Move tensors to a specific device (CPU/GPU) and optionally a float dtype.
- **`AddBatchDimensionProcessorStep`** (`to_batch_processor`): Add batch dimension to observations/actions when missing.
- **`RenameObservationsProcessorStep`** (`rename_observations_processor`): Rename observation keys using a mapping dictionary.
- **`TokenizerProcessorStep`** (`tokenizer_processor`): Tokenize language tasks into `observation.language.*` tensors.

### Teleoperation mapping processors

- **`MapDeltaActionToRobotAction`** (`map_delta_action_to_robot_action`): Map teleop deltas (e.g., gamepad) to `action.target_*` fields.
- **`MapPhoneActionToRobotAction`** (`map_phone_action_to_robot_action`): Map calibrated phone pose/buttons to `action.target_*` and gripper.

### Robot kinematics processors (SO100 follower example)

- **`EEReferenceAndDelta`** (`ee_reference_and_delta`): Compute desired EE pose from target deltas and current pose.
- **`EEBoundsAndSafety`** (`ee_bounds_and_safety`): Clip EE pose to bounds and check for jumps.
- **`InverseKinematicsEEToJoints`** (`inverse_kinematics_ee_to_joints`): Convert EE pose to joint targets via IK.
- **`GripperVelocityToJoint`** (`gripper_velocity_to_joint`): Convert gripper velocity input to joint position command.
- **`ForwardKinematicsJointsToEE`** (`forward_kinematics_joints_to_ee`): Compute EE pose features from joint positions via FK.
- **`AddRobotObservationAsComplimentaryData`** (`add_robot_observation`): Read robot observation and insert `raw_joint_positions` into complementary data.

### Policy-specific utility processors

- **`Pi0NewLineProcessor`** (`pi0_new_line_processor`): Ensure text tasks end with a newline (Pi0 tokenizer compatibility).
- **`SmolVLANewLineProcessor`** (`smolvla_new_line_processor`): Ensure text tasks end with a newline (SmolVLA tokenizer compatibility).

### Usage Example

```python
from lerobot.processor import (
    NormalizerProcessorStep, DeviceProcessorStep,
    RobotProcessorPipeline, AddBatchDimensionProcessorStep
)

# Create a processing pipeline (typical policy preprocessor)
steps = [
    NormalizerProcessorStep(features=features, norm_map=norm_map, stats=stats),
    AddBatchDimensionProcessorStep(),
    DeviceProcessorStep(device="cuda"),
]

# Use in RobotProcessorPipeline
processor = RobotProcessorPipeline[dict, dict](steps=steps)
processed_transition = processor(raw_transition)
```

### Using overrides

You can override step parameters at load time using `overrides`. This is handy for non-serializable objects or site-specific settings. It works both in policy factories and with `DataProcessorPipeline.from_pretrained(...)`.

Example: during policy evaluation on the robot, override the device and rename map.
Use this to run a policy trained on CUDA on a CPU-only robot, or to remap camera keys when the robot uses different names than the dataset.

```python
# src/lerobot/record.py (lines 437-445)
preprocessor, postprocessor = make_processor(
    policy_cfg=cfg.policy,
    pretrained_path=cfg.policy.pretrained_path,
    dataset_stats=rename_stats(dataset.meta.stats, cfg.dataset.rename_map),
    preprocessor_overrides={
        "device_processor": {"device": cfg.policy.device},
        "rename_processor": {"rename_map": cfg.dataset.rename_map},
    },
)
```

Direct usage with `from_pretrained`:

```python
from lerobot.processor import RobotProcessorPipeline

processor = RobotProcessorPipeline.from_pretrained(
    "username/my-processor",
    overrides={
        "device_processor": {"device": "cuda:0"},  # registry name for registered steps
        "CustomStep": {"param": 42},  # class name for non-registered steps
    },
)
```
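
Conceptually, an override is a shallow merge into the step's saved configuration before the step is re-instantiated. The sketch below shows the merge semantics only; it is not the actual LeRobot loading code:

```python
# Illustrative: merge user overrides into each step's saved config.
# Keys without an override pass through unchanged.
def apply_overrides(saved_configs, overrides):
    return {
        step_name: {**config, **overrides.get(step_name, {})}
        for step_name, config in saved_configs.items()
    }

saved = {"device_processor": {"device": "cuda", "float_dtype": None}}
merged = apply_overrides(saved, {"device_processor": {"device": "cpu"}})
print(merged)  # {'device_processor': {'device': 'cpu', 'float_dtype': None}}
```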

## Best Practices

Based on analysis of all LeRobot processor implementations, here are the key patterns and practices:

### 1. **Safe Data Handling**

```python
# ✅ Always copy data to avoid side effects
new_action = action.copy()
new_obs = observation.copy()

# ✅ Check for required data before processing
if "pixels" not in observation:
    return observation  # Pass through unchanged

# ✅ Handle None gracefully
comp = transition.get(TransitionKey.COMPLEMENTARY_DATA)
if comp is None:
    raise ValueError("Required complementary data missing")
```

### 2. **Robust Input Validation**

```python
# ✅ Validate data types and shapes
if not isinstance(action, dict):
    raise ValueError(f"Action should be a RobotAction type got {type(action)}")

# ✅ Check tensor properties before processing
if img_tensor.dtype != torch.uint8:
    raise ValueError(f"Expected torch.uint8 images, but got {img_tensor.dtype}")

# ✅ Validate required keys exist
if None in (x, y, z, wx, wy, wz):
    raise ValueError("Missing required end-effector pose components")
```

### 3. **Use Appropriate Base Classes**

```python
# ✅ Observation-only processors
class MyObsProcessor(ObservationProcessorStep):
    def observation(self, observation): ...

# ✅ Action-only processors
class MyActionProcessor(ActionProcessorStep):
    def action(self, action): ...

# ✅ Robot action processors (dict actions only)
class MyRobotActionProcessor(RobotActionProcessorStep):
    def action(self, action: dict[str, Any]): ...

# ✅ Full control processors
class MyFullProcessor(ProcessorStep):
    def __call__(self, transition: EnvTransition): ...
```

### 4. **Registration and Naming**

```python
# ✅ Always register with namespaced names
@ProcessorStepRegistry.register("my_company/image_processor")
@dataclass
class ImageProcessor(ObservationProcessorStep):
    ...

# ✅ Use descriptive, unique names
# Good: "robotics_lab/safety_clipper", "acme_corp/vision_enhancer"
# Bad: "processor", "step", "my_processor"
```

### 5. **State Management Patterns**

```python
# ✅ Use dataclass fields for internal state
@dataclass
class StatefulProcessor(ProcessorStep):
    # Public config
    window_size: int = 10

    # Internal state (not in config)
    _buffer: list = field(default_factory=list, init=False, repr=False)
    _last_value: float | None = field(default=None, init=False, repr=False)

    def reset(self):
        """Reset internal state between episodes."""
        self._buffer.clear()
        self._last_value = None
```
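
The reset pattern is easy to exercise on its own. A minimal runnable sketch (plain dataclass, no `ProcessorStep` base; `RunningMean` is an illustrative stand-in for a stateful step):

```python
from dataclasses import dataclass, field

# A toy stateful step: keeps a sliding window of values,
# reset() clears the buffer between episodes.
@dataclass
class RunningMean:
    window_size: int = 10
    _buffer: list = field(default_factory=list, init=False, repr=False)

    def __call__(self, value: float) -> float:
        self._buffer.append(value)
        self._buffer = self._buffer[-self.window_size:]
        return sum(self._buffer) / len(self._buffer)

    def reset(self):
        self._buffer.clear()

step = RunningMean(window_size=2)
print(step(1.0), step(3.0), step(5.0))  # 1.0 2.0 4.0
step.reset()
print(step(10.0))  # 10.0
```

Without the `reset()` call between episodes, state from the previous episode would leak into the next one's smoothed output.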

### 6. **Error Handling**

```python
# ✅ Early returns for edge cases
if not self.enabled or action is None:
    return action

# ✅ Clear error messages for invalid inputs
if not isinstance(action, dict):
    raise ValueError(f"Action should be a RobotAction type got {type(action)}")

# ✅ Validate required keys exist
if "required_key" not in action:
    raise ValueError("Required key 'required_key' not found in action")
```

### 7. **Device and Dtype Awareness**

The key principle: **tensors stored in your processor should mimic the dtype and device of input tensors**. This enables seamless operation in multi-GPU setups, Accelerate, and data-parallel configurations.

```python
# ✅ Adapt internal state to match input tensors
def _apply_transform(self, tensor: torch.Tensor, key: str) -> torch.Tensor:
    # Check if our internal stats match the input tensor
    if key in self._tensor_stats:
        first_stat = next(iter(self._tensor_stats[key].values()))
        if first_stat.device != tensor.device or first_stat.dtype != tensor.dtype:
            # Automatically adapt to input tensor's device/dtype
            self.to(device=tensor.device, dtype=tensor.dtype)

    # Now process with matching device/dtype
    return self._process_with_stats(tensor, key)

# ✅ Implement to() method for device/dtype migration
def to(self, device=None, dtype=None):
    if device is not None:
        self.device = device
    if dtype is not None:
        self.dtype = dtype
    # Update internal tensor stats to match
    self._tensor_stats = to_tensor(self.stats, device=self.device, dtype=self.dtype)
    return self

# ✅ This pattern enables:
# - Multi-GPU training (data on different GPUs)
# - Mixed precision (float16, bfloat16)
# - Accelerate compatibility (automatic device placement)
# - Data parallel setups (distributed training)
```

## Conclusion

You now have all the tools to implement custom processors in LeRobot! The key steps are:

1. **Define your processor** as a dataclass with the required methods (`__call__`, `get_config`, `state_dict`, `load_state_dict`, `reset`, `transform_features`)
2. **Register it** using `@ProcessorStepRegistry.register("name")` for discoverability
3. **Integrate it** into a `DataProcessorPipeline` with other processing steps
4. **Use base classes** like `ObservationProcessorStep` when possible to reduce boilerplate
5. **Implement device/dtype awareness** to support multi-GPU and mixed precision setups

The processor system is designed to be modular and composable, allowing you to build complex data processing pipelines from simple, focused components. Whether you're preprocessing sensor data for training or post-processing model outputs for robot execution, custom processors give you the flexibility to handle any data transformation your robotics application requires.

Key principles for robust processors:

- **Device/dtype adaptation**: Internal tensors should match input tensors
- **Clear error messages**: Help users understand what went wrong
- **Base class usage**: Leverage specialized base classes to reduce boilerplate
- **Feature contracts**: Declare data structure changes with `transform_features()`

Start simple, test thoroughly, and ensure your processors work seamlessly across different hardware configurations!

# Introduction to Processors

In robotics, there's a fundamental mismatch between the data that robots and humans produce and what machine learning models expect. This creates several translation challenges:

**Raw Robot Data → Model Input:**

- Robots output raw sensor data (camera images, joint positions, force readings) that need normalization, batching, and device placement before models can process them
- Language instructions from humans ("pick up the red cube") must be tokenized into numerical representations
- Different robots use different coordinate systems and units that need standardization

**Model Output → Robot Commands:**

- Models might output end-effector positions, but robots need joint-space commands
- Teleoperators (like gamepads) produce relative movements (delta positions), but robots expect absolute commands
- Model predictions are often normalized and need to be converted back to real-world scales

**Cross-Domain Translation:**

- Training data from one robot setup needs adaptation for deployment on different hardware
- Models trained with specific camera configurations must work with new camera arrangements
- Datasets with different naming conventions need harmonization

**That's where processors come in.** They serve as the universal translators that bridge these gaps, ensuring seamless data flow from sensors to models to actuators.

Processors are the data transformation backbone of LeRobot. They handle all the preprocessing and postprocessing steps needed to convert raw environment data into model-ready inputs and vice versa.

## What are Processors?

In robotics, data comes in many forms: images from cameras, joint positions from sensors, text instructions from users, and more. Each type of data requires specific transformations before a model can use it effectively. Models need this data to be:

- **Normalized**: Scaled to appropriate ranges for neural network processing
- **Batched**: Organized with proper dimensions for batch processing
- **Tokenized**: Text converted to numerical representations
- **Device-placed**: Moved to the right hardware (CPU/GPU)
- **Type-converted**: Cast to appropriate data types

Processors handle these transformations through composable, reusable steps that can be chained together into pipelines. Think of them as a modular assembly line where each station performs a specific transformation on your data.

## Core Concepts

### EnvTransition: The Universal Data Container

The `EnvTransition` is the fundamental data structure that flows through all processors. It's a strongly-typed dictionary that represents a complete robot-environment interaction:

```python
from lerobot.processor import TransitionKey, EnvTransition, PolicyAction, RobotAction

# EnvTransition is precisely typed to handle different action types:
# - PolicyAction: torch.Tensor (for model inputs/outputs)
# - RobotAction: dict[str, Any] (for robot hardware)
# - EnvAction: np.ndarray (for gym environments)

# Example transition from a robot collecting data
transition: EnvTransition = {
    TransitionKey.OBSERVATION: {  # dict[str, Any] | None
        "observation.images.camera0": camera0_image_tensor,  # Shape: (H, W, C)
        "observation.images.camera1": camera1_image_tensor,  # Shape: (H, W, C)
        "observation.state": joint_positions_tensor,  # Shape: (7,) for 7-DOF arm
        "observation.environment_state": env_state_tensor,  # Shape: (3,) for object position
    },
    TransitionKey.ACTION: action_tensor,  # PolicyAction | RobotAction | EnvAction | None
    TransitionKey.REWARD: 0.0,  # float | torch.Tensor | None
    TransitionKey.DONE: False,  # bool | torch.Tensor | None
    TransitionKey.TRUNCATED: False,  # bool | torch.Tensor | None
    TransitionKey.INFO: {"success": False},  # dict[str, Any] | None
    TransitionKey.COMPLEMENTARY_DATA: {  # dict[str, Any] | None
        "task": "pick up the red cube",  # Language instruction
        "task_index": 0,  # Task identifier
        "index": 42,  # Frame index
    },
}
```

Each key in the transition has a specific purpose:

- **OBSERVATION**: All sensor data (images, states, proprioception)
- **ACTION**: The action to execute or that was executed
- **REWARD**: Reinforcement learning signal
- **DONE/TRUNCATED**: Episode boundary indicators
- **INFO**: Arbitrary metadata
- **COMPLEMENTARY_DATA**: Task descriptions, indices, padding flags, inter-step data

### ProcessorStep: The Building Block

A `ProcessorStep` is a single transformation unit that processes transitions. It's an abstract base class with two required methods:

```python
from lerobot.processor import ProcessorStep, EnvTransition

class MyProcessorStep(ProcessorStep):
    """Example processor step - inherit and implement abstract methods."""

    def __call__(self, transition: EnvTransition) -> EnvTransition:
        """Transform the transition - REQUIRED abstract method."""
        # Your processing logic here
        return transition

    def transform_features(self, features):
        """Declare how this step transforms feature shapes/types - REQUIRED abstract method."""
        return features  # Most processors return features unchanged
```
|
||||
|
||||
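As a concrete example, here is a step that clips action values to a safe range. It is written as a standalone duck-typed class (plain lists instead of tensors, no lerobot base class) so you can run it anywhere; a real step would inherit from `ProcessorStep` as shown above.

```python
class ClipActionStep:
    """Illustrative processor step: clamp every action value to [low, high]."""

    def __init__(self, low=-1.0, high=1.0):
        self.low, self.high = low, high

    def __call__(self, transition):
        # Copy the transition so the input is never mutated in place
        new_transition = dict(transition)
        action = new_transition.get("action")
        if action is not None:
            new_transition["action"] = [min(max(a, self.low), self.high) for a in action]
        return new_transition

    def transform_features(self, features):
        return features  # clipping changes values, not shapes/types


step = ClipActionStep(low=-0.5, high=0.5)
out = step({"action": [0.9, -0.7, 0.2]})
```

Note that `transform_features` returns its input unchanged: clipping affects values, not the feature contract.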
### DataProcessorPipeline: The Generic Orchestrator

The `DataProcessorPipeline[TInput, TOutput]` chains multiple `ProcessorStep` instances together with compile-time type safety:

```python
from lerobot.processor import RobotProcessorPipeline, PolicyProcessorPipeline

# For robot hardware (unbatched data)
robot_processor = RobotProcessorPipeline[dict[str, Any], dict[str, Any]](
    steps=[step1, step2, step3],
    name="robot_pipeline",
)

# For model training/inference (batched data)
policy_processor = PolicyProcessorPipeline[dict[str, Any], dict[str, Any]](
    steps=[step1, step2, step3],
    name="policy_pipeline",
)
```
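Conceptually, a pipeline is just "convert input to a transition, run each step in order, convert the result back out". The toy class below sketches that core loop in plain Python (it is not the lerobot implementation, which adds typing, registration, and serialization on top):

```python
class MiniPipeline:
    """Sketch of a pipeline's core loop: convert in, run steps, convert out."""

    def __init__(self, steps, to_transition=lambda x: x, to_output=lambda x: x):
        self.steps = steps
        self.to_transition = to_transition
        self.to_output = to_output

    def __call__(self, data):
        transition = self.to_transition(data)
        for step in self.steps:
            transition = step(transition)
        return self.to_output(transition)


# Two tiny "steps" operating on a dict-shaped transition
scale = lambda tr: {**tr, "action": [2 * a for a in tr["action"]]}
shift = lambda tr: {**tr, "action": [a + 1 for a in tr["action"]]}

pipeline = MiniPipeline(steps=[scale, shift])
result = pipeline({"action": [1.0, 2.0]})  # scale first, then shift
```

Because steps share one data shape, they compose freely: reordering `scale` and `shift` changes the result, which is exactly why step order matters in real pipelines (see the GPU-before-normalize tip later in this guide).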
## RobotProcessorPipeline vs PolicyProcessorPipeline

The key distinction is in the data structures they handle:

| Aspect          | RobotProcessorPipeline                       | PolicyProcessorPipeline                  |
| --------------- | -------------------------------------------- | ---------------------------------------- |
| **Input**       | `dict[str, Any]` - Individual robot values   | `dict[str, Any]` - Batched tensors       |
| **Output**      | `dict[str, Any]` - Individual robot commands | `torch.Tensor` - Policy predictions      |
| **Use Case**    | Real-time robot control                      | Model training/inference                 |
| **Data Format** | Unbatched, heterogeneous                     | Batched, homogeneous                     |
| **Examples**    | `{"joint_1": 0.5}`                           | `{"observation.state": tensor([[0.5]])}` |
**Use `RobotProcessorPipeline`** for robot hardware interfaces:

```python
# Robot data structures: dict[str, Any] for observations and actions
robot_obs: dict[str, Any] = {
    "joint_1": 0.5,  # Individual joint values
    "joint_2": -0.3,
    "camera_0": image_array,  # Raw camera data
}

robot_action: dict[str, Any] = {
    "joint_1": 0.2,  # Target joint positions
    "joint_2": 0.1,
    "gripper": 0.8,
}
```

**Use `PolicyProcessorPipeline`** for model training and batch processing:

```python
# Policy data structures: batch dicts and tensors
policy_batch: dict[str, Any] = {
    "observation.state": torch.tensor([[0.5, -0.3]]),  # Batched states
    "observation.images.camera0": torch.tensor(...),  # Batched images
    "action": torch.tensor([[0.2, 0.1, 0.8]]),  # Batched actions
}

policy_action: torch.Tensor = torch.tensor([[0.2, 0.1, 0.8]])  # Model output tensor
```
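Bridging these two worlds means collecting individual robot values into one batched vector. The helper below sketches that conversion with plain Python lists standing in for tensors (the function name and use of an explicit key order are assumptions for illustration, not lerobot's converter API):

```python
def robot_obs_to_policy_batch(robot_obs, state_keys):
    """Collect individual joint scalars into a single batched state vector.

    The outer list represents a batch dimension of 1, mirroring what
    AddBatchDimensionProcessorStep does with tensors.
    """
    state = [robot_obs[key] for key in state_keys]
    return {"observation.state": [state]}


batch = robot_obs_to_policy_batch(
    {"joint_1": 0.5, "joint_2": -0.3},
    state_keys=["joint_1", "joint_2"],  # explicit order: dict order is not a contract
)
```

The key order must be fixed and consistent between training and deployment, otherwise the policy sees permuted state vectors.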
## Converter Functions

LeRobot provides converter functions to bridge different data formats:

```python
from lerobot.processor.converters import (
    # Robot hardware converters
    robot_action_to_transition,  # Robot dict → EnvTransition
    observation_to_transition,  # Robot obs → EnvTransition
    transition_to_robot_action,  # EnvTransition → Robot dict

    # Policy/training converters
    batch_to_transition,  # Batch dict → EnvTransition
    transition_to_batch,  # EnvTransition → Batch dict
    policy_action_to_transition,  # Policy tensor → EnvTransition
    transition_to_policy_action,  # EnvTransition → Policy tensor

    # Utilities
    create_transition,  # Build transitions with defaults
    merge_transitions,  # Combine multiple transitions
    identity_transition,  # Pass-through converter
)
```
## Real-World Examples

### Robot Control Pipeline

```python
# Phone teleoperation → Robot control (from examples/phone_to_so100/)
phone_to_robot = RobotProcessorPipeline[RobotAction, RobotAction](
    steps=[
        MapPhoneActionToRobotAction(platform=PhoneOS.IOS),  # Phone → robot targets
        EEReferenceAndDelta(kinematics=solver, ...),  # Deltas → absolute pose
        EEBoundsAndSafety(bounds=..., max_step=0.2),  # Safety limits
        InverseKinematicsEEToJoints(kinematics=solver),  # Pose → joint angles
        GripperVelocityToJoint(motor_names=motors),  # Gripper control
    ],
    to_transition=robot_action_to_transition,
    to_output=transition_to_robot_action,
)

# Usage: phone_action → robot_joints
phone_input = {"phone.pos": [0.1, 0.2, 0.0], "phone.rot": rotation}
robot_joints = phone_to_robot(phone_input)
robot.send_action(robot_joints)
```
### Policy Training Pipeline

```python
# Training data preprocessing (optimized order for GPU performance)
training_preprocessor = PolicyProcessorPipeline[dict[str, Any], dict[str, Any]](
    steps=[
        RenameObservationsProcessorStep(rename_map={}),  # Standardize keys
        AddBatchDimensionProcessorStep(),  # Add batch dims
        TokenizerProcessorStep(tokenizer_name="...", ...),  # Tokenize language
        DeviceProcessorStep(device="cuda"),  # Move to GPU first ⚡
        NormalizerProcessorStep(features=..., stats=...),  # Normalize on GPU ⚡
    ]
)

# Model output postprocessing
training_postprocessor = PolicyProcessorPipeline[torch.Tensor, torch.Tensor](
    steps=[
        DeviceProcessorStep(device="cpu"),  # Move to CPU
        UnnormalizerProcessorStep(features=..., stats=...),  # Denormalize
    ]
)
```
### Mixed Robot + Policy Pipeline

```python
# Real deployment: Robot sensors → Model → Robot commands
with torch.no_grad():
    while not done:
        # 1. Get robot observation (unbatched)
        raw_obs = robot.get_observation()  # dict[str, Any]

        # 2. Process for policy (add batching, normalize)
        policy_input = policy_preprocessor(raw_obs)  # Batched dict

        # 3. Run model
        policy_output = policy.select_action(policy_input)  # Policy tensor

        # 4. Postprocess for robot (denormalize, convert to dict)
        robot_action = policy_postprocessor(policy_output)  # dict[str, Any]

        # 5. Send to robot
        robot.send_action(robot_action)
```
## Feature Contracts: Shape and Type Transformation

Processors don't just transform data - they can also **change the data structure itself**. The `transform_features()` method declares these changes, which is crucial for dataset recording and policy creation.

### Why Feature Contracts Matter

When building datasets or policies, LeRobot needs to know:

- **What data fields will exist** after processing
- **What shapes and types** each field will have
- **How to configure models** for the expected data structure
```python
# Example: A processor that adds velocity to observations
class VelocityProcessor(ObservationProcessorStep):
    def observation(self, obs):
        new_obs = obs.copy()
        if "observation.state" in obs:
            # Add computed velocity field
            new_obs["observation.velocity"] = self._compute_velocity(obs["observation.state"])
        return new_obs

    def transform_features(self, features):
        """Declare the new velocity field we're adding."""
        if PipelineFeatureType.OBSERVATION in features:
            # Add velocity feature with same shape as state
            state_feature = features[PipelineFeatureType.OBSERVATION].get("observation.state")
            if state_feature:
                features[PipelineFeatureType.OBSERVATION]["observation.velocity"] = PolicyFeature(
                    type=FeatureType.STATE,
                    shape=state_feature.shape,  # Same shape as position
                )
        return features
```
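The `_compute_velocity` helper above is not shown; one plausible (assumed, not lerobot code) implementation is a finite difference over consecutive states:

```python
def compute_velocity(prev_state, curr_state, dt=0.02):
    """Finite-difference velocity estimate between two state vectors.

    dt is the control period (e.g., 0.02 s for a 50 Hz loop).
    """
    return [(curr - prev) / dt for prev, curr in zip(prev_state, curr_state)]


vel = compute_velocity(prev_state=[0.0, 1.0], curr_state=[0.1, 0.9], dt=0.1)
```

A real step would also need to cache the previous state between calls (e.g., as an attribute), and emit zeros on the first frame of an episode.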
### Real Examples from LeRobot

**Phone Action Mapping** - Transforms action structure:

```python
# Input features: {"phone.pos": (3,), "phone.rot": (4,), "phone.enabled": (1,)}
# Output features: {"target_x": (1,), "target_y": (1,), ..., "gripper": (1,)}

def transform_features(self, features):
    # Remove phone-specific keys
    features[PipelineFeatureType.ACTION].pop("phone.pos", None)
    features[PipelineFeatureType.ACTION].pop("phone.rot", None)

    # Add robot target keys
    features[PipelineFeatureType.ACTION]["target_x"] = PolicyFeature(type=FeatureType.ACTION, shape=(1,))
    features[PipelineFeatureType.ACTION]["target_y"] = PolicyFeature(type=FeatureType.ACTION, shape=(1,))
    # ... more target fields
    return features
```
**Forward Kinematics** - Adds computed observations:

```python
# Input: Joint positions
# Output: Joint positions + End-effector pose

def transform_features(self, features):
    # Add end-effector pose features computed from joints
    for axis in ["x", "y", "z", "wx", "wy", "wz"]:
        features[PipelineFeatureType.OBSERVATION][f"observation.state.ee.{axis}"] = PolicyFeature(
            type=FeatureType.STATE, shape=(1,)
        )
    return features
```
**Tokenization** - Adds language features:

```python
# Input: Text in complementary_data
# Output: Token IDs and attention mask in observations

def transform_features(self, features):
    features[PipelineFeatureType.OBSERVATION]["observation.language.tokens"] = PolicyFeature(
        type=FeatureType.LANGUAGE, shape=(self.max_length,)
    )
    features[PipelineFeatureType.OBSERVATION]["observation.language.attention_mask"] = PolicyFeature(
        type=FeatureType.LANGUAGE, shape=(self.max_length,)
    )
    return features
```
### Feature Aggregation in Practice

```python
from lerobot.datasets.pipeline_features import aggregate_pipeline_dataset_features

# Start with robot's raw features
initial_features = create_initial_features(
    observation=robot.observation_features,  # {"joint_1.pos": float, "camera_0": (480,640,3)}
    action=robot.action_features,  # {"joint_1.pos": float, "gripper.pos": float}
)

# Apply processor pipeline to compute final features
final_features = aggregate_pipeline_dataset_features(
    pipeline=my_processor_pipeline,
    initial_features=initial_features,
    use_videos=True,
)

# Result: Complete feature specification for dataset/policy
# {
#     "observation.state": {"shape": (7,), "dtype": "float32"},
#     "observation.images.camera_0": {"shape": (3, 480, 640), "dtype": "uint8"},
#     "observation.velocity": {"shape": (7,), "dtype": "float32"},  # Added by processor!
#     "action": {"shape": (7,), "dtype": "float32"}
# }

# Use for dataset creation
dataset = LeRobotDataset.create(
    repo_id="my_dataset",
    features=final_features,  # Knows exactly what data to expect
    ...
)
```
## Common Processor Steps

LeRobot provides many registered processor steps. Here are the most commonly used core processors:

### Essential Processors

- **`normalizer_processor`**: Normalize observations/actions using dataset statistics (mean/std or min/max)
- **`device_processor`**: Move tensors to CPU/GPU with optional dtype conversion
- **`to_batch_processor`**: Add batch dimensions to transitions for model compatibility
- **`rename_observations_processor`**: Rename observation keys using mapping dictionaries
- **`tokenizer_processor`**: Tokenize natural language task descriptions into tokens and attention masks
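To make the normalizer's behavior concrete, here is a minimal mean/std normalization sketch in plain Python (the real `NormalizerProcessorStep` operates on tensors and per-feature dataset statistics; the function below is a simplified stand-in):

```python
def normalize(batch, stats):
    """Apply (x - mean) / std per key, leaving keys without stats untouched."""
    out = dict(batch)
    for key, (mean, std) in stats.items():
        if key in out:
            out[key] = [(value - mean) / std for value in out[key]]
    return out


normalized = normalize(
    {"observation.state": [2.0, 4.0]},
    stats={"observation.state": (3.0, 1.0)},  # (mean, std) from dataset statistics
)
```

The unnormalizer applies the inverse, `x * std + mean`, which is why both steps must share the exact same statistics.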
## Performance Tips

**🚀 Critical Optimization**: Always move data to GPU **before** normalization for significant speedups:

```python
# ✅ FAST: GPU normalization
steps=[
    DeviceProcessorStep(device="cuda"),  # Move to GPU first
    NormalizerProcessorStep(...),  # Normalize on GPU - much faster!
]

# ❌ SLOW: CPU normalization
steps=[
    NormalizerProcessorStep(...),  # Normalize on CPU - slow
    DeviceProcessorStep(device="cuda"),  # Move to GPU after
]
```
## Next Steps

- **[Implement Your Own Processor](implement_your_own_processor.mdx)** - Create custom processor steps
- **[Debug Your Pipeline](debug_processor_pipeline.mdx)** - Troubleshoot and optimize pipelines
- **[Processors for Robots and Teleoperators](processors_robots_teleop.mdx)** - Real-world integration patterns
## Summary

Processors solve the data translation problem in robotics by providing:

- **Modular transformations**: Composable, reusable processing steps
- **Type safety**: Generic pipelines with compile-time checking
- **Performance optimization**: GPU-accelerated operations
- **Robot/Policy distinction**: Separate pipelines for different data structures
- **Comprehensive ecosystem**: 30+ registered processors for common tasks

The key insight: `RobotProcessorPipeline` handles unbatched robot hardware data, while `PolicyProcessorPipeline` handles batched model data. Choose the right tool for your data structure!
# Phone

Use your phone (iOS or Android) to control your robot.

**In this guide you'll learn:**

- How to connect an iOS/Android phone
- How phone pose is mapped to robot end‑effector (EE) targets
- How to tweak safety limits, gripper control, and IK settings

To use a phone to control your robot, install the relevant dependencies with:

```bash
pip install lerobot[phone]
```
## Get started

### Supported platforms

- iOS: Uses the HEBI Mobile I/O app (ARKit pose + buttons). Download the app first and open it; the examples will discover it on your network and stream the phone pose and inputs.
- Android: Uses the `teleop` package (WebXR). When you start the Python process, it prints a local URL. Open the link on your phone, tap Start, then use Move to stream pose.

Links:

- Android WebXR library: [`teleop` on PyPI](https://pypi.org/project/teleop/)
- iOS app: [HEBI Mobile I/O](https://docs.hebi.us/tools.html#mobile-io)
### Phone orientation and controls

- Orientation: hold the phone with the screen facing up and the top edge pointing in the same direction as the robot gripper. This ensures calibration aligns the phone’s frame with the robot frame so motion feels natural.
- Enable/disable:
  - iOS: Hold `B1` to enable teleoperation, release to stop. The first press captures a reference pose.
  - Android: Press and hold the `Move` button, release to stop. The first press captures a reference pose.
- Gripper control:
  - iOS: Analog input `A3` controls the gripper as a velocity input.
  - Android: Buttons `A` and `B` act as increment/decrement (A opens, B closes). You can tune the velocity in the `GripperVelocityToJoint` step.
### Step 1: Choose the platform

Modify the examples to use `PhoneOS.IOS` or `PhoneOS.ANDROID` in `PhoneConfig`. The API is identical across platforms; only the input source differs. All examples are under `examples/` and have `phone_so100_*.py` variants.

Teleoperation example:

```36:43:examples/phone_so100_teleop.py
from lerobot.teleoperators.phone.config_phone import PhoneConfig, PhoneOS

teleop_config = PhoneConfig(phone_os=PhoneOS.IOS)  # or PhoneOS.ANDROID
teleop_device = Phone(teleop_config)
```
### Step 2: Connect and calibrate

When `Phone(teleop_config)` is created and `connect()` is called, calibration is prompted automatically. Hold the phone in the orientation described above, then:

- iOS: press and hold `B1` to capture the reference pose.
- Android: press the `Move` button on the WebXR page to capture the reference pose.

Why calibrate? We capture the current pose so subsequent poses are expressed in a robot-aligned frame. Each time you press the button to re-enable control, the reference pose is recaptured, avoiding drift if the phone was repositioned while control was disabled.
### Step 3: Run an example

Run one of the example scripts to teleoperate, record a dataset, replay a dataset, or evaluate a policy.

All scripts assume you configured your robot (e.g., SO-100 follower) and set the correct serial port.

- Android: after starting the script, open the printed local URL on your phone, tap Start, then press and hold Move.
- iOS: open HEBI Mobile I/O first; B1 enables motion. A3 controls the gripper.

You can customize mapping or safety limits by editing the processor steps shown in the examples.

You can also remap inputs (e.g., use a different analog input) or adapt the pipeline to other robots (e.g., LeKiwi) by modifying the input and kinematics steps. More about this in the [Processors for Robots and Teleoperators](./processors_robots_teleop.mdx) guide.

- Run this example to teleoperate:

```bash
python examples/phone_so100_teleop.py
```

- Run this example to record a dataset, which saves absolute end-effector observations and actions:

```bash
python examples/phone_so100_record.py
```

- Run this example to replay recorded episodes:

```bash
python examples/phone_so100_replay.py
```

- Run this example to evaluate a pretrained policy:

```bash
python examples/phone_so100_eval.py
```
### Important pipeline steps and options

- Kinematics are used in multiple steps. We use [Placo](https://github.com/Rhoban/placo), a wrapper around Pinocchio, for handling our kinematics. We construct the kinematics object by passing the robot's URDF and target frame, setting `target_frame_name` to the gripper frame.

```44:49:examples/phone_so100_teleop.py
RobotKinematics(
    urdf_path="./src/lerobot/teleoperators/sim/so101_new_calib.urdf",
    target_frame_name="gripper_frame_link",
    joint_names=list(robot.bus.motors.keys()),
)
```
- The `MapPhoneActionToRobotAction` step converts the calibrated phone pose and inputs into target deltas and gripper commands. Below is what the step outputs.

```72:83:src/lerobot/teleoperators/phone/phone_processor.py
# Map calibrated phone pose to robot targets (enabled gates the motion)
act.update(
    {
        "action.enabled": enabled,
        "action.target_x": -pos[1] if enabled else 0.0,
        "action.target_y": pos[0] if enabled else 0.0,
        "action.target_z": pos[2] if enabled else 0.0,
        "action.target_wx": rotvec[1] if enabled else 0.0,
        "action.target_wy": rotvec[0] if enabled else 0.0,
        "action.target_wz": -rotvec[2] if enabled else 0.0,
        "action.gripper": gripper,
    }
)
```
- The `EEReferenceAndDelta` step converts target deltas to an absolute desired EE pose, storing a reference on enable. The `end_effector_step_sizes` are the step sizes for the EE pose and can be modified to change the motion speed.

```56:65:examples/phone_so100_teleop.py
EEReferenceAndDelta(
    kinematics=kinematics_solver,
    end_effector_step_sizes={"x": 0.5, "y": 0.5, "z": 0.5},
    motor_names=list(robot.bus.motors.keys()),
)
```
- The `EEBoundsAndSafety` step clamps EE motion to a workspace and checks for large EE step jumps to ensure safety. The `end_effector_bounds` define the workspace for the EE pose, and `max_ee_step_m` and `max_ee_twist_step_rad` are the per-step limits; modify them to change the safety limits.

```61:66:examples/phone_so100_teleop.py
EEBoundsAndSafety(
    end_effector_bounds={"min": [-1.0, -1.0, -1.0], "max": [1.0, 1.0, 1.0]},
    max_ee_step_m=0.10,
    max_ee_twist_step_rad=0.50,
)
```
- The `GripperVelocityToJoint` step turns a velocity‑like gripper input into an absolute gripper position using the current measured state. The `speed_factor` is the factor by which the velocity is multiplied.

```78:81:examples/phone_so100_teleop.py
GripperVelocityToJoint(
    motor_names=list(robot.bus.motors.keys()),
    speed_factor=20.0,
)
```
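The integration this step performs can be sketched as follows (function name and clamping bounds are assumptions for illustration, not the actual implementation):

```python
def gripper_velocity_to_position(current_pos, velocity_cmd, speed_factor=20.0,
                                 low=0.0, high=100.0):
    """Integrate a velocity-style gripper command into an absolute target.

    current_pos comes from the robot's measured state each frame, so the
    command is always relative to where the gripper actually is.
    """
    target = current_pos + velocity_cmd * speed_factor
    return min(max(target, low), high)  # clamp to the gripper's travel range


target = gripper_velocity_to_position(current_pos=50.0, velocity_cmd=1.0)
```

A larger `speed_factor` makes the gripper respond faster to the same analog input; the clamp prevents commands beyond the joint's range.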
#### Different IK initial guesses

We use different IK initial guesses in the kinematic steps: either the current measured joints or the previous IK solution.

- Closed loop (used in record/eval): set `initial_guess_current_joints=True` so IK starts from the measured joints each frame.

```71:76:examples/phone_so100_eval.py
InverseKinematicsEEToJoints(
    kinematics=kinematics_solver,
    motor_names=list(robot.bus.motors.keys()),
    initial_guess_current_joints=True,  # closed loop
)
```

- Open loop (used in replay): set `initial_guess_current_joints=False` so IK continues from the previous IK solution rather than the measured state. This preserves action stability when we replay without feedback.

```80:86:examples/phone_so100_replay.py
InverseKinematicsEEToJoints(
    kinematics=kinematics_solver,
    motor_names=list(robot.bus.motors.keys()),
    initial_guess_current_joints=False,  # open loop
)
```
### Pipeline steps explained

- MapPhoneActionToRobotAction: converts calibrated phone pose and inputs into target deltas and a gripper command. Motion is gated by an enable signal (B1 on iOS, Move on Android).
- AddRobotObservationAsComplimentaryData: reads current robot joints and inserts them under `complementary_data.raw_joint_positions` for FK/IK steps to use.
- EEReferenceAndDelta: latches a reference EE pose on enable and combines it with target deltas to produce an absolute desired EE pose each frame. When disabled, it keeps sending the last commanded pose.
- EEBoundsAndSafety: clamps the EE pose to a workspace and rate‑limits jumps for safety. Also declares `action.ee.*` features.
- InverseKinematicsEEToJoints: turns an EE pose into joint positions with IK. `initial_guess_current_joints=True` is recommended for closed‑loop control; set `False` for open‑loop replay for stability.
- GripperVelocityToJoint: integrates a velocity‑like gripper input into an absolute gripper position using the current measured state.
- ForwardKinematicsJointsToEE: computes `observation.state.ee.*` from observed joints for logging and training on EE state.
### Troubleshooting

- iOS not discovered: ensure HEBI Mobile I/O is open and your laptop and phone are on the same network.
- Android URL not reachable: check that you used `https` instead of `http`, use the exact IP printed by the script, and allow your browser to proceed past the self-signed certificate warning.
- Motion feels inverted: adjust the sign flips in `MapPhoneActionToRobotAction` or swap axes to match your setup.
# Processors for Robots and Teleoperators

This guide shows how to build and modify processing pipelines that connect teleoperators (e.g., phone) to robots and datasets. Pipelines standardize conversions between different action/observation spaces so you can swap teleops and robots without rewriting glue code.

We use the Phone to SO‑100 follower examples for concreteness, but the same patterns apply to other robots.

**What you'll learn**

- Absolute vs. relative EE control: What each means, trade‑offs, and how to choose for your task.
- Three-pipeline pattern: How to map teleop actions → dataset actions → robot commands, and robot observations → dataset observations.
- Adapters (`to_transition` / `to_output`): How these convert raw dicts to `EnvTransition` and back to reduce boilerplate.
- Dataset feature contracts: How steps declare features via `transform_features(...)`, and how to aggregate/merge them for recording.
- Choosing a representation: When to store joints, absolute EE poses, or relative EE deltas, and how that affects training.
- Pipeline customization guidance: How to swap robots/URDFs safely and tune bounds, step sizes, and options like IK initialization.
### Absolute vs relative EE control

The examples in this guide use absolute end-effector (EE) poses because they are easy to reason about. In practice, relative EE deltas or joint positions are often preferred as learning features.

You can choose what you save and learn from (joints, absolute EE poses, or relative EE deltas, in either the teleop or robot action space) by using or implementing the right steps (and `transform_features()`) in your pipelines.
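To illustrate the difference between the two representations, here is a toy conversion from a trajectory of absolute EE positions to relative per-step deltas (translation only; rotations require proper composition and are omitted; this is illustrative code, not lerobot's):

```python
def absolute_to_relative(poses):
    """Convert a sequence of absolute EE positions into per-step deltas.

    Each delta is curr - prev, so the first pose becomes the reference and
    the output has one fewer entry than the input.
    """
    return [
        [curr_axis - prev_axis for prev_axis, curr_axis in zip(prev, curr)]
        for prev, curr in zip(poses[:-1], poses[1:])
    ]


deltas = absolute_to_relative(
    [[0.0, 0.0, 0.1], [0.0, 0.05, 0.1], [0.02, 0.05, 0.1]]
)
```

Relative deltas make the policy invariant to where the trajectory starts, at the cost of accumulating drift when executed open loop.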
## Three pipelines

We often compose three pipelines. Depending on your setup, some can be empty if action and observation spaces already match.
Each of these pipelines handles a different conversion between action and observation spaces:

1. Pipeline 1: Teleop action space → dataset action space (phone pose → EE targets)
2. Pipeline 2: Dataset action space → robot command space (EE targets → joints)
3. Pipeline 3: Robot observation space → dataset observation space (joints → EE pose)

Below is an example of the three pipelines that we use in the phone to SO-100 follower examples:

```69:90:examples/phone_so100_record.py
phone_to_robot_ee_pose = RobotProcessor(  # teleop -> dataset action
    steps=[MapPhoneActionToRobotAction(platform=teleop_config.phone_os),
           AddRobotObservationAsComplimentaryData(robot=robot),
           EEReferenceAndDelta(kinematics=kinematics_solver,
                               end_effector_step_sizes={"x": 0.5, "y": 0.5, "z": 0.5},
                               motor_names=list(robot.bus.motors.keys())),
           EEBoundsAndSafety(end_effector_bounds={"min": [-1, -1, -1], "max": [1, 1, 1]},
                             max_ee_step_m=0.20, max_ee_twist_step_rad=0.50)],
    to_transition=to_transition_teleop_action,
    to_output=lambda tr: tr,
)

robot_ee_to_joints = RobotProcessor(  # dataset action -> robot
    steps=[InverseKinematicsEEToJoints(kinematics=kinematics_solver,
                                       motor_names=list(robot.bus.motors.keys()),
                                       initial_guess_current_joints=True),
           GripperVelocityToJoint(motor_names=list(robot.bus.motors.keys()), speed_factor=20.0)],
    to_transition=lambda tr: tr,
    to_output=to_output_robot_action,
)

robot_joints_to_ee_pose = RobotProcessor(  # robot obs -> dataset obs
    steps=[ForwardKinematicsJointsToEE(kinematics=kinematics_solver,
                                       motor_names=list(robot.bus.motors.keys()))],
    to_transition=to_transition_robot_observation,
    to_output=lambda tr: tr,
)
```
## Why to_transition / to_output

To convert from robot/teleoperator to pipeline and back, we use the `to_transition` and `to_output` pipeline adapters.
They standardize conversions to reduce boilerplate code, and form the bridge between the robot's and teleoperator's raw dicts and the pipeline’s `EnvTransition` format.
In the phone to SO-100 follower examples we use the following adapters:

- `to_transition_teleop_action`: transforms the teleop action dict to a pipeline transition (puts keys under `action.*`, converts scalars/arrays to tensors, keeps objects like `Rotation` intact)
- `to_output_robot_action`: transforms the pipeline transition to a robot action dict (extracts keys ending with `.pos`/`.vel` and strips the `action.` prefix)
- `to_transition_robot_observation`: transforms the robot observation dict to a pipeline transition (splits state vs images; stores state under `observation.state.*` and images under `observation.images.*`)

See `src/lerobot/processor/converters.py` for more details.
## Dataset feature contracts

Dataset features are the keys saved in the dataset. Each step can declare its dataset features via `transform_features(...)`. We can then aggregate features per pipeline with `aggregate_pipeline_dataset_features()` and merge multiple groups with `merge_features(...)`.

Below is an example of how we declare features with the `transform_features` method in the phone to SO-100 follower examples:

```203:211:src/lerobot/robots/so100_follower/robot_kinematic_processor.py
def transform_features(self, features: dict[str, PolicyFeature]) -> dict[str, PolicyFeature]:
    # Because this is last step we specify the dataset features of this step that we want to be stored in the dataset
    features["action.ee.x"] = float
    features["action.ee.y"] = float
    features["action.ee.z"] = float
    features["action.ee.wx"] = float
    features["action.ee.wy"] = float
    features["action.ee.wz"] = float
    return features
```

Tip: declare features at the last step that produces them (e.g., `EEBoundsAndSafety` declares `action.ee.*`, `ForwardKinematicsJointsToEE` declares `observation.state.ee.*`).

Below is an example of how we aggregate and merge features in the phone to SO-100 follower examples:

```121:145:examples/phone_so100_record.py
action_ee = aggregate_pipeline_dataset_features(
    pipeline=phone_to_robot_ee_pose,
    initial_features=phone.action_features,
    use_videos=True,
    patterns=["action.ee"],
)

gripper = aggregate_pipeline_dataset_features(
    pipeline=robot_ee_to_joints,
    initial_features={},
    use_videos=True,
    patterns=["action.gripper.pos", "observation.state.gripper.pos"],
)

observation_ee = aggregate_pipeline_dataset_features(
    pipeline=robot_joints_to_ee_pose,
    initial_features=robot.observation_features,
    use_videos=True,
    patterns=["observation.state.ee"],
)

dataset_features = merge_features(action_ee, gripper, observation_ee)
```

How it works:

- `aggregate_pipeline_dataset_features(...)`: applies `transform_features` across the pipeline and filters by patterns (images included when `use_videos=True`).
- `merge_features(...)`: combines multiple feature dicts.
- Recording uses `to_dataset_frame(...)` to build frames consistent with `dataset.features` before we call `add_frame(...)` to add the frame to the dataset.
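The merging contract can be sketched as a dict union that rejects conflicting redefinitions (illustrative only; the real `merge_features` lives in lerobot and operates on `PolicyFeature` specifications):

```python
def merge_features(*feature_dicts):
    """Merge feature groups into one dict, raising on conflicting keys."""
    merged = {}
    for feature_dict in feature_dicts:
        for key, value in feature_dict.items():
            if key in merged and merged[key] != value:
                raise ValueError(f"Conflicting feature definition for {key!r}")
            merged[key] = value
    return merged


features = merge_features(
    {"action.ee.x": "float32"},
    {"observation.state.ee.x": "float32"},
)
```

Failing loudly on conflicts matters here: two pipelines silently disagreeing on a feature's shape or dtype would corrupt the recorded dataset.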
## Guidance when customizing robot pipelines

You can store any of the following features as your action/observation space:

- Joint positions
- Absolute EE poses
- Relative EE deltas
- Other features: joint velocity, etc.

Pick what you want to use for your policy action and observation space and configure/modify the pipelines and steps accordingly.

### Different robots

- Swap the `RobotKinematics` URDF and `motor_names`. Ensure `target_frame_name` points to your gripper/wrist.

### Safety first

- When changing pipelines, start with tight bounds and implement safety steps when working with real robots.
- It's advised to start in simulation first and then move to real robots.
Hope this guide helps you get started with customizing your robot pipelines. If you run into any issues at any point, jump into our [Discord community](https://discord.com/invite/s3KuuzsPFb) for support.