more changes

2026-07-06 09:37:06 +00:00 · 2025-11-17 13:06:30 +01:00
parent cb7d2ed0fc
commit f3b25eb425
3 changed files with 0 additions and 540 deletions
@@ -1,165 +0,0 @@
-# XVLA Custom Processor Steps - Implementation Summary
-
-## Overview
-Implemented three custom processor steps for XVLA that encapsulate the preprocessing and postprocessing logic previously scattered in `lerobot_eval.py` (lines 165-184).
-
-## Files Modified
-
-### 1. `/src/lerobot/policies/xvla/processor_xvla.py`
-**Changes:**
- Added imports: `dataclass`, `numpy`, `Rotate6D_to_AxisAngle`, processor core types
- Implemented 3 new processor step classes (all registered with `ProcessorStepRegistry`)
-
-**New Classes:**
-
-#### `XVLAImageScaleProcessorStep` 
- **Registry Name:** `xvla_image_scale`
- **Purpose:** Scales image observations by 255 (converts [0,1] to [0,255])
- **Configuration:** 
-  - `image_keys: list[str] | None` - Image keys to scale; auto-detected when `None`
- **Location:** Lines 93-140
-
-#### `XVLAAddDomainIdProcessorStep`
- **Registry Name:** `xvla_add_domain_id`  
- **Purpose:** Adds domain_id tensor to complementary data
- **Configuration:**
-  - `domain_id: int = 3` - Domain identifier
-  - `device: str = "cuda"` - Tensor device
- **Location:** Lines 143-192
-
-#### `XVLARotation6DToAxisAngleProcessorStep`
- **Registry Name:** `xvla_rotation_6d_to_axis_angle`
- **Purpose:** Converts 6D rotation to axis-angle and reorganizes action dimensions
-  - Input: [eef(3), rotation_6d(6), gripper(1)] = 10D
-  - Output: [eef(3), axis_angle(3), gripper(1)] = 7D
- **Configuration:**
-  - `expected_action_dim: int = 10`
- **Location:** Lines 195-255
-
-### 2. `/src/lerobot/policies/xvla/README_PROCESSORS.md` (NEW)
-Comprehensive documentation covering:
- Processor step descriptions and configurations
- Integration examples for preprocessing/postprocessing pipelines
- Before/after comparison showing simplified evaluation code
- JSON/YAML configuration examples
- Reference to Groot processor patterns
-
-## Key Features
-
-### 1. **Registry-Based Architecture**
-All processors are registered with `@ProcessorStepRegistry.register()`, enabling:
- Instantiation from configuration files
- Serialization/deserialization with policies
- Easy discovery and debugging
-
-### 2. **Proper ProcessorStep Interface**
-Each processor implements:
- `__call__(transition: EnvTransition) -> EnvTransition` - Main processing logic
- `transform_features(features) -> features` - Feature contract declaration
- `get_config() -> dict` - Serializable configuration
-
-### 3. **Safe Data Handling**
- All processors use `transition.copy()` to avoid side effects
- Proper handling of missing/None values
- Device-aware tensor operations
-
-### 4. **Configurable and Reusable**
- All parameters exposed in `get_config()`
- Can be customized per deployment
- Works with any XVLA model configuration
-
-## Usage Impact
-
-### Before (from lerobot_eval.py):
-```python
-# Lines 165-184 - scattered preprocessing/postprocessing
-observation[f"observation.images.image"] = observation[f"observation.images.image"] * 255
-observation[f"observation.images.image2"] = observation[f"observation.images.image2"] * 255
-observation = add_envs_task(env, observation)
-observation = preprocessor(observation)
-observation["domain_id"] = torch.tensor([int(3)], dtype=torch.long).to("cuda")
-
-with torch.inference_mode():
-    action = policy.select_action(observation).to("cpu").numpy()
-target_eef = action[:, :3]
-target_axis = Rotate6D_to_AxisAngle(action[:, 3:9])
-target_act = action[:, 9:10]
-action_numpy = np.concatenate([target_eef, target_axis, target_act], axis=-1)
-```
-
-### After (with custom processors):
-```python
-# Clean and simple - processors encapsulate all the logic
-observation = add_envs_task(env, observation)
-observation = preprocessor(observation)  # Includes image scaling + domain_id
-
-with torch.inference_mode():
-    action = policy.select_action(observation)
-action = postprocessor(action)  # Includes rotation conversion + device transfer
-action_numpy = action.numpy()
-```
-
-## Design Patterns Followed
-
-1. **Groot Processor Reference:** Followed same patterns as `processor_groot.py`:
-   - Dataclass-based configuration
-   - Registry registration
-   - State management via `get_config()`
-   - Proper transition handling
-
-2. **LeRobot Processor Guidelines:** (from `implement_your_own_processor.mdx`):
-   - Safe data handling with `copy()`
-   - Clear error messages
-   - Device/dtype awareness
-   - Feature contract declaration
-
-3. **Pipeline Integration:** 
-   - Works seamlessly with `PolicyProcessorPipeline`
-   - Automatic dict ↔ EnvTransition conversion
-   - Composable with other processor steps
-
-## Benefits
-
-1. **Cleaner Code:** The evaluation loop reduces to a preprocessor call, policy inference, and a postprocessor call
-2. **Maintainable:** Processing logic is centralized and well-documented
-3. **Configurable:** All parameters can be adjusted via config files
-4. **Reusable:** Can be used across different XVLA deployments
-5. **Testable:** Each processor can be tested independently
-6. **Serializable:** Processors save/load with the policy
-
-## Testing Recommendations
-
-1. **Unit Tests:**
-   - Test each processor with sample transitions
-   - Verify image scaling (multiply by 255)
-   - Verify domain_id addition and device placement
-   - Verify rotation conversion accuracy
-
-2. **Integration Tests:**
-   - Test full preprocessing pipeline
-   - Test full postprocessing pipeline
-   - Verify evaluation loop still works correctly
-   - Test with different domain_ids and devices
-
-3. **Configuration Tests:**
-   - Test loading processors from config
-   - Test serialization/deserialization
-   - Test overrides mechanism
-
-## Next Steps
-
-1. **Update XVLA Policy Factory:** Optionally add these processors to the default pipeline in `make_xvla_pre_post_processors()` or document how to add them via config
-
-2. **Update lerobot_eval.py:** Simplify the evaluation code to use the new processors
-
-3. **Add Configuration Examples:** Create sample config files showing processor integration
-
-4. **Add Tests:** Implement unit and integration tests for the new processors
-
-## Notes
-
- No changes made to `make_xvla_pre_post_processors()` as requested
- Processors are available but not automatically included (must be added via config)
- All processors follow LeRobot conventions and best practices
- Compatible with existing XVLA model configurations
-
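The 10D → 7D conversion described above for `XVLARotation6DToAxisAngleProcessorStep` can be sketched in plain NumPy. This is a minimal illustration, not the repository's `Rotate6D_to_AxisAngle`: the function names are hypothetical, and it assumes the common convention in which the six rotation values are the first two columns of the rotation matrix, completed by Gram-Schmidt orthogonalization. Rotations with an angle close to π are not handled.

```python
import numpy as np


def rotation_6d_to_axis_angle(rot6d: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Convert (..., 6) rotation values to (..., 3) axis-angle vectors.

    Assumption: the 6 values are the first two columns of the rotation matrix.
    """
    a1, a2 = rot6d[..., :3], rot6d[..., 3:6]
    # Gram-Schmidt: build an orthonormal basis from the two input vectors.
    b1 = a1 / np.linalg.norm(a1, axis=-1, keepdims=True)
    a2_orth = a2 - np.sum(b1 * a2, axis=-1, keepdims=True) * b1
    b2 = a2_orth / np.linalg.norm(a2_orth, axis=-1, keepdims=True)
    b3 = np.cross(b1, b2)
    rot = np.stack([b1, b2, b3], axis=-1)  # b1, b2, b3 become matrix columns

    # Rotation angle from the trace, clipped for numerical safety.
    trace = rot[..., 0, 0] + rot[..., 1, 1] + rot[..., 2, 2]
    angle = np.arccos(np.clip((trace - 1.0) / 2.0, -1.0, 1.0))

    # Off-diagonal differences equal 2 * sin(angle) * axis.
    vec = np.stack(
        [
            rot[..., 2, 1] - rot[..., 1, 2],
            rot[..., 0, 2] - rot[..., 2, 0],
            rot[..., 1, 0] - rot[..., 0, 1],
        ],
        axis=-1,
    )
    sin = np.sin(angle)[..., None]
    valid = np.abs(sin) > eps
    # angle / (2 sin(angle)) tends to 0.5 as the angle goes to zero.
    scale = np.where(valid, angle[..., None] / (2.0 * np.where(valid, sin, 1.0)), 0.5)
    return vec * scale


def convert_action(action: np.ndarray) -> np.ndarray:
    """Map [eef(3), rotation_6d(6), gripper(1)] to [eef(3), axis_angle(3), gripper(1)]."""
    if action.shape[-1] != 10:
        raise ValueError(f"expected a 10D action, got {action.shape[-1]}D")
    return np.concatenate(
        [action[..., :3], rotation_6d_to_axis_angle(action[..., 3:9]), action[..., 9:10]],
        axis=-1,
    )
```

Checking the output dimension up front mirrors the role of `expected_action_dim`: a mismatched action fails loudly instead of being sliced incorrectly.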
@@ -1,141 +0,0 @@
-# XVLA Custom Processors - Quick Start
-
-## What Was Implemented
-
-Three custom processor steps that simplify XVLA evaluation by encapsulating preprocessing and postprocessing logic:
-
-```
-┌─────────────────────────────────────────────────────────────┐
-│  PREPROCESSING PIPELINE                                     │
-├─────────────────────────────────────────────────────────────┤
-│  1. RenameObservationsProcessorStep                         │
-│  2. AddBatchDimensionProcessorStep                          │
-│  3. XVLAImageScaleProcessorStep          ← NEW              │
-│     └─ Scales images by 255                                 │
-│  4. TokenizerProcessorStep                                  │
-│  5. DeviceProcessorStep                                     │
-│  6. XVLAAddDomainIdProcessorStep         ← NEW              │
-│     └─ Adds domain_id tensor                                │
-│  7. NormalizerProcessorStep                                 │
-└─────────────────────────────────────────────────────────────┘
-
-┌─────────────────────────────────────────────────────────────┐
-│  POSTPROCESSING PIPELINE                                    │
-├─────────────────────────────────────────────────────────────┤
-│  1. UnnormalizerProcessorStep                               │
-│  2. XVLARotation6DToAxisAngleProcessorStep  ← NEW           │
-│     └─ Converts 6D rotation to axis-angle (10D → 7D)        │
-│  3. DeviceProcessorStep(device="cpu")                       │
-└─────────────────────────────────────────────────────────────┘
-```
-
-## Simplest Usage
-
-### Option 1: Import and Use Directly
-
-```python
-from lerobot.policies.xvla.processor_xvla import (
-    XVLAImageScaleProcessorStep,
-    XVLAAddDomainIdProcessorStep,
-    XVLARotation6DToAxisAngleProcessorStep,
-)
-from lerobot.processor import DeviceProcessorStep, PolicyProcessorPipeline
-
-# Add to your existing preprocessor steps
-preprocessor = PolicyProcessorPipeline(
-    steps=[
-        # ... your existing steps ...
-        XVLAImageScaleProcessorStep(),
-        # ... more steps ...
-        XVLAAddDomainIdProcessorStep(domain_id=3),
-    ]
-)
-
-# Add to your postprocessor steps
-postprocessor = PolicyProcessorPipeline(
-    steps=[
-        XVLARotation6DToAxisAngleProcessorStep(),
-        DeviceProcessorStep(device="cpu"),
-    ]
-)
-```
-
-### Option 2: Load from Config
-
-```python
-# In your config.json or YAML:
-{
-    "preprocessor_steps": [
-        {"registry_name": "xvla_image_scale", "config": {}},
-        {"registry_name": "xvla_add_domain_id", "config": {"domain_id": 3, "device": "cuda"}}
-    ],
-    "postprocessor_steps": [
-        {"registry_name": "xvla_rotation_6d_to_axis_angle", "config": {"expected_action_dim": 10}}
-    ]
-}
-
-# Then load:
-preprocessor = PolicyProcessorPipeline.from_pretrained("path/to/config")
-```
-
-## Evaluation Loop Comparison
-
-### ❌ Old Way (Manual Processing)
-```python
-# Scattered preprocessing
-observation["observation.images.image"] *= 255
-observation["observation.images.image2"] *= 255
-observation = add_envs_task(env, observation)
-observation = preprocessor(observation)
-observation["domain_id"] = torch.tensor([3], dtype=torch.long).to("cuda")
-
-# Policy inference
-action = policy.select_action(observation).to("cpu").numpy()
-
-# Manual postprocessing
-target_eef = action[:, :3]
-target_axis = Rotate6D_to_AxisAngle(action[:, 3:9])
-target_act = action[:, 9:10]
-action = np.concatenate([target_eef, target_axis, target_act], axis=-1)
-```
-
-### ✅ New Way (With Custom Processors)
-```python
-# All preprocessing in one call
-observation = add_envs_task(env, observation)
-observation = preprocessor(observation)  # Includes scaling + domain_id
-
-# Policy inference
-action = policy.select_action(observation)
-
-# All postprocessing in one call
-action = postprocessor(action)  # Includes rotation conversion
-```
-
-**Result:** 13 lines → 7 lines of cleaner, more maintainable code!
-
-## Quick Reference
-
-| Processor | Purpose | Config Key | Default |
-|-----------|---------|------------|---------|
-| **XVLAImageScaleProcessorStep** | Scale images by 255 | `xvla_image_scale` | Auto-detect images |
-| **XVLAAddDomainIdProcessorStep** | Add domain_id tensor | `xvla_add_domain_id` | domain_id=3, device="cuda" |
-| **XVLARotation6DToAxisAngleProcessorStep** | Convert 6D→axis-angle | `xvla_rotation_6d_to_axis_angle` | expected_action_dim=10 |
-
-## Key Benefits
-
-1. ✅ **Clean code** - No scattered preprocessing logic
-2. ✅ **Configurable** - Adjust via config files
-3. ✅ **Reusable** - Works across different XVLA setups
-4. ✅ **Serializable** - Saves/loads with policy
-5. ✅ **Testable** - Each processor can be tested independently
-6. ✅ **Registry-based** - Easy instantiation from config
-
-## Next Steps
-
-1. **Update your evaluation script** to use the new processors
-2. **Add processors to your config** if using config-based loading
-3. **Test with your specific XVLA model** to ensure compatibility
-4. **Adjust parameters** as needed (domain_id, device, etc.)
-
-For detailed documentation, see `README_PROCESSORS.md`.
-
@@ -1,234 +0,0 @@
-# XVLA Configuration and Evaluation Updates - Summary
-
-## Overview
-Updated XVLA configuration files and evaluation script to use the new custom processor steps, eliminating manual preprocessing and postprocessing code.
-
-## Files Modified
-
-### 1. `/src/lerobot/policies/xvla/policy_preprocessor.json`
-
-**Added two new processor steps:**
-
-#### Step 3: `xvla_image_scale` (NEW - Lines 14-19)
-```json
-{
-  "registry_name": "xvla_image_scale",
-  "config": {
-    "image_keys": null
-  }
-}
-```
- **Position:** After `to_batch_processor`, before `tokenizer_processor`
- **Purpose:** Scales images by 255 (converts from [0,1] to [0,255])
- **Replaces:** Manual code `observation["observation.images.image"] *= 255`
-
-#### Step 6: `xvla_add_domain_id` (NEW - Lines 38-44)
-```json
-{
-  "registry_name": "xvla_add_domain_id",
-  "config": {
-    "domain_id": 3,
-    "device": "cuda"
-  }
-}
-```
- **Position:** After `device_processor`, before `normalizer_processor`
- **Purpose:** Adds domain_id tensor to complementary data
- **Replaces:** Manual code `observation["domain_id"] = torch.tensor([int(3)], dtype=torch.long).to("cuda")`
-
-**Final preprocessing pipeline order:**
-1. `rename_observations_processor`
-2. `to_batch_processor`
-3. `xvla_image_scale` ⭐ NEW
-4. `tokenizer_processor`
-5. `device_processor`
-6. `xvla_add_domain_id` ⭐ NEW
-7. `normalizer_processor`
-
-### 2. `/src/lerobot/policies/xvla/policy_postprocessor.json`
-
-**Added one new processor step and updated device:**
-
-#### Step 2: `xvla_rotation_6d_to_axis_angle` (NEW - Lines 23-28)
-```json
-{
-  "registry_name": "xvla_rotation_6d_to_axis_angle",
-  "config": {
-    "expected_action_dim": 10
-  }
-}
-```
- **Position:** After `unnormalizer_processor`, before `device_processor`
- **Purpose:** Converts 6D rotation to axis-angle (10D → 7D action)
- **Replaces:** Manual code:
-  ```python
-  target_eef = action[:, :3]
-  target_axis = Rotate6D_to_AxisAngle(action[:, 3:9])
-  target_act = action[:, 9:10]
-  action = np.concatenate([target_eef, target_axis, target_act], axis=-1)
-  ```
-
-#### Step 3: `device_processor` (UPDATED - Lines 29-35)
- **Changed device:** `"cuda"` → `"cpu"`
- **Purpose:** Move tensors to CPU for environment interaction
- **Replaces:** Manual code `.to("cpu")`
-
-**Final postprocessing pipeline order:**
-1. `unnormalizer_processor`
-2. `xvla_rotation_6d_to_axis_angle` ⭐ NEW
-3. `device_processor` (device changed to "cpu") 🔧 UPDATED
-
-### 3. `/src/lerobot/scripts/lerobot_eval.py`
-
-**Removed manual preprocessing/postprocessing code:**
-
-#### Lines 91-92: Removed import (DELETED)
-```python
-# REMOVED:
-from lerobot.policies.xvla.utils import Rotate6D_to_AxisAngle
-```
-
-#### Lines 165-184: Simplified evaluation logic (REPLACED)
-
-**Before (15 lines with manual processing):**
-```python
-observation[f"observation.images.image"] = observation[f"observation.images.image"] * 255
-observation[f"observation.images.image2"] = observation[f"observation.images.image2"] * 255
-observation = add_envs_task(env, observation)
-observation = preprocessor(observation)
-observation["domain_id"] = torch.tensor([int(3)], dtype=torch.long).to("cuda")
-
-with torch.inference_mode():
-    action = policy.select_action(observation).to("cpu").numpy()
-# action = postprocessor(action)  # THIS WAS COMMENTED OUT
-target_eef = action[:, :3]
-target_axis = Rotate6D_to_AxisAngle(action[:, 3:9])
-target_act = action[:, 9:10]
-action_numpy = np.concatenate([target_eef, target_axis, target_act], axis=-1)
-
-# Convert to CPU / numpy.
-# action_numpy: np.ndarray = action.to("cpu").numpy()
-assert action_numpy.ndim == 2, "Action dimensions should be (batch, action_dim)"
-```
-
-**After (11 lines, clean and simple):**
-```python
-observation = add_envs_task(env, observation)
-
-# Preprocess observation (includes image scaling and domain_id addition)
-observation = preprocessor(observation)
-
-# Policy inference
-with torch.inference_mode():
-    action = policy.select_action(observation)
-
-# Postprocess action (includes rotation conversion and device transfer to CPU)
-action = postprocessor(action)
-
-# Convert to numpy
-action_numpy: np.ndarray = action.numpy()
-assert action_numpy.ndim == 2, "Action dimensions should be (batch, action_dim)"
-```
-
-## Impact Summary
-
-### Code Reduction
- **Lines removed:** ~13 lines of manual processing code
- **Lines added:** ~7 lines of clean processor calls
- **Net reduction:** ~6 lines + cleaner structure
- **Removed import:** No longer need `Rotate6D_to_AxisAngle` import
-
-### Benefits
-
-1. **✅ Cleaner Code**
-   - Evaluation loop is now much simpler and more readable
-   - No scattered preprocessing logic
-   - Clear separation of concerns
-
-2. **✅ Configuration-Driven**
-   - All preprocessing/postprocessing controlled via JSON config
-   - Easy to adjust parameters (domain_id, device, etc.) without code changes
-   - Can load different configs for different deployments
-
-3. **✅ Maintainable**
-   - Processing logic centralized in processor classes
-   - Single source of truth for transformations
-   - Easier to debug and test
-
-4. **✅ Reusable**
-   - Processors work across all XVLA evaluations
-   - Can be shared between training and inference
-   - Can be serialized with the model
-
-5. **✅ Consistent**
-   - Same processing pipeline guaranteed in all contexts
-   - No risk of forgetting manual steps
-   - Automatic handling of edge cases
-
-## Testing Checklist
-
-Before deploying, verify:
-
- [ ] Images are scaled correctly (0-255 range)
- [ ] domain_id is added to complementary data
- [ ] 6D rotation correctly converts to axis-angle
- [ ] Actions are 7D after postprocessing
- [ ] Evaluation success rates match previous results
- [ ] Video rendering still works
- [ ] Multi-environment batching works correctly
-
-## Configuration Notes
-
-### Customizing Domain ID
-To change the domain ID for different embodiments, edit the `domain_id` value in `policy_preprocessor.json` (`5` below is only an example):
-```json
-{
-  "registry_name": "xvla_add_domain_id",
-  "config": {
-    "domain_id": 5,
-    "device": "cuda"
-  }
-}
-```
-
-### Customizing Image Keys
-To scale specific images only, edit `policy_preprocessor.json`:
-```json
-{
-  "registry_name": "xvla_image_scale",
-  "config": {
-    "image_keys": ["observation.images.image", "observation.images.wrist_cam"]
-  }
-}
-```
-
-### Customizing Action Dimensions
-To support different action dimensions, set `expected_action_dim` in `policy_postprocessor.json` to match your model (`12` below is only an example):
-```json
-{
-  "registry_name": "xvla_rotation_6d_to_axis_angle",
-  "config": {
-    "expected_action_dim": 12
-  }
-}
-```
-
-## Migration Guide
-
-If you have existing XVLA checkpoints without these configs:
-
-1. **Copy the updated JSON files** to your checkpoint directory
-2. **No model retraining needed** - processors are data transforms only
-3. **Test evaluation** to ensure consistent results
-4. **Update any custom evaluation scripts** to use processors
-
-## Related Files
-
- Custom processors implementation: `/src/lerobot/policies/xvla/processor_xvla.py`
- Documentation: `/src/lerobot/policies/xvla/README_PROCESSORS.md`
- Quick start: `/src/lerobot/policies/xvla/QUICK_START.md`
-
-## Questions?
-
-See the processor documentation in `/src/lerobot/policies/xvla/README_PROCESSORS.md` for detailed usage examples and troubleshooting.
-
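Because the pipeline order matters (image scaling before tokenization, `xvla_add_domain_id` after `device_processor`), a copied config can be sanity-checked before running evaluation. The sketch below uses only the standard library; the helper names are hypothetical, and the top-level `"steps"` key is an assumption about the file layout, so adjust it to match the actual JSON.

```python
import json


def step_names(config_text: str) -> list[str]:
    """Extract registry names in pipeline order (assumes a top-level "steps" list)."""
    config = json.loads(config_text)
    return [step["registry_name"] for step in config["steps"]]


def comes_before(names: list[str], first: str, second: str) -> bool:
    """True when both steps are present and `first` runs before `second`."""
    return first in names and second in names and names.index(first) < names.index(second)


def check_preprocessor_order(names: list[str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the order looks right."""
    problems = []
    if not comes_before(names, "to_batch_processor", "xvla_image_scale"):
        problems.append("xvla_image_scale should come after to_batch_processor")
    if not comes_before(names, "xvla_image_scale", "tokenizer_processor"):
        problems.append("xvla_image_scale should come before tokenizer_processor")
    if not comes_before(names, "device_processor", "xvla_add_domain_id"):
        problems.append("xvla_add_domain_id should come after device_processor")
    if not comes_before(names, "xvla_add_domain_id", "normalizer_processor"):
        problems.append("xvla_add_domain_id should come before normalizer_processor")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to fix a hand-edited config in a single pass.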