fix(datasets)

2026-05-11 14:49:43 +00:00 · 2026-03-09 15:06:55 +01:00 · 2026-03-08 20:53:16 +01:00 · 2026-03-07 17:57:56 +01:00 · 2026-03-07 17:46:09 +01:00 · 2026-03-07 01:14:19 +01:00
24 changed files with 5932 additions and 39 deletions
@@ -101,6 +101,9 @@
 ## Installation

 LeRobot works with Python 3.10+ and PyTorch 2.2+.
+
+### Environment Setup
+
 Create a virtual environment with Python 3.10 and activate it, e.g. with [`miniconda`](https://docs.anaconda.com/free/miniconda/index.html):

 ```bash
@@ -124,10 +127,21 @@ conda install ffmpeg -c conda-forge
 >
 > - _[On Linux only]_ Install [ffmpeg build dependencies](https://trac.ffmpeg.org/wiki/CompilationGuide/Ubuntu#GettheDependencies) and [compile ffmpeg from source with libsvtav1](https://trac.ffmpeg.org/wiki/CompilationGuide/Ubuntu#libsvtav1), and make sure you use the corresponding ffmpeg binary to your install with `which ffmpeg`.

-Install 🤗 LeRobot:
+### Install LeRobot 🤗
+
+#### From Source
+
+First, clone the repository and navigate into the directory:

 ```bash
-pip install lerobot
+git clone https://github.com/huggingface/lerobot.git
+cd lerobot
+```
+
+Then, install the library in editable mode. This is useful if you plan to contribute to the code.
+
+```bash
+pip install -e .
 ```

 > **NOTE:** If you encounter build errors, you may need to install additional dependencies (`cmake`, `build-essential`, and `ffmpeg libs`). On Linux, run:
@@ -145,6 +159,34 @@ For instance, to install 🤗 LeRobot with aloha and pusht, use:
 pip install -e ".[aloha, pusht]"
 ```

+### Installation from PyPI
+
+**Core Library:**
+Install the base package with:
+
+```bash
+pip install lerobot
+```
+
+_This installs only the default dependencies._
+
+**Extra Features:**
+To install additional functionality, use one of the following:
+
+```bash
+pip install 'lerobot[all]'          # All available features
+pip install 'lerobot[aloha,pusht]'  # Specific features (Aloha & Pusht)
+pip install 'lerobot[feetech]'      # Feetech motor support
+```
+
+_Replace `[...]` with your desired features._
+
+**Available Tags:**
+For a full list of optional dependencies, see:
+https://pypi.org/project/lerobot/
+
+### Weights & Biases
+
 To use [Weights and Biases](https://docs.wandb.ai/quickstart) for experiment tracking, log in with

 ```bash
@@ -311,7 +353,7 @@ If you want, you can cite this work with:

 ```bibtex
@misc{cadene2024lerobot,
-    author = {Cadene, Remi and Alibert, Simon and Soare, Alexander and Gallouedec, Quentin and Zouitine, Adil and Palma, Steven and Kooijmans, Pepijn and Aractingi, Michel and Shukor, Mustafa and Aubakirova, Dana and Russi, Martino and Capuano, Francesco and Pascale, Caroline and Choghari, Jade and Moss, Jess and Wolf, Thomas},
+    author = {Cadene, Remi and Alibert, Simon and Soare, Alexander and Gallouedec, Quentin and Zouitine, Adil and Palma, Steven and Kooijmans, Pepijn and Aractingi, Michel and Shukor, Mustafa and Aubakirova, Dana and Russi, Martino and Capuano, Francesco and Pascal, Caroline and Choghari, Jade and Moss, Jess and Wolf, Thomas},
    title = {LeRobot: State-of-the-art Machine Learning for Real-World Robotics in Pytorch},
    howpublished = "\url{https://github.com/huggingface/lerobot}",
    year = {2024}
@@ -294,7 +294,7 @@ dataset.push_to_hub()

 #### Dataset upload

-Locally, your dataset is stored in this folder: `~/.cache/huggingface/lerobot/{repo-id}`. At the end of data recording, your dataset will be uploaded on your Hugging Face page (e.g. https://huggingface.co/datasets/cadene/so101_test) that you can obtain by running:
+Locally, your dataset is stored in this folder: `~/.cache/huggingface/lerobot/{repo-id}`. At the end of data recording, your dataset will be uploaded on your Hugging Face page (e.g. `https://huggingface.co/datasets/${HF_USER}/so101_test`) that you can obtain by running:

 ```bash
 echo https://huggingface.co/datasets/${HF_USER}/so101_test
@@ -428,7 +428,7 @@ Your robot should replicate movements similar to those you recorded. For example

 ## Train a policy

-To train a policy to control your robot, use the [`python -m lerobot.scripts.train`](../src/lerobot/scripts/train.py) script. A few arguments are required. Here is an example command:
+To train a policy to control your robot, use the [`python -m lerobot.scripts.train`](https://github.com/huggingface/lerobot/blob/main/src/lerobot/scripts/train.py) script. A few arguments are required. Here is an example command:

 ```bash
 python -m lerobot.scripts.train \
@@ -444,7 +444,7 @@ python -m lerobot.scripts.train \
 Let's explain the command:

 1. We provided the dataset as argument with `--dataset.repo_id=${HF_USER}/so101_test`.
-2. We provided the policy with `policy.type=act`. This loads configurations from [`configuration_act.py`](../src/lerobot/policies/act/configuration_act.py). Importantly, this policy will automatically adapt to the number of motor states, motor actions and cameras of your robot (e.g. `laptop` and `phone`) which have been saved in your dataset.
+2. We provided the policy with `policy.type=act`. This loads configurations from [`configuration_act.py`](https://github.com/huggingface/lerobot/blob/main/src/lerobot/policies/act/configuration_act.py). Importantly, this policy will automatically adapt to the number of motor states, motor actions and cameras of your robot (e.g. `laptop` and `phone`) which have been saved in your dataset.
 3. We provided `policy.device=cuda` since we are training on a Nvidia GPU, but you could use `policy.device=mps` to train on Apple silicon.
 4. We provided `wandb.enable=true` to use [Weights and Biases](https://docs.wandb.ai/quickstart) for visualizing training plots. This is optional but if you use it, make sure you are logged in by running `wandb login`.

@@ -96,7 +96,7 @@ If you uploaded your dataset to the hub you can [visualize your dataset online](

 ## Train a policy

-To train a policy to control your robot, use the [`python -m lerobot.scripts.train`](../src/lerobot/scripts/train.py) script. A few arguments are required. Here is an example command:
+To train a policy to control your robot, use the [`python -m lerobot.scripts.train`](https://github.com/huggingface/lerobot/blob/main/src/lerobot/scripts/train.py) script. A few arguments are required. Here is an example command:

 ```bash
 python -m lerobot.scripts.train \
@@ -111,7 +111,7 @@ python -m lerobot.scripts.train \
 Let's explain the command:

 1. We provided the dataset as argument with `--dataset.repo_id=${HF_USER}/il_gym`.
-2. We provided the policy with `policy.type=act`. This loads configurations from [`configuration_act.py`](../src/lerobot/policies/act/configuration_act.py). Importantly, this policy will automatically adapt to the number of motor states, motor actions and cameras of your robot (e.g. `laptop` and `phone`) which have been saved in your dataset.
+2. We provided the policy with `policy.type=act`. This loads configurations from [`configuration_act.py`](https://github.com/huggingface/lerobot/blob/main/src/lerobot/policies/act/configuration_act.py). Importantly, this policy will automatically adapt to the number of motor states, motor actions and cameras of your robot (e.g. `laptop` and `phone`) which have been saved in your dataset.
 3. We provided `policy.device=cuda` since we are training on a Nvidia GPU, but you could use `policy.device=mps` to train on Apple silicon.
 4. We provided `wandb.enable=true` to use [Weights and Biases](https://docs.wandb.ai/quickstart) for visualizing training plots. This is optional but if you use it, make sure you are logged in by running `wandb login`.

@@ -1,15 +1,6 @@
 # Installation

-## Install LeRobot
-
-Currently only available from source.
-
-Download our source code:
-
-```bash
-git clone https://github.com/huggingface/lerobot.git
-cd lerobot
-```
+## Environment Setup

 Create a virtual environment with Python 3.10, using [`Miniconda`](https://docs.anaconda.com/miniconda/install/#quick-command-line-install)

@@ -40,12 +31,49 @@ conda install ffmpeg -c conda-forge
 >
 > - _[On Linux only]_ If you want to bring your own ffmpeg: Install [ffmpeg build dependencies](https://trac.ffmpeg.org/wiki/CompilationGuide/Ubuntu#GettheDependencies) and [compile ffmpeg from source with libsvtav1](https://trac.ffmpeg.org/wiki/CompilationGuide/Ubuntu#libsvtav1), and make sure you use the corresponding ffmpeg binary to your install with `which ffmpeg`.

-Install 🤗 LeRobot:
+## Install LeRobot 🤗
+
+### From Source
+
+First, clone the repository and navigate into the directory:
+
+```bash
+git clone https://github.com/huggingface/lerobot.git
+cd lerobot
+```
+
+Then, install the library in editable mode. This is useful if you plan to contribute to the code.

 ```bash
 pip install -e .
 ```

+### Installation from PyPI
+
+**Core Library:**
+Install the base package with:
+
+```bash
+pip install lerobot
+```
+
+_This installs only the default dependencies._
+
+**Extra Features:**
+To install additional functionality, use one of the following:
+
+```bash
+pip install 'lerobot[all]'          # All available features
+pip install 'lerobot[aloha,pusht]'  # Specific features (Aloha & Pusht)
+pip install 'lerobot[feetech]'      # Feetech motor support
+```
+
+_Replace `[...]` with your desired features._
+
+**Available Tags:**
+For a full list of optional dependencies, see:
+https://pypi.org/project/lerobot/
+
 ### Troubleshooting

 If you encounter build errors, you may need to install additional dependencies: `cmake`, `build-essential`, and `ffmpeg libs`.
@@ -25,7 +25,7 @@ discord = "https://discord.gg/s3KuuzsPFb"

 [project]
 name = "lerobot"
-version = "0.3.2"
+version = "0.3.3"
 description = "🤗 LeRobot: State-of-the-art Machine Learning for Real-World Robotics in Pytorch"
 readme = "README.md"
 license = { text = "Apache-2.0" }
@@ -68,15 +68,16 @@ dependencies = [
    "einops>=0.8.0",
    "opencv-python-headless>=4.9.0",
    "av>=14.2.0",
-    "torch>=2.2.1",
-    "torchcodec>=0.2.1; sys_platform != 'win32' and (sys_platform != 'linux' or (platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l')) and (sys_platform != 'darwin' or platform_machine != 'x86_64')",
-    "torchvision>=0.21.0",
    "jsonlines>=4.0.0",
    "packaging>=24.2",
    "pynput>=1.7.7",
    "pyserial>=3.5",
    "wandb>=0.20.0",

+    "torch>=2.2.1,<2.8.0", # TODO: Bumb dependency
+    "torchcodec>=0.2.1,<0.6.0; sys_platform != 'win32' and (sys_platform != 'linux' or (platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l')) and (sys_platform != 'darwin' or platform_machine != 'x86_64')", # TODO: Bumb dependency
+    "torchvision>=0.21.0,<0.23.0", # TODO: Bumb dependency
+
    "draccus==0.10.0", # TODO: Remove ==
    "gymnasium>=0.29.1,<1.0.0", # TODO: Bumb dependency
    "rerun-sdk>=0.21.0,<0.23.0", # TODO: Bumb dependency
@@ -27,6 +27,7 @@ from huggingface_hub.constants import CONFIG_NAME
 from huggingface_hub.errors import HfHubHTTPError

 from lerobot.configs.types import FeatureType, NormalizationMode, PolicyFeature
+from lerobot.constants import ACTION, OBS_STATE
 from lerobot.optim.optimizers import OptimizerConfig
 from lerobot.optim.schedulers import LRSchedulerConfig
 from lerobot.utils.hub import HubMixin
@@ -119,8 +120,8 @@ class PreTrainedConfig(draccus.ChoiceRegistry, HubMixin, abc.ABC):

    @property
    def robot_state_feature(self) -> PolicyFeature | None:
-        for _, ft in self.input_features.items():
-            if ft.type is FeatureType.STATE:
+        for ft_name, ft in self.input_features.items():
+            if ft.type is FeatureType.STATE and ft_name == OBS_STATE:
                return ft
        return None

@@ -137,8 +138,8 @@ class PreTrainedConfig(draccus.ChoiceRegistry, HubMixin, abc.ABC):

    @property
    def action_feature(self) -> PolicyFeature | None:
-        for _, ft in self.output_features.items():
-            if ft.type is FeatureType.ACTION:
+        for ft_name, ft in self.output_features.items():
+            if ft.type is FeatureType.ACTION and ft_name == ACTION:
                return ft
        return None

@@ -486,8 +486,8 @@ class LeRobotDataset(torch.utils.data.Dataset):
        self.episode_data_index = get_episode_data_index(self.meta.episodes, self.episodes)

        # Check timestamps
-        timestamps = torch.stack(self.hf_dataset["timestamp"]).numpy()
-        episode_indices = torch.stack(self.hf_dataset["episode_index"]).numpy()
+        timestamps = torch.tensor(self.hf_dataset["timestamp"]).numpy()
+        episode_indices = torch.tensor(self.hf_dataset["episode_index"]).numpy()
        ep_data_index_np = {k: t.numpy() for k, t in self.episode_data_index.items()}
        check_timestamps_sync(timestamps, episode_indices, ep_data_index_np, self.fps, self.tolerance_s)

@@ -667,7 +667,7 @@ class LeRobotDataset(torch.utils.data.Dataset):
        for key in self.meta.video_keys:
            if query_indices is not None and key in query_indices:
                timestamps = self.hf_dataset.select(query_indices[key])["timestamp"]
-                query_timestamps[key] = torch.stack(timestamps).tolist()
+                query_timestamps[key] = torch.tensor(timestamps).tolist()
            else:
                query_timestamps[key] = [current_ts]

@@ -675,7 +675,7 @@ class LeRobotDataset(torch.utils.data.Dataset):

    def _query_hf_dataset(self, query_indices: dict[str, list[int]]) -> dict:
        return {
-            key: torch.stack(self.hf_dataset.select(q_idx)[key])
+            key: torch.tensor(self.hf_dataset.select(q_idx)[key])
            for key, q_idx in query_indices.items()
            if key not in self.meta.video_keys
        }
@@ -632,7 +632,7 @@ def cycle(iterable):
            iterator = iter(iterable)


-def create_branch(repo_id, *, branch: str, repo_type: str | None = None) -> None:
+def create_branch(repo_id, *, branch: str, repo_type: str | None = None, revision: str | None = None) -> None:
    """Create a branch on a existing Hugging Face repo. Delete the branch if it already
    exists before creating it.
    """
@@ -644,7 +644,7 @@ def create_branch(repo_id, *, branch: str, repo_type: str | None = None) -> None
    if ref in refs:
        api.delete_branch(repo_id, repo_type=repo_type, branch=branch)

-    api.create_branch(repo_id, repo_type=repo_type, branch=branch)
+    api.create_branch(repo_id, repo_type=repo_type, branch=branch, revision=revision)


 def create_lerobot_dataset_card(
@@ -105,6 +105,7 @@ import filecmp
 import json
 import logging
 import math
+import re
 import shutil
 import subprocess
 import tempfile
@@ -119,6 +120,7 @@ from huggingface_hub import HfApi
 from huggingface_hub.errors import EntryNotFoundError, HfHubHTTPError
 from safetensors.torch import load_file

+from lerobot.datasets.backward_compatibility import CompatibilityError
 from lerobot.datasets.utils import (
    DEFAULT_CHUNK_SIZE,
    DEFAULT_PARQUET_PATH,
@@ -130,6 +132,7 @@ from lerobot.datasets.utils import (
    create_branch,
    create_lerobot_dataset_card,
    flatten_dict,
+    get_repo_versions,
    get_safe_version,
    load_json,
    unflatten_dict,
@@ -205,7 +208,7 @@ def convert_stats_to_json(v1_dir: Path, v2_dir: Path) -> None:
 def get_features_from_hf_dataset(
    dataset: Dataset, robot_config: RobotConfig | None = None
 ) -> dict[str, list]:
-    robot_config = parse_robot_config(robot_config)
+    robot_config = parse_robot_config(robot_config) if robot_config else None
    features = {}
    for key, ft in dataset.features.items():
        if isinstance(ft, datasets.Value):
@@ -325,7 +328,19 @@ def move_videos(
        video_files = [str(f.relative_to(work_dir)) for f in work_dir.glob("videos*/*/*/*.mp4")]
        videos_moved = True  # Videos have already been moved

-    assert len(video_files) == total_episodes * len(video_keys)
+    expected_count = total_episodes * len(video_keys)
+    if len(video_files) != expected_count:
+        print(
+            f"Warning: expected {expected_count} video files "
+            f"({total_episodes} episodes x {len(video_keys)} keys), "
+            f"found {len(video_files)}. Keeping only videos matching existing episodes."
+        )
+        episode_pattern = re.compile(r"episode_(\d+)")
+        valid_episodes = set(range(total_episodes))
+        video_files = [
+            f for f in video_files
+            if (m := episode_pattern.search(f)) and int(m.group(1)) in valid_episodes
+        ]

    lfs_untracked_videos = _get_lfs_untracked_videos(work_dir, video_files)

@@ -442,8 +457,16 @@ def convert_dataset(
    test_branch: str | None = None,
    **card_kwargs,
 ):
-    v1 = get_safe_version(repo_id, V16)
-    v1x_dir = local_dir / V16 / repo_id
+    try:
+        v1 = get_safe_version(repo_id, V16)
+    except CompatibilityError:
+        hub_versions = get_repo_versions(repo_id)
+        v1x_versions = [v for v in hub_versions if v.major == 1]
+        if not v1x_versions:
+            raise
+        v1 = f"v{max(v1x_versions)}"
+        logging.warning(f"v1.6 not found for {repo_id}, falling back to {v1}")
+    v1x_dir = local_dir / v1 / repo_id
    v20_dir = local_dir / V20 / repo_id
    v1x_dir.mkdir(parents=True, exist_ok=True)
    v20_dir.mkdir(parents=True, exist_ok=True)
@@ -455,7 +478,7 @@ def convert_dataset(
    branch = "main"
    if test_branch:
        branch = test_branch
-        create_branch(repo_id=repo_id, branch=test_branch, repo_type="dataset")
+        create_branch(repo_id=repo_id, branch=test_branch, repo_type="dataset", revision=v1)

    metadata_v1 = load_json(v1x_dir / V1_INFO_PATH)
    dataset = datasets.load_dataset("parquet", data_dir=v1x_dir / "data", split="train")
@@ -564,6 +587,12 @@ def convert_dataset(
        "features": features,
    }
    write_json(metadata_v2_0, v20_dir / INFO_PATH)
+
+    info = load_json(v20_dir / INFO_PATH)
+    if "language_instruction" in info.get("features", {}):
+        del info["features"]["language_instruction"]
+        write_json(info, v20_dir / INFO_PATH)
+
    convert_stats_to_json(v1x_dir, v20_dir)
    card = create_lerobot_dataset_card(tags=repo_tags, dataset_info=metadata_v2_0, **card_kwargs)

@@ -677,6 +706,8 @@ def main():

    if args.robot is not None:
        robot_config = make_robot_config(args.robot)
+    else:
+        robot_config = None

    del args.robot

@@ -85,7 +85,7 @@ def convert_dataset(
            path_in_repo=STATS_PATH, repo_id=dataset.repo_id, revision=branch, repo_type="dataset"
        )

-    hub_api.create_tag(repo_id, tag=CODEBASE_VERSION, revision=branch, repo_type="dataset")
+    #hub_api.create_tag(repo_id, tag=CODEBASE_VERSION, revision=branch, repo_type="dataset")


 if __name__ == "__main__":
@@ -45,6 +45,8 @@ def convert_episode_stats(dataset: LeRobotDataset, ep_idx: int):

        axes_to_reduce = (0, 2, 3) if ft["dtype"] in ["image", "video"] else 0
        keepdims = True if ft["dtype"] in ["image", "video"] else ep_ft_data.ndim == 1
+        if ft["dtype"] in ["image", "video"] and ep_ft_data.ndim == 3:
+            ep_ft_data = np.expand_dims(ep_ft_data, axis=0)
        ep_stats[key] = get_feature_stats(ep_ft_data, axis=axes_to_reduce, keepdims=keepdims)

        if ft["dtype"] in ["image", "video"]:  # remove batch dim
@@ -0,0 +1,54 @@
+#!/usr/bin/env python
+
+# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from .device_processor import DeviceProcessor
+from .normalize_processor import NormalizerProcessor, UnnormalizerProcessor
+from .observation_processor import VanillaObservationProcessor
+from .pipeline import (
+    ActionProcessor,
+    DoneProcessor,
+    EnvTransition,
+    IdentityProcessor,
+    InfoProcessor,
+    ObservationProcessor,
+    ProcessorStep,
+    ProcessorStepRegistry,
+    RewardProcessor,
+    RobotProcessor,
+    TransitionKey,
+    TruncatedProcessor,
+)
+from .rename_processor import RenameProcessor
+
+__all__ = [
+    "ActionProcessor",
+    "DeviceProcessor",
+    "DoneProcessor",
+    "EnvTransition",
+    "IdentityProcessor",
+    "InfoProcessor",
+    "NormalizerProcessor",
+    "UnnormalizerProcessor",
+    "ObservationProcessor",
+    "ProcessorStep",
+    "ProcessorStepRegistry",
+    "RenameProcessor",
+    "RewardProcessor",
+    "RobotProcessor",
+    "TransitionKey",
+    "TruncatedProcessor",
+    "VanillaObservationProcessor",
+]
@@ -0,0 +1,82 @@
+#!/usr/bin/env python
+
+# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+from dataclasses import dataclass
+from typing import Any
+
+import torch
+
+from lerobot.configs.types import PolicyFeature
+from lerobot.processor.pipeline import EnvTransition, TransitionKey
+from lerobot.utils.utils import get_safe_torch_device
+
+
+@dataclass
+class DeviceProcessor:
+    """Processes transitions by moving tensors to the specified device.
+
+    This processor ensures that all tensors in the transition are moved to the
+    specified device (CPU or GPU) before they are returned.
+    """
+
+    device: torch.device = "cpu"
+
+    def __post_init__(self):
+        self.device = get_safe_torch_device(self.device)
+        self.non_blocking = "cuda" in str(self.device)
+
+    def __call__(self, transition: EnvTransition) -> EnvTransition:
+        # Create a copy of the transition
+        new_transition = transition.copy()
+
+        # Process observation tensors
+        observation = transition.get(TransitionKey.OBSERVATION)
+        if observation is not None:
+            new_observation = {
+                k: v.to(self.device, non_blocking=self.non_blocking) if isinstance(v, torch.Tensor) else v
+                for k, v in observation.items()
+            }
+            new_transition[TransitionKey.OBSERVATION] = new_observation
+
+        # Process action tensor
+        action = transition.get(TransitionKey.ACTION)
+        if action is not None and isinstance(action, torch.Tensor):
+            new_transition[TransitionKey.ACTION] = action.to(self.device, non_blocking=self.non_blocking)
+
+        # Process reward tensor
+        reward = transition.get(TransitionKey.REWARD)
+        if reward is not None and isinstance(reward, torch.Tensor):
+            new_transition[TransitionKey.REWARD] = reward.to(self.device, non_blocking=self.non_blocking)
+
+        # Process done tensor
+        done = transition.get(TransitionKey.DONE)
+        if done is not None and isinstance(done, torch.Tensor):
+            new_transition[TransitionKey.DONE] = done.to(self.device, non_blocking=self.non_blocking)
+
+        # Process truncated tensor
+        truncated = transition.get(TransitionKey.TRUNCATED)
+        if truncated is not None and isinstance(truncated, torch.Tensor):
+            new_transition[TransitionKey.TRUNCATED] = truncated.to(
+                self.device, non_blocking=self.non_blocking
+            )
+
+        return new_transition
+
+    def get_config(self) -> dict[str, Any]:
+        """Return configuration for serialization."""
+        return {"device": self.device}
+
+    def feature_contract(self, features: dict[str, PolicyFeature]) -> dict[str, PolicyFeature]:
+        return features
@@ -0,0 +1,331 @@
+from __future__ import annotations
+
+from collections.abc import Mapping
+from dataclasses import dataclass, field
+from typing import Any
+
+import numpy as np
+import torch
+from torch import Tensor
+
+from lerobot.configs.types import FeatureType, NormalizationMode, PolicyFeature
+from lerobot.datasets.lerobot_dataset import LeRobotDataset
+from lerobot.processor.pipeline import EnvTransition, ProcessorStepRegistry, TransitionKey
+
+
+def _convert_stats_to_tensors(stats: dict[str, dict[str, Any]]) -> dict[str, dict[str, Tensor]]:
+    """Convert numpy arrays and other types to torch tensors."""
+    tensor_stats: dict[str, dict[str, Tensor]] = {}
+    for key, sub in stats.items():
+        tensor_stats[key] = {}
+        for stat_name, value in sub.items():
+            if isinstance(value, np.ndarray):
+                tensor_val = torch.from_numpy(value.astype(np.float32))
+            elif isinstance(value, torch.Tensor):
+                tensor_val = value.to(dtype=torch.float32)
+            elif isinstance(value, (int, float, list, tuple)):
+                tensor_val = torch.tensor(value, dtype=torch.float32)
+            else:
+                raise TypeError(f"Unsupported type for stats['{key}']['{stat_name}']: {type(value)}")
+            tensor_stats[key][stat_name] = tensor_val
+    return tensor_stats
+
+
+@dataclass
+@ProcessorStepRegistry.register(name="normalizer_processor")
+class NormalizerProcessor:
+    """Normalizes observations and actions in a single processor step.
+
+    This processor handles normalization of both observation and action tensors
+    using either mean/std normalization or min/max scaling to a [-1, 1] range.
+
+    For each tensor key in the stats dictionary, the processor will:
+    - Use mean/std normalization if those statistics are provided: (x - mean) / std
+    - Use min/max scaling if those statistics are provided: 2 * (x - min) / (max - min) - 1
+
+    The processor can be configured to normalize only specific keys by setting
+    the normalize_keys parameter.
+    """
+
+    # Features and normalisation map are mandatory to match the design of normalize.py
+    features: dict[str, PolicyFeature]
+    norm_map: dict[FeatureType, NormalizationMode]
+
+    # Pre-computed statistics coming from dataset.meta.stats for instance.
+    stats: dict[str, dict[str, Any]] | None = None
+
+    # Explicit subset of keys to normalise. If ``None`` every key (except
+    # "action") found in ``stats`` will be normalised. Using a ``set`` makes
+    # membership checks O(1).
+    normalize_keys: set[str] | None = None
+
+    eps: float = 1e-8
+
+    _tensor_stats: dict[str, dict[str, Tensor]] = field(default_factory=dict, init=False, repr=False)
+
+    @classmethod
+    def from_lerobot_dataset(
+        cls,
+        dataset: LeRobotDataset,
+        features: dict[str, PolicyFeature],
+        norm_map: dict[FeatureType, NormalizationMode],
+        *,
+        normalize_keys: set[str] | None = None,
+        eps: float = 1e-8,
+    ) -> NormalizerProcessor:
+        """Factory helper that pulls statistics from a :class:`LeRobotDataset`.
+
+        The features and norm_map parameters are mandatory to match the design
+        pattern used in normalize.py.
+        """
+
+        return cls(
+            features=features,
+            norm_map=norm_map,
+            stats=dataset.meta.stats,
+            normalize_keys=normalize_keys,
+            eps=eps,
+        )
+
+    def __post_init__(self):
+        # Handle deserialization from JSON config
+        if self.features and isinstance(list(self.features.values())[0], dict):
+            # Features came from JSON - need to reconstruct PolicyFeature objects
+            reconstructed_features = {}
+            for key, ft_dict in self.features.items():
+                reconstructed_features[key] = PolicyFeature(
+                    type=FeatureType(ft_dict["type"]), shape=tuple(ft_dict["shape"])
+                )
+            self.features = reconstructed_features
+
+        if self.norm_map and isinstance(list(self.norm_map.keys())[0], str):
+            # norm_map came from JSON - need to reconstruct enum keys and values
+            reconstructed_norm_map = {}
+            for ft_type_str, norm_mode_str in self.norm_map.items():
+                reconstructed_norm_map[FeatureType(ft_type_str)] = NormalizationMode(norm_mode_str)
+            self.norm_map = reconstructed_norm_map
+
+        # Convert statistics once so we avoid repeated numpy→Tensor conversions
+        # during runtime.
+        self.stats = self.stats or {}
+        self._tensor_stats = _convert_stats_to_tensors(self.stats)
+
+        # Ensure *normalize_keys* is a set for fast look-ups and compare by
+        # value later when returning the configuration.
+        if self.normalize_keys is not None and not isinstance(self.normalize_keys, set):
+            self.normalize_keys = set(self.normalize_keys)
+
+    def _normalize_obs(self, observation):
+        if observation is None:
+            return None
+
+        # Decide which keys should be normalised for this call.
+        if self.normalize_keys is not None:
+            keys_to_norm = self.normalize_keys
+        else:
+            # Use feature map to skip action keys.
+            keys_to_norm = {k for k, ft in self.features.items() if ft.type is not FeatureType.ACTION}
+
+        processed = dict(observation)
+        for key in keys_to_norm:
+            if key not in processed or key not in self._tensor_stats:
+                continue
+
+            orig_val = processed[key]
+            tensor = (
+                orig_val.to(dtype=torch.float32)
+                if isinstance(orig_val, torch.Tensor)
+                else torch.as_tensor(orig_val, dtype=torch.float32)
+            )
+            stats = {k: v.to(tensor.device) for k, v in self._tensor_stats[key].items()}
+
+            if "mean" in stats and "std" in stats:
+                mean, std = stats["mean"], stats["std"]
+                processed[key] = (tensor - mean) / (std + self.eps)
+            elif "min" in stats and "max" in stats:
+                min_val, max_val = stats["min"], stats["max"]
+                processed[key] = 2 * (tensor - min_val) / (max_val - min_val + self.eps) - 1
+        return processed
+
+    def _normalize_action(self, action):
+        if action is None or "action" not in self._tensor_stats:
+            return action
+
+        tensor = (
+            action.to(dtype=torch.float32)
+            if isinstance(action, torch.Tensor)
+            else torch.as_tensor(action, dtype=torch.float32)
+        )
+        stats = {k: v.to(tensor.device) for k, v in self._tensor_stats["action"].items()}
+        if "mean" in stats and "std" in stats:
+            mean, std = stats["mean"], stats["std"]
+            return (tensor - mean) / (std + self.eps)
+        if "min" in stats and "max" in stats:
+            min_val, max_val = stats["min"], stats["max"]
+            return 2 * (tensor - min_val) / (max_val - min_val + self.eps) - 1
+        raise ValueError("Action stats must contain either ('mean','std') or ('min','max')")
+
+    def __call__(self, transition: EnvTransition) -> EnvTransition:
+        observation = self._normalize_obs(transition.get(TransitionKey.OBSERVATION))
+        action = self._normalize_action(transition.get(TransitionKey.ACTION))
+
+        # Create a new transition with normalized values
+        new_transition = transition.copy()
+        new_transition[TransitionKey.OBSERVATION] = observation
+        new_transition[TransitionKey.ACTION] = action
+        return new_transition
+
+    def get_config(self) -> dict[str, Any]:
+        config = {
+            "eps": self.eps,
+            "features": {
+                key: {"type": ft.type.value, "shape": ft.shape} for key, ft in self.features.items()
+            },
+            "norm_map": {ft_type.value: norm_mode.value for ft_type, norm_mode in self.norm_map.items()},
+        }
+        if self.normalize_keys is not None:
+            # Serialise as a list for YAML / JSON friendliness
+            config["normalize_keys"] = sorted(self.normalize_keys)
+        return config
+
+    def state_dict(self) -> dict[str, Tensor]:
+        flat = {}
+        for key, sub in self._tensor_stats.items():
+            for stat_name, tensor in sub.items():
+                flat[f"{key}.{stat_name}"] = tensor
+        return flat
+
+    def load_state_dict(self, state: Mapping[str, Tensor]) -> None:
+        self._tensor_stats.clear()
+        for flat_key, tensor in state.items():
+            key, stat_name = flat_key.rsplit(".", 1)
+            self._tensor_stats.setdefault(key, {})[stat_name] = tensor
+
+    def reset(self):
+        pass
+
+    def feature_contract(self, features: dict[str, PolicyFeature]) -> dict[str, PolicyFeature]:
+        return features
+
+
+@dataclass
+@ProcessorStepRegistry.register(name="unnormalizer_processor")
+class UnnormalizerProcessor:
+    """Inverse normalisation for observations and actions.
+
+    Exactly mirrors :class:`NormalizerProcessor` but applies the inverse
+    transform.
+    """
+
+    features: dict[str, PolicyFeature]
+    norm_map: dict[FeatureType, NormalizationMode]
+    stats: dict[str, dict[str, Any]] | None = None
+
+    _tensor_stats: dict[str, dict[str, Tensor]] = field(default_factory=dict, init=False, repr=False)
+
+    @classmethod
+    def from_lerobot_dataset(
+        cls,
+        dataset: LeRobotDataset,
+        features: dict[str, PolicyFeature],
+        norm_map: dict[FeatureType, NormalizationMode],
+    ) -> UnnormalizerProcessor:
+        return cls(features=features, norm_map=norm_map, stats=dataset.meta.stats)
+
+    def __post_init__(self):
+        # Handle deserialization from JSON config
+        if self.features and isinstance(list(self.features.values())[0], dict):
+            # Features came from JSON - need to reconstruct PolicyFeature objects
+            reconstructed_features = {}
+            for key, ft_dict in self.features.items():
+                reconstructed_features[key] = PolicyFeature(
+                    type=FeatureType(ft_dict["type"]), shape=tuple(ft_dict["shape"])
+                )
+            self.features = reconstructed_features
+
+        if self.norm_map and isinstance(list(self.norm_map.keys())[0], str):
+            # norm_map came from JSON - need to reconstruct enum keys and values
+            reconstructed_norm_map = {}
+            for ft_type_str, norm_mode_str in self.norm_map.items():
+                reconstructed_norm_map[FeatureType(ft_type_str)] = NormalizationMode(norm_mode_str)
+            self.norm_map = reconstructed_norm_map
+
+        self.stats = self.stats or {}
+        self._tensor_stats = _convert_stats_to_tensors(self.stats)
+
+    def _unnormalize_obs(self, observation):
+        if observation is None:
+            return None
+        keys = [k for k, ft in self.features.items() if ft.type is not FeatureType.ACTION]
+        processed = dict(observation)
+        for key in keys:
+            if key not in processed or key not in self._tensor_stats:
+                continue
+            orig_val = processed[key]
+            tensor = (
+                orig_val.to(dtype=torch.float32)
+                if isinstance(orig_val, torch.Tensor)
+                else torch.as_tensor(orig_val, dtype=torch.float32)
+            )
+            stats = {k: v.to(tensor.device) for k, v in self._tensor_stats[key].items()}
+            if "mean" in stats and "std" in stats:
+                mean, std = stats["mean"], stats["std"]
+                processed[key] = tensor * std + mean
+            elif "min" in stats and "max" in stats:
+                min_val, max_val = stats["min"], stats["max"]
+                processed[key] = (tensor + 1) / 2 * (max_val - min_val) + min_val
+        return processed
+
+    def _unnormalize_action(self, action):
+        if action is None or "action" not in self._tensor_stats:
+            return action
+        tensor = (
+            action.to(dtype=torch.float32)
+            if isinstance(action, torch.Tensor)
+            else torch.as_tensor(action, dtype=torch.float32)
+        )
+        stats = {k: v.to(tensor.device) for k, v in self._tensor_stats["action"].items()}
+        if "mean" in stats and "std" in stats:
+            mean, std = stats["mean"], stats["std"]
+            return tensor * std + mean
+        if "min" in stats and "max" in stats:
+            min_val, max_val = stats["min"], stats["max"]
+            return (tensor + 1) / 2 * (max_val - min_val) + min_val
+        raise ValueError("Action stats must contain either ('mean','std') or ('min','max')")
+
+    def __call__(self, transition: EnvTransition) -> EnvTransition:
+        observation = self._unnormalize_obs(transition.get(TransitionKey.OBSERVATION))
+        action = self._unnormalize_action(transition.get(TransitionKey.ACTION))
+
+        # Create a new transition with unnormalized values
+        new_transition = transition.copy()
+        new_transition[TransitionKey.OBSERVATION] = observation
+        new_transition[TransitionKey.ACTION] = action
+        return new_transition
+
+    def get_config(self) -> dict[str, Any]:
+        return {
+            "features": {
+                key: {"type": ft.type.value, "shape": ft.shape} for key, ft in self.features.items()
+            },
+            "norm_map": {ft_type.value: norm_mode.value for ft_type, norm_mode in self.norm_map.items()},
+        }
+
+    def state_dict(self) -> dict[str, Tensor]:
+        flat = {}
+        for key, sub in self._tensor_stats.items():
+            for stat_name, tensor in sub.items():
+                flat[f"{key}.{stat_name}"] = tensor
+        return flat
+
+    def load_state_dict(self, state: Mapping[str, Tensor]) -> None:
+        self._tensor_stats.clear()
+        for flat_key, tensor in state.items():
+            key, stat_name = flat_key.rsplit(".", 1)
+            self._tensor_stats.setdefault(key, {})[stat_name] = tensor
+
+    def reset(self):
+        pass
+
+    def feature_contract(self, features: dict[str, PolicyFeature]) -> dict[str, PolicyFeature]:
+        return features
@@ -0,0 +1,157 @@
+#!/usr/bin/env python
+
+# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+from dataclasses import dataclass
+
+import einops
+import numpy as np
+import torch
+from torch import Tensor
+
+from lerobot.configs.types import PolicyFeature
+from lerobot.constants import OBS_ENV_STATE, OBS_IMAGE, OBS_IMAGES, OBS_STATE
+from lerobot.processor.pipeline import ObservationProcessor, ProcessorStepRegistry
+
+
+@dataclass
+@ProcessorStepRegistry.register(name="observation_processor")
+class VanillaObservationProcessor(ObservationProcessor):
+    """
+    Processes environment observations into the LeRobot format by handling both images and states.
+
+    Image processing:
+        - Converts channel-last (H, W, C) images to channel-first (C, H, W)
+        - Normalizes uint8 images ([0, 255]) to float32 ([0, 1])
+        - Adds a batch dimension if missing
+        - Supports single images and image dictionaries
+
+    State processing:
+        - Maps 'environment_state' to observation.environment_state
+        - Maps 'agent_pos' to observation.state
+        - Converts numpy arrays to tensors
+        - Adds a batch dimension if missing
+    """
+
+    def _process_single_image(self, img: np.ndarray) -> Tensor:
+        """Process a single image array."""
+        # Convert to tensor
+        img_tensor = torch.from_numpy(img)
+
+        # Add batch dimension if needed
+        if img_tensor.ndim == 3:
+            img_tensor = img_tensor.unsqueeze(0)
+
+        # Validate image format
+        _, h, w, c = img_tensor.shape
+        if not (c < h and c < w):
+            raise ValueError(f"Expected channel-last images, but got shape {img_tensor.shape}")
+
+        if img_tensor.dtype != torch.uint8:
+            raise ValueError(f"Expected torch.uint8 images, but got {img_tensor.dtype}")
+
+        # Convert to channel-first format
+        img_tensor = einops.rearrange(img_tensor, "b h w c -> b c h w").contiguous()
+
+        # Convert to float32 and normalize to [0, 1]
+        img_tensor = img_tensor.type(torch.float32) / 255.0
+
+        return img_tensor
+
+    def _process_observation(self, observation):
+        """
+        Processes both image and state observations.
+        """
+
+        processed_obs = observation.copy()
+
+        if "pixels" in processed_obs:
+            pixels = processed_obs.pop("pixels")
+
+            if isinstance(pixels, dict):
+                imgs = {f"{OBS_IMAGES}.{key}": img for key, img in pixels.items()}
+            else:
+                imgs = {OBS_IMAGE: pixels}
+
+            for imgkey, img in imgs.items():
+                processed_obs[imgkey] = self._process_single_image(img)
+
+        if "environment_state" in processed_obs:
+            env_state_np = processed_obs.pop("environment_state")
+            env_state = torch.from_numpy(env_state_np).float()
+            if env_state.dim() == 1:
+                env_state = env_state.unsqueeze(0)
+            processed_obs[OBS_ENV_STATE] = env_state
+
+        if "agent_pos" in processed_obs:
+            agent_pos_np = processed_obs.pop("agent_pos")
+            agent_pos = torch.from_numpy(agent_pos_np).float()
+            if agent_pos.dim() == 1:
+                agent_pos = agent_pos.unsqueeze(0)
+            processed_obs[OBS_STATE] = agent_pos
+
+        return processed_obs
+
+    def observation(self, observation):
+        return self._process_observation(observation)
+
+    def feature_contract(self, features: dict[str, PolicyFeature]) -> dict[str, PolicyFeature]:
+        """Transforms feature keys to a standardized contract.
+
+        This method handles several renaming patterns:
+        - Exact matches (e.g., 'pixels' -> 'OBS_IMAGE').
+        - Prefixed exact matches (e.g., 'observation.pixels' -> 'OBS_IMAGE').
+        - Prefix matches (e.g., 'pixels.cam1' -> 'OBS_IMAGES.cam1').
+        - Prefixed prefix matches (e.g., 'observation.pixels.cam1' -> 'OBS_IMAGES.cam1').
+        - environment_state -> OBS_ENV_STATE,
+        - agent_pos -> OBS_STATE,
+        - observation.environment_state -> OBS_ENV_STATE,
+        - observation.agent_pos -> OBS_STATE
+        """
+        exact_pairs = {
+            "pixels": OBS_IMAGE,
+            "environment_state": OBS_ENV_STATE,
+            "agent_pos": OBS_STATE,
+        }
+
+        prefix_pairs = {
+            "pixels.": f"{OBS_IMAGES}.",
+        }
+
+        for key in list(features.keys()):
+            matched_prefix = False
+            for old_prefix, new_prefix in prefix_pairs.items():
+                prefixed_old = f"observation.{old_prefix}"
+                if key.startswith(prefixed_old):
+                    suffix = key[len(prefixed_old) :]
+                    features[f"{new_prefix}{suffix}"] = features.pop(key)
+                    matched_prefix = True
+                    break
+
+                if key.startswith(old_prefix):
+                    suffix = key[len(old_prefix) :]
+                    features[f"{new_prefix}{suffix}"] = features.pop(key)
+                    matched_prefix = True
+                    break
+
+            if matched_prefix:
+                continue
+
+            for old, new in exact_pairs.items():
+                if key == old or key == f"observation.{old}":
+                    if key in features:
+                        features[new] = features.pop(key)
+                        break
+
+        return features
@@ -0,0 +1,51 @@
+#!/usr/bin/env python
+
+# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+from dataclasses import dataclass, field
+from typing import Any
+
+from lerobot.configs.types import PolicyFeature
+from lerobot.processor.pipeline import (
+    ObservationProcessor,
+    ProcessorStepRegistry,
+)
+
+
+@dataclass
+@ProcessorStepRegistry.register(name="rename_processor")
+class RenameProcessor(ObservationProcessor):
+    """Rename processor that renames keys in the observation."""
+
+    rename_map: dict[str, str] = field(default_factory=dict)
+
+    def observation(self, observation):
+        processed_obs = {}
+        for key, value in observation.items():
+            if key in self.rename_map:
+                processed_obs[self.rename_map[key]] = value
+            else:
+                processed_obs[key] = value
+
+        return processed_obs
+
+    def get_config(self) -> dict[str, Any]:
+        return {"rename_map": self.rename_map}
+
+    def feature_contract(self, features: dict[str, PolicyFeature]) -> dict[str, PolicyFeature]:
+        """Transforms:
+        - Each key in the observation that appears in `rename_map` is renamed to its value.
+        - Keys not in `rename_map` remain unchanged.
+        """
+        return {self.rename_map.get(k, k): v for k, v in features.items()}
@@ -19,6 +19,7 @@ import traceback
 import pytest
 from serial import SerialException

+from lerobot.configs.types import FeatureType, PolicyFeature
 from tests.utils import DEVICE

 # Import fixture modules as plugins
@@ -69,3 +70,19 @@ def patch_builtins_input(monkeypatch):
            print(text)

    monkeypatch.setattr("builtins.input", print_text)
+
+
+@pytest.fixture
+def policy_feature_factory():
+    """PolicyFeature factory"""
+
+    def _pf(ft: FeatureType, shape: tuple[int, ...]) -> PolicyFeature:
+        return PolicyFeature(type=ft, shape=shape)
+
+    return _pf
+
+
+def assert_contract_is_typed(features: dict[str, PolicyFeature]) -> None:
+    assert isinstance(features, dict)
+    assert all(isinstance(k, str) for k in features.keys())
+    assert all(isinstance(v, PolicyFeature) for v in features.values())
@@ -27,11 +27,13 @@ from lerobot import available_policies
 from lerobot.configs.default import DatasetConfig
 from lerobot.configs.train import TrainPipelineConfig
 from lerobot.configs.types import FeatureType, NormalizationMode, PolicyFeature
+from lerobot.constants import ACTION, OBS_STATE
 from lerobot.datasets.factory import make_dataset
 from lerobot.datasets.utils import cycle, dataset_to_policy_features
 from lerobot.envs.factory import make_env, make_env_config
 from lerobot.envs.utils import preprocess_observation
 from lerobot.optim.factory import make_optimizer_and_scheduler
+from lerobot.policies.act.configuration_act import ACTConfig
 from lerobot.policies.act.modeling_act import ACTTemporalEnsembler
 from lerobot.policies.factory import (
    get_policy_class,
@@ -363,6 +365,54 @@ def test_normalize(insert_temporal_dim):
    unnormalize(output_batch)


+@pytest.mark.parametrize("multikey", [True, False])
+def test_multikey_construction(multikey: bool):
+    """
+    Asserts that multiple keys with type State/Action are correctly processed by the policy constructor,
+    preventing erroneous creation of the policy object.
+    """
+    input_features = {
+        "observation.state": PolicyFeature(
+            type=FeatureType.STATE,
+            shape=(10,),
+        ),
+    }
+    output_features = {
+        "action": PolicyFeature(
+            type=FeatureType.ACTION,
+            shape=(5,),
+        ),
+    }
+
+    if multikey:
+        """Simulates the complete state/action is constructed from more granular multiple
+        keys, of the same type as the overall state/action"""
+        input_features = {}
+        input_features["observation.state.subset1"] = PolicyFeature(type=FeatureType.STATE, shape=(5,))
+        input_features["observation.state.subset2"] = PolicyFeature(type=FeatureType.STATE, shape=(5,))
+        input_features["observation.state"] = PolicyFeature(type=FeatureType.STATE, shape=(10,))
+
+        output_features = {}
+        output_features["action.first_three_motors"] = PolicyFeature(type=FeatureType.ACTION, shape=(3,))
+        output_features["action.last_two_motors"] = PolicyFeature(type=FeatureType.ACTION, shape=(2,))
+        output_features["action"] = PolicyFeature(
+            type=FeatureType.ACTION,
+            shape=(5,),
+        )
+
+    config = ACTConfig(input_features=input_features, output_features=output_features)
+
+    state_condition = config.robot_state_feature == input_features[OBS_STATE]
+    action_condition = config.action_feature == output_features[ACTION]
+
+    assert state_condition, (
+        f"Discrepancy detected. Robot state feature is {config.robot_state_feature} but policy expects {input_features[OBS_STATE]}"
+    )
+    assert action_condition, (
+        f"Discrepancy detected. Action feature is {config.action_feature} but policy expects {output_features[ACTION]}"
+    )
+
+
@pytest.mark.parametrize(
    "ds_repo_id, policy_name, policy_kwargs, file_name_extra",
    [
@@ -0,0 +1,282 @@
+import torch
+
+from lerobot.processor.pipeline import (
+    RobotProcessor,
+    TransitionKey,
+    _default_batch_to_transition,
+    _default_transition_to_batch,
+)
+
+
+def _dummy_batch():
+    """Create a dummy batch using the new format with observation.* and next.* keys."""
+    return {
+        "observation.image.left": torch.randn(1, 3, 128, 128),
+        "observation.image.right": torch.randn(1, 3, 128, 128),
+        "observation.state": torch.tensor([[0.1, 0.2, 0.3, 0.4]]),
+        "action": torch.tensor([[0.5]]),
+        "next.reward": 1.0,
+        "next.done": False,
+        "next.truncated": False,
+        "info": {"key": "value"},
+    }
+
+
+def test_observation_grouping_roundtrip():
+    """Test that observation.* keys are properly grouped and ungrouped."""
+    proc = RobotProcessor([])
+    batch_in = _dummy_batch()
+    batch_out = proc(batch_in)
+
+    # Check that all observation.* keys are preserved
+    original_obs_keys = {k: v for k, v in batch_in.items() if k.startswith("observation.")}
+    reconstructed_obs_keys = {k: v for k, v in batch_out.items() if k.startswith("observation.")}
+
+    assert set(original_obs_keys.keys()) == set(reconstructed_obs_keys.keys())
+
+    # Check tensor values
+    assert torch.allclose(batch_out["observation.image.left"], batch_in["observation.image.left"])
+    assert torch.allclose(batch_out["observation.image.right"], batch_in["observation.image.right"])
+    assert torch.allclose(batch_out["observation.state"], batch_in["observation.state"])
+
+    # Check other fields
+    assert torch.allclose(batch_out["action"], batch_in["action"])
+    assert batch_out["next.reward"] == batch_in["next.reward"]
+    assert batch_out["next.done"] == batch_in["next.done"]
+    assert batch_out["next.truncated"] == batch_in["next.truncated"]
+    assert batch_out["info"] == batch_in["info"]
+
+
+def test_batch_to_transition_observation_grouping():
+    """Test that _default_batch_to_transition correctly groups observation.* keys."""
+    batch = {
+        "observation.image.top": torch.randn(1, 3, 128, 128),
+        "observation.image.left": torch.randn(1, 3, 128, 128),
+        "observation.state": [1, 2, 3, 4],
+        "action": "action_data",
+        "next.reward": 1.5,
+        "next.done": True,
+        "next.truncated": False,
+        "info": {"episode": 42},
+    }
+
+    transition = _default_batch_to_transition(batch)
+
+    # Check observation is a dict with all observation.* keys
+    assert isinstance(transition[TransitionKey.OBSERVATION], dict)
+    assert "observation.image.top" in transition[TransitionKey.OBSERVATION]
+    assert "observation.image.left" in transition[TransitionKey.OBSERVATION]
+    assert "observation.state" in transition[TransitionKey.OBSERVATION]
+
+    # Check values are preserved
+    assert torch.allclose(
+        transition[TransitionKey.OBSERVATION]["observation.image.top"], batch["observation.image.top"]
+    )
+    assert torch.allclose(
+        transition[TransitionKey.OBSERVATION]["observation.image.left"], batch["observation.image.left"]
+    )
+    assert transition[TransitionKey.OBSERVATION]["observation.state"] == [1, 2, 3, 4]
+
+    # Check other fields
+    assert transition[TransitionKey.ACTION] == "action_data"
+    assert transition[TransitionKey.REWARD] == 1.5
+    assert transition[TransitionKey.DONE]
+    assert not transition[TransitionKey.TRUNCATED]
+    assert transition[TransitionKey.INFO] == {"episode": 42}
+    assert transition[TransitionKey.COMPLEMENTARY_DATA] == {}
+
+
+def test_transition_to_batch_observation_flattening():
+    """Test that _default_transition_to_batch correctly flattens observation dict."""
+    observation_dict = {
+        "observation.image.top": torch.randn(1, 3, 128, 128),
+        "observation.image.left": torch.randn(1, 3, 128, 128),
+        "observation.state": [1, 2, 3, 4],
+    }
+
+    transition = {
+        TransitionKey.OBSERVATION: observation_dict,
+        TransitionKey.ACTION: "action_data",
+        TransitionKey.REWARD: 1.5,
+        TransitionKey.DONE: True,
+        TransitionKey.TRUNCATED: False,
+        TransitionKey.INFO: {"episode": 42},
+        TransitionKey.COMPLEMENTARY_DATA: {},
+    }
+
+    batch = _default_transition_to_batch(transition)
+
+    # Check that observation.* keys are flattened back to batch
+    assert "observation.image.top" in batch
+    assert "observation.image.left" in batch
+    assert "observation.state" in batch
+
+    # Check values are preserved
+    assert torch.allclose(batch["observation.image.top"], observation_dict["observation.image.top"])
+    assert torch.allclose(batch["observation.image.left"], observation_dict["observation.image.left"])
+    assert batch["observation.state"] == [1, 2, 3, 4]
+
+    # Check other fields are mapped to next.* format
+    assert batch["action"] == "action_data"
+    assert batch["next.reward"] == 1.5
+    assert batch["next.done"]
+    assert not batch["next.truncated"]
+    assert batch["info"] == {"episode": 42}
+
+
+def test_no_observation_keys():
+    """Test behavior when there are no observation.* keys."""
+    batch = {
+        "action": "action_data",
+        "next.reward": 2.0,
+        "next.done": False,
+        "next.truncated": True,
+        "info": {"test": "no_obs"},
+    }
+
+    transition = _default_batch_to_transition(batch)
+
+    # Observation should be None when no observation.* keys
+    assert transition[TransitionKey.OBSERVATION] is None
+
+    # Check other fields
+    assert transition[TransitionKey.ACTION] == "action_data"
+    assert transition[TransitionKey.REWARD] == 2.0
+    assert not transition[TransitionKey.DONE]
+    assert transition[TransitionKey.TRUNCATED]
+    assert transition[TransitionKey.INFO] == {"test": "no_obs"}
+
+    # Round trip should work
+    reconstructed_batch = _default_transition_to_batch(transition)
+    assert reconstructed_batch["action"] == "action_data"
+    assert reconstructed_batch["next.reward"] == 2.0
+    assert not reconstructed_batch["next.done"]
+    assert reconstructed_batch["next.truncated"]
+    assert reconstructed_batch["info"] == {"test": "no_obs"}
+
+
+def test_minimal_batch():
+    """Test with minimal batch containing only observation.* and action."""
+    batch = {"observation.state": "minimal_state", "action": "minimal_action"}
+
+    transition = _default_batch_to_transition(batch)
+
+    # Check observation
+    assert transition[TransitionKey.OBSERVATION] == {"observation.state": "minimal_state"}
+    assert transition[TransitionKey.ACTION] == "minimal_action"
+
+    # Check defaults
+    assert transition[TransitionKey.REWARD] == 0.0
+    assert not transition[TransitionKey.DONE]
+    assert not transition[TransitionKey.TRUNCATED]
+    assert transition[TransitionKey.INFO] == {}
+    assert transition[TransitionKey.COMPLEMENTARY_DATA] == {}
+
+    # Round trip
+    reconstructed_batch = _default_transition_to_batch(transition)
+    assert reconstructed_batch["observation.state"] == "minimal_state"
+    assert reconstructed_batch["action"] == "minimal_action"
+    assert reconstructed_batch["next.reward"] == 0.0
+    assert not reconstructed_batch["next.done"]
+    assert not reconstructed_batch["next.truncated"]
+    assert reconstructed_batch["info"] == {}
+
+
+def test_empty_batch():
+    """Test behavior with empty batch."""
+    batch = {}
+
+    transition = _default_batch_to_transition(batch)
+
+    # All fields should have defaults
+    assert transition[TransitionKey.OBSERVATION] is None
+    assert transition[TransitionKey.ACTION] is None
+    assert transition[TransitionKey.REWARD] == 0.0
+    assert not transition[TransitionKey.DONE]
+    assert not transition[TransitionKey.TRUNCATED]
+    assert transition[TransitionKey.INFO] == {}
+    assert transition[TransitionKey.COMPLEMENTARY_DATA] == {}
+
+    # Round trip
+    reconstructed_batch = _default_transition_to_batch(transition)
+    assert reconstructed_batch["action"] is None
+    assert reconstructed_batch["next.reward"] == 0.0
+    assert not reconstructed_batch["next.done"]
+    assert not reconstructed_batch["next.truncated"]
+    assert reconstructed_batch["info"] == {}
+
+
+def test_complex_nested_observation():
+    """Test with complex nested observation data."""
+    batch = {
+        "observation.image.top": {"image": torch.randn(1, 3, 128, 128), "timestamp": 1234567890},
+        "observation.image.left": {"image": torch.randn(1, 3, 128, 128), "timestamp": 1234567891},
+        "observation.state": torch.randn(7),
+        "action": torch.randn(8),
+        "next.reward": 3.14,
+        "next.done": False,
+        "next.truncated": True,
+        "info": {"episode_length": 200, "success": True},
+    }
+
+    transition = _default_batch_to_transition(batch)
+    reconstructed_batch = _default_transition_to_batch(transition)
+
+    # Check that all observation keys are preserved
+    original_obs_keys = {k for k in batch if k.startswith("observation.")}
+    reconstructed_obs_keys = {k for k in reconstructed_batch if k.startswith("observation.")}
+
+    assert original_obs_keys == reconstructed_obs_keys
+
+    # Check tensor values
+    assert torch.allclose(batch["observation.state"], reconstructed_batch["observation.state"])
+
+    # Check nested dict with tensors
+    assert torch.allclose(
+        batch["observation.image.top"]["image"], reconstructed_batch["observation.image.top"]["image"]
+    )
+    assert torch.allclose(
+        batch["observation.image.left"]["image"], reconstructed_batch["observation.image.left"]["image"]
+    )
+
+    # Check action tensor
+    assert torch.allclose(batch["action"], reconstructed_batch["action"])
+
+    # Check other fields
+    assert batch["next.reward"] == reconstructed_batch["next.reward"]
+    assert batch["next.done"] == reconstructed_batch["next.done"]
+    assert batch["next.truncated"] == reconstructed_batch["next.truncated"]
+    assert batch["info"] == reconstructed_batch["info"]
+
+
+def test_custom_converter():
+    """Test that custom converters can still be used."""
+
+    def to_tr(batch):
+        # Custom converter that modifies the reward
+        tr = _default_batch_to_transition(batch)
+        # Double the reward
+        reward = tr.get(TransitionKey.REWARD, 0.0)
+        new_tr = tr.copy()
+        new_tr[TransitionKey.REWARD] = reward * 2 if reward is not None else 0.0
+        return new_tr
+
+    def to_batch(tr):
+        batch = _default_transition_to_batch(tr)
+        return batch
+
+    processor = RobotProcessor(steps=[], to_transition=to_tr, to_output=to_batch)
+
+    batch = {
+        "observation.state": torch.randn(1, 4),
+        "action": torch.randn(1, 2),
+        "next.reward": 1.0,
+        "next.done": False,
+    }
+
+    result = processor(batch)
+
+    # Check the reward was doubled by our custom converter
+    assert result["next.reward"] == 2.0
+    assert torch.allclose(result["observation.state"], batch["observation.state"])
+    assert torch.allclose(result["action"], batch["action"])
@@ -0,0 +1,628 @@
+#!/usr/bin/env python
+
+# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+from unittest.mock import Mock
+
+import numpy as np
+import pytest
+import torch
+
+from lerobot.configs.types import FeatureType, NormalizationMode, PolicyFeature
+from lerobot.processor.normalize_processor import (
+    NormalizerProcessor,
+    UnnormalizerProcessor,
+    _convert_stats_to_tensors,
+)
+from lerobot.processor.pipeline import RobotProcessor, TransitionKey
+
+
+def create_transition(
+    observation=None, action=None, reward=None, done=None, truncated=None, info=None, complementary_data=None
+):
+    """Helper to create an EnvTransition dictionary."""
+    return {
+        TransitionKey.OBSERVATION: observation,
+        TransitionKey.ACTION: action,
+        TransitionKey.REWARD: reward,
+        TransitionKey.DONE: done,
+        TransitionKey.TRUNCATED: truncated,
+        TransitionKey.INFO: info,
+        TransitionKey.COMPLEMENTARY_DATA: complementary_data,
+    }
+
+
+def test_numpy_conversion():
+    stats = {
+        "observation.image": {
+            "mean": np.array([0.5, 0.5, 0.5]),
+            "std": np.array([0.2, 0.2, 0.2]),
+        }
+    }
+    tensor_stats = _convert_stats_to_tensors(stats)
+
+    assert isinstance(tensor_stats["observation.image"]["mean"], torch.Tensor)
+    assert isinstance(tensor_stats["observation.image"]["std"], torch.Tensor)
+    assert torch.allclose(tensor_stats["observation.image"]["mean"], torch.tensor([0.5, 0.5, 0.5]))
+    assert torch.allclose(tensor_stats["observation.image"]["std"], torch.tensor([0.2, 0.2, 0.2]))
+
+
+def test_tensor_conversion():
+    stats = {
+        "action": {
+            "mean": torch.tensor([0.0, 0.0]),
+            "std": torch.tensor([1.0, 1.0]),
+        }
+    }
+    tensor_stats = _convert_stats_to_tensors(stats)
+
+    assert tensor_stats["action"]["mean"].dtype == torch.float32
+    assert tensor_stats["action"]["std"].dtype == torch.float32
+
+
+def test_scalar_conversion():
+    stats = {
+        "reward": {
+            "mean": 0.5,
+            "std": 0.1,
+        }
+    }
+    tensor_stats = _convert_stats_to_tensors(stats)
+
+    assert torch.allclose(tensor_stats["reward"]["mean"], torch.tensor(0.5))
+    assert torch.allclose(tensor_stats["reward"]["std"], torch.tensor(0.1))
+
+
+def test_list_conversion():
+    stats = {
+        "observation.state": {
+            "min": [0.0, -1.0, -2.0],
+            "max": [1.0, 1.0, 2.0],
+        }
+    }
+    tensor_stats = _convert_stats_to_tensors(stats)
+
+    assert torch.allclose(tensor_stats["observation.state"]["min"], torch.tensor([0.0, -1.0, -2.0]))
+    assert torch.allclose(tensor_stats["observation.state"]["max"], torch.tensor([1.0, 1.0, 2.0]))
+
+
+def test_unsupported_type():
+    stats = {
+        "bad_key": {
+            "mean": "string_value",
+        }
+    }
+    with pytest.raises(TypeError, match="Unsupported type"):
+        _convert_stats_to_tensors(stats)
+
+
+# Helper functions to create feature maps and norm maps
+def _create_observation_features():
+    return {
+        "observation.image": PolicyFeature(FeatureType.VISUAL, (3, 96, 96)),
+        "observation.state": PolicyFeature(FeatureType.STATE, (2,)),
+    }
+
+
+def _create_observation_norm_map():
+    return {
+        FeatureType.VISUAL: NormalizationMode.MEAN_STD,
+        FeatureType.STATE: NormalizationMode.MIN_MAX,
+    }
+
+
+# Fixtures for observation normalisation tests using NormalizerProcessor
+@pytest.fixture
+def observation_stats():
+    return {
+        "observation.image": {
+            "mean": np.array([0.5, 0.5, 0.5]),
+            "std": np.array([0.2, 0.2, 0.2]),
+        },
+        "observation.state": {
+            "min": np.array([0.0, -1.0]),
+            "max": np.array([1.0, 1.0]),
+        },
+    }
+
+
+@pytest.fixture
+def observation_normalizer(observation_stats):
+    """Return a NormalizerProcessor that only has observation stats (no action)."""
+    features = _create_observation_features()
+    norm_map = _create_observation_norm_map()
+    return NormalizerProcessor(features=features, norm_map=norm_map, stats=observation_stats)
+
+
+def test_mean_std_normalization(observation_normalizer):
+    observation = {
+        "observation.image": torch.tensor([0.7, 0.5, 0.3]),
+        "observation.state": torch.tensor([0.5, 0.0]),
+    }
+    transition = create_transition(observation=observation)
+
+    normalized_transition = observation_normalizer(transition)
+    normalized_obs = normalized_transition[TransitionKey.OBSERVATION]
+
+    # Check mean/std normalization
+    expected_image = (torch.tensor([0.7, 0.5, 0.3]) - 0.5) / 0.2
+    assert torch.allclose(normalized_obs["observation.image"], expected_image)
+
+
+def test_min_max_normalization(observation_normalizer):
+    observation = {
+        "observation.state": torch.tensor([0.5, 0.0]),
+    }
+    transition = create_transition(observation=observation)
+
+    normalized_transition = observation_normalizer(transition)
+    normalized_obs = normalized_transition[TransitionKey.OBSERVATION]
+
+    # Check min/max normalization to [-1, 1]
+    # For state[0]: 2 * (0.5 - 0.0) / (1.0 - 0.0) - 1 = 0.0
+    # For state[1]: 2 * (0.0 - (-1.0)) / (1.0 - (-1.0)) - 1 = 0.0
+    expected_state = torch.tensor([0.0, 0.0])
+    assert torch.allclose(normalized_obs["observation.state"], expected_state, atol=1e-6)
+
+
+def test_selective_normalization(observation_stats):
+    features = _create_observation_features()
+    norm_map = _create_observation_norm_map()
+    normalizer = NormalizerProcessor(
+        features=features, norm_map=norm_map, stats=observation_stats, normalize_keys={"observation.image"}
+    )
+
+    observation = {
+        "observation.image": torch.tensor([0.7, 0.5, 0.3]),
+        "observation.state": torch.tensor([0.5, 0.0]),
+    }
+    transition = create_transition(observation=observation)
+
+    normalized_transition = normalizer(transition)
+    normalized_obs = normalized_transition[TransitionKey.OBSERVATION]
+
+    # Only image should be normalized
+    assert torch.allclose(normalized_obs["observation.image"], (torch.tensor([0.7, 0.5, 0.3]) - 0.5) / 0.2)
+    # State should remain unchanged
+    assert torch.allclose(normalized_obs["observation.state"], observation["observation.state"])
+
+
+@pytest.mark.skipif(not torch.cuda.is_available(), reason="CUDA not available")
+def test_device_compatibility(observation_stats):
+    features = _create_observation_features()
+    norm_map = _create_observation_norm_map()
+    normalizer = NormalizerProcessor(features=features, norm_map=norm_map, stats=observation_stats)
+    observation = {
+        "observation.image": torch.tensor([0.7, 0.5, 0.3]).cuda(),
+    }
+    transition = create_transition(observation=observation)
+
+    normalized_transition = normalizer(transition)
+    normalized_obs = normalized_transition[TransitionKey.OBSERVATION]
+
+    assert normalized_obs["observation.image"].device.type == "cuda"
+
+
+def test_from_lerobot_dataset():
+    # Mock dataset
+    mock_dataset = Mock()
+    mock_dataset.meta.stats = {
+        "observation.image": {"mean": [0.5], "std": [0.2]},
+        "action": {"mean": [0.0], "std": [1.0]},
+    }
+
+    features = {
+        "observation.image": PolicyFeature(FeatureType.VISUAL, (3, 96, 96)),
+        "action": PolicyFeature(FeatureType.ACTION, (1,)),
+    }
+    norm_map = {
+        FeatureType.VISUAL: NormalizationMode.MEAN_STD,
+        FeatureType.ACTION: NormalizationMode.MEAN_STD,
+    }
+
+    normalizer = NormalizerProcessor.from_lerobot_dataset(mock_dataset, features, norm_map)
+
+    # Both observation and action statistics should be present in tensor stats
+    assert "observation.image" in normalizer._tensor_stats
+    assert "action" in normalizer._tensor_stats
+
+
+def test_state_dict_save_load(observation_normalizer):
+    # Save state
+    state_dict = observation_normalizer.state_dict()
+
+    # Create new normalizer and load state
+    features = _create_observation_features()
+    norm_map = _create_observation_norm_map()
+    new_normalizer = NormalizerProcessor(features=features, norm_map=norm_map, stats={})
+    new_normalizer.load_state_dict(state_dict)
+
+    # Test that it works the same
+    observation = {"observation.image": torch.tensor([0.7, 0.5, 0.3])}
+    transition = create_transition(observation=observation)
+
+    result1 = observation_normalizer(transition)[TransitionKey.OBSERVATION]
+    result2 = new_normalizer(transition)[TransitionKey.OBSERVATION]
+
+    assert torch.allclose(result1["observation.image"], result2["observation.image"])
+
+
+# Fixtures for ActionUnnormalizer tests
+@pytest.fixture
+def action_stats_mean_std():
+    return {
+        "mean": np.array([0.0, 0.0, 0.0]),
+        "std": np.array([1.0, 2.0, 0.5]),
+    }
+
+
+@pytest.fixture
+def action_stats_min_max():
+    return {
+        "min": np.array([-1.0, -2.0, 0.0]),
+        "max": np.array([1.0, 2.0, 1.0]),
+    }
+
+
+def _create_action_features():
+    return {
+        "action": PolicyFeature(FeatureType.ACTION, (3,)),
+    }
+
+
+def _create_action_norm_map_mean_std():
+    return {
+        FeatureType.ACTION: NormalizationMode.MEAN_STD,
+    }
+
+
+def _create_action_norm_map_min_max():
+    return {
+        FeatureType.ACTION: NormalizationMode.MIN_MAX,
+    }
+
+
+def test_mean_std_unnormalization(action_stats_mean_std):
+    features = _create_action_features()
+    norm_map = _create_action_norm_map_mean_std()
+    unnormalizer = UnnormalizerProcessor(
+        features=features, norm_map=norm_map, stats={"action": action_stats_mean_std}
+    )
+
+    normalized_action = torch.tensor([1.0, -0.5, 2.0])
+    transition = create_transition(action=normalized_action)
+
+    unnormalized_transition = unnormalizer(transition)
+    unnormalized_action = unnormalized_transition[TransitionKey.ACTION]
+
+    # action * std + mean
+    expected = torch.tensor([1.0 * 1.0 + 0.0, -0.5 * 2.0 + 0.0, 2.0 * 0.5 + 0.0])
+    assert torch.allclose(unnormalized_action, expected)
+
+
+def test_min_max_unnormalization(action_stats_min_max):
+    features = _create_action_features()
+    norm_map = _create_action_norm_map_min_max()
+    unnormalizer = UnnormalizerProcessor(
+        features=features, norm_map=norm_map, stats={"action": action_stats_min_max}
+    )
+
+    # Actions in [-1, 1]
+    normalized_action = torch.tensor([0.0, -1.0, 1.0])
+    transition = create_transition(action=normalized_action)
+
+    unnormalized_transition = unnormalizer(transition)
+    unnormalized_action = unnormalized_transition[TransitionKey.ACTION]
+
+    # Map from [-1, 1] to [min, max]
+    # (action + 1) / 2 * (max - min) + min
+    expected = torch.tensor(
+        [
+            (0.0 + 1) / 2 * (1.0 - (-1.0)) + (-1.0),  # 0.0
+            (-1.0 + 1) / 2 * (2.0 - (-2.0)) + (-2.0),  # -2.0
+            (1.0 + 1) / 2 * (1.0 - 0.0) + 0.0,  # 1.0
+        ]
+    )
+    assert torch.allclose(unnormalized_action, expected)
+
+
+def test_numpy_action_input(action_stats_mean_std):
+    features = _create_action_features()
+    norm_map = _create_action_norm_map_mean_std()
+    unnormalizer = UnnormalizerProcessor(
+        features=features, norm_map=norm_map, stats={"action": action_stats_mean_std}
+    )
+
+    normalized_action = np.array([1.0, -0.5, 2.0], dtype=np.float32)
+    transition = create_transition(action=normalized_action)
+
+    unnormalized_transition = unnormalizer(transition)
+    unnormalized_action = unnormalized_transition[TransitionKey.ACTION]
+
+    assert isinstance(unnormalized_action, torch.Tensor)
+    expected = torch.tensor([1.0, -1.0, 1.0])
+    assert torch.allclose(unnormalized_action, expected)
+
+
+def test_none_action(action_stats_mean_std):
+    features = _create_action_features()
+    norm_map = _create_action_norm_map_mean_std()
+    unnormalizer = UnnormalizerProcessor(
+        features=features, norm_map=norm_map, stats={"action": action_stats_mean_std}
+    )
+
+    transition = create_transition()
+    result = unnormalizer(transition)
+
+    # Should return transition unchanged
+    assert result == transition
+
+
+def test_action_from_lerobot_dataset():
+    mock_dataset = Mock()
+    mock_dataset.meta.stats = {"action": {"mean": [0.0], "std": [1.0]}}
+    features = {"action": PolicyFeature(FeatureType.ACTION, (1,))}
+    norm_map = {FeatureType.ACTION: NormalizationMode.MEAN_STD}
+    unnormalizer = UnnormalizerProcessor.from_lerobot_dataset(mock_dataset, features, norm_map)
+    assert "mean" in unnormalizer._tensor_stats["action"]
+
+
+# Fixtures for NormalizerProcessor tests
+@pytest.fixture
+def full_stats():
+    return {
+        "observation.image": {
+            "mean": np.array([0.5, 0.5, 0.5]),
+            "std": np.array([0.2, 0.2, 0.2]),
+        },
+        "observation.state": {
+            "min": np.array([0.0, -1.0]),
+            "max": np.array([1.0, 1.0]),
+        },
+        "action": {
+            "mean": np.array([0.0, 0.0]),
+            "std": np.array([1.0, 2.0]),
+        },
+    }
+
+
+def _create_full_features():
+    return {
+        "observation.image": PolicyFeature(FeatureType.VISUAL, (3, 96, 96)),
+        "observation.state": PolicyFeature(FeatureType.STATE, (2,)),
+        "action": PolicyFeature(FeatureType.ACTION, (2,)),
+    }
+
+
+def _create_full_norm_map():
+    return {
+        FeatureType.VISUAL: NormalizationMode.MEAN_STD,
+        FeatureType.STATE: NormalizationMode.MIN_MAX,
+        FeatureType.ACTION: NormalizationMode.MEAN_STD,
+    }
+
+
+@pytest.fixture
+def normalizer_processor(full_stats):
+    features = _create_full_features()
+    norm_map = _create_full_norm_map()
+    return NormalizerProcessor(features=features, norm_map=norm_map, stats=full_stats)
+
+
+def test_combined_normalization(normalizer_processor):
+    observation = {
+        "observation.image": torch.tensor([0.7, 0.5, 0.3]),
+        "observation.state": torch.tensor([0.5, 0.0]),
+    }
+    action = torch.tensor([1.0, -0.5])
+    transition = create_transition(
+        observation=observation,
+        action=action,
+        reward=1.0,
+        done=False,
+        truncated=False,
+        info={},
+        complementary_data={},
+    )
+
+    processed_transition = normalizer_processor(transition)
+
+    # Check normalized observations
+    processed_obs = processed_transition[TransitionKey.OBSERVATION]
+    expected_image = (torch.tensor([0.7, 0.5, 0.3]) - 0.5) / 0.2
+    assert torch.allclose(processed_obs["observation.image"], expected_image)
+
+    # Check normalized action
+    processed_action = processed_transition[TransitionKey.ACTION]
+    expected_action = torch.tensor([(1.0 - 0.0) / 1.0, (-0.5 - 0.0) / 2.0])
+    assert torch.allclose(processed_action, expected_action)
+
+    # Check other fields remain unchanged
+    assert processed_transition[TransitionKey.REWARD] == 1.0
+    assert not processed_transition[TransitionKey.DONE]
+
+
+def test_processor_from_lerobot_dataset(full_stats):
+    # Mock dataset
+    mock_dataset = Mock()
+    mock_dataset.meta.stats = full_stats
+
+    features = _create_full_features()
+    norm_map = _create_full_norm_map()
+
+    processor = NormalizerProcessor.from_lerobot_dataset(
+        mock_dataset, features, norm_map, normalize_keys={"observation.image"}
+    )
+
+    assert processor.normalize_keys == {"observation.image"}
+    assert "observation.image" in processor._tensor_stats
+    assert "action" in processor._tensor_stats
+
+
+def test_get_config(full_stats):
+    features = _create_full_features()
+    norm_map = _create_full_norm_map()
+    processor = NormalizerProcessor(
+        features=features, norm_map=norm_map, stats=full_stats, normalize_keys={"observation.image"}, eps=1e-6
+    )
+
+    config = processor.get_config()
+    expected_config = {
+        "normalize_keys": ["observation.image"],
+        "eps": 1e-6,
+        "features": {
+            "observation.image": {"type": "VISUAL", "shape": (3, 96, 96)},
+            "observation.state": {"type": "STATE", "shape": (2,)},
+            "action": {"type": "ACTION", "shape": (2,)},
+        },
+        "norm_map": {
+            "VISUAL": "MEAN_STD",
+            "STATE": "MIN_MAX",
+            "ACTION": "MEAN_STD",
+        },
+    }
+    assert config == expected_config
+
+
+def test_integration_with_robot_processor(normalizer_processor):
+    """Test integration with RobotProcessor pipeline"""
+    robot_processor = RobotProcessor([normalizer_processor])
+
+    observation = {
+        "observation.image": torch.tensor([0.7, 0.5, 0.3]),
+        "observation.state": torch.tensor([0.5, 0.0]),
+    }
+    action = torch.tensor([1.0, -0.5])
+    transition = create_transition(
+        observation=observation,
+        action=action,
+        reward=1.0,
+        done=False,
+        truncated=False,
+        info={},
+        complementary_data={},
+    )
+
+    processed_transition = robot_processor(transition)
+
+    # Verify the processing worked
+    assert isinstance(processed_transition[TransitionKey.OBSERVATION], dict)
+    assert isinstance(processed_transition[TransitionKey.ACTION], torch.Tensor)
+
+
+# Edge case tests
+def test_empty_observation():
+    stats = {"observation.image": {"mean": [0.5], "std": [0.2]}}
+    features = {"observation.image": PolicyFeature(FeatureType.VISUAL, (3, 96, 96))}
+    norm_map = {FeatureType.VISUAL: NormalizationMode.MEAN_STD}
+    normalizer = NormalizerProcessor(features=features, norm_map=norm_map, stats=stats)
+
+    transition = create_transition()
+    result = normalizer(transition)
+
+    assert result == transition
+
+
+def test_empty_stats():
+    features = {"observation.image": PolicyFeature(FeatureType.VISUAL, (3, 96, 96))}
+    norm_map = {FeatureType.VISUAL: NormalizationMode.MEAN_STD}
+    normalizer = NormalizerProcessor(features=features, norm_map=norm_map, stats={})
+    observation = {"observation.image": torch.tensor([0.5])}
+    transition = create_transition(observation=observation)
+
+    result = normalizer(transition)
+    # Should return observation unchanged since no stats are available
+    assert torch.allclose(
+        result[TransitionKey.OBSERVATION]["observation.image"], observation["observation.image"]
+    )
+
+
+def test_partial_stats():
+    """If statistics are incomplete, the value should pass through unchanged."""
+    stats = {"observation.image": {"mean": [0.5]}}  # Missing std / (min,max)
+    features = {"observation.image": PolicyFeature(FeatureType.VISUAL, (3, 96, 96))}
+    norm_map = {FeatureType.VISUAL: NormalizationMode.MEAN_STD}
+    normalizer = NormalizerProcessor(features=features, norm_map=norm_map, stats=stats)
+    observation = {"observation.image": torch.tensor([0.7])}
+    transition = create_transition(observation=observation)
+
+    processed = normalizer(transition)[TransitionKey.OBSERVATION]
+    assert torch.allclose(processed["observation.image"], observation["observation.image"])
+
+
+def test_missing_action_stats_no_error():
+    mock_dataset = Mock()
+    mock_dataset.meta.stats = {"observation.image": {"mean": [0.5], "std": [0.2]}}
+
+    features = {"observation.image": PolicyFeature(FeatureType.VISUAL, (3, 96, 96))}
+    norm_map = {FeatureType.VISUAL: NormalizationMode.MEAN_STD}
+
+    processor = UnnormalizerProcessor.from_lerobot_dataset(mock_dataset, features, norm_map)
+    # The tensor stats should not contain the 'action' key
+    assert "action" not in processor._tensor_stats
+
+
+def test_serialization_roundtrip(full_stats):
+    """Test that features and norm_map can be serialized and deserialized correctly."""
+    features = _create_full_features()
+    norm_map = _create_full_norm_map()
+    original_processor = NormalizerProcessor(
+        features=features, norm_map=norm_map, stats=full_stats, normalize_keys={"observation.image"}, eps=1e-6
+    )
+
+    # Get config (serialization)
+    config = original_processor.get_config()
+
+    # Create a new processor from the config (deserialization)
+    new_processor = NormalizerProcessor(
+        features=config["features"],
+        norm_map=config["norm_map"],
+        stats=full_stats,
+        normalize_keys=set(config["normalize_keys"]),
+        eps=config["eps"],
+    )
+
+    # Test that both processors work the same way
+    observation = {
+        "observation.image": torch.tensor([0.7, 0.5, 0.3]),
+        "observation.state": torch.tensor([0.5, 0.0]),
+    }
+    action = torch.tensor([1.0, -0.5])
+    transition = create_transition(
+        observation=observation,
+        action=action,
+        reward=1.0,
+        done=False,
+        truncated=False,
+        info={},
+        complementary_data={},
+    )
+
+    result1 = original_processor(transition)
+    result2 = new_processor(transition)
+
+    # Compare results
+    assert torch.allclose(
+        result1[TransitionKey.OBSERVATION]["observation.image"],
+        result2[TransitionKey.OBSERVATION]["observation.image"],
+    )
+    assert torch.allclose(result1[TransitionKey.ACTION], result2[TransitionKey.ACTION])
+
+    # Verify features and norm_map are correctly reconstructed
+    assert new_processor.features.keys() == original_processor.features.keys()
+    for key in new_processor.features:
+        assert new_processor.features[key].type == original_processor.features[key].type
+        assert new_processor.features[key].shape == original_processor.features[key].shape
+
+    assert new_processor.norm_map == original_processor.norm_map
@@ -0,0 +1,486 @@
+#!/usr/bin/env python
+
+# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import numpy as np
+import pytest
+import torch
+
+from lerobot.configs.types import FeatureType
+from lerobot.constants import OBS_ENV_STATE, OBS_IMAGE, OBS_IMAGES, OBS_STATE
+from lerobot.processor import VanillaObservationProcessor
+from lerobot.processor.pipeline import TransitionKey
+from tests.conftest import assert_contract_is_typed
+
+
+def create_transition(
+    observation=None, action=None, reward=None, done=None, truncated=None, info=None, complementary_data=None
+):
+    """Helper to create an EnvTransition dictionary."""
+    return {
+        TransitionKey.OBSERVATION: observation,
+        TransitionKey.ACTION: action,
+        TransitionKey.REWARD: reward,
+        TransitionKey.DONE: done,
+        TransitionKey.TRUNCATED: truncated,
+        TransitionKey.INFO: info,
+        TransitionKey.COMPLEMENTARY_DATA: complementary_data,
+    }
+
+
+def test_process_single_image():
+    """Test processing a single image."""
+    processor = VanillaObservationProcessor()
+
+    # Create a mock image (H, W, C) format, uint8
+    image = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
+
+    observation = {"pixels": image}
+    transition = create_transition(observation=observation)
+
+    result = processor(transition)
+    processed_obs = result[TransitionKey.OBSERVATION]
+
+    # Check that the image was processed correctly
+    assert "observation.image" in processed_obs
+    processed_img = processed_obs["observation.image"]
+
+    # Check shape: should be (1, 3, 64, 64) - batch, channels, height, width
+    assert processed_img.shape == (1, 3, 64, 64)
+
+    # Check dtype and range
+    assert processed_img.dtype == torch.float32
+    assert processed_img.min() >= 0.0
+    assert processed_img.max() <= 1.0
+
+
+def test_process_image_dict():
+    """Test processing multiple images in a dictionary."""
+    processor = VanillaObservationProcessor()
+
+    # Create mock images
+    image1 = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)
+    image2 = np.random.randint(0, 256, size=(48, 48, 3), dtype=np.uint8)
+
+    observation = {"pixels": {"camera1": image1, "camera2": image2}}
+    transition = create_transition(observation=observation)
+
+    result = processor(transition)
+    processed_obs = result[TransitionKey.OBSERVATION]
+
+    # Check that both images were processed
+    assert "observation.images.camera1" in processed_obs
+    assert "observation.images.camera2" in processed_obs
+
+    # Check shapes
+    assert processed_obs["observation.images.camera1"].shape == (1, 3, 32, 32)
+    assert processed_obs["observation.images.camera2"].shape == (1, 3, 48, 48)
+
+
+def test_process_batched_image():
+    """Test processing already batched images."""
+    processor = VanillaObservationProcessor()
+
+    # Create a batched image (B, H, W, C)
+    image = np.random.randint(0, 256, size=(2, 64, 64, 3), dtype=np.uint8)
+
+    observation = {"pixels": image}
+    transition = create_transition(observation=observation)
+
+    result = processor(transition)
+    processed_obs = result[TransitionKey.OBSERVATION]
+
+    # Check that batch dimension is preserved
+    assert processed_obs["observation.image"].shape == (2, 3, 64, 64)
+
+
+def test_invalid_image_format():
+    """Test error handling for invalid image formats."""
+    processor = VanillaObservationProcessor()
+
+    # Test wrong channel order (channels first)
+    image = np.random.randint(0, 256, size=(3, 64, 64), dtype=np.uint8)
+    observation = {"pixels": image}
+    transition = create_transition(observation=observation)
+
+    with pytest.raises(ValueError, match="Expected channel-last images"):
+        processor(transition)
+
+
+def test_invalid_image_dtype():
+    """Test error handling for invalid image dtype."""
+    processor = VanillaObservationProcessor()
+
+    # Test wrong dtype
+    image = np.random.rand(64, 64, 3).astype(np.float32)
+    observation = {"pixels": image}
+    transition = create_transition(observation=observation)
+
+    with pytest.raises(ValueError, match="Expected torch.uint8 images"):
+        processor(transition)
+
+
+def test_no_pixels_in_observation():
+    """Test processor when no pixels are in observation."""
+    processor = VanillaObservationProcessor()
+
+    observation = {"other_data": np.array([1, 2, 3])}
+    transition = create_transition(observation=observation)
+
+    result = processor(transition)
+    processed_obs = result[TransitionKey.OBSERVATION]
+
+    # Should preserve other data unchanged
+    assert "other_data" in processed_obs
+    np.testing.assert_array_equal(processed_obs["other_data"], np.array([1, 2, 3]))
+
+
+def test_none_observation():
+    """Test processor with None observation."""
+    processor = VanillaObservationProcessor()
+
+    transition = create_transition()
+    result = processor(transition)
+
+    assert result == transition
+
+
+def test_serialization_methods():
+    """Test serialization methods."""
+    processor = VanillaObservationProcessor()
+
+    # Test get_config
+    config = processor.get_config()
+    assert isinstance(config, dict)
+
+    # Test state_dict
+    state = processor.state_dict()
+    assert isinstance(state, dict)
+
+    # Test load_state_dict (should not raise)
+    processor.load_state_dict(state)
+
+    # Test reset (should not raise)
+    processor.reset()
+
+
+def test_process_environment_state():
+    """Test processing environment_state."""
+    processor = VanillaObservationProcessor()
+
+    env_state = np.array([1.0, 2.0, 3.0], dtype=np.float32)
+    observation = {"environment_state": env_state}
+    transition = create_transition(observation=observation)
+
+    result = processor(transition)
+    processed_obs = result[TransitionKey.OBSERVATION]
+
+    # Check that environment_state was renamed and processed
+    assert "observation.environment_state" in processed_obs
+    assert "environment_state" not in processed_obs
+
+    processed_state = processed_obs["observation.environment_state"]
+    assert processed_state.shape == (1, 3)  # Batch dimension added
+    assert processed_state.dtype == torch.float32
+    torch.testing.assert_close(processed_state, torch.tensor([[1.0, 2.0, 3.0]]))
+
+
+def test_process_agent_pos():
+    """Test processing agent_pos."""
+    processor = VanillaObservationProcessor()
+
+    agent_pos = np.array([0.5, -0.5, 1.0], dtype=np.float32)
+    observation = {"agent_pos": agent_pos}
+    transition = create_transition(observation=observation)
+
+    result = processor(transition)
+    processed_obs = result[TransitionKey.OBSERVATION]
+
+    # Check that agent_pos was renamed and processed
+    assert "observation.state" in processed_obs
+    assert "agent_pos" not in processed_obs
+
+    processed_state = processed_obs["observation.state"]
+    assert processed_state.shape == (1, 3)  # Batch dimension added
+    assert processed_state.dtype == torch.float32
+    torch.testing.assert_close(processed_state, torch.tensor([[0.5, -0.5, 1.0]]))
+
+
+def test_process_batched_states():
+    """Test processing already batched states."""
+    processor = VanillaObservationProcessor()
+
+    env_state = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32)
+    agent_pos = np.array([[0.5, -0.5], [1.0, -1.0]], dtype=np.float32)
+
+    observation = {"environment_state": env_state, "agent_pos": agent_pos}
+    transition = create_transition(observation=observation)
+
+    result = processor(transition)
+    processed_obs = result[TransitionKey.OBSERVATION]
+
+    # Check that batch dimensions are preserved
+    assert processed_obs["observation.environment_state"].shape == (2, 2)
+    assert processed_obs["observation.state"].shape == (2, 2)
+
+
+def test_process_both_states():
+    """Test processing both environment_state and agent_pos."""
+    processor = VanillaObservationProcessor()
+
+    env_state = np.array([1.0, 2.0], dtype=np.float32)
+    agent_pos = np.array([0.5, -0.5], dtype=np.float32)
+
+    observation = {"environment_state": env_state, "agent_pos": agent_pos, "other_data": "keep_me"}
+    transition = create_transition(observation=observation)
+
+    result = processor(transition)
+    processed_obs = result[TransitionKey.OBSERVATION]
+
+    # Check that both states were processed
+    assert "observation.environment_state" in processed_obs
+    assert "observation.state" in processed_obs
+
+    # Check that original keys were removed
+    assert "environment_state" not in processed_obs
+    assert "agent_pos" not in processed_obs
+
+    # Check that other data was preserved
+    assert processed_obs["other_data"] == "keep_me"
+
+
+def test_no_states_in_observation():
+    """Test processor when no states are in observation."""
+    processor = VanillaObservationProcessor()
+
+    observation = {"other_data": np.array([1, 2, 3])}
+    transition = create_transition(observation=observation)
+
+    result = processor(transition)
+    processed_obs = result[TransitionKey.OBSERVATION]
+
+    # Should preserve data unchanged
+    np.testing.assert_array_equal(processed_obs, observation)
+
+
+def test_complete_observation_processing():
+    """Test processing a complete observation with both images and states."""
+    processor = VanillaObservationProcessor()
+
+    # Create mock data
+    image = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)
+    env_state = np.array([1.0, 2.0, 3.0], dtype=np.float32)
+    agent_pos = np.array([0.5, -0.5, 1.0], dtype=np.float32)
+
+    observation = {
+        "pixels": image,
+        "environment_state": env_state,
+        "agent_pos": agent_pos,
+        "other_data": "preserve_me",
+    }
+    transition = create_transition(observation=observation)
+
+    result = processor(transition)
+    processed_obs = result[TransitionKey.OBSERVATION]
+
+    # Check that image was processed
+    assert "observation.image" in processed_obs
+    assert processed_obs["observation.image"].shape == (1, 3, 32, 32)
+
+    # Check that states were processed
+    assert "observation.environment_state" in processed_obs
+    assert "observation.state" in processed_obs
+
+    # Check that original keys were removed
+    assert "pixels" not in processed_obs
+    assert "environment_state" not in processed_obs
+    assert "agent_pos" not in processed_obs
+
+    # Check that other data was preserved
+    assert processed_obs["other_data"] == "preserve_me"
+
+
+def test_image_only_processing():
+    """Test processing observation with only images."""
+    processor = VanillaObservationProcessor()
+
+    image = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
+    observation = {"pixels": image}
+    transition = create_transition(observation=observation)
+
+    result = processor(transition)
+    processed_obs = result[TransitionKey.OBSERVATION]
+
+    assert "observation.image" in processed_obs
+    assert len(processed_obs) == 1
+
+
+def test_state_only_processing():
+    """Test processing observation with only states."""
+    processor = VanillaObservationProcessor()
+
+    agent_pos = np.array([1.0, 2.0], dtype=np.float32)
+    observation = {"agent_pos": agent_pos}
+    transition = create_transition(observation=observation)
+
+    result = processor(transition)
+    processed_obs = result[TransitionKey.OBSERVATION]
+
+    assert "observation.state" in processed_obs
+    assert "agent_pos" not in processed_obs
+
+
+def test_empty_observation():
+    """Test processing empty observation."""
+    processor = VanillaObservationProcessor()
+
+    observation = {}
+    transition = create_transition(observation=observation)
+
+    result = processor(transition)
+    processed_obs = result[TransitionKey.OBSERVATION]
+
+    assert processed_obs == {}
+
+
+def test_equivalent_to_original_function():
+    """Test that ObservationProcessor produces equivalent results to preprocess_observation."""
+    # Import the original function for comparison
+    from lerobot.envs.utils import preprocess_observation
+
+    processor = VanillaObservationProcessor()
+
+    # Create test data similar to what the original function expects
+    image = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
+    env_state = np.array([1.0, 2.0, 3.0], dtype=np.float32)
+    agent_pos = np.array([0.5, -0.5, 1.0], dtype=np.float32)
+
+    observation = {"pixels": image, "environment_state": env_state, "agent_pos": agent_pos}
+
+    # Process with original function
+    original_result = preprocess_observation(observation)
+
+    # Process with new processor
+    transition = create_transition(observation=observation)
+    processor_result = processor(transition)[TransitionKey.OBSERVATION]
+
+    # Compare results
+    assert set(original_result.keys()) == set(processor_result.keys())
+
+    for key in original_result:
+        torch.testing.assert_close(original_result[key], processor_result[key])
+
+
+def test_equivalent_with_image_dict():
+    """Test equivalence with dictionary of images."""
+    from lerobot.envs.utils import preprocess_observation
+
+    processor = VanillaObservationProcessor()
+
+    # Create test data with multiple cameras
+    image1 = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)
+    image2 = np.random.randint(0, 256, size=(48, 48, 3), dtype=np.uint8)
+    agent_pos = np.array([1.0, 2.0], dtype=np.float32)
+
+    observation = {"pixels": {"cam1": image1, "cam2": image2}, "agent_pos": agent_pos}
+
+    # Process with original function
+    original_result = preprocess_observation(observation)
+
+    # Process with new processor
+    transition = create_transition(observation=observation)
+    processor_result = processor(transition)[TransitionKey.OBSERVATION]
+
+    # Compare results
+    assert set(original_result.keys()) == set(processor_result.keys())
+
+    for key in original_result:
+        torch.testing.assert_close(original_result[key], processor_result[key])
+
+
+def test_image_processor_feature_contract_pixels_to_image(policy_feature_factory):
+    processor = VanillaObservationProcessor()
+    features = {
+        "pixels": policy_feature_factory(FeatureType.VISUAL, (3, 64, 64)),
+        "keep": policy_feature_factory(FeatureType.ENV, (1,)),
+    }
+    out = processor.feature_contract(features.copy())
+
+    assert OBS_IMAGE in out and out[OBS_IMAGE] == features["pixels"]
+    assert "pixels" not in out
+    assert out["keep"] == features["keep"]
+    assert_contract_is_typed(out)
+
+
+def test_image_processor_feature_contract_observation_pixels_to_image(policy_feature_factory):
+    processor = VanillaObservationProcessor()
+    features = {
+        "observation.pixels": policy_feature_factory(FeatureType.VISUAL, (3, 64, 64)),
+        "keep": policy_feature_factory(FeatureType.ENV, (1,)),
+    }
+    out = processor.feature_contract(features.copy())
+
+    assert OBS_IMAGE in out and out[OBS_IMAGE] == features["observation.pixels"]
+    assert "observation.pixels" not in out
+    assert out["keep"] == features["keep"]
+    assert_contract_is_typed(out)
+
+
+def test_image_processor_feature_contract_multi_camera_and_prefixed(policy_feature_factory):
+    processor = VanillaObservationProcessor()
+    features = {
+        "pixels.front": policy_feature_factory(FeatureType.VISUAL, (3, 64, 64)),
+        "pixels.wrist": policy_feature_factory(FeatureType.VISUAL, (3, 64, 64)),
+        "observation.pixels.rear": policy_feature_factory(FeatureType.VISUAL, (3, 64, 64)),
+        "keep": policy_feature_factory(FeatureType.ENV, (7,)),
+    }
+    out = processor.feature_contract(features.copy())
+
+    assert f"{OBS_IMAGES}.front" in out and out[f"{OBS_IMAGES}.front"] == features["pixels.front"]
+    assert f"{OBS_IMAGES}.wrist" in out and out[f"{OBS_IMAGES}.wrist"] == features["pixels.wrist"]
+    assert f"{OBS_IMAGES}.rear" in out and out[f"{OBS_IMAGES}.rear"] == features["observation.pixels.rear"]
+    assert "pixels.front" not in out and "pixels.wrist" not in out and "observation.pixels.rear" not in out
+    assert out["keep"] == features["keep"]
+    assert_contract_is_typed(out)
+
+
+def test_state_processor_feature_contract_environment_and_agent_pos(policy_feature_factory):
+    processor = VanillaObservationProcessor()
+    features = {
+        "environment_state": policy_feature_factory(FeatureType.STATE, (3,)),
+        "agent_pos": policy_feature_factory(FeatureType.STATE, (7,)),
+        "keep": policy_feature_factory(FeatureType.ENV, (1,)),
+    }
+    out = processor.feature_contract(features.copy())
+
+    assert OBS_ENV_STATE in out and out[OBS_ENV_STATE] == features["environment_state"]
+    assert OBS_STATE in out and out[OBS_STATE] == features["agent_pos"]
+    assert "environment_state" not in out and "agent_pos" not in out
+    assert out["keep"] == features["keep"]
+    assert_contract_is_typed(out)
+
+
+def test_state_processor_feature_contract_prefixed_inputs(policy_feature_factory):
+    proc = VanillaObservationProcessor()
+    features = {
+        "observation.environment_state": policy_feature_factory(FeatureType.STATE, (2,)),
+        "observation.agent_pos": policy_feature_factory(FeatureType.STATE, (4,)),
+    }
+    out = proc.feature_contract(features.copy())
+
+    assert OBS_ENV_STATE in out and out[OBS_ENV_STATE] == features["observation.environment_state"]
+    assert OBS_STATE in out and out[OBS_STATE] == features["observation.agent_pos"]
+    assert "environment_state" not in out and "agent_pos" not in out
+    assert_contract_is_typed(out)
@@ -0,0 +1,467 @@
+#!/usr/bin/env python
+
+# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+import tempfile
+from pathlib import Path
+
+import numpy as np
+import torch
+
+from lerobot.configs.types import FeatureType
+from lerobot.processor import ProcessorStepRegistry, RenameProcessor, RobotProcessor, TransitionKey
+from tests.conftest import assert_contract_is_typed
+
+
+def create_transition(
+    observation=None, action=None, reward=None, done=None, truncated=None, info=None, complementary_data=None
+):
+    """Helper to create an EnvTransition dictionary."""
+    return {
+        TransitionKey.OBSERVATION: observation,
+        TransitionKey.ACTION: action,
+        TransitionKey.REWARD: reward,
+        TransitionKey.DONE: done,
+        TransitionKey.TRUNCATED: truncated,
+        TransitionKey.INFO: info,
+        TransitionKey.COMPLEMENTARY_DATA: complementary_data,
+    }
+
+
+def test_basic_renaming():
+    """Test basic key renaming functionality."""
+    rename_map = {
+        "old_key1": "new_key1",
+        "old_key2": "new_key2",
+    }
+    processor = RenameProcessor(rename_map=rename_map)
+
+    observation = {
+        "old_key1": torch.tensor([1.0, 2.0]),
+        "old_key2": np.array([3.0, 4.0]),
+        "unchanged_key": "keep_me",
+    }
+    transition = create_transition(observation=observation)
+
+    result = processor(transition)
+    processed_obs = result[TransitionKey.OBSERVATION]
+
+    # Check renamed keys
+    assert "new_key1" in processed_obs
+    assert "new_key2" in processed_obs
+    assert "old_key1" not in processed_obs
+    assert "old_key2" not in processed_obs
+
+    # Check values are preserved
+    torch.testing.assert_close(processed_obs["new_key1"], torch.tensor([1.0, 2.0]))
+    np.testing.assert_array_equal(processed_obs["new_key2"], np.array([3.0, 4.0]))
+
+    # Check unchanged key is preserved
+    assert processed_obs["unchanged_key"] == "keep_me"
+
+
+def test_empty_rename_map():
+    """Test processor with empty rename map (should pass through unchanged)."""
+    processor = RenameProcessor(rename_map={})
+
+    observation = {
+        "key1": torch.tensor([1.0]),
+        "key2": "value2",
+    }
+    transition = create_transition(observation=observation)
+
+    result = processor(transition)
+    processed_obs = result[TransitionKey.OBSERVATION]
+
+    # All keys should be unchanged
+    assert processed_obs.keys() == observation.keys()
+    torch.testing.assert_close(processed_obs["key1"], observation["key1"])
+    assert processed_obs["key2"] == observation["key2"]
+
+
+def test_none_observation():
+    """Test processor with None observation."""
+    processor = RenameProcessor(rename_map={"old": "new"})
+
+    transition = create_transition()
+    result = processor(transition)
+
+    # Should return transition unchanged
+    assert result == transition
+
+
+def test_overlapping_rename():
+    """Test renaming when new names might conflict."""
+    rename_map = {
+        "a": "b",
+        "b": "c",  # This creates a potential conflict
+    }
+    processor = RenameProcessor(rename_map=rename_map)
+
+    observation = {
+        "a": 1,
+        "b": 2,
+        "x": 3,
+    }
+    transition = create_transition(observation=observation)
+
+    result = processor(transition)
+    processed_obs = result[TransitionKey.OBSERVATION]
+
+    # Check that renaming happens correctly
+    assert "a" not in processed_obs
+    assert processed_obs["b"] == 1  # 'a' renamed to 'b'
+    assert processed_obs["c"] == 2  # original 'b' renamed to 'c'
+    assert processed_obs["x"] == 3
+
+
+def test_partial_rename():
+    """Test renaming only some keys."""
+    rename_map = {
+        "observation.state": "observation.proprio_state",
+        "pixels": "observation.image",
+    }
+    processor = RenameProcessor(rename_map=rename_map)
+
+    observation = {
+        "observation.state": torch.randn(10),
+        "pixels": np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8),
+        "reward": 1.0,
+        "info": {"episode": 1},
+    }
+    transition = create_transition(observation=observation)
+
+    result = processor(transition)
+    processed_obs = result[TransitionKey.OBSERVATION]
+
+    # Check renamed keys
+    assert "observation.proprio_state" in processed_obs
+    assert "observation.image" in processed_obs
+    assert "observation.state" not in processed_obs
+    assert "pixels" not in processed_obs
+
+    # Check unchanged keys
+    assert processed_obs["reward"] == 1.0
+    assert processed_obs["info"] == {"episode": 1}
+
+
+def test_get_config():
+    """Test configuration serialization."""
+    rename_map = {
+        "old1": "new1",
+        "old2": "new2",
+    }
+    processor = RenameProcessor(rename_map=rename_map)
+
+    config = processor.get_config()
+    assert config == {"rename_map": rename_map}
+
+
+def test_state_dict():
+    """Test state dict (should be empty for RenameProcessor)."""
+    processor = RenameProcessor(rename_map={"old": "new"})
+
+    state = processor.state_dict()
+    assert state == {}
+
+    # Load state dict should work even with empty dict
+    processor.load_state_dict({})
+
+
+def test_integration_with_robot_processor():
+    """Test integration with RobotProcessor pipeline."""
+    rename_map = {
+        "agent_pos": "observation.state",
+        "pixels": "observation.image",
+    }
+    rename_processor = RenameProcessor(rename_map=rename_map)
+
+    pipeline = RobotProcessor([rename_processor])
+
+    observation = {
+        "agent_pos": np.array([1.0, 2.0, 3.0]),
+        "pixels": np.zeros((32, 32, 3), dtype=np.uint8),
+        "other_data": "preserve_me",
+    }
+    transition = create_transition(
+        observation=observation, reward=0.5, done=False, truncated=False, info={}, complementary_data={}
+    )
+
+    result = pipeline(transition)
+    processed_obs = result[TransitionKey.OBSERVATION]
+
+    # Check renaming worked through pipeline
+    assert "observation.state" in processed_obs
+    assert "observation.image" in processed_obs
+    assert "agent_pos" not in processed_obs
+    assert "pixels" not in processed_obs
+    assert processed_obs["other_data"] == "preserve_me"
+
+    # Check other transition elements unchanged
+    assert result[TransitionKey.REWARD] == 0.5
+    assert result[TransitionKey.DONE] is False
+
+
+def test_save_and_load_pretrained():
+    """Test saving and loading processor with RobotProcessor."""
+    rename_map = {
+        "old_state": "observation.state",
+        "old_image": "observation.image",
+    }
+    processor = RenameProcessor(rename_map=rename_map)
+    pipeline = RobotProcessor([processor], name="TestRenameProcessor")
+
+    with tempfile.TemporaryDirectory() as tmp_dir:
+        # Save pipeline
+        pipeline.save_pretrained(tmp_dir)
+
+        # Check files were created
+        config_path = Path(tmp_dir) / "testrenameprocessor.json"  # Based on name="TestRenameProcessor"
+        assert config_path.exists()
+
+        # No state files should be created for RenameProcessor
+        state_files = list(Path(tmp_dir).glob("*.safetensors"))
+        assert len(state_files) == 0
+
+        # Load pipeline
+        loaded_pipeline = RobotProcessor.from_pretrained(tmp_dir)
+
+        assert loaded_pipeline.name == "TestRenameProcessor"
+        assert len(loaded_pipeline) == 1
+
+        # Check that loaded processor works correctly
+        loaded_processor = loaded_pipeline.steps[0]
+        assert isinstance(loaded_processor, RenameProcessor)
+        assert loaded_processor.rename_map == rename_map
+
+        # Test functionality after loading
+        observation = {"old_state": [1, 2, 3], "old_image": "image_data"}
+        transition = create_transition(observation=observation)
+
+        result = loaded_pipeline(transition)
+        processed_obs = result[TransitionKey.OBSERVATION]
+
+        assert "observation.state" in processed_obs
+        assert "observation.image" in processed_obs
+        assert processed_obs["observation.state"] == [1, 2, 3]
+        assert processed_obs["observation.image"] == "image_data"
+
+
+def test_registry_functionality():
+    """Test that RenameProcessor is properly registered."""
+    # Check that it's registered
+    assert "rename_processor" in ProcessorStepRegistry.list()
+
+    # Get from registry
+    retrieved_class = ProcessorStepRegistry.get("rename_processor")
+    assert retrieved_class is RenameProcessor
+
+    # Create instance from registry
+    instance = retrieved_class(rename_map={"old": "new"})
+    assert isinstance(instance, RenameProcessor)
+    assert instance.rename_map == {"old": "new"}
+
+
+def test_registry_based_save_load():
+    """Test save/load using registry name instead of module path."""
+    processor = RenameProcessor(rename_map={"key1": "renamed_key1"})
+    pipeline = RobotProcessor([processor])
+
+    with tempfile.TemporaryDirectory() as tmp_dir:
+        # Save and load
+        pipeline.save_pretrained(tmp_dir)
+
+        # Verify config uses registry name
+        import json
+
+        with open(Path(tmp_dir) / "robotprocessor.json") as f:  # Default name is "RobotProcessor"
+            config = json.load(f)
+
+        assert "registry_name" in config["steps"][0]
+        assert config["steps"][0]["registry_name"] == "rename_processor"
+        assert "class" not in config["steps"][0]  # Should use registry, not module path
+
+        # Load should work
+        loaded_pipeline = RobotProcessor.from_pretrained(tmp_dir)
+        loaded_processor = loaded_pipeline.steps[0]
+        assert isinstance(loaded_processor, RenameProcessor)
+        assert loaded_processor.rename_map == {"key1": "renamed_key1"}
+
+
+def test_chained_rename_processors():
+    """Test multiple RenameProcessors in a pipeline."""
+    # First processor: rename raw keys to intermediate format
+    processor1 = RenameProcessor(
+        rename_map={
+            "pos": "agent_position",
+            "img": "camera_image",
+        }
+    )
+
+    # Second processor: rename to final format
+    processor2 = RenameProcessor(
+        rename_map={
+            "agent_position": "observation.state",
+            "camera_image": "observation.image",
+        }
+    )
+
+    pipeline = RobotProcessor([processor1, processor2])
+
+    observation = {
+        "pos": np.array([1.0, 2.0]),
+        "img": "image_data",
+        "extra": "keep_me",
+    }
+    transition = create_transition(observation=observation)
+
+    # Step through to see intermediate results
+    results = list(pipeline.step_through(transition))
+
+    # After first processor
+    assert "agent_position" in results[1][TransitionKey.OBSERVATION]
+    assert "camera_image" in results[1][TransitionKey.OBSERVATION]
+
+    # After second processor
+    final_obs = results[2][TransitionKey.OBSERVATION]
+    assert "observation.state" in final_obs
+    assert "observation.image" in final_obs
+    assert final_obs["extra"] == "keep_me"
+
+    # Original keys should be gone
+    assert "pos" not in final_obs
+    assert "img" not in final_obs
+    assert "agent_position" not in final_obs
+    assert "camera_image" not in final_obs
+
+
+def test_nested_observation_rename():
+    """Test renaming with nested observation structures."""
+    rename_map = {
+        "observation.images.left": "observation.camera.left_view",
+        "observation.images.right": "observation.camera.right_view",
+        "observation.proprio": "observation.proprioception",
+    }
+    processor = RenameProcessor(rename_map=rename_map)
+
+    observation = {
+        "observation.images.left": torch.randn(3, 64, 64),
+        "observation.images.right": torch.randn(3, 64, 64),
+        "observation.proprio": torch.randn(7),
+        "observation.gripper": torch.tensor([0.0]),  # Not renamed
+    }
+    transition = create_transition(observation=observation)
+
+    result = processor(transition)
+    processed_obs = result[TransitionKey.OBSERVATION]
+
+    # Check renames
+    assert "observation.camera.left_view" in processed_obs
+    assert "observation.camera.right_view" in processed_obs
+    assert "observation.proprioception" in processed_obs
+
+    # Check unchanged key
+    assert "observation.gripper" in processed_obs
+
+    # Check old keys removed
+    assert "observation.images.left" not in processed_obs
+    assert "observation.images.right" not in processed_obs
+    assert "observation.proprio" not in processed_obs
+
+
+def test_value_types_preserved():
+    """Test that various value types are preserved during renaming."""
+    rename_map = {"old_tensor": "new_tensor", "old_array": "new_array", "old_scalar": "new_scalar"}
+    processor = RenameProcessor(rename_map=rename_map)
+
+    tensor_value = torch.randn(3, 3)
+    array_value = np.random.rand(2, 2)
+
+    observation = {
+        "old_tensor": tensor_value,
+        "old_array": array_value,
+        "old_scalar": 42,
+        "old_string": "hello",
+        "old_dict": {"nested": "value"},
+        "old_list": [1, 2, 3],
+    }
+    transition = create_transition(observation=observation)
+
+    result = processor(transition)
+    processed_obs = result[TransitionKey.OBSERVATION]
+
+    # Check that values and types are preserved
+    assert torch.equal(processed_obs["new_tensor"], tensor_value)
+    assert np.array_equal(processed_obs["new_array"], array_value)
+    assert processed_obs["new_scalar"] == 42
+    assert processed_obs["old_string"] == "hello"
+    assert processed_obs["old_dict"] == {"nested": "value"}
+    assert processed_obs["old_list"] == [1, 2, 3]
+
+
+def test_feature_contract_basic_renaming(policy_feature_factory):
+    processor = RenameProcessor(rename_map={"a": "x", "b": "y"})
+    features = {
+        "a": policy_feature_factory(FeatureType.STATE, (2,)),
+        "b": policy_feature_factory(FeatureType.ACTION, (3,)),
+        "c": policy_feature_factory(FeatureType.ENV, (1,)),
+    }
+
+    out = processor.feature_contract(features.copy())
+
+    # Values preserved and typed
+    assert out["x"] == features["a"]
+    assert out["y"] == features["b"]
+    assert out["c"] == features["c"]
+
+    assert_contract_is_typed(out)
+    # Input not mutated
+    assert set(features) == {"a", "b", "c"}
+
+
+def test_feature_contract_overlapping_keys(policy_feature_factory):
+    # Overlapping renames: both 'a' and 'b' exist. 'a'->'b', 'b'->'c'
+    processor = RenameProcessor(rename_map={"a": "b", "b": "c"})
+    features = {
+        "a": policy_feature_factory(FeatureType.STATE, (1,)),
+        "b": policy_feature_factory(FeatureType.STATE, (2,)),
+    }
+    out = processor.feature_contract(features)
+
+    assert set(out) == {"b", "c"}
+    assert out["b"] == features["a"]  # 'a' renamed to'b'
+    assert out["c"] == features["b"]  # 'b' renamed to 'c'
+    assert_contract_is_typed(out)
+
+
+def test_feature_contract_chained_processors(policy_feature_factory):
+    # Chain two rename processors at the contract level
+    processor1 = RenameProcessor(rename_map={"pos": "agent_position", "img": "camera_image"})
+    processor2 = RenameProcessor(
+        rename_map={"agent_position": "observation.state", "camera_image": "observation.image"}
+    )
+    pipeline = RobotProcessor([processor1, processor2])
+
+    spec = {
+        "pos": policy_feature_factory(FeatureType.STATE, (7,)),
+        "img": policy_feature_factory(FeatureType.VISUAL, (3, 64, 64)),
+        "extra": policy_feature_factory(FeatureType.ENV, (1,)),
+    }
+    out = pipeline.feature_contract(initial_features=spec)
+
+    assert set(out) == {"observation.state", "observation.image", "extra"}
+    assert out["observation.state"] == spec["pos"]
+    assert out["observation.image"] == spec["img"]
+    assert out["extra"] == spec["extra"]
+    assert_contract_is_typed(out)
Author	SHA1	Message	Date
CarolinePascal	34454748f4	fix(datasets)	2026-03-09 15:06:55 +01:00
CarolinePascal	8e5763c5ab	fix(datasets)	2026-03-08 20:53:16 +01:00
CarolinePascal	388d4518ba	fix(datasets)	2026-03-07 17:57:56 +01:00
CarolinePascal	232dbe4176	fix(datasets)	2026-03-07 17:46:09 +01:00
CarolinePascal	10c2e2fc87	fix(datasets)	2026-03-07 01:14:19 +01:00
CarolinePascal	5e74f06b20	fix(datasets)	2026-03-07 00:24:01 +01:00
CarolinePascal	07931b1101	fix(datasets)	2026-03-07 00:18:57 +01:00
Steven Palma	b883328e6c	chore: Bump to 0.3.3 (#1690 )	2025-08-06 20:29:48 +02:00
Steven Palma	49ecbeb33f	fix(deps): ceil torch pkg versions (#1689 ) * fix(deps): ceil torch pkg versions * chore(Docs): add todo comment	2025-08-06 20:10:47 +02:00
Adil Zouitine	88f7bf01c1	feat(pipeline): universal processor for LeRobot (#1431 ) * Refactor observation preprocessing to use a modular pipeline system - Introduced `RobotPipeline` and `ObservationProcessor` for handling observation transformations. - Updated `preprocess_observation` to maintain backward compatibility while leveraging the new pipeline. - Added tests for the new processing components and ensured they match the original functionality. - Removed hardcoded logic in favor of a more flexible, composable architecture. * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Refactor observation processing and improve modularity - Updated `ObservationProcessor` to enhance the modular design for processing observations. - Cleaned up imports and improved code readability by removing unnecessary lines and comments. - Ensured backward compatibility while integrating new processing components. - Added tests to validate the functionality of the updated processing architecture. * Remove redundant tests for None observation and serialization methods in `test_observation_processor.py` to streamline the test suite and improve maintainability. * Refactor processing architecture to use RobotProcessor - Replaced instances of RobotPipeline with RobotProcessor across the codebase for improved modularity and clarity. - Introduced ProcessorStepRegistry for better management of processing steps. - Updated relevant documentation and tests to reflect the new processing structure. - Enhanced the save/load functionality to support the new processor design. - Added a model card template for RobotProcessor to facilitate sharing and documentation. * Add RobotProcessor tutorial to documentation - Introduced a new tutorial on using RobotProcessor for preprocessing robot data. - Added a section in the table of contents for easy navigation to the new tutorial. - The tutorial covers key concepts, real-world scenarios, and practical examples for effective use of the RobotProcessor pipeline. * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add normalization processor and related components - Introduced `NormalizationProcessor` to handle both observation normalization and action unnormalization. - Added `ObservationNormalizer` and `ActionUnnormalizer` classes for specific normalization tasks. - Updated `__init__.py` to include the new `NormalizationProcessor` in the module exports. - Enhanced `ObservationProcessor` with registration in the `ProcessorStepRegistry` for better modularity. - Created `RenameProcessor` for renaming keys in observations, improving flexibility in data processing. * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Enhance processing architecture with new components - Added `RenameProcessor` to facilitate key renaming in observations, improving data handling flexibility. - Updated `__init__.py` to include `RenameProcessor` in module exports. - Refactored `NormalizationProcessor` and `ObservationNormalizer` to use `rsplit` for better key handling. - Introduced comprehensive tests for `NormalizationProcessor` and `RenameProcessor` to ensure functionality and robustness. * chore (docs): add docstring for processor * fix (test): test factory * fix(test): policies * Update tests/processor/test_observation_processor.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Adil Zouitine <adilzouitinegm@gmail.com> * chore(test): add suggestion made by copilot regarding numpy test * fix(test): import issue * Refactor normalization components and update tests - Renamed `ObservationNormalizer` to `NormalizerProcessor` and `ActionUnnormalizer` to `UnnormalizerProcessor` for clarity. - Consolidated normalization logic for both observations and actions into `NormalizerProcessor` and `UnnormalizerProcessor`. - Updated tests to reflect the new class names and ensure proper functionality of normalization and unnormalization processes. - Enhanced handling of missing statistics in normalization processes. * chore (docstrin):Improve docstring for NormalizerProcessor * feat (device processor): Implement device processor * chore (batch handling): Enhance processing components with batch conversion utilities * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix(test): linting issue * chore (output format): improves output format * chore (type): add typing for multiprocess envs * feat (overrides): Implement support for loading processors with parameter overrides - Added the ability to provide non-serializable objects when loading processors from saved configurations using the `overrides` parameter. - Enhanced error handling for invalid override keys and instantiation errors. - Updated documentation and examples to illustrate the usage of overrides for both registered and unregistered steps. - Added comprehensive tests to validate the new functionality and ensure backward compatibility. * chore(normalization): addressing comments from copilot * chore(learner): nit comment from copilot * feat(pipeline): Enhance step_through method to support both tuple and dict inputs * refactor(pipeline): Simplify observation and padding data handling in batch transitions * Apply suggestions from code review Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com> Signed-off-by: Adil Zouitine <adilzouitinegm@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refactor(pipeline): Introduce ComplementaryDataProcessor for handling complementary data in transitions * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refactor(pipeline): Transition from tuple to dictionary format for EnvTransition - Updated the EnvTransition structure to use a dictionary format instead of a tuple, enhancing readability and maintainability. - Replaced instances of TransitionIndex with TransitionKey for accessing transition components. - Adjusted related processing functions and tests to accommodate the new dictionary format, ensuring consistent handling of transitions across the codebase. * refactor(observation_processor): Improve observation processing by using constants and simplifying pixel handling - Introduced constants for observation keys to enhance readability. - Streamlined the handling of the "pixels" key by copying observations first and processing images more clearly. - Updated the environment state and agent position assignments to use the new constants, improving maintainability. * feat(pipeline): Add hook unregistration functionality and enhance documentation - Implemented methods to unregister before, after, and reset hooks in the RobotProcessor class, allowing for more flexible hook management. - Enhanced documentation to clarify hook execution semantics and the implications of modifying transitions within hooks. - Added comprehensive tests to verify the correct behavior of hook registration and unregistration, including error handling for non-existent hooks. * refactor(pipeline): Clarify hook behavior and improve documentation - Updated the RobotProcessor class to ensure hooks are strictly for observation and do not modify transitions, enhancing clarity and maintainability. - Refactored hook registration methods to reflect the new behavior, ensuring they accept only functions that do not return modified transitions. - Enhanced documentation to clearly outline the purpose of hooks and their execution semantics. - Added tests to verify that hooks are not executed during the step_through method while ensuring they function correctly during the __call__ method. * feat(pipeline): Add __repr__ method to RobotProcessor for improved readability - Implemented a __repr__ method in the RobotProcessor class to provide a clear string representation of the processor, including step names and optional parameters like name and seed. - Added comprehensive tests to validate the __repr__ output for various scenarios, including empty processors, single and multiple steps, custom names, and seed values. - Ensured that the representation handles long lists of steps with truncation for better readability. * chore(pipeline): Move _CFG_NAME along other class member * refactor(pipeline): Utilize get_safe_torch_device for device assignment - Replaced direct torch.device instantiation with get_safe_torch_device to ensure safe device handling. - This change enhances code readability and maintains consistency in device management across the RobotProcessor class. * refactor(pipeline): Enhance state filename generation and profiling method - Updated state filename generation to use the registry name when available, improving clarity in saved files. - Modified the profile_steps method to include a warmup_runs parameter, allowing for more controlled performance profiling. - Ensured consistent conditions during profiling by deep copying transitions for each run, enhancing accuracy in timing results. * chore(doc): address pip install commant lerobot that not exist yet * feat(pipeline): Enhance configuration filename handling and state file naming - Introduced support for custom configuration filenames in the `save_pretrained` method, allowing users to specify a filename instead of the default. - Improved state file naming to include step indices, preventing conflicts when multiple processors of the same type are saved. - Added automatic detection for configuration files when loading from a directory, with error handling for multiple files. - Updated tests to validate new features, including custom filenames and automatic config detection. * refactor(pipeline): Improve state file naming conventions for clarity and uniqueness - Enhanced state file naming to include the processor's sanitized name, ensuring uniqueness when multiple processors are saved in the same directory. - Updated tests to reflect changes in state file naming, verifying that filenames now include the processor name and step indices to prevent conflicts. - Added a new test to validate state file naming when using multiple processors, ensuring distinct filenames for each processor's state files. * docs(pipeline): Add clarification for repo name sanitization process * Feat/pipeline add feature contract (#1637) * Add feature contract to pipelinestep and pipeline * Add tests * Add processor tests * PR feedback * encorperate pr feedback * type in doc * oops * docs(pipeline): Clarify transition handling and hook behavior - Updated documentation to specify that hooks always receive transitions in EnvTransition format, ensuring consistent behavior across input formats. - Refactored the step_through method to yield only EnvTransition objects, regardless of the input format, and updated related tests to reflect this change. - Enhanced test assertions to verify the structure of results and the correctness of processing steps. * refactor(pipeline): Remove to() method for device management - Eliminated the to() method from RobotProcessor, which was responsible for moving tensor states to specified devices. - Removed associated unit tests that validated the functionality of the to() method across various scenarios. - Streamlined the pipeline code by focusing on other device management strategies. * refactor(pipeline): Remove model card generation and streamline processor methods - Eliminated the _generate_model_card method from RobotProcessor, which was responsible for generating README.md files from a template. - Updated save_pretrained method to remove model card generation, focusing on serialization of processor definitions and parameters. - Added default implementations for get_config, state_dict, load_state_dict, reset, and feature_contract methods in various processor classes to enhance consistency and usability. * refactor(observation): Streamline observation preprocessing and remove unused processor methods - Updated the `preprocess_observation` function to enhance image handling and ensure proper tensor formatting. - Removed the `RobotProcessor` and associated transition handling from the `rollout` function, simplifying the observation processing flow. - Integrated direct calls to `preprocess_observation` for improved clarity and efficiency in the evaluation script. * refactor(pipeline): Rename parameters for clarity and enhance save/load functionality - Updated parameter names in the save_pretrained and from_pretrained methods for improved readability, changing destination_path to save_directory and source to pretrained_model_name_or_path. - Enhanced the save_pretrained method to ensure directory creation and file handling is consistent with the new parameter names. - Streamlined the loading process in from_pretrained to utilize loaded_config for better clarity and maintainability. * refactor(pipeline): minor improvements (#1684) * chore(pipeline): remove unused features + device torch + envtransition keys * refactor(pipeline): ImageProcessor & StateProcessor are both implemented directly in VanillaObservationPRocessor * refactor(pipeline): RenameProcessor now inherits from ObservationProcessor + remove unused code * test(pipeline): fix broken test after refactors * docs(pipeline): update docstrings VanillaObservationProcessor * chore(pipeline): move None check to base pipeline classes --------- Signed-off-by: Adil Zouitine <adilzouitinegm@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com> Co-authored-by: Pepijn <138571049+pkooij@users.noreply.github.com> Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>	2025-08-06 16:11:04 +02:00
Steven Palma	6daa579ce1	docs: update installation instructions (#1686 )	2025-08-06 15:06:36 +02:00
Caroline Pascal	06bebd97b3	fix(typo): fixing typo in LeRobot authors names (#1673 )	2025-08-05 23:47:49 +02:00
HUANG TZU-CHUN	e0096feb6a	fix(docs): Update links in il_robots.mdx and il_sim.mdx to use absolute URLs (#1313 ) * Update links to use absolute URLs. * Update dataset upload example link to use HF_USER variable and match the correct syntax.	2025-08-05 12:33:55 +02:00
Francesco Capuano	90d3a99aa1	Fix policy construction (#1665 ) * add: test to check proper construction with multiple features with STATE/ACTION type * fix: robot and action state should match policy's expectations * fix minor Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com> --------- Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>	2025-08-04 21:49:51 +02:00
Steven Palma	8c577525c1	chore: Bump to 4.0.0 (#1653 )	2025-08-04 11:00:22 +02:00