* Change Diffusion policy to use chunk_size notation instead of horizon to standardize the variable names across policies

* reshape noise after taking it as output of the network
update factory with dsrl
2026-06-30 22:57:00 +00:00 · 2025-11-06 12:02:13 +01:00 · 2025-11-06 12:02:11 +01:00 · 2025-11-06 11:57:29 +01:00 · 2025-11-04 15:56:41 +01:00 · 2025-11-04 14:52:46 +01:00
23 changed files with 2457 additions and 72 deletions
@@ -185,6 +185,11 @@ _Replace `[...]` with your desired features._
 For a full list of optional dependencies, see:
 https://pypi.org/project/lerobot/

+> [!NOTE]
+> For lerobot 0.4.0, installing the `libero` or `pi` extras requires installing directly from GitHub: `pip install "lerobot[pi,libero]@git+https://github.com/huggingface/lerobot.git"`.
+>
+> This will be fixed in the next patch release.
+
 ### Weights & Biases

 To use [Weights and Biases](https://docs.wandb.ai/quickstart) for experiment tracking, log in with
@@ -337,7 +342,3 @@ If you want, you can cite this work with:
 ## Star History

 [![Star History Chart](https://api.star-history.com/svg?repos=huggingface/lerobot&type=Timeline)](https://star-history.com/#huggingface/lerobot&Timeline)
-
-```
-
-```
@@ -41,6 +41,8 @@
    title: NVIDIA GR00T N1.5
  title: "Policies"
 - sections:
+  - local: envhub
+    title: Environments from the Hub
  - local: il_sim
    title: Imitation Learning in Sim
  - local: libero
@@ -0,0 +1,424 @@
+# Loading Environments from the Hub
+
+The **EnvHub** feature allows you to load simulation environments directly from the Hugging Face Hub with a single line of code. This unlocks a powerful new model for collaboration: instead of environments being locked away inside monolithic libraries, anyone can publish custom environments and share them with the community.
+
+## Overview
+
+With EnvHub, you can:
+
+- Load environments from the Hub instantly
+- Share your custom simulation tasks with the community
+- Version control your environments using Git
+- Distribute complex physics simulations without packaging hassles
+
+## Quick Start
+
+Loading an environment from the Hub is as simple as:
+
+```python
+from lerobot.envs.factory import make_env
+
+# Load a hub environment (requires explicit consent to run remote code)
+env = make_env("lerobot/cartpole-env", trust_remote_code=True)
+```
+
+<Tip warning={true}>
+  **Security Notice**: Loading environments from the Hub executes Python code
+  from third-party repositories. Only use `trust_remote_code=True` with
+  repositories you trust. We strongly recommend pinning to a specific commit
+  hash for reproducibility and security.
+</Tip>
+
+## What is EnvHub?
+
+EnvHub is a framework that allows researchers and developers to:
+
+1. **Publish environments** to the Hugging Face Hub as Git repositories
+2. **Load environments** dynamically without installing them as packages
+3. **Version and track** environment changes using Git semantics
+4. **Discover** new simulation tasks shared by the community
+
+This design means you can go from discovering an interesting environment on the Hub to running experiments in seconds, without worrying about dependency conflicts or complex installation procedures.
+
+## Repository Structure
+
+To make your environment loadable from the Hub, your repository must contain at minimum:
+
+### Required Files
+
+**`env.py`** (or custom Python file)
+
+- Must expose a `make_env(n_envs: int, use_async_envs: bool)` function
+- This function should return one of:
+  - A `gym.vector.VectorEnv` (most common)
+  - A single `gym.Env` (will be automatically wrapped)
+  - A dict mapping `{suite_name: {task_id: VectorEnv}}` (for multi-task benchmarks)
+
+### Optional Files
+
+**`requirements.txt`**
+
+- List any additional dependencies your environment needs
+- Users will need to install these manually before loading your environment
+
+**`README.md`**
+
+- Document your environment: what task it implements, observation/action spaces, rewards, etc.
+- Include usage examples and any special setup instructions
+
+**`.gitignore`**
+
+- Exclude unnecessary files from your repository
+
+### Example Repository Structure
+
+```
+my-environment-repo/
+├── env.py                 # Main environment definition (required)
+├── requirements.txt       # Dependencies (optional)
+├── README.md             # Documentation (recommended)
+├── assets/               # Images, videos, etc. (optional)
+│   └── demo.gif
+└── configs/              # Config files if needed (optional)
+    └── task_config.yaml
+```
+
+## Creating Your Environment Repository
+
+### Step 1: Define Your Environment
+
+Create an `env.py` file with a `make_env` function:
+
+```python
+# env.py
+import gymnasium as gym
+
+def make_env(n_envs: int = 1, use_async_envs: bool = False):
+    """
+    Create vectorized environments for your custom task.
+
+    Args:
+        n_envs: Number of parallel environments
+        use_async_envs: Whether to use AsyncVectorEnv or SyncVectorEnv
+
+    Returns:
+        gym.vector.VectorEnv or dict mapping suite names to vectorized envs
+    """
+    def _make_single_env():
+        # Create your custom environment
+        return gym.make("CartPole-v1")
+
+    # Choose vector environment type
+    env_cls = gym.vector.AsyncVectorEnv if use_async_envs else gym.vector.SyncVectorEnv
+
+    # Create vectorized environment
+    vec_env = env_cls([_make_single_env for _ in range(n_envs)])
+
+    return vec_env
+```
+
+### Step 2: Test Locally
+
+Before uploading, test your environment locally:
+
+```python
+from lerobot.envs.utils import _load_module_from_path, _call_make_env, _normalize_hub_result
+
+# Load your module
+module = _load_module_from_path("./env.py")
+
+# Test the make_env function
+result = _call_make_env(module, n_envs=2, use_async_envs=False)
+normalized = _normalize_hub_result(result)
+
+# Verify it works
+suite_name = next(iter(normalized))
+env = normalized[suite_name][0]
+obs, info = env.reset()
+print(f"Observation shape: {obs.shape if hasattr(obs, 'shape') else type(obs)}")
+env.close()
+```
+
+### Step 3: Upload to the Hub
+
+Upload your repository to Hugging Face:
+
+```bash
+# Install huggingface_hub if needed
+pip install huggingface_hub
+
+# Login to Hugging Face
+huggingface-cli login
+
+# Create a new repository
+huggingface-cli repo create my-custom-env --type space --org my-org
+
+# Initialize git and push
+git init
+git add .
+git commit -m "Initial environment implementation"
+git remote add origin https://huggingface.co/my-org/my-custom-env
+git push -u origin main
+```
+
+Alternatively, use the `huggingface_hub` Python API:
+
+```python
+from huggingface_hub import HfApi
+
+api = HfApi()
+
+# Create repository
+api.create_repo("my-custom-env", repo_type="space")
+
+# Upload files
+api.upload_folder(
+    folder_path="./my-env-folder",
+    repo_id="username/my-custom-env",
+    repo_type="space",
+)
+```
+
+## Loading Environments from the Hub
+
+### Basic Usage
+
+```python
+from lerobot.envs.factory import make_env
+
+# Load from the hub
+envs_dict = make_env(
+    "username/my-custom-env",
+    n_envs=4,
+    trust_remote_code=True
+)
+
+# Access the environment
+suite_name = next(iter(envs_dict))
+env = envs_dict[suite_name][0]
+
+# Use it like any gym environment
+obs, info = env.reset()
+action = env.action_space.sample()
+obs, reward, terminated, truncated, info = env.step(action)
+```
+
+### Advanced: Pinning to Specific Versions
+
+For reproducibility and security, pin to a specific Git revision:
+
+```python
+# Pin to a specific branch
+env = make_env("username/my-env@main", trust_remote_code=True)
+
+# Pin to a specific commit (recommended for papers/experiments)
+env = make_env("username/my-env@abc123def456", trust_remote_code=True)
+
+# Pin to a tag
+env = make_env("username/my-env@v1.0.0", trust_remote_code=True)
+```
+
+### Custom File Paths
+
+If your environment definition is not in `env.py`:
+
+```python
+# Load from a custom file
+env = make_env("username/my-env:custom_env.py", trust_remote_code=True)
+
+# Combine with version pinning
+env = make_env("username/my-env@v1.0:envs/task_a.py", trust_remote_code=True)
+```
+
+### Async Environments
+
+For better performance with multiple environments:
+
+```python
+envs_dict = make_env(
+    "username/my-env",
+    n_envs=8,
+    use_async_envs=True,  # Use AsyncVectorEnv for parallel execution
+    trust_remote_code=True
+)
+```
+
+## URL Format Reference
+
+The hub URL format supports several patterns:
+
+| Pattern              | Description                    | Example                                |
+| -------------------- | ------------------------------ | -------------------------------------- |
+| `user/repo`          | Load `env.py` from main branch | `make_env("lerobot/pusht-env")`        |
+| `user/repo@revision` | Load from specific revision    | `make_env("lerobot/pusht-env@main")`   |
+| `user/repo:path`     | Load custom file               | `make_env("lerobot/envs:pusht.py")`    |
+| `user/repo@rev:path` | Revision + custom file         | `make_env("lerobot/envs@v1:pusht.py")` |
+
+## Multi-Task Environments
+
+For benchmarks with multiple tasks (like LIBERO), return a nested dictionary:
+
+```python
+import gymnasium as gym
+
+
+def make_env(n_envs: int = 1, use_async_envs: bool = False):
+    env_cls = gym.vector.AsyncVectorEnv if use_async_envs else gym.vector.SyncVectorEnv
+
+    # Return dict: {suite_name: {task_id: VectorEnv}}
+    return {
+        "suite_1": {
+            0: env_cls([lambda: gym.make("Task1-v0") for _ in range(n_envs)]),
+            1: env_cls([lambda: gym.make("Task2-v0") for _ in range(n_envs)]),
+        },
+        "suite_2": {
+            0: env_cls([lambda: gym.make("Task3-v0") for _ in range(n_envs)]),
+        }
+    }
+```
+
+## Security Considerations
+
+<Tip warning={true}>
+  **Important**: The `trust_remote_code=True` flag is required to execute
+  environment code from the Hub. This is by design for security.
+</Tip>
+
+When loading environments from the Hub:
+
+1. **Review the code first**: Visit the repository and inspect `env.py` before loading
+2. **Pin to commits**: Use specific commit hashes for reproducibility
+3. **Check dependencies**: Review `requirements.txt` for suspicious packages
+4. **Use trusted sources**: Prefer official organizations or well-known researchers
+5. **Sandbox if needed**: Run untrusted code in isolated environments (containers, VMs)
+
+Example of safe usage:
+
+```python
+# ❌ BAD: Loading without inspection
+env = make_env("random-user/untrusted-env", trust_remote_code=True)
+
+# ✅ GOOD: Review code, then pin to specific commit
+# 1. Visit https://huggingface.co/trusted-org/verified-env
+# 2. Review the env.py file
+# 3. Copy the commit hash
+env = make_env("trusted-org/verified-env@a1b2c3d4", trust_remote_code=True)
+```
+
+## Example: CartPole from the Hub
+
+Here's a complete example using the reference CartPole environment:
+
+```python
+from lerobot.envs.factory import make_env
+import numpy as np
+
+# Load the environment
+envs_dict = make_env("lerobot/cartpole-env", n_envs=4, trust_remote_code=True)
+
+# Get the vectorized environment
+suite_name = next(iter(envs_dict))
+env = envs_dict[suite_name][0]
+
+# Run a simple episode
+obs, info = env.reset()
+done = np.zeros(env.num_envs, dtype=bool)
+total_reward = np.zeros(env.num_envs)
+
+while not done.all():
+    # Random policy
+    action = env.action_space.sample()
+    obs, reward, terminated, truncated, info = env.step(action)
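+    total_reward += reward * ~done  # count rewards only for episodes still running
+    done |= terminated | truncated  # remember every env that has finished at least once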
+
+print(f"Average reward: {total_reward.mean():.2f}")
+env.close()
+```
+
+## Benefits of EnvHub
+
+### For Environment Authors
+
+- **Easy distribution**: No PyPI packaging required
+- **Version control**: Use Git for environment versioning
+- **Rapid iteration**: Push updates instantly
+- **Documentation**: Hub README renders beautifully
+- **Community**: Reach LeRobot users directly
+
+### For Researchers
+
+- **Quick experiments**: Load any environment in one line
+- **Reproducibility**: Pin to specific commits
+- **Discovery**: Browse environments on the Hub
+- **No conflicts**: No need to install conflicting packages
+
+### For the Community
+
+- **Growing ecosystem**: More diverse simulation tasks
+- **Standardization**: Common `make_env` API
+- **Collaboration**: Fork and improve existing environments
+- **Accessibility**: Lower barrier to sharing research
+
+## Troubleshooting
+
+### "Refusing to execute remote code"
+
+You must explicitly pass `trust_remote_code=True`:
+
+```python
+env = make_env("user/repo", trust_remote_code=True)
+```
+
+### "Module X not found"
+
+The hub environment has dependencies you need to install:
+
+```bash
+# Check the repo's requirements.txt and install dependencies
+pip install gymnasium numpy
+```
+
+### "make_env not found in module"
+
+Your `env.py` must expose a `make_env` function:
+
+```python
+def make_env(n_envs: int, use_async_envs: bool):
+    # Your implementation
+    pass
+```
+
+### Environment returns wrong type
+
+The `make_env` function must return:
+
+- A `gym.vector.VectorEnv`, or
+- A single `gym.Env`, or
+- A dict `{suite_name: {task_id: VectorEnv}}`
+
+## Best Practices
+
+1. **Document your environment**: Include observation/action space descriptions, reward structure, and termination conditions in your README
+2. **Add requirements.txt**: List all dependencies with versions
+3. **Test thoroughly**: Verify your environment works locally before pushing
+4. **Use semantic versioning**: Tag releases with version numbers
+5. **Add examples**: Include usage examples in your README
+6. **Keep it simple**: Minimize dependencies when possible
+7. **License your work**: Add a LICENSE file to clarify usage terms
+
+## Future Directions
+
+The EnvHub ecosystem enables exciting possibilities:
+
+- **GPU-accelerated physics**: Share Isaac Gym or Brax environments
+- **Photorealistic rendering**: Distribute environments with advanced graphics
+- **Multi-agent scenarios**: Complex interaction tasks
+- **Real-world simulators**: Digital twins of physical setups
+- **Procedural generation**: Infinite task variations
+- **Domain randomization**: Pre-configured DR pipelines
+
+As more researchers and developers contribute, the diversity and quality of available environments will grow, benefiting the entire robotics learning community.
+
+## See Also
+
+- [Hugging Face Hub Documentation](https://huggingface.co/docs/hub/en/index)
+- [Gymnasium Documentation](https://gymnasium.farama.org/index.html)
+- [Example Hub Environment](https://huggingface.co/lerobot/cartpole-env)
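
The URL patterns listed in the "URL Format Reference" table of the new doc can be illustrated with a small standalone parser. This is a minimal sketch, not the code shipped in this PR: `parse_hub_reference` is a hypothetical name, and it assumes only the `[repo_id][@revision][:path]` grammar described above, with `env.py` as the default file.

```python
from typing import Optional, Tuple


def parse_hub_reference(ref: str) -> Tuple[str, Optional[str], str]:
    """Split 'user/repo[@revision][:path]' into (repo_id, revision, file_path)."""
    # Everything after the first ':' is the file path; default to env.py.
    repo_and_rev, colon, path = ref.partition(":")
    file_path = path if colon else "env.py"
    # Everything after the first '@' (before the path) is the revision.
    repo_id, at, revision = repo_and_rev.partition("@")
    return repo_id, (revision if at else None), file_path


for ref in (
    "lerobot/pusht-env",
    "lerobot/pusht-env@main",
    "lerobot/envs:pusht.py",
    "lerobot/envs@v1:pusht.py",
):
    print(ref, "->", parse_hub_reference(ref))
```

A reference without a revision yields `None`, which leaves the choice of default branch to the download layer.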
@@ -40,7 +40,7 @@ python -c "import flash_attn; print(f'Flash Attention {flash_attn.__version__} i
 3. Install LeRobot by running:

 ```bash
-pip install lerobot[groot] # consider also installing libero,dev and test tags
+pip install lerobot[groot]
 ```

 ## Usage
@@ -83,6 +83,9 @@ accelerate launch \

 ### Libero Benchmark Results

+> [!NOTE]
+> Follow our instructions for Libero usage: [Libero](./libero)
+
 GR00T has demonstrated strong performance on the Libero benchmark suite. To compare and test its LeRobot implementation, we finetuned the GR00T N1.5 model for 30k steps on the Libero dataset and compared the results to the GR00T reference results.

 | Benchmark          | LeRobot Implementation | GR00T Reference |
@@ -28,6 +28,11 @@ LIBERO is now part of our **multi-eval supported simulation**, meaning you can b
 To Install LIBERO, after following LeRobot official instructions, just do:
 `pip install -e ".[libero]"`

+> [!NOTE]
+> For lerobot 0.4.0, installing the `libero` extra requires installing directly from GitHub: `pip install "lerobot[libero]@git+https://github.com/huggingface/lerobot.git"`.
+>
+> This will be fixed in the next patch release.
+
 ### Single-suite evaluation

 Evaluate a policy on one LIBERO suite:
@@ -28,6 +28,11 @@ As described by Physical Intelligence, while AI has achieved remarkable success
   pip install -e ".[pi]"
   ```

+   > [!NOTE]
+   > For lerobot 0.4.0, installing the `pi` extra requires installing directly from GitHub: `pip install "lerobot[pi]@git+https://github.com/huggingface/lerobot.git"`.
+   >
+   > This will be fixed in the next patch release.
+
 ## Training Data and Capabilities

 π₀ is trained on the largest robot interaction dataset to date, combining three key data sources:
@@ -36,6 +36,11 @@ This diverse training mixture creates a "curriculum" that enables generalization
   pip install -e ".[pi]"
   ```

+   > [!NOTE]
+   > For lerobot 0.4.0, installing the `pi` extra requires installing directly from GitHub: `pip install "lerobot[pi]@git+https://github.com/huggingface/lerobot.git"`.
+>
+   > This will be fixed in the next patch release.
+
 ## Usage

 To use π₀.₅ in your LeRobot configuration, specify the policy type as:
@@ -25,7 +25,7 @@ discord = "https://discord.gg/s3KuuzsPFb"

 [project]
 name = "lerobot"
-version = "0.4.0"
+version = "0.4.1"
 description = "🤗 LeRobot: State-of-the-art Machine Learning for Real-World Robotics in Pytorch"
 readme = "README.md"
 license = { text = "Apache-2.0" }
@@ -142,7 +142,7 @@ video_benchmark = ["scikit-image>=0.23.2,<0.26.0", "pandas>=2.2.2,<2.4.0"]
 # Simulation
 aloha = ["gym-aloha>=0.1.2,<0.2.0"]
 pusht = ["gym-pusht>=0.1.5,<0.2.0", "pymunk>=6.6.0,<7.0.0"] # TODO: Fix pymunk version in gym-pusht instead
-libero = ["lerobot[transformers-dep]", "libero @ git+https://github.com/huggingface/lerobot-libero.git@main#egg=libero"]
+libero = ["lerobot[transformers-dep]", "hf-libero>=0.1.3,<0.2.0"]
 metaworld = ["metaworld==3.0.0"]

 # All
@@ -39,6 +39,7 @@ from lerobot.datasets.aggregate import aggregate_datasets
 from lerobot.datasets.compute_stats import aggregate_stats
 from lerobot.datasets.lerobot_dataset import LeRobotDataset, LeRobotDatasetMetadata
 from lerobot.datasets.utils import (
+    DATA_DIR,
    DEFAULT_CHUNK_SIZE,
    DEFAULT_DATA_FILE_SIZE_IN_MB,
    DEFAULT_DATA_PATH,
@@ -962,28 +963,23 @@ def _copy_data_with_feature_changes(
    remove_features: list[str] | None = None,
 ) -> None:
    """Copy data while adding or removing features."""
-    if dataset.meta.episodes is None:
-        dataset.meta.episodes = load_episodes(dataset.meta.root)
+    data_dir = dataset.root / DATA_DIR
+    parquet_files = sorted(data_dir.glob("*/*.parquet"))

-    # Map file paths to episode indices to extract chunk/file indices
-    file_to_episodes: dict[Path, set[int]] = {}
-    for ep_idx in range(dataset.meta.total_episodes):
-        file_path = dataset.meta.get_data_file_path(ep_idx)
-        if file_path not in file_to_episodes:
-            file_to_episodes[file_path] = set()
-        file_to_episodes[file_path].add(ep_idx)
+    if not parquet_files:
+        raise ValueError(f"No parquet files found in {data_dir}")

    frame_idx = 0

-    for src_path in tqdm(sorted(file_to_episodes.keys()), desc="Processing data files"):
-        df = pd.read_parquet(dataset.root / src_path).reset_index(drop=True)
+    for src_path in tqdm(parquet_files, desc="Processing data files"):
+        df = pd.read_parquet(src_path).reset_index(drop=True)

-        # Get chunk_idx and file_idx from the source file's first episode
-        episodes_in_file = file_to_episodes[src_path]
-        first_ep_idx = min(episodes_in_file)
-        src_ep = dataset.meta.episodes[first_ep_idx]
-        chunk_idx = src_ep["data/chunk_index"]
-        file_idx = src_ep["data/file_index"]
+        relative_path = src_path.relative_to(dataset.root)
+        chunk_dir = relative_path.parts[1]
+        file_name = relative_path.parts[2]
+
+        chunk_idx = int(chunk_dir.split("-")[1])
+        file_idx = int(file_name.split("-")[1].split(".")[0])

        if remove_features:
            df = df.drop(columns=remove_features, errors="ignore")
@@ -1009,7 +1005,7 @@ def _copy_data_with_feature_changes(
                        df[feature_name] = feature_slice
            frame_idx = end_idx

-        # Write using the preserved chunk_idx and file_idx from source
+        # Write using the same chunk/file structure as source
        dst_path = new_meta.root / DEFAULT_DATA_PATH.format(chunk_index=chunk_idx, file_index=file_idx)
        dst_path.parent.mkdir(parents=True, exist_ok=True)
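
The hunks above replace the episode-metadata lookup with indices parsed from the file path itself. A minimal sketch of that parsing follows, assuming a `data/chunk-XXX/file-XXX.parquet` layout (the value of `DEFAULT_DATA_PATH` is not shown in this diff, so the layout is an assumption, and `chunk_and_file_index` is a hypothetical helper name).

```python
from pathlib import Path


def chunk_and_file_index(src_path: Path, root: Path) -> tuple[int, int]:
    """Recover (chunk_idx, file_idx) from a path like data/chunk-000/file-003.parquet."""
    relative = src_path.relative_to(root)
    # Under the assumed layout, parts == ("data", "chunk-000", "file-003.parquet").
    chunk_dir, file_name = relative.parts[1], relative.parts[2]
    chunk_idx = int(chunk_dir.split("-")[1])
    file_idx = int(file_name.split("-")[1].split(".")[0])
    return chunk_idx, file_idx


root = Path("/tmp/my_dataset")  # hypothetical dataset root; no file access is needed
print(chunk_and_file_index(root / "data" / "chunk-000" / "file-003.parquet", root))
```

Because the indices come from the directory and file names, this approach depends on the naming convention staying stable; a file that does not follow it would raise `IndexError` or `ValueError`.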

@@ -430,9 +430,7 @@ class LeRobotDatasetMetadata:
        video_keys = [video_key] if video_key is not None else self.video_keys
        for key in video_keys:
            if not self.features[key].get("info", None):
-                video_path = self.root / self.video_path.format(
-                    video_key=video_key, chunk_index=0, file_index=0
-                )
+                video_path = self.root / self.video_path.format(video_key=key, chunk_index=0, file_index=0)
                self.info["features"][key]["info"] = get_video_info(video_path)

    def update_chunk_settings(
@@ -19,6 +19,7 @@ import gymnasium as gym
 from gymnasium.envs.registration import registry as gym_registry

 from lerobot.envs.configs import AlohaEnv, EnvConfig, LiberoEnv, PushtEnv
+from lerobot.envs.utils import _call_make_env, _download_hub_file, _import_hub_module, _normalize_hub_result


 def make_env_config(env_type: str, **kwargs) -> EnvConfig:
@@ -33,15 +34,24 @@ def make_env_config(env_type: str, **kwargs) -> EnvConfig:


 def make_env(
-    cfg: EnvConfig, n_envs: int = 1, use_async_envs: bool = False
+    cfg: EnvConfig | str,
+    n_envs: int = 1,
+    use_async_envs: bool = False,
+    hub_cache_dir: str | None = None,
+    trust_remote_code: bool = False,
 ) -> dict[str, dict[int, gym.vector.VectorEnv]]:
-    """Makes a gym vector environment according to the config.
+    """Makes a gym vector environment according to the config or Hub reference.

    Args:
-        cfg (EnvConfig): the config of the environment to instantiate.
+        cfg (EnvConfig | str): Either an `EnvConfig` object describing the environment to build locally,
+            or a Hugging Face Hub repository identifier (e.g. `"username/repo"`). In the latter case,
+            the repo must include a Python file (usually `env.py`).
        n_envs (int, optional): The number of parallelized env to return. Defaults to 1.
        use_async_envs (bool, optional): Whether to return an AsyncVectorEnv or a SyncVectorEnv. Defaults to
            False.
+        hub_cache_dir (str | None): Optional cache path for downloaded hub files.
+        trust_remote_code (bool): **Explicit consent** to execute remote code from the Hub.
+            Default False — must be set to True to import/exec hub `env.py`.

    Raises:
        ValueError: if n_envs < 1
@@ -54,6 +64,21 @@ def make_env(
            - For single-task environments: a single suite entry (cfg.type) with task_id=0.

    """
+    # if user passed a hub id string (e.g., "username/repo", "username/repo@main:env.py")
+    # simplified: only support hub-provided `make_env`
+    if isinstance(cfg, str):
+        # _download_hub_file will raise the same RuntimeError if trust_remote_code is False
+        repo_id, file_path, local_file, revision = _download_hub_file(cfg, trust_remote_code, hub_cache_dir)
+
+        # import and surface clear import errors
+        module = _import_hub_module(local_file, repo_id)
+
+        # call the hub-provided make_env
+        raw_result = _call_make_env(module, n_envs=n_envs, use_async_envs=use_async_envs)
+
+        # normalize the return into {suite: {task_id: vec_env}}
+        return _normalize_hub_result(raw_result)
+
    if n_envs < 1:
        raise ValueError("`n_envs` must be at least 1")

@@ -13,6 +13,8 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
+import importlib.util
+import os
 import warnings
 from collections.abc import Mapping, Sequence
 from functools import singledispatch
@@ -22,6 +24,7 @@ import einops
 import gymnasium as gym
 import numpy as np
 import torch
+from huggingface_hub import hf_hub_download, snapshot_download
 from torch import Tensor

 from lerobot.configs.types import FeatureType, PolicyFeature
@@ -195,3 +198,132 @@ def _(envs: Sequence) -> None:
@close_envs.register
 def _(env: gym.Env) -> None:
    _close_single_env(env)
+
+
+# helper to safely load a python file as a module
+def _load_module_from_path(path: str, module_name: str | None = None):
+    module_name = module_name or f"hub_env_{os.path.basename(path).replace('.', '_')}"
+    spec = importlib.util.spec_from_file_location(module_name, path)
+    if spec is None:
+        raise ImportError(f"Could not load module spec for {module_name} from {path}")
+    module = importlib.util.module_from_spec(spec)
+    spec.loader.exec_module(module)  # type: ignore
+    return module
+
+
+# helper to parse hub string (supports "user/repo", "user/repo@rev", optional path)
+# examples:
+#   "user/repo" -> will look for env.py at repo root
+#   "user/repo@main:envs/my_env.py" -> explicit revision and path
+def _parse_hub_url(hub_uri: str):
+    # very small parser: [repo_id][@revision][:path]
+    # repo_id is required (user/repo or org/repo)
+    revision = None
+    file_path = "env.py"
+    if "@" in hub_uri:
+        repo_and_rev, *rest = hub_uri.split(":", 1)
+        repo_id, rev = repo_and_rev.split("@", 1)
+        revision = rev
+        if rest:
+            file_path = rest[0]
+    else:
+        repo_id, *rest = hub_uri.split(":", 1)
+        if rest:
+            file_path = rest[0]
+    return repo_id, revision, file_path
+
+
+def _download_hub_file(
+    cfg_str: str,
+    trust_remote_code: bool,
+    hub_cache_dir: str | None,
+) -> tuple[str, str, str, str | None]:
+    """
+    Parse `cfg_str` (hub URL), enforce `trust_remote_code`, and return
+    (repo_id, file_path, local_file, revision).
+    """
+    if not trust_remote_code:
+        raise RuntimeError(
+            f"Refusing to execute remote code from the Hub for '{cfg_str}'. "
+            "Executing hub env modules runs arbitrary Python code from third-party repositories. "
+            "If you trust this repo and understand the risks, call `make_env(..., trust_remote_code=True)` "
+            "and prefer pinning to a specific revision: 'user/repo@<commit-hash>:env.py'."
+        )
+
+    repo_id, revision, file_path = _parse_hub_url(cfg_str)
+
+    try:
+        local_file = hf_hub_download(
+            repo_id=repo_id, filename=file_path, revision=revision, cache_dir=hub_cache_dir
+        )
+    except Exception as e:
+        # fallback to snapshot download
+        snapshot_dir = snapshot_download(repo_id=repo_id, revision=revision, cache_dir=hub_cache_dir)
+        local_file = os.path.join(snapshot_dir, file_path)
+        if not os.path.exists(local_file):
+            raise FileNotFoundError(
+                f"Could not find {file_path} in repository {repo_id}@{revision or 'main'}"
+            ) from e
+
+    return repo_id, file_path, local_file, revision
+
+
+def _import_hub_module(local_file: str, repo_id: str) -> Any:
+    """
+    Import the downloaded file as a module and surface helpful import error messages.
+    """
+    module_name = f"hub_env_{repo_id.replace('/', '_')}"
+    try:
+        module = _load_module_from_path(local_file, module_name=module_name)
+    except ModuleNotFoundError as e:
+        missing = getattr(e, "name", None) or str(e)
+        raise ModuleNotFoundError(
+            f"Hub env '{repo_id}:{os.path.basename(local_file)}' failed to import because the dependency "
+            f"'{missing}' is not installed locally.\n\n"
+        ) from e
+    except ImportError as e:
+        raise ImportError(
+            f"Failed to load hub env module '{repo_id}:{os.path.basename(local_file)}'. Import error: {e}\n\n"
+        ) from e
+    return module
+
+
+def _call_make_env(module: Any, n_envs: int, use_async_envs: bool) -> Any:
+    """
+    Ensure module exposes make_env and call it.
+    """
+    if not hasattr(module, "make_env"):
+        raise AttributeError(
+            f"The hub module {getattr(module, '__name__', 'hub_module')} must expose `make_env(n_envs=int, use_async_envs=bool)`."
+        )
+    entry_fn = module.make_env
+    return entry_fn(n_envs=n_envs, use_async_envs=use_async_envs)
+
+
+def _normalize_hub_result(result: Any) -> dict[str, dict[int, gym.vector.VectorEnv]]:
+    """
+    Normalize possible return types from hub `make_env` into the mapping:
+      { suite_name: { task_id: vector_env } }
+    Accepts:
+      - dict (assumed already correct)
+      - gym.vector.VectorEnv
+      - gym.Env (will be wrapped into SyncVectorEnv)
+    """
+    if isinstance(result, dict):
+        return result
+
+    # VectorEnv: use its spec.id if available
+    if isinstance(result, gym.vector.VectorEnv):
+        suite_name = getattr(result, "spec", None) and getattr(result.spec, "id", None) or "hub_env"
+        return {suite_name: {0: result}}
+
+    # Single Env: wrap into SyncVectorEnv
+    if isinstance(result, gym.Env):
+        vec = gym.vector.SyncVectorEnv([lambda: result])
+        suite_name = getattr(result, "spec", None) and getattr(result.spec, "id", None) or "hub_env"
+        return {suite_name: {0: vec}}
+
+    raise ValueError(
+        "Hub `make_env` must return either a mapping {suite: {task_id: vec_env}}, "
+        "a gym.vector.VectorEnv, or a single gym.Env."
+    )
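
The load-then-call flow added here (import a downloaded file as a module, then invoke its `make_env`) can be tried without the Hub or gymnasium. The sketch below is an illustration under stated assumptions: it writes a stand-in `env.py` to a temporary directory, and the hypothetical `make_env` in it returns a plain dict instead of a vectorized environment.

```python
import importlib.util
import tempfile
from pathlib import Path

# Stand-in for a downloaded hub file; a real env.py would return a VectorEnv.
ENV_SOURCE = '''
def make_env(n_envs: int = 1, use_async_envs: bool = False):
    return {"n_envs": n_envs, "use_async_envs": use_async_envs}
'''


def load_module_from_path(path: str, module_name: str):
    """Execute a Python file and return it as a module object."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Could not load module spec for {module_name} from {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    return module


def call_make_env(module, n_envs: int, use_async_envs: bool):
    """Check that the module exposes make_env, then call it with keyword arguments."""
    if not hasattr(module, "make_env"):
        raise AttributeError("module must expose make_env(n_envs, use_async_envs)")
    return module.make_env(n_envs=n_envs, use_async_envs=use_async_envs)


with tempfile.TemporaryDirectory() as tmp:
    env_file = Path(tmp) / "env.py"
    env_file.write_text(ENV_SOURCE)
    module = load_module_from_path(str(env_file), "hub_env_demo")
    result = call_make_env(module, n_envs=2, use_async_envs=False)

print(result)
```

Note that `exec_module` runs all top-level code in the file, which is why the real helper is gated behind `trust_remote_code=True`.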
@@ -45,7 +45,7 @@ class DiffusionConfig(PreTrainedConfig):
    Args:
        n_obs_steps: Number of environment steps worth of observations to pass to the policy (takes the
            current step and additional steps going back).
-        horizon: Diffusion model action prediction size as detailed in `DiffusionPolicy.select_action`.
+        chunk_size: Diffusion model action prediction size as detailed in `DiffusionPolicy.select_action`.
        n_action_steps: The number of action steps to run in the environment for one invocation of the policy.
            See `DiffusionPolicy.select_action` for more details.
        input_shapes: A dictionary defining the shapes of the input data for the policy. The key represents
@@ -105,7 +105,7 @@ class DiffusionConfig(PreTrainedConfig):

    # Inputs / output structure.
    n_obs_steps: int = 2
-    horizon: int = 16
+    chunk_size: int = 16
    n_action_steps: int = 8

    normalization_mapping: dict[str, NormalizationMode] = field(
@@ -118,7 +118,7 @@ class DiffusionConfig(PreTrainedConfig):

    # The original implementation doesn't sample frames for the last 7 steps,
    # which avoids excessive padding and leads to improved training results.
-    drop_n_last_frames: int = 7  # horizon - n_action_steps - n_obs_steps + 1
+    drop_n_last_frames: int = 7  # chunk_size - n_action_steps - n_obs_steps + 1

    # Architecture / modeling.
    # Vision backbone.
@@ -180,13 +180,13 @@ class DiffusionConfig(PreTrainedConfig):
                f"Got {self.noise_scheduler_type}."
            )

-        # Check that the horizon size and U-Net downsampling is compatible.
+        # Check that the chunk size and U-Net downsampling are compatible.
        # U-Net downsamples by 2 with each stage.
        downsampling_factor = 2 ** len(self.down_dims)
-        if self.horizon % downsampling_factor != 0:
+        if self.chunk_size % downsampling_factor != 0:
            raise ValueError(
-                "The horizon should be an integer multiple of the downsampling factor (which is determined "
-                f"by `len(down_dims)`). Got {self.horizon=} and {self.down_dims=}"
+                "The chunk_size should be an integer multiple of the downsampling factor (which is determined "
+                f"by `len(down_dims)`). Got {self.chunk_size=} and {self.down_dims=}"
            )

    def get_optimizer_preset(self) -> AdamConfig:
@@ -231,7 +231,7 @@ class DiffusionConfig(PreTrainedConfig):

    @property
    def action_delta_indices(self) -> list:
-        return list(range(1 - self.n_obs_steps, 1 - self.n_obs_steps + self.horizon))
+        return list(range(1 - self.n_obs_steps, 1 - self.n_obs_steps + self.chunk_size))

    @property
    def reward_delta_indices(self) -> None:
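The renamed `chunk_size` takes part in three small pieces of arithmetic in this config: the U-Net divisibility check, `action_delta_indices`, and the `drop_n_last_frames` comment. The sketch below reproduces them outside the class with the defaults shown in the diff (`n_obs_steps=2`, `chunk_size=16`, `n_action_steps=8`). The standalone function names and the three-stage `down_dims` value are assumptions for illustration, not taken from the diff.

```python
def downsampling_factor(down_dims: tuple[int, ...]) -> int:
    # The U-Net halves the temporal dimension once per stage.
    return 2 ** len(down_dims)


def check_chunk_size(chunk_size: int, down_dims: tuple[int, ...]) -> None:
    # Mirrors the validation in the config: chunk_size must divide evenly.
    factor = downsampling_factor(down_dims)
    if chunk_size % factor != 0:
        raise ValueError(
            f"chunk_size={chunk_size} is not a multiple of the downsampling factor {factor}"
        )


def action_delta_indices(n_obs_steps: int, chunk_size: int) -> list[int]:
    # Offsets relative to the current step; the chunk starts at the first observation.
    return list(range(1 - n_obs_steps, 1 - n_obs_steps + chunk_size))


n_obs_steps, chunk_size, n_action_steps = 2, 16, 8
down_dims = (512, 1024, 2048)  # assumed value: three U-Net stages

check_chunk_size(chunk_size, down_dims)
indices = action_delta_indices(n_obs_steps, chunk_size)
drop_n_last_frames = chunk_size - n_action_steps - n_obs_steps + 1
```

With these values the chunk covers offsets from one step in the past up to fourteen steps ahead, and the formula in the comment evaluates to the hard-coded `7`.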
@@ -99,25 +99,25 @@ class DiffusionPolicy(PreTrainedPolicy):
        return actions

    @torch.no_grad()
-    def select_action(self, batch: dict[str, Tensor], noise: Tensor | None = None) -> Tensor:
+    def select_action(self, batch: dict[str, Tensor], noise: Tensor | None = None, **kwargs) -> Tensor:
        """Select a single action given environment observations.

        This method handles caching a history of observations and an action trajectory generated by the
        underlying diffusion model. Here's how it works:
          - `n_obs_steps` steps worth of observations are cached (for the first steps, the observation is
            copied `n_obs_steps` times to fill the cache).
-          - The diffusion model generates `horizon` steps worth of actions.
+          - The diffusion model generates `chunk_size` steps worth of actions.
          - `n_action_steps` worth of actions are actually kept for execution, starting from the current step.
        Schematically this looks like:
            ----------------------------------------------------------------------------------------------
-            (legend: o = n_obs_steps, h = horizon, a = n_action_steps)
+            (legend: o = n_obs_steps, c = chunk_size, a = n_action_steps)
-            |timestep            | n-o+1 | n-o+2 | ..... | n     | ..... | n+a-1 | n+a   | ..... | n-o+h |
+            |timestep            | n-o+1 | n-o+2 | ..... | n     | ..... | n+a-1 | n+a   | ..... | n-o+c |
            |observation is used | YES   | YES   | YES   | YES   | NO    | NO    | NO    | NO    | NO    |
            |action is generated | YES   | YES   | YES   | YES   | YES   | YES   | YES   | YES   | YES   |
            |action is used      | NO    | NO    | NO    | YES   | YES   | YES   | NO    | NO    | NO    |
            ----------------------------------------------------------------------------------------------
-        Note that this means we require: `n_action_steps <= horizon - n_obs_steps + 1`. Also, note that
-        "horizon" may not the best name to describe what the variable actually means, because this period is
+        Note that this means we require: `n_action_steps <= chunk_size - n_obs_steps + 1`. Also, note that
+        the `chunk_size` window is
        actually measured from the first observation which (if `n_obs_steps` > 1) happened in the past.
        """
        # NOTE: for offline evaluation, we have action in the batch, so we need to pop it out
@@ -213,7 +213,7 @@ class DiffusionModel(nn.Module):
            noise
            if noise is not None
            else torch.randn(
-                size=(batch_size, self.config.horizon, self.config.action_feature.shape[0]),
+                size=(batch_size, self.config.chunk_size, self.config.action_feature.shape[0]),
                dtype=dtype,
                device=device,
                generator=generator,
@@ -309,16 +309,16 @@ class DiffusionModel(nn.Module):
                AND/OR
            "observation.environment_state": (B, n_obs_steps, environment_dim)

-            "action": (B, horizon, action_dim)
-            "action_is_pad": (B, horizon)
+            "action": (B, chunk_size, action_dim)
+            "action_is_pad": (B, chunk_size)
        }
        """
        # Input validation.
        assert set(batch).issuperset({OBS_STATE, ACTION, "action_is_pad"})
        assert OBS_IMAGES in batch or OBS_ENV_STATE in batch
        n_obs_steps = batch[OBS_STATE].shape[1]
-        horizon = batch[ACTION].shape[1]
-        assert horizon == self.config.horizon
+        chunk_size = batch[ACTION].shape[1]
+        assert chunk_size == self.config.chunk_size
        assert n_obs_steps == self.config.n_obs_steps

        # Encode image features and concatenate them all together along with the state vector.
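The schematic in the `select_action` docstring can be read as a slice over the generated chunk: the chunk starts at the first cached observation, so the current timestep `n` sits at index `n_obs_steps - 1`, and `n_action_steps` actions are kept from there. The following is a minimal sketch of that indexing on plain lists. The helper name `executed_slice` is hypothetical and this is not the policy's actual code.

```python
def executed_slice(n_obs_steps: int, n_action_steps: int, chunk_size: int) -> slice:
    # The docstring's requirement: the executed actions must fit inside the chunk.
    if n_action_steps > chunk_size - n_obs_steps + 1:
        raise ValueError("n_action_steps must be <= chunk_size - n_obs_steps + 1")
    start = n_obs_steps - 1  # index of the current timestep n inside the chunk
    return slice(start, start + n_action_steps)


# Timestep offsets relative to n for n_obs_steps=2, chunk_size=16.
chunk_offsets = list(range(-1, 15))
used_offsets = chunk_offsets[executed_slice(2, 8, 16)]
```

Here `used_offsets` covers the current step through `n + a - 1`, matching the "action is used" row of the table.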
@@ -0,0 +1,242 @@
+#!/usr/bin/env python
+
+# Copyright 2025 The HuggingFace Inc. team.
+# All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from dataclasses import dataclass, field
+
+from lerobot.configs.policies import PreTrainedConfig
+from lerobot.configs.types import NormalizationMode
+from lerobot.optim.optimizers import MultiAdamConfig
+from lerobot.utils.constants import ACTION, OBS_IMAGE, OBS_STATE
+
+
+def is_image_feature(key: str) -> bool:
+    """Check if a feature key represents an image feature.
+
+    Args:
+        key: The feature key to check
+
+    Returns:
+        True if the key represents an image feature, False otherwise
+    """
+    return key.startswith(OBS_IMAGE)
+
+
+@dataclass
+class ConcurrencyConfig:
+    """Configuration for the concurrency of the actor and learner.
+    Possible values are:
+    - "threads": Use threads for the actor and learner.
+    - "processes": Use processes for the actor and learner.
+    """
+
+    actor: str = "threads"
+    learner: str = "threads"
+
+
+@dataclass
+class ActorLearnerConfig:
+    learner_host: str = "127.0.0.1"
+    learner_port: int = 50051
+    policy_parameters_push_frequency: int = 4
+    queue_get_timeout: float = 2
+
+
+@dataclass
+class CriticNetworkConfig:
+    hidden_dims: list[int] = field(default_factory=lambda: [256, 256])
+    activate_final: bool = True
+    final_activation: str | None = None
+
+
+@dataclass
+class ActorNetworkConfig:
+    hidden_dims: list[int] = field(default_factory=lambda: [256, 256])
+    activate_final: bool = True
+    use_layer_norm: bool = True
+
+
+@dataclass
+class NoiseActorConfig:
+    """Configuration for the noise actor in DSRL.
+    The noise actor outputs noise that gets fed to the diffusion policy.
+    """
+
+    use_tanh_squash: bool = False  # Whether to bound the noise output
+    std_min: float = 1e-5
+    std_max: float = 2.0
+    init_final: float = 0.05
+
+
+@PreTrainedConfig.register_subclass("dsrl")
+@dataclass
+class DSRLConfig(PreTrainedConfig):
+    """Diffusion Steering via Reinforcement Learning (DSRL) configuration."""
+
+    # Mapping of feature types to normalization modes
+    normalization_mapping: dict[str, NormalizationMode] = field(
+        default_factory=lambda: {
+            "VISUAL": NormalizationMode.MEAN_STD,
+            "STATE": NormalizationMode.MIN_MAX,
+            "ENV": NormalizationMode.MIN_MAX,
+            "ACTION": NormalizationMode.MIN_MAX,
+        }
+    )
+
+    # Statistics for normalizing different types of inputs
+    dataset_stats: dict[str, dict[str, list[float]]] | None = field(
+        default_factory=lambda: {
+            OBS_IMAGE: {
+                "mean": [0.485, 0.456, 0.406],
+                "std": [0.229, 0.224, 0.225],
+            },
+            OBS_STATE: {
+                "min": [0.0, 0.0],
+                "max": [1.0, 1.0],
+            },
+            ACTION: {
+                "min": [0.0, 0.0, 0.0],
+                "max": [1.0, 1.0, 1.0],
+            },
+        }
+    )
+
+    # Architecture specifics
+    # Device to run the model on (e.g., "cuda", "cpu")
+    device: str = "cpu"
+    # Device to store the model on
+    storage_device: str = "cpu"
+    # Name of the vision encoder model (Set to "helper2424/resnet10" for hil serl resnet10)
+    vision_encoder_name: str | None = None
+    # Whether to freeze the vision encoder during training
+    freeze_vision_encoder: bool = True
+    # Hidden dimension size for the image encoder
+    image_encoder_hidden_dim: int = 32
+    # Whether to use a shared encoder for actor and critic
+    shared_encoder: bool = True
+    # Number of discrete actions, e.g. for gripper actions
+    num_discrete_actions: int | None = None
+    # Dimension of the image embedding pooling
+    image_embedding_pooling_dim: int = 8
+
+    # Name of the action policy
+    action_policy_name: str = "pi0"
+    action_policy_weights: str | None = "lerobot/pi0_base"
+
+    # Training parameters
+    # Number of steps for online training
+    online_steps: int = 1000000
+    # Capacity of the online replay buffer
+    online_buffer_capacity: int = 100000
+    # Capacity of the offline replay buffer
+    offline_buffer_capacity: int = 100000
+    # Whether to use asynchronous prefetching for the buffers
+    async_prefetch: bool = False
+    # Number of steps before learning starts
+    online_step_before_learning: int = 100
+    # Frequency of policy updates
+    policy_update_freq: int = 1
+
+    # SAC algorithm parameters
+    discount: float = 0.99
+    # Initial temperature value
+    temperature_init: float = 1.0
+    # Number of critics in the ensemble
+    num_critics: int = 2
+    # Number of subsampled critics for training
+    num_subsample_critics: int | None = None
+    # Learning rate for the critic network
+    critic_lr: float = 3e-4
+    # Learning rate for the actor network
+    actor_lr: float = 3e-4
+    # Learning rate for the temperature parameter
+    temperature_lr: float = 3e-4
+    # Weight for the critic target update
+    critic_target_update_weight: float = 0.005
+    # Update-to-data (UTD) ratio (to enable UTD, set this to a value > 1)
+    utd_ratio: int = 1
+    # Hidden dimension size for the state encoder
+    state_encoder_hidden_dim: int = 256
+    # Dimension of the latent space
+    latent_dim: int = 256
+    # Target entropy for the SAC algorithm
+    target_entropy: float | None = None
+    # Whether to use backup entropy for the SAC algorithm
+    use_backup_entropy: bool = True
+    # Gradient clipping norm for the SAC algorithm
+    grad_clip_norm: float = 40.0
+
+    # Network configuration
+    # Configuration for the critic network architecture
+    critic_network_kwargs: CriticNetworkConfig = field(default_factory=CriticNetworkConfig)
+    # Configuration for the noise critic network architecture
+    noise_critic_network_kwargs: CriticNetworkConfig = field(default_factory=CriticNetworkConfig)
+    # Configuration for the noise actor network architecture
+    noise_actor_network_kwargs: ActorNetworkConfig = field(default_factory=ActorNetworkConfig)
+    # Configuration for the noise actor specific parameters
+    noise_actor_kwargs: NoiseActorConfig = field(default_factory=NoiseActorConfig)
+    # Configuration for actor-learner architecture
+    actor_learner_config: ActorLearnerConfig = field(default_factory=ActorLearnerConfig)
+    # Configuration for concurrency settings (you can use threads or processes for the actor and learner)
+    concurrency: ConcurrencyConfig = field(default_factory=ConcurrencyConfig)
+
+    # Optimizations
+    use_torch_compile: bool = True
+
+    def __post_init__(self):
+        super().__post_init__()
+
+    def get_optimizer_preset(self) -> MultiAdamConfig:
+        return MultiAdamConfig(
+            weight_decay=0.0,
+            optimizer_groups={
+                "critic_action": {"lr": self.critic_lr},
+                "critic_noise": {"lr": self.critic_lr},
+                "noise_actor": {"lr": self.actor_lr},
+                "temperature": {"lr": self.temperature_lr},
+            },
+        )
+
+    def get_scheduler_preset(self) -> None:
+        return None
+
+    def validate_features(self) -> None:
+        has_image = any(is_image_feature(key) for key in self.input_features)
+        has_state = OBS_STATE in self.input_features
+
+        if not (has_state or has_image):
+            raise ValueError(
+                "You must provide either 'observation.state' or an image observation (key starting with 'observation.image') in the input features"
+            )
+
+        if ACTION not in self.output_features:
+            raise ValueError("You must provide 'action' in the output features")
+
+    @property
+    def image_features(self) -> list[str]:
+        return [key for key in self.input_features if is_image_feature(key)]
+
+    @property
+    def observation_delta_indices(self) -> None:
+        return None
+
+    @property
+    def action_delta_indices(self) -> None:
+        return None
+
+    @property
+    def reward_delta_indices(self) -> None:
+        return None
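`DSRLConfig` declares every list-valued or nested-config field through `field(default_factory=...)`. The sketch below shows what that buys: each config instance gets its own copy of the nested objects, so mutating one does not leak into another. The class names are hypothetical stand-ins for `CriticNetworkConfig` and `DSRLConfig`, trimmed to one field each.

```python
from dataclasses import dataclass, field


@dataclass
class NetworkConfigSketch:  # hypothetical stand-in for CriticNetworkConfig
    hidden_dims: list[int] = field(default_factory=lambda: [256, 256])


@dataclass
class PolicyConfigSketch:  # hypothetical stand-in for DSRLConfig
    critic: NetworkConfigSketch = field(default_factory=NetworkConfigSketch)
    noise_critic: NetworkConfigSketch = field(default_factory=NetworkConfigSketch)


cfg_a = PolicyConfigSketch()
cfg_b = PolicyConfigSketch()

# Changing one nested config leaves its sibling and other instances untouched.
cfg_a.critic.hidden_dims.append(128)
```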
@@ -0,0 +1,89 @@
+#!/usr/bin/env python
+
+# Copyright 2025 The HuggingFace Inc. team.
+# All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""
+Processor for DSRL policy.
+
+DSRL uses a processing pipeline similar to SAC's, since it operates on
+state-action transitions. The main difference is that DSRL also works
+with noise internally, but that is handled within the policy itself.
+"""
+
+from typing import Any
+
+import torch
+
+from lerobot.policies.dsrl.configuration_dsrl import DSRLConfig
+from lerobot.processor import (
+    AddBatchDimensionProcessorStep,
+    DeviceProcessorStep,
+    NormalizerProcessorStep,
+    PolicyAction,
+    PolicyProcessorPipeline,
+    RenameObservationsProcessorStep,
+    UnnormalizerProcessorStep,
+)
+from lerobot.processor.converters import (
+    policy_action_to_transition,
+    transition_to_policy_action,
+)
+from lerobot.utils.constants import POLICY_POSTPROCESSOR_DEFAULT_NAME, POLICY_PREPROCESSOR_DEFAULT_NAME
+
+
+def make_dsrl_pre_post_processors(
+    config: DSRLConfig,
+    dataset_stats: dict[str, dict[str, torch.Tensor]] | None = None,
+) -> tuple[
+    PolicyProcessorPipeline[dict, dict],
+    PolicyProcessorPipeline[PolicyAction, PolicyAction],
+]:
+    """Create preprocessor and postprocessor pipelines for DSRL policy.
+
+    Args:
+        config: DSRL policy configuration
+        dataset_stats: Optional dataset statistics for normalization
+
+    Returns:
+        Tuple of (preprocessor, postprocessor) pipelines
+    """
+    input_steps = [
+        RenameObservationsProcessorStep(rename_map={}),
+        AddBatchDimensionProcessorStep(),
+        DeviceProcessorStep(device=config.device),
+        NormalizerProcessorStep(
+            features={**config.input_features, **config.output_features},
+            norm_map=config.normalization_mapping,
+            stats=dataset_stats,
+        ),
+    ]
+    output_steps = [
+        UnnormalizerProcessorStep(
+            features=config.output_features, norm_map=config.normalization_mapping, stats=dataset_stats
+        ),
+        DeviceProcessorStep(device="cpu"),
+    ]
+    return (
+        PolicyProcessorPipeline[dict[str, Any], dict[str, Any]](
+            steps=input_steps,
+            name=POLICY_PREPROCESSOR_DEFAULT_NAME,
+        ),
+        PolicyProcessorPipeline[PolicyAction, PolicyAction](
+            steps=output_steps,
+            name=POLICY_POSTPROCESSOR_DEFAULT_NAME,
+            to_transition=policy_action_to_transition,
+            to_output=transition_to_policy_action,
+        ),
+    )
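The preprocessor normalizes inputs and the postprocessor undoes that for actions, so the two steps should round-trip. A minimal sketch for the `MIN_MAX` mode follows, assuming it rescales each dimension to `[-1, 1]`. That range, and the function names, are assumptions for illustration and are not confirmed by this diff. The statistics are the placeholder `ACTION` values from `DSRLConfig.dataset_stats`.

```python
def normalize_min_max(values: list[float], lo: list[float], hi: list[float]) -> list[float]:
    # Assumed convention: map [lo, hi] linearly onto [-1, 1], per dimension.
    return [2.0 * (v - l) / (h - l) - 1.0 for v, l, h in zip(values, lo, hi)]


def unnormalize_min_max(values: list[float], lo: list[float], hi: list[float]) -> list[float]:
    # Inverse of normalize_min_max.
    return [(v + 1.0) / 2.0 * (h - l) + l for v, l, h in zip(values, lo, hi)]


action_min = [0.0, 0.0, 0.0]
action_max = [1.0, 1.0, 1.0]

raw_action = [0.0, 0.5, 1.0]
normalized = normalize_min_max(raw_action, action_min, action_max)
restored = unnormalize_min_max(normalized, action_min, action_max)
```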
@@ -30,6 +30,7 @@ from lerobot.envs.configs import EnvConfig
 from lerobot.envs.utils import env_to_policy_features
 from lerobot.policies.act.configuration_act import ACTConfig
 from lerobot.policies.diffusion.configuration_diffusion import DiffusionConfig
+from lerobot.policies.dsrl.configuration_dsrl import DSRLConfig
 from lerobot.policies.groot.configuration_groot import GrootConfig
 from lerobot.policies.pi0.configuration_pi0 import PI0Config
 from lerobot.policies.pi05.configuration_pi05 import PI05Config
@@ -38,6 +39,7 @@ from lerobot.policies.sac.configuration_sac import SACConfig
 from lerobot.policies.sac.reward_model.configuration_classifier import RewardClassifierConfig
 from lerobot.policies.smolvla.configuration_smolvla import SmolVLAConfig
 from lerobot.policies.tdmpc.configuration_tdmpc import TDMPCConfig
+from lerobot.policies.utils import validate_visual_features_consistency
 from lerobot.policies.vqbet.configuration_vqbet import VQBeTConfig
 from lerobot.processor import PolicyAction, PolicyProcessorPipeline
 from lerobot.processor.converters import (
@@ -58,7 +60,7 @@ def get_policy_class(name: str) -> type[PreTrainedPolicy]:

    Args:
        name: The name of the policy. Supported names are "tdmpc", "diffusion", "act",
-              "vqbet", "pi0", "pi05", "sac", "reward_classifier", "smolvla".
+              "vqbet", "pi0", "pi05", "sac", "reward_classifier", "smolvla", "dsrl".

    Returns:
        The policy class corresponding to the given name.
@@ -102,6 +104,10 @@ def get_policy_class(name: str) -> type[PreTrainedPolicy]:
        from lerobot.policies.smolvla.modeling_smolvla import SmolVLAPolicy

        return SmolVLAPolicy
+    elif name == "dsrl":
+        from lerobot.policies.dsrl.modeling_dsrl import DSRLPolicy
+
+        return DSRLPolicy
    elif name == "groot":
        from lerobot.policies.groot.modeling_groot import GrootPolicy

@@ -120,7 +126,7 @@ def make_policy_config(policy_type: str, **kwargs) -> PreTrainedConfig:
    Args:
        policy_type: The type of the policy. Supported types include "tdmpc",
                     "diffusion", "act", "vqbet", "pi0", "pi05", "sac", "smolvla",
-                     "reward_classifier".
+                     "reward_classifier", "dsrl".
        **kwargs: Keyword arguments to be passed to the configuration class constructor.

    Returns:
@@ -147,6 +153,8 @@ def make_policy_config(policy_type: str, **kwargs) -> PreTrainedConfig:
        return SmolVLAConfig(**kwargs)
    elif policy_type == "reward_classifier":
        return RewardClassifierConfig(**kwargs)
+    elif policy_type == "dsrl":
+        return DSRLConfig(**kwargs)
    elif policy_type == "groot":
        return GrootConfig(**kwargs)
    else:
@@ -320,6 +328,13 @@ def make_pre_post_processors(
             config=policy_cfg,
             dataset_stats=kwargs.get("dataset_stats"),
         )
+    elif isinstance(policy_cfg, DSRLConfig):
+        from lerobot.policies.dsrl.processor_dsrl import make_dsrl_pre_post_processors
+
+        processors = make_dsrl_pre_post_processors(
+            config=policy_cfg,
+            dataset_stats=kwargs.get("dataset_stats"),
+        )

     elif isinstance(policy_cfg, GrootConfig):
         from lerobot.policies.groot.processor_groot import make_groot_pre_post_processors
@@ -420,20 +435,7 @@ def make_policy(
    # policy = torch.compile(policy, mode="reduce-overhead")

    if not rename_map:
-        expected_features = set(cfg.input_features.keys()) | set(cfg.output_features.keys())
-        provided_features = set(features.keys())
-        if expected_features and provided_features != expected_features:
-            missing = expected_features - provided_features
-            extra = provided_features - expected_features
-            # TODO (jadechoghari): provide a dynamic rename map suggestion to the user.
-            raise ValueError(
-                f"Feature mismatch between dataset/environment and policy config.\n"
-                f"- Missing features: {sorted(missing) if missing else 'None'}\n"
-                f"- Extra features: {sorted(extra) if extra else 'None'}\n\n"
-                f"Please ensure your dataset and policy use consistent feature names.\n"
-                f"If your dataset uses different observation keys (e.g., cameras named differently), "
-                f"use the `--rename_map` argument, for example:\n"
-                f'  --rename_map=\'{{"observation.images.left": "observation.images.camera1", '
-                f'"observation.images.top": "observation.images.camera2"}}\''
-            )
+        validate_visual_features_consistency(cfg, features)
+        # TODO: (jadechoghari) - add a check_state(cfg, features) and check_action(cfg, features)
+
    return policy
@@ -1148,7 +1148,7 @@ class PI0Policy(PreTrainedPolicy):
        return self._action_queue.popleft()

    @torch.no_grad()
-    def predict_action_chunk(self, batch: dict[str, Tensor]) -> Tensor:
+    def predict_action_chunk(self, batch: dict[str, Tensor], noise: Tensor | None = None) -> Tensor:
        """Predict a chunk of actions given environment observations."""
        self.eval()

@@ -1158,7 +1158,7 @@ class PI0Policy(PreTrainedPolicy):
        state = self.prepare_state(batch)

        # Sample actions using the model
-        actions = self.model.sample_actions(images, img_masks, lang_tokens, lang_masks, state)
+        actions = self.model.sample_actions(images, img_masks, lang_tokens, lang_masks, state, noise)

        # Unpad actions to actual action dimension
        original_action_dim = self.config.output_features[ACTION].shape[0]
@@ -1120,7 +1120,7 @@ class PI05Policy(PreTrainedPolicy):
        return self._action_queue.popleft()

    @torch.no_grad()
-    def predict_action_chunk(self, batch: dict[str, Tensor]) -> Tensor:
+    def predict_action_chunk(self, batch: dict[str, Tensor], noise: Tensor | None = None) -> Tensor:
        """Predict a chunk of actions given environment observations."""
        self.eval()

@@ -1129,7 +1129,7 @@ class PI05Policy(PreTrainedPolicy):
        tokens, masks = batch[f"{OBS_LANGUAGE_TOKENS}"], batch[f"{OBS_LANGUAGE_ATTENTION_MASK}"]

        # Sample actions using the model (no separate state needed for PI05)
-        actions = self.model.sample_actions(images, img_masks, tokens, masks)
+        actions = self.model.sample_actions(images, img_masks, tokens, masks, noise)

        # Unpad actions to actual action dimension
        original_action_dim = self.config.output_features[ACTION].shape[0]
@@ -22,6 +22,8 @@ import numpy as np
 import torch
 from torch import nn

+from lerobot.configs.policies import PreTrainedConfig
+from lerobot.configs.types import FeatureType, PolicyFeature
 from lerobot.datasets.utils import build_dataset_frame
 from lerobot.processor import PolicyAction, RobotAction, RobotObservation
 from lerobot.utils.constants import ACTION, OBS_STR
@@ -198,3 +200,42 @@ def make_robot_action(action_tensor: PolicyAction, ds_features: dict[str, dict])
        f"{name}": float(action_tensor[i]) for i, name in enumerate(action_names)
    }
    return act_processed_policy
+
+
+def raise_feature_mismatch_error(
+    provided_features: set[str],
+    expected_features: set[str],
+) -> None:
+    """
+    Raises a standardized ValueError for feature mismatches between dataset/environment and policy config.
+    """
+    missing = expected_features - provided_features
+    extra = provided_features - expected_features
+    # TODO (jadechoghari): provide a dynamic rename map suggestion to the user.
+    raise ValueError(
+        f"Feature mismatch between dataset/environment and policy config.\n"
+        f"- Missing features: {sorted(missing) if missing else 'None'}\n"
+        f"- Extra features: {sorted(extra) if extra else 'None'}\n\n"
+        f"Please ensure your dataset and policy use consistent feature names.\n"
+        f"If your dataset uses different observation keys (e.g., cameras named differently), "
+        f"use the `--rename_map` argument, for example:\n"
+        f'  --rename_map=\'{{"observation.images.left": "observation.images.camera1", '
+        f'"observation.images.top": "observation.images.camera2"}}\''
+    )
+
+
+def validate_visual_features_consistency(
+    cfg: PreTrainedConfig,
+    features: dict[str, PolicyFeature],
+) -> None:
+    """
+    Validates visual feature consistency between a policy config and provided dataset/environment features.
+
+    Args:
+        cfg (PreTrainedConfig): The model or policy configuration containing input_features and type.
+        features (Dict[str, PolicyFeature]): A mapping of feature names to PolicyFeature objects.
+    """
+    expected_visuals = {k for k, v in cfg.input_features.items() if v.type == FeatureType.VISUAL}
+    provided_visuals = {k for k, v in features.items() if v.type == FeatureType.VISUAL}
+    if not provided_visuals.issubset(expected_visuals):
+        raise_feature_mismatch_error(provided_visuals, expected_visuals)
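The new check is narrower than the one it replaces in `make_policy`: it only looks at visual features, and it only requires the provided cameras to be a subset of the expected ones. The sketch below restates that logic on plain dictionaries to show which cases pass. It uses strings in place of `FeatureType` and `PolicyFeature`, and the helper names are hypothetical.

```python
def visual_keys(features: dict[str, str]) -> set[str]:
    # Keep only the keys whose feature type is visual.
    return {key for key, ftype in features.items() if ftype == "VISUAL"}


def check_visuals(expected: dict[str, str], provided: dict[str, str]) -> None:
    expected_visuals = visual_keys(expected)
    provided_visuals = visual_keys(provided)
    # Same condition as the diff: every provided camera must be expected.
    if not provided_visuals.issubset(expected_visuals):
        missing = sorted(expected_visuals - provided_visuals)
        extra = sorted(provided_visuals - expected_visuals)
        raise ValueError(f"Feature mismatch. Missing: {missing}. Extra: {extra}.")


policy_inputs = {
    "observation.images.camera1": "VISUAL",
    "observation.images.camera2": "VISUAL",
    "observation.state": "STATE",
}

# Fewer cameras than expected passes, and non-visual keys are ignored.
check_visuals(policy_inputs, {"observation.images.camera1": "VISUAL", "extra.state": "STATE"})

# A camera the policy does not know about is rejected.
try:
    check_visuals(policy_inputs, {"observation.images.top": "VISUAL"})
    rejected = False
except ValueError:
    rejected = True
```

State and action keys are not validated by this path, which is what the TODO left in `make_policy` refers to.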
@@ -17,6 +17,7 @@ import importlib
 from dataclasses import dataclass, field

 import gymnasium as gym
+import numpy as np
 import pytest
 import torch
 from gymnasium.envs.registration import register, registry as gym_registry
@@ -26,7 +27,11 @@ import lerobot
 from lerobot.configs.types import PolicyFeature
 from lerobot.envs.configs import EnvConfig
 from lerobot.envs.factory import make_env, make_env_config
-from lerobot.envs.utils import preprocess_observation
+from lerobot.envs.utils import (
+    _normalize_hub_result,
+    _parse_hub_url,
+    preprocess_observation,
+)
 from tests.utils import require_env

 OBS_TYPES = ["state", "pixels", "pixels_agent_pos"]
@@ -108,3 +113,156 @@ def test_factory_custom_gym_id():
    finally:
        if gym_id in gym_registry:
            del gym_registry[gym_id]
+
+
+# Hub environment loading tests
+
+
+def test_make_env_hub_url_parsing():
+    """Test URL parsing for hub environment references."""
+    # simple repo_id
+    repo_id, revision, file_path = _parse_hub_url("user/repo")
+    assert repo_id == "user/repo"
+    assert revision is None
+    assert file_path == "env.py"
+
+    # repo with revision
+    repo_id, revision, file_path = _parse_hub_url("user/repo@main")
+    assert repo_id == "user/repo"
+    assert revision == "main"
+    assert file_path == "env.py"
+
+    # repo with custom file path
+    repo_id, revision, file_path = _parse_hub_url("user/repo:custom_env.py")
+    assert repo_id == "user/repo"
+    assert revision is None
+    assert file_path == "custom_env.py"
+
+    # repo with revision and custom file path
+    repo_id, revision, file_path = _parse_hub_url("user/repo@v1.0:envs/my_env.py")
+    assert repo_id == "user/repo"
+    assert revision == "v1.0"
+    assert file_path == "envs/my_env.py"
+
+    # repo with commit hash
+    repo_id, revision, file_path = _parse_hub_url("org/repo@abc123def456")
+    assert repo_id == "org/repo"
+    assert revision == "abc123def456"
+    assert file_path == "env.py"
+
+
+def test_normalize_hub_result():
+    """Test normalization of different return types from hub make_env."""
+    # test with VectorEnv (most common case)
+    mock_vec_env = gym.vector.SyncVectorEnv([lambda: gym.make("CartPole-v1")])
+    result = _normalize_hub_result(mock_vec_env)
+    assert isinstance(result, dict)
+    assert len(result) == 1
+    suite_name = next(iter(result))
+    assert 0 in result[suite_name]
+    assert isinstance(result[suite_name][0], gym.vector.VectorEnv)
+    mock_vec_env.close()
+
+    # test with single Env
+    mock_env = gym.make("CartPole-v1")
+    result = _normalize_hub_result(mock_env)
+    assert isinstance(result, dict)
+    suite_name = next(iter(result))
+    assert 0 in result[suite_name]
+    assert isinstance(result[suite_name][0], gym.vector.VectorEnv)
+    result[suite_name][0].close()
+
+    # test with dict (already normalized)
+    mock_vec_env = gym.vector.SyncVectorEnv([lambda: gym.make("CartPole-v1")])
+    input_dict = {"my_suite": {0: mock_vec_env}}
+    result = _normalize_hub_result(input_dict)
+    assert result == input_dict
+    assert "my_suite" in result
+    assert 0 in result["my_suite"]
+    mock_vec_env.close()
+
+    # test with invalid type
+    with pytest.raises(ValueError, match="Hub `make_env` must return"):
+        _normalize_hub_result("invalid_type")
+
+
+def test_make_env_from_hub_requires_trust_remote_code():
+    """Test that loading from hub requires explicit trust_remote_code=True."""
+    hub_id = "lerobot/cartpole-env"
+
+    # Should raise RuntimeError when trust_remote_code=False (default)
+    with pytest.raises(RuntimeError, match="Refusing to execute remote code"):
+        make_env(hub_id, trust_remote_code=False)
+
+    # Should also raise when not specified (defaults to False)
+    with pytest.raises(RuntimeError, match="Refusing to execute remote code"):
+        make_env(hub_id)
+
+
+@pytest.mark.parametrize(
+    "hub_id",
+    [
+        "lerobot/cartpole-env",
+        "lerobot/cartpole-env@main",
+        "lerobot/cartpole-env:env.py",
+    ],
+)
+def test_make_env_from_hub_with_trust(hub_id):
+    """Test loading environment from Hugging Face Hub with trust_remote_code=True."""
+    # load environment from hub
+    envs_dict = make_env(hub_id, n_envs=2, trust_remote_code=True)
+
+    # verify structure
+    assert isinstance(envs_dict, dict)
+    assert len(envs_dict) >= 1
+
+    # get the first suite and task
+    suite_name = next(iter(envs_dict))
+    task_id = next(iter(envs_dict[suite_name]))
+    env = envs_dict[suite_name][task_id]
+
+    # verify it's a vector environment
+    assert isinstance(env, gym.vector.VectorEnv)
+    assert env.num_envs == 2
+
+    # test basic environment interaction
+    obs, info = env.reset()
+    assert obs is not None
+    assert isinstance(obs, (dict, np.ndarray))
+
+    # take a random action
+    action = env.action_space.sample()
+    obs, reward, terminated, truncated, info = env.step(action)
+    assert obs is not None
+    assert isinstance(reward, np.ndarray)
+    assert len(reward) == 2
+
+    # clean up
+    env.close()
+
+
+def test_make_env_from_hub_async():
+    """Test loading hub environment with async vector environments."""
+    hub_id = "lerobot/cartpole-env"
+
+    # load with async envs
+    envs_dict = make_env(hub_id, n_envs=2, use_async_envs=True, trust_remote_code=True)
+
+    suite_name = next(iter(envs_dict))
+    task_id = next(iter(envs_dict[suite_name]))
+    env = envs_dict[suite_name][task_id]
+
+    # verify it's an async vector environment
+    assert isinstance(env, gym.vector.AsyncVectorEnv)
+    assert env.num_envs == 2
+
+    # test basic interaction
+    obs, info = env.reset()
+    assert obs is not None
+
+    action = env.action_space.sample()
+    obs, reward, terminated, truncated, info = env.step(action)
+    assert len(reward) == 2
+
+    # clean up
+    env.close()
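The assertions in `test_make_env_hub_url_parsing` imply a reference format of `repo_id[@revision][:file_path]`, with `env.py` as the default file. The implementation of `_parse_hub_url` is not part of this diff, so the following is only a hypothetical parser that satisfies the same cases. The name `parse_hub_ref` and its `default_file` parameter are assumptions.

```python
from typing import Optional


def parse_hub_ref(ref: str, default_file: str = "env.py") -> tuple[str, Optional[str], str]:
    # Split off an optional ":file_path" suffix first, then an optional "@revision".
    repo_part, _, file_path = ref.partition(":")
    repo_id, _, revision = repo_part.partition("@")
    return repo_id, (revision or None), (file_path or default_file)


simple = parse_hub_ref("user/repo")
pinned = parse_hub_ref("user/repo@v1.0:envs/my_env.py")
```

Splitting on `:` before `@` keeps a revision such as `v1.0` intact, but this sketch does not handle file paths that themselves contain `:` or `@`.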
@@ -0,0 +1,157 @@
+#!/usr/bin/env python
+
+# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""
+Visual Feature Consistency Tests
+
+This module tests the `validate_visual_features_consistency` function,
+which ensures that visual features (camera observations) in a dataset/env
+match the expectations defined in a policy configuration.
+
+The purpose of this check is to prevent mismatches between what a policy expects
+(e.g., `observation.images.camera1`, `camera2`, `camera3`) and what a dataset or
+environment actually provides (e.g., `observation.images.top`, `side`, or fewer cameras).
+"""
+
+from pathlib import Path
+
+import numpy as np
+import pytest
+
+from lerobot.configs.default import DatasetConfig
+from lerobot.configs.policies import PreTrainedConfig
+from lerobot.configs.train import TrainPipelineConfig
+from lerobot.datasets.lerobot_dataset import LeRobotDataset
+from lerobot.policies.factory import make_policy_config
+from lerobot.scripts.lerobot_train import train
+from lerobot.utils.utils import auto_select_torch_device
+
+pytest.importorskip("transformers")
+
+DUMMY_REPO_ID = "dummy/repo"
+
+
+@pytest.fixture
+def temp_dir(tmp_path):
+    return tmp_path
+
+
+DUMMY_STATE_DIM = 6
+DUMMY_ACTION_DIM = 6
+IMAGE_SIZE = 8
+DEVICE = auto_select_torch_device()
+
+
+def make_dummy_dataset(camera_keys, tmp_path):
+    """Creates a minimal dummy dataset for testing rename_mapping logic."""
+    features = {
+        "action": {"dtype": "float32", "shape": (DUMMY_ACTION_DIM,), "names": None},
+        "observation.state": {"dtype": "float32", "shape": (DUMMY_STATE_DIM,), "names": None},
+    }
+    for cam in camera_keys:
+        features[f"observation.images.{cam}"] = {
+            "dtype": "image",
+            "shape": (IMAGE_SIZE, IMAGE_SIZE, 3),
+            "names": ["height", "width", "channel"],
+        }
+    root = tmp_path / "_dataset"
+    dataset = LeRobotDataset.create(
+        repo_id=DUMMY_REPO_ID,
+        fps=30,
+        features=features,
+        root=root,
+    )
+    for ep_idx in range(2):
+        for _ in range(3):
+            frame = {
+                "action": np.random.randn(DUMMY_ACTION_DIM).astype(np.float32),
+                "observation.state": np.random.randn(DUMMY_STATE_DIM).astype(np.float32),
+            }
+            for cam in camera_keys:
+                frame[f"observation.images.{cam}"] = np.random.randint(
+                    0, 256, size=(IMAGE_SIZE, IMAGE_SIZE, 3), dtype=np.uint8
+                )
+            frame["task"] = f"task_{ep_idx}"
+            dataset.add_frame(frame)
+        dataset.save_episode()
+
+    dataset.finalize()
+    return dataset, root
+
+
+def custom_validate(train_config: TrainPipelineConfig, policy_path: str, empty_cameras: int):
+    train_config.policy = PreTrainedConfig.from_pretrained(policy_path)
+    train_config.policy.pretrained_path = Path(policy_path)
+    # override empty_cameras and push_to_hub for testing
+    train_config.policy.empty_cameras = empty_cameras
+    train_config.policy.push_to_hub = False
+    if train_config.use_policy_training_preset:
+        train_config.optimizer = train_config.policy.get_optimizer_preset()
+        train_config.scheduler = train_config.policy.get_scheduler_preset()
+    return train_config
+
+
+@pytest.mark.skip(reason="Skipping this test as it results in OOM")
+@pytest.mark.parametrize(
+    "camera_keys, empty_cameras, rename_map, expect_success",
+    [
+        # case 1: dataset has fewer cameras than policy (3 instead of 4), but we specify empty_cameras=1 for smolvla, pi0, pi05
+        (["camera1", "camera2", "camera3"], 1, {}, True),
+        # case 2: dataset has 2 cameras with different names, rename_mapping provided
+        (
+            ["top", "side"],
+            0,
+            {
+                "observation.images.top": "observation.images.camera1",
+                "observation.images.side": "observation.images.camera2",
+            },
+            True,
+        ),
+        # case 3: dataset has 2 cameras, policy expects 3, names do not match, no empty_cameras
+        (["top", "side"], 0, {}, False),
+        # TODO: case 4: dataset has 2 cameras, policy expects 3, no rename_map, no empty_cameras, should raise for smolvla
+        # (["camera1", "camera2"], 0, {}, False),
+    ],
+)
+def test_train_with_camera_mismatch(camera_keys, empty_cameras, rename_map, expect_success, tmp_path):
+    """Tests that training works or fails depending on camera/feature alignment."""
+
+    _dataset, root = make_dummy_dataset(camera_keys, tmp_path)
+    pretrained_path = "lerobot/smolvla_base"
+    dataset_config = DatasetConfig(repo_id=DUMMY_REPO_ID, root=root)
+    policy_config = make_policy_config(
+        "smolvla",
+        optimizer_lr=0.01,
+        push_to_hub=False,
+        pretrained_path=pretrained_path,
+        device=DEVICE,
+    )
+    policy_config.empty_cameras = empty_cameras
+    train_config = TrainPipelineConfig(
+        dataset=dataset_config,
+        policy=policy_config,
+        rename_map=rename_map,
+        output_dir=tmp_path / "_output",
+        steps=1,
+    )
+    train_config = custom_validate(train_config, policy_path=pretrained_path, empty_cameras=empty_cameras)
+    # HACK: disable the internal CLI validation step for tests; custom_validate above already did it
+    train_config.validate = lambda: None
+    if expect_success:
+        train(train_config)
+    else:
+        with pytest.raises(ValueError):
+            train(train_config)
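The parametrized cases above can be read as a set comparison between the camera keys the policy expects and the keys the dataset provides after renaming. The following is a minimal sketch of that idea, not the actual `validate_visual_features_consistency` implementation: `check_visual_features` is a hypothetical helper, and the acceptance rule (one key set must contain the other) is an assumption chosen to agree with cases 2 and 3.

```python
def check_visual_features(expected_keys, provided_keys, rename_map=None):
    """Hypothetical helper: raise ValueError when camera keys cannot be reconciled."""
    rename_map = rename_map or {}
    # Apply the rename map to the dataset/env keys before comparing.
    renamed = {rename_map.get(key, key) for key in provided_keys}
    expected = set(expected_keys)
    # Assumed rule: accept when one set of camera keys contains the other.
    if not (expected <= renamed or renamed <= expected):
        raise ValueError(
            f"Camera mismatch: policy expects {sorted(expected)}, got {sorted(renamed)}"
        )
    return renamed


PREFIX = "observation.images."
policy_cameras = [PREFIX + name for name in ("camera1", "camera2", "camera3")]

# Case 2: differently named cameras, reconciled through a rename map.
renamed = check_visual_features(
    policy_cameras,
    [PREFIX + "top", PREFIX + "side"],
    {PREFIX + "top": PREFIX + "camera1", PREFIX + "side": PREFIX + "camera2"},
)

# Case 3: the same cameras without a rename map, so the check fails.
try:
    check_visual_features(policy_cameras, [PREFIX + "top", PREFIX + "side"])
    case3_raised = False
except ValueError:
    case3_raised = True
```

Under this assumed rule, the commented-out case 4 (`camera1` and `camera2` against three expected cameras) would be accepted rather than raise, so a stricter rule would be needed to cover it.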
Author	SHA1	Message	Date
Michel Aractingi	ca0087d6da	* Change Diffusion policy to use chunk_size notation instead of horizon to standerize the variable names across policies * reshape noise after taking it as output of the network	2025-11-06 12:02:13 +01:00
Michel Aractingi	e3ce2eb743	update factory with dsrl	2025-11-06 12:02:11 +01:00
Michel Aractingi	17f4bc4c56	Add dsrl policy files	2025-11-06 11:57:29 +01:00
Michel Aractingi	f6b16f6d97	fix(dataset_tools) Critical bug in modify features (#2342) * fix bug in `_copy_data_with_feature_changes` * Update src/lerobot/datasets/dataset_tools.py Co-authored-by: Caroline Pascal <caroline8.pascal@gmail.com> Signed-off-by: Michel Aractingi <michel.aractingi@huggingface.co> * add missing import --------- Signed-off-by: Michel Aractingi <michel.aractingi@huggingface.co> Co-authored-by: Caroline Pascal <caroline8.pascal@gmail.com>	2025-11-04 15:56:41 +01:00
Jade Choghari	df0c335a5a	feat(sim): EnvHub - allow loading envs from the hub (#2121) * add env from the hub support * add safe loading * changes * add tests, docs * more * style/cleaning * order --------- Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>	2025-11-04 14:52:46 +01:00
Jade Choghari	87ed3a2b6e	dep(upgrade): add libero as a pypi package (#2365) * add changes * Update pyproject.toml Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Jade Choghari <chogharijade@gmail.com> * add openpi-transformers Signed-off-by: Jade Choghari <chogharijade@gmail.com> * new changes Signed-off-by: Jade Choghari <chogharijade@gmail.com> * Update hf-libero version in pyproject.toml Signed-off-by: Jade Choghari <chogharijade@gmail.com> --------- Signed-off-by: Jade Choghari <chogharijade@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-11-04 10:43:52 +01:00
Jade Choghari	d57d1aa197	fix(make_policy): rename mapping edge cases in training (#2332) * fix bug * update fixes * add hf license * more fixes * add transformers * iterate on review * more fixes * more fixes * add a False test * reduce img size * reduce img size * skip the test * add * add style	2025-10-31 13:08:42 +01:00
Caroline Pascal	3f8c5d9809	fix(video_key typo): fixing video_key typo in update_video_info (#2323)	2025-10-28 09:41:33 +01:00
Steven Palma	d1548e1d13	docs(install): imrpove groot and libero installation instructions (#2314)	2025-10-26 15:37:41 +08:00
Steven Palma	d11ec6b5ef	docs(readme): update installation instructions for 0.4.0 (#2310)	2025-10-24 17:31:37 +02:00
Steven Palma	c75455a6de	chore(dependecies): Bump lerobot to 0.4.1 (#2299) Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>	2025-10-23 20:59:30 +02:00