Frame count is now derived from the upstream .npy length

fix decord
scripts: add Robometer parity checks (upstream example videos + LIBERO)
2026-06-16 15:57:03 +00:00 · 2026-05-18 10:57:16 +02:00 · 2026-05-18 10:39:51 +02:00 · 2026-05-17 15:41:31 +02:00 · 2026-05-17 14:59:23 +02:00 · 2026-05-13 11:09:19 +02:00
83 changed files with 7623 additions and 2865 deletions
@@ -1,288 +0,0 @@
-# Video benchmark
-
-## Questions
-
-What is the optimal trade-off between:
-
- minimizing loading time with random access,
- minimizing memory space on disk,
- maximizing success rate of policies,
- compatibility across devices/platforms for decoding videos (e.g. video players, web browsers).
-
-How to encode videos?
-
- Which video codec (`-vcodec`) to use? h264, h265, AV1?
- What pixel format to use (`-pix_fmt`)? `yuv444p` or `yuv420p`?
- How much compression (`-crf`)? No compression with `0`, intermediate compression with `25` or extreme with `50+`?
- Which frequency to choose for key frames (`-g`)? A key frame every `10` frames?
-
-How to decode videos?
-
- Which `decoder`? `torchvision`, `torchaudio`, `ffmpegio`, `decord`, or `nvc`?
- Which scenarios to use for requesting timestamps during the benchmark? (`timestamps_mode`)
-
-## Variables
-
-**Image content & size**
-We don't expect the same optimal settings for a dataset of images from a simulation, or from real-world in an apartment, or in a factory, or outdoor, or with lots of moving objects in the scene, etc. Similarly, loading times might not vary linearly with the image size (resolution).
-For these reasons, we run this benchmark on four representative datasets:
-
- `lerobot/pusht_image`: (96 x 96 pixels) simulation with simple geometric shapes, fixed camera.
- `lerobot/aloha_mobile_shrimp_image`: (480 x 640 pixels) real-world indoor, moving camera.
- `lerobot/paris_street`: (720 x 1280 pixels) real-world outdoor, moving camera.
- `lerobot/kitchen`: (1080 x 1920 pixels) real-world indoor, fixed camera.
-
-Note: The datasets used for this benchmark need to be image datasets, not video datasets.
-
-**Data augmentations**
-We might revisit this benchmark and find better settings if we train our policies with various data augmentations to make them more robust (e.g. robust to color changes, compression, etc.).
-
-### Encoding parameters
-
-| parameter   | values                                                       |
-| ----------- | ------------------------------------------------------------ |
-| **vcodec**  | `libx264`, `libx265`, `libsvtav1`                            |
-| **pix_fmt** | `yuv444p`, `yuv420p`                                         |
-| **g**       | `1`, `2`, `3`, `4`, `5`, `6`, `10`, `15`, `20`, `40`, `None` |
-| **crf**     | `0`, `5`, `10`, `15`, `20`, `25`, `30`, `40`, `50`, `None`   |
-
-Note that the `crf` value might be interpreted differently by different video codecs. In other words, the same value used with one codec doesn't necessarily translate into the same compression level with another codec. In fact, the default value (`None`) isn't the same across video codecs. Importantly, this is also the case for many other ffmpeg arguments, like `g`, which specifies the frequency of the key frames.
-
-For a comprehensive list and documentation of these parameters, see the ffmpeg documentation depending on the video codec used:
-
- h264: https://trac.ffmpeg.org/wiki/Encode/H.264
- h265: https://trac.ffmpeg.org/wiki/Encode/H.265
- AV1: https://trac.ffmpeg.org/wiki/Encode/AV1
-
-### Decoding parameters
-
-**Decoder**
-We tested two video decoding backends from torchvision:
-
- `pyav`
- `video_reader` (requires building torchvision from source)
-
-**Requested timestamps**
-Given the way video decoding works, once a keyframe has been loaded, the decoding of subsequent frames is fast.
-This of course is affected by the `-g` parameter during encoding, which specifies the frequency of the keyframes. Given our typical use cases in robotics policies which might request a few timestamps in different random places, we want to replicate these use cases with the following scenarios:
-
- `1_frame`: 1 frame,
- `2_frames`: 2 consecutive frames (e.g. `[t, t + 1 / fps]`),
- `6_frames`: 6 consecutive frames (e.g. `[t + i / fps for i in range(6)]`)
-
-Note that this differs significantly from a typical use case like watching a movie, in which every frame is loaded sequentially from the beginning to the end and it's acceptable to have big values for `-g`.
-
-Additionally, because some policies might request single timestamps that are a few frames apart, we also have the following scenario:
-
- `2_frames_4_space`: 2 frames with 4 consecutive frames of spacing in between (e.g. `[t, t + 5 / fps]`).
-
-However, due to how video decoding is implemented with `pyav`, we don't have access to an accurate seek, so in practice this scenario is essentially the same as `6_frames`, since all 6 frames between `t` and `t + 5 / fps` will be decoded.
-
-## Metrics
-
-**Data compression ratio (lower is better)**
-`video_images_size_ratio` is the ratio of the memory space on disk taken by the encoded video over the memory space taken by the original images. For instance, `video_images_size_ratio=25%` means that the video takes 4 times less memory space on disk compared to the original images.
-
-**Loading time ratio (lower is better)**
-`video_images_load_time_ratio` is the ratio of the time it takes to decode frames from the video at the given timestamps over the time it takes to load the exact same original images. For instance, `video_images_load_time_ratio=200%` means that decoding from video is 2 times slower than loading the original images.
-
-**Average Mean Square Error (lower is better)**
-`avg_mse` is the average mean square error between each decoded frame and its corresponding original image over all requested timestamps, and also divided by the number of pixels in the image to be comparable when switching to different image sizes.
-
-**Average Peak Signal to Noise Ratio (higher is better)**
-`avg_psnr` measures the ratio between the maximum possible power of a signal and the power of corrupting noise that affects the fidelity of its representation. Higher PSNR indicates better quality.
-
-**Average Structural Similarity Index Measure (higher is better)**
-`avg_ssim` evaluates the perceived quality of images by comparing luminance, contrast, and structure. SSIM values range from -1 to 1, where 1 indicates perfect similarity.
-
-One aspect that can't be measured with these metrics is the compatibility of the encoding across platforms, in particular in web browsers, for visualization purposes.
-h264, h265 and AV1 are all commonly used codecs and should not pose an issue. However, the chroma subsampling (`pix_fmt`) format might affect compatibility:
-
- `yuv420p` is more widely supported across various platforms, including web browsers.
- `yuv444p` offers higher color fidelity but might not be supported as broadly.
-
-<!-- **Loss of a pretrained policy (higher is better)** (not available)
-`loss_pretrained` is the result of evaluating with the selected encoding/decoding settings a policy pretrained on original images. It is easier to understand than `avg_l2_error`.
-
-**Success rate after retraining (higher is better)** (not available)
-`success_rate` is the result of training and evaluating a policy with the selected encoding/decoding settings. It is the most difficult metric to get but also the very best. -->
-
-## How the benchmark works
-
-The benchmark evaluates both encoding and decoding of video frames on the first episode of each dataset.
-
-**Encoding:** for each `vcodec` and `pix_fmt` pair, we use a default value for `g` and `crf` upon which we change a single value (either `g` or `crf`) to one of the specified values (we don't test every combination of those as this would be computationally too heavy).
-This gives a unique set of encoding parameters which is used to encode the episode.
-
-**Decoding:** Then, for each of those unique encodings, we iterate through every combination of the decoding parameters `backend` and `timestamps_mode`. For each of them, we record the metrics of a number of samples (given by `--num-samples`). This is parallelized for efficiency and the number of processes can be controlled with `--num-workers`. Ideally, it's best to have a `--num-samples` that is divisible by `--num-workers`.
-
-Intermediate results are saved for each `vcodec` and `pix_fmt` combination in CSV tables.
-These are then all concatenated to a single table ready for analysis.
-
-## Caveats
-
-We tried to measure the most impactful parameters for both encoding and decoding. However, for computational reasons we can't test out every combination.
-
-Additional encoding parameters exist that are not included in this benchmark. In particular:
-
- `-preset` which allows for selecting encoding presets. This represents a collection of options that will provide a certain encoding speed to compression ratio. By leaving this parameter unspecified, it is considered to be `medium` for libx264 and libx265 and `8` for libsvtav1.
- `-tune` which allows optimizing the encoding for certain aspects (e.g. film quality, fast decoding, etc.).
-
-See the documentation mentioned above for more detailed info on these settings and for a more comprehensive list of other parameters.
-
-Similarly on the decoding side, other decoders exist but are not implemented in our current benchmark. To name a few:
-
- `torchaudio`
- `ffmpegio`
- `decord`
- `nvc`
-
-Note as well that since we are mostly interested in the performance at decoding time (also because encoding is done only once before uploading a dataset), we did not measure encoding times nor have any metrics regarding encoding.
-However, besides the necessity to build ffmpeg from source, encoding did not pose any issue and it didn't take a significant amount of time during this benchmark.
-
-## Install
-
-Building ffmpeg from source is required to include libx265 and libaom/libsvtav1 (av1) video codecs ([compilation guide](https://trac.ffmpeg.org/wiki/CompilationGuide/Ubuntu)).
-
-**Note:** While you still need to build torchvision with a conda-installed `ffmpeg<4.3` to use the `video_reader` decoder (as described in [#220](https://github.com/huggingface/lerobot/pull/220)), you also need another version which is custom-built with all the video codecs for encoding. For the script to then use that version, you can prepend the benchmark command with `PATH="$HOME/bin:$PATH"`, which is where ffmpeg should be built.
-
-## Adding a video decoder
-
-Right now, we're only benchmarking the two video decoders available with torchvision: `pyav` and `video_reader`.
-You can easily add a new decoder to benchmark by adding it to this function in the script:
-
-```diff
-def decode_video_frames(
-    video_path: str,
-    timestamps: list[float],
-    tolerance_s: float,
-    backend: str,
-) -> torch.Tensor:
-    if backend in ["pyav", "video_reader"]:
-        return decode_video_frames_torchvision(
-            video_path, timestamps, tolerance_s, backend
-        )
-+    elif backend == "your_decoder":
-+        return your_decoder_function(
-+            video_path, timestamps, tolerance_s, backend
-+        )
-    else:
-        raise NotImplementedError(backend)
-```
-
-## Example
-
-For a quick run, you can try these parameters:
-
-```bash
-python benchmark/video/run_video_benchmark.py \
-    --output-dir outputs/video_benchmark \
-    --repo-ids \
-        lerobot/pusht_image \
-        lerobot/aloha_mobile_shrimp_image \
-    --vcodec libx264 libx265 \
-    --pix-fmt yuv444p yuv420p \
-    --g 2 20 None \
-    --crf 10 40 None \
-    --timestamps-modes 1_frame 2_frames \
-    --backends pyav video_reader \
-    --num-samples 5 \
-    --num-workers 5 \
-    --save-frames 0
-```
-
-## Results
-
-### Reproduce
-
-We ran the benchmark with the following parameters:
-
-```bash
-# h264 and h265 encodings
-python benchmark/video/run_video_benchmark.py \
-    --output-dir outputs/video_benchmark \
-    --repo-ids \
-        lerobot/pusht_image \
-        lerobot/aloha_mobile_shrimp_image \
-        lerobot/paris_street \
-        lerobot/kitchen \
-    --vcodec libx264 libx265 \
-    --pix-fmt yuv444p yuv420p \
-    --g 1 2 3 4 5 6 10 15 20 40 None \
-    --crf 0 5 10 15 20 25 30 40 50 None \
-    --timestamps-modes 1_frame 2_frames 6_frames \
-    --backends pyav video_reader \
-    --num-samples 50 \
-    --num-workers 5 \
-    --save-frames 1
-
-# av1 encoding (only compatible with yuv420p and pyav decoder)
-python benchmark/video/run_video_benchmark.py \
-    --output-dir outputs/video_benchmark \
-    --repo-ids \
-        lerobot/pusht_image \
-        lerobot/aloha_mobile_shrimp_image \
-        lerobot/paris_street \
-        lerobot/kitchen \
-    --vcodec libsvtav1 \
-    --pix-fmt yuv420p \
-    --g 1 2 3 4 5 6 10 15 20 40 None \
-    --crf 0 5 10 15 20 25 30 40 50 None \
-    --timestamps-modes 1_frame 2_frames 6_frames \
-    --backends pyav \
-    --num-samples 50 \
-    --num-workers 5 \
-    --save-frames 1
-```
-
-The full results are available [here](https://docs.google.com/spreadsheets/d/1OYJB43Qu8fC26k_OyoMFgGBBKfQRCi4BIuYitQnq3sw/edit?usp=sharing)
-
-### Parameters selected for LeRobotDataset
-
-Considering these results, we chose what we think is the best set of encoding parameters:
-
- vcodec: `libsvtav1`
- pix-fmt: `yuv420p`
- g: `2`
- crf: `30`
-
-Since we're using av1 encoding, we're choosing the `pyav` decoder as `video_reader` does not support it (and `pyav` doesn't require a custom build of `torchvision`).
-
-### Summary
-
-These tables show the results for `g=2` and `crf=30`, using `timestamps-modes=6_frames` and `backend=pyav`
-
-| video_images_size_ratio           | vcodec     | pix_fmt |           |           |           |
-| --------------------------------- | ---------- | ------- | --------- | --------- | --------- |
-|                                   | libx264    |         | libx265   |           | libsvtav1 |
-| repo_id                           | yuv420p    | yuv444p | yuv420p   | yuv444p   | yuv420p   |
-| lerobot/pusht_image               | **16.97%** | 17.58%  | 18.57%    | 18.86%    | 22.06%    |
-| lerobot/aloha_mobile_shrimp_image | 2.14%      | 2.11%   | 1.38%     | **1.37%** | 5.59%     |
-| lerobot/paris_street              | 2.12%      | 2.13%   | **1.54%** | **1.54%** | 4.43%     |
-| lerobot/kitchen                   | 1.40%      | 1.39%   | **1.00%** | **1.00%** | 2.52%     |
-
-| video_images_load_time_ratio      | vcodec  | pix_fmt |          |         |           |
-| --------------------------------- | ------- | ------- | -------- | ------- | --------- |
-|                                   | libx264 |         | libx265  |         | libsvtav1 |
-| repo_id                           | yuv420p | yuv444p | yuv420p  | yuv444p | yuv420p   |
-| lerobot/pusht_image               | 6.45    | 5.19    | **1.90** | 2.12    | 2.47      |
-| lerobot/aloha_mobile_shrimp_image | 11.80   | 7.92    | 0.71     | 0.85    | **0.48**  |
-| lerobot/paris_street              | 2.21    | 2.05    | 0.36     | 0.49    | **0.30**  |
-| lerobot/kitchen                   | 1.46    | 1.46    | 0.28     | 0.51    | **0.26**  |
-
-|                                   |          | vcodec   | pix_fmt      |          |           |              |
-| --------------------------------- | -------- | -------- | ------------ | -------- | --------- | ------------ |
-|                                   |          | libx264  |              | libx265  |           | libsvtav1    |
-| repo_id                           | metric   | yuv420p  | yuv444p      | yuv420p  | yuv444p   | yuv420p      |
-| lerobot/pusht_image               | avg_mse  | 2.90E-04 | **2.03E-04** | 3.13E-04 | 2.29E-04  | 2.19E-04     |
-|                                   | avg_psnr | 35.44    | 37.07        | 35.49    | **37.30** | 37.20        |
-|                                   | avg_ssim | 98.28%   | **98.85%**   | 98.31%   | 98.84%    | 98.72%       |
-| lerobot/aloha_mobile_shrimp_image | avg_mse  | 2.76E-04 | 2.59E-04     | 3.17E-04 | 3.06E-04  | **1.30E-04** |
-|                                   | avg_psnr | 35.91    | 36.21        | 35.88    | 36.09     | **40.17**    |
-|                                   | avg_ssim | 95.19%   | 95.18%       | 95.00%   | 95.05%    | **97.73%**   |
-| lerobot/paris_street              | avg_mse  | 6.89E-04 | 6.70E-04     | 4.03E-03 | 4.02E-03  | **3.09E-04** |
-|                                   | avg_psnr | 33.48    | 33.68        | 32.05    | 32.15     | **35.40**    |
-|                                   | avg_ssim | 93.76%   | 93.75%       | 89.46%   | 89.46%    | **95.46%**   |
-| lerobot/kitchen                   | avg_mse  | 2.50E-04 | 2.24E-04     | 4.28E-04 | 4.18E-04  | **1.53E-04** |
-|                                   | avg_psnr | 36.73    | 37.33        | 36.56    | 36.75     | **39.12**    |
-|                                   | avg_ssim | 95.47%   | 95.58%       | 95.52%   | 95.53%    | **96.82%**   |
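For readers comparing against the quality tables in the removed README, here is a minimal sketch of how per-frame MSE and PSNR relate to each other. It assumes frames are float arrays scaled to `[0, 1]`, as in the removed script, which itself calls `skimage.metrics`. The helper names `frame_mse` and `frame_psnr` are hypothetical and exist only for illustration.

```python
import numpy as np


def frame_mse(original: np.ndarray, decoded: np.ndarray) -> float:
    """Mean squared error averaged over every element (channels and pixels)."""
    diff = original.astype(np.float64) - decoded.astype(np.float64)
    return float(np.mean(diff**2))


def frame_psnr(original: np.ndarray, decoded: np.ndarray, data_range: float = 1.0) -> float:
    """PSNR in dB: 10 * log10(data_range^2 / MSE); infinite for identical frames."""
    err = frame_mse(original, decoded)
    if err == 0.0:
        return float("inf")
    return float(10.0 * np.log10(data_range**2 / err))


# Synthetic (c, h, w) frames with a constant error of 0.1 per element.
original = np.full((3, 96, 96), 0.5)
decoded = np.full((3, 96, 96), 0.6)

mse_value = frame_mse(original, decoded)
psnr_value = frame_psnr(original, decoded)
print(f"mse={mse_value:.4f} psnr={psnr_value:.2f} dB")
```

Because the error is averaged over all elements, the value stays comparable across image resolutions, which is what the `avg_mse` description in the README asks for.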
@@ -1,488 +0,0 @@
-#!/usr/bin/env python
-
-# Copyright 2024 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-"""Assess the performance of video decoding in various configurations.
-
-This script will benchmark different video encoding and decoding parameters.
-See the provided README.md or run `python benchmark/video/run_video_benchmark.py --help` for usage info.
-"""
-
-import argparse
-import datetime as dt
-import itertools
-import random
-import shutil
-from collections import OrderedDict
-from concurrent.futures import ThreadPoolExecutor, as_completed
-from pathlib import Path
-from threading import Lock
-
-import einops
-import numpy as np
-import pandas as pd
-import PIL
-import torch
-from skimage.metrics import mean_squared_error, peak_signal_noise_ratio, structural_similarity
-from tqdm import tqdm
-
-from lerobot.datasets.lerobot_dataset import LeRobotDataset
-from lerobot.datasets.video_utils import (
-    decode_video_frames,
-    encode_video_frames,
-)
-from lerobot.utils.constants import OBS_IMAGE
-from lerobot.utils.utils import TimerManager
-
-BASE_ENCODING = OrderedDict(
-    [
-        ("vcodec", "libx264"),
-        ("pix_fmt", "yuv444p"),
-        ("g", 2),
-        ("crf", None),
-        # TODO(aliberts): Add fastdecode
-        # ("fastdecode", 0),
-    ]
-)
-
-
-# TODO(rcadene, aliberts): move to `utils.py` folder when we want to refactor
-def parse_int_or_none(value) -> int | None:
-    if value.lower() == "none":
-        return None
-    try:
-        return int(value)
-    except ValueError as e:
-        raise argparse.ArgumentTypeError(f"Invalid int or None: {value}") from e
-
-
-def check_datasets_formats(repo_ids: list) -> None:
-    for repo_id in repo_ids:
-        dataset = LeRobotDataset(repo_id)
-        if len(dataset.meta.video_keys) > 0:
-            raise ValueError(
-                f"Use only image dataset for running this benchmark. Video dataset provided: {repo_id}"
-            )
-
-
-def get_directory_size(directory: Path) -> int:
-    total_size = 0
-    for item in directory.rglob("*"):
-        if item.is_file():
-            total_size += item.stat().st_size
-    return total_size
-
-
-def load_original_frames(imgs_dir: Path, timestamps: list[float], fps: int) -> torch.Tensor:
-    frames = []
-    for ts in timestamps:
-        idx = round(ts * fps)  # round, not int(): 0.58 * 50 evaluates to 28.999999999999996
-        frame = PIL.Image.open(imgs_dir / f"frame-{idx:06d}.png")
-        frame = torch.from_numpy(np.array(frame))
-        frame = frame.type(torch.float32) / 255
-        frame = einops.rearrange(frame, "h w c -> c h w")
-        frames.append(frame)
-    return torch.stack(frames)
-
-
-def save_decoded_frames(
-    imgs_dir: Path, save_dir: Path, frames: torch.Tensor, timestamps: list[float], fps: int
-) -> None:
-    if save_dir.exists() and len(list(save_dir.glob("frame-*.png"))) == len(timestamps):
-        return
-
-    save_dir.mkdir(parents=True, exist_ok=True)
-    for i, ts in enumerate(timestamps):
-        idx = round(ts * fps)  # round, not int(): 0.58 * 50 evaluates to 28.999999999999996
-        frame_hwc = (frames[i].permute((1, 2, 0)) * 255).type(torch.uint8).cpu().numpy()
-        PIL.Image.fromarray(frame_hwc).save(save_dir / f"frame-{idx:06d}_decoded.png")
-        shutil.copyfile(imgs_dir / f"frame-{idx:06d}.png", save_dir / f"frame-{idx:06d}_original.png")
-
-
-def save_first_episode(imgs_dir: Path, dataset: LeRobotDataset) -> None:
-    episode_index = 0
-    ep_num_images = dataset.meta.episodes["length"][episode_index]
-    if imgs_dir.exists() and len(list(imgs_dir.glob("frame-*.png"))) == ep_num_images:
-        return
-
-    imgs_dir.mkdir(parents=True, exist_ok=True)
-    hf_dataset = dataset.hf_dataset.with_format(None)
-
-    # We only save images from the first camera
-    img_keys = [key for key in hf_dataset.features if key.startswith(OBS_IMAGE)]
-    imgs_dataset = hf_dataset.select_columns(img_keys[0])
-
-    for i, item in enumerate(
-        tqdm(imgs_dataset, desc=f"saving {dataset.repo_id} first episode images", leave=False)
-    ):
-        img = item[img_keys[0]]
-        img.save(str(imgs_dir / f"frame-{i:06d}.png"), quality=100)
-
-        if i >= ep_num_images - 1:
-            break
-
-
-def sample_timestamps(timestamps_mode: str, ep_num_images: int, fps: int) -> list[float]:
-    # Start at 5 to allow for 2_frames_4_space and 6_frames
-    idx = random.randint(5, ep_num_images - 1)
-    match timestamps_mode:
-        case "1_frame":
-            frame_indexes = [idx]
-        case "2_frames":
-            frame_indexes = [idx - 1, idx]
-        case "2_frames_4_space":
-            frame_indexes = [idx - 5, idx]
-        case "6_frames":
-            frame_indexes = [idx - i for i in range(6)][::-1]
-        case _:
-            raise ValueError(timestamps_mode)
-
-    return [idx / fps for idx in frame_indexes]
-
-
-def benchmark_decoding(
-    imgs_dir: Path,
-    video_path: Path,
-    timestamps_mode: str,
-    backend: str,
-    ep_num_images: int,
-    fps: int,
-    num_samples: int = 50,
-    num_workers: int = 4,
-    save_frames: bool = False,
-) -> dict:
-    def process_sample(sample: int, lock: Lock):
-        time_benchmark = TimerManager(log=False)
-        timestamps = sample_timestamps(timestamps_mode, ep_num_images, fps)
-        num_frames = len(timestamps)
-        result = {
-            "psnr_values": [],
-            "ssim_values": [],
-            "mse_values": [],
-        }
-
-        with lock, time_benchmark:  # acquire the lock first so lock wait time is not timed
-            frames = decode_video_frames(video_path, timestamps=timestamps, tolerance_s=5e-1, backend=backend)
-        result["load_time_video_ms"] = (time_benchmark.last * 1000) / num_frames
-
-        with time_benchmark:
-            original_frames = load_original_frames(imgs_dir, timestamps, fps)
-        result["load_time_images_ms"] = (time_benchmark.last * 1000) / num_frames
-
-        frames_np, original_frames_np = frames.numpy(), original_frames.numpy()
-        for i in range(num_frames):
-            result["mse_values"].append(mean_squared_error(original_frames_np[i], frames_np[i]))
-            result["psnr_values"].append(
-                peak_signal_noise_ratio(original_frames_np[i], frames_np[i], data_range=1.0)
-            )
-            result["ssim_values"].append(
-                structural_similarity(original_frames_np[i], frames_np[i], data_range=1.0, channel_axis=0)
-            )
-
-        if save_frames and sample == 0:
-            save_dir = video_path.with_suffix("") / f"{timestamps_mode}_{backend}"
-            save_decoded_frames(imgs_dir, save_dir, frames, timestamps, fps)
-
-        return result
-
-    load_times_video_ms = []
-    load_times_images_ms = []
-    mse_values = []
-    psnr_values = []
-    ssim_values = []
-
-    # A sample is a single set of decoded frames specified by timestamps_mode (e.g. a single frame, 2 frames, etc.).
-    # For each sample, we record metrics (loading time and quality metrics) which are then averaged over all samples.
-    # As these samples are independent, we run them in parallel threads to speed up the benchmark.
-    # Use a single shared lock for all worker threads
-    shared_lock = Lock()
-    with ThreadPoolExecutor(max_workers=num_workers) as executor:
-        futures = [executor.submit(process_sample, i, shared_lock) for i in range(num_samples)]
-        for future in tqdm(as_completed(futures), total=num_samples, desc="samples", leave=False):
-            result = future.result()
-            load_times_video_ms.append(result["load_time_video_ms"])
-            load_times_images_ms.append(result["load_time_images_ms"])
-            psnr_values.extend(result["psnr_values"])
-            ssim_values.extend(result["ssim_values"])
-            mse_values.extend(result["mse_values"])
-
-    avg_load_time_video_ms = float(np.array(load_times_video_ms).mean())
-    avg_load_time_images_ms = float(np.array(load_times_images_ms).mean())
-    video_images_load_time_ratio = avg_load_time_video_ms / avg_load_time_images_ms
-
-    return {
-        "avg_load_time_video_ms": avg_load_time_video_ms,
-        "avg_load_time_images_ms": avg_load_time_images_ms,
-        "video_images_load_time_ratio": video_images_load_time_ratio,
-        "avg_mse": float(np.mean(mse_values)),
-        "avg_psnr": float(np.mean(psnr_values)),
-        "avg_ssim": float(np.mean(ssim_values)),
-    }
-
-
-def benchmark_encoding_decoding(
-    dataset: LeRobotDataset,
-    video_path: Path,
-    imgs_dir: Path,
-    encoding_cfg: dict,
-    decoding_cfg: dict,
-    num_samples: int,
-    num_workers: int,
-    save_frames: bool,
-    overwrite: bool = False,
-    seed: int = 1337,
-) -> list[dict]:
-    fps = dataset.fps
-
-    if overwrite or not video_path.is_file():
-        tqdm.write(f"encoding {video_path}")
-        encode_video_frames(
-            imgs_dir=imgs_dir,
-            video_path=video_path,
-            fps=fps,
-            vcodec=encoding_cfg["vcodec"],
-            pix_fmt=encoding_cfg["pix_fmt"],
-            g=encoding_cfg.get("g"),
-            crf=encoding_cfg.get("crf"),
-            # fast_decode=encoding_cfg.get("fastdecode"),
-            overwrite=True,
-        )
-
-    episode_index = 0
-    ep_num_images = dataset.meta.episodes["length"][episode_index]
-    width, height = tuple(dataset[0][dataset.meta.camera_keys[0]].shape[-2:])
-    num_pixels = width * height
-    video_size_bytes = video_path.stat().st_size
-    images_size_bytes = get_directory_size(imgs_dir)
-    video_images_size_ratio = video_size_bytes / images_size_bytes
-
-    random.seed(seed)
-    benchmark_table = []
-    for timestamps_mode in tqdm(
-        decoding_cfg["timestamps_modes"], desc="decodings (timestamps_modes)", leave=False
-    ):
-        for backend in tqdm(decoding_cfg["backends"], desc="decodings (backends)", leave=False):
-            benchmark_row = benchmark_decoding(
-                imgs_dir,
-                video_path,
-                timestamps_mode,
-                backend,
-                ep_num_images,
-                fps,
-                num_samples,
-                num_workers,
-                save_frames,
-            )
-            benchmark_row.update(
-                **{
-                    "repo_id": dataset.repo_id,
-                    "resolution": f"{width} x {height}",
-                    "num_pixels": num_pixels,
-                    "video_size_bytes": video_size_bytes,
-                    "images_size_bytes": images_size_bytes,
-                    "video_images_size_ratio": video_images_size_ratio,
-                    "timestamps_mode": timestamps_mode,
-                    "backend": backend,
-                },
-                **encoding_cfg,
-            )
-            benchmark_table.append(benchmark_row)
-
-    return benchmark_table
-
-
-def main(
-    output_dir: Path,
-    repo_ids: list[str],
-    vcodec: list[str],
-    pix_fmt: list[str],
-    g: list[int],
-    crf: list[int],
-    # fastdecode: list[int],
-    timestamps_modes: list[str],
-    backends: list[str],
-    num_samples: int,
-    num_workers: int,
-    save_frames: bool,
-):
-    check_datasets_formats(repo_ids)
-    encoding_benchmarks = {
-        "g": g,
-        "crf": crf,
-        # "fastdecode": fastdecode,
-    }
-    decoding_benchmarks = {
-        "timestamps_modes": timestamps_modes,
-        "backends": backends,
-    }
-    headers = ["repo_id", "resolution", "num_pixels"]
-    headers += list(BASE_ENCODING.keys())
-    headers += [
-        "timestamps_mode",
-        "backend",
-        "video_size_bytes",
-        "images_size_bytes",
-        "video_images_size_ratio",
-        "avg_load_time_video_ms",
-        "avg_load_time_images_ms",
-        "video_images_load_time_ratio",
-        "avg_mse",
-        "avg_psnr",
-        "avg_ssim",
-    ]
-    file_paths = []
-    for video_codec in tqdm(vcodec, desc="encodings (vcodec)"):
-        for pixel_format in tqdm(pix_fmt, desc="encodings (pix_fmt)", leave=False):
-            benchmark_table = []
-            for repo_id in tqdm(repo_ids, desc="encodings (datasets)", leave=False):
-                dataset = LeRobotDataset(repo_id)
-                imgs_dir = output_dir / "images" / dataset.repo_id.replace("/", "_")
-                # We only use the first episode
-                save_first_episode(imgs_dir, dataset)
-                for duet in [
-                    dict(zip(encoding_benchmarks.keys(), unique_combination, strict=False))
-                    for unique_combination in itertools.product(*encoding_benchmarks.values())
-                ]:
-                    encoding_cfg = BASE_ENCODING.copy()
-                    encoding_cfg["vcodec"] = video_codec
-                    encoding_cfg["pix_fmt"] = pixel_format
-                    for key, value in duet.items():
-                        encoding_cfg[key] = value
-                    args_path = Path("_".join(str(value) for value in encoding_cfg.values()))
-                    video_path = output_dir / "videos" / args_path / f"{repo_id.replace('/', '_')}.mp4"
-                    benchmark_table += benchmark_encoding_decoding(
-                        dataset,
-                        video_path,
-                        imgs_dir,
-                        encoding_cfg,
-                        decoding_benchmarks,
-                        num_samples,
-                        num_workers,
-                        save_frames,
-                    )
-
-            # Save intermediate results
-            benchmark_df = pd.DataFrame(benchmark_table, columns=headers)
-            now = dt.datetime.now()
-            csv_path = (
-                output_dir
-                / f"{now:%Y-%m-%d}_{now:%H-%M-%S}_{video_codec}_{pixel_format}_{num_samples}-samples.csv"
-            )
-            benchmark_df.to_csv(csv_path, header=True, index=False)
-            file_paths.append(csv_path)
-            del benchmark_df
-
-    # Concatenate all results
-    df_list = [pd.read_csv(csv_path) for csv_path in file_paths]
-    concatenated_df = pd.concat(df_list, ignore_index=True)
-    concatenated_path = output_dir / f"{now:%Y-%m-%d}_{now:%H-%M-%S}_all_{num_samples}-samples.csv"
-    concatenated_df.to_csv(concatenated_path, header=True, index=False)
-
-
-if __name__ == "__main__":
-    parser = argparse.ArgumentParser()
-    parser.add_argument(
-        "--output-dir",
-        type=Path,
-        default=Path("outputs/video_benchmark"),
-        help="Directory where the video benchmark outputs are written.",
-    )
-    parser.add_argument(
-        "--repo-ids",
-        type=str,
-        nargs="*",
-        default=[
-            "lerobot/pusht_image",
-            "lerobot/aloha_mobile_shrimp_image",
-            "lerobot/paris_street",
-            "lerobot/kitchen",
-        ],
-        help="Datasets repo-ids to test against. First episodes only are used. Must be images.",
-    )
-    parser.add_argument(
-        "--vcodec",
-        type=str,
-        nargs="*",
-        default=["h264", "hevc", "libsvtav1"],
-        help="Video codecs to be tested",
-    )
-    parser.add_argument(
-        "--pix-fmt",
-        type=str,
-        nargs="*",
-        default=["yuv444p", "yuv420p"],
-        help="Pixel formats (chroma subsampling) to be tested",
-    )
-    parser.add_argument(
-        "--g",
-        type=parse_int_or_none,
-        nargs="*",
-        default=[1, 2, 3, 4, 5, 6, 10, 15, 20, 40, 100, None],
-        help="Group of pictures sizes to be tested.",
-    )
-    parser.add_argument(
-        "--crf",
-        type=parse_int_or_none,
-        nargs="*",
-        default=[0, 5, 10, 15, 20, 25, 30, 40, 50, None],
-        help="Constant rate factors to be tested.",
-    )
-    # parser.add_argument(
-    #     "--fastdecode",
-    #     type=int,
-    #     nargs="*",
-    #     default=[0, 1],
-    #     help="Use the fastdecode tuning option. 0 disables it. "
-    #         "For libx264 and libx265/hevc, only 1 is possible. "
-    #         "For libsvtav1, 1, 2 or 3 are possible values with a higher number meaning a faster decoding optimization",
-    # )
-    parser.add_argument(
-        "--timestamps-modes",
-        type=str,
-        nargs="*",
-        default=[
-            "1_frame",
-            "2_frames",
-            "2_frames_4_space",
-            "6_frames",
-        ],
-        help="Timestamps scenarios to be tested.",
-    )
-    parser.add_argument(
-        "--backends",
-        type=str,
-        nargs="*",
-        default=["torchcodec", "pyav"],
-        help="Torchvision decoding backend to be tested.",
-    )
-    parser.add_argument(
-        "--num-samples",
-        type=int,
-        default=50,
-        help="Number of samples for each encoding x decoding config.",
-    )
-    parser.add_argument(
-        "--num-workers",
-        type=int,
-        default=10,
-        help="Number of processes for parallelized sample processing.",
-    )
-    parser.add_argument(
-        "--save-frames",
-        type=int,
-        default=0,
-        help="Whether to save decoded frames or not. Enter a non-zero number for true.",
-    )
-    args = parser.parse_args()
-    main(**vars(args))
@@ -62,7 +62,7 @@ pip install -e ".[hilserl]"

 ### Understanding Configuration

-The training process begins with proper configuration for the HILSerl environment. The main configuration class is `GymManipulatorConfig` in `lerobot/rl/gym_manipulator.py`, which contains nested `HILSerlRobotEnvConfig` and `DatasetConfig`. The configuration is organized into focused, nested sub-configs:
+The training process begins with proper configuration for the HILSerl environment. The main configuration class is `GymManipulatorConfig` in `lerobot/rl/gym_manipulator.py`, which contains nested `HILSerlRobotEnvConfig` (defined in `lerobot/envs/configs.py`) and `DatasetConfig`. The configuration is organized into focused, nested sub-configs:

 <!-- prettier-ignore-start -->
 ```python
@@ -95,6 +95,7 @@ class HILSerlProcessorConfig:
 class ObservationConfig:
    add_joint_velocity_to_observation: bool = False    # Add joint velocities to state
    add_current_to_observation: bool = False    # Add motor currents to state
+    add_ee_pose_to_observation: bool = False    # Add end-effector pose to state
    display_cameras: bool = False    # Display camera feeds during execution

 class ImagePreprocessingConfig:
@@ -326,14 +327,22 @@ lerobot-find-joint-limits \
   Max joint positions [-20.0, -20.0, -20.0, -20.0, -20.0, -20.0]
   Min joint positions [50.0, 50.0, 50.0, 50.0, 50.0, 50.0]
   ```
-3. Use these values in the configuration of your teleoperation device (TeleoperatorConfig) under the `end_effector_bounds` field
+3. Use these values in your environment configuration under `env.processor.inverse_kinematics.end_effector_bounds` (see `InverseKinematicsConfig` in `lerobot/envs/configs.py`)

 **Example Configuration**

 ```json
-"end_effector_bounds": {
-    "max": [0.24, 0.20, 0.10],
-    "min": [0.16, -0.08, 0.03]
+{
+  "env": {
+    "processor": {
+      "inverse_kinematics": {
+        "end_effector_bounds": {
+          "max": [0.24, 0.2, 0.1],
+          "min": [0.16, -0.08, 0.03]
+        }
+      }
+    }
+  }
 }
 ```

@@ -404,30 +413,24 @@ We support using a gamepad or a keyboard or the leader arm of the robot.

 HIL-Serl learns actions in the end-effector space of the robot. Therefore, the teleoperation will control the end-effector's x,y,z displacements.

-For that we need to define a version of the robot that takes actions in the end-effector space. Check the robot class `SO100FollowerEndEffector` and its configuration `SO100FollowerEndEffectorConfig` for the default parameters related to the end-effector space.
+The end-effector transformation is applied by the processor pipeline (`InverseKinematicsRLStep`, `EEBoundsAndSafety`, `EEReferenceAndDelta`, `GripperVelocityToJoint`) configured under `env.processor.inverse_kinematics` (`InverseKinematicsConfig`) and `env.processor.gripper` / `env.processor.max_gripper_pos`. The defaults related to the end-effector space are:

 <!-- prettier-ignore-start -->
 ```python
-class SO100FollowerEndEffectorConfig(SO100FollowerConfig):
-    """Configuration for the SO100FollowerEndEffector robot."""
+class InverseKinematicsConfig:
+    """Configuration for inverse kinematics processing."""

-    # Default bounds for the end-effector position (in meters)
-    end_effector_bounds: dict[str, list[float]] = field( # bounds for the end-effector in x,y,z direction
-        default_factory=lambda: {
-            "min": [-1.0, -1.0, -1.0],  # min x, y, z
-            "max": [1.0, 1.0, 1.0],  # max x, y, z
-        }
-    )
+    urdf_path: str | None = None
+    target_frame_name: str | None = None
+    # bounds for the end-effector in x,y,z direction
+    end_effector_bounds: dict[str, list[float]] | None = None
+    # maximum step size for the end-effector in x,y,z direction
+    end_effector_step_sizes: dict[str, float] | None = None

-    max_gripper_pos: float = 50 # maximum gripper position that the gripper will be open at
-
-    end_effector_step_sizes: dict[str, float] = field( # maximum step size for the end-effector in x,y,z direction
-        default_factory=lambda: {
-            "x": 0.02,
-            "y": 0.02,
-            "z": 0.02,
-        }
-    )
+class HILSerlProcessorConfig:
+    ...
+    # maximum gripper position that the gripper will be open at
+    max_gripper_pos: float | None = 100.0
 ```
 <!-- prettier-ignore-end -->

@@ -606,11 +609,11 @@ This guide explains how to train a reward classifier for human-in-the-loop reinf

 **Note**: Training a reward classifier is optional. You can start the first round of RL experiments by annotating the success manually with your gamepad or keyboard device.

-The reward classifier implementation in `modeling_classifier.py` uses a pretrained vision model to process the images. It can output either a single value for binary rewards to predict success/fail cases or multiple values for multi-class settings.
+The reward classifier implementation in `lerobot/rewards/classifier/modeling_classifier.py` uses a pretrained vision model to process the images. It can output either a single value for binary rewards to predict success/fail cases or multiple values for multi-class settings.

 **Collecting a Dataset for the reward classifier**

-Before training, you need to collect a dataset with labeled examples. The `record_dataset` function in `gym_manipulator.py` enables the process of collecting a dataset of observations, actions, and rewards.
+Before training, you need to collect a dataset with labeled examples. Setting `mode: "record"` in your config and running `gym_manipulator.py` enables the process of collecting a dataset of observations, actions, and rewards.

 To collect a dataset, you need to modify some parameters in the environment configuration based on HILSerlRobotEnvConfig.

@@ -658,7 +661,7 @@ Example configuration section for data collection:
  },
  "dataset": {
    "repo_id": "hf_username/dataset_name",
-    "dataset_root": "data/your_dataset",
+    "root": "data/your_dataset",
    "task": "reward_classifier_task",
    "num_episodes_to_record": 20,
    "replay_episode": null,
@@ -671,7 +674,7 @@ Example configuration section for data collection:

 **Reward Classifier Configuration**

-The reward classifier is configured using `configuration_classifier.py`. Here are the key parameters:
+The reward classifier is configured using `lerobot/rewards/classifier/configuration_classifier.py`. Here are the key parameters:

 - **model_name**: Base model architecture (e.g., we mainly use `"helper2424/resnet10"`)
 - **model_type**: `"cnn"` or `"transformer"`
@@ -689,7 +692,7 @@ Example configuration for training the [reward classifier](https://huggingface.c
    "repo_id": "hf_username/dataset_name",
    "root": null
  },
-  "policy": {
+  "reward_model": {
    "type": "reward_classifier",
    "model_name": "helper2424/resnet10",
    "model_type": "cnn",
@@ -699,7 +702,6 @@ Example configuration for training the [reward classifier](https://huggingface.c
    "dropout_rate": 0.1,
    "learning_rate": 1e-4,
    "device": "cuda",
-    "use_amp": true,
    "input_features": {
      "observation.images.front": {
        "type": "VISUAL",
@@ -818,13 +820,14 @@ The LeRobot system uses a distributed actor-learner architecture for training. T

 **Configuration Setup**

-Create a training configuration file (example available [here](https://huggingface.co/datasets/lerobot/config_examples/resolve/main/rl/train_config.json)). The training config is based on the main `TrainRLServerPipelineConfig` class in `lerobot/configs/train.py`.
+Create a training configuration file (example available [here](https://huggingface.co/datasets/lerobot/config_examples/resolve/main/rl/train_config.json)). The training config is based on the main `TrainRLServerPipelineConfig` class in `lerobot/rl/train_rl.py`.

-1. Configure the policy settings (`type="sac"`, `device`, etc.)
-2. Set `dataset` to your cropped dataset
-3. Configure environment settings with crop parameters
-4. Check the other parameters related to SAC in [configuration_sac.py](https://github.com/huggingface/lerobot/blob/main/src/lerobot/policies/sac/configuration_sac.py#L79).
-5. Verify that the `policy` config is correct with the right `input_features` and `output_features` for your task.
+1. Configure the policy settings (`type="gaussian_actor"`, `device`, etc.)
+2. Configure the algorithm settings under the top-level `algorithm` block (`type="sac"`, learning rates, discount, etc., defined in `lerobot/rl/algorithms/sac/configuration_sac.py`).
+3. Set `dataset` to your cropped dataset
+4. Configure environment settings with crop parameters
+5. Check the other parameters related to the Gaussian Actor in [configuration_gaussian_actor.py](https://github.com/huggingface/lerobot/blob/main/src/lerobot/policies/gaussian_actor/configuration_gaussian_actor.py#L79).
+6. Verify that the `policy` config is correct with the right `input_features` and `output_features` for your task.

 **Starting the Learner**

@@ -926,7 +929,7 @@ The ideal behaviour is that your intervention rate should drop gradually during

 Some configuration values have a disproportionate impact on training stability and speed:

-- **`temperature_init`** (`policy.temperature_init`) – initial entropy temperature in SAC. Higher values encourage more exploration; lower values make the policy more deterministic early on. A good starting point is `1e-2`. We observed that setting it too high can make human interventions ineffective and slow down learning.
+- **`temperature_init`** (`algorithm.temperature_init`) – initial entropy temperature in SAC. Higher values encourage more exploration; lower values make the policy more deterministic early on. A good starting point is `1e-2`. We observed that setting it too high can make human interventions ineffective and slow down learning.
 - **`policy_parameters_push_frequency`** (`policy.actor_learner_config.policy_parameters_push_frequency`) – interval in _seconds_ between two weight pushes from the learner to the actor. The default is `4 s`. Decrease to **1-2 s** to provide fresher weights (at the cost of more network traffic); increase only if your connection is slow, as this will reduce sample efficiency.
 - **`storage_device`** (`policy.storage_device`) – device on which the learner keeps the policy parameters. If you have spare GPU memory, set this to `"cuda"` (instead of the default `"cpu"`). Keeping the weights on-GPU removes CPU→GPU transfer overhead and can significantly increase the number of learner updates per second.

@@ -28,13 +28,15 @@ lerobot-train \
 --steps=100000 \
 --batch_size=32 \
 --peft.method_type=LORA \
- --peft.r=64
+ --peft.r=64 \
+ --peft.lora_alpha=64
 ```

 Note the `--peft.method_type` parameter that lets you select which PEFT method to use. Here we use
 [LoRA](https://huggingface.co/docs/peft/main/en/package_reference/lora) (Low-Rank Adaptation) which is probably the most
 popular fine-tuning method to date. Low-rank adaptation means that we only fine-tune a matrix with comparably low rank
-instead of the full weight matrix. This rank can be specified using the `--peft.r` parameter. The higher the rank
+instead of the full weight matrix. This rank can be specified using the `--peft.r` parameter, and the LoRA scaling factor with
+`--peft.lora_alpha` (where `scaling = lora_alpha / r`). The higher the rank
 the closer you get to full fine-tuning.

 There are more complex methods that have more parameters. These are not yet supported, feel free to raise an issue
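
As an illustration of the `scaling = lora_alpha / r` relation mentioned above, the following sketch computes the scaled low-rank update in plain Python. The helper names (`lora_scaling`, `matmul`, `lora_delta`) are assumptions made for this example and are not part of PEFT or LeRobot; the update rule `delta_W = scaling * (B @ A)` is the standard LoRA formulation.

```python
def lora_scaling(lora_alpha: float, r: int) -> float:
    """Scaling factor applied to the low-rank update."""
    if r <= 0:
        raise ValueError(f"r must be positive, got {r}")
    return lora_alpha / r


def matmul(left, right):
    """Multiply two matrices given as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*right)] for row in left]


def lora_delta(lora_b, lora_a, lora_alpha: float, r: int):
    """Return scaling * (B @ A), the update added to the frozen weight matrix."""
    scaling = lora_scaling(lora_alpha, r)
    return [[scaling * value for value in row] for row in matmul(lora_b, lora_a)]


# With the flags from the command above (r=64, lora_alpha=64) the update is unscaled.
print(lora_scaling(64, 64))  # 1.0

# Tiny rank-1 example: B is 2x1, A is 1x2, so B @ A is a 2x2 matrix of rank 1.
delta = lora_delta([[1.0], [2.0]], [[0.5, -1.0]], lora_alpha=2, r=1)
print(delta)  # [[1.0, -2.0], [2.0, -4.0]]
```

Keeping `lora_alpha` equal to `r` leaves the update unscaled, so changing the rank alone does not also change the effective magnitude of the adapter output.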
@@ -0,0 +1,244 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Create videos with a Robometer progress overlay for one LeRobot dataset episode.
+
+This is a lightweight smoke-test utility for Robometer checkpoints. It downloads
+one episode video, samples a small number of frames, runs Robometer on those
+frames, and reuses the progress overlay renderer from
+``examples/dataset/create_progress_videos.py``.
+
+Example:
+
+    uv run python examples/dataset/create_robometer_progress_videos.py \\
+        --repo-id lerobot/aloha_mobile_cabinet \\
+        --episode 0 \\
+        --reward-model-path lilkm/robometer-4b \\
+        --device cuda
+"""
+
+from __future__ import annotations
+
+import argparse
+import logging
+from pathlib import Path
+
+import cv2
+import numpy as np
+import torch
+
+from examples.dataset.create_progress_videos import (
+    composite_progress_video,
+    convert_mp4_to_gif,
+    download_episode_metadata,
+    download_video_file,
+    load_episode_meta,
+)
+from lerobot.rewards.robometer import RobometerConfig, RobometerRewardModel
+from lerobot.rewards.robometer.modeling_robometer import decode_progress_outputs
+from lerobot.rewards.robometer.processor_robometer import RobometerEncoderProcessorStep
+from lerobot.utils.utils import init_logging
+
+
+def _default_device() -> str:
+    return "cuda" if torch.cuda.is_available() else "cpu"
+
+
+def sample_episode_frames(
+    video_path: Path,
+    *,
+    from_timestamp: float,
+    to_timestamp: float,
+    fps: float,
+    num_frames: int,
+) -> tuple[np.ndarray, np.ndarray]:
+    """Sample RGB frames uniformly from an episode video segment.
+
+    Returns:
+        ``(frames, frame_indices)`` where ``frames`` is ``(T,H,W,C)`` uint8 RGB
+        and ``frame_indices`` are local episode frame indices used for overlay.
+    """
+    if num_frames <= 0:
+        raise ValueError(f"num_frames must be positive, got {num_frames}")
+
+    duration_seconds = to_timestamp - from_timestamp
+    total_frames = max(int(round(duration_seconds * fps)), 1)
+    frame_indices = np.linspace(0, total_frames - 1, num=min(num_frames, total_frames), dtype=int)
+
+    capture = cv2.VideoCapture(str(video_path))
+    frames: list[np.ndarray] = []
+    kept_indices: list[int] = []  # indices of the frames that were actually decoded
+    try:
+        for frame_idx in frame_indices:
+            timestamp = from_timestamp + frame_idx / fps
+            capture.set(cv2.CAP_PROP_POS_MSEC, timestamp * 1000)
+            ret, frame_bgr = capture.read()
+            if not ret:
+                logging.warning("Could not read frame %d at %.3fs", frame_idx, timestamp)
+                continue
+            frames.append(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
+            kept_indices.append(int(frame_idx))
+    finally:
+        capture.release()
+
+    if not frames:
+        raise RuntimeError(f"No frames could be sampled from {video_path}")
+
+    # Pair each decoded frame with its own index so a skipped read cannot shift the alignment.
+    return np.stack(frames), np.asarray(kept_indices, dtype=int)
+
+
+def predict_robometer_progress(
+    frames: np.ndarray,
+    *,
+    task: str,
+    reward_model_path: str,
+    device: str,
+) -> list[float]:
+    """Run Robometer and return per-sampled-frame progress predictions."""
+    config = RobometerConfig(pretrained_path=reward_model_path, device=device, max_frames=None)
+    model = RobometerRewardModel.from_pretrained(reward_model_path, config=config)
+
+    encoder = RobometerEncoderProcessorStep(
+        base_model_id=model.config.base_model_id,
+        use_multi_image=model.config.use_multi_image,
+        use_per_frame_progress_token=model.config.use_per_frame_progress_token,
+        max_frames=None,
+    )
+    batch = encoder.encode_samples([(frames, task)])
+
+    model_device = next(model.model.parameters()).device
+    inputs = {key: value.to(model_device) if hasattr(value, "to") else value for key, value in batch.items()}
+
+    model.eval()
+    with torch.no_grad():
+        progress_logits, success_logits = model._compute_rbm_logits(inputs)
+
+    decoded = decode_progress_outputs(
+        progress_logits,
+        success_logits,
+        is_discrete_mode=model.config.use_discrete_progress,
+    )
+    return decoded["progress_pred"][0]
+
+
+def process_dataset(
+    repo_id: str,
+    episode: int,
+    reward_model_path: str,
+    device: str,
+    camera_key: str | None,
+    output_dir: Path,
+    num_frames: int,
+    task: str | None = None,
+    create_gif: bool = False,
+) -> Path:
+    safe_name = repo_id.replace("/", "_")
+    logging.info("Processing %s episode %d with Robometer %s", repo_id, episode, reward_model_path)
+
+    local_path = download_episode_metadata(repo_id, episode)
+    episode_meta = load_episode_meta(local_path, episode, camera_key)
+    video_path = download_video_file(repo_id, local_path, episode_meta["video_rel"])
+
+    task_name = task or episode_meta.get("task_name", "")
+    if not task_name:
+        raise ValueError("No task found in dataset metadata. Pass --task explicitly.")
+
+    frames, frame_indices = sample_episode_frames(
+        video_path,
+        from_timestamp=episode_meta["from_ts"],
+        to_timestamp=episode_meta["to_ts"],
+        fps=episode_meta["fps"],
+        num_frames=num_frames,
+    )
+    logging.info("Sampled %d frames for Robometer inference", len(frames))
+
+    progress = predict_robometer_progress(
+        frames,
+        task=task_name,
+        reward_model_path=reward_model_path,
+        device=device,
+    )
+    progress_data = np.stack([frame_indices, np.asarray(progress, dtype=np.float32)], axis=1)
+    logging.info("Progress predictions: %s", [round(float(value), 3) for value in progress])
+
+    output_path = output_dir / f"{safe_name}_ep{episode}_robometer_progress.mp4"
+    final_path = composite_progress_video(
+        video_path=video_path,
+        from_timestamp=episode_meta["from_ts"],
+        to_timestamp=episode_meta["to_ts"],
+        progress_data=progress_data,
+        output_path=output_path,
+        fps=episode_meta["fps"],
+        task_name=task_name,
+    )
+
+    if create_gif:
+        final_path = convert_mp4_to_gif(final_path)
+    return final_path
+
+
+def main() -> None:
+    parser = argparse.ArgumentParser(
+        description="Create MP4/GIF videos with Robometer progress overlay for dataset episodes."
+    )
+    parser.add_argument("--repo-id", required=True, help="Hugging Face LeRobot dataset repo id.")
+    parser.add_argument("--episode", type=int, required=True, help="Episode index to visualize.")
+    parser.add_argument(
+        "--reward-model-path",
+        default="lilkm/robometer-4b",
+        help="Robometer checkpoint path or Hub repo id (e.g. lilkm/robometer-4b).",
+    )
+    parser.add_argument("--device", default=_default_device(), help="Torch device for Robometer inference.")
+    parser.add_argument(
+        "--camera-key",
+        default=None,
+        help="Camera observation key (e.g. observation.images.top). Auto-selects first camera if omitted.",
+    )
+    parser.add_argument(
+        "--task", default=None, help="Task description override if dataset metadata lacks one."
+    )
+    parser.add_argument(
+        "--num-frames",
+        type=int,
+        default=8,
+        help="Number of episode frames to sample for Robometer inference.",
+    )
+    parser.add_argument(
+        "--output-dir",
+        type=Path,
+        default=Path("progress_videos"),
+        help="Directory to write output files.",
+    )
+    parser.add_argument("--gif", action="store_true", help="Also generate a GIF from the MP4 output.")
+    args = parser.parse_args()
+
+    init_logging()
+    args.output_dir.mkdir(parents=True, exist_ok=True)
+
+    result = process_dataset(
+        repo_id=args.repo_id,
+        episode=args.episode,
+        reward_model_path=args.reward_model_path,
+        device=args.device,
+        camera_key=args.camera_key,
+        output_dir=args.output_dir,
+        num_frames=args.num_frames,
+        task=args.task,
+        create_gif=args.gif,
+    )
+    logging.info("Output: %s", result)
+
+
+if __name__ == "__main__":
+    main()
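
The uniform sampling done in `sample_episode_frames` can be sketched without OpenCV or NumPy. The function below is a hypothetical stand-in that uses integer arithmetic; it follows the same idea as the `np.linspace(..., dtype=int)` call in the script but is not guaranteed to match it index for index in every case.

```python
def uniform_frame_indices(
    from_timestamp: float, to_timestamp: float, fps: float, num_frames: int
) -> list[int]:
    """Pick up to num_frames evenly spaced local frame indices for an episode segment."""
    if num_frames <= 0:
        raise ValueError(f"num_frames must be positive, got {num_frames}")
    total_frames = max(round((to_timestamp - from_timestamp) * fps), 1)
    count = min(num_frames, total_frames)
    if count == 1:
        return [0]
    # Integer arithmetic keeps the first index at 0 and the last at total_frames - 1.
    return [i * (total_frames - 1) // (count - 1) for i in range(count)]


# A 10-second segment at 30 fps has 300 frames; sample 4 of them.
indices = uniform_frame_indices(2.0, 12.0, fps=30, num_frames=4)
print(indices)  # [0, 99, 199, 299]

# Seek timestamps in the source video are offset by the segment start.
timestamps = [2.0 + index / 30 for index in indices]
print(timestamps[0])  # 2.0
```

When the segment is shorter than the requested number of frames, the count is capped at the number of available frames, so no index is repeated.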
@@ -4,13 +4,13 @@ from pathlib import Path
 from queue import Empty, Full

 import torch
-import torch.optim as optim

 from lerobot.datasets import LeRobotDataset
 from lerobot.envs.configs import HILSerlProcessorConfig, HILSerlRobotEnvConfig
-from lerobot.policies import SACConfig
-from lerobot.policies.sac.modeling_sac import SACPolicy
+from lerobot.policies import GaussianActorConfig
+from lerobot.policies.gaussian_actor.modeling_gaussian_actor import GaussianActorPolicy
 from lerobot.rewards.classifier.modeling_classifier import Classifier
+from lerobot.rl.algorithms.sac import SACAlgorithm, SACAlgorithmConfig
 from lerobot.rl.buffer import ReplayBuffer
 from lerobot.rl.gym_manipulator import make_robot_env
 from lerobot.robots.so_follower import SO100FollowerConfig
@@ -28,7 +28,7 @@ def run_learner(
    transitions_queue: mp.Queue,
    parameters_queue: mp.Queue,
    shutdown_event: mp.Event,
-    policy_learner: SACPolicy,
+    policy_learner: GaussianActorPolicy,
    online_buffer: ReplayBuffer,
    offline_buffer: ReplayBuffer,
    lr: float = 3e-4,
@@ -40,8 +40,9 @@ def run_learner(
    policy_learner.train()
    policy_learner.to(device)

-    # Create Adam optimizer from scratch - simple and clean
-    optimizer = optim.Adam(policy_learner.parameters(), lr=lr)
+    algo_config = SACAlgorithmConfig.from_policy_config(policy_learner.config)
+    algorithm = SACAlgorithm(policy=policy_learner, config=algo_config)
+    algorithm.make_optimizers_and_scheduler()

    print(f"[LEARNER] Online buffer capacity: {online_buffer.capacity}")
    print(f"[LEARNER] Offline buffer capacity: {offline_buffer.capacity}")
@@ -83,24 +84,26 @@ def run_learner(
                else:
                    batch[key] = online_batch[key]

-            loss, _ = policy_learner.forward(batch)
+            def batch_iter(b=batch):
+                while True:
+                    yield b

-            optimizer.zero_grad()
-            loss.backward()
-            optimizer.step()
+            stats = algorithm.update(batch_iter())
            training_step += 1

            if training_step % LOG_EVERY == 0:
+                log_dict = stats.to_log_dict()
                print(
-                    f"[LEARNER] Training step {training_step}, Loss: {loss.item():.4f}, "
+                    f"[LEARNER] Training step {training_step}, "
+                    f"critic_loss: {log_dict.get('critic', float('nan')):.4f}, "
                    f"Buffers: Online={len(online_buffer)}, Offline={len(offline_buffer)}"
                )

            # Send updated parameters to actor every 10 training steps
            if training_step % SEND_EVERY == 0:
                try:
-                    state_dict = {k: v.cpu() for k, v in policy_learner.state_dict().items()}
-                    parameters_queue.put_nowait(state_dict)
+                    weights = algorithm.get_weights()
+                    parameters_queue.put_nowait(weights)
                    print("[LEARNER] Sent updated parameters to actor")
                except Full:
                    # Missing write due to queue not being consumed (should happen rarely)
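
The `def batch_iter(b=batch)` pattern in this hunk binds the current batch as a default argument. A small sketch of why that matters, independent of LeRobot: a generator body runs lazily, so a name read inside it is looked up when the generator is advanced, not when it is created. How `algorithm.update` consumes the iterator is not shown in this diff, so the snippet below only demonstrates the binding behavior.

```python
from itertools import islice

batch = {"step": 0}


def batch_iter(b=batch):
    """Yield the batch that was current when the function was defined."""
    while True:
        yield b


def late_batch_iter():
    """Yield whatever the name `batch` refers to at the time of each read."""
    while True:
        yield batch


bound = batch_iter()
late = late_batch_iter()

# Rebinding the name afterwards, as the next loop iteration would do.
batch = {"step": 1}

bound_items = list(islice(bound, 2))
late_item = next(late)
print(bound_items)  # [{'step': 0}, {'step': 0}]
print(late_item)    # {'step': 1}
```

The default-argument form therefore guarantees that an iterator handed to a consumer keeps yielding the batch it was built for, even if the surrounding loop has already moved on.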
@@ -113,7 +116,7 @@ def run_actor(
    transitions_queue: mp.Queue,
    parameters_queue: mp.Queue,
    shutdown_event: mp.Event,
-    policy_actor: SACPolicy,
+    policy_actor: GaussianActorPolicy,
    reward_classifier: Classifier,
    env_cfg: HILSerlRobotEnvConfig,
    device: torch.device = "mps",
@@ -144,15 +147,15 @@ def run_actor(

            while step < MAX_STEPS_PER_EPISODE and not shutdown_event.is_set():
                try:
-                    new_params = parameters_queue.get_nowait()
-                    policy_actor.load_state_dict(new_params)
+                    new_weights = parameters_queue.get_nowait()
+                    policy_actor.load_state_dict(new_weights)
                    print("[ACTOR] Updated policy parameters from learner")
                except Empty:  # No new updated parameters available from learner, waiting
                    pass

-                # Get action from policy
+                # Get action from policy (returns full action: continuous + discrete)
                policy_obs = make_policy_obs(obs, device=device)
-                action_tensor = policy_actor.select_action(policy_obs)  # predicts a single action
+                action_tensor = policy_actor.select_action(policy_obs)
                action = action_tensor.squeeze(0).cpu().numpy()

                # Step environment
@@ -261,14 +264,14 @@ def main():
    action_features = hw_to_dataset_features(env.robot.action_features, "action")

    # Create the Gaussian actor policy (trained with SAC) for action selection
-    policy_cfg = SACConfig(
+    policy_cfg = GaussianActorConfig(
        device=device,
        input_features=obs_features,
        output_features=action_features,
    )

-    policy_actor = SACPolicy(policy_cfg)
-    policy_learner = SACPolicy(policy_cfg)
+    policy_actor = GaussianActorPolicy(policy_cfg)
+    policy_learner = GaussianActorPolicy(policy_cfg)

    demonstrations_repo_id = "lerobot/example_hil_serl_dataset"
    offline_dataset = LeRobotDataset(repo_id=demonstrations_repo_id)
@@ -99,7 +99,18 @@ dataset = [
    "pandas>=2.0.0,<3.0.0", # NOTE: Transitive dependency of datasets
    "pyarrow>=21.0.0,<30.0.0", # NOTE: Transitive dependency of datasets
    "lerobot[av-dep]",
-    "torchcodec>=0.3.0,<0.12.0; sys_platform != 'win32' and (sys_platform != 'linux' or (platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l')) and (sys_platform != 'darwin' or platform_machine != 'x86_64')", # NOTE: Windows support starts at version 0.7 (needs torch==2.8), ffmpeg>=8 support starts at version 0.8.1 (needs torch==2.9), system-wide ffmpeg support starts at version 0.10 (needs torch==2.10), 0.11 needs torch==2.11, 0.12 needs torch==2.12.
+
+    # NOTE: torchcodec wheel availability matrix (PyPI):
+    #   - linux x86_64/amd64 + macOS arm64 : wheels since 0.3.0 (the historic supported set).
+    #   - win32 x86_64                     : wheels since 0.7.0  (needs torch>=2.8).
+    #   - linux aarch64/arm64              : wheels since 0.11.0 (needs torch>=2.11).
+    #   - macOS x86_64 (Intel) and linux armv7l: no wheels in any released version -> fall through to the PyAV decoder.
+    # Each platform gets its own line so the resolver picks the minimum version that has a wheel for it.
+
+    # Other torch/torchcodec pairings (informational): 0.8.1 = ffmpeg>=8 support, 0.10 = system-wide ffmpeg support, 0.12 needs torch==2.12.
+    "torchcodec>=0.3.0,<0.12.0; (sys_platform == 'linux' and (platform_machine == 'x86_64' or platform_machine == 'AMD64')) or (sys_platform == 'darwin' and platform_machine == 'arm64')",
+    "torchcodec>=0.7.0,<0.12.0; sys_platform == 'win32'",
+    "torchcodec>=0.11.0,<0.12.0; sys_platform == 'linux' and (platform_machine == 'aarch64' or platform_machine == 'arm64')",
    "jsonlines>=4.0.0,<5.0.0",
 ]
 training = [
@@ -193,9 +204,10 @@ groot = [
    "flash-attn>=2.5.9,<3.0.0 ; sys_platform != 'darwin'"
 ]
 sarm = ["lerobot[transformers-dep]", "pydantic>=2.0.0,<3.0.0", "faker>=33.0.0,<35.0.0", "lerobot[matplotlib-dep]", "lerobot[qwen-vl-utils-dep]"]
+robometer = ["lerobot[transformers-dep]", "lerobot[qwen-vl-utils-dep]", "lerobot[peft-dep]"]
 xvla = ["lerobot[transformers-dep]"]
 eo1 = ["lerobot[transformers-dep]", "lerobot[qwen-vl-utils-dep]"]
-hilserl = ["lerobot[transformers-dep]", "gym-hil>=0.1.13,<0.2.0", "lerobot[grpcio-dep]", "lerobot[placo-dep]"]
+hilserl = ["lerobot[transformers-dep]", "lerobot[dataset]", "gym-hil>=0.1.13,<0.2.0", "lerobot[grpcio-dep]", "lerobot[placo-dep]"]

 # Features
 async = ["lerobot[grpcio-dep]", "lerobot[matplotlib-dep]"]
@@ -291,6 +303,7 @@ lerobot-imgtransform-viz="lerobot.scripts.lerobot_imgtransform_viz:main"
 lerobot-edit-dataset="lerobot.scripts.lerobot_edit_dataset:main"
 lerobot-setup-can="lerobot.scripts.lerobot_setup_can:main"
 lerobot-rollout="lerobot.scripts.lerobot_rollout:main"
+lerobot-export-robometer="lerobot.scripts.lerobot_export_robometer:main"

 # ---------------- Tool Configurations ----------------

@@ -0,0 +1,164 @@
+#!/usr/bin/env python
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+"""Pinpoint exactly which rows of ``embed_tokens`` / ``lm_head`` differ.
+
+Useful follow-up to ``scripts/verify_robometer_export.py`` when the verifier
+reports a small tail of differing keys but you want to know whether the
+diff is:
+
+1. Concentrated in the 5 special-token rows added by ``resize_token_embeddings``
+   (expected non-determinism: mean-resize sampling differs between runs).
+2. Spread across the full vocabulary (would point to a real loading bug).
+
+Also confirms whether ``apply_upstream_checkpoint`` actually overwrites the
+embed/lm-head tensors when loading the upstream state dict (vs. silently
+skipping them due to a key mismatch).
+"""
+
+from __future__ import annotations
+
+import argparse
+import sys
+
+import torch
+from safetensors.torch import load_file
+
+from lerobot.configs.rewards import RewardModelConfig
+from lerobot.rewards.robometer import RobometerConfig, RobometerRewardModel
+from lerobot.rewards.robometer._upstream_loader import (
+    _download_robometer_snapshot,
+    _remap_state_dict_keys,
+    _resolve_checkpoint_safetensors_files,
+    apply_upstream_checkpoint,
+)
+
+EMBED_KEY = "model.model.language_model.embed_tokens.weight"
+LMHEAD_KEY = "model.lm_head.weight"
+
+
+def _load_upstream(path: str) -> RobometerRewardModel:
+    cfg = RobometerConfig(pretrained_path=path, device="cpu")
+    model = RobometerRewardModel(cfg)
+    apply_upstream_checkpoint(model, path)
+    model.eval()
+    return model
+
+
+def _load_lerobot(path: str) -> RobometerRewardModel:
+    cfg = RewardModelConfig.from_pretrained(path)
+    if not isinstance(cfg, RobometerConfig):
+        raise TypeError(f"Expected RobometerConfig, got {type(cfg)}")
+    cfg.pretrained_path = path
+    cfg.device = "cpu"
+    return RobometerRewardModel.from_pretrained(path, config=cfg)
+
+
+def _inspect_upstream_state_dict(upstream_path: str, model: RobometerRewardModel) -> None:
+    """Dump the upstream state-dict view of the embed/lm-head tensors.
+
+    Loads the raw upstream safetensors (pre-remap), runs the remapper, and
+    reports whether the embed/lm-head keys survive into the merged dict that
+    eventually hits ``model.load_state_dict``.
+    """
+    snapshot_dir = _download_robometer_snapshot(upstream_path)
+    files = _resolve_checkpoint_safetensors_files(snapshot_dir)
+    merged: dict[str, torch.Tensor] = {}
+    for path in files:
+        merged.update(load_file(str(path)))
+    remapped = _remap_state_dict_keys(merged, model)
+
+    print(f"\n=== Upstream state-dict inspection (snapshot at {snapshot_dir}) ===")
+    print(f"raw keys (before remap)  : {len(merged)}")
+    print(f"keys after remap         : {len(remapped)}")
+    print(f"model expects (state_dict): {len(model.state_dict())}")
+
+    expected = set(model.state_dict())
+    present_after_remap = set(remapped) & expected
+    print(f"keys present after remap : {len(present_after_remap)}")
+
+    missing_keys = expected - set(remapped)
+    print(f"keys missing from remap  : {len(missing_keys)}")
+    if missing_keys:
+        sample = sorted(missing_keys)[:10]  # sorted so the sample is stable across runs
+        print(f"  sample missing keys    : {sample}")
+
+    unexpected_keys = set(remapped) - expected
+    print(f"keys unexpected by model : {len(unexpected_keys)}")
+    if unexpected_keys:
+        sample = sorted(unexpected_keys)[:10]  # sorted so the sample is stable across runs
+        print(f"  sample unexpected keys : {sample}")
+
+    for key in (EMBED_KEY, LMHEAD_KEY):
+        present = key in remapped
+        shape = tuple(remapped[key].shape) if present else None
+        print(f"  {key:60s}  present={present}, shape={shape}")
+
+
+def _diff_embed(name: str, a: torch.Tensor, b: torch.Tensor, special_token_count: int) -> None:
+    a = a.float()
+    b = b.float()
+    if a.shape != b.shape:
+        print(f"❌ {name} shape mismatch: {tuple(a.shape)} vs {tuple(b.shape)}")
+        return
+
+    abs_diff = (a - b).abs()
+    per_row_max = abs_diff.max(dim=1).values
+    nz_rows = (per_row_max > 0).nonzero(as_tuple=True)[0].tolist()
+    print(f"\n=== {name} (shape {tuple(a.shape)}) ===")
+    print(f"global max|Δ|         = {abs_diff.max().item():.3e}")
+    print(f"rows with any diff    = {len(nz_rows)}")
+    if nz_rows:
+        first = nz_rows[:10]
+        last = nz_rows[-10:]
+        print(f"  first nonzero rows  = {first}")
+        print(f"  last nonzero rows   = {last}")
+        vocab_size = a.shape[0]
+        base_vocab = vocab_size - special_token_count
+        special_rows = list(range(base_vocab, vocab_size))
+        in_special = [r for r in nz_rows if r >= base_vocab]
+        out_special = [r for r in nz_rows if r < base_vocab]
+        print(
+            f"  diffs in special-token rows ({base_vocab}..{vocab_size - 1}): {len(in_special)}/{special_token_count}"
+        )
+        print(f"  diffs in base-vocab rows  (0..{base_vocab - 1})           : {len(out_special)}")
+        for r in special_rows:
+            print(
+                f"    row {r}: max|Δ|={per_row_max[r].item():.3e}, "
+                f"upstream_norm={a[r].norm().item():.3e}, lerobot_norm={b[r].norm().item():.3e}"
+            )
+
+
+def main() -> int:
+    parser = argparse.ArgumentParser(
+        description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter
+    )
+    parser.add_argument("--upstream", required=True)
+    parser.add_argument("--lerobot", required=True)
+    parser.add_argument(
+        "--special-token-count",
+        type=int,
+        default=5,
+        help="Number of special tokens Robometer adds. Defaults to len(ROBOMETER_SPECIAL_TOKENS)=5.",
+    )
+    args = parser.parse_args()
+
+    print(f"Loading upstream:        {args.upstream}")
+    upstream = _load_upstream(args.upstream)
+    print(f"Loading LeRobot-format:  {args.lerobot}")
+    lerobot = _load_lerobot(args.lerobot)
+
+    _inspect_upstream_state_dict(args.upstream, upstream)
+
+    sd_u, sd_l = upstream.state_dict(), lerobot.state_dict()
+
+    for key in (EMBED_KEY, LMHEAD_KEY):
+        if key not in sd_u or key not in sd_l:
+            print(f"❌ key missing: {key} (upstream={key in sd_u}, lerobot={key in sd_l})")
+            continue
+        _diff_embed(key, sd_u[key], sd_l[key], args.special_token_count)
+
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
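
Reviewer note: the row classification in `_diff_embed` can be sketched without torch. The snippet below is a minimal, stdlib-only illustration, assuming the special tokens occupy the last `special_token_count` rows of the matrix (the same assumption the script makes). `classify_row_diffs` is a hypothetical helper name, not part of the PR.

```python
def classify_row_diffs(a, b, special_token_count):
    """Split differing row indices into (special-token rows, base-vocab rows).

    ``a`` and ``b`` are equal-shape matrices given as lists of non-empty rows.
    Assumes the special tokens are the trailing ``special_token_count`` rows.
    """
    if len(a) != len(b):
        raise ValueError(f"row count mismatch: {len(a)} vs {len(b)}")
    base_vocab = len(a) - special_token_count
    # Largest absolute element-wise difference in each row.
    per_row_max = [
        max(abs(x - y) for x, y in zip(row_a, row_b))
        for row_a, row_b in zip(a, b)
    ]
    differing = [i for i, d in enumerate(per_row_max) if d > 0]
    in_special = [i for i in differing if i >= base_vocab]
    in_base = [i for i in differing if i < base_vocab]
    return in_special, in_base


upstream = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
lerobot = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.5]]
special, base = classify_row_diffs(upstream, lerobot, special_token_count=2)
print(special, base)
```

If every differing row lands in the first list, the diff is confined to the resized special-token rows; any entry in the second list points at the base vocabulary and deserves a closer look.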
@@ -0,0 +1,168 @@
+#!/usr/bin/env python
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+"""Extract one LIBERO episode for Robometer parity testing.
+
+Loads a LeRobot LIBERO (or any video-bearing LeRobot) dataset, picks one
+episode, samples ``--num-frames`` frames uniformly across its duration
+(matching upstream Robometer's default of 8 frames), and saves them to
+``.npz`` plus a sidecar ``.txt`` task file.
+
+The ``.npz`` layout (``frames`` key, ``(T, H, W, C) uint8``) is what upstream
+``example_inference_local.py`` consumes, so the same file feeds both pipelines
+and frame sampling cannot drift.
+
+Workflow:
+
+1. Run this script (LeRobot env) to produce ``frames.npz`` + ``task.txt``.
+2. Pass them to upstream ``scripts/example_inference_local.py``
+   (upstream env) to produce reference progress / success outputs.
+3. Pass the same ``frames.npz`` to ``scripts/parity_robometer.py``
+   (LeRobot env) to compare both sides.
+
+Example:
+
+    uv run python scripts/extract_libero_episode_for_parity.py \\
+        --repo-id lerobot/libero_10_image \\
+        --episode 0 \\
+        --num-frames 8 \\
+        --out-dir /tmp/libero_ep0
+"""
+
+from __future__ import annotations
+
+import argparse
+import sys
+from pathlib import Path
+
+import numpy as np
+import torch
+
+from lerobot.configs.types import FeatureType
+from lerobot.datasets.lerobot_dataset import LeRobotDataset
+
+
+def _pick_visual_feature(features: dict, requested: str | None) -> str:
+    """Return a visual feature key, preferring ``requested`` when given."""
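+    # ``features`` values are plain dicts; image datasets such as the default
+    # ``lerobot/libero_10_image`` declare dtype "image" rather than "video".
+    visual_keys = [
+        key
+        for key, ft in features.items()
+        if getattr(ft, "type", None) == FeatureType.VISUAL
+        or (isinstance(ft, dict) and ft.get("dtype") in ("image", "video"))
+    ]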
+    if not visual_keys:
+        raise ValueError(f"Dataset has no visual feature; available: {list(features)}")
+    if requested is not None:
+        if requested not in visual_keys:
+            raise ValueError(f"Camera key {requested!r} not in dataset visual features {visual_keys}")
+        return requested
+    return visual_keys[0]
+
+
+def _frame_uint8_hwc(tensor: torch.Tensor) -> np.ndarray:
+    """Convert a LeRobotDataset video frame to ``uint8`` ``(H, W, C)`` RGB."""
+    arr = tensor.detach().cpu().numpy()
+    if arr.ndim == 3 and arr.shape[0] in (1, 3):
+        arr = arr.transpose(1, 2, 0)
+    if arr.dtype != np.uint8:
+        arr = np.clip(arr * 255.0 if arr.max() <= 1.0 + 1e-3 else arr, 0, 255).astype(np.uint8)
+    if arr.shape[-1] == 1:
+        arr = np.repeat(arr, 3, axis=-1)
+    return arr
+
+
+def main() -> int:
+    parser = argparse.ArgumentParser(
+        description=__doc__,
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+    )
+    parser.add_argument(
+        "--repo-id",
+        default="lerobot/libero_10_image",
+        help="LeRobot LIBERO (or other) dataset repo id (default: lerobot/libero_10_image).",
+    )
+    parser.add_argument("--episode", type=int, default=0, help="Episode index.")
+    parser.add_argument(
+        "--camera-key",
+        default=None,
+        help="Visual feature key (e.g. observation.images.image). Auto-selects first if omitted.",
+    )
+    parser.add_argument(
+        "--num-frames",
+        type=int,
+        default=8,
+        help="Number of frames to sample uniformly (default: 8 — Robometer's training-time default).",
+    )
+    parser.add_argument(
+        "--out-dir",
+        type=Path,
+        default=Path("outputs/robometer_parity/libero"),
+        help="Directory to write frames.npz / task.txt / frame_indices.npy.",
+    )
+    args = parser.parse_args()
+
+    print(f"Loading {args.repo_id} (episode {args.episode})...")
+    dataset = LeRobotDataset(args.repo_id, episodes=[args.episode])
+
+    camera_key = _pick_visual_feature(dataset.features, args.camera_key)
+    print(f"Using camera key: {camera_key}")
+
+    ep_from = int(dataset.episode_data_index["from"][0].item())
+    ep_to = int(dataset.episode_data_index["to"][0].item())
+    total_frames = ep_to - ep_from
+    if total_frames <= 0:
+        print(f"ERROR: episode {args.episode} has no frames.", file=sys.stderr)
+        return 1
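+    # Clamp so the log line reports the number of frames actually sampled.
+    num_samples = max(1, min(args.num_frames, total_frames))
+    print(f"Episode has {total_frames} frames; sampling {num_samples} uniformly.")
+
+    indices = np.linspace(0, total_frames - 1, num=num_samples, dtype=int)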
+    frames: list[np.ndarray] = []
+    task: str = ""
+    for offset in indices:
+        sample = dataset[ep_from + int(offset)]
+        frame_tensor = sample[camera_key]
+        frames.append(_frame_uint8_hwc(frame_tensor))
+        if not task:
+            task = sample.get("task", "") or ""
+
+    if not task:
+        print("ERROR: episode has no task description in metadata.", file=sys.stderr)
+        return 1
+
+    frames_array = np.stack(frames)
+
+    args.out_dir.mkdir(parents=True, exist_ok=True)
+    frames_path = args.out_dir / "frames.npz"
+    task_path = args.out_dir / "task.txt"
+    indices_path = args.out_dir / "frame_indices.npy"
+
+    np.savez(frames_path, frames=frames_array)
+    task_path.write_text(task + "\n", encoding="utf-8")
+    np.save(indices_path, indices)
+
+    print()
+    print(f"Wrote {frames_path} (shape={frames_array.shape}, dtype={frames_array.dtype})")
+    print(f"Wrote {task_path}   (task={task!r})")
+    print(f"Wrote {indices_path} (frame_indices={indices.tolist()})")
+    print()
+    print("Next steps:")
+    print("  # in upstream env (where `robometer` is importable):")
+    print(
+        f"  python third_party/robometer/scripts/example_inference_local.py \\\n"
+        f"      --model-path robometer/Robometer-4B \\\n"
+        f"      --video {frames_path} \\\n"
+        f'      --task "{task}" \\\n'
+        f"      --out {args.out_dir / 'upstream.npy'}"
+    )
+    print()
+    print("  # back in LeRobot env:")
+    print(
+        f"  uv run python scripts/parity_robometer.py \\\n"
+        f"      --frames {frames_path} \\\n"
+        f'      --task "{task}" \\\n'
+        f"      --upstream-progress {args.out_dir / 'upstream.npy'} \\\n"
+        f"      --upstream-success  {args.out_dir / 'upstream_success_probs.npy'}"
+    )
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
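
Reviewer note: the uniform sampling used by the extraction script can be illustrated without NumPy. This is a minimal sketch using integer arithmetic as a stand-in for `np.linspace(0, total_frames - 1, num, dtype=int)`; it is not the script's exact call, and floating-point rounding inside NumPy could in principle shift an index by one in edge cases. `uniform_frame_indices` is a hypothetical name.

```python
def uniform_frame_indices(total_frames, num_frames):
    """Pick up to ``num_frames`` evenly spaced indices in [0, total_frames - 1].

    The first and last frames are always included when more than one
    frame is requested; the request is clamped to the episode length.
    """
    if total_frames <= 0:
        raise ValueError("episode has no frames")
    n = max(1, min(num_frames, total_frames))
    if n == 1:
        return [0]
    last = total_frames - 1
    # floor(i * last / (n - 1)) computed exactly with integers.
    return [(i * last) // (n - 1) for i in range(n)]


print(uniform_frame_indices(100, 8))
print(uniform_frame_indices(3, 8))
```

Saving the chosen indices next to the frames, as the script does with `frame_indices.npy`, lets both pipelines be checked against the same sampling.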
@@ -0,0 +1,232 @@
+#!/usr/bin/env python
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+"""Functional parity check: LeRobot Robometer vs. upstream Robometer.
+
+Runs the in-tree :class:`RobometerRewardModel` on the same frames + task that
+upstream Robometer was run on, and compares per-frame progress / success
+predictions against reference outputs saved by upstream's
+``scripts/example_inference_local.py``.
+
+Workflow:
+
+1. In the upstream Robometer environment (where ``robometer`` is importable),
+   run::
+
+       python third_party/robometer/scripts/example_inference_local.py \\
+           --model-path robometer/Robometer-4B \\
+           --video /path/to/episode.mp4 \\
+           --task "Open the drawer" \\
+           --fps 1.0 \\
+           --out /tmp/robometer_upstream.npy
+
+   This produces:
+   - ``/tmp/robometer_upstream.npy``               (progress predictions)
+   - ``/tmp/robometer_upstream_success_probs.npy`` (success probabilities)
+
+2. Extract the exact same frames the upstream script used, save as ``.npz``::
+
+       # quick helper: extract frames at the same fps and save as .npz
+       python -c "
+       from third_party.robometer.scripts.example_inference_local import load_frames_input
+       import numpy as np
+       frames = load_frames_input('/path/to/episode.mp4', fps=1.0, max_frames=512)
+       np.savez('/tmp/robometer_frames.npz', frames=frames)
+       "
+
+3. In this LeRobot env, run this script::
+
+       uv run python scripts/parity_robometer.py \\
+           --frames /tmp/robometer_frames.npz \\
+           --task "Open the drawer" \\
+           --upstream-progress /tmp/robometer_upstream.npy \\
+           --upstream-success  /tmp/robometer_upstream_success_probs.npy \\
+           --lerobot-model     lilkm/robometer-4b
+"""
+
+from __future__ import annotations
+
+import argparse
+import sys
+
+import numpy as np
+import torch
+
+from lerobot.configs.rewards import RewardModelConfig
+from lerobot.rewards.robometer import RobometerConfig, RobometerRewardModel
+from lerobot.rewards.robometer.modeling_robometer import decode_progress_outputs
+from lerobot.rewards.robometer.processor_robometer import RobometerEncoderProcessorStep
+
+
+def _load_frames(path: str) -> np.ndarray:
+    """Load frames from .npy/.npz. Expects (T, H, W, C) uint8."""
+    if path.endswith(".npy"):
+        frames = np.load(path)
+    elif path.endswith(".npz"):
+        with np.load(path, allow_pickle=False) as npz:
+            frames = npz["frames"].copy() if "frames" in npz else next(iter(npz.values())).copy()
+    else:
+        raise ValueError(f"Frames must be .npy or .npz (got {path!r}).")
+
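+    if frames.dtype != np.uint8:
+        # Heuristic: float frames normalised to [0, 1] are rescaled before the
+        # uint8 cast; otherwise every pixel would truncate to 0 or 1.
+        if np.issubdtype(frames.dtype, np.floating) and frames.size and frames.max() <= 1.0:
+            frames = frames * 255.0
+        frames = np.clip(frames, 0, 255).astype(np.uint8)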
+    if frames.ndim != 4:
+        raise ValueError(f"Frames must be 4D (T,H,W,C); got shape {frames.shape}.")
+    if frames.shape[-1] not in (1, 3):
+        # Probably (T,C,H,W) — transpose
+        if frames.shape[1] in (1, 3):
+            frames = frames.transpose(0, 2, 3, 1)
+        else:
+            raise ValueError(f"Cannot interpret frame channel layout: {frames.shape}.")
+    return frames
+
+
+def _run_lerobot(
+    frames: np.ndarray,
+    task: str,
+    model_path: str,
+    device: str,
+) -> tuple[np.ndarray, np.ndarray]:
+    """Run LeRobot's Robometer on the given frames; return (progress, success)."""
+    cfg = RobometerConfig(pretrained_path=model_path, device=device, max_frames=None)
+    model = RobometerRewardModel.from_pretrained(model_path, config=cfg)
+
+    encoder = RobometerEncoderProcessorStep(
+        base_model_id=model.config.base_model_id,
+        use_multi_image=model.config.use_multi_image,
+        use_per_frame_progress_token=model.config.use_per_frame_progress_token,
+        max_frames=None,
+    )
+    batch = encoder.encode_samples([(frames, task)])
+
+    model_device = next(model.model.parameters()).device
+    inputs = {key: value.to(model_device) if hasattr(value, "to") else value for key, value in batch.items()}
+
+    model.eval()
+    with torch.no_grad():
+        progress_logits, success_logits = model._compute_rbm_logits(inputs)
+
+    decoded = decode_progress_outputs(
+        progress_logits,
+        success_logits,
+        is_discrete_mode=model.config.use_discrete_progress,
+    )
+    progress = np.asarray(decoded["progress_pred"][0], dtype=np.float32)
+    success = (
+        np.asarray(decoded["success_probs"][0], dtype=np.float32)
+        if decoded["success_probs"]
+        else np.array([], dtype=np.float32)
+    )
+    return progress, success
+
+
+def _compare(name: str, lerobot: np.ndarray, upstream: np.ndarray, atol: float, rtol: float) -> bool:
+    print(f"\n=== {name} ===")
+    if lerobot.shape != upstream.shape:
+        print(f"shape mismatch: lerobot={lerobot.shape}  upstream={upstream.shape}")
+        return False
+
+    abs_diff = np.abs(lerobot - upstream)
+    rel_diff = abs_diff / (np.abs(upstream) + 1e-12)
+    print(f"shape        : {lerobot.shape}")
+    print(f"max |Δ|      : {abs_diff.max():.3e}")
+    print(f"mean |Δ|     : {abs_diff.mean():.3e}")
+    print(f"max rel |Δ|  : {rel_diff.max():.3e}")
+    print(f"lerobot[:5]  : {lerobot[:5]}")
+    print(f"upstream[:5] : {upstream[:5]}")
+
+    within_tol = bool(np.allclose(lerobot, upstream, atol=atol, rtol=rtol))
+    print(f"allclose(atol={atol}, rtol={rtol}) -> {within_tol}")
+    return within_tol
+
+
+def main() -> int:
+    parser = argparse.ArgumentParser(
+        description=__doc__,
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+    )
+    parser.add_argument(
+        "--frames",
+        required=True,
+        help=".npy / .npz file with the exact frames upstream was run on (T,H,W,C uint8).",
+    )
+    parser.add_argument("--task", required=True, help="Task instruction string.")
+    parser.add_argument(
+        "--upstream-progress",
+        required=True,
+        help="Reference progress .npy saved by upstream example_inference_local.py.",
+    )
+    parser.add_argument(
+        "--upstream-success",
+        default=None,
+        help="Optional reference success_probs .npy. If omitted, success comparison is skipped.",
+    )
+    parser.add_argument(
+        "--lerobot-model",
+        default="lilkm/robometer-4b",
+        help="LeRobot-format Robometer Hub repo id or local path.",
+    )
+    parser.add_argument(
+        "--device",
+        default="cuda" if torch.cuda.is_available() else "cpu",
+        help="Device for the LeRobot model (default: cuda if available).",
+    )
+    parser.add_argument(
+        "--atol",
+        type=float,
+        default=1e-3,
+        help="Absolute tolerance for allclose (default: 1e-3; bf16 round-trip headroom).",
+    )
+    parser.add_argument(
+        "--rtol",
+        type=float,
+        default=1e-2,
+        help="Relative tolerance for allclose (default: 1e-2).",
+    )
+    parser.add_argument(
+        "--out-prefix",
+        default="lerobot_robometer_outputs",
+        help="Save the LeRobot outputs as <prefix>_progress.npy / <prefix>_success.npy.",
+    )
+    args = parser.parse_args()
+
+    # 0. Sanity: confirm the LeRobot config is a RobometerConfig.
+    cfg = RewardModelConfig.from_pretrained(args.lerobot_model)
+    if not isinstance(cfg, RobometerConfig):
+        print(f"ERROR: {args.lerobot_model!r} does not resolve to a RobometerConfig.", file=sys.stderr)
+        return 2
+
+    # 1. Load frames + task + upstream reference outputs.
+    frames = _load_frames(args.frames)
+    upstream_progress = np.load(args.upstream_progress).astype(np.float32)
+    upstream_success = (
+        np.load(args.upstream_success).astype(np.float32) if args.upstream_success is not None else None
+    )
+
+    print(f"Loaded {frames.shape[0]} frames at {frames.shape[1:]}, task={args.task!r}")
+    print(f"LeRobot model: {args.lerobot_model}  device: {args.device}")
+
+    # 2. Run LeRobot pipeline.
+    progress, success = _run_lerobot(frames, args.task, args.lerobot_model, args.device)
+    np.save(f"{args.out_prefix}_progress.npy", progress)
+    print(f"Saved LeRobot progress to {args.out_prefix}_progress.npy")
+    if success.size > 0:
+        np.save(f"{args.out_prefix}_success.npy", success)
+        print(f"Saved LeRobot success probabilities to {args.out_prefix}_success.npy")
+
+    # 3. Compare to upstream references.
+    progress_ok = _compare("progress", progress, upstream_progress, args.atol, args.rtol)
+    if upstream_success is None:
+        success_ok = True
+        print("\n(skipping success comparison: upstream success file not provided)")
+    elif success.size == 0:
+        success_ok = True
+        print("\n(skipping success comparison: LeRobot model produced no success outputs)")
+    else:
+        success_ok = _compare("success_probs", success, upstream_success, args.atol, args.rtol)
+
+    print()
+    if progress_ok and success_ok:
+        print("Parity check passed.")
+        return 0
+    print("Parity check FAILED.")
+    return 1
+
+
+if __name__ == "__main__":
+    sys.exit(main())
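
Reviewer note: the pass criterion in `_compare` relies on `np.allclose`, whose documented element-wise test is `|a - b| <= atol + rtol * |b|`. The snippet below is a minimal stdlib-only sketch of that test with the script's default tolerances; `within_tolerance` is a hypothetical name used for illustration only.

```python
def within_tolerance(pred, ref, atol=1e-3, rtol=1e-2):
    """Element-wise closeness test in the style of ``numpy.allclose``.

    Each prediction must satisfy |pred - ref| <= atol + rtol * |ref|.
    A length mismatch counts as a failure, mirroring the shape check
    the script performs before comparing values.
    """
    if len(pred) != len(ref):
        return False
    return all(abs(p - r) <= atol + rtol * abs(r) for p, r in zip(pred, ref))


print(within_tolerance([0.5000], [0.5009]))  # diff 9e-4, budget 1e-3 + 5.009e-3
print(within_tolerance([0.0], [0.002]))      # diff 2e-3, budget 1e-3 + 2e-5
```

The relative term scales with the reference value, so predictions near zero are effectively judged by `atol` alone.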
@@ -0,0 +1,362 @@
+#!/usr/bin/env python
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+"""Run LeRobot Robometer parity against upstream Robometer's bundled examples.
+
+Upstream Robometer ships three reference videos with their pre-computed
+progress / success outputs at
+``third_party/robometer/scripts/example_videos/``::
+
+    soar_put_green_stick_in_brown_bowl.mp4
+        + soar_put_green_stick_in_brown_bowl_rewards.npy            (progress)
+        + soar_put_green_stick_in_brown_bowl_rewards_success_probs.npy (success)
+    berkeley_rpt_stack_cup.mp4
+        + berkeley_rpt_stack_cup_rewards.npy
+        + berkeley_rpt_stack_cup_rewards_success_probs.npy
+    jaco_play_pick_up_green_cup.mp4
+        + pick_up_green_cup_rewards.npy
+        + pick_up_green_cup_rewards_success_probs.npy
+
+This script:
+1. Decodes each video with ``decord`` (or PyAV as a fallback) and samples as
+   many frames as the upstream reference ``.npy`` contains, using the same
+   linspace-over-total-frames logic as upstream's ``extract_frames``.
+2. Runs the LeRobot ``RobometerRewardModel`` on those frames + the task from
+   upstream's README.
+3. Compares per-frame progress / success to the pre-saved upstream outputs.
+
+This means you do **not** need to install upstream Robometer to confirm parity.
+
+Run::
+
+    uv run python scripts/parity_robometer_upstream_examples.py \\
+        --lerobot-model lilkm/robometer-4b \\
+        --device cuda \\
+        --decoder decord
+
+The number of frames sampled per video is derived from the length of each
+upstream ``.npy`` reference, so the script does not need a ``--fps`` argument
+(the README documents ``fps=3`` for SOAR / Berkeley, but the Jaco Play
+reference was generated with a different fps).
+"""
+
+from __future__ import annotations
+
+import argparse
+import sys
+from pathlib import Path
+
+import numpy as np
+import torch
+
+from lerobot.configs.rewards import RewardModelConfig
+from lerobot.rewards.robometer import RobometerConfig, RobometerRewardModel
+from lerobot.rewards.robometer.modeling_robometer import decode_progress_outputs
+from lerobot.rewards.robometer.processor_robometer import RobometerEncoderProcessorStep
+
+try:
+    import decord  # type: ignore
+
+    _HAS_DECORD = True
+except ImportError:
+    decord = None  # type: ignore
+    _HAS_DECORD = False
+
+try:
+    import av
+
+    _HAS_AV = True
+except ImportError:
+    av = None  # type: ignore
+    _HAS_AV = False
+
+EXAMPLES = [
+    {
+        "name": "soar_put_green_stick_in_brown_bowl",
+        "video": "soar_put_green_stick_in_brown_bowl.mp4",
+        "task": "Put green stick in brown bowl",
+        "progress_npy": "soar_put_green_stick_in_brown_bowl_rewards.npy",
+        "success_npy": "soar_put_green_stick_in_brown_bowl_rewards_success_probs.npy",
+    },
+    {
+        "name": "berkeley_rpt_stack_cup",
+        "video": "berkeley_rpt_stack_cup.mp4",
+        "task": "Pick up the yellow cup and stack it on the other cup",
+        "progress_npy": "berkeley_rpt_stack_cup_rewards.npy",
+        "success_npy": "berkeley_rpt_stack_cup_rewards_success_probs.npy",
+    },
+    {
+        "name": "jaco_play_pick_up_green_cup",
+        "video": "jaco_play_pick_up_green_cup.mp4",
+        "task": "Pick up the green cup",
+        "progress_npy": "pick_up_green_cup_rewards.npy",
+        "success_npy": "pick_up_green_cup_rewards_success_probs.npy",
+    },
+]
+
+
+def _extract_frames_decord(video_path: Path, num_frames: int) -> tuple[np.ndarray, str]:
+    """Sample ``num_frames`` indices uniformly from the video using decord.
+
+    Mirrors upstream's ``extract_frames`` indexing
+    (``third_party/robometer/scripts/example_inference.py``): a
+    ``np.linspace(0, total_frames-1, num_frames)`` lookup over decord's
+    ``VideoReader``. We pass ``num_frames`` explicitly (derived from the
+    upstream reference output length) so we don't have to guess what ``fps``
+    upstream actually used when generating each saved ``.npy`` — the file
+    length is the ground truth.
+    """
+    vr = decord.VideoReader(str(video_path), num_threads=1)
+    total_frames = len(vr)
+    if total_frames == 0:
+        raise RuntimeError(f"No decodable frames in {video_path}.")
+    desired_frames = max(1, min(int(num_frames), total_frames))
+    indices = np.linspace(0, total_frames - 1, desired_frames, dtype=int).tolist()
+    frames = vr.get_batch(indices).asnumpy()
+    native_fps = float(vr.get_avg_fps()) or 1.0
+    return frames, f"decord total={total_frames} native_fps={native_fps:.3f}"
+
+
+def _extract_frames_av(video_path: Path, num_frames: int) -> tuple[np.ndarray, str]:
+    """PyAV fallback for environments without decord.
+
+    PyAV and decord can disagree on ``total_frames`` for the same container,
+    so the sampled frame indices can drift. Install ``decord`` for a real
+    parity check; this fallback is for smoke tests only.
+    """
+    rgb_frames: list[np.ndarray] = []
+    # The context manager closes the container even if decoding raises.
+    with av.open(str(video_path)) as container:
+        stream = container.streams.video[0]
+        native_fps = float(stream.average_rate) if stream.average_rate else float(stream.guessed_rate or 30.0)
+        for frame in container.decode(stream):
+            rgb_frames.append(frame.to_ndarray(format="rgb24"))
+    total_frames = len(rgb_frames)
+    if total_frames == 0:
+        raise RuntimeError(f"No decodable frames in {video_path}.")
+    desired_frames = max(1, min(int(num_frames), total_frames))
+    indices = np.linspace(0, total_frames - 1, desired_frames, dtype=int)
+    frames = np.stack([rgb_frames[i] for i in indices])
+    return frames, f"av total={total_frames} native_fps={native_fps:.3f}"
+
+
+def _extract_frames(video_path: Path, num_frames: int, prefer: str) -> tuple[np.ndarray, str]:
+    """Decoder dispatch. ``prefer`` is ``"decord"`` | ``"av"`` | ``"auto"``."""
+    if prefer == "decord":
+        if not _HAS_DECORD:
+            raise RuntimeError("decord requested but not installed (`uv pip install decord`).")
+        return _extract_frames_decord(video_path, num_frames)
+    if prefer == "av":
+        if not _HAS_AV:
+            raise RuntimeError("av requested but not installed.")
+        return _extract_frames_av(video_path, num_frames)
+    # auto
+    if _HAS_DECORD:
+        return _extract_frames_decord(video_path, num_frames)
+    if _HAS_AV:
+        return _extract_frames_av(video_path, num_frames)
+    raise RuntimeError("No video decoder available (install `decord` or `av`).")
+
+
+def _pearson(a: np.ndarray, b: np.ndarray) -> float:
+    """Pearson correlation; returns 1.0 if either input is constant or has fewer than two elements."""
+    a = a.astype(np.float64)
+    b = b.astype(np.float64)
+    if a.size < 2:
+        return 1.0
+    da = a - a.mean()
+    db = b - b.mean()
+    denom = float(np.sqrt((da * da).sum()) * np.sqrt((db * db).sum()))
+    if denom == 0:
+        return 1.0
+    return float((da * db).sum() / denom)
+
+
+def _run_lerobot(
+    model: RobometerRewardModel,
+    encoder: RobometerEncoderProcessorStep,
+    frames: np.ndarray,
+    task: str,
+) -> tuple[np.ndarray, np.ndarray]:
+    batch = encoder.encode_samples([(frames, task)])
+    device = next(model.model.parameters()).device
+    inputs = {key: value.to(device) if hasattr(value, "to") else value for key, value in batch.items()}
+    model.eval()
+    with torch.no_grad():
+        progress_logits, success_logits = model._compute_rbm_logits(inputs)
+    decoded = decode_progress_outputs(
+        progress_logits, success_logits, is_discrete_mode=model.config.use_discrete_progress
+    )
+    progress = np.asarray(decoded["progress_pred"][0], dtype=np.float32)
+    success = (
+        np.asarray(decoded["success_probs"][0], dtype=np.float32)
+        if decoded["success_probs"]
+        else np.array([], dtype=np.float32)
+    )
+    return progress, success
+
+
+def _compare(
+    name: str,
+    lerobot: np.ndarray,
+    upstream: np.ndarray,
+    *,
+    atol: float,
+    pearson_min: float,
+) -> bool:
+    if lerobot.shape != upstream.shape:
+        print(f"  {name:8s}  SHAPE MISMATCH lerobot={lerobot.shape} upstream={upstream.shape}")
+        return False
+    abs_diff = np.abs(lerobot - upstream)
+    pearson = _pearson(lerobot, upstream)
+    abs_ok = bool(abs_diff.max() <= atol)
+    pearson_ok = bool(pearson >= pearson_min)
+    verdict = "PASS" if (abs_ok or pearson_ok) else "FAIL"
+    print(
+        f"  {name:8s}  shape={lerobot.shape}  max|Δ|={abs_diff.max():.3e}  "
+        f"mean|Δ|={abs_diff.mean():.3e}  pearson={pearson:.4f}  "
+        f"(atol={atol:.0e} pearson_min={pearson_min:.3f}) -> {verdict}"
+    )
+    return abs_ok or pearson_ok
+
+
+def main() -> int:
+    parser = argparse.ArgumentParser(
+        description=__doc__,
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+    )
+    parser.add_argument(
+        "--examples-dir",
+        type=Path,
+        default=Path("third_party/robometer/scripts/example_videos"),
+        help="Directory containing the upstream Robometer example mp4s + .npy outputs.",
+    )
+    parser.add_argument(
+        "--lerobot-model",
+        default="lilkm/robometer-4b",
+        help="LeRobot-format Robometer Hub repo id or local path.",
+    )
+    parser.add_argument(
+        "--device",
+        default="cuda" if torch.cuda.is_available() else "cpu",
+        help="Device for the LeRobot model.",
+    )
+    parser.add_argument(
+        "--decoder",
+        choices=("auto", "decord", "av"),
+        default="auto",
+        help=(
+            "Video decoder. ``auto`` prefers decord (matches upstream) and falls back to av. "
+            "Force ``decord`` for a clean parity check."
+        ),
+    )
+    parser.add_argument(
+        "--progress-atol",
+        type=float,
+        default=1e-2,
+        help="Absolute tolerance for the progress array. Default 1e-2 covers CUDA bf16 noise.",
+    )
+    parser.add_argument(
+        "--success-atol",
+        type=float,
+        default=1e-1,
+        help=(
+            "Absolute tolerance for the success array. Looser than progress because "
+            "``sigmoid`` amplifies logit-space noise near 0.5."
+        ),
+    )
+    parser.add_argument(
+        "--pearson-min",
+        type=float,
+        default=0.99,
+        help=(
+            "Minimum Pearson correlation (per array). An array passes when either "
+            "max|Δ| <= atol or the correlation reaches this value."
+        ),
+    )
+    args = parser.parse_args()
+
+    if args.decoder == "av" or (args.decoder == "auto" and not _HAS_DECORD):
+        print(
+            "WARNING: using PyAV decoder. PyAV's total-frame count can differ from decord's, "
+            "which propagates into different sampled-frame indices. Install `decord` and "
+            "re-run for a clean parity check.",
+            file=sys.stderr,
+        )
+
+    examples_dir = args.examples_dir.resolve()
+    if not examples_dir.is_dir():
+        print(f"ERROR: examples dir {examples_dir} does not exist.", file=sys.stderr)
+        return 2
+
+    # Sanity-check the LeRobot config is a RobometerConfig before loading weights.
+    cfg = RewardModelConfig.from_pretrained(args.lerobot_model)
+    if not isinstance(cfg, RobometerConfig):
+        print(f"ERROR: {args.lerobot_model!r} did not resolve to a RobometerConfig.", file=sys.stderr)
+        return 2
+
+    print(f"Loading LeRobot Robometer from {args.lerobot_model} on {args.device}...")
+    cfg.pretrained_path = args.lerobot_model
+    cfg.device = args.device
+    model = RobometerRewardModel.from_pretrained(args.lerobot_model, config=cfg)
+    encoder = RobometerEncoderProcessorStep(
+        base_model_id=model.config.base_model_id,
+        use_multi_image=model.config.use_multi_image,
+        use_per_frame_progress_token=model.config.use_per_frame_progress_token,
+        max_frames=None,
+    )
+
+    all_ok = True
+    for ex in EXAMPLES:
+        video_path = examples_dir / ex["video"]
+        upstream_progress_path = examples_dir / ex["progress_npy"]
+        upstream_success_path = examples_dir / ex["success_npy"]
+
+        missing = [p for p in (video_path, upstream_progress_path, upstream_success_path) if not p.exists()]
+        if missing:
+            print(f"[skip] {ex['name']}: missing {[str(m) for m in missing]}")
+            all_ok = False
+            continue
+
+        print(f"\n=== {ex['name']} ===")
+        print(f"  task: {ex['task']!r}")
+
+        # Trust the upstream reference array as the source of truth for how
+        # many frames to sample. The README documents fps=3 for SOAR/Berkeley
+        # but Jaco Play was generated with a different fps, so any hardcoded
+        # ``--fps`` mismatches at least one example. The npy length always
+        # tells us what upstream actually used.
+        upstream_progress = np.load(upstream_progress_path).astype(np.float32)
+        upstream_success = np.load(upstream_success_path).astype(np.float32)
+        target_num_frames = int(upstream_progress.shape[0])
+        frames, decoder_info = _extract_frames(video_path, target_num_frames, prefer=args.decoder)
+        print(
+            f"  decoded {frames.shape[0]} frames (target from upstream npy: {target_num_frames}); "
+            f"shape={frames.shape}  [{decoder_info}]"
+        )
+
+        progress, success = _run_lerobot(model, encoder, frames, ex["task"])
+
+        progress_ok = _compare(
+            "progress",
+            progress,
+            upstream_progress,
+            atol=args.progress_atol,
+            pearson_min=args.pearson_min,
+        )
+        success_ok = _compare(
+            "success",
+            success,
+            upstream_success,
+            atol=args.success_atol,
+            pearson_min=args.pearson_min,
+        )
+        verdict = "PASS" if (progress_ok and success_ok) else "FAIL"
+        print(f"  -> {verdict}")
+        all_ok = all_ok and progress_ok and success_ok
+
+    print()
+    if all_ok:
+        print("All upstream example parity checks passed.")
+        return 0
+    print("Some upstream example parity checks FAILED.")
+    return 1
+
+
+if __name__ == "__main__":
+    sys.exit(main())
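The parity script above gates each array on an absolute tolerance (`--progress-atol`, `--success-atol`) and a minimum Pearson correlation (`--pearson-min`). The `_compare` helper is not shown in this hunk, so the following is only a standard-library sketch of one plausible way to combine the two criteria. The name `compare_arrays` and the rule that both criteria must hold are assumptions, not the script's confirmed behavior.

```python
import math


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant array
    return cov / math.sqrt(var_x * var_y)


def compare_arrays(ours: list[float], ref: list[float], atol: float, pearson_min: float) -> bool:
    """Hypothetical pass rule: max |diff| within atol AND correlation above the floor."""
    if len(ours) != len(ref) or not ours:
        return False
    max_abs = max(abs(a - b) for a, b in zip(ours, ref))
    r = pearson(ours, ref)
    # A NaN correlation compares as False, so constant arrays fail the check.
    return max_abs <= atol and r >= pearson_min
```

Requiring both criteria catches two different failure modes: a constant offset keeps the correlation high but breaks the tolerance, while a frame-index shift can stay within tolerance on slowly varying curves yet lower the correlation.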
@@ -0,0 +1,149 @@
+#!/usr/bin/env python
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+
+"""Verify that a LeRobot-format Robometer is numerically equivalent to its upstream source.
+
+Run this once after publishing a LeRobot-format Robometer to the Hub, before
+flipping the default `RobometerConfig.pretrained_path` to it. It loads both
+the upstream snapshot and the re-exported copy, compares state dicts, and
+prints a clear pass/fail summary.
+
+Example:
+
+    python scripts/verify_robometer_export.py \\
+        --upstream robometer/Robometer-4B \\
+        --lerobot  lerobot/robometer-4b
+
+    python scripts/verify_robometer_export.py \\
+        --upstream robometer/Robometer-4B \\
+        --lerobot  ./robometer-4b-lerobot   # local folder also works
+"""
+
+from __future__ import annotations
+
+import argparse
+import sys
+
+from lerobot.configs.rewards import RewardModelConfig
+from lerobot.rewards.robometer import RobometerConfig, RobometerRewardModel
+from lerobot.rewards.robometer._upstream_loader import apply_upstream_checkpoint
+
+
+def _load_upstream(path: str) -> RobometerRewardModel:
+    # Fresh ``RobometerConfig`` (``vlm_config=None``) triggers
+    # ``RobometerRewardModel.__init__``'s upstream-matching path: download
+    # base Qwen, resize for ROBOMETER_SPECIAL_TOKENS. The subsequent
+    # ``apply_upstream_checkpoint`` call resizes again if the checkpoint's
+    # vocab differs (e.g. upstream was trained against an older Qwen).
+    cfg = RobometerConfig(pretrained_path=path, device="cpu")
+    model = RobometerRewardModel(cfg)
+    apply_upstream_checkpoint(model, path)
+    model.eval()
+    return model
+
+
+def _load_lerobot(path: str) -> RobometerRewardModel:
+    cfg = RewardModelConfig.from_pretrained(path)
+    if not isinstance(cfg, RobometerConfig):
+        raise TypeError(f"Expected RobometerConfig in LeRobot export, got {type(cfg)}")
+    cfg.pretrained_path = path
+    cfg.device = "cpu"
+    return RobometerRewardModel.from_pretrained(path, config=cfg)
+
+
+def compare_state_dicts(a: RobometerRewardModel, b: RobometerRewardModel) -> bool:
+    sd_a, sd_b = a.state_dict(), b.state_dict()
+    keys_a, keys_b = set(sd_a), set(sd_b)
+
+    missing = keys_a - keys_b
+    extra = keys_b - keys_a
+    if missing:
+        print(f"❌ {len(missing)} keys missing in LeRobot-format model (sample: {list(missing)[:5]})")
+    if extra:
+        print(f"❌ {len(extra)} extra keys in LeRobot-format model (sample: {list(extra)[:5]})")
+    if missing or extra:
+        return False
+
+    diff_summary: list[tuple[str, float]] = []
+    for key in sorted(keys_a):
+        ta, tb = sd_a[key], sd_b[key]
+        if ta.shape != tb.shape:
+            print(f"❌ shape mismatch at {key}: {tuple(ta.shape)} vs {tuple(tb.shape)}")
+            return False
+        # Compare in float to avoid bfloat16 equality quirks.
+        max_abs = (ta.float() - tb.float()).abs().max().item()
+        if max_abs > 0:
+            diff_summary.append((key, max_abs))
+
+    if not diff_summary:
+        print(f"✅ All {len(keys_a)} parameters identical")
+        return True
+
+    # Some keys differ; show worst offenders.
+    diff_summary.sort(key=lambda kv: kv[1], reverse=True)
+    print(f"⚠️  {len(diff_summary)} keys differ. Top 10 by max abs diff:")
+    for key, value in diff_summary[:10]:
+        print(f"    {key:60s}  max|Δ| = {value:.3e}")
+
+    # Tolerance: bf16 round-trips can introduce ULP-level noise but no real
+    # change. Allow up to 1e-3 absolute difference; anything larger is a real
+    # divergence.
+    worst = diff_summary[0][1]
+    if worst <= 1e-3:
+        print(f"✅ Worst diff {worst:.3e} is within bf16 round-trip tolerance")
+        return True
+    print(f"❌ Worst diff {worst:.3e} exceeds tolerance (1e-3)")
+    return False
+
+
+def main() -> int:
+    parser = argparse.ArgumentParser(
+        description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter
+    )
+    parser.add_argument("--upstream", required=True, help="Upstream Robometer repo id or local path.")
+    parser.add_argument("--lerobot", required=True, help="LeRobot-format Robometer repo id or local path.")
+    args = parser.parse_args()
+
+    print(f"Loading upstream:        {args.upstream}")
+    upstream = _load_upstream(args.upstream)
+    print(f"Loading LeRobot-format:  {args.lerobot}")
+    lerobot = _load_lerobot(args.lerobot)
+
+    print("\n=== Config comparison ===")
+    config_ok = True
+    for field in [
+        "base_model_id",
+        "torch_dtype",
+        "use_multi_image",
+        "use_per_frame_progress_token",
+        "average_temporal_patches",
+        "frame_pooling",
+        "frame_pooling_attn_temperature",
+        "progress_loss_type",
+        "progress_discrete_bins",
+    ]:
+        a, b = getattr(upstream.config, field), getattr(lerobot.config, field)
+        field_ok = a == b
+        config_ok = config_ok and field_ok
+        ok = "✅" if field_ok else "❌"
+        print(f"  {ok} {field}: upstream={a!r}, lerobot={b!r}")
+
+    print("\n=== State-dict comparison ===")
+    state_dict_ok = compare_state_dicts(upstream, lerobot)
+
+    print()
+    if config_ok and state_dict_ok:
+        print("🎉 Verification passed — safe to flip the default.")
+        return 0
+    print("⛔ Verification failed — DO NOT flip the default.")
+    return 1
+
+
+if __name__ == "__main__":
+    sys.exit(main())
@@ -99,6 +99,7 @@ def save_checkpoint(
        optimizer (Optimizer | None, optional): The optimizer to save the state from. Defaults to None.
        scheduler (LRScheduler | None, optional): The scheduler to save the state from. Defaults to None.
        preprocessor: The preprocessor/pipeline to save. Defaults to None.
+        postprocessor: The postprocessor/pipeline to save. Defaults to None.
    """
    pretrained_dir = checkpoint_dir / PRETRAINED_MODEL_DIR
    policy.save_pretrained(pretrained_dir)
@@ -117,3 +117,9 @@ class PeftConfig:
    # the rank used for the adapter. In general a higher rank means more trainable parameters and closer to full
    # fine-tuning.
    r: int = 16
+
+    # Alpha parameter for LoRA scaling (scaling = lora_alpha / r).
+    # In general, a higher alpha means stronger adaptation signal.
+    # If None, the PEFT library defaults to alpha=8, which may dampen high-rank adapters.
+    # Common values are r (alpha == rank) or 2*r.
+    lora_alpha: int | None = None
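The comment above states that the LoRA scaling is `lora_alpha / r` and that PEFT falls back to `alpha=8` when the field is `None`. A small worked example makes the dampening effect concrete. The helper `lora_scaling` is hypothetical and only restates that arithmetic; it is not part of PEFT or LeRobot.

```python
from __future__ import annotations


def lora_scaling(r: int, lora_alpha: int | None, peft_default_alpha: int = 8) -> float:
    """Effective LoRA scaling factor, using the fallback alpha described in the comment above."""
    alpha = peft_default_alpha if lora_alpha is None else lora_alpha
    return alpha / r


# With the default rank r=16 and no explicit alpha, updates are scaled by 0.5.
print(lora_scaling(16, None))
# A higher rank with the same fallback alpha is dampened further.
print(lora_scaling(64, None))
# The common choices alpha == r and alpha == 2 * r give scalings of 1.0 and 2.0.
print(lora_scaling(16, 16), lora_scaling(16, 32))
```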
@@ -46,8 +46,11 @@ class EvalPipelineConfig:
        # HACK: We parse again the cli args here to get the pretrained path if there was one.
        policy_path = parser.get_path_arg("policy")
        if policy_path:
-            cli_overrides = parser.get_cli_overrides("policy")
-            self.policy = PreTrainedConfig.from_pretrained(policy_path, cli_overrides=cli_overrides)
+            yaml_overrides = parser.get_yaml_overrides("policy")
+            cli_overrides = parser.get_cli_overrides("policy") or []
+            self.policy = PreTrainedConfig.from_pretrained(
+                policy_path, cli_overrides=yaml_overrides + cli_overrides
+            )
            self.policy.pretrained_path = Path(policy_path)

        else:
@@ -13,8 +13,10 @@
 # limitations under the License.
 import importlib
 import inspect
+import json
 import pkgutil
 import sys
+import tempfile
 from argparse import ArgumentError
 from collections.abc import Callable, Iterable, Sequence
 from functools import wraps
@@ -24,6 +26,7 @@ from types import ModuleType
 from typing import Any, TypeVar, cast

 import draccus
+import yaml  # type: ignore[import-untyped]

 from lerobot.utils.utils import has_method

@@ -32,6 +35,29 @@ F = TypeVar("F", bound=Callable[..., object])
 PATH_KEY = "path"
 PLUGIN_DISCOVERY_SUFFIX = "discover_packages_path"

+# Storage for path args extracted from YAML/JSON config files, so that
+# get_path_arg() can find them even when they weren't passed via CLI.
+_config_path_args: dict[str, str] = {}
+
+# Storage for non-path YAML overrides so validate() can pass them to from_pretrained.
+_config_yaml_overrides: dict[str, list[str]] = {}
+
+
+def _flatten_to_cli_args(d: dict, prefix: str = "") -> list[str]:
+    """Recursively flatten a nested dict to CLI-style args (e.g. {"lr": 1e-4} -> ["--lr=0.0001"])."""
+    args = []
+    for key, value in d.items():
+        if key in (PATH_KEY, draccus.CHOICE_TYPE_KEY):
+            continue
+        full_key = f"{prefix}.{key}" if prefix else key
+        if isinstance(value, bool):
+            value = str(value).lower()
+        if isinstance(value, dict):
+            args.extend(_flatten_to_cli_args(value, full_key))
+        elif value is not None and not isinstance(value, list):
+            args.append(f"--{full_key}={value}")
+    return args
+

 def get_cli_overrides(field_name: str, args: Sequence[str] | None = None) -> list[str] | None:
    """Parses arguments from cli at a given nested attribute level.
@@ -145,7 +171,14 @@ def load_plugin(plugin_path: str) -> None:


 def get_path_arg(field_name: str, args: Sequence[str] | None = None) -> str | None:
-    return parse_arg(f"{field_name}.{PATH_KEY}", args)
+    result = parse_arg(f"{field_name}.{PATH_KEY}", args)
+    if result is None:
+        result = _config_path_args.get(field_name)
+    return result
+
+
+def get_yaml_overrides(field_name: str) -> list[str]:
+    return _config_yaml_overrides.get(field_name, [])


 def get_type_arg(field_name: str, args: Sequence[str] | None = None) -> str | None:
@@ -192,6 +225,52 @@ def filter_path_args(fields_to_filter: str | list[str], args: Sequence[str] | No
    return filtered_args


+def extract_path_fields_from_config(config_path: str, path_fields: list[str]) -> str:
+    """Extract `path` fields from a YAML/JSON config before draccus processes it.
+
+    When a user specifies e.g. ``policy.path: lerobot/smolvla_base`` in a YAML config,
+    draccus will fail because ``path`` is not a valid field on policy config classes.
+    This function extracts those path values, stores them in ``_config_path_args`` for
+    later retrieval by ``get_path_arg()``, and returns a cleaned temp config file path.
+    """
+    config_file = Path(config_path)
+    suffix = config_file.suffix.lower()
+
+    if suffix in (".yaml", ".yml"):
+        with open(config_file) as f:
+            config_data = yaml.safe_load(f)
+    elif suffix == ".json":
+        with open(config_file) as f:
+            config_data = json.load(f)
+    else:
+        return config_path
+
+    if not isinstance(config_data, dict):
+        return config_path
+
+    modified = False
+    for field in path_fields:
+        if field in config_data and isinstance(config_data[field], dict) and PATH_KEY in config_data[field]:
+            _config_path_args[field] = str(config_data[field].pop(PATH_KEY))
+            remaining = config_data[field]
+            if remaining:
+                _config_yaml_overrides[field] = _flatten_to_cli_args(remaining)
+            else:
+                del config_data[field]
+            modified = True
+
+    if not modified:
+        return config_path
+
+    # Write cleaned config to a temp file (delete=False: the caller reads it later; it is not removed here)
+    with tempfile.NamedTemporaryFile(mode="w", suffix=suffix, delete=False) as tmp:
+        if suffix in (".yaml", ".yml"):
+            yaml.dump(config_data, tmp, default_flow_style=False)
+        else:
+            json.dump(config_data, tmp, indent=2)
+    return tmp.name
+
+
 def wrap(config_path: Path | None = None) -> Callable[[F], F]:
    """
    HACK: Similar to draccus.wrap but does three additional things:
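`extract_path_fields_from_config` relies on `_flatten_to_cli_args` to turn the keys that remain next to `path` into CLI-style overrides. The sketch below restates that flattening in standalone form so its edge cases are visible: booleans are lowercased, nested dicts become dotted keys, and `None` and list values are skipped. `SKIPPED_KEYS` is an assumption standing in for `PATH_KEY` and `draccus.CHOICE_TYPE_KEY` (assumed here to be `"path"` and `"type"`).

```python
# Assumed values of PATH_KEY and draccus.CHOICE_TYPE_KEY; hypothetical stand-in.
SKIPPED_KEYS = {"path", "type"}


def flatten_to_cli_args(d: dict, prefix: str = "") -> list[str]:
    """Flatten a nested dict into ``--dotted.key=value`` strings."""
    args: list[str] = []
    for key, value in d.items():
        if key in SKIPPED_KEYS:
            continue
        full_key = f"{prefix}.{key}" if prefix else key
        if isinstance(value, bool):
            value = str(value).lower()  # True -> "true", as a CLI parser would expect
        if isinstance(value, dict):
            args.extend(flatten_to_cli_args(value, full_key))
        elif value is not None and not isinstance(value, list):
            args.append(f"--{full_key}={value}")
    return args


policy_section = {
    "path": "lerobot/smolvla_base",
    "optimizer": {"lr": 1e-4, "betas": [0.9, 0.999]},
    "use_amp": True,
    "device": None,
}
print(flatten_to_cli_args(policy_section))
```

Note that list values such as `betas` are dropped without a warning in this scheme, so a list set next to `path` in the YAML would not reach `from_pretrained` as an override.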
@@ -225,6 +304,9 @@ def wrap(config_path: Path | None = None) -> Callable[[F], F]:
                if has_method(argtype, "__get_path_fields__"):
                    path_fields = argtype.__get_path_fields__()
                    cli_args = filter_path_args(path_fields, cli_args)
+                    # Also extract path fields from the YAML/JSON config file
+                    if config_path_cli:
+                        config_path_cli = extract_path_fields_from_config(config_path_cli, path_fields)
                if has_method(argtype, "from_pretrained") and config_path_cli:
                    cli_args = filter_arg("config_path", cli_args)
                    cfg = argtype.from_pretrained(config_path_cli, cli_args=cli_args)
@@ -89,9 +89,16 @@ class RewardModelConfig(draccus.ChoiceRegistry, HubMixin, abc.ABC):
    def reward_delta_indices(self) -> list | None:  # type: ignore[type-arg]
        return None

-    @abc.abstractmethod
-    def get_optimizer_preset(self) -> OptimizerConfig:
-        raise NotImplementedError
+    def get_optimizer_preset(self) -> OptimizerConfig | None:
+        """Default optimizer for this reward model, or ``None`` for zero-shot models.
+
+        Trainable reward models (e.g. SARM, Classifier) must override this with a
+        concrete optimizer config. Zero-shot reward models (e.g. Robometer) leave
+        the default ``None`` — they error out earlier via the
+        :attr:`~lerobot.rewards.pretrained.PreTrainedRewardModel.is_trainable`
+        check in ``lerobot-train``.
+        """
+        return None

    def get_scheduler_preset(self) -> LRSchedulerConfig | None:
        return None
@@ -144,8 +144,11 @@ class TrainPipelineConfig(HubMixin):
            )
            self.reward_model.pretrained_path = str(Path(reward_model_path))
        elif policy_path:
-            cli_overrides = parser.get_cli_overrides("policy")
-            self.policy = PreTrainedConfig.from_pretrained(policy_path, cli_overrides=cli_overrides)
+            yaml_overrides = parser.get_yaml_overrides("policy")
+            cli_overrides = parser.get_cli_overrides("policy") or []
+            self.policy = PreTrainedConfig.from_pretrained(
+                policy_path, cli_overrides=yaml_overrides + cli_overrides
+            )
            self.policy.pretrained_path = Path(policy_path)
        elif self.resume:
            config_path = parser.parse_arg("config_path")
@@ -269,10 +272,3 @@ class TrainPipelineConfig(HubMixin):

        with draccus.config_type("json"):
            return draccus.parse(cls, config_file, args=cli_args)
-
-
-@dataclass(kw_only=True)
-class TrainRLServerPipelineConfig(TrainPipelineConfig):
-    # NOTE: In RL, we don't need an offline dataset
-    # TODO: Make `TrainPipelineConfig.dataset` optional
-    dataset: DatasetConfig | None = None  # type: ignore[assignment] # because the parent class has made it's type non-optional
@@ -14,6 +14,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 import contextlib
+from collections.abc import Callable
 from pathlib import Path

 import numpy as np
@@ -189,6 +190,29 @@ class LeRobotDatasetMetadata:
        if self.episodes is None:
            self._load_metadata()

+    def filter_episodes(
+        self,
+        predicate: Callable[[dict], bool],
+        candidates: list[int] | None = None,
+    ) -> list[int]:
+        """Filter episodes whose metadata satisfies a given predicate.
+
+        Args:
+            predicate: Predicate over per-episode metadata rows used to select episodes.
+            candidates: Optional list of episode indices to restrict evaluation to.
+
+        Returns:
+            Sorted list of episode indices that satisfy the predicate.
+        """
+        self.ensure_readable()
+        if candidates is not None:
+            candidate_set = set(candidates)
+            combined = lambda ep: ep["episode_index"] in candidate_set and predicate(ep)  # noqa: E731
+        else:
+            combined = predicate
+        filtered = self.episodes.filter(combined, keep_in_memory=True, load_from_cache_file=False)
+        return sorted(int(idx) for idx in filtered["episode_index"])
+
    def _pull_from_repo(
        self,
        allow_patterns: list[str] | str | None = None,
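The selection logic of `filter_episodes` (predicate, optionally intersected with a candidate list, result sorted) can be sketched on plain dictionaries without the `datasets` backend. The function `filter_episode_indices` and the sample rows are hypothetical; only the `episode_index` and `length` keys come from the docstring in this change.

```python
from __future__ import annotations

from collections.abc import Callable


def filter_episode_indices(
    rows: list[dict],
    predicate: Callable[[dict], bool],
    candidates: list[int] | None = None,
) -> list[int]:
    """Return sorted episode indices whose row satisfies the predicate."""
    candidate_set = None if candidates is None else set(candidates)
    return sorted(
        int(row["episode_index"])
        for row in rows
        # Check candidate membership first so the predicate only runs on allowed episodes.
        if (candidate_set is None or row["episode_index"] in candidate_set) and predicate(row)
    )


rows = [
    {"episode_index": 2, "length": 150},
    {"episode_index": 0, "length": 80},
    {"episode_index": 1, "length": 120},
]
long_enough = lambda ep: ep["length"] >= 100  # noqa: E731

print(filter_episode_indices(rows, long_enough))
print(filter_episode_indices(rows, long_enough, candidates=[0, 1]))
```

An empty result is possible when the candidates and the predicate do not overlap, which is the case the `LeRobotDataset` constructor turns into a `ValueError`.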
@@ -49,6 +49,7 @@ class LeRobotDataset(torch.utils.data.Dataset):
        repo_id: str,
        root: str | Path | None = None,
        episodes: list[int] | None = None,
+        episode_filter: Callable[[dict], bool] | None = None,
        image_transforms: Callable | None = None,
        delta_timestamps: dict[str, list[float]] | None = None,
        tolerance_s: float = 1e-4,
@@ -153,6 +154,11 @@ class LeRobotDataset(torch.utils.data.Dataset):
                ``$HF_LEROBOT_HOME/hub``.
            episodes (list[int] | None, optional): If specified, this will only load episodes specified by
                their episode_index in this list. Defaults to None.
+            episode_filter (Callable[[dict], bool] | None, optional): Predicate over per-episode
+                metadata rows used to select episodes. Evaluated against ``meta/`` without ``stats`` keys
+                (e.g. ``task_index``, ``episode_index``, ``length``, ``from_timestamp``, ``to_timestamp``).
+                Intersected with ``episodes`` when both are set. Example: ``lambda ep: ep["length"] >= 100``.
+                Defaults to None.
            image_transforms (Callable | None, optional):
                Transform applied to visual modalities inside `__getitem__` after image decoding / tensor
                conversion. This works for both image-backed and video-backed observations and can later be
@@ -199,7 +205,6 @@ class LeRobotDataset(torch.utils.data.Dataset):
        self.reader = None
        self.set_image_transforms(image_transforms)
        self.delta_timestamps = delta_timestamps
-        self.episodes = episodes
        self.tolerance_s = tolerance_s
        self.revision = revision if revision else CODEBASE_VERSION
        self._video_backend = video_backend if video_backend else get_safe_default_codec()
@@ -218,6 +223,23 @@ class LeRobotDataset(torch.utils.data.Dataset):
        self.root = self.meta.root
        self.revision = self.meta.revision

+        if episodes is not None and any(
+            episode >= self.meta.total_episodes or episode < 0 for episode in episodes
+        ):
+            logger.warning(
+                f"Some episodes in the provided episodes list are out of range for this dataset ({self.meta.total_episodes})."
+            )
+
+        if episode_filter is not None:
+            resolved = self.meta.filter_episodes(episode_filter, candidates=episodes)
+            if not resolved:
+                raise ValueError(
+                    "The episode filter did not match any episode. Make sure the filter and episodes list are valid and compatible."
+                )
+            logger.info(f"The episode filter matched {len(resolved)} episode(s).")
+            episodes = resolved
+        self.episodes = episodes
+
        # Create reader (hf_dataset loaded below)
        self.reader = DatasetReader(
            meta=self.meta,
@@ -33,7 +33,6 @@ import fsspec
 import numpy as np
 import pyarrow as pa
 import torch
-import torchvision
 from datasets.features.features import register_feature
 from PIL import Image

@@ -132,7 +131,9 @@ def decode_video_frames(
        video_path (Path): Path to the video file.
        timestamps (list[float]): List of timestamps to extract frames.
        tolerance_s (float): Allowed deviation in seconds for frame retrieval.
-        backend (str, optional): Backend to use for decoding. Defaults to "torchcodec" when available in the platform; otherwise, defaults to "pyav".
+        backend (str, optional): Backend to use for decoding. Defaults to "torchcodec" when available
+            in the platform; otherwise, defaults to "pyav". The legacy value "video_reader" is
+            accepted for one release as an alias for "pyav" and will be removed in a future version.
        return_uint8 (bool): If True, return raw uint8 frames without float32 normalization.
            This reduces memory for DataLoader IPC; normalization can be done on GPU afterward.

@@ -145,85 +146,87 @@ def decode_video_frames(
        backend = get_safe_default_codec()
    if backend == "torchcodec":
        return decode_video_frames_torchcodec(video_path, timestamps, tolerance_s, return_uint8=return_uint8)
-    elif backend in ["pyav", "video_reader"]:
-        return decode_video_frames_torchvision(
-            video_path, timestamps, tolerance_s, backend, return_uint8=return_uint8
-        )
+    elif backend == "pyav":
+        return decode_video_frames_pyav(video_path, timestamps, tolerance_s, return_uint8=return_uint8)
+    elif backend == "video_reader":
+        logger.warning("backend='video_reader' is deprecated and now aliases to 'pyav'.")
+        return decode_video_frames_pyav(video_path, timestamps, tolerance_s, return_uint8=return_uint8)
    else:
        raise ValueError(f"Unsupported video backend: {backend}")


-def decode_video_frames_torchvision(
+def decode_video_frames_pyav(
    video_path: Path | str,
    timestamps: list[float],
    tolerance_s: float,
-    backend: str = "pyav",
    log_loaded_timestamps: bool = False,
    return_uint8: bool = False,
 ) -> torch.Tensor:
-    """Loads frames associated to the requested timestamps of a video
+    """Loads frames associated to the requested timestamps of a video using PyAV.

-    The backend can be either "pyav" (default) or "video_reader".
-    "video_reader" requires installing torchvision from source, see:
-    https://github.com/pytorch/vision/blob/main/torchvision/csrc/io/decoder/gpu/README.rst
-    (note that you need to compile against ffmpeg<4.3)
+    This is the fallback decoder for platforms where torchcodec has no wheel (currently macOS
+    x86_64 and linux armv7l — see the torchcodec block in pyproject.toml for the full matrix).
+    On supported platforms, prefer `decode_video_frames_torchcodec`, which is faster and supports
+    accurate seek.

-    While both use cpu, "video_reader" is supposedly faster than "pyav" but requires additional setup.
-    For more info on video decoding, see `benchmark/video/README.md`
+    PyAV doesn't support accurate seek: we seek to the nearest preceding keyframe and decode
+    forward until we have covered the requested timestamp range. The number of key frames in a
+    video can be adjusted at encoding time to trade off decoding speed against file size.

-    See torchvision doc for more info on these two backends:
-    https://pytorch.org/vision/0.18/index.html?highlight=backend#torchvision.set_video_backend
+    Args:
+        video_path: Path to the video file.
+        timestamps: List of timestamps (in seconds) to extract frames for.
+        tolerance_s: Allowed deviation in seconds between a queried timestamp and the closest
+            decoded frame.
+        log_loaded_timestamps: When True, log every decoded frame's timestamp at INFO level.
+        return_uint8: When True, return raw uint8 frames (C, H, W). Otherwise, return float32 in
+            [0, 1] range.

-    Note: Video benefits from inter-frame compression. Instead of storing every frame individually,
-    the encoder stores a reference frame (or a key frame) and subsequent frames as differences relative to
-    that key frame. As a consequence, to access a requested frame, we need to load the preceding key frame,
-    and all subsequent frames until reaching the requested frame. The number of key frames in a video
-    can be adjusted during encoding to take into account decoding time and video size in bytes.
+    Returns:
+        torch.Tensor of shape (len(timestamps), C, H, W).
    """
-    video_path = str(video_path)
-
-    # set backend
-    keyframes_only = False
-    torchvision.set_video_backend(backend)
-    if backend == "pyav":
-        keyframes_only = True  # pyav doesn't support accurate seek
-
-    # set a video stream reader
    # TODO(rcadene): also load audio stream at the same time
-    reader = torchvision.io.VideoReader(video_path, "video")
+    video_path = str(video_path)

    # set the first and last requested timestamps
    # Note: previous timestamps are usually loaded, since we need to access the previous key frame
    first_ts = min(timestamps)
    last_ts = max(timestamps)

-    # access closest key frame of the first requested frame
-    # Note: closest key frame timestamp is usually smaller than `first_ts` (e.g. key frame can be the first frame of the video)
-    # for details on what `seek` is doing see: https://pyav.basswood-io.com/docs/stable/api/container.html?highlight=inputcontainer#av.container.InputContainer.seek
-    reader.seek(first_ts, keyframes_only=keyframes_only)
+    loaded_frames: list[torch.Tensor] = []
+    loaded_ts: list[float] = []

-    # load all frames until last requested frame
-    loaded_frames = []
-    loaded_ts = []
-    for frame in reader:
-        current_ts = frame["pts"]
-        if log_loaded_timestamps:
-            logger.info(f"frame loaded at timestamp={current_ts:.4f}")
-        loaded_frames.append(frame["data"])
-        loaded_ts.append(current_ts)
-        if current_ts >= last_ts:
-            break
+    # Seek + decode. `container.seek(offset)` with no `stream` argument expects the offset in
+    # av.time_base units (microseconds). `backward=True` lands us on the nearest keyframe at or
+    # before `first_ts`, so we can then decode forward until we cover `last_ts`. See:
+    # https://pyav.basswood-io.com/docs/stable/api/container.html#av.container.InputContainer.seek
+    with av.open(video_path) as container:
+        stream = container.streams.video[0]
+        container.seek(int(first_ts * av.time_base), backward=True)

-    if backend == "pyav":
-        reader.container.close()
+        for frame in container.decode(stream):
+            if frame.pts is None:
+                continue
+            current_ts = float(frame.pts * stream.time_base)
+            if log_loaded_timestamps:
+                logger.info(f"frame loaded at timestamp={current_ts:.4f}")
+            # Convert to CHW uint8 to match torchcodec's output layout.
+            arr = frame.to_ndarray(format="rgb24")  # H, W, 3
+            loaded_frames.append(torch.from_numpy(arr).permute(2, 0, 1).contiguous())
+            loaded_ts.append(current_ts)
+            if current_ts >= last_ts:
+                break

-    reader = None
+    if not loaded_frames:
+        raise FrameTimestampError(
+            f"No frames could be decoded from {video_path} in the timestamp range [{first_ts}, {last_ts}]."
+        )

    query_ts = torch.tensor(timestamps)
-    loaded_ts = torch.tensor(loaded_ts)
+    loaded_ts_t = torch.tensor(loaded_ts)

    # compute distances between each query timestamp and timestamps of all loaded frames
-    dist = torch.cdist(query_ts[:, None], loaded_ts[:, None], p=1)
+    dist = torch.cdist(query_ts[:, None], loaded_ts_t[:, None], p=1)
    min_, argmin_ = dist.min(1)

    is_within_tol = min_ < tolerance_s
@@ -234,14 +237,14 @@ def decode_video_frames_torchvision(
            " This might be due to synchronization issues with timestamps during data collection."
            " To be safe, we advise to ignore this item during training."
            f"\nqueried timestamps: {query_ts}"
-            f"\nloaded timestamps: {loaded_ts}"
+            f"\nloaded timestamps: {loaded_ts_t}"
            f"\nvideo: {video_path}"
-            f"\nbackend: {backend}"
+            "\nbackend: pyav"
        )

    # get closest frames to the query timestamps
    closest_frames = torch.stack([loaded_frames[idx] for idx in argmin_])
-    closest_ts = loaded_ts[argmin_]
+    closest_ts = loaded_ts_t[argmin_]

    if log_loaded_timestamps:
        logger.info(f"{closest_ts=}")
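The two numeric steps of `decode_video_frames_pyav` can be illustrated without PyAV or torch: converting the first requested timestamp to a seek offset in microseconds, and matching each queried timestamp to the closest decoded frame under `tolerance_s`. This is a pure-Python sketch; `seek_offset_us` and `match_timestamps` are hypothetical helpers, and `TIME_BASE_US` mirrors the code comment's statement that `av.time_base` is expressed in microseconds.

```python
TIME_BASE_US = 1_000_000  # assumed value of av.time_base, per the comment in the diff


def seek_offset_us(first_ts: float) -> int:
    """Seek offset passed to the container, in microseconds."""
    return int(first_ts * TIME_BASE_US)


def match_timestamps(query_ts: list[float], loaded_ts: list[float], tolerance_s: float) -> list[int]:
    """For each queried timestamp, return the index of the closest loaded frame."""
    indices = []
    for q in query_ts:
        idx = min(range(len(loaded_ts)), key=lambda i: abs(loaded_ts[i] - q))
        # Same strict comparison as `min_ < tolerance_s` in the decoder.
        if not abs(loaded_ts[idx] - q) < tolerance_s:
            raise ValueError(f"no frame within {tolerance_s}s of timestamp {q}")
        indices.append(idx)
    return indices


loaded = [0.0, 0.1, 0.2, 0.3]  # timestamps decoded forward from the keyframe
print(seek_offset_us(0.5))
print(match_timestamps([0.1, 0.3], loaded, tolerance_s=1e-4))
```

Because decoding starts at the preceding keyframe, `loaded` usually contains frames earlier than the first query; the nearest-neighbor match discards them.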
@@ -18,13 +18,13 @@ from .act.configuration_act import ACTConfig as ACTConfig
 from .diffusion.configuration_diffusion import DiffusionConfig as DiffusionConfig
 from .eo1.configuration_eo1 import EO1Config as EO1Config
 from .factory import get_policy_class, make_policy, make_policy_config, make_pre_post_processors
+from .gaussian_actor.configuration_gaussian_actor import GaussianActorConfig as GaussianActorConfig
 from .groot.configuration_groot import GrootConfig as GrootConfig
 from .multi_task_dit.configuration_multi_task_dit import MultiTaskDiTConfig as MultiTaskDiTConfig
 from .pi0.configuration_pi0 import PI0Config as PI0Config
 from .pi0_fast.configuration_pi0_fast import PI0FastConfig as PI0FastConfig
 from .pi05.configuration_pi05 import PI05Config as PI05Config
 from .pretrained import PreTrainedPolicy as PreTrainedPolicy
-from .sac.configuration_sac import SACConfig as SACConfig
 from .smolvla.configuration_smolvla import SmolVLAConfig as SmolVLAConfig
 from .tdmpc.configuration_tdmpc import TDMPCConfig as TDMPCConfig
 from .utils import make_robot_action, prepare_observation_for_inference
@@ -32,21 +32,21 @@ from .vqbet.configuration_vqbet import VQBeTConfig as VQBeTConfig
 from .wall_x.configuration_wall_x import WallXConfig as WallXConfig
 from .xvla.configuration_xvla import XVLAConfig as XVLAConfig

-# NOTE: Policy modeling classes (e.g., SACPolicy) are intentionally NOT re-exported here.
+# NOTE: Policy modeling classes (e.g., GaussianActorPolicy) are intentionally NOT re-exported here.
 # They have heavy optional dependencies and are loaded lazily via get_policy_class().
-# Import directly: ``from lerobot.policies.sac.modeling_sac import SACPolicy``
+# Import directly: ``from lerobot.policies.gaussian_actor.modeling_gaussian_actor import GaussianActorPolicy``

 __all__ = [
    # Configuration classes
    "ACTConfig",
    "DiffusionConfig",
+    "EO1Config",
+    "GaussianActorConfig",
    "GrootConfig",
    "MultiTaskDiTConfig",
-    "EO1Config",
    "PI0Config",
    "PI0FastConfig",
    "PI05Config",
-    "SACConfig",
    "SmolVLAConfig",
    "TDMPCConfig",
    "VQBeTConfig",
@@ -47,12 +47,12 @@ from lerobot.utils.feature_utils import dataset_to_policy_features
 from .act.configuration_act import ACTConfig
 from .diffusion.configuration_diffusion import DiffusionConfig
 from .eo1.configuration_eo1 import EO1Config
+from .gaussian_actor.configuration_gaussian_actor import GaussianActorConfig
 from .groot.configuration_groot import GrootConfig
 from .multi_task_dit.configuration_multi_task_dit import MultiTaskDiTConfig
 from .pi0.configuration_pi0 import PI0Config
 from .pi05.configuration_pi05 import PI05Config
 from .pretrained import PreTrainedPolicy
-from .sac.configuration_sac import SACConfig
 from .smolvla.configuration_smolvla import SmolVLAConfig
 from .tdmpc.configuration_tdmpc import TDMPCConfig
 from .utils import validate_visual_features_consistency
@@ -88,7 +88,7 @@ def get_policy_class(name: str) -> type[PreTrainedPolicy]:

    Args:
        name: The name of the policy. Supported names are "tdmpc", "diffusion", "act",
-            "multi_task_dit", "vqbet", "pi0", "pi05", "sac", "smolvla", "wall_x".
+            "multi_task_dit", "vqbet", "pi0", "pi05", "gaussian_actor", "smolvla", "wall_x".
    Returns:
        The policy class corresponding to the given name.

@@ -127,10 +127,10 @@ def get_policy_class(name: str) -> type[PreTrainedPolicy]:
        from .pi05.modeling_pi05 import PI05Policy

        return PI05Policy
-    elif name == "sac":
-        from .sac.modeling_sac import SACPolicy
+    elif name == "gaussian_actor":
+        from .gaussian_actor.modeling_gaussian_actor import GaussianActorPolicy

-        return SACPolicy
+        return GaussianActorPolicy
    elif name == "smolvla":
        from .smolvla.modeling_smolvla import SmolVLAPolicy

@@ -167,7 +167,7 @@ def make_policy_config(policy_type: str, **kwargs) -> PreTrainedConfig:

    Args:
        policy_type: The type of the policy. Supported types include "tdmpc",
-                     "multi_task_dit", "diffusion", "act", "vqbet", "pi0", "pi05", "sac",
+                     "multi_task_dit", "diffusion", "act", "vqbet", "pi0", "pi05", "gaussian_actor",
                     "smolvla", "wall_x".
        **kwargs: Keyword arguments to be passed to the configuration class constructor.

@@ -191,8 +191,8 @@ def make_policy_config(policy_type: str, **kwargs) -> PreTrainedConfig:
        return PI0Config(**kwargs)
    elif policy_type == "pi05":
        return PI05Config(**kwargs)
-    elif policy_type == "sac":
-        return SACConfig(**kwargs)
+    elif policy_type == "gaussian_actor":
+        return GaussianActorConfig(**kwargs)
    elif policy_type == "smolvla":
        return SmolVLAConfig(**kwargs)
    elif policy_type == "groot":
@@ -365,10 +365,10 @@ def make_pre_post_processors(
            dataset_stats=kwargs.get("dataset_stats"),
        )

-    elif isinstance(policy_cfg, SACConfig):
-        from .sac.processor_sac import make_sac_pre_post_processors
+    elif isinstance(policy_cfg, GaussianActorConfig):
+        from .gaussian_actor.processor_gaussian_actor import make_gaussian_actor_pre_post_processors

-        processors = make_sac_pre_post_processors(
+        processors = make_gaussian_actor_pre_post_processors(
            config=policy_cfg,
            dataset_stats=kwargs.get("dataset_stats"),
        )
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-from .configuration_sac import SACConfig
-from .modeling_sac import SACPolicy
-from .processor_sac import make_sac_pre_post_processors
+from .configuration_gaussian_actor import GaussianActorConfig
+from .modeling_gaussian_actor import GaussianActorPolicy
+from .processor_gaussian_actor import make_gaussian_actor_pre_post_processors

-__all__ = ["SACConfig", "SACPolicy", "make_sac_pre_post_processors"]
+__all__ = ["GaussianActorConfig", "GaussianActorPolicy", "make_gaussian_actor_pre_post_processors"]
@@ -1,4 +1,4 @@
-# !/usr/bin/env python
+#!/usr/bin/env python

 # Copyright 2025 The HuggingFace Inc. team.
 # All rights reserved.
@@ -75,18 +75,19 @@ class PolicyConfig:
    init_final: float = 0.05


-@PreTrainedConfig.register_subclass("sac")
+@PreTrainedConfig.register_subclass("gaussian_actor")
 @dataclass
-class SACConfig(PreTrainedConfig):
-    """Soft Actor-Critic (SAC) configuration.
+class GaussianActorConfig(PreTrainedConfig):
+    """Gaussian actor configuration.

-    SAC is an off-policy actor-critic deep RL algorithm based on the maximum entropy
-    reinforcement learning framework. It learns a policy and a Q-function simultaneously
-    using experience collected from the environment.
+    This configures the policy side (actor + observation encoder) of a Gaussian
+    policy, as used by SAC and related maximum-entropy continuous-control algorithms.
+    By default the actor output is a tanh-squashed diagonal Gaussian
+    (``TanhMultivariateNormalDiag``); the tanh squashing can be disabled via
+    ``policy_kwargs.use_tanh_squash``. The critics, temperature, and Bellman-update
+    logic live on the algorithm side (see ``lerobot.rl.algorithms.sac``).

-    This configuration class contains all the parameters needed to define a SAC agent,
-    including network architectures, optimization settings, and algorithm-specific
-    hyperparameters.
+    CLI: ``--policy.type=gaussian_actor``.
    """

    # Mapping of feature types to normalization modes
@@ -122,7 +123,7 @@ class SACConfig(PreTrainedConfig):
    device: str = "cpu"
    # Device to store the model on
    storage_device: str = "cpu"
-    # Name of the vision encoder model (Set to "helper2424/resnet10" for hil serl resnet10)
+    # Name of the vision encoder model (Set to "lerobot/resnet10" for hil serl resnet10)
    vision_encoder_name: str | None = None
    # Whether to freeze the vision encoder during training
    freeze_vision_encoder: bool = True
@@ -135,7 +136,13 @@ class SACConfig(PreTrainedConfig):
    # Dimension of the image embedding pooling
    image_embedding_pooling_dim: int = 8

-    # Training parameter
+    # Encoder architecture
+    # Hidden dimension size for the state encoder
+    state_encoder_hidden_dim: int = 256
+    # Dimension of the latent space
+    latent_dim: int = 256
+
+    # Online training (TODO(Khalil): relocate to TrainRLServerPipelineConfig)
    # Number of steps for online training
    online_steps: int = 1000000
    # Capacity of the online replay buffer
@@ -146,67 +153,38 @@ class SACConfig(PreTrainedConfig):
    async_prefetch: bool = False
    # Number of steps before learning starts
    online_step_before_learning: int = 100
-    # Frequency of policy updates
-    policy_update_freq: int = 1

-    # SAC algorithm parameters
-    # Discount factor for the SAC algorithm
-    discount: float = 0.99
-    # Initial temperature value
-    temperature_init: float = 1.0
-    # Number of critics in the ensemble
-    num_critics: int = 2
-    # Number of subsampled critics for training
-    num_subsample_critics: int | None = None
-    # Learning rate for the critic network
-    critic_lr: float = 3e-4
-    # Learning rate for the actor network
-    actor_lr: float = 3e-4
-    # Learning rate for the temperature parameter
-    temperature_lr: float = 3e-4
-    # Weight for the critic target update
-    critic_target_update_weight: float = 0.005
-    # Update-to-data ratio for the UTD algorithm (If you want enable utd_ratio, you need to set it to >1)
-    utd_ratio: int = 1
-    # Hidden dimension size for the state encoder
-    state_encoder_hidden_dim: int = 256
-    # Dimension of the latent space
-    latent_dim: int = 256
-    # Target entropy for the SAC algorithm
-    target_entropy: float | None = None
-    # Whether to use backup entropy for the SAC algorithm
-    use_backup_entropy: bool = True
-    # Gradient clipping norm for the SAC algorithm
-    grad_clip_norm: float = 40.0
-
-    # Network configuration
-    # Configuration for the critic network architecture
-    critic_network_kwargs: CriticNetworkConfig = field(default_factory=CriticNetworkConfig)
-    # Configuration for the actor network architecture
-    actor_network_kwargs: ActorNetworkConfig = field(default_factory=ActorNetworkConfig)
-    # Configuration for the policy parameters
-    policy_kwargs: PolicyConfig = field(default_factory=PolicyConfig)
-    # Configuration for the discrete critic network
-    discrete_critic_network_kwargs: CriticNetworkConfig = field(default_factory=CriticNetworkConfig)
+    # Actor-learner transport (TODO(Khalil): relocate to TrainRLServerPipelineConfig)
    # Configuration for actor-learner architecture
    actor_learner_config: ActorLearnerConfig = field(default_factory=ActorLearnerConfig)
    # Configuration for concurrency settings (you can use threads or processes for the actor and learner)
    concurrency: ConcurrencyConfig = field(default_factory=ConcurrencyConfig)

-    # Optimizations
-    use_torch_compile: bool = True
+    # Network architecture
+    # Configuration for the actor network architecture
+    actor_network_kwargs: ActorNetworkConfig = field(default_factory=ActorNetworkConfig)
+    # Configuration for the policy parameters (Gaussian head)
+    policy_kwargs: PolicyConfig = field(default_factory=PolicyConfig)
+    # Configuration for the discrete critic network
+    discrete_critic_network_kwargs: CriticNetworkConfig = field(default_factory=CriticNetworkConfig)

    def __post_init__(self):
        super().__post_init__()
-        # Any validation specific to SAC configuration
+        # Any validation specific to GaussianActor configuration

    def get_optimizer_preset(self) -> MultiAdamConfig:
+        # Default learning rate used to satisfy the abstract ``get_optimizer_preset()``
+        # contract from ``PreTrainedConfig``. The actual optimizers used during RL
+        # training are built by ``SACAlgorithm.make_optimizers_and_scheduler()`` from
+        # ``SACAlgorithmConfig.{actor_lr,critic_lr,temperature_lr}`` and fully bypass
+        # this preset.
+        default_lr = 3e-4
        return MultiAdamConfig(
            weight_decay=0.0,
            optimizer_groups={
-                "actor": {"lr": self.actor_lr},
-                "critic": {"lr": self.critic_lr},
-                "temperature": {"lr": self.temperature_lr},
+                "actor": {"lr": default_lr},
+                "critic": {"lr": default_lr},
+                "temperature": {"lr": default_lr},
            },
        )

@@ -15,16 +15,11 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-import math
 from collections.abc import Callable
 from dataclasses import asdict
-from typing import Literal

-import einops
-import numpy as np
 import torch
 import torch.nn as nn
-import torch.nn.functional as F  # noqa: N812
 from torch import Tensor
 from torch.distributions import MultivariateNormal, TanhTransform, Transform, TransformedDistribution

@@ -32,20 +27,20 @@ from lerobot.utils.constants import ACTION, OBS_ENV_STATE, OBS_STATE

 from ..pretrained import PreTrainedPolicy
 from ..utils import get_device_from_parameters
-from .configuration_sac import SACConfig, is_image_feature
+from .configuration_gaussian_actor import GaussianActorConfig, is_image_feature

 DISCRETE_DIMENSION_INDEX = -1  # Gripper is always the last dimension


-class SACPolicy(
+class GaussianActorPolicy(
    PreTrainedPolicy,
 ):
-    config_class = SACConfig
-    name = "sac"
+    config_class = GaussianActorConfig
+    name = "gaussian_actor"

    def __init__(
        self,
-        config: SACConfig | None = None,
+        config: GaussianActorConfig | None = None,
    ):
        super().__init__(config)
        config.validate_features()
@@ -54,9 +49,8 @@ class SACPolicy(
        # Determine action dimension and initialize all components
        continuous_action_dim = config.output_features[ACTION].shape[0]
        self._init_encoders()
-        self._init_critics(continuous_action_dim)
        self._init_actor(continuous_action_dim)
-        self._init_temperature()
+        self._init_discrete_critic()

    def get_optim_params(self) -> dict:
        optim_params = {
@@ -65,11 +59,7 @@ class SACPolicy(
                for n, p in self.actor.named_parameters()
                if not n.startswith("encoder") or not self.shared_encoder
            ],
-            "critic": self.critic_ensemble.parameters(),
-            "temperature": self.log_alpha,
        }
-        if self.config.num_discrete_actions is not None:
-            optim_params["discrete_critic"] = self.discrete_critic.parameters()
        return optim_params

    def reset(self):
@@ -79,7 +69,9 @@ class SACPolicy(
    @torch.no_grad()
    def predict_action_chunk(self, batch: dict[str, Tensor]) -> Tensor:
        """Predict a chunk of actions given environment observations."""
-        raise NotImplementedError("SACPolicy does not support action chunking. It returns single actions!")
+        raise NotImplementedError(
+            "GaussianActorPolicy does not support action chunking. It returns single actions!"
+        )

    @torch.no_grad()
    def select_action(self, batch: dict[str, Tensor]) -> Tensor:
@@ -92,360 +84,43 @@ class SACPolicy(
        actions, _, _ = self.actor(batch, observations_features)

        if self.config.num_discrete_actions is not None:
-            discrete_action_value = self.discrete_critic(batch, observations_features)
-            discrete_action = torch.argmax(discrete_action_value, dim=-1, keepdim=True)
+            if self.discrete_critic is not None:
+                discrete_action_value = self.discrete_critic(batch, observations_features)
+                discrete_action = torch.argmax(discrete_action_value, dim=-1, keepdim=True)
+            else:
+                discrete_action = torch.ones(
+                    (*actions.shape[:-1], 1), device=actions.device, dtype=actions.dtype
+                )
            actions = torch.cat([actions, discrete_action], dim=-1)

        return actions

-    def critic_forward(
-        self,
-        observations: dict[str, Tensor],
-        actions: Tensor,
-        use_target: bool = False,
-        observation_features: Tensor | None = None,
-    ) -> Tensor:
-        """Forward pass through a critic network ensemble
+    def forward(self, batch: dict[str, Tensor | dict[str, Tensor]]) -> dict[str, Tensor]:
+        """Actor forward pass: sample actions and return log-probabilities.

        Args:
-            observations: Dictionary of observations
-            actions: Action tensor
-            use_target: If True, use target critics, otherwise use ensemble critics
+            batch: A flat observation dict, or a training dict containing
+                ``"state"`` (observations) and optionally ``"observation_feature"``
+                (pre-computed encoder features).

        Returns:
-            Tensor of Q-values from all critics
+            Dict with ``"action"``, ``"log_prob"``, and ``"action_mean"`` tensors.
        """
-
-        critics = self.critic_target if use_target else self.critic_ensemble
-        q_values = critics(observations, actions, observation_features)
-        return q_values
-
-    def discrete_critic_forward(
-        self, observations, use_target=False, observation_features=None
-    ) -> torch.Tensor:
-        """Forward pass through a discrete critic network
-
-        Args:
-            observations: Dictionary of observations
-            use_target: If True, use target critics, otherwise use ensemble critics
-            observation_features: Optional pre-computed observation features to avoid recomputing encoder output
-
-        Returns:
-            Tensor of Q-values from the discrete critic network
-        """
-        discrete_critic = self.discrete_critic_target if use_target else self.discrete_critic
-        q_values = discrete_critic(observations, observation_features)
-        return q_values
-
-    def forward(
-        self,
-        batch: dict[str, Tensor | dict[str, Tensor]],
-        model: Literal["actor", "critic", "temperature", "discrete_critic"] = "critic",
-    ) -> dict[str, Tensor]:
-        """Compute the loss for the given model
-
-        Args:
-            batch: Dictionary containing:
-                - action: Action tensor
-                - reward: Reward tensor
-                - state: Observations tensor dict
-                - next_state: Next observations tensor dict
-                - done: Done mask tensor
-                - observation_feature: Optional pre-computed observation features
-                - next_observation_feature: Optional pre-computed next observation features
-            model: Which model to compute the loss for ("actor", "critic", "discrete_critic", or "temperature")
-
-        Returns:
-            The computed loss tensor
-        """
-        # Extract common components from batch
-        actions: Tensor = batch[ACTION]
-        observations: dict[str, Tensor] = batch["state"]
-        observation_features: Tensor = batch.get("observation_feature")
-
-        if model == "critic":
-            # Extract critic-specific components
-            rewards: Tensor = batch["reward"]
-            next_observations: dict[str, Tensor] = batch["next_state"]
-            done: Tensor = batch["done"]
-            next_observation_features: Tensor = batch.get("next_observation_feature")
-
-            loss_critic = self.compute_loss_critic(
-                observations=observations,
-                actions=actions,
-                rewards=rewards,
-                next_observations=next_observations,
-                done=done,
-                observation_features=observation_features,
-                next_observation_features=next_observation_features,
-            )
-
-            return {"loss_critic": loss_critic}
-
-        if model == "discrete_critic" and self.config.num_discrete_actions is not None:
-            # Extract critic-specific components
-            rewards: Tensor = batch["reward"]
-            next_observations: dict[str, Tensor] = batch["next_state"]
-            done: Tensor = batch["done"]
-            next_observation_features: Tensor = batch.get("next_observation_feature")
-            complementary_info = batch.get("complementary_info")
-            loss_discrete_critic = self.compute_loss_discrete_critic(
-                observations=observations,
-                actions=actions,
-                rewards=rewards,
-                next_observations=next_observations,
-                done=done,
-                observation_features=observation_features,
-                next_observation_features=next_observation_features,
-                complementary_info=complementary_info,
-            )
-            return {"loss_discrete_critic": loss_discrete_critic}
-        if model == "actor":
-            return {
-                "loss_actor": self.compute_loss_actor(
-                    observations=observations,
-                    observation_features=observation_features,
-                )
-            }
-
-        if model == "temperature":
-            return {
-                "loss_temperature": self.compute_loss_temperature(
-                    observations=observations,
-                    observation_features=observation_features,
-                )
-            }
-
-        raise ValueError(f"Unknown model type: {model}")
-
-    def update_target_networks(self):
-        """Update target networks with exponential moving average"""
-        for target_param, param in zip(
-            self.critic_target.parameters(),
-            self.critic_ensemble.parameters(),
-            strict=True,
-        ):
-            target_param.data.copy_(
-                param.data * self.config.critic_target_update_weight
-                + target_param.data * (1.0 - self.config.critic_target_update_weight)
-            )
-        if self.config.num_discrete_actions is not None:
-            for target_param, param in zip(
-                self.discrete_critic_target.parameters(),
-                self.discrete_critic.parameters(),
-                strict=True,
-            ):
-                target_param.data.copy_(
-                    param.data * self.config.critic_target_update_weight
-                    + target_param.data * (1.0 - self.config.critic_target_update_weight)
-                )
-
-    @property
-    def temperature(self) -> float:
-        """Return the current temperature value, always in sync with log_alpha."""
-        return self.log_alpha.exp().item()
-
-    def compute_loss_critic(
-        self,
-        observations,
-        actions,
-        rewards,
-        next_observations,
-        done,
-        observation_features: Tensor | None = None,
-        next_observation_features: Tensor | None = None,
-    ) -> Tensor:
-        with torch.no_grad():
-            next_action_preds, next_log_probs, _ = self.actor(next_observations, next_observation_features)
-
-            # 2- compute q targets
-            q_targets = self.critic_forward(
-                observations=next_observations,
-                actions=next_action_preds,
-                use_target=True,
-                observation_features=next_observation_features,
-            )
-
-            # subsample critics to prevent overfitting if use high UTD (update to date)
-            # TODO: Get indices before forward pass to avoid unnecessary computation
-            if self.config.num_subsample_critics is not None:
-                indices = torch.randperm(self.config.num_critics)
-                indices = indices[: self.config.num_subsample_critics]
-                q_targets = q_targets[indices]
-
-            # critics subsample size
-            min_q, _ = q_targets.min(dim=0)  # Get values from min operation
-            if self.config.use_backup_entropy:
-                min_q = min_q - (self.temperature * next_log_probs)
-
-            td_target = rewards + (1 - done) * self.config.discount * min_q
-
-        # 3- compute predicted qs
-        if self.config.num_discrete_actions is not None:
-            # NOTE: We only want to keep the continuous action part
-            # In the buffer we have the full action space (continuous + discrete)
-            # We need to split them before concatenating them in the critic forward
-            actions: Tensor = actions[:, :DISCRETE_DIMENSION_INDEX]
-        q_preds = self.critic_forward(
-            observations=observations,
-            actions=actions,
-            use_target=False,
-            observation_features=observation_features,
-        )
-
-        # 4- Calculate loss
-        # Compute state-action value loss (TD loss) for all of the Q functions in the ensemble.
-        td_target_duplicate = einops.repeat(td_target, "b -> e b", e=q_preds.shape[0])
-        # You compute the mean loss of the batch for each critic and then to compute the final loss you sum them up
-        critics_loss = (
-            F.mse_loss(
-                input=q_preds,
-                target=td_target_duplicate,
-                reduction="none",
-            ).mean(dim=1)
-        ).sum()
-        return critics_loss
-
-    def compute_loss_discrete_critic(
-        self,
-        observations,
-        actions,
-        rewards,
-        next_observations,
-        done,
-        observation_features=None,
-        next_observation_features=None,
-        complementary_info=None,
-    ):
-        # NOTE: We only want to keep the discrete action part
-        # In the buffer we have the full action space (continuous + discrete)
-        # We need to split them before concatenating them in the critic forward
-        actions_discrete: Tensor = actions[:, DISCRETE_DIMENSION_INDEX:].clone()
-        actions_discrete = torch.round(actions_discrete)
-        actions_discrete = actions_discrete.long()
-
-        discrete_penalties: Tensor | None = None
-        if complementary_info is not None:
-            discrete_penalties: Tensor | None = complementary_info.get("discrete_penalty")
-
-        with torch.no_grad():
-            # For DQN, select actions using online network, evaluate with target network
-            next_discrete_qs = self.discrete_critic_forward(
-                next_observations, use_target=False, observation_features=next_observation_features
-            )
-            best_next_discrete_action = torch.argmax(next_discrete_qs, dim=-1, keepdim=True)
-
-            # Get target Q-values from target network
-            target_next_discrete_qs = self.discrete_critic_forward(
-                observations=next_observations,
-                use_target=True,
-                observation_features=next_observation_features,
-            )
-
-            # Use gather to select Q-values for best actions
-            target_next_discrete_q = torch.gather(
-                target_next_discrete_qs, dim=1, index=best_next_discrete_action
-            ).squeeze(-1)
-
-            # Compute target Q-value with Bellman equation
-            rewards_discrete = rewards
-            if discrete_penalties is not None:
-                rewards_discrete = rewards + discrete_penalties
-            target_discrete_q = rewards_discrete + (1 - done) * self.config.discount * target_next_discrete_q
-
-        # Get predicted Q-values for current observations
-        predicted_discrete_qs = self.discrete_critic_forward(
-            observations=observations, use_target=False, observation_features=observation_features
-        )
-
-        # Use gather to select Q-values for taken actions
-        predicted_discrete_q = torch.gather(predicted_discrete_qs, dim=1, index=actions_discrete).squeeze(-1)
-
-        # Compute MSE loss between predicted and target Q-values
-        discrete_critic_loss = F.mse_loss(input=predicted_discrete_q, target=target_discrete_q)
-        return discrete_critic_loss
-
-    def compute_loss_temperature(self, observations, observation_features: Tensor | None = None) -> Tensor:
-        """Compute the temperature loss"""
-        # calculate temperature loss
-        with torch.no_grad():
-            _, log_probs, _ = self.actor(observations, observation_features)
-        temperature_loss = (-self.log_alpha.exp() * (log_probs + self.target_entropy)).mean()
-        return temperature_loss
-
-    def compute_loss_actor(
-        self,
-        observations,
-        observation_features: Tensor | None = None,
-    ) -> Tensor:
-        actions_pi, log_probs, _ = self.actor(observations, observation_features)
-
-        q_preds = self.critic_forward(
-            observations=observations,
-            actions=actions_pi,
-            use_target=False,
-            observation_features=observation_features,
-        )
-        min_q_preds = q_preds.min(dim=0)[0]
-
-        actor_loss = ((self.temperature * log_probs) - min_q_preds).mean()
-        return actor_loss
+        observations = batch.get("state", batch)
+        observation_features = batch.get("observation_feature") if isinstance(batch, dict) else None
+        actions, log_probs, means = self.actor(observations, observation_features)
+        return {"action": actions, "log_prob": log_probs, "action_mean": means}

    def _init_encoders(self):
        """Initialize shared or separate encoders for actor and critic."""
        self.shared_encoder = self.config.shared_encoder
-        self.encoder_critic = SACObservationEncoder(self.config)
+        self.encoder_critic = GaussianActorObservationEncoder(self.config)
        self.encoder_actor = (
-            self.encoder_critic if self.shared_encoder else SACObservationEncoder(self.config)
+            self.encoder_critic if self.shared_encoder else GaussianActorObservationEncoder(self.config)
        )

-    def _init_critics(self, continuous_action_dim):
-        """Build critic ensemble, targets, and optional discrete critic."""
-        heads = [
-            CriticHead(
-                input_dim=self.encoder_critic.output_dim + continuous_action_dim,
-                **asdict(self.config.critic_network_kwargs),
-            )
-            for _ in range(self.config.num_critics)
-        ]
-        self.critic_ensemble = CriticEnsemble(encoder=self.encoder_critic, ensemble=heads)
-        target_heads = [
-            CriticHead(
-                input_dim=self.encoder_critic.output_dim + continuous_action_dim,
-                **asdict(self.config.critic_network_kwargs),
-            )
-            for _ in range(self.config.num_critics)
-        ]
-        self.critic_target = CriticEnsemble(encoder=self.encoder_critic, ensemble=target_heads)
-        self.critic_target.load_state_dict(self.critic_ensemble.state_dict())
-
-        if self.config.use_torch_compile:
-            self.critic_ensemble = torch.compile(self.critic_ensemble)
-            self.critic_target = torch.compile(self.critic_target)
-
-        if self.config.num_discrete_actions is not None:
-            self._init_discrete_critics()
-
-    def _init_discrete_critics(self):
-        """Build discrete discrete critic ensemble and target networks."""
-        self.discrete_critic = DiscreteCritic(
-            encoder=self.encoder_critic,
-            input_dim=self.encoder_critic.output_dim,
-            output_dim=self.config.num_discrete_actions,
-            **asdict(self.config.discrete_critic_network_kwargs),
-        )
-        self.discrete_critic_target = DiscreteCritic(
-            encoder=self.encoder_critic,
-            input_dim=self.encoder_critic.output_dim,
-            output_dim=self.config.num_discrete_actions,
-            **asdict(self.config.discrete_critic_network_kwargs),
-        )
-
-        # TODO: (maractingi, azouitine) Compile the discrete critic
-        self.discrete_critic_target.load_state_dict(self.discrete_critic.state_dict())
-
    def _init_actor(self, continuous_action_dim):
-        """Initialize policy actor network and default target entropy."""
+        """Initialize policy actor network."""
        # NOTE: The actor select only the continuous action part
        self.actor = Policy(
            encoder=self.encoder_actor,
@@ -455,21 +130,25 @@ class SACPolicy(
            **asdict(self.config.policy_kwargs),
        )

-        self.target_entropy = self.config.target_entropy
-        if self.target_entropy is None:
-            dim = continuous_action_dim + (1 if self.config.num_discrete_actions is not None else 0)
-            self.target_entropy = -np.prod(dim) / 2
+    def _init_discrete_critic(self) -> None:
+        """Initialize discrete critic network."""
+        if self.config.num_discrete_actions is None:
+            self.discrete_critic = None
+            return

-    def _init_temperature(self) -> None:
-        """Set up temperature parameter (log_alpha)."""
-        temp_init = self.config.temperature_init
-        self.log_alpha = nn.Parameter(torch.tensor([math.log(temp_init)]))
+        # TODO(Khalil): Compile the discrete critic
+        self.discrete_critic = DiscreteCritic(
+            encoder=self.encoder_critic,
+            input_dim=self.encoder_critic.output_dim,
+            output_dim=self.config.num_discrete_actions,
+            **asdict(self.config.discrete_critic_network_kwargs),
+        )


-class SACObservationEncoder(nn.Module):
+class GaussianActorObservationEncoder(nn.Module):
    """Encode image and/or state vector observations."""

-    def __init__(self, config: SACConfig) -> None:
+    def __init__(self, config: GaussianActorConfig) -> None:
        super().__init__()
        self.config = config
        self._init_image_layers()
@@ -677,84 +356,6 @@ class MLP(nn.Module):
        return self.net(x)


-class CriticHead(nn.Module):
-    def __init__(
-        self,
-        input_dim: int,
-        hidden_dims: list[int],
-        activations: Callable[[torch.Tensor], torch.Tensor] | str = nn.SiLU(),
-        activate_final: bool = False,
-        dropout_rate: float | None = None,
-        init_final: float | None = None,
-        final_activation: Callable[[torch.Tensor], torch.Tensor] | str | None = None,
-    ):
-        super().__init__()
-        self.net = MLP(
-            input_dim=input_dim,
-            hidden_dims=hidden_dims,
-            activations=activations,
-            activate_final=activate_final,
-            dropout_rate=dropout_rate,
-            final_activation=final_activation,
-        )
-        self.output_layer = nn.Linear(in_features=hidden_dims[-1], out_features=1)
-        if init_final is not None:
-            nn.init.uniform_(self.output_layer.weight, -init_final, init_final)
-            nn.init.uniform_(self.output_layer.bias, -init_final, init_final)
-        else:
-            orthogonal_init()(self.output_layer.weight)
-
-    def forward(self, x: torch.Tensor) -> torch.Tensor:
-        return self.output_layer(self.net(x))
-
-
-class CriticEnsemble(nn.Module):
-    """
-    CriticEnsemble wraps multiple CriticHead modules into an ensemble.
-
-    Args:
-        encoder (SACObservationEncoder): encoder for observations.
-        ensemble (List[CriticHead]): list of critic heads.
-        init_final (float | None): optional initializer scale for final layers.
-
-    Forward returns a tensor of shape (num_critics, batch_size) containing Q-values.
-    """
-
-    def __init__(
-        self,
-        encoder: SACObservationEncoder,
-        ensemble: list[CriticHead],
-        init_final: float | None = None,
-    ):
-        super().__init__()
-        self.encoder = encoder
-        self.init_final = init_final
-        self.critics = nn.ModuleList(ensemble)
-
-    def forward(
-        self,
-        observations: dict[str, torch.Tensor],
-        actions: torch.Tensor,
-        observation_features: torch.Tensor | None = None,
-    ) -> torch.Tensor:
-        device = get_device_from_parameters(self)
-        # Move each tensor in observations to device
-        observations = {k: v.to(device) for k, v in observations.items()}
-
-        obs_enc = self.encoder(observations, cache=observation_features)
-
-        inputs = torch.cat([obs_enc, actions], dim=-1)
-
-        # Loop through critics and collect outputs
-        q_values = []
-        for critic in self.critics:
-            q_values.append(critic(inputs))
-
-        # Stack outputs to match expected shape [num_critics, batch_size]
-        q_values = torch.stack([q.squeeze(-1) for q in q_values], dim=0)
-        return q_values
-
-
 class DiscreteCritic(nn.Module):
    def __init__(
        self,
@@ -800,7 +401,7 @@ class DiscreteCritic(nn.Module):
 class Policy(nn.Module):
    def __init__(
        self,
-        encoder: SACObservationEncoder,
+        encoder: GaussianActorObservationEncoder,
        network: nn.Module,
        action_dim: int,
        std_min: float = -5,
@@ -811,7 +412,7 @@ class Policy(nn.Module):
        encoder_is_shared: bool = False,
    ):
        super().__init__()
-        self.encoder: SACObservationEncoder = encoder
+        self.encoder: GaussianActorObservationEncoder = encoder
        self.network = network
        self.action_dim = action_dim
        self.std_min = std_min
@@ -885,7 +486,7 @@ class Policy(nn.Module):


 class DefaultImageEncoder(nn.Module):
-    def __init__(self, config: SACConfig):
+    def __init__(self, config: GaussianActorConfig):
        super().__init__()
        image_key = next(key for key in config.input_features if is_image_feature(key))
        self.image_enc_layers = nn.Sequential(
@@ -931,12 +532,12 @@ def freeze_image_encoder(image_encoder: nn.Module):


 class PretrainedImageEncoder(nn.Module):
-    def __init__(self, config: SACConfig):
+    def __init__(self, config: GaussianActorConfig):
        super().__init__()

        self.image_enc_layers, self.image_enc_out_shape = self._load_pretrained_vision_encoder(config)

-    def _load_pretrained_vision_encoder(self, config: SACConfig):
+    def _load_pretrained_vision_encoder(self, config: GaussianActorConfig):
        """Set up CNN encoder"""
        from transformers import AutoModel

@@ -32,18 +32,18 @@ from lerobot.processor import (
 )
 from lerobot.utils.constants import POLICY_POSTPROCESSOR_DEFAULT_NAME, POLICY_PREPROCESSOR_DEFAULT_NAME

-from .configuration_sac import SACConfig
+from .configuration_gaussian_actor import GaussianActorConfig


-def make_sac_pre_post_processors(
-    config: SACConfig,
+def make_gaussian_actor_pre_post_processors(
+    config: GaussianActorConfig,
    dataset_stats: dict[str, dict[str, torch.Tensor]] | None = None,
 ) -> tuple[
    PolicyProcessorPipeline[dict[str, Any], dict[str, Any]],
    PolicyProcessorPipeline[PolicyAction, PolicyAction],
 ]:
    """
-    Constructs pre-processor and post-processor pipelines for the SAC policy.
+    Constructs pre-processor and post-processor pipelines for the Gaussian actor policy.

    The pre-processing pipeline prepares input data for the model by:
    1. Renaming features to match pretrained configurations.
@@ -56,7 +56,7 @@ def make_sac_pre_post_processors(
    2. Unnormalizing the output features to their original scale.

    Args:
-        config: The configuration object for the SAC policy.
+        config: The configuration object for the Gaussian actor policy.
        dataset_stats: A dictionary of statistics for normalization.

    Returns:
@@ -4,7 +4,6 @@
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
-# You may not use this file except in compliance with the License.
 # You may obtain a copy of the License at
 #
 #     http://www.apache.org/licenses/LICENSE-2.0
@@ -321,6 +320,7 @@ class GymHILAdapterProcessorStep(ProcessorStep):
    This step normalizes the `transition` object by:
    1. Copying `teleop_action` from `info` to `complementary_data`.
    2. Copying `is_intervention` from `info` (using the string key) to `info` (using the enum key).
+    3. Copying `discrete_penalty` from `info` to `complementary_data`.
    """

    def __call__(self, transition: EnvTransition) -> EnvTransition:
@@ -330,6 +330,9 @@ class GymHILAdapterProcessorStep(ProcessorStep):
        if TELEOP_ACTION_KEY in info:
            complementary_data[TELEOP_ACTION_KEY] = info[TELEOP_ACTION_KEY]

+        if DISCRETE_PENALTY_KEY in info:
+            complementary_data[DISCRETE_PENALTY_KEY] = info[DISCRETE_PENALTY_KEY]
+
        if "is_intervention" in info:
            info[TeleopEvents.IS_INTERVENTION] = info["is_intervention"]

@@ -348,18 +351,24 @@ class GymHILAdapterProcessorStep(ProcessorStep):
@ProcessorStepRegistry.register("gripper_penalty_processor")
 class GripperPenaltyProcessorStep(ProcessorStep):
    """
-    Applies a penalty for inefficient gripper usage.
+    Applies a small per-transition cost on the discrete gripper action.

-    This step penalizes actions that attempt to close an already closed gripper or
-    open an already open one, based on position thresholds.
+    Fires only when the commanded action would actually transition the gripper
+    from one extreme to the other (close-while-open or open-while-closed).
+    This discourages gripper oscillation while leaving "stay" and saturating-further
+    commands unpenalized.

    Attributes:
        penalty: The negative reward value to apply.
        max_gripper_pos: The maximum position value for the gripper, used for normalization.
+        open_threshold: Normalized state below which the gripper is considered "open".
+        closed_threshold: Normalized state above which the gripper is considered "closed".
    """

-    penalty: float = -0.01
+    penalty: float = -0.02
    max_gripper_pos: float = 30.0
+    open_threshold: float = 0.1
+    closed_threshold: float = 0.9

    def __call__(self, transition: EnvTransition) -> EnvTransition:
        """
@@ -379,11 +388,15 @@ class GripperPenaltyProcessorStep(ProcessorStep):
        if raw_joint_positions is None:
            return new_transition

-        current_gripper_pos = raw_joint_positions.get(GRIPPER_KEY, None)
+        current_gripper_pos = raw_joint_positions.get(f"{GRIPPER_KEY}.pos", None)
        if current_gripper_pos is None:
            return new_transition

-        # Gripper action is a PolicyAction at this stage
+        # During reset, the transition may not carry any action yet.
+        if action is None:
+            return new_transition
+
+        # Gripper action is expected as the last action dimension.
        gripper_action = action[-1].item()
        gripper_action_normalized = gripper_action / self.max_gripper_pos

@@ -391,9 +404,13 @@ class GripperPenaltyProcessorStep(ProcessorStep):
        gripper_state_normalized = current_gripper_pos / self.max_gripper_pos

        # Calculate penalty boolean as in original
-        gripper_penalty_bool = (gripper_state_normalized < 0.5 and gripper_action_normalized > 0.5) or (
-            gripper_state_normalized > 0.75 and gripper_action_normalized < 0.5
-        )
+        #   - currently open  AND target is closed  -> close transition
+        #   - currently closed AND target is open   -> open transition
+        is_open = gripper_state_normalized < self.open_threshold
+        is_closed = gripper_state_normalized > self.closed_threshold
+        cmd_close = gripper_action_normalized > self.closed_threshold
+        cmd_open = gripper_action_normalized < self.open_threshold
+        gripper_penalty_bool = (is_open and cmd_close) or (is_closed and cmd_open)

        gripper_penalty = self.penalty * int(gripper_penalty_bool)
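
Reviewer note: the new threshold logic in this hunk can be checked in isolation. The sketch below is a standalone restatement, not the processor step itself; the function name `gripper_penalty` and its keyword arguments are hypothetical, and the defaults are copied from the new class attributes above. It assumes state and action are plain floats in the same raw units as `max_gripper_pos`.

```python
def gripper_penalty(
    gripper_state: float,
    gripper_action: float,
    *,
    penalty: float = -0.02,
    max_gripper_pos: float = 30.0,
    open_threshold: float = 0.1,
    closed_threshold: float = 0.9,
) -> float:
    """Return `penalty` only when the command flips the gripper between extremes."""
    # Normalize both values by the maximum gripper position, as the step does.
    state = gripper_state / max_gripper_pos
    action = gripper_action / max_gripper_pos

    is_open = state < open_threshold
    is_closed = state > closed_threshold
    cmd_close = action > closed_threshold
    cmd_open = action < open_threshold

    # Only close-while-open or open-while-closed is penalized.
    fires = (is_open and cmd_close) or (is_closed and cmd_open)
    return penalty * int(fires)
```

With these defaults, a gripper sitting mid-range (for example a raw state of 15.0) is never penalized, whatever the command, because it counts as neither open nor closed.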

@@ -409,11 +426,14 @@ class GripperPenaltyProcessorStep(ProcessorStep):
        Returns the configuration of the step for serialization.

        Returns:
-            A dictionary containing the penalty value and max gripper position.
+            A dictionary containing the penalty value, max gripper position,
+            and the open/closed thresholds.
        """
        return {
            "penalty": self.penalty,
            "max_gripper_pos": self.max_gripper_pos,
+            "open_threshold": self.open_threshold,
+            "closed_threshold": self.closed_threshold,
        }

    def reset(self) -> None:
@@ -134,6 +134,24 @@ class _NormalizationMixin:
        if self.dtype is None:
            self.dtype = torch.float32
        self._tensor_stats = to_tensor(self.stats, device=self.device, dtype=self.dtype)
+        self._reshape_visual_stats()
+
+    def _reshape_visual_stats(self) -> None:
+        """Reshape flat ``(C,)`` visual stats to ``(C, 1, 1)`` for image broadcasting.
+
+        No-op for stats from :func:`~lerobot.datasets.compute_stats.compute_stats`
+        (already ``(C, 1, 1)``). Needed by RL training, which can start without
+        a dataset and supplies stats manually via JSON config.
+        """
+        for key, feature in self.features.items():
+            if feature.type != FeatureType.VISUAL:
+                continue
+            if key not in self._tensor_stats:
+                continue
+            for stat_name, stat_tensor in self._tensor_stats[key].items():
+                if not isinstance(stat_tensor, Tensor) or stat_tensor.ndim != 1:
+                    continue
+                self._tensor_stats[key][stat_name] = stat_tensor.reshape(-1, 1, 1)

    def to(
        self, device: torch.device | str | None = None, dtype: torch.dtype | None = None
@@ -152,6 +170,7 @@ class _NormalizationMixin:
        if dtype is not None:
            self.dtype = dtype
        self._tensor_stats = to_tensor(self.stats, device=self.device, dtype=self.dtype)
+        self._reshape_visual_stats()
        return self

    def state_dict(self) -> dict[str, Tensor]:
@@ -201,6 +220,7 @@ class _NormalizationMixin:
            # Don't load from state_dict, keep the explicitly provided stats
            # But ensure _tensor_stats is properly initialized
            self._tensor_stats = to_tensor(self.stats, device=self.device, dtype=self.dtype)  # type: ignore[assignment]
+            self._reshape_visual_stats()
            return

        # Normal behavior: load stats from state_dict
@@ -211,6 +231,7 @@ class _NormalizationMixin:
            self._tensor_stats.setdefault(key, {})[stat_name] = tensor.to(
                dtype=torch.float32, device=self.device
            )
+        self._reshape_visual_stats()

        # Reconstruct the original stats dict from tensor stats for compatibility with to() method
        # and other functions that rely on self.stats
@@ -20,11 +20,13 @@ from .factory import (
    make_reward_pre_post_processors as make_reward_pre_post_processors,
 )
 from .pretrained import PreTrainedRewardModel as PreTrainedRewardModel
+from .robometer.configuration_robometer import RobometerConfig as RobometerConfig
 from .sarm.configuration_sarm import SARMConfig as SARMConfig

 __all__ = [
    # Configuration classes
    "RewardClassifierConfig",
+    "RobometerConfig",
    "SARMConfig",
    # Base class
    "PreTrainedRewardModel",
@@ -30,7 +30,7 @@ class RewardClassifierConfig(RewardModelConfig):
    latent_dim: int = 256
    image_embedding_pooling_dim: int = 8
    dropout_rate: float = 0.1
-    model_name: str = "helper2424/resnet10"  # TODO: This needs to be updated. The model on the Hub doesn't call self.post_init() in its __init__, which is required by transformers v5 to set all_tied_weights_keys. The from_pretrained call fails when it tries to access this attribute during _finalize_model_loading.
+    model_name: str = "lerobot/resnet10"
    device: str = "cpu"
    model_type: str = "cnn"  # "transformer" or "cnn"
    num_cameras: int = 2
@@ -105,6 +105,7 @@ class Classifier(PreTrainedRewardModel):
    def __init__(
        self,
        config: RewardClassifierConfig,
+        **kwargs,
    ):
        from transformers import AutoModel

@@ -24,6 +24,7 @@ from lerobot.configs.rewards import RewardModelConfig
 from lerobot.processor import PolicyAction, PolicyProcessorPipeline
 from lerobot.rewards.classifier.configuration_classifier import RewardClassifierConfig
 from lerobot.rewards.pretrained import PreTrainedRewardModel
+from lerobot.rewards.robometer.configuration_robometer import RobometerConfig
 from lerobot.rewards.sarm.configuration_sarm import SARMConfig


@@ -36,7 +37,7 @@ def get_reward_model_class(name: str) -> type[PreTrainedRewardModel]:

    Args:
        name: The name of the reward model. Supported names are "reward_classifier",
-              "sarm".
+              "sarm", "robometer".

    Returns:
        The reward model class corresponding to the given name.
@@ -52,6 +53,10 @@ def get_reward_model_class(name: str) -> type[PreTrainedRewardModel]:
        from lerobot.rewards.sarm.modeling_sarm import SARMRewardModel

        return SARMRewardModel
+    elif name == "robometer":
+        from lerobot.rewards.robometer.modeling_robometer import RobometerRewardModel
+
+        return RobometerRewardModel
    else:
        try:
            return _get_reward_model_cls_from_name(name=name)
@@ -68,7 +73,7 @@ def make_reward_model_config(reward_type: str, **kwargs) -> RewardModelConfig:

    Args:
        reward_type: The type of the reward model. Supported types include
-                     "reward_classifier", "sarm".
+                     "reward_classifier", "sarm", "robometer".
        **kwargs: Keyword arguments to be passed to the configuration class constructor.

    Returns:
@@ -81,6 +86,8 @@ def make_reward_model_config(reward_type: str, **kwargs) -> RewardModelConfig:
        return RewardClassifierConfig(**kwargs)
    elif reward_type == "sarm":
        return SARMConfig(**kwargs)
+    elif reward_type == "robometer":
+        return RobometerConfig(**kwargs)
    else:
        try:
            config_cls = RewardModelConfig.get_choice_class(reward_type)
@@ -160,6 +167,13 @@ def make_reward_pre_post_processors(
            dataset_stats=kwargs.get("dataset_stats"),
            dataset_meta=kwargs.get("dataset_meta"),
        )
+    elif isinstance(reward_cfg, RobometerConfig):
+        from lerobot.rewards.robometer.processor_robometer import make_robometer_pre_post_processors
+
+        return make_robometer_pre_post_processors(
+            config=reward_cfg,
+            dataset_stats=kwargs.get("dataset_stats"),
+        )

    else:
        try:
@@ -0,0 +1,19 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from .configuration_robometer import RobometerConfig
+from .modeling_robometer import RobometerRewardModel
+from .processor_robometer import make_robometer_pre_post_processors
+
+__all__ = ["RobometerConfig", "RobometerRewardModel", "make_robometer_pre_post_processors"]
@@ -0,0 +1,229 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Upstream/legacy Robometer checkpoint loader.
+
+This module is **only** used by the one-time conversion tooling
+(:mod:`lerobot.scripts.lerobot_export_robometer` and
+``scripts/verify_robometer_export.py``). It supports:
+
+- Sharded upstream checkpoints (``model-0000X-of-Y.safetensors`` + index).
+- PEFT/LoRA adapter checkpoints (``adapter_config.json`` + adapter weights).
+- Local snapshot directories or Hugging Face Hub repo ids.
+
+Once :class:`~lerobot.rewards.robometer.RobometerRewardModel` is loaded
+through this module, calling ``save_pretrained`` writes the canonical
+LeRobot-native layout (single ``model.safetensors`` + ``config.json``) that
+the base loader understands.
+
+The runtime path
+(:meth:`~lerobot.rewards.pretrained.PreTrainedRewardModel.from_pretrained`)
+does **not** import this file. It is safe to delete once you no longer need
+the conversion tooling.
+"""
+
+from __future__ import annotations
+
+import json
+import logging
+from pathlib import Path
+from typing import Any
+
+from huggingface_hub import snapshot_download
+from safetensors.torch import load_file
+from torch import Tensor, nn
+
+from lerobot.utils.import_utils import require_package
+
+logger = logging.getLogger(__name__)
+
+
+def _download_robometer_snapshot(
+    pretrained_path: str,
+    *,
+    hub_token: str | None = None,
+) -> Path:
+    """Resolve a Robometer snapshot directory.
+
+    - If ``pretrained_path`` is an existing local directory, return it directly.
+    - Otherwise treat ``pretrained_path`` as a Hugging Face repo id (optionally
+      with ``@revision``) and download it via ``snapshot_download``.
+    """
+    local_candidate = Path(pretrained_path)
+    if local_candidate.is_dir():
+        return local_candidate
+
+    if "@" in pretrained_path:
+        repo_id, revision = pretrained_path.split("@", 1)
+    else:
+        repo_id, revision = pretrained_path, None
+
+    return Path(
+        snapshot_download(
+            repo_id=repo_id,
+            revision=revision,
+            token=hub_token,
+            allow_patterns=[
+                "*.json",
+                "*.safetensors",
+                "*.bin",
+                "*.txt",
+                "*.model",
+                "tokenizer*",
+                "special_tokens_map.json",
+            ],
+        )
+    )
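
Reviewer note: the `repo_id@revision` convention handled by `_download_robometer_snapshot` can be illustrated without touching the Hub. This is a minimal sketch of the string handling only; the helper name `split_repo_and_revision` is hypothetical, and the `main` revision in the usage note is an assumed example.

```python
from typing import Optional, Tuple


def split_repo_and_revision(pretrained_path: str) -> Tuple[str, Optional[str]]:
    """Split 'org/name@revision' into (repo_id, revision); revision may be None."""
    if "@" in pretrained_path:
        # Split on the first '@' only, mirroring split("@", 1) in the loader.
        repo_id, revision = pretrained_path.split("@", 1)
        return repo_id, revision
    return pretrained_path, None
```

For example, `split_repo_and_revision("robometer/Robometer-4B@main")` yields the repo id and `"main"`, while a bare repo id yields `None` as the revision, which `snapshot_download` treats as the default branch.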
+
+
+def _maybe_apply_peft(base_model: Any, snapshot_dir: Path) -> Any:
+    adapter_config = snapshot_dir / "adapter_config.json"
+    if not adapter_config.exists():
+        return base_model
+
+    require_package("peft", extra="peft-dep")
+    from peft import PeftModel
+
+    return PeftModel.from_pretrained(base_model, str(snapshot_dir))
+
+
+def _remap_state_dict_keys(state_dict: dict[str, Tensor], model: nn.Module) -> dict[str, Tensor]:
+    """Try a few common prefix swaps so PEFT-wrapped checkpoints load cleanly."""
+    model_keys = set(model.state_dict().keys())
+    remapped: dict[str, Tensor] = {}
+
+    for key, value in state_dict.items():
+        if key in model_keys:
+            remapped[key] = value
+            continue
+
+        candidates: list[str] = []
+        if key.startswith("model.model."):
+            candidates.append(key.replace("model.model.", "model.base_model.model.model.", 1))
+            candidates.append(key.replace("model.model.", "model.", 1))
+        if key.startswith("model."):
+            candidates.append(f"model.{key}")
+            candidates.append(key.replace("model.", "", 1))
+        else:
+            candidates.append(f"model.{key}")
+        if key.startswith("model.") and not key.startswith("model.base_model."):
+            parts = key.split(".", 1)
+            if len(parts) == 2:
+                candidates.append(f"model.base_model.{parts[1]}")
+
+        for candidate in candidates:
+            if candidate in model_keys:
+                remapped[candidate] = value
+                break
+        else:
+            remapped[key] = value
+
+    return remapped
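
Reviewer note: the prefix-swap order in `_remap_state_dict_keys` is easier to follow on bare key names. The sketch below reproduces the candidate generation on strings only (no tensors); `candidate_keys` and `remap_key` are hypothetical helper names, and the key names in the usage note are made up for illustration.

```python
from typing import List, Set


def candidate_keys(key: str) -> List[str]:
    """List alternative key names in the same order the loader tries them."""
    candidates: List[str] = []
    if key.startswith("model.model."):
        candidates.append(key.replace("model.model.", "model.base_model.model.model.", 1))
        candidates.append(key.replace("model.model.", "model.", 1))
    if key.startswith("model."):
        candidates.append(f"model.{key}")
        candidates.append(key.replace("model.", "", 1))
    else:
        candidates.append(f"model.{key}")
    if key.startswith("model.") and not key.startswith("model.base_model."):
        candidates.append(f"model.base_model.{key.split('.', 1)[1]}")
    return candidates


def remap_key(key: str, model_keys: Set[str]) -> str:
    """Return the first candidate present in the model, else the key unchanged."""
    if key in model_keys:
        return key
    for candidate in candidate_keys(key):
        if candidate in model_keys:
            return candidate
    return key
```

Given a PEFT-wrapped model whose keys include `model.base_model.model.model.layers.0.weight`, the checkpoint key `model.model.layers.0.weight` resolves to it on the first candidate. Keys with no match pass through unchanged and surface later as unexpected keys in the non-strict `load_state_dict`.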
+
+
+def _resolve_checkpoint_safetensors_files(snapshot_dir: Path) -> list[Path]:
+    """Pick the safetensors files that hold the full model weights.
+
+    When ``model.safetensors.index.json`` is present, only the files it lists are
+    loaded. Otherwise any ``model*.safetensors`` shards are preferred over
+    sidecar files. Falls back to every ``*.safetensors`` in the snapshot.
+    """
+    index_path = snapshot_dir / "model.safetensors.index.json"
+    if index_path.exists():
+        with index_path.open() as f:
+            weight_map = json.load(f).get("weight_map", {})
+        indexed = sorted(
+            {snapshot_dir / name for name in weight_map.values() if (snapshot_dir / name).exists()}
+        )
+        if indexed:
+            return indexed
+
+    model_shards = sorted(snapshot_dir.glob("model*.safetensors"))
+    if model_shards:
+        return model_shards
+
+    return sorted(snapshot_dir.glob("*.safetensors"))
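
Reviewer note: the selection order in `_resolve_checkpoint_safetensors_files` (index first, then `model*.safetensors` shards, then any `*.safetensors`) can be exercised with empty placeholder files. This is a sketch under assumptions: `resolve_weight_files` restates the function for standalone use, `demo` is a hypothetical helper, and the shard and adapter file names are made up.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def resolve_weight_files(snapshot_dir: Path) -> List[Path]:
    """Pick weight files: indexed shards, else model* shards, else all safetensors."""
    index_path = snapshot_dir / "model.safetensors.index.json"
    if index_path.exists():
        weight_map = json.loads(index_path.read_text()).get("weight_map", {})
        indexed = sorted(
            {snapshot_dir / name for name in weight_map.values() if (snapshot_dir / name).exists()}
        )
        if indexed:
            return indexed
    shards = sorted(snapshot_dir.glob("model*.safetensors"))
    if shards:
        return shards
    return sorted(snapshot_dir.glob("*.safetensors"))


def demo() -> List[str]:
    """Build a throwaway snapshot without an index and list the selected files."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in (
            "model-00001-of-00002.safetensors",
            "model-00002-of-00002.safetensors",
            "adapter_model.safetensors",  # sidecar file, skipped when shards exist
        ):
            (root / name).touch()
        return [path.name for path in resolve_weight_files(root)]
```

In this layout the sidecar adapter file is ignored because the `model*` shards take precedence; it would only be picked up if no `model*.safetensors` file existed.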
+
+
+def apply_upstream_checkpoint(
+    model: nn.Module,
+    pretrained_path: str,
+    *,
+    hub_token: str | None = None,
+) -> None:
+    """Load an upstream (sharded / PEFT) Robometer checkpoint into ``model``.
+
+    Downloads the snapshot, optionally applies PEFT wrapping, merges sharded
+    ``.safetensors`` files in memory, remaps PEFT-prefixed keys, and loads them
+    into ``model`` non-strictly. ``model`` must already be constructed with the
+    matching Robometer architecture (e.g. via
+    :class:`~lerobot.rewards.robometer.RobometerRewardModel` ``__init__``).
+    """
+    snapshot_dir = _download_robometer_snapshot(pretrained_path, hub_token=hub_token)
+
+    # PEFT adapter checkpoints wrap the base model before weight loading so the
+    # remapper can place adapter tensors at the right prefix.
+    base_model = getattr(model, "model", None)
+    if base_model is not None:
+        wrapped = _maybe_apply_peft(base_model, snapshot_dir)
+        if wrapped is not base_model:
+            model.model = wrapped
+
+    files = _resolve_checkpoint_safetensors_files(snapshot_dir)
+    if not files:
+        logger.warning("No *.safetensors files in %s; using freshly initialised heads", snapshot_dir)
+        return
+
+    merged: dict[str, Tensor] = {}
+    for path in files:
+        merged.update(load_file(str(path)))
+
+    remapped = _remap_state_dict_keys(merged, model)
+
+    # Defensive vocab-match. With the corrected resize logic
+    # (``_resize_embeddings_for_robometer`` uses ``len(tokenizer) + 5``),
+    # a freshly built ``RobometerRewardModel`` should already share the same
+    # vocabulary as the upstream checkpoint (e.g. 151,674 for
+    # ``robometer/Robometer-4B``). This block stays in place as a safety net
+    # in case a future upstream variant uses a different vocab — we never
+    # want ``load_state_dict`` to trip on a silent shape mismatch.
+    base_model = getattr(model, "model", None)
+    if base_model is not None and hasattr(base_model, "get_input_embeddings"):
+        for key in (
+            "model.model.language_model.embed_tokens.weight",
+            "model.language_model.embed_tokens.weight",
+            "model.embed_tokens.weight",
+        ):
+            tensor = remapped.get(key)
+            if tensor is None:
+                continue
+            ckpt_vocab = int(tensor.shape[0])
+            current_vocab = int(base_model.get_input_embeddings().num_embeddings)
+            if ckpt_vocab != current_vocab:
+                logger.info(
+                    "Resizing model embed table %d -> %d to match upstream checkpoint vocab "
+                    "(upstream was trained against a different Qwen revision).",
+                    current_vocab,
+                    ckpt_vocab,
+                )
+                base_model.resize_token_embeddings(ckpt_vocab)
+            break
+
+    missing, unexpected = model.load_state_dict(remapped, strict=False)
+    if missing:
+        logger.debug("Robometer checkpoint missing %d keys (sample: %s)", len(missing), missing[:5])
+    if unexpected:
+        logger.debug(
+            "Robometer checkpoint had %d unexpected keys (sample: %s)", len(unexpected), unexpected[:5]
+        )
@@ -0,0 +1,162 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import annotations
+
+from copy import deepcopy
+from dataclasses import dataclass, field
+from typing import TYPE_CHECKING, Any
+
+from lerobot.configs import FeatureType, NormalizationMode, PolicyFeature
+from lerobot.configs.rewards import RewardModelConfig
+from lerobot.utils.constants import OBS_IMAGES
+from lerobot.utils.import_utils import _transformers_available, require_package
+
+if TYPE_CHECKING or _transformers_available:
+    from transformers import AutoConfig, AutoTokenizer
+else:
+    AutoConfig = None  # type: ignore[assignment]
+    AutoTokenizer = None  # type: ignore[assignment]
+
+
+@RewardModelConfig.register_subclass("robometer")
+@dataclass
+class RobometerConfig(RewardModelConfig):
+    """Configuration for the Robometer reward model."""
+
+    pretrained_path: str | None = "lilkm/Robometer-4B"
+    image_key: str = OBS_IMAGES + ".top"
+    task_key: str = "task"
+    default_task: str | None = None
+
+    max_frames: int | None = 8
+    reward_output: str = "progress"  # "progress" or "success"
+    success_threshold: float = 0.5
+
+    license: str | None = "apache-2.0"
+    tags: list[str] | None = field(
+        default_factory=lambda: ["reward-model", "vision-language", "qwen3-vl", "zero-shot"]
+    )
+
+    base_model_id: str = "Qwen/Qwen3-VL-4B-Instruct"
+    torch_dtype: str = "bfloat16"
+    use_multi_image: bool = True
+    use_per_frame_progress_token: bool = True
+    average_temporal_patches: bool = True
+    frame_pooling: str = "mean"  # "mean" | "boundary" | "attention"
+    frame_pooling_attn_temperature: float = 1.0
+    progress_loss_type: str = "discrete"  # "l1" | "l2" | "discrete"
+    progress_discrete_bins: int = 10
+
+    # Serialised Qwen backbone config (post-resize). Always populated by
+    # ``__post_init__`` from ``base_model_id`` + ``len(tokenizer) + 5``, so it
+    # is never ``None`` after construction (EO-1 style). Saved into
+    # ``config.json`` automatically by the base ``_save_pretrained``.
+    vlm_config: dict[str, Any] | None = None
+
+    input_features: dict[str, PolicyFeature] = field(default_factory=dict)
+    output_features: dict[str, PolicyFeature] = field(default_factory=dict)
+    normalization_mapping: dict[str, NormalizationMode] = field(
+        default_factory=lambda: {
+            "VISUAL": NormalizationMode.IDENTITY,
+            "REWARD": NormalizationMode.IDENTITY,
+        }
+    )
+
+    def __post_init__(self) -> None:
+        super().__post_init__()
+        if self.reward_output not in {"progress", "success"}:
+            raise ValueError(f"reward_output must be 'progress' or 'success', got {self.reward_output!r}")
+        if self.max_frames is not None and self.max_frames < 1:
+            raise ValueError(f"max_frames must be >= 1, got {self.max_frames}")
+        if self.frame_pooling not in {"mean", "boundary", "attention"}:
+            raise ValueError(f"frame_pooling must be mean/boundary/attention; got {self.frame_pooling!r}")
+        if self.frame_pooling_attn_temperature <= 0:
+            raise ValueError("frame_pooling_attn_temperature must be > 0")
+        if self.progress_loss_type not in {"l1", "l2", "discrete"}:
+            raise ValueError(f"progress_loss_type must be l1/l2/discrete; got {self.progress_loss_type!r}")
+        if self.use_per_frame_progress_token and not self.use_multi_image:
+            raise ValueError("use_per_frame_progress_token=True requires use_multi_image=True")
+
+        if self.image_key not in self.input_features:
+            self.input_features[self.image_key] = PolicyFeature(shape=(3, 224, 224), type=FeatureType.VISUAL)
+        self.output_features.setdefault("progress", PolicyFeature(shape=(1,), type=FeatureType.REWARD))
+        self.output_features.setdefault("success", PolicyFeature(shape=(1,), type=FeatureType.REWARD))
+
+        # Deterministically populate ``vlm_config`` so it is never ``None``
+        # after construction (mirrors EO-1's ``__post_init__`` snapshot).
+        # The target vocab matches upstream Robometer's runtime resize
+        # ``base_model.resize_token_embeddings(len(processor.tokenizer))`` —
+        # see ``third_party/robometer/.../setup_utils.py`` —
+        # i.e. ``len(tokenizer) + len(ROBOMETER_SPECIAL_TOKENS)``.
+        #
+        # For ``Qwen/Qwen3-VL-4B-Instruct`` this gives 151,669 + 5 = 151,674,
+        # which is exactly the published ``robometer/Robometer-4B`` checkpoint
+        # vocab. NB: ``text_config.vocab_size`` in the raw Qwen config is the
+        # padded embedding-table size (151,936), not the tokenizer length —
+        # we override it with the tokenizer-driven value to stay consistent
+        # with upstream.
+        if self.vlm_config is None:
+            require_package("transformers", extra="robometer")
+            # Local import avoids a top-level cycle (modeling_robometer imports
+            # this module). ``ROBOMETER_SPECIAL_TOKENS`` is the single source
+            # of truth for the resize delta.
+            from lerobot.rewards.robometer.modeling_robometer import ROBOMETER_SPECIAL_TOKENS
+
+            vlm = AutoConfig.from_pretrained(self.base_model_id).to_dict()
+            tokenizer = AutoTokenizer.from_pretrained(self.base_model_id)
+            text_config = vlm.get("text_config")
+            if not isinstance(text_config, dict):
+                raise ValueError(
+                    f"Backbone config for {self.base_model_id!r} has no nested `text_config`; "
+                    "Robometer expects a Qwen-VL-style config."
+                )
+            text_config["vocab_size"] = len(tokenizer) + len(ROBOMETER_SPECIAL_TOKENS)
+            self.vlm_config = vlm
+
+    @property
+    def use_discrete_progress(self) -> bool:
+        """Whether the progress head outputs distribution logits over bins."""
+        return self.progress_loss_type.lower() == "discrete"
+
+    @property
+    def vlm_backbone_config(self):
+        """Reconstruct the Qwen backbone config from :attr:`vlm_config`.
+
+        ``vlm_config`` is always populated after :meth:`__post_init__`
+        (either fresh, computed from the tokenizer, or loaded from a saved
+        ``config.json`` via draccus).
+        """
+        require_package("transformers", extra="robometer")
+        config_dict = deepcopy(self.vlm_config)
+        model_type = config_dict.pop("model_type", None)
+        if model_type is None:
+            raise ValueError("vlm_config must include `model_type` to reconstruct the backbone config")
+        return AutoConfig.for_model(model_type, **config_dict)
+
+    @property
+    def observation_delta_indices(self) -> list[int] | None:
+        return None
+
+    @property
+    def action_delta_indices(self) -> None:
+        return None
+
+    @property
+    def reward_delta_indices(self) -> None:
+        return None
+
+    def validate_features(self) -> None:
+        if self.image_key not in self.input_features:
+            raise ValueError(f"Robometer requires image input feature {self.image_key!r}")
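Review note on the vocab-size bookkeeping in `RobometerConfig.__post_init__`: the comment there gives 151,669 + 5 = 151,674 for `Qwen/Qwen3-VL-4B-Instruct` and contrasts it with the padded embedding table of 151,936 rows. The sketch below only reproduces that arithmetic. `target_vocab_size` is a hypothetical helper, not part of this PR. The token tuple is copied from `ROBOMETER_SPECIAL_TOKENS`, and the two integer constants are the figures quoted in the code comment, not values measured here. It also assumes none of the special tokens is already in the tokenizer vocabulary.

```python
# Copied from ROBOMETER_SPECIAL_TOKENS in modeling_robometer.py; order matters there.
SPECIAL_TOKENS = (
    "<|split_token|>",
    "<|reward_token|>",
    "<|pref_token|>",
    "<|sim_token|>",
    "<|prog_token|>",
)


def target_vocab_size(tokenizer_len: int, special_tokens: tuple[str, ...] = SPECIAL_TOKENS) -> int:
    """Hypothetical helper: embedding rows needed once the special tokens are appended.

    Assumes none of ``special_tokens`` is already present in the tokenizer.
    """
    if tokenizer_len < 1:
        raise ValueError(f"tokenizer_len must be >= 1, got {tokenizer_len}")
    return tokenizer_len + len(special_tokens)


# Figures quoted in the config comment for Qwen/Qwen3-VL-4B-Instruct (assumed, not measured).
TOKENIZER_LEN = 151_669
PADDED_TABLE_SIZE = 151_936

vocab = target_vocab_size(TOKENIZER_LEN)
unused_rows = PADDED_TABLE_SIZE - vocab

print(vocab)        # 151674
print(unused_rows)  # 262
```

If those figures hold, overriding `text_config.vocab_size` with the tokenizer-driven value drops 262 padded rows from the raw Qwen table. That is why the saved `vlm_config` has to carry the override for `from_config` to build an embedding table of the checkpoint's shape.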
@@ -0,0 +1,493 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Robometer reward model.
+
+- Qwen3-VL backbone (default: ``Qwen/Qwen3-VL-4B-Instruct``).
+- Progress + success heads at inference; the preference head is preserved in the
+  state dict but not queried.
+"""
+
+from __future__ import annotations
+
+import logging
+from typing import TYPE_CHECKING, Any
+
+import torch
+from torch import Tensor, nn
+
+from lerobot.rewards.pretrained import PreTrainedRewardModel
+from lerobot.rewards.robometer.configuration_robometer import RobometerConfig
+from lerobot.utils.constants import OBS_PREFIX
+from lerobot.utils.import_utils import _transformers_available, require_package
+
+if TYPE_CHECKING or _transformers_available:
+    from transformers import AutoModelForImageTextToText
+else:
+    AutoModelForImageTextToText = None  # type: ignore[assignment]
+
+logger = logging.getLogger(__name__)
+
+# Namespace for Robometer's pre-encoded Qwen-VL observation tensors. The
+# processor writes both Qwen-VL tensors and Robometer-specific token ids /
+# metadata here; the model reads them at inference, so it needs no tokenizer
+# (EO-1-style separation).
+ROBOMETER_FEATURE_PREFIX = f"{OBS_PREFIX}robometer."
+ROBOMETER_QWEN_INPUT_KEYS = (
+    "input_ids",
+    "attention_mask",
+    "pixel_values",
+    "pixel_values_videos",
+    "image_grid_thw",
+    "video_grid_thw",
+    "second_per_grid_ts",
+)
+ROBOMETER_METADATA_KEYS = (
+    "prog_token_id",
+    "vision_start_token_id",
+    "vision_end_token_id",
+    "video_merge_size",
+)
+ROBOMETER_INPUT_KEYS = ROBOMETER_QWEN_INPUT_KEYS + ROBOMETER_METADATA_KEYS
+
+# Order matters: the released checkpoint resized `embed_tokens` after adding
+# these tokens in this order, so changing the set or order would silently
+# misalign the saved embedding rows with their token ids. `<|reward_token|>`
+# and `<|sim_token|>` are vestigial (never read by any head) but still occupy
+# rows the checkpoint expects.
+ROBOMETER_SPECIAL_TOKENS = (
+    "<|split_token|>",
+    "<|reward_token|>",
+    "<|pref_token|>",
+    "<|sim_token|>",
+    "<|prog_token|>",
+)
+
+
+def convert_bins_to_continuous(bin_logits: Tensor) -> Tensor:
+    """Collapse per-bin logits into a single value in ``[0, 1]``.
+
+    The discrete progress head outputs ``num_bins`` logits per frame. Bin
+    centers are evenly spaced over ``[0, 1]``; the continuous prediction is
+    the softmax-weighted mean of those centers.
+    """
+    bin_probs = torch.softmax(bin_logits, dim=-1)
+    num_bins = bin_logits.shape[-1]
+    bin_centers = torch.linspace(0.0, 1.0, num_bins, device=bin_logits.device, dtype=bin_logits.dtype)
+    return (bin_probs * bin_centers).sum(dim=-1)
+
+
+def squeeze_last_safe(x: Tensor) -> Tensor:
+    """Drop a trailing singleton dim only when present.
+
+    Matches the upstream helper of the same name in
+    ``robometer.models.rbm`` (kept module-level and non-underscored to mirror
+    upstream).
+    """
+    return x.squeeze(-1) if x.ndim > 1 and x.shape[-1] == 1 else x
+
+
+def _torch_dtype(name: str) -> torch.dtype:
+    dtype = getattr(torch, name, None)
+    if isinstance(dtype, torch.dtype):
+        return dtype
+    raise ValueError(f"Unknown torch dtype: {name!r}")
+
+
+class RobometerPredictionHead(nn.Sequential):
+    """Small MLP head used for Robometer's progress / success / preference outputs.
+
+    Subclasses ``nn.Sequential`` (not ``nn.Module``) so the ``state_dict`` keys
+    stay flat (``progress_head.0.weight``, ``progress_head.1.weight``, ...) and
+    remain byte-compatible with the published ``lilkm/robometer-4b`` checkpoint.
+    """
+
+    def __init__(self, hidden_dim: int, output_size: int, *, dropout: float, with_sigmoid: bool) -> None:
+        layers: list[nn.Module] = [
+            nn.Linear(hidden_dim, hidden_dim // 2),
+            nn.LayerNorm(hidden_dim // 2),
+            nn.GELU(),
+            nn.Dropout(dropout),
+            nn.Linear(hidden_dim // 2, output_size),
+        ]
+        if with_sigmoid:
+            layers.append(nn.Sigmoid())
+        super().__init__(*layers)
+
+
+def decode_progress_outputs(
+    progress_logits: Tensor | None,
+    success_logits: Tensor | None,
+    *,
+    is_discrete_mode: bool,
+) -> dict[str, list[list[float]]]:
+    """Decode RBM head outputs into per-frame floats.
+
+    Args:
+        progress_logits: ``(B, T)`` sigmoid outputs (continuous) or ``(B, T, num_bins)`` logits (discrete).
+        success_logits: ``(B, T)`` raw logits, ``sigmoid``-ed to probabilities.
+        is_discrete_mode: if True the progress logits get a softmax over bins
+            and are projected onto bin centers via :func:`convert_bins_to_continuous`.
+
+    Returns:
+        Dict with ``progress_pred`` and ``success_probs``, each a list of
+        length ``B`` of per-frame float lists.
+    """
+    progress_pred: list[list[float]] = []
+    success_probs: list[list[float]] = []
+
+    if progress_logits is not None:
+        for sample_logits in progress_logits:
+            if is_discrete_mode:
+                continuous = convert_bins_to_continuous(sample_logits.detach().float().cpu())
+                progress_pred.append(continuous.flatten().tolist())
+            else:
+                progress_pred.append(sample_logits.detach().float().cpu().flatten().tolist())
+
+    if success_logits is not None:
+        for sample_logits in success_logits:
+            success_probs.append(torch.sigmoid(sample_logits.detach().float().cpu()).flatten().tolist())
+
+    return {"progress_pred": progress_pred, "success_probs": success_probs}
+
+
+class RobometerRewardModel(PreTrainedRewardModel):
+    """Robometer reward model: Qwen3-VL backbone + progress/success heads."""
+
+    name = "robometer"
+    config_class = RobometerConfig
+
+    def __init__(self, config: RobometerConfig, *, dropout: float = 0.1) -> None:
+        require_package("transformers", extra="robometer")
+        super().__init__(config)
+        self.config = config
+
+        # Two backbone-build paths (EO-1 style, branched on ``pretrained_path``):
+        #
+        #   - Fresh training (``pretrained_path is None``): download the base
+        #     Qwen weights and resize the embed table to match
+        #     ``vlm_config.text_config.vocab_size`` — populated deterministically
+        #     in ``RobometerConfig.__post_init__`` as
+        #     ``len(tokenizer) + len(ROBOMETER_SPECIAL_TOKENS)``, mirroring
+        #     upstream Robometer's ``_add_special_tokens_and_resize`` in
+        #     ``third_party/robometer/.../setup_utils.py``.
+        #
+        #   - Loading a saved checkpoint (``pretrained_path`` is set): rebuild
+        #     the empty architecture from ``vlm_config`` via
+        #     ``AutoModelForImageTextToText.from_config`` so the subsequent
+        #     ``model.safetensors`` load is a direct fill of the right shape —
+        #     no redundant Qwen weight download.
+        torch_dtype = _torch_dtype(config.torch_dtype)
+        if config.pretrained_path is None:
+            self.model = AutoModelForImageTextToText.from_pretrained(
+                config.base_model_id,
+                dtype=torch_dtype,
+                trust_remote_code=True,
+            )
+            target_vocab = config.vlm_config["text_config"]["vocab_size"]
+            self.model.resize_token_embeddings(target_vocab)
+        else:
+            self.model = AutoModelForImageTextToText.from_config(
+                config.vlm_backbone_config,
+                dtype=torch_dtype,
+                trust_remote_code=True,
+            )
+
+        # All Qwen-VL backbones Robometer supports expose `text_config.hidden_size`.
+        # Fall back to the top-level `hidden_size` so that a future text-only
+        # backbone would still resolve.
+        backbone_config = self.model.config
+        text_config = getattr(backbone_config, "text_config", None)
+        hidden_size = getattr(text_config, "hidden_size", None) if text_config is not None else None
+        if hidden_size is None:
+            hidden_size = getattr(backbone_config, "hidden_size", None)
+        if hidden_size is None:
+            raise AttributeError(
+                f"Could not infer hidden_size from backbone config of {config.base_model_id}"
+            )
+        hidden_dim = int(hidden_size)
+
+        # Robometer's three prediction heads + frame-pool attention. The
+        # preference head is preserved to match the published state-dict layout
+        # even though only progress + success are consumed at inference, and
+        # `frame_pool_attn` is always allocated so checkpoints trained with
+        # `frame_pooling="attention"` load without remapping.
+        progress_output = config.progress_discrete_bins if config.use_discrete_progress else 1
+        self.progress_head = RobometerPredictionHead(
+            hidden_dim,
+            progress_output,
+            dropout=dropout,
+            with_sigmoid=not config.use_discrete_progress,
+        )
+        self.preference_head = RobometerPredictionHead(hidden_dim, 1, dropout=dropout, with_sigmoid=False)
+        self.success_head = RobometerPredictionHead(hidden_dim, 1, dropout=dropout, with_sigmoid=False)
+        self.frame_pool_attn = nn.Linear(hidden_dim, 1, bias=False)
+
+        # Match the dtype of the loaded base model so weight loading is a no-op cast.
+        model_dtype = next(self.model.parameters()).dtype
+        self.progress_head.to(dtype=model_dtype)
+        self.preference_head.to(dtype=model_dtype)
+        self.success_head.to(dtype=model_dtype)
+        self.frame_pool_attn.to(dtype=model_dtype)
+
+    def compute_reward(self, batch: dict[str, Tensor]) -> Tensor:
+        inputs = {
+            key: batch[f"{ROBOMETER_FEATURE_PREFIX}{key}"]
+            for key in ROBOMETER_INPUT_KEYS
+            if f"{ROBOMETER_FEATURE_PREFIX}{key}" in batch
+        }
+        if "input_ids" not in inputs:
+            raise KeyError(
+                "Robometer batch missing pre-encoded inputs (expected "
+                f"`{ROBOMETER_FEATURE_PREFIX}input_ids`). Make sure the "
+                "RobometerEncoderProcessorStep ran before `compute_reward`."
+            )
+
+        device = next(self.model.parameters()).device
+        inputs = {key: value.to(device) if hasattr(value, "to") else value for key, value in inputs.items()}
+
+        self.eval()
+        with torch.no_grad():
+            progress_logits, success_logits = self._compute_rbm_logits(inputs)
+
+        decoded = decode_progress_outputs(
+            progress_logits,
+            success_logits,
+            is_discrete_mode=self.config.use_discrete_progress,
+        )
+        values = (
+            decoded["success_probs"] if self.config.reward_output == "success" else decoded["progress_pred"]
+        )
+
+        rewards = torch.stack([torch.as_tensor(seq, dtype=torch.float32)[-1] for seq in values])
+        if self.config.reward_output == "success":
+            rewards = (rewards > self.config.success_threshold).float()
+        return rewards.to(self.config.device or "cpu")
+
+    def _compute_rbm_logits(
+        self,
+        inputs: dict[str, Any],
+    ) -> tuple[Tensor, Tensor]:
+        """Run the Qwen3-VL backbone and apply Robometer's heads.
+
+        ``inputs`` is the encoded batch produced by
+        :class:`RobometerEncoderProcessorStep`. It carries Qwen tensors as well
+        as Robometer-specific metadata (``prog_token_id``,
+        ``vision_start_token_id``, ``vision_end_token_id``, ``video_merge_size``)
+        — the metadata is popped here so the rest can be forwarded straight to
+        the Qwen model.
+
+        Returns ``(progress_logits, success_logits)``. Shapes:
+
+        - ``progress_logits``: ``(B, T)`` (continuous) or ``(B, T, num_bins)`` (discrete).
+        - ``success_logits``: ``(B, T)`` raw logits (sigmoid happens at decode time).
+        """
+        prog_token_id = inputs.pop("prog_token_id", None)
+        vision_start_token_id = inputs.pop("vision_start_token_id", None)
+        vision_end_token_id = inputs.pop("vision_end_token_id", None)
+        video_merge_size = inputs.pop("video_merge_size", 14)
+
+        # Qwen3-VL doesn't reliably populate `last_hidden_state`; ask for the
+        # full hidden-state tuple and take the last layer. This matches the
+        # `is_qwen3` path in upstream Robometer's `RBM.forward_qwen` (main).
+        outputs = self.model(**inputs, output_hidden_states=True, return_dict=True)
+        hidden_state = (
+            outputs.hidden_states[-1]
+            if getattr(outputs, "hidden_states", None)
+            else outputs.last_hidden_state
+        )
+
+        input_ids = inputs["input_ids"]
+        if self.config.use_per_frame_progress_token:
+            if prog_token_id is None:
+                raise KeyError("`prog_token_id` missing in batch (run RobometerEncoderProcessorStep first)")
+            return self._process_token_extraction(hidden_state, input_ids, prog_token_id=prog_token_id)
+        if self.config.use_multi_image:
+            if vision_start_token_id is None or vision_end_token_id is None:
+                raise KeyError(
+                    "`vision_start_token_id` / `vision_end_token_id` missing in batch "
+                    "(run RobometerEncoderProcessorStep first)"
+                )
+            return self._process_multi_image_frames(
+                hidden_state,
+                input_ids,
+                start_id=vision_start_token_id,
+                end_id=vision_end_token_id,
+            )
+        video_grid_thw = inputs.get("video_grid_thw")
+        if video_grid_thw is None:
+            raise ValueError("video_grid_thw is required for video-mode Robometer inference")
+        if vision_start_token_id is None:
+            raise KeyError("`vision_start_token_id` missing in batch")
+        return self._process_video_frames(
+            hidden_state,
+            input_ids,
+            video_grid_thw,
+            start_id=vision_start_token_id,
+            merge_size=video_merge_size,
+        )
+
+    def _apply_heads_to_hidden_states(self, frame_embeddings: Tensor) -> tuple[Tensor, Tensor]:
+        """Apply progress + success heads to a tensor of frame embeddings.
+
+        Mirrors upstream ``RBM._apply_heads_to_hidden_states``.
+        """
+        progress_out = self.progress_head(frame_embeddings)
+        progress = progress_out if self.config.use_discrete_progress else squeeze_last_safe(progress_out)
+        success = squeeze_last_safe(self.success_head(frame_embeddings))
+        return progress, success
+
+    def _process_token_extraction(
+        self,
+        hidden_state: Tensor,
+        input_ids: Tensor,
+        *,
+        prog_token_id: int,
+    ) -> tuple[Tensor, Tensor]:
+        """Per-frame progress/success from ``<|prog_token|>`` positions.
+
+        Mirrors the progress-sample branch of upstream
+        ``RBM._process_token_extraction``.
+        """
+        token_mask = input_ids == prog_token_id
+        batch_indices, positions = token_mask.nonzero(as_tuple=True)
+        if positions.numel() == 0:
+            raise ValueError("`<|prog_token|>` not found in any sequence")
+
+        per_sample_hidden = [
+            hidden_state[i, positions[batch_indices == i]] for i in range(input_ids.shape[0])
+        ]
+        progress_list, success_list = [], []
+        for embeddings in per_sample_hidden:
+            if embeddings.shape[0] == 0:
+                raise ValueError("`<|prog_token|>` missing in a sequence")
+            progress, success = self._apply_heads_to_hidden_states(embeddings)
+            progress_list.append(progress)
+            success_list.append(success)
+
+        return torch.stack(progress_list), torch.stack(success_list)
+
+    def _process_multi_image_frames(
+        self,
+        hidden_state: Tensor,
+        input_ids: Tensor,
+        *,
+        start_id: int,
+        end_id: int,
+    ) -> tuple[Tensor, Tensor]:
+        """Per-frame progress/success in multi-image mode (Qwen-VL).
+
+        Mirrors upstream ``RBM._process_multi_image_frames`` (progress-sample
+        branch only — we don't run preference at inference).
+        """
+        progress_list, success_list = [], []
+        for batch_idx in range(input_ids.shape[0]):
+            seq_ids = input_ids[batch_idx]
+            seq_hidden = hidden_state[batch_idx]
+            frame_embeddings = self._extract_hidden_states_from_token_pairs(
+                seq_hidden, seq_ids, start_id, end_id
+            )
+            progress, success = self._apply_heads_to_hidden_states(frame_embeddings)
+            progress_list.append(progress)
+            success_list.append(success)
+
+        return torch.stack(progress_list), torch.stack(success_list)
+
+    def _extract_hidden_states_from_token_pairs(
+        self,
+        hidden_state: Tensor,
+        input_ids: Tensor,
+        start_id: int,
+        end_id: int,
+    ) -> Tensor:
+        start_positions = (input_ids == start_id).nonzero(as_tuple=True)[0]
+        end_positions = (input_ids == end_id).nonzero(as_tuple=True)[0]
+        if start_positions.numel() == 0:
+            raise ValueError("`<|vision_start|>` not found in sequence")
+        if start_positions.numel() != end_positions.numel():
+            raise ValueError(
+                f"Mismatched vision token counts: {start_positions.numel()} start vs "
+                f"{end_positions.numel()} end"
+            )
+
+        frames: list[Tensor] = []
+        for start, end in zip(start_positions.tolist(), end_positions.tolist(), strict=True):
+            if start >= end:
+                raise ValueError(f"Invalid vision token pair: start={start} end={end}")
+            patch_tokens = hidden_state[start + 1 : end]
+            if patch_tokens.shape[0] == 0:
+                frames.append((hidden_state[start] + hidden_state[end]) / 2.0)
+                continue
+
+            pooling = self.config.frame_pooling
+            if pooling == "mean":
+                frames.append(patch_tokens.mean(dim=0))
+            elif pooling == "boundary":
+                frames.append(patch_tokens[-1])
+            else:  # attention
+                scores = (
+                    self.frame_pool_attn(patch_tokens).squeeze(-1)
+                    / self.config.frame_pooling_attn_temperature
+                )
+                weights = torch.softmax(scores, dim=0).unsqueeze(-1)
+                frames.append((weights * patch_tokens).sum(dim=0))
+
+        return torch.stack(frames)
+
+    def _process_video_frames(
+        self,
+        hidden_state: Tensor,
+        input_ids: Tensor,
+        video_grid_thw: Tensor,
+        *,
+        start_id: int,
+        merge_size: int,
+    ) -> tuple[Tensor, Tensor]:
+        """Per-frame progress/success in video mode (Qwen-VL).
+
+        Mirrors upstream ``RBM._process_video_frames`` /
+        ``RBM._extract_progress_from_trajectory`` (progress-sample branch
+        only — preference is not run at inference). In particular,
+        ``average_temporal_patches=False`` reads the *boundary* token at
+        ``cursor + tokens_per_frame`` to match upstream byte-for-byte.
+        """
+        progress_list, success_list = [], []
+        for batch_idx in range(input_ids.shape[0]):
+            seq_ids = input_ids[batch_idx]
+            seq_hidden = hidden_state[batch_idx]
+            start_positions = (seq_ids == start_id).nonzero(as_tuple=True)[0]
+            if start_positions.numel() == 0:
+                raise ValueError("`<|vision_start|>` not found in sequence")
+            t_dim, h_dim, w_dim = (int(x) for x in video_grid_thw[batch_idx].tolist())
+            tokens_per_frame = (h_dim * w_dim) // (merge_size**2)
+
+            cursor = start_positions[0].item()
+            frame_embeddings: list[Tensor] = []
+            for _ in range(t_dim):
+                if self.config.average_temporal_patches:
+                    patch = seq_hidden[cursor : cursor + tokens_per_frame]
+                    frame_embeddings.append(patch.mean(dim=0))
+                else:
+                    # Upstream takes the position *one past* the patch span as
+                    # the per-frame boundary; see
+                    # `RBM._extract_progress_from_trajectory`.
+                    frame_embeddings.append(seq_hidden[cursor + tokens_per_frame])
+                cursor += tokens_per_frame
+
+            stacked = torch.stack(frame_embeddings)
+            progress, success = self._apply_heads_to_hidden_states(stacked)
+            progress_list.append(progress)
+            success_list.append(success)
+
+        return torch.stack(progress_list), torch.stack(success_list)
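Review note on how `compute_reward` reduces per-frame head outputs to one scalar per trajectory. In discrete mode the last frame's bin logits are collapsed by `convert_bins_to_continuous`. In success mode the last frame's logit goes through a sigmoid and is compared with `config.success_threshold`. The sketch below restates both paths in plain Python so the arithmetic can be checked without `torch`. The function names are hypothetical, and the threshold is passed in explicitly because the config default is not visible in this diff.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax over a flat list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bins_to_continuous(bin_logits: list[float]) -> float:
    """Softmax-weighted mean of evenly spaced bin centers over [0, 1]."""
    num_bins = len(bin_logits)
    if num_bins == 1:
        centers = [0.0]  # same as torch.linspace(0.0, 1.0, 1)
    else:
        centers = [i / (num_bins - 1) for i in range(num_bins)]
    probs = softmax(bin_logits)
    return sum(p * c for p, c in zip(probs, centers))


def progress_reward(frame_bin_logits: list[list[float]]) -> float:
    """Discrete-progress path: decode only the final frame."""
    return bins_to_continuous(frame_bin_logits[-1])


def success_reward(frame_logits: list[float], threshold: float) -> float:
    """Success path: sigmoid of the final frame's logit, then a hard threshold."""
    prob = 1.0 / (1.0 + math.exp(-frame_logits[-1]))
    return 1.0 if prob > threshold else 0.0


# Uniform logits put equal mass on centers 0.0, 0.5 and 1.0, so the mean is about 0.5.
uniform = progress_reward([[5.0, 0.0, 0.0], [0.0, 0.0, 0.0]])

# Only the last frame counts: a confident final logit crosses a 0.5 threshold.
late_success = success_reward([-2.0, 3.0], threshold=0.5)
early_only = success_reward([3.0, -2.0], threshold=0.5)
```

Because only the last frame is read, a trajectory that succeeds early and then drifts away scores 0.0 in success mode. Callers who want an "ever succeeded" signal would need to reduce over frames themselves.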
@@ -0,0 +1,348 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Robometer pre/post processing pipelines."""
+
+from __future__ import annotations
+
+from dataclasses import dataclass, field
+from typing import TYPE_CHECKING, Any
+
+import numpy as np
+import torch
+from PIL import Image
+from torch import Tensor
+
+from lerobot.configs import PipelineFeatureType, PolicyFeature
+from lerobot.processor import (
+    AddBatchDimensionProcessorStep,
+    DeviceProcessorStep,
+    PolicyAction,
+    PolicyProcessorPipeline,
+    ProcessorStep,
+    ProcessorStepRegistry,
+    policy_action_to_transition,
+)
+from lerobot.rewards.robometer.configuration_robometer import RobometerConfig
+from lerobot.rewards.robometer.modeling_robometer import (
+    ROBOMETER_FEATURE_PREFIX,
+    ROBOMETER_SPECIAL_TOKENS,
+)
+from lerobot.types import EnvTransition, TransitionKey
+from lerobot.utils.constants import (
+    OBS_IMAGES,
+    POLICY_POSTPROCESSOR_DEFAULT_NAME,
+    POLICY_PREPROCESSOR_DEFAULT_NAME,
+)
+from lerobot.utils.import_utils import _transformers_available, require_package
+
+if TYPE_CHECKING or _transformers_available:
+    from transformers import AutoProcessor
+else:
+    AutoProcessor = None
+
+PROGRESS_PROMPT = (
+    "The task for the robot is '{task}'. Given the trajectory video, predict "
+    "the task progress at each frame, how far along the robot is towards "
+    "completing the task, a float between 0 and 1, where 0 is the starting "
+    "state and 1 is when the task is completed. If the robot is not "
+    "performing the same task, predict 0 progress."
+)
+
+
+def _frames_to_pil(frames: np.ndarray) -> list[Image.Image]:
+    """Convert ``(T, H, W, C)`` uint8 frames to a list of PIL images."""
+    if frames.ndim != 4:
+        raise ValueError(f"Expected (T,H,W,C) frames; got shape {frames.shape}")
+    if frames.dtype != np.uint8:
+        frames = np.clip(frames, 0, 255).astype(np.uint8)
+    return [Image.fromarray(frames[i]) for i in range(frames.shape[0])]
+
+
+def _video_to_numpy(video: Tensor, *, max_frames: int | None) -> np.ndarray:
+    """Convert one trajectory tensor to a ``(T, H, W, C) uint8`` numpy array."""
+    if max_frames is not None:
+        video = video[-max_frames:]
+    if video.shape[1] in (1, 3):
+        video = video.permute(0, 2, 3, 1)
+    elif video.shape[-1] not in (1, 3):
+        raise ValueError(f"Expected channel dim of size 1 or 3, got shape {tuple(video.shape)}")
+
+    array = video.detach().cpu().numpy()
+    if np.issubdtype(array.dtype, np.floating) and array.size > 0 and array.max() <= 1.0:
+        array = array * 255.0
+    return np.clip(array, 0, 255).astype(np.uint8)
+
+
+def _expand_tasks(task: Any, *, batch_size: int, default: str | None) -> list[str]:
+    if task is None:
+        task = default
+    if task is None:
+        raise KeyError("Robometer expected a task description in complementary data")
+    if isinstance(task, str):
+        return [task] * batch_size
+    if isinstance(task, tuple):
+        task = list(task)
+    if not (isinstance(task, list) and all(isinstance(item, str) for item in task)):
+        raise TypeError(f"Robometer task must be a string or list of strings, got {type(task)}")
+    if len(task) == 1 and batch_size > 1:
+        return task * batch_size
+    if len(task) != batch_size:
+        raise ValueError(f"Expected {batch_size} tasks, got {len(task)}")
+    return task
+
+
+@dataclass
+@ProcessorStepRegistry.register(name="robometer_encoder")
+class RobometerEncoderProcessorStep(ProcessorStep):
+    """Encode raw frames + task into Qwen-VL tensors for the Robometer model.
+
+    Loads a :class:`~transformers.AutoProcessor` matching ``base_model_id`` and
+    registers Robometer's special tokens on the tokenizer. The matching
+    embedding resize happens model-side in
+    :meth:`RobometerRewardModel.__init__`. This step owns the tokenizer (the
+    model itself never needs one) and is the EO-1-style boundary between
+    pre-processing and modeling.
+
+    At call time the step reads:
+
+    - ``observation[image_key]``: ``(B, T, C, H, W)`` or ``(B, C, H, W)`` frames.
+    - ``complementary_data[task_key]``: a string or list of strings.
+
+    and writes ``observation[f"{ROBOMETER_FEATURE_PREFIX}<name>"]`` for:
+
+    - the Qwen-VL processor outputs: ``input_ids``, ``attention_mask``,
+      ``pixel_values``, ``image_grid_thw``, ``video_grid_thw``, ...
+    - Robometer-specific token ids consumed by the model heads:
+      ``prog_token_id``, ``vision_start_token_id``, ``vision_end_token_id``,
+      ``video_merge_size``.
+    """
+
+    base_model_id: str = "Qwen/Qwen3-VL-4B-Instruct"
+    image_key: str = OBS_IMAGES + ".top"
+    task_key: str = "task"
+    default_task: str | None = None
+    max_frames: int | None = 8
+    use_multi_image: bool = True
+    use_per_frame_progress_token: bool = True
+    max_length: int = 1024
+
+    _processor: Any = field(default=None, init=False, repr=False)
+
+    def __post_init__(self) -> None:
+        require_package("transformers", extra="robometer")
+        require_package("qwen-vl-utils", extra="robometer", import_name="qwen_vl_utils")
+
+        self._processor = AutoProcessor.from_pretrained(
+            self.base_model_id,
+            trust_remote_code=True,
+            do_sample_frames=False,
+            padding_side="right",
+        )
+
+        # Register Robometer's special tokens on the tokenizer. The matching
+        # embedding resize happens model-side in `RobometerRewardModel.__init__`.
+        tokenizer = self._processor.tokenizer
+        # Qwen tokenizers may not define a pad token, but batched prompts/videos
+        # require padding, so reuse EOS as the padding token.
+        if tokenizer.pad_token is None:
+            tokenizer.pad_token = tokenizer.eos_token
+        for token in ROBOMETER_SPECIAL_TOKENS:
+            if token not in tokenizer.get_vocab():
+                tokenizer.add_special_tokens({"additional_special_tokens": [token]})
+
+    def __call__(self, transition: EnvTransition) -> EnvTransition:
+        observation = transition.get(TransitionKey.OBSERVATION)
+        complementary = transition.get(TransitionKey.COMPLEMENTARY_DATA) or {}
+        if not isinstance(observation, dict):
+            raise ValueError("RobometerEncoderProcessorStep requires an observation dict")
+
+        if self.image_key not in observation:
+            raise KeyError(f"Robometer expected image key {self.image_key!r} in observation")
+
+        frames = observation[self.image_key]
+        tensor = frames.detach().cpu() if isinstance(frames, Tensor) else torch.as_tensor(frames)
+        if tensor.ndim == 4:
+            tensor = tensor.unsqueeze(1)
+        elif tensor.ndim != 5:
+            raise ValueError(
+                f"Expected Robometer frames with shape (B,C,H,W) or (B,T,C,H,W); got {tuple(tensor.shape)}"
+            )
+
+        batch_size = tensor.shape[0]
+        tasks = _expand_tasks(
+            complementary.get(self.task_key, self.default_task),
+            batch_size=batch_size,
+            default=self.default_task,
+        )
+
+        samples = [
+            (_video_to_numpy(tensor[i], max_frames=self.max_frames), tasks[i]) for i in range(batch_size)
+        ]
+        encoded = self.encode_samples(samples)
+
+        new_observation = dict(observation)
+        for key, value in encoded.items():
+            new_observation[f"{ROBOMETER_FEATURE_PREFIX}{key}"] = value
+
+        new_transition = transition.copy()
+        new_transition[TransitionKey.OBSERVATION] = new_observation
+        return new_transition
+
+    def encode_samples(self, samples: list[tuple[np.ndarray, str]]) -> dict[str, Tensor]:
+        """Run the Qwen-VL processor on a list of ``(frames, task)`` samples.
+
+        Used internally by ``__call__`` and exposed for callers that want to
+        run the encoder on a single trajectory without building an
+        :class:`EnvTransition` (see ``examples/dataset/create_robometer_progress_videos.py``).
+        """
+        from qwen_vl_utils import process_vision_info
+
+        conversations = [self._build_conversation(frames, task) for frames, task in samples]
+
+        texts = [
+            self._processor.apply_chat_template(
+                msg,
+                tokenize=False,
+                add_generation_prompt=False,
+                add_vision_id=True,
+                enable_thinking=False,
+                fps=1,
+            )
+            for msg in conversations
+        ]
+
+        process_kwargs: dict[str, Any] = {
+            "return_video_kwargs": True,
+            "return_video_metadata": True,
+        }
+        image_processor = getattr(self._processor, "image_processor", None)
+        if image_processor is not None and hasattr(image_processor, "patch_size"):
+            process_kwargs["image_patch_size"] = image_processor.patch_size
+
+        image_inputs, video_inputs, video_kwargs = process_vision_info(conversations, **process_kwargs)
+
+        videos: list[Any] | None = None
+        video_metadatas: list[Any] | None = None
+        if video_inputs:
+            if isinstance(video_inputs[0], tuple) and len(video_inputs[0]) == 2:
+                videos_seq, metadatas_seq = zip(*video_inputs, strict=False)
+                videos = list(videos_seq)
+                video_metadatas = list(metadatas_seq)
+            else:
+                videos = list(video_inputs)
+
+        processor_kwargs: dict[str, Any] = {
+            "text": texts,
+            "images": image_inputs,
+            "padding": True,
+            "truncation": False,
+            "max_length": self.max_length,
+            "return_tensors": "pt",
+            "do_resize": False,
+        }
+        if videos is not None:
+            processor_kwargs["videos"] = videos
+        if video_metadatas is not None:
+            processor_kwargs["video_metadata"] = video_metadatas
+        if video_kwargs:
+            processor_kwargs.update(video_kwargs)
+
+        encoded = self._processor(**processor_kwargs)
+
+        # Write Robometer-specific token ids and the video patch merge size into
+        # the encoded batch so `RobometerRewardModel` doesn't need its own
+        # tokenizer at inference (EO1-style separation: the processor owns the
+        # tokenizer, the model owns the backbone and heads).
+        tokenizer = self._processor.tokenizer
+        encoded["prog_token_id"] = tokenizer.convert_tokens_to_ids("<|prog_token|>")
+        encoded["vision_start_token_id"] = tokenizer.convert_tokens_to_ids("<|vision_start|>")
+        encoded["vision_end_token_id"] = tokenizer.convert_tokens_to_ids("<|vision_end|>")
+        video_processor = getattr(self._processor, "video_processor", None)
+        # Fall back to 2, the spatial merge size used by Qwen-VL video processors (14 is a patch size).
+        encoded["video_merge_size"] = int(getattr(video_processor, "merge_size", 2))
+        return encoded
+
+    def _build_conversation(self, frames: np.ndarray, task: str) -> list[dict[str, Any]]:
+        pil_frames = _frames_to_pil(frames)
+        prompt = PROGRESS_PROMPT.format(task=task)
+        content: list[dict[str, Any]] = [{"type": "text", "text": prompt}]
+
+        if self.use_multi_image:
+            for image in pil_frames:
+                content.append({"type": "image", "image": image})
+                if self.use_per_frame_progress_token:
+                    content.append({"type": "text", "text": "<|prog_token|>"})
+        else:
+            content.append({"type": "video", "video": pil_frames, "sample_fps": 1.0})
+
+        return [{"role": "user", "content": content}]
+
+    def transform_features(
+        self, features: dict[PipelineFeatureType, dict[str, PolicyFeature]]
+    ) -> dict[PipelineFeatureType, dict[str, PolicyFeature]]:
+        # The Qwen-VL processor produces variable-length sequence tensors that
+        # don't fit the static `PolicyFeature(shape=...)` mould; we deliberately
+        # do not advertise the new observation keys here.
+        return features
+
+    def get_config(self) -> dict[str, Any]:
+        return {
+            "base_model_id": self.base_model_id,
+            "image_key": self.image_key,
+            "task_key": self.task_key,
+            "default_task": self.default_task,
+            "max_frames": self.max_frames,
+            "use_multi_image": self.use_multi_image,
+            "use_per_frame_progress_token": self.use_per_frame_progress_token,
+            "max_length": self.max_length,
+        }
+
+
+def make_robometer_pre_post_processors(
+    config: RobometerConfig,
+    dataset_stats: dict[str, dict[str, Any]] | None = None,
+) -> tuple[
+    PolicyProcessorPipeline[dict[str, Any], dict[str, Any]],
+    PolicyProcessorPipeline[PolicyAction, PolicyAction],
+]:
+    """Pipeline that pre-encodes frames + task into Qwen-VL tensors.
+
+    The preprocessor adds a batch dimension if needed, runs Robometer's
+    encoder, and moves everything to the configured device. The
+    postprocessor is the identity since Robometer outputs a single reward
+    tensor (no action to un-normalise).
+    """
+    del dataset_stats  # Robometer has its own normalisation inside the Qwen-VL processor.
+
+    preprocessor = PolicyProcessorPipeline[dict[str, Any], dict[str, Any]](
+        steps=[
+            AddBatchDimensionProcessorStep(),
+            RobometerEncoderProcessorStep(
+                base_model_id=config.base_model_id,
+                image_key=config.image_key,
+                task_key=config.task_key,
+                default_task=config.default_task,
+                max_frames=config.max_frames,
+                use_multi_image=config.use_multi_image,
+                use_per_frame_progress_token=config.use_per_frame_progress_token,
+            ),
+            DeviceProcessorStep(device=config.device or "cpu"),
+        ],
+        name=POLICY_PREPROCESSOR_DEFAULT_NAME,
+    )
+    postprocessor = PolicyProcessorPipeline(
+        name=POLICY_POSTPROCESSOR_DEFAULT_NAME,
+        to_transition=policy_action_to_transition,
+    )
+    return preprocessor, postprocessor
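
The branch in `encode_samples` that separates `(video, metadata)` pairs from bare videos can be checked in isolation. The following is a minimal sketch, assuming plain Python objects stand in for decoded videos and their metadata; `split_video_inputs` is a hypothetical helper name and is not part of this patch.

```python
from __future__ import annotations

from typing import Any


def split_video_inputs(
    video_inputs: list[Any] | None,
) -> tuple[list[Any] | None, list[Any] | None]:
    """Hypothetical helper mirroring the unpacking branch in encode_samples."""
    if not video_inputs:
        # No videos at all (e.g. multi-image mode): nothing to forward.
        return None, None
    first = video_inputs[0]
    if isinstance(first, tuple) and len(first) == 2:
        # Each entry is a (video, metadata) pair: transpose into two lists.
        # zip() without strict= behaves like strict=False in the patch.
        videos, metadatas = zip(*video_inputs)
        return list(videos), list(metadatas)
    # Bare videos without metadata.
    return list(video_inputs), None


# Placeholder strings and dicts stand in for real frames and metadata.
pairs = [("video_a", {"fps": 1.0}), ("video_b", {"fps": 1.0})]
videos, metadatas = split_video_inputs(pairs)
```

Returning `None` for a missing list matches how the patch only adds `videos` and `video_metadata` to the processor kwargs when they are present.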
@@ -12,23 +12,33 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-"""
-Reinforcement learning modules.
+"""Reinforcement learning modules.

-Requires: ``pip install 'lerobot[hilserl]'``
-
-Available modules (import directly)::
-
-    from lerobot.rl.actor import ...
-    from lerobot.rl.learner import ...
-    from lerobot.rl.learner_service import ...
-    from lerobot.rl.buffer import ...
-    from lerobot.rl.eval_policy import ...
-    from lerobot.rl.gym_manipulator import ...
+Distributed actor / learner entry points (``actor``, ``learner``,
+``learner_service``) require ``pip install 'lerobot[hilserl]'``. Algorithms,
+buffer, data sources and trainer are gRPC-free and usable standalone.
 """

-from lerobot.utils.import_utils import require_package
+from .algorithms.base import RLAlgorithm as RLAlgorithm
+from .algorithms.configs import RLAlgorithmConfig as RLAlgorithmConfig, TrainingStats as TrainingStats
+from .algorithms.factory import (
+    make_algorithm as make_algorithm,
+    make_algorithm_config as make_algorithm_config,
+)
+from .algorithms.sac.configuration_sac import SACAlgorithmConfig as SACAlgorithmConfig
+from .buffer import ReplayBuffer as ReplayBuffer
+from .data_sources import DataMixer as DataMixer, OnlineOfflineMixer as OnlineOfflineMixer
+from .trainer import RLTrainer as RLTrainer

-require_package("grpcio", extra="hilserl", import_name="grpc")
-
-__all__: list[str] = []
+__all__ = [
+    "RLAlgorithm",
+    "RLAlgorithmConfig",
+    "TrainingStats",
+    "make_algorithm",
+    "make_algorithm_config",
+    "SACAlgorithmConfig",
+    "RLTrainer",
+    "ReplayBuffer",
+    "DataMixer",
+    "OnlineOfflineMixer",
+]
@@ -49,39 +49,53 @@ https://github.com/michel-aractingi/lerobot-hilserl-guide
 import logging
 import os
 import time
+from collections.abc import Generator
 from functools import lru_cache
 from queue import Empty
+from typing import TYPE_CHECKING, Any
+
+from lerobot.utils.import_utils import _grpc_available, require_package
+
+if TYPE_CHECKING or _grpc_available:
+    import grpc
+
+    from lerobot.transport import services_pb2, services_pb2_grpc
+    from lerobot.transport.utils import (
+        bytes_to_state_dict,
+        grpc_channel_options,
+        python_object_to_bytes,
+        receive_bytes_in_chunks,
+        send_bytes_in_chunks,
+        transitions_to_bytes,
+    )
+else:
+    grpc = None
+    services_pb2 = None
+    services_pb2_grpc = None
+    bytes_to_state_dict = None
+    grpc_channel_options = None
+    python_object_to_bytes = None
+    receive_bytes_in_chunks = None
+    send_bytes_in_chunks = None
+    transitions_to_bytes = None

-import grpc
 import torch
 from torch import nn
-from torch.multiprocessing import Event, Queue
+from torch.multiprocessing import Queue

 from lerobot.cameras import opencv  # noqa: F401
 from lerobot.configs import parser
-from lerobot.configs.train import TrainRLServerPipelineConfig
-from lerobot.policies import make_policy
-from lerobot.policies.sac.modeling_sac import SACPolicy
+from lerobot.policies import make_policy, make_pre_post_processors
+from lerobot.processor import TransitionKey
 from lerobot.robots import so_follower  # noqa: F401
 from lerobot.teleoperators import gamepad, so_leader  # noqa: F401
 from lerobot.teleoperators.utils import TeleopEvents
-from lerobot.transport import services_pb2, services_pb2_grpc
-from lerobot.transport.utils import (
-    bytes_to_state_dict,
-    grpc_channel_options,
-    python_object_to_bytes,
-    receive_bytes_in_chunks,
-    send_bytes_in_chunks,
-    transitions_to_bytes,
-)
-from lerobot.types import TransitionKey
 from lerobot.utils.device_utils import get_safe_torch_device
 from lerobot.utils.process import ProcessSignalHandler
 from lerobot.utils.random_utils import set_seed
 from lerobot.utils.robot_utils import precise_sleep
 from lerobot.utils.transition import (
    Transition,
-    move_state_dict_to_device,
    move_transition_to_device,
 )
 from lerobot.utils.utils import (
@@ -89,19 +103,24 @@ from lerobot.utils.utils import (
    init_logging,
 )

+from .algorithms.base import RLAlgorithm
+from .algorithms.factory import make_algorithm
 from .gym_manipulator import (
-    create_transition,
    make_processors,
    make_robot_env,
+    reset_and_build_transition,
    step_env_and_process_transition,
 )
 from .queue import get_last_item_from_queue
+from .train_rl import TrainRLServerPipelineConfig

 # Main entry point


@parser.wrap()
 def actor_cli(cfg: TrainRLServerPipelineConfig):
+    # Fail fast with a friendly error if the optional ``hilserl`` extra is missing.
+    require_package("grpcio", extra="hilserl", import_name="grpc")
    cfg.validate()
    display_pid = False
    if not use_threads(cfg):
@@ -212,7 +231,7 @@ def actor_cli(cfg: TrainRLServerPipelineConfig):

 def act_with_policy(
    cfg: TrainRLServerPipelineConfig,
-    shutdown_event: any,  # Event,
+    shutdown_event: Any,  # Event
    parameters_queue: Queue,
    transitions_queue: Queue,
    interactions_queue: Queue,
@@ -252,22 +271,24 @@ def act_with_policy(
    logging.info("make_policy")

    ### Instantiate the policy in both the actor and learner processes
-    ### To avoid sending a SACPolicy object through the port, we create a policy instance
+    ### To avoid sending a policy object through the port, we create a policy instance
    ### on both sides, the learner sends the updated parameters every n steps to update the actor's parameters
-    policy: SACPolicy = make_policy(
+    policy = make_policy(
        cfg=cfg.policy,
        env_cfg=cfg.env,
    )
-    policy = policy.eval()
+    policy = policy.to(device).eval()
    assert isinstance(policy, nn.Module)

-    obs, info = online_env.reset()
-    env_processor.reset()
-    action_processor.reset()
+    # Build the algorithm
+    algorithm = make_algorithm(cfg=cfg.algorithm, policy=policy)

-    # Process initial observation
-    transition = create_transition(observation=obs, info=info)
-    transition = env_processor(transition)
+    preprocessor, postprocessor = make_pre_post_processors(
+        policy_cfg=cfg.policy,
+        dataset_stats=cfg.policy.dataset_stats,
+    )
+
+    transition = reset_and_build_transition(online_env, env_processor, action_processor)

    # NOTE: For the moment we will solely handle the case of a single environment
    sum_reward_episode = 0
@@ -291,8 +312,17 @@ def act_with_policy(

        # Time policy inference and check if it meets FPS requirement
        with policy_timer:
-            # Extract observation from transition for policy
-            action = policy.select_action(batch=observation)
+            normalized_observation = preprocessor.process_observation(observation)
+            action = policy.select_action(batch=normalized_observation)
+            # Unnormalize only the continuous part; the last (discrete) dimension skips the postprocessor.
+            if cfg.policy.num_discrete_actions is not None:
+                continuous_action = postprocessor.process_action(action[..., :-1])
+                discrete_action = action[..., -1:].to(
+                    device=continuous_action.device, dtype=continuous_action.dtype
+                )
+                action = torch.cat([continuous_action, discrete_action], dim=-1)
+            else:
+                action = postprocessor.process_action(action)
        policy_fps = policy_timer.fps_last

        log_policy_frequency_issue(policy_fps=policy_fps, cfg=cfg, interaction_step=interaction_step)
@@ -326,7 +356,8 @@ def act_with_policy(

        # Check for intervention from transition info
        intervention_info = new_transition[TransitionKey.INFO]
-        if intervention_info.get(TeleopEvents.IS_INTERVENTION, False):
+        is_intervention = bool(intervention_info.get(TeleopEvents.IS_INTERVENTION, False))
+        if is_intervention:
            episode_intervention = True
            episode_intervention_steps += 1

@@ -334,6 +365,7 @@ def act_with_policy(
            "discrete_penalty": torch.tensor(
                [new_transition[TransitionKey.COMPLEMENTARY_DATA].get("discrete_penalty", 0.0)]
            ),
+            TeleopEvents.IS_INTERVENTION.value: is_intervention,
        }
        # Create transition for learner (convert to old format)
        list_transition_to_send_to_learner.append(
@@ -354,7 +386,7 @@ def act_with_policy(
        if done or truncated:
            logging.info(f"[ACTOR] Global step {interaction_step}: Episode reward: {sum_reward_episode}")

-            update_policy_parameters(policy=policy, parameters_queue=parameters_queue, device=device)
+            update_policy_parameters(algorithm=algorithm, parameters_queue=parameters_queue, device=device)

            if len(list_transition_to_send_to_learner) > 0:
                push_transitions_to_transport_queue(
@@ -390,14 +422,7 @@ def act_with_policy(
            episode_intervention_steps = 0
            episode_total_steps = 0

-            # Reset environment and processors
-            obs, info = online_env.reset()
-            env_processor.reset()
-            action_processor.reset()
-
-            # Process initial observation
-            transition = create_transition(observation=obs, info=info)
-            transition = env_processor(transition)
+            transition = reset_and_build_transition(online_env, env_processor, action_processor)

        if cfg.env.fps is not None:
            dt_time = time.perf_counter() - start_time
@@ -408,10 +433,10 @@ def act_with_policy(


 def establish_learner_connection(
-    stub: services_pb2_grpc.LearnerServiceStub,
-    shutdown_event: Event,  # type: ignore
+    stub: "services_pb2_grpc.LearnerServiceStub",
+    shutdown_event: Any,  # Event
    attempts: int = 30,
-):
+) -> bool:
    """Establish a connection with the learner.

    Args:
@@ -441,12 +466,14 @@ def establish_learner_connection(
 def learner_service_client(
    host: str = "127.0.0.1",
    port: int = 50051,
-) -> tuple[services_pb2_grpc.LearnerServiceStub, grpc.Channel]:
-    """
-    Returns a client for the learner service.
+) -> "tuple[services_pb2_grpc.LearnerServiceStub, grpc.Channel]":
+    """Return a client for the learner service.

    GRPC uses HTTP/2, which is a binary protocol and multiplexes requests over a single connection.
    So we need to create only one client and reuse it.
+
+    Returns:
+        tuple[services_pb2_grpc.LearnerServiceStub, grpc.Channel]: The stub and the channel.
    """

    channel = grpc.insecure_channel(
@@ -461,16 +488,18 @@ def learner_service_client(
 def receive_policy(
    cfg: TrainRLServerPipelineConfig,
    parameters_queue: Queue,
-    shutdown_event: Event,  # type: ignore
-    learner_client: services_pb2_grpc.LearnerServiceStub | None = None,
-    grpc_channel: grpc.Channel | None = None,
-):
+    shutdown_event: Any,  # Event
+    learner_client: "services_pb2_grpc.LearnerServiceStub | None" = None,
+    grpc_channel: "grpc.Channel | None" = None,
+) -> None:
    """Receive parameters from the learner.

    Args:
        cfg (TrainRLServerPipelineConfig): The configuration for the actor.
        parameters_queue (Queue): The queue to receive the parameters.
        shutdown_event (Event): The event to check if the process should shutdown.
+        learner_client (services_pb2_grpc.LearnerServiceStub | None): Optional pre-created stub.
+        grpc_channel (grpc.Channel | None): Optional pre-created channel.
    """
    logging.info("[ACTOR] Start receiving parameters from the Learner")
    if not use_threads(cfg):
@@ -513,12 +542,11 @@ def receive_policy(
 def send_transitions(
    cfg: TrainRLServerPipelineConfig,
    transitions_queue: Queue,
-    shutdown_event: any,  # Event,
-    learner_client: services_pb2_grpc.LearnerServiceStub | None = None,
-    grpc_channel: grpc.Channel | None = None,
-) -> services_pb2.Empty:
-    """
-    Sends transitions to the learner.
+    shutdown_event: Any,  # Event
+    learner_client: "services_pb2_grpc.LearnerServiceStub | None" = None,
+    grpc_channel: "grpc.Channel | None" = None,
+) -> None:
+    """Send transitions to the learner.

    This function continuously retrieves messages from the queue and processes:

@@ -526,6 +554,13 @@ def send_transitions(
        - A batch of transitions (observation, action, reward, next observation) is collected.
        - Transitions are moved to the CPU and serialized using PyTorch.
        - The serialized data is wrapped in a `services_pb2.Transition` message and sent to the learner.
+
+    Args:
+        cfg (TrainRLServerPipelineConfig): The configuration for the actor.
+        transitions_queue (Queue): The queue from which transitions are read before being sent.
+        shutdown_event (Event): The event signalling that the process should shut down.
+        learner_client (services_pb2_grpc.LearnerServiceStub | None): Optional pre-created stub.
+        grpc_channel (grpc.Channel | None): Optional pre-created channel.
    """

    if not use_threads(cfg):
@@ -563,18 +598,24 @@ def send_transitions(
 def send_interactions(
    cfg: TrainRLServerPipelineConfig,
    interactions_queue: Queue,
-    shutdown_event: Event,  # type: ignore
-    learner_client: services_pb2_grpc.LearnerServiceStub | None = None,
-    grpc_channel: grpc.Channel | None = None,
-) -> services_pb2.Empty:
-    """
-    Sends interactions to the learner.
+    shutdown_event: Any,  # Event
+    learner_client: "services_pb2_grpc.LearnerServiceStub | None" = None,
+    grpc_channel: "grpc.Channel | None" = None,
+) -> None:
+    """Send interactions to the learner.

    This function continuously retrieves messages from the queue and processes:

    - Interaction Messages:
        - Contains useful statistics about episodic rewards and policy timings.
        - The message is serialized using `pickle` and sent to the learner.
+
+    Args:
+        cfg (TrainRLServerPipelineConfig): The configuration for the actor.
+        interactions_queue (Queue): The queue from which interaction messages are read before being sent.
+        shutdown_event (Event): The event signalling that the process should shut down.
+        learner_client (services_pb2_grpc.LearnerServiceStub | None): Optional pre-created stub.
+        grpc_channel (grpc.Channel | None): Optional pre-created channel.
    """

    if not use_threads(cfg):
@@ -613,7 +654,11 @@ def send_interactions(
    logging.info("[ACTOR] Interactions process stopped")


-def transitions_stream(shutdown_event: Event, transitions_queue: Queue, timeout: float) -> services_pb2.Empty:  # type: ignore
+def transitions_stream(
+    shutdown_event: Any,  # Event
+    transitions_queue: Queue,
+    timeout: float,
+) -> "Generator[Any, None, services_pb2.Empty]":
    while not shutdown_event.is_set():
        try:
            message = transitions_queue.get(block=True, timeout=timeout)
@@ -629,10 +674,10 @@ def transitions_stream(shutdown_event: Event, transitions_queue: Queue, timeout:


 def interactions_stream(
-    shutdown_event: Event,
+    shutdown_event: Any,  # Event
    interactions_queue: Queue,
-    timeout: float,  # type: ignore
-) -> services_pb2.Empty:
+    timeout: float,
+) -> "Generator[Any, None, services_pb2.Empty]":
    while not shutdown_event.is_set():
        try:
            message = interactions_queue.get(block=True, timeout=timeout)
@@ -652,7 +697,8 @@ def interactions_stream(
 #  Policy functions


-def update_policy_parameters(policy: SACPolicy, parameters_queue: Queue, device):
+def update_policy_parameters(algorithm: RLAlgorithm, parameters_queue: Queue, device):
+    """Drain the latest learner-pushed weights into ``algorithm.policy``."""
    bytes_state_dict = get_last_item_from_queue(parameters_queue, block=False)
    if bytes_state_dict is not None:
        logging.info("[ACTOR] Load new parameters from Learner.")
@@ -667,18 +713,7 @@ def update_policy_parameters(policy: SACPolicy, parameters_queue: Queue, device)
        # - Send critic's encoder state when shared_encoder=True
        # - Skip encoder params entirely when freeze_vision_encoder=True
        # - Ensure discrete_critic gets correct encoder state (currently uses encoder_critic)
-
-        # Load actor state dict
-        actor_state_dict = move_state_dict_to_device(state_dicts["policy"], device=device)
-        policy.actor.load_state_dict(actor_state_dict)
-
-        # Load discrete critic if present
-        if hasattr(policy, "discrete_critic") and "discrete_critic" in state_dicts:
-            discrete_critic_state_dict = move_state_dict_to_device(
-                state_dicts["discrete_critic"], device=device
-            )
-            policy.discrete_critic.load_state_dict(discrete_critic_state_dict)
-            logging.info("[ACTOR] Loaded discrete critic parameters from Learner.")
+        algorithm.load_weights(state_dicts, device=device)


 #  Utilities functions
@@ -0,0 +1,20 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from .sac import SACAlgorithm, SACAlgorithmConfig
+
+__all__ = [
+    "SACAlgorithm",
+    "SACAlgorithmConfig",
+]
@@ -0,0 +1,207 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import annotations
+
+import abc
+import builtins
+import os
+from collections.abc import Iterator
+from pathlib import Path
+from typing import TYPE_CHECKING, Any, TypeVar
+
+import torch
+from huggingface_hub import hf_hub_download
+from huggingface_hub.constants import SAFETENSORS_SINGLE_FILE
+from huggingface_hub.errors import HfHubHTTPError
+from safetensors.torch import load_file as load_safetensors, save_file as save_safetensors
+from torch.optim import Optimizer
+
+from lerobot.types import BatchType
+from lerobot.utils.hub import HubMixin
+
+from .configs import RLAlgorithmConfig, TrainingStats
+
+if TYPE_CHECKING:
+    from torch import nn
+
+    from ..data_sources.data_mixer import DataMixer
+
+T = TypeVar("T", bound="RLAlgorithm")
+
+
+class RLAlgorithm(HubMixin, abc.ABC):
+    """Base for all RL algorithms."""
+
+    config_class: type[RLAlgorithmConfig]
+    name: str
+    config: RLAlgorithmConfig
+
+    @abc.abstractmethod
+    def update(self, batch_iterator: Iterator[BatchType]) -> TrainingStats:
+        """One complete training step.
+
+        The algorithm calls ``next(batch_iterator)`` as many times as it
+        needs (e.g. ``utd_ratio`` times for SAC) to obtain fresh batches.
+        The iterator is owned by the trainer; the algorithm just consumes
+        from it.
+        """
+        raise NotImplementedError
+
+    def configure_data_iterator(
+        self,
+        data_mixer: DataMixer,
+        batch_size: int,
+        *,
+        async_prefetch: bool = True,
+        queue_size: int = 2,
+    ) -> Iterator[BatchType]:
+        """Create the data iterator this algorithm needs.
+
+        The default implementation uses the standard ``data_mixer.get_iterator()``.
+        Algorithms that need specialised sampling should override this method.
+        """
+        return data_mixer.get_iterator(
+            batch_size=batch_size,
+            async_prefetch=async_prefetch,
+            queue_size=queue_size,
+        )
+
+    @abc.abstractmethod
+    def make_optimizers_and_scheduler(self) -> dict[str, Optimizer]:
+        """Build and return the optimizers used during training.
+
+        Called once on the learner side after construction.
+        """
+        raise NotImplementedError
+
+    def get_optimizers(self) -> dict[str, Optimizer]:
+        """Return optimizers for checkpointing / external scheduling."""
+        return {}
+
+    @property
+    def optimization_step(self) -> int:
+        """Current learner optimization step.
+
+        Part of the stable contract for checkpoint/resume. Algorithms can
+        either use this default storage or override for custom behavior.
+        """
+        return getattr(self, "_optimization_step", 0)
+
+    @optimization_step.setter
+    def optimization_step(self, value: int) -> None:
+        self._optimization_step = int(value)
+
+    def get_weights(self) -> dict[str, Any]:
+        """Policy state-dict to push to actors."""
+        return {}
+
+    @abc.abstractmethod
+    def load_weights(self, weights: dict[str, Any], device: str | torch.device = "cpu") -> None:
+        """Load policy state-dict received from the learner."""
+        raise NotImplementedError
+
+    @abc.abstractmethod
+    def state_dict(self) -> dict[str, torch.Tensor]:
+        """Algorithm-owned trainable tensors.
+
+        Must return a flat tensor mapping for everything the algorithm owns
+        that is not part of the policy (e.g. critic ensembles, target networks,
+        temperature parameters). Algorithms with no training-only tensors
+        should explicitly return an empty dict.
+        """
+        raise NotImplementedError
+
+    @abc.abstractmethod
+    def load_state_dict(
+        self,
+        state_dict: dict[str, torch.Tensor],
+        device: str | torch.device = "cpu",
+    ) -> None:
+        """In-place load of algorithm-owned tensors.
+
+        Implementations MUST keep the identity of any ``nn.Parameter`` that an
+        optimizer references (e.g. SAC's ``log_alpha``) by using ``.copy_()``
+        rather than rebinding the attribute.
+        """
+        raise NotImplementedError
+
+    def _save_pretrained(self, save_directory: Path) -> None:
+        """Persist the algorithm's tensors and config to ``save_directory``.
+
+        Writes ``model.safetensors`` (algorithm tensors via :meth:`state_dict`)
+        and ``config.json`` (via :meth:`RLAlgorithmConfig._save_pretrained`).
+        """
+        tensors = {k: v.detach().cpu().contiguous() for k, v in self.state_dict().items()}
+        save_safetensors(tensors, str(save_directory / SAFETENSORS_SINGLE_FILE))
+        self.config._save_pretrained(save_directory)
+
+    @classmethod
+    def from_pretrained(
+        cls: builtins.type[T],
+        pretrained_name_or_path: str | Path,
+        *,
+        policy: nn.Module,
+        config: RLAlgorithmConfig | None = None,
+        force_download: bool = False,
+        resume_download: bool | None = None,
+        proxies: dict | None = None,
+        token: str | bool | None = None,
+        cache_dir: str | Path | None = None,
+        local_files_only: bool = False,
+        revision: str | None = None,
+        device: str | torch.device = "cpu",
+        **algo_kwargs: Any,
+    ) -> T:
+        """Build an algorithm and load its weights from ``pretrained_name_or_path``."""
+        if config is None:
+            config = cls.config_class.from_pretrained(
+                pretrained_name_or_path,
+                force_download=force_download,
+                resume_download=resume_download,
+                proxies=proxies,
+                token=token,
+                cache_dir=cache_dir,
+                local_files_only=local_files_only,
+                revision=revision,
+            )
+        if hasattr(config, "policy_config"):
+            config.policy_config = policy.config
+
+        instance = cls(policy=policy, config=config, **algo_kwargs)
+
+        model_id = str(pretrained_name_or_path)
+        if os.path.isdir(model_id):
+            model_file = os.path.join(model_id, SAFETENSORS_SINGLE_FILE)
+        else:
+            try:
+                model_file = hf_hub_download(
+                    repo_id=model_id,
+                    filename=SAFETENSORS_SINGLE_FILE,
+                    revision=revision,
+                    cache_dir=cache_dir,
+                    force_download=force_download,
+                    proxies=proxies,
+                    resume_download=resume_download,
+                    token=token,
+                    local_files_only=local_files_only,
+                )
+            except HfHubHTTPError as e:
+                raise FileNotFoundError(
+                    f"{SAFETENSORS_SINGLE_FILE} not found on the HuggingFace Hub in {model_id}"
+                ) from e
+
+        tensors = load_safetensors(model_file)
+        instance.load_state_dict(tensors, device=device)
+        return instance
@@ -0,0 +1,138 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import annotations
+
+import abc
+import builtins
+import logging
+import os
+from dataclasses import dataclass, field
+from pathlib import Path
+from typing import Any, TypeVar
+
+import draccus
+from huggingface_hub import hf_hub_download
+from huggingface_hub.constants import CONFIG_NAME
+from huggingface_hub.errors import HfHubHTTPError
+
+from lerobot.utils.hub import HubMixin
+
+T = TypeVar("T", bound="RLAlgorithmConfig")
+
+logger = logging.getLogger(__name__)
+
+
+@dataclass
+class TrainingStats:
+    """Returned by ``algorithm.update()`` for logging and checkpointing."""
+
+    losses: dict[str, float] = field(default_factory=dict)
+    grad_norms: dict[str, float] = field(default_factory=dict)
+    extra: dict[str, float] = field(default_factory=dict)
+
+    def to_log_dict(self) -> dict[str, float]:
+        """Flatten all stats into a single dict for logging."""
+
+        d: dict[str, float] = {}
+        for name, val in self.losses.items():
+            d[name] = val
+        for name, val in self.grad_norms.items():
+            d[f"{name}_grad_norm"] = val
+        for name, val in self.extra.items():
+            d[name] = val
+        return d
+
+
+@dataclass
+class RLAlgorithmConfig(draccus.ChoiceRegistry, HubMixin, abc.ABC):
+    """Registry for algorithm configs."""
+
+    @property
+    def type(self) -> str:
+        """Registered name of this algorithm config (e.g. ``"sac"``)."""
+        choice_name = self.get_choice_name(self.__class__)
+        if not isinstance(choice_name, str):
+            raise TypeError(f"Expected string from get_choice_name, got {type(choice_name)}")
+        return choice_name
+
+    @classmethod
+    @abc.abstractmethod
+    def from_policy_config(cls, policy_cfg: Any) -> RLAlgorithmConfig:
+        """Build an algorithm config from a policy config.
+
+        Must be overridden by every registered config subclass.
+        """
+        raise NotImplementedError(f"{cls.__name__} must implement from_policy_config()")
+
+    def _save_pretrained(self, save_directory: Path) -> None:
+        """Serialize this config as ``config.json`` inside ``save_directory``."""
+        with open(save_directory / CONFIG_NAME, "w") as f, draccus.config_type("json"):
+            draccus.dump(self, f, indent=4)
+
+    @classmethod
+    def from_pretrained(
+        cls: builtins.type[T],
+        pretrained_name_or_path: str | Path,
+        *,
+        force_download: bool = False,
+        resume_download: bool | None = None,
+        proxies: dict[Any, Any] | None = None,
+        token: str | bool | None = None,
+        cache_dir: str | Path | None = None,
+        local_files_only: bool = False,
+        revision: str | None = None,
+        **algo_kwargs: Any,
+    ) -> T:
+        model_id = str(pretrained_name_or_path)
+        config_file: str | None = None
+        if Path(model_id).is_dir():
+            if CONFIG_NAME in os.listdir(model_id):
+                config_file = os.path.join(model_id, CONFIG_NAME)
+            else:
+                logger.error(f"{CONFIG_NAME} not found in {Path(model_id).resolve()}")
+        else:
+            try:
+                config_file = hf_hub_download(
+                    repo_id=model_id,
+                    filename=CONFIG_NAME,
+                    revision=revision,
+                    cache_dir=cache_dir,
+                    force_download=force_download,
+                    proxies=proxies,
+                    resume_download=resume_download,
+                    token=token,
+                    local_files_only=local_files_only,
+                )
+            except HfHubHTTPError as e:
+                raise FileNotFoundError(
+                    f"{CONFIG_NAME} not found on the HuggingFace Hub in {model_id}"
+                ) from e
+
+        if config_file is None:
+            raise FileNotFoundError(f"{CONFIG_NAME} not found in {model_id}")
+
+        with draccus.config_type("json"):
+            instance = draccus.parse(RLAlgorithmConfig, config_file, args=[])
+
+        if cls is not RLAlgorithmConfig and not isinstance(instance, cls):
+            raise TypeError(
+                f"Config at {model_id} has type '{instance.type}' but was loaded via "
+                f"{cls.__name__}; use the matching subclass or RLAlgorithmConfig.from_pretrained()."
+            )
+
+        for key, value in algo_kwargs.items():
+            if hasattr(instance, key):
+                setattr(instance, key, value)
+        return instance
@@ -0,0 +1,99 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import annotations
+
+import torch
+
+from .base import RLAlgorithm
+from .configs import RLAlgorithmConfig
+
+
+def make_algorithm_config(algorithm_type: str, **kwargs) -> RLAlgorithmConfig:
+    """Instantiate an `RLAlgorithmConfig` from its registered type name.
+
+    Args:
+        algorithm_type: Registry key of the algorithm (e.g. ``"sac"``).
+        **kwargs: Keyword arguments forwarded to the config class constructor.
+
+    Returns:
+        An instance of the matching ``RLAlgorithmConfig`` subclass.
+
+    Raises:
+        ValueError: If ``algorithm_type`` is not registered.
+    """
+    try:
+        cls = RLAlgorithmConfig.get_choice_class(algorithm_type)
+    except KeyError as err:
+        raise ValueError(
+            f"Algorithm type '{algorithm_type}' is not registered. "
+            f"Available: {list(RLAlgorithmConfig.get_known_choices().keys())}"
+        ) from err
+    return cls(**kwargs)
+
+
+def get_algorithm_class(name: str) -> type[RLAlgorithm]:
+    """
+    Retrieves an RL algorithm class by its registered name.
+
+    This function uses dynamic imports to avoid loading all algorithm classes into
+    memory at once, improving startup time and reducing dependencies.
+
+    Args:
+        name: The registered name of the algorithm. Currently only "sac" is supported.
+
+    Returns:
+        The algorithm class corresponding to the given name.
+
+    Raises:
+        ValueError: If the algorithm name is not recognized.
+    """
+    if name == "sac":
+        from .sac.sac_algorithm import SACAlgorithm
+
+        return SACAlgorithm
+    raise ValueError(
+        f"Algorithm type '{name}' is not available. "
+        f"Known: {list(RLAlgorithmConfig.get_known_choices().keys())}"
+    )
+
+
+def make_algorithm(cfg: RLAlgorithmConfig, policy: torch.nn.Module) -> RLAlgorithm:
+    """
+    Instantiate an RL algorithm.
+
+    This factory function looks up the :class:`RLAlgorithm` subclass that matches
+    ``cfg.type`` and instantiates it with the provided policy. It also enforces
+    that ``cfg.policy_config`` has been populated before construction (this is
+    normally handled by :meth:`TrainRLServerPipelineConfig.validate`).
+
+    Args:
+        cfg: The algorithm configuration. Must have ``policy_config`` set.
+        policy: The policy module the algorithm will train.
+
+    Returns:
+        An instantiated :class:`RLAlgorithm`.
+
+    Raises:
+        ValueError: If ``cfg.policy_config`` is ``None`` or ``cfg.type`` is not
+            registered.
+    """
+    if getattr(cfg, "policy_config", None) is None:
+        raise ValueError(
+            f"{type(cfg).__name__}.policy_config is None. "
+            "It must be populated (typically by TrainRLServerPipelineConfig.validate) "
+            "before calling make_algorithm()."
+        )
+    cls = get_algorithm_class(cfg.type)
+    return cls(policy=policy, config=cfg)
@@ -0,0 +1,18 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from .configuration_sac import SACAlgorithmConfig
+from .sac_algorithm import SACAlgorithm
+
+__all__ = ["SACAlgorithm", "SACAlgorithmConfig"]
@@ -0,0 +1,99 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import annotations
+
+from dataclasses import dataclass, field
+
+from lerobot.configs.policies import PreTrainedConfig
+from lerobot.policies.gaussian_actor.configuration_gaussian_actor import (
+    CriticNetworkConfig,
+    GaussianActorConfig,
+)
+
+from ..configs import RLAlgorithmConfig
+
+
+@RLAlgorithmConfig.register_subclass("sac")
+@dataclass
+class SACAlgorithmConfig(RLAlgorithmConfig):
+    """Soft Actor-Critic (SAC) algorithm configuration.
+
+    SAC is an off-policy actor-critic deep RL algorithm based on the maximum
+    entropy reinforcement learning framework. It learns a policy and a Q-function
+    simultaneously using experience collected from the environment.
+
+    This configuration class contains the algorithm-side hyperparameters: critic
+    ensemble, target networks, temperature / entropy tuning, and the Bellman
+    update loop. The policy-side (actor + observation encoder) lives in
+    :class:`~lerobot.policies.gaussian_actor.GaussianActorConfig` and is
+    referenced via :attr:`policy_config`.
+    """
+
+    # Optimizer learning rates
+    # Learning rate for the actor network
+    actor_lr: float = 3e-4
+    # Learning rate for the critic network
+    critic_lr: float = 3e-4
+    # Learning rate for the temperature parameter
+    temperature_lr: float = 3e-4
+
+    # Bellman update
+    # Discount factor for the SAC algorithm
+    discount: float = 0.99
+    # Whether to use backup entropy for the SAC algorithm
+    use_backup_entropy: bool = True
+    # Soft-update (exponential moving average) weight for the critic target networks
+    critic_target_update_weight: float = 0.005
+
+    # Critic ensemble
+    # Number of critics in the ensemble
+    num_critics: int = 2
+    # Number of critics subsampled when computing TD targets (``None`` uses all critics)
+    num_subsample_critics: int | None = None
+    # Configuration for the critic network architecture
+    critic_network_kwargs: CriticNetworkConfig = field(default_factory=CriticNetworkConfig)
+    # Configuration for the discrete critic network
+    discrete_critic_network_kwargs: CriticNetworkConfig = field(default_factory=CriticNetworkConfig)
+
+    # Temperature / entropy
+    # Initial temperature value
+    temperature_init: float = 1.0
+    # Target entropy for automatic temperature tuning. If ``None``, defaults to
+    # ``-|A|/2`` where ``|A|`` is the total action dimension (continuous + 1 if
+    # there is a discrete action head).
+    target_entropy: float | None = None
+
+    # Update loop
+    # Update-to-data ratio. Set to >1 to enable extra critic updates per env step.
+    utd_ratio: int = 1
+    # Update the actor and temperature every ``policy_update_freq`` optimization steps
+    policy_update_freq: int = 1
+    # Gradient clipping norm for the SAC algorithm
+    grad_clip_norm: float = 40.0
+
+    # Optimizations
+    # torch.compile is currently disabled by default
+    use_torch_compile: bool = False
+
+    # Policy config
+    policy_config: PreTrainedConfig | None = None
+
+    @classmethod
+    def from_policy_config(cls, policy_cfg: GaussianActorConfig) -> SACAlgorithmConfig:
+        """Build an algorithm config with default hyperparameters for a given policy."""
+        return cls(
+            policy_config=policy_cfg,
+            discrete_critic_network_kwargs=policy_cfg.discrete_critic_network_kwargs,
+        )
@@ -0,0 +1,672 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import annotations
+
+import math
+from collections.abc import Callable, Iterator
+from dataclasses import asdict
+from typing import Any
+
+import einops
+import torch
+import torch.nn as nn
+import torch.nn.functional as F  # noqa: N812
+from torch import Tensor
+from torch.optim import Optimizer
+
+from lerobot.policies.gaussian_actor.modeling_gaussian_actor import (
+    DISCRETE_DIMENSION_INDEX,
+    MLP,
+    DiscreteCritic,
+    GaussianActorObservationEncoder,
+    GaussianActorPolicy,
+    orthogonal_init,
+)
+from lerobot.policies.utils import get_device_from_parameters
+from lerobot.types import BatchType
+from lerobot.utils.constants import ACTION
+from lerobot.utils.transition import move_state_dict_to_device
+
+from ..base import RLAlgorithm
+from ..configs import TrainingStats
+from .configuration_sac import SACAlgorithmConfig
+
+
+class SACAlgorithm(RLAlgorithm):
+    """Soft Actor-Critic. Owns critics, targets, temperature, and loss computation."""
+
+    config_class = SACAlgorithmConfig
+    name = "sac"
+
+    def __init__(
+        self,
+        policy: GaussianActorPolicy,
+        config: SACAlgorithmConfig,
+    ):
+        self.config = config
+        self.policy_config = config.policy_config
+        self.policy = policy
+        self.optimizers: dict[str, Optimizer] = {}
+        self._optimization_step: int = 0
+
+        action_dim = self.policy.config.output_features[ACTION].shape[0]
+        self._init_critics(action_dim)
+        self._init_temperature(action_dim)
+
+        self._device = torch.device(self.policy.config.device)
+        self._move_to_device()
+
+    def _init_critics(self, action_dim: int) -> None:
+        """Build the critic ensemble and its target network (both share the policy's critic encoder)."""
+        encoder = self.policy.encoder_critic
+
+        heads = [
+            CriticHead(
+                input_dim=encoder.output_dim + action_dim,
+                **asdict(self.config.critic_network_kwargs),
+            )
+            for _ in range(self.config.num_critics)
+        ]
+        self.critic_ensemble = CriticEnsemble(encoder=encoder, ensemble=heads)
+        target_heads = [
+            CriticHead(
+                input_dim=encoder.output_dim + action_dim,
+                **asdict(self.config.critic_network_kwargs),
+            )
+            for _ in range(self.config.num_critics)
+        ]
+        self.critic_target = CriticEnsemble(encoder=encoder, ensemble=target_heads)
+        self.critic_target.load_state_dict(self.critic_ensemble.state_dict())
+
+        # TODO(Khalil): Investigate and fix torch.compile
+        # NOTE: torch.compile is disabled, policy does not converge when enabled.
+        if self.config.use_torch_compile:
+            self.critic_ensemble = torch.compile(self.critic_ensemble)
+            self.critic_target = torch.compile(self.critic_target)
+
+        self.discrete_critic_target = None
+        if self.policy_config.num_discrete_actions is not None:
+            self.discrete_critic_target = self._init_discrete_critic_target(encoder)
+
+    def _init_discrete_critic_target(self, encoder: GaussianActorObservationEncoder) -> DiscreteCritic:
+        """Build target discrete critic (main network is owned by the policy)."""
+        discrete_critic_target = DiscreteCritic(
+            encoder=encoder,
+            input_dim=encoder.output_dim,
+            output_dim=self.policy_config.num_discrete_actions,
+            **asdict(self.config.discrete_critic_network_kwargs),
+        )
+        # TODO(Khalil): Compile the discrete critic
+        discrete_critic_target.load_state_dict(self.policy.discrete_critic.state_dict())
+        return discrete_critic_target
+
+    def _init_temperature(self, continuous_action_dim: int) -> None:
+        """Set up temperature parameter (log_alpha) and target entropy."""
+        temp_init = self.config.temperature_init
+        self.log_alpha = nn.Parameter(torch.tensor([math.log(temp_init)]))
+
+        self.target_entropy = self.config.target_entropy
+        if self.target_entropy is None:
+            total_action_dim = continuous_action_dim + (
+                1 if self.policy_config.num_discrete_actions is not None else 0
+            )
+            self.target_entropy = -total_action_dim / 2
+
+    def _move_to_device(self) -> None:
+        self.policy.to(self._device)
+        self.critic_ensemble.to(self._device)
+        self.critic_target.to(self._device)
+        self.log_alpha = nn.Parameter(self.log_alpha.data.to(self._device))
+        if self.discrete_critic_target is not None:
+            self.discrete_critic_target.to(self._device)
+
+    @property
+    def temperature(self) -> float:
+        """Return the current temperature value, always in sync with log_alpha."""
+        return self.log_alpha.exp().item()
+
+    def _critic_forward(
+        self,
+        observations: dict[str, Tensor],
+        actions: Tensor,
+        use_target: bool = False,
+        observation_features: Tensor | None = None,
+    ) -> Tensor:
+        """Forward pass through a critic network ensemble
+
+        Args:
+            observations: Dictionary of observations
+            actions: Action tensor
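+            use_target: If True, use the target critics; otherwise use the online critic ensemble
+            observation_features: Optional pre-computed observation features to avoid recomputing encoder output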
+
+        Returns:
+            Tensor of Q-values from all critics
+        """
+
+        critics = self.critic_target if use_target else self.critic_ensemble
+        q_values = critics(observations, actions, observation_features)
+        return q_values
+
+    def _discrete_critic_forward(
+        self, observations, use_target=False, observation_features=None
+    ) -> torch.Tensor:
+        """Forward pass through a discrete critic network
+
+        Args:
+            observations: Dictionary of observations
+            use_target: If True, use the target discrete critic; otherwise use the policy's discrete critic
+            observation_features: Optional pre-computed observation features to avoid recomputing encoder output
+
+        Returns:
+            Tensor of Q-values from the discrete critic network
+        """
+        discrete_critic = self.discrete_critic_target if use_target else self.policy.discrete_critic
+        q_values = discrete_critic(observations, observation_features)
+        return q_values
+
+    def update(self, batch_iterator: Iterator[BatchType]) -> TrainingStats:
+        """Run one SAC training step (critic / discrete-critic / actor / temperature).
+
+        Pulls ``utd_ratio`` batches from ``batch_iterator``, computes the relevant
+        losses, backpropagates each, and updates target networks.
+
+        Args:
+            batch_iterator: yields batches each containing
+                - ``action``: Action tensor
+                - ``reward``: Reward tensor
+                - ``state``: Observations tensor dict
+                - ``next_state``: Next observations tensor dict
+                - ``done``: Done mask tensor
+                - ``observation_feature``: Optional pre-computed observation features
+                - ``next_observation_feature``: Optional pre-computed next observation features
+                - ``complementary_info`` (optional): per-step extras like discrete penalties
+
+        Returns:
+            TrainingStats with per-component losses and grad norms.
+        """
+        clip = self.config.grad_clip_norm
+
+        for _ in range(self.config.utd_ratio - 1):
+            batch = next(batch_iterator)
+            fb = self._prepare_forward_batch(batch, include_complementary_info=True)
+
+            loss_critic = self._compute_loss_critic(fb)
+            self.optimizers["critic"].zero_grad()
+            loss_critic.backward()
+            torch.nn.utils.clip_grad_norm_(self.critic_ensemble.parameters(), max_norm=clip)
+            self.optimizers["critic"].step()
+
+            if self.policy_config.num_discrete_actions is not None:
+                loss_dc = self._compute_loss_discrete_critic(fb)
+                self.optimizers["discrete_critic"].zero_grad()
+                loss_dc.backward()
+                torch.nn.utils.clip_grad_norm_(self.policy.discrete_critic.parameters(), max_norm=clip)
+                self.optimizers["discrete_critic"].step()
+
+            self._update_target_networks()
+
+        batch = next(batch_iterator)
+        fb = self._prepare_forward_batch(batch, include_complementary_info=False)
+
+        loss_critic = self._compute_loss_critic(fb)
+        self.optimizers["critic"].zero_grad()
+        loss_critic.backward()
+        critic_grad = torch.nn.utils.clip_grad_norm_(self.critic_ensemble.parameters(), max_norm=clip).item()
+        self.optimizers["critic"].step()
+
+        stats = TrainingStats(
+            losses={"loss_critic": loss_critic.item()},
+            grad_norms={"critic": critic_grad},
+        )
+
+        if self.policy_config.num_discrete_actions is not None:
+            loss_dc = self._compute_loss_discrete_critic(fb)
+            self.optimizers["discrete_critic"].zero_grad()
+            loss_dc.backward()
+            dc_grad = torch.nn.utils.clip_grad_norm_(
+                self.policy.discrete_critic.parameters(), max_norm=clip
+            ).item()
+            self.optimizers["discrete_critic"].step()
+            stats.losses["loss_discrete_critic"] = loss_dc.item()
+            stats.grad_norms["discrete_critic"] = dc_grad
+
+        if self._optimization_step % self.config.policy_update_freq == 0:
+            for _ in range(self.config.policy_update_freq):
+                loss_actor = self._compute_loss_actor(fb)
+                self.optimizers["actor"].zero_grad()
+                loss_actor.backward()
+                actor_grad = torch.nn.utils.clip_grad_norm_(
+                    self.policy.actor.parameters(), max_norm=clip
+                ).item()
+                self.optimizers["actor"].step()
+
+                loss_temp = self._compute_loss_temperature(fb)
+                self.optimizers["temperature"].zero_grad()
+                loss_temp.backward()
+                temp_grad = torch.nn.utils.clip_grad_norm_([self.log_alpha], max_norm=clip).item()
+                self.optimizers["temperature"].step()
+
+            stats.losses["loss_actor"] = loss_actor.item()
+            stats.losses["loss_temperature"] = loss_temp.item()
+            stats.grad_norms["actor"] = actor_grad
+            stats.grad_norms["temperature"] = temp_grad
+            stats.extra["temperature"] = self.temperature
+
+        self._update_target_networks()
+        self._optimization_step += 1
+        return stats
+
+    def _compute_loss_critic(self, batch: dict[str, Any]) -> Tensor:
+        # Extract common components from batch
+        observations = batch["state"]
+        actions = batch[ACTION]
+        observation_features = batch.get("observation_feature")
+        # Extract critic-specific components
+        rewards = batch["reward"]
+        next_observations = batch["next_state"]
+        done = batch["done"]
+        next_observation_features = batch.get("next_observation_feature")
+
+        with torch.no_grad():
+            # 1- sample next actions from the current policy
+            next_action_preds, next_log_probs, _ = self.policy.actor(
+                next_observations, next_observation_features
+            )
+
+            # 2- compute q targets
+            q_targets = self._critic_forward(
+                observations=next_observations,
+                actions=next_action_preds,
+                use_target=True,
+                observation_features=next_observation_features,
+            )
+
+            # Subsample critics to prevent overfitting when using a high UTD (update-to-data) ratio
+            # TODO: Get indices before forward pass to avoid unnecessary computation
+            if self.config.num_subsample_critics is not None:
+                indices = torch.randperm(self.config.num_critics)
+                indices = indices[: self.config.num_subsample_critics]
+                q_targets = q_targets[indices]
+
+            # Take the minimum over the (possibly subsampled) critics
+            min_q, _ = q_targets.min(dim=0)  # Keep the min values, discard the argmin indices
+            if self.config.use_backup_entropy:
+                min_q = min_q - (self.temperature * next_log_probs)
+
+            td_target = rewards + (1 - done) * self.config.discount * min_q
+
+        # 3- compute predicted qs
+        if self.policy_config.num_discrete_actions is not None:
+            # NOTE: We only want to keep the continuous action part
+            # In the buffer we have the full action space (continuous + discrete)
+            # We need to split them before concatenating them in the critic forward
+            actions: Tensor = actions[:, :DISCRETE_DIMENSION_INDEX]
+        q_preds = self._critic_forward(
+            observations=observations,
+            actions=actions,
+            use_target=False,
+            observation_features=observation_features,
+        )
+
+        # 4- Calculate loss
+        # Compute state-action value loss (TD loss) for all of the Q functions in the ensemble.
+        td_target_duplicate = einops.repeat(td_target, "b -> e b", e=q_preds.shape[0])
+        # Average the loss over the batch for each critic, then sum across critics to get the final loss
+        critics_loss = (
+            F.mse_loss(
+                input=q_preds,
+                target=td_target_duplicate,
+                reduction="none",
+            ).mean(dim=1)
+        ).sum()
+        return critics_loss
+
+    def _compute_loss_discrete_critic(self, batch: dict[str, Any]) -> Tensor:
+        observations = batch["state"]
+        actions = batch[ACTION]
+        rewards = batch["reward"]
+        next_observations = batch["next_state"]
+        done = batch["done"]
+        observation_features = batch.get("observation_feature")
+        next_observation_features = batch.get("next_observation_feature")
+        complementary_info = batch.get("complementary_info")
+
+        # NOTE: We only want to keep the discrete action part
+        # In the buffer we have the full action space (continuous + discrete)
+        # We split off the discrete part and use it to index the discrete critic's Q-values
+        actions_discrete: Tensor = actions[:, DISCRETE_DIMENSION_INDEX:].clone()
+        actions_discrete = torch.round(actions_discrete)
+        actions_discrete = actions_discrete.long()
+
+        discrete_penalties: Tensor | None = None
+        if complementary_info is not None:
+            discrete_penalties = complementary_info.get("discrete_penalty")
+
+        with torch.no_grad():
+            # Double DQN: select actions with the online network, evaluate them with the target network
+            next_discrete_qs = self._discrete_critic_forward(
+                next_observations, use_target=False, observation_features=next_observation_features
+            )
+            best_next_discrete_action = torch.argmax(next_discrete_qs, dim=-1, keepdim=True)
+
+            # Get target Q-values from target network
+            target_next_discrete_qs = self._discrete_critic_forward(
+                observations=next_observations,
+                use_target=True,
+                observation_features=next_observation_features,
+            )
+
+            # Use gather to select Q-values for best actions
+            target_next_discrete_q = torch.gather(
+                target_next_discrete_qs, dim=1, index=best_next_discrete_action
+            ).squeeze(-1)
+
+            # Compute target Q-value with Bellman equation
+            rewards_discrete = rewards
+            if discrete_penalties is not None:
+                rewards_discrete = rewards + discrete_penalties
+            target_discrete_q = rewards_discrete + (1 - done) * self.config.discount * target_next_discrete_q
+
+        # Get predicted Q-values for current observations
+        predicted_discrete_qs = self._discrete_critic_forward(
+            observations=observations, use_target=False, observation_features=observation_features
+        )
+
+        # Use gather to select Q-values for taken actions
+        predicted_discrete_q = torch.gather(predicted_discrete_qs, dim=1, index=actions_discrete).squeeze(-1)
+
+        # Compute MSE loss between predicted and target Q-values
+        discrete_critic_loss = F.mse_loss(input=predicted_discrete_q, target=target_discrete_q)
+        return discrete_critic_loss
+
+    def _compute_loss_actor(self, batch: dict[str, Any]) -> Tensor:
+        observations = batch["state"]
+        observation_features = batch.get("observation_feature")
+
+        actions_pi, log_probs, _ = self.policy.actor(observations, observation_features)
+
+        q_preds = self._critic_forward(
+            observations=observations,
+            actions=actions_pi,
+            use_target=False,
+            observation_features=observation_features,
+        )
+        min_q_preds = q_preds.min(dim=0)[0]
+
+        actor_loss = ((self.temperature * log_probs) - min_q_preds).mean()
+        return actor_loss
+
+    def _compute_loss_temperature(self, batch: dict[str, Any]) -> Tensor:
+        """Compute the temperature loss"""
+        observations = batch["state"]
+        observation_features = batch.get("observation_feature")
+
+        # calculate temperature loss
+        with torch.no_grad():
+            _, log_probs, _ = self.policy.actor(observations, observation_features)
+
+        temperature_loss = (-self.log_alpha.exp() * (log_probs + self.target_entropy)).mean()
+        return temperature_loss
+
+    def _update_target_networks(self) -> None:
+        """Update target networks with exponential moving average"""
+        for target_p, p in zip(
+            self.critic_target.parameters(), self.critic_ensemble.parameters(), strict=True
+        ):
+            target_p.data.copy_(
+                p.data * self.config.critic_target_update_weight
+                + target_p.data * (1.0 - self.config.critic_target_update_weight)
+            )
+        if self.policy_config.num_discrete_actions is not None:
+            for target_p, p in zip(
+                self.discrete_critic_target.parameters(),
+                self.policy.discrete_critic.parameters(),
+                strict=True,
+            ):
+                target_p.data.copy_(
+                    p.data * self.config.critic_target_update_weight
+                    + target_p.data * (1.0 - self.config.critic_target_update_weight)
+                )
+
+    def _prepare_forward_batch(
+        self, batch: BatchType, *, include_complementary_info: bool = True
+    ) -> dict[str, Any]:
+        observations = batch["state"]
+        next_observations = batch["next_state"]
+        observation_features, next_observation_features = self.get_observation_features(
+            observations, next_observations
+        )
+        forward_batch: dict[str, Any] = {
+            ACTION: batch[ACTION],
+            "reward": batch["reward"],
+            "state": observations,
+            "next_state": next_observations,
+            "done": batch["done"],
+            "observation_feature": observation_features,
+            "next_observation_feature": next_observation_features,
+        }
+        if include_complementary_info and "complementary_info" in batch:
+            forward_batch["complementary_info"] = batch["complementary_info"]
+        return forward_batch
+
+    def make_optimizers_and_scheduler(self) -> dict[str, Optimizer]:
+        """
+        Creates and returns Adam optimizers for the actor, critic, and temperature components of the algorithm.
+
+        This function sets up Adam optimizers for:
+        - The **actor network**, ensuring that only relevant parameters are optimized.
+        - The **critic ensemble**, which evaluates the value function.
+        - The **temperature parameter**, which controls the entropy in soft actor-critic (SAC)-like methods.
+        - The **discrete critic**, only when `num_discrete_actions` is set in the policy config.
+
+        No learning rate scheduler is created here; only the optimizers are returned.
+
+        NOTE:
+        - If the encoder is shared, its parameters are excluded from the actor's optimization process.
+        - The policy's log temperature (`log_alpha`) is wrapped in a list to ensure proper optimization as a standalone tensor.
+
+        Returns:
+            A dictionary mapping component names ("actor", "critic", "temperature", and
+            optionally "discrete_critic") to their respective Adam optimizers.
+        """
+        actor_params = self.policy.get_optim_params()["actor"]
+        self.optimizers = {
+            "actor": torch.optim.Adam(actor_params, lr=self.config.actor_lr),
+            "critic": torch.optim.Adam(self.critic_ensemble.parameters(), lr=self.config.critic_lr),
+            "temperature": torch.optim.Adam([self.log_alpha], lr=self.config.temperature_lr),
+        }
+        if self.policy_config.num_discrete_actions is not None:
+            self.optimizers["discrete_critic"] = torch.optim.Adam(
+                self.policy.discrete_critic.parameters(), lr=self.config.critic_lr
+            )
+        return self.optimizers
+
+    def get_optimizers(self) -> dict[str, Optimizer]:
+        return self.optimizers
+
+    def get_weights(self) -> dict[str, Any]:
+        """Send actor + discrete-critic state dicts."""
+        state_dicts: dict[str, Any] = {
+            "policy": move_state_dict_to_device(self.policy.actor.state_dict(), device="cpu"),
+        }
+        if self.policy_config.num_discrete_actions is not None:
+            state_dicts["discrete_critic"] = move_state_dict_to_device(
+                self.policy.discrete_critic.state_dict(), device="cpu"
+            )
+        return state_dicts
+
+    def load_weights(self, weights: dict[str, Any], device: str | torch.device = "cpu") -> None:
+        """Load actor + discrete-critic weights into the policy."""
+        actor_sd = move_state_dict_to_device(weights["policy"], device=device)
+        self.policy.actor.load_state_dict(actor_sd)
+        if "discrete_critic" in weights and self.policy.discrete_critic is not None:
+            discrete_sd = move_state_dict_to_device(weights["discrete_critic"], device=device)
+            self.policy.discrete_critic.load_state_dict(discrete_sd)
+
+    def state_dict(self) -> dict[str, torch.Tensor]:
+        """Algorithm-owned trainable tensors.
+
+        Encoder weights are stripped because they are owned by the policy
+        (``policy.encoder_critic``) and already saved via ``policy.save_pretrained``.
+        """
+        bundle: dict[str, torch.Tensor] = {}
+        for k, v in _strip_encoder_keys(self.critic_ensemble.state_dict()).items():
+            bundle[f"critic_ensemble.{k}"] = v
+        for k, v in _strip_encoder_keys(self.critic_target.state_dict()).items():
+            bundle[f"critic_target.{k}"] = v
+        if self.discrete_critic_target is not None:
+            for k, v in _strip_encoder_keys(self.discrete_critic_target.state_dict()).items():
+                bundle[f"discrete_critic_target.{k}"] = v
+        bundle["log_alpha"] = self.log_alpha.detach()
+        return bundle
+
+    def load_state_dict(
+        self,
+        state_dict: dict[str, torch.Tensor],
+        device: str | torch.device = "cpu",
+    ) -> None:
+        """In-place load of algorithm-owned tensors.
+
+        ``log_alpha`` is restored via ``Parameter.data.copy_`` so the
+        ``temperature`` optimizer's reference to the parameter object stays
+        valid after resume.
+        """
+        critic_ensemble_state = _split_prefix(state_dict, "critic_ensemble.")
+        critic_target_state = _split_prefix(state_dict, "critic_target.")
+        self.critic_ensemble.load_state_dict(critic_ensemble_state, strict=False)
+        self.critic_target.load_state_dict(critic_target_state, strict=False)
+
+        if self.discrete_critic_target is not None:
+            discrete_target_state = _split_prefix(state_dict, "discrete_critic_target.")
+            self.discrete_critic_target.load_state_dict(discrete_target_state, strict=False)
+
+        if "log_alpha" in state_dict:
+            self.log_alpha.data.copy_(state_dict["log_alpha"].to(self.log_alpha.device))
+
+    def get_observation_features(
+        self, observations: Tensor, next_observations: Tensor
+    ) -> tuple[Tensor | None, Tensor | None]:
+        """
+        Get observation features from the policy encoder. It acts as a cache for the observation features:
+        when the encoder is frozen, the observation features are not updated,
+        so we can save compute by caching them.
+
+        Args:
+            observations: The current observations
+            next_observations: The next observations
+
+        Returns:
+            tuple: observation_features, next_observation_features. Both are None when no
+            vision encoder is configured or the vision encoder is not frozen.
+        """
+
+        if self.policy.config.vision_encoder_name is None or not self.policy.config.freeze_vision_encoder:
+            return None, None
+
+        with torch.no_grad():
+            observation_features = self.policy.actor.encoder.get_cached_image_features(observations)
+            next_observation_features = self.policy.actor.encoder.get_cached_image_features(next_observations)
+
+        return observation_features, next_observation_features
+
+
+def _strip_encoder_keys(state: dict[str, torch.Tensor]) -> dict[str, torch.Tensor]:
+    """Drop ``encoder.*`` keys from a critic-module state dict."""
+    return {k: v for k, v in state.items() if not k.startswith("encoder.")}
+
+
+def _split_prefix(state: dict[str, torch.Tensor], prefix: str) -> dict[str, torch.Tensor]:
+    """Return the subset of ``state`` whose keys start with ``prefix``, prefix-stripped."""
+    return {k.removeprefix(prefix): v for k, v in state.items() if k.startswith(prefix)}
+
+
+class CriticHead(nn.Module):
+    def __init__(
+        self,
+        input_dim: int,
+        hidden_dims: list[int],
+        activations: Callable[[torch.Tensor], torch.Tensor] | str = nn.SiLU(),
+        activate_final: bool = False,
+        dropout_rate: float | None = None,
+        init_final: float | None = None,
+        final_activation: Callable[[torch.Tensor], torch.Tensor] | str | None = None,
+    ):
+        super().__init__()
+        self.net = MLP(
+            input_dim=input_dim,
+            hidden_dims=hidden_dims,
+            activations=activations,
+            activate_final=activate_final,
+            dropout_rate=dropout_rate,
+            final_activation=final_activation,
+        )
+        self.output_layer = nn.Linear(in_features=hidden_dims[-1], out_features=1)
+        if init_final is not None:
+            nn.init.uniform_(self.output_layer.weight, -init_final, init_final)
+            nn.init.uniform_(self.output_layer.bias, -init_final, init_final)
+        else:
+            orthogonal_init()(self.output_layer.weight)
+
+    def forward(self, x: torch.Tensor) -> torch.Tensor:
+        return self.output_layer(self.net(x))
+
+
+class CriticEnsemble(nn.Module):
+    """
+    CriticEnsemble wraps multiple CriticHead modules into an ensemble.
+
+    Args:
+        encoder (GaussianActorObservationEncoder): encoder for observations.
+        ensemble (list[CriticHead]): list of critic heads.
+        init_final (float | None): optional initializer scale for final layers.
+
+    Forward returns a tensor of shape (num_critics, batch_size) containing Q-values.
+    """
+
+    def __init__(
+        self,
+        encoder: GaussianActorObservationEncoder,
+        ensemble: list[CriticHead],
+        init_final: float | None = None,
+    ):
+        super().__init__()
+        self.encoder = encoder
+        self.init_final = init_final
+        self.critics = nn.ModuleList(ensemble)
+
+    def forward(
+        self,
+        observations: dict[str, torch.Tensor],
+        actions: torch.Tensor,
+        observation_features: torch.Tensor | None = None,
+    ) -> torch.Tensor:
+        device = get_device_from_parameters(self)
+        # Move each tensor in observations to device
+        observations = {k: v.to(device) for k, v in observations.items()}
+
+        obs_enc = self.encoder(observations, cache=observation_features)
+
+        inputs = torch.cat([obs_enc, actions], dim=-1)
+
+        # Loop through critics and collect outputs
+        q_values = []
+        for critic in self.critics:
+            q_values.append(critic(inputs))
+
+        # Stack outputs to match expected shape [num_critics, batch_size]
+        q_values = torch.stack([q.squeeze(-1) for q in q_values], dim=0)
+        return q_values
@@ -97,8 +97,8 @@ class ReplayBuffer:
        Args:
            capacity (int): Maximum number of transitions to store in the buffer.
            device (str): The device where the tensors will be moved when sampling ("cuda:0" or "cpu").
-            state_keys (List[str]): The list of keys that appear in `state` and `next_state`.
-            image_augmentation_function (Optional[Callable]): A function that takes a batch of images
+            state_keys (list[str]): The list of keys that appear in `state` and `next_state`.
+            image_augmentation_function (Callable | None): A function that takes a batch of images
                and returns a batch of augmented images. If None, a default augmentation function is used.
            use_drq (bool): Whether to use the default DRQ image augmentation style, when sampling in the buffer.
            storage_device: The device (e.g. "cpu" or "cuda:0") where the data will be stored.
@@ -634,7 +634,7 @@ class ReplayBuffer:
                If None, you must handle or define default keys.

        Returns:
-            transitions (List[Transition]):
+            transitions (list[Transition]):
                A list of Transition dictionaries with the same length as `dataset`.
        """
        if state_keys is None:
@@ -176,11 +176,11 @@ def convert_lerobot_dataset_to_cropped_lerobot_dataset(

    Args:
        original_dataset (LeRobotDataset): The source dataset.
-        crop_params_dict (Dict[str, Tuple[int, int, int, int]]):
+        crop_params_dict (dict[str, tuple[int, int, int, int]]):
            A dictionary mapping observation keys to crop parameters (top, left, height, width).
        new_repo_id (str): Repository id for the new dataset.
        new_dataset_root (str): The root directory where the new dataset will be written.
-        resize_size (Tuple[int, int], optional): The target size (height, width) after cropping.
+        resize_size (tuple[int, int], optional): The target size (height, width) after cropping.
            Defaults to (128, 128).

    Returns:
@@ -0,0 +1,19 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from lerobot.types import BatchType
+
+from .data_mixer import DataMixer, OnlineOfflineMixer
+
+__all__ = ["BatchType", "DataMixer", "OnlineOfflineMixer"]
@@ -0,0 +1,97 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import annotations
+
+import abc
+
+from lerobot.types import BatchType
+
+from ..buffer import ReplayBuffer, concatenate_batch_transitions
+
+
+class DataMixer(abc.ABC):
+    """Abstract interface for all data mixing strategies."""
+
+    @abc.abstractmethod
+    def sample(self, batch_size: int) -> BatchType:
+        """Draw one batch of ``batch_size`` transitions."""
+        raise NotImplementedError
+
+    def get_iterator(
+        self,
+        batch_size: int,
+        async_prefetch: bool = True,
+        queue_size: int = 2,
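+    ):
+        """Infinite iterator that yields batches.
+
+        The base implementation samples synchronously; ``async_prefetch`` and
+        ``queue_size`` are accepted for interface compatibility but unused here.
+        """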
+        while True:
+            yield self.sample(batch_size)
+
+
+class OnlineOfflineMixer(DataMixer):
+    """Mixes transitions from an online and an offline replay buffer."""
+
+    def __init__(
+        self,
+        online_buffer: ReplayBuffer,
+        offline_buffer: ReplayBuffer | None = None,
+        online_ratio: float = 1.0,
+    ):
+        if not 0.0 <= online_ratio <= 1.0:
+            raise ValueError(f"online_ratio must be in [0, 1], got {online_ratio}")
+        self.online_buffer = online_buffer
+        self.offline_buffer = offline_buffer
+        self.online_ratio = online_ratio
+
+    def sample(self, batch_size: int) -> BatchType:
+        if self.offline_buffer is None:
+            return self.online_buffer.sample(batch_size)
+
+        n_online = max(1, int(batch_size * self.online_ratio))
+        n_offline = batch_size - n_online
+
+        online_batch = self.online_buffer.sample(n_online)
+        offline_batch = self.offline_buffer.sample(n_offline)
+        return concatenate_batch_transitions(online_batch, offline_batch)
+
+    def get_iterator(
+        self,
+        batch_size: int,
+        async_prefetch: bool = True,
+        queue_size: int = 2,
+    ):
+        """Yield batches by composing buffer async iterators."""
+
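+        # Match `sample()`: with no offline buffer, draw the full batch from the online buffer.
+        n_online = batch_size
+        if self.offline_buffer is not None:
+            n_online = max(1, int(batch_size * self.online_ratio))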
+
+        online_iter = self.online_buffer.get_iterator(
+            batch_size=n_online,
+            async_prefetch=async_prefetch,
+            queue_size=queue_size,
+        )
+
+        if self.offline_buffer is None:
+            yield from online_iter
+            return
+
+        n_offline = batch_size - n_online
+        offline_iter = self.offline_buffer.get_iterator(
+            batch_size=n_offline,
+            async_prefetch=async_prefetch,
+            queue_size=queue_size,
+        )
+
+        while True:
+            yield concatenate_batch_transitions(next(online_iter), next(offline_iter))
@@ -17,7 +17,6 @@ import logging

 from lerobot.cameras import opencv  # noqa: F401
 from lerobot.configs import parser
-from lerobot.configs.train import TrainRLServerPipelineConfig
 from lerobot.datasets import LeRobotDataset
 from lerobot.policies import make_policy
 from lerobot.robots import (  # noqa: F401
@@ -31,6 +30,7 @@ from lerobot.teleoperators import (
 )

 from .gym_manipulator import make_robot_env
+from .train_rl import TrainRLServerPipelineConfig

 logging.basicConfig(level=logging.INFO)

@@ -74,6 +74,7 @@ from lerobot.teleoperators import (
 from lerobot.teleoperators.teleoperator import Teleoperator
 from lerobot.teleoperators.utils import TeleopEvents
 from lerobot.utils.constants import ACTION, DONE, OBS_IMAGES, OBS_STATE, REWARD
+from lerobot.utils.import_utils import require_package
 from lerobot.utils.robot_utils import precise_sleep
 from lerobot.utils.utils import log_say

@@ -312,6 +313,7 @@ def make_robot_env(cfg: HILSerlRobotEnvConfig) -> tuple[gym.Env, Any]:
    # Check if this is a GymHIL simulation environment
    if cfg.name == "gym_hil":
        assert cfg.robot is None and cfg.teleop is None, "GymHIL environment does not support robot or teleop"
+        require_package("gym-hil", extra="hilserl", import_name="gym_hil")
        import gym_hil  # noqa: F401

        # Extract gripper settings with defaults
@@ -383,10 +385,21 @@ def make_processors(
            GymHILAdapterProcessorStep(),
            Numpy2TorchActionProcessorStep(),
            VanillaObservationProcessorStep(),
-            AddBatchDimensionProcessorStep(),
-            DeviceProcessorStep(device=device),
        ]

+        # Add time limit processor if reset config exists
+        if cfg.processor.reset is not None:
+            env_pipeline_steps.append(
+                TimeLimitProcessorStep(max_episode_steps=int(cfg.processor.reset.control_time_s * cfg.fps))
+            )
+
+        env_pipeline_steps.extend(
+            [
+                AddBatchDimensionProcessorStep(),
+                DeviceProcessorStep(device=device),
+            ]
+        )
+
        return DataProcessorPipeline(
            steps=env_pipeline_steps, to_transition=identity_transition, to_output=identity_transition
        ), DataProcessorPipeline(
@@ -551,8 +564,19 @@ def step_env_and_process_transition(
    terminated = terminated or processed_action_transition[TransitionKey.DONE]
    truncated = truncated or processed_action_transition[TransitionKey.TRUNCATED]
    complementary_data = processed_action_transition[TransitionKey.COMPLEMENTARY_DATA].copy()
+
+    if hasattr(env, "get_raw_joint_positions"):
+        raw_joint_positions = env.get_raw_joint_positions()
+        if raw_joint_positions is not None:
+            complementary_data["raw_joint_positions"] = raw_joint_positions
+
+    # Merge env and action-processor info: keep every env key, and copy only the
+    # `TeleopEvents` enum keys from the action processor (its other keys are dropped)
+    action_info = processed_action_transition[TransitionKey.INFO]
    new_info = info.copy()
-    new_info.update(processed_action_transition[TransitionKey.INFO])
+    for key, value in action_info.items():
+        if isinstance(key, TeleopEvents):
+            new_info[key] = value

    new_transition = create_transition(
        observation=obs,
@@ -568,6 +592,24 @@ def step_env_and_process_transition(
    return new_transition


+def reset_and_build_transition(
+    env: gym.Env,
+    env_processor: DataProcessorPipeline[EnvTransition, EnvTransition],
+    action_processor: DataProcessorPipeline[EnvTransition, EnvTransition],
+) -> EnvTransition:
+    """Reset env + processors and return the first env-processed transition."""
+    obs, info = env.reset()
+    env_processor.reset()
+    action_processor.reset()
+    complementary_data: dict[str, Any] = {}
+    if hasattr(env, "get_raw_joint_positions"):
+        raw_joint_positions = env.get_raw_joint_positions()
+        if raw_joint_positions is not None:
+            complementary_data["raw_joint_positions"] = raw_joint_positions
+    transition = create_transition(observation=obs, info=info, complementary_data=complementary_data)
+    return env_processor(data=transition)
+
+
 def control_loop(
    env: gym.Env,
    env_processor: DataProcessorPipeline[EnvTransition, EnvTransition],
@@ -593,17 +635,7 @@ def control_loop(
    print("- When not intervening, robot will stay still")
    print("- Press Ctrl+C to exit")

-    # Reset environment and processors
-    obs, info = env.reset()
-    complementary_data = (
-        {"raw_joint_positions": info.pop("raw_joint_positions")} if "raw_joint_positions" in info else {}
-    )
-    env_processor.reset()
-    action_processor.reset()
-
-    # Process initial observation
-    transition = create_transition(observation=obs, info=info, complementary_data=complementary_data)
-    transition = env_processor(data=transition)
+    transition = reset_and_build_transition(env, env_processor, action_processor)

    # Determine if gripper is used
    use_gripper = cfg.env.processor.gripper.use_gripper if cfg.env.processor.gripper is not None else True
@@ -659,79 +691,82 @@ def control_loop(
    episode_step = 0
    episode_start_time = time.perf_counter()

-    while episode_idx < cfg.dataset.num_episodes_to_record:
-        step_start_time = time.perf_counter()
+    try:
+        while episode_idx < cfg.dataset.num_episodes_to_record:
+            step_start_time = time.perf_counter()

-        # Create a neutral action (no movement)
-        neutral_action = torch.tensor([0.0, 0.0, 0.0], dtype=torch.float32)
-        if use_gripper:
-            neutral_action = torch.cat([neutral_action, torch.tensor([0.0])])  # Gripper stay
+            # Create a neutral action (no movement)
+            neutral_action = torch.tensor([0.0, 0.0, 0.0], dtype=torch.float32)
+            if use_gripper:
+                neutral_action = torch.cat([neutral_action, torch.tensor([1.0])])  # Gripper stay

-        # Use the new step function
-        transition = step_env_and_process_transition(
-            env=env,
-            transition=transition,
-            action=neutral_action,
-            env_processor=env_processor,
-            action_processor=action_processor,
-        )
-        terminated = transition.get(TransitionKey.DONE, False)
-        truncated = transition.get(TransitionKey.TRUNCATED, False)
-
-        if cfg.mode == "record":
-            observations = {
+            observation = {
                k: v.squeeze(0).cpu()
                for k, v in transition[TransitionKey.OBSERVATION].items()
                if isinstance(v, torch.Tensor)
            }
-            # Use teleop_action if available, otherwise use the action from the transition
-            action_to_record = transition[TransitionKey.COMPLEMENTARY_DATA].get(
-                "teleop_action", transition[TransitionKey.ACTION]
+
+            transition = step_env_and_process_transition(
+                env=env,
+                transition=transition,
+                action=neutral_action,
+                env_processor=env_processor,
+                action_processor=action_processor,
            )
-            frame = {
-                **observations,
-                ACTION: action_to_record.cpu(),
-                REWARD: np.array([transition[TransitionKey.REWARD]], dtype=np.float32),
-                DONE: np.array([terminated or truncated], dtype=bool),
-            }
-            if use_gripper:
-                discrete_penalty = transition[TransitionKey.COMPLEMENTARY_DATA].get("discrete_penalty", 0.0)
-                frame["complementary_info.discrete_penalty"] = np.array([discrete_penalty], dtype=np.float32)
+            terminated = transition.get(TransitionKey.DONE, False)
+            truncated = transition.get(TransitionKey.TRUNCATED, False)

-            if dataset is not None:
-                frame["task"] = cfg.dataset.task
-                dataset.add_frame(frame)
+            if cfg.mode == "record":
+                action_to_record = transition[TransitionKey.COMPLEMENTARY_DATA].get(
+                    "teleop_action", transition[TransitionKey.ACTION]
+                )
+                frame = {
+                    **observation,
+                    ACTION: action_to_record.cpu(),
+                    REWARD: np.array([transition[TransitionKey.REWARD]], dtype=np.float32),
+                    DONE: np.array([terminated or truncated], dtype=bool),
+                }
+                if use_gripper:
+                    discrete_penalty = transition[TransitionKey.COMPLEMENTARY_DATA].get(
+                        "discrete_penalty", 0.0
+                    )
+                    frame["complementary_info.discrete_penalty"] = np.array(
+                        [discrete_penalty], dtype=np.float32
+                    )

-        episode_step += 1
+                if dataset is not None:
+                    frame["task"] = cfg.dataset.task
+                    dataset.add_frame(frame)

-        # Handle episode termination
-        if terminated or truncated:
-            episode_time = time.perf_counter() - episode_start_time
-            logging.info(
-                f"Episode ended after {episode_step} steps in {episode_time:.1f}s with reward {transition[TransitionKey.REWARD]}"
-            )
-            episode_step = 0
-            episode_idx += 1
+            episode_step += 1

-            if dataset is not None:
-                if transition[TransitionKey.INFO].get(TeleopEvents.RERECORD_EPISODE, False):
-                    logging.info(f"Re-recording episode {episode_idx}")
-                    dataset.clear_episode_buffer()
-                    episode_idx -= 1
-                else:
-                    logging.info(f"Saving episode {episode_idx}")
-                    dataset.save_episode()
+            # Handle episode termination
+            if terminated or truncated:
+                episode_time = time.perf_counter() - episode_start_time
+                logging.info(
+                    f"Episode ended after {episode_step} steps in {episode_time:.1f}s with reward {transition[TransitionKey.REWARD]}"
+                )
+                episode_step = 0
+                episode_idx += 1

-            # Reset for new episode
-            obs, info = env.reset()
-            env_processor.reset()
-            action_processor.reset()
+                if dataset is not None:
+                    if transition[TransitionKey.INFO].get(TeleopEvents.RERECORD_EPISODE, False):
+                        logging.info(f"Re-recording episode {episode_idx}")
+                        dataset.clear_episode_buffer()
+                        episode_idx -= 1
+                    else:
+                        logging.info(f"Saving episode {episode_idx}")
+                        dataset.save_episode()

-            transition = create_transition(observation=obs, info=info)
-            transition = env_processor(transition)
+                # Reset for new episode
+                transition = reset_and_build_transition(env, env_processor, action_processor)

-        # Maintain fps timing
-        precise_sleep(max(dt - (time.perf_counter() - step_start_time), 0.0))
+            # Maintain fps timing
+            precise_sleep(max(dt - (time.perf_counter() - step_start_time), 0.0))
+    finally:
+        if dataset is not None and dataset.writer is not None and dataset.writer.image_writer is not None:
+            logging.info("Waiting for image writer to finish...")
+            dataset.writer.image_writer.stop()

    if dataset is not None and cfg.dataset.push_to_hub:
        logging.info("Finalizing dataset before pushing to hub")
@@ -51,9 +51,21 @@ import time
 from concurrent.futures import ThreadPoolExecutor
 from pathlib import Path
 from pprint import pformat
+from typing import TYPE_CHECKING, Any
+
+from lerobot.utils.import_utils import _grpc_available, require_package
+
+if TYPE_CHECKING or _grpc_available:
+    import grpc
+
+    from lerobot.transport import services_pb2_grpc
+else:
+    grpc = None
+    services_pb2_grpc = None

-import grpc
 import torch
+from huggingface_hub.constants import SAFETENSORS_SINGLE_FILE
+from safetensors.torch import load_file as load_safetensors
 from termcolor import colored
 from torch import nn
 from torch.multiprocessing import Queue
@@ -68,14 +80,11 @@ from lerobot.common.train_utils import (
 )
 from lerobot.common.wandb_utils import WandBLogger
 from lerobot.configs import parser
-from lerobot.configs.train import TrainRLServerPipelineConfig
 from lerobot.datasets import LeRobotDataset, make_dataset
-from lerobot.policies import make_policy
-from lerobot.policies.sac.modeling_sac import SACPolicy
+from lerobot.policies import make_policy, make_pre_post_processors
 from lerobot.robots import so_follower  # noqa: F401
 from lerobot.teleoperators import gamepad, so_leader  # noqa: F401
 from lerobot.teleoperators.utils import TeleopEvents
-from lerobot.transport import services_pb2_grpc
 from lerobot.transport.utils import (
    MAX_MESSAGE_SIZE,
    bytes_to_python_object,
@@ -84,26 +93,35 @@ from lerobot.transport.utils import (
 )
 from lerobot.utils.constants import (
    ACTION,
+    ALGORITHM_DIR,
    CHECKPOINTS_DIR,
    LAST_CHECKPOINT_LINK,
    PRETRAINED_MODEL_DIR,
    TRAINING_STATE_DIR,
+    TRAINING_STEP,
 )
 from lerobot.utils.device_utils import get_safe_torch_device
+from lerobot.utils.io_utils import load_json, write_json
 from lerobot.utils.process import ProcessSignalHandler
 from lerobot.utils.random_utils import set_seed
-from lerobot.utils.transition import move_state_dict_to_device, move_transition_to_device
 from lerobot.utils.utils import (
    format_big_number,
    init_logging,
 )

-from .buffer import ReplayBuffer, concatenate_batch_transitions
+from .algorithms.base import RLAlgorithm
+from .algorithms.factory import make_algorithm
+from .buffer import ReplayBuffer
+from .data_sources import OnlineOfflineMixer
 from .learner_service import MAX_WORKERS, SHUTDOWN_TIMEOUT, LearnerService
+from .train_rl import TrainRLServerPipelineConfig
+from .trainer import RLTrainer


@parser.wrap()
 def train_cli(cfg: TrainRLServerPipelineConfig):
+    # Fail fast with a friendly error if the optional ``hilserl`` extra is missing.
+    require_package("grpcio", extra="hilserl", import_name="grpc")
    if not use_threads(cfg):
        import torch.multiprocessing as mp

@@ -179,7 +197,7 @@ def train(cfg: TrainRLServerPipelineConfig, job_name: str | None = None):
 def start_learner_threads(
    cfg: TrainRLServerPipelineConfig,
    wandb_logger: WandBLogger | None,
-    shutdown_event: any,  # Event,
+    shutdown_event: Any,  # Event
 ) -> None:
    """
    Start the learner threads for training.
@@ -253,7 +271,7 @@ def start_learner_threads(
 def add_actor_information_and_train(
    cfg: TrainRLServerPipelineConfig,
    wandb_logger: WandBLogger | None,
-    shutdown_event: any,  # Event,
+    shutdown_event: Any,  # Event
    transition_queue: Queue,
    interaction_message_queue: Queue,
    parameters_queue: Queue,
@@ -266,8 +284,8 @@ def add_actor_information_and_train(
    - Transfers transitions from the actor to the replay buffer.
    - Logs received interaction messages.
    - Ensures training begins only when the replay buffer has a sufficient number of transitions.
-    - Samples batches from the replay buffer and performs multiple critic updates.
-    - Periodically updates the actor, critic, and temperature optimizers.
+    - Delegates training updates to an ``RLAlgorithm``.
+    - Periodically pushes updated weights to actors.
    - Logs training statistics, including loss values and optimization frequency.

    NOTE: This function doesn't have a single responsibility, it should be split into multiple functions
@@ -286,17 +304,13 @@ def add_actor_information_and_train(
    # of 7%
    device = get_safe_torch_device(try_device=cfg.policy.device, log=True)
    storage_device = get_safe_torch_device(try_device=cfg.policy.storage_device)
-    clip_grad_norm_value = cfg.policy.grad_clip_norm
    online_step_before_learning = cfg.policy.online_step_before_learning
-    utd_ratio = cfg.policy.utd_ratio
    fps = cfg.env.fps
    log_freq = cfg.log_freq
    save_freq = cfg.save_freq
-    policy_update_freq = cfg.policy.policy_update_freq
    policy_parameters_push_frequency = cfg.policy.actor_learner_config.policy_parameters_push_frequency
    saving_checkpoint = cfg.save_checkpoint
    online_steps = cfg.policy.online_steps
-    async_prefetch = cfg.policy.async_prefetch

    # Initialize logging for multiprocessing
    if not use_threads(cfg):
@@ -308,7 +322,7 @@ def add_actor_information_and_train(

    logging.info("Initializing policy")

-    policy: SACPolicy = make_policy(
+    policy = make_policy(
        cfg=cfg.policy,
        env_cfg=cfg.env,
    )
@@ -317,15 +331,17 @@ def add_actor_information_and_train(

    policy.train()

-    push_actor_policy_to_queue(parameters_queue=parameters_queue, policy=policy)
+    algorithm = make_algorithm(cfg=cfg.algorithm, policy=policy)

+    preprocessor, postprocessor = make_pre_post_processors(
+        policy_cfg=cfg.policy,
+        dataset_stats=cfg.policy.dataset_stats,
+    )
+
+    # Push initial policy weights to actors
+    push_actor_policy_to_queue(parameters_queue=parameters_queue, algorithm=algorithm)
    last_time_policy_pushed = time.time()

-    optimizers, lr_scheduler = make_optimizers_and_scheduler(cfg=cfg, policy=policy)
-
-    # If we are resuming, we need to load the training state
-    resume_optimization_step, resume_interaction_step = load_training_state(cfg=cfg, optimizers=optimizers)
-
    log_training_info(cfg=cfg, policy=policy)

    replay_buffer = initialize_replay_buffer(cfg, device, storage_device)
@@ -338,21 +354,37 @@ def add_actor_information_and_train(
            device=device,
            storage_device=storage_device,
        )
-        batch_size: int = batch_size // 2  # We will sample from both replay buffer
+
+    # DataMixer: online-only, or an online/offline mix weighted by cfg.online_ratio (default 0.5)
+    data_mixer = OnlineOfflineMixer(
+        online_buffer=replay_buffer,
+        offline_buffer=offline_replay_buffer,
+        online_ratio=cfg.online_ratio,
+    )
+    # RLTrainer owns the iterator and preprocessor, and creates the optimizers.
+    trainer = RLTrainer(
+        algorithm=algorithm,
+        data_mixer=data_mixer,
+        batch_size=batch_size,
+        preprocessor=preprocessor,
+    )
+
+    # If we are resuming, we need to load the training state
+    optimizers = algorithm.get_optimizers()
+    resume_optimization_step, resume_interaction_step = load_training_state(
+        cfg=cfg, optimizers=optimizers, algorithm=algorithm, device=device
+    )

    logging.info("Starting learner thread")
    interaction_message = None
    optimization_step = resume_optimization_step if resume_optimization_step is not None else 0
+    algorithm.optimization_step = optimization_step
    interaction_step_shift = resume_interaction_step if resume_interaction_step is not None else 0

    dataset_repo_id = None
    if cfg.dataset is not None:
        dataset_repo_id = cfg.dataset.repo_id

-    # Initialize iterators
-    online_iterator = None
-    offline_iterator = None
-
    # NOTE: THIS IS THE MAIN LOOP OF THE LEARNER
    while True:
        # Exit the training loop if shutdown is requested
@@ -365,7 +397,6 @@ def add_actor_information_and_train(
            transition_queue=transition_queue,
            replay_buffer=replay_buffer,
            offline_replay_buffer=offline_replay_buffer,
-            device=device,
            dataset_repo_id=dataset_repo_id,
            shutdown_event=shutdown_event,
        )
@@ -382,180 +413,20 @@ def add_actor_information_and_train(
        if len(replay_buffer) < online_step_before_learning:
            continue

-        if online_iterator is None:
-            online_iterator = replay_buffer.get_iterator(
-                batch_size=batch_size, async_prefetch=async_prefetch, queue_size=2
-            )
-
-        if offline_replay_buffer is not None and offline_iterator is None:
-            offline_iterator = offline_replay_buffer.get_iterator(
-                batch_size=batch_size, async_prefetch=async_prefetch, queue_size=2
-            )
-
        time_for_one_optimization_step = time.time()
-        for _ in range(utd_ratio - 1):
-            # Sample from the iterators
-            batch = next(online_iterator)

-            if dataset_repo_id is not None:
-                batch_offline = next(offline_iterator)
-                batch = concatenate_batch_transitions(
-                    left_batch_transitions=batch, right_batch_transition=batch_offline
-                )
-
-            actions = batch[ACTION]
-            rewards = batch["reward"]
-            observations = batch["state"]
-            next_observations = batch["next_state"]
-            done = batch["done"]
-            check_nan_in_transition(observations=observations, actions=actions, next_state=next_observations)
-
-            observation_features, next_observation_features = get_observation_features(
-                policy=policy, observations=observations, next_observations=next_observations
-            )
-
-            # Create a batch dictionary with all required elements for the forward method
-            forward_batch = {
-                ACTION: actions,
-                "reward": rewards,
-                "state": observations,
-                "next_state": next_observations,
-                "done": done,
-                "observation_feature": observation_features,
-                "next_observation_feature": next_observation_features,
-                "complementary_info": batch["complementary_info"],
-            }
-
-            # Use the forward method for critic loss
-            critic_output = policy.forward(forward_batch, model="critic")
-
-            # Main critic optimization
-            loss_critic = critic_output["loss_critic"]
-            optimizers["critic"].zero_grad()
-            loss_critic.backward()
-            critic_grad_norm = torch.nn.utils.clip_grad_norm_(
-                parameters=policy.critic_ensemble.parameters(), max_norm=clip_grad_norm_value
-            )
-            optimizers["critic"].step()
-
-            # Discrete critic optimization (if available)
-            if policy.config.num_discrete_actions is not None:
-                discrete_critic_output = policy.forward(forward_batch, model="discrete_critic")
-                loss_discrete_critic = discrete_critic_output["loss_discrete_critic"]
-                optimizers["discrete_critic"].zero_grad()
-                loss_discrete_critic.backward()
-                discrete_critic_grad_norm = torch.nn.utils.clip_grad_norm_(
-                    parameters=policy.discrete_critic.parameters(), max_norm=clip_grad_norm_value
-                )
-                optimizers["discrete_critic"].step()
-
-            # Update target networks (main and discrete)
-            policy.update_target_networks()
-
-        # Sample for the last update in the UTD ratio
-        batch = next(online_iterator)
-
-        if dataset_repo_id is not None:
-            batch_offline = next(offline_iterator)
-            batch = concatenate_batch_transitions(
-                left_batch_transitions=batch, right_batch_transition=batch_offline
-            )
-
-        actions = batch[ACTION]
-        rewards = batch["reward"]
-        observations = batch["state"]
-        next_observations = batch["next_state"]
-        done = batch["done"]
-
-        check_nan_in_transition(observations=observations, actions=actions, next_state=next_observations)
-
-        observation_features, next_observation_features = get_observation_features(
-            policy=policy, observations=observations, next_observations=next_observations
-        )
-
-        # Create a batch dictionary with all required elements for the forward method
-        forward_batch = {
-            ACTION: actions,
-            "reward": rewards,
-            "state": observations,
-            "next_state": next_observations,
-            "done": done,
-            "observation_feature": observation_features,
-            "next_observation_feature": next_observation_features,
-        }
-
-        critic_output = policy.forward(forward_batch, model="critic")
-
-        loss_critic = critic_output["loss_critic"]
-        optimizers["critic"].zero_grad()
-        loss_critic.backward()
-        critic_grad_norm = torch.nn.utils.clip_grad_norm_(
-            parameters=policy.critic_ensemble.parameters(), max_norm=clip_grad_norm_value
-        ).item()
-        optimizers["critic"].step()
-
-        # Initialize training info dictionary
-        training_infos = {
-            "loss_critic": loss_critic.item(),
-            "critic_grad_norm": critic_grad_norm,
-        }
-
-        # Discrete critic optimization (if available)
-        if policy.config.num_discrete_actions is not None:
-            discrete_critic_output = policy.forward(forward_batch, model="discrete_critic")
-            loss_discrete_critic = discrete_critic_output["loss_discrete_critic"]
-            optimizers["discrete_critic"].zero_grad()
-            loss_discrete_critic.backward()
-            discrete_critic_grad_norm = torch.nn.utils.clip_grad_norm_(
-                parameters=policy.discrete_critic.parameters(), max_norm=clip_grad_norm_value
-            ).item()
-            optimizers["discrete_critic"].step()
-
-            # Add discrete critic info to training info
-            training_infos["loss_discrete_critic"] = loss_discrete_critic.item()
-            training_infos["discrete_critic_grad_norm"] = discrete_critic_grad_norm
-
-        # Actor and temperature optimization (at specified frequency)
-        if optimization_step % policy_update_freq == 0:
-            for _ in range(policy_update_freq):
-                # Actor optimization
-                actor_output = policy.forward(forward_batch, model="actor")
-                loss_actor = actor_output["loss_actor"]
-                optimizers["actor"].zero_grad()
-                loss_actor.backward()
-                actor_grad_norm = torch.nn.utils.clip_grad_norm_(
-                    parameters=policy.actor.parameters(), max_norm=clip_grad_norm_value
-                ).item()
-                optimizers["actor"].step()
-
-                # Add actor info to training info
-                training_infos["loss_actor"] = loss_actor.item()
-                training_infos["actor_grad_norm"] = actor_grad_norm
-
-                # Temperature optimization
-                temperature_output = policy.forward(forward_batch, model="temperature")
-                loss_temperature = temperature_output["loss_temperature"]
-                optimizers["temperature"].zero_grad()
-                loss_temperature.backward()
-                temp_grad_norm = torch.nn.utils.clip_grad_norm_(
-                    parameters=[policy.log_alpha], max_norm=clip_grad_norm_value
-                ).item()
-                optimizers["temperature"].step()
-
-                # Add temperature info to training info
-                training_infos["loss_temperature"] = loss_temperature.item()
-                training_infos["temperature_grad_norm"] = temp_grad_norm
-                training_infos["temperature"] = policy.temperature
+        # One training step (trainer owns data_mixer iterator; algorithm owns UTD loop)
+        stats = trainer.training_step()

        # Push policy to actors if needed
        if time.time() - last_time_policy_pushed > policy_parameters_push_frequency:
-            push_actor_policy_to_queue(parameters_queue=parameters_queue, policy=policy)
+            push_actor_policy_to_queue(parameters_queue=parameters_queue, algorithm=algorithm)
            last_time_policy_pushed = time.time()

-        # Update target networks (main and discrete)
-        policy.update_target_networks()
+        training_infos = stats.to_log_dict()

        # Log training metrics at specified intervals
+        optimization_step = algorithm.optimization_step
        if optimization_step % log_freq == 0:
            training_infos["replay_buffer_size"] = len(replay_buffer)
            if offline_replay_buffer is not None:
@@ -583,7 +454,6 @@ def add_actor_information_and_train(
                custom_step_key="Optimization step",
            )

-        optimization_step += 1
        if optimization_step % log_freq == 0:
            logging.info(f"[LEARNER] Number of optimization step: {optimization_step}")

@@ -597,9 +467,12 @@ def add_actor_information_and_train(
                policy=policy,
                optimizers=optimizers,
                replay_buffer=replay_buffer,
+                algorithm=algorithm,
                offline_replay_buffer=offline_replay_buffer,
                dataset_repo_id=dataset_repo_id,
                fps=fps,
+                preprocessor=preprocessor,
+                postprocessor=postprocessor,
            )


@@ -607,7 +480,7 @@ def start_learner(
    parameters_queue: Queue,
    transition_queue: Queue,
    interaction_message_queue: Queue,
-    shutdown_event: any,  # Event,
+    shutdown_event: Any,  # Event
    cfg: TrainRLServerPipelineConfig,
 ):
    """
@@ -681,9 +554,12 @@ def save_training_checkpoint(
    policy: nn.Module,
    optimizers: dict[str, Optimizer],
    replay_buffer: ReplayBuffer,
+    algorithm: RLAlgorithm | None = None,
    offline_replay_buffer: ReplayBuffer | None = None,
    dataset_repo_id: str | None = None,
    fps: int = 30,
+    preprocessor=None,
+    postprocessor=None,
 ) -> None:
    """
    Save training checkpoint and associated data.
@@ -707,6 +583,8 @@ def save_training_checkpoint(
        offline_replay_buffer: Optional offline replay buffer to save
        dataset_repo_id: Repository ID for dataset
        fps: Frames per second for dataset
+        preprocessor: Optional preprocessor pipeline to save
+        postprocessor: Optional postprocessor pipeline to save
    """
    logging.info(f"Checkpoint policy after step {optimization_step}")
    _num_digits = max(6, len(str(online_steps)))
@@ -715,7 +593,7 @@ def save_training_checkpoint(
    # Create checkpoint directory
    checkpoint_dir = get_step_checkpoint_dir(cfg.output_dir, online_steps, optimization_step)

-    # Save checkpoint
+    # Save policy artifacts (pretrained_model/) + Trainer scaffolding (training_state/).
    save_checkpoint(
        checkpoint_dir=checkpoint_dir,
        step=optimization_step,
@@ -723,13 +601,22 @@ def save_training_checkpoint(
        policy=policy,
        optimizer=optimizers,
        scheduler=None,
+        preprocessor=preprocessor,
+        postprocessor=postprocessor,
    )

-    # Save interaction step manually
-    training_state_dir = os.path.join(checkpoint_dir, TRAINING_STATE_DIR)
-    os.makedirs(training_state_dir, exist_ok=True)
-    training_state = {"step": optimization_step, "interaction_step": interaction_step}
-    torch.save(training_state, os.path.join(training_state_dir, "training_state.pt"))
+    # Algorithm-owned tensors live in their own component subfolder
+    # so they can be `push_to_hub`'d independently and don't bloat the inference artifact.
+    if algorithm is not None:
+        algorithm.save_pretrained(checkpoint_dir / ALGORITHM_DIR)
+
+    # Enrich training_step.json with the RL-specific interaction_step counter so
+    # both can be restored from a single file.
+    training_state_dir = checkpoint_dir / TRAINING_STATE_DIR
+    write_json(
+        {"step": optimization_step, "interaction_step": interaction_step},
+        training_state_dir / TRAINING_STEP,
+    )

    # Update the "last" symlink
    update_last_checkpoint(checkpoint_dir)
@@ -760,58 +647,6 @@ def save_training_checkpoint(
    logging.info("Resume training")


-def make_optimizers_and_scheduler(cfg: TrainRLServerPipelineConfig, policy: nn.Module):
-    """
-    Creates and returns optimizers for the actor, critic, and temperature components of a reinforcement learning policy.
-
-    This function sets up Adam optimizers for:
-    - The **actor network**, ensuring that only relevant parameters are optimized.
-    - The **critic ensemble**, which evaluates the value function.
-    - The **temperature parameter**, which controls the entropy in soft actor-critic (SAC)-like methods.
-
-    It also initializes a learning rate scheduler, though currently, it is set to `None`.
-
-    NOTE:
-    - If the encoder is shared, its parameters are excluded from the actor's optimization process.
-    - The policy's log temperature (`log_alpha`) is wrapped in a list to ensure proper optimization as a standalone tensor.
-
-    Args:
-        cfg: Configuration object containing hyperparameters.
-        policy (nn.Module): The policy model containing the actor, critic, and temperature components.
-
-    Returns:
-        Tuple[Dict[str, torch.optim.Optimizer], Optional[torch.optim.lr_scheduler._LRScheduler]]:
-        A tuple containing:
-        - `optimizers`: A dictionary mapping component names ("actor", "critic", "temperature") to their respective Adam optimizers.
-        - `lr_scheduler`: Currently set to `None` but can be extended to support learning rate scheduling.
-
-    """
-    optimizer_actor = torch.optim.Adam(
-        params=[
-            p
-            for n, p in policy.actor.named_parameters()
-            if not policy.config.shared_encoder or not n.startswith("encoder")
-        ],
-        lr=cfg.policy.actor_lr,
-    )
-    optimizer_critic = torch.optim.Adam(params=policy.critic_ensemble.parameters(), lr=cfg.policy.critic_lr)
-
-    if cfg.policy.num_discrete_actions is not None:
-        optimizer_discrete_critic = torch.optim.Adam(
-            params=policy.discrete_critic.parameters(), lr=cfg.policy.critic_lr
-        )
-    optimizer_temperature = torch.optim.Adam(params=[policy.log_alpha], lr=cfg.policy.critic_lr)
-    lr_scheduler = None
-    optimizers = {
-        "actor": optimizer_actor,
-        "critic": optimizer_critic,
-        "temperature": optimizer_temperature,
-    }
-    if cfg.policy.num_discrete_actions is not None:
-        optimizers["discrete_critic"] = optimizer_discrete_critic
-    return optimizers, lr_scheduler
-
-
 # Training setup functions


@@ -875,13 +710,20 @@ def handle_resume_logic(cfg: TrainRLServerPipelineConfig) -> TrainRLServerPipeli
 def load_training_state(
    cfg: TrainRLServerPipelineConfig,
    optimizers: Optimizer | dict[str, Optimizer],
+    algorithm: RLAlgorithm | None = None,
+    device: str | torch.device = "cpu",
 ):
    """
-    Loads the training state (optimizers, step count, etc.) from a checkpoint.
+    Loads the training state (optimizers, RNG, step + interaction step, and
+    algorithm-owned tensors) from the most recent checkpoint.

    Args:
-        cfg (TrainRLServerPipelineConfig): Training configuration
-        optimizers (Optimizer | dict): Optimizers to load state into
+        cfg: Training configuration.
+        optimizers: Optimizers to load state into.
+        algorithm: Algorithm whose state dict should be restored.
+            Required for full main-equivalent resume;
+            the policy itself is restored separately via ``make_policy``.
+        device: Device on which to place loaded algorithm tensors.

    Returns:
        tuple: (optimization_step, interaction_step) or (None, None) if not resuming
@@ -890,20 +732,31 @@ def load_training_state(
        return None, None

    # Construct path to the last checkpoint directory
-    checkpoint_dir = os.path.join(cfg.output_dir, CHECKPOINTS_DIR, LAST_CHECKPOINT_LINK)
+    checkpoint_dir = Path(cfg.output_dir) / CHECKPOINTS_DIR / LAST_CHECKPOINT_LINK

    logging.info(f"Loading training state from {checkpoint_dir}")

    try:
-        # Use the utility function from train_utils which loads the optimizer state
-        step, optimizers, _ = utils_load_training_state(Path(checkpoint_dir), optimizers, None)
+        # Restore optimizers + RNG + step from the standard `training_state/` folder
+        step, optimizers, _ = utils_load_training_state(checkpoint_dir, optimizers, None)

-        # Load interaction step separately from training_state.pt
-        training_state_path = os.path.join(checkpoint_dir, TRAINING_STATE_DIR, "training_state.pt")
-        interaction_step = 0
-        if os.path.exists(training_state_path):
-            training_state = torch.load(training_state_path, weights_only=False)  # nosec B614: Safe usage of torch.load
-            interaction_step = training_state.get("interaction_step", 0)
+        # Restore algorithm-owned tensors
+        if algorithm is not None:
+            algo_dir = checkpoint_dir / ALGORITHM_DIR
+            if algo_dir.is_dir():
+                tensors = load_safetensors(str(algo_dir / SAFETENSORS_SINGLE_FILE))
+                algorithm.load_state_dict(tensors, device=device)
+                logging.info(f"Loaded algorithm state from {algo_dir}")
+            else:
+                logging.warning(
+                    f"No algorithm state found at {algo_dir}; "
+                    "algorithm-owned tensors will keep their freshly-initialised values. Adam moments "
+                    "restored from the old optimizer state may not match these reset parameters."
+                )
+
+        # Read interaction_step from the enriched training_step.json
+        training_step_path = checkpoint_dir / TRAINING_STATE_DIR / TRAINING_STEP
+        interaction_step = int(load_json(training_step_path).get("interaction_step", 0))

        logging.info(f"Resuming from step {step}, interaction step {interaction_step}")
        return step, interaction_step
@@ -1016,33 +869,6 @@ def initialize_offline_replay_buffer(
 # Utilities/Helpers functions


-def get_observation_features(
-    policy: SACPolicy, observations: torch.Tensor, next_observations: torch.Tensor
-) -> tuple[torch.Tensor | None, torch.Tensor | None]:
-    """
-    Get observation features from the policy encoder. It act as cache for the observation features.
-    when the encoder is frozen, the observation features are not updated.
-    We can save compute by caching the observation features.
-
-    Args:
-        policy: The policy model
-        observations: The current observations
-        next_observations: The next observations
-
-    Returns:
-        tuple: observation_features, next_observation_features
-    """
-
-    if policy.config.vision_encoder_name is None or not policy.config.freeze_vision_encoder:
-        return None, None
-
-    with torch.no_grad():
-        observation_features = policy.actor.encoder.get_cached_image_features(observations)
-        next_observation_features = policy.actor.encoder.get_cached_image_features(next_observations)
-
-    return observation_features, next_observation_features
-
-
 def use_threads(cfg: TrainRLServerPipelineConfig) -> bool:
    return cfg.policy.concurrency.learner == "threads"

@@ -1093,19 +919,11 @@ def check_nan_in_transition(
    return nan_detected


-def push_actor_policy_to_queue(parameters_queue: Queue, policy: nn.Module):
+def push_actor_policy_to_queue(parameters_queue: Queue, algorithm: RLAlgorithm) -> None:
    logging.debug("[LEARNER] Pushing actor policy to the queue")

    # Create a dictionary to hold all the state dicts
-    state_dicts = {"policy": move_state_dict_to_device(policy.actor.state_dict(), device="cpu")}
-
-    # Add discrete critic if it exists
-    if hasattr(policy, "discrete_critic") and policy.discrete_critic is not None:
-        state_dicts["discrete_critic"] = move_state_dict_to_device(
-            policy.discrete_critic.state_dict(), device="cpu"
-        )
-        logging.debug("[LEARNER] Including discrete critic in state dict push")
-
+    state_dicts = algorithm.get_weights()
    state_bytes = state_to_bytes(state_dicts)
    parameters_queue.put(state_bytes)

@@ -1129,9 +947,8 @@ def process_transitions(
    transition_queue: Queue,
    replay_buffer: ReplayBuffer,
    offline_replay_buffer: ReplayBuffer,
-    device: str,
    dataset_repo_id: str | None,
-    shutdown_event: any,
+    shutdown_event: Any,  # Event
 ):
    """Process all available transitions from the queue.

@@ -1139,7 +956,6 @@ def process_transitions(
        transition_queue: Queue for receiving transitions from the actor
        replay_buffer: Replay buffer to add transitions to
        offline_replay_buffer: Offline replay buffer to add transitions to
-        device: Device to move transitions to
        dataset_repo_id: Repository ID for dataset
        shutdown_event: Event to signal shutdown
    """
@@ -1148,8 +964,6 @@ def process_transitions(
        transition_list = bytes_to_transitions(buffer=transition_list)

        for transition in transition_list:
-            transition = move_transition_to_device(transition=transition, device=device)
-
            # Skip transitions with NaN values
            if check_nan_in_transition(
                observations=transition["state"],
@@ -1163,7 +977,7 @@ def process_transitions(

            # Add to offline buffer if it's an intervention
            if dataset_repo_id is not None and transition.get("complementary_info", {}).get(
-                TeleopEvents.IS_INTERVENTION
+                TeleopEvents.IS_INTERVENTION.value
            ):
                offline_replay_buffer.add(**transition)

@@ -1172,7 +986,7 @@ def process_interaction_messages(
    interaction_message_queue: Queue,
    interaction_step_shift: int,
    wandb_logger: WandBLogger | None,
-    shutdown_event: any,
+    shutdown_event: Any,  # Event
 ) -> dict | None:
    """Process all available interaction messages from the queue.

@@ -18,17 +18,32 @@
 import logging
 import time
 from multiprocessing import Event, Queue
+from typing import TYPE_CHECKING

-from lerobot.transport import services_pb2, services_pb2_grpc
-from lerobot.transport.utils import receive_bytes_in_chunks, send_bytes_in_chunks
+from lerobot.utils.import_utils import _grpc_available

 from .queue import get_last_item_from_queue

+if TYPE_CHECKING or _grpc_available:
+    import grpc
+
+    from lerobot.transport import services_pb2, services_pb2_grpc
+    from lerobot.transport.utils import receive_bytes_in_chunks, send_bytes_in_chunks
+
+    _ServicerBase = services_pb2_grpc.LearnerServiceServicer
+else:
+    grpc = None
+    services_pb2 = None
+    services_pb2_grpc = None
+    receive_bytes_in_chunks = None
+    send_bytes_in_chunks = None
+    _ServicerBase = object
+
 MAX_WORKERS = 3  # Stream parameters, send transitions and interactions
 SHUTDOWN_TIMEOUT = 10


-class LearnerService(services_pb2_grpc.LearnerServiceServicer):
+class LearnerService(_ServicerBase):
    """
    Implementation of the LearnerService gRPC service
    This service is used to send parameters to the Actor and receive transitions and interactions from the Actor
@@ -51,7 +66,9 @@ class LearnerService(services_pb2_grpc.LearnerServiceServicer):
        self.interaction_message_queue = interaction_message_queue
        self.queue_get_timeout = queue_get_timeout

-    def StreamParameters(self, request, context):  # noqa: N802
+    def StreamParameters(  # noqa: N802
+        self, request: "services_pb2.Empty", context: "grpc.ServicerContext"
+    ):
        # TODO: authorize the request
        logging.info("[LEARNER] Received request to stream parameters from the Actor")

@@ -86,7 +103,7 @@ class LearnerService(services_pb2_grpc.LearnerServiceServicer):
        logging.info("[LEARNER] Stream parameters finished")
        return services_pb2.Empty()

-    def SendTransitions(self, request_iterator, _context):  # noqa: N802
+    def SendTransitions(self, request_iterator, _context: "grpc.ServicerContext"):  # noqa: N802
        # TODO: authorize the request
        logging.info("[LEARNER] Received request to receive transitions from the Actor")

@@ -100,7 +117,7 @@ class LearnerService(services_pb2_grpc.LearnerServiceServicer):
        logging.debug("[LEARNER] Finished receiving transitions")
        return services_pb2.Empty()

-    def SendInteractions(self, request_iterator, _context):  # noqa: N802
+    def SendInteractions(self, request_iterator, _context: "grpc.ServicerContext"):  # noqa: N802
        # TODO: authorize the request
        logging.info("[LEARNER] Received request to receive interactions from the Actor")

@@ -114,5 +131,5 @@ class LearnerService(services_pb2_grpc.LearnerServiceServicer):
        logging.debug("[LEARNER] Finished receiving interactions")
        return services_pb2.Empty()

-    def Ready(self, request, context):  # noqa: N802
+    def Ready(self, request: "services_pb2.Empty", context: "grpc.ServicerContext"):  # noqa: N802
        return services_pb2.Empty()
@@ -0,0 +1,50 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Top-level pipeline config for distributed RL training (actor / learner)."""
+
+from __future__ import annotations
+
+from dataclasses import dataclass
+
+from lerobot.configs.default import DatasetConfig
+from lerobot.configs.train import TrainPipelineConfig
+
+from .algorithms.configs import RLAlgorithmConfig
+from .algorithms.factory import make_algorithm_config
+from .algorithms.sac import SACAlgorithmConfig  # noqa: F401
+
+
+@dataclass(kw_only=True)
+class TrainRLServerPipelineConfig(TrainPipelineConfig):
+    # NOTE: In RL, we don't need an offline dataset
+    # TODO: Make `TrainPipelineConfig.dataset` optional
+    dataset: DatasetConfig | None = None  # type: ignore[assignment] # because the parent class has made its type non-optional
+
+    # Algorithm config.
+    algorithm: RLAlgorithmConfig | None = None
+
+    # Data mixer strategy name. Currently supports "online_offline".
+    mixer: str = "online_offline"
+    # Fraction sampled from online replay when using OnlineOfflineMixer.
+    online_ratio: float = 0.5
+
+    def validate(self) -> None:
+        super().validate()
+
+        if self.algorithm is None:
+            self.algorithm = make_algorithm_config("sac")
+
+        if getattr(self.algorithm, "policy_config", None) is None:
+            self.algorithm.policy_config = self.policy
@@ -0,0 +1,101 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import annotations
+
+from collections.abc import Iterator
+from typing import Any
+
+from lerobot.types import BatchType
+
+from .algorithms.base import RLAlgorithm
+from .algorithms.configs import TrainingStats
+from .data_sources.data_mixer import DataMixer
+
+
+class RLTrainer:
+    """Unified training step orchestrator.
+
+    Holds the algorithm, a DataMixer, and an optional preprocessor.
+    """
+
+    def __init__(
+        self,
+        algorithm: RLAlgorithm,
+        data_mixer: DataMixer,
+        batch_size: int,
+        *,
+        preprocessor: Any | None = None,
+    ):
+        self.algorithm = algorithm
+        self.data_mixer = data_mixer
+        self.batch_size = batch_size
+        self._preprocessor = preprocessor
+
+        self._iterator: Iterator[BatchType] | None = None
+
+        self.algorithm.make_optimizers_and_scheduler()
+
+    def _build_data_iterator(self) -> Iterator[BatchType]:
+        """Create a fresh algorithm-configured iterator (optionally preprocessed)."""
+        raw = self.algorithm.configure_data_iterator(
+            data_mixer=self.data_mixer,
+            batch_size=self.batch_size,
+        )
+        if self._preprocessor is not None:
+            return _PreprocessedIterator(raw, self._preprocessor)
+        return raw
+
+    def reset_data_iterator(self) -> None:
+        """Discard the current iterator so it will be rebuilt lazily next step."""
+        self._iterator = None
+
+    def set_data_mixer(self, data_mixer: DataMixer, *, reset: bool = True) -> None:
+        """Swap the active data mixer, optionally resetting the iterator."""
+        self.data_mixer = data_mixer
+        if reset:
+            self.reset_data_iterator()
+
+    def training_step(self) -> TrainingStats:
+        """Run one training step (algorithm-agnostic)."""
+        if self._iterator is None:
+            self._iterator = self._build_data_iterator()
+        return self.algorithm.update(self._iterator)
+
+
+def preprocess_rl_batch(preprocessor: Any, batch: BatchType) -> BatchType:
+    """Apply policy preprocessing to RL observations only."""
+    observations = batch["state"]
+    next_observations = batch["next_state"]
+    batch["state"] = preprocessor.process_observation(observations)
+    batch["next_state"] = preprocessor.process_observation(next_observations)
+
+    return batch
+
+
+class _PreprocessedIterator:
+    """Iterator wrapper that preprocesses each sampled RL batch."""
+
+    __slots__ = ("_raw", "_preprocessor")
+
+    def __init__(self, raw_iterator: Iterator[BatchType], preprocessor: Any) -> None:
+        self._raw = raw_iterator
+        self._preprocessor = preprocessor
+
+    def __iter__(self) -> _PreprocessedIterator:
+        return self
+
+    def __next__(self) -> BatchType:
+        batch = next(self._raw)
+        return preprocess_rl_batch(self._preprocessor, batch)
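The wrapper pattern behind `_PreprocessedIterator` can be sketched in isolation. This is a minimal illustration under stated assumptions, not the LeRobot implementation: `MappedIterator` and `scale_states` are hypothetical names, and plain dicts of floats stand in for `BatchType`.

```python
from collections.abc import Callable, Iterator
from typing import Any

Batch = dict[str, Any]  # stand-in for lerobot.types.BatchType (assumption)


class MappedIterator:
    """Hypothetical wrapper: applies `fn` to every batch pulled from `raw`."""

    __slots__ = ("_raw", "_fn")

    def __init__(self, raw: Iterator[Batch], fn: Callable[[Batch], Batch]) -> None:
        self._raw = raw
        self._fn = fn

    def __iter__(self) -> "MappedIterator":
        return self

    def __next__(self) -> Batch:
        # StopIteration raised by the wrapped iterator propagates unchanged.
        return self._fn(next(self._raw))


def scale_states(batch: Batch) -> Batch:
    """Toy preprocessing that touches only the observation keys."""
    batch["state"] = batch["state"] * 2
    batch["next_state"] = batch["next_state"] * 2
    return batch


raw_batches = iter([{"state": 1.0, "next_state": 2.0, "reward": 0.5}])
processed = list(MappedIterator(raw_batches, scale_states))
```

As in `preprocess_rl_batch`, the toy function mutates the batch it receives and leaves non-observation keys such as `reward` untouched.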
@@ -353,7 +353,8 @@ class GripperVelocityToJoint(RobotActionProcessorStep):
        speed_factor: A scaling factor to convert the normalized velocity command to a position change.
        clip_min: The minimum allowed gripper joint position.
        clip_max: The maximum allowed gripper joint position.
-        discrete_gripper: If True, treat the input action as discrete (0: open, 1: close, 2: stay).
+        discrete_gripper: If True, interpret the input as a discrete class index
+            {0 = close, 1 = stay, 2 = open}, matching `GamepadTeleop.GripperAction`.
    """

    speed_factor: float = 20.0
@@ -377,10 +378,10 @@ class GripperVelocityToJoint(RobotActionProcessorStep):
            raise ValueError("Joints observation is required for computing robot kinematics")

        if self.discrete_gripper:
-            # Discrete gripper actions are in [0, 1, 2]
-            # 0: open, 1: close, 2: stay
-            # We need to shift them to [-1, 0, 1] and then scale them to clip_max
-            gripper_vel = (gripper_vel - 1) * self.clip_max
+            # Map discrete command {0=close, 1=stay, 2=open} -> signed velocity.
+            # Negation accounts for SO100 sign (joint position increases on close).
+            #   0 -> +clip_max (close), 1 -> 0 (stay), 2 -> -clip_max (open)
+            gripper_vel = -(gripper_vel - 1) * self.clip_max

        # Compute desired gripper position
        delta = gripper_vel * float(self.speed_factor)
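The sign convention introduced in this hunk can be checked with a small standalone sketch. The function name `discrete_gripper_to_velocity`, the input validation, and the value `clip_max=30.0` are assumptions made for illustration; only the expression `-(command - 1) * clip_max` comes from the diff.

```python
def discrete_gripper_to_velocity(command: int, clip_max: float) -> float:
    """Map a discrete command {0=close, 1=stay, 2=open} to a signed velocity.

    Hypothetical helper: the range check is an addition for this sketch.
    """
    if command not in (0, 1, 2):
        raise ValueError(f"expected a command in {{0, 1, 2}}, got {command}")
    # Shift to {-1, 0, 1}, then negate so that "close" yields a positive value.
    return -(command - 1) * clip_max


velocities = {c: discrete_gripper_to_velocity(c, clip_max=30.0) for c in (0, 1, 2)}
```

With this convention, command `0` (close) gives `+clip_max`, `1` (stay) gives `0`, and `2` (open) gives `-clip_max`, which matches the comment in the hunk.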
@@ -0,0 +1,151 @@
+#!/usr/bin/env python
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Re-save a Robometer checkpoint in LeRobot HF format.
+
+LeRobot's reward model format is ``config.json`` (a draccus-encoded
+:class:`~lerobot.rewards.robometer.RobometerConfig`) plus a single
+``model.safetensors`` containing the merged base + heads weights. The
+released checkpoint at ``lilkm/robometer-4b`` already follows this layout;
+this script is for converting other Robometer variants (e.g. a future
+upstream release or a local training run) into the same format.
+
+Example:
+
+.. code-block:: shell
+
+   lerobot-export-robometer \\
+       --src robometer/Robometer-4B \\
+       --dst ./robometer-4b-lerobot
+"""
+
+from __future__ import annotations
+
+import argparse
+import logging
+from pathlib import Path
+
+from lerobot.rewards.robometer import RobometerConfig, RobometerRewardModel
+from lerobot.rewards.robometer._upstream_loader import apply_upstream_checkpoint
+from lerobot.utils.utils import init_logging
+
+
+def export_robometer_to_lerobot(
+    src: str,
+    dst: str | Path,
+    *,
+    device: str = "cpu",
+    dataset_repo_id: str = "",
+    write_model_card: bool = True,
+) -> Path:
+    """Load Robometer from ``src`` and re-save it under ``dst`` in LeRobot HF format.
+
+    Produces ``config.json``, ``model.safetensors``, and (optionally) ``README.md``.
+
+    Args:
+        src: Upstream source. Hugging Face repo id (``"robometer/Robometer-4B"``,
+            optionally ``"...@revision"``) or a local snapshot directory.
+        dst: Output directory. ``config.json`` and ``model.safetensors`` are
+            written here.
+        device: Where to place the model during loading. Defaults to CPU; use
+            ``"cuda"`` if you want to verify on GPU before saving.
+        dataset_repo_id: Hugging Face dataset id the model was trained on
+            (e.g. ``"robometer/RBM-1M"``). Written into the model card's
+            ``datasets:`` metadata. Leave empty if not applicable.
+        write_model_card: Generate a ``README.md`` using LeRobot's reward
+            model card template. Disable if you want to write the README
+            yourself.
+
+    Returns:
+        The resolved output directory.
+    """
+    # A fresh ``RobometerConfig`` has ``vlm_config=None``, which routes
+    # ``__init__`` through the upstream-matching path: download base Qwen,
+    # resize embeddings per ``ROBOMETER_SPECIAL_TOKENS``. ``apply_upstream_checkpoint``
+    # then resizes again (if needed) to match the upstream checkpoint's vocab
+    # and overlays its weights. ``_save_pretrained`` snapshots the resulting
+    # post-resize architecture into ``vlm_config`` for fast future loads.
+    cfg = RobometerConfig(pretrained_path=src, device=device)
+    model = RobometerRewardModel(cfg)
+    apply_upstream_checkpoint(model, src)
+    model.to(device)
+    model.eval()
+
+    dst = Path(dst)
+    dst.mkdir(parents=True, exist_ok=True)
+    model.save_pretrained(str(dst))
+
+    if write_model_card:
+        card = model.generate_model_card(
+            dataset_repo_id=dataset_repo_id,
+            model_type=model.config.type,
+            license=model.config.license,
+            tags=model.config.tags,
+        )
+        card.save(str(dst / "README.md"))
+
+    return dst
+
+
+def _parse_args() -> argparse.Namespace:
+    parser = argparse.ArgumentParser(
+        description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter
+    )
+    parser.add_argument(
+        "--src",
+        default="robometer/Robometer-4B",
+        help="Upstream Robometer source (HF repo id or local directory).",
+    )
+    parser.add_argument(
+        "--dst",
+        required=True,
+        help="Output directory for the LeRobot-format checkpoint.",
+    )
+    parser.add_argument(
+        "--device",
+        default="cpu",
+        help="Torch device to load the model on (default: cpu). Conversion only "
+        "needs CPU; use cuda if you also want to smoke-test inference.",
+    )
+    parser.add_argument(
+        "--dataset",
+        default="",
+        help="Optional Hugging Face dataset id used for training "
+        "(e.g. `robometer/RBM-1M`). Written into the auto-generated model card's "
+        "`datasets:` metadata.",
+    )
+    parser.add_argument(
+        "--no-readme",
+        action="store_true",
+        help="Skip writing README.md. Use if you want to author the model card by hand.",
+    )
+    return parser.parse_args()
+
+
+def main() -> None:
+    init_logging()
+    args = _parse_args()
+    out = export_robometer_to_lerobot(
+        src=args.src,
+        dst=args.dst,
+        device=args.device,
+        dataset_repo_id=args.dataset,
+        write_model_card=not args.no_readme,
+    )
+    logging.info("Saved LeRobot-format Robometer checkpoint to %s", out)
+
+
+if __name__ == "__main__":
+    main()
@@ -92,6 +92,7 @@ def get_sys_info() -> dict[str, str]:
    info.update(
        {
            "PyTorch version": torch_version,
+            "Torchcodec version": get_package_version("torchcodec"),
            "Is PyTorch built with CUDA support?": str(torch_cuda_available),
            "Cuda version": cuda_version,
            "GPU model": gpu_model,
@@ -104,11 +104,14 @@ class KeyboardTeleop(Teleoperator):

    def _on_press(self, key):
        if hasattr(key, "char"):
-            self.event_queue.put((key.char, True))
+            key = key.char
+        self.event_queue.put((key, True))

    def _on_release(self, key):
        if hasattr(key, "char"):
-            self.event_queue.put((key.char, False))
+            key = key.char
+        self.event_queue.put((key, False))
+
        if key == keyboard.Key.esc:
            logging.info("ESC pressed, disconnecting.")
            self.disconnect()
@@ -204,8 +207,6 @@ class KeyboardEndEffectorTeleop(KeyboardTeleop):
                # this is useful for retrieving other events like interventions for RL, episode success, etc.
                self.misc_keys_queue.put(key)

-        self.current_pressed.clear()
-
        action_dict = {
            "delta_x": delta_x,
            "delta_y": delta_y,
@@ -256,6 +257,8 @@ class KeyboardEndEffectorTeleop(KeyboardTeleop):
        ]
        is_intervention = any(self.current_pressed.get(key, False) for key in movement_keys)

+        self.current_pressed.clear()
+
        # Check for episode control commands from misc_keys_queue
        terminate_episode = False
        success = False
@@ -39,10 +39,8 @@ For more details, see the [Physical Intelligence π₀ blog post](https://www.ph
 π₀.₅ represents a significant evolution from π₀, developed by Physical Intelligence to address a big challenge in robotics: open-world generalization. While robots can perform impressive tasks in controlled environments, π₀.₅ is designed to generalize to entirely new environments and situations that were never seen during training.

 For more details, see the [Physical Intelligence π₀.₅ blog post](https://www.physicalintelligence.company/blog/pi05).
-{% elif model_name == "sac" %}
-[Soft Actor-Critic (SAC)](https://huggingface.co/papers/1801.01290) is an entropy-regularised actor-critic algorithm offering stable, sample-efficient learning in continuous-control environments.
-{% elif model_name == "reward_classifier" %}
-A reward classifier is a lightweight neural network that scores observations or trajectories for task success, providing a learned reward signal or offline evaluation when explicit rewards are unavailable.
+{% elif model_name == "gaussian_actor" %}
+This is a Gaussian Actor policy (Gaussian policy with a tanh squash) — the policy-side component used by [Soft Actor-Critic (SAC)](https://huggingface.co/papers/1801.01290) and related maximum-entropy continuous-control algorithms.
 {% else %}
 _Model type not recognized — please update this template._
 {% endif %}
@@ -13,6 +13,8 @@
 A reward classifier is a lightweight neural network that scores observations or trajectories for task success, providing a learned reward signal or offline evaluation when explicit rewards are unavailable.
 {% elif model_name == "sarm" %}
 A Success-Aware Reward Model (SARM) predicts a dense reward signal from observations, typically used downstream for reinforcement learning or human-in-the-loop fine-tuning when task success is not directly observable.
+{% elif model_name == "robometer" %}
+Robometer is a zero-shot general-purpose robotic reward model built on a fine-tuned Qwen3-VL backbone with progress, preference, and success heads. Given a video and a task description it outputs a per-frame progress signal in [0, 1] and a per-frame success probability — suitable for offline reward labelling and for low-frequency reward signals during RL fine-tuning of robot policies.
 {% else %}
 _Reward model type not recognized — please update this template._
 {% endif %}
@@ -40,6 +40,7 @@ PolicyAction = torch.Tensor
 RobotAction = dict[str, Any]
 EnvAction = np.ndarray
 RobotObservation = dict[str, Any]
+BatchType = dict[str, Any]


 EnvTransition = TypedDict(
@@ -47,6 +47,7 @@ CHECKPOINTS_DIR = "checkpoints"
 LAST_CHECKPOINT_LINK = "last"
 PRETRAINED_MODEL_DIR = "pretrained_model"
 TRAINING_STATE_DIR = "training_state"
+ALGORITHM_DIR = "algorithm"
 RNG_STATE = "rng_state.safetensors"
 TRAINING_STEP = "training_step.json"
 OPTIMIZER_STATE = "optimizer_state.safetensors"
@@ -132,6 +132,7 @@ _faker_available = is_package_available("faker")
 _pynput_available = is_package_available("pynput")
 _pygame_available = is_package_available("pygame")
 _qwen_vl_utils_available = is_package_available("qwen-vl-utils", import_name="qwen_vl_utils")
+_grpc_available = is_package_available("grpcio", import_name="grpc")
 _wallx_deps_available = (
    _transformers_available and _peft_available and _torchdiffeq_available and _qwen_vl_utils_available
 )
@@ -1691,3 +1691,68 @@ def test_delta_timestamps_query_returns_correct_values(tmp_path, empty_lerobot_d
    # Previous frame is outside episode, so it's clamped to first frame and marked as padded
    assert state_values == [10.0, 10.0], f"Expected [10.0, 10.0], got {state_values}"
    assert is_pad == [True, False], f"Expected [True, False], got {is_pad}"
+
+
+def test_episode_filter_filters_dataset(tmp_path, lerobot_dataset_factory):
+    """episode_filter on LeRobotDataset narrows the loaded dataset to matching episodes."""
+    dataset = lerobot_dataset_factory(root=tmp_path / "test", total_episodes=8, total_frames=200)
+    lengths = dataset.meta.episodes["length"]
+    threshold = sorted(lengths)[len(lengths) // 2]
+    expected_eps = [i for i, length in enumerate(lengths) if length >= threshold]
+    expected_frames = sum(lengths[i] for i in expected_eps)
+
+    filtered = LeRobotDataset(
+        dataset.repo_id,
+        root=dataset.root,
+        episode_filter=lambda ep: ep["length"] >= threshold,
+    )
+
+    assert filtered.num_episodes == len(expected_eps)
+    assert filtered.num_frames == expected_frames
+    seen_eps = {filtered[i]["episode_index"].item() for i in range(len(filtered))}
+    assert seen_eps == set(expected_eps)
+
+
+def test_episode_filter_intersects_with_episodes(tmp_path, lerobot_dataset_factory):
+    """When both episodes and episode_filter are given to LeRobotDataset, the result is their intersection."""
+    dataset = lerobot_dataset_factory(root=tmp_path / "test", total_episodes=8, total_frames=200)
+    lengths = dataset.meta.episodes["length"]
+    candidates = [0, 2, 4, 6]
+    candidate_lengths = [lengths[i] for i in candidates]
+    threshold = sorted(candidate_lengths)[len(candidate_lengths) // 2]
+    expected_eps = [i for i in candidates if lengths[i] >= threshold]
+
+    filtered = LeRobotDataset(
+        dataset.repo_id,
+        root=dataset.root,
+        episodes=candidates,
+        episode_filter=lambda ep: ep["length"] >= threshold,
+    )
+
+    assert filtered.num_episodes == len(expected_eps)
+    seen_eps = {filtered[i]["episode_index"].item() for i in range(len(filtered))}
+    assert seen_eps == set(expected_eps)
+
+
+def test_episode_filter_no_match_raises(tmp_path, lerobot_dataset_factory):
+    """An empty match in LeRobotDataset's episode_filter raises a ValueError rather than silently returning an empty dataset."""
+    dataset = lerobot_dataset_factory(root=tmp_path / "test", total_episodes=4, total_frames=100)
+
+    with pytest.raises(ValueError, match=r"The episode filter did not match any episode"):
+        LeRobotDataset(
+            dataset.repo_id,
+            root=dataset.root,
+            episode_filter=lambda ep: ep["length"] < 0,
+        )
+
+
+def test_episode_filter_unknown_key_raises(tmp_path, lerobot_dataset_factory):
+    """A predicate referencing a column absent from meta.episodes surfaces a clear KeyError."""
+    dataset = lerobot_dataset_factory(root=tmp_path / "test", total_episodes=4, total_frames=100)
+
+    with pytest.raises(KeyError, match="not_a_real_field"):
+        LeRobotDataset(
+            dataset.repo_id,
+            root=dataset.root,
+            episode_filter=lambda ep: ep["not_a_real_field"] > 0,
+        )
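The threshold used in the first two filter tests is the upper median of the episode lengths, which guarantees that at least one episode satisfies `length >= threshold`. A sketch with made-up lengths (the values below are illustrative, not what `lerobot_dataset_factory` produces):

```python
# Hypothetical episode lengths for 8 episodes, summing to 200 frames.
lengths = [10, 40, 20, 30, 25, 15, 35, 25]

# Same selection rule as the tests: element at index n // 2 of the sorted list.
threshold = sorted(lengths)[len(lengths) // 2]

# Episodes that a predicate `ep["length"] >= threshold` would keep.
expected_eps = [i for i, length in enumerate(lengths) if length >= threshold]
expected_frames = sum(lengths[i] for i in expected_eps)
```

Because the threshold is itself one of the lengths, the filtered set is never empty, so these tests cannot accidentally trigger the "did not match any episode" error path covered separately.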
@@ -17,19 +17,19 @@
 import pytest

 from lerobot.configs.types import FeatureType, NormalizationMode, PolicyFeature
-from lerobot.policies.sac.configuration_sac import (
+from lerobot.policies.gaussian_actor.configuration_gaussian_actor import (
    ActorLearnerConfig,
    ActorNetworkConfig,
    ConcurrencyConfig,
    CriticNetworkConfig,
+    GaussianActorConfig,
    PolicyConfig,
-    SACConfig,
 )
 from lerobot.utils.constants import ACTION, OBS_IMAGE, OBS_STATE


-def test_sac_config_default_initialization():
-    config = SACConfig()
+def test_gaussian_actor_config_default_initialization():
+    config = GaussianActorConfig()

    assert config.normalization_mapping == {
        "VISUAL": NormalizationMode.MEAN_STD,
@@ -55,9 +55,6 @@ def test_sac_config_default_initialization():
    # Basic parameters
    assert config.device == "cpu"
    assert config.storage_device == "cpu"
-    assert config.discount == 0.99
-    assert config.temperature_init == 1.0
-    assert config.num_critics == 2

    # Architecture specifics
    assert config.vision_encoder_name is None
@@ -66,6 +63,8 @@ def test_sac_config_default_initialization():
    assert config.shared_encoder is True
    assert config.num_discrete_actions is None
    assert config.image_embedding_pooling_dim == 8
+    assert config.state_encoder_hidden_dim == 256
+    assert config.latent_dim == 256

    # Training parameters
    assert config.online_steps == 1000000
@@ -73,20 +72,6 @@ def test_sac_config_default_initialization():
    assert config.offline_buffer_capacity == 100000
    assert config.async_prefetch is False
    assert config.online_step_before_learning == 100
-    assert config.policy_update_freq == 1
-
-    # SAC algorithm parameters
-    assert config.num_subsample_critics is None
-    assert config.critic_lr == 3e-4
-    assert config.actor_lr == 3e-4
-    assert config.temperature_lr == 3e-4
-    assert config.critic_target_update_weight == 0.005
-    assert config.utd_ratio == 1
-    assert config.state_encoder_hidden_dim == 256
-    assert config.latent_dim == 256
-    assert config.target_entropy is None
-    assert config.use_backup_entropy is True
-    assert config.grad_clip_norm == 40.0

    # Dataset stats defaults
    expected_dataset_stats = {
@@ -105,11 +90,6 @@ def test_sac_config_default_initialization():
    }
    assert config.dataset_stats == expected_dataset_stats

-    # Critic network configuration
-    assert config.critic_network_kwargs.hidden_dims == [256, 256]
-    assert config.critic_network_kwargs.activate_final is True
-    assert config.critic_network_kwargs.final_activation is None
-
    # Actor network configuration
    assert config.actor_network_kwargs.hidden_dims == [256, 256]
    assert config.actor_network_kwargs.activate_final is True
@@ -135,7 +115,6 @@ def test_sac_config_default_initialization():
    assert config.concurrency.learner == "threads"

    assert isinstance(config.actor_network_kwargs, ActorNetworkConfig)
-    assert isinstance(config.critic_network_kwargs, CriticNetworkConfig)
    assert isinstance(config.policy_kwargs, PolicyConfig)
    assert isinstance(config.actor_learner_config, ActorLearnerConfig)
    assert isinstance(config.concurrency, ConcurrencyConfig)
@@ -175,22 +154,22 @@ def test_concurrency_config():
    assert config.learner == "threads"


-def test_sac_config_custom_initialization():
-    config = SACConfig(
+def test_gaussian_actor_config_custom_initialization():
+    config = GaussianActorConfig(
        device="cpu",
-        discount=0.95,
-        temperature_init=0.5,
-        num_critics=3,
+        latent_dim=128,
+        state_encoder_hidden_dim=128,
+        num_discrete_actions=3,
    )

    assert config.device == "cpu"
-    assert config.discount == 0.95
-    assert config.temperature_init == 0.5
-    assert config.num_critics == 3
+    assert config.latent_dim == 128
+    assert config.state_encoder_hidden_dim == 128
+    assert config.num_discrete_actions == 3


 def test_validate_features():
-    config = SACConfig(
+    config = GaussianActorConfig(
        input_features={OBS_STATE: PolicyFeature(type=FeatureType.STATE, shape=(10,))},
        output_features={ACTION: PolicyFeature(type=FeatureType.ACTION, shape=(3,))},
    )
@@ -198,7 +177,7 @@ def test_validate_features():


 def test_validate_features_missing_observation():
-    config = SACConfig(
+    config = GaussianActorConfig(
        input_features={"wrong_key": PolicyFeature(type=FeatureType.STATE, shape=(10,))},
        output_features={ACTION: PolicyFeature(type=FeatureType.ACTION, shape=(3,))},
    )
@@ -209,7 +188,7 @@ def test_validate_features_missing_observation():


 def test_validate_features_missing_action():
-    config = SACConfig(
+    config = GaussianActorConfig(
        input_features={OBS_STATE: PolicyFeature(type=FeatureType.STATE, shape=(10,))},
        output_features={"wrong_key": PolicyFeature(type=FeatureType.ACTION, shape=(3,))},
    )
@@ -0,0 +1,528 @@
+#!/usr/bin/env python
+
+# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import pytest
+
+pytest.importorskip("datasets", reason="datasets is required (install lerobot[dataset])")
+
+import torch  # noqa: E402
+from torch import Tensor, nn  # noqa: E402
+
+from lerobot.configs.types import FeatureType, PolicyFeature  # noqa: E402
+from lerobot.policies.gaussian_actor.configuration_gaussian_actor import GaussianActorConfig  # noqa: E402
+from lerobot.policies.gaussian_actor.modeling_gaussian_actor import MLP, GaussianActorPolicy  # noqa: E402
+from lerobot.rl.algorithms.sac import SACAlgorithm, SACAlgorithmConfig  # noqa: E402
+from lerobot.utils.constants import ACTION, OBS_IMAGE, OBS_STATE  # noqa: E402
+from lerobot.utils.random_utils import seeded_context, set_seed  # noqa: E402
+
+try:
+    import transformers  # noqa: F401
+
+    TRANSFORMERS_AVAILABLE = True
+except ImportError:
+    TRANSFORMERS_AVAILABLE = False
+
+
+@pytest.fixture(autouse=True)
+def set_random_seed():
+    seed = 42
+    set_seed(seed)
+
+
+def test_mlp_with_default_args():
+    mlp = MLP(input_dim=10, hidden_dims=[256, 256])
+
+    x = torch.randn(10)
+    y = mlp(x)
+    assert y.shape == (256,)
+
+
+def test_mlp_with_batch_dim():
+    mlp = MLP(input_dim=10, hidden_dims=[256, 256])
+    x = torch.randn(2, 10)
+    y = mlp(x)
+    assert y.shape == (2, 256)
+
+
+def test_forward_with_empty_hidden_dims():
+    mlp = MLP(input_dim=10, hidden_dims=[])
+    x = torch.randn(1, 10)
+    assert mlp(x).shape == (1, 10)
+
+
+def test_mlp_with_dropout():
+    mlp = MLP(input_dim=10, hidden_dims=[256, 256, 11], dropout_rate=0.1)
+    x = torch.randn(1, 10)
+    y = mlp(x)
+    assert y.shape == (1, 11)
+
+    drop_out_layers_count = sum(isinstance(layer, nn.Dropout) for layer in mlp.net)
+    assert drop_out_layers_count == 2
+
+
+def test_mlp_with_custom_final_activation():
+    mlp = MLP(input_dim=10, hidden_dims=[256, 256], final_activation=torch.nn.Tanh())
+    x = torch.randn(1, 10)
+    y = mlp(x)
+    assert y.shape == (1, 256)
+    assert (y >= -1).all() and (y <= 1).all()
+
+
+def test_gaussian_actor_policy_with_default_args():
+    with pytest.raises(ValueError, match="should be an instance of class `PreTrainedConfig`"):
+        GaussianActorPolicy()
+
+
+def create_dummy_state(batch_size: int, state_dim: int = 10) -> dict[str, Tensor]:
+    return {
+        OBS_STATE: torch.randn(batch_size, state_dim),
+    }
+
+
+def create_dummy_with_visual_input(batch_size: int, state_dim: int = 10) -> dict[str, Tensor]:
+    return {
+        OBS_IMAGE: torch.randn(batch_size, 3, 84, 84),
+        OBS_STATE: torch.randn(batch_size, state_dim),
+    }
+
+
+def create_dummy_action(batch_size: int, action_dim: int = 10) -> Tensor:
+    return torch.randn(batch_size, action_dim)
+
+
+def create_default_train_batch(
+    batch_size: int = 8, state_dim: int = 10, action_dim: int = 10
+) -> dict[str, Tensor]:
+    return {
+        ACTION: create_dummy_action(batch_size, action_dim),
+        "reward": torch.randn(batch_size),
+        "state": create_dummy_state(batch_size, state_dim),
+        "next_state": create_dummy_state(batch_size, state_dim),
+        "done": torch.randn(batch_size),
+    }
+
+
+def create_train_batch_with_visual_input(
+    batch_size: int = 8, state_dim: int = 10, action_dim: int = 10
+) -> dict[str, Tensor]:
+    return {
+        ACTION: create_dummy_action(batch_size, action_dim),
+        "reward": torch.randn(batch_size),
+        "state": create_dummy_with_visual_input(batch_size, state_dim),
+        "next_state": create_dummy_with_visual_input(batch_size, state_dim),
+        "done": torch.randn(batch_size),
+    }
+
+
+def create_observation_batch(batch_size: int = 8, state_dim: int = 10) -> dict[str, Tensor]:
+    return {
+        OBS_STATE: torch.randn(batch_size, state_dim),
+    }
+
+
+def create_observation_batch_with_visual_input(batch_size: int = 8, state_dim: int = 10) -> dict[str, Tensor]:
+    return {
+        OBS_STATE: torch.randn(batch_size, state_dim),
+        OBS_IMAGE: torch.randn(batch_size, 3, 84, 84),
+    }
+
+
+def create_default_config(
+    state_dim: int, continuous_action_dim: int, has_discrete_action: bool = False
+) -> GaussianActorConfig:
+    action_dim = continuous_action_dim
+    if has_discrete_action:
+        action_dim += 1
+
+    config = GaussianActorConfig(
+        input_features={OBS_STATE: PolicyFeature(type=FeatureType.STATE, shape=(state_dim,))},
+        output_features={ACTION: PolicyFeature(type=FeatureType.ACTION, shape=(continuous_action_dim,))},
+        dataset_stats={
+            OBS_STATE: {
+                "min": [0.0] * state_dim,
+                "max": [1.0] * state_dim,
+            },
+            ACTION: {
+                "min": [0.0] * continuous_action_dim,
+                "max": [1.0] * continuous_action_dim,
+            },
+        },
+    )
+    config.validate_features()
+    return config
+
+
+def create_config_with_visual_input(
+    state_dim: int, continuous_action_dim: int, has_discrete_action: bool = False
+) -> GaussianActorConfig:
+    config = create_default_config(
+        state_dim=state_dim,
+        continuous_action_dim=continuous_action_dim,
+        has_discrete_action=has_discrete_action,
+    )
+    config.input_features[OBS_IMAGE] = PolicyFeature(type=FeatureType.VISUAL, shape=(3, 84, 84))
+    config.dataset_stats[OBS_IMAGE] = {
+        "mean": torch.randn(3, 1, 1),
+        "std": torch.randn(3, 1, 1),
+    }
+
+    config.state_encoder_hidden_dim = 32
+    config.latent_dim = 32
+
+    config.validate_features()
+    return config
+
+
+def _make_algorithm(config: GaussianActorConfig) -> tuple[SACAlgorithm, GaussianActorPolicy]:
+    """Helper to create policy + algorithm pair for tests that need critics."""
+    policy = GaussianActorPolicy(config=config)
+    policy.train()
+    algo_config = SACAlgorithmConfig.from_policy_config(config)
+    algorithm = SACAlgorithm(policy=policy, config=algo_config)
+    algorithm.make_optimizers_and_scheduler()
+    return algorithm, policy
+
+
+@pytest.mark.parametrize("batch_size,state_dim,action_dim", [(2, 6, 6), (1, 10, 10)])
+def test_gaussian_actor_policy_select_action(batch_size: int, state_dim: int, action_dim: int):
+    config = create_default_config(state_dim=state_dim, continuous_action_dim=action_dim)
+    policy = GaussianActorPolicy(config=config)
+    policy.eval()
+
+    with torch.no_grad():
+        observation_batch = create_observation_batch(batch_size=batch_size, state_dim=state_dim)
+        selected_action = policy.select_action(observation_batch)
+        # squeeze(0) removes batch dim when batch_size==1
+        assert selected_action.shape[-1] == action_dim
+
+
+def test_gaussian_actor_policy_select_action_with_discrete():
+    """select_action should return continuous + discrete actions."""
+    config = create_default_config(state_dim=10, continuous_action_dim=6)
+    config.num_discrete_actions = 3
+    policy = GaussianActorPolicy(config=config)
+    policy.eval()
+
+    with torch.no_grad():
+        observation_batch = create_observation_batch(batch_size=1, state_dim=10)
+        # Squeeze to unbatched (single observation)
+        observation_batch = {k: v.squeeze(0) for k, v in observation_batch.items()}
+        selected_action = policy.select_action(observation_batch)
+        assert selected_action.shape[-1] == 7  # 6 continuous + 1 discrete
+
+
+@pytest.mark.parametrize("batch_size,state_dim,action_dim", [(2, 6, 6), (1, 10, 10)])
+def test_gaussian_actor_policy_forward(batch_size: int, state_dim: int, action_dim: int):
+    config = create_default_config(state_dim=state_dim, continuous_action_dim=action_dim)
+    policy = GaussianActorPolicy(config=config)
+    policy.eval()
+
+    batch = create_default_train_batch(batch_size=batch_size, action_dim=action_dim, state_dim=state_dim)
+    with torch.no_grad():
+        output = policy.forward(batch)
+        assert "action" in output
+        assert "log_prob" in output
+        assert "action_mean" in output
+        assert output["action"].shape == (batch_size, action_dim)
+
+
+@pytest.mark.parametrize("batch_size,state_dim,action_dim", [(2, 6, 6), (1, 10, 10)])
+def test_gaussian_actor_training_through_sac(batch_size: int, state_dim: int, action_dim: int):
+    config = create_default_config(state_dim=state_dim, continuous_action_dim=action_dim)
+    algorithm, policy = _make_algorithm(config)
+
+    batch = create_default_train_batch(batch_size=batch_size, action_dim=action_dim, state_dim=state_dim)
+    forward_batch = algorithm._prepare_forward_batch(batch)
+
+    critic_loss = algorithm._compute_loss_critic(forward_batch)
+    assert critic_loss.item() is not None
+    assert critic_loss.shape == ()
+    algorithm.optimizers["critic"].zero_grad()
+    critic_loss.backward()
+    algorithm.optimizers["critic"].step()
+
+    actor_loss = algorithm._compute_loss_actor(forward_batch)
+    assert actor_loss.item() is not None
+    assert actor_loss.shape == ()
+    algorithm.optimizers["actor"].zero_grad()
+    actor_loss.backward()
+    algorithm.optimizers["actor"].step()
+
+    temp_loss = algorithm._compute_loss_temperature(forward_batch)
+    assert temp_loss.item() is not None
+    assert temp_loss.shape == ()
+    algorithm.optimizers["temperature"].zero_grad()
+    temp_loss.backward()
+    algorithm.optimizers["temperature"].step()
+
+
+@pytest.mark.parametrize("batch_size,state_dim,action_dim", [(2, 6, 6), (1, 10, 10)])
+def test_gaussian_actor_training_with_visual_input(batch_size: int, state_dim: int, action_dim: int):
+    config = create_config_with_visual_input(state_dim=state_dim, continuous_action_dim=action_dim)
+    algorithm, policy = _make_algorithm(config)
+
+    batch = create_train_batch_with_visual_input(
+        batch_size=batch_size, state_dim=state_dim, action_dim=action_dim
+    )
+    forward_batch = algorithm._prepare_forward_batch(batch)
+
+    critic_loss = algorithm._compute_loss_critic(forward_batch)
+    assert critic_loss.item() is not None
+    assert critic_loss.shape == ()
+    algorithm.optimizers["critic"].zero_grad()
+    critic_loss.backward()
+    algorithm.optimizers["critic"].step()
+
+    actor_loss = algorithm._compute_loss_actor(forward_batch)
+    assert actor_loss.item() is not None
+    assert actor_loss.shape == ()
+    algorithm.optimizers["actor"].zero_grad()
+    actor_loss.backward()
+    algorithm.optimizers["actor"].step()
+
+    policy.eval()
+    with torch.no_grad():
+        observation_batch = create_observation_batch_with_visual_input(
+            batch_size=batch_size, state_dim=state_dim
+        )
+        selected_action = policy.select_action(observation_batch)
+        assert selected_action.shape[-1] == action_dim
+
+
+@pytest.mark.parametrize(
+    "batch_size,state_dim,action_dim,vision_encoder_name",
+    [(1, 6, 6, "lerobot/resnet10"), (1, 6, 6, "facebook/convnext-base-224")],
+)
+@pytest.mark.skipif(not TRANSFORMERS_AVAILABLE, reason="Transformers are not installed")
+def test_gaussian_actor_policy_with_pretrained_encoder(
+    batch_size: int, state_dim: int, action_dim: int, vision_encoder_name: str
+):
+    config = create_config_with_visual_input(state_dim=state_dim, continuous_action_dim=action_dim)
+    config.vision_encoder_name = vision_encoder_name
+    algorithm, policy = _make_algorithm(config)
+
+    batch = create_train_batch_with_visual_input(
+        batch_size=batch_size, state_dim=state_dim, action_dim=action_dim
+    )
+    forward_batch = algorithm._prepare_forward_batch(batch)
+
+    critic_loss = algorithm._compute_loss_critic(forward_batch)
+    assert critic_loss.item() is not None
+    assert critic_loss.shape == ()
+    algorithm.optimizers["critic"].zero_grad()
+    critic_loss.backward()
+    algorithm.optimizers["critic"].step()
+
+    actor_loss = algorithm._compute_loss_actor(forward_batch)
+    assert actor_loss.item() is not None
+    assert actor_loss.shape == ()
+
+
+def test_gaussian_actor_training_with_shared_encoder():
+    batch_size = 2
+    action_dim = 10
+    state_dim = 10
+    config = create_config_with_visual_input(state_dim=state_dim, continuous_action_dim=action_dim)
+    config.shared_encoder = True
+
+    algorithm, policy = _make_algorithm(config)
+
+    batch = create_train_batch_with_visual_input(
+        batch_size=batch_size, state_dim=state_dim, action_dim=action_dim
+    )
+    forward_batch = algorithm._prepare_forward_batch(batch)
+
+    critic_loss = algorithm._compute_loss_critic(forward_batch)
+    assert critic_loss.shape == ()
+    algorithm.optimizers["critic"].zero_grad()
+    critic_loss.backward()
+    algorithm.optimizers["critic"].step()
+
+    actor_loss = algorithm._compute_loss_actor(forward_batch)
+    assert actor_loss.shape == ()
+    algorithm.optimizers["actor"].zero_grad()
+    actor_loss.backward()
+    algorithm.optimizers["actor"].step()
+
+
+def test_gaussian_actor_training_with_discrete_critic():
+    batch_size = 2
+    continuous_action_dim = 9
+    full_action_dim = continuous_action_dim + 1
+    state_dim = 10
+    config = create_config_with_visual_input(
+        state_dim=state_dim, continuous_action_dim=continuous_action_dim, has_discrete_action=True
+    )
+    config.num_discrete_actions = 5
+
+    algorithm, policy = _make_algorithm(config)
+
+    batch = create_train_batch_with_visual_input(
+        batch_size=batch_size, state_dim=state_dim, action_dim=full_action_dim
+    )
+    forward_batch = algorithm._prepare_forward_batch(batch)
+
+    critic_loss = algorithm._compute_loss_critic(forward_batch)
+    assert critic_loss.shape == ()
+    algorithm.optimizers["critic"].zero_grad()
+    critic_loss.backward()
+    algorithm.optimizers["critic"].step()
+
+    discrete_critic_loss = algorithm._compute_loss_discrete_critic(forward_batch)
+    assert discrete_critic_loss.shape == ()
+    algorithm.optimizers["discrete_critic"].zero_grad()
+    discrete_critic_loss.backward()
+    algorithm.optimizers["discrete_critic"].step()
+
+    actor_loss = algorithm._compute_loss_actor(forward_batch)
+    assert actor_loss.shape == ()
+    algorithm.optimizers["actor"].zero_grad()
+    actor_loss.backward()
+    algorithm.optimizers["actor"].step()
+
+    policy.eval()
+    with torch.no_grad():
+        observation_batch = create_observation_batch_with_visual_input(
+            batch_size=batch_size, state_dim=state_dim
+        )
+        # Policy.select_action now handles both continuous + discrete
+        selected_action = policy.select_action({k: v.squeeze(0) for k, v in observation_batch.items()})
+        assert selected_action.shape[-1] == continuous_action_dim + 1
+
+
+def test_sac_algorithm_target_entropy():
+    """Target entropy is an SAC hyperparameter and lives on the algorithm."""
+    config = create_default_config(continuous_action_dim=10, state_dim=10)
+    algorithm, _ = _make_algorithm(config)
+    assert algorithm.target_entropy == -5.0
+
+
+def test_sac_algorithm_target_entropy_with_discrete_action():
+    config = create_config_with_visual_input(state_dim=10, continuous_action_dim=6, has_discrete_action=True)
+    config.num_discrete_actions = 5
+    algorithm, _ = _make_algorithm(config)
+    assert algorithm.target_entropy == -3.5
+
+
+def test_sac_algorithm_temperature():
+    import math
+
+    config = create_default_config(continuous_action_dim=10, state_dim=10)
+    algo_config = SACAlgorithmConfig.from_policy_config(config)
+    policy = GaussianActorPolicy(config=config)
+    algorithm = SACAlgorithm(policy=policy, config=algo_config)
+
+    assert algorithm.temperature == pytest.approx(1.0)
+    algorithm.log_alpha.data = torch.tensor([math.log(0.1)])
+    assert algorithm.temperature == pytest.approx(0.1)
+
+
+def test_sac_algorithm_update_target_network():
+    config = create_default_config(state_dim=10, continuous_action_dim=6)
+    algo_config = SACAlgorithmConfig.from_policy_config(config)
+    algo_config.critic_target_update_weight = 1.0
+    policy = GaussianActorPolicy(config=config)
+    algorithm = SACAlgorithm(policy=policy, config=algo_config)
+
+    for p in algorithm.critic_ensemble.parameters():
+        p.data = torch.ones_like(p.data)
+
+    algorithm._update_target_networks()
+    for p in algorithm.critic_target.parameters():
+        assert torch.allclose(p.data, torch.ones_like(p.data))
+
+
+@pytest.mark.parametrize("num_critics", [1, 3])
+def test_sac_algorithm_with_critics_number_of_heads(num_critics: int):
+    batch_size = 2
+    action_dim = 10
+    state_dim = 10
+    config = create_config_with_visual_input(state_dim=state_dim, continuous_action_dim=action_dim)
+
+    policy = GaussianActorPolicy(config=config)
+    policy.train()
+    algo_config = SACAlgorithmConfig.from_policy_config(config)
+    algo_config.num_critics = num_critics
+    algorithm = SACAlgorithm(policy=policy, config=algo_config)
+    algorithm.make_optimizers_and_scheduler()
+
+    assert len(algorithm.critic_ensemble.critics) == num_critics
+
+    batch = create_train_batch_with_visual_input(
+        batch_size=batch_size, state_dim=state_dim, action_dim=action_dim
+    )
+    forward_batch = algorithm._prepare_forward_batch(batch)
+
+    critic_loss = algorithm._compute_loss_critic(forward_batch)
+    assert critic_loss.shape == ()
+    algorithm.optimizers["critic"].zero_grad()
+    critic_loss.backward()
+    algorithm.optimizers["critic"].step()
+
+
+def test_gaussian_actor_policy_save_and_load(tmp_path):
+    """Test that the policy can be saved and loaded from pretrained."""
+    root = tmp_path / "test_gaussian_actor_save_and_load"
+
+    state_dim = 10
+    action_dim = 10
+    batch_size = 2
+
+    config = create_default_config(state_dim=state_dim, continuous_action_dim=action_dim)
+    policy = GaussianActorPolicy(config=config)
+    policy.eval()
+    policy.save_pretrained(root)
+    loaded_policy = GaussianActorPolicy.from_pretrained(root, config=config)
+    loaded_policy.eval()
+
+    assert policy.state_dict().keys() == loaded_policy.state_dict().keys()
+    for k in policy.state_dict():
+        assert torch.allclose(policy.state_dict()[k], loaded_policy.state_dict()[k], atol=1e-6)
+
+    with torch.no_grad():
+        with seeded_context(12):
+            observation_batch = create_observation_batch(batch_size=batch_size, state_dim=state_dim)
+            actions = policy.select_action(observation_batch)
+
+        with seeded_context(12):
+            loaded_observation_batch = create_observation_batch(batch_size=batch_size, state_dim=state_dim)
+            loaded_actions = loaded_policy.select_action(loaded_observation_batch)
+
+        assert torch.allclose(actions, loaded_actions)
+
+
+def test_gaussian_actor_policy_save_and_load_with_discrete_critic(tmp_path):
+    """Discrete critic should be saved/loaded as part of the policy."""
+    root = tmp_path / "test_gaussian_actor_save_and_load_discrete"
+
+    state_dim = 10
+    action_dim = 6
+
+    config = create_default_config(state_dim=state_dim, continuous_action_dim=action_dim)
+    config.num_discrete_actions = 3
+    policy = GaussianActorPolicy(config=config)
+    policy.eval()
+    policy.save_pretrained(root)
+
+    loaded_policy = GaussianActorPolicy.from_pretrained(root, config=config)
+    loaded_policy.eval()
+
+    assert loaded_policy.discrete_critic is not None
+    dc_keys = [k for k in loaded_policy.state_dict() if k.startswith("discrete_critic.")]
+    assert len(dc_keys) > 0
+
+    for k in policy.state_dict():
+        assert torch.allclose(policy.state_dict()[k], loaded_policy.state_dict()[k], atol=1e-6)
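Review note on the three SAC quantities asserted in the tests above (target entropy, temperature, target-network update). The sketch below is a minimal, standalone illustration and not the code under test: the function names `target_entropy`, `temperature`, and `soft_update` are hypothetical, the `-action_dim / 2` heuristic is inferred only from the asserted values (-5.0 for 10 action dims, -3.5 for 6 continuous + 1 discrete), and the real `SACAlgorithm` may compute these differently.

```python
import math


def target_entropy(action_dim: int) -> float:
    """Heuristic consistent with the asserted values: -action_dim / 2 (assumption)."""
    return -action_dim / 2


def temperature(log_alpha: float) -> float:
    """Temperature as exp(log_alpha), so log_alpha = 0 gives 1.0."""
    return math.exp(log_alpha)


def soft_update(target: list[float], online: list[float], weight: float) -> list[float]:
    """Polyak averaging: weight * online + (1 - weight) * target, element-wise."""
    return [weight * o + (1.0 - weight) * t for t, o in zip(target, online)]


if __name__ == "__main__":
    print(target_entropy(10))                      # 10 continuous action dims
    print(target_entropy(6 + 1))                   # 6 continuous + 1 discrete dim
    print(temperature(math.log(0.1)))              # mirrors the log_alpha override in the test
    print(soft_update([0.0, 2.0], [1.0, 1.0], 1.0))  # weight 1.0 copies the online values
```

With `weight = 1.0` the averaged target equals the online parameters exactly, which is why `test_sac_algorithm_update_target_network` can compare the target critic against all-ones tensors after a single update.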
@@ -1,546 +0,0 @@
-# !/usr/bin/env python
-
-# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-import math
-
-import pytest
-import torch
-from torch import Tensor, nn
-
-from lerobot.configs.types import FeatureType, PolicyFeature
-from lerobot.policies.sac.configuration_sac import SACConfig
-from lerobot.policies.sac.modeling_sac import MLP, SACPolicy
-from lerobot.utils.constants import ACTION, OBS_IMAGE, OBS_STATE
-from lerobot.utils.random_utils import seeded_context, set_seed
-
-try:
-    import transformers  # noqa: F401
-
-    TRANSFORMERS_AVAILABLE = True
-except ImportError:
-    TRANSFORMERS_AVAILABLE = False
-
-
-@pytest.fixture(autouse=True)
-def set_random_seed():
-    seed = 42
-    set_seed(seed)
-
-
-def test_mlp_with_default_args():
-    mlp = MLP(input_dim=10, hidden_dims=[256, 256])
-
-    x = torch.randn(10)
-    y = mlp(x)
-    assert y.shape == (256,)
-
-
-def test_mlp_with_batch_dim():
-    mlp = MLP(input_dim=10, hidden_dims=[256, 256])
-    x = torch.randn(2, 10)
-    y = mlp(x)
-    assert y.shape == (2, 256)
-
-
-def test_forward_with_empty_hidden_dims():
-    mlp = MLP(input_dim=10, hidden_dims=[])
-    x = torch.randn(1, 10)
-    assert mlp(x).shape == (1, 10)
-
-
-def test_mlp_with_dropout():
-    mlp = MLP(input_dim=10, hidden_dims=[256, 256, 11], dropout_rate=0.1)
-    x = torch.randn(1, 10)
-    y = mlp(x)
-    assert y.shape == (1, 11)
-
-    drop_out_layers_count = sum(isinstance(layer, nn.Dropout) for layer in mlp.net)
-    assert drop_out_layers_count == 2
-
-
-def test_mlp_with_custom_final_activation():
-    mlp = MLP(input_dim=10, hidden_dims=[256, 256], final_activation=torch.nn.Tanh())
-    x = torch.randn(1, 10)
-    y = mlp(x)
-    assert y.shape == (1, 256)
-    assert (y >= -1).all() and (y <= 1).all()
-
-
-def test_sac_policy_with_default_args():
-    with pytest.raises(ValueError, match="should be an instance of class `PreTrainedConfig`"):
-        SACPolicy()
-
-
-def create_dummy_state(batch_size: int, state_dim: int = 10) -> Tensor:
-    return {
-        OBS_STATE: torch.randn(batch_size, state_dim),
-    }
-
-
-def create_dummy_with_visual_input(batch_size: int, state_dim: int = 10) -> Tensor:
-    return {
-        OBS_IMAGE: torch.randn(batch_size, 3, 84, 84),
-        OBS_STATE: torch.randn(batch_size, state_dim),
-    }
-
-
-def create_dummy_action(batch_size: int, action_dim: int = 10) -> Tensor:
-    return torch.randn(batch_size, action_dim)
-
-
-def create_default_train_batch(
-    batch_size: int = 8, state_dim: int = 10, action_dim: int = 10
-) -> dict[str, Tensor]:
-    return {
-        ACTION: create_dummy_action(batch_size, action_dim),
-        "reward": torch.randn(batch_size),
-        "state": create_dummy_state(batch_size, state_dim),
-        "next_state": create_dummy_state(batch_size, state_dim),
-        "done": torch.randn(batch_size),
-    }
-
-
-def create_train_batch_with_visual_input(
-    batch_size: int = 8, state_dim: int = 10, action_dim: int = 10
-) -> dict[str, Tensor]:
-    return {
-        ACTION: create_dummy_action(batch_size, action_dim),
-        "reward": torch.randn(batch_size),
-        "state": create_dummy_with_visual_input(batch_size, state_dim),
-        "next_state": create_dummy_with_visual_input(batch_size, state_dim),
-        "done": torch.randn(batch_size),
-    }
-
-
-def create_observation_batch(batch_size: int = 8, state_dim: int = 10) -> dict[str, Tensor]:
-    return {
-        OBS_STATE: torch.randn(batch_size, state_dim),
-    }
-
-
-def create_observation_batch_with_visual_input(batch_size: int = 8, state_dim: int = 10) -> dict[str, Tensor]:
-    return {
-        OBS_STATE: torch.randn(batch_size, state_dim),
-        OBS_IMAGE: torch.randn(batch_size, 3, 84, 84),
-    }
-
-
-def make_optimizers(policy: SACPolicy, has_discrete_action: bool = False) -> dict[str, torch.optim.Optimizer]:
-    """Create optimizers for the SAC policy."""
-    optimizer_actor = torch.optim.Adam(
-        # Handle the case of shared encoder where the encoder weights are not optimized with the actor gradient
-        params=[
-            p
-            for n, p in policy.actor.named_parameters()
-            if not policy.config.shared_encoder or not n.startswith("encoder")
-        ],
-        lr=policy.config.actor_lr,
-    )
-    optimizer_critic = torch.optim.Adam(
-        params=policy.critic_ensemble.parameters(),
-        lr=policy.config.critic_lr,
-    )
-    optimizer_temperature = torch.optim.Adam(
-        params=[policy.log_alpha],
-        lr=policy.config.critic_lr,
-    )
-
-    optimizers = {
-        "actor": optimizer_actor,
-        "critic": optimizer_critic,
-        "temperature": optimizer_temperature,
-    }
-
-    if has_discrete_action:
-        optimizers["discrete_critic"] = torch.optim.Adam(
-            params=policy.discrete_critic.parameters(),
-            lr=policy.config.critic_lr,
-        )
-
-    return optimizers
-
-
-def create_default_config(
-    state_dim: int, continuous_action_dim: int, has_discrete_action: bool = False
-) -> SACConfig:
-    action_dim = continuous_action_dim
-    if has_discrete_action:
-        action_dim += 1
-
-    config = SACConfig(
-        input_features={OBS_STATE: PolicyFeature(type=FeatureType.STATE, shape=(state_dim,))},
-        output_features={ACTION: PolicyFeature(type=FeatureType.ACTION, shape=(continuous_action_dim,))},
-        dataset_stats={
-            OBS_STATE: {
-                "min": [0.0] * state_dim,
-                "max": [1.0] * state_dim,
-            },
-            ACTION: {
-                "min": [0.0] * continuous_action_dim,
-                "max": [1.0] * continuous_action_dim,
-            },
-        },
-    )
-    config.validate_features()
-    return config
-
-
-def create_config_with_visual_input(
-    state_dim: int, continuous_action_dim: int, has_discrete_action: bool = False
-) -> SACConfig:
-    config = create_default_config(
-        state_dim=state_dim,
-        continuous_action_dim=continuous_action_dim,
-        has_discrete_action=has_discrete_action,
-    )
-    config.input_features[OBS_IMAGE] = PolicyFeature(type=FeatureType.VISUAL, shape=(3, 84, 84))
-    config.dataset_stats[OBS_IMAGE] = {
-        "mean": torch.randn(3, 1, 1),
-        "std": torch.randn(3, 1, 1),
-    }
-
-    # Let make tests a little bit faster
-    config.state_encoder_hidden_dim = 32
-    config.latent_dim = 32
-
-    config.validate_features()
-    return config
-
-
-@pytest.mark.parametrize("batch_size,state_dim,action_dim", [(2, 6, 6), (1, 10, 10)])
-def test_sac_policy_with_default_config(batch_size: int, state_dim: int, action_dim: int):
-    batch = create_default_train_batch(batch_size=batch_size, action_dim=action_dim, state_dim=state_dim)
-    config = create_default_config(state_dim=state_dim, continuous_action_dim=action_dim)
-
-    policy = SACPolicy(config=config)
-    policy.train()
-
-    optimizers = make_optimizers(policy)
-
-    cirtic_loss = policy.forward(batch, model="critic")["loss_critic"]
-    assert cirtic_loss.item() is not None
-    assert cirtic_loss.shape == ()
-    cirtic_loss.backward()
-    optimizers["critic"].step()
-
-    actor_loss = policy.forward(batch, model="actor")["loss_actor"]
-    assert actor_loss.item() is not None
-    assert actor_loss.shape == ()
-
-    actor_loss.backward()
-    optimizers["actor"].step()
-
-    temperature_loss = policy.forward(batch, model="temperature")["loss_temperature"]
-    assert temperature_loss.item() is not None
-    assert temperature_loss.shape == ()
-
-    temperature_loss.backward()
-    optimizers["temperature"].step()
-
-    policy.eval()
-    with torch.no_grad():
-        observation_batch = create_observation_batch(batch_size=batch_size, state_dim=state_dim)
-        selected_action = policy.select_action(observation_batch)
-        assert selected_action.shape == (batch_size, action_dim)
-
-
-@pytest.mark.parametrize("batch_size,state_dim,action_dim", [(2, 6, 6), (1, 10, 10)])
-def test_sac_policy_with_visual_input(batch_size: int, state_dim: int, action_dim: int):
-    config = create_config_with_visual_input(state_dim=state_dim, continuous_action_dim=action_dim)
-    policy = SACPolicy(config=config)
-
-    batch = create_train_batch_with_visual_input(
-        batch_size=batch_size, state_dim=state_dim, action_dim=action_dim
-    )
-
-    policy.train()
-
-    optimizers = make_optimizers(policy)
-
-    cirtic_loss = policy.forward(batch, model="critic")["loss_critic"]
-    assert cirtic_loss.item() is not None
-    assert cirtic_loss.shape == ()
-    cirtic_loss.backward()
-    optimizers["critic"].step()
-
-    actor_loss = policy.forward(batch, model="actor")["loss_actor"]
-    assert actor_loss.item() is not None
-    assert actor_loss.shape == ()
-
-    actor_loss.backward()
-    optimizers["actor"].step()
-
-    temperature_loss = policy.forward(batch, model="temperature")["loss_temperature"]
-    assert temperature_loss.item() is not None
-    assert temperature_loss.shape == ()
-
-    temperature_loss.backward()
-    optimizers["temperature"].step()
-
-    policy.eval()
-    with torch.no_grad():
-        observation_batch = create_observation_batch_with_visual_input(
-            batch_size=batch_size, state_dim=state_dim
-        )
-        selected_action = policy.select_action(observation_batch)
-        assert selected_action.shape == (batch_size, action_dim)
-
-
-# Let's check best candidates for pretrained encoders
-@pytest.mark.parametrize(
-    "batch_size,state_dim,action_dim,vision_encoder_name",
-    [(1, 6, 6, "helper2424/resnet10"), (1, 6, 6, "facebook/convnext-base-224")],
-)
-@pytest.mark.skipif(not TRANSFORMERS_AVAILABLE, reason="Transformers are not installed")
-@pytest.mark.skip(
-    reason="helper2424/resnet10 needs to be updated to work with the latest version of transformers"
-)
-def test_sac_policy_with_pretrained_encoder(
-    batch_size: int, state_dim: int, action_dim: int, vision_encoder_name: str
-):
-    config = create_config_with_visual_input(state_dim=state_dim, continuous_action_dim=action_dim)
-    config.vision_encoder_name = vision_encoder_name
-    policy = SACPolicy(config=config)
-    policy.train()
-
-    batch = create_train_batch_with_visual_input(
-        batch_size=batch_size, state_dim=state_dim, action_dim=action_dim
-    )
-
-    optimizers = make_optimizers(policy)
-
-    cirtic_loss = policy.forward(batch, model="critic")["loss_critic"]
-    assert cirtic_loss.item() is not None
-    assert cirtic_loss.shape == ()
-    cirtic_loss.backward()
-    optimizers["critic"].step()
-
-    actor_loss = policy.forward(batch, model="actor")["loss_actor"]
-    assert actor_loss.item() is not None
-    assert actor_loss.shape == ()
-
-
-def test_sac_policy_with_shared_encoder():
-    batch_size = 2
-    action_dim = 10
-    state_dim = 10
-    config = create_config_with_visual_input(state_dim=state_dim, continuous_action_dim=action_dim)
-    config.shared_encoder = True
-
-    policy = SACPolicy(config=config)
-    policy.train()
-
-    batch = create_train_batch_with_visual_input(
-        batch_size=batch_size, state_dim=state_dim, action_dim=action_dim
-    )
-
-    policy.train()
-
-    optimizers = make_optimizers(policy)
-
-    cirtic_loss = policy.forward(batch, model="critic")["loss_critic"]
-    assert cirtic_loss.item() is not None
-    assert cirtic_loss.shape == ()
-    cirtic_loss.backward()
-    optimizers["critic"].step()
-
-    actor_loss = policy.forward(batch, model="actor")["loss_actor"]
-    assert actor_loss.item() is not None
-    assert actor_loss.shape == ()
-
-    actor_loss.backward()
-    optimizers["actor"].step()
-
-
-def test_sac_policy_with_discrete_critic():
-    batch_size = 2
-    continuous_action_dim = 9
-    full_action_dim = continuous_action_dim + 1  # the last action is discrete
-    state_dim = 10
-    config = create_config_with_visual_input(
-        state_dim=state_dim, continuous_action_dim=continuous_action_dim, has_discrete_action=True
-    )
-
-    num_discrete_actions = 5
-    config.num_discrete_actions = num_discrete_actions
-
-    policy = SACPolicy(config=config)
-    policy.train()
-
-    batch = create_train_batch_with_visual_input(
-        batch_size=batch_size, state_dim=state_dim, action_dim=full_action_dim
-    )
-
-    policy.train()
-
-    optimizers = make_optimizers(policy, has_discrete_action=True)
-
-    cirtic_loss = policy.forward(batch, model="critic")["loss_critic"]
-    assert cirtic_loss.item() is not None
-    assert cirtic_loss.shape == ()
-    cirtic_loss.backward()
-    optimizers["critic"].step()
-
-    discrete_critic_loss = policy.forward(batch, model="discrete_critic")["loss_discrete_critic"]
-    assert discrete_critic_loss.item() is not None
-    assert discrete_critic_loss.shape == ()
-    discrete_critic_loss.backward()
-    optimizers["discrete_critic"].step()
-
-    actor_loss = policy.forward(batch, model="actor")["loss_actor"]
-    assert actor_loss.item() is not None
-    assert actor_loss.shape == ()
-
-    actor_loss.backward()
-    optimizers["actor"].step()
-
-    policy.eval()
-    with torch.no_grad():
-        observation_batch = create_observation_batch_with_visual_input(
-            batch_size=batch_size, state_dim=state_dim
-        )
-        selected_action = policy.select_action(observation_batch)
-        assert selected_action.shape == (batch_size, full_action_dim)
-
-        discrete_actions = selected_action[:, -1].long()
-        discrete_action_values = set(discrete_actions.tolist())
-
-        assert all(action in range(num_discrete_actions) for action in discrete_action_values), (
-            f"Discrete action {discrete_action_values} is not in range({num_discrete_actions})"
-        )
-
-
-def test_sac_policy_with_default_entropy():
-    config = create_default_config(continuous_action_dim=10, state_dim=10)
-    policy = SACPolicy(config=config)
-    assert policy.target_entropy == -5.0
-
-
-def test_sac_policy_default_target_entropy_with_discrete_action():
-    config = create_config_with_visual_input(state_dim=10, continuous_action_dim=6, has_discrete_action=True)
-    policy = SACPolicy(config=config)
-    assert policy.target_entropy == -3.0
-
-
-def test_sac_policy_with_predefined_entropy():
-    config = create_default_config(state_dim=10, continuous_action_dim=6)
-    config.target_entropy = -3.5
-
-    policy = SACPolicy(config=config)
-    assert policy.target_entropy == pytest.approx(-3.5)
-
-
-def test_sac_policy_update_temperature():
-    """Test that temperature property is always in sync with log_alpha."""
-    config = create_default_config(continuous_action_dim=10, state_dim=10)
-    policy = SACPolicy(config=config)
-
-    assert policy.temperature == pytest.approx(1.0)
-    policy.log_alpha.data = torch.tensor([math.log(0.1)])
-    # Temperature property automatically reflects log_alpha changes
-    assert policy.temperature == pytest.approx(0.1)
-
-
-def test_sac_policy_update_target_network():
-    config = create_default_config(state_dim=10, continuous_action_dim=6)
-    config.critic_target_update_weight = 1.0
-
-    policy = SACPolicy(config=config)
-    policy.train()
-
-    for p in policy.critic_ensemble.parameters():
-        p.data = torch.ones_like(p.data)
-
-    policy.update_target_networks()
-    for p in policy.critic_target.parameters():
-        assert torch.allclose(p.data, torch.ones_like(p.data)), (
-            f"Target network {p.data} is not equal to {torch.ones_like(p.data)}"
-        )
-
-
-@pytest.mark.parametrize("num_critics", [1, 3])
-def test_sac_policy_with_critics_number_of_heads(num_critics: int):
-    batch_size = 2
-    action_dim = 10
-    state_dim = 10
-    config = create_config_with_visual_input(state_dim=state_dim, continuous_action_dim=action_dim)
-    config.num_critics = num_critics
-
-    policy = SACPolicy(config=config)
-    policy.train()
-
-    assert len(policy.critic_ensemble.critics) == num_critics
-
-    batch = create_train_batch_with_visual_input(
-        batch_size=batch_size, state_dim=state_dim, action_dim=action_dim
-    )
-
-    policy.train()
-
-    optimizers = make_optimizers(policy)
-
-    cirtic_loss = policy.forward(batch, model="critic")["loss_critic"]
-    assert cirtic_loss.item() is not None
-    assert cirtic_loss.shape == ()
-    cirtic_loss.backward()
-    optimizers["critic"].step()
-
-
-def test_sac_policy_save_and_load(tmp_path):
-    root = tmp_path / "test_sac_save_and_load"
-
-    state_dim = 10
-    action_dim = 10
-    batch_size = 2
-
-    config = create_default_config(state_dim=state_dim, continuous_action_dim=action_dim)
-    policy = SACPolicy(config=config)
-    policy.eval()
-    policy.save_pretrained(root)
-    loaded_policy = SACPolicy.from_pretrained(root, config=config)
-    loaded_policy.eval()
-
-    batch = create_default_train_batch(batch_size=1, state_dim=10, action_dim=10)
-
-    with torch.no_grad():
-        with seeded_context(12):
-            # Collect policy values before saving
-            cirtic_loss = policy.forward(batch, model="critic")["loss_critic"]
-            actor_loss = policy.forward(batch, model="actor")["loss_actor"]
-            temperature_loss = policy.forward(batch, model="temperature")["loss_temperature"]
-
-            observation_batch = create_observation_batch(batch_size=batch_size, state_dim=state_dim)
-            actions = policy.select_action(observation_batch)
-
-        with seeded_context(12):
-            # Collect policy values after loading
-            loaded_cirtic_loss = loaded_policy.forward(batch, model="critic")["loss_critic"]
-            loaded_actor_loss = loaded_policy.forward(batch, model="actor")["loss_actor"]
-            loaded_temperature_loss = loaded_policy.forward(batch, model="temperature")["loss_temperature"]
-
-            loaded_observation_batch = create_observation_batch(batch_size=batch_size, state_dim=state_dim)
-            loaded_actions = loaded_policy.select_action(loaded_observation_batch)
-
-        assert policy.state_dict().keys() == loaded_policy.state_dict().keys()
-        for k in policy.state_dict():
-            assert torch.allclose(policy.state_dict()[k], loaded_policy.state_dict()[k], atol=1e-6)
-
-        # Compare values before and after saving and loading
-        # They should be the same
-        assert torch.allclose(cirtic_loss, loaded_cirtic_loss)
-        assert torch.allclose(actor_loss, loaded_actor_loss)
-        assert torch.allclose(temperature_loss, loaded_temperature_loss)
-        assert torch.allclose(actions, loaded_actions)
@@ -21,8 +21,8 @@ import pytest
 import torch

 from lerobot.configs.types import FeatureType, NormalizationMode, PolicyFeature
-from lerobot.policies.sac.configuration_sac import SACConfig
-from lerobot.policies.sac.processor_sac import make_sac_pre_post_processors
+from lerobot.policies.gaussian_actor.configuration_gaussian_actor import GaussianActorConfig
+from lerobot.policies.gaussian_actor.processor_gaussian_actor import make_gaussian_actor_pre_post_processors
 from lerobot.processor import (
    AddBatchDimensionProcessorStep,
    DataProcessorPipeline,
@@ -38,7 +38,7 @@ from lerobot.utils.constants import ACTION, OBS_STATE

 def create_default_config():
    """Create a default SAC configuration for testing."""
-    config = SACConfig()
+    config = GaussianActorConfig()
    config.input_features = {
        OBS_STATE: PolicyFeature(type=FeatureType.STATE, shape=(10,)),
    }
@@ -66,7 +66,7 @@ def test_make_sac_processor_basic():
    config = create_default_config()
    stats = create_default_stats()

-    preprocessor, postprocessor = make_sac_pre_post_processors(
+    preprocessor, postprocessor = make_gaussian_actor_pre_post_processors(
        config,
        stats,
    )
@@ -88,12 +88,12 @@ def test_make_sac_processor_basic():
    assert isinstance(postprocessor.steps[1], DeviceProcessorStep)


-def test_sac_processor_normalization_modes():
+def test_gaussian_actor_processor_normalization_modes():
    """Test that SAC processor correctly handles different normalization modes."""
    config = create_default_config()
    stats = create_default_stats()

-    preprocessor, postprocessor = make_sac_pre_post_processors(
+    preprocessor, postprocessor = make_gaussian_actor_pre_post_processors(
        config,
        stats,
    )
@@ -121,13 +121,13 @@ def test_sac_processor_normalization_modes():


@pytest.mark.skipif(not torch.cuda.is_available(), reason="CUDA not available")
-def test_sac_processor_cuda():
+def test_gaussian_actor_processor_cuda():
    """Test SAC processor with CUDA device."""
    config = create_default_config()
    config.device = "cuda"
    stats = create_default_stats()

-    preprocessor, postprocessor = make_sac_pre_post_processors(
+    preprocessor, postprocessor = make_gaussian_actor_pre_post_processors(
        config,
        stats,
    )
@@ -153,13 +153,13 @@ def test_sac_processor_cuda():


@pytest.mark.skipif(not torch.cuda.is_available(), reason="CUDA not available")
-def test_sac_processor_accelerate_scenario():
+def test_gaussian_actor_processor_accelerate_scenario():
    """Test SAC processor in simulated Accelerate scenario."""
    config = create_default_config()
    config.device = "cuda:0"
    stats = create_default_stats()

-    preprocessor, postprocessor = make_sac_pre_post_processors(
+    preprocessor, postprocessor = make_gaussian_actor_pre_post_processors(
        config,
        stats,
    )
@@ -180,13 +180,13 @@ def test_sac_processor_accelerate_scenario():


@pytest.mark.skipif(torch.cuda.device_count() < 2, reason="Requires at least 2 GPUs")
-def test_sac_processor_multi_gpu():
+def test_gaussian_actor_processor_multi_gpu():
    """Test SAC processor with multi-GPU setup."""
    config = create_default_config()
    config.device = "cuda:0"
    stats = create_default_stats()

-    preprocessor, postprocessor = make_sac_pre_post_processors(
+    preprocessor, postprocessor = make_gaussian_actor_pre_post_processors(
        config,
        stats,
    )
@@ -206,11 +206,11 @@ def test_sac_processor_multi_gpu():
    assert processed[TransitionKey.ACTION.value].device == device


-def test_sac_processor_without_stats():
+def test_gaussian_actor_processor_without_stats():
    """Test SAC processor creation without dataset statistics."""
    config = create_default_config()

-    preprocessor, postprocessor = make_sac_pre_post_processors(config, dataset_stats=None)
+    preprocessor, postprocessor = make_gaussian_actor_pre_post_processors(config, dataset_stats=None)

    # Should still create processors
    assert preprocessor is not None
@@ -226,12 +226,12 @@ def test_sac_processor_without_stats():
    assert processed is not None


-def test_sac_processor_save_and_load():
+def test_gaussian_actor_processor_save_and_load():
    """Test saving and loading SAC processor."""
    config = create_default_config()
    stats = create_default_stats()

-    preprocessor, postprocessor = make_sac_pre_post_processors(
+    preprocessor, postprocessor = make_gaussian_actor_pre_post_processors(
        config,
        stats,
    )
@@ -257,14 +257,14 @@ def test_sac_processor_save_and_load():


@pytest.mark.skipif(not torch.cuda.is_available(), reason="CUDA not available")
-def test_sac_processor_mixed_precision():
+def test_gaussian_actor_processor_mixed_precision():
    """Test SAC processor with mixed precision."""
    config = create_default_config()
    config.device = "cuda"
    stats = create_default_stats()

    # Create processor
-    preprocessor, postprocessor = make_sac_pre_post_processors(
+    preprocessor, postprocessor = make_gaussian_actor_pre_post_processors(
        config,
        stats,
    )
@@ -304,12 +304,12 @@ def test_sac_processor_mixed_precision():
    assert processed[TransitionKey.ACTION.value].dtype == torch.float16


-def test_sac_processor_batch_data():
+def test_gaussian_actor_processor_batch_data():
    """Test SAC processor with batched data."""
    config = create_default_config()
    stats = create_default_stats()

-    preprocessor, postprocessor = make_sac_pre_post_processors(
+    preprocessor, postprocessor = make_gaussian_actor_pre_post_processors(
        config,
        stats,
    )
@@ -329,12 +329,12 @@ def test_sac_processor_batch_data():
    assert processed[TransitionKey.ACTION.value].shape == (batch_size, 5)


-def test_sac_processor_edge_cases():
+def test_gaussian_actor_processor_edge_cases():
    """Test SAC processor with edge cases."""
    config = create_default_config()
    stats = create_default_stats()

-    preprocessor, postprocessor = make_sac_pre_post_processors(
+    preprocessor, postprocessor = make_gaussian_actor_pre_post_processors(
        config,
        stats,
    )
@@ -358,13 +358,13 @@ def test_sac_processor_edge_cases():


@pytest.mark.skipif(not torch.cuda.is_available(), reason="CUDA not available")
-def test_sac_processor_bfloat16_device_float32_normalizer():
+def test_gaussian_actor_processor_bfloat16_device_float32_normalizer():
    """Test: DeviceProcessor(bfloat16) + NormalizerProcessor(float32) → output bfloat16 via automatic adaptation"""
    config = create_default_config()
    config.device = "cuda"
    stats = create_default_stats()

-    preprocessor, _ = make_sac_pre_post_processors(
+    preprocessor, _ = make_gaussian_actor_pre_post_processors(
        config,
        stats,
    )
@@ -1804,13 +1804,15 @@ def test_stats_override_preservation_in_load_state_dict():
                override_normalizer.stats[key][stat_name], original_stats[key][stat_name]
            ), f"Stats for {key}.{stat_name} should not match original stats"

-    # Verify that _tensor_stats are also correctly set to match the override stats
+    # Verify that _tensor_stats values match the override stats
+    # Note: _reshape_visual_stats reshapes visual stats from (C,) to (C,1,1), so squeeze both sides first
    expected_tensor_stats = to_tensor(override_stats)
    for key in expected_tensor_stats:
        for stat_name in expected_tensor_stats[key]:
            if isinstance(expected_tensor_stats[key][stat_name], torch.Tensor):
                torch.testing.assert_close(
-                    override_normalizer._tensor_stats[key][stat_name], expected_tensor_stats[key][stat_name]
+                    override_normalizer._tensor_stats[key][stat_name].squeeze(),
+                    expected_tensor_stats[key][stat_name].squeeze(),
                )


@@ -1849,12 +1851,16 @@ def test_stats_without_override_loads_normally():
    # Stats should now match the original stats (normal behavior)
    # Check that all keys and values match
    assert set(new_normalizer.stats.keys()) == set(original_stats.keys())
+    # Note: visual stats are reshaped from (C,) to (C,1,1) by _reshape_visual_stats,
+    # so we squeeze before comparing values.
    for key in original_stats:
        assert set(new_normalizer.stats[key].keys()) == set(original_stats[key].keys())
        for stat_name in original_stats[key]:
-            np.testing.assert_allclose(
-                new_normalizer.stats[key][stat_name], original_stats[key][stat_name], rtol=1e-6, atol=1e-6
-            )
+            actual = new_normalizer.stats[key][stat_name]
+            expected = original_stats[key][stat_name]
+            if hasattr(actual, "squeeze"):
+                actual = actual.squeeze()
+            np.testing.assert_allclose(actual, expected, rtol=1e-6, atol=1e-6)


 def test_stats_explicit_provided_flag_detection():
@@ -2075,8 +2081,9 @@ def test_stats_reconstruction_after_load_state_dict():
    assert ACTION in new_normalizer.stats

    # Check that values are correct (converted back from tensors)
-    np.testing.assert_allclose(new_normalizer.stats[OBS_IMAGE]["mean"], [0.5, 0.5, 0.5])
-    np.testing.assert_allclose(new_normalizer.stats[OBS_IMAGE]["std"], [0.2, 0.2, 0.2])
+    # Note: visual stats are reshaped to (C,1,1), so we squeeze before comparing
+    np.testing.assert_allclose(new_normalizer.stats[OBS_IMAGE]["mean"].squeeze(), [0.5, 0.5, 0.5])
+    np.testing.assert_allclose(new_normalizer.stats[OBS_IMAGE]["std"].squeeze(), [0.2, 0.2, 0.2])
    np.testing.assert_allclose(new_normalizer.stats[OBS_STATE]["min"], [0.0, -1.0])
    np.testing.assert_allclose(new_normalizer.stats[OBS_STATE]["max"], [1.0, 1.0])
    np.testing.assert_allclose(new_normalizer.stats[ACTION]["mean"], [0.0, 0.0])
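Note on the `.squeeze()` changes in the three hunks above: the comments say visual stats are reshaped from `(C,)` to `(C,1,1)`, so a comparison against the original flat per-channel values has to drop the singleton axes first. The snippet below is a minimal sketch of that shape issue. `reshape_visual_stat` is a hypothetical stand-in written for illustration; it is not the library's `_reshape_visual_stats`, whose implementation is not shown in this diff.

```python
import numpy as np


def reshape_visual_stat(values):
    """Hypothetical stand-in: turn per-channel stats (C,) into (C, 1, 1)."""
    return np.asarray(values, dtype=np.float64).reshape(-1, 1, 1)


expected = [0.5, 0.4, 0.3]
mean = reshape_visual_stat(expected)

# The reshaped stat can broadcast over (C, H, W) images, but its shape is no longer (C,).
assert mean.shape == (3, 1, 1)

# squeeze() drops the singleton axes, so the comparison is shape-for-shape again.
np.testing.assert_allclose(mean.squeeze(), expected, rtol=1e-6, atol=1e-6)

# Guarding with hasattr keeps the same check usable for values without a
# squeeze method (for example plain Python lists), as in the second hunk.
actual = expected
if hasattr(actual, "squeeze"):
    actual = actual.squeeze()
np.testing.assert_allclose(actual, expected, rtol=1e-6, atol=1e-6)
```

One caveat with this approach: `squeeze()` removes every singleton axis, so a single-channel stat of shape `(1,1,1)` collapses to a 0-d array rather than `(1,)`.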
@@ -12,7 +12,6 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-import pytest
 import torch

 from lerobot.configs.types import FeatureType, NormalizationMode, PolicyFeature
@@ -36,9 +35,6 @@ def test_classifier_output():


@skip_if_package_missing("transformers")
-@pytest.mark.skip(
-    reason="helper2424/resnet10 needs to be updated to work with the latest version of transformers"
-)
 def test_binary_classifier_with_default_params():
    from lerobot.rewards.classifier.modeling_classifier import Classifier

@@ -80,9 +76,6 @@ def test_binary_classifier_with_default_params():


@skip_if_package_missing("transformers")
-@pytest.mark.skip(
-    reason="helper2424/resnet10 needs to be updated to work with the latest version of transformers"
-)
 def test_multiclass_classifier():
    from lerobot.rewards.classifier.modeling_classifier import Classifier

@@ -122,9 +115,6 @@ def test_multiclass_classifier():


@skip_if_package_missing("transformers")
-@pytest.mark.skip(
-    reason="helper2424/resnet10 needs to be updated to work with the latest version of transformers"
-)
 def test_default_device():
    from lerobot.rewards.classifier.modeling_classifier import Classifier

@@ -141,9 +131,6 @@ def test_default_device():


@skip_if_package_missing("transformers")
-@pytest.mark.skip(
-    reason="helper2424/resnet10 needs to be updated to work with the latest version of transformers"
-)
 def test_explicit_device_setup():
    from lerobot.rewards.classifier.modeling_classifier import Classifier

@@ -0,0 +1,299 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Tests for the in-tree Robometer reward model."""
+
+from __future__ import annotations
+
+from types import SimpleNamespace
+
+import pytest
+import torch
+
+from lerobot.configs.rewards import RewardModelConfig
+from lerobot.rewards.factory import get_reward_model_class, make_reward_model_config
+from lerobot.rewards.robometer import RobometerConfig
+from lerobot.rewards.robometer.modeling_robometer import (
+    ROBOMETER_FEATURE_PREFIX,
+    convert_bins_to_continuous,
+    decode_progress_outputs,
+)
+
+
+class _FakeQwenConfig:
+    """Stand-in for a Qwen3-VL config (the `model.config` attribute).
+
+    ``to_dict`` matches HF's ``PretrainedConfig.to_dict`` closely enough for
+    ``RobometerRewardModel._save_pretrained`` to snapshot a meaningful
+    ``vlm_config`` into the saved ``config.json`` and for the reload path
+    to round-trip through ``AutoConfig.for_model``.
+    """
+
+    def __init__(self, hidden_dim: int = 8, vocab_size: int = 100) -> None:
+        self.text_config = SimpleNamespace(hidden_size=hidden_dim, vocab_size=vocab_size)
+        self._hidden_dim = hidden_dim
+        self._vocab_size = vocab_size
+
+    def to_dict(self) -> dict:
+        return {
+            "model_type": "fake_qwen",
+            "text_config": {
+                "hidden_size": self._hidden_dim,
+                "vocab_size": self._vocab_size,
+            },
+        }
+
+
+class _FakeEmbeddings(torch.nn.Module):
+    def __init__(self, num_embeddings: int = 100) -> None:
+        super().__init__()
+        self.num_embeddings = num_embeddings
+
+
+class _FakeBaseModel(torch.nn.Module):
+    """Stand-in for the Qwen3-VL backbone during tests.
+
+    Provides the minimum surface `RobometerRewardModel.__init__` and
+    `_compute_rbm_logits` rely on: a `parameters()` iterator (for dtype +
+    device), a `config.text_config.hidden_size`, a `config.to_dict()` so
+    `_save_pretrained` can snapshot `vlm_config`,
+    `get_input_embeddings()` / `resize_token_embeddings()` so the fresh-init
+    embed resize is a no-op, and a forward that returns a `SimpleNamespace`
+    with a `hidden_states` tuple.
+    """
+
+    def __init__(self, hidden_dim: int = 8) -> None:
+        super().__init__()
+        self._param = torch.nn.Parameter(torch.zeros(1))
+        self.hidden_dim = hidden_dim
+        self.config = _FakeQwenConfig(hidden_dim)
+        self._embeddings = _FakeEmbeddings()
+
+    def get_input_embeddings(self) -> _FakeEmbeddings:
+        return self._embeddings
+
+    def resize_token_embeddings(self, new_size: int) -> None:
+        self._embeddings.num_embeddings = new_size
+
+    def forward(self, **kwargs):  # kwargs sink: only input_ids is read
+        input_ids = kwargs["input_ids"]
+        return SimpleNamespace(
+            hidden_states=(torch.zeros(input_ids.shape[0], input_ids.shape[1], self.hidden_dim),),
+            last_hidden_state=torch.zeros(input_ids.shape[0], input_ids.shape[1], self.hidden_dim),
+        )
+
+
+class _FakeTokenizer:
+    """Minimal stand-in for an HF tokenizer.
+
+    ``RobometerConfig.__post_init__`` uses ``len(tokenizer)`` to compute the
+    deterministic resize target ``len(tokenizer) + len(ROBOMETER_SPECIAL_TOKENS)``,
+    so a working ``__len__`` is all we need.
+    """
+
+    def __init__(self, length: int = 100) -> None:
+        self._length = length
+
+    def __len__(self) -> int:
+        return self._length
+
+
+def _patch_build(monkeypatch) -> None:
+    """Stub out the HF AutoX calls so Robometer construction stays cheap in tests.
+
+    Covers (EO-1 style — no model-side override hooks):
+    * ``AutoConfig.from_pretrained`` (config side) — used by
+      ``RobometerConfig.__post_init__`` to snapshot the backbone config.
+    * ``AutoTokenizer.from_pretrained`` (config side) — used by
+      ``__post_init__`` to compute ``len(tokenizer) + 5``.
+    * ``AutoConfig.for_model``                       — used by
+      ``RobometerConfig.vlm_backbone_config`` when rebuilding for ``from_config``.
+    * ``AutoModelForImageTextToText.from_pretrained`` — fresh-training path
+      (``pretrained_path is None``).
+    * ``AutoModelForImageTextToText.from_config``    — checkpoint-reload path
+      (``pretrained_path`` is set).
+    """
+    from lerobot.rewards.robometer import configuration_robometer, modeling_robometer
+
+    monkeypatch.setattr(
+        modeling_robometer.AutoModelForImageTextToText,
+        "from_pretrained",
+        lambda *args, **kwargs: _FakeBaseModel(hidden_dim=8),
+    )
+    monkeypatch.setattr(
+        modeling_robometer.AutoModelForImageTextToText,
+        "from_config",
+        lambda *args, **kwargs: _FakeBaseModel(hidden_dim=8),
+    )
+    monkeypatch.setattr(
+        configuration_robometer.AutoConfig,
+        "for_model",
+        lambda *args, **kwargs: _FakeQwenConfig(hidden_dim=8),
+    )
+    monkeypatch.setattr(
+        configuration_robometer.AutoConfig,
+        "from_pretrained",
+        lambda *args, **kwargs: _FakeQwenConfig(hidden_dim=8),
+    )
+    monkeypatch.setattr(
+        configuration_robometer.AutoTokenizer,
+        "from_pretrained",
+        lambda *args, **kwargs: _FakeTokenizer(length=100),
+    )
+
+
+def _make_batch(features: dict[str, torch.Tensor]) -> dict[str, torch.Tensor]:
+    """Build a `compute_reward`-ready batch using Robometer's namespaced keys."""
+    return {f"{ROBOMETER_FEATURE_PREFIX}{key}": value for key, value in features.items()}
+
+
+def test_robometer_config_registered(monkeypatch):
+    _patch_build(monkeypatch)
+    assert "robometer" in RewardModelConfig.get_known_choices()
+    assert RewardModelConfig.get_choice_class("robometer") is RobometerConfig
+    assert isinstance(make_reward_model_config("robometer", device="cpu"), RobometerConfig)
+
+
+def test_robometer_factory_returns_in_tree_class():
+    from lerobot.rewards.robometer.modeling_robometer import RobometerRewardModel
+
+    assert get_reward_model_class("robometer") is RobometerRewardModel
+
+
+def test_convert_bins_to_continuous_returns_expected_values():
+    # Two frames: first peaks at bin 0 (center 0.0), second peaks at bin 9 (center 1.0).
+    bin_logits = torch.full((2, 10), -10.0)
+    bin_logits[0, 0] = 10.0
+    bin_logits[1, -1] = 10.0
+    values = convert_bins_to_continuous(bin_logits)
+    assert values.shape == (2,)
+    assert torch.allclose(values, torch.tensor([0.0, 1.0]), atol=1e-3)
+
+
+def test_decode_progress_outputs_returns_last_frame_values():
+    progress = torch.tensor([[0.1, 0.9], [0.4, 0.6]])
+    success_logits = torch.tensor([[0.0, 5.0], [0.0, -5.0]])
+
+    outputs = decode_progress_outputs(progress, success_logits, is_discrete_mode=False)
+
+    assert outputs["progress_pred"] == [pytest.approx([0.1, 0.9]), pytest.approx([0.4, 0.6])]
+    assert outputs["success_probs"][0][-1] == pytest.approx(torch.sigmoid(torch.tensor(5.0)).item(), abs=1e-3)
+    assert outputs["success_probs"][1][-1] == pytest.approx(
+        torch.sigmoid(torch.tensor(-5.0)).item(), abs=1e-3
+    )
+
+
+def test_decode_progress_outputs_discrete_mode_softmaxes_over_bins():
+    # 2 frames, peaks at bin 0 and bin 9 → continuous predictions 0.0 and 1.0
+    bin_logits = torch.full((1, 2, 10), -10.0)
+    bin_logits[0, 0, 0] = 10.0
+    bin_logits[0, 1, -1] = 10.0
+
+    outputs = decode_progress_outputs(bin_logits, success_logits=None, is_discrete_mode=True)
+
+    assert outputs["success_probs"] == []
+    assert outputs["progress_pred"][0] == pytest.approx([0.0, 1.0], abs=1e-3)
+
+
+def test_robometer_compute_reward_reads_pre_encoded_inputs(monkeypatch):
+    from lerobot.rewards.robometer.modeling_robometer import RobometerRewardModel
+
+    progress = torch.tensor([[0.1, 0.9], [0.4, 0.6]])
+    success_logits = torch.tensor([[0.0, 5.0], [0.0, -5.0]])
+    _patch_build(monkeypatch)
+
+    cfg = RobometerConfig(device="cpu", reward_output="progress", progress_loss_type="l2")
+    model = RobometerRewardModel(cfg)
+    # Bypass the Qwen3-VL forward + head extraction with deterministic logits.
+    monkeypatch.setattr(model, "_compute_rbm_logits", lambda _inputs: (progress, success_logits))
+
+    batch = _make_batch({"input_ids": torch.zeros(2, 2, dtype=torch.long)})
+    rewards = model.compute_reward(batch)
+
+    assert torch.allclose(rewards, torch.tensor([0.9, 0.6]))
+
+
+def test_robometer_compute_reward_can_return_binary_success(monkeypatch):
+    from lerobot.rewards.robometer.modeling_robometer import RobometerRewardModel
+
+    progress = torch.tensor([[0.1, 0.9], [0.4, 0.6]])
+    success_logits = torch.tensor([[0.0, 5.0], [0.0, -5.0]])  # sigmoid(5) > 0.5; sigmoid(-5) < 0.5
+    _patch_build(monkeypatch)
+
+    cfg = RobometerConfig(
+        device="cpu",
+        reward_output="success",
+        success_threshold=0.5,
+        progress_loss_type="l2",
+    )
+    model = RobometerRewardModel(cfg)
+    monkeypatch.setattr(model, "_compute_rbm_logits", lambda _inputs: (progress, success_logits))
+
+    batch = _make_batch({"input_ids": torch.zeros(2, 2, dtype=torch.long)})
+    rewards = model.compute_reward(batch)
+
+    assert torch.equal(rewards, torch.tensor([1.0, 0.0]))
+
+
+def test_robometer_compute_reward_errors_when_inputs_missing(monkeypatch):
+    from lerobot.rewards.robometer.modeling_robometer import RobometerRewardModel
+
+    _patch_build(monkeypatch)
+
+    cfg = RobometerConfig(device="cpu", progress_loss_type="l2")
+    model = RobometerRewardModel(cfg)
+
+    with pytest.raises(KeyError, match=r"observation\.robometer\.input_ids"):
+        model.compute_reward({})
+
+
+def test_robometer_save_pretrained_roundtrips(monkeypatch, tmp_path):
+    """Saving and reloading a Robometer model in LeRobot HF format must produce
+    a single ``model.safetensors`` + ``config.json`` (no Hydra ``config.yaml``).
+    """
+    from huggingface_hub.constants import CONFIG_NAME, SAFETENSORS_SINGLE_FILE
+
+    from lerobot.rewards.robometer.modeling_robometer import RobometerRewardModel
+
+    _patch_build(monkeypatch)
+    cfg = RobometerConfig(
+        device="cpu",
+        pretrained_path="robometer/Robometer-4B",
+        # Knobs the user might tweak — must survive the round-trip.
+        image_key="observation.images.cam_top",
+        task_key="task",
+        reward_output="success",
+        success_threshold=0.7,
+        progress_loss_type="l2",
+    )
+    model = RobometerRewardModel(cfg)
+    model.save_pretrained(str(tmp_path))
+
+    # Exactly the files LeRobot's HubMixin promises.
+    assert (tmp_path / CONFIG_NAME).exists()
+    assert (tmp_path / SAFETENSORS_SINGLE_FILE).exists()
+    assert not (tmp_path / "config.yaml").exists()  # we want HF-style, not Hydra
+
+    # Reload from the local directory: no Hub fetch, no YAML overlay. The
+    # base class drives subclass dispatch via the `type` field in config.json.
+    reloaded_cfg = RewardModelConfig.from_pretrained(str(tmp_path))
+    assert isinstance(reloaded_cfg, RobometerConfig)
+    reloaded_cfg.pretrained_path = str(tmp_path)  # mimic lerobot-train's `validate()`
+    reloaded = RobometerRewardModel.from_pretrained(str(tmp_path), config=reloaded_cfg)
+
+    assert reloaded.config.image_key == "observation.images.cam_top"
+    assert reloaded.config.task_key == "task"
+    assert reloaded.config.reward_output == "success"
+    assert reloaded.config.success_threshold == 0.7
+    assert reloaded.config.progress_loss_type == "l2"  # came back from config.json
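Note on the decoding tests in this file: they expect a frame whose logits peak at bin 0 to decode to roughly 0.0, a frame peaking at bin 9 to decode to roughly 1.0, and `reward_output="success"` with `success_threshold=0.5` to map logits `5.0` and `-5.0` to rewards `1.0` and `0.0`. The sketch below reproduces those numbers in plain Python. It assumes evenly spaced bin centers on [0, 1], a softmax-weighted mean, and a strict `>` threshold comparison; the function names are hypothetical and the actual `convert_bins_to_continuous` / `compute_reward` implementations are not shown in this diff.

```python
import math


def bins_to_continuous(bin_logits):
    """Softmax-weighted mean of evenly spaced bin centers in [0, 1] (assumed)."""
    n = len(bin_logits)
    centers = [i / (n - 1) for i in range(n)]
    peak = max(bin_logits)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in bin_logits]
    total = sum(weights)
    return sum(w * c for w, c in zip(weights, centers)) / total


def success_reward(success_logit, threshold=0.5):
    """Binary reward: 1.0 when sigmoid(logit) exceeds the threshold (assumed strict)."""
    prob = 1.0 / (1.0 + math.exp(-success_logit))
    return 1.0 if prob > threshold else 0.0


# Same setup as the tests: all logits at -10 except one bin at +10.
low = [-10.0] * 10
low[0] = 10.0
high = [-10.0] * 10
high[-1] = 10.0

first_frame = bins_to_continuous(low)    # close to 0.0
last_frame = bins_to_continuous(high)    # close to 1.0
rewards = [success_reward(5.0), success_reward(-5.0)]
```

With a 20-point logit gap the non-peak bins contribute on the order of 1e-8, which is why the tests can use `atol=1e-3`.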
@@ -22,12 +22,14 @@ import pytest
 import torch

 pytest.importorskip("datasets", reason="datasets is required (install lerobot[dataset])")
+pytest.importorskip("grpc")

 from torch.multiprocessing import Event, Queue

-from lerobot.configs.train import TrainRLServerPipelineConfig
-from lerobot.policies.sac.configuration_sac import SACConfig
-from lerobot.utils.constants import OBS_STR
+from lerobot.configs.types import FeatureType, PolicyFeature
+from lerobot.policies.gaussian_actor.configuration_gaussian_actor import GaussianActorConfig
+from lerobot.rl.train_rl import TrainRLServerPipelineConfig
+from lerobot.utils.constants import ACTION, OBS_STATE, OBS_STR
 from lerobot.utils.transition import Transition
 from tests.utils import skip_if_package_missing

@@ -79,7 +81,7 @@ def cfg():

    port = find_free_port()

-    policy_cfg = SACConfig()
+    policy_cfg = GaussianActorConfig()
    policy_cfg.actor_learner_config.learner_host = "127.0.0.1"
    policy_cfg.actor_learner_config.learner_port = port
    policy_cfg.concurrency.actor = "threads"
@@ -299,3 +301,164 @@ def test_end_to_end_parameters_flow(cfg, data_size):
    assert received_params.keys() == input_params.keys()
    for key in input_params:
        assert torch.allclose(received_params[key], input_params[key])
+
+
+def test_learner_algorithm_wiring():
+    """Verify that make_algorithm constructs an SACAlgorithm from config,
+    make_optimizers_and_scheduler() creates the right optimizers, update() works, and
+    get_weights() output is serializable."""
+    from lerobot.policies.gaussian_actor.modeling_gaussian_actor import GaussianActorPolicy
+    from lerobot.rl.algorithms.factory import make_algorithm
+    from lerobot.rl.algorithms.sac import SACAlgorithm, SACAlgorithmConfig
+    from lerobot.transport.utils import state_to_bytes
+
+    state_dim = 10
+    action_dim = 6
+
+    sac_cfg = GaussianActorConfig(
+        input_features={OBS_STATE: PolicyFeature(type=FeatureType.STATE, shape=(state_dim,))},
+        output_features={ACTION: PolicyFeature(type=FeatureType.ACTION, shape=(action_dim,))},
+        dataset_stats={
+            OBS_STATE: {"min": [0.0] * state_dim, "max": [1.0] * state_dim},
+            ACTION: {"min": [0.0] * action_dim, "max": [1.0] * action_dim},
+        },
+    )
+    sac_cfg.validate_features()
+
+    policy = GaussianActorPolicy(config=sac_cfg)
+    policy.train()
+
+    algorithm = make_algorithm(cfg=SACAlgorithmConfig.from_policy_config(sac_cfg), policy=policy)
+    assert isinstance(algorithm, SACAlgorithm)
+
+    optimizers = algorithm.make_optimizers_and_scheduler()
+    assert "actor" in optimizers
+    assert "critic" in optimizers
+    assert "temperature" in optimizers
+
+    batch_size = 4
+
+    def batch_iterator():
+        while True:
+            yield {
+                ACTION: torch.randn(batch_size, action_dim),
+                "reward": torch.randn(batch_size),
+                "state": {OBS_STATE: torch.randn(batch_size, state_dim)},
+                "next_state": {OBS_STATE: torch.randn(batch_size, state_dim)},
+                "done": torch.zeros(batch_size),
+                "complementary_info": {},
+            }
+
+    stats = algorithm.update(batch_iterator())
+    assert "loss_critic" in stats.losses
+
+    # get_weights -> state_to_bytes round-trip
+    weights = algorithm.get_weights()
+    assert len(weights) > 0
+    serialized = state_to_bytes(weights)
+    assert isinstance(serialized, bytes)
+    assert len(serialized) > 0
+
+    # RLTrainer with DataMixer
+    from lerobot.rl.buffer import ReplayBuffer
+    from lerobot.rl.data_sources import OnlineOfflineMixer
+    from lerobot.rl.trainer import RLTrainer
+
+    replay_buffer = ReplayBuffer(
+        capacity=50,
+        device="cpu",
+        state_keys=[OBS_STATE],
+        storage_device="cpu",
+        use_drq=False,
+    )
+    for _ in range(50):
+        replay_buffer.add(
+            state={OBS_STATE: torch.randn(state_dim)},
+            action=torch.randn(action_dim),
+            reward=1.0,
+            next_state={OBS_STATE: torch.randn(state_dim)},
+            done=False,
+            truncated=False,
+        )
+    data_mixer = OnlineOfflineMixer(online_buffer=replay_buffer, offline_buffer=None)
+    trainer = RLTrainer(
+        algorithm=algorithm,
+        data_mixer=data_mixer,
+        batch_size=batch_size,
+    )
+    trainer_stats = trainer.training_step()
+    assert "loss_critic" in trainer_stats.losses
+
+
+def test_initial_and_periodic_weight_push_consistency():
+    """Both initial and periodic weight pushes should use algorithm.get_weights()
+    and produce identical structures."""
+    from lerobot.policies.gaussian_actor.modeling_gaussian_actor import GaussianActorPolicy
+    from lerobot.rl.algorithms.factory import make_algorithm
+    from lerobot.rl.algorithms.sac import SACAlgorithmConfig
+    from lerobot.transport.utils import bytes_to_state_dict, state_to_bytes
+
+    state_dim = 10
+    action_dim = 6
+    sac_cfg = GaussianActorConfig(
+        input_features={OBS_STATE: PolicyFeature(type=FeatureType.STATE, shape=(state_dim,))},
+        output_features={ACTION: PolicyFeature(type=FeatureType.ACTION, shape=(action_dim,))},
+        dataset_stats={
+            OBS_STATE: {"min": [0.0] * state_dim, "max": [1.0] * state_dim},
+            ACTION: {"min": [0.0] * action_dim, "max": [1.0] * action_dim},
+        },
+    )
+    sac_cfg.validate_features()
+
+    policy = GaussianActorPolicy(config=sac_cfg)
+    policy.train()
+    algorithm = make_algorithm(cfg=SACAlgorithmConfig.from_policy_config(sac_cfg), policy=policy)
+    algorithm.make_optimizers_and_scheduler()
+
+    # Simulate initial push (same code path the learner now uses)
+    initial_weights = algorithm.get_weights()
+    initial_bytes = state_to_bytes(initial_weights)
+
+    # Simulate periodic push
+    periodic_weights = algorithm.get_weights()
+    periodic_bytes = state_to_bytes(periodic_weights)
+
+    initial_decoded = bytes_to_state_dict(initial_bytes)
+    periodic_decoded = bytes_to_state_dict(periodic_bytes)
+
+    assert initial_decoded.keys() == periodic_decoded.keys()
+
+
+def test_actor_side_algorithm_select_action_and_load_weights():
+    """Simulate actor: create algorithm without optimizers, select_action, load_weights."""
+    from lerobot.policies.gaussian_actor.modeling_gaussian_actor import GaussianActorPolicy
+    from lerobot.rl.algorithms.factory import make_algorithm
+    from lerobot.rl.algorithms.sac import SACAlgorithm, SACAlgorithmConfig
+
+    state_dim = 10
+    action_dim = 6
+    sac_cfg = GaussianActorConfig(
+        input_features={OBS_STATE: PolicyFeature(type=FeatureType.STATE, shape=(state_dim,))},
+        output_features={ACTION: PolicyFeature(type=FeatureType.ACTION, shape=(action_dim,))},
+        dataset_stats={
+            OBS_STATE: {"min": [0.0] * state_dim, "max": [1.0] * state_dim},
+            ACTION: {"min": [0.0] * action_dim, "max": [1.0] * action_dim},
+        },
+    )
+    sac_cfg.validate_features()
+
+    # Actor side: no optimizers
+    policy = GaussianActorPolicy(config=sac_cfg)
+    policy.eval()
+    algorithm = make_algorithm(cfg=SACAlgorithmConfig.from_policy_config(sac_cfg), policy=policy)
+    assert isinstance(algorithm, SACAlgorithm)
+    assert algorithm.optimizers == {}
+
+    # select_action should work
+    obs = {OBS_STATE: torch.randn(state_dim)}
+    action = policy.select_action(obs)
+    assert action.shape == (action_dim,)
+
+    # Simulate receiving weights from learner
+    fake_weights = algorithm.get_weights()
+    algorithm.load_weights(fake_weights, device="cpu")
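Note on `OnlineOfflineMixer`: the tests construct it with `offline_buffer=None` and expect full-size batches drawn from the online buffer alone, while the two-buffer case with `online_ratio=0.5` expects a batch of the requested size split roughly evenly. The sketch below shows one way such a split could work. `split_batch` and `sample_mixed` are hypothetical helpers over plain lists, and the rounding rule (`int(batch_size * online_ratio)`) is an assumption, not the library's confirmed behavior.

```python
import random


def split_batch(batch_size, online_ratio, has_offline):
    """Return (n_online, n_offline) for one mixed batch (hypothetical helper)."""
    if not has_offline:
        return batch_size, 0  # nothing to mix with: the whole batch is online
    n_online = int(batch_size * online_ratio)
    return n_online, batch_size - n_online


def sample_mixed(online, offline, batch_size, online_ratio=0.5, rng=random):
    """Draw a batch from two lists of transitions, sampling with replacement."""
    n_online, n_offline = split_batch(batch_size, online_ratio, offline is not None)
    batch = rng.choices(online, k=n_online)
    if n_offline:
        batch += rng.choices(offline, k=n_offline)
    return batch


online_transitions = list(range(50))
offline_transitions = list(range(50, 100))

online_only = sample_mixed(online_transitions, None, batch_size=8)
mixed = sample_mixed(online_transitions, offline_transitions, batch_size=10)
```

Giving the remainder to the offline side keeps the total at exactly `batch_size` even when `batch_size * online_ratio` is not an integer.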
@@ -0,0 +1,89 @@
+# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Tests for RL data mixing (DataMixer, OnlineOfflineMixer)."""
+
+import pytest
+
+pytest.importorskip("datasets", reason="datasets is required (install lerobot[dataset])")
+
+import torch  # noqa: E402
+
+from lerobot.rl.buffer import ReplayBuffer  # noqa: E402
+from lerobot.rl.data_sources import OnlineOfflineMixer  # noqa: E402
+from lerobot.utils.constants import OBS_STATE  # noqa: E402
+
+
+def _make_buffer(capacity: int = 100, state_dim: int = 4) -> ReplayBuffer:
+    buf = ReplayBuffer(
+        capacity=capacity,
+        device="cpu",
+        state_keys=[OBS_STATE],
+        storage_device="cpu",
+        use_drq=False,
+    )
+    for i in range(capacity):
+        buf.add(
+            state={OBS_STATE: torch.randn(state_dim)},
+            action=torch.randn(2),
+            reward=1.0,
+            next_state={OBS_STATE: torch.randn(state_dim)},
+            done=(i % 10 == 9),
+            truncated=False,
+        )
+    return buf
+
+
+def test_online_only_mixer_sample():
+    """OnlineOfflineMixer with no offline buffer returns online-only batches."""
+    buf = _make_buffer(capacity=50)
+    mixer = OnlineOfflineMixer(online_buffer=buf, offline_buffer=None, online_ratio=0.5)
+    batch = mixer.sample(batch_size=8)
+    assert batch["state"][OBS_STATE].shape[0] == 8
+    assert batch["action"].shape[0] == 8
+    assert batch["reward"].shape[0] == 8
+
+
+def test_online_only_mixer_ratio_one():
+    """OnlineOfflineMixer with online_ratio=1.0 and no offline buffer is equivalent to online-only."""
+    buf = _make_buffer(capacity=50)
+    mixer = OnlineOfflineMixer(online_buffer=buf, offline_buffer=None, online_ratio=1.0)
+    batch = mixer.sample(batch_size=10)
+    assert batch["state"][OBS_STATE].shape[0] == 10
+
+
+def test_online_offline_mixer_sample():
+    """OnlineOfflineMixer with two buffers returns concatenated batches."""
+    online = _make_buffer(capacity=50)
+    offline = _make_buffer(capacity=50)
+    mixer = OnlineOfflineMixer(
+        online_buffer=online,
+        offline_buffer=offline,
+        online_ratio=0.5,
+    )
+    batch = mixer.sample(batch_size=10)
+    assert batch["state"][OBS_STATE].shape[0] == 10
+    assert batch["action"].shape[0] == 10
+    # With online_ratio=0.5, roughly 5 samples come from each buffer; only the total size is checked.
+    assert batch["reward"].shape[0] == 10
+
+
+def test_online_offline_mixer_iterator():
+    """get_iterator yields batches of the requested size."""
+    buf = _make_buffer(capacity=50)
+    mixer = OnlineOfflineMixer(online_buffer=buf, offline_buffer=None)
+    it = mixer.get_iterator(batch_size=4, async_prefetch=False)
+    batch1 = next(it)
+    batch2 = next(it)
+    assert batch1["state"][OBS_STATE].shape[0] == 4
+    assert batch2["state"][OBS_STATE].shape[0] == 4
@@ -20,7 +20,7 @@ from queue import Queue

 import pytest

-pytest.importorskip("grpc")
+pytest.importorskip("datasets", reason="datasets is required (install lerobot[dataset])")

 from torch.multiprocessing import Queue as TorchMPQueue  # noqa: E402

@@ -0,0 +1,606 @@
+#!/usr/bin/env python
+
+# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Tests for the RL algorithm abstraction and SACAlgorithm implementation."""
+
+import pytest
+
+pytest.importorskip("datasets", reason="datasets is required (install lerobot[dataset])")
+
+import torch  # noqa: E402
+
+from lerobot.configs.types import FeatureType, PolicyFeature  # noqa: E402
+from lerobot.policies.gaussian_actor.configuration_gaussian_actor import GaussianActorConfig  # noqa: E402
+from lerobot.policies.gaussian_actor.modeling_gaussian_actor import GaussianActorPolicy  # noqa: E402
+from lerobot.rl.algorithms.configs import RLAlgorithmConfig, TrainingStats  # noqa: E402
+from lerobot.rl.algorithms.factory import make_algorithm  # noqa: E402
+from lerobot.rl.algorithms.sac import SACAlgorithm, SACAlgorithmConfig  # noqa: E402
+from lerobot.utils.constants import ACTION, OBS_IMAGE, OBS_STATE  # noqa: E402
+from lerobot.utils.random_utils import set_seed  # noqa: E402
+
+# ---------------------------------------------------------------------------
+# Helpers (reuse patterns from tests/policies/test_gaussian_actor_policy.py)
+# ---------------------------------------------------------------------------
+
+
+@pytest.fixture(autouse=True)
+def set_random_seed():
+    set_seed(42)
+
+
+def _make_sac_config(
+    state_dim: int = 10,
+    action_dim: int = 6,
+    num_discrete_actions: int | None = None,
+    with_images: bool = False,
+) -> GaussianActorConfig:
+    config = GaussianActorConfig(
+        input_features={OBS_STATE: PolicyFeature(type=FeatureType.STATE, shape=(state_dim,))},
+        output_features={ACTION: PolicyFeature(type=FeatureType.ACTION, shape=(action_dim,))},
+        dataset_stats={
+            OBS_STATE: {"min": [0.0] * state_dim, "max": [1.0] * state_dim},
+            ACTION: {"min": [0.0] * action_dim, "max": [1.0] * action_dim},
+        },
+        num_discrete_actions=num_discrete_actions,
+    )
+    if with_images:
+        config.input_features[OBS_IMAGE] = PolicyFeature(type=FeatureType.VISUAL, shape=(3, 84, 84))
+        config.dataset_stats[OBS_IMAGE] = {
+            "mean": torch.randn(3, 1, 1).tolist(),
+            "std": torch.randn(3, 1, 1).abs().tolist(),
+        }
+        config.latent_dim = 32
+        config.state_encoder_hidden_dim = 32
+    config.validate_features()
+    return config
+
+
+def _make_algorithm(
+    state_dim: int = 10,
+    action_dim: int = 6,
+    utd_ratio: int = 1,
+    policy_update_freq: int = 1,
+    num_discrete_actions: int | None = None,
+    with_images: bool = False,
+) -> tuple[SACAlgorithm, GaussianActorPolicy]:
+    sac_cfg = _make_sac_config(
+        state_dim=state_dim,
+        action_dim=action_dim,
+        num_discrete_actions=num_discrete_actions,
+        with_images=with_images,
+    )
+    policy = GaussianActorPolicy(config=sac_cfg)
+    policy.train()
+    algo_config = SACAlgorithmConfig.from_policy_config(sac_cfg)
+    algo_config.utd_ratio = utd_ratio
+    algo_config.policy_update_freq = policy_update_freq
+    algorithm = SACAlgorithm(policy=policy, config=algo_config)
+    algorithm.make_optimizers_and_scheduler()
+    return algorithm, policy
+
+
+def _make_batch(
+    batch_size: int = 4,
+    state_dim: int = 10,
+    action_dim: int = 6,
+    with_images: bool = False,
+) -> dict:
+    obs = {OBS_STATE: torch.randn(batch_size, state_dim)}
+    next_obs = {OBS_STATE: torch.randn(batch_size, state_dim)}
+    if with_images:
+        obs[OBS_IMAGE] = torch.randn(batch_size, 3, 84, 84)
+        next_obs[OBS_IMAGE] = torch.randn(batch_size, 3, 84, 84)
+    return {
+        ACTION: torch.randn(batch_size, action_dim),
+        "reward": torch.randn(batch_size),
+        "state": obs,
+        "next_state": next_obs,
+        "done": torch.zeros(batch_size),
+        "complementary_info": {},
+    }
+
+
+def _batch_iterator(**batch_kwargs):
+    """Infinite iterator that yields fresh batches (mirrors a real DataMixer iterator)."""
+    while True:
+        yield _make_batch(**batch_kwargs)
+
+
+# ===========================================================================
+# Registry / config tests
+# ===========================================================================
+
+
+def test_sac_algorithm_config_registered():
+    """SACAlgorithmConfig should be discoverable through the registry."""
+    assert "sac" in RLAlgorithmConfig.get_known_choices()
+    cls = RLAlgorithmConfig.get_choice_class("sac")
+    assert cls is SACAlgorithmConfig
+
+
+def test_sac_algorithm_config_from_policy_config():
+    """from_policy_config embeds the policy config and uses SAC defaults."""
+    sac_cfg = _make_sac_config()
+    algo_cfg = SACAlgorithmConfig.from_policy_config(sac_cfg)
+    assert algo_cfg.policy_config is sac_cfg
+    assert algo_cfg.discrete_critic_network_kwargs is sac_cfg.discrete_critic_network_kwargs
+    # Defaults come from SACAlgorithmConfig, not from the policy config.
+    assert algo_cfg.utd_ratio == 1
+    assert algo_cfg.policy_update_freq == 1
+    assert algo_cfg.grad_clip_norm == 40.0
+    assert algo_cfg.actor_lr == 3e-4
+
+
+# ===========================================================================
+# TrainingStats tests
+# ===========================================================================
+
+
+def test_training_stats_defaults():
+    stats = TrainingStats()
+    assert stats.losses == {}
+    assert stats.grad_norms == {}
+    assert stats.extra == {}
+
+
+# ===========================================================================
+# get_weights
+# ===========================================================================
+
+
+def test_get_weights_returns_policy_state_dict():
+    algorithm, policy = _make_algorithm()
+    weights = algorithm.get_weights()
+    assert "policy" in weights
+    actor_state_dict = policy.actor.state_dict()
+    for key in actor_state_dict:
+        assert key in weights["policy"]
+        assert torch.equal(weights["policy"][key].cpu(), actor_state_dict[key].cpu())
+
+
+def test_get_weights_includes_discrete_critic_when_present():
+    algorithm, _ = _make_algorithm(num_discrete_actions=3, action_dim=6)
+    weights = algorithm.get_weights()
+    assert "discrete_critic" in weights
+    assert len(weights["discrete_critic"]) > 0
+
+
+def test_get_weights_excludes_discrete_critic_when_absent():
+    algorithm, _ = _make_algorithm()
+    weights = algorithm.get_weights()
+    assert "discrete_critic" not in weights
+
+
+def test_get_weights_are_on_cpu():
+    algorithm, _ = _make_algorithm(num_discrete_actions=3, action_dim=6)
+    weights = algorithm.get_weights()
+    for group_name, state_dict in weights.items():
+        for key, tensor in state_dict.items():
+            assert tensor.device == torch.device("cpu"), f"{group_name}/{key} is not on CPU"
+
+
+# ===========================================================================
+# select_action (lives on the policy, not the algorithm)
+# ===========================================================================
+
+
+def test_select_action_returns_correct_shape():
+    action_dim = 6
+    _, policy = _make_algorithm(state_dim=10, action_dim=action_dim)
+    policy.eval()
+    obs = {OBS_STATE: torch.randn(10)}
+    action = policy.select_action(obs)
+    assert action.shape == (action_dim,)
+
+
+def test_select_action_with_discrete_critic():
+    continuous_dim = 5
+    _, policy = _make_algorithm(state_dim=10, action_dim=continuous_dim, num_discrete_actions=3)
+    policy.eval()
+    obs = {OBS_STATE: torch.randn(10)}
+    action = policy.select_action(obs)
+    assert action.shape == (continuous_dim + 1,)
+
+
+# ===========================================================================
+# update (single batch, utd_ratio=1)
+# ===========================================================================
+
+
+def test_update_returns_training_stats():
+    algorithm, _ = _make_algorithm()
+    stats = algorithm.update(_batch_iterator())
+    assert isinstance(stats, TrainingStats)
+    assert "loss_critic" in stats.losses
+    assert isinstance(stats.losses["loss_critic"], float)
+
+
+def test_update_populates_actor_and_temperature_losses():
+    """With policy_update_freq=1 and step 0, actor/temperature should be updated."""
+    algorithm, _ = _make_algorithm(policy_update_freq=1)
+    stats = algorithm.update(_batch_iterator())
+    assert "loss_actor" in stats.losses
+    assert "loss_temperature" in stats.losses
+    assert "temperature" in stats.extra
+
+
+@pytest.mark.parametrize("policy_update_freq", [2, 3])
+def test_update_skips_actor_at_non_update_steps(policy_update_freq):
+    """Actor/temperature should only update when optimization_step % freq == 0."""
+    algorithm, _ = _make_algorithm(policy_update_freq=policy_update_freq)
+    it = _batch_iterator()
+
+    # Step 0: should update actor
+    stats_0 = algorithm.update(it)
+    assert "loss_actor" in stats_0.losses
+
+    # Step 1: should NOT update actor
+    stats_1 = algorithm.update(it)
+    assert "loss_actor" not in stats_1.losses
+
+
+def test_update_increments_optimization_step():
+    algorithm, _ = _make_algorithm()
+    it = _batch_iterator()
+    assert algorithm.optimization_step == 0
+    algorithm.update(it)
+    assert algorithm.optimization_step == 1
+    algorithm.update(it)
+    assert algorithm.optimization_step == 2
+
+
+def test_update_with_discrete_critic():
+    algorithm, _ = _make_algorithm(num_discrete_actions=3, action_dim=6)
+    stats = algorithm.update(_batch_iterator(action_dim=7))  # continuous + 1 discrete
+    assert "loss_discrete_critic" in stats.losses
+    assert "discrete_critic" in stats.grad_norms
+
+
+# ===========================================================================
+# update with UTD ratio > 1
+# ===========================================================================
+
+
+@pytest.mark.parametrize("utd_ratio", [2, 4])
+def test_update_with_utd_ratio(utd_ratio):
+    algorithm, _ = _make_algorithm(utd_ratio=utd_ratio)
+    stats = algorithm.update(_batch_iterator())
+    assert isinstance(stats, TrainingStats)
+    assert "loss_critic" in stats.losses
+    assert algorithm.optimization_step == 1
+
+
+def test_update_utd_ratio_pulls_utd_batches():
+    """next(batch_iterator) should be called exactly utd_ratio times."""
+    utd_ratio = 3
+    algorithm, _ = _make_algorithm(utd_ratio=utd_ratio)
+
+    call_count = 0
+
+    def counting_iterator():
+        nonlocal call_count
+        while True:
+            call_count += 1
+            yield _make_batch()
+
+    algorithm.update(counting_iterator())
+    assert call_count == utd_ratio
+
+
+def test_update_utd_ratio_3_critic_warmup_changes_weights():
+    """With utd_ratio=3, critic weights should change after update (3 critic steps)."""
+    algorithm, _ = _make_algorithm(utd_ratio=3)
+
+    critic_params_before = {n: p.clone() for n, p in algorithm.critic_ensemble.named_parameters()}
+
+    algorithm.update(_batch_iterator())
+
+    changed = False
+    for n, p in algorithm.critic_ensemble.named_parameters():
+        if not torch.equal(p, critic_params_before[n]):
+            changed = True
+            break
+    assert changed, "Critic weights should have changed after UTD update"
+
+
+# ===========================================================================
+# get_observation_features
+# ===========================================================================
+
+
+def test_get_observation_features_returns_none_without_frozen_encoder():
+    algorithm, _ = _make_algorithm(with_images=False)
+    obs = {OBS_STATE: torch.randn(4, 10)}
+    next_obs = {OBS_STATE: torch.randn(4, 10)}
+    feat, next_feat = algorithm.get_observation_features(obs, next_obs)
+    assert feat is None
+    assert next_feat is None
+
+
+# ===========================================================================
+# optimization_step setter
+# ===========================================================================
+
+
+def test_optimization_step_can_be_set_for_resume():
+    algorithm, _ = _make_algorithm()
+    algorithm.optimization_step = 100
+    assert algorithm.optimization_step == 100
+
+
+# ===========================================================================
+# make_algorithm factory
+# ===========================================================================
+
+
+def test_make_algorithm_returns_sac_for_sac_policy():
+    sac_cfg = _make_sac_config()
+    policy = GaussianActorPolicy(config=sac_cfg)
+    algorithm = make_algorithm(cfg=SACAlgorithmConfig.from_policy_config(sac_cfg), policy=policy)
+    assert isinstance(algorithm, SACAlgorithm)
+    assert algorithm.optimizers == {}
+
+
+def test_make_optimizers_creates_expected_keys():
+    """make_optimizers_and_scheduler() should populate the algorithm with Adam optimizers."""
+    sac_cfg = _make_sac_config()
+    policy = GaussianActorPolicy(config=sac_cfg)
+    algorithm = make_algorithm(cfg=SACAlgorithmConfig.from_policy_config(sac_cfg), policy=policy)
+    optimizers = algorithm.make_optimizers_and_scheduler()
+    assert "actor" in optimizers
+    assert "critic" in optimizers
+    assert "temperature" in optimizers
+    assert all(isinstance(v, torch.optim.Adam) for v in optimizers.values())
+    assert algorithm.get_optimizers() is optimizers
+
+
+def test_actor_side_no_optimizers():
+    """Actor-side usage: no optimizers needed, make_optimizers_and_scheduler is not called."""
+    sac_cfg = _make_sac_config()
+    policy = GaussianActorPolicy(config=sac_cfg)
+    algorithm = make_algorithm(cfg=SACAlgorithmConfig.from_policy_config(sac_cfg), policy=policy)
+    assert isinstance(algorithm, SACAlgorithm)
+    assert algorithm.optimizers == {}
+
+
+def test_make_algorithm_uses_sac_algorithm_defaults():
+    """make_algorithm populates SACAlgorithmConfig with its own defaults."""
+    sac_cfg = _make_sac_config()
+    policy = GaussianActorPolicy(config=sac_cfg)
+    algorithm = make_algorithm(cfg=SACAlgorithmConfig.from_policy_config(sac_cfg), policy=policy)
+    assert algorithm.config.utd_ratio == 1
+    assert algorithm.config.policy_update_freq == 1
+    assert algorithm.config.grad_clip_norm == 40.0
+
+
+def test_unknown_algorithm_name_raises_in_registry():
+    """The ChoiceRegistry is the source of truth for unknown algorithm names."""
+    with pytest.raises(KeyError):
+        RLAlgorithmConfig.get_choice_class("unknown_algo")
+
+
+# ===========================================================================
+# load_weights (round-trip with get_weights)
+# ===========================================================================
+
+
+def test_load_weights_round_trip():
+    """get_weights -> load_weights should restore identical parameters on a fresh policy."""
+    algo_src, _ = _make_algorithm(state_dim=10, action_dim=6)
+    algo_src.update(_batch_iterator())
+
+    sac_cfg = _make_sac_config(state_dim=10, action_dim=6)
+    policy_dst = GaussianActorPolicy(config=sac_cfg)
+    algo_dst = SACAlgorithm(policy=policy_dst, config=algo_src.config)
+
+    weights = algo_src.get_weights()
+    algo_dst.load_weights(weights, device="cpu")
+
+    dst_actor_state_dict = algo_dst.policy.actor.state_dict()
+    for key, tensor in weights["policy"].items():
+        assert torch.equal(
+            dst_actor_state_dict[key].cpu(),
+            tensor.cpu(),
+        ), f"Policy param '{key}' mismatch after load_weights"
+
+
+def test_load_weights_round_trip_with_discrete_critic():
+    algo_src, _ = _make_algorithm(num_discrete_actions=3, action_dim=6)
+    algo_src.update(_batch_iterator(action_dim=7))
+
+    sac_cfg = _make_sac_config(num_discrete_actions=3, action_dim=6)
+    policy_dst = GaussianActorPolicy(config=sac_cfg)
+    algo_dst = SACAlgorithm(policy=policy_dst, config=algo_src.config)
+
+    weights = algo_src.get_weights()
+    algo_dst.load_weights(weights, device="cpu")
+
+    assert "discrete_critic" in weights
+    assert len(weights["discrete_critic"]) > 0
+    dst_discrete_critic_state_dict = algo_dst.policy.discrete_critic.state_dict()
+    for key, tensor in weights["discrete_critic"].items():
+        assert torch.equal(
+            dst_discrete_critic_state_dict[key].cpu(),
+            tensor.cpu(),
+        ), f"Discrete critic param '{key}' mismatch after load_weights"
+
+
+def test_load_weights_ignores_missing_discrete_critic():
+    """load_weights should not fail when weights lack discrete_critic on a non-discrete policy."""
+    algorithm, _ = _make_algorithm()
+    weights = algorithm.get_weights()
+    algorithm.load_weights(weights, device="cpu")
+
+
+def test_actor_side_weight_sync_with_discrete_critic():
+    """End-to-end: learner ``algorithm.get_weights()`` -> actor ``algorithm.load_weights()``."""
+    # Learner side: train the source algorithm so its weights diverge from init.
+    algo_src, _ = _make_algorithm(num_discrete_actions=3, action_dim=6)
+    algo_src.update(_batch_iterator(action_dim=7))
+    weights = algo_src.get_weights()
+
+    # Actor side: fresh policy + fresh algorithm holding it.
+    sac_cfg = _make_sac_config(num_discrete_actions=3, action_dim=6)
+    policy_actor = GaussianActorPolicy(config=sac_cfg)
+    algo_actor = SACAlgorithm(
+        policy=policy_actor,
+        config=SACAlgorithmConfig.from_policy_config(sac_cfg),
+    )
+
+    # Snapshot initial actor state for the "did it change?" assertion below.
+    initial_discrete_critic_state_dict = {
+        k: v.clone() for k, v in policy_actor.discrete_critic.state_dict().items()
+    }
+
+    algo_actor.load_weights(weights, device="cpu")
+
+    # Actor weights match the learner's exported actor state dict.
+    actor_state_dict = policy_actor.actor.state_dict()
+    for key, tensor in weights["policy"].items():
+        assert torch.equal(actor_state_dict[key].cpu(), tensor.cpu()), (
+            f"Actor param '{key}' not synced by algorithm.load_weights"
+        )
+
+    # Discrete critic weights match the learner's exported discrete critic.
+    discrete_critic_state_dict = policy_actor.discrete_critic.state_dict()
+    for key, tensor in weights["discrete_critic"].items():
+        assert torch.equal(discrete_critic_state_dict[key].cpu(), tensor.cpu()), (
+            f"Discrete critic param '{key}' not synced by algorithm.load_weights"
+        )
+
+    # Sanity: the discrete critic actually changed (otherwise the sync is trivial).
+    changed = any(
+        not torch.equal(initial_discrete_critic_state_dict[key], discrete_critic_state_dict[key])
+        for key in initial_discrete_critic_state_dict
+        if key in discrete_critic_state_dict
+    )
+    assert changed, "Discrete critic weights did not change between init and after sync"
+
+
+# ===========================================================================
+# TrainingStats generic losses dict
+# ===========================================================================
+
+
+def test_training_stats_generic_losses():
+    stats = TrainingStats(
+        losses={"loss_bc": 0.5, "loss_q": 1.2},
+        extra={"temperature": 0.1},
+    )
+    assert stats.losses["loss_bc"] == 0.5
+    assert stats.losses["loss_q"] == 1.2
+    assert stats.extra["temperature"] == 0.1
+
+
+# ===========================================================================
+# Registry-driven make_algorithm
+# ===========================================================================
+
+
+def test_make_algorithm_builds_sac():
+    """make_algorithm should look up the SAC class from the registry and instantiate it."""
+    sac_cfg = _make_sac_config()
+    algo_config = SACAlgorithmConfig.from_policy_config(sac_cfg)
+    algo_config.utd_ratio = 2
+    policy = GaussianActorPolicy(config=sac_cfg)
+
+    algorithm = make_algorithm(cfg=algo_config, policy=policy)
+    assert isinstance(algorithm, SACAlgorithm)
+    assert algorithm.config.utd_ratio == 2
+
+
+# ===========================================================================
+# state_dict / load_state_dict (algorithm-side resume)
+# ===========================================================================
+
+
+def test_state_dict_contains_algorithm_owned_tensors():
+    """state_dict should pack critics, target networks, and log_alpha (no encoder bloat)."""
+    algorithm, _ = _make_algorithm()
+    sd = algorithm.state_dict()
+
+    assert "log_alpha" in sd
+    assert any(k.startswith("critic_ensemble.") for k in sd)
+    assert any(k.startswith("critic_target.") for k in sd)
+    # encoder weights live on the policy and must not be duplicated here.
+    assert not any(".encoder." in k for k in sd)
+
+
+def test_state_dict_includes_discrete_critic_target_when_present():
+    algorithm, _ = _make_algorithm(num_discrete_actions=3, action_dim=6)
+    sd = algorithm.state_dict()
+    assert any(k.startswith("discrete_critic_target.") for k in sd)
+
+
+def test_load_state_dict_round_trip_restores_critics_and_log_alpha():
+    """state_dict -> load_state_dict on a fresh algorithm restores all tensors within tolerance."""
+    sac_cfg = _make_sac_config(num_discrete_actions=3, action_dim=6)
+    src_policy = GaussianActorPolicy(config=sac_cfg)
+    src = SACAlgorithm(policy=src_policy, config=SACAlgorithmConfig.from_policy_config(sac_cfg))
+    src.make_optimizers_and_scheduler()
+    # Train a few steps so weights diverge from init (action_dim=7 = 6 continuous + 1 discrete).
+    src.update(_batch_iterator(action_dim=7))
+    src.update(_batch_iterator(action_dim=7))
+
+    dst_policy = GaussianActorPolicy(config=sac_cfg)
+    dst = SACAlgorithm(policy=dst_policy, config=SACAlgorithmConfig.from_policy_config(sac_cfg))
+    dst.make_optimizers_and_scheduler()
+
+    src_sd = src.state_dict()
+    dst.load_state_dict(src_sd)
+    dst_sd = dst.state_dict()
+
+    assert set(dst_sd) == set(src_sd)
+    for key in src_sd:
+        assert torch.allclose(src_sd[key].cpu(), dst_sd[key].cpu()), f"{key} mismatch after round-trip"
+
+
+def test_load_state_dict_preserves_log_alpha_parameter_identity():
+    """The temperature optimizer holds a reference to log_alpha; identity must survive load."""
+    algorithm, _ = _make_algorithm()
+    log_alpha_id_before = id(algorithm.log_alpha)
+    optimizer_param_id = id(algorithm.optimizers["temperature"].param_groups[0]["params"][0])
+    assert log_alpha_id_before == optimizer_param_id
+
+    new_state = algorithm.state_dict()
+    new_state["log_alpha"] = torch.tensor([0.42])
+    algorithm.load_state_dict(new_state)
+
+    assert id(algorithm.log_alpha) == log_alpha_id_before
+    assert id(algorithm.optimizers["temperature"].param_groups[0]["params"][0]) == log_alpha_id_before
+    assert torch.allclose(algorithm.log_alpha.detach().cpu(), torch.tensor([0.42]))
+
+
+def test_save_pretrained_round_trip_via_disk(tmp_path):
+    """End-to-end: save_pretrained -> from_pretrained restores tensors and config."""
+    sac_cfg = _make_sac_config()
+    src_policy = GaussianActorPolicy(config=sac_cfg)
+    src = SACAlgorithm(policy=src_policy, config=SACAlgorithmConfig.from_policy_config(sac_cfg))
+    src.make_optimizers_and_scheduler()
+    src.update(_batch_iterator())
+
+    save_dir = tmp_path / "algorithm"
+    src.save_pretrained(save_dir)
+    assert (save_dir / "model.safetensors").is_file()
+    assert (save_dir / "config.json").is_file()
+
+    dst_policy = GaussianActorPolicy(config=sac_cfg)
+    dst = SACAlgorithm.from_pretrained(save_dir, policy=dst_policy)
+
+    src_sd = src.state_dict()
+    dst_sd = dst.state_dict()
+    assert set(src_sd) == set(dst_sd)
+    for key in src_sd:
+        assert torch.allclose(src_sd[key].cpu(), dst_sd[key].cpu()), f"{key} mismatch after disk round-trip"
@@ -0,0 +1,133 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import pytest
+
+pytest.importorskip("datasets", reason="datasets is required (install lerobot[dataset])")
+
+import torch  # noqa: E402
+from torch import Tensor  # noqa: E402
+
+from lerobot.rl.algorithms.base import RLAlgorithm  # noqa: E402
+from lerobot.rl.algorithms.configs import TrainingStats  # noqa: E402
+from lerobot.rl.trainer import RLTrainer  # noqa: E402
+from lerobot.utils.constants import ACTION, OBS_STATE  # noqa: E402
+
+
+class _DummyRLAlgorithmConfig:
+    """Dummy config for testing."""
+
+
+class _DummyRLAlgorithm(RLAlgorithm):
+    config_class = _DummyRLAlgorithmConfig
+    name = "dummy_rl_algorithm"
+
+    def __init__(self):
+        self.configure_calls = 0
+        self.update_calls = 0
+
+    def select_action(self, observation: dict[str, Tensor]) -> Tensor:
+        return torch.zeros(1)
+
+    def configure_data_iterator(
+        self,
+        data_mixer,
+        batch_size: int,
+        *,
+        async_prefetch: bool = True,
+        queue_size: int = 2,
+    ):
+        self.configure_calls += 1
+        return data_mixer.get_iterator(
+            batch_size=batch_size,
+            async_prefetch=async_prefetch,
+            queue_size=queue_size,
+        )
+
+    def make_optimizers_and_scheduler(self):
+        return {}
+
+    def update(self, batch_iterator):
+        self.update_calls += 1
+        _ = next(batch_iterator)
+        return TrainingStats(losses={"dummy": 1.0})
+
+    def load_weights(self, weights, device="cpu") -> None:
+        _ = (weights, device)
+
+    def state_dict(self) -> dict[str, torch.Tensor]:
+        return {}
+
+    def load_state_dict(self, state_dict, device="cpu") -> None:
+        _ = (state_dict, device)
+
+
+class _SimpleMixer:
+    def get_iterator(self, batch_size: int, async_prefetch: bool = True, queue_size: int = 2):
+        _ = (async_prefetch, queue_size)
+        while True:
+            yield {
+                "state": {OBS_STATE: torch.randn(batch_size, 3)},
+                ACTION: torch.randn(batch_size, 2),
+                "reward": torch.randn(batch_size),
+                "next_state": {OBS_STATE: torch.randn(batch_size, 3)},
+                "done": torch.zeros(batch_size),
+                "truncated": torch.zeros(batch_size),
+                "complementary_info": None,
+            }
+
+
+def test_trainer_lazy_iterator_lifecycle_and_reset():
+    algo = _DummyRLAlgorithm()
+    mixer = _SimpleMixer()
+    trainer = RLTrainer(algorithm=algo, data_mixer=mixer, batch_size=4)
+
+    # First call builds iterator once.
+    trainer.training_step()
+    assert algo.configure_calls == 1
+    assert algo.update_calls == 1
+
+    # Second call reuses existing iterator.
+    trainer.training_step()
+    assert algo.configure_calls == 1
+    assert algo.update_calls == 2
+
+    # Explicit reset forces lazy rebuild on next step.
+    trainer.reset_data_iterator()
+    trainer.training_step()
+    assert algo.configure_calls == 2
+    assert algo.update_calls == 3
+
+
+def test_trainer_set_data_mixer_with_reset_rebuilds_iterator():
+    algo = _DummyRLAlgorithm()
+    mixer_a = _SimpleMixer()
+    mixer_b = _SimpleMixer()
+    trainer = RLTrainer(algorithm=algo, data_mixer=mixer_a, batch_size=2)
+
+    trainer.training_step()
+    assert algo.configure_calls == 1
+
+    trainer.set_data_mixer(mixer_b, reset=True)
+    trainer.training_step()
+    assert algo.configure_calls == 2
+
+
+def test_algorithm_optimization_step_contract_defaults():
+    algo = _DummyRLAlgorithm()
+    assert algo.optimization_step == 0
+    algo.optimization_step = 11
+    assert algo.optimization_step == 11
@@ -0,0 +1,218 @@
+"""Tests for policy.path support in YAML config files (issue #2957)."""
+
+import json
+import tempfile
+
+import yaml
+
+from lerobot.configs.parser import (
+    _config_path_args,
+    _config_yaml_overrides,
+    _flatten_to_cli_args,
+    extract_path_fields_from_config,
+    get_path_arg,
+    get_yaml_overrides,
+)
+
+
+def test_extract_path_fields_from_yaml():
+    """Test that policy.path is extracted from a YAML config and removed."""
+    config = {
+        "dataset": {"repo_id": "lerobot/pusht"},
+        "policy": {"type": "smolvla", "path": "lerobot/smolvla_base", "push_to_hub": False},
+    }
+    with tempfile.NamedTemporaryFile(mode="w", suffix=".yaml", delete=False) as f:
+        yaml.dump(config, f)
+        config_path = f.name
+
+    _config_path_args.clear()
+    cleaned_path = extract_path_fields_from_config(config_path, ["policy"])
+
+    # Path should be extracted and stored
+    assert _config_path_args["policy"] == "lerobot/smolvla_base"
+
+    # Cleaned config should not have the path field
+    with open(cleaned_path) as f:
+        cleaned = yaml.safe_load(f)
+    assert "path" not in cleaned["policy"]
+    assert cleaned["policy"]["type"] == "smolvla"
+    assert cleaned["policy"]["push_to_hub"] is False
+
+    # Original dataset should be untouched
+    assert cleaned["dataset"]["repo_id"] == "lerobot/pusht"
+
+    _config_path_args.clear()
+
+
+def test_extract_path_fields_from_json():
+    """Test that policy.path is extracted from a JSON config."""
+    config = {
+        "policy": {"type": "act", "path": "some/local/path"},
+    }
+    with tempfile.NamedTemporaryFile(mode="w", suffix=".json", delete=False) as f:
+        json.dump(config, f)
+        config_path = f.name
+
+    _config_path_args.clear()
+    cleaned_path = extract_path_fields_from_config(config_path, ["policy"])
+
+    assert _config_path_args["policy"] == "some/local/path"
+
+    with open(cleaned_path) as f:
+        cleaned = json.load(f)
+    assert "path" not in cleaned["policy"]
+
+    _config_path_args.clear()
+
+
+def test_extract_no_path_returns_original():
+    """Test that configs without path fields are returned unchanged."""
+    config = {
+        "dataset": {"repo_id": "lerobot/pusht"},
+        "policy": {"type": "smolvla"},
+    }
+    with tempfile.NamedTemporaryFile(mode="w", suffix=".yaml", delete=False) as f:
+        yaml.dump(config, f)
+        config_path = f.name
+
+    _config_path_args.clear()
+    result = extract_path_fields_from_config(config_path, ["policy"])
+
+    assert result == config_path
+    assert "policy" not in _config_path_args
+
+    _config_path_args.clear()
+
+
+def test_extract_removes_empty_field():
+    """Test that the field dict is removed entirely if path was the only key."""
+    config = {
+        "dataset": {"repo_id": "lerobot/pusht"},
+        "policy": {"path": "lerobot/smolvla_base"},
+    }
+    with tempfile.NamedTemporaryFile(mode="w", suffix=".yaml", delete=False) as f:
+        yaml.dump(config, f)
+        config_path = f.name
+
+    _config_path_args.clear()
+    cleaned_path = extract_path_fields_from_config(config_path, ["policy"])
+
+    assert _config_path_args["policy"] == "lerobot/smolvla_base"
+
+    with open(cleaned_path) as f:
+        cleaned = yaml.safe_load(f)
+    assert "policy" not in cleaned
+
+    _config_path_args.clear()
+
+
+def test_get_path_arg_fallback():
+    """Test that get_path_arg falls back to _config_path_args when CLI has no path."""
+    _config_path_args.clear()
+    _config_path_args["policy"] = "lerobot/smolvla_base"
+
+    # No CLI args with --policy.path
+    result = get_path_arg("policy", args=[])
+    assert result == "lerobot/smolvla_base"
+
+    _config_path_args.clear()
+
+
+def test_get_path_arg_cli_takes_precedence():
+    """Test that CLI --policy.path takes precedence over YAML config path."""
+    _config_path_args.clear()
+    _config_path_args["policy"] = "yaml/path"
+
+    result = get_path_arg("policy", args=["--policy.path=cli/path"])
+    assert result == "cli/path"
+
+    _config_path_args.clear()
+
+
+def test_yaml_overrides_captured():
+    """Test that non-path policy fields are captured as CLI-style overrides."""
+    config = {
+        "policy": {"path": "lerobot/smolvla_base", "lr": 1e-4, "batch_size": 32},
+    }
+    with tempfile.NamedTemporaryFile(mode="w", suffix=".yaml", delete=False) as f:
+        yaml.dump(config, f)
+        config_path = f.name
+
+    _config_path_args.clear()
+    _config_yaml_overrides.clear()
+    extract_path_fields_from_config(config_path, ["policy"])
+
+    overrides = get_yaml_overrides("policy")
+    assert any("lr=" in o for o in overrides)
+    assert any("batch_size=32" in o for o in overrides)
+
+    _config_path_args.clear()
+    _config_yaml_overrides.clear()
+
+
+def test_yaml_overrides_excludes_type_and_path():
+    """Test that type and path fields are not included in YAML overrides."""
+    config = {
+        "policy": {"path": "lerobot/smolvla_base", "type": "smolvla", "lr": 5e-5},
+    }
+    with tempfile.NamedTemporaryFile(mode="w", suffix=".yaml", delete=False) as f:
+        yaml.dump(config, f)
+        config_path = f.name
+
+    _config_path_args.clear()
+    _config_yaml_overrides.clear()
+    extract_path_fields_from_config(config_path, ["policy"])
+
+    overrides = get_yaml_overrides("policy")
+    assert not any("path=" in o for o in overrides)
+    assert not any("type=" in o for o in overrides)
+    assert any("lr=" in o for o in overrides)
+
+    _config_path_args.clear()
+    _config_yaml_overrides.clear()
+
+
+def test_get_yaml_overrides_empty_when_path_only():
+    """Test that get_yaml_overrides returns [] when policy had only a path field."""
+    config = {
+        "policy": {"path": "lerobot/smolvla_base"},
+    }
+    with tempfile.NamedTemporaryFile(mode="w", suffix=".yaml", delete=False) as f:
+        yaml.dump(config, f)
+        config_path = f.name
+
+    _config_path_args.clear()
+    _config_yaml_overrides.clear()
+    extract_path_fields_from_config(config_path, ["policy"])
+
+    assert get_yaml_overrides("policy") == []
+
+    _config_path_args.clear()
+    _config_yaml_overrides.clear()
+
+
+def test_flatten_bool_values():
+    """Test that boolean values are serialized as lowercase strings for draccus."""
+    d = {"push_to_hub": True, "use_rabc": False, "lr": 0.001, "name": "test"}
+    args = _flatten_to_cli_args(d)
+    assert "--push_to_hub=true" in args
+    assert "--use_rabc=false" in args
+    assert "--lr=0.001" in args
+    assert "--name=test" in args
+
+
+def test_flatten_none_values_skipped():
+    """Test that None values are not included in flattened args."""
+    d = {"lr": 0.001, "path_override": None, "name": "test"}
+    args = _flatten_to_cli_args(d)
+    assert any("lr=" in a for a in args)
+    assert any("name=" in a for a in args)
+    assert not any("path_override" in a for a in args)
+
+
+def test_flatten_nested_with_bools():
+    """Test that bools in nested dicts are handled correctly."""
+    d = {"optimizer": {"use_warmup": True, "lr": 0.01}}
+    args = _flatten_to_cli_args(d)
+    assert "--optimizer.use_warmup=true" in args
+    assert "--optimizer.lr=0.01" in args
@@ -22,7 +22,7 @@ from unittest.mock import patch

 import pytest

-pytest.importorskip("grpc")
+pytest.importorskip("datasets", reason="datasets is required (install lerobot[dataset])")

 from lerobot.utils.process import ProcessSignalHandler  # noqa: E402

@@ -19,7 +19,6 @@ from collections.abc import Callable

 import pytest

-pytest.importorskip("grpc")
 pytest.importorskip("datasets", reason="datasets is required (install lerobot[dataset])")

 import torch  # noqa: E402
@@ -2,27 +2,35 @@ version = 1
 revision = 3
 requires-python = ">=3.12"
 resolution-markers = [
-    "python_full_version >= '3.15' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'",
+    "(python_full_version >= '3.15' and platform_machine == 'AMD64' and sys_platform == 'linux') or (python_full_version >= '3.15' and platform_machine == 'x86_64' and sys_platform == 'linux')",
+    "(python_full_version == '3.14.*' and platform_machine == 'AMD64' and sys_platform == 'linux') or (python_full_version == '3.14.*' and platform_machine == 'x86_64' and sys_platform == 'linux')",
+    "(python_full_version == '3.13.*' and platform_machine == 'AMD64' and sys_platform == 'linux') or (python_full_version == '3.13.*' and platform_machine == 'x86_64' and sys_platform == 'linux')",
+    "(python_full_version < '3.13' and platform_machine == 'AMD64' and sys_platform == 'linux') or (python_full_version < '3.13' and platform_machine == 'x86_64' and sys_platform == 'linux')",
+    "(python_full_version >= '3.15' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version >= '3.15' and platform_machine == 'arm64' and sys_platform == 'linux')",
+    "(python_full_version == '3.14.*' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version == '3.14.*' and platform_machine == 'arm64' and sys_platform == 'linux')",
+    "(python_full_version == '3.13.*' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version == '3.13.*' and platform_machine == 'arm64' and sys_platform == 'linux')",
+    "(python_full_version < '3.13' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version < '3.13' and platform_machine == 'arm64' and sys_platform == 'linux')",
+    "python_full_version >= '3.15' and platform_machine != 'AMD64' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'linux'",
    "python_full_version >= '3.15' and platform_machine == 's390x' and sys_platform == 'linux'",
-    "python_full_version == '3.14.*' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'",
-    "python_full_version == '3.13.*' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'",
+    "python_full_version == '3.14.*' and platform_machine != 'AMD64' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'linux'",
+    "python_full_version == '3.13.*' and platform_machine != 'AMD64' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'linux'",
    "python_full_version == '3.14.*' and platform_machine == 's390x' and sys_platform == 'linux'",
    "python_full_version == '3.13.*' and platform_machine == 's390x' and sys_platform == 'linux'",
-    "python_full_version < '3.13' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'",
+    "python_full_version < '3.13' and platform_machine != 'AMD64' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'linux'",
    "python_full_version < '3.13' and platform_machine == 's390x' and sys_platform == 'linux'",
-    "(python_full_version >= '3.15' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version >= '3.15' and platform_machine == 'arm64' and sys_platform == 'linux') or (python_full_version >= '3.15' and platform_machine == 'armv7l' and sys_platform == 'linux')",
-    "(python_full_version == '3.14.*' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version == '3.14.*' and platform_machine == 'arm64' and sys_platform == 'linux') or (python_full_version == '3.14.*' and platform_machine == 'armv7l' and sys_platform == 'linux')",
-    "(python_full_version == '3.13.*' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version == '3.13.*' and platform_machine == 'arm64' and sys_platform == 'linux') or (python_full_version == '3.13.*' and platform_machine == 'armv7l' and sys_platform == 'linux')",
-    "(python_full_version < '3.13' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version < '3.13' and platform_machine == 'arm64' and sys_platform == 'linux') or (python_full_version < '3.13' and platform_machine == 'armv7l' and sys_platform == 'linux')",
-    "(python_full_version >= '3.15' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'darwin') or (python_full_version >= '3.15' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
+    "python_full_version >= '3.15' and platform_machine == 'arm64' and sys_platform == 'darwin'",
+    "python_full_version == '3.14.*' and platform_machine == 'arm64' and sys_platform == 'darwin'",
+    "python_full_version == '3.13.*' and platform_machine == 'arm64' and sys_platform == 'darwin'",
+    "python_full_version < '3.13' and platform_machine == 'arm64' and sys_platform == 'darwin'",
+    "(python_full_version >= '3.15' and platform_machine != 'arm64' and platform_machine != 's390x' and sys_platform == 'darwin') or (python_full_version >= '3.15' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
    "python_full_version >= '3.15' and platform_machine == 's390x' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'",
    "python_full_version >= '3.15' and platform_machine != 's390x' and sys_platform == 'emscripten'",
    "python_full_version >= '3.15' and platform_machine == 's390x' and sys_platform == 'emscripten'",
-    "(python_full_version == '3.14.*' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'darwin') or (python_full_version == '3.14.*' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
-    "(python_full_version == '3.13.*' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'darwin') or (python_full_version == '3.13.*' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
+    "(python_full_version == '3.14.*' and platform_machine != 'arm64' and platform_machine != 's390x' and sys_platform == 'darwin') or (python_full_version == '3.14.*' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
+    "(python_full_version == '3.13.*' and platform_machine != 'arm64' and platform_machine != 's390x' and sys_platform == 'darwin') or (python_full_version == '3.13.*' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
    "python_full_version == '3.14.*' and platform_machine == 's390x' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'",
    "python_full_version == '3.13.*' and platform_machine == 's390x' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'",
-    "(python_full_version < '3.13' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'darwin') or (python_full_version < '3.13' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
+    "(python_full_version < '3.13' and platform_machine != 'arm64' and platform_machine != 's390x' and sys_platform == 'darwin') or (python_full_version < '3.13' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
    "python_full_version < '3.13' and platform_machine == 's390x' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'",
    "python_full_version == '3.14.*' and platform_machine != 's390x' and sys_platform == 'emscripten'",
    "python_full_version == '3.13.*' and platform_machine != 's390x' and sys_platform == 'emscripten'",
@@ -30,13 +38,13 @@ resolution-markers = [
    "python_full_version == '3.13.*' and platform_machine == 's390x' and sys_platform == 'emscripten'",
    "python_full_version < '3.13' and platform_machine != 's390x' and sys_platform == 'emscripten'",
    "python_full_version < '3.13' and platform_machine == 's390x' and sys_platform == 'emscripten'",
-    "(python_full_version >= '3.15' and platform_machine == 'x86_64' and sys_platform == 'darwin') or (python_full_version >= '3.15' and platform_machine != 's390x' and sys_platform == 'win32')",
+    "python_full_version >= '3.15' and platform_machine != 's390x' and sys_platform == 'win32'",
    "python_full_version >= '3.15' and platform_machine == 's390x' and sys_platform == 'win32'",
-    "(python_full_version == '3.14.*' and platform_machine == 'x86_64' and sys_platform == 'darwin') or (python_full_version == '3.14.*' and platform_machine != 's390x' and sys_platform == 'win32')",
-    "(python_full_version == '3.13.*' and platform_machine == 'x86_64' and sys_platform == 'darwin') or (python_full_version == '3.13.*' and platform_machine != 's390x' and sys_platform == 'win32')",
+    "python_full_version == '3.14.*' and platform_machine != 's390x' and sys_platform == 'win32'",
+    "python_full_version == '3.13.*' and platform_machine != 's390x' and sys_platform == 'win32'",
    "python_full_version == '3.14.*' and platform_machine == 's390x' and sys_platform == 'win32'",
    "python_full_version == '3.13.*' and platform_machine == 's390x' and sys_platform == 'win32'",
-    "(python_full_version < '3.13' and platform_machine == 'x86_64' and sys_platform == 'darwin') or (python_full_version < '3.13' and platform_machine != 's390x' and sys_platform == 'win32')",
+    "python_full_version < '3.13' and platform_machine != 's390x' and sys_platform == 'win32'",
    "python_full_version < '3.13' and platform_machine == 's390x' and sys_platform == 'win32'",
 ]

@@ -626,11 +634,11 @@ wheels = [

 [[package]]
 name = "cmeel"
-version = "0.60.0"
+version = "0.60.1"
 source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/49/2b/a178a123602cb23b737289ae24fe9213bf1002660bb89d48e5dda62b46cc/cmeel-0.60.0.tar.gz", hash = "sha256:2e6d9ae61cc94112a67814b14948dd679b353090be4b87ab04c3ccaea3aa95de", size = 14935, upload-time = "2026-05-09T16:03:35.536Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/54/46/ddc7df697e49cae32b1a97b3e1a3b47b815238f9059312f987bc62a2e756/cmeel-0.60.1.tar.gz", hash = "sha256:3e0b92eb933a3693ad3f1da8aae31defcbee5f25969daaf20e59c57d6a9474cf", size = 14972, upload-time = "2026-05-11T17:13:57.216Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/68/d4/ffd1484c68ca7489596806f830446219540dd17508818fe0d2c2fb0f4f59/cmeel-0.60.0-py3-none-any.whl", hash = "sha256:ed0672f7cebbb1143e6e29fcc0d3fd26e100ed2381b49dd15444bd1dd6d3ce0b", size = 20573, upload-time = "2026-05-09T16:03:33.827Z" },
+    { url = "https://files.pythonhosted.org/packages/da/39/f2db2ff475d42222c70fe25c737028aeaafdd9c0aeba04e6b2dc66e7f93f/cmeel-0.60.1-py3-none-any.whl", hash = "sha256:3f92b68353a58b4d6b5a664ea96bf58a3fef0891a8ed570d3c153361bbbb94b7", size = 20612, upload-time = "2026-05-11T17:13:55.812Z" },
 ]

 [[package]]
@@ -912,86 +920,86 @@ wheels = [

 [[package]]
 name = "coverage"
-version = "7.13.5"
+version = "7.14.0"
 source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/9d/e0/70553e3000e345daff267cec284ce4cbf3fc141b6da229ac52775b5428f1/coverage-7.13.5.tar.gz", hash = "sha256:c81f6515c4c40141f83f502b07bbfa5c240ba25bbe73da7b33f1e5b6120ff179", size = 915967, upload-time = "2026-03-17T10:33:18.341Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/23/7f/d0720730a397a999ffc0fd3f5bebef347338e3a47b727da66fbb228e2ff2/coverage-7.14.0.tar.gz", hash = "sha256:057a6af2f160a85384cde4ab36f0d2777bae1057bae255f95413cdd382aa5c74", size = 919489, upload-time = "2026-05-10T18:02:31.397Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/a0/c3/a396306ba7db865bf96fc1fb3b7fd29bcbf3d829df642e77b13555163cd6/coverage-7.13.5-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:460cf0114c5016fa841214ff5564aa4864f11948da9440bc97e21ad1f4ba1e01", size = 219554, upload-time = "2026-03-17T10:30:42.208Z" },
-    { url = "https://files.pythonhosted.org/packages/a6/16/a68a19e5384e93f811dccc51034b1fd0b865841c390e3c931dcc4699e035/coverage-7.13.5-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:0e223ce4b4ed47f065bfb123687686512e37629be25cc63728557ae7db261422", size = 219908, upload-time = "2026-03-17T10:30:43.906Z" },
-    { url = "https://files.pythonhosted.org/packages/29/72/20b917c6793af3a5ceb7fb9c50033f3ec7865f2911a1416b34a7cfa0813b/coverage-7.13.5-cp312-cp312-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:6e3370441f4513c6252bf042b9c36d22491142385049243253c7e48398a15a9f", size = 251419, upload-time = "2026-03-17T10:30:45.545Z" },
-    { url = "https://files.pythonhosted.org/packages/8c/49/cd14b789536ac6a4778c453c6a2338bc0a2fb60c5a5a41b4008328b9acc1/coverage-7.13.5-cp312-cp312-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:03ccc709a17a1de074fb1d11f217342fb0d2b1582ed544f554fc9fc3f07e95f5", size = 254159, upload-time = "2026-03-17T10:30:47.204Z" },
-    { url = "https://files.pythonhosted.org/packages/9d/00/7b0edcfe64e2ed4c0340dac14a52ad0f4c9bd0b8b5e531af7d55b703db7c/coverage-7.13.5-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:3f4818d065964db3c1c66dc0fbdac5ac692ecbc875555e13374fdbe7eedb4376", size = 255270, upload-time = "2026-03-17T10:30:48.812Z" },
-    { url = "https://files.pythonhosted.org/packages/93/89/7ffc4ba0f5d0a55c1e84ea7cee39c9fc06af7b170513d83fbf3bbefce280/coverage-7.13.5-cp312-cp312-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:012d5319e66e9d5a218834642d6c35d265515a62f01157a45bcc036ecf947256", size = 257538, upload-time = "2026-03-17T10:30:50.77Z" },
-    { url = "https://files.pythonhosted.org/packages/81/bd/73ddf85f93f7e6fa83e77ccecb6162d9415c79007b4bc124008a4995e4a7/coverage-7.13.5-cp312-cp312-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:8dd02af98971bdb956363e4827d34425cb3df19ee550ef92855b0acb9c7ce51c", size = 251821, upload-time = "2026-03-17T10:30:52.5Z" },
-    { url = "https://files.pythonhosted.org/packages/a0/81/278aff4e8dec4926a0bcb9486320752811f543a3ce5b602cc7a29978d073/coverage-7.13.5-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:f08fd75c50a760c7eb068ae823777268daaf16a80b918fa58eea888f8e3919f5", size = 253191, upload-time = "2026-03-17T10:30:54.543Z" },
-    { url = "https://files.pythonhosted.org/packages/70/ee/fe1621488e2e0a58d7e94c4800f0d96f79671553488d401a612bebae324b/coverage-7.13.5-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:843ea8643cf967d1ac7e8ecd4bb00c99135adf4816c0c0593fdcc47b597fcf09", size = 251337, upload-time = "2026-03-17T10:30:56.663Z" },
-    { url = "https://files.pythonhosted.org/packages/37/a6/f79fb37aa104b562207cc23cb5711ab6793608e246cae1e93f26b2236ed9/coverage-7.13.5-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:9d44d7aa963820b1b971dbecd90bfe5fe8f81cff79787eb6cca15750bd2f79b9", size = 255404, upload-time = "2026-03-17T10:30:58.427Z" },
-    { url = "https://files.pythonhosted.org/packages/75/f0/ed15262a58ec81ce457ceb717b7f78752a1713556b19081b76e90896e8d4/coverage-7.13.5-cp312-cp312-musllinux_1_2_riscv64.whl", hash = "sha256:7132bed4bd7b836200c591410ae7d97bf7ae8be6fc87d160b2bd881df929e7bf", size = 250903, upload-time = "2026-03-17T10:31:00.093Z" },
-    { url = "https://files.pythonhosted.org/packages/0f/e9/9129958f20e7e9d4d56d51d42ccf708d15cac355ff4ac6e736e97a9393d2/coverage-7.13.5-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:a698e363641b98843c517817db75373c83254781426e94ada3197cabbc2c919c", size = 252780, upload-time = "2026-03-17T10:31:01.916Z" },
-    { url = "https://files.pythonhosted.org/packages/a4/d7/0ad9b15812d81272db94379fe4c6df8fd17781cc7671fdfa30c76ba5ff7b/coverage-7.13.5-cp312-cp312-win32.whl", hash = "sha256:bdba0a6b8812e8c7df002d908a9a2ea3c36e92611b5708633c50869e6d922fdf", size = 222093, upload-time = "2026-03-17T10:31:03.642Z" },
-    { url = "https://files.pythonhosted.org/packages/29/3d/821a9a5799fac2556bcf0bd37a70d1d11fa9e49784b6d22e92e8b2f85f18/coverage-7.13.5-cp312-cp312-win_amd64.whl", hash = "sha256:d2c87e0c473a10bffe991502eac389220533024c8082ec1ce849f4218dded810", size = 222900, upload-time = "2026-03-17T10:31:05.651Z" },
-    { url = "https://files.pythonhosted.org/packages/d4/fa/2238c2ad08e35cf4f020ea721f717e09ec3152aea75d191a7faf3ef009a8/coverage-7.13.5-cp312-cp312-win_arm64.whl", hash = "sha256:bf69236a9a81bdca3bff53796237aab096cdbf8d78a66ad61e992d9dac7eb2de", size = 221515, upload-time = "2026-03-17T10:31:07.293Z" },
-    { url = "https://files.pythonhosted.org/packages/74/8c/74fedc9663dcf168b0a059d4ea756ecae4da77a489048f94b5f512a8d0b3/coverage-7.13.5-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:5ec4af212df513e399cf11610cc27063f1586419e814755ab362e50a85ea69c1", size = 219576, upload-time = "2026-03-17T10:31:09.045Z" },
-    { url = "https://files.pythonhosted.org/packages/0c/c9/44fb661c55062f0818a6ffd2685c67aa30816200d5f2817543717d4b92eb/coverage-7.13.5-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:941617e518602e2d64942c88ec8499f7fbd49d3f6c4327d3a71d43a1973032f3", size = 219942, upload-time = "2026-03-17T10:31:10.708Z" },
-    { url = "https://files.pythonhosted.org/packages/5f/13/93419671cee82b780bab7ea96b67c8ef448f5f295f36bf5031154ec9a790/coverage-7.13.5-cp313-cp313-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:da305e9937617ee95c2e39d8ff9f040e0487cbf1ac174f777ed5eddd7a7c1f26", size = 250935, upload-time = "2026-03-17T10:31:12.392Z" },
-    { url = "https://files.pythonhosted.org/packages/ac/68/1666e3a4462f8202d836920114fa7a5ee9275d1fa45366d336c551a162dd/coverage-7.13.5-cp313-cp313-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:78e696e1cc714e57e8b25760b33a8b1026b7048d270140d25dafe1b0a1ee05a3", size = 253541, upload-time = "2026-03-17T10:31:14.247Z" },
-    { url = "https://files.pythonhosted.org/packages/4e/5e/3ee3b835647be646dcf3c65a7c6c18f87c27326a858f72ab22c12730773d/coverage-7.13.5-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:02ca0eed225b2ff301c474aeeeae27d26e2537942aa0f87491d3e147e784a82b", size = 254780, upload-time = "2026-03-17T10:31:16.193Z" },
-    { url = "https://files.pythonhosted.org/packages/44/b3/cb5bd1a04cfcc49ede6cd8409d80bee17661167686741e041abc7ee1b9a9/coverage-7.13.5-cp313-cp313-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:04690832cbea4e4663d9149e05dba142546ca05cb1848816760e7f58285c970a", size = 256912, upload-time = "2026-03-17T10:31:17.89Z" },
-    { url = "https://files.pythonhosted.org/packages/1b/66/c1dceb7b9714473800b075f5c8a84f4588f887a90eb8645282031676e242/coverage-7.13.5-cp313-cp313-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:0590e44dd2745c696a778f7bab6aa95256de2cbc8b8cff4f7db8ff09813d6969", size = 251165, upload-time = "2026-03-17T10:31:19.605Z" },
-    { url = "https://files.pythonhosted.org/packages/b7/62/5502b73b97aa2e53ea22a39cf8649ff44827bef76d90bf638777daa27a9d/coverage-7.13.5-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:d7cfad2d6d81dd298ab6b89fe72c3b7b05ec7544bdda3b707ddaecff8d25c161", size = 252908, upload-time = "2026-03-17T10:31:21.312Z" },
-    { url = "https://files.pythonhosted.org/packages/7d/37/7792c2d69854397ca77a55c4646e5897c467928b0e27f2d235d83b5d08c6/coverage-7.13.5-cp313-cp313-musllinux_1_2_i686.whl", hash = "sha256:e092b9499de38ae0fbfbc603a74660eb6ff3e869e507b50d85a13b6db9863e15", size = 250873, upload-time = "2026-03-17T10:31:23.565Z" },
-    { url = "https://files.pythonhosted.org/packages/a3/23/bc866fb6163be52a8a9e5d708ba0d3b1283c12158cefca0a8bbb6e247a43/coverage-7.13.5-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:48c39bc4a04d983a54a705a6389512883d4a3b9862991b3617d547940e9f52b1", size = 255030, upload-time = "2026-03-17T10:31:25.58Z" },
-    { url = "https://files.pythonhosted.org/packages/7d/8b/ef67e1c222ef49860701d346b8bbb70881bef283bd5f6cbba68a39a086c7/coverage-7.13.5-cp313-cp313-musllinux_1_2_riscv64.whl", hash = "sha256:2d3807015f138ffea1ed9afeeb8624fd781703f2858b62a8dd8da5a0994c57b6", size = 250694, upload-time = "2026-03-17T10:31:27.316Z" },
-    { url = "https://files.pythonhosted.org/packages/46/0d/866d1f74f0acddbb906db212e096dee77a8e2158ca5e6bb44729f9d93298/coverage-7.13.5-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:ee2aa19e03161671ec964004fb74b2257805d9710bf14a5c704558b9d8dbaf17", size = 252469, upload-time = "2026-03-17T10:31:29.472Z" },
-    { url = "https://files.pythonhosted.org/packages/7a/f5/be742fec31118f02ce42b21c6af187ad6a344fed546b56ca60caacc6a9a0/coverage-7.13.5-cp313-cp313-win32.whl", hash = "sha256:ce1998c0483007608c8382f4ff50164bfc5bd07a2246dd272aa4043b75e61e85", size = 222112, upload-time = "2026-03-17T10:31:31.526Z" },
-    { url = "https://files.pythonhosted.org/packages/66/40/7732d648ab9d069a46e686043241f01206348e2bbf128daea85be4d6414b/coverage-7.13.5-cp313-cp313-win_amd64.whl", hash = "sha256:631efb83f01569670a5e866ceb80fe483e7c159fac6f167e6571522636104a0b", size = 222923, upload-time = "2026-03-17T10:31:33.633Z" },
-    { url = "https://files.pythonhosted.org/packages/48/af/fea819c12a095781f6ccd504890aaddaf88b8fab263c4940e82c7b770124/coverage-7.13.5-cp313-cp313-win_arm64.whl", hash = "sha256:f4cd16206ad171cbc2470dbea9103cf9a7607d5fe8c242fdf1edf36174020664", size = 221540, upload-time = "2026-03-17T10:31:35.445Z" },
-    { url = "https://files.pythonhosted.org/packages/23/d2/17879af479df7fbbd44bd528a31692a48f6b25055d16482fdf5cdb633805/coverage-7.13.5-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:0428cbef5783ad91fe240f673cc1f76b25e74bbfe1a13115e4aa30d3f538162d", size = 220262, upload-time = "2026-03-17T10:31:37.184Z" },
-    { url = "https://files.pythonhosted.org/packages/5b/4c/d20e554f988c8f91d6a02c5118f9abbbf73a8768a3048cb4962230d5743f/coverage-7.13.5-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:e0b216a19534b2427cc201a26c25da4a48633f29a487c61258643e89d28200c0", size = 220617, upload-time = "2026-03-17T10:31:39.245Z" },
-    { url = "https://files.pythonhosted.org/packages/29/9c/f9f5277b95184f764b24e7231e166dfdb5780a46d408a2ac665969416d61/coverage-7.13.5-cp313-cp313t-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:972a9cd27894afe4bc2b1480107054e062df08e671df7c2f18c205e805ccd806", size = 261912, upload-time = "2026-03-17T10:31:41.324Z" },
-    { url = "https://files.pythonhosted.org/packages/d5/f6/7f1ab39393eeb50cfe4747ae8ef0e4fc564b989225aa1152e13a180d74f8/coverage-7.13.5-cp313-cp313t-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:4b59148601efcd2bac8c4dbf1f0ad6391693ccf7a74b8205781751637076aee3", size = 263987, upload-time = "2026-03-17T10:31:43.724Z" },
-    { url = "https://files.pythonhosted.org/packages/a0/d7/62c084fb489ed9c6fbdf57e006752e7c516ea46fd690e5ed8b8617c7d52e/coverage-7.13.5-cp313-cp313t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:505d7083c8b0c87a8fa8c07370c285847c1f77739b22e299ad75a6af6c32c5c9", size = 266416, upload-time = "2026-03-17T10:31:45.769Z" },
-    { url = "https://files.pythonhosted.org/packages/a9/f6/df63d8660e1a0bff6125947afda112a0502736f470d62ca68b288ea762d8/coverage-7.13.5-cp313-cp313t-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:60365289c3741e4db327e7baff2a4aaacf22f788e80fa4683393891b70a89fbd", size = 267558, upload-time = "2026-03-17T10:31:48.293Z" },
-    { url = "https://files.pythonhosted.org/packages/5b/02/353ca81d36779bd108f6d384425f7139ac3c58c750dcfaafe5d0bee6436b/coverage-7.13.5-cp313-cp313t-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:1b88c69c8ef5d4b6fe7dea66d6636056a0f6a7527c440e890cf9259011f5e606", size = 261163, upload-time = "2026-03-17T10:31:50.125Z" },
-    { url = "https://files.pythonhosted.org/packages/2c/16/2e79106d5749bcaf3aee6d309123548e3276517cd7851faa8da213bc61bf/coverage-7.13.5-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:5b13955d31d1633cf9376908089b7cebe7d15ddad7aeaabcbe969a595a97e95e", size = 263981, upload-time = "2026-03-17T10:31:51.961Z" },
-    { url = "https://files.pythonhosted.org/packages/29/c7/c29e0c59ffa6942030ae6f50b88ae49988e7e8da06de7ecdbf49c6d4feae/coverage-7.13.5-cp313-cp313t-musllinux_1_2_i686.whl", hash = "sha256:f70c9ab2595c56f81a89620e22899eea8b212a4041bd728ac6f4a28bf5d3ddd0", size = 261604, upload-time = "2026-03-17T10:31:53.872Z" },
-    { url = "https://files.pythonhosted.org/packages/40/48/097cdc3db342f34006a308ab41c3a7c11c3f0d84750d340f45d88a782e00/coverage-7.13.5-cp313-cp313t-musllinux_1_2_ppc64le.whl", hash = "sha256:084b84a8c63e8d6fc7e3931b316a9bcafca1458d753c539db82d31ed20091a87", size = 265321, upload-time = "2026-03-17T10:31:55.997Z" },
-    { url = "https://files.pythonhosted.org/packages/bb/1f/4994af354689e14fd03a75f8ec85a9a68d94e0188bbdab3fc1516b55e512/coverage-7.13.5-cp313-cp313t-musllinux_1_2_riscv64.whl", hash = "sha256:ad14385487393e386e2ea988b09d62dd42c397662ac2dabc3832d71253eee479", size = 260502, upload-time = "2026-03-17T10:31:58.308Z" },
-    { url = "https://files.pythonhosted.org/packages/22/c6/9bb9ef55903e628033560885f5c31aa227e46878118b63ab15dc7ba87797/coverage-7.13.5-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:7f2c47b36fe7709a6e83bfadf4eefb90bd25fbe4014d715224c4316f808e59a2", size = 262688, upload-time = "2026-03-17T10:32:00.141Z" },
-    { url = "https://files.pythonhosted.org/packages/14/4f/f5df9007e50b15e53e01edea486814783a7f019893733d9e4d6caad75557/coverage-7.13.5-cp313-cp313t-win32.whl", hash = "sha256:67e9bc5449801fad0e5dff329499fb090ba4c5800b86805c80617b4e29809b2a", size = 222788, upload-time = "2026-03-17T10:32:02.246Z" },
-    { url = "https://files.pythonhosted.org/packages/e1/98/aa7fccaa97d0f3192bec013c4e6fd6d294a6ed44b640e6bb61f479e00ed5/coverage-7.13.5-cp313-cp313t-win_amd64.whl", hash = "sha256:da86cdcf10d2519e10cabb8ac2de03da1bcb6e4853790b7fbd48523332e3a819", size = 223851, upload-time = "2026-03-17T10:32:04.416Z" },
-    { url = "https://files.pythonhosted.org/packages/3d/8b/e5c469f7352651e5f013198e9e21f97510b23de957dd06a84071683b4b60/coverage-7.13.5-cp313-cp313t-win_arm64.whl", hash = "sha256:0ecf12ecb326fe2c339d93fc131816f3a7367d223db37817208905c89bded911", size = 222104, upload-time = "2026-03-17T10:32:06.65Z" },
-    { url = "https://files.pythonhosted.org/packages/8e/77/39703f0d1d4b478bfd30191d3c14f53caf596fac00efb3f8f6ee23646439/coverage-7.13.5-cp314-cp314-macosx_10_15_x86_64.whl", hash = "sha256:fbabfaceaeb587e16f7008f7795cd80d20ec548dc7f94fbb0d4ec2e038ce563f", size = 219621, upload-time = "2026-03-17T10:32:08.589Z" },
-    { url = "https://files.pythonhosted.org/packages/e2/3e/51dff36d99ae14639a133d9b164d63e628532e2974d8b1edb99dd1ebc733/coverage-7.13.5-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:9bb2a28101a443669a423b665939381084412b81c3f8c0fcfbac57f4e30b5b8e", size = 219953, upload-time = "2026-03-17T10:32:10.507Z" },
-    { url = "https://files.pythonhosted.org/packages/6a/6c/1f1917b01eb647c2f2adc9962bd66c79eb978951cab61bdc1acab3290c07/coverage-7.13.5-cp314-cp314-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:bd3a2fbc1c6cccb3c5106140d87cc6a8715110373ef42b63cf5aea29df8c217a", size = 250992, upload-time = "2026-03-17T10:32:12.41Z" },
-    { url = "https://files.pythonhosted.org/packages/22/e5/06b1f88f42a5a99df42ce61208bdec3bddb3d261412874280a19796fc09c/coverage-7.13.5-cp314-cp314-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:6c36ddb64ed9d7e496028d1d00dfec3e428e0aabf4006583bb1839958d280510", size = 253503, upload-time = "2026-03-17T10:32:14.449Z" },
-    { url = "https://files.pythonhosted.org/packages/80/28/2a148a51e5907e504fa7b85490277734e6771d8844ebcc48764a15e28155/coverage-7.13.5-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:380e8e9084d8eb38db3a9176a1a4f3c0082c3806fa0dc882d1d87abc3c789247", size = 254852, upload-time = "2026-03-17T10:32:16.56Z" },
-    { url = "https://files.pythonhosted.org/packages/61/77/50e8d3d85cc0b7ebe09f30f151d670e302c7ff4a1bf6243f71dd8b0981fa/coverage-7.13.5-cp314-cp314-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:e808af52a0513762df4d945ea164a24b37f2f518cbe97e03deaa0ee66139b4d6", size = 257161, upload-time = "2026-03-17T10:32:19.004Z" },
-    { url = "https://files.pythonhosted.org/packages/3b/c4/b5fd1d4b7bf8d0e75d997afd3925c59ba629fc8616f1b3aae7605132e256/coverage-7.13.5-cp314-cp314-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:e301d30dd7e95ae068671d746ba8c34e945a82682e62918e41b2679acd2051a0", size = 251021, upload-time = "2026-03-17T10:32:21.344Z" },
-    { url = "https://files.pythonhosted.org/packages/f8/66/6ea21f910e92d69ef0b1c3346ea5922a51bad4446c9126db2ae96ee24c4c/coverage-7.13.5-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:800bc829053c80d240a687ceeb927a94fd108bbdc68dfbe505d0d75ab578a882", size = 252858, upload-time = "2026-03-17T10:32:23.506Z" },
-    { url = "https://files.pythonhosted.org/packages/9e/ea/879c83cb5d61aa2a35fb80e72715e92672daef8191b84911a643f533840c/coverage-7.13.5-cp314-cp314-musllinux_1_2_i686.whl", hash = "sha256:0b67af5492adb31940ee418a5a655c28e48165da5afab8c7fa6fd72a142f8740", size = 250823, upload-time = "2026-03-17T10:32:25.516Z" },
-    { url = "https://files.pythonhosted.org/packages/8a/fb/616d95d3adb88b9803b275580bdeee8bd1b69a886d057652521f83d7322f/coverage-7.13.5-cp314-cp314-musllinux_1_2_ppc64le.whl", hash = "sha256:c9136ff29c3a91e25b1d1552b5308e53a1e0653a23e53b6366d7c2dcbbaf8a16", size = 255099, upload-time = "2026-03-17T10:32:27.944Z" },
-    { url = "https://files.pythonhosted.org/packages/1c/93/25e6917c90ec1c9a56b0b26f6cad6408e5f13bb6b35d484a0d75c9cf000d/coverage-7.13.5-cp314-cp314-musllinux_1_2_riscv64.whl", hash = "sha256:cff784eef7f0b8f6cb28804fbddcfa99f89efe4cc35fb5627e3ac58f91ed3ac0", size = 250638, upload-time = "2026-03-17T10:32:29.914Z" },
-    { url = "https://files.pythonhosted.org/packages/fc/7b/dc1776b0464145a929deed214aef9fb1493f159b59ff3c7eeeedf91eddd0/coverage-7.13.5-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:68a4953be99b17ac3c23b6efbc8a38330d99680c9458927491d18700ef23ded0", size = 252295, upload-time = "2026-03-17T10:32:31.981Z" },
-    { url = "https://files.pythonhosted.org/packages/ea/fb/99cbbc56a26e07762a2740713f3c8f9f3f3106e3a3dd8cc4474954bccd34/coverage-7.13.5-cp314-cp314-win32.whl", hash = "sha256:35a31f2b1578185fbe6aa2e74cea1b1d0bbf4c552774247d9160d29b80ed56cc", size = 222360, upload-time = "2026-03-17T10:32:34.233Z" },
-    { url = "https://files.pythonhosted.org/packages/8d/b7/4758d4f73fb536347cc5e4ad63662f9d60ba9118cb6785e9616b2ce5d7fa/coverage-7.13.5-cp314-cp314-win_amd64.whl", hash = "sha256:2aa055ae1857258f9e0045be26a6d62bdb47a72448b62d7b55f4820f361a2633", size = 223174, upload-time = "2026-03-17T10:32:36.369Z" },
-    { url = "https://files.pythonhosted.org/packages/2c/f2/24d84e1dfe70f8ac9fdf30d338239860d0d1d5da0bda528959d0ebc9da28/coverage-7.13.5-cp314-cp314-win_arm64.whl", hash = "sha256:1b11eef33edeae9d142f9b4358edb76273b3bfd30bc3df9a4f95d0e49caf94e8", size = 221739, upload-time = "2026-03-17T10:32:38.736Z" },
-    { url = "https://files.pythonhosted.org/packages/60/5b/4a168591057b3668c2428bff25dd3ebc21b629d666d90bcdfa0217940e84/coverage-7.13.5-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:10a0c37f0b646eaff7cce1874c31d1f1ccb297688d4c747291f4f4c70741cc8b", size = 220351, upload-time = "2026-03-17T10:32:41.196Z" },
-    { url = "https://files.pythonhosted.org/packages/f5/21/1fd5c4dbfe4a58b6b99649125635df46decdfd4a784c3cd6d410d303e370/coverage-7.13.5-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:b5db73ba3c41c7008037fa731ad5459fc3944cb7452fc0aa9f822ad3533c583c", size = 220612, upload-time = "2026-03-17T10:32:43.204Z" },
-    { url = "https://files.pythonhosted.org/packages/d6/fe/2a924b3055a5e7e4512655a9d4609781b0d62334fa0140c3e742926834e2/coverage-7.13.5-cp314-cp314t-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:750db93a81e3e5a9831b534be7b1229df848b2e125a604fe6651e48aa070e5f9", size = 261985, upload-time = "2026-03-17T10:32:45.514Z" },
-    { url = "https://files.pythonhosted.org/packages/d7/0d/c8928f2bd518c45990fe1a2ab8db42e914ef9b726c975facc4282578c3eb/coverage-7.13.5-cp314-cp314t-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:9ddb4f4a5479f2539644be484da179b653273bca1a323947d48ab107b3ed1f29", size = 264107, upload-time = "2026-03-17T10:32:47.971Z" },
-    { url = "https://files.pythonhosted.org/packages/ef/ae/4ae35bbd9a0af9d820362751f0766582833c211224b38665c0f8de3d487f/coverage-7.13.5-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:d8a7a2049c14f413163e2bdabd37e41179b1d1ccb10ffc6ccc4b7a718429c607", size = 266513, upload-time = "2026-03-17T10:32:50.1Z" },
-    { url = "https://files.pythonhosted.org/packages/9c/20/d326174c55af36f74eac6ae781612d9492f060ce8244b570bb9d50d9d609/coverage-7.13.5-cp314-cp314t-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:e1c85e0b6c05c592ea6d8768a66a254bfb3874b53774b12d4c89c481eb78cb90", size = 267650, upload-time = "2026-03-17T10:32:52.391Z" },
-    { url = "https://files.pythonhosted.org/packages/7a/5e/31484d62cbd0eabd3412e30d74386ece4a0837d4f6c3040a653878bfc019/coverage-7.13.5-cp314-cp314t-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:777c4d1eff1b67876139d24288aaf1817f6c03d6bae9c5cc8d27b83bcfe38fe3", size = 261089, upload-time = "2026-03-17T10:32:54.544Z" },
-    { url = "https://files.pythonhosted.org/packages/e9/d8/49a72d6de146eebb0b7e48cc0f4bc2c0dd858e3d4790ab2b39a2872b62bd/coverage-7.13.5-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:6697e29b93707167687543480a40f0db8f356e86d9f67ddf2e37e2dfd91a9dab", size = 263982, upload-time = "2026-03-17T10:32:56.803Z" },
-    { url = "https://files.pythonhosted.org/packages/06/3b/0351f1bd566e6e4dd39e978efe7958bde1d32f879e85589de147654f57bb/coverage-7.13.5-cp314-cp314t-musllinux_1_2_i686.whl", hash = "sha256:8fdf453a942c3e4d99bd80088141c4c6960bb232c409d9c3558e2dbaa3998562", size = 261579, upload-time = "2026-03-17T10:32:59.466Z" },
-    { url = "https://files.pythonhosted.org/packages/5d/ce/796a2a2f4017f554d7810f5c573449b35b1e46788424a548d4d19201b222/coverage-7.13.5-cp314-cp314t-musllinux_1_2_ppc64le.whl", hash = "sha256:32ca0c0114c9834a43f045a87dcebd69d108d8ffb666957ea65aa132f50332e2", size = 265316, upload-time = "2026-03-17T10:33:01.847Z" },
-    { url = "https://files.pythonhosted.org/packages/3d/16/d5ae91455541d1a78bc90abf495be600588aff8f6db5c8b0dae739fa39c9/coverage-7.13.5-cp314-cp314t-musllinux_1_2_riscv64.whl", hash = "sha256:8769751c10f339021e2638cd354e13adeac54004d1941119b2c96fe5276d45ea", size = 260427, upload-time = "2026-03-17T10:33:03.945Z" },
-    { url = "https://files.pythonhosted.org/packages/48/11/07f413dba62db21fb3fad5d0de013a50e073cc4e2dc4306e770360f6dfc8/coverage-7.13.5-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:cec2d83125531bd153175354055cdb7a09987af08a9430bd173c937c6d0fba2a", size = 262745, upload-time = "2026-03-17T10:33:06.285Z" },
-    { url = "https://files.pythonhosted.org/packages/91/15/d792371332eb4663115becf4bad47e047d16234b1aff687b1b18c58d60ae/coverage-7.13.5-cp314-cp314t-win32.whl", hash = "sha256:0cd9ed7a8b181775459296e402ca4fb27db1279740a24e93b3b41942ebe4b215", size = 223146, upload-time = "2026-03-17T10:33:08.756Z" },
-    { url = "https://files.pythonhosted.org/packages/db/51/37221f59a111dca5e85be7dbf09696323b5b9f13ff65e0641d535ed06ea8/coverage-7.13.5-cp314-cp314t-win_amd64.whl", hash = "sha256:301e3b7dfefecaca37c9f1aa6f0049b7d4ab8dd933742b607765d757aca77d43", size = 224254, upload-time = "2026-03-17T10:33:11.174Z" },
-    { url = "https://files.pythonhosted.org/packages/54/83/6acacc889de8987441aa7d5adfbdbf33d288dad28704a67e574f1df9bcbb/coverage-7.13.5-cp314-cp314t-win_arm64.whl", hash = "sha256:9dacc2ad679b292709e0f5fc1ac74a6d4d5562e424058962c7bb0c658ad25e45", size = 222276, upload-time = "2026-03-17T10:33:13.466Z" },
-    { url = "https://files.pythonhosted.org/packages/9e/ee/a4cf96b8ce1e566ed238f0659ac2d3f007ed1d14b181bcb684e19561a69a/coverage-7.13.5-py3-none-any.whl", hash = "sha256:34b02417cf070e173989b3db962f7ed56d2f644307b2cf9d5a0f258e13084a61", size = 211346, upload-time = "2026-03-17T10:33:15.691Z" },
+    { url = "https://files.pythonhosted.org/packages/09/1e/2f996b2c8415cbb6f54b0f5ec1ee850c96d7911961afb4fc05f4a89d8c58/coverage-7.14.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:7ffd19fc8aed057fd686a17a4935eef5f9859d69208f96310e893e64b9b6ccf5", size = 219967, upload-time = "2026-05-10T18:00:13.756Z" },
+    { url = "https://files.pythonhosted.org/packages/34/23/35c7aea1274aef7525bdd2dc92f710bdde6d11652239d71d1ec450067939/coverage-7.14.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:829994cfe1aeb773ca27bf246d4badc1e764893e3bfb98fff820fcecd1ca4662", size = 220329, upload-time = "2026-05-10T18:00:15.264Z" },
+    { url = "https://files.pythonhosted.org/packages/75/cf/a8f4b43a16e194b0261257ad28ded5853ec052570afef4a84e1d81189f3b/coverage-7.14.0-cp312-cp312-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:b4f07cf7edcb7ec39431a5074d7ea83b29a9f71fcfc494f0f40af4e65180420f", size = 251839, upload-time = "2026-05-10T18:00:17.16Z" },
+    { url = "https://files.pythonhosted.org/packages/69/ff/6699e7b71e60d3049eb2bdcbc95ee3f35707b2b0e48f32e9e63d3ce30c08/coverage-7.14.0-cp312-cp312-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:ca3d9cf2c32b521bd9518385608787fa86f38daf993695307531822c3430ed67", size = 254576, upload-time = "2026-05-10T18:00:18.829Z" },
+    { url = "https://files.pythonhosted.org/packages/22/ec/c936d495fcd67f48f03a9c4ad3297ff80d1f222a5df3980f15b34c186c21/coverage-7.14.0-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:92af52828e7f29d827346b0294e5a0853fa206db77db0395b282918d41e28db9", size = 255690, upload-time = "2026-05-10T18:00:20.648Z" },
+    { url = "https://files.pythonhosted.org/packages/5c/42/5af63f636cc62a4a2b1b3ba9146f6ee6f53a35a50d5cefc54d5670f60999/coverage-7.14.0-cp312-cp312-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:7b2bb6c9d7e769360d0f20a0f219603fd64f0c8f97de17ab25853261602be0fb", size = 257949, upload-time = "2026-05-10T18:00:22.28Z" },
+    { url = "https://files.pythonhosted.org/packages/26/d3/a225317bd2012132a27e1176d51660b826f99bb975876463c44ea0d7ee5a/coverage-7.14.0-cp312-cp312-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:1c9ed6ef99f88fb8c14aa8e2bf8eb0fe55fa2edfea68f8675d78741df1a5ac0e", size = 252242, upload-time = "2026-05-10T18:00:24.076Z" },
+    { url = "https://files.pythonhosted.org/packages/f1/7f/9e65495298c3ea414742998539c37d048b5e81cc818fb1828cc6b51d10bf/coverage-7.14.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:8231ade007f37959fbf58acc677f26b922c02eda6f0428ea307da0fd39681bf3", size = 253608, upload-time = "2026-05-10T18:00:25.588Z" },
+    { url = "https://files.pythonhosted.org/packages/94/46/1522b524a35bdad22b2b8c4f9d32d0a104b524726ec380b2db68db1746f5/coverage-7.14.0-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:d8b013632cc1ce1d09dbe4f32667b4d320ec2f54fc326ebeffcd0b0bcc2bb6c4", size = 251753, upload-time = "2026-05-10T18:00:27.104Z" },
+    { url = "https://files.pythonhosted.org/packages/f3/e9/cdf00d38817742c541ade405e115a3f7bf36e6f2a8b99d4f209861b85a2d/coverage-7.14.0-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:1733198802d71ec4c524f322e2867ee05c62e9e75df86bdca545407a221827d1", size = 255823, upload-time = "2026-05-10T18:00:29.038Z" },
+    { url = "https://files.pythonhosted.org/packages/38/fc/5e7877cf5f902d08a17ff1c532511476d87e1bea355bd5028cb97f902e79/coverage-7.14.0-cp312-cp312-musllinux_1_2_riscv64.whl", hash = "sha256:72a305291fa8ee01332f1aaf38b348ca34097f6aa0b0ef627eef2837e57bbba5", size = 251323, upload-time = "2026-05-10T18:00:30.647Z" },
+    { url = "https://files.pythonhosted.org/packages/18/9d/50f05a72dff8487464fdd4178dda5daed642a060e60afb644e3d45123559/coverage-7.14.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:fcaba850dd317c65423a9d63d88f9573c53b00354d6dd95724576cc98a131595", size = 253197, upload-time = "2026-05-10T18:00:32.211Z" },
+    { url = "https://files.pythonhosted.org/packages/00/3f/6f61ffe6439df266c3cf60f5c99cfaa21103d0210d706a42fc6c30683ff8/coverage-7.14.0-cp312-cp312-win32.whl", hash = "sha256:5ac83957a80d0701310e96d8bec68cdcf4f90a7674b7d13f15a344315b41ab27", size = 222515, upload-time = "2026-05-10T18:00:33.717Z" },
+    { url = "https://files.pythonhosted.org/packages/85/19/93853133df2cb371083285ef6a93982a0173e7a233b0f61373ba9fd30eb2/coverage-7.14.0-cp312-cp312-win_amd64.whl", hash = "sha256:70390b0da32cb90b501953716302906e8bcce087cb283e70d8c97729f22e92b2", size = 223324, upload-time = "2026-05-10T18:00:35.172Z" },
+    { url = "https://files.pythonhosted.org/packages/74/18/9f7fe62f659f24b7a82a0be56bf94c1bd0a89e0ae7ab4c668f6e82404294/coverage-7.14.0-cp312-cp312-win_arm64.whl", hash = "sha256:91b993743d959b8be85b4abf9d5478216a69329c321efe5be0433c1a841d691d", size = 221944, upload-time = "2026-05-10T18:00:37.014Z" },
+    { url = "https://files.pythonhosted.org/packages/6b/76/b7c66ee3c66e1b0f9d894c8125983aa0c03fb2336f2fd16559f9c966157f/coverage-7.14.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:f2bbb8254370eb4c628ff3d6fa8a7f74ddc40565394d4f7ab791d1fe568e37ef", size = 219990, upload-time = "2026-05-10T18:00:38.887Z" },
+    { url = "https://files.pythonhosted.org/packages/b3/af/e567cbad5ba69c013a50146dfa886dc7193361fda77521f51274ff620e1b/coverage-7.14.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:23b81107f46d3f21d0cbce30664fcec0f5d9f585638a67081750f99738f6bf66", size = 220365, upload-time = "2026-05-10T18:00:40.864Z" },
+    { url = "https://files.pythonhosted.org/packages/44/6f/9ad575d505b4d805b254febc8a5b338a2efe278f8786e56ff1cb8413f9c3/coverage-7.14.0-cp313-cp313-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:22a7e06a5f11a757cdfe79018e9095f9f69ae283c5cd8123774c788deec8717b", size = 251363, upload-time = "2026-05-10T18:00:42.489Z" },
+    { url = "https://files.pythonhosted.org/packages/6f/5f/b5370068b2f57787454592ed7dcd1002f0f1703b7db1fa30f6a325a4ca6e/coverage-7.14.0-cp313-cp313-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:9d1aa57a1dc8e05bdc42e81c5d671d849577aeedf279f4c449d6d286f9ed88ca", size = 253961, upload-time = "2026-05-10T18:00:44.079Z" },
+    { url = "https://files.pythonhosted.org/packages/29/1e/51adf17738976e8f2b85ddef7b7aa12a0838b056c92f175941d8862767c1/coverage-7.14.0-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:90c1a51bcfddf645b3bb7ec333d9e94393a8e94f55642380fa8a9a5a9e636cb7", size = 255193, upload-time = "2026-05-10T18:00:45.623Z" },
+    { url = "https://files.pythonhosted.org/packages/9e/7b/5bfd7ac1df3b881c2ac7a5cbc99c7609e6296c402f5ef587cd81c6f355b3/coverage-7.14.0-cp313-cp313-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:a841fae2fadcae4f438d43b6ccc4aac2ad609f47cdb6cfdce60cbb3fe5ca7bc2", size = 257326, upload-time = "2026-05-10T18:00:47.173Z" },
+    { url = "https://files.pythonhosted.org/packages/7d/38/1d37d316b174fad3843a1d76dbdfe4398771c9ecd0515935dd9ece9cd627/coverage-7.14.0-cp313-cp313-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:c79d2319cabef1fe8e86df73371126931550804738f78ad7d31e3aad85a67367", size = 251582, upload-time = "2026-05-10T18:00:49.152Z" },
+    { url = "https://files.pythonhosted.org/packages/34/46/746704f95980ba220214e1a41e18cec5aea80a898eaa53c51bf2d645ff36/coverage-7.14.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:1b23b0c6f0b1db6ad769b7050c8b641c0bf215ded26c1816955b17b7f26edfa9", size = 253325, upload-time = "2026-05-10T18:00:51.252Z" },
+    { url = "https://files.pythonhosted.org/packages/e1/b9/bbe87206d9687b192352f893797825b5f5b15ecd3aa9c68fbff0c074d77b/coverage-7.14.0-cp313-cp313-musllinux_1_2_i686.whl", hash = "sha256:55d3089079ce181a4566b1065ab28d2575eb76d8ac8f81f4fcda2bf037fee087", size = 251291, upload-time = "2026-05-10T18:00:52.816Z" },
+    { url = "https://files.pythonhosted.org/packages/46/57/b8cdb12ac0d73ef0243218bd5e22c9df8f92edab8018213a86aec67c5324/coverage-7.14.0-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:49c005cba1e2f9677fb2845dcdf9a2e72a52a17d63e8231aaaae35d9f50215ef", size = 255448, upload-time = "2026-05-10T18:00:54.548Z" },
+    { url = "https://files.pythonhosted.org/packages/1f/d4/5002019538b2036ce3c84340f54d2fd5100d55b0a6b0894eee56128d03c7/coverage-7.14.0-cp313-cp313-musllinux_1_2_riscv64.whl", hash = "sha256:9117377b823daa28aa8635fbb08cda1cd6be3d7143257345459559aeef852d52", size = 251110, upload-time = "2026-05-10T18:00:56.122Z" },
+    { url = "https://files.pythonhosted.org/packages/37/53/20c5009477660f084e6ed60bc02a91894b8e234e617e86ecfd9aaf78e27b/coverage-7.14.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:7b79d646cf46d5cf9a9f40281d4441df5849e445726e369006d2b117710b33fe", size = 252885, upload-time = "2026-05-10T18:00:57.967Z" },
+    { url = "https://files.pythonhosted.org/packages/ae/ab/3cf6427ac9c1f1db747dbb1ce71dde47984876d4c2cfd018a3fef0a78d4d/coverage-7.14.0-cp313-cp313-win32.whl", hash = "sha256:fb609b3658479e33f9516d46f1a89dbb9b6c261366e3a11844a96ec487533dae", size = 222539, upload-time = "2026-05-10T18:00:59.581Z" },
+    { url = "https://files.pythonhosted.org/packages/8f/b8/9228523e80321c2cb4880d1f589bc0171f2f71432c35118ad04dc01decce/coverage-7.14.0-cp313-cp313-win_amd64.whl", hash = "sha256:0773d8329cf32b6fd222e4b52622c61fe8d503eb966cfc8d3c3c10c96266d50e", size = 223344, upload-time = "2026-05-10T18:01:01.531Z" },
+    { url = "https://files.pythonhosted.org/packages/a3/99/118daa192f95e3a6cb2740100fbf8797cda1734b4134ef0b5d501a7fa8f3/coverage-7.14.0-cp313-cp313-win_arm64.whl", hash = "sha256:b4e26a0f1b696faf283bffe5b8569e44e336c582439df5d53281ab89ee0cba96", size = 221966, upload-time = "2026-05-10T18:01:03.16Z" },
+    { url = "https://files.pythonhosted.org/packages/e6/f1/a46cc0c013be170216253184a32366d7cbdb9252feaec866b05c2d12a894/coverage-7.14.0-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:953f521ca9445300397e65fda3dca58b2dbd68fee983777420b57ac3c77e9f90", size = 220679, upload-time = "2026-05-10T18:01:05.058Z" },
+    { url = "https://files.pythonhosted.org/packages/64/8c/9c30a3d311a34177fa432995be7fbfc64477d8bac5630bd38055b1c9b424/coverage-7.14.0-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:98af83fd65ae24b1fdd03aaead967a9f523bcd2f1aab2d4f3ffda65bb568a6f1", size = 221033, upload-time = "2026-05-10T18:01:07.002Z" },
+    { url = "https://files.pythonhosted.org/packages/9a/cd/3fb5e06c3badefd0c1b47e2044fdca67f8220a4ec2e7fcfb476aa0a67c6c/coverage-7.14.0-cp313-cp313t-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:668b92e6958c4db7cf92e81caac328dfbbdbb215db2850ad28f0cbe1eea0bfbd", size = 262333, upload-time = "2026-05-10T18:01:08.903Z" },
+    { url = "https://files.pythonhosted.org/packages/a8/e6/fbc322325c7294d3e22c1ad6b79e45d0806b25228c8e5842aed6d8169aa7/coverage-7.14.0-cp313-cp313t-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:9fbd898551762dea00d3fef2b1c4f99afd2c6a3ff952ea07d60a9bd5ed4f34bc", size = 264410, upload-time = "2026-05-10T18:01:10.531Z" },
+    { url = "https://files.pythonhosted.org/packages/08/92/c497b264bec1673c47cc77e26f760fcda4654cabf1f39546d1a23a3b8c35/coverage-7.14.0-cp313-cp313t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:68af363c07ecd8d4b7d4043d85cb376d7d227eceb54e5323ee45da73dbd3e426", size = 266836, upload-time = "2026-05-10T18:01:12.19Z" },
+    { url = "https://files.pythonhosted.org/packages/78/fc/045da320987f401af5d2815d351e8aa799aec859f60e29f445e3089eeedb/coverage-7.14.0-cp313-cp313t-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:6e57054a583da8ac55edf24117ea4c9133032cfc4cf72aa2d48c1e5d4b52f899", size = 267974, upload-time = "2026-05-10T18:01:13.926Z" },
+    { url = "https://files.pythonhosted.org/packages/1b/ae/227b1e379497fb7a4fc3286e620f80c8a1e7cec66d45695a01639eb1af65/coverage-7.14.0-cp313-cp313t-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:cc3499459bbcdd51a65b64c35ab7ed2764eaf3cba826e0df3f1d7fe2e102b70b", size = 261578, upload-time = "2026-05-10T18:01:15.564Z" },
+    { url = "https://files.pythonhosted.org/packages/a0/f5/3570342900f2acea31d33ff1590c5d8bac1a8e1a2e1c6d34a5d5e61de681/coverage-7.14.0-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:45899ec2138a4346ed34d601dedf5076fb74edf2d1dd9dc76a78e82397edee90", size = 264394, upload-time = "2026-05-10T18:01:17.607Z" },
+    { url = "https://files.pythonhosted.org/packages/16/29/de1bbc01c935b28f89b1dc3db85b011c055e843a8e5e3b83141c3f80af7f/coverage-7.14.0-cp313-cp313t-musllinux_1_2_i686.whl", hash = "sha256:8767486808c436f05b23ab98eb963fb29185e32a9357a166971685cb3459900f", size = 262022, upload-time = "2026-05-10T18:01:19.304Z" },
+    { url = "https://files.pythonhosted.org/packages/35/95/f53890b0bf2fc10ab168e05d38869215e73ca24c4cb521c3bb0eb62fe16b/coverage-7.14.0-cp313-cp313t-musllinux_1_2_ppc64le.whl", hash = "sha256:a3b5ddfd6aa7ddad53ee3edb231e88a2151507a43229b7d71b953916deca127d", size = 265732, upload-time = "2026-05-10T18:01:21.494Z" },
+    { url = "https://files.pythonhosted.org/packages/ed/ea/c919e259081dd2bdf0e43b87209709ba7ec2e4117c2a7f5185379c43463c/coverage-7.14.0-cp313-cp313t-musllinux_1_2_riscv64.whl", hash = "sha256:63df0fe568e698e1045792399f8ab6da3a6c2dce3182813fb92afa2641087b47", size = 260921, upload-time = "2026-05-10T18:01:23.533Z" },
+    { url = "https://files.pythonhosted.org/packages/1a/2c/c2831889705a81dc5d1c6ca12e4d8e9b95dfc146d153488a6c0ea685d28e/coverage-7.14.0-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:827d6397dbd95144939b18f89edf31f63e1f99633e8d5f32f22ba8bdda567477", size = 263109, upload-time = "2026-05-10T18:01:25.165Z" },
+    { url = "https://files.pythonhosted.org/packages/5a/a9/2fcae5003cac3d63fe344d2166243c2756935f48420863c5272b240d550b/coverage-7.14.0-cp313-cp313t-win32.whl", hash = "sha256:7bf43e000d24012599b879791cff41589af90674722421ef11b11a5431920bab", size = 223212, upload-time = "2026-05-10T18:01:27.157Z" },
+    { url = "https://files.pythonhosted.org/packages/3f/bb/18e94d7b14b9b398164197114a587a04ab7c9fdbe1d237eef57311c5e883/coverage-7.14.0-cp313-cp313t-win_amd64.whl", hash = "sha256:3f5549365af25d770e06b1f8f5682d9a5637d06eb494db91c6fa75d3950cc917", size = 224272, upload-time = "2026-05-10T18:01:29.107Z" },
+    { url = "https://files.pythonhosted.org/packages/db/56/4f14fad782b035c81c4ffd09159e7103d42bb1d93ac8496d04b90a11b7da/coverage-7.14.0-cp313-cp313t-win_arm64.whl", hash = "sha256:6d160217ec6fe890f16ad3a9531761589443749e448f91986c972714fad361c8", size = 222530, upload-time = "2026-05-10T18:01:31.151Z" },
+    { url = "https://files.pythonhosted.org/packages/1c/18/b9a6586d73992807c26f9a5f274131be3d76b56b18a82b9392e2a25d2e45/coverage-7.14.0-cp314-cp314-macosx_10_15_x86_64.whl", hash = "sha256:9aed9fa983514ca032790f3fe0d1c0e42ca7e16b42432af1706b50a9a46bef5d", size = 220036, upload-time = "2026-05-10T18:01:33.057Z" },
+    { url = "https://files.pythonhosted.org/packages/f3/9b/4165a1d56ddc302a0e2d518fd9d412a4fd0b57562618c78c5f21c57194f5/coverage-7.14.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:ba3b8390db29296dbbf49e91b6fe08f990743a90c8f447ba4c2ffc29670dfa63", size = 220368, upload-time = "2026-05-10T18:01:34.705Z" },
+    { url = "https://files.pythonhosted.org/packages/69/aa/c12e52a5ba148d9995229d557e3be6e554fe469addc0e9241b2f0956d8ea/coverage-7.14.0-cp314-cp314-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:3a5d8e876dfa2f102e970b183863d6dedd023d3c0eeca1fe7a9787bc5f28b212", size = 251417, upload-time = "2026-05-10T18:01:36.949Z" },
+    { url = "https://files.pythonhosted.org/packages/d7/51/ec641c26e6dca1b25a7d2035ba6ecb7c884ef1a100a9e42fbe4ce4405139/coverage-7.14.0-cp314-cp314-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:5ebb8f4614a3787d567e610bbfdf96a4798dd69a1afb1bd8ad228d4111fe6ff3", size = 253924, upload-time = "2026-05-10T18:01:38.985Z" },
+    { url = "https://files.pythonhosted.org/packages/33/c4/59c3de0bd1b538824173fd518fed51c1ce740ca5ed68e74545983f4053a9/coverage-7.14.0-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:6b9bf47223dd8db3d4c4b2e443b02bace480d428f0822c3f991600448a176c97", size = 255269, upload-time = "2026-05-10T18:01:40.957Z" },
+    { url = "https://files.pythonhosted.org/packages/7b/a9/36dfa153a62040296f6e7febfdb20a5720622f6ef5a81a41e8237b9a5344/coverage-7.14.0-cp314-cp314-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:3485a836550b303d006d57cc06e3d5afaabc642c77050b7c985a97b13e3776b8", size = 257583, upload-time = "2026-05-10T18:01:42.607Z" },
+    { url = "https://files.pythonhosted.org/packages/26/7b/cc2c048d4114d9ab1c2409e9ee365e5ae10736df6dffcfc9444effa6c708/coverage-7.14.0-cp314-cp314-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:3e7e88110bae996d199d1693ca8ec3fd52441d426401ae963437598667b4c5eb", size = 251434, upload-time = "2026-05-10T18:01:44.537Z" },
+    { url = "https://files.pythonhosted.org/packages/ee/df/6770eaa576e604575e9a78055313250faef5faa84bd6f71a39fece519c43/coverage-7.14.0-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:15228a6800ce7bdf1b74800595e56db7138cecb338fdbf044806e10dcf182dfe", size = 253280, upload-time = "2026-05-10T18:01:46.175Z" },
+    { url = "https://files.pythonhosted.org/packages/ad/9e/1c0264514a3f98259a6d64765a397b2c8373e3ba59ee722a4802d3ec0c61/coverage-7.14.0-cp314-cp314-musllinux_1_2_i686.whl", hash = "sha256:9d26ac7f5398bafc5b57421ad994e8a4749e8a7a0e62d05ec7d53014d5963bfa", size = 251241, upload-time = "2026-05-10T18:01:48.732Z" },
+    { url = "https://files.pythonhosted.org/packages/64/16/4efdf3e3c4079cdbf0ece56a2fea872df9e8a3e15a13a0af4400e1075944/coverage-7.14.0-cp314-cp314-musllinux_1_2_ppc64le.whl", hash = "sha256:2fb73254ff43c911c967a899e1359bc5049b4b115d6e8fbdde4937d0a2246cd5", size = 255516, upload-time = "2026-05-10T18:01:50.819Z" },
+    { url = "https://files.pythonhosted.org/packages/93/69/b1de96346603881b3d1bc8d6447c83200e1c9700ffbaff926ba01ff5724c/coverage-7.14.0-cp314-cp314-musllinux_1_2_riscv64.whl", hash = "sha256:454a380af72c6adada298ed270d38c7a391288198dbfb8467f786f588751a90c", size = 251059, upload-time = "2026-05-10T18:01:52.773Z" },
+    { url = "https://files.pythonhosted.org/packages/a4/66/2881853e0363a5e0a724d1103e53650795367471b6afb234f8b49e713bc6/coverage-7.14.0-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:65c86fb646d2bd2972e96bd1a8b45817ed907cee68655d6295fe7ec031d04cca", size = 252716, upload-time = "2026-05-10T18:01:54.506Z" },
+    { url = "https://files.pythonhosted.org/packages/55/5c/0d3305d002c41dcde873dbe456491e663dc55152ca526b630b5c47efd62f/coverage-7.14.0-cp314-cp314-win32.whl", hash = "sha256:6a6516b02a6101398e19a3f44820f69bab2590697f7def4331f668b14adaf828", size = 222788, upload-time = "2026-05-10T18:01:56.487Z" },
+    { url = "https://files.pythonhosted.org/packages/f9/58/6e1b8f52fdc3184b47dc5037f5070d83a3d11042db1594b02d2a44d786c8/coverage-7.14.0-cp314-cp314-win_amd64.whl", hash = "sha256:45e0f79d8351fa76e256716df91eab12890d32678b9590df7ae1042e4bd4cf5d", size = 223600, upload-time = "2026-05-10T18:01:58.497Z" },
+    { url = "https://files.pythonhosted.org/packages/00/70/a18c408e674bc26281cadaedc7351f929bd2094e191e4b15271c30b084cc/coverage-7.14.0-cp314-cp314-win_arm64.whl", hash = "sha256:4b899594a8b2d81e5cc064a0d7f9cac2081fed91049456cae7676787e41549c9", size = 222168, upload-time = "2026-05-10T18:02:00.411Z" },
+    { url = "https://files.pythonhosted.org/packages/3d/89/2681f071d238b62aff8dfc2ab44fc24cfdb38d1c01f391a80522ff5d3a16/coverage-7.14.0-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:f580f8c80acd94ac72e863efe2cab791d8c38d153e0b463b92dfa000d5c84cd1", size = 220766, upload-time = "2026-05-10T18:02:02.313Z" },
+    { url = "https://files.pythonhosted.org/packages/bd/c7/c987babafd9207ffa1995e1ef1f9b26762cf4963aa768a66b6f0501e4616/coverage-7.14.0-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:a2bd259c442cd43c49b30fbafc51776eb19ea396faf159d26a83e6a0a5f13b0c", size = 221035, upload-time = "2026-05-10T18:02:04.017Z" },
+    { url = "https://files.pythonhosted.org/packages/5a/e9/d6a5ac3b333088143d6fc877d398a9a674dc03124a2f776e131f03864823/coverage-7.14.0-cp314-cp314t-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:a706b908dfa85538863504c624b237a3cc34232bf403c057414ebfdb3b4d9f84", size = 262405, upload-time = "2026-05-10T18:02:05.915Z" },
+    { url = "https://files.pythonhosted.org/packages/38/b1/e70838d29a7c08e22d44398a46db90815bbcbf28de06992bd9210d1a8d8e/coverage-7.14.0-cp314-cp314t-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:7333cd944ee4393b9b3d3c1b598c936d4fc8d70573a4c7dacfec5590dd50e436", size = 264530, upload-time = "2026-05-10T18:02:07.582Z" },
+    { url = "https://files.pythonhosted.org/packages/6b/73/5c31ef97763288d03d9995152b96d5475b527c63d91c84b01caea894b83a/coverage-7.14.0-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:0f162bc9a15b82d947b02651b0c7e1609d6f7a8735ca330cfadec8481dd97d5a", size = 266932, upload-time = "2026-05-10T18:02:09.401Z" },
+    { url = "https://files.pythonhosted.org/packages/e1/76/dd56d80f29c5f05b4d76f7e7c6d47cafacae017189c75c5759d24f9ff0cc/coverage-7.14.0-cp314-cp314t-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:362cb78e01a5dc82009d88004cf60f2e6b6d6fcbfdec05b05af73b0abf40118f", size = 268062, upload-time = "2026-05-10T18:02:11.399Z" },
+    { url = "https://files.pythonhosted.org/packages/6e/c7/27ba85cd5b95614f159ff93ebff1901584a8d192e2e5e24c4943a7453f59/coverage-7.14.0-cp314-cp314t-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:acebd068fca5512c3a6fde9c045f901613478781a73f0e82b307b214daef23fb", size = 261504, upload-time = "2026-05-10T18:02:13.257Z" },
+    { url = "https://files.pythonhosted.org/packages/13/2e/e8149f60ab5d5684c6eee881bdf34b127115cddbb958b196768dd9d63473/coverage-7.14.0-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:29fe3da551dface75deb2ccbf87b6b66e2e7ef38f6d89050b428be94afff3490", size = 264398, upload-time = "2026-05-10T18:02:15.063Z" },
+    { url = "https://files.pythonhosted.org/packages/d9/7f/1261b025285323225f4b4abffa5a643649dfd67e25ddca7ebcbdea3b7cb3/coverage-7.14.0-cp314-cp314t-musllinux_1_2_i686.whl", hash = "sha256:b4cc4fce8672fffcb09b0eafc167b396b3ba53c4a7230f54b7aaffbf6c835fa9", size = 262000, upload-time = "2026-05-10T18:02:16.756Z" },
+    { url = "https://files.pythonhosted.org/packages/d3/dc/829c54f60b9d08389439c00f813c752781c496fc5788c78d8006db4b4f2b/coverage-7.14.0-cp314-cp314t-musllinux_1_2_ppc64le.whl", hash = "sha256:5d4a51aad8ba8bdcd2b8bd8f03d4aca19693fa2327a3470e4718a25b03481020", size = 265732, upload-time = "2026-05-10T18:02:18.817Z" },
+    { url = "https://files.pythonhosted.org/packages/ed/b0/70bd1419941652fa062689cba9c3eeafb8f5e6fbb890bce41c3bdda5dbd6/coverage-7.14.0-cp314-cp314t-musllinux_1_2_riscv64.whl", hash = "sha256:9f323af3e1e4f68b60b7b247e37b8515563a61375518fa59de1af48ba28a3db6", size = 260847, upload-time = "2026-05-10T18:02:20.528Z" },
+    { url = "https://files.pythonhosted.org/packages/f2/73/be40b2390656c654d35ea0015ea7ba3d945769cf80790ad5e0bb2d56d2ba/coverage-7.14.0-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:1a0abc7342ea9711c469dd8b821c6c311e6bc6aac1442e5fbd6b27fae0a8f3db", size = 263166, upload-time = "2026-05-10T18:02:22.337Z" },
+    { url = "https://files.pythonhosted.org/packages/29/55/4a643f712fcf7cf2881f8ec1e0ccb7b164aff3108f69b51801246c8799f2/coverage-7.14.0-cp314-cp314t-win32.whl", hash = "sha256:a9f864ef57b7172e2db87a096642dd51e179e085ab6b2c371c29e885f65c8fb2", size = 223573, upload-time = "2026-05-10T18:02:24.11Z" },
+    { url = "https://files.pythonhosted.org/packages/27/96/3acae5da0953be042c0b4dea6d6789d2f080701c77b88e44d5bd41b9219b/coverage-7.14.0-cp314-cp314t-win_amd64.whl", hash = "sha256:29943e552fdc08e082eb51400fb2f58e118a83b5542bd06531214e084399b644", size = 224680, upload-time = "2026-05-10T18:02:25.896Z" },
+    { url = "https://files.pythonhosted.org/packages/93/3d/6ab5d2dd8325d838737c6f8d83d62eb6230e0d70b87b51b57bbfd08fa767/coverage-7.14.0-cp314-cp314t-win_arm64.whl", hash = "sha256:742a73ea621953b012f2c4c2219b512180dd84489acf5b1596b0aafc55b9100b", size = 222703, upload-time = "2026-05-10T18:02:27.822Z" },
+    { url = "https://files.pythonhosted.org/packages/61/e8/cb8e80d6f9f55b99588625062822bf946cf03ed06315df4bd8397f5632a1/coverage-7.14.0-py3-none-any.whl", hash = "sha256:8de5b61163aee3d05c8a2beab6f47913df7981dad1baf82c414d99158c286ab1", size = 211764, upload-time = "2026-05-10T18:02:29.538Z" },
 ]

 [[package]]
@@ -1134,7 +1142,7 @@ name = "decord"
 version = "0.6.0"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
-    { name = "numpy", marker = "(platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x') or (platform_machine != 's390x' and sys_platform != 'linux')" },
+    { name = "numpy", marker = "(platform_machine != 'arm64' and platform_machine != 's390x' and sys_platform == 'darwin') or (platform_machine == 'AMD64' and sys_platform == 'linux') or (platform_machine == 'x86_64' and sys_platform == 'linux') or (platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'linux')" },
 ]
 wheels = [
    { url = "https://files.pythonhosted.org/packages/11/79/936af42edf90a7bd4e41a6cac89c913d4b47fa48a26b042d5129a9242ee3/decord-0.6.0-py3-none-manylinux2010_x86_64.whl", hash = "sha256:51997f20be8958e23b7c4061ba45d0efcd86bffd5fe81c695d0befee0d442976", size = 13602299, upload-time = "2021-06-14T21:30:55.486Z" },
@@ -1201,7 +1209,7 @@ wheels = [

 [[package]]
 name = "dm-control"
-version = "1.0.40"
+version = "1.0.41"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "absl-py" },
@@ -1220,9 +1228,9 @@ dependencies = [
    { name = "setuptools" },
    { name = "tqdm" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/78/27/1d3caa7fa7557b70d766f437979636b6a8c99b14f6e8b8f84795cad9f1df/dm_control-1.0.40.tar.gz", hash = "sha256:af5828af47fe50466008d53b141893a05c4e2779169fc8ef469d1828a016266e", size = 56273764, upload-time = "2026-04-25T22:05:39.202Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/96/49/10beb2d63b05e385bbb67995b6621385c4c80719ad1c71acfc55537b97de/dm_control-1.0.41.tar.gz", hash = "sha256:644113b7bbba5884da57e15adbb28961de775d04e2db8b1bf0304c06835eff16", size = 56275358, upload-time = "2026-05-11T16:27:17.895Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/2d/eb/a762d6f15e6c4faccef8fdcdae50cf8f232800a36b70aef93d38a787bb58/dm_control-1.0.40-py3-none-any.whl", hash = "sha256:cd15b1d95f5b320b7924e518715b8ac132043d575588b7e122f21016f49c7e89", size = 56446428, upload-time = "2026-04-25T22:05:33.561Z" },
+    { url = "https://files.pythonhosted.org/packages/3d/2d/b4b082bdd27a4bfd689a4c4109b3fab828ac56db60b024f07710e496fa65/dm_control-1.0.41-py3-none-any.whl", hash = "sha256:81e89b295aeeca2ba71e8dd1baf26084f9d974f8b80a4a4c284da7b883847161", size = 56446431, upload-time = "2026-05-11T16:27:12.6Z" },
 ]

 [[package]]
@@ -1461,8 +1469,8 @@ name = "flash-attn"
 version = "2.8.3"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
-    { name = "einops" },
-    { name = "torch", version = "2.11.0", source = { registry = "https://pypi.org/simple" }, marker = "sys_platform != 'linux'" },
+    { name = "einops", marker = "platform_machine != 'arm64' or sys_platform != 'darwin'" },
+    { name = "torch", version = "2.11.0", source = { registry = "https://pypi.org/simple" }, marker = "(platform_machine != 'arm64' and sys_platform == 'darwin') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
    { name = "torch", version = "2.11.0+cu128", source = { registry = "https://download.pytorch.org/whl/cu128" }, marker = "sys_platform == 'linux'" },
 ]
 sdist = { url = "https://files.pythonhosted.org/packages/3b/b2/8d76c41ad7974ee264754709c22963447f7f8134613fd9ce80984ed0dab7/flash_attn-2.8.3.tar.gz", hash = "sha256:1e71dd64a9e0280e0447b8a0c2541bad4bf6ac65bdeaa2f90e51a9e57de0370d", size = 8447812, upload-time = "2025-08-15T08:28:12.911Z" }
@@ -2061,11 +2069,11 @@ wheels = [

 [[package]]
 name = "idna"
-version = "3.13"
+version = "3.14"
 source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/ce/cc/762dfb036166873f0059f3b7de4565e1b5bc3d6f28a414c13da27e442f99/idna-3.13.tar.gz", hash = "sha256:585ea8fe5d69b9181ec1afba340451fba6ba764af97026f92a91d4eef164a242", size = 194210, upload-time = "2026-04-22T16:42:42.314Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/05/b1/efac073e0c297ecf2fb33c346989a529d4e19164f1759102dee5953ee17e/idna-3.14.tar.gz", hash = "sha256:466d810d7a2cc1022bea9b037c39728d51ae7dad40d480fc9b7d7ecf98ba8ee3", size = 198272, upload-time = "2026-05-10T20:32:15.935Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/5d/13/ad7d7ca3808a898b4612b6fe93cde56b53f3034dcde235acb1f0e1df24c6/idna-3.13-py3-none-any.whl", hash = "sha256:892ea0cde124a99ce773decba204c5552b69c3c67ffd5f232eb7696135bc8bb3", size = 68629, upload-time = "2026-04-22T16:42:40.909Z" },
+    { url = "https://files.pythonhosted.org/packages/6c/3c/3f62dee257eb3d6b2c1ef2a09d36d9793c7111156a73b5654d2c2305e5ce/idna-3.14-py3-none-any.whl", hash = "sha256:e677eaf072e290f7b725f9acf0b3a2bd55f9fd6f7c70abe5f0e34823d0accf69", size = 72184, upload-time = "2026-05-10T20:32:14.295Z" },
 ]

 [[package]]
@@ -2423,7 +2431,7 @@ dependencies = [
    { name = "nbformat" },
    { name = "packaging" },
    { name = "prometheus-client" },
-    { name = "pywinpty", marker = "os_name == 'nt' and sys_platform != 'linux'" },
+    { name = "pywinpty", marker = "(os_name == 'nt' and platform_machine != 'arm64' and sys_platform == 'darwin') or (os_name == 'nt' and sys_platform != 'darwin' and sys_platform != 'linux')" },
    { name = "pyzmq" },
    { name = "send2trash" },
    { name = "terminado" },
@@ -2441,7 +2449,7 @@ name = "jupyter-server-terminals"
 version = "0.5.4"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
-    { name = "pywinpty", marker = "os_name == 'nt' and sys_platform != 'linux'" },
+    { name = "pywinpty", marker = "(os_name == 'nt' and platform_machine != 'arm64' and sys_platform == 'darwin') or (os_name == 'nt' and sys_platform != 'darwin' and sys_platform != 'linux')" },
    { name = "terminado" },
 ]
 sdist = { url = "https://files.pythonhosted.org/packages/f4/a7/bcd0a9b0cbba88986fe944aaaf91bfda603e5a50bda8ed15123f381a3b2f/jupyter_server_terminals-0.5.4.tar.gz", hash = "sha256:bbda128ed41d0be9020349f9f1f2a4ab9952a73ed5f5ac9f1419794761fb87f5", size = 31770, upload-time = "2026-01-14T16:53:20.213Z" }
@@ -2511,7 +2519,7 @@ wheels = [

 [[package]]
 name = "jupytext"
-version = "1.19.1"
+version = "1.19.2"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "markdown-it-py", marker = "sys_platform == 'linux'" },
@@ -2520,9 +2528,9 @@ dependencies = [
    { name = "packaging", marker = "sys_platform == 'linux'" },
    { name = "pyyaml", marker = "sys_platform == 'linux'" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/13/a5/80c02f307c8ce863cb33e27daf049315e9d96979e14eead700923b5ec9cc/jupytext-1.19.1.tar.gz", hash = "sha256:82587c07e299173c70ed5e8ec7e75183edf1be289ed518bab49ad0d4e3d5f433", size = 4307829, upload-time = "2026-01-25T21:35:13.276Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/72/3a/4f13fcba0ed05965a48fca197d89fb8c78c4b61051dc0c9ee9ed92e77a8d/jupytext-1.19.2.tar.gz", hash = "sha256:da6198a42406a09142b6b26ebc46a3ec7077f525222a8f12b1811a0e289a2216", size = 4309931, upload-time = "2026-05-10T17:10:40.345Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/16/5a/736dd2f4535dbf3bf26523f9158c011389ef88dd06ec2eef67fd744f1c7b/jupytext-1.19.1-py3-none-any.whl", hash = "sha256:d8975035155d034bdfde5c0c37891425314b7ea8d3a6c4b5d18c294348714cd9", size = 170478, upload-time = "2026-01-25T21:35:11.17Z" },
+    { url = "https://files.pythonhosted.org/packages/4c/65/b4b86e5fa07543bfbbcdc6c9f7f9f561e66a5f3539992e3009973f2b1314/jupytext-1.19.2-py3-none-any.whl", hash = "sha256:8a31e896c7e9215841783aade24336e945543057e1c2d7f00b22f9e870348688", size = 170653, upload-time = "2026-05-10T17:10:38.418Z" },
 ]

 [[package]]
@@ -2729,7 +2737,7 @@ all = [
    { name = "scikit-image" },
    { name = "scipy" },
    { name = "teleop" },
-    { name = "torchcodec", marker = "(platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and sys_platform == 'linux') or (platform_machine != 'x86_64' and sys_platform == 'darwin') or (sys_platform != 'darwin' and sys_platform != 'linux' and sys_platform != 'win32')" },
+    { name = "torchcodec", marker = "(platform_machine == 'arm64' and sys_platform == 'darwin') or (platform_machine == 'AMD64' and sys_platform == 'linux') or (platform_machine == 'aarch64' and sys_platform == 'linux') or (platform_machine == 'arm64' and sys_platform == 'linux') or (platform_machine == 'x86_64' and sys_platform == 'linux') or sys_platform == 'win32'" },
    { name = "torchdiffeq" },
    { name = "transformers" },
    { name = "wandb" },
@@ -2742,7 +2750,7 @@ aloha = [
    { name = "pandas" },
    { name = "pyarrow" },
    { name = "scipy" },
-    { name = "torchcodec", marker = "(platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and sys_platform == 'linux') or (platform_machine != 'x86_64' and sys_platform == 'darwin') or (sys_platform != 'darwin' and sys_platform != 'linux' and sys_platform != 'win32')" },
+    { name = "torchcodec", marker = "(platform_machine == 'arm64' and sys_platform == 'darwin') or (platform_machine == 'AMD64' and sys_platform == 'linux') or (platform_machine == 'aarch64' and sys_platform == 'linux') or (platform_machine == 'arm64' and sys_platform == 'linux') or (platform_machine == 'x86_64' and sys_platform == 'linux') or sys_platform == 'win32'" },
 ]
 async = [
    { name = "contourpy" },
@@ -2766,7 +2774,7 @@ core-scripts = [
    { name = "pynput" },
    { name = "pyserial" },
    { name = "rerun-sdk" },
-    { name = "torchcodec", marker = "(platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and sys_platform == 'linux') or (platform_machine != 'x86_64' and sys_platform == 'darwin') or (sys_platform != 'darwin' and sys_platform != 'linux' and sys_platform != 'win32')" },
+    { name = "torchcodec", marker = "(platform_machine == 'arm64' and sys_platform == 'darwin') or (platform_machine == 'AMD64' and sys_platform == 'linux') or (platform_machine == 'aarch64' and sys_platform == 'linux') or (platform_machine == 'arm64' and sys_platform == 'linux') or (platform_machine == 'x86_64' and sys_platform == 'linux') or sys_platform == 'win32'" },
 ]
 damiao = [
    { name = "python-can" },
@@ -2777,7 +2785,7 @@ dataset = [
    { name = "jsonlines" },
    { name = "pandas" },
    { name = "pyarrow" },
-    { name = "torchcodec", marker = "(platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and sys_platform == 'linux') or (platform_machine != 'x86_64' and sys_platform == 'darwin') or (sys_platform != 'darwin' and sys_platform != 'linux' and sys_platform != 'win32')" },
+    { name = "torchcodec", marker = "(platform_machine == 'arm64' and sys_platform == 'darwin') or (platform_machine == 'AMD64' and sys_platform == 'linux') or (platform_machine == 'aarch64' and sys_platform == 'linux') or (platform_machine == 'arm64' and sys_platform == 'linux') or (platform_machine == 'x86_64' and sys_platform == 'linux') or sys_platform == 'win32'" },
 ]
 dataset-viz = [
    { name = "av" },
@@ -2786,7 +2794,7 @@ dataset-viz = [
    { name = "pandas" },
    { name = "pyarrow" },
    { name = "rerun-sdk" },
-    { name = "torchcodec", marker = "(platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and sys_platform == 'linux') or (platform_machine != 'x86_64' and sys_platform == 'darwin') or (sys_platform != 'darwin' and sys_platform != 'linux' and sys_platform != 'win32')" },
+    { name = "torchcodec", marker = "(platform_machine == 'arm64' and sys_platform == 'darwin') or (platform_machine == 'AMD64' and sys_platform == 'linux') or (platform_machine == 'aarch64' and sys_platform == 'linux') or (platform_machine == 'arm64' and sys_platform == 'linux') or (platform_machine == 'x86_64' and sys_platform == 'linux') or sys_platform == 'win32'" },
 ]
 deepdiff-dep = [
    { name = "deepdiff" },
@@ -2849,10 +2857,16 @@ hardware = [
    { name = "pyserial" },
 ]
 hilserl = [
+    { name = "av" },
+    { name = "datasets" },
    { name = "grpcio" },
    { name = "gym-hil" },
+    { name = "jsonlines" },
+    { name = "pandas" },
    { name = "placo" },
    { name = "protobuf" },
+    { name = "pyarrow" },
+    { name = "torchcodec", marker = "(platform_machine == 'arm64' and sys_platform == 'darwin') or (platform_machine == 'AMD64' and sys_platform == 'linux') or (platform_machine == 'aarch64' and sys_platform == 'linux') or (platform_machine == 'arm64' and sys_platform == 'linux') or (platform_machine == 'x86_64' and sys_platform == 'linux') or sys_platform == 'win32'" },
    { name = "transformers" },
 ]
 hopejr = [
@@ -2882,7 +2896,7 @@ libero = [
    { name = "pandas" },
    { name = "pyarrow" },
    { name = "scipy" },
-    { name = "torchcodec", marker = "(platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and sys_platform == 'linux') or (platform_machine != 'x86_64' and sys_platform == 'darwin') or (sys_platform != 'darwin' and sys_platform != 'linux' and sys_platform != 'win32')" },
+    { name = "torchcodec", marker = "(platform_machine == 'arm64' and sys_platform == 'darwin') or (platform_machine == 'AMD64' and sys_platform == 'linux') or (platform_machine == 'aarch64' and sys_platform == 'linux') or (platform_machine == 'arm64' and sys_platform == 'linux') or (platform_machine == 'x86_64' and sys_platform == 'linux') or sys_platform == 'win32'" },
    { name = "transformers" },
 ]
 matplotlib-dep = [
@@ -2897,7 +2911,7 @@ metaworld = [
    { name = "pandas" },
    { name = "pyarrow" },
    { name = "scipy" },
-    { name = "torchcodec", marker = "(platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and sys_platform == 'linux') or (platform_machine != 'x86_64' and sys_platform == 'darwin') or (sys_platform != 'darwin' and sys_platform != 'linux' and sys_platform != 'win32')" },
+    { name = "torchcodec", marker = "(platform_machine == 'arm64' and sys_platform == 'darwin') or (platform_machine == 'AMD64' and sys_platform == 'linux') or (platform_machine == 'aarch64' and sys_platform == 'linux') or (platform_machine == 'arm64' and sys_platform == 'linux') or (platform_machine == 'x86_64' and sys_platform == 'linux') or sys_platform == 'win32'" },
 ]
 multi-task-dit = [
    { name = "diffusers" },
@@ -2938,7 +2952,7 @@ pusht = [
    { name = "pandas" },
    { name = "pyarrow" },
    { name = "pymunk" },
-    { name = "torchcodec", marker = "(platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and sys_platform == 'linux') or (platform_machine != 'x86_64' and sys_platform == 'darwin') or (sys_platform != 'darwin' and sys_platform != 'linux' and sys_platform != 'win32')" },
+    { name = "torchcodec", marker = "(platform_machine == 'arm64' and sys_platform == 'darwin') or (platform_machine == 'AMD64' and sys_platform == 'linux') or (platform_machine == 'aarch64' and sys_platform == 'linux') or (platform_machine == 'arm64' and sys_platform == 'linux') or (platform_machine == 'x86_64' and sys_platform == 'linux') or sys_platform == 'win32'" },
 ]
 pygame-dep = [
    { name = "pygame" },
@@ -2958,6 +2972,11 @@ qwen-vl-utils-dep = [
 reachy2 = [
    { name = "reachy2-sdk" },
 ]
+robometer = [
+    { name = "peft" },
+    { name = "qwen-vl-utils" },
+    { name = "transformers" },
+]
 robstride = [
    { name = "python-can" },
 ]
@@ -2990,7 +3009,7 @@ training = [
    { name = "jsonlines" },
    { name = "pandas" },
    { name = "pyarrow" },
-    { name = "torchcodec", marker = "(platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and sys_platform == 'linux') or (platform_machine != 'x86_64' and sys_platform == 'darwin') or (sys_platform != 'darwin' and sys_platform != 'linux' and sys_platform != 'win32')" },
+    { name = "torchcodec", marker = "(platform_machine == 'arm64' and sys_platform == 'darwin') or (platform_machine == 'AMD64' and sys_platform == 'linux') or (platform_machine == 'aarch64' and sys_platform == 'linux') or (platform_machine == 'arm64' and sys_platform == 'linux') or (platform_machine == 'x86_64' and sys_platform == 'linux') or sys_platform == 'win32'" },
    { name = "wandb" },
 ]
 transformers-dep = [
@@ -3069,6 +3088,7 @@ requires-dist = [
    { name = "lerobot", extras = ["dataset"], marker = "extra == 'aloha'" },
    { name = "lerobot", extras = ["dataset"], marker = "extra == 'core-scripts'" },
    { name = "lerobot", extras = ["dataset"], marker = "extra == 'dataset-viz'" },
+    { name = "lerobot", extras = ["dataset"], marker = "extra == 'hilserl'" },
    { name = "lerobot", extras = ["dataset"], marker = "extra == 'libero'" },
    { name = "lerobot", extras = ["dataset"], marker = "extra == 'metaworld'" },
    { name = "lerobot", extras = ["dataset"], marker = "extra == 'pusht'" },
@@ -3107,6 +3127,7 @@ requires-dist = [
    { name = "lerobot", extras = ["peft"], marker = "extra == 'all'" },
    { name = "lerobot", extras = ["peft-dep"], marker = "extra == 'groot'" },
    { name = "lerobot", extras = ["peft-dep"], marker = "extra == 'peft'" },
+    { name = "lerobot", extras = ["peft-dep"], marker = "extra == 'robometer'" },
    { name = "lerobot", extras = ["peft-dep"], marker = "extra == 'wallx'" },
    { name = "lerobot", extras = ["phone"], marker = "extra == 'all'" },
    { name = "lerobot", extras = ["pi"], marker = "extra == 'all'" },
@@ -3124,6 +3145,7 @@ requires-dist = [
    { name = "lerobot", extras = ["pyzmq-dep"], marker = "extra == 'lekiwi'" },
    { name = "lerobot", extras = ["pyzmq-dep"], marker = "extra == 'unitree-g1'" },
    { name = "lerobot", extras = ["qwen-vl-utils-dep"], marker = "extra == 'eo1'" },
+    { name = "lerobot", extras = ["qwen-vl-utils-dep"], marker = "extra == 'robometer'" },
    { name = "lerobot", extras = ["qwen-vl-utils-dep"], marker = "extra == 'sarm'" },
    { name = "lerobot", extras = ["qwen-vl-utils-dep"], marker = "extra == 'wallx'" },
    { name = "lerobot", extras = ["reachy2"], marker = "extra == 'all'" },
@@ -3145,6 +3167,7 @@ requires-dist = [
    { name = "lerobot", extras = ["transformers-dep"], marker = "extra == 'multi-task-dit'" },
    { name = "lerobot", extras = ["transformers-dep"], marker = "extra == 'peft'" },
    { name = "lerobot", extras = ["transformers-dep"], marker = "extra == 'pi'" },
+    { name = "lerobot", extras = ["transformers-dep"], marker = "extra == 'robometer'" },
    { name = "lerobot", extras = ["transformers-dep"], marker = "extra == 'sarm'" },
    { name = "lerobot", extras = ["transformers-dep"], marker = "extra == 'smolvla'" },
    { name = "lerobot", extras = ["transformers-dep"], marker = "extra == 'wallx'" },
@@ -3202,7 +3225,9 @@ requires-dist = [
    { name = "timm", marker = "extra == 'groot'", specifier = ">=1.0.0,<1.1.0" },
    { name = "torch", marker = "sys_platform != 'linux'", specifier = ">=2.7,<2.12.0" },
    { name = "torch", marker = "sys_platform == 'linux'", specifier = ">=2.7,<2.12.0", index = "https://download.pytorch.org/whl/cu128" },
-    { name = "torchcodec", marker = "(platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and sys_platform == 'linux' and extra == 'dataset') or (platform_machine != 'x86_64' and sys_platform == 'darwin' and extra == 'dataset') or (sys_platform != 'darwin' and sys_platform != 'linux' and sys_platform != 'win32' and extra == 'dataset')", specifier = ">=0.3.0,<0.12.0" },
+    { name = "torchcodec", marker = "(platform_machine == 'arm64' and sys_platform == 'darwin' and extra == 'dataset') or (platform_machine == 'AMD64' and sys_platform == 'linux' and extra == 'dataset') or (platform_machine == 'x86_64' and sys_platform == 'linux' and extra == 'dataset')", specifier = ">=0.3.0,<0.12.0" },
+    { name = "torchcodec", marker = "(platform_machine == 'aarch64' and sys_platform == 'linux' and extra == 'dataset') or (platform_machine == 'arm64' and sys_platform == 'linux' and extra == 'dataset')", specifier = ">=0.11.0,<0.12.0" },
+    { name = "torchcodec", marker = "sys_platform == 'win32' and extra == 'dataset'", specifier = ">=0.7.0,<0.12.0" },
    { name = "torchdiffeq", marker = "extra == 'wallx'", specifier = ">=0.2.4,<0.3.0" },
    { name = "torchvision", marker = "sys_platform != 'linux'", specifier = ">=0.22.0,<0.27.0" },
    { name = "torchvision", marker = "sys_platform == 'linux'", specifier = ">=0.22.0,<0.27.0", index = "https://download.pytorch.org/whl/cu128" },
@@ -3210,66 +3235,66 @@ requires-dist = [
    { name = "transformers", marker = "extra == 'transformers-dep'", specifier = ">=5.4.0,<5.6.0" },
    { name = "wandb", marker = "extra == 'training'", specifier = ">=0.24.0,<0.25.0" },
 ]
-provides-extras = ["dataset", "training", "hardware", "viz", "core-scripts", "evaluation", "dataset-viz", "av-dep", "pygame-dep", "placo-dep", "transformers-dep", "grpcio-dep", "can-dep", "peft-dep", "scipy-dep", "diffusers-dep", "qwen-vl-utils-dep", "matplotlib-dep", "pyserial-dep", "deepdiff-dep", "pynput-dep", "pyzmq-dep", "feetech", "dynamixel", "damiao", "robstride", "openarms", "gamepad", "hopejr", "lekiwi", "unitree-g1", "reachy2", "kinematics", "intelrealsense", "phone", "diffusion", "wallx", "pi", "smolvla", "multi-task-dit", "groot", "sarm", "xvla", "eo1", "hilserl", "async", "peft", "dev", "notebook", "test", "video-benchmark", "aloha", "pusht", "libero", "metaworld", "all"]
+provides-extras = ["dataset", "training", "hardware", "viz", "core-scripts", "evaluation", "dataset-viz", "av-dep", "pygame-dep", "placo-dep", "transformers-dep", "grpcio-dep", "can-dep", "peft-dep", "scipy-dep", "diffusers-dep", "qwen-vl-utils-dep", "matplotlib-dep", "pyserial-dep", "deepdiff-dep", "pynput-dep", "pyzmq-dep", "feetech", "dynamixel", "damiao", "robstride", "openarms", "gamepad", "hopejr", "lekiwi", "unitree-g1", "reachy2", "kinematics", "intelrealsense", "phone", "diffusion", "wallx", "pi", "smolvla", "multi-task-dit", "groot", "sarm", "robometer", "xvla", "eo1", "hilserl", "async", "peft", "dev", "notebook", "test", "video-benchmark", "aloha", "pusht", "libero", "metaworld", "all"]

 [[package]]
 name = "librt"
-version = "0.10.0"
+version = "0.11.0"
 source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/39/cb/c1945e506893b5b8577fb45a60c80e3ffe4a82092a04a6f29b0b951d9a24/librt-0.10.0.tar.gz", hash = "sha256:1aba1e8aa4e3307a7be68a74149545fde7451964dc0235a8bec5704a17bdda42", size = 191799, upload-time = "2026-05-05T16:31:23.535Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/40/08/9e7f6b5d2b5bed6ad055cdd5925f192bb403a51280f86b56554d9d0699a2/librt-0.11.0.tar.gz", hash = "sha256:075dc3ef4458a278e0195cbf6ac9d38808d9b906c5a6c7f7f79c3888276a3fb1", size = 200139, upload-time = "2026-05-10T18:17:25.138Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/12/8e/cbb5b6f6e45e65c10a42449a69eaccc44d73e6a081ea752fbc5221c6dc1c/librt-0.10.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:b4b58a44b407e91f633dafee008de9ddea6aa2a555ed94929c099260910bd0ba", size = 77327, upload-time = "2026-05-05T16:29:38.919Z" },
-    { url = "https://files.pythonhosted.org/packages/e9/3d/8233cbee8e99e6a8992f02bfc2dec8d787509566a511d1fde2574ee7473f/librt-0.10.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:950b79b11762531bdf45a9df909d2f9a2a8445c70c88665c01d14c8511a27dc5", size = 79971, upload-time = "2026-05-05T16:29:40.96Z" },
-    { url = "https://files.pythonhosted.org/packages/87/6f/5264b298cef2b72fc97d2dde56c66181eda35204bf5dcd1ed0c3d0a0a782/librt-0.10.0-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:4538453f51be197633b425912c150e25b0667252d3741c53e8368176d98d9d37", size = 246559, upload-time = "2026-05-05T16:29:42.701Z" },
-    { url = "https://files.pythonhosted.org/packages/07/7b/19b1b859cc60d5f99276cc2b3144d91556c6d1b1e4ebb50359696bebf7a8/librt-0.10.0-cp312-cp312-manylinux2014_i686.manylinux_2_17_i686.manylinux_2_28_i686.whl", hash = "sha256:70b955f091beac93e994a0b7ec616934f63b3ea5c3d6d7af847562f935aceca7", size = 235216, upload-time = "2026-05-05T16:29:44.193Z" },
-    { url = "https://files.pythonhosted.org/packages/6e/56/a2f40717142a8af46289f57874ef914353d8faccd5e4f8e594ab1e16e8c7/librt-0.10.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:483e685e06b6163728ba6c85d74315176be7190f432ec2a41226e5e14355d5f0", size = 263108, upload-time = "2026-05-05T16:29:46.365Z" },
-    { url = "https://files.pythonhosted.org/packages/67/ca/15c625c3bdc0167c01e04ef8878317e9713f3bfa788438342f7a94c7b22c/librt-0.10.0-cp312-cp312-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:7ac53d946a009d1a38c44a60812708c9458fb2a239a5f630d8e625571386650f", size = 255280, upload-time = "2026-05-05T16:29:48.087Z" },
-    { url = "https://files.pythonhosted.org/packages/ed/c5/ba301d571d9e05844e2435b73aba30bee77bb75ce155c9affcfd2173dd03/librt-0.10.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:bc8771c9fcf0ea894ca41fdc2abd83572c2fbda221f232d86e718614e57ff513", size = 268829, upload-time = "2026-05-05T16:29:49.628Z" },
-    { url = "https://files.pythonhosted.org/packages/8b/60/af70e135bc1f1fe15dd3894b1e4bbefc7ecdf911749a925a39eb86ceb2a1/librt-0.10.0-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:70805dbc5257892ac572f86290a61e3c8d90224ecce1a8b2d1f7ed51965417f4", size = 262051, upload-time = "2026-05-05T16:29:51.244Z" },
-    { url = "https://files.pythonhosted.org/packages/83/c2/c8236eb8b421bac5a172ba208f965abaa89805da2a3fa112bdf1764caf8f/librt-0.10.0-cp312-cp312-musllinux_1_2_riscv64.whl", hash = "sha256:d3b4f300f7bcba6e2ff73fb8bef1898479e9772bfa2682998c636391633ec826", size = 264347, upload-time = "2026-05-05T16:29:53.013Z" },
-    { url = "https://files.pythonhosted.org/packages/d6/f5/15b6d32bc25dacd4a60886a683d8128d6219910c122202b995a40dd4f8d2/librt-0.10.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:943bc943f92f4fb3408fae62485c6a3ad68ce4f2ee205643a39641525c19a276", size = 286482, upload-time = "2026-05-05T16:29:54.675Z" },
-    { url = "https://files.pythonhosted.org/packages/fb/8e/b1b959bacd323eb4360579db992513e1406d1c6ef7edb57b5511fd0666fd/librt-0.10.0-cp312-cp312-win32.whl", hash = "sha256:6065c1a758fba1010b41401013903d3d5d2750eab425ddedd584abac31d0630e", size = 62955, upload-time = "2026-05-05T16:29:56.39Z" },
-    { url = "https://files.pythonhosted.org/packages/9e/4c/d4cd6e4b9fc24098e63cc85537d1b6689682aee96809c38f08072067cc2b/librt-0.10.0-cp312-cp312-win_amd64.whl", hash = "sha256:d788ecbe208ab352dab0e105cc06057bf9a2fc7e58cabb0d751ad9e30062b9e2", size = 71191, upload-time = "2026-05-05T16:29:57.682Z" },
-    { url = "https://files.pythonhosted.org/packages/2b/19/8641da1f63d24b92354a492f893c022d6b3a0df44e70c8eff49364613983/librt-0.10.0-cp312-cp312-win_arm64.whl", hash = "sha256:6003d1f295bdba02656dc81308208fc060d0a51d8c0d0a6db70f7f3c57b9ba0a", size = 61432, upload-time = "2026-05-05T16:29:58.971Z" },
-    { url = "https://files.pythonhosted.org/packages/e5/29/681a75c82f4cc90d29e4b257a3299b79fe13fe927a04c57b8109d70b6957/librt-0.10.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:f0ede79d682e73f91c1b599a76d78b7464b9b5d213754cedb13372d9df36e596", size = 77299, upload-time = "2026-05-05T16:30:00.209Z" },
-    { url = "https://files.pythonhosted.org/packages/62/24/0c7ca445a55d04be79cac19819437fd094782347fa116f6681844fa6143e/librt-0.10.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:e0ba0b131fdb336c8b9c948e397f4a7e649d0f783b529f07b647bf4961df392e", size = 79930, upload-time = "2026-05-05T16:30:01.555Z" },
-    { url = "https://files.pythonhosted.org/packages/fe/1f/1e2b8f6443ef9e9a81e89486ca70e22f3684f93db003ce6eaefc3d0839b9/librt-0.10.0-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:2728117da2afb96fb957768725ee43dc9a2d73b031e02da424b818a3cdd3a275", size = 246195, upload-time = "2026-05-05T16:30:03.261Z" },
-    { url = "https://files.pythonhosted.org/packages/74/61/9dc9e03de0439ad84c1c240aac8b747f12c90cb797ea6042f7bdb8d3410f/librt-0.10.0-cp313-cp313-manylinux2014_i686.manylinux_2_17_i686.manylinux_2_28_i686.whl", hash = "sha256:723ba80594c49cdf0584196fc430752262605dc9449902fc9bd3d9b79976cb77", size = 234951, upload-time = "2026-05-05T16:30:04.881Z" },
-    { url = "https://files.pythonhosted.org/packages/55/f4/635223117d7590875bca441275065a3bf491203ad4208bd1cc3ffd90c5a1/librt-0.10.0-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:7292edaaca294a61a978c53a3c7d6130d099b0dfbc8f0a65916cdc6b891b9852", size = 262768, upload-time = "2026-05-05T16:30:06.638Z" },
-    { url = "https://files.pythonhosted.org/packages/e5/66/b04152d0cd8b6ca2b428a8bd3230343230c35ed304a932f35b5375f2f828/librt-0.10.0-cp313-cp313-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:89fe9d539f2c10a1666633eeeac507ce95dd06d9ecc58de3c6390dba156a3d3a", size = 255075, upload-time = "2026-05-05T16:30:08.216Z" },
-    { url = "https://files.pythonhosted.org/packages/35/1e/25bac4c7f2ca36f0e612cade186970683cf79153d96beccc3a11a9e19b97/librt-0.10.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:4efa7b9587503fa5b67f40593302b9c8836d211d222ff9f7cafe67be5f8f0b10", size = 268559, upload-time = "2026-05-05T16:30:10.1Z" },
-    { url = "https://files.pythonhosted.org/packages/18/54/4601faab35b6632a13200faa146ca62bfd111ffbe2568be430d65c89493a/librt-0.10.0-cp313-cp313-musllinux_1_2_i686.whl", hash = "sha256:22dc982ef59df0136df36092ccbdbb570ced8aafb33e49585739b2f1de1c13b6", size = 261753, upload-time = "2026-05-05T16:30:11.912Z" },
-    { url = "https://files.pythonhosted.org/packages/1b/cf/39f4023509e94fade8b074666fa3292db9cb6b34ea5dcbe7af53df9fca1d/librt-0.10.0-cp313-cp313-musllinux_1_2_riscv64.whl", hash = "sha256:6f2e5f3606253a84cea719c94a3bb1c54487b5d617d0254d46e0920d8a06be3f", size = 264055, upload-time = "2026-05-05T16:30:13.465Z" },
-    { url = "https://files.pythonhosted.org/packages/8e/00/40247209fc46a8e308a91412d5206aedf8efb667ee89eb625820106a5c2f/librt-0.10.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:40884bfaa1e29f6b6a9be255007d8f359bfc9e61d68bdef8ed3158bfcbc95df9", size = 286190, upload-time = "2026-05-05T16:30:15.073Z" },
-    { url = "https://files.pythonhosted.org/packages/d8/6e/5566beb94431a985abe1787af5ef86e087750172ff9d0bbf20f93e88132d/librt-0.10.0-cp313-cp313-win32.whl", hash = "sha256:3cd34cd8254eba756660bff6c2da91278248184301054fe3e4feb073bdd49b14", size = 62949, upload-time = "2026-05-05T16:30:16.503Z" },
-    { url = "https://files.pythonhosted.org/packages/d0/c2/3ea3301d6c8dff51d39dbe8ed75db3dc92896947d4afb5eeadf821c1e67f/librt-0.10.0-cp313-cp313-win_amd64.whl", hash = "sha256:7baac5313e2d8dce1386f97777a8d03ab28f5fe1e780b3b9ac2ee7544551fedc", size = 71152, upload-time = "2026-05-05T16:30:17.766Z" },
-    { url = "https://files.pythonhosted.org/packages/3c/de/5d49cb92cadcbc77d3abc27b93fd6030ed8437487dde2eae38cab5e6704d/librt-0.10.0-cp313-cp313-win_arm64.whl", hash = "sha256:afc5b4406c8e2515698d922a5c7823a009312835ea58196671fff40e35cb8166", size = 61336, upload-time = "2026-05-05T16:30:19.021Z" },
-    { url = "https://files.pythonhosted.org/packages/6a/64/7165e08108cc185a13a9c069f0685e6ef92e70e07fddf7edf5e7348c6316/librt-0.10.0-cp314-cp314-macosx_10_13_x86_64.whl", hash = "sha256:f09588a30e6a22ec624090d72a3ab1a6d4d5485c3ed739603e76aa3c16efa688", size = 76794, upload-time = "2026-05-05T16:30:20.392Z" },
-    { url = "https://files.pythonhosted.org/packages/ae/ef/bf8613febf651b90c5222ee79dea5ae58d4cc2b544df69d3033424448934/librt-0.10.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:131ade118d12bd7a0adc4e655474a553f1b76cf78385868885944d21d51e45e0", size = 79662, upload-time = "2026-05-05T16:30:22.025Z" },
-    { url = "https://files.pythonhosted.org/packages/b6/67/9eddd165c1d8397bdf99b38bf12b5a55b3def5035b49eedb49f2775d1430/librt-0.10.0-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:b8b9ab28e40d011c373a189eae900c916e66d6fbecf7983e9e4883089ee085ef", size = 242390, upload-time = "2026-05-05T16:30:23.51Z" },
-    { url = "https://files.pythonhosted.org/packages/10/d1/d95da80334501866cd37004ab5d7483220d05862fab4b5405394f0264f0d/librt-0.10.0-cp314-cp314-manylinux2014_i686.manylinux_2_17_i686.manylinux_2_28_i686.whl", hash = "sha256:67c39bb30da73bae1f293d1ed8bc2f8f6642649dd0928d3600aeff3041ac23d6", size = 232603, upload-time = "2026-05-05T16:30:25.198Z" },
-    { url = "https://files.pythonhosted.org/packages/0c/fa/e6d64d28718bc1be4e1736fcb037ca1c4dfca927e7167df75a7d5215665e/librt-0.10.0-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:8c3273c6b774614f093c8927c2bf1b077d0fefde988fe98f46a333734e5597ab", size = 259187, upload-time = "2026-05-05T16:30:26.772Z" },
-    { url = "https://files.pythonhosted.org/packages/72/3f/3fdb77e7f937dad59cfd76b720be7e7643400ec76b2da35befab8d66ba30/librt-0.10.0-cp314-cp314-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:9dd7c1b86a4baa583ab5db977484b93a2c474e69e96ef3e9538387ea54229cb9", size = 251846, upload-time = "2026-05-05T16:30:28.56Z" },
-    { url = "https://files.pythonhosted.org/packages/18/ca/f4d49133dd86a6f55d79eca30bf412fa722f511a9abe67f62f57aa64e66a/librt-0.10.0-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:a77385c5a202e831149f7ad03be9e67cf80e957e52c614e83dcb822c95222eb8", size = 264936, upload-time = "2026-05-05T16:30:30.491Z" },
-    { url = "https://files.pythonhosted.org/packages/de/66/a8df2fbadc1f6c1827a096d11c40175bd526133480bd3bc88ec64a03d257/librt-0.10.0-cp314-cp314-musllinux_1_2_i686.whl", hash = "sha256:c6a5eafa74b5655bad59886138ed68426f098a6beb8cb95a71f2cc3cd8bb33fe", size = 258699, upload-time = "2026-05-05T16:30:32.002Z" },
-    { url = "https://files.pythonhosted.org/packages/bb/73/1e3c83613fe05451bb969e27b68a573d177f08d5f63533cc29fec0989658/librt-0.10.0-cp314-cp314-musllinux_1_2_riscv64.whl", hash = "sha256:1fc93d0439204c50ab4d1512611ce2c206f1b369b419f69c7c27c761561e3291", size = 259825, upload-time = "2026-05-05T16:30:35.077Z" },
-    { url = "https://files.pythonhosted.org/packages/09/24/5e2f926ee9d3ef348d9339526d7062abb5c44d8419e3179528c01d78c102/librt-0.10.0-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:79e713c178bc7a744adfbee6b4619a288eecc0c914da2a9313a20255abe2f0cf", size = 282548, upload-time = "2026-05-05T16:30:36.639Z" },
-    { url = "https://files.pythonhosted.org/packages/fc/7d/3e89ed6ad0162561fa8bef9df3195e24263104c955713cd0237d3711fad2/librt-0.10.0-cp314-cp314-win32.whl", hash = "sha256:2eba9d955a68c41d9f326be3da42f163ec3518b7ab20f1c826224e7bed71e0bf", size = 58970, upload-time = "2026-05-05T16:30:38.183Z" },
-    { url = "https://files.pythonhosted.org/packages/76/25/579e731c94a7086a268bfa3e7a4945cd47836bebd3cbf3faeafd2e7eaef9/librt-0.10.0-cp314-cp314-win_amd64.whl", hash = "sha256:cbfaf7f5145e9917f5d18bffa298eff6a19d74e7b8b11dabdca95785befe8dbf", size = 67260, upload-time = "2026-05-05T16:30:39.804Z" },
-    { url = "https://files.pythonhosted.org/packages/6e/f8/235822b7ae0b2334f12ee18bcf2476d07924077a5efeea57dbe927704be2/librt-0.10.0-cp314-cp314-win_arm64.whl", hash = "sha256:8d6d385d1969849a6b1397114df22714b6ded917bada98668e3e974dc663477e", size = 57156, upload-time = "2026-05-05T16:30:41.412Z" },
-    { url = "https://files.pythonhosted.org/packages/9f/e3/9b919cbf1e8eb770bf91bb7df28125e0f1daf4587169afefd95402636e9a/librt-0.10.0-cp314-cp314t-macosx_10_13_x86_64.whl", hash = "sha256:6c3a82d3bd32631ef5c79922dfc028520c9ad840255979ab4d908271818039ee", size = 79150, upload-time = "2026-05-05T16:30:42.761Z" },
-    { url = "https://files.pythonhosted.org/packages/6a/f5/72a944aa3bc3498169a168087eff58ca48b58bf1b704e59d091fd30739f3/librt-0.10.0-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:d64cc66005dc324c9bb1fa3fc2841f529002f6eb15966d55e46d430f56955a6a", size = 82304, upload-time = "2026-05-05T16:30:44.082Z" },
-    { url = "https://files.pythonhosted.org/packages/9c/e3/fcc290a33e295019759472dfa794d204e43504b276ac65eab7fd9da20ea3/librt-0.10.0-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:9bb562cd28c88cd2c6a9a6c78f99dc39348d6b16c94adc25de0e574acf1176e9", size = 272556, upload-time = "2026-05-05T16:30:45.497Z" },
-    { url = "https://files.pythonhosted.org/packages/fd/54/546975e4c997573885e7f040a05012f8838e06fb12b0c3c1fbb76254e9d7/librt-0.10.0-cp314-cp314t-manylinux2014_i686.manylinux_2_17_i686.manylinux_2_28_i686.whl", hash = "sha256:b809aa2854d019c28773b03605df22adc675ee4f3f4402d673581313e8906119", size = 256941, upload-time = "2026-05-05T16:30:47.059Z" },
-    { url = "https://files.pythonhosted.org/packages/70/8c/f1d03401571b331653acddbd4e8cd955c06d945241dd08b25192fac0d04b/librt-0.10.0-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:cc15acabdd519bd4176fdadc2119e5e3093485d86f89138daf47e5b4cedb983a", size = 285855, upload-time = "2026-05-05T16:30:48.86Z" },
-    { url = "https://files.pythonhosted.org/packages/0c/08/62cf80ff046c339faf56718b3a940244d4beb70f1c6407289b5830ec11e9/librt-0.10.0-cp314-cp314t-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:b1b2d835307d08ddadd94568e2369648ec9173bd3eea6d7f52a1abe717c81f98", size = 275321, upload-time = "2026-05-05T16:30:50.63Z" },
-    { url = "https://files.pythonhosted.org/packages/d9/ea/da5918d4070362e9a4d2ee9cd34f9dc84902daad8fd4275f8504a727ff4e/librt-0.10.0-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:d261c6a2f93335a5167887fb0223e8b98ffce20ee3fde242e8e58a37ece6d0e5", size = 293993, upload-time = "2026-05-05T16:30:52.577Z" },
-    { url = "https://files.pythonhosted.org/packages/c9/8d/68b6086bed1fcdc314c640ea04e31e52d18052e08059fa595409d66a51a9/librt-0.10.0-cp314-cp314t-musllinux_1_2_i686.whl", hash = "sha256:e2ffd44963f8e7f68995504d90f9881d64e94dc1d8e310039b9526108fc0c0f7", size = 284254, upload-time = "2026-05-05T16:30:55.086Z" },
-    { url = "https://files.pythonhosted.org/packages/06/c8/b810f1d84ec34a5a7ed93d7b510ab04164d75fbdf23088d5c3fbe6b08357/librt-0.10.0-cp314-cp314t-musllinux_1_2_riscv64.whl", hash = "sha256:5f285f6455ed495791c4d8630e5af732960adea93cac4c893d15619f2eae53e8", size = 284925, upload-time = "2026-05-05T16:30:56.728Z" },
-    { url = "https://files.pythonhosted.org/packages/5a/00/3c82d4158c5a2c62528b8fccce65a8c9ad700e480e86f9389387435089a5/librt-0.10.0-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:f6034ff52e663d34c7b82ef2aa2f94ad7c1d939e2368e63b06844bc4d127d2e1", size = 307830, upload-time = "2026-05-05T16:30:58.377Z" },
-    { url = "https://files.pythonhosted.org/packages/99/3a/9c635ac3e8a00383ff689161d3eac8a30b3b2ddc711b40471e6b8983ea29/librt-0.10.0-cp314-cp314t-win32.whl", hash = "sha256:657860fd877fba6a241ea088ef99f63ca819945d3c715265da670bad56c37ebe", size = 60147, upload-time = "2026-05-05T16:31:00.293Z" },
-    { url = "https://files.pythonhosted.org/packages/dc/e8/6f65f3e565d4ac212cddddd552eacc8035ffdf941ca0ad6fe945a211d41f/librt-0.10.0-cp314-cp314t-win_amd64.whl", hash = "sha256:56ded2d66010203a0cb5af063b609e3f079531a0e5e576d618dece859fd2e1af", size = 68649, upload-time = "2026-05-05T16:31:01.778Z" },
-    { url = "https://files.pythonhosted.org/packages/51/78/a0705a67cacd81e5fa01a5035b3adbdfbb43a7b8d4bd27e2b282ae61baf2/librt-0.10.0-cp314-cp314t-win_arm64.whl", hash = "sha256:1ee63f30abf18ed4830fdbaf87b2b6f4bba1e198d46085c314edde4045e56715", size = 58247, upload-time = "2026-05-05T16:31:03.191Z" },
+    { url = "https://files.pythonhosted.org/packages/8b/d0/07c77e067f0838949b43bd89232c29d72efebb9d2801a9750184eb706b71/librt-0.11.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:b87504f1690a23b9a2cca841191a04f83895d4fc2dd04df91d82b1a04ca2ad46", size = 144147, upload-time = "2026-05-10T18:15:53.227Z" },
+    { url = "https://files.pythonhosted.org/packages/7a/24/8493538fa4f62f982686398a5b8f68008138a75086abdea19ade64bf4255/librt-0.11.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:40071fc5fe0ce8daa6de616702314a01e1250711682b0523d6ab8d4525910cb3", size = 143614, upload-time = "2026-05-10T18:15:54.657Z" },
+    { url = "https://files.pythonhosted.org/packages/ff/1e/f8bad050810d9171f34a1648ed910e56814c2ba61639f2bd53c6377ae24b/librt-0.11.0-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:137e79445c896a0ea7b265f52d23954e05b64222ee1af69e2cb34219067cbb67", size = 485538, upload-time = "2026-05-10T18:15:56.117Z" },
+    { url = "https://files.pythonhosted.org/packages/c0/fe/3594ebfbaf03084ba4b120c9ba5c3183fd938a48725e9bbe6ff0a5159ad8/librt-0.11.0-cp312-cp312-manylinux2014_i686.manylinux_2_17_i686.manylinux_2_28_i686.whl", hash = "sha256:cca6644054e78746d8d4ef238681f9c34ff8b584fe6b988ecebb8db3b15e622a", size = 479623, upload-time = "2026-05-10T18:15:57.544Z" },
+    { url = "https://files.pythonhosted.org/packages/b0/da/5d1876984b3746c85dbd219dbfcb73c85f54ee263fd32e5b2a632ec14571/librt-0.11.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:d5b0eea49f5562861ee8d757a32ef7d559c1d35be2aaaa1ec28941d74c9ffc8a", size = 513082, upload-time = "2026-05-10T18:15:58.805Z" },
+    { url = "https://files.pythonhosted.org/packages/19/6e/55bdf5d5ca00c3e18430690bf2c953d8d3ffd3c337418173d33dec985dc9/librt-0.11.0-cp312-cp312-manylinux_2_34_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:0d1029d7e1ae1a7e647ed6fb5df8c4ce2dffefb7a9f5fd1376a4554d96dac09f", size = 508105, upload-time = "2026-05-10T18:16:00.2Z" },
+    { url = "https://files.pythonhosted.org/packages/07/10/f1f23a7c595ee90ece4d35c851e5d104b1311a887ed1b4ac4c35bbd13da8/librt-0.11.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:bc3ce6b33c5828d9e80592011a5c584cb2ce86edbc4088405f70da47dc1d1b3b", size = 522268, upload-time = "2026-05-10T18:16:01.708Z" },
+    { url = "https://files.pythonhosted.org/packages/b6/02/5720f5697a7f54b78b3aefbe20df3a48cedcff1276618c4aa481177942ed/librt-0.11.0-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:936c5995f3514a42111f20099397d8177c79b4d7e70961e396c6f5a0a3566766", size = 527348, upload-time = "2026-05-10T18:16:03.496Z" },
+    { url = "https://files.pythonhosted.org/packages/50/db/b4a47c6f91db4ff76348a0b3dd0cc65e090a078b765a810a62ff9434c3d3/librt-0.11.0-cp312-cp312-musllinux_1_2_riscv64.whl", hash = "sha256:9bc0ca6ad9381cbe8e4aa6e5726e4c80c78115a6e9723c599ed1d73e092bc49d", size = 516294, upload-time = "2026-05-10T18:16:05.173Z" },
+    { url = "https://files.pythonhosted.org/packages/9e/58/9384b2f4eb1ed1d273d40948a7c5c4b2360213b402ef3be4641c06299f9c/librt-0.11.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:070aa8c26c0a74774317a72df8851facc7f0f012a5b406557ac56992d92e1ec8", size = 553608, upload-time = "2026-05-10T18:16:06.839Z" },
+    { url = "https://files.pythonhosted.org/packages/21/7b/5aa8848a7c6a9278c79375146da1812e695754ceec5f005e6043461a7315/librt-0.11.0-cp312-cp312-win32.whl", hash = "sha256:6bf14feb84b05ae945277395451998c89c54d0def4070eb5c08de544930b245a", size = 101879, upload-time = "2026-05-10T18:16:08.103Z" },
+    { url = "https://files.pythonhosted.org/packages/37/33/8a745436944947575b584231750a41417de1a38cf6a2e9251d1065651c09/librt-0.11.0-cp312-cp312-win_amd64.whl", hash = "sha256:75672f0bc524ede266287d532d7923dbce94c7514ad07627bac3d0c6d92cc4d9", size = 119831, upload-time = "2026-05-10T18:16:09.174Z" },
+    { url = "https://files.pythonhosted.org/packages/59/67/a6739ac96e28b7855808bdb0370e250606104a859750d209e5a0716fe7ab/librt-0.11.0-cp312-cp312-win_arm64.whl", hash = "sha256:2f10cf143e4a9bb0f4f5af568a00df94a2d69ef41c2579584454bb0fe5cc642c", size = 103470, upload-time = "2026-05-10T18:16:10.369Z" },
+    { url = "https://files.pythonhosted.org/packages/82/61/e59168d4d0bf2bf90f4f0caf7a001bfc60254c3af4586013b04dc3ef517b/librt-0.11.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:78dc31f7fdfe9c9d0eb0e8f42d139db230e826415bbcabd9f0e9faaaee909894", size = 144119, upload-time = "2026-05-10T18:16:11.771Z" },
+    { url = "https://files.pythonhosted.org/packages/61/fd/caa1d60b12f7dd79ccea23054e06eeaebe266a5f52c40a6b651069200ce5/librt-0.11.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:fa475675db22290c3158e1d42326d0f5a65f04f44a0e68c3630a25b53560fb9c", size = 143565, upload-time = "2026-05-10T18:16:13.334Z" },
+    { url = "https://files.pythonhosted.org/packages/b8/a9/dc744f5c2b4978d48db970be29f22716d3413d28b14ad99740817315cf2c/librt-0.11.0-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:621db29691044bdeda22e789e482e1b0f3a985d90e3426c9c6d17606416205ea", size = 485395, upload-time = "2026-05-10T18:16:14.729Z" },
+    { url = "https://files.pythonhosted.org/packages/8f/21/7f8e97a1e4dae952a5a95948f6f8507a173bc1e669f54340bba6ca1ca31b/librt-0.11.0-cp313-cp313-manylinux2014_i686.manylinux_2_17_i686.manylinux_2_28_i686.whl", hash = "sha256:a9010e2ed5b3a9e158c5fd966b3ab7e834bb3d3aacc8f66c91dd4b57a3799230", size = 479383, upload-time = "2026-05-10T18:16:16.321Z" },
+    { url = "https://files.pythonhosted.org/packages/a6/6d/d8ee9c114bebf2c50e29ec2aa940826fccb62a645c3e4c18760987d0e16d/librt-0.11.0-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:7c39513d8b7477a2e1ed8c43fc21c524e8d5a0f8d4e8b7b074dbdbe7820a08e2", size = 513010, upload-time = "2026-05-10T18:16:17.647Z" },
+    { url = "https://files.pythonhosted.org/packages/f0/43/0b5708af2bd30a46400e72ba6bdaa8f066f15fb9a688527e34220e8d6c06/librt-0.11.0-cp313-cp313-manylinux_2_34_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:7aef3cf1d5af86e770ab04bfd993dfc4ae8b8c17f66fb77dd4a7d50de7bbb1a3", size = 508433, upload-time = "2026-05-10T18:16:19.309Z" },
+    { url = "https://files.pythonhosted.org/packages/4a/50/356187247d09013490481033183b3532b58acf8028bcb34b2b56a375c9b2/librt-0.11.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:557183ddc36babe46b27dd60facbd5adb4492181a5be887587d57cda6e092f21", size = 522595, upload-time = "2026-05-10T18:16:20.642Z" },
+    { url = "https://files.pythonhosted.org/packages/40/e7/c6ac4240899c7f3248079d5a9900debe0dadb3fdeaf856684c987105ba47/librt-0.11.0-cp313-cp313-musllinux_1_2_i686.whl", hash = "sha256:83d3e1f72bd42f6c5c0b7daec530c3f829bd02db42c70b8ddf0c2d90a2459930", size = 527255, upload-time = "2026-05-10T18:16:22.352Z" },
+    { url = "https://files.pythonhosted.org/packages/eb/b5/a81322dbeedeeaf9c1ee6f001734d28a09d8383ac9e6779bc24bbd0743c6/librt-0.11.0-cp313-cp313-musllinux_1_2_riscv64.whl", hash = "sha256:4ce1f21fbe589bc1afd7872dece84fb0e1144f794a288e58a10d2c54a55c43be", size = 516847, upload-time = "2026-05-10T18:16:23.627Z" },
+    { url = "https://files.pythonhosted.org/packages/ae/66/6e6323787d592b55204a42595ff1102da5115601b53a7e9ddebc889a6da5/librt-0.11.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:970b09f7044ea2b64c9da42fd3d335666518cfd1c6e8a182c95da73d0214b41e", size = 553920, upload-time = "2026-05-10T18:16:25.025Z" },
+    { url = "https://files.pythonhosted.org/packages/9c/21/623f8ca230857102066d9ca8c6c1734995908c4d0d1bee7bb2ef0021cb33/librt-0.11.0-cp313-cp313-win32.whl", hash = "sha256:78fddc31cd4d3caa897ad5d31f856b1faadc9474021ad6cb182b9018793e254e", size = 101898, upload-time = "2026-05-10T18:16:26.649Z" },
+    { url = "https://files.pythonhosted.org/packages/b3/1d/b4ebd44dd723f768469007515cb92251e0ae286c94c140f374801140fa74/librt-0.11.0-cp313-cp313-win_amd64.whl", hash = "sha256:8ca8aa88751a775870b764e93bad5135385f563cb8dcee399abf034ea4d3cb47", size = 119812, upload-time = "2026-05-10T18:16:27.859Z" },
+    { url = "https://files.pythonhosted.org/packages/3b/e4/b2f4ca7965ca373b491cdb4bc25cdb30c1649ca81a8782056a83850292a9/librt-0.11.0-cp313-cp313-win_arm64.whl", hash = "sha256:96f044bb325fd9cf1a723015638c219e9143f0dfbc0ca54c565df2b7fc748b44", size = 103448, upload-time = "2026-05-10T18:16:29.066Z" },
+    { url = "https://files.pythonhosted.org/packages/29/eb/dbce197da4e227779e56b5735f2decc3eb36e55a1cdbf1bd65d6639d76c1/librt-0.11.0-cp314-cp314-macosx_10_13_x86_64.whl", hash = "sha256:4a017a95e5837dc15a8c5661d60e05daa96b90908b1aa6b7acdf443cd25c8ebd", size = 143345, upload-time = "2026-05-10T18:16:30.674Z" },
+    { url = "https://files.pythonhosted.org/packages/76/a3/254bebd0c11c8ba684018efb8006ff22e466abce445215cca6c778e7d9de/librt-0.11.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:b1ecbd9819deccc39b7542bf4d2a740d8a620694d39989e58661d3763458f8d4", size = 143131, upload-time = "2026-05-10T18:16:32.037Z" },
+    { url = "https://files.pythonhosted.org/packages/f1/3f/f77d6122d21ac7bf6ae8a7dfced1bd2a7ac545d3273ebdcaf8042f6d619f/librt-0.11.0-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:7da327dacd7be8f8ec36547373550744a3cc0e536d54665cd83f8bcd961200e8", size = 477024, upload-time = "2026-05-10T18:16:33.493Z" },
+    { url = "https://files.pythonhosted.org/packages/ac/0a/2c996dadebaa7d9bbbd43ef2d4f3e66b6da545f838a41694ef6172cebec8/librt-0.11.0-cp314-cp314-manylinux2014_i686.manylinux_2_17_i686.manylinux_2_28_i686.whl", hash = "sha256:0dc56b1f8d06e60db362cc3fdae206681817f86ce4725d34511473487f12a34b", size = 474221, upload-time = "2026-05-10T18:16:34.864Z" },
+    { url = "https://files.pythonhosted.org/packages/0a/7e/f5d92af8486b8272c23b3e686b46ff72d89c8169585eb61eef01a2ac7147/librt-0.11.0-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:05fb8fb2ab90e21c8d12ea240d744ad514da9baf381ebfa70d91d20d21713175", size = 505174, upload-time = "2026-05-10T18:16:36.705Z" },
+    { url = "https://files.pythonhosted.org/packages/af/1a/cb0734fe86398eb33193ab753b7326255c74cac5eb09e76b9b16536e7adb/librt-0.11.0-cp314-cp314-manylinux_2_34_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:cae74872be221df4374d10fec61f93ed1513b9546ea84f2c0bf73ab3e9bd0b03", size = 497216, upload-time = "2026-05-10T18:16:38.418Z" },
+    { url = "https://files.pythonhosted.org/packages/18/06/094820f91558b66e29943c0ec41c9914f460f48dd51fc503c3101e10842d/librt-0.11.0-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:32bcc918c0148eb7e3d57385125bac7e5f9e4359d05f07448b09f6f778c2f31c", size = 513921, upload-time = "2026-05-10T18:16:39.848Z" },
+    { url = "https://files.pythonhosted.org/packages/0b/c2/00de9018871a282f530cacb457d5ec0428f6ac7e6fedde9aff7468d9fb04/librt-0.11.0-cp314-cp314-musllinux_1_2_i686.whl", hash = "sha256:f9743fc99135d5f78d2454435615f6dec0473ca507c26ce9d92b10b562a280d3", size = 520850, upload-time = "2026-05-10T18:16:41.471Z" },
+    { url = "https://files.pythonhosted.org/packages/51/9d/64631832348fd1834fb3a61b996434edddaaf25a31d03b0a76273159d2cf/librt-0.11.0-cp314-cp314-musllinux_1_2_riscv64.whl", hash = "sha256:5ba067f4aadae8fda802d91d2124c90c42195ff32d9161d3549e6d05cfe26f96", size = 504237, upload-time = "2026-05-10T18:16:43.15Z" },
+    { url = "https://files.pythonhosted.org/packages/a5/ec/ae5525eb16edc827a044e7bb8777a455ff95d4bca9379e7e6bddd7383647/librt-0.11.0-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:de3bf945454d032f9e390b85c4072e0a0570bf825421c8be0e71209fa65e1abe", size = 546261, upload-time = "2026-05-10T18:16:44.408Z" },
+    { url = "https://files.pythonhosted.org/packages/5a/09/adce371f27ca039411da9659f7430fcc2ba6cd0c7b3e4467a0f091be7fa9/librt-0.11.0-cp314-cp314-win32.whl", hash = "sha256:d2277a05f6dcb9fd13db9566aac4fabd68c3ea1ea46ee5567d4eef8efa495a2f", size = 96965, upload-time = "2026-05-10T18:16:46.039Z" },
+    { url = "https://files.pythonhosted.org/packages/d6/ee/8ac720d98548f173c7ce2e632a7ca94673f74cacd5c8162a84af5b35958a/librt-0.11.0-cp314-cp314-win_amd64.whl", hash = "sha256:ab73e8db5e3f564d812c1f5c3a175930a5f9bc96ccb5e3b22a34d7858b401cf7", size = 115151, upload-time = "2026-05-10T18:16:47.133Z" },
+    { url = "https://files.pythonhosted.org/packages/94/20/c900cf14efeb09b6bef2b2dff20779f73464b97fd58d1c6bccc379588ae3/librt-0.11.0-cp314-cp314-win_arm64.whl", hash = "sha256:aea3caa317752e3a466fa8af45d91ee0ea8c7fdd96e42b0a8dd9b76a7931eba1", size = 98850, upload-time = "2026-05-10T18:16:48.597Z" },
+    { url = "https://files.pythonhosted.org/packages/0c/71/944bfe4b64e12abffcd3c15e1cce07f72f3d55655083786285f4dedeb532/librt-0.11.0-cp314-cp314t-macosx_10_13_x86_64.whl", hash = "sha256:d1b36540d7aaf9b9101b3a6f376c8d8e9f7a9aec93ed05918f2c69d493ffef72", size = 151138, upload-time = "2026-05-10T18:16:49.839Z" },
+    { url = "https://files.pythonhosted.org/packages/b6/10/99e64a5c86989357fda078c8143c533389585f6473b7439172dd8f3b3b2d/librt-0.11.0-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:efbb343ab2ce3540f4ecbe6315d677ed70f37cd9a72b1e58066c918ca83acbaa", size = 151976, upload-time = "2026-05-10T18:16:51.062Z" },
+    { url = "https://files.pythonhosted.org/packages/21/31/5072ad880946d83e5ea4147d6d018c78eefce85b77819b19bdd0ee229435/librt-0.11.0-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:aa0dd688aab3f7914d3e6e5e3554978e0383312fb8e771d84be008a35b9ee548", size = 557927, upload-time = "2026-05-10T18:16:52.632Z" },
+    { url = "https://files.pythonhosted.org/packages/5e/8d/70b5fb7cfbab60edbe7381614ab985da58e144fbf465c86d44c95f43cdca/librt-0.11.0-cp314-cp314t-manylinux2014_i686.manylinux_2_17_i686.manylinux_2_28_i686.whl", hash = "sha256:f5fb36b8c6c63fdcbb1d526d94c0d1331610d43f4118cc1beb4efef4f3faacb2", size = 539698, upload-time = "2026-05-10T18:16:53.934Z" },
+    { url = "https://files.pythonhosted.org/packages/fa/a3/ba3495a0b3edbd24a4cae0d1d3c64f39a9fc45d06e812101289b50c1a619/librt-0.11.0-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:4a9a237d13addb93715b6fee74023d5ee3469b53fce527626c0e088aa585805f", size = 577162, upload-time = "2026-05-10T18:16:55.589Z" },
+    { url = "https://files.pythonhosted.org/packages/f7/db/36e25fb81f99937ff1b96612a1dc9fd66f039cb9cc3aee12c01fac31aab9/librt-0.11.0-cp314-cp314t-manylinux_2_34_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:5ddd17bd87b2c56ddd60e546a7984a2e64c4e8eab92fb4cf3830a48ad5469d51", size = 566494, upload-time = "2026-05-10T18:16:56.975Z" },
+    { url = "https://files.pythonhosted.org/packages/33/0d/3f622b47f0b013eeb9cf4cc07ae9bfe378d832a4eec998b2b209fe84244d/librt-0.11.0-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:bd43992b4473d42f12ff9e68326079f0696d9d4e6000e8f39a0238d482ba6ee2", size = 596858, upload-time = "2026-05-10T18:16:58.374Z" },
+    { url = "https://files.pythonhosted.org/packages/a9/02/71b90bc93039c46a2000651f6ad60122b114c8f54c4ad306e0e96f5b75ad/librt-0.11.0-cp314-cp314t-musllinux_1_2_i686.whl", hash = "sha256:f8e3e8056dd674e279741485e2e512d6e9a751c7455809d0114e6ebf8d781085", size = 590318, upload-time = "2026-05-10T18:16:59.676Z" },
+    { url = "https://files.pythonhosted.org/packages/04/04/418cb3f75621e2b761fb1ab0f017f4d70a1a72a6e7c74ee4f7e8d198c2f3/librt-0.11.0-cp314-cp314t-musllinux_1_2_riscv64.whl", hash = "sha256:c1f708d8ae9c56cf38a903c44297243d2ec83fd82b396b977e0144a3e76217e3", size = 575115, upload-time = "2026-05-10T18:17:01.007Z" },
+    { url = "https://files.pythonhosted.org/packages/cc/2c/5a2183ac58dd911f26b5d7e7d7d8f1d87fcecdddd99d6c12169a258ff62c/librt-0.11.0-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:0add982e0e7b9fc14cf4b33789d5f13f66581889b88c2f58099f6ce8f92617bd", size = 617918, upload-time = "2026-05-10T18:17:02.682Z" },
+    { url = "https://files.pythonhosted.org/packages/15/1f/dc6771a52592a4451be6effa200cbfc9cec61e4393d3033d81a9d307961d/librt-0.11.0-cp314-cp314t-win32.whl", hash = "sha256:2b481d846ac894c4e8403c5fd0e87c5d11d6499e404b474602508a224ff531c8", size = 103562, upload-time = "2026-05-10T18:17:03.99Z" },
+    { url = "https://files.pythonhosted.org/packages/62/4a/7d1415567027286a75ba1093ec4aca11f073e0f559c530cf3e0a757ad55c/librt-0.11.0-cp314-cp314t-win_amd64.whl", hash = "sha256:28edb433edde181112a908c78907af28f964eabc15f4dd16c9d66c834302677c", size = 124327, upload-time = "2026-05-10T18:17:05.465Z" },
+    { url = "https://files.pythonhosted.org/packages/ce/62/b40b382fa0c66fee1478073eb8db352a4a6beda4a1adccf1df911d8c289c/librt-0.11.0-cp314-cp314t-win_arm64.whl", hash = "sha256:dee008f20b542e3cd162ba338a7f9ec0f6d23d395f66fe8aeeec3c9d067ea253", size = 102572, upload-time = "2026-05-10T18:17:06.809Z" },
 ]

 [[package]]
@@ -3647,7 +3672,7 @@ wheels = [

 [[package]]
 name = "mujoco"
-version = "3.8.0"
+version = "3.8.1"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "absl-py" },
@@ -3656,23 +3681,23 @@ dependencies = [
    { name = "numpy" },
    { name = "pyopengl" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/e2/d8/9aae1a021b6e15ee69d805d893e01dda71cbaae1c75d5f8ec8e12916cb7c/mujoco-3.8.0.tar.gz", hash = "sha256:250afe57458d6881b2d7659fa0029a128cb57cbbb620268d95647fb9ad742183", size = 918250, upload-time = "2026-04-24T22:59:07.531Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/ff/ba/ad135f7a4a71360072bc4f202f7ab3130d7d7827cff70ce3e1b382a1a410/mujoco-3.8.1.tar.gz", hash = "sha256:019a0b3406892bc98454eaf55cbd27f85d167758ad785f77c608a61f3a34ad17", size = 922269, upload-time = "2026-05-11T13:44:01.357Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/b4/0d/35aad24bef1f36e9ebf63367938b16abec82407338d612c37624ff20b0e3/mujoco-3.8.0-cp312-cp312-macosx_10_16_x86_64.whl", hash = "sha256:a495da0cd01aff6ac94ec97f0a1d913e1afe071daf107e220f81814435227982", size = 7265096, upload-time = "2026-04-24T22:58:33.475Z" },
-    { url = "https://files.pythonhosted.org/packages/60/d7/2ee5a123431eb50f234de2759e46f3d0c02876e0b1ffce1b26102ed388e7/mujoco-3.8.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:cc1d25b0cd47248fd39681310950b2bea0f6098f57358c0c02730d365bb80ba1", size = 7204862, upload-time = "2026-04-24T22:58:35.978Z" },
-    { url = "https://files.pythonhosted.org/packages/aa/62/a488e6e0963e0210b8262650d25e51c4c597ff7beed4fe01a7e88e3abfc5/mujoco-3.8.0-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:980ab5a2210777cf766e53eb574726f9360e2a87e47d83a6c8d801fb71f2fe52", size = 6743542, upload-time = "2026-04-24T22:58:38.137Z" },
-    { url = "https://files.pythonhosted.org/packages/6f/de/bc2271210dad5c6ab73af294779226308e9cf4ed8bc2dbe59922eb8702ed/mujoco-3.8.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:323fedd14905b73cfe56ea8ff916716ccf8b57cff348a7aa6932c8983a465d64", size = 7226045, upload-time = "2026-04-24T22:58:41.117Z" },
-    { url = "https://files.pythonhosted.org/packages/65/87/198a88747ff0c01e35070c0c80ae0c05ff8d1a61d6e6f379a4e5ff3e6185/mujoco-3.8.0-cp312-cp312-win_amd64.whl", hash = "sha256:8db22dbc5a6c98241549c8161f20a2b0c2ccc5d08fa42595e7a4b594e35a70dd", size = 5813167, upload-time = "2026-04-24T22:58:43.469Z" },
-    { url = "https://files.pythonhosted.org/packages/11/7d/41c73ebe93565ed196ec5ad012232138e3d10850e841ccd77d459afc4383/mujoco-3.8.0-cp313-cp313-macosx_10_16_x86_64.whl", hash = "sha256:d4a080aab0be4d02162e6fe3bcd7163c01cc751638f5a84ba05477b512d95cc0", size = 7265494, upload-time = "2026-04-24T22:58:45.119Z" },
-    { url = "https://files.pythonhosted.org/packages/7c/05/d21b43c31c5d9179c2d33e0d38896775b262a9d78729b760717927a02e28/mujoco-3.8.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:5aa987e70a6601ebf02d123d9842f2d1b8f8057163feec1f0a5a049de1cbe252", size = 7205049, upload-time = "2026-04-24T22:58:46.874Z" },
-    { url = "https://files.pythonhosted.org/packages/6d/9c/af181776d0ffb70ad6a4365f0613529f268782850dedab0569c6cce83fcc/mujoco-3.8.0-cp313-cp313-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:8bd33b7e2382605012dfe41c00a8d3bb358153e5b019d920a52c31664472ce20", size = 6743578, upload-time = "2026-04-24T22:58:48.862Z" },
-    { url = "https://files.pythonhosted.org/packages/89/8a/c9b28784a7e51926d609b70842e16b85e286df87ad861dbbb26c4e49cacf/mujoco-3.8.0-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:f2b3de0c9fed950c5080ea4b3ff1fb5c89f88e22798f1e1693ec8dbbd36de00b", size = 7226464, upload-time = "2026-04-24T22:58:51.039Z" },
-    { url = "https://files.pythonhosted.org/packages/50/3f/0a72c74dd766524b9f1b79f0d6d327b9a797d87b44fe62b3068b44123b54/mujoco-3.8.0-cp313-cp313-win_amd64.whl", hash = "sha256:de03d173f4d9c7341b5dcc10a8eddb36bb19989df68f24369dc7c782dc053f11", size = 5813265, upload-time = "2026-04-24T22:58:53.316Z" },
-    { url = "https://files.pythonhosted.org/packages/81/f7/afdbcb4ad50786ed7500205f29ffb5c3a5ef9d42e6b3ad8f9636c4911687/mujoco-3.8.0-cp314-cp314-macosx_10_16_x86_64.whl", hash = "sha256:09c27fc6ce1560912e920789bc121290e4c84919ae30f7b54da5efed4cd2804a", size = 7320672, upload-time = "2026-04-24T22:58:55.469Z" },
-    { url = "https://files.pythonhosted.org/packages/4a/8a/6299f6209084dda9469374461c77adad5d63041427b9b9bd4fadaf0c35b0/mujoco-3.8.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:5fc4e930cd1414f965381ac97ec054a001539e7aa462836145f6a3201b0dfe88", size = 7249791, upload-time = "2026-04-24T22:58:57.126Z" },
-    { url = "https://files.pythonhosted.org/packages/df/3c/a74d169b7725aee971962238d2aee767f64496ede1367cd558361c97d5ca/mujoco-3.8.0-cp314-cp314-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:14c32992267906d422ed2127e99aa9ad036a62324139da2a3bd25df1e928d0ee", size = 6754009, upload-time = "2026-04-24T22:59:00.375Z" },
-    { url = "https://files.pythonhosted.org/packages/66/36/f3610724bb35f6cfb2ffebe9d8d315975e5fa9722146ad58298df442da7e/mujoco-3.8.0-cp314-cp314-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:1373c4744a3424f1aa224d2ad5201497ea150c4698beb9aaeb8c3560efa60fdb", size = 7227764, upload-time = "2026-04-24T22:59:02.722Z" },
-    { url = "https://files.pythonhosted.org/packages/92/5a/80f5347c322300e4402b08df74e7489756e9af060bb8b8d342086dd5b41c/mujoco-3.8.0-cp314-cp314-win_amd64.whl", hash = "sha256:f8da8fc4a2861f9d1eef64d83adcd783bfe5c02bdc78af2d963d942d097dfdce", size = 6144239, upload-time = "2026-04-24T22:59:05.329Z" },
+    { url = "https://files.pythonhosted.org/packages/3d/4f/ccfaf99fab0042bef61da2cd21d47f41fcff069744b42bc6829d9da6b4fc/mujoco-3.8.1-cp312-cp312-macosx_10_16_x86_64.whl", hash = "sha256:b1c5213eb8fc53ee8c0713391eda5f0766088766025e8b46e1dfd3cb9f00db71", size = 7356297, upload-time = "2026-05-11T13:43:30.654Z" },
+    { url = "https://files.pythonhosted.org/packages/ec/33/64e4de2ff6f0ead6c9d204efdd68aea3ca235058829f69b957625768495d/mujoco-3.8.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:06e36128883f484e173d02e6140ae09c1ff9845dbc7fb605b02361d9d2ecaccf", size = 7245562, upload-time = "2026-05-11T13:43:32.736Z" },
+    { url = "https://files.pythonhosted.org/packages/09/a1/597263af3519b1ac1c09bb8fc8f8985e90e4990c8aff62ca1a0c035e824c/mujoco-3.8.1-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:9e4e951de3ab2ba5187b4807c648ac551fd6ae27ff04c7e25304b357d35d7fe8", size = 6784316, upload-time = "2026-05-11T13:43:34.837Z" },
+    { url = "https://files.pythonhosted.org/packages/0e/2f/8a911be3ed84436bc46b5bdcf4ead6ac9590d0ddc82c006834c49ffd60c4/mujoco-3.8.1-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:2970c90913a52b31eade56f27e5439905f08f1479e7991e257144ca829fa1267", size = 7322546, upload-time = "2026-05-11T13:43:37.074Z" },
+    { url = "https://files.pythonhosted.org/packages/83/58/d580c4abf7dda5231fed0886a626e1815e40c2d118bf7d0052a617435d6e/mujoco-3.8.1-cp312-cp312-win_amd64.whl", hash = "sha256:496b61c863d544e076990d1601555f71e522eb3b7aaae227dfbc5dea522f88a1", size = 5893327, upload-time = "2026-05-11T13:43:39.022Z" },
+    { url = "https://files.pythonhosted.org/packages/79/fb/55222063035a96748e78fcd690deb5ffbb4f000bbb24eec8220173f6101c/mujoco-3.8.1-cp313-cp313-macosx_10_16_x86_64.whl", hash = "sha256:1728de8572674cb146e134e96ecee8ed2c2f5e01d4aacacf42d703a441cfcea5", size = 7356742, upload-time = "2026-05-11T13:43:41.121Z" },
+    { url = "https://files.pythonhosted.org/packages/1a/97/636a41e34f9a9bcae6c59e8d884b12762b6bc2a912e95a40e1e3e02884f2/mujoco-3.8.1-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:96811f3903f208b3facca4cb7be98207e4b481cafcc1c3082b1184c28f778665", size = 7245661, upload-time = "2026-05-11T13:43:42.935Z" },
+    { url = "https://files.pythonhosted.org/packages/2e/0c/d8c164aef4a2d51ecb23a81db1ebac9e7da99233e6207ea4801b507ebc92/mujoco-3.8.1-cp313-cp313-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:d64076a564455c68550e3bc1be4788e0e121c6e38db014eee39b1e1d7a733415", size = 6784309, upload-time = "2026-05-11T13:43:44.771Z" },
+    { url = "https://files.pythonhosted.org/packages/f4/76/5b9d2a223d91c66a5ddf1aad2293c33ac795cabe8de0fa292550c28db374/mujoco-3.8.1-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:ad3cfa1c809bf70a6308aa1db7b1e44d019f7a499c0b0ac8a4ddff102f4fae54", size = 7322830, upload-time = "2026-05-11T13:43:46.859Z" },
+    { url = "https://files.pythonhosted.org/packages/e3/e8/e0432f14495d561aa0d1c77f05096a8c8a91f934b703a324c1abda0e57ba/mujoco-3.8.1-cp313-cp313-win_amd64.whl", hash = "sha256:9010b5929c2ad924537a528b0299c2dc552cdbb136cea6da24544793590c2487", size = 5893667, upload-time = "2026-05-11T13:43:48.987Z" },
+    { url = "https://files.pythonhosted.org/packages/9f/83/757e83c9f8291e8ea84105cd76de076f674e120d98b6cd21716504d7bfec/mujoco-3.8.1-cp314-cp314-macosx_10_16_x86_64.whl", hash = "sha256:5717c8cdfe0360a42f8f65af53b2f5e3c38318f0c4ace181b81583db4e8fbc8f", size = 7412564, upload-time = "2026-05-11T13:43:51.538Z" },
+    { url = "https://files.pythonhosted.org/packages/27/cc/1feae64fb8dc40d2ebb5b8938a13acd3ba5dfb9acf49c3e5907d9d42faf8/mujoco-3.8.1-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:b7a311b25c8284729505d1dc130501633ee2b00249d0102009131a4b23f0ce2f", size = 7295087, upload-time = "2026-05-11T13:43:53.848Z" },
+    { url = "https://files.pythonhosted.org/packages/85/7c/ca2fc8f4d0054eeb7dafe7ed047bda47661f077c22dc09e6350e46a576fc/mujoco-3.8.1-cp314-cp314-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:804c66272e6ffd2223c7c69d020dc21e35986d2c633b20ec92087f34acfb40e2", size = 6794464, upload-time = "2026-05-11T13:43:56.357Z" },
+    { url = "https://files.pythonhosted.org/packages/7f/9a/88e689fa5e21ff6e1fcd870429d33eb0f2ef58d5330db2ca0e6bea91b7d2/mujoco-3.8.1-cp314-cp314-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:2d4fcdb57b4ea7d8730db3d995e900f5c6281fafbd552161c363e6ff9e839959", size = 7324034, upload-time = "2026-05-11T13:43:58.091Z" },
+    { url = "https://files.pythonhosted.org/packages/b2/e5/8542d94dd3d73af37befc6b646c927f8448d851a61a5b9814371e581e0fb/mujoco-3.8.1-cp314-cp314-win_amd64.whl", hash = "sha256:e32717426b59e2619b5a44253856978a1312f12912e6404907acb551df3da4ff", size = 6227305, upload-time = "2026-05-11T13:43:59.932Z" },
 ]

 [[package]]
@@ -3793,7 +3818,7 @@ wheels = [

 [[package]]
 name = "mypy"
-version = "2.0.0"
+version = "2.1.0"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "ast-serialize" },
@@ -3802,37 +3827,37 @@ dependencies = [
    { name = "pathspec" },
    { name = "typing-extensions" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/cf/dc/7e6d49f04fca40b9dd5c752a51a432ffe67fb45200702bc9eee0cb4bbb26/mypy-2.0.0.tar.gz", hash = "sha256:1a9e3900ac5c40f1fe813506c7739da6e6f0eab2729067ebd94bfb0bbba53532", size = 3869036, upload-time = "2026-05-06T19:26:43.22Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/82/15/cca9d88503549ed6fedeaa1d448cdddd542ee8a490232d732e278036fbf2/mypy-2.1.0.tar.gz", hash = "sha256:81e76ad12c2d804512e9b13240d1588316531bfba07558286078bfbce9613633", size = 3898359, upload-time = "2026-05-11T18:37:36.237Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/f6/4b/f6cd12ef1eb63be1c342da3e8ca811d2280276177f6de4ef20cb2366d79b/mypy-2.0.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:660790551c988e69d8bf7a35c8b4149edeb22f4a339165702be843532e9dcdb5", size = 14756610, upload-time = "2026-05-06T19:26:19.221Z" },
-    { url = "https://files.pythonhosted.org/packages/32/73/67d09ca28bee21feaca264b2a680cf2d300bcc2071136ad064928324c843/mypy-2.0.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:7a15bf92cd8781f8e72f69ffa7e30d1f434402d065ee1ecd5223ef2ef100f914", size = 13554270, upload-time = "2026-05-06T19:26:08.977Z" },
-    { url = "https://files.pythonhosted.org/packages/61/b3/44718b5c6b1b5a27440ff2effe6a1be0fa2a190c0f4e2e21a83728416f95/mypy-2.0.0-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:4ff370b43d7def05bbcd2f5267f0bcda72dd6a552ef2ea9375b02d6fe06da270", size = 13924663, upload-time = "2026-05-06T19:21:24.932Z" },
-    { url = "https://files.pythonhosted.org/packages/6a/2b/bbb9cc5773f946846a7c340097e59bcf84095437dda0d56bb4f6cf1f6541/mypy-2.0.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:37bd246590a018e5a11703b7b09c39d47ede3df5ba3fa863c5b8590b465beb01", size = 14946862, upload-time = "2026-05-06T19:24:23.023Z" },
-    { url = "https://files.pythonhosted.org/packages/43/25/e9318566f443a5130b4ff0ad3367ee6c4c4c49ff083fe5214a7318c18282/mypy-2.0.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:cce87e92214fac8bf8feb8a680d0c1b6fb748d50e9b57fbb13e4b1d83a3ed19b", size = 15175090, upload-time = "2026-05-06T19:26:28.794Z" },
-    { url = "https://files.pythonhosted.org/packages/67/65/2ec28c834f21e164c33bc296a7db538ad50c74f83e517c7a0be95ff6de86/mypy-2.0.0-cp312-cp312-win_amd64.whl", hash = "sha256:e19e9cb69b66a4141009d24898259914fa2b71d026de0b46edf9fafdbf4fd46e", size = 11052899, upload-time = "2026-05-06T19:25:39.084Z" },
-    { url = "https://files.pythonhosted.org/packages/9e/72/d1ec625cfc9bd101c07a6834ef1f94e820296f8fdbad2eb03f50e0983f8c/mypy-2.0.0-cp312-cp312-win_arm64.whl", hash = "sha256:b021614cb08d44785b025982163ec3c39c94bff766ead071fa9e82b4ef6f62cd", size = 9972935, upload-time = "2026-05-06T19:23:24.204Z" },
-    { url = "https://files.pythonhosted.org/packages/e5/c6/996a1e535e5d0d597c3b1460fc962733091f885f312e749350eb2ac10965/mypy-2.0.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:9ef5f581b61240d1cc629b12f8df6565ed6ffac0d82ed745eef7833222ab50b9", size = 14737259, upload-time = "2026-05-06T19:20:23.081Z" },
-    { url = "https://files.pythonhosted.org/packages/94/c5/0f9460e26b77f434bd53f47d1ce32a3cd4580c92a5331fa5dfc059f9421a/mypy-2.0.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:20e3470a165dbc249bdfbe8d1c5172727ef22688cffc279f8c3aa264ab9d4d9a", size = 13538377, upload-time = "2026-05-06T19:21:08.804Z" },
-    { url = "https://files.pythonhosted.org/packages/b2/3e/8ea2f8dd1e5c9c279fb3c28193bdb850adf4d3d8172880abad829eced609/mypy-2.0.0-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:224ba142eee8b4d65d4db657cb1fc22abec30b135ded6ab297302ba1f62e505d", size = 13914264, upload-time = "2026-05-06T19:24:12.875Z" },
-    { url = "https://files.pythonhosted.org/packages/be/ce/78bd3b8520f676acee9dab48ea71473e68f6d5cf14b59fbd800bea50a92b/mypy-2.0.0-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:2e879ad8a03908ff74d15e8a9b42bf049918e6798d52c011011f1873d0b5877e", size = 14926761, upload-time = "2026-05-06T19:20:12.846Z" },
-    { url = "https://files.pythonhosted.org/packages/61/ef/b52fa340522da3d22e669117c3b83155c2660f7cdc035856958fbfffb224/mypy-2.0.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:65c5c15bcbd18d6fe927cc55c459597a3517d69cc3123f067be3b020010e115e", size = 15157014, upload-time = "2026-05-06T19:25:49.78Z" },
-    { url = "https://files.pythonhosted.org/packages/7a/0c/dde7614250c6d017936c7aa3bb63b9b52c7cfd298d3f1be9be45f307870b/mypy-2.0.0-cp313-cp313-win_amd64.whl", hash = "sha256:d1a068acd7c9fb77e9f8923f1556f2f49d6d7895821121b8d97fa5642b9c52f5", size = 11067049, upload-time = "2026-05-06T19:21:16.116Z" },
-    { url = "https://files.pythonhosted.org/packages/27/ec/1d6af4830a94a285442db19caa02f160cc1a255e4f324eec5458e6c2bafb/mypy-2.0.0-cp313-cp313-win_arm64.whl", hash = "sha256:ef9d96da1ddffbc21f27d3939319b6846d12393baa17c4d2f3e81e040e73ce2c", size = 9967903, upload-time = "2026-05-06T19:22:15.52Z" },
-    { url = "https://files.pythonhosted.org/packages/ce/2c/6fefe954207860aed6eeb91776795e64a257d3ce0360862288984ce121f5/mypy-2.0.0-cp314-cp314-macosx_10_15_x86_64.whl", hash = "sha256:c918c64e8ce36557851b0347f84eb12f1965d3a06813c36df253eb0c0afd1d82", size = 14729633, upload-time = "2026-05-06T19:24:53.383Z" },
-    { url = "https://files.pythonhosted.org/packages/23/d6/d336f5b820af189eb0390cce21de62d264c0a4e64713dfbe81bfc4fc7739/mypy-2.0.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:301f1a8ccc7d79b542ee218b28bb49443a83e194eb3d10da63ff1649e5aa5d34", size = 13559524, upload-time = "2026-05-06T19:22:24.906Z" },
-    { url = "https://files.pythonhosted.org/packages/af/a6/d7bb54fde1770f0484e5fbdbdce37a41e95ed0a1cd493ec60ead111e356c/mypy-2.0.0-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:fdf4ef489d44ce350bac3fd699907834e551d4c934e9cc862ef201215ab1558d", size = 13936018, upload-time = "2026-05-06T19:25:02.992Z" },
-    { url = "https://files.pythonhosted.org/packages/7d/ba/5be51316b91e6a6bf6e3a8adb3de500e7e1fb5bf9491743b8cbc81a34a2c/mypy-2.0.0-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:9cde2d0989f912fc850890f727d0d76495e7a6c5bdd9912a1efdb64952b4398d", size = 14910712, upload-time = "2026-05-06T19:25:21.83Z" },
-    { url = "https://files.pythonhosted.org/packages/b7/37/e2c8c3b373e20ebfb66e6c83a99027fd67df4ec43b08879f74e822d2dc4c/mypy-2.0.0-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:cdf05693c231a14fe37dbfce192a3a1372c26a833af4a80f550547742952e719", size = 15141499, upload-time = "2026-05-06T19:20:50.924Z" },
-    { url = "https://files.pythonhosted.org/packages/12/36/07756f933e00416d912e35878cfcf89a593a3350a885691c0bb85ae0226a/mypy-2.0.0-cp314-cp314-win_amd64.whl", hash = "sha256:73aee2da33a2237e66cbe84a94780e53599847e86bb3aa7b93e405e8cd9905f2", size = 11240511, upload-time = "2026-05-06T19:21:32.39Z" },
-    { url = "https://files.pythonhosted.org/packages/70/05/79ac1f20f2397353f3845f7b8bb5d8006cda7c8ef9092f04f9de3c6135f2/mypy-2.0.0-cp314-cp314-win_arm64.whl", hash = "sha256:1f6dcd8f39971f41edab2728c877c4ac8b50ad3c387ff2770423b79a05d23910", size = 10149336, upload-time = "2026-05-06T19:22:08.383Z" },
-    { url = "https://files.pythonhosted.org/packages/53/e0/0db84e0ebbad6e99e566c68e4b465784f2a2294f7719e8db9d509ef23087/mypy-2.0.0-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:a04e980b9275c76159da66c6e1723c7798306f9802b31bdaf9358d0c84030ce8", size = 15797362, upload-time = "2026-05-06T19:22:00.835Z" },
-    { url = "https://files.pythonhosted.org/packages/0a/a4/14cc0768164dd53bec48aa41a20270b18df9bf72aa5054278bf133608315/mypy-2.0.0-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:33f9cf4825469b2bc73c53ba55f6d9a9b4cdb60f9e6e228745581520f29b8771", size = 14635914, upload-time = "2026-05-06T19:23:43.675Z" },
-    { url = "https://files.pythonhosted.org/packages/08/48/d866a3e23b4dc5974c77d9cf65a435bf22de01a84dd4620917950e233960/mypy-2.0.0-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:191675c3c7dc2a5c7722a035a6909c277f14046c5e4e02aa5fbf65f8524f08ad", size = 15270866, upload-time = "2026-05-06T19:22:34.756Z" },
-    { url = "https://files.pythonhosted.org/packages/71/eb/de9ef94958eb2078a6b908ceb247757dc384d3a238d3bd6ed7d81de5eaf8/mypy-2.0.0-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:c3d26c4321a3b06fc9f04c741e0733af693f82d823f8e64e47b2e63b7f19fa84", size = 16093131, upload-time = "2026-05-06T19:23:56.541Z" },
-    { url = "https://files.pythonhosted.org/packages/ad/07/0ab2c1a9d26e90942612724cbd5788f16b7810c5dd39bfcf79286c6c4524/mypy-2.0.0-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:bbcbc4d5917ca6ce12de70e051de7f533e3bf92d548b41a38a2232a6fe356525", size = 16330685, upload-time = "2026-05-06T19:21:42.037Z" },
-    { url = "https://files.pythonhosted.org/packages/a6/8f/46f85d1371a5be642dad263828118ae1efd536d91d8bd2000c68acff3920/mypy-2.0.0-cp314-cp314t-win_amd64.whl", hash = "sha256:dbc6ba6d40572ae49268531565793a8f07eac7fc65ad76d482c9b4c8765b6043", size = 12752017, upload-time = "2026-05-06T19:22:44.002Z" },
-    { url = "https://files.pythonhosted.org/packages/7a/e6/94ca48800cac19eb28a58188a768aaec0d16cac0f373915f073058ab0855/mypy-2.0.0-cp314-cp314t-win_arm64.whl", hash = "sha256:77926029dfcb7e1a3ecb0acb2ddbb24ca36be03f7d623e1759ad5376be8f6c01", size = 10527097, upload-time = "2026-05-06T19:20:58.973Z" },
-    { url = "https://files.pythonhosted.org/packages/5c/14/fd0694aa594d6e9f9fd16ce821be2eff295197a273262ef56ddcc1388d68/mypy-2.0.0-py3-none-any.whl", hash = "sha256:8a92b2be3146b4fa1f062af7eb05574cbf3e6eb8e1f14704af1075423144e4e5", size = 2673434, upload-time = "2026-05-06T19:26:32.856Z" },
+    { url = "https://files.pythonhosted.org/packages/95/b1/55861beb5c339b44f9a2ba92df9e2cb1eeb4ae1eee674cdf7772c797778b/mypy-2.1.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:244358bf1c0da7722230bce60683d52e8e9fd030554926f15b747a84efb5b3af", size = 14874381, upload-time = "2026-05-11T18:37:31.784Z" },
+    { url = "https://files.pythonhosted.org/packages/0b/b3/b7f770114b7d0ac92d0f76e8d93c2780844a70488a90e91821927850da86/mypy-2.1.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:4ec7c57657493c7a75534df2751c8ae2cda383c16ecc55d2106c54476b1b16f6", size = 13665501, upload-time = "2026-05-11T18:34:23.063Z" },
+    { url = "https://files.pythonhosted.org/packages/b6/f3/8ae2037967e2126689a0c11d99e2b707134a565191e92c60ca2572aec60a/mypy-2.1.0-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:d8161b6ff4392410023224f0969d17db93e1e154bc3e4ba62598e720723ae211", size = 14045750, upload-time = "2026-05-11T18:31:48.151Z" },
+    { url = "https://files.pythonhosted.org/packages/a0/32/615eb5911859e43d054941b0d0a7d06cfa2870eba86529cf385b052b111c/mypy-2.1.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:bf03e12003084a67395184d3eb8cbd6a489dc3655b5664b28c210a9e2403ab0b", size = 15061630, upload-time = "2026-05-11T18:37:06.898Z" },
+    { url = "https://files.pythonhosted.org/packages/d4/03/4eafbfff8bfab1b87082741eae6e6a624028c984e6708b73bce2a8570c9d/mypy-2.1.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:20509760fd791c51579d573153407d226385ec1f8bcce55d730b354f3336bc22", size = 15288831, upload-time = "2026-05-11T18:31:18.07Z" },
+    { url = "https://files.pythonhosted.org/packages/99/ee/919661478e5891a3c96e549c036e467e64563ab85995b10c53c8358e16a3/mypy-2.1.0-cp312-cp312-win_amd64.whl", hash = "sha256:6753d0c1fdd6b1a23b9e4f283ce80b2153b724adcb2653b20b85a8a28ac6436b", size = 11135228, upload-time = "2026-05-11T18:34:31.23Z" },
+    { url = "https://files.pythonhosted.org/packages/24/0a/6a12b9782ca0831a553192f351679f4548abc9d19a7cc93bb7feb02084c7/mypy-2.1.0-cp312-cp312-win_arm64.whl", hash = "sha256:98ebb6589bb3b6d0c6f0c459d53ca55b8091fbc13d277c4041c885392e8195e8", size = 10040684, upload-time = "2026-05-11T18:36:48.199Z" },
+    { url = "https://files.pythonhosted.org/packages/6e/dd/c7191469c777f07689c032a8f7326e393ea34c92d6d76eb7ce5ba57ea66d/mypy-2.1.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:35aac3bb114e03888f535d5eb51b8bafbb3266586b599da1940f9b1be3ec5bd5", size = 14852174, upload-time = "2026-05-11T18:31:38.929Z" },
+    { url = "https://files.pythonhosted.org/packages/55/8c/aed55408879043d72bb9135f4d0d19a02b886dd569631e113e3d2706cb8d/mypy-2.1.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:8de55a8c861f2a49331f807be98d90caeceeef520bde13d43a160207f8af613e", size = 13651542, upload-time = "2026-05-11T18:36:04.636Z" },
+    { url = "https://files.pythonhosted.org/packages/3a/8e/f371a824b1f1fa8ea6e3dbb8703d232977d572be2329554a3bc4d960302f/mypy-2.1.0-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:5fdf2941a07434af755837d9880f7d7d25f1dacb1af9dcd4b9b66f2220a3024e", size = 14033929, upload-time = "2026-05-11T18:35:55.742Z" },
+    { url = "https://files.pythonhosted.org/packages/94/21/f54be870d6dd53a82c674407e0f8eed7174b05ec78d42e5abd7b42e84fd5/mypy-2.1.0-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:e195b817c13f02352a9c124301f9f30f078405444679b6753c1b96b6eed37285", size = 15039200, upload-time = "2026-05-11T18:33:10.281Z" },
+    { url = "https://files.pythonhosted.org/packages/17/99/bf21748626a40ce59fd29a39386ab46afec88b7bd2f0fa6c3a97c995523f/mypy-2.1.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:5431d42af987ebd92ba2f71d45c85ed41d8e6ca9f5fd209a69f68f707d2469e5", size = 15272690, upload-time = "2026-05-11T18:32:07.205Z" },
+    { url = "https://files.pythonhosted.org/packages/d6/d7/9e90d2cf47100bea550ed2bc7b0d4de3a62181d84d5e37da0003e8462637/mypy-2.1.0-cp313-cp313-win_amd64.whl", hash = "sha256:767fe8c66dc3e01e19e1737d4c38ebefead16125e1b8e58ad421903b376f5c65", size = 11147435, upload-time = "2026-05-11T18:33:56.477Z" },
+    { url = "https://files.pythonhosted.org/packages/ec/46/e5c449e858798e35ffc90946282a27c62a77be743fe17480e4977374eb91/mypy-2.1.0-cp313-cp313-win_arm64.whl", hash = "sha256:ecfe70d43775ab99562ab128ce49854a362044c9f894961f68f898c23cb7429d", size = 10035052, upload-time = "2026-05-11T18:32:30.049Z" },
+    { url = "https://files.pythonhosted.org/packages/b0/ca/b279a672e874aedd5498ae25f722dacc8aa86bbffb939b3f97cbb1cf6686/mypy-2.1.0-cp314-cp314-macosx_10_15_x86_64.whl", hash = "sha256:7354c5a7f69d9345c3d6e69921d57088eea3ddeeb6b20d34c1b3855b02c36ec2", size = 14848422, upload-time = "2026-05-11T18:35:45.984Z" },
+    { url = "https://files.pythonhosted.org/packages/27/e6/3efe56c631d959b9b4454e208b0ac4b7f4f58b404c89f8bec7b49efdfc21/mypy-2.1.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:49890d4f76ac9e06ec117f9e09f3174da70a620a0c300953d8595c926e80947f", size = 13677374, upload-time = "2026-05-11T18:36:57.188Z" },
+    { url = "https://files.pythonhosted.org/packages/84/7f/8107ea87a44fd1f1b59882442f033c9c3488c127201b1d1d15f1cbd6022e/mypy-2.1.0-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:761be68e023ef5d94678772396a8af1220030f80837a3afd8d0aef3b419666f4", size = 14055743, upload-time = "2026-05-11T18:35:18.361Z" },
+    { url = "https://files.pythonhosted.org/packages/51/4d/b6d34db183133b83761b9199a82d31557cdbb70a380d8c3b3438e11882a3/mypy-2.1.0-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:c90345fc182dc363b891350457ec69c35140858538f38b4540845afcc32b1aef", size = 15020937, upload-time = "2026-05-11T18:34:59.618Z" },
+    { url = "https://files.pythonhosted.org/packages/ff/d7/f08360c691d758acb02f45022c34d98b92892f4ea756644e1000d4b9f3d8/mypy-2.1.0-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:b84802e7b5a6daf1f5e15bc9fcd7ddae77be13981ffab037f1c67bb84d67d135", size = 15253371, upload-time = "2026-05-11T18:36:41.081Z" },
+    { url = "https://files.pythonhosted.org/packages/67/1b/09460a13719530a19bce27bd3bc8449e83569dd2ba7faf51c9c3c30c0b61/mypy-2.1.0-cp314-cp314-win_amd64.whl", hash = "sha256:022c771234936ceac541ebaf836fe9e2abeb3f5e09aff21588fe543ff006fe21", size = 11326429, upload-time = "2026-05-11T18:34:13.526Z" },
+    { url = "https://files.pythonhosted.org/packages/40/62/75dbf0f82f7b6680340efc614af29dd0b3c17b8a4f1cd09b8bd2fd6bc814/mypy-2.1.0-cp314-cp314-win_arm64.whl", hash = "sha256:498207db725cec88829a6a5c2fc771205fd043719ef98bc49aba8fb9fc4e6d57", size = 10218799, upload-time = "2026-05-11T18:32:23.491Z" },
+    { url = "https://files.pythonhosted.org/packages/b2/66/caca04ed7d972fb6eb6dd1ccd6df1de5c38fae8c5b3dc1c4e8e0d85ee6b9/mypy-2.1.0-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:7d5e5cad0efeba72b93cd17490cc0d69c5ac9ca132994fe3fb0314808aeeb83e", size = 15923458, upload-time = "2026-05-11T18:35:28.64Z" },
+    { url = "https://files.pythonhosted.org/packages/ed/52/2d90cbe49d014b13ed7ff337930c30bad35893fe38a1e4641e756bb62191/mypy-2.1.0-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:ff715050c127d724fd260a2e666e7747fdd83511c0c47d449d98238970aef780", size = 14757697, upload-time = "2026-05-11T18:36:14.208Z" },
+    { url = "https://files.pythonhosted.org/packages/ac/37/d98f4a14e081b238992d0ed96b6d39c7cc0148c9699eb71eaa68629665ea/mypy-2.1.0-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:82208da9e09414d520e912d3e462d454854bed0810b71540bb016dcbca7308fd", size = 15405638, upload-time = "2026-05-11T18:33:48.249Z" },
+    { url = "https://files.pythonhosted.org/packages/a3/c2/15c46613b24a84fad2aea1248bf9619b99c2767ae9071fe224c179a0b7d4/mypy-2.1.0-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:e79ebc1b904b84f0310dff7469655a9c36c7a68bddb37bdd42b67a332df61d08", size = 16215852, upload-time = "2026-05-11T18:32:50.296Z" },
+    { url = "https://files.pythonhosted.org/packages/5c/90/9c16a57f482c76d25f6379762b56bbf65c711d8158cf271fb2802cfb0640/mypy-2.1.0-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:e583edc957cfb0deb142079162ae826f58449b116c1d442f2d91c69d9fced081", size = 16452695, upload-time = "2026-05-11T18:33:38.182Z" },
+    { url = "https://files.pythonhosted.org/packages/0f/4c/215a4eeb63cacc5f17f516691ea7285d11e249802b942476bff15922a314/mypy-2.1.0-cp314-cp314t-win_amd64.whl", hash = "sha256:b33b6cd332695bba180d55e717a79d3038e479a2c49cc5eb3d53603409b9a5d7", size = 12866622, upload-time = "2026-05-11T18:34:39.945Z" },
+    { url = "https://files.pythonhosted.org/packages/4b/50/1043e1db5f455ffe4c9ab22747cd8ca2bc492b1e4f4e21b130a44ee2b217/mypy-2.1.0-cp314-cp314t-win_arm64.whl", hash = "sha256:4f910fe825376a7b66ef7ca8c98e5a149e8cd64c19ae71d84047a74ee060d4e6", size = 10610798, upload-time = "2026-05-11T18:36:31.444Z" },
+    { url = "https://files.pythonhosted.org/packages/0d/2a/13ca1f292f6db1b98ff495ef3467736b331621c5917cad984b7043e7348d/mypy-2.1.0-py3-none-any.whl", hash = "sha256:a663814603a5c563fb87a4f96fb473eeb30d1f5a4885afcf44f9db000a366289", size = 2693302, upload-time = "2026-05-11T18:31:29.246Z" },
 ]

 [[package]]
@@ -4253,6 +4278,7 @@ dependencies = [
    { name = "protobuf" },
 ]
 wheels = [
+    { url = "https://files.pythonhosted.org/packages/81/b1/d111b1df656761f980d9e298a60039a9cb66036b1d039e777537743d0ac3/onnxruntime-1.26.0-cp312-cp312-macosx_14_0_arm64.whl", hash = "sha256:05b028781b322ad74b57ce5b50aa5280bb1fe96ceec334628ade681e0b24c1ac", size = 18016624, upload-time = "2026-05-12T00:41:01.735Z" },
    { url = "https://files.pythonhosted.org/packages/f6/a0/3f9d896a0385a36bd04345d6d0b802821a5782adde562e7e135f6bb71c73/onnxruntime-1.26.0-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:91f2bb870a4b9224eba0a6728c1fa7a9e552b8e59e1083c51fbbc3d013f2b5c0", size = 16052692, upload-time = "2026-05-08T19:07:13.829Z" },
    { url = "https://files.pythonhosted.org/packages/7c/43/2a4e04f8dbeffad19bbcced4bcd4289bf478921518437404d6b92bdf213b/onnxruntime-1.26.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:9b6dd70599005bd1bf29779f04a91978b92b5e719c11a20068a8f8e535f725b6", size = 18185439, upload-time = "2026-05-08T19:07:36.299Z" },
    { url = "https://files.pythonhosted.org/packages/44/fc/026d0a7162b9c2153dac292baea9e027c42304dc1d9dc6f8ff5b4cfbaedd/onnxruntime-1.26.0-cp312-cp312-win_amd64.whl", hash = "sha256:a26374dc7fbcaae593601086b242120e13f2310558df0991da6dd8b8fac00414", size = 13026427, upload-time = "2026-05-08T19:08:03.503Z" },
@@ -4428,7 +4454,7 @@ name = "pexpect"
 version = "4.9.0"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
-    { name = "ptyprocess", marker = "(platform_machine != 's390x' and sys_platform == 'win32') or (sys_platform != 'emscripten' and sys_platform != 'win32')" },
+    { name = "ptyprocess", marker = "sys_platform != 'emscripten' and sys_platform != 'win32'" },
 ]
 sdist = { url = "https://files.pythonhosted.org/packages/42/92/cc564bf6381ff43ce1f4d06852fc19a2f11d180f23dc32d9588bee2f149d/pexpect-4.9.0.tar.gz", hash = "sha256:ee7d41123f3c9911050ea2c2dac107568dc43b2d3b0c7557a33212c398ead30f", size = 166450, upload-time = "2023-11-25T09:07:26.339Z" }
 wheels = [
@@ -5005,10 +5031,10 @@ name = "pyobjc-framework-applicationservices"
 version = "12.1"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
-    { name = "pyobjc-core", marker = "(platform_machine != 's390x' and sys_platform == 'win32') or (sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')" },
-    { name = "pyobjc-framework-cocoa", marker = "(platform_machine != 's390x' and sys_platform == 'win32') or (sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')" },
-    { name = "pyobjc-framework-coretext", marker = "(platform_machine != 's390x' and sys_platform == 'win32') or (sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')" },
-    { name = "pyobjc-framework-quartz", marker = "(platform_machine != 's390x' and sys_platform == 'win32') or (sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')" },
+    { name = "pyobjc-core", marker = "sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'" },
+    { name = "pyobjc-framework-cocoa", marker = "sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'" },
+    { name = "pyobjc-framework-coretext", marker = "sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'" },
+    { name = "pyobjc-framework-quartz", marker = "sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'" },
 ]
 sdist = { url = "https://files.pythonhosted.org/packages/be/6a/d4e613c8e926a5744fc47a9e9fea08384a510dc4f27d844f7ad7a2d793bd/pyobjc_framework_applicationservices-12.1.tar.gz", hash = "sha256:c06abb74f119bc27aeb41bf1aef8102c0ae1288aec1ac8665ea186a067a8945b", size = 103247, upload-time = "2025-11-14T10:08:52.18Z" }
 wheels = [
@@ -5024,7 +5050,7 @@ name = "pyobjc-framework-cocoa"
 version = "12.1"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
-    { name = "pyobjc-core", marker = "(platform_machine != 's390x' and sys_platform == 'win32') or (sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')" },
+    { name = "pyobjc-core", marker = "sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'" },
 ]
 sdist = { url = "https://files.pythonhosted.org/packages/02/a3/16ca9a15e77c061a9250afbae2eae26f2e1579eb8ca9462ae2d2c71e1169/pyobjc_framework_cocoa-12.1.tar.gz", hash = "sha256:5556c87db95711b985d5efdaaf01c917ddd41d148b1e52a0c66b1a2e2c5c1640", size = 2772191, upload-time = "2025-11-14T10:13:02.069Z" }
 wheels = [
@@ -5040,9 +5066,9 @@ name = "pyobjc-framework-coretext"
 version = "12.1"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
-    { name = "pyobjc-core", marker = "(platform_machine != 's390x' and sys_platform == 'win32') or (sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')" },
-    { name = "pyobjc-framework-cocoa", marker = "(platform_machine != 's390x' and sys_platform == 'win32') or (sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')" },
-    { name = "pyobjc-framework-quartz", marker = "(platform_machine != 's390x' and sys_platform == 'win32') or (sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')" },
+    { name = "pyobjc-core", marker = "sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'" },
+    { name = "pyobjc-framework-cocoa", marker = "sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'" },
+    { name = "pyobjc-framework-quartz", marker = "sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'" },
 ]
 sdist = { url = "https://files.pythonhosted.org/packages/29/da/682c9c92a39f713bd3c56e7375fa8f1b10ad558ecb075258ab6f1cdd4a6d/pyobjc_framework_coretext-12.1.tar.gz", hash = "sha256:e0adb717738fae395dc645c9e8a10bb5f6a4277e73cba8fa2a57f3b518e71da5", size = 90124, upload-time = "2025-11-14T10:14:38.596Z" }
 wheels = [
@@ -5058,8 +5084,8 @@ name = "pyobjc-framework-quartz"
 version = "12.1"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
-    { name = "pyobjc-core", marker = "(platform_machine != 's390x' and sys_platform == 'win32') or (sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')" },
-    { name = "pyobjc-framework-cocoa", marker = "(platform_machine != 's390x' and sys_platform == 'win32') or (sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')" },
+    { name = "pyobjc-core", marker = "sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'" },
+    { name = "pyobjc-framework-cocoa", marker = "sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'" },
 ]
 sdist = { url = "https://files.pythonhosted.org/packages/94/18/cc59f3d4355c9456fc945eae7fe8797003c4da99212dd531ad1b0de8a0c6/pyobjc_framework_quartz-12.1.tar.gz", hash = "sha256:27f782f3513ac88ec9b6c82d9767eef95a5cf4175ce88a1e5a65875fee799608", size = 3159099, upload-time = "2025-11-14T10:21:24.31Z" }
 wheels = [
@@ -5522,7 +5548,7 @@ wheels = [

 [[package]]
 name = "requests"
-version = "2.33.1"
+version = "2.34.0"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "certifi" },
@@ -5530,9 +5556,9 @@ dependencies = [
    { name = "idna" },
    { name = "urllib3" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/5f/a4/98b9c7c6428a668bf7e42ebb7c79d576a1c3c1e3ae2d47e674b468388871/requests-2.33.1.tar.gz", hash = "sha256:18817f8c57c6263968bc123d237e3b8b08ac046f5456bd1e307ee8f4250d3517", size = 134120, upload-time = "2026-03-30T16:09:15.531Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/43/b8/7a707d60fea4c49094e40262cc0e2ca6c768cca21587e34d3f705afec47e/requests-2.34.0.tar.gz", hash = "sha256:7d62fe92f50eb82c529b0916bb445afa1531a566fc8f35ffdc64446e771b856a", size = 142436, upload-time = "2026-05-11T19:29:51.717Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/d7/8e/7540e8a2036f79a125c1d2ebadf69ed7901608859186c856fa0388ef4197/requests-2.33.1-py3-none-any.whl", hash = "sha256:4e6d1ef462f3626a1f0a0a9c42dd93c63bad33f9f1c1937509b8c5c8718ab56a", size = 64947, upload-time = "2026-03-30T16:09:13.83Z" },
+    { url = "https://files.pythonhosted.org/packages/ef/e6/e300fce5fe83c30520607a015dabd985df3251e188d234bfe9492e17a389/requests-2.34.0-py3-none-any.whl", hash = "sha256:917520a21b767485ce7c588f4ebb917c436b24a31231b44228715eaeb5a52c60", size = 73021, upload-time = "2026-05-11T19:29:49.923Z" },
 ]

 [[package]]
@@ -6115,7 +6141,7 @@ version = "0.18.1"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "ptyprocess", marker = "os_name != 'nt'" },
-    { name = "pywinpty", marker = "os_name == 'nt' and sys_platform != 'linux'" },
+    { name = "pywinpty", marker = "(os_name == 'nt' and platform_machine != 'arm64' and sys_platform == 'darwin') or (os_name == 'nt' and sys_platform != 'darwin' and sys_platform != 'linux')" },
    { name = "tornado" },
 ]
 sdist = { url = "https://files.pythonhosted.org/packages/8a/11/965c6fd8e5cc254f1fe142d547387da17a8ebfd75a3455f637c663fb38a0/terminado-0.18.1.tar.gz", hash = "sha256:de09f2c4b85de4765f7714688fff57d3e75bad1f909b589fde880460c753fd2e", size = 32701, upload-time = "2024-03-12T14:34:39.026Z" }
@@ -6216,15 +6242,19 @@ name = "torch"
 version = "2.11.0"
 source = { registry = "https://pypi.org/simple" }
 resolution-markers = [
-    "(python_full_version >= '3.15' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'darwin') or (python_full_version >= '3.15' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
+    "python_full_version >= '3.15' and platform_machine == 'arm64' and sys_platform == 'darwin'",
+    "python_full_version == '3.14.*' and platform_machine == 'arm64' and sys_platform == 'darwin'",
+    "python_full_version == '3.13.*' and platform_machine == 'arm64' and sys_platform == 'darwin'",
+    "python_full_version < '3.13' and platform_machine == 'arm64' and sys_platform == 'darwin'",
+    "(python_full_version >= '3.15' and platform_machine != 'arm64' and platform_machine != 's390x' and sys_platform == 'darwin') or (python_full_version >= '3.15' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
    "python_full_version >= '3.15' and platform_machine == 's390x' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'",
    "python_full_version >= '3.15' and platform_machine != 's390x' and sys_platform == 'emscripten'",
    "python_full_version >= '3.15' and platform_machine == 's390x' and sys_platform == 'emscripten'",
-    "(python_full_version == '3.14.*' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'darwin') or (python_full_version == '3.14.*' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
-    "(python_full_version == '3.13.*' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'darwin') or (python_full_version == '3.13.*' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
+    "(python_full_version == '3.14.*' and platform_machine != 'arm64' and platform_machine != 's390x' and sys_platform == 'darwin') or (python_full_version == '3.14.*' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
+    "(python_full_version == '3.13.*' and platform_machine != 'arm64' and platform_machine != 's390x' and sys_platform == 'darwin') or (python_full_version == '3.13.*' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
    "python_full_version == '3.14.*' and platform_machine == 's390x' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'",
    "python_full_version == '3.13.*' and platform_machine == 's390x' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'",
-    "(python_full_version < '3.13' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'darwin') or (python_full_version < '3.13' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
+    "(python_full_version < '3.13' and platform_machine != 'arm64' and platform_machine != 's390x' and sys_platform == 'darwin') or (python_full_version < '3.13' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
    "python_full_version < '3.13' and platform_machine == 's390x' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'",
    "python_full_version == '3.14.*' and platform_machine != 's390x' and sys_platform == 'emscripten'",
    "python_full_version == '3.13.*' and platform_machine != 's390x' and sys_platform == 'emscripten'",
@@ -6232,13 +6262,13 @@ resolution-markers = [
    "python_full_version == '3.13.*' and platform_machine == 's390x' and sys_platform == 'emscripten'",
    "python_full_version < '3.13' and platform_machine != 's390x' and sys_platform == 'emscripten'",
    "python_full_version < '3.13' and platform_machine == 's390x' and sys_platform == 'emscripten'",
-    "(python_full_version >= '3.15' and platform_machine == 'x86_64' and sys_platform == 'darwin') or (python_full_version >= '3.15' and platform_machine != 's390x' and sys_platform == 'win32')",
+    "python_full_version >= '3.15' and platform_machine != 's390x' and sys_platform == 'win32'",
    "python_full_version >= '3.15' and platform_machine == 's390x' and sys_platform == 'win32'",
-    "(python_full_version == '3.14.*' and platform_machine == 'x86_64' and sys_platform == 'darwin') or (python_full_version == '3.14.*' and platform_machine != 's390x' and sys_platform == 'win32')",
-    "(python_full_version == '3.13.*' and platform_machine == 'x86_64' and sys_platform == 'darwin') or (python_full_version == '3.13.*' and platform_machine != 's390x' and sys_platform == 'win32')",
+    "python_full_version == '3.14.*' and platform_machine != 's390x' and sys_platform == 'win32'",
+    "python_full_version == '3.13.*' and platform_machine != 's390x' and sys_platform == 'win32'",
    "python_full_version == '3.14.*' and platform_machine == 's390x' and sys_platform == 'win32'",
    "python_full_version == '3.13.*' and platform_machine == 's390x' and sys_platform == 'win32'",
-    "(python_full_version < '3.13' and platform_machine == 'x86_64' and sys_platform == 'darwin') or (python_full_version < '3.13' and platform_machine != 's390x' and sys_platform == 'win32')",
+    "python_full_version < '3.13' and platform_machine != 's390x' and sys_platform == 'win32'",
    "python_full_version < '3.13' and platform_machine == 's390x' and sys_platform == 'win32'",
 ]
 dependencies = [
@@ -6268,18 +6298,22 @@ name = "torch"
 version = "2.11.0+cu128"
 source = { registry = "https://download.pytorch.org/whl/cu128" }
 resolution-markers = [
-    "python_full_version >= '3.15' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'",
+    "(python_full_version >= '3.15' and platform_machine == 'AMD64' and sys_platform == 'linux') or (python_full_version >= '3.15' and platform_machine == 'x86_64' and sys_platform == 'linux')",
+    "(python_full_version == '3.14.*' and platform_machine == 'AMD64' and sys_platform == 'linux') or (python_full_version == '3.14.*' and platform_machine == 'x86_64' and sys_platform == 'linux')",
+    "(python_full_version == '3.13.*' and platform_machine == 'AMD64' and sys_platform == 'linux') or (python_full_version == '3.13.*' and platform_machine == 'x86_64' and sys_platform == 'linux')",
+    "(python_full_version < '3.13' and platform_machine == 'AMD64' and sys_platform == 'linux') or (python_full_version < '3.13' and platform_machine == 'x86_64' and sys_platform == 'linux')",
+    "(python_full_version >= '3.15' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version >= '3.15' and platform_machine == 'arm64' and sys_platform == 'linux')",
+    "(python_full_version == '3.14.*' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version == '3.14.*' and platform_machine == 'arm64' and sys_platform == 'linux')",
+    "(python_full_version == '3.13.*' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version == '3.13.*' and platform_machine == 'arm64' and sys_platform == 'linux')",
+    "(python_full_version < '3.13' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version < '3.13' and platform_machine == 'arm64' and sys_platform == 'linux')",
+    "python_full_version >= '3.15' and platform_machine != 'AMD64' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'linux'",
    "python_full_version >= '3.15' and platform_machine == 's390x' and sys_platform == 'linux'",
-    "python_full_version == '3.14.*' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'",
-    "python_full_version == '3.13.*' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'",
+    "python_full_version == '3.14.*' and platform_machine != 'AMD64' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'linux'",
+    "python_full_version == '3.13.*' and platform_machine != 'AMD64' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'linux'",
    "python_full_version == '3.14.*' and platform_machine == 's390x' and sys_platform == 'linux'",
    "python_full_version == '3.13.*' and platform_machine == 's390x' and sys_platform == 'linux'",
-    "python_full_version < '3.13' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'",
+    "python_full_version < '3.13' and platform_machine != 'AMD64' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'linux'",
    "python_full_version < '3.13' and platform_machine == 's390x' and sys_platform == 'linux'",
-    "(python_full_version >= '3.15' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version >= '3.15' and platform_machine == 'arm64' and sys_platform == 'linux') or (python_full_version >= '3.15' and platform_machine == 'armv7l' and sys_platform == 'linux')",
-    "(python_full_version == '3.14.*' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version == '3.14.*' and platform_machine == 'arm64' and sys_platform == 'linux') or (python_full_version == '3.14.*' and platform_machine == 'armv7l' and sys_platform == 'linux')",
-    "(python_full_version == '3.13.*' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version == '3.13.*' and platform_machine == 'arm64' and sys_platform == 'linux') or (python_full_version == '3.13.*' and platform_machine == 'armv7l' and sys_platform == 'linux')",
-    "(python_full_version < '3.13' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version < '3.13' and platform_machine == 'arm64' and sys_platform == 'linux') or (python_full_version < '3.13' and platform_machine == 'armv7l' and sys_platform == 'linux')",
 ]
 dependencies = [
    { name = "cuda-bindings", marker = "sys_platform == 'linux'" },
@@ -6318,12 +6352,15 @@ wheels = [
    { url = "https://files.pythonhosted.org/packages/64/85/38f4843ff2a6bf7dfb71a153acd99024dadb96749965a67524c2f1cc1894/torchcodec-0.11.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:57056e91d1d883d0fb77ca7759e304be9c0bdb4ea0e37bde5c2e361347063b8c", size = 4368988, upload-time = "2026-04-14T18:24:51.46Z" },
    { url = "https://files.pythonhosted.org/packages/4b/85/3b41034b0f1289423745f918ace2a1e1e86b9c578c2e2461b6afcbb5354a/torchcodec-0.11.1-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:f1aee486a84247fcaa67870ac5005aa8d382a9839e91e476fa71b5b3d9fda9b7", size = 2397532, upload-time = "2026-04-14T18:24:53.368Z" },
    { url = "https://files.pythonhosted.org/packages/ca/a9/a2b6ee3e84c55bdd0c45fd991dde71c95a99115ec9e26938b212b4545dcf/torchcodec-0.11.1-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:6c26e90e7aa982302644d0af8cb706318682bb390f48a80ecbfeab03499acd04", size = 2329883, upload-time = "2026-04-14T18:24:55.467Z" },
+    { url = "https://files.pythonhosted.org/packages/82/48/683114a4ed6b59f76b6919532a5db0f4068787be26bab92cc18a1dfa6794/torchcodec-0.11.1-cp312-cp312-win_amd64.whl", hash = "sha256:3fd2d10e0e0a5f455c1c87dc1380b3bd43b77dd5eeeaf479470643b1c04a2dd2", size = 1921066, upload-time = "2026-04-14T18:24:57.102Z" },
    { url = "https://files.pythonhosted.org/packages/2c/61/a8985a7561ef651e409deeac151a0ed5cef763db9577db5cc49c2f5eaab2/torchcodec-0.11.1-cp313-cp313-macosx_12_0_arm64.whl", hash = "sha256:915fbe20068ec77486fbbeaf0c627c89c7376445f27d215b7489c0a03c64fd4c", size = 4289805, upload-time = "2026-04-14T18:24:59.124Z" },
    { url = "https://files.pythonhosted.org/packages/7a/31/c4ec0304dd169a9b2b7fa0dd1d5d659d3cccc975b98ac88c498fe6dd7196/torchcodec-0.11.1-cp313-cp313-manylinux_2_28_aarch64.whl", hash = "sha256:3755de03c96afd37410cba68198225d11cd6431a32f2161a0019791a4a853305", size = 2399057, upload-time = "2026-04-14T18:25:00.782Z" },
    { url = "https://files.pythonhosted.org/packages/5d/b2/85ad7a81f387e40983c21bc94da0c333974afb41f38c3a85d25875274187/torchcodec-0.11.1-cp313-cp313-manylinux_2_28_x86_64.whl", hash = "sha256:5eee69971cec1147a03b8a6b678b5dfbeff0b2c71ed7929e488391f9fbcd630c", size = 2332721, upload-time = "2026-04-14T18:25:02.518Z" },
+    { url = "https://files.pythonhosted.org/packages/ad/ca/5c66f21d2a12039450e9dd4d9d7c480019dfbe9e8a87696a3c3a827c1e37/torchcodec-0.11.1-cp313-cp313-win_amd64.whl", hash = "sha256:67b34e5733636588ebe0f15082bbb90a8ce1472ccb8bb1a656ec28958a208919", size = 1920990, upload-time = "2026-04-14T18:25:04.269Z" },
    { url = "https://files.pythonhosted.org/packages/c4/b7/8d6ee76fca0cfefec01402f33c11766455da2b8460cb9191cdc34f8defc0/torchcodec-0.11.1-cp314-cp314-macosx_12_0_arm64.whl", hash = "sha256:a00ef79e847644f91c9995de021062adc851916b16244d26c0a7a04569710508", size = 4408290, upload-time = "2026-04-14T18:25:05.967Z" },
    { url = "https://files.pythonhosted.org/packages/1e/1e/e37bd46ffac9eec1a9afc32c5097cd83b0de1e865021f7f953c5142919f4/torchcodec-0.11.1-cp314-cp314-manylinux_2_28_aarch64.whl", hash = "sha256:170a3efea64f0cd2c21cee0a233a9e13c67a704b5c5e7ef9aeda31e747ac6885", size = 2402232, upload-time = "2026-04-14T18:25:08.026Z" },
    { url = "https://files.pythonhosted.org/packages/8f/d0/a9173dbfa011cc2224f7489e50844b9f62110050bbdbd9d29485e7f1e0e2/torchcodec-0.11.1-cp314-cp314-manylinux_2_28_x86_64.whl", hash = "sha256:db66ddce36a6fa35f30fbe1d78b57289fcb53f8f43c1c85923edbe339540c665", size = 2334158, upload-time = "2026-04-14T18:25:09.77Z" },
+    { url = "https://files.pythonhosted.org/packages/18/96/6ee0e26547976dc55a69042ce895747a34221eab348931e975141d80d25e/torchcodec-0.11.1-cp314-cp314-win_amd64.whl", hash = "sha256:3fd9ef8302b261d3db5585e42be4a3138c5c240a822031642cdf1f82ea3db5b7", size = 1925002, upload-time = "2026-04-14T18:25:11.718Z" },
 ]

 [[package]]
@@ -6345,15 +6382,19 @@ name = "torchvision"
 version = "0.26.0"
 source = { registry = "https://pypi.org/simple" }
 resolution-markers = [
-    "(python_full_version >= '3.15' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'darwin') or (python_full_version >= '3.15' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
+    "python_full_version >= '3.15' and platform_machine == 'arm64' and sys_platform == 'darwin'",
+    "python_full_version == '3.14.*' and platform_machine == 'arm64' and sys_platform == 'darwin'",
+    "python_full_version == '3.13.*' and platform_machine == 'arm64' and sys_platform == 'darwin'",
+    "python_full_version < '3.13' and platform_machine == 'arm64' and sys_platform == 'darwin'",
+    "(python_full_version >= '3.15' and platform_machine != 'arm64' and platform_machine != 's390x' and sys_platform == 'darwin') or (python_full_version >= '3.15' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
    "python_full_version >= '3.15' and platform_machine == 's390x' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'",
    "python_full_version >= '3.15' and platform_machine != 's390x' and sys_platform == 'emscripten'",
    "python_full_version >= '3.15' and platform_machine == 's390x' and sys_platform == 'emscripten'",
-    "(python_full_version == '3.14.*' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'darwin') or (python_full_version == '3.14.*' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
-    "(python_full_version == '3.13.*' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'darwin') or (python_full_version == '3.13.*' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
+    "(python_full_version == '3.14.*' and platform_machine != 'arm64' and platform_machine != 's390x' and sys_platform == 'darwin') or (python_full_version == '3.14.*' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
+    "(python_full_version == '3.13.*' and platform_machine != 'arm64' and platform_machine != 's390x' and sys_platform == 'darwin') or (python_full_version == '3.13.*' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
    "python_full_version == '3.14.*' and platform_machine == 's390x' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'",
    "python_full_version == '3.13.*' and platform_machine == 's390x' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'",
-    "(python_full_version < '3.13' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'darwin') or (python_full_version < '3.13' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
+    "(python_full_version < '3.13' and platform_machine != 'arm64' and platform_machine != 's390x' and sys_platform == 'darwin') or (python_full_version < '3.13' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
    "python_full_version < '3.13' and platform_machine == 's390x' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'",
    "python_full_version == '3.14.*' and platform_machine != 's390x' and sys_platform == 'emscripten'",
    "python_full_version == '3.13.*' and platform_machine != 's390x' and sys_platform == 'emscripten'",
@@ -6361,13 +6402,13 @@ resolution-markers = [
    "python_full_version == '3.13.*' and platform_machine == 's390x' and sys_platform == 'emscripten'",
    "python_full_version < '3.13' and platform_machine != 's390x' and sys_platform == 'emscripten'",
    "python_full_version < '3.13' and platform_machine == 's390x' and sys_platform == 'emscripten'",
-    "(python_full_version >= '3.15' and platform_machine == 'x86_64' and sys_platform == 'darwin') or (python_full_version >= '3.15' and platform_machine != 's390x' and sys_platform == 'win32')",
+    "python_full_version >= '3.15' and platform_machine != 's390x' and sys_platform == 'win32'",
    "python_full_version >= '3.15' and platform_machine == 's390x' and sys_platform == 'win32'",
-    "(python_full_version == '3.14.*' and platform_machine == 'x86_64' and sys_platform == 'darwin') or (python_full_version == '3.14.*' and platform_machine != 's390x' and sys_platform == 'win32')",
-    "(python_full_version == '3.13.*' and platform_machine == 'x86_64' and sys_platform == 'darwin') or (python_full_version == '3.13.*' and platform_machine != 's390x' and sys_platform == 'win32')",
+    "python_full_version == '3.14.*' and platform_machine != 's390x' and sys_platform == 'win32'",
+    "python_full_version == '3.13.*' and platform_machine != 's390x' and sys_platform == 'win32'",
    "python_full_version == '3.14.*' and platform_machine == 's390x' and sys_platform == 'win32'",
    "python_full_version == '3.13.*' and platform_machine == 's390x' and sys_platform == 'win32'",
-    "(python_full_version < '3.13' and platform_machine == 'x86_64' and sys_platform == 'darwin') or (python_full_version < '3.13' and platform_machine != 's390x' and sys_platform == 'win32')",
+    "python_full_version < '3.13' and platform_machine != 's390x' and sys_platform == 'win32'",
    "python_full_version < '3.13' and platform_machine == 's390x' and sys_platform == 'win32'",
 ]
 dependencies = [
@@ -6393,18 +6434,22 @@ name = "torchvision"
 version = "0.26.0+cu128"
 source = { registry = "https://download.pytorch.org/whl/cu128" }
 resolution-markers = [
-    "python_full_version >= '3.15' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'",
+    "(python_full_version >= '3.15' and platform_machine == 'AMD64' and sys_platform == 'linux') or (python_full_version >= '3.15' and platform_machine == 'x86_64' and sys_platform == 'linux')",
+    "(python_full_version == '3.14.*' and platform_machine == 'AMD64' and sys_platform == 'linux') or (python_full_version == '3.14.*' and platform_machine == 'x86_64' and sys_platform == 'linux')",
+    "(python_full_version == '3.13.*' and platform_machine == 'AMD64' and sys_platform == 'linux') or (python_full_version == '3.13.*' and platform_machine == 'x86_64' and sys_platform == 'linux')",
+    "(python_full_version < '3.13' and platform_machine == 'AMD64' and sys_platform == 'linux') or (python_full_version < '3.13' and platform_machine == 'x86_64' and sys_platform == 'linux')",
+    "(python_full_version >= '3.15' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version >= '3.15' and platform_machine == 'arm64' and sys_platform == 'linux')",
+    "(python_full_version == '3.14.*' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version == '3.14.*' and platform_machine == 'arm64' and sys_platform == 'linux')",
+    "(python_full_version == '3.13.*' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version == '3.13.*' and platform_machine == 'arm64' and sys_platform == 'linux')",
+    "(python_full_version < '3.13' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version < '3.13' and platform_machine == 'arm64' and sys_platform == 'linux')",
+    "python_full_version >= '3.15' and platform_machine != 'AMD64' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'linux'",
    "python_full_version >= '3.15' and platform_machine == 's390x' and sys_platform == 'linux'",
-    "python_full_version == '3.14.*' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'",
-    "python_full_version == '3.13.*' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'",
+    "python_full_version == '3.14.*' and platform_machine != 'AMD64' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'linux'",
+    "python_full_version == '3.13.*' and platform_machine != 'AMD64' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'linux'",
    "python_full_version == '3.14.*' and platform_machine == 's390x' and sys_platform == 'linux'",
    "python_full_version == '3.13.*' and platform_machine == 's390x' and sys_platform == 'linux'",
-    "python_full_version < '3.13' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'",
+    "python_full_version < '3.13' and platform_machine != 'AMD64' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'linux'",
    "python_full_version < '3.13' and platform_machine == 's390x' and sys_platform == 'linux'",
-    "(python_full_version >= '3.15' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version >= '3.15' and platform_machine == 'arm64' and sys_platform == 'linux') or (python_full_version >= '3.15' and platform_machine == 'armv7l' and sys_platform == 'linux')",
-    "(python_full_version == '3.14.*' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version == '3.14.*' and platform_machine == 'arm64' and sys_platform == 'linux') or (python_full_version == '3.14.*' and platform_machine == 'armv7l' and sys_platform == 'linux')",
-    "(python_full_version == '3.13.*' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version == '3.13.*' and platform_machine == 'arm64' and sys_platform == 'linux') or (python_full_version == '3.13.*' and platform_machine == 'armv7l' and sys_platform == 'linux')",
-    "(python_full_version < '3.13' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version < '3.13' and platform_machine == 'arm64' and sys_platform == 'linux') or (python_full_version < '3.13' and platform_machine == 'armv7l' and sys_platform == 'linux')",
 ]
 dependencies = [
    { name = "numpy", marker = "sys_platform == 'linux'" },
@@ -6654,7 +6699,7 @@ wheels = [

 [[package]]
 name = "virtualenv"
-version = "21.3.1"
+version = "21.3.2"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "distlib" },
@@ -6662,9 +6707,9 @@ dependencies = [
    { name = "platformdirs" },
    { name = "python-discovery" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/ec/0d/915c02c94d207b85580eb09bffab54438a709e7288524094fe781da526c2/virtualenv-21.3.1.tar.gz", hash = "sha256:c2305bc1fddeec40699b8370d13f8d431b0701f00ce895061ce493aeded4426b", size = 7613791, upload-time = "2026-05-05T01:34:31.402Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/69/e1/665267cea4767debd19f584667a9197c2098b5e7f67a502da9f3a086ab37/virtualenv-21.3.2.tar.gz", hash = "sha256:3ecda97894a6fc1c53106356f488690e5c86278c1f693f3fc0805ac85a513686", size = 7613810, upload-time = "2026-05-12T14:44:18.01Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/b1/4f/f71e641e504111a5a74e3a20bc52d01bd86788b22699dd3fee1c63253cf6/virtualenv-21.3.1-py3-none-any.whl", hash = "sha256:d1a71cf58f2f9228fff23a1f6ec15d39785c6b32e03658d104974247145edd35", size = 7594539, upload-time = "2026-05-05T01:34:28.98Z" },
+    { url = "https://files.pythonhosted.org/packages/20/5b/885f479093f6627669d39b57bc3d4e674da532e1a4b247d473a61d8d2118/virtualenv-21.3.2-py3-none-any.whl", hash = "sha256:c58ea748fa50bb2a4367da5ba3d30b02458ed40b4ea888faad94021f3309f764", size = 7594558, upload-time = "2026-05-12T14:44:15.193Z" },
 ]

 [[package]]
Author	SHA1	Message	Date
Khalil Meftah	015c88cf0d	Frame count is now derived from the upstream .npy length	2026-05-18 10:57:16 +02:00
Khalil Meftah	0164725af8	fix decord	2026-05-18 10:39:51 +02:00
Khalil Meftah	34274c6f70	scripts: add Robometer parity checks (upstream example videos + LIBERO)	2026-05-17 15:41:31 +02:00
Khalil Meftah	f6a13b1338	Add Robometer reward model	2026-05-17 14:59:23 +02:00
Cheng Yin	9db9c35cb4	fix(config): add lora_alpha to PeftConfig (#3573) * fix(config): add lora_alpha to PeftConfig PeftConfig was missing the lora_alpha field, causing the PEFT library to default to alpha=8 regardless of the LoRA rank, which dampens the adaptation signal for high-rank adapters (e.g., r=128). This adds lora_alpha: int | None = None to PeftConfig, allowing users to specify --peft.lora_alpha <value> on the CLI. Closes #3551 * fix(docs): add lora_alpha to peft training example + clarify scaling formula - Add --peft.lora_alpha=64 to docs/source/peft_training.mdx example to prevent new users from hitting the alpha=8 default dampening bug - Clarify lora_alpha comment in default.py with scaling = lora_alpha / r * docs: mention both --peft.r and --peft.lora_alpha in LoRA description --------- Co-authored-by: Cheng Yin <yin@users.noreply.github.com>	2026-05-13 11:09:19 +02:00
Jash Shah	fe96b28c74	Fix policy.path not working in YAML config files (#3145) * fix(config): support policy.path in YAML config files policy.path was only handled via CLI args (filtered from sys.argv before draccus, then retrieved in validate()). When specified in YAML, draccus would crash because 'path' is not a valid field on PreTrainedConfig. Extract path fields from the YAML/JSON config before draccus processes it, store them in a module-level dict, and fall back to it in get_path_arg() when the CLI doesn't have the path. Fixes #2957 * fix(parser): preserve YAML policy overrides when loading from pretrained When policy.path is set in YAML, validate() was calling from_pretrained with only CLI overrides, discarding any YAML policy fields (e.g. lr, batch_size) that draccus had already parsed. Fix by capturing the remaining YAML fields as CLI-style args in _config_yaml_overrides and merging them into the overrides passed to from_pretrained in train.py, eval.py, and lerobot_record.py (CLI args still take precedence). Also fix the NamedTemporaryFile SIM115 ruff warning and add types-PyYAML to the mypy pre-commit hook. * fix(parser): serialize bool/None values correctly in YAML policy overrides Bool values from YAML configs (e.g. push_to_hub: true) were passed as Python "True"/"False" strings instead of lowercase "true"/"false" that draccus expects. Also skip None values to avoid passing "None" strings. * revert: remove types-PyYAML from .pre-commit-config.yaml * chore: fix quality check caused by untyped YAML import Co-authored-by: masato-ka <jp6uzv@gmail.com> Signed-off-by: Khalil Meftah <khalil.meftah@huggingface.co> --------- Signed-off-by: Khalil Meftah <khalil.meftah@huggingface.co> Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co> Co-authored-by: masato-ka <jp6uzv@gmail.com>	2026-05-13 09:45:27 +02:00
Steven Palma	2438df1307	chore(dependencies): update uv.lock (#3561)	2026-05-12 21:20:26 +02:00
Caroline Pascal	f218d5ab30	feat(episodes): adding support for metadata based episodes filtering (#3530) * feat(episode filtering): adding support for episodes filtering at initialization time in LeRobotDataset * test(tests): adding tests * chore(format): formatting code * feat(performance): improving implementation for better performances on big datasets * chores(warning): improving warnings and errors for episodes filtering * test(invalid key): adding test for invalid filtering key * chore(format): formatting code	2026-05-12 20:44:11 +02:00
Steven Palma	04125492e4	fix(datasets): expand torchcodec platform coverage + rewrite pyav fallback for torchvision >0.26 (#3588) * fix(deps): better versioning control for torchcodec * refactor(video_utils): replace torchvision with pyav * adding Torchcodec version to lerobot-info * chore(benchmarks): delete video benchmark --------- Co-authored-by: Maximellerbach <maxime.ellerbach@huggingface.co>	2026-05-12 16:59:11 +02:00
Khalil Meftah	e963e5a0c4	RL stack refactoring (#3075 ) * refactor: RL stack refactoring — RLAlgorithm, RLTrainer, DataMixer, and SAC restructuring * chore: clarify torch.compile disabled note in SACAlgorithm * fix(teleop): keyboard EE teleop not registering special keys and losing intervention state Fixes #2345 Co-authored-by: jpizarrom <jpizarrom@gmail.com> * fix: remove leftover normalization calls from reward classifier predict_reward Fixes #2355 * fix: add thread synchronization to ReplayBuffer to prevent race condition between add() and sample() * refactor: update SACAlgorithm to pass action_dim to _init_critics and fix encoder reference * perf: remove redundant CPU→GPU→CPU transition move in learner * Fix: add kwargs in reward classifier __init__() * fix: include IS_INTERVENTION in complementary_info sent to learner for offline replay buffer * fix: add try/finally to control_loop to ensure image writer cleanup on exit * fix: use string key for IS_INTERVENTION in complementary_info to avoid torch.load serialization error * fix: skip tests that require grpc if not available * fix(tests): ensure tensor stats comparison accounts for reshaping in normalization tests * fix(tests): skip tests that require grpc if not available * refactor(rl): expose public API in rl/__init__ and use relative imports in sub-packages * fix(config): update vision encoder model name to lerobot/resnet10 * fix(sac): clarify torch.compile status * refactor(rl): update shutdown_event type hints from 'any' to 'Any' for consistency and clarity * refactor(sac): simplify optimizer return structure * perf(rl): use async iterators in OnlineOfflineMixer.get_iterator * refactor(sac): decouple algorithm hyperparameters from policy config * update losses names in tests * fix docstring * remove unused type alias * fix test for flat dict structure * refactor(policies): rename policies/sac → policies/gaussian_actor * refactor(rl/sac): consolidate hyperparameter ownership and clean up discrete critic * 
perf(observation_processor): add CUDA support for image processing * fix(rl): correctly wire HIL-SERL gripper penalty through processor pipeline (cherry picked from commit `9c2af818ff`) * fix(rl): add time limit processor to environment pipeline (cherry picked from commit `cd105f65cb`) * fix(rl): clarify discrete gripper action mapping in GripperVelocityToJoint for SO100 (cherry picked from commit `494f469a2b`) * fix(rl): update neutral gripper action (cherry picked from commit `9c9064e5be`) * fix(rl): merge environment and action-processor info in transition processing (cherry picked from commit `30e1886b64`) * fix(rl): mirror gym_manipulator in actor (cherry picked from commit `d2a046dfc5`) * fix(rl): postprocess action in actor (cherry picked from commit `c2556439e5`) * fix(rl): improve action processing for discrete and continuous actions (cherry picked from commit `f887ab3f6a`) * fix(rl): enhance intervention handling in actor and learner (cherry picked from commit `ef8bfffbd7`) * Revert "perf(observation_processor): add CUDA support for image processing" This reverts commit `38b88c414c`. 
* refactor(rl): make algorithm a nested config so all SAC hyperparameters are JSON-addressable * refactor(rl): add make_algorithm_config function for RLAlgorithmConfig instantiation * refactor(rl): add type property to RLAlgorithmConfig for better clarity * refactor(rl): make RLAlgorithmConfig an abstract base class for better extensibility * refactor(tests): remove grpc import checks from test files for cleaner code * fix(tests): gate RL tests on the `datasets` extra * refactor: simplify docstrings for clarity and conciseness across multiple files * fix(rl): update gripper position key and handle action absence during reset * fix(rl): record pre-step observation so (obs, action, next.reward) align in gym_manipulator dataset * refactor: clean up import statements * chore: address reviewer comments * chore: improve visual stats reshaping logic and update docstring for clarity * refactor: enforce mandatory config_class and name attributes in RLAlgorithm * refactor: implement NotImplementedError for abstract methods in RLAlgorithm and DataMixer * refactor: replace build_algorithm with make_algorithm for SACAlgorithmConfig and update related tests * refactor: add require_package calls for grpcio and gym-hil in relevant modules * refactor(rl): move grpcio guards to runtime entry points * feat(rl): consolidate HIL-SERL checkpoint into HF-style components Make `RLAlgorithmConfig` and `RLAlgorithm` `HubMixin`s, add abstract `state_dict()` / `load_state_dict()` for critic ensemble, target nets and `log_alpha`, and persist them as a sibling `algorithm/` component next to `pretrained_model/`. Replace the pickled `training_state.pt` with an enriched `training_step.json` carrying `step` and `interaction_step`, so resume restores actor + critics + target nets + temperature + optimizers + RNG + counters from HF-standard files. 
* refactor(rl): move actor weight-sync wire format from policy to algorithm * refactor(rl): update type hints for learner and actor functions * refactor(rl): hoist grpcio guard to module top in actor/learner * chore(rl): manage import pattern in actor (#3564) * chore(rl): manage import pattern in actor * chore(rl): optional grpc imports in learner; quote grpc ServicerContext types --------- Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co> * update uv.lock * chore(doc): update doc --------- Co-authored-by: jpizarrom <jpizarrom@gmail.com> Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>	2026-05-12 15:49:54 +02:00