chore(rollout): nice colored cli

2026-06-18 16:57:12 +00:00 · 2026-05-07 11:12:02 +02:00
20 changed files with 356 additions and 306 deletions
@@ -382,7 +382,6 @@ jobs:
                --policy.path=\"\$ROBOTWIN_POLICY\" \
                --env.type=robotwin \
                --env.task=\"\$ROBOTWIN_TASKS\" \
-                --env.max_parallel_tasks=5 \
                --eval.batch_size=1 \
                --eval.n_episodes=1 \
                --eval.use_async_envs=false \
@@ -483,7 +482,6 @@ jobs:
                --policy.path=lerobot/smolvla_robocasa \
                --env.type=robocasa \
                --env.task=CloseFridge,OpenCabinet,OpenDrawer,TurnOnMicrowave,TurnOffStove,CloseToasterOvenDoor,SlideDishwasherRack,TurnOnSinkFaucet,NavigateKitchen,TurnOnElectricKettle \
-                --env.max_parallel_tasks=5 \
                --eval.batch_size=1 \
                --eval.n_episodes=1 \
                --eval.use_async_envs=false \
@@ -695,7 +693,6 @@ jobs:
                --env.task=\"\$ROBOMME_TASKS\" \
                --env.dataset_split=test \
                --env.task_ids=[0] \
-                --env.max_parallel_tasks=5 \
                --eval.batch_size=1 \
                --eval.n_episodes=1 \
                --eval.use_async_envs=false \
@@ -803,7 +800,6 @@ jobs:
                --env.type=libero_plus \
                --env.task=\"\$LIBERO_PLUS_SUITE\" \
                --env.task_ids=\"\$LIBERO_PLUS_TASK_IDS\" \
-                --env.max_parallel_tasks=5 \
                --eval.batch_size=1 \
                --eval.n_episodes=1 \
                --eval.use_async_envs=false \
@@ -904,8 +900,6 @@ jobs:
                --policy.path=lerobot/smolvla_vlabench \
                --env.type=vlabench \
                --env.task=select_fruit,select_toy,select_book,select_painting,select_drink,select_ingredient,select_billiards,select_poker,add_condiment,insert_flower \
-                --env.episode_length=50 \
-                --env.max_parallel_tasks=5 \
                --eval.batch_size=1 \
                --eval.n_episodes=1 \
                --eval.use_async_envs=false \
@@ -232,8 +232,6 @@ Match the policy to the user's **GPU memory** and **time budget**. Numbers below

 All policies typically train for **5–10 epochs** (see §7).

-> **Human-facing version:** the [Compute Hardware Guide](./docs/source/hardware_guide.mdx) reuses the table below and adds a cloud-GPU tier guide and a Hugging Face Jobs pointer.
-
 | Policy      | Batch | Update (ms) | Peak GPU mem (GB) | Best for                                                                                         |
 | ----------- | ----: | ----------: | ----------------: | ------------------------------------------------------------------------------------------------ |
 | `act`       |     4 |    **83.9** |          **0.94** | First-time users, laptops, single-task. Fast and reliable.                                       |
@@ -109,7 +109,7 @@ lerobot-train \

 Similarly to the hardware, you can easily implement your own policy & leverage LeRobot's data collection, training, and visualization tools, and share your model to the HF Hub

-For detailed policy setup guides, see the [Policy Documentation](https://huggingface.co/docs/lerobot/bring_your_own_policies). For GPU/RAM requirements and expected training time per policy, see the [Compute Hardware Guide](https://huggingface.co/docs/lerobot/hardware_guide).
+For detailed policy setup guides, see the [Policy Documentation](https://huggingface.co/docs/lerobot/bring_your_own_policies).

 ## Inference & Evaluation

@@ -35,7 +35,7 @@ USER root
 ARG ROBOTWIN_SHA=0aeea2d669c0f8516f4d5785f0aa33ba812c14b4
 RUN apt-get update \
    && apt-get install -y --no-install-recommends \
-         cuda-nvcc-12-6 cuda-cudart-dev-12-6 \
+         cuda-nvcc-12-4 cuda-cudart-dev-12-4 \
         libvulkan1 vulkan-tools \
    && mkdir -p /usr/share/vulkan/icd.d \
    && echo '{"file_format_version":"1.0.0","ICD":{"library_path":"libGLX_nvidia.so.0","api_version":"1.3.0"}}' \
@@ -24,12 +24,6 @@
  - local: rename_map
    title: Using Rename Map and Empty Cameras
  title: "Tutorials"
- sections:
-  - local: hardware_guide
-    title: Compute Hardware Guide
-  - local: torch_accelerators
-    title: PyTorch accelerators
-  title: "Compute & Hardware"
 - sections:
  - local: lerobot-dataset-v3
    title: Using LeRobotDataset
@@ -148,6 +142,10 @@
  - local: cameras
    title: Cameras
  title: "Sensors"
+- sections:
+  - local: torch_accelerators
+    title: PyTorch accelerators
+  title: "Supported Hardware"
 - sections:
  - local: notebooks
    title: Notebooks
@@ -159,8 +157,6 @@
 - sections:
  - local: contributing
    title: Contribute to LeRobot
-  - local: contributing_a_policy
-    title: Contributing a Policy
  - local: backwardcomp
    title: Backward compatibility
  title: "About"
@@ -1,159 +0,0 @@
-# Contributing a Policy
-
-This is a practical guide for landing a new policy directly in the LeRobot codebase. It's the in-tree counterpart to [Bring Your Own Policies](./bring_your_own_policies), which packages a policy as an out-of-tree `lerobot_policy_*` plugin. The plugin route is faster (no PR required) and is usually the right starting point — land in `main` once the policy has stabilized and there's clear value in shipping it with the library.
-
-It assumes you've already read the general [contribution guide](./contributing) and the [PR template](https://github.com/huggingface/lerobot/blob/main/.github/PULL_REQUEST_TEMPLATE.md) — that's where you'll find the testing/quality expectations every PR has to meet (`pre-commit run -a`, `pytest`, the community-review rule, etc.). What's below is the policy-specific layer on top of that.
-
-A note on tone: robot-learning is an actively evolving field, and "what a policy looks like" can shift with each new architecture. The conventions described here exist because they let `lerobot-train` and `lerobot-eval` work uniformly across very different models. When a new policy genuinely doesn't fit them, raise it in your PR — the conventions are not sacred.
-
---
-
-## In-tree layout
-
-```
-src/lerobot/policies/my_policy/
-├── __init__.py                    # re-exports config + processor factory (NOT modeling)
-├── configuration_my_policy.py     # MyPolicyConfig + @register_subclass
-├── modeling_my_policy.py          # MyPolicy(PreTrainedPolicy)
-├── processor_my_policy.py         # make_my_policy_pre_post_processors
-└── README.md                      # symlink → ../../../../docs/source/policy_my_policy_README.md
-```
-
-Two notes:
-
- The `README.md` next to the source is a **symlink** into `docs/source/policy_<name>_README.md` — the actual file lives under `docs/`. Existing policies (act, smolvla, diffusion, …) all do this; copy one of those symlinks. The policy README is conventionally minimal: paper link + BibTeX citation.
- The user-facing tutorial — what to install, how to train, hyperparameters, benchmark numbers — lives separately at `docs/source/<my_policy>.mdx` and is registered in `_toctree.yml` under "Policies".
-
-The file names are load-bearing: the factory does lazy imports by name, and the processor is discovered by the `make_<policy_name>_pre_post_processors` convention.
-
---
-
-## Policy class
-
-Inherit from [`PreTrainedPolicy`](https://github.com/huggingface/lerobot/blob/main/src/lerobot/policies/pretrained.py) and set two class attributes — both are checked by `__init_subclass__`:
-
-```python
-class MyPolicy(PreTrainedPolicy):
-    config_class = MyPolicyConfig
-    name = "my_policy"  # must match @register_subclass and --policy.type
-```
-
-The methods called by the train/eval loops:
-
-| Method                                                            | Used by           | What it does                                                                                                                                           |
-| ----------------------------------------------------------------- | ----------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ |
-| `reset() -> None`                                                 | `lerobot-eval`    | Clear per-episode state at the start of each episode.                                                                                                  |
-| `select_action(batch, **kwargs) -> Tensor`                        | `lerobot-eval`    | Return the next action `(B, action_dim)`. Called every step.                                                                                           |
-| `predict_action_chunk(batch, **kwargs) -> Tensor`                 | the policy itself | Return an action chunk `(B, chunk_size, action_dim)`. Currently abstract on the base class — raise `NotImplementedError` if your policy doesn't chunk. |
-| `forward(batch, reduction="mean") -> tuple[Tensor, dict \| None]` | `lerobot-train`   | Return `(loss, output_dict)`. Must accept `reduction="none"` for per-sample weighting.                                                                 |
-| `get_optim_params() -> dict`                                      | the optimizer     | Return parameter groups; `{"params": self.parameters()}` is fine if you don't need per-group settings.                                                 |
-| `update() -> None` _(optional)_                                   | `lerobot-train`   | Called after each optimizer step _if defined_. Use for EMA, target nets, replay buffers (TDMPC uses this).                                             |
-
-Batches are flat dictionaries keyed by the constants in [`lerobot.utils.constants`](https://github.com/huggingface/lerobot/blob/main/src/lerobot/utils/constants.py): `OBS_STATE` (`observation.state.<motor>`), `OBS_IMAGES` (`observation.images.<camera>`), `OBS_LANGUAGE`, `ACTION`, etc. Reuse the constants — don't invent new prefixes.
-
---
-
-## Config class
-
-Inherit from [`PreTrainedConfig`](https://github.com/huggingface/lerobot/blob/main/src/lerobot/configs/policies.py), decorate with `@PreTrainedConfig.register_subclass("my_policy")` (the string must match `MyPolicy.name`), and provide:
-
- `validate_features()` — raises `ValueError` if the configured input/output features can't satisfy your policy. Call it explicitly from your policy's `__init__`.
- `get_optimizer_preset()` — return a config from `lerobot.optim` (default to AdamW unless you genuinely need otherwise).
- `get_scheduler_preset()` — return a `LRSchedulerConfig` or `None`.
- `observation_delta_indices` / `action_delta_indices` / `reward_delta_indices` — relative timestep offsets the dataset loader returns per sample (`None` for single-frame, `list(range(self.horizon))` for action-chunking, etc.).
-
---
-
-## Wiring
-
-Three places need to know about your policy. All by name.
-
-1. **`policies/__init__.py`** — re-export `MyPolicyConfig` and add it to `__all__`. **Don't** re-export the modeling class; it loads lazily through the factory (so `import lerobot` stays fast).
-2. **`factory.py:get_policy_class`** — add a branch returning `MyPolicy` from a lazy import.
-3. **`factory.py:make_policy_config`** and **`factory.py:make_pre_post_processors`** — same idea, two more branches.
-
-Mirror an existing policy that's structurally similar to yours; the diff is small.
-
---
-
-## Heavy / optional dependencies
-
-Most policies need a heavy backbone (transformers, diffusers, a specific VLM SDK). The convention is **two-step gating**: a `TYPE_CHECKING`-guarded import at module top, and a `require_package` runtime check in the constructor. [`modeling_diffusion.py`](https://github.com/huggingface/lerobot/blob/main/src/lerobot/policies/diffusion/modeling_diffusion.py) is the canonical reference:
-
-```python
-from typing import TYPE_CHECKING
-from lerobot.utils.import_utils import _diffusers_available, require_package
-
-if TYPE_CHECKING or _diffusers_available:
-    from diffusers.schedulers.scheduling_ddim import DDIMScheduler
-else:
-    DDIMScheduler = None  # keeps the symbol bindable at import time
-
-class DiffusionPolicy(PreTrainedPolicy):
-    def __init__(self, config):
-        require_package("diffusers", extra="diffusion")
-        super().__init__(config)
-        ...
-```
-
-This way:
-
- `import lerobot.policies` keeps working without the extra installed (the symbol is just bound to `None`).
- Type checkers see the real symbol.
- Instantiating the policy without the extra raises a clear `ImportError` pointing at `pip install 'lerobot[diffusion]'`.
-
-Add a matching extra to [`pyproject.toml`](https://github.com/huggingface/lerobot/blob/main/pyproject.toml) `[project.optional-dependencies]` and include it in the `all` extra so `pip install 'lerobot[all]'` keeps installing everything.
-
---
-
-## Benchmarks and a published checkpoint
-
-A new policy is much easier to review — and far more useful — when it ships with a working checkpoint and at least one number you can reproduce.
-
-**Pick at least one in-tree benchmark.** LeRobot ships sim benchmarks with per-benchmark Docker images (LIBERO, LIBERO-plus, Meta-World, RoboTwin 2.0, RoboCasa365, RoboCerebra, RoboMME, VLABench and more). Pick the one that matches your policy's modality — VLAs usually go to LIBERO or VLABench; image-only BC to LIBERO or Meta-World. The full list lives under [Benchmarks](./libero) in the docs sidebar.
-
-**Push the checkpoint** to the Hub under `lerobot/<policy>_<benchmark>` (or your namespace if you don't have write access; a maintainer can mirror it). Use `PreTrainedPolicy.push_to_hub` so the repo gets `config.json`, `model.safetensors`, and a model card.
-
-**Report results in your policy's MDX**, with the exact `lerobot-eval` command and hardware so anyone can re-run:
-
-```markdown
-## Results
-
-Evaluated on LIBERO with `lerobot/<policy>_libero`:
-
-| Suite          | Success rate | n_episodes |
-| -------------- | -----------: | ---------: |
-| libero_spatial |        87.5% |         50 |
-| libero_object  |        93.0% |         50 |
-| libero_goal    |        81.5% |         50 |
-| libero_10      |        62.0% |         50 |
-| **average**    |    **81.0%** |        200 |
-
-Reproduce: `lerobot-eval --policy.path=lerobot/<policy>_libero --env.type=libero --env.task=libero_spatial --eval.n_episodes=50` (1× A100 40 GB).
-```
-
-Use `n_episodes ≥ 50` per suite for stable success-rate estimates.
-
-If your policy is real-robot-only and no sim benchmark applies, swap the sim eval for: a public training dataset on the Hub, the `lerobot-train` command, the checkpoint, and a real-robot success rate over ≥10 episodes via `lerobot-record --policy.path=...`.
-
---
-
-## PR checklist
-
-The general expectations are in [`CONTRIBUTING.md`](https://github.com/huggingface/lerobot/blob/main/CONTRIBUTING.md) and the [PR template](https://github.com/huggingface/lerobot/blob/main/.github/PULL_REQUEST_TEMPLATE.md). On top of those, reviewers will look for:
-
- [ ] `MyPolicy` and `MyPolicyConfig` cover the surface above; `__init_subclass__` accepts the class.
- [ ] `factory.py` and `policies/__init__.py` are wired (lazy imports for modeling).
- [ ] `make_my_policy_pre_post_processors` follows the naming convention.
- [ ] Optional deps live behind a `[project.optional-dependencies]` extra and the `TYPE_CHECKING + require_package` guard.
- [ ] `tests/policies/` updated; backward-compat artifact committed & policy-specifictests.
- [ ] `src/lerobot/policies/<name>/README.md` symlinked into `docs/source/policy_<name>_README.md`; user-facing `docs/source/<name>.mdx` written and added to `_toctree.yml`.
- [ ] At least one reproducible benchmark eval in the policy MDX with a published checkpoint (sim benchmark, or real-robot dataset + checkpoint).
-
-The fastest way to get a clean PR is to copy the directory of the existing policy closest to yours, rename, and replace contents method by method. Don't wait until everything is polished — open a draft PR early and iterate with us; reviewers would much rather give feedback on a half-finished branch than a fully-merged one.
-
---
-
-## Welcome aboard
-
-Thanks for taking the time to bring a new policy into LeRobot. Every architecture that lands in `main` makes the library a little more useful for the next person — and a little more representative of where robot learning is going. We're genuinely happy to have you contributing, and looking forward to seeing what you ship. 🤗
@@ -1,98 +0,0 @@
-# Compute HW Guide for LeRobot Training
-
-Rough sizing for training a LeRobot policy: how much VRAM each policy needs, what training time looks like, and where to run when local hardware isn't enough.
-
-The numbers below are **indicative** — order-of-magnitude figures for picking hardware, not exact predictions. Throughput depends heavily on dataset I/O, image resolution, batch size, and number of GPUs.
-
-## Memory by policy group
-
-Policies cluster by backbone size; the groupings below give a single VRAM envelope per group instead of repeating numbers per policy. Memory scales roughly linearly with batch size; AdamW (the LeRobot default) carries optimizer state that adds ~30–100% over a forward+backward pass alone.
-
-| Group      | Policies                                    | Peak VRAM (BS 8, AdamW) | Suitable starter GPUs             |
-| ---------- | ------------------------------------------- | ----------------------: | --------------------------------- |
-| Light BC   | `act`, `vqbet`, `tdmpc`                     |                  ~2–6GB | Laptop GPU (RTX 3060), L4, A10G   |
-| Diffusion  | `diffusion`, `multi_task_dit`               |                 ~8–14GB | RTX 4070+ / L4 / A10G             |
-| Small VLA  | `smolvla`                                   |                ~10–16GB | RTX 4080+ / L4 / A10G             |
-| Large VLA  | `pi0`, `pi0_fast`, `pi05`, `xvla`, `wall_x` |                ~24–40GB | A100 40 GB+ (24 GB tight at BS 1) |
-| Multimodal | `groot`, `eo1`                              |                ~24–40GB | A100 40 GB+                       |
-| RL         | `sac`                                       |             config-dep. | See [HIL-SERL guide](./hilserl)   |
-
-Memory-bound? Drop the batch size (~linear), use gradient accumulation to recover effective batch, or for SmolVLA leave `freeze_vision_encoder=True`.
-
-## Training time
-
-Robotics imitation learning typically converges in **5–10 epochs over the dataset**, not hundreds of thousands of raw steps. Once you know your epoch count, wall-clock is essentially:
-
-```text
-total_frames    = sum of frames over all episodes      # 50 ep × 30 fps × 30 s ≈ 45,000
-steps_per_epoch = ceil(total_frames / (num_gpus × batch_size))
-total_steps     = epochs × steps_per_epoch
-wall_clock      ≈ total_steps × per_step_time
-```
-
-Per-step time depends on the policy and the GPU. The numbers in the table below are anchors — pick the row closest to your setup and scale linearly with `total_steps` if you train longer or shorter.
-
-### Common scenarios
-
-Indicative wall-clock for **5 epochs on a ~50-episode dataset (~45k frames at 30 fps × 30 s)**, default optimizer (AdamW), 640×480 images:
-
-| Setup                                | Policy         | Batch | Wall-clock |
-| ------------------------------------ | -------------- | ----- | ---------: |
-| Single RTX 4090 / RTX 3090 (24 GB)   | `act`          | 8     |  ~30–60min |
-| Single RTX 4090 / RTX 3090 (24 GB)   | `diffusion`    | 8     |      ~2–4h |
-| Single L4 / A10G (24 GB)             | `act`          | 8     |      ~1–2h |
-| Single L4 / A10G (24 GB)             | `smolvla`      | 4     |      ~3–6h |
-| Single A100 40 GB                    | `smolvla`      | 16    |      ~1–2h |
-| Single A100 40 GB                    | `pi0` / `pi05` | 4     |      ~4–8h |
-| 4× H100 80 GB cluster (`accelerate`) | `diffusion`    | 32    |  ~30–60min |
-| 4× H100 80 GB cluster (`accelerate`) | `smolvla`      | 32    |      ~1–2h |
-| Apple Silicon M1/M2/M3 Max (MPS)     | `act`          | 4     |     ~6–14h |
-
-These are order-of-magnitude figures. Real runs deviate by ±50% depending on image resolution, dataset I/O, dataloader threading, and exact GPU SKU. They are useful as "is this run going to take an hour or a day?" intuition, not as SLAs.
-
-### Multi-GPU matters a lot
-
-`accelerate launch --num_processes=N` is the easiest way to cut training time. Each optimizer step processes `N × batch_size` samples in roughly the same wall-clock as a single-GPU step, so 4 GPUs ≈ 4× speedup for compute-bound runs. See the [Multi GPU training](./multi_gpu_training) guide for the full setup.
-
-Reference data points on a 4×H100 80 GB cluster (`accelerate launch --num_processes=4`), 5000 steps, batch 32, AdamW, dataset [`imstevenpmwork/super_poulain_draft`](https://huggingface.co/datasets/imstevenpmwork/super_poulain_draft) (~50 episodes, ~640×480 images):
-
-| Policy      | Wall-clock | `update_s` | `dataloading_s` | GPU util | Notable flags                                                                                                                  |
-| ----------- | ---------- | ---------: | --------------: | -------- | ------------------------------------------------------------------------------------------------------------------------------ |
-| `diffusion` | 16m 17s    |      0.167 |           0.015 | ~90%     | defaults (training from scratch)                                                                                               |
-| `smolvla`   | 27m 49s    |      0.312 |           0.011 | ~80%     | `--policy.path=lerobot/smolvla_base`, `freeze_vision_encoder=false`, `train_expert_only=false`                                 |
-| `pi05`      | 3h 41m     |      2.548 |           0.014 | ~95%     | `--policy.pretrained_path=lerobot/pi05_base`, `gradient_checkpointing=true`, `dtype=bfloat16`, vision encoder + expert trained |
-
-The `dataloading_s` vs. `update_s` ratio is the diagnostic that matters: when `dataloading_s` approaches `update_s`, more GPUs stop helping — your dataloader is the bottleneck and you should look at `--num_workers`, image resolution, and disk speed before adding compute.
-
-### Schedule and checkpoints
-
-If you shorten training (e.g. 5k–10k steps on a small dataset), also shorten the LR schedule with `--policy.scheduler_decay_steps≈--steps`. Otherwise the LR stays near its peak and never decays. Same for `--save_freq`.
-
-## Where to run
-
-VRAM is the first filter. Within a tier, pick by budget and availability — the `$`–`$$$$` columns are relative; check current pricing on the provider you actually use.
-
-| Class                      | VRAM  | Tier   | Comfortable for                                             |
-| -------------------------- | ----- | ------ | ----------------------------------------------------------- |
-| RTX 3090 / 4090 (consumer) | 24 GB | `$`    | Light BC, Diffusion, SmolVLA. Tight for VLAs at batch 1.    |
-| L4 / A10G (cloud)          | 24 GB | `$–$$` | Same envelope; common on Google Cloud, RunPod, AWS `g5/g6`. |
-| A100 40 GB                 | 40 GB | `$$$`  | Any policy at reasonable batch sizes.                       |
-| A100 80 GB / H100 80 GB    | 80 GB | `$$$$` | Multi-GPU clusters; large batches for VLAs.                 |
-| **CPU only**               | —     | —      | Don't train. Use Colab or rent a GPU.                       |
-
-### Hugging Face Jobs
-
-[Hugging Face Jobs](https://huggingface.co/docs/hub/jobs) lets you run training on managed HF infrastructure, billed by the second. The repo publishes a ready-to-use image: **`huggingface/lerobot-gpu:latest`**, rebuilt **every night at 02:00 UTC from `main`** ([`docker_publish.yml`](https://github.com/huggingface/lerobot/blob/main/.github/workflows/docker_publish.yml)) — so it tracks the current state of the repo, not a tagged release.
-
-```bash
-hf jobs run --flavor a10g-large huggingface/lerobot-gpu:latest \
-  bash -c "nvidia-smi && lerobot-train \
-    --policy.type=act --dataset.repo_id=<USER>/<DATASET> \
-    --policy.repo_id=<USER>/act_<task> --batch_size=8 --steps=50000"
-```
-
-Notes:
-
- The leading `nvidia-smi` is a quick sanity check that CUDA is visible inside the container — useful to fail fast if the flavor or driver mismatched.
- The default Job timeout is 30 minutes; pass `--timeout 4h` (or longer) for real training.
- `--flavor` maps onto the table above: `t4-small`/`t4-medium` (T4, ACT only), `l4x1`/`l4x4` (L4 24 GB), `a10g-small/large/largex2/largex4` (A10G 24 GB scaled out), `a100-large` (A100). For the current full catalogue + pricing see [https://huggingface.co/docs/hub/jobs](https://huggingface.co/docs/hub/jobs).
@@ -100,8 +100,8 @@ class DiffusionConfig(PreTrainedConfig):

    # Inputs / output structure.
    n_obs_steps: int = 2
-    horizon: int = 64
-    n_action_steps: int = 32
+    horizon: int = 16
+    n_action_steps: int = 8

    normalization_mapping: dict[str, NormalizationMode] = field(
        default_factory=lambda: {
@@ -122,10 +122,10 @@ class DiffusionConfig(PreTrainedConfig):
    crop_ratio: float = 1.0
    crop_shape: tuple[int, int] | None = None
    crop_is_random: bool = True
-    pretrained_backbone_weights: str | None = "ResNet18_Weights.IMAGENET1K_V1"
-    use_group_norm: bool = False
+    pretrained_backbone_weights: str | None = None
+    use_group_norm: bool = True
    spatial_softmax_num_keypoints: int = 32
-    use_separate_rgb_encoder_per_camera: bool = True
+    use_separate_rgb_encoder_per_camera: bool = False
    # Unet.
    down_dims: tuple[int, ...] = (512, 1024, 2048)
    kernel_size: int = 5
@@ -97,8 +97,8 @@ class VQBeTConfig(PreTrainedConfig):
    vision_backbone: str = "resnet18"
    crop_shape: tuple[int, int] | None = (84, 84)
    crop_is_random: bool = True
-    pretrained_backbone_weights: str | None = "ResNet18_Weights.IMAGENET1K_V1"
-    use_group_norm: bool = False
+    pretrained_backbone_weights: str | None = None
+    use_group_norm: bool = True
    spatial_softmax_num_keypoints: int = 32
    # VQ-VAE
    n_vqvae_training_steps: int = 20000
@@ -46,7 +46,7 @@ class LeKiwiConfig(RobotConfig):
    cameras: dict[str, CameraConfig] = field(default_factory=lekiwi_cameras_config)

    # Set to `True` for backward compatibility with previous policies/dataset
-    use_degrees: bool = True
+    use_degrees: bool = False


@dataclass
@@ -23,6 +23,7 @@ from lerobot.utils.robot_utils import precise_sleep

 from ..context import RolloutContext
 from .core import RolloutStrategy, send_next_action
+from .display import BaseDisplay

 logger = logging.getLogger(__name__)

@@ -38,6 +39,8 @@ class BaseStrategy(RolloutStrategy):
        """Initialise the inference engine."""
        self._init_engine(ctx)
        logger.info("Base strategy ready")
+        self._display = BaseDisplay(duration=ctx.runtime.cfg.duration)
+        self._display.show_banner()

    def run(self, ctx: RolloutContext) -> None:
        """Run the autonomous control loop until shutdown or duration expires."""
@@ -72,9 +75,7 @@ class BaseStrategy(RolloutStrategy):
            if (sleep_t := control_interval - dt) > 0:
                precise_sleep(sleep_t)
            else:
-                logger.warning(
-                    f"Record loop is running slower ({1 / dt:.1f} Hz) than the target FPS ({cfg.fps} Hz). Dataset frames might be dropped and robot control might be unstable. Common causes are: 1) Camera FPS not keeping up 2) Policy inference taking too long 3) CPU starvation"
-                )
+                self._warn_slow_loop(dt, control_interval, cfg.fps)

    def teardown(self, ctx: RolloutContext) -> None:
        """Disconnect hardware and stop inference."""
@@ -33,6 +33,7 @@ from ..inference import InferenceEngine
 if TYPE_CHECKING:
    from ..configs import RolloutStrategyConfig
    from ..context import HardwareContext, ProcessorContext, RolloutContext, RuntimeContext
+    from .display import RolloutStatusDisplay

 logger = logging.getLogger(__name__)

@@ -51,6 +52,17 @@ class RolloutStrategy(abc.ABC):
        self._interpolator: ActionInterpolator | None = None
        self._warmup_flushed: bool = False
        self._cached_obs_processed: dict | None = None
+        self._display: RolloutStatusDisplay | None = None
+
+    def _warn_slow_loop(self, dt: float, control_interval: float, fps: float) -> None:
+        """Warn when the control loop runs slower than the target FPS."""
+        if dt > control_interval:
+            logger.warning(
+                "Control loop running slower (%.1f Hz) than target (%.0f Hz). "
+                "Possible causes: camera FPS not keeping up, slow policy inference, CPU starvation.",
+                1 / dt,
+                fps,
+            )

    def _init_engine(self, ctx: RolloutContext) -> None:
        """Attach the inference engine and action interpolator, then start the backend.
@@ -71,6 +71,7 @@ from ..configs import DAggerKeyboardConfig, DAggerPedalConfig, DAggerStrategyCon
 from ..context import RolloutContext
 from ..robot_wrapper import ThreadSafeRobot
 from .core import RolloutStrategy, estimate_max_episode_seconds, safe_push_to_hub, send_next_action
+from .display import DAggerDisplay

 PYNPUT_AVAILABLE = _pynput_available
 keyboard = None
@@ -286,7 +287,7 @@ def _init_dagger_keyboard(events: DAggerEvents, cfg: DAggerKeyboardConfig):

    listener = keyboard.Listener(on_press=on_press)
    listener.start()
-    logger.info(
+    logger.debug(
        "DAgger keyboard listener started (pause_resume='%s', correction='%s', upload='%s', ESC=stop)",
        cfg.pause_resume,
        cfg.correction,
@@ -370,6 +371,28 @@ class DAggerStrategy(RolloutStrategy):
            self._episode_duration_s,
        )

+        if self.config.input_device == "keyboard":
+            kb = self.config.keyboard
+            pause_key, correction_key, upload_key = (
+                kb.pause_resume.upper(),
+                kb.correction.upper(),
+                kb.upload.upper(),
+            )
+        else:
+            pb = self.config.pedal
+            pause_key, correction_key, upload_key = pb.pause_resume, pb.correction, pb.upload
+
+        self._display = DAggerDisplay(
+            record_autonomous=self.config.record_autonomous,
+            num_episodes=self.config.num_episodes,
+            episode_duration_s=self._episode_duration_s,
+            input_device=self.config.input_device,
+            pause_key=pause_key,
+            correction_key=correction_key,
+            upload_key=upload_key,
+        )
+        self._display.show_banner()
+
    def run(self, ctx: RolloutContext) -> None:
        """Run DAgger episodes with human-in-the-loop intervention."""
        if self.config.record_autonomous:
@@ -442,6 +465,7 @@ class DAggerStrategy(RolloutStrategy):
        interpolator.reset()
        events.reset()
        engine.resume()
+        self._display.show_state(DAggerPhase.AUTONOMOUS)

        last_action: dict[str, Any] | None = None
        record_tick = 0
@@ -472,6 +496,7 @@ class DAggerStrategy(RolloutStrategy):
                            ctx,
                            last_action,
                        )
+                        self._display.show_state(new_phase)
                        if new_phase == DAggerPhase.AUTONOMOUS:
                            last_action = None

@@ -556,9 +581,7 @@ class DAggerStrategy(RolloutStrategy):
                    if (sleep_t := control_interval - dt) > 0:
                        precise_sleep(sleep_t)
                    else:
-                        logger.warning(
-                            f"Record loop is running slower ({1 / dt:.1f} Hz) than the target FPS ({cfg.fps} Hz). Dataset frames might be dropped and robot control might be unstable. Common causes are: 1) Camera FPS not keeping up 2) Policy inference taking too long 3) CPU starvation"
-                        )
+                        self._warn_slow_loop(dt, control_interval, cfg.fps)

            finally:
                logger.info("DAgger continuous control loop ended — pausing engine")
@@ -599,6 +622,7 @@ class DAggerStrategy(RolloutStrategy):
        interpolator.reset()
        events.reset()
        engine.resume()
+        self._display.show_state(DAggerPhase.AUTONOMOUS)

        last_action: dict[str, Any] | None = None
        start_time = time.perf_counter()
@@ -633,6 +657,7 @@ class DAggerStrategy(RolloutStrategy):
                            ctx,
                            last_action,
                        )
+                        self._display.show_state(new_phase)
                        if new_phase == DAggerPhase.AUTONOMOUS:
                            last_action = None

@@ -705,9 +730,7 @@ class DAggerStrategy(RolloutStrategy):
                    if (sleep_t := control_interval - dt) > 0:
                        precise_sleep(sleep_t)
                    else:
-                        logger.warning(
-                            f"Record loop is running slower ({1 / dt:.1f} Hz) than the target FPS ({cfg.fps} Hz). Dataset frames might be dropped and robot control might be unstable. Common causes are: 1) Camera FPS not keeping up 2) Policy inference taking too long 3) CPU starvation"
-                        )
+                        self._warn_slow_loop(dt, control_interval, cfg.fps)

            finally:
                logger.info("DAgger corrections-only loop ended — pausing engine")
@@ -0,0 +1,263 @@
+# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Console status display for rollout strategies.
+
+One subclass per strategy — static states/controls are declared as class
+constants; runtime-dependent values are passed to ``__init__``.
+
+In each strategy's ``setup()``:
+
+    self._display = DAggerDisplay(
+        record_autonomous=self.config.record_autonomous,
+        num_episodes=self.config.num_episodes,
+        episode_duration_s=self._episode_duration_s,
+        input_device=self.config.input_device,
+        pause_key="SPACE",
+        correction_key="TAB",
+        upload_key="ENTER",
+    )
+    self._display.show_banner()
+
+On each state transition:
+
+    self._display.show_state("correcting")
+"""
+
+from __future__ import annotations
+
+import enum
+import sys
+from dataclasses import dataclass
+
+
+def _supports_color() -> bool:
+    return hasattr(sys.stdout, "isatty") and sys.stdout.isatty()
+
+
+class _C:
+    """ANSI escape codes."""
+
+    RESET = "\033[0m"
+    BOLD = "\033[1m"
+    DIM = "\033[2m"
+    GREEN = "\033[1;92m"
+    YELLOW = "\033[1;93m"
+    RED = "\033[1;91m"
+    CYAN = "\033[1;96m"
+    WHITE = "\033[1;97m"
+    GRAY = "\033[2;37m"
+
+
+@dataclass
+class StateConfig:
+    """One named rollout state.
+
+    ``key`` must match the string (or enum ``.value``) passed to ``RolloutStatusDisplay.show_state()``.
+    """
+
+    key: str
+    emoji: str
+    label: str
+    description: str
+    color: str = _C.WHITE
+
+
+@dataclass
+class ControlConfig:
+    """One keyboard/pedal binding shown in the startup banner."""
+
+    key: str
+    description: str
+
+
+# ---------------------------------------------------------------------------
+# Base display class
+# ---------------------------------------------------------------------------
+
+
+class RolloutStatusDisplay:
+    """Unified console status display.  Subclass once per strategy."""
+
+    def __init__(
+        self,
+        strategy: str,
+        states: list[StateConfig],
+        controls: list[ControlConfig],
+        info: list[str] | None = None,
+    ) -> None:
+        self.strategy = strategy
+        self._states = {s.key: s for s in states}
+        self._controls = controls
+        self._info = info or []
+        self._use_color = _supports_color()
+
+    def _c(self, code: str, text: str) -> str:
+        if not self._use_color:
+            return text
+        return f"{code}{text}{_C.RESET}"
+
+    def show_banner(self) -> None:
+        """Print startup banner: strategy name, states, controls, config info."""
+        width = 62
+        sep = self._c(_C.BOLD, "═" * width)
+
+        print(f"\n{sep}")
+        print(self._c(_C.BOLD, f"  lerobot-rollout  │  {self.strategy}"))
+
+        if self._states:
+            print()
+            for state in self._states.values():
+                label = self._c(state.color, f"{state.label:<14}")
+                desc = self._c(_C.GRAY, state.description)
+                print(f"  {state.emoji}  {label}  {desc}")
+
+        if self._controls:
+            print()
+            key_width = max(len(c.key) for c in self._controls)
+            for ctrl in self._controls:
+                key_str = self._c(_C.CYAN, f"[{ctrl.key:<{key_width}}]")
+                print(f"  {key_str}  {ctrl.description}")
+
+        if self._info:
+            print()
+            for item in self._info:
+                print(f"  {item}")
+
+        print(f"{sep}\n")
+
+    def show_state(self, state_key: str | enum.Enum) -> None:
+        """Print the current state and available controls - call this on every transition."""
+        key = state_key.value if isinstance(state_key, enum.Enum) else state_key
+        state = self._states.get(key)
+        if state is None:
+            return
+        label = self._c(state.color, f"{state.label:<14}")
+        desc = self._c(_C.GRAY, state.description)
+        print(f"\n  {state.emoji}  {label}  {desc}\n")
+
+        if self._controls:
+            key_width = max(len(c.key) for c in self._controls)
+            for ctrl in self._controls:
+                key_str = self._c(_C.CYAN, f"[{ctrl.key:<{key_width}}]")
+                print(f"  {key_str}  {ctrl.description}")
+            print()
+
+
+# ---------------------------------------------------------------------------
+# One display subclass per strategy
+# ---------------------------------------------------------------------------
+
+
+class BaseDisplay(RolloutStatusDisplay):
+    """Status display for the base (eval-only, no recording) strategy."""
+
+    _STATES = [StateConfig("running", "🟢", "RUNNING", "autonomous rollout — no recording", _C.GREEN)]
+    _CONTROLS = [ControlConfig("Ctrl+C", "stop session")]
+
+    def __init__(self, duration: float = 0) -> None:
+        info = ["No recording — evaluation only."]
+        if duration > 0:
+            info.append(f"Duration: {duration:.0f}s")
+        super().__init__("base", self._STATES, self._CONTROLS, info)
+
+
+class SentryDisplay(RolloutStatusDisplay):
+    """Status display for the sentry (continuous autonomous recording) strategy."""
+
+    _STATES = [StateConfig("recording", "🟢", "RECORDING", "continuous autonomous recording", _C.GREEN)]
+    _CONTROLS = [ControlConfig("Ctrl+C", "stop session")]
+
+    def __init__(self, episode_duration_s: float, upload_every_n_episodes: int) -> None:
+        info = [
+            f"Episode rotation: ~{episode_duration_s:.0f}s  |  "
+            f"Upload every {upload_every_n_episodes} episodes",
+        ]
+        super().__init__("sentry", self._STATES, self._CONTROLS, info)
+
+
+class HighlightDisplay(RolloutStatusDisplay):
+    """Status display for the highlight (ring-buffer on-demand save) strategy."""
+
+    def __init__(self, ring_buffer_seconds: float, save_key: str, push_key: str) -> None:
+        states = [
+            StateConfig(
+                "buffering",
+                "⚪",
+                "BUFFERING",
+                f"ring buffer active — last {ring_buffer_seconds:.0f}s captured",
+                _C.WHITE,
+            ),
+            StateConfig("recording", "🔴", "RECORDING", f"live recording — press [{save_key}] to save episode", _C.RED),
+        ]
+        controls = [
+            ControlConfig(save_key, "BUFFERING ↔ RECORDING  start recording / save episode"),
+            ControlConfig(push_key, "push dataset to Hub (background)"),
+            ControlConfig("ESC", "stop session"),
+        ]
+        super().__init__("highlight", states, controls)
+
+
+class DAggerDisplay(RolloutStatusDisplay):
+    """Status display for the dagger (human-in-the-loop) strategy."""
+
+    _PAUSED_STATE = StateConfig("paused", "🟡", "PAUSED", "holding last position — awaiting input", _C.YELLOW)
+    _CORRECTING_STATE = StateConfig(
+        "correcting", "🔴", "CORRECTING", "human teleop active — recording correction", _C.RED
+    )
+
+    def __init__(
+        self,
+        record_autonomous: bool,
+        num_episodes: int,
+        episode_duration_s: float,
+        input_device: str,
+        pause_key: str,
+        correction_key: str,
+        upload_key: str,
+    ) -> None:
+        mode = "continuous recording" if record_autonomous else "corrections only"
+        auto_desc = "policy running — recording" if record_autonomous else "policy running — no recording"
+        states = [
+            StateConfig("autonomous", "🟢", "AUTONOMOUS", auto_desc, _C.GREEN),
+            self._PAUSED_STATE,
+            self._CORRECTING_STATE,
+        ]
+        controls = [
+            ControlConfig(pause_key, "AUTONOMOUS ↔ PAUSED    pause / resume policy"),
+            ControlConfig(correction_key, "PAUSED ↔ CORRECTING   start / stop correction"),
+            ControlConfig(upload_key, "push dataset to Hub"),
+            ControlConfig("ESC", "stop session"),
+        ]
+        info = [f"Target: {num_episodes} episodes  |  Input: {input_device}"]
+        if record_autonomous:
+            info.append(f"Episode rotation: ~{episode_duration_s:.0f}s")
+        super().__init__(f"dagger  [{mode}]", states, controls, info)
+
+
+if __name__ == "__main__":
+    dagger_display = DAggerDisplay(
+        record_autonomous=False,
+        num_episodes=20,
+        episode_duration_s=30,
+        input_device="keyboard",
+        pause_key="SPACE",
+        correction_key="TAB",
+        upload_key="ENTER",
+    )
+    dagger_display.show_banner()
+    dagger_display.show_state("paused")
+    dagger_display.show_state("correcting")
+    dagger_display.show_state("paused")
+    dagger_display.show_state("autonomous")
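Note on the display module above: `show_state()` accepts either a plain string or an `enum.Enum` member, ANSI codes are applied only when stdout is a TTY, and control keys are padded to a common width. The standalone sketch below isolates those three behaviours so they can be checked without a robot or terminal. The names `Phase`, `normalize_key`, `colorize`, and `format_controls` are illustrative and do not appear in the diff; only the logic mirrors `RolloutStatusDisplay`.

```python
import enum
from typing import List, Tuple, Union

RESET = "\033[0m"
GREEN = "\033[1;92m"


class Phase(enum.Enum):
    """Hypothetical phase enum, shaped like HighlightPhase in the diff."""

    BUFFERING = "buffering"
    RECORDING = "recording"


def normalize_key(state_key: Union[str, enum.Enum]) -> str:
    """Return the lookup key for either an Enum member or a raw string."""
    return state_key.value if isinstance(state_key, enum.Enum) else state_key


def colorize(code: str, text: str, use_color: bool) -> str:
    """Wrap text in an ANSI code only when color output is enabled."""
    return f"{code}{text}{RESET}" if use_color else text


def format_controls(controls: List[Tuple[str, str]]) -> List[str]:
    """Left-align every key to the widest key so descriptions line up."""
    if not controls:
        return []
    key_width = max(len(key) for key, _ in controls)
    return [f"  [{key:<{key_width}}]  {desc}" for key, desc in controls]


if __name__ == "__main__":
    print(normalize_key(Phase.RECORDING))
    print(colorize(GREEN, "RECORDING", use_color=False))
    for line in format_controls([("SPACE", "pause"), ("TAB", "correct")]):
        print(line)
```

Because the enum values equal the `StateConfig.key` strings, strategies can pass their phase enum directly and the `__main__` demo can pass bare strings. Both resolve to the same dictionary lookup.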
@@ -17,6 +17,7 @@
 from __future__ import annotations

 import contextlib
+import enum
 import logging
 import os
 import sys
@@ -36,6 +37,7 @@ from ..configs import HighlightStrategyConfig
 from ..context import RolloutContext
 from ..ring_buffer import RolloutRingBuffer
 from .core import RolloutStrategy, safe_push_to_hub, send_next_action
+from .display import HighlightDisplay

 PYNPUT_AVAILABLE = _pynput_available
 keyboard = None
@@ -53,6 +55,13 @@ if PYNPUT_AVAILABLE:
 logger = logging.getLogger(__name__)


+class HighlightPhase(enum.Enum):
+    """Observable phases of a Highlight session."""
+
+    BUFFERING = "buffering"  # Ring buffer accumulating frames, not recording
+    RECORDING = "recording"  # Live recording active
+
+
 class HighlightStrategy(RolloutStrategy):
    """Autonomous rollout with on-demand recording via ring buffer.

@@ -105,6 +114,13 @@ class HighlightStrategy(RolloutStrategy):
            self.config.save_key,
            self.config.push_key,
        )
+        self._display = HighlightDisplay(
+            ring_buffer_seconds=self.config.ring_buffer_seconds,
+            save_key=self.config.save_key,
+            push_key=self.config.push_key,
+        )
+        self._display.show_banner()
+        self._display.show_state(HighlightPhase.BUFFERING)

    def run(self, ctx: RolloutContext) -> None:
        """Run the autonomous loop, buffering frames and recording on demand."""
@@ -162,6 +178,7 @@ class HighlightStrategy(RolloutStrategy):
                                for buffered_frame in ring.drain():
                                    dataset.add_frame(buffered_frame)
                                self._recording_live.set()
+                                self._display.show_state(HighlightPhase.RECORDING)
                            else:
                                dataset.add_frame(frame)
                                with self._episode_lock:
@@ -172,6 +189,7 @@ class HighlightStrategy(RolloutStrategy):
                                    play_sounds,
                                )
                                self._recording_live.clear()
+                                self._display.show_state(HighlightPhase.BUFFERING)
                                continue  # frame already consumed — skip ring.append

                        if self._push_requested.is_set():
@@ -188,9 +206,7 @@ class HighlightStrategy(RolloutStrategy):
                    if (sleep_t := control_interval - dt) > 0:
                        precise_sleep(sleep_t)
                    else:
-                        logger.warning(
-                            f"Record loop is running slower ({1 / dt:.1f} Hz) than the target FPS ({cfg.fps} Hz). Dataset frames might be dropped and robot control might be unstable. Common causes are: 1) Camera FPS not keeping up 2) Policy inference taking too long 3) CPU starvation"
-                        )
+                        self._warn_slow_loop(dt, control_interval, cfg.fps)

            finally:
                logger.info("Highlight control loop ended")
@@ -255,7 +271,7 @@ class HighlightStrategy(RolloutStrategy):

            self._listener = keyboard.Listener(on_press=on_press)
            self._listener.start()
-            logger.info("Keyboard listener started (save='%s', push='%s', ESC=stop)", save_key, push_key)
+            logger.debug("Keyboard listener started (save='%s', push='%s', ESC=stop)", save_key, push_key)
        except ImportError:
            logger.warning("pynput not available — keyboard listener disabled")

@@ -32,6 +32,7 @@ from lerobot.utils.utils import log_say
 from ..configs import SentryStrategyConfig
 from ..context import RolloutContext
 from .core import RolloutStrategy, estimate_max_episode_seconds, safe_push_to_hub, send_next_action
+from .display import SentryDisplay

 logger = logging.getLogger(__name__)

@@ -79,6 +80,11 @@ class SentryStrategy(RolloutStrategy):
            self._episode_duration_s,
            self.config.upload_every_n_episodes,
        )
+        self._display = SentryDisplay(
+            episode_duration_s=self._episode_duration_s,
+            upload_every_n_episodes=self.config.upload_every_n_episodes,
+        )
+        self._display.show_banner()

    def run(self, ctx: RolloutContext) -> None:
        """Run the continuous recording loop with automatic episode rotation."""
@@ -160,9 +166,7 @@ class SentryStrategy(RolloutStrategy):
                    if (sleep_t := control_interval - dt) > 0:
                        precise_sleep(sleep_t)
                    else:
-                        logger.warning(
-                            f"Record loop is running slower ({1 / dt:.1f} Hz) than the target FPS ({cfg.fps} Hz). Dataset frames might be dropped and robot control might be unstable. Common causes are: 1) Camera FPS not keeping up 2) Policy inference taking too long 3) CPU starvation"
-                        )
+                        self._warn_slow_loop(dt, control_interval, cfg.fps)

            finally:
                logger.info("Sentry control loop ended — saving final episode")
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:51effd76b73e972f10d31f5084ab906386134b600c87b2668767d30232a902bd
+oid sha256:54aecbc1af72a4cd5e9261492f5e7601890517516257aacdf2a0ffb3ce281f1b
 size 992
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d4d7a16ca67f9adefac0e0620a7b2e9c822f2db42faaaced7a89fbad60e5ead4
-size 47680
+oid sha256:88a9c3775a2aa1e90a08850521970070a4fcf0f6b82aab43cd8ccc5cf77e0013
+size 47424
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:796c439ee8a64bf9901ff8325e7419bda8bd316360ee95e6304e8e1ae0f4c36c
+oid sha256:91a2635e05a75fe187a5081504c5f35ce3417378813fa2deaf9ca4e8200e1819
 size 68
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ad33a8b47c39c2e1374567ff9da43cdb95e2dbe904c1b02a35051346d3043095
-size 47680
+oid sha256:645bff922ac7bea63ad018ebf77c303c0e4cd2c1c0dc5ef3192865281bef3dc6
+size 47424