Merge branch 'main' into user/michel-aractingi/2025_06_30_dataset_v3

Signed-off-by: Michel Aractingi <michel.aractingi@huggingface.co>
2026-07-08 02:22:02 +00:00 · 2025-07-15 21:40:22 +02:00
parent 3483e4441e e6e1f085d4
commit e05d22cb7b
138 changed files with 6032 additions and 939 deletions
@@ -1,6 +1,6 @@
 This tutorial will explain the training script, how to use it, and particularly how to configure everything needed for the training run.
-> **Note:** The following assumes you're running these commands on a machine equipped with a cuda GPU. If you don't have one (or if you're using a Mac), you can add `--policy.device=cpu` (`--policy.device=mps` respectively). However, be advised that the code executes much slower on cpu.

+> **Note:** The following assumes you're running these commands on a machine equipped with a cuda GPU. If you don't have one (or if you're using a Mac), you can add `--policy.device=cpu` (`--policy.device=mps` respectively). However, be advised that the code executes much slower on cpu.

 ## The training script

@@ -15,17 +15,22 @@ LeRobot offers a training script at [`lerobot/scripts/train.py`](../src/lerobot/
 ## Overview of the configuration system

 In the training script, the main function `train` expects a `TrainPipelineConfig` object:
+
+<!-- prettier-ignore-start -->
 ```python
 # train.py
@parser.wrap()
 def train(cfg: TrainPipelineConfig):
 ```
+<!-- prettier-ignore-end -->

 You can inspect the `TrainPipelineConfig` defined in [`lerobot/configs/train.py`](../src/lerobot/configs/train.py) (which is heavily commented and meant to be a reference to understand any option)

 When running the script, inputs for the command line are parsed thanks to the `@parser.wrap()` decorator and an instance of this class is automatically generated. Under the hood, this is done with [Draccus](https://github.com/dlwh/draccus) which is a tool dedicated to this purpose. If you're familiar with Hydra, Draccus can similarly load configurations from config files (.json, .yaml) and also override their values through command line inputs. Unlike Hydra, these configurations are pre-defined in the code through dataclasses rather than being defined entirely in config files. This allows for more rigorous serialization/deserialization, typing, and to manipulate configuration as objects directly in the code and not as dictionaries or namespaces (which enables nice features in an IDE such as autocomplete, jump-to-def, etc.)

 Let's have a look at a simplified example. Amongst other attributes, the training config has the following attributes:
+
+<!-- prettier-ignore-start -->
 ```python
@dataclass
 class TrainPipelineConfig:
@@ -33,7 +38,11 @@ class TrainPipelineConfig:
    env: envs.EnvConfig | None = None
    policy: PreTrainedConfig | None = None
 ```
+<!-- prettier-ignore-end -->
+
 in which `DatasetConfig` for example is defined as such:
+
+<!-- prettier-ignore-start -->
 ```python
@dataclass
 class DatasetConfig:
@@ -41,16 +50,17 @@ class DatasetConfig:
    episodes: list[int] | None = None
    video_backend: str = "pyav"
 ```
+<!-- prettier-ignore-end -->

 This creates a hierarchical relationship where, for example assuming we have a `cfg` instance of `TrainPipelineConfig`, we can access the `repo_id` value with `cfg.dataset.repo_id`.
 From the command line, we can specify this value by using a very similar syntax `--dataset.repo_id=repo/id`.

 By default, every field takes its default value specified in the dataclass. If a field doesn't have a default value, it needs to be specified either from the command line or from a config file – which path is also given in the command line (more in this below). In the example above, the `dataset` field doesn't have a default value which means it must be specified.

-
 ## Specifying values from the CLI

 Let's say that we want to train [Diffusion Policy](../src/lerobot/policies/diffusion) on the [pusht](https://huggingface.co/datasets/lerobot/pusht) dataset, using the [gym_pusht](https://github.com/huggingface/gym-pusht) environment for evaluation. The command to do so would look like this:
+
 ```bash
 python -m lerobot.scripts.train \
    --dataset.repo_id=lerobot/pusht \
@@ -59,11 +69,13 @@ python -m lerobot.scripts.train \
 ```

 Let's break this down:
+
 - To specify the dataset, we just need to specify its `repo_id` on the hub which is the only required argument in the `DatasetConfig`. The rest of the fields have default values and in this case we are fine with those so we can just add the option `--dataset.repo_id=lerobot/pusht`.
 - To specify the policy, we can just select diffusion policy using `--policy` appended with `.type`. Here, `.type` is a special argument which allows us to select config classes inheriting from `draccus.ChoiceRegistry` and that have been decorated with the `register_subclass()` method. To have a better explanation of this feature, have a look at this [Draccus demo](https://github.com/dlwh/draccus?tab=readme-ov-file#more-flexible-configuration-with-choice-types). In our code, we use this mechanism mainly to select policies, environments, robots, and some other components like optimizers. The policies available to select are located in [lerobot/policies](../src/lerobot/policies)
 - Similarly, we select the environment with `--env.type=pusht`. The different environment configs are available in [`lerobot/envs/configs.py`](../src/lerobot/envs/configs.py)

 Let's see another example. Let's say you've been training [ACT](../src/lerobot/policies/act) on [lerobot/aloha_sim_insertion_human](https://huggingface.co/datasets/lerobot/aloha_sim_insertion_human) using the [gym-aloha](https://github.com/huggingface/gym-aloha) environment for evaluation with:
+
 ```bash
 python -m lerobot.scripts.train \
    --policy.type=act \
@@ -71,10 +83,12 @@ python -m lerobot.scripts.train \
    --env.type=aloha \
    --output_dir=outputs/train/act_aloha_insertion
 ```
+
 > Notice we added `--output_dir` to explicitly tell where to write outputs from this run (checkpoints, training state, configs etc.). This is not mandatory and if you don't specify it, a default directory will be created from the current date and time, env.type and policy.type. This will typically look like `outputs/train/2025-01-24/16-10-05_aloha_act`.

 We now want to train a different policy for aloha on another task. We'll change the dataset and use [lerobot/aloha_sim_transfer_cube_human](https://huggingface.co/datasets/lerobot/aloha_sim_transfer_cube_human) instead. Of course, we also need to change the task of the environment as well to match this other task.
 Looking at the [`AlohaEnv`](../src/lerobot/envs/configs.py) config, the task is `"AlohaInsertion-v0"` by default, which corresponds to the task we trained on in the command above. The [gym-aloha](https://github.com/huggingface/gym-aloha?tab=readme-ov-file#description) environment also has the `AlohaTransferCube-v0` task which corresponds to this other task we want to train on. Putting this together, we can train this new policy on this different task using:
+
 ```bash
 python -m lerobot.scripts.train \
    --policy.type=act \
@@ -87,6 +101,7 @@ python -m lerobot.scripts.train \
 ## Loading from a config file

 Now, let's assume that we want to reproduce the run just above. That run has produced a `train_config.json` file in its checkpoints, which serializes the `TrainPipelineConfig` instance it used:
+
 ```json
 {
    "dataset": {
@@ -110,34 +125,40 @@ Now, let's assume that we want to reproduce the run just above. That run has pro
 ```

 We can then simply load the config values from this file using:
+
 ```bash
 python -m lerobot.scripts.train \
    --config_path=outputs/train/act_aloha_transfer/checkpoints/last/pretrained_model/ \
    --output_dir=outputs/train/act_aloha_transfer_2
 ```
+
 `--config_path` is also a special argument which allows to initialize the config from a local config file. It can point to a directory that contains `train_config.json` or to the config file itself directly.

 Similarly to Hydra, we can still override some parameters in the CLI if we want to, e.g.:
+
 ```bash
 python -m lerobot.scripts.train \
    --config_path=outputs/train/act_aloha_transfer/checkpoints/last/pretrained_model/ \
    --output_dir=outputs/train/act_aloha_transfer_2
    --policy.n_action_steps=80
 ```
+
 > Note: While `--output_dir` is not required in general, in this case we need to specify it since it will otherwise take the value from the `train_config.json` (which is `outputs/train/act_aloha_transfer`). In order to prevent accidental deletion of previous run checkpoints, we raise an error if you're trying to write in an existing directory. This is not the case when resuming a run, which is what you'll learn next.

 `--config_path` can also accept the repo_id of a repo on the hub that contains a `train_config.json` file, e.g. running:
+
 ```bash
 python -m lerobot.scripts.train --config_path=lerobot/diffusion_pusht
 ```
-will start a training run with the same configuration used for training [lerobot/diffusion_pusht](https://huggingface.co/lerobot/diffusion_pusht)

+will start a training run with the same configuration used for training [lerobot/diffusion_pusht](https://huggingface.co/lerobot/diffusion_pusht)

 ## Resume training

 Being able to resume a training run is important in case it crashed or aborted for any reason. We'll demonstrate how to do that here.

 Let's reuse the command from the previous run and add a few more options:
+
 ```bash
 python -m lerobot.scripts.train \
    --policy.type=act \
@@ -150,19 +171,24 @@ python -m lerobot.scripts.train \
 ```

 Here we've taken care to set up the log frequency and checkpointing frequency to low numbers so we can showcase resumption. You should be able to see some logging and have a first checkpoint within 1 minute (depending on hardware). Wait for the first checkpoint to happen, you should see a line that looks like this in your terminal:
+
 ```
 INFO 2025-01-24 16:10:56 ts/train.py:263 Checkpoint policy after step 100
 ```
+
 Now let's simulate a crash by killing the process (hit `ctrl`+`c`). We can then simply resume this run from the last checkpoint available with:
+
 ```bash
 python -m lerobot.scripts.train \
    --config_path=outputs/train/run_resumption/checkpoints/last/pretrained_model/ \
    --resume=true
 ```
+
 You should see from the logging that your training picks up from where it left off.

 Another reason for which you might want to resume a run is simply to extend training and add more training steps. The number of training steps is set by the option `--steps`, which is 100 000 by default.
 You could double the number of steps of the previous run with:
+
 ```bash
 python -m lerobot.scripts.train \
    --config_path=outputs/train/run_resumption/checkpoints/last/pretrained_model/ \
@@ -171,7 +197,9 @@ python -m lerobot.scripts.train \
 ```

 ## Outputs of a run
+
 In the output directory, there will be a folder called `checkpoints` with the following structure:
+
 ```bash
 outputs/train/run_resumption/checkpoints
 ├── 000100  # checkpoint_dir for training step 100
@@ -194,6 +222,7 @@ outputs/train/run_resumption/checkpoints
 In addition to the features currently in Draccus, we've added a special `.path` argument for the policy, which allows to load a policy as you would with `PreTrainedPolicy.from_pretrained()`. In that case, `path` can be a local directory that contains a checkpoint or a repo_id pointing to a pretrained policy on the hub.

 For example, we could fine-tune a [policy pre-trained on the aloha transfer task](https://huggingface.co/lerobot/act_aloha_sim_transfer_cube_human) on the aloha insertion task. We can achieve this with:
+
 ```bash
 python -m lerobot.scripts.train \
    --policy.path=lerobot/act_aloha_sim_transfer_cube_human \
@@ -209,15 +238,19 @@ When doing so, keep in mind that the features of the fine-tuning dataset would h
 When you start the training process, you will first see your full configuration being printed in the terminal. You can check it to make sure that you configured your run correctly. The final configuration will also be saved with the checkpoint.

 After that, you will see training log like this one:
+
 ```
 INFO 2024-08-14 13:35:12 ts/train.py:192 step:0 smpl:64 ep:1 epch:0.00 loss:1.112 grdn:15.387 lr:2.0e-07 updt_s:1.738 data_s:4.774
 ```
+
 or evaluation log:
+
 ```
 INFO 2024-08-14 13:38:45 ts/train.py:226 step:100 smpl:6K ep:52 epch:0.25 ∑rwrd:20.693 success:0.0% eval_s:120.266
 ```

 These logs will also be saved in wandb if `wandb.enable` is set to `true`. Here are the meaning of some abbreviations:
+
 - `smpl`: number of samples seen during training.
 - `ep`: number of episodes seen during training. An episode contains multiple samples in a complete manipulation task.
 - `epch`: number of time all unique samples are seen (epoch).
@@ -235,6 +268,7 @@ Some metrics are useful for initial performance profiling. For example, if you f
 We'll summarize here the main use cases to remember from this tutorial.

 #### Train a policy from scratch – CLI
+
 ```bash
 python -m lerobot.scripts.train \
    --policy.type=act \  # <- select 'act' policy
@@ -243,6 +277,7 @@ python -m lerobot.scripts.train \
 ```

 #### Train a policy from scratch - config file + CLI
+
 ```bash
 python -m lerobot.scripts.train \
    --config_path=path/to/pretrained_model \  # <- can also be a repo_id
@@ -250,6 +285,7 @@ python -m lerobot.scripts.train \
 ```

 #### Resume/continue a training run
+
 ```bash
 python -m lerobot.scripts.train \
    --config_path=checkpoint/pretrained_model/ \
@@ -258,6 +294,7 @@ python -m lerobot.scripts.train \
 ```

 #### Fine-tuning
+
 ```bash
 python -m lerobot.scripts.train \
    --policy.path=lerobot/act_aloha_sim_transfer_cube_human \  # <- can also be a local path to a checkpoint
@@ -1,67 +0,0 @@
-# Copyright 2024 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-"""
-This script demonstrates how to use torchvision's image transformation with LeRobotDataset for data
-augmentation purposes. The transformations are passed to the dataset as an argument upon creation, and
-transforms are applied to the observation images before they are returned in the dataset's __getitem__.
-"""
-
-from pathlib import Path
-
-from torchvision.transforms import ToPILImage, v2
-
-from lerobot.datasets.lerobot_dataset import LeRobotDataset
-
-dataset_repo_id = "lerobot/aloha_static_screw_driver"
-
-# Create a LeRobotDataset with no transformations
-dataset = LeRobotDataset(dataset_repo_id, episodes=[0])
-# This is equivalent to `dataset = LeRobotDataset(dataset_repo_id, image_transforms=None)`
-
-# Get the index of the first observation in the first episode
-first_idx = dataset.meta.episodes["dataset_from_index"][0]
-
-# Get the frame corresponding to the first camera
-frame = dataset[first_idx][dataset.meta.camera_keys[0]]
-
-
-# Define the transformations
-transforms = v2.Compose(
-    [
-        v2.ColorJitter(brightness=(0.5, 1.5)),
-        v2.ColorJitter(contrast=(0.5, 1.5)),
-        v2.ColorJitter(hue=(-0.1, 0.1)),
-        v2.RandomAdjustSharpness(sharpness_factor=2, p=1),
-    ]
-)
-
-# Create another LeRobotDataset with the defined transformations
-transformed_dataset = LeRobotDataset(dataset_repo_id, episodes=[0], image_transforms=transforms)
-
-# Get a frame from the transformed dataset
-transformed_frame = transformed_dataset[first_idx][transformed_dataset.meta.camera_keys[0]]
-
-# Create a directory to store output images
-output_dir = Path("outputs/image_transforms")
-output_dir.mkdir(parents=True, exist_ok=True)
-
-# Save the original frame
-to_pil = ToPILImage()
-to_pil(frame).save(output_dir / "original_frame.png", quality=100)
-print(f"Original frame saved to {output_dir / 'original_frame.png'}.")
-
-# Save the transformed frame
-to_pil(transformed_frame).save(output_dir / "transformed_frame.png", quality=100)
-print(f"Transformed frame saved to {output_dir / 'transformed_frame.png'}.")
@@ -1,104 +0,0 @@
-# Copyright 2024 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-"""This script demonstrates how to slice a dataset and calculate the loss on a subset of the data.
-
-This technique can be useful for debugging and testing purposes, as well as identifying whether a policy
-is learning effectively.
-
-Furthermore, relying on validation loss to evaluate performance is generally not considered a good practice,
-especially in the context of imitation learning. The most reliable approach is to evaluate the policy directly
-on the target environment, whether that be in simulation or the real world.
-"""
-
-import math
-
-import torch
-
-from lerobot.datasets.lerobot_dataset import LeRobotDataset, LeRobotDatasetMetadata
-from lerobot.policies.diffusion.modeling_diffusion import DiffusionPolicy
-
-
-def main():
-    device = torch.device("cuda")
-
-    # Download the diffusion policy for pusht environment
-    pretrained_policy_path = "lerobot/diffusion_pusht"
-    # OR uncomment the following to evaluate a policy from the local outputs/train folder.
-    # pretrained_policy_path = Path("outputs/train/example_pusht_diffusion")
-
-    policy = DiffusionPolicy.from_pretrained(pretrained_policy_path)
-    policy.eval()
-    policy.to(device)
-
-    # Set up the dataset.
-    delta_timestamps = {
-        # Load the previous image and state at -0.1 seconds before current frame,
-        # then load current image and state corresponding to 0.0 second.
-        "observation.image": [-0.1, 0.0],
-        "observation.state": [-0.1, 0.0],
-        # Load the previous action (-0.1), the next action to be executed (0.0),
-        # and 14 future actions with a 0.1 seconds spacing. All these actions will be
-        # used to calculate the loss.
-        "action": [-0.1, 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4],
-    }
-
-    # Load the last 10% of episodes of the dataset as a validation set.
-    # - Load dataset metadata
-    dataset_metadata = LeRobotDatasetMetadata("lerobot/pusht")
-    # - Calculate train and val episodes
-    total_episodes = dataset_metadata.total_episodes
-    episodes = list(range(dataset_metadata.total_episodes))
-    num_train_episodes = math.floor(total_episodes * 90 / 100)
-    train_episodes = episodes[:num_train_episodes]
-    val_episodes = episodes[num_train_episodes:]
-    print(f"Number of episodes in full dataset: {total_episodes}")
-    print(f"Number of episodes in training dataset (90% subset): {len(train_episodes)}")
-    print(f"Number of episodes in validation dataset (10% subset): {len(val_episodes)}")
-    # - Load train and val datasets
-    train_dataset = LeRobotDataset(
-        "lerobot/pusht", episodes=train_episodes, delta_timestamps=delta_timestamps
-    )
-    val_dataset = LeRobotDataset("lerobot/pusht", episodes=val_episodes, delta_timestamps=delta_timestamps)
-    print(f"Number of frames in training dataset (90% subset): {len(train_dataset)}")
-    print(f"Number of frames in validation dataset (10% subset): {len(val_dataset)}")
-
-    # Create dataloader for evaluation.
-    val_dataloader = torch.utils.data.DataLoader(
-        val_dataset,
-        num_workers=4,
-        batch_size=64,
-        shuffle=False,
-        pin_memory=device != torch.device("cpu"),
-        drop_last=False,
-    )
-
-    # Run validation loop.
-    loss_cumsum = 0
-    n_examples_evaluated = 0
-    for batch in val_dataloader:
-        batch = {k: v.to(device, non_blocking=True) for k, v in batch.items()}
-        loss, _ = policy.forward(batch)
-
-        loss_cumsum += loss.item()
-        n_examples_evaluated += batch["index"].shape[0]
-
-    # Calculate the average loss over the validation set.
-    average_loss = loss_cumsum / n_examples_evaluated
-
-    print(f"Average loss on validation set: {average_loss:.4f}")
-
-
-if __name__ == "__main__":
-    main()