add video encoding tool

2026-07-06 09:37:06 +00:00 · 2025-12-01 13:46:22 +01:00
parent 5f7b5f2817
commit d22fa6446b
2 changed files with 307 additions and 5 deletions
@@ -11,13 +11,14 @@ LeRobot provides several utilities for manipulating datasets:
 3. **Merge Datasets** - Combine multiple datasets into one. The datasets must have identical features, and episodes are concatenated in the order specified in `repo_ids`
 4. **Add Features** - Add new features to a dataset
 5. **Remove Features** - Remove features from a dataset
+6. **Convert to Video** - Convert image-based datasets to video format for efficient storage

 The core implementation is in `lerobot.datasets.dataset_tools`.
 An example script detailing how to use the tools API is available in `examples/dataset/use_dataset_tools.py`.

 ## Command-Line Tool: lerobot-edit-dataset

-`lerobot-edit-dataset` is a command-line script for editing datasets. It can be used to delete episodes, split datasets, merge datasets, add features, and remove features.
+`lerobot-edit-dataset` is a command-line script for editing datasets. It can be used to delete episodes, split datasets, merge datasets, add features, remove features, and convert image datasets to video format.

 Run `lerobot-edit-dataset --help` for more information on the configuration of each operation.

@@ -86,6 +87,53 @@ lerobot-edit-dataset \
    --operation.feature_names "['observation.images.top']"
 ```

+#### Convert to Video
+
+Convert an image-based dataset to video format. This is useful for reducing storage requirements and improving data loading performance. The codec, pixel format, GOP size, and CRF are configurable (see the parameters below).
+
+```bash
+# Convert all episodes to video format with default settings
+lerobot-edit-dataset \
+    --repo_id lerobot/pusht_image \
+    --operation.type convert_to_video \
+    --operation.output_dir outputs/converted_videos
+
+# Convert with custom video codec and quality settings
+lerobot-edit-dataset \
+    --repo_id lerobot/pusht_image \
+    --operation.type convert_to_video \
+    --operation.output_dir outputs/converted_videos \
+    --operation.vcodec libsvtav1 \
+    --operation.pix_fmt yuv420p \
+    --operation.g 2 \
+    --operation.crf 30
+
+# Convert only specific episodes
+lerobot-edit-dataset \
+    --repo_id lerobot/pusht_image \
+    --operation.type convert_to_video \
+    --operation.output_dir outputs/converted_videos \
+    --operation.episode_indices "[0, 1, 2, 5, 10]"
+
+# Convert with multiple workers for parallel processing
+lerobot-edit-dataset \
+    --repo_id lerobot/pusht_image \
+    --operation.type convert_to_video \
+    --operation.output_dir outputs/converted_videos \
+    --operation.num_workers 8
+```
+
+**Parameters:**
+- `output_dir`: Directory where videos will be saved (default: `outputs/converted_videos`)
+- `vcodec`: Video codec to use - options: `h264`, `hevc`, `libsvtav1` (default: `libsvtav1`)
+- `pix_fmt`: Pixel format - options: `yuv420p`, `yuv444p` (default: `yuv420p`)
+- `g`: Group of pictures (GOP) size, i.e. the interval between keyframes - lower values insert keyframes more often, which speeds up random frame access but produces larger files (default: 2)
+- `crf`: Constant rate factor - lower values give better quality but larger files; 0 is the highest quality, though whether it is truly lossless depends on the codec (default: 30)
+- `fast_decode`: Fast decode tuning option (default: 0)
+- `episode_indices`: List of specific episodes to convert (default: all episodes)
+- `num_workers`: Number of parallel workers for processing (default: 4)
+- `overwrite`: Overwrite existing video files if they exist
+
 ### Push to Hub

 Add the `--push_to_hub` flag to any command to automatically upload the resulting dataset to the Hugging Face Hub: