lerobot

admin/lerobot


mirror of https://github.com/huggingface/lerobot.git synced 2026-06-30 14:47:10 +00:00

Commit Graph

Author SHA1 Message Date


Caroline Pascal

3dd19d043e

feat(depth maps): adding support for depth in LeRobot (#3644)

* feat(depth): add depth quantization helpers and tests

* feat(video): add ffv1 to supported codecs

* feat(depth): persist depth metadata

* feat(depth): extend quantization tools to better fit the encoding/decoding pipeline

* feat(depth): plumb DepthEncoderConfig through LeRobotDataset and DatasetWriter

* feat(depth): wire StreamingVideoEncoder + writer to depth encoder

* feat(depth): wire DatasetReader to decode_depth_frames

* feat(cameras/realsense): expose async depth in metric meters

* feat(features): route 2D camera shapes to observation.depth.<key>

* feat(robots/so_follower): emit + populate depth keys when use_depth

* feat(record): plumb DepthEncoderConfig through lerobot-record

* feat(viz): render depth observations as rr.DepthImage in Viridis

* feat(depth maps writer): adding support for raw depth maps recording with image writer

* chore(format): format code

* feat(depth shape): ensuring depth map shapes always include the channel

* feat(is_depth): simplifying is_depth nested name + legacy support

* fix(stop_event): fixing stop_event race condition in camera classes

* fix(plumbing): fixing missing parts in the depth maps pipeline

* chore(typos): fixing typos

* test(fix): fixing existing tests to still work with latest features

* tests(depth): adding new tests for depth integration validation

* feat(pix_fmt channels): use PyAV to get the number of channels of pixel formats

* feat(refactor): refactor DepthEncoderConfig quantization pipeline, so that the methods do not live in the config class. Add pixel format / channels validation. Move the default pixel format for depth into the config file.

* fix(pre-commit): fixing mutable default value

* fix(info): fixing info metadata update when is_depth_map was set

* tests(typos): fixing typos in tests

* fix(realsense): fixing typo in realsense serial number

* fix(normalization): restricting 255 normalization to non depth/uint8 images only

* fix(typo): fixing typo

* fix(TIFF): add missing quantization and cleanup for TIFF files

* feat(batched dequantization): optimizing dequantize_depth for torch based batched dequantization

* feat(tools): adding depth support in LeRobotDataset edition tools

* test(aggregate): extending aggregation tests to depth frames

* test(cleaning): cleaning up tests

* fix(from_video_info): fixing early validation issue in from_video_info

* fix(typo): fixing typo

* fix(is_depth): adding missing docstrings and is_depth arguments in video decoding functions

Co-authored-by: Wensi (Vince) Ai <59036629+wensi-ai@users.noreply.github.com>

* fix(depth units): fixing depth units output for the realsense cameras

* feat(output unit): adding support for output unit specification at dataset reading/training time

Co-authored-by: Wensi (Vince) Ai <59036629+wensi-ai@users.noreply.github.com>

* test(depth): cleaning up depth tests

* test(depth encoding): updating and cleaning video/depth encoding tests

* chore(format): formatting code

* docs(depth): improving depth maps docs

* test(fix): fixing depth tests

* test(dataset tools): adding missing tests for new dataset edition tools features

* chore(format): formatting code

* fix(pyav check): fixing PyAV option validation for integer codec options by normalizing numeric values before calling `is_integer()`

Co-authored-by: Wensi (Vince) Ai <59036629+wensi-ai@users.noreply.github.com>

* docs(mermaid): fixing mermaid diagram

* fix(rebase): rebase follow up corrections

* feat(dataset tools): adding missing docstrings and features for depth fill support in dataset edition tools

* docs(docstring): updating docstrings

* docs(dataset tools): updating docs

* fix(save images): fixing image saving in dataset tools

* fix(update video info): fixing update video info logic to match the recording and editing use cases

* test(reencode): fixing reencoding monkeypatch

* fix(review): add Claude review

* chore(format): format code

* fix(update video info): ditching the differentiated approaches for video info update - video info is always updated, except for preserved keys.

* chore(rebase): fixing rebase merge conflicts

* test(visualization): fixing visualization tests

* feat(docstrings): adding explicit docstrings for encoding parameters. Docstrings will now show up as descriptions in the CLI --help.

* feat(mm as default): adding a global DEFAULT_DEPTH_UNIT variable setting mm as default depth unit

* fix(RGB <-> camera): renaming camera_encoder to rgb_encoder for clarity

* chore(TODO): removing deprecated TODO

* doc(write_u16_plane): improving docstrings for write_u16_plane

* feat(units): adding constants for depth frames units (m and mm)

* fix(spam): replacing spamming warning with a debug log

* feat(legacy metadata): adding automatic metadata update for legacy 'video.is_depth_map' feature

* fix(copy&reindex): fixing metadata reshaping for single channel frames

* fix(ImageNet): excluding depth frames from ImageNet stats

* fix(PyAV container seek): fixing initial PyAV container seek to be robust against codec choice

* feat(lerobot-dataset-viz): adding support for depth in lerobot-dataset-viz

* fix(compress): removing rerun compression for DepthImages

* fix(single channel squeeze): fixing single channel squeezing

* chore(format): format code

* fix(streaming): adding support for dequantization in streaming_dataset.py

* refactor(read depth): factorizing depth reading methods for realsense camera and adding support for depth-only usage

* chore(renaming): fixing missed RGBEncoderConfig renamings

* docs(renaming): reflecting renamings in a clearer way in the docs

* chore(annotation): excluding depth from the annotation pipeline

* feat(robots): adding depth support in compatible follower robots

* feat(LeSadKiwi): excluding LeKiwi from depth support (for now)

* chore(fail): removing misplaced file

* chore(fail): removing misplaced file

* fix(remove ffv1): removing ffv1 as it does not support MP4

* docs(cheat sheet): adding depth and video encoding to the cheat sheet

* fix(lossless): tuning depth encoding parameters for lossless depth storage

* test(fix): fixing failing tests

* depth(ZMQ): excluding ZMQ from depth support

* Revert "depth(ZMQ): excluding ZMQ from depth support"

This reverts commit b95cf4e4c2.

* fix(image transforms): excluding depth frames from images transforms

* fix(typo): typo

* fix(stats): fixing stats computation for depth frames

* fix(TIFF vs. pytorch): adding an extra uint16 to float32 conversion for depth maps stored as raw TIFF images

* fix(typos): fixing typos

* test(dtype): fixing stats computation typing tests

---------

Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Wensi (Vince) Ai <59036629+wensi-ai@users.noreply.github.com>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Wensi Ai <wsai@stanford.edu>

2026-06-27 14:21:21 +02:00
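
The `fix(pyav check)` entry above mentions normalizing numeric values before calling `is_integer()`. A minimal sketch of that idea follows; it is not the code from the commit. The helper name `is_integer_valued` and its handling of strings and booleans are assumptions made for illustration. The underlying point is that `float.is_integer()` exists on every Python 3 version, whereas `int.is_integer()` was only added in Python 3.12, so converting to `float` first gives one code path for all numeric inputs.

```python
def is_integer_valued(value: object) -> bool:
    """Return True if `value` represents a whole number.

    Hypothetical helper: the name and the exact rules are assumptions,
    not the validation code used by the repository.
    """
    # Assumption: booleans are not accepted as integer codec options,
    # even though bool is a subclass of int in Python.
    if isinstance(value, bool):
        return False
    try:
        # Normalize ints, floats and numeric strings to float first,
        # so that a single is_integer() call covers every case.
        return float(value).is_integer()
    except (TypeError, ValueError, OverflowError):
        # Non-numeric values (None, "medium", ...) are not integers.
        return False


if __name__ == "__main__":
    for candidate in (23, 23.0, "23", 23.5, "medium", None):
        print(repr(candidate), is_integer_valued(candidate))
```

Treating numeric strings as valid is a design choice of this sketch, on the assumption that codec options may arrive as text from a command line.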

Caroline Pascal

bd9619dfc3

feat(encoding parameters): adding support for user provided video encoding parameters (#3455)

* chore(video backend): renaming codec into video_backend in get_safe_default_video_backend()

* feat(pyav utils): adding support for PyAV encoding parameters validation

* feat(VideoEncoderConfig): creating a VideoEncoderConfig to encapsulate encoding parameters

* feat(VideoEncoderConfig): propagating the VideoEncoderConfig in the codebase

* chore(docs): updating the docs

* feat(metadata): adding encoding parameters in dataset metadata

* fix(concatenation compatibility): adding compatibility check when concatenating video files

* feat(VideoEncoderConfig init): making VideoEncoderConfig more robust and adaptable to multiple backends

* feat(pyav checks): making pyav parameters checks more robust

* chore(duplicate): removing duplicate get_codec_options definition

* test(existing): adapting existing tests

* test(new): adding new tests for encoding related features

* chore(format): fixing formatting issues

* chore(PyAV): cleaning up PyAV utils and encoding parameters checks to stick to the minimum required tooling.

* chore(format): formatting code

* chore(docstrings): updating docstrings

* fix(camera_encoder_config): Removing camera_encoder_config from LeRobotDataset, as it's only required in LeRobotDatasetWriter.

* feat(default values): applying a consistent naming convention for default RGB cameras video encoder parameters

* fix(rollout): propagating VideoEncoderConfig to the latest recording modes

* chore(format): formatting code, fixing error messages and variable names

* fix(arguments order): reverting changes in arguments order in StreamingVideoEncoder

* chore(relative imports): switching to relative local imports within lerobot.datasets

* test(artifacts): cleaning up artifacts for the video encoding tests

* chore(docs): updating docs

* chore(format): formatting code

* fix(imports): refactoring the file architecture to avoid circular imports. VideoEncoderConfig is now defined in lerobot.configs and lazily imports av at runtime.

* fix(typos): fixing typos and small mistakes

* test(factories): updating factories

* feat(aggregate): updating dataset aggregation procedure. Encoding tuning parameters (crf, g, ...) are ignored for validation and changed to None in the aggregated dataset if incompatible.

* docs(typos): fixing typos

* fix(deletion): reverting unwanted deletion

* fix(typos): fixing multiple typos

* feat(codec options): passing codec options to lerobot_edit_dataset episode deletion tool

* typo(typo): typo

* fix(typos): fixing remaining typos

* chore(rename): renaming camera_encoder_config to camera_encoder

* docs(clean): cleaning and formatting docs

* docs(dataset): adding details about datasets

* chore(format): formatting code

* docs(warning): adding warning regarding encoding parameters modification

* fix(re-encoding): removing inconsistent re-encoding option in lerobot_edit_dataset

* typos(typos): typos

* chore(format): resolving prettier issues

* fix(h264_nvenc): fixing crf handling for h264_nvenc

* docs(clean): removing too technical parts of the docs

* fix(imports): fixing imports at the __init__ level

* fix(imports): fixing not very pretty imports in video config file

2026-05-14 23:46:42 +02:00
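
The `feat(aggregate)` entry above says that encoding tuning parameters are ignored for validation and changed to None in the aggregated dataset if incompatible. The sketch below illustrates one way such a rule could work on plain dictionaries; it is not the repository's implementation. The function name `merge_encoder_metadata`, the `TUNING_KEYS` tuple and the sample keys `vcodec` and `pix_fmt` are all assumptions. Only `crf` and `g` are named in the commit message.

```python
# Hypothetical set of tuning keys; the commit message names "crf" and "g"
# and leaves the rest of the list elided.
TUNING_KEYS = ("crf", "g")


def merge_encoder_metadata(infos: list[dict]) -> dict:
    """Merge per-dataset encoder metadata into one aggregated dictionary.

    Assumed rule: non-tuning keys must match across datasets, while
    tuning keys that differ are set to None instead of raising.
    """
    if not infos:
        raise ValueError("at least one metadata dictionary is required")
    first = infos[0]
    merged = dict(first)
    for info in infos[1:]:
        for key in set(first) | set(info):
            if info.get(key) == first.get(key):
                continue
            if key in TUNING_KEYS:
                # Incompatible tuning value: keep the key, drop the value.
                merged[key] = None
            else:
                raise ValueError(f"incompatible encoding parameter: {key!r}")
    return merged


if __name__ == "__main__":
    a = {"vcodec": "h264", "pix_fmt": "yuv420p", "crf": 23, "g": 2}
    b = {"vcodec": "h264", "pix_fmt": "yuv420p", "crf": 30, "g": 2}
    print(merge_encoder_metadata([a, b]))
```

In this sketch a mismatch on a non-tuning key raises an error, on the assumption that such files cannot be safely concatenated.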

2 Commits