lerobot

mirror of https://github.com/huggingface/lerobot.git synced 2026-06-18 00:37:10 +00:00

Author	SHA1	Message	Date
Pepijn	343ecd7980	feat(streaming): optional GPU (NVDEC) video decode device Add `video_decode_device` to StreamingLeRobotDataset and a `device` arg to VideoDecoderCache, passed to torchcodec's VideoDecoder. "cuda" offloads H.264/H.265 decode to the GPU's dedicated NVDEC engine (independent of the training SMs); requires a CUDA-enabled torchcodec build. benchmark: `--video_decode_device` flag. With cuda + num_workers>0 it forces the `spawn` start method (CUDA cannot init in forked workers) and disables CPU pin_memory (frames are already on-GPU). Decode device is recorded in results and the output filename. README documents the NVDEC option and its concurrency/IPC caveats. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-09 15:47:11 +02:00
Pepijn	f7c8a526e8	feat(streaming): wallclock benchmark throughput, cross-worker cache stats, bucket source - benchmark: frames_per_s_node now measures sustained wall-clock throughput over the post-warmup window. The previous metric summed inter-batch gaps, which collapse to ~0 under async prefetch (consumer drains a pre-filled queue) and overstated throughput ~100x. - VideoDecoderCache gains an optional shared [hits, misses, evictions] counter tensor; StreamingLeRobotDataset.video_decoder_cache_stats() aggregates it across DataLoader workers (lock-free, approximate; hit_rate preserved). Fixes empty cache stats with workers. - StreamingLeRobotDataset.data_files_root: read bulk data/ + videos/ from an fsspec root (e.g. hf://buckets/<owner>/<name>) while metadata still loads from repo_id. Enables bucket / prewarmed-bucket benchmark sources without copying metadata. Exposed as benchmark --data_files_root. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-09 15:25:44 +02:00
Pepijn	68fa5d80b0	feat(streaming): multinode example, dataloading benchmark, distributed smoke test - examples/scaling/train_streaming_multinode.py: Accelerate-based distributed/ resumable streaming training (no DistributedSampler; rank/world_size auto-resolved), checkpoints the dataset stream state, and supports a --dummy pure-dataloading path with throughput logging. SLURM launcher in slurm/train_streaming_robocasa.sh. - benchmarks/streaming/benchmark_streaming.py: dummy-consumer dataloading benchmark (single / sarm frame modes) emitting frames/s/node, p50/p95/p99 sample latency, first-batch latency, and VideoDecoderCache reuse stats as JSON + CSV. SLURM launcher + README documenting the source/node/mode matrix and manual bucket prewarming. - VideoDecoderCache: add hit/miss/eviction counters and a stats() method so the benchmark can surface decoder thrash (no new cache, no eviction-policy change). - tests/datasets/test_streaming_distributed.py: accelerate-launch smoke test asserting per-rank disjointness; skips (does not false-pass) when <2 processes spawn. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-09 13:48:23 +02:00
Steven Palma	04125492e4	fix(datasets): expand torchcodec platform coverage + rewrite pyav fallback for torchvision >0.26 (#3588) * fix(deps): better versioning control for torchcodec * refactor(video_utils): replace torchvision with pyav * adding Torchcodec version to lerobot-info * chore(benchmarks): delete video benchmark --------- Co-authored-by: Maximellerbach <maxime.ellerbach@huggingface.co>	2026-05-12 16:59:11 +02:00
Steven Palma	5f15232271	chore: remove usernames + use entrypoints in docs, comments & sample commands (#2988)	2026-02-18 22:46:12 +01:00
Caroline Pascal	648ea8f485	fix(benchmark) : fixing video benchmark (#2094) * fix(time benchmark): removing deprecated TimeBenchmark dependency * fix(typo): renaming frames in an up-to-date fashion * feat(duets): rearanging crf and g parameters in a proper unique combination manner * fix(segfault): fixing segfault by adding a lock in ThreadPoolExecutor * chore(update) : update datasets, codecs and backends to the latest versions * chore(unused files): removing unused files * fix(dataset paths): fix datasets paths to live among lerobot datasets	2025-11-26 17:41:31 +01:00
Steven Palma	43d878a102	chore: replace hard-coded obs values with constants throughout all the source code (#2037) * chore: replace hard-coded OBS values with constants throughout all the source code * chore(tests): replace hard-coded OBS values with constants throughout all the test code	2025-09-25 15:36:47 +02:00
Steven Palma	af1760f175	chore(utils): move benchmark and buffer to their respective modules (#2028)	2025-09-24 16:46:38 +02:00
Michel Aractingi	f55c6e89f0	Dataset v3 (#1412) Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com> Co-authored-by: Remi Cadene <re.cadene@gmail.com> Co-authored-by: Tavish <tavish9.chen@gmail.com> Co-authored-by: fracapuano <francesco.capuano@huggingface.co> Co-authored-by: CarolinePascal <caroline8.pascal@gmail.com>	2025-09-15 09:53:30 +02:00
Steven Palma	378e1f0338	Update pre-commit-config.yaml + pyproject.toml + ceil rerun & transformer dependencies version (#1520) * chore: update .gitignore * chore: update pre-commit * chore(deps): update pyproject * fix(ci): multiple fixes * chore: pre-commit apply * chore: address review comments * Update pyproject.toml Co-authored-by: Ben Zhang <5977478+ben-z@users.noreply.github.com> Signed-off-by: Steven Palma <imstevenpmwork@ieee.org> * chore(deps): add todo --------- Signed-off-by: Steven Palma <imstevenpmwork@ieee.org> Co-authored-by: Ben Zhang <5977478+ben-z@users.noreply.github.com>	2025-07-17 14:30:20 +02:00
Simon Alibert	d4ee470b00	Package folder structure (#1417) * Move files * Replace imports & paths * Update relative paths * Update doc symlinks * Update instructions paths * Fix imports * Update grpc files * Update more instructions * Downgrade grpc-tools * Update manifest * Update more paths * Update config paths * Update CI paths * Update bandit exclusions * Remove walkthrough section	2025-07-01 16:34:46 +02:00
Steven Palma	c940676bdd	fix(benchmarks): remove .numpy() from frame in benchmark script (#1354)	2025-06-19 17:07:13 +02:00
Caroline Pascal	6d723c45a9	feat(encoding): switching to PyAV for ffmpeg related tasks (#983)	2025-04-29 17:39:35 +02:00
Steven Palma	4041f57943	feat(visualization): replace cv2 GUI with Rerun (and solves ffmpeg versioning issues) (#903)	2025-04-09 17:33:01 +02:00
Steven Palma	1c15bab70f	fix(codec): hot-fix for default codec in linux arm platforms (#868)	2025-03-17 13:23:11 +01:00
Jade Choghari	0e98c6ee96	Add torchcodec cpu (#798) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Remi <re.cadene@gmail.com> Co-authored-by: Remi <remi.cadene@huggingface.co> Co-authored-by: Simon Alibert <simon.alibert@huggingface.co> Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com>	2025-03-14 16:53:42 +01:00
Simon Alibert	a1809ad3de	Add typos checks (#770)	2025-02-25 23:51:15 +01:00
CharlesCNorton	bc16e1b497	fix(docs): typos in benchmark readme.md (#614) Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com>	2025-01-09 09:35:27 +01:00
Simon Alibert	32eb0cec8f	Dataset v2.0 (#461) Co-authored-by: Remi <remi.cadene@huggingface.co>	2024-11-29 19:04:00 +01:00
Simon Alibert	0b21210d72	Convert datasets to av1 encoding (#302)	2024-07-22 20:08:59 +02:00
Simon Alibert	e410e5d711	Improve video benchmark (#282) Co-authored-by: Alexander Soare <alexander.soare159@gmail.com> Co-authored-by: Remi <re.cadene@gmail.com>	2024-07-09 20:20:25 +02:00

21 Commits