Files
lerobot/benchmarks
pepijn 2ab71231cd feat(streaming): defer video decode, episode-pool shuffle, and remote-IO retries
- streaming_dataset: defer torchcodec decode until a sample leaves the shuffle
  buffer (buffer now holds ~KB tabular rows, not MB of pixels) and add an opt-in
  episode-pool shuffle (episode_pool_size) with exact in-episode delta lookups;
  expose decode/fetch timing_stats.
- video_utils: retry transient hf:///fsspec/httpx transport errors during
  streaming decode (LEROBOT_REMOTE_IO_MAX_RETRIES).
- dataset_tools: write multiple ~32MB row groups with a page index to bound
  per-shard streaming memory.
- benchmarks/slurm: streaming benchmark + matrix submitter updates.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-11 10:08:28 +00:00
..