lerobot/examples
pepijn 674c990a39 feat(streaming): default episode pool 1024 and wire streaming into lerobot-train
Raise the default episode_pool_size to 1024 (DatasetConfig + StreamingLeRobotDataset)
for better default shuffle quality at scale.

Streaming is now a first-class option of the main train script. When cfg.dataset.streaming
is set, the dataloader is not handed to accelerate: the dataset is already rank-disjoint via
split_dataset_by_node, so IterableDatasetShard would drop (N-1)/N of each rank's stream.
Batches are instead moved to the device manually, and the episode-aware sampler is skipped.
Remove the standalone examples/scaling/train_streaming_multinode.py example in favor of this
wiring.
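
The (N-1)/N figure can be checked with a toy model in plain Python. This is only a sketch of the double-sharding effect, not the actual accelerate or datasets code: split_by_rank and shard_batches are hypothetical stand-ins, and the round-robin split and the "keep this rank's slice of each global batch" behaviour are simplifying assumptions.

```python
from typing import Dict, Iterable, Iterator, List


def split_by_rank(items: Iterable[int], rank: int, world_size: int) -> Iterator[int]:
    """Hypothetical rank-disjoint split: rank r keeps every world_size-th item."""
    for i, item in enumerate(items):
        if i % world_size == rank:
            yield item


def shard_batches(
    stream: Iterable[int], rank: int, world_size: int, batch_size: int
) -> Iterator[int]:
    """Hypothetical per-process shard wrapper.

    Buffers a "global batch" of batch_size * world_size items and keeps only
    the slice belonging to this rank. An incomplete final buffer is dropped.
    """
    buf: List[int] = []
    for item in stream:
        buf.append(item)
        if len(buf) == batch_size * world_size:
            yield from buf[rank * batch_size : (rank + 1) * batch_size]
            buf = []


def simulate(num_items: int = 64, world_size: int = 4, batch_size: int = 2) -> Dict[int, List[int]]:
    """Apply both layers of sharding and return the items each rank ends up with."""
    kept: Dict[int, List[int]] = {}
    for rank in range(world_size):
        # First layer: the stream is already disjoint across ranks.
        local_stream = split_by_rank(range(num_items), rank, world_size)
        # Second layer: sharding again discards most of the local stream.
        kept[rank] = list(shard_batches(local_stream, rank, world_size, batch_size))
    return kept


kept = simulate()
total_seen = sum(len(v) for v in kept.values())
dropped_fraction = 1 - total_seen / 64
print(kept[0], total_seen, dropped_fraction)
```

With 64 items and 4 ranks, each rank's local stream holds 16 items but only 4 survive the second shard, so 3/4 of every rank's stream is discarded. Skipping the second layer for streaming datasets avoids that loss.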

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 09:24:32 +00:00
2026-05-12 15:49:54 +02:00