lerobot

mirror of https://github.com/huggingface/lerobot.git synced 2026-07-08 18:41:54 +00:00

Files

T

Pepijn c1a0c601e2 feat(language): task_aug style + automatic ${task} rephrasing rotation

Adds task-prompt diversity (Xiao 2022 / CAST) without touching
``meta/tasks.parquet`` or forcing recipes to opt in. The plan reserved
``task_aug`` as a future style; this lands it now.

- ``language.py``: add ``task_aug`` to ``CORE_STYLES`` and
  ``PERSISTENT_STYLES``. ``column_for_style("task_aug")`` returns
  ``language_persistent`` so PR 2 writers route it correctly.

- ``language_render.py``: ``_resolve_task`` now consults the persistent
  slice for rows of ``style="task_aug", role="user"``. When any exist
  it picks one deterministically by ``sample_idx`` (blake2b-keyed, not
  Python's randomized hash) so an epoch sees every rephrasing of every
  episode while the same sample still resolves identically across
  reruns. Falls back to the canonical ``meta/tasks.parquet`` task when
  no rephrasings are present, so existing datasets and unannotated runs
  keep their behaviour. Explicit ``task=`` overrides still win.

- Tests: rephrasing coverage across samples, determinism on repeat
  ``sample_idx``, fallback when persistent has no ``task_aug`` rows,
  and explicit override priority.

Recipes get this for free: any ``${task}`` placeholder rotates through
the available rephrasings. Recipes that want the literal canonical task
can override the binding.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-30 16:45:39 +02:00

test_aggregate.py

feat(dependencies): minimal default tag install (#3362 )

2026-04-12 20:03:04 +02:00

test_compute_stats.py

feat(dependencies): minimal default tag install (#3362 )

2026-04-12 20:03:04 +02:00

test_dataset_metadata.py

feat(dependencies): minimal default tag install (#3362 )

2026-04-12 20:03:04 +02:00

test_dataset_reader.py

feat(dependencies): minimal default tag install (#3362 )