Files
lerobot/examples
Pepijn 1394a6ab5d chore(annotate): bump diversity knobs ~3x to fight memorisation
Following Pi0.7 §V (prompt expansion / diverse context conditioning),
push more atom variants per episode and higher VLM sampling
temperature so the training distribution has enough wording diversity
that the LM head is forced to use its parameters rather than memorise
specific (prompt, target) pairs.

Changes vs prior annotation pass:

  * vlm.temperature: 0.2 (default) → 0.7 — every Module-1/2/3 call
    now produces diverse phrasings; same prompt yields different
    completions across emissions.
  * module_1.n_task_rephrasings: 10 → 30 — three times as many
    ``task_aug`` rows in language_persistent. ``${task}`` already
    rotates through them deterministically per sample_idx (see
    ``_resolve_task`` in language_render.py).
  * module_2.max_interjections_per_episode: 3 (default) → 9 — more
    ``user_interjection_response`` training samples + more plan
    refresh events.
  * module_3.K: 1 → 3 — three VQA pairs per emission tick instead of
    one. Combined with the hz bump below, ~6× more VQA samples.
  * module_3.vqa_emission_hz: 1.0 → 2.0 — double the VQA emission
    rate within each subtask span.

Pushes to a new hub repo (``_tool3``) so the working ``_tool2``
dataset stays intact for comparison. ``${task}`` already wired to
rotate through ``task_aug`` rows, so no renderer change needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 14:32:05 +02:00
..