After the recipe fix (target=${subtask} at every frame) the model
can still reach low text_loss by reading the answer off the plan in
the prompt: at training time the prompt contains the 6-step plan, and
the current subtask is one of those steps, so the model just learns
"active step N matches subtask N" and never needs to look at the
image. Symptom at inference: the subtask string is set but never
updates, because the model isn't really conditioning on visual
progress.
Drop plan and memory with p=0.50 each: for a large fraction of
training frames the prompt is then just "${task}" (constant for this
dataset) plus the visual prefix, which is the only place the answer
can come from. This forces the LM head to actually use vision.
``subtask_dropout`` stays at 0.20 because subtask isn't in the
high-level prompt anymore (recipe fix removed the "Current subtask:
X" message); the knob still affects other sub-recipes that reference
it as context.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Match the operator's current training command for the _tool6 retrain:
* default DATASET / POLICY_REPO_ID / JOB_NAME point at the tool6
iteration (super_poulain_full_tool3 → smolvla2_hirobot_super_poulain_tool6)
* STEPS default 2000 (short enough to iterate; bump to 10k for full)
* save_freq=$STEPS so the only checkpoint is the final one
* OUTPUT_DIR includes step count so successive runs don't clobber
* Drop the wider augmentation envelope I added earlier and go back
to the default ColorJitter ranges (brightness ±20%, etc.), since
the high_level_subtask recipe fix (current-subtask supervision) is
expected to fix the LM-head collapse on its own; the augmentation
is just the standard regulariser, not a load-bearing widener.
* prompt-dropout fractions stay at the original 0.15 / 0.15 / 0.20.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The tensor-level comparison between dry-run (dataset frame) and live-
robot inference ruled out a runtime bug: shape, dtype, device,
channel order, batch dim, and normalization are identical on both
paths. The remaining variable: front-camera mean brightness was 0.26
live vs 0.39 on the dataset frame, ~33% darker. Training augmentation
only covered ±20% brightness, so the live scene sits just outside the
supervised envelope and the LM head collapses to its dominant prior.
Widen the augmentation knobs for the next retrain:
* brightness 0.8–1.2 → 0.5–1.6 (covers up to 50% darker / 60%
lighter, with margin for the ~33%-darker live scene)
* contrast 0.8–1.2 → 0.6–1.5
* saturation 0.5–1.5 → 0.3–1.7
* hue ±0.05 → ±0.10
* affine ±5°/±5% → ±15°/±15% (covers cube placement / camera drift)
* max_num_transforms 3 → 4
And bump prompt-component dropout (subtask 0.20 → 0.30) so the LM
can't lean on stale memorised prompt context at inference.
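The envelope arithmetic above can be checked directly (brightness
values from the dry-run comparison, ranges from the bullet list):

```python
# Live front camera vs matching dataset frame, mean brightness.
live, dataset = 0.26, 0.39
ratio = live / dataset                  # ~0.667, i.e. the scene is ~33% darker
assert abs(ratio - 2 / 3) < 0.01

old_lo, old_hi = 0.8, 1.2               # previous ColorJitter brightness range
new_lo, new_hi = 0.5, 1.6               # widened range proposed above
assert not (old_lo <= ratio <= old_hi)  # old envelope misses the live scene
assert new_lo <= ratio <= new_hi        # widened envelope covers it with margin
```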
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two complementary regularisers to attack the ``text_loss=6e-6``
failure mode (the model has memorised the single dataset) that is
making it collapse on real-robot input:
1. **Per-component prompt dropout** (Pi0.7 §V.E / plan's
``feat/pi05-prompt-dropout`` follow-up).
``SmolVLA2ChatTokenizerStep`` gains
``plan_dropout_prob`` / ``memory_dropout_prob`` /
``subtask_dropout_prob`` knobs (default 0.0 — opt-in). At training,
non-target messages whose rendered content starts with
``Plan:`` / ``Memory:`` / ``Current subtask:`` etc. are dropped
with their respective probability before tokenisation, with a
deterministic per-sample RNG keyed off the dataset ``index``.
``target_message_indices`` is re-mapped so the supervision still
lands on the right turn. Forces the model to handle missing
plan/memory/subtask context — directly attacks the real-robot
collapse where a stale or empty plan field puts the prompt OOD.
Surfaced on ``SmolVLA2Config`` as three floats so they're
``--policy.<knob>=<value>``-controllable from the train CLI;
plumbed through ``make_smolvla2_pre_post_processors``.
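A minimal sketch of the dropout step described above. The message
structure, function name, and ``PREFIXES`` mapping are illustrative,
not ``SmolVLA2ChatTokenizerStep``'s actual API; only the mechanism
(prefix match, per-sample RNG keyed off the dataset index, target
re-mapping) follows the commit:

```python
import random

# Hypothetical prefix -> component-name mapping; the real step matches
# rendered message content against "Plan:" / "Memory:" / "Current subtask:".
PREFIXES = {"Plan:": "plan", "Memory:": "memory", "Current subtask:": "subtask"}

def drop_prompt_components(messages, target_indices, index, probs):
    """Drop non-target plan/memory/subtask messages with their per-component
    probability. The RNG is seeded with the dataset `index`, so the same
    frame sees the same mask every epoch. Returns the kept messages and the
    re-mapped target indices, so supervision still lands on the right turn."""
    rng = random.Random(index)  # deterministic per-sample RNG
    targets = set(target_indices)
    kept, remapped = [], []
    for i, msg in enumerate(messages):
        comp = next((c for p, c in PREFIXES.items()
                     if msg["content"].startswith(p)), None)
        # Never drop a supervised (target) message.
        if comp is not None and i not in targets and rng.random() < probs.get(comp, 0.0):
            continue
        if i in targets:
            remapped.append(len(kept))
        kept.append(msg)
    return kept, remapped

msgs = [
    {"content": "Task: pick the cube"},
    {"content": "Plan: 1) reach 2) grasp ..."},
    {"content": "Memory: none"},
    {"content": "Current subtask: grasp"},
]
# With both probabilities at 1.0, plan and memory are always dropped and the
# target index 3 is re-mapped to 1.
kept, tgt = drop_prompt_components(msgs, [3], index=7,
                                   probs={"plan": 1.0, "memory": 1.0})
assert [m["content"] for m in kept] == ["Task: pick the cube",
                                        "Current subtask: grasp"]
assert tgt == [1]
```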
2. **Image augmentation** is already wired in lerobot via
``--dataset.image_transforms.enable=true`` (torchvision v2
ColorJitter + SharpnessJitter + RandomAffine, default 3 of 6
sampled per frame). No code change needed — just a CLI flag.
``examples/training/smolvla2_hirobot.slurm`` shows the full
training command with both enabled. Drop-in replacement for the
ad-hoc SLURM script Pepijn was using locally; same args, plus the
three dropout probs and the image-transforms flag.
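How the three knobs might sit on the config: the field names come
from the commit, but the dataclass shape, the draccus-style CLI
mapping, and the 0.15 / 0.15 / 0.20 values below are illustrative:

```python
from dataclasses import dataclass

@dataclass
class SmolVLA2Config:
    # Only the three new knobs shown; the real config has many more fields.
    plan_dropout_prob: float = 0.0      # default 0.0 -> dropout is opt-in
    memory_dropout_prob: float = 0.0
    subtask_dropout_prob: float = 0.0

# The train CLI would map e.g. --policy.plan_dropout_prob=0.15 onto these
# fields; constructing the config directly shows the same override.
cfg = SmolVLA2Config(plan_dropout_prob=0.15, memory_dropout_prob=0.15,
                     subtask_dropout_prob=0.20)
assert (cfg.plan_dropout_prob, cfg.memory_dropout_prob,
        cfg.subtask_dropout_prob) == (0.15, 0.15, 0.20)
```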
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Remove unused scripts, add docs for image transforms, and add an example
* fix(examples): move train_policy.py under examples, remove outdated readme parts
* remove script that's copied to the train folder
* remove outdated links to examples and example tests