chore(training): bump plan/memory dropout to 0.50 to force vision-grounding

After the recipe fix (target=${subtask} at every frame) the model can still reach low text_loss by reading the answer off the plan in the prompt. At training time the prompt contains the 6-step plan, and the current subtask is one of those steps, so the model just learns "active step N matches subtask N" and never needs to look at the image.

Symptom at inference: the subtask string is set but never updates, because the model isn't actually conditioning on visual progress.

Fix: drop plan and memory with p=0.50 each. The plan is then absent on about half of the training frames; when memory is dropped as well, the prompt is just "${task}" (constant for this dataset) + visual prefix, so the image is the only place the answer can come from. This forces the LM head to actually use vision.

``subtask_dropout`` stays at 0.20 because the subtask is no longer in the high-level prompt (the recipe fix removed the "Current subtask: X" message); the knob still affects other sub-recipes that reference it as context.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-07-12 20:41:58 +00:00 · 2026-05-12 21:30:51 +02:00
parent bfd3bb1791
commit b6fb536460
1 changed files with 3 additions and 3 deletions
@@ -48,7 +48,7 @@ echo "  GPUs:         $NUM_PROCESSES"
 echo "  batch:        $BATCH_SIZE / GPU (global=$((NUM_PROCESSES * BATCH_SIZE)))"
 echo "  steps:        $STEPS"
 echo "  output:       $OUTPUT_DIR"
-echo "  augmentation: image_transforms ON, prompt dropout {plan:0.15 memory:0.15 subtask:0.20}"
+echo "  augmentation: image_transforms ON, prompt dropout {plan:0.50 memory:0.50 subtask:0.20}"

 accelerate launch --multi_gpu --num_processes="$NUM_PROCESSES" \
    -m lerobot.scripts.lerobot_train \
@@ -75,6 +75,6 @@ accelerate launch --multi_gpu --num_processes="$NUM_PROCESSES" \
    --dataset.image_transforms.enable=true \
    --dataset.image_transforms.max_num_transforms=3 \
    --dataset.image_transforms.random_order=true \
-    --policy.plan_dropout_prob=0.15 \
-    --policy.memory_dropout_prob=0.15 \
+    --policy.plan_dropout_prob=0.50 \
+    --policy.memory_dropout_prob=0.50 \
    --policy.subtask_dropout_prob=0.20
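
For reviewers, here is a minimal sketch of what the two raised probabilities mean for the prompts the model sees. It is not the lerobot implementation: `build_prompt` and `dropout_stats` are hypothetical helpers, the "Plan:" / "Memory:" prefixes are invented for illustration, and the plan and memory drops are assumed to be sampled independently per frame. Under that assumption roughly half of the frames lose the plan and roughly a quarter are reduced to the task string alone.

```python
import random


def build_prompt(task, plan, memory, plan_p, memory_p, rng):
    """Assemble a prompt, dropping plan and memory independently.

    Hypothetical helper: the real policy code may format and sample differently.
    """
    parts = [task]
    # rng.random() is in [0, 1), so p=0.0 always keeps and p=1.0 always drops.
    if rng.random() >= plan_p:
        parts.append("Plan: " + plan)
    if rng.random() >= memory_p:
        parts.append("Memory: " + memory)
    return "\n".join(parts)


def dropout_stats(n, plan_p, memory_p, seed=0):
    """Return (fraction of prompts without a plan, fraction that are task-only)."""
    rng = random.Random(seed)
    task = "tidy the table"           # placeholder task string
    plan = "1. pick cup 2. place cup"  # placeholder plan
    memory = "cup already picked"      # placeholder memory
    no_plan = 0
    task_only = 0
    for _ in range(n):
        prompt = build_prompt(task, plan, memory, plan_p, memory_p, rng)
        if "Plan: " not in prompt:
            no_plan += 1
        if prompt == task:
            task_only += 1
    return no_plan / n, task_only / n


if __name__ == "__main__":
    for p in (0.15, 0.50):
        no_plan, task_only = dropout_stats(20_000, p, p)
        print(f"p={p:.2f}: no plan {no_plan:.3f}, task only {task_only:.3f}")
```

At the old setting of 0.15 the plan is still present on most frames, which is consistent with the shortcut described in the commit message; whether 0.50 is enough to break it is something the training run has to show.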