chore(training): STEPS=15000 default + dropout walked back to 0.30/0.30/0.20

After _tool-good (2000 steps, 0.50/0.50/0.20 dropout) the LM head's
position-0 distribution shifted from EOS to subtask-vocabulary
tokens, but the model still emitted bag-of-words output ("cube arm
and") rather than well-formed sentences. That's the expected
mid-fine-tuning phase: token-level supervision has landed,
sequence-level grammar hasn't.
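The position-0 check above can be reproduced with a short diagnostic: softmax the LM head's logits at the first generated position and list the top token ids. This is a sketch of the idea only; the function name and the plain-list logits input are illustrative, not the smolvla2 API.

```python
import math

def top_tokens_at_position_0(logits_row, k=3):
    # logits_row: LM head logits at generation position 0, one float per vocab id.
    # Numerically stable softmax, then rank vocab ids by probability.
    m = max(logits_row)
    exps = [math.exp(x - m) for x in logits_row]
    z = sum(exps)
    probs = [e / z for e in exps]
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]
```

If the EOS id dominates the top-1 slot here, the head is still committed to its pretraining prior; a shift toward subtask-vocabulary ids is the signal described above.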

Two changes for the next retrain:

  * STEPS=15000 (from 2000) — chat-pretrained backbones need O(10k+)
    steps to walk their pretraining priors down far enough to commit
    to the fine-tuned distribution structurally, not just at the
    token level. _tool-g2's bag-of-words output proves the model is
    on the right path; it just needs more gradient signal.

  * plan/memory dropout 0.50 -> 0.30 — 0.50 was probably too
    aggressive for a small dataset. Half the training samples had
    crucial context missing, which slows down learning the full
    conditional structure. 0.30 still regularises against prompt
    leakage but lets the model learn proper grammar first; the
    higher dropout can be revisited once the head is solid.
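For reference, the dropout scheme discussed above amounts to independently blanking each context field per training sample. This is a hypothetical sketch of that behaviour, not the actual lerobot implementation; field names and defaults mirror the flags in the diff below.

```python
import random

def apply_prompt_dropout(sample, probs=None):
    # Blank each prompt field independently with its own probability.
    # Defaults mirror the new training config: plan/memory 0.30, subtask 0.20.
    probs = probs or {"plan": 0.30, "memory": 0.30, "subtask": 0.20}
    out = dict(sample)
    for field, p in probs.items():
        if field in out and random.random() < p:
            out[field] = ""  # this sample trains without that field's text
    return out
```

At 0.50/0.50 the model saw a plan-free prompt on half its samples, which is the "crucial context missing" effect the walk-back addresses.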

Subtask dropout stays at 0.20 since subtask isn't in the high-level
prompt anyway (recipe fix removed the "Current subtask:" message).
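As a back-of-the-envelope check (assuming plan and memory dropout are applied independently), the fraction of samples that see full context roughly doubles under the new setting:

```python
def both_present(p_plan, p_memory):
    # Probability a sample keeps both plan and memory, given
    # independent per-field dropout probabilities.
    return (1 - p_plan) * (1 - p_memory)

old = both_present(0.50, 0.50)  # 0.25: only a quarter of samples saw full context
new = both_present(0.30, 0.30)  # ~0.49: roughly half do now
```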

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
commit 1d24301b67
parent 3a20ea337e
Author: Pepijn
Date:   2026-05-13 10:46:19 +02:00
+4 -4
@@ -39,7 +39,7 @@ POLICY_REPO_ID="${POLICY_REPO_ID:-pepijn223/smolvla2_hirobot_super_poulain_tool6
JOB_NAME="${JOB_NAME:-smolvla2-hirobot-super-poulain-tool6}"
NUM_PROCESSES="${NUM_PROCESSES:-8}"
BATCH_SIZE="${BATCH_SIZE:-32}"
-STEPS="${STEPS:-2000}"
+STEPS="${STEPS:-15000}"
RUN_ID="${SLURM_JOB_ID:-$(date +%Y%m%d_%H%M%S)}"
OUTPUT_DIR="${OUTPUT_DIR:-/fsx/pepijn/outputs/train/smolvla2_hirobot_super_poulain_tool3_${STEPS}_${RUN_ID}}"
@@ -48,7 +48,7 @@ echo " GPUs: $NUM_PROCESSES"
echo " batch: $BATCH_SIZE / GPU (global=$((NUM_PROCESSES * BATCH_SIZE)))"
echo " steps: $STEPS"
echo " output: $OUTPUT_DIR"
echo " augmentation: image_transforms ON, prompt dropout {plan:0.50 memory:0.50 subtask:0.20}"
echo " augmentation: image_transforms ON, prompt dropout {plan:0.30 memory:0.30 subtask:0.20}"
accelerate launch --multi_gpu --num_processes="$NUM_PROCESSES" \
-m lerobot.scripts.lerobot_train \
@@ -75,6 +75,6 @@ accelerate launch --multi_gpu --num_processes="$NUM_PROCESSES" \
--dataset.image_transforms.enable=true \
--dataset.image_transforms.max_num_transforms=3 \
--dataset.image_transforms.random_order=true \
-  --policy.plan_dropout_prob=0.50 \
-  --policy.memory_dropout_prob=0.50 \
+  --policy.plan_dropout_prob=0.30 \
+  --policy.memory_dropout_prob=0.30 \
--policy.subtask_dropout_prob=0.20