From e7276880524d46c9b6e915774c521112fcdfe847 Mon Sep 17 00:00:00 2001
From: Pepijn
Date: Fri, 15 May 2026 14:14:42 +0200
Subject: [PATCH] =?UTF-8?q?annotate:=20telegraphic=20subtasks=20=E2=80=94?=
 =?UTF-8?q?=20=E2=89=A44=20words,=20verb+object,=20consistent=20nouns?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Tighten the subtask prompt further per real-data feedback. The old
≤5-word cap still produced things like "release the yellow block into
the green bin" (8 words, articles, destination, and "block" where the
task said "cube").

New rules:

* Hard cap ≤ 4 words, ideally 2-3. Form: VERB + (color) + OBJECT.
* No articles, no destinations, no adverbs, no "robot/arm/gripper".
* Must reuse the exact object nouns from the task — no block/cube,
  bin/box/container drift across the episode.
* Concrete good/bad examples anchored on the cube task.

Shorter, templated, consistent targets are far more robust for the
autoregressive LM head — fewer tokens to drift on, fewer dominant
n-grams to repetition-collapse into.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .../prompts/module_1_subtasks.txt             | 23 ++++++++++++-------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/src/lerobot/annotations/steerable_pipeline/prompts/module_1_subtasks.txt b/src/lerobot/annotations/steerable_pipeline/prompts/module_1_subtasks.txt
index 10044fc0b..56d14d42f 100644
--- a/src/lerobot/annotations/steerable_pipeline/prompts/module_1_subtasks.txt
+++ b/src/lerobot/annotations/steerable_pipeline/prompts/module_1_subtasks.txt
@@ -4,17 +4,24 @@ The user originally asked: "{episode_task}"
 
 You are shown the entire demonstration as a single video. Watch the
 whole clip, then segment it into a list of consecutive atomic subtasks
-the robot performs. Write **ultra-compact** action labels.
+the robot performs. Write **telegraphic** action labels.
 
 Authoring rules — Hi Robot atom granularity, pi0.7-style short prompts:
 
 - Each subtask = one atomic skill the low-level policy can execute.
-- **Hard length cap: ≤ 5 words per subtask.** Drop articles, modifiers,
-  adverbs. Verb + object (+ optional short qualifier) only.
-- Prefer: "pick lettuce", "place bowl in box", "open drawer",
-  "grasp sponge left hand".
-- Avoid: "pick up one piece of lettuce", "carefully place the bowl",
-  "the robot moves its arm to the left".
+- **Hard length cap: ≤ 4 words.** Ideally 2-3. Form: VERB + (color) +
+  OBJECT. No articles ("the", "a"), no destinations, no adverbs, no
+  "robot"/"arm"/"gripper" — those are implied.
+- **Use the exact object nouns from the task above.** If the task says
+  "cube", every subtask says "cube" — never switch to "block". If it
+  says "box", never switch to "bin"/"container". Consistent vocabulary
+  across the whole episode.
+- Good: "move to blue cube", "grasp blue cube", "lift blue cube",
+  "place blue cube", "open drawer", "release yellow cube".
+- Bad: "release the yellow block into the green bin" (articles,
+  destination, "block" instead of "cube"), "the robot arm moves
+  towards the blue cube" ("the robot arm", too long), "carefully
+  pick up the cube" (adverb, article).
 - Subtasks are non-overlapping and cover the full episode in order.
   Choose the cut points yourself based on what you see in the video
   (gripper open/close events, contact, regrasps, transitions).
@@ -27,7 +34,7 @@ Output strictly valid JSON of shape:
 
 {{
   "subtasks": [
-    {{"text": "<≤5-word verb phrase>", "start": , "end": }},
+    {{"text": "<≤4-word verb phrase>", "start": , "end": }},
     ...
   ]
 }}
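
Reviewer note: the new rules are mechanical enough that the annotator's output could be spot-checked after generation. The sketch below is not part of this patch. `check_subtask`, its constants, and the `banned_synonyms` argument are hypothetical names introduced for illustration. It covers only the word cap, the article ban, the "robot"/"arm"/"gripper" ban, and noun drift against a caller-supplied synonym list. Adverbs and destinations are not checked, since that would need real language tooling.

```python
import re

# Hypothetical post-hoc checker for subtask labels (illustration only,
# not part of the patch). Constants mirror the rules in the prompt.
MAX_WORDS = 4
ARTICLES = {"the", "a", "an"}
IMPLIED = {"robot", "arm", "gripper"}


def check_subtask(text, banned_synonyms=()):
    """Return a list of rule violations for one subtask label.

    banned_synonyms: nouns the task does NOT use (e.g. "block" when the
    task says "cube"); supplied by the caller, an assumption of this sketch.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    seen = set(words)
    problems = []
    if len(words) > MAX_WORDS:
        problems.append(f"too long ({len(words)} words)")
    if ARTICLES & seen:
        problems.append("contains article")
    if IMPLIED & seen:
        problems.append("names robot/arm/gripper")
    drift = sorted(seen & set(banned_synonyms))
    if drift:
        problems.append("noun drift: " + ", ".join(drift))
    return problems


if __name__ == "__main__":
    for label in ["grasp blue cube",
                  "release the yellow block into the green bin"]:
        print(label, "->", check_subtask(label, banned_synonyms={"block", "bin"}))
```

Run against the examples in the prompt, the "Good" labels come back with no violations, and each "Bad" label trips at least the length and article checks.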