diff --git a/src/lerobot/annotations/steerable_pipeline/prompts/module_1_subtasks.txt b/src/lerobot/annotations/steerable_pipeline/prompts/module_1_subtasks.txt
index 10044fc0b..56d14d42f 100644
--- a/src/lerobot/annotations/steerable_pipeline/prompts/module_1_subtasks.txt
+++ b/src/lerobot/annotations/steerable_pipeline/prompts/module_1_subtasks.txt
@@ -4,17 +4,24 @@ The user originally asked: "{episode_task}"
 
 You are shown the entire demonstration as a single video. Watch the
 whole clip, then segment it into a list of consecutive atomic subtasks
-the robot performs. Write **ultra-compact** action labels.
+the robot performs. Write **telegraphic** action labels.
 
 Authoring rules — Hi Robot atom granularity, pi0.7-style short prompts:
 
 - Each subtask = one atomic skill the low-level policy can execute.
-- **Hard length cap: ≤ 5 words per subtask.** Drop articles, modifiers,
-  adverbs. Verb + object (+ optional short qualifier) only.
-- Prefer: "pick lettuce", "place bowl in box", "open drawer",
-  "grasp sponge left hand".
-- Avoid: "pick up one piece of lettuce", "carefully place the bowl",
-  "the robot moves its arm to the left".
+- **Hard length cap: ≤ 4 words.** Ideally 2-3. Form: VERB + (color) +
+  OBJECT. No articles ("the", "a"), no destinations, no adverbs, no
+  "robot"/"arm"/"gripper" — those are implied.
+- **Use the exact object nouns from the task above.** If the task says
+  "cube", every subtask says "cube" — never switch to "block". If it
+  says "box", never switch to "bin"/"container". Consistent vocabulary
+  across the whole episode.
+- Good: "move to blue cube", "grasp blue cube", "lift blue cube",
+  "place blue cube", "open drawer", "release yellow cube".
+- Bad: "release the yellow block into the green bin" (articles,
+  destination, "block" instead of "cube"), "the robot arm moves
+  towards the blue cube" ("the robot arm", too long), "carefully
+  pick up the cube" (adverb, article).
 - Subtasks are non-overlapping and cover the full episode in order.
   Choose the cut points yourself based on what you see in the video
   (gripper open/close events, contact, regrasps, transitions).
@@ -27,7 +34,7 @@ Output strictly valid JSON of shape:
 
 {{
   "subtasks": [
-    {{"text": "<≤5-word verb phrase>", "start": , "end": }},
+    {{"text": "<≤4-word verb phrase>", "start": , "end": }},
     ...
   ]
 }}
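
Review note: the tightened rules in this prompt are mechanical enough that a post-hoc check on the model's JSON could catch regressions. The sketch below is not part of this change and does not use any lerobot API. The names `label_violations`, `vocabulary_violations`, `check_response`, and the `SWAPS` table are hypothetical. It assumes `start` and `end` are numeric timestamps (the diff does not show their type), it uses a crude "-ly" heuristic for adverbs, and it does not try to detect destinations.

```python
import json
import re

# Assumption: these constants mirror the prompt's wording; they are illustrative only.
MAX_WORDS = 4
ARTICLES = {"the", "a", "an"}
IMPLIED_WORDS = {"robot", "arm", "gripper"}
# Synonym swaps the prompt calls out explicitly; a real table would be per dataset.
SWAPS = {"cube": {"block"}, "box": {"bin", "container"}}


def _words(text: str) -> list[str]:
    """Lowercase word tokens, ignoring punctuation."""
    return re.findall(r"[a-z0-9'-]+", text.lower())


def label_violations(text: str) -> list[str]:
    """Return the length and wording rules that one subtask label breaks."""
    words = _words(text)
    problems = []
    if not words:
        problems.append("empty label")
    if len(words) > MAX_WORDS:
        problems.append(f"{len(words)} words exceeds cap of {MAX_WORDS}")
    if ARTICLES & set(words):
        problems.append("contains article")
    if IMPLIED_WORDS & set(words):
        problems.append("names robot/arm/gripper")
    # Crude heuristic: flags any word ending in "ly", so expect false positives.
    if any(w.endswith("ly") for w in words):
        problems.append("possible adverb")
    return problems


def vocabulary_violations(task: str, text: str) -> list[str]:
    """Flag labels that swap a task noun for one of its listed synonyms."""
    task_words = set(_words(task))
    words = set(_words(text))
    problems = []
    for noun, alternates in SWAPS.items():
        if noun in task_words:
            for used in sorted(alternates & words):
                problems.append(f'uses "{used}" but task says "{noun}"')
    return problems


def check_response(task: str, raw: str) -> dict[str, list[str]]:
    """Map each offending label in one JSON response to its violations."""
    report = {}
    previous_end = None
    for item in json.loads(raw)["subtasks"]:
        text = item["text"]
        problems = label_violations(text) + vocabulary_violations(task, text)
        # Assumption: start/end are numbers on a shared timeline.
        if item["start"] >= item["end"]:
            problems.append("start is not before end")
        if previous_end is not None and item["start"] < previous_end:
            problems.append("overlaps previous subtask")
        previous_end = item["end"]
        if problems:
            report[text] = problems
    return report
```

Running `check_response` over a batch of annotated episodes and counting non-empty reports would give a rough measure of how often the model ignores the new 4-word cap or drifts to a different noun.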