mirror of
https://github.com/huggingface/lerobot.git
synced 2026-05-24 21:19:53 +00:00
feat(annotations): enforce imperative verb-first subtask phrasing

Rewrite module_1_subtasks prompt to produce short imperative commands
("pick up the orange") instead of third-person narration ("the robot
arm moves to the orange"). Drops the verbose "how, not what" rule and
adds a good/bad few-shot table.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -6,15 +6,18 @@ You are shown the entire demonstration as a single video. Watch the
 whole clip, then segment it into a list of consecutive atomic subtasks
 the robot performs.
 
-Authoring rules — based on Hi Robot (Shi 2025) atom granularity and
-Pi0.7 (Physical Intelligence 2025) "how, not what" detail:
+Authoring rules — based on Hi Robot (Shi 2025) atom granularity:
 
 - Each subtask is one atomic skill the low-level policy can execute,
-e.g. "pick up one piece of lettuce", "place the bowl into the box",
-"move the right arm to the left".
-- Capture HOW the subtask is performed, not only WHAT — e.g. prefer
-"grasp the handle of the sponge with the left hand" to "pick up the
-sponge".
+e.g. "pick up the orange", "place the bowl into the box".
+- Write each subtask as an IMPERATIVE COMMAND to the robot, starting
+with a verb: move, reach, pick up, grasp, place, put, push, pull,
+open, close, turn, press, lift, insert, pour...
+- NEVER use third person. Never write "the robot", "the arm", "the
+gripper moves", "it picks up". Command the robot, do not describe it.
+- Keep it SHORT — 3 to 8 words. Add a "how" detail (which hand, which
+grasp point) ONLY when it is needed to disambiguate.
+- Lower-case, no trailing period.
 - Subtasks are non-overlapping and cover the full episode in order.
 Choose the cut points yourself based on what you see in the video
 (gripper open/close events, contact, regrasps, transitions).
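The style rules added in this hunk are mechanical enough to lint after generation. The following is a minimal sketch, not code from this commit: `lint_subtask` is a hypothetical helper, and the banned-phrase pattern and subject check are assumptions derived only from the rule text above. Because the prompt's verb list is open-ended ("pour..."), the sketch flags subject-first phrasing instead of checking against a closed verb list.

```python
import re

# Hypothetical lint helper (not part of lerobot). It mirrors the prompt rules:
# 3 to 8 words, lower-case, no trailing period, no third-person narration.
BANNED = re.compile(r"\b(the robot|the arm|the gripper moves|it picks up)\b")


def lint_subtask(text: str) -> list[str]:
    """Return a list of rule violations; an empty list means the text passes."""
    problems = []
    n_words = len(text.split())
    if not 3 <= n_words <= 8:
        problems.append(f"expected 3 to 8 words, got {n_words}")
    if text != text.lower():
        problems.append("not lower-case")
    if text.rstrip().endswith("."):
        problems.append("trailing period")
    lowered = text.lower()
    # Assumption: a command never opens with "the ..." or "it ...".
    if lowered.startswith(("the ", "it ")):
        problems.append("starts with a subject, not a verb")
    if BANNED.search(lowered):
        problems.append("third-person phrase")
    return problems
```

A check like this could run on each returned `text` field and trigger a retry when it reports problems; whether the annotation pipeline does anything of the sort is not shown in this diff.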
@@ -23,11 +26,22 @@ Pi0.7 (Physical Intelligence 2025) "how, not what" detail:
 - Every subtask's [start_time, end_time] must lie within
 [0.0, {episode_duration}] seconds.
 
+Style examples:
+
+Good                          Bad (do NOT produce these)
+"pick up the orange"          "the robot arm moves to the orange"
+"move to the yellow block"    "the gripper approaches the block"
+"close gripper to grasp       "close the gripper to grasp the
+ the yellow cube"              yellow cube so it can lift it"
+"open the toaster oven"       "it opens the toaster oven door"
+"put the bagel on the         "the white plate now has the bagel
+ white plate"                  placed on it by the arm"
+
 Output strictly valid JSON of shape:
 
 {{
 "subtasks": [
-{{"text": "<how-not-what>", "start": <float>, "end": <float>}},
+{{"text": "<short imperative command>", "start": <float>, "end": <float>}},
 ...
 ]
 }}
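The output contract in this hunk (a `subtasks` list whose `start`/`end` values stay inside `[0.0, {episode_duration}]`, ordered and non-overlapping) can be checked when the model's reply is parsed. The sketch below is illustrative only: `check_subtasks` and its `tol` parameter are hypothetical names, and it assumes the doubled braces in the prompt are `str.format` escapes, so the model is expected to return ordinary single-brace JSON. It does not enforce gap-free coverage, since the prompt does not say how strict that should be.

```python
import json


def check_subtasks(raw: str, episode_duration: float, tol: float = 1e-6) -> list[dict]:
    """Parse the model reply and validate subtask timing (hypothetical helper)."""
    subtasks = json.loads(raw)["subtasks"]
    prev_end = 0.0
    for i, st in enumerate(subtasks):
        start, end = float(st["start"]), float(st["end"])
        # Each interval must be non-empty and lie within the episode.
        if not (0.0 <= start < end <= episode_duration + tol):
            raise ValueError(f"subtask {i}: [{start}, {end}] outside episode bounds")
        # Intervals must be in order and must not overlap the previous one.
        if start < prev_end - tol:
            raise ValueError(f"subtask {i}: overlaps the previous subtask")
        prev_end = end
    return subtasks


reply = (
    '{"subtasks": ['
    '{"text": "pick up the orange", "start": 0.0, "end": 2.5}, '
    '{"text": "place the orange in the bowl", "start": 2.5, "end": 6.0}]}'
)
validated = check_subtasks(reply, episode_duration=6.0)
```

Raising on the first violation keeps the sketch short; a real pipeline might instead collect every violation and feed it back to the model.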