annotate: telegraphic subtasks — ≤4 words, verb+object, consistent nouns

Tighten the subtask prompt further per real-data feedback. The old
≤5-word cap still produced things like "release the yellow block into
the green bin" (8 words, articles, destination, and "block" where the
task said "cube").

New rules:

* Hard cap ≤ 4 words, ideally 2-3. Form: VERB + (color) + OBJECT.
* No articles, no destinations, no adverbs, no "robot/arm/gripper".
* Must reuse the exact object nouns from the task: no block/cube or
  bin/box/container drift across the episode.
* Concrete good/bad examples anchored on the cube task.

Shorter, templated, consistent targets are far more robust for the
autoregressive LM head: fewer tokens to drift on, fewer dominant
n-grams to repetition-collapse into.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-07-08 02:22:02 +00:00 · 2026-05-15 14:14:42 +02:00
parent f1a0a663cc
commit e727688052
1 changed file with 15 additions and 8 deletions
@@ -4,17 +4,24 @@ The user originally asked: "{episode_task}"

 You are shown the entire demonstration as a single video. Watch the
 whole clip, then segment it into a list of consecutive atomic subtasks
-the robot performs. Write **ultra-compact** action labels.
+the robot performs. Write **telegraphic** action labels.

 Authoring rules — Hi Robot atom granularity, pi0.7-style short prompts:

 - Each subtask = one atomic skill the low-level policy can execute.
- **Hard length cap: ≤ 5 words per subtask.** Drop articles, modifiers,
-  adverbs. Verb + object (+ optional short qualifier) only.
- Prefer: "pick lettuce", "place bowl in box", "open drawer",
-  "grasp sponge left hand".
- Avoid: "pick up one piece of lettuce", "carefully place the bowl",
-  "the robot moves its arm to the left".
+- **Hard length cap: ≤ 4 words.** Ideally 2-3. Form: VERB + (color) +
+  OBJECT. No articles ("the", "a"), no destinations, no adverbs, no
+  "robot"/"arm"/"gripper" — those are implied.
+- **Use the exact object nouns from the task above.** If the task says
+  "cube", every subtask says "cube" — never switch to "block". If it
+  says "box", never switch to "bin"/"container". Consistent vocabulary
+  across the whole episode.
+- Good: "move to blue cube", "grasp blue cube", "lift blue cube",
+  "place blue cube", "open drawer", "release yellow cube".
+- Bad: "release the yellow block into the green bin" (articles,
+  destination, "block" instead of "cube"), "the robot arm moves
+  towards the blue cube" ("the robot arm", too long), "carefully
+  pick up the cube" (adverb, article).
 - Subtasks are non-overlapping and cover the full episode in order.
  Choose the cut points yourself based on what you see in the video
  (gripper open/close events, contact, regrasps, transitions).
@@ -27,7 +34,7 @@ Output strictly valid JSON of shape:

  {{
    "subtasks": [
-      {{"text": "<≤5-word verb phrase>", "start": <float>, "end": <float>}},
+      {{"text": "<≤4-word verb phrase>", "start": <float>, "end": <float>}},
      ...
    ]
  }}
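
The authoring rules in the updated prompt could also be checked mechanically on the model's output. The following is a minimal sketch, not part of this commit: `check_subtask`, `banned_synonyms`, and the word lists are hypothetical names chosen for illustration, the adverb test is a crude "-ly" suffix heuristic, and the "no destinations" rule is not checked at all.

```python
import re

# Hypothetical word lists derived from the prompt's rules (assumptions).
ARTICLES = {"the", "a", "an"}
IMPLIED = {"robot", "arm", "gripper"}
MAX_WORDS = 4


def check_subtask(text, banned_synonyms=()):
    """Return a list of rule violations for one subtask label (empty if OK).

    banned_synonyms: nouns that would count as drift for this task,
    e.g. ("block", "bin") when the task says "cube" and "box".
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    seen = set(words)
    problems = []
    if len(words) > MAX_WORDS:
        problems.append(f"too long ({len(words)} words)")
    if ARTICLES & seen:
        problems.append("article")
    if IMPLIED & seen:
        problems.append("implied actor")
    # Crude heuristic: flags any word ending in "ly" as an adverb.
    if any(w.endswith("ly") for w in words):
        problems.append("adverb")
    if set(banned_synonyms) & seen:
        problems.append("noun drift")
    return problems


if __name__ == "__main__":
    for label in ["grasp blue cube",
                  "release the yellow block into the green bin",
                  "carefully pick up the cube"]:
        print(label, "->", check_subtask(label, banned_synonyms=("block", "bin")))
```

A check like this would only catch surface violations; whether a label names the right atomic skill still has to be judged against the video.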