annotate: telegraphic subtasks — ≤4 words, verb+object, consistent nouns

Tighten the subtask prompt further per real-data feedback. The old
≤5-word cap still produced things like "release the yellow block into
the green bin" (8 words, articles, destination, and "block" where the
task said "cube").

New rules:

* Hard cap ≤ 4 words, ideally 2-3. Form: VERB + (color) + OBJECT.
* No articles, no destinations, no adverbs, no "robot/arm/gripper".
* Must reuse the exact object nouns from the task: no block/cube or
  bin/box/container drift across the episode.
* Concrete good/bad examples anchored on the cube task.

Shorter, templated, consistent targets are far more robust for the
autoregressive LM head: fewer tokens to drift on, fewer dominant
n-grams to repetition-collapse into.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-07-25 02:36:11 +00:00 · 2026-05-15 14:14:42 +02:00
parent f1a0a663cc
commit e727688052
1 changed files with 15 additions and 8 deletions
@@ -4,17 +4,24 @@ The user originally asked: "{episode_task}"
 You are shown the entire demonstration as a single video. Watch the
 whole clip, then segment it into a list of consecutive atomic subtasks
-the robot performs. Write **ultra-compact** action labels.
+the robot performs. Write **telegraphic** action labels.
 Authoring rules — Hi Robot atom granularity, pi0.7-style short prompts:
 - Each subtask = one atomic skill the low-level policy can execute.
-- **Hard length cap: ≤ 5 words per subtask.** Drop articles, modifiers,
-  adverbs. Verb + object (+ optional short qualifier) only.
-- Prefer: "pick lettuce", "place bowl in box", "open drawer",
-  "grasp sponge left hand".
-- Avoid: "pick up one piece of lettuce", "carefully place the bowl",
-  "the robot moves its arm to the left".
+- **Hard length cap: ≤ 4 words.** Ideally 2-3. Form: VERB + (color) +
+  OBJECT. No articles ("the", "a"), no destinations, no adverbs, no
+  "robot"/"arm"/"gripper" — those are implied.
+- **Use the exact object nouns from the task above.** If the task says
+  "cube", every subtask says "cube" — never switch to "block". If it
+  says "box", never switch to "bin"/"container". Consistent vocabulary
+  across the whole episode.
+- Good: "move to blue cube", "grasp blue cube", "lift blue cube",
+  "place blue cube", "open drawer", "release yellow cube".
+- Bad: "release the yellow block into the green bin" (articles,
+  destination, "block" instead of "cube"), "the robot arm moves
+  towards the blue cube" ("the robot arm", too long), "carefully
+  pick up the cube" (adverb, article).
 - Subtasks are non-overlapping and cover the full episode in order.
  Choose the cut points yourself based on what you see in the video
  (gripper open/close events, contact, regrasps, transitions).
@@ -27,7 +34,7 @@ Output strictly valid JSON of shape:
  {{
    "subtasks": [
-      {{"text": "<≤5-word verb phrase>", "start": <float>, "end": <float>}},
+      {{"text": "<≤4-word verb phrase>", "start": <float>, "end": <float>}},
      ...
    ]
  }}
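
The rules added to the prompt are mechanical enough that the model's output could be linted after the fact. The following is a minimal sketch of such a check, not part of this commit: `check_label`, `check_output`, `FORBIDDEN`, `MAX_WORDS`, and the `drift_words` argument are all hypothetical names, the adverb test is a crude "-ly" heuristic, and destination phrases are not detected at all (the allowed label "move to blue cube" makes a simple preposition ban unsafe).

```python
import json

MAX_WORDS = 4
# Words the prompt forbids outright: articles and implied actors.
FORBIDDEN = {"the", "a", "an", "robot", "arm", "gripper"}


def check_label(text, drift_words=frozenset()):
    """Return a list of rule violations for one subtask label.

    drift_words is a caller-supplied set of synonyms the task did NOT use
    (for example {"block", "bin"} when the task says "cube" and "box").
    """
    words = [w.strip('.,;:"\'') for w in text.lower().split()]
    problems = []
    if len(words) > MAX_WORDS:
        problems.append(f"too long ({len(words)} words)")
    forbidden = sorted(FORBIDDEN.intersection(words))
    if forbidden:
        problems.append("forbidden words: " + ", ".join(forbidden))
    # Heuristic only: flags "carefully", but would also flag "assembly".
    adverbs = [w for w in words if len(w) > 3 and w.endswith("ly")]
    if adverbs:
        problems.append("likely adverb: " + ", ".join(adverbs))
    drift = sorted(set(drift_words).intersection(words))
    if drift:
        problems.append("noun drift: " + ", ".join(drift))
    return problems


def check_output(raw, drift_words=frozenset()):
    """Lint a JSON reply of the shape {"subtasks": [{"text", "start", "end"}]}.

    Returns (index, text, problems) tuples for every subtask with a problem.
    """
    data = json.loads(raw)
    report = []
    prev_end = 0.0
    for i, sub in enumerate(data["subtasks"]):
        problems = check_label(sub["text"], drift_words)
        if sub["start"] >= sub["end"]:
            problems.append("start not before end")
        if sub["start"] < prev_end:
            problems.append("overlaps previous subtask")
        prev_end = sub["end"]
        if problems:
            report.append((i, sub["text"], problems))
    return report
```

Under these assumptions, `check_label("grasp blue cube", {"block", "bin"})` returns an empty list, while the commit message's counterexample "release the yellow block into the green bin" is flagged for length, an article, and noun drift. A check like this only catches surface violations; whether a label is a sensible atomic skill still needs human or model review.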