revert(annotate): move memory + speech prompts to base PR (#3471)

The first-person memory narrative, task-rephrasing and initial-speech
prompt tweaks belong in the annotation pipeline itself. They have been
applied to feat/language-annotation-pipeline (#3471) and are reverted
here to the merge-base so they drop out of this PR's diff.
general_vqa.py keeps its docstring fix since it references a recipe
this PR introduces.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pepijn
2026-05-19 14:17:52 +02:00
parent 182f10184f
commit 7b64e5498d
3 changed files with 12 additions and 24 deletions
@@ -1,35 +1,25 @@
You are updating the robot's compressed semantic memory at the boundary of
a completed subtask.
Reference (MEM, Torne 2026):
Reference (verbatim from MEM, Torne 2026):
"Remove or compress information in the language memory whenever
appropriate. Keep ONLY the minimal set of relevant information for future
task execution. Specific object attributes (colors, precise quantities of
each item) get discarded when their details won't affect subsequent
actions. Functional outcomes (where items went, how many) are preserved."
Concrete example from MEM:
Before: "I put a light green bowl, a dark blue bowl and a bright yellow
bowl into the top right cabinet"
After: "I placed three bowls in the top right cabinet"
Episode task: "{episode_task}"
Previous memory: {prior_memory}
Just-completed subtask: "{completed_subtask}"
Remaining subtasks (for relevance judgement only): {remaining_subtasks}
Write the **shortest possible** state note that future subtasks could
need. Telegraphic style.
**Hard caps**
- ≤ 10 words total.
- No articles. No verbs in past tense ("placed", "moved"). Use
comma-separated noun→location fragments.
- Drop colors/sizes/counts unless a later subtask depends on them.
- If nothing material changed for downstream subtasks, emit "" (empty
string).
Examples
- Good: "bowl in box, lid open"
- Good: "3 bowls in cabinet"
- Good: "cup on tray, drawer closed"
- Bad: "The bowl is now in the box and the lid is still open."
- Bad: "I placed the green bowl carefully into the cardboard box."
Update the memory. Drop irrelevant detail. Compress completed steps.
Keep WHAT happened, drop HOW. Shorter is better.
Output strictly valid JSON:
{{ "memory": "<≤10-word telegraphic state, or empty>" }}
{{ "memory": "<one or two short sentences>" }}
@@ -9,7 +9,7 @@ Original task:
Generate exactly {n} alternative phrasings of the same task. Vary:
- formality (casual / polite / curt)
- verbosity (mostly short imperative; occasional polite request)
- verbosity (short imperative vs longer polite request)
- word choice (synonyms, different verbs)
- sentence structure (imperative / question / suggestion)
@@ -17,7 +17,7 @@ Hard rules:
- Each phrasing MUST preserve the exact meaning of the original task.
Do not change which object is involved, the destination, or the
action. Do not add extra steps. Do not invent new objects.
- Each phrasing must be a short phrase or sentence, plain prose, no
- Each phrasing must be a single short sentence, plain prose, no
markdown, no quotes, no list numbers.
- Phrasings must be distinct — no near-duplicates.
- Output exactly {n} entries.
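The hard rules above lend themselves to a mechanical check on the model's output. The following is a minimal sketch, not part of this commit: `validate_phrasings` and `normalize` are hypothetical helpers, and the duplicate check only catches phrasings that become identical after lowercasing and stripping punctuation, which is weaker than the "no near-duplicates" rule in the prompt.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and strip punctuation so trivial variants compare equal."""
    return re.sub(r"[^a-z0-9 ]+", "", text.lower()).strip()


def validate_phrasings(phrasings: list[str], n: int) -> list[str]:
    """Return a list of rule violations; an empty list means the batch passes.

    Hypothetical helper: checks entry count, list markers, quote/markdown
    characters, and duplicates after normalization. It does not check that
    the meaning of the original task is preserved.
    """
    problems = []
    if len(phrasings) != n:
        problems.append(f"expected {n} entries, got {len(phrasings)}")
    seen = set()
    for p in phrasings:
        # Leading "1.", "2)", "-" or "*" followed by whitespace is a list marker.
        if re.match(r"\s*(\d+[.)]|[-*])\s", p):
            problems.append(f"list marker: {p!r}")
        # Double quotes, backticks, asterisks and hashes suggest quoting or markdown.
        if any(ch in p for ch in '"`*#'):
            problems.append(f"quote or markdown character: {p!r}")
        key = normalize(p)
        if key in seen:
            problems.append(f"duplicate: {p!r}")
        seen.add(key)
    return problems
```

A caller could retry generation whenever the returned list is non-empty, passing the violations back to the model as feedback.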
@@ -1,12 +1,10 @@
The user just asked the robot: "{episode_task}".
Generate a short verbal acknowledgement the robot would speak back before
beginning the task. Style: compact, confident, friendly.
beginning the task. Style: confident, friendly, single short sentence.
Examples (Hi Robot, Shi 2025): "Sure, I won't put cheese on it.",
"OK, starting with the sponge.", "Got it.".
Prefer very short replies: "Got it.", "On it.", "OK."
Output strictly valid JSON:
{{ "text": "<the spoken acknowledgement>" }}
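The doubled braces in the JSON lines of these prompts suggest the templates are rendered with Python's `str.format`, where `{{` and `}}` produce literal braces and `{episode_task}` is substituted. That is an inference from the template text, not something the diff confirms. Under that assumption, rendering a template and parsing the reply might look like the sketch below; `TEMPLATE` is an abbreviated copy of the prompt, and `render` and `parse_reply` are hypothetical names.

```python
import json

# Abbreviated copy of the initial-speech prompt, for illustration only.
TEMPLATE = (
    'The user just asked the robot: "{episode_task}".\n'
    "Output strictly valid JSON:\n"
    '{{ "text": "<the spoken acknowledgement>" }}\n'
)


def render(template: str, **fields: str) -> str:
    """Fill named placeholders; doubled braces become single literal braces."""
    return template.format(**fields)


def parse_reply(raw: str, key: str) -> str:
    """Parse the model's JSON reply and return the string stored under `key`.

    Raises json.JSONDecodeError on invalid JSON, KeyError if the key is
    missing, and ValueError if the value is not a string.
    """
    data = json.loads(raw)
    value = data[key]
    if not isinstance(value, str):
        raise ValueError(f"expected a string under {key!r}")
    return value.strip()


prompt = render(TEMPLATE, episode_task="wipe the table")
reply = parse_reply('{ "text": "Got it." }', "text")
```

The same `parse_reply` helper would apply to the memory prompt by passing `"memory"` as the key.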