From 134a707c7aee6d43a7548259b7360ea8d34644b9 Mon Sep 17 00:00:00 2001
From: Pepijn <pepijn@huggingface.co>
Date: Tue, 19 May 2026 14:17:30 +0200
Subject: [PATCH] feat(annotate): first-person memory narrative + shorter
 speech prompts

- module_1_memory: rewrite as an explicit first-person, past-tense
  narrative ("I picked up...", "I opened...") matching the MEM
  (Torne 2026) running-memory style, instead of "one or two short
  sentences" with no person/tense guidance. Replace the bowls example
  with the pot/drawer one and fold the update instructions into a list
  of authoring rules.
- module_1_task_rephrasings: bias rephrasings toward short imperatives
  and allow a short phrase in place of a full sentence.
- module_2_initial_speech: prefer very short robot acknowledgements.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 .../prompts/module_1_memory.txt               | 27 +++++++++++++------
 .../prompts/module_1_task_rephrasings.txt     |  4 +--
 .../prompts/module_2_initial_speech.txt       |  4 ++-
 3 files changed, 24 insertions(+), 11 deletions(-)
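
Reviewer note (not part of the commit): the new module_1_memory
authoring rules are mechanical enough to spot-check on generated
annotations. The following is a minimal sketch, assuming the model reply
is the raw JSON string the prompt asks for. `check_memory` is a
hypothetical helper, not something in this patch or in lerobot, and its
sentence splitting is deliberately naive.

```python
import json
import re


def check_memory(raw_reply: str, max_sentences: int = 2) -> list[str]:
    """Return rule violations for one module_1_memory reply (hypothetical helper)."""
    try:
        memory = json.loads(raw_reply)["memory"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return ["reply is not a JSON object with a 'memory' key"]
    if not isinstance(memory, str) or not memory.strip():
        return ["'memory' is not a non-empty string"]

    problems = []
    # Naive split: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", memory.strip()) if s]
    if len(sentences) > max_sentences:
        problems.append(f"{len(sentences)} sentences, expected at most {max_sentences}")
    for sentence in sentences:
        # Rule from the prompt: every sentence starts with "I".
        if not sentence.startswith("I "):
            problems.append(f"sentence does not start with 'I': {sentence!r}")
    return problems
```

An empty list means the reply passed these checks. Tense is not checked
here, since that would need more than string matching.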
diff --git a/src/lerobot/annotations/steerable_pipeline/prompts/module_1_memory.txt b/src/lerobot/annotations/steerable_pipeline/prompts/module_1_memory.txt
index 6a89ecefa..b5278368b 100644
--- a/src/lerobot/annotations/steerable_pipeline/prompts/module_1_memory.txt
+++ b/src/lerobot/annotations/steerable_pipeline/prompts/module_1_memory.txt
@@ -8,18 +8,29 @@ task execution. Specific object attributes (colors, precise quantities of
 each item) get discarded when their details won't affect subsequent
 actions. Functional outcomes (where items went, how many) are preserved."
 
-Concrete example from MEM:
-  Before: "I put a light green bowl, a dark blue bowl and a bright yellow
-           bowl into the top right cabinet"
-  After:  "I placed three bowls in the top right cabinet"
-
 Episode task: "{episode_task}"
 Previous memory: {prior_memory}
 Just-completed subtask: "{completed_subtask}"
 Remaining subtasks (for relevance judgement only): {remaining_subtasks}
 
-Update the memory. Drop irrelevant detail. Compress completed steps.
-Keep WHAT happened, drop HOW. Shorter is better.
+Write the memory as a short FIRST-PERSON, PAST-TENSE narrative of what the
+robot has accomplished so far — the running story it would tell itself.
+
+Authoring rules:
+- First person, past tense. Every sentence starts with "I": "I picked
+  up...", "I opened...", "I moved to...".
+- One or two short sentences. Extend the previous memory with the
+  just-completed subtask; do not rewrite it from scratch.
+- Keep WHAT happened (functional outcomes — where items went, how many),
+  drop HOW (grasp details, motions).
+- Compress completed steps and drop object attributes (colors,
+  per-item counts) once they no longer affect the remaining subtasks.
+
+Example (MEM, Torne 2026):
+  Before: "I prepared the pot and got the potatoes, milk, and butter. I
+           moved to the drawer."
+  After:  "I prepared the pot and got the ingredients. I opened the
+           drawer with the masher."
 
 Output strictly valid JSON:
-  {{ "memory": "<one or two short sentences>" }}
+  {{ "memory": "<one or two short first-person past-tense sentences>" }}
diff --git a/src/lerobot/annotations/steerable_pipeline/prompts/module_1_task_rephrasings.txt b/src/lerobot/annotations/steerable_pipeline/prompts/module_1_task_rephrasings.txt
index d03a6bf8b..602892bd3 100644
--- a/src/lerobot/annotations/steerable_pipeline/prompts/module_1_task_rephrasings.txt
+++ b/src/lerobot/annotations/steerable_pipeline/prompts/module_1_task_rephrasings.txt
@@ -9,7 +9,7 @@ Original task:
 Generate exactly {n} alternative phrasings of the same task. Vary:
 
 - formality (casual / polite / curt)
-- verbosity (short imperative vs longer polite request)
+- verbosity (mostly short imperative; occasional polite request)
 - word choice (synonyms, different verbs)
 - sentence structure (imperative / question / suggestion)
 
@@ -17,7 +17,7 @@ Hard rules:
 - Each phrasing MUST preserve the exact meaning of the original task.
   Do not change which object is involved, the destination, or the
   action. Do not add extra steps. Do not invent new objects.
-- Each phrasing must be a single short sentence, plain prose, no
+- Each phrasing must be a short phrase or sentence, plain prose, no
   markdown, no quotes, no list numbers.
 - Phrasings must be distinct — no near-duplicates.
 - Output exactly {n} entries.
diff --git a/src/lerobot/annotations/steerable_pipeline/prompts/module_2_initial_speech.txt b/src/lerobot/annotations/steerable_pipeline/prompts/module_2_initial_speech.txt
index 6058b1f5c..625ce920c 100644
--- a/src/lerobot/annotations/steerable_pipeline/prompts/module_2_initial_speech.txt
+++ b/src/lerobot/annotations/steerable_pipeline/prompts/module_2_initial_speech.txt
@@ -1,10 +1,12 @@
 The user just asked the robot: "{episode_task}".
 
 Generate a short verbal acknowledgement the robot would speak back before
-beginning the task. Style: confident, friendly, single short sentence.
+beginning the task. Style: compact, confident, friendly.
 
 Examples (Hi Robot, Shi 2025): "Sure, I won't put cheese on it.",
 "OK, starting with the sponge.", "Got it.".
 
+Prefer very short replies: "Got it.", "On it.", "OK."
+
 Output strictly valid JSON:
   {{ "text": "<the spoken acknowledgement>" }}
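
Reviewer note (not part of the patch): the doubled braces in the output
spec suggest these prompt files are filled in with Python's `str.format`,
where `{{` and `}}` render as literal braces and `{episode_task}` is a
placeholder. That is an assumption, since the loader is not touched by
this patch. A sketch with an excerpt of the template inlined and a
hypothetical `render_prompt` helper:

```python
# Excerpt of module_2_initial_speech.txt as it reads after this patch.
TEMPLATE = (
    'The user just asked the robot: "{episode_task}".\n'
    "\n"
    "Output strictly valid JSON:\n"
    '  {{ "text": "<the spoken acknowledgement>" }}\n'
)


def render_prompt(template: str, **fields: str) -> str:
    """Fill placeholders; str.format turns '{{' and '}}' into literal braces."""
    return template.format(**fields)


prompt = render_prompt(TEMPLATE, episode_task="wipe the table")
```

Under that assumption, any new literal brace added to a prompt file has
to be doubled, or formatting fails or misreads it as a placeholder.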