mirror of
https://github.com/huggingface/lerobot.git
synced 2026-05-17 09:39:47 +00:00
review: emitted_at uses 0.1s tolerance; MessageTurn requires stream at construction
* **Float tolerance in `emitted_at` for persistent styles.** The ``_timestamp(row) == t`` exact-equality check silently missed any caller that derived ``t`` arithmetically (e.g. ``frame_idx / fps``) even though the parquet timestamp would only differ by ULPs. Added ``EMITTED_AT_TOLERANCE_S = 0.1`` and check ``abs(...) <= tolerance`` instead, with a docstring explaining why exact equality wasn't enough and why 0.1 s is safe at typical 30–100 Hz control rates. Test asserts the new behavior at half-window (matches) and double-window (no match) using the constant so it stays in sync. * **`MessageTurn.stream` is required at construction.** It was typed ``MessageStream | None = None`` so YAML could omit ``stream:`` and pass the dataclass invariant — but ``_validate_rendered`` rejected ``None`` streams later, surfacing the error at the first sample instead of at recipe load. Now ``__post_init__`` raises ``ValueError`` if ``stream`` is ``None``, with the list of valid streams in the message. The redundant late-stage check in ``_validate_rendered`` is replaced with a one-line comment that cites the upstream invariant. Test pins the new construction-time rejection. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -6,6 +6,7 @@ pytest.importorskip("datasets", reason="datasets is required (install lerobot[da
|
||||
|
||||
from lerobot.configs.recipe import MessageTurn, TrainingRecipe # noqa: E402
|
||||
from lerobot.datasets.language_render import ( # noqa: E402
|
||||
EMITTED_AT_TOLERANCE_S,
|
||||
active_at,
|
||||
emitted_at,
|
||||
nth_next,
|
||||
@@ -342,6 +343,21 @@ def test_resolve_task_explicit_override_beats_rephrasings():
|
||||
assert rendered["messages"][0]["content"] == "explicit override wins"
|
||||
|
||||
|
||||
def test_emitted_at_persistent_tolerates_small_timestamp_drift():
|
||||
"""Persistent ``emitted_at`` should match within EMITTED_AT_TOLERANCE_S
|
||||
so callers that derive ``t`` arithmetically (``frame_idx / fps``) still
|
||||
line up with the parquet-stored timestamp.
|
||||
"""
|
||||
rows = [persistent_row("assistant", "memo", "memory", 1.0)]
|
||||
# Half a tolerance window — bit-different float, comfortably inside
|
||||
inside = emitted_at(1.0 + EMITTED_AT_TOLERANCE_S / 2, persistent=rows, events=[], style="memory")
|
||||
assert inside is not None and inside["content"] == "memo"
|
||||
|
||||
# Just past the window — no match
|
||||
outside = emitted_at(1.0 + EMITTED_AT_TOLERANCE_S * 2, persistent=rows, events=[], style="memory")
|
||||
assert outside is None
|
||||
|
||||
|
||||
def test_render_sample_rejects_non_dict_language_rows():
|
||||
"""``_normalize_rows`` must surface malformed inputs as TypeError.
|
||||
|
||||
|
||||
Reference in New Issue
Block a user