* **`meta.tools` actually reads `info.json["tools"]`.** `DatasetInfo`
had no `tools` field, so `from_dict` silently dropped the key (it
warned about unknown fields then discarded them) and the property
always returned `DEFAULT_TOOLS`. Added `tools: list[dict] | None`
to the dataclass; `to_dict()` drops it when unset so existing
datasets keep a clean `info.json`. Fixed the accessor to read
`self.info.tools` (the previous `.get(...)` would have raised
AttributeError on the dataclass anyway). Added regression tests:
fallback when absent, round-trip from disk, and round-trip
through `DatasetInfo.from_dict` / `to_dict`.
* **`motion` is not view-dependent — fix the docs.** The mdx claimed
rows of style `motion` must carry `camera`, but `VIEW_DEPENDENT_STYLES
= {"vqa", "trace"}` and the validator agrees: motion primitives are
joint/Cartesian-frame, not pixel-space. Updated both call-out
paragraphs in `language_and_recipes.mdx`.
* **Conditional `collate_fn` swap.** Added `meta.has_language_columns`
and gate the `lerobot_collate_fn` swap in `lerobot_train.py` on it,
so non-language datasets keep PyTorch's `default_collate`. Also
added a pass-through test in `test_collate.py` that asserts on a
plain tensor batch the custom collate matches `default_collate`
key-for-key, plus a test for the `None`-sample drop path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`lerobot.processor` re-exported `RenderMessagesStep` at the package
level, so importing anything from `lerobot.processor` pulled in
`lerobot.datasets.language` → `lerobot.datasets/__init__.py` →
`require_package("datasets")`, which fails in the Tier 1 base install
that intentionally omits the `[dataset]` extra. The chain bricked
collection for unrelated suites (`tests/policies/pi0_pi05/...`,
`tests/envs/...`, etc.).
* Stop re-exporting `RenderMessagesStep` from `lerobot.processor`. The
only consumer (the test) already imports from the submodule.
Document the deliberate omission in the module docstring.
* Add `pytest.importorskip("datasets", ...)` (and `pandas` where
needed) at the top of the four PR-added tests that exercise the
language stack:
- tests/datasets/test_language.py
- tests/datasets/test_language_render.py
- tests/processor/test_render_messages_processor.py
- tests/utils/test_collate.py
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>