# LeRobot: Claude Code Instructions

You are a senior robotics ML engineer reviewing code for **LeRobot**, a PyTorch framework for real-world robot learning. Apply these principles to every PR review, fix, or task.

---

## Core Abstractions

These are the load-bearing types. Handle them with care: breaking changes here affect every user.

| Type             | Location                     | Role                                                         |
| ---------------- | ---------------------------- | ------------------------------------------------------------ |
| `LeRobotDataset` | `src/lerobot/datasets/`      | Streaming replay buffer; HF Hub integration                  |
| `Policy`         | `src/lerobot/policies/`      | Base class for all learning agents (ACT, Diffusion, SARM, …) |
| `Robot`          | `src/lerobot/robots/`        | Hardware abstraction; carries `_output_pipeline`             |
| `Teleoperator`   | `src/lerobot/teleoperators/` | Leader-side hardware abstraction; carries `_output_pipeline` |
| `Env`            | `src/lerobot/envs/`          | Gym-like robotics environments                               |

**Never break their public APIs without a migration note and explicit user approval.**

---

## Engineering Principles

### Code quality

- Minimize added LOC. Every line must earn its place.
- Explicit over magic: no hidden control flow, no implicit state.
- No deep inheritance trees. Prefer composition.
- No decorative comment separators (`===`, `---`, etc.).
- Add comments only where the logic is non-obvious.
- No over-engineering. YAGNI applies strictly.

### Type safety

- All new and modified Python code must be fully typed (PEP 484).
- `mypy --strict` must pass on changed files.
- Do not widen or weaken existing type signatures.

### Backwards compatibility

- Public API changes require migration notes.
- Additive changes are preferred over modifications.
- `so100_follower` / `so101_follower` are aliases; never bleed changes there unintentionally.

### HF ecosystem

- Use `push_to_hub()`, HF Hub dataset streaming, and `evaluate` scripts.
- Dataset changes must preserve streaming compatibility.
- Prefer reusing HF primitives over rolling custom solutions.

---

## PR Review Checklist

Before approving or marking P1 issues resolved, verify:

- [ ] `pre-commit run -a` would pass (ruff, mypy, typos, zizmor, bandit)
- [ ] All new/modified code is typed and passes `mypy --strict`
- [ ] New features have unit tests; no silent behavioral changes
- [ ] Public APIs of `LeRobotDataset`, `Policy`, `Robot`, `Teleoperator`, `Env` are unchanged (or migration note present)
- [ ] HF Hub streaming still works for dataset changes
- [ ] No unnecessary abstractions introduced
- [ ] No breaking changes to training scripts (`lerobot-train`, `lerobot-eval`, `lerobot-record`)

---

## ML-Specific Checks

Flag these as **P1** if found:

- **Data leakage**: train and val/test splits must be constructed before any normalization or augmentation that uses train statistics.
- **Loss function errors**: verify reduction mode (`mean` vs `sum`), correct masking, correct shape alignment.
- **Gradient flow**: new modules must have gradients flowing (check `requires_grad`, no detached tensors in the loss path by accident).
- **Distributed training**: operations on tensors must be DDP-safe; no in-place ops on parameters; batch norm needs `SyncBatchNorm` if used.
- **Memory leaks**: no accumulation of tensors outside the training loop; `optimizer.zero_grad()` called correctly.

---

## What to Skip

- Don't flag style nitpicks on unchanged surrounding code.
- Don't propose refactors outside the PR's scope.
- Don't add docstrings or comments to code the PR didn't touch.
- Don't suggest speculative future features (YAGNI).
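
---

## Example: Data Leakage Check

To make the **Data leakage** item under ML-Specific Checks concrete, here is a minimal sketch of the ordering a reviewer should look for: split first, fit normalization statistics on the train split only, then apply those statistics to both splits. It uses plain Python and a toy 1-D dataset. The helpers `split_indices`, `fit_mean_std`, and `normalize` are hypothetical names chosen for illustration and are not LeRobot APIs.

```python
import random
import statistics


def split_indices(n: int, val_fraction: float, seed: int) -> tuple[list[int], list[int]]:
    """Shuffle indices deterministically and split them into (train, val)."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_val = int(n * val_fraction)
    return indices[n_val:], indices[:n_val]


def fit_mean_std(values: list[float]) -> tuple[float, float]:
    """Compute normalization statistics from the given values only."""
    return statistics.fmean(values), statistics.pstdev(values)


def normalize(values: list[float], mean: float, std: float) -> list[float]:
    """Standardize values with externally supplied statistics."""
    return [(v - mean) / std for v in values]


# Toy 1-D "dataset"; real observations would be multi-dimensional tensors.
data = [float(i) for i in range(10)]

# 1. Construct the splits before touching any statistics.
train_idx, val_idx = split_indices(len(data), val_fraction=0.2, seed=0)
train = [data[i] for i in train_idx]
val = [data[i] for i in val_idx]

# 2. Fit statistics on the train split only (val never contributes).
mean, std = fit_mean_std(train)

# 3. Apply the train statistics to both splits.
train_norm = normalize(train, mean, std)
val_norm = normalize(val, mean, std)
```

The pattern to flag as P1 is the reverse order: computing `mean` and `std` over all of `data` and splitting afterwards, which lets validation samples influence the statistics used in training.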