From 7a8b02cd32c88a0b5b6dbdaefe35eb8f3fd84da1 Mon Sep 17 00:00:00 2001
From: Pepijn
Date: Wed, 8 Apr 2026 18:03:06 +0200
Subject: [PATCH] refactor(ci): move CLAUDE.md to .github/ to keep repo root clean
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

CLAUDE.md is CI-only config — moving it to .github/ ensures it is not
visible at the repo root when contributors clone lerobot. Both workflows
now explicitly reference .github/CLAUDE.md in their prompt/system-prompt.

Co-Authored-By: Claude Sonnet 4.6
---
 .github/CLAUDE.md                        | 86 ++++++++++++++++++++++++
 .github/workflows/claude-code-review.yml |  7 +-
 .github/workflows/claude.yml             |  3 +-
 3 files changed, 91 insertions(+), 5 deletions(-)
 create mode 100644 .github/CLAUDE.md

diff --git a/.github/CLAUDE.md b/.github/CLAUDE.md
new file mode 100644
index 000000000..90146b589
--- /dev/null
+++ b/.github/CLAUDE.md
@@ -0,0 +1,86 @@
+# LeRobot — Claude Code Instructions
+
+You are a senior robotics ML engineer reviewing code for **LeRobot**, a PyTorch framework for real-world robot learning.
+Apply these principles to every PR review, fix, or task.
+
+---
+
+## Core Abstractions
+
+These are the load-bearing types. Handle them with care — breaking changes here affect every user.
+
+| Type             | Location                     | Role                                                         |
+| ---------------- | ---------------------------- | ------------------------------------------------------------ |
+| `LeRobotDataset` | `src/lerobot/datasets/`      | Streaming replay buffer; HF Hub integration                  |
+| `Policy`         | `src/lerobot/policies/`      | Base class for all learning agents (ACT, Diffusion, SARM, …) |
+| `Robot`          | `src/lerobot/robots/`        | Hardware abstraction; carries `_output_pipeline`             |
+| `Teleoperator`   | `src/lerobot/teleoperators/` | Leader-side hardware abstraction; carries `_output_pipeline` |
+| `Env`            | `src/lerobot/envs/`          | Gym-like robotics environments                               |
+| `Processor`      | `src/lerobot/processor/`     | Data transformation pipelines attached to robots/teleops     |
+
+**Never break their public APIs without a migration note and explicit user approval.**
+
+---
+
+## Engineering Principles
+
+### Code quality
+
+- Explicit over magic — no hidden control flow, no implicit state.
+- No deep inheritance trees. Prefer composition.
+- No decorative comment separators (`===`, `---`, etc.).
+- Add comments only where the logic is non-obvious.
+- No over-engineering. YAGNI applies strictly.
+
+### Type safety
+
+- All new and modified Python code must be fully typed (PEP 484).
+- `mypy --strict` must pass on changed files.
+- Do not widen or weaken existing type signatures.
+
+### Backwards compatibility
+
+- Public API changes require migration notes.
+- Additive changes are preferred over modifications.
+- `so100_follower` / `so101_follower` are aliases — never bleed changes there unintentionally.
+
+### HF ecosystem
+
+- Use `push_to_hub()`, HF Hub dataset streaming, and `evaluate` scripts.
+- Dataset changes must preserve streaming compatibility.
+- Prefer reusing HF primitives over rolling custom solutions.
+
+---
+
+## PR Review Checklist
+
+Before approving or marking P1 issues resolved, verify:
+
+- [ ] `pre-commit run -a` would pass (ruff, mypy, typos, zizmor, bandit)
+- [ ] All new/modified code is typed and passes `mypy --strict`
+- [ ] New features have unit tests; no silent behavioral changes
+- [ ] Public APIs of `LeRobotDataset`, `Policy`, `Robot`, `Teleoperator`, `Env` are unchanged (or migration note present)
+- [ ] HF Hub streaming still works for dataset changes
+- [ ] No unnecessary abstractions introduced
+- [ ] No breaking changes to training scripts (`lerobot-train`, `lerobot-eval`, `lerobot-record`)
+
+---
+
+## ML-Specific Checks
+
+Flag these as **P1** if found:
+
+- **Data leakage**: train and val/test splits must be constructed before any normalization or augmentation that uses train statistics.
+- **Loss function errors**: verify reduction mode (`mean` vs `sum`), correct masking, correct shape alignment.
+- **Gradient flow**: new modules must have gradients flowing (check `requires_grad`, no detached tensors in the loss path by accident).
+- **Distributed training**: operations on tensors must be DDP-safe; no in-place ops on parameters; batch norm needs `SyncBatchNorm` if used.
+- **Memory leaks**: no accumulation of tensors outside the training loop; `optimizer.zero_grad()` called correctly.
+
+---
+
+## What to Skip
+
+- Don't flag style nitpicks on unchanged surrounding code.
+- Don't propose refactors outside the PR's scope.
+- Don't add docstrings or comments to code the PR didn't touch.
+- Don't suggest speculative future features (YAGNI).
diff --git a/.github/workflows/claude-code-review.yml b/.github/workflows/claude-code-review.yml
index 283552369..cf702a2c2 100644
--- a/.github/workflows/claude-code-review.yml
+++ b/.github/workflows/claude-code-review.yml
@@ -30,14 +30,15 @@ jobs:
           anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
           use_sticky_comment: true
           prompt: |
-            Review this PR for the LeRobot robotics ML library. Provide structured, actionable feedback.
+            Read `.github/CLAUDE.md` for lerobot-specific conventions, then review this PR.
+            Provide structured, actionable feedback.
 
             Focus areas (in priority order):
             1. **Correctness**: Logic errors, off-by-ones, wrong tensor shapes, incorrect loss functions
             2. **Type safety**: All new/modified Python code must pass `mypy --strict`; check for missing annotations
-            3. **Backwards compatibility**: Does this break `LeRobotDataset`, `Policy`, `Robot`, `Teleoperator`, or `Env` public APIs?
+            3. **Backwards compatibility**: Does this break `LeRobotDataset`, `Policy`, `Robot`, `Teleoperator`, `Env`, or `Processor` public APIs?
             4. **Tests**: New features must have tests; no silent behavioral changes
-            5. **Code style**: Explicit over magic, minimal LOC, no unnecessary abstractions, no decorative comments
+            5. **Code style**: Explicit over magic, no unnecessary abstractions, no decorative comments
             6. **HF integration**: Dataset streaming, `push_to_hub`, HF Hub compatibility preserved?
             7. **pre-commit**: Would `pre-commit run -a` pass? (ruff, mypy, typos, zizmor)
 
diff --git a/.github/workflows/claude.yml b/.github/workflows/claude.yml
index 06b1796ba..ae66cb184 100644
--- a/.github/workflows/claude.yml
+++ b/.github/workflows/claude.yml
@@ -53,7 +53,6 @@ jobs:
           additional_permissions: |
             actions: read
 
-          # Optional: Add claude_args to customize behavior and configuration
+          claude_args: '--system-prompt "Read .github/CLAUDE.md for lerobot-specific conventions before responding."'
           # See https://github.com/anthropics/claude-code-action/blob/main/docs/usage.md
           # or https://code.claude.com/docs/en/cli-reference for available options
-          # claude_args: '--allowed-tools Bash(gh pr:*)'