## Research Paper

Paper: https://research.nvidia.com/labs/gear/gr00t-n1_5/

## Repository

Code: https://github.com/NVIDIA/Isaac-GR00T

## Citation

```bibtex
@inproceedings{gr00tn1_2025,
  archivePrefix = {arXiv},
  eprint = {2503.14734},
  title = {{GR00T} {N1}: An Open Foundation Model for Generalist Humanoid Robots},
  author = {NVIDIA and Johan Bjorck and Fernando Castañeda and Nikita Cherniadev and Xingye Da and Runyu Ding and Linxi "Jim" Fan and Yu Fang and Dieter Fox and Fengyuan Hu and Spencer Huang and Joel Jang and Zhenyu Jiang and Jan Kautz and Kaushil Kundalia and Lawrence Lao and Zhiqi Li and Zongyu Lin and Kevin Lin and Guilin Liu and Edith Llontop and Loic Magne and Ajay Mandlekar and Avnish Narayan and Soroush Nasiriany and Scott Reed and You Liang Tan and Guanzhi Wang and Zu Wang and Jing Wang and Qi Wang and Jiannan Xiang and Yuqi Xie and Yinzhen Xu and Zhenjia Xu and Seonghyeon Ye and Zhiding Yu and Ao Zhang and Hao Zhang and Yizhou Zhao and Ruijie Zheng and Yuke Zhu},
  month = {March},
  year = {2025},
  booktitle = {ArXiv Preprint},
}
```

## Additional Resources

Blog: https://developer.nvidia.com/isaac/gr00t

Hugging Face Models:

- GR00T N1.7: https://huggingface.co/nvidia/GR00T-N1.7-3B
- GR00T N1.7 LIBERO checkpoints: https://huggingface.co/nvidia/GR00T-N1.7-LIBERO

## Original-vs-LeRobot parity test

`tests/policies/groot/test_groot_vs_original.py` verifies that this LeRobot
reimplementation of GR00T N1.7 (Qwen3-VL backbone + flow-matching action head)
produces the **same raw model output** (`get_action(...)["action_pred"]`, the
normalized flow-matching prediction) as NVIDIA's original `gr00t` package, given
byte-identical pre-processed inputs and the same flow-matching seed. It is
parametrized over every embodiment tag present in the checkpoint.

### Why two environments

The original `gr00t` package pins `transformers==4.57.3` (Python 3.10); this
integration requires `transformers` 5.x or newer (needed for Qwen3-VL). Under 5.x,
`PretrainedConfig` is itself a dataclass whose fields all have defaults, so the
original config dataclasses, which declare fields without defaults, fail at import
time (`non-default argument follows default argument`). The two implementations
therefore **cannot be imported in the same Python process**.

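The import failure described above is ordinary Python dataclass behavior and can be reproduced with the standard library alone. The sketch below is illustrative only: `BaseConfig` and `ChildConfig` are hypothetical stand-ins, not the real `transformers` or `gr00t` classes.

```python
from dataclasses import dataclass


@dataclass
class BaseConfig:
    # Hypothetical stand-in for a base class whose fields all carry defaults.
    model_type: str = ""


def define_child_with_required_field():
    # A field without a default, declared after inherited defaulted fields,
    # is rejected when the @dataclass decorator runs, i.e. at import time.
    @dataclass
    class ChildConfig(BaseConfig):
        hidden_size: int  # no default

    return ChildConfig


error_message = None
try:
    define_child_with_required_field()
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The printed message contains `non-default argument`, the same error named in the paragraph above. Because the exception is raised while the class body is being defined, no runtime flag can work around it, which is why separate environments are needed.
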
So the test uses a **producer / consumer** split across two venvs:

1. **Producer**: `tests/policies/groot/utils/dump_original_n1_7.py`, run in the *original*
   gr00t venv. For each embodiment it builds dummy inputs generically from the
   checkpoint metadata (state dims from `statistics.json`; camera/language keys from
   the processor modality configs), runs the original model, and saves the exact
   collated inputs + raw `action_pred` to one `.npz` per tag.
2. **Consumer**: the pytest above, run in the *LeRobot* venv. It discovers every
   `.npz`, replays the byte-identical inputs through the LeRobot model with the same
   seed, and asserts the outputs match.

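The consumer's discovery step can be sketched roughly as follows. This is a minimal illustration, not the test's actual code: the helper name `discover_artifacts` and the assumption that each file stem equals the embodiment tag are hypothetical.

```python
import os
import tempfile
from pathlib import Path

DEFAULT_ARTIFACT_DIR = "tests/policies/groot/artifacts"


def discover_artifacts(artifact_dir=None):
    """Return sorted (tag, path) pairs, one per .npz file in the directory."""
    root = Path(
        artifact_dir or os.environ.get("GROOT_N1_7_PARITY_DIR", DEFAULT_ARTIFACT_DIR)
    )
    if not root.is_dir():
        return []
    # Assumption: the file stem doubles as the embodiment tag.
    return [(path.stem, path) for path in sorted(root.glob("*.npz"))]


# Usage with empty placeholder files standing in for real artifacts.
with tempfile.TemporaryDirectory() as tmp:
    for tag in ("tag_b", "tag_a"):
        (Path(tmp) / f"{tag}.npz").touch()
    (Path(tmp) / "notes.txt").touch()  # ignored: not an .npz file
    found_tags = [tag for tag, _ in discover_artifacts(tmp)]

print(found_tags)
```

A list like this could feed `pytest.mark.parametrize`, giving one test case per tag. Sorting keeps the case order stable between runs.
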
### Fairness controls

- **Same pre-processed inputs**: the original processor's `input_ids`,
  `pixel_values`, `image_grid_thw`, `attention_mask`, `state`, `embodiment_id` are
  fed verbatim to the LeRobot model (no re-tokenization / re-normalization).
- **Same precision + attention kernel**: both sides run **fp32 + SDPA**. The
  original defaults to `use_flash_attention=True` (flash_attention_2 + bf16); the
  producer forces SDPA + fp32. (With the defaults the gap is ~3e-2, which is pure
  kernel/rounding noise, not an implementation difference.)
- **Same flow-matching seed**: fixed (42) right before sampling on both sides.

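The reason for fixing the seed *right before sampling* can be shown with a small stand-in. The sketch uses Python's `random` module rather than the framework RNG that the real models use, and `sample_noise` is a hypothetical helper. It shows that reseeding immediately before the draw makes the noise independent of any earlier random-number consumption.

```python
import random


def sample_noise(n, seed, warmup_draws):
    """Draw n Gaussian samples, reseeding immediately before sampling."""
    rng = random.Random()
    # Simulate unrelated RNG use earlier in the pipeline (e.g. init code).
    for _ in range(warmup_draws):
        rng.random()
    rng.seed(seed)  # fixed right before sampling
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


noise_a = sample_noise(4, seed=42, warmup_draws=0)
noise_b = sample_noise(4, seed=42, warmup_draws=17)
noise_c = sample_noise(4, seed=43, warmup_draws=0)

print(noise_a == noise_b)  # same seed, different history
print(noise_a == noise_c)  # different seed
```

Seeding once at process start would not give this guarantee. The two implementations may consume a different number of random values before sampling begins.
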
### How to run

```bash
# Resolve a local checkpoint (GR00T-N1.7-LIBERO / libero_10)
CKPT=$(python - <<'PY'
import os
from huggingface_hub import snapshot_download
print(os.path.join(snapshot_download("nvidia/GR00T-N1.7-LIBERO",
    allow_patterns=["libero_10/*"]), "libero_10"))
PY
)

# 1) Produce the original-side artifacts for all embodiments (original gr00t venv, CUDA)
CUDA_VISIBLE_DEVICES=0 /path/to/Isaac-GR00T/.venv-original/bin/python \
  tests/policies/groot/utils/dump_original_n1_7.py \
  --ckpt "$CKPT" --out-dir tests/policies/groot/artifacts --device cuda --seed 42

# 2) Run the parity test (LeRobot venv): one parametrized case per embodiment
CUDA_VISIBLE_DEVICES=0 GROOT_PARITY_DEVICE=cuda \
  uv run pytest tests/policies/groot/test_groot_vs_original.py -v -s
```

The `.npz` artifacts are local-only (gitignored, ~6–9 MB each) and are regenerated by
the producer; they are never committed. The test **skips** (does not fail) on CI or
when the checkpoint / artifacts are absent.

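The skip-instead-of-fail behavior could be decided by a small guard like the one below. Everything here is an assumption made for illustration: the helper name `skip_reason`, detecting CI through a truthy `CI` variable, and the wording of the messages. The real test may use different checks.

```python
import tempfile
from pathlib import Path


def skip_reason(env, ckpt_dir, artifact_dir):
    """Return a human-readable skip reason, or None if the test can run."""
    # Assumption: CI is signalled by a truthy `CI` environment variable.
    if env.get("CI", "").lower() in ("1", "true"):
        return "parity test is skipped on CI"
    if ckpt_dir is None or not Path(ckpt_dir).is_dir():
        return "checkpoint not found"
    artifacts = Path(artifact_dir)
    if not artifacts.is_dir() or not any(artifacts.glob("*.npz")):
        return "no .npz artifacts found; run the producer first"
    return None


# Usage: walk through the three outcomes with temporary directories.
with tempfile.TemporaryDirectory() as ckpt, tempfile.TemporaryDirectory() as art:
    on_ci = skip_reason({"CI": "true"}, ckpt, art)
    no_artifacts = skip_reason({}, ckpt, art)
    (Path(art) / "some_tag.npz").touch()
    ready = skip_reason({}, ckpt, art)

print(on_ci, no_artifacts, ready, sep="\n")
```

In a pytest module, a non-`None` reason would typically be passed to `pytest.skip(reason)`, so a missing artifact shows up as a skip with an actionable message rather than a failure.
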
#### Env knobs (all optional)

| Variable | Default | Purpose |
|---|---|---|
| `GROOT_N1_7_PARITY_DIR` | `tests/policies/groot/artifacts` | directory of per-tag `.npz` artifacts |
| `GROOT_N1_7_LIBERO_CKPT` | auto (HF cache) | override checkpoint dir |
| `GROOT_PARITY_DEVICE` | `cuda` if available | `cpu` or `cuda` |
| `GROOT_PARITY_ATOL` / `GROOT_PARITY_RTOL` | `1e-3` | comparison tolerance |
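
As a rough illustration of how the two tolerance knobs might be consumed, the sketch below reads them with the documented `1e-3` defaults and applies the elementwise criterion `|actual - expected| <= atol + rtol * |expected|`, which is the rule `numpy.allclose` uses. That the parity test applies exactly this criterion is an assumption, and `read_tolerances` and `all_close` are hypothetical helpers.

```python
import os


def read_tolerances(env=os.environ):
    """Read absolute/relative tolerances, falling back to the documented defaults."""
    atol = float(env.get("GROOT_PARITY_ATOL", "1e-3"))
    rtol = float(env.get("GROOT_PARITY_RTOL", "1e-3"))
    return atol, rtol


def all_close(actual, expected, atol, rtol):
    """Elementwise check: |a - e| <= atol + rtol * |e| for every pair."""
    return len(actual) == len(expected) and all(
        abs(a - e) <= atol + rtol * abs(e) for a, e in zip(actual, expected)
    )


atol, rtol = read_tolerances({})  # empty mapping, so the defaults apply
within = all_close([0.5004, -1.0], [0.5, -1.0], atol, rtol)
outside = all_close([0.53, -1.0], [0.5, -1.0], atol, rtol)

print(atol, rtol, within, outside)
```

With the defaults, a deviation of 4e-4 on a value of 0.5 passes (the bound is 1.5e-3), while a 3e-2 deviation, the size of the bf16 + flash-attention gap mentioned above, does not.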