## Research Paper

GR00T N1 technical report (covers the GR00T N1.x family, including N1.7): https://arxiv.org/abs/2503.14734

GR00T N1.7 model card: https://huggingface.co/nvidia/GR00T-N1.7-3B

GR00T N1.5 research page (earlier version): https://research.nvidia.com/labs/gear/gr00t-n1_5/

> GR00T N1.5 support was removed from LeRobot; the last release supporting it is `lerobot==0.5.1`.
> Current releases support GR00T N1.7 only.

## Repository

Code: https://github.com/NVIDIA/Isaac-GR00T

## Citation

```bibtex
@inproceedings{gr00tn1_2025,
  archivePrefix = {arXiv},
  eprint = {2503.14734},
  title = {{GR00T} {N1}: An Open Foundation Model for Generalist Humanoid Robots},
  author = {NVIDIA and Johan Bjorck and Fernando Castañeda and Nikita Cherniadev and Xingye Da and Runyu Ding and Linxi "Jim" Fan and Yu Fang and Dieter Fox and Fengyuan Hu and Spencer Huang and Joel Jang and Zhenyu Jiang and Jan Kautz and Kaushil Kundalia and Lawrence Lao and Zhiqi Li and Zongyu Lin and Kevin Lin and Guilin Liu and Edith Llontop and Loic Magne and Ajay Mandlekar and Avnish Narayan and Soroush Nasiriany and Scott Reed and You Liang Tan and Guanzhi Wang and Zu Wang and Jing Wang and Qi Wang and Jiannan Xiang and Yuqi Xie and Yinzhen Xu and Zhenjia Xu and Seonghyeon Ye and Zhiding Yu and Ao Zhang and Hao Zhang and Yizhou Zhao and Ruijie Zheng and Yuke Zhu},
  month = {March},
  year = {2025},
  booktitle = {ArXiv Preprint},
}
```

## Additional Resources

Blog: https://developer.nvidia.com/isaac/gr00t

Hugging Face Models:

- GR00T N1.7: https://huggingface.co/nvidia/GR00T-N1.7-3B
- GR00T N1.7 LIBERO checkpoints: https://huggingface.co/nvidia/GR00T-N1.7-LIBERO

## Original-vs-LeRobot parity test

`tests/policies/groot/test_groot_vs_original.py` verifies this LeRobot reimplementation of GR00T N1.7 (Qwen3-VL backbone + flow-matching action head) against NVIDIA's original `gr00t` package with two comparisons, each parametrized over every embodiment tag present in the checkpoint:

1. **Model parity**: given byte-identical pre-processed inputs and the same flow-matching seed (recorded in each artifact), both implementations must produce the **same raw model output** (`get_action(...)["action_pred"]`, the normalized flow-matching prediction). Output shapes must match exactly; any action-horizon or action-dim mismatch fails the test.
2. **Preprocessor parity**: given identical raw observations (per-camera frames, state vectors, language instruction), LeRobot's own preprocessor pipeline (real Qwen3-VL chat template / tokenizer / image packing + checkpoint-driven state normalization, no mocks) must produce the **same collated model inputs** (`input_ids`, `attention_mask`, `pixel_values`, `image_grid_thw`, `state`, `embodiment_id`) as the original package's processor.

### Why two environments

The original `gr00t` package pins `transformers==4.57.3` (Python 3.10); this integration requires `transformers>=5.x` (Qwen3-VL). Under 5.x, `PretrainedConfig` is itself a defaulted dataclass, so the original config dataclasses fail to import (`non-default argument follows default argument`). The two implementations therefore **cannot be imported in the same Python process**.

The test works around this with a **producer / consumer** split across two venvs:

1. **Producer**: `tests/policies/groot/utils/dump_original_n1_7.py`, run in the _original_ gr00t venv. For each embodiment it builds dummy inputs generically from the checkpoint metadata (state dims from `statistics.json`; camera/language keys from the processor modality configs), runs the original model, and saves one `.npz` per tag containing the raw observations (`raw::` keys), the exact collated inputs (`in::` keys), the seed, and the raw `action_pred`.
2. **Consumer**: the pytest above, run in the _LeRobot_ venv.
   It discovers every `.npz`; the model-parity case replays the byte-identical collated inputs through the LeRobot model with the recorded seed and asserts the outputs match, and the preprocessor-parity case replays the raw observations through LeRobot's full preprocessor pipeline and asserts the collated tensors match.

> Artifacts generated by older versions of the dump script contain no `raw::`
> fields; the preprocessor-parity case then **skips** with a regeneration hint.
> Re-run the producer to refresh them.

### Fairness controls

- **Same pre-processed inputs (model parity)**: the original processor's `input_ids`, `pixel_values`, `image_grid_thw`, `attention_mask`, `state`, `embodiment_id` are fed verbatim to the LeRobot model (no re-tokenization / re-normalization), so the model comparison isolates the model. LeRobot's own tokenization / image packing is covered separately by the preprocessor-parity case, which compares its output against those same collated tensors from identical raw observations.
- **Same precision + attention kernel**: both sides run **fp32 + SDPA**. The original defaults to `use_flash_attention=True` (flash_attention_2 + bf16); the producer forces SDPA + fp32. (With the defaults the gap is ~3e-2, which is pure kernel/rounding noise rather than an implementation difference.)
- **Same flow-matching seed**: fixed right before sampling on both sides; the producer records it in each artifact (`--seed`, default 42) and the consumer replays the recorded value.
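The artifact layout described above (`raw::` keys for raw observations, `in::` keys for collated inputs, plus the seed and `action_pred`) can be illustrated with a small key-grouping helper. This is only a sketch, not the test's actual code: the function names `group_artifact_keys` and `can_run_preprocessor_parity` are hypothetical, and anything that lacks one of the two documented prefixes is simply collected under `other`.

```python
from collections.abc import Iterable

RAW_PREFIX = "raw::"  # raw observations (prefix as documented above)
IN_PREFIX = "in::"    # collated model inputs (prefix as documented above)


def group_artifact_keys(keys: Iterable[str]) -> dict[str, list[str]]:
    """Group artifact keys into raw observations, collated inputs, and the rest.

    Hypothetical helper: prefixes are stripped from the grouped names.
    """
    groups: dict[str, list[str]] = {"raw": [], "in": [], "other": []}
    for key in keys:
        if key.startswith(RAW_PREFIX):
            groups["raw"].append(key[len(RAW_PREFIX):])
        elif key.startswith(IN_PREFIX):
            groups["in"].append(key[len(IN_PREFIX):])
        else:
            groups["other"].append(key)
    return groups


def can_run_preprocessor_parity(keys: Iterable[str]) -> bool:
    """Artifacts without any raw:: field can only serve the model-parity case."""
    return bool(group_artifact_keys(keys)["raw"])
```

With NumPy installed, the key list of a real artifact could be obtained from `numpy.load(path).files` and passed to either function; a `False` result corresponds to the skip described in the note above.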
### How to run

```bash
# Resolve a local checkpoint (GR00T-N1.7-LIBERO / libero_10)
CKPT=$(python - <<'PY'
import os
from huggingface_hub import snapshot_download
print(os.path.join(snapshot_download("nvidia/GR00T-N1.7-LIBERO", allow_patterns=["libero_10/*"]), "libero_10"))
PY
)

# 1) Produce the original-side artifacts for all embodiments (original gr00t venv, CUDA)
CUDA_VISIBLE_DEVICES=0 /path/to/Isaac-GR00T/.venv-original/bin/python \
  tests/policies/groot/utils/dump_original_n1_7.py \
  --ckpt "$CKPT" --out-dir tests/policies/groot/artifacts --device cuda --seed 42

# 2) Run the parity test (LeRobot venv): one parametrized case per embodiment
CUDA_VISIBLE_DEVICES=0 GROOT_PARITY_DEVICE=cuda \
  uv run pytest tests/policies/groot/test_groot_vs_original.py -v -s
```

The `.npz` artifacts are local-only (gitignored, ~6–10 MB each) and are regenerated by the producer; they are never committed. The tests **skip** (do not fail) on CI or when the checkpoint / artifacts are absent.

#### Env knobs (all optional)

| Var                                       | Default                          | Purpose                               |
| ----------------------------------------- | -------------------------------- | ------------------------------------- |
| `GROOT_N1_7_PARITY_DIR`                   | `tests/policies/groot/artifacts` | directory of per-tag `.npz` artifacts |
| `GROOT_N1_7_LIBERO_CKPT`                  | auto (HF cache)                  | override checkpoint dir               |
| `GROOT_PARITY_DEVICE`                     | `cuda` if available              | `cpu` or `cuda`                       |
| `GROOT_PARITY_ATOL` / `GROOT_PARITY_RTOL` | `1e-3`                           | comparison tolerance                  |
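
As a rough illustration of how these knobs might be resolved, the sketch below reads the documented variables and falls back to the documented defaults. It is not the test's actual code: `ParitySettings` and `load_parity_settings` are hypothetical names, CUDA availability is passed in as a plain boolean instead of being probed, falling back to `cpu` when CUDA is unavailable is an assumption, and a missing `GROOT_N1_7_LIBERO_CKPT` is represented as `None` to stand for the automatic HF-cache lookup.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class ParitySettings:
    """Hypothetical container for the resolved env knobs."""

    artifact_dir: Path
    checkpoint_dir: Optional[Path]  # None stands for "resolve automatically"
    device: str
    atol: float
    rtol: float


def load_parity_settings(
    env: Optional[Mapping[str, str]] = None,
    cuda_available: bool = False,
) -> ParitySettings:
    """Read the documented variables, applying the documented defaults."""
    env = os.environ if env is None else env
    ckpt = env.get("GROOT_N1_7_LIBERO_CKPT")
    # Assumption: without an override, use cuda when available, else cpu.
    default_device = "cuda" if cuda_available else "cpu"
    return ParitySettings(
        artifact_dir=Path(
            env.get("GROOT_N1_7_PARITY_DIR", "tests/policies/groot/artifacts")
        ),
        checkpoint_dir=Path(ckpt) if ckpt else None,
        device=env.get("GROOT_PARITY_DEVICE", default_device),
        atol=float(env.get("GROOT_PARITY_ATOL", "1e-3")),
        rtol=float(env.get("GROOT_PARITY_RTOL", "1e-3")),
    )
```

Passing an explicit mapping, for example `load_parity_settings(env={"GROOT_PARITY_ATOL": "1e-4"})`, makes it easy to try overrides without touching the real process environment.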