

Research Paper

Paper: https://research.nvidia.com/labs/gear/gr00t-n1_5/

Repository

Code: https://github.com/NVIDIA/Isaac-GR00T

Citation

@inproceedings{gr00tn1_2025,
  archivePrefix = {arxiv},
  eprint     = {2503.14734},
  title      = {{GR00T} {N1}: An Open Foundation Model for Generalist Humanoid Robots},
  author     = {NVIDIA and Johan Bjorck and Fernando Castañeda and Nikita Cherniadev and Xingye Da and Runyu Ding and Linxi "Jim" Fan and Yu Fang and Dieter Fox and Fengyuan Hu and Spencer Huang and Joel Jang and Zhenyu Jiang and Jan Kautz and Kaushil Kundalia and Lawrence Lao and Zhiqi Li and Zongyu Lin and Kevin Lin and Guilin Liu and Edith Llontop and Loic Magne and Ajay Mandlekar and Avnish Narayan and Soroush Nasiriany and Scott Reed and You Liang Tan and Guanzhi Wang and Zu Wang and Jing Wang and Qi Wang and Jiannan Xiang and Yuqi Xie and Yinzhen Xu and Zhenjia Xu and Seonghyeon Ye and Zhiding Yu and Ao Zhang and Hao Zhang and Yizhou Zhao and Ruijie Zheng and Yuke Zhu},
  month      = {March},
  year       = {2025},
  booktitle  = {ArXiv Preprint},
}

Additional Resources

Blog: https://developer.nvidia.com/isaac/gr00t

Hugging Face Models:

Original-vs-LeRobot parity test

tests/policies/groot/test_groot_vs_original.py verifies that this LeRobot reimplementation of GR00T N1.7 (Qwen3-VL backbone + flow-matching action head) produces the same raw model output (get_action(...)["action_pred"], the normalized flow-matching prediction) as NVIDIA's original gr00t package, given byte-identical pre-processed inputs and the same flow-matching seed. It is parametrized over every embodiment tag present in the checkpoint.

Why two environments

The original gr00t package pins transformers==4.57.3 (Python 3.10); this integration requires transformers>=5.x (Qwen3-VL). Under 5.x, PretrainedConfig is itself a defaulted dataclass, so the original config dataclasses fail to import (non-default argument follows default argument). The two implementations therefore cannot be imported in the same Python process.
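
The dataclass rule behind that import error can be reproduced without transformers. The sketch below is only an illustration: `BaseConfig` and `OriginalStyleConfig` are hypothetical stand-ins for a base config whose fields all have defaults and a subclass that declares a required field. It is not the actual gr00t or transformers code.

```python
from dataclasses import dataclass


@dataclass
class BaseConfig:
    # Hypothetical stand-in for a base config whose fields all carry defaults.
    name: str = "base"


def define_subclass():
    # Subclass fields are ordered after the inherited ones, so a required
    # field here follows a defaulted one and the decorator raises TypeError.
    @dataclass
    class OriginalStyleConfig(BaseConfig):
        hidden_size: int  # no default

    return OriginalStyleConfig


try:
    define_subclass()
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The error is raised when the class is defined, which is why it surfaces at import time rather than when a config object is created.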

So the test uses a producer / consumer split across two venvs:

  1. Producer: tests/policies/groot/utils/dump_original_n1_7.py, run in the original gr00t venv. For each embodiment it builds dummy inputs generically from the checkpoint metadata (state dims from statistics.json; camera/language keys from the processor modality configs), runs the original model, and saves the exact collated inputs + raw action_pred to one .npz per tag.
  2. Consumer: the pytest above, run in the LeRobot venv. It discovers every .npz, replays the byte-identical inputs through the LeRobot model with the same seed, and asserts the outputs match.
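
The split above can be sketched in a single process with NumPy. Everything here is a toy under stated assumptions: `toy_model`, `produce`, `consume`, the tag names, and the array shapes are hypothetical, and one function plays the role of both implementations. Only the pattern mirrors the real scripts: one `.npz` per tag, replay of the saved inputs, and a shared seed.

```python
import tempfile
from pathlib import Path

import numpy as np

SEED = 42


def toy_model(state: np.ndarray, seed: int) -> np.ndarray:
    # Hypothetical stand-in for a forward pass that samples noise internally.
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(state.shape).astype(np.float32)
    return np.tanh(state) + 0.1 * noise


def produce(out_dir: Path, tag: str) -> Path:
    # Producer side: build dummy inputs, run the model, save inputs + output.
    state = np.random.default_rng(0).standard_normal((1, 8)).astype(np.float32)
    path = out_dir / f"{tag}.npz"
    np.savez(path, state=state, action_pred=toy_model(state, SEED))
    return path


def consume(out_dir: Path, atol: float = 1e-3, rtol: float = 1e-3) -> list:
    # Consumer side: discover every artifact, replay the saved inputs with
    # the same seed, and compare against the saved output.
    checked = []
    for path in sorted(out_dir.glob("*.npz")):
        with np.load(path) as data:
            replayed = toy_model(data["state"], SEED)
            np.testing.assert_allclose(replayed, data["action_pred"], atol=atol, rtol=rtol)
        checked.append(path.stem)
    return checked


with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp)
    for tag in ("tag_a", "tag_b"):
        produce(out_dir, tag)
    checked_tags = consume(out_dir)

print(checked_tags)
```

Because the artifact stores the inputs as arrays, the consumer never needs to import anything from the producer's environment.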

Fairness controls

  • Same pre-processed inputs — the original processor's input_ids, pixel_values, image_grid_thw, attention_mask, state, embodiment_id are fed verbatim to the LeRobot model (no re-tokenization / re-normalization).
  • Same precision + attention kernel — both sides run fp32 + SDPA. The original defaults to use_flash_attention=True (flash_attention_2 + bf16); the producer forces SDPA + fp32. (With the defaults the gap is ~3e-2 — pure kernel/rounding noise, not an implementation difference.)
  • Same flow-matching seed — fixed (42) right before sampling on both sides.
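
The "right before sampling" part of the seed control matters because any earlier use of the random number generator shifts the stream. The sketch below shows this with NumPy's global RNG as a stand-in. The real test presumably seeds the deep-learning framework's generator instead, and `sample_noise` and `run` are hypothetical names.

```python
import numpy as np


def sample_noise(shape):
    # Stand-in for drawing the initial flow-matching noise from a global RNG.
    return np.random.standard_normal(shape)


def run(extra_draws: int, reseed_before_sampling: bool) -> np.ndarray:
    np.random.seed(42)            # seed set early, e.g. at start-up
    for _ in range(extra_draws):  # unrelated RNG use before sampling
        np.random.standard_normal()
    if reseed_before_sampling:
        np.random.seed(42)        # fixed again right before sampling
    return sample_noise((2, 3))


# Seeding only at start-up: noise depends on how much RNG was consumed first.
early_only_match = np.array_equal(run(0, False), run(5, False))
# Seeding right before sampling: noise is identical regardless of earlier draws.
reseeded_match = np.array_equal(run(0, True), run(5, True))

print(early_only_match, reseeded_match)
```

Two implementations rarely consume random numbers in the same order during setup, so re-seeding immediately before the sampling step is what makes the initial noise comparable.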

How to run

# Resolve a local checkpoint (GR00T-N1.7-LIBERO / libero_10)
CKPT=$(python - <<'PY'
import os
from huggingface_hub import snapshot_download
print(os.path.join(snapshot_download("nvidia/GR00T-N1.7-LIBERO",
      allow_patterns=["libero_10/*"]), "libero_10"))
PY
)

# 1) Produce the original-side artifacts for all embodiments (original gr00t venv, CUDA)
CUDA_VISIBLE_DEVICES=0 /path/to/Isaac-GR00T/.venv-original/bin/python \
    tests/policies/groot/utils/dump_original_n1_7.py \
    --ckpt "$CKPT" --out-dir tests/policies/groot/artifacts --device cuda --seed 42

# 2) Run the parity test (LeRobot venv) — one parametrized case per embodiment
CUDA_VISIBLE_DEVICES=0 GROOT_PARITY_DEVICE=cuda \
    uv run pytest tests/policies/groot/test_groot_vs_original.py -v -s

The .npz artifacts are local-only (gitignored, ~69 MB each) and are regenerated by the producer; they are never committed. The test skips (does not fail) on CI or when the checkpoint / artifacts are absent.
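
A skip-instead-of-fail decision of this kind could look roughly like the sketch below. This is an assumption-laden illustration, not the test's actual logic: `skip_reason` is a hypothetical helper, the `on_ci` flag stands in for however the test detects CI, and the checkpoint check is omitted for brevity.

```python
import tempfile
from pathlib import Path


def skip_reason(artifact_dir: Path, on_ci: bool):
    # Return a human-readable reason to skip, or None when the test can run.
    if on_ci:
        return "parity test is local-only; skipping on CI"
    if not any(artifact_dir.glob("*.npz")):
        return f"no .npz artifacts in {artifact_dir}; run the producer first"
    return None


with tempfile.TemporaryDirectory() as tmp:
    artifact_dir = Path(tmp)
    reason_missing = skip_reason(artifact_dir, on_ci=False)
    (artifact_dir / "some_tag.npz").touch()  # pretend the producer has run
    reason_present = skip_reason(artifact_dir, on_ci=False)
    reason_ci = skip_reason(artifact_dir, on_ci=True)

print(reason_missing)
print(reason_present)
print(reason_ci)
```

In a pytest module, a non-`None` reason would typically be passed to `pytest.skip(...)`.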

Env knobs (all optional)

| Var | Default | Purpose |
| --- | --- | --- |
| GROOT_N1_7_PARITY_DIR | tests/policies/groot/artifacts | directory of per-tag .npz artifacts |
| GROOT_N1_7_LIBERO_CKPT | auto (HF cache) | override checkpoint dir |
| GROOT_PARITY_DEVICE | cuda if available | cpu or cuda |
| GROOT_PARITY_ATOL / GROOT_PARITY_RTOL | 1e-3 | comparison tolerance |
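
For orientation, the knobs listed above could be read along these lines. This is a minimal sketch, not the test's own code: `read_parity_settings` is a hypothetical helper, the `cuda_available` flag replaces a real CUDA check so the example needs no GPU libraries, and falling back to `cpu` when CUDA is unavailable is an assumption.

```python
import os


def read_parity_settings(env=None, cuda_available=False):
    # Hypothetical helper; variable names and defaults follow the listing above.
    env = os.environ if env is None else env
    return {
        "artifact_dir": env.get("GROOT_N1_7_PARITY_DIR", "tests/policies/groot/artifacts"),
        "ckpt_dir": env.get("GROOT_N1_7_LIBERO_CKPT"),  # None means: resolve from the HF cache
        # Assumption: fall back to "cpu" when CUDA is not available.
        "device": env.get("GROOT_PARITY_DEVICE", "cuda" if cuda_available else "cpu"),
        "atol": float(env.get("GROOT_PARITY_ATOL", "1e-3")),
        "rtol": float(env.get("GROOT_PARITY_RTOL", "1e-3")),
    }


# Example: tighten only the absolute tolerance, leave everything else default.
settings = read_parity_settings(env={"GROOT_PARITY_ATOL": "1e-4"})
print(settings["device"], settings["atol"], settings["rtol"])
```

Passing the environment as a plain dict keeps the helper easy to exercise without touching the real process environment.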