lerobot/docs/source/policy_groot_README.md at 13ed6570563fef835a92cb02ecd118338e0aebbb

mirror of https://github.com/huggingface/lerobot.git synced 2026-06-17 08:17:02 +00:00

Steven Palma edda8552ec docs(groot): document the N1.5 removal and the N1.7 parity test

- groot.mdx: breaking-change warning and migration path (pin lerobot==0.5.1 to
  keep N1.5, or move to N1.7); the dead `huggingface-cli download` is replaced
  with `hf download`.
- policy_groot_README.md: N1.5 removal note, updated paper / model-card links,
  and the two-comparison (model parity + preprocessor parity) description of
  the original-vs-LeRobot test, including the raw-observation artifacts and
  recorded seed.

2026-06-12 23:40:36 +02:00

7.0 KiB

Research Paper

GR00T N1 technical report (covers the GR00T N1.x family, including N1.7): https://arxiv.org/abs/2503.14734

GR00T N1.7 model card: https://huggingface.co/nvidia/GR00T-N1.7-3B

GR00T N1.5 research page (earlier version): https://research.nvidia.com/labs/gear/gr00t-n1_5/

GR00T N1.5 support was removed from LeRobot; the last release supporting it is lerobot==0.5.1. Current releases support GR00T N1.7 only.

Repository

Code: https://github.com/NVIDIA/Isaac-GR00T

Citation

@inproceedings{gr00tn1_2025,
  archivePrefix = {arxiv},
  eprint     = {2503.14734},
  title      = {{GR00T} {N1}: An Open Foundation Model for Generalist Humanoid Robots},
  author     = {NVIDIA and Johan Bjorck and Fernando Castañeda and Nikita Cherniadev and Xingye Da and Runyu Ding and Linxi "Jim" Fan and Yu Fang and Dieter Fox and Fengyuan Hu and Spencer Huang and Joel Jang and Zhenyu Jiang and Jan Kautz and Kaushil Kundalia and Lawrence Lao and Zhiqi Li and Zongyu Lin and Kevin Lin and Guilin Liu and Edith Llontop and Loic Magne and Ajay Mandlekar and Avnish Narayan and Soroush Nasiriany and Scott Reed and You Liang Tan and Guanzhi Wang and Zu Wang and Jing Wang and Qi Wang and Jiannan Xiang and Yuqi Xie and Yinzhen Xu and Zhenjia Xu and Seonghyeon Ye and Zhiding Yu and Ao Zhang and Hao Zhang and Yizhou Zhao and Ruijie Zheng and Yuke Zhu},
  month      = {March},
  year       = {2025},
  booktitle  = {ArXiv Preprint},
}

Additional Resources

Blog: https://developer.nvidia.com/isaac/gr00t

Hugging Face Models:

GR00T N1.7: https://huggingface.co/nvidia/GR00T-N1.7-3B
GR00T N1.7 LIBERO checkpoints: https://huggingface.co/nvidia/GR00T-N1.7-LIBERO

Original-vs-LeRobot parity test

tests/policies/groot/test_groot_vs_original.py verifies this LeRobot reimplementation of GR00T N1.7 (Qwen3-VL backbone + flow-matching action head) against NVIDIA's original gr00t package with two comparisons, each parametrized over every embodiment tag present in the checkpoint:

Model parity — given byte-identical pre-processed inputs and the same flow-matching seed (recorded in each artifact), both implementations must produce the same raw model output (get_action(...)["action_pred"], the normalized flow-matching prediction). Output shapes must match exactly; any action-horizon or action-dim mismatch fails the test.
Preprocessor parity — given the identical raw observations (per-camera frames, state vectors, language instruction), LeRobot's own preprocessor pipeline (real Qwen3-VL chat template / tokenizer / image packing + checkpoint-driven state normalization, no mocks) must produce the same collated model inputs (input_ids, attention_mask, pixel_values, image_grid_thw, state, embodiment_id) as the original package's processor.
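Both comparisons reduce to "same shape, then element-wise closeness". The sketch below illustrates that check in a dependency-free way. It is not the test's actual code: the helper names (`shape_of`, `flatten`, `check_parity`) are hypothetical, the closeness rule `|a - b| <= atol + rtol * |b|` is assumed to follow the usual NumPy-style `allclose` criterion, and the `1e-3` defaults are taken from the env-knob table in this README.

```python
def shape_of(x):
    """Shape of a rectangular nested list, e.g. [[1, 2], [3, 4]] -> (2, 2)."""
    shape = []
    while isinstance(x, (list, tuple)):
        shape.append(len(x))
        if not x:
            break
        x = x[0]
    return tuple(shape)


def flatten(x):
    """Yield every scalar of a nested list in row-major order."""
    if isinstance(x, (list, tuple)):
        for item in x:
            yield from flatten(item)
    else:
        yield float(x)


def check_parity(ours, theirs, atol=1e-3, rtol=1e-3):
    """Fail on any shape mismatch, then on any element outside tolerance.

    Returns the largest absolute difference seen, which is handy for logging.
    """
    if shape_of(ours) != shape_of(theirs):
        raise AssertionError(
            f"shape mismatch: {shape_of(ours)} vs {shape_of(theirs)}"
        )
    worst = 0.0
    for a, b in zip(flatten(ours), flatten(theirs)):
        err = abs(a - b)
        if err > atol + rtol * abs(b):
            raise AssertionError(f"value mismatch: {a} vs {b} (abs err {err:.3e})")
        worst = max(worst, err)
    return worst


# A (horizon=1, action_dim=2) prediction that differs by 5e-4 passes at 1e-3.
max_abs_err = check_parity([[1.0, 2.0]], [[1.0005, 2.0]])
```

Checking the shape first keeps an action-horizon or action-dim mismatch from being hidden by broadcasting or truncation in the element-wise comparison.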

Why two environments

The original gr00t package pins transformers==4.57.3 (Python 3.10), whereas this integration requires transformers>=5.x for Qwen3-VL. Under 5.x, PretrainedConfig is itself a dataclass whose fields have defaults, so the original config dataclasses fail at import time with a TypeError (non-default argument follows default argument). The two implementations therefore cannot be imported in the same Python process.
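The failure mode can be reproduced with the standard library alone. The sketch below uses hypothetical stand-in classes (`BaseConfig`, `ChildConfig`), not the real transformers or gr00t classes. It only shows the general dataclass rule: a subclass field without a default cannot follow inherited fields that have defaults.

```python
from dataclasses import dataclass


@dataclass
class BaseConfig:
    # Stand-in for a base class whose dataclass fields all carry defaults.
    name_or_path: str = ""


def define_child():
    # The class body is executed here, which is when @dataclass validates
    # field ordering (inherited fields first, then the subclass's own).
    @dataclass
    class ChildConfig(BaseConfig):
        hidden_size: int  # no default, but it follows a defaulted field

    return ChildConfig


try:
    define_child()
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

Because the check runs when the class is defined rather than when it is instantiated, the error surfaces as soon as the module containing the class is imported.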

So the test uses a producer / consumer split across two venvs:

Producer — tests/policies/groot/utils/dump_original_n1_7.py, run in the original gr00t venv. For each embodiment it builds dummy inputs generically from the checkpoint metadata (state dims from statistics.json; camera/language keys from the processor modality configs), runs the original model, and saves to one .npz per tag: the raw observations (raw:: keys), the exact collated inputs (in:: keys), the seed, and the raw action_pred.
Consumer — the pytest above, run in the LeRobot venv. It discovers every .npz; the model-parity case replays the byte-identical collated inputs through the LeRobot model with the recorded seed and asserts the outputs match, and the preprocessor-parity case replays the raw observations through LeRobot's full preprocessor pipeline and asserts the collated tensors match.

Artifacts generated by older versions of the dump script contain no raw:: fields; the preprocessor-parity case then skips with a regeneration hint. Re-run the producer to refresh them.
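As a rough illustration of how a consumer might decide which cases to run for one artifact, the sketch below partitions key names by the `raw::` and `in::` prefixes described above. The function names and the specific key names in the example (`raw::state`, `in::input_ids`, `seed`, `action_pred`) are assumptions made for illustration. The real script's key layout and skip messages may differ.

```python
RAW_PREFIX = "raw::"  # raw observations (prefix named in this README)
IN_PREFIX = "in::"    # exact collated model inputs (prefix named in this README)


def split_artifact_keys(keys):
    """Group artifact keys into raw observations, collated inputs, and the rest."""
    raw = sorted(k[len(RAW_PREFIX):] for k in keys if k.startswith(RAW_PREFIX))
    collated = sorted(k[len(IN_PREFIX):] for k in keys if k.startswith(IN_PREFIX))
    other = sorted(k for k in keys if not k.startswith((RAW_PREFIX, IN_PREFIX)))
    return raw, collated, other


def plan_cases(keys):
    """Decide which parity cases can run for one artifact."""
    raw, collated, _ = split_artifact_keys(keys)
    return {
        "model_parity": "run" if collated else "skip: no collated inputs",
        # Older artifacts have no raw:: fields, so this case is skipped.
        "preprocessor_parity": "run" if raw else "skip: re-run the producer",
    }


# Hypothetical key sets for a current artifact and an older one.
current_keys = ["raw::state", "in::input_ids", "in::pixel_values", "seed", "action_pred"]
legacy_keys = ["in::input_ids", "in::pixel_values", "seed", "action_pred"]

current_plan = plan_cases(current_keys)
legacy_plan = plan_cases(legacy_keys)
```

With real artifacts, the key list would come from the loaded `.npz` file (for example the `files` attribute of the object returned by `numpy.load`).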

Fairness controls

Same pre-processed inputs (model parity) — the original processor's input_ids, pixel_values, image_grid_thw, attention_mask, state, embodiment_id are fed verbatim to the LeRobot model (no re-tokenization / re-normalization), so the model comparison isolates the model. LeRobot's own tokenization / image packing is covered separately by the preprocessor-parity case, which compares its output against those same collated tensors from identical raw observations.
Same precision + attention kernel — both sides run fp32 + SDPA. The original defaults to use_flash_attention=True (flash_attention_2 + bf16); the producer forces SDPA + fp32. (With the defaults the gap is ~3e-2 — pure kernel/rounding noise, not an implementation difference.)
Same flow-matching seed — fixed right before sampling on both sides; the producer records it in each artifact (--seed, default 42) and the consumer replays the recorded value.
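The reason for seeding right before sampling, rather than once at startup, is that the two code paths may consume different amounts of randomness before they reach the sampler. The sketch below shows the idea with the standard library's `random` module as a stand-in. The actual implementations presumably seed their tensor framework's generator instead, and `sample_initial_noise` is a hypothetical helper.

```python
import random


def sample_initial_noise(seed, horizon, action_dim):
    """Draw Gaussian noise of shape (horizon, action_dim) from a freshly seeded RNG."""
    random.seed(seed)  # fixed immediately before sampling
    return [
        [random.gauss(0.0, 1.0) for _ in range(action_dim)]
        for _ in range(horizon)
    ]


# Side A consumes one random number before sampling.
random.random()
noise_a = sample_initial_noise(seed=42, horizon=4, action_dim=3)

# Side B consumes many more, yet starts from the same noise.
for _ in range(100):
    random.random()
noise_b = sample_initial_noise(seed=42, horizon=4, action_dim=3)

same_noise = noise_a == noise_b
```

Seeding once at process start would not give this guarantee, since any earlier random draw would shift the state the sampler sees.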

How to run

# Resolve a local checkpoint (GR00T-N1.7-LIBERO / libero_10)
CKPT=$(python - <<'PY'
import os
from huggingface_hub import snapshot_download
print(os.path.join(snapshot_download("nvidia/GR00T-N1.7-LIBERO",
      allow_patterns=["libero_10/*"]), "libero_10"))
PY
)

# 1) Produce the original-side artifacts for all embodiments (original gr00t venv, CUDA)
CUDA_VISIBLE_DEVICES=0 /path/to/Isaac-GR00T/.venv-original/bin/python \
    tests/policies/groot/utils/dump_original_n1_7.py \
    --ckpt "$CKPT" --out-dir tests/policies/groot/artifacts --device cuda --seed 42

# 2) Run the parity test (LeRobot venv) — one parametrized case per embodiment
CUDA_VISIBLE_DEVICES=0 GROOT_PARITY_DEVICE=cuda \
    uv run pytest tests/policies/groot/test_groot_vs_original.py -v -s

The .npz artifacts are local-only (gitignored, ~6–10 MB each) and are regenerated by the producer; they are never committed. The tests skip (do not fail) on CI or when the checkpoint / artifacts are absent.

Env knobs (all optional)

| Var | Default | Purpose |
| --- | --- | --- |
| `GROOT_N1_7_PARITY_DIR` | `tests/policies/groot/artifacts` | directory of per-tag `.npz` artifacts |
| `GROOT_N1_7_LIBERO_CKPT` | auto (HF cache) | override checkpoint dir |
| `GROOT_PARITY_DEVICE` | `cuda` if available | `cpu` or `cuda` |
| `GROOT_PARITY_ATOL` / `GROOT_PARITY_RTOL` | `1e-3` | comparison tolerance |
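For reference, here is a minimal sketch of how these variables could be resolved into a single config object. The variable names and defaults come from the table above. `ParityConfig`, `load_parity_config`, and the `cuda_available` argument are hypothetical and only illustrate the lookup-with-default pattern, not the test's actual code.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ParityConfig:
    artifact_dir: str
    ckpt_dir: Optional[str]  # None means "resolve automatically from the HF cache"
    device: str
    atol: float
    rtol: float


def load_parity_config(env=None, cuda_available=False):
    """Read the optional env knobs, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return ParityConfig(
        artifact_dir=env.get("GROOT_N1_7_PARITY_DIR", "tests/policies/groot/artifacts"),
        ckpt_dir=env.get("GROOT_N1_7_LIBERO_CKPT"),
        device=env.get("GROOT_PARITY_DEVICE", "cuda" if cuda_available else "cpu"),
        atol=float(env.get("GROOT_PARITY_ATOL", "1e-3")),
        rtol=float(env.get("GROOT_PARITY_RTOL", "1e-3")),
    )


# Passing an explicit mapping keeps the example independent of the real environment.
defaults = load_parity_config(env={})
overridden = load_parity_config(
    env={"GROOT_PARITY_DEVICE": "cpu", "GROOT_PARITY_ATOL": "5e-4"},
    cuda_available=True,
)
```

Taking the environment as a parameter makes the defaults easy to check without mutating `os.environ`.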
