# GR00T Policy

GR00T is an NVIDIA foundation model family for generalized humanoid robot reasoning and skills. It is a cross-embodiment policy that accepts multimodal input, including language, images, and proprioception, to perform manipulation tasks in diverse environments. LeRobot integrates GR00T N1.7 through the `groot` policy type.

> [!WARNING]
> **Breaking change:** GR00T N1.5 support was removed from LeRobot, and current releases support GR00T N1.7 only. N1.5 checkpoints, configs, and `--policy.model_version=n1.5` are rejected with a clear error. To keep using an N1.5 checkpoint, pin the last release that supports it: `pip install 'lerobot==0.5.1'`. To use the current release, migrate to GR00T N1.7 (`model_version='n1.7'`, base model [`nvidia/GR00T-N1.7-3B`](https://huggingface.co/nvidia/GR00T-N1.7-3B)).

## Model Overview

GR00T N1.7 uses a Cosmos-Reason2/Qwen3-VL backbone and provides checkpoints for SimplerEnv, DROID, and LIBERO. Developers and researchers can post-train GR00T with their own real or synthetic data to adapt it for specific humanoid robots or tasks.

GR00T uses pre-trained vision and language encoders with a flow matching action transformer to model a chunk of actions conditioned on vision, language, and proprioception.

_Figure: An overview of GR00T._

Its strong performance comes from being trained on an expansive and diverse humanoid dataset, which includes:

- Real captured data from robots.
- Synthetic data generated using NVIDIA Isaac GR00T Blueprint.
- Internet-scale video data.

This approach allows the model to be highly adaptable through post-training for specific embodiments, tasks, and environments.

## Installation Requirements

GR00T is intended for NVIDIA GPU-accelerated systems. The `groot` extra still includes Flash Attention on non-macOS platforms, and Flash Attention needs a compatible PyTorch/CUDA environment before it is installed. Install the dependencies in this order:

1. Follow the Environment Setup in the [Installation Guide](./installation). Do not install `lerobot` yet.

2. Install PyTorch, TorchVision, and the build dependencies used by Flash Attention:

   ```bash
   # Check https://pytorch.org/get-started/locally/ for the right CUDA wheel index for your system.
   pip install "torch>=2.7,<2.12.0" "torchvision>=0.22.0,<0.27.0" \
     --index-url https://download.pytorch.org/whl/cu128
   pip install "ninja>=1.11.1,<2.0.0" "packaging>=24.2,<26.0"
   ```

3. Install and verify Flash Attention:

   ```bash
   pip install "flash-attn>=2.5.9,<3.0.0" --no-build-isolation
   python -c "import flash_attn; print(f'Flash Attention {flash_attn.__version__} imported successfully')"
   ```

4. Install LeRobot with the GR00T extra:

   ```bash
   pip install "lerobot[groot]"
   ```

For a source checkout, use the same order, then install the local package with:

```bash
pip install -e ".[groot]"
```

If your CUDA/PyTorch build needs a different Flash Attention wheel or source build, follow the [Flash Attention project](https://github.com/Dao-AILab/flash-attention) instructions, but keep the same ordering: PyTorch first, Flash Attention next, then `lerobot[groot]`.
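The version ranges in the steps above can be sanity-checked before installing `lerobot[groot]`. The following is a minimal sketch, not part of LeRobot: the helper names (`release_tuple`, `in_range`, `check_environment`) are hypothetical, the ranges are copied from the install commands in this section, and the version comparison is deliberately simplified (it looks only at the numeric release parts and ignores pre-release and local suffixes such as `+cu128`). It uses only the Python standard library and assumes the distribution names `torch`, `torchvision`, and `flash-attn`.

```python
from importlib import metadata

# Ranges copied from the install commands in this section (an assumption of
# this sketch): (distribution name, inclusive lower bound, exclusive upper bound).
REQUIRED = [
    ("torch", "2.7", "2.12.0"),
    ("torchvision", "0.22.0", "0.27.0"),
    ("flash-attn", "2.5.9", "3.0.0"),
]


def release_tuple(version: str) -> tuple[int, ...]:
    """Turn '2.7.1+cu128' into (2, 7, 1), keeping only leading digits of each part."""
    parts = []
    for piece in version.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad so that '2.7' and '2.7.0' compare as equal.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def in_range(version: str, lower: str, upper: str) -> bool:
    """True if lower <= version < upper, comparing numeric release parts only."""
    return release_tuple(lower) <= release_tuple(version) < release_tuple(upper)


def check_environment() -> dict[str, str]:
    """Report 'ok', 'missing', or 'out of range (<version>)' for each requirement."""
    report = {}
    for name, lower, upper in REQUIRED:
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if in_range(installed, lower, upper):
            report[name] = "ok"
        else:
            report[name] = f"out of range ({installed})"
    return report


if __name__ == "__main__":
    for package, status in check_environment().items():
        print(f"{package}: {status}")
```

Run it after step 3. Any `missing` or `out of range` entry suggests revisiting the corresponding step before installing the `groot` extra. The check reads package metadata only, so it does not confirm that Flash Attention imports correctly or that CUDA is usable; the `python -c "import flash_attn; ..."` command in step 3 remains the import test.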
## Usage

To use GR00T N1.7:

```bash
--policy.type=groot \
--policy.model_version=n1.7
```

## Training

### Training Command Example

Here's a complete training command for finetuning the base GR00T model on your own dataset:

```bash
# Using a multi-GPU setup
accelerate launch \
  --multi_gpu \
  --num_processes=$NUM_GPUS \
  $(which lerobot-train) \
  --output_dir=$OUTPUT_DIR \
  --save_checkpoint=true \
  --batch_size=$BATCH_SIZE \
  --steps=$NUM_STEPS \
  --save_freq=$SAVE_FREQ \
  --log_freq=$LOG_FREQ \
  --policy.push_to_hub=true \
  --policy.type=groot \
  --policy.repo_id=$REPO_ID \
  --policy.tune_diffusion_model=false \
  --dataset.repo_id=$DATASET_ID \
  --wandb.enable=true \
  --wandb.disable_artifact=true \
  --job_name=$JOB_NAME
```

## Performance Results

### LIBERO Benchmark Results

> [!NOTE]
> Follow the [LIBERO](./libero) setup instructions before running `lerobot-eval`.

GR00T N1.7 has demonstrated strong performance on the LIBERO benchmark suite. To reproduce LeRobot results, follow the instructions in the [LIBERO](./libero) section.

### GR00T N1.7 LIBERO Checkpoints

NVIDIA publishes GR00T N1.7 LIBERO checkpoints at [`nvidia/GR00T-N1.7-LIBERO`](https://huggingface.co/nvidia/GR00T-N1.7-LIBERO), with one subdirectory per LIBERO suite:

| Suite          | Checkpoint subdirectory |
| -------------- | ----------------------- |
| LIBERO Spatial | `libero_spatial`        |
| LIBERO Object  | `libero_object`         |
| LIBERO Goal    | `libero_goal`           |
| LIBERO 10      | `libero_10`             |

Preliminary LeRobot integration results:

| Suite          | Status | Success rate | n_episodes |
| -------------- | ------ | -----------: | ---------: |
| LIBERO Spatial | ✓      |         ~95% |         XX |
| LIBERO Object  | ✓      |          XX% |         XX |
| LIBERO Goal    | ✓      |          XX% |         XX |
| LIBERO 10      | ✓      |          XX% |         XX |
| **Average**    | ✓      |      **XX%** |     **XX** |

Replace the `XX` placeholders with final eval artifacts before merge.

Download the suite checkpoint locally, then point `--policy.base_model_path` at the downloaded subdirectory.
`--policy.path` is reserved for LeRobot checkpoints that contain a LeRobot `config.json` with a `type` field.

```bash
hf download nvidia/GR00T-N1.7-LIBERO \
  --include "libero_spatial/*" \
  --local-dir ./GR00T-N1.7-LIBERO

lerobot-eval \
  --policy.type=groot \
  --policy.model_version=n1.7 \
  --policy.base_model_path=./GR00T-N1.7-LIBERO/libero_spatial \
  --policy.embodiment_tag=libero_sim \
  --env.type=libero \
  --env.task=libero_spatial \
  --eval.n_episodes=50
```

Use `eval.n_episodes >= 50` per suite when reporting success rates.

### Evaluate in your hardware setup

Once you have trained your model with your own parameters, you can run inference on your downstream task. Follow the instructions in [Policy Deployment (lerobot-rollout)](./inference). For example:

```bash
# --policy.path points at your trained model.
# Optional flag, left out of the command below: --dataset.camera_encoder.vcodec=auto
# Comments are kept outside the command because a comment placed between
# backslash-continued lines would end the command early.
lerobot-rollout \
  --strategy.type=sentry \
  --strategy.upload_every_n_episodes=5 \
  --robot.type=bi_so_follower \
  --robot.left_arm_port=/dev/ttyACM1 \
  --robot.right_arm_port=/dev/ttyACM0 \
  --robot.id=bimanual_follower \
  --robot.cameras='{ right: {"type": "opencv", "index_or_path": 0, "width": 640, "height": 480, "fps": 30},
    left: {"type": "opencv", "index_or_path": 2, "width": 640, "height": 480, "fps": 30},
    top: {"type": "opencv", "index_or_path": 4, "width": 640, "height": 480, "fps": 30},
  }' \
  --display_data=true \
  --dataset.repo_id=/eval_groot-bimanual \
  --dataset.single_task="Grab and handover the red cube to the other arm" \
  --dataset.streaming_encoding=true \
  --dataset.encoder_threads=2 \
  --policy.path=/groot-bimanual \
  --duration=600
```

## License

GR00T N1.7 is released under the [NVIDIA Open Model License Agreement](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/).