# GR00T Policy

GR00T is an NVIDIA foundation model family for generalized humanoid robot reasoning and skills. It is a cross-embodiment policy that accepts multimodal input, including language, images, and proprioception, to perform manipulation tasks in diverse environments.

LeRobot integrates GR00T N1.7 through the `groot` policy type.

## Model Overview

GR00T N1.7 uses a Cosmos-Reason2/Qwen3-VL backbone and provides checkpoints for SimplerEnv, DROID, and LIBERO.

Developers and researchers can post-train GR00T with their own real or synthetic data to adapt it for specific humanoid robots or tasks.

GR00T uses pre-trained vision and language encoders with a flow matching action transformer to model a chunk of actions conditioned on vision, language, and proprioception.

<img
  src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/lerobot-groot-paper1%20(1).png"
  alt="An overview of GR00T"
  width="80%"
/>

Its strong performance comes from being trained on an expansive and diverse humanoid dataset, which includes:

- Real captured data from robots.
- Synthetic data generated using NVIDIA Isaac GR00T Blueprint.
- Internet-scale video data.

This approach allows the model to be highly adaptable through post-training for specific embodiments, tasks, and environments.

## Installation Requirements

GR00T is intended for NVIDIA GPU-accelerated systems. On non-macOS platforms, the `groot` extra includes Flash Attention, which requires a compatible PyTorch/CUDA environment to already be in place when it is installed. Install the dependencies in this order:

1. Follow the Environment Setup in the [Installation Guide](./installation). Do not install `lerobot` yet.
2. Install PyTorch, TorchVision, and the build dependencies used by Flash Attention:

```bash
# Check https://pytorch.org/get-started/locally/ for the right CUDA wheel index for your system.
pip install "torch>=2.7,<2.12.0" "torchvision>=0.22.0,<0.27.0" \
  --index-url https://download.pytorch.org/whl/cu128
pip install "ninja>=1.11.1,<2.0.0" "packaging>=24.2,<26.0"
```

3. Install and verify Flash Attention:

```bash
pip install "flash-attn>=2.5.9,<3.0.0" --no-build-isolation
python -c "import flash_attn; print(f'Flash Attention {flash_attn.__version__} imported successfully')"
```

4. Install LeRobot with the GR00T extra:

```bash
pip install "lerobot[groot]"
```

For a source checkout, use the same order, then install the local package with:

```bash
pip install -e ".[groot]"
```

If your CUDA/PyTorch build needs a different Flash Attention wheel or source build, follow the [Flash Attention project](https://github.com/Dao-AILab/flash-attention) instructions, but keep the same ordering: PyTorch first, Flash Attention next, then `lerobot[groot]`.
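
After completing the steps above, a quick check can confirm that each package is visible to the active interpreter. The following is a minimal sketch, not part of LeRobot: `check_module` is a hypothetical helper, and the script assumes `python` (or `python3`) on your `PATH` is the environment you installed into. It only checks that each module can be located; it does not verify CUDA compatibility.

```bash
# Use the interpreter from the active environment (assumption: it is on PATH).
PYTHON="${PYTHON:-$(command -v python || command -v python3)}"

# Hypothetical helper: return 0 if the interpreter can locate the module, 1 otherwise.
check_module() {
  "$PYTHON" -c "import importlib.util, sys; sys.exit(0 if importlib.util.find_spec('$1') else 1)"
}

# Report the modules in the same order as the installation steps.
for module in torch torchvision flash_attn lerobot; do
  if check_module "$module"; then
    echo "ok:      $module"
  else
    echo "missing: $module"
  fi
done
```

If `flash_attn` is reported as missing while `torch` is found, repeat the Flash Attention step before reinstalling `lerobot[groot]`.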

## Usage

To select GR00T N1.7, add these flags to a LeRobot command such as `lerobot-train` or `lerobot-eval`:

```bash
--policy.type=groot \
--policy.model_version=n1.7
```

## Training

### Training Command Example

The following command fine-tunes the base GR00T model on your own dataset. Set the `$`-prefixed variables for your environment before running it:

```bash
# Using a multi-GPU setup
accelerate launch \
  --multi_gpu \
  --num_processes=$NUM_GPUS \
  $(which lerobot-train) \
  --output_dir=$OUTPUT_DIR \
  --save_checkpoint=true \
  --batch_size=$BATCH_SIZE \
  --steps=$NUM_STEPS \
  --save_freq=$SAVE_FREQ \
  --log_freq=$LOG_FREQ \
  --policy.push_to_hub=true \
  --policy.type=groot \
  --policy.repo_id=$REPO_ID \
  --policy.tune_diffusion_model=false \
  --dataset.repo_id=$DATASET_ID \
  --wandb.enable=true \
  --wandb.disable_artifact=true \
  --job_name=$JOB_NAME
```
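
The command above reads several shell variables that must be set first. As a sketch of one way to do that, the snippet below exports placeholder values and fails early if any are empty. All values are hypothetical examples rather than recommended settings, and `require_vars` is an illustrative helper, not a LeRobot command.

```bash
# Hypothetical placeholder values; adjust for your hardware, dataset, and Hub account.
export NUM_GPUS=2
export OUTPUT_DIR=./outputs/groot_finetune
export BATCH_SIZE=32
export NUM_STEPS=20000
export SAVE_FREQ=5000
export LOG_FREQ=100
export REPO_ID=your-hf-username/groot-finetuned
export DATASET_ID=your-hf-username/your-dataset
export JOB_NAME=groot_finetune

# Hypothetical helper: return 1 if any named variable is unset or empty.
require_vars() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "unset or empty: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

require_vars NUM_GPUS OUTPUT_DIR BATCH_SIZE NUM_STEPS SAVE_FREQ LOG_FREQ \
  REPO_ID DATASET_ID JOB_NAME && echo "All training variables are set."
```

Running the check in the same shell session as the training command catches a missing variable before `accelerate launch` starts.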

## Performance Results

### LIBERO Benchmark Results

> [!NOTE]
> Follow the [LIBERO](./libero) setup instructions before running `lerobot-eval`.

GR00T N1.7 is evaluated on the LIBERO benchmark suite. To reproduce the LeRobot results, follow the instructions in the [LIBERO](./libero) section.

### GR00T N1.7 LIBERO Checkpoints

NVIDIA publishes GR00T N1.7 LIBERO checkpoints at [`nvidia/GR00T-N1.7-LIBERO`](https://huggingface.co/nvidia/GR00T-N1.7-LIBERO), with one subdirectory per LIBERO suite:

| Suite          | Checkpoint subdirectory |
| -------------- | ----------------------- |
| LIBERO Spatial | `libero_spatial`        |
| LIBERO Object  | `libero_object`         |
| LIBERO Goal    | `libero_goal`           |
| LIBERO 10      | `libero_10`             |

Preliminary LeRobot integration results:

| Suite          | Status | Success rate | n_episodes |
| -------------- | ------ | -----------: | ---------: |
| LIBERO Spatial | ✓      |         ~95% |         XX |
| LIBERO Object  | ✓      |          XX% |         XX |
| LIBERO Goal    | ✓      |          XX% |         XX |
| LIBERO 10      | ✓      |          XX% |         XX |
| **Average**    | ✓      |      **XX%** |     **XX** |

Replace the `XX` placeholders with final eval artifacts before merge.

Download the suite checkpoint locally, then point `--policy.base_model_path` at the downloaded subdirectory. `--policy.path` is reserved for LeRobot checkpoints that contain a LeRobot `config.json` with a `type` field.

```bash
huggingface-cli download nvidia/GR00T-N1.7-LIBERO \
  --include "libero_spatial/*" \
  --local-dir ./GR00T-N1.7-LIBERO

lerobot-eval \
  --policy.type=groot \
  --policy.model_version=n1.7 \
  --policy.base_model_path=./GR00T-N1.7-LIBERO/libero_spatial \
  --policy.embodiment_tag=libero_sim \
  --env.type=libero \
  --env.task=libero_spatial \
  --eval.n_episodes=50
```

Use `eval.n_episodes >= 50` per suite when reporting success rates.
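
To evaluate all four suites, the download and evaluation commands can be wrapped in a loop over the checkpoint subdirectories listed in the table. The sketch below makes two assumptions that the example above only confirms for `libero_spatial`: that `--env.task` accepts the same name as each checkpoint subdirectory, and that `--policy.embodiment_tag=libero_sim` applies to every suite. `run`, `DRY_RUN`, and `CKPT_DIR` are hypothetical names introduced for illustration. By default the script only prints the commands.

```bash
SUITES="libero_spatial libero_object libero_goal libero_10"
CKPT_DIR="${CKPT_DIR:-./GR00T-N1.7-LIBERO}"
DRY_RUN="${DRY_RUN:-1}"  # 1 prints the commands; set to 0 to execute them

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Download each suite checkpoint, then evaluate it on the matching LIBERO task.
eval_all_suites() {
  for suite in $SUITES; do
    run huggingface-cli download nvidia/GR00T-N1.7-LIBERO \
      --include "${suite}/*" \
      --local-dir "$CKPT_DIR"
    run lerobot-eval \
      --policy.type=groot \
      --policy.model_version=n1.7 \
      --policy.base_model_path="${CKPT_DIR}/${suite}" \
      --policy.embodiment_tag=libero_sim \
      --env.type=libero \
      --env.task="$suite" \
      --eval.n_episodes=50
  done
}

eval_all_suites
```

Review the printed commands first, then rerun with `DRY_RUN=0` to perform the downloads and evaluations.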

### Evaluate in your hardware setup

Once you have trained your model, you can run inference on your downstream task. Follow the instructions in [Policy Deployment (lerobot-rollout)](./inference). For example:

```bash
# --policy.path points to your trained model.
# Optional: add --dataset.camera_encoder.vcodec=auto to the command.
lerobot-rollout \
  --strategy.type=sentry \
  --strategy.upload_every_n_episodes=5 \
  --robot.type=bi_so_follower \
  --robot.left_arm_port=/dev/ttyACM1 \
  --robot.right_arm_port=/dev/ttyACM0 \
  --robot.id=bimanual_follower \
  --robot.cameras='{ right: {"type": "opencv", "index_or_path": 0, "width": 640, "height": 480, "fps": 30},
    left: {"type": "opencv", "index_or_path": 2, "width": 640, "height": 480, "fps": 30},
    top: {"type": "opencv", "index_or_path": 4, "width": 640, "height": 480, "fps": 30},
  }' \
  --display_data=true \
  --dataset.repo_id=<user>/eval_groot-bimanual \
  --dataset.single_task="Grab and handover the red cube to the other arm" \
  --dataset.streaming_encoding=true \
  --dataset.encoder_threads=2 \
  --policy.path=<user>/groot-bimanual \
  --duration=600
```
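
Before starting a rollout, it can help to confirm that the serial devices named in the command are present. This is a minimal sketch assuming a Linux host and the example port paths used above; `check_devices` is a hypothetical helper, not part of LeRobot.

```bash
# Hypothetical helper: return 0 only if every given device path exists.
check_devices() {
  status=0
  for dev in "$@"; do
    if [ -e "$dev" ]; then
      echo "found:   $dev"
    else
      echo "missing: $dev" >&2
      status=1
    fi
  done
  return "$status"
}

# Port paths are the example values from the rollout command; yours may differ.
check_devices /dev/ttyACM0 /dev/ttyACM1 \
  || echo "Connect both follower arms before starting the rollout."
```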

## License

GR00T N1.7 is released under the [NVIDIA Open Model License Agreement](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/).