mirror of
https://github.com/huggingface/lerobot.git
synced 2026-05-15 08:39:49 +00:00
fix(profiling): fix pi0 cuBLAS error and pi05 OOM on 22GB GPU
- Move cudnn_deterministic to per-spec train_args instead of hardcoding it for all models. cuBLAS deterministic mode triggers internal errors on Gemma-based models (pi0, pi05) during backward pass.
- Enable use_amp=true for pi0, pi0_fast, and pi05 to reduce memory footprint from fp32 (~16GB weights alone) to bf16, fitting within 22GB GPU budget with room for activations and gradients.
- Small models (act, diffusion, multi_task_dit) still use deterministic mode for reproducible profiling results.

Made-with: Cursor
@@ -160,7 +160,6 @@ def build_train_command(spec: ProfilingSpec, run_dir: Path, profile_mode: str) -
         "--policy.push_to_hub=false",
         "--num_workers=0",
         "--log_freq=1",
-        "--cudnn_deterministic=true",
         f"--profile_mode={profile_mode}",
         f"--profile_output_dir={profile_output_dir}",
     ]
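The hunk above only shows the removal of the hardcoded flag; the per-spec side of the change is not visible here. A minimal sketch of how per-spec train_args could be merged into the command follows. The dataclass fields, the spec entries, the `build_train_command_sketch` name, the `profile` subdirectory, and the exact spelling of the AMP flag are all assumptions for illustration, not the repository's actual code.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class ProfilingSpec:
    # Hypothetical stand-in for the real ProfilingSpec; only the fields
    # needed for this illustration are modeled.
    name: str
    train_args: list[str] = field(default_factory=list)


# Assumed spec table: small models keep deterministic mode, Gemma-based
# models skip it and enable mixed precision instead.
SPECS = {
    "act": ProfilingSpec("act", ["--cudnn_deterministic=true"]),
    "pi0": ProfilingSpec("pi0", ["--policy.use_amp=true"]),  # flag spelling assumed
}


def build_train_command_sketch(
    spec: ProfilingSpec, run_dir: Path, profile_mode: str
) -> list[str]:
    """Build shared arguments, then append the spec's own train_args."""
    profile_output_dir = run_dir / "profile"  # assumed layout
    cmd = [
        "--policy.push_to_hub=false",
        "--num_workers=0",
        "--log_freq=1",
        f"--profile_mode={profile_mode}",
        f"--profile_output_dir={profile_output_dir}",
    ]
    # Per-spec flags replace the previously hardcoded determinism flag.
    cmd.extend(spec.train_args)
    return cmd


act_cmd = build_train_command_sketch(SPECS["act"], Path("runs/act"), "summary")
pi0_cmd = build_train_command_sketch(SPECS["pi0"], Path("runs/pi0"), "summary")
```

With this arrangement, determinism becomes an opt-in property of each spec rather than a global default, so a model that fails under deterministic cuBLAS simply omits the flag.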