From a6dd28e8b47e01543c72a3d3b557c901a6f303df Mon Sep 17 00:00:00 2001 From: Pepijn Date: Thu, 16 Apr 2026 21:15:14 +0200 Subject: [PATCH] fix(profiling): tolerate groot dep-install failure groot's only policy-specific dependency is flash-attn, which has no prebuilt wheel for torch 2.10 and requires nvcc to build from source. The CI image is based on nvidia/cuda:12.4.1-base, which ships the CUDA runtime but not the compiler toolkit, so the source build fails with `/usr/local/cuda/bin/nvcc: No such file or directory`. The repo's own pyproject.toml already carries a TODO acknowledging this: gr00t needs bespoke flash-attn install steps. Treat this as an environmental limitation rather than a regression: dep-install failures for groot are logged via `::warning::` and skip the policy without failing the job. Dep-install failures for any other policy remain fatal, so real regressions still surface. Made-with: Cursor --- .github/workflows/model_profiling.yml | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff --git a/.github/workflows/model_profiling.yml b/.github/workflows/model_profiling.yml index 284879e3d..80a6746d0 100644 --- a/.github/workflows/model_profiling.yml +++ b/.github/workflows/model_profiling.yml @@ -161,6 +161,18 @@ jobs: esac } + # Policies whose dep-install may fail due to environment constraints + # (e.g. groot requires compiling flash-attn, which needs nvcc; the CI + # image only ships the CUDA runtime). Install failures for these are + # logged as warnings and do not fail the job. See the TODO next to + # `lerobot[groot]` in pyproject.toml. + is_install_failure_tolerated() { + case "$1" in + groot) return 0 ;; + *) return 1 ;; + esac + } + overall_status=0 for raw_policy in "${policies_to_run[@]}"; do policy="$(echo "${raw_policy}" | xargs)" @@ -183,8 +195,12 @@ jobs: # build isolation for flash-attn specifically. sync_cmd+=(--no-build-isolation-package flash-attn) if ! "${sync_cmd[@]}"; then - echo "Dependency install failed for ${policy}; skipping." >&2 - overall_status=1 + if is_install_failure_tolerated "${policy}"; then + echo "::warning::Dependency install failed for ${policy} (known-fragile); skipping." + else + echo "Dependency install failed for ${policy}; skipping." >&2 + overall_status=1 + fi echo "::endgroup::" continue fi