The dep-install layer (uv sync) now only depends on pyproject.toml,
uv.lock, and a minimal package stub — not the full src/ tree. Source
code changes only rebuild the final COPY layer (seconds, not minutes).
Also switch from type=local cache (lost on ephemeral runners) to
type=gha (persisted in GitHub Actions cache, shared across all runs).
Before: every src/ change → full uv sync rebuild (~8-10 min)
After: src/-only change → cached dep layer, ~30s source copy
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Running chmod on the host doesn't propagate into Docker due to UID/SELinux
mismatch. Instead, spin up the image as root to mkdir+chmod from inside
the container before the eval run mounts the same path.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Benchmark CI workflow, Dockerfiles, benchmark docs, evaluation smoke-test
doc, and dispatch tests belong in a separate PR. Scope this PR to the
async env init changes only.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
num2words (required by SmolVLM processor) is declared in lerobot[smolvla],
not lerobot[libero/metaworld]. Install both extras together.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Each benchmark gets its own Docker image (lerobot[libero] / lerobot[metaworld]
only) so incompatible dep trees cannot collide. A 1-episode smoke eval runs
per benchmark on GPU runners.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>