Commit Graph

13 Commits

Author SHA1 Message Date
Pepijn 3534331fcc feat(ci): add parse_eval_metrics step to benchmark workflow
Adds scripts/ci/parse_eval_metrics.py and wires it into both Libero and
MetaWorld jobs so the dashboard can read pc_success, avg_sum_reward and
eval_s from the metrics artifact instead of relying on GitHub step timing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 10:04:30 +02:00
Pepijn 0dd0a8f11a fix(ci): write eval output to /tmp inside container
user_lerobot cannot create /artifacts at the container root.
Use /tmp/eval-artifacts (always writable) then docker cp it out.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 20:56:10 +02:00
Pepijn 936b42e6a2 fix(ci): use docker cp instead of bind mounts for artifacts
Bind mounts on these runners don't surface container-written files on
the host path (likely DinD/socket-mount setup). Switch to named
containers + docker cp, which copies directly through the daemon and
lands files in the runner's accessible filesystem.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 20:32:10 +02:00
Pepijn e8d029eaf2 fix(ci): change benchmark schedule from monthly to weekly (every Monday)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 20:06:57 +02:00
Pepijn d8305abb3e feat(ci): add monthly schedule trigger for benchmark tests
Runs on the 1st of every month at 02:00 UTC in addition to the
existing push/PR and manual dispatch triggers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 20:06:02 +02:00
Pepijn a16f00ca66 fix(ci): re-chmod artifacts after eval to fix unreadable files
Files created by user_lerobot inside the eval container inherit a
restrictive umask, making them unreadable by the runner after the
container exits. Add a post-eval 'docker run --user root' chmod step
so upload-artifact can find the video files.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 19:59:28 +02:00
Pepijn 927118e0ee fix(ci): use root container chmod to fix PermissionError on artifact dirs
Running chmod on the host doesn't propagate into Docker due to UID/SELinux
mismatch. Instead, spin up the image as root to mkdir+chmod from inside
the container before the eval run mounts the same path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 19:22:10 +02:00
Pepijn c8c2e88e24 chore: remove out-of-scope benchmark/CI/docs files from PR
Benchmark CI workflow, Dockerfiles, benchmark docs, evaluation smoke-test
doc, and dispatch tests belong in a separate PR. Scope this PR to the
async env init changes only.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 18:29:13 +02:00
Pepijn 841cbb0835 fix(ci): pre-create libero config in Dockerfile to bypass stdin prompt
libero/__init__.py calls input() when ~/.libero/config.yaml is missing.
We write the config at image build time (without importing libero) so
the prompt never fires at runtime. Also trigger CI on pyproject.toml changes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 18:29:10 +02:00
Pepijn dfd09c054d fix(ci): set LIBERO_DATA_FOLDER to bypass interactive stdin prompt
libero/__init__.py calls input() to ask about a custom dataset path,
which raises EOFError when stdin is closed inside Docker. Setting
LIBERO_DATA_FOLDER skips the prompt entirely.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 18:29:10 +02:00
Pepijn 07350f95a9 ci(benchmarks): trigger only on envs/ or lerobot_eval.py changes
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 18:29:09 +02:00
Pepijn 61e2be8c9e ci(benchmarks): pin action hashes and use uv sync --locked
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 18:29:09 +02:00
Pepijn 6e6f76d47f ci(benchmarks): add isolated integration tests for libero and metaworld
Each benchmark gets its own Docker image (lerobot[libero] / lerobot[metaworld]
only) so incompatible dep trees cannot collide. A 1-episode smoke eval runs
per benchmark on GPU runners.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 18:29:09 +02:00