- spaces/health-dashboard/app.py: Gradio Space that queries the GitHub
  Actions API directly (no extra datastore). Shows benchmark status
  badges, success-rate and duration trend charts, and embeds the latest
  rollout video per benchmark. Results are cached in-memory for 5 min;
  video files are cached on disk by artifact ID so each download happens
  only once (caching sketched at the end of this message).
- spaces/health-dashboard/requirements.txt + README.md: Space card with
setup instructions for the GITHUB_RO_TOKEN secret (actions:read,
metadata:read only).
- scripts/ci/parse_eval_metrics.py: runs on the CI host after each eval,
  reads the eval_info.json written by lerobot-eval, extracts pc_success
  and n_episodes, and writes metrics.json to the artifacts dir (sketched
  after this list).
- .github/workflows/benchmark_tests.yml: add "Parse … metrics" and
"Upload … metrics" steps (if: always()) after each eval so the
dashboard has data even when the eval fails.
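Roughly what the parse step does, as a sketch; the eval_info.json field
layout and the "aggregated" key are assumptions, so check what
lerobot-eval actually writes:

```python
#!/usr/bin/env python
"""Sketch of scripts/ci/parse_eval_metrics.py; field paths are assumptions."""
import json
import sys
from pathlib import Path

def main(eval_info_path: str, artifacts_dir: str) -> None:
    info = json.loads(Path(eval_info_path).read_text())
    # lerobot-eval writes per-episode stats plus aggregates; the aggregate
    # key is assumed to be "aggregated" here.
    agg = info.get("aggregated", info)
    metrics = {
        "pc_success": agg["pc_success"],
        "n_episodes": agg["n_episodes"],
    }
    (Path(artifacts_dir) / "metrics.json").write_text(json.dumps(metrics, indent=2))

if __name__ == "__main__":
    main(sys.argv[1], sys.argv[2])
```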
The Space should be deployed as a private Space under the huggingface
org. Required secret: GITHUB_RO_TOKEN (fine-grained, read-only).
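The Space's two-level caching, as a minimal sketch; function names and
the /tmp cache dir are illustrative, not the actual app.py:

```python
import time
from pathlib import Path

import requests  # the Space queries the GitHub Actions API directly

_RUNS_CACHE: dict[str, tuple[float, object]] = {}
CACHE_TTL_S = 300  # results cached 5 min in-memory
VIDEO_DIR = Path("/tmp/videos")  # disk cache keyed by artifact ID

def get_runs(repo: str, token: str):
    now = time.time()
    hit = _RUNS_CACHE.get(repo)
    if hit and now - hit[0] < CACHE_TTL_S:
        return hit[1]
    r = requests.get(
        f"https://api.github.com/repos/{repo}/actions/runs",
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    r.raise_for_status()
    data = r.json()
    _RUNS_CACHE[repo] = (now, data)
    return data

def get_video(repo: str, artifact_id: int, token: str) -> Path:
    # Video files are cached on disk by artifact ID so each download
    # happens only once per Space lifetime.
    dest = VIDEO_DIR / f"{artifact_id}.zip"
    if not dest.exists():
        VIDEO_DIR.mkdir(parents=True, exist_ok=True)
        r = requests.get(
            f"https://api.github.com/repos/{repo}/actions/artifacts/{artifact_id}/zip",
            headers={"Authorization": f"Bearer {token}"},
            timeout=120,
        )
        r.raise_for_status()
        dest.write_bytes(r.content)
    return dest
```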
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The container runs as user_lerobot (non-root), but the host-mounted
/artifacts volume was owned by root, causing a PermissionError on the
first video write.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Mount a host volume into the container so lerobot-eval writes videos to
/artifacts, then upload artifacts/videos/ via actions/upload-artifact.
`if: always()` ensures the video is uploaded even when the eval fails,
which helps debug rollout issues.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add HF_HUB_DOWNLOAD_TIMEOUT=300 to both jobs; the SmolVLM2 processor
  download was timing out on CI runners at the default timeout
- MetaWorld: add --rename_map to map observation.image → camera1 and
  --policy.empty_cameras=2 to pad the 2 missing cameras the policy
  expects (trained with 3 cameras, env provides 1); see the padding
  illustration below
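Conceptually, what --policy.empty_cameras does; this is an illustration
of the padding, not lerobot's implementation, and the observation key
names are assumptions:

```python
import torch

def pad_empty_cameras(obs: dict[str, torch.Tensor], n_empty: int = 2) -> dict[str, torch.Tensor]:
    # The policy was trained with 3 cameras but MetaWorld provides 1
    # (renamed to camera1 via --rename_map), so all-zero frames stand in
    # for the missing views. Key naming here is hypothetical.
    ref = obs["observation.images.camera1"]
    for i in range(n_empty):
        obs[f"observation.images.empty_camera_{i}"] = torch.zeros_like(ref)
    return obs
```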
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
All MetaWorld task names in metaworld_config.json use the v3 suffix;
push-v2 caused a KeyError on the TASK_DESCRIPTIONS lookup (reduced
example below).
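The failure, reduced to its essence; the keys and description string
here are illustrative, not the real table:

```python
# TASK_DESCRIPTIONS is keyed by v3 names only.
TASK_DESCRIPTIONS = {"push-v3": "Push the puck to the goal position."}

TASK_DESCRIPTIONS["push-v3"]  # ok
TASK_DESCRIPTIONS["push-v2"]  # KeyError: 'push-v2'
```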
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Each benchmark gets its own image (lerobot[<benchmark>,smolvla]) so
incompatible dep trees can never collide. A 1-episode smoke eval runs
per benchmark on GPU runners.
- Libero: pepijn223/smolvla_libero, libero_spatial, camera_name_mapping
- MetaWorld: pepijn223/smolvla_metaworld, metaworld-push-v2
- LIBERO config pre-created at build time to bypass interactive stdin prompt
- Triggers on envs/**, lerobot_eval.py, Dockerfiles, pyproject.toml changes
- Adds docs/source/evaluation.mdx and restores step 7 in adding_benchmarks
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Benchmark CI workflow, Dockerfiles, benchmark docs, evaluation smoke-test
doc, and dispatch tests belong in a separate PR. Scope this PR to the
async env init changes only.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
libero/__init__.py calls input() when ~/.libero/config.yaml is missing.
We write the config at image build time (without importing libero) so
the prompt never fires at runtime (sketched below). Also trigger CI on
pyproject.toml changes.
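A sketch of the build-time config write; the config keys and install
path are assumptions and must be checked against what the installed
libero version expects:

```python
# Runs via a RUN python step in the Dockerfile; libero itself is never
# imported, so its input() prompt cannot fire.
from pathlib import Path

import yaml

libero_root = "/opt/libero/libero/libero"  # assumed install location
config = {
    # Key names mirror what libero/__init__.py would write interactively;
    # verify them against the installed libero version.
    "benchmark_root": libero_root,
    "bddl_files": f"{libero_root}/bddl_files",
    "init_states": f"{libero_root}/init_files",
    "assets": f"{libero_root}/assets",
    "datasets": "/artifacts/libero_datasets",
}
cfg = Path.home() / ".libero" / "config.yaml"
cfg.parent.mkdir(parents=True, exist_ok=True)
cfg.write_text(yaml.safe_dump(config))
```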
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
libero/__init__.py calls input() to ask about a custom dataset path,
which raises EOFError when stdin is closed inside Docker. Setting
LIBERO_DATA_FOLDER skips the prompt entirely (pattern sketched below).
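Roughly the pattern in libero's init, paraphrased rather than quoted
from its source:

```python
import os

# With stdin closed inside Docker, input() raises EOFError; an env var
# set in the image short-circuits the prompt before it is reached.
data_folder = os.environ.get("LIBERO_DATA_FOLDER")
if data_folder is None:
    data_folder = input("Specify a custom dataset folder: ")
```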
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Each benchmark gets its own Docker image (lerobot[libero] / lerobot[metaworld]
only) so incompatible dep trees cannot collide. A 1-episode smoke eval runs
per benchmark on GPU runners.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>