lerobot/tests/envs
Steven Palma 698d2a0e77 feat(policies): add EVO1 policy (#3908)
* feat(policies): add EVO1 policy

* fix(evo1): infer batch size after normalizing image dims

`_collect_image_batches` read `batch_size = batch[camera_keys[0]].shape[0]`
before normalizing per-camera tensors to `(B, C, H, W)`. For an unbatched
`(C, H, W)` input (which the function tries to support via the `image.dim() == 3`
branch), this picked up the channel count `C` instead of the real batch size,
making the subsequent per-sample loop iterate `C` times and indexing go
out of bounds.

Normalize each camera tensor up-front, then read `batch_size` from the
normalized batch dim. Adds `test_collect_image_batches_handles_unbatched_chw`
covering the regression.

Reported by Copilot review on huggingface/lerobot#3545.
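
A minimal sketch of the ordering fix described above: normalize first, then read the batch size. It uses NumPy arrays as a stand-in for torch tensors so it runs without extra dependencies. The function name `collect_image_batches` and its return shape are hypothetical and do not reproduce the actual `_collect_image_batches` implementation.

```python
import numpy as np


def collect_image_batches(batch, camera_keys):
    """Hypothetical stand-in for `_collect_image_batches` (illustration only)."""
    normalized = {}
    for key in camera_keys:
        image = np.asarray(batch[key])
        if image.ndim == 3:
            # Unbatched (C, H, W): add a leading batch dim -> (1, C, H, W).
            image = image[None, ...]
        if image.ndim != 4:
            raise ValueError(f"{key}: expected (C, H, W) or (B, C, H, W), got {image.shape}")
        normalized[key] = image

    # Read the batch size only AFTER normalization. Reading shape[0] from the
    # raw input would return the channel count C for an unbatched image.
    batch_size = normalized[camera_keys[0]].shape[0]

    # One list of per-camera images for each sample in the batch.
    per_sample = [[normalized[key][i] for key in camera_keys] for i in range(batch_size)]
    return batch_size, per_sample


# An unbatched 3-channel image yields a batch of one sample, not three.
size, samples = collect_image_batches({"cam": np.zeros((3, 8, 8))}, ["cam"])
print(size, samples[0][0].shape)
```

With the buggy ordering, the same unbatched input would report a batch size of 3 (the channel count), and the per-sample loop would index past the single normalized sample.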

* chore(lock): regenerate uv.lock for evo1 extra

Adds the `evo1` entry to `[package.metadata.requires-dist]` and the
`provides-extras` list so that `uv sync --locked --extra test` (used by
fast_tests.yml) no longer reports the lockfile as stale.

Generated with `uv 0.8.0` (matching `UV_VERSION` in fast_tests.yml).
The non-evo1 marker tweaks are produced by `uv lock` re-resolving the
existing dep graph and are not introduced by this PR.

* chore(evo1): align with policy contribution guide conventions

- Add `src/lerobot/policies/evo1/README.md` symlink into `docs/source/evo1.mdx`
  to match the in-tree README convention (mirroring the EO-1 layout).
- Convert `transformers` import in `internvl3_embedder.py` to the standard
  `TYPE_CHECKING + _transformers_available` two-step gating used by other
  optional-backbone policies (e.g. diffusion). The previous lazy-in-`__init__`
  import was functionally equivalent for runtime gating but didn't expose the
  real symbols to type checkers.
- Add `lerobot[evo1]` to the `all` extra in `pyproject.toml` so
  `pip install 'lerobot[all]'` keeps installing every optional policy.

Per the guidance in https://moon-ci-docs.huggingface.co/docs/lerobot/pr_3534/en/contributing_a_policy.

* fix(evo1): finalize policy guide alignment

* docs(evo1): format results table

* Fix EVO1 LIBERO rollout processors

* Fix EVO1 LIBERO eval action postprocessing

* Fix eval action conversion for bf16 policies

* fix(evo1): move LIBERO padding into policy processors

* refactor(evo1): use native HF InternVL3-1B-hf, drop trust_remote_code

- Switch from OpenGVLab/InternVL3-1B (requires trust_remote_code=True)
  to OpenGVLab/InternVL3-1B-hf (native transformers implementation).
- Replace manual _extract_feature + _prepare_and_fuse_embeddings with
  a single model.forward() call — verified bit-for-bit identical output.
- Remove ~170 lines of manual ViT/pixel-shuffle/projection logic.
- Symlink README.md to docs/source/ following repo convention.

Weights are byte-identical between both model variants; only the module
naming differs. All 12 existing unit tests pass. Local training (10 steps)
on maximellerbach/omx_pickandplace confirmed working.

* refactor(policy): evo1 GPU-batched preprocessing + vectorized attention masking + remove dead code

* fix(style): pre-commit

oops

* chore(evo1): delete added test + reduce diff

* refactor(policies): use config for evo1 + local imports

* refactor(policies): multiple improvements

* chore: update docs + remove legacy codepaths

* feat(policies): implement RTC for EVO1

---------

Co-authored-by: javadcc_mac <javadcc1@sjtu.edu.cn>
Co-authored-by: Yiming Wang <145452074+JAVAdcc@users.noreply.github.com>
Co-authored-by: Martino Russi <nopyeps@gmail.com>
2026-07-03 22:17:15 +02:00