* fix(ci): prevent runner group error on fork pushes
Add a repository check to the unbound_deps_tests workflow so the
aws-general-8-plus runner group is only used on the main repository,
preventing 'Required runner group not found' errors on forks.
* fix(ci): use gating job to prevent runner allocation on forks
The previous approach failed because GitHub resolves runs-on before
evaluating if conditions. A check-repo job now runs first on
ubuntu-latest, and every job that needs a special runner depends on it
and checks its output before being scheduled.
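A minimal sketch of the gating pattern, assuming job names and outputs that may differ from the actual workflow:

```yaml
jobs:
  check-repo:
    runs-on: ubuntu-latest  # always available, even on forks
    outputs:
      is-main-repo: ${{ github.repository == 'huggingface/lerobot' }}
    steps:
      - run: "true"  # the job exists only to expose the output above
  gpu-tests:
    needs: check-repo
    # evaluated before runner allocation, so forks never request the group
    if: needs.check-repo.outputs.is-main-repo == 'true'
    runs-on:
      group: aws-general-8-plus
    steps:
      - run: echo "scheduled only on the main repository"
```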
* fix(ci): add gating job to full_tests to prevent runner allocation on forks
Apply the same gating pattern used in unbound_deps_tests to full_tests.yml
to prevent GitHub from trying to allocate custom runners when workflows
run on forks. The check-repo job runs first on ubuntu-latest and all jobs
with custom runners depend on it and check its output.
* fix(ci): add repository check to unbound_deps_tests workflow
Add an 'if: github.repository == huggingface/lerobot' check to the
build-and-push-docker job to prevent runner group access errors on
forks, matching the pattern used in nightly.yml
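The guard in question, sketched under the assumption that the job requests a runner group as in the commits above:

```yaml
build-and-push-docker:
  if: github.repository == 'huggingface/lerobot'  # guard matching nightly.yml
  runs-on:
    group: aws-general-8-plus
  steps:
    - run: echo "runs on the main repository only"
```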
* fix(ci): add repository check to full_tests workflow
Add an 'if: github.repository == huggingface/lerobot' check to the
build-and-push-docker and gpu-tests jobs to prevent runner group access
errors on forks
* refactor(ci): remove redundant check from gpu-tests job
gpu-tests depends on build-and-push-docker via needs, so it will automatically skip when the parent job is skipped
* refactor(ci): remove unnecessary fork check from full-tests job
full-tests runs on ubuntu-latest, which is available on forks, so there is no need to restrict it
---------
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
* dependency fixes for pi
* add wallx/sarm conflict
* also add conflicts for pi
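A minimal sketch of how such conflicts are declared for uv; the exact extra groupings are assumptions:

```toml
# pyproject.toml: declare extras as mutually exclusive so uv resolves
# them separately instead of failing on incompatible pins
[tool.uv]
conflicts = [
    [
        { extra = "pi" },
        { extra = "wallx" },
    ],
    [
        { extra = "sarm" },
        { extra = "wallx" },
    ],
]
```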
* fix(ci): use --extra all instead of --all-extras + --no-extra
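Illustrative invocations, assuming an `all` extra exists in pyproject.toml and `wallx` is the conflicting extra being excluded:

```sh
# before: enable every extra, then carve out the conflicting one
uv sync --all-extras --no-extra wallx
# after: a curated "all" extra avoids declared conflicts entirely
uv sync --extra all
```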
---------
Co-authored-by: Steven Palma <steven.palma@huggingface.co>
* support wallx
* fix bugs in flow
* incorporate wallx model into lerobot
* update the policy methods
* reduce to minimal config and params & pass lerobot basic tests
* fix dtype bugs
* add wallx dependencies
* update
* remove flash-attn requirement and fix bugs in inference and fast mode
* fix bug for inference
* add some small modifications
* fix pre-commit errors
* remove lerobot[wallx]
* fix ci
* fix pre-commit issues
* fix: exclude wallx extra properly in CI workflows
* fix: add uv conflicts for wallx transformers version
* fix: peft test import
* pre-commit
* only export WallXConfig from the wall_x package to avoid importing peft in CI
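A sketch of the narrowed package interface; the module path and name are assumptions:

```python
# wall_x/__init__.py
# Re-export only the config class so that importing the package during
# CI collection does not pull in the peft/transformers import chain.
from .configuration_wallx import WallXConfig  # module name assumed

__all__ = ["WallXConfig"]
```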
* remove torch dep
* pre-commit
* add import
---------
Co-authored-by: vincentchen <chenlufang@x2robot.com>
Co-authored-by: Geoffrey19 <sympathischmann35@gmail.com>
Co-authored-by: Pepijn <138571049+pkooij@users.noreply.github.com>
Co-authored-by: Pepijn <pepijn@huggingface.co>
* Enhance training and logging functionality with accelerator support
- Added support for multi-GPU training by introducing an `accelerator` parameter in training functions.
- Updated `update_policy` to handle gradient updates based on the presence of an accelerator (see the sketch after this list).
- Modified logging to prevent duplicate messages in non-main processes.
- Enhanced `set_seed` and `get_safe_torch_device` functions to accommodate accelerator usage.
- Updated `MetricsTracker` to account for the number of processes when calculating metrics.
- Added the `accelerate` library as a new dependency entry in `pyproject.toml`.
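A minimal sketch of the dual code path described above, assuming a policy whose `forward` returns a loss; names mirror the bullets but not the exact implementation:

```python
import torch
from accelerate import Accelerator

def update_policy(policy, batch, optimizer, grad_clip_norm: float,
                  accelerator: Accelerator | None = None):
    loss, _ = policy.forward(batch)  # assumed to return (loss, extra outputs)
    if accelerator is not None:
        accelerator.backward(loss)  # handles mixed precision and gradient sync
        grad_norm = accelerator.clip_grad_norm_(policy.parameters(), grad_clip_norm)
    else:
        loss.backward()
        grad_norm = torch.nn.utils.clip_grad_norm_(policy.parameters(), grad_clip_norm)
    optimizer.step()
    optimizer.zero_grad()
    return loss.item(), grad_norm
```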
* Initialize logging in training script for both main and non-main processes
- Added `init_logging` calls to ensure proper logging setup when using the accelerator and in standard training mode.
- This keeps logging output consistent whether or not the accelerator is used.
* add docs and only push model once
* Place logging under accelerate and update docs
* fix pre commit
* only log in main process
* main logging
* try with local rank
* add tests
* change runner
* fix test
* don't push to hub in multi-GPU tests
* pre-download dataset in tests
* small fixes
* fix path optimizer state
* update docs, and small improvements in train
* simplify accelerate main process detection
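A sketch of the simplified detection via accelerate's `PartialState`, reflecting the intent rather than the exact code:

```python
import logging
from accelerate import PartialState

def log_main(msg: str) -> None:
    # PartialState is a lightweight shared singleton, so this works even
    # where no Accelerator instance is in scope.
    if PartialState().is_main_process:
        logging.info(msg)
```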
* small improvements in train
* fix OOM bug
* change accelerate detection
* add some debugging
* always use accelerate
* cleanup update method
* cleanup
* fix bug
* scale lr decay if we reduce steps
* cleanup logging
* fix formatting
* incorporate PR feedback
* add min memory to CPU tests
* use accelerate to determine logging
* fix pre-commit and fix tests
* chore: minor details
---------
Co-authored-by: AdilZouitine <adilzouitinegm@gmail.com>
Co-authored-by: Steven Palma <steven.palma@huggingface.co>
* refactor(scripts): update system info script
* chore(scripts): rename info script
* feat(scripts): add entrypoint for info
* chore(ci): update issue report template