WALL-OSS
This repository contains the Hugging Face port of WALL-OSS, a Vision-Language-Action model for cross-embodiment robotic control. It is based on Qwen2.5-VL and predicts actions via flow matching or FAST discrete tokens.
Model Overview
| Feature | Description |
|---|---|
| Base Model | Qwen2.5-VL (Vision-Language Model) |
| Action Prediction | Flow Matching (diffusion) or FAST (discrete tokens) |
| Architecture | Mixture of Experts (MoE) with action-specific routing |
| Multi-Modal Inputs | Vision (images/videos), Language, Proprioception |
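To illustrate the flow-matching option above: at inference time, a flow-matching action head starts from Gaussian noise and integrates a learned velocity field toward an action chunk. The sketch below shows that sampling loop with plain Euler steps and a toy velocity field; it is a minimal illustration of the general technique, not the WALL-OSS implementation, and `toy_velocity` is a stand-in for the trained network.

```python
import numpy as np

def flow_matching_sample(velocity_fn, action_dim, num_steps=100, seed=0):
    """Integrate a velocity field from noise (t=0) to an action (t=1)
    using Euler steps -- the generic flow-matching inference loop."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(action_dim)  # start from Gaussian noise
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        x = x + dt * velocity_fn(x, t)  # Euler update along the flow
    return x

# Toy "trained" velocity field: for a linear (rectified) flow toward a
# fixed target action, the ideal velocity is (target - x) / (1 - t).
target = np.array([0.5, -0.2, 0.1])

def toy_velocity(x, t):
    return (target - x) / (1.0 - t + 1e-8)

action = flow_matching_sample(toy_velocity, action_dim=3)
```

With this toy field the integration recovers `target` almost exactly; in the real model the velocity network is conditioned on the vision, language, and proprioception inputs listed above.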
Citation
If you use this work, please cite:
@article{zhai2025igniting,
title = {Igniting VLMs Toward the Embodied Space},
author = {Zhai, Andy and Liu, Brae and Fang, Bruno and Cai, Chalse and Ma, Ellie and Yin, Ethan and Wang, Hao and Zhou, Hugo and Wang, James and Shi, Lights and Liang, Lucy and Wang, Make and Wang, Qian and Gan, Roy and Yu, Ryan and Li, Shalfun and Liu, Starrick and Chen, Sylas and Chen, Vincent and Xu, Zach},
journal = {arXiv preprint arXiv:2509.11766},
year = {2025}
}
License
This port is released under the Apache License 2.0.