mirror of
https://github.com/huggingface/lerobot.git
synced 2026-05-20 11:09:59 +00:00
feat(annotate): default to HF Inference Providers, no local GPU needed
Flip the default backend to 'openai' with use_hf_inference_providers=True and a Qwen3-VL-30B-A3B-Instruct:novita default model_id. The CLI now runs end-to-end without a local model load — annotations are produced by sending video_url + prompt to https://router.huggingface.co/v1. Switch back to local inference with --vlm.backend=vllm or --vlm.use_hf_inference_providers=false. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -72,23 +72,26 @@ class Module3Config:
|
|||||||
class VlmConfig:
|
class VlmConfig:
|
||||||
"""Shared Qwen-VL client configuration."""
|
"""Shared Qwen-VL client configuration."""
|
||||||
|
|
||||||
backend: str = "vllm"
|
backend: str = "openai"
|
||||||
"""One of ``vllm``, ``transformers``, ``openai``, or ``stub`` (tests only).
|
"""One of ``vllm``, ``transformers``, ``openai``, or ``stub`` (tests only).
|
||||||
|
|
||||||
The ``openai`` backend talks to any OpenAI-compatible server — works
|
Default ``openai`` paired with ``use_hf_inference_providers=True``
|
||||||
with ``vllm serve``, ``transformers serve``, ``ktransformers serve``,
|
routes requests through HF Inference Providers — no local GPU
|
||||||
or hosted endpoints. Set ``api_base`` and (optionally) ``api_key``."""
|
needed. Switch to ``vllm`` / ``transformers`` for in-process
|
||||||
model_id: str = "Qwen/Qwen3.6-27B-FP8"
|
inference."""
|
||||||
|
model_id: str = "Qwen/Qwen3-VL-30B-A3B-Instruct:novita"
|
||||||
api_base: str = "http://localhost:8000/v1"
|
api_base: str = "http://localhost:8000/v1"
|
||||||
"""Base URL for the ``openai`` backend."""
|
"""Base URL for the ``openai`` backend."""
|
||||||
api_key: str = "EMPTY"
|
api_key: str = "EMPTY"
|
||||||
"""API key for the ``openai`` backend; ``EMPTY`` works for local servers."""
|
"""API key for the ``openai`` backend; ``EMPTY`` works for local servers."""
|
||||||
use_hf_inference_providers: bool = False
|
use_hf_inference_providers: bool = True
|
||||||
"""When True, route requests through https://router.huggingface.co/v1
|
"""Route requests through https://router.huggingface.co/v1 using your
|
||||||
using your ``HF_TOKEN`` env var as the API key. The CLI flips
|
``HF_TOKEN`` env var as the API key. Default ``True`` — no local GPU
|
||||||
``auto_serve`` off automatically — no local server is spawned. Use
|
needed. The CLI flips ``auto_serve`` off automatically when this is
|
||||||
``model_id`` of the form ``Qwen/Qwen3-VL-30B-A3B-Instruct:novita`` to
|
set. Use ``model_id`` of the form
|
||||||
pin a specific provider, or omit ``:provider`` to let HF route."""
|
``Qwen/Qwen3-VL-30B-A3B-Instruct:novita`` to pin a specific provider,
|
||||||
|
or omit ``:provider`` to let HF route. Set ``False`` to fall back to
|
||||||
|
a local server (vllm serve / transformers serve / external)."""
|
||||||
auto_serve: bool = True
|
auto_serve: bool = True
|
||||||
"""When True with ``backend=openai``, the CLI probes ``api_base``
|
"""When True with ``backend=openai``, the CLI probes ``api_base``
|
||||||
first; if no server answers, it spawns one (default:
|
first; if no server answers, it spawns one (default:
|
||||||
|
|||||||
Reference in New Issue
Block a user