optimize topreward input processing (#3660)

2026-07-23 17:56:07 +00:00 · 2026-05-25 22:07:45 +08:00
parent 616663cd9f
commit 3b5b94dbd6
10 changed files with 300 additions and 281 deletions
@@ -53,7 +53,7 @@ or, with `uv` from a source checkout:
 uv sync --extra topreward
 ```

-This pulls in `transformers` and `qwen-vl-utils`. The first time you run TOPReward, Hugging Face will also download the VLM weights from the Hub (~16 GB for Qwen3-VL-8B-Instruct). A GPU is strongly recommended.
+This pulls in `transformers`. The first time you run TOPReward, Hugging Face will also download the VLM weights from the Hub (~16 GB for Qwen3-VL-8B-Instruct). A GPU is strongly recommended.

 ## Model Inputs and Outputs