feat(smolvla2): chat-template processor + label mask + predict_actions

Wires PR 1's recipe stack into the SmolVLA2 pipeline so multi-target sub-recipes (memory_update, ask_vqa, user_interjection_response, high_level_subtask) carry meaningful supervision through to the model.

- New ``chat_processor_smolvla2.py`` with ``SmolVLA2ChatTokenizerStep``: reads ``messages`` / ``message_streams`` / ``target_message_indices`` from the rendered sample (PR 1 ``RenderMessagesStep``), calls ``apply_chat_template(messages, tools=DEFAULT_TOOLS, ...)`` on the SmolVLM tokenizer, and writes:

      OBS_LANGUAGE_TOKENS / _ATTENTION_MASK ← chat-templated prompt
      text_labels                           ← -100 except target msg tokens
      predict_actions                       ← True iff any low_level target

  Builds the label mask robustly by re-rendering the chat through each target's prefix and reading off the prefix length — same tokenizer, same tools, so the prefix tokens are guaranteed to be a prefix of the full sequence.

  Image/video content blocks (LeRobot ``feature``-keyed) are stripped before tokenizing; the actual image tensors flow through SmolVLA's existing ``OBS_IMAGES_*`` channels and ``embed_prefix`` puts them before the language embeddings, matching the chat-template-stripped text order.

- ``processor_smolvla2.py``: when ``config.recipe_path`` is set, build a new pipeline with ``RenderMessagesStep`` + ``SmolVLA2ChatTokenizerStep`` instead of SmolVLA's plain ``TokenizerProcessorStep``. When ``recipe_path`` is ``None``, fall back to SmolVLA's pipeline so unannotated datasets still work unchanged. Resolves recipe paths relative to ``src/lerobot/configs/`` so ``recipes/smolvla2_hirobot.yaml`` works directly.

The next commit on this branch picks up ``text_labels`` and ``predict_actions`` from the batch and routes them through the SmolVLM ``lm_head`` for the actual dual-loss training.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
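
The prefix-length label mask described in this commit message can be illustrated with a minimal sketch. It is not the code from ``chat_processor_smolvla2.py``: ``render``, ``tokenize`` and ``build_labels`` are hypothetical stand-ins, with a toy template and a character-level tokenizer used in place of the SmolVLM chat template and tokenizer so the example runs without any model files.

```python
IGNORE_INDEX = -100  # label value that cross-entropy losses conventionally skip


def render(messages: list[dict]) -> str:
    """Toy stand-in for a chat template: one tagged line per message."""
    return "".join(f"<|{m['role']}|>{m['content']}<|end|>\n" for m in messages)


def tokenize(text: str) -> list[int]:
    """Toy character-level tokenizer (assumption: replaces the real tokenizer)."""
    return [ord(ch) for ch in text]


def build_labels(messages: list[dict], target_indices: list[int]):
    """Return (input_ids, labels) with labels set only on target messages."""
    full_ids = tokenize(render(messages))
    labels = [IGNORE_INDEX] * len(full_ids)
    for idx in target_indices:
        # Re-render the conversation up to (and then through) the target and
        # read off the two prefix lengths; the span between them is supervised.
        start = len(tokenize(render(messages[:idx])))
        end = len(tokenize(render(messages[: idx + 1])))
        labels[start:end] = full_ids[start:end]
    return full_ids, labels


messages = [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": "ok"},
]
input_ids, labels = build_labels(messages, target_indices=[1])
```

In this toy version the supervised span also covers the role tag of the target message; whether the real step masks the tag is not stated in the commit message.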
feat(tools): src/lerobot/tools/ — runnable tool registry + SayTool
2026-06-19 01:07:18 +00:00 · 2026-04-30 19:21:03 +02:00 · 2026-04-30 18:58:04 +02:00 · 2026-04-30 18:55:23 +02:00 · 2026-04-30 18:51:38 +02:00 · 2026-04-30 18:48:36 +02:00
98 changed files with 9282 additions and 1421 deletions
@@ -1,7 +1,5 @@
 This file provides guidance to AI agents when working with code in this repository.

-> **User-facing help → [`AGENT_GUIDE.md`](./AGENT_GUIDE.md)** (SO-101 setup, recording, picking a policy, training duration, eval — with copy-pasteable commands).
-
 ## Project Overview

 LeRobot is a PyTorch-based library for real-world robotics, providing datasets, pretrained policies, and tools for training, evaluation, data collection, and robot control. It integrates with Hugging Face Hub for model/dataset sharing.
@@ -1,410 +0,0 @@
-# AGENT_GUIDE.md — LeRobot Helper for AI Agents & Users
-
-This file is a practical, copy-paste-friendly companion for any AI agent (Cursor, Claude, ChatGPT, Codex, etc.) helping a user work with LeRobot. It complements [`AGENTS.md`](./AGENTS.md) (dev/contributor context) with **user-facing guidance**: how to start, what to train, how long, how to record, and how to calibrate an SO-101.
-
---
-
-## 1. Start here — ask the user first (MANDATORY)
-
-Before suggesting any command, an agent MUST ask the user at least these questions and wait for answers:
-
-1. **What's your goal?** (e.g. "teach my SO-101 to fold a cloth", "train a policy on an existing HF dataset", "contribute a PR", "understand the codebase")
-2. **What hardware do you have?**
-   - Robot: none / SO-100 / SO-101 / Koch / LeKiwi / Reachy / other
-   - Teleop: leader arm / phone / keyboard / gamepad / none
-   - Cameras: how many, resolution, fixed or moving?
-3. **What machine will you train on?**
-   - GPU model + VRAM (e.g. "laptop 3060 6 GB", "RTX 4090 24 GB", "A100 80 GB", "CPU only")
-   - OS: macOS / Linux / Windows
-4. **Skill level & time budget?** First time, some ML, experienced? Hours, days, a weekend?
-5. **Do you already have a dataset?** Yes (HF repo id?) / no / want to record one
-6. **How can I help right now?** (pick one concrete next step)
-
-Only after you have answers, propose a concrete path. If something is ambiguous, ask again rather than guessing. Bias toward **the simplest thing that works** for the user's hardware and goal.
-
---
-
-## 2. LeRobot in 60 seconds
-
-LeRobot = **datasets + policies + envs + robot control**, unified by a small set of strong abstractions.
-
- **`LeRobotDataset`** — episode-aware dataset (video or images + actions + state), loadable from the Hub or disk.
- **Policies** (`ACT`, `Diffusion`, `SmolVLA`, `π0`, `π0.5`, `Wall-X`, `X-VLA`, `VQ-BeT`, `TD-MPC`, …) — all inherit `PreTrainedPolicy` and can be pushed/pulled from the Hub.
- **Processors** — small composable transforms between dataset → policy → robot.
- **Envs** (sim) and **Robots** (real) — same action/observation contract so code swaps cleanly.
- **CLI** — `lerobot-record`, `lerobot-train`, `lerobot-eval`, `lerobot-teleoperate`, `lerobot-calibrate`, `lerobot-find-port`, `lerobot-setup-motors`, `lerobot-replay`.
-
-See [`AGENTS.md`](./AGENTS.md) for repo architecture.
-
---
-
-## 3. Quickstart paths (pick one)
-
-### Path A — "I have an SO-101 and want my first trained policy"
-
-Go to §4 (SO-101 end-to-end), then §5 (data tips), then §6 (pick a policy — likely **ACT**), then §7 (how long), then §8 (eval).
-
-### Path B — "No hardware, I want to train on an existing dataset"
-
-Skip §4. Pick a policy in §6, pick a duration in §7, then run `lerobot-train` per §4.9 with a Hub `--dataset.repo_id` and an `--env.type` for eval. Finish with §8.
-
-### Path C — "I just want to understand the codebase"
-
-Read §2 above, then `AGENTS.md` "Architecture", then open `src/lerobot/policies/act/` and `src/lerobot/datasets/lerobot_dataset.py` as canonical examples.
-
---
-
-## 4. SO-101 end-to-end cheat-sheet
-
-Full details in [`docs/source/so101.mdx`](./docs/source/so101.mdx) and [`docs/source/il_robots.mdx`](./docs/source/il_robots.mdx). Minimum commands in order. Confirm arms are assembled + powered before issuing.
-
-**4.1 Install**
-
-```bash
-pip install 'lerobot[feetech]'              # SO-100/SO-101 motor stack
-# pip install 'lerobot[all]'                # everything
-# pip install 'lerobot[aloha,pusht]'        # specific features
-# pip install 'lerobot[smolvla]'            # add SmolVLA deps
-git lfs install && git lfs pull
-hf auth login                               # required to push datasets/policies
-```
-
-Contributors can alternatively use `uv sync --locked --extra feetech` (see `AGENTS.md`).
-
-**4.2 Find USB ports** — run once per arm, unplug when prompted.
-
-```bash
-lerobot-find-port
-```
-
-macOS: `/dev/tty.usbmodem...`; Linux: `/dev/ttyACM0` (may need `sudo chmod 666 /dev/ttyACM0`).
-
-**4.3 Setup motor IDs & baudrate** (one-time, per arm)
-
-```bash
-lerobot-setup-motors --robot.type=so101_follower --robot.port=<FOLLOWER_PORT>
-lerobot-setup-motors --teleop.type=so101_leader  --teleop.port=<LEADER_PORT>
-```
-
-**4.4 Calibrate** — center all joints, press Enter, sweep each joint through its full range. The `id` is the calibration key — reuse it everywhere.
-
-```bash
-lerobot-calibrate --robot.type=so101_follower --robot.port=<FOLLOWER_PORT> --robot.id=my_follower
-lerobot-calibrate --teleop.type=so101_leader  --teleop.port=<LEADER_PORT>   --teleop.id=my_leader
-```
-
-**4.5 Teleoperate** (sanity check, no recording)
-
-```bash
-lerobot-teleoperate \
-  --robot.type=so101_follower --robot.port=<FOLLOWER_PORT> --robot.id=my_follower \
-  --teleop.type=so101_leader  --teleop.port=<LEADER_PORT>  --teleop.id=my_leader \
-  --robot.cameras="{ front: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
-  --display_data=true
-```
-
-> **Feetech timeout / comms error on SO-100 / SO-101?** Before touching software, check the **red motor LEDs** on the daisy chain.
->
-> - **All steady red, gripper → base chain** → wiring OK.
-> - **One or more motors dark / chain stops mid-way** → wiring issue: reseat the 3-pin cables, check the controller-board power supply, and make sure each motor is fully clicked in.
-> - **LEDs blinking** → the motor is in an **error state**: usually overload (forcing a joint past its limit) **or wrong power supply voltage**. SO-100 / SO-101 ship in two variants — a **5 V / 7.4 V** build and a **12 V** build — they are NOT interchangeable. Using a 12 V PSU on a 5 V / 7.4 V arm (or vice-versa) will trip this error; confirm your motor variant before powering up.
->
-> Most "timeout" errors are physical, not code.
-
-**4.6 Record a dataset** — keys: **→** next, **←** redo, **ESC** finish & upload.
-
-```bash
-HF_USER=$(NO_COLOR=1 hf auth whoami | awk -F': *' 'NR==1 {print $2}')
-
-lerobot-record \
-  --robot.type=so101_follower --robot.port=<FOLLOWER_PORT> --robot.id=my_follower \
-  --teleop.type=so101_leader  --teleop.port=<LEADER_PORT>  --teleop.id=my_leader \
-  --robot.cameras="{ front: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
-  --dataset.repo_id=${HF_USER}/my_task \
-  --dataset.single_task="<describe the task in one sentence>" \
-  --dataset.num_episodes=50 \
-  --dataset.episode_time_s=30 \
-  --dataset.reset_time_s=10 \
-  --display_data=true
-```
-
-**4.7 Visualize** — **always** do this before training. Look for missing frames, camera blur, unreachable targets, inconsistent object positions.
-After upload: https://huggingface.co/spaces/lerobot/visualize_dataset → paste `${HF_USER}/my_task`. Works for **any LeRobot-formatted Hub dataset** — use it to scout other datasets, inspect episode quality, or debug your own data before retraining.
-
-**4.8 Replay an episode** (sanity check)
-
-```bash
-lerobot-replay --robot.type=so101_follower --robot.port=<FOLLOWER_PORT> --robot.id=my_follower \
-  --dataset.repo_id=${HF_USER}/my_task --dataset.episode=0
-```
-
-**4.9 Train** (default: ACT — fastest, lowest memory). Apple silicon: `--policy.device=mps`. See §6/§7 for policy and duration.
-
-```bash
-lerobot-train \
-  --dataset.repo_id=${HF_USER}/my_task \
-  --policy.type=act \
-  --policy.device=cuda \
-  --output_dir=outputs/train/act_my_task \
-  --job_name=act_my_task \
-  --batch_size=8 \
-  --wandb.enable=true \
-  --policy.repo_id=${HF_USER}/act_my_task
-```
-
-**4.10 Evaluate on the real robot** — compare success rate to a teleoperated baseline.
-
-```bash
-lerobot-record \
-  --robot.type=so101_follower --robot.port=<FOLLOWER_PORT> --robot.id=my_follower \
-  --robot.cameras="{ front: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
-  --dataset.repo_id=${HF_USER}/eval_my_task \
-  --dataset.single_task="<same task description as training>" \
-  --dataset.num_episodes=10 \
-  --policy.path=${HF_USER}/act_my_task
-```
-
---
-
-## 5. Data collection tips (beginner → reliable policy)
-
-Good data beats clever models. Adopt these defaults and deviate only with evidence.
-
-### 5.1 Setup & ergonomics
-
- **Fix the rig and cameras** before touching the software. If the rig vibrates or the operator gets frustrated, fix that first — more bad data won't help.
- **Lighting matters more than resolution.** Diffuse, consistent light. Avoid moving shadows.
- **"Can you do the task from the camera view alone?"** If no, your cameras are wrong. Fix before recording.
- Enable **action interpolation** for rollouts when available for smoother trajectories.
-
-### 5.2 Practice before you record
-
- Do 5–10 demos without recording. Build a deliberate, repeatable strategy.
- Hesitant or inconsistent demos teach the model hesitation.
-
-### 5.3 Quality over speed
-
-Deliberate, high-quality execution beats fast sloppy runs. Optimize for speed only **after** strategy is dialed in — never trade quality for it.
-
-### 5.4 Consistency within and across episodes
-
-Same grasp, approach vector, and timing. Coherent strategies are much easier to learn than wildly varying movements.
-
-### 5.5 Start small, then extend (the golden rule)
-
- **First 50 episodes = constrained version** of the task: one object, fixed position, fixed camera setup, one operator.
- Train a quick ACT model. See what fails.
- **Then add diversity** along one axis at a time: more positions → more lighting → more objects → more operators.
- Don't try to collect the "perfect dataset" on day one. Iterate.
-
-### 5.6 Policy choice for beginners
-
- **Laptop / first time / want results fast → ACT.** Works surprisingly well, trains fast even on a laptop GPU.
- **Bigger GPU / language-conditioned / multi-task → SmolVLA.** Unfreezing the vision encoder (see §7) is a big win here.
- Defer π0 / π0.5 / Wall-X / X-VLA until you have a proven ACT baseline and a 20+ GB GPU.
-
-### 5.7 Recommended defaults for your first task
-
-| Setting          | Value                                                                                                                                                 |
-| ---------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- |
-| Episodes         | **50** to start, scale to 100–300 after first training                                                                                                |
-| Episode length   | 20–45 s (shorter is fine for grasp/place)                                                                                                             |
-| Reset time       | 10 s                                                                                                                                                  |
-| FPS              | 30                                                                                                                                                    |
-| Cameras          | **2 cameras recommended**: 1 fixed front + 1 wrist. Multi-view often outperforms single-view. A single fixed camera also works to keep things simple. |
-| Task description | Short, specific, action-phrased sentence                                                                                                              |
-
-### 5.8 Troubleshooting signal
-
- Policy fails at one specific stage → record 10–20 more episodes **targeting that stage**.
- Policy flaps / oscillates → likely inconsistent demos, or need more training; re-record worst episodes (use **←** to redo).
- Policy ignores the object → camera framing or lighting issue, not a model issue.
-
-See also: [What makes a good dataset](https://huggingface.co/blog/lerobot-datasets#what-makes-a-good-dataset).
-
---
-
-## 6. Which policy should I train?
-
-Match the policy to the user's **GPU memory** and **time budget**. Numbers below come from an internal profiling run (one training update per policy). They are **indicative only** — see caveats.
-
-### 6.1 Profiling snapshot (indicative)
-
-All policies typically train for **5–10 epochs** (see §7).
-
-| Policy      | Batch | Update (ms) | Peak GPU mem (GB) | Best for                                                                                         |
-| ----------- | ----: | ----------: | ----------------: | ------------------------------------------------------------------------------------------------ |
-| `act`       |     4 |    **83.9** |          **0.94** | First-time users, laptops, single-task. Fast and reliable.                                       |
-| `diffusion` |     4 |       168.6 |              4.94 | Multi-modal action distributions; needs mid-range GPU.                                           |
-| `smolvla`   |     1 |       357.8 |              3.93 | Language-conditioned, multi-task, small VLA. **Unfreeze vision encoder for big gains** (see §7). |
-| `xvla`      |     1 |       731.6 |             15.52 | Large VLA, multi-task.                                                                           |
-| `wall_x`    |     1 |       716.5 |             15.95 | Large VLA with world-model objective.                                                            |
-| `pi0`       |     1 |       940.3 |             15.50 | Strong large VLA baseline (Physical Intelligence).                                               |
-| `pi05`      |     1 |      1055.8 |             16.35 | Newer π policy; similar footprint to `pi0`.                                                      |
-
-**Critical caveats:**
-
- **Optimizer:** measured with **SGD**. LeRobot's default is **AdamW**, which keeps extra optimizer state → **peak memory will be noticeably higher** with the default, especially for `pi0`, `pi05`, `wall_x`, `xvla`.
- **Batch size:** the large policies were profiled at batch 1. In practice use a **larger batch** for stable training (see §7.4). Memory scales roughly linearly with batch.
-
-### 6.2 Decision rules
-
- **< 8 GB VRAM (laptop, 3060, M-series Mac):** → `act`. Maybe `diffusion` if you have ~6–8 GB free.
- **12–16 GB VRAM (4070/4080, A4000):** → `smolvla` with defaults, or `act`/`diffusion` with larger batch. `pi0`/`pi05`/`wall_x`/`xvla` feasible only with small batch + gradient accumulation.
- **24+ GB VRAM (3090/4090/A5000):** → any policy. Prefer `smolvla` (unfrozen) for multi-task; `act` for single-task grasp-and-place (still often the best ROI). Could experiment with `pi0` or `pi05` or `xvla`
- **80 GB (A100/H100):** → any, with healthy batch. `pi05`, `xvla`, `wall_x` become comfortable.
- **CPU only:** → don't train here. Use Google Colab (see [`docs/source/notebooks.mdx`](./docs/source/notebooks.mdx)) or a rented GPU.
-
---
-
-## 7. How long should I train?
-
-Robotics imitation learning usually converges in a **few epochs over the dataset**, not hundreds of thousands of raw steps. Think **epochs first**, then translate to steps.
-
-### 7.1 Rule of thumb
-
- **Typical total: 5–10 epochs.** Start at 5, eval, then decide if more helps.
- Very small datasets (< 30 episodes) may want slightly more epochs — but first, **collect more data**.
- VLAs with a pretrained vision backbone typically need **fewer** epochs than training from scratch.
-
-### 7.2 Steps ↔ epochs conversion
-
-```
-total_frames     = sum of frames over all episodes      # e.g. 50 eps × 30 fps × 30 s ≈ 45,000
-steps_per_epoch  = ceil(total_frames / batch_size)
-total_steps      = epochs × steps_per_epoch
-```
-
-Examples for `--batch_size=8`:
-
-| Dataset size            |  Frames | Steps / epoch | 5 epochs | 10 epochs |
-| ----------------------- | ------: | ------------: | -------: | --------: |
-| 50 eps × 30 s @ 30 fps  |  45,000 |        ~5,625 |      28k |       56k |
-| 100 eps × 30 s @ 30 fps |  90,000 |       ~11,250 |      56k |      113k |
-| 300 eps × 30 s @ 30 fps | 270,000 |       ~33,750 |     169k |      338k |
-
-Pass the resulting total with `--steps=<N>`; eval at intermediate checkpoints (`outputs/train/.../checkpoints/`).
-
-### 7.3 Per-policy starting points (single-task, ~50 episodes)
-
-| Policy         | Batch | Steps (first run) | Notes                                                             |
-| -------------- | ----: | ----------------: | ----------------------------------------------------------------- |
-| `act`          |  8–16 |           30k–80k | Usually converges under 50k for single-task.                      |
-| `diffusion`    |  8–16 |          80k–150k | Benefits from longer training than ACT.                           |
-| `smolvla`      |   4–8 |           30k–80k | Pretrained VLM → converges fast.                                  |
-| `pi0` / `pi05` |   1–4 |           30k–80k | Memory-bound; use gradient accumulation for effective batch ≥ 16! |
-
-### 7.4 Batch size guidance
-
- **Bigger batch is preferable** for stable gradients on teleop data.
- If GPU memory is the bottleneck, use **gradient accumulation** to raise _effective_ batch without raising peak memory.
- Scale **learning rate** gently with batch; most LeRobot defaults work fine for a 2–4× batch change.
-
-### 7.5 Scale LR schedule & checkpoints with `--steps`
-
-LeRobot's default schedulers (e.g. SmolVLA's cosine decay) use `scheduler_decay_steps=30_000`, which is sized for long training runs. When you shorten training (e.g. 5k–10k steps on a small dataset), **scale the scheduler down to match** — otherwise the LR stays near the peak and never decays. Same for checkpoint frequency.
-
-```bash
-lerobot-train ... \
-  --steps=5000 \
-  --policy.scheduler_decay_steps=5000 \
-  --save_freq=5000
-```
-
-Rule of thumb: set `scheduler_decay_steps ≈ steps`, and `save_freq` to whatever granularity you want for eval (e.g. every 1k–5k steps). Match `scheduler_warmup_steps` proportionally if your run is very short.
-
-### 7.6 SmolVLA: unfreeze the vision encoder for real gains
-
-SmolVLA ships with `freeze_vision_encoder=True`. Unfreezing usually **improves performance substantially** on specialized tasks, at the cost of more VRAM and slower steps. Enable with:
-
-```bash
-lerobot-train ... --policy.type=smolvla \
-  --policy.freeze_vision_encoder=false \
-  --policy.train_expert_only=false
-```
-
-### 7.7 Signals to stop / keep going
-
- Train loss plateaus → stop, save a Hub checkpoint.
- Train loss still dropping and you're under 10 epochs → keep going.
-
---
-
-## 8. Evaluation & benchmarks
-
-Two flavors of evaluation:
-
-### 8.1 Real-robot eval (SO-101, etc.)
-
-Reuse `lerobot-record` with `--policy.path` to run the trained policy on-robot and save the run as an eval dataset. Convention: prefix the dataset with `eval_`.
-
-```bash
-lerobot-record \
-  --robot.type=so101_follower --robot.port=<FOLLOWER_PORT> --robot.id=my_follower \
-  --robot.cameras="{ front: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
-  --dataset.repo_id=${HF_USER}/eval_my_task \
-  --dataset.single_task="<same task description used during training>" \
-  --dataset.num_episodes=10 \
-  --policy.path=${HF_USER}/act_my_task
-```
-
-Report success rate across episodes. Compare to a teleoperated baseline and to an earlier checkpoint to catch regressions.
-
-### 8.2 Sim-benchmark eval
-
-For policies trained on sim datasets (PushT, Aloha, LIBERO, MetaWorld, RoboCasa, …) use `lerobot-eval` against the matching `env.type`:
-
-```bash
-lerobot-eval \
-  --policy.path=${HF_USER}/diffusion_pusht \
-  --env.type=pusht \
-  --eval.n_episodes=50 \
-  --eval.batch_size=10 \
-  --policy.device=cuda
-```
-
- Use `--policy.path=outputs/train/.../checkpoints/<step>/pretrained_model` for local checkpoints.
- `--eval.n_episodes` should be ≥ 50 for a stable success-rate estimate.
- Available envs live in `src/lerobot/envs/`. See [`docs/source/libero.mdx`](./docs/source/libero.mdx), [`metaworld.mdx`](./docs/source/metaworld.mdx), [`robocasa.mdx`](./docs/source/robocasa.mdx), [`vlabench.mdx`](./docs/source/vlabench.mdx) for specific benchmarks.
- To add a new benchmark, see [`docs/source/adding_benchmarks.mdx`](./docs/source/adding_benchmarks.mdx) and [`envhub.mdx`](./docs/source/envhub.mdx).
-
-### 8.2b Dockerfiles for benchmark eval
-
-Benchmark envs have native dependencies that are painful to install locally. The repo ships **pre-baked Dockerfiles** for each supported benchmark — use these to run `lerobot-eval` in a reproducible environment:
-
-| Benchmark   | Dockerfile                                                                             |
-| ----------- | -------------------------------------------------------------------------------------- |
-| LIBERO      | [`docker/Dockerfile.benchmark.libero`](./docker/Dockerfile.benchmark.libero)           |
-| LIBERO+     | [`docker/Dockerfile.benchmark.libero_plus`](./docker/Dockerfile.benchmark.libero_plus) |
-| MetaWorld   | [`docker/Dockerfile.benchmark.metaworld`](./docker/Dockerfile.benchmark.metaworld)     |
-| RoboCasa    | [`docker/Dockerfile.benchmark.robocasa`](./docker/Dockerfile.benchmark.robocasa)       |
-| RoboCerebra | [`docker/Dockerfile.benchmark.robocerebra`](./docker/Dockerfile.benchmark.robocerebra) |
-| RoboMME     | [`docker/Dockerfile.benchmark.robomme`](./docker/Dockerfile.benchmark.robomme)         |
-| RoboTwin    | [`docker/Dockerfile.benchmark.robotwin`](./docker/Dockerfile.benchmark.robotwin)       |
-| VLABench    | [`docker/Dockerfile.benchmark.vlabench`](./docker/Dockerfile.benchmark.vlabench)       |
-
-Build and run (adapt to your benchmark):
-
-```bash
-docker build -f docker/Dockerfile.benchmark.robomme -t lerobot-bench-robomme .
-docker run --gpus all --rm -it \
-  -v $HOME/.cache/huggingface:/root/.cache/huggingface \
-  lerobot-bench-robomme \
-  lerobot-eval --policy.path=<your_policy> --env.type=<env> --eval.n_episodes=50
-```
-
-See [`docker/README.md`](./docker/README.md) for base-image details.
-
-### 8.3 Target success rates
-
-Single-task grasp-and-place with 50 clean episodes: ACT should reach **> 70% success** on the training configuration. Less → data problem (see §5), not model problem. Expect a drop when generalizing to new positions — scale episodes or diversity to recover.
-
---
-
-## 9. Further reading & resources
-
- **Getting started:** [`installation.mdx`](./docs/source/installation.mdx) · [`il_robots.mdx`](./docs/source/il_robots.mdx) · [What makes a good dataset](https://huggingface.co/blog/lerobot-datasets)
- **Per-policy docs:** browse [`docs/source/*.mdx`](./docs/source/) (policies, hardware, benchmarks, advanced training).
- **Community:** [Discord](https://discord.com/invite/s3KuuzsPFb) · [Hub `LeRobot` tag](https://huggingface.co/datasets?other=LeRobot) · [Dataset visualizer](https://huggingface.co/spaces/lerobot/visualize_dataset)
-
-> Keep this file current. If you learn a rule that would prevent a class of user mistakes, add it here and in [`AGENTS.md`](./AGENTS.md).
@@ -178,3 +178,9 @@ test-smolvla-ete-eval:
 		--env.episode_length=5 \
 		--eval.n_episodes=1 \
 		--eval.batch_size=1
+
+# E2E annotation pipeline smoke test against a tiny in-memory fixture
+# dataset. Opt-in (not part of `make test-end-to-end`) and uses a stub VLM
+# backend, so it does not require a real model checkpoint or GPU.
+annotation-e2e:
+	uv run python -m tests.annotations.run_e2e_smoke
@@ -31,8 +31,12 @@
    title: Porting Large Datasets
  - local: using_dataset_tools
    title: Using the Dataset Tools
-  - local: dataset_subtask
-    title: Using Subtasks in the Dataset
+  - local: language_and_recipes
+    title: Language Columns and Recipes
+  - local: tools
+    title: Tools
+  - local: annotation_pipeline
+    title: Annotation Pipeline
  - local: streaming_video_encoding
    title: Streaming Video Encoding
  title: "Datasets"
@@ -0,0 +1,161 @@
+# Annotation Pipeline
+
+`lerobot-annotate` populates the two language columns introduced by the
+[Language Columns and Recipes](./language_and_recipes) page —
+`language_persistent` and `language_events` — directly into
+`data/chunk-*/file-*.parquet`. There is no flavor namespace and no sidecar
+file tree: multiple revisions of a dataset mean multiple dataset copies.
+
+## What the pipeline produces
+
+Three modules write into a per-episode staging tree, then a single writer
+rewrites the data shards in place:
+
+| Style / atom                                | Column                | Module   |
+| ------------------------------------------- | --------------------- | -------- |
+| `subtask` (Pi0.7-style "how, not what")     | `language_persistent` | Module 1 |
+| `plan` (initial + refresh on interjection)  | `language_persistent` | Module 1 |
+| `memory` (MEM-style compression)            | `language_persistent` | Module 1 |
+| `interjection`                              | `language_events`     | Module 2 |
+| speech tool-call atom (`style=null`, `say`) | `language_events`     | Module 2 |
+| `vqa` (user / assistant pair)               | `language_events`     | Module 3 |
+
+The writer drops the legacy `subtask_index` column. It does **not** add a
+`tools` column to the parquet — the tool catalog lives at
+`meta/info.json["tools"]` instead (see [Tools](./tools)). After every
+annotation run the pipeline ensures the canonical `say` schema is
+present in that list, preserving any tools the user pre-declared. Chat-
+template consumers read the catalog through
+`LeRobotDatasetMetadata.tools` and pass it to
+`apply_chat_template(messages, tools=meta.tools, ...)`.
+
+If you want to declare additional tools for a dataset before annotation
+runs, edit `meta/info.json["tools"]` directly — the pipeline preserves
+anything already there. Implementations of those tools live under
+`src/lerobot/tools/`; one file per tool, registered via
+`TOOL_REGISTRY`. See the [Tools](./tools) doc for the authoring guide.
+
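The "ensure `say` is present, preserve everything else" behaviour can be sketched as an idempotent merge. This is an illustration under assumptions, not the pipeline's code: `ensure_say_tool` is a hypothetical helper, and the `SAY_TOOL` schema below (description and parameter layout) is a guess at a typical function-calling schema built from the `say` / `text` names used in this page.

```python
import copy

# Assumed shape of the canonical schema; the real definition lives in the repo.
SAY_TOOL = {
    "type": "function",
    "function": {
        "name": "say",
        "description": "Speak a short utterance to the user.",
        "parameters": {
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
    },
}


def ensure_say_tool(info: dict) -> dict:
    """Return a copy of `info` whose "tools" list contains a `say` entry."""
    out = copy.deepcopy(info)
    tools = out.setdefault("tools", [])
    names = {tool.get("function", {}).get("name") for tool in tools}
    if "say" not in names:
        # Append rather than replace so pre-declared tools keep their order.
        tools.append(copy.deepcopy(SAY_TOOL))
    return out


info = {"tools": [{"type": "function", "function": {"name": "grasp"}}]}
merged = ensure_say_tool(info)
```

Running the helper twice yields the same result as running it once, which is the property a post-annotation hook needs.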
+## How to run it locally or on SLURM
+
+Install the extra and invoke the console script:
+
+```bash
+uv sync --extra annotations
+uv run lerobot-annotate \
+  --repo_id=imstevenpmwork/super_poulain_draft \
+  --vlm.backend=vllm \
+  --vlm.model_id=Qwen/Qwen3.6-27B-FP8 \
+  --vlm.tensor_parallel_size=2
+```
+
+The pipeline attaches actual camera footage to every Module 1/2/3 prompt
+by default, decoded from the dataset's first `observation.images.*`
+stream. Override with `--vlm.camera_key=observation.images.<name>` to
+pin a specific viewpoint. Datasets with no video tracks fall back to
+text-only prompts automatically.
+
+**Module 1 sees the whole episode as one video block.** Subtask
+decomposition gets a `{"type":"video", "video":[<frames>]}` block
+covering the entire demonstration; Qwen-VL pools temporally on its own
+and decides where to cut. There is no keyframe stride or count knob —
+`--module_1.max_video_frames` (default 32) only caps the frames packed
+into the video block as a model-capacity bound. Module 2 attaches a
+single still frame at the interjection timestamp; Module 3 attaches the
+exact emission frame to each VQA pair.
+
+The executor picks `LocalPipelineExecutor` for small datasets and
+`SlurmPipelineExecutor` for large ones based on
+`--executor.auto_threshold` (default 32 episodes). Force local with
+`--executor.force_local=true`. SLURM jobs honour `--executor.slurm_partition`,
+`--executor.slurm_gpus`, and `--executor.slurm_time`.
+
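The executor choice can be pictured as a small decision function. The sketch below is an assumption about the logic, not the actual implementation: `pick_executor` is a hypothetical name, and treating a dataset with exactly `auto_threshold` episodes as "small" is a guess about the boundary.

```python
def pick_executor(
    num_episodes: int,
    auto_threshold: int = 32,
    force_local: bool = False,
) -> str:
    """Return the name of the executor class a run would use."""
    # Assumption: the threshold is inclusive on the local side.
    if force_local or num_episodes <= auto_threshold:
        return "LocalPipelineExecutor"
    return "SlurmPipelineExecutor"


small = pick_executor(num_episodes=10)
large = pick_executor(num_episodes=500)
forced = pick_executor(num_episodes=500, force_local=True)
```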
+## Style-to-recipe consumer mapping
+
+The pipeline produces exactly the styles consumed by
+`src/lerobot/configs/recipes/pi05_hirobot.yaml`:
+
+- `low_level_execution`, `high_level_subtask`, `memory_update` consume
+  `subtask`/`plan`/`memory` from `language_persistent`.
+- `user_interjection_response` consumes `interjection` events plus the
+  paired speech atom (merged into one assistant target turn via
+  `tool_calls_from`) and the same-timestamp `plan` refresh.
+- `ask_vqa` consumes the `(vqa, user)` and `(vqa, assistant)` pairs from
+  `language_events`.
+
+## Why the design is scoped to the canonical recipe
+
+Two things drive the scope:
+
+1. **Persistent state vs exact-event split.** Persistent rows (`subtask`,
+   `plan`, `memory`) broadcast per episode and answer "what state is in
+   force at this frame?". Event rows (`interjection`, `vqa`, speech) only
+   appear on the exact frame whose timestamp matches the emission. The
+   pipeline writes timestamps taken straight from the source parquet — no
+   floating-point recomputation.
+2. **One Qwen-VL pass.** All three modules share a single VLM client
+   (vLLM if available, transformers fallback) so the cost is one model
+   load per dataset, not three.
+
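The persistent/event split in point 1 can be sketched with two lookups. The row representation here, `(timestamp, text)` tuples, and both function names are hypothetical, and "latest row at or before the frame" is an assumed reading of "state in force at this frame".

```python
from bisect import bisect_right


def persistent_at(rows: list[tuple[float, str]], t: float):
    """Latest persistent row whose timestamp is <= t (rows sorted by time)."""
    times = [ts for ts, _ in rows]
    i = bisect_right(times, t)
    return rows[i - 1][1] if i else None


def events_at(rows: list[tuple[float, str]], t: float) -> list[str]:
    """Event rows whose timestamp equals the frame timestamp exactly."""
    # Exact float equality is only safe because timestamps are copied from
    # the source parquet rather than recomputed.
    return [text for ts, text in rows if ts == t]


subtasks = [(0.0, "reach for the cup"), (2.0, "grasp the cup")]
events = [(2.0, "interjection: use the other cup")]

state = persistent_at(subtasks, 1.5)
hits = events_at(events, 2.0)
misses = events_at(events, 2.0333)
```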
+## Module independence and staged reruns
+
+Each module writes its raw output to
+`<root>/.annotate_staging/episode_{N:06d}/<module>.jsonl`. That makes
+prompt iteration cheap — re-running one module overwrites only its own
+JSONL file before the writer composes the final parquet. Modules can be
+disabled via `--module_1.enabled=false` (and similarly for 2 and 3) to
+test them in isolation.
+
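The staging layout expands as in the sketch below. `staging_path` is a hypothetical helper and `"module_1"` is an assumed file stem; only the directory pattern comes from the text above.

```python
from pathlib import Path


def staging_path(root: str, episode_index: int, module: str) -> Path:
    """Build <root>/.annotate_staging/episode_{N:06d}/<module>.jsonl."""
    episode_dir = f"episode_{episode_index:06d}"  # zero-padded to six digits
    return Path(root) / ".annotate_staging" / episode_dir / f"{module}.jsonl"


path = staging_path("/data/my_dataset", 7, "module_1")
```

Because each module owns exactly one file per episode, a rerun of one module can overwrite its file without touching the other modules' output.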
+## Validation/report checks before final write
+
+Before the writer runs, `StagingValidator` checks:
+
+- exact frame-timestamp alignment for every event row;
+- no orphan speech / interjection pairs;
+- `plan` is refreshed at every interjection timestamp;
+- `memory` rows fall on subtask boundaries (warning, not error);
+- VQA assistant `content` parses as JSON in one of the
+  bbox / keypoint / count / attribute / spatial shapes;
+- every row routes to the column dictated by `column_for_style(style)`.
+
+Errors abort the writer (`--skip_validation=true` overrides for debugging).
+
+## Paper inspirations per module
+
+- **Module 1 — subtasks.** Hi Robot ([Shi 2025](https://arxiv.org/abs/2502.19417))
+  atom granularity ("pick up one piece of lettuce", "place bowl to box");
+  Pi0.7 ([Physical Intelligence 2025](https://pi.website/pi07)) "how, not
+  what" detail.
+- **Module 1 — memory.** MEM ([Torne 2026](https://arxiv.org/abs/2603.03596))
+  compression directive: keep only minimal relevant information; functional
+  outcomes preserved, specific attributes dropped.
+- **Module 2 — interjections.** Hi Robot scenario taxonomy: negative task,
+  situated correction, specific constraint, preference. Speech is a
+  tool-call-only atom
+  (`tool_calls=[{type:function, function:{name:"say", arguments:{text:...}}}]`).
+- **Module 3 — VQA.** ECoT ([Zawalski 2024](https://arxiv.org/abs/2407.08693))
+  grounded features (bounding boxes in pixel `[x_min, y_min, x_max, y_max]`,
+  keypoints) and Steerable Policies' multi-abstraction grounding.
+
+Future maintainers should adjust the prompt templates in
+`src/lerobot/annotations/steerable_pipeline/prompts/` against these
+references rather than rewriting from scratch.
+
+## Compute and list-size estimates
+
+Per episode, the pipeline issues O(`max_steps`) Module 1 calls,
+O(`max_interjections_per_episode`) Module 2 calls, and
+O(`vqa_emission_hz × episode_seconds`) Module 3 calls. With defaults
+(8 subtasks, 1 interjection, 1 Hz × 3 pairs) and 30-second episodes, that
+is ~50 VLM calls per episode. `language_persistent` per episode is tens of
+KB at most (parquet dictionary-encodes one entry per episode);
+`language_events` is empty on most frames and is bounded by the number of
+emissions, not `num_frames × num_emissions`.
+
+## Reproducibility via seed and prompt hashes
+
+`--seed` (default 1729) feeds the per-episode RNGs that select interjection
+timestamps and VQA question types. Combined with the deterministic prompt
+templates checked into `prompts/`, two runs at the same seed against the
+same dataset and the same model checkpoint produce byte-identical staging
+artifacts. Prompt edits are recorded by file hash; future tooling can pin
+expected `(seed, prompt_hash)` pairs into the dataset card.
@@ -1,277 +0,0 @@
-# Using Subtasks in LeRobot Datasets
-
-Subtask support in robotics datasets has proven effective in improving robot reasoning and understanding. Subtasks are particularly useful for:
-
- **Hierarchical policies**: Building policies that include subtask predictions to visualize robot reasoning in real time
- **Reward modeling**: Helping reward models understand task progression (e.g., SARM-style stage-aware reward models)
- **Task decomposition**: Breaking down complex manipulation tasks into atomic, interpretable steps
-
-LeRobotDataset now supports subtasks as part of its dataset structure, alongside tasks.
-
-## What are Subtasks?
-
-While a **task** describes the overall goal (e.g., "Pick up the apple and place it in the basket"), **subtasks** break down the execution into finer-grained steps:
-
-1. "Approach the apple"
-2. "Grasp the apple"
-3. "Lift the apple"
-4. "Move to basket"
-5. "Release the apple"
-
-Each frame in the dataset can be annotated with its corresponding subtask, enabling models to learn and predict these intermediate stages.
-
-<img
-  src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/subtask-asset.png"
-  alt="An overview of subtask annotation showing how frames are labeled with intermediate subtask stages"
-  width="80%"
-/>
-
-<p>
-  <em>Figure: Overview of subtask annotation.</em>
-</p>
-
-**Reference:** _Subtask-learning based for robot self-assembly in flexible collaborative assembly in manufacturing_, Original Article, Published: 19 April 2022.
-
-## Dataset Structure
-
-Subtask information is stored in the dataset metadata:
-
-```
-my-dataset/
-├── data/
-│   └── ...
-├── meta/
-│   ├── info.json
-│   ├── stats.json
-│   ├── tasks.parquet
-│   ├── subtasks.parquet      # Subtask index → subtask string mapping
-│   └── episodes/
-│       └── ...
-└── videos/
-    └── ...
-```
-
-### Subtasks Parquet File
-
-The `meta/subtasks.parquet` file maps subtask indices to their natural language descriptions:
-
-| subtask_index | subtask (index column) |
-| ------------- | ---------------------- |
-| 0             | "Approach the apple"   |
-| 1             | "Grasp the apple"      |
-| 2             | "Lift the apple"       |
-| ...           | ...                    |
-
-### Frame-Level Annotations
-
-Each frame in the dataset can include a `subtask_index` field that references the subtasks parquet file:
-
-```python
-# Example frame data in the parquet file
-{
-    "index": 42,
-    "timestamp": 1.4,
-    "episode_index": 0,
-    "task_index": 0,
-    "subtask_index": 2,  # References "Lift the apple"
-    "observation.state": [...],
-    "action": [...],
-}
-```
-
-## Annotating Datasets with Subtasks
-
-We provide a HuggingFace Space for easily annotating any LeRobotDataset with subtasks:
-
-**[https://huggingface.co/spaces/lerobot/annotate](https://huggingface.co/spaces/lerobot/annotate)**
-
-After completing your annotation:
-
-1. Click "Push to Hub" to upload your annotated dataset
-2. You can also run the annotation space locally by following the instructions at [github.com/huggingface/lerobot-annotate](https://github.com/huggingface/lerobot-annotate)
-
-## Loading Datasets with Subtasks
-
-When you load a dataset with subtask annotations, the subtask information is automatically available:
-
-```python
-from lerobot.datasets import LeRobotDataset
-
-# Load a dataset with subtask annotations
-dataset = LeRobotDataset("jadechoghari/collect-fruit-annotated")
-
-# Access a sample
-sample = dataset[100]
-
-# The sample includes both task and subtask information
-print(sample["task"])        # "Collect the fruit"
-print(sample["subtask"])     # "Grasp the apple"
-print(sample["task_index"])  # tensor(0)
-print(sample["subtask_index"])  # tensor(2)
-```
-
-### Checking for Subtask Support
-
-You can check if a dataset has subtask annotations:
-
-```python
-# Check if subtasks are available
-has_subtasks = (
-    "subtask_index" in dataset.features
-    and dataset.meta.subtasks is not None
-)
-
-if has_subtasks:
-    print(f"Dataset has {len(dataset.meta.subtasks)} unique subtasks")
-    print("Subtasks:", list(dataset.meta.subtasks.index))
-```
-
-## Using Subtasks for Training
-
-### With the Tokenizer Processor
-
-The `TokenizerProcessor` automatically handles subtask tokenization for Vision-Language Action (VLA) models:
-
-```python
-from lerobot.processor import TokenizerProcessorStep
-
-# Create a tokenizer processor step
-tokenizer_processor = TokenizerProcessorStep(
-    tokenizer_name_or_path="google/paligemma-3b-pt-224",
-    padding="max_length",
-    max_length=64,
-)
-
-# The processor will automatically tokenize subtasks if present in the batch
-# and add them to the observation under:
-# - "observation.subtask.tokens"
-# - "observation.subtask.attention_mask"
-```
-
-When subtasks are available in the batch, the tokenizer processor adds:
-
- `observation.subtask.tokens`: Tokenized subtask text
- `observation.subtask.attention_mask`: Attention mask for the subtask tokens
-
-### DataLoader with Subtasks
-
-```python
-import torch
-from lerobot.datasets import LeRobotDataset
-
-dataset = LeRobotDataset("jadechoghari/collect-fruit-annotated")
-
-dataloader = torch.utils.data.DataLoader(
-    dataset,
-    batch_size=16,
-    shuffle=True,
-)
-
-for batch in dataloader:
-    # Access subtask information in the batch
-    subtasks = batch["subtask"]  # List of subtask strings
-    subtask_indices = batch["subtask_index"]  # Tensor of subtask indices
-
-    # Use for training hierarchical policies or reward models
-    print(f"Batch subtasks: {set(subtasks)}")
-```
-
-## Example Datasets with Subtask Annotations
-
-Try loading a dataset with subtask annotations:
-
-```python
-from lerobot.datasets import LeRobotDataset
-
-# Example dataset with subtask annotations
-dataset = LeRobotDataset("jadechoghari/collect-fruit-annotated")
-
-# Explore the subtasks
-print("Available subtasks:")
-for subtask_name in dataset.meta.subtasks.index:
-    print(f"  - {subtask_name}")
-
-# Get subtask distribution
-subtask_counts = {}
-for i in range(len(dataset)):
-    sample = dataset[i]
-    subtask = sample["subtask"]
-    subtask_counts[subtask] = subtask_counts.get(subtask, 0) + 1
-
-print("\nSubtask distribution:")
-for subtask, count in sorted(subtask_counts.items(), key=lambda x: -x[1]):
-    print(f"  {subtask}: {count} frames")
-```
-
-## Use Cases
-
-### 1. Hierarchical Policy Training
-
-Train policies that predict both actions and current subtask:
-
-```python
-class HierarchicalPolicy(nn.Module):
-    def __init__(self, num_subtasks):
-        super().__init__()
-        self.action_head = nn.Linear(hidden_dim, action_dim)
-        self.subtask_head = nn.Linear(hidden_dim, num_subtasks)
-
-    def forward(self, observations):
-        features = self.encoder(observations)
-        actions = self.action_head(features)
-        subtask_logits = self.subtask_head(features)
-        return actions, subtask_logits
-```
-
-### 2. Stage-Aware Reward Modeling (SARM)
-
-Build reward models that understand task progression:
-
-```python
-# SARM predicts:
-# - Stage: Which subtask is being executed (discrete)
-# - Progress: How far along the subtask (continuous 0-1)
-
-class SARMRewardModel(nn.Module):
-    def forward(self, observations):
-        features = self.encoder(observations)
-        stage_logits = self.stage_classifier(features)
-        progress = self.progress_regressor(features)
-        return stage_logits, progress
-```
-
-### 3. Progress Visualization
-
-Monitor robot execution by tracking subtask progression:
-
-```python
-def visualize_execution(model, observations):
-    for t, obs in enumerate(observations):
-        action, subtask_logits = model(obs)
-        predicted_subtask = subtask_names[subtask_logits.argmax()]
-        print(f"t={t}: Executing '{predicted_subtask}'")
-```
-
-## API Reference
-
-### LeRobotDataset Properties
-
-| Property                    | Type                   | Description                                |
-| --------------------------- | ---------------------- | ------------------------------------------ |
-| `meta.subtasks`             | `pd.DataFrame \| None` | DataFrame mapping subtask names to indices |
-| `features["subtask_index"]` | `dict`                 | Feature spec for subtask_index if present  |
-
-### Sample Keys
-
-When subtasks are available, each sample includes:
-
-| Key             | Type           | Description                          |
-| --------------- | -------------- | ------------------------------------ |
-| `subtask_index` | `torch.Tensor` | Integer index of the current subtask |
-| `subtask`       | `str`          | Natural language subtask description |
-
-## Related Resources
-
- [SARM Paper](https://arxiv.org/pdf/2509.25358) - Stage-Aware Reward Modeling for Long Horizon Robot Manipulation
- [LeRobot Annotate Space](https://huggingface.co/spaces/lerobot/annotate) - Interactive annotation tool
- [LeRobotDataset v3.0](./lerobot-dataset-v3) - Dataset format documentation
@@ -0,0 +1,109 @@
+# Language columns and recipes
+
+LeRobot stores reusable language annotations directly next to frame data in `data/chunk-*/file-*.parquet`.
+The two optional columns are:
+
+- `language_persistent`: a list of rows broadcast across every frame in an episode for state that remains active, such as `subtask`, `plan`, and `memory`.
+- `language_events`: a list of rows present only on the exact frame where an event was emitted, such as `interjection`, `vqa`, and speech tool calls.
+
+Both columns share the same row shape (event rows omit `timestamp` because the
+frame the row sits on already provides it):
+
+```text
+role: string
+content: string | null
+style: string | null
+timestamp: float64        # persistent rows only
+camera: string | null     # observation.images.* feature key, view-dependent rows only
+tool_calls: list[Json] | null
+```
+
+The `camera` field tags rows whose `content` is grounded in a specific camera
+view. Rows of view-dependent styles (`vqa`, and the reserved `motion` /
+`trace`) MUST set `camera` to the matching `observation.images.*` feature key.
+Rows of every other style MUST leave `camera` as `null`. Pipeline writers and
+the validator enforce this via `validate_camera_field(style, camera)`.
+
+`meta/tasks.parquet` remains the canonical source for the task. The special `${task}` recipe binding always reads that task string and does not depend on language annotations.
+
+## Architecture
+
+The language stack has three layers:
+
+1. `lerobot.datasets.language` defines the schema, style registry, and `column_for_style`.
+2. `lerobot.datasets.language_render` resolves rows and renders messages.
+3. `RenderMessagesStep` turns dataset samples into `messages`, `message_streams`, and `target_message_indices`.
+
+`LeRobotDataset` stays recipe-agnostic. It passes `language_persistent` and `language_events` through when present, and unannotated datasets keep their existing behavior.
+
+## Temporal semantics
+
+Persistent styles are active after emission until replaced:
+
+- `active_at(t, style=subtask)`
+- `nth_prev(style=memory, offset=1)`
+- `nth_next(style=subtask, offset=1)`
+
+Event styles only exist on their exact timestamp:
+
+- `emitted_at(t, style=interjection)`
+- `emitted_at(t, style=vqa, role=user, camera=observation.images.top)`
+- `emitted_at(t, role=assistant, tool_name=say)`
+
+Exact event matching has no tolerance window, so writers must stamp event rows with frame timestamps from the parquet data.
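+
+As an illustration, the resolvers listed above could be wired into a recipe's
+`bindings` block as sketched here. The binding names on the left
+(`current_subtask`, `prev_memory`, `next_subtask`, `interjection`) are
+hypothetical; only the resolver expressions on the right come from the lists
+above.
+
+```yaml
+bindings:
+  # Persistent styles: resolved against the state in force around t.
+  current_subtask: "active_at(t, style=subtask)"
+  prev_memory: "nth_prev(style=memory, offset=1)"
+  next_subtask: "nth_next(style=subtask, offset=1)"
+  # Event style: resolves only on the frame whose timestamp matches exactly.
+  interjection: "emitted_at(t, style=interjection)"
+```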
+
+## View-dependent resolution
+
+For view-dependent styles (`vqa`, `motion`, `trace`), the resolver gains a
+`camera=` filter parallel to `role=` and `tool_name=`. Datasets with multiple
+cameras typically emit one (`vqa`, `user`) + (`vqa`, `assistant`) pair per
+camera at the same timestamp; without `camera=`, those resolvers see one
+match per camera and raise an ambiguity error. Recipes consume each camera through its
+own binding plus a matching image block, e.g.
+
+```yaml
+ask_vqa_top:
+  bindings:
+    vqa_query: "emitted_at(t, style=vqa, role=user, camera=observation.images.top)"
+    vqa: "emitted_at(t, style=vqa, role=assistant, camera=observation.images.top)"
+  messages:
+    - role: user
+      stream: high_level
+      if_present: vqa_query
+      content:
+        - { type: image, feature: observation.images.top }
+        - { type: text, text: "${vqa_query}" }
+    - { role: assistant, content: "${vqa}", stream: high_level, target: true, if_present: vqa }
+```
+
+Add one such sub-recipe per camera the dataset records.
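+
+For instance, a dataset that also records a wrist camera could add a second
+sub-recipe mirroring `ask_vqa_top`. This is only a sketch: the
+`ask_vqa_wrist` name and the `observation.images.wrist` key are assumptions,
+so substitute the feature keys your dataset actually declares.
+
+```yaml
+ask_vqa_wrist:
+  bindings:
+    vqa_query: "emitted_at(t, style=vqa, role=user, camera=observation.images.wrist)"
+    vqa: "emitted_at(t, style=vqa, role=assistant, camera=observation.images.wrist)"
+  messages:
+    - role: user
+      stream: high_level
+      if_present: vqa_query
+      content:
+        - { type: image, feature: observation.images.wrist }
+        - { type: text, text: "${vqa_query}" }
+    - { role: assistant, content: "${vqa}", stream: high_level, target: true, if_present: vqa }
+```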
+
+## Recipe anatomy
+
+Recipes are YAML files backed by `TrainingRecipe` and `MessageTurn`.
+
+```yaml
+messages:
+  - { role: user, content: "${task}", stream: high_level }
+  - { role: assistant, content: "${subtask}", stream: low_level, target: true }
+```
+
+Rendered samples use HF-style chat messages plus LeRobot sidecars:
+
+```python
+sample["messages"]
+sample["message_streams"]
+sample["target_message_indices"]
+```
+
+The renderer does not apply a tokenizer chat template. Policy processors decide how to serialize the messages for their backbone.
+
+## Blends
+
+Blend recipes select one weighted sub-recipe deterministically from the sample index.
+The canonical `recipes/pi05_hirobot.yaml` combines memory updates, interjection responses, high-level subtask prediction, low-level execution, and VQA.
+
+## Graceful absence
+
+If both language columns are missing, `None`, or empty, `RenderMessagesStep` is a no-op.
+If an event-scoped branch is selected on a frame without the required event row, rendering returns `None`, allowing a loader to retry another sample.
@@ -0,0 +1,198 @@
+# Tools
+
+LeRobot v3.1 supports **tool calls** in policies — assistant messages can
+emit structured invocations like `say(text="OK, starting now")` that the
+runtime dispatches to a real implementation (TTS, controller, logger, …).
+
+This page covers:
+
+1. Where the tool catalog lives (PR 1).
+2. How the annotation pipeline produces tool-call atoms (PR 2).
+3. How to add your own tool (PR 3).
+
+## Where tools are declared
+
+Two layers.
+
+**The catalog** — a list of OpenAI-style function schemas — lives at
+`meta/info.json["tools"]` on each dataset. Example:
+
+```json
+{
+  "features": { "...": "..." },
+  "tools": [
+    {
+      "type": "function",
+      "function": {
+        "name": "say",
+        "description": "Speak a short utterance to the user via the TTS executor.",
+        "parameters": {
+          "type": "object",
+          "properties": {
+            "text": { "type": "string", "description": "The verbatim text to speak." }
+          },
+          "required": ["text"]
+        }
+      }
+    }
+  ]
+}
+```
+
+Read it via the dataset metadata accessor:
+
+```python
+from lerobot.datasets.dataset_metadata import LeRobotDatasetMetadata
+
+meta = LeRobotDatasetMetadata(repo_id="pepijn/super_poulain_final_annotations")
+tools = meta.tools     # list[dict] — OpenAI tool schemas
+```
+
+If the dataset's `info.json` doesn't declare any tools, `meta.tools`
+returns `DEFAULT_TOOLS` from `lerobot.datasets.language` — currently a
+single-entry list with the canonical `say` schema. So unannotated
+datasets and chat-template consumers keep working without any
+configuration:
+
+```python
+prompt_str = tokenizer.apply_chat_template(
+    sample["messages"],
+    tools=meta.tools,                 # works either way
+    add_generation_prompt=False,
+    tokenize=False,
+)
+```
+
+**The implementations** — runnable Python — live under
+`src/lerobot/tools/`, one file per tool. The `say` implementation
+arrives in PR 3 and wraps Kyutai's pocket-tts model.
+
+## Per-row tool *invocations*
+
+The catalog above describes *what can be called*. The actual *call* — the
+function name plus the argument values — is stored per-row, on the
+assistant atoms in `language_events`:
+
+```json
+{
+  "role": "assistant",
+  "content": null,
+  "style": null,
+  "timestamp": 12.4,
+  "camera": null,
+  "tool_calls": [
+    { "type": "function",
+      "function": { "name": "say", "arguments": { "text": "On it." } } }
+  ]
+}
+```
+
+Recipes splice these into rendered messages via `tool_calls_from`:
+
+```yaml
+user_interjection_response:
+  bindings:
+    speech: "emitted_at(t, role=assistant, tool_name=say)"
+  messages:
+    - { role: user,      content: "${task}",         stream: high_level }
+    - { role: assistant, content: "${current_plan}", stream: high_level,
+        target: true, tool_calls_from: speech }
+```
+
+The model's training target is one assistant turn that carries both the
+plan text *and* the `say` tool call. At inference, the runtime parses
+the generated text back into structured `tool_calls` and dispatches to
+the matching implementation.
+
+## How to add your own tool
+
+Three steps. Concrete example: a `record_observation` tool the policy
+can call to capture an extra observation outside the regular control
+loop.
+
+### Step 1 — declare the schema
+
+Add an entry under `meta/info.json["tools"]`. Either edit the file
+directly on disk *before* running the annotation pipeline (it'll be
+preserved) or hand it to `lerobot-annotate` via a config flag (PR 2 —
+exact CLI lands with the pipeline change).
+
+```json
+{
+  "tools": [
+    { "type": "function", "function": { "name": "say", "...": "..." } },
+    {
+      "type": "function",
+      "function": {
+        "name": "record_observation",
+        "description": "Capture a high-resolution still image for the user.",
+        "parameters": {
+          "type": "object",
+          "properties": {
+            "label": { "type": "string", "description": "Short label for the saved image." }
+          },
+          "required": ["label"]
+        }
+      }
+    }
+  ]
+}
+```
+
+The schema follows OpenAI's function-calling convention exactly, so the
+chat template can render it natively.
+
+### Step 2 — implement the call
+
+Create `src/lerobot/tools/record_observation.py`:
+
+```python
+from typing import Any
+
+from .base import Tool
+
+RECORD_OBSERVATION_SCHEMA: dict[str, Any] = { "...": "..." }   # mirrors the JSON above
+
+
+class RecordObservationTool:
+    name = "record_observation"
+    schema = RECORD_OBSERVATION_SCHEMA
+
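+    def __init__(self, schema: dict | None = None, output_dir: str = "."):
+        # Use the dataset-declared schema when one is passed in.
+        if schema is not None:
+            self.schema = schema
+        self.output_dir = output_dir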
+
+    def call(self, arguments: dict) -> str:
+        label = arguments["label"]
+        # ... save the latest camera frame to <output_dir>/<label>.png ...
+        return f"saved {label}.png"
+```
+
+One file per tool keeps dependencies isolated — `record_observation`
+might pull `pillow`, while `say` (PR 3) pulls `pocket-tts`. Users
+installing only the tools they need avoid heavy transitive deps.
+
+### Step 3 — register it
+
+Add to `src/lerobot/tools/registry.py` (PR 3):
+
+```python
+from .record_observation import RecordObservationTool
+
+TOOL_REGISTRY["record_observation"] = RecordObservationTool
+```
+
+That's it. At runtime `get_tools(meta)` looks up each schema in
+`meta.tools`, instantiates the matching registered class, and returns
+a name → instance dict the dispatcher can route into.
+
+## Where this fits in the three-PR stack
+
+| Layer | PR | What lands |
+|---|---|---|
+| Catalog storage in `meta/info.json` + `meta.tools` accessor | PR 1 | This page; `SAY_TOOL_SCHEMA`, `DEFAULT_TOOLS` constants in `lerobot.datasets.language`; `LeRobotDatasetMetadata.tools` property |
+| Annotation pipeline writes `tools` to meta after a run; honors anything users pre-populated | PR 2 | `lerobot-annotate` ensures `meta/info.json["tools"]` includes the canonical `say` and merges any user-declared tools |
+| Runnable implementations under `src/lerobot/tools/`; runtime dispatcher; `say.py` wired to Kyutai's pocket-tts | PR 3 | One file per tool; `Tool` protocol; `TOOL_REGISTRY`; optional `[tools]` extra in `pyproject.toml` |
+
+If you want to use a tool *without* writing an implementation (e.g. for
+training-time chat-template formatting only), step 1 alone is enough —
+the model still learns to *generate* the call. Steps 2 and 3 are only
+needed to actually *execute* it at inference.
@@ -220,7 +220,7 @@ REAL_DIM = 12
 # Postprocessing: Trim 20D predictions to 12D for deployment
 ```

-See the [action_hub.py](https://github.com/huggingface/lerobot/blob/main/src/lerobot/policies/xvla/action_hub.py) implementation for details.
+See the [action_hub.py](https://github.com/huggingface/lerobot/blob/main/src/lerobot/policies/xvla/action_hub.py) implementation for details.

 #### Auto Action Mode (Recommended)

@@ -519,9 +519,9 @@ If you use X-VLA in your research, please cite:

 - [X-VLA Paper](https://arxiv.org/pdf/2510.10274)
 - [LeRobot Documentation](https://github.com/huggingface/lerobot)
- [Action Registry Implementation](https://github.com/huggingface/lerobot/blob/main/src/lerobot/policies/xvla/action_hub.py)
- [Processor Implementation](https://github.com/huggingface/lerobot/blob/main/src/lerobot/policies/xvla/processor_xvla.py)
- [Model Configuration](https://github.com/huggingface/lerobot/blob/main/src/lerobot/policies/xvla/configuration_xvla.py)
+- [Action Registry Implementation](https://github.com/huggingface/lerobot/blob/main/src/lerobot/policies/xvla/action_hub.py)
+- [Processor Implementation](https://github.com/huggingface/lerobot/blob/main/src/lerobot/policies/xvla/processor_xvla.py)
+- [Model Configuration](https://github.com/huggingface/lerobot/blob/main/src/lerobot/policies/xvla/configuration_xvla.py)

 ## Contributing

@@ -0,0 +1,66 @@
+#!/usr/bin/env python
+"""Launch ``lerobot-annotate`` on a Hugging Face job (vllm + Qwen3.6 MoE).
+
+Spawns one ``h200x2`` job that:
+
+  1. installs this branch of ``lerobot`` plus the annotation extras,
+  2. boots two vllm servers (one per GPU) with Qwen3.6-35B-A3B-FP8,
+  3. runs Module 1/2/3 across the dataset (per-camera VQA via PR 3471),
+  4. uploads the annotated dataset to ``--push_to_hub``.
+
+Usage:
+
+    HF_TOKEN=hf_... uv run python examples/annotation/run_hf_job.py
+
+Adjust ``CMD`` below to point at your own dataset / target hub repo.
+"""
+
+import os
+
+from huggingface_hub import get_token, run_job
+
+token = os.environ.get("HF_TOKEN") or get_token()
+if not token:
+    raise RuntimeError(
+        "No HF token. Run `huggingface-cli login` or `export HF_TOKEN=hf_...`"
+    )
+
+CMD = (
+    "apt-get update -qq && apt-get install -y -qq git ffmpeg && "
+    "pip install --no-deps "
+    "'lerobot @ git+https://github.com/huggingface/lerobot.git@feat/language-annotation-pipeline' && "
+    "pip install --upgrade-strategy only-if-needed "
+    "datasets pyarrow av jsonlines draccus gymnasium torchcodec mergedeep pyyaml-include toml typing-inspect && "
+    "export VLLM_MEMORY_PROFILER_ESTIMATE_CUDAGRAPHS=0 && "
+    "export VLLM_VIDEO_BACKEND=pyav && "
+    "lerobot-annotate "
+    "--repo_id=imstevenpmwork/super_poulain_draft "
+    "--vlm.backend=openai "
+    "--vlm.model_id=Qwen/Qwen3.6-35B-A3B-FP8 "
+    "--vlm.parallel_servers=2 "
+    "--vlm.num_gpus=2 "
+    '--vlm.serve_command="vllm serve Qwen/Qwen3.6-35B-A3B-FP8 '
+    "--tensor-parallel-size 1 --max-model-len 32768 "
+    '--gpu-memory-utilization 0.8 --uvicorn-log-level warning --port {port}" '
+    "--vlm.serve_ready_timeout_s=1800 "
+    "--vlm.client_concurrency=256 "
+    "--vlm.max_new_tokens=512 "
+    "--executor.episode_parallelism=32 "
+    "--vlm.chat_template_kwargs='{enable_thinking: false}' "
+    "--vlm.camera_key=observation.images.wrist "
+    "--module_1.frames_per_second=1.0 "
+    "--module_1.use_video_url=true "
+    "--module_1.use_video_url_fps=1.0 "
+    "--module_3.K=1 --module_3.vqa_emission_hz=0.2 "
+    "--push_to_hub=pepijn223/super_poulain_qwen36moe-3"
+)
+
+job = run_job(
+    image="vllm/vllm-openai:latest",
+    command=["bash", "-c", CMD],
+    flavor="h200x2",
+    secrets={"HF_TOKEN": token},
+    timeout="2h",
+)
+print(f"Job URL: {job.url}")
+print(f"Job ID:  {job.id}")
@@ -95,7 +95,7 @@ dependencies = [

 # ── Feature-scoped extras ──────────────────────────────────
 dataset = [
-    "datasets>=4.0.0,<5.0.0",
+    "datasets>=4.7.0,<5.0.0",
    "pandas>=2.0.0,<3.0.0", # NOTE: Transitive dependency of datasets
    "pyarrow>=21.0.0,<30.0.0", # NOTE: Transitive dependency of datasets
    "lerobot[av-dep]",
@@ -200,6 +200,23 @@ hilserl = ["lerobot[transformers-dep]", "gym-hil>=0.1.13,<0.2.0", "lerobot[grpci
 async = ["lerobot[grpcio-dep]", "lerobot[matplotlib-dep]"]
 peft = ["lerobot[transformers-dep]", "lerobot[peft-dep]"]

+# Annotation pipeline (lerobot-annotate). datatrove is mandatory; vllm is
+# the preferred backend on Linux, with a transformers fallback elsewhere.
+annotations = [
+    "lerobot[dataset]",
+    "lerobot[transformers-dep]",
+    "datatrove>=0.4.0,<2.0.0",
+    "vllm>=0.6.0,<1.0.0; sys_platform == 'linux'",
+]
+
+# Tool implementations under src/lerobot/tools/. Each tool's dependencies
+# are isolated so adding a new tool doesn't bloat the base install.
+# Currently only `say` (Kyutai pocket-tts; CPU-only, ~100M params).
+tools = [
+    "pocket-tts>=0.1.0,<1.0.0",
+    "scipy>=1.11.0,<2.0.0",  # SayTool.output_dir uses scipy.io.wavfile
+]
+
 # Development
 dev = ["pre-commit>=3.7.0,<5.0.0", "debugpy>=1.8.1,<1.9.0", "lerobot[grpcio-dep]", "grpcio-tools==1.73.1", "mypy>=1.19.1", "ruff>=0.14.1", "lerobot[notebook]"]
 notebook = ["jupyter>=1.0.0,<2.0.0", "ipykernel>=6.0.0,<7.0.0"]
@@ -289,10 +306,11 @@ lerobot-find-joint-limits="lerobot.scripts.lerobot_find_joint_limits:main"
 lerobot-imgtransform-viz="lerobot.scripts.lerobot_imgtransform_viz:main"
 lerobot-edit-dataset="lerobot.scripts.lerobot_edit_dataset:main"
 lerobot-setup-can="lerobot.scripts.lerobot_setup_can:main"
+lerobot-annotate="lerobot.scripts.lerobot_annotate:main"

 # ---------------- Tool Configurations ----------------
 [tool.setuptools.package-data]
-lerobot = ["envs/*.json"]
+lerobot = ["envs/*.json", "annotations/steerable_pipeline/prompts/*.txt"]

 [tool.setuptools.packages.find]
 where = ["src"]
@@ -0,0 +1,15 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
@@ -0,0 +1,36 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Steerable annotation pipeline producing ``language_persistent`` and
+``language_events`` columns for LeRobot datasets.
+
+The pipeline is decomposed into three independently runnable modules whose
+outputs are staged per-episode before a final parquet rewrite:
+
+- :mod:`.modules.plan_subtasks_memory` (Module 1) — persistent styles
+- :mod:`.modules.interjections_and_speech` (Module 2) — event styles + speech
+- :mod:`.modules.general_vqa` (Module 3) — event-style VQA pairs
+"""
+
+from .config import AnnotationPipelineConfig
+from .validator import StagingValidator, ValidationReport
+from .writer import LanguageColumnsWriter
+
+__all__ = [
+    "AnnotationPipelineConfig",
+    "LanguageColumnsWriter",
+    "StagingValidator",
+    "ValidationReport",
+]
@@ -0,0 +1,260 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import annotations
+
+from dataclasses import dataclass, field
+from pathlib import Path
+from typing import Any
+
+
+@dataclass
+class Module1Config:
+    """Module 1 hyperparameters: plan + subtasks + memory + task augmentation.
+
+    Subtask decomposition sees the **whole episode** as one Qwen-VL video
+    block — no keyframe stride or count: the model handles temporal pooling
+    itself and decides where to cut. ``max_video_frames`` only caps the
+    number of frames packed into the video block (a model-capacity bound,
+    not an annotation-logic knob).
+    """
+
+    enabled: bool = True
+    n_task_rephrasings: int = 10
+    """Number of task rephrasings to generate at ``t=0`` as ``task_aug``
+    persistent rows (PR 1 ``CORE_STYLES``). The renderer's ``${task}``
+    binding rotates among them deterministically per ``sample_idx``,
+    realizing Xiao 2022 / CAST-style task-prompt diversity without
+    touching ``meta/tasks.parquet``. Set to 0 to disable."""
+    derive_task_from_video: str = "if_short"
+    """When to bypass the user-provided ``record.episode_task`` and
+    derive a fresh task description from the episode video alone:
+
+    - ``off``       never; always use the canonical task as the basis.
+    - ``if_short``  derive when the canonical task is empty, has fewer
+                    than ``derive_task_min_words`` words, or matches a
+                    placeholder string (``debug``, ``unnamed``, ``tbd``,
+                    ...). Default — fixes noisy / placeholder tasks
+                    without forcing derivation everywhere.
+    - ``always``    ignore the canonical task entirely; always derive
+                    from the video. Useful when the dataset's task
+                    labels are uniformly bad.
+
+    The video-derived task replaces the canonical task as the basis for
+    subtask decomposition, plan, memory, AND the ``task_aug`` rephrasings,
+    so every downstream annotation is grounded in what's actually visible.
+    ``meta/tasks.parquet`` is NOT modified — the Module-1-derived task
+    only lives in ``language_persistent`` rows."""
+    derive_task_min_words: int = 3
+    """Word-count threshold for ``derive_task_from_video=if_short``."""
+    frames_per_second: float = 1.0
+    """Sample one image-frame per ``1/fps`` seconds across the episode for
+    Module 1's subtask-decomposition prompt. ``1.0`` = 1 fps. Capped by
+    ``max_video_frames`` to avoid blowing up the request payload."""
+    max_video_frames: int = 128
+    """Hard cap on the number of frames Module 1 sends. With ``fps=1`` and
+    a 30 s episode this yields 30 frames. Bumped from 32 since each frame
+    is small (~30-100 KB PNG when base64'd)."""
+    min_subtask_seconds: float = 1.5
+    plan_max_steps: int = 8
+    use_video_url: bool = False
+    """When True (and backend supports it, e.g. ``openai``), Module 1
+    sends a ``video_url`` content block pointing at the episode's mp4
+    file instead of pre-decoded frames. Lets the server sample frames at
+    its own ``fps`` — no in-process conv3d cost. The video file is
+    extracted as a per-episode subclip to ``staging/.video_clips/`` so
+    the model sees only this episode's frames."""
+    use_video_url_fps: float = 1.0
+    """Frame-rate hint to send to the server (mm_processor_kwargs.fps).
+    Only used when ``use_video_url=True``. ``1.0`` = sample 1 frame per
+    second, which is plenty for subtask-boundary detection on most
+    manipulation episodes."""
+
+
+@dataclass
+class Module2Config:
+    """Module 2 hyperparameters: interjections + paired speech."""
+
+    enabled: bool = True
+    max_interjections_per_episode: int = 3
+    """Number of mid-episode interjections to generate per episode. Each
+    creates a paired ``(interjection, speech)`` event row plus triggers a
+    ``plan`` refresh at the same timestamp via Module 1. Bumped from the
+    original ``1`` after qwen36moe-10 showed plan/interjection coverage
+    was too sparse for Hi Robot-style training."""
+    interjection_min_t: float = 2.0
+    interjection_window_seconds: float = 2.0
+    """How many seconds of video to attach to the interjection prompt as
+    visual context. Without this the VLM only sees a single frozen frame
+    and writes generic interjections that aren't grounded in the actual
+    motion happening at the chosen timestamp."""
+    interjection_window_frames: int = 4
+    """How many frames to sample over ``interjection_window_seconds``.
+    Default 4 ⇒ ~2 fps (one frame every ~0.5 s) over the leading
+    2 seconds: enough for the model to read the ongoing motion, cheap
+    enough to keep prompt size bounded for the 32k context."""
+
+
+@dataclass
+class Module3Config:
+    """Module 3 hyperparameters: general VQA."""
+
+    enabled: bool = True
+    vqa_emission_hz: float = 1.0
+    K: int = 3
+    question_types: tuple[str, ...] = ("bbox", "keypoint", "count", "attribute", "spatial")
+
+
+@dataclass
+class VlmConfig:
+    """Shared Qwen-VL client configuration."""
+
+    backend: str = "openai"
+    """One of ``vllm``, ``transformers``, ``openai``, or ``stub`` (tests only).
+
+    Default ``openai`` talks to a local OpenAI-compatible server (vllm /
+    transformers) which the CLI auto-spawns when ``auto_serve=True``."""
+    model_id: str = "Qwen/Qwen2.5-VL-7B-Instruct"
+    api_base: str = "http://localhost:8000/v1"
+    """Base URL for the ``openai`` backend."""
+    api_key: str = "EMPTY"
+    """API key for the ``openai`` backend; ``EMPTY`` works for local servers."""
+    auto_serve: bool = True
+    """When True with ``backend=openai``, the CLI probes ``api_base``
+    first; if no server answers, it spawns one (default:
+    ``transformers serve``), waits for it to be ready, runs the
+    pipeline, and tears it down on exit. Default ``True`` so a single
+    ``lerobot-annotate`` call can drive the whole flow. Set to ``False``
+    if you want to fail fast when no server is reachable (e.g. you're
+    pointing at a remote endpoint that should already be up)."""
+    serve_port: int = 8000
+    """Port the auto-spawned server binds to. Sets ``api_base`` automatically."""
+    serve_command: str | None = None
+    """Override the auto-serve command (full shell command). When ``None``,
+    we run ``transformers serve <model_id> --port <serve_port> --continuous-batching``.
+
+    When ``parallel_servers > 1``, the literal ``{port}`` placeholder in
+    this command (if present) is substituted per-replica."""
+    parallel_servers: int = 1
+    """When >1, spawn this many independent inference servers (each pinned
+    to a GPU via ``CUDA_VISIBLE_DEVICES`` and listening on
+    ``serve_port + i``) and round-robin client requests across them.
+    Useful when DP/TP NCCL setup is broken on the node — single-GPU
+    replicas don't need cross-GPU communication. When
+    ``parallel_servers > num_gpus``, replicas are round-robin-assigned
+    to GPUs (e.g. 4 replicas on 2 GPUs → 0,1,0,1)."""
+    num_gpus: int = 0
+    """How many physical GPUs are available for round-robin replica
+    placement. ``0`` means "same as ``parallel_servers``" (one GPU per
+    replica, the backward-compatible default). Set this to ``2`` with
+    ``parallel_servers=4`` to pack 2 replicas per GPU."""
+    client_concurrency: int = 16
+    """Maximum number of in-flight chat requests the client issues in
+    parallel. vllm batches them internally for free, so bumping this
+    typically gives big throughput wins on a single TP=1 server. Set to
+    ``1`` for strict serial calls."""
+    serve_ready_timeout_s: float = 600.0
+    """Max seconds to wait for the server to start serving requests."""
+    max_new_tokens: int = 512
+    temperature: float = 0.2
+    json_mode: bool = True
+    batch_size: int = 4
+    tensor_parallel_size: int = 1
+    gpu_memory_utilization: float = 0.9
+    """Fraction of GPU memory vllm allocates for weights + KV cache.
+    Lower (e.g. 0.7) when the vision encoder needs cuDNN workspace, or to
+    avoid CUDNN_STATUS_NOT_INITIALIZED on tight VRAM (30B BF16 on 80 GB)."""
+    max_model_len: int | None = None
+    """Cap context length. ``None`` keeps the model's default; on H100 80 GB
+    a 30B BF16 model often needs ``max_model_len=8192`` or smaller to leave
+    room for KV cache."""
+    trust_remote_code: bool = False
+    """Pass ``trust_remote_code`` to HF auto-classes. Default ``False`` —
+    only enable for models that actually ship custom code in their repo
+    (rare for first-class VL releases). On Qwen3-VL it triggers a
+    ``std::bad_alloc`` post-load even though the official transformers
+    class is sufficient, so leaving this off is safest."""
+    camera_key: str | None = None
+    """Override the camera stream used for keyframe attachment. ``None`` picks
+    the first ``observation.images.*`` key the dataset declares."""
+    chat_template_kwargs: dict[str, Any] | None = None
+    """Forwarded as ``extra_body.chat_template_kwargs`` on every chat call.
+    Use this to pass model-specific template flags such as
+    ``{"enable_thinking": false}`` for Qwen3.5/Qwen3.6 to suppress the
+    reasoning preamble that otherwise eats the entire ``max_new_tokens``
+    budget before any JSON is emitted."""
+
+
+@dataclass
+class ExecutorConfig:
+    """Executor selection and SLURM hyperparameters."""
+
+    auto_threshold: int = 32
+    force_local: bool = False
+    slurm_partition: str | None = None
+    slurm_gpus: int = 1
+    slurm_time: str = "06:00:00"
+    workers: int = 1
+    episode_parallelism: int = 16
+    """Number of episodes processed concurrently within each module phase.
+    Each in-flight episode sends 3–5 dependent VLM calls; bumping this is
+    how you actually saturate ``parallel_servers`` and ``client_concurrency``
+    — without it, the executor loops one episode at a time and the
+    inference servers sit ~90% idle. Set to ``1`` for strict serial
+    execution."""
+
+
+@dataclass
+class AnnotationPipelineConfig:
+    """Top-level config for ``lerobot-annotate``.
+
+    Mirrors the structure of :class:`lerobot.configs.train.TrainPipelineConfig`:
+    a draccus-parsed dataclass that contains nested per-module sub-configs and
+    leaves the dataset, executor, and VLM choices independently configurable.
+
+    Output is always in place: the writer rewrites ``data/chunk-*/file-*.parquet``
+    under the dataset root. To keep multiple revisions, annotate separate copies.
+    """
+
+    repo_id: str | None = None
+    root: Path | None = None
+
+    staging_dir: Path | None = None
+    """If unset, defaults to ``<root>/.annotate_staging/``."""
+
+    seed: int = 1729
+
+    module_1: Module1Config = field(default_factory=Module1Config)
+    module_2: Module2Config = field(default_factory=Module2Config)
+    module_3: Module3Config = field(default_factory=Module3Config)
+
+    vlm: VlmConfig = field(default_factory=VlmConfig)
+    executor: ExecutorConfig = field(default_factory=ExecutorConfig)
+
+    skip_validation: bool = False
+    only_episodes: tuple[int, ...] | None = None
+
+    push_to_hub: str | None = None
+    """If set, after the pipeline completes, upload the annotated dataset
+    root to the Hugging Face Hub as a dataset repo with this id (e.g.
+    ``pepijn/super_poulain_steerable``). Creates the repo if missing."""
+    push_private: bool = False
+    """When ``push_to_hub`` is set, create the repo as private."""
+    push_commit_message: str | None = None
+    """Override the commit message used for the hub upload."""
+
+    def resolved_staging_dir(self, root: Path) -> Path:
+        return self.staging_dir if self.staging_dir is not None else root / ".annotate_staging"
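The replica placement described in the ``parallel_servers`` / ``num_gpus`` docstrings of ``VlmConfig`` can be sketched as below. This is a minimal illustration under stated assumptions, not the PR's spawning code: ``plan_replicas`` and ``render_command`` are hypothetical helper names, and only the documented rules are modelled (port ``serve_port + i``, round-robin GPU assignment, ``num_gpus=0`` meaning one GPU per replica, and literal ``{port}`` substitution).

```python
from __future__ import annotations


def plan_replicas(
    serve_port: int, parallel_servers: int, num_gpus: int = 0
) -> list[tuple[int, int]]:
    """Return one (port, gpu_index) pair per replica.

    Hypothetical helper: mirrors the documented behaviour only.
    """
    if parallel_servers < 1:
        raise ValueError("parallel_servers must be >= 1")
    # num_gpus == 0 is documented as "one GPU per replica".
    gpus = num_gpus if num_gpus > 0 else parallel_servers
    # Replica i listens on serve_port + i and is pinned to GPU i % gpus.
    return [(serve_port + i, i % gpus) for i in range(parallel_servers)]


def render_command(template: str, port: int) -> str:
    """Substitute the literal ``{port}`` placeholder, if present."""
    return template.replace("{port}", str(port))


if __name__ == "__main__":
    # Illustrative command template; the real serve command is user-supplied.
    for port, gpu in plan_replicas(8000, parallel_servers=4, num_gpus=2):
        cmd = render_command("my-server --port {port}", port)
        print(f"CUDA_VISIBLE_DEVICES={gpu} {cmd}")
```

With ``parallel_servers=4`` and ``num_gpus=2`` the GPU indices come out as 0, 1, 0, 1, matching the example in the docstring.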
@@ -0,0 +1,263 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Executor selection: local vs SLURM via datatrove.
+
+The executor plans **six phases**, in the dependency order from the plan:
+
+    phase 1: Module 1 (plan + subtasks + memory)
+    phase 2: Module 2 (interjections + speech)
+    phase 3: Module 1 plan-update pass — re-runs plan emission at every
+             interjection timestamp produced by phase 2
+    phase 4: Module 3 (VQA)
+    phase 5: validator
+    phase 6: writer
+
+Phase 3 is why ``executor.py`` documents the dependency: Module 1 must be
+re-entered after Module 2 to refresh ``plan`` rows at interjection times.
+"""
+
+from __future__ import annotations
+
+import logging
+from dataclasses import dataclass
+from pathlib import Path
+from typing import Any
+
+from .config import AnnotationPipelineConfig, ExecutorConfig
+from .reader import EpisodeRecord, iter_episodes
+from .staging import EpisodeStaging
+from .validator import StagingValidator
+from .writer import LanguageColumnsWriter
+
+logger = logging.getLogger(__name__)
+
+
+@dataclass
+class PhaseResult:
+    """Summary of one pipeline phase across all episodes."""
+
+    name: str
+    episodes_processed: int
+    episodes_skipped: int
+
+
+@dataclass
+class PipelineRunSummary:
+    """Aggregated result returned by :meth:`Executor.run`."""
+
+    phases: list[PhaseResult]
+    written_paths: list[Path]
+    validation_report: Any  # ValidationReport, kept Any to avoid import cycle
+
+
+def select_executor_class(num_episodes: int, config: ExecutorConfig) -> str:
+    """Return ``"local"`` or ``"slurm"`` based on the threshold.
+
+    The plan's "executor selection threshold" lives in
+    :attr:`ExecutorConfig.auto_threshold`. ``force_local`` always wins.
+    """
+    if config.force_local:
+        return "local"
+    return "local" if num_episodes <= config.auto_threshold else "slurm"
+
+
+@dataclass
+class Executor:
+    """Run the four module phases, then the validator and writer, over a dataset root.
+
+    The executor is intentionally framework-agnostic: by default it runs the
+    phases inline (suitable for tests, small datasets, and the CLI's
+    ``--force-local`` mode). It will optionally hand off to datatrove's
+    :class:`LocalPipelineExecutor` or :class:`SlurmPipelineExecutor` when those
+    are installed and the dataset is large enough to benefit from them.
+
+    Tests construct the executor directly with stub modules.
+    """
+
+    config: AnnotationPipelineConfig
+    module_1: Any  # PlanSubtasksMemoryModule
+    module_2: Any  # InterjectionsAndSpeechModule
+    module_3: Any  # GeneralVqaModule
+    writer: LanguageColumnsWriter
+    validator: StagingValidator
+
+    def run(self, root: Path) -> PipelineRunSummary:
+        records = list(iter_episodes(root, only_episodes=self.config.only_episodes))
+        n = len(records)
+        if n == 0:
+            raise ValueError(f"No episodes found under {root}/data/")
+
+        executor_kind = select_executor_class(n, self.config.executor)
+        print(f"[annotate] {n} episodes total; executor={executor_kind}", flush=True)
+
+        staging_dir = self.config.resolved_staging_dir(root)
+        staging_dir.mkdir(parents=True, exist_ok=True)
+
+        phases: list[PhaseResult] = []
+
+        # Phase 1: Module 1 (plan + subtasks + memory)
+        phases.append(self._run_module_phase("module_1", records, staging_dir, self.module_1))
+        # Phase 2: Module 2 (interjections + speech). Module 2 reads
+        # Module 1's subtask rows from the same staging tree to ground
+        # the interjection prompt in the correct local subtask.
+        phases.append(self._run_module_phase("module_2", records, staging_dir, self.module_2))
+        # Phase 3: Module 1 plan-update pass at interjection timestamps.
+        phases.append(self._run_plan_update_phase(records, staging_dir))
+        # Phase 4: Module 3 (VQA)
+        phases.append(self._run_module_phase("module_3", records, staging_dir, self.module_3))
+
+        print("[annotate] running validator...", flush=True)
+        report = self.validator.validate(records, staging_dir)
+        if not report.ok and not self.config.skip_validation:
+            raise RuntimeError(f"Staging validation failed: {report.summary()}")
+        print(f"[annotate] validator: {report.summary()}", flush=True)
+
+        print(f"[annotate] writing parquet shards into {root}/data/...", flush=True)
+        written = self.writer.write_all(records, staging_dir, root)
+        print(f"[annotate] wrote {len(written)} shard(s); pipeline complete", flush=True)
+
+        # Persist the tool catalog to meta/info.json so chat-template
+        # consumers (PR 3 SmolVLA2 / Pi0.5 / dataset visualizer) can read
+        # it via ``LeRobotDatasetMetadata.tools`` (PR 1). Idempotent and
+        # additive: anything the user pre-populated is preserved; we only
+        # ensure the canonical ``say`` schema is present.
+        self._ensure_tools_in_info(root)
+
+        return PipelineRunSummary(phases=phases, written_paths=written, validation_report=report)
+
+    def _ensure_tools_in_info(self, root: Path) -> None:
+        """Write ``meta/info.json["tools"]`` if missing the canonical ``say``.
+
+        Reads any user-declared tools already in ``info.json`` and merges
+        the canonical ``SAY_TOOL_SCHEMA`` into the list (deduped by
+        ``function.name``). Writes back to disk only if the list
+        changed.
+        """
+        import json  # noqa: PLC0415
+
+        from lerobot.datasets.language import SAY_TOOL_SCHEMA  # noqa: PLC0415
+
+        info_path = root / "meta" / "info.json"
+        if not info_path.exists():
+            return
+        try:
+            info = json.loads(info_path.read_text())
+        except Exception as exc:  # noqa: BLE001
+            print(f"[annotate] could not read {info_path}: {exc}", flush=True)
+            return
+
+        existing = info.get("tools")
+        if not isinstance(existing, list):
+            existing = []
+        names = {
+            (t.get("function") or {}).get("name")
+            for t in existing
+            if isinstance(t, dict)
+        }
+        merged = list(existing)
+        if SAY_TOOL_SCHEMA["function"]["name"] not in names:
+            merged.append(SAY_TOOL_SCHEMA)
+        if merged != existing:
+            info["tools"] = merged
+            info_path.write_text(json.dumps(info, indent=2))
+            print(
+                f"[annotate] meta/info.json: tools={[t['function']['name'] for t in merged]}",
+                flush=True,
+            )
+
+    def _run_module_phase(
+        self,
+        name: str,
+        records: list[EpisodeRecord],
+        staging_dir: Path,
+        module: Any,
+    ) -> PhaseResult:
+        import time as _time  # noqa: PLC0415
+        from concurrent.futures import ThreadPoolExecutor, as_completed  # noqa: PLC0415
+
+        if not module.enabled:
+            print(f"[annotate] phase={name} skipped (module disabled)", flush=True)
+            return PhaseResult(name=name, episodes_processed=0, episodes_skipped=len(records))
+        n = len(records)
+        parallelism = max(1, min(self.config.executor.episode_parallelism, n))
+        print(
+            f"[annotate] phase={name} starting on {n} episode(s) "
+            f"(parallelism={parallelism})",
+            flush=True,
+        )
+        t0 = _time.time()
+
+        def _do(idx_record: tuple[int, EpisodeRecord]) -> tuple[int, int, float]:
+            i, record = idx_record
+            ep_start = _time.time()
+            staging = EpisodeStaging(staging_dir, record.episode_index)
+            module.run_episode(record, staging)
+            return i, record.episode_index, _time.time() - ep_start
+
+        processed = 0
+        if parallelism == 1:
+            for i, record in enumerate(records, 1):
+                _, ep_idx, elapsed = _do((i, record))
+                processed += 1
+                print(
+                    f"[annotate]   {name} episode {i}/{n} "
+                    f"(idx={ep_idx}) done in {elapsed:.1f}s",
+                    flush=True,
+                )
+        else:
+            with ThreadPoolExecutor(max_workers=parallelism) as pool:
+                futures = [pool.submit(_do, (i, r)) for i, r in enumerate(records, 1)]
+                for fut in as_completed(futures):
+                    i, ep_idx, elapsed = fut.result()
+                    processed += 1
+                    print(
+                        f"[annotate]   {name} episode {processed}/{n} "
+                        f"(idx={ep_idx}, submit_order={i}) done in {elapsed:.1f}s",
+                        flush=True,
+                    )
+        total = _time.time() - t0
+        print(f"[annotate] phase={name} complete: {processed}/{n} in {total:.1f}s", flush=True)
+        return PhaseResult(name=name, episodes_processed=processed, episodes_skipped=0)
+
+    def _run_plan_update_phase(  # noqa: PLR0915
+        self, records: list[EpisodeRecord], staging_dir: Path
+    ) -> PhaseResult:
+        """Re-emit ``plan`` rows at each interjection timestamp from Module 2.
+
+        Module 1 owns the prompt; Module 2 produced the timestamps. This phase
+        therefore calls back into Module 1 with the interjection timestamps so
+        Module 1's existing prompt path is reused.
+        """
+        if not self.module_1.enabled or not self.module_2.enabled:
+            return PhaseResult(
+                name="module_1_plan_update", episodes_processed=0, episodes_skipped=len(records)
+            )
+        processed = 0
+        for record in records:
+            staging = EpisodeStaging(staging_dir, record.episode_index)
+            interjection_rows = [
+                row
+                for row in staging.read("module_2")
+                if row.get("style") == "interjection"
+            ]
+            interjection_times = [float(row["timestamp"]) for row in interjection_rows]
+            interjection_texts = [str(row.get("content") or "") for row in interjection_rows]
+            if interjection_times:
+                self.module_1.run_plan_updates(
+                    record, staging, interjection_times, interjection_texts
+                )
+                processed += 1
+        return PhaseResult(name="module_1_plan_update", episodes_processed=processed, episodes_skipped=0)
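The idempotent, additive merge performed by ``_ensure_tools_in_info`` can be isolated as a pure function, which makes the "write only if changed" rule easy to test. The sketch below is an illustration, not the PR's code: ``merge_tools`` is a hypothetical name, and ``EXAMPLE_SAY_TOOL`` is a placeholder whose fields other than ``function.name == "say"`` are assumptions (the real ``SAY_TOOL_SCHEMA`` lives in ``lerobot.datasets.language`` and is not shown here).

```python
from __future__ import annotations

from typing import Any

# Placeholder schema: only the "say" name is taken from the source;
# every other field is an assumption made for this example.
EXAMPLE_SAY_TOOL: dict[str, Any] = {
    "type": "function",
    "function": {"name": "say", "parameters": {"type": "object", "properties": {}}},
}


def merge_tools(existing: Any, canonical: dict[str, Any]) -> tuple[list[Any], bool]:
    """Return (merged_tools, changed), deduplicating by ``function.name``."""
    tools = list(existing) if isinstance(existing, list) else []
    names = {
        (t.get("function") or {}).get("name") for t in tools if isinstance(t, dict)
    }
    if canonical["function"]["name"] in names:
        return tools, False  # already present: nothing to write back
    return tools + [canonical], True


if __name__ == "__main__":
    user_tool = {"type": "function", "function": {"name": "grasp"}}
    merged, changed = merge_tools([user_tool], EXAMPLE_SAY_TOOL)
    print([t["function"]["name"] for t in merged], changed)
```

Running the merge a second time on its own output reports ``changed=False``, which is the property that lets the executor skip rewriting ``meta/info.json``.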
@@ -0,0 +1,400 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Keyframe extraction for the annotation pipeline.
+
+Modules attach decoded camera frames to their VLM prompts so the model can
+ground subtask decomposition, interjection scenarios, and VQA in actual
+visual content. The pipeline shares one provider across all modules; a
+small bounded cache keyed by (episode, camera, timestamp) lets modules
+that query the same timestamp pay the decode cost only once.
+"""
+
+from __future__ import annotations
+
+from dataclasses import dataclass, field
+from pathlib import Path
+from typing import Any, Protocol
+
+from .reader import EpisodeRecord
+
+
+class FrameProvider(Protocol):
+    """Decodes camera frames at episode-relative timestamps."""
+
+    @property
+    def camera_keys(self) -> list[str]:
+        """All ``observation.images.*`` feature keys this provider can decode."""
+
+    def frames_at(
+        self,
+        record: EpisodeRecord,
+        timestamps: list[float],
+        camera_key: str | None = None,
+    ) -> list[Any]:
+        """Return one PIL.Image per timestamp from ``camera_key`` (or default).
+
+        Empty list if the camera is unavailable. ``camera_key=None`` falls back
+        to the provider's default camera so existing single-camera callers
+        (Module 1, Module 2) keep working unchanged.
+        """
+
+    def video_for_episode(
+        self,
+        record: EpisodeRecord,
+        max_frames: int,
+        camera_key: str | None = None,
+    ) -> list[Any]:
+        """Return up to ``max_frames`` PIL images covering the whole episode.
+
+        Sampling is uniform across the episode duration. The returned list is
+        intended to be passed as one ``{"type":"video", "video":<list>}``
+        block to a Qwen-VL-compatible model that pools temporally itself.
+        Empty list if no camera available.
+        """
+
+
+@dataclass
+class _NullProvider:
+    """No-op provider used when the dataset has no video keys or in tests."""
+
+    @property
+    def camera_keys(self) -> list[str]:
+        return []
+
+    def frames_at(
+        self,
+        record: EpisodeRecord,
+        timestamps: list[float],
+        camera_key: str | None = None,
+    ) -> list[Any]:
+        return []
+
+    def video_for_episode(
+        self,
+        record: EpisodeRecord,
+        max_frames: int,
+        camera_key: str | None = None,
+    ) -> list[Any]:
+        return []
+
+
+def null_provider() -> FrameProvider:
+    return _NullProvider()
+
+
+@dataclass
+class VideoFrameProvider:
+    """Decodes frames from the dataset's ``observation.images.*`` streams.
+
+    By default the *first* camera key is used for Module 1 (subtask
+    decomposition) and Module 2 (interjection scenarios) — those prompts care
+    about *what is happening*, not which angle. Module 3 (VQA) instead
+    iterates over every camera in :attr:`camera_keys` so each frame's
+    grounded answer (bbox/keypoint/...) is tagged with the camera it was
+    grounded against.
+
+    ``camera_key`` overrides the default-camera choice but does not restrict
+    :attr:`camera_keys`. Pass ``camera_key`` explicitly to ``frames_at`` /
+    ``video_for_episode`` to read a non-default stream.
+
+    Caches up to ``cache_size`` decoded frames per process to keep
+    co-timestamped Module 2 + Module 1 plan-update calls cheap.
+    """
+
+    root: Path
+    camera_key: str | None = None
+    tolerance_s: float = 1e-2
+    cache_size: int = 256
+    _meta: Any = field(default=None, init=False, repr=False)
+    _cache: dict = field(default_factory=dict, init=False, repr=False)
+    _camera_keys: list[str] = field(default_factory=list, init=False, repr=False)
+
+    def __post_init__(self) -> None:
+        from lerobot.datasets.dataset_metadata import LeRobotDatasetMetadata  # noqa: PLC0415
+
+        self._meta = LeRobotDatasetMetadata(repo_id="local", root=self.root)
+        # ``camera_keys`` covers both image- and video-stored cameras
+        # (``video_keys`` is video-only). Some datasets declare cameras with
+        # ``dtype=image``, which would otherwise look empty here and silently
+        # disable Module 3 even though the videos are there.
+        keys = list(getattr(self._meta, "camera_keys", None) or self._meta.video_keys or [])
+        # Last-resort fallback: if metadata didn't surface anything but the
+        # caller explicitly named a camera (``--vlm.camera_key=...``), trust
+        # them — the key is by definition known to exist on the dataset.
+        if not keys and self.camera_key:
+            keys = [self.camera_key]
+        self._camera_keys = keys
+        if self.camera_key is None:
+            self.camera_key = keys[0] if keys else None
+
+    @property
+    def camera_keys(self) -> list[str]:
+        """All ``observation.images.*`` keys available on this dataset."""
+        return list(self._camera_keys)
+
+    def frames_at(
+        self,
+        record: EpisodeRecord,
+        timestamps: list[float],
+        camera_key: str | None = None,
+    ) -> list[Any]:
+        target = camera_key if camera_key is not None else self.camera_key
+        if not timestamps or target is None:
+            return []
+
+        out: list[Any] = []
+        misses: list[float] = []
+        miss_indices: list[int] = []
+        for i, ts in enumerate(timestamps):
+            key = (record.episode_index, target, round(float(ts), 6))
+            cached = self._cache.get(key)
+            if cached is not None:
+                out.append(cached)
+            else:
+                out.append(None)
+                misses.append(float(ts))
+                miss_indices.append(i)
+
+        if misses:
+            decoded = self._decode(record.episode_index, misses, target)
+            # decoder may return fewer frames than requested when some
+            # timestamps fall outside the video; pair what we have and
+            # leave the rest as None to be filtered below.
+            for i, img in zip(miss_indices, decoded):
+                out[i] = img
+                key = (record.episode_index, target, round(float(timestamps[i]), 6))
+                if len(self._cache) >= self.cache_size:
+                    self._cache.pop(next(iter(self._cache)))
+                self._cache[key] = img
+        # filter out any None left over from decode failures
+        return [img for img in out if img is not None]
+
+    def _decode(
+        self, episode_index: int, timestamps: list[float], camera_key: str
+    ) -> list[Any]:
+        ep = self._meta.episodes[episode_index]
+        from_timestamp = ep[f"videos/{camera_key}/from_timestamp"]
+        shifted = [from_timestamp + ts for ts in timestamps]
+        video_path = self.root / self._meta.get_video_file_path(episode_index, camera_key)
+
+        try:
+            return _decode_pyav_direct(video_path, shifted, self.tolerance_s)
+        except Exception as exc:
+            # Log loudly the first time decoding fails so silent
+            # Module-3-no-op (every prompt skipped because frames_at returned
+            # []) is debuggable from the job log instead of post-hoc parquet
+            # inspection. Subsequent failures stay quiet.
+            if not getattr(self, "_warned_decode_fail", False):
+                import logging  # noqa: PLC0415
+
+                logging.getLogger(__name__).warning(
+                    "VideoFrameProvider._decode failed for episode=%s camera=%s "
+                    "video_path=%s: %s",
+                    episode_index,
+                    camera_key,
+                    video_path,
+                    exc,
+                    exc_info=True,
+                )
+                self._warned_decode_fail = True
+            return []
+
+
+def _decode_pyav_direct(
+    video_path: Any, timestamps: list[float], tolerance_s: float
+) -> list[Any]:
+    """Decode the requested timestamps from ``video_path`` using PyAV directly.
+
+    Bypasses ``lerobot.datasets.video_utils.decode_video_frames`` entirely
+    because its "pyav" path actually goes through
+    ``decode_video_frames_torchvision`` → ``torchvision.io.VideoReader``,
+    which was removed in torchvision >= 0.22 (the vllm/vllm-openai:latest
+    container ships with torchvision 0.25). The annotation pipeline only
+    needs a handful of PIL images per (episode, ts), so we can decode them
+    with PyAV without any torch dependency at all.
+
+    Returns one ``PIL.Image`` per reachable timestamp, in request order.
+    Any timestamp the decoder couldn't reach is silently dropped (mirrors
+    the previous behaviour), so the result may be shorter than ``timestamps``.
+    """
+    import av  # noqa: PLC0415
+    from PIL import Image  # noqa: PLC0415
+
+    if not timestamps:
+        return []
+
+    targets = sorted(set(timestamps))
+    seek_to = max(0.0, min(targets) - max(0.5, tolerance_s))
+
+    container = av.open(str(video_path))
+    try:
+        stream = container.streams.video[0]
+        # PyAV needs the seek target in stream timebase ticks.
+        if stream.time_base is None:
+            seek_pts = 0
+        else:
+            seek_pts = int(seek_to / float(stream.time_base))
+        try:
+            container.seek(seek_pts, any_frame=False, backward=True, stream=stream)
+        except av.AVError:
+            # Some streams reject the explicit seek; fall back to decoding from start.
+            container.seek(0)
+
+        results: dict[float, Any] = {}
+        target_iter = iter(targets)
+        next_target = next(target_iter, None)
+        for frame in container.decode(stream):
+            if next_target is None:
+                break
+            ts = float(frame.pts * frame.time_base) if frame.pts is not None else None
+            if ts is None:
+                continue
+            # Walk past targets we've already overshot — we keep the closest
+            # frame within tolerance.
+            while next_target is not None and ts >= next_target - tolerance_s:
+                if abs(ts - next_target) <= tolerance_s or ts >= next_target:
+                    img = frame.to_image()  # PIL.Image.Image (RGB)
+                    results.setdefault(next_target, img)
+                    next_target = next(target_iter, None)
+                else:
+                    break
+    finally:
+        container.close()
+
+    return [results[ts] for ts in timestamps if ts in results]
+
+    def video_for_episode(
+        self,
+        record: EpisodeRecord,
+        max_frames: int,
+        camera_key: str | None = None,
+    ) -> list[Any]:
+        """Return up to ``max_frames`` images uniformly sampled across the episode.
+
+        The whole episode duration is covered; the model picks subtask
+        boundaries from the temporal pooling it does internally.
+        """
+        target = camera_key if camera_key is not None else self.camera_key
+        if max_frames <= 0 or target is None or not record.frame_timestamps:
+            return []
+        n_frames = min(max_frames, len(record.frame_timestamps))
+        if n_frames == len(record.frame_timestamps):
+            timestamps = list(record.frame_timestamps)
+        else:
+            t0 = record.frame_timestamps[0]
+            t_last = record.frame_timestamps[-1]
+            if t_last <= t0:
+                timestamps = [float(t0)] * n_frames
+            else:
+                step = (t_last - t0) / (n_frames - 1) if n_frames > 1 else 0.0
+                timestamps = [float(t0 + i * step) for i in range(n_frames)]
+        return self.frames_at(record, timestamps, camera_key=target)
+
+
+def make_frame_provider(root: Path, camera_key: str | None = None) -> FrameProvider:
+    """Build a :class:`VideoFrameProvider` if videos are present, else null."""
+    try:
+        provider = VideoFrameProvider(root=root, camera_key=camera_key)
+    except Exception:
+        return null_provider()
+    if provider.camera_key is None:
+        return null_provider()
+    return provider
+
+
+def to_image_blocks(images: list[Any]) -> list[dict[str, Any]]:
+    """Convert PIL images to Qwen-VL-compatible content blocks."""
+    return [{"type": "image", "image": img} for img in images]
+
+
+def to_video_block(images: list[Any]) -> list[dict[str, Any]]:
+    """Wrap a list of PIL images as one Qwen-VL video block.
+
+    Returns ``[]`` when the list is empty, so the caller can splat the result
+    into a content array without a separate emptiness check.
+    """
+    if not images:
+        return []
+    return [{"type": "video", "video": list(images)}]
+
+
+def to_video_url_block(url: str | None, fps: float = 2.0) -> list[dict[str, Any]]:
+    """Wrap a video file URL as one ``video_url`` block.
+
+    Used by the ``openai`` backend (transformers serve / vllm serve /
+    ktransformers serve), where the server handles frame sampling.
+    Returns ``[]`` when ``url`` is ``None`` or empty so the caller can splat.
+    """
+    if not url:
+        return []
+    return [{"type": "video_url", "video_url": {"url": url}, "fps": fps}]
+
+
+def episode_clip_path(
+    record: EpisodeRecord,
+    provider: "VideoFrameProvider",
+    cache_dir: Path,
+) -> Path | None:
+    """Extract the episode's subclip to ``cache_dir/ep_{idx:06d}.mp4``.
+
+    Returns ``None`` if the dataset has no video tracks or if ``ffmpeg``
+    fails, times out, or is not installed. Skips re-extraction when a
+    non-empty cached clip already exists. Re-encodes to H.264 (libx264)
+    so the resulting mp4 is decodable by every downstream video
+    processor; stream-copy would inherit the source codec (often AV1 in
+    modern LeRobot datasets), which vllm's libav build cannot decode.
+    """
+    import subprocess  # noqa: PLC0415
+
+    if provider.camera_key is None:
+        return None
+    cache_dir.mkdir(parents=True, exist_ok=True)
+    out_path = cache_dir / f"ep_{record.episode_index:06d}.mp4"
+    if out_path.exists() and out_path.stat().st_size > 0:
+        return out_path
+    ep = provider._meta.episodes[record.episode_index]
+    from_timestamp = float(ep[f"videos/{provider.camera_key}/from_timestamp"])
+    to_timestamp = float(ep[f"videos/{provider.camera_key}/to_timestamp"])
+    src = provider.root / provider._meta.get_video_file_path(
+        record.episode_index, provider.camera_key
+    )
+    cmd = [
+        "ffmpeg",
+        "-y",
+        "-loglevel",
+        "error",
+        "-ss",
+        f"{from_timestamp:.3f}",
+        "-to",
+        f"{to_timestamp:.3f}",
+        "-i",
+        str(src),
+        "-c:v",
+        "libx264",
+        "-preset",
+        "ultrafast",
+        "-crf",
+        "23",
+        "-pix_fmt",
+        "yuv420p",
+        "-an",
+        str(out_path),
+    ]
+    try:
+        subprocess.run(cmd, check=True, timeout=300)
+    except (subprocess.CalledProcessError, subprocess.TimeoutExpired, FileNotFoundError):
+        return None
+    return out_path if out_path.exists() and out_path.stat().st_size > 0 else None
@@ -0,0 +1,25 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from .general_vqa import GeneralVqaModule
+from .interjections_and_speech import InterjectionsAndSpeechModule
+from .plan_subtasks_memory import PlanSubtasksMemoryModule
+
+__all__ = [
+    "GeneralVqaModule",
+    "InterjectionsAndSpeechModule",
+    "PlanSubtasksMemoryModule",
+]
@@ -0,0 +1,238 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Module 3: general VQA at a timed cadence.
+
+Anchors ``K`` (question, answer) pairs to ``K`` consecutive frames per
+emission. For datasets with multiple cameras, every emission tick produces
+one ``(vqa, user)`` + ``(vqa, assistant)`` pair *per camera*: each pair is
+generated against that camera's frame and stamped with the matching
+``camera`` field on the emitted rows. The resolver disambiguates via
+``camera=...``; recipes that consume VQA do so through one sub-recipe
+per camera (see ``recipes/pi05_hirobot.yaml``).
+
+Within a single (frame, camera) we still emit at most one ``(vqa, user)``
+and one ``(vqa, assistant)`` row, so the resolver contract stays scalar.
+
+Question types covered (per the plan's Module 3 table): bbox, keypoint,
+count, attribute, spatial. The assistant's ``content`` is a JSON string
+whose schema depends on the question type. Malformed JSON triggers one
+retry inside :meth:`VlmClient.generate_json`.
+"""
+
+from __future__ import annotations
+
+import json
+import random
+from collections.abc import Sequence
+from dataclasses import dataclass, field
+from typing import Any
+
+from ..config import Module3Config
+from ..frames import FrameProvider, null_provider, to_image_blocks
+from ..prompts import load as load_prompt
+from ..reader import EpisodeRecord
+from ..staging import EpisodeStaging
+from ..validator import classify_vqa_answer
+from ..vlm_client import VlmClient
+
+
+def _emission_anchor_indices(frame_timestamps: Sequence[float], hz: float, k: int) -> list[int]:
+    """Return the relative frame indices to anchor VQA emissions to.
+
+    For each emission tick (every ``1/hz`` seconds), we anchor ``k``
+    consecutive frames starting at the tick. Ticks fall on the nearest
+    available source frame timestamp.
+    """
+    if hz <= 0 or k <= 0 or not frame_timestamps:
+        return []
+    t0 = frame_timestamps[0]
+    t_last = frame_timestamps[-1]
+    period = 1.0 / hz
+    indices: list[int] = []
+    t = t0
+    while t <= t_last + 1e-9:
+        # find the index of the nearest frame to t
+        nearest_i = min(range(len(frame_timestamps)), key=lambda i: abs(frame_timestamps[i] - t))
+        for offset in range(k):
+            j = nearest_i + offset
+            if j >= len(frame_timestamps):
+                break
+            if not indices or indices[-1] != j:
+                indices.append(j)
+        t += period
+    # dedupe while preserving order
+    seen: set[int] = set()
+    deduped: list[int] = []
+    for i in indices:
+        if i in seen:
+            continue
+        seen.add(i)
+        deduped.append(i)
+    return deduped
+
+
+@dataclass
+class GeneralVqaModule:
+    """Emit grounded VQA pairs at a timed cadence."""
+
+    vlm: VlmClient
+    config: Module3Config
+    seed: int = 1729
+    frame_provider: FrameProvider = field(default_factory=null_provider)
+
+    @property
+    def enabled(self) -> bool:
+        return self.config.enabled
+
+    def run_episode(self, record: EpisodeRecord, staging: EpisodeStaging) -> None:
+        if not record.frame_timestamps:
+            staging.write("module_3", [])
+            return
+        rng = random.Random(f"{self.seed}:{record.episode_index}:vqa")
+        anchor_idx = _emission_anchor_indices(
+            record.frame_timestamps, self.config.vqa_emission_hz, self.config.K
+        )
+        cameras = self._target_cameras()
+        if not cameras:
+            # No camera available — emit nothing rather than producing
+            # untagged rows that would fail validation. Surface a loud one-
+            # time warning so this is never silently a no-op.
+            if not getattr(self, "_warned_no_camera", False):
+                import logging  # noqa: PLC0415
+
+                logging.getLogger(__name__).warning(
+                    "Module 3 (VQA) found no cameras on the frame provider — "
+                    "every episode will emit zero VQA rows. Check that the "
+                    "dataset declares observation.images.* features in "
+                    "meta/info.json; passing --vlm.camera_key=<key> at the "
+                    "CLI now also seeds the cameras list as a fallback."
+                )
+                self._warned_no_camera = True
+            staging.write("module_3", [])
+            return
+
+        # Build all messages first (one per (frame, camera)), then issue them
+        # as a single batched generate_json call so the client can fan them
+        # out concurrently.
+        per_call: list[tuple[float, str, str, list[dict[str, Any]]]] = []
+        for idx in anchor_idx:
+            ts = float(record.frame_timestamps[idx])
+            qtype = rng.choice(self.config.question_types)
+            for camera in cameras:
+                messages = self._build_messages(record, qtype, ts, camera)
+                # Skip cameras that decoded to zero frames at this ts: no point
+                # asking the VLM to ground a bbox without an image.
+                if not _has_image_block(messages):
+                    continue
+                per_call.append((ts, camera, qtype, messages))
+
+        if not per_call:
+            staging.write("module_3", [])
+            return
+
+        results = self.vlm.generate_json([m for _, _, _, m in per_call])
+
+        rows: list[dict[str, Any]] = []
+        for (ts, camera, _qtype, _messages), result in zip(per_call, results):
+            qa = self._postprocess(result)
+            if qa is None:
+                continue
+            question, answer = qa
+            rows.append(
+                {
+                    "role": "user",
+                    "content": question,
+                    "style": "vqa",
+                    "timestamp": ts,
+                    "camera": camera,
+                    "tool_calls": None,
+                }
+            )
+            rows.append(
+                {
+                    "role": "assistant",
+                    "content": json.dumps(answer, sort_keys=True),
+                    "style": "vqa",
+                    "timestamp": ts,
+                    "camera": camera,
+                    "tool_calls": None,
+                }
+            )
+        staging.write("module_3", rows)
+
+    def _target_cameras(self) -> list[str]:
+        """Return the cameras Module 3 should iterate per emission tick.
+
+        Defaults to every camera the provider exposes. Datasets with no
+        cameras (or test/null providers) yield an empty list, which makes
+        ``run_episode`` a no-op.
+        """
+        return list(getattr(self.frame_provider, "camera_keys", []) or [])
+
+    def _build_messages(
+        self,
+        record: EpisodeRecord,
+        question_type: str,
+        frame_timestamp: float,
+        camera_key: str,
+    ) -> list[dict[str, Any]]:
+        prompt = load_prompt("module_3_vqa").format(
+            episode_task=record.episode_task,
+            question_type=question_type,
+        )
+        images = self.frame_provider.frames_at(
+            record, [frame_timestamp], camera_key=camera_key
+        )
+        content = [*to_image_blocks(images), {"type": "text", "text": prompt}]
+        return [{"role": "user", "content": content}]
+
+    def _postprocess(self, result: Any) -> tuple[str, dict[str, Any]] | None:
+        if not isinstance(result, dict):
+            return None
+        question = result.get("question")
+        answer = result.get("answer")
+        if not isinstance(question, str) or not question.strip():
+            return None
+        if not isinstance(answer, dict):
+            return None
+        # The validator will enforce shape; here we just sanity-check that the
+        # answer matches *some* known shape so we can drop garbage early.
+        if classify_vqa_answer(answer) is None:
+            return None
+        return question.strip(), answer
+
+    def _generate_one(
+        self,
+        record: EpisodeRecord,
+        question_type: str,
+        frame_timestamp: float,
+        camera_key: str,
+    ) -> tuple[str, dict[str, Any]] | None:
+        messages = self._build_messages(record, question_type, frame_timestamp, camera_key)
+        result = self.vlm.generate_json([messages])[0]
+        return self._postprocess(result)
+
+
+def _has_image_block(messages: list[dict[str, Any]]) -> bool:
+    """Return True if any message's content list contains an ``image`` block."""
+    for msg in messages:
+        content = msg.get("content")
+        if not isinstance(content, list):
+            continue
+        for block in content:
+            if isinstance(block, dict) and block.get("type") == "image":
+                return True
+    return False
@@ -0,0 +1,231 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Module 2: interjections + paired speech (EVENT styles + speech atoms).
+
+Two sub-passes:
+
+1. At ``t=0``, emit ONLY a speech tool-call atom (acknowledgement of the
+   canonical task). No interjection row — the canonical task is already the
+   user utterance from ``meta/tasks.parquet``.
+
+2. For mid-episode interruptions, emit a co-timestamped pair:
+       {role:user, style:interjection, content:<text>}
+       speech atom (role:assistant, style:None, tool_calls=[say(...)])
+   Both rows go in ``language_events`` at the same timestamp.
+
+Module 1's :meth:`run_plan_updates` reuses Module 2's interjection
+timestamps to refresh the ``plan`` row at the same instant.
+"""
+
+from __future__ import annotations
+
+import random
+from collections.abc import Sequence
+from dataclasses import dataclass, field
+from typing import Any
+
+from ..config import Module2Config
+from ..frames import FrameProvider, null_provider, to_image_blocks
+from ..prompts import load as load_prompt
+from ..reader import EpisodeRecord
+from ..staging import EpisodeStaging
+from ..vlm_client import VlmClient
+from ..writer import speech_atom
+
+
+def _snap_to_frame(t: float, frame_timestamps: Sequence[float]) -> float:
+    if not frame_timestamps:
+        return float(t)
+    return float(min(frame_timestamps, key=lambda f: abs(f - t)))
+
+
+@dataclass
+class InterjectionsAndSpeechModule:
+    """Generate task-start speech and mid-episode interjection/speech pairs."""
+
+    vlm: VlmClient
+    config: Module2Config
+    seed: int = 1729
+    frame_provider: FrameProvider = field(default_factory=null_provider)
+
+    @property
+    def enabled(self) -> bool:
+        return self.config.enabled
+
+    def run_episode(self, record: EpisodeRecord, staging: EpisodeStaging) -> None:
+        rows: list[dict[str, Any]] = []
+        if record.frame_timestamps:
+            t0 = float(record.frame_timestamps[0])
+            initial = self._initial_speech(record)
+            if initial:
+                rows.append(speech_atom(t0, initial))
+        # Pull Module 1's subtask spans for this episode so the
+        # interjection prompt can ground itself in the actual current
+        # subtask at each chosen timestamp. Module 1 ran first.
+        subtask_spans = self._read_subtask_spans(staging)
+        rows.extend(self._mid_episode_interjections(record, subtask_spans))
+        staging.write("module_2", rows)
+
+    @staticmethod
+    def _read_subtask_spans(staging: EpisodeStaging) -> list[dict[str, Any]]:
+        rows = [r for r in staging.read("module_1") if r.get("style") == "subtask"]
+        rows.sort(key=lambda r: float(r["timestamp"]))
+        spans: list[dict[str, Any]] = []
+        last_t: float | None = None
+        for r in rows:
+            t = float(r["timestamp"])
+            if last_t is not None and spans:
+                spans[-1]["end"] = t
+            spans.append({"text": r.get("content") or "", "start": t, "end": t})
+            last_t = t
+        return spans
+
+    @staticmethod
+    def _subtask_at(spans: Sequence[dict[str, Any]], t: float) -> str | None:
+        current: str | None = None
+        for span in spans:
+            if float(span["start"]) <= t:
+                current = span.get("text")
+            else:
+                break
+        return current
+
+    def _initial_speech(self, record: EpisodeRecord) -> str | None:
+        prompt = load_prompt("module_2_initial_speech").format(
+            episode_task=record.episode_task,
+        )
+        messages = [{"role": "user", "content": [{"type": "text", "text": prompt}]}]
+        result = self.vlm.generate_json([messages])[0]
+        if isinstance(result, dict) and isinstance(result.get("text"), str):
+            text = result["text"].strip()
+            if text:
+                return text
+        return None
+
+    def _mid_episode_interjections(
+        self,
+        record: EpisodeRecord,
+        subtask_spans: Sequence[dict[str, Any]],
+    ) -> list[dict[str, Any]]:
+        """Generate interjections aligned with the actual demo trajectory.
+
+        Teleop data is frozen — the robot already executed every step in
+        the video. A *counterfactual* interjection like "actually skip
+        the wipe" contradicts what then happens in the video, which is
+        what qwen36moe-10/11 surfaced as low-quality interjections.
+
+        Instead, anchor every interjection at a subtask boundary and
+        write it as a natural user request for the *upcoming* subtask.
+        The robot's visible next behavior IS the interjection's effect,
+        so the training signal stays consistent: interjection text →
+        plan refresh → action stream all line up.
+        """
+        if self.config.max_interjections_per_episode <= 0:
+            return []
+        if len(subtask_spans) < 2:
+            # Need at least one transition (subtask 0 → subtask 1).
+            return []
+        # Deterministic per-episode RNG so reruns are stable across SLURM jobs.
+        rng = random.Random(f"{self.seed}:{record.episode_index}:interjection")
+
+        # Boundaries: the start time of every subtask except the first
+        # (which is just t0 and is covered by the initial-task speech atom).
+        boundaries: list[tuple[float, str, str]] = []
+        for i in range(1, len(subtask_spans)):
+            ts = float(subtask_spans[i]["start"])
+            if ts < self.config.interjection_min_t:
+                continue
+            prev_text = (subtask_spans[i - 1].get("text") or "").strip()
+            next_text = (subtask_spans[i].get("text") or "").strip()
+            if not next_text:
+                continue
+            boundaries.append((ts, prev_text, next_text))
+        if not boundaries:
+            return []
+
+        n = min(self.config.max_interjections_per_episode, len(boundaries))
+        chosen = sorted(rng.sample(boundaries, n), key=lambda b: b[0])
+
+        out: list[dict[str, Any]] = []
+        for t, prev_subtask, next_subtask in chosen:
+            t_snap = _snap_to_frame(t, record.frame_timestamps)
+            # Window straddles the boundary so the VLM sees the end of the
+            # previous subtask and the start of the next one — same
+            # conditioning the policy will see at training time.
+            window_ts = self._window_timestamps(t_snap, record.frame_timestamps)
+            prompt = load_prompt("module_2_interjection").format(
+                episode_task=record.episode_task,
+                prev_subtask=prev_subtask or "(starting from initial state)",
+                next_subtask=next_subtask,
+                timestamp=t_snap,
+                window_seconds=self.config.interjection_window_seconds,
+            )
+            images = self.frame_provider.frames_at(record, window_ts)
+            content = [*to_image_blocks(images), {"type": "text", "text": prompt}]
+            messages = [{"role": "user", "content": content}]
+            result = self.vlm.generate_json([messages])[0]
+            if not isinstance(result, dict):
+                continue
+            interjection_text = result.get("interjection")
+            speech_text = result.get("speech")
+            if not isinstance(interjection_text, str) or not interjection_text.strip():
+                continue
+            if not isinstance(speech_text, str) or not speech_text.strip():
+                continue
+            out.append(
+                {
+                    "role": "user",
+                    "content": interjection_text.strip(),
+                    "style": "interjection",
+                    "timestamp": t_snap,
+                    "tool_calls": None,
+                }
+            )
+            out.append(speech_atom(t_snap, speech_text.strip()))
+        return out
+
+    def _window_timestamps(
+        self, t_anchor: float, frame_timestamps: Sequence[float]
+    ) -> list[float]:
+        """Return a small set of frame timestamps centered on ``t_anchor``.
+
+        The window straddles the subtask boundary the interjection sits
+        on: roughly half the frames cover the end of the previous
+        subtask, half cover the start of the next one. The VLM therefore
+        sees BOTH what just finished AND what's about to start, which is
+        the conditioning we need to write a natural "now please do X"
+        request that matches the visible upcoming behavior.
+        """
+        if not frame_timestamps:
+            return [t_anchor]
+        n = max(1, int(self.config.interjection_window_frames))
+        if n == 1:
+            return [t_anchor]
+        window = float(self.config.interjection_window_seconds)
+        step = window / max(1, n - 1)
+        # Center the window on the anchor so half lands before, half after.
+        start_offset = -window / 2.0
+        targets = [t_anchor + start_offset + step * i for i in range(n)]
+        last_ts = float(frame_timestamps[-1])
+        snapped: list[float] = []
+        seen: set[float] = set()
+        for tgt in targets:
+            clamped = min(last_ts, max(0.0, tgt))
+            t = _snap_to_frame(clamped, frame_timestamps)
+            if t not in seen:
+                seen.add(t)
+                snapped.append(t)
+        return snapped or [t_anchor]
@@ -0,0 +1,442 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Module 1: subtask decomposition + plan + memory (PERSISTENT styles)."""
+
+from __future__ import annotations
+
+from collections.abc import Sequence
+from dataclasses import dataclass, field
+from pathlib import Path
+from typing import Any
+
+from ..config import Module1Config
+from ..frames import (
+    FrameProvider,
+    VideoFrameProvider,
+    episode_clip_path,
+    null_provider,
+    to_video_block,
+    to_video_url_block,
+)
+from ..prompts import load as load_prompt
+from ..reader import EpisodeRecord
+from ..staging import EpisodeStaging
+from ..vlm_client import VlmClient
+
+
+def _snap_to_frame(t: float, frame_timestamps: Sequence[float]) -> float:
+    """Snap an arbitrary float to the nearest exact source frame timestamp."""
+    if not frame_timestamps:
+        return float(t)
+    nearest = min(frame_timestamps, key=lambda f: abs(f - t))
+    return float(nearest)
+
+
+@dataclass
+class PlanSubtasksMemoryModule:
+    """Generate subtask spans, plan, and memory rows.
+
+    All output is persistent (lives in ``language_persistent``):
+
+    - ``subtask`` rows: one per span, stamped at the span's *start* timestamp
+      (snapped to an exact frame).
+    - ``plan`` rows: emitted at ``t=0``; refreshed at every interjection
+      timestamp via :meth:`run_plan_updates` (called by the executor after
+      Module 2 completes).
+    - ``memory`` rows: emitted at each subtask boundary (= subtask start
+      timestamp from the second subtask onward).
+    """
+
+    vlm: VlmClient
+    config: Module1Config
+    frame_provider: FrameProvider = field(default_factory=null_provider)
+
+    @property
+    def enabled(self) -> bool:
+        return self.config.enabled
+
+    def run_episode(self, record: EpisodeRecord, staging: EpisodeStaging) -> None:
+        rows: list[dict[str, Any]] = []
+        # Resolve the task that drives every other Module-1 prompt. May be
+        # the canonical ``record.episode_task`` (default), or a fresh
+        # description derived from the video when the canonical task is
+        # empty / placeholder / forced-off (see Module1Config.derive_task_*).
+        effective_task = self._resolve_effective_task(record)
+        # ``task_aug`` rows at t=0 (role=user), one per rephrasing — the
+        # PR 1 renderer rotates ``${task}`` deterministically through them
+        # so the policy sees diverse phrasings during training.
+        t0 = float(record.frame_timestamps[0]) if record.frame_timestamps else 0.0
+        if self.config.n_task_rephrasings > 0 and effective_task:
+            rephrasings = self._generate_task_rephrasings(
+                effective_task, n=self.config.n_task_rephrasings
+            )
+            # Always include the effective task itself as the first variant
+            # so the rotation is guaranteed to cover the source-of-truth
+            # phrasing, not just synthetic alternatives.
+            seen: set[str] = set()
+            ordered = [effective_task, *rephrasings]
+            for phrasing in ordered:
+                key = phrasing.strip()
+                if not key or key in seen:
+                    continue
+                seen.add(key)
+                rows.append(
+                    {
+                        "role": "user",
+                        "content": key,
+                        "style": "task_aug",
+                        "timestamp": t0,
+                        "tool_calls": None,
+                    }
+                )
+
+        subtask_spans = self._generate_subtasks(record, task=effective_task)
+        # subtask rows
+        for span in subtask_spans:
+            rows.append(
+                {
+                    "role": "assistant",
+                    "content": span["text"],
+                    "style": "subtask",
+                    "timestamp": _snap_to_frame(span["start"], record.frame_timestamps),
+                    "tool_calls": None,
+                }
+            )
+        # plan row at t=0
+        plan_text = self._generate_plan(record, subtask_spans, task=effective_task)
+        if plan_text is not None:
+            rows.append(
+                {
+                    "role": "assistant",
+                    "content": plan_text,
+                    "style": "plan",
+                    "timestamp": float(t0),
+                    "tool_calls": None,
+                }
+            )
+        # memory rows at every subtask boundary except the very first start
+        prior_memory = ""
+        for i, span in enumerate(subtask_spans[1:], start=1):
+            completed = subtask_spans[i - 1]["text"]
+            remaining = [s["text"] for s in subtask_spans[i:]]
+            mem_text = self._generate_memory(
+                record, prior_memory, completed, remaining, task=effective_task
+            )
+            if mem_text:
+                ts = _snap_to_frame(span["start"], record.frame_timestamps)
+                rows.append(
+                    {
+                        "role": "assistant",
+                        "content": mem_text,
+                        "style": "memory",
+                        "timestamp": ts,
+                        "tool_calls": None,
+                    }
+                )
+                prior_memory = mem_text
+        staging.write("module_1", rows)
+
+    # ------------------------------------------------------------------
+    # Task derivation + rephrasings
+    # ------------------------------------------------------------------
+
+    _PLACEHOLDER_TASKS: frozenset[str] = frozenset(
+        {
+            "debug",
+            "test",
+            "tbd",
+            "todo",
+            "n/a",
+            "na",
+            "untitled",
+            "unnamed",
+            "default",
+            "placeholder",
+        }
+    )
+
+    def _resolve_effective_task(self, record: EpisodeRecord) -> str:
+        """Decide which task string drives Module 1 for this episode.
+
+        Returns the user-supplied ``record.episode_task`` unless
+        ``derive_task_from_video`` says otherwise (see config docstring).
+        Falls back gracefully to the canonical task if video derivation
+        fails.
+        """
+        canonical = (record.episode_task or "").strip()
+        mode = (self.config.derive_task_from_video or "off").strip().lower()
+        if mode == "always":
+            derived = self._derive_task_from_video(record)
+            return derived or canonical
+        if mode == "if_short" and self._task_seems_bad(canonical):
+            derived = self._derive_task_from_video(record)
+            if derived:
+                return derived
+        return canonical
+
+    def _task_seems_bad(self, task: str) -> bool:
+        if not task:
+            return True
+        if len(task.split()) < int(self.config.derive_task_min_words):
+            return True
+        if task.lower() in self._PLACEHOLDER_TASKS:
+            return True
+        return False
+
+    def _derive_task_from_video(self, record: EpisodeRecord) -> str | None:
+        """Ask the VLM "what is this video about" with no task hint at all."""
+        prompt = load_prompt("module_1_video_task")
+        video_block = self._episode_video_block(record)
+        content = [*video_block, {"type": "text", "text": prompt}]
+        messages = [{"role": "user", "content": content}]
+        result = self.vlm.generate_json([messages])[0]
+        if isinstance(result, dict) and isinstance(result.get("task"), str):
+            text = result["task"].strip()
+            if text:
+                return text
+        return None
+
+    def _generate_task_rephrasings(self, base_task: str, *, n: int) -> list[str]:
+        """Generate ``n`` text-only paraphrases of ``base_task``."""
+        if n <= 0 or not base_task:
+            return []
+        prompt = load_prompt("module_1_task_rephrasings").format(
+            base_task=base_task, n=n
+        )
+        messages = [{"role": "user", "content": [{"type": "text", "text": prompt}]}]
+        result = self.vlm.generate_json([messages])[0]
+        if not isinstance(result, dict):
+            return []
+        raw = result.get("rephrasings")
+        if not isinstance(raw, list):
+            return []
+        out: list[str] = []
+        for item in raw:
+            if isinstance(item, str):
+                cleaned = item.strip().strip('"').strip("'")
+                if cleaned:
+                    out.append(cleaned)
+        return out[:n]
+
+    def _episode_video_block(self, record: EpisodeRecord) -> list[dict[str, Any]]:
+        """Build the episode's video block: a clip URL when enabled, else sampled frames."""
+        if not record.frame_timestamps:
+            return []
+        if self.config.use_video_url and isinstance(self.frame_provider, VideoFrameProvider):
+            cache_dir = Path(self.frame_provider.root) / ".annotate_staging" / ".video_clips"
+            clip = episode_clip_path(record, self.frame_provider, cache_dir)
+            return (
+                to_video_url_block(f"file://{clip}", fps=self.config.use_video_url_fps)
+                if clip is not None
+                else []
+            )
+        episode_duration = record.frame_timestamps[-1] - record.frame_timestamps[0]
+        target_count = max(
+            1, int(round(episode_duration * self.config.frames_per_second))
+        )
+        target_count = min(target_count, self.config.max_video_frames)
+        video_frames = self.frame_provider.video_for_episode(record, target_count)
+        return to_video_block(video_frames)
+
+    def run_plan_updates(
+        self,
+        record: EpisodeRecord,
+        staging: EpisodeStaging,
+        interjection_times: Sequence[float],
+        interjection_texts: Sequence[str] | None = None,
+    ) -> None:
+        """Append additional ``plan`` rows at every interjection timestamp.
+
+        Plans refresh ONLY on user interjections: subtask generation
+        runs ~1 Hz at inference, but plan re-emission is event-driven.
+        The interjection's own text is forwarded into the prompt so the
+        refreshed plan can reflect the user's correction rather than only
+        the fact that an interjection happened. Timestamps that already
+        carry a ``plan`` row are skipped.
+        """
+        existing = staging.read("module_1")
+        spans = self._reconstruct_subtasks_from_rows(existing)
+        already_planned: set[float] = {
+            float(r["timestamp"]) for r in existing if r.get("style") == "plan"
+        }
+        new_rows = list(existing)
+
+        texts: list[str | None] = (
+            [None] * len(interjection_times)
+            if interjection_texts is None
+            else [str(t) if t else None for t in interjection_texts]
+        )
+        for raw_t, inter_text in zip(interjection_times, texts, strict=True):
+            t = _snap_to_frame(raw_t, record.frame_timestamps)
+            if t in already_planned:
+                continue
+            already_planned.add(t)
+            plan_text = self._generate_plan(
+                record, spans, refresh_t=t, interjection=inter_text
+            )
+            if plan_text is not None:
+                new_rows.append(
+                    {
+                        "role": "assistant",
+                        "content": plan_text,
+                        "style": "plan",
+                        "timestamp": t,
+                        "tool_calls": None,
+                    }
+                )
+        staging.write("module_1", new_rows)
+
+    @staticmethod
+    def _reconstruct_subtasks_from_rows(rows: Sequence[dict[str, Any]]) -> list[dict[str, Any]]:
+        out = []
+        last_t: float | None = None
+        for row in sorted(
+            (r for r in rows if r.get("style") == "subtask"),
+            key=lambda r: float(r["timestamp"]),
+        ):
+            t = float(row["timestamp"])
+            if last_t is not None:
+                out[-1]["end"] = t
+            out.append({"text": row.get("content") or "", "start": t, "end": t})
+            last_t = t
+        return out
+
+    def _generate_subtasks(
+        self, record: EpisodeRecord, *, task: str | None = None
+    ) -> list[dict[str, Any]]:
+        if record.row_count == 0 or not record.frame_timestamps:
+            return []
+        episode_duration = record.frame_timestamps[-1] - record.frame_timestamps[0]
+        prompt = load_prompt("module_1_subtasks").format(
+            episode_task=(task if task is not None else record.episode_task),
+            min_subtask_seconds=self.config.min_subtask_seconds,
+            max_steps=self.config.plan_max_steps,
+            episode_duration=f"{episode_duration:.3f}",
+        )
+        video_block = self._episode_video_block(record)
+        content = [*video_block, {"type": "text", "text": prompt}]
+        messages = [{"role": "user", "content": content}]
+        result = self.vlm.generate_json([messages])[0]
+        spans = result.get("subtasks") if isinstance(result, dict) else None
+        if not spans:
+            return []
+        # clamp to [t0, t_last] and sort
+        t0 = record.frame_timestamps[0]
+        t_last = record.frame_timestamps[-1]
+        cleaned: list[dict[str, Any]] = []
+        for span in spans:
+            try:
+                start = float(span["start"])
+                end = float(span["end"])
+                text = str(span["text"]).strip()
+            except (KeyError, ValueError, TypeError):
+                continue
+            start = max(t0, min(start, t_last))
+            end = max(t0, min(end, t_last))
+            if end < start:
+                start, end = end, start
+            if not text:
+                continue
+            cleaned.append({"text": text, "start": start, "end": end})
+        cleaned.sort(key=lambda s: s["start"])
+        return cleaned
+
+    def _generate_plan(
+        self,
+        record: EpisodeRecord,
+        subtask_spans: Sequence[dict[str, Any]],
+        *,
+        refresh_t: float | None = None,
+        interjection: str | None = None,
+        task: str | None = None,
+    ) -> str | None:
+        if not subtask_spans:
+            return None
+        subtasks_text = "\n".join(f"- {s['text']}" for s in subtask_spans)
+        prompt = load_prompt("module_1_plan").format(
+            episode_task=(task if task is not None else record.episode_task),
+            subtasks_text=subtasks_text,
+            plan_max_steps=self.config.plan_max_steps,
+        )
+        if refresh_t is not None:
+            # ``current_subtask`` is the span the refresh time falls into,
+            # so the model knows where in the demonstration the planner is
+            # standing when it re-emits. Spans are sorted by start, and the
+            # last reconstructed span has ``end == start``, so match on
+            # ``start`` alone and keep the latest span that has begun.
+            current_subtask = ""
+            for span in subtask_spans:
+                if float(span["start"]) > refresh_t:
+                    break
+                current_subtask = span.get("text", "")
+            if interjection:
+                prompt += (
+                    f"\n\n(Plan refresh at t={refresh_t:.2f}s after a user "
+                    f"interjection: {interjection!r}. Current subtask just "
+                    f"before the interjection: {current_subtask!r}. Update "
+                    f"the plan so it reflects the interjection — drop or "
+                    f"reorder steps as needed; do not just restate.)\n"
+                )
+            else:
+                # Refresh without an interjection text: still tell the model
+                # where in the episode the plan stands so the re-emission
+                # is grounded. Should be rare — plan refreshes are
+                # interjection-driven by design.
+                prompt += (
+                    f"\n\n(Plan refresh at t={refresh_t:.2f}s. Current "
+                    f"subtask: {current_subtask!r}.)\n"
+                )
+        messages = [{"role": "user", "content": [{"type": "text", "text": prompt}]}]
+        result = self.vlm.generate_json([messages])[0]
+        if isinstance(result, dict) and isinstance(result.get("plan"), str):
+            return result["plan"].strip()
+        return None
+
+    def _generate_memory(
+        self,
+        record: EpisodeRecord,
+        prior_memory: str,
+        completed: str,
+        remaining: Sequence[str],
+        *,
+        task: str | None = None,
+    ) -> str:
+        prompt = load_prompt("module_1_memory").format(
+            episode_task=(task if task is not None else record.episode_task),
+            prior_memory=prior_memory or "(none)",
+            completed_subtask=completed,
+            remaining_subtasks=", ".join(remaining) if remaining else "(none)",
+        )
+        messages = [{"role": "user", "content": [{"type": "text", "text": prompt}]}]
+        result = self.vlm.generate_json([messages])[0]
+        if isinstance(result, dict) and isinstance(result.get("memory"), str):
+            return result["memory"].strip()
+        return ""
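The span clean-up in `_generate_subtasks` (drop malformed entries, clamp to the episode bounds, swap inverted intervals, sort by start) can be exercised without a VLM. Below is a minimal standalone sketch of that logic; `clean_spans` and the sample data are hypothetical names introduced for illustration and are not part of this PR.

```python
from typing import Any


def clean_spans(spans: list[Any], t0: float, t_last: float) -> list[dict[str, Any]]:
    """Hypothetical helper mirroring the clamp-and-sort step shown in the diff."""
    cleaned: list[dict[str, Any]] = []
    for span in spans:
        try:
            start = float(span["start"])
            end = float(span["end"])
            text = str(span["text"]).strip()
        except (KeyError, ValueError, TypeError):
            # Malformed entry (missing key, non-numeric time, not a mapping): skip it.
            continue
        # Clamp both ends into [t0, t_last].
        start = max(t0, min(start, t_last))
        end = max(t0, min(end, t_last))
        if end < start:
            start, end = end, start
        if not text:
            continue
        cleaned.append({"text": text, "start": start, "end": end})
    cleaned.sort(key=lambda s: s["start"])
    return cleaned


# Illustrative VLM output with the kinds of defects the clean-up guards against.
raw = [
    {"text": "place the cup", "start": 4.0, "end": 9.5},   # end past the episode
    {"text": "grasp the cup", "start": -1.0, "end": 4.0},  # start before the episode
    {"text": "  ", "start": 1.0, "end": 2.0},              # blank text
    {"start": 2.0, "end": 3.0},                            # missing "text"
    "not a dict",                                          # wrong type entirely
]
spans = clean_spans(raw, t0=0.0, t_last=8.0)
```

Only the two well-formed spans survive, clamped to the episode and ordered by start time.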
@@ -0,0 +1,33 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Prompt templates loaded as plain text.
+
+One file per use site. Templates use ``str.format(**vars)`` substitution; we
+intentionally avoid jinja2 here so the templates remain inspectable in
+plain editors and roundtrip cleanly through ``ruff format``.
+"""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+_DIR = Path(__file__).parent
+
+
+def load(name: str) -> str:
+    """Read prompt template ``name.txt`` from the ``prompts/`` directory."""
+    path = _DIR / f"{name}.txt"
+    return path.read_text(encoding="utf-8")
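Because templates go through `str.format`, any literal JSON brace in a prompt has to be doubled, which is why the prompt files in this PR write `{{ ... }}` around their output schemas. A minimal sketch of that behaviour, assuming an inline template string instead of a file on disk (the template text and task value are illustrative only):

```python
import json

# Hypothetical inline template; real templates live in ``prompts/<name>.txt``.
TEMPLATE = (
    'Episode task: "{episode_task}"\n'
    "Output strictly valid JSON:\n"
    '  {{ "memory": "<one or two short sentences>" }}\n'
)

rendered = TEMPLATE.format(episode_task="stack the cups")

# Doubled braces collapse to single braces; named fields are substituted.
json_line = rendered.splitlines()[-1].strip()
parsed = json.loads(json_line)
```

A template that forgets to double a brace fails at `.format()` time (typically with `KeyError` or `ValueError`), so a malformed prompt surfaces before any VLM call is made.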
@@ -0,0 +1,25 @@
+You are updating the robot's compressed semantic memory at the boundary of
+a completed subtask.
+
+Reference (verbatim from MEM, Torne 2026):
+"Remove or compress information in the language memory whenever
+appropriate. Keep ONLY the minimal set of relevant information for future
+task execution. Specific object attributes (colors, precise quantities of
+each item) get discarded when their details won't affect subsequent
+actions. Functional outcomes (where items went, how many) are preserved."
+
+Concrete example from MEM:
+  Before: "I put a light green bowl, a dark blue bowl and a bright yellow
+           bowl into the top right cabinet"
+  After:  "I placed three bowls in the top right cabinet"
+
+Episode task: "{episode_task}"
+Previous memory: {prior_memory}
+Just-completed subtask: "{completed_subtask}"
+Remaining subtasks (for relevance judgement only): {remaining_subtasks}
+
+Update the memory. Drop irrelevant detail. Compress completed steps.
+Keep WHAT happened, drop HOW. Shorter is better.
+
+Output strictly valid JSON:
+  {{ "memory": "<one or two short sentences>" }}
@@ -0,0 +1,18 @@
+You are the high-level planner for a robot demonstrating: "{episode_task}".
+
+Given the subtask decomposition below, write a concise hierarchical PLAN
+the robot should follow. Format the plan as a numbered list, one line per
+high-level step. The plan describes the full task; subtasks are the atomic
+skills used to execute it.
+
+Subtasks for context:
+{subtasks_text}
+
+Authoring rules:
+- 3 to {plan_max_steps} steps.
+- Each step describes one logical chunk of the task, not one motion.
+- Steps must be in execution order.
+- The plan text itself is plain prose: no nested JSON, no markdown headers.
+
+Output strictly valid JSON:
+  {{ "plan": "1. ...\n2. ...\n3. ..." }}
@@ -0,0 +1,33 @@
+You are labeling a teleoperated robot demonstration.
+
+The user originally asked: "{episode_task}"
+
+You are shown the entire demonstration as a single video. Watch the
+whole clip, then segment it into a list of consecutive atomic subtasks
+the robot performs.
+
+Authoring rules — based on Hi Robot (Shi 2025) atom granularity and
+Pi0.7 (Physical Intelligence 2025) "how, not what" detail:
+
+- Each subtask is one atomic skill the low-level policy can execute,
+  e.g. "pick up one piece of lettuce", "place the bowl into the box",
+  "move the right arm to the left".
+- Capture HOW the subtask is performed, not only WHAT — e.g. prefer
+  "grasp the handle of the sponge with the left hand" to "pick up the
+  sponge".
+- Subtasks are non-overlapping and cover the full episode in order.
+  Choose the cut points yourself based on what you see in the video
+  (gripper open/close events, contact, regrasps, transitions).
+- Each subtask spans at least {min_subtask_seconds} seconds.
+- Do not exceed {max_steps} subtasks total.
+- Every subtask's "start" and "end" (in seconds) must lie within
+  [0.0, {episode_duration}].
+
+Output strictly valid JSON of shape:
+
+  {{
+    "subtasks": [
+      {{"text": "<how-not-what>", "start": <float>, "end": <float>}},
+      ...
+    ]
+  }}
@@ -0,0 +1,32 @@
+You are generating training data for a Hi Robot-style policy. We need
+{n} alternative phrasings of the same robot task so the policy sees
+diverse user prompts during training instead of the same canonical
+string repeated every frame.
+
+Original task:
+"{base_task}"
+
+Generate exactly {n} alternative phrasings of the same task. Vary:
+
+- formality (casual / polite / curt)
+- verbosity (short imperative vs longer polite request)
+- word choice (synonyms, different verbs)
+- sentence structure (imperative / question / suggestion)
+
+Hard rules:
+- Each phrasing MUST preserve the exact meaning of the original task.
+  Do not change which object is involved, the destination, or the
+  action. Do not add extra steps. Do not invent new objects.
+- Each phrasing must be a single short sentence, plain prose, no
+  markdown, no quotes, no list numbers.
+- Phrasings must be distinct — no near-duplicates.
+- Output exactly {n} entries.
+
+Output strictly valid JSON:
+  {{
+    "rephrasings": [
+      "<phrasing 1>",
+      "<phrasing 2>",
+      ...
+    ]
+  }}
@@ -0,0 +1,17 @@
+The video above shows a robot manipulation episode in full. Look at
+the entire video and describe in ONE concise sentence what the robot
+is doing.
+
+Rules:
+- One sentence, in natural English, like a user instruction.
+- Capture the goal of the demonstration, not low-level motions.
+  Example: "place the yellow cube into the red bin" — not "move the
+  end-effector down 5cm and close the gripper".
+- 4 to 15 words. Plain prose, no markdown, no bullets, no quotes.
+- Do not invent objects or actions that aren't visible.
+- Do not output anything other than the JSON object below.
+
+Output strictly valid JSON:
+  {{
+    "task": "<single concise sentence describing what the robot does in this video>"
+  }}
@@ -0,0 +1,10 @@
+The user just asked the robot: "{episode_task}".
+
+Generate a short verbal acknowledgement the robot would speak back before
+beginning the task. Style: confident, friendly, single short sentence.
+
+Examples (Hi Robot, Shi 2025): "Sure, I won't put cheese on it.",
+"OK, starting with the sponge.", "Got it.".
+
+Output strictly valid JSON:
+  {{ "text": "<the spoken acknowledgement>" }}
@@ -0,0 +1,46 @@
+You are generating training data for a Hi Robot-style hierarchical
+robot policy. The robot in this demonstration has ALREADY executed
+every step shown in the video — we cannot retroactively change the
+action stream. To keep training data consistent with the video, the
+"interjection" must align with what the robot is *about to do next* in
+the demonstration, framed as a natural mid-task user request.
+
+The episode's overall task: "{episode_task}".
+
+The images above show roughly {window_seconds:.1f} seconds straddling a
+subtask boundary in the demonstration:
+
+- Subtask the robot just finished: "{prev_subtask}"
+- Subtask the robot is about to start: "{next_subtask}"
+- Time into episode: {timestamp:.2f}s
+
+Write ONE interjection the user would naturally say at this moment to
+prompt / confirm / encourage the robot to do "{next_subtask}". Phrase it
+like a real human mid-task remark — conversational, varied, sometimes
+just a nudge, sometimes a clarification, sometimes a small constraint
+that the upcoming motion happens to satisfy. Also write the robot's
+one-sentence verbal acknowledgement.
+
+Hard rules:
+
+- The interjection MUST be consistent with the next subtask. The user
+  cannot ask for something different from what the robot then does in
+  the video. If you're tempted to say "actually skip X" or "do Y
+  instead", DO NOT — those would contradict the demonstration.
+- The interjection must reference an object, location, or action that
+  is plausible given the visible scene and the next subtask text.
+- One sentence each. Conversational, not robotic.
+
+Style examples (vary the phrasing — don't reuse these verbatim):
+  - "Now go ahead and {next_subtask}."
+  - "Great, can you {next_subtask} next?"
+  - "{next_subtask}, please."
+  - "Before you continue, please {next_subtask}."
+  - "Looking good — {next_subtask} now."
+  - "Okay, {next_subtask}."
+
+Output strictly valid JSON:
+  {{
+    "interjection": "<single sentence the user says, asking for the next subtask>",
+    "speech":       "<single sentence the robot speaks back, confirming and starting>"
+  }}
@@ -0,0 +1,32 @@
+You are generating a frame-grounded visual question/answer pair for
+chain-of-thought training. Reference: ECoT (Zawalski 2024) and Steerable
+Policies — both train policies on grounded features such as bounding box
+pixel coordinates, keypoints, counts, attributes, and spatial relations.
+
+The frame shows a robot working on: "{episode_task}".
+
+Question types and the EXACT answer JSON shape required for each:
+
+  bbox       => {{"detections": [{{"label": "<obj>", "bbox_format": "xyxy",
+                                    "bbox": [x1, y1, x2, y2]}}, ...]}}
+                bbox is in pixel coordinates (x_min, y_min, x_max, y_max).
+                ECoT example: "a white cup [124, 25, 176, 113]".
+
+  keypoint   => {{"label": "<point>", "point_format": "xy",
+                  "point": [x, y]}}
+
+  count      => {{"label": "<obj>", "count": <int>,
+                  "note": "<optional short note>"}}
+
+  attribute  => {{"label": "<obj>", "attribute": "<color|shape|state|...>",
+                  "value": "<observed value>"}}
+
+  spatial    => {{"subject": "<obj>", "relation":
+                  "<left_of|right_of|on|in|above|below|near>", "object": "<obj>"}}
+
+Generate a question of type "{question_type}". Output strictly valid JSON:
+
+  {{
+    "question": "<short, frame-grounded question>",
+    "answer":   <object whose shape matches the schema above>
+  }}
@@ -0,0 +1,219 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Datatrove-shaped reader.
+
+The reader walks ``data/chunk-*/file-*.parquet`` and yields one
+:class:`EpisodeRecord` per episode with:
+
+- ``episode_index``: int
+- ``episode_task``: str (canonical task from ``meta/tasks.parquet``)
+- ``frame_timestamps`` / ``frame_indices``: tuple[float, ...] / tuple[int, ...]
+- ``data_path``: pathlib.Path of the source parquet shard
+- ``row_offset`` / ``row_count``: the episode's row range within that shard
+- ``frames_df()``: method returning the pandas.DataFrame slice (loaded on demand)
+
+This shape lets each module operate per-episode without loading all parquet
+rows into memory at once. It deliberately does not depend on datatrove —
+datatrove integration wraps this generator inside a ``PipelineStep`` in
+:mod:`.executor`.
+"""
+
+from __future__ import annotations
+
+from collections.abc import Iterator
+from dataclasses import dataclass
+from pathlib import Path
+from typing import Any
+
+import pyarrow.parquet as pq
+
+from lerobot.datasets.utils import DEFAULT_TASKS_PATH
+
+
+@dataclass
+class EpisodeRecord:
+    """Per-episode record yielded by the reader."""
+
+    episode_index: int
+    episode_task: str
+    frame_timestamps: tuple[float, ...]
+    frame_indices: tuple[int, ...]
+    data_path: Path
+    row_offset: int  # row offset within the parquet file where this episode starts
+    row_count: int  # number of rows for this episode
+
+    def frames_df(self):  # type: ignore[no-untyped-def]
+        """Lazy-load the pandas slice for this episode."""
+        import pandas as pd  # noqa: PLC0415  - deferred for optional dataset extra
+
+        # Slice in Arrow first so only this episode's rows are converted to pandas.
+        table = pq.read_table(self.data_path).slice(self.row_offset, self.row_count)
+        df: pd.DataFrame = table.to_pandas()
+        return df.reset_index(drop=True)
+
+
+def _load_tasks_lookup(root: Path) -> dict[int, str]:
+    tasks_path = root / DEFAULT_TASKS_PATH
+    if not tasks_path.exists():
+        return {}
+    table = pq.read_table(tasks_path)
+    cols = {name: table.column(name).to_pylist() for name in table.column_names}
+    if "task_index" in cols and "task" in cols:
+        return dict(zip(cols["task_index"], cols["task"], strict=True))
+    raise ValueError(f"meta/tasks.parquet at {tasks_path} missing 'task_index' or 'task'")
+
+
+def iter_episodes(root: Path, *, only_episodes: tuple[int, ...] | None = None) -> Iterator[EpisodeRecord]:
+    """Yield :class:`EpisodeRecord` for every episode under ``root/data/``.
+
+    Files are visited in sorted-path order and rows in file order, so output is
+    ascending by ``episode_index`` only when the dataset is laid out that way.
+    Every ``*.parquet`` under ``data/`` is scanned; rows group by ``episode_index``.
+    """
+    tasks = _load_tasks_lookup(root)
+    data_dir = root / "data"
+    parquet_files = sorted(data_dir.rglob("*.parquet"))
+
+    only_set = set(only_episodes) if only_episodes is not None else None
+
+    for path in parquet_files:
+        yield from _iter_one_path(path, tasks, only_set)
+
+
+def _iter_one_path(path: Path, tasks: dict[int, str], only_set: set[int] | None) -> Iterator[EpisodeRecord]:
+    table = pq.read_table(path)
+    names = table.column_names
+    if "episode_index" not in names:
+        return
+    episode_col = table.column("episode_index").to_pylist()
+    timestamp_col = (
+        table.column("timestamp").to_pylist() if "timestamp" in names else [0.0] * len(episode_col)
+    )
+    frame_col = (
+        table.column("frame_index").to_pylist() if "frame_index" in names else list(range(len(episode_col)))
+    )
+    task_col = table.column("task_index").to_pylist() if "task_index" in names else None
+
+    def _build(
+        ep: int,
+        start: int,
+        end: int,
+        task_idx: int | None,
+        ts_buf: list[float],
+        fi_buf: list[int],
+    ) -> EpisodeRecord | None:
+        if only_set is not None and ep not in only_set:
+            return None
+        task = tasks.get(task_idx, "") if task_idx is not None else ""
+        return EpisodeRecord(
+            episode_index=ep,
+            episode_task=task,
+            frame_timestamps=tuple(ts_buf),
+            frame_indices=tuple(fi_buf),
+            data_path=path,
+            row_offset=start,
+            row_count=end - start,
+        )
+
+    cur_ep: int | None = None
+    start_offset = 0
+    ts_buf: list[float] = []
+    fi_buf: list[int] = []
+    cur_task_idx: int | None = None
+
+    for i, ep in enumerate(episode_col):
+        if cur_ep is None:
+            cur_ep = ep
+            start_offset = i
+            ts_buf = [timestamp_col[i]]
+            fi_buf = [frame_col[i]]
+            cur_task_idx = task_col[i] if task_col is not None else None
+            continue
+        if ep != cur_ep:
+            rec = _build(cur_ep, start_offset, i, cur_task_idx, ts_buf, fi_buf)
+            if rec is not None:
+                yield rec
+            cur_ep = ep
+            start_offset = i
+            ts_buf = [timestamp_col[i]]
+            fi_buf = [frame_col[i]]
+            cur_task_idx = task_col[i] if task_col is not None else None
+        else:
+            ts_buf.append(timestamp_col[i])
+            fi_buf.append(frame_col[i])
+
+    if cur_ep is not None:
+        rec = _build(cur_ep, start_offset, len(episode_col), cur_task_idx, ts_buf, fi_buf)
+        if rec is not None:
+            yield rec
+
+
+def gather_data_paths(root: Path) -> list[Path]:
+    """Return every ``*.parquet`` path under ``root/data/``, sorted."""
+    return sorted((root / "data").rglob("*.parquet"))
+
+
+def episode_offsets_per_path(path: Path) -> dict[int, tuple[int, int]]:
+    """Return ``{episode_index: (row_offset, row_count)}`` for one parquet."""
+    table = pq.read_table(path, columns=["episode_index"])
+    episode_col = table.column("episode_index").to_pylist()
+    out: dict[int, tuple[int, int]] = {}
+    cur_ep: int | None = None
+    start = 0
+    for i, ep in enumerate(episode_col):
+        if cur_ep is None:
+            cur_ep = ep
+            start = i
+            continue
+        if ep != cur_ep:
+            out[cur_ep] = (start, i - start)
+            cur_ep = ep
+            start = i
+    if cur_ep is not None:
+        out[cur_ep] = (start, len(episode_col) - start)
+    return out
+
+
+def keyframe_indices(record: EpisodeRecord, k: int) -> list[int]:
+    """Return ``k`` evenly spaced row indices into the episode (relative)."""
+    n = record.row_count
+    if k <= 0 or n == 0:
+        return []
+    if k >= n:
+        return list(range(n))
+    if k == 1:
+        return [n // 2]
+    step = (n - 1) / (k - 1)
+    return [int(round(i * step)) for i in range(k)]
+
+
+def lookup_data_path(root: Path, episode_index: int) -> tuple[Path, int, int] | None:
+    """Find the parquet file containing ``episode_index`` and its slice bounds."""
+    for path in gather_data_paths(root):
+        offsets = episode_offsets_per_path(path)
+        if episode_index in offsets:
+            start, count = offsets[episode_index]
+            return path, start, count
+    return None
+
+
+def episode_frame_timestamps(root: Path, episode_index: int) -> tuple[Any, list[float]]:
+    """Return the parquet path and per-frame timestamps for ``episode_index``."""
+    found = lookup_data_path(root, episode_index)
+    if found is None:
+        raise ValueError(f"Episode {episode_index} not found under {root}/data/")
+    path, start, count = found
+    table = pq.read_table(path, columns=["timestamp"])
+    timestamps = table.column("timestamp").to_pylist()[start : start + count]
+    return path, [float(t) for t in timestamps]
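The offset bookkeeping in `episode_offsets_per_path` is a run-length scan over the `episode_index` column. The same idea can be sketched with `itertools.groupby` on a plain list; `run_offsets` is a hypothetical name used only for this illustration, and the sketch does not touch parquet at all.

```python
from itertools import groupby


def run_offsets(episode_col: list[int]) -> dict[int, tuple[int, int]]:
    """Map each episode index to ``(row_offset, row_count)`` of its run of rows."""
    out: dict[int, tuple[int, int]] = {}
    offset = 0
    for ep, run in groupby(episode_col):
        count = sum(1 for _ in run)
        # As in the diff, a later non-contiguous run of the same episode
        # would overwrite the earlier entry.
        out[ep] = (offset, count)
        offset += count
    return out


offsets = run_offsets([0, 0, 0, 1, 1, 2])
```

Here episode 0 occupies rows 0 to 2, episode 1 rows 3 to 4, and episode 2 row 5, which is the shape `lookup_data_path` unpacks into `start, count`.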
@@ -0,0 +1,98 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Per-episode staging.
+
+Each module writes its raw output as a JSONL file under
+``<staging_dir>/episode_{ep:06d}/<module>.jsonl``. The writer reads back this
+staging tree and partitions rows into the two language columns.
+
+JSONL is preferred over parquet here because the staging artifact is meant to
+be human-inspectable, easy to diff between prompt iterations, and trivially
+appended to. The final dataset format is parquet; staging is just an
+intermediate.
+"""
+
+from __future__ import annotations
+
+import json
+from collections.abc import Iterable, Iterator
+from dataclasses import dataclass
+from pathlib import Path
+from typing import Any
+
+ModuleName = str
+
+_MODULES: tuple[ModuleName, ...] = (
+    "module_1",
+    "module_2",
+    "module_3",
+)
+
+
+@dataclass
+class EpisodeStaging:
+    """Filesystem layout for a single episode's staged module outputs."""
+
+    root: Path
+    episode_index: int
+
+    @property
+    def episode_dir(self) -> Path:
+        return self.root / f"episode_{self.episode_index:06d}"
+
+    def path_for(self, module: ModuleName) -> Path:
+        if module not in _MODULES:
+            raise ValueError(f"Unknown module {module!r}; expected one of {_MODULES}")
+        return self.episode_dir / f"{module}.jsonl"
+
+    def write(self, module: ModuleName, rows: Iterable[dict[str, Any]]) -> Path:
+        path = self.path_for(module)
+        path.parent.mkdir(parents=True, exist_ok=True)
+        with path.open("w", encoding="utf-8") as f:
+            for row in rows:
+                f.write(json.dumps(row, ensure_ascii=False, sort_keys=True))
+                f.write("\n")
+        return path
+
+    def read(self, module: ModuleName) -> list[dict[str, Any]]:
+        path = self.path_for(module)
+        if not path.exists():
+            return []
+        out: list[dict[str, Any]] = []
+        with path.open(encoding="utf-8") as f:
+            for line in f:
+                line = line.strip()
+                if line:
+                    out.append(json.loads(line))
+        return out
+
+    def read_all(self) -> dict[ModuleName, list[dict[str, Any]]]:
+        return {m: self.read(m) for m in _MODULES}
+
+    def has(self, module: ModuleName) -> bool:
+        return self.path_for(module).exists()
+
+
+def iter_staged_episodes(root: Path) -> Iterator[int]:
+    """Yield episode indices that have an ``episode_*`` staging directory under ``root``."""
+    if not root.exists():
+        return
+    for child in sorted(root.iterdir()):
+        if child.is_dir() and child.name.startswith("episode_"):
+            try:
+                yield int(child.name.removeprefix("episode_"))
+            except ValueError:
+                continue
@@ -0,0 +1,334 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Pre-write validation against staged outputs.
+
+Runs after Modules 1–3 have all written their per-episode artifacts but
+*before* the writer rewrites parquet shards. The validator never touches
+parquet; it only inspects the staging tree and the source frame timestamps
+exposed by :class:`EpisodeRecord`.
+
+Checks (per the plan's "Intermediate staging and validation" section):
+
+- exact timestamp alignment against source frame timestamps
+- no orphan speech / interjection pairs
+- plan / memory emission consistency (events have a paired persistent row)
+- VQA assistant ``content`` is valid JSON (one of bbox / keypoint / count /
+  attribute / spatial)
+- every row maps to its correct column under :func:`column_for_style`
+"""
+
+from __future__ import annotations
+
+import json
+import logging
+from collections.abc import Iterable, Sequence
+from dataclasses import dataclass, field
+from pathlib import Path
+from typing import Any
+
+from lerobot.datasets.language import (
+    LANGUAGE_EVENTS,
+    LANGUAGE_PERSISTENT,
+    column_for_style,
+    is_view_dependent_style,
+    validate_camera_field,
+)
+
+from .reader import EpisodeRecord
+from .staging import EpisodeStaging
+
+logger = logging.getLogger(__name__)
+
+
+@dataclass
+class ValidationReport:
+    """Outcome of one validation pass across all episodes."""
+
+    errors: list[str] = field(default_factory=list)
+    warnings: list[str] = field(default_factory=list)
+    episodes_checked: int = 0
+
+    @property
+    def ok(self) -> bool:
+        return not self.errors
+
+    def add_error(self, message: str) -> None:
+        self.errors.append(message)
+
+    def add_warning(self, message: str) -> None:
+        self.warnings.append(message)
+
+    def summary(self) -> str:
+        return f"checked={self.episodes_checked} errors={len(self.errors)} warnings={len(self.warnings)}"
+
+
+VQA_ANSWER_SHAPES: dict[str, set[str]] = {
+    "bbox": {"detections"},
+    "keypoint": {"label", "point_format", "point"},
+    "count": {"label", "count"},
+    "attribute": {"label", "attribute", "value"},
+    "spatial": {"subject", "relation", "object"},
+}
+
+
+def classify_vqa_answer(payload: Any) -> str | None:
+    """Best-effort classification of a VQA answer payload to a question type."""
+    if not isinstance(payload, dict):
+        return None
+    keys = set(payload.keys())
+    for kind, required in VQA_ANSWER_SHAPES.items():
+        if required.issubset(keys):
+            return kind
+    return None
+
+
+@dataclass
+class StagingValidator:
+    """Walks the staging tree and produces a :class:`ValidationReport`."""
+
+    timestamp_atol: float = 0.0  # exact-match by default
+    dataset_camera_keys: tuple[str, ...] | None = None
+    """Known ``observation.images.*`` keys on the dataset. When set, the
+    validator additionally enforces that every view-dependent row's
+    ``camera`` field references one of these keys. Pass ``None`` (default)
+    to skip that cross-check (e.g. in unit tests with no real dataset)."""
+
+    def validate(
+        self,
+        records: Sequence[EpisodeRecord],
+        staging_dir: Path,
+    ) -> ValidationReport:
+        report = ValidationReport()
+        for record in records:
+            self._validate_episode(record, staging_dir, report)
+            report.episodes_checked += 1
+        return report
+
+    def _validate_episode(
+        self,
+        record: EpisodeRecord,
+        staging_dir: Path,
+        report: ValidationReport,
+    ) -> None:
+        staging = EpisodeStaging(staging_dir, record.episode_index)
+        staged = staging.read_all()
+        all_rows: list[dict[str, Any]] = []
+        for module_name, rows in staged.items():
+            for row in rows:
+                row = {**row, "_module": module_name}
+                all_rows.append(row)
+
+        frame_ts = set(record.frame_timestamps)
+
+        events: list[dict[str, Any]] = []
+        persistent: list[dict[str, Any]] = []
+        for row in all_rows:
+            self._check_column_routing(row, report, record.episode_index)
+            self._check_camera_field(row, report, record.episode_index, self.dataset_camera_keys)
+            try:
+                column = column_for_style(row.get("style"))
+            except ValueError:
+                # Unknown style: already reported by _check_column_routing.
+                continue
+            (persistent if column == LANGUAGE_PERSISTENT else events).append(row)
+
+        for row in events:
+            self._check_event_timestamp_alignment(row, frame_ts, report, record.episode_index)
+
+        self._check_speech_interjection_pairs(events, report, record.episode_index)
+        self._check_plan_memory_consistency(persistent, events, report, record.episode_index)
+        self._check_vqa_json(events, report, record.episode_index)
+        self._check_vqa_uniqueness_per_frame_camera(events, report, record.episode_index)
+
+    def _check_camera_field(
+        self,
+        row: dict[str, Any],
+        report: ValidationReport,
+        episode_index: int,
+        dataset_camera_keys: Sequence[str] | None,
+    ) -> None:
+        """Enforce the camera invariant + that the key matches the dataset's cameras."""
+        style = row.get("style")
+        camera = row.get("camera")
+        try:
+            validate_camera_field(style, camera)
+        except ValueError as exc:
+            report.add_error(
+                f"ep={episode_index} module={row.get('_module')}: {exc}"
+            )
+            return
+        if (
+            is_view_dependent_style(style)
+            and dataset_camera_keys
+            and camera not in dataset_camera_keys
+        ):
+            report.add_error(
+                f"ep={episode_index} module={row.get('_module')}: camera {camera!r} on style "
+                f"{style!r} is not one of the dataset's video keys {sorted(dataset_camera_keys)!r}"
+            )
+
+    def _check_vqa_uniqueness_per_frame_camera(
+        self,
+        events: Iterable[dict[str, Any]],
+        report: ValidationReport,
+        episode_index: int,
+    ) -> None:
+        """Ensure at most one (vqa, user) and one (vqa, assistant) per (t, camera)."""
+        counts: dict[tuple[float, str, str], int] = {}
+        for row in events:
+            if row.get("style") != "vqa":
+                continue
+            ts = row.get("timestamp")
+            camera = row.get("camera")
+            role = row.get("role")
+            if ts is None or camera is None or role is None:
+                continue  # other validators flag these
+            key = (float(ts), str(camera), str(role))
+            counts[key] = counts.get(key, 0) + 1
+        for (ts, camera, role), n in counts.items():
+            if n > 1:
+                report.add_error(
+                    f"ep={episode_index}: {n} duplicate vqa rows at t={ts} "
+                    f"camera={camera!r} role={role!r}; expected at most one per (t, camera, role)"
+                )
+
+    def _check_column_routing(
+        self,
+        row: dict[str, Any],
+        report: ValidationReport,
+        episode_index: int,
+    ) -> None:
+        style = row.get("style")
+        module = row.get("_module")
+        try:
+            target_col = column_for_style(style)
+        except ValueError:
+            report.add_error(f"ep={episode_index} module={module}: unknown style {style!r}")
+            return
+        if module == "module_1" and target_col != LANGUAGE_PERSISTENT:
+            report.add_error(
+                f"ep={episode_index} module=module_1 emitted style {style!r} that routes to {target_col} (must be persistent)"
+            )
+        if module in {"module_2", "module_3"} and target_col != LANGUAGE_EVENTS:
+            report.add_error(
+                f"ep={episode_index} module={module} emitted style {style!r} that routes to {target_col} (must be events)"
+            )
+
+    def _check_event_timestamp_alignment(
+        self,
+        row: dict[str, Any],
+        frame_ts: set[float],
+        report: ValidationReport,
+        episode_index: int,
+    ) -> None:
+        ts = row.get("timestamp")
+        if ts is None:
+            report.add_error(f"ep={episode_index}: event row missing timestamp: {row!r}")
+            return
+        if self.timestamp_atol == 0.0:
+            if float(ts) not in frame_ts:
+                report.add_error(
+                    f"ep={episode_index}: event row timestamp {ts!r} does not match any source frame timestamp"
+                )
+        else:
+            if not any(abs(float(ts) - f) <= self.timestamp_atol for f in frame_ts):
+                report.add_error(
+                    f"ep={episode_index}: event row timestamp {ts!r} not within {self.timestamp_atol}s of any frame"
+                )
+
+    def _check_speech_interjection_pairs(
+        self,
+        events: Iterable[dict[str, Any]],
+        report: ValidationReport,
+        episode_index: int,
+    ) -> None:
+        speech_ts: dict[float, int] = {}
+        interjection_ts: dict[float, int] = {}
+        for row in events:
+            ts = row.get("timestamp")
+            if ts is None:
+                continue
+            ts_f = float(ts)
+            if row.get("style") is None and row.get("role") == "assistant":
+                speech_ts[ts_f] = speech_ts.get(ts_f, 0) + 1
+            if row.get("style") == "interjection":
+                interjection_ts[ts_f] = interjection_ts.get(ts_f, 0) + 1
+
+        for ts in interjection_ts:
+            if ts not in speech_ts:
+                report.add_error(f"ep={episode_index}: interjection at t={ts} has no paired speech atom")
+
+    def _check_plan_memory_consistency(
+        self,
+        persistent: Sequence[dict[str, Any]],
+        events: Sequence[dict[str, Any]],
+        report: ValidationReport,
+        episode_index: int,
+    ) -> None:
+        plan_ts = sorted({float(r["timestamp"]) for r in persistent if r.get("style") == "plan"})
+        memory_ts = sorted({float(r["timestamp"]) for r in persistent if r.get("style") == "memory"})
+        subtask_ts = sorted({float(r["timestamp"]) for r in persistent if r.get("style") == "subtask"})
+        interjection_ts = sorted(
+            {
+                float(r["timestamp"])
+                for r in events
+                if r.get("style") == "interjection" and r.get("timestamp") is not None
+            }
+        )
+
+        if persistent and not plan_ts:
+            report.add_warning(f"ep={episode_index}: persistent rows present but no plan emitted")
+        # every interjection should have a same-timestamp plan refresh
+        for ts in interjection_ts:
+            if ts not in plan_ts:
+                report.add_error(
+                    f"ep={episode_index}: interjection at t={ts} has no co-timestamped plan update"
+                )
+        # memory should be emitted at subtask boundaries (subset relation)
+        if memory_ts and subtask_ts:
+            mem_set = set(memory_ts)
+            sub_set = set(subtask_ts)
+            stray = sorted(mem_set - sub_set)
+            if stray:
+                report.add_warning(f"ep={episode_index}: memory rows at {stray} not at any subtask boundary")
+
+    def _check_vqa_json(
+        self,
+        events: Iterable[dict[str, Any]],
+        report: ValidationReport,
+        episode_index: int,
+    ) -> None:
+        for row in events:
+            if row.get("style") != "vqa" or row.get("role") != "assistant":
+                continue
+            content = row.get("content")
+            if content is None:
+                report.add_error(
+                    f"ep={episode_index}: VQA assistant row at t={row.get('timestamp')} has null content"
+                )
+                continue
+            try:
+                payload = json.loads(content)
+            except (TypeError, ValueError) as exc:
+                report.add_error(
+                    f"ep={episode_index}: VQA assistant content not valid JSON at t={row.get('timestamp')}: {exc}"
+                )
+                continue
+            shape = classify_vqa_answer(payload)
+            if shape is None:
+                report.add_error(
+                    f"ep={episode_index}: VQA assistant payload at t={row.get('timestamp')} does not match any known shape: keys={list(payload) if isinstance(payload, dict) else type(payload).__name__}"
+                )
@@ -0,0 +1,741 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Shared Qwen-VL client.
+
+The pipeline uses a single shared VLM across modules. vLLM is preferred when
+available (high throughput); transformers is the fallback, and ``openai``
+talks to any OpenAI-compatible server. A ``stub`` backend is used for unit
+tests so fixtures never call into a real model.
+
+The client speaks one method, :meth:`VlmClient.generate_json`, which:
+
+- accepts a batch of OpenAI/HF-style multimodal message lists,
+- parses each reply as JSON after stripping ``<think>`` blocks and code
+  fences (guided decoding is not enabled by the backends in this module),
+- returns one decoded response per message list, in input order,
+- and reprompts once on a JSON parse failure with an inline correction
+  message, returning ``None`` for that item if the retry also fails.
+"""
+
+from __future__ import annotations
+
+import json
+import os
+import threading
+from collections.abc import Callable, Sequence
+from dataclasses import dataclass
+from typing import Any, Protocol
+
+from .config import VlmConfig
+
+
+class VlmClient(Protocol):
+    """Protocol every backend must implement."""
+
+    def generate_json(
+        self,
+        messages_batch: Sequence[Sequence[dict[str, Any]]],
+        *,
+        max_new_tokens: int | None = None,
+        temperature: float | None = None,
+    ) -> list[Any]:
+        """Generate one JSON-decoded response per messages list."""
+
+
+@dataclass
+class StubVlmClient:
+    """Deterministic stub used in unit tests.
+
+    A test passes a callable that maps the full message list of one request
+    to a JSON-serializable response; it is called once per batch item.
+    """
+
+    responder: Callable[[Sequence[dict[str, Any]]], Any]
+
+    def generate_json(
+        self,
+        messages_batch: Sequence[Sequence[dict[str, Any]]],
+        *,
+        max_new_tokens: int | None = None,
+        temperature: float | None = None,
+    ) -> list[Any]:
+        return [self.responder(list(messages)) for messages in messages_batch]
+
+
+def _strip_to_json(text: str) -> Any:
+    text = text.strip()
+    # Strip <think>...</think> blocks (Qwen3 Thinking style). The loop stops when
+    # no closing tag follows the opening one, so stray tags such as
+    # "</think>x<think>" cannot loop forever.
+    while (start := text.find("<think>")) != -1 and (end := text.find("</think>", start)) != -1:
+        text = (text[:start] + text[end + len("</think>") :]).strip()
+    # Strip ```json ... ``` fences from chat-tuned backbones
+    if text.startswith("```"):
+        first = text.find("\n")
+        last = text.rfind("```")
+        if first != -1 and last != -1 and last > first:
+            text = text[first + 1 : last].strip()
+    try:
+        return json.loads(text)
+    except (ValueError, json.JSONDecodeError):
+        pass
+    # Fall back to extracting the first balanced {...} block.
+    obj_text = _extract_first_json_object(text)
+    if obj_text is None:
+        raise json.JSONDecodeError("No JSON object found", text, 0)
+    return json.loads(obj_text)
+
+
+def _extract_first_json_object(text: str) -> str | None:
+    """Return the first balanced ``{...}`` substring, ignoring braces in
+    string literals. Returns ``None`` if no balanced block is found."""
+    start = text.find("{")
+    if start < 0:
+        return None
+    depth = 0
+    in_string = False
+    escape = False
+    for i in range(start, len(text)):
+        ch = text[i]
+        if escape:
+            escape = False
+            continue
+        if in_string and ch == "\\":
+            escape = True
+            continue
+        if ch == '"':
+            in_string = not in_string
+            continue
+        if in_string:
+            continue
+        if ch == "{":
+            depth += 1
+        elif ch == "}":
+            depth -= 1
+            if depth == 0:
+                return text[start : i + 1]
+    return None
+
+
+@dataclass
+class _GenericTextClient:
+    """Wraps any text-generation callable in JSON-mode + one-retry semantics."""
+
+    generate_text: Callable[[Sequence[Sequence[dict[str, Any]]], int, float], list[str]]
+    config: VlmConfig
+
+    def generate_json(
+        self,
+        messages_batch: Sequence[Sequence[dict[str, Any]]],
+        *,
+        max_new_tokens: int | None = None,
+        temperature: float | None = None,
+    ) -> list[Any]:
+        max_tok = max_new_tokens if max_new_tokens is not None else self.config.max_new_tokens
+        temp = temperature if temperature is not None else self.config.temperature
+        raw = self.generate_text(messages_batch, max_tok, temp)
+        out: list[Any] = []
+        for messages, text in zip(messages_batch, raw, strict=True):
+            try:
+                out.append(_strip_to_json(text))
+                continue
+            except (ValueError, json.JSONDecodeError):
+                pass
+            retry = list(messages) + [
+                {"role": "assistant", "content": text},
+                {
+                    "role": "user",
+                    "content": (
+                        "Your previous reply was not valid JSON. "
+                        "Reply with strictly valid JSON, no prose, no fences."
+                    ),
+                },
+            ]
+            retry_text = self.generate_text([retry], max_tok, temp)[0]
+            try:
+                out.append(_strip_to_json(retry_text))
+            except (ValueError, json.JSONDecodeError):
+                # After retry: log preview and return None instead of crashing
+                # the whole pipeline. Modules treat None as "skip".
+                preview = retry_text.strip().replace("\n", " ")[:200]
+                print(
+                    f"[vlm] WARNING: failed to parse JSON after retry; preview: {preview!r}",
+                    flush=True,
+                )
+                out.append(None)
+        return out
+
+
+def make_vlm_client(config: VlmConfig) -> VlmClient:
+    """Build the shared VLM client per the configured backend.
+
+    For ``stub``, callers should construct :class:`StubVlmClient` directly with
+    a responder callable. ``stub`` here is rejected to make accidental misuse
+    obvious.
+    """
+    if config.backend == "stub":
+        raise ValueError(
+            "Use StubVlmClient(...) directly for the stub backend; make_vlm_client builds real clients."
+        )
+    if config.backend == "vllm":
+        return _make_vllm_client(config)
+    if config.backend == "transformers":
+        return _make_transformers_client(config)
+    if config.backend == "openai":
+        return _make_openai_client(config)
+    raise ValueError(f"Unknown VLM backend: {config.backend!r}")
+
+
+def _make_vllm_client(config: VlmConfig) -> VlmClient:
+    try:
+        from vllm import LLM, SamplingParams  # type: ignore[import-not-found]
+    except ImportError as exc:
+        raise ImportError(
+            "vllm is required for backend='vllm'. Install with `pip install lerobot[annotations]`."
+        ) from exc
+    # Workaround for cuDNN 9.x + torch 2.8 conv3d regression that surfaces
+    # as CUDNN_STATUS_NOT_INITIALIZED in Qwen-VL vision-tower patch
+    # embedders. Setting LEROBOT_DISABLE_CUDNN=1 forces native PyTorch
+    # convolution kernels — slower but functional.
+    import os as _os  # noqa: PLC0415
+
+    if _os.environ.get("LEROBOT_DISABLE_CUDNN", "").lower() in {"1", "true", "yes"}:
+        import torch as _torch  # noqa: PLC0415
+
+        _torch.backends.cudnn.enabled = False
+    llm_kwargs: dict[str, Any] = {
+        "model": config.model_id,
+        "tensor_parallel_size": config.tensor_parallel_size,
+        "gpu_memory_utilization": config.gpu_memory_utilization,
+        "trust_remote_code": config.trust_remote_code,
+    }
+    if config.max_model_len is not None:
+        llm_kwargs["max_model_len"] = config.max_model_len
+    llm = LLM(**llm_kwargs)
+
+    def _gen(batch: Sequence[Sequence[dict[str, Any]]], max_tok: int, temp: float) -> list[str]:
+        # ``guided_decoding`` would speed up parsing but its API differs across
+        # vllm releases (dict vs GuidedDecodingParams). The _GenericTextClient
+        # wrapper already has a one-retry JSON-recovery path, so we skip it.
+        params = SamplingParams(max_tokens=max_tok, temperature=temp)
+        # ``llm.chat`` handles chat-template application + multimodal input
+        # extraction (image/video blocks) internally, which ``llm.generate``
+        # does not.
+        outputs = llm.chat([list(m) for m in batch], params)
+        return [o.outputs[0].text for o in outputs]
+
+    return _GenericTextClient(_gen, config)
+
+
+def _make_transformers_client(config: VlmConfig) -> VlmClient:
+    try:
+        import torch  # type: ignore[import-not-found]
+        import transformers  # type: ignore[import-not-found]
+        from transformers import AutoProcessor  # type: ignore[import-not-found]
+    except ImportError as exc:
+        raise ImportError("transformers + torch are required for backend='transformers'.") from exc
+    auto_cls = (
+        getattr(transformers, "AutoModelForImageTextToText", None)
+        or getattr(transformers, "AutoModelForVision2Seq", None)
+    )
+    if auto_cls is None:
+        raise ImportError(
+            "Neither AutoModelForImageTextToText nor AutoModelForVision2Seq is available in this "
+            "transformers version. Install transformers>=4.45 (which has AutoModelForImageTextToText) "
+            "for VL models."
+        )
+    processor = AutoProcessor.from_pretrained(
+        config.model_id, trust_remote_code=config.trust_remote_code
+    )
+    import os as _os  # noqa: PLC0415
+
+    use_accelerate = _os.environ.get("LEROBOT_TRANSFORMERS_DEVICE_MAP", "manual") != "manual"
+    # ``device_map='auto'`` triggers a known std::bad_alloc on the Qwen3-VL
+    # post-load dispatch path (the alloc fails in accelerate's hook setup
+    # even with TBs of host RAM). Default to manual: load on CPU with
+    # ``low_cpu_mem_usage=True``, then ``.to("cuda")``. Set
+    # ``LEROBOT_TRANSFORMERS_DEVICE_MAP=auto`` to opt back into the old path.
+    if use_accelerate:
+        model = auto_cls.from_pretrained(
+            config.model_id,
+            torch_dtype="auto",
+            device_map="auto",
+            low_cpu_mem_usage=True,
+            trust_remote_code=config.trust_remote_code,
+        )
+    else:
+        import torch as _torch  # noqa: PLC0415
+
+        model = auto_cls.from_pretrained(
+            config.model_id,
+            torch_dtype=_torch.bfloat16,
+            low_cpu_mem_usage=True,
+            trust_remote_code=config.trust_remote_code,
+        )
+        model = model.to("cuda")
+    model.eval()
+
+    def _gen(batch: Sequence[Sequence[dict[str, Any]]], max_tok: int, temp: float) -> list[str]:
+        outs: list[str] = []
+        for messages in batch:
+            text = processor.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
+            inputs = processor(text=[text], return_tensors="pt").to(model.device)
+            with torch.no_grad():
+                gen = model.generate(
+                    **inputs,
+                    max_new_tokens=max_tok,
+                    temperature=temp,
+                    do_sample=temp > 0.0,
+                )
+            decoded = processor.batch_decode(
+                gen[:, inputs["input_ids"].shape[-1] :], skip_special_tokens=True
+            )[0]
+            outs.append(decoded)
+        return outs
+
+    return _GenericTextClient(_gen, config)
+
+
+def _make_openai_client(config: VlmConfig) -> VlmClient:
+    """Backend that talks to any OpenAI-compatible server.
+
+    Compatible with ``vllm serve``, ``transformers serve``,
+    ``ktransformers serve``, and hosted endpoints. By default the server
+    is expected to be already running. Set ``auto_serve=True`` to have
+    this client spawn one (default: ``transformers serve``), wait until
+    it's ready, and tear it down on process exit.
+
+    Image blocks ``{"type":"image", "image":<PIL.Image>}`` are
+    auto-converted to ``image_url`` data-URLs. Video blocks
+    ``{"type":"video", "video":[<PIL>...]}`` are forwarded as
+    multi-frame ``video_url`` items where supported.
+    """
+    try:
+        from openai import OpenAI  # type: ignore[import-not-found]
+    except ImportError as exc:
+        raise ImportError(
+            "openai package is required for backend='openai'. "
+            "Install with `pip install openai`."
+        ) from exc
+
+    api_base = config.api_base
+    api_key = config.api_key
+    auto_serve = config.auto_serve
+    api_bases: list[str] = [api_base]
+
+    print(
+        f"[lerobot-annotate] backend=openai model={config.model_id} "
+        f"api_base={api_base} auto_serve={auto_serve}",
+        flush=True,
+    )
+    if auto_serve:
+        if config.parallel_servers > 1:
+            print(
+                f"[lerobot-annotate] spawning {config.parallel_servers} parallel servers",
+                flush=True,
+            )
+            api_bases = _spawn_parallel_inference_servers(config)
+        elif _server_is_up(api_base):
+            print(f"[lerobot-annotate] reusing server already up at {api_base}", flush=True)
+        else:
+            print("[lerobot-annotate] no server reachable; spawning one", flush=True)
+            api_base = _spawn_inference_server(config)
+            api_bases = [api_base]
+            print(f"[lerobot-annotate] server ready at {api_base}", flush=True)
+
+    clients = [OpenAI(base_url=base, api_key=api_key) for base in api_bases]
+    # Round-robin counter used to spread requests across ``clients`` when several
+    # servers are running; it is guarded by ``rr_lock`` below.
+    rr_counter = {"i": 0}
+
+    # ``mm_processor_kwargs`` is a vllm-specific extra; transformers serve
+    # rejects it with HTTP 422. Send it only when explicitly opted in via
+    # an env var (e.g. ``LEROBOT_OPENAI_SEND_MM_KWARGS=1`` for vllm).
+    send_mm_kwargs = os.environ.get(
+        "LEROBOT_OPENAI_SEND_MM_KWARGS", ""
+    ).lower() in {"1", "true", "yes"}
+
+    rr_lock = threading.Lock()
+
+    def _one_call(
+        messages: Sequence[dict[str, Any]], max_tok: int, temp: float
+    ) -> str:
+        api_messages, mm_kwargs = _to_openai_messages(messages)
+        kwargs: dict[str, Any] = {
+            "model": config.model_id,
+            "messages": api_messages,
+            "max_tokens": max_tok,
+            "temperature": temp,
+        }
+        extra_body: dict[str, Any] = {}
+        if send_mm_kwargs and mm_kwargs:
+            extra_body["mm_processor_kwargs"] = {**mm_kwargs, "do_sample_frames": True}
+        if config.chat_template_kwargs:
+            extra_body["chat_template_kwargs"] = config.chat_template_kwargs
+        if extra_body:
+            kwargs["extra_body"] = extra_body
+        with rr_lock:
+            chosen = clients[rr_counter["i"] % len(clients)]
+            rr_counter["i"] += 1
+        response = chosen.chat.completions.create(**kwargs)
+        return response.choices[0].message.content or ""
+
+    def _gen(
+        batch: Sequence[Sequence[dict[str, Any]]], max_tok: int, temp: float
+    ) -> list[str]:
+        if len(batch) <= 1 or config.client_concurrency <= 1:
+            return [_one_call(messages, max_tok, temp) for messages in batch]
+        # Parallel fan-out — vllm batches these on the server side.
+        from concurrent.futures import ThreadPoolExecutor  # noqa: PLC0415
+
+        max_workers = min(config.client_concurrency, len(batch))
+        with ThreadPoolExecutor(max_workers=max_workers) as pool:
+            futures = [
+                pool.submit(_one_call, messages, max_tok, temp) for messages in batch
+            ]
+            return [f.result() for f in futures]
+
+    return _GenericTextClient(_gen, config)
+
+
+def _spawn_parallel_inference_servers(config: VlmConfig) -> list[str]:
+    """Spawn ``config.parallel_servers`` independent vllm replicas.
+
+    Each replica:
+    - is pinned to a single GPU via ``CUDA_VISIBLE_DEVICES``
+    - listens on ``serve_port + i``
+    - is shut down by an atexit hook that mirrors the single-server path
+
+    Returns the list of ``api_base`` URLs the client should round-robin
+    across.
+    """
+    import atexit  # noqa: PLC0415
+    import os as _os  # noqa: PLC0415
+    import shlex  # noqa: PLC0415
+    import signal  # noqa: PLC0415
+    import subprocess  # noqa: PLC0415
+    import sys  # noqa: PLC0415
+    import threading  # noqa: PLC0415
+    import time  # noqa: PLC0415
+
+    n = config.parallel_servers
+    api_bases: list[str] = []
+    procs: list[subprocess.Popen] = []
+    ready_events: list[threading.Event] = []
+    # Multiple readiness signals — uvicorn's own banner is suppressed at
+    # ``--uvicorn-log-level warning``, so we also accept vllm's own
+    # "Starting vLLM API server" line and the route-listing line. The
+    # HTTP probe below is the ultimate fallback.
+    ready_markers = (
+        "Uvicorn running",
+        "Application startup complete",
+        "Starting vLLM API server",
+        "Available routes are",
+    )
+    # Single lock for all server-stream threads so multibyte chars from
+    # different servers don't interleave and tear UTF-8 sequences.
+    print_lock = threading.Lock()
+
+    base_cmd = config.serve_command or (
+        f"vllm serve {shlex.quote(config.model_id)} "
+        f"--tensor-parallel-size 1 "
+        f"--max-model-len {config.max_model_len or 32768} "
+        f"--uvicorn-log-level warning"
+    )
+
+    num_gpus = config.num_gpus if config.num_gpus > 0 else n
+    for i in range(n):
+        port = config.serve_port + i
+        gpu = i % num_gpus
+        env = _os.environ.copy()
+        env["CUDA_VISIBLE_DEVICES"] = str(gpu)
+        cmd = base_cmd
+        if "{port}" in cmd:
+            cmd = cmd.replace("{port}", str(port))
+        else:
+            cmd = f"{cmd} --port {port}"
+        api_base = f"http://localhost:{port}/v1"
+        api_bases.append(api_base)
+        print(f"[server-{i}] launching on GPU {gpu} port {port}: {cmd}", flush=True)
+        proc = subprocess.Popen(
+            shlex.split(cmd),
+            stdout=subprocess.PIPE,
+            stderr=subprocess.STDOUT,
+            text=True,
+            bufsize=1,
+            env=env,
+        )
+        procs.append(proc)
+        ready = threading.Event()
+        ready_events.append(ready)
+
+        def _stream(idx: int, p: subprocess.Popen, ev: threading.Event) -> None:
+            # Read whole lines and emit each line atomically under the
+            # shared print_lock so output from N servers stays readable.
+            assert p.stdout is not None
+            for line in iter(p.stdout.readline, ""):
+                with print_lock:
+                    sys.stdout.write(f"[server-{idx}] {line}")
+                    if not line.endswith(("\n", "\r")):
+                        sys.stdout.write("\n")
+                    sys.stdout.flush()
+                if any(m in line for m in ready_markers):
+                    ev.set()
+
+        threading.Thread(target=_stream, args=(i, proc, ready), daemon=True).start()
+
+        def _probe(idx: int, base: str, ev: threading.Event, p: subprocess.Popen) -> None:
+            while not ev.is_set() and p.poll() is None:
+                if _server_is_up(base):
+                    print(f"[server-{idx}] ready (http probe)", flush=True)
+                    ev.set()
+                    return
+                time.sleep(2)
+
+        threading.Thread(target=_probe, args=(i, api_base, ready, proc), daemon=True).start()
+
+    def _shutdown() -> None:
+        for i, p in enumerate(procs):
+            if p.poll() is None:
+                print(f"[server-{i}] stopping pid={p.pid}", flush=True)
+                p.send_signal(signal.SIGINT)
+        for p in procs:
+            try:
+                p.wait(timeout=15)
+            except subprocess.TimeoutExpired:
+                p.kill()
+                p.wait(timeout=5)
+
+    atexit.register(_shutdown)
+
+    deadline = time.monotonic() + config.serve_ready_timeout_s
+    while any(not ev.is_set() for ev in ready_events) and time.monotonic() < deadline:
+        for i, p in enumerate(procs):
+            if p.poll() is not None:
+                raise RuntimeError(
+                    f"[server-{i}] inference server exited unexpectedly with rc={p.returncode}"
+                )
+        time.sleep(2)
+    if any(not ev.is_set() for ev in ready_events):
+        raise RuntimeError(
+            f"[server] not all replicas became ready within {config.serve_ready_timeout_s}s"
+        )
+    print(f"[lerobot-annotate] all {n} servers ready: {api_bases}", flush=True)
+    return api_bases
+
+
+def _server_is_up(api_base: str) -> bool:
+    """Return True if ``api_base/models`` answers 200 within 2 seconds."""
+    import urllib.request  # noqa: PLC0415
+
+    url = api_base.rstrip("/") + "/models"
+    try:
+        with urllib.request.urlopen(url, timeout=2) as resp:
+            return resp.status == 200
+    except Exception:  # noqa: BLE001
+        return False
+
+
+def _spawn_inference_server(config: VlmConfig) -> str:
+    """Spawn ``transformers serve`` (or ``serve_command``), wait for a
+    readiness banner or a ``/v1/models`` reply, and register a shutdown hook.
+
+    Streams the server's stdout/stderr to the parent terminal in
+    real-time on a background thread so users can see model-load
+    progress and errors as they happen.
+
+    Returns the full ``api_base`` URL the OpenAI client should use.
+    """
+    import atexit  # noqa: PLC0415
+    import shlex  # noqa: PLC0415
+    import signal  # noqa: PLC0415
+    import subprocess  # noqa: PLC0415
+    import sys  # noqa: PLC0415
+    import threading  # noqa: PLC0415
+    import time  # noqa: PLC0415
+
+    cmd = config.serve_command
+    if not cmd:
+        cmd = (
+            f"transformers serve {shlex.quote(config.model_id)} "
+            f"--port {config.serve_port} --continuous-batching"
+        )
+    api_base = f"http://localhost:{config.serve_port}/v1"
+    print(f"[server] launching: {cmd}", flush=True)
+    proc = subprocess.Popen(
+        shlex.split(cmd),
+        stdout=subprocess.PIPE,
+        stderr=subprocess.STDOUT,
+        text=True,
+        bufsize=1,
+    )
+
+    # Watch the server output for the uvicorn readiness banner. This is
+    # more reliable than polling /v1/models alone because transformers serve
+    # rescans its cache on every model-list request, which can exceed the
+    # urllib timeout; the HTTP probe below is therefore only a fallback.
+    ready_event = threading.Event()
+    # See _spawn_parallel_inference_servers for why we accept these.
+    ready_markers = (
+        "Uvicorn running",
+        "Application startup complete",
+        "Starting vLLM API server",
+        "Available routes are",
+    )
+
+    def _probe() -> None:
+        while not ready_event.is_set() and proc.poll() is None:
+            if _server_is_up(api_base):
+                print("[server] ready (http probe)", flush=True)
+                ready_event.set()
+                return
+            time.sleep(2)
+
+    threading.Thread(target=_probe, daemon=True).start()
+
+    def _stream_output() -> None:
+        # Read one character at a time instead of iterating lines so tqdm
+        # progress bars (which overwrite using \r) flush in real time.
+        assert proc.stdout is not None
+        buf = ""
+        prefix_started = False
+        while True:
+            ch = proc.stdout.read(1)
+            if ch == "":
+                # process exited; the tail was already echoed char by char,
+                # so only terminate a dangling partial line
+                if buf:
+                    sys.stdout.write("\n")
+                    sys.stdout.flush()
+                return
+            if not prefix_started:
+                sys.stdout.write("[server] ")
+                prefix_started = True
+            sys.stdout.write(ch)
+            sys.stdout.flush()
+            buf += ch
+            if ch in ("\n", "\r"):
+                if any(marker in buf for marker in ready_markers):
+                    ready_event.set()
+                buf = ""
+                prefix_started = False
+
+    threading.Thread(target=_stream_output, daemon=True).start()
+
+    def _shutdown() -> None:
+        if proc.poll() is None:
+            print(f"[server] stopping pid={proc.pid}", flush=True)
+            proc.send_signal(signal.SIGINT)
+            try:
+                proc.wait(timeout=15)
+            except subprocess.TimeoutExpired:
+                proc.kill()
+                proc.wait(timeout=5)
+
+    atexit.register(_shutdown)
+
+    deadline = time.monotonic() + config.serve_ready_timeout_s
+    while time.monotonic() < deadline:
+        if proc.poll() is not None:
+            raise RuntimeError(
+                f"[server] inference server exited unexpectedly with rc={proc.returncode}. "
+                f"See [server] log lines above for the cause."
+            )
+        if ready_event.wait(timeout=2):
+            return api_base
+    proc.terminate()
+    raise RuntimeError(
+        f"[server] did not become ready within {config.serve_ready_timeout_s}s"
+    )
+
+
+def _to_openai_messages(
+    messages: Sequence[dict[str, Any]],
+) -> tuple[list[dict[str, Any]], dict[str, Any]]:
+    """Convert internal messages to OpenAI chat format.
+
+    Returns ``(api_messages, mm_kwargs)``. Multimodal-processor kwargs
+    (``fps`` from ``video_url`` blocks) are extracted out so the caller
+    can pass them via ``extra_body.mm_processor_kwargs`` rather than
+    inside the content blocks (which transformers serve rejects).
+
+    File-URL video blocks are inlined as base64 data URLs.
+    """
+    out_messages: list[dict[str, Any]] = []
+    mm_kwargs: dict[str, Any] = {}
+    for message in messages:
+        content = message.get("content")
+        if not isinstance(content, list):
+            out_messages.append({"role": message["role"], "content": content})
+            continue
+        out_blocks: list[dict[str, Any]] = []
+        for block in content:
+            block_type = block.get("type") if isinstance(block, dict) else None
+            if block_type == "text":
+                out_blocks.append({"type": "text", "text": block.get("text", "")})
+            elif block_type == "image":
+                out_blocks.append(
+                    {"type": "image_url", "image_url": {"url": _pil_to_data_url(block["image"])}}
+                )
+            elif block_type == "video":
+                frames = block.get("video", [])
+                for img in frames:
+                    out_blocks.append(
+                        {"type": "image_url", "image_url": {"url": _pil_to_data_url(img)}}
+                    )
+            elif block_type == "video_url":
+                video_url = dict(block["video_url"])
+                url = video_url.get("url", "")
+                if url.startswith("file://"):
+                    video_url["url"] = _file_to_data_url(url[len("file://") :])
+                out_blocks.append({"type": "video_url", "video_url": video_url})
+                fps = block.get("fps")
+                if fps is not None:
+                    mm_kwargs["fps"] = fps
+            else:
+                out_blocks.append(block)
+        out_messages.append({"role": message["role"], "content": out_blocks})
+    return out_messages, mm_kwargs
+
+
+def _file_to_data_url(path: str) -> str:
+    """Read a local video file and return a base64 ``data:video/mp4`` URL."""
+    import base64  # noqa: PLC0415
+
+    with open(path, "rb") as f:
+        b64 = base64.b64encode(f.read()).decode("ascii")
+    return f"data:video/mp4;base64,{b64}"
+
+
+def _pil_to_data_url(image: Any) -> str:
+    """Encode a PIL.Image as a base64 data URL."""
+    import base64  # noqa: PLC0415
+    import io  # noqa: PLC0415
+
+    buf = io.BytesIO()
+    image.save(buf, format="PNG")
+    b64 = base64.b64encode(buf.getvalue()).decode("ascii")
+    return f"data:image/png;base64,{b64}"
+
+
+def _messages_to_prompt(messages: Sequence[dict[str, Any]]) -> Any:
+    """Pass-through hook used by the vllm backend.
+
+    vllm exposes its own multimodal entry points that vary by version; for the
+    base flow we simply forward the raw message list and let the caller's
+    custom backend handle templating. Real deployments override this.
+    """
+    return list(messages)
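
Review note: the round-robin dispatch in `_one_call` and the ordered fan-out in `_gen` can be sketched in isolation as below. This is a minimal sketch, not the module's code: `make_round_robin` and `fan_out` are hypothetical helper names, and the two URLs are illustrative values only.

```python
import threading
from collections.abc import Callable, Sequence
from concurrent.futures import ThreadPoolExecutor
from typing import TypeVar

T = TypeVar("T")
R = TypeVar("R")


def make_round_robin(items: Sequence[T]) -> Callable[[], T]:
    """Return a thread-safe picker that cycles through ``items`` (hypothetical helper)."""
    lock = threading.Lock()
    counter = {"i": 0}

    def pick() -> T:
        # Hold the lock only for the counter update, never for the request itself.
        with lock:
            chosen = items[counter["i"] % len(items)]
            counter["i"] += 1
        return chosen

    return pick


def fan_out(fn: Callable[[T], R], batch: Sequence[T], concurrency: int) -> list[R]:
    """Apply ``fn`` to every item, in parallel when allowed, preserving input order."""
    if len(batch) <= 1 or concurrency <= 1:
        return [fn(item) for item in batch]
    with ThreadPoolExecutor(max_workers=min(concurrency, len(batch))) as pool:
        futures = [pool.submit(fn, item) for item in batch]
        # Collect in submission order so results line up with the batch.
        return [f.result() for f in futures]


# Illustrative values: two replica URLs and a trivial stand-in for the API call.
pick = make_round_robin(["http://localhost:8000/v1", "http://localhost:8001/v1"])
order = [pick() for _ in range(3)]
doubled = fan_out(lambda x: x * 2, [1, 2, 3, 4], concurrency=3)
```

Reading results from the futures in submission order, rather than with `as_completed`, is what keeps the returned list aligned with the input batch even when replicas answer out of order.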
@@ -0,0 +1,341 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Final parquet rewrite.
+
+For every episode the writer:
+
+1. reads the staged module outputs,
+2. partitions them into a persistent slice (PERSISTENT_STYLES) and an event
+   slice (EVENT_ONLY_STYLES + style=None tool-call atoms),
+3. sorts each slice deterministically,
+4. broadcasts the persistent slice across every frame in the episode,
+5. for each frame, materializes the sublist of event rows whose timestamp
+   exactly equals that frame's timestamp,
+6. drops the legacy ``subtask_index`` column,
+7. writes the parquet shard back in place.
+
+The writer does NOT add a dataset-level ``tools`` column. Tool *calls* are
+emitted per-row via the existing ``tool_calls`` field on the v3.1 row
+struct (PR 1) for every speech atom. The tool *schema* (the description
+of the ``say`` function and its parameters) is a fixed code constant,
+``SAY_TOOL_SCHEMA``, defined in ``lerobot.datasets.language`` and
+re-exported below; downstream chat-template consumers import it directly
+rather than reading a redundant per-row column.
+
+Invariants enforced here (and re-checked by the validator):
+
+- per-episode persistent slice is byte-identical across every frame;
+- ``language_events`` rows on a frame all have ``timestamp == frame_ts``
+  (timestamps come straight from the source parquet — never recomputed);
+- every row passes ``column_for_style(style)``.
+"""
+
+from __future__ import annotations
+
+import logging
+from collections import defaultdict
+from collections.abc import Iterable, Sequence
+from dataclasses import dataclass
+from pathlib import Path
+from typing import Any
+
+import pyarrow as pa
+import pyarrow.parquet as pq
+
+from lerobot.datasets.language import (
+    EVENT_ONLY_STYLES,
+    LANGUAGE_EVENTS,
+    LANGUAGE_PERSISTENT,
+    PERSISTENT_STYLES,
+    column_for_style,
+    validate_camera_field,
+)
+
+from .reader import EpisodeRecord
+from .staging import EpisodeStaging
+
+logger = logging.getLogger(__name__)
+
+
+# Tool schema constants moved to lerobot.datasets.language in PR 1 — single
+# source of truth. Re-exported here so existing imports
+# (``from lerobot.annotations.steerable_pipeline.writer import SAY_TOOL_SCHEMA``)
+# keep working.
+from lerobot.datasets.language import DEFAULT_TOOLS, SAY_TOOL_SCHEMA  # noqa: F401, E402
+
+
+def _row_persistent_sort_key(row: dict[str, Any]) -> tuple:
+    return (float(row["timestamp"]), row.get("style") or "", row.get("role") or "")
+
+
+def _row_event_sort_key(row: dict[str, Any]) -> tuple:
+    # events are bucketed per-frame, but within a frame we still want determinism
+    return (
+        row.get("style") or "",
+        row.get("role") or "",
+        row.get("camera") or "",
+    )
+
+
+def _normalize_persistent_row(row: dict[str, Any]) -> dict[str, Any]:
+    """Coerce a staged row into the persistent column's struct shape."""
+    style = row.get("style")
+    if style not in PERSISTENT_STYLES:
+        raise ValueError(
+            f"persistent slice contains row with non-persistent style {style!r}; "
+            "row would be misrouted under column_for_style()"
+        )
+    if "timestamp" not in row:
+        raise ValueError(f"persistent row missing timestamp: {row!r}")
+    camera = row.get("camera")
+    validate_camera_field(style, camera)
+    return {
+        "role": str(row["role"]),
+        "content": None if row.get("content") is None else str(row["content"]),
+        "style": style,
+        "timestamp": float(row["timestamp"]),
+        "camera": None if camera is None else str(camera),
+        "tool_calls": _normalize_tool_calls(row.get("tool_calls")),
+    }
+
+
+def _normalize_event_row(row: dict[str, Any]) -> dict[str, Any]:
+    """Coerce a staged row into the event column's struct shape (no timestamp)."""
+    style = row.get("style")
+    if style is not None and style not in EVENT_ONLY_STYLES:
+        raise ValueError(
+            f"event slice contains row with style {style!r}; expected None or one of {EVENT_ONLY_STYLES}"
+        )
+    if column_for_style(style) != LANGUAGE_EVENTS:
+        raise ValueError(f"event row with style {style!r} would not route to language_events")
+    camera = row.get("camera")
+    validate_camera_field(style, camera)
+    return {
+        "role": str(row["role"]),
+        "content": None if row.get("content") is None else str(row["content"]),
+        "style": style,
+        "camera": None if camera is None else str(camera),
+        "tool_calls": _normalize_tool_calls(row.get("tool_calls")),
+    }
+
+
+def _normalize_tool_calls(value: Any) -> list[Any] | None:
+    if value is None:
+        return None
+    if not isinstance(value, list):
+        raise ValueError(f"tool_calls must be a list or None, got {type(value).__name__}")
+    return list(value)
+
+
+def _validate_atom_invariants(row: dict[str, Any]) -> None:
+    """At-least-one of content/tool_calls; style=None implies tool_calls."""
+    has_content = row.get("content") is not None
+    has_tools = row.get("tool_calls") is not None
+    if not (has_content or has_tools):
+        raise ValueError(f"row has neither content nor tool_calls: {row!r}")
+    if row.get("style") is None and not has_tools:
+        raise ValueError(f"style=None requires tool_calls: {row!r}")
+
+
+def _validate_speech_atom(row: dict[str, Any]) -> None:
+    """Speech atoms: role=assistant, style=None, content=None, say tool call."""
+    if row.get("style") is not None:
+        return  # not a speech atom
+    if row.get("role") != "assistant":
+        raise ValueError(f"speech atom must have role=assistant: {row!r}")
+    if row.get("content") is not None:
+        raise ValueError(f"speech atom must have content=null: {row!r}")
+    tool_calls = row.get("tool_calls")
+    if not tool_calls or not isinstance(tool_calls, list):
+        raise ValueError(f"speech atom must have non-empty tool_calls list: {row!r}")
+    first = tool_calls[0]
+    if not isinstance(first, dict):
+        raise ValueError(f"speech atom tool_calls[0] must be a dict: {row!r}")
+    if first.get("type") != "function":
+        raise ValueError(f"speech atom tool_calls[0].type must be 'function': {row!r}")
+    fn = first.get("function") or {}
+    if fn.get("name") != "say":
+        raise ValueError(f"speech atom tool_calls[0].function.name must be 'say': {row!r}")
+    args = fn.get("arguments") or {}
+    if not isinstance(args, dict) or "text" not in args or not isinstance(args["text"], str):
+        raise ValueError(f"speech atom must carry 'text' string in arguments: {row!r}")
+
+
+@dataclass
+class LanguageColumnsWriter:
+    """Rewrite ``data/chunk-*/file-*.parquet`` with the two language columns."""
+
+    drop_existing_subtask_index: bool = True
+
+    def write_all(
+        self,
+        records: Sequence[EpisodeRecord],
+        staging_dir: Path,
+        root: Path,
+    ) -> list[Path]:
+        episodes_by_path: dict[Path, list[EpisodeRecord]] = defaultdict(list)
+        for record in records:
+            episodes_by_path[record.data_path].append(record)
+
+        written: list[Path] = []
+        for path, eps in episodes_by_path.items():
+            self._rewrite_one(path, eps, staging_dir, root)
+            written.append(path)
+        return written
+
+    def _rewrite_one(
+        self,
+        path: Path,
+        episodes: Sequence[EpisodeRecord],
+        staging_dir: Path,
+        root: Path,
+    ) -> None:
+        table = pq.read_table(path)
+        n_rows = table.num_rows
+
+        # Ensure we cover every episode in the file. Episodes that don't have
+        # staging artifacts are passed through with empty annotation lists —
+        # this keeps the writer idempotent and safe for partial reruns.
+        staged_per_ep: dict[int, dict[str, list[dict[str, Any]]]] = {}
+        for record in episodes:
+            staging = EpisodeStaging(staging_dir, record.episode_index)
+            staged_per_ep[record.episode_index] = staging.read_all()
+
+        persistent_by_ep: dict[int, list[dict[str, Any]]] = {}
+        events_by_ep_ts: dict[int, dict[float, list[dict[str, Any]]]] = {}
+
+        for ep_index, ep_staged in staged_per_ep.items():
+            persistent_rows: list[dict[str, Any]] = []
+            event_rows: list[dict[str, Any]] = []  # carry timestamp until bucketed
+            for _module_name, rows in ep_staged.items():
+                for row in rows:
+                    style = row.get("style")
+                    if column_for_style(style) == LANGUAGE_PERSISTENT:
+                        persistent_rows.append(row)
+                    else:
+                        event_rows.append(row)
+
+            persistent_rows.sort(key=_row_persistent_sort_key)
+            normalized_persistent = []
+            for r in persistent_rows:
+                _validate_atom_invariants(r)
+                _validate_speech_atom(r)
+                normalized_persistent.append(_normalize_persistent_row(r))
+            persistent_by_ep[ep_index] = normalized_persistent
+
+            buckets: dict[float, list[dict[str, Any]]] = defaultdict(list)
+            for r in event_rows:
+                _validate_atom_invariants(r)
+                _validate_speech_atom(r)
+                ts = float(r["timestamp"])
+                buckets[ts].append(_normalize_event_row(r))
+            for ts in list(buckets.keys()):
+                buckets[ts].sort(key=_row_event_sort_key)
+            events_by_ep_ts[ep_index] = buckets
+
+        episode_col = (
+            table.column("episode_index").to_pylist() if "episode_index" in table.column_names else None
+        )
+        ts_col = table.column("timestamp").to_pylist() if "timestamp" in table.column_names else None
+        if episode_col is None or ts_col is None:
+            raise ValueError(f"{path} is missing 'episode_index' or 'timestamp' — required by the writer.")
+
+        per_row_persistent: list[list[dict[str, Any]]] = []
+        per_row_events: list[list[dict[str, Any]]] = []
+        for i in range(n_rows):
+            ep = episode_col[i]
+            ts = float(ts_col[i])
+            per_row_persistent.append(persistent_by_ep.get(ep, []))
+            buckets = events_by_ep_ts.get(ep, {})
+            per_row_events.append(buckets.get(ts, []))
+
+        new_table = self._materialize_table(
+            table, per_row_persistent, per_row_events, drop_old=self.drop_existing_subtask_index
+        )
+        pq.write_table(new_table, path)
+
+    def _materialize_table(
+        self,
+        table: pa.Table,
+        persistent: list[list[dict[str, Any]]],
+        events: list[list[dict[str, Any]]],
+        *,
+        drop_old: bool,
+    ) -> pa.Table:
+        cols = []
+        names = []
+        for name in table.column_names:
+            if drop_old and name == "subtask_index":
+                continue
+            if name in (LANGUAGE_PERSISTENT, LANGUAGE_EVENTS):
+                continue  # we'll re-add canonical versions
+            # Strip any legacy ``tools`` column previously emitted by older
+            # writers — the schema no longer uses it (constant lives in
+            # SAY_TOOL_SCHEMA / DEFAULT_TOOLS).
+            if name == "tools":
+                continue
+            cols.append(table.column(name))
+            names.append(name)
+
+        # We let pyarrow infer struct/list schema rather than passing the
+        # canonical type from `lerobot.datasets.language` directly: that type
+        # uses `pa.json_()` for the `tool_calls` element type, which
+        # `pa.array(..., type=...)` cannot materialize from Python lists on
+        # current pyarrow versions. The inferred schema round-trips through
+        # parquet and `LeRobotDataset` correctly — see PR 1's
+        # `tests/datasets/test_language.py` which exercises the same flow.
+        persistent_arr = pa.array(persistent)
+        events_arr = pa.array(events)
+
+        cols.extend([persistent_arr, events_arr])
+        names.extend([LANGUAGE_PERSISTENT, LANGUAGE_EVENTS])
+
+        return pa.Table.from_arrays(cols, names=names)
+
+
+def speech_atom(timestamp: float, text: str) -> dict[str, Any]:
+    """Build a canonical speech tool-call atom for the events column."""
+    return {
+        "role": "assistant",
+        "content": None,
+        "style": None,
+        "timestamp": float(timestamp),
+        "camera": None,
+        "tool_calls": [
+            {
+                "type": "function",
+                "function": {
+                    "name": "say",
+                    "arguments": {"text": text},
+                },
+            }
+        ],
+    }
+
+
+def normalize_rows_for_writer(
+    rows: Iterable[dict[str, Any]],
+) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
+    """Helper used by tests/validators to partition a flat row list into
+    (persistent_rows, event_rows) using ``column_for_style``.
+    """
+    persistent: list[dict[str, Any]] = []
+    events: list[dict[str, Any]] = []
+    for row in rows:
+        if column_for_style(row.get("style")) == LANGUAGE_PERSISTENT:
+            persistent.append(row)
+        else:
+            events.append(row)
+    return persistent, events
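
Review note: steps 4 and 5 of the writer's module docstring (broadcast the persistent slice to every frame, attach events only on exact timestamp matches) can be sketched without pyarrow as below. This is a simplified sketch under stated assumptions: `bucket_events_by_timestamp` and `columns_for_frames` are hypothetical names, the sample rows are invented for illustration, and validation and sorting are omitted.

```python
from collections import defaultdict
from typing import Any

Row = dict[str, Any]


def bucket_events_by_timestamp(event_rows: list[Row]) -> dict[float, list[Row]]:
    """Group event rows by exact timestamp and drop the timestamp field."""
    buckets: dict[float, list[Row]] = defaultdict(list)
    for row in event_rows:
        ts = float(row["timestamp"])
        # Event structs carry no timestamp; the frame they sit on implies it.
        buckets[ts].append({k: v for k, v in row.items() if k != "timestamp"})
    return dict(buckets)


def columns_for_frames(
    frame_timestamps: list[float],
    persistent_rows: list[Row],
    event_rows: list[Row],
) -> tuple[list[list[Row]], list[list[Row]]]:
    """Build per-frame persistent and event columns for one episode."""
    buckets = bucket_events_by_timestamp(event_rows)
    # Every frame gets the same persistent slice.
    persistent_col = [persistent_rows for _ in frame_timestamps]
    # A frame gets only the events whose timestamp equals its own exactly.
    events_col = [buckets.get(float(ts), []) for ts in frame_timestamps]
    return persistent_col, events_col


# Invented sample data: three frames, one persistent row, one event at t=0.5.
frames = [0.0, 0.5, 1.0]
persistent = [{"role": "assistant", "content": "pick up the cup", "style": "subtask", "timestamp": 0.0}]
events = [{"role": "user", "content": "wait, use the red one", "style": "interjection", "timestamp": 0.5}]
persistent_col, events_col = columns_for_frames(frames, persistent, events)
```

Because the lookup is an exact float match, an event whose timestamp does not coincide with a frame timestamp lands on no frame at all, which is why the docstring insists that timestamps come straight from the source parquet.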
@@ -17,7 +17,6 @@ Provides the RealSenseCamera class for capturing frames from Intel RealSense cam
 """

 import logging
-import sys
 import time
 from threading import Event, Lock, Thread
 from typing import TYPE_CHECKING, Any
@@ -42,7 +41,6 @@ from ..utils import get_cv2_rotation
 from .configuration_realsense import RealSenseCameraConfig

 logger = logging.getLogger(__name__)
-pkg_name = "pyrealsense2-macosx" if sys.platform == "darwin" else "pyrealsense2"


 class RealSenseCamera(Camera):
@@ -116,7 +114,7 @@ class RealSenseCamera(Camera):
        Args:
            config: The configuration settings for the camera.
        """
-        require_package(pkg_name, extra="intelrealsense", import_name="pyrealsense2")
+        require_package("pyrealsense2", extra="intelrealsense")
        super().__init__(config)

        self.config = config
@@ -23,6 +23,7 @@ Import them directly: ``from lerobot.configs.train import TrainPipelineConfig``

 from .default import DatasetConfig, EvalConfig, PeftConfig, WandBConfig
 from .policies import PreTrainedConfig
+from .recipe import MessageTurn, TrainingRecipe, load_recipe
 from .types import (
    FeatureType,
    NormalizationMode,
@@ -41,7 +42,10 @@ __all__ = [
    # Config classes
    "DatasetConfig",
    "EvalConfig",
+    "MessageTurn",
    "PeftConfig",
    "PreTrainedConfig",
+    "TrainingRecipe",
    "WandBConfig",
+    "load_recipe",
 ]
@@ -0,0 +1,193 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import annotations
+
+import re
+from dataclasses import dataclass
+from pathlib import Path
+from typing import Any, Literal, get_args
+
+MessageRole = Literal["user", "assistant", "system", "tool"]
+MessageStream = Literal["high_level", "low_level"]
+
+DEFAULT_BINDINGS = {
+    "subtask": "active_at(t, style=subtask)",
+    "memory": "active_at(t, style=memory)",
+    "plan": "active_at(t, style=plan)",
+    "speech": "emitted_at(t, role=assistant, tool_name=say)",
+    "interjection": "emitted_at(t, style=interjection)",
+    "vqa": "emitted_at(t, style=vqa, role=assistant)",
+    "vqa_query": "emitted_at(t, style=vqa, role=user)",
+}
+
+_PLACEHOLDER_RE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")
+_VALID_ROLES = frozenset(get_args(MessageRole))
+_VALID_STREAMS = frozenset(get_args(MessageStream))
+
+
+@dataclass
+class MessageTurn:
+    """A single chat-style turn in a recipe template.
+
+    ``content`` may be a plain string, a list of HF-style multimodal blocks, or
+    ``None`` when ``tool_calls_from`` supplies tool-call payloads instead.
+    ``stream`` tags the turn for downstream filtering, ``target`` flags it as a
+    training target, and ``if_present`` skips the turn when the named binding
+    resolves to ``None``.
+    """
+
+    role: MessageRole
+    content: str | list[dict[str, Any]] | None = None
+    stream: MessageStream | None = None
+    target: bool = False
+    if_present: str | None = None
+    tool_calls_from: str | None = None
+
+    def __post_init__(self) -> None:
+        """Validate role, stream, and content after dataclass construction."""
+        if self.role not in _VALID_ROLES:
+            raise ValueError(f"Unsupported message role: {self.role!r}")
+        if self.stream is not None and self.stream not in _VALID_STREAMS:
+            raise ValueError(f"Unsupported message stream: {self.stream!r}")
+        if self.content is None and self.tool_calls_from is None:
+            raise ValueError("MessageTurn.content is required unless tool_calls_from is set.")
+        if self.content is not None and not isinstance(self.content, (str, list)):
+            raise TypeError("MessageTurn.content must be a string, a list of HF-style blocks, or None.")
+        if isinstance(self.content, list):
+            for block in self.content:
+                if not isinstance(block, dict) or "type" not in block:
+                    raise ValueError(
+                        "Multimodal content blocks must be HF-style dictionaries with a type key."
+                    )
+
+    @classmethod
+    def from_dict(cls, data: dict[str, Any]) -> MessageTurn:
+        """Construct a :class:`MessageTurn` from a plain dictionary."""
+        return cls(**data)
+
+
+@dataclass
+class TrainingRecipe:
+    """A recipe describing how to render training samples from language rows.
+
+    A recipe is either a *message recipe* (``messages`` plus optional
+    ``bindings``) or a *blend recipe* (``blend`` mapping names to weighted
+    sub-recipes). ``weight`` is only meaningful inside a blend.
+    """
+
+    messages: list[MessageTurn] | None = None
+    bindings: dict[str, str] | None = None
+    blend: dict[str, TrainingRecipe] | None = None
+    weight: float | None = None
+
+    def __post_init__(self) -> None:
+        """Validate that exactly one of ``messages`` or ``blend`` is set."""
+        if self.messages is not None and self.blend is not None:
+            raise ValueError("TrainingRecipe must set only one of messages or blend.")
+        if self.messages is None and self.blend is None:
+            raise ValueError("TrainingRecipe must set one of messages or blend.")
+
+        if self.messages is not None:
+            self._validate_message_recipe()
+        if self.blend is not None:
+            self._validate_blend_recipe()
+
+    @classmethod
+    def from_dict(cls, data: dict[str, Any]) -> TrainingRecipe:
+        """Construct a :class:`TrainingRecipe` from a nested dictionary."""
+        data = dict(data)
+        if data.get("messages") is not None:
+            data["messages"] = [
+                turn if isinstance(turn, MessageTurn) else MessageTurn.from_dict(turn)
+                for turn in data["messages"]
+            ]
+        if data.get("blend") is not None:
+            data["blend"] = {
+                name: recipe if isinstance(recipe, TrainingRecipe) else cls.from_dict(recipe)
+                for name, recipe in data["blend"].items()
+            }
+        return cls(**data)
+
+    @classmethod
+    def from_yaml(cls, path: str | Path) -> TrainingRecipe:
+        """Load a :class:`TrainingRecipe` from a YAML file at ``path``."""
+        import yaml  # type: ignore[import-untyped]
+
+        with open(path, encoding="utf-8") as f:
+            data = yaml.safe_load(f)
+        if not isinstance(data, dict):
+            raise ValueError(f"Recipe YAML must contain a mapping at the top level: {path}")
+        return cls.from_dict(data)
+
+    def _validate_message_recipe(self) -> None:
+        """Ensure every templated binding is known and at least one turn is a target."""
+        assert self.messages is not None
+        known_bindings = set(DEFAULT_BINDINGS) | set(self.bindings or {}) | {"task"}
+
+        for turn in self.messages:
+            missing = self._referenced_bindings(turn) - known_bindings
+            if missing:
+                raise ValueError(f"MessageTurn references unknown binding(s): {sorted(missing)}")
+
+        if not any(turn.target for turn in self.messages):
+            raise ValueError("Message recipes must contain at least one target turn.")
+
+    def _validate_blend_recipe(self) -> None:
+        """Ensure the blend is non-empty and each component is a message recipe with a positive weight."""
+        assert self.blend is not None
+        if not self.blend:
+            raise ValueError("Blend recipes must contain at least one component.")
+
+        for name, recipe in self.blend.items():
+            if recipe.blend is not None:
+                raise ValueError(f"Blend component {name!r} cannot itself define a blend.")
+            if recipe.messages is None:
+                raise ValueError(f"Blend component {name!r} must define messages.")
+            if recipe.weight is None:
+                raise ValueError(f"Blend component {name!r} must define weight.")
+            if recipe.weight <= 0:
+                raise ValueError(f"Blend component {name!r} must have a positive weight.")
+
+    def _referenced_bindings(self, turn: MessageTurn) -> set[str]:
+        """Return the binding names that ``turn`` references via placeholders or attributes."""
+        names: set[str] = set()
+        if turn.if_present is not None:
+            names.add(turn.if_present)
+        if turn.tool_calls_from is not None:
+            names.add(turn.tool_calls_from)
+        names.update(_placeholders_in_content(turn.content))
+        return names
+
+
+def _placeholders_in_content(content: str | list[dict[str, Any]] | None) -> set[str]:
+    """Return the set of ``${name}`` placeholders found anywhere in ``content``."""
+    if content is None:
+        return set()
+    if isinstance(content, str):
+        return set(_PLACEHOLDER_RE.findall(content))
+
+    names: set[str] = set()
+    for block in content:
+        for value in block.values():
+            if isinstance(value, str):
+                names.update(_PLACEHOLDER_RE.findall(value))
+    return names
+
+
+def load_recipe(path: str | Path) -> TrainingRecipe:
+    """Load a :class:`TrainingRecipe` from a YAML file at ``path``."""
+    return TrainingRecipe.from_yaml(path)
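
The binding check in ``_validate_message_recipe`` can be illustrated in isolation. The sketch below is a standalone approximation, not the module itself: the regex is assumed to match the ``_PLACEHOLDER_RE`` pattern shown in the renderer module of this diff, and ``unknown_bindings`` is a hypothetical helper added only for illustration.

```python
from __future__ import annotations

import re
from typing import Any

# Assumed to mirror the diff's _PLACEHOLDER_RE: matches ${name} placeholders.
PLACEHOLDER_RE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def placeholders_in_content(content: str | list[dict[str, Any]] | None) -> set[str]:
    """Collect ${name} placeholders from a string or from HF-style content blocks."""
    if content is None:
        return set()
    if isinstance(content, str):
        return set(PLACEHOLDER_RE.findall(content))
    names: set[str] = set()
    for block in content:
        # Only string values can carry placeholders; other value types are skipped.
        for value in block.values():
            if isinstance(value, str):
                names.update(PLACEHOLDER_RE.findall(value))
    return names


def unknown_bindings(content: str | list[dict[str, Any]] | None, known: set[str]) -> set[str]:
    """Hypothetical helper: placeholders that no known binding would resolve."""
    return placeholders_in_content(content) - known


if __name__ == "__main__":
    vqa_turn = [
        {"type": "image", "feature": "observation.images.top"},
        {"type": "text", "text": "${vqa_query}"},
    ]
    print(placeholders_in_content(vqa_turn))
    print(unknown_bindings("${task} then ${typo}", {"task"}))
```

A misspelled placeholder therefore surfaces when the recipe is constructed rather than at render time, which is the point of validating in ``__post_init__``.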
@@ -0,0 +1,74 @@
+blend:
+
+  memory_update:
+    weight: 0.10
+    bindings:
+      prior_memory: "nth_prev(style=memory, offset=1)"
+      current_memory: "emitted_at(t, style=memory)"
+      completed_subtask: "nth_prev(style=subtask, offset=1)"
+    messages:
+      - {role: user, content: "${task}", stream: high_level}
+      - {role: assistant, content: "Previous memory: ${prior_memory}", stream: high_level, if_present: prior_memory}
+      - {role: user, content: "Completed subtask: ${completed_subtask}", stream: high_level, if_present: completed_subtask}
+      - {role: assistant, content: "${current_memory}", stream: high_level, target: true, if_present: current_memory}
+
+  user_interjection_response:
+    weight: 0.16
+    bindings:
+      prior_plan: "nth_prev(style=plan, offset=1)"
+      current_plan: "emitted_at(t, style=plan)"
+      interjection: "emitted_at(t, style=interjection)"
+      speech: "emitted_at(t, role=assistant, tool_name=say)"
+    messages:
+      - {role: user, content: "${task}", stream: high_level}
+      - {role: assistant, content: "Previous plan:\n${prior_plan}", stream: high_level, if_present: prior_plan}
+      - {role: user, content: "${interjection}", stream: high_level, if_present: interjection}
+      - {role: assistant, content: "${current_plan}", stream: high_level, target: true, if_present: current_plan, tool_calls_from: speech}
+
+  high_level_subtask:
+    weight: 0.15
+    bindings:
+      next_subtask: "nth_next(style=subtask, offset=1)"
+    messages:
+      - {role: user, content: "${task}\nPlan: ${plan}\nMemory: ${memory}", stream: high_level}
+      - {role: user, content: "Current subtask: ${subtask}", stream: high_level, if_present: subtask}
+      - {role: assistant, content: "${next_subtask}", stream: high_level, target: true}
+
+  low_level_execution:
+    weight: 0.35
+    messages:
+      - {role: user, content: "${task}\nPlan: ${plan}\nMemory: ${memory}", stream: high_level}
+      - {role: assistant, content: "${subtask}", stream: low_level, target: true}
+
+  # VQA is view-dependent: bbox / keypoint / count answers only make sense for
+  # the camera they were grounded against. Each camera gets its own sub-recipe
+  # so the resolver can disambiguate via `camera=...` and the user turn carries
+  # the matching image block. Adjust the camera keys (and add more sub-recipes)
+  # to match the cameras present in your dataset.
+  ask_vqa_top:
+    weight: 0.10
+    bindings:
+      vqa_query: "emitted_at(t, style=vqa, role=user, camera=observation.images.top)"
+      vqa: "emitted_at(t, style=vqa, role=assistant, camera=observation.images.top)"
+    messages:
+      - role: user
+        stream: high_level
+        if_present: vqa_query
+        content:
+          - {type: image, feature: observation.images.top}
+          - {type: text, text: "${vqa_query}"}
+      - {role: assistant, content: "${vqa}", stream: high_level, target: true, if_present: vqa}
+
+  ask_vqa_wrist:
+    weight: 0.10
+    bindings:
+      vqa_query: "emitted_at(t, style=vqa, role=user, camera=observation.images.wrist)"
+      vqa: "emitted_at(t, style=vqa, role=assistant, camera=observation.images.wrist)"
+    messages:
+      - role: user
+        stream: high_level
+        if_present: vqa_query
+        content:
+          - {type: image, feature: observation.images.wrist}
+          - {type: text, text: "${vqa_query}"}
+      - {role: assistant, content: "${vqa}", stream: high_level, target: true, if_present: vqa}
@@ -0,0 +1,88 @@
+# SmolVLA2 canonical training recipe — Hi Robot / MEM / ECoT blend.
+#
+# Same blend shape as pi05_hirobot.yaml. SmolVLA2 differs from Pi0.5 in
+# how the renderer's output is consumed:
+#
+#   - SmolVLA2 calls SmolVLM's tokenizer.apply_chat_template(messages,
+#     tools=DEFAULT_TOOLS) on the rendered messages, since SmolVLM is a
+#     chat-pretrained backbone.
+#   - The processor builds a `text_labels` tensor that masks every token
+#     except those belonging to messages whose index is in
+#     `target_message_indices`. Cross-entropy on those positions trains
+#     the LM head.
+#   - `predict_actions = bool(targets_by_stream.get("low_level"))`,
+#     the same convention as Pi0.5. `low_level_execution` is the only
+#     branch that runs the action expert / flow head.
+
+blend:
+
+  memory_update:
+    weight: 0.10
+    bindings:
+      prior_memory: "nth_prev(style=memory, offset=1)"
+      current_memory: "emitted_at(t, style=memory)"
+      completed_subtask: "nth_prev(style=subtask, offset=1)"
+    messages:
+      - {role: user, content: "${task}", stream: high_level}
+      - {role: assistant, content: "Previous memory: ${prior_memory}", stream: high_level, if_present: prior_memory}
+      - {role: user, content: "Completed subtask: ${completed_subtask}", stream: high_level, if_present: completed_subtask}
+      - {role: assistant, content: "${current_memory}", stream: high_level, target: true, if_present: current_memory}
+
+  user_interjection_response:
+    weight: 0.16
+    bindings:
+      prior_plan: "nth_prev(style=plan, offset=1)"
+      current_plan: "emitted_at(t, style=plan)"
+      interjection: "emitted_at(t, style=interjection)"
+      speech: "emitted_at(t, role=assistant, tool_name=say)"
+    messages:
+      - {role: user, content: "${task}", stream: high_level}
+      - {role: assistant, content: "Previous plan:\n${prior_plan}", stream: high_level, if_present: prior_plan}
+      - {role: user, content: "${interjection}", stream: high_level, if_present: interjection}
+      - {role: assistant, content: "${current_plan}", stream: high_level, target: true, if_present: current_plan, tool_calls_from: speech}
+
+  high_level_subtask:
+    weight: 0.15
+    bindings:
+      next_subtask: "nth_next(style=subtask, offset=1)"
+    messages:
+      - {role: user, content: "${task}\nPlan: ${plan}\nMemory: ${memory}", stream: high_level}
+      - {role: user, content: "Current subtask: ${subtask}", stream: high_level, if_present: subtask}
+      - {role: assistant, content: "${next_subtask}", stream: high_level, target: true}
+
+  low_level_execution:
+    weight: 0.35
+    messages:
+      - {role: user, content: "${task}\nPlan: ${plan}\nMemory: ${memory}", stream: high_level}
+      - {role: assistant, content: "${subtask}", stream: low_level, target: true}
+
+  # Per-camera VQA sub-recipes (PR 1's view-dependent style routing).
+  # Adjust the camera keys (and add more sub-recipes) to match the
+  # cameras present in your dataset.
+  ask_vqa_top:
+    weight: 0.10
+    bindings:
+      vqa_query: "emitted_at(t, style=vqa, role=user, camera=observation.images.top)"
+      vqa: "emitted_at(t, style=vqa, role=assistant, camera=observation.images.top)"
+    messages:
+      - role: user
+        stream: high_level
+        if_present: vqa_query
+        content:
+          - {type: image, feature: observation.images.top}
+          - {type: text, text: "${vqa_query}"}
+      - {role: assistant, content: "${vqa}", stream: high_level, target: true, if_present: vqa}
+
+  ask_vqa_wrist:
+    weight: 0.10
+    bindings:
+      vqa_query: "emitted_at(t, style=vqa, role=user, camera=observation.images.wrist)"
+      vqa: "emitted_at(t, style=vqa, role=assistant, camera=observation.images.wrist)"
+    messages:
+      - role: user
+        stream: high_level
+        if_present: vqa_query
+        content:
+          - {type: image, feature: observation.images.wrist}
+          - {type: text, text: "${vqa_query}"}
+      - {role: assistant, content: "${vqa}", stream: high_level, target: true, if_present: vqa}
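
The six blend weights above sum to 0.96, not 1.0. How the sampler consumes them is not shown in this diff; the sketch below assumes they are treated as relative weights, which is how ``random.choices`` interprets its ``weights`` argument. ``sample_component`` is a hypothetical name used only for illustration.

```python
import random

# Weights copied from the recipe above.
BLEND_WEIGHTS = {
    "memory_update": 0.10,
    "user_interjection_response": 0.16,
    "high_level_subtask": 0.15,
    "low_level_execution": 0.35,
    "ask_vqa_top": 0.10,
    "ask_vqa_wrist": 0.10,
}

total = sum(BLEND_WEIGHTS.values())
# Effective probabilities under the relative-weight assumption.
normalized = {name: weight / total for name, weight in BLEND_WEIGHTS.items()}


def sample_component(rng: random.Random) -> str:
    """Hypothetical sampler: pick one sub-recipe name in proportion to its weight."""
    names = list(BLEND_WEIGHTS)
    return rng.choices(names, weights=[BLEND_WEIGHTS[n] for n in names], k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    print(f"raw total = {total:.2f}")
    print(f"p(low_level_execution) = {normalized['low_level_execution']:.4f}")
    print(sample_component(rng))
```

Under that assumption ``low_level_execution`` is drawn slightly more often than its nominal 0.35, so the weights are worth rebalancing if exact proportions matter.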
@@ -37,6 +37,14 @@ from .dataset_tools import (
 from .factory import make_dataset, resolve_delta_timestamps
 from .image_writer import safe_stop_image_writer
 from .io_utils import load_episodes, write_stats
+from .language import (
+    EVENT_ONLY_STYLES,
+    LANGUAGE_EVENTS,
+    LANGUAGE_PERSISTENT,
+    PERSISTENT_STYLES,
+    STYLE_REGISTRY,
+    column_for_style,
+)
 from .lerobot_dataset import LeRobotDataset
 from .multi_dataset import MultiLeRobotDataset
 from .pipeline_features import aggregate_pipeline_dataset_features, create_initial_features
@@ -53,10 +61,15 @@ __all__ = [
    "CODEBASE_VERSION",
    "DEFAULT_EPISODES_PATH",
    "DEFAULT_QUANTILES",
+    "EVENT_ONLY_STYLES",
    "EpisodeAwareSampler",
+    "LANGUAGE_EVENTS",
+    "LANGUAGE_PERSISTENT",
    "LeRobotDataset",
    "LeRobotDatasetMetadata",
    "MultiLeRobotDataset",
+    "PERSISTENT_STYLES",
+    "STYLE_REGISTRY",
    "StreamingLeRobotDataset",
    "VideoEncodingManager",
    "add_features",
@@ -66,6 +79,7 @@ __all__ = [
    "convert_image_to_video_dataset",
    "create_initial_features",
    "create_lerobot_dataset_card",
+    "column_for_style",
    "delete_episodes",
    "get_feature_stats",
    "load_episodes",
@@ -512,7 +512,7 @@ def compute_episode_stats(

    ep_stats = {}
    for key, data in episode_data.items():
-        if features[key]["dtype"] == "string":
+        if features[key]["dtype"] in {"string", "language"}:
            continue

        if features[key]["dtype"] in ["image", "video"]:
@@ -34,7 +34,6 @@ from .io_utils import (
    load_episodes,
    load_info,
    load_stats,
-    load_subtasks,
    load_tasks,
    write_info,
    write_json,
@@ -177,7 +176,6 @@ class LeRobotDatasetMetadata:
        self.info = load_info(self.root)
        check_version_compatibility(self.repo_id, self._version, CODEBASE_VERSION)
        self.tasks = load_tasks(self.root)
-        self.subtasks = load_subtasks(self.root)
        self.episodes = load_episodes(self.root)
        self.stats = load_stats(self.root)

@@ -320,6 +318,28 @@ class LeRobotDatasetMetadata:
        """Keys to access visual modalities (regardless of their storage method)."""
        return [key for key, ft in self.features.items() if ft["dtype"] in ["video", "image"]]

+    @property
+    def tools(self) -> list[dict]:
+        """OpenAI-style tool schemas declared by this dataset.
+
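+        Read from ``meta/info.json["tools"]``. Returns a new list of
+        shallow-copied schema dicts, so callers can add, remove, or
+        replace top-level entries safely; nested dicts such as
+        ``function`` are still shared with the source. Falls back to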
+        :data:`lerobot.datasets.language.DEFAULT_TOOLS` (the canonical
+        ``say`` schema) when the dataset doesn't declare any — that way
+        unannotated datasets and chat-template consumers
+        (``apply_chat_template(messages, tools=meta.tools)``) keep
+        working out of the box.
+
+        Implementations live under :mod:`lerobot.tools` (one file per
+        tool); see ``docs/source/tools.mdx`` for the authoring guide.
+        """
+        from .language import DEFAULT_TOOLS  # noqa: PLC0415  (avoid circular import)
+
+        declared = self.info.get("tools")
+        if isinstance(declared, list) and declared:
+            return [dict(t) for t in declared]
+        return [dict(t) for t in DEFAULT_TOOLS]
+
    @property
    def names(self) -> dict[str, list | dict]:
        """Names of the various dimensions of vector modalities."""
@@ -635,7 +655,6 @@ class LeRobotDatasetMetadata:
        _validate_feature_names(features)

        obj.tasks = None
-        obj.subtasks = None
        obj.episodes = None
        obj.stats = None
        obj.info = create_empty_dataset_info(
@@ -295,9 +295,4 @@ class DatasetReader:
        task_idx = item["task_index"].item()
        item["task"] = self._meta.tasks.iloc[task_idx].name

-        # add subtask information if available
-        if "subtask_index" in self._meta.features and self._meta.subtasks is not None:
-            subtask_idx = item["subtask_index"].item()
-            item["subtask"] = self._meta.subtasks.iloc[subtask_idx].name
-
        return item
@@ -22,6 +22,12 @@ from PIL import Image as PILImage
 from lerobot.utils.constants import DEFAULT_FEATURES
 from lerobot.utils.utils import is_valid_numpy_dtype_string

+from .language import (
+    LANGUAGE_PERSISTENT,
+    is_language_column,
+    language_events_column_feature,
+    language_persistent_column_feature,
+)
 from .utils import (
    DEFAULT_CHUNK_SIZE,
    DEFAULT_DATA_FILE_SIZE_IN_MB,
@@ -45,7 +51,13 @@ def get_hf_features_from_features(features: dict) -> datasets.Features:
    """
    hf_features = {}
    for key, ft in features.items():
-        if ft["dtype"] == "video":
+        if is_language_column(key):
+            hf_features[key] = (
+                language_persistent_column_feature()
+                if key == LANGUAGE_PERSISTENT
+                else language_events_column_feature()
+            )
+        elif ft["dtype"] == "video":
            continue
        elif ft["dtype"] == "image":
            hf_features[key] = datasets.Image()
@@ -242,6 +254,8 @@ def validate_feature_dtype_and_shape(
        return validate_feature_image_or_video(name, expected_shape, value)
    elif expected_dtype == "string":
        return validate_feature_string(name, value)
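+    elif expected_dtype == "language":
+        # No dtype/shape check is applied to language columns; an empty
+        # string means "no error", matching the other validators.
+        return ""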
    else:
        raise NotImplementedError(f"The feature dtype '{expected_dtype}' is not implemented yet.")

@@ -34,7 +34,6 @@ from lerobot.utils.utils import SuppressProgressBars, flatten_dict, unflatten_di
 from .utils import (
    DEFAULT_DATA_FILE_SIZE_IN_MB,
    DEFAULT_EPISODES_PATH,
-    DEFAULT_SUBTASKS_PATH,
    DEFAULT_TASKS_PATH,
    EPISODES_DIR,
    INFO_PATH,
@@ -189,14 +188,6 @@ def load_tasks(local_dir: Path) -> pandas.DataFrame:
    return tasks


-def load_subtasks(local_dir: Path) -> pandas.DataFrame | None:
-    """Load subtasks from subtasks.parquet if it exists."""
-    subtasks_path = local_dir / DEFAULT_SUBTASKS_PATH
-    if subtasks_path.exists():
-        return pd.read_parquet(subtasks_path)
-    return None
-
-
 def write_episodes(episodes: Dataset, local_dir: Path) -> None:
    """Write episode metadata to a parquet file in the LeRobot v3.0 format.
    This function writes episode-level metadata to a single parquet file.
@@ -268,11 +259,13 @@ def hf_transform_to_torch(items_dict: dict[str, list[Any]]) -> dict[str, list[to
        dict: The batch with items converted to torch tensors.
    """
    for key in items_dict:
+        if key in {"language_persistent", "language_events"}:
+            continue
        first_item = items_dict[key][0]
        if isinstance(first_item, PILImage.Image):
            to_tensor = transforms.ToTensor()
            items_dict[key] = [to_tensor(img) for img in items_dict[key]]
-        elif first_item is None:
+        elif first_item is None or isinstance(first_item, dict):
            pass
        else:
            items_dict[key] = [x if isinstance(x, str) else torch.tensor(x) for x in items_dict[key]]
@@ -308,7 +301,11 @@ def item_to_torch(item: dict) -> dict:
        dict: Dictionary with all tensor-like items converted to torch.Tensor.
    """
    for key, val in item.items():
-        if isinstance(val, (np.ndarray | list)) and key not in ["task"]:
+        if isinstance(val, (np.ndarray | list)) and key not in [
+            "task",
+            "language_persistent",
+            "language_events",
+        ]:
            # Convert numpy arrays and lists to torch tensors
            item[key] = torch.tensor(val)
    return item
@@ -0,0 +1,236 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import annotations
+
+from typing import Literal
+
+import datasets
+import pyarrow as pa
+
+LANGUAGE_PERSISTENT = "language_persistent"
+LANGUAGE_EVENTS = "language_events"
+LANGUAGE_COLUMNS = (LANGUAGE_PERSISTENT, LANGUAGE_EVENTS)
+PERSISTENT_ROW_FIELDS = ("role", "content", "style", "timestamp", "camera", "tool_calls")
+EVENT_ROW_FIELDS = ("role", "content", "style", "camera", "tool_calls")
+
+CORE_STYLES = {
+    "subtask",
+    "plan",
+    "memory",
+    "motion",
+    "interjection",
+    "vqa",
+    "trace",
+    "task_aug",
+}
+EXTENDED_STYLES: set[str] = set()
+STYLE_REGISTRY = CORE_STYLES | EXTENDED_STYLES
+
+PERSISTENT_STYLES = {"subtask", "plan", "memory", "motion", "task_aug"}
+EVENT_ONLY_STYLES = {"interjection", "vqa", "trace"}
+
+# Styles whose ``content`` is grounded in a specific camera view. Rows of these
+# styles MUST carry a non-null ``camera`` referencing an ``observation.images.*``
+# feature key. Rows of every other style MUST have ``camera=None``. ``motion``
+# is intentionally NOT in this set: motion primitives are described in
+# robot-frame (joint / Cartesian) terms, not pixel space, so they are
+# camera-agnostic. ``trace`` is the pixel-trajectory event style and IS
+# view-dependent. The ``camera`` field nevertheless lives on
+# ``PERSISTENT_ROW_FIELDS`` too so the schema, validator, and resolver
+# behave symmetrically across the two columns; persistent rows simply
+# always have ``camera=None`` in practice today.
+VIEW_DEPENDENT_STYLES = {"vqa", "trace"}
+
+LanguageColumn = Literal["language_persistent", "language_events"]
+
+
+def _json_arrow_type() -> pa.DataType:
+    """Return the Arrow JSON type, falling back to ``string`` on older pyarrow."""
+    return pa.json_() if hasattr(pa, "json_") else pa.string()
+
+
+def _json_feature() -> object:
+    """Return the HF ``datasets`` JSON feature, falling back to a string value."""
+    return datasets.Json() if hasattr(datasets, "Json") else datasets.Value("string")
+
+
+def language_persistent_row_arrow_type() -> pa.StructType:
+    """Return the Arrow struct type for a single persistent language row.
+
+    Persistent rows carry their own ``timestamp`` because they represent a state
+    that became active at a specific moment and remains active until superseded.
+    """
+    return pa.struct(
+        [
+            pa.field("role", pa.string(), nullable=False),
+            pa.field("content", pa.string(), nullable=True),
+            pa.field("style", pa.string(), nullable=True),
+            pa.field("timestamp", pa.float64(), nullable=False),
+            pa.field("camera", pa.string(), nullable=True),
+            pa.field("tool_calls", pa.list_(_json_arrow_type()), nullable=True),
+        ]
+    )
+
+
+def language_event_row_arrow_type() -> pa.StructType:
+    """Return the Arrow struct type for a single event language row.
+
+    Event rows have no ``timestamp`` field: each event is stored on the dataset
+    row whose frame timestamp is the event's firing time.
+    """
+    return pa.struct(
+        [
+            pa.field("role", pa.string(), nullable=False),
+            pa.field("content", pa.string(), nullable=True),
+            pa.field("style", pa.string(), nullable=True),
+            pa.field("camera", pa.string(), nullable=True),
+            pa.field("tool_calls", pa.list_(_json_arrow_type()), nullable=True),
+        ]
+    )
+
+
+def language_persistent_arrow_type() -> pa.ListType:
+    """Return the Arrow list type for the ``language_persistent`` column."""
+    return pa.list_(language_persistent_row_arrow_type())
+
+
+def language_events_arrow_type() -> pa.ListType:
+    """Return the Arrow list type for the ``language_events`` column."""
+    return pa.list_(language_event_row_arrow_type())
+
+
+def language_persistent_row_feature() -> dict[str, object]:
+    """Return the HF ``datasets`` feature mapping for a persistent language row."""
+    return {
+        "role": datasets.Value("string"),
+        "content": datasets.Value("string"),
+        "style": datasets.Value("string"),
+        "timestamp": datasets.Value("float64"),
+        "camera": datasets.Value("string"),
+        "tool_calls": datasets.List(_json_feature()),
+    }
+
+
+def language_event_row_feature() -> dict[str, object]:
+    """Return the HF ``datasets`` feature mapping for an event language row."""
+    return {
+        "role": datasets.Value("string"),
+        "content": datasets.Value("string"),
+        "style": datasets.Value("string"),
+        "camera": datasets.Value("string"),
+        "tool_calls": datasets.List(_json_feature()),
+    }
+
+
+def language_persistent_column_feature() -> datasets.List:
+    """Return the HF ``datasets`` feature for the ``language_persistent`` column."""
+    return datasets.List(language_persistent_row_feature())
+
+
+def language_events_column_feature() -> datasets.List:
+    """Return the HF ``datasets`` feature for the ``language_events`` column."""
+    return datasets.List(language_event_row_feature())
+
+
+def language_feature_info() -> dict[str, dict]:
+    """Return the ``info["features"]`` entries for both language columns."""
+    return {
+        LANGUAGE_PERSISTENT: {"dtype": "language", "shape": (1,), "names": None},
+        LANGUAGE_EVENTS: {"dtype": "language", "shape": (1,), "names": None},
+    }
+
+
+def is_language_column(key: str) -> bool:
+    """Return ``True`` if ``key`` is one of the dataset's language column names."""
+    return key in LANGUAGE_COLUMNS
+
+
+def is_view_dependent_style(style: str | None) -> bool:
+    """Return ``True`` if rows of ``style`` must be tagged with a ``camera`` key."""
+    return style in VIEW_DEPENDENT_STYLES
+
+
+def validate_camera_field(style: str | None, camera: str | None) -> None:
+    """Enforce the ``camera`` invariant: required iff ``style`` is view-dependent.
+
+    Raises ``ValueError`` if a view-dependent style is missing ``camera`` or if
+    a non-view-dependent style carries one. Pipeline writers and the validator
+    should call this on every emitted row.
+    """
+    if is_view_dependent_style(style):
+        if not camera:
+            raise ValueError(
+                f"Rows of view-dependent style {style!r} require a non-empty 'camera' "
+                f"field referencing an 'observation.images.*' feature key."
+            )
+    elif camera is not None:
+        raise ValueError(
+            f"Rows of style {style!r} must have camera=None; got camera={camera!r}."
+        )
+
+
+# --- Tool registry --------------------------------------------------------
+# Tools declared on a dataset live in ``meta/info.json["tools"]`` as a list
+# of OpenAI-style function schemas. The runtime / training stack reads them
+# through :class:`LeRobotDatasetMetadata.tools` (with these constants as
+# fallback when the dataset doesn't declare any). Implementations live
+# under :mod:`lerobot.tools` (one file per tool); see
+# ``docs/source/tools.mdx`` for the authoring guide.
+
+SAY_TOOL_SCHEMA: dict = {
+    "type": "function",
+    "function": {
+        "name": "say",
+        "description": "Speak a short utterance to the user via the TTS executor.",
+        "parameters": {
+            "type": "object",
+            "properties": {
+                "text": {
+                    "type": "string",
+                    "description": "The verbatim text to speak.",
+                }
+            },
+            "required": ["text"],
+        },
+    },
+}
+"""Canonical schema for the ``say`` tool emitted by the steerable
+annotation pipeline (PR 2 Module 2). Single source of truth — PR 2's
+writer, PR 3's runtime tool registry, and the dataset visualizer all
+import this constant rather than duplicating the dict."""
+
+DEFAULT_TOOLS: list[dict] = [SAY_TOOL_SCHEMA]
+"""Fallback tools list. Returned by ``LeRobotDatasetMetadata.tools``
+when ``meta/info.json["tools"]`` is unset, so unannotated datasets and
+chat-template consumers (``apply_chat_template(messages, tools=...)``)
+keep working out of the box."""
+
+
+def column_for_style(style: str | None) -> LanguageColumn:
+    """Map a language style to the column where rows of that style are stored.
+
+    Styles in :data:`PERSISTENT_STYLES` route to :data:`LANGUAGE_PERSISTENT`.
+    Styles in :data:`EVENT_ONLY_STYLES` and the implicit ``None`` style route
+    to :data:`LANGUAGE_EVENTS`.
+    """
+    if style is None:
+        return LANGUAGE_EVENTS
+    if style in PERSISTENT_STYLES:
+        return LANGUAGE_PERSISTENT
+    if style in EVENT_ONLY_STYLES:
+        return LANGUAGE_EVENTS
+    raise ValueError(f"Unknown language style: {style!r}")
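
The routing and camera rules in this module combine naturally in a writer. The sketch below re-declares the style sets and both functions locally so it runs without ``lerobot`` installed; ``route_rows`` is a hypothetical helper and not part of the diff.

```python
from __future__ import annotations

from typing import Any

# Local copies of the constants defined in the module above.
PERSISTENT_STYLES = {"subtask", "plan", "memory", "motion", "task_aug"}
EVENT_ONLY_STYLES = {"interjection", "vqa", "trace"}
VIEW_DEPENDENT_STYLES = {"vqa", "trace"}


def column_for_style(style: str | None) -> str:
    """Route a style to its column; the implicit None style goes to events."""
    if style is None or style in EVENT_ONLY_STYLES:
        return "language_events"
    if style in PERSISTENT_STYLES:
        return "language_persistent"
    raise ValueError(f"Unknown language style: {style!r}")


def validate_camera_field(style: str | None, camera: str | None) -> None:
    """A camera is required iff the style is view-dependent."""
    if style in VIEW_DEPENDENT_STYLES:
        if not camera:
            raise ValueError(f"Style {style!r} requires a camera.")
    elif camera is not None:
        raise ValueError(f"Style {style!r} must have camera=None.")


def route_rows(rows: list[dict[str, Any]]) -> dict[str, list[dict[str, Any]]]:
    """Hypothetical writer step: validate each row, then group rows by column."""
    routed: dict[str, list[dict[str, Any]]] = {"language_persistent": [], "language_events": []}
    for row in rows:
        validate_camera_field(row.get("style"), row.get("camera"))
        routed[column_for_style(row.get("style"))].append(row)
    return routed


if __name__ == "__main__":
    rows = [
        {"role": "assistant", "style": "subtask", "content": "pick up the cup", "camera": None},
        {"role": "user", "style": "vqa", "content": "How many cups?", "camera": "observation.images.top"},
        {"role": "assistant", "style": None, "content": None, "camera": None},
    ]
    routed = route_rows(rows)
    print({column: len(items) for column, items in routed.items()})
```

Validating before routing means a ``vqa`` row without a camera fails before it reaches either column.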
@@ -0,0 +1,593 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import annotations
+
+import copy
+import hashlib
+import re
+from collections.abc import Sequence
+from typing import Any
+
+from lerobot.configs.recipe import DEFAULT_BINDINGS, TrainingRecipe
+
+from .language import (
+    EVENT_ONLY_STYLES,
+    LANGUAGE_PERSISTENT,
+    PERSISTENT_STYLES,
+    column_for_style,
+)
+
+LanguageRow = dict[str, Any]
+RenderedMessages = dict[str, list[Any]]
+
+_RESOLVER_RE = re.compile(r"^(?P<name>[A-Za-z_][A-Za-z0-9_]*)\((?P<args>.*)\)$")
+_PLACEHOLDER_RE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")
+
+
+def active_at(
+    t: float,
+    *,
+    persistent: Sequence[LanguageRow],
+    events: Sequence[LanguageRow] | None = None,
+    style: str | None = None,
+    role: str | None = None,
+    tool_name: str | None = None,
+    camera: str | None = None,
+) -> LanguageRow | None:
+    """Return the persistent row of ``style`` that is active at time ``t``.
+
+    A persistent row is "active" at ``t`` when its own ``timestamp`` is the
+    most recent one ``<= t`` for the given ``style``/``role``/``tool_name``/
+    ``camera`` selector. ``events`` is accepted for resolver-signature
+    uniformity but is not consulted: only persistent styles are valid here.
+    """
+    _validate_persistent_resolver("active_at", style)
+    matches = _matching_rows(
+        persistent, style=style, role=role, tool_name=tool_name, camera=camera
+    )
+    matches = [row for row in matches if _timestamp(row) <= t]
+    return _select_latest(
+        matches, style=style, role=role, tool_name=tool_name, camera=camera
+    )
+
+
+def emitted_at(
+    t: float,
+    *,
+    persistent: Sequence[LanguageRow],
+    events: Sequence[LanguageRow],
+    style: str | None = None,
+    role: str | None = None,
+    tool_name: str | None = None,
+    camera: str | None = None,
+) -> LanguageRow | None:
+    """Return the row of ``style`` emitted at exactly time ``t``.
+
+    For persistent styles, this matches persistent rows whose own ``timestamp``
+    equals ``t``. For event styles, the ``events`` list is assumed to come from
+    the dataset row at frame ``t`` (event rows carry no timestamp of their own),
+    so all matching event rows are considered emitted at ``t``. ``camera``
+    filters by the row's ``camera`` field — required to disambiguate when
+    multiple view-dependent rows share ``(t, role)`` across cameras.
+    """
+    column = column_for_style(style)
+    if column == LANGUAGE_PERSISTENT:
+        matches = [
+            row
+            for row in _matching_rows(
+                persistent, style=style, role=role, tool_name=tool_name, camera=camera
+            )
+            if _timestamp(row) == t
+        ]
+        return _select_one(
+            matches,
+            style=style,
+            role=role,
+            tool_name=tool_name,
+            camera=camera,
+            sort_key=_persistent_sort_key,
+        )
+    matches = _matching_rows(
+        events, style=style, role=role, tool_name=tool_name, camera=camera
+    )
+    return _select_one(
+        matches,
+        style=style,
+        role=role,
+        tool_name=tool_name,
+        camera=camera,
+        sort_key=_event_sort_key,
+    )
+
+
+def nth_prev(
+    t: float,
+    *,
+    persistent: Sequence[LanguageRow],
+    events: Sequence[LanguageRow] | None = None,
+    style: str | None = None,
+    offset: int = 1,
+    role: str | None = None,
+    tool_name: str | None = None,
+    camera: str | None = None,
+) -> LanguageRow | None:
+    """Return the persistent row that was active ``offset`` steps before ``t``.
+
+    Walks back through chronologically sorted persistent rows of ``style``
+    (filtered by optional ``role``/``tool_name``/``camera``) and returns the
+    one ``offset`` positions before the row active at ``t``. Only valid for
+    persistent styles.
+    """
+    return _nth_relative(
+        t,
+        persistent=persistent,
+        style=style,
+        offset=-offset,
+        role=role,
+        tool_name=tool_name,
+        camera=camera,
+        resolver_name="nth_prev",
+    )
+
+
+def nth_next(
+    t: float,
+    *,
+    persistent: Sequence[LanguageRow],
+    events: Sequence[LanguageRow] | None = None,
+    style: str | None = None,
+    offset: int = 1,
+    role: str | None = None,
+    tool_name: str | None = None,
+    camera: str | None = None,
+) -> LanguageRow | None:
+    """Return the persistent row that becomes active ``offset`` steps after ``t``.
+
+    Walks forward through chronologically sorted persistent rows of ``style``
+    (filtered by optional ``role``/``tool_name``/``camera``) and returns the
+    one ``offset`` positions after the row active at ``t``. Only valid for
+    persistent styles.
+    """
+    return _nth_relative(
+        t,
+        persistent=persistent,
+        style=style,
+        offset=offset,
+        role=role,
+        tool_name=tool_name,
+        camera=camera,
+        resolver_name="nth_next",
+    )
+
+
+def render_sample(
+    *,
+    recipe: TrainingRecipe,
+    persistent: Sequence[LanguageRow] | None,
+    events: Sequence[LanguageRow] | None,
+    t: float,
+    sample_idx: int,
+    task: str | None = None,
+    dataset_ctx: Any | None = None,
+) -> RenderedMessages | None:
+    """Render the chat-style messages for a single dataset sample.
+
+    Resolves the recipe's bindings against ``persistent`` and ``events`` rows
+    at frame timestamp ``t``, then expands the recipe's message templates.
+    Returns ``None`` if the resolved sample contains no target message.
+    """
+    persistent_rows = _normalize_rows(persistent or [])
+    event_rows = _normalize_rows(events or [])
+    selected_recipe = _select_recipe(recipe, sample_idx)
+    bindings = _resolve_bindings(
+        selected_recipe,
+        persistent=persistent_rows,
+        events=event_rows,
+        t=t,
+        sample_idx=sample_idx,
+        task=task,
+        dataset_ctx=dataset_ctx,
+    )
+    return _render_message_recipe(selected_recipe, bindings)
+
+
+def _select_recipe(recipe: TrainingRecipe, sample_idx: int) -> TrainingRecipe:
+    """Pick a deterministic blend component for ``sample_idx`` (or return ``recipe``)."""
+    if recipe.blend is None:
+        return recipe
+
+    total_weight = sum(component.weight or 0.0 for component in recipe.blend.values())
+    if total_weight <= 0:
+        raise ValueError("Blend weights must sum to a positive value.")
+
+    digest = hashlib.blake2b(str(sample_idx).encode(), digest_size=8).digest()
+    draw = int.from_bytes(digest, "big") / 2**64 * total_weight
+    cumulative = 0.0
+    last_component: TrainingRecipe | None = None
+    for component in recipe.blend.values():
+        last_component = component
+        cumulative += component.weight or 0.0
+        if draw < cumulative:
+            return component
+    assert last_component is not None
+    return last_component
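
Reviewer note: the blend selection in `_select_recipe` can be checked in isolation. The sketch below is a minimal stand-alone version of the same idea, hashing the sample index to a uniform draw and walking the cumulative weights. The function name `pick_component` and the example weights are hypothetical and do not come from the PR.

```python
import hashlib


def pick_component(weights: dict[str, float], sample_idx: int) -> str:
    """Deterministically map ``sample_idx`` to one key of ``weights``."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("Blend weights must sum to a positive value.")
    # Hash the index to a draw in [0, total]; blake2b is stable across
    # processes, unlike Python's built-in ``hash``.
    digest = hashlib.blake2b(str(sample_idx).encode(), digest_size=8).digest()
    draw = int.from_bytes(digest, "big") / 2**64 * total
    cumulative = 0.0
    last = None
    for name, weight in weights.items():
        last = name
        cumulative += weight
        if draw < cumulative:
            return name
    # Floating-point round-off can leave ``draw`` on the upper edge,
    # in which case the final component is returned.
    assert last is not None
    return last


# Hypothetical blend: three parts action recipe, one part memory recipe.
weights = {"low_level_execution": 3.0, "memory_update": 1.0}
picks = [pick_component(weights, i) for i in range(2000)]
share = picks.count("low_level_execution") / len(picks)
```

Because the draw depends only on `sample_idx`, a given sample always lands in the same component, while the proportions across many samples approach the configured weights.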
+
+
+def _resolve_bindings(
+    recipe: TrainingRecipe,
+    *,
+    persistent: Sequence[LanguageRow],
+    events: Sequence[LanguageRow],
+    t: float,
+    sample_idx: int,
+    task: str | None,
+    dataset_ctx: Any | None,
+) -> dict[str, LanguageRow | str | None]:
+    """Resolve every binding in ``recipe`` (plus ``task``) at time ``t``."""
+    bindings: dict[str, LanguageRow | str | None] = {
+        "task": _resolve_task(
+            task, dataset_ctx, persistent=persistent, sample_idx=sample_idx
+        ),
+    }
+    specs = {**DEFAULT_BINDINGS, **(recipe.bindings or {})}
+    for name, spec in specs.items():
+        bindings[name] = _resolve_spec(spec, persistent=persistent, events=events, t=t)
+    return bindings
+
+
+def _resolve_task(
+    task: str | None,
+    dataset_ctx: Any | None,
+    *,
+    persistent: Sequence[LanguageRow] = (),
+    sample_idx: int = 0,
+) -> str | None:
+    """Return the task string for ``sample_idx``.
+
+    Resolution order:
+
+    1. Explicit ``task`` override (caller-supplied) wins.
+    2. If ``persistent`` contains rows of style ``task_aug`` (role=user),
+       deterministically pick one by ``sample_idx`` so the frames of an
+       episode are spread across the available rephrasings.
+       This realizes Xiao 2022 / CAST-style task-prompt diversity without
+       changing ``meta/tasks.parquet`` and without forcing recipes to opt
+       in: ``${task}`` automatically picks a rephrasing when one exists,
+       and falls back to the canonical task otherwise. Recipes that want
+       the literal canonical task can override the binding.
+    3. Otherwise read the canonical task from ``dataset_ctx`` (which is
+       backed by ``meta/tasks.parquet``).
+    """
+    if task is not None:
+        return task
+
+    aug_rows = [
+        r
+        for r in persistent
+        if r.get("style") == "task_aug" and r.get("role") == "user"
+    ]
+    if aug_rows:
+        # Deterministic, blake2b-based pick keyed on sample_idx so the
+        # rotation is reproducible across runs (Python's built-in ``hash``
+        # is process-randomized).
+        digest = hashlib.blake2b(
+            f"task_aug:{sample_idx}".encode(), digest_size=8
+        ).digest()
+        idx = int.from_bytes(digest, "big") % len(aug_rows)
+        chosen = aug_rows[idx].get("content")
+        if chosen:
+            return str(chosen)
+
+    if dataset_ctx is None:
+        return None
+    if isinstance(dataset_ctx, dict):
+        return dataset_ctx.get("task")
+    return getattr(dataset_ctx, "task", None)
+
+
+def _resolve_spec(
+    spec: str,
+    *,
+    persistent: Sequence[LanguageRow],
+    events: Sequence[LanguageRow],
+    t: float,
+) -> LanguageRow | None:
+    """Parse a single binding's resolver expression and dispatch to its function."""
+    match = _RESOLVER_RE.match(spec.strip())
+    if match is None:
+        raise ValueError(f"Invalid resolver expression: {spec!r}")
+    name = match.group("name")
+    kwargs = _parse_resolver_args(match.group("args"))
+    kwargs.pop("t_arg", None)
+
+    resolvers = {
+        "active_at": active_at,
+        "emitted_at": emitted_at,
+        "nth_prev": nth_prev,
+        "nth_next": nth_next,
+    }
+    if name not in resolvers:
+        raise ValueError(f"Unknown language resolver: {name!r}")
+    return resolvers[name](t, persistent=persistent, events=events, **kwargs)
+
+
+def _parse_resolver_args(args: str) -> dict[str, Any]:
+    """Parse a comma-separated resolver argument list into a kwargs dict."""
+    kwargs: dict[str, Any] = {}
+    if not args.strip():
+        return kwargs
+
+    parts = [part.strip() for part in args.split(",") if part.strip()]
+    for part in parts:
+        if part == "t":
+            kwargs["t_arg"] = True
+            continue
+        if "=" not in part:
+            raise ValueError(f"Invalid resolver argument: {part!r}")
+        key, value = (item.strip() for item in part.split("=", 1))
+        if key == "offset":
+            kwargs[key] = int(value)
+        else:
+            kwargs[key] = value.strip("\"'")
+    return kwargs
+
+
+def _render_message_recipe(
+    recipe: TrainingRecipe,
+    bindings: dict[str, LanguageRow | str | None],
+) -> RenderedMessages | None:
+    """Expand ``recipe.messages`` into rendered chat messages using ``bindings``."""
+    assert recipe.messages is not None
+    messages: list[dict[str, Any]] = []
+    streams: list[str | None] = []
+    target_indices: list[int] = []
+
+    for turn in recipe.messages:
+        if turn.if_present is not None and bindings.get(turn.if_present) is None:
+            continue
+
+        message: dict[str, Any] = {"role": turn.role}
+        if turn.content is not None:
+            message["content"] = _render_content(turn.content, bindings)
+
+        if turn.tool_calls_from is not None:
+            row = bindings.get(turn.tool_calls_from)
+            tool_calls = row.get("tool_calls") if isinstance(row, dict) else None
+            if tool_calls:
+                message["tool_calls"] = copy.deepcopy(tool_calls)
+
+        message_idx = len(messages)
+        messages.append(message)
+        streams.append(turn.stream)
+        if turn.target:
+            target_indices.append(message_idx)
+
+    if not target_indices:
+        return None
+
+    rendered = {
+        "messages": messages,
+        "message_streams": streams,
+        "target_message_indices": target_indices,
+    }
+    _validate_rendered(rendered)
+    return rendered
+
+
+def _render_content(
+    content: str | list[dict[str, Any]],
+    bindings: dict[str, LanguageRow | str | None],
+) -> str | list[dict[str, Any]]:
+    """Substitute bindings into a string or each string field of multimodal blocks."""
+    if isinstance(content, str):
+        return _substitute(content, bindings)
+
+    rendered_blocks = []
+    for block in content:
+        rendered_block = copy.deepcopy(block)
+        for key, value in rendered_block.items():
+            if isinstance(value, str):
+                rendered_block[key] = _substitute(value, bindings)
+        rendered_blocks.append(rendered_block)
+    return rendered_blocks
+
+
+def _substitute(template: str, bindings: dict[str, LanguageRow | str | None]) -> str:
+    """Replace ``${name}`` placeholders in ``template`` with their bound values."""
+
+    def replace(match: re.Match[str]) -> str:
+        """Resolve a single ``${name}`` match to its bound string value."""
+        name = match.group(1)
+        if name not in bindings:
+            raise ValueError(f"Unknown template binding: {name!r}")
+        value = bindings[name]
+        if value is None:
+            return ""
+        if isinstance(value, dict):
+            content = value.get("content")
+            return "" if content is None else str(content)
+        return str(value)
+
+    return _PLACEHOLDER_RE.sub(replace, template)
+
+
+def _validate_rendered(rendered: RenderedMessages) -> None:
+    """Sanity-check the rendered output for stream/target alignment."""
+    messages = rendered["messages"]
+    streams = rendered["message_streams"]
+    target_indices = rendered["target_message_indices"]
+
+    if len(streams) != len(messages):
+        raise ValueError("message_streams must be aligned with messages.")
+    if not target_indices:
+        raise ValueError("Rendered samples must contain at least one target message.")
+    for idx in target_indices:
+        if idx < 0 or idx >= len(messages):
+            raise ValueError(f"Target message index {idx} is out of bounds.")
+    for idx, stream in enumerate(streams):
+        if stream is None:
+            raise ValueError(f"Rendered message {idx} has no stream.")
+
+
+def _nth_relative(
+    t: float,
+    *,
+    persistent: Sequence[LanguageRow],
+    style: str | None,
+    offset: int,
+    role: str | None,
+    tool_name: str | None,
+    camera: str | None,
+    resolver_name: str,
+) -> LanguageRow | None:
+    """Shared body for ``nth_prev`` / ``nth_next`` with signed ``offset``."""
+    _validate_persistent_resolver(resolver_name, style)
+    if abs(offset) < 1:
+        raise ValueError(f"{resolver_name} offset must be non-zero.")
+
+    rows = sorted(
+        _matching_rows(persistent, style=style, role=role, tool_name=tool_name, camera=camera),
+        key=_persistent_sort_key,
+    )
+    if not rows:
+        return None
+
+    anchor_idx = None
+    for idx, row in enumerate(rows):
+        if _timestamp(row) <= t:
+            anchor_idx = idx
+        else:
+            break
+
+    if anchor_idx is not None:
+        target_idx = anchor_idx + offset
+    else:
+        # ``t`` precedes every row: only a forward lookup can resolve.
+        target_idx = offset - 1 if offset > 0 else None
+
+    if target_idx is None or target_idx < 0 or target_idx >= len(rows):
+        return None
+    return rows[target_idx]
+
+
+def _validate_persistent_resolver(resolver_name: str, style: str | None) -> None:
+    """Reject calls with missing or event-only ``style`` for persistent resolvers."""
+    if style is None:
+        raise ValueError(f"{resolver_name} requires a persistent style.")
+    if style in EVENT_ONLY_STYLES:
+        raise ValueError(f"{resolver_name} cannot be used with event-only style {style!r}.")
+    if style not in PERSISTENT_STYLES:
+        column_for_style(style)
+
+
+def _matching_rows(
+    rows: Sequence[LanguageRow],
+    *,
+    style: str | None,
+    role: str | None,
+    tool_name: str | None,
+    camera: str | None,
+) -> list[LanguageRow]:
+    """Return ``rows`` filtered by optional ``style``/``role``/``tool_name``/``camera`` selectors."""
+    return [
+        row
+        for row in rows
+        if (style is None or row.get("style") == style)
+        and (role is None or row.get("role") == role)
+        and (tool_name is None or _row_has_tool_name(row, tool_name))
+        and (camera is None or row.get("camera") == camera)
+    ]
+
+
+def _select_latest(
+    rows: Sequence[LanguageRow],
+    *,
+    style: str | None,
+    role: str | None,
+    tool_name: str | None,
+    camera: str | None,
+) -> LanguageRow | None:
+    """Return the row tied for the latest ``timestamp`` (disambiguated by selectors)."""
+    if not rows:
+        return None
+    rows = sorted(rows, key=_persistent_sort_key)
+    latest_ts = _timestamp(rows[-1])
+    return _select_one(
+        [row for row in rows if _timestamp(row) == latest_ts],
+        style=style,
+        role=role,
+        tool_name=tool_name,
+        camera=camera,
+        sort_key=_persistent_sort_key,
+    )
+
+
+def _select_one(
+    rows: Sequence[LanguageRow],
+    *,
+    style: str | None,
+    role: str | None,
+    tool_name: str | None,
+    camera: str | None,
+    sort_key: Any,
+) -> LanguageRow | None:
+    """Return the first row by ``sort_key``; raise if several match with no selector given."""
+    if not rows:
+        return None
+    if len(rows) > 1 and role is None and tool_name is None and camera is None:
+        raise ValueError(
+            f"Ambiguous resolver for style={style!r}; add role=..., tool_name=..., "
+            f"or camera=... to disambiguate."
+        )
+    return sorted(rows, key=sort_key)[0]
+
+
+def _persistent_sort_key(row: LanguageRow) -> tuple[float, str, str]:
+    """Sort key for persistent rows: ``(timestamp, style, role)``."""
+    return (_timestamp(row), row.get("style") or "", row.get("role") or "")
+
+
+def _event_sort_key(row: LanguageRow) -> tuple[str, str]:
+    """Sort key for event rows: ``(style, role)`` (timestamp is implicit in the frame)."""
+    return (row.get("style") or "", row.get("role") or "")
+
+
+def _timestamp(row: LanguageRow) -> float:
+    """Extract a row's ``timestamp`` as a Python float (unwrapping numpy scalars)."""
+    value = row["timestamp"]
+    return float(value.item() if hasattr(value, "item") else value)
+
+
+def _row_has_tool_name(row: LanguageRow, tool_name: str) -> bool:
+    """Return ``True`` if any of the row's tool calls invokes ``tool_name``."""
+    for tool_call in row.get("tool_calls") or []:
+        if isinstance(tool_call, str):
+            continue
+        function = tool_call.get("function") if isinstance(tool_call, dict) else None
+        if isinstance(function, dict) and function.get("name") == tool_name:
+            return True
+    return False
+
+
+def _normalize_rows(rows: Sequence[Any]) -> list[LanguageRow]:
+    """Convert pyarrow scalars / mappings into a fresh list of plain dict rows."""
+    normalized = []
+    for row in rows:
+        if row is None:
+            continue
+        if hasattr(row, "as_py"):
+            row = row.as_py()
+        if not isinstance(row, dict):
+            raise TypeError(f"Language rows must be dictionaries, got {type(row).__name__}.")
+        normalized.append(dict(row))
+    return normalized
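
Reviewer note: the persistent-row resolvers in this file (`active_at`, `nth_prev`, `nth_next`) all anchor on the latest row whose timestamp is `<= t`. The sketch below reproduces that index arithmetic on a bare list of sorted timestamps. The helper names and the example timestamps are hypothetical, and `bisect` is swapped in for the linear scan used in `_nth_relative`.

```python
from bisect import bisect_right
from typing import Optional


def active_index(timestamps: list[float], t: float) -> Optional[int]:
    """Index of the latest timestamp ``<= t`` in a sorted list, or ``None``."""
    idx = bisect_right(timestamps, t) - 1
    return idx if idx >= 0 else None


def relative_index(timestamps: list[float], t: float, offset: int) -> Optional[int]:
    """Index ``offset`` positions from the active row (negative means earlier)."""
    if offset == 0:
        raise ValueError("offset must be non-zero")
    anchor = active_index(timestamps, t)
    if anchor is None:
        # ``t`` precedes every row: only a forward lookup can resolve.
        target = offset - 1 if offset > 0 else None
    else:
        target = anchor + offset
    if target is None or not 0 <= target < len(timestamps):
        return None
    return target


# Hypothetical start times of three persistent rows of one style.
subtask_starts = [0.0, 2.5, 6.0]
```

With these timestamps, a query at `t=3.0` is anchored on the second row, so an offset of `-1` resolves to the first row and an offset of `+1` to the third.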
@@ -83,7 +83,6 @@ VIDEO_DIR = "videos"

 CHUNK_FILE_PATTERN = "chunk-{chunk_index:03d}/file-{file_index:03d}"
 DEFAULT_TASKS_PATH = "meta/tasks.parquet"
-DEFAULT_SUBTASKS_PATH = "meta/subtasks.parquet"
 DEFAULT_EPISODES_PATH = EPISODES_DIR + "/" + CHUNK_FILE_PATTERN + ".parquet"
 DEFAULT_DATA_PATH = DATA_DIR + "/" + CHUNK_FILE_PATTERN + ".parquet"
 DEFAULT_VIDEO_PATH = VIDEO_DIR + "/{video_key}/" + CHUNK_FILE_PATTERN + ".mp4"
@@ -142,10 +142,9 @@ class ACTPolicy(PreTrainedPolicy):

        actions_hat, (mu_hat, log_sigma_x2_hat) = self.model(batch)

-        abs_err = F.l1_loss(batch[ACTION], actions_hat, reduction="none")
-        valid_mask = ~batch["action_is_pad"].unsqueeze(-1)
-        num_valid = valid_mask.sum() * abs_err.shape[-1]
-        l1_loss = (abs_err * valid_mask).sum() / num_valid.clamp_min(1)
+        l1_loss = (
+            F.l1_loss(batch[ACTION], actions_hat, reduction="none") * ~batch["action_is_pad"].unsqueeze(-1)
+        ).mean()

        loss_dict = {"l1_loss": l1_loss.item()}
        if self.config.use_vae:
@@ -380,9 +380,7 @@ class DiffusionModel(nn.Module):
                    f"{self.config.do_mask_loss_for_padding=}."
                )
            in_episode_bound = ~batch["action_is_pad"]
-            mask = in_episode_bound.unsqueeze(-1)
-            num_valid = mask.sum() * loss.shape[-1]
-            return (loss * mask).sum() / num_valid.clamp_min(1)
+            loss = loss * in_episode_bound.unsqueeze(-1)

        return loss.mean()
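
Reviewer note: the loss hunks in this diff replace the `sum / num_valid` normalisation with a mask multiply followed by `.mean()`. The two reductions differ whenever a batch contains padded steps, because `.mean()` still counts the zeroed entries in its denominator. The sketch below shows the arithmetic in plain Python rather than torch; the function name and the numbers are hypothetical.

```python
def masked_losses(per_step: list[float], is_pad: list[bool]) -> tuple[float, float]:
    """Return (mean over all steps with padding zeroed, mean over valid steps only)."""
    kept = [0.0 if pad else loss for loss, pad in zip(per_step, is_pad)]
    # Reduction after this diff: zero the padded steps, then average over everything.
    mean_over_all = sum(kept) / len(kept)
    # Reduction before this diff: divide by the number of valid steps (at least 1).
    num_valid = max(sum(not pad for pad in is_pad), 1)
    mean_over_valid = sum(kept) / num_valid
    return mean_over_all, mean_over_valid


# Hypothetical chunk: four action steps, the last two are padding.
mean_all, mean_valid = masked_losses([1.0, 1.0, 5.0, 5.0], [False, False, True, True])
```

Here the valid steps each have a loss of 1.0, yet the plain mean reports 0.5, so the reported loss scales with the fraction of padding in the batch.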

@@ -140,6 +140,10 @@ def get_policy_class(name: str) -> type[PreTrainedPolicy]:
        from .smolvla.modeling_smolvla import SmolVLAPolicy

        return SmolVLAPolicy
+    elif name == "smolvla2":
+        from .smolvla2.modeling_smolvla2 import SmolVLA2Policy
+
+        return SmolVLA2Policy
    elif name == "sarm":
        from .sarm.modeling_sarm import SARMRewardModel

@@ -200,6 +204,10 @@ def make_policy_config(policy_type: str, **kwargs) -> PreTrainedConfig:
        return SACConfig(**kwargs)
    elif policy_type == "smolvla":
        return SmolVLAConfig(**kwargs)
+    elif policy_type == "smolvla2":
+        from .smolvla2.configuration_smolvla2 import SmolVLA2Config
+
+        return SmolVLA2Config(**kwargs)
    elif policy_type == "reward_classifier":
        return RewardClassifierConfig(**kwargs)
    elif policy_type == "groot":
@@ -386,6 +394,17 @@ def make_pre_post_processors(
            dataset_stats=kwargs.get("dataset_stats"),
        )

+    elif policy_cfg.type == "smolvla2":
+        # NOTE: SmolVLA2Config subclasses SmolVLAConfig, so this branch
+        # MUST come before the SmolVLAConfig isinstance check below
+        # (otherwise SmolVLA2 would silently pick up SmolVLA's processor).
+        from .smolvla2.processor_smolvla2 import make_smolvla2_pre_post_processors
+
+        processors = make_smolvla2_pre_post_processors(
+            config=policy_cfg,
+            dataset_stats=kwargs.get("dataset_stats"),
+        )
+
    elif isinstance(policy_cfg, SmolVLAConfig):
        from .smolvla.processor_smolvla import make_smolvla_pre_post_processors

@@ -13,6 +13,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

+from dataclasses import dataclass, field
 from pathlib import Path
 from typing import TYPE_CHECKING

@@ -173,14 +174,17 @@ N_COLOR_CHANNELS = 3


 # config
+@dataclass
 class GR00TN15Config(PretrainedConfig):
    model_type = "gr00t_n1_5"
+    backbone_cfg: dict = field(init=False, metadata={"help": "Backbone configuration."})

-    backbone_cfg: dict
-    action_head_cfg: dict
-    action_horizon: int
-    action_dim: int
-    compute_dtype: str = "float32"
+    action_head_cfg: dict = field(init=False, metadata={"help": "Action head configuration."})
+
+    action_horizon: int = field(init=False, metadata={"help": "Action horizon."})
+
+    action_dim: int = field(init=False, metadata={"help": "Action dimension."})
+    compute_dtype: str = field(default="float32", metadata={"help": "Compute dtype."})

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
@@ -688,9 +688,8 @@ class DiffusionObjective(nn.Module):
        loss = F.mse_loss(predicted, target, reduction="none")

        if self.do_mask_loss_for_padding and "action_is_pad" in batch:
-            mask = ~batch["action_is_pad"].unsqueeze(-1)
-            num_valid = mask.sum() * loss.shape[-1]
-            return (loss * mask).sum() / num_valid.clamp_min(1)
+            valid_actions = ~batch["action_is_pad"]
+            loss = loss * valid_actions.unsqueeze(-1)

        return loss.mean()

@@ -753,9 +752,8 @@ class FlowMatchingObjective(nn.Module):
        loss = F.mse_loss(predicted_velocity, target_velocity, reduction="none")

        if self.do_mask_loss_for_padding and "action_is_pad" in batch:
-            mask = ~batch["action_is_pad"].unsqueeze(-1)
-            num_valid = mask.sum() * loss.shape[-1]
-            return (loss * mask).sum() / num_valid.clamp_min(1)
+            valid_mask = ~batch["action_is_pad"]
+            loss = loss * valid_mask.unsqueeze(-1)

        return loss.mean()

@@ -455,13 +455,7 @@ class SARMEncodingProcessorStep(ProcessorStep):
            inputs = {k: v.to(self.device) for k, v in inputs.items()}

            # Get image embeddings
-            # transformers 5.x returns BaseModelOutputWithPooling instead of a plain tensor
-            output = self.clip_model.get_image_features(**inputs)
-            if not isinstance(output, torch.Tensor):
-                output = output.pooler_output
-                if output is None:
-                    raise ValueError("pooler_output should not be None for CLIP models.")
-            embeddings = output.detach().cpu()
+            embeddings = self.clip_model.get_image_features(**inputs).detach().cpu()

            # Handle single frame case
            if embeddings.dim() == 1:
@@ -488,13 +482,7 @@ class SARMEncodingProcessorStep(ProcessorStep):
        inputs = self.clip_processor.tokenizer([text], return_tensors="pt", padding=True, truncation=True)
        inputs = {k: v.to(self.device) for k, v in inputs.items()}

-        # transformers 5.x returns BaseModelOutputWithPooling instead of a plain tensor
-        output = self.clip_model.get_text_features(**inputs)
-        if not isinstance(output, torch.Tensor):
-            output = output.pooler_output
-            if output is None:
-                raise ValueError("pooler_output should not be None for CLIP models.")
-        text_embedding = output.detach().cpu()
+        text_embedding = self.clip_model.get_text_features(**inputs).detach().cpu()
        text_embedding = text_embedding.expand(batch_size, -1)

        return text_embedding
@@ -394,21 +394,13 @@ class SmolVLAPolicy(PreTrainedPolicy):
        loss_dict["losses_after_rm_padding"] = losses.clone().mean().item()

        if reduction == "none":
-            # Return per-sample losses (B,) by averaging over valid (time, action) entries
-            if actions_is_pad is None:
-                per_sample_loss = losses.mean(dim=(1, 2))
-            else:
-                num_valid = ((~actions_is_pad).sum(dim=1) * losses.shape[-1]).clamp_min(1)
-                per_sample_loss = losses.sum(dim=(1, 2)) / num_valid
+            # Return per-sample losses (B,) by averaging over time and action dims
+            per_sample_loss = losses.mean(dim=(1, 2))
            loss_dict["loss"] = per_sample_loss.mean().item()
            return per_sample_loss, loss_dict
        else:
-            # Default: return scalar mean loss over valid (time, action) entries
-            if actions_is_pad is None:
-                loss = losses.mean()
-            else:
-                num_valid = ((~actions_is_pad).sum() * losses.shape[-1]).clamp_min(1)
-                loss = losses.sum() / num_valid
+            # Default: return scalar mean loss
+            loss = losses.mean()
            loss_dict["loss"] = loss.item()
            return loss, loss_dict

@@ -0,0 +1,38 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""SmolVLA2 — SmolVLA with the SmolVLM language head re-enabled.
+
+SmolVLA strips the LM head from the SmolVLM backbone because it only does
+flow-matching action prediction. SmolVLA2 keeps the LM head so the same
+model can train on the full Hi Robot / MEM / ECoT message blend defined in
+the steerable annotation plan (PR1 + PR2):
+
+* action-only sub-recipes (e.g. ``low_level_execution``) → flow loss
+* text-only sub-recipes (e.g. ``memory_update``, ``ask_vqa``,
+  ``user_interjection_response``, ``high_level_subtask``) → CE loss on
+  ``lm_head`` over the recipe's target message tokens
+* mixed sub-recipes → both losses summed (weighted)
+
+The ``predict_actions`` toggle follows the Pi0.5 convention from Section
+I.7 of the plan: ``True`` if any ``low_level`` target is present in the
+sample, else ``False``.
+
+This package is a thin subclass of ``lerobot.policies.smolvla`` so most of
+the model code stays in one place — only the dual-loss path and the
+chat-template processor live here.
+"""
+
+from .configuration_smolvla2 import SmolVLA2Config
+
+__all__ = ["SmolVLA2Config"]
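
Reviewer note: the package docstring describes CE loss over "the recipe's target message tokens" and a `predict_actions` flag that is true when any target is on the `low_level` stream. The sketch below illustrates both with a toy whitespace "chat template" instead of the SmolVLM tokenizer. The `render` function, the `<role>` / `<end>` markers, and the `high_level` stream name are all assumptions made for illustration.

```python
def render(messages: list[dict[str, str]]) -> list[str]:
    """Toy chat template: a role marker, the words, then an end marker per turn."""
    tokens: list[str] = []
    for message in messages:
        tokens.append(f"<{message['role']}>")
        tokens.extend(message["content"].split())
        tokens.append("<end>")
    return tokens


def target_token_mask(messages: list[dict[str, str]], target_indices: list[int]) -> list[bool]:
    """Mark the token positions that belong to target messages."""
    full = render(messages)
    mask = [False] * len(full)
    for tgt in target_indices:
        # Render the chat up to, and then through, the target message;
        # the two lengths bracket the target's token span.
        start = len(render(messages[:tgt]))
        end = min(len(render(messages[: tgt + 1])), len(full))
        for pos in range(start, end):
            mask[pos] = True
    return mask


messages = [
    {"role": "user", "content": "pick up the cup"},
    {"role": "assistant", "content": "reach for cup"},
]
streams = ["high_level", "low_level"]  # hypothetical stream labels
targets = [1]

mask = target_token_mask(messages, targets)
predict_actions = any(streams[i] == "low_level" for i in targets)
```

In the real step the supervised positions carry the token id and every other position carries `-100`; the boolean mask here stands in for that distinction.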
@@ -0,0 +1,271 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""SmolVLA2's chat-template tokenization step.
+
+Replaces SmolVLA's plain ``TokenizerProcessorStep`` for SmolVLA2 when a
+``recipe_path`` is set. Reads the rendered messages produced by
+``RenderMessagesStep`` (PR 1) and produces:
+
+* ``OBS_LANGUAGE_TOKENS`` / ``OBS_LANGUAGE_ATTENTION_MASK`` —
+  the chat-templated prompt tokenized by SmolVLM's tokenizer, with
+  ``tools=self.tools`` (``DEFAULT_TOOLS`` unless overridden with ``with_tools``).
+* ``text_labels`` — same shape as token ids, ``-100`` everywhere except
+  the positions belonging to messages whose index is in
+  ``target_message_indices``. The next commit's modeling forward path
+  applies cross-entropy on those positions via the SmolVLM ``lm_head``.
+* ``predict_actions`` — bool tensor, ``True`` iff any of the rendered
+  target messages has ``message_streams[i] == "low_level"``. The
+  modeling forward uses this to gate the flow head.
+
+Image / video content blocks in the rendered messages are dropped
+before tokenization — the chat template only handles text, and SmolVLA
+already passes camera tensors out-of-band via the standard
+``OBS_IMAGES_*`` features. This keeps the prefix layout unchanged
+(``embed_prefix`` puts image embeddings before language embeddings,
+matching the chat-template-stripped text order).
+"""
+
+from __future__ import annotations
+
+import copy
+import logging
+from dataclasses import dataclass
+from typing import Any
+
+import torch
+
+from lerobot.configs import PipelineFeatureType, PolicyFeature
+from lerobot.datasets.language import DEFAULT_TOOLS
+from lerobot.processor.pipeline import ProcessorStep, ProcessorStepRegistry
+from lerobot.types import EnvTransition, TransitionKey
+from lerobot.utils.constants import OBS_LANGUAGE_ATTENTION_MASK, OBS_LANGUAGE_TOKENS
+
+logger = logging.getLogger(__name__)
+
+
+@dataclass
+@ProcessorStepRegistry.register(name="smolvla2_chat_tokenizer")
+class SmolVLA2ChatTokenizerStep(ProcessorStep):
+    """Render messages → token ids + label mask + predict_actions flag.
+
+    This is the bridge between the recipe stack (PR 1's
+    ``RenderMessagesStep`` outputs) and the SmolVLA2 modeling forward
+    (next commit, which reads ``text_labels`` / ``predict_actions``).
+    Pure-text turns and multi-stream targets are both handled.
+    """
+
+    tokenizer_name: str = "HuggingFaceTB/SmolVLM2-500M-Video-Instruct"
+    max_length: int = 2048
+    padding: str = "longest"
+    padding_side: str = "right"
+    tools: list[dict[str, Any]] | None = None
+
+    def __post_init__(self) -> None:
+        # Lazy: don't load the tokenizer until the step actually runs,
+        # so unit tests that import the module without transformers
+        # installed still pass.
+        self._tokenizer: Any = None
+        if self.tools is None:
+            # Default: ship the canonical ``say`` schema. Users who set
+            # ``meta.tools`` differently can override via
+            # ``with_tools(meta.tools)``.
+            self.tools = list(DEFAULT_TOOLS)
+
+    # ------------------------------------------------------------------
+    # Public API
+    # ------------------------------------------------------------------
+
+    def with_tools(self, tools: list[dict[str, Any]]) -> "SmolVLA2ChatTokenizerStep":
+        """Override the tools catalog rendered into the system prompt."""
+        self.tools = list(tools)
+        return self
+
+    def __call__(self, transition: EnvTransition) -> EnvTransition | None:
+        comp = transition.get(TransitionKey.COMPLEMENTARY_DATA) or {}
+        messages = comp.get("messages")
+        if not messages:
+            # No recipe rendering happened — nothing to do; downstream
+            # falls back to whatever ``task`` is in the transition.
+            return transition
+
+        message_streams: list[str | None] = list(comp.get("message_streams") or [])
+        target_indices: list[int] = sorted(
+            int(i) for i in (comp.get("target_message_indices") or [])
+        )
+
+        tokenizer = self._get_tokenizer()
+        text_messages = [_strip_lerobot_blocks(m) for m in messages]
+
+        # Tokenize the full chat once.
+        full_ids = tokenizer.apply_chat_template(
+            text_messages,
+            tools=self.tools,
+            add_generation_prompt=False,
+            tokenize=True,
+            return_tensors=None,
+        )
+        if isinstance(full_ids, list) and full_ids and isinstance(full_ids[0], list):
+            full_ids = full_ids[0]
+
+        # Build the label mask by re-rendering the chat up to each target
+        # message and reading off the prefix length. Using the same
+        # tokenizer, ``tools=`` argument, and chat template for every
+        # render keeps token boundaries consistent, so the prefix tokens
+        # should be a prefix of the full sequence (assuming the template
+        # renders each turn independently of the turns that follow it).
+        labels = [-100] * len(full_ids)
+        for tgt in target_indices:
+            prefix_ids = tokenizer.apply_chat_template(
+                text_messages[:tgt],
+                tools=self.tools,
+                add_generation_prompt=False,
+                tokenize=True,
+                return_tensors=None,
+            )
+            full_through_target = tokenizer.apply_chat_template(
+                text_messages[: tgt + 1],
+                tools=self.tools,
+                add_generation_prompt=False,
+                tokenize=True,
+                return_tensors=None,
+            )
+            if isinstance(prefix_ids, list) and prefix_ids and isinstance(prefix_ids[0], list):
+                prefix_ids = prefix_ids[0]
+            if (
+                isinstance(full_through_target, list)
+                and full_through_target
+                and isinstance(full_through_target[0], list)
+            ):
+                full_through_target = full_through_target[0]
+            start = len(prefix_ids)
+            end = min(len(full_through_target), len(full_ids))
+            for pos in range(start, end):
+                labels[pos] = int(full_ids[pos])
+
+        # Truncate / pad to ``max_length`` so batches collate cleanly.
+        # Only ``padding="max_length"`` pads here; any other mode (e.g.
+        # "longest") leaves the sequence at its truncated length.
+        if len(full_ids) > self.max_length:
+            full_ids = full_ids[: self.max_length]
+            labels = labels[: self.max_length]
+        attn = [1] * len(full_ids)
+        if self.padding == "max_length" and len(full_ids) < self.max_length:
+            pad_id = (
+                tokenizer.pad_token_id
+                if tokenizer.pad_token_id is not None
+                else 0
+            )
+            n_pad = self.max_length - len(full_ids)
+            full_ids = full_ids + [pad_id] * n_pad
+            labels = labels + [-100] * n_pad
+            attn = attn + [0] * n_pad
+
+        ids_t = torch.tensor(full_ids, dtype=torch.long)
+        attn_t = torch.tensor(attn, dtype=torch.bool)
+        labels_t = torch.tensor(labels, dtype=torch.long)
+        predict_actions = any(
+            i < len(message_streams) and message_streams[i] == "low_level"
+            for i in target_indices
+        )
+
+        new_complementary = dict(comp)
+        # Drop the per-recipe sidecar keys; everything downstream needs
+        # is now in the tokenized form.
+        new_complementary.pop("messages", None)
+        new_complementary.pop("message_streams", None)
+        new_complementary.pop("target_message_indices", None)
+        # SmolVLA's pipeline expects ``OBS_LANGUAGE_TOKENS`` /
+        # ``OBS_LANGUAGE_ATTENTION_MASK`` on the OBSERVATION key. Place
+        # them there — and drop ``task`` so the upstream
+        # ``TokenizerProcessorStep`` (which we replace) doesn't double-
+        # tokenize.
+        observation = dict(transition.get(TransitionKey.OBSERVATION) or {})
+        observation[OBS_LANGUAGE_TOKENS] = ids_t
+        observation[OBS_LANGUAGE_ATTENTION_MASK] = attn_t
+        new_complementary["text_labels"] = labels_t
+        new_complementary["predict_actions"] = torch.tensor(predict_actions, dtype=torch.bool)
+        new_complementary.pop("task", None)
+
+        new_transition = dict(transition)
+        new_transition[TransitionKey.COMPLEMENTARY_DATA] = new_complementary
+        new_transition[TransitionKey.OBSERVATION] = observation
+        return new_transition
+
+    def transform_features(
+        self, features: dict[PipelineFeatureType, dict[str, PolicyFeature]]
+    ) -> dict[PipelineFeatureType, dict[str, PolicyFeature]]:
+        """Pass-through; this step writes runtime tensors, not features."""
+        return features
+
+    # ------------------------------------------------------------------
+    # Helpers
+    # ------------------------------------------------------------------
+
+    def _get_tokenizer(self):  # noqa: ANN202
+        if self._tokenizer is not None:
+            return self._tokenizer
+        try:
+            from transformers import AutoTokenizer  # noqa: PLC0415
+        except ImportError as exc:  # pragma: no cover
+            raise ImportError(
+                "SmolVLA2ChatTokenizerStep requires transformers. "
+                "Install it with `pip install 'lerobot[transformers-dep]'`."
+            ) from exc
+        self._tokenizer = AutoTokenizer.from_pretrained(self.tokenizer_name)
+        if self._tokenizer.pad_token_id is None and self._tokenizer.eos_token_id is not None:
+            self._tokenizer.pad_token = self._tokenizer.eos_token
+        return self._tokenizer
+
+
+def _strip_lerobot_blocks(message: dict[str, Any]) -> dict[str, Any]:
+    """Remove LeRobot-specific multimodal blocks from ``message`` content.
+
+    The recipe DSL allows authors to write multimodal content like
+    ``{"type": "image", "feature": "observation.images.top"}``. SmolVLM's
+    tokenizer doesn't know that ``feature`` key (it expects ``url`` or
+    ``path``). The actual image tensor flows through SmolVLA's
+    ``OBS_IMAGES_*`` channels separately; the chat template only needs
+    the text. So we strip non-text blocks before tokenizing.
+    """
+    new = dict(message)
+    content = new.get("content")
+    if isinstance(content, list):
+        text_parts: list[dict[str, Any]] = []
+        for block in content:
+            if not isinstance(block, dict):
+                continue
+            if block.get("type") == "text":
+                text_parts.append({"type": "text", "text": str(block.get("text", ""))})
+        # If only one text block survives, flatten to a string for
+        # template friendliness; some chat templates choke on a single-
+        # element list.
+        if len(text_parts) == 1:
+            new["content"] = text_parts[0]["text"]
+        elif text_parts:
+            new["content"] = text_parts
+        else:
+            new["content"] = ""
+    if "tool_calls" in new and not new["tool_calls"]:
+        # Drop empty tool_calls — some templates render them as a
+        # spurious empty marker.
+        new.pop("tool_calls")
+    # ``stream`` and ``target`` were recipe metadata; templates don't
+    # know them and may warn or crash.
+    new.pop("stream", None)
+    new.pop("target", None)
+    return new
+
+
+# Re-export for tests / introspection
+strip_lerobot_blocks = _strip_lerobot_blocks
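
Review note on the label mask in ``SmolVLA2ChatTokenizerStep.__call__``: the prefix re-rendering idea can be checked in isolation. The sketch below is a minimal illustration under stated assumptions, not the step itself. ``toy_render`` is a hypothetical stand-in for ``tokenizer.apply_chat_template`` (one token per character, no tools), and ``build_labels`` is a hypothetical helper name.

```python
IGNORE_INDEX = -100  # positions with this label are skipped by the loss


def toy_render(messages):
    """Hypothetical stand-in for ``apply_chat_template``: one token per character."""
    text = "".join(f"<{m['role']}>{m['content']}</s>" for m in messages)
    return [ord(c) for c in text]


def build_labels(messages, target_indices, render=toy_render):
    """Supervise only the tokens that belong to the target messages."""
    full_ids = render(messages)
    labels = [IGNORE_INDEX] * len(full_ids)
    for tgt in sorted(target_indices):
        # Tokens before the target message: everything rendered from messages[:tgt].
        start = len(render(messages[:tgt]))
        # Tokens up to and including the target message.
        end = min(len(render(messages[: tgt + 1])), len(full_ids))
        labels[start:end] = full_ids[start:end]
    return full_ids, labels


messages = [
    {"role": "user", "content": "pick the cup"},
    {"role": "assistant", "content": "ok"},
]
full_ids, labels = build_labels(messages, [1])
supervised = "".join(chr(t) for t in labels if t != IGNORE_INDEX)
# supervised == "<assistant>ok</s>"
```

In this toy the supervised span includes the role header of the target message, because the prefix is rendered without a generation prompt. The real step appears to behave the same way with ``add_generation_prompt=False``, so the role marker tokens of each target message would also receive a loss.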
@@ -0,0 +1,97 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from dataclasses import dataclass
+
+from lerobot.configs import PreTrainedConfig
+
+from ..smolvla.configuration_smolvla import SmolVLAConfig
+
+
+@PreTrainedConfig.register_subclass("smolvla2")
+@dataclass
+class SmolVLA2Config(SmolVLAConfig):
+    """SmolVLA2 — SmolVLA with the underlying SmolVLM language head re-enabled.
+
+    SmolVLA strips the LM head from the SmolVLM backbone because it only
+    needs flow-matching action prediction. SmolVLA2 keeps the LM head so the
+    same model can train on:
+
+      * **action-only sub-recipes** (e.g. ``low_level_execution``) — flow loss
+        on the action expert, same as SmolVLA. ``predict_actions=True``.
+      * **text-only sub-recipes** (e.g. ``memory_update`` / ``ask_vqa`` /
+        ``user_interjection_response`` / ``high_level_subtask``) — cross-
+        entropy loss on the LM head over the recipe's target message tokens.
+        Skips the flow head entirely. ``predict_actions=False``.
+      * **mixed sub-recipes** — both heads run, losses summed (weighted).
+
+    The split is controlled by ``predict_actions = bool(targets_by_stream
+    .get("low_level"))`` per the Pi0.5 convention in the steerable
+    annotation plan (Section I.7), implemented inside the processor /
+    forward path. Recipes drive it via ``stream`` + ``target`` metadata.
+
+    Compared to ``SmolVLAConfig`` this adds:
+
+    - ``recipe_path``: path to a ``TrainingRecipe`` YAML (loaded by the
+      train script). When ``None``, SmolVLA2 falls back to the SmolVLA
+      task-only path so unannotated datasets still work.
+    - ``text_loss_weight`` / ``flow_loss_weight``: relative weights when
+      both losses are active in a single sample.
+    - ``unfreeze_lm_head``: must be ``True`` for the text head to learn —
+      SmolVLA freezes ``lm_head`` to "avoid unused params issues" and we
+      need to undo that for SmolVLA2.
+    - ``train_expert_only=False`` by default, since the VLM body now also
+      participates in text-target gradients.
+    """
+
+    # Recipe / language stack ---------------------------------------------
+    recipe_path: str | None = "recipes/smolvla2_hirobot.yaml"
+    """Path (absolute or relative to ``src/lerobot/configs/``) to a
+    ``TrainingRecipe`` YAML. The default points at the canonical Hi Robot
+    blend shipped alongside SmolVLA2. Set to ``None`` to disable recipe
+    rendering and fall back to SmolVLA's single-task prompt path
+    (unannotated datasets keep working that way)."""
+
+    apply_chat_template: bool = True
+    """Apply the SmolVLM tokenizer's chat template to the rendered messages
+    before tokenizing. SmolVLM's backbone is chat-pretrained, so this
+    matches its training distribution."""
+
+    # Loss weights --------------------------------------------------------
+    text_loss_weight: float = 1.0
+    """Weight on the LM-head cross-entropy term. Set to ``0`` to disable
+    text training entirely (reverts to flow-only / SmolVLA behaviour)."""
+
+    flow_loss_weight: float = 1.0
+    """Weight on the action-expert flow-matching term."""
+
+    # Backbone training ---------------------------------------------------
+    unfreeze_lm_head: bool = True
+    """Whether to unfreeze the SmolVLM ``lm_head`` (and the immediately
+    preceding norm that SmolVLA freezes). Must be
+    ``True`` for the text head to learn. Setting this to ``False``
+    effectively reduces SmolVLA2 back to SmolVLA's flow-only training,
+    which is occasionally useful for ablations."""
+
+    def __post_init__(self) -> None:
+        super().__post_init__()
+        # Backbone needs gradients flowing through its text path when the
+        # LM head is producing supervised text. Override the SmolVLA
+        # default (`train_expert_only=True`) unless the user explicitly
+        # opts out of text training via `text_loss_weight=0`.
+        if self.text_loss_weight > 0 and self.unfreeze_lm_head:
+            # NOTE: ``__post_init__`` runs after field assignment, so this
+            # also overrides an explicit ``train_expert_only=True``. To keep
+            # the backbone frozen, set ``text_loss_weight=0`` or
+            # ``unfreeze_lm_head=False``.
+            self.train_expert_only = False
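
Review note on ``SmolVLA2Config.__post_init__``: dataclasses call ``__post_init__`` after the generated ``__init__`` has assigned every field, so the hook sees (and can overwrite) values the caller passed explicitly. The toy classes below are hypothetical stand-ins for the real configs and only illustrate that ordering.

```python
from dataclasses import dataclass


@dataclass
class ToyBaseConfig:
    # Stand-in for the SmolVLA default.
    train_expert_only: bool = True


@dataclass
class ToyDualHeadConfig(ToyBaseConfig):
    text_loss_weight: float = 1.0
    unfreeze_lm_head: bool = True

    def __post_init__(self) -> None:
        # Same condition as the hook above: flip the flag whenever the
        # text head is meant to train.
        if self.text_loss_weight > 0 and self.unfreeze_lm_head:
            self.train_expert_only = False


# An explicit True is overwritten because the hook runs after assignment.
explicit = ToyDualHeadConfig(train_expert_only=True)
# Opting out of text training leaves the explicit value alone.
flow_only = ToyDualHeadConfig(train_expert_only=True, text_loss_weight=0.0)
```

If a user-supplied ``train_expert_only=True`` should win, one option would be to make the field ``bool | None`` with ``None`` meaning "not set"; that is a design suggestion, not something this diff does.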
@@ -0,0 +1,119 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""SmolVLA2 modeling — dual-head subclass of SmolVLAPolicy.
+
+This module defines :class:`SmolVLA2Policy`, which extends SmolVLA with:
+
+* an unfrozen SmolVLM ``lm_head`` so language tokens can be supervised,
+* a forward path that routes to the flow head, the text head, or both,
+  driven by ``batch["predict_actions"]`` and ``batch["text_labels"]``.
+
+The text-head computation itself is NOT wired up in this scaffold commit,
+even though the SmolVLA2 processor already emits ``text_labels`` and
+``predict_actions``. This file is the structural placeholder that:
+
+1. registers the ``SmolVLA2Policy`` class with the right config name so
+   ``policies/factory.py`` can build it,
+2. unfreezes ``lm_head`` at construction time when the config asks for it
+   (otherwise SmolVLA's ``train_expert_only`` freezes it again on every
+   ``train()`` call),
+3. forwards to ``SmolVLAPolicy.forward`` so behaviour is identical to
+   SmolVLA when no text labels are present — i.e. existing SmolVLA
+   training scripts keep working.
+
+The next commit on this branch fills in the actual text-loss path.
+"""
+
+from __future__ import annotations
+
+from typing import Any
+
+import torch
+from torch import Tensor
+
+from ..smolvla.modeling_smolvla import SmolVLAPolicy
+from .configuration_smolvla2 import SmolVLA2Config
+
+
+class SmolVLA2Policy(SmolVLAPolicy):
+    """SmolVLA + re-enabled SmolVLM language head.
+
+    Compatible drop-in for ``SmolVLAPolicy`` from a checkpoint or factory
+    perspective. Behaviourally identical to SmolVLA until the text-head
+    code path lands in the next commit on this branch.
+    """
+
+    config_class = SmolVLA2Config
+    name = "smolvla2"
+
+    def __init__(self, config: SmolVLA2Config, dataset_stats: dict[str, dict[str, Tensor]] | None = None):
+        if not isinstance(config, SmolVLA2Config):
+            # Allow loading a SmolVLA checkpoint into a SmolVLA2 model by
+            # widening the config type — the new fields fall back to their
+            # defaults, which preserves the existing SmolVLA behaviour.
+            config = SmolVLA2Config(**{
+                f.name: getattr(config, f.name)
+                for f in config.__dataclass_fields__.values()
+                if hasattr(config, f.name)
+            })
+        super().__init__(config, dataset_stats=dataset_stats)
+        if config.unfreeze_lm_head and config.text_loss_weight > 0:
+            self._unfreeze_lm_head()
+
+    # ------------------------------------------------------------------
+    # Backbone surgery
+    # ------------------------------------------------------------------
+
+    def _unfreeze_lm_head(self) -> None:
+        """Re-enable gradients on the SmolVLM ``lm_head`` and final norm so
+        the text loss can flow back.
+
+        SmolVLA's ``SmolVLMWithExpertModel.set_requires_grad`` freezes
+        ``lm_head``, ``text_model.model.norm.weight``, and the last
+        ``text_model.layers.<N-1>`` block. This method re-enables only the
+        first two; the last layer block keeps whatever ``requires_grad``
+        state SmolVLA gave it.
+        """
+        vlm_with_expert = getattr(self.model, "vlm_with_expert", None)
+        if vlm_with_expert is None:
+            return
+        vlm = getattr(vlm_with_expert, "vlm", None)
+        if vlm is None:
+            return
+        for name, param in vlm.named_parameters():
+            if (
+                "lm_head" in name
+                or "text_model.model.norm.weight" in name
+            ):
+                param.requires_grad = True
+
+    # ------------------------------------------------------------------
+    # Forward
+    # ------------------------------------------------------------------
+
+    def forward(
+        self,
+        batch: dict[str, Tensor],
+        noise: Tensor | None = None,
+        time: Tensor | None = None,
+        reduction: str = "mean",
+    ) -> tuple[Tensor, dict[str, Any]]:
+        """Forward pass with optional text-head loss.
+
+        SCAFFOLD: forwards directly to ``SmolVLAPolicy.forward``. The
+        actual text-loss / dual-head routing lands in the next commit on
+        this branch — it will read ``batch["text_labels"]`` and
+        ``batch["predict_actions"]`` (both produced by the SmolVLA2
+        processor) to decide which head(s) to run.
+        """
+        return super().forward(batch, noise=noise, time=time, reduction=reduction)
@@ -0,0 +1,131 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""SmolVLA2 processor pipelines.
+
+When ``config.recipe_path`` is set, the pre-processor pipeline becomes:
+
+    rename observations
+    add batch dim
+    RenderMessagesStep(recipe)              # PR 1: language_*  → messages
+    SmolVLA2ChatTokenizerStep(...)          # chat template + label mask + predict_actions
+    DeviceProcessorStep
+    NormalizerProcessorStep
+
+When ``config.recipe_path`` is ``None``, we delegate to SmolVLA's
+plain task-string pipeline so unannotated datasets still work.
+
+Post-processor is unchanged from SmolVLA.
+"""
+
+from __future__ import annotations
+
+from pathlib import Path
+from typing import Any
+
+import torch
+
+from lerobot.configs.recipe import TrainingRecipe
+from lerobot.processor import (
+    AddBatchDimensionProcessorStep,
+    DeviceProcessorStep,
+    NormalizerProcessorStep,
+    PolicyAction,
+    PolicyProcessorPipeline,
+    RenameObservationsProcessorStep,
+    RenderMessagesStep,
+    UnnormalizerProcessorStep,
+    policy_action_to_transition,
+    transition_to_policy_action,
+)
+from lerobot.utils.constants import POLICY_POSTPROCESSOR_DEFAULT_NAME, POLICY_PREPROCESSOR_DEFAULT_NAME
+
+from ..smolvla.processor_smolvla import make_smolvla_pre_post_processors
+from .chat_processor_smolvla2 import SmolVLA2ChatTokenizerStep
+from .configuration_smolvla2 import SmolVLA2Config
+
+
+def make_smolvla2_pre_post_processors(
+    config: SmolVLA2Config,
+    dataset_stats: dict[str, dict[str, torch.Tensor]] | None = None,
+) -> tuple[
+    PolicyProcessorPipeline[dict[str, Any], dict[str, Any]],
+    PolicyProcessorPipeline[PolicyAction, PolicyAction],
+]:
+    """Build SmolVLA2's pre/post-processor pipelines.
+
+    With ``recipe_path`` set, inserts the recipe-rendering step and the
+    chat-template tokenizer that emits ``text_labels`` and
+    ``predict_actions`` for the dual-loss path. Without it, falls back
+    to SmolVLA's plain task-string pipeline so unannotated datasets
+    keep working unchanged.
+    """
+    if not config.recipe_path:
+        return make_smolvla_pre_post_processors(config, dataset_stats=dataset_stats)
+
+    recipe = _load_recipe(config.recipe_path)
+
+    input_steps = [
+        RenameObservationsProcessorStep(rename_map={}),
+        AddBatchDimensionProcessorStep(),
+        RenderMessagesStep(recipe=recipe),
+        SmolVLA2ChatTokenizerStep(
+            tokenizer_name=config.vlm_model_name,
+            max_length=config.tokenizer_max_length,
+            padding=config.pad_language_to,
+        ),
+        DeviceProcessorStep(device=config.device),
+        NormalizerProcessorStep(
+            features={**config.input_features, **config.output_features},
+            norm_map=config.normalization_mapping,
+            stats=dataset_stats,
+        ),
+    ]
+    output_steps = [
+        UnnormalizerProcessorStep(
+            features=config.output_features,
+            norm_map=config.normalization_mapping,
+            stats=dataset_stats,
+        ),
+        DeviceProcessorStep(device="cpu"),
+    ]
+    return (
+        PolicyProcessorPipeline[dict[str, Any], dict[str, Any]](
+            steps=input_steps,
+            name=POLICY_PREPROCESSOR_DEFAULT_NAME,
+        ),
+        PolicyProcessorPipeline[PolicyAction, PolicyAction](
+            steps=output_steps,
+            name=POLICY_POSTPROCESSOR_DEFAULT_NAME,
+            to_transition=policy_action_to_transition,
+            to_output=transition_to_policy_action,
+        ),
+    )
+
+
+def _load_recipe(path_str: str) -> TrainingRecipe:
+    """Resolve ``path_str`` to a ``TrainingRecipe``.
+
+    Accepts an absolute path or a path relative to
+    ``src/lerobot/configs/`` so recipe authors can write
+    ``--policy.recipe_path=recipes/smolvla2_hirobot.yaml``.
+    """
+    p = Path(path_str)
+    if not p.is_absolute() and not p.exists():
+        from lerobot.configs import recipe as _recipe_module  # noqa: PLC0415
+
+        configs_dir = Path(_recipe_module.__file__).resolve().parent
+        candidate = configs_dir / path_str
+        if candidate.exists():
+            p = candidate
+    return TrainingRecipe.from_yaml(p)
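
Review note on ``_load_recipe``: the lookup order is "path as given, then relative to the configs directory". The sketch below reproduces that order against a temporary directory so it can be run without LeRobot installed. ``resolve_recipe_path`` is a hypothetical helper name, and the temporary directory stands in for ``src/lerobot/configs/``.

```python
import tempfile
from pathlib import Path


def resolve_recipe_path(path_str: str, configs_dir: Path) -> Path:
    """Return the path as given if it is usable, else try ``configs_dir``."""
    p = Path(path_str)
    if p.is_absolute() or p.exists():
        return p
    candidate = configs_dir / path_str
    # Keep the original path when the fallback is missing too, so the
    # loader reports an error against what the user actually typed.
    return candidate if candidate.exists() else p


with tempfile.TemporaryDirectory() as tmp:
    configs_dir = Path(tmp)
    (configs_dir / "recipes").mkdir()
    (configs_dir / "recipes" / "toy_demo_recipe.yaml").write_text("# placeholder\n")

    found = resolve_recipe_path("recipes/toy_demo_recipe.yaml", configs_dir)
    missing = resolve_recipe_path("recipes/toy_absent_recipe.yaml", configs_dir)

    found_ok = found == configs_dir / "recipes" / "toy_demo_recipe.yaml"
    missing_ok = missing == Path("recipes/toy_absent_recipe.yaml")
```

The sketch assumes neither toy file exists relative to the current working directory. As in the function above, a path that resolves nowhere is passed through unchanged, so any error surfaces from the YAML loader rather than from the resolver.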
@@ -93,6 +93,7 @@ from .relative_action_processor import (
    to_relative_actions,
 )
 from .rename_processor import RenameObservationsProcessorStep, rename_stats
+from .render_messages_processor import RenderMessagesStep
 from .tokenizer_processor import ActionTokenizerProcessorStep, TokenizerProcessorStep

 __all__ = [
@@ -128,6 +129,7 @@ __all__ = [
    "make_default_robot_observation_processor",
    "AbsoluteActionsProcessorStep",
    "RelativeActionsProcessorStep",
+    "RenderMessagesStep",
    "MapDeltaActionToRobotActionStep",
    "MapTensorToDeltaActionDictStep",
    "NewLineTaskProcessorStep",
@@ -174,6 +174,24 @@ class AddBatchDimensionComplementaryDataStep(ComplementaryDataProcessorStep):
            task_index_value = complementary_data["task_index"]
            if isinstance(task_index_value, Tensor) and task_index_value.dim() == 0:
                complementary_data["task_index"] = task_index_value.unsqueeze(0)
+
+        complementary_data.pop("language_persistent", None)
+        complementary_data.pop("language_events", None)
+
+        if "messages" in complementary_data:
+            messages = complementary_data["messages"]
+            if isinstance(messages, list) and (not messages or isinstance(messages[0], dict)):
+                complementary_data["messages"] = [messages]
+
+        if "message_streams" in complementary_data:
+            streams = complementary_data["message_streams"]
+            if isinstance(streams, list) and (not streams or isinstance(streams[0], str)):
+                complementary_data["message_streams"] = [streams]
+
+        if "target_message_indices" in complementary_data:
+            indices = complementary_data["target_message_indices"]
+            if isinstance(indices, list) and (not indices or isinstance(indices[0], int)):
+                complementary_data["target_message_indices"] = [indices]
        return complementary_data

    def transform_features(
@@ -167,12 +167,35 @@ def _extract_complementary_data(batch: dict[str, Any]) -> dict[str, Any]:
    """
    pad_keys = {k: v for k, v in batch.items() if "_is_pad" in k}
    task_key = {"task": batch["task"]} if "task" in batch else {}
-    subtask_key = {"subtask": batch["subtask"]} if "subtask" in batch else {}
    index_key = {"index": batch["index"]} if "index" in batch else {}
    task_index_key = {"task_index": batch["task_index"]} if "task_index" in batch else {}
    episode_index_key = {"episode_index": batch["episode_index"]} if "episode_index" in batch else {}
+    timestamp_key = {"timestamp": batch["timestamp"]} if "timestamp" in batch else {}
+    language_persistent_key = (
+        {"language_persistent": batch["language_persistent"]} if "language_persistent" in batch else {}
+    )
+    language_events_key = {"language_events": batch["language_events"]} if "language_events" in batch else {}
+    messages_key = {"messages": batch["messages"]} if "messages" in batch else {}
+    message_streams_key = {"message_streams": batch["message_streams"]} if "message_streams" in batch else {}
+    target_message_indices_key = (
+        {"target_message_indices": batch["target_message_indices"]}
+        if "target_message_indices" in batch
+        else {}
+    )

-    return {**pad_keys, **task_key, **subtask_key, **index_key, **task_index_key, **episode_index_key}
+    return {
+        **pad_keys,
+        **task_key,
+        **index_key,
+        **task_index_key,
+        **episode_index_key,
+        **timestamp_key,
+        **language_persistent_key,
+        **language_events_key,
+        **messages_key,
+        **message_streams_key,
+        **target_message_indices_key,
+    }


 def create_transition(
@@ -321,7 +321,6 @@ class GymHILAdapterProcessorStep(ProcessorStep):
    This step normalizes the `transition` object by:
    1. Copying `teleop_action` from `info` to `complementary_data`.
    2. Copying `is_intervention` from `info` (using the string key) to `info` (using the enum key).
-    3. Copying `discrete_penalty` from `info` to `complementary_data`.
    """

    def __call__(self, transition: EnvTransition) -> EnvTransition:
@@ -331,9 +330,6 @@ class GymHILAdapterProcessorStep(ProcessorStep):
        if TELEOP_ACTION_KEY in info:
            complementary_data[TELEOP_ACTION_KEY] = info[TELEOP_ACTION_KEY]

-        if DISCRETE_PENALTY_KEY in info:
-            complementary_data[DISCRETE_PENALTY_KEY] = info[DISCRETE_PENALTY_KEY]
-
        if "is_intervention" in info:
            info[TeleopEvents.IS_INTERVENTION] = info["is_intervention"]

@@ -352,24 +348,18 @@ class GymHILAdapterProcessorStep(ProcessorStep):
@ProcessorStepRegistry.register("gripper_penalty_processor")
 class GripperPenaltyProcessorStep(ProcessorStep):
    """
-    Applies a small per-transition cost on the discrete gripper action.
+    Applies a penalty for inefficient gripper usage.

-    Fires only when the commanded action would actually transition the gripper
-    from one extreme to the other (close-while-open or open-while-closed).
-    This discourages gripper oscillation while leaving "stay" and saturating-further
-    commands unpenalized.
+    This step penalizes actions that attempt to close an already closed gripper or
+    open an already open one, based on position thresholds.

    Attributes:
        penalty: The negative reward value to apply.
        max_gripper_pos: The maximum position value for the gripper, used for normalization.
-        open_threshold: Normalized state below which the gripper is considered "open".
-        closed_threshold: Normalized state above which the gripper is considered "closed".
    """

-    penalty: float = -0.02
+    penalty: float = -0.01
    max_gripper_pos: float = 30.0
-    open_threshold: float = 0.1
-    closed_threshold: float = 0.9

    def __call__(self, transition: EnvTransition) -> EnvTransition:
        """
@@ -401,13 +391,9 @@ class GripperPenaltyProcessorStep(ProcessorStep):
        gripper_state_normalized = current_gripper_pos / self.max_gripper_pos

        # Calculate penalty boolean as in original
-        #   - currently open  AND target is closed  -> close transition
-        #   - currently closed AND target is open   -> open transition
-        is_open = gripper_state_normalized < self.open_threshold
-        is_closed = gripper_state_normalized > self.closed_threshold
-        cmd_close = gripper_action_normalized > self.closed_threshold
-        cmd_open = gripper_action_normalized < self.open_threshold
-        gripper_penalty_bool = (is_open and cmd_close) or (is_closed and cmd_open)
+        gripper_penalty_bool = (gripper_state_normalized < 0.5 and gripper_action_normalized > 0.5) or (
+            gripper_state_normalized > 0.75 and gripper_action_normalized < 0.5
+        )

        gripper_penalty = self.penalty * int(gripper_penalty_bool)

@@ -423,14 +409,11 @@ class GripperPenaltyProcessorStep(ProcessorStep):
        Returns the configuration of the step for serialization.

        Returns:
-            A dictionary containing the penalty value, max gripper position,
-            and the open/closed thresholds.
+            A dictionary containing the penalty value and max gripper position.
        """
        return {
            "penalty": self.penalty,
            "max_gripper_pos": self.max_gripper_pos,
-            "open_threshold": self.open_threshold,
-            "closed_threshold": self.closed_threshold,
        }

    def reset(self) -> None:
@@ -134,15 +134,6 @@ class _NormalizationMixin:
        if self.dtype is None:
            self.dtype = torch.float32
        self._tensor_stats = to_tensor(self.stats, device=self.device, dtype=self.dtype)
-        self._reshape_visual_stats()
-
-    def _reshape_visual_stats(self) -> None:
-        """Reshape visual stats from ``[C]`` to ``[C, 1, 1]`` for image broadcasting."""
-        for key, feature in self.features.items():
-            if feature.type == FeatureType.VISUAL and key in self._tensor_stats:
-                for stat_name, stat_tensor in self._tensor_stats[key].items():
-                    if isinstance(stat_tensor, Tensor) and stat_tensor.ndim == 1:
-                        self._tensor_stats[key][stat_name] = stat_tensor.reshape(-1, 1, 1)

    def to(
        self, device: torch.device | str | None = None, dtype: torch.dtype | None = None
@@ -161,7 +152,6 @@ class _NormalizationMixin:
        if dtype is not None:
            self.dtype = dtype
        self._tensor_stats = to_tensor(self.stats, device=self.device, dtype=self.dtype)
-        self._reshape_visual_stats()
        return self

    def state_dict(self) -> dict[str, Tensor]:
@@ -211,7 +201,6 @@ class _NormalizationMixin:
            # Don't load from state_dict, keep the explicitly provided stats
            # But ensure _tensor_stats is properly initialized
            self._tensor_stats = to_tensor(self.stats, device=self.device, dtype=self.dtype)  # type: ignore[assignment]
-            self._reshape_visual_stats()
            return

        # Normal behavior: load stats from state_dict
@@ -222,7 +211,6 @@ class _NormalizationMixin:
            self._tensor_stats.setdefault(key, {})[stat_name] = tensor.to(
                dtype=torch.float32, device=self.device
            )
-        self._reshape_visual_stats()

        # Reconstruct the original stats dict from tensor stats for compatibility with to() method
        # and other functions that rely on self.stats
@@ -0,0 +1,92 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import annotations
+
+from dataclasses import dataclass
+from typing import Any
+
+from lerobot.configs import PipelineFeatureType, PolicyFeature
+from lerobot.configs.recipe import TrainingRecipe
+from lerobot.datasets.language import LANGUAGE_EVENTS, LANGUAGE_PERSISTENT
+from lerobot.datasets.language_render import render_sample
+from lerobot.types import EnvTransition, TransitionKey
+
+from .pipeline import ProcessorStep, ProcessorStepRegistry
+
+
+@dataclass
+@ProcessorStepRegistry.register(name="render_messages_processor")
+class RenderMessagesStep(ProcessorStep):
+    """Processor step that turns raw language columns into rendered chat messages.
+
+    Reads ``language_persistent`` and ``language_events`` from the transition's
+    complementary data, renders them through ``recipe`` at the sample timestamp,
+    and replaces the raw columns with the resulting ``messages`` /
+    ``message_streams`` / ``target_message_indices`` keys.
+    """
+
+    recipe: TrainingRecipe
+    dataset_ctx: Any | None = None
+
+    def __call__(self, transition: EnvTransition) -> EnvTransition | None:
+        """Render messages for a single transition; return ``None`` to drop it."""
+        complementary_data = transition.get(TransitionKey.COMPLEMENTARY_DATA) or {}
+        persistent = complementary_data.get(LANGUAGE_PERSISTENT) or []
+        events = complementary_data.get(LANGUAGE_EVENTS) or []
+
+        if not persistent and not events:
+            return transition
+
+        timestamp = complementary_data.get("timestamp")
+        if timestamp is None:
+            raise KeyError("RenderMessagesStep requires sample timestamp in complementary data.")
+
+        sample_idx = complementary_data.get("index", 0)
+        rendered = render_sample(
+            recipe=self.recipe,
+            persistent=persistent,
+            events=events,
+            t=_scalar(timestamp),
+            sample_idx=int(_scalar(sample_idx)),
+            task=complementary_data.get("task"),
+            dataset_ctx=self.dataset_ctx,
+        )
+        if rendered is None:
+            return None
+
+        new_transition = transition.copy()
+        new_complementary_data = dict(complementary_data)
+        new_complementary_data.pop(LANGUAGE_PERSISTENT, None)
+        new_complementary_data.pop(LANGUAGE_EVENTS, None)
+        new_complementary_data.update(rendered)
+        new_transition[TransitionKey.COMPLEMENTARY_DATA] = new_complementary_data
+        return new_transition
+
+    def transform_features(
+        self, features: dict[PipelineFeatureType, dict[str, PolicyFeature]]
+    ) -> dict[PipelineFeatureType, dict[str, PolicyFeature]]:
+        """Pass features through unchanged; rendering only touches complementary data."""
+        return features
+
+
+def _scalar(value: Any) -> float | int:
+    """Unwrap a tensor/array/single-element list into a Python scalar."""
+    if hasattr(value, "item"):
+        return value.item()
+    if isinstance(value, list) and len(value) == 1:
+        return _scalar(value[0])
+    return value
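The `_scalar` helper above can be exercised in isolation. The following is a minimal sketch that copies its logic and feeds it a few stand-in values. `FakeTensor` is a hypothetical stand-in for any object exposing `.item()` (for example a 0-d tensor or a NumPy scalar) and is not part of this PR.

```python
from typing import Any


def _scalar(value: Any) -> "float | int":
    """Same logic as the helper in the diff above."""
    if hasattr(value, "item"):
        return value.item()
    if isinstance(value, list) and len(value) == 1:
        return _scalar(value[0])
    return value


class FakeTensor:
    """Hypothetical stand-in for a 0-d tensor/array exposing ``.item()``."""

    def __init__(self, v: float) -> None:
        self._v = v

    def item(self) -> float:
        return self._v


timestamp = _scalar(FakeTensor(1.25))        # unwrapped via .item()
sample_idx = int(_scalar([FakeTensor(7)]))   # single-element list is unwrapped recursively
plain = _scalar(3)                           # plain Python scalars pass through
multi = _scalar([1, 2])                      # multi-element lists are returned unchanged
```

Note that a multi-element list is returned as-is, so a caller that then applies `int(...)` to it (as the `sample_idx` path does) would raise a `TypeError` for such input.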
@@ -60,7 +60,7 @@ from torch.multiprocessing import Event, Queue
 from lerobot.cameras import opencv  # noqa: F401
 from lerobot.configs import parser
 from lerobot.configs.train import TrainRLServerPipelineConfig
-from lerobot.policies import make_policy, make_pre_post_processors
+from lerobot.policies import make_policy
 from lerobot.policies.sac.modeling_sac import SACPolicy
 from lerobot.robots import so_follower  # noqa: F401
 from lerobot.teleoperators import gamepad, so_leader  # noqa: F401
@@ -89,9 +89,9 @@ from lerobot.utils.utils import (
 )

 from .gym_manipulator import (
+    create_transition,
    make_processors,
    make_robot_env,
-    reset_and_build_transition,
    step_env_and_process_transition,
 )
 from .process import ProcessSignalHandler
@@ -261,12 +261,13 @@ def act_with_policy(
    policy = policy.eval()
    assert isinstance(policy, nn.Module)

-    preprocessor, postprocessor = make_pre_post_processors(
-        policy_cfg=cfg.policy,
-        dataset_stats=cfg.policy.dataset_stats,
-    )
+    obs, info = online_env.reset()
+    env_processor.reset()
+    action_processor.reset()

-    transition = reset_and_build_transition(online_env, env_processor, action_processor)
+    # Process initial observation
+    transition = create_transition(observation=obs, info=info)
+    transition = env_processor(transition)

    # NOTE: For the moment we will solely handle the case of a single environment
    sum_reward_episode = 0
@@ -290,21 +291,8 @@ def act_with_policy(

        # Time policy inference and check if it meets FPS requirement
        with policy_timer:
-            normalized_observation = preprocessor.process_observation(observation)
-            action = policy.select_action(batch=normalized_observation)
-            # Unnormalize only the continuous part. When `num_discrete_actions` is set,
-            # `select_action` concatenates an argmax index in env space at the last dim;
-            # action stats cover the continuous dims only, so feeding the full vector to
-            # the unnormalizer would shape-mismatch and would also corrupt the discrete
-            # index by treating it as a normalized value.
-            if cfg.policy.num_discrete_actions is not None:
-                continuous_action = postprocessor.process_action(action[..., :-1])
-                discrete_action = action[..., -1:].to(
-                    device=continuous_action.device, dtype=continuous_action.dtype
-                )
-                action = torch.cat([continuous_action, discrete_action], dim=-1)
-            else:
-                action = postprocessor.process_action(action)
+            # Extract observation from transition for policy
+            action = policy.select_action(batch=observation)
        policy_fps = policy_timer.fps_last

        log_policy_frequency_issue(policy_fps=policy_fps, cfg=cfg, interaction_step=interaction_step)
@@ -338,8 +326,7 @@ def act_with_policy(

        # Check for intervention from transition info
        intervention_info = new_transition[TransitionKey.INFO]
-        is_intervention = bool(intervention_info.get(TeleopEvents.IS_INTERVENTION, False))
-        if is_intervention:
+        if intervention_info.get(TeleopEvents.IS_INTERVENTION, False):
            episode_intervention = True
            episode_intervention_steps += 1

@@ -347,10 +334,6 @@ def act_with_policy(
            "discrete_penalty": torch.tensor(
                [new_transition[TransitionKey.COMPLEMENTARY_DATA].get("discrete_penalty", 0.0)]
            ),
-            # Forward the intervention flag so the learner can route this transition
-            # into the offline replay buffer (see `process_transitions` in learner.py).
-            # Use the plain string key so the payload survives torch.load(weights_only=True).
-            TeleopEvents.IS_INTERVENTION.value: is_intervention,
        }
        # Create transition for learner (convert to old format)
        list_transition_to_send_to_learner.append(
@@ -407,7 +390,14 @@ def act_with_policy(
            episode_intervention_steps = 0
            episode_total_steps = 0

-            transition = reset_and_build_transition(online_env, env_processor, action_processor)
+            # Reset environment and processors
+            obs, info = online_env.reset()
+            env_processor.reset()
+            action_processor.reset()
+
+            # Process initial observation
+            transition = create_transition(observation=obs, info=info)
+            transition = env_processor(transition)

        if cfg.env.fps is not None:
            dt_time = time.perf_counter() - start_time
@@ -383,21 +383,10 @@ def make_processors(
            GymHILAdapterProcessorStep(),
            Numpy2TorchActionProcessorStep(),
            VanillaObservationProcessorStep(),
+            AddBatchDimensionProcessorStep(),
+            DeviceProcessorStep(device=device),
        ]

-        # Add time limit processor if reset config exists
-        if cfg.processor.reset is not None:
-            env_pipeline_steps.append(
-                TimeLimitProcessorStep(max_episode_steps=int(cfg.processor.reset.control_time_s * cfg.fps))
-            )
-
-        env_pipeline_steps.extend(
-            [
-                AddBatchDimensionProcessorStep(),
-                DeviceProcessorStep(device=device),
-            ]
-        )
-
        return DataProcessorPipeline(
            steps=env_pipeline_steps, to_transition=identity_transition, to_output=identity_transition
        ), DataProcessorPipeline(
@@ -562,19 +551,8 @@ def step_env_and_process_transition(
    terminated = terminated or processed_action_transition[TransitionKey.DONE]
    truncated = truncated or processed_action_transition[TransitionKey.TRUNCATED]
    complementary_data = processed_action_transition[TransitionKey.COMPLEMENTARY_DATA].copy()
-
-    if hasattr(env, "get_raw_joint_positions"):
-        raw_joint_positions = env.get_raw_joint_positions()
-        if raw_joint_positions is not None:
-            complementary_data["raw_joint_positions"] = raw_joint_positions
-
-    # Merge env and action-processor info: env wins for str keys, action-processor
-    # wins for `TeleopEvents` enum keys
-    action_info = processed_action_transition[TransitionKey.INFO]
    new_info = info.copy()
-    for key, value in action_info.items():
-        if isinstance(key, TeleopEvents):
-            new_info[key] = value
+    new_info.update(processed_action_transition[TransitionKey.INFO])

    new_transition = create_transition(
        observation=obs,
@@ -590,24 +568,6 @@ def step_env_and_process_transition(
    return new_transition


-def reset_and_build_transition(
-    env: gym.Env,
-    env_processor: DataProcessorPipeline[EnvTransition, EnvTransition],
-    action_processor: DataProcessorPipeline[EnvTransition, EnvTransition],
-) -> EnvTransition:
-    """Reset env + processors and return the first env-processed transition."""
-    obs, info = env.reset()
-    env_processor.reset()
-    action_processor.reset()
-    complementary_data: dict[str, Any] = {}
-    if hasattr(env, "get_raw_joint_positions"):
-        raw_joint_positions = env.get_raw_joint_positions()
-        if raw_joint_positions is not None:
-            complementary_data["raw_joint_positions"] = raw_joint_positions
-    transition = create_transition(observation=obs, info=info, complementary_data=complementary_data)
-    return env_processor(data=transition)
-
-
 def control_loop(
    env: gym.Env,
    env_processor: DataProcessorPipeline[EnvTransition, EnvTransition],
@@ -633,7 +593,17 @@ def control_loop(
    print("- When not intervening, robot will stay still")
    print("- Press Ctrl+C to exit")

-    transition = reset_and_build_transition(env, env_processor, action_processor)
+    # Reset environment and processors
+    obs, info = env.reset()
+    complementary_data = (
+        {"raw_joint_positions": info.pop("raw_joint_positions")} if "raw_joint_positions" in info else {}
+    )
+    env_processor.reset()
+    action_processor.reset()
+
+    # Process initial observation
+    transition = create_transition(observation=obs, info=info, complementary_data=complementary_data)
+    transition = env_processor(data=transition)

    # Determine if gripper is used
    use_gripper = cfg.env.processor.gripper.use_gripper if cfg.env.processor.gripper is not None else True
@@ -695,7 +665,7 @@ def control_loop(
        # Create a neutral action (no movement)
        neutral_action = torch.tensor([0.0, 0.0, 0.0], dtype=torch.float32)
        if use_gripper:
-            neutral_action = torch.cat([neutral_action, torch.tensor([1.0])])  # Gripper stay
+            neutral_action = torch.cat([neutral_action, torch.tensor([0.0])])  # Gripper stay

        # Use the new step function
        transition = step_env_and_process_transition(
@@ -753,7 +723,12 @@ def control_loop(
                    dataset.save_episode()

            # Reset for new episode
-            transition = reset_and_build_transition(env, env_processor, action_processor)
+            obs, info = env.reset()
+            env_processor.reset()
+            action_processor.reset()
+
+            transition = create_transition(observation=obs, info=info)
+            transition = env_processor(transition)

        # Maintain fps timing
        precise_sleep(max(dt - (time.perf_counter() - step_start_time), 0.0))
@@ -70,7 +70,7 @@ from lerobot.common.wandb_utils import WandBLogger
 from lerobot.configs import parser
 from lerobot.configs.train import TrainRLServerPipelineConfig
 from lerobot.datasets import LeRobotDataset, make_dataset
-from lerobot.policies import make_policy, make_pre_post_processors
+from lerobot.policies import make_policy
 from lerobot.policies.sac.modeling_sac import SACPolicy
 from lerobot.robots import so_follower  # noqa: F401
 from lerobot.teleoperators import gamepad, so_leader  # noqa: F401
@@ -317,11 +317,6 @@ def add_actor_information_and_train(

    policy.train()

-    preprocessor, _postprocessor = make_pre_post_processors(
-        policy_cfg=cfg.policy,
-        dataset_stats=cfg.policy.dataset_stats,
-    )
-
    push_actor_policy_to_queue(parameters_queue=parameters_queue, policy=policy)

    last_time_policy_pushed = time.time()
@@ -410,8 +405,8 @@ def add_actor_information_and_train(

            actions = batch[ACTION]
            rewards = batch["reward"]
-            observations = preprocessor.process_observation(batch["state"])
-            next_observations = preprocessor.process_observation(batch["next_state"])
+            observations = batch["state"]
+            next_observations = batch["next_state"]
            done = batch["done"]
            check_nan_in_transition(observations=observations, actions=actions, next_state=next_observations)

@@ -468,8 +463,8 @@ def add_actor_information_and_train(

        actions = batch[ACTION]
        rewards = batch["reward"]
-        observations = preprocessor.process_observation(batch["state"])
-        next_observations = preprocessor.process_observation(batch["next_state"])
+        observations = batch["state"]
+        next_observations = batch["next_state"]
        done = batch["done"]

        check_nan_in_transition(observations=observations, actions=actions, next_state=next_observations)
@@ -1168,7 +1163,7 @@ def process_transitions(

            # Add to offline buffer if it's an intervention
            if dataset_repo_id is not None and transition.get("complementary_info", {}).get(
-                TeleopEvents.IS_INTERVENTION.value
+                TeleopEvents.IS_INTERVENTION
            ):
                offline_replay_buffer.add(**transition)

@@ -353,8 +353,7 @@ class GripperVelocityToJoint(RobotActionProcessorStep):
        speed_factor: A scaling factor to convert the normalized velocity command to a position change.
        clip_min: The minimum allowed gripper joint position.
        clip_max: The maximum allowed gripper joint position.
-        discrete_gripper: If True, interpret the input as a discrete class index
-            {0 = close, 1 = stay, 2 = open}, matching `GamepadTeleop.GripperAction`.
+        discrete_gripper: If True, treat the input action as discrete (0: open, 1: close, 2: stay).
    """

    speed_factor: float = 20.0
@@ -378,10 +377,10 @@ class GripperVelocityToJoint(RobotActionProcessorStep):
            raise ValueError("Joints observation is required for computing robot kinematics")

        if self.discrete_gripper:
-            # Map discrete command {0=close, 1=stay, 2=open} -> signed velocity.
-            # Negation accounts for SO100 sign (joint position increases on close).
-            #   0 -> +clip_max (close), 1 -> 0 (stay), 2 -> -clip_max (open)
-            gripper_vel = -(gripper_vel - 1) * self.clip_max
+            # Discrete gripper actions take values in {0, 1, 2}.
+            # Shift them to {-1, 0, 1} and scale by clip_max, so that
+            # 0 -> -clip_max, 1 -> 0 (no motion), 2 -> +clip_max.
+            gripper_vel = (gripper_vel - 1) * self.clip_max

        # Compute desired gripper position
        delta = gripper_vel * float(self.speed_factor)
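As a quick check of the discrete-gripper arithmetic in the hunk above, here is a minimal sketch that reproduces only the `(gripper_vel - 1) * clip_max` expression. The function name `discrete_to_velocity` and the value of `CLIP_MAX` are assumptions made for illustration. Which physical direction a signed velocity corresponds to depends on the robot and is not assumed here.

```python
def discrete_to_velocity(command: int, clip_max: float) -> float:
    """Mirror of the expression in the diff: shift {0, 1, 2} to {-1, 0, 1}, then scale."""
    return (command - 1) * clip_max


CLIP_MAX = 30.0  # hypothetical value, for illustration only

# Evaluate the mapping for every valid discrete command.
velocities = {c: discrete_to_velocity(c, CLIP_MAX) for c in (0, 1, 2)}
```

Under this expression, command `1` is the only one that yields zero velocity.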
@@ -0,0 +1,150 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""``lerobot-annotate`` — populate ``language_persistent`` and
+``language_events`` columns on a LeRobot dataset.
+
+Annotations live directly in ``data/chunk-*/file-*.parquet``: there is no
+flavor namespace and no sidecar tree. Multiple revisions of the same dataset
+mean multiple dataset copies.
+
+Example:
+
+  uv run lerobot-annotate \\
+      --root=/path/to/dataset \\
+      --vlm.backend=transformers \\
+      --vlm.model_id=Qwen/Qwen2.5-VL-7B-Instruct
+"""
+
+import logging
+from pathlib import Path
+
+from lerobot.annotations.steerable_pipeline.config import AnnotationPipelineConfig
+from lerobot.annotations.steerable_pipeline.executor import Executor
+from lerobot.annotations.steerable_pipeline.frames import make_frame_provider
+from lerobot.annotations.steerable_pipeline.modules import (
+    GeneralVqaModule,
+    InterjectionsAndSpeechModule,
+    PlanSubtasksMemoryModule,
+)
+from lerobot.annotations.steerable_pipeline.validator import StagingValidator
+from lerobot.annotations.steerable_pipeline.vlm_client import make_vlm_client
+from lerobot.annotations.steerable_pipeline.writer import LanguageColumnsWriter
+from lerobot.configs import parser
+
+logger = logging.getLogger(__name__)
+
+
+def _resolve_root(cfg: AnnotationPipelineConfig) -> Path:
+    if cfg.root is not None:
+        return Path(cfg.root)
+    if cfg.repo_id is not None:
+        from huggingface_hub import snapshot_download  # noqa: PLC0415
+
+        return Path(snapshot_download(repo_id=cfg.repo_id, repo_type="dataset"))
+    raise ValueError("Either --root or --repo_id must be provided.")
+
+
+@parser.wrap()
+def annotate(cfg: AnnotationPipelineConfig) -> None:
+    """Run the steerable annotation pipeline against a dataset."""
+    logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
+    root = _resolve_root(cfg)
+    logger.info("annotate: root=%s", root)
+
+    vlm = make_vlm_client(cfg.vlm)
+    frame_provider = make_frame_provider(root, camera_key=cfg.vlm.camera_key)
+    # Surface the resolved cameras up front so silent Module-3-no-op
+    # regressions are obvious in job output rather than discovered post-hoc
+    # by counting parquet rows.
+    cam_keys = list(getattr(frame_provider, "camera_keys", []) or [])
+    logger.info(
+        "annotate: frame_provider default camera=%r, all cameras=%s",
+        getattr(frame_provider, "camera_key", None),
+        cam_keys,
+    )
+    if cfg.module_3.enabled and not cam_keys:
+        logger.warning(
+            "annotate: Module 3 (VQA) is enabled but no cameras were "
+            "resolved — Module 3 will produce zero VQA rows. Check "
+            "meta/info.json for observation.images.* features, or pass "
+            "--vlm.camera_key=<key> to seed the cameras list."
+        )
+    module_1 = PlanSubtasksMemoryModule(vlm=vlm, config=cfg.module_1, frame_provider=frame_provider)
+    module_2 = InterjectionsAndSpeechModule(
+        vlm=vlm, config=cfg.module_2, seed=cfg.seed, frame_provider=frame_provider
+    )
+    module_3 = GeneralVqaModule(vlm=vlm, config=cfg.module_3, seed=cfg.seed, frame_provider=frame_provider)
+    writer = LanguageColumnsWriter()
+    validator = StagingValidator(
+        dataset_camera_keys=tuple(cam_keys) or None,
+    )
+
+    executor = Executor(
+        config=cfg,
+        module_1=module_1,
+        module_2=module_2,
+        module_3=module_3,
+        writer=writer,
+        validator=validator,
+    )
+    summary = executor.run(root)
+    logger.info("annotate: wrote %d shard(s)", len(summary.written_paths))
+    for phase in summary.phases:
+        logger.info(
+            "annotate: phase=%s processed=%d skipped=%d",
+            phase.name,
+            phase.episodes_processed,
+            phase.episodes_skipped,
+        )
+    if summary.validation_report.warnings:
+        for w in summary.validation_report.warnings:
+            logger.warning(w)
+
+    if cfg.push_to_hub:
+        _push_to_hub(root, cfg)
+
+
+def _push_to_hub(root: Path, cfg: AnnotationPipelineConfig) -> None:
+    """Upload the annotated dataset directory to the Hugging Face Hub."""
+    from huggingface_hub import HfApi  # noqa: PLC0415
+
+    repo_id = cfg.push_to_hub
+    commit_message = cfg.push_commit_message or "Add steerable annotations (lerobot-annotate)"
+    api = HfApi()
+    print(f"[lerobot-annotate] creating/locating dataset repo {repo_id}...", flush=True)
+    api.create_repo(
+        repo_id=repo_id,
+        repo_type="dataset",
+        private=cfg.push_private,
+        exist_ok=True,
+    )
+    print(f"[lerobot-annotate] uploading {root} -> {repo_id}...", flush=True)
+    api.upload_folder(
+        folder_path=str(root),
+        repo_id=repo_id,
+        repo_type="dataset",
+        commit_message=commit_message,
+        ignore_patterns=[".annotate_staging/**", "**/.DS_Store"],
+    )
+    print(f"[lerobot-annotate] uploaded to https://huggingface.co/datasets/{repo_id}", flush=True)
+
+
+def main() -> None:
+    annotate()
+
+
+if __name__ == "__main__":
+    main()
@@ -150,24 +150,11 @@ Show dataset information without feature details:
        --operation.type info \
        --operation.show_features false

-Recompute dataset statistics (saves to lerobot/pusht_recomputed_stats by default):
+Recompute dataset statistics:
    lerobot-edit-dataset \
        --repo_id lerobot/pusht \
        --operation.type recompute_stats

-Recompute stats and save to a specific new repo_id:
-    lerobot-edit-dataset \
-        --repo_id lerobot/pusht \
-        --new_repo_id lerobot/pusht_new_stats \
-        --operation.type recompute_stats
-
-Recompute stats in-place (overwrites original dataset stats):
-    lerobot-edit-dataset \
-        --repo_id lerobot/pusht \
-        --new_repo_id lerobot/pusht \
-        --operation.type recompute_stats \
-        --operation.overwrite true
-
 Recompute stats for relative actions and push to hub:
    lerobot-edit-dataset \
        --repo_id lerobot/pusht \
@@ -269,7 +256,6 @@ class RecomputeStatsConfig(OperationConfig):
    relative_exclude_joints: list[str] | None = None
    chunk_size: int = 50
    num_workers: int = 0
-    overwrite: bool = False


@OperationConfig.register_subclass("info")
@@ -294,30 +280,16 @@ class EditDatasetConfig:
    push_to_hub: bool = False


-def _resolve_io_paths(
-    repo_id: str,
-    new_repo_id: str | None,
-    root: Path | str | None,
-    new_root: Path | str | None,
-    default_new_repo_id: str | None = None,
-) -> tuple[str, Path, Path]:
-    """Resolve input/output paths and repo_id for dataset operations.
-
-    Returns (output_repo_id, input_path, output_path) with resolved (symlink-safe) paths.
-    """
-    input_path = (Path(root) if root else HF_LEROBOT_HOME / repo_id).resolve()
-    output_repo_id = new_repo_id or default_new_repo_id or repo_id
-    output_path = (Path(new_root) if new_root else HF_LEROBOT_HOME / output_repo_id).resolve()
-    return output_repo_id, input_path, output_path
-
-
 def get_output_path(
    repo_id: str,
    new_repo_id: str | None,
    root: Path | str | None,
    new_root: Path | str | None,
 ) -> tuple[str, Path]:
-    output_repo_id, input_path, output_path = _resolve_io_paths(repo_id, new_repo_id, root, new_root)
+    input_path = Path(root) if root else HF_LEROBOT_HOME / repo_id
+
+    output_repo_id = new_repo_id if new_repo_id else repo_id
+    output_path = Path(new_root) if new_root else HF_LEROBOT_HOME / output_repo_id

    # In case of in-place modification, create a backup of the original dataset (if it exists)
    if output_path == input_path:
@@ -585,39 +557,7 @@ def handle_recompute_stats(cfg: EditDatasetConfig) -> None:
    if not isinstance(cfg.operation, RecomputeStatsConfig):
        raise ValueError("Operation config must be RecomputeStatsConfig")

-    # Determine whether this is an in-place operation
-    output_repo_id, input_root, output_root = _resolve_io_paths(
-        cfg.repo_id,
-        cfg.new_repo_id,
-        cfg.root,
-        cfg.new_root,
-        default_new_repo_id=f"{cfg.repo_id}_recomputed_stats",
-    )
-    in_place = output_root == input_root
-
-    if in_place and not cfg.operation.overwrite:
-        raise ValueError(
-            f"recompute_stats would overwrite the dataset in-place at {input_root}. "
-            "Pass --operation.overwrite true to allow in-place modification, "
-            "or use --new_repo_id / --new_root to write to a different location. "
-            f"Default output repo_id when neither is set: '{cfg.repo_id}_recomputed_stats'."
-        )
-
-    if in_place:
-        logging.warning(
-            f"Overwriting dataset stats in-place at {input_root}. The original stats will be lost."
-        )
-        dataset = LeRobotDataset(cfg.repo_id, root=input_root)
-    else:
-        logging.info(f"Copying dataset from {input_root} to {output_root}")
-        if output_root.exists():
-            backup_path = output_root.with_name(output_root.name + "_old")
-            logging.warning(f"Output directory {output_root} already exists. Moving to {backup_path}")
-            if backup_path.exists():
-                shutil.rmtree(backup_path)
-            shutil.move(output_root, backup_path)
-        shutil.copytree(input_root, output_root)
-        dataset = LeRobotDataset(output_repo_id, root=output_root)
+    dataset = LeRobotDataset(cfg.repo_id, root=cfg.root)

    logging.info(f"Recomputing stats for {cfg.repo_id}")
    if cfg.operation.relative_action:
@@ -638,7 +578,7 @@ def handle_recompute_stats(cfg: EditDatasetConfig) -> None:
    logging.info(f"Stats written to {dataset.root}")

    if cfg.push_to_hub:
-        logging.info(f"Pushing to hub as {dataset.repo_id}...")
+        logging.info(f"Pushing to hub as {dataset.meta.repo_id}...")
        dataset.push_to_hub()


@@ -47,6 +47,7 @@ from lerobot.datasets import EpisodeAwareSampler, make_dataset
 from lerobot.envs import close_envs, make_env, make_env_pre_post_processors
 from lerobot.optim.factory import make_optimizer_and_scheduler
 from lerobot.policies import PreTrainedPolicy, make_policy, make_pre_post_processors
+from lerobot.utils.collate import lerobot_collate_fn
 from lerobot.utils.import_utils import register_third_party_plugins
 from lerobot.utils.logging_utils import AverageMeter, MetricsTracker
 from lerobot.utils.random_utils import set_seed
@@ -386,6 +387,7 @@ def train(cfg: TrainPipelineConfig, accelerator: "Accelerator | None" = None):
        sampler=sampler,
        pin_memory=device.type == "cuda",
        drop_last=False,
+        collate_fn=lerobot_collate_fn,
        prefetch_factor=cfg.prefetch_factor if cfg.num_workers > 0 else None,
        persistent_workers=cfg.persistent_workers and cfg.num_workers > 0,
    )
@@ -0,0 +1,29 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""LeRobot tool implementations.
+
+Storage of the tool catalog (``meta/info.json["tools"]``) and the
+``SAY_TOOL_SCHEMA`` constant live in PR 1
+(``lerobot.datasets.language``). This package holds the *runnable*
+implementations, one file per tool, plus the registry that maps tool
+names to classes.
+
+See ``docs/source/tools.mdx`` for the authoring guide.
+"""
+
+from .base import Tool
+from .registry import TOOL_REGISTRY, get_tools
+from .say import SayTool
+
+__all__ = ["Tool", "TOOL_REGISTRY", "get_tools", "SayTool"]
@@ -0,0 +1,58 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Tool protocol — the contract every runnable tool implementation honors.
+
+Tools are the executable side of the OpenAI-style function-calling
+abstraction the v3.1 language schema (PR 1) carries on assistant
+messages: the schema describes *what can be called*, the tool
+implementation describes *how to call it*.
+
+Implementations live one-per-file under :mod:`lerobot.tools` (e.g.
+``say.py`` for ``SayTool``) and are registered in
+:mod:`lerobot.tools.registry`. The runtime instantiates them lazily so
+heavy dependencies (torch models, audio backends, network clients,
+hardware drivers) only load when the dataset actually declares the tool.
+"""
+
+from __future__ import annotations
+
+from typing import Any, Protocol, runtime_checkable
+
+
+@runtime_checkable
+class Tool(Protocol):
+    """Minimum surface every tool must expose."""
+
+    #: Name matching ``schema["function"]["name"]``. The runtime dispatcher
+    #: routes incoming ``tool_calls`` to the implementation by this key.
+    name: str
+
+    #: OpenAI-style function-call schema. Same dict the dataset stores in
+    #: ``meta/info.json["tools"]`` and the chat template renders into the
+    #: prompt.
+    schema: dict[str, Any]
+
+    def call(self, arguments: dict[str, Any]) -> Any:
+        """Execute the tool with the model-provided arguments.
+
+        ``arguments`` is the parsed dict from
+        ``tool_calls[i]["function"]["arguments"]`` (already decoded if the
+        model emitted a JSON string, per the chat-template convention).
+        Implementations validate the dict against their own
+        schema; the runtime only routes by name.
+
+        Return value is implementation-defined — typically a tensor
+        (TTS audio), a Path (saved file), a dict (structured result), or
+        ``None`` (side-effect-only call).
+        """
@@ -0,0 +1,70 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Tool registry — name → implementation class.
+
+Adding a new tool:
+
+1. Drop a file under ``src/lerobot/tools/`` that defines a class
+   conforming to :class:`lerobot.tools.base.Tool` (must expose ``name``,
+   ``schema``, ``call(arguments)``).
+2. Register the class here under :data:`TOOL_REGISTRY`.
+3. (Optional) Pre-populate ``meta/info.json["tools"]`` on your dataset
+   to advertise the schema to the chat-template + policy. The PR 2
+   annotation pipeline preserves anything you put there.
+
+See ``docs/source/tools.mdx`` for the full authoring guide.
+"""
+
+from __future__ import annotations
+
+from typing import Any
+
+from .base import Tool
+from .say import SayTool
+
+#: Map from ``function.name`` to a class implementing :class:`Tool`.
+#: The runtime instantiates entries lazily — registering a tool here is
+#: essentially free (no model load happens until ``call`` runs).
+TOOL_REGISTRY: dict[str, type] = {
+    "say": SayTool,
+}
+
+
+def get_tools(meta: Any, **kwargs: Any) -> dict[str, Tool]:
+    """Build name → tool-instance dict from a dataset's declared catalog.
+
+    ``meta`` is anything with a ``.tools`` attribute returning the
+    OpenAI-style schema list — typically a
+    :class:`lerobot.datasets.dataset_metadata.LeRobotDatasetMetadata`.
+    Each entry whose ``function.name`` is registered here is
+    instantiated with the schema dict; tools whose name is unknown to
+    the registry are skipped (the schema still rides through the chat
+    template; the model just can't actually invoke that tool at
+    inference).
+
+    Extra keyword arguments are forwarded to every constructor — useful
+    for runtime defaults like ``output_dir=Path("./tts_log")``.
+    """
+    declared = list(meta.tools)
+    instances: dict[str, Tool] = {}
+    for schema in declared:
+        try:
+            name = schema["function"]["name"]
+        except (KeyError, TypeError):
+            continue
+        cls = TOOL_REGISTRY.get(name)
+        if cls is None:
+            continue
+        instances[name] = cls(schema=schema, **kwargs)
+    return instances
@@ -0,0 +1,170 @@
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""``SayTool`` — text-to-speech tool wrapping Kyutai's pocket-tts.
+
+The first concrete tool implementation. SmolVLA2 (PR 3) and downstream
+runtime dispatchers consume this when the model emits an assistant
+message with ``tool_calls=[{function: {name: "say", arguments:
+{text: ...}}}]``.
+
+Why pocket-tts:
+
+- runs on CPU (no GPU dependency); ~6× real-time on a MacBook Air M4
+- ~100M parameters, ~200ms first-chunk latency
+- streamable, voice-cloneable
+- pip-installable, MIT-style permissive license
+
+The pocket-tts model is loaded **lazily** the first time ``call(...)``
+runs (or eagerly via ``preload()``). Loading takes a few seconds and
+several hundred MB of RAM, so we don't pay the cost when the tool is
+merely *registered* — only when it's *invoked*.
+
+Optional dependency. Install with::
+
+    pip install lerobot[tools]
+    # or directly:
+    pip install pocket-tts
+"""
+
+from __future__ import annotations
+
+import logging
+from dataclasses import dataclass, field
+from pathlib import Path
+from typing import Any
+
+from lerobot.datasets.language import SAY_TOOL_SCHEMA
+
+logger = logging.getLogger(__name__)
+
+
+@dataclass
+class SayTool:
+    """Speak a short utterance via Kyutai's pocket-tts.
+
+    Parameters
+    ----------
+    schema:
+        Optional schema override; defaults to the canonical
+        ``SAY_TOOL_SCHEMA`` from PR 1. Custom voices or extended
+        argument shapes can pass in a modified schema, but the
+        implementation only reads ``arguments["text"]``.
+    voice:
+        One of the pocket-tts catalog voices (``alba``, ``marius``,
+        ``javert``, ``jean``, ``fantine``, ``cosette``, ``eponine``,
+        ``azelma``) or a path to a ``.wav`` / ``.safetensors`` voice
+        file for cloning. See the pocket-tts model card for licensing.
+    output_dir:
+        If set, every ``call(...)`` writes a
+        ``say_<timestamp_ms>_<snippet>.wav`` file there in addition to
+        returning the PCM tensor. ``None`` (default) skips disk writes,
+        which suits live playback paths that hand the tensor directly
+        to a sounddevice / WebAudio sink.
+    """
+
+    schema: dict[str, Any] = field(default_factory=lambda: dict(SAY_TOOL_SCHEMA))
+    voice: str = "alba"
+    output_dir: Path | None = None
+
+    name: str = field(init=False, default="say")
+    _model: Any = field(init=False, default=None, repr=False)
+    _voice_state: Any = field(init=False, default=None, repr=False)
+    _sample_rate: int = field(init=False, default=24000, repr=False)
+
+    # ------------------------------------------------------------------
+    # Lazy model load
+    # ------------------------------------------------------------------
+
+    def preload(self) -> None:
+        """Load the pocket-tts model + voice state into memory.
+
+        Optional — ``call(...)`` triggers this automatically on first
+        invocation. Useful when you want the multi-second load to
+        happen at startup rather than on the first ``say`` the policy
+        emits.
+        """
+        if self._model is not None and self._voice_state is not None:
+            return
+        try:
+            from pocket_tts import TTSModel  # noqa: PLC0415  (optional dep)
+        except ImportError as exc:  # pragma: no cover (env-dependent)
+            raise ImportError(
+                "SayTool requires pocket-tts. Install with `pip install "
+                "lerobot[tools]` or `pip install pocket-tts`."
+            ) from exc
+        logger.info("SayTool: loading pocket-tts model + voice=%r", self.voice)
+        self._model = TTSModel.load_model()
+        self._voice_state = self._model.get_state_for_audio_prompt(self.voice)
+        self._sample_rate = int(getattr(self._model, "sample_rate", 24000))
+
+    # ------------------------------------------------------------------
+    # Tool protocol
+    # ------------------------------------------------------------------
+
+    def call(self, arguments: dict[str, Any]) -> Any:
+        """Speak ``arguments["text"]`` and return the PCM tensor.
+
+        Also writes ``<output_dir>/say_<timestamp_ms>_<snippet>.wav``
+        when ``self.output_dir`` is set. The returned tensor is a 1-D
+        ``torch.Tensor`` of float32 PCM samples at
+        ``self.sample_rate`` Hz — directly playable by
+        ``sounddevice.play(audio.numpy(), self.sample_rate)`` or
+        encodable by ``scipy.io.wavfile.write``.
+        """
+        text = arguments.get("text")
+        if not isinstance(text, str) or not text.strip():
+            raise ValueError(
+                f"SayTool.call expects arguments={{'text': str}}, got {arguments!r}"
+            )
+        self.preload()
+
+        audio = self._model.generate_audio(self._voice_state, text)
+
+        if self.output_dir is not None:
+            self._write_wav(audio, text)
+
+        return audio
+
+    @property
+    def sample_rate(self) -> int:
+        """PCM sample rate of the returned tensor (Hz)."""
+        return self._sample_rate
+
+    # ------------------------------------------------------------------
+    # Helpers
+    # ------------------------------------------------------------------
+
+    def _write_wav(self, audio: Any, text: str) -> Path:
+        """Write a ``.wav`` file into ``output_dir`` for offline inspection."""
+        import time as _time  # noqa: PLC0415
+
+        try:
+            import scipy.io.wavfile  # noqa: PLC0415
+        except ImportError as exc:  # pragma: no cover
+            raise ImportError(
+                "SayTool.output_dir requires scipy. `pip install scipy`."
+            ) from exc
+
+        out_dir = Path(self.output_dir)
+        out_dir.mkdir(parents=True, exist_ok=True)
+        # One file per call; the name carries a millisecond timestamp
+        # and a short text snippet so a directory listing is informative.
+        snippet = "".join(c if c.isalnum() else "_" for c in text[:32]).strip("_")
+        ts_ms = int(_time.time() * 1000)
+        path = out_dir / f"say_{ts_ms}_{snippet}.wav"
+
+        # ``audio`` is a torch tensor; pocket-tts runs on CPU, but
+        # ``detach().cpu()`` keeps the conversion safe either way.
+        scipy.io.wavfile.write(path, self.sample_rate, audio.detach().cpu().numpy())
+        return path
@@ -0,0 +1,54 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import annotations
+
+from typing import Any
+
+from torch.utils.data._utils.collate import default_collate
+
+from lerobot.datasets.language import LANGUAGE_COLUMNS
+
+_PYTHON_LIST_KEYS = {"messages", "message_streams", "target_message_indices"}
+
+
+def lerobot_collate_fn(batch: list[dict[str, Any] | None]) -> dict[str, Any] | None:
+    """Collate function that keeps rendered-message fields as Python lists.
+
+    Drops ``None`` samples (e.g. recipes that yielded no target message), keeps
+    rendered-message keys as plain lists, leaves the raw language columns out
+    of the batch, and delegates every other key to ``default_collate``.
+    """
+    batch = [sample for sample in batch if sample is not None]
+    if not batch:
+        return None
+
+    preserved = {
+        key: [sample.get(key) for sample in batch]  # stays aligned with the batch
+        for key in _PYTHON_LIST_KEYS
+        if any(key in sample for sample in batch)
+    }
+    tensorizable = [
+        {
+            key: value
+            for key, value in sample.items()
+            if key not in _PYTHON_LIST_KEYS and key not in LANGUAGE_COLUMNS
+        }
+        for sample in batch
+    ]
+    collated = default_collate(tensorizable)
+    collated.update(preserved)
+    return collated
@@ -115,9 +115,7 @@ _feetech_sdk_available = is_package_available("feetech-servo-sdk", import_name="
 _reachy2_sdk_available = is_package_available("reachy2_sdk")
 _can_available = is_package_available("python-can", "can")
 _unitree_sdk_available = is_package_available("unitree-sdk2py", "unitree_sdk2py")
-_pyrealsense2_available = is_package_available("pyrealsense2") or is_package_available(
-    "pyrealsense2-macosx", import_name="pyrealsense2"
-)
+_pyrealsense2_available = is_package_available("pyrealsense2")
 _zmq_available = is_package_available("pyzmq", import_name="zmq")
 _hebi_available = is_package_available("hebi-py", import_name="hebi")
 _teleop_available = is_package_available("teleop")
@@ -0,0 +1,58 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Helpers shared across annotation-pipeline tests."""
+
+from __future__ import annotations
+
+import json
+from typing import Any
+
+from lerobot.annotations.steerable_pipeline.vlm_client import StubVlmClient
+
+
+def make_canned_responder(
+    responses_by_marker: dict[str, Any],
+    default: Any = None,
+) -> StubVlmClient:
+    """Return a stub that picks a response by inspecting the user prompt.
+
+    For each call the responder examines the last user-message text and
+    returns the response keyed by the first marker substring it contains.
+    Falls back to ``default`` if no marker matches.
+    """
+
+    def responder(messages: list[dict[str, Any]]) -> Any:
+        last_user_text = ""
+        for message in messages:
+            if message.get("role") != "user":
+                continue
+            content = message.get("content")
+            if isinstance(content, str):
+                last_user_text = content
+            elif isinstance(content, list):
+                for block in content:
+                    if isinstance(block, dict) and block.get("type") == "text":
+                        last_user_text = block.get("text", "")
+        for marker, response in responses_by_marker.items():
+            if marker in last_user_text:
+                return response
+        return default
+
+    return StubVlmClient(responder=responder)
+
+
+def encode_vqa_answer(payload: dict[str, Any]) -> str:
+    return json.dumps(payload, sort_keys=True)
@@ -0,0 +1,112 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Shared fixtures for annotation-pipeline tests.
+
+Builds a minimal LeRobot-shaped dataset on disk so writer/validator tests
+can exercise real parquet reads and writes without needing a checked-in
+LFS dataset.
+"""
+
+from __future__ import annotations
+
+import json
+from pathlib import Path
+
+import pyarrow as pa
+import pyarrow.parquet as pq
+import pytest
+
+
+def _make_episode_table(
+    episode_index: int,
+    num_frames: int,
+    *,
+    fps: int = 10,
+    task_index: int = 0,
+) -> pa.Table:
+    timestamps = [round(i / fps, 6) for i in range(num_frames)]
+    frame_indices = list(range(num_frames))
+    return pa.Table.from_pydict(
+        {
+            "episode_index": [episode_index] * num_frames,
+            "frame_index": frame_indices,
+            "timestamp": timestamps,
+            "task_index": [task_index] * num_frames,
+            "subtask_index": [0] * num_frames,  # legacy column the writer must drop
+        }
+    )
+
+
+def _build_dataset(root: Path, episode_specs: list[tuple[int, int, str]], *, fps: int = 10) -> Path:
+    """Create a fixture dataset under ``root``.
+
+    ``episode_specs`` is a list of ``(episode_index, num_frames, task_text)``.
+    Each episode goes into its own ``data/chunk-000/file-{ep:03d}.parquet``
+    so the writer's per-shard rewrite path is exercised.
+    """
+    data_dir = root / "data" / "chunk-000"
+    data_dir.mkdir(parents=True, exist_ok=True)
+    tasks = {}
+    for episode_index, num_frames, task_text in episode_specs:
+        task_index = len(tasks)
+        if task_text not in tasks.values():
+            tasks[task_index] = task_text
+        else:
+            task_index = next(k for k, v in tasks.items() if v == task_text)
+        table = _make_episode_table(episode_index, num_frames, fps=fps, task_index=task_index)
+        path = data_dir / f"file-{episode_index:03d}.parquet"
+        pq.write_table(table, path)
+
+    meta_dir = root / "meta"
+    meta_dir.mkdir(parents=True, exist_ok=True)
+    tasks_table = pa.Table.from_pydict(
+        {
+            "task_index": list(tasks.keys()),
+            "task": list(tasks.values()),
+        }
+    )
+    pq.write_table(tasks_table, meta_dir / "tasks.parquet")
+
+    info = {
+        "codebase_version": "v3.1",
+        "fps": fps,
+        "total_episodes": len(episode_specs),
+    }
+    (meta_dir / "info.json").write_text(json.dumps(info, indent=2))
+
+    return root
+
+
+@pytest.fixture
+def fixture_dataset_root(tmp_path: Path) -> Path:
+    """A tiny dataset with two episodes, 12 frames each at 10 fps."""
+    return _build_dataset(
+        tmp_path / "ds",
+        episode_specs=[
+            (0, 12, "Could you tidy the kitchen please?"),
+            (1, 12, "Please clean up the kitchen"),
+        ],
+        fps=10,
+    )
+
+
+@pytest.fixture
+def single_episode_root(tmp_path: Path) -> Path:
+    return _build_dataset(
+        tmp_path / "ds_one",
+        episode_specs=[(0, 30, "Pour water from the bottle into the cup.")],
+        fps=10,
+    )
@@ -0,0 +1,124 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Opt-in E2E smoke run for ``make annotation-e2e``.
+
+Builds the same fixture used by the pytest suite, runs the full
+annotation pipeline against it with a stub VLM, and prints a short report.
+This is intentionally not a pytest test — it exercises the CLI plumbing
+without depending on conftest.py fixtures.
+"""
+
+from __future__ import annotations
+
+import json
+import sys
+import tempfile
+from pathlib import Path
+
+import pyarrow as pa
+import pyarrow.parquet as pq
+
+from lerobot.annotations.steerable_pipeline.config import AnnotationPipelineConfig
+from lerobot.annotations.steerable_pipeline.executor import Executor
+from lerobot.annotations.steerable_pipeline.modules import (
+    GeneralVqaModule,
+    InterjectionsAndSpeechModule,
+    PlanSubtasksMemoryModule,
+)
+from lerobot.annotations.steerable_pipeline.validator import StagingValidator
+from lerobot.annotations.steerable_pipeline.vlm_client import StubVlmClient
+from lerobot.annotations.steerable_pipeline.writer import LanguageColumnsWriter
+
+
+def _build_dataset(root: Path) -> Path:
+    data_dir = root / "data" / "chunk-000"
+    data_dir.mkdir(parents=True, exist_ok=True)
+    n = 30
+    timestamps = [round(i / 10, 6) for i in range(n)]
+    table = pa.Table.from_pydict(
+        {
+            "episode_index": [0] * n,
+            "frame_index": list(range(n)),
+            "timestamp": timestamps,
+            "task_index": [0] * n,
+            "subtask_index": [0] * n,
+        }
+    )
+    pq.write_table(table, data_dir / "file-000.parquet")
+    meta = root / "meta"
+    meta.mkdir(parents=True, exist_ok=True)
+    pq.write_table(
+        pa.Table.from_pydict({"task_index": [0], "task": ["Pour water into the cup."]}),
+        meta / "tasks.parquet",
+    )
+    (meta / "info.json").write_text(json.dumps({"codebase_version": "v3.1", "fps": 10}))
+    return root
+
+
+def _stub_responder(messages):
+    text = ""
+    for m in messages:
+        if m.get("role") == "user":
+            content = m.get("content")
+            if isinstance(content, list):
+                for block in content:
+                    if isinstance(block, dict) and block.get("type") == "text":
+                        text = block.get("text", "")
+            elif isinstance(content, str):
+                text = content
+    if "atomic subtasks" in text:
+        return {
+            "subtasks": [
+                {"text": "grasp the bottle", "start": 0.0, "end": 1.0},
+                {"text": "pour into the cup", "start": 1.0, "end": 2.0},
+                {"text": "place the bottle down", "start": 2.0, "end": 3.0},
+            ]
+        }
+    if "concise hierarchical PLAN" in text:
+        return {"plan": "1. grasp\n2. pour\n3. place"}
+    if "Update the memory" in text:
+        return {"memory": "poured once"}
+    if "acknowledgement the robot" in text:
+        return {"text": "Sure."}
+    if "ONE realistic interruption" in text:
+        return {"interjection": "use less water", "speech": "Using less water."}
+    if "frame-grounded visual question" in text:
+        return {"question": "How many cups?", "answer": {"label": "cup", "count": 1}}
+    return None
+
+
+def main() -> int:
+    with tempfile.TemporaryDirectory() as tmp:
+        root = _build_dataset(Path(tmp) / "ds")
+        vlm = StubVlmClient(responder=_stub_responder)
+        cfg = AnnotationPipelineConfig()
+        executor = Executor(
+            config=cfg,
+            module_1=PlanSubtasksMemoryModule(vlm=vlm, config=cfg.module_1),
+            module_2=InterjectionsAndSpeechModule(vlm=vlm, config=cfg.module_2, seed=cfg.seed),
+            module_3=GeneralVqaModule(vlm=vlm, config=cfg.module_3, seed=cfg.seed),
+            writer=LanguageColumnsWriter(),
+            validator=StagingValidator(),
+        )
+        summary = executor.run(root)
+        print(f"phases={[(p.name, p.episodes_processed) for p in summary.phases]}")
+        print(f"validation: {summary.validation_report.summary()}")
+        print(f"shards rewritten: {len(summary.written_paths)}")
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
@@ -0,0 +1,304 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Module 1/2/3 unit tests with stubbed VLMs."""
+
+from __future__ import annotations
+
+import json
+from dataclasses import dataclass, field
+from pathlib import Path
+from typing import Any
+
+from lerobot.annotations.steerable_pipeline.config import (
+    Module1Config,
+    Module2Config,
+    Module3Config,
+)
+from lerobot.annotations.steerable_pipeline.modules import (
+    GeneralVqaModule,
+    InterjectionsAndSpeechModule,
+    PlanSubtasksMemoryModule,
+)
+from lerobot.annotations.steerable_pipeline.reader import iter_episodes
+from lerobot.annotations.steerable_pipeline.staging import EpisodeStaging
+from lerobot.annotations.steerable_pipeline.vlm_client import StubVlmClient
+
+from ._helpers import make_canned_responder
+
+
+@dataclass
+class _StubFrameProvider:
+    """Returns one sentinel object per requested timestamp."""
+
+    sentinel: Any = field(default_factory=lambda: object())
+    cameras: tuple[str, ...] = ("observation.images.top",)
+    calls: list[tuple[int, tuple[float, ...], str | None]] = field(default_factory=list)
+    video_calls: list[tuple[int, int, str | None]] = field(default_factory=list)
+
+    @property
+    def camera_keys(self) -> list[str]:
+        return list(self.cameras)
+
+    def frames_at(self, record, timestamps, camera_key=None):
+        self.calls.append((record.episode_index, tuple(timestamps), camera_key))
+        return [self.sentinel] * len(timestamps)
+
+    def video_for_episode(self, record, max_frames, camera_key=None):
+        self.video_calls.append((record.episode_index, max_frames, camera_key))
+        n = min(max_frames, len(record.frame_timestamps))
+        return [self.sentinel] * n
+
+
+def _spy_responder(captured: list[list[dict[str, Any]]], reply: Any):
+    def responder(messages):
+        captured.append(list(messages))
+        return reply
+
+    return StubVlmClient(responder=responder)
+
+
+def test_module1_plan_memory_subtask_smoke(fixture_dataset_root: Path, tmp_path: Path) -> None:
+    vlm = make_canned_responder(
+        {
+            "atomic subtasks": {
+                "subtasks": [
+                    {"text": "grasp the handle of the sponge", "start": 0.0, "end": 0.4},
+                    {"text": "wipe the counter from left to right", "start": 0.4, "end": 0.8},
+                    {"text": "place the sponge into the sink", "start": 0.8, "end": 1.1},
+                ]
+            },
+            "concise hierarchical PLAN": {"plan": "1. grasp\n2. wipe\n3. place"},
+            "Update the memory": {"memory": "wiped the counter once"},
+        },
+    )
+    module = PlanSubtasksMemoryModule(vlm=vlm, config=Module1Config())
+    record = next(iter_episodes(fixture_dataset_root))
+    staging = EpisodeStaging(tmp_path / "stage", record.episode_index)
+    module.run_episode(record, staging)
+    rows = staging.read("module_1")
+
+    styles = {r["style"] for r in rows}
+    assert {"subtask", "plan", "memory"}.issubset(styles)
+    # every row's timestamp must be an exact frame timestamp
+    frame_set = set(record.frame_timestamps)
+    for row in rows:
+        assert row["timestamp"] in frame_set
+    # exactly one plan row at t0
+    plan_rows = [r for r in rows if r["style"] == "plan"]
+    assert len(plan_rows) == 1
+    assert plan_rows[0]["timestamp"] == record.frame_timestamps[0]
+
+
+def test_module2_at_t0_emits_speech_only_no_interjection(fixture_dataset_root: Path, tmp_path: Path) -> None:
+    vlm = make_canned_responder(
+        {"acknowledgement the robot": {"text": "Sure, on it."}},
+    )
+    module = InterjectionsAndSpeechModule(
+        vlm=vlm,
+        config=Module2Config(max_interjections_per_episode=0),
+    )
+    record = next(iter_episodes(fixture_dataset_root))
+    staging = EpisodeStaging(tmp_path / "stage", record.episode_index)
+    module.run_episode(record, staging)
+    rows = staging.read("module_2")
+    assert len(rows) == 1
+    only = rows[0]
+    assert only["role"] == "assistant"
+    assert only["style"] is None
+    assert only["content"] is None
+    assert only["timestamp"] == record.frame_timestamps[0]
+    assert only["tool_calls"][0]["function"]["name"] == "say"
+
+
+def test_module2_mid_episode_emits_paired_interjection_and_speech(
+    fixture_dataset_root: Path, tmp_path: Path
+) -> None:
+    vlm = make_canned_responder(
+        {
+            "acknowledgement the robot": {"text": "OK."},
+            "ONE realistic interruption": {
+                "interjection": "actually skip the dishes",
+                "speech": "Skipping the dishes.",
+            },
+        },
+    )
+    module = InterjectionsAndSpeechModule(
+        vlm=vlm,
+        config=Module2Config(max_interjections_per_episode=1, interjection_min_t=0.2),
+        seed=7,
+    )
+    record = next(iter_episodes(fixture_dataset_root))
+    staging = EpisodeStaging(tmp_path / "stage", record.episode_index)
+    module.run_episode(record, staging)
+    rows = staging.read("module_2")
+
+    interjections = [r for r in rows if r["style"] == "interjection"]
+    speeches = [r for r in rows if r["style"] is None and r["role"] == "assistant"]
+    assert len(interjections) == 1
+    assert len(speeches) >= 2  # initial t=0 + one paired with the interjection
+    inter_t = interjections[0]["timestamp"]
+    assert any(abs(s["timestamp"] - inter_t) < 1e-9 for s in speeches)
+
+
+def test_module3_vqa_unique_per_frame_and_camera(single_episode_root: Path, tmp_path: Path) -> None:
+    payload = {
+        "question": "How many cups?",
+        "answer": {"label": "cup", "count": 2, "note": "white & blue"},
+    }
+    vlm = make_canned_responder({"frame-grounded visual question": payload})
+    module = GeneralVqaModule(
+        vlm=vlm,
+        config=Module3Config(vqa_emission_hz=1.0, K=3),
+        seed=1,
+        frame_provider=_StubFrameProvider(
+            cameras=("observation.images.top", "observation.images.wrist")
+        ),
+    )
+    record = next(iter_episodes(single_episode_root))
+    staging = EpisodeStaging(tmp_path / "stage", record.episode_index)
+    module.run_episode(record, staging)
+    rows = staging.read("module_3")
+    # every vqa row must carry a camera tag naming one of the configured cameras
+    for r in rows:
+        assert r["style"] == "vqa"
+        assert r.get("camera") in {"observation.images.top", "observation.images.wrist"}
+    # at most one (vqa, user) and one (vqa, assistant) per (timestamp, camera)
+    user_keys = [
+        (r["timestamp"], r["camera"]) for r in rows if r["role"] == "user" and r["style"] == "vqa"
+    ]
+    assistant_keys = [
+        (r["timestamp"], r["camera"])
+        for r in rows
+        if r["role"] == "assistant" and r["style"] == "vqa"
+    ]
+    assert len(user_keys) == len(set(user_keys))
+    assert len(assistant_keys) == len(set(assistant_keys))
+    # both cameras must be represented
+    assert {c for _, c in user_keys} == {"observation.images.top", "observation.images.wrist"}
+    # every emitted timestamp must be an exact source frame timestamp
+    frame_set = set(record.frame_timestamps)
+    for ts, _ in user_keys + assistant_keys:
+        assert ts in frame_set
+
+
+def test_module1_attaches_video_block_to_subtask_prompt(fixture_dataset_root: Path, tmp_path: Path) -> None:
+    """Module 1 sends one ``type=video`` block covering the whole episode."""
+    captured: list[list[dict[str, Any]]] = []
+    payload = {
+        "subtasks": [
+            {"text": "grasp the handle of the sponge", "start": 0.0, "end": 0.5},
+            {"text": "wipe the counter", "start": 0.5, "end": 1.1},
+        ]
+    }
+    plan_payload = {"plan": "1. grasp\n2. wipe"}
+    memory_payload = {"memory": "wiped once"}
+
+    def responder(messages):
+        captured.append(list(messages))
+        text = ""
+        for m in messages:
+            for block in m.get("content", []):
+                if isinstance(block, dict) and block.get("type") == "text":
+                    text = block.get("text", "")
+        if "concise hierarchical PLAN" in text:
+            return plan_payload
+        if "Update the memory" in text:
+            return memory_payload
+        return payload
+
+    provider = _StubFrameProvider()
+    module = PlanSubtasksMemoryModule(
+        vlm=StubVlmClient(responder=responder),
+        config=Module1Config(max_video_frames=5, frames_per_second=10.0),
+        frame_provider=provider,
+    )
+    record = next(iter_episodes(fixture_dataset_root))
+    staging = EpisodeStaging(tmp_path / "stage", record.episode_index)
+    module.run_episode(record, staging)
+
+    # the subtask call (the first VLM call) must carry exactly one video block
+    assert captured, "no VLM calls made"
+    first_call = captured[0]
+    content = first_call[0]["content"]
+    video_blocks = [b for b in content if isinstance(b, dict) and b.get("type") == "video"]
+    image_blocks = [b for b in content if isinstance(b, dict) and b.get("type") == "image"]
+    text_blocks = [b for b in content if isinstance(b, dict) and b.get("type") == "text"]
+    assert len(video_blocks) == 1, f"expected exactly 1 video block, got {content}"
+    assert image_blocks == [], "subtask prompt must not mix image blocks with the video block"
+    assert len(text_blocks) == 1
+    # video block must wrap a list of frames covering the episode
+    assert isinstance(video_blocks[0]["video"], list)
+    assert len(video_blocks[0]["video"]) <= 5
+    # The provider is called with target_count = min(duration * fps, max_video_frames).
+    # At fps=10 a ~1 s episode requests more than the cap, so max_video_frames=5 wins.
+    assert provider.video_calls and provider.video_calls[0][0] == record.episode_index
+    assert provider.video_calls[0][1] <= 5
+
+
+def test_module3_attaches_frame_image_block_to_prompt(single_episode_root: Path, tmp_path: Path) -> None:
+    """Each VQA prompt must carry a single image block at the emission frame."""
+    captured: list[list[dict[str, Any]]] = []
+    payload = {
+        "question": "How many cups?",
+        "answer": {"label": "cup", "count": 1},
+    }
+    provider = _StubFrameProvider()
+    module = GeneralVqaModule(
+        vlm=_spy_responder(captured, payload),
+        config=Module3Config(vqa_emission_hz=1.0, K=1),
+        seed=0,
+        frame_provider=provider,
+    )
+    record = next(iter_episodes(single_episode_root))
+    staging = EpisodeStaging(tmp_path / "stage", record.episode_index)
+    module.run_episode(record, staging)
+
+    assert captured, "no VLM calls made"
+    for messages in captured:
+        content = messages[0]["content"]
+        image_blocks = [b for b in content if isinstance(b, dict) and b.get("type") == "image"]
+        text_blocks = [b for b in content if isinstance(b, dict) and b.get("type") == "text"]
+        assert len(image_blocks) == 1, f"expected 1 image block per VQA prompt, got {content}"
+        assert image_blocks[0]["image"] is provider.sentinel
+        assert len(text_blocks) == 1
+    # provider was called once per emission per camera with the exact emission timestamp
+    for ep_idx, ts_tuple, camera in provider.calls:
+        assert ep_idx == record.episode_index
+        assert len(ts_tuple) == 1
+        assert ts_tuple[0] in record.frame_timestamps
+        assert camera in provider.cameras
+
+
+def test_module3_assistant_content_is_valid_json(single_episode_root: Path, tmp_path: Path) -> None:
+    payload = {
+        "question": "Where is the cup?",
+        "answer": {"detections": [{"label": "cup", "bbox_format": "xyxy", "bbox": [10, 20, 50, 80]}]},
+    }
+    vlm = make_canned_responder({"frame-grounded visual question": payload})
+    module = GeneralVqaModule(
+        vlm=vlm,
+        config=Module3Config(vqa_emission_hz=1.0, K=2),
+        seed=2,
+        frame_provider=_StubFrameProvider(),
+    )
+    record = next(iter_episodes(single_episode_root))
+    staging = EpisodeStaging(tmp_path / "stage", record.episode_index)
+    module.run_episode(record, staging)
+    rows = staging.read("module_3")
+    for row in rows:
+        if row["role"] == "assistant" and row["style"] == "vqa":
+            decoded = json.loads(row["content"])
+            assert "detections" in decoded
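The frame-budget rule that the Module 1 test above exercises (request `duration * fps` frames, capped at `max_video_frames`) can be sketched as a small pure function. `target_frame_count` is a hypothetical name, and both the rounding and the floor of one frame are assumptions; the real module may compute the budget differently.

```python
def target_frame_count(duration_s: float, frames_per_second: float, max_video_frames: int) -> int:
    """Return how many frames to request for one episode-wide video block.

    Hypothetical helper: rounding to the nearest frame and the floor of one
    frame are assumptions, not taken from the pipeline source.
    """
    if max_video_frames < 1:
        raise ValueError("max_video_frames must be at least 1")
    requested = round(duration_s * frames_per_second)
    # Never ask for more than the cap, and never for fewer than one frame.
    return max(1, min(requested, max_video_frames))


if __name__ == "__main__":
    # A ~1.1 s episode at 10 fps asks for 11 frames, so the cap of 5 applies.
    print(target_frame_count(1.1, 10.0, 5))
    # A short 0.3 s episode stays under the cap.
    print(target_frame_count(0.3, 10.0, 5))
```

Under these assumptions the test's `<= 5` bounds hold for any episode length, which is why the assertions use an upper bound rather than an exact count.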
@@ -0,0 +1,135 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""End-to-end smoke: pipeline output → PR 1 canonical recipe rendering."""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+import pyarrow.parquet as pq
+
+from lerobot.annotations.steerable_pipeline.config import (
+    AnnotationPipelineConfig,
+    Module1Config,
+    Module2Config,
+    Module3Config,
+)
+from lerobot.annotations.steerable_pipeline.executor import Executor
+from lerobot.annotations.steerable_pipeline.modules import (
+    GeneralVqaModule,
+    InterjectionsAndSpeechModule,
+    PlanSubtasksMemoryModule,
+)
+from lerobot.annotations.steerable_pipeline.validator import StagingValidator
+from lerobot.annotations.steerable_pipeline.writer import LanguageColumnsWriter
+from lerobot.configs.recipe import TrainingRecipe
+from lerobot.datasets.language_render import render_sample
+
+from ._helpers import make_canned_responder
+
+_RECIPE_PATH = (
+    Path(__file__).resolve().parents[2] / "src" / "lerobot" / "configs" / "recipes" / "pi05_hirobot.yaml"
+)
+
+
+def _build_executor() -> Executor:
+    vlm = make_canned_responder(
+        {
+            "atomic subtasks": {
+                "subtasks": [
+                    {"text": "grasp the bottle", "start": 0.0, "end": 0.5},
+                    {"text": "pour into the cup", "start": 0.5, "end": 1.0},
+                    {"text": "place the bottle down", "start": 1.0, "end": 1.5},
+                ]
+            },
+            "concise hierarchical PLAN": {"plan": "1. grasp\n2. pour\n3. place"},
+            "Update the memory": {"memory": "poured once"},
+            "acknowledgement the robot": {"text": "Sure."},
+            "ONE realistic interruption": {
+                "interjection": "use less water",
+                "speech": "Using less water.",
+            },
+            "frame-grounded visual question": {
+                "question": "How many cups?",
+                "answer": {"label": "cup", "count": 1},
+            },
+        },
+    )
+    config = AnnotationPipelineConfig(
+        module_1=Module1Config(),
+        module_2=Module2Config(max_interjections_per_episode=1, interjection_min_t=0.5),
+        module_3=Module3Config(vqa_emission_hz=1.0, K=2),
+    )
+    return Executor(
+        config=config,
+        module_1=PlanSubtasksMemoryModule(vlm=vlm, config=config.module_1),
+        module_2=InterjectionsAndSpeechModule(vlm=vlm, config=config.module_2, seed=config.seed),
+        module_3=GeneralVqaModule(vlm=vlm, config=config.module_3, seed=config.seed),
+        writer=LanguageColumnsWriter(),
+        validator=StagingValidator(),
+    )
+
+
+def test_pr1_canonical_recipe_renders_nonempty_from_pipeline_output(
+    single_episode_root: Path,
+) -> None:
+    executor = _build_executor()
+    summary = executor.run(single_episode_root)
+    # validator may emit warnings but no errors for the synthetic fixture
+    assert summary.validation_report.ok, summary.validation_report.summary()
+
+    table = pq.read_table(single_episode_root / "data" / "chunk-000" / "file-000.parquet")
+    persistent_lists = table.column("language_persistent").to_pylist()
+    events_lists = table.column("language_events").to_pylist()
+    timestamps = table.column("timestamp").to_pylist()
+
+    recipe = TrainingRecipe.from_yaml(_RECIPE_PATH) if hasattr(TrainingRecipe, "from_yaml") else None
+    if recipe is None:
+        # PR 1 may not expose from_yaml; load via PyYAML and TrainingRecipe(**...)
+        import yaml
+
+        loaded = yaml.safe_load(_RECIPE_PATH.read_text(encoding="utf-8"))
+        recipe = TrainingRecipe(**loaded)
+
+    rendered_any = False
+    for ts, persistent, events in zip(timestamps, persistent_lists, events_lists, strict=True):
+        result = render_sample(
+            recipe=recipe,
+            persistent=persistent,
+            events=events,
+            t=float(ts),
+            sample_idx=0,
+            dataset_ctx={"task": "Pour water from the bottle into the cup."},
+        )
+        if result is None:
+            continue
+        if result["messages"]:
+            rendered_any = True
+            assert result["target_message_indices"]
+            break
+    assert rendered_any, "PR 1 recipe rendered no messages from pipeline output"
+
+    # Sanity: speech atom appears in events column intact
+    flat_events = [r for ev in events_lists for r in ev]
+    speech_rows = [r for r in flat_events if r.get("style") is None and r.get("role") == "assistant"]
+    assert speech_rows
+    say = speech_rows[0]["tool_calls"][0]
+    assert say["function"]["name"] == "say"
+    assert isinstance(say["function"]["arguments"]["text"], str)
+    # PR 2 no longer writes a ``tools`` column — the say schema lives as a
+    # constant (``SAY_TOOL_SCHEMA``) so PR 1's row struct is the single
+    # source of truth for the v3.1 schema.
+    assert "tools" not in table.column_names
@@ -0,0 +1,125 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Validator behavior tests."""
+
+from __future__ import annotations
+
+import json
+from pathlib import Path
+
+from lerobot.annotations.steerable_pipeline.reader import iter_episodes
+from lerobot.annotations.steerable_pipeline.staging import EpisodeStaging
+from lerobot.annotations.steerable_pipeline.validator import StagingValidator
+from lerobot.annotations.steerable_pipeline.writer import speech_atom
+
+
+def _validate(root: Path, staging_dir: Path):
+    records = list(iter_episodes(root))
+    return StagingValidator().validate(records, staging_dir)
+
+
+def test_validator_catches_misaligned_timestamps(fixture_dataset_root: Path, tmp_path: Path) -> None:
+    staging_dir = tmp_path / "stage"
+    EpisodeStaging(staging_dir, 0).write(
+        "module_3",
+        [
+            {
+                "role": "assistant",
+                "content": json.dumps({"label": "cup", "count": 2}, sort_keys=True),
+                "style": "vqa",
+                "timestamp": 9.999,  # not on any 10 fps frame
+                "tool_calls": None,
+            }
+        ],
+    )
+    report = _validate(fixture_dataset_root, staging_dir)
+    assert not report.ok
+    assert any("does not match any source frame timestamp" in e for e in report.errors)
+
+
+def test_validator_catches_orphan_speech(fixture_dataset_root: Path, tmp_path: Path) -> None:
+    staging_dir = tmp_path / "stage"
+    EpisodeStaging(staging_dir, 0).write(
+        "module_2",
+        [
+            speech_atom(0.0, "Got it."),
+            # interjection at 0.3s with NO paired speech
+            {
+                "role": "user",
+                "content": "skip it",
+                "style": "interjection",
+                "timestamp": 0.3,
+                "tool_calls": None,
+            },
+        ],
+    )
+    report = _validate(fixture_dataset_root, staging_dir)
+    assert not report.ok
+    assert any("paired speech" in e for e in report.errors)
+
+
+def test_validator_catches_inconsistent_plan_memory(fixture_dataset_root: Path, tmp_path: Path) -> None:
+    staging_dir = tmp_path / "stage"
+    EpisodeStaging(staging_dir, 0).write(
+        "module_1",
+        [
+            {
+                "role": "assistant",
+                "content": "1. do x",
+                "style": "plan",
+                "timestamp": 0.0,
+                "tool_calls": None,
+            },
+            {
+                "role": "assistant",
+                "content": "do x",
+                "style": "subtask",
+                "timestamp": 0.0,
+                "tool_calls": None,
+            },
+        ],
+    )
+    EpisodeStaging(staging_dir, 0).write(
+        "module_2",
+        [
+            speech_atom(0.0, "Got it."),
+            speech_atom(0.4, "Replanning."),
+            {
+                "role": "user",
+                "content": "replan",
+                "style": "interjection",
+                "timestamp": 0.4,
+                "tool_calls": None,
+            },
+        ],
+    )
+    report = _validate(fixture_dataset_root, staging_dir)
+    # missing co-timestamped plan refresh at 0.4s → error
+    assert not report.ok
+    assert any("co-timestamped plan update" in e for e in report.errors)
+
+
+def test_validator_catches_wrong_column(fixture_dataset_root: Path, tmp_path: Path) -> None:
+    staging_dir = tmp_path / "stage"
+    EpisodeStaging(staging_dir, 0).write(
+        "module_1",
+        [
+            {"role": "user", "content": "where?", "style": "vqa", "timestamp": 0.0, "tool_calls": None},
+        ],
+    )
+    report = _validate(fixture_dataset_root, staging_dir)
+    assert not report.ok
+    assert any("module_1 emitted style 'vqa'" in e or "must be persistent" in e for e in report.errors)
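The timestamp-alignment rule checked by `test_validator_catches_misaligned_timestamps` can be illustrated with a stand-alone sketch. `misaligned_rows` and its `tolerance` parameter are hypothetical; the real `StagingValidator` may match timestamps exactly or with a different tolerance.

```python
from collections.abc import Iterable, Sequence


def misaligned_rows(
    rows: Iterable[dict],
    frame_timestamps: Sequence[float],
    tolerance: float = 1e-6,  # assumed tolerance, not taken from the validator
) -> list[dict]:
    """Return the staged rows whose timestamp matches no source frame."""
    bad = []
    for row in rows:
        ts = row["timestamp"]
        if not any(abs(ts - frame_ts) <= tolerance for frame_ts in frame_timestamps):
            bad.append(row)
    return bad


if __name__ == "__main__":
    frames = [i / 10 for i in range(12)]  # 10 fps, 0.0 s to 1.1 s
    rows = [
        {"style": "vqa", "timestamp": 0.5},
        {"style": "vqa", "timestamp": 9.999},
    ]
    for row in misaligned_rows(rows, frames):
        print(f"timestamp {row['timestamp']} does not match any source frame timestamp")
```

A row at 9.999 s falls outside a 10 fps grid that ends near 1.1 s, so it is reported, while the row at 0.5 s passes.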
@@ -0,0 +1,298 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Writer correctness tests."""
+
+from __future__ import annotations
+
+import json
+from pathlib import Path
+
+import pyarrow.parquet as pq
+import pytest
+
+from lerobot.annotations.steerable_pipeline.reader import iter_episodes
+from lerobot.annotations.steerable_pipeline.staging import EpisodeStaging
+from lerobot.annotations.steerable_pipeline.writer import (
+    LanguageColumnsWriter,
+    speech_atom,
+)
+
+
+def _stage_episode(
+    staging_dir: Path,
+    episode_index: int,
+    *,
+    module_1: list[dict] | None = None,
+    module_2: list[dict] | None = None,
+    module_3: list[dict] | None = None,
+) -> None:
+    staging = EpisodeStaging(staging_dir, episode_index)
+    if module_1 is not None:
+        staging.write("module_1", module_1)
+    if module_2 is not None:
+        staging.write("module_2", module_2)
+    if module_3 is not None:
+        staging.write("module_3", module_3)
+
+
+def test_writer_persistence_identity(fixture_dataset_root: Path, tmp_path: Path) -> None:
+    """Every frame in an episode has a byte-identical persistent list."""
+    staging_dir = tmp_path / "stage"
+    _stage_episode(
+        staging_dir,
+        0,
+        module_1=[
+            {
+                "role": "assistant",
+                "content": "grasp the sponge",
+                "style": "subtask",
+                "timestamp": 0.0,
+                "tool_calls": None,
+            },
+            {
+                "role": "assistant",
+                "content": "1. wipe\n2. dry",
+                "style": "plan",
+                "timestamp": 0.0,
+                "tool_calls": None,
+            },
+            {
+                "role": "assistant",
+                "content": "wiped the counter",
+                "style": "memory",
+                "timestamp": 0.5,
+                "tool_calls": None,
+            },
+        ],
+    )
+    records = list(iter_episodes(fixture_dataset_root))
+    LanguageColumnsWriter().write_all(records, staging_dir, fixture_dataset_root)
+
+    table = pq.read_table(fixture_dataset_root / "data" / "chunk-000" / "file-000.parquet")
+    persistent = table.column("language_persistent").to_pylist()
+    first = persistent[0]
+    assert first  # non-empty
+    for row in persistent:
+        assert row == first, "persistent slice must be byte-identical across all frames"
+
+
+def test_writer_events_exact_timestamp(fixture_dataset_root: Path, tmp_path: Path) -> None:
+    staging_dir = tmp_path / "stage"
+    _stage_episode(
+        staging_dir,
+        0,
+        module_2=[
+            speech_atom(0.0, "Got it."),
+            {
+                "role": "user",
+                "content": "skip the dishes",
+                "style": "interjection",
+                "timestamp": 0.5,
+                "tool_calls": None,
+            },
+            speech_atom(0.5, "Skipping the dishes."),
+        ],
+    )
+    records = list(iter_episodes(fixture_dataset_root))
+    LanguageColumnsWriter().write_all(records, staging_dir, fixture_dataset_root)
+
+    table = pq.read_table(fixture_dataset_root / "data" / "chunk-000" / "file-000.parquet")
+    timestamps = table.column("timestamp").to_pylist()
+    events = table.column("language_events").to_pylist()
+    for ts, ev in zip(timestamps, events, strict=True):
+        if abs(ts - 0.0) < 1e-9:
+            assert any(r["role"] == "assistant" and r.get("style") is None for r in ev), ev
+        elif abs(ts - 0.5) < 1e-9:
+            assert any(r.get("style") == "interjection" for r in ev), ev
+            assert any(r.get("style") is None for r in ev), ev
+        else:
+            assert ev == []
+
+
+def test_writer_column_routing(fixture_dataset_root: Path, tmp_path: Path) -> None:
+    staging_dir = tmp_path / "stage"
+    _stage_episode(
+        staging_dir,
+        0,
+        module_1=[
+            {
+                "role": "assistant",
+                "content": "do X",
+                "style": "subtask",
+                "timestamp": 0.0,
+                "tool_calls": None,
+            },
+            {
+                "role": "assistant",
+                "content": "1. do X",
+                "style": "plan",
+                "timestamp": 0.0,
+                "tool_calls": None,
+            },
+            {
+                "role": "assistant",
+                "content": "did X",
+                "style": "memory",
+                "timestamp": 0.3,
+                "tool_calls": None,
+            },
+        ],
+        module_2=[
+            speech_atom(0.0, "OK"),
+            {
+                "role": "user",
+                "content": "wait",
+                "style": "interjection",
+                "timestamp": 0.2,
+                "tool_calls": None,
+            },
+            speech_atom(0.2, "Waiting"),
+        ],
+        module_3=[
+            {
+                "role": "user",
+                "content": "where is the cup?",
+                "style": "vqa",
+                "timestamp": 0.4,
+                "tool_calls": None,
+            },
+            {
+                "role": "assistant",
+                "content": json.dumps(
+                    {"detections": [{"label": "cup", "bbox_format": "xyxy", "bbox": [1, 2, 3, 4]}]},
+                    sort_keys=True,
+                ),
+                "style": "vqa",
+                "timestamp": 0.4,
+                "tool_calls": None,
+            },
+        ],
+    )
+    records = list(iter_episodes(fixture_dataset_root))
+    LanguageColumnsWriter().write_all(records, staging_dir, fixture_dataset_root)
+    table = pq.read_table(fixture_dataset_root / "data" / "chunk-000" / "file-000.parquet")
+
+    persistent = table.column("language_persistent").to_pylist()[0]
+    persistent_styles = {r["style"] for r in persistent}
+    assert persistent_styles == {"subtask", "plan", "memory"}
+
+    all_events = [r for ev in table.column("language_events").to_pylist() for r in ev]
+    event_styles = {r.get("style") for r in all_events}
+    assert event_styles == {None, "interjection", "vqa"}
+
+
+def test_writer_drops_subtask_index_idempotent(fixture_dataset_root: Path, tmp_path: Path) -> None:
+    staging_dir = tmp_path / "stage"
+    _stage_episode(
+        staging_dir,
+        0,
+        module_1=[
+            {
+                "role": "assistant",
+                "content": "do X",
+                "style": "subtask",
+                "timestamp": 0.0,
+                "tool_calls": None,
+            },
+        ],
+    )
+    records = list(iter_episodes(fixture_dataset_root))
+    writer = LanguageColumnsWriter()
+    writer.write_all(records, staging_dir, fixture_dataset_root)
+
+    path = fixture_dataset_root / "data" / "chunk-000" / "file-000.parquet"
+    table_a = pq.read_table(path)
+    assert "subtask_index" not in table_a.column_names
+    assert "language_persistent" in table_a.column_names
+    assert "language_events" in table_a.column_names
+    # The writer no longer emits a dataset-level ``tools`` column; the
+    # ``say`` tool schema lives as a code constant (``SAY_TOOL_SCHEMA``)
+    # so the parquet stays small and PR 2 doesn't extend PR 1's schema.
+    assert "tools" not in table_a.column_names
+
+    # second pass: must produce identical values for the language columns
+    records_again = list(iter_episodes(fixture_dataset_root))
+    writer.write_all(records_again, staging_dir, fixture_dataset_root)
+    table_b = pq.read_table(path)
+    assert (
+        table_a.column("language_persistent").to_pylist() == table_b.column("language_persistent").to_pylist()
+    )
+    assert table_a.column("language_events").to_pylist() == table_b.column("language_events").to_pylist()
+
+
+def test_writer_normalize_rejects_misrouted_persistent_style() -> None:
+    """``_normalize_persistent_row`` must reject any non-persistent style."""
+    from lerobot.annotations.steerable_pipeline.writer import _normalize_persistent_row
+
+    with pytest.raises(ValueError, match="non-persistent style"):
+        _normalize_persistent_row(
+            {"role": "assistant", "content": "oops", "style": "vqa", "timestamp": 0.0, "tool_calls": None}
+        )
+
+
+def test_writer_normalize_rejects_misrouted_event_style() -> None:
+    """``_normalize_event_row`` must reject any persistent style."""
+    from lerobot.annotations.steerable_pipeline.writer import _normalize_event_row
+
+    with pytest.raises(ValueError):
+        _normalize_event_row({"role": "assistant", "content": "oops", "style": "subtask", "tool_calls": None})
+
+
+def test_say_tool_schema_constant_is_well_formed() -> None:
+    """``SAY_TOOL_SCHEMA`` (and ``DEFAULT_TOOLS``) replace the parquet
+    ``tools`` column — chat-template consumers import them directly.
+    """
+    from lerobot.annotations.steerable_pipeline.writer import (
+        DEFAULT_TOOLS,
+        SAY_TOOL_SCHEMA,
+    )
+
+    assert DEFAULT_TOOLS == [SAY_TOOL_SCHEMA]
+    assert SAY_TOOL_SCHEMA["function"]["name"] == "say"
+    params = SAY_TOOL_SCHEMA["function"]["parameters"]
+    assert params["properties"]["text"]["type"] == "string"
+    assert params["required"] == ["text"]
+
+
+def test_writer_does_not_add_tools_column(fixture_dataset_root: Path, tmp_path: Path) -> None:
+    """The writer must not emit a ``tools`` column, so the output stays on
+    the v3.1 schema. No legacy ``tools`` column is seeded by this test.
+    """
+    staging_dir = tmp_path / "stage"
+    _stage_episode(
+        staging_dir,
+        0,
+        module_1=[
+            {"role": "assistant", "content": "x", "style": "subtask", "timestamp": 0.0, "tool_calls": None}
+        ],
+    )
+    records = list(iter_episodes(fixture_dataset_root))
+    LanguageColumnsWriter().write_all(records, staging_dir, fixture_dataset_root)
+    table = pq.read_table(fixture_dataset_root / "data" / "chunk-000" / "file-000.parquet")
+    assert "tools" not in table.column_names
+
+
+def test_speech_atom_shape_matches_plan_spec() -> None:
+    atom = speech_atom(2.5, "I'm cleaning up!")
+    assert atom["role"] == "assistant"
+    assert atom["style"] is None
+    assert atom["content"] is None
+    assert atom["timestamp"] == 2.5
+    assert isinstance(atom["tool_calls"], list)
+    call = atom["tool_calls"][0]
+    assert call["type"] == "function"
+    assert call["function"]["name"] == "say"
+    assert call["function"]["arguments"]["text"] == "I'm cleaning up!"
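The row shape and tool schema that the assertions above pin down can be reconstructed as a stand-alone sketch. The names `speech_atom_sketch` and `SAY_TOOL_SCHEMA_SKETCH` are hypothetical stand-ins for the writer's `speech_atom` and `SAY_TOOL_SCHEMA`; only the fields asserted in the tests come from the source, and everything else is an assumption.

```python
from typing import Any

# Only the function name, the ``text`` property type, and the ``required``
# list are asserted by the tests. The descriptions and the outer ``type``
# fields are assumptions that follow the usual function-calling layout.
SAY_TOOL_SCHEMA_SKETCH: dict[str, Any] = {
    "type": "function",
    "function": {
        "name": "say",
        "description": "Speak a short utterance to the user.",
        "parameters": {
            "type": "object",
            "properties": {"text": {"type": "string", "description": "Words to speak."}},
            "required": ["text"],
        },
    },
}


def speech_atom_sketch(timestamp: float, text: str) -> dict[str, Any]:
    """Build an assistant row that carries speech only as a ``say`` tool call."""
    return {
        "role": "assistant",
        "content": None,  # the utterance lives in the tool call, not in content
        "style": None,  # speech rows have no style, which routes them to events
        "timestamp": timestamp,
        "tool_calls": [
            {"type": "function", "function": {"name": "say", "arguments": {"text": text}}}
        ],
    }


if __name__ == "__main__":
    print(speech_atom_sketch(2.5, "I'm cleaning up!"))
```

Keeping the schema as a code constant rather than a parquet column matches what the tests state: consumers import it directly, and the dataset schema does not grow.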
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8a31653c11eccdd4d80fd3f6a351cd54c49b8a48db1f7e9faf38fddd7900a09f
+oid sha256:c2b8f8532c7a0b776de5e536b8b54e30b1a0c2e3d5cc25a2d86fe43e40ae5e8c
 size 515400
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:75bf051698b37dcd7517ec8025a896ab5a0551a6dde5f89d0a3d5d50966e83e6
+oid sha256:224b5fa4828aa88171b68c036e8919c1eae563e2113f03b6461eadf5bf8525a6
 size 31672
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:88e10930a10041d50f2cf369e6813ac14618d13dad1c21bdde1ac7798611c6ba
+oid sha256:016d2fa8fe5f58017dfd46f4632fdc19dfd751e32a2c7cde2077c6f95546d6bd
 size 68
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:89833a5ccdb7d85c83f717ff8ec68b8e822005cb8803899acaae88c578e2e3ae
+oid sha256:eca0d87a699620e4fec7e68539b0be91e4cc933f6bf12032da52c182ab6f38cf
 size 31672
@@ -0,0 +1,32 @@
+#!/usr/bin/env python
+
+from pathlib import Path
+
+import pytest
+
+from lerobot.configs.recipe import MessageTurn, TrainingRecipe
+
+
+def test_message_recipe_validates_unknown_binding():
+    with pytest.raises(ValueError, match="unknown binding"):
+        TrainingRecipe(
+            messages=[
+                MessageTurn(role="user", content="${missing}", stream="high_level"),
+                MessageTurn(role="assistant", content="ok", stream="high_level", target=True),
+            ]
+        )
+
+
+def test_canonical_recipe_loads():
+    recipe = TrainingRecipe.from_yaml(Path("src/lerobot/configs/recipes/pi05_hirobot.yaml"))
+
+    assert recipe.blend is not None
+    assert set(recipe.blend) == {
+        "memory_update",
+        "user_interjection_response",
+        "high_level_subtask",
+        "low_level_execution",
+        "ask_vqa_top",
+        "ask_vqa_wrist",
+    }
+    assert sum(component.weight for component in recipe.blend.values()) == pytest.approx(0.96)
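The unknown-binding check exercised by `test_message_recipe_validates_unknown_binding` could work roughly as follows. This is a sketch only: `check_bindings`, the placeholder regex, and the example binding names are all hypothetical, and `TrainingRecipe` may validate its `${...}` placeholders in a different way.

```python
import re

# Matches ``${name}`` placeholders; the accepted identifier syntax is assumed.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def check_bindings(contents: list[str], known_bindings: set[str]) -> None:
    """Raise ``ValueError`` if any message content references an unknown binding."""
    for content in contents:
        for name in _PLACEHOLDER.findall(content):
            if name not in known_bindings:
                raise ValueError(f"unknown binding: {name!r}")


if __name__ == "__main__":
    known = {"task", "subtask"}  # hypothetical binding names
    check_bindings(["Task: ${task}", "ok"], known)  # passes silently
    try:
        check_bindings(["${missing}"], known)
    except ValueError as err:
        print(err)
```

Validating at construction time, as the test expects, surfaces a typo in a recipe before any sample is rendered.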
@@ -0,0 +1,152 @@
+#!/usr/bin/env python
+
+import numpy as np
+import pandas as pd
+import pyarrow as pa
+import pytest
+
+from lerobot.datasets import LeRobotDataset
+from lerobot.datasets.io_utils import write_info
+from lerobot.datasets.language import (
+    EVENT_ONLY_STYLES,
+    LANGUAGE_EVENTS,
+    LANGUAGE_PERSISTENT,
+    PERSISTENT_STYLES,
+    STYLE_REGISTRY,
+    VIEW_DEPENDENT_STYLES,
+    column_for_style,
+    is_view_dependent_style,
+    language_events_arrow_type,
+    language_feature_info,
+    language_persistent_arrow_type,
+    validate_camera_field,
+)
+from lerobot.datasets.utils import DEFAULT_DATA_PATH
+
+
+def test_language_arrow_schema_has_expected_fields():
+    persistent_row_type = language_persistent_arrow_type().value_type
+    event_row_type = language_events_arrow_type().value_type
+
+    assert isinstance(persistent_row_type, pa.StructType)
+    assert persistent_row_type.names == [
+        "role",
+        "content",
+        "style",
+        "timestamp",
+        "camera",
+        "tool_calls",
+    ]
+
+    assert isinstance(event_row_type, pa.StructType)
+    assert event_row_type.names == ["role", "content", "style", "camera", "tool_calls"]
+
+
+def test_style_registry_routes_columns():
+    assert {"subtask", "plan", "memory", "motion", "task_aug"} == PERSISTENT_STYLES
+    assert {"interjection", "vqa", "trace"} == EVENT_ONLY_STYLES
+    assert PERSISTENT_STYLES | EVENT_ONLY_STYLES <= STYLE_REGISTRY
+
+    assert column_for_style("subtask") == LANGUAGE_PERSISTENT
+    assert column_for_style("plan") == LANGUAGE_PERSISTENT
+    assert column_for_style("memory") == LANGUAGE_PERSISTENT
+    assert column_for_style("motion") == LANGUAGE_PERSISTENT
+    assert column_for_style("task_aug") == LANGUAGE_PERSISTENT
+    assert column_for_style("interjection") == LANGUAGE_EVENTS
+    assert column_for_style("vqa") == LANGUAGE_EVENTS
+    assert column_for_style("trace") == LANGUAGE_EVENTS
+    assert column_for_style(None) == LANGUAGE_EVENTS
+
+
+def test_view_dependent_styles():
+    # motion lives in PERSISTENT_STYLES and is described in robot-frame
+    # (joint / Cartesian) terms, so it is NOT view-dependent. Only vqa
+    # (event) and trace (event, pixel-trajectory) carry a camera tag.
+    assert {"vqa", "trace"} == VIEW_DEPENDENT_STYLES
+    assert is_view_dependent_style("vqa")
+    assert is_view_dependent_style("trace")
+    assert not is_view_dependent_style("motion")
+    assert not is_view_dependent_style("subtask")
+    assert not is_view_dependent_style("plan")
+    assert not is_view_dependent_style("interjection")
+    assert not is_view_dependent_style(None)
+
+
+def test_validate_camera_field_requires_camera_for_view_dependent_styles():
+    validate_camera_field("vqa", "observation.images.top")
+    validate_camera_field("trace", "observation.images.front")
+    with pytest.raises(ValueError, match="view-dependent"):
+        validate_camera_field("vqa", None)
+    with pytest.raises(ValueError, match="view-dependent"):
+        validate_camera_field("trace", "")
+
+
+def test_validate_camera_field_rejects_camera_on_non_view_dependent_styles():
+    validate_camera_field("subtask", None)
+    validate_camera_field("plan", None)
+    validate_camera_field("memory", None)
+    validate_camera_field("motion", None)
+    validate_camera_field("interjection", None)
+    validate_camera_field(None, None)
+    with pytest.raises(ValueError, match="must have camera=None"):
+        validate_camera_field("subtask", "observation.images.top")
+    with pytest.raises(ValueError, match="must have camera=None"):
+        validate_camera_field("motion", "observation.images.top")
+    with pytest.raises(ValueError, match="must have camera=None"):
+        validate_camera_field("interjection", "observation.images.top")
+    with pytest.raises(ValueError, match="must have camera=None"):
+        validate_camera_field(None, "observation.images.top")
+
+
+def test_unknown_style_rejected():
+    with pytest.raises(ValueError, match="Unknown language style"):
+        column_for_style("surprise")
+
+
+def test_lerobot_dataset_passes_language_columns_through(tmp_path, empty_lerobot_dataset_factory):
+    root = tmp_path / "language_dataset"
+    dataset = empty_lerobot_dataset_factory(
+        root=root,
+        features={"state": {"dtype": "float32", "shape": (2,), "names": None}},
+        use_videos=False,
+    )
+    dataset.add_frame({"state": np.array([0.0, 1.0], dtype=np.float32), "task": "tidy"})
+    dataset.add_frame({"state": np.array([1.0, 2.0], dtype=np.float32), "task": "tidy"})
+    dataset.save_episode()
+    dataset.finalize()
+
+    persistent = [
+        {
+            "role": "assistant",
+            "content": "reach for the cup",
+            "style": "subtask",
+            "timestamp": 0.0,
+            "camera": None,
+            "tool_calls": None,
+        }
+    ]
+    event = {
+        "role": "user",
+        "content": "what is visible?",
+        "style": "vqa",
+        "camera": "observation.images.top",
+        "tool_calls": None,
+    }
+    data_path = root / DEFAULT_DATA_PATH.format(chunk_index=0, file_index=0)
+    df = pd.read_parquet(data_path)
+    df[LANGUAGE_PERSISTENT] = [persistent, persistent]
+    df[LANGUAGE_EVENTS] = [[event], []]
+    df.to_parquet(data_path)
+
+    info = dataset.meta.info
+    info["features"].update(language_feature_info())
+    write_info(info, root)
+
+    reloaded = LeRobotDataset(repo_id=dataset.repo_id, root=root)
+
+    first = reloaded[0]
+    second = reloaded[1]
+    assert first[LANGUAGE_PERSISTENT] == persistent
+    assert first[LANGUAGE_EVENTS] == [event]
+    assert second[LANGUAGE_PERSISTENT] == persistent
+    assert second[LANGUAGE_EVENTS] == []
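The routing and camera rules asserted above can be condensed into a stand-alone sketch. It re-implements `column_for_style` and `validate_camera_field` from the test assertions alone: the style sets and error-message fragments come from the tests, while the column-name strings (taken from the parquet column names used elsewhere in this diff) and the exact message wording are assumptions.

```python
from __future__ import annotations

LANGUAGE_PERSISTENT = "language_persistent"  # assumed constant value
LANGUAGE_EVENTS = "language_events"  # assumed constant value

PERSISTENT_STYLES = {"subtask", "plan", "memory", "motion", "task_aug"}
EVENT_ONLY_STYLES = {"interjection", "vqa", "trace"}
VIEW_DEPENDENT_STYLES = {"vqa", "trace"}


def column_for_style(style: str | None) -> str:
    """Route a row to the persistent or the events column by its style."""
    if style is None:
        # Style-less rows (for example speech tool calls) are events.
        return LANGUAGE_EVENTS
    if style in PERSISTENT_STYLES:
        return LANGUAGE_PERSISTENT
    if style in EVENT_ONLY_STYLES:
        return LANGUAGE_EVENTS
    raise ValueError(f"Unknown language style: {style!r}")


def validate_camera_field(style: str | None, camera: str | None) -> None:
    """Require a camera key on view-dependent styles and forbid it elsewhere."""
    if style in VIEW_DEPENDENT_STYLES:
        if not camera:  # rejects both None and the empty string
            raise ValueError(f"style {style!r} is view-dependent and needs a camera key")
    elif camera is not None:
        raise ValueError(f"style {style!r} must have camera=None")


if __name__ == "__main__":
    print(column_for_style("plan"), column_for_style("vqa"), column_for_style(None))
    validate_camera_field("vqa", "observation.images.top")
```

Treating `None` as an event style keeps speech rows out of the persistent column without giving them a dedicated style name.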
@@ -0,0 +1,388 @@
+#!/usr/bin/env python
+
+from pathlib import Path
+
+import pytest
+
+from lerobot.configs.recipe import MessageTurn, TrainingRecipe
+from lerobot.datasets.language_render import active_at, emitted_at, nth_next, nth_prev, render_sample
+
+
+def persistent_row(role, content, style, timestamp, tool_calls=None, camera=None):
+    return {
+        "role": role,
+        "content": content,
+        "style": style,
+        "timestamp": timestamp,
+        "camera": camera,
+        "tool_calls": tool_calls,
+    }
+
+
+def event_row(role, content, style, tool_calls=None, camera=None):
+    return {
+        "role": role,
+        "content": content,
+        "style": style,
+        "camera": camera,
+        "tool_calls": tool_calls,
+    }
+
+
+PERSISTENT = [
+    persistent_row("assistant", "plan 0", "plan", 0.0),
+    persistent_row("assistant", "memory 0", "memory", 0.0),
+    persistent_row("assistant", "subtask 0", "subtask", 0.0),
+    persistent_row("assistant", "memory 1", "memory", 1.0),
+    persistent_row("assistant", "subtask 1", "subtask", 1.0),
+]
+EVENTS_AT_1 = [
+    event_row("user", "what is visible?", "vqa", camera="observation.images.top"),
+    event_row("assistant", '{"count": 2}', "vqa", camera="observation.images.top"),
+]
+EVENTS_AT_2 = [
+    event_row("user", "skip wiping", "interjection"),
+    event_row(
+        "assistant",
+        None,
+        None,
+        [{"type": "function", "function": {"name": "say", "arguments": {"text": "Skipping wiping."}}}],
+    ),
+]
+# Same emission tick, two cameras: triggers per-camera disambiguation in
+# resolvers, mirroring how Module 3 of the annotation pipeline writes one
+# (vqa, user) + (vqa, assistant) pair per camera.
+EVENTS_AT_3_TWO_CAMERAS = [
+    event_row("user", "how many cups (top)?", "vqa", camera="observation.images.top"),
+    event_row("assistant", '{"count": 3}', "vqa", camera="observation.images.top"),
+    event_row("user", "how many cups (wrist)?", "vqa", camera="observation.images.wrist"),
+    event_row("assistant", '{"count": 1}', "vqa", camera="observation.images.wrist"),
+]
+
+
+def test_resolver_temporal_semantics():
+    assert active_at(0.5, persistent=PERSISTENT, style="subtask")["content"] == "subtask 0"
+    assert active_at(1.0, persistent=PERSISTENT, style="subtask")["content"] == "subtask 1"
+    assert emitted_at(0.5, persistent=PERSISTENT, events=[], style="vqa", role="assistant") is None
+    assert (
+        emitted_at(1.0, persistent=PERSISTENT, events=EVENTS_AT_1, style="vqa", role="assistant")["content"]
+        == '{"count": 2}'
+    )
+
+
+def test_persistent_relative_resolvers_reject_event_styles():
+    with pytest.raises(ValueError, match="event-only"):
+        active_at(1.0, persistent=PERSISTENT, style="vqa")
+    with pytest.raises(ValueError, match="event-only"):
+        nth_prev(1.0, persistent=PERSISTENT, style="interjection")
+
+
+def test_nth_prev_and_next():
+    assert nth_prev(1.0, persistent=PERSISTENT, style="subtask", offset=1)["content"] == "subtask 0"
+    assert nth_next(0.0, persistent=PERSISTENT, style="subtask", offset=1)["content"] == "subtask 1"
+
+
+def test_substitution_if_present_multimodal_and_tool_calls():
+    recipe = TrainingRecipe(
+        messages=[
+            MessageTurn(
+                role="user",
+                content=[
+                    {"type": "image", "feature": "observation.images.top"},
+                    {"type": "text", "text": "${task}: ${interjection}"},
+                ],
+                stream="high_level",
+                if_present="interjection",
+            ),
+            MessageTurn(
+                role="assistant",
+                content="${plan}",
+                stream="high_level",
+                target=True,
+                tool_calls_from="speech",
+            ),
+        ],
+        bindings={"plan": "active_at(t, style=plan)"},
+    )
+
+    rendered = render_sample(
+        recipe=recipe,
+        persistent=PERSISTENT,
+        events=EVENTS_AT_2,
+        t=2.0,
+        sample_idx=0,
+        task="clean kitchen",
+    )
+
+    assert rendered["messages"][0]["content"][1]["text"] == "clean kitchen: skip wiping"
+    assert rendered["messages"][1]["content"] == "plan 0"
+    assert rendered["messages"][1]["tool_calls"][0]["function"]["name"] == "say"
+    assert rendered["message_streams"] == ["high_level", "high_level"]
+    assert rendered["target_message_indices"] == [1]
+
+
+def test_exact_event_miss_returns_none_when_target_skips():
+    recipe = TrainingRecipe(
+        messages=[
+            MessageTurn(role="user", content="${vqa_query}", stream="high_level", if_present="vqa_query"),
+            MessageTurn(
+                role="assistant",
+                content="${vqa}",
+                stream="high_level",
+                target=True,
+                if_present="vqa",
+            ),
+        ]
+    )
+
+    assert (
+        render_sample(recipe=recipe, persistent=PERSISTENT, events=EVENTS_AT_2, t=0.0, sample_idx=0) is None
+    )
+
+
+def test_deterministic_blend_sampling():
+    recipe = TrainingRecipe(
+        blend={
+            "a": TrainingRecipe(
+                weight=1.0,
+                messages=[
+                    MessageTurn(role="user", content="${task}", stream="high_level"),
+                    MessageTurn(role="assistant", content="a", stream="high_level", target=True),
+                ],
+            ),
+            "b": TrainingRecipe(
+                weight=1.0,
+                messages=[
+                    MessageTurn(role="user", content="${task}", stream="high_level"),
+                    MessageTurn(role="assistant", content="b", stream="high_level", target=True),
+                ],
+            ),
+        }
+    )
+
+    first = render_sample(
+        recipe=recipe, persistent=PERSISTENT, events=EVENTS_AT_2, t=0.0, sample_idx=123, task="x"
+    )
+    second = render_sample(
+        recipe=recipe, persistent=PERSISTENT, events=EVENTS_AT_2, t=0.0, sample_idx=123, task="x"
+    )
+    assert first == second
+
+
+def test_emitted_at_filters_vqa_by_camera():
+    top = emitted_at(
+        3.0,
+        persistent=PERSISTENT,
+        events=EVENTS_AT_3_TWO_CAMERAS,
+        style="vqa",
+        role="assistant",
+        camera="observation.images.top",
+    )
+    wrist = emitted_at(
+        3.0,
+        persistent=PERSISTENT,
+        events=EVENTS_AT_3_TWO_CAMERAS,
+        style="vqa",
+        role="assistant",
+        camera="observation.images.wrist",
+    )
+    assert top["content"] == '{"count": 3}'
+    assert wrist["content"] == '{"count": 1}'
+
+
+def test_emitted_at_raises_on_ambiguous_per_camera_vqa():
+    with pytest.raises(ValueError, match="Ambiguous resolver"):
+        emitted_at(
+            3.0,
+            persistent=PERSISTENT,
+            events=EVENTS_AT_3_TWO_CAMERAS,
+            style="vqa",
+            role="assistant",
+        )
+
+
+def test_per_camera_blend_renders_both_views():
+    recipe = TrainingRecipe(
+        blend={
+            "top": TrainingRecipe(
+                weight=1.0,
+                bindings={
+                    "vqa_query": (
+                        "emitted_at(t, style=vqa, role=user, camera=observation.images.top)"
+                    ),
+                    "vqa": (
+                        "emitted_at(t, style=vqa, role=assistant, camera=observation.images.top)"
+                    ),
+                },
+                messages=[
+                    MessageTurn(
+                        role="user",
+                        content=[
+                            {"type": "image", "feature": "observation.images.top"},
+                            {"type": "text", "text": "${vqa_query}"},
+                        ],
+                        stream="high_level",
+                        if_present="vqa_query",
+                    ),
+                    MessageTurn(
+                        role="assistant",
+                        content="${vqa}",
+                        stream="high_level",
+                        target=True,
+                        if_present="vqa",
+                    ),
+                ],
+            ),
+            "wrist": TrainingRecipe(
+                weight=1.0,
+                bindings={
+                    "vqa_query": (
+                        "emitted_at(t, style=vqa, role=user, camera=observation.images.wrist)"
+                    ),
+                    "vqa": (
+                        "emitted_at(t, style=vqa, role=assistant, camera=observation.images.wrist)"
+                    ),
+                },
+                messages=[
+                    MessageTurn(
+                        role="user",
+                        content=[
+                            {"type": "image", "feature": "observation.images.wrist"},
+                            {"type": "text", "text": "${vqa_query}"},
+                        ],
+                        stream="high_level",
+                        if_present="vqa_query",
+                    ),
+                    MessageTurn(
+                        role="assistant",
+                        content="${vqa}",
+                        stream="high_level",
+                        target=True,
+                        if_present="vqa",
+                    ),
+                ],
+            ),
+        }
+    )
+
+    rendered_top = render_sample(
+        recipe=recipe.blend["top"],
+        persistent=PERSISTENT,
+        events=EVENTS_AT_3_TWO_CAMERAS,
+        t=3.0,
+        sample_idx=0,
+    )
+    rendered_wrist = render_sample(
+        recipe=recipe.blend["wrist"],
+        persistent=PERSISTENT,
+        events=EVENTS_AT_3_TWO_CAMERAS,
+        t=3.0,
+        sample_idx=0,
+    )
+
+    assert rendered_top["messages"][0]["content"][0]["feature"] == "observation.images.top"
+    assert rendered_top["messages"][0]["content"][1]["text"] == "how many cups (top)?"
+    assert rendered_top["messages"][1]["content"] == '{"count": 3}'
+
+    assert rendered_wrist["messages"][0]["content"][0]["feature"] == "observation.images.wrist"
+    assert rendered_wrist["messages"][0]["content"][1]["text"] == "how many cups (wrist)?"
+    assert rendered_wrist["messages"][1]["content"] == '{"count": 1}'
+
+
+def test_resolve_task_picks_rephrasing_deterministically_per_sample():
+    rephrasings = [
+        persistent_row("user", "tidy the kitchen", "task_aug", 0.0),
+        persistent_row("user", "please clean up the kitchen", "task_aug", 0.0),
+        persistent_row("user", "kitchen needs tidying", "task_aug", 0.0),
+        persistent_row("user", "make the kitchen clean", "task_aug", 0.0),
+    ]
+    recipe = TrainingRecipe(
+        messages=[
+            MessageTurn(role="user", content="${task}", stream="high_level"),
+            MessageTurn(role="assistant", content="ok", stream="high_level", target=True),
+        ]
+    )
+
+    # No explicit task override → resolver consults persistent rows.
+    seen: set[str] = set()
+    for sample_idx in range(64):
+        rendered = render_sample(
+            recipe=recipe,
+            persistent=rephrasings,
+            events=[],
+            t=0.0,
+            sample_idx=sample_idx,
+            dataset_ctx={"task": "canonical kitchen task"},
+        )
+        seen.add(rendered["messages"][0]["content"])
+    # Every rephrasing should be reachable across enough samples.
+    assert seen == {r["content"] for r in rephrasings}
+    # Same sample_idx → same pick (determinism).
+    a = render_sample(
+        recipe=recipe, persistent=rephrasings, events=[], t=0.0, sample_idx=42,
+        dataset_ctx={"task": "canonical"},
+    )
+    b = render_sample(
+        recipe=recipe, persistent=rephrasings, events=[], t=0.0, sample_idx=42,
+        dataset_ctx={"task": "canonical"},
+    )
+    assert a["messages"][0]["content"] == b["messages"][0]["content"]
+
+
+def test_resolve_task_falls_back_to_canonical_without_rephrasings():
+    recipe = TrainingRecipe(
+        messages=[
+            MessageTurn(role="user", content="${task}", stream="high_level"),
+            MessageTurn(role="assistant", content="ok", stream="high_level", target=True),
+        ]
+    )
+    rendered = render_sample(
+        recipe=recipe,
+        persistent=PERSISTENT,  # no task_aug rows
+        events=[],
+        t=0.0,
+        sample_idx=0,
+        dataset_ctx={"task": "clean the kitchen"},
+    )
+    assert rendered["messages"][0]["content"] == "clean the kitchen"
+
+
+def test_resolve_task_explicit_override_beats_rephrasings():
+    rephrasings = [
+        persistent_row("user", "rephrased one", "task_aug", 0.0),
+        persistent_row("user", "rephrased two", "task_aug", 0.0),
+    ]
+    recipe = TrainingRecipe(
+        messages=[
+            MessageTurn(role="user", content="${task}", stream="high_level"),
+            MessageTurn(role="assistant", content="ok", stream="high_level", target=True),
+        ]
+    )
+    rendered = render_sample(
+        recipe=recipe,
+        persistent=rephrasings,
+        events=[],
+        t=0.0,
+        sample_idx=0,
+        task="explicit override wins",
+        dataset_ctx={"task": "canonical"},
+    )
+    assert rendered["messages"][0]["content"] == "explicit override wins"
+
+
+def test_canonical_recipe_can_render_low_level_branch():
+    recipe = TrainingRecipe.from_yaml(Path("src/lerobot/configs/recipes/pi05_hirobot.yaml"))
+    low_level = TrainingRecipe(blend={"low": recipe.blend["low_level_execution"]})
+
+    rendered = render_sample(
+        recipe=low_level,
+        persistent=PERSISTENT,
+        events=[],
+        t=0.5,
+        sample_idx=0,
+        task="clean kitchen",
+    )
+
+    assert rendered["messages"][-1] == {"role": "assistant", "content": "subtask 0"}
+    assert rendered["message_streams"][-1] == "low_level"
+    assert rendered["target_message_indices"] == [1]
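Review note: the resolver tests in this file pin down a simple temporal rule for persistent rows: `active_at` returns the latest row of a style at or before `t`, `nth_prev` steps back from that row, and `nth_next` looks strictly after `t`. The sketch below is one way to read that rule, assuming rows are plain dicts with `style` and `timestamp` keys as built by `persistent_row`. The `*_sketch` names and the `EVENT_ONLY_STYLES` set are hypothetical and are not taken from `lerobot.datasets.language_render`; the real resolvers may differ, for example in how they break timestamp ties.

```python
from typing import Any, Optional

Row = dict[str, Any]

# Assumption: the styles the tests reject for persistent resolvers.
# The real module may define a different or larger set.
EVENT_ONLY_STYLES = {"vqa", "interjection"}


def _rows_of_style(persistent: list[Row], style: str) -> list[Row]:
    """Rows of one style, ordered by timestamp; event-only styles are rejected."""
    if style in EVENT_ONLY_STYLES:
        raise ValueError(f"style {style!r} is event-only; use an event resolver")
    rows = [row for row in persistent if row["style"] == style]
    return sorted(rows, key=lambda row: row["timestamp"])


def active_at_sketch(t: float, persistent: list[Row], style: str) -> Optional[Row]:
    """Latest row of `style` whose timestamp is <= t, or None."""
    rows = [r for r in _rows_of_style(persistent, style) if r["timestamp"] <= t]
    return rows[-1] if rows else None


def nth_prev_sketch(t: float, persistent: list[Row], style: str, offset: int = 1) -> Optional[Row]:
    """Row `offset` steps before the active row, or None if there is none."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    rows = [r for r in _rows_of_style(persistent, style) if r["timestamp"] <= t]
    idx = len(rows) - 1 - offset
    return rows[idx] if idx >= 0 else None


def nth_next_sketch(t: float, persistent: list[Row], style: str, offset: int = 1) -> Optional[Row]:
    """The `offset`-th row strictly after t, or None if there is none."""
    if offset < 1:
        raise ValueError("offset must be at least 1")
    rows = [r for r in _rows_of_style(persistent, style) if r["timestamp"] > t]
    return rows[offset - 1] if len(rows) >= offset else None
```

Under this reading, a frame at `t=0.5` still sees "subtask 0" as active, and the switch to "subtask 1" happens exactly at `t=1.0`, which matches the assertions in `test_resolver_temporal_semantics` and `test_nth_prev_and_next`.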
@@ -1,193 +0,0 @@
-#!/usr/bin/env python
-
-# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-"""
-Tests for subtask functionality in LeRobotDataset.
-
-These tests verify that:
- Subtask information is correctly loaded from datasets that have subtask data
- The __getitem__ method correctly adds subtask strings to returned items
- Subtask handling gracefully handles missing data
-"""
-
-import pytest
-
-pytest.importorskip("pandas", reason="pandas is required (install lerobot[dataset])")
-
-import pandas as pd  # noqa: E402
-import torch
-
-from lerobot.datasets.lerobot_dataset import LeRobotDataset
-
-
-class TestSubtaskDataset:
-    """Tests for subtask handling in LeRobotDataset."""
-
-    @pytest.fixture
-    def subtask_dataset(self):
-        """Load the test subtask dataset from the hub."""
-        # Use lerobot/pusht-subtask dataset with episode 1
-        return LeRobotDataset(
-            repo_id="lerobot/pusht-subtask",
-            episodes=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
-        )
-
-    def test_subtask_dataset_loads(self, subtask_dataset):
-        """Test that the subtask dataset loads successfully."""
-        assert subtask_dataset is not None
-        assert len(subtask_dataset) > 0
-
-    def test_subtask_metadata_loaded(self, subtask_dataset):
-        """Test that subtask metadata is loaded when present in dataset."""
-        # The dataset should have subtasks metadata loaded
-        assert subtask_dataset.meta.subtasks is not None
-        assert isinstance(subtask_dataset.meta.subtasks, pd.DataFrame)
-
-    def test_subtask_index_in_features(self, subtask_dataset):
-        """Test that subtask_index is a feature when dataset has subtasks."""
-        assert "subtask_index" in subtask_dataset.features
-
-    def test_getitem_returns_subtask_string(self, subtask_dataset):
-        """Test that __getitem__ correctly adds subtask string to returned item."""
-        item = subtask_dataset[0]
-
-        # Subtask should be present in the returned item
-        assert "subtask" in item
-        assert isinstance(item["subtask"], str)
-        assert len(item["subtask"]) > 0  # Should not be empty
-
-    def test_getitem_has_subtask_index(self, subtask_dataset):
-        """Test that __getitem__ includes subtask_index."""
-        item = subtask_dataset[0]
-
-        assert "subtask_index" in item
-        assert isinstance(item["subtask_index"], torch.Tensor)
-
-    def test_subtask_index_maps_to_valid_subtask(self, subtask_dataset):
-        """Test that subtask_index correctly maps to a subtask in metadata."""
-        item = subtask_dataset[0]
-
-        subtask_idx = item["subtask_index"].item()
-        subtask_from_metadata = subtask_dataset.meta.subtasks.iloc[subtask_idx].name
-
-        assert item["subtask"] == subtask_from_metadata
-
-    def test_all_items_have_subtask(self, subtask_dataset):
-        """Test that all items in the dataset have subtask information."""
-        for i in range(min(len(subtask_dataset), 5)):  # Check first 5 items
-            item = subtask_dataset[i]
-            assert "subtask" in item
-            assert isinstance(item["subtask"], str)
-
-    def test_task_and_subtask_coexist(self, subtask_dataset):
-        """Test that both task and subtask are present in returned items."""
-        item = subtask_dataset[0]
-
-        # Both task and subtask should be present
-        assert "task" in item
-        assert "subtask" in item
-        assert isinstance(item["task"], str)
-        assert isinstance(item["subtask"], str)
-
-
-class TestSubtaskDatasetMissing:
-    """Tests for graceful handling when subtask data is missing."""
-
-    @pytest.fixture
-    def dataset_without_subtasks(self, tmp_path, empty_lerobot_dataset_factory):
-        """Create a dataset without subtask information."""
-        features = {"state": {"dtype": "float32", "shape": (2,), "names": None}}
-        dataset = empty_lerobot_dataset_factory(root=tmp_path / "no_subtask", features=features)
-
-        # Add some frames and save
-        for _ in range(5):
-            dataset.add_frame({"state": torch.randn(2), "task": "Test task"})
-        dataset.save_episode()
-        dataset.finalize()
-
-        # Reload the dataset
-        return LeRobotDataset(dataset.repo_id, root=dataset.root)
-
-    def test_no_subtask_in_features(self, dataset_without_subtasks):
-        """Test that subtask_index is not in features when not provided."""
-        assert "subtask_index" not in dataset_without_subtasks.features
-
-    def test_getitem_without_subtask(self, dataset_without_subtasks):
-        """Test that __getitem__ works when subtask is not present."""
-        item = dataset_without_subtasks[0]
-
-        # Item should still be retrievable
-        assert item is not None
-        assert "state" in item
-        assert "task" in item
-
-        # Subtask should NOT be present
-        assert "subtask" not in item
-
-    def test_subtasks_metadata_is_none(self, dataset_without_subtasks):
-        """Test that subtasks metadata is None when not present."""
-        assert dataset_without_subtasks.meta.subtasks is None
-
-
-class TestSubtaskEdgeCases:
-    """Edge case tests for subtask handling."""
-
-    def test_subtask_with_multiple_episodes(self):
-        """Test subtask handling with multiple episodes if available."""
-        try:
-            dataset = LeRobotDataset(
-                repo_id="lerobot/pusht-subtask",
-                episodes=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
-            )
-        except Exception:
-            pytest.skip("Could not load test-subtask dataset")
-
-        # Check first and last items have valid subtasks
-        first_item = dataset[0]
-        last_item = dataset[len(dataset) - 1]
-
-        assert "subtask" in first_item
-        assert "subtask" in last_item
-        assert isinstance(first_item["subtask"], str)
-        assert isinstance(last_item["subtask"], str)
-
-    def test_subtask_index_consistency(self):
-        """Test that same subtask_index returns same subtask string."""
-        try:
-            dataset = LeRobotDataset(
-                repo_id="lerobot/pusht-subtask",
-                episodes=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
-            )
-        except Exception:
-            pytest.skip("Could not load test-subtask dataset")
-
-        if len(dataset) < 2:
-            pytest.skip("Dataset too small for this test")
-
-        # Collect subtask_index to subtask mappings
-        subtask_map = {}
-        for i in range(min(len(dataset), 10)):
-            item = dataset[i]
-            idx = item["subtask_index"].item()
-            subtask = item["subtask"]
-
-            if idx in subtask_map:
-                # Same index should always return same subtask
-                assert subtask_map[idx] == subtask, (
-                    f"Inconsistent subtask for index {idx}: '{subtask_map[idx]}' vs '{subtask}'"
-                )
-            else:
-                subtask_map[idx] = subtask
@@ -0,0 +1,56 @@
+#!/usr/bin/env python
+
+import torch
+
+from lerobot.configs.recipe import MessageTurn, TrainingRecipe
+from lerobot.processor.converters import create_transition
+from lerobot.processor.render_messages_processor import RenderMessagesStep
+from lerobot.types import TransitionKey
+
+
+def test_render_messages_step_noops_without_language_columns():
+    recipe = TrainingRecipe(
+        messages=[
+            MessageTurn(role="user", content="${task}", stream="high_level"),
+            MessageTurn(role="assistant", content="${subtask}", stream="low_level", target=True),
+        ]
+    )
+    transition = create_transition(complementary_data={"task": "do it"})
+
+    assert RenderMessagesStep(recipe)(transition) == transition
+
+
+def test_render_messages_step_renders_and_drops_raw_language():
+    recipe = TrainingRecipe(
+        messages=[
+            MessageTurn(role="user", content="${task}", stream="high_level"),
+            MessageTurn(role="assistant", content="${subtask}", stream="low_level", target=True),
+        ]
+    )
+    transition = create_transition(
+        complementary_data={
+            "task": "do it",
+            "timestamp": torch.tensor(0.0),
+            "index": torch.tensor(7),
+            "language_persistent": [
+                {
+                    "role": "assistant",
+                    "content": "reach carefully",
+                    "style": "subtask",
+                    "timestamp": 0.0,
+                    "camera": None,
+                    "tool_calls": None,
+                }
+            ],
+            "language_events": [],
+        }
+    )
+
+    out = RenderMessagesStep(recipe)(transition)
+    data = out[TransitionKey.COMPLEMENTARY_DATA]
+
+    assert "language_persistent" not in data
+    assert "language_events" not in data
+    assert data["messages"][-1]["content"] == "reach carefully"
+    assert data["message_streams"] == ["high_level", "low_level"]
+    assert data["target_message_indices"] == [1]
@@ -0,0 +1,36 @@
+#!/usr/bin/env python
+
+import torch
+
+from lerobot.utils.collate import lerobot_collate_fn
+
+
+def test_lerobot_collate_preserves_messages_and_drops_raw_language():
+    batch = [
+        {
+            "index": torch.tensor(0),
+            "messages": [{"role": "assistant", "content": "a"}],
+            "message_streams": ["low_level"],
+            "target_message_indices": [0],
+            "language_persistent": [{"content": "raw"}],
+            "language_events": [],
+        },
+        {
+            "index": torch.tensor(1),
+            "messages": [{"role": "assistant", "content": "b"}],
+            "message_streams": ["low_level"],
+            "target_message_indices": [0],
+            "language_persistent": [{"content": "raw"}],
+            "language_events": [],
+        },
+    ]
+
+    out = lerobot_collate_fn(batch)
+
+    assert out["index"].tolist() == [0, 1]
+    assert out["messages"][0][0]["content"] == "a"
+    assert out["messages"][1][0]["content"] == "b"
+    assert out["message_streams"] == [["low_level"], ["low_level"]]
+    assert out["target_message_indices"] == [[0], [0]]
+    assert "language_persistent" not in out
+    assert "language_events" not in out
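Review note: the collate test above expects rendered chat fields to stay as ragged per-sample lists, raw language columns to be dropped, and everything else to be batched as usual. A minimal sketch of that behaviour follows, assuming the five key names used in the test are the full set; `collate_sketch` and its `stack` parameter are hypothetical and this is not the body of `lerobot_collate_fn`.

```python
from collections.abc import Callable
from typing import Any

# Key names are copied from the test; treating them as the complete set
# is an assumption of this sketch.
RAW_LANGUAGE_KEYS = {"language_persistent", "language_events"}
PASSTHROUGH_KEYS = {"messages", "message_streams", "target_message_indices"}


def collate_sketch(
    batch: list[dict[str, Any]],
    stack: Callable[[list[Any]], Any] = list,
) -> dict[str, Any]:
    """Collate samples while keeping rendered messages as per-sample lists.

    `stack` merges the remaining fields; it defaults to `list` so the sketch
    runs without torch installed.
    """
    out: dict[str, Any] = {}
    for key in batch[0]:
        if key in RAW_LANGUAGE_KEYS:
            continue  # raw annotation rows are not forwarded to the model
        values = [sample[key] for sample in batch]
        # Chat fields have variable length per sample, so they are not stacked.
        out[key] = values if key in PASSTHROUGH_KEYS else stack(values)
    return out
```

In a torch pipeline, `stack` could be `torch.utils.data.default_collate`, which would turn the per-sample `index` tensors into a single batched tensor while the chat fields remain Python lists.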
@@ -2,39 +2,30 @@ version = 1
 revision = 2
 requires-python = ">=3.12"
 resolution-markers = [
-    "python_full_version >= '3.15' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'",
-    "python_full_version >= '3.15' and platform_machine == 's390x' and sys_platform == 'linux'",
-    "python_full_version == '3.14.*' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'",
+    "python_full_version >= '3.14' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'",
+    "python_full_version >= '3.14' and platform_machine == 's390x' and sys_platform == 'linux'",
    "python_full_version == '3.13.*' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'",
-    "python_full_version == '3.14.*' and platform_machine == 's390x' and sys_platform == 'linux'",
    "python_full_version == '3.13.*' and platform_machine == 's390x' and sys_platform == 'linux'",
    "python_full_version < '3.13' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'",
    "python_full_version < '3.13' and platform_machine == 's390x' and sys_platform == 'linux'",
-    "(python_full_version >= '3.15' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version >= '3.15' and platform_machine == 'arm64' and sys_platform == 'linux') or (python_full_version >= '3.15' and platform_machine == 'armv7l' and sys_platform == 'linux')",
-    "(python_full_version == '3.14.*' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version == '3.14.*' and platform_machine == 'arm64' and sys_platform == 'linux') or (python_full_version == '3.14.*' and platform_machine == 'armv7l' and sys_platform == 'linux')",
+    "(python_full_version >= '3.14' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version >= '3.14' and platform_machine == 'arm64' and sys_platform == 'linux') or (python_full_version >= '3.14' and platform_machine == 'armv7l' and sys_platform == 'linux')",
    "(python_full_version == '3.13.*' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version == '3.13.*' and platform_machine == 'arm64' and sys_platform == 'linux') or (python_full_version == '3.13.*' and platform_machine == 'armv7l' and sys_platform == 'linux')",
    "(python_full_version < '3.13' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version < '3.13' and platform_machine == 'arm64' and sys_platform == 'linux') or (python_full_version < '3.13' and platform_machine == 'armv7l' and sys_platform == 'linux')",
-    "(python_full_version >= '3.15' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'darwin') or (python_full_version >= '3.15' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
-    "python_full_version >= '3.15' and platform_machine == 's390x' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'",
-    "python_full_version >= '3.15' and platform_machine != 's390x' and sys_platform == 'emscripten'",
-    "python_full_version >= '3.15' and platform_machine == 's390x' and sys_platform == 'emscripten'",
-    "(python_full_version == '3.14.*' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'darwin') or (python_full_version == '3.14.*' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
+    "(python_full_version >= '3.14' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'darwin') or (python_full_version >= '3.14' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
+    "python_full_version >= '3.14' and platform_machine == 's390x' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'",
    "(python_full_version == '3.13.*' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'darwin') or (python_full_version == '3.13.*' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
-    "python_full_version == '3.14.*' and platform_machine == 's390x' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'",
    "python_full_version == '3.13.*' and platform_machine == 's390x' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'",
    "(python_full_version < '3.13' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'darwin') or (python_full_version < '3.13' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
    "python_full_version < '3.13' and platform_machine == 's390x' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'",
-    "python_full_version == '3.14.*' and platform_machine != 's390x' and sys_platform == 'emscripten'",
+    "python_full_version >= '3.14' and platform_machine != 's390x' and sys_platform == 'emscripten'",
+    "python_full_version >= '3.14' and platform_machine == 's390x' and sys_platform == 'emscripten'",
    "python_full_version == '3.13.*' and platform_machine != 's390x' and sys_platform == 'emscripten'",
-    "python_full_version == '3.14.*' and platform_machine == 's390x' and sys_platform == 'emscripten'",
    "python_full_version == '3.13.*' and platform_machine == 's390x' and sys_platform == 'emscripten'",
    "python_full_version < '3.13' and platform_machine != 's390x' and sys_platform == 'emscripten'",
    "python_full_version < '3.13' and platform_machine == 's390x' and sys_platform == 'emscripten'",
-    "(python_full_version >= '3.15' and platform_machine == 'x86_64' and sys_platform == 'darwin') or (python_full_version >= '3.15' and platform_machine != 's390x' and sys_platform == 'win32')",
-    "python_full_version >= '3.15' and platform_machine == 's390x' and sys_platform == 'win32'",
-    "(python_full_version == '3.14.*' and platform_machine == 'x86_64' and sys_platform == 'darwin') or (python_full_version == '3.14.*' and platform_machine != 's390x' and sys_platform == 'win32')",
+    "(python_full_version >= '3.14' and platform_machine == 'x86_64' and sys_platform == 'darwin') or (python_full_version >= '3.14' and platform_machine != 's390x' and sys_platform == 'win32')",
+    "python_full_version >= '3.14' and platform_machine == 's390x' and sys_platform == 'win32'",
    "(python_full_version == '3.13.*' and platform_machine == 'x86_64' and sys_platform == 'darwin') or (python_full_version == '3.13.*' and platform_machine != 's390x' and sys_platform == 'win32')",
-    "python_full_version == '3.14.*' and platform_machine == 's390x' and sys_platform == 'win32'",
    "python_full_version == '3.13.*' and platform_machine == 's390x' and sys_platform == 'win32'",
    "(python_full_version < '3.13' and platform_machine == 'x86_64' and sys_platform == 'darwin') or (python_full_version < '3.13' and platform_machine != 's390x' and sys_platform == 'win32')",
    "python_full_version < '3.13' and platform_machine == 's390x' and sys_platform == 'win32'",
@@ -1119,7 +1110,8 @@ source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "absl-py" },
    { name = "dm-env" },
-    { name = "dm-tree" },
+    { name = "dm-tree", version = "0.1.9", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.14'" },
+    { name = "dm-tree", version = "0.1.10", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.14'" },
    { name = "glfw" },
    { name = "labmaze" },
    { name = "lxml" },
@@ -1144,7 +1136,8 @@ version = "1.6"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "absl-py" },
-    { name = "dm-tree" },
+    { name = "dm-tree", version = "0.1.9", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.14'" },
+    { name = "dm-tree", version = "0.1.10", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.14'" },
    { name = "numpy" },
 ]
 sdist = { url = "https://files.pythonhosted.org/packages/62/c9/93e8d6239d5806508a2ee4b370e67c6069943ca149f59f533923737a99b7/dm-env-1.6.tar.gz", hash = "sha256:a436eb1c654c39e0c986a516cee218bea7140b510fceff63f97eb4fcff3d93de", size = 20187, upload-time = "2022-12-21T00:25:29.306Z" }
@@ -1156,11 +1149,22 @@ wheels = [
 name = "dm-tree"
 version = "0.1.9"
 source = { registry = "https://pypi.org/simple" }
+resolution-markers = [
+    "python_full_version >= '3.14' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'",
+    "python_full_version >= '3.14' and platform_machine == 's390x' and sys_platform == 'linux'",
+    "(python_full_version >= '3.14' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version >= '3.14' and platform_machine == 'arm64' and sys_platform == 'linux') or (python_full_version >= '3.14' and platform_machine == 'armv7l' and sys_platform == 'linux')",
+    "(python_full_version >= '3.14' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'darwin') or (python_full_version >= '3.14' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
+    "python_full_version >= '3.14' and platform_machine == 's390x' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'",
+    "python_full_version >= '3.14' and platform_machine != 's390x' and sys_platform == 'emscripten'",
+    "python_full_version >= '3.14' and platform_machine == 's390x' and sys_platform == 'emscripten'",
+    "(python_full_version >= '3.14' and platform_machine == 'x86_64' and sys_platform == 'darwin') or (python_full_version >= '3.14' and platform_machine != 's390x' and sys_platform == 'win32')",
+    "python_full_version >= '3.14' and platform_machine == 's390x' and sys_platform == 'win32'",
+]
 dependencies = [
-    { name = "absl-py" },
-    { name = "attrs" },
-    { name = "numpy" },
-    { name = "wrapt" },
+    { name = "absl-py", marker = "python_full_version >= '3.14'" },
+    { name = "attrs", marker = "python_full_version >= '3.14'" },
+    { name = "numpy", marker = "python_full_version >= '3.14'" },
+    { name = "wrapt", marker = "python_full_version >= '3.14'" },
 ]
 sdist = { url = "https://files.pythonhosted.org/packages/a6/83/ce29720ccf934c6cfa9b9c95ebbe96558386e66886626066632b5e44afed/dm_tree-0.1.9.tar.gz", hash = "sha256:a4c7db3d3935a5a2d5e4b383fc26c6b0cd6f78c6d4605d3e7b518800ecd5342b", size = 35623, upload-time = "2025-01-30T20:45:37.13Z" }
 wheels = [
@@ -1177,6 +1181,58 @@ wheels = [
    { url = "https://files.pythonhosted.org/packages/c5/37/15603079854394f16e3833a7b50696c1f3cbf30a2243a119f64f18a16f36/dm_tree-0.1.9-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e1f5d1e96b3a7de22b25b13a5eb30f41f8cf9c02dd4479a24920de99e780903c", size = 153052, upload-time = "2025-01-30T20:45:35.907Z" },
 ]

+[[package]]
+name = "dm-tree"
+version = "0.1.10"
+source = { registry = "https://pypi.org/simple" }
+resolution-markers = [
+    "python_full_version == '3.13.*' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'",
+    "python_full_version == '3.13.*' and platform_machine == 's390x' and sys_platform == 'linux'",
+    "python_full_version < '3.13' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'",
+    "python_full_version < '3.13' and platform_machine == 's390x' and sys_platform == 'linux'",
+    "(python_full_version == '3.13.*' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version == '3.13.*' and platform_machine == 'arm64' and sys_platform == 'linux') or (python_full_version == '3.13.*' and platform_machine == 'armv7l' and sys_platform == 'linux')",
+    "(python_full_version < '3.13' and platform_machine == 'aarch64' and sys_platform == 'linux') or (python_full_version < '3.13' and platform_machine == 'arm64' and sys_platform == 'linux') or (python_full_version < '3.13' and platform_machine == 'armv7l' and sys_platform == 'linux')",
+    "(python_full_version == '3.13.*' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'darwin') or (python_full_version == '3.13.*' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
+    "python_full_version == '3.13.*' and platform_machine == 's390x' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'",
+    "(python_full_version < '3.13' and platform_machine != 's390x' and platform_machine != 'x86_64' and sys_platform == 'darwin') or (python_full_version < '3.13' and platform_machine != 's390x' and sys_platform != 'darwin' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')",
+    "python_full_version < '3.13' and platform_machine == 's390x' and sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32'",
+    "python_full_version == '3.13.*' and platform_machine != 's390x' and sys_platform == 'emscripten'",
+    "python_full_version == '3.13.*' and platform_machine == 's390x' and sys_platform == 'emscripten'",
+    "python_full_version < '3.13' and platform_machine != 's390x' and sys_platform == 'emscripten'",
+    "python_full_version < '3.13' and platform_machine == 's390x' and sys_platform == 'emscripten'",
+    "(python_full_version == '3.13.*' and platform_machine == 'x86_64' and sys_platform == 'darwin') or (python_full_version == '3.13.*' and platform_machine != 's390x' and sys_platform == 'win32')",
+    "python_full_version == '3.13.*' and platform_machine == 's390x' and sys_platform == 'win32'",
+    "(python_full_version < '3.13' and platform_machine == 'x86_64' and sys_platform == 'darwin') or (python_full_version < '3.13' and platform_machine != 's390x' and sys_platform == 'win32')",
+    "python_full_version < '3.13' and platform_machine == 's390x' and sys_platform == 'win32'",
+]
+dependencies = [
+    { name = "absl-py", marker = "python_full_version < '3.14'" },
+    { name = "attrs", marker = "python_full_version < '3.14'" },
+    { name = "numpy", marker = "python_full_version < '3.14'" },
+    { name = "wrapt", marker = "python_full_version < '3.14'" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/5a/66/a3ec619d22b6baffa5ab853e8dc6ec9d0c837127948af59bb15b988d7312/dm_tree-0.1.10.tar.gz", hash = "sha256:22f37b599e01cc3402a17f79c257a802aebd8d326de05b54657650845956208a", size = 35748, upload-time = "2026-03-31T17:35:39.03Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/34/a1/17e0d68eec978c483db4712b14d083ee01484381b29ea85edb2b20210bd0/dm_tree-0.1.10-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:94af18e4fd22ce69eccae89eeed8ed498b6b4cc4957f4ed10b4160e59f620e1d", size = 315976, upload-time = "2026-03-31T17:35:15.21Z" },
+    { url = "https://files.pythonhosted.org/packages/cc/6f/ed603715fbc29c887a8985252e2cfe0d449497aea96bac51010159771617/dm_tree-0.1.10-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:b442a0c1e9d0960e0314a2e4af81fd328a87921b6d6db6dc41bfa420536884d6", size = 184053, upload-time = "2026-03-31T17:35:16.512Z" },
+    { url = "https://files.pythonhosted.org/packages/83/eb/1d55c679cee9a54e552480d308535753c72e2250cf720d7aa777bff2a4fe/dm_tree-0.1.10-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:012c2b376e88d3685c73a4b5c23be41fe933e14e380dcd90172971690b0e02d2", size = 186506, upload-time = "2026-03-31T17:35:17.593Z" },
+    { url = "https://files.pythonhosted.org/packages/89/2d/adef6924f8dc7f1665eea4ce066387820c14a629d0e1005568892d56ea6a/dm_tree-0.1.10-cp312-cp312-win_amd64.whl", hash = "sha256:da8d5b8995bea1b6bb93f457e0dad5d16e6e2344a6488ced55320e7f3fd50f56", size = 112708, upload-time = "2026-03-31T17:35:18.699Z" },
+    { url = "https://files.pythonhosted.org/packages/d6/29/f39e8412c16740f4c914c6674a04a66ace344ce5cb99b537c2270ef4f204/dm_tree-0.1.10-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:4a782f0382be16d66c9ed003e6992e56674504a1d9636f44d2807123f5df6343", size = 316108, upload-time = "2026-03-31T17:35:20.139Z" },
+    { url = "https://files.pythonhosted.org/packages/02/83/1b94d45351bd75a83976a88c9fcf109da6ce336f38a3b443703bb6b18e5d/dm_tree-0.1.10-cp313-cp313-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:0e8f8f1354f178112732b30d2293bc53d212ea64a9556db80a926af3d4647a6b", size = 183834, upload-time = "2026-03-31T17:35:21.463Z" },
+    { url = "https://files.pythonhosted.org/packages/2f/23/bd3e75cbff06a464339d32667d740acf49812b027142a013b54d2c4d830a/dm_tree-0.1.10-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:6d7134c0805294c640b94d85cc725084f0c5087bcda5a7fb38eeb7f479ecc37c", size = 186187, upload-time = "2026-03-31T17:35:23.495Z" },
+    { url = "https://files.pythonhosted.org/packages/aa/75/4b460253b9af862388940404b5df6a22b399800c850aab4724c95f8635f9/dm_tree-0.1.10-cp313-cp313-win_amd64.whl", hash = "sha256:b42e04482880b017d931511d7b5997be372fff26a1ee9b9be55eef03ef1c2918", size = 112768, upload-time = "2026-03-31T17:35:24.622Z" },
+    { url = "https://files.pythonhosted.org/packages/cb/ca/3b40a8a50f9c3492b795b157d769180edb5f2605e3c61ae826208f917baa/dm_tree-0.1.10-cp313-cp313t-macosx_10_13_universal2.whl", hash = "sha256:bde02efacca66514524922538b8a0c5dc15d482565d1c796edc34a726b376830", size = 324138, upload-time = "2026-03-31T17:35:25.627Z" },
+    { url = "https://files.pythonhosted.org/packages/83/e4/33c9218aa607f610e2b0334fc824c2abd5a6bc232bf0726cf275f88e639d/dm_tree-0.1.10-cp313-cp313t-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:033f9a063e1e19b6c65fb5c76079bd923044f5a6095357ad2637845513d47938", size = 185110, upload-time = "2026-03-31T17:35:26.784Z" },
+    { url = "https://files.pythonhosted.org/packages/6c/da/f8811666d61b6829ba1c2716c4119039428dd86078eddd120354aaf26a3b/dm_tree-0.1.10-cp313-cp313t-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:6d4237da7b072fff1e93db109ab545f00d2b978ead35e85e7a84908e15197826", size = 187013, upload-time = "2026-03-31T17:35:27.969Z" },
+    { url = "https://files.pythonhosted.org/packages/94/8d/135ddeea875fd1a2768e7aee6c224f92c9b7643ead1ec8b68bdbee52c60a/dm_tree-0.1.10-cp314-cp314-macosx_10_15_universal2.whl", hash = "sha256:f395390d6acfb5d39c564c8bbcaf35352a81eb2f0d34d449739039b2ef786e14", size = 316599, upload-time = "2026-03-31T17:35:29.339Z" },
+    { url = "https://files.pythonhosted.org/packages/cb/50/1eda610e9ca8ac59950ae028080e7c5320d7abc5567d6723d0cb3623e838/dm_tree-0.1.10-cp314-cp314-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:9c0f54547fbd4b82e88c71694b3836c90059b97102d3e36209f5d2fa66950964", size = 184263, upload-time = "2026-03-31T17:35:30.534Z" },
+    { url = "https://files.pythonhosted.org/packages/c7/59/07461ceb563702ba3943725bdf0e04be4de0ed7ef093837cdd2d67141d2a/dm_tree-0.1.10-cp314-cp314-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:cf6706ac425272c9b7e05f05a23a1ff3e670fb59a787f6089a638eea2d06f1d0", size = 186328, upload-time = "2026-03-31T17:35:31.894Z" },
+    { url = "https://files.pythonhosted.org/packages/88/af/d9c84787fefe9f7c35f474a945217c38396f2ca5ab06432fb566e32a7d1a/dm_tree-0.1.10-cp314-cp314-win_amd64.whl", hash = "sha256:a132047e846e769ddacefe77c42ae79bf3d0e9fce2a6adb638a0ea4cbadb8cdb", size = 114799, upload-time = "2026-03-31T17:35:33.361Z" },
+    { url = "https://files.pythonhosted.org/packages/fd/2c/2aaa63a510db520cd9e0c51e053a608486169bb9710f51f4ecf5699cebb4/dm_tree-0.1.10-cp314-cp314t-macosx_10_15_universal2.whl", hash = "sha256:23682221f63ad011dbd762ce5314740d7900b0426a2681614ea2472369b0c49c", size = 324205, upload-time = "2026-03-31T17:35:34.679Z" },
+    { url = "https://files.pythonhosted.org/packages/b0/89/a5a302bcf9c345e6bd0498627ee2aa12f0a1c3538d08a2f5884d3c6783ba/dm_tree-0.1.10-cp314-cp314t-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:8baeb3db1e92587d686022fb67a52f6c584a7d32bd98444ed3aafb399ad9ce67", size = 185113, upload-time = "2026-03-31T17:35:36.179Z" },
+    { url = "https://files.pythonhosted.org/packages/cc/e8/2d4fbc54bb68905588945cfb47c05445c66cab2d822b05827f1c62e23a70/dm_tree-0.1.10-cp314-cp314t-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:2236c9a4cf64ed0b04004a94902f39341be652b70dce322b33f08ada9b146baa", size = 187009, upload-time = "2026-03-31T17:35:37.584Z" },
+]
+
 [[package]]
 name = "docopt"
 version = "0.6.2"
@@ -1362,11 +1418,11 @@ sdist = { url = "https://files.pythonhosted.org/packages/5f/8e/c53d6f9a8bf3a86a6

 [[package]]
 name = "filelock"
-version = "3.29.0"
+version = "3.28.0"
 source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/b5/fe/997687a931ab51049acce6fa1f23e8f01216374ea81374ddee763c493db5/filelock-3.29.0.tar.gz", hash = "sha256:69974355e960702e789734cb4871f884ea6fe50bd8404051a3530bc07809cf90", size = 57571, upload-time = "2026-04-19T15:39:10.068Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/d6/17/6e8890271880903e3538660a21d63a6c1fea969ac71d0d6b608b78727fa9/filelock-3.28.0.tar.gz", hash = "sha256:4ed1010aae813c4ee8d9c660e4792475ee60c4a0ba76073ceaf862bd317e3ca6", size = 56474, upload-time = "2026-04-14T22:54:33.625Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/81/47/dd9a212ef6e343a6857485ffe25bba537304f1913bdbed446a23f7f592e1/filelock-3.29.0-py3-none-any.whl", hash = "sha256:96f5f6344709aa1572bbf631c640e4ebeeb519e08da902c39a001882f30ac258", size = 39812, upload-time = "2026-04-19T15:39:08.752Z" },
+    { url = "https://files.pythonhosted.org/packages/3b/21/2f728888c45033d34a417bfcd248ea2564c9e08ab1bfd301377cf05d5586/filelock-3.28.0-py3-none-any.whl", hash = "sha256:de9af6712788e7171df1b28b15eba2446c69721433fa427a9bee07b17820a9db", size = 39189, upload-time = "2026-04-14T22:54:32.037Z" },
 ]

 [[package]]
@@ -1563,14 +1619,14 @@ wheels = [

 [[package]]
 name = "gitpython"
-version = "3.1.47"
+version = "3.1.46"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "gitdb" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/c1/bd/50db468e9b1310529a19fce651b3b0e753b5c07954d486cba31bbee9a5d5/gitpython-3.1.47.tar.gz", hash = "sha256:dba27f922bd2b42cb54c87a8ab3cb6beb6bf07f3d564e21ac848913a05a8a3cd", size = 216978, upload-time = "2026-04-22T02:44:44.059Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/df/b5/59d16470a1f0dfe8c793f9ef56fd3826093fc52b3bd96d6b9d6c26c7e27b/gitpython-3.1.46.tar.gz", hash = "sha256:400124c7d0ef4ea03f7310ac2fbf7151e09ff97f2a3288d64a440c584a29c37f", size = 215371, upload-time = "2026-01-01T15:37:32.073Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/f2/c5/a1bc0996af85757903cf2bf444a7824e68e0035ce63fb41d6f76f9def68b/gitpython-3.1.47-py3-none-any.whl", hash = "sha256:489f590edfd6d20571b2c0e72c6a6ac6915ee8b8cd04572330e3842207a78905", size = 209547, upload-time = "2026-04-22T02:44:41.271Z" },
+    { url = "https://files.pythonhosted.org/packages/6a/09/e21df6aef1e1ffc0c816f0522ddc3f6dcded766c3261813131c78a704470/gitpython-3.1.46-py3-none-any.whl", hash = "sha256:79812ed143d9d25b6d176a10bb511de0f9c67b1fa641d82097b0ab90398a2058", size = 208620, upload-time = "2026-01-01T15:37:30.574Z" },
 ]

 [[package]]
@@ -1973,11 +2029,11 @@ wheels = [

 [[package]]
 name = "idna"
-version = "3.12"
+version = "3.11"
 source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/22/12/2948fbe5513d062169bd91f7d7b1cd97bc8894f32946b71fa39f6e63ca0c/idna-3.12.tar.gz", hash = "sha256:724e9952cc9e2bd7550ea784adb098d837ab5267ef67a1ab9cf7846bdbdd8254", size = 194350, upload-time = "2026-04-21T13:32:48.916Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/6f/6d/0703ccc57f3a7233505399edb88de3cbd678da106337b9fcde432b65ed60/idna-3.11.tar.gz", hash = "sha256:795dafcc9c04ed0c1fb032c2aa73654d8e8c5023a7df64a53f39190ada629902", size = 194582, upload-time = "2025-10-12T14:55:20.501Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/53/b2/acc33950394b3becb2b664741a0c0889c7ef9f9ffbfa8d47eddb53a50abd/idna-3.12-py3-none-any.whl", hash = "sha256:60ffaa1858fac94c9c124728c24fcde8160f3fb4a7f79aa8cdd33a9d1af60a67", size = 68634, upload-time = "2026-04-21T13:32:47.403Z" },
+    { url = "https://files.pythonhosted.org/packages/0e/61/66938bbb5fc52dbdf84594873d5b51fb1f7c7794e9c0f5bd885f30bc507b/idna-3.11-py3-none-any.whl", hash = "sha256:771a87f49d9defaf64091e6e6fe9c18d4833f140bd19464795bc32d966ca37ea", size = 71008, upload-time = "2025-10-12T14:55:18.883Z" },
 ]

 [[package]]
@@ -2289,7 +2345,7 @@ wheels = [

 [[package]]
 name = "jupyter-events"
-version = "0.12.1"
+version = "0.12.0"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "jsonschema", extra = ["format-nongpl"] },
@@ -2301,9 +2357,9 @@ dependencies = [
    { name = "rfc3986-validator" },
    { name = "traitlets" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/18/f8/475c4241b2b75af0deaae453ed003c6c851766dbc44d332d8baf245dc931/jupyter_events-0.12.1.tar.gz", hash = "sha256:faff25f77218335752f35f23c5fe6e4a392a7bd99a5939ccb9b8fbf594636cf3", size = 62854, upload-time = "2026-04-20T23:17:50.66Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/9d/c3/306d090461e4cf3cd91eceaff84bede12a8e52cd821c2d20c9a4fd728385/jupyter_events-0.12.0.tar.gz", hash = "sha256:fc3fce98865f6784c9cd0a56a20644fc6098f21c8c33834a8d9fe383c17e554b", size = 62196, upload-time = "2025-02-03T17:23:41.485Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/eb/6c/6fcde0c8f616ed360ffd3587f7db9e225a7e62b583a04494d2f069cf64ea/jupyter_events-0.12.1-py3-none-any.whl", hash = "sha256:c366585253f537a627da52fa7ca7410c5b5301fe893f511e7b077c2d93ec8bcf", size = 19512, upload-time = "2026-04-20T23:17:48.927Z" },
+    { url = "https://files.pythonhosted.org/packages/e2/48/577993f1f99c552f18a0428731a755e06171f9902fa118c379eb7c04ea22/jupyter_events-0.12.0-py3-none-any.whl", hash = "sha256:6464b2fa5ad10451c3d35fabc75eab39556ae1e2853ad0c0cc31b656731a97fb", size = 19430, upload-time = "2025-02-03T17:23:38.643Z" },
 ]

 [[package]]
@@ -2737,7 +2793,8 @@ gamepad = [
 groot = [
    { name = "decord", marker = "platform_machine == 'AMD64' or platform_machine == 'x86_64'" },
    { name = "diffusers" },
-    { name = "dm-tree" },
+    { name = "dm-tree", version = "0.1.9", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.14'" },
+    { name = "dm-tree", version = "0.1.10", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.14'" },
    { name = "flash-attn", marker = "sys_platform != 'darwin'" },
    { name = "ninja" },
    { name = "peft" },
@@ -3191,82 +3248,82 @@ wheels = [

 [[package]]
 name = "lxml"
-version = "6.1.0"
+version = "6.0.4"
 source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/28/30/9abc9e34c657c33834eaf6cd02124c61bdf5944d802aa48e69be8da3585d/lxml-6.1.0.tar.gz", hash = "sha256:bfd57d8008c4965709a919c3e9a98f76c2c7cb319086b3d26858250620023b13", size = 4197006, upload-time = "2026-04-18T04:32:51.613Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/ce/08/1217ca4043f55c3c92993b283a7dbfa456a2058d8b57bbb416cc96b6efff/lxml-6.0.4.tar.gz", hash = "sha256:4137516be2a90775f99d8ef80ec0283f8d78b5d8bd4630ff20163b72e7e9abf2", size = 4237780, upload-time = "2026-04-12T16:28:24.182Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/d2/d4/9326838b59dc36dfae42eec9656b97520f9997eee1de47b8316aaeed169c/lxml-6.1.0-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:d2f17a16cd8751e8eb233a7e41aecdf8e511712e00088bf9be455f604cd0d28d", size = 8570663, upload-time = "2026-04-18T04:27:48.253Z" },
-    { url = "https://files.pythonhosted.org/packages/d8/a4/053745ce1f8303ccbb788b86c0db3a91b973675cefc42566a188637b7c40/lxml-6.1.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:f0cea5b1d3e6e77d71bd2b9972eb2446221a69dc52bb0b9c3c6f6e5700592d93", size = 4624024, upload-time = "2026-04-18T04:27:52.594Z" },
-    { url = "https://files.pythonhosted.org/packages/90/97/a517944b20f8fd0932ad2109482bee4e29fe721416387a363306667941f6/lxml-6.1.0-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:fc46da94826188ed45cb53bd8e3fc076ae22675aea2087843d4735627f867c6d", size = 4930895, upload-time = "2026-04-18T04:32:56.29Z" },
-    { url = "https://files.pythonhosted.org/packages/94/7c/e08a970727d556caa040a44773c7b7e3ad0f0d73dedc863543e9a8b931f2/lxml-6.1.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:9147d8e386ec3b82c3b15d88927f734f565b0aaadef7def562b853adca45784a", size = 5093820, upload-time = "2026-04-18T04:32:58.94Z" },
-    { url = "https://files.pythonhosted.org/packages/88/ee/2a5c2aa2c32016a226ca25d3e1056a8102ea6e1fe308bf50213586635400/lxml-6.1.0-cp312-cp312-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:5715e0e28736a070f3f34a7ccc09e2fdcba0e3060abbcf61a1a5718ff6d6b105", size = 5005790, upload-time = "2026-04-18T04:33:01.272Z" },
-    { url = "https://files.pythonhosted.org/packages/e3/38/a0db9be8f38ad6043ab9429487c128dd1d30f07956ef43040402f8da49e8/lxml-6.1.0-cp312-cp312-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:4937460dc5df0cdd2f06a86c285c28afda06aefa3af949f9477d3e8df430c485", size = 5630827, upload-time = "2026-04-18T04:33:04.036Z" },
-    { url = "https://files.pythonhosted.org/packages/31/ba/3c13d3fc24b7cacf675f808a3a1baabf43a30d0cd24c98f94548e9aa58eb/lxml-6.1.0-cp312-cp312-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:bc783ee3147e60a25aa0445ea82b3e8aabb83b240f2b95d32cb75587ff781814", size = 5240445, upload-time = "2026-04-18T04:33:06.87Z" },
-    { url = "https://files.pythonhosted.org/packages/55/ba/eeef4ccba09b2212fe239f46c1692a98db1878e0872ae320756488878a94/lxml-6.1.0-cp312-cp312-manylinux_2_28_i686.whl", hash = "sha256:40d9189f80075f2e1f88db21ef815a2b17b28adf8e50aaf5c789bfe737027f32", size = 5350121, upload-time = "2026-04-18T04:33:09.365Z" },
-    { url = "https://files.pythonhosted.org/packages/7e/01/1da87c7b587c38d0cbe77a01aae3b9c1c49ed47d76918ef3db8fc151b1ca/lxml-6.1.0-cp312-cp312-manylinux_2_31_armv7l.whl", hash = "sha256:05b9b8787e35bec69e68daf4952b2e6dfcfb0db7ecf1a06f8cdfbbac4eb71aad", size = 4694949, upload-time = "2026-04-18T04:33:11.628Z" },
-    { url = "https://files.pythonhosted.org/packages/a1/88/7db0fe66d5aaf128443ee1623dec3db1576f3e4c17751ec0ef5866468590/lxml-6.1.0-cp312-cp312-manylinux_2_38_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:0f0f08beb0182e3e9a86fae124b3c47a7b41b7b69b225e1377db983802404e54", size = 5243901, upload-time = "2026-04-18T04:33:13.95Z" },
-    { url = "https://files.pythonhosted.org/packages/00/a8/1346726af7d1f6fca1f11223ba34001462b0a3660416986d37641708d57c/lxml-6.1.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:73becf6d8c81d4c76b1014dbd3584cb26d904492dcf73ca85dc8bff08dcd6d2d", size = 5048054, upload-time = "2026-04-18T04:33:16.965Z" },
-    { url = "https://files.pythonhosted.org/packages/2e/b7/85057012f035d1a0c87e02f8c723ca3c3e6e0728bcf4cb62080b21b1c1e3/lxml-6.1.0-cp312-cp312-musllinux_1_2_armv7l.whl", hash = "sha256:1ae225f66e5938f4fa29d37e009a3bb3b13032ac57eb4eb42afa44f6e4054e69", size = 4777324, upload-time = "2026-04-18T04:33:19.832Z" },
-    { url = "https://files.pythonhosted.org/packages/75/6c/ad2f94a91073ef570f33718040e8e160d5fb93331cf1ab3ca1323f939e2d/lxml-6.1.0-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:690022c7fae793b0489aa68a658822cea83e0d5933781811cabbf5ea3bcfe73d", size = 5645702, upload-time = "2026-04-18T04:33:22.436Z" },
-    { url = "https://files.pythonhosted.org/packages/3b/89/0bb6c0bd549c19004c60eea9dc554dd78fd647b72314ef25d460e0d208c6/lxml-6.1.0-cp312-cp312-musllinux_1_2_riscv64.whl", hash = "sha256:63aeafc26aac0be8aff14af7871249e87ea1319be92090bfd632ec68e03b16a5", size = 5232901, upload-time = "2026-04-18T04:33:26.21Z" },
-    { url = "https://files.pythonhosted.org/packages/a1/d9/d609a11fb567da9399f525193e2b49847b5a409cdebe737f06a8b7126bdc/lxml-6.1.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:264c605ab9c0e4aa1a679636f4582c4d3313700009fac3ec9c3412ed0d8f3e1d", size = 5261333, upload-time = "2026-04-18T04:33:28.984Z" },
-    { url = "https://files.pythonhosted.org/packages/a6/3a/ac3f99ec8ac93089e7dd556f279e0d14c24de0a74a507e143a2e4b496e7c/lxml-6.1.0-cp312-cp312-win32.whl", hash = "sha256:56971379bc5ee8037c5a0f09fa88f66cdb7d37c3e38af3e45cf539f41131ac1f", size = 3596289, upload-time = "2026-04-18T04:27:42.819Z" },
-    { url = "https://files.pythonhosted.org/packages/f2/a7/0a915557538593cb1bbeedcd40e13c7a261822c26fecbbdb71dad0c2f540/lxml-6.1.0-cp312-cp312-win_amd64.whl", hash = "sha256:bba078de0031c219e5dd06cf3e6bf8fb8e6e64a77819b358f53bb132e3e03366", size = 3997059, upload-time = "2026-04-18T04:27:46.764Z" },
-    { url = "https://files.pythonhosted.org/packages/92/96/a5dc078cf0126fbfbc35611d77ecd5da80054b5893e28fb213a5613b9e1d/lxml-6.1.0-cp312-cp312-win_arm64.whl", hash = "sha256:c3592631e652afa34999a088f98ba7dfc7d6aff0d535c410bea77a71743f3819", size = 3659552, upload-time = "2026-04-18T04:27:51.133Z" },
-    { url = "https://files.pythonhosted.org/packages/08/03/69347590f1cf4a6d5a4944bb6099e6d37f334784f16062234e1f892fdb1d/lxml-6.1.0-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:a0092f2b107b69601adf562a57c956fbb596e05e3e6651cabd3054113b007e45", size = 8559689, upload-time = "2026-04-18T04:31:57.785Z" },
-    { url = "https://files.pythonhosted.org/packages/3f/58/25e00bb40b185c974cfe156c110474d9a8a8390d5f7c92a4e328189bb60e/lxml-6.1.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:fc7140d7a7386e6b545d41b7358f4d02b656d4053f5fa6859f92f4b9c2572c4d", size = 4617892, upload-time = "2026-04-18T04:32:01.78Z" },
-    { url = "https://files.pythonhosted.org/packages/f5/54/92ad98a94ac318dc4f97aaac22ff8d1b94212b2ae8af5b6e9b354bf825f7/lxml-6.1.0-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:419c58fc92cc3a2c3fa5f78c63dbf5da70c1fa9c1b25f25727ecee89a96c7de2", size = 4923489, upload-time = "2026-04-18T04:33:31.401Z" },
-    { url = "https://files.pythonhosted.org/packages/15/3b/a20aecfab42bdf4f9b390590d345857ad3ffd7c51988d1c89c53a0c73faf/lxml-6.1.0-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:37fabd1452852636cf38ecdcc9dd5ca4bba7a35d6c53fa09725deeb894a87491", size = 5082162, upload-time = "2026-04-18T04:33:34.262Z" },
-    { url = "https://files.pythonhosted.org/packages/45/26/2cdb3d281ac1bd175603e290cbe4bad6eff127c0f8de90bafd6f8548f0fd/lxml-6.1.0-cp313-cp313-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:a2853c8b2170cc6cd54a6b4d50d2c1a8a7aeca201f23804b4898525c7a152cfc", size = 4993247, upload-time = "2026-04-18T04:33:36.674Z" },
-    { url = "https://files.pythonhosted.org/packages/f6/05/d735aef963740022a08185c84821f689fc903acb3d50326e6b1e9886cc22/lxml-6.1.0-cp313-cp313-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:8e369cbd690e788c8d15e56222d91a09c6a417f49cbc543040cba0fe2e25a79e", size = 5613042, upload-time = "2026-04-18T04:33:39.205Z" },
-    { url = "https://files.pythonhosted.org/packages/ee/b8/ead7c10efff731738c72e59ed6eb5791854879fbed7ae98781a12006263a/lxml-6.1.0-cp313-cp313-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:e69aa6805905807186eb00e66c6d97a935c928275182eb02ee40ba00da9623b2", size = 5228304, upload-time = "2026-04-18T04:33:41.647Z" },
-    { url = "https://files.pythonhosted.org/packages/6b/10/e9842d2ec322ea65f0a7270aa0315a53abed06058b88ef1b027f620e7a5f/lxml-6.1.0-cp313-cp313-manylinux_2_28_i686.whl", hash = "sha256:4bd1bdb8a9e0e2dd229de19b5f8aebac80e916921b4b2c6ef8a52bc131d0c1f9", size = 5341578, upload-time = "2026-04-18T04:33:44.596Z" },
-    { url = "https://files.pythonhosted.org/packages/89/54/40d9403d7c2775fa7301d3ddd3464689bfe9ba71acc17dfff777071b4fdc/lxml-6.1.0-cp313-cp313-manylinux_2_31_armv7l.whl", hash = "sha256:cbd7b79cdcb4986ad78a2662625882747f09db5e4cd7b2ae178a88c9c51b3dfe", size = 4700209, upload-time = "2026-04-18T04:33:47.552Z" },
-    { url = "https://files.pythonhosted.org/packages/85/b2/bbdcc2cf45dfc7dfffef4fd97e5c47b15919b6a365247d95d6f684ef5e82/lxml-6.1.0-cp313-cp313-manylinux_2_38_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:43e4d297f11080ec9d64a4b1ad7ac02b4484c9f0e2179d9c4ef78e886e747b88", size = 5232365, upload-time = "2026-04-18T04:33:50.249Z" },
-    { url = "https://files.pythonhosted.org/packages/48/5a/b06875665e53aaba7127611a7bed3b7b9658e20b22bc2dd217a0b7ab0091/lxml-6.1.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:cc16682cc987a3da00aa56a3aa3075b08edb10d9b1e476938cfdbee8f3b67181", size = 5043654, upload-time = "2026-04-18T04:33:52.71Z" },
-    { url = "https://files.pythonhosted.org/packages/e9/9c/e71a069d09641c1a7abeb30e693f828c7c90a41cbe3d650b2d734d876f85/lxml-6.1.0-cp313-cp313-musllinux_1_2_armv7l.whl", hash = "sha256:d6d8efe71429635f0559579092bb5e60560d7b9115ee38c4adbea35632e7fa24", size = 4769326, upload-time = "2026-04-18T04:33:55.244Z" },
-    { url = "https://files.pythonhosted.org/packages/cc/06/7a9cd84b3d4ed79adf35f874750abb697dec0b4a81a836037b36e47c091a/lxml-6.1.0-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:7e39ab3a28af7784e206d8606ec0e4bcad0190f63a492bca95e94e5a4aef7f6e", size = 5635879, upload-time = "2026-04-18T04:33:58.509Z" },
-    { url = "https://files.pythonhosted.org/packages/cc/f0/9d57916befc1e54c451712c7ee48e9e74e80ae4d03bdce49914e0aee42cd/lxml-6.1.0-cp313-cp313-musllinux_1_2_riscv64.whl", hash = "sha256:9eb667bf50856c4a58145f8ca2d5e5be160191e79eb9e30855a476191b3c3495", size = 5224048, upload-time = "2026-04-18T04:34:00.943Z" },
-    { url = "https://files.pythonhosted.org/packages/99/75/90c4eefda0c08c92221fe0753db2d6699a4c628f76ff4465ec20dea84cc1/lxml-6.1.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:7f4a77d6f7edf9230cee3e1f7f6764722a41604ee5681844f18db9a81ea0ec33", size = 5250241, upload-time = "2026-04-18T04:34:03.365Z" },
-    { url = "https://files.pythonhosted.org/packages/5e/73/16596f7e4e38fa33084b9ccbccc22a15f82a290a055126f2c1541236d2ff/lxml-6.1.0-cp313-cp313-win32.whl", hash = "sha256:28902146ffbe5222df411c5d19e5352490122e14447e98cd118907ee3fd6ee62", size = 3596938, upload-time = "2026-04-18T04:31:56.206Z" },
-    { url = "https://files.pythonhosted.org/packages/8e/63/981401c5680c1eb30893f00a19641ac80db5d1e7086c62cb4b13ed813038/lxml-6.1.0-cp313-cp313-win_amd64.whl", hash = "sha256:4a1503c56e4e2b38dc76f2f2da7bae69670c0f1933e27cfa34b2fa5876410b16", size = 3995728, upload-time = "2026-04-18T04:31:58.763Z" },
-    { url = "https://files.pythonhosted.org/packages/e7/e8/c358a38ac3e541d16a1b527e4e9cb78c0419b0506a070ace11777e5e8404/lxml-6.1.0-cp313-cp313-win_arm64.whl", hash = "sha256:e0af85773850417d994d019741239b901b22c6680206f46a34766926e466141d", size = 3658372, upload-time = "2026-04-18T04:32:03.629Z" },
-    { url = "https://files.pythonhosted.org/packages/eb/45/cee4cf203ef0bab5c52afc118da61d6b460c928f2893d40023cfa27e0b80/lxml-6.1.0-cp314-cp314-macosx_10_15_universal2.whl", hash = "sha256:ab863fd37458fed6456525f297d21239d987800c46e67da5ef04fc6b3dd93ac8", size = 8576713, upload-time = "2026-04-18T04:32:06.831Z" },
-    { url = "https://files.pythonhosted.org/packages/8a/a7/eda05babeb7e046839204eaf254cd4d7c9130ce2bbf0d9e90ea41af5654d/lxml-6.1.0-cp314-cp314-macosx_10_15_x86_64.whl", hash = "sha256:6fd8b1df8254ff4fd93fd31da1fc15770bde23ac045be9bb1f87425702f61cc9", size = 4623874, upload-time = "2026-04-18T04:32:10.755Z" },
-    { url = "https://files.pythonhosted.org/packages/e7/e9/db5846de9b436b91890a62f29d80cd849ea17948a49bf532d5278ee69a9e/lxml-6.1.0-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:47024feaae386a92a146af0d2aeed65229bf6fff738e6a11dda6b0015fb8fd03", size = 4949535, upload-time = "2026-04-18T04:34:06.657Z" },
-    { url = "https://files.pythonhosted.org/packages/5a/ba/0d3593373dcae1d68f40dc3c41a5a92f2544e68115eb2f62319a4c2a6500/lxml-6.1.0-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:3f00972f84450204cd5d93a5395965e348956aaceaadec693a22ec743f8ae3eb", size = 5086881, upload-time = "2026-04-18T04:34:09.556Z" },
-    { url = "https://files.pythonhosted.org/packages/43/76/759a7484539ad1af0d125a9afe9c3fb5f82a8779fd1f5f56319d9e4ea2fd/lxml-6.1.0-cp314-cp314-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:97faa0860e13b05b15a51fb4986421ef7a30f0b3334061c416e0981e9450ca4c", size = 5031305, upload-time = "2026-04-18T04:34:12.336Z" },
-    { url = "https://files.pythonhosted.org/packages/dc/b9/c1f0daf981a11e47636126901fd4ab82429e18c57aeb0fc3ad2940b42d8b/lxml-6.1.0-cp314-cp314-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:972a6451204798675407beaad97b868d0c733d9a74dafefc63120b81b8c2de28", size = 5647522, upload-time = "2026-04-18T04:34:14.89Z" },
-    { url = "https://files.pythonhosted.org/packages/31/e6/1f533dcd205275363d9ba3511bcec52fa2df86abf8abe6a5f2c599f0dc31/lxml-6.1.0-cp314-cp314-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:fe022f20bc4569ec66b63b3fb275a3d628d9d32da6326b2982584104db6d3086", size = 5239310, upload-time = "2026-04-18T04:34:17.652Z" },
-    { url = "https://files.pythonhosted.org/packages/c3/8c/4175fb709c78a6e315ed814ed33be3defd8b8721067e70419a6cf6f971da/lxml-6.1.0-cp314-cp314-manylinux_2_28_i686.whl", hash = "sha256:75c4c7c619a744f972f4451bf5adf6d0fb00992a1ffc9fd78e13b0bc817cc99f", size = 5350799, upload-time = "2026-04-18T04:34:20.529Z" },
-    { url = "https://files.pythonhosted.org/packages/fd/77/6ffdebc5994975f0dde4acb59761902bd9d9bb84422b9a0bd239a7da9ca8/lxml-6.1.0-cp314-cp314-manylinux_2_31_armv7l.whl", hash = "sha256:3648f20d25102a22b6061c688beb3a805099ea4beb0a01ce62975d926944d292", size = 4697693, upload-time = "2026-04-18T04:34:23.541Z" },
-    { url = "https://files.pythonhosted.org/packages/f8/f1/565f36bd5c73294602d48e04d23f81ff4c8736be6ba5e1d1ec670ac9be80/lxml-6.1.0-cp314-cp314-manylinux_2_38_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:77b9f99b17cbf14026d1e618035077060fc7195dd940d025149f3e2e830fbfcb", size = 5250708, upload-time = "2026-04-18T04:34:26.001Z" },
-    { url = "https://files.pythonhosted.org/packages/5a/11/a68ab9dd18c5c499404deb4005f4bc4e0e88e5b72cd755ad96efec81d18d/lxml-6.1.0-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:32662519149fd7a9db354175aa5e417d83485a8039b8aaa62f873ceee7ea4cad", size = 5084737, upload-time = "2026-04-18T04:34:28.32Z" },
-    { url = "https://files.pythonhosted.org/packages/ab/78/e8f41e2c74f4af564e6a0348aea69fb6daaefa64bc071ef469823d22cc18/lxml-6.1.0-cp314-cp314-musllinux_1_2_armv7l.whl", hash = "sha256:73d658216fc173cf2c939e90e07b941c5e12736b0bf6a99e7af95459cfe8eabb", size = 4737817, upload-time = "2026-04-18T04:34:30.784Z" },
-    { url = "https://files.pythonhosted.org/packages/06/2d/aa4e117aa2ce2f3b35d9ff246be74a2f8e853baba5d2a92c64744474603a/lxml-6.1.0-cp314-cp314-musllinux_1_2_ppc64le.whl", hash = "sha256:ac4db068889f8772a4a698c5980ec302771bb545e10c4b095d4c8be26749616f", size = 5670753, upload-time = "2026-04-18T04:34:33.675Z" },
-    { url = "https://files.pythonhosted.org/packages/08/f5/dd745d50c0409031dbfcc4881740542a01e54d6f0110bd420fa7782110b8/lxml-6.1.0-cp314-cp314-musllinux_1_2_riscv64.whl", hash = "sha256:45e9dfbd1b661eb64ba0d4dbe762bd210c42d86dd1e5bd2bdf89d634231beb43", size = 5238071, upload-time = "2026-04-18T04:34:36.12Z" },
-    { url = "https://files.pythonhosted.org/packages/3e/74/ad424f36d0340a904665867dab310a3f1f4c96ff4039698de83b77f44c1f/lxml-6.1.0-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:89e8d73d09ac696a5ba42ec69787913d53284f12092f651506779314f10ba585", size = 5264319, upload-time = "2026-04-18T04:34:39.035Z" },
-    { url = "https://files.pythonhosted.org/packages/53/36/a15d8b3514ec889bfd6aa3609107fcb6c9189f8dc347f1c0b81eded8d87c/lxml-6.1.0-cp314-cp314-win32.whl", hash = "sha256:ebe33f4ec1b2de38ceb225a1749a2965855bffeef435ba93cd2d5d540783bf2f", size = 3657139, upload-time = "2026-04-18T04:32:20.006Z" },
-    { url = "https://files.pythonhosted.org/packages/1a/a4/263ebb0710851a3c6c937180a9a86df1206fdfe53cc43005aa2237fd7736/lxml-6.1.0-cp314-cp314-win_amd64.whl", hash = "sha256:398443df51c538bd578529aa7e5f7afc6c292644174b47961f3bf87fe5741120", size = 4064195, upload-time = "2026-04-18T04:32:23.876Z" },
-    { url = "https://files.pythonhosted.org/packages/80/68/2000f29d323b6c286de077ad20b429fc52272e44eae6d295467043e56012/lxml-6.1.0-cp314-cp314-win_arm64.whl", hash = "sha256:8c8984e1d8c4b3949e419158fda14d921ff703a9ed8a47236c6eb7a2b6cb4946", size = 3741870, upload-time = "2026-04-18T04:32:27.922Z" },
-    { url = "https://files.pythonhosted.org/packages/30/e9/21383c7c8d43799f0da90224c0d7c921870d476ec9b3e01e1b2c0b8237c5/lxml-6.1.0-cp314-cp314t-macosx_10_15_universal2.whl", hash = "sha256:1081dd10bc6fa437db2500e13993abf7cc30716d0a2f40e65abb935f02ec559c", size = 8827548, upload-time = "2026-04-18T04:32:15.094Z" },
-    { url = "https://files.pythonhosted.org/packages/a5/01/c6bc11cd587030dd4f719f65c5657960649fe3e19196c844c75bf32cd0d6/lxml-6.1.0-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:dabecc48db5f42ba348d1f5d5afdc54c6c4cc758e676926c7cd327045749517d", size = 4735866, upload-time = "2026-04-18T04:32:18.924Z" },
-    { url = "https://files.pythonhosted.org/packages/f3/01/757132fff5f4acf25463b5298f1a46099f3a94480b806547b29ce5e385de/lxml-6.1.0-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:e3dd5fe19c9e0ac818a9c7f132a5e43c1339ec1cbbfecb1a938bd3a47875b7c9", size = 4969476, upload-time = "2026-04-18T04:34:41.889Z" },
-    { url = "https://files.pythonhosted.org/packages/fd/fb/1bc8b9d27ed64be7c8903db6c89e74dc8c2cd9ec630a7462e4654316dc5b/lxml-6.1.0-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:9e7b0a4ca6dcc007a4cef00a761bba2dea959de4bd2df98f926b33c92ca5dfb9", size = 5103719, upload-time = "2026-04-18T04:34:44.797Z" },
-    { url = "https://files.pythonhosted.org/packages/d5/e7/5bf82fa28133536a54601aae633b14988e89ed61d4c1eb6b899b023233aa/lxml-6.1.0-cp314-cp314t-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:5d27bbe326c6b539c64b42638b18bc6003a8d88f76213a97ac9ed4f885efeab7", size = 5027890, upload-time = "2026-04-18T04:34:47.634Z" },
-    { url = "https://files.pythonhosted.org/packages/2d/20/e048db5d4b4ea0366648aa595f26bb764b2670903fc585b87436d0a5032c/lxml-6.1.0-cp314-cp314t-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:c4e425db0c5445ef0ad56b0eec54f89b88b2d884656e536a90b2f52aecb4ca86", size = 5596008, upload-time = "2026-04-18T04:34:51.503Z" },
-    { url = "https://files.pythonhosted.org/packages/9a/c2/d10807bc8da4824b39e5bd01b5d05c077b6fd01bd91584167edf6b269d22/lxml-6.1.0-cp314-cp314t-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:4b89b098105b8599dc57adac95d1813409ac476d3c948a498775d3d0c6124bfb", size = 5224451, upload-time = "2026-04-18T04:34:54.263Z" },
-    { url = "https://files.pythonhosted.org/packages/3c/15/2ebea45bea427e7f0057e9ce7b2d62c5aba20c6b001cca89ed0aadb3ad41/lxml-6.1.0-cp314-cp314t-manylinux_2_28_i686.whl", hash = "sha256:c4a699432846df86cc3de502ee85f445ebad748a1c6021d445f3e514d2cd4b1c", size = 5312135, upload-time = "2026-04-18T04:34:56.818Z" },
-    { url = "https://files.pythonhosted.org/packages/31/e2/87eeae151b0be2a308d49a7ec444ff3eb192b14251e62addb29d0bf3778f/lxml-6.1.0-cp314-cp314t-manylinux_2_31_armv7l.whl", hash = "sha256:30e7b2ed63b6c8e97cca8af048589a788ab5c9c905f36d9cf1c2bb549f450d2f", size = 4639126, upload-time = "2026-04-18T04:34:59.704Z" },
-    { url = "https://files.pythonhosted.org/packages/a3/51/8a3f6a20902ad604dd746ec7b4000311b240d389dac5e9d95adefd349e0c/lxml-6.1.0-cp314-cp314t-manylinux_2_38_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:022981127642fe19866d2907d76241bb07ed21749601f727d5d5dd1ce5d1b773", size = 5232579, upload-time = "2026-04-18T04:35:02.658Z" },
-    { url = "https://files.pythonhosted.org/packages/6d/d2/650d619bdbe048d2c3f2c31edb00e35670a5e2d65b4fe3b61bce37b19121/lxml-6.1.0-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:23cad0cc86046d4222f7f418910e46b89971c5a45d3c8abfad0f64b7b05e4a9b", size = 5084206, upload-time = "2026-04-18T04:35:05.175Z" },
-    { url = "https://files.pythonhosted.org/packages/dd/8a/672ca1a3cbeabd1f511ca275a916c0514b747f4b85bdaae103b8fa92f307/lxml-6.1.0-cp314-cp314t-musllinux_1_2_armv7l.whl", hash = "sha256:21c3302068f50d1e8728c67c87ba92aa87043abee517aa2576cca1855326b405", size = 4758906, upload-time = "2026-04-18T04:35:08.098Z" },
-    { url = "https://files.pythonhosted.org/packages/be/f1/ef4b691da85c916cb2feb1eec7414f678162798ac85e042fa164419ac05c/lxml-6.1.0-cp314-cp314t-musllinux_1_2_ppc64le.whl", hash = "sha256:be10838781cb3be19251e276910cd508fe127e27c3242e50521521a0f3781690", size = 5620553, upload-time = "2026-04-18T04:35:11.23Z" },
-    { url = "https://files.pythonhosted.org/packages/59/17/94e81def74107809755ac2782fdad4404420f1c92ca83433d117a6d5acf0/lxml-6.1.0-cp314-cp314t-musllinux_1_2_riscv64.whl", hash = "sha256:2173a7bffe97667bbf0767f8a99e587740a8c56fdf3befac4b09cb29a80276fd", size = 5229458, upload-time = "2026-04-18T04:35:14.254Z" },
-    { url = "https://files.pythonhosted.org/packages/21/55/c4be91b0f830a871fc1b0d730943d56013b683d4671d5198260e2eae722b/lxml-6.1.0-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:c6854e9cf99c84beb004eecd7d3a3868ef1109bf2b1df92d7bc11e96a36c2180", size = 5247861, upload-time = "2026-04-18T04:35:17.006Z" },
-    { url = "https://files.pythonhosted.org/packages/c2/ca/77123e4d77df3cb1e968ade7b1f808f5d3a5c1c96b18a33895397de292c1/lxml-6.1.0-cp314-cp314t-win32.whl", hash = "sha256:00750d63ef0031a05331b9223463b1c7c02b9004cef2346a5b2877f0f9494dd2", size = 3897377, upload-time = "2026-04-18T04:32:07.656Z" },
-    { url = "https://files.pythonhosted.org/packages/64/ce/3554833989d074267c063209bae8b09815e5656456a2d332b947806b05ff/lxml-6.1.0-cp314-cp314t-win_amd64.whl", hash = "sha256:80410c3a7e3c617af04de17caa9f9f20adaa817093293d69eae7d7d0522836f5", size = 4392701, upload-time = "2026-04-18T04:32:12.113Z" },
-    { url = "https://files.pythonhosted.org/packages/2b/a0/9b916c68c0e57752c07f8f64b30138d9d4059dbeb27b90274dedbea128ff/lxml-6.1.0-cp314-cp314t-win_arm64.whl", hash = "sha256:26dd9f57ee3bd41e7d35b4c98a2ffd89ed11591649f421f0ec19f67d50ec67ac", size = 3817120, upload-time = "2026-04-18T04:32:15.803Z" },
+    { url = "https://files.pythonhosted.org/packages/3d/18/4732abab49bbb041b1ded9dd913ca89735a0dcca038eacec64c44ba02163/lxml-6.0.4-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:af0b8459c4e21a8417db967b2e453d1855022dac79c79b61fb8214f3da50f17e", size = 8570033, upload-time = "2026-04-12T16:24:10.728Z" },
+    { url = "https://files.pythonhosted.org/packages/72/7e/38523ec7178ca35376551911455d1b2766bc9d98bcc18f606a167fa9ecbb/lxml-6.0.4-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:e0cdcea2affa53fa17dc4bf5cefc0edf72583eac987d669493a019998a623fa3", size = 4623270, upload-time = "2026-04-12T16:24:13.2Z" },
+    { url = "https://files.pythonhosted.org/packages/f1/cf/f9b6c9bf9d8c63d923ef893915141767cea4cea71774f20c36d0c14e1585/lxml-6.0.4-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:8da4d4840c1bc07da6fcd647784f7fbaf538eeb7a57ce6b2487acc54c5e33330", size = 4929471, upload-time = "2026-04-12T16:24:15.453Z" },
+    { url = "https://files.pythonhosted.org/packages/e5/53/3117f988c9e20be4156d2b8e1bda82ae06878d11aeb820dea111a7cfa4e3/lxml-6.0.4-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:fb04a997588c3980894ded9172c10c5a3e45d3f1c5410472733626d268683806", size = 5092355, upload-time = "2026-04-12T16:24:17.876Z" },
+    { url = "https://files.pythonhosted.org/packages/4e/ca/05c6ac773a2bd3edb48fa8a5c5101e927ce044c4a8aed1a85ff00fab20a5/lxml-6.0.4-cp312-cp312-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:ca449642a08a6ceddf6e6775b874b6aee1b6242ed80aea84124497aba28e5384", size = 5004520, upload-time = "2026-04-12T16:24:20.184Z" },
+    { url = "https://files.pythonhosted.org/packages/f1/db/d8aa5aa3a51d0aa6706ef85f85027f7c972cd840fe69ba058ecaf32d093d/lxml-6.0.4-cp312-cp312-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:35b3ccdd137e62033662787dd4d2b8be900c686325d6b91e3b1ff6213d05ba11", size = 5629961, upload-time = "2026-04-12T16:24:22.242Z" },
+    { url = "https://files.pythonhosted.org/packages/9d/75/8fff4444e0493aeb15ab0f4a55c767b5baed9074cf67a1835dc1161f3a1f/lxml-6.0.4-cp312-cp312-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:45dc690c54b1341fec01743caed02e5f1ea49d7cfb81e3ba48903e5e844ed68a", size = 5237561, upload-time = "2026-04-12T16:24:24.572Z" },
+    { url = "https://files.pythonhosted.org/packages/2a/9f/6d6cd73014f2dbf47a8aa7accd9712726f46ef4891e1c126bc285cfb94e4/lxml-6.0.4-cp312-cp312-manylinux_2_28_i686.whl", hash = "sha256:15ae922e8f74b05798a0e88cee46c0244aaec6a66b5e00be7d18648fed8c432e", size = 5349197, upload-time = "2026-04-12T16:24:26.805Z" },
+    { url = "https://files.pythonhosted.org/packages/2d/43/e3e9a126e166234d1659d1dd9004dc1dd50cdc3c68575b071b0a1524b4de/lxml-6.0.4-cp312-cp312-manylinux_2_31_armv7l.whl", hash = "sha256:ebd816653707fbf10c65e3dee3bc24dac6b691654c21533b1ae49287433f4db0", size = 4693123, upload-time = "2026-04-12T16:24:28.812Z" },
+    { url = "https://files.pythonhosted.org/packages/6c/98/b146dd123a4a7b69b571ff23ea8e8c68de8d8c1b03e23d01c6374d4fd835/lxml-6.0.4-cp312-cp312-manylinux_2_38_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:21284cf36b95dd8be774eb06c304b440cf49ee811800a30080ce6d93700f0383", size = 5242967, upload-time = "2026-04-12T16:24:30.811Z" },
+    { url = "https://files.pythonhosted.org/packages/7e/60/8c275584452b55a902c883e8ab63d755c5ef35d7ad1f06f9e6559095521d/lxml-6.0.4-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:0c08a2a9d0c4028ef5fc5a513b2e1e51af069a83c5b4206139edd08b3b8c2926", size = 5046810, upload-time = "2026-04-12T16:24:33.289Z" },
+    { url = "https://files.pythonhosted.org/packages/19/aa/19ec216147e1105e5403fe73657c693a6e91bde855a13242dd6031e829e5/lxml-6.0.4-cp312-cp312-musllinux_1_2_armv7l.whl", hash = "sha256:1bc2f0f417112cf1a428599dd58125ab74d8e1c66893efd9b907cbb4a5db6e44", size = 4776383, upload-time = "2026-04-12T16:24:36.008Z" },
+    { url = "https://files.pythonhosted.org/packages/41/c8/90afdb838705a736268fcffd2698c05e9a129144ce215d5e14db3bdfc295/lxml-6.0.4-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:c0d86e328405529bc93913add9ff377e8b8ea9be878e611f19dbac7766a84483", size = 5643497, upload-time = "2026-04-12T16:24:38.276Z" },
+    { url = "https://files.pythonhosted.org/packages/32/ec/1135261ec9822dafb90be0ff6fb0ec79cee0b7fe878833dfe5f2b8c393bd/lxml-6.0.4-cp312-cp312-musllinux_1_2_riscv64.whl", hash = "sha256:3cce9420fe8f91eae5d457582599d282195c958cb670aa4bea313a79103ba33f", size = 5232185, upload-time = "2026-04-12T16:24:40.516Z" },
+    { url = "https://files.pythonhosted.org/packages/13/f2/7380b11cae6943720f525e5a28ad9dbead96ac710417e556b7c03f3a8af3/lxml-6.0.4-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:96214985ec194ce97b9028414e179cfb21230cba4e2413aee7e249461bb84f4d", size = 5259968, upload-time = "2026-04-12T16:24:42.917Z" },
+    { url = "https://files.pythonhosted.org/packages/65/8f/141734f2c456f2253fed4237d8d4b241e3d701129cf6f0b135ccf241a75a/lxml-6.0.4-cp312-cp312-win32.whl", hash = "sha256:b2209b310e7ed1d4cd1c00d405ec9c49722fce731c7036abc1d876bf8df78139", size = 3594958, upload-time = "2026-04-12T16:24:45.039Z" },
+    { url = "https://files.pythonhosted.org/packages/b7/a9/c6d3531c6d8814af0919fbdb9bda43c9e8b5deffcb70c8534017db233512/lxml-6.0.4-cp312-cp312-win_amd64.whl", hash = "sha256:03affcacfba4671ebc305813b02bfaf34d80b6a7c5b23eafc5d6da14a1a6e623", size = 3995897, upload-time = "2026-04-12T16:24:46.98Z" },
+    { url = "https://files.pythonhosted.org/packages/03/5d/1dabeddf762e5a315a31775b2bca39811d7e7a15fc3e677d044b9da973fe/lxml-6.0.4-cp312-cp312-win_arm64.whl", hash = "sha256:af9678e3a2a047465515d95a61690109af7a4c9486f708249119adcef7861049", size = 3658607, upload-time = "2026-04-12T16:24:49.19Z" },
+    { url = "https://files.pythonhosted.org/packages/78/f6/550a1ed9afde66e24bfcf9892446ea9779152df336062c6df0f7733151a2/lxml-6.0.4-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:ecc3d55ed756ee6c3447748862a97e1f5392d2c5d7f474bace9382345e4fc274", size = 8559522, upload-time = "2026-04-12T16:24:51.563Z" },
+    { url = "https://files.pythonhosted.org/packages/11/93/3f687c14d2b4d24b60fe13fd5482c8853f82a10bb87f2b577123e342ed1a/lxml-6.0.4-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:a7d5a627a368a0e861350ccc567a70ec675d2bc4d8b3b54f48995ae78d8d530e", size = 4617380, upload-time = "2026-04-12T16:24:54.042Z" },
+    { url = "https://files.pythonhosted.org/packages/b5/ed/91e443366063d3fb7640ae2badd5d7b65be4095ac6d849788e39c043baae/lxml-6.0.4-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:d385141b186cc39ebe4863c1e41936282c65df19b2d06a701dedc2a898877d6a", size = 4922791, upload-time = "2026-04-12T16:24:56.381Z" },
+    { url = "https://files.pythonhosted.org/packages/30/4b/2243260b70974aca9ba0cc71bd668c0c3a79644d80ddcabbfbdb4b131848/lxml-6.0.4-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:0132bb040e9bb5a199302e12bf942741defbc52922a2a06ce9ff7be0d0046483", size = 5080972, upload-time = "2026-04-12T16:24:58.823Z" },
+    { url = "https://files.pythonhosted.org/packages/f8/c3/54c53c4f772341bc12331557f8b0882a426f53133926306cbe6d7f0ee7e4/lxml-6.0.4-cp313-cp313-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:26aee5321e4aa1f07c9090a35f6ab8b703903fb415c6c823cfdb20ee0d779855", size = 4992236, upload-time = "2026-04-12T16:25:01.099Z" },
+    { url = "https://files.pythonhosted.org/packages/be/0f/416de42e22f287585abee610eb0d1c2638c9fe24cee7e15136e0b5e138f8/lxml-6.0.4-cp313-cp313-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:b5652455de198ff76e02cfa57d5efc5f834fa45521aaf3fcc13d6b5a88bde23d", size = 5612398, upload-time = "2026-04-12T16:25:03.517Z" },
+    { url = "https://files.pythonhosted.org/packages/7d/63/29a3fa79b8a182f5bd5b5bdcb6f625f49f08f41d60a26ca25482820a1b99/lxml-6.0.4-cp313-cp313-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:75842801fb48aea73f4c281b923a010dfb39bad75edf8ceb2198ec30c27f01cc", size = 5227480, upload-time = "2026-04-12T16:25:06.119Z" },
+    { url = "https://files.pythonhosted.org/packages/7c/4a/44d1843de599b1c6dbe578e4248c2f15e7fac90c5c86eb26775eaeac0fe0/lxml-6.0.4-cp313-cp313-manylinux_2_28_i686.whl", hash = "sha256:94a1f74607a5a049ff6ff8de429fec922e643e32b5b08ec7a4fe49e8de76e17c", size = 5341001, upload-time = "2026-04-12T16:25:08.563Z" },
+    { url = "https://files.pythonhosted.org/packages/0d/52/c8aebde49f169e4e3452e7756be35be1cb2903e30d961cb57aa65a27055f/lxml-6.0.4-cp313-cp313-manylinux_2_31_armv7l.whl", hash = "sha256:173cc246d3d3b6d3b6491f0b3aaf22ebdf2eed616879482acad8bd84d73eb231", size = 4699105, upload-time = "2026-04-12T16:25:10.757Z" },
+    { url = "https://files.pythonhosted.org/packages/78/60/76fc3735c31c28b70220d99452fb72052e84b618693ca2524da96f0131d8/lxml-6.0.4-cp313-cp313-manylinux_2_38_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:f0f2ee1be1b72e9890da87e4e422f2f703ff4638fd5ec5383055db431e8e30e9", size = 5231095, upload-time = "2026-04-12T16:25:13.305Z" },
+    { url = "https://files.pythonhosted.org/packages/e5/60/448f01c52110102f23df5f07b3f4fde57c8e13e497e182a743d125324c0b/lxml-6.0.4-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:c51a274b7e8b9ce394c3f8b471eb0b23c1914eec64fdccf674e082daf72abf11", size = 5042411, upload-time = "2026-04-12T16:25:15.541Z" },
+    { url = "https://files.pythonhosted.org/packages/4a/2a/90612a001fa4fa0ff0443ebb0256a542670fe35473734c559720293e7aff/lxml-6.0.4-cp313-cp313-musllinux_1_2_armv7l.whl", hash = "sha256:210ea934cba1a1ec42f88c4190c4d5c67b2d14321a8faed9b39e8378198ff99d", size = 4768431, upload-time = "2026-04-12T16:25:17.581Z" },
+    { url = "https://files.pythonhosted.org/packages/84/d8/572845a7d741c8a8ffeaf928185263e14d97fbd355de164677340951d7a5/lxml-6.0.4-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:14fe654a59eebe16368c51778caeb0c8fda6f897adcd9afe828d87d13b5d5e51", size = 5634972, upload-time = "2026-04-12T16:25:20.111Z" },
+    { url = "https://files.pythonhosted.org/packages/d7/1d/392b8c9f8cf1d502bbec50dee137c7af3dd5def5e5cd84572fbf0ba0541c/lxml-6.0.4-cp313-cp313-musllinux_1_2_riscv64.whl", hash = "sha256:ec160a2b7e2b3cb71ec35010b19a1adea05785d19ba5c9c5f986b64b78fef564", size = 5222909, upload-time = "2026-04-12T16:25:22.243Z" },
+    { url = "https://files.pythonhosted.org/packages/21/ab/949fc96f825cf083612aee65d5a02eacc5eaeb2815561220e33e1e160677/lxml-6.0.4-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:d305b86ef10b23cf3a6d62a2ad23fa296f76495183ee623f64d2600f65ffe09c", size = 5249096, upload-time = "2026-04-12T16:25:24.781Z" },
+    { url = "https://files.pythonhosted.org/packages/56/e8/fbe44df79ede5ff760401cc3c49c4204f49f0f529cc6b27d0af7b63f5472/lxml-6.0.4-cp313-cp313-win32.whl", hash = "sha256:a2f31380aa9a9b52591e79f1c1d3ac907688fbeb9d883ba28be70f2eb5db2277", size = 3595808, upload-time = "2026-04-12T16:25:26.747Z" },
+    { url = "https://files.pythonhosted.org/packages/f8/df/e873abb881092256520edf0d67d686e36f3c86b3cf289f01b6458272dede/lxml-6.0.4-cp313-cp313-win_amd64.whl", hash = "sha256:b8efa9f681f15043e497293d58a4a63199564b253ed2291887d92bb3f74f59ab", size = 3994635, upload-time = "2026-04-12T16:25:28.828Z" },
+    { url = "https://files.pythonhosted.org/packages/23/a8/9c56c8914b9b18d89face5a7472445002baf309167f7af65d988842129fd/lxml-6.0.4-cp313-cp313-win_arm64.whl", hash = "sha256:905abe6a5888129be18f85f2aea51f0c9863fa0722fb8530dfbb687d2841d221", size = 3657374, upload-time = "2026-04-12T16:25:30.901Z" },
+    { url = "https://files.pythonhosted.org/packages/10/18/36e28a809c509a67496202771f545219ac5a2f1cd61aae325991fcf5ab91/lxml-6.0.4-cp314-cp314-macosx_10_15_universal2.whl", hash = "sha256:569d3b18340863f603582d2124e742a68e85755eff5e47c26a55e298521e3a01", size = 8575045, upload-time = "2026-04-12T16:25:33.57Z" },
+    { url = "https://files.pythonhosted.org/packages/11/38/a168c820e3b08d3b4fa0f4e6b53b3930086b36cc11e428106d38c36778cd/lxml-6.0.4-cp314-cp314-macosx_10_15_x86_64.whl", hash = "sha256:3b6245ee5241342d45e1a54a4a8bc52ef322333ada74f24aa335c4ab36f20161", size = 4622963, upload-time = "2026-04-12T16:25:36.818Z" },
+    { url = "https://files.pythonhosted.org/packages/53/e0/2c9d6abdd82358cea3c0d8d6ca272a6af0f38156abce7827efb6d5b62d17/lxml-6.0.4-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:79a1173ba3213a3693889a435417d4e9f3c07d96e30dc7cc3a712ed7361015fe", size = 4948832, upload-time = "2026-04-12T16:25:39.104Z" },
+    { url = "https://files.pythonhosted.org/packages/96/d7/f2202852e91d7baf3a317f4523a9c14834145301e5b0f2e80c01c4bfbd49/lxml-6.0.4-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:dc18bb975666b443ba23aedd2fcf57e9d0d97546b52a1de97a447c4061ba4110", size = 5085865, upload-time = "2026-04-12T16:25:41.226Z" },
+    { url = "https://files.pythonhosted.org/packages/09/57/abee549324496e92708f71391c6060a164d3c95369656a1a15e9f20d8162/lxml-6.0.4-cp314-cp314-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:2079f5dc83291ac190a52f8354b78648f221ecac19fb2972a2d056b555824de7", size = 5030001, upload-time = "2026-04-12T16:25:43.695Z" },
+    { url = "https://files.pythonhosted.org/packages/c2/f8/432da7178c5917a16468af6c5da68fef7cf3357d4bd0e6f50272ec9a59b5/lxml-6.0.4-cp314-cp314-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:3eda02da4ca16e9ca22bbe5654470c17fa1abcd967a52e4c2e50ff278221e351", size = 5646303, upload-time = "2026-04-12T16:25:46.577Z" },
+    { url = "https://files.pythonhosted.org/packages/82/f9/e1c04ef667a6bf9c9dbd3bf04c50fa51d7ee25b258485bb748b27eb9a1c7/lxml-6.0.4-cp314-cp314-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:c3787cdc3832b70e21ac2efafea2a82a8ccb5e85bec110dc68b26023e9d3caae", size = 5237940, upload-time = "2026-04-12T16:25:49.157Z" },
+    { url = "https://files.pythonhosted.org/packages/d0/f0/cdea60d92df731725fc3c4f33e387b100f210acd45c92969e42d2ba993fa/lxml-6.0.4-cp314-cp314-manylinux_2_28_i686.whl", hash = "sha256:3f276d49c23103565d39440b9b3f4fc08fa22f5a96395ea4b4d4fea4458b1505", size = 5350050, upload-time = "2026-04-12T16:25:52.027Z" },
+    { url = "https://files.pythonhosted.org/packages/2e/15/bf52c7a70b6081bb9e00d37cc90fcf60aa84468d9d173ad2fade38ec34c5/lxml-6.0.4-cp314-cp314-manylinux_2_31_armv7l.whl", hash = "sha256:fdfdad73736402375b11b3a137e48cd09634177516baf5fc0bd80d1ca85f3cda", size = 4696409, upload-time = "2026-04-12T16:25:55.141Z" },
+    { url = "https://files.pythonhosted.org/packages/c5/69/9bade267332cc06f9a9aa773b5a11bdfb249af485df9e142993009ea1fc4/lxml-6.0.4-cp314-cp314-manylinux_2_38_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:75912421456946931daba0ec3cedfa824c756585d05bde97813a17992bfbd013", size = 5249072, upload-time = "2026-04-12T16:25:57.362Z" },
+    { url = "https://files.pythonhosted.org/packages/14/ca/043bcacb096d6ed291cbbc58724e9625a453069d6edeb840b0bf18038d05/lxml-6.0.4-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:48cd5a88da67233fd82f2920db344503c2818255217cd6ea462c9bb8254ba7cb", size = 5083779, upload-time = "2026-04-12T16:26:00.018Z" },
+    { url = "https://files.pythonhosted.org/packages/04/89/f5fb18d76985969e84af13682e489acabee399bb54738a363925ea6e7390/lxml-6.0.4-cp314-cp314-musllinux_1_2_armv7l.whl", hash = "sha256:87af86a8fa55b9ff1e6ee4233d762296f2ce641ba948af783fb995c5a8a3371b", size = 4736953, upload-time = "2026-04-12T16:26:02.289Z" },
+    { url = "https://files.pythonhosted.org/packages/84/ba/d1d7284bb4ba951f188c3fc0455943c1fcbd1c33d1324d6d57b7d4a45be6/lxml-6.0.4-cp314-cp314-musllinux_1_2_ppc64le.whl", hash = "sha256:a743714cd656ba7ccb29d199783906064c7b5ba3c0e2a79f0244ea0badc6a98c", size = 5669605, upload-time = "2026-04-12T16:26:04.694Z" },
+    { url = "https://files.pythonhosted.org/packages/72/05/1463e55f2de27bb60feddc894dd7c0833bd501f8861392ed416291b38db5/lxml-6.0.4-cp314-cp314-musllinux_1_2_riscv64.whl", hash = "sha256:e31c76bd066fb4f81d9a32e5843bffdf939ab27afb1ffc1c924e749bfbdb00e3", size = 5236886, upload-time = "2026-04-12T16:26:07.659Z" },
+    { url = "https://files.pythonhosted.org/packages/fe/fb/0b6ee9194ce3ac49db4cadaa8a9158f04779fc768b6c27c4e2945d71a99d/lxml-6.0.4-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:f185fd6e7d550e9917d7103dccf51be589aba953e15994fb04646c1730019685", size = 5263382, upload-time = "2026-04-12T16:26:10.067Z" },
+    { url = "https://files.pythonhosted.org/packages/9a/93/ec18a08e98dd82cac39f1d2511ee2bed5affb94d228356d8ef165a4ec3b9/lxml-6.0.4-cp314-cp314-win32.whl", hash = "sha256:774660028f8722a598400430d2746fb0075949f84a9a5cd9767d9152e3baaac5", size = 3656164, upload-time = "2026-04-12T16:26:59.568Z" },
+    { url = "https://files.pythonhosted.org/packages/15/86/52507316abfc7150bf6bb191e39a12e301ee80334610a493884ae2f9d20d/lxml-6.0.4-cp314-cp314-win_amd64.whl", hash = "sha256:fbd7d14349413f5609c0b537b1a48117d6ccef1af37986af6b03766ad05bf43e", size = 4062512, upload-time = "2026-04-12T16:27:02.212Z" },
+    { url = "https://files.pythonhosted.org/packages/f1/d5/09c593a2ef2234b8cd6cf059e2dc212e0654bf05c503f0ef2daf05adb680/lxml-6.0.4-cp314-cp314-win_arm64.whl", hash = "sha256:a61a01ec3fbfd5b73a69a7bf513271051fd6c5795d82fc5daa0255934cd8db3d", size = 3740745, upload-time = "2026-04-12T16:27:04.444Z" },
+    { url = "https://files.pythonhosted.org/packages/4a/3c/42a98bf6693938bf7b285ec7f70ba2ae9d785d0e5b2cdb85d2ee29e287eb/lxml-6.0.4-cp314-cp314t-macosx_10_15_universal2.whl", hash = "sha256:504edb62df33cea502ea6e73847c647ba228623ca3f80a228be5723a70984dd5", size = 8826437, upload-time = "2026-04-12T16:26:12.911Z" },
+    { url = "https://files.pythonhosted.org/packages/c2/c2/ad13f39b2db8709788aa2dcb6e90b81da76db3b5b2e7d35e0946cf984960/lxml-6.0.4-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:f01b7b0316d4c0926d49a7f003b2d30539f392b140a3374bb788bad180bc8478", size = 4734892, upload-time = "2026-04-12T16:26:15.871Z" },
+    { url = "https://files.pythonhosted.org/packages/2c/6d/c559d7b5922c5b0380fc2cb5ac134b6a3f9d79d368347a624ee5d68b0816/lxml-6.0.4-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:ab999933e662501efe4b16e6cfb7c9f9deca7d072cd1788b99c8defde78c0dfb", size = 4969173, upload-time = "2026-04-12T16:26:18.335Z" },
+    { url = "https://files.pythonhosted.org/packages/c7/78/ca521e36157f38e3e1a29276855cdf48d213138fc0c8365693ff5c876ca7/lxml-6.0.4-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:67c3f084389fe75932c39b6869a377f6c8e21e818f31ae8a30c71dd2e59360e2", size = 5103134, upload-time = "2026-04-12T16:26:20.612Z" },
+    { url = "https://files.pythonhosted.org/packages/28/a7/7d62d023bacaa0aaf60af8c0a77c6c05f84327396d755f3aa64b788678a9/lxml-6.0.4-cp314-cp314t-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:377ea1d654f76ed6205c87d14920f829c9f4d31df83374d3cbcbdaae804d37b2", size = 5027205, upload-time = "2026-04-12T16:26:22.981Z" },
+    { url = "https://files.pythonhosted.org/packages/34/be/51b194b81684f2e85e5d992771c45d70cb22ac6f7291ac6bc7b255830afe/lxml-6.0.4-cp314-cp314t-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:e60cd0bcacbfd1a96d63516b622183fb2e3f202300df9eb5533391a8a939dbfa", size = 5594461, upload-time = "2026-04-12T16:26:25.316Z" },
+    { url = "https://files.pythonhosted.org/packages/39/24/8850f38fbf89dd072ff31ba22f9e40347aeada7cadf710ecb04b8d9f32d4/lxml-6.0.4-cp314-cp314t-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:6e9e30fd63d41dd0bbdb020af5cdfffd5d9b554d907cb210f18e8fcdc8eac013", size = 5223378, upload-time = "2026-04-12T16:26:28.68Z" },
+    { url = "https://files.pythonhosted.org/packages/2a/9b/595239ba8c719b0fdc7bc9ebdb7564459c9a6b24b8b363df4a02674aeece/lxml-6.0.4-cp314-cp314t-manylinux_2_28_i686.whl", hash = "sha256:1fb4a1606bb68c533002e7ed50d7e55e58f0ef1696330670281cb79d5ab2050d", size = 5311415, upload-time = "2026-04-12T16:26:31.513Z" },
+    { url = "https://files.pythonhosted.org/packages/be/cb/aa27ac8d041acf34691577838494ad08df78e83fdfdb66948d2903e9291e/lxml-6.0.4-cp314-cp314t-manylinux_2_31_armv7l.whl", hash = "sha256:695c7708438e449d57f404db8cc1b769e77ad5b50655f32f8175686ba752f293", size = 4637953, upload-time = "2026-04-12T16:26:33.806Z" },
+    { url = "https://files.pythonhosted.org/packages/f6/f2/f19114fd86825c2d1ce41cd99daad218d30cfdd2093d4de9273986fb4d68/lxml-6.0.4-cp314-cp314t-manylinux_2_38_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:d49c35ae1e35ee9b569892cf8f8f88db9524f28d66e9daee547a5ef9f3c5f468", size = 5231532, upload-time = "2026-04-12T16:26:36.518Z" },
+    { url = "https://files.pythonhosted.org/packages/9a/0e/c3fa354039ec0b6b09f40fbe1129efc572ac6239faa4906de42d5ce87c0a/lxml-6.0.4-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:5801072f8967625e6249d162065d0d6011ef8ce3d0efb8754496b5246b81a74b", size = 5083767, upload-time = "2026-04-12T16:26:39.332Z" },
+    { url = "https://files.pythonhosted.org/packages/b3/4b/1a0dbb6d6ffae16e54a8a3796ded0ad2f9c3bc1ff3728bde33456f4e1d63/lxml-6.0.4-cp314-cp314t-musllinux_1_2_armv7l.whl", hash = "sha256:cbf768541526eba5ef1a49f991122e41b39781eafd0445a5a110fc09947a20b5", size = 4758079, upload-time = "2026-04-12T16:26:42.138Z" },
+    { url = "https://files.pythonhosted.org/packages/a9/01/a246cf5f80f96766051de4b305d6552f80bdaefb37f04e019e42af0aba69/lxml-6.0.4-cp314-cp314t-musllinux_1_2_ppc64le.whl", hash = "sha256:eecce87cc09233786fc31c230268183bf6375126cfec1c8b3673fcdc8767b560", size = 5618686, upload-time = "2026-04-12T16:26:44.507Z" },
+    { url = "https://files.pythonhosted.org/packages/eb/1f/b072a92369039ebef11b0a654be5134fcf3ed04c0f437faf9435ac9ba845/lxml-6.0.4-cp314-cp314t-musllinux_1_2_riscv64.whl", hash = "sha256:07dce892881179e11053066faca2da17b0eeb0bb7298f11bcf842a86db207dbd", size = 5227259, upload-time = "2026-04-12T16:26:47.083Z" },
+    { url = "https://files.pythonhosted.org/packages/d5/a0/dc97034f9d4c0c4d30875147d81fd2c0c7f3d261b109db36ed746bf8ab1d/lxml-6.0.4-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:e4f97aee337b947e6699e5574c90d087d3e2ce517016241c07e7e98a28dca885", size = 5246190, upload-time = "2026-04-12T16:26:49.468Z" },
+    { url = "https://files.pythonhosted.org/packages/f2/ef/85cb69835113583c2516fee07d0ffb4d824b557424b06ba5872c20ba6078/lxml-6.0.4-cp314-cp314t-win32.whl", hash = "sha256:064477c0d4c695aa1ea4b9c1c4ee9043ab740d12135b74c458cc658350adcd86", size = 3896005, upload-time = "2026-04-12T16:26:52.163Z" },
+    { url = "https://files.pythonhosted.org/packages/3d/5e/2231f34cc54b8422b793593138d86d3fa4588fb2297d4ea0472390f25627/lxml-6.0.4-cp314-cp314t-win_amd64.whl", hash = "sha256:25bad2d8438f4ef5a7ad4a8d8bcaadde20c0daced8bdb56d46236b0a7d1cbdd0", size = 4391037, upload-time = "2026-04-12T16:26:54.398Z" },
+    { url = "https://files.pythonhosted.org/packages/39/53/8ba3cd5984f8363635450c93f63e541a0721b362bb32ae0d8237d9674aee/lxml-6.0.4-cp314-cp314t-win_arm64.whl", hash = "sha256:1dcd9e6cb9b7df808ea33daebd1801f37a8f50e8c075013ed2a2343246727838", size = 3816184, upload-time = "2026-04-12T16:26:57.011Z" },
 ]

 [[package]]
@@ -3689,7 +3746,7 @@ wheels = [

 [[package]]
 name = "mypy"
-version = "1.20.2"
+version = "1.20.1"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "librt", marker = "platform_python_implementation != 'PyPy'" },
@@ -3697,37 +3754,37 @@ dependencies = [
    { name = "pathspec" },
    { name = "typing-extensions" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/04/af/e3d4b3e9ec91a0ff9aabfdb38692952acf49bbb899c2e4c29acb3a6da3ae/mypy-1.20.2.tar.gz", hash = "sha256:e8222c26daaafd9e8626dec58ae36029f82585890589576f769a650dd20fd665", size = 3817349, upload-time = "2026-04-21T17:12:28.473Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/0b/3d/5b373635b3146264eb7a68d09e5ca11c305bbb058dfffbb47c47daf4f632/mypy-1.20.1.tar.gz", hash = "sha256:6fc3f4ecd52de81648fed1945498bf42fa2993ddfad67c9056df36ae5757f804", size = 3815892, upload-time = "2026-04-13T02:46:51.474Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/71/4e/7560e4528db9e9b147e4c0f22660466bf30a0a1fe3d63d1b9d3b0fd354ee/mypy-1.20.2-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:4dbfcf869f6b0517f70cf0030ba6ea1d6645e132337a7d5204a18d8d5636c02b", size = 14539393, upload-time = "2026-04-21T17:07:12.52Z" },
-    { url = "https://files.pythonhosted.org/packages/32/d9/34a5efed8124f5a9234f55ac6a4ced4201e2c5b81e1109c49ad23190ec8c/mypy-1.20.2-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:4b6481b228d072315b053210b01ac320e1be243dc17f9e5887ef167f23f5fae4", size = 13361642, upload-time = "2026-04-21T17:06:53.742Z" },
-    { url = "https://files.pythonhosted.org/packages/d1/14/eb377acf78c03c92d566a1510cda8137348215b5335085ef662ab82ecd3a/mypy-1.20.2-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:34397cdced6b90b836e38182076049fdb41424322e0b0728c946b0939ebdf9f6", size = 13740347, upload-time = "2026-04-21T17:12:04.73Z" },
-    { url = "https://files.pythonhosted.org/packages/b9/94/7e4634a32b641aa1c112422eed1bbece61ee16205f674190e8b536f884de/mypy-1.20.2-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:a5da6976f20cae27059ea8d0c86e7cef3de720e04c4bb9ee18e3690fdb792066", size = 14734042, upload-time = "2026-04-21T17:07:43.16Z" },
-    { url = "https://files.pythonhosted.org/packages/7a/f3/f7e62395cb7f434541b4491a01149a4439e28ace4c0c632bbf5431e92d1f/mypy-1.20.2-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:56908d7e08318d39f85b1f0c6cfd47b0cac1a130da677630dac0de3e0623e102", size = 14964958, upload-time = "2026-04-21T17:11:00.665Z" },
-    { url = "https://files.pythonhosted.org/packages/3e/0d/47e3c3a0ec2a876e35aeac365df3cac7776c36bbd4ed18cc521e1b9d255b/mypy-1.20.2-cp312-cp312-win_amd64.whl", hash = "sha256:d52ad8d78522da1d308789df651ee5379088e77c76cb1994858d40a426b343b9", size = 10911340, upload-time = "2026-04-21T17:10:49.179Z" },
-    { url = "https://files.pythonhosted.org/packages/d6/b2/6c852d72e0ea8b01f49da817fb52539993cde327e7d010e0103dc12d0dac/mypy-1.20.2-cp312-cp312-win_arm64.whl", hash = "sha256:785b08db19c9f214dc37d65f7c165d19a30fcecb48abfa30f31b01b5acaabb58", size = 9833947, upload-time = "2026-04-21T17:09:05.267Z" },
-    { url = "https://files.pythonhosted.org/packages/5b/c4/b93812d3a192c9bcf5df405bd2f30277cd0e48106a14d1023c7f6ed6e39b/mypy-1.20.2-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:edfbfca868cdd6bd8d974a60f8a3682f5565d3f5c99b327640cedd24c4264026", size = 14524670, upload-time = "2026-04-21T17:10:30.737Z" },
-    { url = "https://files.pythonhosted.org/packages/f3/47/42c122501bff18eaf1e8f457f5c017933452d8acdc52918a9f59f6812955/mypy-1.20.2-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:e2877a02380adfcdbc69071a0f74d6e9dbbf593c0dc9d174e1f223ffd5281943", size = 13336218, upload-time = "2026-04-21T17:08:44.069Z" },
-    { url = "https://files.pythonhosted.org/packages/92/8f/75bbc92f41725fbd585fb17b440b1119b576105df1013622983e18640a93/mypy-1.20.2-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:7488448de6007cd5177c6cea0517ac33b4c0f5ee9b5e9f2be51ce75511a85517", size = 13724906, upload-time = "2026-04-21T17:08:01.02Z" },
-    { url = "https://files.pythonhosted.org/packages/a1/32/4c49da27a606167391ff0c39aa955707a00edc500572e562f7c36c08a71f/mypy-1.20.2-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:bb9c2fa06887e21d6a3a868762acb82aec34e2c6fd0174064f27c93ede68ad15", size = 14726046, upload-time = "2026-04-21T17:11:22.354Z" },
-    { url = "https://files.pythonhosted.org/packages/7f/fc/4e354a1bd70216359deb0c9c54847ee6b32ef78dfb09f5131ff99b494078/mypy-1.20.2-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:9d56a78b646f2e3daa865bc70cd5ec5a46c50045801ca8ff17a0c43abc97e3ee", size = 14955587, upload-time = "2026-04-21T17:12:16.033Z" },
-    { url = "https://files.pythonhosted.org/packages/62/b2/c0f2056e9eb8f08c62cafd9715e4584b89132bdc832fcf85d27d07b5f3e5/mypy-1.20.2-cp313-cp313-win_amd64.whl", hash = "sha256:2a4102b03bb7481d9a91a6da8d174740c9c8c4401024684b9ca3b7cc5e49852f", size = 10922681, upload-time = "2026-04-21T17:06:35.842Z" },
-    { url = "https://files.pythonhosted.org/packages/e5/14/065e333721f05de8ef683d0aa804c23026bcc287446b61cac657b902ccac/mypy-1.20.2-cp313-cp313-win_arm64.whl", hash = "sha256:a95a9248b0c6fd933a442c03c3b113c3b61320086b88e2c444676d3fd1ca3330", size = 9830560, upload-time = "2026-04-21T17:07:51.023Z" },
-    { url = "https://files.pythonhosted.org/packages/ae/d1/b4ec96b0ecc620a4443570c6e95c867903428cfcde4206518eafdd5880c3/mypy-1.20.2-cp314-cp314-macosx_10_15_x86_64.whl", hash = "sha256:419413398fe250aae057fd2fe50166b61077083c9b82754c341cf4fd73038f30", size = 14524561, upload-time = "2026-04-21T17:06:27.325Z" },
-    { url = "https://files.pythonhosted.org/packages/3a/63/d2c2ff4fa66bc49477d32dfa26e8a167ba803ea6a69c5efb416036909d30/mypy-1.20.2-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:e73c07f23009962885c197ccb9b41356a30cc0e5a1d0c2ea8fd8fb1362d7f924", size = 13363883, upload-time = "2026-04-21T17:11:11.239Z" },
-    { url = "https://files.pythonhosted.org/packages/2a/56/983916806bf4eddeaaa2c9230903c3669c6718552a921154e1c5182c701f/mypy-1.20.2-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:0c64e5973df366b747646fc98da921f9d6eba9716d57d1db94a83c026a08e0fb", size = 13742945, upload-time = "2026-04-21T17:08:34.181Z" },
-    { url = "https://files.pythonhosted.org/packages/19/65/0cd9285ab010ee8214c83d67c6b49417c40d86ce46f1aa109457b5a9b8d7/mypy-1.20.2-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:5a65aa591af023864fd08a97da9974e919452cfe19cb146c8a5dc692626445dc", size = 14706163, upload-time = "2026-04-21T17:05:15.51Z" },
-    { url = "https://files.pythonhosted.org/packages/94/97/48ff3b297cafcc94d185243a9190836fb1b01c1b0918fff64e941e973cc9/mypy-1.20.2-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:4fef51b01e638974a6e69885687e9bd40c8d1e09a6cd291cca0619625cf1f558", size = 14938677, upload-time = "2026-04-21T17:05:39.562Z" },
-    { url = "https://files.pythonhosted.org/packages/fd/a1/1b4233d255bdd0b38a1f284feeb1c143ca508c19184964e22f8d837ec851/mypy-1.20.2-cp314-cp314-win_amd64.whl", hash = "sha256:913485a03f1bcf5d279409a9d2b9ed565c151f61c09f29991e5faa14033da4c8", size = 11089322, upload-time = "2026-04-21T17:06:44.29Z" },
-    { url = "https://files.pythonhosted.org/packages/78/c2/ce7ee2ba36aeb954ba50f18fa25d9c1188578654b97d02a66a15b6f09531/mypy-1.20.2-cp314-cp314-win_arm64.whl", hash = "sha256:c3bae4f855d965b5453784300c12ffc63a548304ac7f99e55d4dc7c898673aa3", size = 10017775, upload-time = "2026-04-21T17:07:20.732Z" },
-    { url = "https://files.pythonhosted.org/packages/4e/a1/9d93a7d0b5859af0ead82b4888b46df6c8797e1bc5e1e262a08518c6d48e/mypy-1.20.2-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:2de3dcea53babc1c3237a19002bc3d228ce1833278f093b8d619e06e7cc79609", size = 15549002, upload-time = "2026-04-21T17:08:23.107Z" },
-    { url = "https://files.pythonhosted.org/packages/00/d2/09a6a10ee1bf0008f6c144d9676f2ca6a12512151b4e0ad0ff6c4fac5337/mypy-1.20.2-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:52b176444e2e5054dfcbcb8c75b0b719865c96247b37407184bbfca5c353f2c2", size = 14401942, upload-time = "2026-04-21T17:07:31.837Z" },
-    { url = "https://files.pythonhosted.org/packages/57/da/9594b75c3c019e805250bed3583bdf4443ff9e6ef08f97e39ae308cb06f2/mypy-1.20.2-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:688c3312e5dadb573a2c69c82af3a298d43ecf9e6d264e0f95df960b5f6ac19c", size = 15041649, upload-time = "2026-04-21T17:09:34.653Z" },
-    { url = "https://files.pythonhosted.org/packages/97/77/f75a65c278e6e8eba2071f7f5a90481891053ecc39878cc444634d892abe/mypy-1.20.2-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:29752dbbf8cc53f89f6ac096d363314333045c257c9c75cbd189ca2de0455744", size = 15864588, upload-time = "2026-04-21T17:11:44.936Z" },
-    { url = "https://files.pythonhosted.org/packages/d7/46/1a4e1c66e96c1a3246ddf5403d122ac9b0a8d2b7e65730b9d6533ba7a6d3/mypy-1.20.2-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:803203d2b6ea644982c644895c2f78b28d0e208bba7b27d9b921e0ec5eb207c6", size = 16093956, upload-time = "2026-04-21T17:10:17.683Z" },
-    { url = "https://files.pythonhosted.org/packages/5a/2c/78a8851264dec38cd736ca5b8bc9380674df0dd0be7792f538916157716c/mypy-1.20.2-cp314-cp314t-win_amd64.whl", hash = "sha256:9bcb8aa397ff0093c824182fd76a935a9ba7ad097fcbef80ae89bf6c1731d8ec", size = 12568661, upload-time = "2026-04-21T17:11:54.473Z" },
-    { url = "https://files.pythonhosted.org/packages/83/01/cd7318aa03493322ce275a0e14f4f52b8896335e4e79d4fb8153a7ad2b77/mypy-1.20.2-cp314-cp314t-win_arm64.whl", hash = "sha256:e061b58443f1736f8a37c48978d7ab581636d6ab03e3d4f99e3fa90463bb9382", size = 10389240, upload-time = "2026-04-21T17:09:42.719Z" },
-    { url = "https://files.pythonhosted.org/packages/28/9a/f23c163e25b11074188251b0b5a0342625fc1cdb6af604757174fa9acc9b/mypy-1.20.2-py3-none-any.whl", hash = "sha256:a94c5a76ab46c5e6257c7972b6c8cff0574201ca7dc05647e33e795d78680563", size = 2637314, upload-time = "2026-04-21T17:05:54.5Z" },
+    { url = "https://files.pythonhosted.org/packages/69/1b/75a7c825a02781ca10bc2f2f12fba2af5202f6d6005aad8d2d1f264d8d78/mypy-1.20.1-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:36ee2b9c6599c230fea89bbd79f401f9f9f8e9fcf0c777827789b19b7da90f51", size = 14494077, upload-time = "2026-04-13T02:45:55.085Z" },
+    { url = "https://files.pythonhosted.org/packages/b0/54/5e5a569ea5c2b4d48b729fb32aa936eeb4246e4fc3e6f5b3d36a2dfbefb9/mypy-1.20.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:fba3fb0968a7b48806b0c90f38d39296f10766885a94c83bd21399de1e14eb28", size = 13319495, upload-time = "2026-04-13T02:45:29.674Z" },
+    { url = "https://files.pythonhosted.org/packages/6f/a4/a1945b19f33e91721b59deee3abb484f2fa5922adc33bb166daf5325d76d/mypy-1.20.1-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:ef1415a637cd3627d6304dfbeddbadd21079dafc2a8a753c477ce4fc0c2af54f", size = 13696948, upload-time = "2026-04-13T02:46:15.006Z" },
+    { url = "https://files.pythonhosted.org/packages/b2/c6/75e969781c2359b2f9c15b061f28ec6d67c8b61865ceda176e85c8e7f2de/mypy-1.20.1-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:ef3461b1ad5cd446e540016e90b5984657edda39f982f4cc45ca317b628f5a37", size = 14706744, upload-time = "2026-04-13T02:46:00.482Z" },
+    { url = "https://files.pythonhosted.org/packages/a8/6e/b221b1de981fc4262fe3e0bf9ec272d292dfe42394a689c2d49765c144c4/mypy-1.20.1-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:542dd63c9e1339b6092eb25bd515f3a32a1453aee8c9521d2ddb17dacd840237", size = 14949035, upload-time = "2026-04-13T02:45:06.021Z" },
+    { url = "https://files.pythonhosted.org/packages/ca/4b/298ba2de0aafc0da3ff2288da06884aae7ba6489bc247c933f87847c41b3/mypy-1.20.1-cp312-cp312-win_amd64.whl", hash = "sha256:1d55c7cd8ca22e31f93af2a01160a9e95465b5878de23dba7e48116052f20a8d", size = 10883216, upload-time = "2026-04-13T02:45:47.232Z" },
+    { url = "https://files.pythonhosted.org/packages/c7/f9/5e25b8f0b8cb92f080bfed9c21d3279b2a0b6a601cdca369a039ba84789d/mypy-1.20.1-cp312-cp312-win_arm64.whl", hash = "sha256:f5b84a79070586e0d353ee07b719d9d0a4aa7c8ee90c0ea97747e98cbe193019", size = 9814299, upload-time = "2026-04-13T02:45:21.934Z" },
+    { url = "https://files.pythonhosted.org/packages/21/e8/ef0991aa24c8f225df10b034f3c2681213cb54cf247623c6dec9a5744e70/mypy-1.20.1-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:8f3886c03e40afefd327bd70b3f634b39ea82e87f314edaa4d0cce4b927ddcc1", size = 14500739, upload-time = "2026-04-13T02:46:05.442Z" },
+    { url = "https://files.pythonhosted.org/packages/23/73/416ebec3047636ed89fa871dc8c54bf05e9e20aa9499da59790d7adb312d/mypy-1.20.1-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:e860eb3904f9764e83bafd70c8250bdffdc7dde6b82f486e8156348bf7ceb184", size = 13314735, upload-time = "2026-04-13T02:46:47.154Z" },
+    { url = "https://files.pythonhosted.org/packages/10/1e/1505022d9c9ac2e014a384eb17638fb37bf8e9d0a833ea60605b66f8f7ba/mypy-1.20.1-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:a4b5aac6e785719da51a84f5d09e9e843d473170a9045b1ea7ea1af86225df4b", size = 13704356, upload-time = "2026-04-13T02:45:19.773Z" },
+    { url = "https://files.pythonhosted.org/packages/98/91/275b01f5eba5c467a3318ec214dd865abb66e9c811231c8587287b92876a/mypy-1.20.1-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:f37b6cd0fe2ad3a20f05ace48ca3523fc52ff86940e34937b439613b6854472e", size = 14696420, upload-time = "2026-04-13T02:45:24.205Z" },
+    { url = "https://files.pythonhosted.org/packages/a1/57/b3779e134e1b7250d05f874252780d0a88c068bc054bcff99ca20a3a2986/mypy-1.20.1-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:e4bbb0f6b54ce7cc350ef4a770650d15fa70edd99ad5267e227133eda9c94218", size = 14936093, upload-time = "2026-04-13T02:45:32.087Z" },
+    { url = "https://files.pythonhosted.org/packages/be/33/81b64991b0f3f278c3b55c335888794af190b2d59031a5ad1401bcb69f1e/mypy-1.20.1-cp313-cp313-win_amd64.whl", hash = "sha256:c3dc20f8ec76eecd77148cdd2f1542ed496e51e185713bf488a414f862deb8f2", size = 10889659, upload-time = "2026-04-13T02:46:02.926Z" },
+    { url = "https://files.pythonhosted.org/packages/1b/fd/7adcb8053572edf5ef8f3db59599dfeeee3be9cc4c8c97e2d28f66f42ac5/mypy-1.20.1-cp313-cp313-win_arm64.whl", hash = "sha256:a9d62bbac5d6d46718e2b0330b25e6264463ed832722b8f7d4440ff1be3ca895", size = 9815515, upload-time = "2026-04-13T02:46:32.103Z" },
+    { url = "https://files.pythonhosted.org/packages/40/cd/db831e84c81d57d4886d99feee14e372f64bbec6a9cb1a88a19e243f2ef5/mypy-1.20.1-cp314-cp314-macosx_10_15_x86_64.whl", hash = "sha256:12927b9c0ed794daedcf1dab055b6c613d9d5659ac511e8d936d96f19c087d12", size = 14483064, upload-time = "2026-04-13T02:45:26.901Z" },
+    { url = "https://files.pythonhosted.org/packages/d5/82/74e62e7097fa67da328ac8ece8de09133448c04d20ddeaeba251a3000f01/mypy-1.20.1-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:752507dd481e958b2c08fc966d3806c962af5a9433b5bf8f3bdd7175c20e34fe", size = 13335694, upload-time = "2026-04-13T02:46:12.514Z" },
+    { url = "https://files.pythonhosted.org/packages/74/c4/97e9a0abe4f3cdbbf4d079cb87a03b786efeccf5bf2b89fe4f96939ab2e6/mypy-1.20.1-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:c614655b5a065e56274c6cbbe405f7cf7e96c0654db7ba39bc680238837f7b08", size = 13726365, upload-time = "2026-04-13T02:45:17.422Z" },
+    { url = "https://files.pythonhosted.org/packages/d7/aa/a19d884a8d28fcd3c065776323029f204dbc774e70ec9c85eba228b680de/mypy-1.20.1-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:2c3f6221a76f34d5100c6d35b3ef6b947054123c3f8d6938a4ba00b1308aa572", size = 14693472, upload-time = "2026-04-13T02:46:41.253Z" },
+    { url = "https://files.pythonhosted.org/packages/84/44/cc9324bd21cf786592b44bf3b5d224b3923c1230ec9898d508d00241d465/mypy-1.20.1-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:4bdfc06303ac06500af71ea0cdbe995c502b3c9ba32f3f8313523c137a25d1b6", size = 14919266, upload-time = "2026-04-13T02:46:28.37Z" },
+    { url = "https://files.pythonhosted.org/packages/6e/dc/779abb25a8c63e8f44bf5a336217fa92790fa17e0c40e0c725d10cb01bbd/mypy-1.20.1-cp314-cp314-win_amd64.whl", hash = "sha256:0131edd7eba289973d1ba1003d1a37c426b85cdef76650cd02da6420898a5eb3", size = 11049713, upload-time = "2026-04-13T02:45:57.673Z" },
+    { url = "https://files.pythonhosted.org/packages/28/08/4172be2ad7de9119b5a92ca36abbf641afdc5cb1ef4ae0c3a8182f29674f/mypy-1.20.1-cp314-cp314-win_arm64.whl", hash = "sha256:33f02904feb2c07e1fdf7909026206396c9deeb9e6f34d466b4cfedb0aadbbe4", size = 9999819, upload-time = "2026-04-13T02:46:35.039Z" },
+    { url = "https://files.pythonhosted.org/packages/2d/af/af9e46b0c8eabbce9fc04a477564170f47a1c22b308822282a59b7ff315f/mypy-1.20.1-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:168472149dd8cc505c98cefd21ad77e4257ed6022cd5ed2fe2999bed56977a5a", size = 15547508, upload-time = "2026-04-13T02:46:25.588Z" },
+    { url = "https://files.pythonhosted.org/packages/a7/cd/39c9e4ad6ba33e069e5837d772a9e6c304b4a5452a14a975d52b36444650/mypy-1.20.1-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:eb674600309a8f22790cca883a97c90299f948183ebb210fbef6bcee07cb1986", size = 14399557, upload-time = "2026-04-13T02:46:10.021Z" },
+    { url = "https://files.pythonhosted.org/packages/83/c1/3fd71bdc118ffc502bf57559c909927bb7e011f327f7bb8e0488e98a5870/mypy-1.20.1-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:ef2b2e4cc464ba9795459f2586923abd58a0055487cbe558cb538ea6e6bc142a", size = 15045789, upload-time = "2026-04-13T02:45:10.81Z" },
+    { url = "https://files.pythonhosted.org/packages/8e/73/6f07ff8b57a7d7b3e6e5bf34685d17632382395c8bb53364ec331661f83e/mypy-1.20.1-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:dee461d396dd46b3f0ed5a098dbc9b8860c81c46ad44fa071afcfbc149f167c9", size = 15850795, upload-time = "2026-04-13T02:45:03.349Z" },
+    { url = "https://files.pythonhosted.org/packages/ec/e2/f7dffec1c7767078f9e9adf0c786d1fe0ff30964a77eb213c09b8b58cb76/mypy-1.20.1-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:e364926308b3e66f1361f81a566fc1b2f8cd47fc8525e8136d4058a65a4b4f02", size = 16088539, upload-time = "2026-04-13T02:46:17.841Z" },
+    { url = "https://files.pythonhosted.org/packages/1a/76/e0dee71035316e75a69d73aec2f03c39c21c967b97e277fd0ef8fd6aec66/mypy-1.20.1-cp314-cp314t-win_amd64.whl", hash = "sha256:a0c17fbd746d38c70cbc42647cfd884f845a9708a4b160a8b4f7e70d41f4d7fa", size = 12575567, upload-time = "2026-04-13T02:45:34.795Z" },
+    { url = "https://files.pythonhosted.org/packages/22/a8/7ed43c9d9c3d1468f86605e323a5d97e411a448790a00f07e779f3211a46/mypy-1.20.1-cp314-cp314t-win_arm64.whl", hash = "sha256:db2cb89654626a912efda69c0d5c1d22d948265e2069010d3dde3abf751c7d08", size = 10378823, upload-time = "2026-04-13T02:45:13.35Z" },
+    { url = "https://files.pythonhosted.org/packages/d8/28/926bd972388e65a39ee98e188ccf67e81beb3aacfd5d6b310051772d974b/mypy-1.20.1-py3-none-any.whl", hash = "sha256:1aae28507f253fe82d883790d1c0a0d35798a810117c88184097fe8881052f06", size = 2636553, upload-time = "2026-04-13T02:46:30.45Z" },
 ]

 [[package]]
@@ -4451,7 +4508,7 @@ wheels = [

 [[package]]
 name = "pre-commit"
-version = "4.6.0"
+version = "4.5.1"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "cfgv" },
@@ -4460,9 +4517,9 @@ dependencies = [
    { name = "pyyaml" },
    { name = "virtualenv" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/8e/22/2de9408ac81acbb8a7d05d4cc064a152ccf33b3d480ebe0cd292153db239/pre_commit-4.6.0.tar.gz", hash = "sha256:718d2208cef53fdc38206e40524a6d4d9576d103eb16f0fec11c875e7716e9d9", size = 198525, upload-time = "2026-04-21T20:31:41.613Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/40/f1/6d86a29246dfd2e9b6237f0b5823717f60cad94d47ddc26afa916d21f525/pre_commit-4.5.1.tar.gz", hash = "sha256:eb545fcff725875197837263e977ea257a402056661f09dae08e4b149b030a61", size = 198232, upload-time = "2025-12-16T21:14:33.552Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/80/6e/4b28b62ecb6aae56769c34a8ff1d661473ec1e9519e2d5f8b2c150086b26/pre_commit-4.6.0-py2.py3-none-any.whl", hash = "sha256:e2cf246f7299edcabcf15f9b0571fdce06058527f0a06535068a86d38089f29b", size = 226472, upload-time = "2026-04-21T20:31:40.092Z" },
+    { url = "https://files.pythonhosted.org/packages/5d/19/fd3ef348460c80af7bb4669ea7926651d1f95c23ff2df18b9d24bab4f3fa/pre_commit-4.5.1-py2.py3-none-any.whl", hash = "sha256:3b3afd891e97337708c1674210f8eba659b52a38ea5f822ff142d10786221f77", size = 226437, upload-time = "2025-12-16T21:14:32.409Z" },
 ]

 [[package]]
@@ -4632,45 +4689,45 @@ wheels = [

 [[package]]
 name = "pyarrow"
-version = "24.0.0"
+version = "23.0.1"
 source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/91/13/13e1069b351bdc3881266e11147ffccf687505dbb0ea74036237f5d454a5/pyarrow-24.0.0.tar.gz", hash = "sha256:85fe721a14dd823aca09127acbb06c3ca723efbd436c004f16bca601b04dcc83", size = 1180261, upload-time = "2026-04-21T10:51:25.837Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/88/22/134986a4cc224d593c1afde5494d18ff629393d74cc2eddb176669f234a4/pyarrow-23.0.1.tar.gz", hash = "sha256:b8c5873e33440b2bc2f4a79d2b47017a89c5a24116c055625e6f2ee50523f019", size = 1167336, upload-time = "2026-02-16T10:14:12.39Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/b4/a9/9686d9f07837f91f775e8932659192e02c74f9d8920524b480b85212cc68/pyarrow-24.0.0-cp312-cp312-macosx_12_0_arm64.whl", hash = "sha256:6233c9ed9ab9d1db47de57d9753256d9dcffbf42db341576099f0fd9f6bf4810", size = 34981559, upload-time = "2026-04-21T10:47:22.17Z" },
-    { url = "https://files.pythonhosted.org/packages/80/b6/0ddf0e9b6ead3474ab087ae598c76b031fc45532bf6a63f3a553440fb258/pyarrow-24.0.0-cp312-cp312-macosx_12_0_x86_64.whl", hash = "sha256:f7616236ec1bc2b15bfdec22a71ab38851c86f8f05ff64f379e1278cf20c634a", size = 36663654, upload-time = "2026-04-21T10:47:28.315Z" },
-    { url = "https://files.pythonhosted.org/packages/7c/3b/926382efe8ce27ba729071d3566ade6dfb86bdf112f366000196b2f5780a/pyarrow-24.0.0-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:1617043b99bd33e5318ae18eb2919af09c71322ef1ca46566cdafc6e6712fb66", size = 45679394, upload-time = "2026-04-21T10:47:34.821Z" },
-    { url = "https://files.pythonhosted.org/packages/b3/7a/829f7d9dfd37c207206081d6dad474d81dde29952401f07f2ba507814818/pyarrow-24.0.0-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:6165461f55ef6314f026de6638d661188e3455d3ec49834556a0ebbdbace18bb", size = 48863122, upload-time = "2026-04-21T10:47:42.056Z" },
-    { url = "https://files.pythonhosted.org/packages/5f/e8/f88ce625fe8babaae64e8db2d417c7653adb3019b08aae85c5ed787dc816/pyarrow-24.0.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:3b13dedfe76a0ad2d1d859b0811b53827a4e9d93a0bcb05cf59333ab4980cc7e", size = 49376032, upload-time = "2026-04-21T10:47:48.967Z" },
-    { url = "https://files.pythonhosted.org/packages/36/7a/82c363caa145fff88fb475da50d3bf52bb024f61917be5424c3392eaf878/pyarrow-24.0.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:25ea65d868eb04015cd18e6df2fbe98f07e5bda2abefabcb88fce39a947716f6", size = 51929490, upload-time = "2026-04-21T10:47:55.981Z" },
-    { url = "https://files.pythonhosted.org/packages/66/1c/e3e72c8014ad2743ca64a701652c733cc5cbcee15c0463a32a8c55518d9e/pyarrow-24.0.0-cp312-cp312-win_amd64.whl", hash = "sha256:295f0a7f2e242dabd513737cf076007dc5b2d59237e3eca37b05c0c6446f3826", size = 27355660, upload-time = "2026-04-21T10:48:01.718Z" },
-    { url = "https://files.pythonhosted.org/packages/6f/d3/a1abf004482026ddc17f4503db227787fa3cfe41ec5091ff20e4fea55e57/pyarrow-24.0.0-cp313-cp313-macosx_12_0_arm64.whl", hash = "sha256:02b001b3ed4723caa44f6cd1af2d5c86aa2cf9971dacc2ffa55b21237713dfba", size = 34976759, upload-time = "2026-04-21T10:48:07.258Z" },
-    { url = "https://files.pythonhosted.org/packages/4f/4a/34f0a36d28a2dd32225301b79daad44e243dc1a2bb77d43b60749be255c4/pyarrow-24.0.0-cp313-cp313-macosx_12_0_x86_64.whl", hash = "sha256:04920d6a71aabd08a0417709efce97d45ea8e6fb733d9ca9ecffb13c67839f68", size = 36658471, upload-time = "2026-04-21T10:48:13.347Z" },
-    { url = "https://files.pythonhosted.org/packages/1f/78/543b94712ae8bb1a6023bcc1acf1a740fbff8286747c289cd9468fced2a5/pyarrow-24.0.0-cp313-cp313-manylinux_2_28_aarch64.whl", hash = "sha256:a964266397740257f16f7bb2e4f08a0c81454004beab8ff59dd531b73610e9f2", size = 45675981, upload-time = "2026-04-21T10:48:20.201Z" },
-    { url = "https://files.pythonhosted.org/packages/84/9f/8fb7c222b100d314137fa40ec050de56cd8c6d957d1cfff685ce72f15b17/pyarrow-24.0.0-cp313-cp313-manylinux_2_28_x86_64.whl", hash = "sha256:6f066b179d68c413374294bc1735f68475457c933258df594443bb9d88ddc2a0", size = 48859172, upload-time = "2026-04-21T10:48:27.541Z" },
-    { url = "https://files.pythonhosted.org/packages/a7/d3/1ea72538e6c8b3b475ed78d1049a2c518e655761ea50fe1171fc855fcab7/pyarrow-24.0.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:1183baeb14c5f587b1ec52831e665718ce632caab84b7cd6b85fd44f96114495", size = 49385733, upload-time = "2026-04-21T10:48:34.7Z" },
-    { url = "https://files.pythonhosted.org/packages/c3/be/c3d8b06a1ba35f2260f8e1f771abbee7d5e345c0937aab90675706b1690a/pyarrow-24.0.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:806f24b4085453c197a5078218d1ee08783ebbba271badd153d1ae22a3ee804f", size = 51934335, upload-time = "2026-04-21T10:48:42.099Z" },
-    { url = "https://files.pythonhosted.org/packages/9c/62/89e07a1e7329d2cde3e3c6994ba0839a24977a2beda8be6005ea3d860b99/pyarrow-24.0.0-cp313-cp313-win_amd64.whl", hash = "sha256:e4505fc6583f7b05ab854934896bcac8253b04ac1171a77dfb73efef92076d91", size = 27271748, upload-time = "2026-04-21T10:49:42.532Z" },
-    { url = "https://files.pythonhosted.org/packages/17/1a/cff3a59f80b5b1658549d46611b67163f65e0664431c076ad728bf9d5af4/pyarrow-24.0.0-cp313-cp313t-macosx_12_0_arm64.whl", hash = "sha256:1a4e45017efbf115032e4475ee876d525e0e36c742214fbe405332480ecd6275", size = 35238554, upload-time = "2026-04-21T10:48:48.526Z" },
-    { url = "https://files.pythonhosted.org/packages/a8/99/cce0f42a327bfef2c420fb6078a3eb834826e5d6697bf3009fe11d2ad051/pyarrow-24.0.0-cp313-cp313t-macosx_12_0_x86_64.whl", hash = "sha256:7986f1fa71cee060ad00758bcc79d3a93bab8559bf978fab9e53472a2e25a17b", size = 36782301, upload-time = "2026-04-21T10:48:55.181Z" },
-    { url = "https://files.pythonhosted.org/packages/2a/66/8e560d5ff6793ca29aca213c53eec0dd482dd46cb93b2819e5aab52e4252/pyarrow-24.0.0-cp313-cp313t-manylinux_2_28_aarch64.whl", hash = "sha256:d3e0b61e8efb24ed38898e5cdc5fffa9124be480008d401a1f8071500494ae42", size = 45721929, upload-time = "2026-04-21T10:49:03.676Z" },
-    { url = "https://files.pythonhosted.org/packages/27/0c/a26e25505d030716e078d9f16eb74973cbf0b33b672884e9f9da1c83b871/pyarrow-24.0.0-cp313-cp313t-manylinux_2_28_x86_64.whl", hash = "sha256:55a3bc1e3df3b5567b7d27ef551b2283f0c68a5e86f1cd56abc569da4f31335b", size = 48825365, upload-time = "2026-04-21T10:49:11.714Z" },
-    { url = "https://files.pythonhosted.org/packages/5f/eb/771f9ecb0c65e73fe9dccdd1717901b9594f08c4515d000c7c62df573811/pyarrow-24.0.0-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:641f795b361874ac9da5294f8f443dfdbee355cf2bd9e3b8d97aaac2306b9b37", size = 49451819, upload-time = "2026-04-21T10:49:21.474Z" },
-    { url = "https://files.pythonhosted.org/packages/48/da/61ae89a88732f5a785646f3ec6125dbb640fa98a540eb2b9889caa561403/pyarrow-24.0.0-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:8adc8e6ce5fccf5dc707046ae4914fd537def529709cc0d285d37a7f9cd442ca", size = 51909252, upload-time = "2026-04-21T10:49:31.164Z" },
-    { url = "https://files.pythonhosted.org/packages/cb/1a/8dd5cafab7b66573fa91c03d06d213356ad4edd71813aa75e08ce2b3a844/pyarrow-24.0.0-cp313-cp313t-win_amd64.whl", hash = "sha256:9b18371ad2f44044b81a8d23bc2d8a9b6a6226dca775e8e16cfee640473d6c5d", size = 27388127, upload-time = "2026-04-21T10:49:37.334Z" },
-    { url = "https://files.pythonhosted.org/packages/ad/80/d022a34ff05d2cbedd8ccf841fc1f532ecfa9eb5ed1711b56d0e0ea71fc9/pyarrow-24.0.0-cp314-cp314-macosx_12_0_arm64.whl", hash = "sha256:1cc9057f0319e26333b357e17f3c2c022f1a83739b48a88b25bfd5fa2dc18838", size = 35007997, upload-time = "2026-04-21T10:49:48.796Z" },
-    { url = "https://files.pythonhosted.org/packages/1a/ff/f01485fda6f4e5d441afb8dd5e7681e4db18826c1e271852f5d3957d6a80/pyarrow-24.0.0-cp314-cp314-macosx_12_0_x86_64.whl", hash = "sha256:e6f1278ee4785b6db21229374a1c9e54ec7c549de5d1efc9630b6207de7e170b", size = 36678720, upload-time = "2026-04-21T10:49:55.858Z" },
-    { url = "https://files.pythonhosted.org/packages/9e/c2/2d2d5fea814237923f71b36495211f20b43a1576f9a4d6da7e751a64ec6f/pyarrow-24.0.0-cp314-cp314-manylinux_2_28_aarch64.whl", hash = "sha256:adbbedc55506cbdabb830890444fb856bfb0060c46c6f8026c6c2f2cf86ae795", size = 45741852, upload-time = "2026-04-21T10:50:04.624Z" },
-    { url = "https://files.pythonhosted.org/packages/8e/3a/28ba9c1c1ebdbb5f1b94dfebb46f207e52e6a554b7fe4132540fde29a3a0/pyarrow-24.0.0-cp314-cp314-manylinux_2_28_x86_64.whl", hash = "sha256:ae8a1145af31d903fa9bb166824d7abe9b4681a000b0159c9fb99c11bc11ad26", size = 48889852, upload-time = "2026-04-21T10:50:12.293Z" },
-    { url = "https://files.pythonhosted.org/packages/df/51/4a389acfd31dca009f8fb82d7f510bb4130f2b3a8e18cf00194d0687d8ac/pyarrow-24.0.0-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:d7027eba1df3b2069e2e8d80f644fa0918b68c46432af3d088ddd390d063ecde", size = 49445207, upload-time = "2026-04-21T10:50:20.677Z" },
-    { url = "https://files.pythonhosted.org/packages/19/4b/0bab2b23d2ae901b1b9a03c0efd4b2d070256f8ce3fc43f6e58c167b2081/pyarrow-24.0.0-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:e56a1ffe9bf7b727432b89104cc0849c21582949dd7bdcb34f17b2001a351a76", size = 51954117, upload-time = "2026-04-21T10:50:29.14Z" },
-    { url = "https://files.pythonhosted.org/packages/29/88/f4e9145da0417b3d2c12035a8492b35ff4a3dbc653e614fcfb51d9dedb38/pyarrow-24.0.0-cp314-cp314-win_amd64.whl", hash = "sha256:38be1808cdd068605b787e6ca9119b27eb275a0234e50212c3492331680c3b1e", size = 28001155, upload-time = "2026-04-21T10:51:22.337Z" },
-    { url = "https://files.pythonhosted.org/packages/79/4f/46a49a63f43526da895b1a45bbb51d5baf8e4d77159f8528fc3e5490007f/pyarrow-24.0.0-cp314-cp314t-macosx_12_0_arm64.whl", hash = "sha256:418e48ce50a45a6a6c73c454677203a9c75c966cb1e92ca3370959185f197a05", size = 35250387, upload-time = "2026-04-21T10:50:35.552Z" },
-    { url = "https://files.pythonhosted.org/packages/a0/da/d5e0cd5ef00796922404806d5f00325cdadc3441ce2c13fe7115f2df9a64/pyarrow-24.0.0-cp314-cp314t-macosx_12_0_x86_64.whl", hash = "sha256:2f16197705a230a78270cdd4ea8a1d57e86b2fdcbc34a1f6aebc72e65c986f9a", size = 36797102, upload-time = "2026-04-21T10:50:42.417Z" },
-    { url = "https://files.pythonhosted.org/packages/34/c7/5904145b0a593a05236c882933d439b5720f0a145381179063722fbfc123/pyarrow-24.0.0-cp314-cp314t-manylinux_2_28_aarch64.whl", hash = "sha256:fb24ac194bfc5e86839d7dcd52092ee31e5fe6733fe11f5e3b06ef0812b20072", size = 45745118, upload-time = "2026-04-21T10:50:49.324Z" },
-    { url = "https://files.pythonhosted.org/packages/13/d3/cca42fe166d1c6e4d5b80e530b7949104d10e17508a90ae202dac205ce2a/pyarrow-24.0.0-cp314-cp314t-manylinux_2_28_x86_64.whl", hash = "sha256:9700ebd9a51f5895ce75ff4ac4b3c47a7d4b42bc618be8e713e5d56bacf5f931", size = 48844765, upload-time = "2026-04-21T10:50:55.579Z" },
-    { url = "https://files.pythonhosted.org/packages/b0/49/942c3b79878ba928324d1e17c274ed84581db8c0a749b24bcf4cbdf15bd3/pyarrow-24.0.0-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:d8ddd2768da81d3ee08cfea9b597f4abb4e8e1dc8ae7e204b608d23a0d3ab699", size = 49471890, upload-time = "2026-04-21T10:51:02.439Z" },
-    { url = "https://files.pythonhosted.org/packages/76/97/ff71431000a75d84135a1ace5ca4ba11726a231a8007bbb320a4c54075d5/pyarrow-24.0.0-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:61a3d7eaa97a14768b542f3d284dc6400dd2470d9f080708b13cd46b6ae18136", size = 51932250, upload-time = "2026-04-21T10:51:10.576Z" },
-    { url = "https://files.pythonhosted.org/packages/51/be/6f79d55816d5c22557cf27533543d5d70dfe692adfbee4b99f2760674f38/pyarrow-24.0.0-cp314-cp314t-win_amd64.whl", hash = "sha256:c91d00057f23b8d353039520dc3a6c09d8608164c692e9f59a175a42b2ae0c19", size = 28131282, upload-time = "2026-04-21T10:51:16.815Z" },
+    { url = "https://files.pythonhosted.org/packages/9a/4b/4166bb5abbfe6f750fc60ad337c43ecf61340fa52ab386da6e8dbf9e63c4/pyarrow-23.0.1-cp312-cp312-macosx_12_0_arm64.whl", hash = "sha256:f4b0dbfa124c0bb161f8b5ebb40f1a680b70279aa0c9901d44a2b5a20806039f", size = 34214575, upload-time = "2026-02-16T10:09:56.225Z" },
+    { url = "https://files.pythonhosted.org/packages/e1/da/3f941e3734ac8088ea588b53e860baeddac8323ea40ce22e3d0baa865cc9/pyarrow-23.0.1-cp312-cp312-macosx_12_0_x86_64.whl", hash = "sha256:7707d2b6673f7de054e2e83d59f9e805939038eebe1763fe811ee8fa5c0cd1a7", size = 35832540, upload-time = "2026-02-16T10:10:03.428Z" },
+    { url = "https://files.pythonhosted.org/packages/88/7c/3d841c366620e906d54430817531b877ba646310296df42ef697308c2705/pyarrow-23.0.1-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:86ff03fb9f1a320266e0de855dee4b17da6794c595d207f89bba40d16b5c78b9", size = 44470940, upload-time = "2026-02-16T10:10:10.704Z" },
+    { url = "https://files.pythonhosted.org/packages/2c/a5/da83046273d990f256cb79796a190bbf7ec999269705ddc609403f8c6b06/pyarrow-23.0.1-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:813d99f31275919c383aab17f0f455a04f5a429c261cc411b1e9a8f5e4aaaa05", size = 47586063, upload-time = "2026-02-16T10:10:17.95Z" },
+    { url = "https://files.pythonhosted.org/packages/5b/3c/b7d2ebcff47a514f47f9da1e74b7949138c58cfeb108cdd4ee62f43f0cf3/pyarrow-23.0.1-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:bf5842f960cddd2ef757d486041d57c96483efc295a8c4a0e20e704cbbf39c67", size = 48173045, upload-time = "2026-02-16T10:10:25.363Z" },
+    { url = "https://files.pythonhosted.org/packages/43/b2/b40961262213beaba6acfc88698eb773dfce32ecdf34d19291db94c2bd73/pyarrow-23.0.1-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:564baf97c858ecc03ec01a41062e8f4698abc3e6e2acd79c01c2e97880a19730", size = 50621741, upload-time = "2026-02-16T10:10:33.477Z" },
+    { url = "https://files.pythonhosted.org/packages/f6/70/1fdda42d65b28b078e93d75d371b2185a61da89dda4def8ba6ba41ebdeb4/pyarrow-23.0.1-cp312-cp312-win_amd64.whl", hash = "sha256:07deae7783782ac7250989a7b2ecde9b3c343a643f82e8a4df03d93b633006f0", size = 27620678, upload-time = "2026-02-16T10:10:39.31Z" },
+    { url = "https://files.pythonhosted.org/packages/47/10/2cbe4c6f0fb83d2de37249567373d64327a5e4d8db72f486db42875b08f6/pyarrow-23.0.1-cp313-cp313-macosx_12_0_arm64.whl", hash = "sha256:6b8fda694640b00e8af3c824f99f789e836720aa8c9379fb435d4c4953a756b8", size = 34210066, upload-time = "2026-02-16T10:10:45.487Z" },
+    { url = "https://files.pythonhosted.org/packages/cb/4f/679fa7e84dadbaca7a65f7cdba8d6c83febbd93ca12fa4adf40ba3b6362b/pyarrow-23.0.1-cp313-cp313-macosx_12_0_x86_64.whl", hash = "sha256:8ff51b1addc469b9444b7c6f3548e19dc931b172ab234e995a60aea9f6e6025f", size = 35825526, upload-time = "2026-02-16T10:10:52.266Z" },
+    { url = "https://files.pythonhosted.org/packages/f9/63/d2747d930882c9d661e9398eefc54f15696547b8983aaaf11d4a2e8b5426/pyarrow-23.0.1-cp313-cp313-manylinux_2_28_aarch64.whl", hash = "sha256:71c5be5cbf1e1cb6169d2a0980850bccb558ddc9b747b6206435313c47c37677", size = 44473279, upload-time = "2026-02-16T10:11:01.557Z" },
+    { url = "https://files.pythonhosted.org/packages/b3/93/10a48b5e238de6d562a411af6467e71e7aedbc9b87f8d3a35f1560ae30fb/pyarrow-23.0.1-cp313-cp313-manylinux_2_28_x86_64.whl", hash = "sha256:9b6f4f17b43bc39d56fec96e53fe89d94bac3eb134137964371b45352d40d0c2", size = 47585798, upload-time = "2026-02-16T10:11:09.401Z" },
+    { url = "https://files.pythonhosted.org/packages/5c/20/476943001c54ef078dbf9542280e22741219a184a0632862bca4feccd666/pyarrow-23.0.1-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:9fc13fc6c403d1337acab46a2c4346ca6c9dec5780c3c697cf8abfd5e19b6b37", size = 48179446, upload-time = "2026-02-16T10:11:17.781Z" },
+    { url = "https://files.pythonhosted.org/packages/4b/b6/5dd0c47b335fcd8edba9bfab78ad961bd0fd55ebe53468cc393f45e0be60/pyarrow-23.0.1-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:5c16ed4f53247fa3ffb12a14d236de4213a4415d127fe9cebed33d51671113e2", size = 50623972, upload-time = "2026-02-16T10:11:26.185Z" },
+    { url = "https://files.pythonhosted.org/packages/d5/09/a532297c9591a727d67760e2e756b83905dd89adb365a7f6e9c72578bcc1/pyarrow-23.0.1-cp313-cp313-win_amd64.whl", hash = "sha256:cecfb12ef629cf6be0b1887f9f86463b0dd3dc3195ae6224e74006be4736035a", size = 27540749, upload-time = "2026-02-16T10:12:23.297Z" },
+    { url = "https://files.pythonhosted.org/packages/a5/8e/38749c4b1303e6ae76b3c80618f84861ae0c55dd3c2273842ea6f8258233/pyarrow-23.0.1-cp313-cp313t-macosx_12_0_arm64.whl", hash = "sha256:29f7f7419a0e30264ea261fdc0e5fe63ce5a6095003db2945d7cd78df391a7e1", size = 34471544, upload-time = "2026-02-16T10:11:32.535Z" },
+    { url = "https://files.pythonhosted.org/packages/a3/73/f237b2bc8c669212f842bcfd842b04fc8d936bfc9d471630569132dc920d/pyarrow-23.0.1-cp313-cp313t-macosx_12_0_x86_64.whl", hash = "sha256:33d648dc25b51fd8055c19e4261e813dfc4d2427f068bcecc8b53d01b81b0500", size = 35949911, upload-time = "2026-02-16T10:11:39.813Z" },
+    { url = "https://files.pythonhosted.org/packages/0c/86/b912195eee0903b5611bf596833def7d146ab2d301afeb4b722c57ffc966/pyarrow-23.0.1-cp313-cp313t-manylinux_2_28_aarch64.whl", hash = "sha256:cd395abf8f91c673dd3589cadc8cc1ee4e8674fa61b2e923c8dd215d9c7d1f41", size = 44520337, upload-time = "2026-02-16T10:11:47.764Z" },
+    { url = "https://files.pythonhosted.org/packages/69/c2/f2a717fb824f62d0be952ea724b4f6f9372a17eed6f704b5c9526f12f2f1/pyarrow-23.0.1-cp313-cp313t-manylinux_2_28_x86_64.whl", hash = "sha256:00be9576d970c31defb5c32eb72ef585bf600ef6d0a82d5eccaae96639cf9d07", size = 47548944, upload-time = "2026-02-16T10:11:56.607Z" },
+    { url = "https://files.pythonhosted.org/packages/84/a7/90007d476b9f0dc308e3bc57b832d004f848fd6c0da601375d20d92d1519/pyarrow-23.0.1-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:c2139549494445609f35a5cda4eb94e2c9e4d704ce60a095b342f82460c73a83", size = 48236269, upload-time = "2026-02-16T10:12:04.47Z" },
+    { url = "https://files.pythonhosted.org/packages/b0/3f/b16fab3e77709856eb6ac328ce35f57a6d4a18462c7ca5186ef31b45e0e0/pyarrow-23.0.1-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:7044b442f184d84e2351e5084600f0d7343d6117aabcbc1ac78eb1ae11eb4125", size = 50604794, upload-time = "2026-02-16T10:12:11.797Z" },
+    { url = "https://files.pythonhosted.org/packages/e9/a1/22df0620a9fac31d68397a75465c344e83c3dfe521f7612aea33e27ab6c0/pyarrow-23.0.1-cp313-cp313t-win_amd64.whl", hash = "sha256:a35581e856a2fafa12f3f54fce4331862b1cfb0bef5758347a858a4aa9d6bae8", size = 27660642, upload-time = "2026-02-16T10:12:17.746Z" },
+    { url = "https://files.pythonhosted.org/packages/8d/1b/6da9a89583ce7b23ac611f183ae4843cd3a6cf54f079549b0e8c14031e73/pyarrow-23.0.1-cp314-cp314-macosx_12_0_arm64.whl", hash = "sha256:5df1161da23636a70838099d4aaa65142777185cc0cdba4037a18cee7d8db9ca", size = 34238755, upload-time = "2026-02-16T10:12:32.819Z" },
+    { url = "https://files.pythonhosted.org/packages/ae/b5/d58a241fbe324dbaeb8df07be6af8752c846192d78d2272e551098f74e88/pyarrow-23.0.1-cp314-cp314-macosx_12_0_x86_64.whl", hash = "sha256:fa8e51cb04b9f8c9c5ace6bab63af9a1f88d35c0d6cbf53e8c17c098552285e1", size = 35847826, upload-time = "2026-02-16T10:12:38.949Z" },
+    { url = "https://files.pythonhosted.org/packages/54/a5/8cbc83f04aba433ca7b331b38f39e000efd9f0c7ce47128670e737542996/pyarrow-23.0.1-cp314-cp314-manylinux_2_28_aarch64.whl", hash = "sha256:0b95a3994f015be13c63148fef8832e8a23938128c185ee951c98908a696e0eb", size = 44536859, upload-time = "2026-02-16T10:12:45.467Z" },
+    { url = "https://files.pythonhosted.org/packages/36/2e/c0f017c405fcdc252dbccafbe05e36b0d0eb1ea9a958f081e01c6972927f/pyarrow-23.0.1-cp314-cp314-manylinux_2_28_x86_64.whl", hash = "sha256:4982d71350b1a6e5cfe1af742c53dfb759b11ce14141870d05d9e540d13bc5d1", size = 47614443, upload-time = "2026-02-16T10:12:55.525Z" },
+    { url = "https://files.pythonhosted.org/packages/af/6b/2314a78057912f5627afa13ba43809d9d653e6630859618b0fd81a4e0759/pyarrow-23.0.1-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:c250248f1fe266db627921c89b47b7c06fee0489ad95b04d50353537d74d6886", size = 48232991, upload-time = "2026-02-16T10:13:04.729Z" },
+    { url = "https://files.pythonhosted.org/packages/40/f2/1bcb1d3be3460832ef3370d621142216e15a2c7c62602a4ea19ec240dd64/pyarrow-23.0.1-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:5f4763b83c11c16e5f4c15601ba6dfa849e20723b46aa2617cb4bffe8768479f", size = 50645077, upload-time = "2026-02-16T10:13:14.147Z" },
+    { url = "https://files.pythonhosted.org/packages/eb/3f/b1da7b61cd66566a4d4c8383d376c606d1c34a906c3f1cb35c479f59d1aa/pyarrow-23.0.1-cp314-cp314-win_amd64.whl", hash = "sha256:3a4c85ef66c134161987c17b147d6bffdca4566f9a4c1d81a0a01cdf08414ea5", size = 28234271, upload-time = "2026-02-16T10:14:09.397Z" },
+    { url = "https://files.pythonhosted.org/packages/b5/78/07f67434e910a0f7323269be7bfbf58699bd0c1d080b18a1ab49ba943fe8/pyarrow-23.0.1-cp314-cp314t-macosx_12_0_arm64.whl", hash = "sha256:17cd28e906c18af486a499422740298c52d7c6795344ea5002a7720b4eadf16d", size = 34488692, upload-time = "2026-02-16T10:13:21.541Z" },
+    { url = "https://files.pythonhosted.org/packages/50/76/34cf7ae93ece1f740a04910d9f7e80ba166b9b4ab9596a953e9e62b90fe1/pyarrow-23.0.1-cp314-cp314t-macosx_12_0_x86_64.whl", hash = "sha256:76e823d0e86b4fb5e1cf4a58d293036e678b5a4b03539be933d3b31f9406859f", size = 35964383, upload-time = "2026-02-16T10:13:28.63Z" },
+    { url = "https://files.pythonhosted.org/packages/46/90/459b827238936d4244214be7c684e1b366a63f8c78c380807ae25ed92199/pyarrow-23.0.1-cp314-cp314t-manylinux_2_28_aarch64.whl", hash = "sha256:a62e1899e3078bf65943078b3ad2a6ddcacf2373bc06379aac61b1e548a75814", size = 44538119, upload-time = "2026-02-16T10:13:35.506Z" },
+    { url = "https://files.pythonhosted.org/packages/28/a1/93a71ae5881e99d1f9de1d4554a87be37da11cd6b152239fb5bd924fdc64/pyarrow-23.0.1-cp314-cp314t-manylinux_2_28_x86_64.whl", hash = "sha256:df088e8f640c9fae3b1f495b3c64755c4e719091caf250f3a74d095ddf3c836d", size = 47571199, upload-time = "2026-02-16T10:13:42.504Z" },
+    { url = "https://files.pythonhosted.org/packages/88/a3/d2c462d4ef313521eaf2eff04d204ac60775263f1fb08c374b543f79f610/pyarrow-23.0.1-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:46718a220d64677c93bc243af1d44b55998255427588e400677d7192671845c7", size = 48259435, upload-time = "2026-02-16T10:13:49.226Z" },
+    { url = "https://files.pythonhosted.org/packages/cc/f1/11a544b8c3d38a759eb3fbb022039117fd633e9a7b19e4841cc3da091915/pyarrow-23.0.1-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:a09f3876e87f48bc2f13583ab551f0379e5dfb83210391e68ace404181a20690", size = 50629149, upload-time = "2026-02-16T10:13:57.238Z" },
+    { url = "https://files.pythonhosted.org/packages/50/f2/c0e76a0b451ffdf0cf788932e182758eb7558953f4f27f1aff8e2518b653/pyarrow-23.0.1-cp314-cp314t-win_amd64.whl", hash = "sha256:527e8d899f14bd15b740cd5a54ad56b7f98044955373a17179d5956ddb93d9ce", size = 28365807, upload-time = "2026-02-16T10:14:03.892Z" },
 ]

 [[package]]
@@ -4684,7 +4741,7 @@ wheels = [

 [[package]]
 name = "pydantic"
-version = "2.13.3"
+version = "2.13.2"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "annotated-types" },
@@ -4692,84 +4749,84 @@ dependencies = [
    { name = "typing-extensions" },
    { name = "typing-inspection" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/d9/e4/40d09941a2cebcb20609b86a559817d5b9291c49dd6f8c87e5feffbe703a/pydantic-2.13.3.tar.gz", hash = "sha256:af09e9d1d09f4e7fe37145c1f577e1d61ceb9a41924bf0094a36506285d0a84d", size = 844068, upload-time = "2026-04-20T14:46:43.632Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/09/e5/06d23afac9973109d1e3c8ad38e1547a12e860610e327c05ee686827dc37/pydantic-2.13.2.tar.gz", hash = "sha256:b418196607e61081c3226dcd4f0672f2a194828abb9109e9cfb84026564df2d1", size = 843836, upload-time = "2026-04-17T09:31:59.636Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/f3/0a/fd7d723f8f8153418fb40cf9c940e82004fce7e987026b08a68a36dd3fe7/pydantic-2.13.3-py3-none-any.whl", hash = "sha256:6db14ac8dfc9a1e57f87ea2c0de670c251240f43cb0c30a5130e9720dc612927", size = 471981, upload-time = "2026-04-20T14:46:41.402Z" },
+    { url = "https://files.pythonhosted.org/packages/77/ca/b45c378e6e8d0b90577288b533e04e95b7afd61bb1d51b6c263176435489/pydantic-2.13.2-py3-none-any.whl", hash = "sha256:a525087f4c03d7e7456a3de89b64cd693d2229933bb1068b9af6befd5563694e", size = 471947, upload-time = "2026-04-17T09:31:57.541Z" },
 ]

 [[package]]
 name = "pydantic-core"
-version = "2.46.3"
+version = "2.46.2"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "typing-extensions" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/2a/ef/f7abb56c49382a246fd2ce9c799691e3c3e7175ec74b14d99e798bcddb1a/pydantic_core-2.46.3.tar.gz", hash = "sha256:41c178f65b8c29807239d47e6050262eb6bf84eb695e41101e62e38df4a5bc2c", size = 471412, upload-time = "2026-04-20T14:40:56.672Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/43/bb/4742f05b739b2478459bb16fa8470549518c802e06ddcf3f106c5081315e/pydantic_core-2.46.2.tar.gz", hash = "sha256:37bb079f9ee3f1a519392b73fda2a96379b31f2013c6b467fe693e7f2987f596", size = 471269, upload-time = "2026-04-17T09:10:07.017Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/4b/cb/5b47425556ecc1f3fe18ed2a0083188aa46e1dd812b06e406475b3a5d536/pydantic_core-2.46.3-cp312-cp312-macosx_10_12_x86_64.whl", hash = "sha256:b11b59b3eee90a80a36701ddb4576d9ae31f93f05cb9e277ceaa09e6bf074a67", size = 2101946, upload-time = "2026-04-20T14:40:52.581Z" },
-    { url = "https://files.pythonhosted.org/packages/a1/4f/2fb62c2267cae99b815bbf4a7b9283812c88ca3153ef29f7707200f1d4e5/pydantic_core-2.46.3-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:af8653713055ea18a3abc1537fe2ebc42f5b0bbb768d1eb79fd74eb47c0ac089", size = 1951612, upload-time = "2026-04-20T14:42:42.996Z" },
-    { url = "https://files.pythonhosted.org/packages/50/6e/b7348fd30d6556d132cddd5bd79f37f96f2601fe0608afac4f5fb01ec0b3/pydantic_core-2.46.3-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:75a519dab6d63c514f3a81053e5266c549679e4aa88f6ec57f2b7b854aceb1b0", size = 1977027, upload-time = "2026-04-20T14:42:02.001Z" },
-    { url = "https://files.pythonhosted.org/packages/82/11/31d60ee2b45540d3fb0b29302a393dbc01cd771c473f5b5147bcd353e593/pydantic_core-2.46.3-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:a6cd87cb1575b1ad05ba98894c5b5c96411ef678fa2f6ed2576607095b8d9789", size = 2063008, upload-time = "2026-04-20T14:44:17.952Z" },
-    { url = "https://files.pythonhosted.org/packages/8a/db/3a9d1957181b59258f44a2300ab0f0be9d1e12d662a4f57bb31250455c52/pydantic_core-2.46.3-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:f80a55484b8d843c8ada81ebf70a682f3f00a3d40e378c06cf17ecb44d280d7d", size = 2233082, upload-time = "2026-04-20T14:40:57.934Z" },
-    { url = "https://files.pythonhosted.org/packages/9c/e1/3277c38792aeb5cfb18c2f0c5785a221d9ff4e149abbe1184d53d5f72273/pydantic_core-2.46.3-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:3861f1731b90c50a3266316b9044f5c9b405eecb8e299b0a7120596334e4fe9c", size = 2304615, upload-time = "2026-04-20T14:42:12.584Z" },
-    { url = "https://files.pythonhosted.org/packages/5e/d5/e3d9717c9eba10855325650afd2a9cba8e607321697f18953af9d562da2f/pydantic_core-2.46.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:fb528e295ed31570ac3dcc9bfdd6e0150bc11ce6168ac87a8082055cf1a67395", size = 2094380, upload-time = "2026-04-20T14:43:05.522Z" },
-    { url = "https://files.pythonhosted.org/packages/a1/20/abac35dedcbfd66c6f0b03e4e3564511771d6c9b7ede10a362d03e110d9b/pydantic_core-2.46.3-cp312-cp312-manylinux_2_31_riscv64.whl", hash = "sha256:367508faa4973b992b271ba1494acaab36eb7e8739d1e47be5035fb1ea225396", size = 2135429, upload-time = "2026-04-20T14:41:55.549Z" },
-    { url = "https://files.pythonhosted.org/packages/6c/a5/41bfd1df69afad71b5cf0535055bccc73022715ad362edbc124bc1e021d7/pydantic_core-2.46.3-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:5ad3c826fe523e4becf4fe39baa44286cff85ef137c729a2c5e269afbfd0905d", size = 2174582, upload-time = "2026-04-20T14:41:45.96Z" },
-    { url = "https://files.pythonhosted.org/packages/79/65/38d86ea056b29b2b10734eb23329b7a7672ca604df4f2b6e9c02d4ee22fe/pydantic_core-2.46.3-cp312-cp312-musllinux_1_1_aarch64.whl", hash = "sha256:ec638c5d194ef8af27db69f16c954a09797c0dc25015ad6123eb2c73a4d271ca", size = 2187533, upload-time = "2026-04-20T14:40:55.367Z" },
-    { url = "https://files.pythonhosted.org/packages/b6/55/a1129141678a2026badc539ad1dee0a71d06f54c2f06a4bd68c030ac781b/pydantic_core-2.46.3-cp312-cp312-musllinux_1_1_armv7l.whl", hash = "sha256:28ed528c45446062ee66edb1d33df5d88828ae167de76e773a3c7f64bd14e976", size = 2332985, upload-time = "2026-04-20T14:44:13.05Z" },
-    { url = "https://files.pythonhosted.org/packages/d7/60/cb26f4077719f709e54819f4e8e1d43f4091f94e285eb6bd21e1190a7b7c/pydantic_core-2.46.3-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:aed19d0c783886d5bd86d80ae5030006b45e28464218747dcf83dabfdd092c7b", size = 2373670, upload-time = "2026-04-20T14:41:53.421Z" },
-    { url = "https://files.pythonhosted.org/packages/6b/7e/c3f21882bdf1d8d086876f81b5e296206c69c6082551d776895de7801fa0/pydantic_core-2.46.3-cp312-cp312-win32.whl", hash = "sha256:06d5d8820cbbdb4147578c1fe7ffcd5b83f34508cb9f9ab76e807be7db6ff0a4", size = 1966722, upload-time = "2026-04-20T14:44:30.588Z" },
-    { url = "https://files.pythonhosted.org/packages/57/be/6b5e757b859013ebfbd7adba02f23b428f37c86dcbf78b5bb0b4ffd36e99/pydantic_core-2.46.3-cp312-cp312-win_amd64.whl", hash = "sha256:c3212fda0ee959c1dd04c60b601ec31097aaa893573a3a1abd0a47bcac2968c1", size = 2072970, upload-time = "2026-04-20T14:42:54.248Z" },
-    { url = "https://files.pythonhosted.org/packages/bf/f8/a989b21cc75e9a32d24192ef700eea606521221a89faa40c919ce884f2b1/pydantic_core-2.46.3-cp312-cp312-win_arm64.whl", hash = "sha256:f1f8338dd7a7f31761f1f1a3c47503a9a3b34eea3c8b01fa6ee96408affb5e72", size = 2035963, upload-time = "2026-04-20T14:44:20.4Z" },
-    { url = "https://files.pythonhosted.org/packages/9b/3c/9b5e8eb9821936d065439c3b0fb1490ffa64163bfe7e1595985a47896073/pydantic_core-2.46.3-cp313-cp313-macosx_10_12_x86_64.whl", hash = "sha256:12bc98de041458b80c86c56b24df1d23832f3e166cbaff011f25d187f5c62c37", size = 2102109, upload-time = "2026-04-20T14:41:24.219Z" },
-    { url = "https://files.pythonhosted.org/packages/91/97/1c41d1f5a19f241d8069f1e249853bcce378cdb76eec8ab636d7bc426280/pydantic_core-2.46.3-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:85348b8f89d2c3508b65b16c3c33a4da22b8215138d8b996912bb1532868885f", size = 1951820, upload-time = "2026-04-20T14:42:14.236Z" },
-    { url = "https://files.pythonhosted.org/packages/30/b4/d03a7ae14571bc2b6b3c7b122441154720619afe9a336fa3a95434df5e2f/pydantic_core-2.46.3-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:1105677a6df914b1fb71a81b96c8cce7726857e1717d86001f29be06a25ee6f8", size = 1977785, upload-time = "2026-04-20T14:42:31.648Z" },
-    { url = "https://files.pythonhosted.org/packages/ae/0c/4086f808834b59e3c8f1aa26df8f4b6d998cdcf354a143d18ef41529d1fe/pydantic_core-2.46.3-cp313-cp313-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:87082cd65669a33adeba5470769e9704c7cf026cc30afb9cc77fd865578ebaad", size = 2062761, upload-time = "2026-04-20T14:40:37.093Z" },
-    { url = "https://files.pythonhosted.org/packages/fa/71/a649be5a5064c2df0db06e0a512c2281134ed2fcc981f52a657936a7527c/pydantic_core-2.46.3-cp313-cp313-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:60e5f66e12c4f5212d08522963380eaaeac5ebd795826cfd19b2dfb0c7a52b9c", size = 2232989, upload-time = "2026-04-20T14:42:59.254Z" },
-    { url = "https://files.pythonhosted.org/packages/a2/84/7756e75763e810b3a710f4724441d1ecc5883b94aacb07ca71c5fb5cfb69/pydantic_core-2.46.3-cp313-cp313-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:b6cdf19bf84128d5e7c37e8a73a0c5c10d51103a650ac585d42dd6ae233f2b7f", size = 2303975, upload-time = "2026-04-20T14:41:32.287Z" },
-    { url = "https://files.pythonhosted.org/packages/6c/35/68a762e0c1e31f35fa0dac733cbd9f5b118042853698de9509c8e5bf128b/pydantic_core-2.46.3-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:031bb17f4885a43773c8c763089499f242aee2ea85cf17154168775dccdecf35", size = 2095325, upload-time = "2026-04-20T14:42:47.685Z" },
-    { url = "https://files.pythonhosted.org/packages/77/bf/1bf8c9a8e91836c926eae5e3e51dce009bf495a60ca56060689d3df3f340/pydantic_core-2.46.3-cp313-cp313-manylinux_2_31_riscv64.whl", hash = "sha256:bcf2a8b2982a6673693eae7348ef3d8cf3979c1d63b54fca7c397a635cc68687", size = 2133368, upload-time = "2026-04-20T14:41:22.766Z" },
-    { url = "https://files.pythonhosted.org/packages/e5/50/87d818d6bab915984995157ceb2380f5aac4e563dddbed6b56f0ed057aba/pydantic_core-2.46.3-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:28e8cf2f52d72ced402a137145923a762cbb5081e48b34312f7a0c8f55928ec3", size = 2173908, upload-time = "2026-04-20T14:42:52.044Z" },
-    { url = "https://files.pythonhosted.org/packages/91/88/a311fb306d0bd6185db41fa14ae888fb81d0baf648a761ae760d30819d33/pydantic_core-2.46.3-cp313-cp313-musllinux_1_1_aarch64.whl", hash = "sha256:17eaface65d9fc5abb940003020309c1bf7a211f5f608d7870297c367e6f9022", size = 2186422, upload-time = "2026-04-20T14:43:29.55Z" },
-    { url = "https://files.pythonhosted.org/packages/8f/79/28fd0d81508525ab2054fef7c77a638c8b5b0afcbbaeee493cf7c3fef7e1/pydantic_core-2.46.3-cp313-cp313-musllinux_1_1_armv7l.whl", hash = "sha256:93fd339f23408a07e98950a89644f92c54d8729719a40b30c0a30bb9ebc55d23", size = 2332709, upload-time = "2026-04-20T14:42:16.134Z" },
-    { url = "https://files.pythonhosted.org/packages/b3/21/795bf5fe5c0f379308b8ef19c50dedab2e7711dbc8d0c2acf08f1c7daa05/pydantic_core-2.46.3-cp313-cp313-musllinux_1_1_x86_64.whl", hash = "sha256:23cbdb3aaa74dfe0837975dbf69b469753bbde8eacace524519ffdb6b6e89eb7", size = 2372428, upload-time = "2026-04-20T14:41:10.974Z" },
-    { url = "https://files.pythonhosted.org/packages/45/b3/ed14c659cbe7605e3ef063077680a64680aec81eb1a04763a05190d49b7f/pydantic_core-2.46.3-cp313-cp313-win32.whl", hash = "sha256:610eda2e3838f401105e6326ca304f5da1e15393ae25dacae5c5c63f2c275b13", size = 1965601, upload-time = "2026-04-20T14:41:42.128Z" },
-    { url = "https://files.pythonhosted.org/packages/ef/bb/adb70d9a762ddd002d723fbf1bd492244d37da41e3af7b74ad212609027e/pydantic_core-2.46.3-cp313-cp313-win_amd64.whl", hash = "sha256:68cc7866ed863db34351294187f9b729964c371ba33e31c26f478471c52e1ed0", size = 2071517, upload-time = "2026-04-20T14:43:36.096Z" },
-    { url = "https://files.pythonhosted.org/packages/52/eb/66faefabebfe68bd7788339c9c9127231e680b11906368c67ce112fdb47f/pydantic_core-2.46.3-cp313-cp313-win_arm64.whl", hash = "sha256:f64b5537ac62b231572879cd08ec05600308636a5d63bcbdb15063a466977bec", size = 2035802, upload-time = "2026-04-20T14:43:38.507Z" },
-    { url = "https://files.pythonhosted.org/packages/7f/db/a7bcb4940183fda36022cd18ba8dd12f2dff40740ec7b58ce7457befa416/pydantic_core-2.46.3-cp314-cp314-macosx_10_12_x86_64.whl", hash = "sha256:afa3aa644f74e290cdede48a7b0bee37d1c35e71b05105f6b340d484af536d9b", size = 2097614, upload-time = "2026-04-20T14:44:38.374Z" },
-    { url = "https://files.pythonhosted.org/packages/24/35/e4066358a22e3e99519db370494c7528f5a2aa1367370e80e27e20283543/pydantic_core-2.46.3-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:ced3310e51aa425f7f77da8bbbb5212616655bedbe82c70944320bc1dbe5e018", size = 1951896, upload-time = "2026-04-20T14:40:53.996Z" },
-    { url = "https://files.pythonhosted.org/packages/87/92/37cf4049d1636996e4b888c05a501f40a43ff218983a551d57f9d5e14f0d/pydantic_core-2.46.3-cp314-cp314-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e29908922ce9da1a30b4da490bd1d3d82c01dcfdf864d2a74aacee674d0bfa34", size = 1979314, upload-time = "2026-04-20T14:41:49.446Z" },
-    { url = "https://files.pythonhosted.org/packages/d8/36/9ff4d676dfbdfb2d591cf43f3d90ded01e15b1404fd101180ed2d62a2fd3/pydantic_core-2.46.3-cp314-cp314-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:0c9ff69140423eea8ed2d5477df3ba037f671f5e897d206d921bc9fdc39613e7", size = 2056133, upload-time = "2026-04-20T14:42:23.574Z" },
-    { url = "https://files.pythonhosted.org/packages/bc/f0/405b442a4d7ba855b06eec8b2bf9c617d43b8432d099dfdc7bf999293495/pydantic_core-2.46.3-cp314-cp314-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:b675ab0a0d5b1c8fdb81195dc5bcefea3f3c240871cdd7ff9a2de8aa50772eb2", size = 2228726, upload-time = "2026-04-20T14:44:22.816Z" },
-    { url = "https://files.pythonhosted.org/packages/e7/f8/65cd92dd5a0bd89ba277a98ecbfaf6fc36bbd3300973c7a4b826d6ab1391/pydantic_core-2.46.3-cp314-cp314-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:0087084960f209a9a4af50ecd1fb063d9ad3658c07bb81a7a53f452dacbfb2ba", size = 2301214, upload-time = "2026-04-20T14:44:48.792Z" },
-    { url = "https://files.pythonhosted.org/packages/fd/86/ef96a4c6e79e7a2d0410826a68fbc0eccc0fd44aa733be199d5fcac3bb87/pydantic_core-2.46.3-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:ed42e6cc8e1b0e2b9b96e2276bad70ae625d10d6d524aed0c93de974ae029f9f", size = 2099927, upload-time = "2026-04-20T14:41:40.196Z" },
-    { url = "https://files.pythonhosted.org/packages/6d/53/269caf30e0096e0a8a8f929d1982a27b3879872cca2d917d17c2f9fdf4fe/pydantic_core-2.46.3-cp314-cp314-manylinux_2_31_riscv64.whl", hash = "sha256:f1771ce258afb3e4201e67d154edbbae712a76a6081079fe247c2f53c6322c22", size = 2128789, upload-time = "2026-04-20T14:41:15.868Z" },
-    { url = "https://files.pythonhosted.org/packages/00/b0/1a6d9b6a587e118482910c244a1c5acf4d192604174132efd12bf0ac486f/pydantic_core-2.46.3-cp314-cp314-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:a7610b6a5242a6c736d8ad47fd5fff87fcfe8f833b281b1c409c3d6835d9227f", size = 2173815, upload-time = "2026-04-20T14:44:25.152Z" },
-    { url = "https://files.pythonhosted.org/packages/87/56/e7e00d4041a7e62b5a40815590114db3b535bf3ca0bf4dca9f16cef25246/pydantic_core-2.46.3-cp314-cp314-musllinux_1_1_aarch64.whl", hash = "sha256:ff5e7783bcc5476e1db448bf268f11cb257b1c276d3e89f00b5727be86dd0127", size = 2181608, upload-time = "2026-04-20T14:41:28.933Z" },
-    { url = "https://files.pythonhosted.org/packages/e8/22/4bd23c3d41f7c185d60808a1de83c76cf5aeabf792f6c636a55c3b1ec7f9/pydantic_core-2.46.3-cp314-cp314-musllinux_1_1_armv7l.whl", hash = "sha256:9d2e32edcc143bc01e95300671915d9ca052d4f745aa0a49c48d4803f8a85f2c", size = 2326968, upload-time = "2026-04-20T14:42:03.962Z" },
-    { url = "https://files.pythonhosted.org/packages/24/ac/66cd45129e3915e5ade3b292cb3bc7fd537f58f8f8dbdaba6170f7cabb74/pydantic_core-2.46.3-cp314-cp314-musllinux_1_1_x86_64.whl", hash = "sha256:6e42d83d1c6b87fa56b521479cff237e626a292f3b31b6345c15a99121b454c1", size = 2369842, upload-time = "2026-04-20T14:41:35.52Z" },
-    { url = "https://files.pythonhosted.org/packages/a2/51/dd4248abb84113615473aa20d5545b7c4cd73c8644003b5259686f93996c/pydantic_core-2.46.3-cp314-cp314-win32.whl", hash = "sha256:07bc6d2a28c3adb4f7c6ae46aa4f2d2929af127f587ed44057af50bf1ce0f505", size = 1959661, upload-time = "2026-04-20T14:41:00.042Z" },
-    { url = "https://files.pythonhosted.org/packages/20/eb/59980e5f1ae54a3b86372bd9f0fa373ea2d402e8cdcd3459334430f91e91/pydantic_core-2.46.3-cp314-cp314-win_amd64.whl", hash = "sha256:8940562319bc621da30714617e6a7eaa6b98c84e8c685bcdc02d7ed5e7c7c44e", size = 2071686, upload-time = "2026-04-20T14:43:16.471Z" },
-    { url = "https://files.pythonhosted.org/packages/8c/db/1cf77e5247047dfee34bc01fa9bca134854f528c8eb053e144298893d370/pydantic_core-2.46.3-cp314-cp314-win_arm64.whl", hash = "sha256:5dcbbcf4d22210ced8f837c96db941bdb078f419543472aca5d9a0bb7cddc7df", size = 2026907, upload-time = "2026-04-20T14:43:31.732Z" },
-    { url = "https://files.pythonhosted.org/packages/57/c0/b3df9f6a543276eadba0a48487b082ca1f201745329d97dbfa287034a230/pydantic_core-2.46.3-cp314-cp314t-macosx_10_12_x86_64.whl", hash = "sha256:d0fe3dce1e836e418f912c1ad91c73357d03e556a4d286f441bf34fed2dbeecf", size = 2095047, upload-time = "2026-04-20T14:42:37.982Z" },
-    { url = "https://files.pythonhosted.org/packages/66/57/886a938073b97556c168fd99e1a7305bb363cd30a6d2c76086bf0587b32a/pydantic_core-2.46.3-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:9ce92e58abc722dac1bf835a6798a60b294e48eb0e625ec9fd994b932ac5feee", size = 1934329, upload-time = "2026-04-20T14:43:49.655Z" },
-    { url = "https://files.pythonhosted.org/packages/0b/7c/b42eaa5c34b13b07ecb51da21761297a9b8eb43044c864a035999998f328/pydantic_core-2.46.3-cp314-cp314t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a03e6467f0f5ab796a486146d1b887b2dc5e5f9b3288898c1b1c3ad974e53e4a", size = 1974847, upload-time = "2026-04-20T14:42:10.737Z" },
-    { url = "https://files.pythonhosted.org/packages/e6/9b/92b42db6543e7de4f99ae977101a2967b63122d4b6cf7773812da2d7d5b5/pydantic_core-2.46.3-cp314-cp314t-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:2798b6ba041b9d70acfb9071a2ea13c8456dd1e6a5555798e41ba7b0790e329c", size = 2041742, upload-time = "2026-04-20T14:40:44.262Z" },
-    { url = "https://files.pythonhosted.org/packages/0f/19/46fbe1efabb5aa2834b43b9454e70f9a83ad9c338c1291e48bdc4fecf167/pydantic_core-2.46.3-cp314-cp314t-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:9be3e221bdc6d69abf294dcf7aff6af19c31a5cdcc8f0aa3b14be29df4bd03b1", size = 2236235, upload-time = "2026-04-20T14:41:27.307Z" },
-    { url = "https://files.pythonhosted.org/packages/77/da/b3f95bc009ad60ec53120f5d16c6faa8cabdbe8a20d83849a1f2b8728148/pydantic_core-2.46.3-cp314-cp314t-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:f13936129ce841f2a5ddf6f126fea3c43cd128807b5a59588c37cf10178c2e64", size = 2282633, upload-time = "2026-04-20T14:44:33.271Z" },
-    { url = "https://files.pythonhosted.org/packages/cc/6e/401336117722e28f32fb8220df676769d28ebdf08f2f4469646d404c43a3/pydantic_core-2.46.3-cp314-cp314t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:28b5f2ef03416facccb1c6ef744c69793175fd27e44ef15669201601cf423acb", size = 2109679, upload-time = "2026-04-20T14:44:41.065Z" },
-    { url = "https://files.pythonhosted.org/packages/fc/53/b289f9bc8756a32fe718c46f55afaeaf8d489ee18d1a1e7be1db73f42cc4/pydantic_core-2.46.3-cp314-cp314t-manylinux_2_31_riscv64.whl", hash = "sha256:830d1247d77ad23852314f069e9d7ddafeec5f684baf9d7e7065ed46a049c4e6", size = 2108342, upload-time = "2026-04-20T14:42:50.144Z" },
-    { url = "https://files.pythonhosted.org/packages/10/5b/8292fc7c1f9111f1b2b7c1b0dcf1179edcd014fc3ea4517499f50b829d71/pydantic_core-2.46.3-cp314-cp314t-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:d0793c90c1a3c74966e7975eaef3ed30ebdff3260a0f815a62a22adc17e4c01c", size = 2157208, upload-time = "2026-04-20T14:42:08.133Z" },
-    { url = "https://files.pythonhosted.org/packages/2b/9e/f80044e9ec07580f057a89fc131f78dda7a58751ddf52bbe05eaf31db50f/pydantic_core-2.46.3-cp314-cp314t-musllinux_1_1_aarch64.whl", hash = "sha256:d2d0aead851b66f5245ec0c4fb2612ef457f8bbafefdf65a2bf9d6bac6140f47", size = 2167237, upload-time = "2026-04-20T14:42:25.412Z" },
-    { url = "https://files.pythonhosted.org/packages/f8/84/6781a1b037f3b96be9227edbd1101f6d3946746056231bf4ac48cdff1a8d/pydantic_core-2.46.3-cp314-cp314t-musllinux_1_1_armv7l.whl", hash = "sha256:2f40e4246676beb31c5ce77c38a55ca4e465c6b38d11ea1bd935420568e0b1ab", size = 2312540, upload-time = "2026-04-20T14:40:40.313Z" },
-    { url = "https://files.pythonhosted.org/packages/3e/db/19c0839feeb728e7df03255581f198dfdf1c2aeb1e174a8420b63c5252e5/pydantic_core-2.46.3-cp314-cp314t-musllinux_1_1_x86_64.whl", hash = "sha256:cf489cf8986c543939aeee17a09c04d6ffb43bfef8ca16fcbcc5cfdcbed24dba", size = 2369556, upload-time = "2026-04-20T14:41:09.427Z" },
-    { url = "https://files.pythonhosted.org/packages/e0/15/3228774cb7cd45f5f721ddf1b2242747f4eb834d0c491f0c02d606f09fed/pydantic_core-2.46.3-cp314-cp314t-win32.whl", hash = "sha256:ffe0883b56cfc05798bf994164d2b2ff03efe2d22022a2bb080f3b626176dd56", size = 1949756, upload-time = "2026-04-20T14:41:25.717Z" },
-    { url = "https://files.pythonhosted.org/packages/b8/2a/c79cf53fd91e5a87e30d481809f52f9a60dd221e39de66455cf04deaad37/pydantic_core-2.46.3-cp314-cp314t-win_amd64.whl", hash = "sha256:706d9d0ce9cf4593d07270d8e9f53b161f90c57d315aeec4fb4fd7a8b10240d8", size = 2051305, upload-time = "2026-04-20T14:43:18.627Z" },
-    { url = "https://files.pythonhosted.org/packages/0b/db/d8182a7f1d9343a032265aae186eb063fe26ca4c40f256b21e8da4498e89/pydantic_core-2.46.3-cp314-cp314t-win_arm64.whl", hash = "sha256:77706aeb41df6a76568434701e0917da10692da28cb69d5fb6919ce5fdb07374", size = 2026310, upload-time = "2026-04-20T14:41:01.778Z" },
-    { url = "https://files.pythonhosted.org/packages/34/42/f426db557e8ab2791bc7562052299944a118655496fbff99914e564c0a94/pydantic_core-2.46.3-graalpy312-graalpy250_312_native-macosx_10_12_x86_64.whl", hash = "sha256:b12dd51f1187c2eb489af8e20f880362db98e954b54ab792fa5d92e8bcc6b803", size = 2091877, upload-time = "2026-04-20T14:43:27.091Z" },
-    { url = "https://files.pythonhosted.org/packages/5c/4f/86a832a9d14df58e663bfdf4627dc00d3317c2bd583c4fb23390b0f04b8e/pydantic_core-2.46.3-graalpy312-graalpy250_312_native-macosx_11_0_arm64.whl", hash = "sha256:f00a0961b125f1a47af7bcc17f00782e12f4cd056f83416006b30111d941dfa3", size = 1932428, upload-time = "2026-04-20T14:40:45.781Z" },
-    { url = "https://files.pythonhosted.org/packages/11/1a/fe857968954d93fb78e0d4b6df5c988c74c4aaa67181c60be7cfe327c0ca/pydantic_core-2.46.3-graalpy312-graalpy250_312_native-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:57697d7c056aca4bbb680200f96563e841a6386ac1129370a0102592f4dddff5", size = 1997550, upload-time = "2026-04-20T14:44:02.425Z" },
-    { url = "https://files.pythonhosted.org/packages/17/eb/9d89ad2d9b0ba8cd65393d434471621b98912abb10fbe1df08e480ba57b5/pydantic_core-2.46.3-graalpy312-graalpy250_312_native-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:fd35aa21299def8db7ef4fe5c4ff862941a9a158ca7b63d61e66fe67d30416b4", size = 2137657, upload-time = "2026-04-20T14:42:45.149Z" },
+    { url = "https://files.pythonhosted.org/packages/97/ec/2fafa4c86f5d2a69372c7cddef30925fd0e370b1efaf556609c1a0196d8a/pydantic_core-2.46.2-cp312-cp312-macosx_10_12_x86_64.whl", hash = "sha256:ea1ad8c89da31512fe2d249cf0638fb666925bda341901541bc5f3311c6fcc9e", size = 2101729, upload-time = "2026-04-17T09:12:30.042Z" },
+    { url = "https://files.pythonhosted.org/packages/cf/55/be5386c2c4b49af346e8a26b748194ff25757bbb6cf544130854e997af7a/pydantic_core-2.46.2-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:b308da17b92481e0587244631c5529e5d91d04cb2b08194825627b1eca28e21e", size = 1951546, upload-time = "2026-04-17T09:10:10.585Z" },
+    { url = "https://files.pythonhosted.org/packages/29/92/89e273a055ce440e6636c756379af35ad86da9d336a560049c3ba5e41c80/pydantic_core-2.46.2-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:d333a50bdd814a917d8d6a7ee35ba2395d53ddaa882613bc24e54a9d8b129095", size = 1976178, upload-time = "2026-04-17T09:11:49.619Z" },
+    { url = "https://files.pythonhosted.org/packages/91/b3/e4664469cf70c0cb0f7b2f5719d64e5968bb6f38217042c2afa3d3c4ba17/pydantic_core-2.46.2-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:1d00b99590c5bd1fabbc5d28b170923e32c1b1071b1f1de1851a4d14d89eb192", size = 2051697, upload-time = "2026-04-17T09:12:04.917Z" },
+    { url = "https://files.pythonhosted.org/packages/98/58/dbf68213ee06ce51cdd6d8c95f97980e646858c45bd96bd2dfb40433be73/pydantic_core-2.46.2-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:9f0e686960ffe9e65066395af856ac2d52c159043144433602c50c221d81c1ba", size = 2233160, upload-time = "2026-04-17T09:12:00.956Z" },
+    { url = "https://files.pythonhosted.org/packages/f5/d3/68092aa0ee6c60ff4de4740eb82db3d4ce338ec89b3cecb978c532472f12/pydantic_core-2.46.2-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:2d1128da41c9cb474e0a4701f9c363ec645c9d1a02229904c76bf4e0a194fde2", size = 2298398, upload-time = "2026-04-17T09:10:29.694Z" },
+    { url = "https://files.pythonhosted.org/packages/e4/51/5d6155eb737db55b0ad354ca5f333ef009f75feb67df2d79a84bace45af6/pydantic_core-2.46.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:48649cf2d8c358d79586e9fb2f8235902fcaa2d969ec1c5301f2d1873b2f8321", size = 2094058, upload-time = "2026-04-17T09:12:10.995Z" },
+    { url = "https://files.pythonhosted.org/packages/6b/f3/eb4a986197d71319430464ff181226c95adc8f06d932189b158bae5a82f5/pydantic_core-2.46.2-cp312-cp312-manylinux_2_31_riscv64.whl", hash = "sha256:b902f0fc7c2cf503865a05718b68147c6cd5d0a3867af38c527be574a9fa6e9d", size = 2130388, upload-time = "2026-04-17T09:12:41.159Z" },
+    { url = "https://files.pythonhosted.org/packages/56/00/44a9c4fe6d0f64b5786d6a8c649d6f0e34ba6c89b3663add1066e54451a2/pydantic_core-2.46.2-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:e80011f808b03d1d87a8f1e76ae3da19a18eb706c823e17981dcf1fae43744fc", size = 2184245, upload-time = "2026-04-17T09:12:36.532Z" },
+    { url = "https://files.pythonhosted.org/packages/78/6b/685b98a834d5e3d1c34a1bde1627525559dd223b75075bc7490cdb24eb33/pydantic_core-2.46.2-cp312-cp312-musllinux_1_1_aarch64.whl", hash = "sha256:b839d5c802e31348b949b6473f8190cddbf7d47475856d8ac995a373ee16ec59", size = 2186842, upload-time = "2026-04-17T09:13:04.054Z" },
+    { url = "https://files.pythonhosted.org/packages/22/64/caa2f5a2ac8b6113adaa410ccdf31ba7f54897a6e54cd0d726fc7e780c88/pydantic_core-2.46.2-cp312-cp312-musllinux_1_1_armv7l.whl", hash = "sha256:c6b1064f3f9cf9072e1d59dd2936f9f3b668bec1c37039708c9222db703c0d5b", size = 2336066, upload-time = "2026-04-17T09:12:13.006Z" },
+    { url = "https://files.pythonhosted.org/packages/ee/f9/7d2701bf82945b5b9e7df8347be97ef6a36da2846bfe5b4afec299ffe27b/pydantic_core-2.46.2-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:37a68e6f2ac95578ce3c0564802404b27b24988649616e556c07e77111ed3f1d", size = 2363691, upload-time = "2026-04-17T09:13:42.972Z" },
+    { url = "https://files.pythonhosted.org/packages/3b/65/0dab11574101522941055109419db3cc09db871643dc3fc74e2413215e5b/pydantic_core-2.46.2-cp312-cp312-win32.whl", hash = "sha256:d9ffa75a7ef4b97d6e5e205fabd4304ef01fec09e6f1bdde04b9ad1b07d20289", size = 1958801, upload-time = "2026-04-17T09:11:31.981Z" },
+    { url = "https://files.pythonhosted.org/packages/13/2b/df84baa609c676f6450b8ecad44ea59146c805e3371b7b52443c0899f989/pydantic_core-2.46.2-cp312-cp312-win_amd64.whl", hash = "sha256:0551f2d2ddb68af5a00e26497f8025c538f73ef3cb698f8e5a487042cd2792a8", size = 2072634, upload-time = "2026-04-17T09:11:02.407Z" },
+    { url = "https://files.pythonhosted.org/packages/d1/4e/e1ce8029fc438086a946739bf9d596f70ff470aad4a8345555920618cabe/pydantic_core-2.46.2-cp312-cp312-win_arm64.whl", hash = "sha256:83aef30f106edcc21a6a4cc44b82d3169a1dbe255508db788e778f3c804d3583", size = 2026188, upload-time = "2026-04-17T09:13:11.083Z" },
+    { url = "https://files.pythonhosted.org/packages/07/2b/662e48254479a2d3450ba24b1e25061108b64339794232f503990c519144/pydantic_core-2.46.2-cp313-cp313-macosx_10_12_x86_64.whl", hash = "sha256:d26e9eea3715008a09a74585fe9becd0c67fbb145dc4df9756d597d7230a652c", size = 2101762, upload-time = "2026-04-17T09:10:13.87Z" },
+    { url = "https://files.pythonhosted.org/packages/73/ab/bafd7c7503757ccc8ec4d1911e106fe474c629443648c51a88f08b0fe91a/pydantic_core-2.46.2-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:48b36e3235140510dc7861f0cd58b714b1cdd3d48f75e10ce52e69866b746f10", size = 1951814, upload-time = "2026-04-17T09:12:25.934Z" },
+    { url = "https://files.pythonhosted.org/packages/92/cc/7549c2d57ba2e9a42caa5861a2d398dbe31c02c6aca783253ace59ce84f8/pydantic_core-2.46.2-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:36b1f99dc451f1a3981f236151465bcf995bbe712d0727c9f7b236fe228a8133", size = 1977329, upload-time = "2026-04-17T09:13:37.605Z" },
+    { url = "https://files.pythonhosted.org/packages/18/50/7ed4a8a0d478a4dca8f0134a5efa7193f03cc8520dd4c9509339fb2e5002/pydantic_core-2.46.2-cp313-cp313-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:8641c8d535c2d95b45c2e19b646ecd23ebba35d461e0ae48a3498277006250ab", size = 2051832, upload-time = "2026-04-17T09:12:49.771Z" },
+    { url = "https://files.pythonhosted.org/packages/dc/16/bb35b193741c0298ddc5f5e4234269efdc0c65e2bcd198aa0de9b68845e4/pydantic_core-2.46.2-cp313-cp313-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:20fb194788a0a50993e87013e693494ba183a2af5b44e99cf060bbae10912b11", size = 2233127, upload-time = "2026-04-17T09:11:04.449Z" },
+    { url = "https://files.pythonhosted.org/packages/91/a5/98f4b637149185addea19e1785ea20c373cca31b202f589111d8209d9873/pydantic_core-2.46.2-cp313-cp313-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:9262d11d0cd11ee3303a95156939402bed6cedfe5ed0e331b95a283a4da6eb8b", size = 2297418, upload-time = "2026-04-17T09:11:25.929Z" },
+    { url = "https://files.pythonhosted.org/packages/36/90/93a5d21990b152da7b7507b7fddb0b935f6a0984d57ac3ec45a6e17777a2/pydantic_core-2.46.2-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:ac204542736aa295fa25f713b7fad6fc50b46ab7764d16087575c85f085174f3", size = 2093735, upload-time = "2026-04-17T09:12:06.908Z" },
+    { url = "https://files.pythonhosted.org/packages/14/22/b8b1ffdddf08b4e84380bcb67f41dbbf4c171377c1d36fc6290794bb2094/pydantic_core-2.46.2-cp313-cp313-manylinux_2_31_riscv64.whl", hash = "sha256:9a7c43a0584742dface3ca0daf6f719d46c1ac2f87cf080050f9ae052c75e1b2", size = 2127570, upload-time = "2026-04-17T09:11:53.906Z" },
+    { url = "https://files.pythonhosted.org/packages/c6/26/e60d72b4e2d0ce1fa811044a974412ac1c567fe067d97b3e6b290530786e/pydantic_core-2.46.2-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:fd05e1edb6a90ad446fa268ab09e59202766b837597b714b2492db11ee87fab9", size = 2183524, upload-time = "2026-04-17T09:11:30.092Z" },
+    { url = "https://files.pythonhosted.org/packages/35/32/36bec7584a1eefb17dec4dfa1c946d3fe4440f466c5705b8adfda69c9a9f/pydantic_core-2.46.2-cp313-cp313-musllinux_1_1_aarch64.whl", hash = "sha256:91155b110788b5501abc7ea954f1d08606219e4e28e3c73a94124307c06efb80", size = 2185408, upload-time = "2026-04-17T09:10:57.228Z" },
+    { url = "https://files.pythonhosted.org/packages/fc/d6/1a5689d873620efd67d6b163db0c444c056adb0849b5bc33e2b9f09665a6/pydantic_core-2.46.2-cp313-cp313-musllinux_1_1_armv7l.whl", hash = "sha256:e4e2c72a529fa03ff228be1d2b76944013f428220b764e03cc50ada67e17a42c", size = 2335171, upload-time = "2026-04-17T09:11:43.369Z" },
+    { url = "https://files.pythonhosted.org/packages/3e/8e/675104802abe8ef502b072050ee5f2e915251aa1a3af87e1015ce31ec42d/pydantic_core-2.46.2-cp313-cp313-musllinux_1_1_x86_64.whl", hash = "sha256:56291ec1a11c3499890c99a8fd9053b47e60fe837a77ec72c0671b1b8b3dce24", size = 2362743, upload-time = "2026-04-17T09:10:18.333Z" },
+    { url = "https://files.pythonhosted.org/packages/8d/bc/86c5dde4fa6e24467680eef5047da3c1a19be0a527d0d8e14aa76b39307c/pydantic_core-2.46.2-cp313-cp313-win32.whl", hash = "sha256:b50f9c5f826ddca1246f055148df939f5f3f2d0d96db73de28e2233f22210d4c", size = 1958074, upload-time = "2026-04-17T09:12:38.622Z" },
+    { url = "https://files.pythonhosted.org/packages/2a/97/2537e8c1282b2c4eb062580c0d7a4339e10b072b803d1ee0b7f1f0a5c22c/pydantic_core-2.46.2-cp313-cp313-win_amd64.whl", hash = "sha256:251a57788823230ca8cbc99e6245d1a2ed6e180ec4864f251c94182c580c7f2e", size = 2071741, upload-time = "2026-04-17T09:13:32.405Z" },
+    { url = "https://files.pythonhosted.org/packages/da/aa/2ee75798706f9dbc4e76dbe59e41a396c5c311e3d6223b9cf6a5fa7780be/pydantic_core-2.46.2-cp313-cp313-win_arm64.whl", hash = "sha256:315d32d1a71494d6b4e1e14a9fa7a4329597b4c4340088ad7e1a9dafbeed92a9", size = 2025955, upload-time = "2026-04-17T09:10:15.567Z" },
+    { url = "https://files.pythonhosted.org/packages/d0/96/a50ccb6b539ae780f73cea74905468777680e30c6c3bdf714b9d4c116ea0/pydantic_core-2.46.2-cp314-cp314-macosx_10_12_x86_64.whl", hash = "sha256:4f59b45f3ef8650c0c736a57f59031d47ed9df4c0a64e83796849d7d14863a2d", size = 2097111, upload-time = "2026-04-17T09:10:49.617Z" },
+    { url = "https://files.pythonhosted.org/packages/34/5f/fdead7b3afa822ab6e5a18ee0ecffd54937de1877c01ed13a342e0fb3f07/pydantic_core-2.46.2-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:3a075a29ebef752784a91532a1a85be6b234ccffec0a9d7978a92696387c3da6", size = 1951904, upload-time = "2026-04-17T09:12:32.062Z" },
+    { url = "https://files.pythonhosted.org/packages/95/e0/1c5d547e550cdab1bec737492aa08865337af6fe7fc9b96f7f45f17d9519/pydantic_core-2.46.2-cp314-cp314-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:0d12d786e30c04a9d307c5d7080bf720d9bac7f1668191d8e37633a9562749e2", size = 1978667, upload-time = "2026-04-17T09:11:35.589Z" },
+    { url = "https://files.pythonhosted.org/packages/0e/cb/665ce629e218c8228302cb94beff4f6531082a2c87d3ecc3d5e63a26f392/pydantic_core-2.46.2-cp314-cp314-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:0d5e6d6343b0b5dcacb3503b5de90022968da8ed0ab9ab39d3eda71c20cbf84e", size = 2046721, upload-time = "2026-04-17T09:11:47.725Z" },
+    { url = "https://files.pythonhosted.org/packages/77/e9/6cb2cf60f54c1472bbdfce19d957553b43dbba79d1d7b2930a195c594785/pydantic_core-2.46.2-cp314-cp314-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:233eebac0999b6b9ba76eb56f3ec8fce13164aa16b6d2225a36a79e0f95b5973", size = 2228483, upload-time = "2026-04-17T09:12:08.837Z" },
+    { url = "https://files.pythonhosted.org/packages/0d/2a/93e018dd5571f781ebaeda8c0cf65398489d5bee9b1f484df0b6149b43b9/pydantic_core-2.46.2-cp314-cp314-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:9cc0eee720dd2f14f3b7c349469402b99ad81a174ab49d3533974529e9d93992", size = 2294663, upload-time = "2026-04-17T09:12:52.053Z" },
+    { url = "https://files.pythonhosted.org/packages/5e/4f/49e57ca55c770c93d9bb046666a54949b42e3c9099a0c5fe94557873fe30/pydantic_core-2.46.2-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:83ee76bf2c9910513dbc19e7d82367131fa7508dedd6186a462393071cc11059", size = 2098742, upload-time = "2026-04-17T09:13:45.472Z" },
+    { url = "https://files.pythonhosted.org/packages/c6/b0/6e46b5cd3332af665f794b8cdeea206618a8630bd9e7bcc36864518fce81/pydantic_core-2.46.2-cp314-cp314-manylinux_2_31_riscv64.whl", hash = "sha256:d61db38eb4ee5192f0c261b7f2d38e420b554df8912245e3546aee5c45e2fd78", size = 2125922, upload-time = "2026-04-17T09:12:54.304Z" },
+    { url = "https://files.pythonhosted.org/packages/06/d1/40850c81585be443a2abfdf7f795f8fae831baf8e2f9b2133c8246ac671c/pydantic_core-2.46.2-cp314-cp314-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:8f09a713d17bcd55da8ab02ebd9110c5246a49c44182af213b5212800af8bc83", size = 2183000, upload-time = "2026-04-17T09:10:59.027Z" },
+    { url = "https://files.pythonhosted.org/packages/04/af/8493d7dfa03ebb7866909e577c6aa65ea0de7377b86023cc51d0c8e11db3/pydantic_core-2.46.2-cp314-cp314-musllinux_1_1_aarch64.whl", hash = "sha256:30cacc5fb696e64b8ef6fd31d9549d394dd7d52760db072eecb98e37e3af1677", size = 2180335, upload-time = "2026-04-17T09:12:57.01Z" },
+    { url = "https://files.pythonhosted.org/packages/72/5b/1f6a344c4ffdf284da41c6067b82d5ebcbd11ce1b515ae4b662d4adb6f61/pydantic_core-2.46.2-cp314-cp314-musllinux_1_1_armv7l.whl", hash = "sha256:7ccfb105fcfe91a22bbb5563ad3dc124bc1aa75bfd2e53a780ab05f78cdf6108", size = 2330002, upload-time = "2026-04-17T09:12:02.958Z" },
+    { url = "https://files.pythonhosted.org/packages/25/ff/9a694126c12d6d2f48a0cafa6f8eef88ef0d8825600e18d03ff2e896c3b2/pydantic_core-2.46.2-cp314-cp314-musllinux_1_1_x86_64.whl", hash = "sha256:13ffef637dc8370c249e5b26bd18e9a80a4fca3d809618c44e18ec834a7ca7a8", size = 2359920, upload-time = "2026-04-17T09:10:27.764Z" },
+    { url = "https://files.pythonhosted.org/packages/51/c8/3a35c763d68a9cb2675eb10ef242cf66c5d4701b28ae12e688d67d2c180e/pydantic_core-2.46.2-cp314-cp314-win32.whl", hash = "sha256:1b0ab6d756ca2704a938e6c31b53f290c2f9c10d3914235410302a149de1a83e", size = 1953701, upload-time = "2026-04-17T09:13:30.021Z" },
+    { url = "https://files.pythonhosted.org/packages/1a/6a/f2726a780365f7dfd89d62036f984f7acb99978c60c5e1fa7c0cb898ed11/pydantic_core-2.46.2-cp314-cp314-win_amd64.whl", hash = "sha256:99ebade8c9ada4df975372d8dd25883daa0e379a05f1cd0c99aa0c04368d01a6", size = 2071867, upload-time = "2026-04-17T09:10:39.205Z" },
+    { url = "https://files.pythonhosted.org/packages/e1/79/76baacb9feba3d7c399b245ca1a29c74ea0db04ea693811374827eec2290/pydantic_core-2.46.2-cp314-cp314-win_arm64.whl", hash = "sha256:de87422197cf7f83db91d89c86a21660d749b3cd76cd8a45d115b8e675670f02", size = 2017252, upload-time = "2026-04-17T09:10:26.175Z" },
+    { url = "https://files.pythonhosted.org/packages/f1/3b/77c26938f817668d9ad9bab1a905cb23f11d9a3d4bf724d429b3e55a8eaf/pydantic_core-2.46.2-cp314-cp314t-macosx_10_12_x86_64.whl", hash = "sha256:236f22b4a206b5b61db955396b7cf9e2e1ff77f372efe9570128ccfcd6a525eb", size = 2094545, upload-time = "2026-04-17T09:12:19.339Z" },
+    { url = "https://files.pythonhosted.org/packages/fe/de/42c13f590e3c260966aa49bcdb1674774f975467c49abd51191e502bea28/pydantic_core-2.46.2-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:c2012f64d2cd7cca50f49f22445aa5a88691ac2b4498ee0a9a977f8ca4f7289f", size = 1933953, upload-time = "2026-04-17T09:09:55.889Z" },
+    { url = "https://files.pythonhosted.org/packages/4e/84/ebe3ebb3e2d8db656937cfa6f97f544cb7132f2307a4a7dfdcd0ea102a12/pydantic_core-2.46.2-cp314-cp314t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:d07d6c63106d3a9c9a333e2636f9c82c703b1a9e3b079299e58747964e4fdb72", size = 1974435, upload-time = "2026-04-17T09:10:12.371Z" },
+    { url = "https://files.pythonhosted.org/packages/b9/15/0bf51ca6709477cd4ef86148b6d7844f3308f029eac361dd0383f1e17b1a/pydantic_core-2.46.2-cp314-cp314t-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:c326a2b4b85e959d9a1fc3a11f32f84611b6ec07c053e1828a860edf8d068208", size = 2031113, upload-time = "2026-04-17T09:10:00.752Z" },
+    { url = "https://files.pythonhosted.org/packages/02/ae/b7b5af9b79db036d9e61a44c481c17a213dc8fc4b8b71fe6875a72fc778b/pydantic_core-2.46.2-cp314-cp314t-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:ac8a65e798f2462552c00d2e013d532c94d646729dda98458beaf51f9ec7b120", size = 2236325, upload-time = "2026-04-17T09:10:33.227Z" },
+    { url = "https://files.pythonhosted.org/packages/a6/ae/ecef7477b5a03d4a499708f7e75d2836452ebb70b776c2d64612b334f57a/pydantic_core-2.46.2-cp314-cp314t-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:5a3c2bc1cc8164bedbc160b7bb1e8cc1e8b9c27f69ae4f9ae2b976cdae02b2dd", size = 2278135, upload-time = "2026-04-17T09:10:23.287Z" },
+    { url = "https://files.pythonhosted.org/packages/db/e4/2f9d82faa47af6c39fc3f120145fd915971e1e0cb6b55b494fad9fdf8275/pydantic_core-2.46.2-cp314-cp314t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e69aa5e10b7e8b1bb4a6888650fd12fcbf11d396ca11d4a44de1450875702830", size = 2109071, upload-time = "2026-04-17T09:11:06.149Z" },
+    { url = "https://files.pythonhosted.org/packages/f1/9c/677cf10873fbd0b116575ab7b97c90482b21564f8a8040beb18edef7a577/pydantic_core-2.46.2-cp314-cp314t-manylinux_2_31_riscv64.whl", hash = "sha256:4e6df5c3301e65fb42bc5338bf9a1027a02b0a31dc7f54c33775229af474daf0", size = 2106028, upload-time = "2026-04-17T09:10:51.525Z" },
+    { url = "https://files.pythonhosted.org/packages/d6/53/6a06183544daba51c059123a2064a99039df25f115a06bdb26f2ea177038/pydantic_core-2.46.2-cp314-cp314t-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:2c2f6e32548ac8d559b47944effcf8ae4d81c161f6b6c885edc53bc08b8f192d", size = 2164816, upload-time = "2026-04-17T09:11:56.187Z" },
+    { url = "https://files.pythonhosted.org/packages/57/6f/10fcdd9e3eca66fc828eef0f6f5850f2dd3bca2c59e6e041fb8bc3da39be/pydantic_core-2.46.2-cp314-cp314t-musllinux_1_1_aarch64.whl", hash = "sha256:b089a81c58e6ea0485562bbbbbca4f65c0549521606d5ef27fba217aac9b665a", size = 2166130, upload-time = "2026-04-17T09:10:03.804Z" },
+    { url = "https://files.pythonhosted.org/packages/29/83/92d3fd0e0156cad2e3cb5c26de73794af78ac9fa0c22ab666e566dd67061/pydantic_core-2.46.2-cp314-cp314t-musllinux_1_1_armv7l.whl", hash = "sha256:7f700a6d6f64112ae9193709b84303bbab84424ad4b47d0253301aabce9dfc70", size = 2316605, upload-time = "2026-04-17T09:12:45.249Z" },
+    { url = "https://files.pythonhosted.org/packages/97/f1/facffdb970981068219582e499b8d0871ed163ffcc6b347de5c412669e4c/pydantic_core-2.46.2-cp314-cp314t-musllinux_1_1_x86_64.whl", hash = "sha256:67db6814beaa5fefe91101ec7eb9efda613795767be96f7cf58b1ca8c9ca9972", size = 2358385, upload-time = "2026-04-17T09:09:54.657Z" },
+    { url = "https://files.pythonhosted.org/packages/8b/a1/b8160b2f22b2199467bc68581a4ed380643c16b348a27d6165c6c242d694/pydantic_core-2.46.2-cp314-cp314t-win32.whl", hash = "sha256:32fbc7447be8e3be99bf7869f7066308f16be55b61f9882c2cefc7931f5c7664", size = 1942373, upload-time = "2026-04-17T09:12:59.594Z" },
+    { url = "https://files.pythonhosted.org/packages/0d/90/db89acabe5b150e11d1b59fe3d947dda2ef6abbfef5c82f056ff63802f5d/pydantic_core-2.46.2-cp314-cp314t-win_amd64.whl", hash = "sha256:b317a2b97019c0b95ce99f4f901ae383f40132da6706cdf1731066a73394c25c", size = 2052078, upload-time = "2026-04-17T09:10:19.96Z" },
+    { url = "https://files.pythonhosted.org/packages/97/32/e19b83ceb07a3f1bb21798407790bbc9a31740158fd132b94139cb84e16c/pydantic_core-2.46.2-cp314-cp314t-win_arm64.whl", hash = "sha256:7dcb9d40930dfad7ab6b20bcc6ca9d2b030b0f347a0cd9909b54bd53ead521b1", size = 2016941, upload-time = "2026-04-17T09:12:34.447Z" },
+    { url = "https://files.pythonhosted.org/packages/f3/d2/66c146f421178641bda880b0267c0d57dd84f5fec9ecc8e46be17b480742/pydantic_core-2.46.2-graalpy312-graalpy250_312_native-macosx_10_12_x86_64.whl", hash = "sha256:e9fcabd1857492b5bf16f90258babde50f618f55d046b1309972da2396321ff9", size = 2091621, upload-time = "2026-04-17T09:12:47.501Z" },
+    { url = "https://files.pythonhosted.org/packages/ee/b2/c28419aa9fc8055f4ac8e801d1d11c6357351bfa4321ed9bafab3eb98087/pydantic_core-2.46.2-graalpy312-graalpy250_312_native-macosx_11_0_arm64.whl", hash = "sha256:fb3ec2c7f54c07b30d89983ce78dc32c37dd06a972448b8716d609493802d628", size = 1937059, upload-time = "2026-04-17T09:10:53.554Z" },
+    { url = "https://files.pythonhosted.org/packages/30/ce/cd0824a2db213dc17113291b7a09b9b0ccd9fbf97daa4b81548703341baf/pydantic_core-2.46.2-graalpy312-graalpy250_312_native-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:130a6c837d819ef33e8c2bf702ed2c3429237ea69807f1140943d6f4bdaf52fa", size = 1997278, upload-time = "2026-04-17T09:12:23.784Z" },
+    { url = "https://files.pythonhosted.org/packages/c9/69/47283fe3c0c967d3e9e9cd6c42b70907610c8a6f8d6e8381f1bb55f8006c/pydantic_core-2.46.2-graalpy312-graalpy250_312_native-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:c2e25417cec5cd9bddb151e33cb08c50160f317479ecc02b22a95ec18f8fe004", size = 2147096, upload-time = "2026-04-17T09:12:43.124Z" },
 ]

 [[package]]
@@ -5912,7 +5969,7 @@ wheels = [

 [[package]]
 name = "teleop"
-version = "0.1.5"
+version = "0.1.4"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "fastapi" },
@@ -5923,9 +5980,9 @@ dependencies = [
    { name = "uvicorn", extra = ["standard"] },
    { name = "websocket-client" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/87/dc/312c19122c8e64fcff16dc8a74659b84ba8a7bcd3ef7b3c330cfc65a2a29/teleop-0.1.5.tar.gz", hash = "sha256:9f5367b167e0f67abe818f346c467671bd2c1ad653df604bdfb2fa69b2937da9", size = 44173, upload-time = "2026-04-19T21:17:42.795Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/ec/8c/902ef4c0fa148325e6b19a5af63c3aac5927c67551efabcd5732fc446c6d/teleop-0.1.4.tar.gz", hash = "sha256:b5cedcff336c612a3f7e6f93e379e24979ed42070903b722f5fefe07c8fca3ce", size = 44051, upload-time = "2025-12-08T10:49:45.823Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/f2/d1/45c79fcbf2551f2035c375e81d560c4ac46a5bbdb1622583b559eedcfc4e/teleop-0.1.5-py3-none-any.whl", hash = "sha256:75c3e63bb9eed1ea8ca32b48086cea45fa5ae3eb022dd0dcf0d615cf0b0d58dc", size = 42380, upload-time = "2026-04-19T21:17:41.386Z" },
+    { url = "https://files.pythonhosted.org/packages/b0/9c/217176617df23f634b0388111adbeb17ccb0409072639a97512e6c1c818d/teleop-0.1.4-py3-none-any.whl", hash = "sha256:6b8013947b27b89dbce50f9231a57d29f2e59ea864807b1ce6611ea3ad1694f4", size = 42332, upload-time = "2025-12-08T10:49:44.531Z" },
 ]

 [[package]]
@@ -6367,15 +6424,15 @@ wheels = [

 [[package]]
 name = "uvicorn"
-version = "0.45.0"
+version = "0.44.0"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "click" },
    { name = "h11" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/eb/2e/62b0d9a2cfc8b4de6771322dae30f2db76c66dae9ec32e94e176a44ad563/uvicorn-0.45.0.tar.gz", hash = "sha256:3fe650df136c5bd2b9b06efc5980636344a2fbb840e9ddd86437d53144fa335d", size = 87818, upload-time = "2026-04-21T10:43:46.815Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/5e/da/6eee1ff8b6cbeed47eeb5229749168e81eb4b7b999a1a15a7176e51410c9/uvicorn-0.44.0.tar.gz", hash = "sha256:6c942071b68f07e178264b9152f1f16dfac5da85880c4ce06366a96d70d4f31e", size = 86947, upload-time = "2026-04-06T09:23:22.826Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/c1/88/d0f7512465b166a4e931ccf7e77792be60fb88466a43964c7566cbaff752/uvicorn-0.45.0-py3-none-any.whl", hash = "sha256:2db26f588131aeec7439de00f2dd52d5f210710c1f01e407a52c90b880d1fd4f", size = 69838, upload-time = "2026-04-21T10:43:45.029Z" },
+    { url = "https://files.pythonhosted.org/packages/b7/23/a5bbd9600dd607411fa644c06ff4951bec3a4d82c4b852374024359c19c0/uvicorn-0.44.0-py3-none-any.whl", hash = "sha256:ce937c99a2cc70279556967274414c087888e8cec9f9c94644dfca11bd3ced89", size = 69425, upload-time = "2026-04-06T09:23:21.524Z" },
 ]

 [package.optional-dependencies]