diff --git a/docs/source/adding_benchmarks.mdx b/docs/source/adding_benchmarks.mdx
index 2d8ef6000..73a951276 100644
--- a/docs/source/adding_benchmarks.mdx
+++ b/docs/source/adding_benchmarks.mdx
@@ -186,7 +186,7 @@ Register a config dataclass so users can select your benchmark with `--env.type=
 ```python
 @EnvConfig.register_subclass("<benchmark_name>")
 @dataclass
-class MyBenchmarkEnv(EnvConfig):
+class MyBenchmarkEnvConfig(EnvConfig):
     task: str = "<default_task>"
     fps: int = <fps>
     obs_type: str = "pixels_agent_pos"
@@ -229,7 +229,7 @@ Key points:
 - `features_map` maps raw observation keys to LeRobot convention keys.
 - **No changes to `factory.py` needed** — the factory delegates to `cfg.create_envs()` and `cfg.get_env_processors()` automatically.
 
-### 3. Env processor (optional) (`src/lerobot/processor/env_processor.py`)
+### 3. Env processor (optional — `src/lerobot/processor/env_processor.py`)
 
 Only needed if your benchmark requires observation transforms beyond what `preprocess_observation()` handles (e.g. image flipping, coordinate conversion). Define the processor step here and return it from `get_env_processors()` in your config (see step 2):
 
@@ -293,6 +293,15 @@ Add your benchmark to the "Benchmarks" section:
   title: "Benchmarks"
 ```
 
+## Verifying your integration
+
+After completing the steps above, confirm that everything works:
+
+1. **Install** — `pip install -e ".[mybenchmark]"` and verify the dependency group installs cleanly.
+2. **Smoke test env creation** — call `make_env()` with your config in Python, then check that the returned dict has the expected `{suite: {task_id: VectorEnv}}` shape and that `reset()` returns observations with the right keys.
+3. **Run a full eval** — run `lerobot-eval --env.type=<name> --env.task=<task> --eval.n_episodes=1 --eval.batch_size=1 --policy.path=<any_compatible_policy>` to exercise the full pipeline end-to-end.
+4. **Check success detection** — verify that `info["is_success"]` flips to `True` when the task is actually completed. This is what the eval loop uses to compute success rates.
+
 ## Writing a benchmark doc page
 
 Each benchmark `.mdx` page should include:
diff --git a/docs/source/metaworld.mdx b/docs/source/metaworld.mdx
index 8e629dea9..5c4a780be 100644
--- a/docs/source/metaworld.mdx
+++ b/docs/source/metaworld.mdx
@@ -2,7 +2,7 @@
 
 Meta-World is an open-source simulation benchmark for **multi-task and meta reinforcement learning** in continuous-control robotic manipulation. It bundles 50 diverse manipulation tasks using everyday objects and a common tabletop Sawyer arm, providing a standardized playground to test whether algorithms can learn many different tasks and generalize quickly to new ones.
 
-- Paper: [Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning paper](https://arxiv.org/abs/1910.10897)
+- Paper: [Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning](https://arxiv.org/abs/1910.10897)
 - GitHub: [Farama-Foundation/Metaworld](https://github.com/Farama-Foundation/Metaworld)
 - Project website: [metaworld.farama.org](https://metaworld.farama.org)