docs(recipes): trim header comments, drop diversity-knobs note in run_hf_job

Recipes were over-commented (paper citations, history of removed
sub-recipes, inference-time loop walkthroughs). Stripped down to a
short header + a one-line note on the boundary-frame memory tail.

Also removed the ``_tool3`` diversity-knobs comment block in
``examples/annotation/run_hf_job.py`` — it was a personal note about
a since-merged experiment.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pepijn
2026-05-13 12:55:03 +02:00
parent b2aa372fcf
commit 9ff62cb08c
3 changed files with 16 additions and 135 deletions
@@ -23,18 +23,6 @@ token = os.environ.get("HF_TOKEN") or get_token()
if not token:
    raise RuntimeError("No HF token. Run `huggingface-cli login` or `export HF_TOKEN=hf_...`")
# --- Diversity knobs (Pi0.7-style prompt expansion) -----------------------
# Bumped roughly 3x across the board to fight memorization on small datasets.
# A single dataset trained for many epochs with deterministic atom wording
# converges to perfect recall on training prompts but produces JSON-token
# garbage at inference for any wording that drifts slightly. More atom
# variants per episode + higher sampling temperature widens the training
# distribution so the model has to actually use its language head, not
# just memorize.
#
# Pushes to a *new* hub repo (``_tool3``) so the previous annotation pass
# (``_tool2``) stays intact — re-train from scratch on the new dataset and
# compare loss-curve shapes to verify the diversity bump is doing something.
CMD = (
    "apt-get update -qq && apt-get install -y -qq git ffmpeg && "
    "pip install --no-deps "