Training Recursive Language Models (RLMs) via reinforcement learning on the Tinker API.
Paper (PDF) | Model on HuggingFace
+21.7 pp average improvement over the base model across 14 benchmarks, obtained with RS-SFT on 3,644 self-mined trajectories (13 wins, 1 loss).
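As a rough illustration of how a headline like this can be computed, the sketch below takes per-benchmark accuracies (in percent) for a base and a tuned model and reports the mean percentage-point delta plus the win/loss count. The function name `summarize`, the benchmark names, and the scores are all hypothetical placeholders; this is not the repo's evaluation harness and these are not its results.

```python
from statistics import mean


def summarize(base: dict, tuned: dict) -> dict:
    """Compare per-benchmark accuracies (in percent) of a tuned model against its base."""
    # Percentage-point delta per benchmark: tuned score minus base score.
    deltas = {name: tuned[name] - base[name] for name in base}
    return {
        "avg_pp": mean(deltas.values()),                 # mean delta in percentage points
        "wins": sum(d > 0 for d in deltas.values()),     # benchmarks where tuned beats base
        "losses": sum(d < 0 for d in deltas.values()),   # benchmarks where tuned is worse
    }


# Made-up scores for three hypothetical benchmarks (illustration only).
base = {"bench_a": 40.0, "bench_b": 55.0, "bench_c": 70.0}
tuned = {"bench_a": 60.0, "bench_b": 65.0, "bench_c": 67.0}

summary = summarize(base, tuned)
print(summary)
```

Note that a percentage-point average weights every benchmark equally, so one large gain on a small benchmark moves the mean as much as the same gain on a large one.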
scaffold/   # RLM runtime (repl.py, rlm.py, llm_query.py)
eval/       # Evaluation harness (14 benchmarks)
training/   # Training scripts (GRPO, RS-SFT)
scripts/    # Data pipeline & utilities