Merged
346 commits
a3e1b3a
Merge pull request #45 from radixark/shi/r3-only-page
Zhichenzzz Apr 29, 2026
174ddd4
docs: drop fictional MILES_DEBUG=1 env var
Apr 29, 2026
9a9302a
docs: remove the Release v0.1.0 blog page
Apr 29, 2026
3c53e17
docs(monitoring): use real metric names in 'What to watch'
Apr 29, 2026
f499ee9
docs(usage): mark --use-distributed-optimizer as forced on, not opt-in
Apr 29, 2026
8b50584
docs(monitoring): clarify the router catch-all route
Apr 29, 2026
6346cf3
docs(monitoring): drop the 'Implemented as ...' note from the catch-a…
Apr 29, 2026
a1ec008
Merge pull request #47 from radixark/shi/remove-release-v010
Zhichenzzz Apr 29, 2026
623d058
docs(customization): fix --rollout-function-path signature
Apr 29, 2026
a6add29
Merge pull request #48 from radixark/shi/fix-monitoring-watch
Zhichenzzz Apr 29, 2026
2a3790b
Merge pull request #49 from radixark/shi/fix-distributed-optimizer-claim
Zhichenzzz Apr 29, 2026
c87a726
Merge pull request #50 from radixark/shi/clarify-router-catchall
Zhichenzzz Apr 29, 2026
1e69543
Merge pull request #46 from radixark/shi/drop-miles-debug-env
Zhichenzzz Apr 29, 2026
444e13d
docs(customization): fix --rollout-sample-filter-path signature
Apr 29, 2026
541d8b6
Merge pull request #52 from radixark/shi/fix-sample-filter-doc
Zhichenzzz Apr 30, 2026
98a4936
Merge pull request #51 from radixark/shi/fix-rollout-fn-doc
Zhichenzzz Apr 30, 2026
684405e
docs(deepseek): replace V4 placeholder with full launch tutorial
Apr 30, 2026
50c5739
docs(customization): widen --buffer-filter-path rollout_id annotation
Apr 30, 2026
afc9a84
docs(deepseek): address PR review on V4-Flash/V4-Pro split
Apr 30, 2026
f578524
docs(deepseek): tighten V4 quirks list and fix active-param counts
Apr 30, 2026
4307d4b
Merge pull request #54 from radixark/shi/fix-buffer-filter-annotation
Zhichenzzz Apr 30, 2026
f70a443
docs(home): split Supported Hardware into NVIDIA / AMD bullets
Apr 30, 2026
339b671
docs(home): rename Latest-updates link to match CLI Reference
Apr 30, 2026
2a32737
docs(home): align all Latest-updates link texts with target page titles
Apr 30, 2026
a88b118
docs(blog): drop Try-it code block; defer to Quick Start
Apr 30, 2026
09b9bce
docs(blog): apply reviewer wording to Try-it paragraph
Apr 30, 2026
0dc041f
docs(contributing): refresh Repository Layout against current miles tree
Apr 30, 2026
3454436
docs(advanced): sync p2p-weight-transfer with miles upstream
Zhichenzzz Apr 30, 2026
3aa5afc
Merge pull request #55 from radixark/shi/split-supported-hw
Zhichenzzz Apr 30, 2026
c7ea25c
Merge pull request #56 from radixark/shi/fix-cli-reference-link
Zhichenzzz Apr 30, 2026
a51c4fb
Merge pull request #57 from radixark/shi/polish-introducing-miles-try-it
Zhichenzzz Apr 30, 2026
00cf9f1
Merge pull request #58 from radixark/shi/repo-layout-refresh
Zhichenzzz Apr 30, 2026
0b4f109
Merge pull request #59 from radixark/zhichen/sync-p2p-weight-transfer…
Zhichenzzz Apr 30, 2026
6c7f0df
docs(models): add Qwen3.6 dense and MoE recipes
Zhichenzzz Apr 30, 2026
da69aa8
docs(models): add Nemotron-3-Nano dense and MoE recipes
Zhichenzzz Apr 30, 2026
bee33b1
docs(qwen3.6): expand intro using SGLang cookbook description
Zhichenzzz May 1, 2026
dbce3be
docs(qwen3.6): rewrite intros in own words, drop cookbook attribution
Zhichenzzz May 1, 2026
400aa9b
Merge pull request #60 from radixark/zhichen/docs-qwen3-6
Zhichenzzz May 7, 2026
0d4b3ff
Merge pull request #61 from radixark/zhichen/docs-nemotron-3-nano
Zhichenzzz May 7, 2026
15e3d13
docs(index): align Supported hardware list with Platforms page
Zhichenzzz May 7, 2026
8fff2fc
Merge pull request #63 from radixark/zhichen/docs-supported-hardware-…
Zhichenzzz May 7, 2026
555fa02
docs: expand low-precision feature bullet (#65)
yueming-yuan May 8, 2026
4041ed8
docs: highlight Efficient R3 with cache + D2H overlap (#66)
yueming-yuan May 8, 2026
59c564d
docs: add fast-and-stable model-support bullet to core features (#67)
yueming-yuan May 8, 2026
28b344a
docs: add broad hardware support bullet (#68)
yueming-yuan May 8, 2026
e8455a2
docs(index): align Supported models lists with Models page (#64)
Zhichenzzz May 8, 2026
5969968
docs: rewrite low-precision page (FP8 / MXFP8 / NVFP4) (#70)
yueming-yuan May 8, 2026
b4d8241
docs: convert British '-ise/-isation' to American '-ize/-ization' (#71)
Shi-Dong May 8, 2026
7ea5de8
docs(installation): clarify H100/H200 status as 'CI guarded' (#72)
Shi-Dong May 8, 2026
5167649
docs(usage): make parse_args import copyable as a code block (#73)
Shi-Dong May 8, 2026
7545f5a
docs(usage): add Megatron parallelism compatibility section (#74)
Shi-Dong May 8, 2026
02d3354
docs: add Argument Groups page and linkify launch-script tables (#75)
Shi-Dong May 8, 2026
acef1c5
docs: add Fully Async Rollout user-guide page (#76)
Shi-Dong May 8, 2026
cc61125
docs(index): replace Supported models prose with two-panel matrix (#77)
Shi-Dong May 8, 2026
39b9a5b
docs(deepseek): replace fictional MILES_SCRIPT_* env vars with real l…
May 8, 2026
5f3891d
docs(deepseek): correct Key highlights against DeepSeek-V4 model card
May 8, 2026
5b0b30d
docs(deepseek): address PR draft-review feedback on §1 + §3.1
May 8, 2026
973dcc1
docs(deepseek): drop /cluster_public mention from §4.2 multi-node fan…
May 8, 2026
da6e12a
docs(deepseek): address follow-up PR review comments
May 8, 2026
739c33c
docs(usage): make page Megatron-only; move FSDP to developer/experime…
yueming-yuan May 8, 2026
ea35b7c
Add Mintlify docs.json config
unseenmars May 9, 2026
34421a4
Fix docs.json: add required theme field
unseenmars May 9, 2026
c1c19d0
Fix docs.json: correct navigation format and navbar fields
unseenmars May 9, 2026
beeac54
docs(experimental): fix FSDP backend path
May 9, 2026
92f6055
docs(deepseek): full DeepSeek V4 Flash launch tutorial (#53)
Shi-Dong May 9, 2026
92fbdcc
docs(deepseek): add GB300 docker image for V4-Flash (#62)
yushengsu-thu May 9, 2026
754528c
docs(v4-flash): lead with full-train quick start; fold env setup + mu…
yueming-yuan May 9, 2026
6dd5b81
docs(v4-flash): nest Launcher path defaults under Quick start; add do…
yueming-yuan May 9, 2026
ab91882
docs(v4-flash): document --model-local-dir as multi-node fan-out dest…
yueming-yuan May 9, 2026
8c40412
docs(v4-flash): drop GB300 7-node parallelism row
yueming-yuan May 9, 2026
ea576f4
docs(v4-flash): drop Miles Router and Fault Tolerance bullets from Pa…
yueming-yuan May 9, 2026
caf8fdc
docs(v4-flash): note custom TP/EP/PP/CP combos are allowed beyond the…
yueming-yuan May 9, 2026
3e40b8e
docs(v4-flash): drop _get_parallel_config preamble before parallelism…
yueming-yuan May 9, 2026
83b117a
docs(v4-flash): rename Recipe Configuration to Example Recipe Configu…
yueming-yuan May 9, 2026
76867fa
docs(v4-flash): move Notable quirks from §5.5 to §4.4 (end of Script …
yueming-yuan May 9, 2026
716f917
docs(index): add LoRA training-and-serving bullet to Core features
May 10, 2026
d51a2ed
Merge pull request #78 from radixark/yueming/megatron-only-usage-fsdp…
Zhichenzzz May 11, 2026
bd9bfca
Merge pull request #82 from radixark/shi/index-lora-bullet
Zhichenzzz May 11, 2026
10d4832
Merge pull request #80 from radixark/yueming/v4-flash-restructure
Zhichenzzz May 11, 2026
995d677
docs: fix mkdocs-only syntax that Mintlify can't parse
Zhichenzzz May 11, 2026
2bcc1ab
docs(nav): make group titles clickable via Mintlify `root`
Zhichenzzz May 11, 2026
78e78cf
docs(nav): split top-level into tabs so big sections don't all expand
Zhichenzzz May 11, 2026
dc4f8db
docs(nav): point logo href to /docs landing page
Zhichenzzz May 11, 2026
208baa6
Merge pull request #83 from radixark/zhichen/mintlify-syntax-fixes
Zhichenzzz May 11, 2026
6f0f4af
Drop Fault tolerance bullet; reword agentic rollout (#85)
Shi-Dong May 11, 2026
97efb3e
DeepSeek v4 Flash polishing (#86)
Shi-Dong May 11, 2026
a25bbb5
docs(index): group Supported models by model family as bullets
May 11, 2026
d388b9c
Keep Dense / MoE CardGroup; group bullets inside each card by family
May 11, 2026
193c126
Force regular font weight on model links inside the cards
May 11, 2026
a229839
Use plain HTML style attribute with !important for font-weight
May 11, 2026
a495c37
Merge pull request #88 from radixark/shi/index-models-bullets
Zhichenzzz May 11, 2026
2e22541
docs(models): put each model on its own line in the By family table
May 11, 2026
ce40cf9
Rank models within each family by recency, then by size
May 11, 2026
c62d3fb
Merge pull request #89 from radixark/shi/models-page-lines
Zhichenzzz May 11, 2026
f722c5b
Align Advanced grid with Core features on the homepage (#91)
Shi-Dong May 11, 2026
38f2e3d
Add blank-line padding inside MDX JSX wrappers
May 11, 2026
e1e80f8
docs(faq): fix three entries flagged by miles cross-check
May 11, 2026
2a3890c
docs(advanced): expand LoRA page into a one-page intro
yushengsu-thu May 11, 2026
c47d1ed
docs(advanced): scope LoRA page back to a stub augmentation
yushengsu-thu May 11, 2026
1bd638f
Update lora.md with PR merge note
yushengsu-thu May 11, 2026
82b1166
docs(v4-pro): lay down section skeleton mirroring V4-Flash; mark sect…
yueming-yuan May 11, 2026
79e920d
docs(v4-pro): set Pro one-line launch to --num-nodes 8 --num-gpus-per…
yueming-yuan May 11, 2026
6b5c14f
docs(v4-pro): point Script breakdown to V4-Flash sections; fix one-li…
yueming-yuan May 11, 2026
d5069d6
docs(v4-pro): collapse Script breakdown links to a single sentence
yueming-yuan May 11, 2026
d7feb33
docs(v4-pro): fill in 32x8 H200 parallelism row + Pro-specific defaul…
yueming-yuan May 11, 2026
3dd90e5
Merge main into yueming/v4-pro-skeleton
yueming-yuan May 11, 2026
05e4677
docs(v4-pro): fill in Key highlights as deltas vs Flash (indexer topk…
yueming-yuan May 11, 2026
ef63cd7
docs(v4-pro): fill in 5.3 Rollout & SGLang with 32-GPU engine config …
yueming-yuan May 11, 2026
bf76384
docs(v4-pro): drop Pro-specific launcher-defaults bullet from Key hig…
yueming-yuan May 11, 2026
cf70577
docs(v4-pro): drop R3 + weight-loader-drop-cache flags; update mem-fr…
yueming-yuan May 11, 2026
b1ce97c
docs(v4-pro): reuse Flash docker image; restate full-train chain matc…
yueming-yuan May 11, 2026
6611d66
docs(v4-pro): match Flash docker-pull comment listing H200/B200 + GB3…
yueming-yuan May 11, 2026
5e73880
docs(v4-pro): rename 5.1 Parallelism to Megatron Parallelism
yueming-yuan May 11, 2026
f8a455e
docs(v4): swap max_tokens_per_gpu column for CP in parallelism tables…
yueming-yuan May 11, 2026
0f958d1
docs(v4-pro): fill in 5.4 Optimizer with Adam config + CPU-offload tr…
yueming-yuan May 11, 2026
79df189
docs(v4): drop WIP admonition from Pro; add Tracking issue line at to…
yueming-yuan May 11, 2026
6a2e281
docs(v4): rename tracking-issue prefix to 'DeepSeek V4 training track…
yueming-yuan May 11, 2026
c3917d0
docs(v4): rename Flash 5.1 to Megatron Parallelism; move validated-la…
yueming-yuan May 11, 2026
4e8b188
docs(v4-pro): drop launcher-defaults paragraph after the parallelism …
yueming-yuan May 11, 2026
bfe3471
Merge pull request #93 from radixark/shi/mdx-blank-lines
Zhichenzzz May 11, 2026
600772c
Merge branch 'main' into shi/faq-fixes
Zhichenzzz May 11, 2026
f4fe0b2
Merge pull request #94 from radixark/shi/faq-fixes
Zhichenzzz May 11, 2026
0cc0c0c
Merge pull request #96 from radixark/yueming/v4-pro-skeleton
Zhichenzzz May 11, 2026
6dfd444
Merge pull request #95 from yushengsu-thu/yushengsu/lora-advanced-page
Zhichenzzz May 11, 2026
04c1b1f
fix(docs): move font-weight override to external CSS to fix /docs 500
Zhichenzzz May 11, 2026
3c028a8
trigger mintlify preview
Zhichenzzz May 11, 2026
26f1ca4
docs: trailing newline (retrigger mintlify)
Zhichenzzz May 11, 2026
c53a68f
docs(index): trigger preview revalidation of /docs
Zhichenzzz May 11, 2026
797b370
docs(index): remove trigger comment
Zhichenzzz May 11, 2026
195b75a
fix(docs): Mintlify compat fixes + orange theme
kai-rdxa May 12, 2026
c36b2f8
fix(docs): strip .md from anchor links missed in first pass
kai-rdxa May 12, 2026
876cc2c
fix(docs): convert all relative internal links to absolute paths
kai-rdxa May 12, 2026
3da58da
Merge pull request #99 from radixark/fix/mintlify-compat
unseenmars May 12, 2026
ff824f3
docs(blog): update X handle to @radixark; drop RSS placeholder
May 12, 2026
54e8e83
Merge pull request #101 from radixark/shi/blog-x-and-rss
Zhichenzzz May 12, 2026
ded349a
Migrate pages from /docs to /miles/docs
wisclmy0611 May 12, 2026
9e26270
fix(docs): update internal links from /docs/ to /miles/docs/
kai-rdxa May 12, 2026
625e5da
fix(faq): remove backticks from Accordion titles (re-apply after path…
kai-rdxa May 12, 2026
0f07483
fix(faq): use JSX <code> in Accordion titles for inline code rendering
kai-rdxa May 12, 2026
0758932
Merge pull request #104 from radixark/fix/update-links-for-miles-path
unseenmars May 12, 2026
bf246fd
fix(config): set custom domain, fix README link, make index page root
kai-rdxa May 12, 2026
985ccd1
fix(config): set Mintlify url to https://www.radixark.com/miles
kai-rdxa May 12, 2026
4e06d61
fix(config): set url to domain only (www.radixark.com)
kai-rdxa May 12, 2026
95ea1aa
fix(nav): move Miles Documentation to its own Welcome to Miles tab
kai-rdxa May 12, 2026
49f4a1b
Merge pull request #105 from radixark/fix/custom-domain-url
unseenmars May 12, 2026
db593d4
refactor(docs): move miles/docs/ → docs/ for Mintlify Host-at-/miles …
kai-rdxa May 12, 2026
a9a5e3d
Merge pull request #106 from radixark/fix/restructure-docs-path
unseenmars May 12, 2026
5e7c775
revert: restore miles/docs/ structure + add vercel.json rewrites
kai-rdxa May 12, 2026
34e443e
fix(nav): merge Welcome to Miles + Getting Started into single Welcom…
kai-rdxa May 12, 2026
95d59fe
Merge pull request #107 from radixark/fix/revert-to-miles-docs-path
unseenmars May 12, 2026
b1a6865
feat(style): linden theme + RadixArk design system
unseenmars May 12, 2026
3d6379c
ci: trigger Mintlify preview
unseenmars May 12, 2026
4221d9d
feat(style): add light mode + remove font overrides
unseenmars May 12, 2026
7d3fade
docs(glm5): reflect GLM-5.1 coverage in page title and description
May 12, 2026
f22bea7
fix(logo): crop tight + separate light/dark versions
unseenmars May 12, 2026
198bec6
fix(links): convert relative Card href to absolute /miles/docs/... paths
unseenmars May 12, 2026
70303a8
docs(blog): fix Introducing Miles card link; drop nav entry (#102)
Shi-Dong May 12, 2026
19b5512
feat(style): consistent typography + mobile overflow fixes
unseenmars May 12, 2026
648f578
fix(contributing): escape raw < in paragraph text to prevent MDX pars…
unseenmars May 12, 2026
de3ccc3
fix(config): remove url field to restore client-side navigation
unseenmars May 12, 2026
5b53157
fix(config): restore url for SEO canonical/sitemap
unseenmars May 12, 2026
55bb7d1
fix(navbar): replace github type with plain link to stop nav flicker
unseenmars May 12, 2026
ba7cf76
fix(logo): increase width 80 → 130
unseenmars May 12, 2026
89f75a3
fix(navbar): restore type:github — navbar persists in layout so star …
unseenmars May 12, 2026
29941ad
fix(style): thin link underlines + smaller model list links
unseenmars May 12, 2026
5e6f823
fix(style): remove Google Fonts config to stop navbar flicker
unseenmars May 12, 2026
60c026a
fix(navbar): replace github star widget with plain link
unseenmars May 12, 2026
9951b37
fix(theme): revert linden → mint
unseenmars May 12, 2026
3abb7e6
ci: trigger Mintlify preview
unseenmars May 12, 2026
5e29cfd
feat(style): restore GitHub star + Hanken Grotesk branding
unseenmars May 12, 2026
7469c8b
fix(style): merge duplicate a{} rules + restore Inter for nav/UI
unseenmars May 12, 2026
c92ca24
fix(logo+style): tighter logo crop + subtle external link indicator
unseenmars May 12, 2026
53dca3b
fix(style): external link shows > on hover only, no layout shift
unseenmars May 12, 2026
f1b956b
fix(font): load Hanken Grotesk via @import in style.css
unseenmars May 12, 2026
e28bc74
Revert "fix(font): load Hanken Grotesk via @import in style.css"
unseenmars May 12, 2026
224585f
fix(style): replace animated > with static ↗ for external links
unseenmars May 12, 2026
cd67b73
fix(logo): ultra-tight crop matching reference — letters fill frame
unseenmars May 12, 2026
782ed84
feat(style): Hanken Grotesk headings + Inter body + CSS compatibility…
unseenmars May 12, 2026
080ad3f
fix(logo): replace with transparent PNG source, proper light/dark ver…
unseenmars May 12, 2026
7c546ff
fix(style): reduce accordion/FAQ title font size to 0.9375rem
unseenmars May 12, 2026
a4efcfe
fix(logo): use original transparent PNG as-is, no crop
unseenmars May 12, 2026
944e849
fix(logo): use user-cropped logo (2976x2063, 1.44:1 ratio)
unseenmars May 12, 2026
57c5668
Convert remaining British spellings to American
May 12, 2026
6ab63f9
docs(arch-support): stop quoting Miles' own docstring
May 12, 2026
1ea9fb7
docs(r3): two small wording fixes
May 12, 2026
e0cab90
address review: clarify scenario is an example, polish 'With R3' line
May 12, 2026
1a34ae6
Expand 'top-k op' to 'top-k operation'
May 12, 2026
80c05f6
Minor Fixes in Fault Tolerance Page
May 12, 2026
e92c342
Strip line-number suffixes from miles source-file references
May 12, 2026
1ac0542
Merge pull request #110 from radixark/fix/card-href-absolute-paths
unseenmars May 12, 2026
4bff456
fix(contributing): replace task list checkboxes with plain list items
unseenmars May 12, 2026
b1b5c5e
fix(contributing): replace <your_user> with YOUR_USERNAME in bash block
unseenmars May 12, 2026
7f13710
test: minimal contributing.md to binary search 404 cause
unseenmars May 12, 2026
c074652
test: rename contributing -> contributor-guide to test if path is res…
unseenmars May 12, 2026
c5185f3
fix(links): update internal links from contributing to contributor-guide
unseenmars May 12, 2026
0079829
fix(contributing): restore original content — path rename was the onl…
unseenmars May 12, 2026
0770ec9
fix(contributor-guide): replace Unicode box-drawing with ASCII in dir…
unseenmars May 12, 2026
8950c3a
Revert "fix(contributor-guide): replace Unicode box-drawing with ASCI…
unseenmars May 12, 2026
aa8cb04
Merge pull request #109 from radixark/shi/glm5-title-includes-5.1
Zhichenzzz May 12, 2026
2a68d4f
Merge pull request #112 from radixark/shi/british-spelling-pass-2
Zhichenzzz May 12, 2026
3fa8165
Merge pull request #113 from radixark/shi/arch-support-no-self-quote
Zhichenzzz May 12, 2026
c19b2d9
Merge pull request #114 from radixark/shi/r3-page-edits
Zhichenzzz May 12, 2026
5f58e79
Merge pull request #116 from radixark/shi/ft-page-fixes
Zhichenzzz May 12, 2026
7677af9
Also strip (L29)/(L100) shorthand line refs in GLM pages
Zhichenzzz May 12, 2026
01bebbb
Merge branch 'main' into shi/strip-source-line-numbers
Zhichenzzz May 12, 2026
f83ce54
Merge pull request #108 from radixark/fix/linden-theme-radixark-design
Zhichenzzz May 12, 2026
23c4334
Merge pull request #117 from radixark/shi/strip-source-line-numbers
Zhichenzzz May 12, 2026
06428e5
Merge remote-tracking branch 'origin/main' into fix/contributing-chec…
unseenmars May 12, 2026
14010bb
Merge pull request #118 from radixark/fix/contributing-checkbox-mdx
unseenmars May 12, 2026
22fce70
docs(homepage): switch Supported models to by-family table
Zhichenzzz May 12, 2026
ed773d9
docs(gpt-oss): shorten sidebar label to 'GPT-OSS'
Zhichenzzz May 12, 2026
74ac25c
Merge pull request #119 from radixark/docs/homepage-models-by-family
Zhichenzzz May 12, 2026
d80efaa
Merge pull request #120 from radixark/docs/gpt-oss-sidebar-label
Zhichenzzz May 12, 2026
f6c5c39
Add 'Raise issue' / 'Suggest edit' footer buttons and doc-polish issu…
Shi-Dong May 13, 2026
9d12838
feedback: drop dead 'feedback' block; use contextual + footer instead…
Shi-Dong May 13, 2026
2b51698
trigger mintlify rebuild for contextual menu deployment
May 13, 2026
727f3cc
contextual: drop copy/view presets and the broken $page reference (#123)
Shi-Dong May 13, 2026
82c4628
docs(lora): drop the 'This page is a stub' sentence (#124)
Shi-Dong May 17, 2026
99d2786
docs(lora): four polish fixes from issue #125 (#126)
Shi-Dong May 20, 2026
7a7168e
docs(kimi): write the Kimi K2.5 launch recipe
Zhichenzzz Jun 1, 2026
48ea94d
docs(kimi): rewrite K2.5 intro from the model card; drop em-dashes
Zhichenzzz Jun 1, 2026
0d83734
docs(kimi): drop the K2.6-status and recipe-scope note blocks
Zhichenzzz Jun 1, 2026
f8abe11
docs(kimi): list K2.6 / K2.5 on the Kimi family index
Zhichenzzz Jun 1, 2026
5c29f7a
Merge pull request #128 from radixark/zhichen/kimi-k25-recipe
Zhichenzzz Jun 1, 2026
936d749
docs: sync agentic-chat-template.md from miles repo
guapisolo Jun 8, 2026
2d8f9f3
Merge pull request #129 from radixark/docs/sync-agentic-chat-template…
guapisolo Jun 8, 2026
346a459
docs: fix deepseek-v4 env var guidance, bump image to miles:latest
yueming-yuan Jun 9, 2026
73ad98d
Merge pull request #130 from radixark/docs/deepseek-v4-env-vars
yueming-yuan Jun 9, 2026
033f512
docs: document deepseek-v4 disaggregated rollout mode (miles#1310)
yueming-yuan Jun 9, 2026
d3cdf63
docs: drop --use-miles-router from deepseek-v4 recipes
yueming-yuan Jun 9, 2026
a5fcb53
docs: simplify disaggregated mode section
yueming-yuan Jun 9, 2026
1291faa
Merge pull request #131 from radixark/docs/deepseek-v4-disaggregated
yueming-yuan Jun 9, 2026
42239e8
docs: sync content from miles (router middleware_hub removal, test-su…
Zhichenzzz Jun 10, 2026
619e5ef
docs: finish miles#1291 sync — port index + rollout-endpoints changes
Zhichenzzz Jun 10, 2026
896e959
docs(deepseek): drop the mcore-integration PR-reference sentence from…
Zhichenzzz Jun 10, 2026
d99f072
docs(deepseek): cross-link V4-Flash intro to V4-Pro
Zhichenzzz Jun 10, 2026
3cab9fe
Merge pull request #132 from radixark/sync/miles-docs-1290-1227-1311
Zhichenzzz Jun 10, 2026
bd328fb
Migrate documentation into docs_new/ with full history
unseenmars Jun 11, 2026
8a7422a
docs_new: add README explaining layout and local preview
unseenmars Jun 11, 2026
921aa87
docs_new: trim README to structure and usage
unseenmars Jun 11, 2026
27d8f86
docs_new: point GitHub links at this repo, add docs issue template
unseenmars Jun 11, 2026
446b44c
docs_new: fix tito_tokenizer source link to absolute GitHub URL
unseenmars Jun 11, 2026
a83d43d
docs_new: drop docs issue template, link plain new-issue form
unseenmars Jun 11, 2026
993e836
docs_new: trim README
unseenmars Jun 11, 2026
0d9a0b9
docs_new: add live docs link to README
unseenmars Jun 11, 2026
97792c5
Replace stale docs/ with migrated documentation
unseenmars Jun 11, 2026
9ae0afc
README: point doc links at the live documentation site
unseenmars Jun 11, 2026
d6cd3ae
docs: update README paths after rename
unseenmars Jun 11, 2026
00aaca4
docs: flatten layout for miles.radixark.com/docs hosting
unseenmars Jun 12, 2026
7ee584b
docs: add canonical URL so search engines index the custom domain
unseenmars Jun 12, 2026
8 changes: 4 additions & 4 deletions README.md
@@ -9,7 +9,7 @@
[![License](https://img.shields.io/github/license/radixark/miles)](LICENSE)
[![Slack](https://img.shields.io/badge/slack-join-brightgreen.svg)](https://slack.sglang.ai)

[**Latest Updates**](#latest-updates) | [**Quick Start**](#quick-start) | [**Key Features**](#key-features) | [**Documentation**](https://www.radixark.com/miles/docs)
[**Latest Updates**](#latest-updates) | [**Quick Start**](#quick-start) | [**Key Features**](#key-features) | [**Documentation**](https://miles.radixark.com/docs)

</div>

@@ -18,11 +18,11 @@

## Latest Updates

* **[2026/02]** 💡 **Miles Detailed Arguments**: We've added a detailed command-line argument guide used to configure Miles for RL training and inference. These arguments enable precise control over cluster resources, training backends (Megatron/FSDP), inference optimization via SGLang, and RL algorithmic hyperparameters. [Link](https://github.com/radixark/miles/blob/main/docs/en/advanced/miles_server_args.md)
* **[2026/02]** 💡 **Miles Detailed Arguments**: We've added a detailed command-line argument guide used to configure Miles for RL training and inference. These arguments enable precise control over cluster resources, training backends (Megatron/FSDP), inference optimization via SGLang, and RL algorithmic hyperparameters. [Link](https://miles.radixark.com/docs/user-guide/cli-reference)
* **[2026/01]** 💎 **INT4 Quantization-Aware Training (QAT)**: Inspired by the Kimi K2-Thinking report, Miles now features a full-stack INT4 W4A16 QAT pipeline. This allows 1TB-scale models to fit into single-machine VRAM (e.g., NVIDIA H200), doubling rollout efficiency by eliminating cross-node bottlenecks while maintaining BF16-equivalent accuracy. [Blog](https://lmsys.org/blog/2026-01-26-int4-qat/)
* **[2026/01]** 💎 **Unified VLM/LLM Multi-Turn Training**: We provided an implementation for the VLM multi-turn sampling paradigm. Developers only need to write a customized `rollout` function to easily start multi-turn RL for VLM, just like training LLM. [Blog](https://github.com/zhaochenyang20/Awesome-ML-SYS-Tutorial/blob/main/rlhf/slime/vlm-multi-turn/readme-en.md)
* **[2026/01]** 🤖 **Multi-Agent Co-Evolution**: Miles now supports **MrlX**, a novel asynchronous co-evolutionary framework for Multi-Agent RL. Achieve superior performance in complex tasks like Doctor-Patient simulations and DeepResearch pipelines by enabling specialized agents to evolve together symbiotically. [[Link]](https://github.com/AQ-MedAI/MrlX)
* **[2025/12]** 🔄 **Rollout Routing Replay (R3)**: In collaboration with SGLang, we've launched R3 to solve MoE RL instability. R3 records inference routing decisions and replays them during training, effectively eliminating the "training-inference mismatch" and preventing training collapse in large MoE models like Qwen3 and DeepSeek-V3. [[Paper]](https://arxiv.org/pdf/2510.11370) [[Docs]](docs/en/advanced/miles-router.md#22-rollout-routing-replay-r3-for-moe)
* **[2025/12]** 🔄 **Rollout Routing Replay (R3)**: In collaboration with SGLang, we've launched R3 to solve MoE RL instability. R3 records inference routing decisions and replays them during training, effectively eliminating the "training-inference mismatch" and preventing training collapse in large MoE models like Qwen3 and DeepSeek-V3. [[Paper]](https://arxiv.org/pdf/2510.11370) [[Docs]](https://miles.radixark.com/docs/advanced/miles-router)
* **[2025/11]** 🔥 **Unified FP8 Release**: Solves the stability issues in MoE RL by ensuring training and inference use the exact same FP8 quantization logic. [[Blog]](https://lmsys.org/blog/2025-11-25-fp8-rl/)
* **[2025/11]** ⚡ **Speculative Decoding in RL**: Integrated speculative rollout with online SFT for draft models, achieving massive throughput gains. [[Blog]](https://github.com/zhaochenyang20/Awesome-ML-SYS-Tutorial/blob/main/rlhf/slime/spec/readme-en.md)
* **[2025/11]** 🎉 **Miles Project Launch**: A joint effort by InfiXAI, Ant Group, SGLang RL Team, and the Miles community. [[Announcement]](https://lmsys.org/blog/2025-11-19-miles/)
@@ -107,7 +107,7 @@ python train.py \
--n-samples-per-prompt 8
```

For comprehensive guides on environment setup and custom reward functions, see the [Quick Start Guide](docs/en/get_started/quick_start.md).
For comprehensive guides on environment setup and custom reward functions, see the [Quick Start Guide](https://miles.radixark.com/docs/getting-started/quick-start).

---

34 changes: 34 additions & 0 deletions docs/README.md
@@ -0,0 +1,34 @@
# Miles Documentation

Live site: https://miles.radixark.com/docs

## Layout

```
docs/
├── docs.json # Mintlify config: navigation, theme, redirects
├── index.md # Homepage
├── getting-started/ models/ user-guide/ advanced/
├── examples/ developer/ platforms/ blog/
└── assets/ # Images and stylesheets
```

## Previewing locally

```bash
npm i -g mint
cd docs
mint dev
```

Then open http://localhost:3000.

## Adding or editing a page

1. Add or edit a `.md` file (e.g. `models/qwen/qwen4.md`).
2. New pages need an entry in the `navigation` tree in `docs.json`, otherwise they won't
show up in the sidebar.
3. When linking between pages, use absolute paths: `[Quick Start](/getting-started/quick-start)`.
Drop the `.md` extension.
4. Images and other assets go in `assets/` and are referenced the same way:
`/assets/images/arch.png`.
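
The linking rules above (absolute paths, no `.md` extension) can be checked mechanically. The following is a minimal sketch, not part of the Miles repository: the function name `find_bad_links` and the regular expression are illustrative assumptions, and the regex only covers plain inline Markdown links, not JSX `href` attributes.

```python
import re

# Matches inline Markdown links of the form [text](target).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def find_bad_links(markdown: str) -> list[str]:
    """Return internal link targets that are relative or keep a .md extension."""
    bad = []
    for target in LINK_RE.findall(markdown):
        # External URLs and same-page anchors are out of scope for this check.
        if target.startswith(("http://", "https://", "mailto:", "#")):
            continue
        path = target.split("#", 1)[0]  # ignore any trailing anchor
        if not path.startswith("/") or path.endswith(".md"):
            bad.append(target)
    return bad


sample = (
    "[Quick Start](/getting-started/quick-start) "
    "[Bad relative](../models/qwen.md) "
    "[Bad extension](/models/qwen.md#setup) "
    "[Site](https://miles.radixark.com/docs)"
)
print(find_bad_links(sample))
```

Running a check like this over every page before opening a PR would catch the two most common link mistakes; whether Mintlify itself reports them is not stated here.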
10 changes: 2 additions & 8 deletions docs/advanced/architecture-support.md
@@ -2,9 +2,6 @@
title: Backends Beyond Megatron
description: Embed HuggingFace implementations as black-box modules inside Megatron's parallel pipeline.
---

# Backends Beyond Megatron

Adding a new architecture (such as Qwen3-Next's Gated-Delta-Net) directly to
Megatron-LM's native code path is invasive. Miles takes a different approach:
wrap the model's official HuggingFace implementation as a black-box module and
@@ -33,7 +30,6 @@ starts from `get_gpt_decoder_block_spec`, then for the layers whose HF
params={"args": args})` (referenced from `miles_plugins/models/`):

```python
# miles_plugins/models/qwen3_next.py (simplified excerpt)
transformer_layer_spec = get_gpt_decoder_block_spec(config, **kwargs)
...
for layer_id in range(num_layers_to_build):
@@ -122,10 +118,8 @@ of the model is bf16. Qwen3.5's `A_log` is the canonical example. Rounding it
to bf16 makes Megatron-side activations diverge from SGLang-side rollout,
causing precision drift.

The canonical cast point is Megatron's `Float16Module`, which (per the
docstring on `enforce_marked_param_dtypes` in
`miles/backends/megatron_utils/fp32_param_utils.py`) "unconditionally casts
every floating-point parameter to bf16/fp16 at wrap time". The mbridge
The canonical cast point is Megatron's `Float16Module`, which casts
every floating-point parameter to bf16/fp16 at wrap time. The mbridge
weight-conversion path (`_weight_to_mcore_format` and friends) is the
other place fp32 weights can be silently downcast. Two steps are required
to keep tagged params in fp32.
29 changes: 13 additions & 16 deletions docs/advanced/fault-tolerance.md
@@ -2,32 +2,29 @@
title: Fault Tolerance
description: Rollout-side health checks and engine recovery, gated by --use-fault-tolerance.
---

# Fault Tolerance

`--use-fault-tolerance` enables Miles's rollout-side fault-tolerance machinery.
It gates two code paths:
The `--use-fault-tolerance` flag enables Miles's rollout-side
fault-tolerance machinery. It gates two code paths:

1. A `RolloutHealthMonitor` thread per server group, started in
`miles/ray/rollout.py:379`, which periodically heart-beats each SGLang
`miles/ray/rollout.py`, which periodically heart-beats each SGLang
engine.
2. A recovery hook in the trainer's weight-update step
(`miles/backends/megatron_utils/actor.py:500`), which restarts engines
(`miles/backends/megatron_utils/actor.py`), which restarts engines
that the health monitor has killed.

```bash
--use-fault-tolerance
```

The flag is `action="store_true"`, default `False`
(`miles/utils/arguments.py:528`).
(`miles/utils/arguments.py`).

## Health monitor

`RolloutHealthMonitor` (`miles/utils/health_monitor.py`) runs in a daemon
thread. Lifecycle: `start` (called once during init), `pause` and `resume`
(called when engines offload / onload), `stop` (called during dispose).
`pause` / `resume` are wired up in `miles/ray/rollout.py:497, 501` and called
`pause` / `resume` are wired up in `miles/ray/rollout.py` and called
around offload / onload events.

Each loop iteration does:
@@ -37,7 +34,7 @@ Each loop iteration does:
2. For every active engine in the group, call `engine.health_generate.remote(timeout=self._check_timeout)`.
3. If the call raises, run `_kill_engine`: `engine.shutdown.remote()`,
`ray.kill(engine)`, and the engine slot is set to `None`
(`miles/utils/health_monitor.py:160-180`).
(`miles/utils/health_monitor.py`).
4. Sleep `--rollout-health-check-interval` seconds, then repeat.
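
The visible steps of the loop (probe, kill on failure, sleep) can be sketched in plain Python. This is an illustration under stated assumptions, not the `RolloutHealthMonitor` code: `FakeEngine`, `check_once`, and `monitor_loop` are hypothetical names, the engines are ordinary objects rather than Ray actors, and the `ray.kill(engine)` call is omitted.

```python
import time


class FakeEngine:
    """Hypothetical stand-in for an engine handle (not the Miles or Ray API)."""

    def __init__(self, healthy=True):
        self.healthy = healthy
        self.shut_down = False

    def health_generate(self, timeout):
        # A real engine would run a short generation; here we only simulate failure.
        if not self.healthy:
            raise TimeoutError(f"no response within {timeout}s")

    def shutdown(self):
        self.shut_down = True


def check_once(engines, check_timeout):
    """Probe every live engine once; clear the slot of any engine whose probe raises."""
    for i, engine in enumerate(engines):
        if engine is None:  # slot already cleared by an earlier pass
            continue
        try:
            engine.health_generate(timeout=check_timeout)
        except Exception:
            engine.shutdown()   # graceful shutdown attempt
            engines[i] = None   # a None slot marks the engine for later recovery


def monitor_loop(engines, check_timeout, interval, iterations):
    """Bounded version of the loop so the sketch terminates."""
    for _ in range(iterations):
        check_once(engines, check_timeout)
        time.sleep(interval)


engines = [FakeEngine(), FakeEngine(healthy=False), FakeEngine()]
unhealthy = engines[1]
monitor_loop(engines, check_timeout=1.0, interval=0.0, iterations=2)
alive = [e is not None for e in engines]
print(alive)
```

Setting the slot to `None` rather than removing it keeps engine indices stable, which is what lets a later recovery step find and refill exactly the killed slots.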

### Flags
@@ -52,14 +49,14 @@ Each loop iteration does:

When `--use-fault-tolerance` is on, `MegatronActor.update_weights` calls
`rollout_manager.recover_updatable_engines` on rank 0 before each weight
update (`miles/backends/megatron_utils/actor.py:500`).
update (`miles/backends/megatron_utils/actor.py`).

`recover_updatable_engines` (`miles/ray/rollout.py:513`):
`recover_updatable_engines` (`miles/ray/rollout.py`):

1. Pauses health monitoring.
2. Calls `srv.recover()` on the updatable server.

`srv.recover()` (`miles/ray/rollout.py:263`):
`srv.recover()` (`miles/ray/rollout.py`):

1. Finds engine slots set to `None` (killed by the health monitor).
2. Calls `start_engines` for each affected group.
@@ -72,14 +69,14 @@ the new engines and the next weight transfer proceeds normally.

When `--update-weight-transfer-mode p2p` is on, every P2P transfer is
bounded by `--p2p-transfer-timeout` (default `30.0`s, defined in
`miles/utils/arguments.py:519`; consumed at
`miles/backends/megatron_utils/update_weight/update_weight_from_distributed/p2p.py:73`).
`miles/utils/arguments.py`; consumed at
`miles/backends/megatron_utils/update_weight/update_weight_from_distributed/p2p.py`).
On timeout the failed transfer is logged (`[P2P] Transfer future failed: ...`)
in `p2p_transfer_utils.py`. There is no automatic retry or automatic
broadcast-mode fallback in the source today.
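
The "bounded, logged, not retried" behavior can be illustrated with standard-library futures. This is a sketch only: `wait_for_transfer` is a hypothetical helper, `time.sleep` stands in for a transfer, and how Miles applies the timeout internally is not shown in this page.

```python
import concurrent.futures as cf
import logging
import time

logger = logging.getLogger("p2p_sketch")


def wait_for_transfer(future, timeout_s):
    """Wait up to timeout_s for one transfer; log and give up on any failure."""
    try:
        future.result(timeout=timeout_s)
        return True
    except Exception as exc:  # includes the timeout raised by future.result
        logger.error("[P2P] Transfer future failed: %r", exc)
        return False  # no retry and no fallback to broadcast mode


with cf.ThreadPoolExecutor(max_workers=2) as pool:
    fast = pool.submit(time.sleep, 0.01)  # finishes well inside its bound
    slow = pool.submit(time.sleep, 0.5)   # exceeds its bound
    results = [wait_for_transfer(fast, 0.2), wait_for_transfer(slow, 0.05)]

print(results)
```

Because nothing retries, a timed-out transfer stays failed until the caller acts on it, so the value of `--p2p-transfer-timeout` should comfortably cover a normal transfer.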

## Dumper-mode interaction

In dumper mode (`miles/utils/arguments.py:2102`), Miles forces
In dumper mode (`miles/utils/arguments.py`), Miles forces
`use_fault_tolerance = False` and `rollout_health_check_interval = 1e18`
to keep heartbeats from firing.
4 changes: 0 additions & 4 deletions docs/advanced/fp8-low-precision.md
@@ -2,9 +2,6 @@
title: Low Precision RL
description: Unified low-precision pipelines for RL — block-wise FP8, MXFP8, and NVFP4 across rollout and training.
---

# Low Precision RL

A common failure mode in MoE RL is precision drift between training and
inference. Pipelines that train in BF16 and serve in FP8 accumulate per-layer
numerical disagreement, which compounds into divergent log-probabilities and
@@ -82,7 +79,6 @@ recipe to use on Hopper, and the recipe DeepSeek-V3 / DeepSeek-R1 ship in.
Block layout is 128×128 with FP32 scales.

```bash
# Megatron / TransformerEngine
--transformer-impl transformer_engine
--bf16
--fp8-format e4m3
13 changes: 5 additions & 8 deletions docs/advanced/index.md
@@ -2,43 +2,40 @@
title: Advanced Features
description: Systems-level features for large-scale and long-running RL.
---

# Advanced Features

This section covers the Miles features that the Core-features section of the
homepage points at: low-precision training (FP8 / MXFP8 / INT4 QAT), Rollout
Routing Replay for MoE, speculative decoding, and LoRA training and serving.

<CardGroup cols={2}>

<Card title="Low Precision RL" icon="bolt" href="fp8-low-precision">
<Card title="Low Precision RL" icon="bolt" href="/advanced/fp8-low-precision">

The unified FP8 path: matched quantization between training and inference,
BF16 backward and master weights.

</Card>

<Card title="INT4 QAT" icon="microchip" href="int4-qat">
<Card title="INT4 QAT" icon="microchip" href="/advanced/int4-qat">

W4A16 quantization-aware training for fitting large models on a single
8-GPU node.

</Card>

<Card title="Rollout Routing Replay (R3)" icon="network-wired" href="miles-router">
<Card title="Rollout Routing Replay (R3)" icon="network-wired" href="/advanced/miles-router">

Capture expert routing during inference and replay during training. The
mechanism that keeps MoE RL stable.

</Card>

<Card title="Speculative Decoding" icon="rocket" href="speculative-decoding">
<Card title="Speculative Decoding" icon="rocket" href="/advanced/speculative-decoding">

Draft + target speculative rollout, with online MTP-SFT for the draft.

</Card>

<Card title="LoRA Training and Serving" icon="sliders" href="lora">
<Card title="LoRA Training and Serving" icon="sliders" href="/advanced/lora">

Train LoRA adapters with SFT or RL and serve them through SGLang from the
same checkpoint.
9 changes: 3 additions & 6 deletions docs/advanced/int4-qat.md
@@ -2,9 +2,6 @@
title: INT4 Quantization-Aware Training
description: Fit large models on a single 8-GPU node by training with W4A16 quantization in the loop.
---

# INT4 W4A16 Quantization-Aware Training

When the model is large enough that even FP8 will not fit on one node, the
options are spreading across more nodes (and paying cross-node bandwidth) or
quantizing further. Miles ships an INT4 W4A16 quant-aware-training pipeline.
@@ -83,10 +80,10 @@ so the KL anchor stays full-precision.

## Pairs with

* [R3](miles-router.md). Keeps MoE routing stable across the quantized forward.
* [P2P weight transfer](p2p-weight-transfer.md). INT4 weights are 4× smaller,
* [R3](/advanced/miles-router). Keeps MoE routing stable across the quantized forward.
* [P2P weight transfer](/advanced/p2p-weight-transfer). INT4 weights are 4× smaller,
so weight sync transfers less data.
* [Speculative decoding](speculative-decoding.md). Compounds for end-to-end
* [Speculative decoding](/advanced/speculative-decoding). Compounds for end-to-end
rollout speedup.

## When QAT is not appropriate
38 changes: 19 additions & 19 deletions docs/advanced/lora.md
@@ -2,23 +2,17 @@
title: LoRA Training and Serving
description: Train LoRA adapters with miles SFT or RL recipes and serve them through SGLang from the same checkpoint.
---

# LoRA Training and Serving

Miles supports LoRA adapters for both SFT and RL recipes. Adapters trained by
miles load directly into SGLang for rollout, so there is no separate merge or
conversion step in the training-serving loop.

This page is a stub; the full LoRA tutorial is being written. In the meantime,
the pieces below are enough to get a recipe running.

## Example launchers

The canonical LoRA recipes live under
[`examples/lora/`](https://github.com/radixark/miles/tree/main/examples/lora) in
the miles repo:

- `examples/lora/run-qwen2.5-0.5B-megatron-lora.sh` — small dense, single GPU.
- `examples/lora/run-qwen2.5-0.5B-megatron-lora.sh` — small dense model, single GPU.
- `examples/lora/run-qwen3-4B-megatron-lora.sh` — Qwen3-4B, RL with LoRA.
- `examples/lora/run-gpt-oss-20B-megatron-moe-lora.sh` — MoE example.

@@ -35,12 +29,15 @@ the miles repo:
| `--lora-adapter-path` | Path to a pre-trained adapter to resume from. |
| `--lora-sync-from-tensor` | Sync adapter weights to SGLang via in-memory tensors instead of a file round-trip. |

Two existing arguments also have LoRA-specific requirements that are easy to
miss: the launcher has to pass `--megatron-to-hf-mode bridge` (the LoRA path
goes through Megatron-Bridge's PEFT integration; the default `raw` converter
does not understand LoRA layers), and the Ray job has to run with
`--colocate`. Distributed (PD-disaggregated) rollout with LoRA is not
supported today.
<Warning>
Two existing arguments are easy to miss when configuring LoRA:

- **`--megatron-to-hf-mode bridge`** is required. The LoRA path goes through
Megatron-Bridge's PEFT integration; the default `raw` converter does not
understand LoRA layers.
- **`--colocate`** is required. Distributed (PD-disaggregated) rollout with
LoRA is not supported today.
</Warning>

## MoE

@@ -75,11 +72,14 @@ reason.
drives `train.py`.
* **Low-precision training**: the LoRA branch follows the surrounding
precision, so block-wise FP8, MXFP8, and INT4 QAT recipes are compatible.
See [Low Precision RL](fp8-low-precision.md) and [INT4 QAT](int4-qat.md).
* **`--target-modules` is mandatory** when `--lora-rank > 0`. There is no
auto-detection; the launcher asserts at startup.
* **Single adapter per run**: multi-LoRA training in a single job is not
implemented today.
See [Low Precision RL](/advanced/fp8-low-precision) and [INT4 QAT](/advanced/int4-qat).
* **Target modules**: `--target-modules` is required whenever
`--lora-rank > 0`. There is no auto-detection; the launcher asserts at
startup.
* **Single adapter per run**: only one set of `--lora-*` arguments is
honored per training job. Training multiple LoRA adapters in parallel
within a single `train.py` run is not implemented today — run separate
jobs if you need multiple adapters.
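
The target-modules rule amounts to a startup check along the following lines. This is a hedged sketch, not the launcher's code: `build_parser` and `validate_lora_args` are hypothetical names, and the defaults and the string value format of `--target-modules` are assumptions made for illustration.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    # Defaults and value types below are assumptions, not taken from Miles.
    parser.add_argument("--lora-rank", type=int, default=0)
    parser.add_argument("--target-modules", type=str, default=None)
    return parser


def validate_lora_args(args):
    """Fail fast at startup: there is no auto-detection of target modules."""
    if args.lora_rank > 0:
        assert args.target_modules, "--target-modules is required when --lora-rank > 0"


# Rank set and target modules given: validation passes.
ok = build_parser().parse_args(["--lora-rank", "8", "--target-modules", "example_module"])
validate_lora_args(ok)

# Rank set but no target modules: validation asserts.
bad = build_parser().parse_args(["--lora-rank", "8"])
try:
    validate_lora_args(bad)
    rejected = False
except AssertionError:
    rejected = True

print(rejected)
```

Asserting at startup means a missing `--target-modules` surfaces before any model is loaded, rather than as an adapter that silently attaches to nothing.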

## Internals

@@ -93,7 +93,7 @@ The bridge between Megatron's LoRA path and SGLang adapter loading is in:
- `miles/backends/megatron_utils/checkpoint.py` — adapter-aware save and load.
- `miles/backends/megatron_utils/update_weight/update_weight_from_tensor.py`
— colocate-mode weight sync from the trainer's LoRA tensors into the SGLang
rollout engine. We will merge this [PR](https://github.com/radixark/miles/pull/988) soon to support disaggregate mode.
rollout engine. Disaggregate-mode weight sync is not supported yet.

A worked tutorial covering checkpoint conversion, SGLang adapter loading, and
LoRA-specific evaluation will land here in a future doc pass.
17 changes: 7 additions & 10 deletions docs/advanced/miles-router.md
@@ -1,22 +1,19 @@
---
title: Rollout Routing Replay (R3)
description: Capture expert routing during inference and replay it during training so MoE RL is stable.
description: Capture expert routing during inference and replay it during training to stabilize RL.
---

# Rollout Routing Replay (R3)

Rollout Routing Replay (R3) records the expert routing decisions made during
inference and replays them during training, producing bit-identical expert
allocation between rollout and training.

## Why MoE RL was previously unstable
## Why MoE RL is unstable without R3

For each token, an MoE router picks `top-k` experts. The choice depends on the
input through a soft router and a top-k op. In production the router is a
input through a soft router and a top-k operation. In production the router is a
learned `nn.Linear` with non-deterministic kernels and FP8 quantization, so tiny
numerical differences flip routes at the per-layer, per-token level.

Without R3:
An example without R3:

* Rollout selects experts `{2, 7}` for token 314.
* Training (with the same weights but slightly different precision and kernels)
@@ -25,8 +22,8 @@ Without R3:
layers, tens of thousands of tokens, and thousands of training steps, the
policy diverges.

With R3 the inference router's choice is what training also uses. Numerical
noise no longer flips routes.
With R3, the trainer replays the rollout router's expert assignments verbatim,
so numerical noise no longer flips routes.
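
A toy illustration of the flip, assuming made-up router logits for nine experts and a perturbation of 0.002 on two near-tied entries. The numbers and the `top_k` helper are hypothetical; only the mechanism (near-ties plus numerical noise change the top-k set, and replay sidesteps the recomputation) comes from the text above.

```python
def top_k(logits, k):
    """Indices of the k largest logits, returned as a sorted list."""
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return sorted(ranked[:k])


# Made-up router logits for one token; experts 7 and 8 are nearly tied.
rollout_logits = [0.10, 0.20, 2.00, 0.30, 0.40, 0.50, 0.60, 1.500, 1.499]

# Same weights, slightly different numerics on the training side.
train_logits = list(rollout_logits)
train_logits[7] -= 0.002
train_logits[8] += 0.002

rollout_route = top_k(rollout_logits, 2)  # recorded at inference time
recomputed_route = top_k(train_logits, 2)  # what training would pick on its own
replayed_route = list(rollout_route)       # with replay, training reuses the record

print(rollout_route, recomputed_route, replayed_route)
```

A perturbation far smaller than any logit gap that matters for quality is still enough to swap one expert, which is why replaying the recorded route is more robust than trying to make the two numerics agree.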

## How R3 wires up

@@ -50,7 +47,7 @@ forward pass so recorded routes are used instead of recomputed ones.
## Memory cost

`(num_tokens - 1) × num_layers × top_k × 4 bytes` (int32 per element, see
`miles/utils/types.py:29`). For a 32K-token sequence, 60 layers, and
`miles/utils/types.py`). For a 32K-token sequence, 60 layers, and
`top_k = 8`, that is roughly 60 MB per sample of routing metadata.
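
The arithmetic can be checked directly. The helper name `r3_routing_bytes` is hypothetical, and "32K tokens" is taken here to mean 32 × 1024.

```python
def r3_routing_bytes(num_tokens, num_layers, top_k, bytes_per_element=4):
    """(num_tokens - 1) x num_layers x top_k x 4 bytes, one int32 per routed expert."""
    return (num_tokens - 1) * num_layers * top_k * bytes_per_element


total = r3_routing_bytes(num_tokens=32 * 1024, num_layers=60, top_k=8)
print(total, "bytes =", round(total / 2**20, 2), "MiB")
```

The cost scales linearly in sequence length, layer count, and `top_k`, so doubling any one of them doubles the per-sample routing metadata.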

## When R3 is not required