Pull requests: radixark/miles

Pull requests list

feat(rl): add CISPO advantage estimator (MiniMax-M1)
#1331 opened Jun 12, 2026 by EazyReal
[fix] set deepseek v32 override_hf_native=True
#1330 opened Jun 12, 2026 by yueming-yuan (Collaborator)
[fix] stop merging agentic turns at first non-COMPLETED turn
#1323 opened Jun 12, 2026 by Shi-Dong (Contributor)
feat: add FlashQLA backend for Qwen GDN linear-attention layers
#1318 opened Jun 11, 2026 by Zhichenzzz (Contributor)
fix: load Qwen 3.5 checkpoint with unfused experts
#1317 opened Jun 10, 2026 by lawrence-harmonic (Contributor)
[doc, CI] doc driven CI
#1312 opened Jun 9, 2026 by guapisolo (Collaborator)
fix(qwen3-vl): per-segment mRoPE + vision under CP + THD packing
#1308 opened Jun 8, 2026 by Zhichenzzz (Contributor)
fix(mtp): track megatron mtp_model_layer rename in raw converters
#1307 opened Jun 8, 2026 by Zhichenzzz (Contributor)
DO NOT MERGE: CI test (label: run-ci-model-scripts, "Run model script smoke tests")
#1306 opened Jun 8, 2026 by yueming-yuan (Collaborator)
[NPU] Feature add npu docker
#1305 opened Jun 8, 2026 by codemayq
Inject rank and millisecond timestamp into Ray train actor log lines
#1303 opened Jun 7, 2026 by fzyzcjy (Collaborator)