
Add fused AdamW option and warn on torch attention mask memory usage#1270

Merged
SWivid merged 2 commits into SWivid:main from ZhikangNiu:main on Mar 4, 2026

Conversation

@ZhikangNiu
Collaborator

- Add `optim.use_fused_adamw` to the training configs, `train.py`, the finetune CLI, and the Gradio app (see the optimizer sketch after this list)
- Disallow enabling `bnb_optimizer` and `use_fused_adamw` at the same time
- Warn when `attn_mask_enabled=True` is combined with `attn_backend=torch`, which incurs high GPU memory usage (see the second sketch below)
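And a sketch of the attention-mask warning, again assuming config fields named as in the bullets; the exact message and backend names are placeholders:

```python
import logging

logger = logging.getLogger(__name__)

def warn_on_torch_attn_mask(cfg):
    # With the torch SDPA backend, passing an explicit attention mask
    # materializes a full (batch, heads, seq, seq) mask tensor, which can
    # dominate GPU memory at long sequence lengths.
    if cfg.attn_mask_enabled and cfg.attn_backend == "torch":
        logger.warning(
            "attn_mask_enabled=True with attn_backend='torch' can use a large "
            "amount of GPU memory; consider a memory-efficient attention backend."
        )
```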
@ZhikangNiu
Collaborator Author

cc @SWivid

@SWivid merged commit b5ab1af into SWivid:main on Mar 4, 2026
1 check passed