DisTorch2 unblocked Flux 2 Dev (32B) on 24GB - thank you #197
Booyaka101 started this conversation in General
Wanted to drop a real thank you here, not just a star.
Spent most of a day debugging what I thought was a sage-attention dtype bug while trying to run Flux 2 Dev (32B params, FP8mixed = 33GB) + PuLID-Flux2 + Fun-Controlnet on a 24GB RTX 4090. Stack collectively wanted ~42GB resident, was OOM-cascading and producing stripey-noise renders. Tried multiple Comfy core patches, all wrong path.
Found ComfyUI-MultiGPU via deep research. Swapped UNETLoader for UNETLoaderDisTorch2MultiGPU with virtual_vram_gb: 14 and donor_device cpu. The model now lives ~18-20GB on GPU + 14GB on CPU pinned memory, with blocks streaming through async-offload.
This basically opened the door to running Flux 2 Dev + the full identity/controlnet stack on hardware that has no business running it. And the "sequential model loading" pattern people keep asking about: DisTorch2 is the closest thing that exists today and it just works.
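For anyone sanity-checking the numbers above, here is a rough sketch of the arithmetic. It assumes virtual_vram_gb is simply the number of GB of model weights moved to the donor device, which is my reading of the setting and not something confirmed against the DisTorch2 source. The function names are made up for illustration and are not part of ComfyUI-MultiGPU.

```python
def split_model(model_gb: float, virtual_vram_gb: float) -> tuple[float, float]:
    """Return (gb_on_gpu, gb_on_donor) for a model of size model_gb.

    Assumption: virtual_vram_gb GB of weights are offloaded to the donor
    device (e.g. CPU pinned memory) and the rest stays on the GPU.
    """
    if model_gb < 0 or virtual_vram_gb < 0:
        raise ValueError("sizes must be non-negative")
    # Cannot offload more than the model actually contains.
    donor_gb = min(virtual_vram_gb, model_gb)
    return model_gb - donor_gb, donor_gb


def min_virtual_vram_gb(model_gb: float, gpu_budget_gb: float) -> float:
    """Smallest offload (in GB) that fits the model weights in gpu_budget_gb."""
    return max(0.0, model_gb - gpu_budget_gb)


# Numbers from the post: 33GB FP8mixed model, virtual_vram_gb = 14.
gpu_gb, donor_gb = split_model(33, 14)
print(f"on GPU: {gpu_gb}GB, on donor: {donor_gb}GB")

# Hypothetical planning step for a ~64GB FP16 model, keeping the same
# ~19GB weight footprint on the GPU (the 19GB budget is an assumption).
print(f"offload needed: {min_virtual_vram_gb(64, 19)}GB")
```

The 19GB result lines up with the ~18-20GB observed on the GPU. The remaining VRAM on a 24GB card still has to cover activations, PuLID, and the controlnet, so the real budget for weights is smaller than the card size.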
Looking at trying Flux 2 Dev FP16 (~64GB) next with virtual_vram_gb tuned to bridge the gap to system RAM. If the dev branch supports SkyReels-V3 / LTX-Video 13B-dev FP16 too, that opens up another quality tier for video workflows.
Thanks for shipping this. Genuine pipeline-unlocker for solo creators on 24GB cards.