I work on cache-dit, where I use diffusers as the native context parallel backend and rely heavily on custom cp_plan definitions to support context parallelism for mainstream models (I have also made many models work with context parallel + attn_mask via the native attention backend). These models include: FLUX, Qwen-Image, LTXVideo, Wan, HunyuanImage, HunyuanVideo, CogVideoX, CogView3Plus, CogView4, Qwen-Image-Lightning, ConsisID, Chroma, VisualCloze, etc. If you would like to integrate these implementations into diffusers, I am willing to submit PRs to add support. I will keep a long-term focus on the context parallel feature of diffusers and hope it can become an extremely user-friendly feature. Our experimental cp_planners are at: cp_planners
@sayakpaul @yiyixuxu @DN6