Commit 8a58201

add docs and support comfyui weight names
1 parent 97a55b6 commit 8a58201

4 files changed

Lines changed: 43 additions & 2 deletions

README.md

Lines changed: 3 additions & 0 deletions
@@ -40,6 +40,7 @@ API and command-line option may change frequently.***
 - [Chroma](./docs/chroma.md)
 - [Chroma1-Radiance](./docs/chroma_radiance.md)
 - [Qwen Image](./docs/qwen_image.md)
+- [LongCat Image](./docs/longcat_image.md)
 - [Z-Image](./docs/z_image.md)
 - [Ovis-Image](./docs/ovis_image.md)
 - [Anima](./docs/anima.md)
@@ -48,6 +49,7 @@ API and command-line option may change frequently.***
 - Image Edit Models
 - [FLUX.1-Kontext-dev](./docs/kontext.md)
 - [Qwen Image Edit series](./docs/qwen_image_edit.md)
+- [LongCat Image Edit](./docs/longcat_image.md)
 - Video Models
 - [Wan2.1/Wan2.2](./docs/wan.md)
 - [LTX-2.3](./docs/ltx2.md)
@@ -133,6 +135,7 @@ For runtime and parameter backend placement, see the [backend selection guide](.
 - [Chroma](./docs/chroma.md)
 - [🔥Qwen Image](./docs/qwen_image.md)
 - [🔥Qwen Image Edit series](./docs/qwen_image_edit.md)
+- [🔥LongCat Image / LongCat Image Edit](./docs/longcat_image.md)
 - [🔥Wan2.1/Wan2.2](./docs/wan.md)
 - [🔥LTX-2.3](./docs/ltx2.md)
 - [🔥Z-Image](./docs/z_image.md)

assets/longcat/example.png

423 KB

docs/longcat_image.md

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+# How to Use
+
+LongCat-Image uses a LongCat diffusion transformer, the FLUX VAE, and Qwen2.5-VL as the LLM text encoder.
+
+## Download weights
+
+- Download LongCat Image
+- safetensors: https://huggingface.co/Comfy-Org/LongCat-Image/tree/main/split_files/diffusion_models
+- gguf: https://huggingface.co/vantagewithai/LongCat-Image-GGUF/tree/main/comfy
+- Download LongCat Image Edit
+- LongCat Image Edit Turbo: https://huggingface.co/meituan-longcat/LongCat-Image-Edit-Turbo
+- gguf: https://huggingface.co/vantagewithai/LongCat-Image-Edit-GGUF/tree/main
+- Download vae
+- safetensors: https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/ae.safetensors
+- Download qwen_2.5_vl 7b
+- safetensors: https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/tree/main/split_files/text_encoders
+- gguf: https://huggingface.co/mradermacher/Qwen2.5-VL-7B-Instruct-GGUF/tree/main
+- For image editing with GGUF text encoders, also download the matching mmproj file and pass it with `--llm_vision`.
+
+## Run
+
+LongCat uses quoted text for character-level text rendering. Put target text inside single quotes, double quotes, or Chinese quotes.
+
+### LongCat Image
+
+```
+.\bin\Release\sd-cli.exe --diffusion-model ..\..\ComfyUI\models\diffusion_models\LongCat-Image-Q4_K_M.gguf --vae ..\..\ComfyUI\models\vae\ae.sft --llm ..\..\ComfyUI\models\text_encoders\Qwen2.5-VL-7B-Instruct-Q8_0.gguf -p "a lovely cat holding a sign says 'longcat.cpp'" --cfg-scale 5.0 --sampling-method euler --flow-shift 3 -v --offload-to-cpu --diffusion-fa
+```
+
+<img alt="longcat example" src="../assets/longcat/example.png" />

src/name_conversion.cpp

Lines changed: 10 additions & 2 deletions
@@ -567,6 +567,11 @@ std::string convert_diffusers_dit_to_original_flux(std::string name) {
 flux_name_map[block_prefix + "attn.norm_k.weight"] = dst_prefix + "img_attn.norm.key_norm.scale";
 flux_name_map[block_prefix + "attn.norm_added_q.weight"] = dst_prefix + "txt_attn.norm.query_norm.scale";
 flux_name_map[block_prefix + "attn.norm_added_k.weight"] = dst_prefix + "txt_attn.norm.key_norm.scale";
+// Comfy-Org/LongCat-Image stores already-converted RMSNorm tensors as *.weight.
+flux_name_map[dst_prefix + "img_attn.norm.query_norm.weight"] = dst_prefix + "img_attn.norm.query_norm.scale";
+flux_name_map[dst_prefix + "img_attn.norm.key_norm.weight"] = dst_prefix + "img_attn.norm.key_norm.scale";
+flux_name_map[dst_prefix + "txt_attn.norm.query_norm.weight"] = dst_prefix + "txt_attn.norm.query_norm.scale";
+flux_name_map[dst_prefix + "txt_attn.norm.key_norm.weight"] = dst_prefix + "txt_attn.norm.key_norm.scale";
 
 // ff
 flux_name_map[block_prefix + "ff.net.0.proj.weight"] = dst_prefix + "img_mlp.0.weight";
@@ -605,8 +610,11 @@ std::string convert_diffusers_dit_to_original_flux(std::string name) {
 
 flux_name_map[block_prefix + "attn.norm_q.weight"] = dst_prefix + "norm.query_norm.scale";
 flux_name_map[block_prefix + "attn.norm_k.weight"] = dst_prefix + "norm.key_norm.scale";
-flux_name_map[block_prefix + "proj_out.weight"] = dst_prefix + "linear2.weight";
-flux_name_map[block_prefix + "proj_out.bias"] = dst_prefix + "linear2.bias";
+// Comfy-Org/LongCat-Image stores already-converted RMSNorm tensors as *.weight.
+flux_name_map[dst_prefix + "norm.query_norm.weight"] = dst_prefix + "norm.query_norm.scale";
+flux_name_map[dst_prefix + "norm.key_norm.weight"] = dst_prefix + "norm.key_norm.scale";
+flux_name_map[block_prefix + "proj_out.weight"] = dst_prefix + "linear2.weight";
+flux_name_map[block_prefix + "proj_out.bias"] = dst_prefix + "linear2.bias";
 }
 
 // --- final layers ---
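
The added map entries alias a tensor name that already uses the original FLUX layout but ends in `.weight` onto the same name ending in `.scale`. The sketch below illustrates that idea in isolation and is not the project's code. `build_norm_aliases` and `resolve_name` are hypothetical helpers, and the `double_blocks.` / `single_blocks.` prefixes and block counts are assumptions, since the diff does not show how `dst_prefix` is built.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

using NameMap = std::unordered_map<std::string, std::string>;

// Hypothetical helper (not from the commit): builds alias entries that map
// ComfyUI-style "*.weight" RMSNorm tensor names onto "*.scale" names.
// The "double_blocks." / "single_blocks." prefixes are assumed here.
NameMap build_norm_aliases(int num_double_blocks, int num_single_blocks) {
    NameMap aliases;
    const std::vector<std::string> double_norms = {
        "img_attn.norm.query_norm", "img_attn.norm.key_norm",
        "txt_attn.norm.query_norm", "txt_attn.norm.key_norm"};
    const std::vector<std::string> single_norms = {
        "norm.query_norm", "norm.key_norm"};

    for (int i = 0; i < num_double_blocks; ++i) {
        const std::string dst_prefix = "double_blocks." + std::to_string(i) + ".";
        for (const auto& norm : double_norms) {
            aliases[dst_prefix + norm + ".weight"] = dst_prefix + norm + ".scale";
        }
    }
    for (int i = 0; i < num_single_blocks; ++i) {
        const std::string dst_prefix = "single_blocks." + std::to_string(i) + ".";
        for (const auto& norm : single_norms) {
            aliases[dst_prefix + norm + ".weight"] = dst_prefix + norm + ".scale";
        }
    }
    return aliases;
}

// Returns the aliased name if one exists; otherwise returns the name unchanged.
std::string resolve_name(const NameMap& aliases, const std::string& name) {
    const auto it = aliases.find(name);
    return it == aliases.end() ? name : it->second;
}
```

With this sketch, `resolve_name` leaves any name that has no alias untouched, so tensors that already end in `.scale`, or that are not RMSNorm tensors at all, pass through unchanged.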
