add LongCat Image docs and support ComfyUI weight names

leejet · leejet · commit 8a582016c36b · 2026-05-24T01:54:09.000+08:00
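
The name_conversion.cpp hunks in this diff can be pictured with a small standalone sketch. It is not the code from this commit: `make_norm_aliases` and `rename_tensor` are hypothetical helpers, and the `double_blocks.0.` prefix format is an assumption made for illustration. It only shows the idea that checkpoint keys ending in `.weight` for the query/key RMSNorm tensors are aliased to the `.scale` names used on the right-hand side of `flux_name_map`.

```cpp
#include <initializer_list>
#include <string>
#include <unordered_map>

using NameMap = std::unordered_map<std::string, std::string>;

// Hypothetical helper (not in the commit): builds the four *.weight -> *.scale
// aliases for one double-stream block. `dst_prefix` is assumed to look like
// "double_blocks.0." and to already follow the original FLUX naming.
NameMap make_norm_aliases(const std::string& dst_prefix) {
    NameMap aliases;
    for (const char* stream : {"img_attn", "txt_attn"}) {
        for (const char* norm : {"query_norm", "key_norm"}) {
            const std::string base = dst_prefix + stream + ".norm." + norm;
            aliases[base + ".weight"] = base + ".scale";
        }
    }
    return aliases;
}

// Hypothetical lookup: returns the mapped name if one exists,
// otherwise returns the input name unchanged.
std::string rename_tensor(const NameMap& map, const std::string& name) {
    auto it = map.find(name);
    return it == map.end() ? name : it->second;
}
```

With these helpers, a key such as `double_blocks.0.img_attn.norm.query_norm.weight` would be renamed to the matching `.scale` key, while unrelated keys pass through untouched.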
diff --git a/README.md b/README.md
@@ -40,6 +40,7 @@ API and command-line option may change frequently.***
     - [Chroma](./docs/chroma.md)
     - [Chroma1-Radiance](./docs/chroma_radiance.md)
     - [Qwen Image](./docs/qwen_image.md)
+    - [LongCat Image](./docs/longcat_image.md)
     - [Z-Image](./docs/z_image.md)
     - [Ovis-Image](./docs/ovis_image.md)
     - [Anima](./docs/anima.md)
@@ -48,6 +49,7 @@ API and command-line option may change frequently.***
   - Image Edit Models
     - [FLUX.1-Kontext-dev](./docs/kontext.md)
     - [Qwen Image Edit series](./docs/qwen_image_edit.md)
+    - [LongCat Image Edit](./docs/longcat_image.md)
   - Video Models
     - [Wan2.1/Wan2.2](./docs/wan.md)
     - [LTX-2.3](./docs/ltx2.md)
@@ -133,6 +135,7 @@ For runtime and parameter backend placement, see the [backend selection guide](.
 - [Chroma](./docs/chroma.md)
 - [🔥Qwen Image](./docs/qwen_image.md)
 - [🔥Qwen Image Edit series](./docs/qwen_image_edit.md)
+- [🔥LongCat Image / LongCat Image Edit](./docs/longcat_image.md)
 - [🔥Wan2.1/Wan2.2](./docs/wan.md)
 - [🔥LTX-2.3](./docs/ltx2.md)
 - [🔥Z-Image](./docs/z_image.md)
diff --git a/assets/longcat/example.png b/assets/longcat/example.png
diff --git a/docs/longcat_image.md b/docs/longcat_image.md
@@ -0,0 +1,30 @@
+# How to Use
+
+LongCat-Image uses a LongCat diffusion transformer, the FLUX VAE, and Qwen2.5-VL as the LLM text encoder.
+
+## Download weights
+
+- Download LongCat Image
+    - safetensors: https://huggingface.co/Comfy-Org/LongCat-Image/tree/main/split_files/diffusion_models
+    - gguf: https://huggingface.co/vantagewithai/LongCat-Image-GGUF/tree/main/comfy
+- Download LongCat Image Edit
+    - LongCat Image Edit Turbo: https://huggingface.co/meituan-longcat/LongCat-Image-Edit-Turbo
+    - gguf: https://huggingface.co/vantagewithai/LongCat-Image-Edit-GGUF/tree/main
+- Download VAE
+    - safetensors: https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/ae.safetensors
+- Download Qwen2.5-VL 7B
+    - safetensors: https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/tree/main/split_files/text_encoders
+    - gguf: https://huggingface.co/mradermacher/Qwen2.5-VL-7B-Instruct-GGUF/tree/main
+    - For image editing with GGUF text encoders, also download the matching mmproj file and pass it with `--llm_vision`.
+
+## Run
+
+LongCat applies character-level text rendering to quoted text. Put the text you want rendered inside single quotes, double quotes, or Chinese quotation marks.
+
+### LongCat Image
+
+```
+.\bin\Release\sd-cli.exe --diffusion-model ..\..\ComfyUI\models\diffusion_models\LongCat-Image-Q4_K_M.gguf --vae ..\..\ComfyUI\models\vae\ae.sft --llm ..\..\ComfyUI\models\text_encoders\Qwen2.5-VL-7B-Instruct-Q8_0.gguf -p "a lovely cat holding a sign says 'longcat.cpp'" --cfg-scale 5.0 --sampling-method euler --flow-shift 3 -v --offload-to-cpu --diffusion-fa
+```
+
+<img alt="longcat example" src="../assets/longcat/example.png" />
diff --git a/src/name_conversion.cpp b/src/name_conversion.cpp
@@ -567,6 +567,11 @@ std::string convert_diffusers_dit_to_original_flux(std::string name) {
             flux_name_map[block_prefix + "attn.norm_k.weight"]       = dst_prefix + "img_attn.norm.key_norm.scale";
             flux_name_map[block_prefix + "attn.norm_added_q.weight"] = dst_prefix + "txt_attn.norm.query_norm.scale";
             flux_name_map[block_prefix + "attn.norm_added_k.weight"] = dst_prefix + "txt_attn.norm.key_norm.scale";
+            // Comfy-Org/LongCat-Image already uses the original FLUX tensor names, but stores RMSNorm scales as *.weight instead of *.scale.
+            flux_name_map[dst_prefix + "img_attn.norm.query_norm.weight"] = dst_prefix + "img_attn.norm.query_norm.scale";
+            flux_name_map[dst_prefix + "img_attn.norm.key_norm.weight"]   = dst_prefix + "img_attn.norm.key_norm.scale";
+            flux_name_map[dst_prefix + "txt_attn.norm.query_norm.weight"] = dst_prefix + "txt_attn.norm.query_norm.scale";
+            flux_name_map[dst_prefix + "txt_attn.norm.key_norm.weight"]   = dst_prefix + "txt_attn.norm.key_norm.scale";
 
             // ff
             flux_name_map[block_prefix + "ff.net.0.proj.weight"] = dst_prefix + "img_mlp.0.weight";
@@ -605,8 +610,11 @@ std::string convert_diffusers_dit_to_original_flux(std::string name) {
 
             flux_name_map[block_prefix + "attn.norm_q.weight"] = dst_prefix + "norm.query_norm.scale";
             flux_name_map[block_prefix + "attn.norm_k.weight"] = dst_prefix + "norm.key_norm.scale";
-            flux_name_map[block_prefix + "proj_out.weight"]    = dst_prefix + "linear2.weight";
-            flux_name_map[block_prefix + "proj_out.bias"]      = dst_prefix + "linear2.bias";
+            // Comfy-Org/LongCat-Image already uses the original FLUX tensor names, but stores RMSNorm scales as *.weight instead of *.scale.
+            flux_name_map[dst_prefix + "norm.query_norm.weight"] = dst_prefix + "norm.query_norm.scale";
+            flux_name_map[dst_prefix + "norm.key_norm.weight"]   = dst_prefix + "norm.key_norm.scale";
+            flux_name_map[block_prefix + "proj_out.weight"]      = dst_prefix + "linear2.weight";
+            flux_name_map[block_prefix + "proj_out.bias"]        = dst_prefix + "linear2.bias";
         }
 
         // --- final layers ---