Commit 8811f7a

Added blog post "INT4 Decoding GQA CUDA Optimizations for LLM Inference" (#1648)
Signed-off-by: Chris Abraham <[email protected]>
1 parent c2cd932 commit 8811f7a

23 files changed, with 3485 additions and 0 deletions

_posts/2024-06-06-int4-decoding.md

Lines changed: 3485 additions & 0 deletions

assets/images/int4-decoding/eq.jpg

53.4 KB
98.9 KB
18.2 KB
54.6 KB
411 KB
296 KB
207 KB
347 KB
460 KB
