vllm-project / vllm-gaudi Public

Pull requests: vllm-project/vllm-gaudi

Labels 12 Milestones 1

67 Open 681 Closed

Pull requests list

fix empty buckets issue for enforce eager mode

#761 opened Dec 25, 2025 by yangulei

[GAUDISW-243560] Monkey-patching _get_attn_scale for the Llama4Attention layer

#760 opened Dec 24, 2025 by rsmyrek

multimodal model embedding fixes

#759 opened Dec 23, 2025 by libinta

[GAUDISW-243560] Monkey-patching _get_attn_scale for the Llama4Attention layer

#758 opened Dec 23, 2025 by rsmyrek

debug inc

#755 opened Dec 23, 2025 by HolyFalafel • Draft

Prefill batching logic to handle chunked prefill/prefix caching for HPU

#753 opened Dec 23, 2025 by hlin99

Release Notes for v0.13.0

Labels: documentation, skip-gaudi-tests

#750 opened Dec 22, 2025 by mhelf-intel

[GAUDISW-244752] add dynamic scale for V-Cache on Hidden dim

#749 opened Dec 21, 2025 by dudilester

Update lmcache examples

#748 opened Dec 19, 2025 by hsubramony

0.13.0 release updates

Labels: skip-gaudi-tests

#747 opened Dec 19, 2025 by PatrykWo

Milestone: vllm-gaudi v0.13.0 release

Fix async_scheduling + batched prefill

#741 opened Dec 18, 2025 by tianmu-li

Fix async_scheduling + batched prefill

#740 opened Dec 18, 2025 by tianmu-li

[GAUDISW-244336] Add missing long ctx prompt buckets

#739 opened Dec 18, 2025 by kfojcik-intel

Added Qwen3 Test

#736 opened Dec 18, 2025 by slokesha

WA shared bias in UA

#727 opened Dec 16, 2025 by adobrzyn

Dryrun implementation for generating command line file

#723 opened Dec 16, 2025 by rajanintel24

DP: Fix for torch.compile

#722 opened Dec 16, 2025 by xinyu-intel

Add heterogeneous pd docs

#714 opened Dec 13, 2025 by pi314ever • Draft

Create UBI based vLLM docker build instructions

Labels: documentation, skip-gaudi-tests

#713 opened Dec 12, 2025 by ghandoura

Add ucx test

#711 opened Dec 12, 2025 by pi314ever • Draft

Fix for Llama4 static quantization

#707 opened Dec 10, 2025 by vidyasiv

Unified attn FP8 perf optimizations

#705 opened Dec 9, 2025 by afierka-intel

Fix the docker image path

Labels: documentation, skip-gaudi-tests

#691 opened Dec 5, 2025 by mhelf-intel

Add support for chunked attention (#597)

#683 opened Dec 4, 2025 by jkaniecki

Add support for chunked attention (#597)

#682 opened Dec 4, 2025 by jkaniecki
