fix: fix package deprecation introduced by CUDA 13 #4117
Conversation
Should we also upgrade triton?
Since flash-attention doesn't have a CUDA 13 build yet, we need to be more careful with the lmdeploy CUDA 13 release due to potential compatibility issues.
I think for now we can make our code CUDA 13 ready but not ship the CUDA 13 wheels and images until testing is complete and the relevant dependencies are ready. Anyone who wants to use LMDeploy with CUDA 13 can build it from source.
I've built the Docker image. Then, in the container, I tried serving a model with the turbomind backend but it failed: the loader cannot find "libcublas.so".
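A quick way to reproduce this kind of check is to ask the dynamic loader whether it can resolve cuBLAS at all. This is a generic diagnostic sketch, not code from the PR; the library name passed to `find_library` is the usual base name for cuBLAS:

```python
# Diagnostic sketch: check whether the dynamic loader can locate cuBLAS.
# A None result matches the failure described above (library not on the
# loader's search path, e.g. LD_LIBRARY_PATH missing the wheel's lib dir).
import ctypes.util

found = ctypes.util.find_library("cublas")
if found is None:
    print("libcublas not found on the default loader search path")
else:
    print(f"loader resolved cuBLAS as: {found}")
```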
The PyTorch engine doesn't work either.
So neither inference engine works, even though users can build lmdeploy from source in a CUDA 13 environment.
After setting LD_LIBRARY_PATH, the turbomind engine works.
After upgrading triton to its latest version, the PyTorch engine works too. I agree we should defer the release until complete verification. In the meantime, I recommend adding the LD_LIBRARY_PATH configuration to this PR so that at least one engine is functional.
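The NVIDIA CUDA wheels unpack their shared libraries under per-package `lib/` directories inside site-packages. A sketch of how one might collect those directories to build an LD_LIBRARY_PATH value (the `nvidia/<pkg>/lib` layout is an assumption about how the wheels unpack; verify it in your environment):

```python
# Sketch: gather lib/ directories from pip-installed nvidia-* wheels so
# the dynamic loader can find libcublas.so and friends. The
# site-packages/nvidia/<pkg>/lib layout is assumed, not taken from the PR.
import glob
import os
import site


def nvidia_lib_dirs():
    """Return lib/ directories shipped by NVIDIA CUDA wheels, if any."""
    dirs = []
    for sp in site.getsitepackages():
        dirs.extend(sorted(glob.glob(os.path.join(sp, "nvidia", "*", "lib"))))
    return dirs


if __name__ == "__main__":
    extra = ":".join(nvidia_lib_dirs())
    current = os.environ.get("LD_LIBRARY_PATH", "")
    combined = extra + (":" + current if current and extra else current)
    print(f"LD_LIBRARY_PATH={combined}")
```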
Motivation
NVIDIA deprecated the versioned wheel package names starting with CUDA 13, causing CUDA 13+ installations to fail with deprecated package names like nvidia-cublas-cu13.
Modification
Remove the unconditional return to allow the version check to execute.
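The kind of version check being restored can be sketched as picking the wheel name from the detected CUDA major version. The exact package names for CUDA 13 are an assumption here (NVIDIA dropped the versioned `-cuXX` suffix); adjust to the actual requirement strings used by the build scripts:

```python
# Hypothetical sketch of the restored version check: choose a cuBLAS
# wheel name based on the CUDA major version. The unversioned CUDA 13
# name is an assumption, not taken from the PR diff.
def cublas_requirement(cuda_major: int) -> str:
    if cuda_major >= 13:
        # CUDA 13+: versioned names are deprecated, use the plain name
        return "nvidia-cublas"
    # CUDA 12 and earlier keep the versioned "-cuXX" suffix
    return f"nvidia-cublas-cu{cuda_major}"


print(cublas_requirement(12))  # nvidia-cublas-cu12
print(cublas_requirement(13))  # nvidia-cublas
```

Without the version check (i.e., with the unconditional return the PR removes), every install would resolve to one hard-coded branch regardless of the CUDA version actually present.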