QR operator utilizing XPU. #2399
base: main

Conversation
There were no tests on the skip lists, as QR was silently falling back to CPU. After removing the fallback, some tests may now start to fail.
CuiYifeng
left a comment
Please check comments and fix related failed cases.
```cpp
TORCH_IMPL_FUNC(linalg_qr_xpu_out)(const Tensor& A,
                                   std::string_view mode,
                                   const Tensor& Q,
                                   const Tensor& R) {
#if defined(USE_ONEMKL_XPU)
  xpu::linalg_qr_kernel(A, mode, Q, R);
#else
  // Fallback: compute on CPU, then copy the results back to the XPU tensors.
  auto A_cpu = A.to(at::kCPU);
  auto Q_cpu = at::empty_like(Q, at::kCPU);
  auto R_cpu = at::empty_like(R, at::kCPU);
  at::cpu::linalg_qr_out(Q_cpu, R_cpu, A_cpu, mode);
  Q.copy_(Q_cpu);
  R.copy_(R_cpu);
#endif // USE_ONEMKL_XPU
}
```
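Whichever path is taken, the output must satisfy the `linalg_qr` contract: `Q` has orthonormal columns, `R` is upper-triangular, and `Q @ R` reconstructs `A`. A minimal sketch of that contract, using NumPy as a stand-in (torch.linalg.qr follows the same reduced-mode semantics):

```python
import numpy as np

# Reduced-mode QR: for an m x n input with m >= n, Q is m x n with
# orthonormal columns and R is n x n upper-triangular.
A = np.random.default_rng(0).standard_normal((5, 3))
Q, R = np.linalg.qr(A, mode="reduced")

assert np.allclose(Q @ R, A)            # factorization reconstructs A
assert np.allclose(Q.T @ Q, np.eye(3))  # columns of Q are orthonormal
assert np.allclose(R, np.triu(R))       # R is upper-triangular
```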
My suggestion is to register geqrf_kernel_xpu/orgqr_kernel_xpu to geqrf_stub/orgqr_stub, which allows us to reuse op-level code in stock PyTorch and to reuse these two kernels in the future.
```yaml
- func: linalg_qr(Tensor A, str mode='reduced') -> (Tensor Q, Tensor R)
  python_module: linalg
  variants: function
  structured_delegate: linalg_qr.out

- func: linalg_qr.out(Tensor A, str mode='reduced', *, Tensor(a!) Q, Tensor(b!) R) -> (Tensor(a!) Q, Tensor(b!) R)
  python_module: linalg
  structured: True
  dispatch:
    XPU: linalg_qr_xpu_out
```
Please refer to https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/native_functions.yaml#L14623 and consider the stub approach mentioned above.
Fixes #1900. The implementation uses two oneMKL LAPACK routines: geqrf for the factorization and orgqr for recovering the explicit Q matrix. Since torch and LAPACK use different storage formats, a hard transposition (changing the memory layout, not only the strides) was necessary. The iteration over the batch uses the internal memory layout of the processed data.