WIP: Adding an implementation based on rocfft. #389

Open
PaulMullowney wants to merge 3 commits into ecmwf-ifs:develop from PaulMullowney:rocfft_outofplace

@PaulMullowney
Contributor

PaulMullowney commented Apr 24, 2026

This branch adds a rocFFT-based implementation. rocFFT distinguishes between in-place and out-of-place FFTs; hipFFT does not. Using rocFFT considerably reduces the number of FFT kernels that need to be JIT-compiled during plan creation. I measured this by setting the ROCFFT_RTC_CACHE_PATH environment variable, which saves the JIT-compiled kernels in a database.

| Truncation | FFT impl | ROCFFT_RTC_CACHE size | Time to create cache | Time to reload from cache |
|---|---|---|---|---|
| 639 | ROCFFT | 38 MB | 34s | 5s |
| 639 | HIPFFT | 60 MB | 150s | 16s |
| 1279 | ROCFFT | 88 MB | 91s | 23s |
| 1279 | HIPFFT | 135 MB | 450s | 79s |

@samhatfield
Collaborator

That's quite a saving @PaulMullowney, thanks for taking this forward. I will try to reproduce your findings on LUMI and AAC7 myself, then start having a look at your code.

@samhatfield
Collaborator

Here are some initial benchmarks on LUMI-G (MI250X):

| Truncation | develop (1st step) | develop (median) | this branch (1st step) | this branch (median) |
|---|---|---|---|---|
| 79 | 182 | 0.020 | 101 | 0.019 |
| 159 | 409 | 0.03 | 230 | 0.03 |
| 319 | 909 | 0.065 | 528 | 0.0655 |
| 639 | ***** | 0.193 | ***** | 0.191 |
| 1279 | crash | crash | ***** | 0.753 |

So indeed, the first step (which does FFT planning) is significantly accelerated with rocFFT.

Unfortunately, the bigger runs took so long on this step that the time overflowed the print format statement (hence the asterisks)... I will try to repeat the runs with a longer format.

Not sure what happened with the 1279 crash on develop yet.

@samhatfield
Collaborator

@PaulMullowney - could you rebase on develop and then force push your branch back to GH?

@samhatfield
Collaborator

Getting a strange warning on LUMI, for both hipFFT and rocFFT:

hip error code: 'hipErrorContextIsDestroyed':709 at /long_pathname_so_that_rpms_can_package_the_debug_info/src/rocBLAS/library/src/include/handle.hpp:437

Doesn't seem to have any serious consequences.

@PaulMullowney
Contributor Author

> @PaulMullowney - could you rebase on develop and then force push your branch back to GH?

Yes. I will try to do this today.

@PaulMullowney
Contributor Author

> Getting a strange warning on LUMI, for both hipFFT and rocFFT:
>
> hip error code: 'hipErrorContextIsDestroyed':709 at /long_pathname_so_that_rpms_can_package_the_debug_info/src/rocBLAS/library/src/include/handle.hpp:437
>
> Doesn't seem to have any serious consequences.

That's a new one. Hmmm

@PaulMullowney
Contributor Author

I rebased against develop and pushed.

@samhatfield
Collaborator

Updated results on LUMI-G:

| Truncation | develop (1st step) | develop (median) | this branch (1st step) | this branch (median) |
|---|---|---|---|---|
| 79 | 182 | 0.020 | 101 | 0.019 |
| 159 | 409 | 0.03 | 230 | 0.03 |
| 319 | 909 | 0.065 | 528 | 0.0655 |
| 639 | 1877 | 0.195 | 1142 | 0.199 |
| 1279 | timeout | timeout | 2761 | 0.755 |

Comment thread src/trans/gpu/algor/hicfft.rocfft.cpp Outdated
Comment thread src/trans/gpu/algor/hicfft.rocfft.cpp Outdated
Co-authored-by: Sam Hatfield <samuel.hatfield@ecmwf.int>
Co-authored-by: Sam Hatfield <samuel.hatfield@ecmwf.int>