Update NVHPC version to 26 #386
    if (size(sp3d) > 0) then
      call egath_spec(kfgathg=numfld, kto=[(1, i = 1, numfld)], kvset=ivset, pspec=sp3d(:,:,jfld))
    endif
This guard could cause an MPI deadlock if this rank is expected to communicate within this routine (SEND/RECEIVE, BARRIER, etc.), i.e. when size(sp3d) > 0 on some ranks and size(sp3d) == 0 on others.
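For illustration, a minimal sketch of the hazard described in this comment. It uses plain MPI calls as a stand-in for whatever communication egath_spec performs internally; the program name, the buffer `buf`, and the data distribution are hypothetical and not taken from ecTrans.

```fortran
program guard_deadlock_sketch
  ! Assumption: an MPI library providing the "mpi" module is available.
  use mpi
  implicit none
  integer :: ierr, rank, nlocal
  real, allocatable :: buf(:,:)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

  ! Hypothetical distribution: only rank 0 owns any fields, so every
  ! other rank ends up with a zero-sized (0, 3) buffer.
  nlocal = merge(2, 0, rank == 0)
  allocate(buf(nlocal, 3))

  ! Hazardous pattern (left commented out): ranks with an empty buffer
  ! would skip the collective while rank 0 waits in it indefinitely.
  !   if (size(buf) > 0) call MPI_Barrier(MPI_COMM_WORLD, ierr)

  ! Safe pattern: every rank enters the collective, whatever size(buf) is.
  call MPI_Barrier(MPI_COMM_WORLD, ierr)

  call MPI_Finalize(ierr)
end program guard_deadlock_sketch
```

Whether the guarded call in this PR actually communicates on ranks with no local fields is exactly the open question raised here; the sketch only shows why a rank-dependent guard around a collective is risky.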
Damn, I think you're right. The other tricks (LBOUND etc.) didn't work here. Let me investigate further.
      IF (NPROC > 1.AND.MYPROC /= KMASTER) THEN
    -   CALL MPL_SEND(PSM(:,:),KDEST=NPRCIDS(KMASTER),KTAG=ITAG,&
    +   CALL MPL_SEND(PSM,KDEST=NPRCIDS(KMASTER),KTAG=ITAG,&
I believe PSM(:,:) was required for some other compilers; otherwise it would not be there, no?
Ah, interesting, didn't know that. I thought the coder was just following our style guide, which I think recommends always passing colon indices even when every element is requested.
Not 100% sure, it was more of a question. Maybe @ddegrauwe or @RyadElKhatibMF will know better?
@samhatfield This is a known regression and should be fixed in 26.3. This provides some context:
😮 we were assuming 26.1 and 26.3 behave similarly. So maybe none of these changes are needed?? Edit: let's see... reverting all workarounds.
I thought I was testing all this with 26.3 as well, having these issues.
It is at least worth testing... if you still see issues with 26.3, I am happy to also have a look. If it is a compiler bug, I would prefer fixing the compiler rather than adding workarounds :D
Yes, issue still present in 26.3, see https://github.com/ecmwf-ifs/ectrans/actions/runs/24828523187/job/72670702893?pr=386.
Thanks, I will have a look!
Thanks Lukas for being proactive about this! Always nicer to have a more robust compiler without workarounds.
When 26.5 is released soon (next month perhaps) we will revive this PR.
@samhatfield could we just turn off bounds-checking for those compiler versions? Would that work for now?
Force-pushed: 9ec7c62 to 32c7b2e
I'm not actually sure where ecTrans gets its debug flags from:
Yes, it comes from -DCMAKE_BUILD_TYPE=Debug
This PR updates the NVHPC version to 26 (currently 26.1 but we could consider 26.3).
At some point nvfortran with bounds checking enabled became a lot stricter about how it handles zero-sized arrays. These arrays crop up in a few places in ecTrans, especially when running with NPRTRV /= 1 and NPROC > 1. A task can have no resident data for a spectral field if it hasn't been assigned any through IVSET, so we pass around arrays with shapes like (0, n). Mathematically this shouldn't be a problem, as long as we index them appropriately (the alternative would be guards all over the place disabling subroutine calls when no local fields are present). But nvfortran now sometimes flags accesses to these arrays as "out of bounds", e.g. PSP(:,:) where the first dimension is zero-sized.

So this PR also attempts to resolve all of the offending cases. A few of my fixes are a bit hacky, so I'm open to suggestions for better ways around this.
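To make the zero-sized case concrete, here is a minimal standalone sketch (not ecTrans code; the program name and the `consume` routine are hypothetical) of a (0, n) array being passed both as a whole array and as a full section. Both forms are allowed by the Fortran standard; whether a particular nvfortran release flags the section form under bounds checking is the behaviour in question and is not something this sketch demonstrates.

```fortran
program zero_size_demo
  implicit none
  integer, parameter :: n = 4
  real, allocatable :: psp(:,:)

  ! A task with no resident spectral fields ends up with a (0, n) array.
  allocate(psp(0, n))

  print *, 'size  =', size(psp)
  print *, 'shape =', shape(psp)

  ! Whole-array form and full-section form; both denote the same
  ! zero-sized array.
  call consume(psp)
  call consume(psp(:,:))

contains

  subroutine consume(p)
    real, intent(in) :: p(:,:)
    integer :: j
    ! Loop over the second dimension; each p(:, j) is a zero-length
    ! section, so no element is ever read and sum() returns zero.
    do j = 1, size(p, 2)
      print *, j, sum(p(:, j))
    end do
  end subroutine consume

end program zero_size_demo
```

Building this with bounds checking enabled (for nvfortran, the -Mbounds flag) on the affected and fixed compiler versions would be one way to check the regression in isolation.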