Conversation
Force-pushed 56ed028 to a8fda92
Force-pushed d0c1431 to 1e08917
Force-pushed 1e08917 to eff87cf
Force-pushed eff87cf to 3d2c334
Force-pushed 3d2c334 to 5486740
Force-pushed 5486740 to 6ce8e33
            units="-"
            description="Method to use for exchanging halos"
            possible_values="`mpas_dmpar', `mpas_halo'"/>
<nml_option name="config_gpu_aware_mpi" type="logical" default_value="false"
For consistency and readability, let's consider adding a blank line above this. (I do see that the IAU namelist record doesn't use a blank line between options, but we can address that in a separate clean-up PR at a later time.)
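For illustration, a sketch of the suggested layout, with a blank line separating the two options (the indentation, attribute wrapping, and the closing of the second option are assumed, not copied from the PR):

```xml
            possible_values="`mpas_dmpar', `mpas_halo'"/>

<nml_option name="config_gpu_aware_mpi" type="logical" default_value="false"
            units="-"
            description="Whether to use GPU-aware MPI for halo exchanges"/>
```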
src/core_atmosphere/Registry.xml
Outdated
            possible_values="`mpas_dmpar', `mpas_halo'"/>
<nml_option name="config_gpu_aware_mpi" type="logical" default_value="false"
            units="-"
            description="Whether to use GPU-aware MPI for halo exchanges"
In the description, we could also mention that this only works when config_halo_exch_method is set to mpas_halo.
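One possible wording if the suggestion is adopted; the exact phrasing and formatting here are illustrative, not taken from the PR:

```xml
<nml_option name="config_gpu_aware_mpi" type="logical" default_value="false"
            units="-"
            description="Whether to use GPU-aware MPI for halo exchanges; effective only when config_halo_exch_method is set to 'mpas_halo'"/>
```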
src/framework/mpas_dmpar.F
Outdated
          iErr = MPAS_DMPAR_NOERR
       end if

       useGPUAwareMPI = .false.
Is this local variable actually needed?
src/core_atmosphere/mpas_atm_halos.F
Outdated
       call mpas_log_write('')

       if (config_gpu_aware_mpi) then
          call mpas_log_write('GPU-aware MPI is not presently supported with config_halo_exch_method = mpas_dmpar', MPAS_LOG_CRIT)
This seems like it's redundant given the check that exists in the mpas_dmpar_exch_group_full_halo_exch routine.
I will retain this check and remove the redundant one in mpas_dmpar_exch_group_full_halo_exch so that it will terminate earlier.
Actually, I think the check in mpas_dmpar_exch_group_full_halo_exch is preferable, since it would catch uses of the routine with withGPUAwareMPI = .true. in cores other than the atmosphere core.
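A minimal sketch of what such a guard inside mpas_dmpar_exch_group_full_halo_exch might look like; the optional-argument name withGPUAwareMPI comes from the discussion above, but everything else (the use of an optional argument, the log call) is an assumption for illustration:

```fortran
       ! Sketch only: abort early if any core requests GPU-aware MPI through
       ! the mpas_dmpar exchange path, which does not support it.
       if (present(withGPUAwareMPI)) then
          if (withGPUAwareMPI) then
             call mpas_log_write('GPU-aware MPI is not presently supported with config_halo_exch_method = mpas_dmpar', MPAS_LOG_CRIT)
          end if
       end if
```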
       integer, dimension(:,:,:), CONTIGUOUS pointer :: recvListSrc, recvListDst
       integer, dimension(:), CONTIGUOUS pointer :: unpackOffsets

I'd suggest leaving this blank line in place, as this change isn't necessary to the objective of this PR and this line wouldn't otherwise be modified.
src/framework/mpas_halo.F
Outdated
          return
       end if

       call mpas_timer_start('full_halo_exch')
In the past, my experience has been that timing for halo exchanges may not be especially valuable. Load imbalance can show up as time apparently spent in halo communications, when it's really just that one MPI task had to finish some extra work before participating in the halo exchange. What do you think about removing these timers?
That's true. I'm fine with removing it for now and revisiting it as needed.
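If timers are ever revisited, one common way to separate load imbalance from true communication cost is to time a barrier immediately before the exchange; this is only a sketch of that idea, not code from the PR, and the timer labels and the dminfo communicator access are assumed:

```fortran
       ! Sketch: time spent in this barrier approximates load imbalance, so the
       ! 'full_halo_exch' timer below more closely reflects communication cost.
       call mpas_timer_start('pre_halo_exch_barrier')
       call MPI_Barrier(dminfo % comm, mpi_ierr)
       call mpas_timer_stop('pre_halo_exch_barrier')

       call mpas_timer_start('full_halo_exch')
```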
src/framework/mpas_halo.F
Outdated
#endif
       rank = group % fields(1) % compactHaloInfo(8)

       !$acc data present(group % recvBuf(:), group % sendBuf(:)) if(useGPUAwareMPI)
If the parallel directives in this routine all use default(present), is this directive necessary?
Doesn't appear to be required, and will remove it.
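For illustration, a sketch of why the enclosing data directive is redundant when each kernel uses default(present); the loop body and variable names here are illustrative, not from the PR:

```fortran
       ! Sketch: with default(present), the compiler requires sendBuf, field,
       ! and packIdx to already be present on the device, so a separate
       ! enclosing '!$acc data present(...)' region adds nothing.
       !$acc parallel loop default(present)
       do i = 1, nPack
          sendBuf(i) = field(packIdx(i))
       end do
```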
This commit enables execution of halo exchanges on GPUs via OpenACC directives, provided the MPAS atmosphere core has been built with an appropriate GPU-aware MPI distribution. Module mpas_halo is modified in the following ways to enable GPU-aware halo exchanges:

- In the call to mpas_halo_exch_group_complete, OpenACC directives copy to the device all the relevant fields and metadata that are required for the packing and unpacking loops later.
- OpenACC directives are introduced around the packing and unpacking loops to perform the field to/from send/recv buffer operations on the device. The attach clauses introduced on the parallel constructs ensure that the device pointers are attached to their device targets at the start of the parallel region and detached at the end of the region.
- The actual MPI_Isend and MPI_Irecv operations use GPU-aware MPI, by wrapping these calls within !$acc host_data constructs.

Note: This commit introduces temporary host-device data movements in the atm_core_init routine around the two calls to exchange_halo_group. This is required only for this commit: since all halo exchanges occur on the device, fields not yet present on the device must be copied to it before the halo exchanges and back to the host afterwards. These copies will be removed in subsequent commits.
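The host_data wrapping described above might look roughly like this sketch; the buffer, count, rank, tag, and request names are assumed for illustration, not taken from the PR:

```fortran
       ! Sketch: host_data exposes the device addresses of the buffers to MPI,
       ! so MPI_Irecv/MPI_Isend transfer directly to/from GPU memory when
       ! useGPUAwareMPI is true.
       !$acc host_data use_device(recvBuf, sendBuf) if(useGPUAwareMPI)
       call MPI_Irecv(recvBuf, nRecv, MPI_REALKIND, srcRank, tag, comm, recvRequest, mpi_ierr)
       call MPI_Isend(sendBuf, nSend, MPI_REALKIND, dstRank, tag, comm, sendRequest, mpi_ierr)
       !$acc end host_data
```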
Introduce a new namelist option under the development record, config_gpu_aware_mpi, which controls whether MPAS runs on GPUs use GPU-aware MPI or instead perform device<->host updates of variables around a purely CPU-based halo exchange.

Note: This feature is not available when config_halo_exch_method is set to 'mpas_dmpar'.
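As a usage sketch, enabling the option in the namelist might look like the fragment below; the record name development comes from the PR, but the surrounding file layout is assumed:

```
&development
    config_gpu_aware_mpi = true
/
```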
Force-pushed 6ce8e33 to 2ae3105
This PR enables execution of halo exchanges on GPUs via OpenACC directives, when the MPAS atmosphere core has been built with an appropriate GPU-aware MPI library. This PR builds on the OpenACC memory consolidation introduced in #1315.
Module `mpas_halo` is modified in the following ways to enable GPU-aware halo exchanges:

- In the call to `mpas_halo_exch_group_complete`, OpenACC directives copy to the device all the relevant fields and metadata that are required for the packing and unpacking loops later.
- OpenACC directives are introduced around the packing and unpacking loops to perform the field to/from send/recv buffer operations on the device. The `attach` clauses introduced on the parallel constructs ensure that the device pointers are attached to the device targets at the start of the parallel region and detached at the end of the region.
- The actual `MPI_Isend` and `MPI_Irecv` operations use GPU-aware MPI, by wrapping these calls within `!$acc host_data` constructs.

In addition, this PR introduces a new namelist option under the `development` group, `config_gpu_aware_mpi`. When set to `true`, it switches on GPU-direct halo exchanges. The default setting is `false`, in which case halo exchanges occur on the host, which necessitates host<->device data transfers around the halo exchanges as necessary.

Note: This feature is not available when `config_halo_exch_method` is set to `'mpas_dmpar'`.