Skip to content

Use MPI_Dist_graph_create to determine comm pattern#1358

Merged
aprokop merged 4 commits into
arborx:masterfrom
aprokop:gone_with_mpi_alltoall
Jun 9, 2026
Merged

Use MPI_Dist_graph_create to determine comm pattern#1358
aprokop merged 4 commits into
arborx:masterfrom
aprokop:gone_with_mpi_alltoall

Conversation

@aprokop

@aprokop aprokop commented May 17, 2026

Copy link
Copy Markdown
Contributor

MPI_Alltoall has issues with scaling. This patch replaces it by using MPI_Dist_graph_create and other helper functions to rely on MPI. This seems to scale better on Frontier:
Screenshot 2026-05-17 at 12 49 22 PM

This bumps our MPI requirement to 2.2.

Also replaced OpenMPI with MPICH in the HIP build.

@dalg24 pointed out that we did something similar in ORNL-CEES/DataTransferKit#516. So this can be seen as a partial revert of that.

@aprokop aprokop added the performance Something is slower than it should be label May 17, 2026
@aprokop

aprokop commented May 17, 2026

Copy link
Copy Markdown
Contributor Author

Tested it on Frontier. Need to test it more thoroughly:

  • on Aurora to see if it helps with scaling
  • using GPU-aware MPI

@aprokop aprokop changed the title Use MPI_Dist_graph* to determine comm pattern Use MPI_Dist_graph_create to determine comm pattern May 18, 2026
dest_weights[i] = _dest_offsets[i + 1] - _dest_offsets[i];

MPI_Comm graph_comm;
MPI_Dist_graph_create(_comm, (outdegree ? 1 : 0), &comm_rank, &outdegree,

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we save the graph or do we need to rebuild it every time?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now, we don't use it past Distributor construction. I do want to try to use it for neighborhood communications inside doPostsAndWaits in the future. I'll store it then if it works out.

@aprokop aprokop force-pushed the gone_with_mpi_alltoall branch 3 times, most recently from 04101b8 to 5afb287 Compare June 4, 2026 15:40
@aprokop

aprokop commented Jun 4, 2026

Copy link
Copy Markdown
Contributor Author

Replaced OpenMPI with MPICH in the HIP build.

@aprokop aprokop force-pushed the gone_with_mpi_alltoall branch 3 times, most recently from 6beb94d to 0e33f85 Compare June 8, 2026 19:31
@aprokop

aprokop commented Jun 8, 2026

Copy link
Copy Markdown
Contributor Author

Switching HIP build from OpenMPI to MPICH results in very weird errors when building ArborX:

>>>               CMakeFiles/ArborX_Example_HostAccessTraits.exe.dir/example_host_access_traits.cpp.o:(Kokkos::Impl::SharedAllocationRecord<void, void>* Kokkos::Impl::ViewMapping<Kokkos::ViewTraits<int*, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>>, void>::allocate_shared<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, Kokkos::HostSpace, Kokkos::OpenMP>(Kokkos::Impl::ViewCtorProp<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, Kokkos::HostSpace, Kokkos::OpenMP> const&, Kokkos::LayoutRight const&, bool))
>>> referenced by Kokkos_SharedAlloc.hpp:474 (/opt/kokkos/include/impl/Kokkos_SharedAlloc.hpp:474)
>>>               CMakeFiles/ArborX_Example_HostAccessTraits.exe.dir/example_host_access_traits.cpp.o:(Kokkos::Impl::SharedAllocationRecord<Kokkos::HostSpace, Kokkos::Impl::ViewValueFunctor<Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, int>>::SharedAllocationRecord<Kokkos::OpenMP>(Kokkos::OpenMP const&, Kokkos::HostSpace const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, unsigned long))
>>> referenced by Kokkos_SharedAlloc.hpp:482 (/opt/kokkos/include/impl/Kokkos_SharedAlloc.hpp:482)
>>>               CMakeFiles/ArborX_Example_HostAccessTraits.exe.dir/example_host_access_traits.cpp.o:(Kokkos::Impl::SharedAllocationRecord<void, void>* Kokkos::Impl::ViewMapping<Kokkos::ViewTraits<int*, Kokkos::HostSpace>, void>::allocate_shared<Kokkos::OpenMP, Kokkos::Impl::WithoutInitializing_t, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, Kokkos::HostSpace>(Kokkos::Impl::ViewCtorProp<Kokkos::OpenMP, Kokkos::Impl::WithoutInitializing_t, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, Kokkos::HostSpace> const&, Kokkos::LayoutRight const&, bool))
>>> the vtable symbol may be undefined because the class is missing its key function (see https://lld.llvm.org/missingkeyfunction)

Trying to see if the same thing happens with switching GCC build.

@aprokop aprokop force-pushed the gone_with_mpi_alltoall branch 2 times, most recently from 87ed69f to f9035d2 Compare June 8, 2026 20:41
@aprokop aprokop force-pushed the gone_with_mpi_alltoall branch from f9035d2 to 6e95a56 Compare June 8, 2026 21:27
@aprokop aprokop merged commit 240ec27 into arborx:master Jun 9, 2026
3 of 4 checks passed
@aprokop aprokop deleted the gone_with_mpi_alltoall branch June 9, 2026 00:56
@aprokop aprokop mentioned this pull request Jun 9, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance Something is slower than it should be

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants