Skip to content

Add buffer variables for the Biogeochemical Particles#323

Open
Mikolaj-A-Kowalski wants to merge 5 commits intomainfrom
mak/particle-buffer
Open

Add buffer variables for the Biogeochemical Particles#323
Mikolaj-A-Kowalski wants to merge 5 commits intomainfrom
mak/particle-buffer

Conversation

@Mikolaj-A-Kowalski
Copy link
Collaborator

@Mikolaj-A-Kowalski Mikolaj-A-Kowalski commented Dec 10, 2025

Resolves #321

Basically it allows a biogeochemical model in the particles to declare buffer for some heavy-to-compute (i.e. photosynthesis function result) variables so they can be reused across multiple kernels that update tendencies for the tracers and particle fields.

In contrast to what I said in #321, the buffers need to be filled twice since particle move between tracer and fields tendencies are updated.

It is still a draft so we can discuss alternative/better way to introduce buffers. In particular the following points:

  • At the moment there is no differentiation between a buffer variables for update of tracer and field tendencies, but in principle different set of variables may be needed. Should we try to have separate buffer_variable registration and filling for these steps?
  • I wasn't sure where the kernel that does the filling should be defined. At the moment it stayed in Particle and just delegates to a function provided by a specific Model/Individual. But I was wondering if the kernel should not be defined by a specific model so it can be optimised/tailored more to calculation of the specific buffer variable...

The performance change will follow in the post shortly.

TODO:

  • add unit tests for buffers

@Mikolaj-A-Kowalski Mikolaj-A-Kowalski added enhancement New feature or request feature labels Dec 10, 2025
@Mikolaj-A-Kowalski
Copy link
Collaborator Author

Mikolaj-A-Kowalski commented Dec 10, 2025

In terms of performance of the total time executing the 'update' kernels (both tracer and field tendencies) for $2^{18}$ particles decreases from ~ 33 ms to 17ms (~50%)

Looking in Nsight sytems this is the old:

image

This is the new:
image

Basically the majority of the time is now concentrated in the fill_single_buffer_variable kernel that does the photosynthesis calculation. Further if we look a bit up in the report to the 'SM Wrap Occupancy' section we will see that we have a significant problem related to load 'last-wave' problem. The kernel starts at its theoretical occupancy of 50%, but it drops almost immediately ending at sad 3% for the last third of the kernel execution 😢

I guess this is just related to the load imbalance in the Newton iteration. The good news is that it probably can be significantly optimised delivering even more gain.

The only cost of the change is extra memory (single double) per particle that needs to be stored on the GPU. I imagine though that the memory cost of the particles is much smaller than tracers in practice so it is not too heavy price to pay.

I attach the original NSight reports: reports.tar.gz

The benchmark script: LOBSTER_3D_profile_particles.jl.txt

@Mikolaj-A-Kowalski Mikolaj-A-Kowalski marked this pull request as ready for review December 12, 2025 11:52
@Mikolaj-A-Kowalski
Copy link
Collaborator Author

Added test and rebased on main. Ready for review!

@jagoosw jagoosw self-requested a review December 12, 2025 14:20
#:NO₃, :NH₄, :DIC, :O₂, :DOC, :DON, :bPOC, :bPON

@inline function (kelp::SugarKelp)(::Val{:NO₃}, t, A, N, C, u, v, w, T, NO₃, NH₄, PAR)
@inline function (kelp::SugarKelp)(::Val{:NO₃}, t, A, N, C, u, v, w, T, NO₃, NH₄, PAR, P)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@inline function (kelp::SugarKelp)(::Val{:NO₃}, t, A, N, C, u, v, w, T, NO₃, NH₄, PAR, P)
@inline function (kelp::SugarKelp)(::Val{:NO₃}, t, A, N, C, u, v, w, T, NO₃, NH₄, PAR, photosynthesis)

I in this context its a bit confusing to use P since P almost always refers to phytoplankton

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🫡 I have used P since it is variable name to which the name was assigned, but will change to photosynthesis_val (I want to avoid it making the function from the module scope)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in the recent rebase.

@jagoosw
Copy link
Collaborator

jagoosw commented Dec 12, 2025

This seems like a great idea, thank you for implementing this!

  • At the moment there is no differentiation between a buffer variables for update of tracer and field tendencies, but in principle different set of variables may be needed. Should we try to have separate buffer_variable registration and filling for these steps?

I think its fine to have the same for both since thats what we do with all the other variables.

  • I wasn't sure where the kernel that does the filling should be defined. At the moment it stayed in Particle and just delegates to a function provided by a specific Model/Individual. But I was wondering if the kernel should not be defined by a specific model so it can be optimised/tailored more to calculation of the specific buffer variable...

I think this is fine too since a model could also define a new method like:

@inline fill_all_buffer_variables!(particles::BiogeochemicalParticles{N, <:SomeParticleBGC}, model) where N

Buffer allows to precompute some quantities before tracers update step.
@Mikolaj-A-Kowalski
Copy link
Collaborator Author

I think its fine to have the same for both since thats what we do with all the other variables.

Fair! Ultimately if it becomes relevant for some future model it should not be too difficult to change.

I think this is fine too since a model could also define a new method like:

Good point! I haven't though of that overload.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow to share computation between kernels in particle models

2 participants