Code updates that improve the performance of benchmark_ALE case with density-mapped diags #63

Draft
manodeep wants to merge 7 commits into 2026.05 from 47-perf-rho-remap-diags

@manodeep

Collaborator

Fix for #47

The base case was benchmark_ALE with the following patch (patch generated with git diff --patch MOM_input diag_table):

diff --git a/benchmark_ALE/MOM_input b/benchmark_ALE/MOM_input
index b52b28e..39fbc55 100644
--- a/benchmark_ALE/MOM_input
+++ b/benchmark_ALE/MOM_input
@@ -184,6 +184,14 @@ INITIAL_T_RANGE = -9.0          !   [degC] default = 0.0
                                 ! Initial temperature range (bottom - surface)
 
 ! === module MOM_diag_mediator ===
+NUM_DIAG_COORDS = 1
+DIAG_COORDS = "rho2 RHO2 RHO" !"z Z ZSTAR" !
+DIAG_COORD_DEF_RHO2 = "RFNC1:76,999.5,1020.,1034.1,3.1,1041.,0.002" ! default = "WOA09"
+REGRIDDING_ANSWER_DATE = 99991231 ! default = 20181231
 
 ! === module MOM_MEKE ===
 USE_MEKE = True                 !   [Boolean] default = False
diff --git a/benchmark_ALE/diag_table b/benchmark_ALE/diag_table
index 42b4a98..02ceeb8 100644
--- a/benchmark_ALE/diag_table
+++ b/benchmark_ALE/diag_table
@@ -47,6 +47,10 @@ benchmark_ALE
  "ocean_model",   "zos",        "zos",          "ocean_month", "all", "mean", "none",2
  "ocean_model",   "Rd1",        "Rd1",          "ocean_month", "all", "mean", "none",2
 
+# monthly 3d fields on rho2
+"access-om3.mom6.3d.umo+rho2.1mon.mean.%4yr", 1, "months", 1, "days", "time", 1, "years"
+"ocean_model_rho2", "umo", "umo", "access-om3.mom6.3d.umo+rho2.1mon.mean.%4yr", "all", "average", "none", 2
+
 # 3d annual
  "ocean_model_z", "agessc",     "agessc",       "ocean_annual_z", "all", "mean", "none",2

The baseline benchmark_ALE runtime improves from 145s (possibly longer, since I forgot to compile the baseline with -fp-model=precise) to 98s for a version containing all of these changes (plus a few more that had only a minor performance impact), about a 30% reduction in runtime. Most of the improvement comes from inlining density_anomaly_elem_Roquet_rho (~22s) and reordering the extrapolation check in get_polynomial_coordinate (~12s). The EOS changes are currently confined to the Roquet EOS but would (should?) be carried across to the other EOSs.
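As a quick sanity check on the figures quoted above, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the variable names are made up for this example, the inputs are the rounded timings from this description, and the per-change savings are approximate.

```python
# Wall-clock runtimes quoted in this PR description, in seconds.
baseline_s = 145.0   # baseline benchmark_ALE run (possibly an underestimate)
optimised_s = 98.0   # run containing all of the changes

saved_s = baseline_s - optimised_s
reduction_pct = 100.0 * saved_s / baseline_s   # fraction of runtime removed
speedup = baseline_s / optimised_s             # ratio of old to new runtime

# Approximate savings attributed to the two largest changes (seconds).
named_savings_s = {
    "inline density_anomaly_elem_Roquet_rho": 22.0,
    "reorder extrapolation check in get_polynomial_coordinate": 12.0,
}
named_total_s = sum(named_savings_s.values())
named_share_pct = 100.0 * named_total_s / saved_s

print(f"saved {saved_s:.0f} s: {reduction_pct:.1f}% less runtime, {speedup:.2f}x speedup")
print(f"two largest changes: {named_total_s:.0f} s ({named_share_pct:.0f}% of the saving)")
```

With these inputs the two named changes cover roughly three quarters of the total saving, with the remainder spread over the smaller changes.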

From my tests with the deployments, the OM3 25k IAF runtime improves by about 10% (the smaller gain is probably because the load balancing is now off).

I will add performance tables here once I have them in an easily parseable format.

@manodeep manodeep marked this pull request as draft June 19, 2026 00:01