Levelset refactor by danieljvickers · Pull Request #1123 · MFlowCode/MFC

danieljvickers · 2026-02-04T21:44:56Z

User description

Description

This PR is a significant refactor of the levelset code. It contains

The removal of IB markers and levelset values from pre-processing. Now all values are computed in simulation
The deletion of the levelset and levelset norm global arrays. The are now tied to ghost points. The levelset code is now parallelized over ghost points instead of grid cells. This reduces the total amount of memory that needs to be allocated for levelsets by an order of magnitude in simple cases, and by several orders of magnitude when using multiple immersed boundaries.
I HAVE DEFERRED GPU STLs TO A SEPARATE PR FOR SIMPLICITY BECAUSE THE SCOPE HAS SIGNIFICANTLY CREEPED ON THIS FEATURE AND IT IS HINDERING DEBUGGING: GPU compute of STL IB levelst values. This is done by loading STL models into global memory, and they are computed separately for IB markers and levelset values. Then each separate loop was parallelized.
Arbitrary rotation of STL models which is compatible with the same local coordiante system as all other IBs

Fixes #1011

Type of change

New feature (non-breaking change which adds functionality)
Refactor

Scope

This PR comprises a set of related changes with a common goal

If you cannot check the above box, please split your PR into multiple PRs that each have a common goal.

How Has This Been Tested?

This currently passes all tests on MacOS and Ubuntu operating systms using the GNU compiler.

Checklist

I have added comments for the new code
I added Doxygen docstrings to the new code
I have made corresponding changes to the documentation (docs/)
I have added regression tests to the test suite so that people can verify in the future that the feature is behaving as expected
I have added example cases in examples/ that demonstrate my new feature performing as expected.
They run to completion and demonstrate "interesting physics"
I ran ./mfc.sh format before committing my code
New and existing tests pass locally with my changes, including with GPU capability enabled (both NVIDIA hardware with NVHPC compilers and AMD hardware with CRAY compilers) and disabled
This PR does not introduce any repeated code (it follows the DRY principle)
I cannot think of a way to condense this code and reduce any introduced additional line count

If your code changes any code source files (anything in `src/simulation`)

To make sure the code is performing as expected on GPU devices, I have:

Checked that the code compiles using NVHPC compilers
Checked that the code compiles using CRAY compilers
Ran the code on either V100, A100, or H100 GPUs and ensured the new feature performed as expected (the GPU results match the CPU results)
Ran the code on MI200+ GPUs and ensure the new features performed as expected (the GPU results match the CPU results)
Enclosed the new feature via nvtx ranges so that they can be identified in profiles
Ran a Nsight Systems profile using ./mfc.sh run XXXX --gpu -t simulation --nsys, and have attached the output file (.nsys-rep) and plain text results to this PR
Ran a Rocprof Systems profile using ./mfc.sh run XXXX --gpu -t simulation --rsys --hip-trace, and have attached the output file and plain text results to this PR.
Ran my code using various numbers of different GPUs (1, 2, and 8, for example) in parallel and made sure that the results scale similarly to what happens if you run without the new code/feature

PR Type

Enhancement, Refactor

Description

Removed levelset computation from preprocessing: IB markers and levelset values are no longer computed during pre-processing; all computations now occur during simulation runtime
Eliminated global levelset arrays: Replaced global levelset and levelset_norm arrays with local storage in the ghost_point derived type, reducing memory allocation by orders of magnitude
Parallelized levelset computation over ghost points: Refactored levelset code to parallelize over ghost points instead of grid cells, improving memory efficiency and GPU performance
GPU-accelerated STL model processing: Implemented GPU compute for STL IB markers and levelset values by loading STL models into global memory with separate parallelized loops
New m_compute_levelset module: Created dedicated module for all levelset computations with geometry-specific functions (circles, rectangles, ellipses, spheres, cylinders, airfoils, STL models)
Added STL model configuration parameters: Extended IB patch parameters to support model_filepath, model_spc, model_threshold, and transformation parameters (model_translate, model_rotate)
Simplified data I/O: Removed levelset and levelset norm from preprocessing output; added dedicated IB data output subroutines for simulation runtime
Updated test infrastructure: Adjusted grid resolution for circle test cases and updated golden test files to reflect removal of pre-computed IB markers

Diagram Walkthrough

flowchart LR
  A["Preprocessing<br/>Input Data"] -->|"No levelset<br/>computation"| B["Simulation<br/>Initialization"]
  B -->|"s_instantiate_STL_models"| C["Load STL Models<br/>to Global Memory"]
  C -->|"GPU compute"| D["Ghost Point<br/>Levelset Storage"]
  D -->|"s_apply_levelset"| E["IBM Updates<br/>During Simulation"]
  E -->|"s_write_ib_data_file"| F["Output IB Data<br/>at Runtime"]

File Walkthrough

Relevant files

Refactor

8 files

m_ib_patches.fpp `Refactor levelset computation from preprocessing to simulation runtime` src/simulation/m_ib_patches.fpp Removed `m_compute_levelset` module dependency and levelset computation calls from `s_apply_ib_patches` Added new `s_instantiate_STL_models()` subroutine to load and preprocess STL models during initialization Refactored `s_ib_model()` to use pre-instantiated models from global `models` array instead of computing levelsets inline Removed local `eta` variable declarations and replaced with local scope usage in geometry-specific functions Added `offset` variable to airfoil subroutines for cleaner centroid offset handling	+165/-234
m_data_output.fpp `Remove levelset data output from preprocessing` src/pre_process/m_data_output.fpp Removed `ib_markers`, `levelset`, and `levelset_norm` parameters from data output subroutine signatures Removed file I/O operations for writing IB markers, levelset, and levelset norm data Simplified abstract interface and implementation for `s_write_abstract_data_files()`	+6/-252
m_ibm.fpp `Refactor IBM module to use ghost-point-based levelsets` src/simulation/m_ibm.fpp Removed global `levelset` and `levelset_norm` arrays and replaced with ghost-point-local storage Added call to `s_instantiate_STL_models()` during IBM setup Replaced `s_apply_ib_patches()` calls to remove levelset parameters Changed `s_compute_image_points()` to use levelset values from `ghost_point` structure instead of global arrays Updated `s_update_mib()` to call new `s_apply_levelset()` instead of computing levelsets inline Removed `patch_id_fp` array allocation	+38/-44
m_start_up.fpp `Remove levelset restart file I/O from simulation startup` src/simulation/m_start_up.fpp Removed code that reads IB markers, levelset, and levelset norm data from restart files Removed airfoil grid data reading from preprocessing Added call to `s_write_ib_data_file()` after IBM setup during initialization Simplified `s_read_ic_data_files()` call signature by removing IB marker parameter	+6/-212
m_start_up.fpp `Remove levelset handling from preprocessing startup` src/pre_process/m_start_up.fpp Removed `m_ib_patches` module import Removed `ib_markers`, `levelset`, and `levelset_norm` variable declarations and allocations Removed IB marker reading from preprocessing startup Simplified `s_read_ic_data_files()` interface by removing IB marker parameter	+6/-70
m_mpi_common.fpp `Simplify MPI data initialization for IBM` src/common/m_mpi_common.fpp Removed `levelset` and `levelset_norm` parameters from `s_initialize_mpi_data()` subroutine Removed MPI type creation for levelset and levelset norm data structures Removed airfoil grid MPI I/O setup code Simplified MPI data initialization for IB markers only	+2/-66
m_initial_condition.fpp `Remove IBM-related variables from preprocessing` src/pre_process/m_initial_condition.fpp Removed `m_ib_patches` module import Removed `ib_markers`, `levelset`, and `levelset_norm` variable declarations Removed IB patch allocation and initialization code Removed `s_apply_ib_patches()` call from preprocessing	+1/-31
m_time_steppers.fpp `Remove levelset parameters from immersed boundary update` src/simulation/m_time_steppers.fpp Simplified the `s_update_mib` subroutine call by removing `levelset` and `levelset_norm` parameters Reflects the refactoring where levelset values are now computed in simulation rather than passed as arguments	+1/-1

Enhancement

5 files

m_compute_levelset.fpp `New module for ghost-point-based levelset computation` src/simulation/m_compute_levelset.fpp New module created to handle all levelset computations for immersed boundaries Implements `s_apply_levelset()` subroutine that computes levelsets for ghost points instead of grid cells Contains geometry-specific levelset functions for circles, rectangles, ellipses, spheres, cylinders, airfoils, and STL models Levelset and normal vector computations now tied to `ghost_point` derived type instead of global arrays Parallelized over ghost points using GPU macros for improved memory efficiency	+723/-0
m_data_output.fpp `Add dedicated IB data output subroutines` src/simulation/m_data_output.fpp Added new subroutines `s_write_serial_ib_data()` and `s_write_parallel_ib_data()` for IB data output Created `s_write_ib_data_file()` wrapper to handle both serial and parallel I/O Removed levelset and levelset norm output from main data writing routines Simplified MPI data initialization calls by removing levelset parameters	+96/-21
m_global_parameters.fpp `Add default initialization for IB patch parameters` src/simulation/m_global_parameters.fpp Added initialization loop for `patch_ib()` array with default values for all IB patch parameters Sets proper defaults for STL model transformation parameters (scale, translate, rotate) Initializes rotation matrices and moving boundary parameters	+41/-0
m_derived_types.fpp `Add derived types for model storage and ghost point levelsets` src/common/m_derived_types.fpp Added new `t_model_array` derived type to store STL model data with boundary vertices and interpolation information Extended `ghost_point` derived type with `levelset` and `levelset_norm` fields for local storage Removed global levelset array dependencies by moving data to ghost point structure	+13/-0
case_dicts.py `Add STL model and transformation parameters to IB configuration` toolchain/mfc/run/case_dicts.py Added three new immersed boundary patch parameters: `model_filepath`, `model_spc`, and `model_threshold` Added three new model transformation parameters: `model_translate` and `model_rotate` for each spatial direction Commented out `model_scale` parameter for potential future use	+6/-1

Tests

7 files

cases.py `Adjust grid resolution for circle test cases` toolchain/mfc/test/cases.py Updated Circle test case to use `n=49` grid resolution for improved precision Added comment explaining machine-level precision sensitivity for circular geometries Minor formatting adjustment to boundary condition test setup	+6/-2
golden.txt `Remove IB markers from golden test output` tests/7FA04E95/golden.txt Removed the `ib_markers` output line from golden test data Reflects the refactoring where IB markers are no longer pre-computed and stored globally	+1/-2
golden.txt `Remove IB markers output from test golden file` tests/5600D63B/golden.txt Removed the last line containing `D/ib_markers.00.dat` with all zero values Kept the `D/cons.5.00.000050.dat` line with updated values Test golden file updated to reflect removal of IB markers from output	+1/-2
golden-metadata.txt `Update test metadata with new build environment details` tests/7F70E665/golden-metadata.txt Updated timestamp from 2025-01-20 to 2026-02-03 Changed Git commit hash and branch information Updated CMake version and compiler information (AppleClang to GNU) Changed system from macOS (Apple M1 Pro) to Linux (Intel i7-12700K) Reordered build configuration sections (post_process moved to top) Added `OpenMP : OFF` configuration line Expanded CPU information with detailed lscpu output	+84/-41
golden-metadata.txt `Update test metadata with new build environment details` tests/F60D6594/golden-metadata.txt Updated timestamp from 2025-01-20 to 2026-02-03 Changed Git commit hash and branch information Updated CMake version and compiler information (AppleClang to GNU) Changed system from macOS (Apple M1 Pro) to Linux (Intel i7-12700K) Reordered build configuration sections Added `OpenMP : OFF` configuration line Expanded CPU information with detailed lscpu output	+84/-41
golden-metadata.txt `Update test metadata with new build environment details` tests/4F5A5E32/golden-metadata.txt Updated timestamp from 2025-01-20 to 2026-02-03 Changed Git commit hash and branch information Updated CMake version and compiler information (AppleClang to GNU) Changed system from macOS (Apple M1 Pro) to Linux (Intel i7-12700K) Reordered build configuration sections Added `OpenMP : OFF` configuration line Expanded CPU information with detailed lscpu output	+81/-38
golden-metadata.txt `Update test metadata with new build environment details` tests/8D8F6424/golden-metadata.txt Updated timestamp from 2025-01-20 to 2026-02-03 Changed Git commit hash and branch information Updated CMake version and compiler information (AppleClang to GNU) Changed system from macOS (Apple M1 Pro) to Linux (Intel i7-12700K) Reordered build configuration sections Added `OpenMP : OFF` configuration line Expanded CPU information with detailed lscpu output	+81/-38

Miscellaneous

3 files

golden-metadata.txt `Update golden test metadata and build environment` tests/B0CE19C5/golden-metadata.txt Updated test metadata with new generation timestamp and git commit hash Changed build system configuration details (CMake version, compiler paths, system information) Updated CPU information from macOS to Linux system specifications	+93/-102
golden-metadata.txt `Update golden test metadata and build environment` tests/7DCE34B4/golden-metadata.txt Updated test metadata with new generation timestamp and git commit hash Changed build system configuration details (CMake version, compiler paths, system information) Updated CPU information from macOS to Linux system specifications	+90/-99
golden-metadata.txt `Update golden test metadata and build environment` tests/6171E9D4/golden-metadata.txt Updated test metadata with new generation timestamp and git commit hash Changed build system configuration details (CMake version, compiler paths, system information) Updated CPU information from macOS M1 Pro to Linux x86_64 system specifications	+84/-41

Additional files

63 files

m_compute_levelset.fpp	+0/-625
m_model.fpp	+0/-1
m_global_parameters.fpp	+0/-10
m_icpp_patches.fpp	+0/-6
golden.txt	+0/-1
golden.txt	+0/-1
golden.txt	+0/-1
golden.txt	+0/-3
golden.txt	+0/-1
golden.txt	+0/-1
golden-metadata.txt	+78/-35
golden.txt	+10/-11
golden.txt	+0/-1
golden.txt	+1/-2
golden.txt	+0/-1
golden.txt	+0/-1
golden.txt	+0/-1
golden.txt	+0/-1
golden.txt	+1/-2
golden.txt	+0/-1
golden.txt	+0/-1
golden.txt	+0/-1
golden.txt	+0/-1
golden.txt	+0/-1
golden.txt	+0/-1
golden.txt	+0/-1
golden.txt	+0/-1
golden.txt	+10/-11
golden.txt	+0/-1
golden.txt	+0/-1
golden.txt	+10/-11
golden-metadata.txt	+81/-38
golden.txt	+14/-15
golden-metadata.txt	+81/-38
golden.txt	+14/-15
golden.txt	+0/-1
golden.txt	+0/-1
golden.txt	+0/-1
golden.txt	+0/-1
golden.txt	+18/-19
golden.txt	+12/-13
golden.txt	+0/-1
golden.txt	+0/-1
golden.txt	+18/-19
golden.txt	+0/-1
golden.txt	+0/-1
golden.txt	+1/-2
golden.txt	+0/-1
golden.txt	+12/-13
golden-metadata.txt	+81/-38
golden.txt	+10/-11
golden.txt	+1/-2
golden.txt	+1/-2
golden.txt	+0/-1
golden-metadata.txt	+81/-38
golden.txt	+14/-15
golden.txt	+1/-2
golden.txt	+1/-2
golden.txt	+0/-1
golden.txt	+0/-1
golden.txt	+0/-1
golden.txt	+14/-15
golden.txt	+1/-2

Summary by CodeRabbit

New Features
- STL/model-based immersed-boundary support and on-demand model instantiation
- New dedicated immersed-boundary (IB) data export routine
Refactor
- Reorganized immersed-boundary and level-set processing across stages
- Simplified interfaces for IB/level-set data handling; internal plumbing reduced
Bug Fixes
- Improved circle geometry precision by adjusting default sampling parameter

CodeAnt-AI Description

Refactor levelset and STL immersed-boundary handling to compute values per ghost point

What Changed

Levelset distances and normals are no longer stored as large global arrays; they are computed and stored per ghost point during simulation, reducing peak memory usage and removing levelset globals from pre-processing/output routines
STL models are loaded once, transformed (scale/rotate/translate), optionally interpolated, and used to compute levelsets and IB markers at runtime; STL support now respects arbitrary model rotation and is prepared for GPU-parallel evaluation
IB marker data is written out at time-step 0 and via new serial/parallel write paths, and IBM initialization now instantiates STL models and recomputes IB markers during setup
Image-point and ghost-point logic now uses per-ghost-point levelset/normals (observable as unchanged boundary behavior) and emits clearer diagnostic failure messages when image points fall outside the domain

Impact

✅ Lower memory during simulation
✅ Support for rotated STL immersed boundaries
✅ IB marker files available at startup and restart

💡 Usage Guide

Checking Your Pull Request

Every time you make a pull request, our system automatically looks through it. We check for security issues, mistakes in how you're setting up your infrastructure, and common code problems. We do this to make sure your changes are solid and won't cause any trouble later.

Talking to CodeAnt AI

Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:

@codeant-ai ask: Your question here

This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.

Example

@codeant-ai ask: Can you suggest a safer alternative to storing this secret?

Preserve Org Learnings with CodeAnt

You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:

@codeant-ai: Your feedback here

This helps CodeAnt AI learn and adapt to your team's coding style and standards.

Example

@codeant-ai: Do not flag unused imports.

Retrigger review

Ask CodeAnt AI to review the PR again, by typing:

@codeant-ai: review

Check Your Repository Health

To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.

… removed

…o levelset-refactor

…f main

coderabbitai

Actionable comments posted: 4

🤖 Fix all issues with AI agents

In `@src/simulation/m_compute_levelset.fpp`:
- Around line 239-240: z_min/z_max are computed in global coordinates (using
center(3)) but later compared to the already-translated local coordinate
xyz_local(3), causing wrong distances and normals; change the code so z bounds
are expressed in the local frame (e.g. z_min = -lz/2 and z_max = +lz/2) or
compute z_min/z_max after subtracting center (so they are relative to the
translated origin) and then use those when computing dist_side and normals for
xyz_local(3). Update the analogous occurrence around line ~288 as well (keep
references: z_min, z_max, center, lz, xyz_local, dist_side).
- Around line 64-66: The 2D GPU_PARALLEL_LOOP uses copy='[gps]' and
patch_ib(1:num_ibs) which is inconsistent with the 3D path; change the 2D loop's
data clauses to use present='[gps]' (since gps is already on-device via
GPU_UPDATE) and make the patch_ib clause consistent with the 3D loop (use the
full patch_ib array form rather than the sliced (1:num_ibs)), leaving
private='[i,patch_id,patch_geometry]' and copyin='[Np,patch_ib]' (or the same
copyin form used in the 3D branch) so the kernel uses present gps and a
consistent patch_ib specifier.
- Around line 39-60: The GPU loops call s_model_levelset which dereferences the
host array-of-structs models (e.g. models(patch_id)%interpolate,
%boundary_edge_count, %total_vertices, %model, %interpolated_boundary_v,
%boundary_v) but models is not included in the GPU data clauses; add models to
device data so the kernel won't fault. Best fix: after
s_instantiate_STL_models() returns, create a persistent GPU data region (using
$:GPU_ENTER_DATA(present='[models]', copyin='[models]') / $:END_GPU_ENTER_DATA()
or the project macro equivalent) to keep models on-device for reuse;
alternatively, add models to the copyin list of the $:GPU_PARALLEL_LOOP(...)
that calls s_model_levelset (and the corresponding 2D loop) so models is present
during the loops. Ensure the same symbol name models is used in the data clause
and that its lifetime covers both loops.

In `@src/simulation/m_ibm.fpp`:
- Around line 72-73: s_finalize_ibm_module must free the allocatable array
models and each of its allocatable components to avoid host/device leaks: in
s_finalize_ibm_module iterate over models (if allocated(models)) and for each
element check and deallocate its allocatable fields model, boundary_v, and
interpolated_boundary_v (use ALLOCATED() guards), then deallocate the models
array itself (again guarding with ALLOCATED()) and handle any device-specific
deallocation if these components were moved to the device; update
s_finalize_ibm_module to perform these checks and DEALLOCATE calls for model,
boundary_v, interpolated_boundary_v, then DEALLOCATE(models).

🧹 Nitpick comments (3)

src/simulation/m_compute_levelset.fpp (2)
557-559: Use wp real literals in array constructors for type consistency.

gp%levelset_norm = (/1, 0, 0/) creates an integer array that gets implicitly converted. While Fortran handles this correctly for these values, using explicit real literals is more consistent with the rest of the file (e.g., line 292 uses 0._wp).

The same pattern appears in s_cylinder_levelset (lines 601–612) with (/1, 0, 0/), (/0, 1, 0/), etc.
✏️ Example fix for sphere
-            gp%levelset_norm = (/1, 0, 0/)
+            gp%levelset_norm = (/1._wp, 0._wp, 0._wp/)
310-311: Stale docstring: s_rectangle_levelset is labeled "Initialize IBM module".

The !> Initialize IBM module comment on line 310 is copy-paste from a different subroutine and does not describe what this routine does.
✏️ Fix docstring
-    !>  Initialize IBM module
-    subroutine s_rectangle_levelset(gp)
+    !>  Computes the levelset distance and normal for a rectangle patch
+    subroutine s_rectangle_levelset(gp)
src/simulation/m_ibm.fpp (1)
97-98: Remove the commented-out GPU_ENTER_DATA line.

The commented-out directive on line 97 is dead code. The GPU_UPDATE on line 98 is the intended replacement—clean up the leftover.
🧹 Proposed fix
-        ! $:GPU_ENTER_DATA(copyin='[patch_ib(1:num_ibs)]')
         $:GPU_UPDATE(device='[patch_ib(1:num_ibs)]')

src/simulation/m_compute_levelset.fpp

src/simulation/m_ibm.fpp

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)

src/common/m_model.fpp (3)
1125-1152: ⚠️ Potential issue | 🟠 Major

VLA dist_buffer inside a GPU_ROUTINE(seq) function may fail on GPU devices.

Line 1136 declares dist_buffer(1:boundary_edge_count) as an automatic (stack) array with a runtime size. GPU device stacks are typically very limited, and some compilers (nvfortran) may reject or silently miscompile VLAs in device code. This can be eliminated by tracking the running minimum directly, which also avoids the extra minval pass.

Also, line 1131 is a leftover commented-out signature — please remove it.
🔧 Proposed fix: eliminate VLA with running minimum
     function f_distance(boundary_v, boundary_edge_count, point) result(distance)

         $:GPU_ROUTINE(parallelism='[seq]')

         integer, intent(in) :: boundary_edge_count
         real(wp), intent(in), dimension(:, :, :) :: boundary_v
-        ! real(wp), intent(in), dimension(1:boundary_edge_count, 1:3, 1:2) :: boundary_v
         real(wp), dimension(1:3), intent(in) :: point

         integer :: i
         real(wp) :: dist_buffer1, dist_buffer2
-        real(wp), dimension(1:boundary_edge_count) :: dist_buffer
         real(wp) :: distance

-        distance = 0._wp
+        distance = huge(1._wp)
         do i = 1, boundary_edge_count
             dist_buffer1 = sqrt((point(1) - boundary_v(i, 1, 1))**2 + &
                                 & (point(2) - boundary_v(i, 1, 2))**2)

             dist_buffer2 = sqrt((point(1) - boundary_v(i, 2, 1))**2 + &
                                 & (point(2) - boundary_v(i, 2, 2))**2)

-            dist_buffer(i) = minval((/dist_buffer1, dist_buffer2/))
+            distance = min(distance, min(dist_buffer1, dist_buffer2))
         end do

-        distance = minval(dist_buffer)
-
     end function f_distance
1159-1194: ⚠️ Potential issue | 🟡 Minor

Potential out-of-bounds access when boundary_edge_count == 0.

If boundary_edge_count is 0, idx_buffer stays at 0 (line 1174), and the access at line 1190 (boundary_v(idx_buffer, 3, 1)) is an out-of-bounds read. Since this is a GPU_ROUTINE(seq), it cannot call s_mpi_abort, but an early-return guard would prevent the OOB.
🛡️ Proposed defensive guard
         dist_buffer = 0._wp
         dist_min = initial_distance_buffer
         idx_buffer = 0

+        if (boundary_edge_count == 0) then
+            normals = 0._wp
+            return
+        end if
+
         do i = 1, boundary_edge_count
1060-1116: ⚠️ Potential issue | 🟡 Minor

Same OOB risk: tri_idx = 0 when model%ntrs == 0.

Lines 1111–1113 access model%trs(tri_idx)%n(...) with tri_idx initialized to 0 (line 1080). If the model has zero triangles, this is an out-of-bounds read. Same pattern as f_normals above.

🤖 Fix all issues with AI agents

In `@src/common/m_model.fpp`:
- Line 65: Remove the extra space between the allocate keyword and the opening
parenthesis in the allocation call so it matches the formatter expectation:
change the call to use allocate(model%trs(model%ntrs)) (referencing the allocate
invocation and the model%trs/model%ntrs symbols) — ensure no space exists
between "allocate" and "(".

In `@src/simulation/m_ib_patches.fpp`:
- Around line 216-220: The numeric literal 0.e5_wp is effectively zero and
misleading; in the grid padding assignments for grid_mm (when computing extents
from x_cc/y_cc/z_cc) replace the erroneous 0.e5_wp multiplier with the intended
half-cell padding 0.5_wp (e.g., use
0.5_wp*dx(0)/0.5_wp*dx(m)/0.5_wp*dy(0)/0.5_wp*dy(n)/0.5_wp*dz(0)/0.5_wp*dz(p) in
the grid_mm lines) — if you really intended zero padding instead, replace
0.e5_wp with the explicit 0._wp to make that clear.

🧹 Nitpick comments (1)

src/simulation/m_ib_patches.fpp (1)
487-489: Inconsistent allocation: plain allocate instead of @:ALLOCATE.

The 2D s_ib_airfoil (lines 333–334) was updated to use @:ALLOCATE for airfoil_grid_u and airfoil_grid_l, but the 3D s_ib_3D_airfoil still uses plain allocate. This inconsistency means the 3D path may not properly allocate in GPU-managed memory. As per coding guidelines, use @:ALLOCATE for device-aware allocation in src/**/*.fpp.
♻️ Proposed fix
         if (.not. allocated(airfoil_grid_u)) then
-            allocate (airfoil_grid_u(1:Np))
-            allocate (airfoil_grid_l(1:Np))
+            @:ALLOCATE(airfoil_grid_u(1:Np))
+            @:ALLOCATE(airfoil_grid_l(1:Np))

coderabbitai · 2026-02-06T07:09:41Z

src/common/m_model.fpp

        model%ntrs = nTriangles

-        allocate (model%trs(model%ntrs))
+        allocate(model%trs(model%ntrs))


⚠️ Potential issue | 🟡 Minor

Pipeline failure: allocate spacing.

The CI formatting check reports that this line has a space between allocate and (. Remove the space to match the expected format.

✏️ Suggested fix

- allocate (model%trs(model%ntrs)) + allocate(model%trs(model%ntrs))

🤖 Prompt for AI Agents

In `@src/common/m_model.fpp` at line 65, Remove the extra space between the allocate keyword and the opening parenthesis in the allocation call so it matches the formatter expectation: change the call to use allocate(model%trs(model%ntrs)) (referencing the allocate invocation and the model%trs/model%ntrs symbols) — ensure no space exists between "allocate" and "(".

coderabbitai · 2026-02-06T07:09:41Z

src/simulation/m_ib_patches.fpp

+                    grid_mm(1, :) = (/minval(x_cc(0:m)) - 0.e5_wp*dx(0), maxval(x_cc(0:m)) + 0.e5_wp*dx(m)/)
+                    grid_mm(2, :) = (/minval(y_cc(0:n)) - 0.e5_wp*dy(0), maxval(y_cc(0:n)) + 0.e5_wp*dy(n)/)
+
+                    if (p > 0) then
+                        grid_mm(3, :) = (/minval(z_cc(0:p)) - 0.e5_wp*dz(0), maxval(z_cc(0:p)) + 0.e5_wp*dz(p)/)


⚠️ Potential issue | 🟡 Minor

0.e5_wp evaluates to 0.0 — likely a typo.

0.e5_wp in Fortran is 0.0 × 10^5 = 0.0, so the padding term 0.e5_wp*dx(0) is always zero. If the intent was to add half-cell padding for display, this should be 0.5_wp*dx(0). If zero padding is intentional, use 0._wp for clarity instead of the misleading 0.e5_wp.

🤖 Prompt for AI Agents

In `@src/simulation/m_ib_patches.fpp` around lines 216 - 220, The numeric literal 0.e5_wp is effectively zero and misleading; in the grid padding assignments for grid_mm (when computing extents from x_cc/y_cc/z_cc) replace the erroneous 0.e5_wp multiplier with the intended half-cell padding 0.5_wp (e.g., use 0.5_wp*dx(0)/0.5_wp*dx(m)/0.5_wp*dy(0)/0.5_wp*dy(n)/0.5_wp*dz(0)/0.5_wp*dz(p) in the grid_mm lines) — if you really intended zero padding instead, replace 0.e5_wp with the explicit 0._wp to make that clear.

… into levelset-refactor

codeant-ai · 2026-02-06T15:46:08Z

CodeAnt AI is running Incremental review

Thanks for using CodeAnt! 🎉

We're free for open-source projects. if you're enjoying it, help us grow by sharing.

Share on X ·
Reddit ·
LinkedIn

codeant-ai · 2026-02-06T15:50:10Z

src/simulation/m_compute_levelset.fpp

+
+        integer :: i, patch_id, patch_geometry
+
+        $:GPU_UPDATE(device='[gps(1:num_gps)]')


Suggestion: The explicit device update of the gps array before the GPU parallel loop is redundant with the copy='[gps]' clause on the GPU loop and, in typical accelerator models (e.g., OpenACC/OpenMP target), issuing an update for data that is not yet present on the device can produce a runtime "variable not present on device" error; removing this update and relying on the copy clause avoids this inconsistency. [possible bug]

Severity Level: Major ⚠️

- ❌ Levelset GPU path may abort at s_apply_levelset (src/...m_compute_levelset.fpp:29). - ⚠️ GPU-accelerated runs of IB levelset computation unstable. - ⚠️ Affects simulations using device backends (GPU/accelerator).

Suggested change

$:GPU_UPDATE(device='[gps(1:num_gps)]')

! gps data is transferred via the GPU_PARALLEL_LOOP copy clause; no explicit device update needed here

Steps of Reproduction ✅

1. Execute code path that runs subroutine s_apply_levelset defined in src/simulation/m_compute_levelset.fpp:29 (impure subroutine s_apply_levelset(gps, num_gps)). 2. At runtime s_apply_levelset reaches the explicit device update on line 36: "$:GPU_UPDATE(device='[gps(1:num_gps)]')". 3. Immediately after the update the code enters the GPU parallel region beginning at the GPU_PARALLEL_LOOP on line 41 which contains copy='[gps]'. If the accelerator runtime does not already have a device allocation for gps, the explicit GPU_UPDATE on line 36 can cause the accelerator to error with "variable not present on device" (device update invoked before device allocation). 4. Observable failure: execution abort or runtime accelerator error reported at or before line 36 when running with GPU/accelerator enabled. If the explicit update is removed, the gpu-parallel loop's copy clause at line 41 will allocate and transfer gps as needed and avoid the pre-update error.

Prompt for AI Agent 🤖

This is a comment left during a code review. **Path:** src/simulation/m_compute_levelset.fpp **Line:** 36:36 **Comment:** *Possible Bug: The explicit device update of the `gps` array before the GPU parallel loop is redundant with the `copy='[gps]'` clause on the GPU loop and, in typical accelerator models (e.g., OpenACC/OpenMP target), issuing an update for data that is not yet present on the device can produce a runtime "variable not present on device" error; removing this update and relying on the `copy` clause avoids this inconsistency. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

codeant-ai · 2026-02-06T15:50:10Z

src/simulation/m_compute_levelset.fpp

+
+        end if
+
+        $:GPU_UPDATE(host='[gps(1:num_gps)]')


Suggestion: The explicit host update of the gps array after the GPU parallel loop is redundant with the copy='[gps]' clause on the GPU loop and, if the device copy created by the loop is deallocated at loop exit (as is typical), a subsequent host update for non-present data can raise a runtime error; removing this update ensures consistent device data lifetime. [possible bug]

Severity Level: Major ⚠️

- ❌ Post-levelset GPU-to-host transfer may abort simulation. - ⚠️ Host-visible gps updates redundant with loop copy clause. - ⚠️ Affects GPU-enabled runs of s_apply_levelset (src/...m_compute_levelset.fpp:29).

Suggested change

$:GPU_UPDATE(host='[gps(1:num_gps)]')

! gps data is already copied back to the host by the GPU_PARALLEL_LOOP copy clause; no explicit host update needed here

Steps of Reproduction ✅

1. Run the same s_apply_levelset subroutine in src/simulation/m_compute_levelset.fpp:29 which executes a GPU_PARALLEL_LOOP (3D branch at ~line 41 or 2D branch at ~line 66) using copy='[gps]'. 2. The GPU_PARALLEL_LOOP completes (END_GPU_PARALLEL_LOOP around lines 60/85) and typical accelerator runtimes will deallocate per-loop device copies that were created by the loop's copy clause. 3. Immediately after the loop the code executes the explicit host update at line 90: "$:GPU_UPDATE(host='[gps(1:num_gps)]')". If the runtime has already released the device allocation for gps, the explicit host update attempts to access a non-existent device copy and may fail with "variable not present on device". 4. Observable failure: runtime error or abort reported at the host-update statement (src/simulation/m_compute_levelset.fpp:90) on GPU-enabled runs. Relying on the loop's copy clause avoids this double-management of gps memory.

Prompt for AI Agent 🤖

This is a comment left during a code review. **Path:** src/simulation/m_compute_levelset.fpp **Line:** 90:90 **Comment:** *Possible Bug: The explicit host update of the `gps` array after the GPU parallel loop is redundant with the `copy='[gps]'` clause on the GPU loop and, if the device copy created by the loop is deallocated at loop exit (as is typical), a subsequent host update for non-present data can raise a runtime error; removing this update ensures consistent device data lifetime. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

codeant-ai · 2026-02-06T15:50:17Z

CodeAnt AI Incremental review completed.

coderabbitai

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)

src/common/m_model.fpp (3)
1060-1116: ⚠️ Potential issue | 🟡 Minor

Potential out-of-bounds access when model%ntrs == 0.

With the GPU_ROUTINE directive now marking this subroutine as device-callable, the pre-existing edge case becomes more concerning: if model%ntrs is 0, the loop on line 1081 is skipped, tri_idx remains 0 (line 1080), and lines 1111–1113 access model%trs(0)%n, which is an out-of-bounds read. On a GPU this is undefined behavior with no clean abort path.
🛡️ Proposed guard
+       if (tri_idx == 0) then
+         distance = 0._wp
+         normals = 0._wp
+         return
+       end if
+
        normals(1) = model%trs(tri_idx)%n(1)
1159-1194: ⚠️ Potential issue | 🟡 Minor

Same zero-count out-of-bounds risk as f_distance_normals_3D.

If boundary_edge_count is 0, idx_buffer stays 0 (line 1174) and lines 1190–1192 read boundary_v(0, …). Now that this is a GPU device routine, the undefined access is silent. Consider adding an early-return guard, mirroring the fix suggested for f_distance_normals_3D.

1125-1152: ⚠️ Potential issue | 🟠 Major

Variable-length automatic array in GPU device code.

dist_buffer(1:boundary_edge_count) (line 1136) is a stack-allocated VLA inside a GPU_ROUTINE(seq) function. Device threads have very limited stack space; a large boundary_edge_count will overflow it. Moreover, some GPU compilers (nvfortran) may reject automatic arrays in device routines outright.

The array is only used to collect per-edge minimums and then take minval. Replace it with a scalar running minimum:
♻️ Proposed refactor
     function f_distance(boundary_v, boundary_edge_count, point) result(distance)
 
         $:GPU_ROUTINE(parallelism='[seq]')
 
         integer, intent(in) :: boundary_edge_count
         real(wp), intent(in), dimension(:, :, :) :: boundary_v
         real(wp), dimension(1:3), intent(in) :: point
 
         integer :: i
         real(wp) :: dist_buffer1, dist_buffer2
-        real(wp), dimension(1:boundary_edge_count) :: dist_buffer
         real(wp) :: distance
 
-        distance = 0._wp
+        distance = initial_distance_buffer
         do i = 1, boundary_edge_count
             dist_buffer1 = sqrt((point(1) - boundary_v(i, 1, 1))**2 + &
                                 & (point(2) - boundary_v(i, 1, 2))**2)
 
             dist_buffer2 = sqrt((point(1) - boundary_v(i, 2, 1))**2 + &
                                 & (point(2) - boundary_v(i, 2, 2))**2)
 
-            dist_buffer(i) = minval((/dist_buffer1, dist_buffer2/))
+            distance = min(distance, min(dist_buffer1, dist_buffer2))
         end do
 
-        distance = minval(dist_buffer)
-
     end function f_distance
Does nvfortran support automatic arrays in OpenACC device routines?

🤖 Fix all issues with AI agents

In `@src/simulation/m_compute_levelset.fpp`:
- Around line 310-311: The Doxygen header above subroutine s_rectangle_levelset
is a copy-paste artifact ("Initialize IBM module") and should be replaced with a
concise description of what s_rectangle_levelset actually does: compute and
initialize a rectangular level-set (or populate level-set fields for a
rectangle), list key inputs/outputs (e.g., gp or grid/state being modified), and
any important assumptions; update the comment block associated with
s_rectangle_levelset accordingly so it accurately documents the rectangle
levelset computation and its parameters.
- Around line 643-648: The Doxygen `@param` tags are out-of-date for subroutine
s_model_levelset(gp): remove or replace the nonexistent parameters (patch_id,
STL_levelset, STL_levelset_norm) and instead document the actual argument gp and
its purpose (e.g., type/structure and role in the routine) in the comment header
above s_model_levelset; alternatively, if those other parameters are meant to be
inputs, update the subroutine signature to include them consistently—ensure the
docblock and the subroutine argument list match and use correct Doxygen `@param`
entries for gp (and any real parameters).
- Around line 558-559: The assignment to gp%levelset_norm uses an integer array
constructor (/1, 0, 0/) which relies on implicit conversion and doesn’t ensure
working precision; replace it with wp-suffixed real literals to match the module
precision (e.g., set gp%levelset_norm to a real(wp) array like (1._wp, 0._wp,
0._wp)) so the values are stored in the correct precision and consistent with
f_approx_equal and other wp-typed variables.
- Around line 708-714: The comments for the interpolate branches are swapped:
update the comment above the interpolate==.true. branch (where gp%levelset is
set via f_interpolated_distance using models(patch_id)%interpolated_boundary_v
and total_vertices) to say it is computing the shortest distance to the
interpolated model boundary, and update the comment above the
interpolate==.false. branch (where gp%levelset is set via f_distance using
models(patch_id)%boundary_v and boundary_edge_count) to say it is computing the
shortest distance to the actual model boundary; leave the code (interpolate,
gp%levelset, f_interpolated_distance, f_distance, xyz_local) unchanged.
- Line 419: Normalize normal_vector safely by guarding against division by zero:
compute the squared norm via dot_product(normal_vector, normal_vector), clamp it
with sgm_eps (e.g. use max(dot_product(normal_vector, normal_vector), sgm_eps)),
take the sqrt and divide by that clamped value instead of dividing by raw
sqrt(dot_product(...)); update the normalization at the end of the
m_compute_levelset routine where normal_vector is normalized to use this clamped
denominator (refer to normal_vector, dot_product and sgm_eps).

🧹 Nitpick comments (4)

src/common/m_model.fpp (3)

490-511: Commented-out GPU directive — consider implementing the fibonacci sphere now.

The GPU_ROUTINE directive is commented out (line 490) while all other routines in this file (f_distance_normals_3D, f_distance, f_normals, f_interpolated_distance) have active directives. The TODO block (lines 504–511) correctly identifies random_number as the blocker and proposes a fibonacci sphere replacement.

Since this PR's objective is to enable GPU-parallelized levelset computation, leaving f_model_is_inside as the sole non-GPU-callable routine may limit STL model support on GPU. If the fibonacci sphere approach is validated, implementing it here would make the GPU story complete.

1130-1131: Remove commented-out old declaration.

Line 1131 is a leftover from the previous fixed-shape signature. Dead commented-out code adds noise.

1-27: File exceeds 1000-line guideline (1251 lines).

This is pre-existing, but with the new m_compute_levelset module being introduced in this PR, it may be a good time to evaluate whether some of the interpolation routines (f_interpolate_2D, f_interpolate_3D, f_check_interpolation_2D, f_check_interpolation_3D) could be split into a separate module to bring this file under the limit. As per coding guidelines: "Keep … files ≤ 1000 lines."
src/simulation/m_compute_levelset.fpp (1)
598-613: Integer array constructors for dist_sides_vec and dist_surface_vec.

Same wp precision concern as the sphere subroutine — (/1, 0, 0/), (/0, 1, 0/), etc. produce default-real values rather than wp-precision. These are then used in dot_product and arithmetic with real(wp) variables.
Proposed fix (representative — apply to all six occurrences)
-            dist_sides_vec = (/1, 0, 0/)
-            dist_surface_vec = (/0, 1, 1/)
+            dist_sides_vec = (/1._wp, 0._wp, 0._wp/)
+            dist_surface_vec = (/0._wp, 1._wp, 1._wp/)
As per coding guidelines: "Use wp (working precision) parameter from m_precision_select instead of hardcoded precision."

coderabbitai · 2026-02-06T15:53:38Z

src/simulation/m_compute_levelset.fpp

+    !>  Initialize IBM module
+    subroutine s_rectangle_levelset(gp)


⚠️ Potential issue | 🟡 Minor

Misleading Doxygen: "Initialize IBM module" on a rectangle levelset subroutine.

Copy-paste artifact. Should describe the rectangle levelset computation.

Proposed fix

- !> Initialize IBM module + !> Compute the levelset and normal for a rectangle patch subroutine s_rectangle_levelset(gp)

🤖 Prompt for AI Agents

In `@src/simulation/m_compute_levelset.fpp` around lines 310 - 311, The Doxygen header above subroutine s_rectangle_levelset is a copy-paste artifact ("Initialize IBM module") and should be replaced with a concise description of what s_rectangle_levelset actually does: compute and initialize a rectangular level-set (or populate level-set fields for a rectangle), list key inputs/outputs (e.g., gp or grid/state being modified), and any important assumptions; update the comment block associated with s_rectangle_levelset accordingly so it accurately documents the rectangle levelset computation and its parameters.

src/simulation/m_compute_levelset.fpp

…t now works

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@src/simulation/m_compute_levelset.fpp`:
- Around line 36-41: The GPU loops use copy='[gps]' while gps is already moved
to the device by GPU_UPDATE(device='[gps(1:num_gps)]'), causing
redundant/conflicting data regions; in the GPU_PARALLEL_LOOP pragmas (the loops
that list private='[i,patch_id,patch_geometry]', copy='[gps]',
copyin='[patch_ib,Np]') replace copy='[gps]' with present='[gps]' (and apply the
same change to the second loop around line 66) so the loops reference the
existing device-resident gps and rely on the GPU_UPDATE(host=...) copy-back
later.
- Around line 118-119: The assignment uses an integer literal 0 for the real(wp)
array element gp%levelset_norm; change the literal to the working precision kind
(0._wp) to avoid implicit conversion and match the fix in
s_sphere_levelset—update the assignment in the block where f_approx_equal(dist,
0._wp) is checked so it reads gp%levelset_norm = 0._wp (ensure _wp from
m_precision_select is used).

🧹 Nitpick comments (3)

src/simulation/m_compute_levelset.fpp (3)
156-197: Multiple integer literals assigned to real(wp) variables.

Lines 175, 181, 195 assign 0 (integer) to dist_vec(3) which is real(wp). Use 0._wp consistently.
Proposed fix (representative)
-                dist_vec(3) = 0
+                dist_vec(3) = 0._wp
Apply at lines 175, 181, and 195.
222-222: Unused variable length_z.

length_z is declared but never referenced — lz is used instead. Remove it to avoid confusion.
Proposed fix
-        real(wp) :: length_z
598-613: Integer array constructors assigned to real(wp) vectors.

Lines 601–612 use (/1, 0, 0/), (/0, 1, 0/), etc. for dist_sides_vec and dist_surface_vec which are real(wp). Use _wp suffix on all literals for type safety and precision consistency.
Proposed fix (representative)
-            dist_sides_vec = (/1, 0, 0/)
-            dist_surface_vec = (/0, 1, 1/)
+            dist_sides_vec = (/1._wp, 0._wp, 0._wp/)
+            dist_surface_vec = (/0._wp, 1._wp, 1._wp/)
Apply similarly for lines 606–607 and 611–612.

coderabbitai · 2026-02-06T17:08:22Z

src/simulation/m_compute_levelset.fpp

+        $:GPU_UPDATE(device='[gps(1:num_gps)]')
+
+        !  3D Patch Geometries
+        if (p > 0) then
+
+            $:GPU_PARALLEL_LOOP(private='[i,patch_id,patch_geometry]', copy='[gps]', copyin='[patch_ib,Np]')


⚠️ Potential issue | 🟠 Major

copy='[gps]' in both GPU loops conflicts with the GPU_UPDATE(device=...) at line 36.

Line 36 moves gps to the device via GPU_UPDATE. The subsequent copy='[gps]' in both loops (lines 41, 66) creates a separate structured data region that redundantly allocates/copies gps, potentially conflicting with the unstructured data update. Use present='[gps]' instead, since gps is already on the device, and rely on the GPU_UPDATE(host=...) at line 90 for the copy-back.

Proposed fix

- $:GPU_PARALLEL_LOOP(private='[i,patch_id,patch_geometry]', copy='[gps]', copyin='[patch_ib,Np]') + $:GPU_PARALLEL_LOOP(private='[i,patch_id,patch_geometry]', present='[gps]', copyin='[patch_ib,Np]')

Apply similarly at line 66.

Also applies to: 64-66

🤖 Prompt for AI Agents

In `@src/simulation/m_compute_levelset.fpp` around lines 36 - 41, The GPU loops use copy='[gps]' while gps is already moved to the device by GPU_UPDATE(device='[gps(1:num_gps)]'), causing redundant/conflicting data regions; in the GPU_PARALLEL_LOOP pragmas (the loops that list private='[i,patch_id,patch_geometry]', copy='[gps]', copyin='[patch_ib,Np]') replace copy='[gps]' with present='[gps]' (and apply the same change to the second loop around line 66) so the loops reference the existing device-resident gps and rely on the GPU_UPDATE(host=...) copy-back later.

coderabbitai · 2026-02-06T17:08:22Z

src/simulation/m_compute_levelset.fpp

+        if (f_approx_equal(dist, 0._wp)) then
+            gp%levelset_norm = 0


⚠️ Potential issue | 🟡 Minor

Integer literal 0 assigned to real(wp) array.

Same issue that was fixed for s_sphere_levelset at line 559 — use 0._wp for consistency and to avoid implicit conversion.

Proposed fix

- gp%levelset_norm = 0 + gp%levelset_norm = 0._wp

As per coding guidelines: "Use wp (working precision) parameter from m_precision_select instead of hardcoded precision like real*8."

🤖 Prompt for AI Agents

In `@src/simulation/m_compute_levelset.fpp` around lines 118 - 119, The assignment uses an integer literal 0 for the real(wp) array element gp%levelset_norm; change the literal to the working precision kind (0._wp) to avoid implicit conversion and match the fix in s_sphere_levelset—update the assignment in the block where f_approx_equal(dist, 0._wp) is checked so it reads gp%levelset_norm = 0._wp (ensure _wp from m_precision_select is used).

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/simulation/m_ib_patches.fpp (1)

64-88: ⚠️ Potential issue | 🟠 Major

Remove the optional attribute from ib_markers_sf parameter.

Line 66 declares ib_markers_sf as optional, but it is used unconditionally throughout the subroutine and passed directly to geometry-specific subroutines (s_ib_sphere, s_ib_cuboid, s_ib_cylinder, s_ib_3D_airfoil, s_ib_model, s_ib_ellipse, s_ib_circle, s_ib_rectangle, s_ib_airfoil) that declare it as non-optional intent(inout). Both call sites in m_ibm.fpp (lines 111 and 1002) always provide this argument. Remove the optional attribute to match actual usage and prevent potential undefined behavior.

🤖 Fix all issues with AI agents

In `@src/simulation/m_ib_patches.fpp`:
- Line 143: Replace the raw Fortran allocate call with the Fypp device-aware
macro: in the allocation of models(patch_id)%model (referencing the models
array, patch_id index and model component), remove the plain "allocate" and use
the Fypp macro invocation @:ALLOCATE(models(patch_id)%model) so the allocation
becomes device-aware and satisfies the pipeline formatting rule.
- Around line 58-60: The allocation for GPU-managed array models (declared with
GPU_DECLARE) must use the fypp device-aware macro rather than plain Fortran
allocate: replace the plain allocate of models(patch_id)%model with
@:ALLOCATE(models(patch_id)%model) so the memory is allocated on the device
(symbols: models, GPU_DECLARE, models(patch_id)%model, @:ALLOCATE). Also fix the
scalar concatenation bug by replacing constructs that form (/dx, dy, dz/) with
scalar minima using minval(dx), minval(dy), minval(dz) (addresses the
concatenation at the spots handling dx/dy/dz) and correct the typo 0.e5_wp to
0.e-5_wp where tolerance is intended (fix occurrences around the 215–219
region).

🧹 Nitpick comments (1)

src/simulation/m_ib_patches.fpp (1)
928-958: Serial triple-nested loop over the entire grid is a significant performance bottleneck.

The TODO at line 928 explains that f_model_is_inside uses random_number() which is non-GPU-compatible. Until resolved, this loop runs serially over all (m+1)×(n+1)×(p+1) cells, which will be very slow on large grids.

Additionally, the loop order (i outermost, k innermost at line 930–932) is suboptimal for Fortran's column-major layout with ib_markers_sf(i, j, k). Consider swapping to k-j-i ordering (k outermost, i innermost) to improve cache locality while this remains CPU-only.
Suggested loop reorder for better cache performance
-        do i = 0, m
-            do j = 0, n
-                do k = 0, p
+        do k = 0, p
+            do j = 0, n
+                do i = 0, m

coderabbitai · 2026-02-06T19:26:36Z

src/simulation/m_ib_patches.fpp

+    type(t_model_array), allocatable, target :: models(:)
+    $:GPU_DECLARE(create='[models]')
+    !! array of STL models that can be allocated and then used in IB marker and levelset compute


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

rg -nP 'allocate\s*\(\s*models\s*\(' --type fortran -A2 -B2

Repository: MFlowCode/MFC

Length of output: 39

🏁 Script executed:

rg -nP 's_instantiate_STL_models|@:ALLOCATE.*models' --type fortran -B3 -A3

Repository: MFlowCode/MFC

Length of output: 39

🏁 Script executed:

rg -nP '\bmodels\s*=' --type fortran -B2 -A2 | head -100

Repository: MFlowCode/MFC

Length of output: 39

🏁 Script executed:

cd src/simulation && sed -n '118,250p' m_ib_patches.fpp

Repository: MFlowCode/MFC

Length of output: 6421

🏁 Script executed:

rg -nP 'call\s+s_instantiate_STL_models|allocate\s*\(\s*models\s*\(' --type fortran -B3 -A3

Repository: MFlowCode/MFC

Length of output: 39

🏁 Script executed:

rg -nP '@:ALLOCATE|allocate\s*\(\s*\w+\s*\(' src/simulation/ --type fortran -B1 -A1 | head -80

Repository: MFlowCode/MFC

Length of output: 39

🏁 Script executed:

fd -e fpp -x grep -l '@:ALLOCATE\|allocate' {} \; src/simulation/ | head -5 | xargs -I{} sh -c 'echo "=== {} ===" && grep -n "@:ALLOCATE\|allocate" {} | head -10'

Repository: MFlowCode/MFC

Length of output: 422

🏁 Script executed:

fd -e fpp . src/simulation/ | head -5 | xargs -I{} sh -c 'echo "=== {} ===" && grep -n "@:ALLOCATE" {} | head -5'

Repository: MFlowCode/MFC

Length of output: 1335

🏁 Script executed:

sed -n '210,225p' src/simulation/m_ib_patches.fpp

Repository: MFlowCode/MFC

Length of output: 1064

🏁 Script executed:

sed -n '180,210p' src/simulation/m_ib_patches.fpp

Repository: MFlowCode/MFC

Length of output: 1499

Use @:ALLOCATE macro for device-aware allocation at line 143.

Line 143 allocates models(patch_id)%model using standard Fortran allocate(), but models is GPU-managed (declared with GPU_DECLARE at line 59). Per coding guidelines, use the fypp @:ALLOCATE() macro for all device-aware memory: @:ALLOCATE(models(patch_id)%model). This also resolves the pipeline formatting check failure.

Additionally, the prior review flagged two issues still present in the code:

Lines 183, 200, 202: (/dx, dy, dz/) concatenates allocatable arrays instead of scalars. Should use minval() (partially done at line 185 for 2D).

Lines 215–219: 0.e5_wp appears to be a typo for 0.e-5_wp (current value is 0.0).

🤖 Prompt for AI Agents

In `@src/simulation/m_ib_patches.fpp` around lines 58 - 60, The allocation for GPU-managed array models (declared with GPU_DECLARE) must use the fypp device-aware macro rather than plain Fortran allocate: replace the plain allocate of models(patch_id)%model with @:ALLOCATE(models(patch_id)%model) so the memory is allocated on the device (symbols: models, GPU_DECLARE, models(patch_id)%model, @:ALLOCATE). Also fix the scalar concatenation bug by replacing constructs that form (/dx, dy, dz/) with scalar minima using minval(dx), minval(dy), minval(dz) (addresses the concatenation at the spots handling dx/dy/dz) and correct the typo 0.e5_wp to 0.e-5_wp where tolerance is intended (fix occurrences around the 215–219 region).

coderabbitai · 2026-02-06T19:26:36Z

src/simulation/m_ib_patches.fpp

+
+        do patch_id = 1, num_ibs
+            if (patch_ib(patch_id)%geometry == 5 .or. patch_ib(patch_id)%geometry == 12) then
+                allocate(models(patch_id)%model)


⚠️ Potential issue | 🟡 Minor

Use @:ALLOCATE macro for device-aware allocation.

Per coding guidelines, use the Fypp @:ALLOCATE macro instead of raw allocate. This also addresses the pipeline formatting failure flagged at this location.

Proposed fix

- allocate(models(patch_id)%model) + @:ALLOCATE(models(patch_id)%model)

As per coding guidelines: "Use fypp macro @:ALLOCATE(var1, var2) for device-aware allocation instead of standard Fortran allocate"

🤖 Prompt for AI Agents

In `@src/simulation/m_ib_patches.fpp` at line 143, Replace the raw Fortran allocate call with the Fypp device-aware macro: in the allocation of models(patch_id)%model (referencing the models array, patch_id index and model component), remove the plain "allocate" and use the Fypp macro invocation @:ALLOCATE(models(patch_id)%model) so the allocation becomes device-aware and satisfies the pipeline formatting rule.

…han on frontier. We should look into that eventually

danieljvickers · 2026-02-07T14:19:43Z

@sbryngelson I ran the Phoenix ACC test suite manually and got it to pass, but every time I submit on the runnners I get this weird device error that I have never seen before. printed over and over before it just gets killed:

[2026-02-06T20:58:03.989] error: NVML: Failed to get Compute running process count(3): Not Supported
     (293/547)    3D -> weno_order=3 -> wenoz=T                        27.22    BB0418CD
     (294/547)    3D -> weno_order=3                                   41.16    4B7FC359
     (295/547)    3D -> grcbc z                                        43.43    D7107B89
[2026-02-06T20:58:33.975] error: NVML: Failed to get Compute running process count(3): Not Supported
[2026-02-06T20:58:33.977] error: NVML: Failed to get Compute running process count(3): Not Supported
[2026-02-06T20:58:33.979] error: NVML: Failed to get Compute running process count(3): Not Supported
[2026-02-06T20:58:33.980] error: NVML: Failed to get Compute running process count(3): Not Supported
[2026-02-06T20:58:33.982] error: NVML: Failed to get Compute running process count(3): Not Supported
[2026-02-06T20:58:33.983] error: NVML: Failed to get Compute running process count(3): Not Supported
[2026-02-06T20:58:33.985] error: NVML: Failed to get Compute running process count(3): Not Supported
[2026-02-06T20:58:33.986] error: NVML: Failed to get Compute running process count(3): Not Supported
[2026-02-06T20:58:33.988] error: NVML: Failed to get Compute running process count(3): Not Supported
[2026-02-06T20:58:33.989] error: NVML: Failed to get Compute running process count(3): Not Supported
[2026-02-06T20:58:33.991] error: NVML: Failed to get Compute running process count(3): Not Supported
[2026-02-06T20:58:33.992] error: NVML: Failed to get Compute running process count(3): Not Supported
[2026-02-06T20:58:33.994] error: NVML: Failed to get Compute running process count(3): Not Supported
[2026-02-06T20:58:33.995] error: NVML: Failed to get Compute running process count(3): Not Supported
[2026-02-06T20:58:33.997] error: NVML: Failed to get Compute running process count(3): Not Supported
[2026-02-06T20:58:33.998] error: NVML: Failed to get Compute running process count(3): Not Supported
[2026-02-06T20:58:34.000] error: NVML: Failed to get Compute running process count(3): Not Supported
[2026-02-06T20:58:34.001] error: NVML: Failed to get Compute running process count(3): Not Supported
     (296/547)    3D -> weno_order=3 -> mapped_weno=T                  37.61    EFDCFF72
     (297/547)    3D -> weno_order=5                                   37.79    35231DF8
     (298/547)    3D -> weno_order=5 -> mp_weno=T                      38.51    A3EC9BD7
./mfc.sh: line 111: 3139309 Killed                  python3 "$(pwd)/toolchain/main.py" "$@"

Any idea what this could be?

codeant-ai · 2026-02-08T13:56:50Z

CodeAnt AI is running Incremental review

Thanks for using CodeAnt! 🎉

We're free for open-source projects. if you're enjoying it, help us grow by sharing.

Share on X ·
Reddit ·
LinkedIn

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/simulation/m_ib_patches.fpp (1)
480-481: 🛠️ Refactor suggestion | 🟠 Major

Use @:ALLOCATE for airfoil_grid_u and airfoil_grid_l in the 3D airfoil path.

The 2D airfoil path (lines 325–326) correctly uses @:ALLOCATE, but the 3D path still uses raw allocate. This inconsistency may cause issues if GPU-aware allocation behavior differs.
Proposed fix
-            allocate (airfoil_grid_u(1:Np))
-            allocate (airfoil_grid_l(1:Np))
+            @:ALLOCATE(airfoil_grid_u(1:Np))
+            @:ALLOCATE(airfoil_grid_l(1:Np))
As per coding guidelines: "Use fypp macro @:ALLOCATE(var1, var2) for device-aware allocation instead of standard Fortran allocate."

🧹 Nitpick comments (4)

src/simulation/m_compute_levelset.fpp (3)

180-180: Bare integer 0 literals assigned to real(wp) components throughout airfoil routines.

Lines 180, 186, 200, 269, 275, and 289 assign 0 (integer) to dist_vec(3) (a real(wp) component). Use 0._wp consistently, as done on lines 119, 166, 255.

As per coding guidelines: "Use wp (working precision) parameter from m_precision_select instead of hardcoded precision."

401-402: Unused variables k and idx declared but never referenced.

These appear to be leftover from a different version of this subroutine (perhaps copied from s_rectangle_levelset).

606-617: Integer array constructors for dist_sides_vec and dist_surface_vec.

Lines 606–617 use (/1, 0, 0/), (/0, 1, 0/), etc. for real(wp) arrays. Use wp-suffixed literals for consistency.

As per coding guidelines: "Use wp (working precision) parameter from m_precision_select instead of hardcoded precision."
src/simulation/m_ib_patches.fpp (1)
907-908: Unused variables point and local_point in s_ib_model.

These variables are declared but never referenced in the subroutine body. They appear to be remnants from a prior version of the code.
Proposed fix
         real(wp) :: eta
-        real(wp), dimension(1:3) :: point, local_point
         real(wp), dimension(1:3) :: center, xyz_local

codeant-ai · 2026-02-08T14:07:30Z

CodeAnt AI Incremental review completed.

danieljvickers and others added 30 commits December 12, 2025 13:48

Added ellipse marker patch

89a0ac5

Added ellipse IB patch

ef446f5

Added an ellipse immersed boundary patch

2f9b1be

Pre processing builds and all IB marker and levelset values have been…

e3495d9

… removed

Merged with master

f23eaff

intermediate commit

59c8dc2

Merge branch 'airfoil-4d-fixes' into levelset-refactor

b7b8d2d

Missed one merge message

128562b

Merge branch 'levelset-refactor' of github.com:danieljvickers/MFC int…

3e097d7

…o levelset-refactor

This code fully works, except the IB markers are not getting written

285999f

fixed MPI IO error, except for first time step

455d16b

We now also write at time step 0

d5aa186

Removed some of the ib marker data from tests

ed9edac

More case file changes

4fa238a

Removed unneeded variable

97c54f3

STLs run

0662d47

Merge branch 'levelset-refactor' of github.com:danieljvickers/MFC int…

75ddc11

…o levelset-refactor

Update casedicts to pass in STL file location to simulation

b2a821c

Intermitent commit swapping machines

6c45fc6

Transferring state

2b84f06

Resolved 2D STL files

e8fd89f

Remove additional airfoil data from golden files

fb9dbaf

Make sure we have reasonable default values in simulation as well

a9c6c5c

Removed Ib markers from golden

31799e1

I only need that change for circles

070e062

generated new golden files with master

8c4fa75

Merged in new golden files generated via the correct dimensions off o…

12256ec

…f main

intermittent commit'

0497b85

finished first pass of levelset module

2a0fb21

First pass of refactor finished. Time to try building

d7bef7f

Daniel J Vickers and others added 3 commits February 6, 2026 00:46

Finally got circle working on GPU

76ab0db

Reverted to match rest of ghost point reads

ccca21f

clean up

89412b6

coderabbitai bot reviewed Feb 6, 2026

View reviewed changes

src/simulation/m_compute_levelset.fpp Show resolved Hide resolved

src/simulation/m_compute_levelset.fpp Show resolved Hide resolved

src/simulation/m_compute_levelset.fpp Show resolved Hide resolved

src/simulation/m_ibm.fpp Show resolved Hide resolved

All tests pass with openACC

c0c9ea3

coderabbitai bot reviewed Feb 6, 2026

View reviewed changes

danieljvickers and others added 3 commits February 6, 2026 02:13

format

99de417

No present option in omp directives

7f21870

Merge branch 'levelset-refactor' of github.com:danieljvickers/MFC-Dan…

a2ebac7

… into levelset-refactor

codeant-ai bot added size:XXL This PR changes 1000+ lines, ignoring generated files and removed size:XXL This PR changes 1000+ lines, ignoring generated files labels Feb 6, 2026

codeant-ai bot reviewed Feb 6, 2026

View reviewed changes

coderabbitai bot reviewed Feb 6, 2026

View reviewed changes

Daniel J Vickers added 2 commits February 6, 2026 11:49

Found outr that openMP hates the side_dist array used for cuboids, bu…

aef9016

…t now works

Formatting

fe0dfab

coderabbitai bot reviewed Feb 6, 2026

View reviewed changes

Daniel Vickers added 2 commits February 6, 2026 14:19

Fixed openMP issue on Frontier

9952603

Formatting

ee7e431

coderabbitai bot reviewed Feb 6, 2026

View reviewed changes

For some reason running formatting on my local machine is different t…

3932983

…han on frontier. We should look into that eventually

Defering GPU STLs to a later PR

9c58276

codeant-ai bot added size:XXL This PR changes 1000+ lines, ignoring generated files and removed size:XXL This PR changes 1000+ lines, ignoring generated files labels Feb 8, 2026

Fixed 3D airfoils moving

c0948ea

coderabbitai bot reviewed Feb 8, 2026

View reviewed changes


		integer :: i, patch_id, patch_geometry

		$:GPU_UPDATE(device='[gps(1:num_gps)]')

	$:GPU_UPDATE(device='[gps(1:num_gps)]')
	! gps data is transferred via the GPU_PARALLEL_LOOP copy clause; no explicit device update needed here

	$:GPU_UPDATE(host='[gps(1:num_gps)]')
	! gps data is already copied back to the host by the GPU_PARALLEL_LOOP copy clause; no explicit host update needed here

Conversation

danieljvickers commented Feb 4, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

User description

User description

Description

Type of change

Scope

How Has This Been Tested?

Checklist

If your code changes any code source files (anything in src/simulation)

PR Type

Description

Diagram Walkthrough

File Walkthrough

Summary by CodeRabbit

CodeAnt-AI Description

What Changed

Impact

Checking Your Pull Request

Talking to CodeAnt AI

Example

Preserve Org Learnings with CodeAnt

Example

Retrigger review

Check Your Repository Health

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

codeant-ai bot commented Feb 6, 2026

Thanks for using CodeAnt! 🎉

Uh oh!

codeant-ai bot Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

codeant-ai bot Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

codeant-ai bot commented Feb 6, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 6, 2026

danieljvickers commented Feb 4, 2026 •

edited by coderabbitai bot

Loading

If your code changes any code source files (anything in `src/simulation`)