WIP --- Temporal reduction profile #776

Open: burlen wants to merge 4 commits into develop from temporal_reduction_profile

burlen commented Sep 8, 2023

Time each stage in the app.
This may need work/cleanup before merge.
This info is already captured by the profiler.
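
As a rough illustration of timing each stage, here is a minimal sketch of a scoped timer wrapped around a stage's execute call. The names `timed_stage`, `stage_times`, and `run_pipeline` are hypothetical and are not taken from this PR or from the app's profiler; the stages are stand-ins.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Hypothetical helper (not from the PR): wall-clock samples keyed by stage name.
stage_times = defaultdict(list)


@contextmanager
def timed_stage(name, clock=time.perf_counter):
    """Record the elapsed wall-clock time of the enclosed block under `name`."""
    start = clock()
    try:
        yield
    finally:
        # Recorded even if the stage raises, so failed calls are still counted.
        stage_times[name].append(clock() - start)


def run_pipeline(stages, data):
    """Run each (name, execute) pair in order, timing everything inside execute."""
    for name, execute in stages:
        with timed_stage(name):
            data = execute(data)
    return data


if __name__ == "__main__":
    # Stand-in stages; the real reader, reduction, and writer are not shown here.
    stages = [
        ("read", lambda d: list(range(8))),
        ("reduce", lambda d: sum(d) / len(d)),
        ("write", lambda d: d),
    ]
    run_pipeline(stages, None)
    for name, samples in stage_times.items():
        print(f"{name}: {sum(samples):.6f} s over {len(samples)} call(s)")
```

Placing the timer around the whole execute call means the sample includes any I/O, data movement, or synchronization that happens inside the stage, not just the compute kernel.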

burlen commented Sep 8, 2023

Fastest

perlmutter_kernel_profiling_Fastest

Average

perlmutter_kernel_profiling_Average

Slowest

perlmutter_kernel_profiling_Slowest

Takeaway: The temporal reduction is much faster on the GPU. I/O is slower and shows much more variability when the GPU is used. The timing captures everything within the execute of each stage.
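
For readers who want to produce a similar fastest/average/slowest breakdown from their own measurements, a minimal sketch follows. It assumes the timings are available as a list of samples per stage; how the plots in this PR were actually generated is not stated, and `summarize` is a hypothetical helper name.

```python
from statistics import mean


def summarize(samples_by_stage):
    """Return {stage: (fastest, average, slowest)} from per-stage timing samples."""
    summary = {}
    for stage, samples in samples_by_stage.items():
        if not samples:
            continue  # skip stages with no recorded samples
        summary[stage] = (min(samples), mean(samples), max(samples))
    return summary


if __name__ == "__main__":
    # Illustrative numbers only; these are not measurements from the PR.
    example = {"io": [1.0, 3.0], "reduce": [2.0]}
    for stage, (fastest, average, slowest) in summarize(example).items():
        spread = slowest - fastest  # a crude indicator of variability
        print(f"{stage}: fastest={fastest} average={average} "
              f"slowest={slowest} spread={spread}")
```

The gap between slowest and fastest gives a quick view of the kind of variability the takeaway describes for I/O.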

burlen force-pushed the temporal_reduction_profile branch from 218d905 to 02d3db8 on September 12, 2023 16:16

burlen commented Sep 12, 2023

Varying steps per request (1 reduce thread, 1 writer thread)

steps_per_request_single_thread_1red_1wri

Varying steps per request (4 reduce threads, 2 writer threads)

steps_per_request_single_thread_4red_2wri

burlen commented Sep 13, 2023

Round 2: steps per request

I redid the tests, this time going to larger steps per request. The same patterns appear.

Varying steps per request (1 reduce thread, 1 writer thread)

steps_per_request_single_thread_1red_1wri_789

Varying steps per request (4 reduce threads, 2 writer threads)

steps_per_request_single_thread_4red_2wri_789

burlen commented Sep 13, 2023

Single node with MPI

perlmutter_1_node_gpu_cpu_mpi_spr

burlen commented Sep 14, 2023

New vs. old

perlmutter_1_node_gpu_cpu_mpi_spr_strm

burlen commented Sep 16, 2023

steps_per_request_single_thread_1red_1wri_cfs_scratch

Base automatically changed from temporal_reduction_multiple_steps_per_request to develop on September 16, 2023 00:41

burlen commented Sep 19, 2023

steps_per_request_single_thread_cfs_lfs_nocomp
steps_per_request_single_thread_cfs_lfs_comp

burlen commented Sep 19, 2023

steps_per_request_single_thread_cfs_lfs_comp_nocomp
