This talk will cover strategies for porting two applications, XGC and HACC, to Aurora, Argonne's exascale machine, along with lessons learned and the tools that proved crucial in porting them.
XGC: The gyrokinetic plasma physics code XGC has been offloaded almost entirely to GPU via Kokkos and Cabana over the course of the Exascale Computing Project (ECP). In addition to accelerating computation, we find that communication patterns and memory usage must be very flexible to maintain a code base that is performant across architectures and scales. The XGC portion of the talk will cover the progress made; the lessons learned from running on diverse new machines (Polaris, Sunspot, and recently Frontier); the unique challenges of Aurora; and how these inform our plans as Aurora becomes available.
HACC: This application uses CUDA as its programming model on GPUs. Because CUDA is a proprietary language, the application developers must convert their kernels to a programming model suitable for Aurora. The HACC portion of the talk will discuss the tools and development strategies used to port HACC from CUDA to SYCL. We will cover the challenges of supporting multiple codebases (CUDA/HIP/SYCL) in HACC and the optimizations made to improve performance on Intel Xe GPUs.
Aaron Scheinberg is a computational scientist and consultant focusing on exascale computing, scientific application performance, particle-based methods, magnetic fusion simulations, and GPU programming.
Esteban Rangel joined the Computational Science (CPS) division at Argonne National Laboratory as a staff scientist in July 2021. Before that, he was a postdoc at the Argonne Leadership Computing Facility (ALCF), a position he took up after receiving his PhD in Computer Science from Northwestern University in 2018. He began contributing to the HACC codebase as a graduate student, when much of his PhD thesis work involved designing and implementing scalable analysis software for N-body cosmological simulations.