Join us on May 28, 2025, for the webinar "Accelerating AI Training and Inference for Science on Aurora: Frameworks, Tools, and Best Practices," presented by Riccardo Balin and Filippo Simini.

In this developer session, we will give an overview of the key AI frameworks, toolkits, and strategies for achieving high-performance training and inference for scientific applications on Aurora. We will walk through examples of using PyTorch and TensorFlow on Aurora, then move on to distributed training at scale using PyTorch with Distributed Data Parallel (DDP) and TensorFlow with Horovod, both driven by the oneCCL communication library. We will also discuss how to use Python effectively on Intel GPUs with Data Parallel Extensions for Python (DPEP). To help maximize GPU performance, we will share best practices for profiling codes and identifying bottlenecks. Finally, we will cover pre-training, fine-tuning, and inference on Aurora, along with associated best practices.

Riccardo Balin is an Assistant Computational Scientist in the Data Services and Workflows team at the Argonne Leadership Computing Facility. His research interests include workflows for coupling traditional HPC simulations with AI/ML training and inference, online (in situ) and scalable deep learning from ongoing simulations and experiments, and applying ML and data-driven methods to turbulence modeling of complex aerodynamic flows.

He earned a B.S./M.S. degree in Aerospace Engineering from the University of Colorado Boulder in 2016 and a Ph.D. in Computational Fluid Dynamics and Turbulence Modeling from the same institution in 2020. He joined Argonne in 2021 as a postdoc under the Aurora Early Science Program, supporting a project that aims to perform in situ scientific machine learning from exascale simulations of turbulent flows.

Filippo Simini is a computer scientist in the Machine Learning and Artificial Intelligence Group at Argonne National Laboratory. His work focuses on helping to develop, run, and evaluate high-performance computing applications that include machine learning and artificial intelligence components, often combined with traditional science and engineering simulations. His interests include generative modeling and AI evaluation.

Date: May 28, 2025
Time zone: America/Chicago
Location: Online (virtual meeting information to follow)
Registration: Registration for this event is currently open.