Intro to AI-driven Science on Supercomputers: A Student Training Series

America/Chicago
Zoom (Virtual Training Series)

Description

The ability of artificial intelligence (AI) to learn from large datasets has transformed science and engineering as we know it. AI can accelerate scientific discovery and innovation, but it often requires more computing power than is available to most researchers. The DOE provides supercomputers to solve the nation’s biggest scientific challenges, and this series aims to introduce a new generation of AI practitioners to these powerful resources.

Building on the ALCF's robust training program in the areas of AI and supercomputing, we are hosting a series of hands-on courses that will teach attendees to use leading-edge supercomputers to develop and apply AI solutions for the world's most challenging problems. This year, we will focus on understanding the fundamentals of large language models (LLMs) and their scientific applications.

REGISTRATION DEADLINE: January 15, 2024 (REGISTRATION IS NOW CLOSED)

Recordings for each session will be posted below by the Friday of that week.

Series materials can be found on the series' GitHub.

FOR SUBMITTED ACCOUNT REQUESTS: We are currently processing a large number of requests and will reach out once your account has been set up. Expect an email in the third or fourth week of January 2024.

For those who submitted account requests after the registration deadline (January 15), we will not be able to process the request or grant access to ALCF machines. However, all session materials will still be available.

 

EVENT WEBSITE

ALCF AI for Science Training Series 


EVENT DATES

The virtual workshop series will take place on Tuesdays from 3:00 pm to 4:30 pm CT, February 6 through March 26, 2024.

SESSIONS
    • Session 1: Intro to AI on Supercomputers

      Trainees will learn the basics of supercomputers and high-performance computing. They will be introduced to parallel programming and the fundamentals of training AI models on supercomputers. (A short parallel-programming sketch follows this session's details.)

      Lecturer
      Huihuo Zheng is a Computer Scientist at the Argonne Leadership Computing Facility. His areas of interest include data management, parallel I/O, and large-scale distributed training. He applies high-performance computing and deep learning to various domain sciences, such as physics, chemistry, and materials science. He also co-leads the MLPerf Storage Benchmarking group, which develops benchmark suites for evaluating the performance of storage systems for AI applications.

      AI for Science Talk Speaker
      Arvind Ramanathan is a computational biologist in the Data Science and Learning Division at Argonne National Laboratory and a senior scientist at the University of Chicago Consortium for Advanced Science and Engineering (CASE). His research interests are at the intersection of data science, high-performance computing, and biological/biomedical sciences. He will be speaking about Autonomous Discovery for Biological Systems Design.

      Conveners: Arvind Ramanathan (DSL), Huihuo Zheng (LCF)
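
      As a rough illustration of the parallel-programming ideas in this session, here is a minimal parallel "hello world" using mpi4py (an illustrative sketch, not part of the course materials; it assumes an MPI installation and mpi4py are available). The allreduce call shows the collective-communication pattern that also underlies distributed training.

      from mpi4py import MPI

      comm = MPI.COMM_WORLD      # communicator spanning all launched processes
      rank = comm.Get_rank()     # this process's ID (0, 1, ..., size-1)
      size = comm.Get_size()     # total number of processes

      # Each rank contributes one number; allreduce sums the contributions on every rank,
      # the same collective pattern used to average gradients in distributed training.
      local_value = rank + 1
      total = comm.allreduce(local_value, op=MPI.SUM)

      print(f"Hello from rank {rank} of {size}; sum of all contributions = {total}")

      Run, for example, with: mpiexec -n 4 python hello_mpi.py (the script name and process count are arbitrary).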
    • Session 2: Intro to Neural Networks

      Trainees will learn the basics of neural networks, opening up the black box of machine learning by building by-hand networks for linear regression to deepen their understanding of the math behind machine learning methods. (A small by-hand example follows this session's details.)

      Lecturer
      Bethany Lusch is a Computer Scientist in the data science group at the Argonne Leadership Computing Facility at Argonne National Lab. Her research expertise includes developing methods and tools to integrate AI with science, especially for dynamical systems and PDE-based simulations. Her recent work includes developing machine-learning emulators to replace expensive parts of simulations, such as computational fluid dynamics simulations of engines and climate simulations. She is also working on methods that incorporate domain knowledge in machine learning, representation learning, and using machine learning to analyze supercomputer logs. She holds a Ph.D. and MS in applied mathematics from the University of Washington and a BS in mathematics from the University of Notre Dame.

      AI for Science Talk Speaker
      Nicola Ferrier is a senior computer scientist in the Mathematics and Computer Science Division at Argonne National Laboratory. Ferrier's research interests are in the use of computer vision (digital images) to control robots, machinery, and devices, with applications as diverse as medical systems, manufacturing, and projects that facilitate "scientific discovery" (such as her recent project using machine vision and robotics for plant phenotype studies). She will be speaking on AI @ Edge.

      Conveners: Bethany Lusch (LCF), Nicola Ferrier (MCS)
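
      To give a flavor of the by-hand network for linear regression mentioned above, here is a minimal NumPy sketch (our own illustration, not the course notebook): a single weight and bias fit by gradient descent on a mean-squared-error loss.

      import numpy as np

      # Synthetic data: y = 3x + 2 plus a little noise.
      rng = np.random.default_rng(0)
      x = rng.uniform(-1, 1, size=(100, 1))
      y = 3.0 * x + 2.0 + 0.1 * rng.standard_normal((100, 1))

      # The simplest possible "network": one weight and one bias, trained by hand.
      w, b = 0.0, 0.0
      lr = 0.1
      for step in range(500):
          y_pred = w * x + b                  # forward pass
          err = y_pred - y
          grad_w = 2.0 * np.mean(err * x)     # dL/dw for the mean-squared-error loss
          grad_b = 2.0 * np.mean(err)         # dL/db
          w -= lr * grad_w                    # gradient-descent update
          b -= lr * grad_b

      print(f"learned w={w:.2f}, b={b:.2f} (target: w=3, b=2)")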
    • Session 3: Advanced Topics in Neural Networks

      Trainees will learn advanced topics in convolutional neural networks, such as deep, residual, variational, and adversarial networks. (A minimal residual-block sketch follows this session's details.)

      Lecturer
      Corey Adams is a Computational Scientist at the Argonne Leadership Computing Facility. Originally a high-energy physicist working on neutrino physics problems, he now works on applying deep learning and machine learning techniques to science problems – and still neutrino physics – on high-performance computers. He has experience in classification, segmentation, and sparse convolutional neural networks as well as running machine learning training at scale.

      AI for Science Talk Speaker
      Katerina Vriza is an incoming staff scientist at the Center for Nanoscale Materials at Argonne National Lab. Her main focus is on using AI/ML to extract scientific data from the literature, and on combining such data with ML and robotics for the design and synthesis of polymer materials. She works on an autonomous materials synthesis robot called Polybot. Katerina is also the current lead developer of EXSCLAIM, a code that extracts and curates labeled electron microscopy datasets using AI. She will speak about Extracting and Processing Multimodal Data from Literature to Guide Robotic Experiments.

      Conveners: Corey Adams (LCF), Katerina Vriza (CNM)
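
      As a small illustration of the residual networks mentioned above, below is a minimal PyTorch sketch of a residual block (our own example, assuming PyTorch is available; the channel and image sizes are arbitrary).

      import torch
      from torch import nn

      class ResidualBlock(nn.Module):
          """Two 3x3 convolutions whose output is added back to the input (a skip connection)."""
          def __init__(self, channels: int):
              super().__init__()
              self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
              self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
              self.relu = nn.ReLU()

          def forward(self, x: torch.Tensor) -> torch.Tensor:
              out = self.relu(self.conv1(x))
              out = self.conv2(out)
              return self.relu(out + x)   # skip connection: lets gradients bypass the convolutions

      block = ResidualBlock(channels=16)
      features = torch.randn(1, 16, 32, 32)   # (batch, channels, height, width)
      print(block(features).shape)            # torch.Size([1, 16, 32, 32])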
    • Session 4: Intro to Large Language Models

      Trainees will learn about essential concepts of sequential data modeling and about modeling approaches such as transformers. (A minimal self-attention sketch follows this session's details.)

      Lecturer
      Carlo Graziani is a Computational Scientist at Argonne National Laboratory. He received a Ph.D. in physics from the University of Chicago in 1993. He has worked on research problems in high-energy astrophysics, computational fluid dynamics, high-energy density physics, and plasma physics, as well as in applications of advanced statistics and machine learning to physical and biological problems. He joined Argonne in 2017.

      AI for Science Talk Speaker
      Troy Arcomano is a postdoctoral fellow at Argonne National Lab working on machine learning applications for weather and climate in the EVS division. During his time at ANL, he was the Argonne lead for several projects, including a large collaboration to create a state-of-the-art foundation model for weather prediction. Troy received his PhD from Texas A&M University, where he worked on developing machine learning applications for weather forecasting and investigated how machine learning could be used to improve climate models. He will be speaking about the AI Revolution for Weather and Climate.

      Conveners: Carlo Graziani (MCS), Troy Arcomano (EVS)
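
      As a rough, self-contained illustration of the transformer building block mentioned above, here is a NumPy sketch of single-head scaled dot-product self-attention (our own example; the sequence length and dimensions are arbitrary).

      import numpy as np

      def self_attention(x, wq, wk, wv):
          """Scaled dot-product self-attention over a sequence x of shape (seq_len, d_model)."""
          q, k, v = x @ wq, x @ wk, x @ wv                # project tokens to queries, keys, values
          scores = q @ k.T / np.sqrt(k.shape[-1])         # similarity between every pair of positions
          weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
          weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
          return weights @ v                              # each output mixes information from all positions

      rng = np.random.default_rng(0)
      d_model = 8
      x = rng.standard_normal((5, d_model))               # a 5-token input sequence
      wq, wk, wv = (rng.standard_normal((d_model, d_model)) for _ in range(3))
      print(self_attention(x, wq, wk, wv).shape)          # (5, 8)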
    • Session 5: LLM: Embeddings and Tokenization

      Trainees will learn how text is tokenized and mapped to embedding vectors, the input representation used by large language models. (A toy tokenization-and-embedding sketch follows this session's details.)

      Lecturer
      Archit Vasan is a postdoctoral appointee in the Argonne Leadership Computing Facility with a background in computational biophysics. His research interests at ALCF involve the discovery of cancer drugs using machine learning coupled with exascale computing. Archit received a BA in Physics and Mathematics from Austin College in 2016. He then received his PhD in Biophysics from the University of Illinois at Urbana-Champaign in 2023 under the guidance of Dr. Emad Tajkhorshid.

      Convener: Archit Vasan (LCF)
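
      To make the session topic concrete, here is a toy sketch of word-level tokenization and an embedding lookup (our own illustration; real LLM tokenizers typically use subword schemes such as byte-pair encoding, and embedding tables are learned rather than random).

      import numpy as np

      corpus = "the quick brown fox jumps over the lazy dog"

      # Toy word-level tokenizer: map each unique word to an integer ID.
      vocab = {word: idx for idx, word in enumerate(sorted(set(corpus.split())))}
      token_ids = [vocab[word] for word in corpus.split()]

      # Embedding table: one vector per vocabulary entry (random here, learned in a real model).
      d_model = 4
      rng = np.random.default_rng(0)
      embedding_table = rng.standard_normal((len(vocab), d_model))

      # The lookup turns the ID sequence into the matrix of vectors a transformer consumes.
      embeddings = embedding_table[token_ids]
      print(token_ids)            # [7, 6, 0, 2, 3, 5, 7, 4, 1]
      print(embeddings.shape)     # (9, 4)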
    • Session 6: Parallel Training Methods

      We present modern parallelism techniques and discuss how they can be used to train and distribute large models across many GPUs. (A minimal data-parallel sketch follows this session's details.)

      Lecturer
      Sam Foreman is a Computational Scientist with a background in high energy physics, currently working as a postdoc in the ALCF. He is generally interested in the application of machine learning to computational problems in physics, particularly within the context of high-performance computing. Sam's current research focuses on using deep generative modeling to help build better sampling algorithms for simulations in lattice gauge theory.

      Conveners: Alessandro Lovato (ANL), Sam Foreman (LCF)
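
      Data parallelism is only one of the techniques this session covers, but it illustrates the basic pattern. Below is a minimal sketch assuming PyTorch with the NCCL backend and a launcher such as torchrun (our own example; the model, sizes, and script name are placeholders).

      import os
      import torch
      from torch import nn
      import torch.distributed as dist
      from torch.nn.parallel import DistributedDataParallel as DDP

      # One process per GPU; each holds a full model copy and trains on its own shard of the data.
      dist.init_process_group(backend="nccl")            # RANK/WORLD_SIZE are set by the launcher
      local_rank = int(os.environ.get("LOCAL_RANK", 0))
      torch.cuda.set_device(local_rank)

      model = DDP(nn.Linear(32, 1).cuda(local_rank), device_ids=[local_rank])
      optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

      x = torch.randn(64, 32).cuda(local_rank)           # this rank's local mini-batch
      y = torch.randn(64, 1).cuda(local_rank)
      loss = nn.functional.mse_loss(model(x), y)
      loss.backward()                                    # DDP all-reduces (averages) gradients here
      optimizer.step()

      dist.destroy_process_group()

      Launch with, for example: torchrun --nproc_per_node=4 ddp_sketch.py (the script name and GPU count are arbitrary).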
    • Session 7: AI Accelerators

      Trainees will learn about the current advances in AI hardware and the ALCF AI Testbed that is being integrated with existing and upcoming supercomputers at the facility to accelerate science insights.

      Lecturer
      Murali Emani is a Computer Scientist in the Data Science group at the Argonne Leadership Computing Facility (ALCF) at Argonne National Laboratory. Previously, he was a Postdoctoral Research Staff Member at Lawrence Livermore National Laboratory. Murali obtained his PhD from the Institute for Computing Systems Architecture at the School of Informatics, University of Edinburgh, UK. His research interests include scalable machine learning, benchmarking, runtime systems, emerging HPC architectures, parallel programming models, and high-performance computing.

      Conveners: Murali Emani (LCF), Nesar Ramachandra (ANL)
    • Session 8: Evaluating LLMs and Potential Pitfalls

      Based on their understanding of LLMs from the series, trainees will engage in a dialogue on evaluating LLMs and their potential limitations. (A small perplexity-calculation sketch follows this session's details.)

      Lecturer
      Bethany Lusch is a Computer Scientist in the data science group at the Argonne Leadership Computing Facility at Argonne National Lab. Her research expertise includes developing methods and tools to integrate AI with science, especially for dynamical systems and PDE-based simulations. Her recent work includes developing machine-learning emulators to replace expensive parts of simulations, such as computational fluid dynamics simulations of engines and climate simulations. She is also working on methods that incorporate domain knowledge in machine learning, representation learning, and using machine learning to analyze supercomputer logs. She holds a Ph.D. and MS in applied mathematics from the University of Washington and a BS in mathematics from the University of Notre Dame.

      Marieme Ngom is an Assistant Computer Scientist at the Argonne Leadership Computing Facility. Her research interests include probabilistic machine learning, high-performance computing, and dynamical systems modeling with applications in chemical engineering and material sciences. Ngom received her Ph.D. in mathematics from the University of Illinois at Chicago (UIC) in 2019 under the supervision of Prof. David Nicholls. Marieme holds an MSc in mathematics from the University of Paris-Saclay (formerly Paris XI), an MSc in computer science from the National Polytechnic Institute of Toulouse, and an MEng in computer science and applied mathematics from the École nationale supérieure d’électrotechnique, d’électronique, d’informatique, d’hydraulique et des télécommunications (ENSEEIHT) in Toulouse.

      Conveners: Bethany Lusch (LCF), Marieme Ngom (LCF), Sandeep Madireddy (MCS)
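
      One common, if partial, quantitative evaluation of an LLM is its perplexity on held-out text; the sketch below shows the calculation with made-up per-token probabilities (our own illustration, not a summary of the session's discussion).

      import numpy as np

      def perplexity(token_log_probs):
          """Perplexity = exp(average negative log-likelihood per token); lower is better."""
          return float(np.exp(-np.mean(token_log_probs)))

      # Hypothetical probabilities a model assigned to each token of a held-out sentence.
      token_probs = np.array([0.25, 0.10, 0.50, 0.05, 0.30])
      print(f"perplexity = {perplexity(np.log(token_probs)):.2f}")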