Opening Invited Talk

Speaker:

Brian Barrett

Principal Engineer
Amazon Web Services

Title: HPC On The Cloud: Opportunities to Redesign the Supercomputer

Abstract:

HPC has long been the realm of Supercomputers, specialized machines dedicated to large-scale MPI applications. Building these machines required carefully balancing network, processor, and memory technologies and chasing every system inefficiency in order to achieve peak performance. Over time, many of the components have become commoditized: memory, processors, and now even networking components are largely commodity options. This has opened the door to using Cloud computing infrastructure for HPC applications. Like current supercomputers, HPC in the Cloud requires intelligent system software to allow application developers to manage the complexity of the system (while still leaving time to get some real work done). In this talk, we’ll present some of the challenges we have faced in trying to run HPC applications in a large-scale Cloud environment, some of the challenges we unexpectedly did not face, and some of the solutions we have assembled for building successful HPC environments. Finally, we will discuss areas of research that we believe are critical to making HPC in the Cloud more than just another Supercomputer.

Biography:

Brian Barrett is a Principal Engineer at Amazon Web Services, focused on enabling High Performance Computing in the Cloud. He was one of the lead developers on the Elastic Fabric Adapter (EFA), a network interface designed to bring the choice and elasticity of the Cloud to HPC applications, and is one of the original developers of the Open MPI implementation of the MPI standard. Prior to joining Amazon, Brian spent eight years at Sandia National Laboratories, contributing to the Portals 4 network programming interface and the MPI-3 standard. He received his PhD from Indiana University, Bloomington.