Keynote

Speaker:
Jidong Zhai

Tenured Associate Professor
Department of Computer Science and Technology
Tsinghua University, China

Title: HPC System Software Enhanced by Source Code Analysis

Abstract:

Building efficient and scalable system software for large-scale systems, especially tools for performance analysis and monitoring, is increasingly important both for the developers of parallel applications and for the designers of next-generation HPC systems. However, conventional performance tools suffer from significant time and space overhead due to ever-increasing problem sizes and system scales. For instance, memory monitoring is critical for understanding applications and evaluating systems. Because programs' memory accesses are dynamic in nature, common practice today leaves large amounts of address examination and data recording to runtime, at the cost of substantial performance overhead.

On the other hand, the cost of source code analysis is independent of the problem size and system scale, making it very appealing for large-scale performance analysis. Inspired by this observation, we have designed a series of lightweight system software tools for HPC systems, such as a memory access monitoring tool, a performance variance detection tool, and a communication trace compression tool. In this talk, I will share our experience in building these tools by combining static analysis with runtime analysis, and I will also point out the main challenges in this direction.

Biography:

Jidong Zhai is a Tenured Associate Professor in the Computer Science Department of Tsinghua University. He is a Siebel Scholar and a recipient of the CCF Outstanding Doctoral Dissertation Award and the NSFC Young Career Award. He was a Visiting Professor at Stanford University (2015‒2016) and a Visiting Scholar at MSRA (Microsoft Research Asia) in 2013. His research interests include high performance computing, performance evaluation, compilers, and heterogeneous computing. He has published more than 40 papers in prestigious refereed conferences and top journals, including SC, PPOPP, ASPLOS, ICS, ATC, MICRO, IEEE TPDS, and IEEE TC. His research was a Best Paper Finalist at SC’14. He is the advisor of the Tsinghua Student Cluster Team. Under his leadership, the team has won 8 international championships in student supercomputing challenges at SC, ISC, and ASC. In 2015 and 2018, the team swept all three championships at SC, ISC, and ASC. He was a program co-chair of NPC 2018 and a program co-chair of the ICPP PASA 2015 workshop. He has served or is currently serving as a TPC member of SC, ICS, PPOPP, ICPP, NAS, LCPC, and Euro-Par. He is currently on the editorial board of IEEE Transactions on Parallel and Distributed Systems.