CUG2021 Proceedings

Papers

Presentation, Paper

Acceptance and Testing

Chair: Stephen Leak (National Energy Research Scientific Computing Center/Lawrence Berkeley National Laboratory)

Acceptance Testing the Chicoma HPE-Cray EX Supercomputer

Kody Everson (Los Alamos National Laboratory, Dakota State University Advanced Research Laboratory) and Paul Ferrell, Jennifer Green, Francine Lapid, Daniel Magee, Jordan Ogas, Calvin Seamons, and Nicholas Sly (Los Alamos National Laboratory)

A Step Towards the Final Frontier: Lessons Learned from Acceptance Testing of the First HPE/Cray EX 3000 System at ORNL

Veronica G. Vergara Larrea, Reuben Budiardja, Paul Peltz, Jeffery Niles, Christopher Zimmer, Daniel Dietz, Christopher Fuson, Hong Liu, Paul Newman, James Simmons, and Chris Muzyn (Oak Ridge National Laboratory)

Presentation, Paper

Storage and I/O 1

Chair: Tina Declerck (National Energy Research Scientific Computing Center/Lawrence Berkeley National Laboratory)

New data path solutions from HPE for HPC simulation, AI, and high performance workloads

Lance Evans and Marc Roskow (HPE)

Lustre and Spectrum Scale: Simplify parallel file system workflows with HPE Data Management Framework

Mark Wiertalla, Kirill Malkin, and Zsolt Ferenczy (HPE)

Presentation, Paper

Storage and I/O 2

Chair: Veronica G. Vergara Larrea (Oak Ridge National Laboratory)

h5bench: HDF5 I/O Kernel Suite for Exercising HPC I/O Patterns

Tonglin Li (Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center); Suren Byna (Lawrence Berkeley National Laboratory); Quincey Koziol (Lawrence Berkeley National Laboratory, National Center for Supercomputing Applications); and Houjun Tang, Jean Luca Bez, and Qiao Kang (Lawrence Berkeley National Laboratory)

Architecture and Performance of Perlmutter's 35 PB ClusterStor E1000 All-Flash File System

Glenn K. Lockwood (Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center) and Alberto Chiusole, Lisa Gerhardt, Kirill Lozinskiy, David Paul, and Nicholas J. Wright (Lawrence Berkeley National Laboratory)

Presentation, Paper

System Analytics and Monitoring

Chair: Jim Brandt (Sandia National Laboratories)

Integrating System State and Application Performance Monitoring: Network Contention Impact

Jim Brandt (Sandia National Laboratories); Tom Tucker (Open Grid Computing); and Simon Hammond, Ben Schwaller, Ann Gentile, Kevin Stroup, and Jeanine Cook (Sandia National Laboratories)

trellis — An Analytics Framework for Understanding Slingshot Performance

Madhu Srinivasan, Dipanwita Mallick, Kristyn Maschhoff, and Haripriya Ayyalasomayajula (Hewlett Packard Enterprise)

AIOps: Leveraging AI/ML for Anomaly Detection in System Management

Sergey Serebryakov, Jeff Hanson, Tahir Cader, Deepak Nanjundaiah, and Joshi Subrahmanya (Hewlett Packard Enterprise)

Real-time Slingshot Monitoring in HPCM

Priya K, Prasanth Kurian, and Jyothsna Deshpande (Hewlett Packard Enterprise)

Analytic Models to Improve Quality of Service of HPC Jobs

Saba Naureen, Prasanth Kurian, and Amarnath Chilumukuru (HPE)

Presentation, Paper

Systems Support

Chair: Hai Ah Nam (Lawrence Berkeley National Laboratory)

Blue Waters System and Component Reliability

Brett Bode, David King, Celso Mendes, and William Kramer (National Center for Supercomputing Applications/University of Illinois); Saurabh Jha (University of Illinois); and Roger Ford, Justin Davis, and Steven Dramstad (Cray Inc.)

Configuring and Managing Multiple Shasta Systems: Best Practices Developed During the Perlmutter Deployment

James Botts (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center); Zachary Crisler (Hewlett Packard Enterprise); Aditi Gaur and Douglas Jacobsen (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center); Harold Longley, Alex Lovell-Troy, and Dave Poulsen (Hewlett Packard Enterprise); and Eric Roman and Chris Samuel (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center)

Slurm on Shasta at NERSC: adapting to a new way of life

Christopher Samuel (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center) and Douglas M. Jacobsen and Aditi Gaur (Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center)

Declarative automation of compute node lifecycle through Shasta API integration

J. Lowell Wofford and Kevin Pelzel (Los Alamos National Laboratory)

Cray EX Shasta v1.4 System Management Overview

Harold Longley (Hewlett Packard Enterprise)

Managing User Access with UAN and UAI

Harold Longley, Alex Lovell-Troy, and Gregory Baker (Hewlett Packard Enterprise)

User and Administrative Access Options for CSM-Based Shasta Systems

Alex Lovell-Troy, Sean Lynn, and Harold Longley (Hewlett Packard Enterprise)

HPE Ezmeral Container Platform: Current And Future

Thomas Phelan (HPE)

Presentation, Paper

Applications and Performance (ARM)

Chair: Simon McIntosh-Smith (University of Bristol)

An Evaluation of the A64FX Architecture for HPC Applications

Andrei Poenaru and Tom Deakin (University of Bristol, GW4); Simon McIntosh-Smith (University of Bristol); and Si Hammond and Andrew Younge (Sandia National Laboratories)

Vectorising and distributing NTTs to count Goldbach partitions on Arm-based supercomputers

Ricardo Jesus (EPCC, The University of Edinburgh); Tomás Oliveira e Silva (IEETA/DETI, Universidade de Aveiro); and Michèle Weiland (EPCC, The University of Edinburgh)

Optimizing a 3D multi-physics continuum mechanics code for the HPE Apollo 80 System

Vince Graziano (New Mexico Consortium, Los Alamos National Laboratory) and David Nystrom, Howard Pritchard, Brandon Smith, and Brian Gravelle (Los Alamos National Laboratory)

Presentation, Paper

Applications and Performance

Chair: Zhengji Zhao (National Energy Research Scientific Computing Center/Lawrence Berkeley National Laboratory)

Optimizing the Cray Graph Engine for Performant Analytics on Cluster, SuperDome Flex, Shasta Systems and Cloud Deployment

Christopher Rickett, Kristyn Maschhoff, and Sreenivas Sukumar (Hewlett Packard Enterprise)

Real-Time XFEL Data Analysis at SLAC and NERSC: a Trial Run of Nascent Exascale Experimental Data Analysis

Best Paper

Johannes P. Blaschke (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center); Aaron S. Brewster, Daniel W. Paley, Derek Mendez, Asmit Bhowmick, and Nicholas K. Sauter (Lawrence Berkeley National Laboratory/Physical Biosciences Division); Wilko Kröger and Murali Shankar (SLAC National Accelerator Laboratory); and Bjoern Enders and Deborah Bard (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center)

Early Experiences Evaluating the HPE/Cray Ecosystem for AMD GPUs

Veronica G. Vergara Larrea, Reuben Budiardja, and Wayne Joubert (Oak Ridge National Laboratory)

Convergence of AI and HPC at HLRS. Our Roadmap.

Dennis Hoppe (High Performance Computing Center Stuttgart)

Porting Codes to LUMI

Georgios Markomanolis (CSC - IT Center for Science Ltd.)

Birds of a Feather, Paper

BoF 1

Chair: Bilel Hadri (KAUST Supercomputing Lab)

Update of Cray Programming Environment

John Levesque (HPE)

Programming Environments, Applications, and Documentation (PEAD) Special Interest Group meeting

Bilel Hadri (KAUST Supercomputing Lab)

HPC System Test: Building a cross-center collaboration for system testing

Veronica G. Vergara Larrea (Oak Ridge National Laboratory), Bilel Hadri (King Abdullah University of Science and Technology), Reuben Budiardja (Oak Ridge National Laboratory), Vasileios Karakasis (Swiss National Supercomputing Centre), Shahzeb Siddiqui (Lawrence Berkeley National Laboratory), and George Markomanolis (CSC - IT Center for Science Ltd.)

Presentations

Presentation, Paper

Acceptance and Testing

Chair: Stephen Leak (National Energy Research Scientific Computing Center/Lawrence Berkeley National Laboratory)

Acceptance Testing the Chicoma HPE-Cray EX Supercomputer

Kody Everson (Los Alamos National Laboratory, Dakota State University Advanced Research Laboratory) and Paul Ferrell, Jennifer Green, Francine Lapid, Daniel Magee, Jordan Ogas, Calvin Seamons, and Nicholas Sly (Los Alamos National Laboratory)

A Step Towards the Final Frontier: Lessons Learned from Acceptance Testing of the First HPE/Cray EX 3000 System at ORNL

Veronica G. Vergara Larrea, Reuben Budiardja, Paul Peltz, Jeffery Niles, Christopher Zimmer, Daniel Dietz, Christopher Fuson, Hong Liu, Paul Newman, James Simmons, and Chris Muzyn (Oak Ridge National Laboratory)

Presentation, Paper

Storage and I/O 1

Chair: Tina Declerck (National Energy Research Scientific Computing Center/Lawrence Berkeley National Laboratory)

New data path solutions from HPE for HPC simulation, AI, and high performance workloads

Lance Evans and Marc Roskow (HPE)

Lustre and Spectrum Scale: Simplify parallel file system workflows with HPE Data Management Framework

Mark Wiertalla, Kirill Malkin, and Zsolt Ferenczy (HPE)

Presentation, Paper

Storage and I/O 2

Chair: Veronica G. Vergara Larrea (Oak Ridge National Laboratory)

h5bench: HDF5 I/O Kernel Suite for Exercising HPC I/O Patterns

Tonglin Li (Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center); Suren Byna (Lawrence Berkeley National Laboratory); Quincey Koziol (Lawrence Berkeley National Laboratory, National Center for Supercomputing Applications); and Houjun Tang, Jean Luca Bez, and Qiao Kang (Lawrence Berkeley National Laboratory)

Architecture and Performance of Perlmutter's 35 PB ClusterStor E1000 All-Flash File System

Glenn K. Lockwood (Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center) and Alberto Chiusole, Lisa Gerhardt, Kirill Lozinskiy, David Paul, and Nicholas J. Wright (Lawrence Berkeley National Laboratory)

Presentation, Paper

System Analytics and Monitoring

Chair: Jim Brandt (Sandia National Laboratories)

Integrating System State and Application Performance Monitoring: Network Contention Impact

Jim Brandt (Sandia National Laboratories); Tom Tucker (Open Grid Computing); and Simon Hammond, Ben Schwaller, Ann Gentile, Kevin Stroup, and Jeanine Cook (Sandia National Laboratories)

trellis — An Analytics Framework for Understanding Slingshot Performance

Madhu Srinivasan, Dipanwita Mallick, Kristyn Maschhoff, and Haripriya Ayyalasomayajula (Hewlett Packard Enterprise)

AIOps: Leveraging AI/ML for Anomaly Detection in System Management

Sergey Serebryakov, Jeff Hanson, Tahir Cader, Deepak Nanjundaiah, and Joshi Subrahmanya (Hewlett Packard Enterprise)

Real-time Slingshot Monitoring in HPCM

Priya K, Prasanth Kurian, and Jyothsna Deshpande (Hewlett Packard Enterprise)

Analytic Models to Improve Quality of Service of HPC Jobs

Saba Naureen, Prasanth Kurian, and Amarnath Chilumukuru (HPE)

Presentation, Paper

Systems Support

Chair: Hai Ah Nam (Lawrence Berkeley National Laboratory)

Blue Waters System and Component Reliability

Brett Bode, David King, Celso Mendes, and William Kramer (National Center for Supercomputing Applications/University of Illinois); Saurabh Jha (University of Illinois); and Roger Ford, Justin Davis, and Steven Dramstad (Cray Inc.)

Configuring and Managing Multiple Shasta Systems: Best Practices Developed During the Perlmutter Deployment

James Botts (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center); Zachary Crisler (Hewlett Packard Enterprise); Aditi Gaur and Douglas Jacobsen (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center); Harold Longley, Alex Lovell-Troy, and Dave Poulsen (Hewlett Packard Enterprise); and Eric Roman and Chris Samuel (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center)

Slurm on Shasta at NERSC: adapting to a new way of life

Christopher Samuel (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center) and Douglas M. Jacobsen and Aditi Gaur (Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center)

Declarative automation of compute node lifecycle through Shasta API integration

J. Lowell Wofford and Kevin Pelzel (Los Alamos National Laboratory)

Cray EX Shasta v1.4 System Management Overview

Harold Longley (Hewlett Packard Enterprise)

Managing User Access with UAN and UAI

Harold Longley, Alex Lovell-Troy, and Gregory Baker (Hewlett Packard Enterprise)

User and Administrative Access Options for CSM-Based Shasta Systems

Alex Lovell-Troy, Sean Lynn, and Harold Longley (Hewlett Packard Enterprise)

HPE Ezmeral Container Platform: Current And Future

Thomas Phelan (HPE)

Presentation, Paper

Applications and Performance (ARM)

Chair: Simon McIntosh-Smith (University of Bristol)

An Evaluation of the A64FX Architecture for HPC Applications

Andrei Poenaru and Tom Deakin (University of Bristol, GW4); Simon McIntosh-Smith (University of Bristol); and Si Hammond and Andrew Younge (Sandia National Laboratories)

Vectorising and distributing NTTs to count Goldbach partitions on Arm-based supercomputers

Ricardo Jesus (EPCC, The University of Edinburgh); Tomás Oliveira e Silva (IEETA/DETI, Universidade de Aveiro); and Michèle Weiland (EPCC, The University of Edinburgh)

Optimizing a 3D multi-physics continuum mechanics code for the HPE Apollo 80 System

Vince Graziano (New Mexico Consortium, Los Alamos National Laboratory) and David Nystrom, Howard Pritchard, Brandon Smith, and Brian Gravelle (Los Alamos National Laboratory)

Presentation, Paper

Applications and Performance

Chair: Zhengji Zhao (National Energy Research Scientific Computing Center/Lawrence Berkeley National Laboratory)

Optimizing the Cray Graph Engine for Performant Analytics on Cluster, SuperDome Flex, Shasta Systems and Cloud Deployment

Christopher Rickett, Kristyn Maschhoff, and Sreenivas Sukumar (Hewlett Packard Enterprise)

Real-Time XFEL Data Analysis at SLAC and NERSC: a Trial Run of Nascent Exascale Experimental Data Analysis

Best Paper

Johannes P. Blaschke (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center); Aaron S. Brewster, Daniel W. Paley, Derek Mendez, Asmit Bhowmick, and Nicholas K. Sauter (Lawrence Berkeley National Laboratory/Physical Biosciences Division); Wilko Kröger and Murali Shankar (SLAC National Accelerator Laboratory); and Bjoern Enders and Deborah Bard (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center)

Early Experiences Evaluating the HPE/Cray Ecosystem for AMD GPUs

Veronica G. Vergara Larrea, Reuben Budiardja, and Wayne Joubert (Oak Ridge National Laboratory)

Convergence of AI and HPC at HLRS. Our Roadmap.

Dennis Hoppe (High Performance Computing Center Stuttgart)

Porting Codes to LUMI

Georgios Markomanolis (CSC - IT Center for Science Ltd.)

Created 2021-12-13 17:26