CUG2021 Proceedings





Papers
Presentation, Paper
Acceptance and Testing
Chair: Stephen Leak (National Energy Research Scientific Computing Center/Lawrence Berkeley National Laboratory)
Acceptance Testing the Chicoma HPE-Cray EX Supercomputer
Kody Everson (Los Alamos National Laboratory, Dakota State University Advanced Research Laboratory) and Paul Ferrell, Jennifer Green, Francine Lapid, Daniel Magee, Jordan Ogas, Calvin Seamons, and Nicholas Sly (Los Alamos National Laboratory)
Abstract
pdf, pdf
A Step Towards the Final Frontier: Lessons Learned from Acceptance Testing of the First HPE/Cray EX 3000 System at ORNL
Veronica G. Vergara Larrea, Reuben Budiardja, Paul Peltz, Jeffery Niles, Christopher Zimmer, Daniel Dietz, Christopher Fuson, Hong Liu, Paul Newman, James Simmons, and Chris Muzyn (Oak Ridge National Laboratory)
Abstract
pdf, pdf
Presentation, Paper
Storage and I/O 1
Chair: Tina Declerck (National Energy Research Scientific Computing Center/Lawrence Berkeley National Laboratory)
New data path solutions from HPE for HPC simulation, AI, and high performance workloads
Lance Evans and Marc Roskow (HPE)
Abstract
pdf
Lustre and Spectrum Scale: Simplify parallel file system workflows with HPE Data Management Framework
Mark Wiertalla, Kirill Malkin, and Zsolt Ferenczy (HPE)
Abstract
pdf
Presentation, Paper
Storage and I/O 2
Chair: Veronica G. Vergara Larrea (Oak Ridge National Laboratory)
h5bench: HDF5 I/O Kernel Suite for Exercising HPC I/O Patterns
Tonglin Li (Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center); Suren Byna (Lawrence Berkeley National Laboratory); Quincey Koziol (Lawrence Berkeley National Laboratory, National Center for Supercomputing Applications); and Houjun Tang, Jean Luca Bez, and Qiao Kang (Lawrence Berkeley National Laboratory)
Abstract
pdf, pdf
Architecture and Performance of Perlmutter's 35 PB ClusterStor E1000 All-Flash File System
Glenn K. Lockwood (Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center) and Alberto Chiusole, Lisa Gerhardt, Kirill Lozinskiy, David Paul, and Nicholas J. Wright (Lawrence Berkeley National Laboratory)
Abstract
pdf, pdf
Presentation, Paper
System Analytics and Monitoring
Chair: Jim Brandt (Sandia National Laboratories)
Integrating System State and Application Performance Monitoring: Network Contention Impact
Jim Brandt (Sandia National Laboratories); Tom Tucker (Open Grid Computing); and Simon Hammond, Ben Schwaller, Ann Gentile, Kevin Stroup, and Jeanine Cook (Sandia National Laboratories)
Abstract
trellis — An Analytics Framework for Understanding Slingshot Performance
Madhu Srinivasan, Dipanwita Mallick, Kristyn Maschhoff, and Haripriya Ayyalasomayajula (Hewlett Packard Enterprise)
Abstract
pdf, pdf
AIOps: Leveraging AI/ML for Anomaly Detection in System Management
Sergey Serebryakov, Jeff Hanson, Tahir Cader, Deepak Nanjundaiah, and Joshi Subrahmanya (Hewlett Packard Enterprise)
Abstract
Real-time Slingshot Monitoring in HPCM
Priya K, Prasanth Kurian, and Jyothsna Deshpande (Hewlett Packard Enterprise)
Abstract
Analytic Models to Improve Quality of Service of HPC Jobs
Saba Naureen, Prasanth Kurian, and Amarnath Chilumukuru (HPE)
Abstract
Presentation, Paper
Systems Support
Chair: Hai Ah Nam (Lawrence Berkeley National Laboratory)
Blue Waters System and Component Reliability
Brett Bode, David King, Celso Mendes, and William Kramer (National Center for Supercomputing Applications/University of Illinois); Saurabh Jha (University of Illinois); and Roger Ford, Justin Davis, and Steven Dramstad (Cray Inc.)
Abstract
pdf, pdf
Configuring and Managing Multiple Shasta Systems: Best Practices Developed During the Perlmutter Deployment
James Botts (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center); Zachary Crisler (Hewlett Packard Enterprise); Aditi Gaur and Douglas Jacobsen (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center); Harold Longley, Alex Lovell-Troy, and Dave Poulsen (Hewlett Packard Enterprise); and Eric Roman and Chris Samuel (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center)
Abstract
pdf
Slurm on Shasta at NERSC: adapting to a new way of life
Christopher Samuel (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center) and Douglas M. Jacobsen and Aditi Gaur (Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center)
Abstract
pdf
Declarative automation of compute node lifecycle through Shasta API integration
J. Lowell Wofford and Kevin Pelzel (Los Alamos National Laboratory)
Abstract
Cray EX Shasta v1.4 System Management Overview
Harold Longley (Hewlett Packard Enterprise)
Abstract
pdf
Managing User Access with UAN and UAI
Harold Longley, Alex Lovell-Troy, and Gregory Baker (Hewlett Packard Enterprise)
Abstract
pdf
User and Administrative Access Options for CSM-Based Shasta Systems
Alex Lovell-Troy, Sean Lynn, and Harold Longley (Hewlett Packard Enterprise)
Abstract
pdf, pdf
HPE Ezmeral Container Platform: Current And Future
Thomas Phelan (HPE)
Abstract
pdf
Presentation, Paper
Applications and Performance (ARM)
Chair: Simon McIntosh-Smith (University of Bristol)
An Evaluation of the A64FX Architecture for HPC Applications
Andrei Poenaru and Tom Deakin (University of Bristol, GW4); Simon McIntosh-Smith (University of Bristol); and Si Hammond and Andrew Younge (Sandia National Laboratories)
Abstract
pdf, pdf
Vectorising and distributing NTTs to count Goldbach partitions on Arm-based supercomputers
Ricardo Jesus (EPCC, The University of Edinburgh); Tomás Oliveira e Silva (IEETA/DETI, Universidade de Aveiro); and Michèle Weiland (EPCC, The University of Edinburgh)
Abstract
pdf, pdf
Optimizing a 3D multi-physics continuum mechanics code for the HPE Apollo 80 System
Vince Graziano (New Mexico Consortium, Los Alamos National Laboratory) and David Nystrom, Howard Pritchard, Brandon Smith, and Brian Gravelle (Los Alamos National Laboratory)
Abstract
pdf, pdf
Presentation, Paper
Applications and Performance
Chair: Zhengji Zhao (National Energy Research Scientific Computing Center/Lawrence Berkeley National Laboratory)
Optimizing the Cray Graph Engine for Performant Analytics on Cluster, SuperDome Flex, Shasta Systems and Cloud Deployment
Christopher Rickett, Kristyn Maschhoff, and Sreenivas Sukumar (Hewlett Packard Enterprise)
Abstract
pdf, pdf
Real-Time XFEL Data Analysis at SLAC and NERSC: a Trial Run of Nascent Exascale Experimental Data Analysis
Best Paper
Johannes P. Blaschke (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center); Aaron S. Brewster, Daniel W. Paley, Derek Mendez, Asmit Bhowmick, and Nicholas K. Sauter (Lawrence Berkeley National Laboratory/Physical Biosciences Division); Wilko Kröger and Murali Shankar (SLAC National Accelerator Laboratory); and Bjoern Enders and Deborah Bard (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center)
Abstract
pdf, pdf
Early Experiences Evaluating the HPE/Cray Ecosystem for AMD GPUs
Veronica G. Vergara Larrea, Reuben Budiardja, and Wayne Joubert (Oak Ridge National Laboratory)
Abstract
pdf, pdf
Convergence of AI and HPC at HLRS. Our Roadmap.
Dennis Hoppe (High Performance Computing Center Stuttgart)
Abstract
Porting Codes to LUMI
Georgios Markomanolis (CSC - IT Center for Science Ltd.)
Abstract
pdf
Birds of a Feather, Paper
BoF 1
Chair: Bilel Hadri (KAUST Supercomputing Lab)
Update of Cray Programming Environment
John Levesque (HPE)
Abstract
Programming Environments, Applications, and Documentation (PEAD) Special Interest Group meeting
Bilel Hadri (KAUST Supercomputing Lab)
Abstract
HPC System Test: Building a cross-center collaboration for system testing
Veronica G. Vergara Larrea (Oak Ridge National Laboratory), Bilel Hadri (King Abdullah University of Science and Technology), Reuben Budiardja (Oak Ridge National Laboratory), Vasileios Karakasis (Swiss National Supercomputing Centre), Shahzeb Siddiqui (Lawrence Berkeley National Laboratory), and George Markomanolis (CSC - IT Center for Science Ltd.)
Abstract
pdf

Presentations
Presentation, Paper
Acceptance and Testing
Chair: Stephen Leak (National Energy Research Scientific Computing Center/Lawrence Berkeley National Laboratory)
Acceptance Testing the Chicoma HPE-Cray EX Supercomputer
Kody Everson (Los Alamos National Laboratory, Dakota State University Advanced Research Laboratory) and Paul Ferrell, Jennifer Green, Francine Lapid, Daniel Magee, Jordan Ogas, Calvin Seamons, and Nicholas Sly (Los Alamos National Laboratory)
Abstract
pdf, pdf
A Step Towards the Final Frontier: Lessons Learned from Acceptance Testing of the First HPE/Cray EX 3000 System at ORNL
Veronica G. Vergara Larrea, Reuben Budiardja, Paul Peltz, Jeffery Niles, Christopher Zimmer, Daniel Dietz, Christopher Fuson, Hong Liu, Paul Newman, James Simmons, and Chris Muzyn (Oak Ridge National Laboratory)
Abstract
pdf, pdf
Presentation, Paper
Storage and I/O 1
Chair: Tina Declerck (National Energy Research Scientific Computing Center/Lawrence Berkeley National Laboratory)
New data path solutions from HPE for HPC simulation, AI, and high performance workloads
Lance Evans and Marc Roskow (HPE)
Abstract
pdf
Lustre and Spectrum Scale: Simplify parallel file system workflows with HPE Data Management Framework
Mark Wiertalla, Kirill Malkin, and Zsolt Ferenczy (HPE)
Abstract
pdf
Presentation, Paper
Storage and I/O 2
Chair: Veronica G. Vergara Larrea (Oak Ridge National Laboratory)
h5bench: HDF5 I/O Kernel Suite for Exercising HPC I/O Patterns
Tonglin Li (Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center); Suren Byna (Lawrence Berkeley National Laboratory); Quincey Koziol (Lawrence Berkeley National Laboratory, National Center for Supercomputing Applications); and Houjun Tang, Jean Luca Bez, and Qiao Kang (Lawrence Berkeley National Laboratory)
Abstract
pdf, pdf
Architecture and Performance of Perlmutter's 35 PB ClusterStor E1000 All-Flash File System
Glenn K. Lockwood (Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center) and Alberto Chiusole, Lisa Gerhardt, Kirill Lozinskiy, David Paul, and Nicholas J. Wright (Lawrence Berkeley National Laboratory)
Abstract
pdf, pdf
Presentation, Paper
System Analytics and Monitoring
Chair: Jim Brandt (Sandia National Laboratories)
Integrating System State and Application Performance Monitoring: Network Contention Impact
Jim Brandt (Sandia National Laboratories); Tom Tucker (Open Grid Computing); and Simon Hammond, Ben Schwaller, Ann Gentile, Kevin Stroup, and Jeanine Cook (Sandia National Laboratories)
Abstract
trellis — An Analytics Framework for Understanding Slingshot Performance
Madhu Srinivasan, Dipanwita Mallick, Kristyn Maschhoff, and Haripriya Ayyalasomayajula (Hewlett Packard Enterprise)
Abstract
pdf, pdf
AIOps: Leveraging AI/ML for Anomaly Detection in System Management
Sergey Serebryakov, Jeff Hanson, Tahir Cader, Deepak Nanjundaiah, and Joshi Subrahmanya (Hewlett Packard Enterprise)
Abstract
Real-time Slingshot Monitoring in HPCM
Priya K, Prasanth Kurian, and Jyothsna Deshpande (Hewlett Packard Enterprise)
Abstract
Analytic Models to Improve Quality of Service of HPC Jobs
Saba Naureen, Prasanth Kurian, and Amarnath Chilumukuru (HPE)
Abstract
Presentation, Paper
Systems Support
Chair: Hai Ah Nam (Lawrence Berkeley National Laboratory)
Blue Waters System and Component Reliability
Brett Bode, David King, Celso Mendes, and William Kramer (National Center for Supercomputing Applications/University of Illinois); Saurabh Jha (University of Illinois); and Roger Ford, Justin Davis, and Steven Dramstad (Cray Inc.)
Abstract
pdf, pdf
Configuring and Managing Multiple Shasta Systems: Best Practices Developed During the Perlmutter Deployment
James Botts (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center); Zachary Crisler (Hewlett Packard Enterprise); Aditi Gaur and Douglas Jacobsen (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center); Harold Longley, Alex Lovell-Troy, and Dave Poulsen (Hewlett Packard Enterprise); and Eric Roman and Chris Samuel (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center)
Abstract
pdf
Slurm on Shasta at NERSC: adapting to a new way of life
Christopher Samuel (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center) and Douglas M. Jacobsen and Aditi Gaur (Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center)
Abstract
pdf
Declarative automation of compute node lifecycle through Shasta API integration
J. Lowell Wofford and Kevin Pelzel (Los Alamos National Laboratory)
Abstract
Cray EX Shasta v1.4 System Management Overview
Harold Longley (Hewlett Packard Enterprise)
Abstract
pdf
Managing User Access with UAN and UAI
Harold Longley, Alex Lovell-Troy, and Gregory Baker (Hewlett Packard Enterprise)
Abstract
pdf
User and Administrative Access Options for CSM-Based Shasta Systems
Alex Lovell-Troy, Sean Lynn, and Harold Longley (Hewlett Packard Enterprise)
Abstract
pdf, pdf
HPE Ezmeral Container Platform: Current And Future
Thomas Phelan (HPE)
Abstract
pdf
Presentation, Paper
Applications and Performance (ARM)
Chair: Simon McIntosh-Smith (University of Bristol)
An Evaluation of the A64FX Architecture for HPC Applications
Andrei Poenaru and Tom Deakin (University of Bristol, GW4); Simon McIntosh-Smith (University of Bristol); and Si Hammond and Andrew Younge (Sandia National Laboratories)
Abstract
pdf, pdf
Vectorising and distributing NTTs to count Goldbach partitions on Arm-based supercomputers
Ricardo Jesus (EPCC, The University of Edinburgh); Tomás Oliveira e Silva (IEETA/DETI, Universidade de Aveiro); and Michèle Weiland (EPCC, The University of Edinburgh)
Abstract
pdf, pdf
Optimizing a 3D multi-physics continuum mechanics code for the HPE Apollo 80 System
Vince Graziano (New Mexico Consortium, Los Alamos National Laboratory) and David Nystrom, Howard Pritchard, Brandon Smith, and Brian Gravelle (Los Alamos National Laboratory)
Abstract
pdf, pdf
Presentation, Paper
Applications and Performance
Chair: Zhengji Zhao (National Energy Research Scientific Computing Center/Lawrence Berkeley National Laboratory)
Optimizing the Cray Graph Engine for Performant Analytics on Cluster, SuperDome Flex, Shasta Systems and Cloud Deployment
Christopher Rickett, Kristyn Maschhoff, and Sreenivas Sukumar (Hewlett Packard Enterprise)
Abstract
pdf, pdf
Real-Time XFEL Data Analysis at SLAC and NERSC: a Trial Run of Nascent Exascale Experimental Data Analysis
Best Paper
Johannes P. Blaschke (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center); Aaron S. Brewster, Daniel W. Paley, Derek Mendez, Asmit Bhowmick, and Nicholas K. Sauter (Lawrence Berkeley National Laboratory/Physical Biosciences Division); Wilko Kröger and Murali Shankar (SLAC National Accelerator Laboratory); and Bjoern Enders and Deborah Bard (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center)
Abstract
pdf, pdf
Early Experiences Evaluating the HPE/Cray Ecosystem for AMD GPUs
Veronica G. Vergara Larrea, Reuben Budiardja, and Wayne Joubert (Oak Ridge National Laboratory)
Abstract
pdf, pdf
Convergence of AI and HPC at HLRS. Our Roadmap.
Dennis Hoppe (High Performance Computing Center Stuttgart)
Abstract
Porting Codes to LUMI
Georgios Markomanolis (CSC - IT Center for Science Ltd.)
Abstract
pdf

Created 2021-12-13 17:26