CUG 2013 Proceedings | Created 2013-08-06
Birds of a Feather Interactive 3A
Chair: Colin McMurtrie (Swiss National Supercomputing Centre)

Birds of a Feather Interactive 3B
Chair: Helen He (National Energy Research Scientific Computing Center)

Programming Environments, Applications and Documentation SIG
Helen He (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center)

Birds of a Feather Interactive 3C

Birds of a Feather Interactive 8A
Chair: Nick Cardo (National Energy Research Scientific Computing Center)

Open discussion with CUG Board
Nick Cardo (National Energy Research Scientific Computing Center)

Birds of a Feather Interactive 8B
Chair: Duncan J. Poole (NVIDIA)

OpenACC BOF
Duncan Poole (NVIDIA)
Abstract: This BOF will discuss the status of OpenACC as an organization and as a specification. Topics of interest to CUG include the OpenACC 2.0 specification and member activities, including new products, benchmarks, example codes, and a profiling interface. Many OpenACC members will be present at CUG, and a lot of progress has been made, so this can be a lively interactive session.

Birds of a Feather Interactive 8C
Chair: John Hesterberg (Cray Inc.)

System Management Futures
John Hesterberg (Cray Inc.)
Abstract: System management futures. Discuss ideas about what comes next after the Installation, Image Management, Provisioning, and Configuration changes being planned at Cray. What is the right way to do system administration and management for exascale? What are your best practices in system administration and management for large systems?

Birds of a Feather Interactive 11A/B
Chair: David Henty (EPCC, The University of Edinburgh)

HPC training and education
David Henty (EPCC, The University of Edinburgh)
Abstract: Education and training activities have a crucial role in ensuring that the end-users of any HPC infrastructure are able to fully exploit the strengths of existing and future hardware and software resources. In this interactive session we will discuss the status of HPC education and training activities around the globe, identify existing and potential challenges, and possibly find some solutions to them as well. The session will be prefaced by two short case studies, one about experiences in running an MSc programme in HPC at EPCC, the University of Edinburgh, and another about organizing training activities within PRACE, the pan-European virtual research infrastructure for HPC. Attendees are invited to contribute similar case studies.

Birds of a Feather Interactive 11C
Chair: Jeff Keopp (Cray Inc.)

Cray External Services Systems
Jeff Keopp (Cray Inc.)

Birds of a Feather Interactive 17A
Chair: John Hesterberg (Cray Inc.)

System Monitoring, Accounting and Metrics
John Hesterberg (Cray Inc.)
Abstract: Let's talk about data collection!

Birds of a Feather Interactive 17B
Chair: Jenett Tillotson (Indiana University)

Experiences with Moab and TORQUE
Jenett Tillotson (Indiana University)
Abstract: This BoF will focus on administrator experiences with Moab and TORQUE: in particular, the interface with ALPS, experiences with Moab 7 and TORQUE 4, and running Moab and/or TORQUE outside the Cray on an external scheduling node. Attendees will be asked to share their configurations, and we will discuss possible best practices for Moab and TORQUE configurations on Cray systems.
Birds of a Feather Interactive 17C

Invited Talk General Session 4
Chair: Nick Cardo (National Energy Research Scientific Computing Center)

CUG Welcome
Nick Cardo (National Energy Research Scientific Computing Center)

Why we need Exascale, and why we won't get there by 2020
Horst D. Simon (Lawrence Berkeley National Laboratory)
Abstract: It may come as a surprise to many who are currently deeply engaged in research and development activities that could lead us to exascale computing that it has already been exactly six years since the first set of community town hall meetings were convened in the U.S. to discuss the challenges for the next level of computing in science. It was in April and May 2007 that three meetings were held in Berkeley, Argonne, and Oak Ridge, forming the basis for the first comprehensive look at exascale [1].

Invited Talk General Session 5
Chair: David Hancock (Indiana University)

Invited Talk General Session 9
Chair: Nick Cardo (National Energy Research Scientific Computing Center)

Big Bang, Big Data, Big Iron – Analyzing Data From The Planck Satellite Mission
Julian Borrill (Lawrence Berkeley National Laboratory)
Abstract: On March 21st, 2013, the European Space Agency announced the first cosmology results from its billion-dollar Planck satellite mission. The culmination of 20 years of work, Planck's observations of the Cosmic Microwave Background – the faint echo of the Big Bang itself – provide profound insights into the foundations of cosmology and fundamental physics.

Invited Talk General Session 10
Chair: David Hancock (Indiana University)

Introduction and CUG 2013 Best Paper Award
David Hancock (Indiana University)

The Changing Face of High Performance Computing
Rajeeb Hazra (Intel Corporation)
Abstract: The continuing growth of computer performance is delivering an unprecedented capability to solve increasingly complex problems. This growth in performance, along with the recent explosion of new devices, sensors, and social networks delivering real-time feeds over the web and into datacenters, is causing a flood of data, adding a new challenge in systems, software, and application development for organizations that are looking to convert this data into knowledge.

Invited Talk General Session 12
Chair: David Hancock (Indiana University)

Invited Talk Closing General Session 20
Chair: Nick Cardo (National Energy Research Scientific Computing Center)

Paper Technical Session 6A
Chair: Tina Butler (National Energy Research Scientific Computing Center)

Image Management and Provisioning System Overview
John Hesterberg (Cray Inc.)
Abstract: This document provides an overview of the new Image Management and Provisioning System (IMPS) under development at Cray. IMPS is a new set of features that changes how software is installed, managed, provisioned, booted, and configured on Cray systems. It focuses on adopting common industry tools and procedures where possible, combined with scalable Cray technology, to produce an enhanced solution ultimately capable of effectively supporting all Cray systems, from the smallest to the largest.

Paper Technical Session 6B
Chair: Jason Hill (Oak Ridge National Laboratory)

Instrumenting IOR to Diagnose Performance Issues on Lustre File Systems
Doug J. Petesch and Mark S. Swan (Cray Inc.)
Abstract: Large Lustre file systems are made of thousands of individual components, all of which have to perform nominally to deliver the designed I/O bandwidth. When the measured performance of a file system does not meet expectations, it is important to identify the slow pieces of such a complex infrastructure quickly. This paper will describe how Cray has instrumented IOR (a popular I/O benchmark program) to automatically generate pictures that show the relative performance of the many OSTs, servers, LNET routers, and other components involved. The plots have been used to diagnose many unique problems with Lustre installations at Cray customer sites.

Taking Advantage of Multicore for the Lustre Gemini LND Driver
James A. Simmons (Oak Ridge National Laboratory) and John Lewis (Cray Inc.)
Abstract: High performance computing systems have long embraced the move to multi-core processors, but parts of the operating system stack have only recently been optimized for this scenario. Lustre improved its performance on high core-count systems by keeping related work on a common set of cores, though low-level network drivers must be adapted to the new API. The multi-threaded Lustre network driver (LND) for the Cray Gemini high-speed network improved performance over its single-threaded implementation, but did not employ the benefits of the new API. In this paper, we describe the advantages of the new API and the performance gains achieved by modifying the Gemini LND to use it.

A file system utilization metric for I/O characterization
Andrew Uselton and Nicholas Wright (Lawrence Berkeley National Laboratory)
Abstract: Today, an HPC platform's "scratch" file system typically represents 10-20% of its cost. However, disk performance is not keeping up with gains in processors, so keeping the same relative I/O performance will require an increasingly larger fraction of the budget. It is therefore important to understand the I/O workload of HPC platforms in order to provision the file system correctly. Although it is relatively straightforward to measure the peak bandwidth of a file system, this accounts for only part of the overall load: the size of individual I/O transactions strongly affects performance. In this work we introduce a new metric for file system utilization that accounts for such effects and provides a better view of the overall load on the file system. We present a description of our model, our work to calibrate it, and early results from the file systems at NERSC. (An illustrative toy sketch of this size dependence appears below.)

Paper Technical Session 6C
Chair: Craig Stewart (Indiana University)

The Cray Programming Environment: Current Status and Future Directions
Luiz DeRose (Cray Inc.)
Abstract: The scale of current and future high end systems, as well as the increasing system software and architecture complexity, brings a new set of challenges for application developers. In order to achieve high performance on peta-scale systems, application developers need a programming environment that can address and hide the issues of scale and complexity of high end HPC systems. Users must be supported by intelligent compilers and runtime systems, automatic performance analysis tools, adaptive libraries, and debugging and porting tools. Moreover, this programming environment must be capable of supporting millions of processing elements in a heterogeneous environment. In this talk I will present the recent activities and future directions of the Cray Programming Environment, which are being developed and deployed according to Cray's adaptive supercomputing strategy to improve users' productivity on Cray supercomputers.
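Referring back to "A file system utilization metric for I/O characterization" above: the sketch below is a hypothetical, minimal cost model (not the authors' calibrated model) showing why a size-blind bandwidth fraction can understate load when I/O transactions are small. The peak bandwidth and per-transaction overhead constants are invented for the example.

```c
#include <stdio.h>

/* Invented constants for the toy model. */
#define PEAK_BW  35.0e9   /* hypothetical peak bandwidth, bytes/s       */
#define OVERHEAD 0.5e-3   /* hypothetical fixed cost per transaction, s */

/* Modeled time the file system is "busy" serving one transaction. */
static double service_time(double bytes)
{
    return OVERHEAD + bytes / PEAK_BW;
}

/* Fraction of a wall-clock window consumed by equal-sized transactions
 * moving `volume` bytes in total. */
static double utilization(double volume, double txn_bytes, double window)
{
    double n_txns = volume / txn_bytes;
    return n_txns * service_time(txn_bytes) / window;
}

int main(void)
{
    double volume = 1.0 * 1024 * 1024 * 1024;  /* 1 GiB moved ...     */
    double window = 10.0;                      /* ... over 10 seconds */

    /* Size-blind view: delivered bandwidth as a fraction of peak. */
    printf("bandwidth fraction (size-blind): %.4f\n",
           (volume / window) / PEAK_BW);

    /* The same volume moved as 64 MiB transfers vs. 64 KiB transfers. */
    printf("modeled utilization, 64 MiB txns: %.4f\n",
           utilization(volume, 64.0 * 1024 * 1024, window));
    printf("modeled utilization, 64 KiB txns: %.4f\n",
           utilization(volume, 64.0 * 1024, window));
    return 0;
}
```

Under this toy model the delivered-bandwidth view reports a nearly idle file system for both workloads, while the 64 KiB workload keeps the file system busy for most of the window; that qualitative gap is the effect a size-aware utilization metric is meant to capture.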
Enhancements to the Cray Performance Measurements and Analysis Tools
Heidi Poxon (Cray Inc.)
Abstract: The Cray Performance Measurement and Analysis Tools offer performance measurement and analysis feedback for applications running on Cray multi-core and hybrid computing systems. As with any tool, using the Cray performance analysis toolset involves a learning curve. Recent work focuses on a new interface to obtain basic application performance information for users not familiar with the Cray performance tools. CrayPat-lite has been developed to provide performance statistics at the end of a job by simply loading a modulefile. After a program completes execution, output such as job size, wallclock time, MFLOPS, and the top time-consuming routines is automatically presented through stdout. Modifications to the "classic" performance tools interface have also been made to unify the two paths so that users who start with CrayPat-lite can easily transition to using CrayPat. This paper presents the CrayPat-lite enhancement to the toolset.

Cray Compiling Environment Update
Suzanne LaCroix and James Beyer (Cray Inc.)
Abstract: The Cray Compiling Environment (CCE) has evolved over the last several years to support high performance computing needs on Cray systems. New system architectures, new language standards, and ever-increasing performance and scaling requirements have driven this change. This talk will present an overview of current CCE capabilities and recently added features. Future plans and challenges will also be discussed.

Paper Technical Session 7A
Chair: Jeff Broughton (NERSC/LBNL)

New Member Talk: iVEC and the Pawsey Centre
Charles Schwartz (iVEC)
Abstract: The Pawsey Centre is a supercomputing facility being built in Kensington, Western Australia, to be operated by iVEC, an unincorporated joint venture of four public universities and CSIRO. It is a research facility, specialising in radio astronomy and geosciences, but available to the larger Australian academic research community as well.

The Evolution of Cray Management Services
Tara Fly, Alan Mutschelknaus, Andrew Barry and John Navitsky (Cray Inc.)
Abstract: Cray Management Services is quickly evolving to address the changing nature of Cray systems. NodeKares adds advanced features to support gang scheduling, reservation and application-level health checking, as well as other serviceability features. Lightweight Log Manager provides more complete and standardized log collection. Modular xtdumpsys will provide an extensible framework for system dumping. Resource Utilization Reporting provides a scalable, extensible framework for data collection, including power management, GPU utilization, and application resource utilization data. This paper presents these new features, including configuration, migration, and benefits.

CRAY XC30 Installation – A System Level Overview
Nicola Bianchi, Colin McMurtrie and Sadaf Alam (Swiss National Supercomputing Centre)
Abstract: In this paper we detail the installation of the 12-cabinet Cray XC30 system at the Swiss National Supercomputing Centre (CSCS). At the time of writing this is the largest such system worldwide, and hence the system-level challenges of this latest-generation Cray platform will be of interest to other sites. The intent is to present a systems and facilities point of view on the Cray XC30 installation and operational setup, and to identify key differences between the Cray XC30 and previous-generation Cray systems such as the Cray XE6. We identify key system configuration options and challenges when integrating the entire machine ecosystem into a complex operational environment: Sonexion 1600 Lustre storage appliance management and tuning, Lustre fine-grained routing, esLogin cluster installation and management using Bright Cluster Manager, IBM GPFS integration, Slurm installation, facility management, and network considerations.

Cray External Services Systems Overview
Harold Longley and Jeff Keopp (Cray Inc.)
Abstract: Cray External Services systems expand the functionality of the Cray XE/XK and Cray XC systems by providing more powerful external login (esLogin) nodes and an external Lustre file system (esFS). A management server (esMS) provides administration and monitoring functions as well as node provisioning and automated Lustre failover for the external file system. The esMS is available in a single-server or high-availability configuration. A great advantage of these systems is that the external Lustre file system remains available to the external login nodes regardless of the state of the Cray XE/XK or Cray XC system. External login nodes are the standard login node on Cray XC systems.

Paper Technical Session 7B
Chair: Andrew Uselton (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center)

Architecting Resilient Lustre Storage Solution
John Fragalla (Xyratex)
Abstract: The concept of scratch HPC storage is quickly becoming less critical than the importance of high availability (HA) and reliability. In this presentation, Xyratex discusses architecting a resilient and reliable Lustre storage solution to increase availability and eliminate downtime within HPC environments for continual data access. Xyratex will discuss how solutions based on ClusterStor technologies address the architectural challenges of HA and reliability without sacrificing performance, protecting against hardware faults, power failures, data loss, and potential software issues through tight integration, thorough test processes, and an integrated Lustre storage platform. Xyratex will also describe its extensive disk drive testing at multiple stages to reduce disk failures and decrease annual failure rates (AFR); the benefits of providing options for live software patches, updates, and revisions by using failover and failback procedures; and the overall Xyratex ClusterStor based solution, which leverages these concepts within its design.

Sonexion 1600 I/O Performance
Nicholas P. Cardo (National Energy Research Scientific Computing Center)
Abstract: The Sonexion 1600 is the latest in Cray's storage products. An investigative look into the I/O performance of the new devices yields insights into the expected performance. Various I/O scenarios are explored by varying the number of readers and writers to files along with differing I/O patterns. These tests explore the performance characteristics of individual OSTs as well as the aggregate for the file system. Metadata performance is also investigated for creates, unlinks, and stats. In both cases, metadata and data, the investigation attempts to identify the sustained and peak performance of the Sonexion 1600. The results can then be used to design a file system on the Sonexion 1600 to achieve desired I/O performance.

OLCF's 1 TB/s, next-generation Spider file system
David Dillow, Sarp Oral, Douglas Fuller, Jason Hill, Dustin Leverman, Sudharshan Vazhkudai, Feiyi Wang, Kim Youngjae, James H. Rogers, James Simmons and Ross G. Miller (Oak Ridge National Laboratory)
Abstract: The Oak Ridge Leadership Computing Facility (OLCF) at Oak Ridge National Laboratory (ORNL) has a long history of deploying the world's fastest supercomputers to enable open science. At the time it was deployed in 2008, the Spider file system had a formatted capacity of 10 PB and sustained transfer speeds of 240 GB/s, which made it the fastest Lustre file system in the world. However, the addition of Titan, a 27 PFLOPS Cray XK7 system, along with other OLCF computational resources, has radically increased the I/O demand beyond the capabilities of the existing Spider parallel file system. The next-generation Spider Lustre file system is designed to provide 32 PB of capacity to open science users at OLCF, at an aggregate transfer rate of 1 TB/s. This paper details the architecture, design choices, and configuration of the next-generation Spider file system at OLCF.

Paper Technical Session 7C
Chair: Sadaf R. Alam (CSCS)

Optimizing GPU to GPU Communication on Cray XK7
Jeff M. Larkin (NVIDIA)
Abstract: When developing an application for Cray XK7 systems, optimization of compute kernels is only a small part of maximizing scaling and performance. Programmers must consider the effect of the GPU's distinct address space and the PCIe bus on application scalability. Without such considerations, applications rapidly become limited by transfers to and from the GPU and fail to scale to large numbers of nodes. This paper will demonstrate methods for optimizing GPU to GPU communication and present XK7 results for these methods.

Debugging and Optimizing Programs Accelerated with Intel® Xeon® Phi™ Coprocessors
Chris Gottbrath (Rogue Wave Software)
Abstract: Intel® Xeon® Phi™ coprocessors present an exciting opportunity for Cray users to take advantage of many-core processor technology. Since the Intel Xeon Phi coprocessor shares many architectural features and much of the development tool chain with multi-core Intel Xeon processors, it is generally fairly easy to get a program running on the Intel Xeon Phi coprocessor. However, taking full advantage of the Intel Xeon Phi coprocessor requires expressing a level of parallelism that may require significant re-thinking of algorithms. Scientists need tools that allow them to debug and optimize hybrid MPI/OpenMP parallel applications that may have dozens or even hundreds of threads per node.

Portable and Productive Performance on Hybrid System with OpenACC Compilers and Tools
Luiz DeRose (Cray Inc.)
Abstract: The current trend in the supercomputing industry is to provide hybrid systems with accelerators attached to multi-core processors. Some of the critical hurdles for the widespread adoption of accelerated computing in high performance computing are portability and programmability. In order to facilitate the migration to hybrid systems with accelerators attached to CPUs, users need a simple programming model that is portable across machine types. Moreover, to allow users to maintain a single code base, this programming model, and the required optimization techniques, should not be significantly different for "accelerated" nodes from the approaches used on current multi-core x86 processors.

Tesla vs Xeon Phi vs Radeon: A Compiler Writer's Perspective
Brent Leback, Douglas Miles and Michael Wolfe (The Portland Group)
Abstract: Today, most CPU+Accelerator systems incorporate NVIDIA GPUs. Intel Xeon Phi and the continued evolution of AMD Radeon GPUs make it likely we will soon see, and have to program, a wider variety of CPU+Accelerator systems. PGI already supports NVIDIA GPUs and is working to add support for Xeon Phi and AMD Radeon. This talk explores the features common to all three types of accelerators, those unique to each, and the implications for programming models and performance portability from a compiler writer's and applications perspective.

Paper Technical Session 13A
Chair: Douglas W. Doerfler (Sandia National Laboratories)

SeaStar Unchained: Multiplying the Performance of the Cray SeaStar Network
David A. Dillow and Scott Atchley (Oak Ridge National Laboratory)
Abstract: The Cray SeaStar ASIC, with its programmable embedded processor, provides an excellent platform to investigate the properties of various network protocols and programming interfaces. This paper describes our native implementation of the Common Communication Interface (CCI) on the SeaStar platform and details how we implemented full operating system (OS) bypass for common operations. We demonstrate a 30% to 50% reduction in latency, more than a six-fold increase in message injection rate, and an almost 7x improvement in bandwidth for small message sizes when compared to the generic Cray Portals implementation.

Intel Multicore, Manycore, and Fabric Integrated Parallel Computing
Jim Jeffers (Intel Corporation)
Abstract: Dramatic increases in node-level parallelism are here with the introduction of many-core Intel® Xeon Phi™ coprocessors along with the continued generational core increases in multi-core Intel® Xeon® processors. Jim will discuss the impacts on software development for these platforms and the important considerations for scaling highly parallel applications both within the node and across clusters. He will also discuss Intel's current network fabric products and the future directions Intel is pursuing to address the next critical challenge: efficient internode communications for the next generation of HPC platforms.

Understanding the Impact of Interconnect Failures on System Operation
Matthew A. Ezell (Oak Ridge National Laboratory)
Abstract: Hardware failures are inevitable on large high performance computing systems. Faults or performance degradations in the high-speed network can reduce the entire system's performance. Since the introduction of the Gemini interconnect, Cray systems have become resilient to many networking faults. These new network reliability and resiliency features have enabled higher uptimes on Cray systems by allowing them to continue running with reduced network performance. Oak Ridge National Laboratory has developed a set of user-level diagnostics that stress the high-speed network and search for components that are not performing as expected. Nearest-neighbor bandwidth tests check every network chip and network link in the system. Additionally, performance counters stored in the network ASIC's memory-mapped registers (MMRs) are used to get a fuller picture of the state of the network. Applications have also been characterized under various suboptimal network conditions to better understand what impact network problems have on user codes.
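To give a flavour of the kind of user-level diagnostic described in the abstract above, the sketch below is a minimal MPI bandwidth probe between paired ranks. It is an illustration only, not the ORNL diagnostic suite: the pairing here is simply even/odd rank order, whereas a real nearest-neighbor test would pair ranks by their physical location on the network.

```c
/* Minimal nearest-neighbor bandwidth probe: even ranks stream a buffer
 * to the next rank and report the sustained point-to-point bandwidth of
 * that pair.  Illustrative only; a production diagnostic would pair
 * ranks by their physical network location, not by rank order. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define MSG_BYTES (8 * 1024 * 1024)   /* 8 MiB per message */
#define REPS      50

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    char *buf = malloc(MSG_BYTES);
    int partner = (rank % 2 == 0) ? rank + 1 : rank - 1;
    int active  = (partner >= 0 && partner < size);

    MPI_Barrier(MPI_COMM_WORLD);          /* start all pairs together */
    double t0 = MPI_Wtime();
    for (int i = 0; active && i < REPS; i++) {
        if (rank % 2 == 0)
            MPI_Send(buf, MSG_BYTES, MPI_CHAR, partner, 0, MPI_COMM_WORLD);
        else
            MPI_Recv(buf, MSG_BYTES, MPI_CHAR, partner, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
    }
    double elapsed = MPI_Wtime() - t0;

    if (active && rank % 2 == 0)          /* one report per pair */
        printf("ranks %d-%d: %.1f MB/s\n", rank, partner,
               REPS * (double)MSG_BYTES / elapsed / 1.0e6);

    free(buf);
    MPI_Finalize();
    return 0;
}
```

A pair that reports markedly lower bandwidth than its peers points at a suspect link or router for closer inspection.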
Paper Technical Session 13B
Chair: Jason Hill (Oak Ridge National Laboratory)

The Changing Face of Storage for Exascale
Brent Gorda (Intel Corporation)
Abstract: Cray joins Intel (Whamcloud), The HDF Group, EMC, and DDN as partners in the US Department of Energy FastForward program, which is aimed at spurring research in key technologies for exascale. This two-year program is mostly research, but does have proof-of-concept (and open source) code delivery attached. As we near the halfway point in the program, we will present the big exascale picture, progress to date, and the view of the path forward at this point in time.

Cray's Implementation of LNET Fine Grained Routing: Overview and Characteristics
Mark S. Swan (Cray Inc.) and Nic Henke (Xyratex)
Abstract: As external Lustre file systems become larger and more complicated, configuring the Lustre network transport layer (LNET) can also become more complicated. This paper will focus on where Fine Grained Routing (FGR) came from, why Cray uses FGR, tools Cray has developed to aid in FGR configurations, analysis of FGR schemes, and performance characteristics.

Discovery in Big Data using a Graph Analytics Appliance
Amar Shan and Ramesh Menon (Cray Inc.)
Abstract: Discovery, the uncovering of hidden relationships and unknown patterns, lies at the heart of advancing knowledge. Discovery has long been viewed as the province of human intellect, with automation difficult. However, things have to change: the explosion of Big Data has made automating the synthesis of insight from raw data mandatory.

Paper Technical Session 13C
Chair: Helen He (National Energy Research Scientific Computing Center)

Using the Cray Gemini Performance Counters
Kevin Pedretti, Courtenay Vaughan, Richard Barrett, Karen Devine and K. Scott Hemmert (Sandia National Laboratories)
Abstract: This paper describes our experience using the Cray Gemini performance counters to gain insight into the network resources being used by applications. The Gemini chip consists of two network interfaces and a common router core, each providing an extensive set of performance counters. Based on our experience, we have found some of these counters to be more enlightening than others. More importantly, we have performed a set of controlled experiments to better understand what the counters are actually measuring. These experiments led to several surprises, described in this paper. This supplements the documentation provided by Cray and is essential information for anybody wishing to make use of the Gemini performance counters. The MPI library and associated tools that we have developed for gathering Gemini performance counters are described and are available to other Cray users as open-source software.

Performance Measurements of the NERSC Cray Cascade System
Harvey J. Wasserman, Nicholas J. Wright, Brian M. Austin and Matthew J. Cordery (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center)
Abstract: We present preliminary performance results for NERSC's "Edison" system, one of the first Cray XC30 supercomputers. The primary new feature of the XC30 architecture is the Cray Aries interconnect. We use several network-centric microbenchmarks to measure the Aries' substantial improvements in bandwidth, latency, message rate, and scalability. The distinctive contribution of this work consists of performance results for the NERSC Sustained System Performance (SSP) application benchmarks. The SSP benchmarks span a wide range of science domains, algorithms, and implementation choices, and provide a more holistic performance metric. We examine the performance and scalability of these benchmarks on the XC30 and compare performance with other state-of-the-art HPC platforms. Edison nodes are composed of two 8-core Intel "Sandy Bridge" processors, with two hyperthreads per core. With 32 hardware threads per node, multi-threading is essential for optimal performance. We report the OpenMP, core-specialization, and hyperthreading settings that maximize SSP on the XC30.

From thousands to millions: visual and system scalability for debugging and profiling
Mark O'Connor, David Lecomber, Ian Lumb and Jonathan Byrd (Allinea Software)
Abstract: Behind the achievements of double-digit petaflop counts, million-core systems, and sustained petaflop real-world applications, software tools have been the silent unsung heroes. Whether aiding software migration or solving a critical acceptance bug, tools such as Allinea DDT have been ready. We will explore how Allinea DDT has been prepared for today's hybrid Cray XK7s and the Cray XC30s.

Paper Technical Session 14A
Chair: Ashley Barker (Oak Ridge National Laboratory)

Investigating Topology Aware Scheduling
David Jackson (Adaptive Computing)
Abstract: For many years, HPC networks have been able to assume good support for all-to-all communications, meaning that no matter how workloads were placed across the network, the application would experience maximum performance. While all networks have some limitations associated with their underlying hardware and topology, the difference between the best possible allocation and the worst was often small enough to be in the realm of statistical noise, and thus any associated issues were generally ignored. Now, as systems and workloads grow into the petascale and exascale range, the communication within an application becomes massive, and the difference between best-case and worst-case allocations becomes significant. The differences between one placement decision and another can now noticeably impact application efficiency and job run-time consistency, and even impact neighboring workloads.

External Torque / Moab and Fairshare on the Cray XC30
Tina Declerck and Iwona Sakrejda (National Energy Research Scientific Computing Center/Lawrence Berkeley National Laboratory)
Abstract: NERSC's new Cray XC30, Edison, utilizes a new capability in Adaptive Computing's Torque 4.x and Moab 7.x products which allows the Torque server and Moab to execute external to the mainframe. This configuration offloads the mainframe server database and provides a unified view of the workload. Additionally, it allows job submissions when the mainframe is unavailable or offline. This paper discusses the configuration process, differences between the old and new methods, troubleshooting techniques, fairshare experiences, and user feedback. While this capability addresses some of the needs of the NERSC community, it is not without tradeoffs and challenges.

Production Experiences with the Cray-Enabled TORQUE Resource Manager
Matthew A. Ezell and Don Maxwell (Oak Ridge National Laboratory) and David Beer (Adaptive Computing)
Abstract: High performance computing resources utilize batch systems to manage the user workload. Cray systems are uniquely different from typical clusters due to Cray's Application Level Placement Scheduler (ALPS). ALPS manages binary transfer, job launch and monitoring, and error handling. Batch systems require special support to integrate with ALPS using an XML protocol called BASIL.

Paper Technical Session 14B
Chair: Steve Simms (Indiana University)

Evaluation of A Flash Storage Filesystem on the Cray XE-6
Jay Srinivasan and Shane Canon (Lawrence Berkeley National Laboratory)
Abstract: This paper will discuss some of the approaches and show early results for a flash file system mounted on a Cray XE-6 using high-performance PCIe-based cards. We also discuss some of the gaps and challenges in integrating flash into HPC systems and potential mitigations, as well as new solid state storage technologies and their likely role in the future.

Analysis of the Blue Waters File System Architecture for Application I/O Performance
Kalyana Chadalavada and Robert Sisneros (National Center for Supercomputing Applications, University of Illinois)
Abstract: The NCSA Blue Waters system features one of the fastest file systems for scientific applications. Using Lustre file system technology, Blue Waters provides over 1 TB/s of usable storage bandwidth. The underlying storage units are connected to the compute nodes in a unique fashion: the Blue Waters file system connects a subset of storage units to the high-speed torus network at distinct points. Utilizing standard benchmarks and scientific applications, we examine the impact of this architecture on application I/O performance. Given the size of the system and its intended applications, scaling I/O performance will be a challenge. Identifying the optimal I/O methodology can help alleviate a large number of application performance issues. All exercises are done in a production environment to ensure that beneficial results are directly applicable to Blue Waters users.

Trillion Particles, 120,000 cores, and 350 TBs: Lessons Learned from a Hero I/O Run on Hopper
Suren Byna and Andrew Uselton (Lawrence Berkeley National Laboratory), Prabhat (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center), David Knaak (Cray Inc.) and Yun (Helen) He (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center)
Abstract: Modern petascale applications can present a variety of configuration, runtime, and data management challenges when run at scale. In this paper, we describe our experiences in running a large-scale plasma physics simulation, called VPIC, on the NERSC Hopper Cray XE6 system. The simulation ran on 120,000 cores using ~80% of computing resources, 90% of the available memory on each node, and 50% of a Lustre file system. Over two trillion particles were simulated for 23,000 timesteps, and 10 one-trillion-particle dumps, each ranging between 30 and 42 TB, were written to HDF5 files at a sustained rate of ~27 GB/s. To the best of our knowledge, this job represents the largest I/O undertaken by a NERSC application and the largest collective writes to single HDF5 files. We outline several obstacles that we overcame in the process of completing this run, and list lessons learned that are of potential interest to HPC practitioners.

Paper Technical Session 14C
Chair: Helen He (National Energy Research Scientific Computing Center)

Performance Comparison of Scientific Applications on Cray Architectures
Haihang You, Reuben D. Budiardja, Jeremy Logan, Lonnie D. Crosby, Vincent Betro, Pragneshkumar Patel, Bilel Hadri and Mark Fahey (National Institute for Computational Sciences)
Abstract: Current HPC architectures are changing drastically and rapidly while mature scientific applications usually evolve at a much slower rate. New architectures almost certainly impact the performance of these heavily used scientific applications. Therefore, it is prudent to understand how the supposed performance benefits and improvements of new architectures translate to the applications. In this paper, we attempt to quantify the differences between theoretical performance improvements (due to changes in architecture) and "real-world" improvements in applications by gathering performance data for selected applications from the fields of chemistry, climate, weather, materials science, fusion, and astrophysics running on three different Cray architectures: XT5, XE6, and XC30. The performance evaluations of these selected applications on these three architectures may give the user perspective into the potential benefits of each architecture. These evaluations are done by comparing the improvements of numerical (micro)benchmarks to the improvements of the selected applications when run on these architectures.

First 12-cabinets Cray XC30 System at CSCS: Scaling and Performance Efficiencies of Applications
Sadaf Alam, Themis Athanassiadou, Tim Robinson, Gilles Fourestey, Andreas Jocksch, Luca Marsella, Jean-Guillaume Piccinali and Jeff Poznanovic (Swiss National Supercomputing Centre)
Abstract: CSCS has recently deployed one of the largest Cray XC30 systems, which is composed of 6 groups, or 12 cabinets, of dual-socket Intel Sandy Bridge processors and the new Aries network chips with a dragonfly topology. With respect to earlier Cray XT and XE series platforms, the Cray XC30 has several unique features that have the potential to affect application performance: (1) Intel Xeon vs. AMD Opteron based nodes; (2) Aries vs. Gemini network and router chip; (3) PCIe vs. HyperTransport interface to the network chip; (4) dragonfly vs. 3D torus topology; (5) mixed optical and copper vs. all-copper cables; (6) number of compute nodes per communication NIC; (7) Hyper-Threading enabled nodes; and (8) compute cabinet layouts. In this report, we compare scaling and performance efficiencies of a range of applications on the CSCS Cray XC30 and Cray XE6 platforms.

Effects of Hyper-Threading on the NERSC workload on Edison
Zhengji Zhao, Nicholas J. Wright and Katie Antypas (Lawrence Berkeley National Laboratory/National Energy Research Scientific Computing Center)
Abstract: Edison, a Cray XC30, is NERSC's newest petascale supercomputer. Along with the Aries interconnect, Hyper-Threading (HT) is one of the new technologies available on the system. HT provides simultaneous multithreading capability on each core, with two hardware threads available. In this paper, we analyze the potential benefits of HT for the NERSC workload by investigating the performance implications of HT on a few selected applications among the top 15 codes at NERSC, which represent more than 60% of the workload. By connecting the observed HT results with more detailed profiling data, we discuss whether it is possible to predict how and when users should utilize HT in their production computations on Edison.
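A back-of-the-envelope way to explore the question raised in the Hyper-Threading abstract above is to time the same threaded kernel once with one thread per core and once with two. The sketch below is such a toy probe, written for this summary rather than taken from the paper; the kernel, array sizes, and run parameters are all invented.

```c
/* Toy OpenMP probe for comparing runs with and without Hyper-Threading:
 * run once with one thread per core and once with two, then compare the
 * reported times.  Illustrative only; the paper evaluates full NERSC
 * applications, not a micro-kernel like this. */
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

#define N (1L << 25)   /* ~33 M elements, ~256 MiB per array */

int main(void)
{
    double *a = malloc(N * sizeof *a);
    double *b = malloc(N * sizeof *b);
    double sum = 0.0;

    for (long i = 0; i < N; i++) { a[i] = 1.0; b[i] = 2.0; }

    double t0 = omp_get_wtime();
    #pragma omp parallel for reduction(+ : sum)
    for (long i = 0; i < N; i++)
        sum += a[i] * b[i];
    double t1 = omp_get_wtime();

    printf("threads=%d  dot=%g  time=%.3f s\n",
           omp_get_max_threads(), sum, t1 - t0);

    free(a);
    free(b);
    return 0;
}
```

On an XC30 node one might launch this once with 16 threads and once with 32 (for example via aprun's -d and -j options, subject to site documentation) and compare the reported times; memory-bandwidth-bound kernels such as this one typically gain little from the second hardware thread, whereas latency-bound kernels can benefit.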
Paper Technical Session 15A
Chair: Craig Stewart (Indiana University)

Preparing Slurm for use on the Cray XC30
Stephen Trofinoff and Colin McMurtrie (Swiss National Supercomputing Centre)
Abstract: In this paper we describe the technical details associated with the preparation of Slurm for use on the 12-cabinet XC30 system installed at the Swiss National Supercomputing Centre (CSCS). The system comprises internal and external login nodes and a new ALPS/BASIL version, so a number of technical challenges needed to be overcome in order to have Slurm working on the system. Thanks to a Cray-supplied emulator of the system interface, work was possible ahead of delivery, and this eased the installation when the system arrived. However, some problems were encountered, and their identification and resolution are described in detail. We also provide detail of the work done to improve the Slurm task affinity bindings on a general-purpose Linux cluster so that they match the Cray bindings as closely as possible, thereby providing our users with some degree of consistency in application behaviour between these systems.

Lessons From 20 Continuous Years of Cray/HPC Systems
Liam Forbes, Don Bahls, Gene McGill, Oralee Nudson and Gregory Newby (Arctic Region Supercomputing Center, UAF)
Abstract: The Arctic Region Supercomputing Center (ARSC) was founded in 1992/1993 with a Cray Y-MP (denali) and since then has operated or owned at least one Cray system, including most recently a Cray XK6m-200 (fish). For 20 years, ARSC has shared high performance computing (HPC) experiences, users, and problems with other university HPC centers, DoD HPC centers, and DoE HPC centers. In this paper, we document and present the user support and system administration lessons we have learned from the perspective of a smaller, regional university HPC center operating and supporting the same architectures as some of the largest systems in the world over that time. Comparisons to experiences with HPC hardware and software products from other vendors will be used to illustrate some of the points.

Cray Workload Management with PBS Professional 12.0
Scott Suchyta and Sam Goosen (Altair Engineering, Inc.)
Abstract: Changing requirements, trends, and technologies in HPC are frequent, and workload managers like PBS Professional must continually evolve to accommodate them. One challenge sites have faced has been configuring PBS to address their individual requirements. Site-defined custom resources and configurable scheduling policies were introduced to help accomplish this, but are insufficient to address more complex scenarios. A more robust infrastructure is required to manage the dynamic resources and policies that are unique to modern HPC sites. Our discussion will include examples that customers may wish to adopt or customize to address their specific needs, including admission control, allocation management, and on-the-fly tuning. Independent of plugins, PBS Professional supports the multithreaded processors available on current Cray platforms. Additional enhancements will become available when integration with BASIL version 1.3 is complete. In the interim, details about configuring these systems for use with PBS Professional 12.0 will be presented.

Paper Technical Session 15B
Chair: Robert Henschel (Indiana University)

Introduction to HSA Hardware, Software and HSAIL with A HPC Usage Example
Vinod Tipparaju (AMD, Inc.)
Abstract: Heterogeneous systems have been around for several years, and accelerator-based heterogeneous systems (CPU-GPU) have become popular in the last five years. In particular, accelerating general-purpose computation using GPUs is gaining momentum in both academic research and industry. OpenCL and CUDA are the two most popular programming models that enable end-application programmers to take advantage of the GPGPU through the compiler, runtime, and driver tool chain. While the opportunity of GPGPU has been opened up to expert programmers, it has not yet reached a broad audience, primarily for the following reasons: (i) the CPU-GPU system has a distributed, asymmetric memory that needs to be explicitly managed for coherency and synchronization; (ii) memory copies and kernel dispatch involve two-way, high-latency transfers; and (iii) there is a lack of support for dynamic scheduling or load balancing, advanced debugging, system calls, exception handling, etc.

Reliable Computation Using Unpredictable Components
Joel O. Stevenson, Robert A. Ballance, Suzanne M. Kelly, John P. Noe and Jon R. Stearley (Sandia National Laboratories) and Michael E. Davis (Cray Inc.)
Abstract: Based on our experiences over the last year running large simulations on the DOE/ASC platform Cielo, we will discuss strategies that enable large, long-running simulations to make predictable progress despite platform component failures. From an application perspective, complex systems like Cielo have multiple sources of interrupts and slowdowns that combine to make the system appear unpredictable. We will discuss the component failures observed and identify those where application recovery has been possible.

Requirements Analysis for Adaptive Supercomputing using the Cray XK7 as a Case Study
Sadaf R. Alam, Mauro Bianco, Ben Cumming, Gilles Fourestey, Jeffrey Poznanovic and Ugo Varetto (Swiss National Supercomputing Centre)
Abstract: In this report, we analyze the readiness of the code development and execution environment for adaptive supercomputers, where a processing node is composed of heterogeneous computing and memory architectures. Current instances of such a system are Cray XK6 and XK7 compute nodes, which are composed of x86_64 CPU and NVIDIA GPU devices and DDR3 and GDDR5 memories, respectively. Specifically, we focus on the integration of the CPU and accelerator programming environments, tools, MPI, and numerical libraries, as well as operational features such as resource monitoring, system maintainability, and upgradability. We highlight portable, platform-independent technologies that exist for the Cray XE, XK, and XC30 platforms and discuss dependencies in the CPU, GPU, and network tool chains that lead to current challenges for integrated solutions. This discussion enables us to formulate requirements for a future adaptive supercomputing platform, which could contain a diverse set of node architectures.

Paper Technical Session 15C
Chair: Douglas W. Doerfler (Sandia National Laboratories)

Improving the Performance of the PSDNS Pseudo-Spectral Turbulence Application on Blue Waters using Coarray Fortran and Task Placement
Robert A. Fiedler, Nathan Wichmann and Stephen Whalen (Cray Inc.) and Dmitry Pekurovsky (San Diego Supercomputer Center)
Abstract: The PSDNS turbulence application performs many 3D FFTs per time step, which entail frequently transposing distributed 3D arrays. These transposes are achieved via multiple concurrent all-to-all communication operations, which dominate the overall execution time at large scales. We improve the all-to-all times for benchmarks on 3072 to 12288 nodes using three main strategies: 1) eliminating off-node communication for one of the two sets of transposes by assigning one sheet of the 3D Cartesian grid to each node (35% speedup); 2) placing tasks on nodes that are distributed randomly throughout the Gemini network in order to maximize the all-to-all bandwidth that can be utilized by the job's nodes (21% speedup); and 3) reducing contention and overhead by replacing calls to MPI_Alltoall with a drop-in library written in Coarray Fortran (33% speedup). We also describe how this library is implemented and integrated efficiently in PSDNS.

A Review of The Challenges and Results of Refactoring the Community Climate Code COSMO for Hybrid Cray HPC Systems
Benjamin Cumming (Swiss National Supercomputing Centre), Carlos Osuna (Center for Climate Systems Modeling ETHZ), Tobias Gysi (Supercomputing Systems AG), Mauro Bianco (Swiss National Supercomputing Centre), Xavier Lapillonne and Oliver Fuhrer (Federal Office of Meteorology and Climatology MeteoSwiss) and Thomas C. Schulthess (ETH Zurich)
Abstract: We summarize the results of porting the numerical weather simulation code COSMO to different hybrid Cray HPC systems. COSMO was written in Fortran with MPI, and the aim of the refactoring was to support both many-core systems and GPU-accelerated systems with minimal disruption to the user community. With this in mind, different approaches were taken to refactor the different components of the code: the dynamical core was refactored with a C++-based domain-specific language for structured grids, which provides both CUDA and OpenMP back ends, and the physical parameterizations were refactored by adding OpenACC and OpenMP directives to the original Fortran code. This report gives a detailed description of the challenges presented by such a large refactoring effort using different languages on Cray systems, along with performance results on three different Cray systems at CSCS: Rosa (XE6), Todi (XK7) and Daint (XC30).

CloverLeaf: Preparing Hydrodynamics Codes for Exascale
Andrew C. Mallinson and David A. Beckingsale (University of Warwick), Wayne P. Gaudin and John A. Herdman (Atomic Weapons Establishment), John M. Levesque (Cray Inc.) and Stephen A. Jarvis (University of Warwick)
Abstract: In this work we directly evaluate five candidate programming models for future exascale applications (MPI, MPI+OpenMP, MPI+OpenACC, MPI+CUDA and CAF) using a recently developed Lagrangian-Eulerian explicit hydrodynamics mini-application. The aim of this work is to better inform the exascale planning at large HPC centres such as AWE. Such organisations invest significant resources maintaining and updating existing scientific codebases, many of which were not designed to run at the scale required to reach exascale levels of computation on future system architectures. We present our results and experiences of scaling these different approaches to high node counts on existing large-scale Cray systems (Titan and HECToR). We also examine the effect that improving the mapping between process layout and the underlying machine interconnect topology can have on performance and scalability, as well as highlighting several communication-focused optimisations.
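To make the directive-based models in the CloverLeaf comparison above a little more concrete, the sketch below expresses the same simple cell-centred update once with OpenMP and once with OpenACC. It is written for this summary and is not CloverLeaf source; the array sizes, variable names, and the ideal-gas-style update are invented for illustration.

```c
/* The same cell-centred update under two of the programming models
 * compared in the CloverLeaf study: OpenMP for the host CPU and OpenACC
 * for an attached accelerator.  Illustrative only; not CloverLeaf code. */
#include <stdio.h>

#define NX 2048
#define NY 2048

static double dens[NY][NX], energy[NY][NX], pressure[NY][NX];

/* Host version: threads across cores via OpenMP. */
void eos_openmp(double gamma)
{
    #pragma omp parallel for collapse(2)
    for (int j = 0; j < NY; j++)
        for (int i = 0; i < NX; i++)
            pressure[j][i] = (gamma - 1.0) * dens[j][i] * energy[j][i];
}

/* Accelerator version: same loop body, with OpenACC handling offload
 * and data movement. */
void eos_openacc(double gamma)
{
    #pragma acc parallel loop collapse(2) copyin(dens, energy) copyout(pressure)
    for (int j = 0; j < NY; j++)
        for (int i = 0; i < NX; i++)
            pressure[j][i] = (gamma - 1.0) * dens[j][i] * energy[j][i];
}

int main(void)
{
    for (int j = 0; j < NY; j++)
        for (int i = 0; i < NX; i++) { dens[j][i] = 1.0; energy[j][i] = 2.5; }

    eos_openmp(1.4);
    eos_openacc(1.4);
    printf("p[0][0] = %f\n", pressure[0][0]);
    return 0;
}
```

The point of such a comparison is that the loop body stays identical; what differs between the models is how parallelism is described and, for the accelerator case, how data movement is managed.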
Paper Technical Session 16A
Chair: John Noe (Sandia National Laboratories)

Methods and Results for Measuring Kepler Utilization on a Cray XK7
Jim Rogers (Oak Ridge National Laboratory), Roger Green (NVIDIA) and Kevin Peterson (Cray Inc.)
Abstract: NVIDIA is providing an API as part of their official CUDA 5.5 release branch (R319) that Cray can then use to provide specific and inherent utilization information from the Kepler GPU. NVIDIA and Cray will provide this capability as part of a featured release once the cadence for both the NVIDIA driver and the Cray software release is complete. The intent of the talk is to provide an early description of the driver changes, the API, the Cray interface, and some examples against the Titan workload using a pre-release version of both the NVIDIA driver/API and the Cray accounting software.

Resource Utilization Reporting on Cray Systems
Andrew P. Barry (Cray Inc.)
Abstract: Many Cray customers want to evaluate how their systems are being used, across a variety of metrics. Neither previous Cray accounting tools nor commercial server management software allows the collection of all the desirable statistics with minimal performance impact. Resource Utilization Reporting (RUR) is being developed by Cray to collect statistics on how systems are used. RUR provides a reliable, high-performance framework into which plugins may be inserted, which will collect data about the usage of a particular resource. RUR is configurable, extensible, and lightweight. Cray will supply plugins to support several sets of collected data, which will be useful to a wide array of Cray customers; customers can implement plugins to collect data uniquely interesting to that system. Plugins also support multiple methods to output collected data. Cray expects to release RUR in the second half of 2013.

The Complexity of Arriving at Useful Reports to Aid in the Successful Operation of an HPC Center
Ashley Barker, Adam Carlyle, Chris Fuson, Mitch Griffith and Don Maxwell (Oak Ridge National Laboratory)
Abstract: While reporting may not be the first item to come to mind as one of the many challenges that HPC centers face, it is certainly a task that all of us have to devote resources to. One of the biggest problems with reporting is determining what information is needed in order to make impactful decisions that can influence everything from policies to purchasing decisions. There is also the problem of how frequently to review the data collected. For some data points it is necessary to look at reports on a daily basis, while others are not useful unless examined over longer periods of time. This paper will look at the efforts the Oak Ridge Leadership Computing Facility has taken over the last few years to refine the data that is collected, reported, and reviewed.

Paper Technical Session 16B
Chair: Liz Sim (EPCC, The University of Edinburgh)

Building Balanced Systems for the Cray Datacenter of the Future
Keith Miller (DataDirect Networks)
Abstract: The top computing sites worldwide are faced with unique data access, management, and protection challenges. In this talk, DDN, the leader in massively scalable storage solutions for Big Data applications, will discuss how joint DDN and Cray customers are achieving balanced, highly productive HPC environments today in the face of huge capacity, performance, and reliability requirements, as well as directions in building the Cray data center of the future. The content will include recent developments and the roadmap for DDN block, file, object, and analytics solutions and appliances, and will touch on Lustre performance testing.

Surviving the Life Sciences Data Deluge using Cray Supercomputers
Bhanu Rekapalli and Paul Giblock (National Institute for Computational Sciences)
Abstract: The growing deluge of data in the life science domains threatens to overwhelm computing architectures. This persistent trend necessitates the development of effective and user-friendly computational components for rapid data analysis and knowledge discovery. Bioinformatics, in particular, employs data-intensive applications driven by novel DNA-sequencing technologies, as do the high-throughput approaches that complement proteomics, genomics, metabolomics, and meta-genomics. We are developing massively parallel applications to analyze this rising flood of life sciences data for large-scale knowledge discovery. We have chosen to work with the desktop or cluster based applications most widely used by the scientific community, such as NCBI BLAST, HMMER, DOCK6, and MUSCLE. Our endeavors encompass extending highly scalable parallel applications that scale to tens of thousands of cores on Cray's XT architecture to Cray's next-generation XE, XK, and XC architectures, while also focusing on making them robust and optimized; this will be discussed in this paper.

Early Experience on Crays with Genomic Applications Used as Part of Next Generation Sequencing Workflow
Mikhail Kandel (University of Illinois), Steve Behling and Bill Long (Cray Inc.), Carlos P. Sosa (Cray Inc. and University of Minnesota Rochester), Sebastien Boisvert and Jacques Corbeil (Universite Laval) and Lorenzo Pesce (University of Chicago)
Abstract: Recent progress in DNA sequencing technology has yielded a new class of devices that allow for the analysis of genetic material with unprecedented speed and efficiency. These advances, styled under the name Next Generation Sequencing (NGS), are well suited for High-Performance Computing (HPC) systems. By breaking up DNA into millions of small strands (20 to 1000 bases) and reading them in parallel, the rate at which genetic material can be acquired has increased by several orders of magnitude. The technology to generate raw genomic data is becoming increasingly fast and inexpensive when compared to the rate at which this data can be analyzed. In general, assembling small reads into a useful form is done either by assembling individual reads (de novo) or by mapping these pieces against a reference. In this paper we present our experience with these applications on Cray supercomputers, in particular with Ray, a parallel short-read assembler.

Paper Technical Session 16C
Chair: Nicholas J. Wright (LBNL/NERSC)

Measuring Sustained Performance on Blue Waters with the SPP Metric
William Kramer (National Center for Supercomputing Applications)
Abstract: The Blue Waters Project developed the Sustained Petascale Performance (SPP) metric to assess the potential for the Blue Waters system to meet its goal of sustained petascale performance for a diverse set of science and engineering problems. The SPP, consisting of over 20 individual tests (code+input), is unique and truly representative of the ability of a system to support many areas of science and engineering. The SPP is a method that allows an accurate assessment of hybrid systems that have more than one type of node, which has not been possible before.
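The SPP abstract above does not spell out how the more than 20 test results are combined, so the sketch below should be read only as a generic illustration of building a composite sustained-performance figure from per-test rates. The geometric mean used here is a common choice for such composites, and the rates are invented; neither is necessarily the actual SPP definition.

```c
/* Generic composite-metric sketch: aggregate per-test sustained rates
 * (e.g., TF/s measured for each test problem) with a geometric mean so
 * that no single test dominates the composite.  The rates below are
 * made up, and the geometric mean is only one common aggregation
 * choice, not necessarily the exact SPP formula. */
#include <math.h>
#include <stdio.h>

int main(void)
{
    /* Hypothetical sustained rates, in TF/s, for a handful of tests. */
    double rate[] = { 120.0, 85.0, 310.0, 42.0, 190.0, 77.0 };
    int n = sizeof rate / sizeof rate[0];

    double log_sum = 0.0;
    for (int i = 0; i < n; i++)
        log_sum += log(rate[i]);

    double composite = exp(log_sum / n);   /* geometric mean */
    printf("composite sustained rate: %.1f TF/s\n", composite);
    return 0;
}
```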
Experiences Porting a Molecular Dynamics Code to GPUs on a Cray XK7 Donald K. Berry (Indiana University), Joseph Schuchart (Technische Universität Dresden) and Robert Henschel (Indiana University) Abstract Abstract GPU computing has rapidly gained popularity as a way to achieve higher performance of many scientific applications. In this paper we report on the experience of porting a hybrid MPI+OpenMP molecular dynamics code to a GPU enabled CrayXK7 to make a hybrid MPI+GPU code. The target machine, Indiana University's Big Red II, consists of a mix of nodes equipped with two 16-core Abu Dhabi X86-64 processors, and nodes equipped with one AMD Interlagos X86-64 processor and one Nvidia Kepler K20 GPU board. The code, IUMD, is a Fortran program developed at Indiana University for modeling matter in compact stellar objects (white dwarf stars, neutron stars and supernovas). We compare experiences using CUDA and OpenACC. Chasing Exascale: the Future of GPU Computing Steve Scott (NVIDIA) Abstract Abstract Changes in underlying silicon technology are creating a significant disruption to computer architectures and programming models. Power has become the primary constraint to processor performance, and threatens our ability to continue historic rates of performance improvement. With silicon technology no longer providing the rapid rate of improvement it once did, we must rely on advances in architectural efficiency. This has led to the creation of heterogeneous (or accelerated) architectures, and the rise of GPU computing. Paper Technical Session 18A Chair: Ashley Barker (Oak Ridge National Laboratory) Blue Waters Acceptance: Challenges and Accomplishments Celso L. Mendes, Brett Bode, Gregory H. Bauer, Joseph R. Muggli, Cristina Beldica and William T. Kramer (National Center for Supercomputing Applications) Abstract Abstract Blue Waters, the largest supercomputer ever built by Cray, comprises an enormous amount of computational power. This paper describes some of the challenges encountered during the deployment and acceptance of Blue Waters, and presents how those challenges were handled by the NCSA team. After briefly reviewing our originally designed acceptance plans, we highlight the steps actually taken for that process, describe how those steps were conducted, and comment on lessons learned during that process. Besides listing the scope of the applied tests, we present an overview of their results and analyze the manner in which those results guided both the Cray and NCSA teams in tuning the system configuration. The Blue Waters acceptance testing process consisted of hundreds of tests summarized in the paper, covering many areas directly related to the Cray system as well as other items, such as the near-line storage and the external user-support environment. Saving Energy with “Free” Cooling and the Cray XC30 Brent Draney, Tina Declerck, Jeffrey Broughton and John Hutchings (National Energy Research Scientific Computing Center/Lawrence Berkeley National Laboratory) Abstract Abstract Located in Oakland, CA, NERSC is running its new XC30, Edison, using “free” cooling. Leveraging the benign San Francisco Bay Area environment, we are able to provide a year-round source of water from cooling towers alone (no chillers) to supply the innovative cooling system in the XC30. While this approach provides excellent energy efficiency (PUE < 1.1), it is not without its challenges. 
This paper describes our experience designing and operating such a system, the benefits that we have realized, and the trade-offs relative to conventional approaches. Real-time mission critical supercomputing with Cray systems Jason Temple and Luc Corbeil (Swiss National Supercomputing Centre) Abstract Abstract System integrity and availability are essential for real-time scientific computing in mission-critical environments. Human lives rely on decisions derived from results provided by Cray supercomputers. The tools used for science must be reliable and must produce the same results every time, on demand and without fail, or the results will not be trustworthy or worthwhile. In this paper, we will describe the engineering challenges of providing a reliable and highly available system to the Swiss weather service using Cray solutions, and we will relate recent real-life experiences that led to specific design choices. Paper Technical Session 18B Chair: Jenett Tillotson (Indiana University) High Fidelity Data Collection and Transport Service Applied to the Cray XE6/XK6 Jim Brandt (Sandia National Laboratories), Tom Tucker (Open Grid Computing), Ann Gentile (Sandia National Laboratories), David Thompson (Kitware Inc.) and Victor Kuhns and Jason Repik (Cray Inc.) Abstract Abstract A common problem experienced by users of large-scale High Performance Computing (HPC) systems, including the Cray XE6, is the inability to gain insight into their computational environments. Our Lightweight Distributed Metric Service (LDMS) is intended to run as a continuous system service providing low-overhead remote collection of, and on-node access to, high-fidelity data. It is capable of handling hundreds of data values per node per second, vastly exceeding the data collection sizes and rates typically handled by current HPC monitoring services, while still maintaining much lower overhead. We present a case study of using LDMS on the Cray XE6 platform Cielo to enable remote storage of system resource data for post-run analysis and node-local access to data for run-time in situ analysis and workload rebalancing. We also present information from a deployment on an XK6 system at Sandia, where we leverage RDMA over the Gemini transport to further reduce LDMS overhead. Production I/O Characterization on the Cray XE6 Philip Carns (Argonne National Laboratory), Yushu Yao (Lawrence Berkeley National Laboratory), Kevin Harms, Robert Latham and Robert Ross (Argonne National Laboratory) and Katie Antypas (Lawrence Berkeley National Laboratory) Abstract Abstract I/O performance is an increasingly important factor in the productivity and efficiency of large-scale HPC systems such as Hopper, a 153,216-core Cray XE6 system operated by the National Energy Research Scientific Computing Center (NERSC). The scientific workload diversity of such systems presents a challenge for I/O performance tuning, however. Applications vary in terms of data volume, I/O strategy, and access method, making it difficult to consistently evaluate and enhance their I/O performance. Improvement of TOMCAT-GLOMAP File Access with User Defined MPI Datatypes Mark Richardson (Numerical Algorithms Group) and Martyn Chipperfield (University of Leeds) Abstract Abstract This paper describes the modification of the file access patterns that occur throughout the simulation runs. The analysis identified several subroutines where the cost lay not in accessing the data but in processing the data either before writing it or after reading it. 
The main gains in the project have come from a change in the practice of overloading MPI task zero. The per-iteration overhead of the writing step has been reduced from 8 seconds to 0.01 s for a small case and from 38 seconds to 0.05 s for a larger case. Paper Technical Session 18C Chair: Liam O. Forbes (Arctic Region Supercomputing Center, UAF) Cray’s Cluster Supercomputer Architecture John Lee, Susan Kraus and Maria McLaughlin (Cray Inc.) Abstract Abstract In the first half of this presentation, we will discuss Cray's cluster supercomputer architecture designs, built upon industry-standard, optimized modular server platforms. You will learn how platform selection is one of the key factors in today's datacenter decisions regarding configuration flexibility, scalability, and performance per watt with the latest processing technologies. You will also learn how industry-standard high-performance network connectivity, streamlined I/O, and diverse storage options can maximize system performance at a lower cost of ownership. We will share examples of different high-performance network topologies, such as fat tree or 3D torus (InfiniBand) with single- or dual-rail configurations, that meet a variety of HPC workload requirements. In the second half of this presentation, we will discuss the essential cluster software and management tools required to build and support a cluster architecture, combined with key compatibility features of the Advanced Cluster Engine™ (ACE) management software. Performance Metrics and Application Experiences on a Cray CS300-AC™ Cluster Supercomputer Equipped with Intel® Xeon Phi™ Coprocessors Vincent C. Betro, Robert P. Harkness, Bilel Hadri, Haihang You, Ryan C. Hulguin, R. Glenn Brook and Lonnie D. Crosby (National Institute for Computational Sciences) Abstract Abstract Given the growing popularity of accelerator-based supercomputing systems, it is beneficial for application software programmers to understand the underlying platform and its workings when writing or porting their codes to a new architecture. In this work, the authors highlight experiences and knowledge gained from porting such codes as ENZO, H3D, GYRO, a BGK Boltzmann solver, HOMME-CAM, PSC, AWP-ODC, TRANSIMS, and ASCAPE to the Intel Xeon Phi architecture running on a Cray CS300-AC™ Cluster Supercomputer named Beacon. Beacon achieved 2.449 GFLOP/W in High Performance LINPACK (HPL) testing and a number-one ranking on the November 2012 Green500 list. The areas of optimization that yielded the most performance gain are highlighted, and a set of metrics for comparison and lessons learned by the team at the National Institute for Computational Sciences Application Acceleration Center of Excellence is presented, with the intention of giving new developers a head start in porting as well as a baseline for comparing their own code's exploitation of fine- and medium-grained parallelism. Paper Technical Session 19A Chair: Tina Butler (National Energy Research Scientific Computing Center) Effect of Rank Placement on Cray XC30 Communication Cost Reuben D. Budiardja, Lonnie D. Crosby and Haihang You (National Institute for Computational Sciences) Abstract Abstract The newly released Cray XC30 supercomputer boasts the new Aries interconnect, which incorporates a Dragonfly network topology. This hierarchical network topology has obvious advantages with respect to local communication. 
However, as communication patterns extend further down the hierarchy and become more widely separated, the overall impact of particular bottlenecks and of trade-offs between bandwidth and latency becomes less apparent. In particular, applications may be more or less latency sensitive based on their communication patterns. The dynamic routing options, as a result, may affect some applications more severely than others. In this paper, we investigate the effect of process placement on the communication costs associated with typical communication patterns shared by many scientific applications. Observations concerning the communication performance of benchmarks and selected applications are presented and discussed. Evaluating Node Orderings For Improved Compactness Carl Albing (Cray Inc.) Abstract Abstract This paper demonstrates an evaluation technique that provides guidance for site-specific selection of the node ordering used in application placement. Reasonable performance of parallel applications has been achieved through application placement in Cray XT/XE/XK 3D-torus systems using allocation strategies based on an ordered, one-dimensional sequence of nodes. Node ordering is a computationally low-cost way to incorporate topological information into application placement decisions. With several orderings from which to choose - and others that could be created - what is the basis for choosing one ordering over another? Improving Task Placement for Applications with 2D, 3D, and 4D Virtual Cartesian Topologies on 3D Torus Networks with Service Nodes Robert A. Fiedler and Stephen Whalen (Cray Inc.) Abstract Abstract We describe two new methods for mapping applications with multidimensional virtual Cartesian process topologies onto 3D torus networks with randomly distributed service nodes. The first method, “Adaptive Layout”, works for any number of processes and distributes the MILC (lattice QCD, 4D topology) workload to ensure that communicating processes are close together on the torus. This scheme reduces the run time by 2.7X compared to default placement. The second method, “Topaware”, selects a prism of nodes slightly larger than the ideal prism one would select if there were no service nodes. The application's processes are ordered to group neighboring processes on the same node and to place groups of neighbors onto nodes that are no more than a few hops apart. Run-time reductions of up to 40% are obtained for 2D and 3D virtual topologies. In dedicated mode, using Topaware with MILC reduces the run time by 3.7X compared to default placement. Paper Technical Session 19B Chair: Zhengji Zhao (Lawrence Berkeley National Laboratory) The State of the Chapel Union Bradford L. Chamberlain, Sung-Eun Choi, Martha B. Dumler, Thomas Hildebrandt, David Iten, Vassily Litvinov and Greg Titus (Cray Inc.) Abstract Abstract Chapel is an emerging parallel programming language that originated under the DARPA High Productivity Computing Systems (HPCS) program. Although the HPCS program is now complete, the Chapel language and project remain very much alive and well. Under the HPCS program, Chapel generated sufficient interest among HPC user communities to warrant continuing its evolution and development over the next several years. In this paper, we reflect on the progress made with Chapel under the auspices of the HPCS program, noting key decisions made during the project's history. We also summarize the current state of Chapel for programmers who are interested in using it today. 
Finally, we describe current and ongoing work to evolve Chapel from a prototype to a production-grade language and to make it better suited for execution on next-generation systems. Recent enhancements to the Automatic Library Tracking Database infrastructure at the Swiss National Supercomputing Centre Timothy W. Robinson and Neil Stringfellow (Swiss National Supercomputing Centre) Abstract Abstract The Automatic Library Tracking Database (ALTD), an infrastructure developed previously by staff at the National Institute for Computational Sciences (NICS), is in production today on Cray XT, XE, XK, and XC30 systems at several Cray sites, including NICS, Oak Ridge National Laboratory, the National Energy Research Scientific Computing Center, and the Swiss National Supercomputing Centre (CSCS). ALTD automatically and transparently stores information about applications running on Cray systems and records which libraries are linked into those applications. From these data, support staff at HPC centres can derive a wealth of information about software usage, such as the use or non-use of particular compiler suites or the uptake of numerical libraries and third-party applications, right down to the level of specific version numbers. The tool works by intercepting the GNU linker to gather information on compilers and libraries, and by intercepting the job launcher to track the execution of applications at launch time. We have recently extended the ALTD framework deployed at CSCS to record more detailed information on the individual jobs executed on our machines. The job information recorded by the previous incarnation of ALTD was limited to user name, executable, (batch) job id, and run date; we have extended the tool to record many additional job characteristics, such as begin and end times, requested versus used core counts, number of processing elements and threads per process, and mode of linking (e.g. static or dynamic). In combination with custom post-processing scripts, which map executables to software codes, research domains, or research groups, our ALTD implementation now delivers a far more complete picture of system usage, providing not only a list of running applications but also information on the way these applications are being run. On a practical level, such information can be used, for example, to guide future hardware and software procurements, or to assess whether researchers are using our systems in the manner for which they were granted resource allocations. Comparing Compiler and Library Performance in Material Science Applications on Edison Jack Deslippe and Zhengji Zhao (National Energy Research Scientific Computing Center) Abstract Abstract Materials science and chemistry applications are expected to represent approximately one-third of the computational workload on NERSC's Cray XC30 system, Edison. The performance of these applications can often depend sensitively on the compiler and compiler options used at build time. For this reason, the NERSC user services group supplies users with optimized builds of the most commonly used materials science applications in order to ensure that these cycles are used as efficiently as possible. In this paper, we compare the performance of various materials science and chemistry applications when built with the Cray, Intel, and GNU compiler suites under various compiler options, as well as when linked against the MKL, LibSci, and FFTW libraries. 
We compare the optimal compilers and libraries on Edison with those previously obtained on the NERSC Cray XE6 machine, Hopper. Paper Technical Session 19C Chair: John Noe (Sandia National Laboratories) A Single Pane of Glass: Bright Cluster Manager for Cray Matthijs van Leeuwen, Mark Blessing and David Maples (Bright Computing) Abstract Abstract Bright Cluster Manager provides comprehensive cluster management for Cray systems in one integrated solution: deployment, provisioning, scheduling, monitoring, and management. Its intuitive GUI provides complete system visibility and ease of use for multiple systems and clusters simultaneously, including automated tasks and intervention. Bright also provides a powerful management shell for those who prefer to manage via a command-line interface. Supporting Multiple Workloads, Batch Systems, and Computing Environments on a Single Linux Cluster Larry Pezzaglia (National Energy Research Scientific Computing Center/Lawrence Berkeley National Laboratory) Abstract Abstract A new Intel-based, InfiniBand-attached computing system at NERSC from Cray Cluster Solutions (formerly Appro) provides computational resources to transparently expand several existing NERSC production systems serving three different constituencies: a mixed serial/parallel mid-range workload; a serial, high-throughput, high-energy physics/nuclear physics workload; and a mixed serial/parallel genomics workload. Tools to Execute An Ensemble of Serial Jobs on a Cray Abhinav Thota, Scott Michael, Sen Xu, Thomas G. Doak and Robert Henschel (Indiana University) Abstract Abstract Traditionally, Cray supercomputers have been located at large supercomputing centers and were used to run highly parallel applications. The user base consisted mostly of researchers from the fields of physics, mathematics, astronomy, and chemistry. But in recent times, Cray supercomputers have become available to a wider range of users from a variety of disciplines. Examples include the Kraken machine at the National Institute for Computational Sciences (NICS), Hopper at the National Energy Research Scientific Computing Center (NERSC), and Big Red II at Indiana University. Predictably, as the diversity of end users has grown, the workload has expanded to include a variety of workflows containing serial and hybrid applications, as well as complex workflows involving pilot jobs. Projects that employ a massive number of serial jobs, run in an embarrassingly data-parallel manner, have not traditionally been targeted to run on Cray supercomputers. To accomplish such projects, it is usually necessary to bundle a large number of serial jobs into a much larger parallel job via a pilot-job framework, an MPI wrapper, or custom scripting. In this article, we explore several of the current offerings for bundling serial jobs on a Cray supercomputer and discuss some of the benefits and shortcomings of each approach. The approaches we evaluate include BigJob, PCP, and native aprun with scripts. Tutorial Tutorial 1A Programming Accelerators using OpenACC in the Cray Compilation Environment James C. Beyer (Cray Inc.) Abstract Abstract This tutorial will introduce the novice accelerator programmer to the OpenACC Application Programming Interface (API) as well as provide the more advanced programmer with ideas for extracting even more performance. The tutorial will start with an introduction to the OpenACC 2.0 specification. 
The specification will be presented in a user-centric manner intended to teach the novice user how to port code to heterogeneous systems such as the XE6 and XK7. The significance of the execution and memory models will be presented first. Once the groundwork has been laid, the parallel and kernels constructs will be introduced, along with how they are inserted into the code (a brief illustrative sketch of these constructs appears at the end of this listing). Examples will be used to introduce the rest of the API in situ. Special attention will be given to the new features in the 2.0 specification, covering both their benefits and their pitfalls. (Assuming all of these features make it into the specification, the following, and possibly more, will be covered.) The concept of unstructured data lifetimes will be discussed and use cases presented. The highly anticipated separate-compilation-unit ("call") support feature will be explained. The interaction between call support and nested parallelism will be explored, given its impact on the call support feature. Once the API has been covered, hints and tricks for using both the API itself and the Cray Compilation Environment (CCE) will be presented. Tutorial Tutorial 1B System Administration for Cray XE and XK Systems Richard Slick (Cray Inc.) Abstract Abstract The Cray Linux Environment requires tasks and processes beyond what is required for managing basic Linux systems. This short seminar covers some system administration basics, as well as a collection of tools and procedures to enhance monitoring, logging, and efficient command usage. The talk will include new capabilities in logging, Node Health Check, and ALPS. New features in recent releases will also be discussed. The session is geared towards new system administrators as well as those with more experience. Tutorial Tutorial 1C Lustre Troubleshooting and Tuning Brett Lee (Intel Corporation) Abstract Abstract Lustre is an open-source, parallel file system that has earned a reputation in the High Performance Computing (HPC) community for its speed and scalability. Lustre, however, has also earned a reputation for being mysterious and thus hard to administer. The purpose of this talk is to pull back the curtain on some of that mystery and provide the fundamental knowledge necessary to administer, troubleshoot, and tune a Lustre file system. Tutorial Tutorial 2A Refactoring Applications for the XK7 John Levesque (Cray Inc.) and Jeff Larkin (NVIDIA) Abstract Abstract This tutorial will cover the process of porting an all-MPI application to the XK7. Numerous paths will be explored, including OpenACC, CUDA Fortran, and CUDA. Examples during the tutorial will be drawn from the applications that were developed for Titan over the past year. In porting an application, one must first generate a good hybrid version that uses OpenMP on the node and MPI between the nodes. The process of developing the hybrid code frequently ends up improving the overall performance of the application even before using the accelerator. In developing the hybrid version, significant code modifications may be necessary to restructure the application to exhibit high-level parallelism, keeping in mind that the accelerator will need large kernels of computation in order to achieve the best performance. 
OpenACC has progressed to a viable programming model that allows the application developer to generate a performance-portable application that runs well on many-core systems, including current XK7 and future XC30 systems with Intel MIC or Nvidia accelerators. This past year the number of OpenACC applications has grown to the point where an excellent foundation of techniques can be presented. A wide variety of applications will be presented in the process of explaining the techniques used to develop efficient hybrid applications. Larkin and Levesque are currently writing a book that will contain the examples given in the tutorial. Tutorial Tutorial 2B Configuration and Administration of Cray External Services Systems Jeff Keopp and Harold Longley (Cray Inc.) Abstract Abstract Cray External Services systems expand the functionality of Cray XE/XK and Cray XC systems by providing more powerful external login (esLogin) nodes and an external Lustre file system (esFS). A management server (esMS) provides administration and monitoring functions as well as node provisioning and automated Lustre failover for the external Lustre file system. The esMS is available in a single-server or high-availability configuration. A great advantage of these systems is that the external Lustre file system remains available to the external login nodes regardless of the state of the Cray XE/XK or Cray XC system. Tutorial Tutorial 2C Debugging Heterogeneous HPC Applications with TotalView Chris Gottbrath (Rogue Wave Software) Abstract Abstract The new Cray XC series gives users the option of using either accelerators or coprocessors. Regardless of which path is chosen, truly utilizing the full power of Cray systems hosting accelerators and coprocessors, such as NVIDIA® Kepler/Fermi or Intel® Xeon® Phi™, means leveraging several different levels of parallelism. In addition, developers need to juggle a variety of technologies, from MPI and OpenMP to CUDA™, OpenACC, and Intel Language Extensions for Offloading (LEO) on Intel Xeon Phi coprocessors. While troubleshooting and debugging applications are a natural part of any development or porting process, these efforts become even more critical when working with multiple levels of parallelism and a mix of technologies.
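As a companion to the Tutorial 1A abstract above, the following is a minimal, hedged C sketch of the two OpenACC offload constructs the tutorial introduces: parallel, where the programmer asserts that a loop is safe to parallelize, and kernels, where the compiler analyzes the region and decides what to offload, both wrapped in a structured data region. The routine and variable names are illustrative only and are not drawn from the tutorial material.

    /* Minimal OpenACC sketch (C): parallel vs. kernels inside a data region.
     * Names (scale_and_sum, x, y, a) are hypothetical, for illustration only. */
    #include <stdlib.h>

    void scale_and_sum(int n, const double *restrict x, double *restrict y, double a)
    {
        /* Copy x to the device; copy y back to the host when the region ends. */
        #pragma acc data copyin(x[0:n]) copyout(y[0:n])
        {
            /* parallel: the programmer asserts the loop iterations are independent. */
            #pragma acc parallel loop
            for (int i = 0; i < n; ++i)
                y[i] = a * x[i];

            /* kernels: the compiler analyzes the region and decides how to offload it. */
            #pragma acc kernels
            for (int i = 0; i < n; ++i)
                y[i] += x[i];
        }
    }

With a compiler that supports OpenACC (for example CCE, as discussed in the tutorial), the directives are honored; with other compilers the pragmas are ignored and the routine still runs correctly on the host, which is one reason the directive-based approach is attractive for incremental porting.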