Cray User Group SV1 Workshop
October 23-25, 2000
Minneapolis, Minnesota

Final Program and Presentations

Monday, October 23

9:00 Welcome and Introductions
     Welcome from CUG, Sally Haerer (OR-ST), CUG President
     Welcome/Corporate Update, Jim Rottsolk, President, Cray Inc.
9:30 The Challenges Ahead: Choosing the Right Architecture for High Performance Computing, Steve Scott (Cray)
10:45 Cray SV1 Hardware Update, Gary Shorrel (Cray)
11:15 Cray SV1 Software Overview, Jay Blakeborough (Cray)
1:30 ARSC J932se to SV1 Code Migration Experiences, Guy Robinson and Jeff McAlister (ARSC), and Frank Chism (CRAY)
2:15 Vectorization of the Generalized Born Method in AMBER, Carlos Sosa and T. Hewitt (CRAY) and D. A. Case (SCRIPPS)
3:30 Direct Numerical Simulations of Droplet-Laden Flows, Nora Okong'o (JPL)
4:00 Cray SV1 Application Performance, James Giuliani and David Robertson (OSC)
6:00 Cray Dinner/Reception, Speaker: Burton Smith (CRAY)

Tuesday, October 24

8:45 SV1 Optimization Tutorial, Jeff Brooks (Cray)
10:45 Gaussian 98 Performance Guide on a Heterogeneous Environment, Carlos Sosa (CRAY) and Michael J. Frisch (Gaussian Inc.)
11:15 Performance Improvements for NVH Analysis on Cray SV1 Computers, Nathan Wichmann, Kristyn Maschhoff, and Ted Stern (CRAY)
1:30 Programming Environment Update, Lisa Krause (Cray)
2:15 4-Node SV1 Cluster Implementation, Terry Jones, Beata Sarnowska, and D. Cole (NAVO)
3:30 Update on SV2
4:14 SV1-to-SV2 Migration Issues, Margaret Cahir (Cray)

Wednesday, October 25

9:00 Tuning Tutorial
9:45 CAB Recap
10:45 Storage Issues
11:15 Executive Panel

Abstracts

Monday

9:30 The Challenges Ahead: Choosing the Right Architecture for High Performance Computing, Steve Scott (Cray)

Forty or so years of progress have failed to produce a convergence in the approach to high-end computing. This is due, in large part, to the very rapid pace of change in the underlying technology. Diversity flourishes in the unstable environment caused by this change. The job of the computer architect is to find new answers to the same questions as the constraints change.
In this talk I'll discuss some of the challenges and issues, ranging from the underlying technology to application characteristics to economics, that confront HPC designers and users today. Finally, I'll attempt to extrapolate to the years ahead and discuss how these issues bear upon high-end computer architecture.

10:45 Cray SV1 Hardware Update, Gary Shorrel (Cray)

The CRAY SV1 system has been in production for over a year. In the past year, enhancements have been made to the product and further significant enhancements are in development. This presentation will review the SV1 hardware and discuss hardware enhancements currently in development.

1:30 ARSC J932se to SV1 Code Migration Experiences, Guy Robinson and Jeff McAlister (ARSC), and Frank Chism (CRAY)

Arctic Region Supercomputing Center vector systems are used by a diverse range of users who employ an equally diverse range of codes, including single-processor vector, multitasked vector, and parallel (MPI) vector codes. This paper will describe our experiences, and report the lessons learnt, in moving users' codes from a twelve-processor Cray J932se system to a 32-processor Cray SV1 system. It will also seek answers to questions from attending Cray staff and from sites with more experience of Cray SV1 systems.

2:15 Vectorization of the Generalized Born Method in AMBER, Carlos Sosa and T. Hewitt (CRAY) and D. A. Case (SCRIPPS)

In biological simulations in solution, explicit inclusion of a solvent such as water tends to increase the CPU requirements of the calculation. Continuum solvent models provide a way to include solvent effects without introducing excessive extra CPU cost. However, even in these models, it is important to leverage the machine architecture to tackle larger systems. In this study we present the vectorization of one such model, the generalized Born method.
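As a rough illustration of the kind of pairwise kernel a generalized Born calculation involves, here is a minimal Python sketch. It assumes the widely used smoothing function of Still and co-workers and an interior dielectric of 1; the abstract does not say which variant AMBER uses or how the loops were restructured for the SV1. The names `f_gb` and `gb_energy` are hypothetical, and energies are left in units of charge squared over length.

```python
import math

def f_gb(r, radius_i, radius_j):
    """Smoothing function (assumed Still et al. form).

    Tends to r at long range and to sqrt(radius_i * radius_j) as r -> 0.
    """
    rr = radius_i * radius_j
    return math.sqrt(r * r + rr * math.exp(-r * r / (4.0 * rr)))

def gb_energy(coords, charges, born_radii, eps_solvent=78.5):
    """Polarization energy, summed over all pairs including i == j.

    No unit conversion factor is applied (illustrative only).
    """
    prefactor = -0.5 * (1.0 - 1.0 / eps_solvent)
    total = 0.0
    n = len(charges)
    for i in range(n):
        # This inner loop over j is the natural candidate for vectorization.
        for j in range(n):
            r = math.dist(coords[i], coords[j])
            total += charges[i] * charges[j] / f_gb(r, born_radii[i], born_radii[j])
    return prefactor * total
```

For a single ion the double sum reduces to the self term, so the sketch reproduces the classical Born expression, which makes a convenient sanity check.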

3:30 Direct Numerical Simulations of Droplet-Laden Flows, Nora Okong'o (JPL)

An efficient code for performing numerical simulations of fluid flow carrying evaporating droplets has been developed based on the time-dependent compressible Navier-Stokes equations for the carrier fluid, augmented by species conservation equations and the equation of state. The droplets are individually tracked using a Lagrangian formulation. Based on length scale considerations, it is shown that the drops are much smaller than the flow scales, which determine the grid size. The equations are integrated in time using an explicit fourth-order Runge-Kutta scheme, while spatial derivatives are evaluated using eighth-order finite differences. Two complementary approaches are pursued in the numerical simulations. The first approach is Direct Numerical Simulation (DNS), in which all the relevant length scales are resolved. The second approach is Large Eddy Simulation (LES), in which only the large scales are resolved and the small, or subgrid, scales (SGS) are modeled. For the same flow Reynolds number, this allows LES to be performed on a coarser grid than DNS would require, which is mandatory for simulating turbulent flows. However, in order to develop the SGS models, one must first perform DNS at a Reynolds number high enough to portray the physics of turbulence. Since the needed resolution increases with Reynolds number, the DNS code required modification to run on a multiprocessor machine (CRAY SGI Origin2000, using MPI) in order to accommodate more grid points. Further, the method requires analyzing the DNS results to devise SGS models, and then implementing the SGS models into the original DNS code. We present our experience in moving our serial code from the CRAY J90 using the FORTRAN77 compiler to the SV1A using the FORTRAN90 compiler, and our strategy in writing a serial code that could be modified into a parallel code while maintaining efficiency.
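The numerical ingredients named in this abstract, explicit fourth-order Runge-Kutta time stepping and eighth-order finite differences, can be sketched on a much simpler problem. The Python example below is not the authors' code: it assumes a one-dimensional linear advection equation u_t = -u_x on a periodic grid in place of the compressible Navier-Stokes system, and the function names (`ddx`, `rhs`, `rk4_step`, `advect_error`) are hypothetical.

```python
import math

# Eighth-order central-difference weights for the first derivative
# (offsets 1..4; the stencil is antisymmetric about the center point).
COEFFS = (4.0 / 5.0, -1.0 / 5.0, 4.0 / 105.0, -1.0 / 280.0)

def ddx(u, h):
    """Eighth-order first derivative of periodic samples u with spacing h."""
    n = len(u)
    return [
        sum(c * (u[(i + k) % n] - u[(i - k) % n])
            for k, c in enumerate(COEFFS, start=1)) / h
        for i in range(n)
    ]

def rhs(u, h, speed=1.0):
    """Right-hand side of the assumed model equation u_t = -speed * u_x."""
    return [-speed * d for d in ddx(u, h)]

def rk4_step(u, dt, h):
    """Advance u by one classical fourth-order Runge-Kutta step."""
    k1 = rhs(u, h)
    k2 = rhs([a + 0.5 * dt * b for a, b in zip(u, k1)], h)
    k3 = rhs([a + 0.5 * dt * b for a, b in zip(u, k2)], h)
    k4 = rhs([a + dt * b for a, b in zip(u, k3)], h)
    return [a + dt / 6.0 * (p + 2.0 * q + 2.0 * r + s)
            for a, p, q, r, s in zip(u, k1, k2, k3, k4)]

def advect_error(n=64, steps=100, dt=0.01):
    """Advect a sine wave and return the maximum error against the exact solution."""
    h = 2.0 * math.pi / n
    u = [math.sin(i * h) for i in range(n)]
    for _ in range(steps):
        u = rk4_step(u, dt, h)
    t = steps * dt
    return max(abs(a - math.sin(i * h - t)) for i, a in enumerate(u))
```

Calling `advect_error()` advects a sine wave for 100 steps and compares it with the exact translated profile. In a production code the list comprehensions would be array loops, which is where vectorization and domain decomposition come into play.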

4:00 Cray SV1 Application Performance, James Giuliani and David Robertson (OSC)

The introduction of cached vector operations and Multi-Streaming Processors (MSPs) in Cray's SV1 architecture will result in new design and performance tuning issues for developers of many scientific applications. Serial code developed for Y-MP/C90/T90 series machines may require a significant additional tuning effort to achieve efficient performance on the SV1. In addition, the larger processor count makes possible a new class of large-scale parallel vector applications. Following an overview of the relevant SV1 architectural features and their theoretical performance implications, we examine the performance of several real-world research applications on the SV1 and other Cray systems at OSC. We consider single-processor and MSP performance, as well as scalability for both the shared memory (OpenMP) and message passing (MPI) models. Finally, we discuss the insights gained into the process of migrating applications to Cray's new architecture roadmap.

Tuesday

10:45 Gaussian 98 Performance Guide on a Heterogeneous Environment, Carlos Sosa (CRAY) and Michael J. Frisch (Gaussian Inc.)

Gaussian 98 is an integrated system of electronic structure programs. It is widely used to model a variety of molecular systems from first principles. Gaussian has been optimized for parallel vector supercomputers as well as massively parallel machines. In this study, we provide a performance guide for Gaussian 98 in a heterogeneous environment.

11:15 Performance Improvements for NVH Analysis on Cray SV1 Computers, Nathan Wichmann, Kristyn Maschhoff, and Ted Stern (CRAY)

Significant improvements were made to the MSC.NASTRAN kernel to exploit the architecture of the Cray SV1. Efforts focused on improving the absolute performance of the dominant kernels typical in NVH analysis. The result is improved turnaround time for most large-scale NVH computations while reducing the total memory bandwidth required by the problem.


1:30 Programming Environment Update, Lisa Krause (Cray)

This talk provides information on the programming environments that support CRAY PVP and MPP platforms. The SV1 multi-streaming processor (MSP) concept is discussed along with the implementation constraints from the CRAY SV1 hardware. Both single-CPU (SSP) and MSP performance enhancements from the compiler and libraries will be presented, along with recommendations for optimal MSP utilization.

2:15 4-Node SV1 Cluster Implementation, Terry Jones, Beata Sarnowska, and D. Cole (NAVO)

In September 1999, the NAVOCEANO MSRC became the first domestic U.S. site to place a multi-node SV1 cluster into production. In June 2000, the system was upgraded to a four-node configuration. This paper will present NAVOCEANO MSRC's experiences with a multiple-node SV1 deployed in a cluster configuration. It will discuss the system configuration achieved to support diverse user workloads, integration of the SV1 Cluster product in a large-scale HPC environment, and measured performance.

4:14 SV1-to-SV2 Migration Issues, Margaret Cahir (Cray)

This talk will cover the migration issues that an application developer is likely to face when porting codes to the CRAY SV2. Since the CRAY SV2 is still under development, many issues are still open. Part of the time for this talk will be open for collection of questions and issues that attendees have. We will address these to the extent possible at this time and use the requests as a basis for supplying information in the future.

Questions and information: contact

Gary Jensen, Workshop Chair
139 West 640 North
American Fork, UT 84003 USA
(1-801) 492-9535
FAX: (1-208) 475-6258
guido@cug.org


Revised: November 21, 2000