CUG SUMMIT 2001 Abstracts

May 23-25, 2001

Indian Wells, California, USA

Final Program Abstracts

SGI Focused Program

Session Day Session Number Session Time Presentation Title Author(s) Abstract
Thursday 3A 2:30 Partitioning on Origin 3000 and SNIA Russ Anderson, SGI An overview of RAS features in IRIX, with emphasis on partitioning and error recovery.
Thursday 3B 2:00 Data Migration Facility 3.0, Client/Server Model Thomas Goozen, SGI This paper will explain the architecture of DMF 3.0, expounding upon the details of both the client and server functionalities. The paper will also discuss the business case for the development of DMF 3.0 and lastly how DMF 3.0 will integrate with third party backup solutions software such as Legato's Networker.
Thursday 3C 2:00 Early Experiences with the Developed Virtual User Account Norbert Meyer, Miroslaw Kupc, Marcin Lawenda, and Pawel Wolniewicz (POZNAN) The Virtual User Account (VUS) is a mechanism that significantly simplifies user management in a national GRID structure connecting geographically distributed supercomputer systems that belong to different independent institutions. The mechanism was implemented within the Polish national cluster. The concept of the VUS was presented last year at the CUG Summit 2000. The paper will describe the efficiency of the system running in a GRID environment.
Thursday 3A 3:00 Job Scheduling Techniques for Large Origin Systems Steve Caruso, SGI We describe effective job scheduling techniques for large Origin systems. The use of a batch scheduler such as LSF coupled with dynamic cpusets is discussed along with an illustration of the benefits of this scheduling technique. Also described is a pre-emptive, priority scheduling scheme that SGI is currently developing for an operational weather site. The scheme utilizes current and future features in LSF and IRIX, including cpusets, in addition to custom software developed by SGI. The scheduler will permit high priority jobs to acquire the required computational resources by pre-empting lower priority interactive and batch jobs. When the high priority jobs have completed, pre-empted jobs are resumed. The scheduler also utilizes dynamic cpusets to reduce runtime variability. Relevant tunable system parameters are described.
Thursday 3B 2:30 SGI CXFS Overview, Current Status and Future Plans Neil Bannister, SGI Silicon Graphics released its Clustered File-System, CXFS, at the end of 1999. Since then it has gained good acceptance amongst SGI customers, and it continues to be deployed for large storage requirements. This paper will give an overview of CXFS and how customers are using it. It will also cover how it is integrated into SGI's storage solutions, current status and future plans.
Thursday 3C 2:30 An Early Experience on Job Checkpoint/Restart - Working with SGI IRIX OS and the Portable Batch System (PBS) Yan-Tyng Chang (NAS) In this paper, we describe the use of the SGI cpr utility for job checkpoint/restart for two types of applications, Gaussian and MPI jobs. Both interactive jobs and batch jobs were tested.
Thursday 3A 3:00 IRIX OS Update Lynne Johnson, SGI This talk will cover the work SGI is doing on IRIX in support of high performance computing. An update on accomplishments for the previous year, and a road map for the next year will be presented.
Thursday 3B 3:00 SGI CXFS Overview, Current Status and Future Plans Neil Bannister, SGI Silicon Graphics released its Clustered File-System, CXFS, at the end of 1999. Since then it has gained good acceptance amongst SGI customers, and it continues to be deployed for large storage requirements. This paper will give an overview of CXFS and how customers are using it. It will also cover how it is integrated into SGI's storage solutions, current status and future plans.
Thursday 3C 3:00 Supporting Users of the NAS Facility at the NASA Ames Research Center Chuck Niggley and Ed Hook (NAS) The NAS facility provides both production computing resources and test bed research & development resources to local users and groups as well as the other NASA Centers, our industrial partners and University collaborators. The presentation will discuss how these services are provided and present statistics dealing with the user support activities.
Thursday 4A 4:00 'SNIA' Product Directions Steve Reinhardt, SGI SGI's Itanium-based SNIA systems will deliver strong processor and interconnect performance. The Linux OS will be suitable for some, but not all, workloads. This talk will cover project directions and status, and guidance on which workloads will be early candidates for SNIA.
Thursday 4B 4:00 SGI SAN/CXFS/FailSafe/DMF Demonstration Dave Ellis and Linda Lait, SGI We will build and demonstrate a Storage Area Network (SAN) based Clustered XFS (CXFS) environment, running DMF under FailSafe protection. We will also discuss performance characteristics of the SAN and make recommendations for setting up applications on a shared filesystem.
Thursday 4C 4:00 Performance Metrics: SPEC Gigaflops vs. Linpack Gigaflops Tom Elken, SGI (no abstract; slides only)
Friday 5A 8:30 Efficient Data-Parallel Programming on cc-NUMA Machines Siegfried Benkner, SGI and Thomas Brandes (GMD) In this presentation we describe optimized parallelization strategies for high-level data-parallel programs on cc-NUMA machines that avail themselves of the mechanisms provided by OpenMP for work sharing and thread parallelism while exploiting data locality based on user-specified distribution directives. We discuss how such parallelization strategies can be controlled by the programmer at a high level by means of OpenMP-like extensions to High Performance Fortran (HPF). Performance results on an SGI Origin 2000 using the ADAPTOR and VFC compilers show the effectiveness of our approach in comparison to traditional HPF and OpenMP parallelization strategies.
Friday 5B 8:30 From Elsa to Teras: How to Guide Users from a Vector System to a Huge Parallel System Bert Van Corler (SARA) In November 2000 a 1024-node SGI Origin 3800 was placed at SARA. An overview will be presented of the system, but especially of what the consequences were for the end users. The transition started more than half a year before the arrival of the Origin system.
Friday 5C 8:30 The HPC Virtual Consultant Sally Haerer (OR-ST) Funded through a DoD-related grant, the Northwest Alliance for Computational Science & Engineering (NACSE) at Oregon State University has established a collaboration to create a web-based, search-and-query interface that will provide a portal to a variety of excellent online documentation sites around the country. This will allow programmers and technical consultants alike to learn just one common access point, one navigational hierarchy and information strategy, and one search mechanism, yet have access to a vast amount of information and expertise from distributed sites. By providing consolidated access to a wide variety of Web resources, the Virtual Consultant will allow users to benefit from materials they would not be able to find on their own. By removing the need to learn and traverse multiple Web sites, more time can be spent directly on problem-solving. Furthermore, the development of a model for a common, distributed web-based documentation strategy means that other targeted search-and-query interfaces could be engineered in the future with far less lead time and fewer resources than this venture.
Friday 5A 9:00 Topology Aware Scheduling in the LSF Distributed Resource Manager Chris Smith, Bill McMillan, and Ian Lumb, Platform Computing Consistent and optimal application performance are identified issues for NUMA-based architectures. Often these inconsistencies and resulting sub-optimal performance are a consequence of memory-locality latency. Production-quality solutions need to simultaneously allow for flexibility in parallel job placement algorithms, and provide a rich framework for managing all the distributed compute resources. Through integration with SGI IRIX's cpusets functionality, Platform Computing's Load Sharing Facility (LSF) is empowered with topological knowledge of the Origin family's ccNUMA architecture. Preliminary results from customer sites illustrate the performance advantages that can be achieved in practice.
Friday 5B 9:00 PBS Pro Workload Management: Preemption and Reservations Bill Nitzberg, SGI and James Jones, Veridian Case studies of the Portable Batch System (PBS) on Cray SV1, Cray T3E, SGI Origin 2000 and SGI Origin 3000 are presented with a short discussion to include: tens of thousands of jobs; managing thousands of CPUs with one login system; partitioning a system to meet different user requirements; using cpusets to manage jobs on large SGI Origins; using suspend/resume and preemptive scheduling to assist power users; using PBS Pro's advance reservation feature. A road map of PBS Pro including Linux and vendor relationships will conclude the presentation.
Friday 5C 9:00 Networking Update George Hyman, SGI (no abstract; slides only)
Friday 5A 9:30 Large Shared Memory Systems Bob Ciotti (NAS) The Numerical Aerospace Simulation (NAS) Facility has pioneered four firsts in large shared memory systems: a 256p O2K system in 1998, a 512p O2K system in 1999, and, more recently, 512p and 1024p O3K systems in 2001. This paper reviews the progress made in this approach and provides early results from the latest 512p and 1024p systems.
Friday 5B 9:30 Experiences Using the Portable Batch System on Large Origin Systems Edward Hook (NAS) Large SGI Origin systems (larger than the original design envisioned) require the use of "coarse mode" addressing in jobs using many CPUs. This paper describes the design/implementation/deployment of a PBS scheduler that shields users from the effects of coarse mode addressing, while giving them the benefits that come from the use of cpusets.
Friday 6A 11:00 Beyond Beowulf Clustering Ian Lumb, SGI and Bill McMillan, Platform First-generation Beowulf clusters have clearly demonstrated that powerful computing environments can be built around COTS hardware and networking components, parallel programming involving PVM or MPI, and system software as the 'glue' that provides the abstraction of a distributed process space. As scientists and engineers increasingly employ these clusters in practice, two fundamental manageability issues are becoming increasingly prohibitive: administration of the environment and management of the workload. Both are addressed here through distributed resource management solutions involving Platform Computing's Load Sharing Facility (LSF).
Friday 6B 11:00 Early Experiences with Storage Area Networks and CXFS John Lynch, SGI This paper looks at the design, integration and application issues involved in deploying an early access, very large, highly available storage area network. Topics covered include filesystem failover, issues regarding the number of nodes in a cluster, and the use of leading edge solutions to solve complex issues in a real-time data processing network.
Friday 6C 11:00 SGI Message-Passing Status and Plans Karl Feind, SGI SGI message-passing software has been enhanced in the past year to support additional interconnect fabrics, improve NUMA-awareness, increase MPI-2 content, and provide other improvements. This presentation describes the recent enhancements to MPI and SHMEM software and also outlines our roadmap of planned future enhancements.
Friday 6A 11:30 HPC Linux Update Bill Roske, SGI Progress has been made in Linux for HPC customers on SGI's Intel based platforms. This talk covers progress and plans for Linux functionality needed by our HPC customers.
Friday 6B 11:30 Storage Architecture Choice: SAN or NAS LaNet Merrill, SGI What should a company look at before deciding to use SAN versus NAS? This paper looks at the application requirement, the security requirement, the performance requirement, the scaling requirement and the data management requirement. It includes comparisons between SAN and NAS as well as products that meet both requirements.
Friday 6C 11:30 Some Performance Comparisons for an Ocean Model on the SGI Origin 2000 and the HP V-class 2500 Benny Cheng (JPL) Using a state-of-the-art ocean circulation model, we compare the performance of the code on an SGI Origin 2000 and an HP V2500, and discuss some of the reasons behind the differences in the results.
Friday 6A 12:00 LINUX–The Changing Market Dynamics Paul McNamara, SGI This paper will discuss the market acceptance of LINUX and its future.
Friday 6B 12:00 Performance Analysis Benchmark of the SGI TP9400 William Julien (BCS) This paper will discuss our experience with the performance of SGI's TP9400 RAID Storage Array. Using the standard SGI diskperf performance testing utility and a locally written benchmark, we will demonstrate what actual performance improvements can be expected over locally connected XFS SCSI filesystems. We benchmarked the TP9400 using a shared connection via a Brocade switch between a desk-side Origin 2000 and an Onyx system. This paper will present our findings on the throughput for each system using dedicated partitions and a CXFS shared filesystem.
Friday 6C 12:00 An Inexpensive Platform for Computational Chemistry Applications Haibo Wang, Ken Rosetti, and Amanda W. Wu (MCSR) We built a Linux cluster with large amounts of inexpensive disk and memory to run large Gaussian and GAMESS jobs that cannot finish within the regular system maintenance cycle and cannot restart from a checkpoint file. Performance benchmarking and tuning have been carried out, and a model for the production system has been tested and planned.
Friday 7A 2:00 Operating Systems SIG Open Meeting Chuck Keagle (BCS), Virginia Bedford (ARSC), Tina Butler (NERSC), John Mulholland (CSE), and Cheryl Wampler (LANL) This Open meeting will focus on IRIX, LINUX, and IRIX Security. Following brief introductions of the Focus Chairs, we will discuss any unanswered questions members might have concerning SGI, IRIX, LINUX, Origin 2000, Origin 3000, Onyx, and IRIX Security. SGI Liaisons and technical experts will be available to comment on various issues. We will also discuss CUG issues such as the SUMMIT format, presentation content, and ways to make CUG better serve its members.
Friday 7A 2:45 Programming Environments SIG Open Meeting Hans-Hermann Frese (ZIB), David Gigrich (BCS), and Guy Robinson (ARSC) The Programming Environments SIG invites you to attend its Open Meeting on Friday afternoon. After a brief introduction of the SIG's business and Focus areas, we shall start off with a live discussion on various issues concerning Programming Environments, Compilers and Libraries, and Software Tools. Attendees will have the opportunity to discuss their concerns with the liaisons and technical experts from SGI. Your feedback on the SIG's business and recommendations on future activities will be gratefully acknowledged.
Friday 7B 2:45 Communications and Data Management SIG Open Meeting Kevin Wohlever (OSC) and Paul Anderson (DOD) This will be a SIG meeting during the SGI portion of the conference. Normal SIG business will be done, including getting additional volunteers to assist with the SIG.
