Karl Feind
kaf@cray.com
http://reality.sgi.com/kaf_craypark
The Message Passing Toolkit (MPT) and Cluster Products Group supports a variety of software products for parallel programming and application support on highly parallel and clustered Silicon Graphics and Cray computer systems. The MPT product comprises MPI, SHMEM, and PVM message passing software, which highly parallel applications use for application launch and inter-process communication.
IRIS FailSafe provides high-availability (HA) support for applications running on Silicon Graphics servers through automatic fail-over of applications from one server node to another. In combination with a RAID or mirrored disk configuration, an IRIS FailSafe cluster provides resilience against any single point of failure and acts as insurance against unplanned outages.
IRISconsole is the third major product supported by our group. The IRISconsole multi-server management system manages multi-server clusters from a single workstation with an easy-to-use interface. It manages and monitors servers and logs their activity. IRISconsole performs intelligent actions based upon this information and also allows remote console access.
The most recent IRIS FailSafe release was version 1.2, released in 1997. This release provided fail-over capability for paired server nodes.
The next planned release will be IRIS FailSafe 2.0. A beta release is planned for June 1998, with general product release later in the year. IRIS FailSafe 2.0 will support more generalized fail-over configurations, with the limit of two nodes increasing to eight nodes. This will provide more flexibility for highly available applications because any node in the group of up to eight machines can serve as the fail-over system for one or several of the other nodes and applications in the group.
For more information about IRIS FailSafe, see http://www.sgi.com/Products/software/failsafe.html.
The current IRISconsole release is version 1.2, which was primarily a maintenance release without many new features. The upcoming 1.3 release will include the following features:
For more information about IRISconsole, see http://www.sgi.com/products/remanufactured/challenge/ti_irisconsole.html.
We support the MPI, PVM, and SHMEM message passing models. All are delivered in the MPT software release, with the exception of SHMEM on CRAY T3E systems, which is delivered as part of the CrayLibs and Programming Environment software releases.
Message Passing Model | Hardware Platform | Release Package |
---|---|---|
MPI | all | MPT |
PVM | all | MPT |
SHMEM | T3E | Programming Environment |
SHMEM | MIPS and PVP | MPT |
Message passing software is supported on all Silicon Graphics and Cray computer systems.
For more information about Message Passing Software, see http://www.sgi.com/Products/software/mpt.html.
The most significant Message Passing release in 1997 was the June MPT 1.1 release. This was the first time the MPT product was released on IRIX systems. The MPI and PVM message passing models had previously been released independently, and bringing them together with SHMEM in the MPT release package on IRIX systems provided more product consistency across IRIX, CRAY T3E, and CRAY PVP systems.
Major features released with MPT 1.1 included:
The T3E version of the SHMEM library was enhanced in late 1997 in CrayLibs 3.0.1, 3.0.1.2, and 3.0.2 revision and update releases.
SHMEM_GROUP_CREATE_STRIDED added.
There were two all-platform MPT software releases in early 1998, and a third release is planned for June 1998. Releases 1.2, 1.2.0.2, and 1.2.1 provide features chiefly for IRIX platforms, with a few features that enhance T3E and PVP message passing.
The following features were added in January 1998 in the MPT 1.2 release.
MPI_ENVIRONMENT environment variable. mpirun and array services set this environment variable for the benefit of .cshrc or .profile start-up files.
mpirun offers the -p option to add optional prefixes to lines written by MPI processes to stdout or stderr.
shmalloc, shfree, and a set of SHMEM collective communication routines.
SMA_DSM_OFF, SMA_DSM_VERBOSE, SMA_DSM_PPM, PAGESIZE_DATA, SMA_SYMMETRIC_SIZE, and SMA_VERSION.
PVM_RSH environment variable.
The following features were added in March 1998 in the MPT 1.2.0.2 release.
shmem_barrier_all performance improved on Origin 2000 systems.
The following features are added in the MPT 1.2.1 release.
MPI_Type_get_contents, MPI_Type_get_envelope, and MPI-2 error codes are added.
shmem_set_lock and shmem_clear_lock are added.
MPI_COMM_WORLD.
One of the requirements of the message passing libraries on Silicon Graphics and Cray systems is to provide inter-process communication with lowest possible latency and highest possible bandwidth. This fundamental requirement provides a motivation to continue adding performance enhancements to the message passing libraries.
The enhanced bandwidth for point-to-point communication on T3E-1200 systems was achieved by enhancing the SHMEM get algorithm. The following graph shows the effective bandwidth obtained over varying transfer sizes on the CRAY T3E and the CRAY T3E-1200.
A common practice in message passing programs that use SHMEM is to use barrier synchronization to separate computational and communication phases. The barrier synchronization must be as fast as possible for these programs to get the best performance. The following graph shows how Origin SHMEM barrier synchronization time has been continually improved in the MPT releases over the past year.
shmem_barrier_all synchronization time
The following graphic compares bandwidths and latencies for MPI and SHMEM message passing on Origin 2000 systems.
The MPT and cluster group will be enhancing message passing software in the coming year to provide many customer-requested features. The chart in this section outlines the planned features for the coming years. These are not commitments. These are target plans that might change.
4Q98 | 1Q99 | 3Q99 | 1Q00 |
---|---|---|---|
In the past year, MPT and cluster software has been enhanced to better support all Silicon Graphics and Cray hardware platforms. The most significant effort has been made recently for the Origin 2000 system to support increasing CPU counts and to support clustered Origin 2000 configurations. Future enhancements to message passing software will seek to balance the needs of large clustered and non-clustered Silicon Graphics and Cray systems.
Karl Feind is a Core Design Engineer in the MPT and Cluster Products group, where his primary responsibilities are support of Distributed Shared Memory (SHMEM) data passing and Message Passing Interface (MPI). In the past, Karl's responsibilities have included the Cray Fortran I/O and Flexible File I/O (FFIO) libraries.
Home page: http://reality.sgi.com/kaf_craypark
E-mail address: kaf@cray.com