Neil Bannister and Laraine MacKenzie, Strategic Software Organization, Silicon Graphics, Inc., 655F Lone Oak Drive, Eagan, MN 55121, USA
ABSTRACT: Since the last CUG meeting DMF has been released for the SGI IRIX platform. This has presented the team with many challenges. This paper will review these challenges as well as status for the DMF, TMF and CRL products on both SGI and Cray platforms.
Porting the Data Migration Facility to IRIX has challenged the team in several areas, including work done for the file system and tape subsystem. Probably the biggest challenge has been the complete integration of DMF into the IRIX environment.
On a Cray platform, DMF has been able to leverage subsystems that were constructed to provide a production quality environment (with resilience, high performance, and rich system interfaces). Porting DMF to IRIX was very different. It consisted of creating a new file system interface and then interfacing to it, bolting up to the existing character-special tape interface, changing philosophy to an open strategy, and changing DMF to use new system interfaces. Integrating DMF into the IRIX environment presented challenges in file system, disk system performance, tape mounting, and integration with other standard services such as dump, restore and third-party backup solutions.
DMF 2.5.5 is the current release of DMF for Cray platforms. It is a much safer product than DMF 2.4 because it includes a new database package, Raima Data Manager (RDM), which supports the following:
For these reasons, we have been encouraging customers to move to DMF 2.5 since its release over a year ago. It offers customers a much more stable environment because it includes more fixes than previous versions and the new database. The new database does not require DMF to be taken offline for maintenance such as merging and compression.
We have also added a new capability to DMF 2.5.5 that enables a customer to migrate from an alternative "store" such as UNITREE. DMF 2.5.5 provides these tools and documentation.
In summary, DMF 2.5.5 is a good holding release for UNICOS and UNICOS/mk customers until they transition to DMF 3.0 or move to the IRIX platform.
DMF 2.6.1 was released last autumn and supports IRIX 6.2, 6.4.1, and 6.5. In porting DMF to IRIX we created a new file system interface that follows the DMIG Data Management Application Programming Interface (DMAPI) specification. This prepares DMF very well for its launch into the open platform space, where many DMAPI implementations are available.
During the port we also took the opportunity to update DMF in various ways, such as its installation, configuration, and release mechanisms. DMF now has a graphical user interface for initial installation and setup, and it installs using the standard IRIX installation tools. Also, DMF 2.6 is released via the World Wide Web (as well as being available on CD), which allows us to be very responsive to problems and to make new fixes available to the world in a matter of hours.
Today, DMF supports the standard XFS file system as well as the real-time file system (no automatic space management is available). A new disk media specific process (MSP) has been added and could be used to support fast disk caches, large slower (probably cheaper) disk caches or could be configured with other MSPs to create a hierarchy of storage media.
XFSdump and XFSrestore have been modified to understand the concepts of dual state files and offline files as they pertain to DMF. This will allow dumping and restoration of inodes and directories without the unnecessary movement of data blocks that DMF has under its control.
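The effect of making a dump utility aware of migration state can be illustrated with a minimal sketch. It is not the XFSdump implementation; the names `MigrationState`, `FileEntry`, `plan_dump`, and `bytes_moved` are hypothetical and exist only for this illustration. The sketch assumes three states: a regular file whose only copy is on disk, a dual state file whose data is on disk and also held by DMF, and an offline file whose data is held only by DMF.

```python
from dataclasses import dataclass
from enum import Enum


class MigrationState(Enum):
    REGULAR = "regular"    # data lives only on disk
    DUAL_STATE = "dual"    # data on disk and also held by DMF
    OFFLINE = "offline"    # data held only by DMF


@dataclass
class FileEntry:
    path: str
    size_bytes: int
    state: MigrationState


def plan_dump(entries):
    """Return (path, dump_data_blocks) pairs; metadata is always dumped."""
    plan = []
    for entry in entries:
        # Only files whose sole copy is on disk need their data blocks dumped.
        dump_data = entry.state is MigrationState.REGULAR
        plan.append((entry.path, dump_data))
    return plan


def bytes_moved(entries):
    """Total data bytes the dump would copy under this plan."""
    return sum(e.size_bytes for e in entries if e.state is MigrationState.REGULAR)
```

Under these assumptions, a file system holding mostly dual state and offline files produces a dump whose size is dominated by inodes and directories rather than by file data.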
DMF on IRIX also supports a basic set of tape library robots and tape drives. This has been accomplished by adding a tape mounting service interface from TMF and using the IRIX character-special SCSI tape device driver. This temporary solution will be replaced when DMF 2.6.2 supports both TMF and OpenVault. DMF also supports manual mount tape drives via the msgdaemon/OPER interface and provides a command, dm_tape, which allows tapes to be mounted for use with other UNIX commands while remaining synchronized with DMF.
In summary, DMF 2.6.1 is a very capable product, but at this time it is unable to exploit the rich tape services of UNICOS. It also requires further enhancements in the area of graphical interfaces for the administrator and closer integration with third-party backup products. Many of these deficiencies are currently being addressed.
There are two types of platform migration for DMF:
Migration from UNICOS DMF to IRIX DMF should be very well planned. We strongly urge you to contact SGI/Cray support staff to assist in the migration. There are many different factors that need to be decided ahead of the actual migration such as access methodology for the new platform. Do you want to move small files? Migrate large files? Is the source server available after the migration? Several issues regarding tape drive performance and disk space considerations also exist. There are many detailed considerations in a migration of this kind, because most, if not all, of the data is NOT being moved between the platforms.
Migration from a non-DMF "store" to DMF is more straightforward. Here, the data is actually being moved from one system to another, and depending upon certain criteria, this reduces the amount of planning detail that must be decided in advance of the migration.
For this type of platform migration, a meta-data picture of the alternative "store" is captured and used to populate the DMF databases and create inodes in the file system on the DMF platform. Now users can access their files on the new platform and DMF will automatically move them over via FTP. This has the advantage of offering the administrator several ways in which to move user files. They can be moved whenever a user accesses them, or they could be moved on a per-user basis or on a per-file system basis.
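The flow described above (capture metadata, populate the new platform, then move data on access or in bulk) can be sketched as follows. This is an illustrative model only, not DMF code: the class `MigrationDomain`, its methods, and the injected `fetch` callable are hypothetical, and `fetch` merely stands in for the FTP transfer from the alternative "store".

```python
class MigrationDomain:
    """Toy model of metadata-first migration with on-demand data movement."""

    def __init__(self, fetch):
        self.fetch = fetch   # callable(path) -> bytes; stands in for the FTP transfer
        self.catalog = {}    # path -> owner, populated from the captured metadata
        self.local = {}      # path -> data already moved to the new platform

    def populate(self, metadata):
        """Create catalog entries from (path, owner) pairs; no data is moved."""
        for path, owner in metadata:
            self.catalog[path] = owner

    def read(self, path):
        """Return file data, moving it from the old store on first access."""
        if path not in self.catalog:
            raise FileNotFoundError(path)
        if path not in self.local:
            self.local[path] = self.fetch(path)
        return self.local[path]

    def move_user(self, owner):
        """Move every not-yet-moved file belonging to one user; return the paths."""
        moved = []
        for path, file_owner in self.catalog.items():
            if file_owner == owner and path not in self.local:
                self.local[path] = self.fetch(path)
                moved.append(path)
        return moved
```

In this model a per-file system move would be the same loop as `move_user` with a path-prefix test in place of the owner test, which is why the three movement policies can coexist.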
In summary, DMF is well positioned to support platform migration from UNICOS to IRIX DMF or from an alternative "store" to DMF. These capabilities are currently released with DMF for both UNICOS and IRIX.
After the port of DMF to IRIX was operational, we faced various file system and disk subsystem performance and behavior issues. We implemented several methods to bypass these issues:
It is clear from the last year of DMF exposure on IRIX that it needs to integrate with several third-party backup packages, the most prevalent being Legato's NetWorker. It would be unreasonable of us to think that we could have each backup vendor modify their software to know all about the details of dual state and offline files. To solve this problem, we are implementing an API that will hide the semantics of the various IRIX file systems (XFS, real-time, DMAPI) and make the integration of third-party backup packages straightforward. We are currently talking with several major backup vendors that are interested in using this new API on IRIX.
The DMF project is currently working on providing support for IRIS FailSafe, which will allow a second, dormant DMF on another FailSafe cluster machine to be started when FailSafe fails over to that machine. To be clear, this is not application failover to another machine, but rather failover as directed by the IRIS FailSafe product. A new DMF FailSafe license will be required for the second, dormant DMF platform. At this time, SGI does not support SCSI switches, but we are investigating their use and consequent support in this DMF-FailSafe environment to allow the automatic switchover of SCSI tape drives.
DMF futures can be considered as three phases:
Immediate Futures:
The immediate future for DMF is the 2.6.2 release. The major focus of this release is the support of the TMF and OpenVault tape mounting services. DMF 2.6.2 is in beta test with TMF and OpenVault this summer and should release during 3QCY98.
The following features are also included in DMF 2.6.2 release:
DMF is also being integrated with other products. Products such as Studio Central and MediaBase (from SGI) are asset management systems used in the film and broadcast industry. DMF also works with PC/MAC integration products such as XINET, HELIOS and SYNTAX-TAS.
The DMF 2.6.2 architecture is shown below:
Throughout 1998 more tape drive and tape robot support will be added, including support for SONY drives and robots as well as FUJITSU and STK tape drives. Support for smaller robots will grow as they are added to the list of products supported by OpenVault.
Looking further out, enhancements are planned for DMAPI so that it supports Cellular IRIX, Fenceposts and Multiple Managed Regions.
The current version of DMF is a very capable HSM product on a single platform. However, the enterprise file serving environments consist of machines with tape drives, machines with file systems to be managed, and machines with one or both of the above that a system administrator wants to manage as a single migration domain. The picture below shows the type of environment.
Here there are workstations with file systems to be managed, servers with tape drives, servers with file systems to be managed and servers with both file systems and tape drives. All of these need to be managed as one entity. DMF 3.0 will solve this problem. The architecture of DMF 3.0 is shown below:
DMF 3.0 architecture has three major components:
This architecture allows DMF to be split up and run on a set of machines, with the individual components running on any supported platform. It also satisfies the major objective for the DMF 3.0 program, which is to be able to manage a distributed collection of machines with distributed storage devices as a single migration domain.
The Open Systems Strategic goals for DMF can be expressed as three phases of product life:
Further enhancing the DMF Open Systems Strategy is SGI's support for the ANSI C21.1 standard. SGI is participating as an active member in the construction of this standard, which defines a methodology for the import/export of meta-data between HSM vendors. It also alleviates the need for vendors to publish proprietary data formats, thereby protecting their intellectual property while giving customers a way to move data between vendors' HSM offerings, so that they are not trapped if a vendor exits the market.
The Cray tape subsystems on UNICOS and UNICOS/mk are mature products, but from time to time they still require a significant injection of resources to fix problems. This has been particularly true in the UNICOS/mk environment. For the GigaRing environment we have implemented support for Peer-up/Peer-down error recovery. This still requires QA and should be available during 3QCY98. The future for tapes on UNICOS and UNICOS/mk mostly looks like adding support for more tape drives. At this time the only drive being investigated is the STK Eagle. One possibility for tapes on UNICOS and UNICOS/mk is to provide interoperation with OpenVault running on an IRIX platform. This would give system administrators the ability to dynamically share tape drives across multiple hosts in a UNICOS and UNICOS/mk environment. However, at this time we are unsure of the demand for this capability.
The Cray Tape Management Facility (TMF), version 1.0, will be available during 2HCY98 on Silicon Graphics Origin2000 systems that run IRIX version 6.4.1. This release supports only the 64-bit architecture.
The goal of porting TMF from UNICOS to IRIX is to provide the same Cray tape subsystem capabilities on IRIX that Cray customers have been used to on UNICOS. The basic elements of IRIX TMF are the TMF daemon and the TMF tape device driver. TMF provides operating personnel with a means to view and manage the tape resources configured within TMF. It is also the backbone for the operation of DMF, XFSdump and XFSrestore, and Cray REELlibrarian (CRL); the CRL implementation is deferred. The basic feature list for IRIX TMF follows:
TMF on IRIX will provide support for the following tape autoloaders and tape devices on Silicon Graphics Origin2000 running the IRIX 6.4.1 release:
The Cray REELlibrarian product version 2.0.9 has just been released. This supports the T3E platform as well as the PVP Cray platforms. CRL is to be ported to IRIX during 2HCY98.