Security, Test and Evaluation:
ARSC Experiences with CRAY and SGI Systems

Virginia Bedford
Arctic Region Supercomputing Center (ARSC)

virginia.bedford@arsc.edu
http://www.arsc.edu

ABSTRACT:
This paper describes the experiences of the Arctic Region Supercomputing Center in preparing for and undergoing several formal Security Test and Evaluations of UNICOS and IRIX systems. Specific findings and resolutions are presented. Suggestions for determining the appropriate level of security for a site, how to reach that level, and how to maintain that level are included.
KEYWORDS:
security UNICOS IRIX monitoring vulnerability

Slides for this paper are available in Acrobat PDF format.



What is a Security, Test and Evaluation?

The Arctic Region Supercomputing Center has benefited from several security reviews over the past three years. These have included self-reviews, assistance reviews by consultants hired by ARSC, and examinations by funding agencies. ARSC does no classified computing, but provides high performance computing and visualization services to the Department of Defense and to faculty, researchers and students at the University of Alaska and their colleagues around the country. Because of ARSC's relationship with the Department of Defense's High Performance Computing Modernization Program, a Security, Test and Evaluation (ST&E) is required after any major system modification or upgrade, and a Security Assistance Visit (SAV) is required every year. ARSC is connected to the Internet and to the Defense Research Engineering Network. The center is part of the University of Alaska and is located on the University of Alaska Fairbanks campus. Systems that have been examined include a CRAY T3E, a CRAY Y-MP (since removed), a CRAY J90, 60 SGIs ranging from Indys to Onyx2s, and Sun workstations used as system support machines: an SWS for each CRAY, a silo workstation, and several Suns for network management. The SGIs run a mix of IRIX 6.2, 6.3, 6.4 and 6.5; full conversion to IRIX 6.5 is in progress. MLS is not used extensively beyond its inherent existence as part of UNICOS 10, but other security software packages, including Kerberos and Secure Shell, have become fundamental components of the ARSC operating system environment.

An ST&E is a formal review of a computing environment with the goal of determining areas which need attention with respect to security. Other names for this process include Vulnerability Assessment, Audit, or Risk Assessment. The deliverables of such an examination will be recommendations for changes or improvements as well as notice of areas in which good practices are already being followed. An ST&E is differentiated from an SAV by approach. According to the review teams, "The SAV is . . . aimed at improving security across all of its provider sites . . . . The team will be there to help you improve your overall security posture." On the other hand, "ST&Es . . . will certify the security accreditation your site has received. . . . Unlike the SAV, the main goal of the ST&E is not to educate or achieve long term improvements in a site's security posture. It is an inspection to certify that a site is currently operating in a manner consistent with the accreditation it has been granted." In some situations an ST&E may be used as the basis for disciplinary action or other consequences.

These formal reviews cover several security disciplines: physical, administrative, personnel, network, systems, and information systems. Only network and information systems security will be discussed in this paper. Formal categorized "findings" and observations describe anything that is determined to present a vulnerability. A finding is a brief description of the problem, a reference to a public law or policy, and a recommendation for how to repair it. The findings have not always been clear; interpretation has often been based on extended discussions with the examiners. A response is required to each finding within 120 days, describing how each problem has been fixed, or a statement that a risk is accepted. In some cases ARSC disagreed with the finding and accepted the risk; in others ARSC agreed but site configuration or other issues prevent resolution. Vulnerabilities are divided between those for which a local account is required, and those that may be exploited through network access. Findings are separated into four (4) categories. Category I findings must be fixed immediately or the site would be closed down. As many findings as possible are fixed before the team leaves the site. Even when findings are fixed during the examination period, they are included in the final reports.

Figure 1. Categories of Findings by DoD
  • Category I: Critical - Loss of life and/or $1M in damage and/or INFOSEC loss of secure operations.

  • Category II: Urgent - Potential loss of life and/or $100K in damage and/or potential INFOSEC loss of secure operations

  • Category III: Routine or degraded operations

  • Category IV: Enhancement

Rules describe what level of security is desired and what is unacceptable. They vary depending on the needs of an institution and expectations of its users. These rules should be clear and in writing. The fundamental guidelines or rules established by the institute, users, or funding agencies are the basis against which the review is performed. Fundamental principles upon which the ARSC security policy is based include requiring individual accountability, limiting the system information presented to non-users over the network and limiting overall access to the needs of authorized users. The review teams based their findings on government policies and public law as follows:

  1. Department of Defense Directive (DoDD) 5200.28 Security Requirements for Automated Information Systems
  2. OMB A-130 Management of Federal Information Resources
  3. Federal Information Processing Standards (FIPS) 102 Guidelines for Computer Security Certification and Accreditation
  4. Public law
  5. Local policies

To prepare for these security examinations ARSC performed self-reviews, and hired consultants from two companies. These companies were chosen because they were most familiar with the particular details of the formal ST&E for which ARSC was preparing. All visits resulted in some improvement. Initial visits resulted in dramatic improvement. The most effective examinations consisted of a systematic review of multiple aspects of the computing environment and provided suggestions for improvement. Varying levels of proficiency and experience of the reviewers and differing styles of communication with staff impacted the quality of the reviews on occasion.

Whether a review is imposed or invited, it consists of a detailed examination of the current state of the environment as well as an assessment about whether the policies and processes of the organization are in place to ensure that the site can maintain and improve this level. A snapshot of a currently secure system must be reinforced by the methods that will make sure it stays that way during the next system upgrade and through staffing changes. The capability of achieving the desired level of security is dependent on the staff involved, their skills and knowledge. Because security issues change, they must be provided formal training opportunities as well as time to hone their skills. Good system administration practices are the basis for a secure system. These include understanding of the operating systems supported, strong configuration, change and problem management, and automated and consistent procedures backing up clear site policies.

Since May of 1996, ARSC has benefited from two formal SAVs, two formal ST&Es, two onsite visits by consultants and one remote vulnerability test. For each visit, teams ranging from one to five people spent up to a week at the center. A final presentation is typically made on the last day and a formal report is provided to management with findings and recommendations. For ST&Es ARSC has been required to respond to each finding within a limited period of time in order to maintain its Authority to Operate (ATO).

Review team members were given local user accounts with no special privileges. When privileged operations were needed, such as examining log files, examiners worked through the hands of local system administrators. All aspects of system configuration were examined, known exploits were attempted and logs were reviewed. Local policies were studied and examiners worked with and interviewed system administrators to ensure that those policies and procedures were followed. Reviewers also connected portable computers to the network to do network vulnerability testing, network port scanning, and network sniffing. Port scanning is intended to discover open network services which may be running on a platform and could provide a portal into the system other than those intended. Some of the reviewers ran proprietary tools to do systems checks although they were not always able to interpret the results. In addition to overall review, gaining root access is, of course, one of their major goals. This would fully demonstrate that they had discovered a vulnerability, exploited it, and taken control of an ARSC system.

A site Point of Contact is designated and site personnel are available for the entire visit. At least one staff person stayed with the consultants at all times. The week typically began with a management briefing session where some ground rules were set, followed by larger staff meetings describing plans and making staff introductions. These in-briefs were especially valuable during initial reviews when staff didn't know what to expect. Daily debriefing sessions were held at the end of each day to review activities, discuss vulnerabilities discovered and plan for the next day. Staff were closely involved during the preparation reviews sponsored by ARSC. The experiences provided training for staff as well as system assessment. An alternate method for vulnerability assessment is to exclude local staff and perform a blind test. This is against ARSC site policy unless management approval has been obtained. The reviews were most valuable when staff were most involved.

Individual Accountability

Individual user accountability is a fundamental tenet of ARSC policies. This means that there should be no accounts for which the password is known by more than one person. It also means that there should be no way for one user to do anything that would change another user's behavior without that user's knowledge and control. Each user is responsible for maintaining the integrity and security of his or her account, files and crontabs. It also means that every file has an owner, and that the owner is responsible for each file and its contents. Group accounts, in which more than one person uses the account and the password is shared, are not permitted.

File Ownership. Files which have no owner are flagged, and the user in whose home directory they reside is contacted for suggestions for remedy. This situation can occur if a user places files in a location other than his home directory, and then his account is removed from the system. The owner of the home directory in which these files reside cannot change the ownership, although he may be able to copy the file and remove the original. On occasion, system upgrades on both SGI and CRAY platforms also resulted in nouser or nogroup files in the system areas.
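A periodic scan for such files can be sketched as follows. This is an illustration only, not the tool ARSC used: the function names are hypothetical, and it assumes that "no owner" means a uid or gid that no longer maps to an entry in the password or group database.

```python
import grp
import os
import pwd


def _uid_known(uid):
    """True if the uid maps to an entry in the password database."""
    try:
        pwd.getpwuid(uid)
        return True
    except KeyError:
        return False


def _gid_known(gid):
    """True if the gid maps to an entry in the group database."""
    try:
        grp.getgrgid(gid)
        return True
    except KeyError:
        return False


def find_unowned(root, uid_known=_uid_known, gid_known=_gid_known):
    """Yield (path, uid, gid) for entries under root with no valid owner or group.

    The lookup functions are parameters so that a site could substitute
    its own account database (an assumption made for this sketch).
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: report the link itself, not its target
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            if not (uid_known(st.st_uid) and gid_known(st.st_gid)):
                yield path, st.st_uid, st.st_gid
```

Run over a user file system, the results could be grouped by the home directory containing each path to decide whom to contact.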

During the initial review, the IRIX kernel configuration allowed users to change ownership of files to accounts other than their own. This is apparently the default. This was determined to be unacceptable based on the policy of individual accountability and the value was changed in /var/sysgen/stune.

Default:

restricted_chown = 0 sysV style chown(2), non super-user can give away files

Changed to:

restricted_chown = 1 bsd style chown(2), only super-user can give away files

File Permissions. Many system and user files need attention with respect to permissions. The reviewers noted that "Passwords are the first line of protection. File permissions form the next line of defense, against hackers that succeed in breaking into an account and legitimate users trying to do something they're not supposed to. Properly set up file permissions can prevent many potential problems."

The ARSC account creation process was changed so that initial files are now created with owner access only. All umasks are set to a default of 077, except on the web server. This is intended to ensure that any additional group or world access has been added by the user. The owner of the file has full control and accepts responsibility for changes to the file. There are no restrictions if a user chooses to change his personal default umask or if a user adds additional permissions except for particular cases described later.

For both IRIX and UNICOS, the default ftp umask is also set to 077 by adding the flag in /etc/inetd.conf:

/sbin/ftpd -l -u077 -a

Under IRIX the default umask is set in /etc/default/login:

# Default umask, in octal.
UMASK=077
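The effect of a 077 umask can be worked through with a few lines of arithmetic. The sketch below is illustrative only; the 022 value is included purely as a comparison point and is not taken from ARSC's configuration.

```python
def effective_mode(requested, umask):
    """Permission bits that remain after the umask clears its bits."""
    return requested & ~umask & 0o777


# Programs typically request 0666 for new files and 0777 for new directories.
for label, requested in (("file", 0o666), ("directory", 0o777)):
    for umask in (0o022, 0o077):
        mode = effective_mode(requested, umask)
        print(f"{label:9s} umask {umask:03o} -> {mode:03o}")
```

With umask 077 a new file ends up with mode 600 and a new directory with mode 700, so any group or world access seen later must have been added deliberately.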

The security review teams have recommended that no directories on the system, whatsoever, have world-write permissions. The dangers include accidental or malicious file change or damage, and that the world-writable directory could be used as a hiding place. World-writable files have not been forbidden as policy for users, but world-writable user files are identified via a tool called Tiger. Users with large numbers of world-writable files may be contacted to make sure that they are purposely setting these permissions.

Appropriate world-writable directories in system areas should have the sticky bit set, as with /tmp, to ensure that only the user who creates files and directories may remove them.

UNICOS includes several world-writable files and directories, however, which do not follow these rules. It is possible, after research, to modify the permissions on some.

CRL (CRAY/REELlibrarian) uses /usr/spool/crl for debugging. As specified by the documentation, it is world-writable. The key files in that directory are the catalog journal (a file named in the form yy.mm.dd) and the server log (named RL.RL). In addition, the core_RL directory resides within /usr/spool/crl. If RLLOGDIR is not defined in the /etc/config/reelenv file, the server log will go in RLLIBDIR, currently defined as /usr/crl, which is not world-writable. Finally, if RLLOGDIR is not defined, cores will go to either /usr/tmp or /usr/tmp/core_RL. The catalog journal can still go to /usr/spool/crl, but the permission should be changed to 775. The journal should still go to a file system other than where the CRL database resides. At ARSC this was reconfigured to go to /usr/tmp.

NQS (Network Queuing System) includes a series of world-writable files. SGI tells us this is normal; ARSC has not taken the time to pursue it further.

T3E<180> sudo ls -Rl /usr/spool/nqs/private/root
total 640
prw------- 1 root bin 0 Jul 17 1997 FIFO
prw------- 1 root bin 0 Jul 17 1997 LOGFIFO
drwx------ 2 root bin 4096 Mar 19 1997 chkpnt
-rw-r----- 1 root bin 286616 Mar 19 1997 col.out
drwxrwxrwx 2 root bin 4096 Jul 16 1997 control
drwxrwxrwx 2 root bin 4096 Jul 16 1997 data
drwxrwxrwx 3 root bin 4096 Jul 16 1997 database
drwxrwxrwx 2 root bin 4096 Mar 19 1997 database_qa
drwxrwxrwx 2 root bin 4096 Jul 17 1997 database_qo
drwxrwxrwx 2 root bin 4096 Mar 25 1997 failed
drwxrwxrwx 2 root bin 4096 Jul 17 1997 interproc
drwxrwxrwx 2 root bin 4096 Jul 16 1997 output
drwxrwxrwx 2 root bin 4096 Mar 19 1997 reconnect

The DMF FTP MSP on the T3E gave migrated files on the destination host world-write permission until a umask was specified in /usr/dm/ftp_info.

. . . SITE umask\ 077 idle\ 30

Other UNICOS system files that were delivered with world-write permissions included:

drwxr-xr-x root bin /usr/src/uts/cf.9634
drwxrwxrwx root bin /usr/src/installed
drwxrwxrwx root bin /usr/lib/array
drwxrwxrwx root root /installed
-rwxrwxrwx root bin /dev/mkdev.sh
-rw-rw-rw- root root /raidscript
-rw-rw-rw- root root /installed/uni.VIF
-rw-rw-rw- unknown root /tmp/diagccmt.stdout
-rw-rw-rw- unknown root /tmp/diagccmt.stderr
-rw-rw-rw- root bin /usr/src/installed/SRC.VIF
-rw-rw-rw- root bin /etc/install/.gigaring
-rw-rw-rw- root root /usr/lib/cron/cronlog
-rw-rw-rw- root root /skl/usr/lib/cron/cronlog
drwxrwxrwx root bin /usr/src/uts/cf.9634

This is not an exhaustive list because the Crays were not fully examined at installation and some changes were not tracked. A system review at initial system installation, and regularly thereafter, using find commands or a tool like Tiger will identify world-writable files that may need attention.

In most cases some research should be done to confirm that file permissions can be changed without unforeseen ramification.
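The kind of periodic review described above can be sketched in a few lines. This is not Tiger and not an ARSC script; it is a minimal illustration, with hypothetical function names, of separating world-writable files, world-writable directories that carry the sticky bit, and world-writable directories that do not.

```python
import os
import stat


def classify(mode):
    """Label a world-writable entry by its st_mode; return None otherwise."""
    if stat.S_ISLNK(mode):
        return None  # permission bits on symbolic links are not meaningful
    if not mode & stat.S_IWOTH:
        return None  # not world-writable
    if stat.S_ISDIR(mode):
        # The sticky bit limits removal to the owner of each entry, as with /tmp.
        return "dir-sticky" if mode & stat.S_ISVTX else "dir-no-sticky"
    return "file"


def find_world_writable(root):
    """Yield (label, path) for every world-writable entry under root."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            label = classify(mode)
            if label:
                yield label, path
```

Entries labeled "dir-no-sticky" in system areas are the ones most in need of research before any permission change is made.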

Dot (Environment) Files. "Dot" files such as .login, .profile, .rhosts, etc. are executed automatically to set up a user's environment when the user logs in, or invokes a particular command. Modification of these could undermine a user's control of his account without his knowledge. Specified "dot" files are checked regularly for permissions and content to ensure that no account sharing occurs and that permissions do not allow someone other than the owner to access the file. When these policies and reviews were first undertaken, user environment files were extremely out of compliance. It took a significant effort to develop and review policies, and initially contact users. Once the files were cleaned up, it has been fairly easy to keep them this way.

Early on we were under the impression that under UNICOS, MLS counteracted .rhosts files. That is, even if a user put an unacceptable host in their .rhosts file, if it was not in the /etc/hosts.equiv file, then it wouldn't work. We later found that this was not the case, but are not sure when or if this was a change in behavior.

Environment dot files created by third party software are also checked. For example, Alias|Wavefront created an environment file in the user's home directory called .dirrc, with world-write permissions. Even if the user removed these, the next invocation of the product added world-write again. Alias|Wavefront was contacted and eventually provided a special mod to change this. The problem is that .dirrc contains paths for executed programs.

In addition to the .rhosts file, which could introduce a significant vulnerability if contents are set up incorrectly, other dot files and the home directory also provide this potential. If any of these are group- or world-writable, then individual accountability can be lost. Read access to some files is also prohibited. An initial list of appropriate environment files was identified, appropriate permissions were determined for them, and a Perl tool called the "dotfile checker" was developed to automatically review permissions for all of these files. The contents of some were reviewed (.rhosts, .shosts and .netrc). When anomalies are detected, no changes are made to the files themselves, nor are they removed; all remedial actions are done by the user. This serves both to educate the user and also to protect against problems that might occur when making the change to the user's file. Defined response steps depend on the severity of the vulnerability and range from contacting the user in a timely manner to inactivating the user and changing permissions on the home directory to 700 so that no one else can access the data therein.

The "dotfile checker" runs hourly on all systems, and therefore typically flags a user fairly soon after the vulnerability is created. An immediate mail message is sent to User Services staff for a level 3 (high) vulnerability so that they can immediately contact and educate the user and avoid inactivating the account if possible. Notification of any level 1 (low) vulnerability is done each morning. Checks are performed on dot files for

In Figure 2, the first column lists all environment files examined for potential permission vulnerabilities. The second column holds nine digits, one for each permission bit in the order shown in the header (read, write and execute for owner, group, and world). A value of 0 means that the permission is acceptable. If the digit is 1, 2 or 3, then a file with that permission set violates ARSC's policy at the corresponding level. For example, if the .Xauthority file has group or world read or write permissions, this is considered a level 2 violation. Level 3 is most severe, level 1 is least severe.

Figure 2. Dot File Checker: Files and Acceptable Permissions
Permissions: u=owner; g=group; o=world; r=read; w=write; x=execute

Dot (Environment) file name    uuugggooo
                               rwxrwxrwx
.Xauthority                    000220220
.Xdefaults                     000020020
.Xresources                    000020020
.cshrc                         000020020
.dirrc                         000020020
.exrc                          000020020
.forward                       000020020
.kshrc                         000020020
.login                         000020020
.netrc                         000130130
.pgp                           000222222
.pgp/pubring.bak               000010010
.pgp/pubring.pkr               000010010
.pgp/randseed.bin              000110110
.pgp/secring.bak               000220220
.pgp/secring.skr               000220220
.profile                       000020020
.rhosts                        010030030
.sgisession                    000020020
.shosts                        010030030
.ssh                           000131131
.ssh/authorized_keys           000220220
.ssh/identity                  000230230
.ssh/identity.pub              000120120
.ssh/known_hosts               000130130
.ssh/random_seed               000120120
.tcshrc                        000020020
.xinitrc                       000020020
.xsession                      000020020
home directory                 000030030
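The way a nine-digit policy string from Figure 2 maps onto a file's mode can be sketched as follows. ARSC's dotfile checker was written in Perl and its source is not shown in this paper, so this Python fragment is only an illustration of the lookup. The function names are hypothetical, only four rows of the figure are copied in, and the content-based checks and level reductions of the real tool are not modeled.

```python
# A few rows copied from Figure 2; a real table would list every file.
POLICY = {
    ".Xauthority": "000220220",
    ".netrc": "000130130",
    ".rhosts": "010030030",
    "home directory": "000030030",
}

# Permission bit masks in the same left-to-right order as "rwxrwxrwx".
BITS = [0o400, 0o200, 0o100, 0o040, 0o020, 0o010, 0o004, 0o002, 0o001]


def violation_level(name, mode):
    """Highest policy level among the permission bits set in mode (0 = none)."""
    levels = [int(d) for d, bit in zip(POLICY[name], BITS) if mode & bit]
    return max(levels, default=0)


def marked_permissions(name, mode):
    """Render mode as rwx text with unacceptable permissions capitalized."""
    out = []
    for digit, bit, letter in zip(POLICY[name], BITS, "rwxrwxrwx"):
        if not mode & bit:
            out.append("-")
        elif digit != "0":
            out.append(letter.upper())  # bit is set and the policy forbids it
        else:
            out.append(letter)
    return "".join(out)
```

For example, `violation_level(".Xauthority", 0o640)` evaluates to 2 because group read is set, and `marked_permissions(".rhosts", 0o777)` evaluates to `rWxrWxrWx`.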

Figure 3 shows a sample email for a daily dotfile summary check. It lists the files reviewed, a report of anomalies, and the actions to be taken by User Services. Note that even if there is a level 3 vulnerability, it will be reduced to level 1 if the user is already inactive or for other mitigating circumstances. Levels 2 and 3 require immediate action.

Figure 3. Dot File Checker Daily Summary Email to User Services
Date: Fri, 19 Mar 1999 06:09:57 -0900 (AKST)
From: sysmon-chilkoot <sysmon>
To: accountman, consult, technical_services
Subject: Summary check of user files: chilkoot (vulnerabilities detected)

CHILKOOT
Details in chilkoot:/var/local/output/dotfiles/199903190608.rpt
------------------------------------------------------------------------------
Check of home directories and any of these dot files/dirs found therein:
.Xauthority .Xdefaults .Xresources .cshrc .dirrc .exrc .forward .kshrc
.login .netrc .pgp .profile .rhosts .sgisession .shosts .ssh
.ssh/authorized_keys .ssh/identity .ssh/identity.pub .ssh/known_hosts
.ssh/random_seed .tcshrc .xinitrc .xsession
------------------------------------------------------------------------------
1 P----- rw------- /u2/dale/.netrc ACTIVE
1a ----D- r-------- /u1/gene/.rhosts ACTIVE
1a ----D- rWxrWxrWx /u1/richard/.rhosts ACTIVE
1a ----D- r-------- /u1/kurt/.rhosts ACTIVE
1i ----D- r-------- /u1/liam/.rhosts INACTIVE 19981002

------------------------------------------------------------------------------
Response levels:
1 Follow internal procedures for user contact and escalation.
a Reduced to 1 because rhost is in arsc.edu domain.
h Reduced to 1 because home directory already has 700 permissions.
i Reduced to 1 because user is already inactive.
2 !! Change home directory privileges to 700.
3 !! Change home directory privileges to 700 and inactivate user.

Flags:
P Has a password in the file.
U Contains a user other than the owner of the file.
S Is a symbolic link.
H Is a hard link.
D Unacceptable rhost.
O File not owned by this user.

Permissions: Unacceptable permissions are capitalized.
------------------------------------------------------------------------------
Script: /usr/local/sbin/check.dotfiles
Log: /var/local/logs/dotfiles/check.log
Report: /var/local/output/dotfiles/199903190608.rpt
Email: accountman,consult,technical_services
Time: 199903190608


Crontabs. Ensure that only the owner of the crontab has the ability to modify the programs run within a crontab. Otherwise anyone who can modify the executable can take over the crontab owner's account.

Dot in Path. Having a dot ("." or current directory) in the execution path means that the user need only enter the command name for executables in the current directory. If the dot precedes system directories in the path, an executable which has the same name as a system binary will be executed. Thus if the user is lured or stumbles into a directory in which such a file has been placed, either accidentally or maliciously, the user will execute something other than what is intended. If the dot is behind system directories the user may still execute something unintentionally from an unknown location.

From the IRIX 6.5 man page for csh:

path The list of directories in which to search for commands. path is initialized from the environment variable PATH, which the C shell updates whenever path changes. A null word specifies the current directory. The default search path for normal users is: (. /usr/sbin /usr/bsd /bin /usr/bin /usr/bin/X11). For the privileged user, the default search path is: (/usr/sbin /usr/bsd /bin /usr/bin /etc /usr/etc /usr/bin/X11). If path becomes unset, only full pathnames execute. An interactive C shell normally hashes the contents of the directories listed after reading .cshrc, and whenever path is reset. If new commands are added, use the rehash command to update the table.

From the UNICOS man page for csh:

path Each word of the path variable specifies a directory in which commands are to be sought for execution. A null word specifies the current directory. If there is no path variable, only full path names will execute. The usual search path is /bin, /usr/bin, and /usr/ucb, but this may vary from system to system. For the super user, the default search path is /etc, /bin, and /usr/bin. A shell that is given neither the -c nor the -t option usually hashes the contents of the directories in the path variable after reading .cshrc and also each time the path variable is reset. If new commands are added to these directories while the shell is active, it may be necessary to specify the rehash command; otherwise, the commands may not be found.

The review team's observation noted that

New users had "." in path. Many systems had a "." in the root account PATH system variable. The "." should be removed. Additionally, some PATH variables have occurrences of two ":" characters next to each other, which functions as a ".", and should be fixed as well.
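A check for the cases named in this observation can be sketched briefly. The function name is hypothetical, and the sketch looks only at a colon-separated PATH value; it does not read any of the startup files in which the path may be set.

```python
def dot_positions(path_value):
    """Return the indexes of PATH entries that mean "current directory".

    An entry of "." names the current directory explicitly. An empty
    entry, produced by a leading colon, a trailing colon, or two colons
    next to each other, is treated the same way by the shell.
    """
    entries = path_value.split(":")
    return [i for i, entry in enumerate(entries) if entry in ("", ".")]
```

An empty result means no such entry was found. Index 0 means the current directory is searched before any system directory, which is the more dangerous case. Other relative entries such as "bin" also depend on the current directory and would need a separate check.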

There was agreement that neither root, nor any system administrator who has special privileges, should have dot in their path. Several systems included dot in path for system accounts, which needed to be fixed. On the silo workstation, the path for the system account oracle contained current directory at the end and the path for the system account acsss contained current directory at the beginning. These values are set in ~acsss/data/internal/release.vars. Before these were changed, StorageTek was contacted. Initially they stated that all Unix accounts have to have dot in path, but after further discussion they provided a formal response that they weren't necessary, so ARSC removed them. On the MWS dot was in the path for the root and mws accounts. Dot was in the root path for all SGIs except one for which the .cshrc file had been previously modified.

There was some disagreement about making this change for new and current users. The challenge was balancing concerns about possible vulnerabilities against concerns about the impact on users. These issues were considered extensively, and the policy was finally defined to remove the default dot in path for new and existing users. Our recommendation to users was that they not add it themselves but, if they did, that they add it at the end of the path. Some users objected initially and still do from time to time, but this has not been a major problem. There is clearly a potential danger from having dot in path, but some users find it more convenient to type the executable name without a path, even though the alternative is simply to enter "./<command>". Some users did not understand the concept of current directory and dot in path.

Figuring out where and how to change this default took a while since there are many places in which a dot could be, and was, added depending on the user's shell and method of access.

Figure 4. Where path can be set on IRIX systems.

user files

  • shell-specific
    • sh/ksh
      • $HOME/$ENV
      • $HOME/.profile
      • $PWD/.profile
      • $PWD/.kshrc
    • csh/tcsh
      • $HOME/.cshrc
      • $HOME/.login
    • bash
      • $HOME/.bash_profile
      • $HOME/.bash_login
      • $HOME/.bashrc
  • non-shell-specific
    • (any shell script can reset PATH for itself and its children)

system files

  • shell-specific
    • sh/ksh
      • /etc/profile
      • /etc/stdprofile
      • /etc/setuid_profile
    • csh/tcsh
      • /etc/.login
      • /etc/csh.cshrc
      • /etc/cshrc
      • /etc/stdcshrc
      • /etc/stdlogin
    • bash
      • /etc/profile
  • non-shell-specific
    • /etc/default/login
    • /etc/default/su
    • /usr/include/paths.h

Figure 4. Where path can be set on UNICOS systems.

user files

  • shell-specific
    • sh/ksh
      • $HOME/$ENV
      • $HOME/.profile
      • $PWD/.profile
      • $PWD/.kshrc

    • csh/tcsh
      • $HOME/.cshrc
      • $HOME/.login

  • non-shell-specific
    • (any shell script can reset PATH for itself and its children)

system files

  • shell-specific
    • sh/ksh
      • /etc/profile
      • /etc/setuid_profile
      • /usr/skel/.kshrc
      • /usr/skel/.profile
    • csh/tcsh
      • /etc/.login
      • /etc/csh.cshrc
      • /etc/cshrc
      • /usr/skel/.cshrc
      • /usr/skel/.login

  • non-shell-specific
    • /opt/modules/modules/modulefiles/*
    • /usr/include/path.h


Xhost versus xauth. A full review of the X Window System configuration was done. Some of the systems had "xhost +" set by default; the SGIs are delivered with "xhost +" set as the default in /usr/lib/X11/xdm/.Xsession.scripts. With xhost, which is based upon machine name, once a user adds a remote machine to his local xhost list, anyone on that remote system can run X Window applications on the local machine, including keystroke-capturing programs; "xhost +" allows any machine anywhere to do this. It is extremely dangerous. Further, if a user mistakenly directs his display to a remote machine, anyone at that machine can run his applications with his privileges.

At ARSC, use of xhost was replaced by xauth. xauth allows remote displays based upon a user's current login session and a cookie generated by xdm. There was some pain involved, but very little objection from users since the security vulnerability could be very easily demonstrated. A script that assists users in merging xauth cookies between systems was helpful. xhost was disabled by wrapping it with software that sends an email to User Services when it is executed. User education regarding xauth is, however, an ongoing process.

The default on the MWS and SWS was xhost +. Capability for using xhost was removed on the SWS by removing its permissions. Technical support staff began to use xauth on those platforms as on other systems. This caused complications in use of the various shared logins such as crayops and crayadm, but these were resolved. The system administrators and CRAY engineers need to switch between these shared accounts, which caused some initial confusion with merging cookies.

Figure 5. Script for merging cookies between systems: exauth
#!/bin/csh

# exauth: extract the current display and transfer to another machine
# input - $1 = remote machine
#         $2 = remote userid
#

# created: Mar 27, 1997 lforbes
# modifications:
#   19980630 switched to csh; file lock problems with .Xauthority under sh

# keep any file created by this script private to its owner
umask 077

set RMCHN=$1
set LMCHN=`/usr/bsd/hostname`
set RUSER=$2
set LUSER=`/bin/whoami`

# check for a help request
if ( "-h" == "$RMCHN" ) then
    echo "Usage: exauth remote_machine remote_userid"
    exit 1
endif

# check that two arguments have been supplied
if ( 2 != $#argv ) then
    echo "Usage: exauth remote_machine remote_userid"
    exit 1
endif

# extract the cookie for the local display and merge it on the remote machine
/usr/bin/X11/xauth -f ~${LUSER}/.Xauthority nextract - ${LMCHN}.arsc.edu:0 | \
    /usr/bsd/rsh ${RMCHN} "/usr/bin/X11/xauth -f ~${RUSER}/.Xauthority nmerge -"
set CHECK=$status
if ( 0 != $CHECK ) then
    echo "magic cookie possibly not merged on remote machine"
endif

Root Access

Control of root access may be considered the most critical and fundamental aspect of security, along with physical access to the systems. A person with root access can do anything, see anything, change anything, and hide his trail. The reviewers wanted to know the root access management policy. Exactly who had root access? Who decided that they could have it? How did they use it? When did they use it? When and how was the password changed? A fundamental requirement is that the root password never pass over the network and be used only at the console. This led to some challenges in getting practical work done when multiple people needed to use root, particularly on the CRAYs.

Over the past three years several tools have been used that provide alternatives to giving the root password to individuals. Later tools provided the ability to grant privileges at the command level. Commands are classified as passive or non-passive. A passive command is one that allows examining a file or its attributes without affecting it beyond the access time of the inode. Examples of passive commands include cat and ls. Non-passive commands may modify the system in some way. Examples include vi, cp, rm and mv. Some commands that may appear to be passive, such as view or more, are categorized as non-passive because it is possible to break out of them accidentally or deliberately and gain full root access.

Regular root access. With over 65 systems it is impractical to have separate root passwords for each, but common root passwords are not considered acceptable. In addition, it can be extremely cumbersome to change them all if a staff member who knows the root password leaves for any reason. Beyond logging that someone has used root (via su), there is no detailed logging of the actions performed.

zup. For UNICOS our onsite CRAY analyst implemented a tool named zup. A system administrator designated in the zup database could simply enter the zup command, enter his own password, and have full root privileges. Zup recorded when he entered and left zup mode, but did not log activity while in zup mode. An expiration date in the database allowed someone's zup privileges to be turned off quickly. Use was logged in a special zup log rather than through standard syslog facilities. Zup was never used under IRIX or on the Suns.

super. Super provides an alternative to zup and allows definition of specific commands for specific people. A system administrator defined in the configuration file enters super followed by the desired command, which must also be authorized in the configuration file. For non-passive commands he is also required to enter his own password. Super provides improved logging: each time a command is executed it is logged (without parameters). Super also gives the ability to provide a subset of commands to consultants. The configuration file is complicated to manage. No attempt was made to make this work under IRIX or on the Suns. Use of super is logged in a special super log rather than through use of standard syslog facilities.

sudo. Similar to super, sudo is a much more sophisticated and manageable tool. It still uses the system administrator's own password for access, provides extensive and flexible configuration including groups, and logs through standard syslog facilities, recording the command and its parameters and the path at time of invocation, with notification via email and logging of failed attempts. It also has a configurable time window that allows issuing multiple commands without reentering the password. ARSC has implemented sudo on all platforms and uses it exclusively for root access.

The advantages of these systems include not having to manage the root password, and the ability to easily disable an individual's root access without terminating their account or changing the password; a disadvantage is that there are now essentially multiple root passwords. If the password for anyone who has zup, super, or sudo access is obtained, it is the same as having the root password.

A particular concern was about passwords being sniffed. To respond to that possibility, client and server SecureShell (ssh) was installed on the CRAYs and SGIs and all staff with sudo privileges (i.e. root) were required to use this so that there would be no cleartext root passwords on the network. Cleartext passwords on the network are avoided with ssh, but compromise of a system administrator's account is still equivalent to compromise of the root account.

Not long after implementation of ssh, the HPCMP gave a mandate to move to Kerberos and SecurID for all user accounts. Elimination of cleartext passwords and implementation of one-time passwords via SecurID is the rule. This project on the CRAYs and SGIs will be discussed later in the paper, but for purposes of this discussion it meant that system administrators would now use a Kerberos passphrase and SecurID pincode to get connected to the system, and then use a sudo password to gain root access. With the plan to eliminate static passwords in /etc/passwd or /etc/shadow for all users, this meant that only system administrators still had a valid password field. This password can no longer be used for system access and is only used for sudo access.

There has been some debate as to whether this is acceptable. Sudo could be modified to use a file other than /etc/passwd, or could be modified to use SecurID. There does not appear to be a way to use sudo with both Kerberos and SecurID authentication. SecurID by itself does provide for a one-time password and requires that the system administrator have in his possession the SecurID card. Sudo with SecurID has been implemented on the IRIX and Solaris systems. It does not work with UNICOS because the libraries are not available.

Passwords and Kerberos with SecurID

In response to findings of the first Security Assistance Visits, ARSC made extensive changes in password control and management of passwords. Although ARSC site policy states that passwords must be of a certain length and contain certain kinds of characters, it is possible to set an unacceptable password. Because of NIS, there was little control of the password attributes. From the yppasswd man page:

New passwords must be at least four characters long if they use a sufficiently rich alphabet and at least six characters long if monocase. These rules are relaxed if you are insistent enough.

A password cracking tool called "Jack the Ripper" was run regularly and a wrapper was written for yppasswd that enforced the password criteria. Once the wrapper was implemented there were few instances of cracked passwords.
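The kind of check such a wrapper performs can be sketched as a small shell function. This is an illustration only, not the ARSC wrapper: the function name password_ok, the minimum length of 8, and the required character classes are all assumptions, since the paper does not state the actual site criteria.

```shell
#!/bin/sh
# Illustrative policy values (assumptions, not ARSC's actual criteria).
MIN_LEN=8

# password_ok CANDIDATE
# Returns 0 if CANDIDATE meets the illustrative policy, 1 otherwise.
password_ok() {
    candidate=$1
    # Reject anything shorter than MIN_LEN characters.
    [ "${#candidate}" -ge "$MIN_LEN" ] || return 1
    # Require at least one lowercase letter, one uppercase letter, one digit.
    case $candidate in *[[:lower:]]*) ;; *) return 1 ;; esac
    case $candidate in *[[:upper:]]*) ;; *) return 1 ;; esac
    case $candidate in *[[:digit:]]*) ;; *) return 1 ;; esac
    return 0
}
```

A wrapper would call a check of this kind on the proposed password and hand control to the real yppasswd only if it succeeds.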

The reviewing teams also recommended that static passwords be protected with shadow files. This is already the case with the UDB on the CRAYs, but was not possible with NIS on the SGIs. As an alternative, root passwords were completely removed on most SGIs and replaced with a trusted master host and .ssh. Because of the use of sudo the root passwords were no longer necessary. This change eliminated encrypted (crackable) static passwords from /etc/passwd.

All of these efforts became moot, however, with implementation of Kerberos and SecurID. This software was installed in early 1999 and eliminated static login passwords for all users. Use of static passwords is replaced with use of a combination of one-time passwords (pincodes) through SecurID cards and Kerberos passphrases. This change was due to an initiative by the High Performance Computing Modernization Office that supports about 18 sites, all of which run CRAYs and/or SGIs.

Set-UID and set-GID files

One of the teams made it a formal finding that there were set-uid/set-gid files on the SGIs and CRAYs. What they were looking for in our response was not that all set-uid files be removed, but that the potential dangers had been evaluated and the risk accepted. Before the risk could be accepted it had to be assessed, and it appeared to be fairly high on the SGIs. For a period of time under IRIX 6.2, 6.3, and 6.4 there were numerous security reports, through SGI and Bugtraq (a security discussion list), about buffer overflows in set-uid/set-gid programs. In a buffer overflow, a user gives the executable more input than the program is capable of handling. If the program does not handle the overflow correctly, the user may end up with the privileges of the alternate account.

Set-uid and set-gid executables allow a user to run with alternate privilege levels to perform defined activities. If a user can "break out" in the middle of this defined activity, the user may be able to retain these privileges to take additional actions beyond the original limited scope.

Many programs appeared to be unnecessarily set-uid; reports of vulnerabilities were frequent in the security mailing lists. A fairly detailed review of all set-uid programs was performed. If the permissions were not determined to be necessary, they were changed. These changes were done carefully by an automated tool and configuration file, so that the original and new permissions were documented, could be reversed, could be checked regularly to avoid regression, and so that the reasons for changing them were documented. SGI was contacted for information about the purpose of permissions and risks of changing them.
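A review of this kind starts from an inventory of set-uid and set-gid files, which can then be compared against a saved baseline to catch regressions. The following is a minimal sketch using the standard find, sort and comm utilities; it is not the ARSC tool, and the function names list_setid and new_setid_files are hypothetical.

```shell
#!/bin/sh
# list_setid DIR
# Print, in sorted order, every regular file under DIR that has the
# set-uid (4000) or set-gid (2000) permission bit set.
list_setid() {
    find "$1" -type f \( -perm -4000 -o -perm -2000 \) -print 2>/dev/null | sort
}

# new_setid_files BASELINE DIR
# Print set-uid/set-gid files found under DIR now that are not listed in
# BASELINE, a sorted list saved from an earlier list_setid run.
new_setid_files() {
    list_setid "$2" | comm -13 "$1" -
}
```

For example, "list_setid /usr > /var/adm/setid.baseline" could be run once after a review, and "new_setid_files /var/adm/setid.baseline /usr" run periodically afterwards; the baseline path is only an example.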

The review was performed first on the SGIs, where the largest number of vulnerabilities was discovered. Set-uid/set-gid files were categorized as follows:

Necessary and Unnecessary Services

All systems had unnecessary services defined in /etc/inetd.conf. In many cases there are real or potential vulnerabilities associated with them. The reviewers recommended that anything not needed be turned off. In some cases it took a while to determine whether or not a service was needed; other cases were obvious. As with restricting the amount and kind of information about the system available to non account holders, there is the potential for tying the hands of system administrators in problem resolution. Knowing that the services could be reinstated if necessary made this of less concern. Be careful in turning off apparently unnecessary services and in testing. The reviewers noted that "Unnecessary ports are enabled on the HPC. Recommendation: Determine ports necessary for proper and secure operation. Disable privileged ports (ports below 1024) not required for operations (e.g., NFS, NIS). Remote users should not require these capabilities."

All services were reviewed; any that were not needed were turned off. Additional logging was enabled where possible. The sections below list which services were disabled, which were modified and which were left alone. Note that where possible, services were invoked through tcpwrappers, which provide additional logging and can be configured to allow or deny access based on IP address.
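A review of this sort can be repeated after each upgrade by comparing the entries enabled in /etc/inetd.conf against a list of services the site has decided to keep. The sketch below assumes the usual inetd.conf layout, in which the service name is the first field and disabled entries are commented out with "#"; the function names are hypothetical.

```shell
#!/bin/sh
# enabled_services FILE
# Print the service name (first field) of every uncommented, non-blank
# entry in an inetd.conf-style FILE, one per line, sorted and de-duplicated.
enabled_services() {
    awk '!/^[ \t]*#/ && NF > 0 { print $1 }' "$1" | sort -u
}

# unexpected_services FILE ALLOWED...
# Print each enabled service in FILE that is not among the ALLOWED names.
unexpected_services() {
    conf=$1
    shift
    for svc in $(enabled_services "$conf"); do
        found=no
        for ok in "$@"; do
            [ "$svc" = "$ok" ] && found=yes
        done
        [ "$found" = yes ] || echo "$svc"
    done
}
```

For example, "unexpected_services /etc/inetd.conf ftp telnet time" would print any enabled service other than those three.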

These services were changed when possible:

Silo Workstation

Remaining services:

telnet stream tcp nowait root /usr/sbin/in.telnetd in.telnetd
shell stream tcp nowait root /usr/sbin/in.rshd in.rshd
login stream tcp nowait root /usr/sbin/in.rlogind in.rlogind
exec stream tcp nowait root /usr/sbin/in.rexecd in.rexecd

Which were removed:

finger, tnamed, comsat, talk, uucpd, rusersd, sprayd, walld, time, echo, chargen, daytime, discard, admind (used at installation), rstatd for perfmeter, calendard, rquotad, ttdbserverd, ftpd (ftp from silo rather than to silo, where there are only group accounts)

SWSs

Remaining services:

ftp stream tcp nowait root /usr/sbin/in.ftpd in.ftpd -l
telnet stream tcp nowait root /usr/sbin/in.telnetd in.telnetd
shell stream tcp nowait root /usr/sbin/in.rshd in.rshd
login stream tcp nowait root /usr/sbin/in.rlogind in.rlogind
exec stream tcp nowait root /usr/sbin/in.rexecd in.rexecd
tftp dgram udp wait root /usr/sbin/in.tftpd in.tftpd -s /opt
time stream tcp nowait root internal
time dgram udp wait root internal
100232/10 tli rpc/udp wait root /usr/sbin/sadmind sadmind
100083/1 stream rpc/tcp wait root /usr/dt/bin/rpc.ttdbserverd rpc.ttdbserverd
100221/1 tli rpc/tcp wait root /usr/openwin/bin/kcms_server kcms_server
fs stream tcp wait nobody /usr/openwin/lib/fs.auto fs
dtspc stream tcp nowait root /usr/dt/bin/dtspcd /usr/dt/bin/dtspcd
bootps dgram udp wait root /opt/CYRIops/bin/bootpd bootpd -d4

Which were removed and changed:

Removed rpc.cmsd (due to Sun Security Bulletin), finger, echo, discard, daytime, chargen, rquotad, rusersd, sprayd, walld, rstatd, tnamed, comsat, talk

Added "-l" option to ftpd for improved logging; added "-s" option to tftp to enforce /opt

SGIs

Remaining services:

ftp stream tcp nowait root /usr/software/bin/tcpd /usr/local/sbin/ftpd -l -u077 -a
telnet stream tcp nowait root /usr/software/bin/tcpd /usr/local/sbin/telnetd -a valid
kshell stream tcp nowait root /usr/software/bin/tcpd /usr/local/sbin/kshd -5c
klogin stream tcp nowait root /usr/software/bin/tcpd /usr/local/sbin/klogind -5c
eklogin stream tcp nowait root /usr/software/bin/tcpd /usr/local/sbin/klogind -e -5c
ntalk dgram udp wait root /usr/software/bin/tcpd /usr/etc/talkd
time stream tcp nowait root internal
time dgram udp wait root internal
sgi-dgl stream tcp nowait root/rcv /usr/etc/dgld dgld -IM -tDGLTSOCKET
mountd/1 stream rpc/tcp wait/lc root /usr/etc/rpc.mountd mountd
mountd/1 dgram rpc/udp wait/lc root /usr/etc/rpc.mountd mountd
sgi_mountd/1 stream rpc/tcp wait/lc root /usr/etc/rpc.mountd mountd
sgi_mountd/1 dgram rpc/udp wait/lc root /usr/etc/rpc.mountd mountd
sgi_videod/1 stream rpc/tcp wait root ?/usr/etc/videod videod
sgi_fam/1 stream rpc/tcp wait root ?/usr/etc/fam famd -t 6
sgi_pcsd/1 dgram rpc/udp wait root ?/usr/etc/cvpcsd pcsd -L

Which were removed and changed:

Removed finger, rstatd, tcpmux, walld, rquotad (except on systems with file systems and quotas), echo, discard, chargen, daytime, snoopd, rusersd, sprayd, bootparam, tftp, bootp, sgi_scanner, sgi_printer.

Added logging to pcsd (pcsd -L), rlogin -a, rshd -aL to log failures; used restrictive umask with ftpd (-u077).

Added tcpwrappers and replaced nonkerberized with kerberized daemons.

J90/UNICOS and T3E/UNICOS/mk

Remaining services:

ftp stream tcp nowait root /usr/local/sbin/tcpd /usr/local/sbin/ftpd -l -u077 -a
dmf-ftp stream tcp nowait root /usr/local/sbin/tcpd_alt/tcpd /etc/ftpd -l -u077
kftp stream tcp nowait root /usr/local/sbin/tcpd /usr/local/sbin/ftpd -l -u077 -a
telnet stream tcp nowait root /usr/local/sbin/tcpd /usr/local/sbin/telnetd -a valid -L /bin/login
ktelnet stream tcp nowait root /usr/local/sbin/tcpd /usr/local/sbin/telnetd -a valid -L /bin/login
shell stream tcp nowait root /usr/local/sbin/tcpd /etc/rshd
kshell stream tcp nowait root /usr/local/sbin/tcpd /usr/local/sbin/kshd -5c
klogin stream tcp nowait root /usr/local/sbin/tcpd /usr/local/sbin/klogind -5c
eklogin stream tcp nowait root /usr/local/sbin/tcpd /usr/local/sbin/klogind -e -5c
erlogin stream tcp nowait root /usr/local/sbin/tcpd /usr/local/sbin/klogind -L /bin/login -e -5c
login stream tcp nowait root /usr/local/sbin/tcpd_alt/tcpd /etc/rlogind
ntalk dgram udp wait root /usr/local/sbin/tcpd /etc/ntalkd
time stream tcp nowait root internal
time dgram udp wait root internal

Which were removed and changed:

Removed rexecd, uucpd, finger, tftpd, comsat, talkd, echo, discard, chargen, daytime, tcpmux

Added tcpwrappers and replaced nonkerberized with kerberized daemons.

In addition to the identifiable services in /etc/inetd.conf, the consultants and reviewers did various network port scans. It has been difficult to identify all of the open services on the SGIs; some remain unidentified and continue to be researched. Access to all unneeded ports less than 1024 has been disabled at the gateway router.

Finding: Systems are running unknown services.

Requirement: DoDD 5200.28, Minimum Security Requirements, Section A8; Data Integrity. There shall be safeguards in place to detect and minimize inadvertent modification or destruction of data, and detect and prevent malicious destruction or modification of data.

Recommendation: Determine what the unknown services are and remove if not needed.

Network Issues. There were several network issues that needed attention. Several were related to the UNICOS and IRIX systems.

IPforwarding is turned on by default for all SGIs even when not needed or desired. IPforwarding allows a machine with multiple interfaces to pass packets from one interface to another. This behavior could potentially bypass a firewall. In response to this finding the ipforwarding, ipdirected_broadcast and ipsendredirects kernel parameters were updated in /var/sysgen/master.d/bsd.

IPforwarding is also turned on by default on the CRAYs. ARSC turned this off except when using UNICOS under UNICOS. This is done through the file /etc/config/netvar.conf, which specifies the options used when /etc/tcpstart runs the netvar command to set certain kernel options for tcp/ip. There is usually no need for a CRAY to forward packets to any other host or send redirects to redirect traffic flow if it thinks there is a shorter path to a destination. If a host is misconfigured and is sending data in a round-about way, having ipsendredirects on hides the problem.

When SNMP was turned on to allow for network monitoring, the SNMP Community of Interest values had been left at their defaults. These were changed.

Tcpwrappers were installed on all systems primarily to provide logging and, if necessary, limit access to some services.

Restricting Access to System Information. Restricting information presented to non-users of the system is sometimes controversial. When information is restricted, there is the potential for interfering with legitimate efforts of system administrators, whether local or from other sites, to solve problems. Each case is considered individually. Services that were removed included finger and rstatd. Information about exported file systems may be accessed by anyone using the showmount command. This is suppressed on the SGIs by the /etc/config/portmap.options file, but for the CRAYs it must be managed at the router or firewall by access control lists, since UNICOS does not provide the capability to restrict this system information.

Email

Throughout this period there were several iterations of dealing with outdated versions and inappropriate incarnations of sendmail. Initially sendmail ran on all systems, in some cases on purpose, in some cases without thought, because "it came that way." Sendmail is notorious for introducing vulnerabilities to a system, but it was difficult to modify the way some systems had been configured.

During the first Security Assistance Visit, the reviewers noted that email was enabled and recommended that it be disabled on all HPC resources. The systems that needed attention included all CRAYs, all SGIs, the SWSs, OWS, MWS, and silo workstation.

Incoming sendmail was turned off almost immediately on the SWSs, OWS, MWS and silo workstation. Sendmail had been active there because "it came that way". Removal was done by moving /etc/rc*/*sendmail* to /etc/rc*/save_*sendmail* or starting the sendmail daemon without the "-bd" option.

The CRAYs were a bit more difficult to change because of the way ARSC user accounts were organized. Many of our users had active accounts only on the CRAYs, and communicated with us from those platforms. Our User Services staff sent mail to those users on the CRAY systems. When I queried other sites for their policies, one UNICOS-l respondent wrote, echoing the opinion of others: "We think a CRAY or a NEC is not a mailhost. The only reason for mail on such machines is to deliver messages from NQS, but it's not necessary for the user to receive mail on supercomputers."

At that time, although turning off incoming sendmail on the CRAYs was clearly the correct way to proceed, the mechanics of doing so involved more time than was available.

The next best thing would be to upgrade sendmail. This turned out to be impractical, if not impossible. While the version on IRIX was upgraded on occasion, the version on the CRAYs was not. The UNICOS version had been modified by SGI/Cray from the freeware version, making it both potentially more robust and more difficult to change. Sendmail 5.61 ran on the Y-MP and 8.7.6 on the T3E while the public version was 8.8.x. One supercomputing center, whose security posture is considered good, stated that they had made no changes in their sendmail and that they did allow incoming sendmail. SGI's response to a request for new versions was that the latest version of sendmail was incorporated with UNICOS 9.2 and UNICOS/mk 1.4, although modifications had been made to it due to MLS. SGI recommended that the sendmail updates be obtained from Berkeley. Due to other site issues, including that MLS was under evaluation, this was not followed.

The alternative was to accept the risk. None of the vulnerability tests had exposed actual weaknesses within the CRAY sendmail; the recommendation to change was based on the age of the version, and the knowledge that sendmail is notorious for vulnerabilities. But was there really a problem? The risk was accepted.

Two changes were made, however. The VRFY and EXPN commands, which provide information about email aliases, were disabled. One of the automated tools noted that EXPN and VRFY allow an intruder to determine whether an account exists on a system, providing a significant aid to a brute force attack on user accounts; the reviewers recommended that these be disabled.

The change on the CRAYs was to modify the line in /etc/sendmail.cf from "Opneedmailhelo,needexpnhelo,needvrfyhelo" to "Opneedmailhelo,noexpn,novrfy".
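The edit described above can be scripted so that it is easy to reapply and verify after an upgrade. The following is a minimal sketch using sed; it matches only the exact option line quoted above, and the function name harden_privacy_options is hypothetical.

```shell
#!/bin/sh
# harden_privacy_options FILE
# Write to standard output a copy of the sendmail.cf-style FILE in which
# the privacy option line quoted in the text is replaced by the version
# that disables EXPN and VRFY. All other lines pass through unchanged.
harden_privacy_options() {
    sed 's/^Opneedmailhelo,needexpnhelo,needvrfyhelo$/Opneedmailhelo,noexpn,novrfy/' "$1"
}
```

Writing the result to a new file and reviewing the difference before installing it avoids editing the live configuration in place.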

The useful functionality previously provided by EXPN for systems administrators solving valid email problems can be performed using this command:

/usr/lib/sendmail -bv -d27.3 <alias>

Not very long after the first ST&E, the ARSC CRAYs and SGIs were used for spam relay by a third party. Because the version of sendmail under UNICOS which would allow spam to be prevented via configuration was not available, the priority for turning off incoming sendmail was instantly increased. An RTA and SPR were opened to see when SGI would provide the new version. It was a low priority for them and unlikely that the priority would be raised. Another site had opened an SPR for this and had been unable to get the publicly available version to compile.

Soon after this incident, ARSC was able to implement the change to stop incoming sendmail on the CRAYs and all SGIs except on the actual mail server and a backup. ARSC User Services staff developed alternate ways to communicate with users, including requiring .forward files and use of a tool which allowed them to create a .forward file for the user. An alternate backup mail relay was put into place and the MX records for all systems were changed to point to the mailservers so that mail to any ARSC host is directed to the mailhost.

SWSs (and OWSs and MWSs)

OWS/MWS

Some of the following information on the OWS and MWS is a bit dated since ARSC's Y-MP is now gone, but it is included to illustrate the kinds of vulnerabilities that can be introduced if systems are not examined adequately, and may be of interest to those with Y-MPs.

During initial reviews of the Y-MP, the OWS and MWS needed attention. Both ran old versions of sendmail with vulnerabilities and accepted incoming mail. Dot (".") was in the path of root and other accounts. Many unnecessary services were running. In some cases it was necessary to keep the services enabled to do testing because it was not clear from contacting SGI or from documentation whether or not a service was needed. All system accounts on these platforms were shared, making extra work during staff or operator transition.

The OWS had an account with no password that used "telnet" as a shell. On the MWS, there was an account called haltos which was not password protected. At the time neither ARSC nor SGI knew what this was for or if it was needed. SGI later stated that it could not do any damage or shut down the system, but at the time of the review it was not explained adequately to the reviewers. The MWS had a "+" in /etc/hosts.equiv which made it vulnerable to a threat from an authorized user on the Crays.

SWS (for T3E and J90)

When the T3E and J90 were first installed, there was discussion about putting their respective SWSs on the non-private network. Potential vulnerabilities were reviewed and a decision made that because of the way they were delivered, combined with the fact that security fixes seemed to regularly regress at SWS upgrades, they should never be placed on a public network. To mitigate the possible difficulty in access for the Cray engineers, each SWS is defined in both the T3E and J90 making it possible to connect with either SWS in the event one mainframe were to go down. Connection to the SWSs for system maintenance is only through a mainframe, or at the console.

Due to the number of people (more than five) accessing the SWS for diagnostics, software installation and troubleshooting, sudo and individual accounts were installed on these systems.

A review by a consultant found that, as with the MWS, the haltos account did not have a password and had an undocumented command as a shell entry: /opt/CYRIdiag/bin/haltos. In addition the tftp daemon was configured to allow unauthorized access to any world readable file such as /etc/passwd. Although the password file did not contain encrypted passwords, it listed valid accounts on the machine and revealed the existence of an account with no password. SGI's response was that haltos could only be used from the console so this shouldn't be a big security concern.

The tftp entry in /etc/inetd.conf was changed per SGI's recommendation to run in secure mode by adding the option "-s". This required significant review prior to implementation due to concerns about possible consequences in booting IONs. This issue was resolved by adding "-s /opt" on the tftp entry to restrict tftp access to the /opt directory only. The /opt references in the IONs were satisfied by creating a link within /opt.

# cd /opt
# ls -l opt
lrwxrwxrwx 1 root other 1 Jun 25 13:29 opt -> .


However, at least three SWS upgrades regressed these changes to /etc/inetd.conf, removing the previously added "-s" option without notification. This appears to have been resolved in SWS/ION 4.0.

The SWS was shipped with "xhost +". This was replaced by xauth authentication with few problems. The /opt/home directory was delivered world-writable. The concern was more for accidental deletion; permissions were changed to 755. The SWS was delivered with .rhosts files for the root and crayops accounts that included access from CRAY sites. These appeared to be left over from hardware checkout and were removed. Root on the T3E requires an .rhosts file to allow free access from the SWS during software installs. It was originally set up with crayops in the T3E's root .rhosts, which made the T3E more open from the console than was desired. Two different .rhosts files have been created; .rhosts.ForUpgradesOnly and .rhosts.ForNormalOps are swapped in for .rhosts as needed. An SGI home-grown control panel tool modified permissions unexpectedly on other system directories, causing critical tools to fail. It also expected certain settings in the .rhosts files on the T3E to do remote shell system monitoring via "watch", leading to additional vulnerabilities. Vulnerabilities in Sun software and permissions in the CDE desktop binaries were detected and exploited during the review. These permissions have been changed.

Other Changes and Considerations

Different versions of IRIX. The differences between IRIX 6.2, 6.3 and 6.4 have caused numerous inconsistencies. Problems, patches, and behavior of programs differed between systems. IRIX 6.5 may simplify and resolve these inconsistencies.

Number of access attempts. The reviewing team recommended that accounts be disabled after repeated unsuccessful login attempts. This change was implemented on the Crays, but not on the SGIs. Since there is a delay in the prompt after each failed login, which makes it difficult to automate such an attack, the risk was determined to be less than the impact of staff accounts being locked out maliciously. No brute force attacks of this type have been experienced.

ftpusers.

The file /etc/ftpusers was implemented on all systems. This prevents incoming ftp to system accounts and was recommended by several teams. The file had no entries when IRIX 6.5 was initially installed.
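Because the file can arrive empty after an installation, it is worth checking that every system account is actually listed. The sketch below compares a passwd-format file with an ftpusers file (one login name per line); treating uids below 100 as system accounts is an assumption, and the function name missing_from_ftpusers is hypothetical.

```shell
#!/bin/sh
# missing_from_ftpusers PASSWD FTPUSERS [MAX_UID]
# Print the login name of every account in PASSWD whose uid is below
# MAX_UID (default 100, an illustrative cutoff) and which is not listed
# in FTPUSERS.
missing_from_ftpusers() {
    awk -F: -v max="${3:-100}" '
        # First file argument: the ftpusers list, one login name per line.
        FILENAME == ARGV[1] { if ($1 !~ /^#/ && $1 != "") listed[$1] = 1; next }
        # Second file argument: passwd entries, uid in field 3.
        ($3 + 0) < (max + 0) && !($1 in listed) { print $1 }
    ' "$2" "$1"
}
```

For example, "missing_from_ftpusers /etc/passwd /etc/ftpusers" prints nothing when every low-uid account is already denied ftp access.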

tcpwrappers.

Tcpwrappers were installed on all systems and have been used for logging rather than limiting access.

system and user accounts with no shells.

The review teams flagged user accounts with no shells on the CRAYs. This occurred when a user used the chsh command without specifying a new shell resulting in nulling out the shell field in /etc/passwd and the UDB entry. Login would then default to /bin/sh. The reviewers thought that this meant the Bourne shell, which they considered insecure, but under UNICOS this apparently is the Posix shell (ksh) which does not have the same vulnerabilities.

Reviewers (and the Security Profile Inspector) also flagged system accounts which didn't have a shell. They recommended that system accounts should have shells set to /dev/null or /bin/false.
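Accounts of the kind the reviewers flagged can be found by examining the shell field (the seventh colon-separated field) of a passwd-format file. The following is a minimal sketch with hypothetical function names; on UNICOS the UDB would also need to be checked, which this sketch does not attempt.

```shell
#!/bin/sh
# empty_shell_accounts PASSWD
# Print the login name of every entry in PASSWD whose shell field is empty.
empty_shell_accounts() {
    awk -F: 'NF >= 6 && $7 == "" { print $1 }' "$1"
}

# accounts_with_shell PASSWD SHELL
# Print the login name of every entry in PASSWD whose shell field is SHELL.
accounts_with_shell() {
    awk -F: -v want="$2" '$7 == want { print $1 }' "$1"
}
```

The second function can be used to confirm which accounts have been assigned a restricted shell.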

The reviewers were concerned that a valid shell (the default of /bin/sh) would provide additional capabilities if a system account were broken into in some manner. Because a system administrator was aware of an exploit involving use of /dev/null or /bin/false, a unique nullogin shell was created for such accounts and the accounts were assigned an empty home directory. These accounts include adm, bin, cron, daemon, nobody, nqs, osi, uscp. System accounts which DO need valid shells because scripts are run under their control include root, bin, sys, adm.

Figure 6. Nullogin user shell.

/**** nullogin.c to be used as an immediate exit. ********/

main ()
{
exit(1) ;
}

Figure 7 shows a list of all system accounts. Accounts that were given the nullogin shell show /usr/local/bin/nullogin in the final field. This change had to be backed out for the account unknown when daemons or crontabs which ran under this account were discovered. SGI was contacted prior to changing the shells for nqs, cron and Idle. Their answer was that these accounts "are really not used and actually can be deleted or as you have suggested, change their shells to something that immediately exits. Up to you."

Figure 7. System Accounts in /etc/passwd under UNICOS.
root:*:0:0:root-chilkoot:/:/bin/sh
daemon:*:1:0:System:/:/usr/local/bin/nullogin
sync:*:1:1:sync:/:/usr/local/bin/nullogin
bin:*:2:2:System Binaries:/:/usr/local/bin/nullogin
sys:*:3:3::/:/bin/sh
adm:*:4:4:Accounting:/usr/adm:/bin/sh
cron:*:5:0:Chronological Manager:/:/usr/local/bin/nullogin
nqs:*:6:0:Network Queueing System:/usr/spool/nqs:/usr/local/bin/nullogin
operator:*:9:9:Operator:/u1/cri/operator:/bin/sh
ce:*:10:5:CRAY Engineer:/ce:/bin/ksh
Idle:*:11:0:System Idle:/:/usr/local/bin/nullogin
unknown:*:12:0:Default for Fair-Share:/:/bin/sh
sl:*:13:10:SUPerlINK:/etc/config:/usr/local/bin/nullogin
osi:*:14:11:OSI:/etc/config:/usr/local/bin/nullogin
uscp:*:31:31:UNICOS Station Call Processor:/:/usr/local/bin/nullogin
nobody:*:32:32:Non-prived nw apps:/usr/local/homefree:/usr/local/bin/nullogin

Webservers. The reviewers did little examination of the web server. No consultant or examiner checked web security unless requested. Although no observations were made in this area, a secure webserver has been installed for the intranet.

Operating System Vulnerabilities

Operating system vulnerabilities are handled by vendors. With or without source, it is beyond the scope and capabilities of sites to detect or resolve security issues at this level. It is critical to install security patches in a timely manner, however. Security mailing lists frequently identify operating system frailties long before they are resolved by the vendor. The reviewers were well aware of IRIX buffer overflows and other publicized vulnerabilities, and attempted to exploit them.

SWS vulnerabilities - Sun security advisories are located at http://sunsolve.sun.com/pub-cgi/secBulletin.pl and patches at http://sunsolve.sun.com/security . Some security patches seem to be included in the SWS/ION upgrades. Depending on how frequently the SWS is upgraded, additional patches may be appropriate.

IRIX vulnerabilities - IRIX security vulnerabilities are sent via a mailing list. Anyone directly or peripherally responsible for security on IRIX systems should be subscribed to this list (http://www.sgi.com/Support/security/wiretap.html) to get immediate notification.

UNICOS vulnerabilities - Sites are supposed to be notified of UNICOS security vulnerabilities via Field Notices or CERT bulletins. In our experience this has not been done consistently, however. There have been far fewer UNICOS vulnerabilities reported than for IRIX.

Maintaining currency - Many vulnerabilities found in other flavors of Unix may also apply to IRIX or UNICOS. Even if there has not been an announcement, it is worth reviewing announcements from CERT, FIRST or other formal emergency response teams to consider if they might apply. Following Bugtraq or other informal mailing lists is good education and also may give a system administrator the first notification of a vulnerability, sometimes a long time before SGI announces it.

Whether or not security patches are kept up to date, it is certain that a security review team will search for any vulnerabilities in this area. They will almost certainly take advantage of an unpatched buffer overflow to gain root access on the system.

Systems Management

Impacts to Users. There have been various impacts to users due to implementation of the changes described above. Some users were requested to modify file permissions, for example. Other changes, such as the restriction of xhost or implementation of Kerberos, were more global and intrusive. Efforts were made to fully educate users in how they would be impacted by changes, why changes would be made, and how to work within the changed environment. Individual attention to many users was provided as needed. The biggest impact has been on the use of remote commands and distributed processing, due to the elimination of rcommands from systems not managed by ARSC and to issues related to the Kerberos implementation.

Software Support. A major outstanding concern is for support of the authentication and access software, in particular ssh, Kerberos, and related modules such as xlock and xdm. The standard access daemons, such as ftp, telnet and rlogin, have been removed. ARSC and the other HPCMP sites have replaced these services with freeware. This freeware is used at and supported by multiple sites, and may not undergo rigorous or systematic testing as changes are made to support new versions of UNICOS and IRIX.

Secure Shell in particular is the widely used and accepted method for avoiding cleartext passwords on Unix platforms. It should be part of the supported suite for UNICOS and IRIX systems. Kerberos software has been implemented on all High Performance Computing Modernization Program supercomputing center platforms. It should be part of a vendor supported suite for UNICOS and IRIX systems.

Configuration Management. The processes whereby changes were planned, implemented and tracked need to be defined, to ensure that all of these modifications are not regressed and that the reasons for making them are clear. This is done through configuration management plans on each system, documentation in configuration files and a change and problem management tool in which each individual change or problem and its resolution is described. In addition, "sanity checker tools" run daily to ensure that the contents of configuration files have not unexpectedly changed and that permissions and ownership are as intended. Any item out of compliance is flagged for attention. This can be useful if a file has been inappropriately modified but tends to be used to catch ourselves when we've made a quick fix. Some of the configuration files on the CRAYs are managed by the install tool. In some cases they were switched to be managed manually to allow for easier access to previous versions and comments. Examples include all the network configuration files and /etc/inetd.conf.

Figure 8. List of SGI configuration files verified on all systems daily by tool.
/.shosts
/.ssh/config
/etc/TIMEZONE
/etc/aliases
/etc/awkeys
/etc/config/chkconfig.arsc
/etc/config/chkconfig
/etc/config/ifconfig-1.options
/etc/config/ifconfig-2.options
/etc/config/ifconfig-3.options
/etc/config/ipaliases.options
/etc/config/netif.options
/etc/config/portmap.options
/etc/config/static-route.options
/etc/config/timeslave.options
/etc/config/ypbind.options
/etc/config/ypmaster.options
/etc/cshrc
/etc/default/login
/etc/default/su
/etc/ethers
/etc/exports
/etc/fstab
/etc/ftpusers
/etc/group
/etc/hostconfig
/etc/hosts.equiv
/etc/hosts
/etc/inetd.conf
/etc/ipfilterd.conf
/etc/issue.normal
/etc/issue
/etc/krb5.conf
/etc/krb5.keytab
/etc/lvtab
/etc/motd.restrict
/etc/motd
/etc/mrouted.conf
/etc/netgroup
/etc/netinfo
/etc/passwd.nis
/etc/passwd
/etc/profile
/etc/rc.local
/etc/resolv.conf
/etc/sendmail.cR
/etc/sendmail.cf
/etc/sendmail.cw
/etc/sendmail.params
/etc/services
/etc/shadow
/etc/sys_id
/etc/syslog.conf
/usr/etc/hippi.imap
/usr/lib/X11/app-defaults/Amapi
/usr/lib/X11/app-defaults/XLock
/usr/lib/X11/system.chestrc
/usr/lib/X11/xdm/Xaccess
/usr/lib/X11/xdm/Xservers
/usr/lib/X11/xdm/Xsession-remote
/usr/lib/X11/xdm/Xsession.dt
/usr/lib/X11/xdm/Xsession
/usr/lib/X11/xdm/xdm-config
/usr/software/etc/hosts.allow
/usr/software/etc/hosts.deny
/usr/software/etc/k5hosts
/usr/software/etc/krb5/keytab.ids
/usr/software/etc/purge.cfg
/usr/software/etc/ssh_config-adm
/usr/software/etc/ssh_config
/usr/software/etc/ssh_host_key.pub
/usr/software/etc/ssh_host_key
/usr/software/etc/ssh_known_hosts
/usr/software/etc/sshd_config-adm
/usr/software/etc/sshd_config
/usr/software/etc/sudoers
/usr/spool/cron/crontabs/adm
/usr/spool/cron/crontabs/ids
/usr/spool/cron/crontabs/root
/usr/spool/cron/crontabs/sys
/usr/spool/cron/crontabs/sysmon
/var/adm/.forward
/var/avs/license.dat
/var/flexlm/avs.dat
/var/flexlm/aw.dat
/var/flexlm/aw_indigo13.dat
/var/flexlm/geowatch.dat
/var/flexlm/license.dat
/var/flexlm/license_indigo13.dat
/var/netls/nodelock
/var/sysgen/master.d/bsd
/var/sysgen/master.d/if_hip
/var/sysgen/stune
/var/sysgen/system/hippi.sm
/var/yp/ypdomain
/var/yp/ypservers
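The daily "sanity checker" idea described above can be illustrated with a short sketch. This is not the ARSC tool, whose implementation is not described here; it is a minimal Python example, under the assumption that a baseline of content checksum, permission bits and ownership is recorded once and compared against the live files each day. The function names and the choice of SHA-256 are illustrative assumptions.

```python
import hashlib
import os
import stat


def fingerprint(path):
    """Return (sha256 hex digest, permission bits, uid, gid) for one file."""
    st = os.stat(path)
    with open(path, "rb") as fh:
        digest = hashlib.sha256(fh.read()).hexdigest()
    return digest, stat.S_IMODE(st.st_mode), st.st_uid, st.st_gid


def snapshot(paths):
    """Map each path to its fingerprint; paths that are not files map to None."""
    return {p: fingerprint(p) if os.path.isfile(p) else None for p in paths}


def compare(baseline, current):
    """Return (path, what_changed) pairs for every item out of compliance."""
    labels = ("content", "mode", "uid", "gid")
    findings = []
    for path, old in sorted(baseline.items()):
        new = current.get(path)
        if old is None and new is None:
            continue  # absent in both snapshots: nothing to report
        if old is None or new is None:
            findings.append((path, "missing" if new is None else "appeared"))
            continue
        for label, before, after in zip(labels, old, new):
            if before != after:
                findings.append((path, label))
    return findings
```

In use, the baseline from `snapshot()` would be stored somewhere the checked system cannot easily alter, and each finding from `compare()` would be flagged for attention, whether it came from an intruder or from a quick fix by staff.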

Logging and Auditing. The quantity and variety of logs on all the systems are large. Early review teams noted that "the audits are fragmented among different logs and no policy regarding the audit trail review and retention was available to the test team."

As a start, an inventory of all current logs, on all systems, was performed. For each log a review and retention policy was determined. Where possible, a mirror copy is directed to a log host, which combines all log types into files which can be examined systematically and programmatically. The logs are important for issues other than security, of course. Day to day system problems can also be quickly identified and appropriate persons can be paged to make repairs. The volume of information in the logs has made it nearly impossible to review them in a timely manner to detect potential security issues as well as system problems. In response to this, log reduction and review tools using swatch are being developed. Swatch is a freeware pattern-matching software package of Perl scripts.
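The kind of pattern-based log reduction described above can be sketched briefly. The following Python is an illustration only: it is not swatch and does not use swatch's configuration syntax. The rule names, regular expressions and message formats are invented for the example and would need to be replaced with patterns drawn from a site's real logs.

```python
import re
from collections import Counter

# Hypothetical rules: (label, pattern). The first matching rule wins.
RULES = [
    ("su", re.compile(r"\bsu: ")),
    ("login_failure", re.compile(r"\blogin: .*(failed|invalid)", re.IGNORECASE)),
]

# Hypothetical pattern for routine messages that need no review.
IGNORE = re.compile(r"\bcron: ")


def reduce_log(lines, rules=RULES, ignore=IGNORE):
    """Count lines per rule and collect lines that no rule explains."""
    counts = Counter()
    unmatched = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or (ignore is not None and ignore.search(line)):
            continue
        for label, pattern in rules:
            if pattern.search(line):
                counts[label] += 1
                break
        else:
            # No rule matched: keep the line so a person can look at it.
            unmatched.append(line)
    return counts, unmatched
```

Keeping the unmatched lines, rather than discarding them, is a deliberate choice in this sketch: the summary counts shrink the volume, while anything unexpected is still surfaced for review.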

As the logs were examined more closely, some required information was not available. Successful and failed logins were not tracked under IRIX. Changing this required a fairly simple update to two files:

/etc/default/login

# Log to syslog all login failures (SYSLOG=FAIL) or all successes and
# failures (SYSLOG=ALL). No messages to syslog if not set.
SYSLOG=ALL


/etc/default/su

# Log to syslog all login failures (SYSLOG=FAIL) or all successes and
# failures (SYSLOG=ALL). No messages to syslog if not set.
SYSLOG=ALL
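Because settings such as these are easy to lose during an upgrade, a small check can confirm that they are still in place. The sketch below is a hypothetical Python helper, not part of IRIX: it reads KEY=VALUE text in the style of the two files above and reports whether SYSLOG is set to ALL. It assumes that comment text follows a "#" and that the last assignment wins, which may not match how login and su actually read these files.

```python
def read_setting(text, key):
    """Return the last uncommented KEY=VALUE assignment for key, or None."""
    value = None
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        name, _, val = line.partition("=")
        if name.strip() == key:
            value = val.strip()
    return value


def logs_all_attempts(text):
    """True if the text sets SYSLOG=ALL (successes and failures logged)."""
    return read_setting(text, "SYSLOG") == "ALL"


if __name__ == "__main__":
    for path in ("/etc/default/login", "/etc/default/su"):
        with open(path) as fh:
            print(path, "OK" if logs_all_attempts(fh.read()) else "CHECK")
```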

Successful and failed logins were not tracked under UNICOS either. In order for syslog logging of failed logins to occur for rlogin and telnet, it is necessary to create /etc/config/confval with the following entry:

# /etc/config/confval for Chilkoot.
# File maintained locally in /var/local/ConfigFiles/etc/config/confval
#
# login.logbadpass: 1 = Turn on bad login logging.
# login.login_attempts: 5 = Five tries and you're out.

login.logbadpass: 1
login.login_attempts: 5

In order to get syslog logging of successful logins, SGI was contacted for advice. This took over a year to get resolved but was eventually achieved by implementing a user exit which is linked with login, rshd, rexecd and nqsdaemon.

Working with Review Teams

As noted earlier, there is a difference between inviting a team in to help a site become more secure and having a team come in to evaluate a site. Prior to a review by an outsider, it may be most effective to do a self-review, particularly if the site is able to resolve the problems detected. Having a consultant review as follow-up helps to ensure that all vulnerabilities are detected. There is little value in telling the site about vulnerabilities they are already aware of but haven't yet fixed, except when awareness needs to be raised.

Care should be taken in all vulnerability testing to ensure that appropriate people are notified, data is protected, and that there are no unforeseen system impacts. Some suggestions for preparing for either type of review include:

  1. Detailed advance preparation of assistance visits results in best use of staff time while visitors are on site. Make sure reviewers have the experience and knowledge appropriate for the site.
  2. Tools should be appropriate for the site operating systems. Windows/NT testing won't help in reviewing UNICOS.
  3. Make sure reviewers are familiar with the site operating systems OR plan to work closely with onsite staff to leverage their OS knowledge and combine it with the reviewer's knowledge of security issues.
  4. Testers should understand the output and tests done by any automated tools they use, or ensure that the vendor is available to answer questions. When a test reports a vulnerability the reviewers should be able to explain it and make recommendations for repair.
  5. Computing systems such as laptops brought in by consultants should not introduce vulnerabilities to the site. If the consultants plan to do things like leave laptops with root prompt unattended to test the responsiveness of site personnel, this should be reviewed with site POC in advance. Reviewers should set an example and make sure their own systems are in good shape.
  6. Prior to the visit, determine how information gained during the visit will be handled: who gets the reports, how information will be transmitted (email, encryption, fax), and how the data will be removed from visitors' laptops. Be sure to let the site POC know about any remote passwords sniffed so that users can be contacted.
  7. If possible, site personnel should run the tests with consultant assistance rather than having consultants run tests and report results. This is a good way to ensure that the tools work onsite and that local staff understand them. It also is a way to save time and may lead to a more complete evaluation since staff can quickly help consultants with unusual site configurations to make sure all items are covered. Since staff can spend up to several man-weeks during a four or five day consultant visit, they may just as well be able to use this time to integrate tools into production operations, and ask for help when necessary.

Conclusions

A security review can be an enlightening experience if done correctly. Do it before it is done to you.

It is a myth that a secure system is harder to use or maintain. If done correctly, the opposite can be true. Many of the vulnerabilities discovered cannot be excused by ease of use.

Impacts to users as well as to security of the systems should be evaluated with each change.

Vendor support is needed for software such as SecureShell and Kerberos. Authentication software is fundamental for security and resource accounting. Issues of troubleshooting support, problem management, change management, source control, version control, upgrade support and testing should be addressed.

Tools and Software

A variety of tools are available for authentication and access control, regular system monitoring and for occasional system examination. Types include homegrown, freeware, freeware with local modifications, and commercial products. Most tools require some modification to work effectively with UNICOS. None of these products, including those which provide login authentication, are distributed as part of the operating system environment. Some of these products have export restrictions which can affect use and implementation.

These tools may or may not be available from the web sites noted below. ARSC has modified some of them to get them to work under UNICOS and IRIX.

Internet Security Systems (ISS) Security Scanner - general vulnerability assessment; when commercial version was run on UNICOS, caused inetd to hang (SPR 712517). http://www.iss.net/
John the Ripper - password cracking tool. http://www.false.com/security/john/
Kerberos; Kerberos with SecurID - provides strong authentication with public key encryption; limited availability due to export restrictions. http://web.mit.edu/kerberos/www/ and ftp://ftp.cmf.nrl.navy.mil/pub/kerberos5
Netrecon - security vulnerability assessment software. http://www.axent.com/netrecon/default.htm
SPI (Security Profile Inspector) - limited availability security vulnerability assessment software; developed through the Department of Energy. http://ciac.llnl.gov/cstc/spi/spi.html
ssh (SecureShell) - provides a trusted, secure, encrypted mechanism for communicating between systems. San Diego Supercomputer Center has coordinated the modifications for this software for UNICOS. Announcements about these typically are sent out over CUG mailing lists. (Note export issues.) http://www.cs.hut.fi/ssh/
sudo - alternative for root access. http://www.courtesan.com/sudo/
Swatch - Perl scripts; syslog reduction and review tool. ftp://ftp.stanford.edu/general/security-tools/swatch or ftp://coast.cs.purdue.edu/pub/tools/unix/swatch/
Tcpwrappers - ftp://ftp.cert.org/pub/tools/tcp_wrappers/
Tiger - security vulnerability assessment software. http://www.net.tamu.edu/ftp/security/TAMU/
Tripwire - system integrity checker. http://www.cert.org/ftp/tools/tripwire/

References


Cardo, Nicholas "Securing the User's Work Environment", Proceedings from 40th Cray User Group Meeting, 1998, Stuttgart.
Silicon Graphics Security Headquarters http://www.sgi.com/Support/security/security.html
SGI Security Frequently Asked Questions http://www-viz.tamu.edu/~sgi-faq/faq/html-1/security.html
IRIX Admin: Backup, Security, and Accounting http://techpubs.sgi.com/library/lib/makepage.cgi?007-2862-001
SGI Security Standard Operating Procedures http://www.sgi.com/Support/security/SOP.html
Crinform http://crinform.CRAY.com
Bugtraq Archives http://www.geek-girl.com/bugtraq/ (searchable) and http://netspace.org/lsv-archive/bugtraq.html
Anti-Online - includes lists of operating system specific vulnerabilities, including for IRIX and UNICOS http://www.antionline.com