Liam Forbes
lforbes@arsc.edu
http://www.arsc.edu/~lforbes
Using Kerberos, SSH, Sudo, Swatch, LPRng, TCPwrappers, and Tiger, we've created a strong overall security profile and simplified managing the security on more than 70 systems. The modifications we have made so that the tools will work together securely, or even compile in some cases, are discussed within the framework of our local software management practices.
The additional security features ARSC would like to see on all platforms are fine-grained root access control, connection logging, strong account authentication, complete system monitoring at the user and command level, and an easy way to comb through the auditing and logging data for significant information. Fine-grained root access control is the ability to control administrators' access to the root account at a finer level than sharing the root password (i.e. su). Connection logging is the ability to log when a connection to the system is made, the source of the connection, the account the connection tries to connect as, and the outcome (success or failure) of the attempt. Strong account authentication means something better than UNIX passwords, even shadowed passwords, for authenticating users and storing the authentication tokens. Complete system monitoring with a mechanism for reviewing the information means system auditing/logging plus a way to process the audit and log records to detect patterns, identify anomalous events, and verify proper system operation.
These features are either implemented very differently from platform to platform, or not at all. Not only are they desired on each platform, but if they had consistent interfaces from operating system to operating system, general system and security administration would be much easier, allowing more time and effort to be spent on the users' requirements rather than the systems' requirements.
Of particular concern are the two Cray systems and the SP. These are the primary compute resources at ARSC. Ensuring that these systems are properly protected is a high priority for ARSC security administrators. Unicos, Unicos/mk, and AIX are unique platforms compared to desktop operating systems like IRIX. Unfortunately, there are not a lot of third party security vendors targeting them, porting their software to them, or testing their software on them.
Enter Open Source Software[10]. Based upon principles of openness, portability, security, and, best of all, low to no cost, open source software is a mechanism for adding security features to systems lacking them, and for providing the same interface across multiple platforms. If a specific open source package is desired on the Crays, the software can be ported to that platform. If enough sites desire the software, there is a viable business case for the platform vendor to port the software themselves. (In fact Cray distributes the "Cray Open Software[6]" CD-ROM containing open source tools requested by customers.) Either way, one can now leverage a security software package common to all platforms in a high performance environment as well.
The software packages addressed in this paper are all in use at ARSC on all or most of our platforms. As we add new platforms, we either port the software, or load the latest versions from the Internet. This way we maintain the same operating procedures no matter which operating system is installed.
Hopefully the benefits of using open source, including improved security, will outweigh the downside represented by those concerns. The general experience with open source packages has been positive. Certainly ARSC has not had any problems related to lacking a paid vendor. In fact, there are several benefits that have made open source software invaluable in ARSC's environment.
Using open source, the administrator and users have the same features, interface, and functionality on each platform. Common features allow the same procedures to apply to all the systems. It simplifies the administrative effort. Rather than having to know many system specific procedures, an administrator only has to know one procedure that applies to many systems.
For example, by using the Kerberos authentication protocol, logging into ARSC Crays is the same as logging into the SGIs. The procedure is the same, the credentials are the same, and the results should be the same. Now, instead of having to memorize multiple passwords, users and administrators only need to know one passphrase and can use the same one time passcode mechanism.
Every administrator has had the experience of running into the one bug that only applies to their site because of a slightly different requirement or implementation. By having access to the source code, the administrator can make the software work for their site. Then that fix can be supplied back to the community.
Beyond fixing bugs, having the source code provides the opportunity to add new features to the software. The unique combinations of platforms and software at every site means there will invariably be some need or feature not met by a software package. This is not a bug, it is just an unimplemented feature waiting for an enterprising administrator to write the code.
Once a patch is written, submitting it back to the original developers is accepted practice in the open source community. The developer may or may not accept the patch, but that does not mean the patch cannot be shared on its own. However, it is important to use good coding practices, especially when modifying security software. It is very easy for patches to introduce vulnerabilities that did not exist before the patch was applied.
For example, SSH is a very common security tool. However many people have found bugs in the code, or required new authentication features to make the software useful for them. The developers maintain a single source tree, but have incorporated dozens of patches from around the world. These patches have expanded the usefulness of the software for everyone, but at least one became an exploited security hole (the CRC32 vulnerability[5]).
ARSC has adopted a directory layout for third party software installs to help make each installation consistent across systems, and between packages. Now, no matter who initially installs a package, it is possible for anyone else to look at it, know where the different parts reside, be able to copy the package onto a different platform, or upgrade the package, and maintain a similar configuration. This procedure works especially well for open software packages which are normally distributed as tar kits with configure scripts or makefiles that can specify where installed files should reside. Other packages that either come with their own install scripts, use a vendor supplied software install tool, or are binary only distributions are a little harder to fit to a standardized model. However it has been possible to install these packages using most of the developed rules. Documenting the remaining differences appears to be enough to allow multiple staff members to maintain these packages as well. This introduction to ARSC's third party software layout will hopefully clarify the terms and assumptions that come up as each package is reviewed.
The installation procedure is based upon the desire to keep each package in its own subdirectory rather than installing multiple packages into /usr/local. This desire developed as more packages were installed on ARSC systems and it became nearly impossible to know which files went with which packages. Now, the files in /usr/local/bin, lib, include, etc, sbin, and share are just links to files in a new directory named /usr/local/pkg. This directory is called the "package directory".
Within the package directory are "software directories". Each software directory contains one or more versions of a software package. Each software directory is totally self-contained. This keeps all the files in a distribution together, and ensures that it is possible to identify what files go with which package.
Each version is contained in a subdirectory referred to as a "version directory". One version directory is designated as the "current" version, and is the target of links installed in the standard /usr/local directories. This has been very handy for having multiple versions of packages that do not make use of a versioning system, such as module under Unicos or inst(1M) under IRIX. Within each version directory is a copy of the directory structure found below /usr/local. So, for example, there is a bin directory, a lib directory, an include directory, the man directory structure, and so on. The version directory is the installation target of each package.
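The layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the use of a temporary directory in place of the real /usr/local, and the sample package versions are all hypothetical, and the sketch does not reproduce ARSC's actual installation procedure.

```python
import tempfile
from pathlib import Path


def install_links(local_root, package, version):
    """Link every file under pkg/<package>/<version>/ into the matching
    directory below local_root (bin, lib, man, and so on)."""
    version_dir = local_root / "pkg" / package / version
    created = []
    for source in sorted(version_dir.rglob("*")):
        if source.is_dir():
            continue
        target = local_root / source.relative_to(version_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        if target.is_symlink():
            target.unlink()  # repoint a link left by the previous "current" version
        target.symlink_to(source)
        created.append(target)
    return created


def layout_demo():
    """Build a throwaway tree with two version directories and make the
    newer one the "current" version. Returns the stand-in for /usr/local."""
    root = Path(tempfile.mkdtemp()) / "usr" / "local"
    for version in ("1.6.5", "1.6.6"):  # hypothetical version directories
        bindir = root / "pkg" / "sudo" / version / "bin"
        bindir.mkdir(parents=True)
        (bindir / "sudo").write_text("sudo %s\n" % version)
    install_links(root, "sudo", "1.6.5")
    install_links(root, "sudo", "1.6.6")  # upgrade: links now point at 1.6.6
    return root
```

Because the standard directories hold nothing but links, switching the "current" version is just a matter of repointing them, and the older version directory stays intact in case a rollback is needed.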
Many operating system and software vendors use the Kerberos protocol in their own products. This should mean that Kerberos is very portable. However, it also adds complications to the installation and usage. Installing Kerberos is a significant task requiring hardware resources (one or two systems to be key distribution centers), system modifications (modifying the login/authentication process), software modifications (modifying the authentication process), and plenty of man hours. Sites with unique platforms and many systems that desire a strong, central authentication method should consider Kerberos.
The latest Kerberos available from MIT is version 5, release 1.2. ARSC uses a modified Kerberos, based on version 5, release 1.1. The modifications are primarily to implement "hardware preauthentication", i.e. to incorporate the SecurID or CryptoCard one-time passcode mechanisms. Unicos supports Kerberos natively, but unfortunately its implementation is based on version 4. The standard version of IRIX does not appear to support Kerberos. By installing a source kit downloaded from MIT, a site can implement interoperability and consistent package versions on each platform.
Describing the Kerberos protocol is a paper of its own. Several of those papers are archived at MIT's website, http://web.mit.edu/kerberos/www/papers.html. Installation is not straightforward. It takes planning and effort. However the end result is definitely a more secure site.
Before the daemons are replaced though, the new ones must be compiled. Kerberos, being a network protocol, relies upon the network functionality of the operating system. Most UNIX systems have very similar interfaces for networking so the Kerberos code moves easily from system to system. IRIX for example attempts to maintain compatibility with System V DNS lookup routines. Unicos though uses a differently structured lookup call which required modifying the code.
Beyond networking differences, because the kerberized daemons invoke login shells, they have to interface with the system routines for initiating a user's environment including setting up environment variables, temporary directories (under Unicos), and user privileges, as well as completing system chores such as allocating a tty and updating the utmp/wtmp files. Kerberos tries to use "standard" UNIX/POSIX code for these operations. System differences such as the Unicos User DataBase (UDB) and unique user privilege structures (PALs, ACLs, and labeling associated with MLS) necessitate further code modifications.
Incorporating the UDB requires adding include files (ia.h, tmpdir.h) and modifying routines that handle authentication. When a user fails to authenticate properly, the UDB has to be updated to record the failure count. The UDB also stores user limits that need to be applied to successful login sessions, and the kerberized daemons have to be modified to establish those limits.
On some systems it is possible to replace the login program with a kerberized binary. Then it would not be necessary to patch each daemon separately. Because Unicos has so many unique login operations, the opposite operation is preferable, calling the Unicos login program from the kerberized code. Either way, allowing the operating system to handle its own functions is generally less error prone than trying to duplicate those functions in the applications. Other kerberized daemons still require source code modification to accomplish the same operations as login.
One of the most common problems porting codes to HPC environments is managing data types. Most Open Source software is developed on 32 bit architectures. Some of the data types used in the source code are not available on the larger architectures that make up a high performance environment. This affects memory allocation that uses the sizeof operator, encryption operations that rely on either memory allocation or consistent mathematical operations, and network packet manipulations that rely on bit operations. The source code has to be modified to either use alternate data types with the same desired sizes, or substitute code to generate the same results.
Besides modifying code to replace missing data types, it is necessary to add global definitions to activate some of Unicos' compatibility with other UNIX operating systems. Specifically, by using the "__BIT_TYPES_DEFINED__" define, standard data types such as char, short, int and their unsigned counterparts are matched to data types such as int8_t, int16_t, and int32_t. This way the size of the data types will not be larger than expected in the source code.
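The effect of an unexpectedly wide data type on bit operations can be illustrated with a small sketch. Python integers are unbounded, so the native word size is modeled here with an explicit mask. The function names are hypothetical and the rotate operation is only a stand-in for the kind of arithmetic found in encryption and checksum code, not an excerpt from Kerberos.

```python
MASK32 = 0xFFFFFFFF


def rotl32(value, count):
    """Rotate a 32 bit quantity left by count bits. The explicit mask makes
    the result independent of the width of the underlying word."""
    value &= MASK32
    count %= 32
    return ((value << count) | (value >> (32 - count))) & MASK32


def naive_rotl(value, count, word_bits):
    """The same rotate written on the assumption that the native word is
    exactly 32 bits wide. word_bits models the real width of the word, so
    with word_bits=64 the bit shifted past position 31 is kept instead of
    being discarded."""
    word_mask = (1 << word_bits) - 1
    return ((value << count) | (value >> (32 - count))) & word_mask
```

With a 32 bit word both routines agree. With a 64 bit word the naive version leaves a stray high bit in the result, which is the kind of silent difference that breaks checksums and packet fields until the code is modified to "substitute code to generate the same results".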
On Unicos systems, this data type size mismatch also appears in the Kerberos configuration files. The system "keytab" file contains binary data unique to the host and is used for identifying the system to the Kerberos servers. A special program is required to convert the binary data generated by the Kerberos utilities to a format/data type that can be used to communicate with the Kerberos servers. The data types stored within that file have to be modified to match the data types expected by the server.
When a user has finished their kerberized session, they log off the system and signals are generated by the child processes. Modifications have to be made to ensure that the signals are properly handled. Unicos has additional signals to signify the end of a job. The job concept is unique to Unicos so most applications are not written to handle these additional signals. This modification is also related to the UDB modifications since the job structure is a user accounting addition.
Once the daemons and Kerberos utilities are compiled, it is important to properly manage the configuration files and control access to them. Keeping the files up to date, unchanged, and properly accessible helps maintain proper system security. After all, these configuration files now control system authentication. The Kerberos configuration files take on importance equivalent to, if not greater than, the password files on non-kerberized systems.
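One possible way to check that a configuration file is "unchanged, and properly accessible" is to compare its permissions and checksum against recorded values. The sketch below assumes a POSIX system. The function names, the default permission ceiling of 0600, the choice of SHA-256, and the file name used in the demonstration are all assumptions for illustration rather than anything taken from ARSC's procedures.

```python
import hashlib
import os
import stat
import tempfile


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def check_config(path, expected_sha256, max_mode=0o600):
    """Return a list of problems found with one configuration file."""
    problems = []
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & ~max_mode:
        problems.append("mode %o is more permissive than %o" % (mode, max_mode))
    if sha256_of(path) != expected_sha256:
        problems.append("contents differ from the recorded checksum")
    return problems


def config_demo():
    """Check a throwaway file before and after it is loosened and altered."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "example.keytab")  # hypothetical file
        with open(path, "wb") as handle:
            handle.write(b"placeholder contents\n")
        os.chmod(path, 0o600)
        baseline = sha256_of(path)
        clean = check_config(path, baseline)
        os.chmod(path, 0o644)
        with open(path, "ab") as handle:
            handle.write(b"unexpected change\n")
        changed = check_config(path, baseline)
    return clean, changed
```

The recorded checksums themselves need the same protection as the files they describe, for example by keeping them on a separate administration host.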
ARSC has many workstations and desktop systems, all of which need access to printing. There are many printers to choose from, based upon location and the type of print job. Each printer has specific features that users would like to utilize. However, each platform had a radically different printing system, none of which could be managed together. Also, on the UNIX systems that implemented the LPD print spools, the vendors used a code base with known current and historical security problems. Since ARSC definitely wanted to securely access the printers from the UNIX systems, and centralized server management is preferred, LPRng was chosen to replace the vendor supplied software.
The latest LPRng software is 3.8.9. The latest version of the filters package is 3.5.6. ARSC tries to keep this software up to date to both take advantage of new features and new printer descriptions, as well as to avoid any newly discovered security vulnerabilities. LPRng is not installed on the Cray SV1 or the T3E as they are not functioning as print servers. Since the Unicos print clients are already lpd based (i.e. they use /etc/printcap) and there is no printer daemon running, it was decided not to install one.
Maybe the most confusing part of LPRng is the filter package, IFHP. The documentation is vague because of the wide variety of printers and printer options. Unless very special printer options are necessary, stick to the generic filter definitions as much as possible.
There is a very active LPRng mailing list that is the main LPRng support channel. The primary LPRng developer, Patrick Powell, is generally very prompt in responding to questions and bug reports. This is probably the ideal example for how good open source support can be. The user community is very active, and fixes are released quickly. Rapid code turnover can also be the bane of open source. It is important to test new versions of software in the local environment before putting them into production. If the software is updating faster than an administrator can test and install it, pretty soon the administrator will stop trying to keep up with the latest version. This means unpatched security vulnerabilities may reside on the system.
All ARSC systems support SSH access, using Kerberos authentication, both for user logins and for administrative connections from a central administration server. Users connecting across the Internet using SSH do not have to worry about eavesdropping. ARSC administrators using SSH to update local configuration files do not have to worry about the files being captured. SSH also replaces old, insecure daemons with new software that is continuously being reviewed and updated.
A central administration server is used to store, update, and monitor the majority of system and software configuration files on most ARSC systems. Using SSH, modifications can be securely propagated to the client systems without worrying about eavesdropping. They can also be automatically propagated and monitored over a system trust relationship that is strengthened by using the alternate authentication methods provided by SSH. Specifically, root and other system accounts on the administration server can be allowed to connect to client systems via automated jobs and scripted commands using the RSA authentication method, which is based on public/private keys. This way the system administrator does not have to worry about the trust relationship between systems being poisoned by misconfiguration of the traditional "r-files", or by exploitation of the domain name system (DNS).
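A scripted, key-based push of a configuration file might be assembled as in the following sketch. It assumes an OpenSSH-style scp client (the -q, -o BatchMode=yes, and -i options); the SSH 1.2.27 client discussed in this paper may spell its options differently. The function name, host name, and file paths are hypothetical, and the command is only built here, not executed.

```python
def push_command(identity_file, local_path, client, remote_path, user="root"):
    """Build (but do not run) an scp command for a non-interactive push.

    BatchMode=yes makes the client fail instead of prompting, which is the
    behavior an automated job needs when key authentication does not work.
    """
    return [
        "scp",
        "-q",
        "-o", "BatchMode=yes",
        "-i", identity_file,
        local_path,
        "%s@%s:%s" % (user, client, remote_path),
    ]
```

The resulting list can be handed to subprocess.run() from a scheduled job. Keeping the private key readable only by the account that runs the job is what makes this trust relationship stronger than the traditional "r-files".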
There are two SSH product lines. SSH is maintained by the company SSH.com. OpenSSH[11] is maintained by the OpenBSD developers and is fully open source. SSH for UNIX is open source to educational institutions and government agencies up to version 1.2.27. Version 1.2.28 and later is open source to educational institutions and anyone using an open source operating system (Linux and the various BSDs). Because ARSC has a mix of educational and government users, we are currently standardized on the SSH 1.2.27 product. Until recently, OpenSSH did not have the necessary Kerberos support for it to be integrated into ARSC's Kerberos/SecurID authentication method. Recently Simon Wilkinson[17] has been developing patches for OpenSSH to support the modified Kerberos ARSC uses. His patches, combined with Cray's Unicos port of OpenSSH, should allow ARSC to upgrade.
This package exemplifies one of the important features of open source software. Because the code is available, any site can modify the software to work in their environment. By submitting the modifications back to the maintainers, each site also contributes to the open source community and allows other sites to use the software as well.
Compiling SSH for Unicos is not straightforward. Like Kerberos, modifications are required. Many of the same issues arise. ARSC has generated Unicos patches for SSH 1.2.27[3] and Cray has ported OpenSSH 3.0 to Unicos and Unicos/mk as part of the Cray Open Software package (available for a small fee). Furthermore, ARSC is working on a patch for OpenSSH 3.1 that includes the Cray modifications and modifications incorporating Kerberos 5 support.
To tie SSH into the operating system, several libraries and include files need to be added to the compile options, and to some of the SSH code. The primary tie in is to the UDB system. The UDB is unique to Unicos so SSH does not support UDB formatting and account structures. To better support the UDB, a few system specific routines are necessary to make the proper login and logout calls and properly modify the user environment settings, including handling the tmpdir creation.
Unicos also has a unique multi-level security implementation. Privilege mechanisms stored in the UDB need to be enforced at login. If they are not properly handled, it is possible for a user to connect to the system and either not have enough privileges to actually function, or have root access to the entire system. The ARSC patch creates a separate file containing the necessary routines and then only interjects subroutine calls into the original SSH code. This modular approach to program modifications helps to maintain a secure code base.
SSH was developed on 32 bit platforms. The Crays are 64 bit and lack some of the data types used for memory allocation in SSH's encryption and network packet manipulation routines. Once the affected code is located, appropriate data types, or code computing the proper sizes, are substituted in the source code.
Like Kerberos, another modification specific to Unicos is job termination. Unicos has a job concept which is not the norm. This affects how signals need to be handled when a session finishes. Proper, clean process termination is very important when handling user sessions that provide a shell. Receiving and handling the signals in the proper order helps to close the session cleanly, and provide the proper return code.
Every UNIX system handles logging and auditing just a little bit differently. In the case of login daemons, it is important to log successful and failed connections as well as properly update files such as utmp and wtmp. Minor modifications need to be made for each new system that SSH is ported to, including IRIX and Unicos.
Once code modifications are completed and SSH is compiled, the next big hurdle is setting up a proper configuration file. Each option needs to be reviewed and set to an accepted value. SSH has many options. If installing SSH on many platforms, ensuring that SSH is configured correctly on each system adds another order of magnitude to the management effort. However, by documenting the options for each platform and referring back to that documentation for each upgrade or new platform, time is saved during security audits, upgrades, and meetings with managers who want to know how the system functions.
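Checking a daemon configuration against the documented settings can be automated. The following is a minimal sketch that assumes a simple "Option value" file format with # comments. The function names and the sample option names used with it are illustrative, and the rule that the first occurrence of an option wins is an assumption about the daemon's precedence that should be verified for the SSH version in use.

```python
def parse_sshd_config(text):
    """Map lower-cased option names to values, ignoring comments and blanks."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        name = parts[0].lower()
        value = parts[1].strip() if len(parts) > 1 else ""
        options.setdefault(name, value)  # assumed: first occurrence wins
    return options


def compare_to_baseline(text, baseline):
    """Return {option: (expected, actual)} for each option that deviates from
    the documented baseline. actual is None when the option is not set."""
    actual = parse_sshd_config(text)
    deviations = {}
    for name, expected in baseline.items():
        found = actual.get(name.lower())
        if found is None or found.lower() != expected.lower():
            deviations[name] = (expected, found)
    return deviations
```

Options missing from the file are reported as well as options with the wrong value, since a missing option silently falls back to a compiled-in default that may differ between versions or platforms.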
Using sudo, a system administrator can allow limited, or full, root access to any user on the system without having to share the root password. Each user with sudo abilities uses their own password (or SecurID card, or Kerberos ticket) to authenticate their root commands. Each time sudo is used to execute a command, the entire command line, as well as the user and group IDs, are logged either to a sudo log file, or SYSLOG. The configuration file, called "the sudoers file", is very flexible, which means the administrator can control the extent of each user's root access. In fact, on most ARSC systems, the root account no longer has a root password (it is locked), which means that if the password file is captured, the root account cannot be broken into via password cracking.
The latest version of sudo is 1.6.6. ARSC tries to stay at the current version to take advantage of new features and protect against discovered vulnerabilities.
Installing the software is very straightforward. Configuring the sudoers file, however, can be tricky.
Once the compilation options are chosen and the program is compiled, the next step is to properly configure users' privileges. The best approach is to only grant the ability to execute certain well-defined commands as root. However for a general system administrator, this list could get quite long and complicated. So the core administrator group can be given the ability to execute any command as root. Remember, all such commands are fully logged, down to the command line options supplied with the command.
It would be nice if, once the core administrator group is configured, one could disallow the ability to execute the shell as root. Certainly the configuration language of the sudoers file allows one to specify not executing the shell binary itself, but there are many ways to get around such a restriction, for example executing vi and then escaping out of it. It is probably better to trust the core group of administrators and allow them to do their job than to waste resources trying to tie them down.
On the other hand, non-administrators should have very restricted access. Each defined command should be reviewed for potential shell escapes or alternate uses. The command should be defined as completely as possible with minimal regular expressions. Each user should be fully aware that their account now has potential root access and that they need to protect access to their account even more than before.
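Part of the review of each defined command can be mechanized. The sketch below flags commands that are shells, that are known to offer shell escapes, or that contain wildcards. The function name and both program lists are assumptions and are far from complete; a check like this can only supplement, not replace, reading each program's documentation.

```python
import os

# Illustrative, incomplete lists; each site would maintain its own.
SHELL_PROGRAMS = {"sh", "csh", "ksh", "tcsh", "bash"}
SHELL_ESCAPE_PROGRAMS = {"vi", "view", "ed", "more", "less", "emacs", "ftp"}


def review_commands(commands):
    """Return (command, reason) pairs for entries that deserve a closer look
    before they are granted to a non-administrator."""
    findings = []
    for command in commands:
        parts = command.split()
        if not parts:
            continue
        program = os.path.basename(parts[0])
        if program in SHELL_PROGRAMS:
            findings.append((command, "is a shell"))
        elif program in SHELL_ESCAPE_PROGRAMS:
            findings.append((command, "can start a shell"))
        if "*" in command or "?" in command:
            findings.append((command, "contains a wildcard"))
    return findings
```

An entry that passes this check is not necessarily safe. For example, a locally written script run as root may itself call a program with a shell escape.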
At ARSC, there are two uses of Swatch. The first is for log monitoring. Using the available actions, system administrators open a shell to the central loghost and have Swatch parse entries as they come in. Those entries defined as interesting are marked somehow (highlighted in various colors, or announced with the terminal bell) to catch the administrator's attention. By having multiple administrators monitoring the logs in real time, it is possible to catch potential errors before a system goes down.
The second usage is for intrusion detection. Nightly the system logs are parsed for network traffic and authentication activity. The "interesting" entries are then emailed to the security administrator(s) for further review. Because all of the systems are sending log entries to a central log host, this parsing greatly reduces the effort to identify unauthorized behavior.
Currently swatch version 3.0.4 is available. It requires Perl 5 and several Perl modules to operate though. Even when new versions of the software are available, ARSC is not as quick to update this software. As long as the parsing routines work, it is not as critical to install a fix. The current version has undergone a lot of revision to take advantage of new features so a new install should definitely use the latest version. However once installed, it is only important that the needed features are available and work properly.
After learning about Perl modules and completing the install, filters have to be developed for parsing the log files. These filters are Perl regular expressions that define what actions Swatch takes when a match is made. It is important that each expression be as exact as possible. If an expression matches too many things, the result is either false positives (warnings of problems that do not really exist) or false negatives (no warning when a problem exists). The best approach is to start with the philosophy that every log entry is a problem, then filter out the entries that are known, understood, acceptable events. After several (maybe many) repetitions of updating the filter and reviewing the results, it will become possible to quickly identify errant system behavior and react appropriately.
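The "every log entry is a problem" philosophy can be sketched outside of Swatch. The example below is Python rather than a Swatch configuration, and the function name, log lines, and patterns are invented for illustration. It reports every entry that is not matched by a pattern describing a known, acceptable event.

```python
import re


def unexplained(lines, acceptable_patterns):
    """Return the log lines not matched by any known-acceptable pattern."""
    return [
        line for line in lines
        if not any(pattern.search(line) for pattern in acceptable_patterns)
    ]


# Hypothetical log entries from a central log host.
SAMPLE_LOG = [
    "Jun  1 02:00:00 loghost syslogd: -- MARK --",
    "Jun  1 02:00:05 host1 cron[210]: (root) CMD (/usr/local/sbin/rotate)",
    "Jun  1 02:00:09 host2 login[388]: FAILED LOGIN 3 FROM 192.0.2.7 FOR root",
]

# Exact patterns: each one describes a single understood event.
EXACT = [
    re.compile(r"syslogd: -- MARK --$"),
    re.compile(r"cron\[\d+\]: \(root\) CMD \(/usr/local/sbin/rotate\)$"),
]

# An over-broad pattern written to silence the cron entry.
TOO_BROAD = [re.compile(r"root")]
```

With the exact patterns only the failed login is left for review. With the over-broad pattern the failed login is filtered out along with the cron entry, which is the false negative the text warns about.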
Because system access logging is up to individual daemons, it is difficult to know who connects to a given system via which service. It is also difficult to implement a common ACL used by the various services. By wrapping each service with TCPwrappers, an administrator can control all the available services from a single list, and have a single record of all the attempted connections to those services.
TCPwrappers is also very portable. This means that the same monitoring and filtering can be implemented on many systems with little effort. Combined with syslog (and Swatch), this means a powerful intrusion detection mechanism can be implemented without deploying new hardware.
The latest version of TCPwrappers is 7.6, and it has been for a long time. There does not appear to be active development anymore. However it also does not seem to be necessary. Nobody has found a bug or vulnerability in this software for quite a while.
After compiling TCPwrappers and installing the binaries, the configuration file needs to be set up and managed. At ARSC, access controls are implemented on the network level, not the system level. So instead of allowing or disallowing connections through TCPwrappers, the daemon is used primarily for logging standard services like rsh, ftp, and so on. The TCPwrappers library is also useful for logging connections to sshd and sendmail.
Log entries generated by tcpd are sent to their own SYSLOG file by using a unique facility. This file can then be easily parsed using Swatch. By combining the TCPwrappers SYSLOG file from all the systems, patterns are easily detectable. It is possible to quickly identify when the local network is being scanned, or when a specific host is being port mapped for example. Because TCPwrappers can run on many different platforms, the log format from any of the site's systems will be consistent which simplifies writing Swatch filters and identifying security/system problems.
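The pattern detection described above could look something like the following sketch. It assumes syslog lines of the form "date host service[pid]: connect from client" (or "refused connect from client"); the exact format, the function names, and the thresholds are assumptions that would need adjusting to the local syslog configuration.

```python
import re
from collections import defaultdict

# Assumed line format: "Mon DD HH:MM:SS host service[pid]: [refused ]connect from client"
ENTRY = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+(?P<host>\S+)\s+(?P<service>\S+?)\[\d+\]:\s+"
    r"(?:refused )?connect from (?P<client>\S+)$"
)


def collect(lines):
    """Map each connecting client to the set of (host, service) pairs it touched."""
    seen = defaultdict(set)
    for line in lines:
        match = ENTRY.match(line.strip())
        if match:
            seen[match.group("client")].add((match.group("host"), match.group("service")))
    return seen


def network_scanners(lines, min_hosts=3):
    """Clients that contacted at least min_hosts different local hosts."""
    return sorted(
        client for client, targets in collect(lines).items()
        if len({host for host, _ in targets}) >= min_hosts
    )


def port_mappers(lines, min_services=3):
    """(client, host) pairs where the client tried at least min_services services."""
    found = []
    for client, targets in collect(lines).items():
        per_host = defaultdict(set)
        for host, service in targets:
            per_host[host].add(service)
        found.extend(
            (client, host) for host, services in per_host.items()
            if len(services) >= min_services
        )
    return sorted(found)
```

This only works because every system reports connections in the same format; with per-daemon formats each service would need its own pattern.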
Setting up a new system correctly the first time is very important. Current estimates from the Honeynet Project indicate that insecure systems are compromised in less than 24 hours from the time they are put on a public network. They are scanned in less than an hour, even on partially obscured networks. It is extremely important that new systems undergo some kind of vulnerability analysis before connecting to a public network and tiger can be that analysis tool.
Monitoring a system's current health by periodically re-running tiger and looking for changes in the output is important to maintaining secure operations. System changes, especially those that occur without an administrator's knowledge, are signs that something potentially dangerous is occurring on the system and should be investigated.
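Looking for changes between two runs amounts to a set comparison of the report lines. The sketch below treats each non-blank line of a report as one finding. The function name is hypothetical, and the reports it is given are expected to be plain text with one finding per line, which is an assumption about the output format.

```python
def report_changes(previous, current):
    """Compare two report texts and return the findings that appeared and
    the findings that went away, each as a sorted list of lines."""
    old = {line.strip() for line in previous.splitlines() if line.strip()}
    new = {line.strip() for line in current.splitlines() if line.strip()}
    return {"appeared": sorted(new - old), "resolved": sorted(old - new)}
```

Findings present in both runs are left out on purpose, so that the administrator's attention goes to what changed since the last run.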
Unfortunately tiger is an old tool. It has been at version 2.2.4p1 since 1999 and no active development is underway, except possibly for Linux systems. A new tool, named TARA[15], was started to update tiger by recoding as well as adding new vulnerability checks. TARA is currently at version 2.0.9 and there are rumors that it will be updated again soon thanks to a fresh infusion of development cash. ARSC is still using tiger because so much effort has been put into updating the scripts for our systems as well as creating a back end parser to email the output in a specific, readable format to the administrators of each system.
ARSC also wrote a post processing script to sanitize the output of Tiger before it is mailed to the administrators. The post processing includes reformatting the output so it is human readable, removing known conditions, and adding counts of other conditions instead of listing out each affected file (which can sometimes range into thousands of files). This post processing of the output allows ARSC to run Tiger on a weekly basis and quickly identify system changes or anomalies. This is the hardest part of making Tiger useful - making the changes stand out from the copious amounts of output, some of which is just plain wrong.
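The kind of post processing described above can be sketched as follows. This is not ARSC's script. It assumes report lines of the form "--LEVEL-- [msgid] text", modeled loosely on tiger's output, and the function name, message identifiers, and threshold are hypothetical.

```python
import re
from collections import Counter

# Assumed line format: "--WARN-- [msgid] free text"
MESSAGE = re.compile(r"^--(?P<level>[A-Z]+)--\s+\[(?P<msgid>\w+)\]\s+(?P<text>.*)$")


def summarize_report(report, known_ids=(), list_limit=2):
    """Drop known conditions, then list each remaining message, replacing any
    message id seen more than list_limit times with a single count line."""
    counts = Counter()
    texts = {}
    for line in report.splitlines():
        match = MESSAGE.match(line.strip())
        if not match or match.group("msgid") in known_ids:
            continue
        msgid = match.group("msgid")
        counts[msgid] += 1
        texts.setdefault(msgid, []).append(match.group("text"))
    summary = []
    for msgid in sorted(counts):
        if counts[msgid] > list_limit:
            summary.append("%s: %d occurrences" % (msgid, counts[msgid]))
        else:
            summary.extend("%s: %s" % (msgid, text) for text in texts[msgid])
    return summary
```

Collapsing repeated messages into counts keeps the mail short, but a change in a count from one week to the next is itself worth a look.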
1. Nessus[7] - a vulnerability checking and system monitoring tool. ARSC is evaluating it in conjunction with ISS (a commercial network vulnerability scanner) and as a possible replacement/upgrade for Tiger.
2. SyslogNG[2] - a replacement for the standard syslog daemon. ARSC is evaluating the "expanded filtering capability" and its use of TCP instead of UDP for transmitting entries over the network.
3. Tripwire[16] - file integrity monitoring software. ARSC is evaluating it both for monitoring the root file system of the various platforms, and for monitoring the document trees of the various web servers deployed.
Using the installation procedure and experiences with the tools already installed, incorporating these new packages should be straightforward. New software is also an opportunity to improve the installation process. Each time a package is installed, the procedure is updated and new suggestions make the installation process even more efficient, and more secure.
Open source security tools are currently some of the oldest, active open source projects. Just as hackers share their cracks and exploits, security administrators share their knowledge and tools for the benefit of the entire community. After all, risk assumed by each site on a network is risk affecting other sites. Using open source tools, an administrator leverages the same tools on many systems and across dissimilar platforms. The commonality of configuration and maintenance reduces the overall effort to secure the entire site. Since the information security realm adds new facets at an alarming rate, it is important that an administrator's time and effort be as efficiently employed as possible. Common tools provide that efficiency.
Hopefully new security tools will continue to be open source development projects. It is very difficult for any one person, or organization, to maintain expertise in the many areas of information security. By working together as a community, the resulting toolbox will be very powerful and effective. All administrators with any interest in security and programming should become involved in the open source effort. Helping to maintain the tools provides benefit to all users, and possible fame (but no fortune) for the contributor. Administrators not interested in programming or porting tools should encourage their vendor(s) to become involved. Either way, the tools improve and the high performance computing community benefits with every contribution.