The Six Most Important Tenets For Configuration Management
By Stephen Northcutt
Summary: According to Answer.com, a tenet is an opinion, doctrine, or principle held as being true by a person or especially by an organization. In SANS Security Leadership Essentials, we consider six tenets that an information assurance manager can use as guiding principles to do configuration management right from the start and to lead an IT organization toward greater security and robustness. Operational change can seem hard to approach without a framework or road map, so we introduce these six tenets as a plan for improving the operational practice of your organization:
- Focus on hardening systems
- Develop repeatable builds
- Implement change control
- Audit change control, discipline people if you have to, but make change control stick!
- Don't troubleshoot! Burn bad boxes to the ground and reload them!
- Reengineer the frailest boxes first
Focus on hardening systems
Our basic problem is that our systems are simply too frail to exist on a network. The Internet Storm Center maintains a research project called Internet Survival Time. At the time of this writing, the survival time was forty minutes. That figure varies with the day and with your location on the Internet, but the key concept is that if attacks arrive faster than patches are applied, the operating system will not survive; some attacker will take it over. Our number one goal, therefore, is to figure out how to harden our systems. SANS has courses on Windows and Unix, of course[5,6], but the Pareto Principle solution, the one that gets us 80% of the way there with 20% of the effort and expense, is to use the Center for Internet Security templates. They provide generally accepted configurations for Unix, Microsoft servers and desktops, Cisco routers and a host of other tools. Vendors, such as Sun and Dell, deliver operating systems, databases, networking gear and applications in every state imaginable.
Software and operating systems have a tremendous number of features that can be used for security and regulatory compliance, but many manufacturers deliver equipment with those features turned off. It is like buying a car and finding its safety equipment, such as the brakes, in the trunk, so that you have to find an expert to install it for you. That is how manufacturers of computing and network equipment treat the industry. Fortunately, in August 2000 interested parties met at the Cosmos Club in Washington, DC to discuss this very problem, and funding was made available to create a non-profit called the Center for Internet Security (CIS).
Many Information Security managers make the mistake of using widgets such as firewalls and anti-virus as the primary line of defense. Instead, our emphasis should be on building better systems that are consistent and carefully managed with strict change control. We want to get to the point where widgets are like parachutes: we rely on the airplane to be safe and to keep us in the air, and the parachute protects us only if the airplane fails.
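To make the idea of hardening against a template concrete, here is a minimal sketch of a baseline check in Python. The setting names and values are hypothetical and are not taken from any actual Center for Internet Security benchmark; a real check would read its expected values from the published template for the platform in question.

```python
# Hypothetical baseline: setting names and values are illustrative only,
# not taken from any actual CIS benchmark.
BASELINE = {
    "telnet_enabled": False,
    "password_min_length": 12,
    "firewall_enabled": True,
}


def find_deviations(baseline, actual):
    """Return {setting: (expected, found)} for every setting that differs."""
    deviations = {}
    for setting, expected in baseline.items():
        found = actual.get(setting)  # None when the host does not report it
        if found != expected:
            deviations[setting] = (expected, found)
    return deviations


# Settings as reported by one (made-up) host.
host_settings = {
    "telnet_enabled": True,
    "password_min_length": 12,
}

deviations = find_deviations(BASELINE, host_settings)
for setting, (expected, found) in sorted(deviations.items()):
    print(f"{setting}: expected {expected!r}, found {found!r}")
```

A setting the host does not report at all is treated as a deviation, on the assumption that an unknown state should not count as hardened.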
Develop repeatable builds
Each system in an enterprise should use a consistent configuration and image. This includes all devices: Windows servers, Unix servers, client workstations, routers, switches, etc. Eliminate the unique snowflake systems by using consistent imaging or configuration files for each system that is deployed. Have the builds carefully examined to ensure they are properly hardened to a Gold Standard, then deploy. If you are fortunate enough to be able to create systems entirely from source code, a great article titled Benefits of the Build is available from Dr. Dobb's, http://www.ddj.com/dept/architect/184415286. The key points are shown below, with the final one added by our authoring team:
- The build must be a clean compile, with no compiler-generated syntactical errors.
- The build should be done using the most current version of all source files.
- A clean build implies that all application source files are fully compiled.
- The build should generate all files needed for deployment.
- All unit tests for the application should be run. If they all pass, the build is successful.
- First deploy it to a development region to be further verified.
- The build is released, and it becomes operational.
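The key points above amount to an ordered sequence of gates, where a failure at any step stops the build before it can reach operations. The following is a minimal sketch of that idea in Python. The step names paraphrase the list, and each lambda is a placeholder standing in for a real tool invocation (compiler, test runner, deployment script), which this sketch does not attempt to model.

```python
def run_build(steps):
    """Run (name, callable) steps in order; stop at the first one that fails.

    Returns (succeeded, completed_step_names).
    """
    completed = []
    for name, step in steps:
        if not step():
            return False, completed
        completed.append(name)
    return True, completed


# Placeholder steps: each lambda stands in for a real tool invocation.
steps = [
    ("fetch latest sources", lambda: True),
    ("full clean compile", lambda: True),
    ("generate deployment files", lambda: True),
    ("run unit tests", lambda: False),  # simulate a failing unit test
    ("deploy to development", lambda: True),
    ("release to operations", lambda: True),
]

ok, completed = run_build(steps)
print("build succeeded:", ok)
print("completed steps:", completed)
```

Because the simulated unit test fails, the build stops there and neither the development deployment nor the release is attempted.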
The common term in industry for the deployment of an application or system to become operational is a release. An important subset of the repeatable build process is release management. A release consists of the new or changed software and/or hardware required to implement approved and tested changes. According to the Information Technology Infrastructure Library (ITIL), releases are categorized as:
- Major software releases and hardware upgrades, normally containing large areas of new functionality, some of which may make intervening fixes to problems redundant. A major upgrade or release usually supersedes all preceding minor upgrades, releases and emergency fixes
- Minor software releases and hardware upgrades, normally containing small enhancements and fixes, some of which may have already been issued as emergency fixes. A minor upgrade or release usually supersedes all preceding emergency fixes
- Emergency software and hardware fixes, normally containing the corrections to a small number of known problems
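The three categories form a simple ordering: a larger release usually supersedes the smaller ones that came before it. A minimal sketch of that relationship in Python follows. The `ReleaseType` enum, the `superseded_by` helper, and the release names are illustrative assumptions, not part of ITIL, and the word "usually" in the definitions means a real release process would still confirm each case.

```python
from enum import IntEnum


class ReleaseType(IntEnum):
    # Ordered so that a larger value usually supersedes a smaller one.
    EMERGENCY = 1
    MINOR = 2
    MAJOR = 3


def superseded_by(new_type, earlier):
    """Return the earlier (name, type) releases that a new release of
    new_type would usually supersede."""
    return [(name, kind) for name, kind in earlier if kind < new_type]


# Hypothetical release history, oldest first.
earlier = [
    ("2.0", ReleaseType.MAJOR),
    ("fix-101", ReleaseType.EMERGENCY),
    ("2.1", ReleaseType.MINOR),
]

for name, kind in superseded_by(ReleaseType.MAJOR, earlier):
    print(f"a new major release supersedes {name} ({kind.name.lower()})")
```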
Implement change control
Change control begins in the data center. Establish your "first revision" based on your existing infrastructure and systems. It is not usually possible to implement change control on the entire organization's IT structure at one time. Break the project into phases, such as database servers, email servers, standard desktops, etc. Once a phase or release is moved out of development into operations, the development team no longer has the authority to change the release. From that point on, every change must go through the change control process. Design a policy for approval and require "change orders" or "change requests" before changes to any system can be made. The change request must list all changes. Changes should require several levels of authorization, and the authority to approve a change should never rest with a single person, but with one or more teams. Use technology, such as registry and file imaging, to ensure that no unauthorized changes occur. Ensure there is proper auditing and documentation for each change to permit system rollback and concise documentation on the environment.
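One way to use technology to notice unauthorized changes is to record a cryptographic hash of every file at the approved "first revision" and compare later snapshots against it. The sketch below shows the idea in Python using only the standard library. The file names and contents are made up for the demonstration, and a production tool would also need to protect the stored snapshot itself from tampering.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_snapshots(approved, current):
    """Return sorted lists of added, removed, and modified paths."""
    added = sorted(set(current) - set(approved))
    removed = sorted(set(approved) - set(current))
    modified = sorted(
        path
        for path in set(approved) & set(current)
        if approved[path] != current[path]
    )
    return added, removed, modified


# Demonstration in a throwaway directory; file names are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sshd_config").write_text("PermitRootLogin no\n")
    (root / "motd").write_text("Authorized use only\n")
    approved = snapshot(root)  # the approved "first revision"

    # Simulate changes made without a change order.
    (root / "sshd_config").write_text("PermitRootLogin yes\n")
    (root / "backdoor.sh").write_text("#!/bin/sh\n")
    (root / "motd").unlink()

    added, removed, modified = diff_snapshots(approved, snapshot(root))

print("added:", added)
print("removed:", removed)
print("modified:", modified)
```

Any path reported here that is not listed on an approved change request is a candidate unauthorized change.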
Audit change control, discipline people if you have to, but make change control stick!
Make change control stick! Don't permit exceptions to change control policies; exceptions undermine the entire process and introduce more risk to the network. Audit the change control environment regularly, and enforce the policies by training the incident response team to react to unauthorized changes in the organization. This requires commitment and solidarity on the part of the management team. Many engineers who implement unauthorized changes have no malicious intent.
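A regular audit can be as simple as reconciling the changes actually observed on systems against the change requests that were properly authorized. Here is a minimal sketch of that reconciliation in Python. The `ChangeRequest` fields, the two-team approval threshold, and the host and file names are all assumptions made for illustration; your own policy defines the real values.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeRequest:
    request_id: str
    host: str
    paths: set  # files the request says it will change
    approvals: set = field(default_factory=set)  # teams that signed off


REQUIRED_APPROVALS = 2  # assumed policy: at least two teams must approve


def is_authorized(request):
    """A request counts only once enough distinct teams have approved it."""
    return len(request.approvals) >= REQUIRED_APPROVALS


def unauthorized_changes(observed, requests):
    """Return observed (host, path) pairs not covered by an authorized request."""
    covered = {
        (r.host, path)
        for r in requests
        if is_authorized(r)
        for path in r.paths
    }
    return sorted(set(observed) - covered)


# Hypothetical requests and observations.
requests = [
    ChangeRequest("CR-1", "db01", {"/etc/my.cnf"}, {"dba", "security"}),
    ChangeRequest("CR-2", "mail01", {"/etc/postfix/main.cf"}, {"messaging"}),
]
observed = [
    ("db01", "/etc/my.cnf"),
    ("mail01", "/etc/postfix/main.cf"),
    ("web01", "/etc/httpd/conf/httpd.conf"),
]

findings = unauthorized_changes(observed, requests)
for host, path in findings:
    print(f"unauthorized change on {host}: {path}")
```

In this example the change on mail01 is flagged even though a request exists, because only one team approved it, and the change on web01 is flagged because no request covers it at all.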
Don't troubleshoot! Burn bad boxes to the ground and reload them!
Once the change control policy is in place with repeatable builds of hardened systems, we can eliminate much of the effort spent on troubleshooting systems. When a system acts flaky, wipe the system clean and reload from the current image. If problems persist, examine the image as the source of the problem, identify the fault, get approval for a change order, and fix the source of the problem. Deploy the fix throughout the organization to maintain consistency. The result is a much more reliable system that will contribute to improved operational performance for the entire organization.
Reengineer the frailest boxes first
We learned this from the fine folks at Visible Ops. Here is how they put it: "Often, infrastructure exists that cannot be repeatedly replicated. In this step, we inventory assets, configurations and services, to identify those with the lowest change success rates, highest MTTR and highest business downtime costs." In other words, if you are just starting out, try to start with a win: pick a box that has crashed or has otherwise been a problem. That way, everybody supports reengineering to a more stable operating system.
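The inventory step can be sketched as a ranking over the three measures named in the quote. The Python below is one possible heuristic, not the Visible Ops method: it multiplies the change failure rate, the mean time to repair (MTTR), and the hourly downtime cost to estimate the expected downtime cost of a single change attempt. The scoring formula, asset names, and figures are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    change_success_rate: float  # fraction of changes that succeed, 0.0 to 1.0
    mttr_hours: float  # mean time to repair after a failure
    downtime_cost_per_hour: float


def frailty_score(asset):
    # Assumed heuristic: expected downtime cost of one change attempt.
    failure_rate = 1.0 - asset.change_success_rate
    return failure_rate * asset.mttr_hours * asset.downtime_cost_per_hour


# Hypothetical inventory.
inventory = [
    Asset("intranet-wiki", 0.90, 2.0, 100.0),
    Asset("legacy-billing", 0.60, 8.0, 5000.0),
    Asset("mail-gateway", 0.80, 4.0, 2000.0),
]

# Frailest first: these are the candidates to reengineer first.
ranked = sorted(inventory, key=frailty_score, reverse=True)
for asset in ranked:
    print(f"{asset.name}: score {frailty_score(asset):.0f}")
```

A weighted sum or a simple sort on each measure in turn would serve equally well; the point is to let the inventory data, rather than habit, choose the first box to rebuild.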