Article (01 March 2010)

Liebert
Emerson Network Power

Five Questions to Ask Before Selecting Power Protection for Critical Systems

A Guide for IT and Data Center Managers

SUMMARY

Appropriate power protection is one of the keys to achieving high availability of business critical systems. Yet many of the key factors that should be considered when selecting these essential systems are often overlooked.

In an ideal world, electrical utilities would deliver clean, reliable power to business critical systems. Unfortunately, this is not the reality. Appropriate systems are required to ensure the necessary power availability and quality are achieved as simply and cost-effectively as possible.

This is not as easy as it sounds. Different applications have different requirements, depending upon the size and type of equipment being supported, the cost of downtime and the organization’s availability goals. Plus, disruptions in power availability and quality can originate from a variety of sources and take a number of forms. Also, the environment being protected is not static: availability requirements may increase over time and the contents of the data center are subject to regular change.

These factors make issues such as lifecycle costs, adaptability, and ease of service more important than ever when selecting appropriate power protection. The following five questions can be used to ensure these issues are adequately addressed:

  1. What level of availability is the power system expected to support?
  2. What are the lifetime costs of the power system?
  3. What is the impact of the power system on data center space?
  4. How will the power system be tested and installed?
  5. How will the power system be monitored and maintained?

The answers to these questions can help ensure the power protection system meets application requirements as simply and cost-effectively as possible. Failing to consider these factors can result in a power system with high lifetime costs, unnecessary complexity or poor performance.

The Basics

A power protection system is configured using four basic types of equipment:

  • Transient Voltage Surge Suppression (TVSS) systems
  • Uninterruptible Power Supply (UPS) systems, including batteries for power during short-term outages (ensuring uninterrupted power during longer-term outages requires a backup power source, usually an on-site generator)
  • Static Transfer Switches (STS) and other switchgear
  • Power Distribution Units (PDU)

Depending upon the configuration, the system may include one or all of these components.

Transient Voltage Surge Suppression (TVSS)

A "transient" is a brief but extreme burst of energy that can travel across AC power, telephone or data lines. About 35 percent of all transients originate outside the facility from lightning, utility grid switching, electrical accidents and other sources. The remaining 65 percent come from inside the facility, often when large power-consuming systems, such as motors or building air conditioning systems, are switched on.

Fuses and circuit breakers are designed for overcurrent protection and do not provide transient voltage protection. The IEEE (Institute of Electrical and Electronics Engineers) recommends TVSS at both the service entrance and the data center. This protection can be integrated into the UPS or installed as standalone devices.

Uninterruptible Power Supply (UPS)

The UPS serves two critical functions: it provides backup power in the event of an interruption in utility power, and, depending upon topology, "conditions" utility power to eliminate power disturbances that can shut down or damage sensitive electronics.

The battery capacity of the UPS determines how long the system can provide power to the load in the absence of utility power. Typically, battery plants are sized to provide 10 to 20 minutes of power at full load in large data centers. Smaller facilities or high-power applications may be configured with more battery capacity.

The degree of power conditioning provided by a particular UPS system is a function of its topology. The International Electrotechnical Commission (IEC) defines three types of UPS topology: passive standby (offline), line interactive, and double conversion (online).

A passive standby, or off-line, UPS mainly provides short-term outage protection. It may include surge suppression, but does not provide true power conditioning and is typically used in non-critical, desktop applications.

Line interactive UPS systems monitor incoming power quality and correct for major sags or surges, relying on battery power to enhance the quality of power to the load. This topology eliminates major fluctuations in power quality, but does not protect against the full range of power problems and does not isolate connected equipment from the power source.

An online double conversion system provides the highest degree of protection for critical systems. Rather than simply monitoring the power passing through it, as with the line interactive approach, a double conversion UPS system creates new, clean AC power within the UPS itself by converting incoming AC power to DC and then back to AC. This not only provides precisely regulated power, but also has the advantage of isolating the load from the upstream distribution system.

UPS topology has a major influence on system performance and should be a key factor in matching the UPS to the application. Historically, "online" and "double conversion" were synonymous. However, some manufacturers now market line interactive UPS systems as "online" systems. Always clarify how a particular UPS would be classified under IEC standards – not the manufacturer’s own terminology.

Static Transfer Switch (STS)

Redundancy is often designed into the power system to eliminate single points of failure. When redundant UPS systems are used in a distributed redundant configuration (see Figure 4, Continuous Availability architecture) there must be a means of switching between systems that is transparent to the load. This is the function of the static transfer switch.

Power Distribution Unit (PDU)

The power distribution unit distributes power from the UPS (or utility if no UPS is present) to the supported systems. A power distribution system can range from a single PDU, to a PDU/static transfer switch combination, to multiple PDUs and switches.

Establishing Availability Requirements

Availability refers to the percentage of time that a system is available, online, and capable of doing productive work. It is typically described as an annual percentage, or number of "nines." A system with "three nines" availability is 99.9 percent available, which translates into about 8.8 hours of downtime annually. "Five nines" availability, the standard many data centers aspire to, translates into less than six minutes of downtime annually.
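
The conversion from "nines" to annual downtime is simple arithmetic. The following is a minimal sketch, assuming a 365-day year (8,760 hours); the function name is chosen for illustration only.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years are ignored


def annual_downtime_hours(availability_percent: float) -> float:
    """Return the expected downtime per year, in hours, for a given availability."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    return (1.0 - availability_percent / 100.0) * HOURS_PER_YEAR


# Print the downtime implied by three, four, and five nines.
for label, pct in [("three nines", 99.9), ("four nines", 99.99), ("five nines", 99.999)]:
    hours = annual_downtime_hours(pct)
    print(f"{label} ({pct}%): {hours:.2f} hours, or {hours * 60:.1f} minutes per year")
```

Three nines works out to roughly 8.76 hours per year and five nines to a little over five minutes, consistent with the figures above.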

With power availability, an additional factor should be considered: the availability of conditioned power. Is it acceptable for systems being protected to operate on unconditioned utility power for short periods of time? Answering this question requires balancing the increased risk of operating on unconditioned power against the added cost of UPS redundancy, which is required to minimize or eliminate the time protected equipment is exposed to utility power.

If the cost of downtime is low, the investment in redundant systems may not be warranted. If the cost of downtime is high, failing to design redundancy into the system could result in significant financial losses. At minimum, the following should be considered when calculating downtime costs:

Direct employee costs:

The estimated total hourly cost of all employees affected by the downtime, multiplied by the percentage of their work that is affected.

Employee recovery cost:

The costs of employees to "catch up" once power is restored.

Lost revenue:

The impact of downtime on customers, either through an inability to complete transactions or reduced service. The cost of loss of goodwill and reduced customer confidence can also be considered. Businesses may not only lose transactions, but customers as well.

Recovery costs:

The time and out-of-pocket expense required to restore the system. This includes service costs as well as the cost to replace systems damaged by surges or other power anomalies.

Adding up these costs, plus others specific to a business, will produce an average cost per hour of downtime that can be used to evaluate the return on investment of power system components required to achieve higher levels of availability.
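
The cost categories above can be combined into a single hourly figure. The sketch below is illustrative only: the field names are assumptions, each input is expressed per hour of downtime, and the dollar amounts in the example are hypothetical placeholders rather than data from any real organization.

```python
from dataclasses import dataclass


@dataclass
class DowntimeCostInputs:
    """All figures are per hour of downtime; names are illustrative assumptions."""
    affected_employee_hourly_cost: float  # combined hourly cost of all affected employees
    impact_fraction: float                # share of their work that stops (0.0 to 1.0)
    employee_recovery_cost: float         # cost of "catching up" once power is restored
    lost_revenue: float                   # lost transactions and reduced service
    system_recovery_cost: float           # service and replacement of damaged systems


def hourly_downtime_cost(c: DowntimeCostInputs) -> float:
    """Sum the four cost categories into an average cost per hour of downtime."""
    if not 0.0 <= c.impact_fraction <= 1.0:
        raise ValueError("impact_fraction must be between 0.0 and 1.0")
    direct_employee_cost = c.affected_employee_hourly_cost * c.impact_fraction
    return (direct_employee_cost + c.employee_recovery_cost
            + c.lost_revenue + c.system_recovery_cost)


# Hypothetical example values, for illustration only.
example = DowntimeCostInputs(
    affected_employee_hourly_cost=5000.0,
    impact_fraction=0.5,
    employee_recovery_cost=1000.0,
    lost_revenue=10000.0,
    system_recovery_cost=2000.0,
)
print(f"Estimated cost per hour of downtime: {hourly_downtime_cost(example):,.0f}")
```

Multiplying this hourly figure by the annual downtime expected at a given availability level gives a rough yearly exposure that can be weighed against the cost of added redundancy.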

The Causes of Downtime

The major causes of power system downtime include utility outages, human error, externally and internally generated disturbances, maintenance of power system components (depending on system configuration), and failure of power system components (depending on system configuration).

Power Problems:

Disruptions in incoming utility power are unavoidable, whether caused by lightning strikes, construction projects or problems with power company equipment. The widespread blackout that affected the northeastern U.S. and parts of Canada in August 2003 demonstrated just how quickly a small problem in one area can ripple across the grid to create a widespread outage. This is one of the reasons power protection is essential for business critical systems.

However, not all problems are externally generated. Some studies show that up to 80 percent of disturbances occur downstream from the UPS. The power system must isolate protected equipment from internally generated disturbances, in addition to ensuring an uninterruptible source of incoming power.

Maintenance:

Power systems require regular maintenance to ensure proper protection. During maintenance, power to the load may be interrupted or power that is distributed to the load may be unconditioned, depending upon system design. Power systems can be designed for 100 percent concurrent maintenance, meaning all system components can be serviced without affecting power to the load.

New Equipment:

New systems are added to the data center on a regular basis. This may require that power be shut off to some systems while new components or distribution equipment are added, depending upon the configuration of the power system.

Equipment Reliability:

No product is 100 percent reliable, but differences do exist between equipment failure rates and the number of parts used in different UPS designs. Knowing a manufacturer will stand behind its products can minimize downtime caused by low-reliability designs.

The Five Questions

1. What level of availability is the power system expected to support?

To a large extent, the "architecture" of the power system determines the level of availability it can support. There are four basic system architectures, each providing a different level of protection and availability.

Basic Protection

Basic hardware protection of connected equipment is typically accomplished through a TVSS and PDU. The TVSS keeps noise and high voltages from reaching the load, preventing equipment damage. This system does not prevent unexpected shutdown resulting from loss of utility power. It delivers approximately 99.9 percent availability, depending upon the reliability of the utility power source.

Operational Support

Adding a UPS to the power system provides protection against short-term interruptions in utility power and the ability to ensure a controlled shutdown in the event the outage exceeds UPS battery capacity. This increases power availability to 99.99 percent, or less than one hour of unplanned downtime annually. The UPS can also condition the power being delivered to the load. For most critical applications, an online double-conversion UPS is recommended as it delivers the most consistent power quality.

High Availability

A High Availability architecture adds redundancy at the UPS level of the system to increase availability to between 99.999 percent (about five minutes of downtime annually) and 99.99999 percent (about three seconds annually). With these systems, maintenance on all but the PDU can be performed concurrently, eliminating the majority of planned downtime. This can be accomplished through parallel redundancy or 1+1 redundancy.
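
The effect of redundancy on availability can be approximated with the standard parallel-reliability formula. The sketch below is an idealized illustration, not the method behind the figures above. It assumes unit failures are fully independent and that any single unit can carry the whole load. The function name is hypothetical, and the 99.99 percent starting value is borrowed from the Operational Support figure purely for illustration.

```python
def redundant_availability(unit_availability: float, units: int) -> float:
    """Availability of `units` independent units when any one can carry the load.

    Idealized model: the system is down only if every unit is down at once.
    """
    if not 0.0 <= unit_availability <= 1.0:
        raise ValueError("unit_availability must be between 0.0 and 1.0")
    if units < 1:
        raise ValueError("at least one unit is required")
    return 1.0 - (1.0 - unit_availability) ** units


single_unit = 0.9999  # assumed availability of one UPS path (illustrative)
for n in (1, 2):
    print(f"{n} unit(s): {redundant_availability(single_unit, n) * 100:.6f} percent")
```

Real systems share distribution components and are subject to common-cause failures and human error, so achieved availability will generally be lower than this idealized estimate.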

Figure 1: A basic power system prevents damage to sensitive electronics, but provides no protection against interruptions in incoming power.

Figure 2: An Operational Support system raises availability to 99.99 percent by bridging brief interruptions in power and preserves data in the event of an extended outage.

A parallel-redundant system has two or more UPS modules connected in parallel to a common distribution network. It uses enough modules to carry the maximum projected load, plus at least one additional module for redundancy. During normal operation all modules share the load. If a module has to be taken off-line, the other modules have the capacity to carry the full load.
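
The module count for a parallel-redundant (N+1) system follows directly from this description. The following is a minimal sketch with hypothetical function names; the 80 kVA load and 40 kVA module size in the example are illustrative values, not a sizing recommendation.

```python
import math


def modules_required(max_load_kva: float, module_kva: float, redundant: int = 1) -> int:
    """Modules needed to carry the maximum projected load, plus redundant spares."""
    if max_load_kva <= 0 or module_kva <= 0:
        raise ValueError("load and module capacity must be positive")
    if redundant < 0:
        raise ValueError("redundant module count cannot be negative")
    return math.ceil(max_load_kva / module_kva) + redundant


def load_share_percent(max_load_kva: float, module_kva: float, modules_online: int) -> float:
    """Percent of its rated capacity each online module carries when sharing the load."""
    if modules_online < 1:
        raise ValueError("at least one module must be online")
    return 100.0 * max_load_kva / (module_kva * modules_online)


# Illustrative example: an 80 kVA load served by 40 kVA modules.
total = modules_required(80.0, 40.0)
print(f"Modules installed (N+1): {total}")
print(f"Load per module, all online: {load_share_percent(80.0, 40.0, total):.1f} percent")
print(f"Load per module, one offline: {load_share_percent(80.0, 40.0, total - 1):.1f} percent")
```

With all modules online, each carries only part of its rating. With one module offline, the remaining modules run at or below full rating, which is what allows a module to be serviced without dropping the load.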

In a 1+1 redundant system, each UPS system can carry the full load, creating redundant power paths for part of the system.

Continuous Availability

The highest-level system is a Continuous Availability architecture, which utilizes a dual-bus with distributed redundancy to deliver near 100 percent availability. This system includes two or more independent UPS systems — each capable of carrying the entire load. Each system provides power to its own independent distribution network. No power connections exist between the two UPS systems. This allows 100 percent concurrent maintenance and brings power system redundancy to every piece of load equipment as close to the input terminals as possible — key to ensuring both maintainability and fault tolerance throughout the facility.

The right architecture for a particular application will depend on the cost of downtime and the degree of flexibility that exists to accommodate planned downtime for maintenance. Where little time is available for maintenance-related downtime or the cost of downtime is high, a High or Continuous Availability architecture should be considered.

Figure 3: A High Availability architecture adds UPS redundancy to increase availability to between 99.999 percent and 99.99999 percent. Load sharing of parallel redundant units is coordinated by system controls, which can be located in an external cabinet (SCC) or within a UPS system designed specifically for use in a parallel redundant architecture.

Figure 4: Achieving continuous availability requires a dual-bus architecture with a high degree of redundancy. Continuous Availability systems can be designed to support both single and dual-cord loads.

2. What are the lifetime costs of the power system?

The initial cost of a power system represents only part of the total costs of owning the system. Initial costs will always play a significant role in the decision-making process; however, organizations are increasingly factoring lifecycle costs into their technology purchasing decisions. This is relevant to power systems in terms of performance and capacity planning.

A less expensive passive standby or line interactive system does not provide the same degree of protection as an online double conversion system. Often, savings realized by choosing a less expensive topology are more than offset by increased downtime costs resulting from a lesser degree of power protection.

With regard to capacity planning, it is generally assumed that the power required to support information technology systems will increase over time as higher-density systems are added and the IT infrastructure expands. UPS systems are scalable in different ways, depending upon the design of the system. Fixed-capacity UPSs deliver the full capacity of the system from the time of installation. Modular systems are designed to grow incrementally as application requirements increase.

Modularity provides the ability to "pay as you grow." This can be an attractive option as it can reduce up-front costs, but modularity must be evaluated in light of projected capacity requirements. If capacity is expected to increase significantly, a larger fixed-capacity UPS may prove more cost-effective in the long run than the combined initial cost and expansion cost of a modular system.

For example, if the initial capacity requirement is 40 kVA and the requirement in five years is projected to be 80 kVA, installing a 40 kVA modular system and expanding it to 80 kVA later can be more than twice the cost of installing a traditional 80 kVA system initially.

An N+1, or parallel redundant configuration, strikes an excellent balance between scalability and lifecycle costs while providing the redundancy required to achieve five nines availability.

3. What is the impact of the power system on data center space?

This question may be addressed as part of the lifecycle cost analysis or may be considered separately. Either way, it should not be ignored. The cost per square foot of data center space is higher than that of general building space, and increasing equipment densities are putting the squeeze on data center floorspace. Although new generation servers and communication systems are typically smaller than their predecessors, they are almost always more powerful and generate more heat per rack than the systems they replace. This heat density then becomes the key factor in how much floorspace is required for each rack.

In regard to power systems, two factors affect data center space utilization: equipment location and footprint.

Many data center designers prefer to locate the UPS system outside the data center in an auxiliary room to avoid consuming valuable data center floor space with support systems. However, this approach is only possible if a room-level approach to protection is adopted. If power protection is dealt with on a rack-by-rack basis, the designer does not have this flexibility and, over time, the UPS systems can consume a significant amount of floorspace, reducing the space available for racks.

Some data center managers prefer to have the UPS in the data center with the equipment being protected. The UPS vendor should have the flexibility to support equipment in the data center or outside of it.

If the UPS is installed in the data center, the current — and future — footprint of the UPS system should be considered. Fixed capacity systems are typically more compact and consume less data center floor space than a modular system delivering the same capacity (see Figure 5).

4. How will the power system be tested and installed?

For lower level systems, testing and installation may not be critical factors to consider. But as system complexity increases, these factors become more important.

High and Continuous Availability systems are not simply collections of discrete components. They are systems in which all components must work together to ensure seamless operation at the time they are needed most. Ideally, all components in a Continuous or High Availability power system should be tested together prior to installation to ensure interoperability under all conditions. It is also important to understand the installation requirements of the system selected and the vendor’s role in the installation process.

Having access to experienced, local manufacturer representatives can make the difference between installation success and failure. In addition to helping prevent common problems, such as improper system grounding, they can ensure the power protection system is properly integrated with building and network systems and answer questions, such as:

  • What are the number and length of UPS power feeds required?
  • What floor loading capacity is required to support the UPS battery system?
  • If the UPS is installed in the data center, what is the impact of the UPS heat load on the cooling system?
  • How is power distributed in the critical space – overhead or in the floor?
  • What aspects of the power system can be monitored by building or network management systems?
  • Is coordination with a standby generator required?

Figure 5: Footprint comparisons of different UPS system designs at different capacities. Footprints include distribution equipment and batteries.

5. How will the power system be monitored and maintained?

Regardless of manufacturer, UPS and switchgear should be de-energized periodically for preventive maintenance. This usually requires from one to four hours of scheduled equipment downtime. For some organizations this will not be a problem as they have regular windows for scheduled maintenance.

Increasingly, however, IT infrastructures are being asked to support 7x24 operation, eliminating opportunities to bring down critical systems for support system maintenance. In these cases, the power system needs to be designed in a way that enables maintenance without disrupting power to protected equipment. Ideally, this means designing redundancy into the system as close to the point of use as possible, as shown previously with the High and Continuous Availability architectures. At minimum, an external maintenance bypass should be considered to enable service personnel to manually route incoming power around the UPS during service.

Determining how UPS service will be performed is also important. This is a function few organizations are equipped to deal with in-house; therefore, the service capabilities of the UPS vendor should be considered during the selection process. Some UPS vendors have their own service organization, but many rely on third-party contractors to provide service in most areas. This can increase the complexity of service management and often reduces the expertise and accountability of service personnel. Consider whether factory-trained service personnel and parts are available locally and the experience of the service organization.

The number of UPS systems also impacts service management. It can be much more difficult to coordinate service across 10 or 12 small UPS systems, compared to two or three larger systems. Having more systems also increases the parts count, which increases the likelihood of failure.

Finally, UPS monitoring can have a significant impact on service and availability. Does the power system provide the monitoring, control and access management capability required by the responsible party in your organization?

Conclusion

The goal of any power system is to achieve the appropriate levels of power quality and system availability as simply and cost-effectively as possible.

Understanding the cost of downtime is an important first step in selecting a power system architecture. Increasing dependence on the IT infrastructure is increasing the cost of downtime while also reducing the time available for scheduled maintenance.

Power system costs should be considered based on the system lifecycle, rather than solely on initial cost. Also, consider power system floorspace requirements in light of the impact increasing server and switch densities are having on data center space requirements.

Finally, develop a service and installation strategy as early in the process as possible. Treating service and installation as an afterthought can prevent the system from achieving expected levels of availability.

The job of managing a dynamic IT infrastructure is difficult enough without the extra complexity that results when the power system does not meet application requirements. Understanding the basics of power system design and asking the right questions before equipment is installed can ensure the power system eliminates, rather than adds to, IT concerns.

