Blog: Green Data Centers an Oxymoron
Lead Analyst: Cal Braunstein
The New York Times published "Power, Pollution and the Internet," an article on the dark side of data centers. The report, which was the result of a yearlong investigation, highlights the facts related to the environmental waste and inefficiencies that can be found in the vast majority of data centers around the world. RFG does not contest the facts as presented in the article, but the Times failed to fully recognize all the causes that led to today's environment and the use of poor processes and practices. Therefore, the problem can only be partially fixed – cloud computing notwithstanding – until there is a true transformation in culture and mindset.
New York Times Article
The New York Times enumerated the following energy-related facts about data centers:
- Most data centers, by design, consume vast amounts of energy
- Online companies run their facilities 24x7 at maximum capacity regardless of demand
- Data centers waste 90 percent or more of the electricity they consume
- Worldwide digital warehouses use about 30 billion watts of energy; U.S. accounts for 25 to 33 percent of the load
- McKinsey & Company found servers use only six to 12 percent of their power consumption on real work, on average; the rest of the time the servers are idle or in standby mode
- International Data Corp. (IDC) estimates there are now more than three million data centers of varying sizes worldwide
- U.S. data centers used about 76 billion kWh in 2010, or roughly two percent of all electricity used in the country that year, according to a study by Jonathan G. Koomey.
- A study by Viridity Software Inc. found, in one case, that more than half of 333 servers monitored were "comatose" – i.e., plugged in, using energy, but doing little if any work. Overall, the company found nearly 75 percent of all servers sampled had a utilization of less than 10 percent.
- IT's low utilization "original sin" was the result of relying on software operating systems that crashed too much. Therefore, each system seldom ran more than one application and was always left on.
- McKinsey's 2012 study currently finds servers run at six to 12 percent utilization, only slightly better than the 2008 results. Gartner Group also finds the typical utilization rates to be in the seven to 12 percent range.
- In a typical data center when all power losses are included – infrastructure and IT systems – and combined with the low utilization rates, the energy wasted can be as much as 30 times the amount of electricity used for data processing.
- In contrast the National Energy Research Scientific Computing Center (NERSCC), which uses server clusters and mainframes at the Lawrence Berkeley National Laboratory (LBNL), ran at 96.4 percent utilization in July.
- Data centers must have spare capacity and backup so that they can handle traffic surges and provide high levels of availability. IT staff get bonuses for 99.999 percent availability, not for savings on the electric bill, according to an official at the Electric Power Research Institute.
- In the Virginia area data centers now consume 500 million watts of electricity and projections are that this will grow to one billion over the next five years.
- Some believe the use of clouds and virtualization may be a solution to this problem; however, other experts disagree.
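To put the 99.999 percent availability target mentioned above in perspective, the arithmetic can be sketched as follows. This is a minimal illustration and is not taken from the Times article; the function name is hypothetical and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a non-leap year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Compare three commonly quoted availability targets.
    for target in (99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% availability allows about {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, "five nines" leaves roughly five minutes of downtime per year, which helps explain why staff rewarded on availability keep spare capacity running.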
Facts, Trends and Missed Opportunities
There are two headliners in the article that are buried deep within the text. The "original sin" was not relying on buggy software as stated. The issue is much deeper than that and it was a critical inflection point. And to prove the point the author states the NERSCC obtains utilization rates of 96.4 percent in July with mainframes and server clusters. Hence, the real story is that mainframes are a more energy efficient solution and the default option of putting workloads on distributed servers is not a best practice from a sustainability perspective.
In the 1990s the client server providers and their supporters convinced business and IT executives that the mainframe was dead and that the better solution was the client server generation of distributed processing. The theory was that hardware is cheap but people costs are expensive and therefore, the development productivity gains outweighed the operational flaws within the distributed environment. The mantra was unrelenting over the decade of the 90s and the myth took hold. Over time the story evolved to include the current x86-architected server environment and its operating systems. But now it is turning out that the theory – never verified factually – is falling apart and the quick reference to the 96.4 percent utilization achieved by using mainframes and clusters exposes the myth.
Let's take the key NY Times talking points individually.
- Data centers do and will consume vast amounts of energy but the curve is bending downward
- Companies are beginning to learn not to run their facilities at maximum capacity regardless of demand. This change is relatively new and there is a long way to go.
- Newer technologies – hardware, software and cloud – will enable data centers to reduce waste to less than 20 percent. The average data center today spends more than half of its power consumption on non-IT infrastructure. This can be reduced drastically. Moreover, as the NERSCC shows, it is possible to drive utilization to greater than 90 percent.
- The multiple data points that found the average server utilization to be in the six to 12 percent range demonstrated the poor utilization enterprises are getting from Unix and Intel servers. Where virtualization has been employed, the utilization rates are up but they still remain less than 30 percent on average. On the other hand, mainframes tend to operate at the 80 to 100 percent utilization level. Moreover, mainframes allow for shared data whereas distributed systems utilize a shared-nothing data model. This means more copies of data on more storage devices which means more energy consumption and inefficient processes.
- Comatose servers are a distributed processing phenomenon, mostly with Intel servers. Asset management of the huge server farms created by the use of low-cost, single application, scale-out hardware is problematic. The complexity caused by the need for orchestration of the farms has hindered management from effectively managing the data center complex. New tools are constantly coming on board but change is occurring faster than the tools can be applied. As long as massive single-application server farms exist, the problem will remain.
- Power losses can be reduced from 30 times the electricity used for data processing to less than 1.5 times.
- The NERSCC utilization achievement would not be possible without mainframes.
- Over the next five years enterprises will learn how to reduce the spare capacity and backup capabilities of their data centers and rely upon cloud services to handle traffic surges and some of their backup/disaster recovery needs.
- Most data center staffs are not measured on power usage as most shops do not allocate those costs to the IT budget. Energy consumption is usually charged to facilities departments.
- If many of the above steps occur, plus use of other processes such as the lease-refresh-scale-up delivery model (vs the buy-hold-scale-out model) and the standardized operations platform model (vs development selected platform model), then the energy growth curve will be greatly abated, and could potentially end up using less power over time.
Model philosophies | Buy-hold-scale-out | Lease-refresh-scale-up
Operations standard platforms (cloud) | Greater standardization and reduced platform sprawl but more underutilized systems | Least cost
Development selected platforms | Most expensive | Greater technical currency with platform islands and sprawl
- Clouds and virtualization will be one solution to the problem but more is needed, as discussed above.
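The relationship between infrastructure overhead, server utilization and the waste multiples discussed above can be approximated with simple arithmetic. The sketch below is an illustration only, not RFG's or the Times' method: it assumes a server draws the same power whether busy or idle, and it expresses facility overhead as the ratio of total facility power to IT power. The function and parameter names are hypothetical.

```python
def waste_multiple(facility_to_it_ratio: float, utilization: float) -> float:
    """Energy wasted per unit of energy spent on useful processing.

    facility_to_it_ratio: total facility power / IT equipment power (>= 1.0).
    utilization: fraction of server capacity doing real work (0 < u <= 1).
    Simplifying assumption: idle servers draw as much power as busy ones.
    """
    if facility_to_it_ratio < 1.0:
        raise ValueError("facility_to_it_ratio must be at least 1.0")
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in the range (0, 1]")
    # Total energy drawn for each unit of energy that does useful work.
    total_per_useful_unit = facility_to_it_ratio / utilization
    # Everything beyond that one useful unit counts as waste.
    return total_per_useful_unit - 1.0


if __name__ == "__main__":
    # Half of the power lost to non-IT infrastructure, low utilization.
    print(round(waste_multiple(2.0, 0.06), 2))
    # Leaner facility, mainframe-class utilization.
    print(round(waste_multiple(1.2, 0.90), 2))
```

With half of the power going to non-IT infrastructure (a ratio of 2.0) and six percent utilization, this simplified model gives a waste multiple of about 32, in the same range as the 30 times figure. A ratio of 1.2 with 90 percent utilization gives about 0.33, well under the 1.5 times target.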
RFG POV: The mainframe myths have persisted too long and have led to greater complexity, higher data center costs, inefficiencies, and sub-optimization. RFG studies have found that had enterprises kept their data on the mainframe while applications were shifted to other platforms, companies would be far better off than they are today. Savings of up to 50 percent are possible. With future environments evolving to processing and storage nodes connected over multiple networks, it is logical to use zEnterprise solutions to simplify the data environment. IT executives should consider mainframe-architected solutions as one of their targeted environments as well as an approach to private clouds. Moreover, IT executives should discuss the shift to a lease-refresh-scale-up approach with their financial peers to see if and how it might work in their shops.
CIO Ceiling, Social Success and Exposures
Lead Analyst: Cal Braunstein
According to a Gartner Inc. survey, CIOs are not valued as much as other senior executives and most will have hit a glass ceiling. Meanwhile a Spredfast Inc. social engagement index benchmark report finds a brand’s level of social engagement is more influenced by its commitment to social business than its size. In other news, a New York judge forced Twitter Inc. to turn over tweets from one of its users.
Focal Points:
- Recent Gartner research of more than 200 CEOs globally finds CIOs have a great opportunity to lead innovation in their organization, but they are not valued as strategic advisors by their CEOs, most of whom think they will leave the enterprise. Only five percent of CEOs rated their CIOs as a close strategic advisor while CFOs scored a 60 percent rating and COOs achieved a 40 percent rating. When it comes to innovation, CIOs fared little better – with five percent of CEOs saying IT executives were responsible for managing innovation. Gartner also asked the survey participants where they thought their CIO's future career would lead. Only 18 percent of respondents said they could see them as a future business leader within the organization, while around 40 percent replied that they would stay in the same industry, but at a different firm.
- Spredfast gathered data from 154 companies and developed a social engagement index benchmark report that highlights key social media trends across the brand and assesses the success of social media programs against their peers. The vendor categorized companies into three distinct segments with similar levels of internal and external engagement: Activating, Expanding, and Proliferating. Amongst the findings was that a brand's level of social engagement is more influenced by its commitment to social business than its size. Social media is also no longer one person's job but averages about 29 people participating in social programs across 11 business groups and 51 social accounts. Publishing is heavier on Twitter while engagement is higher on Facebook, Inc.; however, what works best for a brand depends on industry and audience. Another key point was that corporate social programs are multi-channel, requiring employees to participate in multiple roles. Additionally, users expect more high-quality content and segmented groups. One shortfall the company pointed out was that companies use social media as an opportunity for brand awareness and reputation but miss the opportunity to convert the exchange into subsequent actions and business.
- Under protest Twitter surrendered the tweets of an Occupy Wall Street protester, Malcolm Harris, to a Manhattan judge rather than face contempt of court. The case became a media sensation after Twitter notified Harris about prosecutors' demands for his account. Mr. Harris challenged the demand but the judge ruled that he had no standing because the tweets did not belong to him. While the tweets are public statements, Mr. Harris had deleted them. Twitter asserts that users own their tweets and that the ruling is in error. Twitter claims there are two open questions with the ruling: are tweets public documents and who owns them. Twitter is appealing.
RFG POV: For the most part CIOs and senior IT executives have yet to bridge the gap from technologist to strategist and business advisor. One implication here is that IT executives still are unable to understand the business so that IT efforts are aligned with the business and corporate needs. An ex-CIO at Kellogg's, when asked what his role was, said, "I sell cereal." Most IT executives do not think that way but need to. Until they do, they will not become strategic advisors, gain a seat at the table or have an opportunity to move up and beyond IT. The Spredfast report shows that using social media has matured and requires attention like any other corporate function. Moreover, to get a decent payback from it, companies have to dedicate resources to keeping the content current and of high quality and to getting users to interact with the company. Thus, social media is no longer just an add-on but must be integrated with business plans and processes. IT executives should play a role in getting users to understand how to utilize social media tools and collaboration so that the enterprise optimizes its returns. The Twitter tale is enlightening in that information posted publicly may not be recalled (if the ruling holds) and can be used in court. RFG has personal experience with that. Years ago, in a dispute with WorldCom, RFG claimed the rates published on the vendor's Web site were valid at the time published. The telecom vendor claimed its new postings were applicable and had removed the older rates. When RFG was able to produce the original rate postings, WorldCom backed down. IT executives are finding a number of vendors are writing contracts with terms not written in the contract but posted online. This is an advantage to the vendors and a moving target for users. IT executives should negotiate contracts that have terms and conditions locked in and not changeable at the whim of the vendor.
Additionally, enterprises should train staff to be careful about what is posted in external social media. It can cost people their jobs as well as damage the company's financials and reputation.
More Risk Exposures
Lead Analyst: Cal Braunstein
Hackers leaked more than one million user account records from over 100 websites, including those of banks and government agencies. Moreover, critical zero-day flaws were found in recently patched Java code, and a SCADA software vendor was charged with having default insecurity, including a hidden factory account with a hard-coded password. Meanwhile, millions of websites hosted by the world's largest domain registrar, GoDaddy.com LLC, were knocked offline for a day.
Focal Points:
- The hacker group, Team GhostShell, raided more than 100 websites and leaked a cache of more than one million user account records. Although the numbers claimed have not been verified, security firm Imperva noted that some breached databases contained more than 30,000 records. Victims of the attack included banks, consulting firms, government agencies, and manufacturing firms. Prominent amongst the data stolen from the banks were personal credit histories and current standing. A large portion of the pilfered files comes from content management systems (CMS), which likely indicates that the hackers exploited the same CMS flaw at multiple websites. Also taken were usernames and passwords. Per Imperva "the passwords show the usual "123456" problem. However, one law firm implemented an interesting password system where the root password, "law321" was pre-pended with your initials. So if your name is Mickey Mouse, your password is "mmlaw321". Worse, the law firm didn't require users to change the password. Jeenyus!" The group threatened to carry out further attacks and leak more sensitive data.
- A critical Java security vulnerability that popped up at the end of August leverages two zero-day flaws. Moreover, the revelation comes with news that Oracle Corp. knew about the holes as early as April 2012. Microsoft Corp. Windows, Apple Inc. Mac OS X and Linux desktops running multiple browser platforms are all vulnerable to attacks. The exploit code first uses a vulnerability to gain access to the restricted sun.awt.SunToolkit class before a second bug is used to disable the SecurityManager, and ultimately to break out of the Java sandbox. The vulnerabilities behind the so-called Gondvv exploit were introduced in the July 2011 Java 7.0 release, so users of any version of Java 7 are at risk. Notably, older Java 6 versions appear to be immune. Oracle has yet to issue an advisory on the problem but is studying it; for now the best protection is to disable or uninstall Java in Web browsers.
- SafeNet Inc. has tagged a SCADA maker for default insecurity. The firm uncovered a hidden factory account, complete with hard-coded password, in switch management software made by Belden-owned GarrettCom Inc. The Department of Homeland Security's (DHS) ICS-CERT advisory states the vendor's Magnum MNS-6K management application allows an attacker to gain administrative privileges over the application and thereby access to the SCADA switches it manages. The DHS advisory also notes a patch was issued in May that would remove the vulnerability; however, the patch notice did not document the change. The vendor claims 75 of the top 100 power companies as customers.
- GoDaddy has stated the daylong DNS outage that downed many of its customers' websites was not caused by a hacker (as claimed by the supposed perpetrator) and that the service interruption was not the result of a DDoS attack at all. Instead the provider claims the downtime was caused by "a series of network events that corrupted router tables." The firm says that it has since corrected the elements that triggered the outage and has implemented measures to prevent a similar event from happening again. Customer websites were inaccessible for six hours. GoDaddy claims to have as many as 52 million websites registered but has not disclosed how many of the sites were affected by the outage.
RFG POV: Risk management must be a mandatory part of the process for Web and operational technology (OT) appliances and portals. User requirements come from more places than the user department that requested the functionality; they also come from areas such as audit, legal, risk and security. IT should always ensure that the inputs and requirements of these areas are met. Unfortunately this "flaw" has been an IT shortfall for decades and it seems new generations keep perpetuating the shortcomings of the past. As to the SCADA bugs, RFG notes that not all utilities are current with the Federal Energy Regulatory Commission (FERC) cyber security requirements or updates, which is a major U.S. exposure. IT executives should be looking to automate the update process so that utility risk exposures are minimized. The GoDaddy outage is one of those unfortunate human errors that will occur regardless of the quality of the processes in place. But it is a reminder that cloud computing brings with it its own risks, which must be probed and evaluated before making a final decision. Unlike internal outages where IT has control and the ability to fix the problem, users are at the discretion of outsourced sites and the terms and conditions of the contract they signed. In this case GoDaddy not only apologized to its users but offered customers 30 percent across-the-board discounts as part of its apology. Not many providers are so generous. IT executives and procurement staff should look into how vendors responded to their past failures and then ensure the contracts protect them before committing to use such services.