Provoking IT from Good to Great
Measuring a Unified/Useful Computing System

Measuring the usefulness of IT
Imagine if there were a way to measure the efficiency and effectiveness of a technology solution with an objective, industry-standard, universally applicable mathematical formula.
Tom Canning and I have been discussing this of late, and here’s a transcript of our discussion!
It would help me in my job at Cisco to explain how, although our Unified Computing System LOOKS like a blade server system, it’s SO MUCH MORE… and here’s the formula to prove it – TA-DA! (big flourish!). There must be something out there, so I went looking…
First I found a really interesting APC paper titled Selecting an Industry Standard Metric for Data Center Efficiency. This is a nice, concise paper that refers to the EPA Data Center Report for US Congress.
The APC document is interesting because it helps make sense of the data center in economic terms of stuff in and stuff out, providing some ratios that you should be aware of even if you’re not a data center facilities person. After all, you need to know how much performance you get out of your car without being a mechanic, right?
The bit I love about the whole Data Center Infrastructure Efficiency (DCiE) stuff is the approach of using science to reduce something complex down to simple, universal ratios:
- Watts In -> IT Watts -> Useful Computing
- Watts In to the data center goes through the DC infrastructure to produce IT Watts.
- IT Watts drive IT infrastructure (compute, network and storage) to produce Useful Computing, which is the output of a data center.
- Watts In converted to Useful Computing
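Here is a rough Python sketch of that chain. It assumes the usual definitions (DCiE is IT power divided by total facility power, and PUE is its reciprocal). The function names and all the figures are made up for illustration, and "pages served" is just a stand-in for whatever your Useful Computing turns out to be.

```python
def dcie(it_watts: float, watts_in: float) -> float:
    """DCiE: the fraction of facility power (Watts In) that reaches the IT gear."""
    if watts_in <= 0:
        raise ValueError("watts_in must be positive")
    return it_watts / watts_in


def pue(it_watts: float, watts_in: float) -> float:
    """PUE: the reciprocal of DCiE (facility power per unit of IT power)."""
    if it_watts <= 0:
        raise ValueError("it_watts must be positive")
    return watts_in / it_watts


def useful_computing_per_watt(useful_units: float, watts_in: float) -> float:
    """End-to-end conversion: units of output per watt drawn by the facility."""
    if watts_in <= 0:
        raise ValueError("watts_in must be positive")
    return useful_units / watts_in


# Hypothetical figures, purely for illustration.
WATTS_IN = 100_000.0        # total power entering the data center
IT_WATTS = 50_000.0         # power that actually reaches compute, network, storage
PAGES_SERVED = 2_000_000.0  # assumed stand-in for "Useful Computing"

print(f"DCiE: {dcie(IT_WATTS, WATTS_IN):.0%}")
print(f"PUE:  {pue(IT_WATTS, WATTS_IN):.2f}")
print(f"Pages per watt in: {useful_computing_per_watt(PAGES_SERVED, WATTS_IN):.1f}")
```

The first two numbers are facilities metrics; only the last one says anything about output, which is why the definition of Useful Computing matters so much.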
What is this data center output metric, Useful Computing, though? The paper suggests a source (Brill 2007) that refers to output like web pages or calculations over some time period which is really interesting… Imagine being able to measure the conversion of raw input (Watts In) and the raw output of a data center, such as:
- Searches (a la Google)
- Transactions (a la eBay)
- Videos (a la YouTube)
- Desktops (a la Desktop as a Service)
- Utilization (a la any service units)
After that initial research, I found someone much brighter than me, a guru in this area, to explain what this all means: this is where Tom Canning enters the discussion. Tom is a Green Consultant and an expert on these matters. Here he is explaining the current situation in his own words (thanks, Tom!).
If you still scratch your head about Useful Computing, let’s look at a simple real world example: painting a house!
Say I hire someone to paint my house and they spend the whole day brush-on/brush-off. You would expect, based on the empty cans and sweaty brows, that they delivered a “useful service” to me. But what happens if they never loaded the brush with paint, they dump the paint cans and the house never receives a drop? The amount of expended work is the same, but the results are nowhere near what I expected. A “useful service” has now become a “useless service”.
Services need some degree of measurement and monitoring to correlate “usefulness” and to rule out false positives. If I had been measuring paint brush weight and paint can volume over the day, I could have gained useful insight and been alerted early that my supposed “useful service” was in jeopardy. The quicker the alert, the quicker I could have remediated the situation and saved money. Time is a very important element in determining “usefulness”: the longer you wait to validate, the longer it will take to remediate errors and return the process to a positive state.
Back to computing. How many dumped paint cans are there in the data center? How long do you want to wait before they pile up? We can’t visibly see the paint cans, because the drips, spills and tweaked lids are scattered and hidden from the casual observer. DCiE speaks to efficiency, and efficiency is something you just can’t see – it has to be measured. And because the data center is a dynamic organism that delivers mission-critical enterprise services, it needs to be measured in near real time. It’s the common phrase “you can’t improve what you can’t measure”: without some level of real-time instrumentation, you cannot determine an initial baseline for where you are, what your level of efficiency is and how “useful” the final output of the data center is.
As a Cisco employee, I’m interested in applying this new way of understanding to Cisco’s Unified/Useful Computing System. Perhaps one could even shrink this down from data center in/out to solution in/out, though of course this moves the focus from DCiE to Solution Infrastructure Efficiency.
If you focus down on a solution, you’re just comparing IT Watts to Useful Computing. Here’s a simple VDI solution running on a dedicated Cisco Unified (Useful!) Computing System:
- IT Watts in, for the IT infrastructure. In UCS terms this is the wattage required to drive the Fabric Interconnects and all the connected chassis, plus any dedicated LAN switches, SAN fabric and storage arrays. Let’s suggest a sum of 50 kW.
- Useful Computing out, which is desktops. For VDI you could measure the number of desktops, which is probably what you are doing showback/chargeback against. Let’s throw a sum of 4000 out there.
A useful computing system can deliver:
- 4000 desktops for 50 kW of power, which equates to
- 4000 / 50 = 80 desktops per kW, or
- 50,000 / 4000 = 12.5 W per desktop.
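The same arithmetic as a small Python sketch, in case you want to plug in your own numbers. It treats the 50 figure as a steady power draw in kilowatts (my assumption), and the function names are invented for this post rather than taken from any Cisco tool.

```python
def desktops_per_kw(desktops: int, it_kw: float) -> float:
    """Useful Computing out (desktops) per kilowatt of IT power in."""
    if it_kw <= 0:
        raise ValueError("it_kw must be positive")
    return desktops / it_kw


def watts_per_desktop(desktops: int, it_kw: float) -> float:
    """IT power in, expressed as watts needed per delivered desktop."""
    if desktops <= 0:
        raise ValueError("desktops must be positive")
    return it_kw * 1000 / desktops


# The illustrative figures from the example above.
DESKTOPS = 4000
IT_KW = 50.0

print(desktops_per_kw(DESKTOPS, IT_KW))    # 80.0
print(watts_per_desktop(DESKTOPS, IT_KW))  # 12.5
```

Swap desktops for searches, transactions or videos and the same two ratios give you a first-cut Solution Infrastructure Efficiency figure.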
Tom picks up the baton:
Now we have a really good idea of the power per desktop, but we’re still missing what services are actually being performed by the desktop. Running to the rescue is the Green Grid with a new buzzword called DCP. Still in development, DCP attempts to measure how much actual work, i.e. “useful work”, my IT equipment is capable of delivering.
DCP = Useful Work/Total Facility Power
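As a sketch only, here is that ratio in Python using the VDI example from earlier. Counting delivered desktops as the “useful work” is an assumed proxy, the 100 kW total facility draw is a made-up figure, and the function name is hypothetical.

```python
def dcp(useful_work: float, total_facility_kw: float) -> float:
    """DCP as written above: useful work divided by total facility power."""
    if total_facility_kw <= 0:
        raise ValueError("total_facility_kw must be positive")
    return useful_work / total_facility_kw


# Assumed proxy: one delivered desktop counts as one unit of useful work.
USEFUL_WORK = 4000.0
# Hypothetical facility draw, including cooling and power distribution.
TOTAL_FACILITY_KW = 100.0

print(dcp(USEFUL_WORK, TOTAL_FACILITY_KW))  # 40.0 desktops per facility kW
```

Note how the answer halves compared with the 80 desktops per IT kilowatt figure, simply because the denominator now includes the facility overhead. The hard part is still choosing what goes in the numerator.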
Great! Plug in your values and push “go”. But as you see, “useful work” still remains an undefined variable for many. Green Grid is working on creating proxies as a way to measure useful work, and there is mention of “utility value”, which of course varies immensely and has no common baseline value. Feel like you are trying to pin the tail on the donkey blindfolded with your legs tied? Maybe. There is hope as customers explore the proposed proxy notion – feel free to review the candidate proxies. But is it really possible to normalize “utility value” so it will correctly correlate across different companies? It’s still early, so be hopeful and participate. Case in point with Proxy #1.
Confusing situation? Well, anything new always is, but as with SPECmark (just dated myself) and MIPS, we learned how to use these ratings, interpret them for our own organizational value, and we got our jobs done! Now that is the real definition of useful computing - GET YOUR JOB DONE! Let’s also not forget about useful computing in terms of human computing efficiency. We all know the “effective value of each human worker” varies, as a person’s ability and reasoning to leverage a useful service can fail at times, rendering a supposed “useful service” basically useless.
Back to DCiE… Super smart people like James Hamilton take the above notion one step further with a slight modification to DCiE/PUE.
tPUE = Total Facility Power / Productive IT Equipment Power
where power is carefully measured before all the UPS and conditioning equipment, and we’re dividing that by “productive IT power”. Having fun?
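To make the difference from plain PUE concrete, here is a hedged Python sketch. It assumes “productive IT power” means IT power minus whatever is lost inside the servers themselves (power supplies, fans and the like), which is my reading rather than a formal definition, and every number is invented.

```python
def pue(total_facility_kw: float, it_kw: float) -> float:
    """Plain PUE: total facility power over IT equipment power."""
    if it_kw <= 0:
        raise ValueError("it_kw must be positive")
    return total_facility_kw / it_kw


def tpue(total_facility_kw: float, it_kw: float, in_server_overhead_kw: float) -> float:
    """tPUE: total facility power over *productive* IT power.

    Assumption: productive IT power = IT power minus in-server overhead
    (power supply losses, fans and similar).
    """
    productive_kw = it_kw - in_server_overhead_kw
    if productive_kw <= 0:
        raise ValueError("productive IT power must be positive")
    return total_facility_kw / productive_kw


# Hypothetical figures, purely for illustration.
TOTAL_FACILITY_KW = 100.0
IT_KW = 50.0
IN_SERVER_OVERHEAD_KW = 10.0

print(pue(TOTAL_FACILITY_KW, IT_KW))                          # 2.0
print(tpue(TOTAL_FACILITY_KW, IT_KW, IN_SERVER_OVERHEAD_KW))  # 2.5
```

With these made-up numbers a facility that reports a PUE of 2.0 comes out at a tPUE of 2.5 once the losses hiding inside the servers are counted.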
As you can see, there is a lot of discussion and many views on what is useful, productive and meaningful. Almost like standing around a water cooler and having a conversation with fellow employees! So how can all this really be useful and why should you care? Simple – it makes people take action and explore. No matter what your definition or interpretation is, whether you measure DCiE, DCP, the number of UCS megaflops or whatever, it is this investigative action that delivers new insights into the enterprise DNA and provides a baseline to improve on in whatever fashion best fits the findings.
Final thoughts from Tom:
Metrics create competition. Competition creates action. We see this today with companies publishing their PUE/DCiE numbers and proudly proclaiming super efficiency within their data centers. This is good. This is useful. It’s not the actual metrics that are important, though – it’s the actions, learnings and remediations that take place to get the company to this position that are truly important.
This entry was posted by Steve Chambers on 7 September, 2009 at 19:08, and is filed under GreenGrid, UCS.