How I screamed when I realised what I had done

I have one Cisco Unified Computing System (UCS) running 3,200 compute nodes on 64 half-slot Cisco B200 blades in 8 Cisco 5108 Chassis.

A critical part of my Unified Computing System is simple cabling: I only need 2 cables from each Chassis’s 2104 Fabric Extender to each Cisco 6120XP Fabric Interconnect, for a total of 4 cables per chassis, to provide highly available unified fabric for both LAN and SAN networking.

That’s a total of 32 cables for all 3,200 compute nodes (1 per 100), 64 blades, 8 chassis and all LAN and SAN traffic.
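
For anyone who wants to check my sums, here is a minimal Python sketch of the cabling arithmetic. The function and parameter names are made up for illustration; the figures are the ones quoted above, and I'm assuming every chassis is cabled identically.

```python
def converged_cabling(chassis=8, fabric_interconnects=2,
                      cables_per_interconnect=2, compute_nodes=3200):
    """Return (cables per chassis, total cables, compute nodes per cable)."""
    # Each chassis runs the same number of cables to each Fabric Interconnect.
    cables_per_chassis = fabric_interconnects * cables_per_interconnect
    total_cables = chassis * cables_per_chassis
    nodes_per_cable = compute_nodes / total_cables
    return cables_per_chassis, total_cables, nodes_per_cable


print(converged_cabling())  # (4, 32, 100.0)
```

Change any of the defaults (more chassis, more uplinks per fabric) and the ratio of nodes to cables falls out the same way.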

A consultant came to see me today to give us an insight into industry best practices, to see if we could improve our compute platform.

The consultant said we should have separate, redundant cards, cables and ports for each class of networking traffic – VMs, COS, vMotion, Fibre Channel. Why?

Because that was the industry best practice, and it was “less risky”.

I asked more about the risk. After all, people have been trusting ESX virtual switches to keep traffic separate in software before it gets to the physical NIC.  What was the specific risk, in terms of exploit, probability and impact?  No answer from the consultant.

What about the benefits of such a change?  What added value does this give me?  No answer from the consultant.

What about the costs of such a change?  Saving the Silent Consultant embarrassment, here’s what I think I’d have to spend:

  • Assume I need four physical NICs: then I can’t use my blades, so that’s a non-starter. I’d have to invest in larger, less efficient servers to cope with all those NICs and HBAs, meaning at least double my cost, conservatively adding over $100k in CapEx.
  • Even if I spent more money on the servers to get four pNICs, according to the consultant I’d actually need eight pNICs, which could be four dual-port NIC cards. Let’s add $2k per host just to buy, never mind install, configure and maintain.  Total $128k.
  • Add the cost of the network ports + switches – that would be 64 hosts x 8 ports = 512 ports, which is about 512 / 24 ~ 22 access layer switches ($4k each) – that’s a grand total of $88k.
  • I’d also need separate dual-port HBA cards, let’s call that $2k again for each host – another lump of $128k.

At this point we’re already running to a grand total increase in CapEx of $444k to add no value and mitigate no risks.
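
As a sanity check, here is a rough Python tally of the line items above. It is only a sketch: the function and parameter names are invented for illustration, and `ports_per_switch=24` is my assumption, chosen because it is the switch size that reproduces the 22 access layer switches quoted in the list.

```python
import math


def capex_increase(hosts=64, nic_ports_per_host=8, ports_per_switch=24,
                   switch_cost=4_000, nic_cost_per_host=2_000,
                   hba_cost_per_host=2_000, server_uplift=100_000):
    """Return (ports, switches, line items, total) for the separated design."""
    ports = hosts * nic_ports_per_host
    # Round up: a partly filled switch still has to be bought.
    switches = math.ceil(ports / ports_per_switch)
    items = {
        "larger servers": server_uplift,
        "NIC cards": hosts * nic_cost_per_host,
        "access switches": switches * switch_cost,
        "HBA cards": hosts * hba_cost_per_host,
    }
    return ports, switches, items, sum(items.values())


ports, switches, items, total = capex_increase()
for name, cost in items.items():
    print(f"{name:>16}: ${cost:,}")
print(f"{'total':>16}: ${total:,}")
```

With the defaults this comes to 512 ports, 22 switches and $444,000. Plug in your own prices and port densities to see how quickly the number moves.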

I don’t even want to think of the OpEx of such a change, but again off the top of my head:

  • Labour cost to architect, purchase, install and configure the new environment.
  • With all those added components and complexity, the increased risk (probability and impact) of operational mistakes on my environment.
  • Cost of additional monitoring (switches, NICs and HBAs), which also runs into tens of thousands of dollars.

And then I woke up.  I was dreaming! Life’s not like that at all!  I’m not running Unified Computing…woah!  Things are the other way around!  I’ve already spent all that money… OH NOOOOOOOOOOO!

Oh My God!  You mean I’m actually spending all that money for no return in real life?  Please don’t let it be true, I couldn’t stand the shame!  How will I explain this waste of money, time and resource to my senior execs, and to my customers who are footing this expensive bill?

Perhaps it’s not too late. What’s the number for Cisco?

Disclaimer: OK, I made this all up, but you get my point, dontcha? :-)

PS: Stu (@vinternals) pointed out that I missed off FC, and I also realised I missed off power.  Let’s call it a cool $500k to end the discussion, but know that the real figure is, scarily, much more and mostly invisible to the people who influence compute purchases.
