
This is Kevin. He’s a Script Kiddie. We really need to talk about Kevin.
He’s a glass-half-full kinda guy! He loves learning on the job, and he’s looking insanely happy because he’s been told he can build his first converged infrastructure by plagiarising someone else’s to-do list, also known as a Reference Architecture.
That’s right: as if his blindly plagiarising a simple best practice doesn’t cause enough damage, imagine him using a huge complex interwoven collection of them that stretches over one-hundred pages: that’s what a Reference Architecture is.
Chad Sakac has just posted an excellent blog on this topic, and it’s timely that I share eleven reasons why you should avoid the incorrect use of Reference Architectures, Dear Reader.
- Best is a relative term. Better than what? If there are three ways to skin a cat, what is the best practice? Well, if the guy that wrote it had a knife to skin that cat but you only have your teeth: what do you do now? Before you buy a knife, check your name isn’t Reg Prescott.
- Sharing is not caring. Just because someone shares their practices with you (and I’m a fan of that, after all I did VIOPS), if you blindly copy what they did and it doesn’t work then do not expect them to help you. You are on your own. It’s all your risk.
- Prepare to shave a yak. When it doesn’t work (and believe me, it won’t) then will you know why? Chances are this is your first time (otherwise why are you following a stranger’s to-do list?), and therefore you will get a sore head with all that scratching trying to figure out the answers. I hope you can reach Google from the datacenter?
- You will stray from the path. You will have to customise the best practices in a reference architecture: but do you know what has to change due to different requirements? And when you DO change something, what’s the impact of THAT change? At this point you are on your own: your solution is neither standard nor recognisable against the original reference.
- Entropy is Flexibility’s evil twin. The more you stray from the path the more you disintegrate the architecture and start to experience entropy: cause and effect start bouncing off each other like rabbits, breeding misconfigurations across your solution. Think of “re-pristining” Windows OS: that’s Entropy in action on a small scale affecting one user.
- The future is unpredictable. Products change over time, so do people, so do solutions: everything changes, and a reference architecture does not protect you from these changes. Imagine you’ve got a converged infrastructure, that you’ve customised, and one of the elements has a new release: how do you do it? Should you do it? What happens if you do? You. Are. On. Your. Own.
- Come on in, the water’s lovely. Reference architectures are mostly marketing papers, giving you the illusion of a safety blanket so that you feel empowered to buy the latest widget. That means you are at the cutting edge, installing version 1.0. Is that blanket still keeping you warm?
- Failure has many faces. Gone are the days when IT guys just had to keep the lights on. Now they are being asked to do custom work at the speed of cloud providers. Failure is now an East-West (solution lifecycle) as well as North-South (technology stack functions) problem.
- Consumers have no tolerance for failure. Clouds work indoors and outdoors today, and they are ready for your customers who have applications that need to work NOW while you’re creating a custom solution from someone else’s to-do list: if you fail (see point 8) then don’t expect them to wait six months for you to get it right.
- A fool with a tool is still a fool. Think you can shortcut the process and fix the above with a tool? I see that Cloupia have an automated way of implementing a reference architecture for FlexPod, and HP have a million different incompatible products that need services… Think that’s awesome? It certainly is: I can’t think of a faster way to implement all nine failure scenarios above. Now you have to maintain the reference architecture AND the tool AND integrate it… headache inducing.
- IT people are horrible tech writers. There are very few people I know in the tech industry who also love to write documentation. And if you hate a job you are usually bad at it. Bad means missing stuff, having sloppy commands that don’t work, and not keeping the document up to date.
All those 11 reasons multiplied together with various helpings of impact and probability give you a big old lump of risk that might be too hard to swallow.
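If you want to put rough numbers on that lump, here is a minimal sketch of one common way to do it: score each reason as probability times impact, then add them up. The `Risk` class, the helper names and every figure below are made up for illustration; plug in your own guesses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    probability: float  # chance it bites you, 0.0 to 1.0 (hypothetical figure)
    impact: int         # pain if it does, 1 (shrug) to 10 (career-limiting)

    @property
    def exposure(self) -> float:
        # Classic risk scoring: probability multiplied by impact.
        return self.probability * self.impact


def total_exposure(risks):
    """Add up the exposure of every risk in the list."""
    return sum(r.exposure for r in risks)


def worst_first(risks):
    """Return the risks sorted from highest to lowest exposure."""
    return sorted(risks, key=lambda r: r.exposure, reverse=True)


# Three of the eleven reasons, with invented numbers.
risks = [
    Risk("You will stray from the path", 0.75, 6),
    Risk("The future is unpredictable", 0.5, 8),
    Risk("IT people are horrible tech writers", 0.25, 4),
]

if __name__ == "__main__":
    for r in worst_first(risks):
        print(f"{r.exposure:5.2f}  {r.name}")
    print(f"Total exposure: {total_exposure(risks):.2f}")
```

A simple sum treats the reasons as independent, which is generous: in practice they feed each other, so read the total as a floor rather than a forecast.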
There are two ways to derisk these eleven reasons though, and both of these are about the appropriate use of Reference Architectures:
- A reference architecture has a single use-case: for learning about technologies and solutions.
- For real life, skip the reference architecture and go to a reputable, experienced, knowledgeable, customer-focused partner to get it right.
Reference Architectures, when used properly, are great learning tools. Take the Cisco Validated Designs (CVD) as a great example. When I was at Cisco you would find a brilliant Technical Marketing Engineer to author a new CVD to combine new and exciting products together for the first time. EMC and VMware do similar great work to showcase their technologies. Yes, it is possible to use OTV and VPLEX for long distance vMotion, but would you build this for real just using the reference architecture?
If someone claims to be a knowledgeable expert but then refers to a reference architecture, they aren’t likely to be what they claim. A reputable expert will have superseded a reference architecture long ago and have their own kitbag from which to help craft a solution. Great channel partners of the major vendors stand out immediately. If you need help finding one, let me know!
If you want something as powerful as converged infrastructure to be right first time, to be fit for purpose to run mission-critical workloads, and to run for a significant period of business: don’t use a reference architecture!
5 Comments
This post makes a lot of sense to me and fits my experience as both a customer and now partner of VCE. I think it emphasizes why it’s so important to work with a good partner (Disclosure: I work for Presidio)
Great post sir… #10 is my favorite. Completely agree with your last paragraph about great channel partners. Last week I met with a VCE channel partner that is way beyond reference architecture and has very advanced engineering staff that can do it all. As we were discussing how they are looking at the market, one of the owners said “we don’t want the accountability for providing our own reference architecture”; they see the value VCE is bringing to market beyond what anyone else can do.
Great post Stevie! As a purveyor of several “converged infrastructure” solutions (Disclaimer – I work for CDW), I see reference architectures or validated designs as a starting point. Each piece of the process needs to be validated against what a business is trying to accomplish. The further you move from “flexibility,” the further you move (SOMETIMES!) from many of the risks involved in architecting a datacenter. A vBlock is a great solution that fits the needs of many businesses.
However… The risks that are avoided may cause other risks to be introduced. Not everyone can comfortably fit into a “few sizes fit most” offering, and some shops may decide that they need to stray a little from a specific solution or even a reference architecture in order to meet the goals of their business. What if *GASP* they want to use Hyper-V in branch offices? What if they have Unix systems that need to consume some of the resources? What if they hang a tape library off of the whole works? Does that “void the ‘supportability’?” This is my favorite – what if I take the red pill and present a LUN to something outside of the solution? Should the entire software stack decide it cannot do any more work? Are these things acceptable when meeting the demands of my business and the consumers of my stuff?
BTW: Is Kevin related? I really see the resemblance :o)
Dave
Agreed on all points :) In the case of accessing a Vblock’s storage from outside the Vblock, we want to avoid Fragile SAN Syndrome, so by default we allow access to backup, replication etc, but we want to take more care when allowing outside compute to access the storage so we don’t break the array: that’s a fair business risk management approach, no?
Nice post. I’m a former infrastructure architect and there are a couple of points that really resonate here. (Disclaimer: I work for CA these days.)
There is a clue in the name: reference architectures are great tools for Architects to refer to while they design a system, but the day after this moves to production you are relying on that Architect to understand and take on board all future updates and changes.
Things change both in technology and in the people involved with running a solution, so while you might have a great team during the implementation, when a product is seen as the hot thing to work on, those people are nowhere to be seen in 18 months’ time when you have updated everything twice and are trying to troubleshoot a performance issue. How involved is that architect going to be 3 years into the life of the service?
So if you plan to move that responsibility to an outside organisation, make sure the demarcation covers the service, not the product design.
Secondly, we used to joke that if you build a better tool, they build a better idiot. If you obscure the complexity and remove the need to understand how a system works, people tend to do things that would formerly have been obviously daft and that you never considered at the outset. The onus shifts to the team building the tools to resolve the issue. If that team owns the service (or it’s static and very well defined) then it’s OK, but if it falls to the person using the tool then they will have a mountain to climb to fix anything, troubleshoot anything or change anything.
Nick Martin