Consider the following logic:

  • Product (Car)
    • Product (Car) . components() = { Chassis , Seats(), Wheels(), Steering Wheel, Engine, Body }
    • Product (Car) . Wheels (Front Left) = { Tyre, Alloy, Nuts }
    • Product (Car) . Wheels (Front Left) . build = { fit Tyre to Alloy, put Alloy on Axle, apply Nuts }
  • Product (Web Site)
    • Product (Web Site) . components() = { Web Servers (), App Servers (), Database }
    • Product (Web Site) . Web Servers (web01) = { Linux, Apache, Web code }
    • Product (Web Site) . Web Servers (web01) . build = { create VM, install Linux, install Apache, install Web code }

The above is an approximation of object-oriented thinking: you abstract objects, their attributes and their actions, and in doing so establish a pattern of how something looks and acts.  Some more examples of abstraction:

  • Tree.  There is no such thing as “a Tree” – this is an abstraction for all objects that have similar characteristics to the English Oak, the Californian Redwood etc.
  • Person.  There is no such thing as “a Person” – again, this is an abstraction for a human being no matter what their nationality, race or sense of humour.  They can appear to be completely different (look and act),  but they follow the same abstract pattern (physiology and behaviour).
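The Car and Web Site notation at the top of this post can be sketched in a few lines of Python.  This is only an illustration of the abstraction, not a real build system: the class names Product and Component and their fields are my own assumptions, and the build steps are just returned as text rather than executed.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """A named part with its own parts and ordered build steps (hypothetical model)."""
    name: str
    parts: list[str] = field(default_factory=list)
    build_steps: list[str] = field(default_factory=list)

    def build(self) -> list[str]:
        # Return the steps in order; a real system would execute them.
        return list(self.build_steps)


@dataclass
class Product:
    """A product is a named collection of components (hypothetical model)."""
    name: str
    components: dict[str, Component] = field(default_factory=dict)

    def add(self, component: Component) -> None:
        self.components[component.name] = component


# The same two classes describe a car and a web site.
car = Product("Car")
car.add(Component(
    "Wheels (Front Left)",
    parts=["Tyre", "Alloy", "Nuts"],
    build_steps=["fit Tyre to Alloy", "put Alloy on Axle", "apply Nuts"],
))

web_site = Product("Web Site")
web_site.add(Component(
    "Web Servers (web01)",
    parts=["Linux", "Apache", "Web code"],
    build_steps=["create VM", "install Linux", "install Apache", "install Web code"],
))

for product in (car, web_site):
    for component in product.components.values():
        print(product.name, "/", component.name, "->", component.build())
```

The point is that the car and the web site look completely different, but both fit the same pattern: a product, its components, and an ordered way to build each component.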

Manufacturing of products such as motor vehicles is a business where efficiency, effectiveness and governance are essential to beating the competition.  Here, success in Operations means more revenue, faster time to market, and lower costs.  The world-famous Toyota Production System is the best there is: Operations is the reason Toyota is the number one automotive organization in the world, and why competitors with inefficient, ineffective and poorly governed operations are either in Chapter 11, out of business, or owned by their government.

If you look at what makes the Toyota Production System successful (read Steven Spear’s Decoding the DNA of the Toyota Production System in Harvard Business Review, or buy his excellent book Chasing the Rabbit), it is not just the tools they use, or the processes they have, but the culture and approach of the people that work there.

For many years I have equated process with rigid, inflexible and quite frankly boring working practices that have little basis in the reality of the front line.  Processes are written by people in ivory towers who “know best” but really “know nothing”.  These processes are documented, tools are bought, and much money is spent with little return on investment.  The staff on the front line don’t use the processes because they are a joke, and the process flow diagram is posted on the office wall and ignored forever.  ITIL initiatives are great examples of this, and a great way to fill cabinets with useless documentation.  Why is this?

Steven Spear identified that people like me have been doing processes all wrong, and giving way too much kudos to tools.  Eli Goldratt says the same about tools in his inimitable The Goal: there is no point investing in that super-duper machinery if you can’t keep it busy, or if keeping it busy causes problems in other areas.

So if tools aren’t the answer, and processes aren’t, what IS the answer?  Steven Spear explains this much better than I can – please go read his stuff, it’s very readable – but in essence, the staff on the Toyota Production System are creative problem solvers who own and develop their own process every day, constantly finding new ways to be more efficient, more effective and better governed.  They work in a consistent framework – a pattern – but within that framework they can be creative.

Applying TPS to IT

If building an IT service is akin to building a car, then why can’t IT work the same way as the Toyota Production System?  IT workers long for creativity and problem solving, so surely it is ideal for them?

The first objection I get from IT staff is that building an IT service is more complicated than building a car.  Really?  Bearing in mind that one Toyota production line builds different models to different specifications (it would be inefficient to have one line for every permutation of model and spec, right?), and that each car is made up of tens of thousands of parts, how can building IT services be more complicated?

Well, I agree IT is more complicated (not the end product, but the system) because the framework that the Toyota staff have doesn’t exist in IT, so every piece of work is custom and in effect IT is building the production line at the same time as it is building the product.  That is certainly more complicated, not because of the number of components, but because nobody has a co-ordinated plan of what to do.  Without a co-ordinated plan, I think two people working on one piece of work would struggle, never mind hundreds working on thousands.

Another objection to codifying working practices is that, in IT, there would be too many, it would take too long, they would be out of date fast: that is very true, if we do processes the non-Toyota way and in the same old manner we’ve been failing at for years.

Developing re-usable processes across individual teams, and improving those processes where possible, is different than “delegating” that process and its improvement to someone who isn’t doing the job, but who is responsible for process improvement.  This approach is deemed more efficient by some, but it patently doesn’t work: if you aren’t doing the process, how can you improve it and implement it?  IMPOSSIBLE!

How Cisco UCS and VMware vSphere *might* help IT

Separating production line management from product management is a critical feature of any IT implementation of the Toyota Production System.  There are other critical changes, like culture (and you don’t need to boil the ocean across the whole organization: individual teams can start the revolution), but think about this:

Think of machines servicing product components.

Think of UCS blades with vSphere servicing virtual machines (which are the actual product components).

Think Toyota Production Line machines = UCS + vSphere

Think Toyota Car Components = Virtual Machines

Separating the production line (UCS + vSphere) from the components (virtual machines) helps your staff form working teams that can each focus on making their own piece efficient, effective and well governed.  In addition, these working teams can clearly state how their inputs and outputs work, and how the teams immediately around them are affected by their performance.

Between the production line managers and the component managers, there is an objectively defined way to work: a production forecast (virtual machine pipeline), agreements on how long things take (VM + Linux + Apache + web code = 1 day), how to escalate problems/improvements with the process (either the process is improved, or the people doing the process are improved) – and so on.
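To make that agreement concrete, here is a rough Python sketch of a forecast, an agreed build time, and a simple escalation check.  Only the one-day total comes from the text above: the 8-hour working day, the per-step split, the VM names and the BuildRecord class are all assumptions made up for illustration.

```python
from dataclasses import dataclass

# Agreed build time for one web server VM, in hours.
# ASSUMPTION: a day is 8 working hours and the split per step is invented.
AGREED_HOURS = {
    "create VM": 1.0,
    "install Linux": 3.0,
    "install Apache": 2.0,
    "install Web code": 2.0,
}
AGREED_TOTAL_HOURS = sum(AGREED_HOURS.values())  # 8.0, i.e. one working day


@dataclass
class BuildRecord:
    """Actual hours spent on each step for one VM (hypothetical record)."""
    vm_name: str
    actual_hours: dict[str, float]

    def overruns(self) -> dict[str, float]:
        # Steps that took longer than agreed, with the size of the overrun.
        return {
            step: actual - AGREED_HOURS[step]
            for step, actual in self.actual_hours.items()
            if actual > AGREED_HOURS[step]
        }

    def needs_escalation(self) -> bool:
        # Escalate when the whole build exceeds the agreed total.
        return sum(self.actual_hours.values()) > AGREED_TOTAL_HOURS


# Production forecast: the VMs in the pipeline (hypothetical names).
forecast = ["web01", "web02"]

records = {
    "web01": BuildRecord("web01", {"create VM": 1.0, "install Linux": 3.0,
                                   "install Apache": 2.0, "install Web code": 2.0}),
    "web02": BuildRecord("web02", {"create VM": 1.0, "install Linux": 5.0,
                                   "install Apache": 2.0, "install Web code": 2.0}),
}

for vm_name in forecast:
    record = records[vm_name]
    if record.needs_escalation():
        print(vm_name, "escalate:", record.overruns())
    else:
        print(vm_name, "within agreement")
```

An escalation here is not a blame report.  It is the trigger for the two teams to look at the step that overran and decide whether the process or the skills need improving.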

The impact on ITIL

This is a different approach to ITIL.  There is no such thing as a “Problem Manager” – everyone is using problems as a way to improve efficiency, effectiveness and governance.  Supervisors see data as it’s rolled up, and might act upon it.

There is no such thing as a “Change Manager” – there are many different roles managing change constantly.  Again, changes have different impacts, but these impacts are learned and embedded into everyday working practices.  For example, if one team finds a better way to install Linux, Apache and the web code, then they might escalate this (e.g. change an installation library used by other teams) so that other teams benefit from the change: no change is enforced on other teams, it is agreed.

Anyone who is experienced in IT can usually open any page of an ITIL book and understand what is going on, and usually remarks: “I’ve been doing that for years, but we call it something different, and there’s some detail missing here.”

Well, the devil is in the detail, and so are the efficiencies!

This also helps security, because the people doing the work are responsible for securing the systems.  If anyone is thinking “but they don’t know how to…”, these people are expected to find out how from an expert and codify those security practices into their operations.  This security can happen at multiple team layers – component (e.g. VM), production line (e.g. ESX) and end product (e.g. Web Service), where it works like QA.  Security failures found at the end product level that are tracked down to a component (e.g. an Apache exploit) are fixed between the team that found the fault and the team that fixes it.  Obvious, eh!

Next up: an example of a virtual machine component production using vSphere – a reusable process that can be automated.