I somehow doubt that large IT shops are usually compared to the T.V. series M.A.S.H., but the more I thought about it, the more it really made sense.

This whole thing started when one of the engineers at my company asked me about what they considered a lackluster effort on the part of IT in diagnosing a problem with their machine.

Disgruntled Engineer
“So I called IT the other day and complained about issues with my machine being slow, random Office errors popping up, and this weird thing where IE would freeze from time to time, and you know what they wanted to do? They wanted to wipe my machine and re-install Windows!

Why don’t you guys spend more time finding the root problem, and less time on re-installs that hurt engineering productivity?”

“Well, to be honest, when we get trouble tickets we triage them according to severity and our ability to fix them with limited resources, and sometimes the best option is to wipe your machine and start over.”

Disgruntled Engineer
“So you’re saying that because my machine didn’t have an ‘easy’ or ‘moderate’ problem to solve, your answer is to just re-install?”

“Uhhhh, yeah, that’s how it goes sometimes.”

So for those who are too young to remember M.A.S.H., it was a T.V. show that ran in the ’70s and ’80s detailing the exploits of a mobile army surgical hospital.

M.A.S.H. units would be moved from time to time to be closer to the front lines, and they served as the first stop for the wounded. Think of a trauma ER made out of tents. There the wounded would be assessed, quickly patched up, and then sent off to a more robust medical facility.

Incoming wounded would be flown in on helicopters, dropped off, and then immediately triaged by the medical staff, who decided which patients got priority based on their condition. Unfortunately, one of these categories was “Not going to make it”.

Due to limited medical staff and the critical need of other patients, hard decisions had to be made right away about who could be saved, and who was probably beyond saving, and that holds true in large IT departments today as well.

If we think of machine problems as “wounded”, on an average day we receive 850 “wounded”. These range from someone with a stubbed toe (password reset) up to a patient who has flatlined (dead CPU).

We also face constraints similar to those of the medical personnel in M.A.S.H.: limited personnel, resources, and equipment. Keeping those in mind, we have to make hard decisions about where we spend those resources.

So if an employee brings in their machine with really odd problems or intermittent issues, we often don’t have the luxury of spending 8 hours on a single machine trying to determine the root cause of the problem. Instead, we have to recommend a re-image.

No IT person likes to punt and just re-image, but we are constantly reminded by the sound of those choppers that we have wounded coming in on average every two minutes, day in and day out, and sometimes the best you can do is admit that you don’t have the time and resources to determine the real issue.

That’s not to say we would ignore a bizarre systemic problem that seems to be widespread across the enterprise. It just means that we often can’t spend the time we would like to on each and every issue that we see.
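For the engineers reading along, here is roughly how that decision looks if you sketch it in Python. This is a toy model and not our actual ticketing system: the `Ticket` fields, the 1-to-5 severity scale, and the two-hour cutoff are all made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical cutoff: the most tech time we can spend on one machine
# before a re-image becomes the better use of resources. Made-up number.
MAX_DIAGNOSIS_HOURS = 2.0


@dataclass
class Ticket:
    summary: str
    severity: int               # 1 = stubbed toe, 5 = flatline (made-up scale)
    est_diagnosis_hours: float  # a tech's rough guess at root-cause time
    widespread: bool = False    # same symptom seen across many machines


def triage(ticket: Ticket) -> str:
    """Return a rough disposition for one incoming ticket."""
    if ticket.widespread:
        # Systemic problems are worth the root-cause time.
        return "investigate"
    if ticket.est_diagnosis_hours <= MAX_DIAGNOSIS_HOURS:
        return "fix"
    # A one-off problem that would eat most of a day: start over.
    return "re-image"


def work_order(tickets: list[Ticket]) -> list[Ticket]:
    """Most severe first; among equals, the quickest fix goes first."""
    return sorted(tickets, key=lambda t: (-t.severity, t.est_diagnosis_hours))
```

Under these assumptions, a one-off machine with intermittent errors and an eight-hour diagnosis estimate comes back as "re-image", while the same symptom flagged as widespread comes back as "investigate".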

Machines, much like the human body, are prone to problems, and as they age they can exhibit all types of interesting issues that can be difficult and time-consuming to diagnose. Unlike the folks in the 4077th M.A.S.H. unit, we have the option of starting over when need be.

Hopefully that helps educate folks that what some could perceive as lackluster effort is really us just trying to triage patients as fast as we can with the resources that we have been given.

On that note, I hear incoming helicopters. I have to run…

