Incident Management is the first stop in most people’s ITSM journey. So why can it go so wrong, particularly during a Major Incident?
I recently read an article on a failed Major Incident Response. A ‘very stable’ system fell over for the first time in years, long after the people who implemented it had hung up their cables.
Guess what happened?
- Chaos on the Major Incident (MI) bridge
- Every SME talking at the same time
- Mini solutions appearing with no coordination
- Documentation? What documentation?
So here’s your cheat sheet.
DO:
- Get the right people (not everyone)
- Have a single leader
- Document everything as you go, even if it’s only rough notes
- Focus on restoration first
- Keep communications clear, brief and relevant
DON’T:
- Start finger-pointing
- Chase the root cause during the fire
- Let non-essential management hijack the call
- Forget stakeholder communications
- Throw everything at it without a plan
- Try multiple fixes at once, which obscures the one that actually worked
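If your team keeps its notes in a script or chat bot, three of the points above (a single leader, rough notes as you go, one fix at a time) can be sketched in a few lines of Python. This is only an illustration: the `IncidentLog` class, its fields and its method names are all made up for this example and do not come from any ITSM tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class IncidentLog:
    """Rough, timestamped notes for one Major Incident (hypothetical structure)."""

    title: str
    leader: str  # the single person running the bridge
    entries: List[str] = field(default_factory=list)
    active_fix: Optional[str] = None  # at most one fix attempt in flight

    def note(self, author: str, text: str) -> None:
        """Append a rough note with a UTC timestamp."""
        stamp = datetime.now(timezone.utc).strftime("%H:%M:%SZ")
        self.entries.append(f"{stamp} [{author}] {text}")

    def start_fix(self, author: str, description: str) -> None:
        """Record the start of a fix; refuse if another one is still running."""
        if self.active_fix is not None:
            raise RuntimeError(
                f"'{self.active_fix}' is still in progress; finish it first"
            )
        self.active_fix = description
        self.note(author, f"FIX STARTED: {description}")

    def end_fix(self, author: str, worked: bool) -> None:
        """Record the outcome of the current fix so the timeline shows what helped."""
        if self.active_fix is None:
            raise RuntimeError("no fix is in progress")
        outcome = "restored service" if worked else "did not help"
        self.note(author, f"FIX ENDED ({outcome}): {self.active_fix}")
        self.active_fix = None


if __name__ == "__main__":
    # Example names and actions are invented for illustration.
    log = IncidentLog(title="Payments outage", leader="Sam")
    log.note("Sam", "Bridge opened, restoration is the priority")
    log.start_fix("Alex", "restart the application service")
    log.end_fix("Alex", worked=True)
    print("\n".join(log.entries))
```

The point of `start_fix` raising an error is the same as the last DON’T: if two changes go in together, nobody can say afterwards which one restored service.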
When you are weathering a storm, have a single Captain steering the ship.

