Let’s talk about bringing continuous improvement to operations.
I deeply believe in making things 1% better each and every week.
I've used this approach when coaching and mentoring people, improving the performance of the software I've been responsible for, and keeping my services up in operations.
But when I talk about it to folks in operations, they often push back, saying:
“I don't even know what problems I have, so how am I supposed to fix them?”
“I know I have a lot of problems, but I have no idea which one I should work on.”
“I have a lot of problems, but I can't even collect the data, because it takes me hours to do so. How am I supposed to start this process you're talking about on a weekly cadence?”
These are actually fair points.
To solve these, we’ve introduced a free tool called Incident Insights at Shoreline.
It pulls data out of your incident management environment and, about 2 minutes later, categorizes your data using some ML techniques to tell you which incidents are:
- the highest cause of pain to your team
- taking the longest time to repair
- happening all the time, representing noisy alarms
You can further customize the reports to fit your needs, but it gives you a starting point to figure out which of your alarms:
- are noisy and should be turned off
- have a mis-set threshold
- need to be refined using Shoreline to understand whether they should be eliminated, automated, or rerouted to a human.
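To make those three rankings concrete, here is a minimal sketch of the underlying idea in plain Python. It is not how Incident Insights works internally (the tool uses ML techniques to categorize incidents); it only groups already-labeled incidents by alarm and ranks them by total repair time, average repair time, and frequency. The `Incident` record, its field names, and the sample data are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Incident:
    alarm: str                # hypothetical field: name of the alarm that fired
    minutes_to_repair: float  # hypothetical field: minutes from page to resolution


def summarize(incidents):
    """Group incidents by alarm and compute count, total, and mean repair time."""
    durations = defaultdict(list)
    for inc in incidents:
        durations[inc.alarm].append(inc.minutes_to_repair)
    return {
        alarm: {
            "count": len(d),
            "total_minutes": sum(d),
            "mean_minutes": sum(d) / len(d),
        }
        for alarm, d in durations.items()
    }


def rank(summary, metric):
    """Return alarm names sorted from highest to lowest on the given metric."""
    return sorted(summary, key=lambda alarm: summary[alarm][metric], reverse=True)


# Made-up sample data, for illustration only.
incidents = [
    Incident("disk-full", 45), Incident("disk-full", 60),
    Incident("cpu-high", 5), Incident("cpu-high", 4),
    Incident("cpu-high", 6), Incident("cpu-high", 5),
    Incident("db-failover", 180),
]

summary = summarize(incidents)
most_painful = rank(summary, "total_minutes")   # most total time spent
slowest_to_fix = rank(summary, "mean_minutes")  # longest average repair
noisiest = rank(summary, "count")               # fires most often

print(most_painful, slowest_to_fix, noisiest)
```

In this made-up data, the alarm that fires most often is also the one that costs the least time per incident, which is the kind of pattern that suggests a noisy alarm or a mis-set threshold rather than a real problem.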
I think this tool will really help you introduce the process of continuous improvement to operations.
And … did I mention that it’s free?