
The Best Way to Improve Your On-Call

No one wants to do on-call because you can't control when an incident happens. Improve your on-call by building automations that eliminate common production incidents.
2 min
Summary

“Should I separate development from operations to manage incidents?”

This question comes from a deep pain point.

No one wants to do on-call.

It’s because you can't control when an incident happens. It might happen over a weekend or overnight. And if it happens during your shift, you're the one carrying the load. But the notion of separating Dev and Ops misses the point.

It's like separating development from QA. Yes, you have QA people, but their job is to ensure that:
- deployments go cleanly,
- regression testing is being done properly,
- things that escaped the dev testing process get handled, etc.

But you still have developers writing tests.

Similarly, you need developers taking on-call shifts because:
- that way, the load gets shared and becomes easier to manage;
- more importantly, you share the problem instead of hiding it in one community, which gives you an incentive to solve it for yourself and your customers.

How do you solve it?

You do it by building automations that eliminate common production incidents.
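
To make the idea concrete, here is a minimal Python sketch of one such automation. It is an illustration only, not Shoreline's product or API: it assumes a common incident (a disk filling up with old log files) and a hypothetical remediation (deleting the oldest `*.log` files until usage drops below a threshold). The function names, the 90% threshold, and the file pattern are all assumptions made for the example.

```python
import shutil
from pathlib import Path


def disk_used_fraction(path):
    """Return the fraction (0.0 to 1.0) of the disk holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def remediate_full_disk(log_dir, threshold=0.9, used_fraction=disk_used_fraction):
    """Delete the oldest *.log files in log_dir until usage is below threshold.

    Hypothetical remediation: `used_fraction` is injectable so the check can
    be replaced (for example, in tests). Returns the deleted file names,
    oldest first.
    """
    deleted = []
    # Oldest files first, based on modification time.
    logs = sorted(Path(log_dir).glob("*.log"), key=lambda p: p.stat().st_mtime)
    for log in logs:
        if used_fraction(log_dir) < threshold:
            break  # Healthy again: stop deleting.
        log.unlink()
        deleted.append(log.name)
    return deleted
```

Run on a schedule or triggered by an alert, a check like this would handle the incident before anyone is paged. A real automation would also need safeguards this sketch leaves out, such as never deleting files that are still being written.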

That’s what we do at Shoreline: we enable your DevOps engineers to build, in an afternoon, automations that fix issues for good. As a result, you get fewer incidents and better on-call.

Looking for more? View our most recent videos
2 min
Why We Leverage Wavelets for Data Compression
Wavelets are the best way to deal with errors in the underlying data stream
3 min
Why You Need Automation Today
A ton of tools help you observe your environment and maybe half a ton help you route things and deduplicate them. But there's hardly anything out there that actually fixes your environment. That's the reason we need automation in production ops today.
2 min
How to Reduce On-Call Incidents
Shoreline's recent survey found that 48% of incidents are straightforward and repetitive while 55% of them escalate beyond the 1st line on call. If your on-call sucks, you must find a path to make incidents incidental.