The Best Way to Improve Your On-Call

No one wants to do on-call because you can't control when the incident happens. Improve your on-call by building automations that eliminate common production incidents.
2 min
Summary

“Should I separate development from operations to manage incidents?”

This question comes from a deep pain point.

No one wants to do on-call.

That's because you can't control when an incident happens. It might happen over a weekend or overnight. And if it's on your shift, you're the one carrying the load. But the notion of separating Dev and Ops misses the point.

It's like separating development from QA. Yes, you have QA people, but their job is to ensure that:
- deployments go cleanly,
- regression testing is being done properly,
- things that escaped the dev testing process get handled, etc.

But you still have developers writing tests.

Similarly, you need to have developers taking on-call shifts because:
- The load gets shared and becomes easier to manage.
- More importantly, you share the problem instead of hiding it in one team. This incentivizes you to solve it for yourself and your customers.

How do you solve it?

You do it by building automations that eliminate common production incidents.

That’s what we do at Shoreline: we enable your DevOps engineers to build automations in an afternoon that fix issues forever. As a result, you get fewer incidents and better on-call.
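
To make the idea of an automation concrete, here is a minimal sketch in plain Python. It is not Shoreline's product or API. Every name in it (`Remediation`, `run_remediation`, `disk_nearly_full`) is hypothetical, and the disk-full scenario is only an assumed example of a common incident. The pattern is: check for the problem, apply a known fix, check again, and page a human only if the fix did not work.

```python
import shutil
from dataclasses import dataclass
from typing import Callable


@dataclass
class Remediation:
    """A hypothetical automation: a check paired with a known fix."""
    name: str
    check: Callable[[], bool]  # returns True when the problem is present
    fix: Callable[[], None]    # action that is expected to clear the problem


def run_remediation(r: Remediation) -> str:
    """Run check, fix, re-check and report what happened."""
    if not r.check():
        return "healthy"   # nothing to do
    r.fix()
    # Re-check so a failed fix is escalated to a human instead of hidden.
    return "fixed" if not r.check() else "escalate"


def disk_nearly_full(path: str = "/", threshold: float = 0.9) -> bool:
    """Assumed example check: is the disk at `path` above the threshold?"""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total >= threshold
```

A team could pair `disk_nearly_full` with its own cleanup action (for example, deleting rotated logs) and run it on a schedule, so that only the `"escalate"` outcome reaches whoever is on call.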
