Why You Should Automate Production Ops

Most on-call issues are commonplace, which means they happen again and again. It’s important to automate the fixes for these issues because automation is a one-time investment, doesn’t make mistakes, and stays with you forever.
Summary

Let’s talk about the value of automation for production operations.

Most on-call issues are commonplace, which means they happen again and again.

So if you’re trying to fix them manually, you run into the following problems:

- People are less efficient.

It can take them an hour to register that something happened, find the right runbook, and make the fix, which wastes their time and causes unavailability.

- People make mistakes.

They make even more mistakes when the issues are commonplace, because they don't have their head in the game.

- People leave.

You might put in a lot of resources to train your people, but when they leave, they take that expertise with them.

That’s why it’s important to automate the fixes for these issues in software: automation is a one-time investment, doesn’t make mistakes (unless there’s a bug), and stays with you forever.

Here are the 2 main reasons why people don’t automate the fixes for their commonplace incidents:

1. It takes a long time.

When I was at AWS, each automation would take about a month to build, which is a long time.

So we’d go through the cost-benefit analysis to decide whether to focus on that or some other dev tasks.

But if it takes just a couple of hours to build (like how we do it at Shoreline), the cost is always low.

So the cost-benefit analysis doesn't even matter. You just build the automation, because it takes about the same amount of time as fixing the issue once.

2. It may run amok.

People know how to build the solution for an individual box, but they often make mistakes when scaling it across the fleet.

At Shoreline, we're distributed systems people. We work with circuit breakers, leases, etc., to ensure that the automations are safe and fast.

That’s how we help you build automations that enable you to:
- be less dependent on expensive, high-churn labor
- improve your availability to the customers
- sleep stress-free at night
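
To make the "run amok" concern concrete, here is a minimal Python sketch of a circuit breaker wrapped around a fleet-wide fix. It is an illustration only, not Shoreline's implementation: the names `RemediationBreaker`, `remediate_fleet`, `max_failures`, and `max_hosts` are hypothetical, and leases (the other mechanism mentioned above) are not modeled. The idea is that a fix that works on one box stops automatically once it fails too often or has touched too many hosts in one run.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class RemediationBreaker:
    """Hypothetical circuit breaker: stops automated fixes once limits are hit."""
    max_failures: int   # failed fixes tolerated before the breaker trips
    max_hosts: int      # most hosts a single run is allowed to touch
    failures: int = 0
    touched: int = 0
    tripped: bool = False

    def allow(self) -> bool:
        # Trip if this run has already touched its quota of hosts.
        if self.touched >= self.max_hosts:
            self.tripped = True
        return not self.tripped

    def record(self, ok: bool) -> None:
        # Count every attempt; trip once failures reach the threshold.
        self.touched += 1
        if not ok:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.tripped = True


def remediate_fleet(
    hosts: Iterable[str],
    fix: Callable[[str], bool],
    breaker: RemediationBreaker,
) -> Tuple[List[str], List[str], List[str]]:
    """Apply `fix` host by host; return (fixed, failed, skipped) host lists."""
    fixed: List[str] = []
    failed: List[str] = []
    skipped: List[str] = []
    for host in hosts:
        if not breaker.allow():
            skipped.append(host)  # breaker is open: leave the host for a human
            continue
        ok = fix(host)
        breaker.record(ok)
        (fixed if ok else failed).append(host)
    return fixed, failed, skipped


if __name__ == "__main__":
    # Assumed scenario: the fix misbehaves on h2 and h3, so the run halts there.
    bad_hosts = {"h2", "h3"}
    result = remediate_fleet(
        ["h1", "h2", "h3", "h4", "h5", "h6"],
        fix=lambda host: host not in bad_hosts,
        breaker=RemediationBreaker(max_failures=2, max_hosts=5),
    )
    print(result)
```

With these assumed limits, the run stops after the second failure and leaves the remaining hosts untouched, so a bad automation affects a handful of boxes rather than the whole fleet.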
