
How to Manage Your Operational Data Efficiently

"How long should we keep operational data?"
Summary

Customers often ask us this question, primarily because operational data is expensive to store. Let's look at a couple of common cases and see how to manage operational data efficiently.

Case 1: an ongoing event

If an event is going on right now, you want real-time data, possibly down to per-second granularity, so you can debug the live event without having to query each box separately. Here's how most companies mishandle it:

Even though production ops at its core is a distributed system, they handle the events by pulling all the data into one system, which:

  • creates lag and inconsistency across your data silos,
  • prevents them from knowing what's going on right now, and
  • costs them a lot of money because they end up storing a lot of unneeded metrics.

At Shoreline, we believe that the ground truth is in the boxes you manage. We treat the distributed system like a distributed system: we push the questions you ask out to the nodes and pull the answers back, giving you a real-time, per-second view of metrics, resources, and the output of Linux commands.
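The push-the-question, pull-the-answer idea can be sketched as a scatter-gather query. This is only an illustration, not Shoreline's implementation: the `FLEET` dictionary, the host names, and the `ask_node` and `ask_fleet` helpers are all hypothetical, and a real agent would read local counters instead of an in-memory table.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for state that lives on each box.
FLEET = {
    "host-a": {"cpu_pct": 35.0, "disk_pct": 61.0},
    "host-b": {"cpu_pct": 92.5, "disk_pct": 40.0},
    "host-c": {"cpu_pct": 88.0, "disk_pct": 97.0},
}


def ask_node(host, question):
    """Evaluate the question where the data lives; return only the answer."""
    local_metrics = FLEET[host]  # a real agent would sample the box itself
    return host, question(local_metrics)


def ask_fleet(question):
    """Scatter the question to every node in parallel, then gather answers."""
    with ThreadPoolExecutor(max_workers=len(FLEET)) as pool:
        results = pool.map(lambda host: ask_node(host, question), FLEET)
        return dict(results)


# Ask every node "is your CPU above 85%?" and collect one boolean per host.
hot = ask_fleet(lambda m: m["cpu_pct"] > 85.0)
hot_hosts = sorted(host for host, is_hot in hot.items() if is_hot)
print(hot_hosts)
```

Only the small answer travels back from each node, so nothing has to be copied into a central store before you can ask the question.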

Case 2: Operational reporting

If you want to do operational reporting over, say, the last month, the data doesn't need to be as fine-grained. You need accurate, high-fidelity information about the issues that occurred so you can track trends and anomalies, but you don't care about the rest of the data. This is how we deal with it at Shoreline. (This is going to get a bit technical, so buckle up!)

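We transform the raw data into the time-frequency domain using wavelets. Wavelet transforms belong to the same family of techniques as the transforms behind image and video compression formats such as JPEG and MPEG (JPEG 2000 uses wavelets directly). They give us great compression, about 40x if you can believe it, which enables: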

  • high-resolution, per-second data, and
  • trend analysis over time, because you can match the shape of a curve to see whether a similar event occurred in the past.
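To make the wavelet idea concrete, here is a minimal sketch using the Haar wavelet, the simplest member of the family. It is an assumption for illustration only: the source does not say which wavelet or thresholding scheme Shoreline uses, and the function names and the sample CPU series are hypothetical. The sketch transforms a series, keeps only the largest coefficients, and reconstructs the curve from them.

```python
import math


def haar_forward(signal):
    """Orthonormal Haar transform; len(signal) must be a power of two."""
    coeffs = list(signal)
    n = len(coeffs)
    s = math.sqrt(2.0)
    while n > 1:
        half = n // 2
        avg = [(coeffs[2 * i] + coeffs[2 * i + 1]) / s for i in range(half)]
        diff = [(coeffs[2 * i] - coeffs[2 * i + 1]) / s for i in range(half)]
        coeffs[:n] = avg + diff  # averages first, then details for this level
        n = half
    return coeffs


def haar_inverse(coeffs):
    """Invert haar_forward, rebuilding from the coarsest level outward."""
    out = list(coeffs)
    n = 1
    s = math.sqrt(2.0)
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged.append((a + d) / s)
            merged.append((a - d) / s)
        out[:2 * n] = merged
        n *= 2
    return out


def compress(signal, keep):
    """Keep only the `keep` largest-magnitude coefficients, stored sparsely."""
    coeffs = haar_forward(signal)
    ranked = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    return len(coeffs), {i: coeffs[i] for i in ranked[:keep]}


def decompress(length, sparse):
    """Rebuild the series, treating every dropped coefficient as zero."""
    coeffs = [0.0] * length
    for index, value in sparse.items():
        coeffs[index] = value
    return haar_inverse(coeffs)


# Hypothetical per-second CPU series: flat at 20% with a four-second spike.
cpu = [20.0] * 64
cpu[40:44] = [95.0] * 4

length, sparse = compress(cpu, keep=8)
restored = decompress(length, sparse)
max_error = max(abs(a - b) for a, b in zip(cpu, restored))
print(f"{len(cpu)} samples stored as {len(sparse)} coefficients, max error {max_error:.2e}")
```

Mostly flat operational metrics with occasional spikes concentrate their energy in a few coefficients, which is why the spike survives while most of the stored values are dropped. The ratio in this toy example says nothing about the 40x figure above, which depends on real data and on the codec used.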

All that geeking out aside, the basic point is that you need:

  • live high-resolution data, and
  • a cost-effective way to retain it for a long time.

We believe that people shouldn't store operational data for a long time, because we don't think they will look at it. But we make it efficient for them to do so. Storing 100 metrics sampled once per second costs us about $0.25 per host per year. That's so inexpensive that we don't even bother charging for it right now.
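A back-of-the-envelope calculation shows the scale behind that figure. The 100 metrics, the per-second sampling, the 40x ratio, and the $0.25 come from the text above. The 8 bytes per raw sample, and the idea that the $0.25 covers the compressed data, are assumptions made only for this sketch.

```python
METRICS_PER_HOST = 100
SECONDS_PER_YEAR = 60 * 60 * 24 * 365
COMPRESSION_RATIO = 40        # "about 40x" from the text
COST_PER_HOST_YEAR = 0.25     # dollars, from the text
BYTES_PER_SAMPLE = 8          # assumption: one 64-bit value per raw sample

samples = METRICS_PER_HOST * SECONDS_PER_YEAR
raw_gb = samples * BYTES_PER_SAMPLE / 1e9
compressed_gb = raw_gb / COMPRESSION_RATIO
implied_cost_per_gb_year = COST_PER_HOST_YEAR / compressed_gb

print(f"samples per host per year: {samples:,}")
print(f"raw size: {raw_gb:.2f} GB, compressed: {compressed_gb:.2f} GB")
print(f"implied storage cost: ${implied_cost_per_gb_year:.2f} per GB-year")
```

Under these assumptions, each host produces roughly 3.2 billion samples a year, and compression shrinks about 25 GB of raw values to well under 1 GB.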
