Reliability Engineering: The Southwest Debacle

Southwest operates on a point-to-point model because it is less expensive for the airline and quicker for passengers. The downside is that a disruption on one route can ripple through the entire chain. To engineer a reliable architecture, you need to balance cost versus reliability in an economically constrained way.
Summary

“Why is COVID better than Southwest Airlines? Because COVID is airborne.”

I read this on a handwritten sign while flying during the holidays.

The joke points to the problem Southwest faced in December 2022, when almost 60% of its flights were grounded.

Many reasons have been cited for the meltdown: weather, high demand, insufficient crew and planes, and the airline's outdated SkySolver crew-scheduling software.

While there is some truth to all these explanations, my question is: Why did this happen to Southwest and not to other airlines?

The real difference between Southwest and the airlines that didn’t fall over is the architecture of how they operate.

Southwest operates on a point-to-point location model, which means that each flight is directly routed from one location to another, without connecting through a central hub.

Each plane and crew moves from one flight straight on to the next, so a disruption on one route affects the entire chain.
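
To make the chain effect concrete, here is a minimal sketch of how one late departure carries forward through the legs a single aircraft flies in a day. The function name, the fixed slack per turnaround, and all of the numbers are illustrative assumptions, not Southwest data or a model of its actual scheduling.

```python
def propagate_delay(initial_delay_min: int, slack_min: int, num_legs: int) -> list[int]:
    """Return the departure delay (in minutes) for each leg in a chain.

    Assumption: every turnaround has the same schedule slack, and that
    slack is the only thing that absorbs delay between consecutive legs.
    """
    delays = []
    delay = initial_delay_min
    for _ in range(num_legs):
        delays.append(delay)
        # Each turnaround recovers at most `slack_min` minutes of delay.
        delay = max(0, delay - slack_min)
    return delays


if __name__ == "__main__":
    # Hypothetical case: a 2-hour delay, 15 minutes of slack per turn, 5 legs.
    print(propagate_delay(120, 15, 5))
    # With no slack at all, every downstream leg inherits the full delay.
    print(propagate_delay(120, 0, 5))
```

In this toy model, the tighter the schedule (the smaller the slack), the further a single disruption travels down the chain.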

On the other hand, most other airlines use a hub-and-spoke model, which is more resilient when failures occur.

This model lets an airline adopt an n+k approach: n units must work to run the schedule, and k spares are held so that up to k failures can be tolerated.

In practice, that means keeping k reserve planes and crews at the hub as a contingency against disruptions.

To do the same in the point-to-point model, you’d need to have k reserves at all locations, which isn’t economically feasible.
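
The cost gap between the two models can be sketched with simple arithmetic. The snippet below is a rough illustration under stated assumptions: the function names are hypothetical, each site holding reserves keeps exactly k of them, and the airport and hub counts are made-up numbers rather than figures for any real airline.

```python
def reserves_hub_and_spoke(num_hubs: int, k: int) -> int:
    """Reserves needed when spares are pooled at hubs only (assumed k per hub)."""
    return num_hubs * k


def reserves_point_to_point(num_locations: int, k: int) -> int:
    """Reserves needed when every location must hold its own k spares."""
    return num_locations * k


if __name__ == "__main__":
    # Hypothetical network: 100 airports served, 5 of them hubs, k = 2 spares.
    k = 2
    hub = reserves_hub_and_spoke(num_hubs=5, k=k)
    p2p = reserves_point_to_point(num_locations=100, k=k)
    print(f"hub-and-spoke reserves:  {hub}")
    print(f"point-to-point reserves: {p2p}")
    print(f"ratio: {p2p / hub:.0f}x")
```

With these assumed numbers, covering every location takes 20 times as many reserve aircraft and crews as covering the hubs, which is the economic constraint described above.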

There are more nuances to this, such as the point-to-point model being less expensive for the airline and quicker for the passengers.

But to engineer a reliable architecture, you need to balance cost versus reliability in an economically constrained way.
