Decoding Taylor Swift’s Ticketmaster Debacle

What can we learn from the Ticketmaster (Taylor Swift) Debacle? Ticketmaster experienced unprecedented demand that crashed their site for many hours. If they had designed a reliable service that works like an escalator instead of an elevator, this could have been avoided.
Summary

Let’s talk about the Ticketmaster (Taylor Swift) Debacle and what we can learn from it.

You may remember this incident where Ticketmaster tried to sell tickets for a Taylor Swift concert, and their site went down for hours.

They said that it happened due to the unprecedented demand.

To me, that’s nonsense because this situation could have been easily avoided if they had load tested their systems properly.

But I want to talk about a deeper underlying issue: The first job of a service is to protect itself.

You do that by putting a queue in front of your service, which acts as a buffer between the service and the incoming requests. Suppose your service can handle 500 requests per second. If 50,000 requests arrive in one second, then instead of crashing, it serves 500 and either queues the other 49,500 or rejects them with an error message.
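To make the buffer idea concrete, here is a minimal sketch in Python. It is an illustration only, not a description of how Ticketmaster's systems work. The class name `BoundedBuffer`, its methods, and the waiting-room limit of 10,000 are hypothetical choices made for this example; only the 500 and 50,000 figures come from the text above.

```python
from collections import deque


class BoundedBuffer:
    """Hypothetical admission buffer placed in front of a service."""

    def __init__(self, capacity_per_second, max_waiting):
        self.capacity_per_second = capacity_per_second  # what the service can handle
        self.max_waiting = max_waiting                  # assumed waiting-room size
        self.waiting = deque()

    def offer(self, request):
        """Queue the request if there is room; otherwise reject it."""
        if len(self.waiting) >= self.max_waiting:
            return False  # caller would show an error message here
        self.waiting.append(request)
        return True

    def drain_one_second(self):
        """Hand the service at most capacity_per_second requests."""
        count = min(self.capacity_per_second, len(self.waiting))
        return [self.waiting.popleft() for _ in range(count)]


buffer = BoundedBuffer(capacity_per_second=500, max_waiting=10_000)

# 50,000 requests arrive at once.
accepted = sum(buffer.offer(request_id) for request_id in range(50_000))
rejected = 50_000 - accepted

# The service only ever sees 500 of them in the first second.
served = buffer.drain_one_second()
```

The point of the sketch is that the service behind the buffer never sees more than 500 requests in a second, no matter how many arrive. Everything else either waits its turn or is turned away quickly.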

Had Ticketmaster used this mechanism, it’d have protected their service from crashing while ensuring that a portion of the demand was still being served.

Think of a queue as an escalator. It operates at a consistent pace and can handle a certain amount of demand.
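The escalator's consistent pace can be shown with a few lines of arithmetic. This is a rough simulation under simple assumptions: the function name `simulate_escalator` is made up for this example, the queue is unbounded, and all 50,000 requests arrive in the first second.

```python
def simulate_escalator(arrivals_per_second, rate):
    """Serve at most `rate` requests each second; the rest wait in the backlog."""
    backlog = 0
    served_history = []
    backlog_history = []
    for arrivals in arrivals_per_second:
        backlog += arrivals
        served = min(rate, backlog)  # never exceed the steady pace
        backlog -= served
        served_history.append(served)
        backlog_history.append(backlog)
    return served_history, backlog_history


# A burst of 50,000 requests in the first second, then nothing.
served, backlog = simulate_escalator([50_000, 0, 0], rate=500)
# served  is [500, 500, 500]
# backlog is [49500, 49000, 48500]
```

The service keeps moving 500 requests every second while the backlog shrinks steadily. At this pace the burst takes 100 seconds to clear, which is slow for the people at the back of the line but far better than a site that is down for hours.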

In comparison, an elevator is like a service that does not handle variability in demand very well. During rush hour, elevators tend to get overwhelmed and stop functioning effectively.

So when designing a reliable service, try to create an escalator-like system instead of an elevator.
