Kubernetes Pods Restarting Too Often

Detect pod restart loops and capture diagnostics to identify the root cause.

The problem

A pod can get stuck in a frequent restart loop for many reasons. Perhaps a new version of the software has a bug that lets it start up correctly but makes it fail shortly afterward. Or maybe running the software for long periods leads to corruption that slowly accumulates. In some cases, intermittent network connectivity causes the software to fail.

Whatever the cause, when the software fails, Kubernetes restarts the pod's container and a fresh instance starts up. In most cases this resolves the problem and keeps the application online. But if the crashes keep repeating, the application becomes less responsive. If all the pods are crashing and restarting at once, the application goes offline.

The solution

Shoreline makes it easy to detect a pod boot loop, which it identifies by the number of restarts over a specific period of time. This setting is configured from the Shoreline portal. When excessive restarts occur, Shoreline captures diagnostic details about the application, including memory and CPU usage, plus JVM details if applicable. Shoreline then raises an alert and can optionally open an issue in an external ticketing system.
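To make the "restarts over a specific period of time" idea concrete, here is a minimal sketch of a sliding-window check. It is not Shoreline's implementation. The names `total_restarts` and `RestartLoopDetector` and the default threshold and window are assumptions chosen for illustration. The only Kubernetes detail it relies on is the `restartCount` field under `status.containerStatuses` in a pod's JSON, as returned by `kubectl get pod <name> -o json`.

```python
from collections import deque
from dataclasses import dataclass, field


def total_restarts(pod: dict) -> int:
    """Sum restartCount across all containers of one pod.

    `pod` is the parsed JSON for a single pod, for example the output of
    `kubectl get pod <name> -o json`.
    """
    statuses = pod.get("status", {}).get("containerStatuses", [])
    return sum(s.get("restartCount", 0) for s in statuses)


@dataclass
class RestartLoopDetector:
    """Flags a pod when its restart count grows too fast (illustrative only)."""

    max_restarts: int = 3          # hypothetical threshold
    window_seconds: float = 600.0  # hypothetical window length
    # (timestamp, cumulative restart count) samples, oldest first
    _samples: deque = field(default_factory=deque, init=False, repr=False)

    def observe(self, timestamp: float, restarts: int) -> bool:
        """Record a sample; return True if restarts in the window exceed the threshold."""
        self._samples.append((timestamp, restarts))
        cutoff = timestamp - self.window_seconds
        # Discard old samples, but keep the newest one at or before the
        # start of the window so it can serve as the baseline count.
        while len(self._samples) > 1 and self._samples[1][0] <= cutoff:
            self._samples.popleft()
        baseline = self._samples[0][1]
        return restarts - baseline > self.max_restarts
```

A caller would poll each pod periodically, pass `total_restarts(pod)` and the current time to `observe`, and trigger diagnostics collection when it returns `True`. Comparing cumulative counts against a baseline sample, rather than counting individual restart events, keeps the check simple when several restarts happen between two polls.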

Highlights

Customer experience impact: High (potential hours of downtime)
Occurrence frequency: High (can occur on clusters of any size)
Shoreline time to repair: Low (1-2 minutes)
Time to diagnose manually
Security
Cost impact
Time to repair manually: High (5-7 manual hours)
