Fluentd is a great logging system: it aggregates the logs from mission-critical systems into a single data stream that we can pipe anywhere. It’s a great source of application diagnostic data … if it’s running correctly.
If the logging platform slows down, anything that sends logs synchronously slows down with it, causing customer pain and potentially system outages. Worse, the tools that depend on this diagnostic data can’t report a problem when the problem is the logging system itself.
This Shoreline automation monitors the liveness probe and health check on the Fluentd containers. If a pod is stuck or slow to respond, Shoreline automatically kills it so that Kubernetes creates a replacement pod and brings the system back online.
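
To make the check-then-restart loop concrete, here is a minimal Python sketch of the same idea. It is not Shoreline's implementation; it only illustrates the logic under a few stated assumptions: the Fluentd `monitor_agent` plugin is enabled on its default port (24220), `kubectl` is on the path, and the pods are managed by a controller such as a DaemonSet, so that Kubernetes recreates a deleted pod. The threshold value and the function names are hypothetical.

```python
import http.client
import subprocess
import time
import urllib.request
from typing import List, Optional

# Assumption: Fluentd's monitor_agent plugin is enabled on its default port.
HEALTH_URL_TEMPLATE = "http://{ip}:24220/api/plugins.json"
# Hypothetical threshold; tune it for your own environment.
SLOW_THRESHOLD_SECONDS = 2.0


def probe_latency(url: str, timeout: float) -> Optional[float]:
    """Return the response time in seconds, or None if the probe fails."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()
    except (OSError, http.client.HTTPException):
        # Covers connection errors, timeouts, and HTTP error statuses.
        return None
    return time.monotonic() - start


def needs_restart(latency: Optional[float],
                  threshold: float = SLOW_THRESHOLD_SECONDS) -> bool:
    """A pod is unhealthy if the probe failed (stuck) or was too slow."""
    return latency is None or latency > threshold


def delete_pod_command(pod: str, namespace: str) -> List[str]:
    """Build the kubectl command that deletes a single pod."""
    return ["kubectl", "delete", "pod", pod, "--namespace", namespace]


def remediate(pod: str, namespace: str, pod_ip: str,
              probe=probe_latency, run=subprocess.run) -> bool:
    """Probe one Fluentd pod and delete it if it is stuck or slow.

    Returns True if the pod was deleted, False if it looked healthy.
    """
    url = HEALTH_URL_TEMPLATE.format(ip=pod_ip)
    # Allow the probe twice the threshold before treating it as stuck.
    latency = probe(url, SLOW_THRESHOLD_SECONDS * 2)
    if not needs_restart(latency):
        return False
    run(delete_pod_command(pod, namespace), check=True)
    return True
```

Calling `remediate("fluentd-abc12", "logging", "10.0.0.7")` (all three values are placeholders) would probe that pod once and delete it only if the probe failed or exceeded the threshold. The `probe` and `run` parameters are injectable so the decision logic can be exercised without a live cluster.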