When a Kafka topic grows too long, it is a sign that consumers are not processing messages as fast as they are produced. When messages aren’t consumed, applications may begin to break, with reports and transactions being the first to fail.
On the surface, this is not a difficult problem to diagnose. Close monitoring of consumer metrics will tell you when messages are not being consumed, and if the issue is caught early, the consumer pods simply need to be restarted. The real problem arises when monitoring cannot keep up. The further consumers fall behind, the more things get out of sync and the harder the problem is to fix, which will most likely lead to availability issues for customers.
Shoreline’s Kafka OpPack detects Kafka lag and restarts consumer pods to clear it. You designate the group of pods that consume the topic, and Shoreline either captures lag metrics from a Kafka exporter or calls Kafka directly to get the topic length.
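To make the detect-and-restart idea concrete, here is a minimal Python sketch of the general technique. It is not Shoreline's implementation. It assumes lag is measured as the latest offset minus the consumer group's committed offset, summed across partitions, and that the consumers run as a Kubernetes Deployment that can be restarted with `kubectl rollout restart`. The function names, the threshold, the deployment name `orders-consumer`, and the sample offsets are all hypothetical.

```python
from typing import Dict, List


def total_lag(end_offsets: Dict[int, int], committed_offsets: Dict[int, int]) -> int:
    """Sum per-partition lag: latest offset minus the group's committed offset."""
    lag = 0
    for partition, end in end_offsets.items():
        # Assumption: a partition with no committed offset is treated as offset 0.
        committed = committed_offsets.get(partition, 0)
        lag += max(end - committed, 0)
    return lag


def restart_command(lag: int, threshold: int, deployment: str, namespace: str) -> List[str]:
    """Return the kubectl command to restart the consumers, or [] if lag is acceptable."""
    if lag <= threshold:
        return []
    return ["kubectl", "rollout", "restart", f"deployment/{deployment}", "-n", namespace]


# Hypothetical offsets, keyed by partition number.
end = {0: 1500, 1: 1200}
committed = {0: 1000, 1: 1190}

lag = total_lag(end, committed)
print(lag)  # 510
print(restart_command(lag, threshold=100, deployment="orders-consumer", namespace="default"))
```

The sketch only builds the restart command and does not run it, so the decision logic can be checked without a cluster. In practice the offsets would come from a Kafka exporter or from Kafka itself, as described above.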