Server environments can be challenging to run. Sometimes processes silently die. Other times, old versions of processes are left running. Worst of all is a malicious process that creates a security vulnerability or mines cryptocurrency on your infrastructure. In these situations, debugging the issue can be tricky. For example, an engineer might wonder, “Why is there no data in this report? The report server is running correctly.” Eventually they discover that the data loading process quietly died.
Another important use case is for companies that offer a flexible platform and a free trial. In these situations, customers can install third-party applications. How do these companies give their customers the flexibility to run lots of applications while keeping boundaries in place to prevent abuse of the platform? This problem has come into focus recently because of the rise of cryptocurrency mining.
Processes that are running but shouldn’t be go undetected, driving up compute costs and creating potential reliability issues and even security vulnerabilities. Conversely, processes that should be running but have stalled or crashed also go undetected. Missing processes can be tricky to diagnose, increasing the time and effort required to identify and resolve incidents.
So how do you prevent this? Kubernetes is good at handling this type of problem, unless you are giving your customers the flexibility to run their own applications. Teams running virtual machines and hybrid VM/container environments don’t have a universal way to configure this detection and create uniform rules about what processes should and should not be running. In these situations, Shoreline is a powerful tool for managing these types of problems.
Dead processes can be particularly challenging with web servers, application servers and log pushers.
In all three of these cases and many more, Shoreline can help proactively identify and restart these services.
The Shoreline Process List Op pack lets you create a good process list and a bad process list. The good process list does two things. First, it notifies you if anything that should be running is not running. Second, it notifies you if anything NOT on the good list is running. The bad process list notifies you if any process you have put on the bad list is running. This gives you two different ways of identifying potentially disruptive processes running in your environment.
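To make the good-list and bad-list logic concrete, here is a minimal sketch in plain Python. It is not Shoreline's implementation or configuration syntax; the function name `check_process_lists` and the process names are hypothetical, chosen only to illustrate the three kinds of notification described above.

```python
def check_process_lists(running, good_list, bad_list):
    """Compare running process names against a good list and a bad list.

    Hypothetical helper for illustration only; not part of Shoreline.
    """
    running_set = set(running)
    good = set(good_list)
    bad = set(bad_list)
    return {
        # On the good list but not running
        "missing": sorted(good - running_set),
        # Running but not on the good list
        "unexpected": sorted(running_set - good),
        # Running and explicitly on the bad list
        "forbidden": sorted(running_set & bad),
    }


# Example process names are assumptions, not taken from the op pack.
report = check_process_lists(
    running=["nginx", "xmrig"],
    good_list=["nginx", "fluentd"],
    bad_list=["xmrig"],
)
print(report)
```

In this example the log pusher is reported as missing, while the miner shows up twice: once because it is not on the good list and once because it is on the bad list.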
This alarm has two out of the box actions:
The op pack is configurable and includes a mechanism for counting the number of instances of each process that is running. For example, users can specify that 16 instances of a process should be running and then alert if only 15 are running. This allows users to detect a partial failure of an application.
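The instance-count check can be sketched in a few lines of Python. This is an illustration under assumptions, not the op pack's actual mechanism: `find_count_shortfalls` and the process name `worker` are hypothetical.

```python
from collections import Counter


def find_count_shortfalls(running, expected_counts):
    """Return {name: (expected, actual)} for processes running too few instances.

    Hypothetical helper for illustration only; not part of Shoreline.
    """
    actual = Counter(running)  # a missing name counts as 0 instances
    return {
        name: (expected, actual[name])
        for name, expected in expected_counts.items()
        if actual[name] < expected
    }


# 15 of the expected 16 workers are running: a partial failure.
running = ["worker"] * 15 + ["nginx"]
shortfalls = find_count_shortfalls(running, {"worker": 16, "nginx": 1})
print(shortfalls)
```

A simple "is it running?" check would pass here, because at least one `worker` exists. Comparing against the expected count is what surfaces the partial failure.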
Shoreline is also very flexible in how it identifies the processes using a matching algorithm with un-regular expressions. This is important because the name of a process instance can sometimes have a few variations.
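As a rough illustration of why pattern-based matching helps with name variations, the sketch below uses Python's standard `re` module. This is an assumption for demonstration; Shoreline's own matching algorithm and pattern syntax may differ, and the pattern and process names here are hypothetical.

```python
import re

# One hypothetical pattern that covers several spellings of the same process:
# "python", "python3", "python3.11", and so on.
pattern = re.compile(r"python3?(\.\d+)?")

names = ["python", "python3", "python3.11", "pythonw", "nginx"]

# fullmatch requires the whole name to match, so "pythonw" is excluded.
matches = [name for name in names if pattern.fullmatch(name)]
print(matches)
```

Matching on the whole name keeps the rule tight: one rule covers the expected variations without also catching unrelated processes that merely share a prefix.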
This op pack is platform agnostic, allowing teams to manage environments across clouds, Kubernetes and virtual machines and to create uniform rules across these environments. This reduces some of the complexity of managing hybrid or multi-cloud environments.