What's the Best Way to Effectively Monitor a Kubernetes Cluster?
Friday, 17 December 2021

Kubernetes provides an efficient, robust, and feature-rich platform to orchestrate containers at any scale. However, if not managed properly, Kubernetes can easily become unwieldy regardless of whether you are using a managed or a self-hosted Kubernetes cluster. We look at some of the tools and services that are available for Kubernetes monitoring.


Monitoring plays a crucial part in the overall Kubernetes management process, allowing users to identify performance bottlenecks and troubleshoot cluster-level or container-level issues. A plethora of monitoring options is available for Kubernetes (aka K8s). Let's look at some of these options and at the best practices to follow when monitoring a Kubernetes cluster.

Kubernetes is a distributed environment with resources spread across multiple nodes and even multiple clusters. A unified monitoring solution is therefore ideal, because it covers all aspects of the environment, from cluster metrics and container health to application logs.

The Kubernetes Dashboard, the project's own web UI, is an excellent starting point for K8s management and monitoring, and it will be more than enough for managing small to medium-sized clusters. Because it is a native application, users can deploy it on any cluster and start using it without complex configuration. The dashboard provides most of the features needed to manage a cluster, including the ability to deploy applications and modify cluster resources directly from the interface. You can create a comprehensive monitoring solution by combining this dashboard with a notification manager.

Prometheus / Grafana

Prometheus and Grafana are separate tools that go hand in hand when creating a comprehensive monitoring solution for K8s. Prometheus is a metrics aggregator with native K8s support, while Grafana is a data visualization solution. Prometheus can target the Kubernetes API (Prometheus Kubernetes SD) to gather K8s control plane, service, and node metrics, and it can use autodiscovery to gather application metrics. Moreover, Prometheus comes with built-in alerting mechanisms, and this alert data can then be visualized using Grafana for comprehensive monitoring.
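As a rough illustration of how the metrics Prometheus gathers can be consumed outside Grafana, the sketch below builds an instant-query URL for the Prometheus HTTP API (`/api/v1/query`) and parses a response in the documented vector format. The server address, the job name `kubernetes-nodes`, and the instance names are assumptions made for the example, and the response is canned so the code runs without a cluster.

```python
import json
from urllib.parse import urlencode

# Hypothetical address; replace with wherever your Prometheus server is reachable.
PROMETHEUS_URL = "http://localhost:9090"


def build_query_url(base_url: str, promql: str) -> str:
    """Return the instant-query URL for a PromQL expression."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


def parse_vector(payload: dict) -> dict:
    """Map each series' label set to its float value from an instant-query response."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    results = {}
    for series in data["result"]:
        labels = tuple(sorted(series["metric"].items()))
        _timestamp, value = series["value"]  # the sample value arrives as a string
        results[labels] = float(value)
    return results


# Canned response (illustrative data, not real measurements).
SAMPLE = json.loads("""
{"status": "success",
 "data": {"resultType": "vector",
          "result": [
            {"metric": {"__name__": "up", "job": "kubernetes-nodes", "instance": "node-1"},
             "value": [1639700000.0, "1"]},
            {"metric": {"__name__": "up", "job": "kubernetes-nodes", "instance": "node-2"},
             "value": [1639700000.0, "0"]}
          ]}}
""")

url = build_query_url(PROMETHEUS_URL, 'up{job="kubernetes-nodes"}')
down = [dict(labels)["instance"]
        for labels, value in parse_vector(SAMPLE).items() if value == 0.0]
print(url)
print("targets down:", down)
```

In a real setup the JSON would come from an HTTP GET on the printed URL; the `up` metric is 1 when a scrape target is reachable and 0 when it is not.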

ELK Stack

ELK has become a widely used choice for monitoring, including Kubernetes monitoring. The ELK stack (Elasticsearch, Logstash, and Kibana) provides an all-in-one monitoring solution for both metrics and logs in any environment. ELK can monitor Kubernetes and can also be integrated into the applications themselves, not just for log shipping but also as a comprehensive performance monitoring solution via Elastic Synthetics. Additionally, some features further expand its functionality, such as Elastic Fleet for easy deployments and integrations and Elastic Security for XDR. The downside of ELK is its steeper learning curve for configuration and use. Investing in it is therefore hard to justify for a small-scale deployment, while it can be an invaluable asset for large-scale deployments.
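To make the log-shipping side more concrete, here is a minimal sketch of an application emitting one JSON object per log line, a format that log shippers and Elasticsearch can index without custom parsing rules. It uses only Python's standard `logging` module; the logger name, the field names, and the `context` convention for extra fields are assumptions for the example rather than anything ELK requires.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Hypothetical convention: callers pass extra={"context": {...}}.
        context = getattr(record, "context", None)
        if context:
            entry.update(context)
        return json.dumps(entry)


def make_logger(name: str, stream) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    return logger


buffer = io.StringIO()  # stands in for stdout so the output can be inspected
log = make_logger("checkout", buffer)
log.info("order placed", extra={"context": {"order_id": 42, "pod": "checkout-7d9f"}})
print(buffer.getvalue(), end="")
```

In a container, the stream would normally be stdout, where the node's log collector can pick the lines up and forward them to the stack.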

When it comes to best practices for Kubernetes monitoring, most people forget that Kubernetes itself comes with a comprehensive monitoring toolkit. For example, the command-line tool kubectl allows users to inspect and manage cluster resources and view logs directly without relying on any external tool. Simple commands like kubectl describe and kubectl cluster-info dump can be invaluable for troubleshooting. Additionally, liveness, readiness, and startup probes can be used to monitor Pod and container status. For example, a liveness probe can send periodic HTTP requests to a container, and Kubernetes restarts the container when the probe keeps failing. Furthermore, features like audit policies allow users to create auditable environments, with the ability to store these events in a log or webhook backend. Therefore, always check whether there is a native solution for your needs before looking into third-party solutions, as native solutions tend to offer better compatibility and performance.
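The probes mentioned above are declared per container in the Pod spec. The sketch below builds such a manifest as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The Pod name, image, port, endpoint paths (`/healthz`, `/ready`), and timing values are placeholders chosen for illustration, not recommendations.

```python
import json


def http_probe(path: str, port: int, period_seconds: int = 10,
               failure_threshold: int = 3, initial_delay_seconds: int = 0) -> dict:
    """Build an HTTP GET probe definition using standard Pod spec field names."""
    return {
        "httpGet": {"path": path, "port": port},
        "initialDelaySeconds": initial_delay_seconds,
        "periodSeconds": period_seconds,
        "failureThreshold": failure_threshold,
    }


def pod_with_probes(name: str, image: str, port: int) -> dict:
    """Return a Pod manifest whose single container has all three probe types."""
    container = {
        "name": name,
        "image": image,
        "ports": [{"containerPort": port}],
        # Startup probe: allows up to 30 * 5 = 150 seconds for a slow start.
        "startupProbe": http_probe("/healthz", port, period_seconds=5,
                                   failure_threshold=30),
        # Liveness probe: the container is restarted after 3 consecutive failures.
        "livenessProbe": http_probe("/healthz", port),
        # Readiness probe: a failing Pod is taken out of Service endpoints.
        "readinessProbe": http_probe("/ready", port, period_seconds=5),
    }
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {"containers": [container]},
    }


# Placeholder name, image, and port for the example.
manifest = pod_with_probes("web", "example.com/web:1.0", 8080)
print(json.dumps(manifest, indent=2))
```

Saved to a file, the output could be applied with kubectl apply -f, after which kubectl describe pod shows probe failures among the Pod's events.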

The effectiveness of monitoring is greatly reduced if no alerts are configured. Alerts allow users to implement continuous monitoring and to be notified immediately of any unusual behavior in the cluster. Anything from resource metrics such as CPU, memory, disk utilization, and network performance to network events and application events can be used as a trigger for alerts. These alerts should then be delivered to the relevant parties so that they can act quickly.
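The idea of metric-triggered alerts routed to a responsible party can be sketched in a few lines. The example below is a simplified stand-in for what an alerting system does, not the implementation of any particular tool: a rule fires only when the last few samples all exceed a threshold, which avoids alerting on a single spike. The rule names, metric names, thresholds, sample values, and the `platform-team` recipient are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertRule:
    name: str
    metric: str
    threshold: float
    for_samples: int  # consecutive breaching samples required before firing
    notify: str       # hypothetical recipient label

    def __post_init__(self):
        if self.for_samples < 1:
            raise ValueError("for_samples must be at least 1")


def firing(rule: AlertRule, samples: list) -> bool:
    """True if the most recent `for_samples` samples all exceed the threshold."""
    if len(samples) < rule.for_samples:
        return False
    return all(value > rule.threshold for value in samples[-rule.for_samples:])


def evaluate(rules: list, history: dict) -> list:
    """Return one message per firing rule, addressed to the rule's recipient."""
    return [
        f"[{rule.notify}] {rule.name}: {rule.metric} > {rule.threshold}"
        for rule in rules
        if firing(rule, history.get(rule.metric, []))
    ]


RULES = [
    AlertRule("HighCPU", "cpu_percent", 90.0, 3, "platform-team"),
    AlertRule("HighMemory", "memory_percent", 85.0, 2, "platform-team"),
]

# Illustrative samples, oldest first.
HISTORY = {
    "cpu_percent": [70.0, 95.0, 96.0, 97.0],
    "memory_percent": [88.0, 80.0, 90.0],
}

alerts = evaluate(RULES, HISTORY)
for message in alerts:
    print(message)
```

Here only the CPU rule fires: its last three samples are all above 90, whereas memory dipped to 80 in one of its last two samples.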

K8s monitoring is not limited to the cluster and the resources inside it; it also includes any external resources associated with the cluster. In a self-hosted environment, this means that monitoring should also cover the underlying hardware and software that power the K8s cluster. In a cloud environment, these external resources can be user events, network performance, and cost analysis, which help manage the security, performance, and costs of the cluster.

There might be instances where an error in an application leads to errors within a cluster. For example, suppose an application is draining all the configured resources due to a resource management issue in the application. In that case, adding more resources to the cluster will not fix the issue. Thus, it is essential to consider applications deployed within a cluster as a part of the overall monitoring process and monitor both as a single entity to better understand the overall workloads of each.

In conclusion, while Kubernetes monitoring is a vast and complex subject, it is a must-have skill for any Kubernetes admin. However, the scope and features expected from monitoring depend on the size and complexity of the cluster as well as the underlying workloads. Users can quickly implement a monitoring solution that suits their exact needs by focusing on the best practices mentioned above. 



Related Articles

The DevOps Master Class - Go Behind The Concept

Grafana 7 Adds New Visualizations

Kubernetes for Full-Stack Developers


Last Updated ( Friday, 17 December 2021 )