Kubernetes cluster observability with Prometheus and Grafana on ESA HPC

Complex systems deployed on Kubernetes take advantage of multiple Kubernetes resources. Such deployments often consist of a number of namespaces, pods and many other entities, which contribute to consuming the cluster resources.

To allow proper insight into how the cluster resources are utilized, and enable optimizing their use, one needs a functional cluster observability setup.

In this article we will present the use of a popular open-source observability stack consisting of Prometheus and Grafana.

What Are We Going To Cover

  • Install Prometheus

  • Install Grafana

  • Add Prometheus as datasource to Grafana

  • Add cluster observability dashboard

Prerequisites

No. 1 Hosting

You need an ESA HPC hosting account with access to the Horizon interface: https://horizon.eohpc.net/auth/login/?next=/.

No. 2 A cluster created on EOHPC cloud

You need a running Kubernetes cluster. For guidance on creating one, refer to How to Create a Kubernetes Cluster Using ESA HPC OpenStack Magnum.

No. 3 Familiarity with Helm

For more information on using Helm and installing apps with Helm on Kubernetes, refer to Deploying Helm Charts on Magnum Kubernetes Clusters on ESA HPC EOHPC Cloud

No. 4 Access to kubectl command line

The instructions for setting up kubectl access are provided in How To Access Kubernetes Cluster Post Deployment Using Kubectl On ESA HPC OpenStack Magnum

1. Install Prometheus with Helm

Prometheus is an open-source monitoring and alerting toolkit, widely used in system administration and DevOps. It comes with a time series database, which can store metrics generated by a variety of other systems and software tools, and it provides a query language called PromQL to access this data efficiently. In our case, we will use Prometheus to access the metrics generated by our Kubernetes cluster.

We will use the Prometheus distribution delivered by Bitnami, so the first step is to add the Bitnami repository to our local Helm configuration. To do so, type in the following command:

helm repo add bitnami https://charts.bitnami.com/bitnami
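
After adding a repository, it can help to refresh the local index and check which chart versions it offers. The sketch below wraps the two standard Helm commands for this in a small helper. The function name refresh_and_list is made up for illustration and is not part of Helm.

```shell
# Refresh the local index of all configured Helm repositories, then list
# the available versions of the charts matching the given name.
# refresh_and_list is a hypothetical helper name, not part of Helm.
refresh_and_list() {
  helm repo update &&
    helm search repo "$1" --versions
}

# Usage (requires helm and network access):
#   refresh_and_list bitnami/kube-prometheus
```

If you need a reproducible setup, a version from that list can be passed to helm install with the --version flag.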

Next, install the Prometheus Helm chart:

helm install prometheus bitnami/kube-prometheus

With the above commands correctly applied, the result should be similar to the following:

NAME: prometheus
LAST DEPLOYED: Thu Nov  2 09:22:38 2023
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
CHART NAME: kube-prometheus
CHART VERSION: 8.21.2
APP VERSION: 0.68.0

Note that we are deploying the Helm chart to the default namespace for simplicity. For production, you might consider using a dedicated namespace.
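
As a rough illustration of the dedicated-namespace variant, the sketch below installs the same chart into its own namespace. The namespace name monitoring and the helper name install_prometheus are assumptions made for this example.

```shell
# "monitoring" is a hypothetical namespace name chosen for this sketch.
NAMESPACE="monitoring"

# Install the same chart into a dedicated namespace, creating the
# namespace first if it does not exist yet.
install_prometheus() {
  helm install prometheus bitnami/kube-prometheus \
    --namespace "$NAMESPACE" --create-namespace
}

# Usage (requires helm and cluster access):
#   install_prometheus
#   kubectl get pods --namespace "$NAMESPACE"
```

With this variant, every kubectl command in the rest of the article needs the matching --namespace flag, and the namespace part of any service address changes from default to the name you chose.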

Behind the scenes, several Prometheus pods are launched by the chart, which can be verified as follows:

kubectl get pods

...

NAME                                                            READY   STATUS    RESTARTS   AGE
alertmanager-prometheus-kube-prometheus-alertmanager-0          2/2     Running   0          2m39s
prometheus-kube-prometheus-blackbox-exporter-5cf8597545-22wxc   1/1     Running   0          2m51s
prometheus-kube-prometheus-operator-69584c98f-7wwrg             1/1     Running   0          2m51s
prometheus-kube-state-metrics-db4f67c5c-h77lb                   1/1     Running   0          2m51s
prometheus-node-exporter-8twzf                                  1/1     Running   0          2m51s
prometheus-node-exporter-sc8d7                                  1/1     Running   0          2m51s
prometheus-prometheus-kube-prometheus-prometheus-0              2/2     Running   0          2m39s
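
When the pod list is long, scanning the READY and STATUS columns by eye gets tedious. The following sketch filters the table printed by kubectl get pods and prints only the pods that are not fully ready. The helper name not_ready_pods is hypothetical, and the filter assumes the default column layout shown above.

```shell
# Print the names of pods that are not fully ready.
# Reads the table printed by `kubectl get pods` on stdin and assumes the
# default columns: NAME READY STATUS RESTARTS AGE.
# A pod is reported when its READY counts differ (e.g. 1/2) or its STATUS
# is not "Running" (so pods of finished jobs are reported too).
not_ready_pods() {
  awk 'NR > 1 && NF >= 3 {
    split($2, ready, "/")
    if (ready[1] != ready[2] || $3 != "Running") print $1
  }'
}

# Usage (requires cluster access):
#   kubectl get pods | not_ready_pods
```

An empty result means every listed pod is running with all of its containers ready.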

Similarly, several dedicated Kubernetes services are deployed. The service prometheus-kube-prometheus-prometheus exposes the Prometheus dashboard. To make this service reachable from your browser on the default port 9090, type in the following command:

kubectl port-forward svc/prometheus-kube-prometheus-prometheus 9090:9090

Then open localhost:9090 in your browser to see a result similar to the following:

../_images/image2023-11-7_13-17-10.png

Note that when you start typing kube in the search field, autocomplete suggests some of the metrics available from our Kubernetes cluster. The Helm chart installation exposed these metrics to Prometheus, so they are stored in the Prometheus database and can be queried.

../_images/image2023-11-7_13-22-56.png

You can select one of the metrics and hit the Execute button to run a query for that metric. For example, insert the following expression

kube_pod_info{namespace="default"}

to query for all pods in the default namespace. (Further elaboration about the capabilities of Prometheus GUI and PromQL syntax is beyond the scope of this article.)

../_images/image2023-11-7_13-40-1.png
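
The same query can also be sent from the command line through the Prometheus HTTP API, which is handy for scripting. The sketch below assumes the port-forward from this section is still running on localhost:9090. The helper name prom_query is made up for this example.

```shell
# Base URL of the port-forwarded Prometheus service from this section.
PROM_URL="http://localhost:9090"

# Run an instant PromQL query through the Prometheus HTTP API.
# -G sends the URL-encoded data as a GET query string.
# prom_query is a hypothetical helper name.
prom_query() {
  curl -sG "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}

# Usage (with the port-forward still running):
#   prom_query 'kube_pod_info{namespace="default"}'
#   prom_query 'count(kube_pod_info{namespace="default"})'
```

The response is a JSON document. The second usage line wraps the expression in count() to return the number of pods instead of one series per pod.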

2. Install Grafana

The next step is to install Grafana. We already added the Bitnami repository when installing Prometheus, so the Grafana chart is available from the same repository. We only need to install it.

Note that the port-forward command keeps running in its terminal. If you want to keep the Prometheus browser session from the previous step active, open another terminal for the installation steps below.

By default, the Grafana chart is installed with a random auto-generated admin password. We can override one of the Helm settings to define our own password. In this case we use ownpassword, for simplicity of the demo:

helm install grafana bitnami/grafana --set admin.password=ownpassword

If you prefer to stick to the defaults, instead of the above command, use the following commands to install the chart and extract the auto-generated password:

helm install grafana bitnami/grafana
echo "Password: $(kubectl get secret grafana-admin --namespace default -o jsonpath="{.data.GF_SECURITY_ADMIN_PASSWORD}" | base64 -d)"
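
A middle ground between a fixed demo password and the chart's auto-generated one is to generate a random password locally and pass it to the chart. The sketch below is one way to do that. It reuses the admin.password setting shown earlier, and the variable name GRAFANA_PASSWORD and the helper name install_grafana are assumptions for this example.

```shell
# Generate a random password: 18 random bytes encode to 24 base64 characters.
GRAFANA_PASSWORD="$(head -c 18 /dev/urandom | base64)"

# Install Grafana with that password, using the same admin.password
# setting as above. install_grafana is a hypothetical helper name.
install_grafana() {
  helm install grafana bitnami/grafana \
    --set "admin.password=${GRAFANA_PASSWORD}"
}

# Usage (requires helm and cluster access):
#   install_grafana
#   echo "Password: ${GRAFANA_PASSWORD}"
```

Use this in place of the two install commands above, not in addition to them, since a release name can only be installed once.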

The chart installation creates a single pod. Wait until this pod is ready before proceeding with the next steps:

kubectl get pods

NAME                                                            READY   STATUS    RESTARTS   AGE
...
grafana-fb6877dbc-5jvjc                                         1/1     Running   0          65s
...
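
Instead of re-running kubectl get pods until the pod shows 1/1, you can let kubectl block until the rollout completes. In the sketch below, the Deployment name grafana is an assumption inferred from the release name and the pod name above, and wait_for_grafana is a hypothetical helper name.

```shell
# Block until the Grafana Deployment has finished rolling out, or give up
# after the timeout (default 180s, overridable via the first argument).
# The Deployment name "grafana" is an assumption based on the pod name.
wait_for_grafana() {
  kubectl rollout status deployment/grafana \
    --namespace default --timeout="${1:-180s}"
}

# Usage (requires cluster access):
#   wait_for_grafana
#   wait_for_grafana 300s
```

If the Deployment has a different name on your cluster, kubectl get deployments lists the actual one.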

Now, as with Prometheus, we can access the Grafana dashboard locally in the browser by using the port-forward command:

kubectl port-forward svc/grafana 8080:3000

Then access the Grafana dashboard by entering localhost:8080 in the browser:

../_images/image2023-11-7_14-24-1.png

Log in with the username admin and the password ownpassword (or the auto-generated password you extracted in the earlier step).
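
If the login page does not load, a quick command-line check can tell whether Grafana is reachable through the port-forward at all. The sketch below calls Grafana's health endpoint. It assumes the port-forward to localhost:8080 is running, and grafana_health is a hypothetical helper name.

```shell
# Address of the port-forwarded Grafana service from this section.
GRAFANA_URL="http://localhost:8080"

# Query Grafana's health endpoint; a healthy instance answers with a
# small JSON document. grafana_health is a hypothetical helper name.
grafana_health() {
  curl -s "${GRAFANA_URL}/api/health"
}

# Usage (with the port-forward still running):
#   grafana_health
```

No response at all usually points to the port-forward not running, not to a problem with Grafana itself.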

3. Add Prometheus as datasource to Grafana

In this step we will set up Grafana to use our Prometheus installation as a datasource.

To proceed, click on the Home menu in the upper left corner of the Grafana UI, select Connections and then Data sources:

../_images/image2023-11-7_14-47-56.png

Then select Add data source and choose Prometheus as the datasource type. The following screen appears:

../_images/image2023-11-7_14-54-8.png

Change only the “Prometheus server URL” field to http://prometheus-kube-prometheus-prometheus.default.svc.cluster.local:9090, which is the in-cluster address of the Prometheus Kubernetes service in charge of exposing the metrics.
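
The address follows the usual Kubernetes pattern for service DNS names: service name, namespace, then svc and the cluster domain. The sketch below shows how it is composed, which is useful if you installed Prometheus into a different namespace. The helper name svc_url is hypothetical, and cluster.local is the default cluster domain, which may differ on clusters configured otherwise.

```shell
# Compose the in-cluster URL of a Service:
#   http://<service>.<namespace>.svc.cluster.local:<port>
# svc_url is a hypothetical helper; cluster.local is the default cluster
# domain and is assumed here.
svc_url() {
  printf 'http://%s.%s.svc.cluster.local:%s\n' "$1" "$2" "$3"
}

svc_url prometheus-kube-prometheus-prometheus default 9090
# prints http://prometheus-kube-prometheus-prometheus.default.svc.cluster.local:9090
```

This name resolves only from inside the cluster, which is fine here because the Grafana pod, not your browser, connects to Prometheus.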

Hit the Save and test button. If all went well, you will see the following screen:

../_images/image2023-11-7_15-1-59.png
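
The same datasource can also be created without the UI, through the Grafana HTTP API. The sketch below is a minimal version under a few assumptions: Grafana is port-forwarded to localhost:8080 as in the previous section, the admin password is passed as an argument, and the datasource is named Prometheus. The helper names datasource_payload and add_datasource are made up for this example.

```shell
GRAFANA_URL="http://localhost:8080"
PROM_SVC_URL="http://prometheus-kube-prometheus-prometheus.default.svc.cluster.local:9090"

# JSON body for the Grafana data source API. The name "Prometheus" is
# chosen for this sketch; "proxy" access means the Grafana server, not
# the browser, connects to Prometheus.
datasource_payload() {
  printf '{"name":"Prometheus","type":"prometheus","url":"%s","access":"proxy"}' \
    "$PROM_SVC_URL"
}

# Create the data source. $1 is the Grafana admin password.
add_datasource() {
  curl -s -u "admin:$1" -H "Content-Type: application/json" \
    -X POST "${GRAFANA_URL}/api/datasources" -d "$(datasource_payload)"
}

# Usage (with the Grafana port-forward still running):
#   add_datasource ownpassword
```

If you already saved the datasource in the UI, there is no need to run this as well. Grafana is expected to reject a second datasource with the same name.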

4. Add cluster observability dashboard

We could build a Kubernetes observability dashboard from scratch, but we will instead use one of the open-source dashboards already available.

To proceed, select the Dashboards section from the collapsible menu in the top left corner and click Import:

../_images/image2023-11-7_15-15-23.png

Then, in the import via grafana.com field, enter 10000, which is the ID of the Kubernetes observability dashboard published on grafana.com at https://grafana.com/grafana/dashboards/10000-kubernetes-cluster-monitoring-via-prometheus/

../_images/image2023-11-7_15-16-10.png

Another screen then appears, as shown below. Change the data source to Prometheus and hit the Import button:

../_images/importdashboard.png

As a result, the Grafana Kubernetes observability dashboard is populated with data:

../_images/image2023-11-7_15-38-40.png

What To Do Next

You can find and import many other dashboards for Kubernetes observability by browsing https://grafana.com/grafana/dashboards/. Some examples are the dashboards with IDs 315, 15758 and 15761.

The following article shows another approach to creating a Kubernetes dashboard:

Using Dashboard To Access Kubernetes Cluster Post Deployment On ESA HPC OpenStack Magnum