AKS Monitoring and Management

This article gives you a high-level overview of the options for monitoring Azure resources, and AKS clusters in particular.

Published: 06 September 2021

The tools described in this article are just the tip of the iceberg: there are many more monitoring tools on the market. Setting up monitoring takes time and effort. It is best to include monitoring in the design and development of your application and to keep refining and improving it as part of your DevOps process.

Why should we monitor at all?

Suppose you have built a web shop application and divided it into different microservices (small, independent services), such as a microservice to add items to an order, a checkout process to place the order and a payment process so that your customers can pay for the items. Each of these microservices is pivotal to the proper functioning of your web shop. But what if one of them fails and you don't realize it? Customers who cannot complete their order will go elsewhere, and for as long as your web shop is unavailable you lose revenue to the competition.

That is where monitoring can help you. Monitoring tells you about the availability of your application, the health of your infrastructure, the performance of your applications and, ultimately, the health of your business.

Before you start

If you have followed the article “Azure Kubernetes cluster set up”, you should now have your first Azure Kubernetes Service cluster (referred to below as AKS) up and running, probably with one or more applications running on it. Very good and well done! But what should be the next logical step for your AKS environment? I have a suggestion for you: monitoring and a bit of management.

So let’s start with the basics. For this we need to install three command-line tools.

  • az cli: The Azure CLI is a set of commands used to create and manage Azure resources. You can download it here; it is available for Windows, Linux and macOS.
  • kubectl: The command-line tool to control and manage your AKS environment. You can download it from here; it is also available for macOS, Linux and Windows.
  • helm: Helps you manage Kubernetes applications. Helm charts help you define, install and upgrade even the most complex Kubernetes application. You can download helm from here; it is available for Linux, Windows and macOS.

When you deploy AKS, a metrics server and dashboard are installed along with it. To see them you first need to connect to your cluster, and this is what we need the az command for. Open a command shell and run the following commands:
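A minimal sketch of those commands, assuming hypothetical resource group and cluster names (myResourceGroup and myAKSCluster); substitute your own. The commands are wrapped in a function, connect_to_aks (also a hypothetical name), so that nothing runs until you call it:

```shell
# Hypothetical names: replace with your own resource group and cluster.
RESOURCE_GROUP="myResourceGroup"
CLUSTER_NAME="myAKSCluster"

# Log in to Azure, then merge the cluster credentials into your kubeconfig
# so that kubectl can talk to the cluster.
connect_to_aks() {
  az login
  az aks get-credentials --resource-group "$RESOURCE_GROUP" --name "$CLUSTER_NAME"
}
```

Call connect_to_aks once per workstation. If you work with several subscriptions, you would add --subscription followed by the subscription name or ID to the az aks get-credentials line.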

‘az login’ opens a browser session to log in to Microsoft Azure; you can close that browser session once you are logged in. The ‘az aks’ command is used to connect to your AKS cluster. If you have only one subscription, you won’t need the --subscription option. To test whether the connection was set up successfully, you can run the following command:
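A sketch of such a check, assuming kubectl cluster-info is the command meant here (its output matches the description that follows); the function name is hypothetical:

```shell
# Print the endpoints of the control plane and the core cluster services.
# The last line of the output suggests 'kubectl cluster-info dump'
# for further debugging.
check_cluster_connection() {
  kubectl cluster-info
}
```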

The output should tell you that the master, replicaset service, CoreDNS and Metrics server are running. It also gives a hint on how to diagnose cluster problems.

Metrics server

Once we are all set, we can start with the absolute basics of monitoring the cluster. As mentioned before, the metrics server comes with the AKS installation and can tell you about the memory and CPU usage of nodes and pods. The metrics server is also used for Horizontal Pod Autoscaling, but that is outside the scope of this article. You can query the metrics server with the following commands:
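A minimal sketch using kubectl top, which reads its figures from the metrics server. The function name is hypothetical, and the --all-namespaces flag is my choice to show every pod rather than only those in the current namespace:

```shell
# Show current CPU and memory usage as reported by the metrics server.
show_resource_usage() {
  kubectl top nodes                   # usage per node
  kubectl top pods --all-namespaces   # usage per pod, across all namespaces
}
```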

Azure Monitor

Azure Monitor maximizes the availability and performance of your applications and services by delivering a comprehensive solution for collecting, analyzing, and acting on telemetry from your cloud and on-premises environments.

Alternatively, you can create your own workbooks based on your monitoring requirements.

In this pane you can drill further into Cluster Performance, Disk IO and Capacity and Network.

Another way to get there and see more about the health of your AKS cluster is to navigate in the Azure portal to Kubernetes services -> your cluster -> Insights.

Give it a try and navigate around. Make sure you don’t miss the “View Workbooks” dropdown.
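If the Insights pane shows no data, the monitoring add-on may not be enabled on the cluster yet. A minimal sketch of enabling it from the command line, assuming the hypothetical resource group and cluster names used as defaults below and a hypothetical function name; with no further options, Azure should fall back to a default Log Analytics workspace:

```shell
# Enable the monitoring add-on (Container insights) on an existing cluster.
# $1 = resource group, $2 = cluster name; the defaults are placeholders.
enable_container_insights() {
  az aks enable-addons --addons monitoring \
    --resource-group "${1:-myResourceGroup}" \
    --name "${2:-myAKSCluster}"
}
```

For example, enable_container_insights shop-rg shop-aks would enable the add-on on a cluster named shop-aks in resource group shop-rg.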

To drill down further into your application, consider installing a small instrumentation package (SDK) in your application. You can read all about it at this location.

Alternative third-party tools

There are a lot of third-party tools (paid solutions) on the market for full-stack (end-to-end) monitoring in Azure, covering AKS clusters as well as on-premises resources. To name a few: New Relic, Datadog or Dynatrace. These are so-called Application Performance Management (APM) tools, sometimes also known as Application Performance Monitoring tools. APM means keeping an eye on applications from the user's perspective as well as on the back end, in order to detect problems and bottlenecks in the applications early.

You may already have one of those tools, or similar tooling, in use in your current environment. Check the vendor's website to see whether it supports Azure or offers an integration for it.

Weave Scope

Another nice Open Source tool is Weave Scope. Weave Scope is a visualization and monitoring tool for Docker and Kubernetes and can also be used to diagnose problems. To install Weave Scope run the following command:
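A sketch of the installation, assuming the Kubernetes manifest published with the Weave Scope 1.13.2 release on GitHub; the URL and the function name are my assumptions, so verify the manifest location against the Weave Scope project before applying it:

```shell
# Assumption: manifest attached to the v1.13.2 release of weaveworks/scope.
SCOPE_MANIFEST_URL="https://github.com/weaveworks/scope/releases/download/v1.13.2/examples/k8s/complete.yaml"

# Apply the manifest; pass a different URL or file as $1 to override it.
install_weave_scope() {
  kubectl apply -f "${1:-$SCOPE_MANIFEST_URL}"
}
```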

To open Weave Scope, run the command below and then open a browser session to localhost:4040.
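A minimal sketch using a port-forward, assuming Scope was installed into the weave namespace and that its app pod carries the label weave-scope-component=app (both are what the project's standard manifest uses, as far as I know); the function name is hypothetical:

```shell
# Look up the Scope app pod and forward local port 4040 to it.
# The port-forward stays in the foreground; stop it with Ctrl+C.
open_weave_scope() {
  scope_pod="$(kubectl get pod -n weave \
    --selector=weave-scope-component=app \
    -o jsonpath='{.items..metadata.name}')"
  kubectl port-forward -n weave "$scope_pod" 4040
}
```

Because the port-forward only listens on your own machine, the Scope user interface is never exposed outside it.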

Note: Do not expose the Scope service to the Internet, e.g. by changing the type to NodePort or LoadBalancer. Scope allows anyone with access to the user interface control over your hosts and containers.

Prometheus and Grafana

A combination that is often used is Prometheus and Grafana. Prometheus is a time-series database, or metrics server, and Grafana is used to visualize metrics from different data sources, including Prometheus. To install Prometheus and Grafana we use helm. You can find the required helm charts across the internet. For this example I used the helm repository grafana and only ran it locally.

You can run the following steps:
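A sketch of what such steps could look like. It assumes the Prometheus chart from the prometheus-community repository next to the grafana repository, release names prometheus and grafana, and a hypothetical namespace called monitoring; the function names are hypothetical too:

```shell
MONITORING_NAMESPACE="monitoring"   # hypothetical namespace name

# Register the chart repositories and install both charts.
install_prometheus_and_grafana() {
  helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
  helm repo add grafana https://grafana.github.io/helm-charts
  helm repo update
  helm install prometheus prometheus-community/prometheus \
    --namespace "$MONITORING_NAMESPACE" --create-namespace
  helm install grafana grafana/grafana \
    --namespace "$MONITORING_NAMESPACE" --create-namespace
}

# Read the generated Grafana admin password from the chart's secret.
grafana_admin_password() {
  kubectl get secret grafana --namespace "$MONITORING_NAMESPACE" \
    -o jsonpath='{.data.admin-password}' | base64 --decode
}

# Make Grafana reachable on http://localhost:3000 (local use only).
forward_grafana() {
  kubectl port-forward --namespace "$MONITORING_NAMESPACE" svc/grafana 3000:80
}
```

Run install_prometheus_and_grafana once, then forward_grafana, and log in as admin with the password printed by grafana_admin_password. With these release names the Prometheus chart typically exposes its server as the service prometheus-server in the same namespace, which is useful to know when you point Grafana at Prometheus.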

You must add Prometheus as a data source in Grafana. You can read here how to add Prometheus as a data source. Once you have added the data source, you are ready to deploy your first dashboard. When you are confident that both solutions work, you can consider creating an ingress controller to serve Grafana to the outside world, using Grafana as your front end and Prometheus as the back end.

There is a large community developing dashboards that you can use in your Grafana environment. You can find them at this location and you can read here how to import dashboards into Grafana. Of course, you can also create your own dashboards. Furthermore, Prometheus is not the only data source you can add to Grafana; others include Azure Monitor, MySQL, Graphite and Elasticsearch.

Management with kubectl

To manage your AKS cluster you will have to work with the command-line tool kubectl. However, for some time now you have also been able to manage resources through the Azure portal. Please note that this functionality is still in preview, which means that it shouldn’t be used against a production environment.

If you want to get started or become familiar with kubectl, the best place to begin is here, at the kubectl cheat sheet.
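As a taste of what kubectl offers, here are a few everyday commands, sketched with hypothetical function names and the default namespace as a fallback:

```shell
# List the nodes, then the pods and deployments in one namespace.
# $1 = namespace (falls back to "default").
inspect_workloads() {
  ns="${1:-default}"
  kubectl get nodes -o wide
  kubectl get pods --namespace "$ns"
  kubectl get deployments --namespace "$ns"
}

# Show the details and the last 50 log lines of a single pod.
# $1 = pod name, $2 = namespace (falls back to "default").
pod_details() {
  kubectl describe pod "$1" --namespace "${2:-default}"
  kubectl logs "$1" --namespace "${2:-default}" --tail=50
}
```

For example, inspect_workloads shop lists the workloads in a namespace called shop, and pod_details web-1 shop digs into one of its pods (both names are made up for illustration).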

One last word ...

If you managed to read this article to its end then you should have a good basis to start monitoring your AKS cluster. Of course you are not bound to one particular product or tool mentioned in this article. If you are happy with product X that’s fine. If you are happy using XYZ or any other combination that’s fine as well. The bottom line is that if you want to ensure that your applications are available for your business you need some kind of monitoring!

Intercept wishes you good luck with implementing monitoring and if you have any questions, feel free to contact us.


This article is part of a series 

This is the last article of the series. Want to read the earlier articles? Check here:

1. The evolution of AKS
2. Hybrid deployments with Kubernetes
3. Microservices on AKS
4. Update scenario's AKS
5. Linux vs. Windows containers
6. Security on AKS
7. Ingress, Services, Pods & Namespaces

