What are nodes?
Nodes are the virtual machines that host your pods, the units for running containerised applications.
Scaling on AKS
Scaling on AKS can be done with several different technologies; which one fits depends on the needs and scenarios of your applications. First, there are two things to consider when talking about scaling on AKS: we must scale the nodes (1) and the pods (2).
Let's break this down further in detail:
1) Scaling the nodes means adding or removing nodes from the cluster.
2) Scaling the pods means changing the number of pod replicas across the nodes.
Yet, picking the wrong one can cause everything you do not want: performance issues, inefficiencies, and even downtime. So, with that said, let's look at the technologies.
Cluster autoscaler
“Imagine your application always running smoothly, without resource issues.”
That’s what the Cluster Autoscaler brings you.
The Cluster Autoscaler is a vital component. It adjusts the node pool's size based on pod demand and availability, ensuring your application always has the resources it needs to perform at its best.
It monitors pending pods that can't be scheduled due to a lack of resources. When the cluster needs more nodes, it adds them to the node pool. But it doesn't stop there. The Cluster Autoscaler also removes underused nodes whose remaining pods can be scheduled elsewhere. This not only cuts costs but also boosts the cluster's efficiency. Less cost, more efficiency: isn't that what you want?
The AKS control plane lets the Cluster Autoscaler talk to Azure. This automation scales the Virtual Machine Scale Sets that back your node pools, without you lifting a single finger.
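As a sketch, enabling the Cluster Autoscaler on an existing node pool looks like this with the Azure CLI (the resource group, cluster, and node pool names are placeholders; adjust the node counts to your workload):

```shell
# Enable the cluster autoscaler on an existing AKS node pool,
# letting it scale the pool between 1 and 5 nodes.
az aks nodepool update \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --name nodepool1 \
  --enable-cluster-autoscaler \
  --min-count 1 \
  --max-count 5
```

From that point on, Azure adds nodes when pods are pending and removes them when they sit underused.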
Horizontal Pod Autoscaler
The Horizontal Pod Autoscaler (HPA) scales the number of pod replicas based on the CPU usage or custom metrics of the pods.
Let’s explain: You set a target metric value and minimum and maximum replica counts for each deployment or replica set. Then the HPA monitors these metrics and changes the replica count.
Here is an example: if the pods' average CPU use exceeds 80%, the HPA adds replicas until the average drops back below 80%.
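That 80% example maps directly onto an HPA manifest. A minimal sketch (the Deployment name `web` and the replica bounds are placeholders):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web            # hypothetical Deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80   # target average CPU across replicas
```

Apply it with `kubectl apply -f`, and the HPA keeps the replica count between 2 and 10 while steering average CPU towards 80%.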
Kubernetes Event-driven Autoscaling (KEDA)
Kubernetes Event-driven Autoscaling (KEDA) is an open-source project that extends the Horizontal Pod Autoscaler to support event-driven and serverless apps. It scales the pod replicas based on the number of events or messages waiting in an event source.
Such sources include:
● Azure Queue Storage
● Azure Event Hubs
● Kafka
● RabbitMQ
Besides, it also supports scale-to-zero. Are there no events? Then KEDA scales the pod replicas down to zero, and scales them back up when events arrive.
The best part? This optimizes resource use and cluster cost.
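As an illustration, a KEDA ScaledObject watching an Azure Storage queue might look like this (the deployment name, queue name, and connection variable are hypothetical; `minReplicaCount: 0` is what gives you scale-to-zero):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: queue-worker-scaler
spec:
  scaleTargetRef:
    name: queue-worker          # hypothetical Deployment to scale
  minReplicaCount: 0            # scale to zero when the queue is empty
  maxReplicaCount: 20
  triggers:
    - type: azure-queue
      metadata:
        queueName: orders
        queueLength: "5"        # target messages per replica
        connectionFromEnv: AZURE_STORAGE_CONNECTION_STRING
```

KEDA polls the queue, and the worker pods only exist while there is work to do.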
Node Auto Provisioning (preview)
Node Auto Provisioning (or NAP for short) is a new technology that's currently in preview. This feature provisions virtual machines automatically.
How? Well, NAP replaces manually scaling a node pool or a Virtual Machine Scale Set.
NAP can look at a deployment's needs, then provision a machine with the right CPU and memory. The machine is deallocated when it's no longer needed. Although some may argue it isn't a "common" technology yet, it is without a doubt promising!
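Because NAP is in preview at the time of writing, the exact flags may still change. With the aks-preview CLI extension installed, creating a cluster with NAP enabled looks roughly like this (resource group and cluster names are placeholders):

```shell
# Preview feature: flag names may change before general availability.
az extension add --name aks-preview

# Create a cluster with node auto provisioning enabled.
az aks create \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --node-provisioning-mode Auto \
  --network-plugin azure \
  --network-plugin-mode overlay \
  --network-dataplane cilium
```

With this mode, you no longer size node pools yourself; AKS provisions machines that fit the pending pods' CPU and memory requests.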
Closing thoughts
Yes, there are more scaling technologies out there... but for now, we've highlighted the most common ones we see with our customers. As a user, you can choose the best technology for your use case, or combine them for optimal cluster scaling.