Autoscaling in EKS

Failure or Failover? None I say!

Eliminate the hassle of system failure or failover by implementing autoscaling to maximize the availability of your pods and cluster in Elastic Kubernetes Service (EKS). This can be achieved at two levels:

  • Horizontal Pod Autoscaler
  • Cluster Autoscaler

Horizontal Pod Autoscaler (HPA) adjusts the number of pod replicas in a deployment based on the observed CPU utilization. A deployment running 2 replicas at 100% CPU utilization has no headroom and is likely to fail or drop requests during a sudden spike in traffic. Permanently running a larger number of replicas is inefficient when resource utilization and user traffic fluctuate, so HPA is the more reliable choice. HPA is configured with a target CPU utilization (say 50%): it scales out when the average utilization rises above the target and scales in when the utilization falls back below it.
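
The scaling rule can be sketched in a few lines of shell. This is a simplified illustration of the rule described in the Kubernetes documentation (desired replicas = ceil(current replicas x current utilization / target utilization), clamped to the min/max bounds). The function name and arguments are made up for this example, and the real controller additionally applies a tolerance band and stabilization windows that are left out here. Utilization is measured relative to the CPU requests of the pods.

```shell
# Simplified sketch of the HPA rule; hpa_desired_replicas is a hypothetical helper.
# Arguments: current replicas, current CPU utilization (%), target utilization (%), min, max
hpa_desired_replicas() {
  local current=$1 utilization=$2 target=$3 min=$4 max=$5
  # Integer ceiling division: ceil(a / b) == (a + b - 1) / b
  local desired=$(( (current * utilization + target - 1) / target ))
  # Clamp the result to the configured bounds
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

hpa_desired_replicas 2 100 50 1 5   # 2 replicas at 100% CPU, 50% target: prints 4
hpa_desired_replicas 4 20 50 1 5    # 4 replicas at 20% CPU, 50% target: prints 2
```

With a 50% target, the 2 replicas at 100% from the example above would be doubled to 4, and scaled back in once the load drops.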

Steps to set up horizontal pod autoscaler

# Download the metrics server using the command below
$ wget -O v0.3.6.tar.gz https://codeload.github.com/kubernetes-sigs/metrics-server/tar.gz/v0.3.6

# Extract the archive
$ tar -xzf v0.3.6.tar.gz

# Apply all the YAML manifests in the metrics-server-0.3.6/deploy/1.8+ directory
$ kubectl apply -f metrics-server-0.3.6/deploy/1.8+/

# Verify that the metrics server is running
$ kubectl get deployment metrics-server -n kube-system

# Create an HPA resource for your deployment
$ kubectl autoscale deployment <deployment-name> --cpu-percent=50 --min=1 --max=5

# Check HPA for the deployment
$ kubectl get hpa
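
The same autoscaler can also be kept in version control as a manifest instead of being created imperatively. The following is a sketch of what should be an equivalent of the kubectl autoscale command above, under two assumptions: the deployment name my-app is a placeholder, and the cluster serves the autoscaling/v2 API (Kubernetes 1.23 or later; older clusters would need autoscaling/v2beta2 instead).

```shell
# Hypothetical deployment name; replace with your own.
DEPLOYMENT="my-app"

# Write an HPA manifest mirroring:
#   kubectl autoscale deployment <deployment-name> --cpu-percent=50 --min=1 --max=5
cat > hpa.yaml <<EOF
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ${DEPLOYMENT}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ${DEPLOYMENT}
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
EOF

# Apply it with: kubectl apply -f hpa.yaml
```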


Cluster Autoscaler maximizes the availability of your infrastructure by adding or removing worker nodes in your Kubernetes cluster. It scales out when pods fail to launch due to insufficient resources: a pod stuck in the Pending state triggers the addition of a node to accommodate it. When nodes in the cluster are underutilized and their pods can be rescheduled onto other nodes, the cluster is scaled in.

Steps to set up cluster autoscaler in Elastic Kubernetes Service
Node Group IAM Policy

# Add the inline policy below to the Node Group IAM role: go to the EC2 node >> click on its IAM role
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        ...
      ],
      "Resource": "*",
      "Effect": "Allow"
    }
  ]
}

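The action list is missing from the policy shown above. As a sketch, the block below writes out a complete policy document using the set of actions commonly documented for Cluster Autoscaler on AWS. Treat that list as an assumption and check it against the Cluster Autoscaler documentation for your version. The file name, role name and policy name are placeholders.

```shell
# Write the policy document to a file (action list is an assumption, see above).
cat > cluster-autoscaler-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeTags",
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup",
        "ec2:DescribeLaunchTemplateVersions"
      ],
      "Resource": "*",
      "Effect": "Allow"
    }
  ]
}
EOF

# Attach it as an inline policy from the CLI instead of the console
# (role and policy names are placeholders):
# aws iam put-role-policy --role-name <node-group-role-name> \
#   --policy-name ClusterAutoscalerPolicy \
#   --policy-document file://cluster-autoscaler-policy.json
```
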
Auto Scaling Group Tags

# Attach the following tags to your node group Auto Scaling groups
k8s.io/cluster-autoscaler/<cluster-name>: owned
k8s.io/cluster-autoscaler/enabled: true
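
If you prefer the AWS CLI to the console, the two tags can be passed to aws autoscaling create-or-update-tags. The helper below only builds the tag arguments in the CLI shorthand syntax. Its name, the Auto Scaling group name and the cluster name are placeholders, and the PropagateAtLaunch value is an assumption for this sketch.

```shell
# Hypothetical helper: prints one shorthand tag definition per line.
# Arguments: Auto Scaling group name, EKS cluster name
asg_tag_args() {
  local asg=$1 cluster=$2
  echo "ResourceId=${asg},ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/${cluster},Value=owned,PropagateAtLaunch=true"
  echo "ResourceId=${asg},ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/enabled,Value=true,PropagateAtLaunch=true"
}

# Preview the arguments (my-asg and my-cluster are placeholders)
asg_tag_args my-asg my-cluster

# Then pass them to the CLI:
# aws autoscaling create-or-update-tags --tags $(asg_tag_args my-asg my-cluster)
```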

Deploy the Cluster Autoscaler

# Download the cluster autoscaler YAML using the command below
$ wget https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml

# Edit the cluster-autoscaler-autodiscover.yaml file
$ vi cluster-autoscaler-autodiscover.yaml
# In the first line below, replace <YOUR CLUSTER NAME> with your EKS cluster's name, then add the two options that follow it.
- --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<YOUR CLUSTER NAME>
- --balance-similar-node-groups
- --skip-nodes-with-system-pods=false
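
The placeholder substitution can also be scripted instead of done by hand in vi. This is a minimal sketch: set_cluster_name is a hypothetical helper, my-eks-cluster is a placeholder, and the helper assumes the cluster name contains no characters that are special to sed (such as / or &). It only replaces the placeholder; the two extra options still have to be added to the container command.

```shell
# Replace the <YOUR CLUSTER NAME> placeholder in a manifest file.
# Usage: set_cluster_name <file> <cluster-name>
set_cluster_name() {
  local file=$1 name=$2
  # Write to a temp file and move it into place, which avoids the
  # differing `sed -i` syntax between GNU and BSD sed.
  sed "s/<YOUR CLUSTER NAME>/${name}/g" "$file" > "${file}.tmp" && mv "${file}.tmp" "$file"
}

# Example:
# set_cluster_name cluster-autoscaler-autodiscover.yaml my-eks-cluster
```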

# Create a deployment
$ kubectl apply -f cluster-autoscaler-autodiscover.yaml

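# Add the cluster-autoscaler.kubernetes.io/safe-to-evict annotation to the deployment's pod template with the following command.
# (The annotation is read from pods, so it must be set on the pod template rather than on the deployment object itself.)
$ kubectl -n kube-system patch deployment cluster-autoscaler -p '{"spec":{"template":{"metadata":{"annotations":{"cluster-autoscaler.kubernetes.io/safe-to-evict":"false"}}}}}'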

# View the Cluster Autoscaler logs to verify that autoscaling is working
$ kubectl -n kube-system logs -f deployment.apps/cluster-autoscaler

You have completed the initial setup of your Kubernetes cluster. Now, let's test the autoscaling configuration.

Test your Autoscaling configuration

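# Create a simple apache web server deployment
# (on recent kubectl versions, "kubectl run" creates a bare pod rather than a deployment, so the autoscale step would fail)
$ kubectl create deployment httpd --image=httpd --port=80
$ kubectl set resources deployment httpd --requests=cpu=100m --limits=cpu=200m
$ kubectl expose deployment httpd --port=80
# Create an HPA resource for the apache deployment
$ kubectl autoscale deployment httpd --cpu-percent=50 --min=1 --max=5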
# Open a new terminal session to create a load on the web server
$ kubectl run apache-bench -i --tty --rm --image=httpd -- ab -n 250000 -c 1000 http://httpd.default.svc.cluster.local/
# Watch the apache deployment autoscale using the command below
$ kubectl get hpa -w
# Delete the deployment, service and HPA once done
$ kubectl delete deployment.apps/httpd service/httpd horizontalpodautoscaler.autoscaling/httpd

That's it! You have successfully configured autoscaling on your cluster.