Kubernetes Capacity Planning with Vertical Pod Autoscaler

February 8, 2022
Sergio Rua

Introduction

When you deploy a new Pod into your Kubernetes cluster, it is good practice to restrict how much memory and CPU it can use. This is set in the resources section of each container:

apiVersion: v1
kind: Pod
metadata:
  name: frontend
spec:
  containers:
  - name: app
    image: images.my-company.example/app:v4
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"

One of the most time-consuming tasks we perform is deciding how much CPU and memory we should allocate to each pod we deploy. Getting this right is important for several reasons:

  • Too few resources will impact your application delivery: it will either run too slowly or crash often with OOM (Out Of Memory) errors.
  • Too many resources can be even worse: an over-provisioned pod reserves capacity it does not need, which not only affects the misconfigured pod but takes resources away from everything else running on the cluster. Resources can be expensive, so don’t waste money.

The Vertical Pod Autoscaler (VPA) is a fantastic tool for this task, and it can help you in two ways:

  • It can advise on the right values to use for your memory and CPU allocations.
  • It can update your deployments with the recommended values.

Enabling

If you are using a cloud-based managed Kubernetes service, you will need to check whether your provider supports VPA. For example, Google’s GKE does support it, but it is disabled by default:

~$ gcloud container clusters create CLUSTER_NAME \
    --enable-vertical-pod-autoscaling

If you are using Terraform, this is also configurable via the enable_vertical_pod_autoscaling option.
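For reference, a minimal sketch of the same setting expressed directly on the google_container_cluster resource (the enable_vertical_pod_autoscaling variable exposed by the popular Terraform GKE module maps to this block; the cluster name and location below are hypothetical):

# Sketch only: assumes the google_container_cluster resource from the
# hashicorp/google provider.
resource "google_container_cluster" "primary" {
  name     = "my-cluster"      # hypothetical
  location = "europe-west2"    # hypothetical

  vertical_pod_autoscaling {
    enabled = true
  }
}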

Installation

We always recommend using Helm charts for any installation. There are several community charts you can use, but I’m going to recommend the one published by Fairwinds, which also includes a nice Web UI (Goldilocks) and a controller we can play with.

The default installation would look like this:

helm repo add fairwinds-stable https://charts.fairwinds.com/stable
helm install vpa fairwinds-stable/vpa --namespace vpa --create-namespace

If you want to add the Goldilocks Web UI, you can enable it as well using an override values file. My example below includes an ingress so I can reach it, but you can also leave the ingress out and use port-forwarding:

dashboard:
  enabled: true
  replicaCount: 1

  ingress:
    enabled: true
    annotations:
      external-dns.alpha.kubernetes.io/hostname: goldilocks.digitalis.io

    path: /
    # Ingress Host
    hosts:
      - host: goldilocks.digitalis.io
        paths:
          - /

Usage

You will need to create a VerticalPodAutoscaler object for each of the deployments you would like to let VPA work on. For example, I have a Kafka cluster running and I’d like to get some metrics from it:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: kafka
  namespace: kafka
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind:       StatefulSet
    name:       digitalis-kafka
  updatePolicy:
    updateMode: "Off"

targetRef tells VPA which workload to watch (in my example a StatefulSet). My Kafka cluster is deployed to the kafka namespace and is called digitalis-kafka.

updateMode is quite important. Possible values are “Off”, “Initial”, “Recreate”, and “Auto”.

  • Off: only provides recommended resource requests and limits, allowing you to apply the recommendations manually. It never updates pods.
  • Initial: automatically applies VPA recommendations, but only at pod creation.
  • Auto and Recreate: automatically apply the VPA CPU and memory recommendations throughout the pod lifetime. VPA evicts any pods that are out of alignment with its recommendations; when the workload object recreates them, the new pods come up with the updated values.

If you are going to let VPA update your pods, I would recommend the Initial mode, or you may find running pods being restarted by VPA at inconvenient times. Personally, I prefer leaving updateMode set to Off and applying the recommendations whenever it is most convenient for us.
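If you do opt into automatic updates, you can bound how far VPA is allowed to go with a resourcePolicy. A minimal sketch for the Kafka example (the minAllowed/maxAllowed values below are illustrative choices of mine, not recommendations):

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: kafka
  namespace: kafka
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind:       StatefulSet
    name:       digitalis-kafka
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      # "*" applies the policy to every container in the pod
      - containerName: "*"
        minAllowed:
          cpu: 50m
          memory: 1Gi
        maxAllowed:
          cpu: "1"
          memory: 4Gi

This keeps VPA from evicting pods just to make very small adjustments at the low end, and stops a misbehaving workload from being granted unbounded resources at the high end.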

You can see the recommendations by describing the object:

~$ kubectl describe -n kafka verticalpodautoscaler kafka
[...]
Status:
  Conditions:
    Last Transition Time:  2021-09-27T15:23:04Z
    Status:                False
    Type:                  LowConfidence
    Last Transition Time:  2021-09-27T15:23:04Z
    Status:                True
    Type:                  RecommendationProvided
  Recommendation:
    Container Recommendations:
      Container Name:  kafka
      Lower Bound:
        Cpu:     25m
        Memory:  2711617536
      Target:
        Cpu:     95m
        Memory:  2760900608
      Uncapped Target:
        Cpu:     95m
        Memory:  2760900608
      Upper Bound:
        Cpu:     125m
        Memory:  3310354432

Goldilocks

Goldilocks gives you a bit more than just installing VPA for you. It can also create the VPA objects, and on top of that it provides a nice Web UI.

I also have a Postgres database in my lab. This time, instead of creating a VPA object for each of the deployments, I’m going to label the namespace, telling Goldilocks to create the configuration for me.

~$ kubectl label ns postgres goldilocks.fairwinds.com/enabled=true
~$ kubectl get verticalpodautoscaler -n postgres
NAME                             AGE
digitalis                        1m
digitalis-backrest-shared-repo   1m
digitalis-dcun                   1m
digitalis-qaka                   1m

As you can see, shortly after labelling the namespace Goldilocks created the VerticalPodAutoscaler objects, and I just need to wait a few minutes for the data to come in.

Now I can open the Web UI and check them out.

Grafana

Finally, if you prefer not to use the Goldilocks Web UI and you are already running Grafana, you can have a look at this dashboard, which will show you the recommendations clearly.

Conclusion

Vertical Pod Autoscaler can be a powerful tool in your arsenal for making better use of your cluster’s resources. By right-sizing resource allocations you may be able to run more pods and decrease the overall cost of running the platform.

You can also combine VPA with the Horizontal Pod Autoscaler (HPA), which allows you to scale your pods both vertically and horizontally. By using VPA to fine-tune the resource allocations, you will most likely reduce the number of times the HPA has to trigger new pod deployments.
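One caveat when combining the two: HPA and VPA should not act on the same metric, or they will fight each other. A common pattern is to let HPA scale replicas on CPU utilisation while VPA runs in recommendation mode (Off) or manages memory only. A sketch of such an HPA for a hypothetical frontend Deployment (on older clusters you may need the autoscaling/v2beta2 API version instead):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: frontend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: frontend
  minReplicas: 2
  maxReplicas: 10
  metrics:
    # Add replicas when average CPU usage across pods exceeds 70%
    # of the requested CPU.
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70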

