Scale Pods with Custom Metrics: Kubernetes HPA

Kubernetes Horizontal Pod Autoscaler with Custom Metrics

In the dynamic world of cloud-native applications, efficiently scaling your services to meet fluctuating demand is paramount. While Kubernetes offers robust autoscaling capabilities out-of-the-box, relying solely on CPU and memory utilization can sometimes fall short. What if your application’s performance bottlenecks aren’t directly tied to these traditional metrics? Perhaps it’s the number of messages in a queue, the rate of HTTP requests, or the latency of a specific API endpoint that truly indicates load.

This is where the Horizontal Pod Autoscaler (HPA) with custom metrics shines. By allowing you to define scaling policies based on application-specific or infrastructure-specific metrics, you gain unparalleled control and precision over your resource allocation. Imagine automatically scaling your worker pods when the RabbitMQ queue depth exceeds a certain threshold, or spinning up more API servers when your Prometheus ingress rate spikes. This guide will walk you through the process of setting up and leveraging HPA with custom metrics, transforming your Kubernetes clusters into truly adaptive and intelligent environments.

TL;DR: HPA with Custom Metrics

The Horizontal Pod Autoscaler (HPA) can scale your Kubernetes deployments based on custom metrics beyond CPU/Memory. This requires a custom metrics adapter (such as Prometheus Adapter) to expose those metrics through the Kubernetes Custom Metrics API. Here’s a quick rundown:

  • Install Metrics Server: Essential for standard CPU/Memory HPA.
  • Install Prometheus & Prometheus Adapter: Prometheus collects custom metrics, Adapter exposes them via the Custom Metrics API.
  • Deploy a Sample Application: An app exposing a custom metric (e.g., a counter).
  • Configure Scraping: Tell Prometheus how to scrape your app’s metrics (through pod annotations or, with the Prometheus Operator, a ServiceMonitor or PodMonitor).
  • Create HPA with Custom Metrics: Define HorizontalPodAutoscaler resource targeting your custom metric.

Key Commands:


# Install Metrics Server (if not already present)
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

# Install Prometheus Stack (using Helm)
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus prometheus-community/kube-prometheus-stack --namespace monitoring --create-namespace

# Deploy Prometheus Adapter (example values)
helm install prometheus-adapter prometheus-community/prometheus-adapter --namespace monitoring -f adapter-values.yaml

# Create the HPA that targets the custom metric 'http_requests_per_second'
kubectl apply -f hpa.yaml
    

Prerequisites

Before diving into the configuration, ensure you have the following:

  • Kubernetes Cluster: A running Kubernetes cluster (v1.23+ recommended, because the autoscaling/v2 API used in this guide became stable in v1.23). You can use Minikube, Kind, or a cloud-managed cluster like GKE, EKS, or AKS.
  • kubectl: The Kubernetes command-line tool, configured to connect to your cluster. Refer to the official kubectl installation guide for instructions.
  • Helm: A package manager for Kubernetes. We’ll use Helm to install Prometheus and its adapter. Install it by following the Helm installation guide.
  • Basic understanding of Kubernetes concepts: Deployments, Services, and Horizontal Pod Autoscalers.
  • Metrics Server: Not required for custom metrics, but needed for HPA to work with standard CPU/memory metrics, which are often combined with custom metrics in the same HPA.

Step-by-Step Guide

Step 1: Install Metrics Server

The Kubernetes Metrics Server is a cluster-wide aggregator of resource usage data from Kubelets. It exposes the metrics.k8s.io API, which the HPA uses to query CPU and memory metrics. The custom metrics pipeline in this guide does not depend on it, but installing it lets you combine resource metrics with custom metrics and enables kubectl top. If you don’t have it installed, deploy it now.

Apply the official Metrics Server components to your cluster:


kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

Verify that the Metrics Server pods are running and healthy:


kubectl get pods -n kube-system -l k8s-app=metrics-server

Expected output:


NAME                            READY   STATUS    RESTARTS   AGE
metrics-server-578b9756b5-abcde   1/1     Running   0          2m

You can also check if the API is available:


kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"

Expected output (truncated):


{"kind":"NodeMetricsList","apiVersion":"metrics.k8s.io/v1beta1","metadata":{"selfLink":"/apis/metrics.k8s.io/v1beta1/nodes"},"items":[...]}

Step 2: Install Prometheus and Prometheus Adapter

Prometheus is the de facto standard for monitoring in Kubernetes environments. It scrapes metrics from your applications and infrastructure. The Prometheus Adapter then translates these Prometheus metrics into a format that the Kubernetes Custom Metrics API (custom.metrics.k8s.io) and External Metrics API (external.metrics.k8s.io) can understand, making them available for the HPA.

First, add the Prometheus community Helm repository and update it:


helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

Next, install the kube-prometheus-stack chart, which includes Prometheus, Grafana, and several exporters. We’ll install it in its own monitoring namespace:


helm install prometheus prometheus-community/kube-prometheus-stack --namespace monitoring --create-namespace

This installation might take a few minutes. Verify that Prometheus and Grafana pods are running:


kubectl get pods -n monitoring -l app.kubernetes.io/name=prometheus
kubectl get pods -n monitoring -l app.kubernetes.io/name=grafana

Expected output (truncated):


NAME                                             READY   STATUS    RESTARTS   AGE
prometheus-kube-prometheus-stack-prometheus-0    2/2     Running   0          5m

NAME                                            READY   STATUS    RESTARTS   AGE
prometheus-grafana-78c4d96bbd-abcde             1/1     Running   0          5m

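Now, install the Prometheus Adapter. We’ll need a custom values file to configure which metrics it exposes. Create a file named adapter-values.yaml (confirm the Prometheus Service name used in prometheus.url with kubectl get svc -n monitoring, since it depends on the Helm release name):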


# adapter-values.yaml
prometheus:
  url: http://prometheus-kube-prometheus-stack-prometheus.monitoring.svc
  port: 9090

rules:
  # The Helm chart expects custom rules as a list under rules.custom
  custom:
    - seriesQuery: '{__name__=~"^http_requests_total$"}'
      resources:
        # Map Prometheus labels (left) to Kubernetes resources (right)
        overrides:
          kubernetes_namespace: {resource: "namespace"}
          kubernetes_pod_name: {resource: "pod"}
      name:
        matches: "^(.*)_total$"
        as: "${1}_per_second"
      metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'

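This adapter-values.yaml configures the adapter to discover the Prometheus series named http_requests_total and expose it as http_requests_per_second, computed as the per-second rate over a 2-minute window. The resources section maps Prometheus labels (kubernetes_namespace and kubernetes_pod_name) to Kubernetes resources (namespace and pod), which is how the HPA targets specific pods. These label names must match the labels actually present on your scraped series; if your series carry namespace and pod labels instead, use those names in the overrides.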
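To make the renaming rule concrete, the sketch below applies the same regular expression locally with sed. It only imitates the name.matches / as substitution (the adapter itself evaluates Go regular expressions, and rename_metric is a hypothetical helper used purely for illustration), so treat it as a quick way to sanity-check a pattern before installing the chart.

```shell
# Mimic the adapter's name rule: matches "^(.*)_total$", as "${1}_per_second".
# sed writes the first capture group as \1 where the adapter config writes ${1}.
rename_metric() {
  printf '%s\n' "$1" | sed -E 's/^(.*)_total$/\1_per_second/'
}

rename_metric http_requests_total   # prints http_requests_per_second
rename_metric queue_depth           # no _total suffix, so sed prints the name unchanged
```

If the first call does not print the name you plan to reference in the HPA, adjust the pattern before deploying the adapter.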

Install the Prometheus Adapter using Helm:


helm install prometheus-adapter prometheus-community/prometheus-adapter --namespace monitoring -f adapter-values.yaml

Verify that the Prometheus Adapter pod is running:


kubectl get pods -n monitoring -l app.kubernetes.io/name=prometheus-adapter

Expected output:


NAME                                      READY   STATUS    RESTARTS   AGE
prometheus-adapter-79c5c879d7-xyzab       1/1     Running   0          1m

Check if the custom metrics API is now available:


kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .

Expected output (truncated, look for your custom metric):


{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "pods/http_requests_per_second",
      "singularName": "",
      "namespaced": true,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    },
    ...
  ]
}

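If you see pods/http_requests_per_second or a similar metric, the adapter is successfully exposing it. The adapter only lists metrics for which matching series exist in Prometheus, so an empty resources list is expected until the sample application from Step 3 is deployed and scraped; re-run this check after Step 4.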
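Once an application exporting the metric is deployed and scraped, the per-pod values can be read directly from the Custom Metrics API. The sketch below only builds the request path for a namespaced pod metric; the default namespace is an assumption (app.yaml in this guide does not set one), and custom_metric_path is a hypothetical helper, not part of kubectl.

```shell
# Build the Custom Metrics API path for a per-pod metric in a namespace.
# The "*" selects all pods in that namespace.
custom_metric_path() {
  namespace="$1"
  metric="$2"
  printf '/apis/custom.metrics.k8s.io/v1beta1/namespaces/%s/pods/*/%s\n' "$namespace" "$metric"
}

custom_metric_path default http_requests_per_second

# Against a live cluster, pass the path to kubectl:
#   kubectl get --raw "$(custom_metric_path default http_requests_per_second)" | jq .
```

An empty items list in the response usually means Prometheus has no matching series for those pods yet, or that the label overrides in the adapter rule do not match the labels on the series.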

Step 3: Deploy a Sample Application with Custom Metrics

To demonstrate HPA with custom metrics, we need an application that exposes such metrics in a Prometheus-compatible format. We’ll use a simple Go application that exposes an HTTP endpoint and increments a counter metric on each request. For more advanced networking configurations, you might consider solutions like Cilium WireGuard Encryption or Istio Ambient Mesh.

Create a file named app.yaml for our sample deployment and service:


# app.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: custom-metric-app
  labels:
    app: custom-metric-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: custom-metric-app
  template:
    metadata:
      labels:
        app: custom-metric-app
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path: "/metrics"
        prometheus.io/port: "8080"
    spec:
      containers:
      - name: app
        image: quay.io/kubezilla/custom-metric-app:v1.0.0 # A simple Go app that exposes /metrics
        ports:
        - containerPort: 8080
          name: http
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
          limits:
            cpu: 200m
            memory: 200Mi
---
apiVersion: v1
kind: Service
metadata:
  name: custom-metric-app
  labels:
    app: custom-metric-app
spec:
  selector:
    app: custom-metric-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
  type: ClusterIP

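The quay.io/kubezilla/custom-metric-app:v1.0.0 image is a simple Go application that exposes a /metrics endpoint with a counter named http_requests_total. The annotations prometheus.io/scrape, prometheus.io/path, and prometheus.io/port let an annotation-aware Prometheus scrape configuration discover and scrape this pod. Note that kube-prometheus-stack does not act on these annotations by default: the Prometheus Operator discovers targets through ServiceMonitor and PodMonitor resources, so with the stack from Step 2 you also need one of those (or an additional scrape config) that selects this app.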
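If Prometheus was installed with kube-prometheus-stack as in Step 2, targets are normally discovered through ServiceMonitor or PodMonitor resources rather than pod annotations. The following is a minimal sketch of a PodMonitor for the sample app, not a definitive configuration. It assumes the Helm release is named prometheus (the chart, by default, selects monitors labeled release: <release name>) and it scrapes the named container port http from app.yaml. The file name podmonitor.yaml is arbitrary.

```shell
# Write a PodMonitor manifest for the sample app (a sketch; adjust names to your setup).
cat > podmonitor.yaml <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: custom-metric-app
  labels:
    release: prometheus   # assumption: matches the Helm release name used in Step 2
spec:
  selector:
    matchLabels:
      app: custom-metric-app
  podMetricsEndpoints:
    - port: http          # the named containerPort from app.yaml
      path: /metrics
EOF

# Then apply it in the same namespace as the application:
#   kubectl apply -f podmonitor.yaml
```

Series scraped this way typically carry namespace and pod labels, so the adapter overrides would need to reference those label names instead of kubernetes_namespace and kubernetes_pod_name.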

Deploy the application:


kubectl apply -f app.yaml

Verify that the application pod and service are running:


kubectl get pods -l app=custom-metric-app
kubectl get svc custom-metric-app

Expected output:


NAME                                  READY   STATUS    RESTARTS   AGE
custom-metric-app-7c7d6d5f78-abcde    1/1     Running   0          1m

NAME                TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
custom-metric-app   ClusterIP   10.96.10.10     <none>        80/TCP    1m

Step 4: Verify Prometheus is Scraping Custom Metrics

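If your Prometheus scrape configuration honors the pod annotations (or you have created a ServiceMonitor or PodMonitor that selects the app), Prometheus should now be scraping metrics from custom-metric-app. We can verify this by accessing the Prometheus UI.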

Port-forward the Prometheus UI to your local machine (adjust the Service name if kubectl get svc -n monitoring shows a different one):


kubectl port-forward svc/prometheus-kube-prometheus-stack-prometheus 9090:9090 -n monitoring

Open your browser to http://localhost:9090. In the Prometheus expression browser, type http_requests_total and click “Execute”. You should see the metric from your custom-metric-app. To generate some traffic, you can curl the service:


# Get the pod name
POD_NAME=$(kubectl get pods -l app=custom-metric-app -o jsonpath='{.items[0].metadata.name}')

# Send some requests
for i in $(seq 1 10); do kubectl exec $POD_NAME -- curl -s localhost:8080 > /dev/null; done

After generating traffic, refresh the Prometheus UI. The value of http_requests_total should have increased. This confirms Prometheus is successfully collecting your custom metrics.

Step 5: Create HPA with Custom Metrics

Now that our custom metric is being scraped by Prometheus and exposed by the Prometheus Adapter, we can create the Horizontal Pod Autoscaler resource. We will configure it to scale our custom-metric-app deployment based on the http_requests_per_second metric.

Create a file named hpa.yaml:


# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: custom-metric-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: custom-metric-app
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "5" # Target 5 requests per second per pod

In this HPA definition:

  • scaleTargetRef points to our custom-metric-app deployment.
  • minReplicas is 1, and maxReplicas is 5.
  • metrics defines the custom metric:
    • type: Pods indicates a per-pod custom metric whose values are averaged across all pods of the scale target.
    • metric.name: http_requests_per_second matches the metric we configured in the Prometheus Adapter.
    • target.type: AverageValue means the HPA adds or removes replicas to keep the average at averageValue: "5", that is, about 5 requests per second per pod.
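To see how the target translates into a replica count, the sketch below evaluates the documented HPA formula, desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), and clamps the result to minReplicas and maxReplicas. It is a simplified illustration: desired_replicas is a hypothetical helper, and the real controller additionally applies a tolerance, pod readiness handling, and scaling policies.

```shell
# desired = ceil(currentReplicas * currentAverage / targetAverage), clamped to [min, max].
# Arguments: currentReplicas currentAverage targetAverage minReplicas maxReplicas
desired_replicas() {
  awk -v r="$1" -v cur="$2" -v tgt="$3" -v min="$4" -v max="$5" 'BEGIN {
    raw = r * cur / tgt
    d = int(raw)
    if (d < raw) d++          # round up to the next whole replica
    if (d < min) d = min      # never go below minReplicas
    if (d > max) d = max      # never exceed maxReplicas
    print d
  }'
}

desired_replicas 2 8 5 1 5    # 2 pods averaging 8 req/s against a target of 5
desired_replicas 4 9 5 1 5    # the raw result exceeds maxReplicas and is capped
```

The first call prints 4 (2 × 8 / 5 = 3.2, rounded up) and the second prints 5 (the unclamped result would be 8).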

Apply the HPA:


kubectl apply -f hpa.yaml

Verify the HPA status:


kubectl get hpa custom-metric-hpa -w

Expected initial output:


NAME                REFERENCE                      TARGETS       MINPODS   MAXPODS   REPLICAS   AGE
custom-metric-hpa   Deployment/custom-metric-app   <unknown>/5   1         5         1          10s

Initially, TARGETS might show <unknown>/5 because it takes a moment for the HPA to fetch the first metric values. After a short while, it should update to show the current requests per second.

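Now, let’s generate significant load to trigger scaling. A ClusterIP Service is only reachable from inside the cluster, so in a new terminal run the load generator as a temporary pod in the same namespace as the application:


# Start a throwaway busybox pod that requests the Service by its DNS name in a loop
# (press Ctrl+C to stop; --rm deletes the pod afterwards)
kubectl run load-generator --rm -i --tty --image=busybox:1.28 --restart=Never -- \
  /bin/sh -c "while sleep 0.1; do wget -q -O- http://custom-metric-app > /dev/null; done"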

Observe the HPA status in the terminal where you ran kubectl get hpa custom-metric-hpa -w. You should see the TARGETS value increase, and eventually, the REPLICAS count will go up as the HPA scales out the deployment (the values below are illustrative):


NAME                REFERENCE                      TARGETS            MINPODS   MAXPODS   REPLICAS   AGE
custom-metric-hpa   Deployment/custom-metric-app   <unknown>/5        1         5         1          1m
custom-metric-hpa   Deployment/custom-metric-app   12/5               1         5         1          1m
custom-metric-hpa   Deployment/custom-metric-app   12/5               1         5         2          1m20s # Scaled up!
custom-metric-hpa   Deployment/custom-metric-app   6/5                1         5         2          1m30s
custom-metric-hpa   Deployment/custom-metric-app   14/5               1         5         3          1m45s # Scaled up again!

You can also check the deployment’s pods:


kubectl get pods -l app=custom-metric-app

You will see new pods being created. This demonstrates successful scaling based on custom metrics. For more advanced autoscaling scenarios, especially for cost optimization, consider exploring tools like Karpenter Cost Optimization.

Production Considerations

Deploying HPA with custom metrics in a production environment requires careful planning and robust infrastructure:

  1. Reliable Metrics Source: Ensure your Prometheus setup is highly available and robust. Consider Thanos or Mimir for long-term storage and global views.
  2. Appropriate Metric Selection: Choose metrics that truly reflect the load and performance of your application. Avoid “noisy” metrics or those that don’t directly correlate with scaling needs. For applications dealing with heavy data processing or AI/ML workloads, metrics related to GPU utilization might be more relevant, as discussed in our LLM GPU Scheduling Guide.
  3. Prometheus Adapter Configuration: Carefully define your Prometheus Adapter rules. Incorrect regex or aggregation can lead to inaccurate metrics and erratic scaling. Test these rules thoroughly.
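  4. HPA Cooldown and Stabilization: By default the HPA applies a 5-minute stabilization window to scale-down decisions to prevent rapid flapping. Prefer tuning this per HPA through the behavior field in autoscaling/v2. The cluster-wide default is set with the kube-controller-manager flag --horizontal-pod-autoscaler-downscale-stabilization, which you usually cannot change on managed clusters.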
  5. Resource Requests and Limits: Set accurate CPU and memory requests and limits for your application pods. This helps the HPA make better scaling decisions and ensures your nodes aren’t oversubscribed.
  6. Monitoring HPA Itself: Monitor the HPA’s decisions and its target’s performance. Grafana dashboards that show current replicas, target metrics, and actual metrics are invaluable.
  7. Cost Management: While autoscaling helps optimize resource usage, it can also lead to unexpected costs if not properly managed. Combine HPA with cluster autoscalers (like Kubernetes Cluster Autoscaler or Karpenter) to ensure nodes are also scaled up/down efficiently.
  8. Security: Ensure your Prometheus and Adapter deployments are secured. Use network policies (refer to our Network Policies Security Guide) to restrict access to the metrics endpoints and the Prometheus UI.
  9. Observability: Integrate with robust observability platforms. Tools like eBPF Observability with Hubble can provide deep insights into network and application performance, complementing your custom metrics.
  10. Testing Scaling Behavior: Always test your HPA configurations under simulated load in a staging environment before deploying to production.
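The stabilization advice in item 4 can be sketched with the behavior field of autoscaling/v2. The manifest below extends the HPA from Step 5. The window lengths and policy values are illustrative assumptions rather than recommendations, and the file name hpa-behavior.yaml is arbitrary.

```shell
# Write an HPA manifest with explicit scale-up and scale-down behavior.
cat > hpa-behavior.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: custom-metric-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: custom-metric-app
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "5"
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0    # react to load immediately
      policies:
      - type: Pods
        value: 2                       # add at most 2 pods...
        periodSeconds: 60              # ...per 60 seconds
    scaleDown:
      stabilizationWindowSeconds: 300  # use the highest recommendation from the last 5 minutes
      policies:
      - type: Percent
        value: 50                      # remove at most 50% of pods per minute
        periodSeconds: 60
EOF

# Apply with: kubectl apply -f hpa-behavior.yaml
```

Because the metadata.name matches the HPA from Step 5, applying this file updates that HPA in place instead of creating a second autoscaler for the same Deployment.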

Troubleshooting

  1. HPA status shows <unknown>/TARGET or a FailedGetPodsMetric event

    Issue: The HPA cannot retrieve the custom metric value.

    Solution:

    • Check the Prometheus Adapter logs: kubectl logs -n monitoring -l app.kubernetes.io/name=prometheus-adapter. Look for errors related to Prometheus queries or metric exposure.
    • Verify the Prometheus Adapter’s configuration (adapter-values.yaml). Ensure the seriesQuery correctly matches your metric and the metricsQuery is valid PromQL.
    • Ensure Prometheus itself is healthy and scraping your application’s metrics (check Prometheus UI targets).
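    • Verify the custom metrics API endpoint with kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq . and confirm that your metric is listed. kubectl get apiservice v1beta1.custom.metrics.k8s.io should also report the API as available.
    • If the HPA also uses CPU or memory metrics, ensure the Metrics Server is running and healthy (Step 1).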
  2. HPA isn’t scaling up/down even when metrics exceed/fall below target.

    Issue: The HPA reports metrics, but replica count doesn’t change.

    Solution:

    • Describe the HPA: kubectl describe hpa custom-metric-hpa. Look at the “Events” section for reasons why scaling might not be happening (e.g., cooldown periods, failed to scale target).
    • Check minReplicas and maxReplicas in your HPA definition. Ensure the target metric is actually outside the desired range.
    • Verify the metric value reported by the HPA: kubectl get hpa custom-metric-hpa -o yaml. Compare currentReplicas, desiredReplicas, and currentMetrics.
    • Ensure the target deployment is healthy and can accept new replicas.
  3. Prometheus isn’t scraping metrics from my application.

    Issue: Your custom metric doesn’t appear in the Prometheus UI.

    Solution:

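    • Check how Prometheus discovers targets: the prometheus.io/scrape: "true", prometheus.io/path, and prometheus.io/port annotations on your pod template only work if your scrape configuration honors them. A Prometheus managed by the Prometheus Operator (as in kube-prometheus-stack) needs a ServiceMonitor or PodMonitor that selects your application.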
    • Verify network connectivity: Can Prometheus reach your application pod’s metrics endpoint? Check service and network configurations. Consider Kubernetes Network Policies if you have strict firewall rules.
    • Check Prometheus targets in the UI (http://localhost:9090/targets). Look for your application’s endpoint and any errors.
    • Ensure your application is actually exposing metrics on the specified path and port.
  4. Prometheus Adapter logs show “no metrics returned from Prometheus” or similar.

    Issue: The adapter can’t query Prometheus successfully.

    Solution:

    • Verify the Prometheus URL and port in adapter-values.yaml: prometheus.url: http://prometheus-kube-prometheus-stack-prometheus.monitoring.svc. Ensure it matches your Prometheus Service, whose name depends on the Helm release name (list Services with kubectl get svc -n monitoring).
    • Test the PromQL query directly in the Prometheus UI. If it doesn’t return data there, it won’t work in the adapter.
    • Check network connectivity between the Prometheus Adapter pod and the Prometheus service.
  5. HPA scales too aggressively or not aggressively enough.

    Issue: The scaling behavior isn’t optimal for your application.

    Solution:

    • Adjust the target.averageValue for your custom metric in the HPA. A lower value will scale up more aggressively, a higher value less so.
    • Consider the HPA’s behavior field (available in v2 and v2beta2 API versions) to fine-tune scaling policies, including stabilization windows and scaling velocities. Refer to the Kubernetes HPA documentation.
    • Review the PromQL query in the adapter. Is the aggregation window (e.g., [2m] for rate) appropriate for your metric’s volatility?
    • Ensure your application’s resource requests and limits are set appropriately.

FAQ Section

  1. What’s the difference between custom.metrics.k8s.io and external.metrics.k8s.io?

    custom.metrics.k8s.io refers to metrics associated with Kubernetes objects (like Pods, Deployments, Nodes), typically collected from within the cluster (e.g., Prometheus scraping application pods). external.metrics.k8s.io refers to metrics that are not directly related to Kubernetes objects but come from external sources (e.g., AWS SQS queue length, Google Cloud Pub/Sub backlog). The HPA can scale based on both, but requires different adapter configurations.

  2. Can I use multiple custom metrics in a single HPA?

    Yes, you can define multiple metrics in the metrics array of your HPA. The HPA will calculate the desired number of replicas for each metric independently and then choose the maximum of those desired replica counts. This ensures that the application scales sufficiently for all defined metrics.

  3. How do I define custom metrics for specific pods or namespaces?

    The Prometheus Adapter’s rules map Prometheus labels (like kubernetes_namespace and kubernetes_pod_name) to Kubernetes resources (namespace and pod). This enables the HPA to query for metrics specific to a particular namespace or to individual pods, depending on how your Prometheus metrics are labeled.

  4. What alternatives are there to Prometheus and Prometheus Adapter for custom metrics?

    While Prometheus is dominant, other solutions exist. For cloud-specific metrics, you might use adapters that integrate directly with cloud monitoring services (e.g., AWS CloudWatch adapter, Google Cloud Monitoring adapter). For specific message queues, there might be dedicated operators or adapters. However, Prometheus Adapter is highly flexible due to its PromQL-based configuration.

  5. How does HPA handle rapidly fluctuating custom metrics?

    HPA has built-in stabilization logic to prevent “thrashing” (rapid scaling up and down). It uses a “stabilization window” (default 5 minutes for scale down, 0 for scale up) during which it considers previous desired replica counts. If your metric is extremely volatile, you might need to adjust the Prometheus Adapter’s aggregation window (e.g., increase the [2m] in rate()) or configure longer stabilization windows in the HPA’s behavior section (available in autoscaling/v2).
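As a sketch of the multi-metric case from question 2, the manifest below combines the custom per-pod metric with standard CPU utilization. The 70% CPU target is an illustrative assumption, the file name hpa-multi-metric.yaml is arbitrary, and the CPU metric relies on the Metrics Server and on the CPU requests set in app.yaml.

```shell
# Write an HPA manifest that scales on two metrics; the HPA uses the larger replica count.
cat > hpa-multi-metric.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: custom-metric-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: custom-metric-app
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Pods                  # custom metric served by the Prometheus Adapter
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "5"
  - type: Resource              # resource metric served by the Metrics Server
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70  # percent of the container CPU request
EOF

# Apply with: kubectl apply -f hpa-multi-metric.yaml
```

If the request rate calls for 3 replicas while CPU utilization calls for 2, the HPA scales to 3.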

Cleanup Commands

To remove all resources created during this tutorial:


# Delete the HPA
kubectl delete -f hpa.yaml

# Delete the application deployment and service
kubectl delete -f app.yaml

# Uninstall Prometheus Adapter
helm uninstall prometheus-adapter -n monitoring

# Uninstall Prometheus stack
helm uninstall prometheus -n monitoring
kubectl delete namespace monitoring

# Delete Metrics Server (only if you installed it in Step 1; managed clusters often ship their own)
kubectl delete -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

Conclusion

Harnessing the power of the Horizontal Pod Autoscaler with custom metrics is a game-changer for building resilient, cost-effective, and performance-optimized applications on Kubernetes. By moving beyond generic CPU and memory metrics, you can create scaling policies that truly align with your application’s unique operational characteristics and business logic. This guide provided a comprehensive walkthrough, from setting up the necessary components to deploying and verifying your custom metric-driven HPA.

As you continue your cloud-native journey, remember that observability is key to effective autoscaling. A robust monitoring stack like Prometheus, combined with the flexibility of custom metrics, empowers you to build self-healing and auto-adaptive systems. Embrace these tools, iterate on your scaling strategies, and watch your Kubernetes deployments gracefully handle any workload thrown their way.
