Kubernetes Horizontal Pod Autoscaler with Custom Metrics
In the dynamic world of cloud-native applications, efficiently scaling your services to meet fluctuating demand is paramount. While Kubernetes offers robust autoscaling capabilities out-of-the-box, relying solely on CPU and memory utilization can sometimes fall short. What if your application’s performance bottlenecks aren’t directly tied to these traditional metrics? Perhaps it’s the number of messages in a queue, the rate of HTTP requests, or the latency of a specific API endpoint that truly indicates load.
This is where the Horizontal Pod Autoscaler (HPA) with custom metrics shines. By letting you define scaling policies on application-specific or infrastructure-specific metrics, it gives you much finer control over resource allocation. Imagine automatically scaling your worker pods when the RabbitMQ queue depth exceeds a threshold, or adding API servers when the request rate recorded by Prometheus spikes. This guide walks you through setting up and using HPA with custom metrics.
TL;DR: HPA with Custom Metrics
The Horizontal Pod Autoscaler (HPA) can scale your Kubernetes deployments based on custom metrics beyond CPU/Memory. This requires a metrics adapter (such as Prometheus Adapter) that exposes custom metrics through the Kubernetes Custom Metrics API. Here’s a quick rundown:
- Install Metrics Server: Essential for standard CPU/Memory HPA.
- Install Prometheus & Prometheus Adapter: Prometheus collects custom metrics, Adapter exposes them via the Custom Metrics API.
- Deploy a Sample Application: An app exposing a custom metric (e.g., a counter).
- Configure Scraping: Tell Prometheus how to scrape your app’s metrics (pod annotations or a ServiceMonitor, depending on how your Prometheus is set up).
- Create HPA with Custom Metrics: Define a HorizontalPodAutoscaler resource targeting your custom metric.
Key Commands:
# Install Metrics Server (if not already present)
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
# Install Prometheus Stack (using Helm)
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus prometheus-community/kube-prometheus-stack --namespace monitoring --create-namespace
# Deploy Prometheus Adapter (example values)
helm install prometheus-adapter prometheus-community/prometheus-adapter --namespace monitoring -f adapter-values.yaml
# Apply the HPA that targets the custom metric 'http_requests_per_second'
kubectl apply -f hpa.yaml
Prerequisites
Before diving into the configuration, ensure you have the following:
- Kubernetes Cluster: A running Kubernetes cluster, v1.23 or later (the autoscaling/v2 API used in Step 5 became stable in v1.23). You can use Minikube, Kind, or a cloud-managed cluster like GKE, EKS, or AKS.
- kubectl: The Kubernetes command-line tool, configured to connect to your cluster. Refer to the official kubectl installation guide for instructions.
- Helm: A package manager for Kubernetes. We’ll use Helm to install Prometheus and its adapter. Install it by following the Helm installation guide.
- Basic understanding of Kubernetes concepts: Deployments, Services, and Horizontal Pod Autoscalers.
- Metrics Server: Not required for custom metrics themselves, but the HPA needs it for standard CPU/memory metrics, so most autoscaling setups run it alongside the custom metrics pipeline.
Step-by-Step Guide
Step 1: Install Metrics Server
The Kubernetes Metrics Server is a cluster-wide aggregator of resource usage data from Kubelets. The HPA needs it for CPU and memory metrics; the custom metrics pipeline in this guide does not depend on it, but it is commonly installed alongside. If you don’t have it installed, deploy it now. It exposes the metrics.k8s.io API, which the HPA uses to query resource metrics.
Apply the official Metrics Server components to your cluster:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
Verify that the Metrics Server pods are running and healthy:
kubectl get pods -n kube-system -l k8s-app=metrics-server
Expected output:
NAME READY STATUS RESTARTS AGE
metrics-server-578b9756b5-abcde 1/1 Running 0 2m
You can also check if the API is available:
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
Expected output (truncated):
{"kind":"NodeMetricsList","apiVersion":"metrics.k8s.io/v1beta1","metadata":{"selfLink":"/apis/metrics.k8s.io/v1beta1/nodes"},"items":[...]}
Step 2: Install Prometheus and Prometheus Adapter
Prometheus is the de facto standard for monitoring in Kubernetes environments. It scrapes metrics from your applications and infrastructure. The Prometheus Adapter then translates these Prometheus metrics into a format that the Kubernetes Custom Metrics API (custom.metrics.k8s.io) and External Metrics API (external.metrics.k8s.io) can understand, making them available for the HPA.
First, add the Prometheus community Helm repository and update it:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
Next, install the kube-prometheus-stack which includes Prometheus, Grafana, and other exporters. We’ll install it in its own monitoring namespace:
helm install prometheus prometheus-community/kube-prometheus-stack --namespace monitoring --create-namespace
This installation might take a few minutes. Verify that Prometheus and Grafana pods are running:
kubectl get pods -n monitoring -l app.kubernetes.io/name=prometheus
kubectl get pods -n monitoring -l app.kubernetes.io/name=grafana
Expected output (truncated; the exact pod names depend on the Helm release name and chart version):
NAME READY STATUS RESTARTS AGE
prometheus-kube-prometheus-stack-prometheus-0 2/2 Running 0 5m
NAME READY STATUS RESTARTS AGE
prometheus-grafana-78c4d96bbd-abcde 1/1 Running 0 5m
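Now, install the Prometheus Adapter. We’ll need a custom values file to configure which metrics it exposes. Create a file named adapter-values.yaml. The prometheus.url in it must point at the Prometheus Service in your cluster, so confirm the Service name with kubectl get svc -n monitoring first; the name generated by the chart depends on the release name: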
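# adapter-values.yaml
prometheus:
  # Must match the Prometheus Service in your cluster (check: kubectl get svc -n monitoring)
  url: http://prometheus-kube-prometheus-stack-prometheus.monitoring.svc
  port: 9090
rules:
  # The prometheus-adapter Helm chart reads custom rules from rules.custom
  custom:
    - seriesQuery: '{__name__=~"^http_requests_total$"}'
      resources:
        # Keys are Prometheus label names; values are the Kubernetes resources they identify
        overrides:
          kubernetes_namespace: {resource: "namespace"}
          kubernetes_pod_name: {resource: "pod"}
      name:
        matches: "^(.*)_total$"
        as: "${1}_per_second"
      metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'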
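This adapter-values.yaml configures the adapter to select the Prometheus series named http_requests_total and expose it as http_requests_per_second, computed as the per-second rate over a 2-minute window. The resources.overrides section tells the adapter which Prometheus labels identify the Kubernetes namespace and pod, which is what lets the HPA ask for the metric of specific pods. The label names must match the labels actually present on your series: kubernetes_namespace and kubernetes_pod_name are common with annotation-based scrape configs, while targets discovered through ServiceMonitor or PodMonitor objects usually carry namespace and pod instead.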
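To make the rule concrete, here is a rough Python sketch of the two transformations it describes: the name mapping and the per-second rate. The sample values are hypothetical, and simple_rate is a deliberate simplification of PromQL's rate(), not a reimplementation of it.

```python
import re


def exposed_name(series_name):
    """Apply the rule's name mapping: '^(.*)_total$' becomes '<prefix>_per_second'."""
    # The adapter config writes the replacement as "${1}_per_second" (Go regexp syntax);
    # Python spells the same back-reference as "\g<1>".
    return re.sub(r"^(.*)_total$", r"\g<1>_per_second", series_name)


def simple_rate(samples):
    """Per-second increase between the first and last (timestamp, value) sample.

    Simplification: Prometheus's rate() also handles counter resets and
    extrapolates to the window edges; this sketch does neither.
    """
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


# Hypothetical counter samples for one pod across a 2-minute window.
window = [(0.0, 100.0), (60.0, 400.0), (120.0, 700.0)]

print(exposed_name("http_requests_total"))  # http_requests_per_second
print(simple_rate(window))                  # 5.0
```

A counter that grew by 600 over 120 seconds is reported as 5 requests per second, which is the kind of value the HPA later compares against its target.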
Install the Prometheus Adapter using Helm:
helm install prometheus-adapter prometheus-community/prometheus-adapter --namespace monitoring -f adapter-values.yaml
Verify that the Prometheus Adapter pod is running:
kubectl get pods -n monitoring -l app.kubernetes.io/name=prometheus-adapter
Expected output:
NAME READY STATUS RESTARTS AGE
prometheus-adapter-79c5c879d7-xyzab 1/1 Running 0 1m
Check if the custom metrics API is now available:
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .
Expected output (truncated, look for your custom metric):
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "pods/http_requests_per_second",
      "singularName": "",
      "namespaced": true,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    },
    ...
  ]
}
If you see pods/http_requests_per_second or a similar metric, the adapter is successfully exposing it.
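If you would rather check this programmatically than read the JSON by eye, a small helper along these lines could do it. This is a sketch only: the function name is hypothetical, and SAMPLE is a trimmed copy of the response shown above rather than live cluster output.

```python
import json


def exposed_metric_names(api_resource_list_json):
    """Return the resource names listed in a custom.metrics.k8s.io APIResourceList."""
    document = json.loads(api_resource_list_json)
    return [resource["name"] for resource in document.get("resources", [])]


# Trimmed copy of the discovery response shown above (not live output).
SAMPLE = """
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "pods/http_requests_per_second",
      "singularName": "",
      "namespaced": true,
      "kind": "MetricValueList",
      "verbs": ["get"]
    }
  ]
}
"""

names = exposed_metric_names(SAMPLE)
print("pods/http_requests_per_second" in names)  # True
```

In practice you would feed the function the output of the kubectl get --raw command above (for example by reading it from standard input) instead of the embedded sample.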
Step 3: Deploy a Sample Application with Custom Metrics
To demonstrate HPA with custom metrics, we need an application that exposes such metrics in a Prometheus-compatible format. We’ll use a simple Go application that exposes an HTTP endpoint and increments a counter metric on each request. For more advanced networking configurations, you might consider solutions like Cilium WireGuard Encryption or Istio Ambient Mesh.
Create a file named app.yaml for our sample deployment and service:
# app.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: custom-metric-app
  labels:
    app: custom-metric-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: custom-metric-app
  template:
    metadata:
      labels:
        app: custom-metric-app
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path: "/metrics"
        prometheus.io/port: "8080"
    spec:
      containers:
        - name: app
          image: quay.io/kubezilla/custom-metric-app:v1.0.0 # A simple Go app that exposes /metrics
          ports:
            - containerPort: 8080
              name: http
          resources:
            requests:
              cpu: 100m
              memory: 100Mi
            limits:
              cpu: 200m
              memory: 200Mi
---
apiVersion: v1
kind: Service
metadata:
  name: custom-metric-app
  labels:
    app: custom-metric-app
spec:
  selector:
    app: custom-metric-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
  type: ClusterIP
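The quay.io/kubezilla/custom-metric-app:v1.0.0 image is a simple Go application that exposes a /metrics endpoint with a counter named http_requests_total. The prometheus.io/scrape, prometheus.io/path, and prometheus.io/port annotations let Prometheus discover and scrape the pod, but only when Prometheus runs with a scrape configuration that honors them. The Prometheus Operator setup installed by kube-prometheus-stack does not read these annotations by default; it discovers targets through ServiceMonitor or PodMonitor resources, so with that chart you will also need one of those selecting this application (and carrying whatever labels your Prometheus instance is configured to select).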
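For readers who want to see what "exposes a /metrics endpoint with a counter" amounts to, here is a minimal stand-in written with Python's standard library. It is not the source of the image above (which is described as a Go application); the names REQUESTS, render_metrics, and serve are hypothetical, and a real application would normally use a Prometheus client library instead of formatting the text by hand.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

REQUESTS = {"total": 0}  # in-memory counter; resets when the process restarts


def render_metrics(total):
    """Render the counter in the Prometheus text exposition format."""
    return (
        "# HELP http_requests_total Total number of HTTP requests handled.\n"
        "# TYPE http_requests_total counter\n"
        f"http_requests_total {total}\n"
    )


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/metrics":
            body = render_metrics(REQUESTS["total"])
        else:
            REQUESTS["total"] += 1  # count every non-metrics request
            body = "ok\n"
        payload = body.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Blocks forever; call this only when you actually want to run the server."""
    HTTPServer(("", port), Handler).serve_forever()


# Show what Prometheus would read after three requests.
print(render_metrics(3), end="")
```

Calling serve() starts the listener on port 8080, matching the containerPort in app.yaml; every request other than /metrics increments the counter that Prometheus later turns into a rate.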
Deploy the application:
kubectl apply -f app.yaml
Verify that the application pod and service are running:
kubectl get pods -l app=custom-metric-app
kubectl get svc custom-metric-app
Expected output:
NAME READY STATUS RESTARTS AGE
custom-metric-app-7c7d6d5f78-abcde 1/1 Running 0 1m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
custom-metric-app ClusterIP 10.96.10.10 <none> 80/TCP 1m
Step 4: Verify Prometheus is Scraping Custom Metrics
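If your Prometheus is configured to discover this pod (through the annotations or through a ServiceMonitor/PodMonitor, depending on your setup), it should now be scraping metrics from custom-metric-app. We can verify this in the Prometheus UI.
Port-forward the Prometheus UI to your local machine (if the Service name below does not exist, list the available ones with kubectl get svc -n monitoring):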
kubectl port-forward svc/prometheus-kube-prometheus-stack-prometheus 9090:9090 -n monitoring
Open your browser to http://localhost:9090. In the Prometheus expression browser, type http_requests_total and click “Execute”. You should see the metric from your custom-metric-app. To generate some traffic, you can curl the service:
# Get the pod name
POD_NAME=$(kubectl get pods -l app=custom-metric-app -o jsonpath='{.items[0].metadata.name}')
# Send some requests (this assumes the container image ships curl)
for i in $(seq 1 10); do kubectl exec "$POD_NAME" -- curl -s localhost:8080 > /dev/null; done
After generating traffic, refresh the Prometheus UI. The value of http_requests_total should have increased. This confirms Prometheus is successfully collecting your custom metrics.
Step 5: Create HPA with Custom Metrics
Now that our custom metric is being scraped by Prometheus and exposed by the Prometheus Adapter, we can create the Horizontal Pod Autoscaler resource. We will configure it to scale our custom-metric-app deployment based on the http_requests_per_second metric.
Create a file named hpa.yaml:
# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: custom-metric-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: custom-metric-app
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "5" # Target 5 requests per second per pod
In this HPA definition:
- scaleTargetRef points to our custom-metric-app deployment.
- minReplicas is 1, and maxReplicas is 5.
- metrics defines the custom metric:
  - type: Pods indicates that this is a custom metric aggregated across pods.
  - metric.name: http_requests_per_second matches the metric we configured in the Prometheus Adapter.
  - target.type: AverageValue means the HPA will try to maintain an average of averageValue: "5" requests per second across all pods.
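The arithmetic behind that target can be sketched in a few lines of Python. This follows the formula given in the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), with the default 10% tolerance. It is a simplification: the real controller also accounts for pods that are not ready or are missing metrics, and applies scaling policies and stabilization. The function name and the sample numbers are illustrative.

```python
import math


def desired_replicas(current_replicas, current_avg, target_avg,
                     min_replicas=1, max_replicas=5, tolerance=0.1):
    """Simplified HPA calculation for a Pods metric with an AverageValue target."""
    ratio = current_avg / target_avg
    # Within the tolerance band the controller leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # The result is always clamped to the HPA's minReplicas/maxReplicas.
    return max(min_replicas, min(max_replicas, desired))


print(desired_replicas(2, 8.0, 5.0))   # 4: two pods averaging 8 req/s against a target of 5
print(desired_replicas(2, 5.2, 5.0))   # 2: a 4% deviation is inside the tolerance
print(desired_replicas(3, 20.0, 5.0))  # 5: the raw result of 12 is capped at maxReplicas
print(desired_replicas(4, 1.0, 5.0))   # 1: load dropped, scale in toward minReplicas
```

The same function shows why the choice of averageValue matters: lowering the target raises the ratio for the same traffic, so the HPA asks for more replicas sooner.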
Apply the HPA:
kubectl apply -f hpa.yaml
Verify the HPA status:
kubectl get hpa custom-metric-hpa -w
Expected initial output:
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
custom-metric-hpa Deployment/custom-metric-app <unknown>/5 1 5 1 10s
Initially, TARGETS might show <unknown>/5 because it takes a moment for the HPA to fetch the first metric values. After a short while, it should update to show the current requests per second.
Now, let’s generate sustained load to trigger scaling. A ClusterIP Service is only reachable from inside the cluster, so in a new terminal run the load generator as a temporary pod:
# Loop inside a throwaway busybox pod; press Ctrl+C to stop (the pod is removed on exit)
kubectl run load-generator --rm -it --image=busybox:1.36 --restart=Never -- /bin/sh -c "while true; do wget -q -O- http://custom-metric-app > /dev/null; sleep 0.1; done"
Observe the HPA status in the terminal where you ran kubectl get hpa custom-metric-hpa -w. You should see the TARGETS value increase and, eventually, the REPLICAS count go up as the HPA scales out the deployment. The output below is illustrative; the exact values and timing in your cluster will differ:
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
custom-metric-hpa Deployment/custom-metric-app <unknown>/5 1 5 1 1m
custom-metric-hpa Deployment/custom-metric-app 12/5 1 5 1 1m
custom-metric-hpa Deployment/custom-metric-app 12/5 1 5 2 1m20s # Scaled up!
custom-metric-hpa Deployment/custom-metric-app 6/5 1 5 2 1m30s
custom-metric-hpa Deployment/custom-metric-app 14/5 1 5 3 1m45s # Scaled up again!
You can also check the deployment’s pods:
kubectl get pods -l app=custom-metric-app
You will see new pods being created. This demonstrates successful scaling based on custom metrics. For more advanced autoscaling scenarios, especially for cost optimization, consider exploring tools like Karpenter Cost Optimization.
Production Considerations
Deploying HPA with custom metrics in a production environment requires careful planning and robust infrastructure:
- Reliable Metrics Source: Ensure your Prometheus setup is highly available and robust. Consider Thanos or Mimir for long-term storage and global views.
- Appropriate Metric Selection: Choose metrics that truly reflect the load and performance of your application. Avoid “noisy” metrics or those that don’t directly correlate with scaling needs. For applications dealing with heavy data processing or AI/ML workloads, metrics related to GPU utilization might be more relevant, as discussed in our LLM GPU Scheduling Guide.
- Prometheus Adapter Configuration: Carefully define your Prometheus Adapter rules. Incorrect regex or aggregation can lead to inaccurate metrics and erratic scaling. Test these rules thoroughly.
- HPA Cooldown and Stabilization: By default the HPA applies a 5-minute stabilization window to scale-down decisions to prevent rapid flapping. Tune this per HPA with the behavior field, or cluster-wide in the kube-controller-manager configuration if necessary, but be cautious.
- Resource Requests and Limits: Set accurate CPU and memory requests and limits for your application pods. This helps the HPA make better scaling decisions and ensures your nodes aren’t oversubscribed.
- Monitoring HPA Itself: Monitor the HPA’s decisions and its target’s performance. Grafana dashboards that show current replicas, target metrics, and actual metrics are invaluable.
- Cost Management: While autoscaling helps optimize resource usage, it can also lead to unexpected costs if not properly managed. Combine HPA with cluster autoscalers (like Kubernetes Cluster Autoscaler or Karpenter) to ensure nodes are also scaled up/down efficiently.
- Security: Ensure your Prometheus and Adapter deployments are secured. Use network policies (refer to our Network Policies Security Guide) to restrict access to the metrics endpoints and the Prometheus UI.
- Observability: Integrate with robust observability platforms. Tools like eBPF Observability with Hubble can provide deep insights into network and application performance, complementing your custom metrics.
- Testing Scaling Behavior: Always test your HPA configurations under simulated load in a staging environment before deploying to production.
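The scale-down stabilization mentioned in the list above can be pictured with a short Python sketch. As documented for the HPA, the controller keeps its recent replica recommendations and, when scaling down, acts on the highest one inside the window (300 seconds by default, configurable through behavior.scaleDown.stabilizationWindowSeconds). The function name and the history values here are hypothetical.

```python
def stabilized_replicas(history, now, window_seconds=300):
    """Pick the highest recommendation made within the stabilization window.

    history is a list of (timestamp_seconds, recommended_replicas) pairs.
    """
    recent = [replicas for timestamp, replicas in history
              if now - timestamp <= window_seconds]
    if not recent:
        raise ValueError("no recommendations inside the stabilization window")
    return max(recent)


# Hypothetical recommendations while load falls away over ten minutes.
history = [(0, 5), (120, 4), (240, 2), (360, 2)]

print(stabilized_replicas(history, now=300))  # 5: the t=0 recommendation still counts
print(stabilized_replicas(history, now=360))  # 4: the recommendation of 5 has aged out
print(stabilized_replicas(history, now=600))  # 2: only the latest recommendation remains
```

A longer window makes scale-down more conservative, which trades some idle capacity for fewer replica changes when the metric is noisy.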
Troubleshooting
- HPA status shows <unknown>/TARGET, FailedGetPodsMetric, or FailedGetResourceMetric
  Issue: The HPA cannot retrieve the custom metric value.
  Solution:
  - Check the Prometheus Adapter logs with kubectl logs -n monitoring -l app.kubernetes.io/name=prometheus-adapter and look for errors related to Prometheus queries or metric exposure.
  - Verify the Prometheus Adapter’s configuration (adapter-values.yaml). Ensure the seriesQuery correctly matches your metric and the metricsQuery is valid PromQL.
  - Ensure Prometheus itself is healthy and scraping your application’s metrics (check Prometheus UI targets).
  - Verify the custom metrics API endpoint with kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq . and confirm your metric is listed.
  - If the HPA also uses CPU or memory metrics, ensure the Metrics Server is running and healthy (Step 1).
- HPA isn’t scaling up/down even when metrics exceed/fall below target.
  Issue: The HPA reports metrics, but replica count doesn’t change.
  Solution:
  - Describe the HPA with kubectl describe hpa custom-metric-hpa. Look at the “Events” section for reasons why scaling might not be happening (e.g., cooldown periods, failed to scale target).
  - Check minReplicas and maxReplicas in your HPA definition. Ensure the target metric is actually outside the desired range.
  - Verify the metric value reported by the HPA with kubectl get hpa custom-metric-hpa -o yaml. Compare currentReplicas, desiredReplicas, and currentMetrics.
  - Ensure the target deployment is healthy and can accept new replicas.
- Prometheus isn’t scraping metrics from my application.
  Issue: Your custom metric doesn’t appear in the Prometheus UI.
  Solution:
  - Check the pod annotations: Ensure prometheus.io/scrape: "true", prometheus.io/path, and prometheus.io/port are correctly set on your application’s pod template.
  - Verify network connectivity: Can Prometheus reach your application pod’s metrics endpoint? Check service and network configurations. Consider Kubernetes Network Policies if you have strict firewall rules.
  - Check Prometheus targets in the UI (http://localhost:9090/targets). Look for your application’s endpoint and any errors.
  - Ensure your application is actually exposing metrics on the specified path and port.
- Prometheus Adapter logs show “no metrics returned from Prometheus” or similar.
  Issue: The adapter can’t query Prometheus successfully.
  Solution:
  - Verify the Prometheus URL and port in adapter-values.yaml (prometheus.url: http://prometheus-kube-prometheus-stack-prometheus.monitoring.svc). Ensure it’s correct for your Prometheus service.
  - Test the PromQL query directly in the Prometheus UI. If it doesn’t return data there, it won’t work in the adapter.
  - Check network connectivity between the Prometheus Adapter pod and the Prometheus service.
- HPA scales too aggressively or not aggressively enough.
  Issue: The scaling behavior isn’t optimal for your application.
  Solution:
  - Adjust the target.averageValue for your custom metric in the HPA. A lower value will scale up more aggressively, a higher value less so.
  - Consider the HPA’s behavior field (available in v2 and v2beta2 API versions) to fine-tune scaling policies, including stabilization windows and scaling velocities. Refer to the Kubernetes HPA documentation.
  - Review the PromQL query in the adapter. Is the aggregation window (e.g., [2m] for rate) appropriate for your metric’s volatility?
  - Ensure your application’s resource requests and limits are set appropriately.
FAQ Section
- What’s the difference between custom.metrics.k8s.io and external.metrics.k8s.io?
  custom.metrics.k8s.io refers to metrics associated with Kubernetes objects (like Pods, Deployments, Nodes), typically collected from within the cluster (e.g., Prometheus scraping application pods). external.metrics.k8s.io refers to metrics that are not directly related to Kubernetes objects but come from external sources (e.g., AWS SQS queue length, Google Cloud Pub/Sub backlog). The HPA can scale based on both, but requires different adapter configurations.
- Can I use multiple custom metrics in a single HPA?
  Yes, you can define multiple metrics in the metrics array of your HPA. The HPA will calculate the desired number of replicas for each metric independently and then choose the maximum of those desired replica counts. This ensures that the application scales sufficiently for all defined metrics.
- How do I define custom metrics for specific pods or namespaces?
  The Prometheus Adapter’s rules let you map Prometheus labels (like kubernetes_namespace and kubernetes_pod_name) to Kubernetes resources. This enables the HPA to query for metrics specific to a particular namespace, deployment, or even individual pods, depending on how your Prometheus metrics are labeled.
- What alternatives are there to Prometheus and Prometheus Adapter for custom metrics?
  While Prometheus is dominant, other solutions exist. For cloud-specific metrics, you might use adapters that integrate directly with cloud monitoring services (e.g., AWS CloudWatch adapter, Google Cloud Monitoring adapter). For specific message queues, there might be dedicated operators or adapters. However, Prometheus Adapter is highly flexible due to its PromQL-based configuration.
- How does HPA handle rapidly fluctuating custom metrics?
  HPA has built-in stabilization logic to prevent “thrashing” (rapid scaling up and down). It uses a “stabilization window” (default 5 minutes for scale down, 0 for scale up) during which it considers previous desired replica counts. If your metric is extremely volatile, you might need to adjust the Prometheus Adapter’s aggregation window (e.g., increase the [2m] in rate()) or configure longer stabilization windows in the HPA’s behavior section (available in the autoscaling/v2beta2 and autoscaling/v2 APIs).
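The "maximum across metrics" rule from the FAQ can be illustrated with a short Python sketch. It applies the documented per-metric formula to each metric and keeps the largest result. The queue_depth metric and all the numbers are hypothetical, and tolerance, min/max clamping, and stabilization are left out for brevity.

```python
import math


def replicas_for(current_replicas, current_value, target_value):
    """Per-metric proposal: ceil(currentReplicas * current / target)."""
    return math.ceil(current_replicas * current_value / target_value)


def combined_recommendation(current_replicas, metrics):
    """metrics maps a metric name to a (current_value, target_value) pair."""
    per_metric = {
        name: replicas_for(current_replicas, current, target)
        for name, (current, target) in metrics.items()
    }
    # The HPA acts on the largest proposal so that every metric is satisfied.
    return max(per_metric.values()), per_metric


# Three replicas; request rate is under target, the (hypothetical) queue is not.
metrics = {
    "http_requests_per_second": (4, 5),
    "queue_depth": (30, 10),
}
chosen, per_metric = combined_recommendation(3, metrics)
print(per_metric)  # {'http_requests_per_second': 3, 'queue_depth': 9}
print(chosen)      # 9
```

Here the request rate alone would keep three replicas, but the queue backlog asks for nine, so nine is what the HPA would aim for before maxReplicas is applied.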
Cleanup Commands
To remove all resources created during this tutorial:
# Delete the HPA
kubectl delete -f hpa.yaml
# Delete the application deployment and service
kubectl delete -f app.yaml
# Uninstall Prometheus Adapter
helm uninstall prometheus-adapter -n monitoring
# Uninstall Prometheus stack
helm uninstall prometheus -n monitoring
kubectl delete namespace monitoring
# Delete Metrics Server (skip this if it was already installed before the tutorial)
kubectl delete -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
Next Steps / Further Reading
- Explore the official Kubernetes HPA documentation to understand advanced features like behaviors and multiple metric types.
- Deep dive into Prometheus Query Language (PromQL) to craft more sophisticated custom metrics.
- Learn about the Prometheus Adapter configuration options for more complex metric transformations and resource mappings.
- This guide already uses the autoscaling/v2 HorizontalPodAutoscaler API; explore its behavior field and its Object and External metric types for more granular control over scaling.
- Investigate other Kubernetes autoscaling mechanisms like the Cluster Autoscaler for node-level scaling, and Karpenter for highly efficient node provisioning.
- For advanced traffic management and routing based on metrics, consider integrating with tools like the Kubernetes Gateway API.
Conclusion
Harnessing the power of the Horizontal Pod Autoscaler with custom metrics is a game-changer for building resilient, cost-effective, and performance-optimized applications on Kubernetes. By moving beyond generic CPU and memory metrics, you can create scaling policies that truly align with your application’s unique operational characteristics and business logic. This guide provided a comprehensive walkthrough, from setting up the necessary components to deploying and verifying your custom metric-driven HPA.
As you continue your cloud-native journey, remember that observability is key to effective autoscaling. A robust monitoring stack like Prometheus, combined with the flexibility of custom metrics, empowers you to build self-healing and auto-adaptive systems. Embrace these tools, iterate on your scaling strategies, and watch your Kubernetes deployments gracefully handle any workload thrown their way.