Redis Cluster on Kubernetes: High Availability Setup
Deploying stateful applications like Redis in a highly available and scalable manner on Kubernetes can be a complex undertaking. While Redis offers robust clustering capabilities, integrating them seamlessly with Kubernetes’ dynamic orchestration presents unique challenges. Many organizations struggle with persistent storage, network configuration, and ensuring quorum and failover in a containerized environment, often leading to data loss or service downtime.
This guide cuts through that complexity, providing a step-by-step approach to deploying a highly available Redis Cluster on Kubernetes. We’ll leverage Kubernetes’ native features like StatefulSets for stable identifiers and persistent storage, alongside Redis’s own clustering mechanisms, to build a resilient and scalable data store. By the end, you’ll have a working Redis Cluster that can withstand node failures and scale horizontally, along with a list of what to harden before running it in production.
Whether you’re migrating an existing Redis setup or building a new one from scratch, understanding how to harness the power of Kubernetes for stateful workloads is crucial. Let’s dive in and unlock the full potential of Redis on your container orchestration platform.
TL;DR: Redis Cluster on Kubernetes
Deploy a highly available Redis Cluster using StatefulSets and PersistentVolumes. This guide covers creating a headless service, StatefulSet, and then initializing the cluster using redis-cli. Ensure your Kubernetes cluster has a default StorageClass.
# 1. Create a namespace
kubectl create namespace redis-cluster
# 2. Create the password Secret and the Redis ConfigMap (the StatefulSet references both)
REDIS_PASSWORD=$(LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 32)
kubectl create secret generic redis-secret -n redis-cluster --from-literal=password="$REDIS_PASSWORD"
kubectl apply -f redis-configmap.yaml -n redis-cluster
# 3. Apply the Headless Service and StatefulSet
kubectl apply -f redis-headless-service.yaml -n redis-cluster
kubectl apply -f redis-statefulset.yaml -n redis-cluster
# 4. Wait for pods to be ready
kubectl get pods -n redis-cluster -w
# 5. Initialize the Redis Cluster (adjust replica count if needed)
# Collect the pod addresses as ip:port pairs, the form redis-cli expects
REDIS_IPS=$(kubectl get pods -l app=redis-cluster -n redis-cluster -o jsonpath='{range .items[*]}{.status.podIP}:6379 {end}')
echo "Redis Pod IPs: $REDIS_IPS"
# Run cluster creation command from one of the pods
# For a 6-node cluster with 1 replica per master:
kubectl exec -it redis-cluster-0 -n redis-cluster -- redis-cli -a "$REDIS_PASSWORD" --cluster create $REDIS_IPS --cluster-replicas 1
# 6. Verify the cluster status
kubectl exec -it redis-cluster-0 -n redis-cluster -- redis-cli -a "$REDIS_PASSWORD" cluster info
kubectl exec -it redis-cluster-0 -n redis-cluster -- redis-cli -a "$REDIS_PASSWORD" cluster nodes
# 7. Clean up
kubectl delete namespace redis-cluster
Prerequisites
- A running Kubernetes cluster (version 1.18+ recommended). You can use Minikube, Kind, or a cloud-managed service like AWS EKS, GCP GKE, or Azure AKS.
- kubectl installed and configured to connect to your cluster.
- A default StorageClass configured in your cluster for dynamic PersistentVolume provisioning. If you don’t have one, you’ll need to create it or manually provision PersistentVolumes.
- Basic understanding of Kubernetes concepts: Pods, Services, StatefulSets, PersistentVolumes, and PersistentVolumeClaims.
- Familiarity with Redis clustering concepts.
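The StorageClass prerequisite is easy to check before going further. As a minimal sketch, the helper below (has_default_storageclass is a hypothetical name, not a kubectl feature) reads the table printed by kubectl get storageclass and succeeds if any row carries the "(default)" marker. The sample table is illustrative only; your provisioner and class names will differ.

```shell
# has_default_storageclass: reads `kubectl get storageclass` output on stdin
# and succeeds if any row is marked "(default)".
has_default_storageclass() {
  grep -q '(default)'
}

# Illustrative sample only; real input comes from: kubectl get storageclass
sample='NAME                 PROVISIONER                RECLAIMPOLICY
standard (default)   k8s.io/minikube-hostpath   Delete'

if printf '%s\n' "$sample" | has_default_storageclass; then
  echo "default StorageClass found"
else
  echo "no default StorageClass; create one or provision PersistentVolumes manually"
fi
```

Against a live cluster you would pipe the real command into it: kubectl get storageclass | has_default_storageclass.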
Step-by-Step Guide: Redis Cluster Deployment
1. Create a Namespace
It’s always a good practice to isolate your applications within their own Kubernetes namespaces. This helps with resource management, access control, and overall organization of your cluster. We’ll create a namespace specifically for our Redis Cluster, named redis-cluster.
kubectl create namespace redis-cluster
Verify that the namespace has been created:
kubectl get namespaces
Expected Output:
NAME STATUS AGE
default Active 2d
kube-system Active 2d
kube-public Active 2d
kube-node-lease Active 2d
redis-cluster Active 5s
2. Define the Headless Service
A Headless Service is crucial for StatefulSets. Unlike a regular Service that provides a single stable IP address, a Headless Service doesn’t have a cluster IP. Instead, it allows for direct discovery of the individual Pods backing the Service via DNS. Each Pod in our StatefulSet will get a unique, stable hostname (e.g., redis-cluster-0.redis-cluster-headless.redis-cluster.svc.cluster.local) that keeps resolving to the Pod even after it is rescheduled onto a new IP address.
These stable names give clients and operational commands a fixed way to reach each node, which is a fundamental building block for stateful applications on Kubernetes. Keep in mind that Redis 6.2, the version used in this guide, still exchanges peer IP addresses on the cluster bus and stores them in nodes.conf; announcing hostnames between nodes (cluster-announce-hostname) is available from Redis 7.0 onward.
# redis-headless-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: redis-cluster-headless
  namespace: redis-cluster
  labels:
    app: redis-cluster
spec:
  ports:
  - port: 6379
    name: redis
  - port: 16379 # Port for cluster bus communication
    name: cluster-bus
  clusterIP: None # This makes it a Headless Service
  selector:
    app: redis-cluster
Apply the Headless Service:
kubectl apply -f redis-headless-service.yaml -n redis-cluster
Verify the service creation:
kubectl get service -n redis-cluster
Expected Output:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
redis-cluster-headless ClusterIP None <none> 6379/TCP,16379/TCP 5s
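To make the naming scheme concrete, here is a small sketch that prints the DNS name each Pod receives from the Headless Service. The function name pod_fqdns is hypothetical, and the cluster.local suffix is an assumption that matches the example hostname above; clusters configured with a different DNS domain will differ.

```shell
# pod_fqdns STATEFULSET SERVICE NAMESPACE REPLICAS
# Prints <statefulset>-<ordinal>.<service>.<namespace>.svc.cluster.local per Pod.
pod_fqdns() {
  sts="$1"; svc="$2"; ns="$3"; replicas="$4"
  i=0
  while [ "$i" -lt "$replicas" ]; do
    printf '%s-%s.%s.%s.svc.cluster.local\n' "$sts" "$i" "$svc" "$ns"
    i=$((i + 1))
  done
}

# The six names used by this guide's StatefulSet
pod_fqdns redis-cluster redis-cluster-headless redis-cluster 6
```

Once the Pods are running, any of these names can be resolved from inside the cluster, for example with nslookup from another Pod.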
3. Create the Redis StatefulSet
The StatefulSet is the cornerstone of our Redis Cluster deployment. It ensures stable, unique network identifiers for each Pod, ordered deployment and scaling, and most importantly, persistent storage. Each Redis Pod will get its own PersistentVolumeClaim (PVC) and PersistentVolume (PV), guaranteeing that data persists even if the Pod is restarted or rescheduled.
We’ll configure the StatefulSet to deploy 6 Redis nodes initially. Redis Cluster requires at least 3 master nodes for high availability, and 6 nodes (3 masters, 3 replicas) is a common starting point. Each node will expose port 6379 for client connections and 16379 for the cluster bus, as required by Redis Cluster. The volumeClaimTemplates section dynamically provisions persistent storage for each Pod, ensuring data durability.
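The relationship between the StatefulSet size and the resulting topology is simple arithmetic: the number of masters is the node count divided by (replicas per master + 1), and it has to be at least 3. The sketch below encodes that rule; master_count is a hypothetical helper for planning, not part of redis-cli.

```shell
# master_count NODES REPLICAS_PER_MASTER
# Prints how many masters the topology yields, or fails if fewer than 3.
master_count() {
  nodes="$1"; replicas="$2"
  masters=$(( nodes / (replicas + 1) ))
  if [ "$masters" -lt 3 ]; then
    echo "need at least 3 masters, got $masters" >&2
    return 1
  fi
  echo "$masters"
}

# 6 nodes with 1 replica per master, as deployed in this guide
master_count 6 1
```

Running it with 4 nodes and 1 replica per master fails, which is why this guide starts at 6 nodes.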
For advanced networking configurations, especially in multi-cluster or hybrid cloud scenarios, you might consider solutions like Cilium WireGuard Encryption to secure inter-node communication, or even leveraging Kubernetes Network Policies for fine-grained access control to your Redis pods.
# redis-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis-cluster
  namespace: redis-cluster
spec:
  serviceName: redis-cluster-headless
  replicas: 6 # We need at least 6 nodes for a 3-master, 3-replica cluster
  selector:
    matchLabels:
      app: redis-cluster
  template:
    metadata:
      labels:
        app: redis-cluster
    spec:
      containers:
      - name: redis
        image: redis:6.2.6-alpine # Using a stable Redis image
        command: ["redis-server"]
        # Kubernetes expands $(REDIS_PASSWORD) in args; Redis does not expand it inside redis.conf,
        # so the password is passed here and overrides the values in the config file.
        args: ["/etc/redis/redis.conf", "--requirepass", "$(REDIS_PASSWORD)", "--masterauth", "$(REDIS_PASSWORD)"]
        ports:
        - containerPort: 6379
          name: redis
        - containerPort: 16379 # Cluster bus port
          name: cluster-bus
        volumeMounts:
        - name: redis-data
          mountPath: /data
        - name: redis-config
          mountPath: /etc/redis
        livenessProbe:
          exec:
            # Exec probes are not run in a shell, so wrap the command in sh -c to read the variable.
            # Assumes a password is set; drop -a "$REDIS_PASSWORD" if not.
            command: ["sh", "-c", "redis-cli -a \"$REDIS_PASSWORD\" --no-auth-warning ping"]
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          exec:
            command: ["sh", "-c", "redis-cli -a \"$REDIS_PASSWORD\" --no-auth-warning ping"]
          initialDelaySeconds: 10
          periodSeconds: 5
        env:
        - name: REDIS_PASSWORD
          valueFrom:
            secretKeyRef:
              name: redis-secret
              key: password
        resources:
          requests:
            cpu: "100m"
            memory: "128Mi"
          limits:
            cpu: "500m"
            memory: "512Mi"
      volumes:
      - name: redis-config
        configMap:
          name: redis-cluster-config
  volumeClaimTemplates:
  - metadata:
      name: redis-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: standard # Ensure this matches your cluster's default StorageClass
      resources:
        requests:
          storage: 1Gi # Adjust storage as needed
Before applying the StatefulSet, we need a ConfigMap for Redis configuration and a Secret for the password (if you choose to use one, which is highly recommended for production). Let’s create those first.
Create Redis Configuration ConfigMap
This ConfigMap holds the redis.conf file that each Redis Pod will use. Key settings for cluster mode are enabled here.
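# redis-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-cluster-config
  namespace: redis-cluster
data:
  redis.conf: |
    port 6379
    cluster-enabled yes
    cluster-config-file nodes.conf
    cluster-node-timeout 5000
    appendonly yes
    # Set protected-mode to 'yes' and configure bind for production.
    # (Redis only treats '#' as a comment at the start of a line.)
    protected-mode no
    dir /data
    # Require authentication.
    # Redis reads this file verbatim and does not expand $(REDIS_PASSWORD):
    # substitute the real password before startup, or override both settings
    # on the redis-server command line (--requirepass / --masterauth).
    requirepass $(REDIS_PASSWORD)
    masterauth $(REDIS_PASSWORD)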
Apply the ConfigMap:
kubectl apply -f redis-configmap.yaml -n redis-cluster
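Two details of redis.conf are easy to trip over when the file lives in a ConfigMap: Redis only recognizes a comment when the line starts with #, and it does not expand Kubernetes-style $(VAR) placeholders. The following is a rough sketch of a pre-flight check for both; lint_redis_conf is a hypothetical helper and its inline-comment test is a heuristic (it would also flag a quoted value that legitimately contains " #").

```shell
# lint_redis_conf FILE
# Prints suspicious lines and returns 1 if any were found, 0 otherwise.
lint_redis_conf() {
  awk '
    /^[ \t]*#/ || /^[ \t]*$/ { next }   # full-line comments and blank lines are fine
    index($0, "$(") { printf "line %d: unexpanded placeholder: %s\n", NR, $0; bad = 1 }
    /[ \t]#/        { printf "line %d: inline comment: %s\n", NR, $0; bad = 1 }
    END { exit bad }
  ' "$1"
}

# Demo on a throwaway file containing both problems
conf=$(mktemp)
cat > "$conf" <<'EOF'
port 6379
# a full-line comment is fine
protected-mode no # an inline comment is not
requirepass $(REDIS_PASSWORD)
EOF
lint_redis_conf "$conf" || echo "redis.conf needs attention"
rm -f "$conf"
```

You could run the same function against the redis.conf content before putting it into the ConfigMap.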
Create Redis Password Secret
Store your Redis password securely using a Kubernetes Secret. This prevents hardcoding sensitive information in your StatefulSet definition.
# Generate a strong password (alphanumeric only, so it needs no shell or YAML escaping)
REDIS_PASSWORD=$(LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 32)
echo "Your Redis password: $REDIS_PASSWORD"
# Create the secret
kubectl create secret generic redis-secret -n redis-cluster --from-literal=password="$REDIS_PASSWORD"
Expected Output:
secret/redis-secret created
Now, apply the StatefulSet:
kubectl apply -f redis-statefulset.yaml -n redis-cluster
Monitor the creation of the Pods and PVCs. This might take a few minutes as Kubernetes provisions the PersistentVolumes.
kubectl get pods -n redis-cluster -w
Expected Output (after some time):
NAME READY STATUS RESTARTS AGE
redis-cluster-0 1/1 Running 0 2m
redis-cluster-1 1/1 Running 0 2m
redis-cluster-2 1/1 Running 0 1m
redis-cluster-3 1/1 Running 0 1m
redis-cluster-4 1/1 Running 0 50s
redis-cluster-5 1/1 Running 0 30s
kubectl get pvc -n redis-cluster
Expected Output:
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
redis-data-redis-cluster-0 Bound pvc-12345... 1Gi RWO standard 2m
redis-data-redis-cluster-1 Bound pvc-67890... 1Gi RWO standard 2m
redis-data-redis-cluster-2 Bound pvc-abcde... 1Gi RWO standard 1m
redis-data-redis-cluster-3 Bound pvc-fghij... 1Gi RWO standard 1m
redis-data-redis-cluster-4 Bound pvc-klmno... 1Gi RWO standard 50s
redis-data-redis-cluster-5 Bound pvc-pqrst... 1Gi RWO standard 30s
4. Initialize the Redis Cluster
Once all 6 Pods are running, they are individual Redis instances, but not yet part of a cluster. We need to use redis-cli --cluster create to form the cluster. This command connects to each node, assigns slots, and sets up the master-replica relationships. We’ll specify --cluster-replicas 1, meaning each master will have one replica, achieving our desired 3 masters and 3 replicas.
First, retrieve the IP addresses of all Redis Pods. redis-cli expects every node as an ip:port pair, so the jsonpath template appends :6379 to each address.
REDIS_IPS=$(kubectl get pods -l app=redis-cluster -n redis-cluster -o jsonpath='{range .items[*]}{.status.podIP}:6379 {end}')
echo "Redis Pod IPs: $REDIS_IPS"
Example Output:
Redis Pod IPs: 10.42.0.10:6379 10.42.0.11:6379 10.42.0.12:6379 10.42.0.13:6379 10.42.0.14:6379 10.42.0.15:6379
Now, execute the cluster creation command from one of the Redis Pods. We’ll use redis-cluster-0. The command reads the password from the REDIS_PASSWORD shell variable that was set when the Secret was created; remove the -a flag if you didn’t set a password.
kubectl exec -it redis-cluster-0 -n redis-cluster -- redis-cli -a "$REDIS_PASSWORD" --cluster create $REDIS_IPS --cluster-replicas 1
Expected Output (interactive prompt, type ‘yes’ and press Enter):
>>> Performing hash slots allocation on 6 nodes...
Master nodes:
10.42.0.10:6379
10.42.0.11:6379
10.42.0.12:6379
Replica nodes:
10.42.0.13:6379 (replica of 10.42.0.10:6379)
10.42.0.14:6379 (replica of 10.42.0.11:6379)
10.42.0.15:6379 (replica of 10.42.0.12:6379)
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
...
>>> Performing Cluster Check (using node 10.42.0.10:6379)
M: 12345... 10.42.0.10:6379
slots:0-5460 (5461 slots) master
1 additional replica(s)
S: 67890... 10.42.0.13:6379
slots: (0 slots) slave
replicates 12345...
M: abcde... 10.42.0.11:6379
slots:5461-10922 (5462 slots) master
1 additional replica(s)
S: fghij... 10.42.0.14:6379
slots: (0 slots) slave
replicates abcde...
M: klmno... 10.42.0.12:6379
slots:10923-16383 (5461 slots) master
1 additional replica(s)
S: pqrst... 10.42.0.15:6379
slots: (0 slots) slave
replicates klmno...
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
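The slot ranges in the output above come from dividing the 16384 hash slots as evenly as possible across the masters. The sketch below reproduces those three ranges with integer arithmetic. It is an approximation written for this guide (slot_ranges is a hypothetical name), and redis-cli’s own rounding may differ for other master counts.

```shell
# slot_ranges MASTERS
# Prints one "first-last (count slots)" line per master.
slot_ranges() {
  masters="$1"
  total=16384
  first=0
  i=1
  while [ "$i" -le "$masters" ]; do
    if [ "$i" -eq "$masters" ]; then
      last=$((total - 1))   # the last master takes whatever remains
    else
      # round(i * total / masters) - 1, computed without floating point
      last=$(( (2 * i * total + masters) / (2 * masters) - 1 ))
    fi
    echo "$first-$last ($((last - first + 1)) slots)"
    first=$((last + 1))
    i=$((i + 1))
  done
}

slot_ranges 3
```

For three masters this prints 0-5460, 5461-10922, and 10923-16383, matching the cluster check output.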
5. Verify Cluster Status
After initialization, it’s crucial to verify that the cluster is healthy and all nodes are communicating correctly. We can do this by checking the cluster information and node list from any of the Redis Pods.
kubectl exec -it redis-cluster-0 -n redis-cluster -- redis-cli -a "$REDIS_PASSWORD" cluster info
Expected Output:
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:0
cluster_stats_messages_ping_sent:306
cluster_stats_messages_pong_sent:308
cluster_stats_messages_meet_sent:6
cluster_stats_messages_fail_sent:0
cluster_stats_messages_publish_sent:0
cluster_stats_messages_sent:620
cluster_stats_messages_ping_received:302
cluster_stats_messages_pong_received:306
cluster_stats_messages_meet_received:5
cluster_stats_messages_fail_received:0
cluster_stats_messages_publish_received:0
cluster_stats_messages_received:613
Look for cluster_state:ok and cluster_slots_assigned:16384 to confirm a healthy cluster.
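If you want to script this check rather than read the output by eye, something along these lines should work. cluster_healthy is a hypothetical helper that reads cluster info output on stdin and tests the two fields mentioned above; it strips carriage returns because Redis replies use CRLF line endings.

```shell
# cluster_healthy: reads `redis-cli cluster info` output on stdin.
# Succeeds only if the state is ok and all 16384 slots are assigned.
cluster_healthy() {
  info=$(tr -d '\r')
  printf '%s\n' "$info" | grep -qx 'cluster_state:ok' &&
    printf '%s\n' "$info" | grep -qx 'cluster_slots_assigned:16384'
}

# Demo with an excerpt of the expected output shown above
sample='cluster_state:ok
cluster_slots_assigned:16384
cluster_known_nodes:6'

if printf '%s\n' "$sample" | cluster_healthy; then
  echo "cluster healthy"
else
  echo "cluster NOT healthy"
fi
```

Against the live cluster, pipe the real command into it (without -it, since no TTY is needed): kubectl exec redis-cluster-0 -n redis-cluster -- redis-cli -a "$REDIS_PASSWORD" cluster info | cluster_healthy.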
Next, inspect the node configuration:
kubectl exec -it redis-cluster-0 -n redis-cluster -- redis-cli -a "$REDIS_PASSWORD" cluster nodes
Expected Output (will vary slightly but show 3 masters and 3 replicas):
12345... 10.42.0.10:6379@16379 master - 0 1678891234567 1 connected 0-5460
67890... 10.42.0.13:6379@16379 slave 12345... 0 1678891234567 1 connected
abcde... 10.42.0.11:6379@16379 master - 0 1678891234567 2 connected 5461-10922
fghij... 10.42.0.14:6379@16379 slave abcde... 0 1678891234567 2 connected
klmno... 10.42.0.12:6379@16379 master - 0 1678891234567 3 connected 10923-16383
pqrst... 10.42.0.15:6379@16379 slave klmno... 0 1678891234567 3 connected
This output clearly shows three master nodes, each responsible for a range of hash slots, and three replica nodes, each replicating one of the masters. Your Redis Cluster is now fully operational and highly available on Kubernetes!
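The same check can be automated by counting roles in the cluster nodes output. In this sketch, count_roles is a hypothetical helper that looks at the third column (the flags field), so it also matches values such as myself,master. It does not distinguish healthy nodes from ones flagged as failing.

```shell
# count_roles: reads `redis-cli cluster nodes` output on stdin
# and prints how many masters and replicas it lists.
count_roles() {
  awk '
    $3 ~ /master/ { m++ }
    $3 ~ /slave/  { s++ }
    END { printf "masters=%d replicas=%d\n", m, s }
  '
}

# Demo with the example output shown above (node IDs abbreviated as in the guide)
count_roles <<'EOF'
12345... 10.42.0.10:6379@16379 master - 0 1678891234567 1 connected 0-5460
67890... 10.42.0.13:6379@16379 slave 12345... 0 1678891234567 1 connected
abcde... 10.42.0.11:6379@16379 master - 0 1678891234567 2 connected 5461-10922
fghij... 10.42.0.14:6379@16379 slave abcde... 0 1678891234567 2 connected
klmno... 10.42.0.12:6379@16379 master - 0 1678891234567 3 connected 10923-16383
pqrst... 10.42.0.15:6379@16379 slave klmno... 0 1678891234567 3 connected
EOF
```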
Production Considerations
- Persistent Storage: While we used a default StorageClass, for production, consider a robust, high-performance storage solution. Cloud providers offer managed block storage (e.g., AWS EBS, GCP Persistent Disk) that seamlessly integrates with Kubernetes via CSI drivers.
- Resource Limits & Requests: Properly size CPU and memory requests/limits for your Redis Pods. Over-provisioning wastes resources, while under-provisioning can lead to performance issues or OOMKills. Monitor your Redis instances closely using tools like eBPF Observability with Hubble or Prometheus and Grafana.
- Security:
  - Authentication: Always use requirepass and masterauth for Redis. Store passwords in Kubernetes Secrets, as demonstrated.
  - Network Policies: Implement Kubernetes Network Policies to restrict traffic to and from Redis Pods. Only allow necessary connections from application Pods and block all other ingress/egress.
  - TLS/SSL: For highly sensitive data, consider enabling TLS for Redis client and cluster bus communication. This adds complexity but enhances security.
  - Pod Security: Use Pod Security Standards (PSS) or tools like Kyverno to enforce security best practices for your Redis Pods, such as running as a non-root user.
- Backup and Restore: Implement a robust backup strategy. Redis provides RDB snapshots and AOF persistence. You can use Kubernetes CronJobs to trigger Redis BGSAVE commands and copy the RDB files to object storage (S3, GCS) for off-site backups. See the official Redis persistence documentation.
- Monitoring and Alerting: Deploy monitoring agents (e.g., Prometheus Node Exporter, Redis Exporter) to collect metrics. Set up alerts for high memory usage, high latency, master failures, or insufficient replicas.
- Scaling:
  - Horizontal Scaling: To add more master nodes or replicas, you’ll need to scale the StatefulSet and then use redis-cli --cluster add-node or --cluster add-node --cluster-slave, followed by re-sharding.
  - Vertical Scaling: Adjust CPU/memory limits in the StatefulSet and perform a rolling update.
  - For cost optimization and efficient node management, especially in dynamic environments, consider tools like Karpenter to manage your underlying cluster nodes.
- External Access: For applications outside the cluster, expose Redis via a LoadBalancer Service (for clients that can handle Redis Cluster redirects) or use an intermediate proxy like Envoy or Twemproxy. For advanced traffic management, consider the Kubernetes Gateway API.
- Service Mesh Integration: If you’re running a service mesh like Istio Ambient Mesh, ensure Redis Pods are properly configured to interact with the mesh, especially for mTLS and traffic policies.
Troubleshooting
- Pods stuck in Pending state:
  Issue: Redis Pods are not scheduling and remain in Pending status.
  Solution: This usually indicates a problem with PersistentVolumeClaim provisioning. Check if your cluster has a default StorageClass or if the specified storageClassName in your StatefulSet is valid.
  kubectl describe pod redis-cluster-0 -n redis-cluster
  kubectl get events -n redis-cluster
  kubectl get storageclass
  Look for events like “Failed scheduling” or “no persistent volume available for this claim”. Ensure your StorageClass exists and has available capacity.
- Pods stuck in CrashLoopBackOff:
  Issue: Redis Pods repeatedly start and crash.
  Solution: Check the Pod logs to identify the root cause. Common reasons include incorrect Redis configuration, insufficient memory, or issues with the persistent volume.
  kubectl logs redis-cluster-0 -n redis-cluster
  Look for errors related to configuration files (e.g., redis.conf), memory allocation, or file permissions on /data.
- redis-cli --cluster create fails or hangs:
  Issue: The cluster creation command does not complete successfully.
  Solution:
  - Network Connectivity: Ensure all Redis Pods can communicate with each other on both 6379 and 16379 ports. Check Kubernetes network policies or firewall rules if applicable.
  - Pod IPs: Verify that the REDIS_IPS variable contains the correct and current IPs of all running Pods.
  - Password: Double-check that the -a "$REDIS_PASSWORD" flag is correctly used and the password matches the one in the Secret.
  - Existing Cluster: If you’re re-running the command, ensure no previous cluster configuration exists on the nodes. You might need to delete nodes.conf from the persistent volumes (requires careful steps).
- cluster_state:fail or missing slots:
  Issue: After creation, cluster info shows a failed state or not all 16384 slots are covered.
  Solution: This means the cluster didn’t initialize correctly.
  - Re-run --cluster create: Sometimes a temporary network glitch can cause issues. Try running the command again; note that redis-cli rejects nodes that already hold cluster state, so this only helps if the first attempt stopped before configuring them.
  - Check logs: Inspect logs of all Redis Pods for errors during startup or cluster communication.
  - Manual Fix: In complex scenarios, you might need to use redis-cli --cluster fix or manually intervene using cluster meet and cluster replicate commands. This is advanced and should be done with caution.
- Clients cannot connect or receive redirects:
  Issue: Application clients fail to connect to the Redis Cluster or get stuck in redirect loops.
  Solution:
  - Client Library: Ensure your client library supports Redis Cluster mode and is configured to connect to multiple nodes.
  - Endpoint: Clients should connect to any master node. The Redis client will then handle redirects to the correct node based on the slot.
  - Firewall/Network Policies: Verify that your application Pods have network access to the Redis Pods on port 6379.
  - External Access: If clients are external to the cluster, ensure your LoadBalancer or proxy is correctly configured to expose the cluster.
FAQ Section
- What is the minimum number of nodes required for a Redis Cluster?
  A Redis Cluster requires at least 3 master nodes for high availability. Each master node should ideally have at least one replica to ensure data durability and automatic failover. So, a minimum practical setup is 6 nodes (3 masters, 3 replicas).
- Can I scale my Redis Cluster after deployment?
  Yes, you can scale a Redis Cluster horizontally. To add more master nodes, you would scale up your StatefulSet, then use redis-cli --cluster add-node to add the new nodes to the cluster, and finally redis-cli --cluster reshard to redistribute hash slots to the new masters. To add replicas, you’d add new nodes and use redis-cli --cluster add-node --cluster-slave --cluster-master-id <master_id>.
- How does Redis Cluster handle data persistence on Kubernetes?
  Each Redis Pod in the StatefulSet is provisioned with its own PersistentVolumeClaim (PVC) and underlying PersistentVolume (PV). Redis stores its data (RDB snapshots and AOF logs) on this persistent storage. If a Pod restarts or is rescheduled, Kubernetes ensures the same PV is reattached, preserving the data.
- Is it safe to delete a Redis Pod in a cluster?
  If you delete a master Pod, its replica will automatically be promoted to master. If it was a replica Pod, a new replica will eventually be assigned or you can manually add one. Kubernetes will try to recreate the Pod with the same persistent volume. However, for planned maintenance or scaling down, it’s best to use Redis Cluster’s native migration tools (redis-cli --cluster del-node) first to ensure data is properly re-sharded or replicas are reassigned before deleting the Pods or reducing StatefulSet replicas.
- How do I connect my application to the Redis Cluster?
  Your application should use a Redis client library that supports Redis Cluster. You typically provide the client with a list of initial cluster node IPs (or hostnames from the headless service). The client will then discover the entire cluster topology and handle redirects for specific keys to the correct master node. You can expose one or more Redis masters via a standard Kubernetes Service (e.g., ClusterIP or LoadBalancer) for external client access, though the client still needs to understand cluster redirects.
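To see how a cluster-aware client decides which master owns a key, it helps to compute a hash slot by hand. Redis Cluster maps a key to CRC16(key) mod 16384, using the XMODEM variant of CRC16. The sketch below implements that in plain shell arithmetic. key_slot is a hypothetical helper, and for simplicity it ignores hash tags (the {...} syntax), so it is only accurate for keys without braces.

```shell
# key_slot KEY
# Prints the Redis Cluster hash slot for KEY (hash tags are NOT handled).
key_slot() {
  crc=0
  # od prints each byte of the key as an unsigned decimal number
  for byte in $(printf '%s' "$1" | od -An -v -tu1); do
    crc=$(( (crc ^ (byte << 8)) & 0xFFFF ))
    bit=0
    while [ "$bit" -lt 8 ]; do
      if [ $((crc & 0x8000)) -ne 0 ]; then
        crc=$(( ((crc << 1) ^ 0x1021) & 0xFFFF ))   # polynomial 0x1021
      else
        crc=$(( (crc << 1) & 0xFFFF ))
      fi
      bit=$((bit + 1))
    done
  done
  echo $((crc % 16384))
}

key_slot foo
```

You can cross-check any result against a running node with redis-cli cluster keyslot <key>; the client then routes the command to whichever master owns the range containing that slot.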
Cleanup Commands
Once you’re done experimenting, you can clean up all the resources created by deleting the namespace. This will remove the StatefulSet, Services, ConfigMap, Secret, Pods, and PersistentVolumeClaims. Depending on your StorageClass configuration, the underlying PersistentVolumes might also be deleted automatically (reclaimPolicy: Delete) or need manual cleanup (reclaimPolicy: Retain).
kubectl delete namespace redis-cluster
Expected Output:
namespace "redis-cluster" deleted
If your PVs are not automatically deleted, you may need to manually clean them up:
kubectl get pv # Identify PVs associated with redis-cluster PVCs
# kubectl delete pv <pv-name>