Introduction
In the dynamic world of Kubernetes, the Container Network Interface (CNI) is the backbone of pod-to-pod communication, network policy enforcement, and overall cluster networking. While traditional CNIs like Calico and Flannel have served the community well, a new generation of CNI solutions has emerged, leveraging advanced kernel technologies to deliver unparalleled performance, security, and observability. Among these, Cilium stands out as a powerful, eBPF-based CNI that redefines what’s possible in Kubernetes networking.
Cilium harnesses the power of eBPF (extended Berkeley Packet Filter), a revolutionary in-kernel technology that allows for dynamic, programmable packet processing without modifying the kernel source code or loading kernel modules. This enables Cilium to provide high-performance networking, advanced security features like API-aware network policies, and deep observability into network traffic, all with minimal overhead. This guide will walk you through the complete installation and configuration of Cilium, from setting up your cluster to exploring its advanced features, ensuring you can leverage its full potential for your Kubernetes workloads.
🚀 TL;DR: Cilium CNI Quickstart 🚀
Cilium brings eBPF-powered networking, security, and observability to Kubernetes. Here’s how to get it running fast:
- Requirements: a Kubernetes cluster on a version supported by your Cilium release, Helm 3, kubectl.
- Installation with Helm:
helm repo add cilium https://helm.cilium.io/
helm repo update
helm install cilium cilium/cilium --version 1.15.4 \
--namespace kube-system \
--set ipam.mode=kubernetes \
--set egressGateway.enabled=true \
--set hubble.enabled=true \
--set hubble.ui.enabled=true \
--set l7Proxy=true \
--set encryption.enabled=true \
--set encryption.type=wireguard
kubectl -n kube-system get pods -l k8s-app=cilium
kubectl -n kube-system get pods -l k8s-app=hubble-ui
cilium status --wait
cilium hubble port-forward & # Forwards the Hubble Relay API (port 4245) for the hubble CLI
kubectl port-forward -n kube-system svc/hubble-ui 8080:80 & # Or directly
echo "Access Hubble UI at http://localhost:8080"
Prerequisites
Before we dive into the installation, ensure you have the following:
- A running Kubernetes cluster. Each Cilium release supports a specific range of Kubernetes versions, so check the Cilium system requirements for the release you plan to install. For this guide, we’ll assume a fresh cluster.
- kubectl installed and configured to communicate with your cluster. Refer to the official Kubernetes documentation for installation instructions.
- helm installed (version 3.x is required). You can find installation instructions on the Helm website.
- Sufficient privileges to install cluster-wide resources (ClusterRole, ClusterRoleBinding, DaemonSet, etc.).
- Basic understanding of Kubernetes networking concepts and YAML manifests.
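A quick sanity check for these prerequisites might look like the following. This is a minimal sketch: it only verifies that the tools are on your PATH, not that their versions meet the requirements above.

```shell
#!/bin/sh
# Minimal prerequisite check: confirm the required CLI tools are installed.
# This only checks for presence on PATH; verify versions separately
# (e.g. "helm version" should report 3.x).
for tool in kubectl helm; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: NOT found - install it before proceeding"
  fi
done
echo "prerequisite check complete"
```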
Step-by-Step Guide: Cilium Installation and Configuration
Step 1: Prepare Your Kubernetes Cluster
Before installing Cilium, it’s crucial to ensure your cluster is in a clean state, especially if you’re replacing an existing CNI. If you have another CNI like Calico or Flannel installed, you must remove it completely. Leaving multiple CNIs active will lead to networking conflicts and instability.
For most installations, this involves deleting the DaemonSet, ClusterRole, and associated resources of the previous CNI. If you’re on a cloud provider, some of them might install their own CNI by default. For instance, on GKE, you might need to provision a cluster without network policy enforcement enabled by default if you plan to use Cilium for that. Always consult your cloud provider’s documentation for CNI removal procedures.
Additionally, decide how the kube-proxy component will be handled. Cilium can operate alongside kube-proxy, but it can also replace it entirely with eBPF-native service load balancing for better performance. For simplicity, we’ll have Cilium take over kube-proxy’s responsibilities by enabling its kube-proxy replacement during installation.
# If you have an existing CNI, remove it.
# Example for a common CNI (adjust as per your actual CNI):
# kubectl delete -f https://raw.githubusercontent.com/projectcalico/calico/v3.25.0/manifests/calico.yaml
# For a fresh cluster, no specific action might be needed if no CNI is pre-installed.
# If you are using a managed Kubernetes service, consult their documentation on disabling default CNIs or kube-proxy.
# For Kubeadm clusters, you might need to re-initialize or configure kube-proxy later.
# For now, let's proceed assuming a clean slate or a cluster where default CNI is removed.
echo "Cluster preparation complete. Proceeding with Cilium installation."
Verify: There’s no direct output to verify here, but a good check is to ensure that pods are not stuck in a ContainerCreating state due to networking issues before Cilium is installed. After removing a CNI, some pods might show networking errors, which is expected until Cilium is deployed.
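One way to spot leftovers from a previous CNI is to inspect the CNI configuration directory on each node. The snippet below is a sketch: /etc/cni/net.d is the conventional default path, but your distribution or managed service may use a different location.

```shell
#!/bin/sh
# List any CNI configuration files left behind by a previous CNI.
# /etc/cni/net.d is the conventional location; adjust for your distro.
CNI_DIR=/etc/cni/net.d
if [ -d "$CNI_DIR" ] && [ -n "$(ls -A "$CNI_DIR" 2>/dev/null)" ]; then
  echo "Existing CNI configs found in $CNI_DIR:"
  ls -l "$CNI_DIR"
else
  echo "No CNI configs found in $CNI_DIR (clean slate)"
fi
```

If old config files remain after removing a CNI, stale entries there can confuse the kubelet, so clear them out before installing Cilium.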
Step 2: Install Cilium with Helm
Cilium is best installed using Helm, which provides a flexible way to configure its various components. We’ll add the Cilium Helm repository and then install Cilium, enabling several key features like Hubble for observability, WireGuard encryption, and L7 proxy support. We’ll also specifically configure IPAM mode to kubernetes, meaning Cilium will use Kubernetes’ built-in IP address management, which is common in many setups.
The --set egressGateway.enabled=true flag enables Cilium’s egress gateway, allowing you to route traffic from specific pods through dedicated egress IPs. --set hubble.enabled=true and --set hubble.ui.enabled=true activate Cilium’s powerful observability layer and its user interface. For enhanced security, we’ll enable WireGuard encryption with --set encryption.enabled=true --set encryption.type=wireguard. For more details on this, refer to our guide on Cilium WireGuard Encryption. Lastly, --set l7Proxy=true enables advanced L7 policy enforcement for HTTP/gRPC traffic.
# Add the Cilium Helm repository
helm repo add cilium https://helm.cilium.io/
helm repo update
# Install Cilium with recommended settings
helm install cilium cilium/cilium --version 1.15.4 \
--namespace kube-system \
--set ipam.mode=kubernetes \
--set egressGateway.enabled=true \
--set hubble.enabled=true \
--set hubble.ui.enabled=true \
--set l7Proxy=true \
--set encryption.enabled=true \
--set encryption.type=wireguard \
--set kubeProxyReplacement=true # Use eBPF for service load balancing instead of kube-proxy
# Note: "strict" is the deprecated pre-1.14 spelling of this setting; the boolean form is current.
# The flag does not remove a running kube-proxy DaemonSet - disable or delete kube-proxy
# separately if you want Cilium to be the sole service load balancer.
Verify: After running the Helm install command, you should see output indicating that Cilium has been deployed. Check the status of the Cilium pods.
kubectl -n kube-system get pods -l k8s-app=cilium
Expected Output:
NAME READY STATUS RESTARTS AGE
cilium-xxxxx 1/1 Running 0 2m
cilium-yyyyy 1/1 Running 0 2m
cilium-zzzzz 1/1 Running 0 2m
# ... and potentially hubble-relay and hubble-ui pods once they start
Make sure all Cilium pods are in the Running state and show 1/1 READY. It might take a minute or two for all pods to become ready.
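As the flag list grows, it becomes easier to keep the configuration in a values file and apply it with `helm upgrade --install cilium cilium/cilium -n kube-system -f values.yaml`. The file below is a sketch that mirrors the settings used in this guide, with the kube-proxy replacement expressed in the boolean form used by Cilium 1.14+:

```yaml
# values.yaml - Helm values mirroring the settings used in this guide
ipam:
  mode: kubernetes
egressGateway:
  enabled: true
hubble:
  enabled: true
  ui:
    enabled: true
l7Proxy: true
encryption:
  enabled: true
  type: wireguard
kubeProxyReplacement: true
```

Keeping values in a file also makes upgrades reproducible, since the same file can be passed to `helm upgrade` for every new Cilium version.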
Step 3: Verify Cilium Status and Connectivity
Once the Cilium pods are running, it’s essential to verify that Cilium is correctly configured and that network connectivity is working as expected. Cilium provides a powerful CLI tool, cilium, which can be installed locally and used to interact with the Cilium agents running in your cluster. This tool allows you to check the health of the Cilium agents, inspect network policies, and debug connectivity issues.
We’ll use cilium status to get a comprehensive overview of the Cilium deployment. This command checks various components, including agent health, eBPF programs, IPAM, and connectivity to the Kubernetes API server.
# Install the Cilium CLI (if you haven't already)
# This fetches the latest stable CLI version published by the Cilium project.
CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
CLI_ARCH=amd64
if [ "$(uname -m)" = "aarch64" ]; then CLI_ARCH=arm64; fi
curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
sha256sum --check cilium-linux-${CLI_ARCH}.tar.gz.sha256sum
sudo tar xzvfC cilium-linux-${CLI_ARCH}.tar.gz /usr/local/bin
rm cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
# Check Cilium status
cilium status --wait
Expected Output:
/¯¯\
/¯¯\__/¯¯\ Cilium: OK
\__/¯¯\__/ Operator: OK
/¯¯\__/¯¯\ Hubble: OK
\__/¯¯\__/ ClusterMesh: disabled
\__/
DaemonSet cilium Desired: 3, Ready: 3/3, Available: 3/3
Deployment cilium-operator Desired: 1, Ready: 1/1, Available: 1/1
Deployment hubble-relay Desired: 1, Ready: 1/1, Available: 1/1
Deployment hubble-ui Desired: 1, Ready: 1/1, Available: 1/1
Containers: cilium Running: 3
cilium-operator Running: 1
hubble-relay Running: 1
hubble-ui Running: 1
Cluster Pods: 10/10 managed (unmanaged: 0)
Image versions cilium quay.io/cilium/cilium:v1.15.4: 3
cilium-operator quay.io/cilium/operator:v1.15.4: 1
hubble-relay quay.io/cilium/hubble-relay:v1.15.4: 1
hubble-ui quay.io/cilium/hubble-ui:v0.12.0: 1
The output should show Cilium: OK, Operator: OK, and Hubble: OK. All DaemonSets and Deployments should be in a ready and available state. This confirms that Cilium and its components are healthy and operational.
Step 4: Deploy a Sample Application and Test Connectivity
To confirm that Cilium is correctly routing traffic and enforcing policies (even if none are defined yet), let’s deploy a simple application and test connectivity between pods. We’ll create two Deployments, http-client and http-server, plus a Service fronting the server; the http-client will attempt to curl the http-server through that Service.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: http-server
spec:
  selector:
    matchLabels:
      app: http-server
  replicas: 1
  template:
    metadata:
      labels:
        app: http-server
    spec:
      containers:
        - name: http-server
          image: docker.io/cilium/json-mock:1.2
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: http-server
spec:
  selector:
    app: http-server
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: http-client
spec:
  selector:
    matchLabels:
      app: http-client
  replicas: 1
  template:
    metadata:
      labels:
        app: http-client
    spec:
      containers:
        - name: http-client
          image: curlimages/curl:7.83.1
          command: ["/bin/sh", "-c", "sleep infinity"]
Apply these manifests:
kubectl apply -f https://raw.githubusercontent.com/kubezilla-io/tutorials/main/cilium-cni/sample-app.yaml
Verify: Once the pods are running, get the name of the http-client pod and execute a curl command to the http-server service.
kubectl get pods -l app=http-server
kubectl get pods -l app=http-client
CLIENT_POD=$(kubectl get pods -l app=http-client -o jsonpath='{.items[0].metadata.name}')
echo "Executing curl from client pod: $CLIENT_POD"
kubectl exec -it $CLIENT_POD -- curl -s http://http-server
Expected Output:
{"message": "Hello, world!"}
This output confirms that the http-client pod can successfully communicate with the http-server service, indicating that Cilium is correctly handling pod-to-pod networking.
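Connectivity checks like this can be flaky right after installation while Cilium is still regenerating endpoints, so it can help to wrap the probe in a small retry loop. The helper below is a sketch; the commented example at the bottom assumes the tutorial's http-client pod and http-server service.

```shell
#!/bin/sh
# Retry a command a few times before declaring failure.
# Usage: retry <attempts> <delay-seconds> <command...>
retry() {
  attempts=$1; delay=$2; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then return 0; fi
    echo "attempt $i/$attempts failed; retrying in ${delay}s" >&2
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example (assumes CLIENT_POD is set as in the verification step above):
# retry 5 3 kubectl exec "$CLIENT_POD" -- curl -s --max-time 5 http://http-server
```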
Step 5: Explore Hubble for Observability
Hubble is Cilium’s built-in observability platform, providing deep insights into network traffic, service dependencies, and policy enforcement. Since we enabled Hubble and Hubble UI during installation, we can now access its graphical interface to visualize network flows.
# Port-forward Hubble UI to your local machine
# Run in background
kubectl -n kube-system port-forward svc/hubble-ui 8080:80 &
echo "Hubble UI is available at http://localhost:8080"
echo "Alternatively, let the Cilium CLI manage the port-forward and open the UI for you:"
cilium hubble ui & # Port-forwards the hubble-ui service and opens it in your browser
Open your web browser and navigate to http://localhost:8080. You should see the Hubble UI, displaying live network flows between your pods, including the traffic between http-client and http-server. You can filter flows, inspect connection details, and visualize network policies in action. For more advanced usage and custom metrics, check out our guide on eBPF Observability with Hubble.
Step 6: Implement a Cilium Network Policy
One of Cilium’s most powerful features is its ability to enforce identity-based network policies, including L7 (application layer) policies. Let’s create a simple L3/L4 network policy to restrict access to the http-server, allowing traffic only from pods with the label app: http-client.
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: allow-http-client-to-server
spec:
  endpointSelector:
    matchLabels:
      app: http-server
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: http-client
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
Apply this policy:
kubectl apply -f https://raw.githubusercontent.com/kubezilla-io/tutorials/main/cilium-cni/cnp-allow-client.yaml
Verify: After applying the policy, the http-client should still be able to access the http-server. Let’s confirm this.
CLIENT_POD=$(kubectl get pods -l app=http-client -o jsonpath='{.items[0].metadata.name}')
echo "Executing curl from client pod with policy enabled: $CLIENT_POD"
kubectl exec -it $CLIENT_POD -- curl -s http://http-server
Expected Output:
{"message": "Hello, world!"}
Now, let’s try to simulate traffic from a pod that doesn’t have the app: http-client label. We’ll create a temporary pod with a different label.
apiVersion: v1
kind: Pod
metadata:
  name: test-client
  labels:
    app: unauthorized-client # This label is not allowed by the policy
spec:
  containers:
    - name: test-client
      image: curlimages/curl:7.83.1
      command: ["/bin/sh", "-c", "sleep infinity"]
kubectl apply -f https://raw.githubusercontent.com/kubezilla-io/tutorials/main/cilium-cni/cnp-test-client.yaml
kubectl wait --for=condition=ready pod/test-client --timeout=30s
echo "Executing curl from unauthorized client pod:"
kubectl exec -it test-client -- curl -s --connect-timeout 5 http://http-server
Expected Output:
curl: (28) Connection timed out after 5001 milliseconds
This “Connection timed out” message confirms that the CiliumNetworkPolicy is effectively blocking traffic from the unauthorized client. This demonstrates Cilium’s powerful identity-aware policy enforcement. For a deeper dive into Kubernetes network policies, including those enforced by Cilium, refer to our Network Policies Security Guide.
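The policy above stops at L3/L4. As a sketch of what Cilium’s L7 enforcement adds, the same ingress rule could be narrowed to specific HTTP verbs and paths. The manifest below is illustrative only and is not part of this tutorial’s applied manifests; adjust the method and path rules to your application.

```yaml
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: allow-http-client-get-only
spec:
  endpointSelector:
    matchLabels:
      app: http-server
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: http-client
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: "GET"
                path: "/.*"
```

With a rule like this in place, a POST from the allowed client would be rejected at the HTTP layer even though the L3/L4 connection is permitted.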
Production Considerations
Deploying Cilium in production requires careful planning beyond a basic installation. Here are critical aspects to consider:
- Kube-Proxy Replacement: We enabled Cilium’s kube-proxy replacement, which fully replaces kube-proxy with eBPF-based service load balancing. This offers significant performance benefits and reduces complexity. However, ensure compatibility with any existing load balancers or ingress controllers. Test thoroughly.
- IPAM Mode: We used ipam.mode=kubernetes. Other options include clusterPool (Cilium manages a dedicated CIDR for pods) or integration with cloud provider IPAM (e.g., AWS ENI, Azure IPAM). Choose the mode that best fits your environment and scaling needs. For large clusters, native cloud provider IPAM or clusterPool can be more efficient.
- Resource Requirements: Cilium agents on each node consume CPU and memory. Monitor these resources closely, especially in high-traffic environments. Adjust resource limits and requests for the Cilium DaemonSet as needed.
- Observability (Hubble): Hubble is invaluable for debugging and monitoring. In production, consider deploying Hubble Relay with persistent storage for flow data and integrating it with external tools like Prometheus and Grafana for long-term metrics and dashboards.
- Network Policy Strategy: Plan your network policies carefully. Start with a default deny policy and explicitly allow necessary traffic. Leverage Cilium’s L7 policies for granular control. Remember that CiliumNetworkPolicy objects are more powerful and flexible than standard Kubernetes NetworkPolicy objects.
- Encryption: We enabled WireGuard encryption. While excellent for pod-to-pod encryption, evaluate its performance impact on high-throughput links. For specific use cases, consider other encryption methods or use cases where a service mesh like Istio (see our Istio Ambient Mesh Guide) might handle encryption at a different layer.
- Upgrades: Follow the official Cilium upgrade guide diligently. Plan for downtime (if any) and test upgrades in a staging environment first.
- Cluster Sizing and Autoscaling: Cilium works well with autoscaling solutions like Karpenter or Cluster Autoscaler. Ensure your node autoscaling strategy accounts for Cilium’s resource requirements and proper CNI initialization on new nodes.
- External Connectivity: If you have specific requirements for external access, egress filtering, or integrating with on-premises networks, explore Cilium’s Egress Gateway and Cluster Mesh features.
- Kernel Compatibility: Cilium relies heavily on eBPF, which requires a relatively modern Linux kernel (5.4+ is generally recommended for optimal performance and features). Always check the Cilium system requirements for the version you are deploying.
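Since kernel support is easy to overlook, a small helper like the one below can compare a node’s kernel version against a minimum. This is a sketch: the 5.4 threshold follows the recommendation above, but always confirm the exact requirements against the official Cilium documentation for your release.

```shell
#!/bin/sh
# Compare a kernel version string against a minimum using sort -V.
# Usage: kernel_at_least <version> <minimum>
kernel_at_least() {
  # Strip any distro suffix (e.g. "5.15.0-91-generic" -> "5.15.0")
  ver=$(echo "$1" | sed 's/[^0-9.].*//')
  min=$2
  # sort -V orders versions numerically; if the minimum sorts first
  # (or is equal), the running version passes the check
  [ "$(printf '%s\n%s\n' "$min" "$ver" | sort -V | head -n1)" = "$min" ]
}

if kernel_at_least "$(uname -r)" "5.4"; then
  echo "kernel meets the recommended baseline for Cilium"
else
  echo "kernel older than 5.4 - check Cilium's system requirements"
fi
```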
Troubleshooting
Even with a smooth installation, issues can arise. Here are common problems and their solutions:
- Cilium Pods Stuck in Pending or ContainerCreating:
  Issue: Cilium DaemonSet pods are not starting.
  Solution:
  - Check Events: Use kubectl describe pod <cilium-pod-name> -n kube-system to check for events like insufficient CPU/memory, image pull errors, or volume mounting issues.
  - Node Taints/Tolerations: Ensure your nodes don’t have taints that prevent Cilium pods from scheduling, or that Cilium has appropriate tolerations.
  - Resource Limits: Increase resource requests/limits for Cilium pods if they are being evicted due to resource constraints.
- Pods Cannot Communicate (Connection timed out or No route to host):
  Issue: After Cilium installation, pods cannot reach each other or external services.
  Solution:
  - Cilium Status: Run cilium status --verbose to check for any errors in the Cilium agent’s configuration or health.
  - Cilium Logs: Inspect Cilium agent logs: kubectl logs -f -n kube-system <cilium-pod-name>. Look for errors related to eBPF program loading, IPAM, or connectivity.
  - Network Policies: If you have network policies applied, ensure they are not inadvertently blocking legitimate traffic. Use Hubble UI or cilium policy get to inspect active policies.
  - kube-proxy: If the kube-proxy replacement was not enabled, ensure kube-proxy is functioning correctly or consider enabling the full replacement. If kube-proxy is still running alongside Cilium, ensure their configurations don’t conflict.
  - IPAM Conflicts: If you’re using a custom IPAM mode or have multiple CNIs, check for IP address conflicts.
- Hubble UI Not Showing Flows or Accessible:
  Issue: You can’t access Hubble UI, or it’s empty.
  Solution:
  - Port-forwarding: Ensure the kubectl port-forward or cilium hubble port-forward command is running correctly and not conflicting with other local ports.
  - Hubble Pods: Check the status of the hubble-relay and hubble-ui pods in the kube-system namespace. Restart them if necessary.
  - Cilium Configuration: Verify that hubble.enabled=true and hubble.ui.enabled=true were set during installation.
  - Cilium Logs: Check logs of the hubble-relay and hubble-ui pods for errors.
- External Access Issues (Pods cannot reach external IPs):
  Issue: Pods can communicate internally but fail to reach external websites or services.
  Solution:
  - NAT/Masquerading: Ensure that NAT/masquerading is correctly configured on your nodes. Cilium handles this by default, but verify if custom configurations are interfering.
  - Firewall Rules: Check if any host-level firewalls (e.g., firewalld, ufw) are blocking outgoing traffic from your nodes.
  - Egress Gateway: If you’re using Cilium’s Egress Gateway, ensure it’s configured correctly and that traffic is being routed through it as intended.
  - Cloud Provider Network: Verify your cloud provider’s network security groups or network ACLs are not blocking egress traffic.
- cilium status Reports Errors:
  Issue: The cilium status command shows warnings or errors, even if pods seem to be running.
  Solution:
  - Read the Output Carefully: The cilium status command is quite verbose and often points directly to the problem (e.g., “eBPF programs failed to load”, “Kubernetes API connectivity down”).
  - Cilium Logs: Always refer to the Cilium agent logs for more detailed error messages.
  - Kernel Version: Ensure your kernel meets Cilium’s requirements. Older kernels might lack necessary eBPF features.
  - Restart Cilium: In some cases, restarting the Cilium pods (kubectl -n kube-system rollout restart ds/cilium) can resolve transient issues.
FAQ Section
- What is eBPF and why is it important for Cilium?
  eBPF (extended Berkeley Packet Filter) is a revolutionary in-kernel technology that allows users to run custom programs in the Linux kernel without modifying the kernel source code or loading kernel modules. Cilium leverages eBPF to implement its networking, security, and observability features directly in the kernel’s data path. This provides superior performance, enables advanced features like L7 policy enforcement, and offers deep visibility into network traffic, all with minimal overhead compared to traditional CNI approaches.
- Can Cilium replace kube-proxy?
  Yes, Cilium can fully replace kube-proxy using its eBPF-based kube-proxy replacement mode. This offloads service load balancing to eBPF, resulting in significant performance improvements, reduced latency, and enhanced scalability by removing the need for iptables or ipvs rules, which can become bottlenecks in large clusters. It also simplifies the networking stack.
- How does Cilium handle network policies compared to standard Kubernetes NetworkPolicies?
  Cilium’s native CiliumNetworkPolicy (CNP) resource extends the capabilities of standard Kubernetes NetworkPolicies. While standard policies are limited to L3/L4 (IP address, port, protocol), CNPs can enforce L7 policies (e.g., HTTP methods, paths, gRPC services), integrate with identity-based security, and provide richer ingress/egress filtering rules. This allows for much more granular and application-aware security controls. For a deeper dive, read our Network Policies Security Guide.
- What is Hubble and how do I use it?
  Hubble is Cilium’s built-in observability platform. It leverages eBPF to provide deep visibility into network traffic flows, service dependencies, and network policy enforcement in real-time. You can access Hubble through the hubble CLI or its graphical user interface (Hubble UI). It allows you to visualize connections, filter flows, inspect metadata, and troubleshoot networking issues efficiently. We covered accessing Hubble UI in Step 5 of this guide. For custom metrics, check out our guide on eBPF Observability with Hubble.
- Is Cilium compatible with all Kubernetes distributions and cloud providers?
  Cilium is designed to be highly portable and works with most Kubernetes distributions (e.g., Kubeadm, EKS, AKS, GKE, OpenShift). However, specific configurations or considerations might apply depending on the underlying operating system, kernel version, and cloud provider’s networking setup. Always consult the official Cilium installation guide for your specific environment to ensure full compatibility and optimal performance.
Cleanup Commands
To remove all resources created during this tutorial, execute the following commands:
# Delete the sample applications
kubectl delete deploy http-server http-client
kubectl delete svc http-server
kubectl delete pod test-client
# Delete the CiliumNetworkPolicy
kubectl delete CiliumNetworkPolicy allow-http-client-to-server
# Uninstall Cilium using Helm
helm uninstall cilium --namespace kube-system
# Remove the Cilium Helm repository (optional)
helm repo remove cilium
# Remove the Cilium CLI (optional)
sudo rm /usr/local/bin/cilium
# Kill any background port-forward processes (if running)
# Warning: killall is blunt - it terminates ALL matching processes, not just the port-forwards
killall kubectl
killall cilium # If you used 'cilium hubble port-forward' or 'cilium hubble ui'
Next Steps / Further Reading
Congratulations on mastering the basics of Cilium! Here are some resources to continue your journey:
- Official Cilium Documentation: The Cilium documentation is comprehensive and covers all advanced features.
- Cilium Network Policy Examples: Explore more complex Cilium Network Policy examples, including L7 policies for various protocols.
- Cilium Cluster Mesh: Learn how to connect multiple Kubernetes clusters with Cilium’s Cluster Mesh.
- Kubezilla Guides:
- Cilium WireGuard Encryption for Pod-to-Pod Traffic: Dive deeper into securing your network.
- eBPF Observability: Building Custom Metrics with Hubble: Extend your observability capabilities.