
Cilium CNI: Install & Configure on Kubernetes

Introduction

In the dynamic world of Kubernetes, the Container Network Interface (CNI) is the backbone of pod-to-pod communication, network policy enforcement, and overall cluster networking. While traditional CNIs like Calico and Flannel have served the community well, a new generation of CNI solutions has emerged, leveraging advanced kernel technologies to deliver unparalleled performance, security, and observability. Among these, Cilium stands out as a powerful, eBPF-based CNI that redefines what’s possible in Kubernetes networking.

Cilium harnesses the power of eBPF (extended Berkeley Packet Filter), a revolutionary in-kernel technology that allows for dynamic, programmable packet processing without modifying the kernel source code or loading kernel modules. This enables Cilium to provide high-performance networking, advanced security features like API-aware network policies, and deep observability into network traffic, all with minimal overhead. This guide will walk you through the complete installation and configuration of Cilium, from setting up your cluster to exploring its advanced features, ensuring you can leverage its full potential for your Kubernetes workloads.

🚀 TL;DR: Cilium CNI Quickstart 🚀

Cilium brings eBPF-powered networking, security, and observability to Kubernetes. Here’s how to get it running fast:

  • Requirements: a recent Kubernetes release (check the Cilium compatibility matrix for the version you install), Helm 3.x, kubectl.
  • Installation with Helm:
  • helm repo add cilium https://helm.cilium.io/
    helm repo update
    helm install cilium cilium/cilium --version 1.15.4 \
      --namespace kube-system \
      --set ipam.mode=kubernetes \
      --set egressGateway.enabled=true \
      --set hubble.enabled=true \
      --set hubble.ui.enabled=true \
      --set l7Proxy=true \
      --set encryption.enabled=true \
      --set encryption.type=wireguard
  • Verify Installation:
  • kubectl -n kube-system get pods -l k8s-app=cilium
    kubectl -n kube-system get pods -l k8s-app=hubble-ui
    cilium status --wait
  • Access Hubble UI:
  • cilium hubble ui & # Port-forwards the Hubble UI and opens it in your browser
    kubectl port-forward -n kube-system svc/hubble-ui 8080:80 & # Or forward the Service directly
    echo "Access Hubble UI at http://localhost:8080"
  • Key Features: eBPF, Identity-based security, L7 policies, Encryption (WireGuard), Hubble observability.

Prerequisites

Before we dive into the installation, ensure you have the following:

  • A running Kubernetes cluster. Each Cilium release supports a specific range of Kubernetes versions; check the compatibility matrix in the official Cilium docs for the release you plan to install. For this guide, we’ll assume a fresh cluster.
  • kubectl installed and configured to communicate with your cluster. Refer to the official Kubernetes documentation for installation instructions.
  • helm installed (version 3.x is required). You can find installation instructions on the Helm website.
  • Sufficient privileges to install cluster-wide resources (ClusterRole, ClusterRoleBinding, DaemonSet, etc.).
  • Basic understanding of Kubernetes networking concepts and YAML manifests.
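Since Cilium’s eBPF datapath depends on kernel features, it can save time to check node kernels up front. Here is a minimal sketch; the 5.4 floor is a general recommendation, and the exact requirement depends on the Cilium version and the features you enable, so treat the official system-requirements page as authoritative:

```shell
#!/bin/sh
# kernel_ok: compare a kernel version string against a minimum using sort -V.
# Illustrative helper; consult Cilium's system-requirements page for the
# authoritative minimum for your release and feature set.
kernel_ok() {
  min="5.4"
  have="$1"
  # sort -V sorts version strings; if the minimum sorts first, we're at or above it.
  lowest=$(printf '%s\n%s\n' "$min" "$have" | sort -V | head -n1)
  if [ "$lowest" = "$min" ]; then echo "ok"; else echo "too old"; fi
}

# On a real node, check the running kernel (stripping any distro suffix):
if [ -x "$(command -v uname)" ]; then
  echo "local kernel: $(uname -r) -> $(kernel_ok "$(uname -r | cut -d- -f1)")"
fi
```

Run this on each node (or as a DaemonSet) before committing to features like WireGuard encryption, which have their own kernel requirements.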

Step-by-Step Guide: Cilium Installation and Configuration

Step 1: Prepare Your Kubernetes Cluster

Before installing Cilium, it’s crucial to ensure your cluster is in a clean state, especially if you’re replacing an existing CNI. If you have another CNI like Calico or Flannel installed, you must remove it completely. Leaving multiple CNIs active will lead to networking conflicts and instability.

For most installations, this involves deleting the DaemonSet, ClusterRole, and associated resources of the previous CNI. If you’re on a cloud provider, some of them might install their own CNI by default. For instance, on GKE, you might need to provision a cluster without network policy enforcement enabled by default if you plan to use Cilium for that. Always consult your cloud provider’s documentation for CNI removal procedures.

Additionally, decide how to handle the kube-proxy component. Cilium can operate alongside kube-proxy, but it can also replace it entirely with eBPF-native service load balancing for better performance. For this guide, we’ll let Cilium take over kube-proxy’s responsibilities, which means kube-proxy should be removed (or never installed): on kubeadm clusters you can skip it at init time with kubeadm init --skip-phases=addon/kube-proxy, or delete the kube-proxy DaemonSet afterwards.

# If you have an existing CNI, remove it.
# Example for a common CNI (adjust as per your actual CNI):
# kubectl delete -f https://raw.githubusercontent.com/projectcalico/calico/v3.25.0/manifests/calico.yaml

# For a fresh cluster, no specific action might be needed if no CNI is pre-installed.
# If you are using a managed Kubernetes service, consult their documentation on disabling default CNIs or kube-proxy.
# For Kubeadm clusters, you might need to re-initialize or configure kube-proxy later.
# For now, let's proceed assuming a clean slate or a cluster where default CNI is removed.

echo "Cluster preparation complete. Proceeding with Cilium installation."

Verify: There’s no direct output to verify here, but a good check is to ensure that pods are not stuck in a ContainerCreating state due to networking issues before Cilium is installed. After removing a CNI, some pods might show networking errors, which is expected until Cilium is deployed.

Step 2: Install Cilium with Helm

Cilium is best installed using Helm, which provides a flexible way to configure its various components. We’ll add the Cilium Helm repository and then install Cilium, enabling several key features like Hubble for observability, WireGuard encryption, and L7 proxy support. We’ll also set the IPAM mode to kubernetes, meaning Cilium allocates pod IPs from the per-node PodCIDR ranges that Kubernetes assigns, a common choice in many setups.

The --set egressGateway.enabled=true flag enables Cilium’s egress gateway, allowing you to route traffic from specific pods through dedicated egress IPs. --set hubble.enabled=true and --set hubble.ui.enabled=true activate Cilium’s powerful observability layer and its user interface. For enhanced security, we’ll enable WireGuard encryption with --set encryption.enabled=true --set encryption.type=wireguard. For more details on this, refer to our guide on Cilium WireGuard Encryption. Lastly, --set l7Proxy=true enables advanced L7 policy enforcement for HTTP/gRPC traffic.

# Add the Cilium Helm repository
helm repo add cilium https://helm.cilium.io/
helm repo update

# Install Cilium with recommended settings
helm install cilium cilium/cilium --version 1.15.4 \
  --namespace kube-system \
  --set ipam.mode=kubernetes \
  --set egressGateway.enabled=true \
  --set hubble.enabled=true \
  --set hubble.ui.enabled=true \
  --set l7Proxy=true \
  --set encryption.enabled=true \
  --set encryption.type=wireguard \
  --set kubeProxyReplacement=strict
# 'strict' replaces kube-proxy with eBPF service load balancing (newer Cilium releases
# use kubeProxyReplacement=true instead). kube-proxy itself must be removed separately;
# once it is, you may also need --set k8sServiceHost=<api-server-IP> and
# --set k8sServicePort=<port> so the agent can reach the API server without kube-proxy.
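If you prefer to keep these settings in version control rather than on the command line, the same configuration can be expressed as a Helm values file (equivalent to the --set flags above):

```yaml
# values.yaml -- mirrors the --set flags used in the install command
ipam:
  mode: kubernetes
egressGateway:
  enabled: true
hubble:
  enabled: true
  ui:
    enabled: true
l7Proxy: true
encryption:
  enabled: true
  type: wireguard
kubeProxyReplacement: strict
```

Then install with: helm install cilium cilium/cilium --version 1.15.4 --namespace kube-system -f values.yaml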

Verify: After running the Helm install command, you should see output indicating that Cilium has been deployed. Check the status of the Cilium pods.

kubectl -n kube-system get pods -l k8s-app=cilium

Expected Output:

NAME            READY   STATUS    RESTARTS   AGE
cilium-xxxxx    1/1     Running   0          2m
cilium-yyyyy    1/1     Running   0          2m
cilium-zzzzz    1/1     Running   0          2m
# ... and potentially hubble-relay and hubble-ui pods once they start

Make sure all Cilium pods are in the Running state and show 1/1 READY. It might take a minute or two for all pods to become ready.

Step 3: Verify Cilium Status and Connectivity

Once the Cilium pods are running, it’s essential to verify that Cilium is correctly configured and that network connectivity is working as expected. Cilium provides a powerful CLI tool, cilium, which can be installed locally and used to interact with the Cilium agents running in your cluster. This tool allows you to check the health of the Cilium agents, inspect network policies, and debug connectivity issues.

We’ll use cilium status to get a comprehensive overview of the Cilium deployment. This command checks various components, including agent health, eBPF programs, IPAM, and connectivity to the Kubernetes API server.

# Install the Cilium CLI (if you haven't already)
# This fetches the latest stable CLI version published by the Cilium project.
CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
CLI_ARCH=amd64
if [ "$(uname -m)" = "aarch64" ]; then CLI_ARCH=arm64; fi
curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
sha256sum --check cilium-linux-${CLI_ARCH}.tar.gz.sha256sum
sudo tar xzvfC cilium-linux-${CLI_ARCH}.tar.gz /usr/local/bin
rm cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}

# Check Cilium status
cilium status --wait

Expected Output:

    /¯¯\
 /¯¯\__/¯¯\    Cilium:             OK
 \__/¯¯\__/    Operator:           OK
 /¯¯\__/¯¯\    Hubble:             OK
 \__/¯¯\__/    ClusterMesh:        disabled
    \__/

DaemonSet              cilium             Desired: 3, Ready: 3/3, Available: 3/3
Deployment             cilium-operator    Desired: 1, Ready: 1/1, Available: 1/1
Deployment             hubble-relay       Desired: 1, Ready: 1/1, Available: 1/1
Deployment             hubble-ui          Desired: 1, Ready: 1/1, Available: 1/1

Containers:            cilium             Running: 3
                       cilium-operator    Running: 1
                       hubble-relay       Running: 1
                       hubble-ui          Running: 1
Cluster Pods:          10/10 managed     (unmanaged: 0)
Image versions         cilium             quay.io/cilium/cilium:v1.15.4: 3
                       cilium-operator    quay.io/cilium/operator:v1.15.4: 1
                       hubble-relay       quay.io/cilium/hubble-relay:v1.15.4: 1
                       hubble-ui          quay.io/cilium/hubble-ui:v0.12.0: 1

The output should show Cilium: OK, Operator: OK, and Hubble: OK. All DaemonSets and Deployments should be in a ready and available state. This confirms that Cilium and its components are healthy and operational.
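Beyond cilium status, the Cilium CLI ships an end-to-end test suite, cilium connectivity test, which deploys probe workloads and exercises pod-to-pod, pod-to-service, and policy paths. The sketch below wraps the cluster commands in a dry-run guard of our own (RUN_LIVE and the maybe helper are not part of the Cilium CLI) so it prints what it would do unless you opt in:

```shell
#!/bin/sh
# maybe: run cluster commands only when explicitly enabled via RUN_LIVE=1;
# otherwise print what would execute. (Our own convention, for illustration.)
maybe() {
  if [ "${RUN_LIVE:-0}" = "1" ]; then
    "$@"
  else
    echo "dry-run: $*"
  fi
}

# Full end-to-end suite: deploys probes into the cilium-test namespace and
# checks pod-to-pod, pod-to-service, and policy enforcement paths.
# Takes several minutes on a real cluster.
maybe cilium connectivity test

# The suite leaves its workloads behind; clean up the namespace afterwards.
maybe kubectl delete namespace cilium-test --ignore-not-found
```

On a real cluster, run it with RUN_LIVE=1 (or just invoke the commands directly).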

Step 4: Deploy a Sample Application and Test Connectivity

To confirm that Cilium is correctly routing traffic and enforcing policies (even if none are defined yet), let’s deploy a simple application and test connectivity between pods. We’ll deploy two deployments: http-client and http-server. The http-client will attempt to curl the http-server.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: http-server
spec:
  selector:
    matchLabels:
      app: http-server
  replicas: 1
  template:
    metadata:
      labels:
        app: http-server
    spec:
      containers:
      - name: http-server
        image: docker.io/cilium/json-mock:1.2
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: http-server
spec:
  selector:
    app: http-server
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: http-client
spec:
  selector:
    matchLabels:
      app: http-client
  replicas: 1
  template:
    metadata:
      labels:
        app: http-client
    spec:
      containers:
      - name: http-client
        image: curlimages/curl:7.83.1
        command: ["/bin/sh", "-c", "sleep infinity"]

Apply these manifests (save them locally as sample-app.yaml, or use the hosted copy):

kubectl apply -f https://raw.githubusercontent.com/kubezilla-io/tutorials/main/cilium-cni/sample-app.yaml

Verify: Once the pods are running, get the name of the http-client pod and execute a curl command to the http-server service.

kubectl get pods -l app=http-server
kubectl get pods -l app=http-client

CLIENT_POD=$(kubectl get pods -l app=http-client -o jsonpath='{.items[0].metadata.name}')
echo "Executing curl from client pod: $CLIENT_POD"
kubectl exec -it $CLIENT_POD -- curl -s http://http-server

Expected Output:

{"message": "Hello, world!"}

This output confirms that the http-client pod can successfully communicate with the http-server service, indicating that Cilium is correctly handling pod-to-pod networking.

Step 5: Explore Hubble for Observability

Hubble is Cilium’s built-in observability platform, providing deep insights into network traffic, service dependencies, and policy enforcement. Since we enabled Hubble and Hubble UI during installation, we can now access its graphical interface to visualize network flows.

# Port-forward Hubble UI to your local machine
# Run in background
kubectl -n kube-system port-forward svc/hubble-ui 8080:80 &

echo "Hubble UI is available at http://localhost:8080"
echo "You can also use the Cilium CLI, which port-forwards and opens the UI for you:"
cilium hubble ui & # Sets up port-forwarding for the Hubble UI and launches a browser

Open your web browser and navigate to http://localhost:8080. You should see the Hubble UI, displaying live network flows between your pods, including the traffic between http-client and http-server. You can filter flows, inspect connection details, and visualize network policies in action. For more advanced usage and custom metrics, check out our guide on eBPF Observability with Hubble.
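The Hubble CLI offers the same flow data at the terminal, queried from a port-forwarded Hubble Relay (e.g. after cilium hubble port-forward &). The queries below are wrapped in a dry-run helper of our own (hq and RUN_LIVE are not Hubble commands) so the snippet is inert unless you opt in:

```shell
#!/bin/sh
# hq: print the hubble observe command that would run, or execute it when
# RUN_LIVE=1 is set. (Our own wrapper, for illustration only.)
hq() {
  if [ "${RUN_LIVE:-0}" = "1" ]; then
    hubble observe "$@"
  else
    echo "dry-run: hubble observe $*"
  fi
}

hq --last 20                       # most recent flows, cluster-wide
hq --namespace default --last 20   # flows in one namespace
hq --verdict DROPPED --last 20     # recently dropped (policy-denied) flows
hq --protocol http --last 20       # L7 HTTP flows (requires l7Proxy)
```

The --verdict DROPPED filter is particularly handy in the next step, where we expect a network policy to start dropping traffic.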

Step 6: Implement a Cilium Network Policy

One of Cilium’s most powerful features is its ability to enforce identity-based network policies, including L7 (application layer) policies. Let’s create a simple L3/L4 network policy to restrict access to the http-server, allowing traffic only from pods with the label app: http-client.

apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: allow-http-client-to-server
spec:
  endpointSelector:
    matchLabels:
      app: http-server
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: http-client
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP

Apply this policy:

kubectl apply -f https://raw.githubusercontent.com/kubezilla-io/tutorials/main/cilium-cni/cnp-allow-client.yaml

Verify: After applying the policy, the http-client should still be able to access the http-server. Let’s confirm this.

CLIENT_POD=$(kubectl get pods -l app=http-client -o jsonpath='{.items[0].metadata.name}')
echo "Executing curl from client pod with policy enabled: $CLIENT_POD"
kubectl exec -it $CLIENT_POD -- curl -s http://http-server

Expected Output:

{"message": "Hello, world!"}

Now, let’s try to simulate traffic from a pod that doesn’t have the app: http-client label. We’ll create a temporary pod with a different label.

apiVersion: v1
kind: Pod
metadata:
  name: test-client
  labels:
    app: unauthorized-client # This label is not allowed by the policy
spec:
  containers:
  - name: test-client
    image: curlimages/curl:7.83.1
    command: ["/bin/sh", "-c", "sleep infinity"]

Apply the manifest and test from the unauthorized pod:

kubectl apply -f https://raw.githubusercontent.com/kubezilla-io/tutorials/main/cilium-cni/cnp-test-client.yaml
kubectl wait --for=condition=ready pod/test-client --timeout=30s

echo "Executing curl from unauthorized client pod:"
kubectl exec -it test-client -- curl -s --connect-timeout 5 http://http-server

Expected Output:

curl: (28) Connection timed out after 5001 milliseconds

This “Connection timed out” message confirms that the CiliumNetworkPolicy is effectively blocking traffic from the unauthorized client. This demonstrates Cilium’s powerful identity-aware policy enforcement. For a deeper dive into Kubernetes network policies, including those enforced by Cilium, refer to our Network Policies Security Guide.
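Because we installed Cilium with l7Proxy enabled, the same policy can be tightened to the application layer. As an illustrative sketch (the GET-on-/ rule is our example, not something deployed earlier in this guide), an L7 variant that only permits GET requests from the client would look like:

```yaml
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: allow-http-client-get-only
spec:
  endpointSelector:
    matchLabels:
      app: http-server
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: http-client
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP
      rules:
        http:
        - method: "GET"
          path: "/"
```

With this applied, a GET / from http-client still succeeds, while other methods or paths are rejected by the L7 proxy with an access-denied response rather than a timeout, since the connection itself is allowed at L3/L4.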

Production Considerations

Deploying Cilium in production requires careful planning beyond a basic installation. Here are critical aspects to consider:

  1. Kube-Proxy Replacement: We used kubeProxyReplacement=strict, which fully replaces kube-proxy with eBPF-based service load balancing (newer Cilium releases express this as kubeProxyReplacement=true). This offers significant performance benefits and reduces complexity. However, ensure compatibility with any existing load balancers or ingress controllers. Test thoroughly.
  2. IPAM Mode: We used ipam.mode=kubernetes. Other options include clusterPool (Cilium manages a dedicated CIDR for pods) or integration with cloud provider IPAM (e.g., AWS ENI, Azure IPAM). Choose the mode that best fits your environment and scaling needs. For large clusters, native cloud provider IPAM or clusterPool can be more efficient.
  3. Resource Requirements: Cilium agents on each node consume CPU and memory. Monitor these resources closely, especially in high-traffic environments. Adjust resource limits and requests for the Cilium DaemonSet as needed.
  4. Observability (Hubble): Hubble is invaluable for debugging and monitoring. In production, consider deploying Hubble Relay with persistent storage for flow data and integrating it with external tools like Prometheus and Grafana for long-term metrics and dashboards.
  5. Network Policy Strategy: Plan your network policies carefully. Start with a default deny policy and explicitly allow necessary traffic. Leverage Cilium’s L7 policies for granular control. Remember that CiliumNetworkPolicy objects are more powerful and flexible than standard Kubernetes NetworkPolicy objects.
  6. Encryption: We enabled WireGuard encryption. While excellent for pod-to-pod encryption, evaluate its performance impact on high-throughput links. For specific use cases, consider other encryption methods or use cases where a service mesh like Istio (see our Istio Ambient Mesh Guide) might handle encryption at a different layer.
  7. Upgrades: Follow the official Cilium upgrade guide diligently. Plan for downtime (if any) and test upgrades in a staging environment first.
  8. Cluster Sizing and Autoscaling: Cilium works well with autoscaling solutions like Karpenter or Cluster Autoscaler. Ensure your node autoscaling strategy accounts for Cilium’s resource requirements and proper CNI initialization on new nodes.
  9. External Connectivity: If you have specific requirements for external access, egress filtering, or integrating with on-premises networks, explore Cilium’s Egress Gateway and Cluster Mesh features.
  10. Kernel Compatibility: Cilium relies heavily on eBPF, which requires a relatively modern Linux kernel (5.4+ is generally recommended for optimal performance and features). Always check the Cilium system requirements for the version you are deploying.
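For the network policy strategy in item 5, a common starting point is a namespace-wide default deny using a standard Kubernetes NetworkPolicy, which Cilium enforces alongside its own CRDs; explicit allow policies like the one in Step 6 then carve out permitted traffic:

```yaml
# Deny all ingress to every pod in the namespace; allow rules are layered on top.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: default
spec:
  podSelector: {}      # selects all pods in the namespace
  policyTypes:
  - Ingress            # Ingress listed with no rules => all ingress denied
```

Roll this out namespace by namespace, watching Hubble for unexpected drops, rather than cluster-wide in one step.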

Troubleshooting

Even with a smooth installation, issues can arise. Here are common problems and their solutions:

  1. Cilium Pods Stuck in Pending or ContainerCreating:

    Issue: Cilium DaemonSet pods are not starting.

    Solution:

    • Check Events: Use kubectl describe pod <cilium-pod-name> -n kube-system to check for events like insufficient CPU/memory, image pull errors, or volume mounting issues.
    • Node Taints/Tolerations: Ensure your nodes don’t have taints that prevent Cilium pods from scheduling, or that Cilium has appropriate tolerations.
    • Resource Limits: Increase resource requests/limits for Cilium pods if they are being evicted due to resource constraints.
  2. Pods Cannot Communicate (Connection timed out or No route to host):

    Issue: After Cilium installation, pods cannot reach each other or external services.

    Solution:

    • Cilium Status: Run cilium status --verbose to check for any errors in the Cilium agent’s configuration or health.
    • Cilium Logs: Inspect Cilium agent logs: kubectl logs -f -n kube-system <cilium-pod-name>. Look for errors related to eBPF program loading, IPAM, or connectivity.
    • Network Policies: If you have network policies applied, ensure they are not inadvertently blocking legitimate traffic. Use Hubble UI or cilium policy get to inspect active policies.
    • kube-proxy: If kubeProxyReplacement=strict was not set, ensure kube-proxy is functioning correctly or consider enabling the full replacement. If kube-proxy is still running alongside Cilium, ensure their configurations don’t conflict.
    • IPAM Conflicts: If you’re using a custom IPAM mode or have multiple CNIs, check for IP address conflicts.
  3. Hubble UI Not Showing Flows or Accessible:

    Issue: You can’t access Hubble UI, or it’s empty.

    Solution:

    • Port-forwarding: Ensure the kubectl port-forward or cilium hubble port-forward command is running correctly and not conflicting with other local ports.
    • Hubble Pods: Check the status of hubble-relay and hubble-ui pods in kube-system namespace. Restart them if necessary.
    • Cilium Configuration: Verify that hubble.enabled=true and hubble.ui.enabled=true were set during installation.
    • Cilium Logs: Check logs of hubble-relay and hubble-ui pods for errors.
  4. External Access Issues (Pods cannot reach external IPs):

    Issue: Pods can communicate internally but fail to reach external websites or services.

    Solution:

    • NAT/Masquerading: Ensure that NAT/masquerading is correctly configured on your nodes. Cilium handles this by default, but verify if custom configurations are interfering.
    • Firewall Rules: Check if any host-level firewalls (e.g., firewalld, ufw) are blocking outgoing traffic from your nodes.
    • Egress Gateway: If you’re using Cilium’s Egress Gateway, ensure it’s configured correctly and that traffic is being routed through it as intended.
    • Cloud Provider Network: Verify your cloud provider’s network security groups or network ACLs are not blocking egress traffic.
  5. cilium status Reports Errors:

    Issue: The cilium status command shows warnings or errors, even if pods seem to be running.

    Solution:

    • Read the Output Carefully: The cilium status command is quite verbose and often points directly to the problem (e.g., “eBPF programs failed to load”, “Kubernetes API connectivity down”).
    • Cilium Logs: Always refer to the Cilium agent logs for more detailed error messages.
    • Kernel Version: Ensure your kernel meets Cilium’s requirements. Older kernels might lack necessary eBPF features.
    • Restart Cilium: In some cases, restarting the Cilium pods (kubectl -n kube-system rollout restart ds/cilium) can resolve transient issues.

FAQ Section

  1. What is eBPF and why is it important for Cilium?

    eBPF (extended Berkeley Packet Filter) is a revolutionary in-kernel technology that allows users to run custom programs in the Linux kernel without modifying the kernel source code or loading kernel modules. Cilium leverages eBPF to implement its networking, security, and observability features directly in the kernel’s data path. This provides superior performance, enables advanced features like L7 policy enforcement, and offers deep visibility into network traffic, all with minimal overhead compared to traditional CNI approaches.

  2. Can Cilium replace kube-proxy?

    Yes, Cilium can fully replace kube-proxy using its eBPF-based kube-proxy replacement mode (kubeProxyReplacement=strict). This offloads service load balancing to eBPF, resulting in significant performance improvements, reduced latency, and enhanced scalability by removing the need for iptables or ipvs rules, which can become bottlenecks in large clusters. It also simplifies the networking stack.

  3. How does Cilium handle network policies compared to standard Kubernetes NetworkPolicies?

    Cilium’s native CiliumNetworkPolicy (CNP) resource extends the capabilities of standard Kubernetes NetworkPolicies. While standard policies are limited to L3/L4 (IP address, port, protocol), CNPs can enforce L7 policies (e.g., HTTP methods, paths, gRPC services), integrate with identity-based security, and provide richer ingress/egress filtering rules. This allows for much more granular and application-aware security controls. For a deeper dive, read our Network Policies Security Guide.

  4. What is Hubble and how do I use it?

    Hubble is Cilium’s built-in observability platform. It leverages eBPF to provide deep visibility into network traffic flows, service dependencies, and network policy enforcement in real-time. You can access Hubble through its command-line interface (cilium hubble) or its graphical user interface (Hubble UI). It allows you to visualize connections, filter flows, inspect metadata, and troubleshoot networking issues efficiently. We covered accessing Hubble UI in Step 5 of this guide. For custom metrics, check out our guide on eBPF Observability with Hubble.

  5. Is Cilium compatible with all Kubernetes distributions and cloud providers?

    Cilium is designed to be highly portable and works with most Kubernetes distributions (e.g., Kubeadm, EKS, AKS, GKE, OpenShift). However, specific configurations or considerations might apply depending on the underlying operating system, kernel version, and cloud provider’s networking setup. Always consult the official Cilium installation guide for your specific environment to ensure full compatibility and optimal performance.

Cleanup Commands

To remove all resources created during this tutorial, execute the following commands:

# Delete the sample applications
kubectl delete deploy http-server http-client
kubectl delete svc http-server
kubectl delete pod test-client

# Delete the CiliumNetworkPolicy
kubectl delete CiliumNetworkPolicy allow-http-client-to-server

# Uninstall Cilium using Helm
helm uninstall cilium --namespace kube-system

# Remove the Cilium Helm repository (optional)
helm repo remove cilium

# Remove the Cilium CLI (optional)
sudo rm /usr/local/bin/cilium

# Kill any background port-forward processes (if running).
# Note: killall is blunt and terminates ALL matching processes on this machine.
# Prefer saving the PID when backgrounding (PF_PID=$!) and killing that instead.
killall kubectl
killall cilium # If you used 'cilium hubble ui'

Next Steps / Further Reading

Congratulations on mastering the basics of Cilium! To continue your journey, explore the guides referenced throughout this article (Cilium WireGuard Encryption, eBPF Observability with Hubble, and the Network Policies Security Guide) along with the official Cilium documentation.
