Introduction
In the dynamic world of Kubernetes, efficient and robust networking is paramount. As clusters scale and integrate with on-premise infrastructure or other cloud environments, the need for advanced routing solutions becomes critical. Traditional Kubernetes networking often relies on cloud provider integrations or basic overlay networks, which can present limitations when direct, high-performance routing is required. This is especially true when you need to advertise Kubernetes service IPs, Pod CIDRs, or even External IPs directly to your physical network infrastructure.
Enter Border Gateway Protocol (BGP). BGP is the routing protocol of the internet, designed for exchanging routing and reachability information among autonomous systems. When integrated with Kubernetes, BGP allows your cluster to become a first-class citizen in your network, directly advertising its internal IP ranges to your data center routers. This eliminates the need for complex NAT rules, external load balancers for every service, or cumbersome VPN tunnels, leading to simplified network architectures, reduced latency, and enhanced performance. Cilium, a powerful CNI (Container Network Interface) based on eBPF, offers native BGP peering capabilities, transforming your Kubernetes cluster into a truly integrated network component.
This guide will walk you through the process of setting up Cilium with native BGP peering, enabling your Kubernetes cluster to advertise its services and Pod IPs directly to your network infrastructure. We’ll cover everything from installation and configuration to verification and troubleshooting, ensuring you can confidently deploy and manage BGP-enabled Kubernetes clusters. Prepare to unlock a new level of network integration and efficiency for your containerized applications!
TL;DR: Cilium BGP Peering
Cilium’s native BGP support allows Kubernetes to advertise Pod CIDRs and Service IPs directly to your network routers, simplifying network integration and improving performance. Install Cilium with the BGP control plane enabled, describe your peers and advertisements in a CiliumBGPPeeringPolicy custom resource, and watch your routes propagate.
Key Commands:
# Install Cilium with BGP support
helm install cilium cilium/cilium --version 1.15.5 \
  --namespace kube-system \
  --set ipam.mode=kubernetes \
  --set enableIPv4Masquerade=false \
  --set externalIPs.enabled=true \
  --set socketLB.enabled=true \
  --set nodePort.enabled=true \
  --set hostPort.enabled=true \
  --set cni.chainingMode=none \
  --set routingMode=native \
  --set autoDirectNodeRoutes=true \
  --set bgpControlPlane.enabled=true
# Example peering policy: peer with the router, advertise Pod CIDRs
# and the IPs of selected LoadBalancer services
kubectl apply -f - <<EOF
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeeringPolicy
metadata:
  name: my-router-peer
spec:
  nodeSelector:
    matchLabels:
      kubernetes.io/os: linux
  virtualRouters:
  - localASN: 64512
    exportPodCIDR: true
    serviceSelector:
      matchLabels:
        app: my-loadbalancer-service
    neighbors:
    - peerAddress: "192.168.1.1/32" # Your router's IP, in CIDR notation
      peerASN: 65001
EOF
# Check Cilium BGP status (run against one agent pod)
kubectl exec -n kube-system ds/cilium -- cilium-dbg bgp peers
kubectl exec -n kube-system ds/cilium -- cilium-dbg bgp routes advertised ipv4 unicast peer 192.168.1.1
Prerequisites
Before diving into the configuration, ensure you have the following:
- Kubernetes Cluster: A running Kubernetes cluster (v1.20+ recommended). This guide assumes you have kubectl configured and pointing to your cluster.
- Helm: Helm v3+ installed. We’ll use Helm to install Cilium. Refer to the official Helm documentation for installation instructions.
- Network Infrastructure: Access to a router or network device that supports BGP (e.g., Cisco, Juniper, VyOS, or even a Linux machine with FRR). You’ll need to configure this device to peer with your Kubernetes nodes.
- AS Number: An Autonomous System Number (ASN) for your Kubernetes cluster (private ASNs like 64512-65534 are common for internal use) and the ASN of your peer router.
- IP Address Range: A clear understanding of your Kubernetes Pod CIDRs and Service CIDRs.
- Basic Networking Knowledge: Familiarity with TCP/IP, routing concepts, and BGP fundamentals will be beneficial.
- Firewall Rules: Ensure that TCP port 179 (BGP) is open between your Kubernetes nodes and your BGP peer router.
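The ASN guidance above can be checked before you touch any router. The following is a minimal sketch and not part of Cilium: is_private_asn and check_asn_pair are hypothetical helper names, and the range is the 16-bit private range quoted in the list above.

```shell
# is_private_asn ASN
# Succeeds if ASN lies in the 16-bit private range 64512-65534.
is_private_asn() {
  case $1 in
    ''|*[!0-9]*) return 2 ;;   # reject anything that is not a plain integer
  esac
  [ "$1" -ge 64512 ] && [ "$1" -le 65534 ]
}

# check_asn_pair CLUSTER_ASN ROUTER_ASN
# Prints one line per problem found; succeeds only if there is none.
check_asn_pair() {
  rc=0
  if [ "$1" = "$2" ]; then
    echo "cluster and router share ASN $1; this guide assumes distinct ASNs"
    rc=1
  fi
  for asn in "$1" "$2"; do
    if ! is_private_asn "$asn"; then
      echo "ASN $asn is not in the private range 64512-65534"
      rc=1
    fi
  done
  return $rc
}
```

With the values used in this guide, check_asn_pair 64512 65001 prints nothing and succeeds.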
Step-by-Step Guide: Cilium BGP Peering
Step 1: Add Cilium Helm Repository and Install Cilium with BGP Enabled
First, we need to add the Cilium Helm chart repository and then install Cilium. It’s crucial to enable BGP support during the installation. We’ll also configure other settings like disabling masquerading for specific BGP use cases and enabling external IPs, host services, and node ports, which are often used in conjunction with BGP advertisement.
For a deep dive into Cilium’s capabilities beyond BGP, consider exploring features like Cilium WireGuard Encryption for Pod-to-Pod Traffic to secure your inter-pod communication, or eBPF Observability: Building Custom Metrics with Hubble for enhanced visibility into your network traffic.
# Add Cilium Helm repository
helm repo add cilium https://helm.cilium.io/
# Update Helm repositories
helm repo update
# Install Cilium with BGP enabled
# Note: Adjust --version to the latest stable Cilium release if needed.
# For this guide, we're using 1.15.5 as an example.
helm install cilium cilium/cilium --version 1.15.5 \
  --namespace kube-system \
  --create-namespace \
  --set ipam.mode=kubernetes \
  --set enableIPv4Masquerade=false \
  --set externalIPs.enabled=true \
  --set socketLB.enabled=true \
  --set nodePort.enabled=true \
  --set hostPort.enabled=true \
  --set cni.chainingMode=none \
  --set bgpControlPlane.enabled=true \
  --set k8s.requireIPv4PodCIDR=true \
  --set routingMode=native \
  --set autoDirectNodeRoutes=true
Explanation:
- --namespace kube-system --create-namespace: Installs Cilium into the kube-system namespace.
- ipam.mode=kubernetes: Cilium will use Kubernetes to manage IP addresses for Pods, allocating them from the Pod CIDR assigned to each node.
- enableIPv4Masquerade=false: Disables masquerading (source NAT) for IPv4 traffic leaving the node. This is often desired with BGP so that the original Pod IP is preserved, and your router can route directly back to the Pod. Note that bpf.masquerade=false would only switch the masquerading implementation, not turn masquerading off.
- externalIPs.enabled=true, socketLB.enabled=true, nodePort.enabled=true, hostPort.enabled=true: These enable Cilium’s handling of the various Kubernetes service types and features that are often used together with BGP advertisement. socketLB is the current name of the option formerly called hostServices.
- cni.chainingMode=none: We are using Cilium as the sole CNI.
- bgpControlPlane.enabled=true: The most critical flag, enabling the BGP control plane within Cilium. It also makes the CiliumBGPPeeringPolicy custom resource available, which is used to configure peers and advertisements. What gets advertised (Pod CIDRs, LoadBalancer IPs) is set in that resource rather than at install time.
- k8s.requireIPv4PodCIDR=true: Ensures IPv4 Pod CIDRs are assigned before the agent starts serving a node.
- routingMode=native & autoDirectNodeRoutes=true: These settings select direct (native) routing, where BGP is used to advertise routes directly, avoiding encapsulation overhead. routingMode=native replaces the older tunnel=disabled setting, which Cilium 1.15 no longer accepts. autoDirectNodeRoutes requires the nodes to share a layer 2 network.
Verify
Check if Cilium pods are running and healthy. This might take a few minutes.
kubectl get pods -n kube-system -l app.kubernetes.io/part-of=cilium
Expected Output:
NAME READY STATUS RESTARTS AGE
cilium-xxxxx 1/1 Running 0 2m
cilium-operator-xxxx 1/1 Running 0 2m
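Rather than polling kubectl get pods by hand, you can let kubectl block until the rollout has finished. This is a minimal sketch, assuming the default object names created by the Helm chart (DaemonSet cilium and Deployment cilium-operator in kube-system); wait_for_cilium is a hypothetical helper name.

```shell
# wait_for_cilium [TIMEOUT]
# Blocks until the agent DaemonSet and the operator Deployment are rolled out.
# TIMEOUT uses kubectl's duration syntax (for example 90s or 5m); default 5m.
wait_for_cilium() {
  limit=${1:-5m}
  kubectl -n kube-system rollout status daemonset/cilium --timeout="$limit" &&
    kubectl -n kube-system rollout status deployment/cilium-operator --timeout="$limit"
}
```

Calling wait_for_cilium 10m returns non-zero if either rollout does not complete within ten minutes, which makes it easy to use in an install script.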
Step 2: Configure BGP Peer on Your Router
Before configuring Cilium, you need to set up your external router to peer with your Kubernetes nodes. The configuration will vary based on your router’s vendor. Below is an example using FRRouting (FRR) on a Linux machine or a similar network OS.
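Example FRR Configuration (/etc/frr/frr.conf):
!
router bgp 65001
 bgp router-id 192.168.1.1
 ! Do not activate IPv4 unicast implicitly; neighbors are activated
 ! explicitly in the address-family block below
 no bgp default ipv4-unicast
 ! Lab convenience: exchange eBGP routes without requiring route-maps
 no bgp ebgp-requires-policy
 neighbor 192.168.1.10 remote-as 64512
 neighbor 192.168.1.10 timers 5 15
 !
 address-family ipv4 unicast
  redistribute connected
  neighbor 192.168.1.10 activate
  neighbor 192.168.1.10 soft-reconfiguration inbound
 exit-address-family
!
Explanation:
- router bgp 65001: Configures BGP for AS 65001 (your router’s ASN).
- bgp router-id 192.168.1.1: Sets the router ID.
- no bgp default ipv4-unicast: Stops FRR from activating IPv4 unicast for every neighbor automatically, which is why the neighbor is activated explicitly inside the address family.
- no bgp ebgp-requires-policy: Recent FRR releases refuse to accept or send eBGP routes unless a route-map is attached to the neighbor. This line turns that check off for a simple setup; in production, attach explicit inbound and outbound policies instead.
- neighbor 192.168.1.10 remote-as 64512: Defines a BGP neighbor at 192.168.1.10 (one of your Kubernetes node IPs) with a remote AS of 64512 (your Kubernetes cluster’s ASN). You’ll typically need to define one neighbor entry for each Kubernetes node that will peer.
- neighbor 192.168.1.10 timers 5 15: Sets a 5-second keepalive interval and a 15-second hold time for this neighbor.
- neighbor ... activate: Activates the peering for IPv4 unicast.
- redistribute connected: (Optional, for FRR) This advertises directly connected networks from the router itself. For receiving routes from Kubernetes, this isn’t strictly necessary but is useful for general router setup.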
Important: Replace 192.168.1.10 with the actual IP addresses of your Kubernetes nodes. Ensure your router’s firewall allows incoming TCP port 179 from your Kubernetes nodes.
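Because every peering node needs its own neighbor entry, it can help to generate those lines instead of typing them. The following is a minimal sketch: frr_neighbors is a hypothetical helper that only prints configuration text for you to review and paste under router bgp; it does not touch the router.

```shell
# frr_neighbors REMOTE_AS NODE_IP...
# Prints FRR neighbor statements for each Kubernetes node IP.
frr_neighbors() {
  remote_as=$1
  shift
  for ip in "$@"; do
    printf ' neighbor %s remote-as %s\n' "$ip" "$remote_as"
  done
  printf ' address-family ipv4 unicast\n'
  for ip in "$@"; do
    printf '  neighbor %s activate\n' "$ip"
  done
  printf ' exit-address-family\n'
}

# Example: generate entries for the InternalIP of every node
#   frr_neighbors 64512 $(kubectl get nodes \
#     -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}')
```

Generating the block from the node list keeps the router configuration in step with the cluster when nodes are added or replaced.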
Verify
Check the BGP summary on your router. The state should eventually transition to Established.
# On your FRR router
vtysh -c "show ip bgp summary"
Expected Output (example):
BGP router identifier 192.168.1.1, local AS number 65001
BGP table version 1
...
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
192.168.1.10 4 64512 5 5 0 0 0 00:00:05 Established
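If you want to script this check, the summary table can be parsed with awk. The sketch below assumes the column layout shown in the example output, where State/PfxRcd is the tenth column; bgp_neighbor_state and neighbor_is_up are hypothetical helper names. FRR usually shows the number of received prefixes in that column once a session is up, so both a number and the literal Established are treated as "up".

```shell
# bgp_neighbor_state NEIGHBOR_IP
# Reads "show ip bgp summary" text on stdin and prints the State/PfxRcd
# field for the given neighbor. Fails if the neighbor is not listed.
bgp_neighbor_state() {
  awk -v n="$1" '$1 == n { print $10; found = 1 } END { exit !found }'
}

# neighbor_is_up NEIGHBOR_IP
# Succeeds if the neighbor's state is Established or a prefix count.
neighbor_is_up() {
  state=$(bgp_neighbor_state "$1") || return 1
  case $state in
    Established|[0-9]*) return 0 ;;
    *) return 1 ;;
  esac
}
```

A typical call on the router would be: vtysh -c "show ip bgp summary" | neighbor_is_up 192.168.1.10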
Step 3: Configure Cilium BGP Peers using the CiliumBGPPeeringPolicy Custom Resource
Now, we’ll define which external BGP routers your Cilium agents should peer with. In Cilium 1.15 this is done with the CiliumBGPPeeringPolicy custom resource, which becomes available once the BGP control plane is enabled. A policy uses nodeSelector to choose the nodes it applies to, and each virtual router in the policy lists one or more neighbors (peers).
In a production environment, you might have multiple racks or availability zones, each with its own set of routers. nodeSelector allows for flexible and targeted peering configurations. For instance, you could peer specific nodes with a router in their respective rack. Make sure each node is selected by only one policy at a time. For more advanced network segmentation and security, consider integrating with Kubernetes Network Policies: Complete Security Hardening Guide.
# bgppeer-config.yaml
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeeringPolicy
metadata:
  name: my-router-peer
spec:
  # Select the nodes that should peer with this router.
  # If omitted, the policy applies to all nodes.
  nodeSelector:
    matchLabels:
      kubernetes.io/os: linux # Example: Peer with all Linux nodes
  virtualRouters:
  - localASN: 64512 # The AS number for your Kubernetes cluster (Cilium)
    neighbors:
    - peerAddress: "192.168.1.1/32" # The IP address of your BGP router, in CIDR notation
      peerASN: 65001 # The AS number of your BGP router
      # Optional: BGP authentication (TCP MD5), naming a Secret in kube-system
      # authSecretRef: bgp-auth-secret
      # Optional: BGP graceful restart settings
      # gracefulRestart:
      #   enabled: true
      #   restartTimeSeconds: 120
kubectl apply -f bgppeer-config.yaml
Explanation:
- nodeSelector: Selects the nodes that form BGP peering sessions according to this policy. If omitted, all nodes will try to establish a session.
- localASN: The Autonomous System Number that Cilium will use for your Kubernetes cluster. This should be different from peerASN.
- peerAddress: The IP address of your external BGP router, written in CIDR notation (a /32 for a single IPv4 host).
- peerASN: The Autonomous System Number of your external BGP router.
- authSecretRef: For production environments, it’s highly recommended to use BGP MD5 authentication for security. Create a Secret containing the MD5 password under the key password and reference it by name.
Verify
Check the BGP status in Cilium. You should see the peering session in an Established state.
# On one Cilium agent pod (each node runs its own BGP sessions)
kubectl exec -n kube-system ds/cilium -- cilium-dbg bgp peers
# Or, with the Cilium CLI installed on your workstation, for all nodes
cilium bgp peers
Expected Output (example):
PeerAddress PeerASN LocalASN SessionState Uptime RouterID Advertised Received
192.168.1.1 65001 64512 Established 0m0s 192.168.1.1 0 0
The Uptime should be increasing, and SessionState should be Established.
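Sessions can take a few seconds to come up, so a small retry helper is handy in scripts. This is a generic sketch: wait_until is a hypothetical helper name, and the command in the usage comment is an assumption that presumes the agent CLI is available as cilium-dbg, as in Cilium 1.15.

```shell
# wait_until TIMEOUT_SECONDS COMMAND [ARG...]
# Re-runs COMMAND every 2 seconds until it succeeds or the timeout expires.
wait_until() {
  deadline=$(( $(date +%s) + $1 ))
  shift
  until "$@"; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep 2
  done
}

# Usage (assumption: agent CLI is named cilium-dbg):
#   wait_until 120 sh -c \
#     'kubectl exec -n kube-system ds/cilium -- cilium-dbg bgp peers | grep -qi established'
```

The helper only reports whether the condition was met in time; inspect the peer table itself to see why a session is stuck.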
Step 4: Configure BGP Advertisements
Now that peering is established, we need to tell Cilium what to advertise. In Cilium 1.15, advertisements are configured on the virtual router inside a CiliumBGPPeeringPolicy resource. You can advertise:
- Pod CIDRs: The network ranges assigned to Pods on each node, enabled with exportPodCIDR.
- Service IPs: Specifically, LoadBalancer service IPs, chosen with serviceSelector.
- Other service addresses: Depending on your Cilium version, External IPs and Cluster IPs of selected services can be advertised as well; check the documentation for your release.
Nothing is advertised until the policy asks for it. exportPodCIDR announces the Pod CIDR of every selected node, while serviceSelector gives you granular control over which LoadBalancer services are announced.
Let’s create a sample Deployment and a LoadBalancer Service to demonstrate.
# nginx-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
---
# nginx-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-loadbalancer
  labels:
    app: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
  type: LoadBalancer
  # A LoadBalancer address is not taken from the Kubernetes service CIDR.
  # It has to be assigned by a LoadBalancer IPAM mechanism (for example a
  # Cilium LoadBalancer IP pool); until then, EXTERNAL-IP stays <pending>.
  # You can request a specific address from that range here.
  # loadBalancerIP: 192.168.100.100
kubectl apply -f nginx-deployment.yaml
kubectl apply -f nginx-service.yaml
Now, let’s apply a peering policy that advertises each node’s Pod CIDR and the IP of our nginx-loadbalancer service. The policy keeps the name my-router-peer, so applying it updates any existing policy of that name in place rather than creating a second policy for the same nodes.
# bgpadvertisement-lb.yaml
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeeringPolicy
metadata:
  name: my-router-peer
spec:
  nodeSelector:
    matchLabels:
      kubernetes.io/os: linux
  virtualRouters:
  - localASN: 64512
    # Advertise the Pod CIDR of each selected node
    exportPodCIDR: true
    # Select LoadBalancer services to advertise based on labels
    serviceSelector:
      matchLabels:
        app: nginx-service
    neighbors:
    - peerAddress: "192.168.1.1/32"
      peerASN: 65001
kubectl apply -f bgpadvertisement-lb.yaml
Explanation:
- exportPodCIDR: true: Each selected node advertises its own Pod CIDR to the neighbors.
- serviceSelector: This allows you to select specific Kubernetes Services whose IPs should be advertised. This is powerful for controlling which services are exposed via BGP. Each selected LoadBalancer IP is advertised as an exact /32 route, not a summarized CIDR. If serviceSelector is omitted, no service IPs are advertised.
- neighbors: The advertisements of a virtual router go to all of its neighbors. To target specific BGP peers, place them in separate virtual routers or policies.
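A LoadBalancer service can only be advertised once it has an external IP, so it is worth reading that address before you look for the route. A minimal sketch, assuming the address is published in the standard .status.loadBalancer.ingress[0].ip field; service_lb_ip is a hypothetical helper name.

```shell
# service_lb_ip SERVICE [NAMESPACE]
# Prints the external IP of a LoadBalancer service, or fails if none is assigned.
service_lb_ip() {
  ip=$(kubectl get service "$1" -n "${2:-default}" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}') || return 1
  if [ -z "$ip" ]; then
    echo "service $1 has no external IP yet" >&2
    return 1
  fi
  printf '%s\n' "$ip"
}
```

For the example service, service_lb_ip nginx-loadbalancer prints the address to look for in the router's BGP table.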
Verify
Check the BGP routes advertised by Cilium and then verify on your router.
# Check the routes one Cilium agent advertises to the router
# (each agent only reports the routes of its own node)
kubectl exec -n kube-system ds/cilium -- cilium-dbg bgp routes advertised ipv4 unicast peer 192.168.1.1
Expected Output (example):
Prefix NextHop LocalPref MED Communities Age
10.42.0.0/24 192.168.1.10 100 0 - 0m0s (Pod CIDR)
10.42.1.0/24 192.168.1.11 100 0 - 0m0s (Pod CIDR)
10.100.0.100/32 192.168.1.10 100 0 - 0m0s (LoadBalancer IP)
Then, verify on your BGP router:
# On your FRR router
vtysh -c "show ip bgp"
vtysh -c "show ip route"
Expected Output (example from FRR):
# show ip bgp
...
Network Next Hop Metric LocPrf Weight Path
*> 10.42.0.0/24 192.168.1.10 0 100 0 64512 i
*> 10.42.1.0/24 192.168.1.11 0 100 0 64512 i
*> 10.100.0.100/32 192.168.1.10 0 100 0 64512 i
# show ip route
...
B> 10.42.0.0/24 [20/0] via 192.168.1.10, eth0, 00:00:10
B> 10.42.1.0/24 [20/0] via 192.168.1.11, eth0, 00:00:10
B> 10.100.0.100/32 [20/0] via 192.168.1.10, eth0, 00:00:10
You should see routes to your Pod CIDRs and LoadBalancer service IP, with the next hop being the respective Kubernetes node’s IP address. This confirms your router has learned the routes from Cilium.
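To confirm this without reading the table by eye, you can compare the router's routing table against the prefixes you expect. The sketch below assumes the FRR show ip route layout from the example above, where BGP routes start with a B code followed by the prefix; missing_routes is a hypothetical helper name.

```shell
# missing_routes PREFIX...
# Reads "show ip route" text on stdin and prints every expected prefix
# that is not present as a BGP route. Succeeds only if none is missing.
missing_routes() {
  table=$(cat)
  rc=0
  for prefix in "$@"; do
    if ! printf '%s\n' "$table" |
      awk -v p="$prefix" '$1 ~ /^B/ && $2 == p { ok = 1 } END { exit !ok }'; then
      echo "$prefix"
      rc=1
    fi
  done
  return $rc
}
```

On the router, a call such as vtysh -c "show ip route" | missing_routes 10.42.0.0/24 10.42.1.0/24 10.100.0.100/32 prints nothing when all three routes were learned via BGP.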
Step 5: Test Connectivity
With routes advertised, your external network should now be able to directly reach your Kubernetes Pods and services. From a machine on your external network (that uses your BGP router as its gateway), try to ping or curl the LoadBalancer IP or a Pod IP (if your router is configured to route to Pods directly).
# From an external machine on your network
ping 10.100.0.100 # Your LoadBalancer IP
curl http://10.100.0.100 # Your LoadBalancer IP
Expected Output:
# ping
PING 10.100.0.100 (10.100.0.100): 56 data bytes
64 bytes from 10.100.0.100: icmp_seq=0 ttl=62 time=0.5 ms
64 bytes from 10.100.0.100: icmp_seq=1 ttl=62 time=0.4 ms
...
# curl
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
...
Success! Your external network can now directly access your Kubernetes services, bypassing traditional NAT or complex ingress solutions. This direct routing capability is a game-changer for hybrid cloud environments and bare-metal Kubernetes deployments.
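For repeated checks, for example after changing router policy, the curl test can be wrapped so that it returns a clear pass or fail. A minimal sketch; check_http is a hypothetical helper name and the 5-second default timeout is an arbitrary choice.

```shell
# check_http IP [TIMEOUT_SECONDS]
# Succeeds if an HTTP GET to the address returns a non-error status in time.
check_http() {
  curl --fail --silent --show-error --max-time "${2:-5}" \
    --output /dev/null "http://$1/"
}
```

Run it from the external machine, for instance check_http 10.100.0.100 && echo reachable.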
Production Considerations
Deploying Cilium BGP in production requires careful planning and adherence to best practices:
- High Availability: Ensure your BGP router infrastructure is highly available. If you have multiple routers, configure Cilium to peer with all of them by listing each router as a neighbor in your peering policy.
- AS Number Planning: Carefully choose your ASNs. Use private ASNs (64512-65534) for internal networks. Ensure your cluster’s ASN is distinct from your peer routers.
- BGP Authentication: Always enable BGP MD5 authentication (authSecretRef on the neighbor in a CiliumBGPPeeringPolicy) for security. This prevents unauthorized peers from establishing sessions.
- Route Aggregation: Cilium advertises one route per node Pod CIDR and one /32 route per service IP. For large numbers of Pod CIDRs, consider whether route summarization on your router (e.g., advertising a larger subnet that encompasses multiple Pod CIDRs to the rest of the network) is appropriate to reduce routing table size.
- Firewall Rules: Ensure all necessary firewall rules are in place to allow TCP port 179 between Kubernetes nodes and BGP routers. Also, consider any security groups or network ACLs if running in a cloud environment. For comprehensive network security in Kubernetes, revisit our Kubernetes Network Policies: Complete Security Hardening Guide.
- IP Address Management (IPAM): Plan your IP address ranges carefully. Ensure your Pod CIDRs and Service CIDRs do not overlap with your existing network infrastructure and are routable.
- Monitoring and Alerting: Implement robust monitoring for BGP session status (using the bgp peers output) and route advertisements. Integrate with your existing observability stack. For eBPF-specific observability, see eBPF Observability: Building Custom Metrics with Hubble.
- Graceful Restart: Configure BGP graceful restart (gracefulRestart on the neighbor) to minimize service disruption during Cilium agent or BGP daemon restarts.
- Path Selection: Understand BGP path attributes (Local_Pref, MED, AS_Path) if you have multiple BGP peers and need to influence traffic engineering. Cilium can attach attributes such as local preference and communities to the routes it advertises; check which attributes your version supports.
- Resource Limits: Ensure your Cilium agents and operator have adequate CPU and memory resources, especially in large clusters with many BGP routes.
- Cloud Provider Integration: If running in a cloud, ensure Cilium’s BGP mode is compatible with your cloud provider’s network. Some cloud providers have their own BGP services (e.g., AWS Direct Connect, GCP Cloud Router) that can peer with your cluster.
- Load Balancing: For LoadBalancer services, BGP provides ECMP (Equal-Cost Multi-Path) routing when several nodes advertise the same service IP, distributing traffic across those nodes. Ensure your network devices support ECMP for optimal load distribution.
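As a starting point for the monitoring item above, a check can flag every peer row that is not established. This is a sketch under stated assumptions: it expects a table with one header line and the word "established" (in any case) in each healthy row, as in the example peer output in Step 3; peers_down is a hypothetical helper name.

```shell
# peers_down
# Reads a BGP peer table on stdin (first line is the header) and prints
# every row whose session is not established. Fails if any such row exists.
peers_down() {
  awk 'NR > 1 && NF > 0 && tolower($0) !~ /established/ { print; bad = 1 }
       END { exit bad + 0 }'
}
```

A monitoring job could pipe the peer table from Step 3 into peers_down and raise an alert whenever it exits non-zero.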
Troubleshooting
BGP setup can be finicky. Here are common issues and their solutions:
- BGP Session Not Establishing (Idle or Active state)
  Issue: The BGP session on Cilium or your router remains in an Idle or Active state instead of Established.
  Solution:
  - Firewall: Check firewalls on both the Kubernetes nodes and the BGP router. Ensure TCP port 179 is open bi-directionally.
  - IP Address Mismatch: Verify that peerAddress in your Cilium peer configuration matches the router’s actual IP, and the router’s neighbor configuration matches the Kubernetes node IP.
  - ASN Mismatch: Double-check that localASN in your Cilium peer configuration matches the remote-as on the router, and peerASN matches the router’s actual ASN.
  - Network Connectivity: Ping the router IP from a Kubernetes node and vice-versa to ensure basic IP connectivity.
  - Router Logs: Check BGP logs on your router for errors (e.g., “invalid capabilities,” “authentication failure”).
  - Cilium Logs: Check Cilium agent logs for BGP-related errors (this follows one agent pod; pass a pod name instead of ds/cilium to inspect a specific node):
    kubectl logs -n kube-system ds/cilium -f | grep -i bgp
- Routes Not Advertised/Received
  Issue: BGP session is established, but no routes are appearing on the router, or desired routes (e.g., LoadBalancer IPs) aren’t being advertised by Cilium.
  Solution:
  - Cilium BGP Routes: List the routes Cilium is actively advertising with the agent’s bgp routes command. If your desired routes aren’t listed, check your advertisement configuration (selectors, CIDRs).
  - Advertisement Settings: Ensure the BGP control plane is enabled (bgpControlPlane.enabled=true) and that your CiliumBGPPeeringPolicy sets exportPodCIDR: true and/or a serviceSelector matching the services you expect to see.
  - Router Configuration: On the router, ensure the BGP address family (e.g., address-family ipv4 unicast) is correctly configured and the neighbor is activated for that family.
  - Route-Maps/Filters: Check if any inbound or outbound route-maps or filters on your router are blocking routes from Cilium.
  - Service IP Allocation: For LoadBalancer services, ensure they have an IP allocated. The address must come from a LoadBalancer IPAM range; it is not taken from the Kubernetes service CIDR. Without an allocation, EXTERNAL-IP stays <pending> and there is nothing to advertise.
- Connectivity Issues After BGP Setup (e.g., Pods can’t reach external services)
  Issue: After enabling BGP and disabling masquerading, Pods can’t reach external services.
  Solution:
  - Return Routes: With masquerading disabled (enableIPv4Masquerade=false; bpf.masquerade=false alone only switches the masquerading implementation), external services will see the Pod’s original IP. Your external network must have routes back to the Pod CIDRs via your BGP router. Verify these routes exist on your router and any intermediate devices.
  - NAT Gateway: If direct return routes are not possible for all external services, you might need a NAT gateway for egress traffic or selectively enable masquerading for egress traffic only.
  - Cilium Tunneling: If you disabled tunneling (routingMode=native, or tunnel=disabled on releases before 1.15), ensure autoDirectNodeRoutes=true and that your underlying network can route between nodes directly.
- BGP Session Flapping
  Issue: The BGP session repeatedly goes up and down.
  Solution:
  - Network Instability: Check for network instability between the Kubernetes node and the BGP router (packet loss, high latency).
  - Resource Exhaustion: Ensure the Cilium pod or the BGP router process isn’t experiencing CPU or memory exhaustion.
  - BGP Timers: Misconfigured BGP keepalive and hold timers can cause flapping. Ensure they are compatible between peers. Cilium defaults are typically fine, but check your router.
  - Duplicate Router ID: Ensure the BGP router ID on your external router is unique across your BGP domain.
- LoadBalancer Service IP Not Advertised
  Issue: A Kubernetes LoadBalancer service exists, but its IP is not advertised by Cilium.
  Solution:
  - Cilium Configuration: Ensure your CiliumBGPPeeringPolicy has a serviceSelector and that it applies to the nodes in question.
  - Service Type: Confirm the service is indeed of type LoadBalancer.
  - IP Allocation: Verify the LoadBalancer service has an EXTERNAL-IP assigned. If not, Cilium might not have an IPAM range configured for LoadBalancer IPs, or there might be an issue with the LoadBalancer IPAM.
  - Service Selector: