Kubernetes networking is one of the most critical and complex aspects of container orchestration. Whether you’re running a small development cluster or managing large-scale production workloads, understanding and implementing networking best practices can make the difference between a smooth-running cluster and a debugging nightmare.
In this comprehensive guide, we’ll dive deep into Kubernetes networking best practices, covering everything from CNI selection to network policies, service mesh implementation, and performance optimization. Each section includes practical code examples and real-world scenarios.
Understanding Kubernetes Networking Fundamentals
Before diving into best practices, let’s establish the four fundamental networking requirements in Kubernetes:
- Pod-to-Pod Communication: Every pod can communicate with every other pod without NAT
- Node-to-Pod Communication: Nodes can communicate with all pods without NAT
- Pod IP Visibility: The IP address a pod sees for itself is the same IP address other pods use to reach it
- Service Discovery: Services provide stable endpoints for pod groups
The Kubernetes Networking Model
# Example: Understanding Pod Networking
apiVersion: v1
kind: Pod
metadata:
name: network-test-pod
labels:
app: nettest
spec:
containers:
- name: network-container
image: nicolaka/netshoot
command: ["/bin/bash"]
args: ["-c", "sleep 3600"]
# Each pod gets its own IP address
# All containers in a pod share the same network namespace
Verify Pod Networking
# Get pod IP
kubectl get pod network-test-pod -o wide
# Test pod-to-pod communication
kubectl exec -it network-test-pod -- ping <another-pod-ip>
# Check network interfaces
kubectl exec -it network-test-pod -- ip addr show
Choosing the Right CNI Plugin
The Container Network Interface (CNI) is crucial for your cluster’s networking performance and capabilities. Here’s a detailed comparison and implementation guide.
Popular CNI Plugins Comparison
| CNI Plugin | Network Model | Performance | Features | Best For |
|---|---|---|---|---|
| Calico | L3 BGP/VXLAN | High | Network policies, BGP routing | Production, multi-cloud |
| Cilium | eBPF | Very High | Advanced observability, security | Performance-critical apps |
| Flannel | VXLAN | Medium | Simple, easy setup | Development, simple deployments |
| Weave Net | Mesh | Medium | Automatic discovery, encryption | Small to medium clusters |
| Canal | Calico + Flannel | High | Best of both worlds | Hybrid requirements |
Installing Calico (Production-Grade Setup)
# Download the Calico manifest
curl https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/calico.yaml -O
# Customize the manifest for your environment
# Important: Set the pod CIDR to match your cluster
sed -i 's|192.168.0.0/16|10.244.0.0/16|g' calico.yaml
# Apply the configuration
kubectl apply -f calico.yaml
# Verify installation
kubectl get pods -n kube-system | grep calico
Calico Configuration Best Practices
# Custom Calico configuration for production
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
name: default
spec:
# Enable IP-in-IP encapsulation for cross-subnet communication
calicoNetwork:
# Configure MTU for optimal performance
mtu: 1440
ipPools:
- blockSize: 26
cidr: 10.244.0.0/16
encapsulation: IPIPCrossSubnet
natOutgoing: true
nodeSelector: all()
# Enable Prometheus metrics
nodeMetricsPort: 9091
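The pool above deserves a quick capacity check: `blockSize: 26` carves the pool CIDR into per-node allocation blocks. The arithmetic is easy to verify yourself (a sketch; `calico_pool_capacity` is our own helper, not a Calico API):

```python
import ipaddress

def calico_pool_capacity(cidr: str, block_size: int):
    """Estimate Calico IPAM capacity: IPs per per-node block and the
    number of blocks the pool can hand out. Illustrative arithmetic,
    not a Calico API call."""
    pool = ipaddress.ip_network(cidr)
    ips_per_block = 2 ** (32 - block_size)
    blocks = 2 ** (block_size - pool.prefixlen)
    return ips_per_block, blocks

# The 10.244.0.0/16 pool with blockSize 26 used above:
per_block, blocks = calico_pool_capacity("10.244.0.0/16", 26)
print(per_block, blocks)  # 64 IPs per block, 1024 blocks
```

With 64 pod IPs per block and 1024 blocks, the pool comfortably covers clusters of several hundred nodes; shrink `blockSize` if your nodes run unusually many pods.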
Installing Cilium (eBPF-Based Performance)
# Install Cilium CLI
CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-amd64.tar.gz
sudo tar xzvfC cilium-linux-amd64.tar.gz /usr/local/bin
rm cilium-linux-amd64.tar.gz
# Install Cilium with Hubble (observability)
cilium install \
--version 1.14.5 \
--set hubble.relay.enabled=true \
--set hubble.ui.enabled=true \
--set prometheus.enabled=true \
--set operator.prometheus.enabled=true
# Verify installation
cilium status --wait
CNI Selection Decision Tree
Start
|
├─ Need advanced observability/eBPF? ──> Cilium
├─ Need mature network policies + BGP? ──> Calico
├─ Simple dev environment? ──> Flannel
├─ Need encryption + simple setup? ──> Weave Net
└─ Best practices for production ──> Calico or Cilium
Network Policy Best Practices
Network Policies are your first line of defense for securing pod communication. Here’s how to implement them effectively.
Default Deny All Traffic
Always start with a default deny policy:
# Default deny all ingress and egress traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: default-deny-all
namespace: production
spec:
podSelector: {}
policyTypes:
- Ingress
- Egress
Allow Specific Traffic Pattern
# Allow frontend to communicate with backend
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: frontend-to-backend
namespace: production
spec:
podSelector:
matchLabels:
app: backend
tier: api
policyTypes:
- Ingress
ingress:
- from:
- podSelector:
matchLabels:
app: frontend
tier: web
ports:
- protocol: TCP
port: 8080
Multi-Tier Application Network Policy
# Complete 3-tier application network policy
---
# 1. Frontend Policy: Allow ingress from LoadBalancer
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: frontend-policy
namespace: production
spec:
podSelector:
matchLabels:
tier: frontend
policyTypes:
- Ingress
- Egress
ingress:
- from:
- namespaceSelector:
matchLabels:
name: ingress-nginx
ports:
- protocol: TCP
port: 80
egress:
- to:
- podSelector:
matchLabels:
tier: backend
ports:
- protocol: TCP
port: 8080
# Allow DNS (selectors combined: kube-dns pods in kube-system only)
- to:
- namespaceSelector:
matchLabels:
name: kube-system
podSelector:
matchLabels:
k8s-app: kube-dns
ports:
- protocol: UDP
port: 53
---
# 2. Backend Policy: Allow from frontend, egress to database
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: backend-policy
namespace: production
spec:
podSelector:
matchLabels:
tier: backend
policyTypes:
- Ingress
- Egress
ingress:
- from:
- podSelector:
matchLabels:
tier: frontend
ports:
- protocol: TCP
port: 8080
egress:
- to:
- podSelector:
matchLabels:
tier: database
ports:
- protocol: TCP
port: 5432
# Allow DNS
- to:
- namespaceSelector:
matchLabels:
name: kube-system
ports:
- protocol: UDP
port: 53
---
# 3. Database Policy: Only allow from backend
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: database-policy
namespace: production
spec:
podSelector:
matchLabels:
tier: database
policyTypes:
- Ingress
- Egress
ingress:
- from:
- podSelector:
matchLabels:
tier: backend
ports:
- protocol: TCP
port: 5432
egress:
# Database needs minimal egress - only DNS
- to:
- namespaceSelector:
matchLabels:
name: kube-system
ports:
- protocol: UDP
port: 53
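Before applying the three policies, it helps to state the intended flow matrix explicitly. A minimal sketch (labels match the manifests above; this is a mental model for review, not an enforcement check):

```python
# The three NetworkPolicies above admit exactly these flows
# (DNS egress aside): source tier, destination tier, port.
ALLOWED = {
    ("ingress-nginx", "frontend", 80),
    ("frontend", "backend", 8080),
    ("backend", "database", 5432),
}

def is_allowed(src: str, dst: str, port: int) -> bool:
    """True if the policy set permits src -> dst on the given port."""
    return (src, dst, port) in ALLOWED

print(is_allowed("frontend", "backend", 8080))   # True
print(is_allowed("frontend", "database", 5432))  # False: must go via backend
```

If a flow you need is missing from this table, add a matching ingress/egress pair rather than loosening an existing rule.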
Testing Network Policies
# Create test pods
kubectl run frontend --image=nginx --labels=tier=frontend -n production
kubectl run backend --image=nginx --labels=tier=backend -n production
kubectl run database --image=postgres --labels=tier=database -n production
# Test connectivity (should succeed)
kubectl exec -it frontend -n production -- curl --max-time 5 http://backend-service:8080
# Test blocked connectivity (should time out)
kubectl exec -it frontend -n production -- curl --max-time 5 http://database-service:5432
# Debug network policy
kubectl describe networkpolicy frontend-policy -n production
Advanced Network Policy with CIDR Blocks
# Allow egress to specific external IPs (e.g., external API)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-external-api
namespace: production
spec:
podSelector:
matchLabels:
app: api-consumer
policyTypes:
- Egress
egress:
# Allow to external API
- to:
- ipBlock:
cidr: 203.0.113.0/24
except:
- 203.0.113.1/32
ports:
- protocol: TCP
port: 443
# Allow DNS
- to:
- namespaceSelector: {}
podSelector:
matchLabels:
k8s-app: kube-dns
ports:
- protocol: UDP
port: 53
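The `ipBlock` semantics — inside `cidr`, minus every `except` range — are easy to mirror with Python's standard `ipaddress` module (a sketch for reasoning about the policy, not how the CNI actually evaluates it):

```python
import ipaddress

def ipblock_allows(ip: str, cidr: str, excepts: list[str]) -> bool:
    """Mirror NetworkPolicy ipBlock semantics: the address must fall
    inside `cidr` and outside every `except` range."""
    addr = ipaddress.ip_address(ip)
    if addr not in ipaddress.ip_network(cidr):
        return False
    return not any(addr in ipaddress.ip_network(e) for e in excepts)

# The policy above: 203.0.113.0/24 except 203.0.113.1/32
print(ipblock_allows("203.0.113.50", "203.0.113.0/24", ["203.0.113.1/32"]))  # True
print(ipblock_allows("203.0.113.1",  "203.0.113.0/24", ["203.0.113.1/32"]))  # False
```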
Service and DNS Optimization
Kubernetes Services and DNS are critical for service discovery. Let’s optimize them.
ClusterIP Service (Default)
# Production-ready ClusterIP service
apiVersion: v1
kind: Service
metadata:
name: backend-service
namespace: production
labels:
app: backend
spec:
type: ClusterIP
# Use sessionAffinity for stateful applications
sessionAffinity: ClientIP
sessionAffinityConfig:
clientIP:
timeoutSeconds: 3600
selector:
app: backend
tier: api
ports:
- name: http
protocol: TCP
port: 8080
targetPort: 8080
# Note: topologyKeys was removed in Kubernetes 1.22. For topology-aware
# routing, use the service.kubernetes.io/topology-mode: Auto annotation
# (1.27+) or spec.trafficDistribution: PreferClose (1.30+) instead
Headless Service for StatefulSets
# Headless service for direct pod access
apiVersion: v1
kind: Service
metadata:
name: database-headless
namespace: production
spec:
clusterIP: None # This makes it headless
selector:
app: postgres
tier: database
ports:
- name: postgres
port: 5432
targetPort: 5432
publishNotReadyAddresses: true # Include not-ready pods in DNS
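A headless service gives each StatefulSet pod a stable DNS name of the form `<pod>.<service>.<namespace>.svc.<zone>`. A small helper makes the pattern concrete (illustrative; the names match the manifest above):

```python
def statefulset_dns(name: str, replicas: int, service: str,
                    namespace: str, zone: str = "cluster.local"):
    """Per-pod DNS names a headless service publishes for a
    StatefulSet's ordinal-indexed pods."""
    return [f"{name}-{i}.{service}.{namespace}.svc.{zone}"
            for i in range(replicas)]

for host in statefulset_dns("postgres", 3, "database-headless", "production"):
    print(host)
# postgres-0.database-headless.production.svc.cluster.local
# ... and so on for each ordinal
```

These per-pod names are what lets a Postgres replica address its primary directly, which a normal ClusterIP service cannot provide.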
External Service Mapping
# Map external database to Kubernetes service
apiVersion: v1
kind: Service
metadata:
name: external-database
namespace: production
spec:
type: ExternalName
externalName: database.example.com
ports:
- port: 5432
DNS Optimization
# CoreDNS ConfigMap optimization
apiVersion: v1
kind: ConfigMap
metadata:
name: coredns
namespace: kube-system
data:
Corefile: |
.:53 {
errors
health {
lameduck 5s
}
ready
# Enable prometheus metrics
prometheus :9153
# Kubernetes plugin with optimized settings
kubernetes cluster.local in-addr.arpa ip6.arpa {
pods insecure
fallthrough in-addr.arpa ip6.arpa
ttl 30
}
# Forward to upstream DNS
forward . /etc/resolv.conf {
max_concurrent 1000
}
# Enable caching
cache 30
loop
reload
loadbalance
# Rate limiting requires a custom CoreDNS build with the
# third-party ratelimit plugin (not included in default images):
# ratelimit 100
}
DNS Policy Best Practices
# Pod with custom DNS policy
apiVersion: v1
kind: Pod
metadata:
name: dns-optimized-pod
spec:
# Use ClusterFirst for most applications
dnsPolicy: ClusterFirst
# Custom DNS configuration
dnsConfig:
nameservers:
- 8.8.8.8
searches:
- production.svc.cluster.local
- svc.cluster.local
- cluster.local
options:
- name: ndots
value: "2"
- name: timeout
value: "2"
- name: attempts
value: "2"
containers:
- name: app
image: myapp:latest
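The `ndots` option matters more than it looks: it decides whether the resolver tries the search list before or after querying the name as-is, which directly affects DNS query volume. A simplified model of glibc's behavior (a sketch, not the exact resolver algorithm):

```python
def resolution_order(name: str, searches: list[str], ndots: int):
    """Sketch of glibc resolver behavior with ndots: names with fewer
    than `ndots` dots try the search list first; absolute names
    (trailing dot) skip the search list entirely."""
    if name.endswith("."):
        return [name.rstrip(".")]
    candidates = [f"{name}.{s}" for s in searches]
    if name.count(".") >= ndots:
        return [name] + candidates   # try as-is first
    return candidates + [name]       # search domains first

searches = ["production.svc.cluster.local", "svc.cluster.local", "cluster.local"]
# With ndots=2, "backend-service" (0 dots) expands through the search list:
print(resolution_order("backend-service", searches, 2)[0])
# backend-service.production.svc.cluster.local
```

This is why lowering `ndots` from the Kubernetes default of 5 to 2 helps: external names like `api.example.com` (2 dots) are queried directly instead of first failing through every search domain.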
Service Endpoint Monitoring
# Check service endpoints
kubectl get endpoints backend-service -n production
# Describe service for debugging
kubectl describe svc backend-service -n production
# Test DNS resolution
kubectl run -it --rm debug --image=nicolaka/netshoot --restart=Never -- nslookup backend-service.production.svc.cluster.local
# Check CoreDNS logs
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=100
Ingress Controller Configuration
Choosing and configuring the right Ingress controller is crucial for production workloads.
NGINX Ingress Controller Installation
# Install NGINX Ingress Controller
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.9.5/deploy/static/provider/cloud/deploy.yaml
# Verify installation
kubectl get pods -n ingress-nginx
kubectl get svc -n ingress-nginx
Production-Grade Ingress Configuration
# NGINX Ingress with TLS, rate limiting, and advanced routing
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: production-ingress
namespace: production
annotations:
# Enable SSL redirect
nginx.ingress.kubernetes.io/ssl-redirect: "true"
# Force HTTPS
nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
# Rate limiting
nginx.ingress.kubernetes.io/limit-rps: "100"
nginx.ingress.kubernetes.io/limit-connections: "50"
# Connection timeout
nginx.ingress.kubernetes.io/proxy-connect-timeout: "30"
nginx.ingress.kubernetes.io/proxy-send-timeout: "30"
nginx.ingress.kubernetes.io/proxy-read-timeout: "30"
# Enable CORS
nginx.ingress.kubernetes.io/enable-cors: "true"
nginx.ingress.kubernetes.io/cors-allow-methods: "GET, POST, PUT, DELETE, OPTIONS"
nginx.ingress.kubernetes.io/cors-allow-origin: "https://example.com"
# Client body size
nginx.ingress.kubernetes.io/proxy-body-size: "10m"
# Use cert-manager for TLS
cert-manager.io/cluster-issuer: "letsencrypt-prod"
# Enable modsecurity WAF
nginx.ingress.kubernetes.io/enable-modsecurity: "true"
nginx.ingress.kubernetes.io/enable-owasp-core-rules: "true"
spec:
ingressClassName: nginx
tls:
- hosts:
- api.example.com
- app.example.com
secretName: production-tls-secret
rules:
- host: api.example.com
http:
paths:
- path: /v1
pathType: Prefix
backend:
service:
name: backend-service
port:
number: 8080
- host: app.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: frontend-service
port:
number: 80
Path-Based Routing with Rewrite
# Advanced path-based routing
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: path-based-ingress
namespace: production
annotations:
nginx.ingress.kubernetes.io/rewrite-target: /$2
nginx.ingress.kubernetes.io/use-regex: "true"
spec:
ingressClassName: nginx
rules:
- host: services.example.com
http:
paths:
- path: /api(/|$)(.*)
pathType: Prefix
backend:
service:
name: api-service
port:
number: 8080
- path: /auth(/|$)(.*)
pathType: Prefix
backend:
service:
name: auth-service
port:
number: 8081
- path: /metrics(/|$)(.*)
pathType: Prefix
backend:
service:
name: metrics-service
port:
number: 9090
Canary Deployment with Ingress
# Production ingress (90% traffic)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: production-ingress
namespace: production
spec:
ingressClassName: nginx
rules:
- host: app.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: app-stable
port:
number: 80
---
# Canary ingress (10% traffic)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: canary-ingress
namespace: production
annotations:
nginx.ingress.kubernetes.io/canary: "true"
nginx.ingress.kubernetes.io/canary-weight: "10"
# Or use header-based routing
# nginx.ingress.kubernetes.io/canary-by-header: "X-Canary"
# nginx.ingress.kubernetes.io/canary-by-header-value: "enabled"
spec:
ingressClassName: nginx
rules:
- host: app.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: app-canary
port:
number: 80
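The `canary-weight: "10"` annotation sends roughly 10% of requests to the canary backend. A quick simulation of that per-request coin flip (illustrative; NGINX's internal selection differs in detail):

```python
import random

def route(canary_weight: int, rng: random.Random) -> str:
    """Model of canary-weight semantics: canary_weight percent of
    requests go to the canary backend, the rest to stable."""
    return "canary" if rng.randrange(100) < canary_weight else "stable"

rng = random.Random(42)  # seeded for reproducibility
hits = sum(route(10, rng) == "canary" for _ in range(10_000))
print(f"canary share: {hits / 100:.1f}%")  # close to 10%
```

Note the split is probabilistic per request, not per client; combine it with the `canary-by-header` annotations when testers need to be pinned to the canary.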
Custom Error Pages
apiVersion: v1
kind: ConfigMap
metadata:
name: custom-error-pages
namespace: ingress-nginx
data:
404.html: |
<!DOCTYPE html>
<html>
<head><title>404 Not Found</title></head>
<body>
<h1>Resource Not Found</h1>
<p>The requested resource was not found on this server.</p>
</body>
</html>
503.html: |
<!DOCTYPE html>
<html>
<head><title>503 Service Unavailable</title></head>
<body>
<h1>Service Temporarily Unavailable</h1>
<p>We're experiencing technical difficulties. Please try again later.</p>
</body>
</html>
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: error-pages-ingress
annotations:
nginx.ingress.kubernetes.io/custom-http-errors: "404,503"
nginx.ingress.kubernetes.io/default-backend: error-pages-service
spec:
# ... rest of ingress config
Service Mesh Implementation
Service meshes provide advanced traffic management, security, and observability. Let’s explore Istio implementation.
Installing Istio
# Download Istio
curl -L https://istio.io/downloadIstio | sh -
cd istio-1.20.0
export PATH=$PWD/bin:$PATH
# Install Istio with the default profile (recommended for production)
istioctl install --set profile=default -y
# Enable sidecar injection for namespace
kubectl label namespace production istio-injection=enabled
# Verify installation
kubectl get pods -n istio-system
Istio Gateway and VirtualService
# Istio Gateway for external traffic
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
name: production-gateway
namespace: production
spec:
selector:
istio: ingressgateway
servers:
- port:
number: 443
name: https
protocol: HTTPS
tls:
mode: SIMPLE
credentialName: production-tls
hosts:
- "app.example.com"
- port:
number: 80
name: http
protocol: HTTP
hosts:
- "app.example.com"
# Redirect HTTP to HTTPS
tls:
httpsRedirect: true
---
# VirtualService for routing
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
name: app-virtualservice
namespace: production
spec:
hosts:
- "app.example.com"
gateways:
- production-gateway
http:
- match:
- uri:
prefix: "/api/v1"
route:
- destination:
host: backend-service
port:
number: 8080
weight: 90
- destination:
host: backend-service-canary
port:
number: 8080
weight: 10
timeout: 10s
retries:
attempts: 3
perTryTimeout: 3s
retryOn: 5xx,reset,connect-failure,refused-stream
- match:
- uri:
prefix: "/"
route:
- destination:
host: frontend-service
port:
number: 80
Circuit Breaking
# Destination Rule with circuit breaking
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
name: backend-circuit-breaker
namespace: production
spec:
host: backend-service
trafficPolicy:
connectionPool:
tcp:
maxConnections: 100
http:
http1MaxPendingRequests: 50
http2MaxRequests: 100
maxRequestsPerConnection: 2
outlierDetection:
consecutive5xxErrors: 5
interval: 30s
baseEjectionTime: 30s
maxEjectionPercent: 50
minHealthPercent: 40
loadBalancer:
simple: LEAST_REQUEST
mTLS Configuration
# PeerAuthentication for mutual TLS
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
name: production-mtls
namespace: production
spec:
mtls:
mode: STRICT # Enforce mTLS for all services
---
# Authorization Policy
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
name: frontend-authz
namespace: production
spec:
selector:
matchLabels:
app: frontend
action: ALLOW
rules:
- from:
- source:
principals: ["cluster.local/ns/ingress-nginx/sa/ingress-nginx"]
to:
- operation:
methods: ["GET", "POST"]
paths: ["/api/*"]
Traffic Splitting
# Gradual rollout with traffic splitting
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
name: reviews-route
namespace: production
spec:
hosts:
- reviews-service
http:
- match:
- headers:
user-type:
exact: "beta-tester"
route:
- destination:
host: reviews-service
subset: v2
- route:
- destination:
host: reviews-service
subset: v1
weight: 75
- destination:
host: reviews-service
subset: v2
weight: 25
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
name: reviews-destination
namespace: production
spec:
host: reviews-service
subsets:
- name: v1
labels:
version: v1
- name: v2
labels:
version: v2
Load Balancing Strategies
Proper load balancing ensures optimal resource utilization and high availability.
Service Load Balancing Algorithms
# Round Robin (Default)
apiVersion: v1
kind: Service
metadata:
name: backend-roundrobin
spec:
selector:
app: backend
ports:
- port: 8080
# Default behavior - no additional config needed
---
# Session Affinity (Sticky Sessions)
apiVersion: v1
kind: Service
metadata:
name: backend-sticky
spec:
selector:
app: backend
sessionAffinity: ClientIP
sessionAffinityConfig:
clientIP:
timeoutSeconds: 10800 # 3 hours
ports:
- port: 8080
---
# Internal Traffic Policy (Topology-Aware)
apiVersion: v1
kind: Service
metadata:
name: backend-topology
spec:
selector:
app: backend
internalTrafficPolicy: Local # Route to node-local endpoints
ports:
- port: 8080
External Load Balancer with Health Checks
apiVersion: v1
kind: Service
metadata:
name: frontend-lb
namespace: production
annotations:
# Cloud provider specific annotations
# AWS
service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
service.beta.kubernetes.io/aws-load-balancer-backend-protocol: "http"
service.beta.kubernetes.io/aws-load-balancer-healthcheck-path: "/health"
service.beta.kubernetes.io/aws-load-balancer-healthcheck-interval: "10"
# GCP
cloud.google.com/load-balancer-type: "Internal"
# Azure
service.beta.kubernetes.io/azure-load-balancer-internal: "true"
spec:
type: LoadBalancer
selector:
app: frontend
ports:
- name: http
port: 80
targetPort: 8080
externalTrafficPolicy: Local # Preserve source IP
healthCheckNodePort: 32000
Pod Disruption Budget
# Ensure high availability during updates
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: backend-pdb
namespace: production
spec:
minAvailable: 2 # Or use maxUnavailable: 1
selector:
matchLabels:
app: backend
tier: api
HorizontalPodAutoscaler with Custom Metrics
# HPA based on CPU and custom metrics
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: backend-hpa
namespace: production
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: backend
minReplicas: 3
maxReplicas: 20
behavior:
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Percent
value: 50
periodSeconds: 60
scaleUp:
stabilizationWindowSeconds: 0
policies:
- type: Percent
value: 100
periodSeconds: 30
- type: Pods
value: 4
periodSeconds: 30
selectPolicy: Max
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
- type: Pods
pods:
metric:
name: http_requests_per_second
target:
type: AverageValue
averageValue: "1000"
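Underlying all of these metrics is a single formula: `desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric)`, evaluated per metric, with the largest result clamped to the min/max bounds. A sketch of the core calculation:

```python
import math

def hpa_desired_replicas(current_replicas: int, current_metric: float,
                         target_metric: float, min_r: int, max_r: int) -> int:
    """Core HPA scaling formula:
    desired = ceil(current * currentMetric / targetMetric), clamped."""
    desired = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_r, min(max_r, desired))

# CPU at 90% against the 70% target above, with 5 replicas running:
print(hpa_desired_replicas(5, 90, 70, 3, 20))  # 7
```

The `behavior` stanza then rate-limits how fast the controller may move toward that desired count, which is why scale-down above is capped at 50% per minute.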
Network Security Hardening
Security should be built into your networking layer from day one.
Pod Security Standards
# Enforce Pod Security Standards at namespace level
apiVersion: v1
kind: Namespace
metadata:
name: production
labels:
pod-security.kubernetes.io/enforce: restricted
pod-security.kubernetes.io/audit: restricted
pod-security.kubernetes.io/warn: restricted
Secure Pod Specification
apiVersion: v1
kind: Pod
metadata:
name: secure-app
namespace: production
spec:
# Use service account with minimal permissions
serviceAccountName: app-sa
automountServiceAccountToken: false
securityContext:
runAsNonRoot: true
runAsUser: 1000
fsGroup: 2000
seccompProfile:
type: RuntimeDefault
containers:
- name: app
image: myapp:latest
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
capabilities:
drop:
- ALL
# Resource limits for network bandwidth control
resources:
limits:
cpu: "1"
memory: "512Mi"
ephemeral-storage: "2Gi"
requests:
cpu: "500m"
memory: "256Mi"
Egress Gateway for External Traffic
# Dedicated egress gateway for external API calls
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
name: egress-gateway
namespace: istio-system
spec:
selector:
istio: egressgateway
servers:
- port:
number: 443
name: tls
protocol: TLS
hosts:
- "*.external-api.com"
tls:
mode: PASSTHROUGH
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
name: external-api-route
namespace: production
spec:
hosts:
- "*.external-api.com"
gateways:
- mesh
- istio-system/egress-gateway
tls:
- match:
- gateways:
- mesh
port: 443
sniHosts:
- "*.external-api.com"
route:
- destination:
host: istio-egressgateway.istio-system.svc.cluster.local
port:
number: 443
- match:
- gateways:
- istio-system/egress-gateway
port: 443
sniHosts:
- "*.external-api.com"
route:
- destination:
host: external-api.com
port:
number: 443
Network Policy for Egress Control
# Strict egress control with allowed destinations
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: strict-egress
namespace: production
spec:
podSelector:
matchLabels:
app: secure-app
policyTypes:
- Egress
egress:
# Allow DNS
- to:
- namespaceSelector:
matchLabels:
name: kube-system
ports:
- protocol: UDP
port: 53
# Allow HTTPS cluster-wide (restricting egress to specific external
# domains requires CNI extensions such as Cilium FQDN policies or
# Calico GlobalNetworkPolicy)
- to:
- namespaceSelector: {}
ports:
- protocol: TCP
port: 443
# Allow internal services
- to:
- podSelector:
matchLabels:
tier: database
ports:
- protocol: TCP
port: 5432
TLS Certificate Management
# cert-manager ClusterIssuer for Let's Encrypt
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-prod
spec:
acme:
server: https://acme-v02.api.letsencrypt.org/directory
email: admin@example.com
privateKeySecretRef:
name: letsencrypt-prod-key
solvers:
- http01:
ingress:
class: nginx
- dns01:
cloudflare:
email: admin@example.com
apiTokenSecretRef:
name: cloudflare-api-token
key: api-token
---
# Certificate for multiple domains
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: production-tls
namespace: production
spec:
secretName: production-tls-secret
issuerRef:
name: letsencrypt-prod
kind: ClusterIssuer
dnsNames:
- example.com
- "*.example.com"
- api.example.com
Monitoring and Troubleshooting
Effective monitoring is essential for maintaining healthy networking.
Prometheus ServiceMonitor
# Monitor Kubernetes services with Prometheus
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: backend-monitor
namespace: production
labels:
app: backend
spec:
selector:
matchLabels:
app: backend
endpoints:
- port: metrics
interval: 30s
path: /metrics
scheme: http
Network Troubleshooting Pod
# Deploy debugging pod with network tools
apiVersion: v1
kind: Pod
metadata:
name: netshoot
namespace: production
spec:
containers:
- name: netshoot
image: nicolaka/netshoot:latest
command: ["/bin/bash"]
args: ["-c", "while true; do sleep 3600; done"]
securityContext:
capabilities:
add:
- NET_ADMIN
- NET_RAW
Essential Troubleshooting Commands
# DNS troubleshooting
kubectl exec -it netshoot -n production -- nslookup backend-service
kubectl exec -it netshoot -n production -- dig backend-service.production.svc.cluster.local
# Connectivity testing
kubectl exec -it netshoot -n production -- curl -v http://backend-service:8080/health
kubectl exec -it netshoot -n production -- nc -zv backend-service 8080
# Network interface inspection
kubectl exec -it netshoot -n production -- ip addr show
kubectl exec -it netshoot -n production -- ip route show
# Packet capture
kubectl exec -it netshoot -n production -- tcpdump -i any -w /tmp/capture.pcap
# MTU path discovery
kubectl exec -it netshoot -n production -- tracepath backend-service.production.svc.cluster.local
# Test network policies
kubectl exec -it netshoot -n production -- telnet database-service 5432
# Check iptables rules (requires privileged access)
kubectl exec -it netshoot -n production -- iptables -L -n -v
# Latency testing
kubectl exec -it netshoot -n production -- ping -c 10 backend-service
# HTTP performance testing
kubectl exec -it netshoot -n production -- ab -n 1000 -c 10 http://backend-service:8080/
Cilium Hubble for Network Observability
# Install Hubble CLI
export HUBBLE_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/hubble/master/stable.txt)
curl -L --fail --remote-name-all https://github.com/cilium/hubble/releases/download/$HUBBLE_VERSION/hubble-linux-amd64.tar.gz
sudo tar xzvfC hubble-linux-amd64.tar.gz /usr/local/bin
# Start a background port-forward to the Hubble relay
cilium hubble port-forward &
# Observe network flows
hubble observe --namespace production
# Filter by specific pod
hubble observe --pod backend-deployment-abc123
# Check dropped packets
hubble observe --verdict DROPPED
# HTTP metrics
hubble observe --http-status 500
# Network policy violations (denials surface as DROPPED policy verdicts)
hubble observe --type policy-verdict --verdict DROPPED
Performance Optimization
Optimize your network performance for production workloads.
Configure CNI MTU
# Check current MTU
kubectl exec -it <pod-name> -- ip link show eth0
# For Calico
kubectl patch installation default --type=merge -p '{"spec":{"calicoNetwork":{"mtu":1440}}}'
# For Cilium (restart the Cilium agents to apply the change)
cilium config set mtu 1450
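The right MTU is the underlying link MTU minus the encapsulation overhead your CNI adds. The commonly cited per-protocol overheads (verify the figures against your CNI's documentation) work out as:

```python
# Typical encapsulation overheads on a standard 1500-byte Ethernet MTU
# (commonly documented figures; confirm against your CNI's docs):
OVERHEAD = {
    "none": 0,        # unencapsulated / BGP-routed
    "ipip": 20,       # IP-in-IP header
    "vxlan": 50,      # outer IP + UDP + VXLAN headers
    "wireguard": 60,  # WireGuard encryption overhead
}

def pod_mtu(link_mtu: int, encapsulation: str) -> int:
    """MTU to configure on pod interfaces so encapsulated packets
    still fit on the underlying link without fragmentation."""
    return link_mtu - OVERHEAD[encapsulation]

print(pod_mtu(1500, "ipip"))       # 1480
print(pod_mtu(1500, "wireguard"))  # 1440
```

Getting this wrong shows up as fragmentation or silently dropped large packets, so check it whenever you change encapsulation mode or enable in-cluster encryption.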
Optimize kube-proxy
# kube-proxy ConfigMap for IPVS mode
apiVersion: v1
kind: ConfigMap
metadata:
name: kube-proxy
namespace: kube-system
data:
config.conf: |
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs" # Use IPVS instead of iptables
ipvs:
scheduler: "rr" # Round-robin
strictARP: true
tcpTimeout: 900s
tcpFinTimeout: 120s
udpTimeout: 300s
# Connection tracking
conntrack:
maxPerCore: 524288
min: 131072
tcpEstablishedTimeout: 86400s
tcpCloseWaitTimeout: 3600s
Enable BBR Congestion Control
# DaemonSet to enable BBR on all nodes
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: node-tuning
namespace: kube-system
spec:
selector:
matchLabels:
app: node-tuning
template:
metadata:
labels:
app: node-tuning
spec:
hostNetwork: true
hostPID: true
initContainers:
- name: sysctl-tuning
image: busybox
command:
- sh
- -c
- |
sysctl -w net.core.default_qdisc=fq
sysctl -w net.ipv4.tcp_congestion_control=bbr
sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"
sysctl -w net.core.rmem_max=16777216
sysctl -w net.core.wmem_max=16777216
sysctl -w net.ipv4.tcp_slow_start_after_idle=0
securityContext:
privileged: true
containers:
- name: pause
image: registry.k8s.io/pause:3.9
Connection Pooling
# Deployment with optimized connection settings
apiVersion: apps/v1
kind: Deployment
metadata:
name: optimized-app
spec:
replicas: 3
template:
spec:
containers:
- name: app
image: myapp:latest
env:
# HTTP keep-alive
- name: HTTP_KEEP_ALIVE_TIMEOUT
value: "65"
- name: HTTP_MAX_CONNECTIONS
value: "1000"
# Database connection pool
- name: DB_POOL_SIZE
value: "20"
- name: DB_MAX_OVERFLOW
value: "10"
- name: DB_POOL_TIMEOUT
value: "30"
resources:
requests:
cpu: 500m
memory: 512Mi
limits:
cpu: 2
memory: 2Gi
Readiness and Liveness Probes
# Optimized health checks
apiVersion: apps/v1
kind: Deployment
metadata:
name: backend
spec:
template:
spec:
containers:
- name: backend
image: backend:latest
ports:
- containerPort: 8080
# Startup probe for slow-starting containers
startupProbe:
httpGet:
path: /healthz
port: 8080
failureThreshold: 30
periodSeconds: 10
# Liveness probe
livenessProbe:
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
# Readiness probe
readinessProbe:
httpGet:
path: /ready
port: 8080
initialDelaySeconds: 10
periodSeconds: 5
timeoutSeconds: 3
successThreshold: 1
failureThreshold: 3
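These probe parameters compose into concrete time budgets — for instance, the startup probe above tolerates up to `failureThreshold × periodSeconds` of startup time before the container is killed. A trivial check:

```python
def max_startup_seconds(failure_threshold: int, period_seconds: int) -> int:
    """A startupProbe gives a container up to
    failureThreshold * periodSeconds to come up before the kubelet
    restarts it."""
    return failure_threshold * period_seconds

# The probe above: 30 allowed failures x 10s period
print(max_startup_seconds(30, 10))  # 300 seconds
```

Size this budget to your slowest realistic cold start; once the startup probe succeeds, the tighter liveness and readiness probes take over.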
Real-World Production Architecture
Let’s put it all together with a complete production-grade architecture.
# Complete production namespace setup
---
apiVersion: v1
kind: Namespace
metadata:
name: production
labels:
istio-injection: enabled
pod-security.kubernetes.io/enforce: restricted
name: production
---
# Default deny all network policy
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: default-deny-all
namespace: production
spec:
podSelector: {}
policyTypes:
- Ingress
- Egress
---
# Frontend deployment with all optimizations
apiVersion: apps/v1
kind: Deployment
metadata:
name: frontend
namespace: production
spec:
replicas: 3
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 0
selector:
matchLabels:
app: frontend
tier: web
version: v1
template:
metadata:
labels:
app: frontend
tier: web
version: v1
spec:
serviceAccountName: frontend-sa
automountServiceAccountToken: false
securityContext:
runAsNonRoot: true
runAsUser: 1000
fsGroup: 2000
seccompProfile:
type: RuntimeDefault
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchLabels:
app: frontend
topologyKey: kubernetes.io/hostname
containers:
- name: frontend
image: frontend:v1.0.0
imagePullPolicy: IfNotPresent
ports:
- name: http
containerPort: 8080
protocol: TCP
- name: metrics
containerPort: 9090
protocol: TCP
env:
- name: BACKEND_URL
value: "http://backend-service:8080"
- name: ENVIRONMENT
value: "production"
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
capabilities:
drop:
- ALL
resources:
requests:
cpu: 250m
memory: 256Mi
limits:
cpu: 1
memory: 512Mi
livenessProbe:
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
readinessProbe:
httpGet:
path: /ready
port: 8080
initialDelaySeconds: 10
periodSeconds: 5
timeoutSeconds: 3
failureThreshold: 3
volumeMounts:
- name: cache
mountPath: /tmp
- name: config
mountPath: /etc/config
readOnly: true
volumes:
- name: cache
emptyDir: {}
- name: config
configMap:
name: frontend-config
---
# Frontend service
apiVersion: v1
kind: Service
metadata:
  name: frontend-service
  namespace: production
  labels:
    app: frontend
spec:
  type: ClusterIP
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 3600
  selector:
    app: frontend
    tier: web
  ports:
  - name: http
    port: 80
    targetPort: 8080
    protocol: TCP
  - name: metrics
    port: 9090
    targetPort: 9090
    protocol: TCP
---
# Frontend network policy
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: frontend-policy
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: frontend
      tier: web
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          # kubernetes.io/metadata.name is set automatically on every
          # namespace (Kubernetes 1.21+), so no manual labeling is needed
          kubernetes.io/metadata.name: ingress-nginx
    ports:
    - protocol: TCP
      port: 8080
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: backend
    ports:
    - protocol: TCP
      port: 8080
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
    # Allow DNS over TCP as well; large responses fall back to TCP
    - protocol: TCP
      port: 53
---
# HPA for frontend
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: frontend-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: frontend
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
---
# PDB for frontend
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: frontend-pdb
  namespace: production
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: frontend
      tier: web
---
# Production ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: production-ingress
  namespace: production
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
    nginx.ingress.kubernetes.io/limit-rps: "100"
    nginx.ingress.kubernetes.io/enable-cors: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: "10m"
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - app.example.com
    secretName: production-tls
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: frontend-service
            port:
              number: 80
Best Practices Checklist
Before going to production, ensure you’ve implemented:
Network Configuration
- [ ] CNI plugin selected and optimized for your use case
- [ ] MTU configured correctly for your network
- [ ] Pod CIDR doesn’t overlap with node network or service CIDR
- [ ] Network policies implemented with default deny-all
- [ ] DNS caching and optimization enabled
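The "default deny-all" item above can be implemented with a policy like the following minimal sketch (the policy name and namespace are illustrative). With an empty pod selector and both policy types listed, all traffic to and from every pod in the namespace is denied until more specific allow policies, such as the frontend policy shown earlier, are added:

```yaml
# Default deny-all policy: blocks all ingress and egress for every pod
# in the production namespace until explicit allow rules are created
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all   # illustrative name
  namespace: production
spec:
  podSelector: {}          # empty selector matches all pods in the namespace
  policyTypes:
  - Ingress
  - Egress
```

Apply one of these per namespace; remember that denying egress also blocks DNS, so pair it with an explicit DNS allow rule.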
Security
- [ ] TLS/mTLS enabled for service-to-service communication
- [ ] Network policies enforce least privilege access
- [ ] Egress traffic controlled and monitored
- [ ] Pod Security Standards enforced
- [ ] Service accounts follow principle of least privilege
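If you run a service mesh such as Istio, the mTLS item above can be enforced mesh-wide with a single PeerAuthentication resource. This is a sketch assuming Istio is installed with `istio-system` as its root namespace:

```yaml
# Mesh-wide strict mTLS (assumes Istio; placing the resource in the
# root namespace applies it to the entire mesh)
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT   # sidecars reject any plaintext traffic
```

Roll this out with `mode: PERMISSIVE` first if some workloads are not yet in the mesh, then switch to STRICT once all traffic is sidecar-to-sidecar.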
Reliability
- [ ] Health checks (liveness, readiness, startup) properly configured
- [ ] Pod Disruption Budgets set for critical services
- [ ] HPA configured with appropriate metrics
- [ ] Resource requests and limits defined
- [ ] Anti-affinity rules for high availability
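The frontend deployment above defines liveness and readiness probes; for slow-starting containers, a startupProbe prevents the liveness probe from killing the pod before it finishes booting. A container-level fragment (the endpoint and thresholds are assumptions, reusing the liveness endpoint):

```yaml
# startupProbe fragment for a slow-starting container; liveness and
# readiness probes are disabled until this probe succeeds
startupProbe:
  httpGet:
    path: /healthz       # assumed to match the liveness endpoint
    port: 8080
  periodSeconds: 5
  failureThreshold: 30   # tolerates up to ~150s of startup time
```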
Performance
- [ ] kube-proxy mode optimized (IPVS recommended)
- [ ] Connection pooling implemented
- [ ] BBR congestion control enabled
- [ ] Service mesh circuit breakers configured
- [ ] Load balancing strategy appropriate for workload
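The IPVS item above is configured through the kube-proxy configuration (usually delivered via the kube-proxy ConfigMap in kube-system). A minimal fragment, where the scheduler choice is an assumption to tune for your workload:

```yaml
# KubeProxyConfiguration fragment enabling IPVS mode
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
ipvs:
  scheduler: lc   # least-connection; rr (round-robin) is the default
```

BBR, by contrast, is a per-node kernel setting: `sysctl -w net.ipv4.tcp_congestion_control=bbr` (the `tcp_bbr` module must be available), typically applied via a node tuning DaemonSet or machine config.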
Observability
- [ ] Prometheus metrics exposed
- [ ] Network flow monitoring enabled (Hubble/Cilium)
- [ ] Distributed tracing implemented
- [ ] Alerts configured for network issues
- [ ] Logging captures network errors
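Assuming the Prometheus Operator is installed, the alerting item above could start with a rule on cAdvisor's per-pod network error counters. The rule name, namespace, threshold, and labels below are illustrative:

```yaml
# Basic network-error alert (assumes Prometheus Operator and cAdvisor metrics)
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: network-alerts     # illustrative name
  namespace: monitoring    # illustrative namespace
spec:
  groups:
  - name: network
    rules:
    - alert: HighPodNetworkErrors
      expr: rate(container_network_receive_errors_total[5m]) > 0
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: "Pod {{ $labels.pod }} is seeing network receive errors"
```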
Conclusion
Kubernetes networking is complex, but following these best practices will help you build a robust, secure, and performant infrastructure. Key takeaways:
- Start with the right CNI: Choose Calico for production maturity or Cilium for cutting-edge performance
- Default deny everything: Implement network policies from day one
- Secure by default: Use mTLS, enforce Pod Security Standards, control egress
- Monitor relentlessly: Use Prometheus, Hubble, and distributed tracing
- Optimize for your workload: Tune MTU, use IPVS, enable BBR
- Plan for failure: Implement circuit breakers, health checks, and PDBs
- Test thoroughly: Validate every network policy and connectivity requirement
Remember, networking configuration should evolve with your application. Start simple, measure everything, and optimize based on real-world metrics.
Additional Resources
- Kubernetes Networking Documentation
- Calico Documentation
- Cilium Documentation
- Istio Documentation
- NGINX Ingress Controller