
10 Things You’re Doing Wrong While Using Kubernetes (And How to Fix Them)

Stop making these critical Kubernetes mistakes that are costing you performance, security, and sleep


Kubernetes has become the de facto standard for container orchestration, but even experienced developers fall into common traps that can lead to production nightmares. After working with hundreds of K8s deployments, I’ve identified the most frequent mistakes that could be silently breaking your clusters right now.

1. Not Setting Resource Requests and Limits

The Mistake: Deploying pods without defining CPU and memory constraints.

Why It’s Bad: Without resource limits, a single misbehaving pod can consume all node resources, causing cascading failures across your cluster.

Wrong Way:

apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: app
    image: myapp:latest
    # No resources defined - Recipe for disaster!

Right Way:

apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: app
    image: myapp:latest
    resources:
      requests:
        memory: "256Mi"
        cpu: "250m"
      limits:
        memory: "512Mi"
        cpu: "500m"

Pro Tip: Start with requests at 50% of limits, then tune based on actual metrics from your monitoring tools.
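
If individual teams keep forgetting these fields, a namespace-level LimitRange can inject sane defaults automatically. A minimal sketch (the namespace name and values here are illustrative, not prescriptive):

apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: production
spec:
  limits:
  - type: Container
    defaultRequest:   # applied when a container omits requests
      cpu: 250m
      memory: 256Mi
    default:          # applied when a container omits limits
      cpu: 500m
      memory: 512Mi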


2. Using the latest Tag in Production

The Mistake: Deploying containers with the latest tag or no tag at all.

Why It’s Bad: You lose reproducibility, can’t roll back reliably, and introduce unpredictable behavior when images update.

Wrong Way:

spec:
  containers:
  - name: web
    image: nginx:latest  # Which version is this really?

Right Way:

spec:
  containers:
  - name: web
    image: nginx:1.25.3  # Explicit, reproducible, rollback-friendly
    imagePullPolicy: IfNotPresent

Bonus: Use semantic versioning for your own images (myapp:v1.2.3) or git commit SHAs (myapp:a4f2c8d).
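
If you manage manifests with Kustomize, you can pin the tag in one place instead of editing every Deployment by hand. A sketch (the file and image names are assumptions):

# kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- deployment.yaml
images:
- name: myapp
  newTag: a4f2c8d   # e.g. the short git commit SHA from your CI pipeline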


3. Running Containers as Root

The Mistake: Not specifying a security context, defaulting to root user (UID 0).

Why It’s Bad: If a container is compromised, the attacker has root-level access, making privilege escalation trivial.

Wrong Way:

spec:
  containers:
  - name: app
    image: myapp:v1.0
    # Runs as root by default

Right Way:

spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 2000
  containers:
  - name: app
    image: myapp:v1.0
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:  # capabilities is a container-level field, not pod-level
        drop:
          - ALL

Implementation: Update your Dockerfile to create a non-root user:

FROM node:18-alpine
RUN addgroup -g 1000 appgroup && \
    adduser -D -u 1000 -G appgroup appuser
USER appuser
WORKDIR /app
COPY --chown=appuser:appgroup . .
CMD ["node", "server.js"]
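
Beyond per-pod settings, Kubernetes 1.25+ can enforce this namespace-wide through the built-in Pod Security admission controller. A sketch, assuming the restricted profile fits your workloads:

apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    # Rejects pods that run as root or allow privilege escalation
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted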

4. Not Implementing Readiness and Liveness Probes

The Mistake: Deploying applications without health checks.

Why It’s Bad: Kubernetes can’t determine if your app is healthy, leading to traffic being sent to broken pods or pods being restarted unnecessarily.

Wrong Way:

spec:
  containers:
  - name: api
    image: myapi:v2.0
    # No health checks

Right Way:

spec:
  containers:
  - name: api
    image: myapi:v2.0
    ports:
    - containerPort: 8080
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 30
      periodSeconds: 10
      timeoutSeconds: 5
      failureThreshold: 3
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 5
      timeoutSeconds: 3
      failureThreshold: 3

Quick Guide:

  • Liveness Probe: Is the app alive? If not, restart it.
  • Readiness Probe: Is the app ready to serve traffic? If not, remove from service.
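
For slow-starting apps, cranking up initialDelaySeconds is a blunt instrument; a startupProbe holds off the other probes until the app has booted. A sketch assuming the same /healthz endpoint:

spec:
  containers:
  - name: api
    image: myapi:v2.0
    startupProbe:
      httpGet:
        path: /healthz
        port: 8080
      periodSeconds: 10
      failureThreshold: 30  # allows up to 300s of startup before a restart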

5. Ignoring Pod Disruption Budgets (PDBs)

The Mistake: Not protecting critical applications from voluntary disruptions during cluster maintenance.

Why It’s Bad: During node drains or cluster upgrades, all your pods might go down simultaneously, causing downtime.

Wrong Way:

# Just the deployment, no PDB
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service
spec:
  replicas: 3
  # ...

Right Way:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: payment-service
  template:
    metadata:
      labels:
        app: payment-service
    spec:
      containers:
      - name: payment
        image: payment:v1.0
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: payment-service-pdb
spec:
  minAvailable: 2  # Keep at least 2 pods running during disruptions
  selector:
    matchLabels:
      app: payment-service

Alternative: Use maxUnavailable: 1 to ensure only one pod is disrupted at a time.
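
That variant looks like this:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: payment-service-pdb
spec:
  maxUnavailable: 1  # evictions proceed one pod at a time
  selector:
    matchLabels:
      app: payment-service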


6. Not Using Namespaces for Multi-Tenancy

The Mistake: Deploying everything in the default namespace.

Why It’s Bad: No logical separation, difficult RBAC management, and resource quotas can’t be applied effectively.

Wrong Way:

kubectl apply -f app.yaml  # Goes to default namespace

Right Way:

apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    environment: production
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: prod-quota
  namespace: production
spec:
  hard:
    requests.cpu: "100"
    requests.memory: "200Gi"
    persistentvolumeclaims: "10"
    pods: "50"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: production
spec:
  # ... your deployment spec

Namespace Strategy:

  • dev – Development workloads
  • staging – Pre-production testing
  • production – Production workloads
  • monitoring – Prometheus, Grafana
  • ingress – Ingress controllers
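
Namespaces also make RBAC tractable: you can grant a team access to its own namespace and nothing else. A minimal sketch (the group name is a hypothetical value from your identity provider):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: dev-team
  namespace: dev
rules:
- apiGroups: ["", "apps"]
  resources: ["pods", "deployments", "services"]
  verbs: ["get", "list", "watch", "create", "update", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dev-team-binding
  namespace: dev
subjects:
- kind: Group
  name: dev-team  # hypothetical group from your identity provider
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: dev-team
  apiGroup: rbac.authorization.k8s.io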

7. Exposing Secrets in Environment Variables

The Mistake: Storing sensitive data in ConfigMaps or plain environment variables.

Why It’s Bad: Sensitive values end up visible in pod specs and logs, and can be easily extracted from a compromised container.

Wrong Way:

spec:
  containers:
  - name: app
    env:
    - name: DB_PASSWORD
      value: "SuperSecret123!"  # Plain text password!
    - name: API_KEY
      valueFrom:
        configMapKeyRef:
          name: app-config
          key: api-key  # ConfigMaps aren't encrypted!

Right Way:

apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
stringData:
  password: "SuperSecret123!"
  username: "dbadmin"
---
spec:
  containers:
  - name: app
    env:
    - name: DB_PASSWORD
      valueFrom:
        secretKeyRef:
          name: db-credentials
          key: password
    volumeMounts:
    - name: secret-volume
      mountPath: /etc/secrets
      readOnly: true
  volumes:
  - name: secret-volume
    secret:
      secretName: db-credentials

Better Yet: Use external secret management:

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-secret
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend
    kind: SecretStore
  target:
    name: db-credentials
  data:
  - secretKey: password
    remoteRef:
      key: database/production
      property: password

8. Not Implementing Network Policies

The Mistake: Leaving pod-to-pod communication wide open.

Why It’s Bad: A compromised pod can communicate with any other pod in your cluster, enabling lateral movement for attackers.

Wrong Way:

# No NetworkPolicy = All pods can talk to each other

Right Way:

# Default deny all ingress traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
---
# Allow specific traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
---
# Allow backend to database
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-backend-to-db
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: postgres
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: backend
    ports:
    - protocol: TCP
      port: 5432
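
If you also default-deny egress, remember that pods still need DNS to resolve service names. A common companion policy, assuming cluster DNS runs in kube-system (verify the selector against your cluster):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-egress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53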

9. Forgetting About Horizontal Pod Autoscaling (HPA)

The Mistake: Setting a fixed number of replicas regardless of load.

Why It’s Bad: You’re either over-provisioned (wasting money) or under-provisioned (causing performance issues).

Wrong Way:

spec:
  replicas: 3  # Always 3, whether you need 1 or 10

Right Way:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 2  # Initial minimum
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: webapp:v1.0
        resources:
          requests:
            cpu: 200m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 50
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Percent
        value: 100
        periodSeconds: 30
      - type: Pods
        value: 2
        periodSeconds: 30
      selectPolicy: Max
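
A common gotcha: once the HPA manages a Deployment, leave the replicas field out of the manifest entirely. Otherwise every kubectl apply resets the replica count to the hardcoded value and fights the autoscaler. A sketch:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  # replicas intentionally omitted - the HPA owns this field now
  selector:
    matchLabels:
      app: web
  template:
    # ... unchanged pod template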

10. Not Implementing Proper Logging and Monitoring

The Mistake: Deploying applications without centralized logging or metrics collection.

Why It’s Bad: When things go wrong (and they will), you’re flying blind with no way to debug issues.

Wrong Way:

# Deploy and hope for the best
spec:
  containers:
  - name: app
    image: myapp:v1.0

Right Way:

Logging with Fluent Bit:

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: logging
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush        5
        Daemon       Off
        Log_Level    info
    
    [INPUT]
        Name              tail
        Path              /var/log/containers/*.log
        Parser            docker
        Tag               kube.*
        Refresh_Interval  5
    
    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
    
    [OUTPUT]
        Name  es
        Match *
        Host  elasticsearch.logging.svc
        Port  9200
        Index fluent-bit

Application Instrumentation:

spec:
  containers:
  - name: app
    image: myapp:v1.0
    env:
    - name: OTEL_EXPORTER_OTLP_ENDPOINT
      value: "http://otel-collector:4317"
    - name: OTEL_SERVICE_NAME
      value: "my-app"
    ports:
    - containerPort: 8080
      name: http
    - containerPort: 8081
      name: metrics  # Prometheus metrics endpoint

ServiceMonitor for Prometheus:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: app-metrics
  namespace: production
spec:
  selector:
    matchLabels:
      app: my-app
  endpoints:
  - port: metrics
    interval: 30s
    path: /metrics

Bonus: Quick Checklist Before Production

Before deploying to production, verify:

  • [ ] Resource requests and limits are set
  • [ ] Image tags are specific (no latest)
  • [ ] Security context is configured (non-root user)
  • [ ] Liveness and readiness probes are implemented
  • [ ] Pod Disruption Budget is in place
  • [ ] Namespaces are used for separation
  • [ ] Secrets are managed securely (not in ConfigMaps)
  • [ ] Network policies are configured
  • [ ] HPA is configured for variable workloads
  • [ ] Logging and monitoring are set up
  • [ ] RBAC is configured properly
  • [ ] Backup strategy is in place
  • [ ] Resource quotas are set per namespace
  • [ ] Anti-affinity rules for critical pods
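
The last checklist item, anti-affinity, spreads replicas across nodes so a single node failure can't take them all down at once. A sketch for the payment-service example from earlier (preferred rather than required, so scheduling still succeeds on small clusters):

spec:
  template:
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: payment-service
              topologyKey: kubernetes.io/hostname  # spread across nodes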

Conclusion

These mistakes are incredibly common, even in production clusters at major companies. The good news? They’re all fixable with proper configuration and a security-first mindset.

Start by auditing your current deployments:

# Find pods without resource limits
kubectl get pods --all-namespaces -o json | \
  jq '.items[] | select(.spec.containers[].resources.limits == null) | .metadata.name'

# Find pods running as root
kubectl get pods --all-namespaces -o json | \
  jq '.items[] | select(.spec.securityContext.runAsNonRoot != true) | .metadata.name'

# List deployments with a fixed replica count (candidates for HPA)
kubectl get deployments --all-namespaces -o json | \
  jq -r '.items[] | select(.spec.replicas != null) | "\(.metadata.namespace)/\(.metadata.name)"'
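
In the same spirit, you can flag pods where any container is missing a liveness probe (the jq filter below is a sketch; adapt it to your own health-check conventions):

```shell
# Find pods where at least one container lacks a liveness probe
kubectl get pods --all-namespaces -o json | \
  jq -r '.items[] | select(any(.spec.containers[]; .livenessProbe == null)) | .metadata.name'
```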

Remember: Kubernetes gives you the tools to build resilient, secure, and scalable applications. But like any powerful tool, it requires knowledge and discipline to use correctly.

What Kubernetes mistakes have you made? Share your experiences in the comments below, or join the Kubezilla community where we discuss Kubernetes best practices daily.

