Kubernetes Security: The Stuff Nobody Tells You Until Your Cluster Gets Pwned

TL;DR – Quick Takeaways

🔒 Essential Kubernetes Security Facts:

  • 94% of organizations experienced at least one Kubernetes security incident in the past year
  • The shared responsibility model means YOU own security from the container layer up
  • Security must be implemented across 5 critical layers: cluster, container, code, cloud, and compliance
  • RBAC, network policies, and pod security standards are non-negotiable foundations
  • Runtime security and continuous monitoring catch what static scans miss

⚡ Action Items:

  • Enable RBAC and follow the principle of least privilege
  • Implement network policies to control pod-to-pod communication
  • Scan container images before deployment
  • Encrypt secrets using tools like Sealed Secrets or external vaults
  • Deploy admission controllers to enforce security policies

The $4 Million Wake-Up Call: Why Kubernetes Security Can’t Wait

Last March, a fintech company learned an expensive lesson about Kubernetes security. A misconfigured Role-Based Access Control policy gave a compromised service account cluster-admin privileges. Within hours, attackers had exfiltrated customer data and deployed cryptomining containers across their production cluster. The breach cost them $4 million in incident response, regulatory fines, and lost business.

This isn’t a cautionary tale from the dark corners of the internet—it’s a composite of real incidents happening right now. As Kubernetes becomes the de facto standard for container orchestration, it’s also becoming the primary target for attackers. The complex, distributed nature of Kubernetes introduces security challenges that traditional security tools weren’t designed to handle.

Here’s the uncomfortable truth: your Kubernetes cluster is probably vulnerable right now. But the good news? Most security issues stem from misconfigurations and gaps in understanding—problems you can fix starting today.


Understanding the Kubernetes Security Landscape

The Shared Responsibility Model in Kubernetes

Think of Kubernetes security like securing an apartment building. The building owner (cloud provider) handles the physical structure, locks on the main entrance, and fire suppression systems. But you’re responsible for locking your apartment door, securing your valuables, and not leaving windows open.

In Kubernetes terms:

Cloud Provider Responsibilities:

  • Physical infrastructure security
  • Host OS patching (in managed services)
  • API server availability
  • Control plane security (in managed services)

Your Responsibilities:

  • Container image security
  • Application code vulnerabilities
  • RBAC configuration
  • Network policies
  • Secrets management
  • Workload configurations
  • Runtime security

This division means that even on fully managed services like Google Kubernetes Engine or Amazon Elastic Kubernetes Service, the majority of security controls rest in your hands.


The 5 Layers of Kubernetes Security

Kubernetes security isn’t a single switch you flip—it’s a multi-layered defense strategy. The Kubernetes documentation describes this as the “4 C’s of Cloud Native Security” (Cloud, Cluster, Container, Code), but I’ve expanded it to 5 layers for comprehensive protection:

1. Cloud/Infrastructure Security

Your cluster is only as secure as the infrastructure it runs on. This layer includes:

  • Network segmentation: Isolate Kubernetes nodes in private subnets
  • IAM integration: Connect Kubernetes authentication to your cloud provider’s identity system
  • Compliance requirements: Ensure your infrastructure meets regulatory standards (SOC 2, PCI-DSS, HIPAA)
  • Infrastructure as Code scanning: Catch misconfigurations before deployment

Real-world example: A healthcare provider achieved HIPAA compliance by implementing private GKE clusters with VPC Service Controls, ensuring no traffic touched the public internet and all data remained within approved geographic boundaries.

2. Cluster Security

The cluster level is where most misconfigurations happen. Critical controls include:

Role-Based Access Control (RBAC): The cornerstone of Kubernetes security. RBAC defines who can do what in your cluster. Think of it as creating custom permission sets for different team members. A developer might get permissions to view and create pods in the development namespace but can’t delete resources or access production environments.

Here’s a practical example of a least-privilege role for developers:

# Create a Role with limited permissions
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: development
  name: developer-role
rules:
- apiGroups: ["", "apps"]
  resources: ["pods", "deployments", "services", "configmaps"]
  verbs: ["get", "list", "watch", "create", "update", "patch"]
  # Notice: NO "delete" verb and NO cluster-wide access
---
# Bind the Role to a user or group
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: developer-binding
  namespace: development
subjects:
- kind: User
  name: jane@company.com
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: developer-role
  apiGroup: rbac.authorization.k8s.io

This configuration creates a Role that allows developers to view and manage common resources in the development namespace but prevents them from deleting anything or accessing other namespaces. The RoleBinding connects the Role to a specific user.
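You can check that the binding behaves as intended with kubectl auth can-i, impersonating the user from the RoleBinding above (impersonation itself requires elevated rights, which cluster admins have):

```shell
# Should print "yes": the Role grants create on pods in development
kubectl auth can-i create pods --as jane@company.com -n development

# Should print "no": the Role deliberately omits the delete verb
kubectl auth can-i delete pods --as jane@company.com -n development
```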

Network Policies: Think of these as firewalls for your pods. By default, Kubernetes allows all pod-to-pod communication—a security nightmare. Network policies let you define rules about which pods can talk to each other.

Pod Security Standards: Kubernetes defines three standards (Privileged, Baseline, Restricted) that constrain what security contexts workloads may use. The built-in Pod Security Admission controller that enforces them became stable in Kubernetes 1.25, the same release that removed the older PodSecurityPolicy.
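Enforcement is opt-in per namespace, via labels read by the Pod Security Admission controller. A minimal sketch (namespace name illustrative):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    # Reject pods that violate the Restricted standard
    pod-security.kubernetes.io/enforce: restricted
    # Also warn on and audit-log violations of the same standard
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
```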

3. Container Security

Your containers are the execution units that actually run your code. Secure them by:

  • Scanning images for vulnerabilities using tools like Trivy, Grype, or commercial solutions
  • Using minimal base images (Alpine, distroless) to reduce attack surface
  • Implementing image signing with tools like Sigstore/Cosign to verify provenance
  • Running containers as non-root users whenever possible

Case study: After implementing mandatory image scanning, an e-commerce company discovered that 37% of their container images contained high or critical CVEs. By blocking vulnerable images at deployment time, they reduced their attack surface by 82%.

4. Code Security

The code inside your containers represents your custom application logic—and its unique vulnerabilities:

  • Software Composition Analysis (SCA): Scan dependencies for known vulnerabilities
  • Static Application Security Testing (SAST): Analyze source code for security flaws
  • Secrets scanning: Prevent credentials from being committed to repositories
  • Supply chain security: Verify the integrity of third-party dependencies
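For the secrets-scanning item, a scanner like gitleaks (assuming it is installed) can run locally or as a CI step:

```shell
# Scan the working tree and full git history for committed credentials
gitleaks detect --source . --verbose
```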

5. Compliance and Governance

Security without compliance is incomplete. This layer ensures you can prove your security posture:

  • Audit logging: Enable Kubernetes audit logs to track all API server requests
  • Policy enforcement: Use tools like OPA (Open Policy Agent) or Kyverno
  • Compliance scanning: Regular checks against benchmarks like CIS Kubernetes Benchmark
  • Incident response procedures: Document and practice security incident handling
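For the audit-logging item, the policy file that the API server's --audit-policy-file flag points at might look like this minimal sketch (rules are evaluated top-down, first match wins):

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Log Secret/ConfigMap access at Metadata level only (never log their contents)
- level: Metadata
  resources:
  - group: ""
    resources: ["secrets", "configmaps"]
# Log full request/response bodies for all write operations
- level: RequestResponse
  verbs: ["create", "update", "patch", "delete"]
# Skip read-only noise for everything else
- level: None
  verbs: ["get", "list", "watch"]
# Catch-all: log anything not matched above at Metadata level
- level: Metadata
```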

Critical Kubernetes Security Best Practices

1. Implement Zero Trust Architecture

Traditional perimeter security doesn’t work in Kubernetes. In a world where your “network perimeter” is constantly shifting as pods scale up and down, you need to verify every request.

How to implement:

  • Use mutual TLS (mTLS) for service-to-service communication (service meshes like Istio or Linkerd make this easier)
  • Require authentication for all API requests
  • Implement fine-grained authorization with RBAC
  • Continuously validate security posture, don’t just trust configurations
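With a service mesh, mesh-wide strict mTLS can be a single resource. An Istio sketch (assumes Istio is installed in the istio-system root namespace):

```yaml
# Require mTLS for all workloads in the mesh; plaintext traffic is rejected
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system   # root namespace makes this policy mesh-wide
spec:
  mtls:
    mode: STRICT
```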

Real-world application: A financial services company implemented Istio’s mTLS across their entire Kubernetes environment. When a container was compromised through a zero-day vulnerability, the blast radius was contained because the attacker couldn’t impersonate other services or make unauthorized API calls.

2. Master Secrets Management

Never store secrets in plain text or environment variables. Kubernetes Secrets are base64-encoded by default—that’s encoding, not encryption. Anyone with access to etcd or the API server can decode them instantly.
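The difference is easy to demonstrate, since base64 reverses with one command and no key:

```shell
# Encoding, not encryption: anyone can reverse base64
encoded=$(printf '%s' 's3cr3t' | base64)
echo "$encoded"                            # czNjcjN0
printf '%s' "$encoded" | base64 -d         # s3cr3t
```

The same applies to a live Secret: kubectl get secret db-secret -o jsonpath='{.data.password}' | base64 -d (secret name illustrative) prints the plaintext for anyone with read access to that Secret.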

Better approaches:

Encrypt secrets at rest: Enable encryption at the etcd level so secrets are encrypted when stored. Here’s how to configure encryption providers:

# /etc/kubernetes/encryption-config.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: <BASE64_ENCODED_SECRET>
      - identity: {}  # Fallback for reading unencrypted secrets

Then configure the API server to use this encryption config:

# Add to kube-apiserver flags
--encryption-provider-config=/etc/kubernetes/encryption-config.yaml
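Note that enabling the provider only encrypts Secrets written from then on; the Kubernetes documentation recommends rewriting existing Secrets so they pass through the new provider:

```shell
# Re-store every existing Secret through the API server (and thus the new provider)
kubectl get secrets --all-namespaces -o json | kubectl replace -f -
```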

External secret management: Use the External Secrets Operator to pull secrets from vaults. This configuration pulls database credentials from AWS Secrets Manager:

# First, create a SecretStore pointing to AWS Secrets Manager
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: aws-secrets-manager
  namespace: production
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-west-2
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets-sa
---
# Then create an ExternalSecret that references it
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: database-credentials
  namespace: production
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: SecretStore
  target:
    name: db-secret
    creationPolicy: Owner
  data:
  - secretKey: username
    remoteRef:
      key: production/database
      property: username
  - secretKey: password
    remoteRef:
      key: production/database
      property: password

This setup automatically syncs secrets from AWS Secrets Manager into Kubernetes, with hourly refresh. Your application consumes the standard Kubernetes Secret, but the actual secret never lives in your git repository.
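On the application side nothing changes; a pod template consumes the synced Secret like any other (illustrative fragment):

```yaml
# Container spec fragment: surface the synced Secret as environment variables
containers:
- name: app
  image: myregistry.io/myapp:v1.0.0
  env:
  - name: DB_USERNAME
    valueFrom:
      secretKeyRef:
        name: db-secret   # the target Secret created by the ExternalSecret
        key: username
  - name: DB_PASSWORD
    valueFrom:
      secretKeyRef:
        name: db-secret
        key: password
```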

3. Enforce Network Segmentation

Default-allow networking is like leaving all doors in your office building unlocked. Implement network policies to create microsegmentation.

A practical approach starts with a default-deny policy that blocks all traffic:

# Deny all ingress and egress traffic by default
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}  # Applies to all pods in namespace
  policyTypes:
  - Ingress
  - Egress

Then you explicitly allow only the communication paths your application needs:

# Allow frontend pods to communicate with backend pods
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
      tier: api
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
          tier: web
    ports:
    - protocol: TCP
      port: 8080
---
# Allow backend to communicate with database
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-to-database
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: database
      tier: data
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: backend
          tier: api
    ports:
    - protocol: TCP
      port: 5432
---
# Allow egress to DNS for all pods
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          name: kube-system
    - podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53

This approach follows the principle of least privilege at the network level. If an attacker compromises your frontend, they can’t pivot to your database because that network path was never explicitly allowed.

4. Implement Runtime Security

Static security only catches what you scan for. Runtime security detects anomalous behavior during execution—things you might not have anticipated during design.

Deploy Falco for runtime threat detection:

# Install Falco using Helm
helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update

# Install with custom rules
helm install falco falcosecurity/falco \
  --namespace falco \
  --create-namespace \
  --set falco.grpc.enabled=true \
  --set falco.grpcOutput.enabled=true

Create custom Falco rules for your environment:

# custom-rules.yaml
- rule: Unauthorized Process in Container
  desc: Detect if an unauthorized process runs in a container
  condition: >
    spawned_process and
    container and
    not proc.name in (node, java, python, nginx, envoy)
  output: >
    Unauthorized process started in container
    (user=%user.name command=%proc.cmdline
    container_id=%container.id container_name=%container.name
    image=%container.image.repository)
  priority: WARNING
  tags: [container, process]

- rule: Sensitive File Access
  desc: Detect access to sensitive files
  condition: >
    open_read and
    container and
    fd.name in (/etc/shadow, /etc/sudoers, /root/.ssh/authorized_keys)
  output: >
    Sensitive file accessed
    (user=%user.name file=%fd.name
    container=%container.name image=%container.image.repository)
  priority: CRITICAL
  tags: [filesystem, security]

- rule: Kubernetes API Server Access from Pod
  desc: Detect when a pod tries to access the K8s API server
  condition: >
    outbound and
    fd.sip="10.96.0.1" and
    container
  output: >
    Pod attempting to access Kubernetes API server
    (pod=%k8s.pod.name namespace=%k8s.ns.name
    destination=%fd.rip:%fd.rport)
  priority: WARNING
  tags: [network, kubernetes]

Key capabilities:

  • Process monitoring: Alert on unexpected processes spawning in containers
  • Network behavior analysis: Detect unusual network connections
  • File integrity monitoring: Identify unauthorized file modifications
  • System call monitoring: Detect suspicious syscalls

Case study: A SaaS company deployed Falco in their production environment. Within the first week, it detected a privilege escalation attempt when a compromised container tried to access the Kubernetes API server token—an attack that their perimeter defenses had missed.


Technical Deep Dive: Securing the Kubernetes Supply Chain

<details> <summary><strong>🔧 Click to expand: Advanced Supply Chain Security Implementation</strong></summary>

The software supply chain has become a primary attack vector. The SolarWinds breach, Log4Shell, and countless other incidents prove that attackers target the development and deployment pipeline, not just production systems.

Image Signing and Verification with Sigstore

Sigstore’s Cosign signs container images either with a conventional key pair or, in keyless mode, with short-lived certificates tied to an OIDC identity, making it practical to verify that images haven’t been tampered with between build and deployment.

Generate a key pair and sign images:

# Generate a key pair (one-time setup)
cosign generate-key-pair

# Sign an image after building
cosign sign --key cosign.key myregistry.io/myapp:v1.2.3

# Verify the signature
cosign verify --key cosign.pub myregistry.io/myapp:v1.2.3

For keyless signing using OIDC (recommended for CI/CD):

# Sign using OIDC (GitHub Actions, GitLab CI, etc.)
cosign sign --oidc-issuer=https://token.actions.githubusercontent.com \
  myregistry.io/myapp:v1.2.3

# Verify using certificate and OIDC issuer
cosign verify --certificate-identity=your-identity \
  --certificate-oidc-issuer=https://token.actions.githubusercontent.com \
  myregistry.io/myapp:v1.2.3

Enforce signature verification with Sigstore Policy Controller:

# Install the policy controller
kubectl apply -f https://github.com/sigstore/policy-controller/releases/latest/download/policy-controller.yaml

# Create a ClusterImagePolicy requiring signed images
apiVersion: policy.sigstore.dev/v1beta1
kind: ClusterImagePolicy
metadata:
  name: require-signed-images
spec:
  images:
  - glob: "myregistry.io/**"
  authorities:
  - keyless:
      url: https://fulcio.sigstore.dev
      identities:
      - issuer: https://token.actions.githubusercontent.com
        subject: https://github.com/myorg/myrepo/.github/workflows/build.yml@refs/heads/main

This policy ensures that only images signed by your specific GitHub Actions workflow can run in the cluster.

Software Bill of Materials (SBOM)

SBOMs provide transparency into what’s actually in your container images—every library, dependency, and component.

Generate SBOM with Syft:

# Install Syft
curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh

# Generate SBOM for a container image
syft packages myregistry.io/myapp:v1.2.3 -o spdx-json > sbom.json

# Generate SBOM for a directory
syft packages dir:./app -o cyclonedx-json > sbom.json

# Generate SBOM during a BuildKit build (requires docker buildx)
docker buildx build --sbom=true -t myapp:latest .

Scan SBOM for vulnerabilities:

# Install Grype
curl -sSfL https://raw.githubusercontent.com/anchore/grype/main/install.sh | sh

# Scan the SBOM
grype sbom:./sbom.json

# Scan with specific severity threshold
grype sbom:./sbom.json --fail-on high

# Output in JSON for CI/CD integration
grype sbom:./sbom.json -o json > vulnerability-report.json

Integrate into CI/CD pipeline:

# GitHub Actions example
name: Container Security Scan
on: [push, pull_request]

jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      
      - name: Build image
        run: docker build -t myapp:${{ github.sha }} .
      
      - name: Generate SBOM
        uses: anchore/sbom-action@v0
        with:
          image: myapp:${{ github.sha }}
          format: spdx-json
          output-file: sbom.spdx.json
      
      - name: Scan for vulnerabilities
        uses: anchore/scan-action@v3
        with:
          sbom: sbom.spdx.json
          fail-build: true
          severity-cutoff: high

Admission Controller Policy Enforcement

Admission controllers sit between the API server and etcd, intercepting every request to create or modify resources.

Deploy Kyverno for Kubernetes-native policies:

# Install Kyverno
kubectl create -f https://github.com/kyverno/kyverno/releases/download/v1.10.0/install.yaml

Create policies to enforce security standards:

# Require resource limits on all containers
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-resource-limits
spec:
  validationFailureAction: enforce
  background: true
  rules:
  - name: validate-resources
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      message: "CPU and memory resource limits are required"
      pattern:
        spec:
          containers:
          - resources:
              limits:
                memory: "?*"
                cpu: "?*"
---
# Require images from approved registries only
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-image-registries
spec:
  validationFailureAction: enforce
  background: true
  rules:
  - name: validate-registries
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      message: "Images must come from approved registries"
      pattern:
        spec:
          containers:
          - image: "myregistry.io/* | gcr.io/myproject/*"
---
# Disallow privileged containers
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-privileged-containers
spec:
  validationFailureAction: enforce
  background: true
  rules:
  - name: validate-privileged
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      message: "Privileged containers are not allowed"
      pattern:
        spec:
          containers:
          - =(securityContext):
              =(privileged): false
---
# Require security context for all pods
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-security-context
spec:
  validationFailureAction: enforce
  background: true
  rules:
  - name: validate-security-context
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      message: "Security context is required"
      pattern:
        spec:
          securityContext:
            runAsNonRoot: true
            seccompProfile:
              type: RuntimeDefault
          containers:
          - securityContext:
              allowPrivilegeEscalation: false
              capabilities:
                drop:
                - ALL
              readOnlyRootFilesystem: true

Using OPA Gatekeeper for more complex policies:

# Install Gatekeeper
kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper/master/deploy/gatekeeper.yaml

# Define a ConstraintTemplate
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8srequiredlabels
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredLabels
      validation:
        openAPIV3Schema:
          type: object
          properties:
            labels:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredlabels
        
        violation[{"msg": msg, "details": {"missing_labels": missing}}] {
          provided := {label | input.review.object.metadata.labels[label]}
          required := {label | label := input.parameters.labels[_]}
          missing := required - provided
          count(missing) > 0
          msg := sprintf("Missing required labels: %v", [missing])
        }
---
# Use the template with a Constraint
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: require-owner-label
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Namespace", "Pod"]
  parameters:
    labels:
      - "owner"
      - "team"
      - "environment"

Continuous Compliance Scanning

Automate CIS Benchmark checks with kube-bench:

# Run as a job
kubectl apply -f https://raw.githubusercontent.com/aquasecurity/kube-bench/main/job.yaml

# Wait for completion
kubectl wait --for=condition=complete job/kube-bench -n default --timeout=60s

# Review results
kubectl logs job/kube-bench -n default

# Run on specific node type
kubectl apply -f - <<EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: kube-bench-master
spec:
  template:
    spec:
      hostPID: true
      nodeSelector:
        node-role.kubernetes.io/control-plane: ""
      tolerations:
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
      containers:
      - name: kube-bench
        image: aquasec/kube-bench:latest
        command: ["kube-bench", "run", "--targets", "master"]
        volumeMounts:
        - name: var-lib-etcd
          mountPath: /var/lib/etcd
          readOnly: true
        - name: etc-kubernetes
          mountPath: /etc/kubernetes
          readOnly: true
      restartPolicy: Never
      volumes:
      - name: var-lib-etcd
        hostPath:
          path: "/var/lib/etcd"
      - name: etc-kubernetes
        hostPath:
          path: "/etc/kubernetes"
EOF

Continuous scanning with automated remediation tracking:

# Create a CronJob for weekly scans
kubectl apply -f - <<EOF
apiVersion: batch/v1
kind: CronJob
metadata:
  name: kube-bench-scan
  namespace: security
spec:
  schedule: "0 2 * * 0"  # 2 AM every Sunday
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: kube-bench-sa
          containers:
          - name: kube-bench
            image: aquasec/kube-bench:latest
            command:
            - sh
            - -c
            - |
              kube-bench run --json > /tmp/results.json
              # Send results to your logging/monitoring system
              curl -X POST https://your-monitoring-endpoint.com/kube-bench \
                -H "Content-Type: application/json" \
                -d @/tmp/results.json
          restartPolicy: OnFailure
EOF

</details>


Common Kubernetes Security Vulnerabilities and How to Fix Them

Vulnerability #1: Overprivileged Service Accounts

The Problem: By default, every pod gets a service account with API access. Many organizations leave the default service account with excessive permissions, creating a privilege escalation pathway.

The Fix:

Create minimal service accounts for each application with only the permissions they need:

# Create a restricted service account
apiVersion: v1
kind: ServiceAccount
metadata:
  name: restricted-app-sa
  namespace: production
automountServiceAccountToken: false  # Disable by default
---
# Create minimal Role for the app
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: app-reader-role
  namespace: production
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "list", "watch"]
  # Only read access to ConfigMaps, nothing else
---
# Bind the Role to the ServiceAccount
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-reader-binding
  namespace: production
subjects:
- kind: ServiceAccount
  name: restricted-app-sa
  namespace: production
roleRef:
  kind: Role
  name: app-reader-role
  apiGroup: rbac.authorization.k8s.io
---
# Use the restricted service account in your deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-app
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: secure-app
  template:
    metadata:
      labels:
        app: secure-app
    spec:
      serviceAccountName: restricted-app-sa
      automountServiceAccountToken: false
      containers:
      - name: app
        image: myregistry.io/secure-app:v1.0.0

For pods that don’t need API access at all (most don’t), disable automatic mounting of the service account token. This prevents the pod from making any API calls, eliminating an entire attack vector.

Vulnerability #2: Containers Running as Root

The Problem: Running containers as root (UID 0) makes privilege escalation trivial if the container is compromised. An attacker who breaks out of a root container typically lands on the host as root, unless user namespace remapping is in place.

The Fix:

Configure comprehensive security contexts:

apiVersion: v1
kind: Pod
metadata:
  name: secure-pod
  namespace: production
spec:
  # Pod-level security context
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
    seccompProfile:
      type: RuntimeDefault
  
  containers:
  - name: app
    image: myregistry.io/myapp:latest
    
    # Container-level security context
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      runAsNonRoot: true
      runAsUser: 1000
      capabilities:
        drop:
        - ALL
        # Only add specific capabilities if absolutely needed
        # add:
        # - NET_BIND_SERVICE
    
    # Use a writable volume for temporary files
    volumeMounts:
    - name: tmp
      mountPath: /tmp
    - name: cache
      mountPath: /app/cache
  
  volumes:
  - name: tmp
    emptyDir: {}
  - name: cache
    emptyDir: {}

This configuration:

  • Forces the container to run as UID 1000 (non-root)
  • Prevents privilege escalation attempts
  • Makes the root filesystem read-only (prevents malware persistence)
  • Drops all Linux capabilities
  • Enables seccomp to restrict system calls
  • Provides writable volumes for temporary files

Build Dockerfile for non-root operation:

FROM node:18-alpine

# Create a non-root user
RUN addgroup -g 1000 appuser && \
    adduser -D -u 1000 -G appuser appuser

# Set up application directory
WORKDIR /app
COPY --chown=appuser:appuser package*.json ./
RUN npm ci --only=production

COPY --chown=appuser:appuser . .

# Switch to non-root user
USER appuser

EXPOSE 3000
CMD ["node", "server.js"]

Vulnerability #3: Exposed Kubernetes Dashboard

The Problem: The Kubernetes Dashboard, if publicly accessible and poorly configured, is a common attack vector. Attackers scan for exposed dashboards and exploit weak authentication.

The Fix:

Option 1: Secure the dashboard properly

# Deploy dashboard with recommended settings
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml

# Create an admin service account (bound to cluster-admin below for simplicity; scope it down in production)
kubectl apply -f - <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: dashboard-admin
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: dashboard-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: dashboard-admin
  namespace: kubernetes-dashboard
EOF

# Get the token for login
kubectl -n kubernetes-dashboard create token dashboard-admin

# Access via kubectl proxy (never expose publicly)
kubectl proxy

# Access at: http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/

Option 2: Use modern alternatives

# Install k9s (terminal-based UI)
brew install derailed/k9s/k9s
# Or download from https://github.com/derailed/k9s/releases

# Install Lens (desktop application)
# Download from https://k8slens.dev/

# Install Portainer for Kubernetes
kubectl apply -n portainer -f https://downloads.portainer.io/ce2-18/portainer.yaml

Never do this:

# DON'T expose dashboard with LoadBalancer or Ingress
apiVersion: v1
kind: Service
metadata:
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  type: LoadBalancer  # NEVER DO THIS
  ports:
  - port: 443
    targetPort: 8443

Essential Kubernetes Security Tools

Open Source Tools

1. Trivy – Comprehensive Vulnerability Scanner

# Install Trivy
brew install aquasecurity/trivy/trivy
# Or: wget -qO - https://aquasecurity.github.io/trivy-repo/deb/public.key | sudo apt-key add -

# Scan a container image
trivy image myregistry.io/myapp:v1.2.3

# Scan and fail on high/critical vulnerabilities
trivy image --severity HIGH,CRITICAL --exit-code 1 myregistry.io/myapp:v1.2.3

# Scan Kubernetes manifests
trivy config ./k8s-manifests/

# Scan a running cluster
trivy k8s --report summary cluster

# Generate SBOM
trivy image --format spdx-json myregistry.io/myapp:v1.2.3 > sbom.json

# Integrate into CI/CD
trivy image --format json --output results.json myregistry.io/myapp:v1.2.3

2. Falco – Runtime Security

Already covered in the Runtime Security section above.

3. Kyverno – Kubernetes-Native Policy Management

Already covered in the Admission Controller section above.

4. kube-bench – CIS Benchmark Compliance

Already covered in the Continuous Compliance section above.

5. kubescape – Security Posture Assessment

# Install kubescape
curl -s https://raw.githubusercontent.com/kubescape/kubescape/master/install.sh | /bin/bash

# Scan your cluster against multiple frameworks
kubescape scan framework nsa,mitre,cis-v1.23-t1.0.1

# Scan specific namespaces
kubescape scan --include-namespaces production,staging

# Generate detailed reports
kubescape scan framework nsa --format json --output results.json

# Scan YAML files before applying
kubescape scan *.yaml

# Fix issues automatically (where possible)
kubescape fix results.json

Commercial Platforms

  1. Aqua Security – Full-stack container security platform
  2. Sysdig Secure – Runtime security and compliance
  3. Prisma Cloud (Palo Alto) – Cloud-native application protection
  4. StackRox/Red Hat Advanced Cluster Security – Kubernetes-native security platform
  5. Snyk Container – Developer-first vulnerability management

Choosing the right tools: Start with open-source tools to build foundational security, then add commercial platforms as your security maturity and budget grow. A typical progression:

  • Phase 1: Trivy + kube-bench
  • Phase 2: Add Falco + Kyverno
  • Phase 3: Add commercial platform for advanced features and support

Building a Kubernetes Security Program: A Roadmap

Phase 1: Foundation (Weeks 1-4)

Week 1: Assessment and Quick Wins

# Audit current RBAC configuration
kubectl get clusterrolebindings -o json | \
  jq '.items[] | select(.subjects[]?.name=="system:anonymous") | .metadata.name'

# Check for pods explicitly running as root (runAsUser: 0);
# note that pods setting no runAsUser at all may still default to root
kubectl get pods --all-namespaces -o json | \
  jq '.items[] | select(.spec.securityContext.runAsUser==0 or .spec.containers[].securityContext.runAsUser==0) | .metadata.name'

# Identify pods without resource limits
kubectl get pods --all-namespaces -o json | \
  jq '.items[] | select(.spec.containers[].resources.limits==null) | "\(.metadata.namespace)/\(.metadata.name)"'

# Run initial CIS benchmark scan
kubectl apply -f https://raw.githubusercontent.com/aquasecurity/kube-bench/main/job.yaml
kubectl wait --for=condition=complete --timeout=120s job/kube-bench
kubectl logs job/kube-bench

Week 2: Enable Core Security Features

# Enable audit logging (for managed clusters, use provider's method)
# For self-managed clusters, add to kube-apiserver:
--audit-policy-file=/etc/kubernetes/audit-policy.yaml
--audit-log-path=/var/log/kubernetes/audit/audit.log
--audit-log-maxage=30
--audit-log-maxbackup=10
--audit-log-maxsize=100

# Deploy initial network policies
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
EOF
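A blanket egress deny also blocks DNS lookups, which breaks most workloads. Pair it with an allow rule for cluster DNS. A sketch, assuming your DNS pods carry the conventional k8s-app: kube-dns label (adjust if yours are labeled differently):

```shell
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector: {}
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
EOF
```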

Week 3: Container Image Security

# Set up image scanning in CI/CD
# Add to your pipeline:
- name: Scan image
  run: |
    trivy image --severity HIGH,CRITICAL --exit-code 1 $IMAGE_NAME

# Deploy admission controller to enforce scanning
kubectl apply -f https://github.com/kyverno/kyverno/releases/download/v1.10.0/install.yaml

Week 4: Pod Security Standards

# Label namespaces with pod security standards
kubectl label namespace production \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/audit=restricted \
  pod-security.kubernetes.io/warn=restricted

kubectl label namespace staging \
  pod-security.kubernetes.io/enforce=baseline \
  pod-security.kubernetes.io/audit=restricted \
  pod-security.kubernetes.io/warn=restricted
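Before enforcing restricted on a busy namespace, you can preview which workloads would violate it with a server-side dry run, which prints warnings without changing anything:

```shell
# Server-side dry run: prints warnings for pods that would violate
# the restricted profile, without actually relabeling the namespace
kubectl label --dry-run=server --overwrite namespace production \
  pod-security.kubernetes.io/enforce=restricted
```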

Phase 2: Hardening (Months 2-3)

Deploy secrets management:

# Option 1: Sealed Secrets
kubectl apply -f https://github.com/bitnami-labs/sealed-secrets/releases/download/v0.24.0/controller.yaml

# Install kubeseal CLI
brew install kubeseal

# Seal a secret
kubectl create secret generic mysecret --dry-run=client --from-literal=password=mypassword -o yaml | \
  kubeseal -o yaml > mysealedsecret.yaml

# Option 2: External Secrets Operator
helm repo add external-secrets https://charts.external-secrets.io
helm install external-secrets external-secrets/external-secrets -n external-secrets-system --create-namespace
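With the operator installed, an ExternalSecret resource syncs a value from your external backend into an ordinary Kubernetes Secret. A sketch; the store name vault-backend and the prod/db path are placeholders for your own configuration:

```shell
kubectl apply -f - <<EOF
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
  namespace: production
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend   # a SecretStore/ClusterSecretStore you have configured
    kind: ClusterSecretStore
  target:
    name: db-credentials  # the Kubernetes Secret to create and keep in sync
  data:
  - secretKey: password
    remoteRef:
      key: prod/db        # path in the external secret backend
      property: password
EOF
```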

Enable encryption at rest:

# For managed Kubernetes (GKE example)
gcloud container clusters update my-cluster \
  --database-encryption-key projects/my-project/locations/us-central1/keyRings/my-keyring/cryptoKeys/my-key

# For self-managed clusters, configure encryption provider (shown earlier)

Phase 3: Advanced Security (Months 4-6)

Deploy service mesh for zero-trust:

# Install Istio
curl -L https://istio.io/downloadIstio | sh -
cd istio-*
export PATH=$PWD/bin:$PATH

# Install with mTLS strict mode
istioctl install --set profile=default -y

# Enable automatic mTLS
kubectl apply -f - <<EOF
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
EOF

Set up continuous monitoring:

# Deploy Prometheus and Grafana for security metrics
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/kube-prometheus-stack -n monitoring --create-namespace

# Create security dashboards tracking:
# - Failed authentication attempts
# - Policy violations
# - Runtime security events
# - Vulnerability trends
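Dashboards are only half the job; alert on the same signals. A minimal PrometheusRule sketch that fires on a sustained rate of unauthenticated API requests; the threshold is an assumption to tune for your cluster:

```shell
kubectl apply -f - <<EOF
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: k8s-security-alerts
  namespace: monitoring
spec:
  groups:
  - name: security
    rules:
    - alert: HighAuthFailureRate
      # Sustained unauthenticated requests against the API server
      expr: sum(rate(apiserver_request_total{code="401"}[5m])) > 1
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: Elevated rate of 401 responses from the API server
EOF
```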

Phase 4: Continuous Improvement (Ongoing)

Automate security testing:

# Create a security test suite
cat > security-tests.sh <<'EOF'
#!/bin/bash
set -e

echo "Running security tests..."

# Test 1: No pods explicitly running as root (runAsUser: 0)
echo "Checking for root containers..."
ROOT_PODS=$(kubectl get pods --all-namespaces -o json | \
  jq '.items[] | select(.spec.securityContext.runAsUser==0 or .spec.containers[].securityContext.runAsUser==0) | "\(.metadata.namespace)/\(.metadata.name)"' | wc -l)
if [ "$ROOT_PODS" -gt 0 ]; then
  echo "FAIL: Found $ROOT_PODS pods running as root"
  exit 1
fi

# Test 2: All images from approved registries
echo "Checking image registries..."
UNAPPROVED=$(kubectl get pods --all-namespaces -o json | \
  jq -r '.items[].spec.containers[].image' | \
  grep -v "myregistry.io\|gcr.io/myproject" | wc -l)
if [ "$UNAPPROVED" -gt 0 ]; then
  echo "FAIL: Found $UNAPPROVED images from unapproved registries"
  exit 1
fi

# Test 3: Network policies exist
echo "Checking network policies..."
for ns in production staging; do
  POLICIES=$(kubectl get networkpolicies -n $ns --no-headers | wc -l)
  if [ "$POLICIES" -eq 0 ]; then
    echo "FAIL: No network policies in namespace $ns"
    exit 1
  fi
done

echo "All security tests passed!"
EOF

chmod +x security-tests.sh

# Run in CI/CD
./security-tests.sh

Frequently Asked Questions

Q: How do I secure Kubernetes clusters in a multi-tenant environment?

A: Multi-tenancy requires additional isolation layers. Here’s a comprehensive approach:

# 1. Create isolated namespaces per tenant
apiVersion: v1
kind: Namespace
metadata:
  name: tenant-acme
  labels:
    tenant: acme
    environment: production
---
# 2. Implement strict RBAC per tenant
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: tenant-admin
  namespace: tenant-acme
rules:
- apiGroups: ["*"]
  resources: ["*"]
  verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tenant-acme-admin
  namespace: tenant-acme
subjects:
- kind: User
  name: admin@acme.com
roleRef:
  kind: Role
  name: tenant-admin
  apiGroup: rbac.authorization.k8s.io
---
# 3. Network isolation between tenants
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-cross-tenant
  namespace: tenant-acme
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          tenant: acme
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          tenant: acme
---
# 4. Resource quotas per tenant
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-acme-quota
  namespace: tenant-acme
spec:
  hard:
    requests.cpu: "100"
    requests.memory: 200Gi
    persistentvolumeclaims: "50"
    pods: "100"

Use separate node pools with taints:

# Create node pool for tenant
gcloud container node-pools create tenant-acme-pool \
  --cluster=my-cluster \
  --node-taints=tenant=acme:NoSchedule \
  --node-labels=tenant=acme

# In pod spec, add toleration
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: tenant-app
  namespace: tenant-acme
spec:
  tolerations:
  - key: tenant
    operator: Equal
    value: acme
    effect: NoSchedule
  nodeSelector:
    tenant: acme
  containers:
  - name: app
    image: myapp:latest
EOF

For highest security, consider virtual clusters using vcluster:

# Install vcluster
curl -s -L "https://github.com/loft-sh/vcluster/releases/latest" | sed -nE 's!.*"([^"]*vcluster-linux-amd64)".*!https://github.com\1!p' | xargs -n 1 curl -L -o vcluster && chmod +x vcluster

# Create virtual cluster for tenant
vcluster create tenant-acme -n host-namespace

# Connect to virtual cluster
vcluster connect tenant-acme -n host-namespace

Q: What’s the difference between Pod Security Policies and Pod Security Standards?

A: Pod Security Policies (PSPs) were deprecated in Kubernetes 1.21 and removed in 1.25. Pod Security Standards (PSS) replace them with three predefined profiles (privileged, baseline, restricted) that you apply per namespace via labels, enforced by the built-in Pod Security Admission controller, so there is nothing extra to install or maintain.

Migration from PSP to PSS:

# Check current PSP usage
kubectl get psp

# Label namespaces for PSS
kubectl label namespace production \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/audit=restricted \
  pod-security.kubernetes.io/warn=restricted

# Monitor violations without blocking
kubectl label namespace staging \
  pod-security.kubernetes.io/enforce=baseline \
  pod-security.kubernetes.io/audit=restricted \
  pod-security.kubernetes.io/warn=restricted

# Check for violations
kubectl get events -n production | grep "violates PodSecurity"

Q: Should I scan images before or after pushing to the registry?

A: Both. Here’s the complete workflow:

# 1. Scan during build (shift-left)
docker build -t myapp:latest .
trivy image --severity HIGH,CRITICAL --exit-code 1 myapp:latest

# 2. Sign and push if scan passes
cosign sign --key cosign.key myregistry.io/myapp:latest
docker push myregistry.io/myapp:latest

# 3. Continuous registry scanning
# Deploy Trivy operator for continuous monitoring
kubectl apply -f https://raw.githubusercontent.com/aquasecurity/trivy-operator/main/deploy/static/trivy-operator.yaml

# 4. Pre-deployment validation
trivy image --severity HIGH,CRITICAL myregistry.io/myapp:latest

Complete CI/CD integration:

# GitLab CI example
stages:
  - build
  - scan
  - sign
  - deploy

build:
  stage: build
  script:
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .

scan:
  stage: scan
  script:
    - trivy image --severity HIGH,CRITICAL --exit-code 1 $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
    - trivy image --format json --output scan-results.json $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
  artifacts:
    reports:
      container_scanning: scan-results.json

sign:
  stage: sign
  script:
    - cosign sign --key $COSIGN_KEY $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
  only:
    - main

deploy:
  stage: deploy
  script:
    - kubectl set image deployment/myapp app=$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
  only:
    - main
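Signing is only useful if the cluster refuses unsigned images. A Kyverno sketch that enforces cosign verification at admission; the registry pattern and the public key are placeholders for your own:

```shell
kubectl apply -f - <<EOF
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-image-signatures
spec:
  validationFailureAction: Enforce
  webhookTimeoutSeconds: 30
  rules:
  - name: require-cosign-signature
    match:
      any:
      - resources:
          kinds:
          - Pod
    verifyImages:
    - imageReferences:
      - "myregistry.io/*"
      attestors:
      - entries:
        - keys:
            publicKeys: |-
              -----BEGIN PUBLIC KEY-----
              ...your cosign.pub contents...
              -----END PUBLIC KEY-----
EOF
```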

Q: How can I detect if my Kubernetes cluster has been compromised?

A: Implement comprehensive monitoring and detection:

1. Deploy Falco with alerting:

# Falco configuration with Slack alerts
apiVersion: v1
kind: ConfigMap
metadata:
  name: falco-config
  namespace: falco
data:
  falco.yaml: |
    json_output: true
    json_include_output_property: true
    http_output:
      enabled: true
      url: "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"
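Beyond the stock ruleset, local rules catch the behaviors you care about most. A sketch of a custom rule file (falco_rules.local.yaml); the spawned_process and container macros ship with Falco's default rules:

```yaml
# falco_rules.local.yaml - loaded after the default ruleset
- rule: Shell Spawned in Container
  desc: Detect an interactive shell started inside a container
  condition: >
    spawned_process and container and proc.name in (bash, sh, zsh)
  output: >
    Shell in container (user=%user.name container=%container.name
    command=%proc.cmdline)
  priority: WARNING
  tags: [container, shell]
```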

2. Enable comprehensive audit logging:

# /etc/kubernetes/audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Log pod exec and port-forward requests
- level: Metadata
  omitStages:
  - RequestReceived
  verbs: ["create"]
  resources:
  - group: ""
    resources: ["pods/exec", "pods/portforward"]

# Log secret access
- level: Metadata
  resources:
  - group: ""
    resources: ["secrets"]

# Log role binding changes
- level: RequestResponse
  verbs: ["create", "update", "patch", "delete"]
  resources:
  - group: "rbac.authorization.k8s.io"
    resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]

3. Monitor for indicators of compromise:

# Create monitoring script
cat > monitor-ioc.sh <<'EOF'
#!/bin/bash

# List workloads outside the system namespaces for manual review
echo "Listing pods outside system namespaces..."
kubectl get pods --all-namespaces -o json | \
  jq -r '.items[] | select(.metadata.namespace | IN("kube-system", "kube-public", "default") | not) | "\(.metadata.namespace)/\(.metadata.name)"'

# Check for suspicious resource usage
echo "Checking for resource spikes..."
kubectl top nodes
kubectl top pods --all-namespaces --sort-by=cpu | head -20

# Check audit logs for anomalies
echo "Checking audit logs..."
grep "Forbidden\|Unauthorized" /var/log/kubernetes/audit/audit.log | tail -20

# Check for privilege escalation attempts
echo "Checking for privilege escalation..."
kubectl get events --all-namespaces | grep "escalation"
EOF

Q: What’s the biggest Kubernetes security mistake organizations make?

A: Treating Kubernetes security as a one-time configuration. Security is a continuous process. Here’s a checklist for ongoing security:

# Create weekly security checklist script
cat > weekly-security-check.sh <<'EOF'
#!/bin/bash

echo "=== Weekly Kubernetes Security Checklist ==="
echo "Run date: $(date)"
echo

# 1. Update security tools
echo "1. Updating security tools..."
trivy image --download-db-only

# 2. Scan for new vulnerabilities
echo "2. Scanning for vulnerabilities..."
trivy k8s --report summary cluster

# 3. Check CIS compliance
echo "3. Running CIS benchmark..."
kubectl apply -f https://raw.githubusercontent.com/aquasecurity/kube-bench/main/job.yaml
kubectl wait --for=condition=complete --timeout=120s job/kube-bench
kubectl logs job/kube-bench | grep "\[FAIL\]"

# 4. Review RBAC changes from last week (GNU date syntax; use gdate on macOS)
echo "4. Reviewing RBAC changes from last week..."
kubectl get clusterrolebindings -o json | \
  jq '.items[] | select(.metadata.creationTimestamp > "'$(date -d '7 days ago' -Iseconds)'") | .metadata.name'

# 5. Check for exposed services
echo "5. Checking for LoadBalancer services..."
kubectl get svc --all-namespaces -o json | \
  jq -r '.items[] | select(.spec.type=="LoadBalancer") | "\(.metadata.namespace)/\(.metadata.name)"'

# 6. Review network policies
echo "6. Checking namespaces without network policies..."
for ns in $(kubectl get ns -o jsonpath='{.items[*].metadata.name}'); do
  policies=$(kubectl get networkpolicies -n $ns --no-headers 2>/dev/null | wc -l)
  if [ "$policies" -eq 0 ]; then
    echo "WARNING: No network policies in namespace: $ns"
  fi
done

echo
echo "=== Security check complete ==="
EOF

chmod +x weekly-security-check.sh

# Run weekly
crontab -e
# Add: 0 9 * * 1 /path/to/weekly-security-check.sh | mail -s "Weekly K8s Security Report" security@company.com

Take Action: Your Next Steps in Kubernetes Security

Kubernetes security isn’t optional—it’s fundamental to running production workloads safely. The threat landscape is evolving rapidly, but you don’t have to tackle everything at once.

Start here:

1. Audit your current state – Run these commands now:

# Quick security audit
kubectl apply -f https://raw.githubusercontent.com/aquasecurity/kube-bench/main/job.yaml
kubectl wait --for=condition=complete --timeout=120s job/kube-bench
kubectl logs job/kube-bench

# Check for common misconfigurations
kubescape scan framework nsa --verbose

2. Fix the low-hanging fruit – Apply these immediately:

# Enable pod security standards on all namespaces
for ns in $(kubectl get ns -o jsonpath='{.items[*].metadata.name}'); do
  kubectl label namespace $ns \
    pod-security.kubernetes.io/enforce=baseline \
    pod-security.kubernetes.io/warn=restricted \
    --overwrite
done

# Deploy default-deny network policy template
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
EOF

3. Build incrementally – Follow the roadmap above, focusing on one phase at a time. Don’t try to implement everything simultaneously.

4. Make security cultural – Include security discussions in sprint planning and retrospectives. When security is part of normal workflow, it doesn’t feel like an add-on burden.
