Introduction
In the rapidly evolving landscape of cloud-native technologies, managing complex Kubernetes environments has become increasingly challenging. Kagent is an open-source framework that brings agentic AI directly into your Kubernetes clusters. This tutorial walks you through everything you need to know about Kagent, from basic installation to building AI agents that can autonomously manage your infrastructure.
What is Kagent?
Kagent is a Kubernetes-native framework designed specifically for building, deploying, and managing AI agents within Kubernetes environments. Unlike traditional automation tools that execute predetermined scripts, Kagent leverages Large Language Models (LLMs) to create intelligent agents capable of multi-step reasoning, dynamic problem-solving, and adaptive decision-making.
Key Features
- Declarative Configuration: Define agents and tools using simple YAML files, making them portable and easy to share across teams.
- Multi-Provider Support: Works seamlessly with OpenAI, Anthropic Claude, Google Gemini, Azure OpenAI, Ollama, and custom AI gateways.
- Model Context Protocol (MCP): Integrates with any MCP server, providing access to tools for Kubernetes, Istio, Helm, Argo, Prometheus, Grafana, and more.
- Kubernetes-Native: Runs directly inside your cluster with full access to cluster state, APIs, and observability data.
- Observable: Built-in OpenTelemetry tracing allows you to monitor agent behavior and tool execution in real-time.
- Extensible: Create custom tools and agents tailored to your specific infrastructure needs.
Prerequisites
Before diving into Kagent, ensure you have the following:
Required Tools
- Kubernetes Cluster: Minikube, Kind, K3s, or any production cluster
- kubectl: Kubernetes command-line tool
- Helm: Package manager for Kubernetes (version 3.x or higher)
- LLM API Key: From OpenAI, Anthropic, Google, or your preferred provider
Minimum System Requirements
- CPU: 4 cores minimum
- Memory: 8GB RAM minimum
- Disk Space: 20GB available
Knowledge Prerequisites
- Basic understanding of Kubernetes concepts (Pods, Deployments, Services)
- Familiarity with YAML syntax
- Understanding of REST APIs and webhooks
Installation Guide
Let’s walk through the complete installation process step by step.
Step 1: Set Up Your Kubernetes Cluster
If you don’t have a cluster yet, create one using Kind:
```bash
# Create a Kind cluster
kind create cluster --name kagent-demo

# Verify the cluster is running
kubectl cluster-info
kubectl get nodes
```
For Minikube users:
```bash
# Start Minikube with adequate resources
minikube start --cpus=4 --memory=8192 --driver=docker

# Verify the cluster
minikube status
```
Step 2: Configure Your LLM Provider
Set your API key as an environment variable. Choose the provider you prefer:
For OpenAI:
```bash
export OPENAI_API_KEY="sk-your-openai-key-here"
```
For Anthropic Claude:
```bash
export ANTHROPIC_API_KEY="sk-ant-your-anthropic-key-here"
```
For Google Gemini:
```bash
export GOOGLE_API_KEY="your-google-api-key-here"
```
Step 3: Install Kagent Using CLI
Download and install the Kagent CLI:
```bash
# Download the installation script
curl https://raw.githubusercontent.com/kagent-dev/kagent/refs/heads/main/scripts/get-kagent | bash

# Verify installation
kagent version
```
Install Kagent with default demo agents:
```bash
# Install with demo profile (includes sample agents)
kagent install --api-key $OPENAI_API_KEY

# Or install minimal profile (no default agents)
kagent install --api-key $OPENAI_API_KEY --profile minimal
```
Step 4: Install Kagent Using Helm
Alternatively, install using Helm for more control:
```bash
# Create namespace
kubectl create namespace kagent

# Install Kagent CRDs
helm install kagent-crds oci://ghcr.io/kagent-dev/kagent/helm/kagent-crds \
  --namespace kagent

# Install Kagent with your preferred provider
helm upgrade --install kagent oci://ghcr.io/kagent-dev/kagent/helm/kagent \
  --namespace kagent \
  --set providers.default=openai \
  --set providers.openai.apiKey=$OPENAI_API_KEY
```
Step 5: Verify Installation
Check that all Kagent components are running:
```bash
# Verify CRDs are installed
kubectl get crd | grep kagent.dev

# Check kagent pods
kubectl get pods -n kagent

# Expected output:
# NAME                       READY   STATUS    RESTARTS   AGE
# kagent-controller-xxxxx    1/1     Running   0          2m
# kagent-ui-xxxxx            1/1     Running   0          2m
# kagent-tools-xxxxx         1/1     Running   0          2m
```
Step 6: Access the Kagent Dashboard
Launch the web-based dashboard:
```bash
# Start port-forwarding
kagent dashboard

# Or manually:
kubectl port-forward service/kagent-ui 8080:80 -n kagent
```
Open your browser and navigate to http://localhost:8080
Understanding Core Components
Kagent’s architecture consists of four main components that work together seamlessly.
1. Controller
The Kagent controller is a Kubernetes controller that watches custom resources (CRDs) and creates the necessary infrastructure to run your agents. It handles:
- Agent lifecycle management
- Tool server registration
- Model configuration
- Resource reconciliation
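Resource reconciliation follows the standard Kubernetes controller pattern: compare the desired state declared in the CRDs against the actual state in the cluster and converge the two. A minimal, framework-free sketch of that loop (illustrative of the pattern only, not Kagent's internal code):

```python
# Illustrative reconcile loop: compute the actions needed to make the
# cluster's actual state match the desired state from the CRDs.
# This is a generic sketch of the controller pattern, not Kagent source.

def reconcile(desired: dict, actual: dict) -> list[str]:
    """Return the actions needed to converge `actual` toward `desired`."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(f"create {name}")
        elif actual[name] != spec:
            actions.append(f"update {name}")
    for name in actual:
        if name not in desired:
            actions.append(f"delete {name}")
    return actions

desired = {"agent-deployment": {"replicas": 1}, "agent-service": {"port": 8080}}
actual = {"agent-deployment": {"replicas": 0}, "stale-configmap": {}}
print(reconcile(desired, actual))
```

A real controller runs this comparison continuously on watch events, so a manually deleted agent Deployment gets recreated automatically.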
2. User Interface (UI)
The web-based UI provides a visual interface for:
- Creating and managing agents
- Chatting with agents in real-time
- Viewing agent execution traces
- Managing tools and model configurations
- Monitoring agent performance
3. Engine
The engine executes your agents using Google’s Agent Development Kit (ADK). It handles:
- LLM communication
- Tool invocation
- Context management
- Response streaming
4. CLI
The command-line interface enables:
- Installation and upgrades
- Agent deployment from YAML
- Direct agent invocation
- Dashboard access
Deep Dive: Kagent YAML Files
Let’s explore each type of YAML configuration file in detail. Understanding these files is crucial for building effective agents.
1. Agent YAML Configuration
The Agent is the core building block of Kagent. It represents an autonomous AI entity with specific instructions, tools, and capabilities.
Basic Agent Structure
```yaml
apiVersion: kagent.dev/v1alpha2
kind: Agent
metadata:
  name: my-k8s-agent
  namespace: kagent
spec:
  description: "A helpful Kubernetes assistant"
  type: Declarative
  declarative:
    modelConfig: default-model-config
    systemMessage: |
      You are an expert Kubernetes administrator with deep knowledge of:
      - Pod management and troubleshooting
      - Deployment strategies and rollbacks
      - Service networking and ingress
      - Resource optimization

      ## Instructions
      - Always verify cluster state before making changes
      - Provide clear explanations of actions taken
      - If uncertain, ask for clarification
      - Format responses in Markdown

      ## Safety Guidelines
      - Never delete production resources without confirmation
      - Always check namespace before operations
      - Validate YAML before applying changes
    tools:
      - type: McpServer
        mcpServer:
          name: kagent-tool-server
          kind: RemoteMCPServer
          apiGroup: kagent.dev
          toolNames:
            - k8s_get_resources
            - k8s_get_pod_logs
            - k8s_get_cluster_configuration
```
Agent YAML Field Explanations
apiVersion: Specifies the API version. Currently kagent.dev/v1alpha2 for Agents.
kind: Defines the resource type. Use Agent for AI agents.
metadata.name: Unique identifier for your agent. Use descriptive names like k8s-troubleshooter or helm-installer.
metadata.namespace: Kubernetes namespace where the agent runs. Typically kagent.
spec.description: Human-readable description displayed in the UI. Helps users understand the agent’s purpose.
spec.type: Agent type. Options:
- Declarative: Configuration defined directly in YAML
- BYO (Bring Your Own): Custom agent implementation
spec.declarative.modelConfig: Reference to a ModelConfig resource that defines which LLM to use.
spec.declarative.systemMessage: The agent’s core instructions and personality. This is equivalent to the system prompt in direct LLM interactions. Key sections:
- Role Definition: Who the agent is and what expertise it has
- Instructions: Specific behavioral guidelines
- Response Format: How to structure answers
- Safety Guidelines: Critical constraints and validations
spec.declarative.tools: Array of tools the agent can use. Each tool definition includes:
- type: Tool type (McpServer, HttpTool, etc.)
- mcpServer.name: Name of the MCP server providing the tool
- mcpServer.kind: Resource type (MCPServer or RemoteMCPServer)
- mcpServer.toolNames: Specific tools to enable from the server
Advanced Agent Example with Multiple Tool Servers
```yaml
apiVersion: kagent.dev/v1alpha2
kind: Agent
metadata:
  name: devops-super-agent
  namespace: kagent
  labels:
    team: platform-engineering
    environment: production
spec:
  description: "Advanced DevOps agent for deployment and monitoring"
  type: Declarative
  declarative:
    modelConfig: claude-sonnet-config
    systemMessage: |
      You are a senior DevOps engineer responsible for:

      ## Core Responsibilities
      1. Application deployment and rollback
      2. Infrastructure monitoring and alerting
      3. Performance optimization
      4. Incident response

      ## Deployment Workflow
      - Validate Helm charts before installation
      - Check resource availability
      - Apply progressive rollout strategies
      - Monitor deployment health
      - Rollback if metrics degrade

      ## Monitoring Workflow
      - Query Prometheus for metrics
      - Analyze Grafana dashboards
      - Correlate logs with metrics
      - Create actionable alerts

      ## Communication Style
      - Be concise but thorough
      - Use code blocks for commands
      - Explain technical decisions
      - Provide next steps
    tools:
      # Kubernetes operations
      - type: McpServer
        mcpServer:
          name: kagent-tool-server
          kind: RemoteMCPServer
          apiGroup: kagent.dev
          toolNames:
            - k8s_get_resources
            - k8s_apply_yaml
            - k8s_delete_resource
            - k8s_get_pod_logs
            - k8s_exec_pod
      # Helm operations
      - type: McpServer
        mcpServer:
          name: kagent-tool-server
          kind: RemoteMCPServer
          apiGroup: kagent.dev
          toolNames:
            - helm_list_releases
            - helm_install
            - helm_upgrade
            - helm_rollback
            - helm_get_values
      # Prometheus monitoring
      - type: McpServer
        mcpServer:
          name: kagent-tool-server
          kind: RemoteMCPServer
          apiGroup: kagent.dev
          toolNames:
            - prometheus_query
            - prometheus_query_range
      # Grafana dashboards
      - type: McpServer
        mcpServer:
          name: kagent-tool-server
          kind: RemoteMCPServer
          apiGroup: kagent.dev
          toolNames:
            - grafana_get_dashboard
            - grafana_search_dashboards
```
Agent with A2A (Agent-to-Agent) Protocol
```yaml
apiVersion: kagent.dev/v1alpha2
kind: Agent
metadata:
  name: k8s-a2a-agent
  namespace: kagent
spec:
  description: "Kubernetes agent with A2A capabilities"
  type: Declarative
  declarative:
    modelConfig: default-model-config
    systemMessage: |
      You are an expert Kubernetes agent that uses tools to help users.
      When asked to retrieve information:
      1. Use the appropriate k8s tool
      2. Parse the response
      3. Present it in a clear format
    tools:
      - type: McpServer
        mcpServer:
          name: kagent-tool-server
          kind: RemoteMCPServer
          toolNames:
            - k8s_get_resources
            - k8s_get_available_api_resources
    # A2A configuration exposes agent capabilities
    a2aConfig:
      skills:
        - id: get-resources-skill
          name: Get Resources
          description: Get resources in the Kubernetes cluster
          inputModes:
            - text
          outputModes:
            - text
          tags:
            - k8s
            - resources
          examples:
            - "Get all pods in the default namespace"
            - "List services in istio-system"
            - "Show deployments across all namespaces"
```
2. ModelConfig YAML Configuration
ModelConfig defines how Kagent connects to LLM providers. Each provider has specific configuration requirements.
OpenAI ModelConfig
```yaml
apiVersion: kagent.dev/v1alpha2
kind: ModelConfig
metadata:
  name: gpt4-model-config
  namespace: kagent
spec:
  # Provider type
  provider: OpenAI
  # Model identifier
  model: gpt-4o
  # API key from Kubernetes Secret
  apiKeySecret: kagent-openai
  apiKeySecretKey: OPENAI_API_KEY
  # OpenAI-specific configuration
  openAI:
    # Optional: custom base URL for OpenAI-compatible APIs
    baseUrl: https://api.openai.com/v1
    # Optional: organization ID
    organization: org-xxxxx
  # Model capabilities metadata
  modelInfo:
    family: gpt
    functionCalling: true
    jsonOutput: true
    multipleSystemMessages: false
    structuredOutput: true
    vision: true
```
ModelConfig Field Explanations
apiKeySecret: Name of the Kubernetes Secret containing your API key. Create it with:
```bash
kubectl create secret generic kagent-openai \
  --from-literal=OPENAI_API_KEY=$OPENAI_API_KEY \
  -n kagent
```
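Note that Kubernetes stores Secret values base64-encoded, which can be surprising when you inspect the Secret with kubectl get secret -o yaml and see an opaque string instead of your key. The encoding is plain base64, so a few lines reproduce what kubectl's --from-literal does and what the pod decodes back (illustrative only; the key value is a placeholder):

```python
import base64

# Kubernetes stores Secret data base64-encoded; kubectl's --from-literal
# encodes it for you, and the kubelet decodes it when mounting into a pod.
api_key = "sk-your-openai-key-here"          # placeholder value
encoded = base64.b64encode(api_key.encode()).decode()  # what appears in the Secret
decoded = base64.b64decode(encoded).decode()           # what the pod sees

assert decoded == api_key
print(encoded)
```

Base64 is an encoding, not encryption, so anyone with read access to the Secret can recover the key.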
apiKeySecretKey: The key name within the Secret where the API key is stored.
provider: LLM provider. Supported values: OpenAI, Anthropic, Gemini, AzureOpenAI, Ollama, VertexAI.
model: Specific model name/version to use. Examples:
- OpenAI: gpt-4o, gpt-4-turbo, gpt-3.5-turbo
- Anthropic: claude-sonnet-4, claude-opus-4
- Gemini: gemini-2.5-pro, gemini-2.0-flash
modelInfo: Capability flags that inform Kagent what the model can do:
- functionCalling: Supports tool/function calling
- jsonOutput: Can output structured JSON
- multipleSystemMessages: Handles multiple system prompts
- structuredOutput: Supports structured output schemas
- vision: Can process images
Anthropic Claude ModelConfig
```yaml
apiVersion: kagent.dev/v1alpha2
kind: ModelConfig
metadata:
  name: claude-sonnet-config
  namespace: kagent
spec:
  provider: Anthropic
  model: claude-sonnet-4-20250514
  apiKeySecret: kagent-anthropic
  apiKeySecretKey: ANTHROPIC_API_KEY
  # Anthropic-specific settings
  anthropic:
    # API version
    apiVersion: "2023-06-01"
  modelInfo:
    family: claude
    functionCalling: true
    jsonOutput: true
    multipleSystemMessages: true
    structuredOutput: true
    vision: true
```
Google Gemini ModelConfig
```yaml
apiVersion: kagent.dev/v1alpha2
kind: ModelConfig
metadata:
  name: gemini-model-config
  namespace: kagent
spec:
  provider: Gemini
  model: gemini-2.5-pro
  apiKeySecret: kagent-gemini
  apiKeySecretKey: GOOGLE_API_KEY
  # Gemini-specific configuration
  gemini:
    # Optional: project ID for Vertex AI
    projectId: my-gcp-project
    # Optional: region for Vertex AI
    region: us-central1
  modelInfo:
    family: gemini
    functionCalling: true
    jsonOutput: true
    multipleSystemMessages: false
    structuredOutput: true
    vision: true
```
Azure OpenAI ModelConfig
```yaml
apiVersion: kagent.dev/v1alpha2
kind: ModelConfig
metadata:
  name: azure-openai-config
  namespace: kagent
spec:
  provider: AzureOpenAI
  model: gpt-4
  apiKeySecret: kagent-azure
  apiKeySecretKey: AZURE_OPENAI_API_KEY
  azureOpenAI:
    # Azure endpoint
    endpoint: https://your-resource.openai.azure.com/
    # Deployment name in Azure
    deploymentName: gpt-4-deployment
    # API version
    apiVersion: "2024-02-15-preview"
  modelInfo:
    family: gpt
    functionCalling: true
    jsonOutput: true
    structuredOutput: true
    vision: true
```
Ollama (Local Model) Configuration
```yaml
apiVersion: kagent.dev/v1alpha2
kind: ModelConfig
metadata:
  name: llama3-local-config
  namespace: kagent
spec:
  provider: Ollama
  model: llama3.2
  # No API key needed for local Ollama
  ollama:
    # Ollama service endpoint
    host: http://ollama.ollama.svc.cluster.local:11434
    # Optional: specific model version
    modelVersion: latest
  modelInfo:
    family: llama
    functionCalling: true
    jsonOutput: true
    multipleSystemMessages: false
    structuredOutput: false
    vision: false
```
Using LiteLLM Proxy for Multiple Providers
```yaml
apiVersion: kagent.dev/v1alpha2
kind: ModelConfig
metadata:
  name: bedrock-claude-config
  namespace: kagent
spec:
  provider: OpenAI  # LiteLLM exposes an OpenAI-compatible API
  model: bedrock-claude-3-5-sonnet
  apiKeySecret: bedrock-credentials
  apiKeySecretKey: AWS_ACCESS_KEY_ID
  openAI:
    # LiteLLM proxy endpoint
    baseUrl: http://litellm-service.default.svc.cluster.local:4000
  modelInfo:
    family: claude
    functionCalling: true
    jsonOutput: true
    structuredOutput: true
    vision: false
```
3. MCPServer YAML Configuration
MCPServer resources define custom MCP tool servers that run in your cluster. These extend agent capabilities with domain-specific tools.
Basic MCPServer with STDIO Transport
```yaml
apiVersion: kagent.dev/v1alpha1
kind: MCPServer
metadata:
  name: mcp-website-fetcher
  namespace: kagent
spec:
  # Deployment configuration
  deployment:
    # Command to run
    cmd: uvx
    # Arguments passed to the command
    args:
      - mcp-server-fetch
    # Container port (for SSE transport)
    port: 3000
  # Transport mechanism
  transportType: stdio
  stdioTransport: {}
```
MCPServer Field Explanations
spec.deployment.cmd: The executable command to start the MCP server. Common values:
- uvx: Python package runner
- npx: Node package runner
- node: Direct Node.js execution
- A custom binary path
spec.deployment.args: Arguments passed to the command. Typically the MCP server package name.
spec.deployment.port: Container port the MCP server listens on (for SSE transport).
spec.transportType: Communication protocol. Options:
- stdio: Standard input/output (simpler, lower overhead)
- sse: Server-Sent Events over HTTP (better for networked tools)
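Under stdio transport, the server reads JSON-RPC 2.0 requests from stdin and writes responses to stdout, one JSON object per line; SSE carries the same messages over HTTP instead. A minimal sketch of that message shape (illustrative; real MCP messages carry additional fields and a full initialization handshake):

```python
import json

# Minimal sketch of the JSON-RPC 2.0 framing used by stdio-based MCP
# servers: each request/response is a single JSON object on its own line.
def make_request(req_id: int, method: str, params: dict) -> str:
    return json.dumps({"jsonrpc": "2.0", "id": req_id,
                       "method": method, "params": params})

# A tool invocation as the agent would frame it:
line = make_request(1, "tools/call",
                    {"name": "fetch", "arguments": {"url": "https://example.com"}})
print(line)

# The server parses each incoming line back into a message:
msg = json.loads(line)
assert msg["method"] == "tools/call"
```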
MCPServer with SSE Transport
```yaml
apiVersion: kagent.dev/v1alpha1
kind: MCPServer
metadata:
  name: github-mcp-server
  namespace: kagent
spec:
  deployment:
    cmd: node
    args:
      - /app/dist/index.js
    port: 3000
    # Environment variables
    env:
      - name: GITHUB_TOKEN
        valueFrom:
          secretKeyRef:
            name: github-credentials
            key: token
    # Resource limits
    resources:
      requests:
        memory: "256Mi"
        cpu: "100m"
      limits:
        memory: "512Mi"
        cpu: "500m"
  transportType: sse
  sseTransport:
    # SSE endpoint path
    endpoint: /sse
```
Advanced MCPServer with Custom Image
```yaml
apiVersion: kagent.dev/v1alpha1
kind: MCPServer
metadata:
  name: custom-documentation-server
  namespace: kagent
  labels:
    app: mcp-server
    type: documentation
spec:
  deployment:
    # Custom Docker image
    image: your-registry.io/mcp-doc-server:v1.0.0
    imagePullPolicy: IfNotPresent
    cmd: npm
    args:
      - start
    port: 3001
    # Multiple environment variables
    env:
      - name: OPENAI_API_KEY
        valueFrom:
          secretKeyRef:
            name: kagent-openai
            key: OPENAI_API_KEY
      - name: DATABASE_PATH
        value: /data/docs.db
      - name: LOG_LEVEL
        value: info
    # Volume mounts for persistent data
    volumeMounts:
      - name: doc-database
        mountPath: /data
    volumes:
      - name: doc-database
        persistentVolumeClaim:
          claimName: doc-db-pvc
    # Health checks
    livenessProbe:
      httpGet:
        path: /health
        port: 3001
      initialDelaySeconds: 30
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /ready
        port: 3001
      initialDelaySeconds: 5
      periodSeconds: 5
    # Resource allocation
    resources:
      requests:
        memory: "512Mi"
        cpu: "250m"
      limits:
        memory: "1Gi"
        cpu: "1000m"
  transportType: sse
  sseTransport:
    endpoint: /sse
```
MCPServer with Init Container
```yaml
apiVersion: kagent.dev/v1alpha1
kind: MCPServer
metadata:
  name: mcp-with-init
  namespace: kagent
spec:
  deployment:
    # Init containers run before the main container
    initContainers:
      - name: download-data
        image: curlimages/curl:latest
        command:
          - sh
          - -c
          - |
            echo "Downloading required data..."
            curl -o /data/dataset.json https://example.com/data.json
        volumeMounts:
          - name: shared-data
            mountPath: /data
    image: mcp-processor:latest
    cmd: python
    args:
      - /app/server.py
    port: 3000
    env:
      - name: DATA_PATH
        value: /data/dataset.json
    volumeMounts:
      - name: shared-data
        mountPath: /data
    volumes:
      - name: shared-data
        emptyDir: {}
  transportType: stdio
  stdioTransport: {}
```
4. RemoteMCPServer YAML Configuration
RemoteMCPServer connects to external MCP servers running outside your cluster or pre-existing services.
Basic RemoteMCPServer
```yaml
apiVersion: kagent.dev/v1alpha1
kind: RemoteMCPServer
metadata:
  name: external-mcp-server
  namespace: kagent
spec:
  # External URL
  url: https://mcp.example.com/sse
  # Transport type
  transportType: sse
```
RemoteMCPServer with Authentication
```yaml
apiVersion: kagent.dev/v1alpha1
kind: RemoteMCPServer
metadata:
  name: authenticated-mcp
  namespace: kagent
spec:
  url: https://secure-mcp.example.com/sse
  transportType: sse
  # Authentication configuration
  auth:
    type: bearer
    tokenSecretRef:
      name: mcp-auth-token
      key: token
```
RemoteMCPServer for Internal Service
```yaml
apiVersion: kagent.dev/v1alpha1
kind: RemoteMCPServer
metadata:
  name: kagent-builtin-tools
  namespace: kagent
spec:
  # Internal Kubernetes service
  url: http://kagent-tool-server.kagent.svc.cluster.local:3001/sse
  transportType: sse
  # Optional: specific tools to expose
  tools:
    - k8s_get_resources
    - k8s_apply_yaml
    - k8s_get_pod_logs
    - helm_list_releases
    - prometheus_query
```
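The url in this example follows the standard in-cluster DNS pattern service.namespace.svc.cluster.local, so any Service exposing an SSE endpoint can be referenced the same way. A small helper for composing such URLs (hypothetical convenience function, not a Kagent API):

```python
# Hypothetical helper for composing in-cluster MCP endpoint URLs using
# the standard Kubernetes service DNS naming scheme.
def mcp_url(service: str, namespace: str, port: int, path: str = "/sse") -> str:
    return f"http://{service}.{namespace}.svc.cluster.local:{port}{path}"

print(mcp_url("kagent-tool-server", "kagent", 3001))
# http://kagent-tool-server.kagent.svc.cluster.local:3001/sse
```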
RemoteMCPServer with Custom Headers
```yaml
apiVersion: kagent.dev/v1alpha1
kind: RemoteMCPServer
metadata:
  name: custom-headers-mcp
  namespace: kagent
spec:
  url: https://api.example.com/mcp/sse
  transportType: sse
  # Custom HTTP headers
  headers:
    X-API-Version: "v1"
    X-Client-ID: "kagent-cluster-1"
    X-Environment: "production"
  # Authentication
  auth:
    type: apiKey
    apiKeySecretRef:
      name: api-credentials
      key: api-key
    apiKeyHeader: X-API-Key
```