I've been running OpenClaw — an open-source AI assistant platform — on a single-node Kubernetes cluster for a few months now. It connects to Mattermost, has browser automation, and runs various tools. Here's how I set it up.
## Why self-host
OpenClaw has a hosted offering, but I wanted full control over the data, integrations, and the ability to bolt on extra tools. The macOS app connects to the self-hosted gateway via WebSocket, so you get the nice UI without running the heavy stuff locally.
## The custom Docker image

The official image is great, but I needed the `gh` CLI, `kubectl`, `ffmpeg`, OCR, and Python tooling available for the AI to use. The Dockerfile extends the official image:
```dockerfile
ARG OPENCLAW_VERSION=2026.2.26
FROM ghcr.io/openclaw/openclaw:${OPENCLAW_VERSION}

USER root
RUN apt-get update && apt-get install -y --no-install-recommends \
        postgresql-client curl ca-certificates \
        jq ffmpeg rsync sqlite3 \
        poppler-utils pandoc \
        tesseract-ocr tesseract-ocr-eng \
    && rm -rf /var/lib/apt/lists/*

# gh CLI, kubectl, uv, Python tools...
# (full Dockerfile in the repo)

USER 1000
```
I also gave it kubectl access to a sandboxed namespace so the AI can spin up its own pods and deployments without touching anything else on the cluster. That's been fun to watch.
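A sandbox like that is just namespace-scoped RBAC. A minimal sketch of what I mean, with placeholder names (`openclaw-sandbox`, the `openclaw` service account, and the resource list are my assumptions, not the exact setup):

```yaml
# Hypothetical RBAC sketch: let the OpenClaw service account manage
# workloads only inside the "openclaw-sandbox" namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: sandbox-admin
  namespace: openclaw-sandbox
rules:
  - apiGroups: ["", "apps", "batch"]
    resources: ["pods", "pods/log", "deployments", "jobs", "services", "configmaps"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: sandbox-admin
  namespace: openclaw-sandbox
subjects:
  - kind: ServiceAccount
    name: openclaw
    namespace: openclaw   # namespace the gateway runs in (assumed)
roleRef:
  kind: Role
  name: sandbox-admin
  apiGroup: rbac.authorization.k8s.io
```

Since a Role (unlike a ClusterRole) only grants access within its own namespace, anything the AI creates stays fenced in.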
## The `configMode: merge` trap
This one cost me a few hours. OpenClaw's Helm chart supports `configMode: merge` — on each deploy, the init container deep-merges your Helm values into the existing config on the PVC. Helm values win on conflicts.

I initially put model configs, agent settings, and all sorts of things in the Helm values. Every deploy would overwrite whatever I'd changed through the OpenClaw UI. The fix: only put infrastructure requirements in Helm values (`gateway.mode`, `browser.cdpUrl`, credentials). Everything else — agents, models, channel settings — manage through the UI and let it persist on the PVC.
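After the cleanup, my values file carries only the infrastructure keys. A sketch of the shape (the specific values, and any key not named above, are assumptions about the chart):

```yaml
# Helm values: infrastructure only. Agents, models, and channel settings
# are managed in the UI and persist on the PVC.
configMode: merge
gateway:
  mode: self-hosted              # assumed value for illustration
browser:
  cdpUrl: http://chromium:9222   # hypothetical in-cluster browser endpoint
```

Anything you add here will silently clobber the UI-managed copy on the next deploy, so the discipline is: if you'd ever edit it in the UI, keep it out of the values file.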
## Credentials
OpenClaw has its own way of handling env vars — you pass them via configOverrides.env in the chart values, and it injects them into its process environment at startup:
```yaml
configOverrides:
  env:
    GH_TOKEN: "{{ .StateValues.secrets.openclaw.ghToken }}"
```
This works for tools that read env vars. For tools that need config files (like file-based keyrings), I wrote an init-tools.sh that creates those files from env vars on startup. Secrets are managed with SOPS and decrypted at deploy time.
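The init script boils down to "env var in, mode-600 file out". A hypothetical sketch of that pattern (the function, variable names, and paths are placeholders; in the real container the destination would be the PVC mount, e.g. under `/home/openclaw/.openclaw`):

```shell
#!/bin/sh
# Sketch of an init-tools.sh: materialize file-based credentials from
# env vars at container startup. All names here are illustrative.
set -eu

write_secret_file() {
  # $1 = secret value, $2 = destination file
  mkdir -p "$(dirname "$2")"
  umask 077                    # resulting file is owner-readable only
  printf '%s\n' "$1" > "$2"
}

# Demo invocation with throwaway values; a real script would loop over
# each tool that needs a config file.
CONFIG_DIR="/tmp/openclaw-demo"
EXAMPLE_TOOL_TOKEN="demo-token"
write_secret_file "$EXAMPLE_TOOL_TOKEN" "$CONFIG_DIR/example-tool/token"
echo "wrote $CONFIG_DIR/example-tool/token"
```

Running this from the container's entrypoint (before the main process starts) means the files exist by the time any tool looks for them.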
## CI/CD
I use Woodpecker CI. The pipeline builds the custom image and deploys with helmfile:
```yaml
when:
  - event: push
    branch: main
    path:
      include:
        - openclaw-custom/**
        - k8s/values/openclaw.yaml.gotmpl

steps:
  - name: build
    image: woodpeckerci/plugin-docker-buildx:5-insecure
    settings:
      repo: my-registry/openclaw-custom
      tag: sha-${CI_COMMIT_SHA:0:8}
      platforms: linux/amd64
      context: openclaw-custom

  - name: deploy
    depends_on: [build]
    image: ghcr.io/helmfile/helmfile:v1.2.3
    commands:
      # ... kubeconfig and SOPS key setup ...
      - >-
        cd k8s && helmfile -l name=openclaw sync
        --state-values-set apps.openclaw.enabled=true
        --set image.tag=sha-${CI_COMMIT_SHA:0:8}
```
Push to main, image builds, helmfile deploys. The image tag is the commit SHA so I always know what's running. The OpenClaw release is disabled by default in helmfile — it only deploys when its own files change, via this dedicated pipeline.
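The "disabled by default" part is helmfile's `installed:` field driven by state values. A sketch of the release entry (the layout and chart reference are assumptions; `apps.openclaw.enabled` matches the `--state-values-set` flag the pipeline passes):

```yaml
# helmfile.yaml excerpt (hypothetical layout)
releases:
  - name: openclaw
    namespace: openclaw
    chart: openclaw/openclaw
    values:
      - values/openclaw.yaml.gotmpl
    # false in normal "sync everything" runs; the dedicated pipeline
    # flips it on with --state-values-set
    installed: {{ .StateValues | get "apps.openclaw.enabled" false }}
```

When `installed` is false, a sync treats the release as absent, so a cluster-wide `helmfile sync` from some other pipeline won't accidentally redeploy (or delete) OpenClaw.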
The deployment uses Recreate strategy (not rolling) with a single replica. OpenClaw is a stateful gateway holding WebSocket connections to Mattermost, the macOS app, etc. Running two instances would mean duplicate bot messages and split state. The ~30 second downtime per deploy is fine.
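In chart values this is typically just two keys (the exact paths depend on the chart, so treat these as assumptions):

```yaml
replicaCount: 1
strategy:
  type: Recreate   # old pod is killed before the new one starts, so
                   # two gateways never hold the same connections at once
```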
## Backups
This is the part I'm most paranoid about. The data lives on a PVC — config, sessions, chat history — and PVCs aren't backed up by default. I run a daily systemd timer that:
- **Dumps PostgreSQL** — `pg_dumpall`, keeping the 3 most recent dumps
- **Exports Kubernetes manifests** — all resources to YAML
- **Backs up everything to Google Drive** — using Duplicacy (all snapshots for 7 days, daily for 30, weekly for a year)
- **Pushes a metric to VictoriaMetrics** — so I get alerted if backups stop working
The timer uses Persistent=true so it catches up if the machine was down at the scheduled time. On a single-node cluster, PVC data lives on the host filesystem, so the Duplicacy backup covers it.
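The timer side is a small unit pair. A sketch with placeholder names (`openclaw-backup` and the schedule details are mine; the matching `.service` unit would run the backup script):

```ini
# /etc/systemd/system/openclaw-backup.timer
[Unit]
Description=Daily OpenClaw backup

[Timer]
OnCalendar=daily
# Run on next boot if the scheduled time was missed while the machine was down
Persistent=true
RandomizedDelaySec=15m

[Install]
WantedBy=timers.target
```

Enable it with `systemctl enable --now openclaw-backup.timer`; `Persistent=true` is the line doing the catch-up work described above.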
## Other things that bit me
- **Container home is read-only.** The official image sets `/home/openclaw` as read-only. Every tool that writes config files needs to point to the PVC mount at `/home/openclaw/.openclaw` instead. I kept hitting this with `gh`, `gog`, and anything else that writes to `$HOME`.
- **Google OAuth tokens expire in Testing mode.** If your OAuth app is in "Testing" mode in Google Cloud Console, refresh tokens expire after 7 days of no use. Switching to "Production" mode fixes it. The 100-user cap and "unverified app" warning are irrelevant for personal use. This one was particularly annoying to debug because it would work fine for a week and then silently stop.
## Is it worth it?
Honestly, it's over-engineered for a single-user instance. But I already had the cluster, and the incremental cost of adding OpenClaw was small. The key win is that `configMode: merge` lets you treat infrastructure config as code while keeping the interactive stuff in the UI. And having proper CI/CD means I can patch upstream bugs and deploy in under a minute.
If you're thinking about self-hosting, start with Docker Compose. Only move to Kubernetes if you already have a cluster.