I've been running OpenClaw — an open-source AI assistant platform — on a single-node Kubernetes cluster for a few months now. It connects to Mattermost, has browser automation, and runs various tools. Here's how I set it up.

Why self-host

OpenClaw has a hosted offering, but I wanted full control over the data, integrations, and the ability to bolt on extra tools. The macOS app connects to the self-hosted gateway via WebSocket, so you get the nice UI without running the heavy stuff locally.

The custom Docker image

The official image is great, but I needed extras like the gh CLI, kubectl, ffmpeg, OCR tooling, and Python utilities available for the AI to use. The Dockerfile extends the official image:

ARG OPENCLAW_VERSION=2026.2.26
FROM ghcr.io/openclaw/openclaw:${OPENCLAW_VERSION}

USER root

RUN apt-get update && apt-get install -y --no-install-recommends \
    postgresql-client curl ca-certificates \
    jq ffmpeg rsync sqlite3 \
    poppler-utils pandoc \
    tesseract-ocr tesseract-ocr-eng \
    && rm -rf /var/lib/apt/lists/*

# gh CLI, kubectl, uv, Python tools...
# (full Dockerfile in the repo)

USER 1000

I also gave it kubectl access to a sandboxed namespace so the AI can spin up its own pods and deployments without touching anything else on the cluster. That's been fun to watch.
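The sandboxing is plain Kubernetes RBAC: a Role scoped to one namespace, bound to the ServiceAccount the OpenClaw pod runs under. A sketch of what that looks like (namespace, names, and the exact resource list here are my choices, not from the chart):

```yaml
# Hypothetical RBAC: full control inside "openclaw-sandbox", nothing else.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: openclaw-sandbox-admin
  namespace: openclaw-sandbox
rules:
  - apiGroups: ["", "apps", "batch"]
    resources: ["pods", "pods/log", "deployments", "jobs", "services", "configmaps"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: openclaw-sandbox-admin
  namespace: openclaw-sandbox
subjects:
  - kind: ServiceAccount
    name: openclaw            # the ServiceAccount the gateway pod uses
    namespace: openclaw
roleRef:
  kind: Role
  name: openclaw-sandbox-admin
  apiGroup: rbac.authorization.k8s.io
```

Because a Role (not a ClusterRole) is namespace-scoped, kubectl inside the container simply gets "forbidden" errors for anything outside the sandbox.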

The configMode: merge trap

This one cost me a few hours. OpenClaw's helm chart supports configMode: merge — on each deploy, the init container deep-merges your Helm values into the existing config on the PVC. Helm values win on conflicts.

I initially put model configs, agent settings, and all sorts of things in the Helm values. Every deploy would overwrite whatever I'd changed through the OpenClaw UI. The fix: only put infrastructure requirements in Helm values (gateway.mode, browser.cdpUrl, credentials). Everything else — agents, models, channel settings — manage through the UI and let it persist on the PVC.
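After that cleanup, my Helm values shrank to roughly this shape (gateway.mode, browser.cdpUrl, and configMode come from the chart per the above; the concrete values are illustrative):

```yaml
# Infrastructure-only values; everything else lives on the PVC via the UI.
configMode: merge
gateway:
  mode: gateway                    # illustrative value
browser:
  cdpUrl: http://localhost:9222    # CDP endpoint of the browser sidecar
```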

Credentials

OpenClaw has its own way of handling env vars — you pass them via configOverrides.env in the chart values, and it injects them into its process environment at startup:

configOverrides:
  env:
    GH_TOKEN: "{{ .StateValues.secrets.openclaw.ghToken }}"

This works for tools that read env vars. For tools that need config files (like file-based keyrings), I wrote an init-tools.sh that creates those files from env vars on startup. Secrets are managed with SOPS and decrypted at deploy time.
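The script itself is nothing fancy. A minimal sketch, with illustrative paths and variable names (the real one handles several tools the same way):

```shell
#!/bin/sh
set -eu

# Sketch of init-tools.sh: materialize file-based credentials from env
# vars before the main process starts. Paths and names are illustrative.
write_secret_file() {
  # $1 = secret value, $2 = destination file
  mkdir -p "$(dirname "$2")"
  umask 077                              # keep the file private
  printf '%s\n' "$1" > "$2"
}

CONFIG_DIR="${CONFIG_DIR:-$HOME/.openclaw/tool-config}"

# Example: a hypothetical tool that insists on a token file.
if [ -n "${EXAMPLE_TOOL_TOKEN:-}" ]; then
  write_secret_file "$EXAMPLE_TOOL_TOKEN" "$CONFIG_DIR/example-tool-token"
fi
```

It runs as an extra entry before the main command, so the files exist by the time any tool looks for them.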

CI/CD

I use Woodpecker CI. The pipeline builds the custom image and deploys with helmfile:

when:
  - event: push
    branch: main
    path:
      include:
        - openclaw-custom/**
        - k8s/values/openclaw.yaml.gotmpl

steps:
  - name: build
    image: woodpeckerci/plugin-docker-buildx:5-insecure
    settings:
      repo: my-registry/openclaw-custom
      tag: sha-${CI_COMMIT_SHA:0:8}
      platforms: linux/amd64
      context: openclaw-custom

  - name: deploy
    depends_on: [build]
    image: ghcr.io/helmfile/helmfile:v1.2.3
    commands:
      - # ... kubeconfig and SOPS key setup ...
      - cd k8s && helmfile -l name=openclaw sync \
          --state-values-set apps.openclaw.enabled=true \
          --set image.tag=sha-${CI_COMMIT_SHA:0:8}

Push to main, image builds, helmfile deploys. The image tag is the commit SHA so I always know what's running. The OpenClaw release is disabled by default in helmfile — it only deploys when its own files change, via this dedicated pipeline.

The deployment uses the Recreate strategy (not rolling) with a single replica. OpenClaw is a stateful gateway holding WebSocket connections to Mattermost, the macOS app, etc. Running two instances would mean duplicate bot messages and split state. The ~30-second downtime per deploy is fine.
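If the chart follows the common convention of passing the strategy through values, this is a two-line change (the key names here are an assumption; the chart may spell them differently):

```yaml
replicaCount: 1
strategy:
  type: Recreate   # terminate the old pod before the new one starts
```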

Backups

This is the part I'm most paranoid about. The data lives on a PVC — config, sessions, chat history — and PVCs aren't backed up by default. I run a daily systemd timer that:

  1. Dumps PostgreSQL with pg_dumpall, keeping the 3 most recent dumps
  2. Exports Kubernetes manifests — all resources to YAML
  3. Backs up everything to Google Drive — using Duplicacy (keep all snapshots for 7 days, dailies for 30 days, weeklies for a year)
  4. Pushes a metric to VictoriaMetrics so I get alerted if backups stop working
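The steps above can be sketched as a single script. The tool invocations are the ones named above; the paths, metric name, and VictoriaMetrics URL are illustrative:

```shell
#!/bin/sh
set -eu

# Sketch of the nightly backup job. Only the retention helper is
# self-contained; the rest calls out to pg_dumpall/kubectl/duplicacy.
BACKUP_DIR="${BACKUP_DIR:-/var/backups/openclaw}"

# Keep only the N newest files in a directory, delete the rest.
prune_keep_newest() {
  dir=$1; keep=$2
  ls -1t "$dir" 2>/dev/null | tail -n +"$((keep + 1))" | while read -r f; do
    rm -f "$dir/$f"
  done
}

backup() {
  mkdir -p "$BACKUP_DIR/pg" "$BACKUP_DIR/manifests"
  # 1. PostgreSQL dump, keep the 3 newest
  pg_dumpall > "$BACKUP_DIR/pg/dump-$(date +%F).sql"
  prune_keep_newest "$BACKUP_DIR/pg" 3
  # 2. Kubernetes manifests as YAML
  kubectl get all,cm,pvc -A -o yaml > "$BACKUP_DIR/manifests/all.yaml"
  # 3. Off-site copy via Duplicacy (repository pre-initialized)
  duplicacy backup
  # 4. Heartbeat metric so alerting fires if backups stop
  printf 'openclaw_backup_last_success_timestamp %s\n' "$(date +%s)" \
    | curl -s --data-binary @- http://victoriametrics:8428/api/v1/import/prometheus
}
```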

The timer uses Persistent=true so it catches up if the machine was down at the scheduled time. On a single-node cluster, PVC data lives on the host filesystem, so the Duplicacy backup covers it.
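The timer half is ordinary systemd. A sketch, with my own unit name:

```ini
# /etc/systemd/system/openclaw-backup.timer (sketch)
[Unit]
Description=Daily OpenClaw backup

[Timer]
OnCalendar=daily
Persistent=true          # run at next boot if the scheduled run was missed
RandomizedDelaySec=15m

[Install]
WantedBy=timers.target
```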

Other things that bit me

  • Container home is read-only. The official image sets /home/openclaw as read-only. Every tool that writes config files needs to point to the PVC mount at /home/openclaw/.openclaw instead. I kept hitting this with gh, gog, and anything else that writes to $HOME.
  • Google OAuth tokens expire in Testing mode. If your OAuth app is in "Testing" mode in Google Cloud Console, refresh tokens expire after 7 days of no use. Switching to "Production" mode fixes it. The 100-user cap and "unverified app" warning are irrelevant for personal use. This one was particularly annoying to debug because it would work fine for a week and then silently stop.
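For the read-only home issue, the fix was mostly environment variables: well-behaved CLIs honor XDG paths or a tool-specific override. GH_CONFIG_DIR is gh's real override; the directory layout below is my own convention, passed through the same configOverrides.env mechanism as the credentials:

```yaml
configOverrides:
  env:
    # Point config-writing tools at the writable PVC mount instead of $HOME.
    XDG_CONFIG_HOME: /home/openclaw/.openclaw/xdg/config
    XDG_CACHE_HOME: /home/openclaw/.openclaw/xdg/cache
    GH_CONFIG_DIR: /home/openclaw/.openclaw/gh
```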

Is it worth it?

Honestly, it's over-engineered for a single-user instance. But I already had the cluster, and the incremental cost of adding OpenClaw was small. The key win is that configMode: merge lets you treat infrastructure config as code while keeping the interactive stuff in the UI. And having proper CI/CD means I can patch upstream bugs and deploy in under a minute.

If you're thinking about self-hosting, start with Docker Compose. Only move to Kubernetes if you already have a cluster.