Claude Code in a Homelab Workflow

By SumGuy 12 min read
Full example: Grab a starter CLAUDE.md and .claude/settings.json for a homelab repo at github.com/KingPin/sumguy-examples/…

You know the workflow. You’re writing a docker-compose file at 11 PM, something’s wrong with the healthcheck syntax, and you alt-tab to your browser, paste the YAML into claude.ai, wait for the chat response, copy the fix, alt-tab back, paste it in, and realize the indentation got mangled. You fix the indentation. Something else is wrong now. Repeat four more times. Your coffee is cold. The container still isn’t healthy.

This is the standard “AI-assisted” homelab workflow for a lot of people, and it’s honestly kind of embarrassing given how capable these models are. The problem isn’t the model — it’s the interface. Chat UIs are built for conversation. Your homelab is built around files, directories, running processes, and shell commands. There’s a gap there, and copy-pasting across it a hundred times a day is the tax you pay.

Claude Code is Anthropic’s attempt to close that gap. It’s not a chat UI. It’s a terminal-native agentic CLI that lives inside your repo, reads your actual files, and runs your actual shell commands. That changes the dynamic considerably.

Here’s what it looks like in practice.


It’s Not a Chatbot That Happens to Run in a Terminal

Before we get into the homelab jobs, let’s be clear on what Claude Code actually is — because “AI in the terminal” is one of those phrases that’s been applied to a lot of things that are really just glorified autocomplete or a wrapper around curl api.openai.com.

Claude Code is an agentic loop. You give it a task, and it reasons through it in steps, using real tools: it can read files, write files, run shell commands, search your codebase with grep, and look things up. It doesn’t just generate text — it takes actions and responds to what those actions return. Ask it to fix a failing service and it’ll look at your systemd unit, run journalctl -u yourservice -n 50, read what comes back, and then make changes based on the actual error output rather than guessing.

The permission model is worth understanding upfront. Every time Claude Code wants to do something potentially destructive — run a shell command, edit a file you haven’t explicitly okayed — it stops and asks. You can approve once, approve for the session, or deny. There’s also a --dangerously-skip-permissions flag — often called “YOLO mode.” We’ll come back to why that name is funny in a not-funny way.

It’s driven by a CLAUDE.md file in your repo root — a plain markdown file where you tell it how your project works, what tools are available, what conventions to follow. Think of it as the onboarding doc for an agent that just walked into your homelab. The more you put in there, the less you have to explain every session.


The Jobs It’s Actually Good At

That docker-compose.yml That’s Never Quite Right

Give Claude Code your current compose file and tell it what the service should do. It’ll write the thing, but more usefully, it’ll catch the problems you didn’t notice.

Here’s a pattern that comes up constantly: you write a compose file, you add depends_on: [db], and you call it done. Claude Code will flag that depends_on only waits for the container to start, not for the database inside it to be ready to accept connections. It’ll suggest the healthcheck + condition: service_healthy pattern before you discover the race condition yourself at 2 AM when your app can’t connect on first boot.

docker-compose.yml
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_DB: mydb  # create the database that DATABASE_URL points at
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
  app:
    image: myapp:latest
    depends_on:
      db:
        condition: service_healthy
    environment:
      DATABASE_URL: postgres://postgres:${DB_PASSWORD}@db:5432/mydb

That healthcheck pattern is the kind of thing you learn once the hard way, or you learn it because your tool caught it. Claude Code tends to catch it.

It’ll also notice when you’re missing a restart policy on a service that should survive reboots, when your volume mount paths look wrong, or when you’re exposing ports that probably shouldn’t be exposed on a multi-service setup.
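To confirm the healthcheck pattern above is doing its job, you can ask Docker what it thinks the container's health is. This is a minimal sketch, not something from the starter repo: the function name wait_healthy and its default attempt count and delay are made up for illustration. It relies only on docker inspect and the .State.Health.Status field, which Docker fills in when a healthcheck is defined.

```shell
# Poll a container's health status until it reports "healthy" or we give up.
# Usage: wait_healthy <container> [max-attempts] [seconds-between-attempts]
# (wait_healthy and its defaults are illustrative, not part of any tool.)
wait_healthy() {
    local container="${1:?container name or ID required}"
    local attempts="${2:-30}"
    local delay="${3:-2}"
    local status=""
    local i=1

    while [ "$i" -le "$attempts" ]; do
        # The inspect call fails if the container is missing or has no healthcheck.
        status=$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null) || status="missing"
        if [ "$status" = "healthy" ]; then
            echo "$container is healthy after $i check(s)"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done

    echo "ERROR: $container never became healthy (last status: $status)" >&2
    return 1
}
```

With the compose file above, something like wait_healthy "$(docker compose ps -q db)" should return once pg_isready starts succeeding. If it times out with a status of "missing", the healthcheck block probably never made it into the running container.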

That Bash Backup Script You Wrote in 2019

Every homelab has one. A bash script that started as ten lines and is now 200, written across a handful of late nights, with variable names like $DIR2 and $NEWDIR and a comment that says “# TODO: fix this” with no indication of what this is or what fixing it would involve.

Point Claude Code at it. Tell it to refactor for readability and add error handling. Watch it do something your bash scripts have probably never had: set -euo pipefail at the top, proper function definitions, meaningful variable names, and actual error messages when something fails.

# Before
backup() {
    cp -r $DIR $DEST/$DATE
    if [ $? -ne 0 ]; then
        echo "failed"
    fi
}

# After
backup_directory() {
    local source_dir="${1:?source directory required}"
    local dest_dir="${2:?destination directory required}"
    local timestamp
    timestamp=$(date +%Y%m%d_%H%M%S)
    if ! cp -r "$source_dir" "${dest_dir}/${timestamp}"; then
        echo "ERROR: backup failed: $source_dir -> $dest_dir" >&2
        return 1
    fi
    echo "Backup complete: ${dest_dir}/${timestamp}"
}

It won’t just mechanically rename things. It’ll ask clarifying questions if the script’s intent is ambiguous, and it’ll flag things that look like bugs, such as leaving variables unquoted inside a loop over a glob, or checking $? several lines after the command that set it.
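The stale $? bug is easy to miss, so here is a small illustration of why it bites. Both function names are invented for the example. The only thing being demonstrated is that $? always holds the exit status of the most recent command, whatever that happened to be.

```shell
# Buggy: by the time $? is read, it holds echo's exit status (0), not cp's.
copy_buggy() {
    cp "$1" "$2" 2>/dev/null
    echo "copy attempted"
    if [ $? -ne 0 ]; then
        echo "copy failed"   # never reached, even when cp fails
    fi
}

# Fixed: test the command directly so nothing can run in between.
copy_fixed() {
    if ! cp "$1" "$2" 2>/dev/null; then
        echo "copy failed"
        return 1
    fi
    echo "copy ok"
}
```

Run both against a source path that doesn't exist: copy_buggy prints only "copy attempted" and carries on as if nothing happened, while copy_fixed prints "copy failed" and returns non-zero.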

“Here’s What I Did By Hand: Write Me a Playbook”

This is genuinely one of the most useful things it can do for a homelab context. You set up a service manually, took notes (or didn’t), and now you want an Ansible playbook so you can reproduce it on the next box.

Describe what you did: installed packages, edited config files, enabled a service, created a user, added a cron job. Claude Code will write the playbook. The tasks will be in the right order, the become: yes will be on the right plays, and it’ll use appropriate modules (lineinfile vs copy vs template) rather than just shelling out for everything.

What’s more useful: give it your actual config files and tell it to turn them into templates. It’ll identify the variables, parameterize them, and wire up group_vars. You end up with something idempotent and reusable, not a bash script with ssh calls dressed up in a YAML wrapper.

Debugging the Dead systemd Unit

This one hits differently. A service dies overnight, you wake up to alerts (or you don’t, because your alerting is also broken), and you’re standing in the kitchen trying to figure out what happened.

Terminal window
$ claude "the sumguy-backup service failed overnight, figure out why"

It’ll run systemctl status sumguy-backup, read the output, decide it needs more context, run journalctl -u sumguy-backup --since "yesterday" -n 100, parse the actual error messages, and give you a diagnosis. It doesn’t guess — it reads what the system actually says. If the journal says No space left on device, it’ll check disk usage. If it says permission denied, it’ll look at the unit file’s User= directive and the target directory’s ownership.

This is the alt-tab workflow’s nemesis. Instead of copying log lines into a chat window, the tool just reads them itself.
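For comparison, the same triage done by hand looks roughly like the sketch below. The function names and the three-way mapping are assumptions made for illustration. Claude Code's actual decision process is not a case statement, and real failures come in more than two flavors. The systemctl and journalctl invocations are standard ones.

```shell
# Map a known error string in a service's journal to the next thing to check.
# Reads log text on stdin and prints one suggested follow-up command.
# (suggest_next_check and triage_unit are illustrative names.)
suggest_next_check() {
    local unit="${1:?unit name required}"
    local log
    log=$(cat)

    case "$log" in
        *"No space left on device"*)
            echo "df -h" ;;
        *[Pp]"ermission denied"*)
            echo "systemctl show -p User $unit" ;;
        *)
            echo "journalctl -u $unit -n 200 --no-pager" ;;
    esac
}

# Collect the evidence described above, then suggest a follow-up.
triage_unit() {
    local unit="${1:?unit name required}"
    systemctl status "$unit" --no-pager || true
    journalctl -u "$unit" --since yesterday -n 100 --no-pager | suggest_next_check "$unit"
}
```

Calling triage_unit sumguy-backup prints the unit status followed by one suggested next command. The point of the agent is that it keeps going from there instead of handing the next step back to you.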

Bulk Config Editing Without Crying

You’ve got fifteen Caddy config files across a directory structure, or twenty nginx server blocks, or a pile of .env files, and you need to make the same change to all of them. Maybe you’re updating a domain name, or changing an upstream address, or adding a header everywhere.

sed works until your regex needs to handle slight variations across files, and then you spend twenty minutes debugging the regex instead of making the actual change. Claude Code can read all the files in a directory, understand the variations, and make the right change to each one rather than applying a dumb pattern that breaks half of them.

Tell it: “In all the nginx configs under /etc/nginx/sites-available/, add a proxy_read_timeout 60; directive inside every location / block that doesn’t already have one.” It’ll read each file, check for the existing directive, and only add it where it’s actually missing. No sed tears.
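Whoever makes the edit, it's worth auditing the result. Here is a rough sketch of a check that lists configs which have a location / block but no proxy_read_timeout anywhere in the file. The function name is made up, and the check is deliberately file-level: it won't tell you which block the directive landed in, so treat it as a first pass rather than proof.

```shell
# List config files that contain a "location / {" block but no proxy_read_timeout.
# File-level check only: a file with several location blocks needs a closer look.
# (configs_missing_timeout is an illustrative name.)
configs_missing_timeout() {
    local dir="${1:?config directory required}"
    local f
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        if grep -Eq 'location[[:space:]]+/[[:space:]]*\{' "$f" &&
           ! grep -q 'proxy_read_timeout' "$f"; then
            echo "$f"
        fi
    done
}
```

Run configs_missing_timeout /etc/nginx/sites-available before the change to see the scope, and again afterwards. An empty result the second time is a good sign, and nginx -t is still the final word on whether the configs parse.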

READMEs for the Self-Hosted Stuff You’ll Forget You Set Up

Six months from now, you will not remember how you configured that Vaultwarden instance, what ports it uses, what the backup path is, or why you mounted that specific volume. You won’t remember the incantation to regenerate the admin token either.

Give Claude Code your compose file and your notes, tell it to write a README. It’ll produce a description of what the service does, prerequisites, how to bring it up, the relevant env vars, and where things live on disk. Boring work that nobody wants to do and that saves you hours later.


The Honest Part (Don’t Skip This)

It Has Real Tools on Your Real Box

The permission prompt model sounds responsible, and it is — when you use it. The --dangerously-skip-permissions flag exists for CI/sandboxed environments where you want it to just run without stopping. On your actual homelab server with real data, production databases, and years of accumulated configuration: do not use this flag. Not because Claude Code is malicious — it isn’t — but because agentic AI tools make mistakes, and “mistakes with permission prompts” looks very different from “mistakes on a box where it can silently rm -rf something.”

Keep the prompts. They feel annoying. They are also the thing standing between you and “it interpreted my intent incorrectly and deleted the wrong directory.”
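If the prompts get repetitive, a less drastic option than skipping them is to pre-approve a few read-only commands and explicitly deny the scary ones in the project's .claude/settings.json. The sketch below writes such a file. The specific rules are hypothetical examples, and the Tool(pattern) rule syntax is reproduced from memory of the Claude Code settings docs, so check it against the current documentation before relying on it.

```shell
# Write a project-level settings file with explicit allow/deny rules.
# The rules are illustrative; confirm the rule syntax in the current Claude Code docs.
# (write_claude_settings is a made-up helper name.)
write_claude_settings() {
    local repo="${1:-.}"
    mkdir -p "$repo/.claude"
    cat > "$repo/.claude/settings.json" <<'EOF'
{
  "permissions": {
    "allow": [
      "Bash(systemctl status:*)",
      "Bash(journalctl:*)"
    ],
    "deny": [
      "Bash(rm:*)",
      "Read(./.env)"
    ]
  }
}
EOF
}
```

The idea is that inspection commands stop interrupting you while anything destructive, or anything that reads secrets, still gets stopped. A deny rule does not replace reading the prompt. It narrows what can go wrong, not what you need to pay attention to.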

The Token Bill Is Real

Claude Code runs against Anthropic’s hosted models, billed either per token through the API or against the usage limits of a paid Claude plan. Either way, every session burns tokens: input tokens for everything it reads, output tokens for everything it generates, and a nontrivial amount for the internal reasoning of the agentic loop. For light homelab tasks, this is fine. For a long session where it’s reading a lot of files, running a lot of commands, and making many iterations, you will notice it on your bill or in how quickly you hit your limits.

This is not a complaint about the pricing model — it’s a fact about how it works. Your local Ollama box running on that GPU you bought for gaming doesn’t send you a bill. Claude Code does. Budget accordingly.

Your Code and Configs Go to a Cloud API

This is the one that matters most to a self-hosting audience, and glossing over it would be dishonest. When Claude Code reads your files and runs your queries, that content goes to Anthropic’s API. Your docker-compose files, your bash scripts, your Ansible inventories with hostnames and possibly credentials — all of it becomes API request content.

Anthropic’s API terms say they don’t train on your inputs and outputs by default for commercial API usage. But the data still leaves your box and hits their servers. For a homelab setup where you’re working on pet projects and learning exercises, this is probably fine. For a work box with proprietary code, or a setup that handles sensitive personal data, or anything where you have strong opinions about data sovereignty — which, if you’re reading a self-hosting blog, you might — this is the tradeoff you’re making.

The alternative is local code assistants, which we’ll get to.
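Before pointing any cloud-backed agent at a repo, it helps to know which files in it look like they hold credentials. The sketch below is a crude pre-flight scan. The function name and the keyword list are assumptions for illustration, and it is nowhere near a real secret scanner. It skips values that are just ${VAR} references, like the compose file earlier, and lists files where a password-ish key is set to a literal value.

```shell
# List files under a directory that look like they contain literal credentials.
# The keyword pattern is a hypothetical starting point, not an exhaustive scanner.
# (find_likely_secrets is an illustrative name.)
find_likely_secrets() {
    local dir="${1:-.}"
    # Match KEY= or KEY: followed by a literal value; values starting with $ are
    # treated as variable references and skipped.
    grep -rIliE --exclude-dir=.git \
        '(PASSWORD|SECRET|TOKEN|API_KEY)[[:space:]]*[:=][[:space:]]*[^[:space:]$]' \
        "$dir" || true
}
```

Anything this prints is a candidate for moving out of the repo, into an untracked .env, or onto a deny list before the first session.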


MCP: Pointing It at Your Homelab Services

Claude Code supports the Model Context Protocol (MCP), which is a way to give it tools beyond what it ships with. Think of it as plugins — you can wire up an MCP server that exposes your homelab’s tooling and Claude Code can use it as part of its agentic loop.

A concrete homelab example: there’s an MCP server for Prometheus. Configure Claude Code to query your Prometheus instance directly, and instead of copying metric queries out of Grafana and pasting them into a chat, you ask it to look at your actual metrics, spot anomalies, and write alert rules — without leaving the terminal. Same concept applies to Home Assistant and anything else that exposes an API you can wrap.

MCP is the mechanism by which Claude Code goes from “file editor” to “agent that interacts with your running infrastructure.” The ecosystem is still growing, but it’s the right direction.


Where It Fits vs Your Local Stack

Here’s the honest verdict, because “when should I use Claude Code vs Aider/Continue.dev/Tabby against Ollama” is a real question with a real answer.

Use Claude Code when the task spans multiple files and needs real investigation — diagnose, read logs, make coordinated changes. When the complexity is high enough that the cloud model’s reasoning quality is worth paying for. When you want MCP integrations with your actual running infrastructure.

Stay local when anything sensitive is in scope: production credentials, personal data, work code. When the task is simple enough that a local 12B model can handle it — generating a basic compose file, explaining a man page. When data sovereignty matters more than capability ceiling.

The honest middle ground: Aider against a local Ollama model (Qwen3-Coder, or a Gemma 4 variant) gets you 70-80% of the agentic capability with zero data leaving your box and zero marginal cost. The gap between local and cloud has closed considerably. You’re paying the Claude Code tax for the top of that range — complex tasks, many files, judgment calls that need careful reasoning rather than autocomplete.

For most homelab work, a good local model is enough. Where Claude Code earns its keep is the investigative work: debugging a dead service by reading its actual logs, refactoring a complex script that requires understanding intent, bulk edits that need judgment rather than pattern-matching.

The tool is genuinely good. The workflow shift from “paste into chat” to “agent that lives in the repo” is a real improvement. Just go in with clear eyes about the cost and where your data goes. Your homelab is built on controlling your own stuff — an agent that phones home is a deliberate tradeoff, not a free lunch. Make it consciously.

