Mimir + Grafana: Long-Term Prometheus Storage

By SumGuy 8 min read

Prometheus Drops Your Data After 15 Days. Surprise.

You’ve been running Prometheus for three months. Dashboards look good. Alerts fire when they should. Then someone asks: “Can you show me what CPU usage looked like last month?” You open the Prometheus UI, query for data from 40 days ago, and get back nothing. No flat line, no error, just an empty graph. Your metrics are gone.

This isn’t a bug. It’s a design choice.

Prometheus retains 15 days of data by default, and bumping retention to “store forever” on a single box gets painful: disk, memory, and query times all grow with how much history and how many series you keep. Prometheus stores everything in a local time-series database optimized for speed, not capacity. It’s a car, not a truck. Great for real-time dashboards and alerting. Terrible for capacity planning, year-over-year trend analysis, or compliance audits that ask for 12 months of data.

This is where long-term storage comes in. And if you’re running a home lab or small business, Grafana Mimir is probably the right move.

Why Prometheus Ain’t a Data Warehouse

Before we talk solutions, let’s understand the problem. Prometheus’s local storage is optimized for write speed and query latency. Every incoming metric gets compressed into blocks on disk, and blocks get compacted into larger ones over time. This design scales up vertically—more RAM, bigger SSD—until you hit the ceiling where a single machine can’t keep up. There’s no prize for running a monitoring system that’s so resource-heavy it becomes a SPOF (single point of failure) itself.

The solution isn’t to buy a bigger server. It’s to offload old data somewhere cheaper and keep Prometheus fast.

When Bumping Retention Is Actually Fine

Before you jump to Mimir, be honest: do you actually need long-term storage?

If your home lab has maybe 500 metrics total and you’re only concerned with the last 30 days of data, bumping Prometheus’s --storage.tsdb.retention.time flag to 30d might just work. You’ll need more disk space and a bit more RAM, but it’s simple. No extra services. No S3 bill. No debugging a distributed system at 2 AM.
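As a rough sketch, the retention bump is a single changed flag on the Prometheus service from the docker-compose.yml later in this post. The size cap is an optional extra, and the 20GB value is an arbitrary placeholder rather than a recommendation:

```yaml
services:
  prometheus:
    image: prom/prometheus:latest
    command:
      - "--config.file=/etc/prometheus/prometheus.yml"
      - "--storage.tsdb.path=/prometheus"
      # Keep 30 days of data instead of the default 15.
      - "--storage.tsdb.retention.time=30d"
      # Optional safety net: drop the oldest blocks once the TSDB exceeds this size.
      # 20GB is a placeholder; size it to the disk you actually have.
      - "--storage.tsdb.retention.size=20GB"
```

When both flags are set, whichever limit is hit first wins, so the size cap protects the disk if cardinality grows faster than expected.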

The math is simple: roughly 1 KB per metric per day on average (varies wildly based on cardinality and scrape interval). 500 metrics × 30 days × 1 KB = ~15 MB of disk. Even if your real numbers come out ten times worse, that’s still nothing. Cheap. Easy.

But if you’re tracking thousands of metrics, or you need historical data for capacity planning, or compliance says “keep 12 months”, then a single Prometheus box becomes a money pit. This is where the long-term storage options split into camps.

The Players: Mimir vs Thanos vs VictoriaMetrics

You’ve got three main paths forward.

Thanos is the old guard. It works by sidecar: you run a Thanos sidecar container next to your Prometheus, and the sidecar uploads blocks to object storage (S3, GCS, whatever). Then you run a separate query layer, store gateway, and compactor to stitch it all together. It works, but it’s got more moving parts than a Swiss watch. Each component can fail independently. You’ll spend time debugging why your query layer can’t talk to the store gateway at 3 AM.

VictoriaMetrics is a separate beast entirely. It’s a time-series database built to be long-term storage from day one. Different architecture, different query language quirks, different operational model. That deserves its own article, so we’ll skip it here.

Mimir is Grafana’s answer. It’s Thanos’s more organized cousin. Instead of sidecars and separate components, Mimir runs in a few flavors: monolithic mode (everything in one binary for labs), or fully distributed mode (horizontal scaling for production). It uses object storage as a backing layer—S3, MinIO, Google Cloud Storage, whatever—but the operational story is cleaner. Fewer moving parts than Thanos, more opinionated, better documentation.

For a home lab or small-to-mid company? Mimir wins on simplicity.

How Mimir Actually Works

Mimir is a long-term storage system that sits alongside your existing Prometheus setup. Your Prometheus stays exactly as it is. Mimir doesn’t replace it; it supplements it.

Here’s the flow:

  1. Prometheus scrapes metrics and stores them locally (15-day default retention).
  2. Prometheus is configured with a remote_write endpoint pointing to Mimir.
  3. Every metric Prometheus sees gets sent to Mimir in real-time.
  4. Mimir accepts the data, compresses it, and stores it in object storage (S3, MinIO, etc.).
  5. Your Grafana dashboard points to Prometheus for recent data (fast, cached), and can query Mimir for older data via a separate Mimir datasource.

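Or, to keep dashboards simple, you point Grafana only at Mimir. Mimir never queries Prometheus: it answers recent queries from its ingesters (which hold the newest samples in memory and on local disk) and older ones from object storage. Since remote_write delivers samples within seconds, Mimir alone can serve the whole time range. Either way works.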

The architecture is this: Mimir runs as a horizontally scalable system. You can run it in monolithic mode (a single binary with the distributor, ingester, querier, compactor, and store-gateway all in one process) for a home lab, or split those components into separate deployments for production. It doesn’t scale itself, but each component scales out independently. With replication enabled, if one ingester crashes the others still hold copies of its data. If you need more query throughput, you spin up more queriers. This is how you avoid the ceiling Prometheus hits.
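To make the two modes concrete: which components a Mimir process runs is controlled by its target setting. A minimal sketch, written as a fragment of the mimir-config.yaml used later in this post:

```yaml
# mimir-config.yaml (fragment)
# "all" is the default target: the core components run together in one
# process, which is the monolithic mode described above.
target: all
```

A distributed deployment runs the same binary several times, each with a different target (for example -target=ingester or -target=querier). It also needs a ring/KV store that is shared between processes, which the single-process config in this post does not set up.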

The Minimum Viable Mimir Setup

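Let’s deploy Mimir for a home lab using Docker Compose. You’ll need Docker and Docker Compose, and the stack runs four containers: MinIO (S3-compatible object storage), Mimir, Prometheus, and Grafana.

Here’s a minimal docker-compose.yml: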

version: "3.8"
services:
  # S3-compatible object storage that Mimir writes its blocks to.
  # Mimir expects the mimir-blocks bucket to exist: create it once in the
  # MinIO console (port 9001) before relying on the stack.
  minio:
    image: minio/minio:latest
    ports:
      - "9000:9000"
      - "9001:9001"
    environment:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: minioadmin
    command: server /data --console-address ":9001"
    volumes:
      - minio_data:/data
  # Mimir in monolithic mode: every component in one container.
  mimir:
    image: grafana/mimir:latest
    ports:
      - "9009:9009"
    volumes:
      - ./mimir-config.yaml:/etc/mimir/mimir.yaml
    command: -config.file=/etc/mimir/mimir.yaml
    depends_on:
      - minio
  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - "--config.file=/etc/prometheus/prometheus.yml"
      - "--storage.tsdb.path=/prometheus"
      - "--storage.tsdb.retention.time=15d"
  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      GF_SECURITY_ADMIN_PASSWORD: admin
    volumes:
      - grafana_data:/var/lib/grafana
volumes:
  minio_data:
  prometheus_data:
  grafana_data:

Now the Mimir config (mimir-config.yaml):

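# Single tenant: no X-Scope-OrgID header needed on reads or writes.
multitenancy_enabled: false
server:
  # Mimir listens on 8080 by default; the compose file and Prometheus expect 9009.
  http_listen_port: 9009
ingester:
  ring:
    kvstore:
      store: inmemory
    # Only one ingester here. The default of 3 would reject writes.
    replication_factor: 1
blocks_storage:
  tsdb:
    dir: /tmp/mimir-tsdb
  bucket_store:
    sync_dir: /tmp/mimir-sync
  backend: s3
  s3:
    endpoint: minio:9000
    access_key_id: minioadmin
    secret_access_key: minioadmin
    insecure: true
    bucket_name: mimir-blocks
compactor:
  data_dir: /tmp/mimir-compactor
  sharding_ring:
    kvstore:
      store: inmemory
store_gateway:
  sharding_ring:
    kvstore:
      store: inmemory
# Per-tenant limits. Field names as of recent Mimir releases; check the
# configuration reference for the version you run.
limits:
  max_cache_freshness: 10m
  max_global_series_per_user: 10000000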
And wire up Prometheus’s remote_write to Mimir. In your prometheus.yml:

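global:
  scrape_interval: 15s
remote_write:
  # Mimir's remote-write endpoint is /api/v1/push.
  - url: http://mimir:9009/api/v1/push
    queue_config:
      capacity: 10000
      # max_retries is no longer a queue_config option in current Prometheus;
      # failed sends are retried with backoff between the two bounds below.
      min_backoff: 100ms
      max_backoff: 5s
scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']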

That’s it. Prometheus now ships every metric to Mimir. Mimir buffers, compresses, and stages it to MinIO. You can query 15 days of Prometheus data plus everything Mimir’s seen since you turned it on.

Querying Across Time

In Grafana, you add a datasource of type Prometheus pointing to http://mimir:9009/prometheus. It speaks the same PromQL as Prometheus, so your dashboards don’t change. Grafana won’t switch datasources for you, though: each panel queries whichever datasource it’s set to, so a panel that needs data older than Prometheus’s retention has to point at Mimir.

Want to compare CPU usage across the last 12 months? Query Mimir. Want to see the last hour? Prometheus is faster (local SSD). A datasource variable on the dashboard lets you flip between the two, or you point everything at Mimir and stop thinking about it.
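If you’d rather not click through the UI, Grafana can load both datasources from a provisioning file. A minimal sketch, assuming the service names and ports from the compose file above; the file name datasources.yaml is arbitrary, and it has to be mounted into the Grafana container under /etc/grafana/provisioning/datasources/:

```yaml
# datasources.yaml (file name is an assumption)
apiVersion: 1
datasources:
  # Recent data, served from Prometheus's local disk.
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
  # Long-term data. Mimir exposes a Prometheus-compatible API,
  # so the datasource type is still "prometheus".
  - name: Mimir
    type: prometheus
    access: proxy
    url: http://mimir:9009/prometheus
```

To use it, add a line like ./datasources.yaml:/etc/grafana/provisioning/datasources/datasources.yaml to the grafana service’s volumes and restart the container.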

The Cost Question

This is the part nobody talks about openly: running long-term storage costs money.

If you’re using S3 in AWS, you’re paying for object storage (~$0.023/GB/month for S3 Standard in the US), plus API calls (list/put are cheap, get is cheaper per call). A year of metrics for a moderately busy system (10K samples/sec) might be 2-3 TB, so once that year has piled up you’re looking at roughly $46-69/month in storage alone, plus data transfer if Grafana queries it often. That adds up.
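The storage line item is just stored gigabytes times the per-GB price. As a quick worked check, assuming S3 Standard at $0.023/GB/month, 1 TB = 1000 GB, and ignoring request and transfer charges:

```latex
\begin{align*}
\text{monthly storage cost} &\approx \text{stored GB} \times \$0.023 \\
2\ \text{TB}: \quad 2000 \times 0.023 &= \$46 \text{ per month} \\
3\ \text{TB}: \quad 3000 \times 0.023 &= \$69 \text{ per month}
\end{align*}
```

That is the bill once a full year of data is sitting in the bucket. It starts near zero and grows as blocks accumulate, and the 2-3 TB figure itself is an estimate that depends heavily on cardinality.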

MinIO in a home lab? Disk costs nothing extra (you own the drives), electricity is pennies, and complexity is low.

If you’re at a company and metrics are a compliance requirement, you’ll swallow the cost. S3 for a year of metrics is cheaper than a second full-time ops person.

But if you’re a home lab and just curious about historical trends, bumping Prometheus retention to 30d and calling it a day might be the smarter move.

When You Actually Need Long-Term Storage

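You need Mimir (or Thanos, or VictoriaMetrics) when one of the situations from earlier applies: you’re tracking thousands of metrics, you need historical data for capacity planning or year-over-year trends, or compliance says “keep 12 months”.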

If none of these apply, Prometheus’s 15-day default is fine. Running Mimir adds operational overhead. You maintain MinIO or S3, tune compactors, debug query latency across two systems. Only add complexity if the benefit is real.

For a small home lab, Prometheus + a bigger disk usually wins. For anything larger, Mimir + object storage becomes cheaper and simpler than scaling Prometheus vertically.

Pick the tool that matches your scale. And if 15 days is enough? Sleep better at night with a simpler system.

