Mimir + Grafana: Long-Term Prometheus Storage

Prometheus Drops Your Data After 15 Days. Surprise.

You’ve been running Prometheus for three months. Dashboards look good. Alerts fire when they should. Then someone asks: “Can you show me what CPU usage looked like last month?” You open Prometheus UI, query for data from 40 days ago, and get back nothing. Flat line. Silence. Your metrics are gone.

This isn’t a bug. It’s a design choice.

Prometheus retains 15 days of data by default. You can raise that, but stretching retention to “store forever” on a single box means ever-growing disk, slower long-range queries, and one machine holding your entire history. Prometheus stores everything in a local time-series database optimized for speed, not capacity. It’s a car, not a truck. Great for real-time dashboards and alerting. Terrible for capacity planning, year-over-year trend analysis, or compliance audits that ask for 12 months of data.

This is where long-term storage comes in. And if you’re running a home lab or small business, Grafana Mimir is probably the right move.

Why Prometheus Ain’t a Data Warehouse

Before we talk solutions, let’s understand the problem. Prometheus’s local storage is optimized for write speed and query latency. Every incoming metric gets compressed into blocks on disk, and blocks get compacted into larger ones over time. This design scales vertically (more RAM, bigger SSD) until you hit the ceiling where a single machine can’t keep up. There’s no prize for running a monitoring system so resource-heavy that it becomes a single point of failure (SPOF) itself.

The solution isn’t to buy a bigger server. It’s to offload old data somewhere cheaper and keep Prometheus fast.

When Bumping Retention Is Actually Fine

Before you jump to Mimir, be honest: do you actually need long-term storage?

If your home lab has maybe 500 metrics total and you’re only concerned with the last 30 days of data, bumping Prometheus’s --storage.tsdb.retention.time flag to 30d might just work. You’ll need more disk space and a bit more RAM, but it’s simple. No extra services. No S3 bill. No debugging a distributed system at 2 AM.

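The math is simple: Prometheus averages roughly 1-2 bytes per sample on disk after compression (it varies with cardinality and churn). At a 15-second scrape interval, each series produces 5,760 samples a day, or roughly 6-12 KB. 500 series × 30 days comes to roughly 85-175 MB of disk. Cheap. Easy.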
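
For plugging in your own numbers, here is a back-of-the-envelope sketch of that disk estimate. It assumes a steady workload and a default of 1.5 bytes per compressed sample, which is an assumption in the middle of the commonly cited 1-2 byte range; the function name is hypothetical, and index and WAL overhead are ignored.

```python
SECONDS_PER_DAY = 86_400


def estimate_tsdb_bytes(series, retention_days, scrape_interval_s=15, bytes_per_sample=1.5):
    """Rough on-disk size of compressed samples for a steady workload.

    bytes_per_sample is an assumed average; real values depend on
    cardinality, churn, and how compressible the series are.
    """
    samples_per_series_per_day = SECONDS_PER_DAY / scrape_interval_s
    total_samples = series * samples_per_series_per_day * retention_days
    return total_samples * bytes_per_sample


if __name__ == "__main__":
    size = estimate_tsdb_bytes(series=500, retention_days=30)
    print(f"{size / 1e6:.0f} MB")  # 500 series, 30 days, 15s scrape interval
```

Doubling the series count or the retention doubles the result, which is why the estimate stops being “cheap” once you track thousands of series for a year.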

But if you’re tracking thousands of metrics, or you need historical data for capacity planning, or compliance says “keep 12 months”, then a single Prometheus box becomes a money pit. This is where the long-term storage options split into camps.

The Players: Mimir vs Thanos vs VictoriaMetrics

You’ve got three main paths forward.

Thanos is the old guard. It works by sidecar: you run a Thanos sidecar container next to your Prometheus, and the sidecar uploads blocks to object storage (S3, GCS, whatever). Then you run a separate query layer, store gateway, and compactor to stitch it all together. It works, but it’s got more moving parts than a Swiss watch. Each component can fail independently. You’ll spend time debugging why your query layer can’t talk to the store gateway at 3 AM.

VictoriaMetrics is a separate beast entirely. It’s a time-series database built to be long-term storage from day one. Different architecture, different query language quirks, different operational model. That deserves its own article, so we’ll skip it here.

Mimir is Grafana’s answer. It’s Thanos’s more organized cousin. Instead of sidecars and separate components, Mimir runs in a few flavors: monolithic mode (everything in one binary for labs), or fully distributed mode (horizontal scaling for production). It uses object storage as a backing layer (S3, MinIO, Google Cloud Storage, whatever), but the operational story is cleaner. Fewer moving parts than Thanos, more opinionated, better documentation.

For a home lab or small-to-mid company? Mimir wins on simplicity.

How Mimir Actually Works

Mimir is a long-term storage system that sits alongside your existing Prometheus setup. Your Prometheus stays exactly as it is. Mimir doesn’t replace it; it supplements it.

Here’s the flow:

Prometheus scrapes metrics and stores them locally (15-day default retention).
Prometheus is configured with a remote_write endpoint pointing to Mimir.
Every sample Prometheus scrapes gets sent to Mimir in near real time.
Mimir’s ingesters accept the data, keep recent samples on local disk, and periodically upload compressed blocks to object storage (S3, MinIO, etc.).
Your Grafana dashboard points to Prometheus for recent data (fast, cached), and can query Mimir for older data via a separate Mimir datasource.

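Or, to keep dashboards simple, you point Grafana only at Mimir. Mimir never queries Prometheus; it doesn’t need to. Remote write delivers samples within seconds, and Mimir’s ingesters answer queries for recent data that hasn’t reached object storage yet, so a single Mimir datasource covers both the last five minutes and the last five months. Either way works.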

The architecture is this: Mimir is designed to scale horizontally. You can run it in monolithic mode (a single binary with the distributor, ingester, querier, compactor, and store-gateway all in one process) for a home lab, or split the components into separate deployments for production. In a replicated deployment, if one ingester crashes, the remaining replicas keep serving its series. If you need more query throughput, you add more queriers. This is how you avoid the ceiling Prometheus hits.

The Minimum Viable Mimir Setup

Let’s deploy Mimir for a home lab using Docker Compose. You’ll need:

Mimir (obviously)
A backing store (we’ll use MinIO, which is S3-compatible and runs on a single machine)
Prometheus with remote_write configured
Grafana

Here’s a minimal docker-compose.yml:

services:
  minio:
    image: minio/minio:latest
    ports:
      - "9000:9000"
      - "9001:9001"
    environment:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: minioadmin
    command: server /data --console-address ":9001"
    volumes:
      - minio_data:/data

  mimir:
    image: grafana/mimir:latest
    ports:
      - "9009:9009"
    volumes:
      - ./mimir-config.yaml:/etc/mimir/mimir.yaml
    command: -config.file=/etc/mimir/mimir.yaml
    depends_on:
      - minio

  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - "--config.file=/etc/prometheus/prometheus.yml"
      - "--storage.tsdb.path=/prometheus"
      - "--storage.tsdb.retention.time=15d"

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      GF_SECURITY_ADMIN_PASSWORD: admin
    volumes:
      - grafana_data:/var/lib/grafana

volumes:
  minio_data:
  prometheus_data:
  grafana_data:

Now the Mimir config (mimir-config.yaml):

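multitenancy_enabled: false

# The compose file and remote_write URL use 9009; Mimir listens on 8080 unless told otherwise.
server:
  http_listen_port: 9009

ingester:
  ring:
    kvstore:
      store: inmemory
    # Single instance, so keep one copy of each series.
    replication_factor: 1

blocks_storage:
  tsdb:
    dir: /tmp/mimir-tsdb
  bucket_store:
    sync_dir: /tmp/mimir-sync
  backend: s3
  s3:
    endpoint: minio:9000
    access_key_id: minioadmin
    secret_access_key: minioadmin
    insecure: true
    # Create this bucket in the MinIO console (port 9001) before starting Mimir.
    bucket_name: mimir-blocks

compactor:
  data_dir: /tmp/mimir-compactor
  sharding_ring:
    kvstore:
      store: inmemory

store_gateway:
  sharding_ring:
    kvstore:
      store: inmemory
    replication_factor: 1

limits:
  # Don't cache query results newer than this.
  max_cache_freshness: 10m
  # Cap on active series per tenant.
  max_global_series_per_user: 10000000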

And wire up Prometheus’s remote_write to Mimir. In your prometheus.yml:

global:
  scrape_interval: 15s

remote_write:
  - url: http://mimir:9009/api/v1/push
    queue_config:
      capacity: 10000
      # Retry delay starts at min_backoff and doubles up to max_backoff.
      min_backoff: 100ms
      max_backoff: 5s

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']
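
A note on the two backoff settings in queue_config: when a send to Mimir fails with a retryable error, Prometheus waits min_backoff, then doubles the delay on each further retry until it reaches max_backoff. The sketch below only illustrates that doubling rule; the function name is hypothetical and this is not Prometheus’s actual code.

```python
def backoff_schedule(min_backoff_ms, max_backoff_ms, attempts):
    """Return the delay (in ms) before each retry, doubling up to the cap."""
    delays = []
    delay = min_backoff_ms
    for _ in range(attempts):
        delays.append(delay)
        delay = min(delay * 2, max_backoff_ms)
    return delays


if __name__ == "__main__":
    # A wide gap between min and max gives a struggling Mimir room to recover.
    print(backoff_schedule(100, 5000, 8))
    # Setting both bounds to the same value means retries never slow down.
    print(backoff_schedule(100, 100, 4))
```

If the two values are equal, every retry fires at the same fixed interval, which is rarely what you want when the receiver is already overloaded.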

That’s it. Prometheus now ships every metric to Mimir. Mimir buffers, compresses, and stages it to MinIO. You can query 15 days of Prometheus data plus everything Mimir’s seen since you turned it on.
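
To confirm that samples are actually arriving, you can ask Mimir the same question you would ask Prometheus. The sketch below uses the Prometheus-compatible instant query endpoint under the /prometheus prefix. It assumes Mimir’s HTTP port is published on localhost:9009 and that multitenancy is disabled (so no tenant header is needed); the helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Mimir's HTTP port is reachable on localhost:9009.
MIMIR_BASE = "http://localhost:9009/prometheus"


def instant_query_url(promql, base_url=MIMIR_BASE):
    """Build a Prometheus-compatible instant query URL."""
    params = urllib.parse.urlencode({"query": promql})
    return f"{base_url}/api/v1/query?{params}"


def parse_vector(payload):
    """Turn an instant-query response into (labels, value) pairs."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    return [(s["metric"], float(s["value"][1])) for s in payload["data"]["result"]]


def query_mimir(promql):
    """Run the query over HTTP; requires the stack to be running."""
    with urllib.request.urlopen(instant_query_url(promql), timeout=10) as resp:
        return parse_vector(json.load(resp))
```

Calling query_mimir("up") a minute or so after startup should return one entry for the prometheus job. An empty list usually means remote_write isn’t reaching Mimir, so check the Prometheus logs first.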

Querying Across Time

In Grafana, you add a Mimir datasource pointing to http://mimir:9009/prometheus. It speaks the same PromQL as Prometheus, so your queries don’t change. Grafana won’t route between datasources for you, though: each panel queries whichever datasource it’s configured with, so a dashboard that needs history has to point at Mimir.

Want to compare CPU usage across the last 12 months? Query Mimir. Want to see the last hour? Prometheus is faster (local SSD), but Mimir has that data too. If you’d rather not think about it, use Mimir for everything.
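
If you keep both datasources, the choice comes down to how far back a query reaches. Here is a minimal sketch of that rule, assuming the 15-day Prometheus retention used in this setup; the function and the datasource names it returns are hypothetical labels, not anything Grafana exposes.

```python
from datetime import datetime, timedelta, timezone

# Assumption: matches --storage.tsdb.retention.time=15d from the compose file.
PROMETHEUS_RETENTION = timedelta(days=15)


def pick_datasource(range_start, now=None, retention=PROMETHEUS_RETENTION):
    """Return which datasource can serve a query starting at range_start."""
    now = now or datetime.now(timezone.utc)
    # Anything older than local retention only exists in Mimir.
    return "Prometheus" if now - range_start <= retention else "Mimir"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(pick_datasource(now - timedelta(hours=1), now))
    print(pick_datasource(now - timedelta(days=40), now))
```

In practice this is a decision you make per dashboard or per panel when you pick its datasource.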

The Cost Question

This is the part nobody talks about openly: running long-term storage costs money.

If you’re using S3 in AWS, you’re paying for object storage (~$0.023/GB/month in the US), plus API calls (list/put are cheap, get is cheaper per call). A year of metrics for a moderately busy system (10K samples/sec) is about 315 billion samples. At 1-2 bytes per compressed sample that’s a few hundred GB; budget 2-3 TB if you want headroom for index overhead and high-cardinality churn. That puts storage somewhere between roughly $10 and $75/month, plus data transfer if Grafana queries it often. That adds up.
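
The storage bill is easy to sketch once you pick a size. The snippet below assumes the $0.023/GB/month rate quoted above and, for the sample-based estimate, an assumed 1.5 bytes per compressed sample with no index overhead, so treat that figure as a floor. Function names are hypothetical, and request and transfer charges are not modeled.

```python
SECONDS_PER_DAY = 86_400


def compressed_gb(samples_per_sec, retention_days=365, bytes_per_sample=1.5):
    """Approximate size of the stored samples in GB (index overhead excluded)."""
    total_samples = samples_per_sec * SECONDS_PER_DAY * retention_days
    return total_samples * bytes_per_sample / 1e9


def monthly_storage_usd(stored_gb, usd_per_gb_month=0.023):
    """Monthly object-storage charge for a given amount of data at rest."""
    return stored_gb * usd_per_gb_month


if __name__ == "__main__":
    floor_gb = compressed_gb(10_000)
    print(f"floor: {floor_gb:.0f} GB, ${monthly_storage_usd(floor_gb):.2f}/month")
    print(f"2.5 TB budget: ${monthly_storage_usd(2_500):.2f}/month")
```

Run it with your own ingest rate and retention before deciding whether S3 or a local MinIO disk makes more sense.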

MinIO in a home lab? Disk costs nothing extra (you own the drives), electricity is pennies, and complexity is low.

If you’re at a company and metrics are a compliance requirement, you’ll swallow the cost. S3 for a year of metrics is cheaper than a second full-time ops person.

But if you’re a home lab and just curious about historical trends, bumping Prometheus retention to 30d and calling it a day might be the smarter move.

When You Actually Need Long-Term Storage

You need Mimir (or Thanos, or VictoriaMetrics) when:

Capacity planning: You need to see how resource usage scales month-to-month, year-to-year. You can’t make budget decisions with 15 days of data.
Compliance: Your org says “keep 12 months of audit data.” A single Prometheus box can’t store it cheaply. Mimir + S3 gives you compliance on a realistic budget.
Incident investigation: It’s 2 AM, production is on fire, and someone asks “when did this start?” Historical data from 8 weeks ago answers the question.
Performance tuning: You’re optimizing alerting thresholds and you need to see how metrics behaved over seasons (summer spike vs winter baseline).
SLO tracking: You’re calculating availability and you need 90-day windows of data. Prometheus can’t hold it.

If none of these apply, Prometheus’s 15-day default is fine. Running Mimir adds operational overhead. You maintain MinIO or S3, tune compactors, debug query latency across two systems. Only add complexity if the benefit is real.

For a small home lab, Prometheus + a bigger disk usually wins. For anything larger, Mimir + object storage becomes cheaper and simpler than scaling Prometheus vertically.

Pick the tool that matches your scale. And if 15 days is enough? Sleep better at night with a simpler system.
