fio: Real Disk Benchmarking

By SumGuy 12 min read
Every Drive’s Marketing Spec Is a Lie

That 100,000 IOPS number on your SSD’s datasheet? That’s a lab fairy tale. Sequential reads under perfect conditions, zero network jitter, zero other workloads, probably on a cold disk. Your database doesn’t work that way. Your storage cluster doesn’t work that way. Real workloads are messier: random access patterns, queue depths all over the place, latency that spikes at 3 AM.

Here’s the thing: you need actual numbers from your storage under your workload. That’s where fio comes in. It’s a disk I/O benchmark tool that lets you define exactly how your application hammers the disk, then gives you honest performance data: throughput, IOPS, latency percentiles. No marketing. No lab conditions. Just brutal truth.

If you’re choosing storage for a homelab, running a self-hosted database, or just tired of guessing whether your NVMe is actually faster than that SATA drive, fio is your answer.


Why fio Matters More Than Vendor Specs

Datasheet IOPS are usually peak figures: small random reads on a fresh, mostly empty drive, at a high queue depth, with nothing else touching it. That’s great if your workload looks like that. But real databases? They run at whatever queue depth the application actually produces. They mix reads and writes. They hit a drive that has been filling up for months. A disk that claims 100K IOPS might only do 5K under your actual pattern.

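fio lets you replicate your workload: the access pattern (sequential or random), the block size, the queue depth, the read/write mix, and the number of parallel jobs.

You can test local SSDs and HDDs, ZFS pools, and network storage such as NFS, iSCSI, or Ceph.

And you’ll get reproducible numbers you can trust.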


Installing fio

On most distros, it’s in the package manager:

Terminal window
# Debian/Ubuntu
sudo apt install fio
# RHEL/CentOS/Fedora
sudo dnf install fio
# Alpine
apk add fio
# macOS (via Homebrew)
brew install fio

Verify the install:

Terminal window
fio --version

That’s it. No complicated dependencies.


The Core Concepts

Before you run benchmarks, understand what you’re configuring:

Jobs: One benchmark test. A job defines what the disk does (read, write, random, sequential) and how (block size, queue depth, number of threads).

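ioengine: How fio talks to the disk. The two used in this post are libaio (Linux native asynchronous I/O) and io_uring (the newer Linux interface, kernel 5.1+).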

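iodepth: How many operations fio keeps in flight at once. iodepth=1 = one at a time (what synchronous code does, and the right setting for measuring per-request latency). iodepth=32 or 64 = 32–64 ops in flight (realistic for a busy database). It only takes effect with an asynchronous ioengine such as libaio or io_uring.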
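A rough way to see why queue depth moves the numbers so much is Little’s Law: at a fixed per-request latency, the IOPS ceiling is about queue depth divided by latency. The sketch below is arithmetic only. The helper name iops_ceiling and the 100 µs latency are illustrative assumptions, and real drives stop scaling once their internal parallelism is used up.

```shell
#!/bin/sh
# Little's Law sketch: IOPS ceiling ~= iodepth / per-request latency.
# iops_ceiling is a hypothetical helper; the latency figure is an assumed example.
iops_ceiling() {
    depth=$1        # requests kept in flight (fio's iodepth)
    latency_us=$2   # assumed average completion latency, in microseconds
    echo $(( depth * 1000000 / latency_us ))
}

iops_ceiling 1 100    # one request in flight at 100 us each
iops_ceiling 32 100   # 32 requests in flight at the same latency
```

With the assumed 100 µs latency this prints 10000 and then 320000, which is why an iodepth=1 run and an iodepth=32 run of the same drive are not comparable.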

numjobs: Number of clones of the job, each a separate process (or a thread, if you set the thread option). One job with numjobs=4 = four workers running the same test in parallel.

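rw: Read/write pattern. The values used in this post are read (sequential read), randread (random read), and randrw (mixed random read/write). write and randwrite are the write-side equivalents.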

bs: Block size. How much data per operation. 4K is tiny (databases). 1M is huge (sequential throughput). You can also give a range instead of a fixed size: bsrange=4k-512k.

size: Total amount of data to test. fio creates a file or uses raw device space. Bigger = more realistic (avoids cache artifacts).

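runtime: How long the test runs, in seconds. Combined with time_based, fio keeps going for the full runtime even after it has covered size once, looping over the same data. Use it to stress the disk over time.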

direct: Use O_DIRECT (bypass filesystem cache). Almost always 1 for honest storage benchmarks. If you skip this, you’re measuring your RAM cache, not your disk.


Four Core Workload Patterns

1. Sequential Throughput

What it measures: How fast can you read or write a huge file? This is your disk’s best-case scenario (and what vendor specs love to show).

Terminal window
fio --name=seq-read \
--ioengine=libaio \
--iodepth=4 \
--rw=read \
--bs=1m \
--size=1g \
--direct=1 \
--numjobs=1 \
--runtime=30 \
--time_based

You’ll see throughput in MB/s. This is where an NVMe shines (5000+ MB/s), while an old HDD gasps (100–200 MB/s).

2. Random 4K Performance

What it measures: How many small operations can the disk handle per second? This is what matters for databases, filesystems, and virtual machines.

Terminal window
fio --name=random4k \
--ioengine=libaio \
--iodepth=32 \
--rw=randread \
--bs=4k \
--size=10g \
--direct=1 \
--numjobs=4 \
--runtime=60 \
--time_based

Look for IOPS in the thousands. A good NVMe does 100K+ IOPS. A SATA SSD does 20K–50K. An HDD does 100–300 IOPS (ouch).

3. Mixed Workload (Real-World DB)

What it measures: Your database isn’t pure reads or pure writes. It’s often 70% reads and 30% writes. This gets closer to reality.

Terminal window
fio --name=mixed-workload \
--ioengine=libaio \
--iodepth=16 \
--rw=randrw \
--rwmixread=70 \
--bs=4k \
--size=10g \
--direct=1 \
--numjobs=4 \
--runtime=60 \
--time_based

IOPS will drop compared to pure reads because writes are slower. But this is honest.

4. Latency Profile

What it measures: How consistent is the disk? A drive with 99.99th percentile latency of 50 ms will ruin your user experience even if average latency is 5 ms.

Terminal window
fio --name=latency-test \
--ioengine=libaio \
--iodepth=1 \
--rw=randread \
--bs=4k \
--size=10g \
--direct=1 \
--numjobs=1 \
--runtime=120 \
--time_based \
--output=latency.json \
--output-format=json

Run this long (2+ minutes) to get good percentile data. The JSON output includes 99th, 99.9th, and 99.99th percentile latencies.


Reading fio Output (The Good, the Bad, the Ugly)

When fio finishes, you get a wall of text. Here’s what matters:

read: IOPS=25435, BW=99.3MiB/s (104MB/s)(5960MiB/60001msec)
slat (nsec): min=1852, max=45603, avg=2892.13, stdev=892.44
clat (nsec): min=1234, max=156789, avg=1256.12, stdev=5234.55
clat percentiles (nsec):
50.00th=[ 1234], 90.00th=[ 1892], 99.00th=[ 2456], 99.90th=[ 3892], 99.99th=[12456]
lat (nsec): min=3456, max=158901, avg=4148.25, stdev=6123.67

BW (bandwidth): Throughput in MB/s. Compare this across drives. Higher is better.

IOPS: Operations per second. Compare across drives. Higher is better.

slat (submission latency): Time between fio asking the kernel to do I/O and the kernel accepting it. Usually a few microseconds. Ignore unless it’s huge (milliseconds).

clat (completion latency): Time from submission to completion. This is what your application feels. Lower is better.

lat (total latency): slat + clat.

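percentiles: The big one. Look at 99.00th, 99.90th, 99.99th. In the sample above the values are in nanoseconds: 99% of requests finish within about 2.5 µs, but the 99.99th percentile is about 12.5 µs, five times slower. Scale that to 2 ms versus 12 ms on a loaded drive and your tail latency is rough. Some users will experience that slowness.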
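One quick sanity check on any fio result is that bandwidth should come out close to IOPS times block size. The sketch below redoes that arithmetic for the sample output above; bw_mib is a hypothetical helper name, not part of fio.

```shell
#!/bin/sh
# Sanity check: bandwidth should be roughly IOPS x block size.
# bw_mib is a hypothetical helper, not part of fio.
bw_mib() {
    iops=$1     # IOPS as reported by fio
    bs_kib=$2   # block size in KiB (4 for bs=4k)
    awk -v i="$iops" -v b="$bs_kib" 'BEGIN { printf "%.1f\n", i * b / 1024 }'
}

bw_mib 25435 4   # the sample run: 25435 IOPS at bs=4k
```

This prints 99.4, in line with the 99.3MiB/s in the sample (fio derives its figure from bytes transferred over elapsed time, so the last digit can differ). If the two numbers are far apart, the run probably used a different block size than you thought.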


libaio vs io_uring: The Engine Debate

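libaio (Linux native AIO): Stable, well-tested, available on practically every Linux system. It is not fio’s default (on Linux that is the synchronous psync engine), so set it explicitly. Use this unless you have a reason not to.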

Terminal window
fio --name=test --ioengine=libaio ...

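io_uring (Linux 5.1+): Newer, lower overhead per I/O, more flexible. If your kernel supports it (5.1 or later), try it. A gain of 10–15% is a rough figure; how much you actually see depends on the drive, the CPU, and the kernel, so measure both engines on your own hardware.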

Terminal window
fio --name=test --ioengine=io_uring ...

To check if io_uring is available:

Terminal window
# /proc/sys/kernel/io_uring_disabled only exists on kernel 6.6+, so ask fio which engines it was built with
fio --enghelp | grep -qw io_uring && echo "io_uring present" || echo "not available"

If you’re on an older kernel or unsure, stick with libaio.
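For a kernel-side check, a minimal sketch is to compare the running kernel against 5.1 and then look at the io_uring_disabled sysctl where it exists (kernel 6.6 and later). The helper names kernel_at_least and io_uring_usable are made up for this example, and the check is not exhaustive: a container seccomp profile may still block io_uring even when both tests pass.

```shell
#!/bin/sh
# Sketch: decide whether io_uring is worth trying on this machine.
# kernel_at_least and io_uring_usable are hypothetical helper names.

# True if a release string such as "6.8.0-45-generic" is >= MAJOR.MINOR.
kernel_at_least() {
    release=$1; want_major=$2; want_minor=$3
    major=${release%%.*}          # text before the first dot
    rest=${release#*.}            # text after the first dot
    minor=${rest%%[!0-9]*}        # leading digits of that remainder
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}

io_uring_usable() {
    [ "$(uname -s)" = "Linux" ] || return 1
    kernel_at_least "$(uname -r)" 5 1 || return 1
    # The sysctl exists on 6.6+ only; a value of 2 disables io_uring for everyone.
    knob=/proc/sys/kernel/io_uring_disabled
    if [ -r "$knob" ] && [ "$(cat "$knob")" = "2" ]; then
        return 1
    fi
    return 0
}

if io_uring_usable; then
    echo "try --ioengine=io_uring"
else
    echo "stick with libaio"
fi
```

A value of 1 in the sysctl restricts io_uring to privileged processes, which the sketch treats as usable because the raw-device tests in this post run under sudo anyway.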


Saving Tests as Job Files

Typing long fio commands gets tedious. Save them as job files (INI format) and reuse them:

random4k.fio
[global]
ioengine=libaio
direct=1
group_reporting=1
time_based=1
runtime=60
[random4k-test]
rw=randread
bs=4k
iodepth=32
numjobs=4
size=10g

Run it:

Terminal window
fio random4k.fio

Much cleaner. You can commit these to git and version them over time.
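A job file can also hold several tests. By default fio starts every job in a file at the same time, so a sequential and a random test would compete for the disk; the stonewall option makes a job wait until the ones before it have finished. The sketch below writes such a file from the shell. The file name suite.fio and the job names are illustrative.

```shell
#!/bin/sh
# Sketch: write a two-job file; stonewall makes the second job wait for the first.
# The file name suite.fio and the job names are illustrative.
cat > suite.fio <<'EOF'
[global]
ioengine=libaio
direct=1
group_reporting=1
time_based=1
runtime=60
size=10g

[seq-read]
rw=read
bs=1m
iodepth=4

[rand-read]
stonewall
rw=randread
bs=4k
iodepth=32
numjobs=4
EOF

echo "wrote suite.fio; run it with: fio suite.fio"
```

Settings in [global] apply to both jobs, so only the parts that differ need repeating in each section.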


Real-World Examples

NVMe Benchmark

You bought a fancy NVMe. Prove it’s worth the money:

Terminal window
fio --name=nvme-seq \
--ioengine=io_uring \
--iodepth=32 \
--rw=read \
--bs=256k \
--size=20g \
--direct=1 \
--numjobs=1 \
--runtime=60 \
--time_based
fio --name=nvme-rand \
--ioengine=io_uring \
--iodepth=32 \
--rw=randread \
--bs=4k \
--size=20g \
--direct=1 \
--numjobs=8 \
--runtime=60 \
--time_based

Expect: sequential 3000–7000 MB/s, random 50K–500K IOPS depending on your drive.

SATA SSD Benchmark

The workhorse drive in most homelabs:

Terminal window
fio --name=sata-seq \
--ioengine=libaio \
--iodepth=4 \
--rw=read \
--bs=256k \
--size=10g \
--direct=1 \
--numjobs=1 \
--runtime=60 \
--time_based
fio --name=sata-rand \
--ioengine=libaio \
--iodepth=16 \
--rw=randread \
--bs=4k \
--size=10g \
--direct=1 \
--numjobs=4 \
--runtime=60 \
--time_based

Expect: sequential 400–600 MB/s, random 15K–50K IOPS.

HDD (Spinning Rust)

For bulk storage or archival. Latency will be brutal:

Terminal window
fio --name=hdd-seq \
--ioengine=libaio \
--iodepth=1 \
--rw=read \
--bs=1m \
--size=5g \
--direct=1 \
--numjobs=1 \
--runtime=60 \
--time_based
fio --name=hdd-rand \
--ioengine=libaio \
--iodepth=1 \
--rw=randread \
--bs=4k \
--size=5g \
--direct=1 \
--numjobs=1 \
--runtime=60 \
--time_based

Expect: sequential 100–200 MB/s (if lucky), random 100–300 IOPS. This is why you don’t run databases on HDDs.

ZFS Pool Benchmark

Testing your mirror or RAIDZ pool (ZFS’s equivalents of RAID-1 and RAID-5):

Terminal window
# Test on /mnt/tank (your ZFS pool)
fio --name=zfs-test \
--ioengine=libaio \
--iodepth=32 \
--rw=randrw \
--rwmixread=70 \
--bs=4k \
--size=5g \
--directory=/mnt/tank \
--numjobs=4 \
--runtime=60 \
--time_based

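Note: no direct=1 here because ZFS manages its own caching (the ARC). Let it do its thing, but remember that the numbers then include cache hits. If you want to see what the disks themselves can do, make the total test size comfortably larger than your RAM.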


Common Mistakes and How to Avoid Them

Forgetting direct=1: Without it, fio measures your RAM cache, not your disk. Always use --direct=1 unless you’re specifically testing cache performance.

Running on a mounted filesystem with heavy caching: Even with direct=1, if your filesystem is doing COW (copy-on-write) or compression, you’ll get weird results. Test on a dedicated partition or raw device if possible.

Not dropping caches between tests: Linux caches aggressively. Between benchmark runs, clear the cache:

Terminal window
sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'

Then run your benchmark. Same command, clean slate.

Using iodepth=1 for everything: It’s realistic for synchronous code but misleading for async workloads. Most modern systems queue up 4–32 operations. Test with realistic iodepth.

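Not running long enough: fio has no default time limit. Without runtime it stops as soon as size has been transferred, which can take only a few seconds on a fast drive. Set runtime together with time_based and run at least 60 seconds, ideally 120. SSDs warm up, caches settle, you get better data.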

Testing on a small dataset: If you test 100 MB on a 1 TB drive, you’re hitting a tiny part of the platters/flash. Use at least 10% of the drive’s capacity. A good rule: size = 10 GB for a 100 GB drive, 100 GB for a 1 TB drive.
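The 10% rule is easy to turn into a size= value. This is a small sketch, assuming you already have the device size in bytes; ten_percent_size is a hypothetical helper, and fio reads the g suffix as GiB by default.

```shell
#!/bin/sh
# Sketch: turn a device size in bytes into a fio size= value covering ~10% of it.
# ten_percent_size is a hypothetical helper; fio reads the "g" suffix as GiB by default.
ten_percent_size() {
    bytes=$1
    gib=$(( bytes / 10 / 1073741824 ))
    [ "$gib" -ge 1 ] || gib=1   # never go below 1 GiB
    echo "${gib}g"
}

ten_percent_size 1000000000000   # a nominal 1 TB drive
```

For the 1 TB example this prints 93g. On a real system the byte count can come from something like lsblk -bdno SIZE /dev/sdX (device name is a placeholder). Keep in mind that size applies per job, so with numjobs=4 each worker gets its own file of that size unless they share a filename.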

Comparing different block sizes without thinking: A random 4K benchmark looks different from a random 64K benchmark. They’re testing different things. Be consistent when comparing drives.


Latency Testing and ramp_time

The first few seconds of a benchmark aren’t representative. Disks warm up, caches settle, queue depth stabilizes. Use ramp_time to throw away the warmup:

Terminal window
fio --name=test \
--ioengine=libaio \
--iodepth=32 \
--rw=randread \
--bs=4k \
--size=10g \
--direct=1 \
--numjobs=4 \
--ramp_time=10 \
--runtime=60 \
--time_based

This runs 10 seconds of warmup (discarded), then 60 seconds of real testing.

For latency testing, also check the 99.99th percentile tail. That’s what your user experiences when everything hits at once:

Terminal window
fio --name=latency \
--ioengine=libaio \
--iodepth=1 \
--rw=randread \
--bs=4k \
--size=10g \
--direct=1 \
--numjobs=1 \
--ramp_time=5 \
--runtime=180 \
--time_based \
--output=results.json \
--output-format=json

Parse the JSON to see the percentile breakdown. A drive with low 99.99th latency is solid; one with spiky tail latency is a problem.
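One way to do that parsing is with jq, assuming it is installed. The sketch below pulls the 99th, 99.9th, and 99.99th percentile read completion latencies for the first job and converts them from nanoseconds to microseconds. read_tail_us is a hypothetical helper, and the key names are the ones fio 3.x writes in its JSON output; check them against your own file if your version differs.

```shell
#!/bin/sh
# Sketch: pull tail latencies out of fio's JSON output with jq (assumed installed).
# read_tail_us is a hypothetical helper; it prints p99, p99.9 and p99.99 of the
# first job's read completion latency, converted from nanoseconds to microseconds.
read_tail_us() {
    jq -r '.jobs[0].read.clat_ns.percentile
           | [."99.000000", ."99.900000", ."99.990000"]
           | map(. / 1000)
           | @tsv' "$1"
}

# Usage after a run that wrote results.json:
#   read_tail_us results.json
```

The output is one tab-separated line, which is easy to paste into a spreadsheet or append to a log when you compare drives over time.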


File-Based vs Raw Device Testing

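File-based (what we’ve shown above): Create a file on your filesystem and benchmark it. The commands above don’t set filename or directory, so fio creates its test files in the current directory; run them from the disk you want to test, or name the file explicitly. This includes filesystem overhead. Good for real-world results.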

Terminal window
fio --name=file-test \
--filename=/mnt/disk/test.img \
...

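Raw device: Benchmark the raw disk without a filesystem. This is pure storage performance. Read tests are safe, but any write test against a raw device destroys the data on it, so only do that on a disk you can wipe.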

Terminal window
sudo fio --name=device-test \
--filename=/dev/nvme0n1 \
--direct=1 \
...

For most of you, file-based is fine. Raw device is useful if you’re diagnosing filesystem issues or comparing raw RAID performance.


How to Compare Disks Fairly

Use the same test on all drives:

  1. Same ioengine (libaio or io_uring)
  2. Same iodepth (usually 32)
  3. Same block size (usually 4k or 256k depending on workload)
  4. Same runtime (at least 60 seconds)
  5. Same ramp_time (10 seconds)
  6. Drop caches between runs
  7. Compare IOPS or throughput

Create a job file and use it on each drive:

compare.fio
[global]
ioengine=libaio
direct=1
ramp_time=10
runtime=60
time_based=1
[4k-random-reads]
rw=randread
bs=4k
iodepth=32
numjobs=4
size=10g

Run it from a directory on drive A (fio creates its test files in the current directory) and note the IOPS. Clear caches. Run it from a directory on drive B. Compare.
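If changing directories for every drive gets old, fio expands ${VAR} environment variables in job-file option values, so one job file can be pointed at different mount points. The sketch below writes a variant of the comparison job that reads its target from TESTDIR and wraps the run in a small function. TESTDIR, compare-dir.fio, run_on, and the mount points are hypothetical names chosen for this example.

```shell
#!/bin/sh
# Sketch: point one job file at different drives via an environment variable.
# fio expands ${VAR} in job-file option values; TESTDIR and the mount points
# below are hypothetical names.
cat > compare-dir.fio <<'EOF'
[global]
ioengine=libaio
direct=1
ramp_time=10
runtime=60
time_based=1
directory=${TESTDIR}

[4k-random-reads]
rw=randread
bs=4k
iodepth=32
numjobs=4
size=10g
EOF

# Run the same job against one mount point and keep the result as JSON.
run_on() {
    TESTDIR=$1 fio --output="$2" --output-format=json compare-dir.fio
}

# Usage (mount points are placeholders):
#   run_on /mnt/driveA driveA.json
#   run_on /mnt/driveB driveB.json
```

Keeping one JSON file per drive means the comparison can be redone later without rerunning the benchmark.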


Network Storage (NFS, iSCSI, Ceph)

Same tool works on network storage:

Terminal window
# NFS mount at /mnt/nfs
fio --name=nfs-test \
--ioengine=libaio \
--iodepth=16 \
--rw=randrw \
--rwmixread=70 \
--bs=4k \
--size=5g \
--directory=/mnt/nfs \
--numjobs=4 \
--runtime=60 \
--time_based

Latency will be higher (network overhead), but you’ll see how your storage cluster actually performs. This is valuable for homelab setups with Ceph or iSCSI.


Your fio Cheat Sheet

Save these. Commit them to a git repo. Use them whenever you’re evaluating storage.

Quick sequential throughput:

Terminal window
fio --name=seq --ioengine=libaio --iodepth=4 --rw=read --bs=1m --size=10g --direct=1 --runtime=60 --time_based

Quick random 4K IOPS:

Terminal window
fio --name=rand4k --ioengine=libaio --iodepth=32 --rw=randread --bs=4k --size=10g --direct=1 --numjobs=4 --runtime=60 --time_based

Database-like mixed workload:

Terminal window
fio --name=mixed --ioengine=libaio --iodepth=16 --rw=randrw --rwmixread=70 --bs=4k --size=10g --direct=1 --numjobs=4 --runtime=60 --time_based

Latency tail (99.99th percentile):

Terminal window
fio --name=lat --ioengine=libaio --iodepth=1 --rw=randread --bs=4k --size=10g --direct=1 --ramp_time=5 --runtime=180 --time_based --output=lat.json --output-format=json

The Honest Conclusion

You now have a tool that tells the truth about your storage. No marketing BS. No lab conditions. Just real numbers from real workloads.

The next time a vendor tells you their drive does “100,000 IOPS,” you can smile and run fio on your actual workload. Maybe it does 100K. Maybe it does 5K. Now you know.

Your 2 AM self will thank you when a storage bottleneck doesn’t crater your database at 3 AM.

