
Stirling-PDF: Stop Uploading Your Tax Returns to Sketchy Sites

By SumGuy 8 min read
You’ve Absolutely Used a Sketchy PDF Site

Admit it. You needed to merge two PDFs at 11 PM. You Googled “merge PDF free,” clicked the first result that wasn’t an ad (the second one), uploaded your documents, waited, downloaded the result, and closed the tab hoping nobody noticed.

Those documents were your lease agreement. Or a bank statement. Or — yes — tax forms with your full name, address, and SSN buried in page three.

ilovepdf.com and smallpdf.com aren’t evil. They’re just not yours. Their privacy policies exist. You have not read them. Neither have I. And that’s the problem.

Stirling-PDF fixes this. It’s a single Docker container that gives you a full PDF manipulation suite: merge, split, compress, rotate, OCR, redact, watermark, password protect, repair, convert to/from images, and more. Everything runs locally. Nothing leaves your server.


What’s In the Box

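Stirling-PDF (now maintained under the Stirling-Tools GitHub organization) is a Spring Boot web app with a clean UI and an API. The feature list is embarrassingly long for a free, self-hosted tool: merge, split, compress, rotate, OCR, redact, watermark, password protect, repair, convert to and from images, and more.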

That’s not a features list — that’s a replacement for a $30/month Adobe subscription.


Spinning It Up

One Compose file. Done.

docker-compose.yml
services:
  stirling-pdf:
    image: frooodle/s-pdf:latest
    container_name: stirling-pdf
    restart: unless-stopped
    ports:
      - "8080:8080"
    volumes:
      - ./stirling-config:/configs
      - ./stirling-logs:/logs
      - ./stirling-extras:/usr/share/tessdata # OCR language packs
      - ./stirling-pipeline:/pipeline # optional automation pipeline
    environment:
      - DOCKER_ENABLE_SECURITY=false
      - INSTALL_BOOK_AND_ADVANCED_HTML_OPS=false
      - LANGS=en_GB
Terminal window
mkdir -p stirling-config stirling-logs stirling-extras stirling-pipeline
docker compose up -d

Hit http://your-server:8080 and you’re in. The UI is clean enough that you won’t feel bad handing it to a non-technical family member who keeps asking you to “fix a PDF.”
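The JVM takes a little while to boot, so the page may not answer the instant `docker compose up -d` returns. A minimal sketch of a wait loop, assuming curl is installed; the function name `wait_for_http` is made up for this example, and the URL is whatever your setup uses:

```shell
# Poll a URL about once per second until it answers or the attempts run out.
# The timeout is approximate: each attempt also spends up to 2s inside curl.
wait_for_http() {
  wfh_url=$1
  wfh_timeout=${2:-60}
  wfh_elapsed=0
  while [ "$wfh_elapsed" -lt "$wfh_timeout" ]; do
    # -f: treat HTTP errors as failure, -s: quiet, -o /dev/null: discard the body
    if curl -fs -o /dev/null --max-time 2 "$wfh_url"; then
      echo "up after about ${wfh_elapsed}s"
      return 0
    fi
    sleep 1
    wfh_elapsed=$((wfh_elapsed + 1))
  done
  echo "no response from $wfh_url after ${wfh_timeout} attempts" >&2
  return 1
}

# Example: wait_for_http http://localhost:8080 90
```

If it times out, `docker compose logs stirling-pdf` is the next place to look.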

Persistent Volume Considerations

The ./stirling-config mount is where your settings live. If you nuke the container, your config survives. The ./stirling-extras mount is where Tesseract language packs go — more on that below.

/pipeline is optional and used for automation workflows (chained operations on watched folders). It's useful for bulk processing; skip it for casual use.


OCR Language Packs

Out of the box you get English. If you need German, French, Japanese, or anything else, you need to drop Tesseract .traineddata files into ./stirling-extras.

Terminal window
# Download a language pack (example: German)
wget -O ./stirling-extras/deu.traineddata \
  https://github.com/tesseract-ocr/tessdata/raw/main/deu.traineddata

# Multiple languages at once
for lang in fra spa por; do
  wget -O "./stirling-extras/${lang}.traineddata" \
    "https://github.com/tesseract-ocr/tessdata/raw/main/${lang}.traineddata"
done

After dropping in the files, restart the container and the new languages show up in the OCR dropdown. No rebuild, no config change.
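One gotcha worth checking before you restart: `wget -O` can leave a zero-byte file behind when a download fails, and an empty `.traineddata` file is not a language pack. A small sanity check might look like this; `check_traineddata` is a name invented for this sketch, and the directory is whichever one you mounted:

```shell
# List the language packs in a directory and flag zero-byte files.
# Returns non-zero if any pack is empty.
check_traineddata() {
  ctd_dir=$1
  ctd_status=0
  for ctd_file in "$ctd_dir"/*.traineddata; do
    # If nothing matches, the glob stays literal, so skip non-files
    [ -f "$ctd_file" ] || continue
    ctd_lang=$(basename "$ctd_file" .traineddata)
    if [ -s "$ctd_file" ]; then
      echo "ok    $ctd_lang"
    else
      echo "EMPTY $ctd_lang"
      ctd_status=1
    fi
  done
  return "$ctd_status"
}

# Example: check_traineddata ./stirling-extras
```

Anything marked EMPTY should be downloaded again before you expect it in the OCR dropdown.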

OCR quality is “good enough for scanned documents that weren’t printed by a dot-matrix printer in 1994.” It’s Tesseract — managed expectations apply.


File Size Limits

By default, uploads are capped at 50MB. For most PDF work that’s fine. If you’re trying to OCR a 400-page architectural drawing package, you’ll hit the wall.

Bump it in ./stirling-config/settings.yml:

stirling-config/settings.yml
system:
  maxUploadSize: 500 # MB

Restart the container. Done.

Compression is handled by Ghostscript internally — you pick a quality preset (screen, ebook, printer, prepress) and Stirling calls out to gs. The “ebook” preset is the sweet spot for most documents: good quality, meaningfully smaller files.
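To get a feel for what those presets do, you can run Ghostscript by hand. This is a generic `gs` invocation using the standard `-dPDFSETTINGS` switch, not necessarily the exact command Stirling-PDF issues internally; the wrapper name `compress_pdf` is made up for this sketch, and it assumes `gs` is on your PATH:

```shell
# Compress a PDF with one of the four Ghostscript presets.
# Usage: compress_pdf PRESET INPUT.pdf OUTPUT.pdf
compress_pdf() {
  cpdf_preset=$1
  cpdf_in=$2
  cpdf_out=$3
  case "$cpdf_preset" in
    screen|ebook|printer|prepress) ;;   # the presets named above
    *) echo "unknown preset: $cpdf_preset" >&2; return 1 ;;
  esac
  # pdfwrite re-renders the document; PDFSETTINGS mostly controls image downsampling
  gs -sDEVICE=pdfwrite -dPDFSETTINGS="/$cpdf_preset" \
     -dNOPAUSE -dBATCH -dQUIET \
     -sOutputFile="$cpdf_out" "$cpdf_in"
}

# Example: compress_pdf ebook scan.pdf scan-small.pdf
```

Running the same file through `screen` and `ebook` and comparing sizes is a quick way to see where your own quality threshold sits.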


Reverse Proxy + Auth

If you’re exposing this to your LAN or beyond, you’ll want a reverse proxy in front. Stirling-PDF doesn’t have built-in user accounts in the default config — it’s “everyone on the network can use it.”

For home use behind Tailscale or a VPN, that’s probably fine. For anything internet-facing, either:

Option A: Basic auth via Nginx/Caddy

# Caddyfile snippet
stirling.yourdomain.com {
    basicauth {
        your_user JDJhJDE0JGhIZ2l... # bcrypt hash
    }
    reverse_proxy stirling-pdf:8080
}

Option B: Enable Stirling’s built-in security

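Set DOCKER_ENABLE_SECURITY=true in your Compose file and turn on login with SECURITY_ENABLELOGIN=true (or security.enableLogin in settings.yml). This enables Spring Security with an initial admin account. The project's documentation gives the default login as admin / stirling, with a forced password change on first login; you can also set your own initial credentials via environment variables:

docker-compose.yml
environment:
  - DOCKER_ENABLE_SECURITY=true
  - SECURITY_ENABLELOGIN=true
  - SECURITY_INITIALLOGIN_USERNAME=admin
  - SECURITY_INITIALLOGIN_PASSWORD=changeme_please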
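Rather than committing a literal password to your Compose file, you could generate one and keep it in a `.env` file next to it. A rough sketch, assuming a typical Linux box with `/dev/urandom`; the function name `gen_password` and the variable name `STIRLING_ADMIN_PASSWORD` are invented for this example:

```shell
# Print a random alphanumeric password of the given length (default 24).
gen_password() {
  # LC_ALL=C keeps tr byte-oriented so random bytes don't trip locale handling
  LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c "${1:-24}"
  echo
}

# Example (run once, in the directory that holds docker-compose.yml):
#   echo "STIRLING_ADMIN_PASSWORD=$(gen_password)" > .env && chmod 600 .env
# Then reference it in the Compose file:
#   - SECURITY_INITIALLOGIN_PASSWORD=${STIRLING_ADMIN_PASSWORD}
```

Docker Compose reads `.env` from the project directory and substitutes `${...}` references, so the secret stays out of the file you are most likely to paste into a forum post.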

With security enabled you get user management, per-user settings, and the ability to restrict which tools are accessible. More overhead, but appropriate if you’re sharing it with people who aren’t you.


Paperless-ngx Integration

You’re probably already running Paperless-ngx if you’re the kind of person reading this article. The two tools complement each other without needing explicit integration: Stirling-PDF cleans up the file, and Paperless-ngx files and indexes it.

Drop the processed PDF into your Paperless consume folder and let it do its thing. The workflow is: scan → Stirling (compress + OCR) → Paperless consume folder → indexed and searchable forever.

If you want to get fancy, Stirling’s pipeline feature can watch a folder and auto-apply a processing chain. Point it at a “raw scans” directory, configure compress + OCR as the pipeline steps, output to your Paperless consume folder. You’ve just automated document ingestion without writing a single line of code.
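If you would rather keep the last hop explicit instead of pointing the pipeline output straight at Paperless, a few lines of shell can do the hand-off. This is only a sketch: `move_to_consume` is a made-up name, and both directories are placeholders for wherever your processed PDFs land and wherever Paperless-ngx watches.

```shell
# Move finished PDFs from a source directory into a destination directory,
# skipping any file whose name already exists at the destination.
move_to_consume() {
  mtc_src=$1
  mtc_dst=$2
  mtc_count=0
  for mtc_file in "$mtc_src"/*.pdf; do
    [ -f "$mtc_file" ] || continue   # glob matched nothing
    mtc_target=$mtc_dst/$(basename "$mtc_file")
    if [ -e "$mtc_target" ]; then
      echo "skip (already exists): $mtc_target" >&2
      continue
    fi
    mv "$mtc_file" "$mtc_target" && mtc_count=$((mtc_count + 1))
  done
  echo "moved $mtc_count file(s)"
}

# Example: move_to_consume ./stirling-out /srv/paperless/consume
```

Run it from cron or a systemd timer if you want it hands-off; keeping both directories on the same filesystem makes each `mv` a rename, so Paperless never sees a half-written file.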


Resource Use Under Load

Stirling-PDF is a Java app. It uses memory like a Java app. At idle, expect 300-500MB RAM. Under load — especially OCR on multi-page documents — you’ll see spikes to 1-2GB and meaningful CPU usage (Tesseract is single-threaded per page).

For a home server or a modest VPS, this is fine. For a Raspberry Pi 3: maybe not. A Pi 4 with 4GB handles it without complaint.

Compression jobs are fast. OCR on a 20-page scanned PDF takes 30-60 seconds on modern hardware. Conversion tasks (PDF to images) are nearly instant.

If you’re processing large batches, don’t queue 50 OCR jobs simultaneously. It’ll work, it’ll just be slow and your server will breathe heavy for a while.
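If you script your batches, capping concurrency is a one-liner with `xargs -P`. A sketch under a few assumptions: `run_limited` is an invented name, the command you pass is a placeholder for whatever submits a single file for OCR, and your `find`/`xargs` support `-print0`/`-0`/`-P` (GNU and BSD versions do).

```shell
# Run a command once per PDF in a directory, at most N at a time.
# Usage: run_limited DIR MAX_JOBS COMMAND [ARGS...]
run_limited() {
  rl_dir=$1
  rl_jobs=$2
  shift 2
  # NUL-delimited names survive spaces; -n 1 passes one file per invocation.
  # Note: GNU xargs still runs the command once on empty input unless you add -r.
  find "$rl_dir" -maxdepth 1 -type f -name '*.pdf' -print0 |
    xargs -0 -n 1 -P "$rl_jobs" "$@"
}

# Example: run_limited ./raw-scans 2 ./ocr-one.sh
```

Two or three parallel jobs is a sensible starting point on a small box; raise it only if the CPU is sitting idle.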


When You’d Still Pay Adobe

Stirling-PDF is excellent. It’s not perfect. There are cases where the paid tools still win:

PDF/A archival compliance — If you need legally archivable PDF/A-1b or PDF/A-2a output for regulatory reasons, Stirling’s compliance with specific archival standards is not guaranteed. LibreOffice can produce PDF/A output; for strict compliance workflows, test carefully.

Advanced fillable forms — Creating complex PDF forms with conditional logic, JavaScript validation, and digital signature fields is outside Stirling’s scope. It can fill and flatten existing forms, not author them.

Accessibility (WCAG/PDF/UA tagging) — Producing fully tagged, accessible PDFs for institutional publishing requires Adobe Acrobat or specialized tools. Stirling doesn’t touch the tag tree.

Enterprise audit trails — If you need cryptographic signing, timestamping, and compliance logs for legal documents, you’re in enterprise PDF tooling territory.

For the other 95% of things a normal person needs to do with a PDF? Stirling has you covered.


The Privacy Argument Is Simple

You wouldn’t upload your medical records to a random website to print them. A PDF with your SSN or financial statements is the same thing. The “free online tool” model is built on processing your data — at minimum for product improvement, potentially for more.

Self-hosting isn’t paranoia. It’s recognizing that free tools have costs that aren’t listed on the pricing page.

Stirling-PDF costs you one Docker container, maybe 500MB of RAM, and the 10 minutes it took to read this article. In exchange, your documents stay on your hardware, behind your firewall, processed by code you can inspect.

Your 2 AM tax document panic mode just got a lot less sketchy.


Quick Reference

| Task | Stirling-PDF? |
| --- | --- |
| Merge PDFs | Yes |
| Split / extract pages | Yes |
| OCR scanned documents | Yes (Tesseract) |
| Compress for email | Yes |
| Redact sensitive info | Yes |
| Password protect | Yes |
| Convert PDF ↔ images | Yes |
| Repair corrupted PDF | Yes (often) |
| Create fillable forms | No |
| PDF/A compliance | Limited |
| Enterprise signing | No |

Terminal window
# Get started in 60 seconds
mkdir -p stirling && cd stirling
# -f makes curl fail on an HTTP error instead of saving the error page as your Compose file
curl -fLo docker-compose.yml https://raw.githubusercontent.com/Stirling-Tools/Stirling-PDF/main/docker/docker-compose.yml
docker compose up -d
# Open http://localhost:8080

Done. No more uploading your lease to a website called “PDFsupertools.xyz.”

