You’ve Absolutely Used a Sketchy PDF Site
Admit it. You needed to merge two PDFs at 11 PM. You Googled “merge PDF free,” clicked the first result that wasn’t an ad (the second one), uploaded your documents, waited, downloaded the result, and closed the tab hoping nobody noticed.
Those documents were your lease agreement. Or a bank statement. Or — yes — tax forms with your full name, address, and SSN buried in page three.
ilovepdf.com and smallpdf.com aren’t evil. They’re just not yours. Their privacy policies exist. You have not read them. Neither have I. And that’s the problem.
Stirling-PDF fixes this. It’s a single Docker container that gives you a full PDF manipulation suite: merge, split, compress, rotate, OCR, redact, watermark, password protect, repair, convert to/from images, and more. Everything runs locally. Nothing leaves your server.
What’s In the Box
Stirling-PDF (developed under the Stirling-Tools organization on GitHub) is a Spring Boot web app with a clean UI and an API. The feature list is embarrassingly long for a free, self-hosted tool:
- Merge / split / extract pages — combine multiple PDFs or pull out specific pages
- Compress — reduce file size using Ghostscript under the hood
- OCR — make scanned PDFs searchable with Tesseract
- Rotate / reorder / remove pages — basic manipulation without losing your mind
- Convert — PDF to images (PNG/JPEG/TIFF), images to PDF, HTML to PDF, Office docs to PDF
- Redact — black out text or regions before sharing
- eSign — add signature fields, sign documents
- Password protect / unlock — standard PDF encryption
- Watermark — text or image overlays
- Repair — fix corrupted PDFs (works surprisingly often)
- Metadata editor — strip or edit PDF metadata
- Compare PDFs — visual diff between two versions
That’s not a features list — that’s a replacement for a $30/month Adobe subscription.
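The API covers the same operations as the UI, so anything in that list can be scripted. Here is a minimal sketch of a merge call, assuming the container from the next section is running on localhost:8080 and that your version exposes `/api/v1/general/merge-pdfs` with repeated `fileInput` form fields. Both are assumptions, so check them against the API docs of your own instance. `STIRLING_URL` and `merge_pdfs` are names made up for this example.

```shell
# Base URL of the Stirling-PDF instance (assumption: default port mapping).
STIRLING_URL="${STIRLING_URL:-http://localhost:8080}"

# merge_pdfs OUTPUT INPUT...
# Sends every INPUT as a multipart "fileInput" field and saves the reply as OUTPUT.
# The endpoint path and the field name are assumptions; verify them for your version.
merge_pdfs() {
  out=$1
  shift
  # Rewrite the argument list in place: each file becomes "-F fileInput=@file".
  for f in "$@"; do
    set -- "$@" -F "fileInput=@$f"
    shift
  done
  curl --fail --silent --show-error \
    "$STIRLING_URL/api/v1/general/merge-pdfs" \
    "$@" \
    --output "$out"
}

# Usage: merge_pdfs lease-complete.pdf lease.pdf addendum.pdf
```

The `--fail` flag matters here: without it, an HTTP error page would be saved as if it were your merged PDF.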
Spinning It Up
One Compose file. Done.
```yaml
services:
  stirling-pdf:
    image: frooodle/s-pdf:latest
    container_name: stirling-pdf
    restart: unless-stopped
    ports:
      - "8080:8080"
    volumes:
      - ./stirling-config:/configs
      - ./stirling-logs:/logs
      - ./stirling-extras:/usr/share/tessdata   # OCR language packs
      - ./stirling-pipeline:/pipeline           # optional automation pipeline
    environment:
      - DOCKER_ENABLE_SECURITY=false
      - INSTALL_BOOK_AND_ADVANCED_HTML_OPS=false
      - LANGS=en_GB
```

```shell
mkdir -p stirling-config stirling-logs stirling-extras stirling-pipeline
docker compose up -d
```

Hit http://your-server:8080 and you’re in. The UI is clean enough that you won’t feel bad handing it to a non-technical family member who keeps asking you to “fix a PDF.”
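It is a Java app, so the web UI may not answer the instant `docker compose up -d` returns. If you script anything against it right after startup, a small polling loop helps. This is only a sketch: `wait_for_http` and `WAIT_INTERVAL` are names invented here, and the URL assumes the port mapping above.

```shell
# wait_for_http URL [TRIES]
# Polls URL until it answers without an HTTP error, or gives up after TRIES attempts.
# WAIT_INTERVAL (seconds between attempts, default 2) is a made-up knob for this sketch.
wait_for_http() {
  url=$1
  tries=${2:-30}
  i=0
  while [ "$i" -lt "$tries" ]; do
    if curl --silent --fail --output /dev/null "$url"; then
      echo "up after $i retries"
      return 0
    fi
    i=$((i + 1))
    sleep "${WAIT_INTERVAL:-2}"
  done
  echo "gave up after $tries attempts" >&2
  return 1
}

# Usage: wait_for_http http://localhost:8080 && echo "ready"
```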
Persistent Volume Considerations
The ./stirling-config mount is where your settings live. If you nuke the container, your config survives. The ./stirling-extras mount is where Tesseract language packs go — more on that below.
/pipeline is optional and used for automation workflows (chained operations on watched folders). It is useful for bulk processing; skip it for casual use.
OCR Language Packs
Out of the box you get English. If you need German, French, Japanese, or anything else, you need to drop Tesseract .traineddata files into ./stirling-extras.
```shell
# Download a language pack (example: German)
wget -O ./stirling-extras/deu.traineddata \
  https://github.com/tesseract-ocr/tessdata/raw/main/deu.traineddata

# Multiple languages at once
for lang in fra spa por; do
  wget -O "./stirling-extras/${lang}.traineddata" \
    "https://github.com/tesseract-ocr/tessdata/raw/main/${lang}.traineddata"
done
```

After dropping in the files, restart the container and the new languages show up in the OCR dropdown. No rebuild, no config change.
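If a language doesn’t show up, the first thing to check is whether the file actually landed in the mounted folder under the right name. A quick sketch that lists the language codes found on the host side; `list_ocr_langs` is a made-up helper, and the default path assumes the Compose layout above.

```shell
# list_ocr_langs [DIR]
# Prints one Tesseract language code per *.traineddata file found in DIR.
list_ocr_langs() {
  dir=${1:-./stirling-extras}
  for f in "$dir"/*.traineddata; do
    # With no matches the glob stays literal, so skip anything that does not exist.
    [ -e "$f" ] || continue
    basename "$f" .traineddata
  done
}

# Usage: list_ocr_langs ./stirling-extras
```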
OCR quality is “good enough for scanned documents that weren’t printed by a dot-matrix printer in 1994.” It’s Tesseract — managed expectations apply.
File Size Limits
By default, Spring Boot caps uploads at 50MB. For most PDF work that’s fine. If you’re trying to OCR a 400-page architectural drawing package, you’ll hit the wall.
Bump it in ./stirling-config/settings.yml:
```yaml
system:
  maxUploadSize: 500 # MB
```

Restart the container. Done.
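To find out whether the limit will bite before you start uploading, you can list the PDFs that exceed it. A rough sketch: `find_oversized` is a hypothetical helper, and it treats 1 MB as 1024 × 1024 bytes, which may not match how the server counts.

```shell
# find_oversized [DIR] [LIMIT_MB]
# Prints every *.pdf under DIR that is larger than LIMIT_MB (default 50).
# Assumption: 1 MB = 1024 * 1024 bytes.
find_oversized() {
  dir=${1:-.}
  limit_mb=${2:-50}
  limit_bytes=$((limit_mb * 1024 * 1024))
  # "-size +Nc" matches files strictly larger than N bytes.
  find "$dir" -type f -name '*.pdf' -size +"${limit_bytes}c"
}

# Usage: find_oversized ~/scans 50
```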
Compression is handled by Ghostscript internally — you pick a quality preset (screen, ebook, printer, prepress) and Stirling calls out to gs. The “ebook” preset is the sweet spot for most documents: good quality, meaningfully smaller files.
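For a feel of what those presets mean, this is roughly what the same job looks like when you run Ghostscript by hand. It is a standalone sketch, not the exact command line Stirling builds internally, and `gs_compress` is a name invented for this example.

```shell
# gs_compress PRESET INPUT OUTPUT
# Re-writes INPUT through Ghostscript's pdfwrite device using one of the
# quality presets named above. Not necessarily the arguments Stirling uses.
gs_compress() {
  preset=$1
  src=$2
  dst=$3
  case "$preset" in
    screen|ebook|printer|prepress) ;;
    *) echo "unknown preset: $preset" >&2; return 2 ;;
  esac
  gs -sDEVICE=pdfwrite -dPDFSETTINGS="/$preset" \
     -dNOPAUSE -dBATCH -dQUIET \
     -sOutputFile="$dst" "$src"
}

# Usage: gs_compress ebook scan.pdf scan-small.pdf
```

Comparing the file sizes you get from `screen` and `ebook` on one of your own scans is the fastest way to pick a default.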
Reverse Proxy + Auth
If you’re exposing this to your LAN or beyond, you’ll want a reverse proxy in front. Stirling-PDF doesn’t have built-in user accounts in the default config — it’s “everyone on the network can use it.”
For home use behind Tailscale or a VPN, that’s probably fine. For anything internet-facing, either:
Option A: Basic auth via Nginx/Caddy
```caddyfile
# Caddyfile snippet
stirling.yourdomain.com {
    basicauth {
        your_user JDJhJDE0JGhIZ2l... # bcrypt hash
    }
    reverse_proxy stirling-pdf:8080
}
```

Option B: Enable Stirling’s built-in security
Set DOCKER_ENABLE_SECURITY=true in your Compose file. This enables Spring Security with a default admin account. Check the logs on first boot for the generated credentials, or set them via environment variables:
```yaml
environment:
  - DOCKER_ENABLE_SECURITY=true
  - SECURITY_INITIALLOGIN_USERNAME=admin
  - SECURITY_INITIALLOGIN_PASSWORD=changeme_please
```

With security enabled you get user management, per-user settings, and the ability to restrict which tools are accessible. More overhead, but appropriate if you’re sharing it with people who aren’t you.
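Whatever you do, don’t ship `changeme_please`. One way to avoid typing a password into the Compose file at all is to generate one into a `.env` file, which Compose reads for variable substitution. A sketch under assumptions: `gen_secret`, `write_env`, and the variable name `STIRLING_ADMIN_PASSWORD` are all made up for this example.

```shell
# gen_secret: print 32 hex characters (128 bits) read from /dev/urandom.
gen_secret() {
  od -An -N16 -tx1 /dev/urandom | tr -d ' \n'
}

# write_env [FILE]
# Writes the secret to FILE (default .env), readable by the owner only.
# Refuses to overwrite an existing file. STIRLING_ADMIN_PASSWORD is a
# hypothetical variable name chosen for this sketch.
write_env() {
  env_file=${1:-.env}
  if [ -e "$env_file" ]; then
    echo "$env_file already exists, not overwriting" >&2
    return 1
  fi
  ( umask 077 && printf 'STIRLING_ADMIN_PASSWORD=%s\n' "$(gen_secret)" > "$env_file" )
}

# Usage: write_env .env
```

In the Compose file you would then reference it as `SECURITY_INITIALLOGIN_PASSWORD=${STIRLING_ADMIN_PASSWORD}` and keep `.env` out of version control.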
Paperless-ngx Integration
You’re probably already running Paperless-ngx if you’re the kind of person reading this article. The two tools complement each other without needing explicit integration:
- Stirling-PDF handles pre-processing: compress that 8MB scan, OCR it, clean up orientation
- Paperless-ngx handles storage, tagging, and search
Drop the processed PDF into your Paperless consume folder and let it do its thing. The workflow is: scan → Stirling (compress + OCR) → Paperless consume folder → indexed and searchable forever.
If you want to get fancy, Stirling’s pipeline feature can watch a folder and auto-apply a processing chain. Point it at a “raw scans” directory, configure compress + OCR as the pipeline steps, output to your Paperless consume folder. You’ve just automated document ingestion without writing a single line of code.
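If you would rather script the hand-off yourself, the scan → Stirling → Paperless flow fits in a few lines of shell. Treat this as a sketch under loud assumptions: the `/api/v1/misc/ocr-pdf` endpoint and its `fileInput`, `languages`, and `ocrType` fields should be checked against the API docs of your instance (your version may require more fields), and `ocr_to_consume` and `STIRLING_URL` are names invented here.

```shell
# Base URL of the Stirling-PDF instance (assumption: default port mapping).
STIRLING_URL="${STIRLING_URL:-http://localhost:8080}"

# ocr_to_consume SCAN CONSUME_DIR
# Sends SCAN through OCR and moves the result into CONSUME_DIR only once the
# download has finished, so a half-written file never lands in that folder.
# Endpoint path and form fields are assumptions; verify them for your version.
ocr_to_consume() {
  src=$1
  consume_dir=$2
  name=$(basename "$src")
  tmp=$(mktemp) || return 1
  if curl --fail --silent --show-error \
       -F "fileInput=@$src" \
       -F "languages=eng" \
       -F "ocrType=skip-text" \
       "$STIRLING_URL/api/v1/misc/ocr-pdf" \
       --output "$tmp"; then
    # mv is only atomic when both paths are on the same filesystem.
    mv "$tmp" "$consume_dir/$name"
  else
    rm -f "$tmp"
    return 1
  fi
}

# Usage: ocr_to_consume ./raw-scans/invoice.pdf /path/to/paperless/consume
```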
Resource Use Under Load
Stirling-PDF is a Java app. It uses memory like a Java app. At idle, expect 300-500MB RAM. Under load — especially OCR on multi-page documents — you’ll see spikes to 1-2GB and meaningful CPU usage (Tesseract is single-threaded per page).
For a home server or a modest VPS, this is fine. For a Raspberry Pi 3: maybe not. A Pi 4 with 4GB handles it without complaint.
Compression jobs are fast. OCR on a 20-page scanned PDF takes 30-60 seconds on modern hardware. Conversion tasks (PDF to images) are nearly instant.
If you’re processing large batches, don’t queue 50 OCR jobs simultaneously. It’ll work, but it’ll be slow, and your server will breathe heavy for a while.
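The boring fix is to feed it one file at a time. A minimal sketch of a sequential batch runner; `run_sequential` is a made-up helper, and the command you hand it (an API call, a script, anything that takes a file path as its last argument) is up to you.

```shell
# run_sequential DIR COMMAND [ARGS...]
# Runs COMMAND once per *.pdf in DIR, strictly one after another, and keeps
# going after a failure. Returns non-zero if any file failed.
run_sequential() {
  dir=$1
  shift
  failed=0
  for f in "$dir"/*.pdf; do
    [ -e "$f" ] || continue
    # Subshell keeps the command from clobbering this function's variables.
    if ! ( "$@" "$f" ); then
      echo "failed: $f" >&2
      failed=$((failed + 1))
    fi
  done
  [ "$failed" -eq 0 ]
}

# Usage: run_sequential ./raw-scans echo "would process"
```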
When You’d Still Pay Adobe
Stirling-PDF is excellent. It’s not perfect. There are cases where the paid tools still win:
PDF/A archival compliance — If you need legally archivable PDF/A-1b or PDF/A-2a output for regulatory reasons, Stirling’s compliance with specific archival standards is not guaranteed. LibreOffice can produce PDF/A output; for strict compliance workflows, test carefully.
Advanced fillable forms — Creating complex PDF forms with conditional logic, JavaScript validation, and digital signature fields is outside Stirling’s scope. It can fill and flatten existing forms, not author them.
Accessibility (WCAG/PDF/UA tagging) — Producing fully tagged, accessible PDFs for institutional publishing requires Adobe Acrobat or specialized tools. Stirling doesn’t touch the tag tree.
Enterprise audit trails — If you need cryptographic signing, timestamping, and compliance logs for legal documents, you’re in enterprise PDF tooling territory.
For the other 95% of things a normal person needs to do with a PDF? Stirling has you covered.
The Privacy Argument Is Simple
You wouldn’t upload your medical records to a random website to print them. A PDF with your SSN or financial statements is the same thing. The “free online tool” model is built on processing your data — at minimum for product improvement, potentially for more.
Self-hosting isn’t paranoia. It’s recognizing that free tools have costs that aren’t listed on the pricing page.
Stirling-PDF costs you one Docker container, maybe 500MB of RAM, and the 10 minutes it took to read this article. In exchange, your documents stay on your hardware, behind your firewall, processed by code you can inspect.
Your 2 AM tax document panic mode just got a lot less sketchy.
Quick Reference
| Task | Stirling-PDF? |
|---|---|
| Merge PDFs | Yes |
| Split / extract pages | Yes |
| OCR scanned documents | Yes (Tesseract) |
| Compress for email | Yes |
| Redact sensitive info | Yes |
| Password protect | Yes |
| Convert PDF ↔ images | Yes |
| Repair corrupted PDF | Yes (often) |
| Create fillable forms | No |
| PDF/A compliance | Limited |
| Enterprise signing | No |
```shell
# Get started in 60 seconds
mkdir stirling && cd stirling
curl -o docker-compose.yml https://raw.githubusercontent.com/Stirling-Tools/Stirling-PDF/main/docker/docker-compose.yml
docker compose up -d
# Open http://localhost:8080
```

Done. No more uploading your lease to a website called “PDFsupertools.xyz.”