How PDF Compression Works: Balancing File Size and Quality

A 40-page PDF report can be 200 KB or 40 MB depending on what's inside it, and the text is rarely the problem. Understanding where PDF size actually comes from makes it much easier to know what a "compress" button is doing to your file, and whether it's safe to use aggressively.

Where PDF size actually comes from

Text in a PDF is stored efficiently regardless of page count — even a 500-page text-only document is usually just a few hundred KB. The size explosion almost always comes from embedded images: scanned pages, high-resolution photos, or screenshots pasted into a document at their original resolution. A single uncompressed photo can be several megabytes on its own; a scanned document is essentially a series of full-page images, which is why scanned PDFs are often dramatically larger than PDFs created directly from a word processor.
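
To see why a single scanned page can outweigh hundreds of pages of text, here is a rough back-of-the-envelope sketch. It assumes a US Letter page (8.5 x 11 inches) scanned at 300 DPI in 24-bit color; those values and the function name are illustrative assumptions, and real scanners store a compressed image, so actual files are smaller than this raw figure.

```python
def raw_image_bytes(width_in, height_in, dpi, bytes_per_pixel=3):
    """Uncompressed size of a raster image: width px * height px * bytes per pixel."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bytes_per_pixel

# Assumed example: one US Letter page at 300 DPI, 24-bit RGB (3 bytes per pixel).
page_bytes = raw_image_bytes(8.5, 11, 300)
print(f"{page_bytes / 1_000_000:.1f} MB of raw pixel data per page")
```

Even after the scanner compresses that data, each page is still an image carrying millions of pixels, which is where the size comes from.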

What a PDF compressor actually does

Compression tools primarily target the embedded images. They reduce image resolution (downsampling), re-encode images with a more efficient algorithm (similar to saving a photo as a JPEG), and remove duplicate or unused embedded data such as unused fonts or redundant metadata. Text and vector graphics (like charts made in the document itself, rather than pasted as images) are generally left untouched, since they're already compact.
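
Downsampling saves so much because pixel count scales with the square of the resolution. The sketch below only does the arithmetic; the function name and the 300 and 150 DPI values are assumptions for illustration, and the final file size also depends on the image codec, so it will not track pixel count exactly.

```python
def downsampled_size(width_px, height_px, source_dpi, target_dpi):
    """Pixel dimensions of an image after reducing it from source_dpi to target_dpi."""
    scale = target_dpi / source_dpi
    return round(width_px * scale), round(height_px * scale)

# Assumed example: a 2550 x 3300 px page scan reduced from 300 DPI to 150 DPI.
new_w, new_h = downsampled_size(2550, 3300, 300, 150)
pixel_ratio = (new_w * new_h) / (2550 * 3300)
print(new_w, new_h, pixel_ratio)
```

Halving the resolution leaves a quarter of the pixels, which is why even a modest DPI reduction has a large effect on image-heavy files.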

Why compression settings matter for scanned documents

Aggressive compression downsamples images to a lower resolution. That loss is usually invisible in a photo viewed on a screen, but it can matter for a scanned document you need to print or read closely: small text on a scanned page can become blurry or hard to read. As a rule of thumb, high compression is generally safe for documents that will only be viewed on-screen (emailed reports, digital forms). For scanned documents with important fine print, or anything that will be printed, use a lighter compression setting and check a few pages before relying on the result.
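
One way to reason about fine print is to estimate how many pixels tall a line of text is at a given resolution. This sketch uses the standard definition of a point (1/72 inch); the function name and the sample sizes are assumptions chosen for illustration, and it makes no claim about the exact pixel height at which text stops being readable.

```python
def text_height_px(point_size, dpi):
    """Nominal height of text in pixels: 1 point = 1/72 inch.

    This is the font's em height, so lowercase letters are noticeably smaller.
    """
    return point_size * dpi / 72

# Assumed example: 8 pt fine print on a scanned page at several resolutions.
for dpi in (300, 150, 72):
    print(f"{dpi} DPI: about {text_height_px(8, dpi):.1f} px tall")
```

The fewer pixels each character gets, the less detail survives lossy image compression, which is why checking a few pages of the output matters most for small print.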

A rough expectation

Text-only PDFs (reports, contracts, invoices): already small; compression usually saves very little
PDFs with a few embedded photos: can often shrink 40-70% with no visible quality loss
Scanned documents (image-only PDFs): can shrink 60-90%, but check readability at higher compression levels
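
To turn those percentages into a file size, a small calculation like the following may help. The function name and the 40 MB starting size are assumptions for illustration; the savings range comes from the rough figures above and is not a guarantee for any particular file.

```python
def expected_size_range(original_mb, min_saving_pct, max_saving_pct):
    """Return (smallest, largest) expected size in MB for a range of savings."""
    smallest = original_mb * (100 - max_saving_pct) / 100
    largest = original_mb * (100 - min_saving_pct) / 100
    return smallest, largest

# Assumed example: a 40 MB scanned PDF shrinking by 60-90%.
low, high = expected_size_range(40, 60, 90)
print(f"Expect roughly {low:.0f} to {high:.0f} MB")
```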

Try it yourself

Our PDF Compressor runs entirely in your browser using pdf.js and pdf-lib, so your document is never uploaded to a server. Try a moderate setting first and check the output before applying maximum compression to anything you might need to print.

This guide is for general understanding. Always keep your original file until you've verified the compressed version meets your needs.

Frequently asked questions

Why is my scanned PDF so much bigger than a typed document of the same length?

Because a scanned PDF is really a sequence of full-page images rather than searchable text, and images take up far more space than text does.

Will compressing a PDF make the text blurry?

Real text (typed directly into the document) isn't affected by image compression. Only scanned pages, which are images of text rather than real text, can lose sharpness at high compression levels.

Can I compress a PDF multiple times to shrink it further?

Recompressing an already-compressed PDF yields little to no further size reduction and risks additional quality loss on any embedded images, so it's best to compress once from the original.