
Using pdfminer.six for text extraction from PDFs

Of the PDF analysis tools available for Python, pdfminer.six is one of the easiest to use: its high-level API extracts the text of a document in a single function call.

Installation

Installation is simple.

pip install pdfminer.six

Usage

Usage is just as simple!

from pdfminer.high_level import extract_text

text = extract_text("example.pdf")

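And... that's it! The PDF does need to contain selectable text, though. A scanned document made up only of page images has no text layer, so extract_text will return little or nothing for it unless the file has been run through OCR first.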
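Because extraction depends on the PDF having selectable text, it can help to check the result before using it. The following is a minimal sketch of one way to do that. The helper names has_selectable_text and extract_or_none are hypothetical (they are not part of pdfminer.six), and the min_chars threshold is an assumption you may want to tune.

```python
def has_selectable_text(text: str, min_chars: int = 1) -> bool:
    """Return True if text has at least min_chars non-whitespace characters.

    Hypothetical helper: whitespace, including the form feed that may
    separate pages in the output, is not counted.
    """
    visible = sum(1 for ch in text if not ch.isspace())
    return visible >= min_chars


def extract_or_none(pdf_path: str):
    """Extract text from pdf_path, or return None if nothing usable came back.

    Hypothetical wrapper around pdfminer.six's extract_text.
    """
    # Imported here so has_selectable_text works even without pdfminer.six installed.
    from pdfminer.high_level import extract_text

    text = extract_text(pdf_path)
    return text if has_selectable_text(text) else None


if __name__ == "__main__":
    import sys

    # Usage: python check_pdf.py example.pdf
    if len(sys.argv) > 1:
        result = extract_or_none(sys.argv[1])
        if result is None:
            print("No selectable text found; the PDF may be a scan.")
        else:
            print(result)
```

A None result only suggests that the file has no text layer; it does not prove the PDF is a scan.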