Pdfminer.six is a community-maintained fork of the original PDFMiner. It is a
tool for extracting information from PDF documents, with a focus on getting and
analyzing text data. Pdfminer.six extracts the text from a page directly from
the source code of the PDF. It can also be used to get the exact location, font,
or color of the text.
It is built in a modular way, so that each component of pdfminer.six can be
replaced easily. You can implement your own interpreter or rendering device
that uses the power of pdfminer.six for purposes other than text analysis.
Check out the full documentation on Read the Docs
(https://pdfminersix.readthedocs.io/en/latest/).
Features:
• Written entirely in Python.
• Parse, analyze, and convert PDF documents.
• PDF-1.7 specification support (well, almost).
• CJK languages and vertical writing scripts support.
• Various font types (Type1, TrueType, Type3, and CID) support.
• Support for extracting images (JPG, JBIG2, Bitmaps).
• Support for various compressions (ASCIIHexDecode, ASCII85Decode, LZWDecode,
  FlateDecode, RunLengthDecode, CCITTFaxDecode).
• Support for RC4 and AES encryption.
• Support for AcroForm interactive form extraction.
• Table of contents extraction.
• Tagged contents extraction.
• Automatic layout analysis.
Update Information:
Update to 20251107.
Fix: arbitrary code execution when loading pickle font files.
Security fix for GHSA-wf5f-4jwr-ppcp / CVE-2025-64512.
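For readers unfamiliar with this class of bug, the following is a minimal, generic sketch of why unpickling untrusted data can run code. It is not pdfminer.six's code and does not reproduce the actual vulnerability or its fix. The names Marker, NoGlobalsUnpickler, and safe_loads are hypothetical, and the payload calls the harmless built-in int where a hostile file could name any importable callable.

```python
import io
import pickle


class Marker:
    """Hypothetical stand-in for an object stored in an untrusted pickle."""

    def __reduce__(self):
        # Tells pickle to call int("42") at load time. A hostile payload
        # could name any importable callable here instead of int.
        return (int, ("42",))


payload = pickle.dumps(Marker())

# pickle.loads runs the callable named in the payload while loading.
loaded = pickle.loads(payload)


class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any class or function by name."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global {module}.{name} is forbidden")


def safe_loads(data):
    """Load a pickle made only of plain containers, strings, and numbers."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()
```

In this sketch, loaded is the value returned by int("42") rather than a Marker instance, which shows that the payload chose what ran. Calling safe_loads(payload) raises pickle.UnpicklingError, while pickles of plain dicts, lists, strings, and numbers still load.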
* Fri Nov 7 2025 Benjamin A. Beasley - 20251107-1
- Update to 20251107 (fixes RHBZ#2413443)
- Security fix for GHSA-wf5f-4jwr-ppcp
[ 1 ] Bug #2413443 - python-pdfminer-20251107 is available
https://bugzilla.redhat.com/show_bug.cgi?id=2413443
This update can be installed with the "dnf" update program. Use su -c 'dnf upgrade --advisory FEDORA-2025-63872f52bb' at the command line. For more information, refer to the dnf documentation available at http://dnf.readthedocs.io/en/latest/command_ref.html#upgrade-command-label