Pdfminer.six is a community-maintained fork of the original PDFMiner. It is a
tool for extracting information from PDF documents, with a focus on getting and
analyzing text data. Pdfminer.six extracts the text of a page directly from
the source code of the PDF. It can also be used to get the exact location, font,
or color of the text.
It is built in a modular way, so that each component of pdfminer.six can be
replaced easily. You can implement your own interpreter or rendering device
that uses the power of pdfminer.six for purposes other than text analysis.
Check out the full documentation on Read the Docs
(https://pdfminersix.readthedocs.io/en/latest/).
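As a rough illustration of the text and location extraction described above, the following minimal sketch walks the layout objects of each page and yields the text together with its bounding box. It assumes pdfminer.six is installed; the helper name iter_text_boxes is hypothetical and not part of the library, while extract_pages and LTTextContainer come from pdfminer.six itself.

```python
from typing import Iterator, Tuple


def iter_text_boxes(pdf_path: str) -> Iterator[Tuple[int, tuple, str]]:
    """Yield (page_number, bounding_box, text) for each text container.

    Hypothetical helper for illustration; not part of pdfminer.six.
    """
    # Imported lazily so this module still loads where pdfminer.six is absent.
    from pdfminer.high_level import extract_pages
    from pdfminer.layout import LTTextContainer

    # extract_pages runs the layout analysis and yields one layout tree per page.
    for page_number, page_layout in enumerate(extract_pages(pdf_path), start=1):
        for element in page_layout:
            # Text boxes and lines are text containers; figures and
            # drawing primitives are skipped.
            if isinstance(element, LTTextContainer):
                # bbox is (x0, y0, x1, y1) in PDF points.
                yield page_number, element.bbox, element.get_text()
```

Calling list(iter_text_boxes("example.pdf")) on a PDF of your own (the file name here is a placeholder) should return one tuple per text block. If only the plain text is needed, pdfminer.high_level.extract_text is the shorter route.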
Features:
• Written entirely in Python.
• Parse, analyze, and convert PDF documents.
• PDF-1.7 specification support (well, almost).
• CJK languages and vertical writing scripts support.
• Various font types (Type1, TrueType, Type3, and CID) support.
• Support for extracting images (JPG, JBIG2, Bitmaps).
• Support for various compressions (ASCIIHexDecode, ASCII85Decode, LZWDecode,
  FlateDecode, RunLengthDecode, CCITTFaxDecode).
• Support for RC4 and AES encryption.
• Support for AcroForm interactive form extraction.
• Table of contents extraction.
• Tagged contents extraction.
• Automatic layout analysis.
Update Information:
Update to 20251230: security fix for CVE-2025-64512.
See https://github.com/pdfminer/pdfminer.six/blob/20251230/CHANGELOG.md
ChangeLog:
* Wed Dec 31 2025 Benjamin A. Beasley - 20251230-1
- Update to 20251230; Fixes RHBZ#2426287
- Security fix for CVE-2025-64512
* Tue Dec 30 2025 Benjamin A. Beasley - 20251229-1
- Update to 20251229 (close RHBZ#2425927)
* Sun Dec 28 2025 Benjamin A. Beasley - 20251228-1
- Update to 20251228 (close RHBZ#2425643)
References:
[ 1 ] Bug #2425643 - python-pdfminer-20251227 is available
https://bugzilla.redhat.com/show_bug.cgi?id=2425643
[ 2 ] Bug #2425927 - python-pdfminer-20251229 is available
https://bugzilla.redhat.com/show_bug.cgi?id=2425927
[ 3 ] Bug #2426287 - python-pdfminer-20251230 is available
https://bugzilla.redhat.com/show_bug.cgi?id=2426287
This update can be installed with the "dnf" update program. Use su -c 'dnf upgrade --advisory FEDORA-2025-e77e051f0c' at the command line. For more information, refer to the dnf documentation available at http://dnf.readthedocs.io/en/latest/command_ref.html#upgrade-command-label
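After upgrading, one way to confirm that the fixed release is in place is to compare the installed version with 20251230, the release named in this advisory. The sketch below is only an illustration: the helper names are hypothetical, it assumes the Python distribution is registered as "pdfminer.six", and it assumes versions are date-style strings such as 20251230.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

FIXED_VERSION = "20251230"  # release carrying the CVE-2025-64512 fix


def is_patched(installed: str, fixed: str = FIXED_VERSION) -> bool:
    """Return True if a date-style version is at or after the fixed release.

    Assumption: the first dot-separated component is a YYYYMMDD number.
    """
    return int(installed.split(".")[0]) >= int(fixed.split(".")[0])


def installed_pdfminer_version() -> Optional[str]:
    """Return the installed pdfminer.six version, or None if not installed."""
    try:
        # Assumption: the distribution name is "pdfminer.six".
        return version("pdfminer.six")
    except PackageNotFoundError:
        return None


def report() -> str:
    """Summarize whether the local installation includes the fix."""
    found = installed_pdfminer_version()
    if found is None:
        return "pdfminer.six is not installed"
    state = "patched" if is_patched(found) else "NOT patched"
    return f"pdfminer.six {found}: {state}"
```

Running print(report()) in the Python environment that uses the package gives a one-line summary. The RPM release suffix (for example, the -1 in 20251230-1) is not part of the Python version string, so it is not considered here.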