5 projects
pdfminer.six
A Python library for extracting text, images, and metadata from PDF files. It provides tools for parsing PDF documents, analyzing their structure, and converting them into other formats. Current releases target Python 3 only (earlier versions supported both Python 2 and 3), and the library includes features for handling various PDF encodings and document layouts.
1,100
214
$3.9M
pdfcpu
pdfcpu is a comprehensive PDF processing library and command line tool written in Go that enables users to validate, analyze, optimize, encrypt, decrypt, merge, split, stamp, watermark, rotate, and extract content from PDF files.
906
229
$5.6M
qpdf
QPDF is a command-line tool and C++ library for structural, content-preserving transformations on PDF files. It supports operations like linearization, encryption, decryption, and manipulation of PDF objects without changing the content of the original PDF.
778
161
$3.5M
pikepdf
pikepdf is a Python library for reading, manipulating, and repairing PDF files. It provides a Pythonic wrapper around the QPDF C++ library, enabling PDF operations like merging, splitting, optimization, and encryption while maintaining PDF structural integrity.
509
112
$1M
pyHanko
pyHanko is a Python library and command line tool for creating, validating, and manipulating PDF digital signatures and timestamps. It supports PDF document signing, signature validation, timestamp handling, and PDF form field operations in accordance with industry standards such as PAdES.
255
41
$4.2M