LFX Insights

PDF Processing Libraries

Libraries that enable programmatic creation, manipulation, and transformation of PDF documents, including features like merging, splitting, encryption, and content modification.

5 projects

3,548 contributors

$18M software value

pdfminer.six

A Python library for extracting text, images, and metadata from PDF files. It provides tools for parsing PDF documents, analyzing their layout, and converting them into other formats such as plain text and HTML. pdfminer.six is the community-maintained fork of PDFMiner; current releases support Python 3 only and handle a range of PDF encodings and document layouts.

Contributors: 1,100
Organizations: 214
Software value: $3.9M

pdfcpu

pdfcpu is a PDF processing library and command-line tool written in Go. It can validate, analyze, optimize, encrypt, decrypt, merge, split, stamp, watermark, and rotate PDF files, and extract content from them.

Contributors: 906
Organizations: 229
Software value: $5.6M

qpdf

qpdf is a command-line tool and C++ library for structural, content-preserving transformations of PDF files. It supports linearization, encryption, decryption, and manipulation of PDF objects without altering the visible content of the original document.

Contributors: 778
Organizations: 161
Software value: $3.5M

pikepdf

pikepdf is a Python library for reading, manipulating, and repairing PDF files. It provides a Pythonic wrapper around the qpdf C++ library, enabling operations such as merging, splitting, optimization, and encryption while preserving the structural integrity of the PDF.

Contributors: 509
Organizations: 112
Software value: $1M

pyHanko

pyHanko is a Python library and command-line tool for creating, validating, and manipulating PDF digital signatures and timestamps. It supports document signing, signature validation, timestamp handling, and PDF form field operations in accordance with industry standards.

Contributors: 255
Organizations: 41
Software value: $4.2M