1
markitdown
Markdown
154,675
2026-05-26
Markdown
File Format Processing
Text & Documents
Python tool for converting files and office documents to Markdown.
2
tomllib
Data Formats
73,285
2026-06-16
Data Formats
File Format Processing
Text & Documents
Built-in
(Python standard library) Parse TOML files.
3
docling
General
61,699
2026-06-16
General
File Format Processing
Text & Documents
Library for converting documents into structured data.
4
pypdf
PDF
10,065
2026-06-16
PDF
File Format Processing
Text & Documents
A library capable of splitting, merging, cropping, and transforming PDF pages.
5
weasyprint
PDF
9,286
2026-06-16
PDF
File Format Processing
Text & Documents
A visual rendering engine for HTML and CSS that can export to PDF.
6
kreuzberg
General
8,498
2026-06-16
General
File Format Processing
Text & Documents
High-performance document extraction library with a Rust core, supporting 62+ formats including PDF, Office, images with OCR, HTML, email, and archives.
7
pdfminer.six
PDF
6,991
2026-03-13
PDF
File Format Processing
Text & Documents
A community-maintained fork of the original PDFMiner.
8
csvkit
Data Formats
6,390
2026-06-08
Data Formats
File Format Processing
Text & Documents
Utilities for converting to and working with CSV.
9
python-docx
MS Office
5,639
2025-06-16
MS Office
File Format Processing
Text & Documents
Reads, queries and modifies Microsoft Word 2007/2008 docx files.
10
tablib
General
4,754
2026-06-16
General
File Format Processing
Text & Documents
A module for tabular datasets in XLS, CSV, JSON, and YAML.
11
markdown
Markdown
4,215
2026-05-26
Markdown
File Format Processing
Text & Documents
A Python implementation of John Gruber’s Markdown.
12
xlsxwriter
MS Office
3,948
2026-03-22
MS Office
File Format Processing
Text & Documents
A Python module for creating Excel .xlsx files.
13
python-pptx
MS Office
3,422
2024-08-07
MS Office
File Format Processing
Text & Documents
Python library for creating and updating PowerPoint (.pptx) files.
14
xlwings
MS Office
3,363
2026-06-16
MS Office
File Format Processing
Text & Documents
A BSD-licensed library that makes it easy to call Python from Excel and vice versa.
15
mistune
Markdown
3,043
2026-06-16
Markdown
File Format Processing
Text & Documents
A fast, full-featured, pure-Python Markdown parser.
16
pyyaml
Data Formats
2,900
2025-09-25
Data Formats
File Format Processing
Text & Documents
A YAML parser and emitter for Python.
17
pikepdf
PDF
2,744
2026-06-08
PDF
File Format Processing
Text & Documents
A powerful library for reading and editing PDF files, based on qpdf.
18
docxtpl
MS Office
2,662
2025-11-13
MS Office
File Format Processing
Text & Documents
Edits docx documents using Jinja2 templates.
19
pyelftools
General
2,249
2026-06-04
General
File Format Processing
Text & Documents
Parsing and analyzing ELF files and DWARF debugging information.
20
markdown-it-py
Markdown
1,324
2026-05-19
Markdown
File Format Processing
Text & Documents
Markdown parser with 100% CommonMark support, extensions, and syntax plugins.
21
pyexcel
MS Office
1,283
2025-12-10
MS Office
File Format Processing
Text & Documents
Provides one API for reading, manipulating, and writing csv, ods, xls, xlsx, and xlsm files.
22
pdf_oxide
PDF
834
2026-06-16
PDF
File Format Processing
Text & Documents
A fast PDF library for text extraction, image extraction, and Markdown conversion, powered by Rust.
23
openpyxl
MS Office
External
—
MS Office
File Format Processing
Text & Documents
A library for reading and writing Excel 2010 xlsx/xlsm/xltx/xltm files.
24
reportlab
PDF
External
—
PDF
File Format Processing
Text & Documents
Allows rapid creation of rich PDF documents.