| # | Name | Category | Stars | Updated | Tags | Description |
|---|------|----------|-------|---------|------|-------------|
| 1 | markitdown | Markdown | 123,845 | 2026-04-20 | Markdown, File Format Processing, Text & Documents | Python tool for converting files and office documents to Markdown. |
| 2 | difflib | General | 72,760 | 2026-05-19 | General, Text Processing, Text & Documents, Built-in | (Python standard library) Helpers for computing deltas. |
| 3 | mimetypes | Text & Documents | 72,760 | 2026-05-19 | File Manipulation, Text & Documents, Built-in | (Python standard library) Map filenames to MIME types. |
| 4 | pathlib | Text & Documents | 72,760 | 2026-05-19 | File Manipulation, Text & Documents, Built-in | (Python standard library) A cross-platform, object-oriented path library. |
| 5 | tomllib | Data Formats | 72,760 | 2026-05-19 | Data Formats, File Format Processing, Text & Documents, Built-in | (Python standard library) Parse TOML files. |
| 6 | docling | General | 59,999 | 2026-05-18 | General, File Format Processing, Text & Documents | Library for converting documents into structured data. |
| 7 | pypdf | PDF | 10,003 | 2026-05-19 | PDF, File Format Processing, Text & Documents | A library capable of splitting, merging, cropping, and transforming PDF pages. |
| 8 | weasyprint | PDF | 9,160 | 2026-05-18 | PDF, File Format Processing, Text & Documents | A visual rendering engine for HTML and CSS that can export to PDF. |
| 9 | kreuzberg | General | 8,344 | 2026-05-19 | General, File Format Processing, Text & Documents | High-performance document extraction library with a Rust core, supporting 62+ formats including PDF, Office, images with OCR, HTML, email, and archives. |
| 10 | watchdog | Text & Documents | 7,340 | 2026-05-07 | File Manipulation, Text & Documents | API and shell utilities to monitor file system events. |
| 11 | pdfminer.six | PDF | 6,974 | 2026-03-13 | PDF, File Format Processing, Text & Documents | A community-maintained fork of the original PDFMiner. |
| 12 | csvkit | Data Formats | 6,374 | 2026-05-19 | Data Formats, File Format Processing, Text & Documents | Utilities for converting to and working with CSV. |
| 13 | xmltodict | Text & Documents | 5,736 | 2026-03-23 | HTML Manipulation, Text & Documents | Makes working with XML feel like working with JSON. |
| 14 | python-docx | MS Office | 5,585 | 2025-06-16 | MS Office, File Format Processing, Text & Documents | Reads, queries and modifies Microsoft Word 2007/2008 docx files. |
| 15 | pypinyin | General | 5,309 | 2026-04-19 | General, Text Processing, Text & Documents | Convert Chinese hanzi (漢字) to pinyin (拼音). |
| 16 | tablib | General | 4,752 | 2026-04-06 | General, File Format Processing, Text & Documents | A module for tabular datasets in XLS, CSV, JSON, YAML. |
| 17 | markdown | Markdown | 4,208 | 2026-02-09 | Markdown, File Format Processing, Text & Documents | A Python implementation of John Gruber’s Markdown. |
| 18 | ftfy | General | 4,039 | 2024-10-30 | General, Text Processing, Text & Documents | Makes Unicode text less broken and more consistent automagically. |
| 19 | sqlparse | Parser | 4,001 | 2025-12-19 | Parser, Text Processing, Text & Documents | A non-validating SQL parser. |
| 20 | xlsxwriter | MS Office | 3,943 | 2026-03-22 | MS Office, File Format Processing, Text & Documents | A Python module for creating Excel .xlsx files. |
| 21 | python-phonenumbers | Parser | 3,738 | 2026-05-07 | Parser, Text Processing, Text & Documents | Parsing, formatting, storing and validating international phone numbers. |
| 22 | textdistance | General | 3,529 | 2025-04-18 | General, Text Processing, Text & Documents | Compute distance between sequences with 30+ algorithms. |
| 23 | python-pptx | MS Office | 3,364 | 2024-08-07 | MS Office, File Format Processing, Text & Documents | Python library for creating and updating PowerPoint (.pptx) files. |
| 24 | xlwings | MS Office | 3,350 | 2026-05-06 | MS Office, File Format Processing, Text & Documents | A BSD-licensed library that makes it easy to call Python from Excel and vice versa. |
| 25 | lxml | Text & Documents | 3,027 | 2026-05-19 | HTML Manipulation, Text & Documents | A very fast, easy-to-use and versatile library for handling HTML and XML. |
| 26 | mistune | Markdown | 3,027 | 2026-05-03 | Markdown, File Format Processing, Text & Documents | A fast, full-featured pure Python Markdown parser. |
| 27 | python-magic | Text & Documents | 2,911 | 2026-05-11 | File Manipulation, Text & Documents | A Python interface to the libmagic file type identification library. |
| 28 | pyyaml | Data Formats | 2,888 | 2025-09-25 | Data Formats, File Format Processing, Text & Documents | YAML implementations for Python. |
| 29 | pikepdf | PDF | 2,719 | 2026-05-19 | PDF, File Format Processing, Text & Documents | A powerful library for reading and editing PDF files, based on qpdf. |
| 30 | docxtpl | MS Office | 2,636 | 2025-11-13 | MS Office, File Format Processing, Text & Documents | Edits docx documents using Jinja2 templates. |
| 31 | chardet | General | 2,627 | 2026-05-05 | General, Text Processing, Text & Documents | Python 2/3 compatible character encoding detector. |
| 32 | watchfiles | Text & Documents | 2,490 | 2026-05-18 | File Manipulation, Text & Documents | Simple, modern and fast file watching and code reload in Python. |
| 33 | pyparsing | Parser | 2,471 | 2026-04-30 | Parser, Text Processing, Text & Documents | A general purpose framework for generating parsers. |
| 34 | pyquery | Text & Documents | 2,379 | 2026-02-18 | HTML Manipulation, Text & Documents | A jQuery-like library for parsing HTML. |
| 35 | pyelftools | General | 2,237 | 2026-05-15 | General, File Format Processing, Text & Documents | Parsing and analyzing ELF files and DWARF debugging information. |
| 36 | shortuuid | Unique identifiers | 2,186 | 2025-12-01 | Unique identifiers, Text Processing, Text & Documents | A generator library for concise, unambiguous and URL-safe UUIDs. |
| 37 | pygments | Parser | 2,162 | 2026-05-18 | Parser, Text Processing, Text & Documents | A generic syntax highlighter. |
| 38 | python-slugify | General | 1,612 | 2026-01-07 | General, Text Processing, Text & Documents | A Python slugify library that translates Unicode to ASCII. |
| 39 | pyfiglet | General | 1,558 | 2025-08-15 | General, Text Processing, Text & Documents | An implementation of figlet written in Python. |
| 40 | python-user-agents | Parser | 1,515 | 2023-02-16 | Parser, Text Processing, Text & Documents | Browser user agent parser. |
| 41 | babel | General | 1,442 | 2026-04-20 | General, Text Processing, Text & Documents | An internationalization library for Python. |
| 42 | markdown-it-py | Markdown | 1,308 | 2026-05-19 | Markdown, File Format Processing, Text & Documents | Markdown parser with 100% CommonMark support, extensions, and syntax plugins. |
| 43 | pyexcel | MS Office | 1,281 | 2025-12-10 | MS Office, File Format Processing, Text & Documents | Provides one API for reading, manipulating and writing csv, ods, xls, xlsx and xlsm files. |
| 44 | justhtml | Text & Documents | 1,141 | 2026-05-15 | HTML Manipulation, Text & Documents | A pure Python HTML5 parser that just works. |
| 45 | pdf_oxide | PDF | 763 | 2026-05-19 | PDF, File Format Processing, Text & Documents | A fast PDF library for text extraction, image extraction, and Markdown conversion, powered by Rust. |
| 46 | python-nameparser | Parser | 707 | 2023-09-21 | Parser, Text Processing, Text & Documents | Parsing human names into their individual components. |
| 47 | markupsafe | Text & Documents | 689 | 2025-09-27 | HTML Manipulation, Text & Documents | Implements an XML/HTML/XHTML markup-safe string for Python. |
| 48 | unidecode | General | 608 | 2026-01-05 | General, Text Processing, Text & Documents | ASCII transliterations of Unicode text. |
| 49 | sqids | Unique identifiers | 508 | 2025-03-26 | Unique identifiers, Text Processing, Text & Documents | A library for generating short unique IDs from numbers. |
| 50 | pangu.py | General | 273 | 2023-03-30 | General, Text Processing, Text & Documents | Paranoid text spacing. |
| 51 | tinycss2 | Text & Documents | 184 | 2025-11-23 | HTML Manipulation, Text & Documents | A low-level CSS parser and generator written in Python. |
| 52 | beautifulsoup | Text & Documents | External | n/a | HTML Manipulation, Text & Documents | Provides Pythonic idioms for iterating, searching, and modifying HTML or XML. |
| 53 | openpyxl | MS Office | External | n/a | MS Office, File Format Processing, Text & Documents | A library for reading and writing Excel 2010 xlsx/xlsm/xltx/xltm files. |
| 54 | reportlab | PDF | External | n/a | PDF, File Format Processing, Text & Documents | Allows rapid creation of rich PDF documents. |
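
As a rough illustration of the standard-library entries in this listing (difflib, mimetypes, pathlib), the sketch below inspects a filename and computes a delta between two short texts. The helper names `describe` and `unified_delta` and the sample inputs are hypothetical, chosen only for this example. The third-party packages are left out because their APIs are not described here.

```python
import difflib
import mimetypes
from pathlib import PurePosixPath


def describe(filename: str) -> dict:
    """Summarize a filename (hypothetical helper) using pathlib and mimetypes."""
    path = PurePosixPath(filename)
    # guess_type maps the filename's extension to a MIME type; it never opens the file.
    mime, _encoding = mimetypes.guess_type(path.name)
    return {"stem": path.stem, "suffix": path.suffix, "mime": mime}


def unified_delta(old: str, new: str) -> list[str]:
    """Return unified-diff lines between two multi-line strings (hypothetical helper)."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="old",
            tofile="new",
            lineterm="",  # inputs carry no trailing newlines, so emit none
        )
    )


# Sample inputs are made up for illustration.
info = describe("docs/report.pdf")
delta = unified_delta("alpha\nbeta\n", "alpha\ngamma\n")

print(info)
print("\n".join(delta))
```

`PurePosixPath` is used rather than `Path` so the example behaves the same on every platform and never touches the file system.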