Text & Documents

Python tool for converting files and office documents to Markdown.
(Python standard library) Helpers for computing deltas.
(Python standard library) Map filenames to MIME types.
(Python standard library) A cross-platform, object-oriented path library.
(Python standard library) Parse TOML files.
Library for converting documents into structured data.
A library capable of splitting, merging, cropping, and transforming PDF pages.
A visual rendering engine for HTML and CSS that can export to PDF.
High-performance document extraction library with a Rust core, supporting 62+ formats including PDF, Office, images with OCR, HTML, email, and archives.
API and shell utilities to monitor file system events.
Pdfminer.six is a community-maintained fork of the original PDFMiner.
Utilities for converting to and working with CSV.
Makes working with XML feel like you are working with JSON.
Reads, queries and modifies Microsoft Word 2007/2008 docx files.
Convert Chinese hanzi (漢字) to pinyin (拼音).
A module for Tabular Datasets in XLS, CSV, JSON, YAML.
A Python implementation of John Gruber’s Markdown.
Makes Unicode text less broken and more consistent automagically.
A non-validating SQL parser.
A Python module for creating Excel .xlsx files.
Parsing, formatting, storing and validating international phone numbers.
Compute distance between sequences with 30+ algorithms.
A BSD-licensed library that makes it easy to call Python from Excel and vice versa.
Python library for creating and updating PowerPoint (.pptx) files.
The fastest full-featured Markdown parser in pure Python.
A very fast, easy-to-use and versatile library for handling HTML and XML.
A Python interface to the libmagic file type identification library.
YAML implementations for Python.
A powerful library for reading and editing PDF files, based on qpdf.
Edits a docx document using a Jinja2 template.
Python 2/3 compatible character encoding detector.
Simple, modern and fast file watching and code reload in Python.
A general purpose framework for generating parsers.
A jQuery-like library for parsing HTML.
Parsing and analyzing ELF files and DWARF debugging information.
A generator library for concise, unambiguous and URL-safe UUIDs.
A generic syntax highlighter.
A Python slugify library that translates Unicode to ASCII.
An implementation of figlet written in Python.
Browser user agent parser.
An internationalization library for Python.
Markdown parser with 100% CommonMark support, extensions, and syntax plugins.
Provides one API for reading, manipulating and writing csv, ods, xls, xlsx and xlsm files.
A pure Python HTML5 parser that just works.
A fast PDF library for text extraction, image extraction, and markdown conversion, powered by Rust.
Parsing human names into their individual components.
Implements an XML/HTML/XHTML markup-safe string for Python.
ASCII transliterations of Unicode text.
A library for generating short unique IDs from numbers.
Paranoid text spacing.
A low-level CSS parser and generator written in Python.
Provides Pythonic idioms for iterating, searching, and modifying HTML or XML.
A library for reading and writing Excel 2010 xlsx/xlsm/xltx/xltm files.
Allows rapid creation of rich PDF documents.
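
The four standard-library entries in this list appear without module names. Going by their descriptions, they most likely correspond to difflib, mimetypes, pathlib, and tomllib; that mapping is an inference from the wording, not something the listing states. A minimal sketch of what each one does, using made-up file names and contents:

```python
import difflib
import mimetypes
from pathlib import PurePosixPath

try:
    import tomllib  # part of the standard library from Python 3.11 onward
except ImportError:
    tomllib = None  # older interpreters: skip the TOML step

# pathlib: object-oriented paths (hypothetical file name, nothing touches disk)
doc = PurePosixPath("docs") / "report.final.md"
renamed = doc.with_suffix(".html")  # swaps only the last suffix

# mimetypes: map a filename to a MIME type
mime, _encoding = mimetypes.guess_type("index.html")

# difflib: compute a delta between two versions of a text
old = ["title: Draft\n", "pages: 3\n"]
new = ["title: Final\n", "pages: 3\n"]
diff = list(difflib.unified_diff(old, new, fromfile="old.txt", tofile="new.txt"))

# tomllib: parse TOML text into plain Python objects
config = tomllib.loads('title = "Final"\npages = 3\n') if tomllib else None

print(renamed)
print(mime)
print("".join(diff), end="")
print(config)
```

The third-party entries in the list cover the same kinds of tasks for richer formats (PDF, Office files, HTML, YAML), so the standard-library modules are a reasonable first stop before adding a dependency.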
