Alternatives

Text Extract API alternatives for AI agents.

Compare similar skills by workflow fit, trust score, quality, GitHub adoption, maintenance, and install readiness.

Current skill

Text Extract API

Document (PDF, Word, PPTX, ...) extraction and parsing API using state-of-the-art OCR engines and Ollama-supported models. Anonymizes documents, removes PII, and converts any document or picture to structured JSON or Markdown.

Quality 88 · Trust 88 · Stars 3.1K
#1

OCRmyPDF

Similarity 134 · Trust 98 · Excellent 100

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

34K stars · Jun 12, 2026 push · document-processing · Python · PDF
$ npx skills add ocrmypdf/OCRmyPDF
#2

MinerU

Similarity 134 · Trust 94 · Excellent 100

Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.

68K stars · Jun 15, 2026 push · document-processing · Python · PDF
$ npx skills add opendatalab/MinerU
#3

PyMuPDF

Similarity 132 · Trust 97 · Excellent 100

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

10K stars · Jun 12, 2026 push · document-processing · Python · PDF
$ npx skills add pymupdf/PyMuPDF
#4

Pdf Craft

Similarity 131 · Trust 96 · Excellent 100

PDF craft can convert PDF files into various other formats. This project will focus on processing PDF files of scanned books.

5.8K stars · Jun 6, 2026 push · document-processing · Python · PDF
$ npx skills add oomol-lab/pdf-craft
#5

Dolphin

Similarity 131 · Trust 91 · Excellent 100

The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.

9.0K stars · Mar 25, 2026 push · document-processing · Python · PDF
$ npx skills add bytedance/Dolphin
#6

Excalibur

Similarity 128 · Trust 92 · Excellent 100

A web interface to extract tabular data from PDFs

1.8K stars · May 20, 2026 push · document-processing · Python · PDF
$ npx skills add camelot-dev/excalibur
#7

Umi OCR

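Similarity 128 · Trust 93 · Excellent 100

Free, open-source, offline OCR software. Supports screenshot capture and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning and generating QR codes. Includes built-in multilingual language packs.

45K stars · Nov 20, 2025 push · document-processing · Python · OCR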
$ npx skills add hiroi-sora/Umi-OCR
#8

Markpdfdown

Similarity 127 · Trust 91 · Excellent 94

A high-quality PDF to Markdown tool based on large language model visual recognition.

1.8K stars · Jan 25, 2026 push · document-processing · Python · PDF
$ npx skills add MarkPDFdown/markpdfdown
#9

Paperless Ngx

Similarity 126 · Trust 98 · Excellent 100

A community-supported supercharged document management system: scan, index and archive all your documents

42K stars · Jun 16, 2026 push · document-processing · Python · PDF
$ npx skills add paperless-ngx/paperless-ngx
#10

Gotenberg

Similarity 126 · Trust 94 · Excellent 100

A developer-friendly API for converting many document formats into PDF files, and more!

12K stars · Jun 12, 2026 push · document-processing · Go · PDF
$ npx skills add gotenberg/gotenberg
#11

Markitdown

Similarity 126 · Trust 96 · Excellent 100

Python tool for converting files and office documents to Markdown.

154K stars · May 26, 2026 push · document-processing · Python · PDF
$ npx skills add microsoft/markitdown
#12

MegaParse

Similarity 126 · Trust 88 · Strong 79

File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.

7.4K stars · Feb 21, 2025 push · document-processing · Python · PDF
$ npx skills add QuivrHQ/MegaParse
#13

Liteparse

Similarity 126 · Trust 95 · Excellent 100

A fast, helpful, and open-source document parser

10K stars · Jun 16, 2026 push · document-processing · Rust · PDF
$ npx skills add run-llama/liteparse
#14

Docling

Similarity 125 · Trust 92 · Excellent 100

Get your documents ready for gen AI

62K stars · Jun 15, 2026 push · document-processing · Python · PDF
$ npx skills add docling-project/docling
#15

Video Subtitle Extractor

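Similarity 125 · Trust 93 · Excellent 100

A GUI tool for extracting hard-coded subtitles (hardsubs) from videos and generating srt files. Text recognition runs locally, with no third-party API required. A deep-learning-based subtitle extraction framework that includes subtitle region detection and subtitle content extraction.

9.0K stars · Apr 9, 2026 push · document-processing · Python · OCR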
$ npx skills add YaoFANGUK/video-subtitle-extractor
#16

Parsr

Similarity 124 · Trust 92 · Excellent 100

Transforms PDF, Documents and Images into Enriched Structured Data

6.2K stars · Mar 20, 2026 push · document-processing · JavaScript · PDF
$ npx skills add axa-group/Parsr

How to choose

When should you switch?

Use an alternative when it has a clearer install path, higher trust score, fresher maintenance, or better platform fit for your current agent stack. Keep Text Extract API if it already passes your workflow test and repository review.
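As a rough illustration of these criteria, the sketch below shortlists candidates that have both a higher trust score than the current skill and a recent push. The `Skill` class, the `worth_a_look` helper, the 60-day freshness window, and the reference date are hypothetical choices made for this example, not part of the listing or of any skills tooling. Requiring both conditions is a deliberately conservative assumption; install path and platform fit still need a manual check. Trust scores and push dates are copied from the entries above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Skill:
    # Hypothetical record holding fields copied from the listing.
    name: str
    trust: int
    last_push: Optional[date] = None  # None when no push date is shown


def worth_a_look(candidate: Skill, current: Skill, today: date,
                 max_age_days: int = 60) -> bool:
    """Return True if the candidate has higher trust AND a recent push.

    The 60-day window is an assumed threshold, not a documented rule.
    """
    higher_trust = candidate.trust > current.trust
    fresh = (
        candidate.last_push is not None
        and (today - candidate.last_push).days <= max_age_days
    )
    return higher_trust and fresh


current = Skill("Text Extract API", trust=88)
candidates = [
    Skill("OCRmyPDF", 98, date(2026, 6, 12)),
    Skill("Dolphin", 91, date(2026, 3, 25)),
    Skill("MegaParse", 88, date(2025, 2, 21)),
]

# Assumed reference date: the most recent push date shown in the listing.
today = date(2026, 6, 16)

shortlist = [c.name for c in candidates if worth_a_look(c, current, today)]
print(shortlist)
```

With these inputs only OCRmyPDF passes: Dolphin has higher trust but falls outside the assumed 60-day window, and MegaParse does not exceed the current trust score. Widening `max_age_days` or dropping one of the two conditions will change the shortlist.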

Next step

Compare top candidates side by side

Open the compare page, test the install commands in a sandbox, and check each repository before using a skill in production.