Alternatives

Retain Pdf alternatives for AI agents.

Compare similar skills by workflow fit, trust score, quality, GitHub adoption, maintenance, and install readiness.

Current skill

Retain Pdf

PDF translation that preserves layout, formulas, and structure, suited to scientific and technical documents

Quality 100 | Trust 92 | Stars 1.9K
#1

OCRmyPDF

Similarity 128 | Trust 98 | Excellent 100

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

34K stars | Jun 12, 2026 push | document-processing | Python | PDF
$ npx skills add ocrmypdf/OCRmyPDF
#2

Liteparse

Similarity 128 | Trust 95 | Excellent 100

A fast, helpful, and open-source document parser

10K stars | Jun 16, 2026 push | document-processing | Rust | PDF
$ npx skills add run-llama/liteparse
#3

MinerU

Similarity 128 | Trust 94 | Excellent 100

Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.

68K stars | Jun 15, 2026 push | document-processing | Python | PDF
$ npx skills add opendatalab/MinerU
#4

PyMuPDF

Similarity 126 | Trust 97 | Excellent 100

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

10K stars | Jun 12, 2026 push | document-processing | Python | PDF
$ npx skills add pymupdf/PyMuPDF
#5

Donut

Similarity 126 | Trust 88 | Strong 79

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

6.9K stars | Jul 11, 2024 push | document-processing | Python | Document AI
$ npx skills add clovaai/donut
#6

Unilm

Similarity 125 | Trust 94 | Excellent 100

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

22K stars | Jan 23, 2026 push | document-processing | Python | Document AI
$ npx skills add microsoft/unilm
#7

Pdf Craft

Similarity 125 | Trust 96 | Excellent 100

PDF craft can convert PDF files into various other formats. This project will focus on processing PDF files of scanned books.

5.8K stars | Jun 6, 2026 push | document-processing | Python | PDF
$ npx skills add oomol-lab/pdf-craft
#8

Dolphin

Similarity 125 | Trust 91 | Excellent 100

The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.

9.0K stars | Mar 25, 2026 push | document-processing | Python | PDF
$ npx skills add bytedance/Dolphin
#9

Deepdoctection

Similarity 123 | Trust 89 | Excellent 100

A Repo For Document AI

3.2K stars | Jun 12, 2026 push | document-processing | Python | OCR
$ npx skills add deepdoctection/deepdoctection
#10

OpenOCR

Similarity 122 | Trust 94 | Excellent 100

OpenOCR: An Open-Source Toolkit for General-OCR Research and Applications, integrates a unified training and evaluation benchmark, commercial-grade OCR and Document Parsing systems, and faithful reproductions of the core implementations from a wide range of academic papers.

1.4K stars | May 20, 2026 push | document-processing | Python | OCR
$ npx skills add Topdu/OpenOCR
#11

ParseBench

Similarity 121 | Trust 83 | Strong 84

ParseBench - A Document Parsing Benchmark for AI Agents

497 stars | Jun 15, 2026 push | document-processing | Python | Document AI
$ npx skills add run-llama/ParseBench
#12

Ebook GPT Translator

Similarity 121 | Trust 89 | Excellent 98

Enjoy reading with your favorite style.

1.7K stars | Mar 15, 2026 push | document-processing | Python | PDF
$ npx skills add jesselau76/ebook-GPT-translator
#13

Zotero Pdf Translate

Similarity 120 | Trust 97 | Excellent 100

Translate PDF, EPub, webpage, metadata, annotations, notes to the target language. Support 20+ translate services.

11K stars | Jun 10, 2026 push | document-processing | TypeScript | PDF
$ npx skills add windingwind/zotero-pdf-translate
#14

PaddleOCR

Similarity 120 | Trust 98 | Excellent 100

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

82K stars | Jun 12, 2026 push | document-processing | Python | OCR
$ npx skills add PaddlePaddle/PaddleOCR
#15

Paperless Ngx

Similarity 120 | Trust 98 | Excellent 100

A community-supported supercharged document management system: scan, index and archive all your documents

42K stars | Jun 16, 2026 push | document-processing | Python | PDF
$ npx skills add paperless-ngx/paperless-ngx
#16

Text Extract API

Similarity 120 | Trust 88 | Excellent 88

Document (PDF, Word, PPTX ...) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown

3.1K stars | Dec 8, 2025 push | document-processing | Python | PDF
$ npx skills add CatchTheTornado/text-extract-api

How to choose

When should you switch?

Switch to an alternative when it offers a clearer install path, a higher trust score, more recent maintenance, or a better platform fit for your current agent stack. Keep Retain Pdf if it already passes your workflow test and repository review.
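
The trust and maintenance checks above can be sketched as a small filter. This is an illustrative sketch only, not how the listing itself ranks skills: the `Candidate` class, the `shortlist` function, the 180-day freshness window, and the reference date are all assumptions. The trust scores and push dates are copied from the entries on this page.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Candidate:
    name: str
    trust: int        # trust score as shown in the listing
    last_push: date   # last push date as shown in the listing


def shortlist(candidates, current_trust, today, max_age_days=180):
    """Keep candidates that beat the current skill's trust score and were
    pushed within max_age_days, ordered by trust (highest first)."""
    fresh = [
        c for c in candidates
        if c.trust > current_trust
        and (today - c.last_push).days <= max_age_days
    ]
    return sorted(fresh, key=lambda c: c.trust, reverse=True)


candidates = [
    Candidate("OCRmyPDF", 98, date(2026, 6, 12)),
    Candidate("PyMuPDF", 97, date(2026, 6, 12)),
    Candidate("Donut", 88, date(2024, 7, 11)),
    Candidate("Unilm", 94, date(2026, 1, 23)),
]

# Retain Pdf's listed trust score is 92. The reference date is an assumption
# (the most recent push date that appears in the listing).
picks = shortlist(candidates, current_trust=92, today=date(2026, 6, 16))
for c in picks:
    print(c.name, c.trust, c.last_push.isoformat())
```

A filter like this only narrows the list. Install path and platform fit still need a manual check in a sandbox.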

Next step

Compare top candidates side by side

Open the compare page, test the install commands in a sandbox, and check each repository before using a skill in production.