OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
$ npx skills add ocrmypdf/OCRmyPDFAlternatives
Compare similar skills by workflow fit, trust score, quality, GitHub adoption, maintenance, and install readiness.
Current skill
在保留版面、公式与结构的前提下进行 PDF 翻译,适用于科研与技术文档
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
$ npx skills add ocrmypdf/OCRmyPDFA fast, helpful, and open-source document parser
$ npx skills add run-llama/liteparseTransforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.
$ npx skills add opendatalab/MinerUPyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
$ npx skills add pymupdf/PyMuPDFOfficial Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
$ npx skills add clovaai/donutLarge-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
$ npx skills add microsoft/unilmPDF craft can convert PDF files into various other formats. This project will focus on processing PDF files of scanned books.
$ npx skills add oomol-lab/pdf-craftThe official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
$ npx skills add bytedance/DolphinA Repo For Document AI
$ npx skills add deepdoctection/deepdoctectionOpenOCR: An Open-Source Toolkit for General-OCR Research and Applications, integrates a unified training and evaluation benchmark, commercial-grade OCR and Document Parsing systems, and faithful reproductions of the core implementations from a wide range of academic papers.
$ npx skills add Topdu/OpenOCRParseBench - A Document Parsing Benchmark for AI Agents
$ npx skills add run-llama/ParseBenchEnjoy reading with your favorite style.
$ npx skills add jesselau76/ebook-GPT-translatorTranslate PDF, EPub, webpage, metadata, annotations, notes to the target language. Support 20+ translate services.
$ npx skills add windingwind/zotero-pdf-translateTurn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
$ npx skills add PaddlePaddle/PaddleOCRA community-supported supercharged document management system: scan, index and archive all your documents
$ npx skills add paperless-ngx/paperless-ngxDocument (PDF, Word, PPTX ...) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown
$ npx skills add CatchTheTornado/text-extract-apiHow to choose
Use an alternative when it has a clearer install path, higher trust score, fresher maintenance, or better platform fit for your current agent stack. Keep Retain Pdf if it already passes your workflow test and repository review.
Next step
Open the compare page, test the install commands in a sandbox, and check each repository before using a skill in production.