OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
$ npx skills add ocrmypdf/OCRmyPDFAlternatives
Compare similar skills by workflow fit, trust score, quality, GitHub adoption, maintenance, and install readiness.
Current skill
PDFStract - Extract, Chunking and Embedding Layer in Your RAG Pipeline - Available as CLI - WEBUI - API
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
$ npx skills add ocrmypdf/OCRmyPDFTransforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.
$ npx skills add opendatalab/MinerUGet your documents ready for gen AI
$ npx skills add docling-project/doclingPyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
$ npx skills add pymupdf/PyMuPDF🖼️ Image Toolbox is a powerful app for advanced image manipulation. It offers dozens of features, from basic tools like crop and draw to filters, OCR, and a wide range of image processing options
$ npx skills add T8RIN/ImageToolboxA community-supported supercharged document management system: scan, index and archive all your documents
$ npx skills add paperless-ngx/paperless-ngxA fast, helpful, and open-source document parser
$ npx skills add run-llama/liteparseOCR model that handles complex tables, forms, handwriting with full layout.
$ npx skills add datalab-to/chandraPython tool for converting files and office documents to Markdown.
$ npx skills add microsoft/markitdownA pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
$ npx skills add py-pdf/pypdfBISHENG is an open LLM devops platform for next generation Enterprise AI applications. Powerful and comprehensive features include: GenAI workflow, RAG, Agent, Unified model management, Evaluation, SFT, Dataset Management, Enterprise-level System Management, Observability and more.
$ npx skills add dataelement/bishengTurn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
$ npx skills add PaddlePaddle/PaddleOCRAn ebook reader application supporting PDF, DjVu, EPUB, FB2 and many more formats, running on Cervantes, Kindle, Kobo, PocketBook and Android devices
$ npx skills add koreader/koreaderA modern ebook manager and reader with sync and backup capacities for Windows, macOS, Linux, Android, iOS and Web
$ npx skills add koodo-reader/koodo-readerReadest is a modern, feature-rich ebook reader designed for avid readers offering seamless cross-platform access, powerful tools, and an intuitive interface to elevate your reading experience.
$ npx skills add readest/readest#1 PDF Application on GitHub that lets you edit PDFs on any device anywhere
$ npx skills add Stirling-Tools/Stirling-PDFHow to choose
Use an alternative when it has a clearer install path, higher trust score, fresher maintenance, or better platform fit for your current agent stack. Keep Pdfstract if it already passes your workflow test and repository review.
Next step
Open the compare page, test the install commands in a sandbox, and check each repository before using a skill in production.