The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
$ npx skills add bytedance/Dolphin
Alternatives
Compare similar skills by workflow fit, trust score, quality, GitHub adoption, maintenance, and install readiness.
Current skill
A fast, helpful, and open-source document parser
$ npx skills add bytedance/Dolphin
Rust library and CLI tool for OCR (extracting text from images)
$ npx skills add robertknight/ocrs
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
$ npx skills add ocrmypdf/OCRmyPDF
Transforms PDF, Documents and Images into Enriched Structured Data
$ npx skills add axa-group/Parsr
Translates PDFs while preserving layout, formulas, and structure; suited to scientific and technical documents
$ npx skills add wxyhgk/retain-pdf
Rust library to read, manipulate and write PDF files.
$ npx skills add pdf-rs/pdf
RuVector is a High Performance, Real-Time, Self-Learning AI, Vector GNN, Memory DB built in Rust.
$ npx skills add ruvnet/RuVector
Fast Rust library for PDF inspection, classification, and text extraction. Intelligently detects scanned vs text-based PDFs to enable smart routing decisions.
$ npx skills add firecrawl/pdf-inspector
Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.
$ npx skills add opendatalab/MinerU
A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
$ npx skills add py-pdf/pypdf
Tesseract Open Source OCR Engine (main repository)
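$ npx skills add tesseract-ocr/tesseract
Free, open-source, offline OCR software. Supports screenshot capture and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and QR code scanning and generation. Includes built-in multilingual language libraries.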
$ npx skills add hiroi-sora/Umi-OCR
Pure Javascript OCR for more than 100 Languages 📖🎉🖥
$ npx skills add naptha/tesseract.js
ShareX is a free and open-source application that enables users to capture or record any area of their screen with a single keystroke. It also supports uploading images, text, and various file types to a wide range of destinations.
$ npx skills add ShareX/ShareX
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
$ npx skills add pymupdf/PyMuPDF
PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.
$ npx skills add opendataloader-project/opendataloader-pdf
How to choose
Use an alternative when it has a clearer install path, a higher trust score, more recent maintenance, or a better platform fit for your current agent stack. Keep Dolphin if it already passes your workflow test and repository review.
Next step
Open the compare page, test the install commands in a sandbox, and check each repository before using a skill in production.