Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.
$ npx skills add opendatalab/MinerU
Alternatives
Compare similar skills by workflow fit, trust score, quality, GitHub adoption, maintenance, and install readiness.
Current skill
A Python library to inspect and modify the internal structure of a PDF file
Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.
$ npx skills add opendatalab/MinerU
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
$ npx skills add bytedance/Dolphin
Community maintained fork of pdfminer - we fathom PDF
$ npx skills add pdfminer/pdfminer.six
borb is a library for reading, creating and manipulating PDF files in python.
$ npx skills add borb-pdf/borb
🔥 The Python library & CLI for PDF forms.
$ npx skills add chinapandaman/PyPDFForm
A community-supported supercharged document management system: scan, index and archive all your documents
$ npx skills add paperless-ngx/paperless-ngx
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
$ npx skills add ocrmypdf/OCRmyPDF
Document (PDF, Word, PPTX ...) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown
$ npx skills add CatchTheTornado/text-extract-api
A developer-friendly API for converting many document formats into PDF files, and more!
$ npx skills add gotenberg/gotenberg
QuestPDF is a modern library for PDF document generation. Its fluent C# API lets you design complex layouts with clean, readable code. Create documents using a flexible, component-based approach.
$ npx skills add QuestPDF/QuestPDF
Python tool for converting files and office documents to Markdown.
$ npx skills add microsoft/markitdown
File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.
$ npx skills add QuivrHQ/MegaParse
PDFium - Project to compile PDFium library to multiple platforms.
$ npx skills add paulocoutinhox/pdfium-lib
Get your documents ready for gen AI
$ npx skills add docling-project/docling
PDF tooling for Go and the command line.
$ npx skills add pdfcpu/pdfcpu
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
$ npx skills add pymupdf/PyMuPDF
How to choose
Use an alternative when it has a clearer install path, higher trust score, fresher maintenance, or better platform fit for your current agent stack. Keep Pdfsyntax if it already passes your workflow test and repository review.
Next step
Open the compare page, test the install commands in a sandbox, and check each repository before using a skill in production.