#01
MinerU
Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your agentic workflows.
100
Quality
100
Trust
77
Fit
$ npx skills add opendatalab/MinerU
Document skills
Compare skills for PDF parsing, OCR, table extraction, markdown conversion, document metadata, and agent-ready file processing.
Built for users searching for AI agent skills that can parse PDFs, extract tables, and convert documents into usable context.
Matched
16
Stars
527K
Input
Output
Markdown
Agent jobs
These pages are built for high-intent search and for agents that need a structured shortlist before installing third-party code.
01 Convert PDFs into clean markdown for agents
02 Extract tables and metadata from reports
03 Prepare legal, finance, and research documents for review
04 Use OCR fallback when scanned pages need text extraction
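The OCR fallback job can be illustrated with a minimal sketch. It is not taken from any of the listed skills: the names `PageText`, `extract_with_ocr_fallback`, `read_text_layer`, `run_ocr`, and the `min_chars` threshold are all hypothetical, and the two callables stand in for whatever parser and OCR engine a given skill actually uses. The idea shown is only the routing: try the embedded text layer first, fall back to OCR when a page yields too little text, and record which path produced each page so reviewers can see where OCR was involved.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PageText:
    page_number: int
    text: str
    source: str  # "text-layer" or "ocr", so OCR pages stay visible to reviewers


def extract_with_ocr_fallback(
    pages: List[object],
    read_text_layer: Callable[[object], str],  # hypothetical: reads embedded text
    run_ocr: Callable[[object], str],          # hypothetical: runs an OCR engine
    min_chars: int = 20,                       # assumed threshold for "scanned"
) -> List[PageText]:
    """Use the text layer when it has enough text, otherwise fall back to OCR."""
    results = []
    for number, page in enumerate(pages, start=1):
        text = read_text_layer(page).strip()
        if len(text) >= min_chars:
            results.append(PageText(number, text, "text-layer"))
        else:
            # Too little embedded text: treat the page as scanned.
            results.append(PageText(number, run_ocr(page).strip(), "ocr"))
    return results
```

Keeping the `source` field on every page makes it straightforward to send only the OCR pages to human review.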
Ranked shortlist
#01
Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your agentic workflows.
100
Quality
100
Trust
77
Fit
$ npx skills add opendatalab/MinerU
#02
Get your documents ready for gen AI
100
Quality
100
Trust
77
Fit
$ npx skills add docling-project/docling
#03
Knowledge Agents and Management in the Cloud
100
Quality
100
Trust
67
Fit
$ npx skills add run-llama/llama_cloud_services
#04
Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking, and embedding.
100
Quality
100
Trust
72
Fit
$ npx skills add Unstructured-IO/unstructured
#05
A fast, helpful, and open-source document parser
100
Quality
100
Trust
71
Fit
$ npx skills add run-llama/liteparse
#06
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
100
Quality
100
Trust
66
Fit
$ npx skills add bytedance/Dolphin
Evaluation
Handles layout, headings, and tables without destroying context
Reports extraction limits and OCR uncertainty
Supports batch or repeatable processing
Documents privacy and local processing assumptions
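The four evaluation criteria above can be recorded as a simple checklist when comparing candidates. The following is a rough sketch only: `SkillEvaluation`, its field names, and `unmet_criteria` are hypothetical names chosen for illustration, and the true/false answers have to come from your own testing of a skill, not from this page.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class SkillEvaluation:
    # One flag per criterion in the evaluation list (names are illustrative).
    preserves_layout: bool      # layout, headings, and tables keep their context
    reports_uncertainty: bool   # extraction limits and OCR uncertainty are reported
    supports_batch: bool        # batch or repeatable processing is supported
    documents_privacy: bool     # privacy and local-processing assumptions are documented


def unmet_criteria(evaluation: SkillEvaluation) -> List[str]:
    """Return the names of the criteria a candidate skill does not satisfy."""
    return [f.name for f in fields(evaluation) if not getattr(evaluation, f.name)]


# Example: a candidate that parses well but never reports OCR uncertainty.
candidate = SkillEvaluation(
    preserves_layout=True,
    reports_uncertainty=False,
    supports_batch=True,
    documents_privacy=True,
)
gaps = unmet_criteria(candidate)
```

An empty list means the candidate met every criterion you checked; anything else names the gaps to investigate before installing.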
Questions
Choose a skill that supports your document type, preserves tables or headings, and makes extraction failures visible instead of silently guessing.
Some skills can read scanned PDFs, but scanned pages usually need OCR, and high-stakes documents still call for human review.