MinerU
Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.
Install with one command
$ npx skills add opendatalab/MinerU
Decision summary
Production-ready for document processing
Use this as a leading candidate, then validate the README and install path in your own agent stack.
Best for
- Document processing workflows
- Claude Code teams
- teams that value GitHub adoption signals
Not ideal for
- teams that need a vendor-supported SLA
- high-compliance environments without internal security review
Risk notes
- No major risk signals from current metadata
Quality profile
Excellent candidate for agent workflows
High-confidence pick with strong adoption and healthy maintenance signals.
Workflow fit
Use this skill in these scenarios
Parse messy files
Document processing
I need my agent to read PDFs, extract tables, and turn documents into structured data.
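As a rough illustration of the "structured data" step, the sketch below turns a pipe-style markdown table into a list of row dictionaries. It assumes the converted markdown contains such a table; the tool's real output may represent tables differently (for example as HTML), and `parse_pipe_table` is a hypothetical helper, not part of MinerU.

```python
import re

# Matches a separator cell such as "---", ":---" or ":---:".
SEPARATOR_CELL = re.compile(r"^:?-+:?$")


def parse_pipe_table(markdown: str) -> list[dict[str, str]]:
    """Parse the pipe-table rows found in `markdown` into dictionaries.

    Hypothetical helper: the first table row is treated as the header,
    separator rows are skipped, and every later row becomes one record.
    """
    rows = []
    for line in markdown.splitlines():
        line = line.strip()
        if not (line.startswith("|") and line.endswith("|")):
            continue  # not a table row
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if all(SEPARATOR_CELL.match(cell) for cell in cells):
            continue  # header/body separator
        rows.append(cells)
    if len(rows) < 2:
        return []  # a header with no data rows, or no table at all
    header, body = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in body]


sample = """
| Item | Qty |
| --- | --- |
| Bolt | 4 |
| Nut | 7 |
"""
records = parse_pipe_table(sample)
```

Cell values stay as strings here; converting them to numbers or dates would be a separate, document-specific step.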
Search private knowledge
RAG and knowledge
I need my agent to build a RAG workflow over documents and retrieve reliable context.
Manage repositories
GitHub automation
I need my agent to triage GitHub issues, review pull requests, and summarize repository changes.
Stack fit
Add it to a complete workflow
Ingest, retrieve, and cite
RAG knowledge base
A stack for document-heavy agents that ingest files, create searchable knowledge, retrieve relevant context, and answer with grounded sources.
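The ingest, retrieve, and cite flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: documents have already been converted to markdown strings, retrieval uses plain keyword overlap rather than embeddings, and every name (`Chunk`, `ingest`, `retrieve`, `answer_with_sources`) is hypothetical rather than part of any listed tool.

```python
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str  # file the chunk came from, kept for citation
    text: str


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ingest(docs: dict[str, str]) -> list[Chunk]:
    """Split each markdown document into paragraph-sized chunks."""
    chunks = []
    for source, markdown in docs.items():
        for paragraph in re.split(r"\n\s*\n", markdown):
            if paragraph.strip():
                chunks.append(Chunk(source, paragraph.strip()))
    return chunks


def retrieve(chunks: list[Chunk], query: str, k: int = 2) -> list[Chunk]:
    """Return up to k chunks that share the most tokens with the query."""
    query_counts = Counter(tokenize(query))
    scored = []
    for chunk in chunks:
        chunk_counts = Counter(tokenize(chunk.text))
        score = sum(min(n, chunk_counts[tok]) for tok, n in query_counts.items())
        if score > 0:
            scored.append((score, chunk))
    # Sort on the score only, so Chunk objects are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]


def answer_with_sources(chunks: list[Chunk], query: str) -> dict:
    """Bundle the retrieved context with the files it came from."""
    hits = retrieve(chunks, query)
    return {
        "context": [hit.text for hit in hits],
        "sources": sorted({hit.source for hit in hits}),
    }


docs = {
    "report.md": "# Revenue\n\nRevenue grew 12 percent in 2023.\n\nHeadcount stayed flat.",
    "notes.md": "Meeting notes about the office move.",
}
chunks = ingest(docs)
result = answer_with_sources(chunks, "How much did revenue grow?")
```

Keeping the source file on every chunk is what makes the final answer citable; a production stack would likely swap the keyword score for vector similarity while keeping the same shape.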
Turn skills into distribution
Content growth agent
A stack for turning newly indexed skills into SEO briefs, social drafts, comparison pages, and reusable publishing workflows.
Scrape, clean, and reuse web data
Web data pipeline
A practical stack for agents that crawl public pages, extract clean content, normalize data, and hand it to downstream research or RAG workflows.
Overview
Imported by the skill-only GitHub discovery pipeline because it matches agent skill, automation, RAG, or developer-tool signals. Protocol-server projects are excluded from automated imports.
Technical Details
- Version: 1.0.0
- License: Unknown
- Last Updated: 6/8/2026
- Published: 6/5/2026
Author
opendatalab✓
@opendatalab
Health Signals
- GitHub stars: 66.8K
- Quality score: 77/100
- Last GitHub push: Jun 6, 2026
- Framework hints: 2
- OpenAgentSkill views: 2
- Install copies: 0
- Outbound clicks: 0
Trust & Safety
- Open source (public GitHub repo)
- AI static analysis passed
- License: Unknown
- Manually verified by team
Related Skills
Stirling PDF
#1 PDF Application on GitHub that lets you edit PDFs on any device anywhere
80.3K stars · 0 installs
Tesseract
Tesseract Open Source OCR Engine (main repository)
74.6K stars · 0 installs
Docling
Get your documents ready for gen AI
61.1K stars · 0 installs
Siyuan
A privacy-first, self-hosted, fully open source personal knowledge management software, written in TypeScript and Golang.
44.3K stars · 0 installs