154 skills matching "document"
Best blend of quality, stars, freshness, and agent usage
Convert documents into Markdown for agent-readable context
$ npx skills add microsoft/markitdown
Build document intelligence and RAG workflows for agents
$ npx skills add infiniflow/ragflow
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
$ npx skills add PaddlePaddle/PaddleOCR
Installable GitHub library of 1,273+ agentic skills for Claude Code, Cursor, Codex CLI, Gemini CLI, Antigravity, and more. Includes installer CLI, bundles, workflows, and official/community skill collections.
$ npx skills add sickn33/antigravity-awesome-skills
📑 PageIndex: Document Index for Vectorless, Reasoning-based RAG
$ npx skills add VectifyAI/PageIndex
A set of ready to use Agent Skills for research, science, engineering, analysis, finance and writing.
$ npx skills add K-Dense-AI/claude-scientific-skills
PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.
$ npx skills add opendataloader-project/opendataloader-pdf
Specification and documentation for Agent Skills
$ npx skills add agentskills/agentskills
AI generates natively editable PPTX from any document: real PowerPoint shapes with native animations, not images · by Hugo He
$ npx skills add hugohe3/ppt-master
Claude Code Skills and 380+ agent skills from official dev teams and the community, compatible with Codex, Antigravity, Gemini CLI, Cursor and others.
$ npx skills add VoltAgent/awesome-agent-skills
Private AI platform for agents, assistants and enterprise search. Built-in Agent Builder, Deep research, Document analysis, Multi-model support, and API connectivity for agents.
$ npx skills add arc53/DocsGPT
Open-source LLM knowledge platform: turn raw documents into a queryable RAG, an autonomous reasoning agent, and a self-maintaining Wiki.
$ npx skills add Tencent/WeKnora
PM Skills Marketplace: 100+ agentic skills, commands, and plugins, from discovery to strategy, execution, launch, and growth.
$ npx skills add phuryn/pm-skills
An open-source RAG-based tool for chatting with your documents.
$ npx skills add Cinnamon/kotaemon
280+ free n8n automation templates: ready-to-use workflows for Gmail, Telegram, Slack, Discord, WhatsApp, Google Drive, Notion, OpenAI, and more. AI agents, RAG chatbots, email automation, social media, DevOps, and document processing. The largest open-source n8n template collection.
$ npx skills add enescingoz/awesome-n8n-templates
Anthony Fu's curated collection of agent skills.
$ npx skills add antfu/skills
AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation
$ npx skills add Marker-Inc-Korea/AutoRAG
The easiest and laziest way to build multi-agent LLM applications.
$ npx skills add LazyAGI/LazyLLM
The most accurate document search and store for building AI apps
$ npx skills add morphik-org/morphik-core
High accuracy RAG for answering questions from scientific documents with citations
$ npx skills add Future-House/paper-qa
README file generator, powered by AI.
$ npx skills add eli64s/readme-ai
A Python library for reading and writing PDF, powered by QPDF
$ npx skills add pikepdf/pikepdf
A maroto way to create PDFs. Maroto is inspired by Bootstrap and uses gofpdf. Fast and simple.
$ npx skills add johnfercher/maroto
A curated list of skills, tools, tutorials, and capabilities for AI coding agents (Claude, Codex, Antigravity, Copilot, VS Code)
$ npx skills add heilcheng/awesome-agent-skills
Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.
$ npx skills add apache/hamilton
Read and extract text and other content from PDFs in C# (port of PDFBox)
$ npx skills add UglyToad/PdfPig
PDF exporter for HTML presentations
$ npx skills add astefanutti/decktape
Distributed vector search for AI-native applications
$ npx skills add vearch/vearch
iText for Java represents the next level of SDKs for developers that want to take advantage of the benefits PDF can bring. Equipped with a better document engine, high and low-level programming capabilities and the ability to create, edit and enhance PDF documents, iText can be a boon to nearly every workflow.
$ npx skills add itext/itext-java
Assist in organizing your piles of documents, resulting from scanners, e-mails and other sources, with minimal effort.
$ npx skills add eikek/docspell
Document scanning app
$ npx skills add ossappscollective/OSS-DocumentScanner
A search engine that "just works" for Obsidian. Supports OCR and PDF indexing.
$ npx skills add scambier/obsidian-omnisearch
📐⚙ 2D vector line drawing and shape modeling for CNC and laser cutters.
$ npx skills add microsoft/maker.js
EdgeQuake 🌋 High-performance GraphRAG inspired by LightRag, written in Rust; transform documents into intelligent knowledge graphs for superior retrieval and generation
$ npx skills add raphaelmansuy/edgequake
PDF editor for Windows. Install or run portable. GPLv3. No account, no subscription, no telemetry.
$ npx skills add SteveTheKiller/KillerPDF
iText for .NET is the .NET version of the iText library, formerly known as iTextSharp, which it replaces. iText represents the next level of SDKs for developers that want to take advantage of the benefits PDF can bring. Equipped with a better document engine, high and low-level programming capabilities and the ability to create, edit and enha
$ npx skills add itext/itext-dotnet
Comic and Manga reader, written with Node.js and using Electron
$ npx skills add ollm/OpenComic
PHP PDF Library (official TCPDF successor)
$ npx skills add tecnickcom/tc-lib-pdf
Minimal PDF creation library. <400 LOC, zero dependencies, makes real PDFs.
$ npx skills add Lulzx/tinypdf
Vector graphics in Go
$ npx skills add tdewolff/canvas
A web interface to extract tabular data from PDFs
$ npx skills add camelot-dev/excalibur
A <Pdf /> component for react-native
$ npx skills add wonday/react-native-pdf
Rust Bindings for the Skia Graphics Library
$ npx skills add rust-skia/rust-skia
The SILE Typesetter: Simon’s Improved Layout Engine
$ npx skills add sile-typesetter/sile
PDF translation that preserves layout, formulas, and structure, suited to scientific and technical documents
$ npx skills add wxyhgk/retain-pdf
A modern PDF library for TypeScript. Parse, modify, and generate PDFs with a clean, intuitive API.
$ npx skills add LibPDF-js/core
An open-source manga translation tool based on manga-image-translator. Automatically translates Japanese, Korean, and American comics, ships with 5 built-in translation engines including OpenAI and Gemini, and provides a visual editor for freely adjusting text styles. One-click install, ready to use out of the box. If you like it, a ⭐ Star is welcome!
$ npx skills add hgmzhn/manga-translator-ui
An extensible Markdown Editor, Viewer and Weblog Publisher for Windows
$ npx skills add RickStrahl/MarkdownMonster
Interactive architecture diagrams for codebases
$ npx skills add CodeBoarding/CodeBoarding
Rust library to read, manipulate and write PDF files.
$ npx skills add pdf-rs/pdf
Building blocks for rapid development of GenAI applications
$ npx skills add deepsense-ai/ragbits
Database Reporting Tool and Tasks (.Net)
$ npx skills add ariacom/Seal-Report
A lightning fast image processing and resizing library for Go
$ npx skills add davidbyttow/govips
E-books for learning computer science
$ npx skills add tolerious/Programming_learning_resource
MORT translator project - Real-time game translator with OCR
$ npx skills add killkimno/MORT
A lightweight 2D graphics library for modern GPUs, delivering high-performance text, image, and vector rendering across major platforms.
$ npx skills add Tencent/tgfx
Declarative way to run AI models in React Native on device, powered by ExecuTorch.
$ npx skills add software-mansion/react-native-executorch
Python wrapper for the arXiv API
$ npx skills add lukasschwab/arxiv.py
Collection of open-source libraries and tools for Robotic Process Automation (RPA), designed to be used with both Robot Framework and Python
$ npx skills add robocorp/rpaframework
YomiToku is a Python package that provides an AI-powered document image analysis engine designed specifically for the Japanese language.
$ npx skills add kotaro-kinoshita/yomitoku
Versatile PDF creation and manipulation for Ruby
$ npx skills add gettalong/hexapdf
📰 Binary distribution of PDFium
$ npx skills add bblanchon/pdfium-binaries
Open source PDF editor.
$ npx skills add JakubMelka/PDF4QT
OpenOCR: An Open-Source Toolkit for General-OCR Research and Applications, integrates a unified training and evaluation benchmark, commercial-grade OCR and Document Parsing systems, and faithful reproductions of the core implementations from a wide range of academic papers.
$ npx skills add Topdu/OpenOCR
A self-hosted file conversion server & share tool that supports 445 file formats in 13 languages.
$ npx skills add zelon88/HRConvert2
JasperReports® - Free Java Reporting Library
$ npx skills add Jaspersoft/jasperreports
javascript based business reporting platform 🚀
$ npx skills add jsreport/jsreport
An iOS OCR Server Using Apple’s Vision Framework
$ npx skills add riddleling/iOS-OCR-Server
A curated collection of practical AI projects implementing OCR systems, RAG, AI agents, and other AI use cases.
$ npx skills add Sumanth077/Hands-On-AI-Engineering
Mouseover Translate Any Language At Once - Chrome Extension: PDF Translator, EBOOK, EPUB, OCR, TTS, NETFLIX, YOUTUBE DUAL SUBTITLES, GOOGLE DOCS, AI, VIEWER, GMAIL, WRITING, IMAGE, DUAL SUBS, MANGA, HOVER, DICTIONARY, WEBTOON, EDGE, JAPANESE, ENGLISH
$ npx skills add ttop32/MouseTooltipTranslator
🌝 MLKit is a powerful, easy-to-use toolkit. With ML Kit you can easily implement text recognition, barcode recognition, image labeling, face detection, object detection, and more.
$ npx skills add jenly1314/MLKit
A group of notebooks and other files which can help you learn AI from scratch.
$ npx skills add Ramakm/ai-hands-on
PDF references add-on for Zotero.
$ npx skills add MuiseDestiny/zotero-reference
OCR engine for all the languages
$ npx skills add mittagessen/kraken
Cross-platform desktop GUI app to clean image metadata
$ npx skills add szTheory/exifcleaner
Converts binary PDF to JSON and text, for server-side PDF processing and command-line use. Zero dependency.
$ npx skills add modesty/pdf2json
An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit. (https://idp-leaderboard.org/)
$ npx skills add NanoNets/docext
Create chatbots with ease
$ npx skills add n4ze3m/dialoqbase
Selfhosted PDF manager, viewer and editor offering a seamless user experience on multiple devices.
$ npx skills add mrmn2/PdfDing
Enjoy reading with your favorite style.
$ npx skills add jesselau76/ebook-GPT-translator
Document reader
$ npx skills add baskerville/plato
A tool for producing vertically typeset e-books in the style of ancient Chinese woodblock-printed books (Chinese Ancient eBooks Generator)
$ npx skills add shanleiguang/vRain
Get clean data from tricky documents, powered by vision-language models ⚡
$ npx skills add emcf/thepipe
Pdf creation module for dart/flutter
$ npx skills add DavBfr/dart_pdf
A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats including EPUB books and PDF documents.
$ npx skills add nazdridoy/kokoro-tts
Convert a pdf to an image
$ npx skills add spatie/pdf-to-image
Display paginated content in the browser and generate print books using web technology
$ npx skills add pagedjs/pagedjs
A multi-threaded PDF password cracking utility equipped with commonly encountered password format builders and dictionary attacks.
$ npx skills add mufeedvh/pdfrip
A library for converting HTML into PDFs using ReportLab
$ npx skills add xhtml2pdf/xhtml2pdf
An open-source app to download and read books from a shadow library (Anna’s Archive)
$ npx skills add dstark5/Openlib
Specify a GitHub or local repo, GitHub pull request, arXiv or Sci-Hub paper, YouTube transcript or documentation URL on the web and scrape it into a text file and clipboard for easier LLM ingestion
$ npx skills add jimmc414/onefilellm
Hackable CLI tool for converting Markdown files to PDF using Node.js and headless Chrome.
$ npx skills add simonhaenisch/md-to-pdf
Offline markdown to pdf, choose -> edit -> transform 🥂
$ npx skills add realdennis/md2pdf
kramdown is a fast, pure Ruby Markdown superset converter, using a strict syntax definition and supporting several common extensions.
$ npx skills add gettalong/kramdown
A high-quality PDF to Markdown tool based on large language model visual recognition.
$ npx skills add MarkPDFdown/markpdfdown
Read Japanese manga inside browser with selectable text.
$ npx skills add kha-white/mokuro
SVG file parsing / rendering library
$ npx skills add dompdf/php-svg-lib
Free Offline OCR: an offline Chinese text detection and recognition SDK
$ npx skills add myhub/tr
📄 PDF Viewer Component for Angular
$ npx skills add VadimDez/ng2-pdf-viewer
Kibana Alert & Report App for Elasticsearch
$ npx skills add sentinl/sentinl
An app to convert images to a PDF file!
$ npx skills add Swati4star/Images-to-PDF
Open Source Document Management System for Digital Archives (Scanned Documents)
$ npx skills add ciur/papermerge
RMT (RuoMengTu) is a free, open-source macro tool built on AHKv2. Let the code handle the tedious work; you have more meaningful things to do.
$ npx skills add zclucas/RMT
SnapX is a free, open-source, cross-platform tool that lets you capture or record any area of your screen and instantly share it with a single keypress. Upload images, videos, text, and more to multiple supported destinations with ease. ShareX fork
$ npx skills add SnapXL/SnapX
CCExtractor - Official version maintained by the core team
$ npx skills add CCExtractor/ccextractor
📜 A Cheat-Sheet Collection from the WWW
$ npx skills add sk3pp3r/cheat-sheet-pdf
Open-source screenshot and screen recording for macOS. The free, native alternative to CleanShot X. Built with Swift 6.0 and SwiftUI.
$ npx skills add lzhgus/Capso
Local-first, open-source AI assistant for your data. Unify tasks, notes, docs, photos, and bookmarks. Private, self-hosted, and extensible via APIs.
$ npx skills add eclaire-labs/eclaire
Note Companion: AI assistant for Obsidian that goes beyond just a chat. (prev File Organizer 2000)
$ npx skills add Nexus-JPF/note-companion
PDF++: the most Obsidian-native PDF annotation & viewing tool ever. Comes with optional Vim keybindings.
$ npx skills add RyotaUshio/obsidian-pdf-plus
A minimalist SOTA LaTeX OCR model with only 20M parameters, running in browser. Full training pipeline available for self-reproduction.
$ npx skills add alephpi/Texo
Free Open Source Document Management System (mirror, no pull request or issues)
$ npx skills add mayan-edms/Mayan-EDMS
Download your resume from resume.io as PDF
$ npx skills add felipeall/resumeio-to-pdf
CnSTD: a Python 3 package based on PyTorch/MXNet for Chinese/English Scene Text Detection, Mathematical Formula Detection (MFD), and Layout Analysis
$ npx skills add breezedeus/CnSTD
Zotero Plugin for OCR
$ npx skills add UB-Mannheim/zotero-ocr
Web interface for recognizing text, proofreading OCR, and creating fully-digitized documents.
$ npx skills add scribeocr/scribeocr
A local, offline (after initial setup), portable OCR software that can process images and PDF files, using DeepSeek-OCR AI (running directly on your machine).
$ npx skills add th1nhhdk/local_ai_ocr
Quick, painless, intuitive OCR platform written in Rust and TypeScript. Modern UI with modern API, with an emphasis on intuitive user experience.
$ npx skills add readur/readur
Dedoc is a library (service) for automating document parsing and conversion to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electronic documents. (Parse document; Document content extraction; Logical structure extraction; PDF parser; Scanned document parser; DOCX parser; HTML parser
$ npx skills add ispras/dedoc
Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams)
$ npx skills add raphael-seo/Versatile-OCR-Program
ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.
$ npx skills add enoch3712/ExtractThinker
Extract and convert data from any document, images, pdfs, word doc, ppt or URL into multiple formats (Markdown, JSON, CSV, HTML) with intelligent structured data extraction and advanced OCR.
$ npx skills add NanoNets/docstrange
A packaged and flexible version of the CRAFT text detector and Keras CRNN recognition model.
$ npx skills add faustomorales/keras-ocr
A book about JavaScript Promises
$ npx skills add azu/promises-book
Translate scientific papers in LaTeX, especially arXiv papers
$ npx skills add SUSYUSTC/MathTranslate
(eBook, PDFs Translation) A multilingual eBook processing tool supporting all eBook formats. Features online and offline translation while preserving original layouts. Compatible with both scanned and digital PDFs. Elegant user interface. The world's highest-performing open-source layout-preserving eBook translator.
$ npx skills add CBIhalsen/PolyglotPDF
PaddleOCR inference in PyTorch. Converted from [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)
$ npx skills add frotms/PaddleOCR2Pytorch
ONNX Model Exporter for PaddlePaddle
$ npx skills add PaddlePaddle/Paddle2ONNX
📘 E-book: a distilled summary of "Real-Time Rendering 3rd", more than 97,000 Chinese characters in total. You can treat it as an accessible Chinese-language version of "Real-Time Rendering 3rd", as a commentary and study companion to it, or as preparatory reading for "Real-Time Rendering 4th".
$ npx skills add QianMo/Real-Time-Rendering-3rd-CN-Summary-Ebook
🤖 A Node queue API for generating PDFs using headless Chrome. Comes with a CLI, S3 storage and webhooks for notifying subscribers about generated PDFs
$ npx skills add esbenp/pdf-bot
AI VTuber with LLM, ASR, TTS, OCR, CV and more technologies to live stream or play Minecraft with you.
$ npx skills add AkagawaTsurunaki/ZerolanLiveRobot
Scan, index, and archive all of your paper documents (acquired by Mayan EDMS)
$ npx skills add zhoubear/open-paperless
Simple wrapper of tabula-java: extract table from PDF into pandas DataFrame
$ npx skills add chezou/tabula-py
vue.js pdf viewer
$ npx skills add FranckFreiburger/vue-pdf
A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.
$ npx skills add WZBSocialScienceCenter/pdftabextract
books pdf
$ npx skills add huyubing/books-pdf
AI Bank Statement Document Automation by LLM Model and Personal Financial Analysis
$ npx skills add johnsonhk88/AI-Bank-Statement-Document-Automation-By-LLM-And-Personal-Finanical-Analysis-Prediction
An HTML to PDF library for the JVM. Based on Flying Saucer and Apache PDF-BOX 2. With SVG image support. Now also with accessible PDF support (WCAG, Section 508, PDF/UA)!
$ npx skills add danfickle/openhtmltopdf
A python module that wraps the pdftoppm utility to convert PDF to PIL Image object
$ npx skills add Belval/pdf2image
Text-To-Speech, RAG, and LLMs. All local!
$ npx skills add alexpinel/Dot
Open Source Virtual (Network) Printer for Windows that allows you to create PDFs, OCR text, and print images, with advanced features usually available only in enterprise solutions.
$ npx skills add clawsoftware/clawPDF
Converts PDF, DOC, DOCX, XML, HTML, RTF, etc to plain text
$ npx skills add sajari/docconv
Python tool for grabbing text via screenshot
$ npx skills add ianzhao/textshot
Vision utilities for web interaction agents 👀
$ npx skills add reworkd/tarsier
CAJ to PDF converter (GUI version)
$ npx skills add sainnhe/caj2pdf-qt
A plugin for reading and annotating PDFs and EPUBs in Obsidian.
$ npx skills add elias-sundqvist/obsidian-annotator
Android widget that can render PDF documents stored on SD card, linked as assets, or downloaded from a remote URL.
$ npx skills add voghDev/PdfViewPager
MixTeX: multimodal LaTeX, ZhEn, and table OCR. It performs efficient CPU-based inference locally and offline on Windows.
$ npx skills add RQLuo/MixTeX-Latex-OCR
Converts a pdf file into a text file while keeping the layout of the original pdf. Useful to extract the content from a table in a pdf file for instance. This is a subclass of PDFTextStripper class (from the Apache PDFBox library).
$ npx skills add JonathanLink/PDFLayoutTextStripper
A "Proof of Concept or GTFO" mirror with an extensive index, plus whole issues and individual articles as clean PDFs.
$ npx skills add angea/pocorgtfo
A curated list of resources for Document Understanding (DU) topic
$ npx skills add tstanislawek/awesome-document-understanding
An offline OCR command-line program for Windows that recognizes text in images and outputs results as JSON strings so other programs can call it easily. Provides APIs for various languages. Compiled from PaddleOCR C++.
$ npx skills add hiroi-sora/PaddleOCR-json
List of Elixir books
$ npx skills add sger/ElixirBooks
Open-source platform for extracting structured data from documents using AI.
$ npx skills add DocumindHQ/documind