96 skills matching "pdf"
Best blend of quality, stars, freshness, and agent usage
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
$ npx skills add PaddlePaddle/PaddleOCR
AI-powered job search system built on Claude Code. 14 skill modes, Go dashboard, PDF generation, batch processing.
$ npx skills add santifer/career-ops
PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.
$ npx skills add opendataloader-project/opendataloader-pdf
Crawlee: A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Parsel, BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
$ npx skills add apify/crawlee-python
AI suite powered by state-of-the-art models and providing advanced AI/AGI functions. Includes AI personas, AGI functions, world-class Beam multi-model chats, text-to-image, voice, response streaming, code highlighting and execution, PDF import, presets for developers, much more. Deploy on-prem or in the cloud.
$ npx skills add enricoros/big-AGI
A Python library for reading and writing PDF, powered by QPDF
$ npx skills add pikepdf/pikepdf
A maroto way to create PDFs. Maroto is inspired by Bootstrap and uses gofpdf. Fast and simple.
$ npx skills add johnfercher/maroto
Read and extract text and other content from PDFs in C# (port of PDFBox)
$ npx skills add UglyToad/PdfPig
PDF exporter for HTML presentations
$ npx skills add astefanutti/decktape
iText for Java represents the next level of SDKs for developers that want to take advantage of the benefits PDF can bring. Equipped with a better document engine, high and low-level programming capabilities and the ability to create, edit and enhance PDF documents, iText can be a boon to nearly every workflow.
$ npx skills add itext/itext-java
Assist in organizing your piles of documents, resulting from scanners, e-mails and other sources with minimal effort.
$ npx skills add eikek/docspell
Document scanning app
$ npx skills add ossappscollective/OSS-DocumentScanner
A search engine that "just works" for Obsidian. Supports OCR and PDF indexing.
$ npx skills add scambier/obsidian-omnisearch
📐⚙ 2D vector line drawing and shape modeling for CNC and laser cutters.
$ npx skills add microsoft/maker.js
PDF editor for Windows. Install or run portable. GPLv3. No account, no subscription, no telemetry.
$ npx skills add SteveTheKiller/KillerPDF
iText for .NET is the .NET version of the iText library, formerly known as iTextSharp, which it replaces. iText represents the next level of SDKs for developers that want to take advantage of the benefits PDF can bring. Equipped with a better document engine, high and low-level programming capabilities and the ability to create, edit and enha
$ npx skills add itext/itext-dotnet
Comic and Manga reader, written with Node.js and using Electron
$ npx skills add ollm/OpenComic
PHP PDF Library (official TCPDF successor)
$ npx skills add tecnickcom/tc-lib-pdf
Minimal PDF creation library. <400 LOC, zero dependencies, makes real PDFs.
$ npx skills add Lulzx/tinypdf
Vector graphics in Go
$ npx skills add tdewolff/canvas
The headless Chrome/Chromium driver on top of Puppeteer. Take screenshots, generate PDFs, extract text and HTML with a production-ready API.
$ npx skills add microlinkhq/browserless
A web interface to extract tabular data from PDFs
$ npx skills add camelot-dev/excalibur
A <Pdf /> component for react-native
$ npx skills add wonday/react-native-pdf
Rust Bindings for the Skia Graphics Library
$ npx skills add rust-skia/rust-skia
The SILE Typesetter: Simon’s Improved Layout Engine
$ npx skills add sile-typesetter/sile
PDF translation that preserves layout, formulas, and structure, suited to scientific and technical documents
$ npx skills add wxyhgk/retain-pdf
A modern PDF library for TypeScript. Parse, modify, and generate PDFs with a clean, intuitive API.
$ npx skills add LibPDF-js/core
An extensible Markdown Editor, Viewer and Weblog Publisher for Windows
$ npx skills add RickStrahl/MarkdownMonster
Rust library to read, manipulate and write PDF files.
$ npx skills add pdf-rs/pdf
Database Reporting Tool and Tasks (.Net)
$ npx skills add ariacom/Seal-Report
A lightning fast image processing and resizing library for Go
$ npx skills add davidbyttow/govips
E-books for learning computer science
$ npx skills add tolerious/Programming_learning_resource
A lightweight 2D graphics library for modern GPUs, delivering high-performance text, image, and vector rendering across major platforms.
$ npx skills add Tencent/tgfx
Python wrapper for the arXiv API
$ npx skills add lukasschwab/arxiv.py
Versatile PDF creation and manipulation for Ruby
$ npx skills add gettalong/hexapdf
📰 Binary distribution of PDFium
$ npx skills add bblanchon/pdfium-binaries
Open source PDF editor.
$ npx skills add JakubMelka/PDF4QT
JasperReports® - Free Java Reporting Library
$ npx skills add Jaspersoft/jasperreports
javascript based business reporting platform :rocket:
$ npx skills add jsreport/jsreport
Mouseover Translate Any Language At Once - Chrome Extension: PDF Translator, EBOOK, EPUB, OCR, TTS, NETFLIX, YOUTUBE DUAL SUBTITLES, GOOGLE DOCS, AI, VIEWER, GMAIL, WRITING, IMAGE, DUAL SUBS, MANGA, HOVER, DICTIONARY, WEBTOON, EDGE, JAPANESE, ENGLISH
$ npx skills add ttop32/MouseTooltipTranslator
PDF references add-on for Zotero.
$ npx skills add MuiseDestiny/zotero-reference
Cross-platform desktop GUI app to clean image metadata
$ npx skills add szTheory/exifcleaner
converts binary PDF to JSON and text, for server-side PDF processing and command-line use. Zero dependency.
$ npx skills add modesty/pdf2json
Create chatbots with ease
$ npx skills add n4ze3m/dialoqbase
Selfhosted PDF manager, viewer and editor offering a seamless user experience on multiple devices.
$ npx skills add mrmn2/PdfDing
Enjoy reading with your favorite style.
$ npx skills add jesselau76/ebook-GPT-translator
Document reader
$ npx skills add baskerville/plato
Tool for making vertically typeset e-books in the style of ancient Chinese woodblock-printed books (Chinese Ancient eBooks Generator)
$ npx skills add shanleiguang/vRain
Get clean data from tricky documents, powered by vision-language models ⚡
$ npx skills add emcf/thepipe
Pdf creation module for dart/flutter
$ npx skills add DavBfr/dart_pdf
A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats including EPUB books and PDF documents.
$ npx skills add nazdridoy/kokoro-tts
Convert a pdf to an image
$ npx skills add spatie/pdf-to-image
Display paginated content in the browser and generate print books using web technology
$ npx skills add pagedjs/pagedjs
A multi-threaded PDF password cracking utility equipped with commonly encountered password format builders and dictionary attacks.
$ npx skills add mufeedvh/pdfrip
A library for converting HTML into PDFs using ReportLab
$ npx skills add xhtml2pdf/xhtml2pdf
An Open source app to download and read books from shadow library (Anna’s Archive)
$ npx skills add dstark5/Openlib
Specify a github or local repo, github pull request, arXiv or Sci-Hub paper, Youtube transcript or documentation URL on the web and scrape into a text file and clipboard for easier LLM ingestion
$ npx skills add jimmc414/onefilellm
Hackable CLI tool for converting Markdown files to PDF using Node.js and headless Chrome.
$ npx skills add simonhaenisch/md-to-pdf
Offline markdown to pdf, choose -> edit -> transform 🥂
$ npx skills add realdennis/md2pdf
kramdown is a fast, pure Ruby Markdown superset converter, using a strict syntax definition and supporting several common extensions.
$ npx skills add gettalong/kramdown
A high-quality PDF to Markdown tool based on large language model visual recognition.
$ npx skills add MarkPDFdown/markpdfdown
SVG file parsing / rendering library
$ npx skills add dompdf/php-svg-lib
📄 PDF Viewer Component for Angular
$ npx skills add VadimDez/ng2-pdf-viewer
Kibana Alert & Report App for Elasticsearch
$ npx skills add sentinl/sentinl
An app to convert images to PDF file!
$ npx skills add Swati4star/Images-to-PDF
Open Source Document Management System for Digital Archives (Scanned Documents)
$ npx skills add ciur/papermerge
📜 A Cheat-Sheet Collection from the WWW
$ npx skills add sk3pp3r/cheat-sheet-pdf
PDF++: the most Obsidian-native PDF annotation & viewing tool ever. Comes with optional Vim keybindings.
$ npx skills add RyotaUshio/obsidian-pdf-plus
Download your resume from resume.io as PDF
$ npx skills add felipeall/resumeio-to-pdf
A local, offline (after initial setup), portable OCR software that can process images and PDF files, using DeepSeek-OCR AI (running directly on your machine).
$ npx skills add th1nhhdk/local_ai_ocr
Dedoc is a library (service) for automating document parsing and bringing documents to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electronic documents. (Parse document; Document content extraction; Logical structure extraction; PDF parser; Scanned document parser; DOCX parser; HTML parser
$ npx skills add ispras/dedoc
ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.
$ npx skills add enoch3712/ExtractThinker
Extract and convert data from any document, images, pdfs, word doc, ppt or URL into multiple formats (Markdown, JSON, CSV, HTML) with intelligent structured data extraction and advanced OCR.
$ npx skills add NanoNets/docstrange
A book on JavaScript Promises
$ npx skills add azu/promises-book
translate scientific papers in latex, especially arxiv papers
$ npx skills add SUSYUSTC/MathTranslate
(eBook,PDFs Translation) A multilingual eBook processing tool supporting all eBook formats. Features online and offline translation while preserving original layouts. Compatible with both scanned and digital PDFs. Elegant user interface. The world's highest-performing open-source layout-preserving eBook translator.
$ npx skills add CBIhalsen/PolyglotPDF
:blue_book: E-book: a distilled summary of "Real-Time Rendering 3rd", a little over 97,000 characters in total. You can treat it as a plain-language Chinese version of "Real-Time Rendering 3rd", as a commentary on and study companion to "Real-Time Rendering 3rd", or as preparatory reading for "Real-Time Rendering 4th".
$ npx skills add QianMo/Real-Time-Rendering-3rd-CN-Summary-Ebook
🤖 A Node queue API for generating PDFs using headless Chrome. Comes with a CLI, S3 storage and webhooks for notifying subscribers about generated PDFs
$ npx skills add esbenp/pdf-bot
Scan, index, and archive all of your paper documents (acquired by Mayan EDMS)
$ npx skills add zhoubear/open-paperless
Simple wrapper of tabula-java: extract table from PDF into pandas DataFrame
$ npx skills add chezou/tabula-py
vue.js pdf viewer
$ npx skills add FranckFreiburger/vue-pdf
Moodle-DL downloads course content fast from Moodle (eg. lecture pdfs)
$ npx skills add C0D3D3V/Moodle-DL
A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.
$ npx skills add WZBSocialScienceCenter/pdftabextract
books pdf
$ npx skills add huyubing/books-pdf
An HTML to PDF library for the JVM. Based on Flying Saucer and Apache PDF-BOX 2. With SVG image support. Now also with accessible PDF support (WCAG, Section 508, PDF/UA)!
$ npx skills add danfickle/openhtmltopdf
A python module that wraps the pdftoppm utility to convert PDF to PIL Image object
$ npx skills add Belval/pdf2image
Open Source Virtual (Network) Printer for Windows that allows you to create PDFs, OCR text, and print images, with advanced features usually available only in enterprise solutions.
$ npx skills add clawsoftware/clawPDF
Converts PDF, DOC, DOCX, XML, HTML, RTF, etc to plain text
$ npx skills add sajari/docconv
CAJ to PDF converter (GUI version)
$ npx skills add sainnhe/caj2pdf-qt
A plugin for reading and annotating PDFs and EPUBs in obsidian.
$ npx skills add elias-sundqvist/obsidian-annotator
Android widget that can render PDF documents stored on SD card, linked as assets, or downloaded from a remote URL.
$ npx skills add voghDev/PdfViewPager
Converts a pdf file into a text file while keeping the layout of the original pdf. Useful to extract the content from a table in a pdf file for instance. This is a subclass of PDFTextStripper class (from the Apache PDFBox library).
$ npx skills add JonathanLink/PDFLayoutTextStripper
a "Proof of Concept or GTFO" mirror with an extensive index with also whole issues or individual articles as clean PDFs.
$ npx skills add angea/pocorgtfo
A curated list of resources for Document Understanding (DU) topic
$ npx skills add tstanislawek/awesome-document-understanding
List of Elixir books
$ npx skills add sger/ElixirBooks
Open-source platform for extracting structured data from documents using AI.
$ npx skills add DocumindHQ/documind