15 skills matching "parse"
Best blend of quality, stars, freshness, and agent usage
PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.
$ npx skills add opendataloader-project/opendataloader-pdf
Crawlee: A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Parsel, BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
$ npx skills add apify/crawlee-python
Trail of Bits Claude Code skills for security research, vulnerability detection, and audit workflows
$ npx skills add trailofbits/skills
AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation
$ npx skills add Marker-Inc-Korea/AutoRAG
The AI-native database built for LLM applications, providing incredibly fast hybrid search of dense vector, sparse vector, tensor (multi-vector), and full-text.
$ npx skills add infiniflow/infinity
Extract the main article from a given URL with Node.js
$ npx skills add extractus/article-extractor
A modern PDF library for TypeScript. Parse, modify, and generate PDFs with a clean, intuitive API.
$ npx skills add LibPDF-js/core
Python binding to Modest and Lexbor engines. Fast HTML5 parser with CSS selectors for Python.
$ npx skills add rushter/selectolax
Next-gen phpDoc parser with support for intersection types and generics
$ npx skills add phpstan/phpdoc-parser
SVG file parsing / rendering library
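$ npx skills add dompdf/php-svg-lib
Golang short-video watermark removal: Douyin, Pipixia, Huoshan, Weishi, Zuiyou, Kuaishou, Quanmin Xiaoshipin, Pipi Gaoxiao, Xigua Video, Huya, Pear Video, AcFun, Haokan Video...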
$ npx skills add wujunwei928/parse-video
A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.
$ npx skills add skrapeit/skrape.it
Dedoc is a library (service) for automating document parsing and converting documents to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electronic documents. (Parse document; Document content extraction; Logical structure extraction; PDF parser; Scanned document parser; DOCX parser; HTML parser
$ npx skills add ispras/dedoc
Avito Parser: a free parser for automatically monitoring new Avito listings and/or exporting listings to a file
$ npx skills add Duff89/parser_avito
Extract and convert data from any document, images, PDFs, Word docs, PPT, or URL into multiple formats (Markdown, JSON, CSV, HTML) with intelligent structured data extraction and advanced OCR.
$ npx skills add NanoNets/docstrange