Decision filters

Choose skills by scenario, quality, and trust signals.

250 skills matching "extraction"

Best blend of quality, stars, freshness, and agent usage

1

Crawl4AI

VERIFIED · EXCELLENT · 100

Web crawling built for AI

$ npx skills add unclecode/crawl4ai
3 agent calls · 100% success · 66.1K stars · 79 quality · Claude Code + OpenAI Agents · 31.0K installs
High-confidence pick with strong adoption and healthy maintenance signals.
claude · gpt-4 · langchain · crewai · openclaw
by unclecode
2

Firecrawl

VERIFIED · EXCELLENT · 100

🔥 Search, scrape, and clean the web for AI agents.

$ npx skills add firecrawl/firecrawl
123.5K stars · 78 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
typescript · ai-agents
by firecrawl
3

Scrapy

VERIFIED · EXCELLENT · 100

High-throughput crawling and scraping for agent data pipelines

$ npx skills add scrapy/scrapy
61.8K stars · 77 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · crawler · rag
by scrapy
4

EasySpider

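VERIFIED · EXCELLENT · 100

A visual no-code/code-free web crawler/spider. EasySpider is a visual tool for browser automation testing, data collection, and web crawling that lets you design and run crawler tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service-wrapping system for web applications.

$ npx skills add NaiboWang/EasySpider
43.9K stars · 76 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
javascript · crawler
by NaiboWang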
5

Colly

VERIFIED · EXCELLENT · 100

Elegant Scraper and Crawler Framework for Golang

$ npx skills add gocolly/colly
25.3K stars · 74 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
go · crawler
by gocolly
6

Proxy Pool

VERIFIED · EXCELLENT · 100

Python ProxyPool for web spider

$ npx skills add jhao104/proxy_pool
23.4K stars · 74 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · crawler
by jhao104
7

Katana

VERIFIED · EXCELLENT · 100

A next-generation crawling and spidering framework.

$ npx skills add projectdiscovery/katana
16.7K stars · 73 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
go · crawler
by projectdiscovery
8

Newspaper

VERIFIED · EXCELLENT · 100

newspaper3k is a news, full-text, and article metadata extraction library for Python 3.

$ npx skills add codelucas/newspaper
15.1K stars · 72 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · crawler
by codelucas
9

Lux

VERIFIED · EXCELLENT · 100

👾 Fast and simple video download library and CLI tool written in Go

$ npx skills add iawia002/lux
31.4K stars · 72 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
go · crawler
by iawia002
10

Python

VERIFIED · EXCELLENT · 100

Python scripts: simulated Zhihu login, web crawlers, Excel operations, WeChat official accounts, and remote power-on.

$ npx skills add injetlee/Python
10.6K stars · 71 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · crawler
by injetlee
11

Wiseflow

VERIFIED · EXCELLENT · 100

A team of "cloud workhorses" that earns money for you online, 24/7.

$ npx skills add TeamWiseFlow/wiseflow
8.2K stars · 69 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
typescript · crawler
by TeamWiseFlow
12

Ferret

VERIFIED · EXCELLENT · 100

Declarative web scraping

$ npx skills add MontFerret/ferret
6.0K stars · 68 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
go · crawler
by MontFerret
13

JMComic Crawler Python

VERIFIED · EXCELLENT · 100

Python API for JMComic | Provides a Python API for accessing JMComic (禁漫天堂), supporting both the web and mobile clients | JMComic GitHub Actions downloader 🚀

$ npx skills add hect0x7/JMComic-Crawler-Python
5.8K stars · 68 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · crawler
by hect0x7
14

Scrapy Redis

VERIFIED · EXCELLENT · 100

Redis-based components for Scrapy.

$ npx skills add rmax/scrapy-redis
5.6K stars · 68 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · crawler
by rmax
15

Sparrow

VERIFIED · EXCELLENT · 100

Structured data extraction and instruction calling with ML, LLM and Vision LLM

$ npx skills add katanaml/sparrow
5.2K stars · 68 quality · Claude Code + OpenAI Agents
High-confidence pick with strong adoption and healthy maintenance signals.
python · rag
by katanaml
16

Browser Fingerprinting

VERIFIED · EXCELLENT · 100

Analysis of bot protection systems with available countermeasures 🚿. How to defeat anti-bot systems 👻 and get around browser fingerprinting scripts 🕵️‍♂️ when scraping the web.

$ npx skills add niespodd/browser-fingerprinting
5.0K stars · 68 quality · Claude Code + Browser agents
High-confidence pick with strong adoption and healthy maintenance signals.
javascript · crawler
by niespodd
17

Weibo Crawler

VERIFIED · EXCELLENT · 97

Sina Weibo crawler: scrape Sina Weibo data with Python and download Weibo images and videos.

$ npx skills add dataabc/weibo-crawler
4.5K stars · 67 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · crawler
by dataabc
18

Puppeteer Sharp

VERIFIED · EXCELLENT · 100

Headless Chrome .NET API

$ npx skills add hardkoded/puppeteer-sharp
3.9K stars · 67 quality · Claude Code + Browser agents
High-confidence pick with strong adoption and healthy maintenance signals.
c# · crawler
by hardkoded
19

Feapder

VERIFIED · EXCELLENT · 100

🚀🚀🚀 feapder is an easy-to-use, powerful Python crawler framework. It ships four built-in spiders (AirSpider, Spider, TaskSpider, and BatchSpider) for different scenarios, and supports resumable crawling, monitoring and alerting, browser rendering, and large-scale data deduplication. The companion crawler management system feaplat provides convenient deployment and scheduling.

$ npx skills add Boris-code/feapder
3.7K stars · 67 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · crawler
by Boris-code
20

Toapi

VERIFIED · EXCELLENT · 100

Every web site provides APIs.

$ npx skills add elliotgao2/toapi
3.5K stars · 67 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · crawler
by elliotgao2
21

Cariddi

VERIFIED · EXCELLENT · 100

Take a list of domains, crawl urls and scan for endpoints, secrets, api keys, file extensions, tokens and more

$ npx skills add edoardottt/cariddi
3.4K stars · 66 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
go · crawler
by edoardottt
22

Crawler

VERIFIED · EXCELLENT · 100

https://spatie.be/docs/crawler

$ npx skills add spatie/crawler
2.8K stars · 66 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
php · crawler
by spatie
23

FinalRecon

VERIFIED · EXCELLENT · 100

All In One Web Recon

$ npx skills add thewhiteh4t/FinalRecon
2.8K stars · 66 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · crawler
by thewhiteh4t
24

Pikepdf

VERIFIED · EXCELLENT · 100

A Python library for reading and writing PDF, powered by QPDF

$ npx skills add pikepdf/pikepdf
2.7K stars · 66 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · pdf
by pikepdf
25

Maroto

VERIFIED · EXCELLENT · 100

A maroto way to create PDFs. Maroto is inspired by Bootstrap and uses gofpdf. Fast and simple.

$ npx skills add johnfercher/maroto
2.7K stars · 66 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
go · pdf
by johnfercher
26

QueryList

VERIFIED · EXCELLENT · 100

🕷 The progressive PHP crawler framework! An elegant, progressive PHP scraping framework.

$ npx skills add jae-jae/QueryList
2.7K stars · 66 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
php · crawler
by jae-jae
27

PdfPig

VERIFIED · EXCELLENT · 100

Read and extract text and other content from PDFs in C# (port of PDFBox)

$ npx skills add UglyToad/PdfPig
2.4K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
c# · pdf
by UglyToad
28

Decktape

VERIFIED · EXCELLENT · 99

PDF exporter for HTML presentations

$ npx skills add astefanutti/decktape
2.4K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
javascript · pdf
by astefanutti
29

Crawler Detect

VERIFIED · EXCELLENT · 100

🕷 CrawlerDetect is a PHP class for detecting bots/crawlers/spiders via the user agent

$ npx skills add JayBizzle/Crawler-Detect
2.4K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
php · crawler
by JayBizzle
30

WechatSogou

VERIFIED · EXCELLENT · 97

A crawler API for WeChat official accounts, built on Sogou WeChat search.

$ npx skills add chyroc/WechatSogou
6.3K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · crawler
by chyroc
31

Itext Java

VERIFIED · EXCELLENT · 100

iText for Java represents the next level of SDKs for developers that want to take advantage of the benefits PDF can bring. Equipped with a better document engine, high and low-level programming capabilities and the ability to create, edit and enhance PDF documents, iText can be a boon to nearly every workflow.

$ npx skills add itext/itext-java
2.2K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
java · pdf
by itext
32

Docspell

VERIFIED · EXCELLENT · 100

Assists in organizing your piles of documents from scanners, e-mails, and other sources with minimal effort.

$ npx skills add eikek/docspell
2.2K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
elm · pdf
by eikek
33

OSS DocumentScanner

VERIFIED · EXCELLENT · 99

Document scanning app

$ npx skills add ossappscollective/OSS-DocumentScanner
2.1K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
c++ · pdf
by ossappscollective
34

Photon

VERIFIED · EXCELLENT · 98

Incredibly fast crawler designed for OSINT.

$ npx skills add s0md3v/Photon
12.9K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · crawler
by s0md3v
35

Videodl

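VERIFIED · EXCELLENT · 100

Videodl: A lightweight video downloader written in pure Python. It prefers high-definition, watermark-free sources and supports a large number of streaming platforms, including Douyin, Kuaishou, Xiaohongshu, Bilibili, TikTok, YouTube, FIFA+, Youku, Tencent, iQIYI, 1905.com, LeTV, Mango TV, Migu, PPTV, Sohu, Facebook, Twitter, Sina Weibo, Toutiao, NetEase Open Courses, WeSing, CCTV Yangshipin, Kugou Music MV, Xinpianchang, Zhihu, Baidu Tieba, and TED.

$ npx skills add CharlesPikachu/videodl
2.1K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · crawler
by CharlesPikachu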
36

Skycaiji

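VERIFIED · EXCELLENT · 100

Skycaiji (蓝天采集器) is a free, open-source crawler system. You collect data by pointing and clicking to edit rules. It runs locally, on shared hosting, or on cloud servers, can scrape almost any type of web page, integrates seamlessly with CMS site builders, publishes data in real time without login, and runs fully automatically with no manual intervention. It is a fully cross-platform cloud crawler system for large-scale web data collection.

$ npx skills add zorlan/skycaiji
2.1K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
php · crawler
by zorlan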
37

SCrawler

VERIFIED · EXCELLENT · 100

🏳️‍🌈 Media downloader from any site, including Twitter, Reddit, Instagram, BlueSky, TikTok, Threads, Facebook, OnlyFans, YouTube, Pinterest, PornHub, XHamster, XVIDEOS, ThisVid etc.

$ npx skills add AAndyProgram/SCrawler
2.0K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
visual-basic-.net · crawler
by AAndyProgram
38

Gain

VERIFIED · EXCELLENT · 98

Web crawling framework based on asyncio.

$ npx skills add elliotgao2/gain
2.0K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · crawler
by elliotgao2
39

Crawlab

VERIFIED · EXCELLENT · 100

Distributed web crawler admin platform for managing spiders, regardless of language or framework.

$ npx skills add crawlab-team/crawlab
12.2K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
go · crawler
by crawlab-team
40

Obsidian Omnisearch

VERIFIED · EXCELLENT · 100

A search engine that "just works" for Obsidian. Supports OCR and PDF indexing.

$ npx skills add scambier/obsidian-omnisearch
2.0K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
typescript · pdf
by scambier
41

Maker.Js

VERIFIED · EXCELLENT · 100

📐⚙ 2D vector line drawing and shape modeling for CNC and laser cutters.

$ npx skills add microsoft/maker.js
2.0K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
typescript · pdf
by microsoft
42

KillerPDF

VERIFIED · EXCELLENT · 100

PDF editor for Windows. Install or run portable. GPLv3. No account, no subscription, no telemetry.

$ npx skills add SteveTheKiller/KillerPDF
2.0K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
c# · pdf
by SteveTheKiller
43

Itext Dotnet

VERIFIED · EXCELLENT · 99

iText for .NET is the .NET version of the iText library, formerly known as iTextSharp, which it replaces. iText represents the next level of SDKs for developers that want to take advantage of the benefits PDF can bring. Equipped with a better document engine, high and low-level programming capabilities and the ability to create, edit and enha

$ npx skills add itext/itext-dotnet
1.9K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
c# · pdf
by itext
44

Webmagic

VERIFIED · EXCELLENT · 98

A scalable web crawler framework for Java.

$ npx skills add code4craft/webmagic
11.7K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
java · crawler
by code4craft
45

Article Extractor

VERIFIED · EXCELLENT · 100

Extract the main article from a given URL with Node.js

$ npx skills add extractus/article-extractor
1.9K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
javascript · crawler
by extractus
46

OpenComic

VERIFIED · EXCELLENT · 100

Comic and Manga reader, written with Node.js and using Electron

$ npx skills add ollm/OpenComic
1.9K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
javascript · pdf
by ollm
47

NewPipeExtractor

VERIFIED · EXCELLENT · 100

NewPipe's core library for extracting data from streaming sites

$ npx skills add TeamNewPipe/NewPipeExtractor
1.9K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
java · crawler
by TeamNewPipe
48

X Crawl

VERIFIED · EXCELLENT · 98

Flexible Node.js AI-assisted crawler library

$ npx skills add coder-hxl/x-crawl
1.9K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
typescript · crawler
by coder-hxl
49

Tc Lib Pdf

VERIFIED · EXCELLENT · 93

PHP PDF Library (official TCPDF successor)

$ npx skills add tecnickcom/tc-lib-pdf
1.8K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
php · pdf
by tecnickcom
50

WaterCrawl

VERIFIED · EXCELLENT · 93

Transform Web Content into LLM-Ready Data

$ npx skills add watercrawl/WaterCrawl
1.8K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
typescript · crawler
by watercrawl
51

Tinypdf

VERIFIED · EXCELLENT · 100

Minimal PDF creation library. <400 LOC, zero dependencies, makes real PDFs.

$ npx skills add Lulzx/tinypdf
1.8K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
typescript · pdf
by Lulzx
52

Canvas

VERIFIED · EXCELLENT · 98

Vector graphics in Go

$ npx skills add tdewolff/canvas
1.8K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
go · pdf
by tdewolff
53

Excalibur

VERIFIED · EXCELLENT · 100

A web interface to extract tabular data from PDFs

$ npx skills add camelot-dev/excalibur
1.8K stars · 65 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · pdf
by camelot-dev
54

React Native Pdf

VERIFIED · EXCELLENT · 98

A <Pdf /> component for react-native

$ npx skills add wonday/react-native-pdf
1.8K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
javascript · pdf
by wonday
55

Rust Skia

VERIFIED · EXCELLENT · 98

Rust Bindings for the Skia Graphics Library

$ npx skills add rust-skia/rust-skia
1.8K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
rust · pdf
by rust-skia
56

Sile

VERIFIED · EXCELLENT · 100

The SILE Typesetter — Simon’s Improved Layout Engine

$ npx skills add sile-typesetter/sile
1.8K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
lua · pdf
by sile-typesetter
57

Retain Pdf

VERIFIED · EXCELLENT · 98

Translate PDFs while preserving layout, formulas, and structure; suited to research and technical documents.

$ npx skills add wxyhgk/retain-pdf
1.8K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · pdf
by wxyhgk
58

Crawler Illegal Cases In China

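VERIFIED · EXCELLENT · 97

A collection of legal cases in China involving web crawlers. This project compiles news, materials, laws, and regulations related to lawsuits and violations involving crawler developers in mainland China, to help practitioners working there understand the relevant laws and avoid crossing data-compliance red lines.

$ npx skills add hiddendevj/Crawler_Illegal_Cases_In_China
4.6K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
html · crawler
by hiddendevj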
59

Core

VERIFIED · EXCELLENT · 100

A modern PDF library for TypeScript. Parse, modify, and generate PDFs with a clean, intuitive API.

$ npx skills add LibPDF-js/core
1.7K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
typescript · pdf
by LibPDF-js
60

Manga Translator Ui

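VERIFIED · EXCELLENT · 100

An open-source manga translation tool based on manga-image-translator. Supports automatic translation of Japanese, Korean, and American comics, includes 5 built-in translation engines such as OpenAI and Gemini, and provides a visual editor for freely adjusting text styles. One-click install, works out of the box. If you like it, a ⭐ Star is welcome!

$ npx skills add hgmzhn/manga-translator-ui
1.7K stars · 64 quality · Claude Code + OpenAI Agents
High-confidence pick with strong adoption and healthy maintenance signals.
python · ocr
by hgmzhn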
61

MarkdownMonster

VERIFIED · EXCELLENT · 99

An extensible Markdown Editor, Viewer and Weblog Publisher for Windows

$ npx skills add RickStrahl/MarkdownMonster
1.7K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
html · pdf
by RickStrahl
62

Pdf

VERIFIED · EXCELLENT · 100

Rust library to read, manipulate and write PDF files.

$ npx skills add pdf-rs/pdf
1.7K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
rust · pdf
by pdf-rs
63

Seal Report

VERIFIED · EXCELLENT · 92

Database Reporting Tool and Tasks (.Net)

$ npx skills add ariacom/Seal-Report
1.6K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
c# · pdf
by ariacom
64

Govips

VERIFIED · EXCELLENT · 100

A lightning fast image processing and resizing library for Go

$ npx skills add davidbyttow/govips
1.6K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
go · pdf
by davidbyttow
65

Programming Learning Resource

VERIFIED · EXCELLENT · 92

E-books for learning computer science.

$ npx skills add tolerious/Programming_learning_resource
1.6K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
html · pdf
by tolerious
66

Douyin

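VERIFIED · EXCELLENT · 97

Douyin crawler: collects public data such as account home pages, likes, favorites, original sounds, topics, search results, collections, posts, following lists, and followers.

$ npx skills add erma0/douyin
1.6K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
typescript · crawler
by erma0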
67

DotnetSpider

VERIFIED · EXCELLENT · 100

DotnetSpider, a .NET Standard web crawling library. It is a lightweight, efficient, and fast high-level web crawling & scraping framework.

$ npx skills add dotnetcore/DotnetSpider
4.1K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
c# · crawler
by dotnetcore
68

MORT

VERIFIED · EXCELLENT · 100

MORT translator project - Real-time game translator with OCR

$ npx skills add killkimno/MORT
1.5K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
c# · ocr
by killkimno
69

Tgfx

VERIFIED · EXCELLENT · 98

A lightweight 2D graphics library for modern GPUs, delivering high-performance text, image, and vector rendering across major platforms.

$ npx skills add Tencent/tgfx
1.5K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
c++ · pdf
by Tencent
70

React Native Executorch

VERIFIED · EXCELLENT · 98

Declarative way to run AI models in React Native on device, powered by ExecuTorch.

$ npx skills add software-mansion/react-native-executorch
1.5K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
c++ · ocr
by software-mansion
71

Arxiv.Py

VERIFIED · EXCELLENT · 97

Python wrapper for the arXiv API

$ npx skills add lukasschwab/arxiv.py
1.5K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · pdf
by lukasschwab
72

Rpaframework

VERIFIED · EXCELLENT · 100

Collection of open-source libraries and tools for Robotic Process Automation (RPA), designed to be used with both Robot Framework and Python

$ npx skills add robocorp/rpaframework
1.5K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · ocr
by robocorp
73

ScopeSentry

VERIFIED · EXCELLENT · 98

ScopeSentry: cyberspace mapping, subdomain enumeration, port scanning, sensitive information discovery, vulnerability scanning, distributed nodes

$ npx skills add Autumn-27/ScopeSentry
1.5K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
go · crawler
by Autumn-27
74

Work Crawler

VERIFIED · EXCELLENT · 96

Comic and novel download tool. Supported sites: 腾讯漫画 大角虫漫画 有妖气 咪咕 SF漫画 哦漫画 看漫画 漫画柜 汗汗酷漫 動漫伊甸園 快看漫画 微博动漫 733动漫网 大古漫画网 漫画DB 無限動漫 動漫狂 卡推漫画 动漫之家 动漫屋 古风漫画网 36漫画网 亲亲漫画网 乙女漫画 webtoons 咚漫 ニコニコ静画 ComicWalker ヤングエースUP モアイ pixivコミック サイコミ;アルファポリス カクヨム ハーメルン 小説家になろう 起点中文网 八一中文网 顶点小说 落霞小说网 努努书坊 笔趣阁 → epub.

$ npx skills add kanasimi/work_crawler
4.0K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
javascript · crawler
by kanasimi
75

Fscrawler

VERIFIED · EXCELLENT · 97

Elasticsearch File System Crawler (FS Crawler)

$ npx skills add dadoonet/fscrawler
1.4K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
java · crawler
by dadoonet
76

Skills

VERIFIED · EXCELLENT · 100

Give your AI the power to browse, scrape, and extract structured data from complex websites — with faster execution, lower cost, and more reliable results.

$ npx skills add browser-act/skills
1.4K stars · 64 quality · Claude Code + Cursor
High-confidence pick with strong adoption and healthy maintenance signals.
python · web-automation
by browser-act
77

Yomitoku

VERIFIED · EXCELLENT · 98

YomiToku is a Python package that provides an AI-powered document image analysis engine designed specifically for the Japanese language.

$ npx skills add kotaro-kinoshita/yomitoku
1.4K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · ocr
by kotaro-kinoshita
78

OpenWPM

VERIFIED · EXCELLENT · 92

A web privacy measurement framework

$ npx skills add openwpm/OpenWPM
1.4K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · crawler
by openwpm
79

Hexapdf

VERIFIED · EXCELLENT · 92

Versatile PDF creation and manipulation for Ruby

$ npx skills add gettalong/hexapdf
1.4K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
ruby · pdf
by gettalong
80

Pdfium Binaries

VERIFIED · EXCELLENT · 97

📰 Binary distribution of PDFium

$ npx skills add bblanchon/pdfium-binaries
1.4K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
shell · pdf
by bblanchon
81

PDF4QT

VERIFIED · EXCELLENT · 97

Open source PDF editor.

$ npx skills add JakubMelka/PDF4QT
1.4K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
c++ · pdf
by JakubMelka
82

OpenOCR

VERIFIED · EXCELLENT · 100

OpenOCR: An Open-Source Toolkit for General-OCR Research and Applications, integrates a unified training and evaluation benchmark, commercial-grade OCR and Document Parsing systems, and faithful reproductions of the core implementations from a wide range of academic papers.

$ npx skills add Topdu/OpenOCR
1.4K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · ocr
by Topdu
83

HRConvert2

VERIFIED · EXCELLENT · 100

A self-hosted file conversion server & share tool that supports 445 file formats in 13 languages.

$ npx skills add zelon88/HRConvert2
1.3K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
php · ocr
by zelon88
84

Jasperreports

VERIFIED · EXCELLENT · 97

JasperReports® - Free Java Reporting Library

$ npx skills add Jaspersoft/jasperreports
1.3K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
java · pdf
by Jaspersoft
85

Jsreport

VERIFIED · EXCELLENT · 100

JavaScript-based business reporting platform 🚀

$ npx skills add jsreport/jsreport
1.3K stars · 64 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
javascript · pdf
by jsreport
86

IOS OCR Server

VERIFIED · EXCELLENT · 96

An iOS OCR Server Using Apple’s Vision Framework

$ npx skills add riddleling/iOS-OCR-Server
1.3K stars · 63 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
swift · ocr
by riddleling
87

Hands On AI Engineering

VERIFIED · EXCELLENT · 97

A curated collection of practical AI projects implementing OCR systems, RAG, AI agents, and other AI use cases.

$ npx skills add Sumanth077/Hands-On-AI-Engineering
1.2K stars · 63 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · ocr
by Sumanth077
88

MouseTooltipTranslator

VERIFIED · EXCELLENT · 100

Mouseover Translate Any Language At Once - Chrome Extension: PDF Translator, EBOOK, EPUB, OCR, TTS, NETFLIX, YOUTUBE DUAL SUBTITLES, GOOGLE DOCS, AI, VIEWER, GMAIL, WRITING, IMAGE, DUAL SUBS, MANGA, HOVER, DICTIONARY, WEBTOON, EDGE, JAPANESE, ENGLISH

$ npx skills add ttop32/MouseTooltipTranslator
1.2K stars · 63 quality · Claude Code + OpenAI Agents
High-confidence pick with strong adoption and healthy maintenance signals.
javascript · ocr
by ttop32
89

MLKit

VERIFIED · EXCELLENT · 100

🌝 MLKit is a powerful, easy-to-use toolkit. With ML Kit you can easily implement text recognition, barcode recognition, image labeling, face detection, object detection, and more.

$ npx skills add jenly1314/MLKit
1.2K stars · 63 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
java · ocr
by jenly1314
90

Pandora Box

VERIFIED · EXCELLENT · 96

A simple Mihomo GUI: a lightweight Mihomo desktop client.

$ npx skills add snakem982/Pandora-Box
1.1K stars · 63 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
vue · crawler
by snakem982
91

AI Hands On

VERIFIED · EXCELLENT · 100

A group of notebooks and other files which can help you learn AI from scratch.

$ npx skills add Ramakm/ai-hands-on
1.1K stars · 63 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
jupyter-notebook · ocr
by Ramakm
92

Fess

VERIFIED · EXCELLENT · 100

Fess is a very powerful and easily deployable Enterprise Search Server.

$ npx skills add codelibs/fess
1.1K stars · 63 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
java · crawler
by codelibs
93

Newspaper4k

VERIFIED · EXCELLENT · 100

📰 Newspaper4k, a fork of the beloved Newspaper3k. Extraction of articles, titles, and metadata from news websites.

$ npx skills add AndyTheFactory/newspaper4k
1.1K stars · 63 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · crawler
by AndyTheFactory
94

Google Play Scraper

VERIFIED · EXCELLENT · 94

Node.js scraper to get data from Google Play

$ npx skills add facundoolano/google-play-scraper
2.9K stars · 63 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
javascript · crawler
by facundoolano
95

Browsertrix Crawler

VERIFIED · EXCELLENT · 100

Run a high-fidelity browser-based web archiving crawler in a single Docker container

$ npx skills add webrecorder/browsertrix-crawler
1.0K stars · 63 quality · Claude Code + Browser agents
High-confidence pick with strong adoption and healthy maintenance signals.
typescript · crawler
by webrecorder
96

Zotero Reference

VERIFIED · EXCELLENT · 94

PDF references add-on for Zotero.

$ npx skills add MuiseDestiny/zotero-reference
2.7K stars · 63 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
javascript · pdf
by MuiseDestiny
97

Kraken

VERIFIED · EXCELLENT · 95

OCR engine for all the languages

$ npx skills add mittagessen/kraken
1.0K stars · 63 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · ocr
by mittagessen
98

Exifcleaner

VERIFIED · EXCELLENT · 99

Cross-platform desktop GUI app to clean image metadata

$ npx skills add szTheory/exifcleaner
2.5K stars · 62 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
perl · pdf
by szTheory
99

News Please

VERIFIED · EXCELLENT · 99

news-please - an integrated web crawler and information extractor for news that just works

$ npx skills add fhamborg/news-please
2.5K stars · 62 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · crawler
by fhamborg
100

Pdf2json

VERIFIED · EXCELLENT · 94

Converts binary PDF to JSON and text, for server-side PDF processing and command-line use. Zero dependency.

$ npx skills add modesty/pdf2json
2.2K stars · 62 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
java · pdf
by modesty
101

Goclone

VERIFIED · EXCELLENT · 99

Website Cloner - Utilizes powerful Go routines to clone websites to your computer within seconds.

$ npx skills add goclone-dev/goclone
2.1K stars · 62 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
go · crawler
by goclone-dev
102

Docext

VERIFIED · EXCELLENT · 98

An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit. (https://idp-leaderboard.org/)

$ npx skills add NanoNets/docext
2.0K stars · 62 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · rag
by NanoNets
103

Dialoqbase

VERIFIED · EXCELLENT · 92

Create chatbots with ease

$ npx skills add n4ze3m/dialoqbase
1.8K stars · 61 quality · Claude Code + OpenAI Agents
High-confidence pick with strong adoption and healthy maintenance signals.
typescript · pdf
by n4ze3m
104

Diskover Community

VERIFIED · EXCELLENT · 98

Diskover Community Edition - Open source file indexer, file search engine and data management and analytics powered by Elasticsearch

$ npx skills add diskoverdata/diskover-community
1.8K stars · 61 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
php · crawler
by diskoverdata
105

Examples Of Web Crawlers

VERIFIEDEXCELLENT · 97

Some fun Python crawler examples that are friendly to beginners, mainly crawling Taobao, Tmall, WeChat, WeChat Read, Douban, QQ, and other sites.

$ npx skills add shengqiangzhang/examples-of-web-crawlers
14.6K stars61 qualityClaude Code + Browser agents
High-confidence pick with strong adoption and healthy maintenance signals.
htmlcrawler
by shengqiangzhangQuick view
106

PdfDing

VERIFIEDEXCELLENT · 98

Selfhosted PDF manager, viewer and editor offering a seamless user experience on multiple devices.

$ npx skills add mrmn2/PdfDing
1.7K stars61 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
pythonpdf
by mrmn2Quick view
107

Ebook GPT Translator

VERIFIEDEXCELLENT · 92

Enjoy reading with your favorite style.

$ npx skills add jesselau76/ebook-GPT-translator
1.7K stars61 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
pythonpdf
by jesselau76Quick view
108

Plato

VERIFIEDEXCELLENT · 87

Document reader

$ npx skills add baskerville/plato
1.6K stars61 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
rustpdf
by baskervilleQuick view
109

VRain

VERIFIEDEXCELLENT · 97

A tool for making vertically typeset e-books in the style of ancient Chinese woodblock-printed books (Chinese Ancient eBooks Generator)

$ npx skills add shanleiguang/vRain
1.6K stars61 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
perlpdf
by shanleiguangQuick view
110

Thepipe

VERIFIEDEXCELLENT · 97

Get clean data from tricky documents, powered by vision-language models ⚡

$ npx skills add emcf/thepipe
1.5K stars61 qualityClaude Code + OpenAI Agents
High-confidence pick with strong adoption and healthy maintenance signals.
pythonpdf
by emcfQuick view
111

Dart Pdf

VERIFIEDEXCELLENT · 91

Pdf creation module for dart/flutter

$ npx skills add DavBfr/dart_pdf
1.5K stars61 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
dartpdf
by DavBfrQuick view
112

Kokoro Tts

VERIFIEDEXCELLENT · 97

A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats including EPUB books and PDF documents.

$ npx skills add nazdridoy/kokoro-tts
1.5K stars61 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
pythonpdf
by nazdridoyQuick view
113

Pdf To Image

VERIFIEDEXCELLENT · 94

Convert a pdf to an image

$ npx skills add spatie/pdf-to-image
1.4K stars61 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
phppdf
by spatieQuick view
114

Sperm

VERIFIEDEXCELLENT · 91

A collection of excellent reverse-engineering articles the author has read; worth a look

$ npx skills add darbra/sperm
1.4K stars61 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
crawler
by darbraQuick view
115

Pagedjs

VERIFIEDEXCELLENT · 100

Display paginated content in the browser and generate print books using web technology

$ npx skills add pagedjs/pagedjs
1.4K stars61 qualityClaude Code + Browser agents
High-confidence pick with strong adoption and healthy maintenance signals.
htmlpdf
by pagedjsQuick view
116

Wombat

VERIFIEDEXCELLENT · 97

Lightweight Ruby web crawler/scraper with an elegant DSL which extracts structured data from pages.

$ npx skills add felipecsl/wombat
1.4K stars61 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
rubycrawler
by felipecslQuick view
117

Pdfrip

VERIFIEDEXCELLENT · 97

A multi-threaded PDF password cracking utility equipped with commonly encountered password format builders and dictionary attacks.

$ npx skills add mufeedvh/pdfrip
1.4K stars61 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
rustpdf
by mufeedvhQuick view
118

XSRFProbe

VERIFIEDEXCELLENT · 96

The Prime Cross Site Request Forgery (CSRF) Audit and Exploitation Toolkit.

$ npx skills add 0xInfection/XSRFProbe
1.3K stars60 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
pythoncrawler
by 0xInfectionQuick view
119

AppCrawler

VERIFIEDEXCELLENT · 85

An automatic app traversal tool based on Appium

$ npx skills add seveniruby/AppCrawler
1.2K stars60 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
scalacrawler
by sevenirubyQuick view
120

MyGPTReader

VERIFIEDEXCELLENT · 98

A community-driven way to read and chat with AI bots - powered by chatGPT.

$ npx skills add myreader-io/myGPTReader
4.4K stars60 qualityClaude Code + OpenAI Agents
High-confidence pick with strong adoption and healthy maintenance signals.
pythoncrawler
by myreader-ioQuick view
121

TorBot

VERIFIEDEXCELLENT · 87

Dark Web OSINT Tool

$ npx skills add DedSecInside/TorBot
4.1K stars60 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
pythoncrawler
by DedSecInsideQuick view
122

Gecco

VERIFIEDEXCELLENT · 89

Easy-to-use lightweight web crawler

$ npx skills add xtuhcy/gecco
2.5K stars59 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
javacrawler
by xtuhcyQuick view
123

Xhtml2pdf

VERIFIEDEXCELLENT · 95

A library for converting HTML into PDFs using ReportLab

$ npx skills add xhtml2pdf/xhtml2pdf
2.4K stars58 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
pythonpdf
by xhtml2pdfQuick view
124

Openlib

VERIFIEDEXCELLENT · 95

An Open source app to download and read books from shadow library (Anna’s Archive)

$ npx skills add dstark5/Openlib
2.4K stars58 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
dartpdf
by dstark5Quick view
125

Onefilellm

VERIFIEDEXCELLENT · 94

Specify a github or local repo, github pull request, arXiv or Sci-Hub paper, Youtube transcript or documentation URL on the web and scrape into a text file and clipboard for easier LLM ingestion

$ npx skills add jimmc414/onefilellm
2.0K stars58 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
pythonpdf
by jimmc414Quick view
126

Node Crawler

VERIFIEDEXCELLENT · 92

Web Crawler/Spider for NodeJS + server-side jQuery ;-)

$ npx skills add bda-research/node-crawler
6.8K stars58 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
typescriptcrawler
by bda-researchQuick view
127

Md To Pdf

VERIFIEDEXCELLENT · 94

Hackable CLI tool for converting Markdown files to PDF using Node.js and headless Chrome.

$ npx skills add simonhaenisch/md-to-pdf
1.8K stars58 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
typescriptpdf
by simonhaenischQuick view
128

Md2pdf

VERIFIEDEXCELLENT · 94

Offline markdown to pdf, choose -> edit -> transform 🥂

$ npx skills add realdennis/md2pdf
1.8K stars57 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
javascriptpdf
by realdennisQuick view
129

Kramdown

VERIFIEDEXCELLENT · 89

kramdown is a fast, pure Ruby Markdown superset converter, using a strict syntax definition and supporting several common extensions.

$ npx skills add gettalong/kramdown
1.8K stars57 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
rubypdf
by gettalongQuick view
130

Markpdfdown

VERIFIEDEXCELLENT · 94

A high-quality PDF to Markdown tool based on large language model visual recognition.

$ npx skills add MarkPDFdown/markpdfdown
1.7K stars57 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
pythonpdf
by MarkPDFdownQuick view
131

Mokuro

VERIFIEDEXCELLENT · 93

Read Japanese manga inside browser with selectable text.

$ npx skills add kha-white/mokuro
1.6K stars57 qualityClaude Code + Browser agents
High-confidence pick with strong adoption and healthy maintenance signals.
htmlocr
by kha-whiteQuick view
132

Trafilatura

VERIFIEDEXCELLENT · 91

Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML

$ npx skills add adbar/trafilatura
6.0K stars57 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
pythonweb-automation
by adbarQuick view
133

Php Svg Lib

VERIFIEDEXCELLENT · 87

SVG file parsing / rendering library

$ npx skills add dompdf/php-svg-lib
1.4K stars57 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
phppdf
by dompdfQuick view
134

XHS Spider

VERIFIEDEXCELLENT · 93

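A Xiaohongshu data collection tool for batch downloading site images and video resources, with a polished interface (batch download, video extraction, images). Telegram: https://t.me/+ZtLSwuIKTo44MDY1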

$ npx skills add xisuo67/XHS-Spider
1.4K stars57 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
crawler
by xisuo67Quick view
135

Tr

VERIFIEDSTRONG · 82

Free offline OCR: an offline Chinese text detection and recognition SDK

$ npx skills add myhub/tr
1.4K stars57 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.
pythonocr
by myhubQuick view
136

Ng2 Pdf Viewer

VERIFIEDEXCELLENT · 87

📄 PDF Viewer Component for Angular

$ npx skills add VadimDez/ng2-pdf-viewer
1.3K stars57 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
typescriptpdf
by VadimDezQuick view
137

Spider Flow

VERIFIEDSTRONG · 77

A next-generation crawler platform: define crawler flows graphically and build crawlers without writing code.

$ npx skills add ssssssss-team/spider-flow
11.3K stars57 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.Check: Repository looks stale
javacrawler
by ssssssss-teamQuick view
138

Sentinl

VERIFIEDEXCELLENT · 87

Kibana Alert & Report App for Elasticsearch

$ npx skills add sentinl/sentinl
1.3K stars57 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
javascriptpdf
by sentinlQuick view
139

Images To PDF

VERIFIEDEXCELLENT · 87

An app to convert images to PDF file!

$ npx skills add Swati4star/Images-to-PDF
1.3K stars57 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
javapdf
by Swati4starQuick view
140

Tumblr Crawler

VERIFIEDEXCELLENT · 87

Easily download all the photos/videos from Tumblr blogs.

$ npx skills add dixudx/tumblr-crawler
1.2K stars56 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
pythoncrawler
by dixudxQuick view
141

Scylla

VERIFIEDEXCELLENT · 89

Intelligent proxy pool for Humans™ to extract content from the internet and build your own Large Language Models in this new AI era

$ npx skills add MikeChongCan/scylla
4.0K stars56 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
pythoncrawler
by MikeChongCanQuick view
142

Mdcx

VERIFIEDSTRONG · 83

Movie metadata scraper

$ npx skills add sqzw-x/mdcx
3.6K stars56 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.
pythoncrawler
by sqzw-xQuick view
143

Papermerge

VERIFIEDEXCELLENT · 88

Open Source Document Management System for Digital Archives (Scanned Documents)

$ npx skills add ciur/papermerge
2.9K stars55 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
pythonpdf
by ciurQuick view
144

Avbook

VERIFIEDSTRONG · 75

AV movie management system; crawler for avmoo, javbus, and javlibrary; online AV video library; AV magnet link database (Japanese Adult Video Library, Adult Video Magnet Links, Japanese Adult Video Database)

$ npx skills add guyueyingmu/avbook
10.0K stars55 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.Check: Repository looks stale
phpcrawler
by guyueyingmuQuick view
145

RMT

EXCELLENT · 87

RMT (RuoMengTu) is a free, open-source macro tool built on AHKv2. Let the code handle the tedious work—you have more meaningful things to do.

$ npx skills add zclucas/RMT
984 stars55 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
htmlocr
by zclucasQuick view
146

Stormcrawler

EXCELLENT · 87

A scalable, mature and versatile web crawler based on Apache Storm

$ npx skills add apache/stormcrawler
977 stars55 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
javacrawler
by apacheQuick view
147

SnapX

EXCELLENT · 87

SnapX is a free, open-source, cross-platform tool that lets you capture or record any area of your screen and instantly share it with a single keypress. Upload images, videos, text, and more to multiple supported destinations—all with ease. ShareX fork

$ npx skills add SnapXL/SnapX
930 stars54 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
c#ocr
by SnapXLQuick view
148

X Kit

STRONG · 81

A tool for scraping and analyzing X (Twitter) user data and tweets.

$ npx skills add xiaoxiunique/x-kit
923 stars54 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.
typescriptcrawler
by xiaoxiuniqueQuick view
149

Parse Video

EXCELLENT · 87

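Watermark removal for short videos, written in Golang: Douyin, Pipixia, Huoshan, Weishi, Zuiyou, Kuaishou, Quanmin Xiaoshipin, Pipi Gaoxiao, Xigua Video, Huya, Pear Video, AcFun, Haokan Video...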

$ npx skills add wujunwei928/parse-video
921 stars54 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
gocrawler
by wujunwei928Quick view
150

PIKE RAG

VERIFIEDEXCELLENT · 87

PIKE-RAG: sPecIalized KnowledgE and Rationale Augmented Generation

$ npx skills add microsoft/PIKE-RAG
2.4K stars54 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
pythonrag
by microsoftQuick view
151

Ccextractor

EXCELLENT · 87

CCExtractor - Official version maintained by the core team

$ npx skills add CCExtractor/ccextractor
881 stars54 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
cocr
by CCExtractorQuick view
152

Cheat Sheet Pdf

VERIFIEDSTRONG · 81

📜 A Cheat-Sheet Collection from the WWW

$ npx skills add sk3pp3r/cheat-sheet-pdf
2.3K stars54 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.
htmlpdf
by sk3pp3rQuick view
153

Skrape.It

EXCELLENT · 87

A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.

$ npx skills add skrapeit/skrape.it
871 stars54 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
kotlincrawler
by skrapeitQuick view
154

Spider Reverse

EXCELLENT · 87

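Crawler reverse-engineering case studies. Completed: TLS fingerprinting | Ruishu (瑞数) | ZKH (震坤行) | NetEase Yidun | WeChat mini-program decompilation and reversing (百达星系) | Tonghuashun | RPC decryption | Jiasule (加速乐) | GeeTest slider CAPTCHA | 巨量算数 | Boss Zhipin | Qichacha | China Minmetals | QQ Music | Industrial Policy Big Data Platform | Qizhidao | Xueqiu (acw_sc__v2) | 1688 | Qimai Data | whggzy | 企名科技 | mohurd | 艺恩数据 | OKLink (欧科云链)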

$ npx skills add 0xAllenChen/spider_reverse
869 stars54 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
pythoncrawler
by 0xAllenChenQuick view
155

Capso

STRONG · 82

Open-source screenshot and screen recording for macOS. The free, native alternative to CleanShot X. Built with Swift 6.0 and SwiftUI.

$ npx skills add lzhgus/Capso
865 stars54 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.
swiftocr
by lzhgusQuick view
156

Eclaire

EXCELLENT · 87

Local-first, open-source AI assistant for your data. Unify tasks, notes, docs, photos, and bookmarks. Private, self-hosted, and extensible via APIs.

$ npx skills add eclaire-labs/eclaire
863 stars54 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
typescriptocr
by eclaire-labsQuick view
157

ArrowDL

EXCELLENT · 87

ArrowDL (Arrow Downloader) is a download manager for Windows, MacOS and Linux

$ npx skills add setvisible/ArrowDL
842 stars54 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
c++crawler
by setvisibleQuick view
158

Note Companion

EXCELLENT · 87

Note Companion: AI assistant for Obsidian that goes beyond just a chat. (prev File Organizer 2000)

$ npx skills add Nexus-JPF/note-companion
841 stars54 qualityClaude Code + OpenAI Agents
High-confidence pick with strong adoption and healthy maintenance signals.
typescriptocr
by Nexus-JPFQuick view
159

Obsidian Pdf Plus

VERIFIEDEXCELLENT · 87

PDF++: the most Obsidian-native PDF annotation & viewing tool ever. Comes with optional Vim keybindings.

$ npx skills add RyotaUshio/obsidian-pdf-plus
2.2K stars54 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
typescriptpdf
by RyotaUshioQuick view
160

Texo

EXCELLENT · 86

A minimalist SOTA LaTeX OCR model with only 20M parameters, running in the browser. Full training pipeline available for self-reproduction.

$ npx skills add alephpi/Texo
822 stars54 qualityClaude Code + Browser agents
High-confidence pick with strong adoption and healthy maintenance signals.
pythonocr
by alephpiQuick view
161

Wreq

EXCELLENT · 86

An ergonomic Rust HTTP Client with TLS fingerprint

$ npx skills add 0x676e67/wreq
816 stars54 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
rustcrawler
by 0x676e67Quick view
162

Mayan EDMS

STRONG · 81

Free Open Source Document Management System (mirror, no pull request or issues)

$ npx skills add mayan-edms/Mayan-EDMS
809 stars54 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.
pythonocr
by mayan-edmsQuick view
163

Resumeio To Pdf

STRONG · 80

Download your resume from resume.io as PDF

$ npx skills add felipeall/resumeio-to-pdf
806 stars54 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.
htmlocr
by felipeallQuick view
164

Jvppeteer

STRONG · 80

Java API For Chrome and Firefox

$ npx skills add fanyong920/jvppeteer
806 stars54 qualityClaude Code + Browser agents
Solid option that is likely worth shortlisting for production workflows.
javacrawler
by fanyong920Quick view
165

CnSTD

EXCELLENT · 86

CnSTD: a Python 3 package based on PyTorch/MXNet for Chinese/English Scene Text Detection, Mathematical Formula Detection (MFD), and Layout Analysis

$ npx skills add breezedeus/CnSTD
791 stars54 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
pythonocr
by breezedeusQuick view
166

Zotero Ocr

STRONG · 80

Zotero Plugin for OCR

$ npx skills add UB-Mannheim/zotero-ocr
784 stars54 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.
javascriptocr
by UB-MannheimQuick view
167

Scribeocr

EXCELLENT · 86

Web interface for recognizing text, proofreading OCR, and creating fully-digitized documents.

$ npx skills add scribeocr/scribeocr
781 stars54 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
javascriptocr
by scribeocrQuick view
168

Xxl Crawler

STRONG · 80

A lightweight web crawler framework for Java.

$ npx skills add xuxueli/xxl-crawler
755 stars54 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.
javacrawler
by xuxueliQuick view
169

Local AI Ocr

EXCELLENT · 86

A local, offline (after initial setup), portable OCR tool that can process images and PDF files, using DeepSeek-OCR AI (running directly on your machine).

$ npx skills add th1nhhdk/local_ai_ocr
737 stars54 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
pythonocr
by th1nhhdkQuick view
170

Readur

EXCELLENT · 86

Quick, painless, intuitive OCR platform written in Rust and TypeScript. Modern UI with modern API, with an emphasis on intuitive user experience.

$ npx skills add readur/readur
737 stars54 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
rustocr
by readurQuick view
171

PyPtt

STRONG · 80

The best PTT library

$ npx skills add PyPtt/PyPtt
724 stars54 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.
pythoncrawler
by PyPttQuick view
172

TumblThree

STRONG · 80

A Tumblr and Twitter Blog Backup Application

$ npx skills add TumblThreeApp/TumblThree
721 stars54 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.
c#crawler
by TumblThreeAppQuick view
173

Awesome Crawler

VERIFIEDSTRONG · 79

A collection of awesome web crawlers and spiders in different languages

$ npx skills add BruceDone/awesome-crawler
7.2K stars54 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.Check: Repository looks stale
crawler
by BruceDoneQuick view
174

Seonaut

STRONG · 80

Open source SEO audit tool.

$ npx skills add StJudeWasHere/seonaut
713 stars54 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.
gocrawler
by StJudeWasHereQuick view
175

Wscan

STRONG · 81

Wscan is a web security scanner that focuses on web security, dedicated to making web security accessible to everyone.

$ npx skills add chushuai/wscan
706 stars54 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.
gocrawler
by chushuaiQuick view
176

Dedoc

EXCELLENT · 86

Dedoc is a library (service) for automated document parsing and conversion to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electronic documents. (Parse document; Document content extraction; Logical structure extraction; PDF parser; Scanned document parser; DOCX parser; HTML parser

$ npx skills add ispras/dedoc
704 stars54 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
pythonocr
by isprasQuick view
177

Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams)

$ npx skills add raphael-seo/Versatile-OCR-Program
683 stars54 qualityClaude Code + OpenAI Agents
Solid option that is likely worth shortlisting for production workflows.
pythonocr
by raphael-seoQuick view
178

ExtractThinker

VERIFIEDEXCELLENT · 85

ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.

$ npx skills add enoch3712/ExtractThinker
1.5K stars53 qualityClaude Code + LangChain
High-confidence pick with strong adoption and healthy maintenance signals.
pythonpdf
by enoch3712Quick view
179

Headless Chrome Crawler

VERIFIEDSTRONG · 72

Distributed crawler powered by Headless Chrome

$ npx skills add yujiosaka/headless-chrome-crawler
5.6K stars53 qualityClaude Code + Browser agents
Solid option that is likely worth shortlisting for production workflows.Check: Repository looks stale
javascriptcrawler
by yujiosakaQuick view
180

Haipproxy

VERIFIEDSTRONG · 78

💖 Highly available distributed IP proxy pool, powered by Scrapy and Redis

$ npx skills add SpiderClub/haipproxy
5.5K stars53 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.Check: Repository looks stale
pythoncrawler
by SpiderClubQuick view
181

ECommerceCrawlers

VERIFIEDSTRONG · 78

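Hands-on 🐍 crawlers 🕷 for a variety of websites and e-commerce data. Includes 🕸: Taobao products, WeChat official accounts, Dianping, Qichacha, job sites, Xianyu, Ali tasks, Cnblogs, Weibo, Baidu Tieba, Douban Movies, ibaotu, Quanjing, Douban Music, a provincial drug administration, Sohu News, machine-learning text collection, FOFA asset collection, Autohome, the National Bureau of Statistics, Baidu keyword index counts, spider pan-directories, Toutiao, Douban film reviews, Ctrip, Xiaomi App Store, Anjuke, Tujia homestays ❤️❤️❤️. WeChat crawler showcase project: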

$ npx skills add DropsDevopsOrg/ECommerceCrawlers
5.5K stars53 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.Check: Repository looks stale
pythoncrawler
by DropsDevopsOrgQuick view
182

Docstrange

VERIFIEDEXCELLENT · 85

Extract and convert data from any document, images, pdfs, word doc, ppt or URL into multiple formats (Markdown, JSON, CSV, HTML) with intelligent structured data extraction and advanced OCR.

$ npx skills add NanoNets/docstrange
1.5K stars53 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
pythonocr
by NanoNetsQuick view
183

Keras Ocr

VERIFIEDEXCELLENT · 85

A packaged and flexible version of the CRAFT text detector and Keras CRNN recognition model.

$ npx skills add faustomorales/keras-ocr
1.5K stars53 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
pythonocr
by faustomoralesQuick view
184

Krawl

STRONG · 84

Krawl is a customizable, lightweight, cloud-native web deception server and anti-crawler that creates fake web applications with low-hanging vulnerabilities using realistic, randomly generated decoy data and AI-generated HTML templates.

$ npx skills add BlessedRebuS/Krawl
530 stars53 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.
pythoncrawler
by BlessedRebuSQuick view
185

Reader

STRONG · 84

Open source web infrastructure for AI. Scrape, crawl, and automate the web, clean markdown, browser sessions, ready for your agents.

$ npx skills add vakra-dev/reader
529 stars53 qualityClaude Code + Browser agents
Solid option that is likely worth shortlisting for production workflows.
typescriptai-agents
by vakra-devQuick view
186

Promises Book

VERIFIEDSTRONG · 79

A book about JavaScript Promises

$ npx skills add azu/promises-book
1.4K stars53 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.
htmlpdf
by azuQuick view
187

TorCrawl.Py

STRONG · 84

Crawl and extract (regular or onion) webpages through TOR network

$ npx skills add MikeMeliz/TorCrawl.py
511 stars53 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.
pythoncrawler
by MikeMelizQuick view
188

MathTranslate

VERIFIEDEXCELLENT · 85

Translate scientific papers written in LaTeX, especially arXiv papers

$ npx skills add SUSYUSTC/MathTranslate
1.4K stars53 qualityClaude Code
High-confidence pick with strong adoption and healthy maintenance signals.
pythonpdf
by SUSYUSTCQuick view
189

PolyglotPDF

VERIFIEDSTRONG · 84

(eBook,PDFs Translation) A multilingual eBook processing tool supporting all eBook formats. Features online and offline translation while preserving original layouts. Compatible with both scanned and digital PDFs. Elegant user interface. The world's highest-performing open-source layout-preserving eBook translator.

$ npx skills add CBIhalsen/PolyglotPDF
1.3K stars53 qualityClaude Code + OpenAI Agents
Solid option that is likely worth shortlisting for production workflows.
pythonpdf
by CBIhalsenQuick view
190

BiliBili Manga Downloader

VERIFIEDSTRONG · 84

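A handy Bilibili manga downloader with a graphical interface: keyword search for manga, QR-code login, downloading of locked chapters, multithreaded downloads, multiple save formats, local manga management, and one-click update checks!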

$ npx skills add Zeal-L/BiliBili-Manga-Downloader
1.2K stars52 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.
pythoncrawler
by Zeal-LQuick view
191

PaddleOCR2Pytorch

VERIFIEDSTRONG · 84

PaddleOCR inference in PyTorch. Converted from [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)

$ npx skills add frotms/PaddleOCR2Pytorch
1.2K stars52 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.
pythonocr
by frotmsQuick view
192

ProxyBroker

VERIFIEDSTRONG · 77

Proxy [Finder | Checker | Server]. HTTP(S) & SOCKS 🎭

$ npx skills add constverum/ProxyBroker
4.2K stars52 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.Check: Repository looks stale
pythoncrawler
by constverumQuick view
193

Crawly

VERIFIEDSTRONG · 84

Crawly, a high-level web crawling & scraping framework for Elixir.

$ npx skills add elixir-crawly/crawly
1.1K stars52 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.
elixircrawler
by elixir-crawlyQuick view
194

Proxypool

VERIFIEDSTRONG · 76

Automatically crawls proxy nodes on the public internet, de-duplicates and tests for usability and then provides a list of nodes

$ npx skills add zu1k/proxypool
4.0K stars52 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.Check: Repository looks stale
gocrawler
by zu1kQuick view
195

How To Scrape Google Finance

VERIFIEDSTRONG · 78

Use Web Scraper API to extract data from Google Finance, including stock titles, pricing, and price changes in percentages.

$ npx skills add oxylabs/how-to-scrape-google-finance
1.0K stars52 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.
pythonweb-automation
by oxylabsQuick view
196

RED HAWK

VERIFIEDSTRONG · 76

All in one tool for Information Gathering, Vulnerability Scanning and Crawling. A must have tool for all penetration testers

$ npx skills add Tuhinshubhra/RED_HAWK
3.7K stars52 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.Check: Repository looks stale
phpcrawler
by TuhinshubhraQuick view
197

SpiderSuite

STRONG · 70

SpiderSuite releases, wiki and roadmap

$ npx skills add spidersuite/SpiderSuite
959 stars52 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.
crawler
by spidersuiteQuick view
198

Paddle2ONNX

STRONG · 75

ONNX Model Exporter for PaddlePaddle

$ npx skills add PaddlePaddle/Paddle2ONNX
922 stars51 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.
c++ocr
by PaddlePaddleQuick view
199

Python3 Spider

VERIFIEDSTRONG · 71

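Python crawlers in practice: simulated login to major websites, including but not limited to slider verification, Pinduoduo, Meituan, Baidu, bilibili, Dianping, and Taobao. If you like it, please star ❤️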

$ npx skills add wkunzhi/Python3-Spider
3.4K stars51 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.Check: Repository looks stale
pythoncrawler
by wkunzhiQuick view
200

Scrapyrt

STRONG · 75

HTTP API for Scrapy spiders

$ npx skills add scrapinghub/scrapyrt
881 stars51 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.
pythoncrawler
by scrapinghubQuick view
201

Crawlergo

VERIFIEDSTRONG · 75

A powerful browser crawler for web vulnerability scanners

$ npx skills add Qianlitp/crawlergo
3.0K stars51 qualityClaude Code + Browser agents
Solid option that is likely worth shortlisting for production workflows.Check: Repository looks stale
gocrawler
by QianlitpQuick view
202

Gospider

VERIFIEDPROMISING · 69

Gospider - Fast web spider written in Go

$ npx skills add jaeles-project/gospider
3.0K stars51 qualityClaude Code
Useful candidate, but compare it with alternatives before adopting.Check: Repository looks stale
gocrawler
by jaeles-projectQuick view
203

DecryptLogin

VERIFIEDSTRONG · 75

DecryptLogin: APIs for logging in to some websites using requests.

$ npx skills add CharlesPikachu/DecryptLogin
2.9K stars51 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.Check: Repository looks stale
pythoncrawler
by CharlesPikachuQuick view
204

Owllook

VERIFIEDPROMISING · 69

owllook: a novel search engine

$ npx skills add howie6879/owllook
2.8K stars51 qualityClaude Code
Useful candidate, but compare it with alternatives before adopting.Check: Repository looks stale
pythoncrawler
by howie6879Quick view
205

GoogleScraper

VERIFIEDSTRONG · 75

A Python module to scrape several search engines (like Google, Yandex, Bing, Duckduckgo, ...). Including asynchronous networking support.

$ npx skills add NikolaiT/GoogleScraper
2.8K stars51 qualityClaude Code
Solid option that is likely worth shortlisting for production workflows.Check: Repository looks stale
htmlcrawler
by NikolaiTQuick view
206

Siteone Crawler

STRONG · 80

SiteOne Crawler is a cross-platform website crawler and analyzer for SEO, security, accessibility, and performance optimization—ideal for developers, DevOps, QA engineers, and consultants. Supports Windows, macOS, and Linux (x64 and arm64).

$ npx skills add janreges/siteone-crawler
754 stars · 51 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows.
rust · crawler
by janreges
207

Hacker News Digest

STRONG · 80

📰 Let ChatGPT Summarize Hacker News for You

$ npx skills add polyrabbit/hacker-news-digest
754 stars · 51 quality · Claude Code + OpenAI Agents
Solid option that is likely worth shortlisting for production workflows.
python · crawler
by polyrabbit
208

Geziyor

VERIFIED · STRONG · 75

Geziyor, a blazing fast web crawling & scraping framework for Go. Supports JS rendering.

$ npx skills add geziyor/geziyor
2.8K stars · 51 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows. Check: Repository looks stale
go · crawler
by geziyor
209

📘 E-book: a distilled summary of "Real-Time Rendering 3rd" | The whole book runs to over 97,000 Chinese characters. You can treat it as a plain-language Chinese edition of "Real-Time Rendering 3rd", as a commentary and study companion to "Real-Time Rendering 3rd", or as preparatory reading for "Real-Time Rendering 4th".

$ npx skills add QianMo/Real-Time-Rendering-3rd-CN-Summary-Ebook
2.7K stars · 51 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows. Check: Repository looks stale
pdf
by QianMo
210

Pdf Bot

VERIFIED · STRONG · 75

🤖 A Node queue API for generating PDFs using headless Chrome. Comes with a CLI, S3 storage and webhooks for notifying subscribers about generated PDFs

$ npx skills add esbenp/pdf-bot
2.6K stars · 51 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows. Check: Repository looks stale
javascript · pdf
by esbenp
211

ZerolanLiveRobot

STRONG · 80

AI VTuber with LLM, ASR, TTS, OCR, CV and more technologies to live stream or play Minecraft with you.

$ npx skills add AkagawaTsurunaki/ZerolanLiveRobot
699 stars · 51 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows.
python · ocr
by AkagawaTsurunaki
212

Open Paperless

VERIFIED · PROMISING · 69

Scan, index, and archive all of your paper documents (acquired by Mayan EDMS)

$ npx skills add zhoubear/open-paperless
2.6K stars · 51 quality · Claude Code
Useful candidate, but compare it with alternatives before adopting. Check: Repository looks stale
python · pdf
by zhoubear
213

High-performance asynchronous API for Douyin (抖音), TikTok, Xiaohongshu (小红书), Kuaishou (快手), Weibo (微博), Instagram, YouTube, Twitter (X), a captcha solver, and temp mail.

$ npx skills add TikHub/TikHub-API-Python-SDK
681 stars · 51 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows.
python · crawler
by TikHub
214

Leaked GPTs

VERIFIED · PROMISING · 69

Leaked GPTs prompts. Bypass the 25-message limit or try out GPTs without a Plus subscription.

$ npx skills add friuns2/Leaked-GPTs
2.4K stars · 50 quality · Claude Code + OpenAI Agents
Useful candidate, but compare it with alternatives before adopting. Check: Repository looks stale
python · crawler
by friuns2
215

Tabula Py

VERIFIED · STRONG · 74

Simple wrapper of tabula-java: extract tables from PDFs into pandas DataFrames

$ npx skills add chezou/tabula-py
2.3K stars · 50 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows. Check: Repository looks stale
python · pdf
by chezou
216

Abot

VERIFIED · STRONG · 74

Cross Platform C# web crawler framework built for speed and flexibility. Please star this project! +1.

$ npx skills add sjdirect/abot
2.3K stars · 50 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows. Check: Repository looks stale
c# · crawler
by sjdirect
217

Vue Pdf

VERIFIED · PROMISING · 68

Vue.js PDF viewer

$ npx skills add FranckFreiburger/vue-pdf
2.3K stars · 50 quality · Claude Code
Useful candidate, but compare it with alternatives before adopting. Check: Repository looks stale
javascript · pdf
by FranckFreiburger
218

Moodle DL

STRONG · 79

Moodle-DL downloads course content fast from Moodle (e.g. lecture PDFs)

$ npx skills add C0D3D3V/Moodle-DL
612 stars · 50 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows.
python · crawler
by C0D3D3V
219

Pdftabextract

VERIFIED · STRONG · 74

A set of tools for extracting tables from PDF files, helping to do data mining on (OCR-processed) scanned documents.

$ npx skills add WZBSocialScienceCenter/pdftabextract
2.3K stars · 50 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows. Check: Repository looks stale
python · pdf
by WZBSocialScienceCenter
220

Books Pdf

VERIFIED · PROMISING · 63

books pdf

$ npx skills add huyubing/books-pdf
2.2K stars · 50 quality · Claude Code
Useful candidate, but compare it with alternatives before adopting. Check: Repository looks stale
pdf
by huyubing
221

Openhtmltopdf

VERIFIED · PROMISING · 69

An HTML to PDF library for the JVM. Based on Flying Saucer and Apache PDF-BOX 2. With SVG image support. Now also with accessible PDF support (WCAG, Section 508, PDF/UA)!

$ npx skills add danfickle/openhtmltopdf
2.2K stars · 50 quality · Claude Code
Useful candidate, but compare it with alternatives before adopting. Check: Repository looks stale
java · pdf
by danfickle
222

Deepcrawl

STRONG · 79

100% free and fully open-source edge Firecrawl alternative with better link extraction for agents, which you can deploy to Cloudflare or Vercel yourself.

$ npx skills add lumpinif/deepcrawl
576 stars · 50 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows.
typescript · web-automation
by lumpinif
223

Vulnx

VERIFIED · STRONG · 74

vulnx 🕷️ is an intelligent bot and shell that can perform automatic injection and help researchers detect security vulnerabilities in CMS systems. It can perform quick CMS security detection, information collection (including subdomain names, IP address, country information, organizational information, and time zone), and vulnerability scanning.

$ npx skills add anouarbensaad/vulnx
2.1K stars · 50 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows. Check: Repository looks stale
python · crawler
by anouarbensaad
224

Gocrawl

VERIFIED · PROMISING · 67

Polite, slim and concurrent web crawler.

$ npx skills add PuerkitoBio/gocrawl
2.1K stars · 50 quality · Claude Code
Useful candidate, but compare it with alternatives before adopting. Check: Repository looks stale
go · crawler
by PuerkitoBio
225

Dirhunt

VERIFIED · PROMISING · 67

Find web directories without bruteforce

$ npx skills add Nekmo/dirhunt
2.0K stars · 50 quality · Claude Code
Useful candidate, but compare it with alternatives before adopting. Check: Repository looks stale
python · crawler
by Nekmo
226

Pdf2image

VERIFIED · STRONG · 73

A Python module that wraps the pdftoppm utility to convert PDFs to PIL Image objects

$ npx skills add Belval/pdf2image
2.0K stars · 50 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows. Check: Repository looks stale
python · pdf
by Belval
227

LxSpider

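VERIFIED · STRONG · 73

A collection of crawler examples, including but not limited to: Taobao, JD.com, Tmall, Douban, Douyin, Kuaishou, Weibo, WeChat, Alibaba, Toutiao, pdd, Youku, iQIYI, Ctrip, 12306, 58, Sohu, various indexes, CQVIP and Wanfang, Zlibrary, Oalib, novels, tendering sites, procurement sites, Xiaohongshu, Dianping, Twitter, Maimai, and Zhihu.

$ npx skills add lixi5338619/lxSpider
1.9K stars · 50 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows. Check: Repository looks stale
python · crawler
by lixi5338619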
228

Ast Hook For Js RE

VERIFIED · PROMISING · 62

A browser memory-roaming solution (still being explored...)

$ npx skills add JSREI/ast-hook-for-js-RE
1.9K stars · 50 quality · Claude Code
Useful candidate, but compare it with alternatives before adopting. Check: Repository looks stale
javascript · crawler
by JSREI
229

ClawPDF

VERIFIED · STRONG · 73

Open Source Virtual (Network) Printer for Windows that allows you to create PDFs, OCR text, and print images, with advanced features usually available only in enterprise solutions.

$ npx skills add clawsoftware/clawPDF
1.9K stars · 50 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows. Check: Repository looks stale
c# · pdf
by clawsoftware
230

BT Btt

VERIFIED · PROMISING · 67

Introduction to the magnet-link site U3C3 and domain name updates

$ npx skills add u3c3/BT-btt
1.8K stars · 50 quality · Claude Code
Useful candidate, but compare it with alternatives before adopting. Check: Repository looks stale
crawler
by u3c3
231

PSpider

VERIFIED · PROMISING · 67

A simple, easy-to-use Python crawler framework. QQ discussion group: 597510560

$ npx skills add xianhu/PSpider
1.8K stars · 50 quality · Claude Code
Useful candidate, but compare it with alternatives before adopting. Check: Repository looks stale
python · crawler
by xianhu
232

Go Spider

VERIFIED · STRONG · 73

[Crawler framework (golang)] An awesome Go concurrent crawler (spider) framework. The crawler is flexible and modular. It can easily be extended into a customized crawler, or you can use only the default crawl components.

$ npx skills add hu17889/go_spider
1.8K stars · 50 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows. Check: Repository looks stale
go · crawler
by hu17889
233

Docconv

VERIFIED · STRONG · 73

Converts PDF, DOC, DOCX, XML, HTML, RTF, etc. to plain text

$ npx skills add sajari/docconv
1.8K stars · 49 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows. Check: Repository looks stale
go · pdf
by sajari
234

Textshot

VERIFIED · PROMISING · 67

Python tool for grabbing text via screenshot

$ npx skills add ianzhao/textshot
1.8K stars · 49 quality · Claude Code
Useful candidate, but compare it with alternatives before adopting. Check: Repository looks stale
python · ocr
by ianzhao
235

Tarsier

VERIFIED · PROMISING · 67

Vision utilities for web interaction agents 👀

$ npx skills add reworkd/tarsier
1.8K stars · 49 quality · Claude Code + OpenAI Agents
Useful candidate, but compare it with alternatives before adopting. Check: Repository looks stale
jupyter-notebook · ocr
by reworkd
236

Caj2pdf Qt

VERIFIED · PROMISING · 67

CAJ to PDF converter (GUI version)

$ npx skills add sainnhe/caj2pdf-qt
1.8K stars · 49 quality · Claude Code
Useful candidate, but compare it with alternatives before adopting. Check: Repository looks stale
c++ · pdf
by sainnhe
237

Extractous

VERIFIED · STRONG · 73

Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.

$ npx skills add yobix-ai/extractous
1.8K stars · 49 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows. Check: Repository looks stale
rust · rag
by yobix-ai
238

Ruia

VERIFIED · STRONG · 73

Async Python 3.6+ web scraping micro-framework based on asyncio

$ npx skills add howie6879/ruia
1.7K stars · 49 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows. Check: Repository looks stale
python · crawler
by howie6879
239

Obsidian Annotator

VERIFIED · STRONG · 73

A plugin for reading and annotating PDFs and EPUBs in Obsidian.

$ npx skills add elias-sundqvist/obsidian-annotator
1.7K stars · 49 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows. Check: Repository looks stale
javascript · pdf
by elias-sundqvist
240

PdfViewPager

VERIFIED · STRONG · 73

Android widget that can render PDF documents stored on SD card, linked as assets, or downloaded from a remote URL.

$ npx skills add voghDev/PdfViewPager
1.7K stars · 49 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows. Check: Repository looks stale
java · pdf
by voghDev
241

AutoCrawler

VERIFIED · STRONG · 73

Google, Naver multiprocess image web crawler (Selenium)

$ npx skills add YoongiKim/AutoCrawler
1.7K stars · 49 quality · Claude Code + Browser agents
Solid option that is likely worth shortlisting for production workflows. Check: Repository looks stale
python · crawler
by YoongiKim
242

MixTeX Latex OCR

VERIFIED · STRONG · 72

MixTeX: multimodal LaTeX, Chinese-English (ZhEn), and table OCR. It performs efficient CPU-based inference locally and offline on Windows.

$ npx skills add RQLuo/MixTeX-Latex-OCR
1.6K stars · 49 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows. Check: Repository looks stale
python · ocr
by RQLuo
243

Spider Collection

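VERIFIED · STRONG · 72

Python crawlers. Current inventory: NetEase Cloud Music song crawler, Bilibili video crawler, Zhihu Q&A crawler, wallpaper crawler, xvideos video crawler, audiobook crawler, Weibo crawler, Anjuke listing crawler with data visualization, Bilibili video cover extractor, IP proxy pool wrapper, Zhihu million-user crawler with data analysis, and GitHub user crawler.

$ npx skills add srx-2000/spider_collection
1.6K stars · 49 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows. Check: Repository looks stale
python · crawler
by srx-2000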
244

PDFLayoutTextStripper

VERIFIED · STRONG · 72

Converts a pdf file into a text file while keeping the layout of the original pdf. Useful to extract the content from a table in a pdf file for instance. This is a subclass of PDFTextStripper class (from the Apache PDFBox library).

$ npx skills add JonathanLink/PDFLayoutTextStripper
1.6K stars · 49 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows. Check: Repository looks stale
java · pdf
by JonathanLink
245

Grab Site

VERIFIED · STRONG · 76

The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns

$ npx skills add ArchiveTeam/grab-site
1.6K stars · 49 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows.
python · crawler
by ArchiveTeam
246

Pocorgtfo

VERIFIED · PROMISING · 67

A "Proof of Concept or GTFO" mirror with an extensive index, offering whole issues or individual articles as clean PDFs.

$ npx skills add angea/pocorgtfo
1.6K stars · 49 quality · Claude Code
Useful candidate, but compare it with alternatives before adopting. Check: Repository looks stale
tex · pdf
by angea
247

Awesome Document Understanding

VERIFIED · PROMISING · 67

A curated list of resources on the topic of Document Understanding (DU)

$ npx skills add tstanislawek/awesome-document-understanding
1.5K stars · 49 quality · Claude Code
Useful candidate, but compare it with alternatives before adopting. Check: Repository looks stale
pdf
by tstanislawek
248

PaddleOCR Json

VERIFIED · STRONG · 72

An offline OCR command-line program for Windows that recognizes text in images and outputs results as JSON strings so other programs can call it easily. Provides APIs for various languages. Built from PaddleOCR C++.

$ npx skills add hiroi-sora/PaddleOCR-json
1.5K stars · 49 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows. Check: Repository looks stale
c++ · ocr
by hiroi-sora
249

ElixirBooks

VERIFIED · PROMISING · 61

List of Elixir books

$ npx skills add sger/ElixirBooks
1.5K stars · 49 quality · Claude Code
Useful candidate, but compare it with alternatives before adopting. Check: Repository looks stale
pdf
by sger
250

Documind

VERIFIED · PROMISING · 67

Open-source platform for extracting structured data from documents using AI.

$ npx skills add DocumindHQ/documind
1.5K stars · 49 quality · Claude Code
Useful candidate, but compare it with alternatives before adopting. Check: Repository looks stale
javascript · pdf
by DocumindHQ