194 skills matching "data"
Best blend of quality, stars, freshness, and agent usage
Web crawling built for AI
$ npx skills add unclecode/crawl4ai
🔥 Search, scrape, and clean the web for AI agents.
$ npx skills add firecrawl/firecrawl
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
$ npx skills add PaddlePaddle/PaddleOCR
High-throughput crawling and scraping for agent data pipelines
$ npx skills add scrapy/scrapy
Adaptive web scraping for agent data collection
$ npx skills add D4Vinci/Scrapling
Connect agents to private data and retrieval workflows
$ npx skills add run-llama/llama_index
📚 "Building Agents from Scratch": a from-scratch tutorial on agent principles and practice
$ npx skills add datawhalechina/hello-agents
AI coding assistant skill (Claude Code, Codex, OpenCode, Cursor, Gemini CLI, and more). Turn any folder of code, SQL schemas, R scripts, shell scripts, docs, papers, images, or videos into a queryable knowledge graph. App code + database schema + infrastructure in one graph.
$ npx skills add safishamsi/graphify
Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search
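$ npx skills add milvus-io/milvus
A visual no-code web crawler/spider. EasySpider (易采集) is visual software for browser automation testing, data collection, and web crawling that lets you design and run crawl tasks graphically without code. Also known as ServiceWrapper, an intelligent service-wrapping system for web applications.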
$ npx skills add NaiboWang/EasySpider
Web data for AI applications
$ npx skills add firecrawl/firecrawl
Give your AI agent a web browser
$ npx skills add browser-use/browser-use
📚 Building large language models from scratch
$ npx skills add datawhalechina/happy-llm
Extract web data with LLM-guided scraping graphs
$ npx skills add ScrapeGraphAI/Scrapegraph-ai
Search infrastructure for AI
$ npx skills add chroma-core/chroma
A set of ready to use Agent Skills for research, science, engineering, analysis, finance and writing.
$ npx skills add K-Dense-AI/scientific-agent-skills
A set of ready to use Agent Skills for research, science, engineering, analysis, finance and writing.
$ npx skills add K-Dense-AI/claude-scientific-skills
The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, and optimize production-quality AI applications while controlling costs and managing access to models and data.
$ npx skills add mlflow/mlflow
Build reliable crawlers for LLM and RAG data ingestion
$ npx skills add apify/crawlee
Elegant Scraper and Crawler Framework for Golang
$ npx skills add gocolly/colly
OpenViking is an open-source context database designed specifically for AI Agents (such as openclaw). OpenViking unifies the management of context (memory, resources, and skills) that Agents need through a file system paradigm, enabling hierarchical context delivery and self-evolving.
$ npx skills add volcengine/OpenViking
Python ProxyPool for web spider
$ npx skills add jhao104/proxy_pool
Dolt – Git for Data
$ npx skills add dolthub/dolt
FinceptTerminal is a modern finance application offering advanced market analytics, investment research, and economic data tools, designed for interactive exploration and data-driven decision-making in a user-friendly environment.
$ npx skills add Fincept-Corporation/FinceptTerminal
PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.
$ npx skills add opendataloader-project/opendataloader-pdf
Open-source agentic AI data assistant for the next generation of AI + Data products.
$ npx skills add eosphoros-ai/DB-GPT
The container platform tailored for Kubernetes multi-cloud, datacenter, and edge management ⎈ 🖥 ☁️
$ npx skills add kubesphere/kubesphere
A next-generation crawling and spidering framework.
$ npx skills add projectdiscovery/katana
Turn any AI Agents into world-class data analysts through the open context layer that gives AI agents grounded, governed memory, context, SQL across 20+ data sources, that helps you build GenBI, agentic BI, text-to-sql, dashboards, and agentic analytics.
$ npx skills add Canner/WrenAI
newspaper3k is a news, full-text, and article metadata extraction in Python 3. Advanced docs:
$ npx skills add codelucas/newspaper
A powerful tool for creating datasets for LLM fine-tuning, RAG and Eval
$ npx skills add ConardLi/easy-dataset
An open source, privacy focused alternative to NotebookLM for teams with no data limits. Join our Discord: https://discord.gg/ejRNvftDp9
$ npx skills add MODSetter/SurfSense
👾 Fast and simple video download library and CLI tool written in Go
$ npx skills add iawia002/lux
BISHENG is an open LLM devops platform for next generation Enterprise AI applications. Powerful and comprehensive features include: GenAI workflow, RAG, Agent, Unified model management, Evaluation, SFT, Dataset Management, Enterprise-level System Management, Observability and more.
$ npx skills add dataelement/bisheng
Python scripts: simulated Zhihu login, crawlers, Excel operations, WeChat official accounts, remote power-on
$ npx skills add injetlee/Python
The all-in-one, open-source backend platform for agentic coding. InsForge gives your coding agent database, auth, storage, compute, hosting, and AI gateway to ship full-stack apps end-to-end.
$ npx skills add InsForge/InsForge
Build, Manage and Deploy AI/ML Systems
$ npx skills add Netflix/metaflow
A lightweight, lightning-fast, in-process vector database
$ npx skills add alibaba/zvec
Ultra-high-performance, secure, all-in-one acceleration engine for developer resources
$ npx skills add xixu-me/xget
Incremental engine for long horizon agents 🌟 Star if you like it!
$ npx skills add cocoindex-io/cocoindex
AI Observability & Evaluation
$ npx skills add Arize-ai/phoenix
Deeplake is AI Data Runtime for Agents. It provides serverless postgres with a multimodal datalake, enabling scalable retrieval and training.
$ npx skills add activeloopai/deeplake
Crawlee: A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Parsel, BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
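$ npx skills add apify/crawlee-python
A team of "workhorses in the cloud" that makes money for you online 24/7
$ npx skills add TeamWiseFlow/wiseflow
An LLM application development tutorial for beginner developers. Read online: https://datawhalechina.github.io/llm-universe/
$ npx skills add datawhalechina/llm-universe
🔍 LLM Application Development in Practice, Part 1: a full-stack guide to RAG. Read online: https://datawhalechina.github.io/all-in-rag/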
$ npx skills add datawhalechina/all-in-rag
Trail of Bits Claude Code skills for security research, vulnerability detection, and audit workflows
$ npx skills add trailofbits/skills
Dynamic, resilient AI orchestration. Coordinate data, models, and compute as you build AI workflows.
$ npx skills add flyteorg/flyte
AI + Data, online. https://vespa.ai
$ npx skills add vespa-engine/vespa
Plano is an AI-native proxy and data plane for agentic apps, with built-in orchestration, safety, observability, and smart LLM routing so you stay focused on your agents' core logic.
$ npx skills add katanemo/plano
Local-first chat history analyzer with AI.
$ npx skills add ChatLab/ChatLab
Open-source context retrieval layer for AI agents
$ npx skills add airweave-ai/airweave
🔥 An intelligent data-query system built on LLMs and RAG for conversational data analysis. Text-to-SQL Generation via LLMs using RAG.
$ npx skills add dataease/SQLBot
Open-source framework for building AI-powered apps in JavaScript, Go, and Python, built and used in production by Google
$ npx skills add genkit-ai/genkit
Declarative web scraping
$ npx skills add MontFerret/ferret
Python API for JMComic | Provides a Python API for accessing JMComic (禁漫天堂), supporting both the web and mobile clients | JMComic GitHub Actions downloader 🚀
$ npx skills add hect0x7/JMComic-Crawler-Python
Redis-based components for Scrapy.
$ npx skills add rmax/scrapy-redis
ZenML 🙏: One AI Platform from Pipelines to Agents. https://zenml.io.
$ npx skills add zenml-io/zenml
Structured data extraction and instruction calling with ML, LLM and Vision LLM
$ npx skills add katanaml/sparrow
Analysis of Bot Protection systems with available countermeasures 🚿. How to defeat anti-bot system 👻 and get around browser fingerprinting scripts 🕵️‍♂️ when scraping the web?
$ npx skills add niespodd/browser-fingerprinting
✨ The agentic HTML editor: your local AI agent writes the HTML, you ship it. 🚀 75 Skills × 9 Surfaces (magazine · deck · poster · XHS / tweet · prototype · data report · Hyperframes) 🛡️ Sandboxed preview · 📤 1-click to WeChat / X / Zhihu / HTML / PNG 🔑 Zero API key: Claude Code / Cursor / Codex / Gemini / Copilot / OpenCode / Qwen / Aider.
$ npx skills add nexu-io/html-anything
AI You Control: Choose your models. Own your data. Eliminate vendor lock-in.
$ npx skills add thunderbird/thunderbolt
Sina Weibo crawler: scrape Sina Weibo data with Python and download Weibo images and videos
$ npx skills add dataabc/weibo-crawler
Scrape data from Google Maps. Extracts data such as the name, address, phone number, website URL, rating, number of reviews, latitude and longitude, reviews, email and more for each place
$ npx skills add gosom/google-maps-scraper
High-performance open-source in-memory graph database for GraphRAG, AI memory, agentic AI, and real-time graph analytics. Cypher-compatible, built in C++.
$ npx skills add memgraph/memgraph
Headless Chrome .NET API
$ npx skills add hardkoded/puppeteer-sharp
Easiest and laziest way for building multi-agent LLMs applications.
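$ npx skills add LazyAGI/LazyLLM
🚀🚀🚀 feapder is an easy to use, powerful Python crawler framework. It includes four built-in spiders (AirSpider, Spider, TaskSpider, BatchSpider) for different scenarios, and supports resumable crawling, monitoring and alerting, browser rendering, and large-scale data deduplication. The feaplat crawler management system provides convenient deployment and scheduling.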
$ npx skills add Boris-code/feapder
Every web site provides APIs.
$ npx skills add elliotgao2/toapi
Take a list of domains, crawl urls and scan for endpoints, secrets, api keys, file extensions, tokens and more
$ npx skills add edoardottt/cariddi
Agent Skills as a Memory Layer
$ npx skills add memodb-io/Acontext
The platform for LLM evaluations and AI agent testing
$ npx skills add langwatch/langwatch
An EVM compatible Substrate chain, powered by StorageHub and secured by EigenLayer
$ npx skills add datahaven-xyz/datahaven
Free Trial Amazon Scraper API for extracting search, product, offer listing, reviews, question and answers, best sellers and sellers data.
$ npx skills add oxylabs/amazon-scraper
List of libraries, tools and APIs for web scraping and data processing.
$ npx skills add lorien/awesome-web-scraping
https://spatie.be/docs/crawler
$ npx skills add spatie/crawler
All In One Web Recon
$ npx skills add thewhiteh4t/FinalRecon
🕷 The progressive PHP crawler framework! An elegant, progressive PHP scraping framework.
$ npx skills add jae-jae/QueryList
Turn any webpage into structured data using LLMs
$ npx skills add mishushakov/llm-scraper
Low latency web data collector
$ npx skills add spider-rs/spider
Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows, that encode lineage/tracing and metadata. Runs and scales everywhere python does.
$ npx skills add apache/hamilton
🕷 CrawlerDetect is a PHP class for detecting bots/crawlers/spiders via the user agent
$ npx skills add JayBizzle/Crawler-Detect
WeChat official account crawler API based on Sogou WeChat search
$ npx skills add chyroc/WechatSogou
Incredibly fast crawler designed for OSINT.
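$ npx skills add s0md3v/Photon
Videodl: A lightweight video downloader written in pure python. (Prioritizes HD, watermark-free downloads and supports a large number of streaming platforms, including Douyin, Kuaishou, Xiaohongshu, Bilibili, TikTok, YouTube, FIFA+, Youku, Tencent, iQIYI, 1905.com, LeTV, Mango TV, Migu, PPTV, Sohu, Facebook, Twitter, Sina Weibo, Toutiao, NetEase Open Course, WeSing, CCTV Yangshipin, Kugou Music MV, Xinpianchang, Zhihu, Baidu Tieba, and TED.)
$ npx skills add CharlesPikachu/videodl
SkyCaiji (蓝天采集器) is a free, open-source crawler system that collects data with nothing more than point-and-click rule editing. It runs locally, on shared hosting, or on cloud servers, can collect almost any type of web page, integrates seamlessly with various CMS site builders, publishes data in real time without logging in, and is fully automated with no manual intervention. A fully cross-platform cloud crawler system for large-scale web data collection.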
$ npx skills add zorlan/skycaiji
🏳️‍🌈 Media downloader from any sites, including Twitter, Reddit, Instagram, BlueSky, TikTok, Threads, Facebook, OnlyFans, YouTube, Pinterest, PornHub, XHamster, XVIDEOS, ThisVid etc.
$ npx skills add AAndyProgram/SCrawler
Web crawling framework based on asyncio.
$ npx skills add elliotgao2/gain
Distributed web crawler admin platform for spiders management regardless of languages and frameworks.
$ npx skills add crawlab-team/crawlab
The PHP Agentic Framework to build production-ready AI driven applications. Connect components (LLMs, vector DBs, memory) to agents that can interact with your data.
$ npx skills add neuron-core/neuron-ai
A scalable web crawler framework for Java.
$ npx skills add code4craft/webmagic
Ui.Vision Open-Source RPA Software with Computer Vision, OCR, Anthropic Computer Use/LLM. Selenium IDE import/export.
$ npx skills add A9T9/RPA
To extract main article from given URL with Node.js
$ npx skills add extractus/article-extractor
NewPipe's core library for extracting data from streaming sites
$ npx skills add TeamNewPipe/NewPipeExtractor
Flexible Node.js AI-assisted crawler library
$ npx skills add coder-hxl/x-crawl
Transform Web Content into LLM-Ready Data
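$ npx skills add watercrawl/WaterCrawl
Opinionated RAG for integrating GenAI in your apps 🧠 Focus on your product rather than the RAG. Easy integration in existing products with customisation! Any LLM: GPT4, Groq, Llama. Any Vectorstore: PGVector, Faiss. Any Files. Anyway you want.
$ npx skills add QuivrHQ/quivr
A collection of legal cases in China involving web crawlers. This project compiles news, materials, and laws and regulations related to lawsuits and violations involving crawler developers in mainland China. It aims to help crawler practitioners working in mainland China understand the relevant laws and avoid crossing data-compliance red lines.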
$ npx skills add hiddendevj/Crawler_Illegal_Cases_In_China
Adala: Autonomous DAta (Labeling) Agent framework
$ npx skills add HumanSignal/Adala
[GenAI Application Development Framework] 🚀 Build GenAI application quick and easy 💬 Easy to interact with GenAI agent in code using structure data and chained-calls syntax 🧩 Use Event-Driven Flow *TriggerFlow* to manage complex GenAI working logic 🔀 Switch to any model without rewrite application code
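$ npx skills add AgentEra/Agently
Douyin crawler: collects public data such as account home pages, likes, favorites, original sounds, topics, search results, collections, posts, following, and followers.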
$ npx skills add erma0/douyin
DotnetSpider, a .NET standard web crawling library. It is lightweight, efficient and fast high-level web crawling & scraping framework
$ npx skills add dotnetcore/DotnetSpider
ScopeSentry: Cyberspace mapping, subdomain enumeration, port scanning, sensitive information discovery, vulnerability scanning, distributed nodes
$ npx skills add Autumn-27/ScopeSentry
Download comics and novels. A comic and novel downloader for: 腾讯漫画 大角虫漫画 有妖气 咪咕 SF漫画 哦漫画 看漫画 漫画柜 汗汗酷漫 動漫伊甸園 快看漫画 微博动漫 733动漫网 大古漫画网 漫画DB 無限動漫 動漫狂 卡推漫画 动漫之家 动漫屋 古风漫画网 36漫画网 亲亲漫画网 乙女漫画 webtoons 咚漫 ニコニコ静画 ComicWalker ヤングエースUP モアイ pixivコミック サイコミ;アルファポリス カクヨム ハーメルン 小説家になろう 起点中文网 八一中文网 顶点小说 落霞小说网 努努书坊 笔趣阁 → epub.
$ npx skills add kanasimi/work_crawler
Scrape tweets, profiles, followers and following from Twitter/X, no API key needed. Python library with smart multi-account pooling, proxy support and async.
$ npx skills add Altimis/Scweet
Elasticsearch File System Crawler (FS Crawler)
$ npx skills add dadoonet/fscrawler
Give your AI the power to browse, scrape, and extract structured data from complex websites, with faster execution, lower cost, and more reliable results.
$ npx skills add browser-act/skills
A web privacy measurement framework
$ npx skills add openwpm/OpenWPM
AgentQL is a suite of tools for connecting your AI to the web. Featuring a query language and Playwright integrations for interacting with elements and extracting data quickly, precisely, and at scale. Includes REST API, Python and JavaScript SDKs, browser debugger.
$ npx skills add tinyfish-io/agentql
A framework for comprehensive diagnosis and optimization of agents using simulated, realistic synthetic interactions
$ npx skills add plurai-ai/intellagent
HTTP(S)/SOCKS5 rotating residential proxies - code examples & general information.
$ npx skills add Decodo/Decodo
A JavaScript library for generating random user agents with data that's updated daily.
$ npx skills add intoli/user-agents
Faster requests on Python 3
$ npx skills add juancarlospaco/faster-than-requests
Node.js scraper to get data from Google Play
$ npx skills add facundoolano/google-play-scraper
Learn to build your Second Brain AI assistant with LLMs, agents, RAG, fine-tuning, LLMOps and AI systems techniques.
$ npx skills add decodingai-magazine/second-brain-ai-assistant-course
news-please - an integrated web crawler and information extractor for news that just works
$ npx skills add fhamborg/news-please
Website Cloner - Utilizes powerful Go routines to clone websites to your computer within seconds.
$ npx skills add goclone-dev/goclone
Diskover Community Edition - Open source file indexer, file search engine and data management and analytics powered by Elasticsearch
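$ npx skills add diskoverdata/diskover-community
Some interesting, beginner-friendly examples of Python crawlers, mainly targeting sites such as Taobao, Tmall, WeChat, WeChat Read, Douban, and QQ.
$ npx skills add shengqiangzhang/examples-of-web-crawlers
A continuously maintained website of company interview questions to help you land a satisfying offer! ⭐️ The latest 2026 interview questions on Java, front end, AI large models, AI agents, RAG, C++, Go, Python, testing, operations, back end, operating systems, computer networks, Redis, MySQL databases, algorithms, Spring, JVM, Java concurrency, Linux, LLMs, prompt engineering, system design, and more: over 10,000 frequently asked questions essential for programmers seeking jobs. For interview practice, choose Mianshiya (面试鸭) 💎 A full-stack project with a React front end, a Node back end, and cloud development, by 程序员鱼皮
$ npx skills add liyupi/mianshiya
A roundup of excellent reverse-engineering articles I have read, worth a look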
$ npx skills add darbra/sperm
In this tutorial, we showcase how to scrape public Google data with Python and Oxylabs API.
$ npx skills add oxylabs/scrape-google-python
A community-driven way to read and chat with AI bots - powered by chatGPT.
$ npx skills add myreader-io/myGPTReader
Dark Web OSINT Tool
$ npx skills add DedSecInside/TorBot
Structured data gathering from any website using AI-powered scraper, crawler, and browser automation. Scraping and crawling with natural language prompts. Equip your LLM agents with fresh data. AI Studio python SDK for intelligent web data gathering.
$ npx skills add oxylabs/oxylabs-ai-studio-py
Learn step-by-step how to scrape Google Trends data and make a result comparison using Python and Oxylabs SERP API. Extract keywords, their popularity, breakdown by region, related queries, and more.
$ npx skills add oxylabs/how-to-scrape-google-trends
Easy to use lightweight web crawler
$ npx skills add xtuhcy/gecco
Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.
$ npx skills add zilliztech/deep-searcher
Web Crawler/Spider for NodeJS + server-side jQuery ;-)
$ npx skills add bda-research/node-crawler
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
$ npx skills add adbar/trafilatura
Superduper: End-to-end framework for building custom AI applications and agents.
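$ npx skills add superduper-io/superduper
Xiaohongshu data collection and batch download tool for site images and video resources; a great-looking data collection tool (batch download, video extraction, images). Telegram: https://t.me/+ZtLSwuIKTo44MDY1
$ npx skills add xisuo67/XHS-Spider
A next-generation crawler platform: define crawl flows graphically and build crawlers without writing code.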
$ npx skills add ssssssss-team/spider-flow
A collection of agent skills for financial analysis and trading. Includes options payoff charts, stock correlation analysis, yfinance data fetching, Discord/Telegram/Twitter financial research, and generative UI for interactive visualizations.
$ npx skills add himself65/finance-skills
Write web scrapers in Ruby using a clean, AI-assisted DSL. Kimurai uses AI to figure out where the data lives, then caches the selectors and scrapes with pure Ruby. Get the intelligence of an LLM without the per-request latency or token costs.
$ npx skills add vifreefly/kimuraframework
Intelligent proxy pool for Humans™ to extract content from the internet and build your own Large Language Models in this new AI era
$ npx skills add MikeChongCan/scylla
Movie metadata scraper
$ npx skills add sqzw-x/mdcx
The process of extracting product data from Amazon using Python, including titles, ratings, prices, images, and descriptions.
$ npx skills add oxylabs/how-to-scrape-amazon-product-data
Adult video management system with avmoo, javbus, and javlibrary crawlers; an online Japanese adult video library and adult video magnet link database
$ npx skills add guyueyingmu/avbook
Scalable Python web scraping scripts for 40+ popular domains
$ npx skills add scrapfly/scrapfly-scrapers
AI Agent Development Platform - Supports multiple models (OpenAI/DeepSeek/Wenxin/Tongyi), knowledge base management, workflow automation, and enterprise-grade security. Built with Flask + Vue3 + LangChain, featuring one-click Docker deployment.
$ npx skills add Haohao-end/openagent
Free Proxy List ✅🚀 HTTP, HTTPS, SOCKS4 & SOCKS5 | Updated every 5 minutes | Strict SSL, zero MITM, multi-country
$ npx skills add databay-labs/free-proxy-list
A collection of awesome web crawlers and spiders in different languages
$ npx skills add BruceDone/awesome-crawler
Fast, streaming indexing, query, and agentic LLM applications in Rust
$ npx skills add bosun-ai/swiftide
PowerMem: Your AI-Powered Long-Term Memory. Accurate, Agile, Affordable. Also friendly support for the OpenClaw Memory Plugin.
$ npx skills add oceanbase/powermem
DaC is a dashboard-as-code tool. Build interactive dashboards using YAML and JSX. Built-in semantic layer. Get your agents to build standardized, reviewable dashboards.
$ npx skills add bruin-data/dac
Build ChatGPT over your data, all with natural language
$ npx skills add run-llama/rags
Self-host n8n on Google Cloud without the subscription fees or server headaches - because your automation workflows shouldn't cost more than your coffee budget
$ npx skills add datawranglerai/self-host-n8n-on-gcr
The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns
$ npx skills add ArchiveTeam/grab-site
Distributed crawler powered by Headless Chrome
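$ npx skills add yujiosaka/headless-chrome-crawler
💖 Highly available distributed IP proxy pool, powered by Scrapy and Redis
$ npx skills add SpiderClub/haipproxy
Hands-on 🐍 crawlers for a variety of websites and e-commerce data 🕷. Includes 🕸: Taobao products, WeChat official accounts, Dianping, Qichacha, job sites, Xianyu, Ali tasks (阿里任务), Cnblogs, Weibo, Baidu Tieba, Douban Movies, 包图网, 全景网, Douban Music, a provincial drug administration, Sohu News, machine learning text collection, fofa asset collection, Autohome, the National Bureau of Statistics, Baidu keyword index counts, spider pan-directories, Toutiao, Douban movie reviews, Ctrip, Xiaomi App Store, Anjuke, and Tujia homestays ❤️❤️❤️. WeChat crawler showcase project: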
$ npx skills add DropsDevopsOrg/ECommerceCrawlers
Jekyll-based static site for The Programming Historian
$ npx skills add programminghistorian/jekyll
Second Brain is an agentic framework that acts as an operating system, using local file intelligence, workflow automation, and LLMs to complete tasks and communicate over multiple modalities and messaging platforms.
$ npx skills add henrydaum/second-brain
Open source web infrastructure for AI. Scrape, crawl, and automate the web, clean markdown, browser sessions, ready for your agents.
$ npx skills add vakra-dev/reader
🌟 A curated collection of free, high quality AI tools 🤖, APIs 🔗, datasets 📊, and learning resources 📚 covering machine learning 🧠, deep learning 🧩, generative AI 🎨, NLP 💬, and data science 📈. Designed to help developers 👩‍💻, researchers 🔬, and creators ✨ explore and build with AI faster ⚡.
$ npx skills add CelaDaniel/free-ai-resources-x
Proxy [Finder | Checker | Server]. HTTP(S) & SOCKS 🎭
$ npx skills add constverum/ProxyBroker
Automatically crawls proxy nodes on the public internet, de-duplicates and tests for usability and then provides a list of nodes
$ npx skills add zu1k/proxypool
Use Web Scraper API to extract data from Google Finance, including stock titles, pricing, and price changes in percentages.
$ npx skills add oxylabs/how-to-scrape-google-finance
All in one tool for Information Gathering, Vulnerability Scanning and Crawling. A must have tool for all penetration testers
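$ npx skills add Tuhinshubhra/RED_HAWK
Python crawlers in practice: simulated login to major websites, including but not limited to slider CAPTCHA verification, Pinduoduo, Meituan, Baidu, bilibili, Dianping, and Taobao. If you like it, please star ❤️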
$ npx skills add wkunzhi/Python3-Spider
A powerful browser crawler for web vulnerability scanners
$ npx skills add Qianlitp/crawlergo
Gospider - Fast web spider written in Go
$ npx skills add jaeles-project/gospider
DecryptLogin: APIs for logging in to some websites using requests.
$ npx skills add CharlesPikachu/DecryptLogin
owllook: a novel search engine
$ npx skills add howie6879/owllook
A Python module to scrape several search engines (like Google, Yandex, Bing, Duckduckgo, ...). Including asynchronous networking support.
$ npx skills add NikolaiT/GoogleScraper
Geziyor, blazing fast web crawling & scraping framework for Go. Supports JS rendering.
$ npx skills add geziyor/geziyor
Leaked GPTs prompts. Bypass the 25-message limit or try out GPTs without a Plus subscription.
$ npx skills add friuns2/Leaked-GPTs
Cross Platform C# web crawler framework built for speed and flexibility. Please star this project! +1.
$ npx skills add sjdirect/abot
No-code multi-agent framework to build LLM Agents, workflows and applications with your data
$ npx skills add trypromptly/LLMStack
vulnx 🕷️ is an intelligent bot and shell that can perform automatic injection and help researchers detect security vulnerabilities in CMS systems. It can perform quick CMS security detection, information collection (including sub-domain name, IP address, country information, organizational information, time zone, etc.) and vulnerability scanning.
$ npx skills add anouarbensaad/vulnx
Fetch user's data across social media
$ npx skills add shaikhsajid1111/social-media-profile-scrapers
Polite, slim and concurrent web crawler.
$ npx skills add PuerkitoBio/gocrawl
Find web directories without bruteforce
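$ npx skills add Nekmo/dirhunt
A collection of crawler examples, including but not limited to: Taobao, JD.com, Tmall, Douban, Douyin, Kuaishou, Weibo, WeChat, Alibaba, Toutiao, pdd, Youku, iQIYI, Ctrip, 12306, 58, Sohu, various indexes, VIP and Wanfang, Zlibraty, Oalib, novels, tender sites, procurement sites, Xiaohongshu, Dianping, Twitter, Maimai, and Zhihu
$ npx skills add lixi5338619/lxSpider
Browser memory roaming solution (work in progress...)
$ npx skills add JSREI/ast-hook-for-js-RE
Introduction to the magnet link site U3C3 and domain updates
$ npx skills add u3c3/BT-btt
A simple, easy-to-use Python crawler framework. QQ group: 597510560
$ npx skills add xianhu/PSpider
[Crawler framework (golang)] An awesome Go concurrent Crawler(spider) framework. The crawler is flexible and modular. It can be expanded to an Individualized crawler easily or you can use the default crawl components only.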
$ npx skills add hu17889/go_spider
Async Python 3.6+ web scraping micro-framework based on asyncio
$ npx skills add howie6879/ruia
Google, Naver multiprocess image web crawler (Selenium)
$ npx skills add YoongiKim/AutoCrawler
General Assembly's 2015 Data Science course in Washington, DC
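$ npx skills add justmarkham/DAT8
Python crawlers. Current collection: NetEase Cloud Music song scraping, Bilibili video scraping, Zhihu Q&A scraping, wallpaper scraping, xvideos video scraping, audiobook scraping, Weibo crawler, Anjuke listing scraping with data visualization, Bilibili video cover extractor, IP proxy pool wrapper, Zhihu million-user crawler with data analysis, GitHub user crawler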
$ npx skills add srx-2000/spider_collection
🤖 Scrape data from HTML websites automatically by just providing examples
$ npx skills add lorey/mlscraper
Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on demand.
$ npx skills add rebrowser/rebrowser-patches
🤖 AI-powered web scraping editor with visual workflow builder. Build, test & deploy web scrapers using natural language. Powered by ScrapeGraphAI & LangGraph.
$ npx skills add ScrapeGraphAI/scrapecraft
Detailed web scraping tutorials for dummies with financial data crawlers on Reddit WallStreetBets, CME (both options and futures), US Treasury, CFTC, LME, MacroTrends, SHFE and alternative data crawlers on Tomtom, BBC, Wall Street Journal, Al Jazeera, Reuters, Financial Times, Bloomberg, CNN, Fortune, The Economist
$ npx skills add je-suis-tm/web-scraping
Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large scale.
$ npx skills add NeumTry/NeumAI
DataHen Till is a companion tool to your existing web scraper that instantly makes it scalable, maintainable, and more unblockable, with minimal code changes on your scraper. Integrates with any scraper in 5 minutes.
$ npx skills add DataHenHQ/till
Uscrapper Vanta: Dive deeper into the web with this powerful open-source tool. Extract valuable insights with ease and efficiency, from both surface and deep web sources. Empower your data mining and analysis with Vanta's advanced capabilities. Fast, reliable, and user-friendly, Uscrapper Vanta is the ultimate choice for researchers and analysts.
$ npx skills add z0m31en7/Uscrapper
Complete-Life-Cycle-of-a-Data-Science-Project
$ npx skills add achuthasubhash/Complete-Life-Cycle-of-a-Data-Science-Project
`scrape_linkedin` is a python package that allows you to scrape personal LinkedIn profiles & company pages - turning the data into structured json.
$ npx skills add austinoboyle/scrape-linkedin-selenium
Data-Driven Evaluation for LLM-Powered Applications
$ npx skills add relari-ai/continuous-eval
Query crypto newsflashes, articles, and on-chain market data via BlockBeats Pro API. Covers 1,500+ information sources including AI-driven insights, Hyperliquid on-chain data, Polymarket analytics. Features market overview, capital flow analysis, macro environment assessment, derivatives analysis, and keyword search.
$ npx skills add https://clawhub.ai/BlockBeatsOfficial/blockbeats-skill