Collect structured data

Best web scraping skills for AI agents

Find skills for crawling websites, extracting structured data, monitoring pages, and turning messy web content into agent-ready inputs.

30 ranked · 429K stars · 100 top score
#1

Crawlee Python

36 fit · Excellent · 100

Crawlee: a web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Parsel, BeautifulSoup, Playwright, and raw HTTP. Supports both headful and headless modes, with proxy rotation.

Excellent quality, 9.1K stars, and a 36 use-case fit score.

9.1K stars · 744 forks · May 22, 2026 push · web-automation
$ npx skills add apify/crawlee-python
#2

Firecrawl

34 fit · Excellent · 100

🔥 Search, scrape, and clean the web for AI agents.

Excellent quality, 123K stars, and a 34 use-case fit score.

123K stars · 7.5K forks · May 23, 2026 push · agent-frameworks
$ npx skills add firecrawl/firecrawl
#3

Crawl4AI

31 fit · Excellent · 100

Open-source LLM-friendly web crawler and scraper

Excellent quality, 66K stars, and a 31 use-case fit score.

66K stars · 6.8K forks · May 22, 2026 push · web-automation
$ npx skills add unclecode/crawl4ai
#4

WaterCrawl

30 fit · Excellent · 93

Transform Web Content into LLM-Ready Data

Excellent quality, 1.8K stars, and a 30 use-case fit score.

1.8K stars · 224 forks · May 20, 2026 push · web-automation
$ npx skills add watercrawl/WaterCrawl
#5

Lux

30 fit · Excellent · 100

👾 Fast and simple video download library and CLI tool written in Go

Excellent quality, 31K stars, and a 30 use-case fit score.

31K stars · 3.3K forks · Mar 29, 2026 push · web-automation
$ npx skills add iawia002/lux
#6

Colly

30 fit · Excellent · 100

Elegant Scraper and Crawler Framework for Golang

Excellent quality, 25K stars, and a 30 use-case fit score.

25K stars · 1.9K forks · May 17, 2026 push · web-automation
$ npx skills add gocolly/colly
#7

GoogleScraper

29 fit · Strong · 75

A Python module to scrape several search engines (such as Google, Yandex, Bing, and DuckDuckGo), with asynchronous networking support.

Strong quality, 2.8K stars, and a 29 use-case fit score.

2.8K stars · 751 forks · Jul 3, 2021 push · web-automation
$ npx skills add NikolaiT/GoogleScraper
#8

Scrapling

29 fit · Excellent · 100

Adaptive web scraping for agent data collection

Excellent quality, 53K stars, and a 29 use-case fit score.

53K stars · 5.1K forks · May 18, 2026 push · web-automation
$ npx skills add D4Vinci/Scrapling
#9

Newspaper

29 fit · Excellent · 100

newspaper3k is a library for news, full-text, and article metadata extraction in Python 3.

Excellent quality, 15K stars, and a 29 use-case fit score.

15K stars · 2.1K forks · May 13, 2026 push · web-automation
$ npx skills add codelucas/newspaper
#10

Mlscraper

29 fit · Promising · 67

🤖 Scrape data from HTML websites automatically by just providing examples

Promising quality, 1.4K stars, and a 29 use-case fit score.

1.4K stars · 93 forks · Mar 17, 2024 push · web-automation
$ npx skills add lorey/mlscraper
#11

Scrapecraft

28 fit · Strong · 75

🤖 AI-powered web scraping editor with visual workflow builder. Build, test & deploy web scrapers using natural language. Powered by ScrapeGraphAI & LangGraph.

Strong quality, 641 stars, and a 28 use-case fit score.

641 stars · 99 forks · Dec 26, 2025 push · web-automation
$ npx skills add ScrapeGraphAI/scrapecraft
#12

Google Maps Scraper

28 fit · Excellent · 100

Scrape data from Google Maps. Extracts the name, address, phone number, website URL, rating, review count, latitude and longitude, reviews, email, and more for each place.

Excellent quality, 4.1K stars, and a 28 use-case fit score.

4.1K stars · 618 forks · May 1, 2026 push · web-automation
$ npx skills add gosom/google-maps-scraper
#13

Awesome Crawler

27 fit · Strong · 79

A collection of awesome web crawlers and spiders in different languages

Strong quality, 7.2K stars, and a 27 use-case fit score.

7.2K stars · 753 forks · Jun 16, 2024 push · web-automation
$ npx skills add BruceDone/awesome-crawler
#14

Amazon Scraper

27 fit · Excellent · 100

Amazon Scraper API with a free trial, for extracting search, product, offer listing, review, question-and-answer, best seller, and seller data.

Excellent quality, 3.0K stars, and a 27 use-case fit score.

3.0K stars · 70 forks · Apr 26, 2026 push · web-automation
$ npx skills add oxylabs/amazon-scraper
#15

Headless Chrome Crawler

27 fit · Strong · 72

Distributed crawler powered by Headless Chrome

Strong quality, 5.6K stars, and a 27 use-case fit score.

5.6K stars · 404 forks · Apr 29, 2023 push · web-automation
$ npx skills add yujiosaka/headless-chrome-crawler
#16

QueryList

27 fit · Excellent · 100

🕷 The progressive PHP crawler framework! An elegant, progressive framework for PHP data collection.

Excellent quality, 2.7K stars, and a 27 use-case fit score.

2.7K stars · 428 forks · May 18, 2026 push · web-automation
$ npx skills add jae-jae/QueryList
#17

EasySpider

27 fit · Excellent · 100

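A visual no-code/code-free web crawler/spider. EasySpider is a visual tool for browser automation testing, data collection, and web crawling that lets you design and run crawler tasks graphically, without writing code. Also known as ServiceWrapper, an intelligent service-wrapping system for web applications.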

Excellent quality, 44K stars, and a 27 use-case fit score.

44K stars · 5.3K forks · May 22, 2026 push · web-automation
$ npx skills add NaiboWang/EasySpider
#18

Google Play Scraper

27 fit · Excellent · 94

Node.js scraper to get data from Google Play

Excellent quality, 2.9K stars, and a 27 use-case fit score.

2.9K stars · 713 forks · Mar 25, 2026 push · web-automation
$ npx skills add facundoolano/google-play-scraper
#19

Ferret

27 fit · Excellent · 100

Declarative web scraping

Excellent quality, 6.0K stars, and a 27 use-case fit score.

6.0K stars · 319 forks · May 23, 2026 push · web-automation
$ npx skills add MontFerret/ferret
#20

Mdcx

27 fit · Strong · 83

Movie metadata scraper

Strong quality, 3.6K stars, and a 27 use-case fit score.

3.6K stars · 477 forks · Nov 17, 2025 push · web-automation
$ npx skills add sqzw-x/mdcx
#21

Oxylabs AI Studio Py

27 fit · Excellent · 96

Structured data gathering from any website using AI-powered scraper, crawler, and browser automation. Scraping and crawling with natural language prompts. Equip your LLM agents with fresh data. AI Studio python SDK for intelligent web data gathering.

Excellent quality, 2.9K stars, and a 27 use-case fit score.

2.9K stars · 27 forks · Dec 4, 2025 push · web-automation
$ npx skills add oxylabs/oxylabs-ai-studio-py
#22

Article Extractor

27 fit · Excellent · 100

Extract the main article from a given URL with Node.js

Excellent quality, 1.9K stars, and a 27 use-case fit score.

1.9K stars · 160 forks · May 3, 2026 push · web-automation
$ npx skills add extractus/article-extractor
#23

NewPipeExtractor

27 fit · Excellent · 100

NewPipe's core library for extracting data from streaming sites

Excellent quality, 1.9K stars, and a 27 use-case fit score.

1.9K stars · 551 forks · May 20, 2026 push · web-automation
$ npx skills add TeamNewPipe/NewPipeExtractor
#24

Goclone

27 fit · Excellent · 99

Website cloner that uses goroutines to clone websites to your computer within seconds.

Excellent quality, 2.1K stars, and a 27 use-case fit score.

2.1K stars · 387 forks · Mar 23, 2026 push · web-automation
$ npx skills add goclone-dev/goclone
#25

How To Scrape Google Trends

27 fit · Excellent · 90

Learn step-by-step how to scrape Google Trends data and make a result comparison using Python and Oxylabs SERP API. Extract keywords, their popularity, breakdown by region, related queries, and more.

Excellent quality, 2.6K stars, and a 27 use-case fit score.

2.6K stars · 16 forks · Dec 31, 2025 push · web-automation
$ npx skills add oxylabs/how-to-scrape-google-trends
#26

Browserless

27 fit · Excellent · 100

The headless Chrome/Chromium driver on top of Puppeteer. Take screenshots, generate PDFs, extract text and HTML with a production-ready API.

Excellent quality, 1.8K stars, and a 27 use-case fit score.

1.8K stars · 90 forks · May 22, 2026 push · web-automation
$ npx skills add microlinkhq/browserless
#27

How To Scrape Amazon Product Data

A walkthrough of extracting product data from Amazon using Python, including titles, ratings, prices, images, and descriptions.

Strong quality, 2.9K stars, and a 27 use-case fit score.

2.9K stars · 11 forks · Sep 23, 2025 push · web-automation
$ npx skills add oxylabs/how-to-scrape-amazon-product-data
#28

Selectolax

27 fit · Excellent · 100

Python bindings to the Modest and Lexbor engines. Fast HTML5 parser with CSS selectors for Python.

Excellent quality, 1.6K stars, and a 27 use-case fit score.

1.6K stars · 92 forks · May 18, 2026 push · web-automation
$ npx skills add rushter/selectolax
#29

Skills

26 fit · Excellent · 100

Give your AI the power to browse, scrape, and extract structured data from complex websites — with faster execution, lower cost, and more reliable results.

Excellent quality, 1.4K stars, and a 26 use-case fit score.

1.4K stars · 36 forks · May 22, 2026 push · web-automation
$ npx skills add browser-act/skills
#30

Agentql

26 fit · Excellent · 100

AgentQL is a suite of tools for connecting your AI to the web. It features a query language and Playwright integrations for interacting with elements and extracting data quickly, precisely, and at scale. Includes a REST API, Python and JavaScript SDKs, and a browser debugger.

Excellent quality, 1.4K stars, and a 26 use-case fit score.

1.4K stars · 158 forks · May 19, 2026 push · web-automation
$ npx skills add tinyfish-io/agentql