Trafilatura

VERIFIED

Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML

Downloads 0
Stars 6.0K
Version 1.0.0
Quality 91/100 · Excellent

Install with one command

$ npx skills add adbar/trafilatura

Best for

Web scraping

Find skills for crawling websites, extracting structured data, monitoring pages, and turning messy web content into agent-ready inputs.

Choose it when

  • You want a GitHub-backed skill with 6.0K stars.
  • You need a reusable install command for agents.
  • You want to compare it with related marketplace skills.

Check before install

  • Pushed 8mo ago
  • License: Apache-2.0
  • Review the repository README and examples.

Quality profile

Excellent candidate for agent workflows

High-confidence pick with strong adoption and healthy maintenance signals.

Quality score: 91
GitHub stars: 6.0K
Freshness: 8mo ago
Install ready: Yes
License: Apache-2.0

Overview

This listing was imported by the skill-only GitHub discovery pipeline because it matches agent skill, automation, RAG, or developer-tool signals. Protocol-server projects are excluded from automated imports.
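
The description above names extraction of text and metadata with JSON among the output formats. As a rough illustration only, the sketch below shows how that might look from Python. It assumes the `trafilatura` package is installed from PyPI and that its `fetch_url` and `extract` functions accept the arguments shown; the helper names `extract_record` and `fetch_and_extract` are hypothetical and not part of this listing or the library.

```python
import json
from typing import Optional


def extract_record(html: str, url: Optional[str] = None) -> Optional[dict]:
    """Extract main text and metadata from an HTML string (hypothetical helper)."""
    # Third-party package (pip install trafilatura); imported inside the function
    # so this module can be loaded even where the package is not installed.
    import trafilatura

    # With output_format="json", extract() returns a JSON string,
    # or None when no usable content is found.
    raw = trafilatura.extract(html, url=url, output_format="json", with_metadata=True)
    return json.loads(raw) if raw else None


def fetch_and_extract(url: str) -> Optional[dict]:
    """Download a page and extract it (hypothetical helper)."""
    import trafilatura

    downloaded = trafilatura.fetch_url(url)  # None when the download fails
    if downloaded is None:
        return None
    return extract_record(downloaded, url=url)
```

Both helpers return `None` rather than raising when a page cannot be downloaded or yields no extractable content, so an agent workflow can skip such pages. Which keys appear in the resulting dict depends on the installed library version and on the page itself.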

Platform Compatibility

python: FULL
web-automation: FULL

Technical Details

Version: 1.0.0
License: Apache-2.0
Last Updated: 5/23/2026
Published: 5/23/2026

Frameworks & Tools

Python, Web Automation