Collect structured data

Crawl a documentation site

Find skills for crawling docs, converting HTML to markdown, preserving links, and preparing agent-ready source material.

Agent prompt

Find the best skill for crawling a documentation website and converting pages into clean markdown with useful metadata.

Matched skills

Top stars: 3.7K

Best first install

Html To Markdown

⚙️ Convert HTML to Markdown. Even works with entire websites and can be extended through rules.

3.7K stars · 67 quality · document-processing

Install with one command

$ npx skills add JohannesKaufmann/html-to-markdown

Install targets

Install this skill in your agent workflow

Copy the registry command or an agent-specific install prompt for Codex, Claude Code, and Cursor.

OpenAgentSkill CLI

Use the registry command when your workflow supports the OpenAgentSkill installer.

$ npx skills add JohannesKaufmann/html-to-markdown

Decision guide

Use and avoid conditions

Success criteria

Preserves source URLs
Produces clean markdown
Can limit crawl scope
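
The criteria above can be checked with a few lines of standard-library Python. This is a minimal sketch, not the API of any listed skill: the helper names (`in_scope`, `normalize`, `with_front_matter`), the front-matter fields, and the `example.com` URLs are all assumptions made for illustration.

```python
import json
from urllib.parse import urldefrag, urljoin, urlparse


def in_scope(url: str, root: str) -> bool:
    """Return True when url is http(s), on the same host as root, and under root's path."""
    u, r = urlparse(url), urlparse(root)
    prefix = r.path if r.path.endswith("/") else r.path + "/"
    return (
        u.scheme in ("http", "https")
        and u.netloc == r.netloc
        and (u.path + "/").startswith(prefix)
    )


def normalize(href: str, page_url: str) -> str:
    """Resolve a link against the page it appears on and drop the #fragment."""
    absolute, _fragment = urldefrag(urljoin(page_url, href))
    return absolute


def with_front_matter(markdown: str, source_url: str, title: str) -> str:
    """Prefix converted markdown with a header that records where it came from.

    json.dumps is used for quoting so that titles containing quotes stay valid.
    """
    header = (
        "---\n"
        f"title: {json.dumps(title)}\n"
        f"source_url: {json.dumps(source_url)}\n"
        "---\n\n"
    )
    return header + markdown


if __name__ == "__main__":
    root = "https://example.com/docs/"  # hypothetical crawl root
    link = normalize("../intro.html#setup", "https://example.com/docs/guide/")
    if in_scope(link, root):
        print(with_front_matter("# Intro\n", link, "Intro"))
```

Keeping the scope check separate from the converter means the same filter can sit in front of whichever skill is installed, and the `source_url` field gives an agent a way to cite the original page.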

Do not use when

Docs block crawling
The content is private without authorization
You need pixel-perfect browser state
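
One common signal that a docs site blocks crawling is its `robots.txt`. The sketch below uses Python's standard `urllib.robotparser` to test a URL before fetching it. It assumes the robots file has already been downloaded as text; the user-agent string `docs-crawler` and the sample rules are hypothetical. It does not cover other signals such as terms of service or authentication walls.

```python
from urllib.robotparser import RobotFileParser


def build_robots(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that was fetched separately."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, url: str, user_agent: str = "docs-crawler") -> bool:
    """Return True when the parsed rules allow this user agent to fetch url."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical rules: everything is allowed except /private/.
    rules = build_robots("User-agent: *\nDisallow: /private/\n")
    for url in ("https://example.com/docs/", "https://example.com/private/notes"):
        print(url, "allowed" if may_fetch(rules, url) else "blocked")
```

If the crawl root itself is disallowed, that matches the first "do not use" condition and the crawl should not start.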

Alternatives

Compare before installing

Docsify

899

🃏 A magical documentation site generator.

31K stars · document-processing

AnyCrawl

893

AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts structured SERP results from Google/Bing/Baidu/etc. Native multi-threading for bulk processing.

3.2K stars · data

Mkdocs

871

Project documentation with Markdown.

22K stars · document-processing

Hugo Book

858

Hugo documentation theme as simple as plain book

4.1K stars · document-processing

Cloudflare Docs

836

Cloudflare’s documentation

4.9K stars · document-processing

Pandoc

836

Universal markup converter

45K stars · document-processing