Process rich media

Best multimodal media skills for AI agents

Browse skills for image, video, audio, transcription, metadata extraction, and multimodal content workflows.

30 ranked · 726K stars · 100 top score
#1

PaddleOCR

22 fit · Excellent · 100

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

Excellent quality, 78K stars, and a 22 use-case fit score.

78K stars · 10K forks · pushed May 19, 2026 · data
$ npx skills add PaddlePaddle/PaddleOCR
#2

Graphify

21 fit · Excellent · 100

AI coding assistant skill (Claude Code, Codex, OpenCode, Cursor, Gemini CLI, and more). Turn any folder of code, SQL schemas, R scripts, shell scripts, docs, papers, images, or videos into a queryable knowledge graph. App code + database schema + infrastructure in one graph.

Excellent quality, 52K stars, and a 21 use-case fit score.

52K stars · 5.6K forks · pushed May 22, 2026 · data
$ npx skills add safishamsi/graphify
#3

Open Design

21 fit · Excellent · 100

Open Design is a powerful, local-first design tool that integrates multiple coding-agent CLIs for generating various design outputs.

Excellent quality, 50K stars, and a 21 use-case fit score.

50K stars · 5.7K forks · pushed May 23, 2026 · productivity
$ npx skills add nexu-io/open-design
#4

SCrawler

21 fit · Excellent · 100

🏳️‍🌈 Media downloader for many sites, including Twitter, Reddit, Instagram, BlueSky, TikTok, Threads, Facebook, OnlyFans, YouTube, Pinterest, PornHub, XHamster, XVIDEOS, ThisVid, etc.

Excellent quality, 2.0K stars, and a 21 use-case fit score.

2.0K stars · 151 forks · pushed May 13, 2026 · web-automation
$ npx skills add AAndyProgram/SCrawler
#5

Deeplake

19 fit · Excellent · 100

Deeplake is an AI data runtime for agents. It provides serverless Postgres with a multimodal data lake, enabling scalable retrieval and training.

Excellent quality, 9.1K stars, and a 19 use-case fit score.

9.1K stars · 711 forks · pushed May 21, 2026 · data
$ npx skills add activeloopai/deeplake
#6

Vision Agents

19 fit · Excellent · 100

Open Vision Agents by Stream. Build voice and vision agents quickly with any model or video provider. Uses Stream's edge network for ultra-low latency.

Excellent quality, 7.8K stars, and a 19 use-case fit score.

7.8K stars · 655 forks · pushed May 22, 2026 · agent-frameworks
$ npx skills add GetStream/Vision-Agents
#7

Awesome AITools

19 fit · Excellent · 100

A comprehensive collection of AI-related utilities with community contributions.

Excellent quality, 6.0K stars, and a 19 use-case fit score.

6.0K stars · 588 forks · pushed May 23, 2026 · utility
$ npx skills add ikaijua/Awesome-AITools
#8

ArcReel

18 fit · Excellent · 100

Open-source video generation workbench driven by AI agents: novel → character/scene/prop design → script → storyboard → video, with characters and scenes kept consistent across shots. Powered by Nano Banana 2 & Veo 3.1 / Grok / Seedance / OpenAI.

Excellent quality, 2.3K stars, and an 18 use-case fit score.

2.3K stars · 503 forks · pushed May 22, 2026 · agent-frameworks
$ npx skills add ArcReel/ArcReel
#9

UI-TARS Desktop

18 fit · Excellent · 100

Run multimodal agents that operate desktop interfaces

Excellent quality, 35K stars, and an 18 use-case fit score.

35K stars · 3.5K forks · pushed May 18, 2026 · automation
$ npx skills add bytedance/UI-TARS-desktop
#10

RPA

18 fit · Excellent · 99

Ui.Vision Open-Source RPA Software with Computer Vision, OCR, Anthropic Computer Use/LLM. Selenium IDE import/export.

Excellent quality, 1.9K stars, and an 18 use-case fit score.

1.9K stars · 394 forks · pushed May 15, 2026 · web-automation
$ npx skills add A9T9/RPA
#11

Khoj

18 fit · Excellent · 100

Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.

Excellent quality, 35K stars, and an 18 use-case fit score.

35K stars · 2.2K forks · pushed Mar 26, 2026 · data
$ npx skills add khoj-ai/khoj
#12

Lux

18 fit · Excellent · 100

👾 Fast and simple video download library and CLI tool written in Go

Excellent quality, 31K stars, and an 18 use-case fit score.

31K stars · 3.3K forks · pushed Mar 29, 2026 · web-automation
$ npx skills add iawia002/lux
#13

Haystack

18 fit · Excellent · 100

Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and agent workflows with explicit control over retrieval, routing, memory, and generation. Built for scalable agents, RAG, multimodal applications, semantic search, and conversational systems.

Excellent quality, 25K stars, and an 18 use-case fit score.

25K stars · 2.8K forks · pushed May 22, 2026 · data
$ npx skills add deepset-ai/haystack
#14

CV

17 fit · Excellent · 100

✅ (Completed) Extremely comprehensive deep learning notes: [Tudui: PyTorch] [Mu Li: Dive into Deep Learning] [Andrew Ng: Deep Learning] [Dafei: LLM Agents]

Excellent quality, 21K stars, and a 17 use-case fit score.

21K stars · 2.4K forks · pushed Apr 27, 2026 · data
$ npx skills add AccumulateMore/CV
#15

Ppt Master

17 fit · Excellent · 100

AI generates natively editable PPTX from any document — real PowerPoint shapes with native animations, not images · by Hugo He

Excellent quality, 20K stars, and a 17 use-case fit score.

20K stars · 1.8K forks · pushed May 23, 2026 · agent-frameworks
$ npx skills add hugohe3/ppt-master
#16

Awesome N8n Templates

17 fit · Excellent · 100

280+ free n8n automation templates — ready-to-use workflows for Gmail, Telegram, Slack, Discord, WhatsApp, Google Drive, Notion, OpenAI, and more. AI agents, RAG chatbots, email automation, social media, DevOps, and document processing. The largest open-source n8n template collection.

Excellent quality, 22K stars, and a 17 use-case fit score.

22K stars · 6.1K forks · pushed Apr 9, 2026 · agent-frameworks
$ npx skills add enescingoz/awesome-n8n-templates
#17

Baoyu Skills

17 fit · Excellent · 100

A repository of skills designed to enhance daily work efficiency with Claude Code.

Excellent quality, 19K stars, and a 17 use-case fit score.

19K stars · 2.2K forks · pushed May 21, 2026 · productivity
$ npx skills add JimLiu/baoyu-skills
#18

Kubesphere

17 fit · Excellent · 100

KubeSphere is a comprehensive container platform designed for managing Kubernetes across multi-cloud, datacenter, and edge environments.

Excellent quality, 17K stars, and a 17 use-case fit score.

17K stars · 2.7K forks · pushed May 6, 2026 · utility
$ npx skills add kubesphere/kubesphere
#19

Katana

17 fit · Excellent · 100

A next-generation crawling and spidering framework.

Excellent quality, 17K stars, and a 17 use-case fit score.

17K stars · 1.1K forks · pushed May 21, 2026 · web-automation
$ npx skills add projectdiscovery/katana
#20

Newspaper

17 fit · Excellent · 100

newspaper3k is a news, full-text, and article metadata extraction library for Python 3. Advanced docs:

Excellent quality, 15K stars, and a 17 use-case fit score.

15K stars · 2.1K forks · pushed May 13, 2026 · web-automation
$ npx skills add codelucas/newspaper
#21

AutoCrawler

17 fit · Strong · 73

Multiprocess image web crawler for Google and Naver (Selenium)

Strong quality, 1.7K stars, and a 17 use-case fit score.

1.7K stars · 424 forks · pushed Apr 15, 2024 · web-automation
$ npx skills add YoongiKim/AutoCrawler
#22

Waoowaoo

17 fit · Excellent · 100

The first industrial-grade, end-to-end AI film and video production platform. Industry-first professional AI Agent platform for controllable film & video production. From shorts to live-action with Hollywood-standard workflows.

Excellent quality, 12K stars, and a 17 use-case fit score.

12K stars · 2.8K forks · pushed May 21, 2026 · agent-frameworks
$ npx skills add waooAI/waoowaoo
#23

Guizang Ppt Skill

17 fit · Excellent · 100

AI-agent Skill for generating polished HTML slide decks: editorial magazine and Swiss layouts, image prompts, social covers, and a WebGL/low-power presentation runtime.

Excellent quality, 11K stars, and a 17 use-case fit score.

11K stars · 889 forks · pushed May 19, 2026 · agent-frameworks
$ npx skills add op7418/guizang-ppt-skill
#24

Pm Skills

17 fit · Excellent · 100

A comprehensive marketplace for AI-driven product management skills and workflows.

Excellent quality, 12K stars, and a 17 use-case fit score.

12K stars · 1.4K forks · pushed May 20, 2026 · productivity
$ npx skills add phuryn/pm-skills
#25

Zvec

17 fit · Excellent · 100

Zvec is a lightweight and fast in-process vector database suitable for various applications.

Excellent quality, 9.7K stars, and a 17 use-case fit score.

9.7K stars · 553 forks · pushed May 21, 2026 · utility
$ npx skills add alibaba/zvec
#26

Xget

16 fit · Excellent · 100

Xget is a high-performance acceleration engine designed for developer resources, enhancing efficiency and security.

Excellent quality, 8.1K stars, and a 16 use-case fit score.

8.1K stars · 1.3K forks · pushed May 18, 2026 · utility
$ npx skills add xixu-me/xget
#27

Big AGI

16 fit · Excellent · 100

AI suite powered by state-of-the-art models and providing advanced AI/AGI functions. Includes AI personas, AGI functions, world-class Beam multi-model chats, text-to-image, voice, response streaming, code highlighting and execution, PDF import, presets for developers, much more. Deploy on-prem or in the cloud.

Excellent quality, 7.0K stars, and a 16 use-case fit score.

7.0K stars · 1.6K forks · pushed May 22, 2026 · agent-frameworks
$ npx skills add enricoros/big-AGI
#28

Manifest

16 fit · Excellent · 100

Manifest provides smart routing for LLMs in OpenClaw, significantly reducing costs.

Excellent quality, 6.6K stars, and a 16 use-case fit score.

6.6K stars · 399 forks · pushed May 22, 2026 · utility
$ npx skills add mnfst/manifest
#29

Genkit

16 fit · Excellent · 100

Open-source framework for building AI-powered apps in JavaScript, Go, and Python, built and used in production by Google

Excellent quality, 6.0K stars, and a 16 use-case fit score.

6.0K stars · 745 forks · pushed May 23, 2026 · data
$ npx skills add genkit-ai/genkit
#30

AutoGPT

16 fit · Excellent · 100

Build and run autonomous AI agents for open-ended tasks

Excellent quality, 184K stars, and a 16 use-case fit score.

184K stars · 46K forks · pushed May 23, 2026 · agent-frameworks
$ npx skills add Significant-Gravitas/AutoGPT