Decision filters

Choose skills by scenario, quality, and trust signals.

12 skills matching "modal"

Best blend of quality, stars, freshness, and agent usage

1

UI-TARS Desktop

VERIFIED · EXCELLENT · 100

Run multimodal agents that operate desktop interfaces

$ npx skills add bytedance/UI-TARS-desktop
35.0K stars · 75 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
ui-tars · desktop · typescript
by bytedance
2

Scientific Agent Skills

VERIFIED · EXCELLENT · 100

A set of ready-to-use Agent Skills for research, science, engineering, analysis, finance, and writing.

$ npx skills add K-Dense-AI/scientific-agent-skills
25.2K stars · 74 quality · Claude Code + OpenAI Agents
High-confidence pick with strong adoption and healthy maintenance signals.
by K-Dense-AI
3

Haystack

VERIFIED · EXCELLENT · 100

Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and agent workflows with explicit control over retrieval, routing, memory, and generation. Built for scalable agents, RAG, multimodal applications, semantic search, and conversational systems.

$ npx skills add deepset-ai/haystack
25.3K stars · 74 quality · Claude Code + OpenAI Agents
High-confidence pick with strong adoption and healthy maintenance signals.
mdx · rag
by deepset-ai
4

Deeplake

VERIFIED · EXCELLENT · 100

Deeplake is an AI data runtime for agents. It provides serverless Postgres with a multimodal data lake, enabling scalable retrieval and training.

$ npx skills add activeloopai/deeplake
9.1K stars · 69 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
c++ · rag
by activeloopai
5

Genkit

VERIFIED · EXCELLENT · 100

Open-source framework for building AI-powered apps in JavaScript, Go, and Python, built and used in production by Google.

$ npx skills add genkit-ai/genkit
6.0K stars · 68 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
typescript · rag
by genkit-ai
6

Blades

EXCELLENT · 86

Blades is a Go-based multimodal AI Agent framework.

$ npx skills add go-kratos/blades
777 stars · 54 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
go · llm
by go-kratos
7

Open Webui Tools

EXCELLENT · 86

Open‑WebUI Tools is a modular toolkit designed to extend and enrich your Open WebUI instance, turning it into a powerful AI workstation. With a suite of over 15 specialized tools, function pipelines, and filters, this project supports academic research, agentic autonomy, multimodal creativity, workflows, and more.

$ npx skills add Haervwe/open-webui-tools
725 stars · 54 quality · Claude Code
High-confidence pick with strong adoption and healthy maintenance signals.
python · ai-agents
by Haervwe
8

AppAgent

VERIFIED · STRONG · 79

AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.

$ npx skills add TencentQQGYLab/AppAgent
6.8K stars · 54 quality · Claude Code + OpenAI Agents
Solid option that is likely worth shortlisting for production workflows. Check: Repository looks stale.
python · llm
by TencentQQGYLab
9

Second Brain

STRONG · 80

Second Brain is an agentic framework that acts as an operating system, using local file intelligence, workflow automation, and LLMs to complete tasks and communicate over multiple modalities and messaging platforms.

$ npx skills add henrydaum/second-brain
532 stars · 53 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows.
python · ai-agents
by henrydaum
10

Awesome GUI Agent

VERIFIED · STRONG · 79

💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.

$ npx skills add showlab/Awesome-GUI-Agent
1.2K stars · 52 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows.
llm
by showlab
11

Panes

STRONG · 80

🎉📱 Create dynamic modals, cards, and panes for your applications in a few steps. Works with any framework and is free.

$ npx skills add tech-systems/panes
729 stars · 51 quality · Claude Code
Solid option that is likely worth shortlisting for production workflows.
typescript · ai-agents
by tech-systems
12

GroundingLMM

PROMISING · 69

[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

$ npx skills add mbzuai-oryx/groundingLMM
958 stars · 44 quality · Claude Code
Useful candidate, but compare it with alternatives before adopting.
python · llm
by mbzuai-oryx