Agent outcome loop

Let rankings learn from real agent runs.

After every resolved skill, the agent should report what happened on one narrow task: success, failure, setup friction, or a risk block. The aggregated signals feed the Trust Score, rankings, skill pages, and install recommendations.


Outcomes, success, installs, and risk blocks: no data yet.

Resolve

The agent gets one recommended skill.

Resolve returns the selected skill, alternatives, install plan, Trust Score, safety policy, and a unique feedback event id.
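The shape of a Resolve response can be sketched as a small typed record. This is a minimal sketch under stated assumptions: only feedback.event_id is named on this page, so every other key (skill_slug, alternatives, install_plan, trust_score, safety_policy) and the ResolveResult and parse_resolve names are hypothetical stand-ins for the items listed above.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ResolveResult:
    """Hypothetical view of a Resolve response; only feedback.event_id is named in the source."""
    skill_slug: str
    alternatives: list[str]
    install_plan: list[str]
    trust_score: float
    safety_policy: dict[str, Any]
    event_id: str


def parse_resolve(payload: dict[str, Any]) -> ResolveResult:
    """Extract the fields an agent needs, failing early if the feedback id is absent."""
    feedback = payload.get("feedback") or {}
    event_id = feedback.get("event_id")
    if not event_id:
        # Without the id the later outcome report could not be made idempotent.
        raise ValueError("resolve response has no feedback.event_id")
    return ResolveResult(
        skill_slug=payload["skill_slug"],          # assumed key name
        alternatives=list(payload.get("alternatives", [])),
        install_plan=list(payload.get("install_plan", [])),
        trust_score=float(payload.get("trust_score", 0.0)),
        safety_policy=dict(payload.get("safety_policy", {})),
        event_id=event_id,
    )
```

Keeping event_id on the parsed record means the agent can carry it through the run and hand it back unchanged when it reports.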

Run

The agent tries one narrow task.

Use a sandbox workflow first. Record whether install was used, whether setup was required, and whether risk blocked execution.
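The three facts recorded during a run map naturally onto the outcome labels used elsewhere on this page. The sketch below is one possible mapping, not documented registry behavior: the RunRecord fields skill_fit_task and task_completed, and the order in which the checks are applied, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    """Facts recorded during one sandboxed run (field names are illustrative)."""
    install_used: bool
    setup_required: bool
    risk_blocked: bool
    skill_fit_task: bool    # assumption: the agent judges relevance itself
    task_completed: bool    # assumption: the agent judges completion itself


def outcome_label(run: RunRecord) -> str:
    """Map a run record to a single outcome label.

    Assumed precedence: a risk block wins, then relevance, then missing
    setup, and only then the actual result of the attempt.
    """
    if run.risk_blocked:
        return "blocked_by_risk"
    if not run.skill_fit_task:
        return "not_relevant"
    if run.setup_required:
        return "setup_required"
    return "success" if run.task_completed else "failed"
```

Under this ordering a run that was stopped for risk is never reported as a plain failure, which keeps the failure count from absorbing policy blocks.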

Learn

The registry updates trust signals.

Aggregate outcomes improve rankings without exposing raw agent notes or per-user identifiers publicly.
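A rough sketch of that aggregation step follows. It assumes reports arrive as dictionaries with the event_id, skill_slug, and outcome keys from the outcome payload; the aggregate_outcomes name is hypothetical, and how the registry actually computes its signals is not described here.

```python
from collections import Counter, defaultdict
from typing import Any, Iterable


def aggregate_outcomes(reports: Iterable[dict[str, Any]]) -> dict[str, dict[str, int]]:
    """Count outcomes per skill, reading only event_id, skill_slug, and outcome.

    Reports are deduplicated by event_id so a retried report counts once.
    Other fields (for example task text or agent name) are never copied
    into the result, so the summary carries counts only.
    """
    seen: set[str] = set()
    counts: dict[str, Counter] = defaultdict(Counter)
    for report in reports:
        event_id = report["event_id"]
        if event_id in seen:
            continue  # duplicate delivery of the same run
        seen.add(event_id)
        counts[report["skill_slug"]][report["outcome"]] += 1
    return {slug: dict(per_skill) for slug, per_skill in counts.items()}
```

Because the output holds only per-skill counts, it can be published without any of the raw report content.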

Outcome leaderboard

Skills with the strongest adoption evidence

No public outcome reports yet; this page is ready for the first Resolve-powered runs.

System Prompts And Models Of AI Tools

Ready for first outcome

FULL Augment Code, Claude Code, Cluely, CodeBuddy, Comet, Cursor, Devin AI, Junie, Kiro, Leap.new, Lovable, Manus, NotionAI, Orchids.app, Perplexity, Poke, Qoder, Replit, Same.dev, Trae, Traycer AI, VSCode Agent, Warp.dev, Windsurf, Xcode, Z.ai Code, Dia & v0. (And other Open Sourced) System Prompts, Internal Tools & AI Models

No agent outcome reports yet. The first resolved run should report success, failed, not_relevant, blocked_by_risk, or setup_required.

Claude Mem

Ready for first outcome

Persistent Context Across Sessions for Every Agent – Captures everything your agent does during sessions, compresses it with AI, and injects relevant context back into future sessions. Works with Claude Code, OpenClaw, Codex, Gemini, Hermes, Copilot, OpenCode + More

No agent outcome reports yet. The first resolved run should report success, failed, not_relevant, blocked_by_risk, or setup_required.

CrewAI

Ready for first outcome

Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.

No agent outcome reports yet. The first resolved run should report success, failed, not_relevant, blocked_by_risk, or setup_required.

freeCodeCamp

Ready for first outcome

freeCodeCamp.org's open-source codebase and curriculum. Learn math, programming, and computer science for free.

No agent outcome reports yet. The first resolved run should report success, failed, not_relevant, blocked_by_risk, or setup_required.

Transformers

Ready for first outcome

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

No agent outcome reports yet. The first resolved run should report success, failed, not_relevant, blocked_by_risk, or setup_required.

Material UI

Ready for first outcome

Material UI: Comprehensive React component library that implements Google's Material Design. Free forever.

No agent outcome reports yet. The first resolved run should report success, failed, not_relevant, blocked_by_risk, or setup_required.

Playwright

Ready for first outcome

Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.

No agent outcome reports yet. The first resolved run should report success, failed, not_relevant, blocked_by_risk, or setup_required.

PaddleOCR

Ready for first outcome

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

No agent outcome reports yet. The first resolved run should report success, failed, not_relevant, blocked_by_risk, or setup_required.

RAGFlow

Ready for first outcome

RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs.

No agent outcome reports yet. The first resolved run should report success, failed, not_relevant, blocked_by_risk, or setup_required.

AppFlowy

Ready for first outcome

Bring projects, wikis, and teams together with AI. AppFlowy is the AI collaborative workspace where you achieve more without losing control of your data. The leading open source Notion alternative.

No agent outcome reports yet. The first resolved run should report success, failed, not_relevant, blocked_by_risk, or setup_required.

DeerFlow

Ready for first outcome

An open-source long-horizon SuperAgent harness that researches, codes, and creates. With the help of sandboxes, memories, tools, skills, subagents, and a message gateway, it handles different levels of tasks that could take minutes to hours.

No agent outcome reports yet. The first resolved run should report success, failed, not_relevant, blocked_by_risk, or setup_required.

Unsloth

Ready for first outcome

Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally.

No agent outcome reports yet. The first resolved run should report success, failed, not_relevant, blocked_by_risk, or setup_required.

POST /api/agent/outcome

Report after one narrow run.

Resolve responses include a unique feedback.event_id. Agents should reuse it when reporting the result so retries stay idempotent.

{
  "event_id": "resolve_...",
  "skill_slug": "crawl4ai",
  "task": "scrape pricing pages",
  "agent": "codex",
  "outcome": "success",
  "install_used": true,
  "time_to_useful_ms": 120000
}
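A report like the one above could be prepared with the Python standard library along these lines. This is a sketch under assumptions: the base URL is a placeholder, the REQUIRED_FIELDS list is a guess (the page does not say which fields are mandatory), and authentication is not covered because it is not described here.

```python
import json
import urllib.request
from typing import Any

# Assumption: the source does not state which fields are mandatory.
REQUIRED_FIELDS = ("event_id", "skill_slug", "outcome")


def build_outcome_request(base_url: str, report: dict[str, Any]) -> urllib.request.Request:
    """Build a POST request for /api/agent/outcome without sending it."""
    missing = [name for name in REQUIRED_FIELDS if not report.get(name)]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    body = json.dumps(report).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/agent/outcome",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The request can then be sent with urllib.request.urlopen(request, timeout=10). On a retry, rebuild the request from the same report dictionary so the event_id stays unchanged.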

Outcome meanings

Use the smallest honest label.

success: The skill helped complete the task.

failed: The skill was attempted but did not work.

not_relevant: The selected skill did not fit the task.

blocked_by_risk: Audit, license, token, shell, or network risk stopped execution.

setup_required: The skill looked relevant but needed missing keys, data, or configuration.
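On the agent side, the five labels can be pinned down as an enumeration so that a near miss such as "failure" is caught before a report is sent. The Outcome and parse_outcome names are hypothetical, and whether the API itself rejects unknown labels is not stated on this page.

```python
from enum import Enum


class Outcome(str, Enum):
    """The five outcome labels listed above."""
    SUCCESS = "success"
    FAILED = "failed"
    NOT_RELEVANT = "not_relevant"
    BLOCKED_BY_RISK = "blocked_by_risk"
    SETUP_REQUIRED = "setup_required"


def parse_outcome(label: str) -> Outcome:
    """Return the matching Outcome, or raise ValueError for any other string."""
    try:
        return Outcome(label)
    except ValueError:
        raise ValueError(f"unknown outcome label: {label!r}") from None
```

Because Outcome subclasses str, a member's value serializes directly into the JSON payload.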