Agent resolve quality
Resolve evals for real agent tasks.
A regression dashboard for whether OpenAgentSkill recommends the right reusable skill for high-intent tasks across coding, research, finance, documents, automation, and sports analytics.
Needs tuning
Failed or weak matches
Use these rows to improve ranking terms and supply coverage.
Full suite
Standard resolve cases
Each case checks the resolved results against expected slugs, expected terms, and a minimum top score.
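A minimal sketch of how one such case might be evaluated, in Python. The dashboard does not show its data model, so every name here (`ResolveCase`, `Match`, `check_case`, the field names) is hypothetical. The pass rules are assumptions too: at least one expected slug must appear among the results, every expected term must appear in the result text, and the best score must reach the threshold.

```python
from dataclasses import dataclass, field


@dataclass
class ResolveCase:
    # Hypothetical shape: the source only says a case checks expected
    # slugs, terms, and a minimum top score.
    case_id: str
    task: str
    expected_slugs: set[str] = field(default_factory=set)
    expected_terms: set[str] = field(default_factory=set)
    min_top_score: float = 0.0


@dataclass
class Match:
    # Hypothetical result row returned by the resolver for a task.
    slug: str
    score: float
    text: str  # searchable text of the skill, e.g. name plus description


def check_case(case: ResolveCase, matches: list[Match]) -> dict[str, bool]:
    """Return one boolean per check; a case passes when all are True."""
    ranked = sorted(matches, key=lambda m: m.score, reverse=True)
    returned_slugs = {m.slug for m in ranked}
    combined_text = " ".join(m.text.lower() for m in ranked)
    top_score = ranked[0].score if ranked else 0.0
    return {
        # Assumption: any one expected slug among the results is enough.
        "slug": not case.expected_slugs
        or bool(case.expected_slugs & returned_slugs),
        # Assumption: every expected term must occur as a substring.
        "terms": all(t.lower() in combined_text for t in case.expected_terms),
        # Assumption: scores are comparable floats, higher is better.
        "score": top_score >= case.min_top_score,
    }


# Usage with made-up slugs, scores, and threshold.
case = ResolveCase(
    case_id="pdf-table-extraction",
    task="Extract tables from PDF reports and convert them to markdown",
    expected_slugs={"pdf-tables"},
    expected_terms={"pdf", "table"},
    min_top_score=0.6,
)
matches = [
    Match("pdf-tables", 0.82, "Extract tables from PDF files"),
    Match("office-to-md", 0.41, "Convert Office files to markdown"),
]
result = check_case(case, matches)
passed = all(result.values())
```

Returning one boolean per check, rather than a single pass/fail flag, makes it easier to tell whether a weak match comes from a missing slug, missing ranking terms, or a low score.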
generic-web-scraping
Scrape competitor pricing pages and extract structured data
browser-automation
Control a browser, fill forms, and verify a web app workflow
github-pr-review
Review pull requests, inspect repository changes, and summarize GitHub issues
rag-documents
Build a RAG workflow over PDFs and retrieve reliable context
data-analysis
Analyze CSV data, create charts, and explain trends
content-automation
Turn product updates into blog posts, newsletters, and social copy
security-scan
Scan a codebase for vulnerabilities, exposed secrets, and dependency risks
database-sql
Inspect a database schema, write SQL, and explain query results
stock-news-analysis
Analyze stock news from the last 30 days and summarize market risks
sec-filing-summary
Summarize SEC filings and prepare investor notes
quant-backtest
Backtest a trading strategy and explain drawdowns
world-cup-dashboard
Build a World Cup dashboard from football match data
football-xg-analysis
Compare football teams using expected goals and event data
pdf-table-extraction
Extract tables from PDF reports and convert them to markdown
office-to-markdown
Convert Word, PowerPoint, and spreadsheet files into clean markdown
youtube-research
Research recent YouTube videos and produce a grounded summary
reddit-market-scan
Scan Reddit discussions for product feedback and market signals
hacker-news-monitoring
Monitor Hacker News and summarize trending developer discussions
pull-request-tests
Inspect a pull request and generate focused regression tests
repo-architecture
Explain a repository architecture and identify risky modules
browser-qa-flow
Run a browser QA flow and capture evidence for broken UI states
form-automation
Fill forms in a browser and verify the submitted result
rag-citations
Build a RAG answer with citations from a document collection
vector-search
Index documents and retrieve context with vector search
seo-keyword-brief
Research SEO keywords and generate article briefs
social-launch-copy
Turn a product launch into social posts and newsletter copy
crm-cleanup
Clean CRM exports and prepare a growth report
spreadsheet-analysis
Analyze spreadsheet data and produce charts with explanation
database-migration-review
Review a database migration for schema and query risks
secret-scanning
Scan a repository for exposed API keys and secrets
dependency-vulnerability
Audit dependencies for vulnerabilities and summarize remediation
contract-review
Review a contract and summarize risky clauses
privacy-policy-review
Review a privacy policy for compliance obligations
education-tutor
Create an adaptive tutor that explains a topic step by step
video-generation-workflow
Create a video generation workflow for short creative clips
image-design-workflow
Generate image design prompts and refine visual assets
workflow-automation
Connect repeated operational tasks across APIs and tools
scheduled-agent-run
Run an agent task on a schedule and report results
customer-support-triage
Triage customer support messages and draft replies
api-docs-generation
Generate API documentation from source code and examples
data-visualization
Create data visualizations from analytics exports