OpenAgentSkill Registry Manifest
Skill: Evaluation Guidebook
Slug: huggingface-evaluation-guidebook
Category: ml-automation
Description: Practical insights and theoretical knowledge about LLM evaluation, gathered by the Hugging Face team while managing the Open LLM Leaderboard and designing lighteval.

Agent fit:
- Decision: 84/100 Production-ready
- Primary fit: RAG and knowledge
- Role: Primary pick

Supply profile:
- Track: Data, BI, and analytics
- Scenario: Data analysis
- Applicable agents: Claude Code, CLI, Codex, Cursor, Jupyter Notebook
- Maintenance: 7 months since last push
- Risk: Needs review

Trust:
- Trust score: 83/100 Strong shortlist
- Audit: 80/100 Needs review

Attribution:
- Status: Community indexed
- Source: GitHub star discovery
- Creator: huggingface
- Claim URL: https://www.openagentskill.com/skills/huggingface-evaluation-guidebook#claim-this-skill

Install:
npx skills add huggingface/evaluation-guidebook

URLs:
- Web: https://www.openagentskill.com/skills/huggingface-evaluation-guidebook
- API: https://www.openagentskill.com/api/agent/skills/huggingface-evaluation-guidebook
- Install API: https://www.openagentskill.com/api/skills/huggingface-evaluation-guidebook/install
- Repository: https://github.com/huggingface/evaluation-guidebook