Skill audit report

Evaluation Guidebook audit report.

Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!

REVIEWED · REVIEW · Needs review · Generated Jul 3, 2026 · Heuristic metadata audit
Audit: 80
Trust: 83
Quality: 82
Security: 82
Maintain: 62
Install: 92

OpenAgentSkill Trust Score

83
Strong shortlist

Stars, maintenance, license, docs, install safety, permission surface, and installability.

The Trust Score is OpenAgentSkill's adoption layer. It is designed to help an agent decide whether a skill is safe enough to shortlist before installation.

GitHub adoption

PASS

86

2.1K GitHub stars

Stars/forks activity

INFO

77

2.1K stars, 123 forks; issue activity unavailable in current metadata

Recent maintenance

INFO

62

7mo since push

License clarity

WARN

42

Unknown

README/SKILL.md completeness

PASS

90

Metadata includes enough usage and workflow context

Dependency/runtime risk

PASS

90

no major dependency risk hints in public metadata

Install availability

PASS

92

npx skills add huggingface/evaluation-guidebook

Install command safety

PASS

92

standard package or runtime install path

Permission surface

PASS

86

filesystem or document access

Repository evidence

PASS

86

https://github.com/huggingface/evaluation-guidebook

Review status

PASS

88

AI review data available

Agent Proven outcomes

INFO

54

No agent outcome data yet

Checks

Install and adoption review

8 passed · 6 review

Install path

92

PASS

npx skills add huggingface/evaluation-guidebook

Repository

88

PASS

https://github.com/huggingface/evaluation-guidebook

License

45

CHECK

Unknown

Maintenance

62

CHECK

7mo since push

AI review

88

PASS

Approved with no listed issues

README/SKILL.md completeness

90

PASS

Usable description available

Dependency risk

90

PASS

no major dependency risk hints in public metadata

Install command safety

92

PASS

standard package or runtime install path

Permission surface

86

PASS

filesystem or document access

Stars/forks activity

77

CHECK

2.1K stars, 123 forks; issue activity unavailable in current metadata

Adoption

88

PASS

2.1K GitHub stars

Warnings

  • License is unclear
  • Quality score needs review
  • License clarity: Unknown

Method

This report combines public metadata, AI review output, repository freshness, install readiness, OpenAgentSkill events, quality scoring, trust checks, and the agent safety gate. It is not a full source-code security review.
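As a rough illustration of how an agent might consume the checks listed above before shortlisting, here is a minimal sketch. The scores and statuses are copied from the "Checks" section of this report; the names `CHECKS` and `needs_review` are hypothetical and not part of any OpenAgentSkill API, and the rule "anything not marked PASS needs a human look" is an assumption made for the example, not the site's documented method.

```python
# Scores and statuses copied from the "Checks" section of this report.
# CHECKS and needs_review are hypothetical names used only for illustration.
CHECKS = {
    "Install path": (92, "PASS"),
    "Repository": (88, "PASS"),
    "License": (45, "CHECK"),
    "Maintenance": (62, "CHECK"),
    "AI review": (88, "PASS"),
    "README/SKILL.md completeness": (90, "PASS"),
    "Dependency risk": (90, "PASS"),
    "Install command safety": (92, "PASS"),
    "Permission surface": (86, "PASS"),
    "Stars/forks activity": (77, "CHECK"),
    "Adoption": (88, "PASS"),
}


def needs_review(checks):
    """Return the names of checks whose status is not PASS, lowest score first."""
    flagged = [
        (score, name)
        for name, (score, status) in checks.items()
        if status != "PASS"  # assumption: every non-PASS status warrants review
    ]
    return [name for _score, name in sorted(flagged)]


if __name__ == "__main__":
    for name in needs_review(CHECKS):
        score, status = CHECKS[name]
        print(f"{name}: {score} ({status})")
```

Sorting by score puts the weakest signal first, so the unknown license surfaces ahead of the maintenance and activity checks.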
