Pre-install eval

Toydb eval report.

A machine-readable install decision for agents: task fit, Trust Score, Audit Score, install safety, permission surface, and a concrete validation plan before this skill touches a workspace.

Needs review | MEDIUM RISK | ALLOW POLICY

Eval: 91
Trust: 90
Audit: 94
Safety: 82

manual review

Allow agent install in a sandbox or low-risk workspace, then promote after one successful narrow task.

Required gates

Checks an agent must pass before install

Task fit

84

pass

Task wording matches this skill metadata.

  • Evaluate Toydb before installing it in an agent workflow
  • data-analysis
  • Coding agent workflows; Claude Code teams; teams that value GitHub adoption signals

Install path

92

pass

Install handoff is available.

  • npx skills add erikgrinaker/toydb

Install command safety

92

pass

standard package or runtime install path

  • npx skills add erikgrinaker/toydb

Trust score

90

pass

Strong OpenAgentSkill Trust Score across adoption, recent maintenance, license clarity, documentation, dependency/runtime risk, install safety, permission surface, and install availability.

  • Production candidate
  • 7.3K GitHub stars
  • Apache-2.0

Audit score

94

pass

Safe to try

  • No major audit warning from metadata.

Agent safety gate

82

pass

Strong metadata, audit, install, and review signals. Suitable for agent shortlists after normal workspace review.

  • Allow agent install in a sandbox or low-risk workspace, then promote after one successful narrow task.
  • Verified listing

License clarity

86

pass

Apache-2.0

  • Apache-2.0

Permission surface

74

warn

filesystem or document access, database access

  • Network access: medium
  • Filesystem access: medium
  • Database access: medium
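
The gates above can be folded into a single install decision. The following is a minimal sketch, not the report's actual JSON schema or scoring logic: the `Gate` type, the `install_decision` function, and the rule "any fail blocks, any warn restricts to a sandbox" are assumptions made for illustration. The scores and statuses are copied from the gates listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    """One gate from the report (hypothetical structure, not the real schema)."""
    name: str
    score: int
    status: str  # "pass", "warn", or "fail"


# Scores and statuses as listed in this report.
GATES = [
    Gate("Task fit", 84, "pass"),
    Gate("Install path", 92, "pass"),
    Gate("Install command safety", 92, "pass"),
    Gate("Trust score", 90, "pass"),
    Gate("Audit score", 94, "pass"),
    Gate("Agent safety gate", 82, "pass"),
    Gate("License clarity", 86, "pass"),
    Gate("Permission surface", 74, "warn"),
]


def install_decision(gates):
    """Assumed rule: any fail blocks, any warn limits the install to a sandbox."""
    statuses = {g.status for g in gates}
    if "fail" in statuses:
        return "block"
    if "warn" in statuses:
        return "sandbox_only"
    return "allow"


def warned_gates(gates):
    """Names of gates that need a human or policy decision before promotion."""
    return [g.name for g in gates if g.status == "warn"]


print(install_decision(GATES), warned_gates(GATES))
```

Under this assumed rule, the single `warn` on the permission surface is what keeps the result at a sandbox-only install, which is consistent with the report's "manual review" recommendation.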

Validation plan

What the agent should do next

  1. Inspect repository, README/SKILL.md, license, and recent commits before production use.
  2. Install in an isolated workspace or sandbox with no production secrets available.
  3. Run the smallest representative task and record files touched, commands run, network access, and outputs.
  4. Compare the selected skill against at least one alternative when the eval status is review or failed.
  5. Promote only after the agent reports a successful verification result and unresolved warnings are accepted.
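
Steps 3 and 5 of the plan above imply a small record of what the trial run did, plus a promotion check. The sketch below shows one way an agent might keep that record. The `ValidationRun` type, its field names, and `can_promote` are hypothetical and are not part of any published skill API. The warning names are the two `warn` items in this report.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationRun:
    """What the agent observed during the sandboxed trial (hypothetical record)."""
    task: str
    files_touched: list[str] = field(default_factory=list)
    commands_run: list[str] = field(default_factory=list)
    network_hosts: list[str] = field(default_factory=list)
    succeeded: bool = False
    open_warnings: set[str] = field(default_factory=set)
    accepted_warnings: set[str] = field(default_factory=set)


def unresolved_warnings(run):
    """Warnings that nobody has explicitly accepted yet."""
    return sorted(run.open_warnings - run.accepted_warnings)


def can_promote(run):
    """Promote only if the task succeeded and every warning was accepted."""
    return run.succeeded and not unresolved_warnings(run)


run = ValidationRun(
    task="smallest representative task",
    commands_run=["npx skills add erikgrinaker/toydb"],
    succeeded=True,
    open_warnings={"Permission surface", "README/SKILL.md completeness"},
)
print(can_promote(run))  # False: both warnings are still unaccepted

# A reviewer accepts the warnings after inspecting the recorded run.
run.accepted_warnings.update(run.open_warnings)
print(can_promote(run))  # True
```

Keeping acceptance as an explicit set, rather than a single flag, makes it visible which warnings a reviewer actually signed off on.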

Do not use when

Conditions that require another skill

  • teams that need a vendor-supported SLA
  • high-compliance environments without internal security review
  • No major risk signals from current metadata
  • No major trust warnings detected from available metadata
  • Production credentials, payments, or irreversible account changes without explicit human review
  • Sensitive private data before reviewing repository code, license, and permission surface

Supporting checks

Trust signals behind the decision

README/SKILL.md completeness

warn

74

Public metadata needs stronger README/SKILL.md context

Recent maintenance

pass

100

8d since push

Alternatives available

pass

82

Alternative skills are available for comparison.