🦀 ClawHub

Skill Eval

by @xiaoxing9

Skill evaluation framework. Use when testing trigger rate, quality comparison (with/without skill), or model comparison. Runs via sessions_spawn + sessions_history.

💡 Examples

Follow USAGE.md for all workflows.

Quick reference:

| Workflow | What It Tests | USAGE.md Section |
|----------|---------------|------------------|
| Trigger Rate | Does the description trigger SKILL.md reads at the right times? | Workflow 1 |
| Quality Compare | Does the skill improve output vs. a no-skill baseline? | Workflow 2 |
| Model Comparison | Quality + speed across haiku/sonnet/opus | Workflow 3 |
| Latency Profile | Response time p50/p90 | Workflow 4 |
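For the Latency Profile workflow, the p50/p90 numbers in the table can be computed from per-run timings. This is a hypothetical helper using the nearest-rank percentile definition; the skill's actual analysis script may compute percentiles differently.

```python
# Hypothetical latency-profile helper (Workflow 4): nearest-rank percentiles
# over a list of per-run response times. Sample data below is illustrative.

def percentile(samples, p):
    """Nearest-rank percentile for p in [0, 100]."""
    ranked = sorted(samples)
    # nearest-rank index: ceil(p/100 * n) - 1, clamped to a valid index
    k = max(0, min(len(ranked) - 1, -(-p * len(ranked) // 100) - 1))
    return ranked[k]

latencies_ms = [820, 640, 910, 700, 1180, 760, 690, 850, 730, 980]
print(f"p50={percentile(latencies_ms, 50)}ms p90={percentile(latencies_ms, 90)}ms")
# prints "p50=760ms p90=980ms"
```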

Each workflow follows the same pattern:

1. Agent spawns subagents using sessions_spawn
2. Agent collects histories using sessions_history
3. Agent writes raw data to workspace/{skill}/iter-{n}/raw/
4. Agent runs the analysis script via exec
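The four-step loop above can be sketched as follows. This is a minimal illustration, not the skill's implementation: `sessions_spawn` and `sessions_history` here are stand-in stubs for the agent's real tools, and `run_iteration` is a hypothetical name.

```python
# Sketch of the spawn -> collect -> persist loop, with stand-ins for the
# agent's sessions_spawn / sessions_history tools (step 4, the analysis
# script, would then run over the raw/ directory via exec).
import json
from pathlib import Path

def sessions_spawn(prompt):
    # Stand-in: spawn a subagent for the prompt, return a session id.
    return f"session-{abs(hash(prompt)) % 1000}"

def sessions_history(session_id):
    # Stand-in: fetch the subagent's transcript for that session.
    return {"session": session_id, "messages": []}

def run_iteration(skill, n, prompts, workspace="workspace"):
    raw_dir = Path(workspace) / skill / f"iter-{n}" / "raw"
    raw_dir.mkdir(parents=True, exist_ok=True)
    # Steps 1-2: spawn one subagent per prompt, then collect each history.
    ids = [sessions_spawn(p) for p in prompts]
    histories = [sessions_history(i) for i in ids]
    # Step 3: write raw data to workspace/{skill}/iter-{n}/raw/
    for i, h in enumerate(histories):
        (raw_dir / f"run-{i}.json").write_text(json.dumps(h))
    return raw_dir
```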


Install via the terminal:

```shell
clawhub install openclaw-skill-eval
```


πŸ” Can't find the right skill?

Search 60,000+ AI agent skills β€” free, no login needed.

Search Skills β†’