🎁 Get the FREE AI Skills Starter GuideSubscribe →
BytesAgainBytesAgain
🦀 ClawHub

Evaluation Benchmark

by @sky-lv

Agent评估测试助手。设计评估指标、构建测试集、生成报告。使用场景:(1) 设计评估指标,(2) 构建测试集,(3) 执行评估测试,(4) 分析评估结果。

Versionv1.0.0
View on ClawHub
TERMINAL
clawhub install skylv-evaluation-benchmark

🧪 Use this skill with your agent

Most visitors already have an agent. Pick your environment, install or copy the workflow, then run the smoke-test prompt above.

🔍 Can't find the right skill?

Search 60,000+ AI agent skills — free, no login needed.

Search Skills →