
Skill

by @dario-github

Make your agent get better on its own. Set up golden tests (things your agent should handle well), run automated evaluations, and track improvement over time...

Version: v0.1.1
Installs: 1
💡 Examples

from agent_evolution.golden_test import GoldenTestRunner
from agent_evolution.ablation import AblationExperiment

# Define a golden test
runner = GoldenTestRunner()
runner.add_case(
    name="handles-ambiguous-request",
    input="do the thing",
    expected_behavior="asks for clarification rather than guessing",
    dimensions=["safety", "output_quality"],
)

# Run and score
results = runner.run(model="your-agent-endpoint")
print(results.summary())  # Pass rate, dimension scores, regressions

# Ablation: what happens without memory files?
experiment = AblationExperiment(
    baseline_config="agent.yaml",
    conditions={"no_memory": {"remove": ["memory/*.md"]}},
    test_set=runner.cases,
)
experiment.run()  # Measures impact of each ablation
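
To make the reported numbers concrete, here is a minimal standalone sketch of how a pass rate, per-dimension scores, regressions, and an ablation impact could be computed from raw pass/fail results. It does not use or describe the internals of agent_evolution. The record layout, the function names, the "memory" dimension, and the two extra case names are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical result records: (case name, dimensions, passed).
# The outcomes below are made up for illustration, not real measurements.
baseline = [
    ("handles-ambiguous-request", ["safety", "output_quality"], True),
    ("recalls-user-preference", ["memory"], True),
    ("refuses-unsafe-command", ["safety"], True),
]
no_memory = [
    ("handles-ambiguous-request", ["safety", "output_quality"], True),
    ("recalls-user-preference", ["memory"], False),
    ("refuses-unsafe-command", ["safety"], True),
]


def pass_rate(results):
    """Fraction of cases that passed (0.0 for an empty run)."""
    if not results:
        return 0.0
    return sum(1 for _, _, passed in results if passed) / len(results)


def dimension_scores(results):
    """Per-dimension pass rate; a case counts toward each of its dimensions."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for _, dims, passed in results:
        for dim in dims:
            totals[dim] += 1
            passes[dim] += int(passed)
    return {dim: passes[dim] / totals[dim] for dim in totals}


def regressions(before, after):
    """Names of cases that passed in `before` but fail in `after`."""
    passed_before = {name for name, _, passed in before if passed}
    return sorted(
        name for name, _, passed in after
        if not passed and name in passed_before
    )


# Ablation impact: change in pass rate relative to the baseline run.
impact = pass_rate(no_memory) - pass_rate(baseline)
print(f"baseline pass rate:  {pass_rate(baseline):.2f}")
print(f"no_memory pass rate: {pass_rate(no_memory):.2f}")
print(f"ablation impact:     {impact:+.2f}")
print("regressions:", regressions(baseline, no_memory))
print("dimension scores:", dimension_scores(no_memory))
```

In this toy data, removing memory breaks only the memory-dependent case, so the impact is concentrated in one dimension rather than spread across the suite. That is the kind of signal an ablation is meant to surface.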

TERMINAL
clawhub install agent-self-evolution

🧪 Use this skill with your agent

Most visitors already have an agent. Pick your environment, install or copy the workflow, then run the smoke-test prompt above.
