ClawHub
PinchBench
by @olearycrew
Run PinchBench benchmarks to evaluate OpenClaw agent performance across real-world tasks. Use when testing model capabilities, comparing models, submitting b...
Examples
cd

Run benchmark with a specific model
uv run benchmark.py --model anthropic/claude-sonnet-4

Run only automated tasks (faster)
uv run benchmark.py --model anthropic/claude-sonnet-4 --suite automated-only

Run specific tasks
uv run benchmark.py --model anthropic/claude-sonnet-4 --suite task_01_calendar,task_02_stock

Skip uploading results
uv run benchmark.py --model anthropic/claude-sonnet-4 --no-upload
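Since the skill is meant for comparing models, the commands above can be generated in a loop. The following is a minimal sketch, not part of PinchBench itself: the helper names `pinchbench_cmd` and `compare_models` are hypothetical, and it assumes that `--suite` and `--no-upload` can be combined in one invocation, which the examples do not show. The helpers only print commands, so nothing runs until you choose to execute the output.

```shell
#!/usr/bin/env bash
# Sketch: print one PinchBench command per model for a side-by-side comparison.
# Uses only the flags shown in the examples: --model, --suite, --no-upload.
# Assumption: --suite and --no-upload may be passed together.

# Hypothetical helper: build the command line for one model and one suite.
pinchbench_cmd() {
  local model="$1"
  local suite="$2"
  printf 'uv run benchmark.py --model %s --suite %s --no-upload\n' "$model" "$suite"
}

# Hypothetical helper: first argument is the suite, the rest are model names.
compare_models() {
  local suite="$1"
  shift
  local model
  for model in "$@"; do
    pinchbench_cmd "$model" "$suite"
  done
}
```

For example, `compare_models automated-only anthropic/claude-sonnet-4` prints the automated-only command for that model. Add further model identifiers as extra arguments, then review the printed commands and run them from the benchmark directory.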
Configuration
clawhub install pinchbench