🦀 ClawHub
Vision Helper — AI Image Analysis
by @ravenquasar
Analyze images using local or cloud vision models via Ollama to identify content, UI elements, screenshots, or extract text with OCR support.
💡 Examples
Basic
# Analyze an image (default: English description)
python3 /scripts/analyze_image.py
# With a custom prompt
python3 /scripts/analyze_image.py "Is this a chess game? Describe the board state"
# With a specific model
python3 /scripts/analyze_image.py "Describe content" kimi-k2.5:cloud
> resolves to your OpenClaw skill installation directory, typically ~/.openclaw/workspace/skills/vision-helper/.
In Conversation
When you need to analyze an image, use the exec tool:
exec: python3 /scripts/analyze_image.py /path/to/image.png "What do you see?"
Important: Set exec timeout to 120–180 seconds, as cloud vision models are slow.
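For callers that invoke the script from Python rather than through the exec tool, the timeout advice could be applied roughly as follows. This is a minimal sketch, not part of the skill: `SCRIPT`, `build_command`, and `analyze` are hypothetical names, the script path is copied as written in this document, the argument order (image, prompt, model) is inferred from the examples above, and the script is assumed to write its analysis to stdout.

```python
import subprocess

# Path as written in this document; point it at your skill installation directory.
SCRIPT = "/scripts/analyze_image.py"


def build_command(image_path, prompt=None, model=None):
    """Assemble the argv list: image, then optional prompt, then optional model."""
    if model is not None and prompt is None:
        # Arguments are positional, so a model cannot be given without a prompt.
        raise ValueError("a model can only follow a prompt")
    cmd = ["python3", SCRIPT, image_path]
    if prompt is not None:
        cmd.append(prompt)
    if model is not None:
        cmd.append(model)
    return cmd


def analyze(image_path, prompt=None, model=None, timeout=180):
    """Run the script and return its stdout (assumed to hold the analysis).

    Raises subprocess.TimeoutExpired if the call exceeds `timeout` seconds
    and subprocess.CalledProcessError on a non-zero exit status.
    """
    result = subprocess.run(
        build_command(image_path, prompt, model),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout
```

Passing the command as a list avoids shell quoting problems when the prompt contains quotes or non-ASCII text.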
Screenshot + Analysis Workflow
#### Option A: Browser screenshot → analyze
1. browser(action="screenshot") → get screenshot path (MEDIA: xxx)
2. exec("/scripts/analyze_image.py 'Describe this UI'")
3. Act on the analysis result
#### Option B: Desktop screenshot → analyze
macOS:
1. exec("screencapture -x /tmp/screen.png")
2. exec("/scripts/analyze_image.py /tmp/screen.png 'Describe the desktop'")
Linux:
1. exec("gnome-screenshot -f /tmp/screen.png")
   or exec("import -window root /tmp/screen.png") # ImageMagick; -window root captures the full screen without waiting for a click
   or exec("scrot /tmp/screen.png")
2. exec("/scripts/analyze_image.py /tmp/screen.png 'Describe the desktop'")
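The per-platform choice in Option B could be wrapped in a small helper. The sketch below is illustrative only: `screenshot_command` is a hypothetical name, and the Linux tools are tried in the order listed above, using whichever is found on PATH. The ImageMagick call uses `-window root` so that it captures the whole screen instead of waiting for a mouse selection.

```python
import shutil
import sys


def screenshot_command(out_path, platform=sys.platform, which=shutil.which):
    """Return an argv list for a full-screen capture, or None if no tool fits."""
    if platform == "darwin":
        # -x suppresses the capture sound on macOS.
        return ["screencapture", "-x", out_path]
    if platform.startswith("linux"):
        candidates = [
            ["gnome-screenshot", "-f", out_path],
            ["import", "-window", "root", out_path],  # ImageMagick
            ["scrot", out_path],
        ]
        for cmd in candidates:
            if which(cmd[0]):  # first tool that is actually installed
                return cmd
    return None
```

The returned list can be passed to `subprocess.run` before handing the saved file to the analysis script.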
#### Option C: Game/App UI → analyze → act
1. Screenshot the current screen
2. Use vision-helper to identify UI elements, buttons, text
3. Execute clicks/input based on the analysis
📋 Tips & Best Practices
Q: Can I use the built-in image tool instead?
A: It works for local models but will time out on cloud vision models. Always prefer this skill's script for reliable results.
Q: What image formats are supported?
A: PNG, JPG, JPEG, GIF, WebP, BMP, TIFF, SVG. Maximum file size: 20 MB.
Q: Where should I save screenshots?
A: Any readable directory works: /tmp/, your workspace, etc. This script has no path restrictions.
Q: How do I use a Chinese prompt?
A: Pass it as the second argument:
python3 /scripts/analyze_image.py /tmp/img.png "请描述这张图片的内容"
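The format and size limits listed above can be checked before spending a slow model call on a file that will be rejected. This is a hedged sketch rather than the script's own validation: `check_image` is a hypothetical helper, the extension list is taken from the answer above, and "20 MB" is assumed to mean 20 × 1024 × 1024 bytes.

```python
import os

# Extensions taken from the supported-formats answer above.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp", ".bmp", ".tiff", ".svg"}
MAX_BYTES = 20 * 1024 * 1024  # assumption: "20 MB" is read as 20 MiB


def check_image(path):
    """Return None if the file looks acceptable, otherwise a short reason string."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        return f"unsupported extension: {ext or '(none)'}"
    if not os.path.isfile(path):
        return "file not found"
    if os.path.getsize(path) > MAX_BYTES:
        return "file larger than 20 MB"
    return None
```

A caller would run `check_image(path)` first and only invoke the analysis script when it returns `None`.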
clawhub install vision-helper