# 📦 ClawHub: Vision Tool

by @huruilizhen

Image recognition using Ollama + qwen3.5:4b with `think=False` for reliable content extraction.
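Under the hood, this kind of extraction comes down to a single chat request to the Ollama API with thinking disabled and the image attached. A minimal sketch of the request payload (the field names follow Ollama's `/api/chat` REST API; the helper function is illustrative, not the skill's actual code):

```python
# Sketch of the JSON payload a vision request sends to Ollama's /api/chat
# endpoint. The image is attached as base64 in the message's "images" list,
# and "think": false skips the thinking phase for direct content extraction.
import base64
import json

def build_chat_payload(image_bytes: bytes, prompt: str = "Describe this image") -> str:
    payload = {
        "model": "qwen3.5:4b",   # model name from this README
        "think": False,          # disable the thinking phase
        "stream": False,
        "messages": [{
            "role": "user",
            "content": prompt,
            "images": [base64.b64encode(image_bytes).decode("ascii")],
        }],
    }
    return json.dumps(payload)

# POST this JSON to http://localhost:11434/api/chat (default Ollama port).
```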
## 💡 Examples

### Basic usage

```bash
# From any OpenClaw channel
exec: python3 /path/to/vision-tool/main.py /path/to/image.jpg
```

With a custom prompt:

```bash
exec: python3 /path/to/vision-tool/main.py /path/to/image.jpg --prompt "Describe this image"
```

Debug output:

```bash
exec: python3 /path/to/vision-tool/main.py /path/to/image.jpg --debug
```
### Channel-specific examples

**WeChat Channel:**

```bash
# When receiving an image
exec: python3 /path/to/vision-tool/main.py "$IMAGE_PATH"
```

**Telegram Channel:**

```bash
# Reply to photo messages
exec: python3 /path/to/vision-tool/main.py "/path/to/telegram_photo.jpg"
```

**Discord Channel:**

```bash
# Process attachments
exec: python3 /path/to/vision-tool/main.py "./discord_attachment.jpg"
```
## ⚙️ Configuration

1. **Ollama service:** `ollama serve` (must be running)
2. **qwen3.5:4b model:** `ollama pull qwen3.5:4b`
3. **Python 3.8+:** required to run the skill
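You can verify the first two prerequisites in one step by querying Ollama's `/api/tags` endpoint, which lists installed models. A sketch (the endpoint and response shape are the standard Ollama API; the helper name is ours):

```python
# Check an Ollama /api/tags response for a given model.
# The response body is JSON of the form {"models": [{"name": "..."}, ...]}.
import json

def model_available(tags_json: str, model: str = "qwen3.5:4b") -> bool:
    """Return True if Ollama's /api/tags response lists the model."""
    tags = json.loads(tags_json)
    return any(m.get("name", "").startswith(model) for m in tags.get("models", []))

# Live check (requires `ollama serve` to be running):
#   from urllib.request import urlopen
#   with urlopen("http://localhost:11434/api/tags") as r:
#       print(model_available(r.read().decode()))
```

If this returns `False`, run `ollama pull qwen3.5:4b` before using the skill.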
### Install the skill

```bash
clawhub install vision-tool
```
Development Setup (For Contributors)
If you want to contribute or modify the skill, see CONTRIBUTING.md for detailed development instructions.Basic setup:
# Clone the repository
git clone https://github.com/HuRuilizhen/vision-tool
cd vision-toolSet up development environment
python3 -m venv .venv
source .venv/bin/activate
pip install -e .Run tests
python3 -m pytest tests/
## 📝 Tips & Best Practices

### Common Issues

1. **Ollama not running:** run `ollama serve` first
2. **Model not installed:** run `ollama pull qwen3.5:4b`
3. **Incorrect image path:** use absolute paths, or relative paths resolved from the working directory
4. **Timeout:** the model may take 30+ seconds on complex images
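When calling the skill from another process, the timeout case above is worth handling explicitly. A sketch (the script path is the same placeholder used in the examples, and the wrapper function is ours, not part of the skill):

```python
# Hypothetical wrapper: run the vision tool with a timeout and return
# either its output or an "error: ..." string instead of raising.
import subprocess

def run_vision_tool(image_path: str, timeout_s: int = 60) -> str:
    """Run the skill and return its output, or an error string on failure."""
    cmd = ["python3", "/path/to/vision-tool/main.py", image_path]  # placeholder path
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s, check=True,
        )
        return result.stdout.strip()
    except subprocess.TimeoutExpired:
        return "error: analysis timed out (complex images can take 30+ seconds)"
    except (subprocess.CalledProcessError, OSError) as e:
        return f"error: {e}"
```

Returning a string on failure keeps channel handlers simple: they can forward whatever comes back instead of handling exceptions.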