ClawHub
Pdf Ocr Tool
by @tsukisama9292
Intelligent PDF and image to Markdown converter using Ollama GLM-OCR with smart content detection (text/table/figure)
Examples
Basic Usage
# Auto-detect content type (recommended)
python ocr_tool.py --input document.pdf --output result.md

# Specify processing mode
python ocr_tool.py --input document.pdf --output result.md --mode text
python ocr_tool.py --input document.pdf --output result.md --mode table
python ocr_tool.py --input document.pdf --output result.md --mode figure

# Mixed mode: split page into regions
python ocr_tool.py --input document.pdf --output result.md --granularity region

# Process a single image
python ocr_tool.py --input image.png --output result.md --mode mixed
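The invocations above can also be driven from a script, for example to convert every PDF in a folder. The following is a minimal sketch and not part of the tool: `build_command` and `convert_all` are hypothetical helper names, `ocr_tool.py` is assumed to sit in the current directory, and only the `--input`, `--output`, and `--mode` flags shown above are used.

```python
import subprocess
import sys
from pathlib import Path


def build_command(input_path, output_path, mode=None, script="ocr_tool.py"):
    """Build the argument list for one ocr_tool.py run (hypothetical helper)."""
    cmd = [sys.executable, script, "--input", str(input_path), "--output", str(output_path)]
    if mode is not None:
        # Pass --mode only when requested; without it the tool auto-detects.
        cmd += ["--mode", mode]
    return cmd


def convert_all(folder, mode=None):
    """Run the tool once per PDF in `folder`, writing <name>.md next to each PDF."""
    for pdf in sorted(Path(folder).glob("*.pdf")):
        # check=True raises CalledProcessError if the tool exits non-zero.
        subprocess.run(build_command(pdf, pdf.with_suffix(".md"), mode), check=True)
```

Calling `convert_all("scans", mode="table")` would run the tool once per PDF in a `scans` directory; leaving `mode` unset keeps the recommended auto-detection.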
Advanced Configuration
# Specify Ollama host and port
python ocr_tool.py --input document.pdf --output result.md \
    --host localhost --port 11434

# Use a different model
python ocr_tool.py --input document.pdf --output result.md \
    --model glm-ocr:q8_0

# Custom prompt
python ocr_tool.py --input image.png --output result.md \
    --prompt "Convert this table to Markdown format, keeping rows and columns aligned"

# Save figure region images
python ocr_tool.py --input document.pdf --output result.md --save-images
Environment Configuration
# Set default configuration
export OLLAMA_HOST="localhost"
export OLLAMA_PORT="11434"
export OCR_MODEL="glm-ocr:q8_0"

# Run
python ocr_tool.py --input document.pdf --output result.md
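One common way to combine environment variables with command-line flags is to use the environment values as argparse defaults, so an explicit flag still wins. The sketch below illustrates that pattern only; it is not the tool's actual implementation. The precedence order and the fallback values (taken from the examples above) are assumptions, and `parse_args` is a hypothetical name.

```python
import argparse
import os


def parse_args(argv=None, env=None):
    """Parse connection settings; environment values act as defaults (assumed precedence)."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="Sketch of env-backed defaults")
    # Fallback values mirror the examples in this document and are assumptions.
    parser.add_argument("--host", default=env.get("OLLAMA_HOST", "localhost"))
    parser.add_argument("--port", type=int, default=int(env.get("OLLAMA_PORT", "11434")))
    parser.add_argument("--model", default=env.get("OCR_MODEL", "glm-ocr:q8_0"))
    return parser.parse_args(argv)
```

With this arrangement, `export OLLAMA_PORT=9999` changes the default, while passing `--port 8080` on the command line overrides it for a single run.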
Tips & Best Practices
Model Not Installed
ollama pull glm-ocr:q8_0
Service Not Running
ollama serve
Missing pdftoppm
sudo apt install poppler-utils # Debian/Ubuntu
brew install poppler # macOS
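The three checks above (model installed, service running, `pdftoppm` available) can be bundled into a small preflight script. This is a sketch under stated assumptions: `preflight` and `model_installed` are hypothetical names, the host, port, and model defaults are taken from the examples in this document, and the model list is read from Ollama's `/api/tags` endpoint. Whether `ocr_tool.py` performs any such check itself is not stated.

```python
import json
import shutil
import urllib.request


def model_installed(tags_payload, model):
    """Return True if `model` appears in an Ollama /api/tags response body."""
    names = [m.get("name", "") for m in tags_payload.get("models", [])]
    return model in names


def preflight(host="localhost", port=11434, model="glm-ocr:q8_0"):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    if shutil.which("pdftoppm") is None:
        problems.append("pdftoppm not found: install poppler-utils (or poppler on macOS)")
    try:
        with urllib.request.urlopen(f"http://{host}:{port}/api/tags", timeout=5) as resp:
            payload = json.load(resp)
    except OSError:
        # URLError subclasses OSError, so this covers refused connections and timeouts.
        problems.append("Ollama not reachable: start it with `ollama serve`")
    else:
        if not model_installed(payload, model):
            problems.append(f"model missing: run `ollama pull {model}`")
    return problems
```

Printing each entry of `preflight()` before a long conversion points to the matching fix from the sections above.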
Poor OCR Results
Try a different mode: --mode text or --mode mixed
Use a custom prompt: --prompt "your prompt here"
Split pages into regions: --granularity region
Dependency Issues
cd skills/pdf-ocr-tool
source .venv/bin/activate
uv sync # Reinstall all dependencies
TERMINAL
clawhub install pdf-ocr-tool