
PDF OCR Tool

by @tsukisama9292

Intelligent PDF and image to Markdown converter using Ollama GLM-OCR with smart content detection (text/table/figure)

Version: v1.3.0
Installs: 4
💡 Examples

Basic Usage

# Auto-detect content type (recommended)
python ocr_tool.py --input document.pdf --output result.md

Specify processing mode

python ocr_tool.py --input document.pdf --output result.md --mode text
python ocr_tool.py --input document.pdf --output result.md --mode table
python ocr_tool.py --input document.pdf --output result.md --mode figure

Mixed mode: split page into regions

python ocr_tool.py --input document.pdf --output result.md --granularity region

Process a single image

python ocr_tool.py --input image.png --output result.md --mode mixed
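The single-file commands above can be wrapped for batch use. The following is a minimal sketch, assuming `ocr_tool.py` sits in the current directory and accepts only the `--input`, `--output`, and `--mode` flags shown in this document; the helper names `build_command` and `convert_folder` are hypothetical and not part of the tool.

```python
import subprocess
import sys
from pathlib import Path


def build_command(src, dst, mode=None, script="ocr_tool.py"):
    """Build the argument list for one conversion (hypothetical helper)."""
    cmd = [sys.executable, script, "--input", str(src), "--output", str(dst)]
    if mode is not None:
        # Only pass --mode when requested, so auto-detection stays the default.
        cmd += ["--mode", mode]
    return cmd


def convert_folder(folder, mode=None):
    """Convert every PDF in a folder, writing a .md file next to each one."""
    for pdf in sorted(Path(folder).glob("*.pdf")):
        # check=True stops the batch at the first failing conversion.
        subprocess.run(build_command(pdf, pdf.with_suffix(".md"), mode), check=True)
```

Calling `convert_folder("scans", mode="table")` would run one conversion per PDF in `scans/`. Passing arguments as a list avoids shell quoting problems with file names that contain spaces.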

Advanced Configuration

# Specify Ollama host and port
python ocr_tool.py --input document.pdf --output result.md \
  --host localhost --port 11434

Use different model

python ocr_tool.py --input document.pdf --output result.md \
  --model glm-ocr:q8_0

Custom prompt

python ocr_tool.py --input image.png --output result.md \
  --prompt "Convert this table to Markdown format, keeping rows and columns aligned"

Save figure region images

python ocr_tool.py --input document.pdf --output result.md --save-images

Environment Configuration

# Set default configuration
export OLLAMA_HOST="localhost"
export OLLAMA_PORT="11434"
export OCR_MODEL="glm-ocr:q8_0"

Run

python ocr_tool.py --input document.pdf --output result.md
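To illustrate how environment-based defaults like these are typically consumed, here is a minimal sketch. It assumes the three variable names shown above and falls back to the same example values; the tool's real built-in defaults and its precedence between flags and environment variables are not documented here, and `resolve_config` and `base_url` are hypothetical names.

```python
import os

# Fallback values mirror the examples in this document. Whether the tool
# itself uses these defaults is an assumption.
FALLBACKS = {
    "OLLAMA_HOST": "localhost",
    "OLLAMA_PORT": "11434",
    "OCR_MODEL": "glm-ocr:q8_0",
}


def resolve_config(env=None):
    """Return host, port and model, preferring values found in the environment."""
    env = os.environ if env is None else env
    # Treat unset and empty variables the same way: use the fallback.
    values = {key: env.get(key) or default for key, default in FALLBACKS.items()}
    return {
        "host": values["OLLAMA_HOST"],
        "port": int(values["OLLAMA_PORT"]),
        "model": values["OCR_MODEL"],
    }


def base_url(config):
    """Build the HTTP base URL for the configured Ollama server."""
    return f"http://{config['host']}:{config['port']}"
```

Accepting an optional `env` mapping keeps the function easy to test without touching the real process environment.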

📋 Tips & Best Practices

Model Not Installed

ollama pull glm-ocr:q8_0

Service Not Running

ollama serve
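Both of the problems above can be detected before starting a long conversion. The sketch below queries Ollama's `GET /api/tags` endpoint, which lists locally installed models. The function names are hypothetical, the default host, port, and model are taken from the examples in this document, and the exact-name comparison is a simplifying assumption.

```python
import json
import urllib.request


def installed_models(tags_payload):
    """Extract model names from the JSON body returned by GET /api/tags."""
    return [m.get("name", "") for m in tags_payload.get("models", [])]


def check_ollama(host="localhost", port=11434, model="glm-ocr:q8_0", timeout=5):
    """Return "ok" or a short hint naming the fix suggested above."""
    url = f"http://{host}:{port}/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            payload = json.load(resp)
    except (OSError, ValueError) as exc:
        # OSError covers refused connections and timeouts (URLError is a
        # subclass); ValueError covers a response that is not valid JSON.
        return f"service not reachable ({exc}); try: ollama serve"
    # Exact match on the full name, including the tag after the colon.
    if model not in installed_models(payload):
        return f"model missing; try: ollama pull {model}"
    return "ok"
```

Running `print(check_ollama())` before a batch job gives an early, readable failure instead of an error partway through a document.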

Missing pdftoppm

sudo apt install poppler-utils  # Debian/Ubuntu
brew install poppler            # macOS
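Whether `pdftoppm` is available can also be checked from Python before processing a PDF. This is a minimal sketch using the standard library's `shutil.which`; the helper name `missing_tools` is hypothetical, and whether the tool performs a similar check internally is not stated in this document.

```python
import shutil


def missing_tools(tools=("pdftoppm",), which=shutil.which):
    """Return the names of required executables that are not on PATH."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing:", ", ".join(missing), "(install poppler as shown above)")
    else:
        print("All required tools found")
```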

Poor OCR Results

  • Try different modes: --mode text or --mode mixed
  • Use custom prompts: --prompt "your prompt here"
  • Check image quality (resolution, clarity)
  • Try mixed mode: --granularity region

Dependency Issues

cd skills/pdf-ocr-tool
source .venv/bin/activate
uv sync  # Reinstall all dependencies

Install via ClawHub

clawhub install pdf-ocr-tool
