Browse AI Agent Skills | BytesAgain

All Skills

105 skills total matching "image processing"

🦀 ClawHub · 42.3k dl

Markdown Converter

Convert documents and files to Markdown using markitdown. Use when converting PDF, Word (.docx), PowerPoint (.pptx), Excel (.xlsx, .xls), HTML, CSV, JSON, XML, images (with EXIF/OCR), audio (with transcription), ZIP archives, YouTube URLs, or EPubs to Markdown format for LLM processing or text analysis.

⭐ GitHub · 167.2k stars

nutrient-document-processing

The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.

🦀 ClawHub · 3.5k dl

Screenshot Capture

Process screenshots Enzo shares with comments. Save to reference library, extract content, categorize, set reminders, and log patterns. Use when Enzo sends an image with context like "save this", shares a screenshot of content (LinkedIn posts, tweets, articles), or sends ideas/frameworks to remember.

🦀 ClawHub · 2.0k dl

Fetch handwritten notes, sketches, and drawings from a reMarkable tablet via Cloud API (rmapi). Process content by refining artwork with AI image generation, extracting handwritten text to memory/journal, or using sketches as input for other workflows. Use when working with reMarkable tablet content, syncing handwritten notes, processing sketches, or integrating tablet drawings into projects.

🦀 ClawHub · 1.9k dl

BitSoul AI Face Beauty 人像AI美颜

Edit an image to beautify the faces or portraits in it. Use when (1) the user requests to process an image, or (2) the user asks to beautify a photo.

⭐ GitHub · 39.7k stars

AI-powered job search system built on Claude Code. 14 skill modes, Go dashboard, PDF generation, batch processing.

🦀 ClawHub · 1.7k dl

Grok Imagine Image Pro

Generates and edits high-quality PNG images via xAI Grok/Flux API using prompts, styles, aspect ratios, and batch processing with base64 output.

⭐ GitHub · 35.1k stars

threejs-postprocessing

Installable GitHub library of 1,400+ agentic skills for Claude Code, Cursor, Codex CLI, Gemini CLI, Antigravity, and more. Includes installer CLI, bundles, workflows, and official/community skill collections.

🦀 ClawHub · 1.5k dl

Read, analyze metadata, convert formats, resize, rotate, crop, compress, and batch process PNG, JPG, GIF, WebP, TIFF, BMP, HEIC, SVG, and ICO images.

🦀 ClawHub · 1.2k dl

Remove image backgrounds using the remove.bg API with API-key auth and transparent PNG output. Use when high-quality cutouts are needed and cloud processing...

🦀 ClawHub · 972 dl

multimodal-parser

Unified multi-modal content parser for images, PDF, DOCX, and audio, with automatic OCR/transcription. Outputs structured text for LLM processing.

⭐ GitHub · 34.9k stars

pubmed-database

Direct REST API access to PubMed. Advanced Boolean/MeSH queries, E-utilities API, batch processing, citation management. For Python workflows, prefer biopython (Bio.Entrez). Use this for direct HTTP/REST work or custom API implementations.

🦀 ClawHub · 804 dl

Remove Watermark

Remove light-colored text watermarks from white-background document images (exam papers, scanned documents). No API key needed - pure local image processing....

⭐ GitHub · 19.6k stars

ElevenLabs audio generation — text-to-speech, voice cloning, and sound effects. Use this skill any time the agent needs to: convert text to spoken audio, narrate documents or content, generate voiceovers, clone voices from audio samples, create sound effects, or produce any audio output from text. Supports multiple voices, languages, models, voice cloning, batch processing, and sound effect generation. Requires ELEVENLABS_API_KEY.

🦀 ClawHub · 698 dl

When dealing with text within an image, the system automatically recognizes it as an OCR (Optical Character Recognition) task and applies the corresponding capabilities.

OCR (Optical Character Recognition) tool using Tesseract for extracting text from images. Use when: (1) processing screenshots, charts, or documents in image...

⭐ GitHub · 5.7k stars

processing-stix-taxii-feeds

754 structured cybersecurity skills for AI agents · Mapped to 5 frameworks: MITRE ATT&CK, NIST CSF 2.0, MITRE ATLAS, D3FEND & NIST AI RMF · agentskills.io standard · Works with Claude Code, GitHub Copilot, Codex CLI, Cursor, Gemini CLI & 20+ platforms · 26 security domains · Apache 2.0

🦀 ClawHub · 674 dl

Image Processing Toolkit

Local image processing toolkit for format conversion, compression, resizing, batch jobs, and image-to-PDF. Use when users ask 压缩图片/改尺寸/批量处理/转PDF (compress images, resize, batch process, convert to PDF). Supports si...

🦀 ClawHub · 623 dl

Xiaohongshu Longpost Auto

When users have long-form content ready to publish on Xiaohongshu, automatically completes the entire process: login detection, long content segmentation optimization, AI-generated images, content filling, AI-generated tags, tag activation, original content declaration, and publishing.

🦀 ClawHub · 544 dl

Crop objects from images using bounding box annotations in COCO, YOLO, VOC, or LabelMe formats with optional padding and batch processing.

🦀 ClawHub · 533 dl

PDF OCR Using Gemini LLM

Extract text from PDFs using Google Gemini OCR. Use when extracting text from PDFs, performing OCR on scanned documents, or processing image-based PDFs.

🦀 ClawHub · 484 dl

keevx-image-to-video

Convert images to videos using Keevx API with support for multiple models, resolutions up to 4K, audio generation, and batch processing.

🦀 ClawHub · 456 dl

Computer vision and image processing using OpenCV WebAssembly. Uses opencv-component.wasm running in openclaw-wasm-sandbox plugin. Supports image processing,...

🦀 ClawHub · 429 dl

Image OCR Parse

Extract text from images via the PDFAPIHub cloud OCR API. Images are uploaded to pdfapihub.com for Tesseract OCR processing. Supports preprocessing (grayscal...

🦀 ClawHub · 404 dl

Pixel Art Processing

Pixel art sprite sheet processing tool: video frame extraction, GIF/frames conversion, sprite sheet compose/split, image matting, pixelation, resize, crop,...

🦀 ClawHub · 390 dl

Media.io Hailuo Video Generator

Generate high-quality HD AI videos from text or images using MiniMax Hailuo (Hailuo 2.3) via Media.io OpenAPI with fast processing.

🦀 ClawHub · 361 dl

Quick product image processing: add price sticker + watermark + logo. Use when user sends `$price:` with an image. Minimal context, runs fast.

🦀 ClawHub · 345 dl

Afm Image Analysis 1.0.0

Analyze AFM images to compute surface roughness, detect nanoparticles, extract line profiles, generate 3D renderings, and process batches with detailed reports.

🦀 ClawHub · 337 dl

Local video, audio and image processing expert for macOS, powered by VN Video Editor. Use this skill whenever the user wants to process video, audio or image...

🦀 ClawHub · 313 dl

Reads and analyzes images from messages across 10+ chat platforms using platform-specific APIs and unified image processing.

🦀 ClawHub · 286 dl

PDF Batch Processing Tool

Batch process PDF files: merge multiple PDFs, split a PDF into multiple files, rotate pages, extract text, extract images, compress PDFs. Use when you need to...

🦀 ClawHub · 253 dl

Byted Tos Image Process

Provides image processing capabilities for objects in Bytedance TOS using the official SDK. Supports getting image info, format conversion, resizing, and wat...

🦀 ClawHub · 212 dl

watermark-remover-skill

Use this skill when the user wants to remove watermarks from images, batch-process images for watermark removal, or asks about the "布衣去水印" / "图片去水印" ("image watermark removal") tool. Th...

🦀 ClawHub · 78 dl

AutoDimension Report Skill En

Process PDF, DOCX, and XLSX files from supply chain document packages: conversion, image extraction, OCR, dimension verification, and review report generation. Invoke...

Basic image processing functions and methods for converting to and from image formats.

A library of convenience functions that make basic image processing operations, such as translation, rotation, resizing, skeletonization, and displaying Matplotlib images, easier with OpenCV and Python.

A fast image processing library with low memory needs.

Vectorizer (Dify)

**Vectorizer.AI** is a powerful tool that converts PNG and JPG images into scalable SVG vector graphics quickly and easily. Powered by AI, the conversion process is fully automated, allowing you to transform raster images into high-quality vector formats with minimal effort. This tool is ideal for graphic designers, developers, and anyone who needs vectorized images for their projects. To start us...

Use when parsing PDFs, DOCX, PPTX, XLSX, or images locally. Supports text extraction, JSON output with bounding boxes, batch processing, and page screenshots...

🦀 ClawHub · 18.8k dl

Create, inspect, process, and optimize image files and visual assets with reliable format choice, resizing, compression, color-profile, metadata, and platfor...

🦀 ClawHub · 4.6k dl

Resize, crop, convert, and optimize images using ImageMagick. Use when processing photos, converting formats (PNG/WebP), compressing size, or adding watermarks.

🦀 ClawHub · 2.6k dl

Image processing tool for compression, background removal/replacement, and upscaling. Invoke when user wants to compress image, remove background, change bac...

🦀 ClawHub · 2.2k dl

Glasses to Social

Turn smart glasses photos into social media posts. Monitors a Google Drive folder for new images from Meta Ray-Ban glasses (or any smart glasses), analyzes them with vision AI, drafts tweets/posts in the user's voice, and publishes on approval. Use when setting up a glasses-to-social pipeline, processing smart glasses photos for social media, or creating hands-free content workflows.

🦀 ClawHub · 2.0k dl

Automatically detects and processes files including PDF, Excel, CSV, Word, images, and text for extraction, OCR, data analysis, and summarization.
