Find the Right AI Skill for Any Job
Browse 338+ curated AI agent skills. Search by use case, filter by category, get the right tool instantly.
All Skills
338 skills total matching "extraction"
Categories: All, coding, devops, api, database, security, data, research, writing, image-gen, video, audio, translation, seo, social-media, email-marketing, advertising, finance, crypto-defi, ecommerce, legal, hr, real-estate, health, education, cooking, travel, gaming, automation, communication, productivity, clawhub, lobehub, dify, mcp
ClawHub
LiteParse Document Parser
Use when parsing PDFs, DOCX, PPTX, XLSX, or images locally. Supports text extraction, JSON output with bounding boxes, batch processing, and page screenshots...
ClawHub
Scan To Markdown
OCR document extraction - extract text from scanned documents, photos, and images using OCR. Use when reading scanned PDFs, photographed pages, handwritten n...
ClawHub
InfoQuest Web Search
AI-optimized web search, image search and content extraction via BytePlus InfoQuest API. Use this skill when you need to gather concise and up-to-date inform...
ClawHub
PulpMiner Web Scraper - Convert Any Webpage to Realtime JSON API
Convert any webpage into structured JSON data using AI. Scrape websites, extract data into custom JSON schemas, and call saved APIs programmatically. Useful for web scraping, data extraction, content monitoring, lead generation, price tracking, and building data pipelines.
ClawHub
Akashic Doc Analyzer
Parse, analyze, and extract content from documents (PDF, DOCX, PPTX, audio). Supports OCR, table extraction, and semantic chunking.
ClawHub
Invoice Scan
AI-powered invoice OCR, scanning, and data extraction. Use when: (1) user needs OCR or text extraction from invoice images, scanned documents, or PDFs, (2) s...
ClawHub
MinerU zero-setup document extraction: convert PDFs, images, Word, and PowerPoint to Markdown instantly. No login, no token, no configuration. Just run and get results
Zero-setup document extraction: convert PDFs, images, Word, and PowerPoint to Markdown. No login, no token, no configuration. Just run and get results.
ClawHub
Power Search
Self-hosted research tool combining Brave Search API + Browserless content fetching. Search the web with optional full-page content extraction and HTML parsing.
ClawHub
file-processor
Automatically detects and processes files including PDF, Excel, CSV, Word, images, and text for extraction, OCR, data analysis, and summarization.
ClawHub
Deep Research Pro v5.0.1
Performs deep research using a three-stage process: data extraction, thematic insight briefs with contradiction analysis, and narrative-driven strategic repo...
ClawHub
Browser Web Extract
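Extract the text and image content of any web page from its URL. Use when the user provides a URL, web page link, or web address and wants to read, extract, scrape, summarize, or analyze the page content. web page extraction, URL scraping, web content reading, website data collection, link parsing, web...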
ClawHub
Web Researcher Mini
Firecrawl CLI for web scraping, crawling, and search. Scrape single pages or entire websites, map site URLs, and search the web with full content extraction....
ClawHub
maasv Memory
Provides structured long-term memory with semantic, keyword, and knowledge graph retrieval, entity extraction, temporal versioning, and experiential learning.
ClawHub
Lightpanda Scraper
Fast headless browser web scraping using Lightpanda (0.5s page loads, 90x faster than Chromium). Perfect for OSINT recon, link extraction, and content scrapi...
ClawHub
Fast Browser Use Local
Rust-based browser automation using local Chrome for ultra-fast DOM extraction, session management, screenshots, scraping, and site structure analysis.
ClawHub
Ride Receipts
Build a local SQLite ride-history database from Gmail ride receipt emails using gog for fetch and OpenClaw Gateway /v1/responses for extraction. Use when you...
ClawHub
Video Transcript
Extract full transcripts from video content for analysis, summarization, note-taking, or research. Use when the user wants a written version of video content, asks to "transcribe this", "get the text from this video", "convert video to text", or shares a video URL for content extraction.
ClawHub
StartClaw-Optimizer
Master optimization system - APPLIES TO EVERY RESPONSE. Before responding, classify task complexity (simple question vs analysis vs coding). Use Haiku for simple/navigation/extraction/status. Use Sonnet ONLY for writing/analysis/planning/debugging. Monitor context size - if >50k tokens, recommend /compact. For automations, use scheduler wrapper. Never load full conversation history for simple tasks. Heartbeats always Haiku, single-line only. Never use Opus. This skill MUST run before every respo
ClawHub
Highlight Reels
Scenario-focused Sparki skill for highlight extraction while using the latest official Sparki setup, API-key, and upload workflow guidance.
ClawHub
nanobanana2-apiyi
Generate images via APIYI (Gemini 3.1 Flash Image Preview). Use when user wants to generate images from text descriptions. Supports keyword extraction, promp...
ClawHub
WiseOCR
PDF & Image OCR: Convert a single PDF or image to Markdown via WiseDiag cloud API, with high-accuracy text extraction, table recognition, and multi-column l...
ClawHub
Web Scraping & Data Extraction Engine
Complete web scraping methodology: legal compliance, architecture design, anti-detection, data pipelines, and production operations. Use when building scrap...
ClawHub
ucloud-deepseek-ocr
OCR text recognition using DeepSeek-OCR model. Use when user asks for OCR, text recognition, image text extraction, screenshot recognition, or converting ima...
ClawHub
Automated daily memory backfill for OpenClaw sessions
Scrape and analyze OpenClaw JSONL session logs to reconstruct and backfill agent memory files. Use when: (1) Memory appears incomplete after model switches, (2) Verifying memory coverage, (3) Reconstructing lost memory, (4) Automated daily memory sync via cron/heartbeat. Supports simple extraction and LLM-based narrative summaries with automatic secret sanitization.
ClawHub
MiniMax PDF Analysis V2
Analyze PDF files using MiniMax API. Supports text extraction, keyword search, and image-based VLM analysis (converts PDF pages to images first). Requires Mi...
ClawHub
ClawMemory
Sovereign agent memory engine: self-hosted, privacy-first SQLite store with LLM-based fact extraction (GLM-4.7), hybrid BM25+vector search, contradiction re...
Dify
Firecrawl (Dify)
**Firecrawl** is a powerful API integration for web crawling and data scraping. It allows users to extract URLs, scrape website content, and retrieve structured data from web pages. With its modular tools, Firecrawl simplifies the process of gathering web data efficiently. You can now use it in your application workflows for automated web data extraction and analysis. To set up Firecrawl, follow t
ClawHub
Tavily Skill.Bak
Use Tavily API for real-time web search and content extraction. Use when: user needs real-time web search results, research, or current information from the...
ClawHub
Lark/Feishu Sheets & Cloud File Download (with PDF extraction)
Read, write and manage Lark/Feishu Sheets (spreadsheets) and download Lark/Feishu cloud files via Lark OpenAPI. Reads Feishu app credentials (appId/appSecret...
ClawHub
math-guide-solver
Complete mathematical problem solving workflow with OCR, LaTeX formula extraction, PNG rendering, and guided solutions. Use this skill when users want to: -...
ClawHub
frompdf
PDF extraction API for AI agents and LLM pipelines. Converts any PDF into semantic AST, markdown, HTML, plain text, or LLM-ready chunks, with no page limit. Also...
ClawHub
local_memory
Manage AI conversation memory locally with automatic extraction, retrieval, and manual commands, ensuring privacy without external APIs or fees.
ClawHub
Ocr Document
OCR document extraction - extract text from scanned documents, photos, and images using OCR. Use when reading scanned PDFs, photographed pages, handwritten n...
ClawHub
X (Twitter) Data Scraper
X (Twitter) data extraction and analysis. Use when user asks to "get tweets from @username", "search X for", "analyze Twitter data", "fetch tweets about [top...
ClawHub
Endpoints
Endpoints document management API toolkit. Scan documents with AI extraction and organize structured data into categorized endpoints. Use when the user asks to: scan a document, upload a file, list endpoints, inspect endpoint data, check usage stats, create or delete endpoints, get file URLs, or manage document metadata. Requires ENDPOINTS_API_KEY from endpoints.work dashboard.
ClawHub
CorpusGraph Document ETL and entity relationship engine for AI agents
Document ETL, entity extraction, and relationship graphing engine. Convert 1,000+ file formats into searchable, structured data with automatic entity and rel...
GitHub
OrangeViolin/skill-evolve
Evolutionary Skill improvement: a Claude Code Skill that improves other skills through observation, pattern extraction, and iterative refinement. Based on OTF + JIT + Bootstrap methodology.
ClawHub
markdown-extract
Extract clean markdown from any URL using auto, AI, or browser methods via the markdown.new API with error handling and flexible extraction options.
Dify
Paddleocr (Dify)
**[PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR) is an industry-leading, production-ready OCR and document AI engine, offering end-to-end solutions from text extraction to intelligent document understanding.** This plugin provides several capabilities from PaddleOCR, including text recognition, document parsing, and more. Open the Plugin Marketplace, search for the PaddleOCR plugin, and in
Dify
Mineru (Dify)
MinerU is a tool that converts PDFs into machine-readable formats (e.g., markdown, JSON), allowing for easy extraction into any format. MinerU is a document parser that can parse complex document data for any downstream LLM use case (RAG, agents) [GitHub - opendatalab/MinerU: A high-quality tool for convert PDF to Markdown and JSON.](https://github.com/opendatalab/MinerU) - Remove headers, footers
GitHub (120 stars)
udayanwalvekar/clearshot
Structured screenshot analysis for UI implementation and critique. Analyzes every UI screenshot with a 5×5 spatial grid, full element inventory, and design system extraction: facts and taste together, every time. Escalates to full implementation blueprint when building. Trigger on any digital interface image file (png, jpg, gif, webp; websites, apps, dashboards, mockups, wireframes) or commands like 'analyse this screenshot,' 'rebuild this,' 'match this design
ClawHub
Brave Search Old
Web search and content extraction via Brave Search API. Use for searching documentation, facts, or any web content. Lightweight, no browser required.
ClawHub
Ingestigate Investigative intelligence for AI agents
Investigative intelligence: document search, entity extraction, and relationship graphing. Analyze document corpuses to find connections between people, org...
ClawHub
document-parser
Extract structured data from PDFs, images, and Word files with layout analysis, table recognition, OCR, seal detection, and directory extraction.
ClawHub
Smart Memory (Zero Dep)
Enhanced memory system for agentic workflows. Automatic memory extraction from conversations, memory type classification (preference/project/technical/lesson...
ClawHub
Meta Video Ad Analyzer
Extract and analyze content from video ads using Gemini Vision AI. Supports frame extraction, OCR text detection, audio transcription, and AI-powered scene analysis. Use when analyzing video creative content, extracting text overlays, or generating scene-by-scene descriptions.
ClawHub
Pget
Parallel file download and optional tar extraction using the pget CLI (single URL or multifile manifest). Use when you need high-throughput downloads from HTTP(S)/S3/GCS, want to split a large file into chunks for speed, or want to download and extract a .tar/.tar.gz in one step.
ClawHub
AnyCrawl-API
Perform high-performance web scraping, crawling, and Google search with multi-engine support and structured data extraction via AnyCrawl API.