BytesAgain

Find the Right AI Skill for Any Job

Browse 338+ curated AI agent skills. Search by use case, filter by category, get the right tool instantly.

All Skills

338 skills total matching "extraction"

🦀 ClawHub
LiteParse Document Parser
Use when parsing PDFs, DOCX, PPTX, XLSX, or images locally. Supports text extraction, JSON output with bounding boxes, batch processing, and page screenshots...
🦀 ClawHub
Scan To Markdown
OCR document extraction - extract text from scanned documents, photos, and images using OCR. Use when reading scanned PDFs, photographed pages, handwritten n...
🦀 ClawHub
InfoQuest Web Search
AI-optimized web search, image search and content extraction via BytePlus InfoQuest API. Use this skill when you need to gather concise and up-to-date inform...
🦀 ClawHub
PulpMiner Web Scraper - Convert Any Webpage to Realtime JSON API
Convert any webpage into structured JSON data using AI. Scrape websites, extract data into custom JSON schemas, and call saved APIs programmatically. Useful for web scraping, data extraction, content monitoring, lead generation, price tracking, and building data pipelines.
🦀 ClawHub
Akashic Doc Analyzer
Parse, analyze, and extract content from documents (PDF, DOCX, PPTX, audio). Supports OCR, table extraction, and semantic chunking.
🦀 ClawHub
Invoice Scan
AI-powered invoice OCR, scanning, and data extraction. Use when: (1) user needs OCR or text extraction from invoice images, scanned documents, or PDFs, (2) s...
🦀 ClawHub
MinerU zero-setup document extraction: convert PDFs, images, Word, and PowerPoint to Markdown instantly. No login, no token, no configuration. Just run and get results
Zero-setup document extraction: convert PDFs, images, Word, and PowerPoint to Markdown. No login, no token, no configuration. Just run and get results.
🦀 ClawHub
Power Search
Self-hosted research tool combining Brave Search API + Browserless content fetching. Search the web with optional full-page content extraction and HTML parsing.
🦀 ClawHub
file-processor
Automatically detects and processes files including PDF, Excel, CSV, Word, images, and text for extraction, OCR, data analysis, and summarization.
🦀 ClawHub
Deep Research Pro v5.0.1
Performs deep research using a three-stage process: data extraction, thematic insight briefs with contradiction analysis, and narrative-driven strategic repo...
🦀 ClawHub
Browser Web Extract
Extract text and image content from any web page via URL. Use when the user provides a URL, web page link, or web address and wants to read, extract, scrape, summarize, or analyze page content: web page extraction, URL scraping, web content reading, website data collection, link parsing, web...
🦀 ClawHub
Web Researcher Mini
Firecrawl CLI for web scraping, crawling, and search. Scrape single pages or entire websites, map site URLs, and search the web with full content extraction....
🦀 ClawHub
maasv Memory
Provides structured long-term memory with semantic, keyword, and knowledge graph retrieval, entity extraction, temporal versioning, and experiential learning.
🦀 ClawHub
Lightpanda Scraper
Fast headless browser web scraping using Lightpanda (0.5s page loads, 90x faster than Chromium). Perfect for OSINT recon, link extraction, and content scrapi...
🦀 ClawHub
Fast Browser Use Local
Rust-based browser automation using local Chrome for ultra-fast DOM extraction, session management, screenshots, scraping, and site structure analysis.
🦀 ClawHub
Ride Receipts
Build a local SQLite ride-history database from Gmail ride receipt emails using gog for fetch and OpenClaw Gateway /v1/responses for extraction. Use when you...
🦀 ClawHub
Video Transcript
Extract full transcripts from video content for analysis, summarization, note-taking, or research. Use when the user wants a written version of video content, asks to "transcribe this", "get the text from this video", "convert video to text", or shares a video URL for content extraction.
🦀 ClawHub
StartClaw-Optimizer
Master optimization system - APPLIES TO EVERY RESPONSE. Before responding, classify task complexity (simple question vs analysis vs coding). Use Haiku for simple/navigation/extraction/status. Use Sonnet ONLY for writing/analysis/planning/debugging. Monitor context size - if >50k tokens, recommend /compact. For automations, use scheduler wrapper. Never load full conversation history for simple tasks. Heartbeats always Haiku, single-line only. Never use Opus. This skill MUST run before every respo
🦀 ClawHub
Highlight Reels
Scenario-focused Sparki skill for highlight extraction while using the latest official Sparki setup, API-key, and upload workflow guidance.
🦀 ClawHub
nanobanana2-apiyi
Generate images via APIYI (Gemini 3.1 Flash Image Preview). Use when user wants to generate images from text descriptions. Supports keyword extraction, promp...
🦀 ClawHub
WiseOCR
PDF & Image OCR: convert a single PDF or image to Markdown via WiseDiag cloud API, with high-accuracy text extraction, table recognition, and multi-column l...
🦀 ClawHub
Web Scraping & Data Extraction Engine
Complete web scraping methodology: legal compliance, architecture design, anti-detection, data pipelines, and production operations. Use when building scrap...
🦀 ClawHub
ucloud-deepseek-ocr
OCR text recognition using DeepSeek-OCR model. Use when user asks for OCR, text recognition, image text extraction, screenshot recognition, or converting ima...
🦀 ClawHub
Automated daily memory backfill for OpenClaw sessions
Scrape and analyze OpenClaw JSONL session logs to reconstruct and backfill agent memory files. Use when: (1) Memory appears incomplete after model switches, (2) Verifying memory coverage, (3) Reconstructing lost memory, (4) Automated daily memory sync via cron/heartbeat. Supports simple extraction and LLM-based narrative summaries with automatic secret sanitization.
🦀 ClawHub
MiniMax PDF Analysis V2
Analyze PDF files using MiniMax API. Supports text extraction, keyword search, and image-based VLM analysis (converts PDF pages to images first). Requires Mi...
🦀 ClawHub
ClawMemory
Sovereign agent memory engine: self-hosted, privacy-first SQLite store with LLM-based fact extraction (GLM-4.7), hybrid BM25+vector search, contradiction re...
🔧 Dify
Firecrawl (Dify)
**Firecrawl** is a powerful API integration for web crawling and data scraping. It allows users to extract URLs, scrape website content, and retrieve structured data from web pages. With its modular tools, Firecrawl simplifies the process of gathering web data efficiently. You can now use it in your application workflows for automated web data extraction and analysis. To set up Firecrawl, follow t
🦀 ClawHub
Tavily Skill.Bak
Use Tavily API for real-time web search and content extraction. Use when: user needs real-time web search results, research, or current information from the...
🦀 ClawHub
Lark/Feishu Sheets & Cloud File Download (with PDF extraction)
Read, write and manage Lark/Feishu Sheets (spreadsheets) and download Lark/Feishu cloud files via Lark OpenAPI. Reads Feishu app credentials (appId/appSecret...
🦀 ClawHub
math-guide-solver
Complete mathematical problem solving workflow with OCR, LaTeX formula extraction, PNG rendering, and guided solutions. Use this skill when users want to: -...
🦀 ClawHub
frompdf
PDF extraction API for AI agents and LLM pipelines. Converts any PDF into semantic AST, markdown, HTML, plain text, or LLM-ready chunks, with no page limit. Also...
🦀 ClawHub
local_memory
Manage AI conversation memory locally with automatic extraction, retrieval, and manual commands, ensuring privacy without external APIs or fees.
🦀 ClawHub
Ocr Document
OCR document extraction - extract text from scanned documents, photos, and images using OCR. Use when reading scanned PDFs, photographed pages, handwritten n...
🦀 ClawHub
X (Twitter) Data Scraper
X (Twitter) data extraction and analysis. Use when user asks to "get tweets from @username", "search X for", "analyze Twitter data", "fetch tweets about [top...
🦀 ClawHub
Endpoints
Endpoints document management API toolkit. Scan documents with AI extraction and organize structured data into categorized endpoints. Use when the user asks to: scan a document, upload a file, list endpoints, inspect endpoint data, check usage stats, create or delete endpoints, get file URLs, or manage document metadata. Requires ENDPOINTS_API_KEY from endpoints.work dashboard.
🦀 ClawHub
CorpusGraph Document ETL and entity relationship engine for AI agents
Document ETL, entity extraction, and relationship graphing engine. Convert 1,000+ file formats into searchable, structured data with automatic entity and rel...
โญ GitHub
OrangeViolin/skill-evolve
ๆผ”่ฟ›ๅผ Skill ๆ”น่ฟ› โ€” A Claude Code Skill that improves other skills through observation, pattern extraction, and iterative refinement. Based on OTF + JIT + Bootstrap methodology.
🦀 ClawHub
markdown-extract
Extract clean markdown from any URL using auto, AI, or browser methods via the markdown.new API with error handling and flexible extraction options.
🔧 Dify
Paddleocr (Dify)
**[PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR) is an industry-leading, production-ready OCR and document AI engine, offering end-to-end solutions from text extraction to intelligent document understanding.** This plugin provides several capabilities from PaddleOCR, including text recognition, document parsing, and more. Open the Plugin Marketplace, search for the PaddleOCR plugin, and in
🔧 Dify
Mineru (Dify)
MinerU is a tool that converts PDFs into machine-readable formats (e.g., markdown, JSON), allowing for easy extraction into any format. MinerU is a document parser that can parse complex document data for any downstream LLM use case (RAG, agents) [GitHub - opendatalab/MinerU: A high-quality tool for convert PDF to Markdown and JSON.](https://github.com/opendatalab/MinerU) - Remove headers, footers
โญ GitHubโญ 120
udayanwalvekar/clearshot
Structured screenshot analysis for UI implementation and critique. Analyzes every UI screenshot with a 5×5 spatial grid, full element inventory, and design system extraction: facts and taste together, every time. Escalates to full implementation blueprint when building. Trigger on any digital interface image file (png, jpg, gif, webp; websites, apps, dashboards, mockups, wireframes) or commands like 'analyse this screenshot,' 'rebuild this,' 'match this design
🦀 ClawHub
Brave Search Old
Web search and content extraction via Brave Search API. Use for searching documentation, facts, or any web content. Lightweight, no browser required.
🦀 ClawHub
Ingestigate Investigative intelligence for AI agents
Investigative intelligence: document search, entity extraction, and relationship graphing. Analyze document corpuses to find connections between people, org...
🦀 ClawHub
document-parser
Extract structured data from PDFs, images, and Word files with layout analysis, table recognition, OCR, seal detection, and directory extraction.
🦀 ClawHub
Smart Memory (Zero Dep)
Enhanced memory system for agentic workflows. Automatic memory extraction from conversations, memory type classification (preference/project/technical/lesson...
🦀 ClawHub
Meta Video Ad Analyzer
Extract and analyze content from video ads using Gemini Vision AI. Supports frame extraction, OCR text detection, audio transcription, and AI-powered scene analysis. Use when analyzing video creative content, extracting text overlays, or generating scene-by-scene descriptions.
🦀 ClawHub
Pget
Parallel file download and optional tar extraction using the pget CLI (single URL or multifile manifest). Use when you need high-throughput downloads from HTTP(S)/S3/GCS, want to split a large file into chunks for speed, or want to download and extract a .tar/.tar.gz in one step.
🦀 ClawHub
AnyCrawl-API
Perform high-performance web scraping, crawling, and Google search with multi-engine support and structured data extraction via AnyCrawl API.
โ† PrevPage 5 / 8 (338 skills)Next โ†’