Find the Right AI Skill for Any Job
Browse 28+ curated AI agent skills. Search by use case, filter by category, get the right tool instantly.
28 skills in "audio" matching "detection"
ClawHub
whatsappVoiceOpenSkill
Real-time WhatsApp voice message processing. Transcribe voice notes to text via Whisper, detect intent, execute handlers, and send responses. Use when building conversational voice interfaces for WhatsApp. Supports English and Hindi, customizable intents (weather, status, commands), automatic language detection, and streaming responses via TTS.
ClawHub
Voice Note To Midi
Convert voice notes, humming, and melodic audio recordings to quantized MIDI files using ML-based pitch detection and intelligent post-processing
ClawHub
Azure Ai Voicelive Py
Build real-time voice AI applications using Azure AI Voice Live SDK (azure-ai-voicelive). Use this skill when creating Python applications that need real-time bidirectional audio communication with Azure AI, including voice assistants, voice-enabled chatbots, real-time speech-to-speech translation, voice-driven avatars, or any WebSocket-based audio streaming with AI models. Supports Server VAD (Voice Activity Detection), turn-based conversation, function calling, MCP tools, avatar integration, a
ClawHub
Video To Text
Convert video or audio files from URLs into text or subtitle formats using a free API with automatic language detection and no local downloads required.
ClawHub
Audio Recording Quality Analyzer
Analyze audio recording quality - echo detection, loudness, speech intelligibility, SNR, spectral analysis. Use when the user wants to check a recording's qu...
ClawHub
EmoCity Biometric Scan
Real-time biometric analysis: stress, deception, emotions, heart rate from your camera. 478 facial landmarks, voice stress, micro-expression detection. Powe...
ClawHub
BirdWeather PUC
Access BirdWeather PUC station data: species detections, sensor readings (AQI, temperature, humidity, pressure, eCO₂, sound dB), and historical trends. Use...
ClawHub
TTS AutoPlay with Wake Word
Auto-play TTS voice files with wake word detection. Only plays audio when user message contains wake words like "语音", "念出来", "voice", etc. Perfect for Webcha...
ClawHub
JARVIS AI Skills
Control robotic arms and grippers via voice or code with OpenClaw, supporting precise 6-DOF movement, force sensing, collision detection, and simulation.
ClawHub
Prompt injection detection skill
Two-layer content safety for agent input and output. Use when (1) a user message attempts to override, ignore, or bypass previous instructions (prompt injection), (2) a user message references system prompts, hidden instructions, or internal configuration, (3) receiving messages from untrusted users in group chats or public channels, (4) generating responses that discuss violence, self-harm, sexual content, hate speech, or other sensitive topics, or (5) deploying agents in public-facing or multi
ClawHub
TubeScribe
YouTube video summarizer with speaker detection, formatted documents, and audio output. Works out of the box with macOS built-in TTS. Optional recommended tools (pandoc, ffmpeg, mlx-audio) enhance quality. Requires internet for YouTube access. No paid APIs or subscriptions. Use when user sends a YouTube URL or asks to summarize/transcribe a YouTube video.
ClawHub
Cinematic Script Writer
Create professional cinematic scripts for AI video generation with character consistency and cinematography knowledge. Use when the user wants to write a cinematic script, create story contexts with characters, generate image prompts for AI video tools (Midjourney, Sora, Veo), or needs cinematography guidance (camera angles, lighting, color grading). Also use for character consistency sheets, voice profiles, anachronism detection, and saving scripts to Google Drive.
ClawHub
Humanizer
Humanize AI-generated text by detecting and removing patterns typical of LLM output. Rewrites text to sound natural, specific, and human. Uses 24 pattern detectors, 500+ AI vocabulary terms across 3 tiers, and statistical analysis (burstiness, type-token ratio, readability) for comprehensive detection. Use when asked to humanize text, de-AI writing, make content sound more natural/human, review writing for AI patterns, score text for AI detection, or improve AI-generated drafts. Covers content,
ClawHub
Purefeed
Monitors Twitter/X feeds with AI signal detection. Searches tweets semantically, manages signal detectors, generates human-sounding posts, checks AI detectio...
ClawHub
Humanize AI text
Humanize AI-generated text to bypass detection. This humanizer rewrites ChatGPT, Claude, and GPT content to sound natural and pass AI detectors like GPTZero,...
ClawHub
Whisper Transcribe
Transcribe audio files to text using OpenAI Whisper. Supports speech-to-text with auto language detection, multiple output formats (txt, srt, vtt, json), batch processing, and model selection (tiny to large). Use when transcribing audio recordings, podcasts, voice messages, lectures, meetings, or any audio/video file to text. Handles mp3, wav, m4a, ogg, flac, webm, opus, aac formats.
ClawHub
Qwen Asr Skill
Provides high-accuracy speech-to-text conversion supporting 22 Chinese dialects and 30 languages with automatic language detection, running on CPU.
ClawHub
Speech to Text Transcription
Transcribe audio and video files to text with speaker detection, timestamps, and format conversion.
ClawHub
macOS Local Voice
Local STT and TTS on macOS using native Apple capabilities. Speech-to-text via yap (Apple Speech.framework), text-to-speech via say + ffmpeg. Fully offline, no API keys required. Includes voice quality detection and smart voice selection.
ClawHub
Invoices
Capture, extract, and organize received invoices with automatic OCR, provider detection, and searchable archive.
ClawHub
Cinematic Script Writer
Create professional cinematic scripts for AI video generation with character consistency and cinematography knowledge. Use when the user wants to write a cinematic script, create story contexts with characters, generate image prompts for AI video tools (Midjourney, Sora, Veo), or needs cinematography guidance (camera angles, lighting, color grading). Also use for character consistency sheets, voice profiles, anachronism detection, and saving scripts to Google Drive.
ClawHub
Truly Local Piper Multilang TTS (secure)
Local offline text-to-speech via Piper TTS. Self-contained setup, automatic language detection, per-call voice selection. Extensible to any language. Writes...
ClawHub
Ai Content Detection
Use this skill whenever a user wants to verify whether content (text, images, audio, video, or documents) was created by AI; detect deepfakes or AI-synthesiz...
ClawHub
Supplier Video Ad Builder
Transforms supplier or CJ source videos into 1080×1920 TikTok/Instagram Reels ads with clean zone detection, Pillow text overlays, CTA card, and trending audio.
ClawHub
Meta Video Ad Analyzer
Extract and analyze content from video ads using Gemini Vision AI. Supports frame extraction, OCR text detection, audio transcription, and AI-powered scene analysis. Use when analyzing video creative content, extracting text overlays, or generating scene-by-scene descriptions.
ClawHub
Parakeet Stt
Local speech-to-text with NVIDIA Parakeet TDT 0.6B v3 (ONNX on CPU). 30x faster than Whisper, 25 languages, auto-detection, OpenAI-compatible API. Use when transcribing audio files, converting speech to text, or processing voice recordings locally without cloud APIs.
ClawHub
Humanizer
Remove AI writing patterns based on Wikipedia's "Signs of AI writing" research. 24 pattern detection and rewriting rules for making AI-generated text sound n...
ClawHub
AssemblyAI Transcriber
Transcribe audio files with speaker diarization (who speaks when). Supports 100+ languages, automatic language detection, and timestamps. Use for meetings, interviews, podcasts, or voice messages. Requires AssemblyAI API key.