Find the Right AI Skill for Any Job
Browse 80+ curated AI agent skills. Search by use case, filter by category, get the right tool instantly.
80 skills in "audio" matching "transcription"
ClawHub
Yt Assemblyai Monitor
YouTube channel monitor and video transcription using AssemblyAI cloud API. Pure Python + requests only: no ffmpeg, no Whisper, no extra tools needed. Monit...
ClawHub
Elevenlabs Transcribe
Transcribe audio to text using ElevenLabs Scribe. Supports batch transcription, realtime streaming from URLs, microphone input, and local files.
ClawHub
subtitle-extractor
Subtitle extractor for Bilibili, YouTube, Xiaohongshu, Douyin, and local files. Extracts native subtitles or Whisper transcription in original format. Agent...
ClawHub
Local Whisper
Local speech-to-text using OpenAI Whisper. Runs fully offline after model download. High quality transcription with multiple model sizes.
ClawHub
AssemblyAI advanced speech transcription
Transcribe, diarise, translate, post-process, and structure audio/video with AssemblyAI. Use this skill when the user wants AssemblyAI specifically, needs hi...
ClawHub
Youtube Transcriber
One-command YouTube video transcription. Automatically downloads audio and transcribes using OpenAI Whisper API; works even when YouTube subtitles are disab...
ClawHub
clip-editor
Video clip editing skill for automatically analyzing video content and generating CapCut draft templates. Uses local Whisper for speech transcription, Qwen-V...
ClawHub
mlx-whisper
Set up mlx-whisper as the local audio transcription engine for OpenClaw on Apple Silicon Macs (M1/M2/M3/M4). Automatically transcribes voice notes sent via T...
Dify
Fal (Dify)
**FAL** is an advanced suite of tools designed for AI-powered image generation and audio transcription. In **Dify**, FAL provides multiple services, including image creation with models like **FLUX.1 [pro]** and **FLUX 1.1 [pro] ultra**, allowing users to generate high-quality visuals with customizable parameters. Additionally, FAL offers **Wizper**, a transcription tool that converts audio files
ClawHub
Qwen ASR
Local speech-to-text using Qwen3-ASR (CPU-only, no API key, no cloud). Use when: (1) a voice message or audio file needs transcription, (2) user asks to tran...
ClawHub
Meta Video Ad Analyzer
Extract and analyze content from video ads using Gemini Vision AI. Supports frame extraction, OCR text detection, audio transcription, and AI-powered scene analysis. Use when analyzing video creative content, extracting text overlays, or generating scene-by-scene descriptions.
ClawHub
Play Music from YouTube
Play music on YouTube via browser automation with playwright-cli.
Use when the user wants to:
(1) play a specific song (e.g. 'play Money Money Money by ABBA')
(2) play songs by an artist as a playlist or mix (e.g. 'play Jay Chou's songs')
(3) play genre or mood-based music (e.g. 'play relaxing spa music', 'play 60s Chinese oldies')
(4) control playback: next, pause, resume, stop, skip ad, change song, close the player.
Also handles song/artist name corrections from voice transcription erro
ClawHub
case.dev
case.dev: a legal AI platform with encrypted document vaults, OCR, audio transcription, and legal search. This skill installs the casedev CLI and provides s...
MCP
format37/youtube_mcp
MCP server that transcribes YouTube videos to text. Uses yt-dlp to download audio and OpenAI's Whisper-1 for more precise transcription than YouTube captions. Provide a YouTube URL and get back the full transcript, split into chunks for long videos.
ClawHub
ifly-speed-transcription
Ultra-fast speech transcription using iFLYTEK Speed Transcription API. Transcribe audio files (WAV/PCM/MP3) up to 5 hours in ~20 seconds per hour. Supports C...
ClawHub
Aliyun Asr
Pure Aliyun ASR skill for voice message transcription. Supports multiple channels, including Feishu.
ClawHub
Telegram Whisper Transcribe
Standalone Telegram bot for voice message transcription via OpenAI Whisper API. No LLM overhead: audio goes directly to Whisper and text comes back in 2-5 s...
GitHub
Vibe Transcribe
All-in-one solution for effortless audio and video transcription. [#opensource](https://github.com/thewh1teagle/vibe)
ClawHub
Voice To Protocol Transcriber
Record experimental procedures and observations via voice commands during lab work. Real-time transcription for structured experiment documentation.
ClawHub
transcription
Transcribe audio and video files using OpenAI Whisper API. Use when user wants to transcribe audio/video files, extract speech from media, or get text from r...
ClawHub
Douyin Video Transcribe
Douyin video transcription suite. Extract audio from Douyin/TikTok China videos, transcribe with Whisper, and analyze content. Supports video links, local fi...
ClawHub
Nex Voice
Voice note transcription and intelligent action item extraction for capture and organization of verbal communication. Record and transcribe voice notes, voic...
ClawHub
Faster Whisper Transcription
Transcribes local voice messages to text using Faster Whisper models for fast, privacy-focused speech recognition on audio files.
ClawHub
Faster Whisper Gpu
High-performance local speech-to-text transcription using Faster Whisper with NVIDIA GPU acceleration. Transcribe audio files locally without sending data to...
ClawHub
Venice API Kit
Complete Venice AI API toolkit - image generation, video, audio, embeddings, transcription, characters, models, and admin functions. Privacy-focused inferenc...
ClawHub
Youtube Transcript Api
Extract, transcribe, and translate YouTube video transcripts using the YouTubeTranscript.dev V2 API. Supports captions, ASR audio transcription, batch proces...
ClawHub
Voice Transcriber Pro
Voice note transcription and archival for OpenClaw agents. Powered by Deepgram Nova-3. Transcribes audio messages, saves both audio files and text transcript...
ClawHub
acestep-lyrics-transcription
Transcribe audio to timestamped lyrics using OpenAI Whisper or ElevenLabs Scribe API. Outputs LRC, SRT, or JSON with word-level timestamps. Use when users want to transcribe songs, generate LRC files, or extract lyrics with timestamps from audio.
ClawHub
Audio
Process, enhance, and convert audio files with noise removal, normalization, format conversion, transcription, and podcast workflows.
ClawHub
Speech To Text
Transcribe audio to text with Whisper models via inference.sh CLI. Models: Fast Whisper Large V3, Whisper V3 Large. Capabilities: transcription, translation,...
ClawHub
Ai Video Transcription
Transcribe video speech to text with 98%+ accuracy using AI: convert spoken audio from any video into perfectly timed text transcripts, searchable documents...
ClawHub
Coze Asr
Automatic Speech Recognition (ASR) using Coze API. Use when you need to transcribe audio files to text. Supports Chinese audio transcription via Coze's speec...
Page 2 of 2 (80 skills)