🎁 Get the FREE AI Skills Starter GuideSubscribe →
BytesAgainBytesAgain

All Skills — audio

56 skills in "audio" matching "transcription"

🦀 ClawHub25.3k dl
Openai Whisper Api
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
🦀 ClawHub8.9k dl
Video Subtitles
Generate SRT subtitles from video/audio with translation support. Transcribes Hebrew (ivrit.ai) and English (whisper), translates between languages, burns subtitles into video. Use for creating captions, transcripts, or hardcoded subtitles for WhatsApp/social media.
🦀 ClawHub3.0k dl
Elevenlabs Transcribe
Transcribe audio to text using ElevenLabs Scribe. Supports batch transcription, realtime streaming from URLs, microphone input, and local files.
🦀 ClawHub2.7k dl
Zoom Meeting Assistance Rtms Unofficial Community
Zoom RTMS Meeting Assistant — start on-demand to capture meeting audio, video, transcript, screenshare, and chat via Zoom Real-Time Media Streams. Handles meeting.rtms_started and meeting.rtms_stopped webhook events. Provides AI-powered dialog suggestions, sentiment analysis, and live summaries with WhatsApp notifications. Use when a Zoom RTMS webhook fires or the user asks to record/analyze a meeting.
🦀 ClawHub2.4k dl
Voicenotes
Sync and access voice notes from Voicenotes.com. Use when the user wants to retrieve their voice recordings, transcripts, and AI summaries from Voicenotes. Supports fetching notes, syncing to markdown, and searching transcripts.
🦀 ClawHub2.2k dl
Pocket AI Transcripts
Read transcripts and summaries from Pocket AI (heypocket.com) recording devices. Use when users want to retrieve, search, or analyze their Pocket recordings, transcripts, summaries, or action items. Triggers on requests involving Pocket device data, conversation transcripts, meeting recordings, or audio note retrieval.
🦀 ClawHub2.2k dl
Ai Sdk Core
Build backend AI with Vercel AI SDK v6 stable. Covers Output API (replaces generateObject/streamObject), speech synthesis, transcription, embeddings, MCP tools with security guidance. Includes v4→v5 migration and 15 error solutions with workarounds. Use when: implementing AI SDK v5/v6, migrating versions, troubleshooting AI_APICallError, Workers startup issues, Output API errors, Gemini caching issues, Anthropic tool errors, MCP tools, or stream resumption failures.
🦀 ClawHub1.9k dl
Meta Video Ad Analyzer
Extract and analyze content from video ads using Gemini Vision AI. Supports frame extraction, OCR text detection, audio transcription, and AI-powered scene analysis. Use when analyzing video creative content, extracting text overlays, or generating scene-by-scene descriptions.
🦀 ClawHub1.7k dl
Whisper STT
Free local speech-to-text transcription using OpenAI Whisper. Transcribe audio files (mp3, wav, m4a, ogg, etc.) to text without API costs. Use when: (1) User...
🦀 ClawHub1.5k dl
Venice API Kit
Complete Venice AI API toolkit - image generation, video, audio, embeddings, transcription, characters, models, and admin functions. Privacy-focused inferenc...
🦀 ClawHub1.2k dl
clawdio
Analyze Twitter Spaces and voice conversations to extract market intelligence, crypto alpha, sentiment analysis, and speaker-attributed insights. Transforms spoken audio into structured reports, full transcripts, and machine-readable metadata. Use when you need intelligence from Twitter Spaces, podcast discussions, or any long-form voice content — especially for crypto markets, AI trends, and expert commentary that only exists in audio.
🦀 ClawHub1.1k dl
Funasr Transcribe Skill
Use when the user needs local speech-to-text transcription for audio files, especially Chinese or mixed Chinese-English audio, without relying on cloud trans...
🦀 ClawHub1.0k dl
Voice Notes
Organize voice message transcripts into a structured, searchable knowledge base with tags, links, and progressive note-taking.
🦀 ClawHub777 dl
Douyin Video Transcribe
Douyin video transcription suite. Extract audio from Douyin/TikTok China videos, transcribe with Whisper, and analyze content. Supports video links, local fi...
🦀 ClawHub694 dl
case.dev
case.dev — a legal AI platform with encrypted document vaults, OCR, audio transcription, and legal search. This skill installs the casedev CLI and provides s...
🦀 ClawHub603 dl
Podcast Transcribe
For transcript or subtitle requests involving podcast URLs, public audio URLs/files, or raw transcript cleanup. Generates audio + SRT + TXT artifacts and can...
🦀 ClawHub356 dl
Whisper Voice Transcription (whisper.cpp)
Build and use whisper.cpp for local speech-to-text workflows, with optional cloud fallback when local transcription is not practical.
🦀 ClawHub158 dl
realtime-transcription
Real-time transcription of system or microphone audio with automatic summary generation and date-based Markdown archival after stopping or idle timeout.
🦀 ClawHub
format37/youtube_mcp
🐍 ☁️ – MCP server that transcribes YouTube videos to text. Uses yt-dlp to download audio and OpenAI's Whisper-1 for more precise transcription than youtube captions. Provide a YouTube URL and get back the full transcript splitted by chunks for long videos.
🦀 ClawHub12.5k dl
Video Transcript Downloader
Download videos, audio, subtitles, and clean paragraph-style transcripts from YouTube and any other yt-dlp supported site. Use when asked to “download this video”, “save this clip”, “rip audio”, “get subtitles”, “get transcript”, or to troubleshoot yt-dlp/ffmpeg and formats/playlists.
🦀 ClawHub12.0k dl
Local Whisper
Local speech-to-text using OpenAI Whisper. Runs fully offline after model download. High quality transcription with multiple model sizes.
🦀 ClawHub3.4k dl
Transcribee 🐝
Transcribe YouTube videos and local audio/video files with speaker diarization. Use when user asks to transcribe a YouTube URL, podcast, video, or audio file. Outputs clean speaker-labeled transcripts ready for LLM analysis.
🦀 ClawHub3.4k dl
AssemblyAI advanced speech transcription
Transcribe, diarise, translate, post-process, and structure audio/video with AssemblyAI. Use this skill when the user wants AssemblyAI specifically, needs hi...
🦀 ClawHub2.9k dl
it will help you to send voice messages to your AI Assistant and also can make it talk
Text-to-Speech and Speech-to-Text using ElevenLabs AI. Use when the user wants to convert text to speech, transcribe voice messages, or work with voice in multiple languages. Supports high-quality AI voices and accurate transcription.
🦀 ClawHub2.8k dl
Aliyun Asr
Pure Aliyun ASR skill for voice message transcription, supports multiple channels including Feishu
🦀 ClawHub2.5k dl
MarkItDown Skill
OpenClaw agent skill for converting documents to Markdown. Documentation and utilities for Microsoft's MarkItDown library. Supports PDF, Word, PowerPoint, Excel, images (OCR), audio (transcription), HTML, YouTube.
🦀 ClawHub2.5k dl
Azure Ai Transcription Py
Azure AI Transcription SDK for Python. Use for real-time and batch speech-to-text transcription with timestamps and diarization. Triggers: "transcription", "speech to text", "Azure AI Transcription", "TranscriptionClient".
🦀 ClawHub2.2k dl
Play Music from YouTube
Play music on YouTube via browser automation with playwright-cli. Use when the user wants to: (1) play a specific song (e.g. 'play Money Money Money by ABBA') (2) play songs by an artist as a playlist or mix (e.g. 'play Jay Chou's songs') (3) play genre or mood-based music (e.g. 'play relaxing spa music', 'play 60s Chinese oldies') (4) control playback — next, pause, resume, stop, skip ad, change song, close the player. Also handles song/artist name corrections from voice transcription erro
🦀 ClawHub2.2k dl
Pollinations
Pollinations.ai API for AI generation and analysis - text, images, videos, audio, vision, and transcription. Use when user requests AI-powered content (text...
🦀 ClawHub2.2k dl
Audio
Process, enhance, and convert audio files with noise removal, normalization, format conversion, transcription, and podcast workflows.
🦀 ClawHub2.2k dl
Plaud Unofficial Skill
Use when accessing Plaud voice recorder data (recordings, transcripts, AI summaries) - guides credential setup and provides patterns for plaud_client.py
🦀 ClawHub1.9k dl
Local Voice (FluidAudio TTS/STT)
Local text-to-speech (TTS) and speech-to-text (STT) using FluidAudio on Apple Silicon. Sub-second voice synthesis and transcription running entirely on-device via the Apple Neural Engine. Use when setting up local voice capabilities, voice assistant integration, or replacing cloud TTS/STT services.
🦀 ClawHub1.9k dl
Content Repurposer
Transform long-form content into platform-optimized snippets. Your agent takes one blog post, video transcript, or podcast notes and generates ready-to-publish Twitter threads, LinkedIn posts, email newsletters, and Instagram captions. Maintains voice consistency while adapting to each platform's format, length, and engagement patterns. Configure tone preferences, platform priorities, and output formats. Use when publishing content across multiple channels, repurposing existing material, or maxi
🦀 ClawHub1.8k dl
Faster Whisper Local Service
OpenClaw local speech-to-text backend using faster-whisper over HTTP on 127.0.0.1:18790. Use when you want voice transcription without external APIs, without...
🦀 ClawHub1.7k dl
Bilibili Transcript
Transcribe Bilibili videos to text with high accuracy using Whisper medium model. Use when the user provides a Bilibili video URL (BVxxxxx) and wants to: (1)...
🦀 ClawHub1.6k dl
Speechall command-line tool for fast speech-to-text transcription using multiple providers
Install and use the speechall CLI tool for speech-to-text transcription. Use when the user wants to: (1) transcribe audio or video files to text, (2) install speechall on macOS or Linux, (3) list available STT models and their capabilities, (4) use speaker diarization, subtitles, or other transcription features from the terminal. Triggers on mentions of speechall, audio transcription CLI, or speech-to-text from the command line.
🦀 ClawHub1.6k dl
Podcast Chaptering Highlights
Create chapters, highlights, and show notes from podcast audio or transcripts. Use when a user wants chapter markers, highlight clips, or show-note drafts without publishing or distribution actions.
🦀 ClawHub1.4k dl
Instagram Reels
Download Instagram Reels, transcribe audio, and extract captions. Share a reel URL and get back a full transcript with the original description.
🦀 ClawHub1.3k dl
acestep-lyrics-transcription
Transcribe audio to timestamped lyrics using OpenAI Whisper or ElevenLabs Scribe API. Outputs LRC, SRT, or JSON with word-level timestamps. Use when users want to transcribe songs, generate LRC files, or extract lyrics with timestamps from audio.
🦀 ClawHub1.1k dl
Youtube Transcript Api
Extract, transcribe, and translate YouTube video transcripts using the YouTubeTranscript.dev V2 API. Supports captions, ASR audio transcription, batch proces...
🦀 ClawHub1.1k dl
Youtube Knowledge Extractor
Multimodal YouTube video analysis through both audio (transcript) and visual (frame extraction + image analysis) channels. Especially powerful for HowTo vide...
🦀 ClawHub1.1k dl
小红书视频下载器
Download and summarize Xiaohongshu (小红书/RedNote) videos. Produces a full resource pack with video, audio, subtitles, transcript, and AI summary. This skill s...
🦀 ClawHub1.1k dl
Speech to Text Transcription
Transcribe audio and video files to text with speaker detection, timestamps, and format conversion.
🦀 ClawHub954 dl
Pine Voice
Give your agent a real phone. It dials, waits on hold, negotiates your bills, and returns a full transcript.
🦀 ClawHub951 dl
Video Analyzer
Download videos, extract transcripts, capture frames. Analyze YouTube, tutorials, DD videos with yt-dlp + Whisper + ffmpeg.
🦀 ClawHub739 dl
MH openai-whisper-api
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
🦀 ClawHub730 dl
ton
Ton namespace for Netsnek e.U. audio and media processing tools. Handles audio transcription, format conversion, waveform analysis, and podcast production wo...
🦀 ClawHub690 dl
Timeless.day Meeting Notes
Query and manage Timeless meetings, rooms, transcripts, and AI documents. Capture podcast episodes and YouTube videos into Timeless for transcription. Use wh...