BytesAgain

Find the Right AI Skill for Any Job

Browse 166+ curated AI agent skills. Search by use case, filter by category, get the right tool instantly.

All Skills — audio

166 skills in "audio" matching "transcribe"

🦀 ClawHub
Summarize
Summarize or extract text/transcripts from URLs, podcasts, and local files (great fallback for “transcribe this YouTube/video”).
🦀 ClawHub
Phone Voice Agent
Run a real-time AI phone agent using Twilio, Deepgram, and ElevenLabs. Handles incoming calls, transcribes audio, generates responses via LLM, and speaks back via streaming TTS. Use when user wants to: (1) Test voice AI capabilities, (2) Handle phone calls programmatically, (3) Build a conversational voice bot.
🦀 ClawHub
Bilibili Notion Pipeline Skill
Skill-first Bilibili to Notion pipeline. Download a Bilibili/b23 video, transcribe audio, upload the mp4, create or update a Notion transcript page, write tr...
🦀 ClawHub
Video Transcribe - 视频转文字
Local video-to-text: speech recognition with OpenAI Whisper. Completely free, runs offline, and protects privacy.
🦀 ClawHub
Video Subtitles
Generate SRT subtitles from video/audio with translation support. Transcribes Hebrew (ivrit.ai) and English (whisper), translates between languages, burns subtitles into video. Use for creating captions, transcripts, or hardcoded subtitles for WhatsApp/social media.
🦀 ClawHub
AudioPod
Use AudioPod AI's API for audio processing tasks including AI music generation (text-to-music, text-to-rap, instrumentals, samples, vocals), stem separation, text-to-speech, noise reduction, speech-to-text transcription, speaker separation, and media extraction. Use when the user needs to generate music/songs/rap from text, split a song into stems/vocals/instruments, generate speech from text, clean up noisy audio, transcribe audio/video, or extract audio from YouTube/URLs. Requires AUDIOPOD_API
🦀 ClawHub
Crypto Alert
Download YouTube videos and transcribe audio using local Whisper. Use when you need to extract text from YouTube videos that don't have subtitles, or when yo...
🦀 ClawHub
Qqbot Voice Transcribe
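Automatic recognition of QQ Bot voice messages, v2.0. Automatically decodes the QQ Silk V3 format, transcribes with the Whisper medium model, and includes Gateway integration and a user confirmation flow.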
🦀 ClawHub
Video To Text
Video to text converter. Downloads videos from Bilibili using bilibili-api, from other sites using yt-dlp, then transcribes audio using faster-whisper. Use w...
🦀 ClawHub
Bilibili Transcriber
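Bilibili video-to-text summarization expert. Supports dual-engine transcription: cloud (Alibaba Cloud Paraformer) and local (faster-whisper). When the user provides a Bilibili video URL, it automatically downloads the audio, transcribes it to text, and generates a structured summary. Supports BV IDs and full URLs.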
🦀 ClawHub
Local Transcription
Local speech-to-text transcription with Qwen ASR — transcription routed across your Apple Silicon fleet. Transcribe meetings, voice notes, podcasts with loca...
🦀 ClawHub
Telegram Voice Transcribe
Transcribe Telegram voice messages and audio notes into text using the OpenAI Whisper API. Use when (1) a user sends a voice message or audio note via Telegr...
🦀 ClawHub
Telnyx Stt
Transcribe audio files to text using Telnyx Speech-to-Text API. Use when you need to convert audio recordings, voice messages, or spoken content to text.
🦀 ClawHub
Audio Command Executor
Processes inbound audio files, transcribes them, and responds to the resulting text. Converts non-WAV inputs to WAV before transcription.
🦀 ClawHub
Whisper STT
Free local speech-to-text transcription using OpenAI Whisper. Transcribe audio files (mp3, wav, m4a, ogg, etc.) to text without API costs. Use when: (1) User...
🦀 ClawHub
VoiceClaw
Local voice I/O for OpenClaw agents. Transcribe inbound audio/voice messages using local Whisper (whisper.cpp) and generate voice replies using local Piper T...
🦀 ClawHub
Audio Transcriber Pro
Transform audio recordings into professional Markdown documentation with intelligent summaries using LLM integration
🦀 ClawHub
Prompt Refiner
Transforms casual or voice-transcribed user requests into precise, AI-optimized prompts. Handles mixed languages, vague input, and ambiguity. Reduces task ex...
🦀 ClawHub
yapp
Receive and engage with transcribed voice memos from Yapp, a voice journaling app, capturing raw, unedited speech-to-text recordings with metadata.
🦀 ClawHub
Cloudflare Whisper Worker
Transcribe audio using a deployed Cloudflare Worker Whisper endpoint. Use when converting voice/audio files (wav, mp3, m4a, ogg, webm) to text through the cu...
🦀 ClawHub
Groq Voice Transcriber
Automatically transcribes Telegram voice messages using Groq Whisper API and replies with text generated by an LLM.
🦀 ClawHub
ANY WHISPER API
Transcribe audio via the Whisper API using any compatible local server.
🦀 ClawHub
Voice-to-Protocol Transcriber
Record experimental procedures and observations via voice commands during lab work. Real-time transcription for structured experiment documentation.
🦀 ClawHub
Audio Intelligence Mcp
Transcribe, summarize, and analyze audio files using local Whisper + Qwen. Returns transcript, segments, and action items.
🦀 ClawHub
WebChat Voice GUI
Voice input and microphone button for OpenClaw WebChat Control UI. Adds a mic button to chat, records audio via browser MediaRecorder, transcribes locally vi...
🦀 ClawHub
asr-skill
This skill should be used when the user asks to "transcribe audio", "transcribe video", "convert speech to text", "generate subtitles", "create captions", "i...
🦀 ClawHub
Walkie-Talkie Mode
Handles voice-to-voice conversations on WhatsApp. Automatically transcribes incoming audio and responds with local TTS audio. Use when the user wants to "talk" instead of type.
🦀 ClawHub
Openai Whisper Api
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
🦀 ClawHub
OpenRouter Audio
Audio transcription and text-to-speech generation using OpenRouter API. Use when the user needs to transcribe audio files to text or generate speech/audio fr...
🦀 ClawHub
musa-torch-coding
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
🦀 ClawHub
Video Caption Generator
The video-caption-generator skill transcribes spoken audio from your video and burns accurate, readable captions directly into the output file. Upload any cl...
🦀 ClawHub
TubeScribe
YouTube video summarizer with speaker detection, formatted documents, and audio output. Works out of the box with macOS built-in TTS. Optional recommended tools (pandoc, ffmpeg, mlx-audio) enhance quality. Requires internet for YouTube access. No paid APIs or subscriptions. Use when user sends a YouTube URL or asks to summarize/transcribe a YouTube video.
🦀 ClawHub
Doubao Asr
Transcribe recorded audio files to text via Doubao Seed-ASR 2.0 (Doubao recorded-audio file recognition model 2.0) from ByteDance/Volcengine. Best-in-class Chinese speech recognition with spea...
🦀 ClawHub
Meeting Summarizer
Transcribe meetings with SenseAudio ASR speaker diarization, timestamps, and meeting-note extraction workflows. Use when users need meeting transcription, me...
🦀 ClawHub
Vocal Chat
Handles voice-to-voice conversations on WhatsApp. Automatically transcribes incoming audio and responds with local TTS audio. Use when the user wants to "talk" instead of type.
🦀 ClawHub
Speech is Cheap Transcribe
Fast, affordable automatic speech-to-text transcription supporting 100 languages, speaker diarization, word timestamps, and customizable output formats.
🦀 ClawHub
Elevenlabs Integration with Openclaw
ClawVox - ElevenLabs voice studio for OpenClaw. Generate speech, transcribe audio, clone voices, create sound effects, and more.
🦀 ClawHub
Telegram Voice Bot
Telegram bot that transcribes voice messages using Whisper and replies in Chinese with Microsoft Edge text-to-speech.
🦀 ClawHub
Funasr Transcribe Skill
Use when the user needs local speech-to-text transcription for audio files, especially Chinese or mixed Chinese-English audio, without relying on cloud trans...
🦀 ClawHub
Whisper Transcriber
Offline speech-to-text (ASR) using whisper.cpp (whisper-cli) + ffmpeg. Supports batch transcription, timestamps, SRT/TXT/JSON outputs, and model download. Cr...
🦀 ClawHub
Voice
Voice communication via Telegram. Automatically transcribes incoming voice messages using faster-whisper and replies with TTS voice. Use for all voice-relate...
🦀 ClawHub
deAPI AI Media Suite (Community)
The cheapest AI media API on the market. Generate images (Flux), music (AceStep), speech with voice cloning, transcribe video/audio, OCR, video generation, b...
🦀 ClawHub
Speechall command-line tool for fast speech-to-text transcription using multiple providers
Install and use the speechall CLI tool for speech-to-text transcription. Use when the user wants to: (1) transcribe audio or video files to text, (2) install speechall on macOS or Linux, (3) list available STT models and their capabilities, (4) use speaker diarization, subtitles, or other transcription features from the terminal. Triggers on mentions of speechall, audio transcription CLI, or speech-to-text from the command line.
🦀 ClawHub
video-transcriber
Transcribe speech from videos
🦀 ClawHub
Whisper Transcribe
Transcribe audio files to text using OpenAI Whisper. Supports speech-to-text with auto language detection, multiple output formats (txt, srt, vtt, json), batch processing, and model selection (tiny to large). Use when transcribing audio recordings, podcasts, voice messages, lectures, meetings, or any audio/video file to text. Handles mp3, wav, m4a, ogg, flac, webm, opus, aac formats.
🦀 ClawHub
🎤 Transcribe audio files using Qwen ASR. 千问STT
Transcribe audio files using Qwen ASR (千问STT). Use when the user sends voice messages and wants them converted to text.
🦀 ClawHub
whatsappVoiceOpenSkill
Real-time WhatsApp voice message processing. Transcribe voice notes to text via Whisper, detect intent, execute handlers, and send responses. Use when building conversational voice interfaces for WhatsApp. Supports English and Hindi, customizable intents (weather, status, commands), automatic language detection, and streaming responses via TTS.