BytesAgain

Find the Right AI Skill for Any Job

Browse 12+ curated AI agent skills. Search by use case, filter by category, get the right tool instantly.


All Skills

12 skills total matching "Transcribe"

🦀 ClawHub 19.3k dl
YouTube Transcript
Fetch and summarize YouTube video transcripts. Use when asked to summarize, transcribe, or extract content from YouTube videos. Handles transcript fetching via residential IP proxy to bypass YouTube's cloud IP blocks.
🦀 ClawHub 7.7k dl
Video Subtitles
Generate SRT subtitles from video/audio with translation support. Transcribes Hebrew (ivrit.ai) and English (whisper), translates between languages, burns subtitles into video. Use for creating captions, transcripts, or hardcoded subtitles for WhatsApp/social media.
🦀 ClawHub 5.3k dl
Voice Transcribe
Transcribe audio files using OpenAI's gpt-4o-mini-transcribe model with vocabulary hints and text replacements. Requires uv (https://docs.astral.sh/uv/).
🦀 ClawHub 4.4k dl
TubeScribe
YouTube video summarizer with speaker detection, formatted documents, and audio output. Works out of the box with macOS built-in TTS. Optional recommended tools (pandoc, ffmpeg, mlx-audio) enhance quality. Requires internet for YouTube access. No paid APIs or subscriptions. Use when user sends a YouTube URL or asks to summarize/transcribe a YouTube video.
🦀 ClawHub 4.2k dl
Transcript
Get transcripts from any YouTube video — for summarization, research, translation, quoting, or content analysis. Use when the user shares a video link or asks "what did they say", "get the transcript", "transcribe this video", "summarize this video", or wants to analyze spoken content.
🦀 ClawHub 3.5k dl
Vocal Chat
Handles voice-to-voice conversations on WhatsApp. Automatically transcribes incoming audio and responds with local TTS audio. Use when the user wants to "talk" instead of type.
🦀 ClawHub 3.1k dl
Transcribee 🐝
Transcribe YouTube videos and local audio/video files with speaker diarization. Use when user asks to transcribe a YouTube URL, podcast, video, or audio file. Outputs clean speaker-labeled transcripts ready for LLM analysis.
🦀 ClawHub 3.1k dl
AudioPod
Use AudioPod AI's API for audio processing tasks including AI music generation (text-to-music, text-to-rap, instrumentals, samples, vocals), stem separation, text-to-speech, noise reduction, speech-to-text transcription, speaker separation, and media extraction. Use when the user needs to generate music/songs/rap from text, split a song into stems/vocals/instruments, generate speech from text, clean up noisy audio, transcribe audio/video, or extract audio from YouTube/URLs. Requires AUDIOPOD_API
🦀 ClawHub 3.0k dl
Gemini STT
Transcribe audio files using Google's Gemini API or Vertex AI
🦀 ClawHub 2.9k dl
AssemblyAI advanced speech transcription
Transcribe, diarise, translate, post-process, and structure audio/video with AssemblyAI. Use this skill when the user wants AssemblyAI specifically, needs hi...
🦀 ClawHub 2.9k dl
Agentic Calling
Enable AI agents to autonomously make, receive, transcribe, route, and record phone calls using Twilio with customizable voice messages and IVR support.
🦀 ClawHub 2.7k dl
Elevenlabs Integration with Openclaw
ClawVox - ElevenLabs voice studio for OpenClaw. Generate speech, transcribe audio, clone voices, create sound effects, and more.
🦀 ClawHub 2.6k dl
Speech is Cheap Transcribe
Fast, affordable automatic speech-to-text transcription supporting 100 languages, speaker diarization, word timestamps, and customizable output formats.
🦀 ClawHub 2.5k dl
Speech To Text
Transcribe audio to text with Whisper models via inference.sh CLI. Models: Fast Whisper Large V3, Whisper V3 Large. Capabilities: transcription, translation,...
🦀 ClawHub 2.5k dl
Walkie-Talkie Mode
Handles voice-to-voice conversations on WhatsApp. Automatically transcribes incoming audio and responds with local TTS audio. Use when the user wants to "talk" instead of type.
🦀 ClawHub 2.1k dl
whatsappVoiceOpenSkill
Real-time WhatsApp voice message processing. Transcribe voice notes to text via Whisper, detect intent, execute handlers, and send responses. Use when building conversational voice interfaces for WhatsApp. Supports English and Hindi, customizable intents (weather, status, commands), automatic language detection, and streaming responses via TTS.
🦀 ClawHub 1.9k dl
Transcribe Audio with Parakeet MLX
Local speech-to-text with Parakeet MLX (ASR) for Apple Silicon (no API key).
🦀 ClawHub 1.9k dl
Cult Of Carcinization
Give your agent a voice — and ears. The Cult of Carcinization is the bot-first gateway to ScrappyLabs TTS and STT. Speak with 20+ voices, design your own from a text description, transcribe audio to text, and evolve into a permanent bot identity. No human signup required.
🦀 ClawHub 1.7k dl
Walkie-Talkie Mode
Handles voice-to-voice conversations on WhatsApp. Automatically transcribes incoming audio and responds with local TTS audio. Use when the user wants to "talk" instead of type.
🦀 ClawHub 1.6k dl
Telegram Voice To Voice Macos
Telegram voice-to-voice for macOS Apple Silicon: transcribe inbound .ogg voice notes with yap (Speech.framework) and reply with Telegram voice notes via say+ffmpeg. Not compatible with Linux/Windows.
🦀 ClawHub 1.5k dl
Walkie-Talkie Mode
Handles voice-to-voice conversations on WhatsApp. Automatically transcribes incoming audio and responds with local TTS audio. Use when the user wants to "talk" instead of type.
🦀 ClawHub 1.5k dl
Audio Transcribe
Auto-transcribe voice messages locally using faster-whisper with selectable Whisper models, no API key required.
🦀 ClawHub 1.3k dl
Whisper Transcribe
Transcribe audio files to text using OpenAI Whisper. Supports speech-to-text with auto language detection, multiple output formats (txt, srt, vtt, json), batch processing, and model selection (tiny to large). Use when transcribing audio recordings, podcasts, voice messages, lectures, meetings, or any audio/video file to text. Handles mp3, wav, m4a, ogg, flac, webm, opus, aac formats.
🦀 ClawHub 1.3k dl
AssemblyAI Transcriber
Transcribe audio files with speaker diarization (who speaks when). Supports 100+ languages, automatic language detection, and timestamps. Use for meetings, interviews, podcasts, or voice messages. Requires AssemblyAI API key.
🦀 ClawHub 1.3k dl
Speechall command-line tool for fast speech-to-text transcription using multiple providers
Install and use the speechall CLI tool for speech-to-text transcription. Use when the user wants to: (1) transcribe audio or video files to text, (2) install speechall on macOS or Linux, (3) list available STT models and their capabilities, (4) use speaker diarization, subtitles, or other transcription features from the terminal. Triggers on mentions of speechall, audio transcription CLI, or speech-to-text from the command line.
🦀 ClawHub 1.2k dl
Doubao Asr
Transcribe recorded audio files to text via Doubao Seed-ASR 2.0 (豆包录音文件识别模型2.0) from ByteDance/Volcengine. Best-in-class Chinese speech recognition with spea...
🦀 ClawHub 1.1k dl
Whisper STT
Free local speech-to-text transcription using OpenAI Whisper. Transcribe audio files (mp3, wav, m4a, ogg, etc.) to text without API costs. Use when: (1) User...
🦀 ClawHub 946 dl
Free Groq Voice Recognition
FREE voice recognition using Groq's complimentary Whisper API. Transcribe audio messages to text in 50+ languages at no cost. Perfect for voice-to-text autom...
🦀 ClawHub 939 dl
Faster Whisper Transcription
Transcribes local voice messages to text using Faster Whisper models for fast, privacy-focused speech recognition on audio files.
🦀 ClawHub 857 dl
Instagram Reels
Download Instagram Reels, transcribe audio, and extract captions. Share a reel URL and get back a full transcript with the original description.
🦀 ClawHub 854 dl
Telnyx Stt
Transcribe audio files to text using Telnyx Speech-to-Text API. Use when you need to convert audio recordings, voice messages, or spoken content to text.
🦀 ClawHub 798 dl
Zhipu Asr
Automatic Speech Recognition (ASR) using Zhipu AI (BigModel) GLM-ASR model. Use when you need to transcribe audio files to text. Supports Chinese audio trans...
🦀 ClawHub 717 dl
Youtube Transcription Generator
Use VLM Run (vlmrun) to generate transcriptions from YouTube videos. Download a video with yt-dlp, then run vlmrun to transcribe with optional timestamps. VLMRUN_API_KEY must be in .env; follow vlmrun-cli-skill for CLI setup and options.
🦀 ClawHub 705 dl
Pocket AI Integration
Transcribe, index, and semantically search all voice recordings, extracting action items and meeting insights for comprehensive conversation intelligence.
🦀 ClawHub 702 dl
Speech to Text Transcription
Transcribe audio and video files to text with speaker detection, timestamps, and format conversion.
🦀 ClawHub 647 dl
Bili Summary
Download Bilibili videos, extract or transcribe subtitles, and generate AI-powered detailed summaries using Gemini 2.5 Flash.
🦀 ClawHub 620 dl
openclaw-voice
Transcribe audio to text and generate spoken AI responses using Whisper and ElevenLabs via CLI with transcript storage and search.
🦀 ClawHub 580 dl
YouTube Transcript Pipeline Lite
Run a lightweight YouTube transcript workflow: transcribe, attribution cleanup, translation, and packaging with minimal tooling. Use for repeatable transcrip...
🦀 ClawHub 549 dl
Bilibili Video-to-Text & Summary Tool (B站视频转文字&总结神器) - Bilibili video transcribe&summary
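Use when the user provides a Bilibili video link, BV number, or b23.tv short link and wants to transcribe, extract subtitles from, summarize, or analyze the video. First check the Node.js environment and SILICONFLOW_API_KEY, and try the official subtitles first; if there are no subtitles, get the anonymous audio URL, download it as .m4s, and rename it directly to .mp3 with no transcoding; when an API key is available, call 硅基...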
🦀 ClawHub 544 dl
Voice
Voice communication via Telegram. Automatically transcribes incoming voice messages using faster-whisper and replies with TTS voice. Use for all voice-relate...
🦀 ClawHub 542 dl
VoiceClaw
Local voice I/O for OpenClaw agents. Transcribe inbound audio/voice messages using local Whisper (whisper.cpp) and generate voice replies using local Piper T...
🦀 ClawHub 505 dl
Video Transcribe - Video to Text (视频转文字)
Local video-to-text transcription using OpenAI Whisper for speech recognition. Completely free, runs offline, and protects privacy.
🦀 ClawHub 464 dl
Telegram Voice Transcribe
Transcribe Telegram voice messages and audio notes into text using the OpenAI Whisper API. Use when (1) a user sends a voice message or audio note via Telegr...
🦀 ClawHub 458 dl
Volcengine STT
Transcribe audio to text using Volcano Engine (Volcengine/ARK) speech-to-text APIs. Use when the user wants to replace Whisper/OpenAI STT with Volcengine, tr...
🦀 ClawHub 444 dl
Gettr Transcribe
Download audio from a GETTR post or streaming page and transcribe it locally with MLX Whisper on Apple Silicon (with timestamps via VTT). Use when given a GE...
🦀 ClawHub 439 dl
ElevenLabs STT OpenClaw
Transcribe audio files with ElevenLabs Speech-to-Text (Scribe v2) from the local CLI. Supports diarization, events, JSON output, webhooks, and advanced STT o...
🦀 ClawHub 428 dl
Facticity.AI Complete Integration
Complete Facticity.AI integration - fact-check claims, extract claims from content, transcribe links, check link reliability, check credits, and monitor task...
🦀 ClawHub 421 dl
MH openai-whisper-api
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).