Browse AI Agent Skills | BytesAgain

All Skills

210 skills total matching "Transcribe"

🦀 ClawHub25.6k dl

Openai Whisper Api

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

⭐ GitHub⭐ 35.1k

audio-transcriber

Installable GitHub library of 1,400+ agentic skills for Claude Code, Cursor, Codex CLI, Gemini CLI, Antigravity, and more. Includes installer CLI, bundles, workflows, and official/community skill collections.

🦀 ClawHub6.3k dl

Voice Transcribe

Transcribe audio files using OpenAI's gpt-4o-mini-transcribe model with vocabulary hints and text replacements. Requires uv (https://docs.astral.sh/uv/).

🦀 ClawHub4.0k dl

Handles voice-to-voice conversations on WhatsApp. Automatically transcribes incoming audio and responds with local TTS audio. Use when the user wants to "talk" instead of type.

🦀 ClawHub3.8k dl

Agentic Calling

Enable AI agents to autonomously make, receive, transcribe, route, and record phone calls using Twilio with customizable voice messages and IVR support.

🦀 ClawHub3.7k dl

video-transcript

Use when video content needs to be extracted as text: pasted YouTube links or IDs, requests to transcribe, summarize, quote, translate, convert video to text...

🦀 ClawHub3.5k dl

Transcribee 🐝

Transcribe YouTube videos and local audio/video files with speaker diarization. Use when user asks to transcribe a YouTube URL, podcast, video, or audio file. Outputs clean speaker-labeled transcripts ready for LLM analysis.

🦀 ClawHub3.2k dl

Speech is Cheap Transcribe

Fast, affordable automatic speech-to-text transcription supporting 100 languages, speaker diarization, word timestamps, and customizable output formats.

⭐ GitHub⭐ 3.9k

Edit any video by conversation. Transcribe, cut, color grade, generate overlay animations, burn subtitles — for talking heads, montages, tutorials, travel, interviews. No presets, no menus. Ask questions, confirm the plan, execute, iterate, persist. Production-correctness rules are hard; everything else is artistic freedom.

🦀 ClawHub3.0k dl

Helps you send voice messages to your AI assistant and can also make it talk

Text-to-Speech and Speech-to-Text using ElevenLabs AI. Use when the user wants to convert text to speech, transcribe voice messages, or work with voice in multiple languages. Supports high-quality AI voices and accurate transcription.

⭐ GitHub⭐ 392

audio-transcription

Transcribe audio and video files into structured notes. Activate this skill when users want to transcribe recordings, meetings, podcasts, voice memos, or any audio/video content in their vault.

🦀 ClawHub2.7k dl

Local speech-to-text with NVIDIA Parakeet TDT 0.6B v3 (ONNX on CPU). 30x faster than Whisper, 25 languages, auto-detection, OpenAI-compatible API. Use when transcribing audio files, converting speech to text, or processing voice recordings locally without cloud APIs.

⭐ GitHub⭐ 376

ElevenLabs speech-to-text with Scribe models and forced alignment via inference.sh CLI. Models: Scribe v1/v2 (98%+ accuracy, 90+ languages). Capabilities: transcription, speaker diarization, audio event tagging, word-level timestamps, forced alignment, subtitle generation. Use for: meeting transcription, subtitles, podcast transcripts, lip-sync timing, karaoke. Triggers: elevenlabs stt, elevenlabs transcription, scribe, elevenlabs speech to text, forced alignment, word alignment, subtitle timing, diarization, speaker identification, audio event detection, eleven labs transcribe

🦀 ClawHub2.2k dl

Transcribe Audio with Parakeet MLX

Local speech-to-text with Parakeet MLX (ASR) for Apple Silicon (no API key).

⭐ GitHub⭐ 358

english-to-katakana-transcription

Transcribes English sentences into Japanese Katakana characters based on phonetic syllables without translating the meaning.

🦀 ClawHub1.9k dl

Walkie-Talkie Mode

Handles voice-to-voice conversations on WhatsApp. Automatically transcribes incoming audio and responds with local TTS audio. Use when the user wants to "talk" instead of type.

🦀 ClawHub1.8k dl

AssemblyAI Transcriber

Transcribe audio files with speaker diarization (who speaks when). Supports 100+ languages, automatic language detection, and timestamps. Use for meetings, interviews, podcasts, or voice messages. Requires AssemblyAI API key.

🦀 ClawHub1.8k dl

Video Analyzer (TikTok + YouTube + Instagram)

Analyze videos from TikTok, YouTube, Instagram, Twitter, and others by URL, transcribing audio locally and answering questions about the content.

🦀 ClawHub1.7k dl

🎤 Transcribe audio files using Qwen ASR (Qianwen STT)

Transcribe audio files using Qwen ASR (Qianwen STT). Use when the user sends voice messages and wants them converted to text.

🦀 ClawHub1.5k dl

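Bilibili video-to-text and summary tool - Bilibili video transcribe&summary

Use when the user provides a Bilibili video link, BV ID, or b23.tv short link and wants to transcribe the video, extract its subtitles, summarize it, or analyze its content. First check the Node.js environment and SILICONFLOW_API_KEY, and try the official subtitles first; if there are none, fetch the anonymous audio URL, download it as .m4s, and rename it directly to .mp3 with no transcoding; when an API key is available, call SiliconFlow...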

🦀 ClawHub1.4k dl

Transcribe audio files to text using Telnyx Speech-to-Text API. Use when you need to convert audio recordings, voice messages, or spoken content to text.

🦀 ClawHub1.2k dl

Read, analyze, convert, trim, merge, adjust volume, and transcribe audio files in multiple formats including MP3, WAV, FLAC, AAC, OGG, and more.

🦀 ClawHub1.2k dl

Voice communication via Telegram. Automatically transcribes incoming voice messages using faster-whisper and replies with TTS voice. Use for all voice-relate...

🦀 ClawHub1.1k dl

Speech to Text Transcription

Transcribe audio and video files to text with speaker detection, timestamps, and format conversion.

🦀 ClawHub1.1k dl

Transcribe audio files using Sber Salute Speech async API. Russian-first STT with support for ru-RU, en-US, kk-KZ, ky-KG, uz-UZ.

🦀 ClawHub1.0k dl

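Douyin Video Smart Assistant

Douyin video smart assistant. When the user sends a Douyin link or video file, it transcribes the video automatically and processes the result (summary / verbatim transcript / archive / discussion). Trigger words: 抖音 (Douyin), douyin.com, 转文字 (convert to text), 转录 (transcribe), 视频转文本 (video to text), douyin, transcribe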

🦀 ClawHub917 dl

Captures ambient audio from wearable devices, transcribes locally, and streams searchable, speaker-tagged conversation data to your OpenClaw agent.

🦀 ClawHub896 dl

WebChat Voice GUI

Voice input and microphone button for OpenClaw WebChat Control UI. Adds a mic button to chat, records audio via browser MediaRecorder, transcribes locally vi...

🦀 ClawHub865 dl

Local voice I/O for OpenClaw agents. Transcribe inbound audio/voice messages using local Whisper (whisper.cpp) and generate voice replies using local Piper T...

🦀 ClawHub858 dl

ANY WHISPER API

Transcribe audio via the Whisper API with any compatible local server.

🦀 ClawHub844 dl

Video Transcribe

Use when the user wants to transcribe, caption, or get the text content of a video or audio file — e.g. "transcribe this video", "get the transcript", "what...

🦀 ClawHub821 dl

Knowledge Base Collector - save YouTube, URLs, text to Obsidian with AI summarization. Auto-transcribes videos, fetches pages, supports weekly/monthly digest...

🦀 ClawHub809 dl

Video to text converter. Downloads videos from Bilibili using bilibili-api, from other sites using yt-dlp, then transcribes audio using faster-whisper. Use w...

🦀 ClawHub800 dl

Transcribe audio to text using Volcano Engine (Volcengine/ARK) speech-to-text APIs. Use when the user wants to replace Whisper/OpenAI STT with Volcengine, tr...

🦀 ClawHub764 dl

Facticity.AI Complete Integration

Complete Facticity.AI integration - fact-check claims, extract claims from content, transcribe links, check link reliability, check credits, and monitor task...

🦀 ClawHub737 dl

Transcribe audio via Groq API (~10x cheaper than OpenAI API)

Transcribe audio via Groq Automatic Speech Recognition (ASR) Models (Whisper).

🦀 ClawHub717 dl

Audio Transcribe

This skill should be used when the user explicitly asks to "transcribe a meeting", "transcribe audio", "transcribe a meeting recording", "convert audio to te...

🦀 ClawHub711 dl

Subtitle Video Generator

Generate and style video subtitles in any language with AI — auto-transcribe speech to perfectly timed subtitles, translate across 50+ languages, apply trend...

🦀 ClawHub688 dl

Groq Voice Transcriber

Automatically transcribes Telegram voice messages using Groq Whisper API and replies with text generated by an LLM.

🦀 ClawHub676 dl

Fun ASR Nano Transcribe

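Speech-to-text using the lightweight Fun-ASR-Nano-2512 model. Provides fast, accurate Chinese speech recognition, prints results to the console in real time, and is optimized for CPU/GPU environments. Use cases: (1) transcribing Chinese audio files to text, (2) needing a lightweight ASR with low memory usage, (3) processing audio that contains domain-specific hotwords (medical, insurance, etc.), (4) needing high...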

🦀 ClawHub652 dl

Groq Voice Transcribe

Transcribe audio files via Groq's OpenAI-compatible speech-to-text API. Use when the user sends voice messages or audio files and you need fast cloud speech-...

🦀 ClawHub633 dl

moss-transcribe-diarize

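MOSS multi-speaker transcription skill. Accepts URL / local file / Base64 audio input and outputs structured transcripts with timestamps and speaker labels (JSON, per-segment text, per-speaker summary). Use for meeting minutes, interview recordings, and organizing multi-party conversations. Requires API credentials (environment variable: MOSS_API_KEY, also compatible with MOSI_TTS_API_KEY / MOS...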

🦀 ClawHub614 dl

Transcribe and organize voice memos with automatic categorization and information extraction. Use when users have voice notes, audio memos, or spoken notes t...