BytesAgain

Find the Right AI Skill for Any Job

Browse 2,189+ curated AI agent skills. Search by use case, filter by category, get the right tool instantly.


All Skills — audio

2,189 skills in "audio"

🦀 ClawHub
MLX STT
Local speech-to-text with MLX (Apple Silicon) and open-source models (default: GLM-ASR-Nano-2512).
🦀 ClawHub
Telegram Voice Transcribe
Transcribe Telegram voice messages and audio notes into text using the OpenAI Whisper API. Use when (1) a user sends a voice message or audio note via Telegr...
🦀 ClawHub
Local Tts Workflow
OpenClaw text-to-speech workflow for an OpenAI-compatible TTS server, including remote/self-hosted deployments such as vLLM Omni. Use when configuring, testi...
🦀 ClawHub
Telnyx Stt
Transcribe audio files to text using Telnyx Speech-to-Text API. Use when you need to convert audio recordings, voice messages, or spoken content to text.
🦀 ClawHub
Giggle Generation Music
Use when the user wants to create, generate, or compose music—whether from text description, custom lyrics, or instrumental background music. Triggers: gener...
GitHub
tinytag
A library for reading music metadata of MP3, OGG, FLAC and Wave files.
🦀 ClawHub
iFlytek Ultra-Realistic TTS
iFlytek Ultra-Realistic TTS (超拟人语音合成) — synthesize natural, expressive speech from text using iFlytek's ultra-realistic voice synthesis API. Supports 50+ voi...
🦀 ClawHub
summarizenew
Summarize URLs or files with the summarize CLI (web, PDFs, images, audio, YouTube).
🦀 ClawHub
fun-voice-type
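A voice input method plugin. Built on Alibaba Cloud FunASR real-time speech recognition, it lets users hold down a hotkey (the Right Option key) to convert speech to text and "type" it into whichever input field currently holds the cursor. It can also translate speech into multiple languages (e.g., Chinese, English, Japanese, Korean).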
🦀 ClawHub
Local Voice Agent
Complete offline voice-to-voice AI assistant for OpenClaw (Whisper.cpp STT + Pocket-TTS). 100% local processing, no cloud APIs, no costs. Use for hands-free...
🦀 ClawHub
Gemini Live Phone
Bridge Twilio phone calls to Google Gemini Live API for real-time AI voice conversations. No STT/TTS middleware required. Includes VAD and echo suppression.
🦀 ClawHub
Picasso TikTok
Full TikTok/Reels video pipeline: script → TTS voiceover (ElevenLabs) → HeyGen talking avatar → auto-subtitles (Whisper) → ffmpeg compose → 1080x1920 final v...
🦀 ClawHub
Lark (Feishu) Voice
Send voice messages on Lark (Feishu) by converting text to speech. Use when the user asks to send a voice message or reply with voice.
🦀 ClawHub
Ai Podcast Clip Editor
You recorded a two-hour conversation. Somewhere in there is the three minutes that will make someone stop scrolling, listen, and subscribe. Finding it means...
🦀 ClawHub
Book Writing
Plan, draft, and revise complete books with chapter architecture, voice consistency, and finish-ready revision workflows.
🦀 ClawHub
Audio Command Executor
Processes inbound audio files, transcribes them, and responds to the resulting text. Converts non-WAV inputs to WAV before transcription.
🦀 ClawHub
Lyric Video Maker
Turn your audio tracks and footage into polished lyric videos that captivate viewers from the first beat. This lyric-video-maker skill overlays synchronized,...
🦀 ClawHub
Book Summary
Generate podcast-style audio scripts summarizing books with 3 key ideas, actionable takeaways, and estimated duration for single-narrator delivery.
🦀 ClawHub
An OpenClaw skill for AI-powered multimedia generation (image, video, audio, 3D) via 170+ RunningHub API endpoints — zero dependencies, pure Python.
Generate images, videos, audio, and 3D models via RunningHub API (170+ endpoints) and run any RunningHub AI Application (custom ComfyUI workflow) by webappId...
🦀 ClawHub
Byt Workflow
YouTube video translation workflow: download audio, launch Doubao, play audio, capture translation.
🦀 ClawHub
Youtube Audio Download
Download YouTube video audio and convert to MP3. Supports age-restricted videos with cookies.
🦀 ClawHub
Pub Gemini
Gemini CLI for one-shot Q&A, summaries, and generation. Also provides 50+ models for image generation, video generation, text-to-speech, speech-to-text, music...
🦀 ClawHub
Indextts Voice
IndexTTS voice cloning and synthesis skill - create voice models, text-to-speech, reference audio management (requires an enterprise membership)
🦀 ClawHub
doubao-tts
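Converts text into speech audio files using the Doubao (Volcano Engine) speech synthesis large-model API. Supports cloned voices (voice IDs beginning with S_) and official preset voices. Always use this skill when the user makes a related request such as "speech synthesis", "text to speech", "TTS", "read the text aloud", "generate speech", "read it in my voice", "Doubao voice", or "voice clone synthesis". Even if the user only says "help me turn...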
🦀 ClawHub
LrshuAI Text To Speech
Text-to-speech skill. Invoke this skill when you need to convert text into natural human-voice narration.
🦀 ClawHub
Seedance Cog
Seedance × CellCog. ByteDance's #1 video model meets the frontier of multi-agent coordination — CellCog orchestrates Seedance with scripting, voice synthesis...
🦀 ClawHub
Ai Video Gen
End-to-end AI video generation - create videos from text prompts using image generation, video synthesis, voice-over, and editing. Supports OpenAI DALL-E, Replicate models, LumaAI, Runway, and FFmpeg editing.
🦀 ClawHub
LrshuAI Voice Clone
Voice cloning skill. Invoke this skill when you need to supply a reference audio clip and generate new audio spoken in that voice.
🦀 ClawHub
Church Sermon Video
Your Sunday sermon was recorded on three cameras and a phone. The raw footage is four hours across four files, the audio from the lapel mic is better than th...
🦀 ClawHub
Whisper STT
Free local speech-to-text transcription using OpenAI Whisper. Transcribe audio files (mp3, wav, m4a, ogg, etc.) to text without API costs. Use when: (1) User...
🦀 ClawHub
Narrator Ai Cli
Create AI-narrated film/drama commentary videos via CLI. Two workflow paths (Original & Adapted narration), 93 movies, 146 BGM tracks, 63 dubbing voices in 1...
🦀 ClawHub
Boxed FFmpeg
Audio/video information extraction, format conversion, and audio extraction using FFmpeg WASM sandbox.
🦀 ClawHub
suno-poetry-music-creator
Enhanced Suno song creator with reference song analysis and intelligent lyric optimization. Analyzes user's reference songs to extract style, mood, and struc...
🦀 ClawHub
VoiceClaw
Local voice I/O for OpenClaw agents. Transcribe inbound audio/voice messages using local Whisper (whisper.cpp) and generate voice replies using local Piper T...
🦀 ClawHub
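Xiaomi TTS Text-to-Speech
Converts text to speech. It can send voice messages, read aloud to you, sing, and speak in dialects or a cutesy high-pitched voice, with support for a range of emotions and styles.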
🦀 ClawHub
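MiniMax TTS (Mainland China Edition)
Calls the MiniMax speech synthesis API. Supports multiple Chinese voices and high-quality text-to-speech, with streaming and non-streaming audio output.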
🦀 ClawHub
Cast
Multilingual TTS via Typecast CLI with emotion control. Plays audio aloud or saves to file.
🦀 ClawHub
LrshuAI Music Generation
Music generation skill. Invoke this skill when you need to generate a complete music track from a text description or style requirements.
🦀 ClawHub
Summarize Jarvis
Summarize URLs or files with the summarize CLI (web, PDFs, images, audio, YouTube).
🦀 ClawHub
Clawhub Skill Content Ingestion
Turn any URL into structured content — YouTube videos (via Gemini Video API), web articles, PDFs, and audio files. Extract transcripts, summaries, and metada...
🦀 ClawHub
Tts Voice Ai
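AI multilingual text-to-speech tool supporting Chinese, English, Japanese, Korean, and Cantonese speech generation, dubbing, audiobooks, and voice cloning.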
🦀 ClawHub
NotebookLM Audio Generator
Automates uploading multiple sources (files, URLs, YouTube, Drive, text) to a NotebookLM notebook, generating a deep dive audio overview in a preferred langu...
🦀 ClawHub
Voice Reply
Dual-mode voice reply skill. Generates voice replies with Edge TTS (free) and transcribes voice input with Whisper.
🦀 ClawHub
Trend Mapper
Identify trending audio, viral formats, and meme templates relevant to your product category and help adapt them for ecommerce content quickly.
🦀 ClawHub
Cat Therapy
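Cross-platform soothing cat-petting skill. When the user says a trigger phrase such as "take a break", "I'm tired", "tired", or "need a break", it automatically sends a cute cat picture + cat meows (TTS audio plus text as a double safeguard) + a comforting quote. Supports user-uploaded custom cat pictures and meow sounds, and works across platforms including QQ, WeChat, DingTalk, Feishu, Discord, Telegram, WhatsApp,...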
🦀 ClawHub
Audio Transcriber Pro
Transform audio recordings into professional Markdown documentation with intelligent summaries using LLM integration
🦀 ClawHub
Prompt Refiner
Transforms casual or voice-transcribed user requests into precise, AI-optimized prompts. Handles mixed languages, vague input, and ambiguity. Reduces task ex...
🦀 ClawHub
podcast-intel
Turn your Overcast listening history into actionable intelligence. Syncs episodes, transcripts, and chapters to SQLite, then uses LLM analysis to surface ins...
Page 4 / 46 (2,189 skills)