Find the Right AI Skill for Any Job
Browse 2,399+ curated AI agent skills. Search by use case, filter by category, get the right tool instantly.
All Skills — audio
2,399 skills in "audio"
🌐 All | coding | devops | api | database | security | data | research | writing | image-gen | video | audio | translation | seo | social-media | email-marketing | advertising | finance | crypto-defi | ecommerce | legal | hr | real-estate | health | education | cooking | travel | gaming | automation | communication | productivity | clawhub | lobehub | dify | mcp
🦀 ClawHub
Youtube Audio Download
Download YouTube video audio and convert to MP3. Supports age-restricted videos with cookies.
🦀 ClawHub
Windows TTS (WSL2)
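TTS that speaks out loud directly on Windows 11 (calls powershell.exe + System.Speech from WSL2/TUI). Use when the user says "say it", "read it aloud", "voice broadcast", or "use TTS", or reports "no sound", "the mp3 generated by tts is empty", or "it won't play", and when Chinese speech is needed but OpenClaw's built-in tts is unavailable.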
🦀 ClawHub
Video Transcribe - Video to Text
Local video-to-text transcription using OpenAI Whisper for speech recognition. Completely free, runs offline, and protects privacy.
⭐ GitHub
gtts
Python library and CLI tool for converting text to speech using Google Translate TTS.
🦀 ClawHub
Tiktok Comment Reply Templates
Generate conversion-focused TikTok comment replies that turn questions and objections into safe next-step actions without sounding spammy. Use when the user...
🦀 ClawHub
Azure Ai Voicelive Py
Build real-time voice AI applications using Azure AI Voice Live SDK (azure-ai-voicelive). Use this skill when creating Python applications that need real-time bidirectional audio communication with Azure AI, including voice assistants, voice-enabled chatbots, real-time speech-to-speech translation, voice-driven avatars, or any WebSocket-based audio streaming with AI models. Supports Server VAD (Voice Activity Detection), turn-based conversation, function calling, MCP tools, avatar integration, a
🦀 ClawHub
FGO Invoicing
Issue FGO.ro invoices through the FGO API with local automation. Use for FGO tasks such as validating invoice payloads, issuing invoices, checking invoice st...
🦀 ClawHub
Caranguejo
Cultural radar of Pernambuco blending football, Manguebeat, and regional music with poetic insights inspired by Recife and Olinda's vibrant heritage.
🦀 ClawHub
Pixcli Skill
Creative toolkit for AI agents — generate images, videos, voiceover, music, and sound effects, then assemble polished output via Remotion. Uses the pixcli CL...
🦀 ClawHub
minimax-media (James)
Use MiniMax API for image generation and text-to-speech (TTS). Supports image-01 model for images and speech-2.8-hd for voice synthesis. Install when needed.
🦀 ClawHub
clawr.ing
Make real phone calls. Replaces the voice-call plugin with a managed service that needs no setup. Use for wake-up calls, reminders, alerts, or when the user...
🦀 ClawHub
Voice Broadcast
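Voice broadcast control skill. Converts AI reply content into spoken audio. Triggers: (1) when the user says "朗读" (read aloud), the AI's last text reply is automatically converted to speech; (2) when the user says "开启语音播报" (turn on voice broadcast), all subsequent replies are read aloud automatically; (3) when the user says "静音" (mute), voice broadcast is paused. Use when users (especially iOS users) want to receive information by voice, or when their hands are occupied and they want replies played via TTS.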
🦀 ClawHub
Ai Voc Review Insights
AI-powered Voice of Customer (VoC) review intelligence agent using DeepSeek-style analysis. Deep semantic analysis of customer reviews to extract pain points...
⭐ GitHub
arcade
Arcade is a modern Python framework for crafting games with compelling graphics and sound.
⭐ GitHub
pydub
Manipulate audio with a simple and easy high level interface.
🦀 ClawHub
Ai Music Video Creator
Cloud-based ai-music-video-creator tool that handles generating music videos from a song and photos. Upload MP3, WAV, JPG, PNG files (up to 500MB), describe...
🦀 ClawHub
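Douyin Video Quick Transcription
Quick Douyin video-to-text transcription (optimized version). The user sends a Douyin link and the spoken script is extracted automatically. Features: local Whisper transcription, no API key required, zero cost, high privacy. Trigger words: 抖音 (Douyin), 转文字 (transcribe to text), 提取文案 (extract script), 视频转录 (video transcription)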
🦀 ClawHub
Skillboss
Swiss-knife for AI agents. 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, chat, web search, document parsing, emai...
🦀 ClawHub
Brand Voice Architect
A high-precision engine for deconstructing, documenting, and synthesizing brand-specific linguistic patterns and tonal architectures. Use this skill whenever...
🦀 ClawHub
test-summary
Summarize URLs or files with the summarize CLI (web, PDFs, images, audio, YouTube).
🦀 ClawHub
Webcodecs String Finder
Finds valid WebCodecs strings for video and audio by researching codec support tables and detailed specifications on webcodecsfundamentals.org.
🦀 ClawHub
Jetson CUDA Voice Pipeline
Fully offline, CUDA-accelerated local voice assistant pipeline for NVIDIA Jetson. Wake word (openWakeWord) → real-time VAD → whisper.cpp GPU STT → LLM → Pipe...
🦀 ClawHub
Ai Video Gen 1.0.0
End-to-end AI video generation - create videos from text prompts using image generation, video synthesis, voice-over, and editing. Supports OpenAI DALL-E, Re...
🦀 ClawHub
Ai Song Generator
Cloud-based ai-song-generator tool that handles creating original songs from text or lyrics. Upload MP4, MOV, MP3, WAV files (up to 200MB), describe what you...
🦀 ClawHub
Speech Recognition Local
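Local Speech-to-Text. Runs Whisper models locally with faster-whisper, with no API fees, completely free. Transcription is triggered automatically when a voice message (.ogg .m4a .mp3) is received; supports Chinese, English, Japanese, and automatic language detection. | Free local STT/TTS alternati...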
🦀 ClawHub
Humanizer 1.0.0
Remove signs of AI-generated writing from text. Use when editing or reviewing text to make it sound more natural and human-written. Based on Wikipedia's comp...
🦀 ClawHub
Yt Dlp
A robust CLI wrapper for yt-dlp to download videos, playlists, and audio from YouTube and thousands of other sites. Supports format selection, quality control, metadata embedding, and cookie authentication.
🦀 ClawHub
Voice Wake Say TTS Responses (Native)
Speak responses aloud on macOS using the built-in `say` command when user input indicates Voice Wake/voice recognition (for example, messages starting with "User talked via voice recognition on <device>").
⭐ GitHub
Animating VueJS with Sarah Drasner (Software Engineering Daily 01-12-2017)
Animating VueJS with Sarah Drasner (Software Engineering Daily 01-12-2017) - Podcasts
🦀 ClawHub
Evolink Media — AI Video, Image & Music Generation
AI video, image & music generation. 60+ models — Sora, Veo 3, Kling, Seedance, GPT Image, Suno v5, Hailuo, WAN. Text-to-video, image-to-video, text-to-image,...
🦀 ClawHub
Vocal Chat
Handles voice-to-voice conversations on WhatsApp. Automatically transcribes incoming audio and responds with local TTS audio. Use when the user wants to "talk" instead of type.
🦀 ClawHub
Personality Engine
Six-system behavior engine that makes any OpenClaw agent feel alive. Editorial voice injects opinions. Selective silence knows when NOT to talk. Variable tim...
🦀 ClawHub
YouOS
YouOS — local-first personal email copilot that learns your writing style from Gmail, Google Docs, and WhatsApp exports, then drafts replies in your voice. U...
🦀 ClawHub
Feishu Mood Music
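Feishu music mood companion. Detects the user's emotional state, generates matching soothing or companionship music, and sends it to a Feishu group. Three-level trigger mechanism: (1) explicit triggers (asking for a song directly): "来首歌" (play a song), "想听歌" (I want to hear a song), "来首应景的" (play something that fits the moment), "音乐治愈" (music healing), "解压音乐" (stress-relief music), "放首歌" (put on a song); (2) semi-implicit triggers (emotion word + @bot): "心情不好" (in a bad mood), "有点累" (a bit tired), "好烦" (so annoyed), "需要放松" (need to relax), "emo了" (feeling emo), ...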
🦀 ClawHub
Reddit Write
Drafts posts and comments in Luka's voice based on research and subreddit rules for manual review and posting.
🦀 ClawHub
The Clawb
DJ and VJ at The Clawb — live code music (Strudel) and audio-reactive visuals (Hydra)
🦀 ClawHub
Audio To Text Caption
Turn creator audio into clean text captions for ecommerce content and reuse. Use when teams need fast transcript-to-caption workflows.
🦀 ClawHub
Audio2Text
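Converts audio to text and provides AI summaries, recording highlights, and speaker separation. Audio to text: cloud transcription with support for multiple formats (mp3/wav/m4a, etc.). AI summary: automatically generates summaries and meeting minutes. Recording highlights: structured key points such as chapters, to-dos, and recommendations. Speaker separation: transcripts carry speaker labels, which suits meetings and conversations.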
🦀 ClawHub
Phone Caller
Make AI-powered outbound phone calls using ElevenLabs voice + GPT brain + Twilio. Supports one-way pre-recorded messages AND live two-way conversations where...
🦀 ClawHub
Welsh
Write Welsh that sounds human. Not formal, not robotic, not AI-generated.
🦀 ClawHub
Hebrew
Write Hebrew that sounds human. Not formal, not robotic, not AI-generated.
🦀 ClawHub
Audiobook
Generate audiobooks from novels and long-form text with chapter management and character voices. Use when users mention audiobooks, narrating books, or conve...
🦀 ClawHub
wevoicereply
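[Automated speech synthesis and push pipeline] When the user asks for a voice reply, asks to have something read out, or asks you to speak, you must strictly follow these three steps and must not skip any: ### Step 1: Script generation (Prompt A) Generate natural, warm, conversational text based on the context. Add the Chinese comma `,` inside long sentences to ensure natural pauses in the synthesized audio. ### Step 2: Audio synthesis (run voice_reply_s...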
🦀 ClawHub
ARCHIV
Controls Roon music system via API to search, play tracks or albums, manage queues, adjust volume, shuffle, and control playback on specified zones.
🦀 ClawHub
Speech is Cheap Transcribe
Fast, affordable automatic speech-to-text transcription supporting 100 languages, speaker diarization, word timestamps, and customizable output formats.
🦀 ClawHub
Fliz AI Video Generator
Complete integration guide for the Fliz REST API - an AI-powered video generation platform that transforms text content into professional videos with voiceovers, AI-generated images, and subtitles.
Use this skill when:
- Creating integrations with Fliz API (WordPress, Zapier, Make, n8n, custom apps)
- Building video generation workflows via API
- Implementing webhook handlers for video completion notifications
- Developing automation tools that create, manage, or translate videos
- Troubleshoot
🦀 ClawHub
Purefeed
Monitors Twitter/X feeds with AI signal detection. Searches tweets semantically, manages signal detectors, generates human-sounding posts, checks AI detectio...
🦀 ClawHub
Bidirectional Voice Chat System
Bidirectional voice chat system - speech recognition to text + Edge TTS speech synthesis + Cloudflare Tunnel public access