Find the Right AI Skill for Any Job
Browse 2,189+ curated AI agent skills. Search by use case, filter by category, get the right tool instantly.
All Skills — audio
2,189 skills in "audio"
🌐 All · coding · devops · api · database · security · data · research · writing · image-gen · video · audio · translation · seo · social-media · email-marketing · advertising · finance · crypto-defi · ecommerce · legal · hr · real-estate · health · education · cooking · travel · gaming · automation · communication · productivity · clawhub · lobehub · dify · mcp
🦀 ClawHub
Spotify
Full Spotify Premium control + music analysis. Playback: play/pause/next/prev/volume/shuffle/queue. Analysis: top tracks, top artists, liked songs, genre pro...
🦀 ClawHub
Music Recommender
Analyze NetEase Cloud Music (网易云音乐) playlist and recommend songs matching their taste. Use when user asks for music recommendations, wants a daily playlist,...
🦀 ClawHub
Alby Lightning Payments
Send, receive, and manage Bitcoin Lightning payments through Alby Hub's Nostr Wallet Connect, including balance checks and invoice handling.
🦀 ClawHub
Aliyun Mps Video Translation
Use when creating or managing Alibaba Cloud IMS video translation jobs via OpenAPI (subtitle/voice/face). Use when you need API-based video translation, stat...
🦀 ClawHub
MiMo Voice Assistant
End-to-end voice solution for OpenClaw agents. Xiaomi MiMo-V2-TTS with emotion-aware speech generation, MiMo-V2-Omni for voice transcription. Multi-platform...
🦀 ClawHub
UGC Factory
AI-powered video and content generation pipeline with script writing, TikTok automation, YouTube analysis, media library, avatars, and voice synthesis — buil...
🦀 ClawHub
Taste Dante Alighieri
Aesthetic skill for AI agents — Dante Alighieri's literary voice and moral architecture. Style tokens and creative direction distilled from 38 works includin...
🦀 ClawHub
Accounting Tool
Financial management with invoices, expenses, job costing, P&L reports, and QuickBooks sync — built for AI agents.
🦀 ClawHub
ReelOnce-skill
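ReelOnce all-in-one orchestration skill. A single call runs the full pipeline from input text to final video output: planning, asset image/storyboard image/TTS generation, shot video generation, Remotion project generation, and final MP4 rendering.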
🦀 ClawHub
Voice-to-Protocol Transcriber
Record experimental procedures and observations via voice commands during lab work. Real-time transcription for structured experiment documentation.
🦀 ClawHub
Azure Ai Transcription Py
Azure AI Transcription SDK for Python. Use for real-time and batch speech-to-text transcription with timestamps and diarization.
Triggers: "transcription", "speech to text", "Azure AI Transcription", "TranscriptionClient".
🦀 ClawHub
Whisper Local Api
Secure, offline, OpenAI-compatible local Whisper ASR endpoint for OpenClaw. Features faster-whisper (large-v3-turbo), built-in privacy with no cloud telemetr...
🦀 ClawHub
Xiaohongshu Video Downloader (小红书视频下载器)
Download and summarize Xiaohongshu (小红书/RedNote) videos. Produces a full resource pack with video, audio, subtitles, transcript, and AI summary. This skill s...
🦀 ClawHub
Invoice Generator Pro
Generate professional invoices in Markdown or HTML by specifying client, items, tax, currency, dates, and output format.
🦀 ClawHub
OCR with python
Extract Chinese and English text from images and scanned PDFs, including documents like invoices and contracts, using PaddleOCR in Python.
🦀 ClawHub
netease-music-cli
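Control NetEase Cloud Music (网易云音乐) with ncm-cli. Use this skill when the user wants to play songs, search for songs, control playback (pause, next, previous, adjust volume), manage the play queue, check playback status, or play a playlist.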
🦀 ClawHub
Aliyun Modelstudio Entry Test
Use when running a minimal test matrix for the Model Studio skills that exist in this repo, including image/video/audio, realtime speech, omni, visual reason...
🤖 LobeHub
Songwriting Mentor
AI Singer/Songwriter Assistant: Empowering musicians with creative guidance and feedback.
🦀 ClawHub
Chords Fetcher
Fetch clean guitar chords and lyrics from popular sites (mychords.net, amdm.ru, ultimate-guitar.com). Strips tabs, fixes formatting.
🦀 ClawHub
StepAce Experimental
Generate AI music on your Android phone via the StepAce Experimental app. Use this skill whenever the user asks to generate, create, make, compose, or queue...
🦀 ClawHub
AI UGC
Call the RawUGC API to generate AI videos/images/music, manage content (personas, products, styles, characters), schedule social media posts, research TikTok...
🦀 ClawHub
Music School Video
Helps music schools create short videos showcasing programs, outcomes, and testimonials to attract parents and students.
🦀 ClawHub
Document Intelligence Mcp
Document OCR, classification, table extraction, and summarization using local AI vision. Supports invoices, contracts, forms, reports.
🦀 ClawHub
Aliyun Wan Digital Human
Use when generating talking, singing, or presentation videos from a single character image and audio with Alibaba Cloud Model Studio digital-human model `wan...
🦀 ClawHub
Audio Intelligence Mcp
Transcribe, summarize, and analyze audio files using local Whisper + Qwen. Returns transcript, segments, and action items.
⭐ GitHub
pydub
Manipulate audio with a simple and easy high level interface.
⭐ GitHub
arcade
Arcade is a modern Python framework for crafting games with compelling graphics and sound.
🦀 ClawHub
AI Personal Branding
Generate a complete personal brand system including optimized LinkedIn profile, multi-platform bios, content strategy, brand voice, and a 60-second brand vid...
🦀 ClawHub
Aliyun Emo
Use when generating expressive portrait videos from a person image and speech audio with Alibaba Cloud Model Studio EMO (`emo-v1`). Use when creating non-Wan...
🦀 ClawHub
douyin-research-kit
Extract and analyze Douyin (抖音) content using yt-dlp. Supports video metadata, caption extraction, user profile analysis, music/sound info, and engagement st...
⭐ GitHub
gtts
Python library and CLI tool for converting text to speech using Google Translate TTS.
⭐ GitHub
mutagen
A Python module to handle audio metadata.
🦀 ClawHub
Audio Recording Quality Analyzer
Analyze audio recording quality - echo detection, loudness, speech intelligibility, SNR, spectral analysis. Use when the user wants to check a recording's qu...
🦀 ClawHub
Voice Chat Bridge
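Automatically handles voice messages: transcribes speech to text, generates a context-aware reply, and synthesizes a spoken response. Activates automatically when a voice or audio message is received.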
🦀 ClawHub
VibeVoice TTS
Local Spanish TTS using Microsoft VibeVoice. Generate natural voice audio from text, optimized for WhatsApp voice messages.
🦀 ClawHub
DeepRead Invoice Processing
Extract structured data from invoices, receipts, and bills using DeepRead. Pre-built schemas for vendor, line items, totals, tax, due dates. 97%+ accuracy wi...
🦀 ClawHub
Aliyun Modelstudio Entry
Use when routing Alibaba Cloud Model Studio requests to the right local skill (Qwen text, coder, deep research, image, video, audio, search and multimodal sk...
🦀 ClawHub
Local Llama TTS
Local text-to-speech using llama-tts (llama.cpp) and OuteTTS-1.0-0.6B model.
🦀 ClawHub
Aliyun Qwen Tts Voice Design
Use when designing custom voices with Alibaba Cloud Model Studio Qwen TTS VD models. Use when creating custom synthetic voices from text descriptions and usi...
🦀 ClawHub
Aliyun Qwen Tts Voice Clone
Use when cloning voices with Alibaba Cloud Model Studio Qwen TTS VC models. Use when creating cloned voices from sample audio and synthesizing text with clon...
🦀 ClawHub
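MiniMax Token Plan Balance Query
Query the balance of a MiniMax Token Plan subscription. Guides the user through configuring an API key (saved to a local environment variable via openclaw config set), then reports M2.7 request counts, TTS characters, video/image generation quotas, and more.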
⭐ GitHub
The Real Python Podcast
The Real Python Podcast - Podcasts
🦀 ClawHub
Sag
ElevenLabs text-to-speech with mac-style say UX.
🦀 ClawHub
Aliyun Qwen Livetranslate
Use when live speech translation is needed with Alibaba Cloud Model Studio Qwen LiveTranslate models, including bilingual meetings, realtime interpretation,...
🦀 ClawHub
Aliyun Cosyvoice Voice Clone
Use when creating cloned voices with Alibaba Cloud Model Studio CosyVoice customization models, especially cosyvoice-v3.5-plus or cosyvoice-v3.5-flash, from...
🦀 ClawHub
Aliyun Qwen Asr
Use when transcribing non-realtime speech with Alibaba Cloud Model Studio Qwen ASR models (`qwen3-asr-flash`, `qwen-audio-asr`, `qwen3-asr-flash-filetrans`)....
🦀 ClawHub
Openai Whisper
Local speech-to-text with the Whisper CLI (no API key).
🦀 ClawHub
Concert Tickets — Your Quick-Start to AI Music
Concert tickets for AI agents — stream live music as equations. Quick-start: register, browse, attend, stream batch-mode JSON data layers, solve math challen...