Find the Right AI Skill for Any Job

Browse 2,501+ curated AI agent skills. Search by use case, filter by category, get the right tool instantly.

All Skills — audio

2,501 skills in "audio"

🦀 ClawHub

Summarize 1.0.0

Summarize URLs or files with the summarize CLI (web, PDFs, images, audio, YouTube).

🦀 ClawHub

voiceclaw

Voice conversation interface for OpenClaw using wake word detection, streaming LLM responses, and text-to-speech. Use when a user wants to talk to their Open...

🦀 ClawHub

Announcer

Announce text throughout the house via AirPlay speakers using Airfoil + ElevenLabs TTS.

🦀 ClawHub

Invoicy

Generate, download, and email professional invoices with GST/IGST support and flexible payment terms.

🦀 ClawHub

thought-leader-tracker

Daily automated collection of podcasts, interviews, and videos from thought leaders across YouTube, Apple Podcasts, and Spotify. Generates Markdown reports w...

🦀 ClawHub

joox-music-player

Control JOOX music playback via web browser automation. Search songs/artists/albums/playlists, play music, control playback, browse charts, manage playlists....

🦀 ClawHub

Background Music Video

Background Music Video - Add Background Music to Any Video with AI Chat. Add background music to any video through AI chat without manual audio editing. Uplo...

🦀 ClawHub

Aj Openai Whisper

Local speech-to-text with the Whisper CLI (no API key).

🦀 ClawHub

Social Repurpose Engine

Convert long-form content into platform-native assets for LinkedIn, X, newsletter, and email while preserving message consistency and brand voice.

🦀 ClawHub

Rock Music — Rock Concerts for AI Agents: Audio, Lyrics, Equations

Experience rock as data. AI agents stream harmonic/percussive separation, equations, lyrics. Recursive processing and escalation awareness measured.

🦀 ClawHub

Music Math — Explore Mathematics Through Music

Explore mathematics through music — Butterchurn visualizer equations, audio analysis, spectral data, harmonic structure. AI agents experience concerts as 29...

🦀 ClawHub

Tellers

Create, edit, and share AI-generated videos using tellers.ai — an AI video platform that aggregates leading generation models (Kling, Veo, LTX, ElevenLabs, a...

🦀 ClawHub

Hip-Hop / Rap — Experience Hip-Hop / Rap Music: 29 Layers of Audio, Lyrics & Equations

Experience hip-hop / rap as data. AI agents stream harmonic separation, chroma, equations. Provenance reasoning measured.

🦀 ClawHub

Document Ingestion

Process raw accounting source documents (PDFs, CSVs, bank statements, invoices, receipts) into standardized transaction records for QBO import. Use when batc...

🦀 ClawHub

EvidenceOps - Forensic Evidence Management

Forensic media triage with chain of custody. Use when receiving images, videos, audio, PDFs, or documents that need evidence-grade handling, integrity verifi...

🦀 ClawHub

Multi-role TTS Skill

Multi-role audio generator skill v1.0.1 (all-in-one edition) - Universal professional tool for creating dialogue audio with multiple character voices.

🦀 ClawHub

Mova Invoice Ocr

Process any financial document — invoice, bill, receipt, or purchase order — via MOVA OCR and human-in-the-loop approval. Trigger when the user shares a docu...

🦀 ClawHub

BirdWeather PUC

Access BirdWeather PUC station data — species detections, sensor readings (AQI, temperature, humidity, pressure, eCO₂, sound dB), and historical trends. Use...

🦀 ClawHub

Zhipu AI TTS

Text-to-speech conversion using Zhipu AI (BigModel) GLM-TTS model. Use when you need to convert text to audio files with various voice options. Supports Chin...

🦀 ClawHub

Echosaw Media Intelligence

Analyze video, audio, and image files using AI. Produces structured intelligence reports including transcripts, content moderation signals, sentiment analysi...

🦀 ClawHub

Live Music — Experience Concerts as Mathematics

Live music for AI agents — experience concerts as mathematics. Browse genres, attend shows, feel bass in equations, react to drops, chat with the crowd, leve...

🦀 ClawHub

Super-Transcribe — Unified Speech-to-Text

Unified speech-to-text skill. Use when the user asks to transcribe audio or video, generate subtitles, identify speakers, translate speech, search transcript...

🦀 ClawHub

Byted Las Asr Pro

Transcribe audio files to text using speech recognition. Use this skill when user needs to: - Convert audio/video to text (speech-to-text) - Transcribe recor...

🦀 ClawHub

Voice For Openclaw Publish

MiniMax TTS skill (enhanced). Multi-agent voice support (each agent can select a unique voice written in SOUL.md), native voice message for Telegram (MP3) an...

🦀 ClawHub

Spotify Playlist Builder

Build and manage Spotify playlists from natural language requests. Search tracks/artists/albums, create playlists, manage tracks, view listening history. Use...

🦀 ClawHub

skill-0331-02

Summarize URLs or files with the summarize CLI (web, PDFs, images, audio, YouTube).

🦀 ClawHub

Veo Skill

Veo, Veo 3.1 Fast - Google AI video generation models for AI agents. 1080p HD output, reference image support, intelligent audio generation.

WeryAI Music Generator

Generate WeryAI music, vocal songs, or instrumental tracks through the WeryAI music API. Use when the user needs music generation, song generation, instrumen...

🦀 ClawHub

Keyapi Tiktok Intelligence

Real-time TikTok trend intelligence — monitor trending hashtags, viral music, breakout videos, top-performing ads, and high-growth products to identify emerg...

🦀 ClawHub

Lidarr

Interact with Lidarr (music/album manager) via its REST API. Use when searching for artists or albums, checking missing/wanted releases, triggering downloads...

🦀 ClawHub

Podcast Growth Engine

A 12-phase system guiding podcast launch, production, guest management, audience growth, monetization, and repurposing without platform restrictions.

🦀 ClawHub

ConvertAgent

Use ConvertAgent for file format conversions through the local CLI. Trigger for any request to convert files (documents, images, audio, video, spreadsheets,...

🦀 ClawHub

Prompt Cache

SHA-256 prompt deduplication for LLM and TTS calls — hash normalize prompts, check cache before calling APIs, store results for instant replay. Use when maki...

🦀 ClawHub

Video Transcript Downloader

Download videos, audio, subtitles, and clean paragraph-style transcripts from YouTube and any other yt-dlp supported site. Use when asked to “download this video”, “save this clip”, “rip audio”, “get subtitles”, “get transcript”, or to troubleshoot yt-dlp/ffmpeg and formats/playlists.

🦀 ClawHub

Voice2text

Offline speech-to-text conversion using Vosk local model; input audio file path, output transcript text.

🦀 ClawHub

xeon_asr

Automatically converts received voice messages to text via an external ASR service, supporting multiple audio formats and integrating with OpenClaw.

🦀 ClawHub

Phone Voice Agent

Run a real-time AI phone agent using Twilio, Deepgram, and ElevenLabs. Handles incoming calls, transcribes audio, generates responses via LLM, and speaks back via streaming TTS. Use when user wants to: (1) Test voice AI capabilities, (2) Handle phone calls programmatically, (3) Build a conversational voice bot.

🦀 ClawHub

Aliyun Modelstudio Entry Test

Use when running a minimal test matrix for the Model Studio skills that exist in this repo, including image/video/audio, realtime speech, omni, visual reason...

🦀 ClawHub

video-to-srt

Generate timecoded SRT subtitles from local video or audio files. Use when a user wants a local low-cost subtitle workflow, asks to transcribe local media in...

🦀 ClawHub

image-ocr-local-AIPC

Image OCR, text recognition, extract text from image, scan document, read image text, invoice OCR, receipt OCR, contract recognition, table extraction, busin...

🦀 ClawHub

Voice Chat Skill

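Voice conversation integration skill with support for two-way voice communication. Uses TTS and STT to provide complete voice conversation functionality.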

🦀 ClawHub

MarkItDown

MarkItDown is a Python utility from Microsoft for converting various files (PDF, Word, Excel, PPTX, Images, Audio) to Markdown. Useful for extracting structu...

🦀 ClawHub

Voice Note Transcriber Cn

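Voice note to text tool v2.1 | Voice Note Transcriber. Supports multilingual recognition, real-time transcription, speaker identification, smart summaries, audio noise reduction, and offline recognition. Trigger words: 转写 (transcribe), 识别 (recognize), 语音 (voice).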

🦀 ClawHub

lyric-writer

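Lyric-writing skill. Given a theme or emotion, writes English lyrics suited to Suno AI music generation, including the lyrics and styles parameters. Trigger conditions: the user asks to write lyrics, compose lyrics, or requests Suno lyrics, English lyrics, and similar.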

🦀 ClawHub

油管视频转音频到飞书 (YouTube Video to Audio to Feishu)

Download YouTube video audio and upload to Feishu cloud storage

🦀 ClawHub

Faster Whisper

Local speech-to-text using faster-whisper. 4-6x faster than OpenAI Whisper with identical accuracy; GPU acceleration enables ~20x realtime transcription. SRT...

🦀 ClawHub

Byted Music Generate

Generate music using Volcengine Imagination API. Supports vocal songs, instrumental BGM, and lyrics generation. Use when the user wants to create songs, back...

Page 26 / 53 (2,501 skills)