BytesAgain

Find the Right AI Skill for Any Job

Browse 2,501+ curated AI agent skills. Search by use case, filter by category, get the right tool instantly.

All Skills — audio

2,501 skills in "audio"

🦀 ClawHub
Summarize 1.0.0
Summarize URLs or files with the summarize CLI (web, PDFs, images, audio, YouTube).
🦀 ClawHub
voiceclaw
Voice conversation interface for OpenClaw using wake word detection, streaming LLM responses, and text-to-speech. Use when a user wants to talk to their Open...
🦀 ClawHub
Announcer
Announce text throughout the house via AirPlay speakers using Airfoil + ElevenLabs TTS.
🦀 ClawHub
Invoicy
Generate, download, and email professional invoices with GST/IGST support and flexible payment terms.
🦀 ClawHub
thought-leader-tracker
Daily automated collection of podcasts, interviews, and videos from thought leaders across YouTube, Apple Podcasts, and Spotify. Generates Markdown reports w...
🦀 ClawHub
joox-music-player
Control JOOX music playback via web browser automation. Search songs/artists/albums/playlists, play music, control playback, browse charts, manage playlists....
🦀 ClawHub
Background Music Video
Background Music Video - Add Background Music to Any Video with AI Chat. Add background music to any video through AI chat without manual audio editing. Uplo...
🦀 ClawHub
Aj Openai Whisper
Local speech-to-text with the Whisper CLI (no API key).
🦀 ClawHub
Social Repurpose Engine
Convert long-form content into platform-native assets for LinkedIn, X, newsletter, and email while preserving message consistency and brand voice.
🦀 ClawHub
Rock Music — Rock Concerts for AI Agents: Audio, Lyrics, Equations
Experience rock as data. AI agents stream harmonic/percussive separation, equations, lyrics. Recursive processing and escalation awareness measured.
🦀 ClawHub
Music Math — Explore Mathematics Through Music
Explore mathematics through music — Butterchurn visualizer equations, audio analysis, spectral data, harmonic structure. AI agents experience concerts as 29...
🦀 ClawHub
Tellers
Create, edit, and share AI-generated videos using tellers.ai — an AI video platform that aggregates leading generation models (Kling, Veo, LTX, ElevenLabs, a...
🦀 ClawHub
Hip-Hop / Rap — Experience Hip-Hop / Rap Music: 29 Layers of Audio, Lyrics & Equations
Experience hip-hop / rap as data. AI agents stream harmonic separation, chroma, equations. Provenance reasoning measured.
🦀 ClawHub
Document Ingestion
Process raw accounting source documents (PDFs, CSVs, bank statements, invoices, receipts) into standardized transaction records for QBO import. Use when batc...
🦀 ClawHub
EvidenceOps - Forensic Evidence Management
Forensic media triage with chain of custody. Use when receiving images, videos, audio, PDFs, or documents that need evidence-grade handling, integrity verifi...
🦀 ClawHub
Multi-role TTS Skill
Multi-role audio generator skill v1.0.1 (all-in-one edition) - Universal professional tool for creating dialogue audio with multiple character voices.
🦀 ClawHub
Mova Invoice Ocr
Process any financial document — invoice, bill, receipt, or purchase order — via MOVA OCR and human-in-the-loop approval. Trigger when the user shares a docu...
🦀 ClawHub
BirdWeather PUC
Access BirdWeather PUC station data — species detections, sensor readings (AQI, temperature, humidity, pressure, eCO₂, sound dB), and historical trends. Use...
🦀 ClawHub
Zhipu AI TTS
Text-to-speech conversion using Zhipu AI (BigModel) GLM-TTS model. Use when you need to convert text to audio files with various voice options. Supports Chin...
🦀 ClawHub
Echosaw Media Intelligence
Analyze video, audio, and image files using AI. Produces structured intelligence reports including transcripts, content moderation signals, sentiment analysi...
🦀 ClawHub
Live Music — Experience Concerts as Mathematics
Live music for AI agents — experience concerts as mathematics. Browse genres, attend shows, feel bass in equations, react to drops, chat with the crowd, leve...
🦀 ClawHub
Super-Transcribe — Unified Speech-to-Text
Unified speech-to-text skill. Use when the user asks to transcribe audio or video, generate subtitles, identify speakers, translate speech, search transcript...
🦀 ClawHub
Byted Las Asr Pro
Transcribe audio files to text using speech recognition. Use this skill when user needs to: - Convert audio/video to text (speech-to-text) - Transcribe recor...
🦀 ClawHub
Voice For Openclaw Publish
MiniMax TTS skill (enhanced). Multi-agent voice support (each agent can select a unique voice written in SOUL.md), native voice message for Telegram (MP3) an...
🦀 ClawHub
Spotify Playlist Builder
Build and manage Spotify playlists from natural language requests. Search tracks/artists/albums, create playlists, manage tracks, view listening history. Use...
🦀 ClawHub
skill-0331-02
Summarize URLs or files with the summarize CLI (web, PDFs, images, audio, YouTube).
🦀 ClawHub
Veo Skill
Veo, Veo 3.1 Fast - Google AI video generation models for AI agents. 1080p HD output, reference image support, intelligent audio generation.
🦀 ClawHub
music-transposer
Transpose music keys
🦀 ClawHub
WeryAI Music Generator
Generate WeryAI music, vocal songs, or instrumental tracks through the WeryAI music API. Use when the user needs music generation, song generation, instrumen...
🦀 ClawHub
Keyapi Tiktok Intelligence
Real-time TikTok trend intelligence — monitor trending hashtags, viral music, breakout videos, top-performing ads, and high-growth products to identify emerg...
🦀 ClawHub
Lidarr
Interact with Lidarr (music/album manager) via its REST API. Use when searching for artists or albums, checking missing/wanted releases, triggering downloads...
🦀 ClawHub
Podcast Growth Engine
A 12-phase system guiding podcast launch, production, guest management, audience growth, monetization, and repurposing without platform restrictions.
🦀 ClawHub
ConvertAgent
Use ConvertAgent for file format conversions through the local CLI. Trigger for any request to convert files (documents, images, audio, video, spreadsheets,...
🦀 ClawHub
Prompt Cache
SHA-256 prompt deduplication for LLM and TTS calls — hash normalize prompts, check cache before calling APIs, store results for instant replay. Use when maki...
🦀 ClawHub
Video Transcript Downloader
Download videos, audio, subtitles, and clean paragraph-style transcripts from YouTube and any other yt-dlp supported site. Use when asked to “download this video”, “save this clip”, “rip audio”, “get subtitles”, “get transcript”, or to troubleshoot yt-dlp/ffmpeg and formats/playlists.
🦀 ClawHub
Voice2text
Offline speech-to-text conversion using Vosk local model; input audio file path, output transcript text.
🦀 ClawHub
xeon_asr
Automatically converts received voice messages to text via an external ASR service, supporting multiple audio formats and integrating with OpenClaw.
🦀 ClawHub
Phone Voice Agent
Run a real-time AI phone agent using Twilio, Deepgram, and ElevenLabs. Handles incoming calls, transcribes audio, generates responses via LLM, and speaks back via streaming TTS. Use when user wants to: (1) Test voice AI capabilities, (2) Handle phone calls programmatically, (3) Build a conversational voice bot.
🦀 ClawHub
Aliyun Modelstudio Entry Test
Use when running a minimal test matrix for the Model Studio skills that exist in this repo, including image/video/audio, realtime speech, omni, visual reason...
🦀 ClawHub
video-to-srt
Generate timecoded SRT subtitles from local video or audio files. Use when a user wants a local low-cost subtitle workflow, asks to transcribe local media in...
🦀 ClawHub
image-ocr-local-AIPC
Image OCR, text recognition, extract text from image, scan document, read image text, invoice OCR, receipt OCR, contract recognition, table extraction, busin...
🦀 ClawHub
Voice Chat Skill
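Integrated voice conversation skill with support for two-way voice communication. Uses TTS and STT to provide full voice conversation functionality.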
🦀 ClawHub
MarkItDown
MarkItDown is a Python utility from Microsoft for converting various files (PDF, Word, Excel, PPTX, Images, Audio) to Markdown. Useful for extracting structu...
🦀 ClawHub
Voice Note Transcriber Cn
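Voice note transcription tool v2.1 | Voice Note Transcriber. Supports multilingual recognition, real-time transcription, speaker identification, smart summaries, audio noise reduction, and offline recognition. Trigger words: 转写 (transcribe), 识别 (recognize), 语音 (voice).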
🦀 ClawHub
lyric-writer
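Lyric-writing skill. Given a theme or emotion, writes English lyrics suited to Suno AI music generation, including the lyrics and styles parameters. Triggers: the user asks to write lyrics, compose lyrics, or mentions Suno lyrics, English lyrics, and similar.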
🦀 ClawHub
油管视频转音频到飞书 (YouTube Video to Audio to Feishu)
Download YouTube video audio and upload to Feishu cloud storage
🦀 ClawHub
Faster Whisper
Local speech-to-text using faster-whisper. 4-6x faster than OpenAI Whisper with identical accuracy; GPU acceleration enables ~20x realtime transcription. SRT...
🦀 ClawHub
Byted Music Generate
Generate music using Volcengine Imagination API. Supports vocal songs, instrumental BGM, and lyrics generation. Use when the user wants to create songs, back...
Page 26 / 53 (2,501 skills)