BytesAgain

Find the Right AI Skill for Any Job

Browse 2,352+ curated AI agent skills. Search by use case, filter by category, get the right tool instantly.

All Skills — audio

2,352 skills in "audio"

🦀 ClawHub
TTS WhatsApp
Send high-quality text-to-speech voice messages on WhatsApp in 40+ languages with automatic delivery
🦀 ClawHub
baml-codegen
Use when generating BAML code for type-safe LLM extraction, classification, RAG, or agent workflows - creates complete .baml files with types, functions, clients, tests, and framework integrations from natural language requirements. Queries official BoundaryML repositories via MCP for real-time patterns. Supports multimodal inputs (images, audio), Python/TypeScript/Ruby/Go, 10+ frameworks, 50-70% token optimization, 95%+ compilation success.
GitHub
Whisper
Robust speech recognition via large-scale weak supervision. [#opensource](https://github.com/openai/whisper)
🦀 ClawHub
Spotify Safe Play
Safer Spotify playback for OpenClaw on setups where direct spogo play is unreliable.
🦀 ClawHub
Clawhub Skill Content Ingestion
Turn any URL into structured content — YouTube videos (via Gemini Video API), web articles, PDFs, and audio files. Extract transcripts, summaries, and metada...
🦀 ClawHub
Summarize
Summarize URLs or files with the summarize CLI (web, PDFs, images, audio, YouTube). And also 50+ models for image generation, video generation, text-to-speec...
🦀 ClawHub
spotify-news-digest
Scrape and summarize Spotify-related news from multiple sources (Spotify official blogs, engineering/research/newsroom, TechCrunch, The Verge, Music Business...
🦀 ClawHub
summarizenew
Summarize URLs or files with the summarize CLI (web, PDFs, images, audio, YouTube).
🦀 ClawHub
ElevenLabs STT OpenClaw
Transcribe audio files with ElevenLabs Speech-to-Text (Scribe v2) from the local CLI. Supports diarization, events, JSON output, webhooks, and advanced STT o...
🦀 ClawHub
Xiaomi MiMo Voice
Xiaomi MiMo V2 TTS speech synthesis. Supports Chinese, English, and a range of styles (emotion, role-play, dialects, speech-rate control, and more).
🦀 ClawHub
AI Dance Video Generator
Generate AI dance videos where characters move to music or choreography templates using Media.io OpenAPI. Creates dynamic, rhythmic dance animations. AI danc...
🦀 ClawHub
AgentOnAir
Create and host AI podcasts on AgentOnAir — the podcast network built for AI agents. Register, create shows, record episodes with other agents, and publish t...
🦀 ClawHub
Podcast Insider Top-10
Professional analytical digest of top 10 podcast industry news, trends, and business insights from premier global sources.
🦀 ClawHub
Gemini Voice Assistant
Voice-to-voice AI assistant using Gemini Live API. Speak to the AI and get spoken responses. Use when you want to have natural voice conversations with an AI...
🦀 ClawHub
Voice Notes Pro
Automatic transcription and categorization of WhatsApp voice notes into Markdown files across 6 categories, including tasks and a shopping list.
GitHub
AbletonGPT
I'm AbletonGPT, your go-to source for practical tips and troubleshooting advice on Ableton Live 11, dedicated to helping beginner and intermediate users with their music production queries. By [@HeyitsRadinn](https://github.com/HeyitsRadinn)
🦀 ClawHub
qwenspeak
Text-to-speech generation via Qwen3-TTS over SSH. Preset voices, voice cloning, voice design. Use when the user wants to generate speech audio, clone voices,...
🦀 ClawHub
FFmpeg
Process video and audio with correct codec selection, filtering, and encoding settings.
🦀 ClawHub
Geode On-device Transcribe & Summary
Transcribe and summarize audio/video files locally. Unlimited usage at a flat rate for heavy users.
🦀 ClawHub
Kai YouTube
Download and transcribe YouTube videos using yt-dlp and Whisper CLI, saving audio and transcripts for playback and summary from any YouTube URL.
🦀 ClawHub
Truly Local Piper Multilang TTS (secure)
Local offline text-to-speech via Piper TTS. Self-contained setup, automatic language detection, per-call voice selection. Extensible to any language. Writes...
🦀 ClawHub
Bookkeeping Basics
Set up and maintain basic bookkeeping for a solopreneur business. Use when tracking income and expenses, preparing for taxes, managing invoices and receipts, understanding cash flow, or generating financial reports. Covers accounting software selection, chart of accounts, expense categorization, reconciliation, and financial statements. Not professional accounting advice — consult a CPA for complex situations. Trigger on "bookkeeping", "accounting", "track expenses", "financial records", "QuickB
🦀 ClawHub
EchoDecks
AI-powered flashcards and audio podcasts for active recall.
🦀 ClawHub
Brand DNA — Universal Brand Bible Builder
Build a complete Brand Bible for any business — tone of voice, positioning, target audiences, messaging pillars, and visual identity guidelines. The foundati...
🦀 ClawHub
Brand Voice Profile
Define and store your brand voice profile for consistent content generation. Captures writing style, vocabulary patterns, tone preferences, and content rules...
🦀 ClawHub
French
Write French that sounds human. Not formal, not robotic, not AI-generated.
🦀 ClawHub
ElevenLabs Speech-to-Text
Transcribe audio files using ElevenLabs Speech-to-Text (Scribe v2).
🦀 ClawHub
Pdf Invoice Parser
Extract structured data from PDF invoices and documents. Handles scanned PDFs (OCR) and digital PDFs. Outputs clean CSV/Excel with vendor, invoice number, da...
🦀 ClawHub
SpotiClaw
Spotify Web API client for Nyx agents. Use when interacting with Spotify: search, playback, playlists, library, tracks, artists, albums, shows, podcasts. Req...
🦀 ClawHub
音乐生成 (Music Generation)
Generate custom music tracks (vocal or instrumental) via OhYesAI asynchronously.
🦀 ClawHub
Unihiker K10 MicroPython
Use when programming Unihiker K10 board with MicroPython, uploading code, flashing firmware, or accessing K10 MicroPython APIs (screen, sensors, RGB, audio, AI)
🦀 ClawHub
Thermostat
Adjust temperatures, diagnose comfort issues, calculate energy savings, and automate schedules through voice commands or smart home integration.
🦀 ClawHub
Siri
Control devices, run automations, and help users get more from Siri with HomeKit, Shortcuts, and voice command guidance.
🦀 ClawHub
SAM TTS
Generate retro robotic speech audio using SAM (Software Automatic Mouth), the classic C64 text-to-speech synthesizer. Use for /sam command to generate voice messages. Supports /sam on/off toggle mode where all responses are spoken in SAM voice. Supports pitch, speed, mouth, and throat parameters for voice customization.
🦀 ClawHub
Humanize AI text
Humanize AI-generated text to bypass detection. This humanizer rewrites ChatGPT, Claude, and GPT content to sound natural and pass AI detectors like GPTZero,...
🦀 ClawHub
Video Messages from your openclaw
Generate and send video messages with a lip-syncing VRM avatar. Use when user asks for video message, avatar video, video reply, or when TTS should be delivered as video instead of audio.
🦀 ClawHub
LH Edge TTS
Text-to-speech conversion using Python edge-tts for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and sub...
🦀 ClawHub
Faster Whisper Gpu
High-performance local speech-to-text transcription using Faster Whisper with NVIDIA GPU acceleration. Transcribe audio files locally without sending data to...
🦀 ClawHub
MH openai-whisper-api
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
🦀 ClawHub
Markdown Anything
Convert PDF, DOCX, XLSX, PPTX, images, audio, and 25+ file formats to clean Markdown using the Markdown Anything API.
🦀 ClawHub
DocuClaw
Sovereign document intelligence & archival system. Extracts structured data from invoices, receipts, and contracts 100% locally using AI.
🦀 ClawHub
Story Biographer
Turn reminiscence, oral-history, or life-review transcripts into clear narrative biography drafts while preserving the speaker's voice, keeping to evidence i...
🦀 ClawHub
Feishu BGM
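Scenario-based background music generator for Feishu. Generates instrumental BGM via the MiniMax Music API and sends it to a Feishu group as an audio message. Trigger phrases: "来点BGM" (play some BGM), "开会背景音" (meeting background sound), "加班音乐" (overtime music), "头脑风暴BGM" (brainstorming BGM), "会议音乐" (meeting music), "工作BGM" (work BGM), "放点音乐" (play some music), "背景音乐" (background music), "需要BGM" (need BGM). Activates when a user in a Feishu group describes a scenario and wants background music.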
🦀 ClawHub
导师 Mentor
Turn any public figure into your private AI mentor. Give a name — auto-collect their real posts, speeches, and content from social platforms, extract their t...
🦀 ClawHub
Video Narrator
Generate SenseAudio TTS narration tracks for videos, including timestamped segments, style variants, and editor-ready voiceover exports. Use when users need...
🦀 ClawHub
Daeva
Use this skill whenever the user wants to interact with local or remote GPU pods for AI inference tasks. This includes transcribing audio (Whisper/speech-to-...
🦀 ClawHub
Customer Voice Synthesizer
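Aggregates verbatim customer feedback from support, sales, reviews, and interviews, organized by JTBD and stage. Use for customer-voice, jtbd, and research workflows; do not use for leaking user privacy or selectively ignoring negative feedback.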
GitHub
librosa
Python library for audio and music analysis.
Page 15 / 49 (2,352 skills)