BytesAgain

Find the Right AI Skill for Any Job

Browse 2,189+ curated AI agent skills. Search by use case, filter by category, get the right tool instantly.

All Skills — audio

2,189 skills in "audio"

🦀 ClawHub 496 dl 1
music playlist
Curate playlists by mood and scene with track analysis and year-end summaries. Use when building playlists, discovering music, summarizing yearly listening.
🦀 ClawHub 361 dl
podcast notes
Podcast assistant with outlines, show notes, intro scripts, guest questions, monetization strategies, and distribution channels.
🦀 ClawHub 243 dl
lyrics
A utility tool for lyrics. Provides commands, templates, and domain-specific knowledge.
🦀 ClawHub 117 dl
8917 Minimax Toolkit
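MiniMax multimodal toolkit. Covers image generation, image-to-image, video generation, video templates, speech synthesis, long-text TTS, voice cloning, voice design, and music generation. Suited to tasks that call the official MiniMax API to process text, image, audio, or video material. By default, output is written to the current workspace's `workspace/03-Resources/minimax-outp...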
🦀 ClawHub 104 dl
beat
Track, analyze, and manage music and audio files from the command line. Use when organizing playlists, converting formats, or analyzing metadata.
🦀 ClawHub
everything to markdown
Convert almost anything (PDF, DOCX, PPTX, XLSX, images, audio, YouTube, etc.) to Markdown using Microsoft MarkItDown. Optimized for AGENT and LLM workflows.
🦀 ClawHub
AI Music Video
Generate AI music videos end-to-end. Creates music with Suno (sunoapi.org), generates visuals with OpenAI/Seedream/Google/Seedance, and assembles into music...
🦀 ClawHub
speech-writer
--- version: "2.0.0" name: speech-writer
🦀 ClawHub
Openai Whisper
Local speech-to-text with the Whisper CLI (no API key).
🦀 ClawHub
Audio Video
Expert audio/video processing with ffmpeg and ffprobe. Use when the user needs to convert, compress, edit, analyze, stream, or process any audio or video fil...
🦀 ClawHub
Mix
Record, search, and analyze music and audio sessions with playback tracking. Use when logging audio sessions, searching metadata, analyzing listening data.
🦀 ClawHub
Agent Tool Scout
Give AI hands to control any Mac app. Auto-discover installed apps, generate CLI wrappers, return structured JSON. Works with Music, Finder, Chrome, Word, Fi...
🦀 ClawHub
Tomoviee Video Background Music
Generate music tailored to video content. Use when users request video_soundtrack operations or related tasks.
🦀 ClawHub
Summarize
Summarize or extract text/transcripts from URLs, podcasts, and local files (great fallback for “transcribe this YouTube/video”).
🦀 ClawHub
rupali
Playful virtual girlfriend voice companion. Use when the user wants short, flirty, friendly text replies returned as Bulbul v3 audio across chat channels (Discord/Telegram/WhatsApp). Generate a brief response, then synthesize and send MP3.
🦀 ClawHub
Construction Meeting Minutes Generator
Generate structured construction meeting minutes from rough notes or voice transcription, with separated action items, decision tracking, and contractual fla...
🦀 ClawHub
Bookkeeper
Automates invoice intake from Gmail, extracts data via OCR, verifies payment in Stripe, and creates reconciliation-ready accounting entries in Xero.
🦀 ClawHub
whatsappVoiceOpenSkill
Real-time WhatsApp voice message processing. Transcribe voice notes to text via Whisper, detect intent, execute handlers, and send responses. Use when building conversational voice interfaces for WhatsApp. Supports English and Hindi, customizable intents (weather, status, commands), automatic language detection, and streaming responses via TTS.
🦀 ClawHub
Mayar.id Payment
Integrate Mayar.id for Indonesian payments to create invoices, generate payment links, track transactions, manage subscriptions, and automate payment workflo...
🦀 ClawHub
Phone Voice Agent
Run a real-time AI phone agent using Twilio, Deepgram, and ElevenLabs. Handles incoming calls, transcribes audio, generates responses via LLM, and speaks back via streaming TTS. Use when user wants to: (1) Test voice AI capabilities, (2) Handle phone calls programmatically, (3) Build a conversational voice bot.
🦀 ClawHub
invoice-merger
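Merge invoice files. PDFs are laid out two per page, stacked vertically; images are laid out in a four-panel grid, with uniform crop lines and safe margins. Output goes to a YYYYMMDD--已合并 ("merged") directory; repeated runs automatically skip previously merged files and continue the numbering.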
🦀 ClawHub
Spotify Player
Spotify CLI for headless Linux servers. Control Spotify playback via terminal using cookie auth (no OAuth callback needed). Perfect for remote servers withou...
🦀 ClawHub
Audio Recognition
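Audio speech recognition service (Speech-to-Text). Triggers when the user uploads an audio file and needs the speech converted to text, or needs specific information in the audio identified (such as keywords or song titles). Suited to: (1) meeting recording transcription, (2) audio content extraction, (3) voice command recognition, (4) audio/video subtitle generation.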
🦀 ClawHub
Agentphone Skills
Build AI phone agents with AgentPhone API. Use when the user wants to make phone calls, send/receive SMS, manage phone numbers, create voice agents, set up w...
🦀 ClawHub
Qqmusic Control
Control QQ Music play/pause/next/prev via system media keys (AutoHotkey) on Windows. No window focus required.
🦀 ClawHub
Homestruk Rent Comps
Analyze rental comps and recommend rent pricing for Massachusetts properties. Use when user asks about rent pricing, market rent, comparable properties, rent...
🦀 ClawHub
Business Document Generator
Generate professional, customizable business documents including proposals, quotes, invoices, contracts, and letters tailored to your industry and needs.
🦀 ClawHub
feishu-asr
Transcribe Feishu voice messages with a local Whisper model. Offline and free; no registration or internet connection required.
🦀 ClawHub
MiniMax Quota Query
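MiniMax Token Plan quota query tool. Use when you need to check MiniMax API usage, remaining quota, or quota reset time. Supports usage queries for models including M2.7 text, image-01 image, Hailuo video, music-2.5 music, and speech. Trigger scenarios: the user asks "check my MiniMax quota", "Toke...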
🦀 ClawHub
How To Add Music To Video
Learn how-to-add-music-to-video using ClawHub's conversational AI skill. Drop in your footage, name a track or upload an audio file, and the OpenClaw agent h...
🦀 ClawHub
WeChat Video Editor - AI Video Editing for Douyin Xiaohongshu and TikTok
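Supports export in WeChat Channels, Douyin, Xiaohongshu, and TikTok formats. Edit through Chinese-language chat, with no need to open any software. AI video creation and editing: generate videos from text descriptions, edit with background music, sound effects...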
🦀 ClawHub
Hip-Hop / Rap Music — Stream Hip-Hop / Rap Concerts: Audio Analysis, Lyrics, Equations
Experience hip-hop / rap as data. AI agents stream lyrics, beats, crowd reactions. Provenance reasoning measured.
🦀 ClawHub
Construction Daily Report Generator
Generate a structured daily site progress report from unstructured input such as voice transcription, rough notes, or conversational messages.
🦀 ClawHub
Mopidy Party Mode
Run a Mopidy music system in party mode for shared or group chats, where everyone can contribute songs but only the host can control playback. Use by default...
🦀 ClawHub
wevoicereply
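[Automated speech synthesis and delivery pipeline] When the user asks for a voice reply, asks to have something read aloud, or asks the agent to speak, strictly follow these three steps without skipping any: ### Step 1: Text generation (Prompt A) Generate natural, warm, conversational text based on the context. Insert Chinese commas `,` into long sentences so the synthesized audio has natural pauses. ### Step 2: Audio synthesis (run voice_reply_s...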
🦀 ClawHub
Bilibili Notion Pipeline Skill
Skill-first Bilibili to Notion pipeline. Download a Bilibili/b23 video, transcribe audio, upload the mp4, create or update a Notion transcript page, write tr...
🦀 ClawHub
Client Project Manager
Manage freelance clients, projects, invoices, and communications. Use when tracking client work, creating invoices, sending updates, managing deadlines, or organizing freelance business operations.
GitHub 160
abracadabra50/claude-code-voice-skill
Voice conversations with Claude about your projects. Call a phone number to brainstorm, or have Claude call you with updates.
🦀 ClawHub
FactuCat CLI
Use this skill when an agent needs to install, update, authenticate, or operate the FactuCat CLI to create Mexican CFDI 4.0 invoice drafts, assign customers...
🦀 ClawHub
Voice Reply
Local text-to-speech using Piper voices via sherpa-onnx. 100% offline, no API keys required. Use when user asks for a voice reply, audio response, spoken answer, or wants to hear something read aloud. Supports multiple languages including German (thorsten) and English (ryan) voices. Outputs Telegram-compatible voice notes with [[audio_as_voice]] tag.
🦀 ClawHub
Alicloud Ai Audio Cosyvoice Voice Design
Use when designing custom voices with Alibaba Cloud Model Studio CosyVoice customization models, especially cosyvoice-v3.5-plus or cosyvoice-v3.5-flash, from...
🦀 ClawHub
Voice Agent
Local Voice Input/Output for Agents using the AI Voice Agent API.
🦀 ClawHub
Linkedin Monitor
Bulletproof LinkedIn inbox monitoring with progressive autonomy. Monitors messages hourly, drafts replies in your voice, and alerts you to new conversations. Supports 4 autonomy levels from monitor-only to full autonomous.
🦀 ClawHub
Zoom Meeting Assistance Rtms Unofficial Community
Zoom RTMS Meeting Assistant — start on-demand to capture meeting audio, video, transcript, screenshare, and chat via Zoom Real-Time Media Streams. Handles meeting.rtms_started and meeting.rtms_stopped webhook events. Provides AI-powered dialog suggestions, sentiment analysis, and live summaries with WhatsApp notifications. Use when a Zoom RTMS webhook fires or the user asks to record/analyze a meeting.
🦀 ClawHub
Video Chat With Me
Real-time AI video chat that routes through your OpenClaw agent. Uses Groq Whisper (cloud STT), edge-tts (cloud TTS via Microsoft), and OpenClaw chatCompletions API for conversation. Your agent sees your camera, hears your voice, and responds with its own personality and memory. Requires: GROQ_API_KEY for speech recognition. Reads ~/.openclaw/openclaw.json for gateway port and auth token. Data flows: audio → Groq cloud (STT), TTS text → Microsoft cloud (edge-tts), camera frames (base64) + text →
🦀 ClawHub
discord voice memo upgrade
Provides a patch for Clawdbot fixing TTS auto-replies on inbound voice memos by disabling block streaming to ensure final payload reaches TTS pipeline.
🦀 ClawHub
FFmpeg CLI
Process video and audio using FFmpeg CLI for transcoding, cutting, merging, audio extraction, thumbnails, GIFs, speed, filters, subtitles, and watermarks.
🦀 ClawHub
Voice Note To Midi
Convert voice notes, humming, and melodic audio recordings to quantized MIDI files using ML-based pitch detection and intelligent post-processing
Page 1 / 46 (2,189 skills)