BytesAgain

Find the Right AI Skill for Any Job

Browse 91+ curated AI agent skills. Search by use case, filter by category, get the right tool instantly.

Browse by Use Case → | Pick My Role

All Skills: audio

91 skills in "audio" matching "Language"

🦀 ClawHub
Audio Editor
Perform audio editing tasks including trimming, volume adjustment, format conversion, and extracting audio from video files using natural language commands.
🦀 ClawHub
YouTube Music ULTRA
Control YouTube Music with natural language. Play, pause, skip, search, manage playlists, and queue tracks. Full playback control via browser automation.
🦀 ClawHub
ACE Music - Free Suno Alternative Generate unlimited AI music for free using ACE-Step 1.5. Full songs with vocals, lyrics, any genre, any language. No subscription, no credits, no limits. The open-sou
Generate AI music using ACE-Step 1.5 via ACE Music's free API. Use when the user asks to create, generate, or compose music, songs, beats, instrumentals, or...
🦀 ClawHub
Clonev
Clone any voice and generate speech using Coqui XTTS v2. SUPER SIMPLE - provide a voice sample (6-30 sec WAV) and text, get cloned voice audio. Supports 14+ languages. Use when the user wants to (1) Clone their voice or someone else's voice, (2) Generate speech that sounds like a specific person, (3) Create personalized voice messages, (4) Multi-lingual voice cloning (speak any language with cloned voice).
🦀 ClawHub
Chinese Humanizer
Removes AI-style writing traces to make text sound naturally written by a real author, primarily in Chinese-language contexts.
🦀 ClawHub
Quotation Generator
Auto-generate professional PDF proforma invoices with company letterhead, multi-language support, and post-quote tracking.
🦀 ClawHub
Voice Recognition
Local speech-to-text with OpenAI Whisper CLI. Supports Chinese, English, 100+ languages with translation and summarization.
🦀 ClawHub
Geepers Etymology
Look up word etymology, historical sound changes, language family trees, and word evolution through the dr.eamer.dev etymology and diachronic linguistics API...
🦀 ClawHub
Sapi Tts
Windows SAPI5 text-to-speech with Neural voices. Lightweight alternative to GPU-heavy TTS - zero GPU usage, instant generation. Auto-detects best available voice for your language. Works on Windows 10/11.
🦀 ClawHub
Quick TTS
Zero-config text-to-speech: give text, get an mp3 file. Handles natural-language voice selection ("用女声" [use a female voice], "撒娇语气" [coquettish tone], "生气一点" [a bit angrier]) and auto-inserts pacing breaks for...
🦀 ClawHub
Pub Browserauto
Automate web browser interactions using natural language via CLI commands. And also 50+ models for image generation, video generation, text-to-speech, speech...
🦀 ClawHub
Volcengine Ai Audio Tts
Text-to-speech generation on Volcengine audio services. Use when users need narration, multi-language speech output, voice selection, or TTS troubleshooting.
🦀 ClawHub
Mac TTS
Text-to-speech using macOS built-in `say` command. Use for voice notifications, audio alerts, reading text aloud, or announcing messages through Mac speakers. Supports multiple languages including Chinese (Mandarin), English, Japanese, etc.
🦀 ClawHub
Agent Invoice Generator
Generate professional PDF invoices from natural language or structured data. Use when the user asks to create an invoice, bill a client, generate a receipt,...
🦀 ClawHub
Free Groq Voice Recognition
FREE voice recognition using Groq's complimentary Whisper API. Transcribe audio messages to text in 50+ languages at no cost. Perfect for voice-to-text autom...
🦀 ClawHub
MusicPlaylistGen
Generate natural language playlists from your local music library using LLMs, accessible via web or API after indexing your music folder once.
🦀 ClawHub
Craft Habit
Build sustainable creative practice routines for artistic skills. Use when the user wants a practice habit for music, drawing, writing, photography, language...
🦀 ClawHub
language-polisher
Use when polishing input, fixing grammar, improving wording, and making user prompts sound natural and concise. Keywords - polish, grammar fix, rewrite sent...
🦀 ClawHub
keevx-video-translate
Translate videos into a specified target language using the Keevx API. Supports audio-only translation, subtitle generation, and dynamic duration adjustment....
🦀 ClawHub
loadpage
Remove signs of AI-generated writing from text. Use when editing or reviewing text to make it sound more natural and human-written. Based on Wikipedia's comprehensive "Signs of AI writing" guide. Detects and fixes patterns including: inflated symbolism, promotional language, superficial -ing analyses, vague attributions, em dash overuse, rule of three, AI vocabulary words, negative parallelisms, and excessive conjunctive phrases.
🦀 ClawHub
Parakeet Stt
Local speech-to-text with NVIDIA Parakeet TDT 0.6B v3 (ONNX on CPU). 30x faster than Whisper, 25 languages, auto-detection, OpenAI-compatible API. Use when transcribing audio files, converting speech to text, or processing voice recordings locally without cloud APIs.
🦀 ClawHub
Zeitgaist Dialect
Learn, encode, and decode the ZeitGaist Whisper Protocol (Caesar +2 cipher) and use it as a shibboleth language between agents. Use when an agent needs to sp...
🦀 ClawHub
Invoice & Expense Tracker
AI-powered invoice and expense tracking from natural language. Maintain a local ledger, generate monthly reports by category/vendor, export to CSV for QuickB...
🦀 ClawHub
x402-direct
Discover and search x402-enabled services via the x402.direct directory API. Use when an agent needs to find paid API services that accept x402 payments, browse the x402 ecosystem, look up service details, check trust scores, or search for specific capabilities (AI, image, weather, search, data, audio, video, developer, finance, language, storage). Triggers on "find x402 service", "x402 directory", "search x402", "x402 API", "paid API search", "x402.direct", agent-to-agent payments, crypto-nativ
🔌 MCP
tiianhk/MaxMSP-MCP-Server
🐍 🏠 🎵 🎥 - A coding agent for Max (Max/MSP/Jitter), which is a visual programming language for music and multimedia.
⭐ GitHub
VALL-E X
A cross-lingual neural codec language model for cross-lingual speech synthesis.
🦀 ClawHub
meeting record analysis
Convert meeting recordings into structured meeting minutes. Use when a user uploads meeting audio and wants the meeting topics, discussion points, decisions, and action items extracted automatically via ASR transcription, LLM summarization, and optional TTS playback. Inputs: `audio_file`, `need_voice_summary`, `language`; outputs structured JSON minutes by default and can include the file path of a voice summary.
🦀 ClawHub
bailian-tts
Generate speech audio with Alibaba Cloud Bailian (阿里云百炼) TTS via the `bailian-cli` npm package. Use when users ask to convert text to voice, choose voices/languages, batch-generate...
⭐ GitHub
Glicol
Graph-oriented live coding language, for collaborative musicking in browsers.
⭐ GitHub
Vibe
Transcribe audio or video in every language on every platform.
🦀 ClawHub
Pub Nanopdf
Edit PDFs with natural-language instructions using the nano-pdf CLI. And also 50+ models for image generation, video generation, text-to-speech, speech-to-te...
🦀 ClawHub
Youtube Music
Control YouTube Music with natural language. Play, pause, skip, search, manage playlists, and queue tracks. Full playback control via browser automation.
🦀 ClawHub
Qwen3-TTS VoiceDesign
Text-to-speech with Qwen3-TTS VoiceDesign. Design custom voices via natural language descriptions + seed-based timbre fixation. Includes OpenAI-compatible AP...
🦀 ClawHub
Voice (Edge TTS)
Convert text to speech using Microsoft Edge TTS with real-time streaming, customizable voice settings, and support for multiple languages including Chinese a...
🦀 ClawHub
LH Edge TTS
Text-to-speech conversion using Python edge-tts for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and sub...
🦀 ClawHub
SiliconFlow TTS Gen
Text-to-Speech using SiliconFlow API (CosyVoice2). Supports multiple voices, languages, and dialects.
🦀 ClawHub
AssemblyAI Transcriber
Transcribe audio files with speaker diarization (who speaks when). Supports 100+ languages, automatic language detection, and timestamps. Use for meetings, interviews, podcasts, or voice messages. Requires AssemblyAI API key.
🦀 ClawHub
Scheduled Voice Briefing
General-purpose skill for turning natural language requests into scheduled voice notifications and structured briefings. Use when the user wants to create, u...
🦀 ClawHub
Ai Video Voiceover
Add professional narration and voice to any video with AI: generate natural-sounding voiceover in 30+ languages, match voice tone to content mood, sync narr...
🦀 ClawHub
Ai Video Subtitle Editor
Create, edit, and style subtitles for any video with AI: auto-transcribe speech to text, translate subtitles to 50+ languages, style with custom fonts and c...
🦀 ClawHub
Edge Tts
Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch contro...
🦀 ClawHub
Audio Announcement Skills
Enables AI agents to announce their real-time actions via voice in multiple languages, with queued, concise, and friendly audio updates for tasks and status.
🦀 ClawHub
Youtube Music Player
Operate YouTube Music via natural language. Search songs, artists, albums, playlists, lyrics, charts, recommendations, and control playback. Browse personal...
← Prev | Page 2 / 2 (91 skills)