BytesAgain

Find the Right AI Skill for Any Job

Browse 91+ curated AI agent skills. Search by use case, filter by category, get the right tool instantly.


All Skills - audio

91 skills in "audio" matching "Language"

🦀 ClawHub
WaveSpeedAI MiniMax Speech 2.6 TTS
Convert text to speech using MiniMax Speech 2.6 Turbo via WaveSpeed AI. Features ultra-human voice cloning, sub-250ms latency, 40+ languages, emotion control...
🦀 ClawHub
Voice.ai Voices
High-quality voice synthesis with 9 personas, 11 languages, and streaming using Voice.ai API.
🦀 ClawHub
whatsappVoiceOpenSkill
Real-time WhatsApp voice message processing. Transcribe voice notes to text via Whisper, detect intent, execute handlers, and send responses. Use when building conversational voice interfaces for WhatsApp. Supports English and Hindi, customizable intents (weather, status, commands), automatic language detection, and streaming responses via TTS.
🦀 ClawHub
Voice Reply
Local text-to-speech using Piper voices via sherpa-onnx. 100% offline, no API keys required. Use when user asks for a voice reply, audio response, spoken answer, or wants to hear something read aloud. Supports multiple languages including German (thorsten) and English (ryan) voices. Outputs Telegram-compatible voice notes with [[audio_as_voice]] tag.
🦀 ClawHub
Slides/PPT generation and voice narration
AI-powered presentation generation using 2slides API. Create slides from text content, match reference image styles, or summarize documents into presentations. Use when users request to "create a presentation", "make slides", "generate a deck", "create slides from this content/document/image", or any presentation creation task. Supports theme selection, multiple languages, and both synchronous and asynchronous generation modes.
🦀 ClawHub
ElevenLabs Voices
High-quality voice synthesis with 18 personas, 32 languages, sound effects, batch processing, and voice design using ElevenLabs API.
🦀 ClawHub
Video Subtitles
Generate SRT subtitles from video/audio with translation support. Transcribes Hebrew (ivrit.ai) and English (whisper), translates between languages, burns subtitles into video. Use for creating captions, transcripts, or hardcoded subtitles for WhatsApp/social media.
🦀 ClawHub
Nex Einvoice
Generate Belgian-compliant e-invoices in the Peppol BIS 3.0 UBL format from natural language input in Dutch or English, satisfying mandatory requirements for...
🦀 ClawHub
Video To Text
Convert video or audio files from URLs into text or subtitle formats using a free API with automatic language detection and no local downloads required.
🦀 ClawHub
Skill
🎀 AgentVibes TTS for Claude Code & OpenClaw - Switch voices, set personality, control speed, background music, language learning mode, reverb/effects, and m...
🦀 ClawHub
Prompt Refiner
Transforms casual or voice-transcribed user requests into precise, AI-optimized prompts. Handles mixed languages, vague input, and ambiguity. Reduces task ex...
🦀 ClawHub
video-translation
Translate and dub videos from one language to another, replacing the original audio with TTS while keeping the video intact.
🦀 ClawHub
SatsRail MCP - Bitcoin Lightning Payments for AI Agents
Enable AI agents to create Bitcoin Lightning payment orders, generate invoices, check payment status, and manage payments via natural language with SatsRail...
🦀 ClawHub
Speech Language Pathologist Video
Creates short videos for speech-language pathologists to explain evaluation, therapy, and family coaching for pediatric and adult communication development.
🦀 ClawHub
Freelancer Business Autopilot Lite
Free version - generate invoices and weekly client updates from plain-language descriptions.
🦀 ClawHub
Quotation Generator
Auto-generate professional PDF proforma invoices with company letterhead, multi-language support, and post-quote tracking.
🦀 ClawHub
Humanizer
Remove signs of AI-generated writing from text. Use when editing or reviewing text to make it sound more natural and human-written. Based on Wikipedia's comprehensive "Signs of AI writing" guide. Detects and fixes patterns including: inflated symbolism, promotional language, superficial -ing analyses, vague attributions, em dash overuse, rule of three, AI vocabulary words, negative parallelisms, and excessive conjunctive phrases.
🦀 ClawHub
Edge TTS CN
Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch contro...
🦀 ClawHub
Natural Language Editor
Rewrite user-provided text to sound natural, clear, and smooth without changing meaning or factual content. Use when polishing drafts, removing robotic phras...
🦀 ClawHub
Taste Shakespeare
Aesthetic skill for AI agents - Shakespeare's literary voice and dramatic language. Style tokens and creative direction distilled from 111 works.
🦀 ClawHub
Language Tutor
Create language learning audio with SenseAudio TTS, including pronunciation drills, bilingual lessons, slowed speech practice, and dialogue exercises. Use wh...
🦀 ClawHub
Pronunciation Coach
Foreign language pronunciation coach - listen to standard TTS pronunciation, record yourself, get word-by-word feedback on what was wrong, then practice targ...
🦀 ClawHub
Speech is Cheap Transcribe
Fast, affordable automatic speech-to-text transcription supporting 100 languages, speaker diarization, word timestamps, and customizable output formats.
🦀 ClawHub
tts
Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch contro...
🦀 ClawHub
ffmpeg-video-editor
Generate FFmpeg commands from natural language video editing requests - cut, trim, convert, compress, change aspect ratio, extract audio, and more.
🦀 ClawHub
Seedance 2.0 - AI Video by ByteDance
Generate AI videos using ByteDance's Seedance 1.5 Pro - a native audio-visual joint generation model with cinematic camera control, multi-language lip-sync,...
🦀 ClawHub
Whisper Transcribe
Transcribe audio files to text using OpenAI Whisper. Supports speech-to-text with auto language detection, multiple output formats (txt, srt, vtt, json), batch processing, and model selection (tiny to large). Use when transcribing audio recordings, podcasts, voice messages, lectures, meetings, or any audio/video file to text. Handles mp3, wav, m4a, ogg, flac, webm, opus, aac formats.
🦀 ClawHub
Qwen Asr Skill
Provides high-accuracy speech-to-text conversion supporting 22 Chinese dialects and 30 languages with automatic language detection, running on CPU.
🦀 ClawHub
TTS WhatsApp
Send high-quality text-to-speech voice messages on WhatsApp in 40+ languages with automatic delivery
🦀 ClawHub
Clawhub Skill Content Writer
From topic to published blog post in one conversation - generate SEO- and GEO-optimized articles with AI illustrations and voice-over in 55 languages, create...
🦀 ClawHub
Humanize
Remove AI writing patterns from text. Use when editing, reviewing, or rewriting text to sound more natural and human-written. Detects patterns like inflated symbolism, promotional language, em dash overuse, AI vocabulary, and sycophantic tone.
🦀 ClawHub
baml-codegen
Use when generating BAML code for type-safe LLM extraction, classification, RAG, or agent workflows - creates complete .baml files with types, functions, clients, tests, and framework integrations from natural language requirements. Queries official BoundaryML repositories via MCP for real-time patterns. Supports multimodal inputs (images, audio), Python/TypeScript/Ruby/Go, 10+ frameworks, 50-70% token optimization, 95%+ compilation success.
🦀 ClawHub
Lofy Home
Smart home control for the Lofy AI assistant - scene modes (study, chill, sleep, morning, grind), device management via Home Assistant REST API, presence-based automation, natural language commands for lights, music, thermostat, and PC wake-on-LAN. Use when controlling smart home devices, activating scene modes, or managing home automation.
🦀 ClawHub
Edge TTS
Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.
🦀 ClawHub
Sarvam AI
Use Sarvam AI for Indian language Text-to-Speech (TTS), Speech-to-Text (STT), Translation, and Chat.
🦀 ClawHub
yap
Fast on-device speech-to-text transcription on macOS 26+ using Apple Speech.framework, supporting multiple languages and output formats without model downloads.
🦀 ClawHub
midasheng-audio-text-distance
Multilingual audio-text retrieval and classification using GLAP (General Language Audio Pretraining). Use when user needs to search/match audio files against...
🦀 ClawHub
Skywork Music Maker
AI song and music generator - create songs with vocals, instrumentals, beats, and lyrics from a text description in any language. Generate lo-fi beats, pop s...
🦀 ClawHub
Video Dubbing
Guide users to VideoAny AI Video Dubbing tool to dub video or audio into a target language.
🦀 ClawHub
it will help you to send voice messages to your AI Assistant and also can make it talk
Text-to-Speech and Speech-to-Text using ElevenLabs AI. Use when the user wants to convert text to speech, transcribe voice messages, or work with voice in multiple languages. Supports high-quality AI voices and accurate transcription.
🦀 ClawHub
Truly Local Piper Multilang TTS (secure)
Local offline text-to-speech via Piper TTS. Self-contained setup, automatic language detection, per-call voice selection. Extensible to any language. Writes...
🦀 ClawHub
Qwen3-tts
Local text-to-speech using Qwen3-TTS-12Hz-1.7B-CustomVoice. Use when generating audio from text, creating voice messages, or when TTS is requested. Supports 10 languages including Italian, 9 premium speaker voices, and instruction-based voice control (emotion, tone, style). Alternative to cloud-based TTS services like ElevenLabs. Runs entirely offline after initial model download.
🦀 ClawHub
Imam
Virtual Imam that leads the five daily Islamic prayers via voice, delivers Friday Jumu'ah khutbahs, and interacts with mussalis in multiple languages.
🦀 ClawHub
Addis Assistant
Provides Speech-to-Text (STT) and text Translation using the Addis Assistant API (api.addisassistant.com). Use when the user needs to convert an audio file to text (specifically Amharic), or translate text between languages (e.g., Amharic to English). Requires 'x-api-key'.
🦀 ClawHub
Simple stt(sound-to-text) locally
Simple local Speech-To-Text using Whisper. One-command install with auto model download. Supports 99+ languages.
🦀 ClawHub
BGM Maker
Generate original background music for short videos from a natural language description. Use when creators need royalty-free BGM, video background music, or...
🦀 ClawHub
Volcengine TTS Audio Synthesis
Text-to-speech generation on Volcengine (ByteDance) speech services. Use when users need narration, multi-language speech output, voice selection, or TTS tro...
🦀 ClawHub
Spotify Playlist Builder
Build and manage Spotify playlists from natural language requests. Search tracks/artists/albums, create playlists, manage tracks, view listening history. Use...
Page 1 / 2 (91 skills)