Find the Right AI Skill for Any Job

Browse 91+ curated AI agent skills. Search by use case, filter by category, and get the right tool instantly.

All Skills — audio

91 skills in "audio" matching "Language"

Convert text to speech using MiniMax Speech 2.6 Turbo via WaveSpeed AI. Features ultra-human voice cloning, sub-250ms latency, 40+ languages, emotion control...

🦀 ClawHub

Voice.ai Voices

High-quality voice synthesis with 9 personas, 11 languages, and streaming using Voice.ai API.

🦀 ClawHub

whatsappVoiceOpenSkill

Real-time WhatsApp voice message processing. Transcribe voice notes to text via Whisper, detect intent, execute handlers, and send responses. Use when building conversational voice interfaces for WhatsApp. Supports English and Hindi, customizable intents (weather, status, commands), automatic language detection, and streaming responses via TTS.

🦀 ClawHub

Voice Reply

Local text-to-speech using Piper voices via sherpa-onnx. 100% offline, no API keys required. Use when user asks for a voice reply, audio response, spoken answer, or wants to hear something read aloud. Supports multiple languages including German (thorsten) and English (ryan) voices. Outputs Telegram-compatible voice notes with [[audio_as_voice]] tag.

🦀 ClawHub

Slides/PPT generation and voice narration

AI-powered presentation generation using 2slides API. Create slides from text content, match reference image styles, or summarize documents into presentations. Use when users request to "create a presentation", "make slides", "generate a deck", "create slides from this content/document/image", or any presentation creation task. Supports theme selection, multiple languages, and both synchronous and asynchronous generation modes.

🦀 ClawHub

ElevenLabs Voices

High-quality voice synthesis with 18 personas, 32 languages, sound effects, batch processing, and voice design using ElevenLabs API.

🦀 ClawHub

Video Subtitles

Generate SRT subtitles from video/audio with translation support. Transcribes Hebrew (ivrit.ai) and English (whisper), translates between languages, burns subtitles into video. Use for creating captions, transcripts, or hardcoded subtitles for WhatsApp/social media.

🦀 ClawHub

Nex Einvoice

Generate Belgian-compliant e-invoices in the Peppol BIS 3.0 UBL format from natural language input in Dutch or English, satisfying mandatory requirements for...

🦀 ClawHub

Video To Text

Convert video or audio files from URLs into text or subtitle formats using a free API with automatic language detection and no local downloads required.

🦀 ClawHub

Skill

🎤 AgentVibes TTS for Claude Code & OpenClaw — Switch voices, set personality, control speed, background music, language learning mode, reverb/effects, and m...

🦀 ClawHub

Prompt Refiner

Transforms casual or voice-transcribed user requests into precise, AI-optimized prompts. Handles mixed languages, vague input, and ambiguity. Reduces task ex...

🦀 ClawHub

video-translation

Translate and dub videos from one language to another, replacing the original audio with TTS while keeping the video intact.

🦀 ClawHub

SatsRail MCP — Bitcoin Lightning Payments for AI Agents

Enable AI agents to create Bitcoin Lightning payment orders, generate invoices, check payment status, and manage payments via natural language with SatsRail...

🦀 ClawHub

Speech Language Pathologist Video

Creates short videos for speech-language pathologists to explain evaluation, therapy, and family coaching for pediatric and adult communication development.

🦀 ClawHub

Freelancer Business Autopilot Lite

Free version — generate invoices and weekly client updates from plain-language descriptions.

🦀 ClawHub

Quotation Generator

Auto-generate professional PDF proforma invoices with company letterhead, multi-language support, and post-quote tracking.

🦀 ClawHub

Humanizer

Remove signs of AI-generated writing from text. Use when editing or reviewing text to make it sound more natural and human-written. Based on Wikipedia's comprehensive "Signs of AI writing" guide. Detects and fixes patterns including: inflated symbolism, promotional language, superficial -ing analyses, vague attributions, em dash overuse, rule of three, AI vocabulary words, negative parallelisms, and excessive conjunctive phrases.

🦀 ClawHub

Edge TTS CN

Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch contro...

🦀 ClawHub

Natural Language Editor

Rewrite user-provided text to sound natural, clear, and smooth without changing meaning or factual content. Use when polishing drafts, removing robotic phras...

🦀 ClawHub

Taste Shakespeare

Aesthetic skill for AI agents — Shakespeare's literary voice and dramatic language. Style tokens and creative direction distilled from 111 works.

🦀 ClawHub

Language Tutor

Create language learning audio with SenseAudio TTS, including pronunciation drills, bilingual lessons, slowed speech practice, and dialogue exercises. Use wh...

🦀 ClawHub

Pronunciation Coach

Foreign language pronunciation coach — listen to standard TTS pronunciation, record yourself, get word-by-word feedback on what was wrong, then practice targ...

🦀 ClawHub

Speech is Cheap Transcribe

Fast, affordable automatic speech-to-text transcription supporting 100 languages, speaker diarization, word timestamps, and customizable output formats.

🦀 ClawHub

tts

Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch contro...

🦀 ClawHub

ffmpeg-video-editor

Generate FFmpeg commands from natural language video editing requests - cut, trim, convert, compress, change aspect ratio, extract audio, and more.

🦀 ClawHub

Seedance 2.0 — AI Video by ByteDance

Generate AI videos using ByteDance's Seedance 1.5 Pro — a native audio-visual joint generation model with cinematic camera control, multi-language lip-sync,...

🦀 ClawHub

Whisper Transcribe

Transcribe audio files to text using OpenAI Whisper. Supports speech-to-text with auto language detection, multiple output formats (txt, srt, vtt, json), batch processing, and model selection (tiny to large). Use when transcribing audio recordings, podcasts, voice messages, lectures, meetings, or any audio/video file to text. Handles mp3, wav, m4a, ogg, flac, webm, opus, aac formats.

🦀 ClawHub

Qwen Asr Skill

Provides high-accuracy speech-to-text conversion supporting 22 Chinese dialects and 30 languages with automatic language detection, running on CPU.

🦀 ClawHub

TTS WhatsApp

Send high-quality text-to-speech voice messages on WhatsApp in 40+ languages with automatic delivery.

🦀 ClawHub

Clawhub Skill Content Writer

From topic to published blog post in one conversation — generate SEO- and GEO-optimized articles with AI illustrations and voice-over in 55 languages, create...

🦀 ClawHub

Humanize

Remove AI writing patterns from text. Use when editing, reviewing, or rewriting text to sound more natural and human-written. Detects patterns like inflated symbolism, promotional language, em dash overuse, AI vocabulary, and sycophantic tone.

🦀 ClawHub

baml-codegen

Use when generating BAML code for type-safe LLM extraction, classification, RAG, or agent workflows - creates complete .baml files with types, functions, clients, tests, and framework integrations from natural language requirements. Queries official BoundaryML repositories via MCP for real-time patterns. Supports multimodal inputs (images, audio), Python/TypeScript/Ruby/Go, 10+ frameworks, 50-70% token optimization, 95%+ compilation success.

🦀 ClawHub

Lofy Home

Smart home control for the Lofy AI assistant — scene modes (study, chill, sleep, morning, grind), device management via Home Assistant REST API, presence-based automation, natural language commands for lights, music, thermostat, and PC wake-on-LAN. Use when controlling smart home devices, activating scene modes, or managing home automation.

🦀 ClawHub

Edge TTS

Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.

🦀 ClawHub

Sarvam AI

Use Sarvam AI for Indian language Text-to-Speech (TTS), Speech-to-Text (STT), Translation, and Chat.

🦀 ClawHub

yap

Fast on-device speech-to-text transcription on macOS 26+ using Apple Speech.framework, supporting multiple languages and output formats without model downloads.

🦀 ClawHub

midasheng-audio-text-distance

Multilingual audio-text retrieval and classification using GLAP (General Language Audio Pretraining). Use when user needs to search/match audio files against...

🦀 ClawHub

Skywork Music Maker

AI song and music generator — create songs with vocals, instrumentals, beats, and lyrics from a text description in any language. Generate lo-fi beats, pop s...

🦀 ClawHub

Video Dubbing

Guide users to VideoAny AI Video Dubbing tool to dub video or audio into a target language.

🦀 ClawHub

it will help you to send voice messages to your AI Assistant and also can make it talk

Text-to-Speech and Speech-to-Text using ElevenLabs AI. Use when the user wants to convert text to speech, transcribe voice messages, or work with voice in multiple languages. Supports high-quality AI voices and accurate transcription.

🦀 ClawHub

Truly Local Piper Multilang TTS (secure)

Local offline text-to-speech via Piper TTS. Self-contained setup, automatic language detection, per-call voice selection. Extensible to any language. Writes...

🦀 ClawHub

Qwen3-tts

Local text-to-speech using Qwen3-TTS-12Hz-1.7B-CustomVoice. Use when generating audio from text, creating voice messages, or when TTS is requested. Supports 10 languages including Italian, 9 premium speaker voices, and instruction-based voice control (emotion, tone, style). Alternative to cloud-based TTS services like ElevenLabs. Runs entirely offline after initial model download.

🦀 ClawHub

Imam

Virtual Imam that leads the five daily Islamic prayers via voice, delivers Friday Jumu'ah khutbahs, and interacts with musallis in multiple languages.

🦀 ClawHub

Addis Assistant

Provides Speech-to-Text (STT) and text Translation using the Addis Assistant API (api.addisassistant.com). Use when the user needs to convert an audio file to text (specifically Amharic), or translate text between languages (e.g., Amharic to English). Requires 'x-api-key'.

🦀 ClawHub

Simple stt(sound-to-text) locally

Simple local Speech-To-Text using Whisper. One-command install with auto model download. Supports 99+ languages.

🦀 ClawHub

BGM Maker

Generate original background music for short videos from a natural language description. Use when creators need royalty-free BGM, video background music, or...

🦀 ClawHub

Volcengine TTS Audio Synthesis

Text-to-speech generation on Volcengine (ByteDance) speech services. Use when users need narration, multi-language speech output, voice selection, or TTS tro...

🦀 ClawHub

Spotify Playlist Builder

Build and manage Spotify playlists from natural language requests. Search tracks/artists/albums, create playlists, manage tracks, view listening history. Use...
