Find the Right AI Skill for Any Job

Browse 2,510+ curated AI agent skills. Search by use case, filter by category, and get the right tool instantly.

All Skills — audio

2,510 skills in "audio"

🦀 ClawHub

Elderly Voice Assistant

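Voice assistant for seniors: older users simply speak to their phone to send messages, check the weather, set alarms, and listen to traditional opera, with no need to learn any operations.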

🦀 ClawHub

feishu-video

Send voice/audio messages to Feishu (Lark) users. Converts audio files to OPUS format and sends them as voice messages, not file attachments. Created by Alex.

🦀 ClawHub

Ai Video Editor Free

AI Video Editor Free - Free Online AI Video Editing Tool No Watermark. Free AI-powered video editor — trim, cut, merge clips, add subtitles, background music...

🦀 ClawHub

Qwen3/Free Text-to-Speech and Voice Cloning

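🆓 A completely free, local text-to-speech (TTS) and voice cloning skill. Based on the Qwen3-TTS-1.7B model, it supports Apple Silicon, works offline, and protects privacy. Suitable for audiobook production, AI character voice-over, accessibility applications, and similar scenarios.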

🦀 ClawHub

feishu voice reply

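Synthesizes multi-voice speech with Volcano Engine TTS, converts it to Opus format, then automatically uploads and sends it as a voice message through the Feishu API.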

🦀 ClawHub

openclaw-skill-customs

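Customs declaration document assistant. Upload customs documents (invoices, packing lists, bills of lading, etc.); the AI automatically classifies and identifies file types, extracts structured declaration data, and generates a standard customs declaration Excel file. Use this skill when the user mentions keywords such as 报关 (customs declaration), 海关 (customs), customs declaration, invoice, packing list, bill of lading, or HS 编码 (HS code).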

🦀 ClawHub

XunFei Voice Reply

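Voice reply skill: generates speech with XunFei (iFlytek) TTS and sends it to Feishu. Use when a user message should be answered by voice. Trigger phrases: 用语音 (use voice), 语音回复 (voice reply), 切换语音模式 (switch to voice mode), 语音模式 (voice mode).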

🦀 ClawHub

Video Transcriber

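Video transcription workflow supporting Bilibili and YouTube videos. Automatically detects whether subtitles exist: if they do, it fetches them; if not, it downloads the audio and transcribes it with whisper. Trigger scenarios: (1) the user asks for a summary of a video's content; (2) the user asks for a video's subtitles; (3) the user asks for a video transcription; (4) processing Bilibili/YouTube videos.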

🦀 ClawHub

Agent Board

Build multi-panel storyboards programmatically — create projects, upload images/audio to boards, composite annotations, export PDFs, share via public URL. In...

🦀 ClawHub

voice-text-to-meme

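Generates a single meme image from input-method speech-recognition text or from polished text. Suitable when the user wants to turn a sentence into a chat-ready meme, captioned joke image, or sticker. Supports both raw speech text and polished text as input, preferring polished text by default; automatically picks a style based on the tone of the text; by default generates the image with text directly, and also supports generating a text-free image together with a caption template; uses doubao-seedream...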

🦀 ClawHub

Slovenian

Write Slovenian that sounds human. Not formal, not robotic, not AI-generated.

🦀 ClawHub

Pixbim Lip Sync Ai

Turn any video into a perfectly lip-synced production using pixbim-lip-sync-ai — the tool that matches mouth movements to dialogue, dubbing, or voiceover wit...

🦀 ClawHub

Audio Announcement

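Announces AI operation status by voice in real time. Supports multiple languages and message queuing, improves transparency and the user's sense of security, and works across multiple platforms.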

🦀 ClawHub

Novita AI Multimodal

Execute multimodal tasks using Novita AI: text-to-image, image-to-image, text-to-video, image-to-video, TTS, STT. Use for: generating images, generating vide...

🦀 ClawHub

Ntriq Audio Intelligence Mcp

Transcribe, summarize, and analyze audio files using local Whisper + Qwen. Returns transcript, segments, and action items.

🦀 ClawHub

muapi-media-generation

Generate AI images, videos, music, and audio from the terminal via muapi.ai — supports 100+ models including Flux, Midjourney v7, Kling 3.0, Veo3, and Suno V5

🦀 ClawHub

WebChat Voice Proxy

⚠️ DEPRECATED — This skill has been split into two separate skills for better modularity: **webchat-https-proxy** (HTTPS/WSS reverse proxy) and **webchat-voi...

🦀 ClawHub

Seedance

Generate detailed, production-ready cinematic video prompts following Seedance 2.0’s strict Subject-Action-Camera-Style-Audio-Constraints format for AI video...

🦀 ClawHub

Firm Spec Compliance Pack

MCP 2025-11-25 specification compliance audit pack. Validates elicitation, tasks, resources/prompts, audio content, JSON Schema 2020-12, SSE transport, and i...

🦀 ClawHub

Auto Subtitle Generator Free

Tired of manually transcribing every word just to add subtitles to your videos? The auto-subtitle-generator-free skill automatically detects speech and gener...

🦀 ClawHub

Ai Lip Sync Video

Drop a video and a new audio track, and watch mouths move in perfect sync — no studio, no reshoots required. This ai-lip-sync-video skill analyzes facial mov...

🦀 ClawHub

Speak

Configure TTS in OpenClaw. Adapt speech output to user preferences.

🦀 ClawHub

Castreader Openclaw Skill

Read any web page aloud with natural AI voices. Extract article text from any URL and convert it to audio (MP3). Use when the user wants to: listen to a webp...

🦀 ClawHub

Jazz Music — Stream Jazz Concerts: Audio Analysis, Lyrics, Equations

Experience jazz as data. AI agents stream harmonic separation, chroma, tonnetz. Error incorporation measured.

🦀 ClawHub

macos-audio

Manage macOS audio output and Bluetooth devices via the macos-audio CLI. Use when scanning paired devices, connecting or disconnecting Bluetooth, switching a...

🦀 ClawHub

Batch Video Transcription (视频批量转录)

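General-purpose batch video transcription tool. Supports 1000+ websites (Bilibili, YouTube, Douyin, Twitch, etc.); uses yt-dlp to batch-download video audio, runs GPU-accelerated speech-to-text (faster-whisper + CUDA), automatically corrects technical terms, and generates structured study notes. Supports resumable downloads, batch export, multiple output formats, and configuration for sites that require login.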

🦀 ClawHub

Douyin Video Transcribe

Douyin video transcription suite. Extract audio from Douyin/TikTok China videos, transcribe with Whisper, and analyze content. Supports video links, local fi...

🦀 ClawHub

Music Cog

Original music, fully yours. 5 seconds to 10 minutes using frontier music generation models. Instrumental and vocal tracks with perfect vocals. Cinematic sco...

🦀 ClawHub

Jiekou Multimodal

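Execute multimodal tasks using Jiekou AI (接口AI): text-to-image, image-to-image, text-to-video, image-to-video, TTS, STT. Use for: generating images, generating videos, text-to-speech, and speech recognition.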

🦀 ClawHub

reCameraV2

reCamera (RV1126B) device full-stack Web API reference covering authentication, device management, video/audio/image configuration, recording rules & storage...

🦀 ClawHub

SenseVoice Transcribe

Transcribe audio files (WAV/MP3/M4A/FLAC) to timestamped text using SenseVoice-Small + FSMN-VAD. Supports single-file and batch mode with VAD-anchored per-se...

⭐ GitHub

MusicLM

A model by Google Research for generating high-fidelity music from text descriptions.

🦀 ClawHub

Spanish

Write Spanish that sounds human. Not formal, not robotic, not AI-generated.

🦀 ClawHub

Pub Session Logs

Search and analyze your own session logs using jq. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, chat, w...

🦀 ClawHub

Pub Web Search

Search the web for information, find current content, and look up news articles. And also 50+ models for image generation, video generation, text-to-speech,...

🦀 ClawHub

Pub Clawdhub

Use the ClawdHub CLI to search, install, update, and publish agent skills. And also 50+ models for image generation, video generation, text-to-speech, speech...

🦀 ClawHub

Pub Vidframes

Extract frames or short clips from videos using ffmpeg. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, ch...

🦀 ClawHub

Step Asr

Transcribe audio files to text via Step ASR streaming API (HTTP SSE). Supports Chinese and English, multiple audio formats (PCM, WAV, MP3, OGG/OPUS), real-ti...

🦀 ClawHub

Pub Brave

Web search and content extraction via Brave Search API. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, ch...

🦀 ClawHub

AI media generation API - Flux2pro, Veo3.1, Suno Ai

AI image, video, and music generation + editing via VAP API. Flux, Veo 3.1, Suno V5.

🦀 ClawHub

Pub Whisper

Local speech-to-text with the Whisper CLI (no API key). And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, ch...

🦀 ClawHub

Pub Notion

Notion API for creating and managing pages, databases, and blocks. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text...

🦀 ClawHub

Pub Nanopdf

Edit PDFs with natural-language instructions using the nano-pdf CLI. And also 50+ models for image generation, video generation, text-to-speech, speech-to-te...

🦀 ClawHub

Greek Email Processor

Email processing for Greek accounting. Connects via IMAP to scan for financial documents, AADE notices, and invoices. Routes to local pipelines.

🦀 ClawHub

Self Improving Agent

Captures learnings, errors, and corrections to enable continuous improvement. And also 50+ models for image generation, video generation, text-to-speech, spe...

🦀 ClawHub

U2-audio-file-transcriber

Transcribe audio files via UniCloud ASR (云知声语音识别, recorded audio → text) API from UniSound. Supports multiple formats, optimized for finance, customer servic...

🦀 ClawHub

Bailian Studio

Call Aliyun Bailian via DashScope; support OCR, TTS, text-to-image and image-to-image.

🦀 ClawHub

Douban Sync

Export and sync Douban (豆瓣) book/movie/music/game collections to local CSV files (Obsidian-compatible). Use when the user wants to export their Douban readin...

Page 44 / 53 (2,510 skills)