BytesAgain

Find the Right AI Skill for Any Job

Browse 2,510+ curated AI agent skills. Search by use case, filter by category, get the right tool instantly.

All Skills — audio

2,510 skills in "audio"

🦀 ClawHub
Elderly Voice Assistant
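Voice assistant for seniors: elderly users simply speak to their phone to send messages, check the weather, set alarms, and listen to traditional opera, with no need to learn any operations.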
🦀 ClawHub
feishu-video
Send voice/audio messages to Feishu (Lark) users. Converts audio files to OPUS format and sends them as a voice message, not a file attachment. Created by Alex.
🦀 ClawHub
Ai Video Editor Free
AI Video Editor Free - Free Online AI Video Editing Tool No Watermark. Free AI-powered video editor — trim, cut, merge clips, add subtitles, background music...
🦀 ClawHub
Qwen3/Free Text-to-Speech and Voice Cloning
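🆓 Completely free local text-to-speech (TTS) and voice cloning skill. Based on the Qwen3-TTS-1.7B model, it supports Apple Silicon, works offline, and protects privacy. Suitable for audiobook production, AI character voiceover, accessibility applications, and similar scenarios.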
🦀 ClawHub
feishu voice reply
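Synthesizes multi-voice speech with Volcano Engine TTS, converts it to Opus format, then automatically uploads and sends it as a voice message via the Feishu API.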
🦀 ClawHub
openclaw-skill-customs
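Customs declaration document processing assistant. Upload customs documents (invoices, packing lists, bills of lading, etc.); AI automatically classifies and identifies the file types, extracts structured declaration data, and generates a standard customs declaration Excel file. Use this skill when the user mentions keywords such as 报关 (customs declaration), 海关 (customs), customs declaration, invoice, packing list, bill of lading, or HS 编码 (HS code).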
🦀 ClawHub
XunFei Voice Reply
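Voice reply skill: generates speech with XunFei (iFlytek) TTS and sends it to Feishu. Use when a user message should be answered by voice. Trigger phrases: 用语音 (use voice), 语音回复 (voice reply), 切换语音模式 (switch to voice mode), 语音模式 (voice mode).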
🦀 ClawHub
Video Transcriber
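Video transcription workflow supporting Bilibili and YouTube videos. Automatically detects whether subtitles exist: if so, it fetches them; if not, it downloads the audio and transcribes it with whisper. Trigger scenarios: (1) the user asks for a summary of a video (2) the user asks for a video's subtitles (3) the user asks for a video transcription (4) handling Bilibili/YouTube videos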
🦀 ClawHub
Agent Board
Build multi-panel storyboards programmatically — create projects, upload images/audio to boards, composite annotations, export PDFs, share via public URL. In...
🦀 ClawHub
voice-text-to-meme
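Generates a single meme image from input-method speech-recognition text or its polished version. For users who want to turn a sentence into a chat-ready meme, captioned joke image, or sticker. Supports both raw speech text and polished text as input, preferring the polished text by default; automatically picks a style from the tone of the text; generates a captioned image by default, and can also generate a caption-free image together with a caption template; uses doubao-seedream...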
🦀 ClawHub
Slovenian
Write Slovenian that sounds human. Not formal, not robotic, not AI-generated.
🦀 ClawHub
Pixbim Lip Sync Ai
Turn any video into a perfectly lip-synced production using pixbim-lip-sync-ai — the tool that matches mouth movements to dialogue, dubbing, or voiceover wit...
🦀 ClawHub
Audio Announcement
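Announces AI operation status by voice in real time, with multi-language and message-queue support, improving transparency and the user's sense of security; works across multiple platforms.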
🦀 ClawHub
Novita AI Multimodal
Execute multimodal tasks using Novita AI: text-to-image, image-to-image, text-to-video, image-to-video, TTS, STT. Use for: generating images, generating vide...
🦀 ClawHub
Ntriq Audio Intelligence Mcp
Transcribe, summarize, and analyze audio files using local Whisper + Qwen. Returns transcript, segments, and action items.
🦀 ClawHub
muapi-media-generation
Generate AI images, videos, music, and audio from the terminal via muapi.ai — supports 100+ models including Flux, Midjourney v7, Kling 3.0, Veo3, and Suno V5
🦀 ClawHub
WebChat Voice Proxy
⚠️ DEPRECATED — This skill has been split into two separate skills for better modularity: **webchat-https-proxy** (HTTPS/WSS reverse proxy) and **webchat-voi...
🦀 ClawHub
Seedance
Generate detailed, production-ready cinematic video prompts following Seedance 2.0’s strict Subject-Action-Camera-Style-Audio-Constraints format for AI video...
🦀 ClawHub
Firm Spec Compliance Pack
MCP 2025-11-25 specification compliance audit pack. Validates elicitation, tasks, resources/prompts, audio content, JSON Schema 2020-12, SSE transport, and i...
🦀 ClawHub
Auto Subtitle Generator Free
Tired of manually transcribing every word just to add subtitles to your videos? The auto-subtitle-generator-free skill automatically detects speech and gener...
🦀 ClawHub
Ai Lip Sync Video
Drop a video and a new audio track, and watch mouths move in perfect sync — no studio, no reshoots required. This ai-lip-sync-video skill analyzes facial mov...
🦀 ClawHub
Speak
Configure TTS in OpenClaw. Adapt speech output to user preferences.
🦀 ClawHub
Castreader Openclaw Skill
Read any web page aloud with natural AI voices. Extract article text from any URL and convert it to audio (MP3). Use when the user wants to: listen to a webp...
🦀 ClawHub
Jazz Music — Stream Jazz Concerts: Audio Analysis, Lyrics, Equations
Experience jazz as data. AI agents stream harmonic separation, chroma, tonnetz. Error incorporation measured.
🦀 ClawHub
macos-audio
Manage macOS audio output and Bluetooth devices via the macos-audio CLI. Use when scanning paired devices, connecting or disconnecting Bluetooth, switching a...
🦀 ClawHub
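Batch Video Transcription (视频批量转录)
General-purpose batch video transcription tool. Supports 1000+ sites (Bilibili, YouTube, Douyin, Twitch, etc.), uses yt-dlp to batch-download video audio, runs GPU-accelerated speech-to-text (faster-whisper + CUDA), automatically corrects technical terms, and generates structured study notes. Supports resuming interrupted downloads, batch export, multiple output formats, and configuration for sites that require login.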
🦀 ClawHub
Douyin Video Transcribe
Douyin video transcription suite. Extract audio from Douyin/TikTok China videos, transcribe with Whisper, and analyze content. Supports video links, local fi...
🦀 ClawHub
Music Cog
Original music, fully yours. 5 seconds to 10 minutes using frontier music generation models. Instrumental and vocal tracks with perfect vocals. Cinematic sco...
🦀 ClawHub
Jiekou Multimodal
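Execute multimodal tasks using Jiekou AI (接口AI): text-to-image, image-to-image, text-to-video, image-to-video, TTS, STT. Use for: generating images, generating videos, text-to-speech, speech recognition.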
🦀 ClawHub
reCameraV2
reCamera (RV1126B) device full-stack Web API reference covering authentication, device management, video/audio/image configuration, recording rules & storage...
🦀 ClawHub
SenseVoice Transcribe
Transcribe audio files (WAV/MP3/M4A/FLAC) to timestamped text using SenseVoice-Small + FSMN-VAD. Supports single-file and batch mode with VAD-anchored per-se...
GitHub
MusicLM
A model by Google Research for generating high-fidelity music from text descriptions.
🦀 ClawHub
Spanish
Write Spanish that sounds human. Not formal, not robotic, not AI-generated.
🦀 ClawHub
Pub Session Logs
Search and analyze your own session logs using jq. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, chat, w...
🦀 ClawHub
Pub Web Search
Search the web for information, find current content, and look up news articles. And also 50+ models for image generation, video generation, text-to-speech,...
🦀 ClawHub
Pub Clawdhub
Use the ClawdHub CLI to search, install, update, and publish agent skills. And also 50+ models for image generation, video generation, text-to-speech, speech...
🦀 ClawHub
Pub Vidframes
Extract frames or short clips from videos using ffmpeg. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, ch...
🦀 ClawHub
Step Asr
Transcribe audio files to text via Step ASR streaming API (HTTP SSE). Supports Chinese and English, multiple audio formats (PCM, WAV, MP3, OGG/OPUS), real-ti...
🦀 ClawHub
Pub Brave
Web search and content extraction via Brave Search API. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, ch...
🦀 ClawHub
AI media generation API - Flux2pro, Veo3.1, Suno Ai
AI image, video, and music generation + editing via VAP API. Flux, Veo 3.1, Suno V5.
🦀 ClawHub
Pub Whisper
Local speech-to-text with the Whisper CLI (no API key). And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, ch...
🦀 ClawHub
Pub Notion
Notion API for creating and managing pages, databases, and blocks. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text...
🦀 ClawHub
Pub Nanopdf
Edit PDFs with natural-language instructions using the nano-pdf CLI. And also 50+ models for image generation, video generation, text-to-speech, speech-to-te...
🦀 ClawHub
Greek Email Processor
Email processing for Greek accounting. Connects via IMAP to scan for financial documents, AADE notices, and invoices. Routes to local pipelines.
🦀 ClawHub
Self Improving Agent
Captures learnings, errors, and corrections to enable continuous improvement. And also 50+ models for image generation, video generation, text-to-speech, spe...
🦀 ClawHub
U2-audio-file-transcriber
Transcribe audio files via UniCloud ASR (云知声语音识别, recorded audio → text) API from UniSound. Supports multiple formats, optimized for finance, customer servic...
🦀 ClawHub
Bailian Studio
Call Aliyun Bailian via DashScope; support OCR, TTS, text-to-image and image-to-image.
🦀 ClawHub
Douban Sync
Export and sync Douban (豆瓣) book/movie/music/game collections to local CSV files (Obsidian-compatible). Use when the user wants to export their Douban readin...
Page 44 / 53 (2,510 skills)