Browse AI Agent Skills | BytesAgain

All Skills — audio

337 skills in "audio"

Remove AI-generated jargon and restore human voice to text

Music To Video Ai

Convert audio files into music-synced videos with this skill. Works with MP3, WAV, AAC, FLAC files up to 200MB. Musicians and content creators use it for con...

Search and retrieve podcast and episode details from Podcast Index API using keywords, titles, feed IDs, URLs, or featured persons with authenticated requests.

Video Pipeline Bundle

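One-stop video workflow skill bundle. Integrates the full pipeline of video editing, transcription, subtitle burn-in, and concatenation, with step-by-step execution and user confirmation. Includes: (1) auto-editor: trims silent segments from video; (2) Faster Whisper + MiniMax LLM: speech to subtitles; (3) ffmpeg: burns subtitles into video; (4) FFmpeg toolbox: ...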

Knowledge Digest

Converts textbooks or PDFs into personalized, multimodal interactive learning materials including handwritten notes, quiz webpages, slides, audio courses, an...

Gemini Assistant

General-purpose AI assistant using Gemini API with voice and text support. Use when you need a smart AI assistant that can answer questions, have conversatio...

Freebeat Music Video Generator

Generate AI music videos from any MCP client. Turn text prompts into cinematic music videos with multiple styles and modes. Existing features include charact...

Text to speech using the default macOS "say" command. No need for 3rd party APIs or models. Supports many languages. Also, Trinoids!

Local text-to-speech using macOS `say` + ffmpeg for Telegram/Matrix voice messages

Text-to-speech using macOS built-in `say` command. Use for voice notifications, audio alerts, reading text aloud, or announcing messages through Mac speakers. Supports multiple languages including Chinese (Mandarin), English, Japanese, etc.

Audio Processor

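Audio processing toolkit: supports audio recording, editing, format conversion, spectrum analysis, noise reduction, speed and pitch changes, and more. Use when: (1) you need to process audio files (record, trim, merge, split), (2) you need to convert audio formats (MP3/WAV/FLAC/OGG, etc.), (3) you need to analyze audio features (spectrum, volume, silence detection), (4) you need to apply audio effects (noise reduction, ...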

Humanize AI-generated text by detecting and removing patterns typical of LLM output. Rewrites text to sound natural, specific, and human. Uses 24 pattern det...

Morning Wake-Up

Morning wake-up automation that fetches today's weather and matches a Sonos playback preset. Use when setting up daily alarm routines, weather-driven music w...

Jazz Music — Stream Jazz Concerts: Audio Analysis, Lyrics, Equations

Experience jazz as data. AI agents stream harmonic separation, chroma, tonnetz. Error incorporation measured.

Add Music To Video App

Skip the learning curve of professional editing software. Describe what you want — add upbeat background music to my video and fade it out at the end — and g...

Play Apple Music songs on macOS using clawtunes, including streaming catalog tracks via a practical keyboard-navigation workaround after opening the song in...

Ai Music Video Generator Free

Turn a 3-minute MP3 song file into 1080p synced music videos just by typing what you need. Whether it's generating visual music videos from audio tracks or q...

Voice Translator

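Speak Chinese, get foreign-language speech: hold the button and speak Chinese, and English, Japanese, or Korean audio plays within 2-3 seconds. Supports scenario modes, two-way conversation, and saved favorite phrases.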

Any-to-any AI sub-agent — research, images, video, audio, music, podcasts, avatars, voice cloning, documents, spreadsheets, dashboards, 3D models, diagrams,...

Azure Speech Tts

Azure Speech TTS skill for generating local audio files from text or SSML with Azure Speech. Use when the user asks to use Azure Speech / Azure TTS / Microso...

Audio Quality Checker

Analyze audio quality, detect noise types, and provide improvement recommendations. Use when users need to check audio quality, validate recordings, or ident...

Transcrição e respostas em áudio em PTBR, Português Brasil - Brazilian Portuguese transcription and audio answers

Brazilian Portuguese voice auto-reply skill for OpenClaw. Transcribes audio locally with wav2vec2, generates a reply with the local OpenClaw agent by default...

Premium Portuguese-Brazilian voice interface with neural TTS and Claude AI integration. Features wav2vec2-large-xlsr-53-ptBR for excellent PT-BR understandin...

audioclaw-skills-voice-reply

Use when AudioClaw Skills, Feishu, or Lark needs to send AudioClaw voice replies with runtime-switchable voice_id, emotion preset, or speaking style, includi...

Call the RawUGC API to generate AI videos/images/music, manage content (personas, products, styles, characters), schedule social media posts, research TikTok...

Add Music To Video Online Free

Skip the learning curve of professional editing software. Describe what you want — add background music to my video and adjust the volume — and get music-bac...

Official elevenlabs skill: text-to-speech. From elevenlabs/skills.

Convert audio files into music video MP4 with this skill. Works with MP3, WAV, MP4, MOV files up to 200MB. Music creators use it for turning Suno AI songs in...

Skip the learning curve of professional editing software. Describe what you want — turn this text into a short explainer video with visuals and voiceover — a...

Openclaw Perfexcrm Skill

Manage PerfexCRM from any messaging app. Full CRUD for customers, invoices, leads, tickets, projects, contracts, and 13 more resources (170 API endpoints). C...

婴儿哭声智能解析技能 (Baby Cry Intelligent Analysis Skill)

Detects baby cries via audio AI in real-time, analyzes causes, and precisely identifies needs like hunger, tiredness, pain, discomfort, or irritability to as...

Write Latvian that sounds human. Not formal, not robotic, not AI-generated.

Crazyrouter Music Gen

AI music generation via Crazyrouter API using Suno. Create songs from text descriptions with lyrics, style, and title. Use when user asks to generate music,...

Anima Avatar - Interactive Video Generation Engine. Generates 16:9 videos with dynamic character sprites (Shutiao), synced audio (Fish Audio), and text overlay.

Byted Las Long Video Understand

Extracts audio tracks from video files and splits long audio into timed segments using Volcengine LAS. Audio extraction and separation from video — pull audi...

Latin — Experience Latin Music: 29 Layers of Audio, Lyrics & Equations

Experience Latin as data. AI agents stream lyrics, emotions, harmonic/percussive separation. Temporal semantics measured.

Agentphone Skills

Get your AI agent a real US/Canada phone number in one API call. Make voice calls, send and receive SMS, and hold actual conversations — all via API.

Call SuperX AI art APIs to generate images, videos, and music. Use this skill whenever the user wants to: generate/create/draw images or pictures (GPT-4o, Mi...

Remove signs of AI-generated writing from text. Use when editing or reviewing text to make it sound more natural and human-written. Based on Wikipedia's comp...

Autonomous CRM for freelancers. Tracks clients, detects follow-up opportunities, generates proposals, tracks invoices, and sends a weekly digest. Works via W...

Upwork Freelancer Operations

Income and pipeline management for Upwork freelancers. Proposal drafting, job scanning, client follow-up sequences, contract tracking, invoice reminders, and...

Mistral Mcp Openclaw

Configure OpenClaw to use the community mistral-mcp stdio server for Mistral OCR, Codestral FIM, Voxtral audio, durable workflows, moderation, classification...

Brand Persona Skill

Create a living brand persona with authentic voice and defined services as a personalized AI agent for your business or institution.

Voice Input Patch – Dual Mic Buttons

Patch OpenClaw Control UI to add dual voice input buttons (auto-send + continuous). Use when OpenClaw voice input needs patching, after OpenClaw updates (pat...

Alicloud Ai Audio Tts

Generate human-like speech audio with Model Studio DashScope Qwen TTS models (qwen3-tts-flash, qwen3-tts-instruct-flash). Use when converting text to speech,...

Crazyrouter Tts

Text-to-speech via Crazyrouter API. OpenAI TTS voices (alloy, echo, fable, onyx, nova, shimmer). Convert text to natural speech audio files. Use when user as...

CreateVideo Podcast to Video

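Video generation tool. Triggers when the user says "CreateVideo", "创建视频" (create video), "生成视频" (generate video), or provides copy and asks for a video to be made. Supports text-to-speech (via ListenHub MCP), template video cropping and merging, and content analysis output. Depends on ffmpeg and ListenHub MCP Server.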

Video Maker Italiano

Turn three product photos and a voiceover MP3 into 1080p polished Italian videos just by typing what you need. Whether it's creating professional videos in I...

Bilibili Audio Transcribe

Download audio from Bilibili or b23.tv links and transcribe it into txt, srt, and segment JSON with yt-dlp, ffmpeg, and faster-whisper. Use when a user asks...
