Browse AI Agent Skills | BytesAgain

All Skills — audio

18 skills in "audio" matching "processing"

🦀 ClawHub · 42.3k dl

Markdown Converter

Convert documents and files to Markdown using markitdown. Use when converting PDF, Word (.docx), PowerPoint (.pptx), Excel (.xlsx, .xls), HTML, CSV, JSON, XML, images (with EXIF/OCR), audio (with transcription), ZIP archives, YouTube URLs, or EPubs to Markdown format for LLM processing or text analysis.

🦀 ClawHub · 3.5k dl

AssemblyAI advanced speech transcription

Transcribe, diarise, translate, post-process, and structure audio/video with AssemblyAI. Use this skill when the user wants AssemblyAI specifically, needs hi...

🦀 ClawHub · 2.5k dl

ElevenLabs API integration with managed authentication. AI-powered text-to-speech, voice cloning, sound effects, and audio processing. Use this skill when us...

🦀 ClawHub · 2.2k dl

Glasses to Social

Turn smart glasses photos into social media posts. Monitors a Google Drive folder for new images from Meta Ray-Ban glasses (or any smart glasses), analyzes them with vision AI, drafts tweets/posts in the user's voice, and publishes on approval. Use when setting up a glasses-to-social pipeline, processing smart glasses photos for social media, or creating hands-free content workflows.

🦀 ClawHub · 1.6k dl

inSaiAI Intelligent Editing

Use when performing video/audio processing tasks including transcoding, filtering, streaming, metadata manipulation, or complex filtergraph operations with FFmpeg.

🦀 ClawHub · 343 dl

VN Skill for Windows

Local video, image and audio processing expert for Windows, powered by VN Video Editor. Use this skill whenever the user wants to process video or audio on t...

🦀 ClawHub · 7.0k dl

ElevenLabs Voices

High-quality voice synthesis with 18 personas, 32 languages, sound effects, batch processing, and voice design using ElevenLabs API.

🦀 ClawHub · 4.9k dl

Process video and audio with correct codec selection, filtering, and encoding settings.

🦀 ClawHub · 2.9k dl

Donson Intelligent Editing

Use when performing video/audio processing tasks including transcoding, filtering, streaming, metadata manipulation, or complex filtergraph operations with FFmpeg.

🦀 ClawHub · 2.7k dl

Local speech-to-text with NVIDIA Parakeet TDT 0.6B v3 (ONNX on CPU). 30x faster than Whisper, 25 languages, auto-detection, OpenAI-compatible API. Use when transcribing audio files, converting speech to text, or processing voice recordings locally without cloud APIs.

🦀 ClawHub · 2.3k dl

Process, enhance, and convert audio files with noise removal, normalization, format conversion, transcription, and podcast workflows.

🦀 ClawHub · 2.3k dl

Voice Note To Midi

Convert voice notes, humming, and melodic audio recordings to quantized MIDI files using ML-based pitch detection and intelligent post-processing.

🦀 ClawHub · 1.8k dl

Whisper Transcribe

Transcribe audio files to text using OpenAI Whisper. Supports speech-to-text with auto language detection, multiple output formats (txt, srt, vtt, json), batch processing, and model selection (tiny to large). Use when transcribing audio recordings, podcasts, voice messages, lectures, meetings, or any audio/video file to text. Handles mp3, wav, m4a, ogg, flac, webm, opus, aac formats.

🦀 ClawHub · 1.7k dl

MiniMax Multimodal Toolkit

Generate and process speech, music, video, and images using MiniMax AI with voice cloning, custom voices, multi-scene video, and FFmpeg-based media tools.

🦀 ClawHub · 529 dl

Expert audio/video processing with ffmpeg and ffprobe. Use when the user needs to convert, compress, edit, analyze, stream, or process any audio or video fil...

🦀 ClawHub · 383 dl

ffmpeg-audio-processing

Extract, normalize, mix, and process audio tracks for audio manipulation and analysis.

🦀 ClawHub · 337 dl

Local video, audio and image processing expert for macOS, powered by VN Video Editor. Use this skill whenever the user wants to process video, audio or image...

🦀 ClawHub · 218 dl

Alibabacloud Oss Media Process

Process images, audio, and video files stored in Alibaba Cloud OSS. Supports 14+ image operations (resize, crop, rotate, watermark, blur, format conversion,...