BytesAgain

Find the Right AI Skill for Any Job

Browse 25+ curated AI agent skills. Search by use case, filter by category, get the right tool instantly.

All Skills — video

25 skills in "video" matching "detection"

🦀 ClawHub
TencentCloud Video AIGC Detection
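TencentCloud AI-generated video recognition (TencentCloud Video AIGC Detection) skill. Suited to AI-generated video detection, video authenticity verification, AI-synthesized video detection, Deepfake video detection, and similar scenarios. TencentCloud Video AIGC Detection is an AI-generated con...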
🦀 ClawHub
Video To Text
Convert video or audio files from URLs into text or subtitle formats using a free API with automatic language detection and no local downloads required.
🦀 ClawHub
Intelligent Public Smoking Detection Skill | 公共场所吸烟行为智能检测技能
Automatically detects smoking behavior in target areas based on computer vision; supports real-time detection of video streams, images, and video files; iden...
🦀 ClawHub
Workplace Phone Usage Smart Monitoring Skill | 职场玩手机智能监测技能
Based on computer vision, automatically detects employees playing with phones during work hours, supports real-time video stream and image detection, counts...
🦀 ClawHub
Pet Detection Skill | 宠物检测技能
Detects cats, dogs, and birds appearing in the target area; supports video stream and image detection, suitable for home pet monitoring scenarios. | Pet detection skill, ...
🦀 ClawHub
Regional Humanoid Detection Skill | 区域人形检测技能
Automatically detects personnel in target areas based on computer vision. Supports real-time video stream detection and is suitable for monitoring personnel...
🦀 ClawHub
Video-based Fall Detection Skill | 跌倒检测视频版技能
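Video-based fall detection skill. Detects whether anyone has fallen in the target area, supports video stream detection, and is suitable for home safety monitoring of elderly people living alone.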
🦀 ClawHub
TubeScribe
YouTube video summarizer with speaker detection, formatted documents, and audio output. Works out of the box with macOS built-in TTS. Optional recommended tools (pandoc, ffmpeg, mlx-audio) enhance quality. Requires internet for YouTube access. No paid APIs or subscriptions. Use when user sends a YouTube URL or asks to summarize/transcribe a YouTube video.
🦀 ClawHub
Cinematic Script Writer
Create professional cinematic scripts for AI video generation with character consistency and cinematography knowledge. Use when the user wants to write a cinematic script, create story contexts with characters, generate image prompts for AI video tools (Midjourney, Sora, Veo), or needs cinematography guidance (camera angles, lighting, color grading). Also use for character consistency sheets, voice profiles, anachronism detection, and saving scripts to Google Drive.
🦀 ClawHub
Whisper Transcribe
Transcribe audio files to text using OpenAI Whisper. Supports speech-to-text with auto language detection, multiple output formats (txt, srt, vtt, json), batch processing, and model selection (tiny to large). Use when transcribing audio recordings, podcasts, voice messages, lectures, meetings, or any audio/video file to text. Handles mp3, wav, m4a, ogg, flac, webm, opus, aac formats.
🦀 ClawHub
video-clip-skill
Clips a YouTube video locally using yt-dlp and ffmpeg. Supports auto-highlight detection, translation, and CapCut-style karaoke subtitle burning. Triggers wh...
🦀 ClawHub
Youtube Apify Transcript
Fetch YouTube transcripts via APIFY API. Works from cloud IPs (Hetzner, AWS, etc.) by bypassing YouTube's bot detection. Features local caching (FREE repeat...
🦀 ClawHub
Speech to Text Transcription
Transcribe audio and video files to text with speaker detection, timestamps, and format conversion.
🦀 ClawHub
Perceptron
Image and video analysis powered by Isaac vision models. Capabilities include visual Q&A, object detection, OCR, captioning, counting, and grounded spatial r...
🦀 ClawHub
macOS Local Voice
Local STT and TTS on macOS using native Apple capabilities. Speech-to-text via yap (Apple Speech.framework), text-to-speech via say + ffmpeg. Fully offline, no API keys required. Includes voice quality detection and smart voice selection.
🦀 ClawHub
Vlmrun Cli Skill
Use the VLM Run CLI (`vlmrun`) to interact with Orion visual AI agent. Process images, videos, and documents with natural language. Triggers: image understanding/generation, object detection, OCR, video summarization, document extraction, image generation, visual AI chat, 'generate an image/video', 'analyze this image/video', 'extract text from', 'summarize this video', 'process this PDF'.
🦀 ClawHub
Cinematic Script Writer
Create professional cinematic scripts for AI video generation with character consistency and cinematography knowledge. Use when the user wants to write a cinematic script, create story contexts with characters, generate image prompts for AI video tools (Midjourney, Sora, Veo), or needs cinematography guidance (camera angles, lighting, color grading). Also use for character consistency sheets, voice profiles, anachronism detection, and saving scripts to Google Drive.
🦀 ClawHub
Basic Object Detection Analysis
Basic object detection skill. Detects people, vehicles, non-motorized vehicles, pets, and parcels appearing in the target area. Supports video stream and ima...
🦀 ClawHub
Ai Content Detection
Use this skill whenever a user wants to verify whether content (text, images, audio, video, or documents) was created by AI; detect deepfakes or AI-synthesiz...
🦀 ClawHub
Supplier Video Ad Builder
Transforms supplier or CJ source videos into 1080×1920 TikTok/Instagram Reels ads with clean zone detection, Pillow text overlays, CTA card, and trending audio.
🦀 ClawHub
YouTube Video Transcript
Fetch, summarize, and save YouTube transcripts with timestamp navigation, chapter detection, and searchable content.
🦀 ClawHub
Short Video Downloader
Download videos and metadata from TikTok, Instagram Reels, YouTube Shorts, and Xiaohongshu with automatic platform detection.
🦀 ClawHub
youtube copy of yt
Fetch YouTube video transcripts via APIFY API using residential proxies to bypass bot detection, supporting text and JSON output formats.
🦀 ClawHub
Meta Video Ad Analyzer
Extract and analyze content from video ads using Gemini Vision AI. Supports frame extraction, OCR text detection, audio transcription, and AI-powered scene analysis. Use when analyzing video creative content, extracting text overlays, or generating scene-by-scene descriptions.
GitHub
dockerface
Easy to install and use deep learning Faster R-CNN face detection for images and video in a docker container. **[Deprecated]**