
Wonda

by @degausai

Using the Wonda CLI to generate images, videos, music, and audio from the terminal — plus LinkedIn, Reddit, and X/Twitter research and automation

⚙️ Configuration

  • Auth: wonda auth login (opens browser, recommended) or set WONDERCAT_API_KEY env var
  • Verify: wonda auth check
  • Access tiers

    Not all commands are available to every account type:

    | Tier | Access |
    | --- | --- |
    | Anonymous (temporary account, no login) | Media upload/download, editing (video/edit, image/edit, audio/edit), transcription, social publishing, scraping, analytics |
    | Free (logged in, Basic/Free plan) | Everything above + generation (image/generate, video/generate, etc.), styles, recipes, brand |
    | Paid (Plus, Pro, or Absolute plan) | Everything above + video analysis (requires credits), skill commands (wonda skill install/list/get) |

    If a command returns a 403 error, check your plan at https://app.wondercat.ai/settings/billing.
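The tier table can also be expressed as a small lookup that a script consults before calling a command, so a 403 is caught early. This is only a sketch: the feature labels and the `can_use` helper are illustrative shorthand for the table rows, not names used by the Wonda CLI.

```python
# Tiers in ascending order of access, as listed in the tier table.
TIER_ORDER = ["anonymous", "free", "paid"]

# Minimum tier per feature group. Keys are illustrative labels (assumption),
# not actual CLI command names.
MIN_TIER = {
    "media": "anonymous",
    "editing": "anonymous",
    "transcription": "anonymous",
    "social publishing": "anonymous",
    "scraping": "anonymous",
    "analytics": "anonymous",
    "generation": "free",
    "styles": "free",
    "recipes": "free",
    "brand": "free",
    "video analysis": "paid",
    "skill": "paid",
}


def can_use(account_tier: str, feature: str) -> bool:
    """Return True if the account tier is at or above the feature's minimum tier."""
    return TIER_ORDER.index(account_tier) >= TIER_ORDER.index(MIN_TIER[feature])
```

For example, `can_use("anonymous", "generation")` is false because generation starts at the Free tier.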

    Social signups (Instagram, TikTok, etc.)

    Drive them with the wonda device primitives + a throwaway mailbox from wonda email. The screenshot → decide → tap/type/swipe loop is how these flows work — there's no shortcut command, and that's fine: social apps change their UI constantly and any canned flow would drift faster than you could maintain it.

    Standard loop:

    1. wonda email account create --random → save {email, password}.
    2. wonda device create → pick a ready device (poll wonda device get --fields status).
    3. wonda device launch com.instagram.android (or com.zhiliaoapp.musically for TikTok). Fall back to wonda device open-url if you'd rather start in the web flow.
    4. Loop: wonda device screenshot > s.json → decode the base64 PNG → read → pick an action → tap | type | swipe | key → screenshot again. Use --text "SomeButtonLabel" on tap before guessing coordinates; fall back to --x --y read off the screenshot for elements without matching text (number pickers, date spinners, etc.).
    5. When the app sends a verification email, run wonda email inbox wait --timeout 120. It returns {codes: ["483921"], links: [...]} with the 6-digit code already extracted. Use wonda device type --text "" to feed it back.
    6. For number/date spinners: tap the highlighted cell and Android pops up a numeric or alphabetic keyboard; wonda device type --text "" then replaces the selected text. wonda device key --code 4 dismisses the keyboard when done.
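The screenshot-decoding and verification-code steps of the loop could be sketched in Python as follows. Several things here are assumptions: that `wonda` is on PATH, that the screenshot JSON stores the PNG under a key named `image` (the source only says the file contains a base64 PNG), and that no extra device identifier argument is needed beyond what the commands above show. All helper names are hypothetical.

```python
import base64
import json
import subprocess

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def run_wonda(*args):
    """Run a wonda subcommand with --json and parse its stdout."""
    result = subprocess.run(
        ["wonda", *args, "--json"], capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)


def decode_screenshot(payload, key="image"):
    """Return raw PNG bytes from a screenshot payload.

    ASSUMPTION: the base64 PNG lives under `key`; check the real field name
    in s.json before relying on this.
    """
    data = base64.b64decode(payload[key])
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data is not a PNG")
    return data


def first_code(inbox):
    """Pick the first extracted code from the inbox-wait output, or None."""
    codes = inbox.get("codes", [])
    return codes[0] if codes else None


def submit_verification_code(timeout=120):
    """Wait for the verification email, then type the code into the device."""
    inbox = run_wonda("email", "inbox", "wait", "--timeout", str(timeout))
    code = first_code(inbox)
    if code is None:
        raise RuntimeError("no verification code arrived before the timeout")
    subprocess.run(["wonda", "device", "type", "--text", code], check=True)
    return code
```

Validating the PNG signature after decoding gives an early, clear failure if the assumed field name is wrong, rather than passing garbage to whatever reads the image next.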

    Consent-like taps — anything that accepts Terms/Privacy/Cookies, grants permissions, or publishes something — stop and ask the user for explicit confirmation in chat before tapping. That isn't about signups specifically; it applies to any automation step.

    Rate-limit signals — if the app shows you a visual puzzle ("we want to make sure you're a real person"), stop and hand off to the user with wonda device stream (see next section). Don't click through puzzles yourself.

    Handing off to a human

    If automation hits a screen that requires a human to take over (a consent flow you shouldn't auto-accept, ambiguous UI, or a step where the user prefers to act themselves), use wonda device stream, which returns a playerUrl signed with a short-lived JWT (valid for 1 hour). Give that URL to the user; they act in their own browser, and automation can resume afterward.

    wonda device stream 
    

    → { "streamUrl": "wss://…", "playerUrl": "https://…", "deviceType": "social" }
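A script that performs the handoff only needs the playerUrl field from that response. The sketch below parses the JSON shown above and builds a message for the user. The function names and the message wording are illustrative, and the URLs in the usage example are placeholders, not real Wonda endpoints.

```python
import json


def player_url(stream_json: str) -> str:
    """Extract playerUrl from the JSON printed by `wonda device stream`."""
    info = json.loads(stream_json)
    url = info["playerUrl"]
    if not url.startswith("https://"):
        raise ValueError("playerUrl is expected to be an https URL")
    return url


def handoff_message(stream_json: str) -> str:
    """Build a short note telling the user how to take over the device."""
    url = player_url(stream_json)
    return (
        f"Automation is paused. Open {url} within 1 hour to take over, "
        "then tell me when you are done."
    )
```

Usage, with placeholder values:

```python
sample = '{"streamUrl": "wss://example.invalid/s", "playerUrl": "https://example.invalid/p", "deviceType": "social"}'
print(handoff_message(sample))
```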

    Global output flags

    All commands support these output control flags:

  • --json — Force JSON output (auto-enabled when stdout is piped)
  • --quiet — Only output the primary identifier (job ID, media ID, etc.) — ideal for scripting
  • -o — Download output to file (implies --wait)
  • --fields status,outputs — Select specific JSON fields
  • --jq '.outputs[0].media.url' — Filter JSON output with a jq expression
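To make the last two flags concrete, here is what `--fields status,outputs` and `--jq '.outputs[0].media.url'` would select, written as plain Python over a job payload. The payload is hypothetical: its shape is inferred from the flag examples above, and the `id`, `status` value, and URL are invented for illustration.

```python
def select_fields(job: dict, fields: str) -> dict:
    """Keep only the comma-separated top-level fields, like --fields."""
    names = [name.strip() for name in fields.split(",")]
    return {name: job[name] for name in names if name in job}


def first_output_url(job: dict) -> str:
    """Python equivalent of the jq filter .outputs[0].media.url."""
    return job["outputs"][0]["media"]["url"]


# Hypothetical job payload (assumption: real responses may carry more fields).
sample_job = {
    "id": "job_123",
    "status": "succeeded",
    "outputs": [{"media": {"url": "https://example.invalid/out.png"}}],
}

selected = select_fields(sample_job, "status,outputs")
url = first_output_url(sample_job)
```

Here `selected` drops `id` and keeps `status` and `outputs`, while `url` is the single string a script would pass to a download step.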
📋 Tips & Best Practices

    | Symptom | Likely Cause | Fix |
    | --- | --- | --- |
    | Sora rejected image | Person in image | Switch to kling_3_pro |
    | Video adds objects not in source | Motion prompt describes elements not in image | Simplify to camera movement and atmosphere only |
    | Text unreadable in video | AI tried to render text in generation | Remove text from video prompt, use textOverlay instead |
    | Hands look wrong | Complex hand actions in prompt | Simplify to passive positions or frame to exclude |
    | Style inconsistent across series | No shared anchor | Use same reference image via --attach |
    | Changes to step A not in step B | Stale render | Re-run all downstream steps |

    Install from ClawHub:

    clawhub install wonda
