
xiaoyuzhou-asr

by @worldwonderer

Transcribe 小宇宙 (Xiaoyuzhou) podcast episodes to text using local Qwen3-ASR speech recognition. Combines xyz API (小宇宙FM API) to fetch episode metadata and aud...

⚙️ Configuration

1. xyz API server running — fetches episode data and audio URLs from 小宇宙

   git clone https://github.com/ultrazg/xyz.git && cd xyz && go run .
   # Default port: 23020, change with -p
   
2. Access token — login via POST /sendCode then POST /login (see references/xyz-api.md)
3. ffmpeg — audio format conversion (brew install ffmpeg)
4. Qwen3-ASR model — download (HF Hub does NOT ship tokenizer.json):
   python3 -c "
   from huggingface_hub import snapshot_download
   snapshot_download('Qwen/Qwen3-ASR-0.6B', local_dir='models/0.6B')
   "
   
5. qwen3-asr-rs — build from source:
   git clone https://github.com/alan890104/qwen3-asr-rs.git && cd qwen3-asr-rs
   cargo build --release --example local_transcribe
   
6. tokenizer.json — auto-generated by the transcription script on first run (from vocab.json + merges.txt). No manual step needed.
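
The login in step 2 could be sketched roughly as follows, using only the Python standard library. This is a minimal sketch, not the skill's own script: the helper names (`build_request`, `post_json`, `login`) are hypothetical, and the JSON field names (`mobilePhoneNumber`, `areaCode`, `verifyCode`) are assumptions that should be checked against references/xyz-api.md. The base URL assumes the default port 23020 from step 1.

```python
import json
import urllib.request

BASE_URL = "http://localhost:23020"  # assumption: xyz server on its default port


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for the local xyz API server."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: dict) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(path, payload), timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def login(phone: str, read_code=input) -> dict:
    """Request an SMS code, then exchange it for a login response.

    Field names are assumptions; verify them in references/xyz-api.md.
    """
    post_json("/sendCode", {"mobilePhoneNumber": phone, "areaCode": "+86"})
    code = read_code("SMS code: ")
    return post_json(
        "/login",
        {"mobilePhoneNumber": phone, "verifyCode": code, "areaCode": "+86"},
    )
```

Calling `login("<your phone number>")` prompts for the SMS code and returns the decoded response as a dict. Where the access token sits inside that response is not assumed here; inspect the returned dict or consult the API reference.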

🔒 Constraints

  • MUST split audio into ≤3-minute segments for Metal GPU stability
  • Audio must be WAV 16kHz mono
  • tokenizer.json is not included in the HF download; the transcription script generates it on first run
  • local_transcribe binary needed (demo binary only runs built-in test samples)
  • xyz API requires Chinese phone number (+86) login
  • All processing is local — audio never leaves the machine

Install from ClawHub:

   clawhub install xiaoyuzhou-asr
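
The audio constraints listed above (WAV, 16kHz, mono, segments of at most 3 minutes) can be met with a single ffmpeg invocation. The sketch below is one way to do it, not necessarily what the skill's own script does: the function names and the `seg_%03d.wav` naming pattern are hypothetical, and 16-bit PCM (`pcm_s16le`) is an assumed sample format, since the constraints only say WAV.

```python
import subprocess
from pathlib import Path

SEGMENT_SECONDS = 180  # 3-minute ceiling from the constraints


def ffmpeg_split_command(src: str, out_dir: str,
                         segment_seconds: int = SEGMENT_SECONDS) -> list[str]:
    """Return an ffmpeg command that converts to 16 kHz mono WAV and splits it."""
    pattern = str(Path(out_dir) / "seg_%03d.wav")  # hypothetical naming scheme
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-ar", "16000",       # resample to 16 kHz
        "-ac", "1",           # downmix to mono
        "-c:a", "pcm_s16le",  # assumption: 16-bit PCM WAV
        "-f", "segment",      # ffmpeg's segment muxer
        "-segment_time", str(segment_seconds),
        pattern,
    ]


def split_audio(src: str, out_dir: str) -> list[Path]:
    """Run ffmpeg and return the segment files in order."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(ffmpeg_split_command(src, out_dir), check=True)
    return sorted(Path(out_dir).glob("seg_*.wav"))
```

For example, `split_audio("episode.m4a", "segments")` would write `segments/seg_000.wav`, `segments/seg_001.wav`, and so on, each ready to pass to the local_transcribe binary one at a time.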
