Web Video Transcribe DOCX
by @c-narcissus
Offline-first workflow for turning Chinese web page video or audio into text and Word deliverables. Use when Codex needs to (1) extract playable media stream...
1. Run python {baseDir}/scripts/bootstrap_env.py once in the target environment.
2. For a generic web page URL, run python {baseDir}/scripts/pipeline_web_to_docx.py .
3. For a direct media URL, run python {baseDir}/scripts/download_url.py and then python {baseDir}/scripts/transcribe_sensevoice.py --input .
4. For a local media file, run python {baseDir}/scripts/transcribe_sensevoice.py --input .
5. If the user asks for a polished reading version rather than a raw transcript, read references/cleanup-guidelines.md, produce a refined .txt, and then render it with python {baseDir}/scripts/transcript_to_docx.py.
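The choice between steps 2, 3, and 4 depends on what kind of input is supplied. A minimal sketch of that decision follows, assuming a direct media URL can be recognized by its file extension. The function name plan_scripts and the MEDIA_EXTENSIONS set are hypothetical and not part of the skill. The sketch returns script names only, because the arguments each script accepts are not fully shown above.

```python
from pathlib import PurePosixPath
from typing import List
from urllib.parse import urlparse

# Hypothetical extension list, used only to guess whether a URL points
# straight at a media file. The skill's own detection logic may differ.
MEDIA_EXTENSIONS = {".mp4", ".m4a", ".mp3", ".wav", ".flv", ".webm", ".m3u8"}


def plan_scripts(source: str) -> List[str]:
    """Return the script names from steps 2-4 that apply to `source`, in order."""
    parsed = urlparse(source)

    # Anything that is not an http(s) URL is treated as a local media file (step 4).
    if parsed.scheme not in ("http", "https"):
        return ["transcribe_sensevoice.py"]

    # A URL whose path ends in a known media extension is treated as a
    # direct media URL: download first, then transcribe (step 3).
    suffix = PurePosixPath(parsed.path).suffix.lower()
    if suffix in MEDIA_EXTENSIONS:
        return ["download_url.py", "transcribe_sensevoice.py"]

    # Every other URL is treated as a generic web page (step 2).
    return ["pipeline_web_to_docx.py"]


if __name__ == "__main__":
    for example in (
        "https://example.com/watch?v=1",
        "https://example.com/media/talk.mp4",
        "recordings/talk.wav",
    ):
        print(example, "->", plan_scripts(example))
```

Extension matching is only a heuristic: a media URL without a recognizable extension would fall through to the web page pipeline, which may or may not be the desired behavior.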
Run python {baseDir}/scripts/bootstrap_env.py before first use in a fresh environment.
Validate the skill with skill-creator/scripts/quick_validate.py.
Test --help and one representative happy path after changing the scripts.
Install with: clawhub install web-video-transcribe-docx