acestep-lyrics-transcription
by @dumoedss
Transcribe audio to timestamped lyrics using OpenAI Whisper or ElevenLabs Scribe API. Outputs LRC, SRT, or JSON with word-level timestamps. Use when users want to transcribe songs, generate LRC files, or extract lyrics with timestamps from audio.
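The two text formats write timestamps differently: LRC uses `[mm:ss.xx]` tags, as in the example lines in this document, and SRT uses `HH:MM:SS,mmm`. The following is a minimal Python sketch of that formatting, assuming timestamps arrive as float seconds. The function names are hypothetical and are not taken from the skill's scripts.

```python
def lrc_timestamp(seconds: float) -> str:
    """Format seconds as an LRC tag such as [00:46.96]."""
    # Work in whole centiseconds so rounding can never produce "60" seconds.
    centis = round(seconds * 100)
    minutes, rest = divmod(centis, 6000)
    secs, cs = divmod(rest, 100)
    return f"[{minutes:02d}:{secs:02d}.{cs:02d}]"


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp such as 00:00:46,960."""
    # Work in whole milliseconds for the same reason as above.
    millis = round(seconds * 1000)
    hours, rest = divmod(millis, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


if __name__ == "__main__":
    print(lrc_timestamp(46.96))
    print(srt_timestamp(46.96))
```

Converting to an integer count of the smallest unit before splitting avoids carry errors near minute boundaries.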
Transcribed (wrong):
[00:46.96]AC step alive,
[00:50.80]one point five eyes.
Original lyrics reference:
ACE-Step alive
One point five arrives
Corrected (right):
[00:46.96]ACE-Step alive,
[00:50.80]One point five arrives.
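The correction above keeps each timestamp and swaps only the misheard text for the reference lyric. A minimal Python sketch of that step follows, assuming the transcription and the reference lyrics have the same number of lines in the same order. The helper `apply_reference` is hypothetical and not part of the skill. Real songs often need fuzzier alignment than this.

```python
import re

# Matches a timed LRC line such as "[00:46.96]AC step alive,".
# Metadata tags like "[ar:Artist]" do not match and are left untouched.
LRC_LINE = re.compile(r"^(\[\d{2}:\d{2}(?:\.\d{1,3})?\])(.*)$")
PUNCTUATION = ",.!?;:"


def apply_reference(lrc_text: str, reference_lines: list[str]) -> str:
    """Replace each timed line's text with the reference line at the same position."""
    refs = iter(reference_lines)
    out = []
    for line in lrc_text.splitlines():
        match = LRC_LINE.match(line)
        ref = next(refs, None) if match else None
        if match is None or ref is None:
            out.append(line)  # not a timed line, or the reference ran out
            continue
        tag, text = match.groups()
        ref = ref.strip()
        # Keep the transcription's trailing punctuation when the reference has none.
        trailing = re.search(r"[,.!?;:]*$", text.rstrip()).group()
        if ref[-1:] not in PUNCTUATION:
            ref += trailing
        out.append(f"{tag}{ref}")
    return "\n".join(out)


if __name__ == "__main__":
    transcribed = "[00:46.96]AC step alive,\n[00:50.80]one point five eyes."
    reference = ["ACE-Step alive", "One point five arrives"]
    print(apply_reference(transcribed, reference))
```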
Config file: scripts/config.json
# Switch provider
./scripts/acestep-lyrics-transcription.sh config --set provider openai
./scripts/acestep-lyrics-transcription.sh config --set provider elevenlabs
# Set API keys
./scripts/acestep-lyrics-transcription.sh config --set openai.api_key sk-...
./scripts/acestep-lyrics-transcription.sh config --set elevenlabs.api_key ...
# View config
./scripts/acestep-lyrics-transcription.sh config --list
| Option | Default | Description |
|--------|---------|-------------|
| provider | openai | Active provider: openai or elevenlabs |
| output_format | lrc | Default output: lrc, srt, or json |
| openai.api_key | "" | OpenAI API key |
| openai.api_url | https://api.openai.com/v1 | OpenAI API base URL |
| openai.model | whisper-1 | OpenAI model (whisper-1 for word timestamps) |
| elevenlabs.api_key | "" | ElevenLabs API key |
| elevenlabs.api_url | https://api.elevenlabs.io/v1 | ElevenLabs API base URL |
| elevenlabs.model | scribe_v2 | ElevenLabs model |
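The dotted option names in the table suggest a nested layout in `scripts/config.json`, but the source does not show the file itself. The Python sketch below assumes that nesting and mirrors the table's defaults. The names `DEFAULTS`, `get_option`, and `active_settings` are hypothetical and only illustrate how a dotted key such as `openai.model` could be resolved.

```python
# Defaults copied from the options table; the nested structure is an assumption.
DEFAULTS = {
    "provider": "openai",
    "output_format": "lrc",
    "openai": {
        "api_key": "",
        "api_url": "https://api.openai.com/v1",
        "model": "whisper-1",
    },
    "elevenlabs": {
        "api_key": "",
        "api_url": "https://api.elevenlabs.io/v1",
        "model": "scribe_v2",
    },
}


def get_option(config: dict, dotted_key: str):
    """Resolve a dotted key such as 'openai.model' against a nested dict."""
    node = config
    for part in dotted_key.split("."):
        node = node[part]  # raises KeyError for an unknown option
    return node


def active_settings(config: dict) -> dict:
    """Return the settings block of whichever provider is currently selected."""
    return config[config["provider"]]


if __name__ == "__main__":
    print(get_option(DEFAULTS, "openai.model"))
    print(active_settings(DEFAULTS)["api_url"])
```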
clawhub install acestep-lyrics-transcription