Overview
Voice Isolation helps you:
- Remove background noise from audio recordings
- Enhance speech clarity for better transcription
- Prepare clean audio for voice cloning
- Improve audio quality for downstream processing
Features
- Advanced Noise Reduction: AI-powered noise removal while preserving speech quality
- Async Job Processing: Upload once and stream progress events while the job runs
- Audio & Video Support: Accepts both audio files and video files (the audio track is extracted)
- High Fidelity: Maintains natural speech characteristics
Use Cases
- Pre-processing for Transcription: Clean audio before sending to speech-to-text
- Voice Cloning Preparation: Isolate clean speech for better voice cloning results
- Podcast Production: Remove background noise from podcast recordings
- Call Quality Enhancement: Improve audio quality in telephony applications
How It Works
Voice Isolation is an asynchronous, job-based pipeline:
- Submit the file: POST /denoise with a multipart/form-data body containing the audio field. The response returns a jobId and a denoiseId.
- Track progress: open an SSE connection to GET /denoise/{denoiseId}/progress to receive processing, done, and error events in real time.
- Retrieve the result: when the done event fires, it carries the url of the denoised audio. You can also list past jobs and their final URLs via GET /denoise.
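The three steps above can be sketched with the Python standard library. This is a minimal illustration, not the official client. It assumes that the event names (processing, done, error) arrive in the SSE event field and that the done payload is a JSON object with a url key. The helper names build_multipart, parse_sse, and wait_for_url are hypothetical.

```python
import json
import uuid
from typing import Iterable, Iterator, Tuple


def build_multipart(field: str, filename: str, payload: bytes) -> Tuple[bytes, str]:
    """Encode one file as multipart/form-data; returns (body, Content-Type value)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event, data) pairs from the lines of a text/event-stream body."""
    event, data = "message", []  # "message" is the SSE default event name
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line dispatches whatever has been buffered so far.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        # Comment lines and other SSE fields are ignored in this sketch.


def wait_for_url(events: Iterable[Tuple[str, str]]) -> str:
    """Consume events until the job finishes; return the denoised audio URL."""
    for event, data in events:
        if event == "done":
            # Assumption: the done payload is JSON and carries a "url" key.
            return json.loads(data)["url"]
        if event == "error":
            raise RuntimeError(f"denoise job failed: {data}")
        # "processing" events are progress updates; keep waiting.
    raise RuntimeError("stream ended before a done event arrived")
```

Against a live server, the body and content type from build_multipart("audio", ...) would go into the POST /denoise request, and the lines passed to parse_sse could come from iterating over the response of the progress endpoint and decoding each line as UTF-8.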
Limits
- Maximum file size: 200 MB
- Maximum duration: 15 minutes
- Accepted source types: audio files and common video containers (mp4, mov, mkv, webm, avi, m4v). Video uploads have their audio track extracted automatically.
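A client can check these limits locally before uploading, which avoids sending a file the service would reject. The sketch below is an illustration only: the function names are hypothetical, "200 MB" is read as 200 × 1024 × 1024 bytes (the page does not say whether the limit is decimal or binary), and the caller is assumed to know the media duration already.

```python
from pathlib import PurePath
from typing import List, Optional

MAX_BYTES = 200 * 1024 * 1024  # assumption: "200 MB" interpreted as 200 MiB
MAX_SECONDS = 15 * 60          # 15 minutes
VIDEO_EXTENSIONS = {"mp4", "mov", "mkv", "webm", "avi", "m4v"}


def is_video(filename: str) -> bool:
    """True if the extension is one of the listed video containers."""
    return PurePath(filename).suffix.lower().lstrip(".") in VIDEO_EXTENSIONS


def check_limits(size_bytes: int, duration_seconds: Optional[float] = None) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes; the limit is {MAX_BYTES}")
    if duration_seconds is not None and duration_seconds > MAX_SECONDS:
        problems.append(f"duration is {duration_seconds:.0f} s; the limit is {MAX_SECONDS} s")
    return problems
```

The server remains the source of truth, so a local pass does not guarantee acceptance; the check only catches obvious violations early.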
API Reference
- Submit a denoise job: POST /denoise
- List denoise history: GET /denoise
- Stream denoise progress: GET /denoise/{denoiseId}/progress
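As a rough illustration, the three endpoints map onto request objects as follows. The base URL is a placeholder, the function names are hypothetical, and authentication headers are omitted because this page does not describe them. The functions only build urllib Request objects; nothing is sent.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://api.example.com"  # hypothetical host, replace with the real one


def submit_request(body: bytes, content_type: str, base_url: str = BASE_URL) -> Request:
    """POST /denoise with an already-encoded multipart/form-data body."""
    return Request(
        f"{base_url}/denoise",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )


def history_request(base_url: str = BASE_URL) -> Request:
    """GET /denoise to list past jobs and their final URLs."""
    return Request(f"{base_url}/denoise", method="GET")


def progress_request(denoise_id: str, base_url: str = BASE_URL) -> Request:
    """GET /denoise/{denoiseId}/progress as a server-sent event stream."""
    path_id = quote(denoise_id, safe="")  # escape the id for use as one path segment
    return Request(
        f"{base_url}/denoise/{path_id}/progress",
        headers={"Accept": "text/event-stream"},
        method="GET",
    )
```

Each request can then be passed to urllib.request.urlopen, or the same URLs and methods can be used with any other HTTP client.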
