
Spleeter AI Audio Source Separation by Deezer

Spleeter is Deezer's open-source audio source separation library with pretrained models. It can split audio into 2 stems (vocals, accompaniment), 4 stems (vocals, drums, bass, other), or 5 stems (vocals, drums, bass, piano, other). By the project's own figures it can run around 100x faster than real time on a GPU, which suits music production, remixing, and audio analysis workflows.

Media & Transcription Multi-Framework Security Reviewed
Tool match: spleeter ⭐ 28.1k GitHub stars
INSTALL WITH ANY AGENT
npx skills add agentskillexchange/skills --skill spleeter-ai-audio-source-separation-deezer
Works best when you want a reusable capability, not another fragile one-off prompt.
At a glance
Last updated
Mar 28, 2026
Quick brief

Spleeter is an open-source audio source separation library developed by Deezer Research. Written in Python and built on TensorFlow, it provides pretrained models that can split an audio file into individual stems: isolating vocals from instruments, or separating drums, bass, piano, and other elements from a mixed track.

How it works

What this skill actually does

Separation Modes

Spleeter ships with three pretrained separation models: the 2-stems model separates vocals from accompaniment, the 4-stems model isolates vocals, drums, bass, and other instruments, and the 5-stems model adds piano as a separate stem. The project reports strong results for these models on the musdb benchmark dataset, and quotes separation at about 100x faster than real time for the 4-stems model when running on GPU hardware.
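The mapping from model name to stems can be written down as a small lookup. The sketch below is illustrative only: the helper name expected_stem_paths is hypothetical, and the output layout (one folder per input track, one file per stem) is an assumption based on Spleeter's default naming, which can be changed on the command line.

```python
from pathlib import Path

# Stem names per pretrained configuration, as listed in the text above.
STEMS = {
    "spleeter:2stems": ["vocals", "accompaniment"],
    "spleeter:4stems": ["vocals", "drums", "bass", "other"],
    "spleeter:5stems": ["vocals", "drums", "bass", "piano", "other"],
}


def expected_stem_paths(audio_file, output_dir, model="spleeter:2stems", codec="wav"):
    """Return the files a separation run is assumed to produce.

    Assumes the default layout <output_dir>/<track name>/<stem>.<codec>.
    """
    track_dir = Path(output_dir) / Path(audio_file).stem
    return [track_dir / f"{stem}.{codec}" for stem in STEMS[model]]
```

An agent can use such a list to check that a run produced every stem before handing the files to the next tool.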

CLI and Python API

The tool can be used directly from the command line or integrated into Python pipelines via its library API. On current releases the input file is passed as a positional argument: spleeter separate -p spleeter:2stems -o output input.mp3 (releases before 2.1 took the input through -i instead). Installation is available through pip (pip install spleeter) or conda (conda install -c conda-forge spleeter). FFmpeg and libsndfile are required as external dependencies.
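Both entry points can be wrapped in a few lines of Python. This is a minimal sketch, not the project's reference usage: the function names are hypothetical, the command uses the positional-input form of recent releases (older releases expected -i), and the API path relies on Separator and separate_to_file from spleeter.separator, imported lazily so the module loads without TensorFlow installed.

```python
import subprocess


def build_separate_command(audio_file, output_dir="output", model="spleeter:2stems"):
    """Build the argument list for the `spleeter separate` command."""
    # Assumption: recent releases take the input file as a positional argument.
    return ["spleeter", "separate", "-p", model, "-o", output_dir, audio_file]


def separate_with_cli(audio_file, output_dir="output", model="spleeter:2stems"):
    """Run the CLI; raises CalledProcessError if Spleeter exits with an error."""
    subprocess.run(build_separate_command(audio_file, output_dir, model), check=True)


def separate_with_api(audio_file, output_dir="output", model="spleeter:2stems"):
    """Run the same separation in-process through the Python library."""
    from spleeter.separator import Separator  # lazy import: needs spleeter installed

    separator = Separator(model)
    separator.separate_to_file(audio_file, output_dir)
```

The CLI route keeps TensorFlow out of the agent's own process, while the API route avoids reloading the model when many files are processed in one session.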

Agent Skill Applications

As an agent skill, Spleeter enables automated workflows for music production: extracting vocals for karaoke tracks, isolating drum patterns for remix projects, or analyzing individual instrument parts. Agents can chain Spleeter with transcription tools like Whisper to get cleaner vocal transcriptions, use it for audio quality assessment by examining individual stems, or build automated music analysis pipelines. By default the tool outputs standard WAV files for each separated stem, making it compatible with downstream audio processing tools.
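One simple way to examine individual stems is to measure how loud each one is, for example to skip transcription when the vocals stem is effectively silent. The sketch below uses only the standard library; the function names and the 0.01 threshold are hypothetical, and it assumes the stems are 16-bit PCM WAV files.

```python
import math
import struct
import wave


def stem_rms(path):
    """RMS level of a 16-bit PCM WAV file, normalised to the range 0..1."""
    with wave.open(str(path), "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wav.readframes(wav.getnframes())
    count = len(frames) // 2  # two bytes per sample, all channels interleaved
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", frames)
    return math.sqrt(sum(s * s for s in samples) / count) / 32768.0


def active_stems(stem_paths, threshold=0.01):
    """Return {path: rms} for stems whose level reaches the threshold."""
    levels = {str(p): stem_rms(p) for p in stem_paths}
    return {name: level for name, level in levels.items() if level >= threshold}
```

A pipeline could call active_stems on the separated files and pass vocals.wav to a transcription tool only when it appears in the result.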

Technical Details

Spleeter uses U-Net architecture neural networks trained on Deezer’s internal dataset. It operates on Short-Time Fourier Transform (STFT) spectrograms and uses soft masking to produce separated audio. The library is MIT-licensed and has been cited in numerous academic papers on music information retrieval.
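Soft masking can be illustrated without any neural network. In the sketch below, each stem's estimated magnitude per spectrogram bin is turned into a ratio mask, and the masks are applied to the mixture. This is a plain-Python illustration under stated assumptions, not Spleeter's TensorFlow implementation: the function names, the exponent of 2, and the epsilon value are all assumptions made for the example.

```python
def soft_masks(estimates, exponent=2.0, eps=1e-10):
    """Compute ratio masks from per-stem magnitude estimates.

    estimates maps a stem name to a list of non-negative magnitudes,
    one per spectrogram bin. For every bin the masks sum to 1.
    """
    stems = list(estimates)
    n_bins = len(estimates[stems[0]])
    masks = {stem: [] for stem in stems}
    for i in range(n_bins):
        powers = {stem: estimates[stem][i] ** exponent for stem in stems}
        total = sum(powers.values()) + eps
        for stem in stems:
            # eps is shared across stems so silent bins are split evenly.
            masks[stem].append((powers[stem] + eps / len(stems)) / total)
    return masks


def apply_mask(mask, mixture):
    """Scale each mixture bin (real or complex) by the stem's mask value."""
    return [m * x for m, x in zip(mask, mixture)]
```

Because the masks for one bin always sum to 1, the separated stems add back up to the original mixture, which is the practical difference from hard (binary) masking.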