Inverse Drum Machine: Source Separation Through Joint Transcription and Analysis-by-Synthesis
This is the accompanying page for the paper “The Inverse Drum Machine: Source Separation Through Joint Transcription and Analysis-by-Synthesis”, currently under review. The Inverse Drum Machine (IDM) uses joint transcription and analysis-by-synthesis to separate drum components from mixed audio without needing isolated sources for training.
Drum Samples and Envelopes
One component of our model is a One-Shot Drum Synthesizer that is trained without ever being exposed to isolated drum samples. The synthesizer is conditioned on drum class and timbre (we represent timbre as a one-hot vector over drum kits). Here we provide the drum samples and envelopes generated by the model reported in the paper.
Note: The interactive drum sample and envelope visualizations may take some time to load. For best performance, toggle them on only when needed, or open the drum samples and envelopes in separate windows.
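To make the conditioning concrete, here is a minimal sketch of how a drum class and a drum kit could be encoded as one-hot vectors and concatenated into a single conditioning input. The label lists, the function names, and the choice to concatenate the two vectors are assumptions made for illustration; the page does not specify how the model combines them.

```python
# Hypothetical label sets: the actual class and kit lists are not given on this page.
DRUM_CLASSES = ["kick", "snare", "hihat"]
DRUM_KITS = ["kit_a", "kit_b"]


def one_hot(index, size):
    """Return a list of `size` zeros with a single 1.0 at position `index`."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def conditioning_vector(drum_class, kit):
    """Concatenate a one-hot drum class with a one-hot kit (timbre) vector.

    Concatenation is an assumption; other ways of combining the two are possible.
    """
    class_vec = one_hot(DRUM_CLASSES.index(drum_class), len(DRUM_CLASSES))
    kit_vec = one_hot(DRUM_KITS.index(kit), len(DRUM_KITS))
    return class_vec + kit_vec


cond = conditioning_vector("snare", "kit_b")
print(cond)
```

In this sketch, changing only the kit while keeping the drum class fixed corresponds to asking for the same drum with a different timbre.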
Audio Demos
We present uncurated audio demos from the StemGMD test set, comparing our model with the baselines. As the individual drum stems are often very sparse, listening can be tricky (and very boring). We therefore present an interactive demo where the tracks are played on loop and you can choose which model and stem to "solo". You can click on the waveform to jump to a part of the audio you are interested in.
Method | Training | Inference | STFT masking |
---|---|---|---|
Oracle † | -- | Isolated stems | ✓ |
NMFD † | -- | Transcription + one-shots | ✓ |
LarsNet † | Isolated stems | -- | ✓ |
IDM masked (ours) | Transcription | -- | ✓ |
IDM synth (ours) | Transcription | -- | -- |
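For readers unfamiliar with the "STFT masking" column, the sketch below shows one common way to turn per-stem magnitude estimates into masks that are applied to the mixture STFT. It is an illustrative assumption: the page does not state which mask the listed methods use, and the ratio mask, the function names, the `eps` constant, and the random test arrays are all hypothetical.

```python
import numpy as np


def ratio_masks(stem_mags, eps=1e-8):
    """Compute soft ratio masks from stem magnitude estimates.

    stem_mags: non-negative array of shape (n_stems, n_freq, n_frames).
    Returns masks of the same shape that sum to roughly 1 over the stem axis.
    """
    total = stem_mags.sum(axis=0, keepdims=True)
    return stem_mags / (total + eps)


def apply_masks(mixture_stft, stem_mags):
    """Apply the ratio masks to a complex mixture STFT of shape (n_freq, n_frames)."""
    masks = ratio_masks(stem_mags)
    # Broadcasting: each stem mask multiplies the same mixture STFT.
    return masks * mixture_stft[np.newaxis]


# Toy data standing in for a real mixture STFT and synthesized stem magnitudes.
rng = np.random.default_rng(0)
mixture_stft = rng.standard_normal((5, 4)) + 1j * rng.standard_normal((5, 4))
stem_mags = rng.random((3, 5, 4))

separated = apply_masks(mixture_stft, stem_mags)
print(separated.shape)
```

Because the masks sum to approximately one in every time-frequency bin, the masked stems add back up to the mixture, whereas an unmasked method outputs its synthesized stems directly.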
We recommend using headphones for the best experience. If you encounter any issues, please let us know!
The following controls are available:
- Stop All: Stop all currently playing audio.
- Sync Playback: When enabled, switching between models or stems will sync the playback position across all audio elements. When disabled, each audio element will play from the beginning.
- Loop: When enabled, the audio will loop continuously.
Citation
If you use our work in your research, please cite our paper:
@article{torres2025InverseDrumMachine,
title={The Inverse Drum Machine: Source Separation Through Joint Transcription and Analysis-by-Synthesis},
author={Torres, Bernardo and Peeters, Geoffroy and Richard, Gaël},
year={2025},
journal={arXiv preprint arXiv:2505.03337}
}