Learning to Groove with Inverse Sequence Transformations
Jon Gillick · Adam Roberts · Jesse Engel · Douglas Eck · David Bamman

Tue Jun 11th 11:25 -- 11:30 AM @ Room 201

We explore models for translating abstract musical ideas (scores, rhythms) into expressive performances using seq2seq and recurrent variational information bottleneck (VIB) models. Though seq2seq models usually require painstakingly aligned corpora, we show that it is possible to adapt an approach from the Generative Adversarial Network (GAN) literature (e.g. Pix2Pix, Vid2Vid) to sequences, creating large volumes of paired data by performing simple transformations and training generative models to plausibly invert these transformations. Music, and drumming in particular, provides a strong test case for this approach because many common transformations (quantization, removing voices) have clear semantics, and learning to invert them has real-world applications. Focusing on the case of drum set players, we create and release a new dataset for this purpose, containing over 13 hours of recordings by professional drummers aligned with fine-grained timing and dynamics information. We also explore some of the creative potential of these models, demonstrating improvements on state-of-the-art methods for Humanization (instantiating a performance from a musical score).
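
As a rough illustration of the paired-data idea described above, the sketch below applies one simple transformation (quantization, which snaps onsets to a grid and flattens dynamics) to a performance, yielding an (input, target) pair that a model could be trained to invert. This is not the authors' code: the `Hit` record, the `quantize` and `make_pair` functions, the 16th-note grid, and the constant velocity of 80 are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Hit:
    """One drum hit. Field names are illustrative, not taken from the released dataset."""
    onset_beats: float  # onset time, measured in beats
    pitch: int          # MIDI drum pitch (e.g. 36 is commonly a kick drum)
    velocity: int       # MIDI velocity, 1-127


def quantize(performance: List[Hit], steps_per_beat: int = 4,
             flat_velocity: int = 80) -> List[Hit]:
    """Snap each onset to the nearest grid step and discard dynamics.

    steps_per_beat=4 assumes a 16th-note grid; flat_velocity is an arbitrary constant.
    """
    return [
        Hit(
            onset_beats=round(h.onset_beats * steps_per_beat) / steps_per_beat,
            pitch=h.pitch,
            velocity=flat_velocity,
        )
        for h in performance
    ]


def make_pair(performance: List[Hit]) -> Tuple[List[Hit], List[Hit]]:
    """Return (transformed input, original target) for training a model to invert quantize()."""
    return quantize(performance), performance


# A short, made-up performance: slightly off-grid onsets with varied dynamics.
performance = [Hit(0.02, 36, 110), Hit(0.48, 42, 60), Hit(1.03, 38, 95)]
score, target = make_pair(performance)
```

Because the transformation is cheap and deterministic, every recorded performance yields a training pair without manual alignment. The same pattern would apply to other transformations mentioned in the abstract, such as removing voices.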

Author Information

Jon Gillick (UC Berkeley)
Adam Roberts (Google Brain)
Jesse Engel (Google Brain)
Douglas Eck (Google Brain)
David Bamman (UC Berkeley)
