Deep Symphony Baseline - Shaofan Lai's Blog

This blog is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Project logs: http://shaofanlai.com/exp/1

Here are some music samples generated by AI:

The model is asked to predict the next event (i.e. note on, note off, sustain and change of velocity) given the previous events. No validation set is used, since I actually want the model to memorize all the songs. No data augmentation trick is used.
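To make the next-event setup concrete, here is a minimal sketch of how one song could be sliced into (previous events, next event) training pairs. It assumes events are already encoded as integer ids; the function name `make_training_pairs`, the `prior_len` parameter and the example ids are all made up for illustration and are not taken from the project code.

```python
from typing import List, Tuple


def make_training_pairs(events: List[int], prior_len: int) -> List[Tuple[List[int], int]]:
    """Slice one song into (previous events, next event) pairs.

    `events` is a hypothetical list of integer event ids; `prior_len` is
    the number of previous events the model sees before predicting.
    """
    pairs = []
    for i in range(prior_len, len(events)):
        context = events[i - prior_len:i]  # the prior_len events before position i
        target = events[i]                 # the event the model should predict
        pairs.append((context, target))
    return pairs


# Made-up event ids, only to show the shape of the data.
song = [3, 7, 7, 1, 4, 9]
pairs = make_training_pairs(song, prior_len=3)
for context, target in pairs:
    print(context, "->", target)
```

A song shorter than or equal to `prior_len` yields no pairs in this sketch; padding the beginning would be another option.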

This one is trained on a very small dataset (150 songs) with 100 events as prior. The model can memorize the dataset quite well (loss = 0.2). When some noise is added in the generation procedure, the output becomes more diverse and tries to connect different rhythms.
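The post does not say exactly how the noise is injected, so the following is only one common way to do it, not necessarily what was used here: temperature sampling over the model's output scores. The function `sample_next_event` and its arguments are hypothetical names; `logits` stands for the unnormalized scores the model assigns to each possible next event.

```python
import math
import random
from typing import List, Optional


def sample_next_event(
    logits: List[float],
    temperature: float = 1.0,
    rng: Optional[random.Random] = None,
) -> int:
    """Pick the index of the next event from unnormalized scores.

    Assumption: "noise" is modelled as sampling temperature. A temperature
    of 0 means greedy decoding (pure memorization playback); higher values
    make less likely events more probable.
    """
    rng = rng or random.Random()
    if temperature <= 0:
        # No noise: always take the highest-scoring event.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - m) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


# Usage with made-up scores for three possible events.
rng = random.Random(0)
greedy = sample_next_event([0.1, 2.0, -1.0], temperature=0)
noisy = [sample_next_event([0.1, 2.0, -1.0], temperature=1.5, rng=rng) for _ in range(10)]
print(greedy, noisy)
```

With a low temperature the model mostly replays what it memorized; raising it is what lets the generation jump between fragments of different songs.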

Next, a larger dataset of human performances (~1200 songs) is used, and the previous 2000 events are fed as the input, whereas Performance RNN uses the entire song. A similar structure [(512 LSTM) (512 LSTM) (512 LSTM)] is employed. It sounds like someone showing off their skills by wandering around the keyboard.

This one is a longer generated song (10000 events). I don’t like cherry-picking songs, so you can hear that this one has a worse rhythm than the last one.