LSTMs for Music Generation


Music is a continuous signal composed of sounds from various instruments and voices. It is also characterized by structural, recurring patterns that we pay attention to while listening. In other words, each musical piece has its own characteristic coherence, rhythm, and flow.

In this article, we will approach the task of music generation in a very simplified manner. We will leverage and extend a stacked LSTM network for the task of music generation. Such a setup is similar to the case of text generation (this is a topic for another upcoming article). To keep things simple and easy to implement, we will focus on a single instrument/monophonic music generation task.

The following is an outline of our workflow for this walk-through:

  • Getting to know the dataset
  • Preparing the dataset for music generation
  • Building an LSTM-based music generation model (did we mention attention?)
  • Training the model
  • Listening to the beat! Hearing a few samples our model generates
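Before diving in, the stacked-LSTM setup mentioned in the outline can be sketched as follows. This is a minimal, hedged sketch using Keras; the vocabulary size, sequence length, and layer widths are illustrative assumptions rather than values from the article:

```python
import tensorflow as tf
from tensorflow.keras import layers

vocab_size = 128  # number of distinct notes/chords (assumed)
seq_len = 32      # length of the input symbol sequence (assumed)

# Two LSTM layers stacked: the first returns full sequences so the
# second can consume them; a softmax head predicts the next symbol.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(seq_len,)),
    layers.Embedding(vocab_size, 64),
    layers.LSTM(128, return_sequences=True),
    layers.LSTM(128),
    layers.Dense(vocab_size, activation="softmax"),
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
```

Generation then amounts to feeding a seed sequence, sampling from the softmax output, appending the sample, and repeating.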

Let’s first get to know more about the dataset and think about how we would prepare it for our task of music generation.

The code used in this article is available through GitHub repositories [1] and [2]. More conveniently, it is also packaged as a Google Colab-enabled Jupyter notebook that you can simply click and use.

The Dataset

MIDI is an easy-to-use format that provides a symbolic representation of the music contained in its files. For this discussion/walk-through, we will make use of a subset of the massive public MIDI dataset collected and shared by Reddit user u/midi_man, which is available at this link: r/WeAreTheMusicMakers

We will leverage a subset of this dataset itself. The subset is based on classical piano pieces by great musicians such as Beethoven, Bach, Bartok, and the like. The subset can be found in a zipped folder, along with the code, in this GitHub repository.

We will make use of music21 to process this subset of the dataset and prepare our data for training the model. Since music is a collection of sounds from various instruments and voices/singers, for the purpose of this exercise we will first use the chordify() function to collapse each piece into a sequence of chords. The following snippet helps us get a list of MIDI scores in the required format.


