# JazzNet

This repository contains all the code that produced the results for the Bachelor's thesis. The thesis discusses the generation of new jazz chords given a training dataset, the iRealPro Corpus of Jazz Standards. The corpus contains around 1000 jazz songs with a vocabulary of 1007 distinct chords. The general pipeline is as follows:

## How to use

To train and run the networks, open `Models.ipynb` and run the cells. It relies on already generated data, so there is no need to run the notebook `PreProcessing.ipynb` beforehand. All required modules should be installed; they are imported collectively at the top of the notebook. Hyperparameters can be adjusted, and so can the architecture of the RNNs. Chords are not generated by default; they are loaded from `outputs/sequences/`, where they are stored in JSON format. To generate new sequences instead, change the function parameters accordingly.
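Loading the pre-generated sequences can be sketched as follows (the helper name and the exact file layout inside `outputs/sequences/` are assumptions; the notebook's own loading code may differ):

```python
import json
from pathlib import Path

def load_generated_sequences(directory="outputs/sequences"):
    """Load every JSON file of generated chord sequences from a directory.

    Returns a dict mapping the file stem (e.g. model name) to the parsed
    JSON content. Assumes one JSON file per generation run.
    """
    sequences = {}
    for path in Path(directory).glob("*.json"):
        with open(path, encoding="utf-8") as f:
            sequences[path.stem] = json.load(f)
    return sequences
```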

## Pre-processing data

The notebook `PreProcessing.ipynb` loads and processes the data. First, the `**kern` structure is parsed and the chords are arranged according to the sequence information in the header. The chords are then added to a 2D list, which is processed to simplify the chords. This shrinks the original chord vocabulary from 1007 to 115. The simplified chords are saved to `data/processed/chords.json`.
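The exact simplification rules live in `PreProcessing.ipynb`; as a rough illustration of the idea, here is a hypothetical rule that keeps only the root and a core quality, dropping tensions, alterations, and slash-bass notes (the function name and the specific rules are assumptions, not the notebook's actual mapping):

```python
import re

def simplify_chord(chord: str) -> str:
    """Reduce a chord label to root + core quality (hypothetical rule).

    Drops the slash-bass note, then keeps the root (with accidental) and
    the first matching core quality; everything beyond that is stripped.
    """
    chord = chord.split("/")[0]                  # drop slash-bass note
    m = re.match(r"^([A-G][b#]?)(.*)$", chord)
    if not m:
        return chord                             # not a recognizable chord label
    root, quality = m.groups()
    # order matters: longer qualities must be checked before their prefixes
    for core in ("maj7", "m7b5", "m7", "7", "m", "dim", "aug"):
        if quality.startswith(core):
            return root + core
    return root                                  # plain major triad
```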

## Statistics

`Statistics.ipynb` gives some insights into the data and produces plots, which are saved in `img/`.
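A typical statistic of this kind is the chord frequency distribution over the corpus; a minimal sketch (the function name is illustrative, not taken from the notebook):

```python
from collections import Counter

def chord_frequencies(sequences):
    """Count how often each chord occurs across all songs.

    `sequences` is the 2D list produced by pre-processing: one inner
    list of chord labels per song. Returns (chord, count) pairs sorted
    by descending frequency, ready for plotting.
    """
    counts = Counter(chord for song in sequences for chord in song)
    return counts.most_common()
```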

## Network and Training

The notebook `Models.ipynb` contains all the training and generation logic. Two recurrent neural networks (an LSTM and a baseline RNN) are set up with the same hyperparameters so that they can be compared later. The general structure is:

  1. Load `chords.json`: load the chords and tokenize them to the datatype `int`. This is done by functions found in `functions/utils.py`. The sequences are also padded.
  2. Add tokens to the data:
     - `<BOS>`: beginning-of-sequence token (marks the start of the chord sequence)
     - `<EOS>`: end-of-sequence token (marks the end of the chord sequence)
     - `pad`: padding token (sequences are padded to the same length)
  3. Set up the RNN: an RNN with two linear layers. Its sequences are not packed, unlike in the LSTM.
  4. Set up the LSTM: the same architecture as the RNN, but with sequence packing.
  5. Train both networks for 50 epochs.
  6. Generate new sequences, using multinomial sampling to pick the next element.
  7. Compare the results: distribution similarity, padding content, ...
  8. Save the generated chords to MIDI files: they can be found in the midi folder for listening. I recommend `outputs/midi/arranged`, where the chords have been turned into a full arrangement; for chords only (without arrangement), see `outputs/midi/piano`.
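The multinomial sampling in step 6 can be sketched in plain Python. In the notebook, the probabilities come from the model's softmax over the vocabulary; here a caller-supplied `next_token_probs` function stands in for the network, so the names and the interface are assumptions for illustration:

```python
import random

BOS, EOS = "<BOS>", "<EOS>"

def sample_sequence(next_token_probs, max_len=32, seed=None):
    """Generate one chord sequence by multinomial sampling.

    next_token_probs(seq) must return a dict mapping each candidate
    token to its probability for the next position (in the notebook
    this would come from the trained model's softmax output).
    """
    rng = random.Random(seed)
    seq = [BOS]
    while len(seq) < max_len:
        probs = next_token_probs(seq)
        tokens, weights = zip(*probs.items())
        # multinomial draw: pick one token proportionally to its probability
        nxt = rng.choices(tokens, weights=weights, k=1)[0]
        if nxt == EOS:
            break
        seq.append(nxt)
    return seq[1:]  # strip the leading <BOS>
```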