A Hierarchical Recurrent Neural Network for Symbolic Melody Generation

IEEE Trans Cybern. 2020 Jun;50(6):2749-2757. doi: 10.1109/TCYB.2019.2953194. Epub 2019 Dec 2.

Abstract

In recent years, neural networks have been used to generate symbolic melodies. However, the long-term structure of a melody makes it difficult to design a good model. In this article, we present a hierarchical recurrent neural network (HRNN) for melody generation, which consists of three long short-term memory (LSTM) subnetworks working in a coarse-to-fine manner along time. Specifically, the three subnetworks generate bar profiles, beat profiles, and notes, in turn, and the outputs of the high-level subnetworks are fed into the low-level subnetworks, where they serve as guidance for generating the finer time-scale melody components. Two human behavior experiments demonstrate the advantage of this structure over a single-layer LSTM that attempts to learn all hidden structures in melodies. In human evaluations, the HRNN also produced better melodies than the recently proposed models MidiNet and MusicVAE.
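
The coarse-to-fine schedule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three LSTM subnetworks are replaced by a hypothetical `StubLayer` that only keeps a running state and samples a symbol, and the time resolution (4 steps per beat, 4 beats per bar) and vocabulary sizes are assumptions chosen for illustration. The sketch shows only the control flow: the bar layer is updated once per bar, the beat layer once per beat with the bar profile as guidance, and the note layer at every step with both profiles as guidance.

```python
import random

# Assumed time grid (not stated in the abstract): 16th-note steps in 4/4 time.
STEPS_PER_BEAT = 4
BEATS_PER_BAR = 4
STEPS_PER_BAR = STEPS_PER_BEAT * BEATS_PER_BAR


class StubLayer:
    """Hypothetical stand-in for one LSTM subnetwork.

    It keeps an integer state and emits one symbol per call; a real model
    would use a trained recurrent network here.
    """

    def __init__(self, vocab_size, rng):
        self.vocab_size = vocab_size  # assumed symbol vocabulary size
        self.rng = rng
        self.state = 0

    def step(self, guidance=()):
        # Combine the previous state, the guidance from higher levels,
        # and a random draw, then reduce to a valid symbol index.
        noise = self.rng.randrange(self.vocab_size)
        self.state = (self.state * 31 + sum(guidance) + noise) % self.vocab_size
        return self.state


def generate(num_bars, seed=0):
    """Generate note symbols with bar -> beat -> note guidance."""
    rng = random.Random(seed)
    bar_layer = StubLayer(8, rng)    # coarse level: bar profiles
    beat_layer = StubLayer(8, rng)   # middle level: beat profiles
    note_layer = StubLayer(38, rng)  # fine level: notes

    notes, trace = [], []
    bar_profile = beat_profile = None
    for t in range(num_bars * STEPS_PER_BAR):
        if t % STEPS_PER_BAR == 0:
            # The bar layer runs once per bar.
            bar_profile = bar_layer.step()
        if t % STEPS_PER_BEAT == 0:
            # The beat layer runs once per beat, guided by the bar profile.
            beat_profile = beat_layer.step(guidance=(bar_profile,))
        # The note layer runs every step, guided by both profiles.
        notes.append(note_layer.step(guidance=(bar_profile, beat_profile)))
        trace.append((bar_profile, beat_profile))
    return notes, trace


if __name__ == "__main__":
    notes, trace = generate(num_bars=2)
    print(len(notes), "note symbols generated")
```

Holding the bar and beat profiles fixed between their own update steps is what lets the slower layers steer the faster one over long spans, which is the structural idea the abstract contrasts with a single-layer LSTM.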

MeSH terms

  • Adolescent
  • Adult
  • Choice Behavior
  • Computer Simulation*
  • Databases, Factual
  • Humans
  • Music*
  • Neural Networks, Computer*
  • Young Adult