Laboratory One Research Blog

90s Pop Lyrics Generator

September 16, 2018


It sucks that the music industry stopped producing new 90s pop songs like ~28 years ago. It’s kind of uncalled for, really. Since our music mogul overlords won’t produce any new hits for us, I’ve decided to write my own. I’m going to focus on writing song lyrics, but because I’m not much of a lyricist, I’ve built an A.I. to help me out. That is, an A.I. which, given an initial set of characters, writes a sequence of characters in the style of a pop song from the years 1990-2000. The A.I. is based on my learnings from Andrew Ng’s Deep Learning course on Sequence Models. To build it, I used a sequence model: a Recurrent Neural Network (RNN) built around a Long Short-Term Memory (LSTM) unit.

Project repository

Problem

How can a sequence of English characters in the style of 1990s pop lyrics be generated?

Dataset Creation

To solve this problem, a dataset which is representative of 90s pop songs must be created. I started with a dataset from Kaggle, 380,000+ lyrics from MetroLyrics. I then extracted a subset of the data, transformed it into rows of individual characters, and one-hot encoded each character to create the X and Y datasets.
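
Here is a minimal sketch of that preprocessing, in Python. The column names assume the MetroLyrics CSV layout, and the file name is illustrative:

import numpy as np
import pandas as pd

# Extract the subset: pop songs from the 90s.
df = pd.read_csv('lyrics.csv')
pop = df[(df['genre'] == 'Pop') & (df['year'].between(1990, 2000))]
text = '\n'.join(pop['lyrics'].dropna()).lower()

# Build index mappings, then one-hot encode sliding windows of
# seq_len characters (X) and the character following each window (Y).
chars = sorted(set(text))
char_to_ix = {c: i for i, c in enumerate(chars)}

seq_len = 20
n = len(text) - seq_len
X = np.zeros((n, seq_len, len(chars)), dtype=np.bool_)
Y = np.zeros((n, len(chars)), dtype=np.bool_)
for i in range(n):
    for t, ch in enumerate(text[i:i + seq_len]):
        X[i, t, char_to_ix[ch]] = 1
    Y[i, char_to_ix[text[i + seq_len]]] = 1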

One difficulty was in determining the character set to be used by the model. There was a surprising number of non-English characters in the original dataset. I manually defined a character set as:

[
  "'", 'a', 'b', 'c', 'd',
  'e', 'f', 'g', 'h', 'i',
  'j', 'k', 'l', 'm', 'n',
  'o', 'p', 'q', 'r', 's',
  't', 'u', 'v', 'w', 'x',
  'y', 'z', '\n', '!', '"',
  '$', '%', '&', '(', ')',
  '*', '+', ',', '-', '.',
  '/', ':', ';', '<', '=',
  '>', '?', '@', '[', '\\',
  ']', '^', '_', '`', '{',
  '|', '}', '~', ' '
]
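
Characters outside this set can simply be dropped before encoding. A small sketch of that filtering, assuming the list above is bound to charset:

# Keep only characters from the manually defined set; everything else
# (accented letters, symbols from other scripts, etc.) is dropped.
allowed = set(charset)
text = ''.join(ch for ch in text if ch in allowed)

char_to_ix = {c: i for i, c in enumerate(charset)}
ix_to_char = {i: c for i, c in enumerate(charset)}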

The Model

The model’s goal is to predict the next character in a sequence of characters. At first, I tried to have the model predict the next word in a sequence of words. That did not work very well, so with the help of a reference project, I decided to move to a character-level model.

The model is given X, a set of one-hot encoded characters, and attempts to predict Y, the next one-hot encoded character in the sequence. The model is an RNN, so each character is fed to the model through an LSTM. The output of the LSTM is then fed into a fully-connected softmax layer, which outputs a probability distribution over the character set, from which the next character can be read off.
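
Here is a minimal Keras sketch of this architecture, using the hyperparameters reported below (128 LSTM units, 20-character inputs, batch size 50, learning rate 0.01). It builds on the X, Y, and charset variables above and is illustrative rather than the exact project code:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.optimizers import Adam

seq_len = 20               # characters per input
vocab_size = len(charset)  # 59 characters in the set above

model = Sequential([
    # The LSTM reads the one-hot encoded window and summarizes it in its state.
    LSTM(128, input_shape=(seq_len, vocab_size)),
    # The fully-connected softmax layer outputs a distribution over the next character.
    Dense(vocab_size, activation='softmax'),
])
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(learning_rate=0.01),
              metrics=['accuracy'])

# Epoch count and validation split here are illustrative.
model.fit(X, Y, batch_size=50, epochs=20, validation_split=0.1)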

[Figure: Model diagram]

Model Performance

I was not able to achieve a high validation accuracy. My best model peaked at 55% validation accuracy, which could definitely be improved upon.

[Figure: Model performance]

Lyric Generation

Since the model predicts the next character of a sequence, lyrics can be generated by making many predictions while updating the input sequence: predict a character, append it, and repeat. Combining these predictions results in a set of new lyrics (the loop is sketched after the list below). After a lot of tuning, I found the following hyperparameters to work well:

  • LSTM units: 128
  • Batch size: 50
  • Learning rate: 0.01
  • Characters per input: 20
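
Concretely, the generation loop can be sketched as follows, assuming the model and character mappings above. Sampling from the softmax output, rather than always taking the argmax, keeps the lyrics from repeating themselves:

import numpy as np

def generate(model, seed, n_chars=400, seq_len=20):
    """Repeatedly predict the next character and append it to the sequence.

    Assumes the seed is at least seq_len characters long.
    """
    text = seed.lower()
    for _ in range(n_chars):
        window = text[-seq_len:]
        x = np.zeros((1, seq_len, len(charset)))
        for t, ch in enumerate(window):
            x[0, t, char_to_ix[ch]] = 1         # one-hot encode the window
        probs = model.predict(x, verbose=0)[0]  # distribution over next char
        probs = probs / probs.sum()             # renormalize float rounding
        next_ix = np.random.choice(len(charset), p=probs)
        text += ix_to_char[next_ix]
    return text

print(generate(model, "sweet dreams are made of these"))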

The following are the results from five training runs using increasing numbers of examples from the dataset. The initial sequence used to generate them was: “sweet dreams are made of these”.

Results

100,000 Examples

sweet dreams are made of these
i wanna i can't be free)
and i can't
i am
she's gonna feel the time
when i am
the way the sigh that the time and from anywording of my all i was you see your heart slawing
it was me christ
all on the sound
and i love you thing that and the someone
i feel down the sunsed and hemplass
and don't have and when you was to the smilestay we try thing you didn't the someone you would be fre
e)
something it is that was to sleep
it was the way that my smile and for you but there's some out then a feeling

300,000 Examples

sweet dreams are made of these you saw you say
and i never knew, you like you to have you to heare
the knew the love can you say,
seems away that don't let your love to see the love that we find to me
to have you strengy with you to me
the trients of the stars with your love didn't let your love to feel the light the knew with you
we'll hold in the stranged
everything that you got to be your love in the stork her strippen baby
forget your fafe the way they wanng

400,000 Examples

sweet dreams are made of these we didn't see you one sight
stay, the thing i can should see me
this is it with me
i love you
when you can see believering that on the sun better let me see out on the love
it's the more i think i stay could care it's me to last to you
i really you stay
we things you stay away
i'm not the sun but i got to  the baby
but i know the feeling her man
i can love you it story to cry
i want to the love is we can i'm been to the love of a loving and something will have to love you

500,000 Examples

sweet dreams are made of these but is lost
to thing the bughing to me
one all the far rain me
everything it around for me it that you gonna the last the end the and i want to be
like the sunce is sham, the ladine the heart the chance the thing the world and the sun world
you come to her the way
i see a lost
when the love it the sund leave
this in the wing the heart the baby

1,000,000 Examples

sweet dreams are made of thesesw  d tht t we a to  sfhedt sh t bn o  ing   enofam sejt y we uf t you tathbe dsal du  two th ii  ml i i  ve  i   s'rl t ti   be tasee taor c' s b te bej  yousuenronid s mvl t sr waos     meo yewsoo bqa w t h pstn t t ccyui t cnmoho yo u ont,
m tio  ali t whap an t no po n tey al sghi
y idm t tu !   athen  svej g
id hah hai htde me to yl ia  it  swr
t sgmh ato ude a aoer temes y t da o av o f a sou i iioug d otbi t ma as se' f no m s, t i t p o ola  ho yli s w  yb eraee b

Conclusion

At 1,000,000 examples, the model became overtrained and stopped being effective; the output was gibberish. Additionally, training times became very long, so I wasn’t willing to experiment more with the hyperparameters. Even though the model’s validation accuracy peaked at 55%, the generated lyrics were convincing. To me, this solution has a lot of promise.

I could see several ways of improving the model. The model could be deepened by adding another layer of LSTMs. Making the model bidirectional could help the lyrics make more sense. Another improvement could come from being more selective about the songs in the dataset: “The Little Drummer Boy” is in there, and it’s not what I think of as a 90s pop song.
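
As an untested sketch, the deeper bidirectional variant might look like this in Keras; the first layer needs return_sequences=True so the second layer receives the whole sequence:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Bidirectional

deeper = Sequential([
    # Bidirectional LSTM over the input window; returns the full sequence
    # so a second LSTM layer can be stacked on top of it.
    Bidirectional(LSTM(128, return_sequences=True),
                  input_shape=(seq_len, vocab_size)),
    LSTM(128),
    Dense(vocab_size, activation='softmax'),
])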

Next Steps

I want to take this model into production. To do so, the code needs to be abstracted so that it can be used in production, and the model should be exported after it’s trained. Additionally, lyric generation should be in its own script.

I hope to deploy this model to a web application with TensorFlow.js, and perhaps an iOS application with Core ML.
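
A sketch of that export path, assuming the Keras model above (file names are illustrative):

# Save the trained model so a separate generation script can load it.
model.save('lyrics_model.h5')

# For the web app, the saved model can then be converted for
# TensorFlow.js with the tensorflowjs converter (run in a shell):
#   pip install tensorflowjs
#   tensorflowjs_converter --input_format keras lyrics_model.h5 web_model/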



Written by Peter Chau, a Canadian Software Engineer building AIs, APIs, UIs, and robots.

peter@labone.tech

laboratoryone

laboratory_one