Just another char-rnn generator model using Blocks and Spanish lyrics
Sep 29, 2015
Andrej Karpathy’s blog post
gained a lot of popularity because of its interesting results generating text
in several scenarios. Taking a single txt file as input, he was able to
automatically generate text char by char, from Shakespeare’s dialogues to C++ code.
The same strategy was also used to generate
music
and Bible verses. Those results
come from learning a language model with recurrent neural networks (RNNs).
This post is just another exploration of this kind of model. This time it is
the turn of Spanish reggaeton lyrics. I thought it could be fun to train a
language model on this genre because of the strong inclination to messy (and
sometimes also gross) speech in many of its lyrics. You can jump directly to the
generated text, or read a bit of technical detail below.
About the implementation
The learning-by-doing approach works for me, so I wanted to reimplement the
original Torch version using the
Blocks framework.
Blocks is a very useful library that makes it easier to build complex neural
network architectures with Theano.
Blocks already ships a ready-to-use SequenceGenerator that includes additional
components such as attention models and emitters. Since my goal was to better
explore and understand the details of generative RNN models (preprocessing,
sampling, etc.), I decided to keep it as simple as possible, mimicking the
original code and implementing the sampling process on my own. The implemented
code is available on GitHub.
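
The preprocessing part essentially boils down to reading the corpus as one long
string and mapping each character to an integer index. A minimal sketch of that
step (the file name and variable names here are just illustrative, not the ones
used in the repository):

    import numpy

    # Illustrative sketch: build the character vocabulary and encode
    # the corpus as an array of integer indices.
    with open('lyrics.txt') as f:  # placeholder file name
        text = f.read()

    chars = sorted(set(text))
    char_to_ix = {c: i for i, c in enumerate(chars)}
    ix_to_char = {i: c for i, c in enumerate(chars)}

    # Integer-encoded corpus, ready to be cut into fixed-length training sequences.
    data = numpy.array([char_to_ix[c] for c in text], dtype='int32')
    print('%d characters, vocabulary of size %d' % (len(text), len(chars)))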
Training procedure
I took 241 reggaeton artists from Freebase,
and matched them against around 100,000 song lyrics crawled from the web, resulting
in 6,800 reggaeton songs with 11 million characters. I trained an RNN with 3
layers of 512 GRU units each, yielding around 5 million parameters in total.
I didn’t use dropout. Training took around 3 hours for 10 epochs.
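
Once the network is trained, text is generated one character at a time: the model
outputs a distribution over the next character, one character is sampled from it
(optionally rescaled by a temperature), and that character is fed back as the next
input. A rough numpy sketch of that loop, assuming a hypothetical step(char_ix, state)
function wrapping the compiled Theano step of the trained network:

    import numpy

    def sample(step, ix_to_char, seed_ix, initial_state,
               length=500, temperature=1.0):
        # step(char_ix, state) is assumed to return (probs, new_state),
        # where probs is the softmax distribution over the next character.
        state, ix, out = initial_state, seed_ix, []
        for _ in range(length):
            probs, state = step(ix, state)
            # Temperature rescaling: lower values give more conservative text.
            logp = numpy.log(probs) / temperature
            p = numpy.exp(logp - logp.max())
            p /= p.sum()
            ix = numpy.random.choice(len(p), p=p)
            out.append(ix_to_char[ix])
        return ''.join(out)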
Once more, this kind of model shows promising results. It captures the
general structure of the corpus, i.e. short lines grouped in paragraphs (like
strophes). The model also learned to correctly open and close brackets and
parentheses. It learned most of the syntactic rules of the Spanish language as well (at
least those that the training corpus exposes), which are usually more complex
than the English ones (verb conjugation, accentuation, additional punctuation, etc.).
But more interestingly, from time to time the model decides to include the
[coro] tag indicating the chorus, to mention some artist in brackets, or
even to include some English phrases, just like typical reggaeton songs do.
On the downside, it generates several misspellings and, in the end, the generated
text does not tell a concrete story. However, we shouldn’t worry too much about
that: after all, these are reggaeton lyrics!