Rappers more educated than pop artists?
DKE student Thomas Vrancken uses text-mining to prove that the vocabulary of rappers is significantly richer than that of pop music artists.
Sentiment, Emotion & Vocabulary Analysis on Music Lyrics
In his project 'Sentiment, Emotion & Vocabulary Analysis on Music Lyrics', carried out for the course Information Retrieval and Text Mining at the Department of Data Science and Knowledge Engineering, Thomas Vrancken conducted a sentiment and emotion analysis on 57,651 songs from 643 artists.
For each artist, he calculated a positive and a negative sentiment score, as well as a score for each of the eight main emotions: joy, sadness, anger, fear, trust, disgust, anticipation and surprise. Both analyses used a tf-idf algorithm. In addition, Vrancken analyzed the breadth of each artist’s vocabulary, based on the number of different words the artist uses.
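To illustrate how such scores might be computed, here is a minimal Python sketch. It is not the project's actual code: the tiny emotion lexicon, the function names, the choice to treat each artist's combined lyrics as one document, and the smoothed idf formula are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy lexicon (word -> emotion). A real analysis would rely on
# a full emotion lexicon; these entries are assumptions for illustration.
EMOTION_LEXICON = {
    "happy": "joy", "smile": "joy",
    "cry": "sadness", "tears": "sadness",
    "hate": "anger", "rage": "anger",
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_emotion_scores(lyrics_by_artist):
    """Return {artist: {emotion: score}} from tf-idf weighted lexicon hits.

    Each artist's combined lyrics are treated as one document (an assumption).
    """
    tokens = {artist: tokenize(text) for artist, text in lyrics_by_artist.items()}
    n_docs = len(tokens)

    # Document frequency: in how many artists' lyrics does each word occur?
    doc_freq = Counter()
    for words in tokens.values():
        doc_freq.update(set(words))

    scores = {}
    for artist, words in tokens.items():
        counts = Counter(words)
        total = len(words) or 1  # avoid division by zero for empty lyrics
        artist_scores = Counter()
        for word, count in counts.items():
            emotion = EMOTION_LEXICON.get(word)
            if emotion is None:
                continue
            tf = count / total
            # Smoothed idf, one common variant among several.
            idf = math.log((1 + n_docs) / (1 + doc_freq[word])) + 1
            artist_scores[emotion] += tf * idf
        scores[artist] = dict(artist_scores)
    return scores


def vocabulary_size(text):
    """Number of different words used, the basis of the vocabulary measure."""
    return len(set(tokenize(text)))
```

Calling `tfidf_emotion_scores({"Artist A": "...", "Artist B": "..."})` yields one score per emotion found in each artist's lyrics, while `vocabulary_size` counts distinct tokens. A fair comparison between artists would probably also need to account for how many lyrics each artist has, which this sketch ignores.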
Natural Language Processing
Natural Language Processing (NLP) is a computer science discipline that analyzes and extracts information from texts. Sentiment Analysis and Emotion Mining are sub-fields of this discipline. Sentiment Analysis examines whether a text has a predominantly positive or negative undertone. Emotion Mining analyzes how much of a specific emotion (e.g. fear, joy or sadness) the text expresses.
Sentiment Analysis and Emotion Mining have many possible commercial applications, such as supporting customer care services and recommending products to online shoppers. In (criminal) investigations, locating emotional communication based on keywords expressing anger, cursing or threats can be a good starting point for discovering potential threats in time.
Surprisingly, there is a strong positive correlation between joy and sadness: artists who sing a lot about joy also express a lot of sadness. On reflection, this result makes intuitive sense. Artists (especially pop artists) who sing a lot about happy feelings also tend to produce more melancholic songs about love and heartache.
Conversely, most rock and rap artists reflect neither of these emotions in their songs. Less surprisingly, these results show a negative correlation between joy and anger, but a strong positive correlation between anger and fear.
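The correlations described above can be measured with the Pearson coefficient. The sketch below is a plain Python illustration, not the project's code; the function name and the per-artist scores are made up and do not come from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


# Made-up per-artist emotion scores, for illustration only.
joy = [0.10, 0.40, 0.35, 0.05, 0.25]
sadness = [0.08, 0.30, 0.33, 0.04, 0.20]
anger = [0.30, 0.05, 0.07, 0.35, 0.15]

print("joy vs sadness:", pearson(joy, sadness))
print("joy vs anger:", pearson(joy, anger))
```

A value close to 1 means the two scores rise together across artists, a value close to -1 means one falls as the other rises, and a value near 0 indicates no linear relationship.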
Looking at the ranking by vocabulary score, one can identify some artists who are known to be quite lyrical (e.g. Eminem, Wu-Tang Clan). The ranking also confirms the notion that hip-hop artists in general have a fairly wide vocabulary.
However, the groundbreaking results came from calculating regressions and correlations between these scores. These showed that:
- Artists who express more negative sentiments than positive also tend to be more lyrical, i.e. have a wider vocabulary
- Artists that express joy and/or sadness tend to be less lyrical and use a much smaller vocabulary
- Artists that use words associated with rap music are more lyrical and have a significantly richer vocabulary than artists such as Backstreet Boys or Britney Spears, who use words linked with pop
These results provide statistical proof of many theories about music, for instance that pop artists do not bother to develop rich lyrics, whereas hip-hop artists do.
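As an illustration of the kind of regression mentioned above, the following sketch fits a straight line by ordinary least squares, relating vocabulary size to the balance of negative over positive sentiment. The article does not specify which regression model was used, so simple linear regression is an assumption here, and all numbers are invented.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented per-artist values, for illustration only:
# x = negative minus positive sentiment score, y = number of different words.
sentiment_balance = [-0.20, -0.10, 0.00, 0.10, 0.25]
vocabulary = [1500, 1800, 2400, 3100, 4200]

slope, intercept = fit_line(sentiment_balance, vocabulary)
print("slope:", slope, "intercept:", intercept)
```

A positive slope in such a fit would correspond to the first finding in the list: the more negative an artist's overall sentiment, the wider the vocabulary tends to be.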