Rappers more educated than pop artists?
DKE student Thomas Vrancken uses text-mining to prove that the vocabulary of rappers is significantly richer than that of pop music artists.
Sentiment, Emotion & Vocabulary Analysis on Music Lyrics
For his project 'Sentiment, Emotion & Vocabulary Analysis on Music Lyrics', carried out in the course Information Retrieval and Text Mining at the Department of Data Science and Knowledge Engineering, Thomas Vrancken conducted a sentiment and emotion analysis of 57,651 songs by 643 artists.
For each artist, he computed a positive and a negative sentiment score, as well as a score for each of eight main emotions: joy, sadness, anger, fear, trust, disgust, anticipation and surprise. Both analyses used a tf-idf algorithm. In addition, Vrancken measured the breadth of each artist’s vocabulary, based on the number of different words the artist uses.
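The article does not describe how tf-idf was combined with the emotion scores, so the following is only a minimal sketch of one plausible approach: weight each word in an artist's lyrics by tf-idf, then sum the weights of the words found in an emotion word list. The tokenizer, the `JOY_WORDS` list, the artist names and the lyrics are all hypothetical and chosen purely for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Deliberately crude tokenizer: lowercase, keep runs of letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf is the term count divided by the document length;
    idf is log(number of documents / number of documents containing the term).
    This is one common variant, not necessarily the one used in the project.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty lyrics
        weights.append(
            {t: (c / total) * math.log(n_docs / doc_freq[t]) for t, c in counts.items()}
        )
    return weights


def emotion_score(term_weights, lexicon):
    # Sum the tf-idf weights of all terms that appear in the emotion lexicon.
    return sum(w for term, w in term_weights.items() if term in lexicon)


# Hypothetical toy data, not taken from the study.
JOY_WORDS = {"happy", "smile", "sunshine"}
lyrics_by_artist = {
    "artist_a": "happy happy smile in the sunshine",
    "artist_b": "cold rain in the street",
}

names = list(lyrics_by_artist)
weights = tf_idf([lyrics_by_artist[n] for n in names])
joy = {n: emotion_score(w, JOY_WORDS) for n, w in zip(names, weights)}
print(joy)
```

In this sketch, words that occur in every artist's lyrics (such as "the") get an idf of zero, so only words that distinguish one artist from another contribute to the score.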
Natural Language Processing
Natural Language Processing (NLP) is a computer science discipline that analyzes and extracts information from texts. Sentiment Analysis and Emotion Mining are sub-fields of this discipline. Sentiment Analysis examines whether a text has a predominantly positive or negative undertone. Emotion Mining analyzes how much of a specific emotion the text expresses (e.g. fear, joy or sadness).
Sentiment Analysis and Emotion Mining have many possible commercial applications, such as supporting customer care services and recommending products to online shoppers. In (criminal) investigations, locating emotional communication based on keywords expressing anger, cursing or threats can be a good starting point for discovering potential threats in time.
Results
Surprisingly, there is a strong positive correlation between joy and sadness: artists who sing a lot about joy also express a lot of sadness. On reflection, this makes sense. Artists (especially pop artists) who sing a lot about happy feelings also tend to produce more melancholic songs about love and heartache.
Conversely, most rock and rap artists reflect neither of these emotions in their songs. Less surprisingly, these results show a negative correlation between joy and anger, but a strong positive correlation between anger and fear.
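To make the idea of a correlation between two emotion scores concrete, here is a minimal sketch of the Pearson correlation coefficient computed over per-artist scores. The article does not say which correlation measure was used, so Pearson is an assumption, and the numbers below are made up for illustration only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-artist scores, one entry per artist (not data from the study).
joy_scores = [0.1, 0.4, 0.5, 0.9]
sadness_scores = [0.2, 0.3, 0.6, 0.8]

r = pearson(joy_scores, sadness_scores)
print(f"joy vs. sadness: r = {r:.2f}")
```

A value of r close to 1 means the two scores rise together across artists, a value close to -1 means one falls as the other rises, and a value near 0 means there is no linear relationship.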
Looking at the ranking by vocabulary score, one can identify some artists who are known to be quite lyrical (e.g. Eminem, Wu-Tang Clan). The ranking also confirms the notion that hip-hop artists in general have quite a wide vocabulary.
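A vocabulary ranking of this kind can be sketched by counting the distinct words in each artist's lyrics and sorting the artists by that count. The tokenizer, artist names and lyrics below are hypothetical, and the project may have normalized the count differently.

```python
import re


def vocabulary_size(lyrics):
    # Number of distinct lowercase word forms in the text.
    return len(set(re.findall(r"[a-z']+", lyrics.lower())))


def rank_by_vocabulary(lyrics_by_artist):
    # Return (artist, distinct word count) pairs, largest vocabulary first.
    sizes = {artist: vocabulary_size(text) for artist, text in lyrics_by_artist.items()}
    return sorted(sizes.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy lyrics, not taken from the data set.
toy_lyrics = {
    "artist_a": "la la la love love la la",
    "artist_b": "crooked letters scatter over concrete verses",
}

ranking = rank_by_vocabulary(toy_lyrics)
print(ranking)
```

A raw count of distinct words grows with the amount of text available, so a comparison across artists with very different numbers of songs would probably need some form of normalization, for example by comparing samples of equal length.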
Conclusion
The groundbreaking results, however, came from calculating regressions and correlations between these scores. These showed that:
- Artists who express more negative than positive sentiment also tend to be more lyrical, i.e. have a wider vocabulary
- Artists who express joy and/or sadness tend to be less lyrical and use a much smaller vocabulary
- Artists who use words associated with rap music are more lyrical and have a significantly richer vocabulary than artists who use words linked with pop, such as Backstreet Boys or Britney Spears
These results provide statistical evidence for many theories about music, for instance that pop artists do not bother to develop rich lyrics, whereas hip-hop artists do.