This is an academic group project on data visualization and dimension reduction. I worked with another classmate to visualize a dataset and apply dimension reduction techniques such as principal component analysis (PCA), linear discriminant analysis (LDA), Laplacian Eigenmaps, and multidimensional scaling (MDS) to gain insight into it.
The Music Genre dataset from Kaggle, originally sourced from Spotify's API, is intended for predicting the genre of songs. It has 50,000 instances and 18 dimensions; however, only 14 dimensions (1 response and 13 features) are used in this project.
Before analyzing the data, we describe the variables to help readers follow the next sections; some of the descriptions are quoted from Spotify.
Genre: indicates a piece of music’s stylistic category. This categorical variable can take on one of 10 possible values: Alternative, Anime, Blues, Classical, Country, Electronic, Hip-Hop, Jazz, Rap and Rock.
Popularity: the popularity of a track, likely based on Spotify’s listening metrics.
Acousticness: a measurement of how acoustic a track is. Values range from 0 to 1, with values closer to 1 indicating a more acoustic track.
Danceability: “Danceability describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable”.
Duration: the duration of the track in milliseconds.
Energy: “represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy”. Based on this description, it’s likely that its value depends on loudness and tempo.
Instrumentalness: a value close to 1 indicates fewer vocals and more instrumentation, while a value closer to 0 indicates more vocals.
Liveness: the probability that a track is a recording of a live performance.
Loudness: describes the average volume of the track in decibels (dB). This variable ranges from -60 to 0 dB.
Speechiness: “detects the presence of spoken words in a track”. Ranging from 0 to 1, it measures how likely a track is to contain spoken words.
Valence: “A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry)”.
Tempo: the speed of a piece, measured in average beats per minute (BPM).
Key: the note that the song is centered around; one of A, A#, B, C, C#, D, D#, E, F, F#, G, G#.
Mode: Either ‘Major’ or ‘Minor’.
Throughout this project, we used MATLAB to perform data visualization and dimension reduction.
Cleaning the dataset: There were many missing fields in the tempo column and bad data, such as -1, in the duration column. To address these issues, we used knnimpute() with k = 5 to replace each missing tempo value with the average of its 5 nearest neighbors under Euclidean distance.
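Our cleaning was done with MATLAB's knnimpute(); purely as an illustration of the idea (not our actual code), k-nearest-neighbor imputation can be sketched in numpy as below. The function name, the k=5 default, and the choice to draw donors only from fully observed rows are simplifications of our own:

```python
import numpy as np

def knn_impute(X, k=5):
    """Fill NaNs in each incomplete row with the mean of the k nearest
    fully observed rows, using Euclidean distance on the observed columns."""
    X = X.astype(float).copy()
    donors = X[~np.isnan(X).any(axis=1)]           # rows with no missing values
    for i in np.where(np.isnan(X).any(axis=1))[0]:
        row = X[i]
        obs = ~np.isnan(row)                        # columns we can compare on
        d = np.sqrt(((donors[:, obs] - row[obs]) ** 2).sum(axis=1))
        nearest = donors[np.argsort(d)[:k]]         # k closest complete rows
        row[~obs] = nearest[:, ~obs].mean(axis=0)   # fill by neighbor average
        X[i] = row
    return X
```

For example, a row missing its tempo would receive the average tempo of the 5 complete rows closest to it on the remaining features.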
Exploring the data: Popularity, danceability, valence, and tempo appear approximately normally distributed. Energy is somewhat left-skewed, with most songs having relatively high energy, while acousticness, duration, instrumentalness, speechiness, and liveness are clearly right-skewed.
![](https://static.wixstatic.com/media/adae10_5abdda14898a428291800d925c7ef420~mv2.png/v1/fill/w_980,h_664,al_c,q_90,usm_0.66_1.00_0.01,enc_avif,quality_auto/adae10_5abdda14898a428291800d925c7ef420~mv2.png)
We found that classical is often an outlier relative to the other genres, and is usually a polar opposite of rap or hip-hop. Rock and alternative, and rap and hip-hop have very similar averages across many of the features.
We used this knowledge to determine which genres (classes) to examine when running our dimension reduction methods and when visualizing our data. The idea is to run our algorithms on some classes which are very different from each other to analyze the effectiveness in the best case scenario. In addition, we wanted to analyze effectiveness in a not-so-good scenario, when the differences in classes are minimal. We also used general knowledge of music genres to make these distinctions. For example, we know that country and rap are very different styles, and rock/alternative are almost indistinguishable (we couldn’t really tell you the difference).
Moreover, we did an exhaustive comparison of PCA across pairs of genres to make sure we chose genres that covered a variety of scenarios. Our PCA method is explained in the next section.
The genres we chose to compare were the six pairs, Classical / Rap, Hip-Hop / Anime, Rap / Country, Classical / Anime, Rock / Alternative, and Rap / Hip-Hop, as well as the two triplets, Classical / Rap / Electronic and Rock / Alternative / Country.
![](https://static.wixstatic.com/media/adae10_ec46450a270445d2bf65d6e365197ec3~mv2.png/v1/fill/w_980,h_668,al_c,q_90,usm_0.66_1.00_0.01,enc_avif,quality_auto/adae10_ec46450a270445d2bf65d6e365197ec3~mv2.png)
Dimension Reduction
It is often hard to work with a high-dimensional dataset. Dimension reduction techniques reduce the dimensionality of the data to make it easier to work with. PCA and LDA are linear methods, whereas MDS and Laplacian Eigenmaps are non-linear. In this project, we used all four techniques to check the separation of the classes, and also to see which method clusters this data most effectively.
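Of the four, Laplacian Eigenmaps is probably the least familiar. We ran it in MATLAB; the numpy sketch below is only meant to convey the idea, and its graph construction (k-nearest-neighbor graph with heat-kernel weights, default k and sigma) is illustrative rather than the settings we actually used:

```python
import numpy as np

def laplacian_eigenmaps(X, n_components=2, k=10, sigma=1.0):
    """Minimal Laplacian Eigenmaps: build a k-NN graph with heat-kernel
    weights, then embed with the bottom non-trivial eigenvectors of the
    symmetric normalized graph Laplacian."""
    n = X.shape[0]
    # pairwise squared Euclidean distances
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(sq[i])[1:k + 1]            # k nearest neighbors (skip self)
        W[i, idx] = np.exp(-sq[i, idx] / (2 * sigma ** 2))
    W = np.maximum(W, W.T)                           # symmetrize the graph
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L_sym = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L_sym)               # ascending eigenvalues
    # drop the trivial constant eigenvector, keep the next n_components
    return vecs[:, 1:n_components + 1]
```

The low-dimensional coordinates preserve the neighborhood structure of the graph, which is why the method is non-linear: points close in feature space stay close in the embedding regardless of any global linear structure.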
We first performed PCA on all ten genres, but the results were poor. Although classical is partially distinguishable from the other genres, the overall separation between classes is weak. Note that our principal components are indexed in ascending order of eigenvalue, so PC1 captures the least variance of our dataset and PC12 the most; under the usual convention, the components would be sorted so that PC1 carries the most variance.
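Our PCA was run in MATLAB; the computation amounts to the following numpy sketch (illustrative, not our actual code), which assumes the features have already been put on comparable scales, since in practice one would standardize first:

```python
import numpy as np

def pca(X, n_components=2):
    """PCA via eigendecomposition of the covariance matrix.
    Returns the projected data and the explained-variance ratios,
    sorted so the first component carries the most variance."""
    Xc = X - X.mean(axis=0)                   # center the features
    C = np.cov(Xc, rowvar=False)              # covariance matrix
    vals, vecs = np.linalg.eigh(C)            # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1]            # reorder: largest variance first
    components = vecs[:, order[:n_components]]
    explained = vals[order] / vals.sum()
    return Xc @ components, explained
```

The ascending output of the eigen-solver is worth noting: without the explicit reordering step, "PC1" would be the direction of least variance, matching the ordering seen in our scree plot.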
![](https://static.wixstatic.com/media/adae10_4f9a6062be93436ab868a6b5075d516f~mv2.png/v1/fill/w_980,h_415,al_c,q_90,usm_0.66_1.00_0.01,enc_avif,quality_auto/adae10_4f9a6062be93436ab868a6b5075d516f~mv2.png)
While the separation between all genres was poor, we saw better results when applying PCA to pairs of genres, so we continued applying the other techniques to two genres at a time. The figures below show that the separation between anime/hip-hop and between classical/rap is good, while country/rap shows moderate separation. The distinction between the other pairs is very poor under most of the techniques. We can clearly see that LDA clusters the data best, whereas the other techniques can only separate the data when the classes are very different.
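For two classes, the LDA that performs so well here reduces to Fisher's linear discriminant: project onto the direction that maximizes between-class separation relative to within-class scatter. A minimal numpy sketch (again illustrative, not our MATLAB code; the function name and the small ridge term added for numerical stability are our own):

```python
import numpy as np

def fisher_lda_direction(X0, X1):
    """Two-class Fisher LDA: the projection direction is
    w = Sw^{-1} (mu1 - mu0), where Sw is the within-class scatter."""
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    Sw = np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False)
    # small ridge keeps the solve stable if Sw is near-singular
    w = np.linalg.solve(Sw + 1e-8 * np.eye(Sw.shape[0]), mu1 - mu0)
    return w / np.linalg.norm(w)
```

Projecting both classes onto w (e.g. `X0 @ w` and `X1 @ w`) gives the one-dimensional view in which well-separated genre pairs, such as classical and rap, form two clearly distinct clusters.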
![](https://static.wixstatic.com/media/adae10_81689ec953294c21bc93d64f9d8a10fa~mv2.png/v1/fill/w_980,h_691,al_c,q_90,usm_0.66_1.00_0.01,enc_avif,quality_auto/adae10_81689ec953294c21bc93d64f9d8a10fa~mv2.png)
![](https://static.wixstatic.com/media/adae10_2e564dc8091a40459f39f3d1b39c5f0b~mv2.png/v1/fill/w_974,h_696,al_c,q_90,enc_avif,quality_auto/adae10_2e564dc8091a40459f39f3d1b39c5f0b~mv2.png)
![](https://static.wixstatic.com/media/adae10_d2a97482ddd34e28aeb1d4f19187fe15~mv2.png/v1/fill/w_980,h_688,al_c,q_90,usm_0.66_1.00_0.01,enc_avif,quality_auto/adae10_d2a97482ddd34e28aeb1d4f19187fe15~mv2.png)
![](https://static.wixstatic.com/media/adae10_3a3ba8743165435da6d1eef5129680b7~mv2.png/v1/fill/w_854,h_620,al_c,q_90,enc_avif,quality_auto/adae10_3a3ba8743165435da6d1eef5129680b7~mv2.png)
For subsets of three genres, LDA is again the technique that best separates the data. We can see that Classical/Electronic/Rap shows better separation than Rock/Alternative/Country.
![](https://static.wixstatic.com/media/adae10_41085186ef0d486eb088b2d81aab25ca~mv2.png/v1/fill/w_956,h_534,al_c,q_90,enc_avif,quality_auto/adae10_41085186ef0d486eb088b2d81aab25ca~mv2.png)
![](https://static.wixstatic.com/media/adae10_611b950de86f4e2d9b3f1c75929b29d7~mv2.png/v1/fill/w_900,h_642,al_c,q_90,enc_avif,quality_auto/adae10_611b950de86f4e2d9b3f1c75929b29d7~mv2.png)
![](https://static.wixstatic.com/media/adae10_001c9f3684f54dd0b566ceed706b9a87~mv2.png/v1/fill/w_960,h_540,al_c,q_90,enc_avif,quality_auto/adae10_001c9f3684f54dd0b566ceed706b9a87~mv2.png)
![](https://static.wixstatic.com/media/adae10_88113dcc197144b7a247188768b0d5de~mv2.png/v1/fill/w_980,h_615,al_c,q_90,usm_0.66_1.00_0.01,enc_avif,quality_auto/adae10_88113dcc197144b7a247188768b0d5de~mv2.png)
Resources:
Is my Spotify music boring? An analysis involving music, data, and machine learning, https://towardsdatascience.com/is-my-spotify-music-boring-an-analysis-involving-music-data-and-machine-learning-47550ae931de
Code: https://drive.google.com/drive/u/0/folders/1ZjAeexcz_QsRa_wnfG8SDIzcYNyZZFdW