Italian Music Bands

Download the Data & Code

This is the data and code necessary to reproduce the results in the paper “Node Attribute Analysis for Cultural Data Analytics: a Case Study on Italian XX-XXI Century Music.”

Running the code requires Python and R. To also generate the figures, you can use the provided scripts with Gnuplot.

To run the code, you’re going to need the following Python libraries: numpy, scipy, pandas, networkx, torch, torch_geometric, scikit-learn, pandarallel, seaborn, and matplotlib.

You’re also going to need the following libraries for R: relaimpo and stargazer.
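
Before running anything, it can help to check that the Python libraries are importable. The sketch below is not part of the archive. It assumes each library is available under its usual import name (scikit-learn is imported as “sklearn”), and the helper name “missing_modules” is made up for this example.

```python
from importlib.util import find_spec

# Import names for the libraries listed above.
# Assumption: scikit-learn is checked under its import name, "sklearn".
REQUIRED = [
    "numpy", "scipy", "pandas", "networkx", "torch", "torch_geometric",
    "sklearn", "pandarallel", "seaborn", "matplotlib",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("Missing:", ", ".join(missing) if missing else "none")
```

This only checks that each module can be located. It does not check versions, and it does not cover the R libraries.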

The archive contains four folders.

  • “data”. This folder contains the raw data. The dataset is stored in five files. They are all tab-delimited.
    • “genres.tsv”. The first column is a band id. The following columns contain the count of the number of records the given band has released for the given genre.
    • “ids.tsv”. A two-column file connecting a band id with its name.
    • “network.tsv”. The temporal bipartite network. Three columns. Each row is an artist (first column) playing for a band (second column) in a given year (third column).
    • “regions.tsv”. The first column is a band id. The following columns contain a binary value, equal to one if the band originated from a given region.
    • “years.tsv”. The first column is a band id. The following columns contain a binary value, equal to one if the band released a record in the given year.
  • “figures”. This folder contains three Gnuplot scripts to generate some of the figures of the paper. You can call them with the command “gnuplot scriptname.gp”, provided you have first run the Python scripts that generate their inputs.
  • “outputs”. The outputs of the Python scripts will be put here. The folder is pre-populated with the Cytoscape session files generating the network visualizations of the paper, along with the unipartite projections of the bipartite network.
  • “scripts”. This folder contains the code to reproduce the results. It has a “lib” subfolder with custom Python libraries necessary to run the code. All scripts can be run by calling “python scriptname.py” or “Rscript scriptname.r”, and none of them requires any parameter setting. Be sure to run them in order and before calling the scripts in the “figures” folder.
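
As an illustration of the “network.tsv” layout described above, here is a minimal sketch that reads artist, band, and year triples and counts how many artists each pair of bands shares. It is not the code from the “scripts” folder, and the shared-artist count is only a simple stand-in, not necessarily the projection used in the paper. The function names and the sample rows are made up; rows whose third column is not a number (such as a possible header) are skipped.

```python
import csv
import io
from collections import defaultdict
from itertools import combinations


def read_network(handle):
    """Yield (artist, band, year) triples from a tab-delimited file object."""
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines, a possible header row, or malformed rows.
        if len(row) != 3 or not row[2].strip().isdigit():
            continue
        artist, band, year = row
        yield artist, band, int(year)


def shared_artist_counts(triples):
    """Count, for each pair of bands, the artists who played in both."""
    bands_by_artist = defaultdict(set)
    for artist, band, _year in triples:
        bands_by_artist[artist].add(band)
    counts = defaultdict(int)
    for bands in bands_by_artist.values():
        for pair in combinations(sorted(bands), 2):
            counts[pair] += 1
    return dict(counts)


# Made-up rows for illustration only; they are not taken from the dataset.
sample = (
    "a1\tb1\t1990\n"
    "a1\tb2\t1995\n"
    "a2\tb1\t1990\n"
    "a2\tb2\t1991\n"
    "a3\tb3\t2001\n"
)
edges = shared_artist_counts(read_network(io.StringIO(sample)))
print(edges)
```

To try it on the real file, replace the in-memory sample with a file opened via open("data/network.tsv", newline=""), assuming the working directory is the root of the extracted archive.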
