Building the database

Windows installation

Requirements

  • Windows 10

  • Python 3.10 accessible via PATH (i.e., python via cmd.exe should run without errors)

  • FFmpeg accessible via PATH

  • Git
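The requirements above can be checked from Python before going any further. The following is a minimal sketch and is not part of the repository: the function name check_requirements is hypothetical, and it only confirms that the interpreter is Python 3.10 and that the tools can be found on PATH, not that they work correctly.

```python
import shutil
import sys


def check_requirements(tools=("ffmpeg", "git"), python_version=(3, 10)):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    # Compare major.minor only, since the requirement is stated as "Python 3.10"
    if tuple(sys.version_info[:2]) != tuple(python_version):
        found = ".".join(str(part) for part in sys.version_info[:2])
        wanted = ".".join(str(part) for part in python_version)
        problems.append(f"Python {wanted} expected, found {found}")
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems
```

Calling check_requirements() and printing each returned string gives a quick summary of anything that still needs installing.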

Setting up

First, clone the source code repository for the Jazz Trio Database to a new folder on your local machine by running the following command in a terminal:

git clone https://github.com/HuwCheston/Jazz-Trio-Database

We’d now recommend that you create a new virtual environment to keep the dependencies required for the Jazz Trio Database separate from your main Python environment. To do so, run the following:

cd Jazz-Trio-Database
pip install virtualenv
python -m venv venv
call venv\Scripts\activate.bat
python test_environment.py

Your terminal prompt should now have a (venv) prefix, reading something like (venv) C:\Jazz-Trio-Database.
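If the prompt is ambiguous, the interpreter itself can report whether a virtual environment is active. This is a small sketch using only the standard library; the function name in_virtualenv is hypothetical, and the check assumes an environment created with venv (or a recent virtualenv).

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment folder, while
    # sys.base_prefix still points at the base Python installation
    return sys.prefix != sys.base_prefix
```

Running this with the python found on PATH after activation should return True; if it returns False, the activation step did not take effect in the current terminal.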

Next, install the project dependencies:

pip install -r requirements.txt

Audio download and separation

Next, you can run the following command to download and separate the audio:

python src\clean\make_dataset.py

>>> processing "Blues for Gwen" from 1962 album Inception, leader McCoy Tyner ...
>>> ...
>>> ...
>>> dataset corpus_chronology made in X secs !

For each track, this command first downloads the audio from YouTube (stored in .\data\raw), then separates it into piano, bass, and drums stems (stored in .\data\processed). Spleeter (Hennequin et al., 2020) is used for piano separation, and Demucs (Rouard et al., 2022) for bass and drums separation.

Warning

By default, if a YouTube source cannot be found, the program will retry the download a set number of times before terminating. Sources that cannot be found will be printed directly to the console. If this occurs, first check your internet connection and the source itself. If a particular source has been removed from YouTube, please open a new issue on the GitHub repository or [contact us](mailto:hwc31@cam.ac.uk?subject=Missing YouTube link&cc=huwcheston@gmail.com).
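The retry behaviour described in this warning can be pictured with a generic retry loop. This is a sketch of the general pattern only, not the code in make_dataset.py: the function name download_with_retries, the download callable, and the default of three attempts are all assumptions made for illustration.

```python
import time


def download_with_retries(download, url, max_retries=3, delay_secs=0.0):
    """Call download(url) until it succeeds or max_retries attempts have failed."""
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            return download(url)
        except Exception as err:  # a real implementation would catch narrower errors
            last_error = err
            # Report the failing source on the console, as the warning describes
            print(f"attempt {attempt}/{max_retries} failed for {url}: {err}")
            if attempt < max_retries:
                time.sleep(delay_secs)
    # Every attempt failed, so stop rather than continue with missing audio
    raise RuntimeError(f"could not download {url}") from last_error
```

Here download stands in for whatever function fetches a single source; any callable that takes a URL and raises an exception on failure will work.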

By default, this command will process audio for the tracks listed in .\references\corpus_chronology.xlsx: this contains 300 tracks by 30 different pianists, and is the dataset used in Cheston et al. (2024). To process audio from a different corpus, pass the -corpus flag to the above command, followed by the name of the corpus spreadsheet inside .\references (without the .xlsx extension). For example, to process audio from .\references\corpus_bill_evans.xlsx:

python src\clean\make_dataset.py -corpus corpus_bill_evans

>>> processing "A Sleepin Bee" from 1968 album At the Montreux Jazz Festival, leader Bill Evans ...
>>> ...
>>> ...
>>> dataset corpus_bill_evans made in X secs !
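The relationship between a -corpus value and the spreadsheet it refers to can be sketched as follows. Both function names are hypothetical and not taken from the repository, and the corpus_ filename prefix used for listing is an assumption based on the two spreadsheet names mentioned here.

```python
from pathlib import Path


def corpus_spreadsheet(corpus_name="corpus_chronology", references_dir="references"):
    """Build the path of the spreadsheet that a given -corpus value refers to."""
    # The flag takes the spreadsheet name without its .xlsx extension
    return Path(references_dir) / f"{corpus_name}.xlsx"


def available_corpora(references_dir="references"):
    """List the corpus names that could be passed to -corpus (assumes a corpus_ prefix)."""
    return sorted(path.stem for path in Path(references_dir).glob("corpus_*.xlsx"))
```

Run from the repository root, available_corpora() lists the spreadsheets that are present, which is a quick way to check a corpus name before starting a long download.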

Tip

If the program detects that audio for a particular track has already been downloaded or separated, it will skip over this stage when re-running the command. In order to force download or separation, pass the -force-download or -force-separation flags when running the command.
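The skip-unless-forced logic in this tip follows a common caching pattern, sketched below. How the program actually detects existing audio is not stated here, so the file-existence check and the function name needs_processing are assumptions.

```python
from pathlib import Path


def needs_processing(output_path, force=False):
    """Decide whether a download or separation stage should run for one track."""
    # Run when forced, or when the expected output file is not on disk yet
    return force or not Path(output_path).is_file()
```

In this sketch, force plays the role of the -force-download or -force-separation flag for the corresponding stage.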

If you want to suppress the use of either Spleeter or Demucs for separation (not recommended), you can additionally pass the -no-spleeter or -no-demucs flags.

Onset detection

After the audio for a given corpus file has been downloaded and separated, we can move on to extracting onset data from the recordings. To do so, run the following command:

python src\detect\detect_onsets.py

>>> detecting onsets in 300 tracks (0 from disc) using -1 CPUs ...
>>> ...
>>> ...
>>> onsets detected for all tracks in corpus corpus_chronology in X secs !

Warning

The onset detection program will, by default, use every single CPU core available on your machine. To control this, pass in the -n_jobs flag, followed by an integer corresponding to the number of cores to use.
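The "-1 CPUs" in the log output suggests that the flag follows the joblib convention, in which negative values count back from the total number of cores. The sketch below illustrates that convention under this assumption; the function name resolve_n_jobs is hypothetical, and capping positive values at the number of available cores is a choice made for this example.

```python
import os


def resolve_n_jobs(n_jobs=-1):
    """Translate an n_jobs value into a concrete number of worker processes."""
    available = os.cpu_count() or 1  # os.cpu_count() can return None
    if n_jobs == 0:
        raise ValueError("n_jobs must not be 0")
    if n_jobs < 0:
        # joblib convention: -1 means all cores, -2 all but one, and so on
        return max(1, available + 1 + n_jobs)
    # Assumption for this sketch: never report more workers than cores
    return min(n_jobs, available)
```

For example, on a machine where other work needs to keep running, passing a small positive integer to -n_jobs leaves the remaining cores free.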

This command will detect onsets in the source-separated piano, bass, and drums files for each track in the corpus (contained in .\data\processed), and beats in the overall audio mixture (contained in .\data\raw). Once again, this command uses the tracks described in .\references\corpus_chronology.xlsx by default. This can be changed by passing the -corpus flag and a valid corpus file, as above.

Tip

If the program detects that onsets have already been detected for a particular track, this will be skipped when re-running the command. To force processing, pass in the -ignore_cache flag when running the command.

By default, the program will detect onsets for every track in the provided corpus file. If you just want to check the results out without processing every track, you can pass in the -annotated-only flag when running the command to only process those tracks with corresponding manual annotation files (located inside .\references\manual_annotation).

Tip

Alternatively, you can pass in the -one-track-only flag to (you guessed it!) process only the first track in the corpus.

Check the results

Once the program has finished running, you can check the results by listening to the click tracks contained in .\reports\click_tracks. By default, for the piano, bass, and drums stems, the high-pitched tones in each click track correspond to onsets that were matched with a quarter-note beat, and the lower-pitched tones to onsets that were not matched. In the audio mixture (mix), high-pitched tones correspond to the estimated downbeats and low-pitched tones to all other beats.
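The high and low tones in the stem click tracks can be pictured with the toy example below. It is not the matching procedure used by the onset detection code: the fixed tolerance window, its 0.05 second default, and the function name click_pitches are all assumptions made purely to illustrate the idea of matched versus unmatched onsets.

```python
def click_pitches(onsets, beats, tolerance=0.05):
    """Label each onset 'high' if it lies within tolerance seconds of a beat, else 'low'."""
    labels = []
    for onset in onsets:
        # An onset counts as matched when any beat falls inside the window around it
        matched = any(abs(onset - beat) <= tolerance for beat in beats)
        labels.append("high" if matched else "low")
    return labels
```

Here onsets and beats are lists of timestamps in seconds; an onset falling midway between two beats would be labelled low and so would sound as the lower-pitched click.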

Tip

To run the onset detection without generating click tracks, pass in the -no_click argument when running the command.

The program will also output a Python pickle file inside the .\models directory. This contains the serialised OnsetMaker class instances (defined in src\detect\detect_utils.py).
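A pickle file like this can be read back with the standard library, as in the sketch below. The function name load_onset_makers is hypothetical, and the path argument is a placeholder for whichever file appears in .\models. Python can only rebuild the OnsetMaker instances if the class definition is importable, so this would need to be run from the repository root with the virtual environment active.

```python
import pickle
from pathlib import Path


def load_onset_makers(pickle_path):
    """Load the object stored in a pickle file written by the onset detection step."""
    # Only unpickle files you created yourself: unpickling can execute arbitrary code
    with Path(pickle_path).open("rb") as file:
        return pickle.load(file)
```

The returned object can then be inspected interactively or passed on to later processing steps.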

You can now run many of the analysis scripts contained inside the .\notebooks directory, or pass the serialised OnsetMaker instances to the feature extraction classes defined in src\features\features_utils.py.

Linux installation

TODO