
Utility functions, classes, and constants for analysis and modelling


append_zoom_array(perf_df, zoom_arr[, onset_col])

Appends a column to a dataframe showing the approximate amount of latency applied by AV-Manip for each event in a performance

average_bpms(df1, df2[, window_size, elap, bpm])

Returns a list of averaged BPMs from two performances.

create_model_list(df, avg_groupers[, md])

Subset a dataframe of per-condition results and return a list of statsmodels regression outputs for use in a table.

create_one_simulation(keys_data, drms_data, ...)

Create data for one simulation, using numba optimisations.

extract_event_density(bpm, raw)

Appends a column to a performance dataframe showing the number of actual notes per extracted crotchet

extract_interpolated_beats(c)

Extracts the number of beats in the performance that required interpolation in REAPER.

extract_npvi(s)

Extracts the normalised pairwise variability index (nPVI) from a column of IOIs

extract_pairwise_asynchrony(keys_nn, drms_nn)

Extracts pairwise asynchrony from two matched dataframes.

generate_df(data[, iqr_range, threshold, ...])

Create dataframe from MIDI performance data, either cleaned (just crotchet beats) or raw.

generate_tempo_slopes(raw_data)

Returns average tempo slope coefficients for all performances as list of tuples in the form (trial, block, latency, jitter, avg. slope coefficient).

iqr_filter(col, df[, iqr_range])

Filter duration values below a certain quartile to remove extraneous MIDI notes not cleaned in REAPER

load_data(input_filepath)

Loads all pickled data from the processed data folder

load_from_disc(output_dir[, filename])

Try to load models from disc

log_model(md[, logger])

Helper function to log metadata for a particular model in our GUI, if we've passed a logger function

log_simulation(sim[, logger])

Helper function to log metadata for a particular simulation in our GUI, if we've passed a logger function

reg_func(df, xcol, ycol)

Calculates linear regression between two given columns, returns results table.

resample(perf[, func, col, resample_window, ...])

Resamples an individual performance dataframe to get the mean of every second.

return_average_coeffs(coeffs)

Returns list of tuples containing average coefficient for keys/drums performance in a single trial. Tuples take the form of those in generate_tempo_slopes, i.e. (trial, block, latency, jitter, avg. slope coefficient).

return_coeff_from_sm_output(results)

Formats the table returned by statsmodels to return only the regression coefficient as an integer

test_stationary(array)

Tests if data is stationary, if not returns data with first difference calculated

zip_same_conditions_together(raw_data)

Iterates through raw data and zips keys/drums data from the same performance together. Returns a list of zip objects, each element of which is a tuple containing

src.analyse.analysis_utils.append_zoom_array(perf_df: DataFrame, zoom_arr: array, onset_col: str = 'onset') -> DataFrame

Appends a column to a dataframe showing the approximate amount of latency applied by AV-Manip for each event in a performance

src.analyse.analysis_utils.average_bpms(df1: DataFrame, df2: DataFrame, window_size: int = 8, elap: str = 'elapsed', bpm: str = 'bpm') -> DataFrame

Returns a list of averaged BPMs from two performances. Data is grouped by every second in a performance.

src.analyse.analysis_utils.create_model_list(df, avg_groupers: list, md='correction_partner_onset~C(latency)+C(jitter)+C(instrument)') -> list

Subset a dataframe of per-condition results and return a list of statsmodels regression outputs for use in a table. By default, the regression will average results from the same condition across multiple measures. This can be overridden by setting the averaging argument to False.

src.analyse.analysis_utils.create_one_simulation(keys_data: Dict, drms_data: Dict, keys_params: Dict, drms_params: Dict, keys_noise, drms_noise, lat: ndarray, beats: int) -> tuple

Create data for one simulation, using numba optimisations. This function is defined outside of the Simulation class to enable instances of the Simulation class to be pickled.

src.analyse.analysis_utils.extract_event_density(bpm: DataFrame, raw: DataFrame) -> DataFrame

Appends a column to a performance dataframe showing the number of actual notes per extracted crotchet

src.analyse.analysis_utils.extract_interpolated_beats(c: array) -> tuple[int, int]

Extracts the number of beats in the performance that required interpolation in REAPER. This was usually due to a performer ‘pushing’ ahead a crotchet beat by a swung quaver, or due to an implied metric modulation.

src.analyse.analysis_utils.extract_npvi(s: Series) -> float

Extracts the normalised pairwise variability index (nPVI) from a column of IOIs
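As a rough illustration of the measure itself, the nPVI of m successive intervals d_1 ... d_m is commonly written as 100 / (m - 1) times the sum of |d_k - d_(k+1)| / ((d_k + d_(k+1)) / 2). The sketch below follows that formula in plain Python. The name `npvi_sketch` is hypothetical, and it is not necessarily how `extract_npvi` is implemented (for instance, it does no missing-value handling).

```python
def npvi_sketch(iois):
    """Hypothetical helper: nPVI of a sequence of inter-onset intervals."""
    iois = [float(x) for x in iois]
    if len(iois) < 2:
        raise ValueError("nPVI needs at least two intervals")
    # Absolute difference of each successive pair, normalised by the pair's mean
    terms = [abs(a - b) / ((a + b) / 2) for a, b in zip(iois, iois[1:])]
    # Average over the m - 1 pairs and scale by 100
    return 100 * sum(terms) / len(terms)
```

Perfectly even intervals give a value of 0, and the value grows as successive intervals contrast more strongly.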

src.analyse.analysis_utils.extract_pairwise_asynchrony(keys_nn: DataFrame, drms_nn: DataFrame) -> float

Extracts pairwise asynchrony from two matched dataframes.

Rasch (2015) defines pairwise asynchrony as the root-mean-square of the standard deviations of the onset time differences for all pairs of voice parts. We can calculate this for each condition, using the nearest-neighbour model for both the keyboard player and the drummer.
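The definition above can be sketched for the two-performer case as follows. This is a minimal illustration, not the module's implementation: the name `pairwise_asynchrony_sketch` is hypothetical, the inputs are assumed to be plain sequences of onset time differences (one per performer, already matched to the partner's nearest onset), and the population standard deviation is an assumption.

```python
import math
import statistics


def pairwise_asynchrony_sketch(keys_async, drms_async):
    """Hypothetical helper: RMS of the per-performer asynchrony standard deviations."""
    # Standard deviation of the onset time differences for each performer
    # (population SD is an assumption; a sample SD would be equally plausible)
    stds = [statistics.pstdev(keys_async), statistics.pstdev(drms_async)]
    # Root-mean-square of those standard deviations
    return math.sqrt(sum(s ** 2 for s in stds) / len(stds))
```

With only two voice parts there is one pair, measured once from each performer's perspective, so the root-mean-square is taken over two standard deviations.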

src.analyse.analysis_utils.generate_df(data: array, iqr_range: tuple = (0.05, 0.95), threshold: float = 0, keep_pitch_vel: bool = False) -> DataFrame

Create dataframe from MIDI performance data, either cleaned (just crotchet beats) or raw.

Optional keyword arguments:
iqr_range: Upper and lower quartile to clean IOI values by.
threshold: Value to remove IOI timings below.
keep_pitch_vel: Keep pitch and velocity columns.

src.analyse.analysis_utils.generate_tempo_slopes(raw_data: list) -> list[tuple]

Returns average tempo slope coefficients for all performances as list of tuples in the form (trial, block, latency, jitter, avg. slope coefficient). Deprecated?

src.analyse.analysis_utils.iqr_filter(col: str, df: DataFrame, iqr_range: tuple = (0.05, 0.95)) -> Series

Filter duration values below a certain quartile to remove extraneous MIDI notes not cleaned in REAPER
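One way such a filter might look with pandas is sketched below. The function name `quantile_filter_sketch` is hypothetical, and masking both tails to NaN is an assumption; the actual `iqr_filter` may treat the bounds differently.

```python
import pandas as pd


def quantile_filter_sketch(col, df, q_range=(0.05, 0.95)):
    """Hypothetical helper: mask values of df[col] outside a quantile range."""
    lower = df[col].quantile(q_range[0])
    upper = df[col].quantile(q_range[1])
    # Keep values inside [lower, upper]; everything else becomes NaN
    return df[col].where(df[col].between(lower, upper))
```

Returning a Series of the same length as the input keeps it aligned with the original dataframe, so it can be assigned straight back as a column.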

src.analyse.analysis_utils.load_data(input_filepath: str) -> list

Loads all pickled data from the processed data folder

src.analyse.analysis_utils.load_from_disc(output_dir: str, filename: str = 'phase_correction_mds.p') -> list

Try to load models from disc
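A "try to load" helper of this kind is often written as a guarded pickle read. The sketch below assumes the models were saved with `pickle` and that a missing or unreadable file should yield `None`; both points are assumptions, and `load_pickle_sketch` is a hypothetical name rather than the module's function.

```python
import os
import pickle


def load_pickle_sketch(output_dir, filename="phase_correction_mds.p"):
    """Hypothetical helper: return the unpickled object, or None if unavailable."""
    path = os.path.join(output_dir, filename)
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except (FileNotFoundError, pickle.UnpicklingError, EOFError):
        # Nothing usable on disc: the caller can rebuild the models instead
        return None
```

Returning `None` rather than raising lets the caller fall back to regenerating the models when no cached copy exists.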

src.analyse.analysis_utils.log_model(md, logger=None) -> None

Helper function to log metadata for a particular model in our GUI, if we’ve passed a logger function

src.analyse.analysis_utils.log_simulation(sim, logger=None) -> None

Helper function to log metadata for a particular simulation in our GUI, if we’ve passed a logger function

src.analyse.analysis_utils.reg_func(df: DataFrame, xcol: str, ycol: str) -> RegressionResults

Calculates linear regression between two given columns, returns results table. Deprecated.

src.analyse.analysis_utils.resample(perf: DataFrame, func=<function nanmean>, col: str = 'my_onset', resample_window: str = '1s', interpolate: bool = True) -> DataFrame

Resamples an individual performance dataframe to get the mean of every second.
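A per-second resample of this kind can be sketched with a timedelta index in pandas. The sketch assumes onsets are stored in seconds in the `my_onset` column and uses the built-in `mean` aggregation (which skips NaN) in place of the `func` argument. The name `resample_sketch` and the `bpm` column in the usage below are hypothetical.

```python
import pandas as pd


def resample_sketch(perf, col="my_onset", resample_window="1s", interpolate=True):
    """Hypothetical helper: mean of every column within each time window."""
    df = perf.copy()
    # Index the rows by onset time (assumed to be in seconds) for resampling
    df.index = pd.to_timedelta(df[col].to_numpy(), unit="s")
    out = df.resample(resample_window).mean()
    if interpolate:
        # Fill windows that contained no events by linear interpolation
        out = out.interpolate()
    return out


perf = pd.DataFrame({"my_onset": [0.0, 0.5, 1.2, 3.1],
                     "bpm": [120.0, 122.0, 118.0, 116.0]})
resampled = resample_sketch(perf)
```

Here the window from 2 to 3 seconds contains no onsets, so its values come from interpolation between the neighbouring windows.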

src.analyse.analysis_utils.return_average_coeffs(coeffs: list) -> list[tuple]

Returns list of tuples containing average coefficient for keys/drums performance in a single trial. Tuples take the form of those in generate_tempo_slopes, i.e. (trial, block, latency, jitter, avg. slope coefficient).

src.analyse.analysis_utils.return_coeff_from_sm_output(results: RegressionResults) -> int

Formats the table returned by statsmodels to return only the regression coefficient as an integer

src.analyse.analysis_utils.test_stationary(array: Series) -> Series

Tests if data is stationary, if not returns data with first difference calculated

src.analyse.analysis_utils.zip_same_conditions_together(raw_data: list) -> list[zip]

Iterates through raw data and zips keys/drums data from the same performance together. Returns a list of zip objects, each element of which is a tuple containing