src.analyse.analysis_utils

Utility functions, classes, and constants for analysis and modelling

Functions

append_zoom_array(perf_df, zoom_arr[, onset_col])

Appends a column to a dataframe showing the approximate amount of latency applied by AV-Manip to each event in a performance

average_bpms(df1, df2[, window_size, elap, bpm])

Returns a list of averaged BPMs from two performances.

create_model_list(df, avg_groupers[, md])

Subset a dataframe of per-condition results and return a list of statsmodels regression outputs for use in a table.

create_one_simulation(keys_data, drms_data, ...)

Create data for one simulation, using numba optimisations.

extract_event_density(bpm, raw)

Appends a column to a performance dataframe showing the number of actual notes per extracted crotchet

extract_interpolated_beats(c)

Extracts the number of beats in the performance that required interpolation in REAPER.

extract_npvi(s)

Extracts the normalised pairwise variability index (nPVI) from a column of IOIs

extract_pairwise_asynchrony(keys_nn, drms_nn)

Extracts pairwise asynchrony from two matched dataframes.

generate_df(data[, iqr_range, threshold, ...])

Create dataframe from MIDI performance data, either cleaned (just crotchet beats) or raw.

generate_tempo_slopes(raw_data)

Returns average tempo slope coefficients for all performances as a list of tuples in the form (trial, block, latency, jitter, avg. slope coefficient).

iqr_filter(col, df[, iqr_range])

Filter duration values below a certain quartile to remove extraneous MIDI notes not cleaned in REAPER

load_data(input_filepath)

Loads all pickled data from the processed data folder

load_from_disc(output_dir[, filename])

Try and load models from disc

log_model(md[, logger])

Helper function to log metadata for a particular model in our GUI, if we've passed a logger function

log_simulation(sim[, logger])

Helper function to log metadata for a particular simulation in our GUI, if we've passed a logger function

reg_func(df, xcol, ycol)

Calculates linear regression between two given columns, returns results table.

resample(perf[, func, col, resample_window, ...])

Resamples an individual performance dataframe to get mean of every second.

return_average_coeffs(coeffs)

Returns a list of tuples containing the average coefficient for keys/drums performance in a single trial. Tuples take the form of those in generate_tempo_slopes, i.e. (trial, block, latency, jitter, avg. slope coefficient)

return_coeff_from_sm_output(results)

Formats the table returned by statsmodels to return only the regression coefficient as an integer

test_stationary(array)

Tests whether data is stationary; if not, returns the data with the first difference calculated

zip_same_conditions_together(raw_data)

Iterates through raw data and zips keys/drums data from the same performance together. Returns a list of zip objects, each element of which is a tuple containing

src.analyse.analysis_utils.append_zoom_array(perf_df: DataFrame, zoom_arr: array, onset_col: str = 'onset') -> DataFrame

Appends a column to a dataframe showing the approximate amount of latency applied by AV-Manip to each event in a performance

src.analyse.analysis_utils.average_bpms(df1: DataFrame, df2: DataFrame, window_size: int = 8, elap: str = 'elapsed', bpm: str = 'bpm') -> DataFrame

Returns a list of averaged BPMs from two performances. Data is grouped by every second in a performance.
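
The per-second grouping described above can be sketched as follows. This is a minimal illustration, not the module's implementation: the name `average_bpms_sketch` is hypothetical, and both the floor-based binning into whole seconds and the use of `window_size` as a rolling-mean window are assumptions.

```python
import numpy as np
import pandas as pd


def average_bpms_sketch(df1, df2, window_size=8, elap="elapsed", bpm="bpm"):
    """Hypothetical helper: average BPM across two performances, per second."""
    # Stack both performances; ignore_index avoids duplicate row labels
    both = pd.concat([df1, df2], ignore_index=True)
    # Assumption: an event belongs to the whole second in which it occurs
    per_second = both.groupby(np.floor(both[elap]))[bpm].mean()
    # Assumption: window_size is the width of a rolling mean over seconds
    smoothed = per_second.rolling(window_size, min_periods=1).mean()
    return smoothed.reset_index()
```

With `window_size=1` the rolling mean is a no-op, so the result is simply the mean BPM of all events from both performers that fall within each second.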

src.analyse.analysis_utils.create_model_list(df, avg_groupers: list, md='correction_partner_onset~C(latency)+C(jitter)+C(instrument)') -> list

Subset a dataframe of per-condition results and return a list of statsmodels regression outputs for use in a table. By default, the regression will average results from the same condition across multiple measures. This can be overridden by setting the averaging argument to False.

src.analyse.analysis_utils.create_one_simulation(keys_data: Dict, drms_data: Dict, keys_params: Dict, drms_params: Dict, keys_noise, drms_noise, lat: ndarray, beats: int) -> tuple

Create data for one simulation, using numba optimisations. This function is defined outside of the Simulation class to enable instances of the Simulation class to be pickled.

src.analyse.analysis_utils.extract_event_density(bpm: DataFrame, raw: DataFrame) -> DataFrame

Appends a column to a performance dataframe showing the number of actual notes per extracted crotchet

src.analyse.analysis_utils.extract_interpolated_beats(c: array) -> tuple[int, int]

Extracts the number of beats in the performance that required interpolation in REAPER. This was usually due to a performer ‘pushing’ ahead a crotchet beat by a swung quaver, or due to an implied metric modulation.

src.analyse.analysis_utils.extract_npvi(s: Series) -> float

Extracts the normalised pairwise variability index (nPVI) from a column of IOIs
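
For reference, the standard nPVI of a sequence of m inter-onset intervals d_1 ... d_m is 100 / (m - 1) times the sum, over successive pairs, of |d_k - d_(k+1)| divided by the pair's mean. A minimal sketch of that formula follows; the name `npvi_sketch` is hypothetical, and dropping missing IOIs before pairing is an assumption rather than documented behaviour of `extract_npvi`.

```python
import numpy as np


def npvi_sketch(iois):
    """Hypothetical helper: nPVI of a sequence of inter-onset intervals."""
    d = np.asarray(iois, dtype=float)
    d = d[~np.isnan(d)]  # assumption: missing IOIs are simply dropped
    if d.size < 2:
        return float("nan")  # nPVI is undefined for fewer than two intervals
    pair_diff = np.abs(d[:-1] - d[1:])   # |d_k - d_(k+1)|
    pair_mean = (d[:-1] + d[1:]) / 2     # mean of each successive pair
    # Averaging over the m - 1 pairs is the 1 / (m - 1) factor in the formula
    return float(100 * np.mean(pair_diff / pair_mean))
```

A perfectly isochronous sequence gives an nPVI of 0, and larger values indicate greater contrast between successive intervals.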

src.analyse.analysis_utils.extract_pairwise_asynchrony(keys_nn: DataFrame, drms_nn: DataFrame) -> float

Extracts pairwise asynchrony from two matched dataframes.

Rasch (2015) defines pairwise asynchrony as the root-mean-square of the standard deviations of the onset time differences for all pairs of voice parts. We can calculate this for each condition, using the nearest-neighbour model for both the keyboard and drummer.
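
The definition above can be sketched for the two-performer case as follows. This is an illustration under stated assumptions: `pairwise_asynchrony_sketch` is a hypothetical name, the inputs are taken to be plain arrays of onset time differences (keys relative to drums, and drums relative to keys) rather than the matched dataframes the real function accepts, and the sample standard deviation (`ddof=1`) is an assumed choice.

```python
import numpy as np


def pairwise_asynchrony_sketch(keys_async, drms_async):
    """Hypothetical helper: RMS of the per-pair asynchrony standard deviations."""
    # One standard deviation per ordered pair of parts, ignoring unmatched (NaN) onsets
    stds = np.array([
        np.nanstd(keys_async, ddof=1),
        np.nanstd(drms_async, ddof=1),
    ])
    # Root-mean-square across the pairs
    return float(np.sqrt(np.mean(stds ** 2)))
```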

src.analyse.analysis_utils.generate_df(data: array, iqr_range: tuple = (0.05, 0.95), threshold: float = 0, keep_pitch_vel: bool = False) -> DataFrame

Create dataframe from MIDI performance data, either cleaned (just crotchet beats) or raw.

Optional keyword arguments:
iqr_range: Upper and lower quartile to clean IOI values by.
threshold: Value to remove IOI timings below.
keep_pitch_vel: Keep pitch and velocity columns.

src.analyse.analysis_utils.generate_tempo_slopes(raw_data: list) -> list[tuple]

Returns average tempo slope coefficients for all performances as list of tuples in the form (trial, block, latency, jitter, avg. slope coefficient). Deprecated?

src.analyse.analysis_utils.iqr_filter(col: str, df: DataFrame, iqr_range: tuple = (0.05, 0.95)) -> Series

Filter duration values below a certain quartile to remove extraneous MIDI notes not cleaned in REAPER
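
A quantile-based filter of this kind might look like the sketch below. The name `quantile_filter_sketch` is hypothetical, and two details are assumptions rather than documented behaviour: values outside both bounds of `iqr_range` are masked, and masked values become NaN instead of being dropped, so the returned Series keeps the dataframe's index.

```python
import pandas as pd


def quantile_filter_sketch(col, df, iqr_range=(0.05, 0.95)):
    """Hypothetical helper: mask values of df[col] outside the given quantiles."""
    lower = df[col].quantile(iqr_range[0])
    upper = df[col].quantile(iqr_range[1])
    # Keep values inside [lower, upper]; everything else becomes NaN
    return df[col].where(df[col].between(lower, upper))
```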

src.analyse.analysis_utils.load_data(input_filepath: str) -> list

Loads all pickled data from the processed data folder

src.analyse.analysis_utils.load_from_disc(output_dir: str, filename: str = 'phase_correction_mds.p') -> list

Try and load models from disc

src.analyse.analysis_utils.log_model(md, logger=None) -> None

Helper function to log metadata for a particular model in our GUI, if we’ve passed a logger function

src.analyse.analysis_utils.log_simulation(sim, logger=None) -> None

Helper function to log metadata for a particular simulation in our GUI, if we’ve passed a logger function

src.analyse.analysis_utils.reg_func(df: DataFrame, xcol: str, ycol: str) -> RegressionResults

Calculates linear regression between two given columns, returns results table. Deprecated.

src.analyse.analysis_utils.resample(perf: DataFrame, func=<function nanmean>, col: str = 'my_onset', resample_window: str = '1s', interpolate: bool = True) -> DataFrame

Resamples an individual performance dataframe to get mean of every second.
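
The resampling step can be illustrated with pandas' own `DataFrame.resample`. This is a sketch under assumptions, not the function's actual body: `resample_sketch` is a hypothetical name, onsets in `col` are assumed to be elapsed seconds, the plain bin mean (which skips NaN values) stands in for the `func` argument, and linear interpolation is assumed for empty windows.

```python
import pandas as pd


def resample_sketch(perf, col="my_onset", resample_window="1s", interpolate=True):
    """Hypothetical helper: mean of every column within each time window."""
    out = perf.copy()
    # Assumption: onsets are elapsed seconds; a datetime index aligns bins to whole seconds
    out.index = pd.to_datetime(out[col], unit="s")
    out.index.name = None
    # Mean per window (NaN values are skipped); windows with no events yield NaN
    out = out.resample(resample_window).mean()
    # Assumption: gaps are filled by linear interpolation between neighbouring windows
    return out.interpolate() if interpolate else out
```

Passing `interpolate=False` leaves empty windows as NaN, which makes it easier to see where a performer played no notes.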

src.analyse.analysis_utils.return_average_coeffs(coeffs: list) -> list[tuple]

Returns a list of tuples containing the average coefficient for keys/drums performance in a single trial. Tuples take the form of those in generate_tempo_slopes, i.e. (trial, block, latency, jitter, avg. slope coefficient)

src.analyse.analysis_utils.return_coeff_from_sm_output(results: RegressionResults) -> int

Formats the table returned by statsmodels to return only the regression coefficient as an integer

src.analyse.analysis_utils.test_stationary(array: Series) -> Series

Tests whether data is stationary; if not, returns the data with the first difference calculated

src.analyse.analysis_utils.zip_same_conditions_together(raw_data: list) -> list[zip]

Iterates through raw data and zips keys/drums data from the same performance together. Returns a list of zip objects, each element of which is a tuple containing