src.analyse.phase_correction_models

Code for generating phase correction models

Functions

generate_phase_correction_models(raw_data, ...)

Generates all phase correction models.

Classes

PhaseCorrectionModel(c1_keys, c2_drms, **kwargs)

A linear phase correction model for a single performance (keys and drums).

class src.analyse.phase_correction_models.PhaseCorrectionModel(c1_keys: list, c2_drms: list, **kwargs)

Bases: object

A linear phase correction model for a single performance (keys and drums).

static _append_zoom_array(perf_df: DataFrame, zoom_arr: array, onset_col: str = 'my_onset') → DataFrame: Appends a column to a dataframe showing the approximate amount of latency applied by AV-Manip to each event in a performance

_apply_elliptic_envelope(delayed_arr: ndarray, nn_np: ndarray) → DataFrame: Applies an EllipticEnvelope filter to data to extract outliers and rematch or set them to missing. Numba isn’t used here, as it does not support sklearn's EllipticEnvelope.

static _cleaning_for_180ms(delayed_arr: ndarray, nn_np: ndarray) → ndarray: Applies specialised cleaning using bins to performances with 180ms of latency. Optimised with numba.

_create_higher_order_phase_correction_models(df: DataFrame, endog: str = 'my_next_ioi_diff', exog_vars: tuple[str] = ('my_prev_ioi_diff', 'asynchrony')) → list: Creates a list of higher order phase correction models, including a greater number of lags of the asynchrony and previous IOI terms

_create_phase_correction_model(df: DataFrame, md: Optional[str] = None): Create the linear phase correction model

_create_summary_dictionary(c: dict, md, nn: DataFrame, rn: int, higher_order_md: list): Creates a dictionary of summary statistics, used when analysing all models.

_extract_asynchrony_third_person(async_col: str = 'asynchrony_third_person', subset: Optional[int] = None) → float: Extracts asynchrony experienced by an imagined third person joined to the Zoom call

static _extract_pairwise_asynchrony(nn, asynchrony_col: str = 'asynchrony')

Extract pairwise asynchrony as a float (in milliseconds, as is standard for this unit in the literature)

Method:

- Carry out the nearest-neighbour matching, and get a series of asynchrony values for both musicians (i.e. keys -> drums with delay, drums -> keys with delay);
- Square all of these values;
- Get the overall mean (here we collapse both arrays down to a single value);
- Take the square root of this mean.
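The steps above amount to a root-mean-square over both musicians' asynchronies. The following is a minimal pure-Python sketch, not the module's implementation: the function name `pairwise_asynchrony_ms` is hypothetical, and it assumes the input asynchronies are in seconds and that missing matches are represented as `None` or NaN.

```python
import math


def pairwise_asynchrony_ms(keys_async, drums_async):
    """Root-mean-square of both musicians' asynchronies, in milliseconds.

    Assumption: inputs are asynchronies in seconds; None/NaN mark missing matches.
    """
    # Pool both directions (keys -> drums and drums -> keys), skipping missing values
    pooled = [
        a for a in (*keys_async, *drums_async)
        if a is not None and not math.isnan(a)
    ]
    if not pooled:
        return float("nan")
    # Square every value, then collapse everything down to a single mean
    mean_square = sum(a * a for a in pooled) / len(pooled)
    # Square root of the mean, converted from seconds to milliseconds
    return math.sqrt(mean_square) * 1000.0
```

Squaring before averaging means that positive and negative asynchronies do not cancel each other out.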

_extract_pairwise_asynchrony_with_standard_deviations(async_col: str = 'asynchrony')

Extract pairwise asynchrony using the standard deviation of the asynchrony

Method:

- Join both nearest-neighbour dataframes in order to match asynchrony values together;
- Square all of these values;
- Get the overall mean (here we collapse both arrays down to a single value);
- Take the square root of this mean;
- Repeat the join process with the other dataframe as the left join, and get the pairwise asynchrony again;
- Take the mean of both (this is to prevent issues with the dataframe join process).

_extract_tempo_slope(subset: Optional[int] = None)

Extracts tempo slope, in the form of a float representing BPM change per second.

Method:

- Resample both dataframes to get the mean BPM value every second;
- Concatenate these dataframes together, and get the mean BPM by both performers each second;
- Compute a regression of this mean BPM against the overall elapsed time and extract the coefficient.
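The tempo slope method can be sketched without pandas as follows. This is an illustration under assumptions rather than the module's code: the names `mean_bpm_per_second` and `tempo_slope` are hypothetical, each performer is assumed to be given as a pair of lists (onset times in non-negative seconds, BPM values), and the regression is a plain ordinary-least-squares slope over at least two distinct seconds.

```python
from collections import defaultdict


def mean_bpm_per_second(onsets, bpms):
    """Average the BPM values that fall within each whole second."""
    bins = defaultdict(list)
    for t, bpm in zip(onsets, bpms):
        bins[int(t)].append(bpm)
    return {sec: sum(v) / len(v) for sec, v in bins.items()}


def tempo_slope(keys, drums):
    """Return BPM change per second, given (onsets, bpms) for each performer."""
    k = mean_bpm_per_second(*keys)
    d = mean_bpm_per_second(*drums)
    # Mean BPM across both performers for every second in which either played
    secs = sorted(set(k) | set(d))
    ys = []
    for s in secs:
        vals = [m[s] for m in (k, d) if s in m]
        ys.append(sum(vals) / len(vals))
    # Ordinary least squares slope of mean BPM against elapsed time
    x_bar = sum(secs) / len(secs)
    y_bar = sum(ys) / len(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(secs, ys))
    den = sum((x - x_bar) ** 2 for x in secs)
    return num / den
```

A positive result indicates that the duo sped up over the performance, and a negative result that they slowed down.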

_format_df_for_model(df: DataFrame) → DataFrame: Coerces a dataframe into the format required for the phase correction model, including setting required columns

_generate_df(data: list, threshold: float = 0.25) → tuple[pandas.core.frame.DataFrame, int]: Create dataframe, append zoom array, and add a column with our delayed onsets. This latter column replicates the performance as it would have been heard by our partner.

_get_contamination_value_from_json(default: Optional[float] = None) → float

_get_rolling_coefficients(nn_df: DataFrame, func=None, ind_var: str = 'latency', dep_var: str = 'my_prev_ioi', cov: str = 'their_prev_ioi') → list

Centralised function for calculating the relationship between IOI and latency variability. Takes in a single independent variable, a single dependent variable and a covariate, as well as a function to apply to these.

Method:

- Get rolling standard deviation values for all passed variables;
- Lag these values according to the maximum lag attribute passed when creating the class instance;
- Apply a function (defaults to regression) onto the lagged and non-lagged variables and return the results.

_get_rolling_standard_deviation_values(nn_df: DataFrame, cols: tuple[str] = ('my_prev_ioi', 'their_prev_ioi', 'latency')) → DataFrame: Extracts the rolling standard deviation of values within a given window size, then resamples to get the mean value for every second.

_iqr_filter(df: DataFrame, col: str) → DataFrame: Applies an inter-quartile range filter to set outlying values for a particular column to missing.
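An inter-quartile range filter of this kind might look like the sketch below, which works on a plain list rather than a dataframe column. The function name `iqr_filter` and the fence multiplier `k=1.5` are assumptions (the conventional Tukey value); the source does not state which multiplier the module uses.

```python
import statistics


def iqr_filter(values, k=1.5):
    """Return a copy of values with outliers replaced by None.

    A value is an outlier if it lies outside [Q1 - k*IQR, Q3 + k*IQR].
    Assumption: k=1.5 is the conventional choice, not taken from the module.
    """
    present = [v for v in values if v is not None]
    # Quartiles of the non-missing values
    q1, _, q3 = statistics.quantiles(present, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # Keep in-range values; set everything else to missing
    return [v if v is not None and lower <= v <= upper else None for v in values]
```

Setting outliers to missing rather than dropping them keeps the result aligned with the other columns for the same events.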

_lag_rolling_values(roll: DataFrame, cols: tuple[str] = ('my_prev_ioi_std', 'their_prev_ioi_std', 'latency_std')) → DataFrame: Shifts rolling values by a given number of seconds and concatenates together with the original dataframe.

_match_onsets(live_arr: ndarray, delayed_arr: ndarray, zoom_arr: ndarray) → DataFrame: For a single performer, matches each of their live onsets with the closest delayed onset from their partner.

static _nearest_neighbour(live_arr, delayed_arr, empty_arr): Carry out the nearest-neighbour matching. Optimised with numba.
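As an illustration of nearest-neighbour onset matching, the sketch below pairs each live onset with the closest delayed onset using a binary search. It is not the numba-optimised routine documented here: the name `nearest_neighbour` and its two-argument signature are hypothetical, and it assumes `delayed` is sorted and non-empty.

```python
from bisect import bisect_left


def nearest_neighbour(live, delayed):
    """Pair each live onset with the closest delayed onset.

    Assumption: delayed is a sorted, non-empty list of onset times.
    """
    matches = []
    for onset in live:
        i = bisect_left(delayed, onset)
        # The closest value must be one of the neighbours of the insertion point
        candidates = delayed[max(i - 1, 0): i + 1]
        closest = min(candidates, key=lambda d: abs(d - onset))
        matches.append((onset, closest))
    return matches
```

Note that several live onsets can end up matched to the same delayed onset, which is why a separate de-duplication step is useful.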

_partial_corr_shifted_rolling_variables(lagged: DataFrame, dep_var: str = 'my_prev_ioi_std', ind_var: str = 'latency_std', cov_var: str = 'their_prev_ioi_std') → list: Gets the partial correlation between dep_var and ind_var, controlling for covariate cov_var

static _regress_shifted_rolling_variables(lagged: DataFrame, dep_var: str = 'my_prev_ioi_std', ind_var: str = 'latency_std', cov_var: str = 'their_prev_ioi_std') → list[float]: Creates a regression model of lagged variables vs non-lagged variables and extracts coefficients.

static _remove_duplicate_matches(nn_np: ndarray) → ndarray: Filters onsets for duplicate matches, then keeps whichever match is closest to median asynchrony time.
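One way to resolve duplicate matches is sketched below. Several details are assumptions that the source does not confirm: the name and list-of-tuples input are hypothetical, asynchrony is taken as matched onset minus live onset, the median is computed over all matched pairs, and losing duplicates are set to `None`.

```python
import statistics


def remove_duplicate_matches(matches):
    """Ensure each matched onset is used at most once.

    matches: list of (live_onset, matched_onset or None) pairs.
    Assumption: losing duplicates are set to None rather than rematched.
    """
    asyncs = [m - l for l, m in matches if m is not None]
    if not asyncs:
        return list(matches)
    median_async = statistics.median(asyncs)
    best = {}  # matched onset -> (distance from median, index of the winner)
    for idx, (l, m) in enumerate(matches):
        if m is None:
            continue
        score = abs((m - l) - median_async)
        if m not in best or score < best[m][0]:
            best[m] = (score, idx)
    keep = {idx for _, idx in best.values()}
    # Winners keep their match; every other duplicate becomes missing
    return [(l, m if idx in keep else None) for idx, (l, m) in enumerate(matches)]
```

Preferring the match closest to the median asynchrony favours the pairing that is most typical of the rest of the performance.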

static _return_granger_causality(nn: DataFrame, maxlag: int = 1) → dict: Calculates Granger causality between time series

src.analyse.phase_correction_models.generate_phase_correction_models(raw_data: list, output_dir: str, logger=None, force_rebuild: bool = False) → tuple[list[src.analyse.phase_correction_models.PhaseCorrectionModel], str]: Generates all phase correction models. Returns the models and a string for logging