Section 4: Interpreting the Data
You should now have generated two clean quantitative data sets, each showing the amount of physical motion in a single musical performance at varying moments of intensity. We now need to carry out statistical analysis to see if there actually is any difference in the motion between the two videos.
What is bootstrapping?
The bootstrap method is useful for estimating the distribution of a statistic (e.g. the mean or variance) without assuming that it follows a normal distribution (as is required, e.g., for parametric methods like the t-test). The basic idea of how this method works is described on the Wikipedia page. I’m reproducing the relevant passage below:
First, we resample the data with replacement, and the size of the resample must be equal to the size of the original data set. Then the statistic of interest is computed from the resample from the first step. We repeat this routine many times to get a more precise estimate of the Bootstrap distribution of the statistic.
In more detail:
Take a dataset, e.g. [1, 2, 3, 4, 5] (with mean = 3);
Create a resampled version of that dataset by drawing values with replacement, so that one value can appear in the new dataset several times, e.g. [2, 2, 2, 3, 3];
Calculate the statistic of interest in the resampled dataset, e.g. mean = 2.4 for our resample above;
Repeat this process N times, so that we end up e.g. with an array of N means. Usually the number of resamples is pretty high, but it depends on what we’re interested in doing with the distribution. For confidence intervals, I’d be skeptical of any analysis that doesn’t draw at least 10,000 resamples;
Consider the distribution of bootstrapped statistics, e.g. calculate a 95% confidence interval by taking the 2.5th and 97.5th percentiles.
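The steps above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a reference implementation: the function names `bootstrap_means` and `percentile`, the fixed seed, and the linear-interpolation percentile rule are all assumptions made for the example.

```python
import random
import statistics


def bootstrap_means(data, n_resamples=10_000, seed=0):
    """Return a sorted list of means, one per bootstrap resample."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    n = len(data)
    means = []
    for _ in range(n_resamples):
        # Resample with replacement, same size as the original dataset.
        resample = rng.choices(data, k=n)
        means.append(statistics.fmean(resample))
    means.sort()
    return means


def percentile(sorted_values, q):
    """Percentile q (0-100) of pre-sorted values, using linear interpolation."""
    pos = (len(sorted_values) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


data = [1, 2, 3, 4, 5]
means = bootstrap_means(data)
ci_low, ci_high = percentile(means, 2.5), percentile(means, 97.5)

print(f"sample mean = {statistics.fmean(data)}")
print(f"95% CI for the mean: ({ci_low:.2f}, {ci_high:.2f})")
```

With only five data points the interval is wide; the same functions work unchanged on a longer column of values.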
Calculate bootstrapped confidence intervals
Note
If you’re comfortable with R or Python, feel free to try bootstrapping your dataset using these languages. In Python, you’ll probably want to investigate the df.sample method in pandas (with replace=True), or the scipy.stats.bootstrap function in SciPy.
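As a rough sketch of the SciPy route, the example below calls `scipy.stats.bootstrap` on a small array. The `motion` values are made up for illustration and stand in for one column of your scaled data. Note that `method="percentile"` is chosen here so that the result matches the 2.5th/97.5th percentile approach described above; SciPy's default method is `"BCa"`, which applies a correction and will give slightly different bounds.

```python
import numpy as np
from scipy.stats import bootstrap

# Made-up values standing in for one column of scaled motion data.
motion = np.array([0.12, 0.35, 0.28, 0.51, 0.44, 0.19, 0.63, 0.40, 0.33, 0.57])

result = bootstrap(
    (motion,),             # data is passed as a sequence of samples
    np.mean,               # the statistic of interest
    n_resamples=10_000,
    confidence_level=0.95,
    method="percentile",   # plain percentile interval, as described in the text
)

ci = result.confidence_interval
print(f"mean = {motion.mean():.3f}, 95% CI = ({ci.low:.3f}, {ci.high:.3f})")
```

No seed is set here, so the bounds will vary slightly from run to run.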
Rather than having to go through the process of bootstrapping manually in Excel or Sheets (which would be pretty painful), we can use the online tool linked here to help us.
In the data box, paste in the results of one of your scaled data columns. Set # simulations to at least 10,000, and quantiles to any set of values that includes P2.5 and P97.5. You can then set the remaining values (mostly related to the output plots) to whatever you want. Finally, click compute, and you’ll be taken to a new page after a few seconds of computation.
What we’re interested in on the new page is the mean row in the Estimation Results of Bootstrap table. Make a note of the two values under the P2.5 and P97.5 columns; these are the lower and upper bounds of our 95% confidence interval. Also, make sure that the computed mean (under the Estimate column) is the same as the actual mean of your data.
Repeat this stage with your second column of scaled data, and make a note of the estimated confidence interval for this column.
Interpreting the results
Consider both confidence intervals. Do they overlap? If so, we can consider the difference between mean motion in the two videos not to be statistically significant: we can’t be certain that the true means are not the same. If the confidence intervals don’t overlap, however, we can say that there is a statistically significant difference in mean motion between the two videos (although see the Important caveat below).
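If you have noted down both intervals, the overlap check itself is a one-line comparison. The helper name `intervals_overlap` and the interval values below are hypothetical, chosen only to illustrate the rule.

```python
def intervals_overlap(ci_a, ci_b):
    """Return True if two (low, high) intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]


# Made-up 95% confidence intervals for the two videos.
low_intensity_ci = (0.21, 0.34)
high_intensity_ci = (0.39, 0.52)

if intervals_overlap(low_intensity_ci, high_intensity_ci):
    print("Intervals overlap: difference not statistically significant.")
else:
    print("Intervals do not overlap: difference is statistically significant.")
```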
Note
If extracting a p-value is important to us, one option is to consider the narrowest possible confidence intervals that overlap. So, we can extend (or shrink) both confidence intervals to find the point at which they just overlap, taking the area outside the confidence interval as our estimate of p: e.g., if the confidence intervals first overlapped at P2 and P98, we’d be able to say p=0.04.
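One way to try this heuristic in Python is to scan from very narrow intervals to very wide ones and stop at the first pair that overlaps. The sketch below is only a rough illustration of the idea in the note: the function names, the seeds, the step size of 0.001 per tail, and the two datasets are all assumptions, and the result should be read as an approximation rather than a formal p-value.

```python
import random
import statistics


def bootstrap_means(data, n_resamples=10_000, seed=0):
    """Sorted bootstrap distribution of the mean (resampling with replacement)."""
    rng = random.Random(seed)
    return sorted(
        statistics.fmean(rng.choices(data, k=len(data)))
        for _ in range(n_resamples)
    )


def central_interval(sorted_means, tail):
    """Interval that leaves a fraction `tail` of the distribution on each side."""
    last = len(sorted_means) - 1
    return sorted_means[round(tail * last)], sorted_means[round((1 - tail) * last)]


def overlap_p_estimate(data_a, data_b, steps=500):
    """Return the area outside the narrowest pair of intervals that overlap."""
    means_a = bootstrap_means(data_a, seed=1)
    means_b = bootstrap_means(data_b, seed=2)
    # tail runs 0.5, 0.499, ..., 0.001: intervals get wider as tail shrinks.
    for k in range(steps, 0, -1):
        tail = k / (2 * steps)
        low_a, high_a = central_interval(means_a, tail)
        low_b, high_b = central_interval(means_b, tail)
        if low_a <= high_b and low_b <= high_a:
            return 2 * tail  # e.g. overlap at P2 and P98 gives 0.04
    return 0.0  # no overlap at any checked width: estimate is below 1 / steps


# Made-up data for two videos.
video_a = [0.20, 0.31, 0.27, 0.35, 0.24, 0.29, 0.33, 0.26]
video_b = [0.30, 0.41, 0.36, 0.44, 0.33, 0.38, 0.45, 0.35]
p_estimate = overlap_p_estimate(video_a, video_b)
print(f"estimated p = {p_estimate:.3f}")
```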
Important
The above analysis should really come with a caveat. What we probably should have done is to resample both datasets (with replacement), calculate the mean of each resample, compute the difference between the two means, and repeat. This would give us a single distribution of mean differences, from which we could extract a confidence interval. If this confidence interval included 0, there would be no significant difference at alpha=0.05. We can also extend this to find a p-value as described above.
Unfortunately, there’s no easy way to do this in Excel or Sheets, and no online calculator that can do it for us. But, if you’re feeling adventurous, give it a go in Python or R!
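For anyone who does want to give it a go, here is a minimal standard-library sketch of the difference-in-means bootstrap described in the caveat. The function names, the seed, the nearest-rank percentile rule, and both datasets are assumptions for illustration; replace the two lists with your own scaled columns.

```python
import random
import statistics


def bootstrap_mean_difference(data_a, data_b, n_resamples=10_000, seed=0):
    """Sorted bootstrap distribution of mean(data_a) - mean(data_b)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Resample each dataset independently, with replacement.
        resample_a = rng.choices(data_a, k=len(data_a))
        resample_b = rng.choices(data_b, k=len(data_b))
        diffs.append(statistics.fmean(resample_a) - statistics.fmean(resample_b))
    diffs.sort()
    return diffs


def central_interval(sorted_values, confidence=0.95):
    """Percentile interval (nearest rank) from pre-sorted bootstrap values."""
    tail = (1 - confidence) / 2
    last = len(sorted_values) - 1
    return sorted_values[round(tail * last)], sorted_values[round((1 - tail) * last)]


# Made-up scaled motion values for the two videos.
low_intensity = [0.12, 0.18, 0.15, 0.22, 0.19, 0.14, 0.25, 0.17]
high_intensity = [0.41, 0.38, 0.52, 0.47, 0.35, 0.44, 0.50, 0.39]

diffs = bootstrap_mean_difference(high_intensity, low_intensity)
ci_low, ci_high = central_interval(diffs)

# If the interval includes 0, there is no significant difference at alpha=0.05.
significant = not (ci_low <= 0 <= ci_high)
print(f"95% CI for the difference in means: ({ci_low:.3f}, {ci_high:.3f})")
print("significant at alpha=0.05" if significant else "not significant at alpha=0.05")
```

Because the two datasets are resampled independently, this treats the videos as unpaired samples, which may or may not suit your data.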
Presenting your findings
Highlight both data columns and generate a line chart or histogram to demonstrate the differences in motion between the two. What differences do you notice between the high and low intensity videos? Is musical intensity significantly related to performer body motion? Why (or why not)?
Try to think about these issues when answering the questions on the following page!