After changes have been made to the code and/or documentation, this page remains on the wiki as a reminder of what was done and how it was done. However, there is no guarantee that it was updated in the end to reflect the final state of the project.
Chances are therefore that this page is considerably outdated and irrelevant. The notes here might not reflect the current state of the code, and you should not use this page as serious documentation.
Z-scores are being used as a means to 'normalize' the data before doing statistics over subjects. However, there are many ways of implementing this, and at the moment there is not much consensus on what the best method is.
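As an illustration of just one of these many variants, the sketch below z-scores a set of values against the mean and standard deviation of a baseline. The function name and the choice of the baseline as reference are assumptions made for this example, not the method settled on here.

```python
import statistics

def zscore_against_baseline(values, baseline):
    """Express each value in units of the baseline's standard deviation.

    Hypothetical helper: using the baseline as the reference distribution
    is an assumption, one of several possible z-scoring schemes.
    """
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)  # sample standard deviation (n - 1)
    if sigma == 0:
        raise ValueError("baseline has zero variance; z-score is undefined")
    return [(v - mu) / sigma for v in values]

baseline_trials = [2.0, 4.0, 6.0]    # mean 4, sample standard deviation 2
activation_trials = [6.0, 8.0]
z_activation = zscore_against_baseline(activation_trials, baseline_trials)
```

Other variants would differ mainly in which data supply the mean and standard deviation (per subject, per condition, or pooled).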
To homogenize the data on two levels:
What we want to accomplish is to control the false alarm rate while at the same time maximizing statistical power.
Ad 1)
Ad 2)
Ad 1)
Ad 2)
-> Problem: these solutions may introduce additional noise!
Properties of the data:
The choice of the 'best' solution will (probably) depend on the characteristics of the data. Ideally, this will lead to a simple scheme that lists the optimal method for each type of data characteristic. The choice of method for a particular dataset then comes down to the question 'what are the properties of the data?'.
The simulated dataset should consist of two conditions, baseline and activation, that can be compared, and it should contain an effect that is small enough not to be detected by the statistical test every time. We will ignore the multiple comparisons problem (MCP): cluster randomization effectively deals with it, so it is not of current interest.
The data should be of the form:
baseline = phys + noise
activation = phys * e1 + e2 + noise

phys = phys_constant + phys_noise * lambda_phys
noise = random_noise * lambda_ext
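A minimal sketch of how one trial pair could be drawn from this model is given below. Several details are assumptions made for the example: phys_noise and random_noise are taken to be standard Gaussian, e1 is read as a multiplicative effect and e2 as an additive effect, the same phys value is shared by both conditions within a pair, and the external noise is drawn independently for each condition. The function name and default values are hypothetical.

```python
import random

def simulate_trial_pair(rng, e1=1.0, e2=0.0, phys_constant=1.0,
                        lambda_phys=0.1, lambda_ext=0.5):
    """Draw one (baseline, activation) pair from the model above.

    Assumptions: Gaussian noise sources, phys shared between the two
    conditions, external noise independent per condition.
    """
    # phys = phys_constant + phys_noise * lambda_phys
    phys = phys_constant + rng.gauss(0.0, 1.0) * lambda_phys
    # noise = random_noise * lambda_ext, drawn separately per condition
    noise_baseline = rng.gauss(0.0, 1.0) * lambda_ext
    noise_activation = rng.gauss(0.0, 1.0) * lambda_ext
    baseline = phys + noise_baseline
    activation = phys * e1 + e2 + noise_activation
    return baseline, activation

rng = random.Random(0)  # fixed seed so the simulation is reproducible
trials = [simulate_trial_pair(rng, e2=0.2) for _ in range(30)]
```

Setting e1 = 1 leaves a purely additive effect of size e2, and setting e2 = 0 leaves a purely multiplicative effect of size e1.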
This gives us the following dimensions:
This way we can vary the size/range of effect and noise, have multiple trials and at the same time repeat the statistical test several times, using freqstatistics.
We need a range of effect and noise sizes such that at the edges the effect is always found (effect high, noise low) or never found (effect low, noise high). In the range in between, it will sometimes turn up and sometimes not. If we repeat the statistical test several times (using the channel dimension) and average the results, we get a measure of the statistical power. We can then go to the next step: repeat this using the different homogenization methods and see whether they improve the statistical power.
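The idea of averaging the outcome of repeated tests can be sketched as follows. This is not the freqstatistics or cluster randomization procedure: a plain paired t-test with a fixed critical value is swapped in to keep the example self-contained. The example also assumes an additive effect, Gaussian noise, no phys noise, and 30 trials per condition. All names are hypothetical.

```python
import math
import random
import statistics

N_TRIALS = 30
T_CRIT = 2.045  # two-sided 5% critical value of Student's t, 29 degrees of freedom

def paired_t(baseline, activation):
    """Paired t statistic on the trial-wise differences."""
    diffs = [a - b for a, b in zip(activation, baseline)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error

def estimate_power(e2, lambda_ext, n_repetitions=200, seed=0):
    """Fraction of repetitions in which the effect is declared significant."""
    rng = random.Random(seed)
    n_significant = 0
    for _ in range(n_repetitions):
        baseline = [1.0 + rng.gauss(0.0, 1.0) * lambda_ext
                    for _ in range(N_TRIALS)]
        activation = [1.0 + e2 + rng.gauss(0.0, 1.0) * lambda_ext
                      for _ in range(N_TRIALS)]
        mask = abs(paired_t(baseline, activation)) > T_CRIT  # True = significant
        n_significant += int(mask)
    return n_significant / n_repetitions

power_easy = estimate_power(e2=2.0, lambda_ext=0.1)  # effect high, noise low
power_null = estimate_power(e2=0.0, lambda_ext=1.0)  # no effect: false alarm rate
```

Evaluating estimate_power over a grid of effect and noise levels gives the kind of power map described above; with e2 = 0 the same quantity estimates the false alarm rate instead.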
To illustrate (using the additive effect model):
The results of using 1 repetition of the statistical test (i.e., 1 channel). The left figure shows 'stat', the right shows 'mask'. The x-axis is the noise level and the y-axis the effect level; the colour codes the stat (t-values) or the mask (1 = significant effect, 0 = no significant effect), respectively.
The results of using 500 repetitions of the statistical test. Since the noise is randomly generated, the results turn out slightly different for each run. Averaging the masks over repetitions reflects the statistical power (= 1 - beta).
To illustrate (using the multiplicative effect model):
The results of using 500 repetitions of the stats test. The left figure shows 'stat', right shows 'mask'.
To test the methods we will take the following steps:
effect model:
noise model:
(of course also combinations of these models are possible)