Table of Contents
, , , ,

Parametric and non-parametric statistics on event related fields

FIXME For the clustering section, it is required that you have the channel locations (elec or grad structure) or that you manually specify a neighbourhood structure.

FIXME This tutorial only works if the option cfg.keepindividual = 'yes' was specified when applying the grandaveraging. Error messages don't make that clear, however.

FIXME the structure of this tutorial does not yet follow the documentation guidelines.

The goal of this tutorial is to provide a gentle introduction into the different options that are implemented for statistical analysis. Here we will use event related potentials, because they are more familiar to most of the audience and easier to visualize. Below we will show how you can do some limited statistical testing using the Matlab statistics toolbox and compare that to the FieldTrip ft_timelockstatistics function.

If you want to do statistics on channel-level power spectra or time-frequency representations of power (as obtained from freqanalysis), you have the same options available as described below, except that you should use the ft_freqstatistics function. If you want to do statistics on source reconstructed activity (as obtained from ft_sourceanalysis), you have the same options available as described below, except that you should use the ft_sourcestatistics function.

Topics that will be covered are parametric statistics on a single channel and time-window, the multiple comparison problem (MCP) if you test many time points and channels and solutions for the MCP, non-parametric randomization testing and cluster-based testing.

Parametric statistics (this will be familiar to all of you)

In this tutorial we will use the same dataset as some of the other tutorials.

The data that will be used in this tutorial is the result of ft_timelockanalysis on all 10 subjects that participared in the experiment, followed by ft_timelockgrandaverage. The data is available from ftp://ftp.fcdonders.nl/pub/fieldtrip/tutorial/eventrelatedstatistics/GA_ERF_orig.mat

We now describe how we can statistically test the difference between the event-related averages for fully incongruent (FIC) and the fully congruent (FC) sentence endings. For this analysis we use planar gradient data. We begin by loading the data structures containing the event-related averages (of the planar gradient data) of all ten subjects:

load GA_ERF_orig % Grand average of planar gradient, 10 subjects

In the field “individual” the event related fields (ERFs) per subject are present. In the field “avg” the grand average (GA) over the ten subjects has been computed. Plot all channels with ft_multiplotER, and channel MLT12 with ft_singleplotER.

cfg = [];
cfg.showlabels  = 'yes';
cfg.layout    	= 'CTF151.lay';
cfg.interactive = 'yes';
figure; ft_multiplotER(cfg,GA_FC, GA_FIC)

cfg = [];
cfg.channel = 'MLT12';
figure; ft_singleplotER(cfg,GA_FC, GA_FIC)

multiplot_GA_FC_FIC.png

singleplotMLT12_GA_FC_FIC.pngg

To get an idea of the data, plot the ERF of channel MLT12 for all subjects:

figure; 
for iSub = 1:10
  subplot(3,4,iSub)
  plot(GA_FC.time,squeeze(GA_FC.individual(iSub,52,:)))
  hold on
  plot(GA_FIC.time,squeeze(GA_FIC.individual(iSub,52,:)),'r')
  title(strcat('subject ',num2str(iSub)))
  ylim([0 1.9e-13])
  xlim([-1 2])
end
subplot(3,4,11); 
text(0.5,0.5,'FC','color','b') ;text(0.5,0.3,'FIC','color','r')
axis off

plotMLT12_allsubj_FC_FIC.png

Between 300ms and 700ms there seems to be a difference between the FC and the FIC condition in the grand average in channel MLT12 (channel 52).

chan = 52;
time = [0.3 0.7];

timesel_FIC = find(GA_FIC.time >= time(1) & GA_FIC.time <= time(2));
values_FIC = mean(GA_FIC.individual(:,chan,timesel_FIC),3);

timesel_FC = find(GA_FC.time >= time(1) & GA_FC.time <= time(2));
values_FC = mean(GA_FC.individual(:,chan,timesel_FC),3);

% plot to see the effect in each subject
M = [values_FC,values_FIC];
figure; plot(M','o-'); xlim([0.5 2.5])
legend({'subj1', 'subj2', 'subj3', 'subj4', 'subj5', 'subj6', ...
        'subj7', 'subj8', 'subj9', 'subj10'}, 'location','EastOutside');

MLT12_300to700ms_allsubj_FC_FIC.png

Test with Matlab function

You can do a dependent samples t-test with the Matlab ttest.m function (in the Statistics toolbox) where you average over this time window and compare. This shows that the effect on the selected time and channel is significant with a t-value of -4.9999 and a p-value of 0.00073905.

%dependent samples ttest
FCminFIC = values_FC - values_FIC;
[h,p,ci,stats] = ttest(FCminFIC, 0, 0.05) % H0: mean = 0, alpha 0.05

Test with FieldTrip function

You can do the same thing in FieldTrip (which does not require the statistics toolbox) using the ft_timelockstatistics function. This gives you the same p-value of 0.00073905

cfg = [];
cfg.channel     = 'MLT12';
cfg.latency     = [0.3 0.7];
cfg.avgovertime = 'yes';
cfg.parameter   = 'individual';
cfg.method      = 'analytic';
cfg.statistic   = 'depsamplesT';
cfg.alpha       = 0.05;
cfg.correctm    = 'no';

Nsub = 10;
cfg.design(1,1:2*Nsub)  = [ones(1,Nsub) 2*ones(1,Nsub)];
cfg.design(2,1:2*Nsub)  = [1:Nsub 1:Nsub];
cfg.ivar                = 1; % the 1st row in cfg.design contains the independent variable
cfg.uvar                = 2; % the 2nd row in cfg.design contains the subject number

stat = ft_timelockstatistics(cfg,GA_FC,GA_FIC)

Exercise 1

Look at the temporal evolution of the effect by changing cfg.latency and cfg.avgovertime in ft_timelockstatistics. You can plot the t-value versus time, the probability versus time and the statistical mask versus time. Note that the output of the ft_timelockstatistics function closely resembles the output of the ft_timelockanalysis function.

The multiple comparison problem and solutions

In the previous paragraph we picked a channel and time window by hand after eyeballing the effect. If you would like to test the significance of the effect in all channels you could make a loop over channels.

%loop over channels
time = [0.3 0.7];
timesel_FIC = find(GA_FIC.time >= time(1) & GA_FIC.time <= time(2));
timesel_FC = find(GA_FC.time >= time(1) & GA_FC.time <= time(2));
clear h p
for iChan = 1:151
  FICminFC = mean(GA_FIC.individual(:,iChan,timesel_FIC),3) - mean(GA_FC.individual(:,iChan,timesel_FC),3);
  [h(iChan), p(iChan)] = ttest(FICminFC, 0, 0.05 ); % test each channel separately
end

% plot uncorrected "significant" channels
cfg = [];
cfg.style     = 'blank';
cfg.layout    = 'CTF151.lay';
cfg.highlight = 'on';
cfg.highlightchannel = find(h);
cfg.comment   = 'no';
figure; ft_topoplotER(cfg, GA_FC)
title('significant without multiple comparison correction')

depttestMATLAB_nomcc.png

But since you're now performing 151 individual tests, you are not controlling the false alarm rate any more. Under the null-hypothesis and with an alpha of 5%, you have a 5% chance of making a false alarm and incorrectly conclude that the null-hypothesis should be rejected. That false alarm rate applies to each test that you perform, so the chance of making a false alarm if you do 151 subsequent tests is much larger than the desired 5%. This is the multiple comparison problem.

To ensure that the false alarm rate is controlled at the desired alpha level over all channels, you should do a correction for multiple comparisons. The best known correction is the Bonferoni correction, which allows to control the false alarm rate if you do multiple independent tests. For a Bonferoni correction you would divide your initial alpha level (i.e. the desired upper boundary for the false alarm rate) by the number of tests that you perform. In this case with 151 channels the alpha level would become 0.05/151 which is 0.00033113. With this alpha the effect is not significant anymore on any channel. However, a Bonferoni correction is too conservative in this case; the effect on one MEG channel is highly correlated in neighboring channels. Although the Bonferoni correction achieves the desired control of the type-I error (false alarm rate), the type-II error (chance of not rejecting H0 when it in fact should be rejected) is much larger than desired.

Below you can see the means by which to implement a Bonferoni correction and find other corrections for multiple comparisons which are less conservative and hence more sensitive.

Test with Matlab function

% with Bonferoni correction for multiple comparisons
for iChan = 1:151
  FICminFC = mean(GA_FIC.individual(:,iChan,timesel_FIC),3) - mean(GA_FC.individual(:,iChan,timesel_FC),3);
  [h(iChan), p(iChan)] = ttest(FICminFC, 0, 0.05/151); % Bonferoni correction for 151 channels
end

Test with FieldTrip function

cfg = [];
cfg.channel     = 'MEG'; %now all channels
cfg.latency     = [0.3 0.7];
cfg.avgovertime = 'yes';
cfg.parameter   = 'individual';
cfg.method      = 'analytic';
cfg.statistic   = 'depsamplesT'
cfg.alpha       = 0.05;
cfg.correctm    = 'bonferoni';

Nsub = 10;
cfg.design(1,1:2*Nsub)  = [ones(1,Nsub) 2*ones(1,Nsub)];
cfg.design(2,1:2*Nsub)  = [1:Nsub 1:Nsub];
cfg.ivar                = 1; % the 1st row in cfg.design contains the independent variable
cfg.uvar                = 2; % the 2nd row in cfg.design contains the subject number

stat = ft_timelockstatistics(cfg,GA_FIC,GA_FC)

Fieldtrip also has other methods implemented for performing a multiple comparison correction, such as FDR. See the statistics_analytic function for the options to cfg.correctm when you want to do a parametric test.

Non-parametric statistics in general

Instead of using the analytic t-distribution to calculate the appropriate p-value for your effect, you can also use a nonparametric randomization test to obtain the p-value.

This is implemented in FieldTrip in the function statistics_montecarlo which is called by ft_timelockstatistics when you set cfg.method = 'montecarlo'. An Monte-Carlo estimate of the significance probabilities and/or critical values is then calculated based on randomizing or permuting your data many times between the conditions.

cfg = [];
cfg.channel     = 'MEG';
cfg.latency     = [0.3 0.7];
cfg.avgovertime = 'yes';
cfg.parameter   = 'individual';
cfg.method      = 'montecarlo';
cfg.statistic   = 'depsamplesT'
cfg.alpha       = 0.05;
cfg.correctm    = 'no';
cfg.numrandomization = 1000;

Nsub = 10;
cfg.design(1,1:2*Nsub)  = [ones(1,Nsub) 2*ones(1,Nsub)];
cfg.design(2,1:2*Nsub)  = [1:Nsub 1:Nsub];
cfg.ivar                = 1; % the 1st row in cfg.design contains the independent variable
cfg.uvar                = 2; % the 2nd row in cfg.design contains the subject number

stat = ft_timelockstatistics(cfg,GA_FIC,GA_FC)

% make the plot
cfg = [];
cfg.style     = 'blank';
cfg.layout    = 'CTF151.lay';
cfg.highlight = 'on';
cfg.highlightchannel = find(stat.mask);
cfg.comment   = 'no';
figure; ft_topoplotER(cfg, GA_FC)
title('Nonparametric: significant without multiple comparison correction')

depttest_nonpara_FieldTrip_nomcc.png


Also in the non-parametric approach for testing of statistical significance different corrections for multiple comparisons such as Bonferoni, fdr, and others are implemented. See the options for cfg.correctm in statistics_montecarlo.

Non-parametric statistics using a cluster-based approach

FieldTrip also implements a special way to correct for multiple comparisons, which makes use of the feature in the data that the effect at neighbouring timepoints and sensors is highly correlated. For more details see the cluster permutation tutorials for ERFs and time frequency data and the publication by Maris and Oostenveld (2007).

With this method for multiple comparisons the following sensors show a significant effect (p<0.05) between 0.3 and 0.7 s.

cfg = [];
cfg.channel     = 'MEG';
cfg.latency     = [0.3 0.7];
cfg.avgovertime = 'yes';
cfg.parameter   = 'individual';
cfg.method      = 'montecarlo';
cfg.statistic   = 'depsamplesT'
cfg.alpha       = 0.05;
cfg.correctm    = 'cluster';
cfg.grad        = GA_FC.grad; %see ft_neighbourselection
cfg.numrandomization = 1000;

Nsub = 10;
cfg.design(1,1:2*Nsub)  = [ones(1,Nsub) 2*ones(1,Nsub)];
cfg.design(2,1:2*Nsub)  = [1:Nsub 1:Nsub];
cfg.ivar                = 1; % the 1st row in cfg.design contains the independent variable
cfg.uvar                = 2; % the 2nd row in cfg.design contains the subject number

stat = ft_timelockstatistics(cfg,GA_FIC,GA_FC)

% make a plot
cfg = [];
cfg.style     = 'blank';
cfg.layout    = 'CTF151.lay';
cfg.highlight = 'on';
cfg.highlightchannel = find(stat.mask);
cfg.comment   = 'no';
figure; ft_topoplotER(cfg, GA_FC)
title('Nonparametric: significant with cluster multiple comparison correction')

depttest_nonpara_FieldTrip_cluster.png


So far we predefined a time window over which the effect was averaged, and tested the difference of that between conditions. You can also not predefine a timewindow and cluster simultaneously over neighboring channels and neighboring time points. See the example below. Now a channel-time cluster is found from 0.33 s untill 0.52 s in which p < 0.05.

cfg = [];
cfg.channel     = 'MEG';
cfg.latency     = [-0.5 2];
cfg.avgovertime = 'no';
cfg.parameter   = 'individual';
cfg.method      = 'montecarlo';
cfg.statistic   = 'depsamplesT'
cfg.alpha       = 0.05;
cfg.correctm            = 'cluster';
cfg.grad                = GA_FC.grad; %see ft_neighbourselection
cfg.numrandomization    = 1000;
cfg.minnbchan           = 2; % minimal neighbouring channels

Nsub = 10;
cfg.design(1,1:2*Nsub)  = [ones(1,Nsub) 2*ones(1,Nsub)];
cfg.design(2,1:2*Nsub)  = [1:Nsub 1:Nsub];
cfg.ivar                = 1; % the 1st row in cfg.design contains the independent variable
cfg.uvar                = 2; % the 2nd row in cfg.design contains the subject number

stat = ft_timelockstatistics(cfg,GA_FIC,GA_FC)

ft_clusterplot is a handy FieldTrip function to plot the effect.

% make a plot
cfg = [];
cfg.highlightsymbolseries = ['*','*','.','.','.'];
cfg.layout = 'CTF151.lay';
cfg.contournum = 0;
cfg.markersymbol = '.';
cfg.alpha = 0.05;
cfg.zlim = [-5 5];
ft_clusterplot(cfg,stat);

depttest_nonpara_FieldTrip_cluster_fig1.png

depttest_nonpara_FieldTrip_cluster_fig2.png

depttest_nonpara_FieldTrip_cluster_fig3.png

depttest_nonpara_FieldTrip_cluster_fig4.png