Table of Contents
, , , , ,

Cluster-based permutation tests on event related fields

Introduction

The objective of this tutorial is to give an introduction to the statistical analysis of event-related EEG and MEG data (denoted as MEEG data in the following) by means of cluster-based permutation tests. The tutorial starts with a long background section that sketches the background of permutation tests. Subsequently it is shown how to use FieldTrip to perform cluster-based permutation tests on actual axial and planar event-related fields in a between-trials and in a within-subjects design.

In this tutorial we will continue working on the dataset described in the preprocessing tutorials. Below we will repeat code to select the trials and preprocess the data as described in the first tutorials (trigger based trial selection, artifact rejection, event related averaging and planar gradient). We assume that the preprocessing and averaging steps of the analysis are already clear for the reader.

This tutorial is not covering statistical test on time-frequency representations. If you are interested in that, you can read the Cluster-based permutation tests on time-frequency data tutorial. If you are interested how parametric statistical tests can be used with FieldTrip, you can read the Parametric and non-parametric statistics on event-related fields tutorial.

Background

The multiple comparisons problem

In the statistical analysis of MEEG-data we have to deal with the multiple comparisons problem (MCP). This problem originates from the fact that MEEG-data are multidimensional. MEEG-data have a spatiotemporal structure: the signal is sampled at multiple channels and multiple time points (as determined by the sampling frequency). The MCP arises from the fact that the effect of interest (i.e., a difference between experimental conditions) is evaluated at an extremely large number of (channel,time)-pairs. This number is usually in the order of several thousands. Now, the MCP involves that, due to the large number of statistical comparisons (one per (channel,time)-pair), it is not possible to control the so called family-wise error rate (FWER) by means of the standard statistical procedures that operate at the level of single (channel,time)-pairs. The FWER is the probability under the hypothesis of no effect of falsely concluding that there is a difference between the experimental conditions at one or more (channel,time)-pairs. A solution of the MCP requires a procedure that controls the FWER at some critical alpha-level (typically, 0.05 or 0.01). The FWER is also called the false alarm rate.

In this tutorial, we describe nonparametric statistical testing of MEEG-data. Contrary to the familiar parametric statistical framework, it is straightforward to solve the MCP in the nonparametric framework. Fieldtrip contains a set of statistical functions that provide a solution for the MCP. These are the functions ft_timelockstatistics and ft_freqstatistics which compute significance probabilities using a non-parametric permutation tests. Besides nonparametric statistical tests, ft_timelockstatistics and ft_freqstatistics can also perform parametric statistical tests (see the Parametric and non-parametric statistics on event related fields tutorial). However, in this tutorial the focus will be on non-parametric testing.

Structure in experiment and data

Different types of data

MEEG-data have a spatiotemporal structure: the signal is sampled at multiple channels and multiple time points. Thus, MEEG-data are two-dimensional: channels (the spatial dimension) X time-points (the temporal dimension). In the following, these (channel,time)-pairs will be called samples. In some studies, these two-dimensional data are reduced to one-dimensional data by averaging over either a selected set of channels or a selected set of time-points.

In many studies, there is an interest in the modulation of oscillatory brain activity. Oscillatory brain activity is measured by power estimates obtained from Fourier or wavelet analyses of single-trial spatiotemporal data. Very often, power estimates are calculated for multiple data segments obtained by sliding a short time window over the complete data segment. In this way, spatio-spectral-temporal data are obtained from the raw spatiotemporal data. The spectral dimension consists of the different frequency bins for which the power is calculated. In the following, these (channel,frequency,time)-triplets will be called samples. In some studies, these three-dimensional data are reduced to one- or two-dimensional data by averaging over a selected set of channels, a selected set of time points, and/or a selected set of frequencies.

Structure in an experiment

The data are typically collected in different experimental conditions and the experimenter wants to know if there is a significant difference between the data observed in these conditions. To answer this question, statistical tests are used. To understand these statistical tests one has to know how the structure in the experiment affects the structure in the data. Two concepts are crucial for describing the structure in an experiment:

  1. The concept of a unit of observation (UO).
  2. The concept of a between- or a within-UO design.

In MEEG experiments, there are two types of UO: subjects and trials (i.e., epochs of time within a subject). Concerning the concept of a between- or a within-UO design, there are two schemes according to which the UOs can be assigned to the experimental conditions: (1) every UO is assigned to only one of a number of experimental conditions (between UO-design; independent samples), or (2) every UO is assigned to multiple experimental conditions in a particular order (within UO-design; dependent samples). Therefore, in a between-UO design, there is only one experimental condition per UO, and in a within-UO design there are multiple experimental conditions per UO. These two experimental designs require different test statistics.

The two UO-types (subjects and trials) and the two experimental designs (between- and within-UO) can be combined and this results in four different types of experiments.

Computation of the test statistic and its significance probability

The statistical testing in FieldTrip is performed by the functions ft_timelockstatistics and ft_freqstatistics. From a statistical point of view, the calculations performed by these two functions are very similar. These functions differ with respect to the data structures on which they operate: ft_timelockstatistics operates on data structures produced by ft_timelockanalysis and ft_timelockgrandaverage, and ft_freqstatistics operates on data structures produced by ft_freqanalysis, ft_freqgrandaverage, and ft_freqdescriptives.

To correct for multiple comparisons, one has to specify the field cfg.correctm. This field can have the values 'no', 'max', 'cluster', 'bonferoni', 'holms', or 'fdr'. In this tutorial, we only use cfg.correctm = 'cluster', which solves the MCP by calculating a so-called cluster-based test statistic and its significance probability.

The calculation of this cluster-based test statistic

  1. For every sample (a (channel,time)-pair or a (channel,frequency,time)-triplet) the experimental conditions are compared by means of a t-value or some other number that quantifies the effect at this sample. It must be noted that this t-value is not the cluster-based test statistic for which we will calculate the significance probability; it is just an ingredient in the calculation of this cluster-based test statistic. Quantifying the effect at the sample level is possible by means of different measures. These measures are specified in cfg.statistic, which can have the values 'indepsamplesT', 'depsamplesT', 'actvsblT', and many others. We will return to this in the following.
  2. All samples are selected whose t-value is larger than some threshold as specified in cfg.clusteralpha. It must be noted that the value of cfg.clusteralpha does not affect the false alarm rate of the statistical test; it only sets a threshold for considering a sample as a candidate member of some cluster of samples. If cfg.clusteralpha is equal to 0.05, the t-values are thresholded at the 95-th quantile for a one-sided t-test, and at the 2.5-th and the 97.5-th quantiles for a two-sided t-test.
  3. Selected samples are clustered in connected sets on the basis of temporal, spatial and spectral adjacency.
  4. Cluster-level statistics are calculated by taking the sum of the t-values within every cluster.
  5. The maximum of the cluster-level statistics is taken. This step and the previous one (step 4) are controlled by cfg.clusterstatistic, which can have the values 'maxsum', 'maxsize', or 'wcm'. In this tutorial, we will only use 'maxsum', which is the default value.

The result from step 5 is the test statistic by means of which we evaluate the effect of the experimental conditions.

The calculation of the significance probability

If the MCP is solved by choosing for a cluster-based test statistic (i.e., with cfg.correctm = 'cluster'), the significance probability can only be calculated by means of the so-called Monte Carlo method. This is because the reference distribution for a permutation test can be approximated (to any degree of accuracy) by means of the Monte Carlo method. In the configuration, the Monte Carlo method is chosen as follows: cfg.method = 'montecarlo'.

The Monte Carlo significance probability is calculated differently for a within- and between-UO design. We now show how the Monte Carlo significance probability is calculated for in a between-trials experiment (a between-UO design with trials as UO):

  1. Collect the trials of the different experimental conditions in a single set.
  2. Randomly draw as many trials from this combined data set as there were trials in condition 1 and place them into subset 1. The remaining trials are placed in subset 2. The result of this procedure is called a random partition.
  3. Calculate the test statistic (i.e., the maximum of the cluster-level summed t-values) on this random partition.
  4. Repeat steps 2 and 3 a large number of times and construct a histogram of the test statistics. The computation of this Monte Carlo approximation involves a user-specified number of random draws (specified in cfg.numrandomization). The accuracy of the Monte Carlo approximation can be increased by choosing a larger value for cfg.numrandomization.
  5. From the test statistic that was actually observed and the histogram in step 4, calculate the proportion of random partitions that resulted in a larger test statistic than the observed one. This proportion is the Monte Carlo significance probability, which is also called a p-value.

If the p-value is smaller than the critical alpha-level (typically 0.05) then we conclude that the data in the two experimental conditions are significantly different. This p-value can also be calculated for the second largest cluster-level statistic, the third largest, etc. It is common practice to say that a cluster is significant if its p-value is less than the critical alpha-level. This practice suggests that it is possible to do spatio-spectral-temporal localization by means of the cluster-based permutation test. However, from a biophysical perspective, this suggestion is not justified. Consult Maris and Oostenveld, Journal of Neuroscience Methods, 2007 for an elaboration on this topic.

Procedure

In this tutorial we will consider a between-trials experiment, in which we analyze the data of a single subject. The statistical analysis for this experiment we perform both on axial and planar ERFs. The steps we perform are as follows:


Figure 1. Analysis protocol of a between-trials experiment with axial data


Figure 2. Analysis protocol of a between-trials experiment with planar data

Subsequently, we consider a within-subjects experiment, in which we compare differences between (planar gradient) ERFs over all subjects. The steps we perform are as follows:


Figure 3. Analysis protocol of a within-subjects experiment with planar data

Between-trials experiments

In a between-trials experiment, we analyze the data of a single subject. By means of a statistical test, we want to answer the question whether there is a systematic difference in the MEG recorded on trials with a fully congruent and trials with a fully incongruent sentence ending. (This comparison is the MEG-version of the classical comparison that showed the N400-component in EEG-data).

Preprocessing and time-locked analysis

We first extract the trials of the fully incongruent condition

Reading the FIC data

The ft_definetrial and ft_preprocessing functions require the original MEG dataset, which is available from ftp://ftp.fcdonders.nl/pub/fieldtrip/tutorial/Subject01.zip.

% find the interesting segments of data
cfg = [];                                           % empty configuration
cfg.dataset                 = 'Subject01.ds';       % name of CTF dataset  
cfg.trialdef.eventtype      = 'backpanel trigger';
cfg.trialdef.prestim        = 1;
cfg.trialdef.poststim       = 2;
cfg.trialdef.eventvalue     = 3;                    % trigger value for fully incongruent (FIC)
cfg = ft_definetrial(cfg);            

% remove the trials that have artifacts from the trl
cfg.trl([15, 36, 39, 42, 43, 49, 50, 81, 82, 84],:) = []; 

% preprocess the data
cfg.channel    = {'MEG', '-MLP31', '-MLO12'};        % read all MEG channels except MLP31 and MLO12
cfg.demean     = 'yes';
cfg.baselinewindow  = [-0.2 0];
cfg.lpfilter   = 'yes';                              % apply lowpass filter
cfg.lpfreq     = 35;                                 % lowpass at 35 Hz.

dataFIC_LP = ft_preprocessing(cfg);                      

These data have been cleaned from artifacts by removing several trials and two sensors; see the visual artifact rejection tutorial.

Subsequently you can save the data to disk.

save dataFIC_LP dataFIC_LP

Then we also extract the trails of the fully congruent condition.

Reading the FC data

The ft_definetrial and ft_preprocessing functions require the original MEG dataset, which is available from ftp://ftp.fcdonders.nl/pub/fieldtrip/tutorial/Subject01.zip.

% find the interesting segments of data
cfg = [];                                           % empty configuration
cfg.dataset                 = 'Subject01.ds';       % name of CTF dataset  
cfg.trialdef.eventtype      = 'backpanel trigger';
cfg.trialdef.prestim        = 1;
cfg.trialdef.poststim       = 2;
cfg.trialdef.eventvalue     = 9;                    % trigger value for fully incongruent (FC)
cfg = ft_definetrial(cfg);            

% remove the trials that have artifacts from the trl
cfg.trl([2, 3, 4, 30, 39, 40, 41, 45, 46, 47, 51, 53, 59, 77, 85],:) = []; 

% preprocess the data
cfg.channel    = {'MEG', '-MLP31', '-MLO12'};       % read all MEG channels except MLP31 and MLO12
cfg.demean     = 'yes';
cfg.baselinewindow  = [-0.2 0];
cfg.lpfilter   = 'yes';                              % apply lowpass filter
cfg.lpfreq     = 35;                                 % lowpass at 35 Hz.

dataFC_LP = ft_preprocessing(cfg);                      

These data have been cleaned from artifacts by removing several trials and two sensors; ; see the visual artifact rejection tutorial.

Subsequently you can save the data to disk.

save dataFC_LP dataFC_LP

The format of the preprocessed data is inconvenient for statistical analyses. A more convenient format is produced by ft_timelockanalysis with the option cfg.keeptrials = 'yes'. The output of ft_timelockanalysis contains an .avg field with the average event-related field, and a .trial field with the individual trial data. The output is stored in timelockFIC and timelockFC for the fully incongruent and the fully congruent condition.

Ft_timelockanalysis requires preprocessed data (see above), which is available from ftp://ftp.fcdonders.nl/pub/fieldtrip/tutorial/cluster_permutation_timelock/dataFIC_LP.mat and dataFC_LP.mat.

load dataFIC_LP
load dataFC_LP

cfg = [];
cfg.keeptrials = 'yes';
timelockFIC = ft_timelockanalysis(cfg, dataFIC_LP);
timelockFC  = ft_timelockanalysis(cfg, dataFC_LP);

Permutation test

Cluster-level permutation tests for event-related fields are performed by the function ft_timelockstatistics. This function takes as its arguments a configuration (cfg) and two or more data structures. These data structures must be produced by ft_timelockanalysis or ft_timelockgrandaverage, which all operate on preprocessed data. The argument list of ft_timelockstatistics must contain one data structure for every experimental condition. For comparing the data structures timelockFIC and timelockFC, you must call ft_timelockstatistics as follows: [stat] = ft_timelockstatistics(cfg, timelockFIC, timelockFC);

The configuration settings

Some fields of the configuration (cfg), such as channel and latency, are not specific for ft_timelockstatistics; their role is similar in other FieldTrip functions. We first concentrate on the fields that are specific for ft_timelockstatistics.

cfg = [];
cfg.method = 'montecarlo';       % use the Monte Carlo Method to calculate the significance probability
cfg.statistic = 'indepsamplesT'; % use the independent samples T-statistic as a measure to evaluate 
                                 % the effect at the sample level
cfg.correctm = 'cluster';
cfg.clusteralpha = 0.05;         % alpha level of the sample-specific test statistic that will be used for thresholding
cfg.clusterstatistic = 'maxsum'; % test statistic that will be evaluated under the permutation distribution. 
cfg.minnbchan = 2;               % minimum number of neighborhood channels that is required for a selected 
cfg.neighbours = neighbours;
                                 % sample to be included in the clustering algorithm (default=0).
cfg.tail = 0;                    % -1, 1 or 0 (default = 0); one-sided or two-sided test
cfg.clustertail = 0;
cfg.alpha = 0.025;               % alpha level of the permutation test
cfg.numrandomization = 100;      % number of draws from the permutation distribution

design = zeros(1,size(timelockFIC.trial,1) + size(timelockFC.trial,1));
design(1,1:size(timelockFIC.trial,1)) = 1;
design(1,(size(timelockFIC.trial,1)+1):(size(timelockFIC.trial,1) + size(timelockFC.trial,1)))= 2;

cfg.design = design;             % design matrix
cfg.ivar  = 1;                   % number or list with indices, independent variable(s)

We now describe these options one-by-one.

We now briefly discuss the configuration fields that are not specific for ft_timelockstatistics:

cfg.channel = {'MEG'};          % cell-array with selected channel labels
cfg.latency = [0 1];            % time interval over which the experimental 
                                % conditions must be compared (in seconds)
cfg_neighb.method = 'distance';         
cfg.neighbours = ...            % specifies with which sensors other sensors
ft_prepare_neighbours(...       % can form clusters
cfg_neighb, dataFC_LP);

With these two options, we select the spatio-temporal dataset involving all MEG channels and the time interval between 0 and 1 second. The two experimental conditions will only be compared on this selection of the complete spatio-temporal dataset. Also, feel free to consult our frequenly asked questions about prepare_neighbours

One should be aware of the fact that the sensitivity of ft_timelockstatistics (i.e., the probability of detecting an effect) depends on the length of the time interval that is analyzed, as specified in cfg.latency. For instance, assume that the difference between the two experimental conditions extends over a short time interval only (e.g., between 0.3 and 0.4 sec.). If it is known in advance that this short time interval is the only interval where an effect is likely to occur, then one should limit the analysis to this time interval (i.e., choose cfg.latency = [0.3 0.4]). Choosing a time interval on the basis of prior information about the time course of the effect will increase the sensitivity of the statistical test. If there is no prior information, then one must compare the experimental conditions over the complete time interval. This is accomplished by choosing cfg.latency = 'all'. In this tutorial, we choose cfg.latency = [0 1] because EEG-studies have shown that the strongest effect of semantic incongruity is observed in the first second after stimulus presentation.

Now, run ft_timelockstatistics to compare timelockFIC and timelockFC using the configuration described above.

[stat] = ft_timelockstatistics(cfg, timelockFIC, timelockFC);

Save the output to disk:

save stat_ERF_axial_FICvsFC stat;

The format of the output

The output can also be obtained from stat_ERF_axial_FICvsFC.mat. If you need to reload the statistics output, use:

load stat_ERF_axial_FICvsFC

The output of ft_timelockstatistics has separate fields for positive and negative clusters. For the positive clusters, the output is given in the following pair of fields: stat.posclusters and stat.posclusterslabelmat. The field stat.posclusters is an array that provides the following information for every cluster:

The elements in the array stat.posclusters are sorted according to their p-value: the cluster with the smallest p-value comes first, followed by the cluster with the second-smallest, etc. Thus, if the k-th cluster has a p-value that is larger than the critical alpha-level (e.g., 0.025), then so does the (k+1)-th. Type stat.posclusters(k) on the Matlab command line to see the information for the k-th cluster.

The field stat.posclusterslabelmat is a spatiotemporal matrix. This matrix contains numbers that identify the clusters to which the (channel,time)-pairs (the samples) belong. For example, all (channel,time)-pairs that belong to the third cluster, are identified by the number 3. As will be shown in the following, this information can be used to visualize the topography of the clusters.

For the negative clusters, the output is given in the following pair of fields: stat.negclusters and stat.negclusterslabelmat. These fields contain the same type of information as stat.posclusters and stat.posclusterslabelmat, but now for the negative clusters.

By inspecting stat.posclusters and stat.negclusters, it can be seen that only the first positive and the first negative cluster have a p-value less than the critical alpha-level of 0.025. This critical alpha-level corresponds to a false alarm rate of 0.05 in a two-sided test. By typing stat.posclusters(1) on the Matlab command line, you should obtain the following:

stat.posclusters(1)
ans = 
         prob: 0
  clusterstat: 8.2721e+03

And by typing stat.negclusters(1), you should obtain the following:

stat.negclusters(1)
ans = 
         prob: 0
  clusterstat: -1.0251e+04

It is possible that the p-values in your output are a little bit different from 0. This is because ft_timelockstatistics calculated as a Monte Carlo approximation of the permutation p-values: the p-value for the k-th positive cluster is calculated as the proportion of random draws from the permutation distribution in which the maximum of the cluster-level statistics is more larger than stat.posclusters(k).clusterstat.

Plotting the results

To plot the results of the permutation test, we use the plotting function ft_topoplotER. In doing so, we will plot a topography of the difference between the two experimental conditions (FC and FIC). Atop that, and for each timestep of interest, we'll highlight the sensors which are members of significant clusters. First, however, we must calculate the difference between conditions.

cfg = [];
cfg.keeptrials = 'no';  % now keep only the subject-wise average, not the single trials
avgFIC = ft_timelockanalysis(cfg, dataFIC_LP);
avgFC  = ft_timelockanalysis(cfg, dataFC_LP);

% Copy the entire timelockanalysis structure to preserve all
% the information it holds in addition to ERP averages.
raweffectFICvsFC     = avgFIC;
% Then take the difference of the averages.
raweffectFICvsFC.avg = avgFIC.avg - avgFC.avg;  

We then construct a boolean matrix indicating membership in the significant clusters. This matrix has size [Number_of_MEEG_channels × Number_of_temporal_samples], like stat.posclusterslabelmat. We'll make two such matrices: one for positive clusters (named pos), and one for negative (neg). All (channel,time)-pairs belonging to the significant clusters will be coded in the new boolean matrix as 1, and all those that don't will be coded as 0.

In plotting significant clusters, we must of course first determine which clusters are reliable.

% Make a vector of all p-values associated with the clusters from ft_timelockstatistics.
pos_cluster_pvals = [stat.posclusters(:).prob];
% Then, find which clusters are significant, outputting their indices as held in stat.posclusters
pos_signif_clust = find(pos_cluster_pvals < stat.cfg.alpha);
% (stat.cfg.alpha is the alpha level we specified earlier for cluster comparisons; In this case, 0.025)
% make a boolean matrix of which (channel,time)-pairs are part of a significant cluster
pos = ismember(stat.posclusterslabelmat, pos_signif_clust);

% and now for the negative clusters...
neg_cluster_pvals = [stat.negclusters(:).prob];
neg_signif_clust = find(neg_cluster_pvals < stat.cfg.alpha);
neg = ismember(stat.negclusterslabelmat, neg_signif_clust);

Alternatively, we can manually select which clusters we want to plot. If we only want to see the extext of the first (i.e. most significant) positive and negative clusters, for instance, we can do so as follows:

pos = stat.posclusterslabelmat == 1;	% or == 2, or 3, etc.
neg = stat.negclusterslabelmat == 1;

To plot a sequence of twenty topographic plots equally spaced between 0 and 1 second, we define the vector j of time steps. These time intervals correspond to the samples m in stat and in the variables pos and neg. m and j must, therefore, have the same length.

To be sure that your sample-based time windows align with your time windows in seconds, check the following:

timestep = 0.05;		% timestep between time windows for each subplot (in seconds)
sampling_rate = dataFC_LP.fsample;	% Data has a temporal resolution of 300 Hz
sample_count = length(stat.time);
					% number of temporal samples in the statistics object
j = [0:timestep:1];   % Temporal endpoints (in seconds) of the ERP average computed in each subplot
m = [1:timestep*sampling_rate:sample_count];  % temporal endpoints in MEEG samples

To plot the data use the following for-loop:

for k = 1:20;
     subplot(4,5,k);
     cfg = [];   
     cfg.xlim=[j(k) j(k+1)];   % time interval of the subplot
     cfg.zlim = [-2.5e-13 2.5e-13];
   % If a channel reaches this significance, then
   % the element of pos_int with an index equal to that channel
   % number will be set to 1 (otherwise 0).
   
   % Next, check which channels are significant over the
   % entire time interval of interest.
     pos_int = all(pos(:, m(k):m(k+1)), 2);
     neg_int = all(neg(:, m(k):m(k+1)), 2);

     cfg.highlight = 'on';
   % Get the index of each significant channel
     cfg.highlightchannel = find(pos_int | neg_int);
     cfg.comment = 'xlim';   
     cfg.commentpos = 'title';   
     cfg.layout = 'CTF151.lay';
     ft_topoplotER(cfg, raweffectFICvsFC);   
end

In this for-loop, cfg.xlim defines the time interval of each subplot. The variables pos_int and neg_int boolean vectors indicating which channels of pos and neg are significant in the time interval of interest. This is defined in cfg.highlight. The for-loop plots 20 subplots covering a time interval of 50 ms each. Running this for-loop creates the following figure:

Figure 4: Raw effect (FIC-FC) on the ERFs of subject 1, significant clusters are highlighted.

Using planar gradient data

To perform the permutation test using synthetic planar gradient data, the data must first be converted using the functions ft_megplanar and ft_combineplanar. These functions were described in the tutorial on event-related fields. After running these functions, the statistical analysis using ft_timelockstatistics involves the same configuration options as for ordinary event-related averages (no synthetic planar gradients). There is only one additional step, which is needed to add the gradiometer structure to one of the planar gradient data sets.

cfg = [];
cfg.planarmethod = 'sincos';
timelockFIC_planar = ft_megplanar(cfg, timelockFIC);
timelockFC_planar  = ft_megplanar(cfg, timelockFC);
       
timelockFIC_planar_cmb = ft_combineplanar(cfg, timelockFIC_planar);
timelockFC_planar_cmb  = ft_combineplanar(cfg, timelockFC_planar);
  
timelockFIC_planar_cmb.grad = timelockFIC.grad;  % add the gradiometer structure
timelockFC_planar_cmb.grad  = timelockFC.grad;
     

Having calculated synthetic planar gradient data, one can use the same configuration parameters as used for the analysis of the original data.

cfg = [];
cfg.channel = {'MEG'};
cfg.latency = [0 1];
 
cfg.method = 'montecarlo';
cfg.statistic = 'indepsamplesT';
cfg.correctm = 'cluster';
cfg.clusteralpha = 0.05;
cfg.clusterstatistic = 'maxsum';
cfg.minnbchan = 2;
cfg.tail = 0;
cfg.clustertail = 0;
cfg.alpha = 0.025;
cfg.numrandomization = 100;

design = zeros(1,size(timelockFIC_planar_cmb.trial,1) + size(timelockFC_planar_cmb.trial,1));
design(1,1:size(timelockFIC_planar_cmb.trial,1)) = 1;
design(1,(size(timelockFIC_planar_cmb.trial,1)+1):(size(timelockFIC_planar_cmb.trial,1) + size(timelockFC_planar_cmb.trial,1)))= 2;

cfg.design = design;
cfg.ivar = 1;

[stat] = ft_timelockstatistics(cfg, timelockFIC_planar_cmb, timelockFC_planar_cmb)

save stat_ERF_planar_FICvsFC stat

The output can also be obtained from ftp://ftp.fcdonders.nl/pub/fieldtrip/tutorial/cluster_permutation_timelock/stat_ERF_planar_FICvsFC.mat. If you need to reload the statistics output, use:

load stat_ERF_planar_FICvsFC

We now calculate the raw effect in the average with planar gradient data using the following configuration:

cfg = [];
cfg.keeptrials = 'no';   % now only the average, not the single trials
avgFIC_planar = ft_timelockanalysis(cfg, timelockFIC_planar);
avgFC_planar  = ft_timelockanalysis(cfg, timelockFC_planar);
cfg = [];
avgFIC_planar_cmb = ft_combineplanar(cfg, avgFIC_planar);
avgFC_planar_cmb  = ft_combineplanar(cfg, avgFC_planar);

% make a copy of the structure to contain the raw effect and all other information
% and subtract the averages from each other
raweffectFICvsFC     = avgFIC_planar_cmb;
raweffectFICvsFC.avg = avgFIC_planar_cmb.avg - avgFC_planar_cmb.avg;  

Using the following configuration for ft_topoplotER we can plot the raw effect and highlight the channels belonging to the significant clusters:

figure;  
timestep = 0.05;		%(in seconds)
sampling_rate = dataFC_LP.fsample;
sample_count = length(stat.time);
j = [0:timestep:1];   % Temporal endpoints (in seconds) of the ERP average computed in each subplot
m = [1:timestep*sampling_rate:sample_count];  % temporal endpoints in MEEG samples

pos_cluster_pvals = [stat.posclusters(:).prob];
pos_signif_clust = find(pos_cluster_pvals < stat.cfg.alpha);
pos = ismember(stat.posclusterslabelmat, pos_signif_clust);

% Remember to do the same for negative clusters if you want them!

for k = 1:20;
     subplot(4,5,k);   
     cfg = [];
     cfg.xlim =[j(k) j(k+1)];
     cfg.zlim = [-1.0e-13 1.0e-13];   
     pos_int = all(pos(:, m(k):m(k+1)), 2);
     cfg.highlight = 'on';
     cfg.highlightchannel = find(pos_int);
     cfg.comment = 'xlim';
     cfg.commentpos = 'title';
     cfg.layout = 'CTF151.lay';
     ft_topoplotER(cfg, raweffectFICvsFC);
end

Figure 5: Raw effect (FIC-FC) on the planar gradient ERFs of subject 1, the significant clusters are highlighted.

Within-subjects experiments

We now consider experiments involving multiple subjects that are each observed in multiple experimental conditions. Typically, every subject is observed in a large number of trials, each one belonging to one experimental condition. Usually, for every subject, averages are computed over all trials belonging to each of the experimental conditions. Thus, for every subject, the data are summarized in an array of condition-specific averages. The permutation test that is described in this section informs us about the following null hypothesis: the probability distribution of the condition-specific averages is independent of the experimental conditions.

Reading-in, preprocessing, timelockanalysis, planar gradient, and grandaveraging

We now describe how we can statistically test the difference between the event-related averages for fully incongruent (FIC) and the fully congruent (FC) sentence endings. For this analysis we use planar gradient data. For convenience we will not do the reading-in and preprocessing steps on all subjects. Instead we begin by loading the timelock structures containing the event-related averages (of the planar gradient data) of all ten subjects. The data is available from the FieldTrip FTP server (GA_ERF_orig.mat).

load GA_ERF_orig;

The event-related averages for the fully incongruent and the fully congruent sentence endings are stored in GA_FIC and GA_FC. These averages were calculated using the function ft_timelockgrandaverage. In the configuration for ft_timelockgrandaverage, the option cfg.keepindividual = ‘yes’ has to be chosen. (This has been done in the example data.) If this option is not used, then only the average ERF of all subjects will be calculated.

Permutation test

We now perform the permutation test using ft_timelockstatistics. The configuration settings for this analysis differ from the previous settings in several fields:

  1. We have to select a different measure to evaluate the effect at sample level (in cfg.statistic)
  2. The design matrix is different (i.c., it now contains two lines instead of one)
  3. The so-called unit variable has to be defined.

The configuration looks as follows:

cfg = [];
cfg.channel = {'MEG'};
cfg.latency = [0 1];

cfg.method = 'montecarlo';
cfg.statistic = 'depsamplesT';
cfg.correctm = 'cluster';
cfg.clusteralpha = 0.05;
cfg.clusterstatistic = 'maxsum';
cfg.minnbchan = 2;
cfg.neighbours = neighbours;
cfg.tail = 0;
cfg.clustertail = 0;
cfg.alpha = 0.025;
cfg.numrandomization = 500;

subj = 10;
design = zeros(2,2*subj);
for i = 1:subj
  design(1,i) = i;
end
for i = 1:subj
  design(1,subj+i) = i;
end
design(2,1:subj)        = 1;
design(2,subj+1:2*subj) = 2;

cfg.design = design;
cfg.uvar  = 1;
cfg.ivar  = 2;

We now describe the differences between this configuration and the configuration for a between-trials experiment.

Now, use the configuration above to perform the following statistical analysis:

[stat] = ft_timelockstatistics(cfg, GA_FIC, GA_FC)

save stat_ERF_planar_FICvsFC_GA stat

The output can also be obtained from ftp://ftp.fcdonders.nl/pub/fieldtrip/tutorial/cluster_permutation_timelock/stat_ERF_planar_FICvsFC_GA.mat. If you need to reload the statistics output, use:

load stat_ERF_planar_FICvsFC_GA

From inspection of stat.posclusters and stat.negclusters, we observe that there is only one significant positive cluster and no significant negative cluster.

Plotting the results

To plot the results:

GA_FICvsFC = GA_FIC;
GA_FICvsFC.avg = GA_FIC.avg - GA_FC.avg;

figure;  
timestep = 0.05;      %(in seconds)
sampling_rate = dataFC_LP.fsample;
sample_count = length(stat.time);
j = [0:timestep:1];   % Temporal endpoints (in seconds) of the ERP average computed in each subplot
m = [1:timestep*sampling_rate:sample_count];  % temporal endpoints in MEEG samples

pos_cluster_pvals = [stat.posclusters(:).prob];
pos_signif_clust = find(pos_cluster_pvals < stat.cfg.alpha);
pos = ismember(stat.posclusterslabelmat, pos_signif_clust);

for k = 1:20;
     subplot(4,5,k);   
     cfg = [];   
     cfg.xlim=[j(k) j(k+1)];   
     cfg.zlim = [-1.0e-13 1.0e-13];   
     pos_int = all(pos(:, m(k):m(k+1)), 2);
     cfg.highlight = 'on';
     cfg.highlightchannel = find(pos_int);       
     cfg.comment = 'xlim';   
     cfg.commentpos = 'title';   
     ft_topoplotER(cfg, GA_FICvsFC);
end  

Figure 6: Raw effect (FIC-FC) on the grand average planar gradient ERFs the significant cluster is highlighted

Summary and suggested further readings

In this tutorial, it was shown how to do non-parametric statistics on axial and planar ERFs in a between-trials and in within-subjects design. It was also shown how to plot the results.

If you are interested in parametric tests in FieldTrip, you can read the Parametric and non-parametric statistics on event-related fields tutorial. If you are interested in how to do the same statistics on time-frequency representations, you can read the Cluster-based permutation tests on time-frequency data tutorial.

If you would like to read more about issues related to statistical analysis, you can read the following FAQ's as well:
Why should I use the cfg.correcttail option when using statistics_montecarlo?
What is the idea behind statistical inference at the second-level?

If you would like to read about neighborhood selection, you can read the following FAQ's:
How can I define my own neighbourhood templates or updating an already existing template?
How can I define neighbouring sensors?
How does ft_prepare_neighbours work?

Or you can look also at the following example scripts:
Apply clusterrandanalysis on TFRs of power that were computed with BESA
Source statistics
Stratify the distribution of one variable that differs in two conditions
Using the MATLAB statistics toolbox


This tutorial was last tested by Jan-Mathijs, with version 20110409 of FieldTrip, using Matlab 2009b on a 64-bit Linux platform.