FieldTrip Multivariate Module

The FieldTrip Multivariate Module (FMM) is a generic machine learning toolbox with additional support for the analysis of neuroimaging datasets and is written in object-oriented Matlab. It is especially suitable for applications in Brain-Computer Interfacing (BCI) and multivariate pattern analysis (MVPA). Contrary to other FieldTrip modules, it makes use of object-orientation and requires Matlab version >= 7.6.0 (R2008a). The module offers support for preprocessing, feature selection, classification, regression and reconstruction.

FieldTrip also offers a more high-level way of calling the module using the statistics_crossvalidate function (i.e., it automatically performs cross-validation for some specified multivariate analysis procedure). Please consult the multivariateanalysis tutorial for this.

Module structure

The organization of the FMM folders is as follows:

  • base: base classes that should not be used directly
  • preprocessors: preprocessing classes
  • selectors: feature selection classes
  • predictors: prediction classes
  • timeseries: timeseries analysis classes
  • meta: meta analysis classes
  • statistics: crossvalidation, performance measures, significance tests, example data
  • external: external toolboxes used by wrapper classes

The basic philosophy of FMM is that new methods can be added as toolboxes which are called through generic wrapper code. This makes it easy to extend the toolbox to fit your own needs. The currently used external toolboxes are listed here:

General overview

Different multivariate methods can be concatenated together in a processing pipeline, constituting a multivariate analysis ft_mv_analysis, or MVA for short. Finally, an ft_mv_analysis object can either be used standalone or be evaluated on data using a ft_mv_crossvalidator object. This automates the whole process of analyzing (neuroimaging) data.

In FMM, a dataset is given by a multidimensional array, where the first dimension indexes datapoints and the remaining dimensions are arbitrarily chosen, e.g., size(X) = trials x channels x frequencies. Internally, the module represents this data as trials x features matrix. Missing values are represented using NaN.

An exception to this rule holds for ft_mv_timeseries objects, where the last dimension must index time, e.g., size(X) = trials x channels x frequencies x timepoints. Internally the module represents this data as a trials x features x timepoints matrix.

Multiple datasets are represented as a cell-array of multidimensional arrays. E.g., X = {X1 X2} and Y = {Y1 Y2} represents two input datasets and two output datasets.

A multivariate analysis is a Matlab object which can be constructed as follows:

obj = ft_mv_analysis({method1,...,methodN})

where the argument is a cell-array of multivariate methods. By calling

obj = obj.train(X,Y)

we can train our MVA on input X and output Y. Finally, we can test on new data U by calling

V = obj.test(U)

such that V contains the predictions. The MVA object will sequentially call each ft_mv_method in the pipeline where output of the previous method will be input to the next method. The method obj.name will return the MVA as a string and the method obj.model will return the associated model, in this case the model of the last method in the pipeline (more on this later).

Multivariate methods are organized into different subclasses according to the way they transform input data X to output data Y. Currently supported methods are preprocessors, which preprocess the data, selectors, which perform some kind of feature selection, predictors, which make predictions for new data, and timeseries methods, which perform timeseries analysis on the data. Many methods handle missing data, multiple outputs, multiple datasets and online (sample by sample) learning. If a method does not handle any of these situations then it will throw an error. Handling of some of these situations can be done by invoking a wrapper function.

All ft_mv_method classes are structured as follows (here we show an example of a predictor):

classdef mymethod < ft_mv_predictor

  properties 
  % insert class-specific properties here 
  
  end

  methods

  function obj = mymethod(varargin) 
  % constructor
    obj = obj@ft_mv_predictor(varargin{:});
  end

  function obj = train(obj,X,Y) 
  % estimate mvmethod parameters
  
  end

  function Y = test(obj,X) 
  % compute output
  
  end

  function X = invert(obj,Y) 
  % optional function which inverts the mapping performed by this class
  
  end

  function [m,d] = model(obj) 
  % returns the model parameters as a cell-array m 
  % desc contains a description of each element of m
  
  end
  

Note that a predictor can perform classification tasks, regression tasks or represent multiple-input multiple-output (MIMO) models. In case of K-class classification, it is assumed that the output vector Y consists of labels 1 : K. For example, Y = [2 1 1 2 1 2]′ assigns examples 2, 3 and 5 to class 1 and the remaining examples to class 2. In case of regression, Y just consists of the observed real output per example. In case of MIMO models , the same behaviour holds, but Y consists of multiple columns where each column stands for a separate output.

In the following subsection we list the currently supported objects. More complete descriptions can be found in the help of the respective methods.

Base classes

Base classes should not be called directly but rather form the core of the module.

ft_mv_analysis

  • Status: stable
  • Dependencies: none
  • Contributors: Marcel van Gerven

A multivariate analysis is just a sequence of multivariate methods {method1 method2 method3 …} that are called in this order and where the output of the previous method acts as input to the next method. Example:

>> f = ft_mv_analysis({ft_mv_standardizer ft_mv_svm})

f = 

  ft_mv_analysis

  Properties:
    mvmethods: {[1x1 ft_mv_standardizer]  [1x1 ft_mv_svm]}

  Methods

>> f.name

ans =

{ ft_mv_standardizer ft_mv_svm }

ft_mv_method

  • Status: stable
  • Dependencies: none
  • Contributors: Marcel van Gerven

Abstract class for multivariate methods. Introduces the verbose property to allow methods to show or hide output.

ft_mv_ndata

  • Status: stable
  • Dependencies: none
  • Contributors: Marcel van Gerven

Helper class to handle multiple datasets in a uniform manner.

ft_mv_noutput

  • Status: stable
  • Dependencies: none
  • Contributors: Marcel van Gerven

Helper class to handle multiple outputs in a uniform manner.

Preprocessors

Preprocessors are used to preprocess or filter the input data X in such a way that it will be more suitable for making predictions.

ft_mv_preprocessor

  • Status: stable
  • Dependencies: none
  • Contributors: Marcel van Gerven

Abstract base class from which other preprocessors are derived. Should not be called directly.

ft_mv_standardizer

  • Status: stable
  • Dependencies: none
  • Contributors: Marcel van Gerven

Centers and scales the data such that it has mean zero and a standard deviation of one.

ft_mv_whitener

  • Status: unstable
  • Dependencies: external/farquhar toolbox
  • Contributors: Jason Farquhar

Whitens the input data.

ft_mv_csp

  • Status: unstable
  • Dependencies: external/herman toolbox
  • Contributors: Pawel Herman

Common spatial pattern algorithm.

ft_mv_pca

  • Status: stable
  • Dependencies: statistics toolbox, external/pca toolbox
  • Contributors: Marcel van Gerven

Principal component analysis. Uses parameter proportion. It defines the proportion of pc's or number of pc's. If < 1 then it is interpreted as a proportion of accounted variance. Otherwise it is treated as the absolute number of pc's. If empty then all components are used (default = 0.80).

Selectors

Selectors are used to perform feature selection on the input data X.

ft_mv_selector

  • Status: stable
  • Dependencies: none
  • Contributors: Marcel van Gerven

The base class from which other selectors are derived. It defines properties validator and subset. The validator is used to select the optimal feature subset using an inner cross-validation. The subset defines the indices of the original features which are selected by the algorithm.

ft_mv_filterer

  • Status: unstable
  • Dependencies: none
  • Contributors: Marcel van Gerven

Performs feature selection by computing a univariate measure for each feature and taking the best N features based on a crossvalidator.

ft_mv_searchlight

  • Status: stable
  • Dependencies: none
  • Contributors: Marcel van Gerven

Performs a searchlight analysis. Suppose our features are given by voxels in multiple fMRI volumes. The searchlight moves a sphere through this volume and computes decoding performance per sphere. Below, an example is shown using a sphere of radius 3 and a step size of 3. Both radius and step size are measured in number of features (e.g. voxels). In order to define the structure of our search space, we use indims to specify the dimensions of the input data (e.g. [110 110 37] for our fMRI volumes). The used decoding procedure is by default a 5-fold cross-validation using a support vector machine but this behaviour can be overridden.

sl = ft_mv_searchlight('radius',3,'step',3,'indims',dims);
sl = sl.train(X,Y);

estimating 39 spheres with radius 3 with 3 steps for a
volume of size [110 110	37] 
estimating sphere 1 of 39 
estimating sphere 2 of 39 
estimating sphere 3 of 39 
estimating sphere 4 of 39 ... 
average sphere volume: 52.2308 
performance for sphere 1 of 39: 0.736842 (p-value: 0.342782) 
performance for sphere 2 of 39: 0.842105 (p-value: 0.148915) 
performance for sphere 3 of 39: 0.789474 (p-value: 0.267257) 
performance for sphere 4 of 39: 0.684211 (p-value: 0.546494) 
...

Furthermore, we could use a mask to specify our volume (e.g. a grey matter mask in case of fMRI data) or specify an irregular neighbourhood structure. Please see the function documentation for further details.

Predictors

Predictors are used to predict new outputs Y based on input data X. Some predictors expect discrete outputs Y and others expect continuous outputs Y. Some methods are suitable for predicting multiple outputs. Methods which are not directly suitable for multiple outputs will process each output independently and combine the results.

ft_mv_predictor

  • Status: stable
  • Dependencies: none
  • Contributors: Marcel van Gerven

Abstract class from which other predictors are derived. Should not be called directly.

ft_mv_blogreg

  • Status: stable
  • Dependencies: none
  • Contributors: Marcel van Gerven, Botond Cseke, Tom Heskes

Bayesian logistic regression with a multivariate Laplace prior as described here.

ft_mv_naive

  • Status: stable
  • Dependencies: none
  • Contributors: Marcel van Gerven

Naive Bayes classifier with normally distributed feature values.

ft_mv_kernelmethod

  • Status: stable
  • Dependencies: external/farquhar toolbox
  • Contributors: Marcel van Gerven

Abstract class. Superclass of the following kernel methods.

ft_mv_svm
  • Status: stable
  • Dependencies: external/farquhar toolbox
  • Contributors: Jason Farquhar

Support vector machine.

ft_mv_klr
  • Status: stable
  • Dependencies: external/farquhar toolbox
  • Contributors: Jason Farquhar

Kernel logistic regression.

ft_mv_rkls
  • Status: stable
  • Dependencies: external/farquhar toolbox
  • Contributors: Jason Farquhar

Regularized kernel least squares.

ft_mv_rfda

  • Status: stable
  • Dependencies: external/herman
  • Contributors: Pawel Herman

Regularized Fisher discriminant analysis.

ft_mv_glmnet

  • Status: stable
  • Dependencies: external/glmnet, external/murphy
  • Contributors: Marcel van Gerven

Fast implementation of the elastic net algorithm as described here. This method can be used to perform regularized linear or logistic regression:

unregularized linear regression

ft_mv_glmnet('lambda',0,'family','gaussian')

unregularized logistic regression

ft_mv_glmnet('lambda',0,'family','binomial')

L2 regularized logistic regression

ft_mv_glmnet('alpha',0,'lambda',1,'family','binomial')

L1 regularized logistic regression

ft_mv_glmnet('alpha',1,'lambda',1,'family','binomial')

elastic net linear regression

ft_mv_glmnet('alpha',0.5,'lambda',1,'family','binomial')

Instead of setting the regularization parameter lambda manually, we can also optimize it using an inner cross validation as follows:

ft_mv_glmnet('validator',ft_mv_crossvalidator('nfolds',5),'family','gaussian') %  5-fold inner cross validation

ft_mv_rls

  • Status: stable
  • Dependencies: none
  • Contributors: Stefan Klanke

Recursive least squares.

ft_mv_sopls

  • Status: stable
  • Dependencies: external/glmnet
  • Contributors: Marcel van Gerven

Sparse orthonormalized partial least squares algorithm.

Time series analysis

Timeseries methods are used to predict new outputs Y based on input data X which should be of size repetitions x features x timepoints.

ft_mv_timeseries

Status: unstable

  • Dependencies: none
  • Contributors: Marcel van Gerven

Abstract class from which all timeseries objects are derived.

ft_mv_hmm

  • Status: unstable
  • Dependencies: external/murphy
  • Contributors: Marcel van Gerven

Hidden Markov model

ft_mv_lds

  • Status: unstable
  • Dependencies: none
  • Contributors: Ali Bahramisharif, Marcel van Gerven

Linear dynamical system

Meta analysis

Meta analysis methods have in common that they operate on other multivariate analysis methods.

ft_mv_meta

  • Status: stable
  • Dependencies: none
  • Contributors: Marcel van Gerven

Base class for meta analysis methods.

ft_mv_gridsearch

  • Status: stable
  • Dependencies: none
  • Contributors: Marcel van Gerven

For many of the described methods there are free parameters which need to be optimized. For instance, the C parameter of the SVM or the regularization parameters L1 and L2 for the elastic net. In those cases, it is useful to employ the ft_mv_gridsearch. This object can be used to optimize free parameters of a multivariate analysis in a fully automated way. The grid search is called as folllows:

ft_mv_gridsearch('mva',mymva,'validator',ft_mv_crossvalidator... ('metric',mymetric),'mvidx',myidx,'vars',myvars,'vals',myvals)

where mymva is the employed MVA, mymetric the employed performance measure to test which configuration is optimal, myidx the index of the method in mymva that is to be optimized, myvars the variable of mymva that needs to be optimized and myvals the values which that variable may assume. This sounds complicated but it is not. Let’s clarify with some examples. Suppose we wish to optimize the C parameter of an SVM in the range logspace(-3,3,7). Then, we can use

% crossvalidator with 80% of the data and accuracy as the metric 
cv = ft_mv_crossvalidator('nfolds',0.8,'metric','accuracy');
opt = ft_mv_gridsearch('verbose',true,'mva',svm,'validator',cv,...
     'vars','C','vals',logspace(-3,3,7))

% standard behavior from here on 
opt = opt.train(X,Y) ...

ft_mv_one_against_one

  • Status: stable
  • Dependencies: none
  • Contributors: Marcel van Gerven

Wrapper class to implement one-against-one classification in case of multiple classes. For example,

ft_mv_one_against_one('mva',ft_mv_svm,'combfun',[])

creates an SVM for all pairs of class labels and combines the predictions using the default combination function (sum of the probabilities).

ft_mv_one_against_rest

  • Status: unstable
  • Dependencies: none
  • Contributors: Marcel van Gerven

Wrapper class to implement one-against-rest classification in case of multiple classes. For example,

ft_mv_one_against_one('mva',ft_mv_svm,'combfun',[])

creates an SVM for each class label and combines the predictions against the remaining class labels using the default combination function (sum of the probabilities).

ft_mv_ensemble

  • Status: unstable
  • Dependencies: none
  • Contributors: Marcel van Gerven

Ensemble methods evoke methods in parallel and combine their outputs in some prespecified way. In FMM this is realized through the ensemble object. For example, we can apply a naive Bayes classifier and a support vector machine through ensemble methods as follows:

m = ft_mv_ensemble(’mvas’,{ft_mv_naive ft_mv_svm},’combfun’,myfun); m = m.train(X,Y); ...

Here, the user-specified function myfun specifies how the ouputs of naive Bayes and the SVM should be combined. For example, myfun could implement a majority vote as follows

function y = myfun(x)
 y = zeros(size(x{1})); 
 for k=1:length(x)
  [temp,pcls] = max(x{k},[],2); 
  for p=1:length(pcls)
   y(p,pcls(p)) = y(p,pcls(p)) + 1; 
  end
 end

 % resolve ties for p=1:size(y,1)
 m = find(ismember(y(p,:),max(y(p,:)))); 
 y(p,:) = 0; 
 if length(m) > 1
  m = m(ceil(rand*length(m))); 
 end
 y(p,m) = 1; 
end

ft_mv_likelihood

  • Status: stable
  • Dependencies: none
  • Contributors: Marcel van Gerven

Likelihood based classification. Takes another probabilistic method m, builds a model for each class and then compares likelihoods. Requires the method m.likelihood(X), which returns the log likelihood of the data X, to be defined. Example:

X = randn(20,10,15); % trials x features x timepoints
X(1:10,:,:) = X(1:10,:,:) + 0.5;
Y = [ones(1,10) 2*ones(1,10)]';
k = ft_mv_likelihood('mva',ft_mv_hmm('indims',[10 15]));
k = k.train(X,Y);
k.test(X) % show log likelihood of training data

Statistics

We can either use a MVA to perform online state estimation or to perform an offline analysis of neuroimaging data. The former is used in BCI applications (prediction) whereas the latter is used in offline analysis of BCI data or MVPA (model inference). Online state estimation is handled in the next section. Here, we describe statistics for models learned on offline data.

ft_mv_crossvalidator

  • Status: stable
  • Dependencies: none
  • Contributors: Marcel van Gerven

Suppose we have acquired an offline dataset where subjects had to attend to the left or right visual field. In a MVPA approach, we want to get an estimate of how well our MVA can predict the at- tended location in individual trials and which features (brain regions, channels, latencies, frequencies, etc) contributed to this prediction. This can be assessed using the ft mv crossvalidator object. It splits the data into separate folds and learns for each fold a model on the remaining folds. For example, in case of ten-fold cross-validation we will obtain ten different models which are evaluated on ten different parts of the data. It is quite easy to perform such an analysis using FFM, as shown below.

% create crossvalidator object which % standardizes the data and applies an svm 
cv = ft_mv_crossvalidator('mva',{ft_mv_standardizer ft_mv_svm});
% train cv on input data X and output data Y 
cv = cv.train(X,Y);
% display classification accuracy 
cv.metric = 'accuracy'; 
disp(cv.performance); 
>> 0.80 % 80% correctly classified
% display outcome of mcnemar test 
cv.sigtest = 'mcnemar'; 
disp(cv.significance); 
>> 1e-4 % p-value; null-hypothesis rejected

In the example, a ten-fold cross-validation is performed by standardizing the data and applying a support vector machine. Subsequently, the classification accuracy (proportion of correctly classified trials) is computed. Finally, an approximate binomial test (McNemar test) is computed which compares the predictions with that of a naive classifier that assigns all outcomes to the majority class. The different performance measures and significance tests can be examined by consulting the help for ft_mv_performance and ft_mv_significance.

One may changenfolds to any number of folds, with nfolds = inf implementing leave-one- out cross-validation. Specifying a proportion for nfolds will use that proportion for training and the remaining data for testing. Finally, specifying nfolds = ’online’ will simulate an online session by taking all trials in sequence where we train on trials 1, . . . , n and test on trial n + 1 for n = 0, . . . , N − 1.

If you want to manually define which trials should end up in which folds then this can be done via either the trainfolds or testfolds property. These should define an Nx1 cell-array containing the trial indices for each of the N folds. Just one of the properties needs to be defined since the other property will automatically take the complement of the trials in each fold.

We have not yet mentioned how to determine which features were responsible for the predictions. This is realized through the model field of the cross-validator. It returns the average model that is produced by the last method in the specified MVA; in our case, the model of a SVM. It is up to each multivariate method to specify how its model is defined. For example, for the above example, assuming we used only five features, we have:

disp(cv.model); 
>> [ 0.0176 -0.0305 0.0445 0.0904 0.0710]
disp(cv.description) 
>> 'primal form parameters; positive values indicate condition 2'

Sometimes it can be very expensive to save all the parameters learned in each fold. In that case, you may want to set the property compact to true. This will still save the average model but not all the parameters separately for each fold.

ft_mv_performance
  • Status: stable
  • Dependencies: none
  • Contributors: Marcel van Gerven

Performance measures of prediction results. Operates on the design matrix and the predictions. Called as

res = ft_mv_performance(design,post,metric)

The following performance metrics can be used:

  • accuracy : proportion of correctly classified trials
  • logprob : log probability of the correct classes
  • correlation : correlation between design and post
  • invresvar : inverse residual variance
  • coefdet : coefficient of determination
  • contingency : contingency table
  • cfmatrix : confusion matrix
  • bac : balanced accuracy

ft_mv_significance

  • Status: stable
  • Dependencies: none
  • Contributors: Marcel van Gerven

Significance testing of prediction results. Operates on the design matrix and the predictions. Called as

res = ft_mv_significance(design,post,sigtest)

and returns a p-value as the result. The sigtest determines the significance test. It supports

  • binomial : performs a binomial test comparing the predictions with predictions made by a majority classifier (random guessing for equal number of trials per class).
  • mcnemar : approximate binomial test

ft_mv_test

  • Status: stable
  • Dependencies: none
  • Contributors: Marcel van Gerven

A helper function which allows easy testing of a new dataset.

Multitask learning

Multitask learning is the the notion that if we have multiple datasets (e.g., subjects, sessions) then we can learn better models for each dataset by being informed by the other datasets. For instance, if multiple datasets are given to ft mv blogreg then the same features are coupled between the different datasets. This leads to dataset-specific models that are easier to compare. For instance, in the following example we apply this method to two datasets, both generated by adding random noise to the original data.

>> load 69digits; 
>> X1 = X + 0.05*randn(size(X)); 
>> X2 = X + 0.1*randn(size(X)); 
>> a = ft_mv_test(’mva’,{ft_mv_blogreg},’X’,X1,’Y’,Y); 
>> disp(a)
0.6600 
>> a = ft_mv_test(’mva’,{ft_mv_blogreg},’X’,X2,’Y’,Y); 
>> disp(a)
0.5800 
>> a = ft_mv_test(’mva’,{ft_mv_blogreg(’taskcoupling’,100)},’X’,{X1 X2},’Y’,Y); 
>> disp(a)
[0.6900] [0.5900]

Hence, if we apply multitask learning then the results per dataset are influenced by the other datasets. The taskcoupling parameter effectively couples the models:

>> disp(corr(d.model{1,1},d.model{1,2})) 
0.9999

Weaker coupling will cause weaker correlations between the models.

Online state estimation

If we use the toolbox for prediction then we need to work under strict time constraints. During training we want an optimal MVA to be learned quickly and be able to update our MVA with new incoming data. During testing we want to obtain an MVA output fast such that the online system does not stall. In the following example, we show an example of how online training and testing is realized.

% create a naive Bayes classifier 
clf = ft_mv_naive
% train on initial data 
clf = clf.train(X1,Y1)
% get output for new data; produces a posterior over classes 
out1 = clf.test(x1);
% train some more (update the trained classifier) 
clf = clf.train(X2,Y2);
% get output for new data; produces a posterior over classes 
out2 = clf.test(x2);

In the above example we used a naive Bayes classifier but the MVA can be arbitrarily complex. For example,

clf = ft_mv_analysis({ft_mv_standardizer ... ft_mv_filterer(’maxfeatures’,10) ft_mv_svm})

creates a MVA which first standardizes the data, then performs feature selection and finally applies a support vector machine.

Adaptive methods

Parallelization (work in progress)

Some of the described methods can be resource intensive. Most notably, cross validation, grid search, feature selection and the application of ensemble methods since they all require iterating over a collection of multivariate analyses. These methods all support parallel computing as implemented through FieldTrip’s peer distributed computing module.

This module is maintained by Marcel van Gerven (m.vangerven@donders.ru.nl).

development/multivariate.txt · Last modified: 2011/10/28 12:13 by 134.34.250.86

You are here: startdevelopmentmultivariate
This DokuWiki features an Anymorphic Webdesign theme, customised by Eelke Spaak and Stephen Whitmarsh.
www.chimeric.de Valid CSS Driven by DokuWiki do yourself a favour and use a real browser - get firefox!! Recent changes RSS feed Valid XHTML 1.0