Explanation
This section of the documentation explains how autoreject works.
The original publication [1] remains the primary source for understanding the method;
the sections in this guide aim to make its content easier to grasp.
Intuition on how the "autoreject global" algorithm works
Given some M/EEG data \(X\) with the dimensions \(trials(=epochs) \times sensors \times timepoints\),
we want to find a threshold \(\tau\) in \(\mu V\) that rejects noisy epochs and retains clean epochs.
Do the following for a set of candidate thresholds \(\Phi\):
For each \(\tau_i \in \Phi\) :
Split your data \(X\) into \(K\) folds (\(K\) equal parts) along the trial dimension
Each of the \(K\) parts serves as the “test” set once, while the remaining \(K-1\) parts are combined into the corresponding “train” set (see k-fold cross-validation)
Then for each fold \(k\) (consisting of train and test trials) do:
apply threshold \(\tau_i\) to reject trials in the train set
calculate the mean of the signal (for each sensor and timepoint) over the GOOD (=not rejected) trials in the train set
calculate the median of the signal (for each sensor and timepoint) over ALL trials in the test set
compare both of these signals and calculate the error \(e_k\) (i.e., take the Frobenius norm of their difference)
save that error \(e_k\)
Now we have \(K\) errors \(e_k \in E\)
Compute the mean error \(\bar E\) (over all \(K\) errors) associated with our current threshold \(\tau_i\) in \(\mu V\)
Save the mapping of \(\tau_i\) to its associated error \(\bar E\)
… now each threshold candidate in the set \(\Phi\) is mapped to a specific error value \(\bar E\)
The candidate threshold \(\tau_i\) with the lowest mean error \(\bar E\) is the best threshold for a global rejection.
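The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not autoreject's actual implementation or API: the function names `peak_to_peak` and `global_threshold` are hypothetical, a trial is assumed to be rejected when its peak-to-peak amplitude exceeds \(\tau_i\) in any sensor, the folds are drawn once and reused for every candidate threshold, and a fold whose train set is rejected entirely is assigned an infinite error.

```python
import numpy as np


def peak_to_peak(X):
    """Largest peak-to-peak amplitude per trial, taken over all sensors.

    X has shape (trials, sensors, timepoints).
    """
    return (X.max(axis=2) - X.min(axis=2)).max(axis=1)


def global_threshold(X, candidates, n_folds=5, seed=0):
    """Map each candidate threshold to its mean CV error and pick the best.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    # Split the shuffled trial indices into K folds (assumption: split once,
    # then reuse the same folds for every candidate threshold).
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    ptp = peak_to_peak(X)

    mean_errors = {}
    for tau in candidates:
        errors = []
        for k in range(n_folds):
            test_idx = folds[k]
            train_idx = np.concatenate(
                [folds[j] for j in range(n_folds) if j != k]
            )
            # Assumed rejection rule: keep train trials whose peak-to-peak
            # amplitude does not exceed tau.
            good = train_idx[ptp[train_idx] <= tau]
            if len(good) == 0:
                # Assumption: no usable train trials means infinite error.
                errors.append(np.inf)
                continue
            train_mean = X[good].mean(axis=0)              # GOOD train trials
            test_median = np.median(X[test_idx], axis=0)   # ALL test trials
            # np.linalg.norm of a 2D array defaults to the Frobenius norm.
            errors.append(np.linalg.norm(train_mean - test_median))
        mean_errors[float(tau)] = float(np.mean(errors))

    best_tau = min(mean_errors, key=mean_errors.get)
    return best_tau, mean_errors


# Usage on synthetic data (values in microvolts, purely illustrative).
rng = np.random.default_rng(42)
X = rng.normal(scale=10.0, size=(60, 4, 50))         # mostly clean trials
X[:6] += rng.normal(scale=200.0, size=(6, 4, 50))    # six noisy trials
candidates = np.linspace(40, 1500, 30)
best_tau, mean_errors = global_threshold(X, candidates)
print(f"best threshold: {best_tau:.1f} µV")
```

The median over the test set is used as a robust reference signal, so the noisy trials left in the test set do not dominate the error.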