TY - JOUR
T1 - I tried a bunch of things
T2 - The dangers of unexpected overfitting in classification of brain data
AU - Hosseini, Mahan
AU - Powell, Michael
AU - Collins, John
AU - Callahan-Flintoft, Chloe
AU - Jones, William
AU - Bowman, Howard
AU - Wyble, Brad
N1 - Publisher Copyright:
© 2020 Elsevier Ltd
PY - 2020/12
Y1 - 2020/12
N2 - Machine learning has enhanced the abilities of neuroscientists to interpret information collected through EEG, fMRI, and MEG data. With these powerful techniques comes the danger of overfitting of hyperparameters, which can render results invalid. We refer to this problem as ‘overhyping’ and show that it is pernicious despite commonly used precautions. Overhyping occurs when analysis decisions are made after observing analysis outcomes and can produce results that are partially or even completely spurious. It is commonly assumed that cross-validation is an effective protection against overfitting or overhyping, but this is not actually true. In this article, we show that spurious results can be obtained on random data by modifying hyperparameters in seemingly innocuous ways, despite the use of cross-validation. We recommend a number of techniques for limiting overhyping, such as lock boxes, blind analyses, pre-registrations, and nested cross-validation. These techniques are common in other fields that use machine learning, including computer science and physics. Adopting similar safeguards is critical for ensuring the robustness of machine-learning techniques in the neurosciences.
AB - Machine learning has enhanced the abilities of neuroscientists to interpret information collected through EEG, fMRI, and MEG data. With these powerful techniques comes the danger of overfitting of hyperparameters, which can render results invalid. We refer to this problem as ‘overhyping’ and show that it is pernicious despite commonly used precautions. Overhyping occurs when analysis decisions are made after observing analysis outcomes and can produce results that are partially or even completely spurious. It is commonly assumed that cross-validation is an effective protection against overfitting or overhyping, but this is not actually true. In this article, we show that spurious results can be obtained on random data by modifying hyperparameters in seemingly innocuous ways, despite the use of cross-validation. We recommend a number of techniques for limiting overhyping, such as lock boxes, blind analyses, pre-registrations, and nested cross-validation. These techniques are common in other fields that use machine learning, including computer science and physics. Adopting similar safeguards is critical for ensuring the robustness of machine-learning techniques in the neurosciences.
UR - http://www.scopus.com/inward/record.url?scp=85095433113&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85095433113&partnerID=8YFLogxK
U2 - 10.1016/j.neubiorev.2020.09.036
DO - 10.1016/j.neubiorev.2020.09.036
M3 - Review article
C2 - 33035522
AN - SCOPUS:85095433113
SN - 0149-7634
VL - 119
SP - 456
EP - 467
JO - Neuroscience and Biobehavioral Reviews
JF - Neuroscience and Biobehavioral Reviews
ER -