TY - GEN
T1 - Strain-Level Identification and Analysis of Avian Coronavirus Using Raman Spectroscopy and Interpretable Machine Learning
AU - Jin, Peng
AU - Yeh, Yin Ting
AU - Ye, Jiarong
AU - Wang, Ziyang
AU - Xue, Yuan
AU - Zhang, Na
AU - Huang, Shengxi
AU - Ghedin, Elodie
AU - Lu, Huaguang
AU - Schmitt, Anthony
AU - Huang, Sharon X.
AU - Terrones, Mauricio
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
AB - Strain-level identification of viruses is important for decision-making in public health management. Recently, Raman spectroscopy has attracted considerable attention in virus identification since it enables rapid and label-free analysis. In this paper, we present an interpretable machine learning approach for strain-level identification of avian coronaviruses based on Raman spectra. Specifically, we design a spectral transformer to classify the Raman spectra of 32 avian coronavirus strains. After training, relevance maps can be generated through gradient and relevance propagation to further understand the contribution of each wavenumber to the identification. Experimental results show that the proposed method outperforms several machine learning and deep learning baseline models, and achieves 72.72% accuracy in the 32-class identification problem. The generated relevance maps reveal some wavenumber ranges that are important for the identification of almost all strains, and these ranges correlate with Raman peak ranges for lipids, nucleic acids, and proteins.
UR - http://www.scopus.com/inward/record.url?scp=85172109406&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85172109406&partnerID=8YFLogxK
U2 - 10.1109/ISBI53787.2023.10230416
DO - 10.1109/ISBI53787.2023.10230416
M3 - Conference contribution
AN - SCOPUS:85172109406
T3 - Proceedings - International Symposium on Biomedical Imaging
BT - 2023 IEEE International Symposium on Biomedical Imaging, ISBI 2023
PB - IEEE Computer Society
T2 - 20th IEEE International Symposium on Biomedical Imaging, ISBI 2023
Y2 - 18 April 2023 through 21 April 2023
ER -