TY - GEN
T1 - Application of Quantum Tensor Networks for Protein Classification
AU - Kundu, Debarshi
AU - Ghosh, Archisman
AU - Ekambaram, Srinivasan
AU - Wang, Jian
AU - Dokholyan, Nikolay
AU - Ghosh, Swaroop
N1 - Publisher Copyright: © 2024 ACM.
PY - 2024/6/12
Y1 - 2024/6/12
AB - Computational methods in drug discovery significantly reduce both time and experimental costs. Nonetheless, certain computational tasks in drug discovery remain daunting for classical computing techniques, a limitation that quantum computing could potentially overcome. A crucial task within this domain is the functional classification of proteins. However, a challenge lies in adequately representing lengthy protein sequences given the limited number of qubits available in existing noisy quantum computers. We show that protein sequences can be thought of as sentences in natural language processing and can be parsed using the existing Quantum Natural Language framework into parameterized quantum circuits with a reasonable number of qubits, which can be trained to solve various protein-related machine-learning problems. We classify proteins based on their sub-cellular locations, a pivotal task in bioinformatics that is key to understanding biological processes and disease mechanisms. Leveraging quantum-enhanced processing capabilities, we demonstrate that Quantum Tensor Networks (QTN) can effectively handle the complexity and diversity of protein sequences. We present a detailed methodology that adapts QTN architectures to the nuanced requirements of protein data, supported by comprehensive experimental results. We demonstrate two distinct QTNs, inspired by classical recurrent neural networks (RNN) and convolutional neural networks (CNN), to solve the binary classification task mentioned above. Our top-performing quantum model achieves 94% accuracy, which is comparable to the performance of a classical model that uses ESM2 protein language model embeddings. Notably, the ESM2 model is extremely large, containing 8 million parameters in its smallest configuration, whereas our best quantum model requires only around 800 parameters. We demonstrate that these hybrid models exhibit promising performance, showcasing their potential to compete with classical models of similar complexity.
UR - http://www.scopus.com/inward/record.url?scp=85197905471&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85197905471&partnerID=8YFLogxK
U2 - 10.1145/3649476.3658701
DO - 10.1145/3649476.3658701
M3 - Conference contribution
AN - SCOPUS:85197905471
T3 - Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI
SP - 132
EP - 137
BT - GLSVLSI 2024 - Proceedings of the Great Lakes Symposium on VLSI 2024
PB - Association for Computing Machinery
T2 - 34th Great Lakes Symposium on VLSI 2024, GLSVLSI 2024
Y2 - 12 June 2024 through 14 June 2024
ER -