TY - GEN
T1 - Predicting RNA splicing branchpoints
AU - Jovanovic, Antonio
AU - Alqassem, Israa
AU - Chappell, Nathan
AU - Canzar, Stefan
AU - Matijevic, Domagoj
N1 - Publisher Copyright: © 2022 Croatian Society MIPRO.
PY - 2022
Y1 - 2022
N2 - RNA splicing is the process by which introns are removed from pre-mRNA to produce mature mRNA. It requires three main signals: a donor splice site (5'ss), an acceptor splice site (3'ss), and a branchpoint (BP). Splice site prediction is a well-studied problem with several reliable prediction tools. Branchpoint prediction is harder, mainly because nucleotide motifs vary in the branchpoint region and a single intron can contain multiple branchpoints. An RNN-based approach called LaBranchoR was introduced as the state-of-the-art method for predicting a single BP for each 3'ss. In this work, we build on previous research reporting that 95% of introns have multiple BPs, with an estimated average of 5 to 6 BPs per intron. To that end, we extend the existing encoder in the LaBranchoR network with a Pointer Network decoder. We train our new encoder-decoder model, named RNA PtrNets, on 70-nucleotide-long annotated sequences taken from three publicly available datasets. We evaluate its accuracy and demonstrate how well the predictor can generate multiple branchpoints on the given datasets.
AB - RNA splicing is the process by which introns are removed from pre-mRNA to produce mature mRNA. It requires three main signals: a donor splice site (5'ss), an acceptor splice site (3'ss), and a branchpoint (BP). Splice site prediction is a well-studied problem with several reliable prediction tools. Branchpoint prediction is harder, mainly because nucleotide motifs vary in the branchpoint region and a single intron can contain multiple branchpoints. An RNN-based approach called LaBranchoR was introduced as the state-of-the-art method for predicting a single BP for each 3'ss. In this work, we build on previous research reporting that 95% of introns have multiple BPs, with an estimated average of 5 to 6 BPs per intron. To that end, we extend the existing encoder in the LaBranchoR network with a Pointer Network decoder. We train our new encoder-decoder model, named RNA PtrNets, on 70-nucleotide-long annotated sequences taken from three publicly available datasets. We evaluate its accuracy and demonstrate how well the predictor can generate multiple branchpoints on the given datasets.
UR - http://www.scopus.com/inward/record.url?scp=85133947457&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85133947457&partnerID=8YFLogxK
U2 - 10.23919/MIPRO55190.2022.9803685
DO - 10.23919/MIPRO55190.2022.9803685
M3 - Conference contribution
AN - SCOPUS:85133947457
T3 - 2022 45th Jubilee International Convention on Information, Communication and Electronic Technology, MIPRO 2022 - Proceedings
SP - 383
EP - 388
BT - 2022 45th Jubilee International Convention on Information, Communication and Electronic Technology, MIPRO 2022 - Proceedings
A2 - Vrcek, Neven
A2 - Koricic, Marko
A2 - Gradisnik, Vera
A2 - Skala, Karolj
A2 - Car, Zeljka
A2 - Cicin-Sain, Marina
A2 - Babic, Snjezana
A2 - Sruk, Vlado
A2 - Skvorc, Dejan
A2 - Jovic, Alan
A2 - Gros, Stjepan
A2 - Vrdoljak, Boris
A2 - Mauher, Mladen
A2 - Tijan, Edvard
A2 - Katulic, Tihomir
A2 - Petrovic, Juraj
A2 - Grbac, Tihana Galinac
A2 - Kusen, Benjamin
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 45th Jubilee International Convention on Information, Communication and Electronic Technology, MIPRO 2022
Y2 - 23 May 2022 through 27 May 2022
ER -