TY - JOUR
T1 - PSU at CLEF-2020 ARQMath Track
T2 - 11th Conference and Labs of the Evaluation Forum, CLEF 2020
AU - Rohatgi, Shaurya
AU - Wu, Jian
AU - Giles, C. Lee
N1 - Funding Information:
We would like to thank members of the ARQMath lab at the Department of Computer Science in Rochester Institute of Technology for organizing this track. Special thanks to Behrooz Mansouri for providing the dataset, initial analysis of topics, and starter code to all the participants of the task; it made it easier for us to pre-process the data and jump directly to the experiments which have been presented in this work.
Publisher Copyright:
Copyright © 2020 for this paper by its authors.
PY - 2020
Y1 - 2020
N2 - This paper elaborates on our submission to the ARQMath track at CLEF 2020. Our primary run for the main Task-1: Question Answering uses a two-stage retrieval technique in which the first stage is a fusion of traditional BM25 scoring and tf-idf with cosine similarity-based retrieval while the second stage is a finer re-ranking technique using contextualized embeddings. For the re-ranking we use a pre-trained roberta-base model (110 million parameters) to make the language model more math-aware. Our approach achieves a higher NDCG′ score than the baseline, while our MAP and P@10 scores are competitive, performing better than the best submission (MathDowsers) for text and text+formula dependent topics.
AB - This paper elaborates on our submission to the ARQMath track at CLEF 2020. Our primary run for the main Task-1: Question Answering uses a two-stage retrieval technique in which the first stage is a fusion of traditional BM25 scoring and tf-idf with cosine similarity-based retrieval while the second stage is a finer re-ranking technique using contextualized embeddings. For the re-ranking we use a pre-trained roberta-base model (110 million parameters) to make the language model more math-aware. Our approach achieves a higher NDCG′ score than the baseline, while our MAP and P@10 scores are competitive, performing better than the best submission (MathDowsers) for text and text+formula dependent topics.
UR - http://www.scopus.com/inward/record.url?scp=85111688292&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85111688292&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85111688292
SN - 1613-0073
VL - 2696
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
Y2 - 22 September 2020 through 25 September 2020
ER -