PSU at CLEF-2020 ARQMath Track: Unsupervised Re-ranking using Pretraining

Shaurya Rohatgi, Jian Wu, C. Lee Giles

Research output: Contribution to journal › Conference article › peer-review

7 Scopus citations

Abstract

This paper elaborates on our submission to the ARQMath track at CLEF 2020. Our primary run for the main task, Task 1: Question Answering, uses a two-stage retrieval technique. The first stage fuses traditional BM25 scoring with tf-idf cosine-similarity retrieval, while the second stage is a finer re-ranking technique using contextualized embeddings. For the re-ranking we use a pre-trained roberta-base model (110 million parameters) to make the language model more math-aware. Our approach achieves a higher NDCG′ score than the baseline, while our MAP and P@10 scores are competitive, performing better than the best submission (MathDowsers) for text-dependent and text+formula-dependent topics.
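
The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the BM25 and tf-idf scores are fused, so the min-max normalized sum used here is an assumption, and `toy_embed` (character trigram counts) is a hypothetical stand-in for the contextualized embeddings that the paper obtains from a roberta-base model. All function names and the sample documents are illustrative only.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Okapi BM25 score of every tokenized document for a tokenized query."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query:
            if t in tf:
                idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
                s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def tfidf_cosine_scores(query, docs):
    """Cosine similarity between tf-idf vectors (smoothed idf)."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    def vec(tokens):
        return {t: c * (math.log((1 + n) / (1 + df[t])) + 1)
                for t, c in Counter(tokens).items()}
    q = vec(query)
    return [cosine(q, vec(d)) for d in docs]

def minmax(xs):
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) if hi > lo else 0.0 for x in xs]

def first_stage(query_text, doc_texts, top_k):
    """Stage 1: fuse BM25 and tf-idf cosine scores (assumed: normalized sum)."""
    q = tokenize(query_text)
    docs = [tokenize(d) for d in doc_texts]
    fused = [a + b for a, b in zip(minmax(bm25_scores(q, docs)),
                                   minmax(tfidf_cosine_scores(q, docs)))]
    order = sorted(range(len(docs)), key=lambda i: fused[i], reverse=True)
    return order[:top_k]

def toy_embed(text):
    """Hypothetical stand-in for a contextualized embedding: trigram counts."""
    s = text.lower()
    return dict(Counter(s[i:i + 3] for i in range(len(s) - 2)))

def rerank(query_text, doc_texts, candidates, embed=toy_embed):
    """Stage 2: reorder the stage-1 candidates by embedding similarity."""
    q = embed(query_text)
    return sorted(candidates,
                  key=lambda i: cosine(q, embed(doc_texts[i])),
                  reverse=True)

# Illustrative data, not taken from the ARQMath collection.
doc_texts = [
    "how to solve a quadratic equation",
    "recipe for apple pie",
    "solve quadratic equation by completing the square",
    "history of greece",
]
query_text = "solve quadratic equation"
candidates = first_stage(query_text, doc_texts, top_k=2)
reranked = rerank(query_text, doc_texts, candidates)
```

In a realistic setting, `embed` would be replaced by a function that returns sentence or passage vectors from a language model, and the first stage would run over an inverted index rather than scoring every document.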

Original language: English (US)
Journal: CEUR Workshop Proceedings
Volume: 2696
State: Published - 2020
Event: 11th Conference and Labs of the Evaluation Forum, CLEF 2020 - Thessaloniki, Greece
Duration: Sep 22 2020 to Sep 25 2020

All Science Journal Classification (ASJC) codes

  • General Computer Science
