MIA 2022 Shared Task: Evaluating Cross-lingual Open-Retrieval Question Answering for 16 Diverse Languages

Akari Asai, Shayne Longpre, Jungo Kasai, Chia Hsuan Lee, Rui Zhang, Junjie Hu, Ikuya Yamada, Jonathan H. Clark, Eunsol Choi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Scopus citations

Abstract

We present the results of the Workshop on Multilingual Information Access (MIA) 2022 Shared Task, which evaluates cross-lingual open-retrieval question answering (QA) systems in 16 typologically diverse languages. For this task, we adapted two large-scale cross-lingual open-retrieval QA datasets covering 14 typologically diverse languages and newly annotated open-retrieval QA data in 2 underrepresented languages: Tagalog and Tamil. Four teams submitted systems. The best constrained system uses entity-aware contextualized representations for document retrieval and achieves an average F1 score of 31.6, which is 4.1 absolute F1 points higher than the challenging baseline. The best system obtains particularly significant improvements in Tamil (20.8 F1), whereas most of the other systems yield nearly zero scores. The best unconstrained system achieves 32.2 F1, outperforming our baseline by 4.5 points. The official leaderboard and baseline models are publicly available.

Original language: English (US)
Title of host publication: MIA 2022 - Workshop on Multilingual Information Access, Proceedings of the Workshop
Editors: Akari Asai, Eunsol Choi, Jonathan H. Clark, Junjie Hu, Chia-Hsuan Lee, Jungo Kasai, Shayne Longpre, Ikuya Yamada, Rui Zhang
Publisher: Association for Computational Linguistics (ACL)
Pages: 108-120
Number of pages: 13
ISBN (Electronic): 9781955917896
State: Published - 2022
Event: 2022 Workshop on Multilingual Information Access, MIA 2022 - Seattle, United States
Duration: Jul 15 2022 → …

Publication series

Name: MIA 2022 - Workshop on Multilingual Information Access, Proceedings of the Workshop

Conference

Conference: 2022 Workshop on Multilingual Information Access, MIA 2022
Country/Territory: United States
City: Seattle
Period: 7/15/22 → …

All Science Journal Classification (ASJC) codes

  • Language and Linguistics
  • Linguistics and Language
