REASTAP: Injecting Table Reasoning Skills During Pre-training via Synthetic Reasoning Examples

Yilun Zhao, Linyong Nan, Zhenting Qi, Rui Zhang, Dragomir Radev

Research output: Chapter in Book/Report/Conference proceedingConference contribution

26 Scopus citations

Abstract

Reasoning over tabular data requires both table structure understanding and a broad set of table reasoning skills. Current models with table-specific architectures and pre-training methods perform well on understanding table structures, but they still struggle with tasks that require various table reasoning skills. In this work, we develop REASTAP to show that high-level table reasoning skills can be injected into models during pre-training without a complex table-specific architecture design. We define 7 table reasoning skills, such as numerical operation, temporal comparison, and conjunction. Each reasoning skill is associated with one example generator, which synthesizes questions over semi-structured tables according to the sampled templates. We model the table pre-training task as a sequence generation task and pre-train REASTAP to generate precise answers to the synthetic examples. REASTAP is evaluated on four benchmarks covering three downstream tasks including: 1) WIKISQL-WEAK and WIKITQ for Table Question Answering; 2) TABFACT for Table Fact Verification; and 3) LOGICNLG for Faithful Table-to-Text Generation. Experimental results demonstrate that REASTAP achieves new state-of-the-art performance on all benchmarks and delivers a significant improvement on low-resource setting. Our code is publicly available at https://github.com/Yale-LILY/ReasTAP.

Original languageEnglish (US)
Title of host publicationProceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
EditorsYoav Goldberg, Zornitsa Kozareva, Yue Zhang
PublisherAssociation for Computational Linguistics (ACL)
Pages9006-9018
Number of pages13
ISBN (Electronic)9781959429401
DOIs
StatePublished - 2022
Event2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 - Hybrid, Abu Dhabi, United Arab Emirates
Duration: Dec 7 2022Dec 11 2022

Publication series

NameProceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022

Conference

Conference2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
Country/TerritoryUnited Arab Emirates
CityHybrid, Abu Dhabi
Period12/7/2212/11/22

All Science Journal Classification (ASJC) codes

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Information Systems

Fingerprint

Dive into the research topics of 'REASTAP: Injecting Table Reasoning Skills During Pre-training via Synthetic Reasoning Examples'. Together they form a unique fingerprint.

Cite this