REASTAP: Injecting Table Reasoning Skills During Pre-training via Synthetic Reasoning Examples

Yilun Zhao, Linyong Nan, Zhenting Qi, Rui Zhang, Dragomir Radev

Research output: Contribution to conference › Paper › peer-review

15 Scopus citations

Abstract

Reasoning over tabular data requires both table structure understanding and a broad set of table reasoning skills. Current models with table-specific architectures and pre-training methods perform well on understanding table structures, but they still struggle with tasks that require various table reasoning skills. In this work, we develop REASTAP to show that high-level table reasoning skills can be injected into models during pre-training without a complex table-specific architecture design. We define 7 table reasoning skills, such as numerical operation, temporal comparison, and conjunction. Each reasoning skill is associated with one example generator, which synthesizes questions over semi-structured tables according to the sampled templates. We model the table pre-training task as a sequence generation task and pre-train REASTAP to generate precise answers to the synthetic examples. REASTAP is evaluated on four benchmarks covering three downstream tasks: 1) WIKISQL-WEAK and WIKITQ for Table Question Answering; 2) TABFACT for Table Fact Verification; and 3) LOGICNLG for Faithful Table-to-Text Generation. Experimental results demonstrate that REASTAP achieves new state-of-the-art performance on all benchmarks and delivers a significant improvement in the low-resource setting. Our code is publicly available at https://github.com/Yale-LILY/ReasTAP.
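As a rough illustration of the idea in the abstract (one example generator per skill, filling sampled templates over a table and pairing each question with an exact answer for sequence generation), the following is a minimal sketch. It is not the authors' implementation: the toy table, the two templates, the `flatten_table` linearization format, and the `generate_example` helper are all assumptions made up for this example.

```python
import random

# Hypothetical toy table; column names and values are illustrative only.
TABLE = {
    "header": ["player", "team", "points"],
    "rows": [
        ["Ann", "Red", 12],
        ["Bob", "Blue", 7],
        ["Cat", "Red", 5],
    ],
}

# Hypothetical templates for a "numerical operation" skill.
TEMPLATES = [
    ("sum", "What is the sum of {target} when {cond_col} is {cond_val}?"),
    ("max", "What is the highest {target} when {cond_col} is {cond_val}?"),
]


def flatten_table(table):
    """Linearize a table into one string (an assumed format, not the paper's)."""
    head = "col : " + " | ".join(table["header"])
    rows = [
        f"row {i} : " + " | ".join(str(cell) for cell in row)
        for i, row in enumerate(table["rows"], start=1)
    ]
    return " ".join([head] + rows)


def generate_example(table, target, cond_col, rng):
    """Sample a template and a condition value, then compute the exact answer."""
    op, template = rng.choice(TEMPLATES)
    t_idx = table["header"].index(target)
    c_idx = table["header"].index(cond_col)
    # Sort the distinct values so sampling is reproducible for a given seed.
    cond_val = rng.choice(sorted({row[c_idx] for row in table["rows"]}))
    values = [row[t_idx] for row in table["rows"] if row[c_idx] == cond_val]
    answer = sum(values) if op == "sum" else max(values)
    question = template.format(target=target, cond_col=cond_col, cond_val=cond_val)
    # Source sequence = question + flattened table; target sequence = answer.
    return {"source": f"{question} {flatten_table(table)}", "target": str(answer)}


example = generate_example(TABLE, "points", "team", random.Random(0))
print(example["source"])
print(example["target"])
```

In this sketch the answer is computed directly from the table, so every synthetic pair is correct by construction; a sequence-to-sequence model would then be trained to map `source` to `target`.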

Original language: English (US)
Pages: 9006-9018
Number of pages: 13
State: Published - 2022
Event: 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 - Abu Dhabi, United Arab Emirates
Duration: Dec 7 2022 → Dec 11 2022

Conference

Conference: 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
Country/Territory: United Arab Emirates
City: Abu Dhabi
Period: 12/7/22 → 12/11/22

All Science Journal Classification (ASJC) codes

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Information Systems
