A Study on Reproducibility and Replicability of Table Structure Recognition Methods

Kehinde Ajayi, Muntabir Hasan Choudhury, Sarah M. Rajtmajer, Jian Wu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Scopus citations

Abstract

Concerns about reproducibility in artificial intelligence (AI) have emerged, as researchers have reported unsuccessful attempts to directly reproduce published findings in the field. Replicability, the ability to affirm a finding using the same procedures on new data, has not been well studied. In this paper, we examine both reproducibility and replicability of a corpus of 16 papers on table structure recognition (TSR), an AI task aimed at identifying cell locations of tables in digital documents. We attempt to reproduce published results using code and datasets provided by the original authors. We then examine replicability using a dataset similar to the original as well as a new dataset, GenTSR, consisting of 386 annotated tables extracted from scientific papers. Out of 16 papers studied, we reproduce results consistent with the original in only four. Two of the four papers are identified as replicable using the similar dataset under certain intersection-over-union (IoU) thresholds. No paper is identified as replicable using the new dataset. We offer observations on the causes of irreproducibility and irreplicability. All code and data are available on Code Ocean at https://codeocean.com/capsule/6680116/tree.
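
As a rough illustration of the IoU criterion mentioned in the abstract, the Python sketch below computes the intersection over union of a predicted and a ground-truth cell box and checks it against a threshold. The box format (x_min, y_min, x_max, y_max), the function names iou and is_match, and the default threshold of 0.6 are assumptions made for illustration; they are not taken from the paper or its released code.

```python
from typing import Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Return the intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if any.
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp to zero so that disjoint boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_match(pred: Box, truth: Box, threshold: float = 0.6) -> bool:
    """Count a predicted cell as correct if its IoU meets the threshold.

    The default threshold is an illustrative assumption.
    """
    return iou(pred, truth) >= threshold


if __name__ == "__main__":
    predicted = (0.0, 0.0, 10.0, 10.0)
    annotated = (0.0, 0.0, 10.0, 8.0)
    print(iou(predicted, annotated))
    print(is_match(predicted, annotated, threshold=0.9))
```

Because the same prediction can pass at a loose threshold and fail at a strict one, a result reported "under certain IoU values" may hold only for part of the threshold range.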

Original language: English (US)
Title of host publication: Document Analysis and Recognition – ICDAR 2023 - 17th International Conference, Proceedings
Editors: Gernot A. Fink, Rajiv Jain, Koichi Kise, Richard Zanibbi
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 3-19
Number of pages: 17
ISBN (Print): 9783031416781
State: Published - 2023
Event: 17th International Conference on Document Analysis and Recognition, ICDAR 2023 - San José, United States
Duration: Aug 21 2023 → Aug 26 2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14188 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 17th International Conference on Document Analysis and Recognition, ICDAR 2023
Country/Territory: United States
City: San José
Period: 8/21/23 → 8/26/23

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science
