TY - GEN
T1 - Generating synopses for document-element search
AU - Bhatia, Sumit
AU - Lahiri, Shibamouli
AU - Mitra, Prasenjit
PY - 2009
Y1 - 2009
N2 - Scientists often search for document-elements like tables, figures, or algorithm pseudo-codes. Domain scientists and researchers report important data, results and algorithms using these document-elements; readers want to compare the reported results with their findings. Some document-element search engines have been proposed (especially to search for tables and figures) to make this task easier. While searching for document-elements today, the end-user is presented with the caption of the document-element and a sentence in the document text that refers to the document-element. Oftentimes, the caption and the reference text do not contain enough information to interpret the document-element. In this paper, we present the first set of methods to extract this useful information (synopsis) related to document-elements automatically. We also investigate the problem of choosing the optimum synopsis-size that strikes a balance between information content and size of the generated synopses.
UR - http://www.scopus.com/inward/record.url?scp=74549222997&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=74549222997&partnerID=8YFLogxK
U2 - 10.1145/1645953.1646287
DO - 10.1145/1645953.1646287
M3 - Conference contribution
AN - SCOPUS:74549222997
SN - 9781605585123
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 2003
EP - 2006
BT - ACM 18th International Conference on Information and Knowledge Management, CIKM 2009
T2 - ACM 18th International Conference on Information and Knowledge Management, CIKM 2009
Y2 - 2 November 2009 through 6 November 2009
ER -