Information extraction from nanotoxicity related publications

Lemin Xiao, Kaizhi Tang, Xiong Liu, Hui Yang, Zheng Chen, Roger Xu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Scopus citations

Abstract

High-quality experimental data are important for developing predictive models of nanomaterial environmental impact (NEI). Because raw data from experimental laboratories and manufacturing workplaces are usually proprietary and small in scale, extracting information from publications is an attractive alternative way to collect data. We developed an information extraction system that extracts useful information from full-text nanotoxicity-related publications. The system consists of five components: transformation of raw data into a machine-readable format, data preprocessing, ontology-based named entity recognition, rule-based numerical attribute extraction from both tables and unstructured text, and relation extraction among entities and attributes. Applied to a dataset of 94 publications, the system achieves acceptable accuracy. Storing the extracted data in a table organized by the relations among them yields a dataset that can be used to predict nanomaterial environmental impact. Such a system is unique in the current nanomaterial community, and it can help nanomaterial scientists and practitioners quickly locate the information they need without spending a great deal of time reading articles.
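The last three components named in the abstract (ontology-based entity recognition, rule-based numerical attribute extraction, and relation extraction) can be illustrated with a minimal sketch. This is not the authors' implementation: the `ONTOLOGY` dictionary, the `ATTRIBUTE_RULE` pattern, the function names, and the nearest-entity heuristic are all assumptions made for illustration, and the sketch handles only plain ASCII sentences, not tables or document conversion.

```python
import re

# Hypothetical mini-ontology mapping surface forms to entity classes.
# A real system would load terms from a curated ontology (assumption).
ONTOLOGY = {
    "titanium dioxide": "Nanomaterial",
    "tio2": "Nanomaterial",
    "silver nanoparticles": "Nanomaterial",
    "zebrafish": "Organism",
    "daphnia magna": "Organism",
}

# Hypothetical rule: attribute name, up to 20 non-digit filler characters,
# a number, and a unit from a small illustrative list.
ATTRIBUTE_RULE = re.compile(
    r"\b(?P<name>particle size|diameter|concentration|dose)\b\D{0,20}?"
    r"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>nm|mg/L|ug/mL)\b",
    re.IGNORECASE,
)


def find_entities(sentence):
    """Return sorted (position, surface text, class) for ontology terms."""
    lowered = sentence.lower()  # offsets stay aligned for ASCII input
    found = []
    for term, label in ONTOLOGY.items():
        for m in re.finditer(r"\b" + re.escape(term) + r"\b", lowered):
            found.append((m.start(), sentence[m.start():m.end()], label))
    return sorted(found)


def find_attributes(sentence):
    """Return (position, attribute name, numeric value, unit) per rule match."""
    return [
        (m.start(), m.group("name").lower(), float(m.group("value")), m.group("unit"))
        for m in ATTRIBUTE_RULE.finditer(sentence)
    ]


def relate(sentence):
    """Attach each attribute to the nearest nanomaterial in the same sentence."""
    materials = [e for e in find_entities(sentence) if e[2] == "Nanomaterial"]
    rows = []
    for pos, name, value, unit in find_attributes(sentence):
        if not materials:
            continue  # no entity to attach the attribute to
        nearest = min(materials, key=lambda e: abs(e[0] - pos))
        rows.append(
            {"entity": nearest[1], "attribute": name, "value": value, "unit": unit}
        )
    return rows


if __name__ == "__main__":
    text = ("Zebrafish were exposed to TiO2 with a particle size of 21 nm "
            "at a concentration of 10 mg/L.")
    for row in relate(text):
        print(row)
```

Each returned row pairs an entity with one numerical attribute, which mirrors the abstract's idea of storing extracted data in a table according to the relations among them. Linking by character distance is only a placeholder heuristic; the abstract does not say how relations are determined.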

Original language: English (US)
Title of host publication: Proceedings - 2013 IEEE International Conference on Bioinformatics and Biomedicine, IEEE BIBM 2013
Pages: 25-30
Number of pages: 6
State: Published - 2013
Event: 2013 IEEE International Conference on Bioinformatics and Biomedicine, IEEE BIBM 2013 - Shanghai, China
Duration: Dec 18 2013 → Dec 21 2013

All Science Journal Classification (ASJC) codes

  • Biomedical Engineering
