Collaborative Research: Learning Classifiers From Autonomous, Semantically Heterogeneous, Distributed Data

Project: Research project

Project Details

Description

Advances in networks, sensors, storage, computing, and high throughput data acquisition, have led to a proliferation of autonomous, distributed data sources in many areas of human activity. New discoveries in biological, physical, and social sciences and engineering are being driven by our ability to discover, share, integrate and analyze disparate types of data. Statistically-based machine learning algorithms offer some of the most cost-effective approaches to discovery of experimentally testable predictive models and hypotheses from data. However, the large size, distributed nature, and autonomy of the data sources (and the attendant differences in access, queries allowed, processing capabilities, structure, organization, and underlying data models and data semantics) present hurdles to effective utilization of machine learning. This research aims to overcome these hurdles by developing efficient, resource-aware distributed algorithms and software services to support collaborative, integrative knowledge acquisition such a setting. The research team will implement, deploy, and evaluate the resulting algorithms using benchmark data sets, associated data models and ontologies, and user-specified inter-ontology mappings on a distributed test-bed of networked databases and services at Iowa State University and Kansas State University. The resulting open-source software can potentially transform collaborative e-science in the same way that Web has transformed information sharing. Broader impacts of this research include enhanced opportunities for research-based training of graduate and undergraduate students, interdisciplinary collaborations, participation of under-represented groups, and development of increasingly sophisticated software to support collaborative, integrative e-science. The project web site (http://www.cild.iastate.edu/projects/indus.html) provides access to information about the project, benchmark data, publications, software, and documentation.

StatusFinished
Effective start/end date8/15/077/31/13

Funding

  • National Science Foundation: $304,495.00

Fingerprint

Explore the research topics touched on by this project. These labels are generated based on the underlying awards/grants. Together they form a unique fingerprint.