Democratization of Data Analysis in Life Sciences Through Galaxy

  • Nekrutenko, Anton (PI)
  • Schatz, Michael M (CoPI)

Project: Research project

Project Details

Description

Project Summary

For over a decade, the Galaxy Project (https://galaxyproject.org/) has worked to solve key issues plaguing modern data-intensive biology: the ability of researchers to access cutting-edge analysis methods, to share analysis results transparently, and to precisely reproduce complex computational analyses. Galaxy has become one of the largest and most widely used open source platforms for biological data science. Promoting openness and collaboration in all facets of the project, from technical decisions to training and leadership, has enabled us to build a vibrant community of users, developers, system engineers, and educators who continuously contribute new software features, add the latest tools, adapt to the most modern infrastructure, author training materials, and lead research and training workshops.

Genomics research is continuously evolving, and current challenges include the rapid growth in size and complexity of new datasets, the increasing availability of controlled-access datasets with human genomic components, and the continuing expansion in the breadth of research areas capable of generating high-throughput data. The core Galaxy development team submitting this proposal will respond to these challenges by focusing on the following key priorities:

  • Rearchitect Galaxy for scalability and security using software container technologies;
  • Design a new user interface (UI) for working with thousands of tools, workflows, and samples;
  • Enable interactive exploratory data analysis in Galaxy;
  • Facilitate community growth and support;
  • Enable effective training and outreach.

Concentrating on these broad priorities will allow us to achieve the ultimate goal of the Galaxy Project: developing a data analysis medium connecting biomedical experts across the full spectrum of skill sets, scientific domains, and research practices. For biomedical researchers it will provide a powerful analysis platform populated with the latest tools and data. For tool developers it will provide a community-supported mechanism for deploying tools before a wide audience of users. For system administrators and engineers it will provide a framework they will feel comfortable deploying on any infrastructure. For educators it will provide a comprehensive collection of materials covering most data analysis needs and an infrastructure for delivering interactive, hands-on training workshops for audiences of different sizes.

Status: Active
Effective start/end date: 2/19/21 to 1/31/25

Funding

  • National Human Genome Research Institute: $1,678,813.00
  • National Human Genome Research Institute: $1,910,000.00