CIBR: Collaborative Research: Providing sustainable Galaxy service on XSEDE resources

  • Nekrutenko, Anton (PI)
  • Hilser, Vincent V.J. (CoPI)
  • Taylor, James (CoPI)
  • Stubbs, Joseph J. (CoPI)

Project: Research project

Project Details


Due to the proliferation of high-throughput sequencing and imaging techniques, biology has become a de facto data-driven discipline. Yet most biological researchers still cannot take full advantage of these technological advances. The Galaxy Project started in 2005 to create a system that enables biologists without informatics expertise or local compute resources to perform computational analysis through the web. The main site, located at the Texas Advanced Computing Center, allows researchers to analyze data freely and to access training materials on analytic approaches, their assumptions and limitations, and best practices. Galaxy combines the accessibility of a GUI with the power of exploratory analysis provided by interactive environments. This combination supports high-quality research and provides a robust launchpad for training in quantitative and data analysis. The public instance of Galaxy is used by tens of thousands of registered users, performs between 300,000 and 500,000 analyses every month, and has been cited in over 7,000 publications.

This proposal aims to change how Galaxy uses existing infrastructure and procures additional infrastructure, putting the service on a sustainable footing. The plan involves two primary focal areas. The first objective is to adapt the Galaxy architecture to take advantage of tiered storage, which will significantly reduce cost and ensure effective management of storage infrastructure. Within this aim, the team will develop and implement strategies for managing existing data and will leverage an existing policy-based system to control when and which data move between tiers. The second objective is to design and implement functionality for 'premium storage'. The work undertaken within this aim will allow users with existing XSEDE allocations to 'extend' their storage. Other users will be able to acquire additional Galaxy storage on TACC systems by paying for it directly. These changes will allow the Galaxy service to continue to provide a basic level of storage to all users while supporting a flexible set of options to scale beyond that level as needed. For more information about Galaxy visit the website at

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Effective start/end date: 12/15/19 to 11/30/23


  • National Science Foundation: $2,035,468.00

