TY - JOUR
T1 - Galaxy CloudMan
T2 - Delivering cloud compute clusters
AU - Afgan, Enis
AU - Baker, Dannon
AU - Coraor, Nate
AU - Chapman, Brad
AU - Nekrutenko, Anton
AU - Taylor, James
N1 - Funding Information:
Galaxy is developed by the Galaxy Team: Enis Afgan, Guruprasad Ananda, Dannon Baker, Dan Blankenberg, Ramkrishna Chakrabarty, Nate Coraor, Jeremy Goecks, Greg Von Kuster, Ross Lazarus, Kanwei Li, Anton Nekrutenko, James Taylor, and Kelly Vincent. We thank our many collaborators who support and maintain data warehouses and browsers accessible through Galaxy. We would also like to thank the participants of BOSC Codefest 2010, and the Bio-Linux community. This work was supported by NIH grant HG005542 (J.T. and A.N.). Development of the Galaxy framework is also supported by NIH grants HG004909 (A.N. and J.T.), HG005133 (J.T. and A.N.), by NSF grant DBI-0850103 (A.N. and J.T.) and by funds from the Huck Institutes for the Life Sciences and the Institute for CyberScience at Penn State. Additional funding is provided, in part, under a grant with the Pennsylvania Department of Health using Tobacco Settlement Funds. The Department specifically disclaims responsibility for any analyses, interpretations or conclusions. This article has been published as part of BMC Bioinformatics Volume 11 Supplement 12, 2010: Proceedings of the 11th Annual Bioinformatics Open Source Conference (BOSC) 2010. The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2105/11?issue=S12.
PY - 2010/12/21
Y1 - 2010/12/21
N2 - Background: Widespread adoption of high-throughput sequencing has greatly increased the scale and sophistication of computational infrastructure needed to perform genomic research. An alternative to building and maintaining local infrastructure is "cloud computing", which, in principle, offers on demand access to flexible computational infrastructure. However, cloud computing resources are not yet suitable for immediate "as is" use by experimental biologists. Results: We present a cloud resource management system that makes it possible for individual researchers to compose and control an arbitrarily sized compute cluster on Amazon's EC2 cloud infrastructure without any informatics requirements. Within this system, an entire suite of biological tools packaged by the NERC Bio-Linux team (http://nebc.nerc.ac.uk/tools/bio-linux) is available for immediate consumption. The provided solution makes it possible, using only a web browser, to create a completely configured compute cluster ready to perform analysis in less than five minutes. Moreover, we provide an automated method for building custom deployments of cloud resources. This approach promotes reproducibility of results and, if desired, allows individuals and labs to add or customize an otherwise available cloud system to better meet their needs. Conclusions: The expected knowledge and associated effort with deploying a compute cluster in the Amazon EC2 cloud is not trivial. The solution presented in this paper eliminates these barriers, making it possible for researchers to deploy exactly the amount of computing power they need, combined with a wealth of existing analysis software, to handle the ongoing data deluge.
AB - Background: Widespread adoption of high-throughput sequencing has greatly increased the scale and sophistication of computational infrastructure needed to perform genomic research. An alternative to building and maintaining local infrastructure is "cloud computing", which, in principle, offers on demand access to flexible computational infrastructure. However, cloud computing resources are not yet suitable for immediate "as is" use by experimental biologists. Results: We present a cloud resource management system that makes it possible for individual researchers to compose and control an arbitrarily sized compute cluster on Amazon's EC2 cloud infrastructure without any informatics requirements. Within this system, an entire suite of biological tools packaged by the NERC Bio-Linux team (http://nebc.nerc.ac.uk/tools/bio-linux) is available for immediate consumption. The provided solution makes it possible, using only a web browser, to create a completely configured compute cluster ready to perform analysis in less than five minutes. Moreover, we provide an automated method for building custom deployments of cloud resources. This approach promotes reproducibility of results and, if desired, allows individuals and labs to add or customize an otherwise available cloud system to better meet their needs. Conclusions: The expected knowledge and associated effort with deploying a compute cluster in the Amazon EC2 cloud is not trivial. The solution presented in this paper eliminates these barriers, making it possible for researchers to deploy exactly the amount of computing power they need, combined with a wealth of existing analysis software, to handle the ongoing data deluge.
UR - http://www.scopus.com/inward/record.url?scp=78650841579&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78650841579&partnerID=8YFLogxK
U2 - 10.1186/1471-2105-11-S12-S4
DO - 10.1186/1471-2105-11-S12-S4
M3 - Article
C2 - 21210983
AN - SCOPUS:78650841579
SN - 1471-2105
VL - 11
JO - BMC Bioinformatics
JF - BMC Bioinformatics
IS - SUPPL. 12
M1 - S4
ER -