Empowering Agroecosystem Modeling with HTC Scientific Workflows: The Cycles Model Use Case

Rafael Ferreira Da Silva, Rajiv Mayani, Yuning Shi, Armen R. Kemanian, Mats Rynge, Ewa Deelman

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

5 Scopus citations

Abstract

Scientific workflows have enabled large-scale scientific computations and data analysis, and lowered the entry barrier for performing computations on distributed heterogeneous platforms (e.g., HTC and HPC). In spite of impressive achievements to date, large-scale modeling, simulation, and data analytics in the long tail still face several challenges, such as efficient scheduling and execution of large-scale workflows ($\mathrm{O}(10^{6})$) with very short-running tasks (a few seconds). While the current trend to support next-generation workflows on leadership-class machines has gained much attention in the past years, at the other end of the spectrum scientific workflows from long-tail science have become larger and require processing massive volumes of data. In this paper, we report on our experience in designing and implementing an HTC workflow for agroecosystem modeling. We leverage well-known (task clustering and co-scheduling) and emerging (hierarchical workflows and containers) workflow optimization techniques to make the workflow planning problem tractable, and to maximize resource utilization and the degree of task parallelism. Experimental results, via the implementation of a use case, show that by strategically combining the above strategies and defining an appropriate set of optimization parameters, the overall workflow makespan can be improved by 3.5 orders of magnitude when compared to a regular (non-optimized) execution of the workflow.
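As a rough illustration of why task clustering helps workflows made of very short tasks, the sketch below groups tasks into clustered jobs so that a fixed per-job overhead is paid once per cluster rather than once per task. It is a toy model, not the paper's implementation: the function names, the 60-second overhead, the 5-second task runtime, and the greedy slot assignment are all assumptions chosen for illustration.

```python
def cluster_tasks(task_runtimes, cluster_size):
    """Group consecutive tasks into clustered jobs of at most cluster_size tasks."""
    return [task_runtimes[i:i + cluster_size]
            for i in range(0, len(task_runtimes), cluster_size)]


def estimated_makespan(jobs, per_job_overhead, slots):
    """Estimate makespan with a greedy assignment of jobs to the least-loaded slot.

    Each job pays per_job_overhead once (a hypothetical stand-in for queuing
    and scheduling delay) plus the summed runtime of the tasks it contains.
    """
    loads = [0.0] * slots
    for job in jobs:
        idx = loads.index(min(loads))  # least-loaded slot so far
        loads[idx] += per_job_overhead + sum(job)
    return max(loads)


# Hypothetical workload: 1000 tasks of 5 s each, 60 s overhead per job, 10 slots.
tasks = [5.0] * 1000
unclustered = estimated_makespan(cluster_tasks(tasks, 1), 60.0, 10)
clustered = estimated_makespan(cluster_tasks(tasks, 50), 60.0, 10)
print(unclustered, clustered)
```

In this toy setting the overhead dominates the unclustered run, so clustering 50 tasks per job shortens the estimated makespan by roughly a factor of ten. Real gains depend on the platform's actual overheads and on the other optimizations the abstract names.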

Original language: English (US)
Title of host publication: Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019
Editors: Chaitanya Baru, Jun Huan, Latifur Khan, Xiaohua Tony Hu, Ronay Ak, Yuanyuan Tian, Roger Barga, Carlo Zaniolo, Kisung Lee, Yanfang Fanny Ye
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 4545-4552
Number of pages: 8
ISBN (Electronic): 9781728108582
State: Published - Dec 2019
Event: 2019 IEEE International Conference on Big Data, Big Data 2019 - Los Angeles, United States
Duration: Dec 9 2019 to Dec 12 2019

Publication series

Name: Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019

Conference

Conference: 2019 IEEE International Conference on Big Data, Big Data 2019
Country/Territory: United States
City: Los Angeles
Period: 12/9/19 to 12/12/19

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Networks and Communications
  • Information Systems
  • Information Systems and Management
