TY - GEN
T1 - An evolutionary path to object storage access
AU - Goodell, David
AU - Kim, Seong Jo
AU - Latham, Robert
AU - Kandemir, Mahmut
AU - Ross, Robert
PY - 2012
Y1 - 2012
N2 - High-performance computing (HPC) storage systems typically consist of an object storage system that is accessed via the POSIX file interface. However, rapid increases in system scales and storage system complexity have uncovered a number of limitations in this model. In particular, applications and libraries are limited in their ability to partition data into units with independent concurrency control, and mapping complex science data models into the POSIX file model is inconvenient at best. In this paper we propose an alternative interface for use by applications and libraries that provides direct access to underlying storage objects. This model allows applications and libraries to organize storage access around these objects in order to avoid lock contention without needing to create many separate files. Additionally, complex data models are more readily organized into multiple object data streams, simplifying the storage of variable-length data and allowing a choice of degree of parallelism related to access needs. Our approach provides for datasets stored in this new model to coexist with POSIX files, allowing evolution to the new model over time. We apply these concepts in the PVFS, PLFS, and Parallel netCDF packages to prototype the model and describe our experiences.
AB - High-performance computing (HPC) storage systems typically consist of an object storage system that is accessed via the POSIX file interface. However, rapid increases in system scales and storage system complexity have uncovered a number of limitations in this model. In particular, applications and libraries are limited in their ability to partition data into units with independent concurrency control, and mapping complex science data models into the POSIX file model is inconvenient at best. In this paper we propose an alternative interface for use by applications and libraries that provides direct access to underlying storage objects. This model allows applications and libraries to organize storage access around these objects in order to avoid lock contention without needing to create many separate files. Additionally, complex data models are more readily organized into multiple object data streams, simplifying the storage of variable-length data and allowing a choice of degree of parallelism related to access needs. Our approach provides for datasets stored in this new model to coexist with POSIX files, allowing evolution to the new model over time. We apply these concepts in the PVFS, PLFS, and Parallel netCDF packages to prototype the model and describe our experiences.
UR - http://www.scopus.com/inward/record.url?scp=84876542724&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84876542724&partnerID=8YFLogxK
U2 - 10.1109/SC.Companion.2012.17
DO - 10.1109/SC.Companion.2012.17
M3 - Conference contribution
AN - SCOPUS:84876542724
SN - 9780769549569
T3 - Proceedings - 2012 SC Companion: High Performance Computing, Networking Storage and Analysis, SCC 2012
SP - 36
EP - 41
BT - Proceedings - 2012 SC Companion
T2 - 2012 SC Companion: High Performance Computing, Networking Storage and Analysis, SCC 2012
Y2 - 10 November 2012 through 16 November 2012
ER -