Collaborative Research: PPoSS: Planning: Extreme-scale Sparse Data Analytics

Project: Research project

Project Details


The graph data structure is used for storing and manipulating relational data. Tensors are a higher-order generalization of the two-dimensional matrix representation. Both graphs and tensors are used in exploratory and automated data analysis. Application areas include cybersecurity, complex system analysis, and personalized healthcare. Many algorithms are known for typical data analysis tasks in these areas. For instance, dozens of algorithms exist for community identification, the problem of automatically identifying well-connected groups of vertices in a graph. Analogous to the singular value decomposition of matrices, several tensor factorizations exist with diverse use cases. Both graph algorithms and tensor factorizations use computer storage formats inspired by matrix computations. This project focuses on data analysis use cases that result in large-scale graphs and tensors, necessitating parallel and distributed processing. The project's novelty lies in identifying and developing unifying parallel algorithm design principles that span multiple graph computations and tensor factorizations. In the planning stage, several focused research tasks will explore eight unifying themes.
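As a rough illustration of what "storage formats inspired by matrix computations" can mean, the sketch below stores a small undirected graph in compressed sparse row (CSR) form and a small third-order sparse tensor in coordinate (COO) form. The project description does not name specific formats; CSR and COO are chosen here as common examples, and every function and variable name is hypothetical.

```python
# Illustrative sketch only: formats and names are assumptions, not taken from the project.

def edges_to_csr(num_vertices, edges):
    """Build a CSR adjacency structure for an undirected graph."""
    degree = [0] * num_vertices
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Neighbors of vertex i occupy col_idx[row_ptr[i]:row_ptr[i + 1]].
    row_ptr = [0] * (num_vertices + 1)
    for i in range(num_vertices):
        row_ptr[i + 1] = row_ptr[i] + degree[i]
    col_idx = [0] * row_ptr[-1]
    next_slot = row_ptr[:-1].copy()  # next free position in each row
    for u, v in edges:
        col_idx[next_slot[u]] = v
        next_slot[u] += 1
        col_idx[next_slot[v]] = u
        next_slot[v] += 1
    return row_ptr, col_idx


def neighbors(row_ptr, col_idx, v):
    """Return the adjacency list of vertex v as a contiguous slice."""
    return col_idx[row_ptr[v]:row_ptr[v + 1]]


def tensor_lookup(coords, values, index):
    """Linear scan of a COO tensor; entries not stored are implicitly zero."""
    for coord, value in zip(coords, values):
        if coord == index:
            return value
    return 0.0


# A 4-vertex graph stored as two flat arrays, as a sparse matrix would be.
row_ptr, col_idx = edges_to_csr(4, [(0, 1), (0, 2), (1, 2), (2, 3)])

# A third-order sparse tensor: one (i, j, k) tuple and one value per nonzero.
tensor_coords = [(0, 1, 2), (1, 0, 0), (2, 2, 1)]
tensor_values = [1.5, -2.0, 4.0]
```

Both structures keep only the nonzero entries in flat arrays, which is the property that makes them attractive for partitioning across threads or nodes.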

The project aims to develop the foundations for an end-to-end streaming data analytics system with performance comparable to highly tuned static graph analysis benchmarks on current high-end workstations and supercomputers. The investigators' multi-disciplinary expertise spans high-performance computing, theory and algorithms, computer architecture, and programming languages and compilers. The cross-cutting research aims include generalizable principles to orchestrate intra- and inter-node communication, multiple approaches for exploiting hierarchical parallelism, locality-enhancing strategies, and automatic performance tuning. The software artifacts from the planning stage could form the basis for new data analytic benchmarks. The investigators will incorporate research findings into the courses they teach. Engaging experts from the national laboratories and industry in the planning stage will help solidify future large-scale efforts. The investigators will leverage and contribute to existing institutional programs that broaden participation in computing research.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Effective start/end date: 10/1/18 to 9/30/22


  • National Science Foundation: $75,000.00

