Parametric Multi-Way Recursive Divide-and-Conquer Algorithms for Dynamic Programs
General Material Designation
[Thesis]
First Statement of Responsibility
Javanmard, Mohammad Mahdi
Subsequent Statement of Responsibility
Chowdhury, Rezaul
.PUBLICATION, DISTRIBUTION, ETC
Name of Publisher, Distributor, etc.
State University of New York at Stony Brook
Date of Publication, Distribution, etc.
2020
GENERAL NOTES
Text of Note
86 p.
DISSERTATION (THESIS) NOTE
Dissertation or thesis details and type of degree
Ph.D.
Body granting the degree
State University of New York at Stony Brook
Text preceding or following the note
2020
SUMMARY OR ABSTRACT
Text of Note
A typical modern supercomputer is a network of distributed-memory hybrid compute nodes where each node usually has both latency-optimized multicores (a.k.a. fat cores) and throughput-optimized manycores (a.k.a. thin cores) connected through a multilevel memory hierarchy. The same supercomputer often hosts compute nodes of various configurations. Many of them are upgraded to the next state-of-the-art within four or five years of becoming operational. To design and implement efficient algorithms for the heterogeneous and constantly evolving modern supercomputers the need for performance portability across architectures must be taken into account. We show that a class of grid-based parallel multi-way recursive divide-and-conquer algorithms for solving Dynamic Programs (DP) can be executed efficiently with provably optimal or near-optimal performance bounds on fat cores (cache complexity), thin cores (data movements) and purely distributed-memory machines (communication complexity) without any changes in the basic structure of the algorithm. We generate such an algorithm from its standard two-way counterpart by multi-level inlining of the recursion and then making each recursive function call as early as possible without violating the dependency constraints. We present a novel framework-Multi-way AutoGen-that uses polyhedral compiler transformations for systematically deriving a multi-way recursive divide-and-conquer algorithm for solving a DP from an iterative specification of the DP. The framework takes advantage of polyhedral transformations including mono-parametric tiling, and index-set splitting. We show that our parametric multi-way recursive divide-and-conquer algorithms can be implemented to run efficiently on Apache Spark. To perform well on distributed computing systems, such as Apache Spark, running on clusters and computation clouds, an algorithm must have the ability to scale out. Additionally, such clusters have a trade-off between communication cost and parallelism. Therefore, it is crucial to have well-decomposable, adaptive, tunable, and scalable programs, and we show that our parametric multi-way recursive divide-and-conquer algorithms have those properties. We provide experimental results illustrating the performance and scalability of the Apache Spark programs for various DP algorithms.