A Comprehensive Study of Communication in Distributed-Memory Multiprocessors


This dissertation studies the problem of channel contention in distributed-memory multiprocessors. Our main contributions are on adaptive routing. In this dissertation, we pose and answer several fundamental questions related to the optimality and deadlock freedom of routing algorithms. We also address contention through mapping, a complementary software approach.

First, we present a fully adaptive routing algorithm for n-dimensional meshes, which allows a message to use any shortest path from the source to the destination. We prove that this routing algorithm is optimal both in its hardware resource requirements and its flexibility in using these resources. Furthermore, we prove that the only routing algorithms that can satisfy both optimality conditions are symmetric cases of our routing algorithm.

Building on the ideas underlying these proofs of optimality, we present a necessary and sufficient condition for deadlock-free routing. The resulting proof methodology applies to a much richer class of routing algorithms than previous proof techniques. Furthermore, our methodology is applicable to any switching technique. To prove our results, we introduce two new concepts: the channel waiting graph and False Resource Cycles. These two concepts yield a simple and powerful methodology, which we demonstrate by proposing a partially adaptive routing algorithm for wormhole-routed meshes that does not require any additional virtual channels, and fully adaptive routing algorithms for wormhole-routed hypercubes and packet-switched meshes that are more adaptive than any previous routing algorithm for these topologies and switching techniques.

We then present a heuristic for compile-time mapping of clustered task graphs that measures the channel contention in each mapping. This yields two advantages: a more accurate execution time estimate of each mapping and better information for improving the mapping. Simulation results show the advantage of considering contention when mapping.

The mapping heuristic is then integrated with a simulator that models the execution and performance of many routing algorithms, including those that we have proposed. Using task graphs derived from the Cholesky factorization of sparse matrices, we investigate the relationship between mapping and routing. The simulation results demonstrate the advantages of our comprehensive approach to reducing contention.

Return to the Publication list.

Last updated by Loren Schwiebert Email: on Jun-06-2001