Level-Spread: A New Job Allocation Policy for Dragonfly Networks

Publication date

2018

Citation
Y. Zhang, O. Tuncer, F. Kaplan, K. Olcoz, V. J. Leung and A. K. Coskun, "Level-Spread: A New Job Allocation Policy for Dragonfly Networks," 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS), Vancouver, BC, Canada, 2018, pp. 1123-1132, doi: 10.1109/IPDPS.2018.00121.
Abstract
The dragonfly network topology has attracted attention in recent years owing to its high radix and constant diameter. However, the influence of job allocation on communication time in dragonfly networks is not fully understood. Recent studies have shown that random allocation is better at balancing the network traffic, while compact allocation is better at harnessing the locality in dragonfly groups. Based on these observations, this paper introduces a novel allocation policy called Level-Spread for dragonfly networks. This policy spreads jobs within the smallest network level that a given job can fit in at the time of its allocation. In this way, it simultaneously harnesses node adjacency and balances link congestion. To evaluate the performance of Level-Spread, we run packet-level network simulations using a diverse set of application communication patterns, job sizes, and communication intensities. We also explore the impact of network properties such as the number of groups, number of routers per group, machine utilization level, and global link bandwidth. Level-Spread reduces the communication overhead by 16% on average (and up to 71%) compared to the state-of-the-art allocation policies.
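The core idea in the abstract, spreading a job within the smallest network level it fits in, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes three levels (a single router, a single group, the whole machine), a round-robin spreading order, and first-fit selection among candidate routers or groups. The names `level_spread` and `round_robin` and the `(group, router, node)` tuple format are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def round_robin(buckets, count):
    """Pick `count` items by taking one item from each non-empty bucket in turn."""
    queues = [list(b) for b in buckets if b]
    picked = []
    while len(picked) < count and queues:
        for q in queues:
            if len(picked) == count:
                break
            picked.append(q.pop(0))
        # Drop buckets that have been exhausted before the next pass.
        queues = [q for q in queues if q]
    return picked


def level_spread(free_nodes, job_size):
    """Illustrative allocator (assumed level hierarchy: router < group < machine).

    free_nodes: iterable of (group, router, node) tuples that are currently idle.
    Returns the list of chosen nodes, or None if too few nodes are idle.
    """
    by_router = defaultdict(list)
    by_group = defaultdict(lambda: defaultdict(list))
    for g, r, n in sorted(free_nodes):
        by_router[(g, r)].append((g, r, n))
        by_group[g][r].append((g, r, n))

    # Level 1: the job fits on the idle nodes of a single router.
    for nodes in by_router.values():
        if len(nodes) >= job_size:
            return nodes[:job_size]

    # Level 2: the job fits in one group; spread it across that group's routers.
    for routers in by_group.values():
        if sum(len(v) for v in routers.values()) >= job_size:
            return round_robin(routers.values(), job_size)

    # Level 3: the job needs several groups; spread it across all groups.
    per_group = [
        [node for v in routers.values() for node in v]
        for routers in by_group.values()
    ]
    if sum(len(b) for b in per_group) >= job_size:
        return round_robin(per_group, job_size)
    return None


if __name__ == "__main__":
    # Toy machine: 2 groups x 2 routers x 2 nodes, all idle.
    free = [(g, r, n) for g in range(2) for r in range(2) for n in range(2)]
    print(level_spread(free, 2))  # stays on one router
    print(level_spread(free, 3))  # spread over the routers of one group
    print(level_spread(free, 6))  # spread over both groups
```

In this toy setup, small jobs keep their locality (one router or one group), while jobs that must cross groups are distributed evenly so that no single group's global links carry all of the traffic, which is the trade-off between compact and random allocation that the abstract describes.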