Resource Allocation for Coded Distributed Computing
06 March 2017
A data center has an abundance of processing power (e.g., it hosts tens of thousands of servers). To execute a distributed computing job, it is natural to have multiple servers work on different parts of the job in parallel to reduce the computation time. On the other hand, in most cases the intermediate results computed by these servers need to be collected by a server in order to execute a second-stage computation. Thus, spreading a computation job across too many servers may also increase the communication time. It is therefore valuable to find the optimal way of assigning computation resources to a problem, in order to solve it as quickly as possible. We consider a simple distributed computing framework, where the goal is to compute Q arbitrary functions from N files by decomposing the overall computation into a set of "Map" and "Reduce" functions. We study the minimum possible total execution time for completing an arbitrary distributed computing job, when one has access to an arbitrary number of servers. We characterize the exact minimum execution time for all possible values of the problem parameters, and we provide the corresponding optimal computing schemes. In most cases, the proposed optimal scheme exactly achieves the minimum execution time using only a finite number of servers. We prove a matching information-theoretic converse showing that our proposed scheme exactly achieves the minimum execution time. Furthermore, we prove that, among all computing schemes that achieve the minimum execution time, our proposed scheme also uses the exact minimum number of servers.
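To make the framework concrete, the sketch below shows the Map-Reduce decomposition described above: each of the Q output functions is computed by applying a Map function to each of the N files and then combining the intermediate results in a Reduce step. The specific map/reduce functions here (string lengths and character counts) are purely illustrative assumptions; the paper treats the Q functions as arbitrary.

```python
# Minimal sketch of the Map-Reduce decomposition: each of Q output functions
# phi_q(file_1, ..., file_N) is computed as
#   phi_q = reduce_q(map_q(file_1), ..., map_q(file_N)).
# The concrete map/reduce functions below are hypothetical examples.

from typing import Callable, List


def execute_job(files: List[str],
                map_fns: List[Callable[[str], int]],
                reduce_fns: List[Callable[[List[int]], int]]) -> List[int]:
    """Compute Q functions from N files via a Map phase and a Reduce phase."""
    outputs = []
    for map_fn, reduce_fn in zip(map_fns, reduce_fns):
        # Map phase: each file could be processed by a different server in parallel.
        intermediates = [map_fn(f) for f in files]
        # Reduce phase: one server collects the intermediate values and combines them.
        outputs.append(reduce_fn(intermediates))
    return outputs


# Example: Q = 2 functions over N = 3 "files" (here, strings).
files = ["alpha", "bb", "cccc"]
map_fns = [len, lambda f: f.count("c")]         # per-file intermediate values
reduce_fns = [sum, sum]                         # combine the intermediates
print(execute_job(files, map_fns, reduce_fns))  # → [11, 4]
```

In the actual scheme, the communication cost arises because the intermediate values produced in the Map phase must be shuffled to the server running the Reduce phase; this sketch runs everything on one machine and omits that step.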