Performance of Balanced Fairness in Computer Clusters: A Recursive Approach
18 June 2018
Understanding the performance of computer clusters is crucial for proper dimensioning. One of the main challenge is to take into account the complex interactions between computers that process jobs in parallel. In particular, a job can generally not be processed by any computer of the cluster due to various constraints like data locality or limits of parallelization. In this paper, we represent these constraints by some assignment graph between jobs and servers. We present a recursive approach to computing performance metrics like mean response times when the server capacities are shared according to balanced fairness. While the computational cost of these formulas can be exponential in the number of servers in the worst case, we illustrate their practical interest by introducing broad classes of cluster models that can be exactly analyzed in polynomial time. .is extends considerably the class of models for which explicit performance metrics are accessible, including the most recent ones derived in the context of redundant requests, which turn out to be special cases of the formulas presented here.