Lowering Inter-Datacenter Bandwidth Costs via Bulk Data Scheduling
13 May 2012
Today's cloud service providers (CSPs) operate multiple data centers, over which they provide resilient infrastructure, data storage and compute services. The links between data centers have very high capacity and are typically leased by CSPs on a 95th-percentile cost basis. These links carry both client traffic and CSP-specific traffic, such as backup jobs. Past studies have shown a diurnal traffic pattern over such links. However, CSPs pay for peak bandwidth, which implies that they underutilize the capacity they have paid for. We propose a scheduling framework that considers the various classes of jobs encountered over such links, and present GRESE, an algorithm that attempts to minimize the CSP's overall bandwidth costs by leveraging the flexible deadlines of these jobs. We demonstrate that the problem is not a simple extension of EDF, BDTS or other well-known scheduling problems, and show that the GRESE algorithm is effective in curtailing CSP bandwidth costs.
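The cost argument rests on how 95th-percentile billing treats traffic placed in the off-peak valley. The following is a minimal sketch of that billing rule only, not of GRESE. It assumes hourly samples over a single day for brevity (real contracts typically sample at much finer granularity over a full billing period), and the traffic figures, the job size and the names `percentile_95`, `client`, `naive` and `deferred` are all hypothetical.

```python
import math


def percentile_95(samples):
    """Billable rate: sort the samples, ignore the top 5%, take the highest remaining one."""
    ordered = sorted(samples)
    rank = math.ceil(0.95 * len(ordered))  # nearest-rank percentile
    return ordered[rank - 1]


# Hypothetical diurnal client traffic, one sample per hour, in Gbps.
client = [2, 2, 2, 2, 2, 2, 4, 6, 8, 9, 10, 10,
          10, 10, 9, 8, 8, 6, 4, 4, 2, 2, 2, 2]

# A hypothetical bulk job of 12 Gbps-hours with a flexible deadline.
# Naive placement: send it at 6 Gbps during two peak hours.
naive = list(client)
naive[10] += 6
naive[11] += 6

# Deferred placement: spread it at 2 Gbps over six off-peak hours.
deferred = list(client)
for hour in range(6):
    deferred[hour] += 2

print("client only:", percentile_95(client))
print("naive:      ", percentile_95(naive))
print("deferred:   ", percentile_95(deferred))
```

Under these assumed numbers, both placements move the same volume of data, but only the naive one raises the billable rate; the deferred placement fits under the existing 95th-percentile level.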