Why Reading Patterns Matter in Storage Coding & Scheduling Design
01 January 2015
Coding techniques for storage systems are gaining traction in data center (DC) applications, owing to the data survivability they provide and, more recently, to their ability to mitigate traffic congestion. This paper considers stochastic allocation schedules in networks that admit bulk file requests, across three drive blocking models. We consider a block-based code and a stochastic scheduling algorithm that is beneficial in the case of continuous chunk read patterns. In particular, we demonstrate that in systems with continuous chunk read patterns, when drive blocking is either independent or caused by traffic congestion, block-coded storage can reduce average download time by 10% to 66%, given modern system parameters. However, a distinction should be made between systems with continuous chunk read patterns and those with interrupted ones. For interrupted chunk read systems, under the same allocation algorithm that performs well for continuous reads, block-coded storage can perform worse than replication: numerical illustrations show relative losses of over 66%. These illustrations demonstrate that, to harness the full benefits of coded storage and to avoid its pitfalls, careful attention must be paid to continuous vs. interrupted chunk read patterns. Codes other than block codes should be considered, as should joint code-scheduling design.