Scheduling and Data Replication to Improve Tape Jukebox Performance
01 January 1999
An increasing number of database applications require on-line access to massive amounts of data. Because large-scale storage systems implemented entirely on magnetic disk can be impractical or too costly for many applications, tape jukeboxes offer an attractive alternative. Unfortunately, current implementations of tape jukeboxes deliver poor performance for applications that access data randomly. This paper shows how the performance of tape jukeboxes can be improved across a broad parameter space via a new scheduling algorithm and schemes for the placement and replication of hot data. We substantiate this claim with an extensive simulation study that quantifies the improvements obtained over a wide variety of workload characteristics. Our experiments suggest that system throughput increases when replicas of hot data are placed at the tape ends (rather than in the middle or at the beginning). As a result, the proposed replication techniques can be used to fill existing spare capacity in a tape jukebox, thus improving the performance of the jukebox "for free".