
Dependable Initialization of Large-Scale Distributed Software

01 January 2004


In this paper, we present a dependable initialization model that captures the architecture of the system to be initialized, as well as the interdependencies among system components. We show that overall system initialization may sometimes complete more quickly if recovery actions are deferred rather than started as soon as a failure is detected.

This observation leads us to introduce a recovery decision function that dynamically assesses when to take recovery actions. We then describe a dependable initialization algorithm that combines the dependable initialization model and the recovery decision function to achieve fast initialization.
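As a rough illustration only, the two ingredients named above can be sketched in Python. Nothing here is taken from the paper: the component names, the `DEPENDS_ON` mapping, and the functions `initialization_order` and `recover_now` are hypothetical, and the decision rule is a deliberately simplified stand-in that compares two estimated completion times supplied by the caller.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency model: each component maps to the set of
# components that must finish initializing before it can start.
DEPENDS_ON = {
    "database": set(),
    "cache": {"database"},
    "api": {"database", "cache"},
}


def initialization_order(depends_on):
    """Return an order in which every component follows its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


def recover_now(estimate_if_immediate, estimate_if_deferred):
    """Hypothetical decision rule (an assumption, not the paper's function).

    Recover immediately only when doing so is estimated to complete
    overall initialization no later than deferring the recovery.
    Both arguments are estimated total initialization times in seconds.
    """
    return estimate_if_immediate <= estimate_if_deferred


if __name__ == "__main__":
    print(initialization_order(DEPENDS_ON))
    print(recover_now(12.0, 9.5))
```

In this sketch the estimates are passed in directly; how such estimates would be obtained at run time is exactly the part a real recovery decision function has to address.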

Experimental results show that our algorithm incurs lower initialization overhead than a conventional initialization algorithm. To our knowledge, this work is the first to formally study the challenges of initializing a distributed system in the presence of failures.