Computer Systems Availability Evaluation Using a Segregated Failure Model
01 January 2008
This paper presents the Segregated Failures Model (SFM) for evaluating the availability of fault-tolerant computer systems with several recovery procedures. The model is compared with a Markov chain model, and its advantages are explained. The basic model is then extended to the situation in which the coverage factor is unknown and failure escalation rates must be used instead. A simple, practical analytical approach to availability evaluation is provided and illustrated in detail by estimating the availability of two versions of a Reliable Clustered Computing architecture. For these examples, numeric values of the availability indexes are computed, and the contribution of each recovery procedure to total system availability is analyzed.
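As a rough illustration of how per-procedure contributions to availability might be tallied, the sketch below assumes that each failure is handled by exactly one recovery procedure and that total downtime is small, so that unavailability is approximately the sum of (failure rate x fraction of failures x recovery time) over the procedures. This is a generic first-order approximation, not necessarily the formulation used in the paper. The function name, the parameter names, and all numeric values are hypothetical.

```python
def availability_by_procedure(failure_rate_per_hour, procedures):
    """First-order availability estimate (an assumed approximation).

    procedures: list of (fraction_of_failures, recovery_hours) pairs,
    one pair per recovery procedure; the fractions must sum to 1.
    Returns (availability, per-procedure unavailability contributions).
    """
    total_fraction = sum(fraction for fraction, _ in procedures)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("fractions of failures must sum to 1")

    # Each procedure contributes rate * fraction * recovery time
    # to the overall fraction of time the system is down.
    contributions = [
        failure_rate_per_hour * fraction * recovery_hours
        for fraction, recovery_hours in procedures
    ]
    unavailability = sum(contributions)
    return 1.0 - unavailability, contributions


# Hypothetical values: 0.001 failures per hour; 90% of failures are
# cleared by a fast recovery (0.01 h), 10% need a slow one (1 h).
availability, contributions = availability_by_procedure(
    0.001, [(0.9, 0.01), (0.1, 1.0)]
)
print(availability, contributions)
```

With these made-up inputs the slow procedure dominates the downtime even though it handles only a small share of failures, which is the kind of per-procedure comparison the abstract describes.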