Design of Recovery Strategies for a Fault-Tolerant No. 4 Electronic Switching System
01 December 1982
The No. 4 Electronic Switching System (ESS) is a high-capacity toll and tandem digital switching system that is capable of handling 616,000 call attempts per hour and serving up to 100,000 active terminations. The maintenance system in the No. 4 ESS has the responsibility of responding to error conditions reported by hardware-fault-detection circuits, by memory mutilation detectors, or by some other system-integrity monitor. The strategies invoked for rapid isolation of faulty configurable entities or for correcting memory errors are based on the type of error condition, the state of the system at the time of the failure, and the history of previous recovery actions that may have been attempted. The recovery strategies employed are fundamentally two-dimensional. One dimension is concerned with the present and the other with the past. Probably the most powerful and unique property of the No. 4 ESS fault recovery system is the latter dimension, that of considering the past, in a process called error analysis. Much more will be said of this process later in this paper. Although the basic mission of the maintenance recovery system in the No. 4 ESS has remained the same throughout its evolution, the maintenance recovery strategies developed during that evolution have been influenced by many external factors. The availability of new hardware technologies and the development of redesigned and cost-reduced hardware have created the need for making changes in existing strategies.