Patterns for Fault Tolerant Software

01 January 2007

This book presents proven techniques for achieving highly available, fault tolerant software that software developers, software architects and small teams can implement. The techniques are presented as patterns, both as a resource for teaching developers and students about fault tolerance principles and as a reference for experts selecting the technique appropriate for a given system. Within the phases of fault tolerance (fault detection, error containment, error recovery and fault treatment), the patterns are organized to lead from high-level abstractions to concrete mechanisms. The collection of techniques is programming language independent and is presented in a way that supports combining them to design fault tolerant software. This allows designers to build the fault tolerant pattern language needed to solve their own design problems.
Readers are guided from concepts and terminology, through common principles and methods, to advanced techniques and practices in the development of software systems. References provide access points to the key literature, including descriptions of exemplar applications of each technique.
This book gives the system designer a toolbox of software fault tolerance techniques, described in enough detail to tailor them to particular system specifications. The architect of fault tolerant systems will find an organized collection of software techniques in which a specific technique is easy to look up, with enough detail to make appropriate choices for the system being designed. In addition, the relations among the fault tolerance techniques, captured in the pattern language structure, allow the architect to assemble a tool chest of techniques appropriate for any design project.
One topic that this book does not cover (except to point to reference works) is reliability modeling and analysis, which is widely covered in many other books. This book's focus is on the techniques that can lead to reliable software systems.