Hierarchical Vector Clock: Scalable Plausible Clock for Detecting Causality in Large Distributed Systems

01 January 1999

New Image

Modern ATM networks posses multilevel hierarchical structure [AF96], where certain sets of physical or logical nodes are grouped together according to physical, geographical, and/or administrative consideration. A distributed application running on top of such network may contain thousands of sites belonging to different logical nodes with the transport channels between them extending through multiple domain boundaries and crossing multiple hierarchical levels. While knowledge of causality between the events in such system is essential for analyzing the system behavior and ensuring the correct operation through solving various problems related to mutual exclusion, consistency maintenance, fault tolerance and recovery, etc., the accurate tracking of the causality information may turn infeasible due to the system's size and lifespan. For example, the communication and storage overhead involved in using the conventional vector clock, which allows to completely characterize causality, is linear in the number of sites of a distributed system. In this paper we propose a new logical time system, called Hierarchical Vector Clock or HVC, which allows to detect causality between events in a large distributed system with high degree of accuracy, while being ideally suited for a hierarchical structure of the underlying network and able to scale gracefully with the increasing number of distributed sites.