Distributed Set-Expression Cardinality Estimation
01 January 2004
We consider the problem of estimating set-expression cardinality in a distributed streaming environment where rapid update streams originating at tens or hundreds of remote sites are continually transmitted to a central processing system. We propose the first algorithmic solutions for answering set-expression cardinality queries with guaranteed accuracy at the central processor while keeping data communication costs between the remote sites and the central site to a minimum. At the core of our solutions are two novel techniques for lowering the data messaging overhead without sacrificing answer precision.