Incremental Quantile Estimation for Massive Tracking
01 January 2000
Fraud detection, customer retention, network monitoring and planning, and industrial process control all require tracking the behavior of hundreds, thousands or even millions of entities in real or nearly-real time. In such applications, the quantities of interest are often highly skewed, so that tracking the average is not sufficient to detect "interesting" behavior. Instead, interest centers on tracking extreme behavior. For example, the goal may be track the .99 quantile of utilization for each of many network elements or the .99 quantile of call duration for each of millions of customers. In applications where a real-time response is desired, such as network performance management, the quantiles need to be incrementally updated as new data become available. This paper proposes an incremental exponentially weighted stochastic approximation (EWSA) estimate that is straightforward to compute from stream data which becomes available in a batch of one or more records at a time. Simulations show that the EWSA outperforms other kinds of incremental estimates that also require minimal main memory, especially when extreme quantiles are tracked for patterns of behavior that change over time. Use of the EWSA is illustrated in an application to tracking call duration for a set of callers over a three month period.