On the Derivation and Application of AIC as a Data-Based Criterion for Histograms
10 April 1987
In this paper a criterion, based on a biased estimate of maximum information (negative entropy), for choosing the number of bins of a histogram is derived. It is shown that extracting more information (negative Kullback-Leibler information distance) from a random sample can be achieved by choosing smaller bin widths, but this comes at the cost of increased estimation error. The criterion derived is a formulation of the trade-off between Kullback-Leibler information distance and estimation error; it does not require knowledge of the underlying density. Consistency and asymptotic optimality of the criterion are discussed, and its relationship to penalized likelihood methods is shown. A formula relating the optimal number of bins for a sample and a sub-sample obtained from it is derived. A number of numerical examples are presented.
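To make the flavor of such a criterion concrete, the sketch below scores an equal-width histogram by its maximized log-likelihood minus the number of free bin probabilities (m - 1), an AIC-style penalty, and picks the bin count maximizing that score. The function names (aic_histogram, optimal_bins), the equal-width binning, and the search range are illustrative assumptions, not the paper's exact formulation.

    import numpy as np

    def aic_histogram(data, m):
        """AIC-style score for an m-bin equal-width histogram of `data`:
        log-likelihood of the histogram density estimate minus the number
        of free parameters (m - 1 independent bin probabilities)."""
        data = np.asarray(data)
        counts, edges = np.histogram(data, bins=m)
        widths = np.diff(edges)
        n = data.size
        nz = counts > 0  # empty bins contribute nothing to the likelihood
        # Each of the n_j points in bin j contributes log(n_j / (n * w_j)),
        # the log of the estimated density on that bin.
        loglik = np.sum(counts[nz] * np.log(counts[nz] / (n * widths[nz])))
        return loglik - (m - 1)

    def optimal_bins(data, m_max=None):
        """Return the bin count in 1..m_max maximizing the AIC-style score."""
        data = np.asarray(data)
        if m_max is None:
            m_max = max(2, 3 * int(np.sqrt(data.size)))
        scores = {m: aic_histogram(data, m) for m in range(1, m_max + 1)}
        return max(scores, key=scores.get)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        sample = rng.normal(size=500)
        print("AIC-optimal number of bins:", optimal_bins(sample))

Increasing m raises the attainable log-likelihood (more extracted information) while the penalty grows linearly, so the maximizer balances the two effects, mirroring the trade-off described in the abstract.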