Block Models and Community Detection

01 January 2009

New Image

A great deal of attention has recently been paid in the computer, social and physical sciences, to determining subclasses on the basis of observed relationships (edges) between individuals (vertices) of an undirected unlabeled graph. From a characterization of the class of all ergodic models on infinite such graphs, due to Aldous, we conclude that so called block models are arbitrarily good approximation to the general model. In the context of block models, we show that as $nrightarrowinfty$, the well known Newman-Girvan index does not necessarily consistently estimate the class membership of each individual while an index we propose, the profile likelihood, does. We study these and other indexes both theoretically and by simulation and application examples and show that the theory is a good guide for dense graphs.