On classification with partial statistics and universal data compression

01 January 1988


Two-class classification with partial statistics is studied for finite-alphabet sources. Efficient universal discriminant functions are described and shown to be related to universal data compression. It is demonstrated that if one of the two classes' probability measures is unknown, it is still possible to define a universal discriminant function that performs as well as the optimal (likelihood-ratio) discriminant function, which can be evaluated only when the probability measures of both classes are available. If neither probability measure is available but training vectors from at least one of the two classes are, it is demonstrated that no discriminant function can perform efficiently unless the length of the training vectors grows at least linearly with the length of the classified vector.
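The link between classification and universal compression can be illustrated with a minimal sketch: a good universal compressor assigns short codewords to sequences that are typical of the statistics it has seen, so the number of extra bits needed to compress a test vector after a training sequence acts as a proxy for the cross-entropy of the test vector under that class. The sketch below uses zlib as a stand-in universal compressor; the function names, the choice of zlib, and the toy sources are illustrative assumptions, not the paper's construction.

```python
import zlib


def compressed_len(data: bytes) -> int:
    """Length of the zlib-compressed data, a rough proxy for its entropy."""
    return len(zlib.compress(data, 9))


def discriminant(test: bytes, train_a: bytes, train_b: bytes) -> int:
    """Compression-based stand-in for a likelihood-ratio test.

    The extra bits needed to compress `test` when appended to a
    training sequence approximate the cross-entropy of `test` under
    that class's statistics; classify by the smaller increment.
    """
    cost_a = compressed_len(train_a + test) - compressed_len(train_a)
    cost_b = compressed_len(train_b + test) - compressed_len(train_b)
    return 0 if cost_a <= cost_b else 1


# Toy finite-alphabet sources with clearly different statistics
# (illustrative only; real sources would be stochastic).
train_a = b"ab" * 200  # class 0: alternating a/b
train_b = b"cd" * 200  # class 1: alternating c/d
label = discriminant(b"ab" * 50, train_a, train_b)
```

Note the role of the training length here: with very short training sequences the compressed-length increments are dominated by modeling overhead, which is consistent with the abstract's conclusion that training vectors must grow at least linearly with the classified vector for any discriminant function to perform efficiently.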