The Design and Analysis of Pattern Recognition Experiments

01 March 1962

The Bell System Technical Journal, March 1962

There are two distinct and consecutive processes usually involved in the feasibility study of a pattern recognition method or machine. The first process is the actual design of the machine. This might be based upon a set of sample patterns which the experimenter has gathered, from which he estimates the parameters of the machine. Alternatively, the experimenter may base his design on some a priori knowledge concerning the pertinent characteristics of the pattern classes under study. The second process is then the testing of this machine, either in its hardware form or by its simulation on a general purpose computer. A different set of sample patterns from that used in the design is used in this stage.

The popular procedure for interpreting the test results is to take the proportion of patterns in the test data which have been misrecognized or rejected by the machine as the estimates of the error probability and rejection probability, respectively, for the machine. There are several questions which might be raised concerning this testing procedure, such as:

1. Are these estimates the best estimates?
2. If so, how good are these estimates?
3. How does the estimate improve as the sample size is increased?

Questions such as these are discussed in Part I of this paper. Two cases are considered: one is the case in which the a priori probabilities of class occurrence are unknown, and the other case assumes full knowledge of the a priori probabilities.
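As a modern illustration of the testing procedure just described, the proportion estimates can be sketched in a few lines of Python. This is a minimal sketch and not the paper's own analysis: the outcome labels ("correct", "error", "reject"), the function names, and the sample counts are hypothetical, and the standard error shown is the usual binomial one, which assumes that the test patterns are independent draws and that the machine is held fixed during testing.

```python
import math
from collections import Counter


def estimate_rates(outcomes):
    """Return (error_rate, rejection_rate) as sample proportions.

    `outcomes` holds one label per test pattern: "correct", "error",
    or "reject" (hypothetical labels chosen for this sketch).
    """
    n = len(outcomes)
    if n == 0:
        raise ValueError("need at least one test pattern")
    counts = Counter(outcomes)
    return counts["error"] / n, counts["reject"] / n


def standard_error(p_hat, n):
    """Binomial standard error of a proportion p_hat estimated from n trials."""
    return math.sqrt(p_hat * (1.0 - p_hat) / n)


# Made-up test run: 100 patterns, 6 misrecognized, 4 rejected.
outcomes = ["correct"] * 90 + ["error"] * 6 + ["reject"] * 4
error_rate, rejection_rate = estimate_rates(outcomes)

# Spread of the error-rate estimate at the original and at four times the sample size.
se_100 = standard_error(error_rate, 100)
se_400 = standard_error(error_rate, 400)
```

Under these assumptions the spread of the estimate shrinks in proportion to the inverse square root of the sample size, so quadrupling the number of test patterns halves the standard error. This gives one simple reading of the third question above.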