DISCR <data>,L                                        / M.Korhonen

Discriminant analysis:

In discriminant analysis the observations (cases) are divided into groups
according to the values of a grouping variable. The grouping variable may
be on a nominal scale, or it may have comparatively few distinct values.
The purpose of the analysis is to find classification functions that best
characterize the differences between the groups. These functions, which
are linear combinations of the original variables, are also used for
classifying new cases.

Discriminant analysis usually has the following two phases:
(1) First, the classification functions and the tests associated with
    them are computed.
(2) Second, the cases of the original data or of another data set are
    classified according to these functions.

The analysis is successful if few cases of the original data are
classified into wrong groups. However, the results tend to be optimistic
when a classification function is used to classify the same cases that
were used to compute it. This bias may be reduced by using cross
validation in the classification, or by classifying another data set with
known groups. The classification may be based on the classification
functions obtained from the discriminant analysis or on the original
observations.

The general form of the DISCR operation is the following:

DISCR <data>,L
<the definition of the variables in the model>
<options for the printout and methods used>

The variables used for forming the classification (discriminant)
functions may be defined either by the VARIABLES specification or
indicated by masks X or A. Correspondingly, the grouping variable may be
defined by the GROUPING specification or by mask G.

The grouping structure of the grouping variable is given in the same way
as in the ANOVA and MEANS operations. If the structure is not given, the
program examines the values of the grouping variable in the data file and
uses all distinct values found (which means one extra pass through the
data).
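The two phases and the cross validation idea described above can be sketched outside Survo. The following Python fragment is only an illustration under stated assumptions, not the DISCR implementation: it handles two variables, assumes equal prior probabilities and a pooled within-groups covariance matrix, and all function names and data values are made up for the example.

```python
# Sketch: linear classification functions from a pooled covariance matrix,
# checked by resubstitution and by leave-one-out cross validation.
# Assumptions: two variables, equal priors, made-up data.

def group_means(cases):
    """Return {group: (mean_x, mean_y)} for cases given as (group, x, y)."""
    sums = {}
    for g, x, y in cases:
        n, sx, sy = sums.get(g, (0, 0.0, 0.0))
        sums[g] = (n + 1, sx + x, sy + y)
    return {g: (sx / n, sy / n) for g, (n, sx, sy) in sums.items()}

def pooled_cov(cases, means):
    """Pooled within-groups covariance matrix as (sxx, sxy, syy)."""
    sxx = sxy = syy = 0.0
    for g, x, y in cases:
        dx, dy = x - means[g][0], y - means[g][1]
        sxx += dx * dx
        sxy += dx * dy
        syy += dy * dy
    df = len(cases) - len(means)  # degrees of freedom: cases minus groups
    return sxx / df, sxy / df, syy / df

def fit(cases):
    """Phase 1: return {group: (c0, c1, c2)}, score = c0 + c1*x + c2*y."""
    means = group_means(cases)
    sxx, sxy, syy = pooled_cov(cases, means)
    det = sxx * syy - sxy * sxy
    ixx, ixy, iyy = syy / det, -sxy / det, sxx / det  # inverse of 2x2 matrix
    funcs = {}
    for g, (mx, my) in means.items():
        c1 = ixx * mx + ixy * my
        c2 = ixy * mx + iyy * my
        c0 = -0.5 * (c1 * mx + c2 * my)
        funcs[g] = (c0, c1, c2)
    return funcs

def classify(funcs, x, y):
    """Phase 2: assign the case to the group with the highest score."""
    return max(funcs, key=lambda g: funcs[g][0] + funcs[g][1] * x + funcs[g][2] * y)

def resubstitution_errors(cases):
    """Classify the same cases that were used to compute the functions."""
    funcs = fit(cases)
    return sum(1 for g, x, y in cases if classify(funcs, x, y) != g)

def leave_one_out_errors(cases):
    """Classify each case with functions computed from the other cases."""
    errors = 0
    for i, (g, x, y) in enumerate(cases):
        funcs = fit(cases[:i] + cases[i + 1:])
        if classify(funcs, x, y) != g:
            errors += 1
    return errors

# Made-up, well-separated example data: (group, x, y)
CASES = [
    (1, 1.0, 1.2), (1, 1.5, 0.8), (1, 0.8, 1.1), (1, 1.2, 1.6),
    (2, 5.0, 5.1), (2, 5.4, 4.7), (2, 4.8, 5.3), (2, 5.2, 5.6),
]

print("resubstitution errors:", resubstitution_errors(CASES))
print("leave-one-out errors: ", leave_one_out_errors(CASES))
```

With groups as clearly separated as in this made-up data, both error counts are zero. When the groups overlap, the leave-one-out count is the less optimistic of the two estimates, because no case takes part in computing the functions that classify it.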
Example:

DISCR FISHER,END+2
VARIABLES=sepallen,sepalw,petallen,petalw
GROUPING=iristype
iristype=1(setosa),2(versicol),3(virginic)
RESULTS=CROSS

The option CROSS in the RESULTS specification causes the within groups
and between groups crossproducts matrices to be printed. Alternatively,
covariances (COVA) or correlations (CORR) may be printed.

Further information:
  1 = Definitions for grouping variables
  2 = Classification of the cases
  D = More on data analysis
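As an illustration of the quantities that RESULTS=CROSS refers to, the within groups and between groups crossproducts matrices can be computed as in the following Python sketch. It assumes the usual definitions (deviations from group means for the within matrix, group size times deviations of group means from the grand mean for the between matrix). The function name and the data are hypothetical, and the layout of the actual Survo printout may differ.

```python
# Sketch: within-groups (W) and between-groups (B) crossproducts matrices.
# Assumption: standard definitions; function name and data are made up.

def crossproducts(cases):
    """cases: list of (group, [x1, ..., xp]). Return (W, B) as nested lists."""
    p = len(cases[0][1])
    n = len(cases)
    # Grand mean of each variable over all cases
    grand = [sum(x[j] for _, x in cases) / n for j in range(p)]

    # Collect the observation vectors of each group
    groups = {}
    for g, x in cases:
        groups.setdefault(g, []).append(x)

    W = [[0.0] * p for _ in range(p)]
    B = [[0.0] * p for _ in range(p)]
    for rows in groups.values():
        m = [sum(r[j] for r in rows) / len(rows) for j in range(p)]
        for j in range(p):
            for k in range(p):
                # Within: deviations of cases from their own group mean
                W[j][k] += sum((r[j] - m[j]) * (r[k] - m[k]) for r in rows)
                # Between: deviations of group means from the grand mean
                B[j][k] += len(rows) * (m[j] - grand[j]) * (m[k] - grand[k])
    return W, B

# Made-up data: two groups, two variables
DATA = [
    ("a", [1.0, 2.0]), ("a", [3.0, 4.0]),
    ("b", [5.0, 8.0]), ("b", [7.0, 6.0]),
]

W, B = crossproducts(DATA)
print("W =", W)
print("B =", B)
```

The sum W + B equals the total crossproducts matrix computed around the grand mean, which gives a simple check on the computation.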