DISTV <data>,<matrix_file>
computes a distance or (dis)similarity matrix between the active
variables (!), using the active observations.
Another Survo module, DIST, computes a distance matrix between the
active observations.
The results are saved in <matrix_file> with default extension .MAT .
<matrix_file> can be used as input to the /CSCAL and LSCAL operations,
for example. In that case the matrix must consist of dissimilarities.
The (dis)similarity measure is selected by a MEASURE specification
with the following alternatives (see T.C.Cox & M.A.A.Cox:
Multidimensional Scaling, Chapman & Hall, p.10):
   EUCLIDEAN
   CITY_BLOCK
   MINKOWSKI(k)   (k>0)
   CORRELATION    (1 - correlation)
   BINARY         (various measures for binary variables; see next page)
The first three letters are sufficient, as in MEASURE=MIN(2), which is
the same as MEASURE=EUC . Also MEASURE=MIN(1) is the same as MEASURE=CITY .
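
The relations between the measures above can be illustrated outside
Survo with a small Python sketch. It is not the DISTV implementation:
the function names are hypothetical, and the Minkowski distance is
assumed to include the 1/k root, which is what makes MIN(2) agree with
the usual Euclidean distance. Note that the distances are computed
between columns (variables), not between rows (observations).

```python
import math

def column(data, j):
    """Return variable j (a column) of a row-wise data table."""
    return [row[j] for row in data]

def minkowski(x, y, k):
    """Minkowski distance of order k > 0 (assumed to include the 1/k root)."""
    return sum(abs(a - b) ** k for a, b in zip(x, y)) ** (1.0 / k)

def correlation_dissimilarity(x, y):
    """1 - Pearson correlation of two variables."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 1.0 - sxy / math.sqrt(sxx * syy)

def distance_matrix(data, measure):
    """Apply `measure` to every pair of variables (columns) of `data`."""
    m = len(data[0])
    cols = [column(data, j) for j in range(m)]
    return [[measure(cols[i], cols[j]) for j in range(m)] for i in range(m)]

# Four observations (rows) on three variables (columns); made-up numbers.
data = [[1.0, 2.0, 4.0],
        [2.0, 4.0, 3.0],
        [3.0, 6.0, 1.0],
        [4.0, 8.0, 2.0]]

euclidean = distance_matrix(data, lambda x, y: minkowski(x, y, 2))   # like MIN(2)
city_block = distance_matrix(data, lambda x, y: minkowski(x, y, 1))  # like MIN(1)
correlation = distance_matrix(data, correlation_dissimilarity)
```

The result is a square matrix with one row and column per variable and
zeros on the diagonal.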
The variables can be standardized by SCALING=YES .
Observations are weighted by activating a weight variable with `W'.
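
The effect of standardizing the variables can be seen in the following
Python sketch. It rests on assumptions that this text does not
confirm: that standardization means subtracting the mean and dividing
by the sample standard deviation (divisor n-1), and the function names
are hypothetical. Weighting of observations is not shown.

```python
import math

def standardize(values):
    """Center to mean 0 and scale to standard deviation 1 (divisor n-1, assumed)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def euclidean(x, y):
    """Euclidean distance between two variables over the same observations."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Two variables that differ only in their unit of measurement.
x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]

raw = euclidean(x, y)                              # dominated by the scale of y
scaled = euclidean(standardize(x), standardize(y)) # scale difference removed
```

Without standardization the distance mostly reflects the difference in
units; after standardization the two variables coincide.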
If MEASURE=BINARY, various user-defined (dis)similarity measures
for binary variables can be used.
By default each active variable is converted to a binary one by mapping
values X<=0 to 0 and values X>0 to 1.
This convention is overridden by the specification BINARY=C .
Then values X<=C are mapped to 0 and values X>C to 1.
An optional parameter R in BINARY=C,R exchanges the values 0 and 1.
Both of the above conventions can be overridden for any individual
variable, say Z, by entering a specification Z=C or Z=C,R with
the same interpretation as in the BINARY specification.
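
The conversion rules above can be summarized as a short Python sketch.
It is only an illustration of the rules as described here, not Survo
code: the function names, the dictionary layout and the boolean
`reverse` flag (standing for the optional R parameter) are assumptions.

```python
def to_binary(values, c=0.0, reverse=False):
    """Map X<=c to 0 and X>c to 1; exchange 0 and 1 when `reverse` is set."""
    result = []
    for x in values:
        bit = 1 if x > c else 0
        if reverse:
            bit = 1 - bit
        result.append(bit)
    return result

def binarize_variables(columns, default=(0.0, False), overrides=None):
    """Binarize each named variable.

    `default` plays the role of BINARY=C or BINARY=C,R;
    `overrides` plays the role of per-variable rules such as Z=C,R.
    """
    overrides = overrides or {}
    binary = {}
    for name, values in columns.items():
        c, reverse = overrides.get(name, default)
        binary[name] = to_binary(values, c, reverse)
    return binary

# Variable X uses the default threshold 0; Z uses threshold 5, reversed.
columns = {"X": [-1, 0, 2], "Z": [1, 5, 9]}
binary = binarize_variables(columns, overrides={"Z": (5, True)})
```

A value exactly equal to the threshold is mapped to 0 (or to 1 when the
values are exchanged), since the rule is X<=C.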
The actual (dis)similarity coefficient for binary variables is entered
as a specification COEFF=<function of a,b,c,d> where a,b,c,d are the
frequencies in a 2x2 table
       X/Y   1   0
        1    a   b
        0    c   d
for each pair X,Y of variables.
For example, COEFF=1-(a+d)/(a+b+c+d) gives a dissimilarity measure
which is the complement of the simple matching coefficient. This is
the default.
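
How the frequencies a,b,c,d and a coefficient built from them relate to
two binary variables can be sketched in Python as follows. The function
names are hypothetical. The Jaccard-type dissimilarity is not mentioned
in this text; it is included only as one more well-known function of
a,b,c,d.

```python
def two_by_two(x, y):
    """Count the cells of the 2x2 table of two binary variables X and Y."""
    a = sum(1 for u, v in zip(x, y) if u == 1 and v == 1)  # X=1, Y=1
    b = sum(1 for u, v in zip(x, y) if u == 1 and v == 0)  # X=1, Y=0
    c = sum(1 for u, v in zip(x, y) if u == 0 and v == 1)  # X=0, Y=1
    d = sum(1 for u, v in zip(x, y) if u == 0 and v == 0)  # X=0, Y=0
    return a, b, c, d

def simple_matching_dissimilarity(a, b, c, d):
    """1-(a+d)/(a+b+c+d): the share of observations where X and Y disagree."""
    return 1 - (a + d) / (a + b + c + d)

def jaccard_dissimilarity(a, b, c, d):
    """1-a/(a+b+c): ignores joint absences; undefined when a+b+c is 0."""
    return 1 - a / (a + b + c)

# Two binary variables over five observations; made-up values.
x = [1, 1, 0, 0, 1]
y = [1, 0, 0, 1, 1]
a, b, c, d = two_by_two(x, y)
dissimilarity = simple_matching_dissimilarity(a, b, c, d)
```

Here X and Y disagree in two of the five observations (b=1, c=1), so the
simple matching dissimilarity is 2/5.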
1 = More information on additional multivariate operations
M = More information on multivariate analysis