SURVO MM Help System (web edition)

DIST <data>,<matrix_file> 
computes a distance or dissimilarity matrix of active
observations from active variables.
There is another Survo module DISTV for a distance matrix of active
variables.
A special form of DIST computes the distance of each observation to
the closest of given center observations (see DISTC?).

The results are saved in <matrix_file> with default extension .MAT .

If a string variable is activated by `L', its first 8 characters
are used as row and column labels in <matrix_file>.
Otherwise, if the first active variable is a string, it will serve
as a label variable. Otherwise labels will be integers 1,2,...

<matrix_file> can be used as input in /CSCAL and LSCAL operations,
for example.

The dissimilarity measure is selected by a MEASURE specification
with the following alternatives (see T.F.Cox & M.A.A.Cox: Multidimensional
Scaling, Chapman & Hall, p.10):

EUCLIDEAN, MAHALANOBIS, CITY_BLOCK,
MINKOWSKI(k)   (k>0)
CANBERRA, BRAY_CURTIS, BHATTACHARYYA,
ANGULAR        (Angular separation)
CORRELATION    (1 - correlation)
BINARY         (various measures for binary variables; see below)

The first three letters are sufficient. For example, MEASURE=MIN(2) is the
same as MEASURE=EUC , and MEASURE=MIN(1) is the same as MEASURE=CITY .
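
These equivalences follow from the Minkowski definition
d(x,y) = (sum |x_i - y_i|^k)^(1/k). The Python sketch below only
illustrates that formula; it is not Survo code, and the function name
minkowski is a hypothetical helper chosen for this example.

```python
def minkowski(x, y, k):
    """Minkowski distance of order k (k > 0) between two equal-length sequences."""
    if k <= 0:
        raise ValueError("k must be positive")
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    # Sum the k-th powers of the absolute differences, then take the k-th root.
    return sum(abs(a - b) ** k for a, b in zip(x, y)) ** (1.0 / k)


x = [1.0, 2.0, 3.0]
y = [4.0, 6.0, 3.0]

euclid = minkowski(x, y, 2)  # k=2: square root of the sum of squared differences
city = minkowski(x, y, 1)    # k=1: sum of absolute differences (city block)
```

With k=2 the result is the Euclidean distance and with k=1 the city-block
distance, which is why MIN(2) and MIN(1) coincide with EUC and CITY .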

The variables can be standardized by SCALING=YES .
The variables can also be weighted by WEIGHTS=<vector_of_weights_as_matrix_file> .
The order of weights must be the same as the order of active
variables in <data>.

With MEASURE=BINARY, various user-defined (dis)similarity measures
for binary data can be used.
By default each active variable is converted to a binary one by mapping
values X<=0 to 0 and values X>0 to 1.
This convention is overridden by giving a specification BINARY=C .
Then values X<=C are mapped to 0 and values X>C to 1.
An optional parameter R in BINARY=C,R exchanges the values 0 and 1.
Both of the above conventions can be overridden individually for any
variable, say Z, by entering a specification Z=C or Z=C,R with
the same interpretation as in the BINARY specification.
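
The conversion rules above can be sketched in Python as follows. This is
an illustration under assumptions, not Survo's implementation: the names
binarize and binarize_row, the dictionary layout and the boolean reverse
flag (standing in for the R parameter) are all hypothetical.

```python
def binarize(x, c=0.0, reverse=False):
    """Map x <= c to 0 and x > c to 1; reverse=True exchanges 0 and 1."""
    bit = 1 if x > c else 0
    return 1 - bit if reverse else bit


def binarize_row(row, default=(0.0, False), overrides=None):
    """Binarize one observation given as {variable_name: value}.

    default   -- (C, reverse) pair playing the role of BINARY=C or BINARY=C,R
    overrides -- {variable_name: (C, reverse)} playing the role of Z=C or Z=C,R
    """
    overrides = overrides or {}
    result = {}
    for name, value in row.items():
        # A per-variable rule takes precedence over the common rule.
        c, reverse = overrides.get(name, default)
        result[name] = binarize(value, c, reverse)
    return result


row = {"X": -1.5, "Y": 3.0, "Z": 3.0}
# X and Y use the default threshold 0; Z uses threshold 5 with 0 and 1 exchanged.
bits = binarize_row(row, default=(0.0, False), overrides={"Z": (5.0, True)})
```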

The actual (dis)similarity coefficient for binary data is entered
as a specification COEFF=<function of a,b,c,d> where a,b,c,d are the
frequencies in a 2x2 table
       1      0
   1   a      b
   0   c      d
for each pair of observations.
For example, COEFF=1-(a+d)/(a+b+c+d)  gives a dissimilarity measure
which is the complement of a simple matching coefficient (default).
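
As an illustration, the frequencies a,b,c,d and the coefficient
1-(a+d)/(a+b+c+d) can be computed for one pair of binary observations as in
the Python sketch below. It assumes that the rows of the 2x2 table refer to
the first observation and the columns to the second; the function names
are hypothetical and this is not Survo code.

```python
def table_2x2(u, v):
    """Return the frequencies (a, b, c, d) for two equal-length 0/1 sequences.

    a: both 1;  b: u is 1, v is 0;  c: u is 0, v is 1;  d: both 0.
    """
    if len(u) != len(v):
        raise ValueError("u and v must have the same length")
    a = b = c = d = 0
    for ui, vi in zip(u, v):
        if ui == 1 and vi == 1:
            a += 1
        elif ui == 1 and vi == 0:
            b += 1
        elif ui == 0 and vi == 1:
            c += 1
        else:
            d += 1
    return a, b, c, d


def simple_matching_dissimilarity(u, v):
    """Complement of the simple matching coefficient: 1-(a+d)/(a+b+c+d)."""
    a, b, c, d = table_2x2(u, v)
    return 1 - (a + d) / (a + b + c + d)


u = [1, 1, 0, 0, 1]
v = [1, 0, 0, 1, 1]
counts = table_2x2(u, v)                    # a=2, b=1, c=1, d=1
diss = simple_matching_dissimilarity(u, v)  # 1 - 3/5
```

Because this coefficient treats b and c symmetrically, the choice of which
observation indexes the rows does not affect the result here; it would
matter only for coefficients that use b and c differently.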

 1 = More information on additional multivariate operations 
 M = More information on multivariate analysis 


More information on Survo from www.survo.fi
Copyright © Survo Systems 2001-2012.