FILE AGGR <data> BY <aggr_variable> TO <new_data_file>
with a VARIABLES list forms a new data file by aggregating consecutive
observations (those with the same value in <aggr_variable>) according to
different rules (functions).

The VARIABLES list is given below the FILE AGGR operation line in the form:

VARIABLES:
A1  Function1  X1  Condition1
A2  Function2  X2  Condition2
..  .........  ..  ..........
END

<data> must be sorted by <aggr_variable> before using FILE AGGR.

In the VARIABLES list, A1,A2,... are the names of the aggregated variables.
The type of a variable can also be given, as in Sum:8 or Name:S16.
The possible functions are listed below.
X1,X2,... are the names of the variables in <data> to be aggregated.
Conditions are given in the form a1*a2+...+b1*b2*... (as in SELECT).
Each of the terms a1,a2, etc. is given as a condition of type IND or CASES.

Functions in FILE AGGR:

N             Number of cases
              Example: Nbig N - Big / Big=Popul,30000,500000
              The `-' above means that N takes no X variable.
SUM           Sum of observations
              Example: Popul:8 SUM Popul
MEAN          Arithmetic mean of observations
STDDEV        Standard deviation
MIN           Minimum value
MAX           Maximum value
#VALUES       Number of different values
              (<data> must also be sorted with respect to the X variable.)
FIRST         Value of the first observation within the aggregate
              Typically, the name of the aggregate is copied by FIRST.
              Example: Province FIRST Province
LAST          Value of the last observation within the aggregate
NMISS         Number of missing observations
SUMS          Sum of observations. If any are missing, the result is missing.
MISSING       Only a new variable with missing values is created.
MODE          Mode of the observations
MEDIAN        Median of the observations
FRACTILE(p)   p-fractile of the observations (0<=p<=1)
ORDER(k)      Observation X(k) in the ordered sample X(1)<=X(2)<=...<=X(n)
              If k<0, observation X(n+k+1)
              Example: ORDER(-1) is the same as MAX.
ORDERN(k,V)   Value of variable V for the kth observation in the
              ordered sample (see ORDER)
              Example: Maxcomm ORDERN(-1,Commune) Popul
TMEAN(k)      Trimmed mean when the k largest and k smallest cases are omitted
TPMEAN(p)     Trimmed mean on rejection probability level p (0<p<0.5)
CORR(V)       Correlation of the X and V variables
SLOPE(V)      Slope a in the regression model X=a*V+b+eps
INTERCEPT(V)  Intercept constant b in the above regression model

In the last three functions, V can be replaced by ORDER, i.e. the order
1,2,...,n of the observation within the aggregate.

An application of FILE AGGR is presented by the sucro /AGGRDEMO

  A = Using several aggregation variables
  D = More information on data management
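
The aggregation rules described above can be illustrated outside Survo
with a short Python sketch. It is not FILE AGGR itself and makes several
assumptions: the data rows, the helper names order_index, trimmed_mean
and aggregate, and the use of None for a missing observation are all
hypothetical, and the treatment of missing values in SUM and ORDER
(missing cases are skipped) is inferred from the description of SUMS
rather than confirmed. A negative k is taken to count from the largest
observation, so that ORDER(-1) agrees with MAX.

```python
from itertools import groupby
from statistics import mean

# Hypothetical data, already sorted by the aggregation variable "Province".
# None stands in for a missing observation (an assumption of this sketch).
rows = [
    {"Province": "A", "Commune": "a1", "Popul": 100},
    {"Province": "A", "Commune": "a2", "Popul": 300},
    {"Province": "A", "Commune": "a3", "Popul": 200},
    {"Province": "B", "Commune": "b1", "Popul": 50},
    {"Province": "B", "Commune": "b2", "Popul": None},
]


def order_index(n, k):
    """Map ORDER(k) to a 0-based index into a sorted sample of size n.

    k > 0 counts from the smallest value, k < 0 from the largest,
    so k = -1 selects the maximum. k = 0 is not handled here.
    """
    position = k if k > 0 else n + k + 1
    return position - 1


def trimmed_mean(values, k):
    """TMEAN(k): mean after omitting the k smallest and k largest values."""
    ordered = sorted(values)
    return mean(ordered[k:len(ordered) - k])


def aggregate(rows, by="Province", x="Popul"):
    """Aggregate consecutive rows that share the same value of `by`."""
    result = []
    # groupby merges only consecutive equal keys, which is why the
    # data must be sorted by the aggregation variable beforehand.
    for _, group in groupby(rows, key=lambda r: r[by]):
        cases = list(group)
        present = [r for r in cases if r[x] is not None]
        values = [r[x] for r in present]
        # Assumes every aggregate has at least one non-missing value.
        ranked = sorted(present, key=lambda r: r[x])
        top = ranked[order_index(len(ranked), -1)]
        result.append({
            by: cases[0][by],                    # FIRST
            "N": len(cases),                     # N
            "NMISS": len(cases) - len(present),  # NMISS
            "SUM": sum(values),                  # SUM (missing skipped)
            "SUMS": sum(values) if len(present) == len(cases) else None,
            "MAX": top[x],                       # ORDER(-1), same as MAX
            "Maxcomm": top["Commune"],           # ORDERN(-1,Commune)
        })
    return result


result = aggregate(rows)
for row in result:
    print(row)
```

Each dictionary in result corresponds to one observation of the new
aggregated file; the keys play the role of the names A1,A2,... in the
VARIABLES list.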