Environment for creative processing of text and numerical data

SURVO MM
S.Mustonen: (2010-)

Survo highlights and small applications

Background of these demos.

Links to demos

1. Using Survo in Touch mode
2. Editorial computing in Survo
3. Plotting curves
4. Worm mode
5. Simulating a bivariate normal distribution
6. Table formatting and sorting (Origin of the editorial approach)
7. Simple bar chart
8. A closed curve
9. Mr. Cole gives a talk
10. Linear regression analysis
11. Chernoff's faces
12. Sum of independent random variables tends to normal distribution
13. Lissajous curve variation (Knitting a carpet)
14. Histogram
15. Factor analysis
16. Comparing two samples
17. Fisher's exact test for contingency tables
18. Miscellaneous conversions
19. Computus: calculating the date of Easter (by KV)
20. Pascal's triangle
21. Grid lines
22. Why 0.3-0.2-0.1 is not zero in PC's?
23. Arrow diagram of a correlation matrix
24. "Origin of Species"
25. Cooling of a coffee cup
26. Symbolic derivatives
27. Chords of ellipses
28. Linear dependencies in a matrix
29. Temperature in Helsinki
30. Example P160 from Survo Book (1992)
31. Fence lines
32. Small problem of Ramanujan
33. July mean temperature and rainfall in Helsinki
34. Approximate squaring of a circle
35. Symmetric random walk
36. Reversing
37. Birds (word and phrase continuation)
38. Graphical rotation in factor analysis
39. Pythagorean points on a green meadow
40. Unbiased coin-flips with a biased coin
41. Omega coin tossing
42. Rational approximations by listening
43. Color changing
44. Permutation test
45. HH - HT game (analysis)
46. HH - HT game (simulation)
47. Solving a Survo puzzle
48. Lines going through 3 points in a 9x9 grid
49. Solving a Survo puzzle by the swapping method
50. Prime numbers listed by a sucro
51. Ulam spiral in color
52. Testing the correlation coefficient
53. Age pyramid (Finland 2009)
54. Letter frequencies in Shakespeare's Sonnets
55. Shakespeare's Sonnets as a Markov chain
56. Linear regression analysis by orthogonalization
57. The most common words in Shakespeare's Sonnets
58. Discriminant analysis of Iris flower data set
59. Cluster analysis of Iris flower data set
60. F1 connections
61. "Rotated arrowheads"
62. Influence curves for the correlation coefficient
63. Colored texts in bar/pie charts
64. "Hello World!"
65. Virtual keyboard
66. Cycloid
67. Prime factors of numbers m^n-1
68. Some properties of Magic Squares (by KV)
69. Some further properties of Magic Squares (by KV)
70. Solving linear equations
71. Polynomial regression
72. Marking columns by SHADOW SET
73. Hunting quanta
74. Survopoint display mode
75. Four-dimensional cube
76. 'Word' processing by mouse
77. Edges and diagonals of a regular n-sided polygon
78. Examples from a presentation in 1987
79. Thurstone's box problem
80. Finding recursive formula for number of grid lines
81. Testing the correlation coefficient
82. Matrix interpreter (regression analysis)
83. "Origin of Species" 2
84. Simulating multivariate normal distribution
85. Genesis of multivariate normal distribution 1
86. Genesis of multivariate normal distribution 2
87. Monthly temperature and rainfall in Helsinki
88. Early sound experiment on Elliott 803 computer in 1962
89. Tracing a sucro program
90. Finding primes by the Sieve of Eratosthenes
91. Combining Survo operations by sucros
92. Tracing a sucro program (Finding prime numbers)
93. Multiple discriminant analysis in linguistic problems
94. Probability of Matching Column Drums (1/2)
95. Probability of Matching Column Drums (2/2)
96. Distance distributions in networks 1
97. Distance distributions in networks 2
98. Distance distributions in networks 3
99. Battle over Degrees of Freedom
100. Problem of minor chord in music
101. Dissonance functions
102. Random music with Slutzky-Youle effect
103. Synthetic bird song
104. Sounds of statistical data 1
105. Sounds of statistical data 2: "Cuckoos singing in the rain"
106. Resurrection of SURVO 66
107. Cross tabulations with SURVO66
108. Printing a small document
109. Circle estimation
110. Contour ellipses on a graph paper
111. Sampling from a discrete uniform distribution
112. Merits of slow plotting
113. Tuning roots of algebraic equations by "listening"
114. Equation for the sum of chord lengths in a regular polygon
115. Regular polygons: Solving riddle of q coefficients
116. Regular polygons: Testing roots

Background of demos

Always when the main web page is opened, one randomly selected Survo demo is running on that page as a GIF animation.

All these examples are created as pure Survo applications, as sucros (Survo macros), by letting Survo save all actions of the user in a sucro file. After possible editing of that file, Survo repeats the session automatically and the ScreenFlash program saves it as a Flash file. ScreenFlash finally converts the Flash file into an animated GIF picture. That technique was applied to demos ex1 - ex91.

The latest items (ex92-) in this collection are created from sucros as MP4 videos by BB FlashBack Recorder.

By clicking the animation, background information about the current topic will be displayed on another web page (this page) containing short descriptions of all Survo animations.

You can play any of the examples simply by clicking its sample picture.


Sucros behind most of the gif animations are available when using SURVO MM by the command
/LOAD <Survo>\U\EX\INDEX / or by soft buttons DEMO HIGHLIGHTS
This gives a list of these sucros and any of them may be run and studied more closely.


The only web browser able to control GIF animations decently seems to be Mozilla Firefox with the SuperStop add-on. When you select an example from the list below by clicking its sample picture, it is possible to pause the demo by pressing the shift-ESC key and continue thereafter by the F5 key.

In Internet Explorer, hitting ESC may stop the animation and you can examine the current situation more accurately but there are no means to continue and the demo has to be restarted.

Chrome allows no interventions from the user.

Demos in YouTube and as MP4 videos

Some demos have been made available also in YouTube thus enabling better possibilities for navigation. It is possible to pause a demo for studying the current situation more carefully and then continue. It is also easy to jump forward or backward. These controls are also available in MP4 videos of examples from ex92 onwards.


These demos are created by using SURVO MM, the original Windows version of Survo, now freely available.
Another free alternative is Survo R (Muste) available on all common operating systems.


Using Survo in Touch mode

Example #1 (Start the demo by clicking the picture!)

Touch mode is one of the smart calculation modes in Survo.
The function key F3 is the TOUCH key for entering the Touch mode of the Survo editor.


Editorial computing in Survo

Example #2 (Start the demo by clicking the picture!)

Editorial computing provides unique means for simple arithmetics and for making extensive computation schemes.

Plotting curves

Example #3 (Start the demo by clicking the picture!)

In Survo, graphics is produced either in PostScript format (by PLOT commands) or in EMF (Enhanced Meta File) format (by GPLOT commands). GPLOT pictures appear automatically in their own windows and several graphs may appear simultaneously on the screen according to user-specific layouts.

Survo Graphics windows typically do not overlay the Survo main window. However, in these GIF-animations the graphs are placed on the main window.


"Worm mode"

Example #4 (Start the demo by clicking the picture!)

"Worm mode" is a special feature of Touch mode. It lets the user form sequences in any direction from characters displayed in the window and move these sequences in any direction. The display thus created may be made permanent.

When some computer specialists claimed that Survo is capable of 'simple text editing only', "Worm mode" was created in 1994 to demonstrate that they were wrong :)


Simulating a bivariate normal distribution

Example #5 (Start the demo by clicking the picture!)

This demo in YouTube

MNSIMUL operation in Survo is a general tool for generating samples from any multivariate normal distribution. The parameters of the distribution are given by a correlation matrix and a matrix of means and standard deviations. In this application standardized variables (with means=0 and standard deviations=1) are created and this is indicated by an asterisk (*) as the second parameter.
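
As a rough illustration of what sampling from a standardized bivariate normal distribution involves, here is a minimal Python sketch. It is not the MNSIMUL code of Survo. The function names and the correlation value used in the usage note are assumptions made for this example only.

```python
import math
import random

def simulate_bivariate_normal(n, rho, seed=None):
    """Return n pairs (x, y) of standardized normal variates with correlation rho."""
    rng = random.Random(seed)
    k = math.sqrt(1.0 - rho * rho)
    sample = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # y is a linear combination of two independent standard normals
        sample.append((z1, rho * z1 + k * z2))
    return sample

def correlation(pairs):
    """Sample correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)
```

For example, `correlation(simulate_bivariate_normal(10000, 0.8, seed=1))` should come out close to the hypothetical target value 0.8.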

As a technical detail, it is also shown how the graph of the sample is positioned in the display.

Screen graphics (created by GPLOT commands of Survo) are displayed by default in separate windows typically outside the Survo main window.


Table formatting and sorting

Example #6 (Start the demo by clicking the picture!)

This demo in YouTube

The editorial approach was originally created for a musical application.

See the Flash demo
About the idea of editorial approach
and
Example of the first version of Survo Editor.

Soon after this experiment it was realized that the same approach could be used for many more purposes, too. For example, it was a pleasure to detect how easily a formatted table of several columns could be sorted according to any column.

Plotting music in 1982 by using the first Survo Editor (Click the picture to see the plotter working)


Simple bar chart

Example #7 (Start the demo by clicking the picture!)

In Survo, graphics is produced either in PostScript format (by PLOT commands) or in EMF (Enhanced Meta File) format (by GPLOT commands). GPLOT pictures appear automatically in their own windows and several graphs may appear simultaneously on the screen according to user-specific layouts.

Survo Graphics windows typically do not overlay the Survo main window. However, in these GIF-animations the graphs are placed on the main window.


A closed curve

Example #8 (Start the demo by clicking the picture!)

This demo in YouTube

This is one of the oldest Survo applications (created in 1976) in this series. The original graph was produced by using SURVO 76 on a Wang 2200 minicomputer. Plotting that graph on a drum plotter took about one hour.

A closed curve defined by a plotting scheme

HEADER= FRAME=0 MODE=1024       XDIV=0,1,0 YDIV=0,1,0
T=0,2*pi,pi/4000 pi=3.14159265 XSCALE=-3.0,3.0 YSCALE=-1.5,1.5
R=cos(78*T)+cos(80*T) A=R*cos(T) B=R*sin(T)
s=0.8 u=0.06 LINETYPE=[color(0.1,0.3,1,0.2)]  COLORS=[/BLACK]
SLOW=400
GPLOT X(T)=A+s*B+u*sin(5*B),Y(T)=B+u*sin(5*A)

is drawn. The graph consists of a single curved line traversing through the origin 2x80=160 times. Plotting is here slowed down 400 times (see the SLOW specification above). In the snapshot above the cycle is not completed.
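
For readers without Survo, the same parametric equations can be evaluated in Python. This is only a sketch of the arithmetic in the plotting scheme above (the function name `closed_curve` is hypothetical); it computes the points but does not draw them.

```python
import math

def closed_curve(steps=8000, s=0.8, u=0.06):
    """Points of the curve for T = 0, pi/4000, ..., 2*pi (steps + 1 points)."""
    points = []
    for k in range(steps + 1):
        t = 2.0 * math.pi * k / steps
        r = math.cos(78 * t) + math.cos(80 * t)
        a = r * math.cos(t)
        b = r * math.sin(t)
        x = a + s * b + u * math.sin(5 * b)
        y = b + u * math.sin(5 * a)
        points.append((x, y))
    return points
```

The first and last points coincide (up to rounding), which is what makes the curve closed, and all points stay inside the XSCALE and YSCALE ranges given in the scheme.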

Mr. Cole gives a talk

Example #9 (Start the demo by clicking the picture!)

This demo in YouTube

The original version of this demonstration was made in 1990 in Finnish and it still belongs to a collection of tutorials made in the Sucro language of Survo.

In fact, all these GIF animations were originally made as such tutorials by letting the ScreenFlash program 'watch' them and save them as Flash movies. These movies were then converted to GIF animations by the same program.

This example was also present in 1990 in a school version of Survo. The aim here was to point out how diligent people in old times - without computers and calculators - were ready and able to do very demanding numerical computing.

The arithmetical calculations presented here were carried out by a family of arithmetical sucros (<Survo>\OPETUS\AR) made just for this presentation.

A related demo: Prime factors of numbers m^n-1


Linear regression analysis

Example #10 (Start the demo by clicking the picture!)

Several operations for regression analysis are available. The oldest of them is LINREG, which is applied here to a 'historical' data set DECA that has belonged to the repertoire since the 1970s.

Chernoff's faces

Example #11 (Start the demo by clicking the picture!)

This demo in YouTube

Survo offers several means for illustration of multivariate statistical data.
One of them is Chernoff's faces.

The original numerical data was given here
(almost 20 years before Chernoff invented his faces).


Sum of independent random variables tends to normal distribution

Example #12 (Start the demo by clicking the picture!)

This demo in YouTube

This demo illustrates the power of the Central limit theorem of probability and statistics.

It is a combination of two sucros, the first one for selecting one of the given discrete distributions and the second one for computing distribution of sums of independent variates from the selected distribution.

At each stage it is shown graphically how close the standardized sum distribution is to the normal distribution. The gap between the sum and the normal distribution is also given numerically as a deviation corresponding to the standard Kolmogorov-Smirnov test statistic.

Two examples are shown. The first one tells how the binomial distribution tends quickly to normal distribution. The second example (due to a very heavy tail on the right) has more dramatic features but eventually normalization is its inevitable destiny, too.
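
The idea can be sketched in Python as follows. This is not the sucro used in the demo: the function names are hypothetical, the distribution is assumed to live on the integers 0, 1, 2, ..., and the gap is taken here as the largest difference between the step distribution function of the sum and the normal distribution function with the same mean and standard deviation.

```python
import math

def convolve(p, q):
    """Distribution of the sum of two independent variates on 0, 1, 2, ..."""
    r = [0.0] * (len(p) + len(q) - 1)
    for i, pa in enumerate(p):
        for j, qb in enumerate(q):
            r[i + j] += pa * qb
    return r

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def max_cdf_gap(pmf):
    """Largest gap between the step CDF of pmf and the matching normal CDF."""
    mean = sum(k * p for k, p in enumerate(pmf))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(pmf))
    sd = math.sqrt(var)
    gap, cum = 0.0, 0.0
    for k, p in enumerate(pmf):
        phi = normal_cdf((k - mean) / sd)
        gap = max(gap, abs(cum - phi))   # just below the jump at k
        cum += p
        gap = max(gap, abs(cum - phi))   # at the jump itself
    return gap

def sum_gaps(pmf, n_max):
    """Gaps for sums of 1, 2, ..., n_max independent variates from pmf."""
    gaps, current = [], list(pmf)
    for _ in range(n_max):
        gaps.append(max_cdf_gap(current))
        current = convolve(current, pmf)
    return gaps
```

Calling `sum_gaps([0.5, 0.5], 30)` follows the binomial case: the gap shrinks as more terms are added to the sum.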


Lissajous curve variation (Knitting a carpet)

Example #13 (Start the demo by clicking the picture!)

This demo in YouTube

This graph belongs to a series of cover pictures I made by Survo for the magazine "Dimensio" of "the Finnish Association of Mathematics and Science Education Research" in 1990-91. The original graph was published in the 9/91 issue of the magazine and it contained also a short article where I described the facts and details related to graphs like this.

The graph here is slightly simplified, due to a limited resolution on the screen but given as a stepwise presentation revealing the complete symmetry finally at the last steps.

The basis of the graph is a Lissajous curve getting a more surprising appearance by "rounding" the function values to integers.

The entire setup in a Survo edit field for making the graph is

GPLOT X(T)=int(M*sin(N*T)+0.5),
      Y(T)=int(N*cos(M*T)+0.5)
M=29 N=19 T=[line_width(4)],0,2*pi,pi/3100 pi=3.14159
HEADER= FRAME=0 XSCALE=-M,M YSCALE=-N,N
XDIV=0,1,0 YDIV=0,1,0
MODE=652,381 WSIZE=652,381 WHOME=0,0 WSTYLE=0
SLOW=300 Slowing the speed by drawing each line segment 300 times

The corresponding Lissajous curve without "rounding" by int():
Pure Lissajous curve
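
The effect of the rounding can also be explored outside Survo. The Python sketch below evaluates the same expressions; it assumes that Survo's int() truncates toward zero as Python's `int()` does, which is not confirmed here, and the function name is hypothetical.

```python
import math

def rounded_lissajous(m=29, n=19, steps=6200):
    """Integer-valued points for T = 0, pi/3100, ..., 2*pi."""
    points = []
    for k in range(steps + 1):
        t = 2.0 * math.pi * k / steps
        # Assumption: int() truncates toward zero, as in Python
        x = int(m * math.sin(n * t) + 0.5)
        y = int(n * math.cos(m * t) + 0.5)
        points.append((x, y))
    return points
```

All points fall on an integer grid, so many consecutive parameter values share the same point and the line segments run between grid points only.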


Histogram

Example #14 (Start the demo by clicking the picture!)


Factor analysis

Example #15 (Start the demo by clicking the picture!)

There are many options in Survo for factor analysis and related topics.
This is a straightforward example of the classical approach.


Comparing two samples

Example #16 (Start the demo by clicking the picture!)

The final output (with additional comments on lines 31-39) of the COMPARE program of Survo is displayed above. This example was created in 1986.

Fisher's exact test for contingency tables

Example #17 (Start the demo by clicking the picture!)

Computers are now so fast that P values for exact statistical tests are obtained in reasonable time and with reasonable accuracy by simple simulation. This approach has been used in Survo since 1986.
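
A minimal Python sketch of the simulation idea for a 2x2 table is given below. It is not the Survo implementation. The two-sided criterion used here (count the tables that are no more probable than the observed one) and all function names are assumptions of this example; the exact value is computed alongside for comparison.

```python
import math
import random

def table_probability(a, b, c, d):
    """Hypergeometric probability of the table [[a, b], [c, d]] given its margins."""
    return (math.comb(a + b, a) * math.comb(c + d, c)
            / math.comb(a + b + c + d, a + c))

def fisher_exact_p(a, b, c, d):
    """Exact two-sided P value by enumerating all tables with the same margins."""
    row1, col1, n = a + b, a + c, a + b + c + d
    p_obs = table_probability(a, b, c, d)
    total = 0.0
    for x in range(max(0, row1 + col1 - n), min(row1, col1) + 1):
        p = table_probability(x, row1 - x, col1 - x, n - row1 - col1 + x)
        if p <= p_obs * (1 + 1e-9):
            total += p
    return total

def fisher_simulated_p(a, b, c, d, replicates=20000, seed=None):
    """Estimate the same P value by shuffling column labels among observations."""
    rng = random.Random(seed)
    row1, col1, n = a + b, a + c, a + b + c + d
    p_obs = table_probability(a, b, c, d)
    labels = [1] * col1 + [0] * (n - col1)   # column membership of each observation
    hits = 0
    for _ in range(replicates):
        rng.shuffle(labels)
        x = sum(labels[:row1])               # simulated count in the first cell
        p = table_probability(x, row1 - x, col1 - x, n - row1 - col1 + x)
        if p <= p_obs * (1 + 1e-9):
            hits += 1
    return hits / replicates
```

For a small table such as [[3, 1], [1, 3]] the simulated value can be compared directly with the enumerated one.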

Miscellaneous conversions

Example #18 (Start the demo by clicking the picture!)


Computus: calculating the date of Easter

Example #19 (Start the demo by clicking the picture!)

This demo is created by Kimmo Vehkalahti.

It is a good example of co-operation between Editorial computing and Survo data file operations.


Pascal's triangle

Example #20 (Start the demo by clicking the picture!)

Two ways for creating Pascal's triangle are presented. In the first one, matrix commands and the library function C(n,m) giving the binomial coefficients are used. The second way is based entirely on efficient utilization of Touch mode. The details of this construction are given in the User Guide (1992) on page 73.
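
The two approaches have simple counterparts in Python, shown here only as a sketch: one uses binomial coefficients directly, the other builds each row by adding neighbouring entries of the previous row. The second function mimics the additive idea only; it does not reproduce the Touch mode technique described in the User Guide.

```python
import math

def pascal_by_binomials(rows):
    """Rows 0..rows-1 of the triangle from binomial coefficients C(n, m)."""
    return [[math.comb(n, m) for m in range(n + 1)] for n in range(rows)]

def pascal_by_addition(rows):
    """The same rows built by summing adjacent entries of the previous row."""
    triangle = []
    for n in range(rows):
        row = [1] * (n + 1)
        for m in range(1, n):
            row[m] = triangle[n - 1][m - 1] + triangle[n - 1][m]
        triangle.append(row)
    return triangle
```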

Grid lines

Example #21 (Start the demo by clicking the picture!)

This demo in YouTube

The formulas behind the computational setup

L(N)|=if(N<2)then(0)else(2*L1(N)-L(N-1)+R1(N))
L1(N)|=if(N<3)then(1)else(2*L(N-1)-L1(N-1)+R2(N)) R1(N):=4*S(N)
S(N):=for(I=2)to(N)sum(totient(I-1)-e(I))
e(N):=if(mod(N,2)=0)then(0)else(totient((N-1)/2))
R2(N):=if(mod(N,2)=0)then((N-1)*totient(N-1))else(R21(N))
R21(N):=if(mod(N,4)=1)then((N-1)*totient(N-1)/2)else(0)
were found experimentally by using Survo as described in my document (pages 11-15) and shown step by step in YouTube and also as a flash demo.
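
For readers who want to try the recursion outside Survo, the formulas above can be transcribed into Python as follows. This is a sketch: the iterative arrangement and the function names are choices made for this example, and the totient function is computed naively.

```python
from math import gcd

def totient(n):
    """Euler's totient function (naive version, adequate for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def e(n):
    return 0 if n % 2 == 0 else totient((n - 1) // 2)

def r2(n):
    if n % 2 == 0:
        return (n - 1) * totient(n - 1)
    if n % 4 == 1:
        return (n - 1) * totient(n - 1) // 2
    return 0

def grid_lines(n_max):
    """Return [L(1), ..., L(n_max)] using the paired recursion for L and L1."""
    values = []
    l_prev, l1_prev, s = 0, 1, 0
    for n in range(1, n_max + 1):
        if n < 2:
            l_cur, l1_cur = 0, 1
        else:
            s += totient(n - 1) - e(n)          # running sum S(N); R1(N) = 4*S(N)
            l1_cur = 1 if n < 3 else 2 * l_prev - l1_prev + r2(n)
            l_cur = 2 * l1_cur - l_prev + 4 * s
        values.append(l_cur)
        l_prev, l1_prev = l_cur, l1_cur
    return values
```

`grid_lines(7)` gives 0, 6, 20, 62, 140, 306, 536 for N = 1, ..., 7.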

Another formula in Sloane's Encyclopedia of Integer Sequences has been presented earlier but it is much slower in computations.


Before finding the fast recursive formulas, I could make a conjecture that an accurate asymptotic expression for L(n) is

L(n)=[3/(2*pi)*n^2]^2+O(n^2.5)

based on calculations for values n<=15000 by the slow formula as told in my document on pages 8-9.

Later (by using the fast formulas) I have computed the L(N) values for all N values to 10^11 by Mathematica code (on page 27) controlled directly from Survo. The graph indicates that the accuracy of the asymptotic expression really seems to be of order O(n^2.5) and this conjecture has been validated in a paper by Hytönen-Ernvall, Matomäki, Haukkanen, and Merikoski provided that the Riemann hypothesis is true. In the same paper also my other empirical findings have been proved.

More information about this topic in

Finding recursive formula for number of grid lines

Another related example


Why 0.3-0.2-0.1 is not zero in PC's?

Example #22 (Start the demo by clicking the picture!)

Two routines for dealing with binary numbers were needed. When using Survo, the quickest way to create such auxiliary tools is to make them by using Survo's own macro language as sucros.

Here are listings of those 'ad hoc' sucros (readily available for all Survo users):
*TUTSAVE BIN-CONV
/ /BIN-CONV number,n
* converts a positive decimal number <1 into binary form with n bits.
/ def Wx=W1 Wacc=W2 Wn=W3 Wint=W4
/
*{init}{tempo -1}{Wn=0}{Wint=0.}{R}{erase}
+ A: {Wx=2*Wx}{ref}{line end}{print Wint}{R}
*{erase}int({print Wx})={act}{l} {save word Wint}{Wx=Wx-Wint}
*{Wn=Wn+1}
- if Wn < Wacc then goto A
*{line start}{erase}
+ E: {tempo +1}{end}
*
*
*TUTSAVE BIN-SUB
/  /BIN-SUB makes the difference of two binary numbers,
/  either integers or fractions in (0,1) according to following setup:
/
/           ..  ....   ....  (borrowed bits appearing during calculation)
/           1001110000110010 A
/  /BIN-SUB 0010010010100111 B  (activate at the last bit)
/           0111011110001011 A-B
/
*{tempo -1}
+ A: {ref set 1}{save char W1}
- if W1 '=' {sp} then goto E
- if W1 '=' . then goto B
- if W1 '=' 0 then goto C
/ W1=1
*{u}{save char W2}
- if W2 '=' 1 then goto D1
*{u}{save char W2}{d}
- if W2 '=' . then goto D2
/
+ F: {l}{u}.{l}{d}{save char W2}
- if W2 '=' 0 then goto F
*{ref jump 1}{W3=1}{goto S}
+ D1: {u}{save char W2}{d}
- if W2 '=' . then goto D3
*{d}{W3=0}{goto S}
+ D3: {goto F}
+ D2: {d}{W3=0}{goto S}
/
+ C: {u}{save char W2}
- if W2 = 1 then goto C1
*{u}{save char W2}{d}
- if W2 '=' . then goto C2
*{d}{W3=0}{goto S}
+ C2: {d2}1{ref jump 1}{l}{goto A}
+ C1: {u}{save char W2}{d}
- if W2 '=' . then goto C3
*{d}{W3=1}{goto S}
+ C3: {d}{W3=0}{goto S}
+ B: {W3=.}
+ S: {d}{print W3}{ref jump 1}{l}{goto A}
+ E: {tempo +1}{end}
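
The doubling scheme used by the BIN-CONV sucro can also be written in a few lines of Python. This is a sketch with a hypothetical function name, not a translation of the sucro; it shows that 0.1, 0.2 and 0.3 have no finite binary representation, which is why the difference 0.3-0.2-0.1 does not vanish in binary floating point arithmetic.

```python
def bin_conv(x, n):
    """First n binary digits of a number x with 0 <= x < 1."""
    bits = []
    for _ in range(n):
        x *= 2              # shift the next binary digit in front of the point
        bit = int(x)
        bits.append(str(bit))
        x -= bit
    return "".join(bits)

# The repeating patterns never terminate, so each number is stored rounded.
for value in (0.1, 0.2, 0.3):
    print(value, "0." + bin_conv(value, 24))
print(0.3 - 0.2 - 0.1 == 0)
```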

Arrow diagram of a correlation matrix

Example #23 (Start the demo by clicking the picture!)

This demo in YouTube

Relations between 12 variables are visualized by connecting any pair by a line if the correlation is strong enough. Positive correlations are indicated by a red line, negative by a blue line. The line thickness reflects the size of the correlation coefficient.

It is not possible to make graphs of this kind fully automatically since there are so many options. However, a ready-made template corresponding to this example exists for Survo users. It is easy to modify this template for correlation matrices with up to at least 30 variables and get a good general view of the relations at hand.

To gain enough accuracy, this arrow or vector diagram is drawn as a PostScript picture. The final picture is obtained by combining two graphs using the EPS JOIN command of Survo for PostScript files generated by Survo PLOT commands. The final display here is dramatically slowed down by a SLOW=3000 specification when making the arrow diagram.


"Origin of Species"

Example #24 (Start the demo by clicking the picture!)

This demo in YouTube

I made this graph for the first time using Survo in 1976 on a drum plotter connected to a Wang 2200 minicomputer. It was plotted in separate parts so that its size was over one square meter.

The graph illustrates how little information is needed for creating various forms starting from a simple circle (ovum) at the center. The plotting scheme

XDIV=0,1,0 YDIV=0,1,0 SIZE=1180,1180 HEADER= FRAME=3 HOME=300,500
A=-8,10,1        B=-8,10,1        T=0,2*pi,pi/30 pi=3.14159265
XSCALE=-9,11     YSCALE=-9,11     DEVICE=PS,SPECIES.PS

PLOT X(T)=A+0.225*SIN(T)+0.139*SIN(A*T)+0.086*SIN(B*T),
     Y(T)=B+0.225*COS(T)+0.139*COS(A*T)+0.086*COS(B*T)

/GS-PDF SPECIES.PS
reveals that the entire graph is created by activating a single PLOT command making a family of curves depending on two parameters A and B, both varying from -8 to 10 in steps of 1 and simultaneously affecting both the location of each partial graph and its form.

The graph is created as a PostScript file SPECIES.PS and converted into the PDF format.

A more accurate version (tenfold size and step length pi/300) is made as follows:

*XDIV=0,1,0 YDIV=0,1,0 SIZE=11800,11800 HEADER= FRAME=3 HOME=0,0
*A=-8,10,1        B=-8,10,1        T=[line_width(0.96)],0,2*pi,pi/300
*pi=3.141592653589793
*XSCALE=-9,11     YSCALE=-9,11     DEVICE=PS,SPECIES10.PS
*
*PLOT X(T)=A+0.225*SIN(T)+0.139*SIN(A*T)+0.086*SIN(B*T),
*     Y(T)=B+0.225*COS(T)+0.139*COS(A*T)+0.086*COS(B*T)
*
*.....................................................................
*PRINT CUR+1,E TO K.PS / Reduction to original size
% 1240
- [left_margin(1)]
- picture species10.ps,*,*,0.1,0.1
E

Cooling of a coffee cup

Example #25 (Start the demo by clicking the picture!)

ESTIMATE was the first statistical program I created for the new version of Survo (SURVO 84C in 1985) written in the C language. It is still the general tool in Survo for nonlinear regression analysis and maximum likelihood estimation, for example. In fact, simultaneously I also wrote a C program (currently DER) for computing symbolic derivatives of real functions since they are valuable when forming the gradient and searching for the optimum of the object function in nonlinear regression, etc.

I had done the same things one year before for the Wang PC by using interpreted Basic. When I heard some programming experts in Finland say that "Basic spoils your brain!" :) I wanted to test my brain, when I got a chance to start learning C, by selecting these rather demanding targets as my first examples in C programming.

Already then I usually wrote all my programs at the computer without pen and paper, but all this happened during my summer vacation in 1985 in Central Finland, where I had no access to any computer. So I wrote these programs by hand and got the first chance to test them only after returning home in August and starting to use my brand new IBM PC (AT model) and the new Microsoft C compiler.

This tiny example tells how ESTIMATE is used in calculating parameters and related statistics of a nonlinear regression model. The predecessor of ESTIMATE (on Wang PC in 1984) was probably one of the first statistical programs able to evaluate symbolic derivatives automatically and see (by studying derivatives of the second degree of the model function) whether they all are zero or not and thus determine if the model is linear with respect to parameters to be estimated or not. Then the program could decide what kind of numerical algorithm to select.

The first model in this example was

MODEL CUP1 / Exponential decay
T=T0+a*exp(-b*t)
ESTIMATE is able to distinguish what are the parameters to be estimated (a,b) since it detects that T and t are variables in the data set CUP.
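
To show what estimating a and b in such a model involves, here is a small Python sketch. It is not the algorithm of ESTIMATE: it uses a plain Gauss-Newton iteration with hand-written derivatives, takes its starting values from a straight-line fit of log(T-T0) on t, and assumes that T0 is a known constant and that all temperatures exceed it. The function name and the data in the usage note are hypothetical.

```python
import math

def fit_exponential_decay(times, temps, t0, iterations=20):
    """Least-squares estimates (a, b) for T = t0 + a*exp(-b*t), t0 known."""
    # Starting values from a straight-line fit of log(T - t0) on t
    logs = [math.log(temp - t0) for temp in temps]
    n = len(times)
    mt = sum(times) / n
    ml = sum(logs) / n
    slope = (sum((t - mt) * (v - ml) for t, v in zip(times, logs))
             / sum((t - mt) ** 2 for t in times))
    a, b = math.exp(ml - slope * mt), -slope
    for _ in range(iterations):
        # Gauss-Newton step with dT/da = exp(-b*t), dT/db = -a*t*exp(-b*t)
        saa = sab = sbb = ra = rb = 0.0
        for t, temp in zip(times, temps):
            ex = math.exp(-b * t)
            da, db = ex, -a * t * ex
            res = temp - (t0 + a * ex)
            saa += da * da
            sab += da * db
            sbb += db * db
            ra += da * res
            rb += db * res
        det = saa * sbb - sab * sab
        a += (sbb * ra - sab * rb) / det
        b += (saa * rb - sab * ra) / det
    return a, b
```

With made-up readings generated from a=70, b=0.05 and T0=20, the function returns values very close to those parameters.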

Another demo about ESTIMATE


Symbolic derivatives

Example #26 (Start the demo by clicking the picture!)

I created the DER program together with the ESTIMATE program for the new version of Survo (SURVO 84C in 1985) written in the C language. ESTIMATE uses the DER code for creating symbolic derivatives of the object function and converts them into reverse Polish notation.

I made the original Finnish version of this demo in 1990 in connection with a limited SURVOS version intended for use in Finnish schools.


Chords of ellipses

Example #27 (Start the demo by clicking the picture!)

This plot is a collection of 51x36x10=18360 chords inside 36x10=360 ellipses created according to the plotting scheme
SIZE=681,381 XDIV=0,1,0 YDIV=0,1,0 MODE=681,381
SCALE=0,7 FRAME=0 SLOW=100
t=0,50,1 n=0,35,1 r=-1.0,-0.1,0.1
GPLOT X(t)=int(n/6)+1+r*cos((-7*r+n)*t),
      Y(t)=n+1-6*int(n/6)+r*sin((-7*r+n)*t)
COLORS=[/BLACK]
COLOR_CHANGE=n-10*r,16
The COLOR_CHANGE specification takes care of selecting one of 16 colors according to value mod(n-10*r,16). SLOW=100 makes the output 100 times slower than normally.

Linear dependencies in a matrix

Example #28 (Start the demo by clicking the picture!)

This demo in YouTube

This example is related to the problem of selecting variables (for example, for multiple regression analysis). However, here the selection is not based on any external information (like on the regressand) but it must be done solely by internal criteria.

I encountered this problem when making the first computer program in 1962 for Cosine rotation in factor analysis. This rotation technique was devised and applied as a hand calculation and graphical procedure by Yrjö Ahmavaara and Touko Markkanen in the 1950s.
As far as I know, before 1962 no analytical approach to the problem of selecting the 'factor variables' had been presented. In the cosine rotation program the target is to select the factor variables as the maximally orthogonal subset of variables by a determinant criterion. That principle is demonstrated in this example.

The example is an abridged version of the chapter 'Column space' in my paper
Matrix computations in Survo (1999).

Temperature in Helsinki

Example #29 (Start the demo by clicking the picture!)

This demo in YouTube

This graph of a time series was created by the Survo plotting scheme:

YLABEL=[Arial(25)],Yearly_mean_temperature_in_Helsinki_(1829-2009)
GPLOT HEL_MEAN,Year,Temp / SIZE=652,381 YDIV=50,291,40
XSCALE=1829(20)2009 YSCALE=1.5(0.5)8  TICK=5,1 TICK2=5,1
LINE=[WHITE],1 TREND=[BLACK],0 PEN=[BLACK]
FILL=[RED],1,1,181,Trend,1 FILL-=[BLUE]
XDIV=50,539,50 HEADER=   WHOME=0,0 WSIZE=652-5,381-25

Example P160 from Survo Book (1992)

Example #30 (Start the demo by clicking the picture!)

This demo in YouTube

Here is the entire setup in the edit field for making this experiment:

FILE CREATE SIMUDATA,4,1,64,7,10000
  Sample (N=10000) from a mixture of two normal distributions
FIELDS:
1 N 4 X
END

VAR X TO SIMUDATA
  X=if(rnd(1)<0.7)then(X1)else(X2)
  X1=probit(rnd(1))
  X2=0.5*probit(rnd(1))+2
.......................................................................

DENSITY MIXNORM(p,m1,s1,m2,s2)
y(x)=c*(p/s1*exp(-0.5*((x-m1)/s1)^2)+(1-p)/s2*exp(-0.5*((x-m2)/s2)^2))
     c=0.39894226
GHISTO SIMUDATA,X,22
X=-10(0.2)10 XSCALE=-10(2)10 YSCALE=0(100)600
FIT=MIXNORM INIT=0.5,0.5,1.5,2.5,0.7

HISTO: Estimated parameters of MIXNORM:
p=0.7044 (0.0123)
m1=0.0088 (0.0290)
s1=1.0136 (0.0185)
m2=2.0186 (0.0200)
s2=0.5200 (0.0139)
...
Since in this demo a ready-made example given in User Guide (p.160) was employed, the graph it generated in a separate window had to be dragged manually upon the main window.
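
The simulation part of this setup is easy to imitate in Python. The sketch below follows the VAR definitions above, with `probit` implemented through the inverse normal distribution function of the standard library; the function names are hypothetical and the fitting step performed by GHISTO is not reproduced.

```python
import math
import random
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()

def probit(u):
    """Inverse of the standard normal distribution function."""
    return _STANDARD_NORMAL.inv_cdf(u)

def simulate_mixture(n=10000, p=0.7, seed=None):
    """Sample from the mixture p*N(0,1) + (1-p)*N(2,0.5^2)."""
    rng = random.Random(seed)

    def rnd():
        u = rng.random()
        while u == 0.0:          # inv_cdf is undefined at 0
            u = rng.random()
        return u

    sample = []
    for _ in range(n):
        x1 = probit(rnd())
        x2 = 0.5 * probit(rnd()) + 2
        sample.append(x1 if rnd() < p else x2)
    return sample

def mixnorm_density(x, p, m1, s1, m2, s2):
    """The MIXNORM density with the constant c = 1/sqrt(2*pi) as given above."""
    c = 0.39894226
    return c * (p / s1 * math.exp(-0.5 * ((x - m1) / s1) ** 2)
                + (1 - p) / s2 * math.exp(-0.5 * ((x - m2) / s2) ** 2))
```

The theoretical mean of the mixture is 0.7*0+0.3*2=0.6, which the sample mean should approximate.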

Fence lines

Example #31 (Start the demo by clicking the picture!)

Fence lines make it possible to create adaptive setups in the Survo edit field so that the results of commands do not disturb other contents.

This technique is available in SURVO MM versions 3.16+.


Small problem of Ramanujan

Example #32 (Start the demo by clicking the picture!)

This example is taken from the book "The Man Who Knew Infinity, A Life of the Genius Ramanujan" (1991) by Robert Kanigel.
The formulas are typeset by the PRINT operation of Survo as PostScript files, then converted to bitmap files by the ImageMagick program, then to EMF format by the Photoline program, and finally displayed on the main window of Survo by a GPLOT FILE command of Survo.

July mean temperature and rainfall in Helsinki

Example #33 (Start the demo by clicking the picture!)

This demo in YouTube

The final scatter diagram is compiled by overlaying two plots. In the first one each observation is represented by a dot and in the second one by a year label. The first one is saved as an EMF file A.EMF by a specification OUTFILE=A. The second plot then overlays it by a specification INFILE=A.

Although the labels overlap heavily in the middle of the graph, the exceptional and thus the most interesting years can be clearly detected. The first versions of this graph were made in the late 1970s by using SURVO 76.


Approximate squaring of a circle

Example #34 (Start the demo by clicking the picture!)

It is a great pity that classical Euclidean plane geometry plays a minor role in the curriculum of mathematics in high schools, for example. Especially constructions with compass and straight edge could be used to reinforce visual perception.

In Survo a special program called by a GEOM command is available for making such constructions in conjunction with some other Survo functions.

Here a construction for approximate circle squaring is presented.
It is based on a random search in a square grid as explained in my paper
Statistical accuracy of geometric constructions (2008) on pages 35-37.

This construction is described in an edit field as follows:

*/GEOM
*GEOM CUR+1,E
*CL4
*O=point(2,2)
*A=point(2,0)
*_C1=circle(O,2)
*LX4=line(A,O)
*B=cross_cl(C1,LX4,2,4)
*c2=circle(B,*2)
*LY4=perpendicular(LX4,*B)
*C=cross_cl(C2,LY4,4,4)
*D=cross_cl(C2,LY4,0,4)
*LY2=perpendicular(LX4,O)
*LX2=perpendicular(LY4,*D)
*F=cross(LX2,LY2)
*G=midpoint(B,O,LX4)
*H=midpoint(D,F,LX2)
*C3=circle_p(O,G)
*J=cross_cl(C3,LY2,3,2)
*eag=edge(A,G)
*C4=circle(J,EAG)
*L=line(H,C)
*E=cross_cl(C4,L,0,3)
*Edge=edge(A,E)
*save edge(Edge)
E
GEOM is typically called by a sucro /GEOM which creates suitable Survo data files for various geometric objects appearing in the construction. Thus /GEOM also calls GEOM for making the construction so that points are saved in _POINTS.SVO, lines in _LINES.SVO, circles in _CIRCLES.SVO, and edges in _EDGES.SVO.

The construction can then be displayed by using various forms of the Survo operation PLOT. A ready-made template as a Survo edit field is available so that the entire construction is saved as a PostScript file.

Everyone who has experience of making geometric constructions in practice knows how much attention must be paid to a careful placement of the compass and the straightedge in each step of the construction in order to achieve as accurate results as possible.

In my paper, the accuracy of these placements is described by a simple statistical model and the accuracy of the entire construction is estimated on this basis. Then it is natural to consider the accuracy of the construction as a measure of its complexity. This measure is expected to give better possibilities for comparing complexities of constructions than the characteristics of Lemoine's geometrography. My approach is mainly computational. Although the error distribution of placements is defined precisely, the error distributions related to entire constructions are so complicated that the only way is to use Monte Carlo simulation for estimating essential statistics.

When considering the accuracy of this approximate circle squaring construction, the nominal accuracy (pi-3.14152=0.00007) is not a sufficient measure since it can be attained only when there are no errors in construction steps.

For example, the relative root mean squared error (defined on page 24 and computed on page 37 of my paper) of this construction is 2.125, whereas that of Kochanski's construction, when extended to an approximate construction of sqrt(pi)r (the side of the square), is 2.908, although the nominal accuracy of the latter is 0.00006 and thus slightly better.

  • See also: Statistical accuracy of geometric constructions and Squaring the circle

    Symmetric random walk

    Example #35 (Start the demo by clicking the picture!)

    This demo in YouTube

    By detecting that (by rotation of 45 degrees) the symmetric random walk in the plane can be seen as a combination of two independent and simultaneous one-dimensional random walks, it was easier to study asymptotic properties.

    The background of this presentation is William Feller's An Introduction to Probability Theory and Its Applications, Vol.I (Second Edition 1957) Ch. XIV.7 "Random Walks in the plane and space".

    As a student of mathematics and statistics I wrote an essay about this topic in 1959 after inventing a simpler formula for the transition probability (of moving from the origin to the point (x,y) in n steps) compared to that given as a double integral expression by Feller.

    I sent a letter about my findings to Feller and immediately received a friendly answer in which he promised to use my result "if any" in the forthcoming edition of his book. However, to my disappointment, nothing had been changed in this respect in the later editions.

  • See also: Plotting scatter diagrams and COMB program

    Reversing

    Example #36 (Start the demo by clicking the picture!)

  • See also: Worm mode, Moving parts of the edit field and EsreveR in Uncyclopedia

    Birds (word and phrase continuation)

    Example #37 (Start the demo by clicking the picture!)

    A context-sensitive autotext feature has been available in the Survo Editor since 1989 by means of the key combination F2 J.

    F1 J is an extended alternative to F2 J for completing phrases found elsewhere in the current edit field. As in this example, a list containing 'all possible phrases' may be loaded to the end of the edit field. This list then serves as a source of information during the writing process by giving synonyms, technical terms, etc.

    It is easy to create such lists on different topics by pasting them from websites, for example. The list used in this example originates from Birds of Sweden.

  • See also: Searching for words etc.

    Graphical rotation in factor analysis

    Example #38 (Start the demo by clicking the picture!)

    This demo in YouTube

    This feature has been available already in SURVO 76.

    The traditional graphical rotation is described e.g. in
    Ledyard Tucker and Robert MacCallum: Exploratory Factor Analysis, Chapter 10.

    The principles of Cosine rotation and Transformation analysis were introduced in Yrjö Ahmavaara's dissertation
    Transformation Analysis of Factorial Data, Helsinki Ann. Acad. Sci. Fenn., B 88, 2, 1954.

    The current algorithm for Cosine rotation was created in 1961 and described in Matrix computations in Survo.

  • See also: Graphical rotation, Cosine rotation, Transformation analysis, and Linear dependencies in a matrix

    Pythagorean points on a green meadow

    Example #39 (Start the demo by clicking the picture!)

    This demo in YouTube

    A comprehensive documentation is given in my paper
    Visualization and characterization of Pythagorean triples.
    Interactive 'graphical' identifying of Pythagorean triples is available as a sucro /P_TRIPLE.

  • See also: Scatter diagrams, POINT specification, and POINT_COLOR specification

    Unbiased coin-flips with a biased coin

    Example #40 (Start the demo by clicking the picture!)

    It is surprising that a biased coin (even when its probabilities, p for HEADS and 1-p for TAILS, are unknown) may be used like a fair coin by observing coin-flips in pairs. The expected number of flips of the biased coin needed for extracting one unbiased coin-flip is 1/(p*(1-p)), which attains its minimum value 4 for p=1/2, i.e. when the coin actually is fair. Thus, on average, four or more flips are needed if we do not trust the coin to be fair.
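    The pairing idea can be sketched in Python as follows. This is a minimal illustration, not the Survo commands used in the demo: it assumes the classical von Neumann rule (a pair of unequal flips yields its first flip, equal pairs are discarded), and the function names and the seed are made up for the example.

```python
import random

def biased_flips(n, p, seed=1):
    """Simulate n flips; each flip is 1 with probability p, else 0."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(n)]

def fair_bits(flips):
    """Read flips in non-overlapping pairs; keep the first flip of each
    unequal pair (10 -> 1, 01 -> 0) and discard the pairs 00 and 11."""
    out = []
    for i in range(0, len(flips) - 1, 2):
        a, b = flips[i], flips[i + 1]
        if a != b:
            out.append(a)
    return out

if __name__ == "__main__":
    flips = biased_flips(1200, 1 / 3)
    bits = fair_bits(flips)
    print(len(bits), sum(bits) / len(bits))
```

    Both unequal pairs have the same probability p*(1-p), which is why the kept flips are unbiased. For p=1/3 the formula above gives 1/((1/3)*(2/3)) = 4.5 flips per unbiased flip, so 1200 flips should yield roughly 267 of them.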

    The original data of 1200 flips with a biased coin (p=1/3) was generated by Survo as follows:

    FILE CREATE COIN,1,1
    FIELDS:
    1 N 1 X
    END
    
    FILE INIT COIN,1200
    
    p=1/3
    MATRIX P ///
    0 p
    1 1-p
    
    MAT SAVE P
    
    RND=URAND(20106)
    TRANSFORM COIN BY #DISTR(P)
    MAT SAVE DATA COIN AS COIN2
    MAT COIN3=VEC(COIN2,20)      / *COIN3~VEC(COIN2) 20*60
    
    MAT LOAD COIN3,##,CUR+1
    MATRIX COIN3
    VEC(COIN2)
    ///       1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 ...
      1       1  0  0  1  1  1  1  1  0  1  1  1  0  0  1  1  1 ...
      2       0  1  1  1  0  1  0  1  1  1  1  1  0  1  1  1  0 ...
      3       1  1  1  1  0  1  1  1  1  0  1  1  1  1  1  1  1 ...
      4       1  0  1  1  1  1  1  0  1  1  1  0  1  0  1  1  1 ...
      5       1  1  1  1  1  1  1  1  0  1  1  1  1  0  1  1  1 ...
      .       .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  . ...
    
  • See also: REPLACE command, TRIM command, MINSTAT operation, RUNTEST operation,
    and Fair Coin (in Wikipedia)

    Omega coin tossing

    Example #41 (Start the demo by clicking the picture!)

    This approach is based on the following facts:
    An integer can be decomposed into prime factors in only one way.
    There is strong number-theoretic evidence for the fact that for (large) integers the number of prime factors is even or odd with equal probabilities. This is also intuitively obvious.

    To give credence to the fact that the number of prime factors is even or odd with probability 1/2, I took a random sample of 10 million integers with 16 digits by the following Mathematica code

    a=10^15
    SeedRandom[1];
    t1=TimeUsed[];
    tab:=Table[PrimeOmega[RandomInteger[{a,a+8999999999999999}]],{n,1,10^7}];
    Export["Sample.txt",tab,"Table"]
    TimeUsed[]-t1
    
    and converted the output file Sample.txt of 10^7 Omega values into a Survo data file. By the STAT program of Survo the following frequency distribution was obtained:
    Omega      f     %      *=65536 obs.
      1   277575   2.8 ****
      2  1053208  10.5 ****************
      3  1860960  18.6 ****************************
      4  2105565  21.1 ********************************
      5  1783889  17.8 ***************************
      6  1245149  12.5 ******************
      7   765283   7.7 ***********
      8   433642   4.3 ******
      9   232340   2.3 ***
     10   120919   1.2 *
     11    60947   0.6 :
     12    30650   0.3 :
     13    15225   0.2 :
     14     7482   0.1 :
     15     3655   0.0 :
     16     1814   0.0 :
     17      876   0.0 :
     18      408   0.0 :
     19      217   0.0 :
     20      102   0.0 :
     21       49   0.0 :
     22       22   0.0 :
     23       15   0.0 :
     24        3   0.0 :
     25        2   0.0 :
     27        1   0.0 :
     28        1   0.0 :
     29        1   0.0 :
    
    The relative frequency of odd Omega values was then 0.5001034 (and 0.4998966 for even values). Since the standard error of these estimates is 0.00016, the deviation from 0.5 is less than this standard error.
    Thus "Omega coin tossing" works like a fair coin.
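    For small integers the same idea can be tried without Mathematica. The following Python sketch counts prime factors with multiplicity by plain trial division (far too slow for 16-digit numbers, so it is only an illustration); the function names are made up for the example.

```python
import math

def big_omega(n):
    """Number of prime factors of n counted with multiplicity."""
    count = 0
    d = 2
    while d * d <= n:
        while n % d == 0:
            n //= d
            count += 1
        d += 1
    if n > 1:          # the remaining cofactor is a prime
        count += 1
    return count

def omega_coin(n):
    """1 if n has an odd number of prime factors, 0 if even."""
    return big_omega(n) % 2

def standard_error(sample_size, p=0.5):
    """Standard error of a relative frequency under probability p."""
    return math.sqrt(p * (1 - p) / sample_size)

if __name__ == "__main__":
    sample = range(10**6, 10**6 + 2000)
    print(sum(omega_coin(n) for n in sample) / len(sample))
    print(standard_error(10**7))
```

    The last line reproduces the standard error 0.00016 quoted above for a sample of 10^7 values.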

    In number theory the function lambda(n)=(-1)^Omega(n), taking values -1 and +1 "randomly", is known as the Liouville function, and it corresponds completely to "Omega coin tossing". The sums of lambda(n) have been studied experimentally in Sign changes in sums of the Liouville function by Borwein, Ferguson, and Mossinghoff (2010).


    Rational approximations by listening

    Example #42 (Start the demo by clicking the picture!)

    Musicians (who often claim that they know nothing about mathematics) are clever in recognizing intervals and chords even when the sound is not pure. Thus when hearing an interval x as a major third they unconsciously find that 5/4 is the best approximation for x.

    The VAR and PLAY commands of Survo give an opportunity to create and play sound files (in WAV format). For example, the neutral third and the pure intervals close to it can be listened to as follows:

    s(X):=sin(X*ORDER)
    
    f1=11/9 1.2222222222222...          Interval_11_9
    f2=sqrt(5/4*6/5) 1.2247448713916... Neutral_third
    f3=5/4 1.25                         Major_third
    f=0.2 'basic frequency'
    
    FILE MAKE Test,1,24000,X,2            / creates data file Test.
    VAR X=10000*(s(f)+s(f*f1)) TO Test    / computes the wave form
    PLAY DATA Test,X / WAV=Interval_11_9  / converts the wave form into WAV
    VAR X=10000*(s(f)+s(f*f2)) TO Test
    PLAY DATA Test,X / WAV=Neutral_third
    VAR X=10000*(s(f)+s(f3*f)) TO Test
    PLAY DATA Test,X / WAV=Major_third
    
    PLAY SOUNDS                           / plays sound files created
    Interval_11_9
    Neutral_third
    Major_third
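
    Outside Survo, the same wave forms can be sketched with the Python standard library. This is only an approximation of what VAR and PLAY do: the sample rate of 8000 Hz and the 16-bit mono format are assumptions of this example, ORDER is taken to run from 1 to 24000, and the helper names are hypothetical.

```python
import io
import math
import struct
import wave

def two_tone(ratio, n=24000, f=0.2, amp=10000):
    """Samples amp*(sin(f*i) + sin(f*ratio*i)) for i = 1..n,
    mirroring X=10000*(s(f)+s(f*f1)) with s(X):=sin(X*ORDER)."""
    return [round(amp * (math.sin(f * i) + math.sin(f * ratio * i)))
            for i in range(1, n + 1)]

def wav_bytes(samples, rate=8000):
    """Pack samples as 16-bit mono WAV data (rate is an assumption)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()

if __name__ == "__main__":
    intervals = {
        "Interval_11_9": 11 / 9,
        "Neutral_third": math.sqrt(5 / 4 * 6 / 5),
        "Major_third": 5 / 4,
    }
    for name, ratio in intervals.items():
        with open(name + ".wav", "wb") as out:
            out.write(wav_bytes(two_tone(ratio)))
```

    The amplitude 10000 per tone keeps the sum within the 16-bit range, as in the Survo scheme.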
    
    Slightly off the topic:
    The neutral third sqrt(3/2) is recognized correctly from its
    approximate value 1.2247448713916 by the INTREL command
    INTREL 1.2247448713916 / giving
    X=1.2247448713916 is a root of 2*X^2-3=0
    
    
    The dissonance function diss(c,x,m,n) is plotted for various c values as a family of curves in User Guide (1992) on page 330.
    Other demos related to this topic:
    Dissonance functions
    Tuning roots of algebraic equations by "listening"
  • See also: Conversions, PLAY commands, INTREL command, and Continued fractions

    Color changing

    Example #43 (Start the demo by clicking the picture!)

    This demo in YouTube

    The graph is a modification of an old Survo application presented in User Guide (1992) on page 334. This 'animated' version is inspired by Life A User's Manual (La Vie mode d'emploi) by Georges Perec.

    In the plotting scheme the jumps are triggered by int() functions in X(t) and Y(t) expressions.

    HEADER=  HOME=0,0 WHOME=0,0 WSIZE=652,381 WSTYLE=0 MODE=652,381
    XSCALE=-14,14 YSCALE=-10,10 FRAME=3
    SIZE=652,381 XDIV=0,1,0 YDIV=0,1,0
    a=53 t=0,50*pi,pi/150 pi=3.14159
    GPLOT X(t)=12*sin(int(0.35*t))+0.8*sin(a*t)*sin(t)+0*A,
          Y(t)=8*sin(int(0.3*t))+0.8*sin(a*t)*cos(t)
    A=0(1)4
    PALETTE=BGY3 COLORS=[background(8)] COLOR_CHANGE=A,3
    SLOW=100 slowing down the plotting speed
    
  • See also: Families of curves and COLOR_CHANGE specification

    Permutation test

    Example #44 (Start the demo by clicking the picture!)

    This is an abbreviated version of a Finnish teaching program made in 1998.

    I wanted to show how simple the 'theory' behind randomization tests is when compared to that of the t test, for example, and how a time-consuming permutation test can be replaced by its randomized alternative.

    The COMB operation and the Survo matrix interpreter are the key components in this experiment. Also a few other special features of Survo are applied.
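
    The contrast between the full permutation test and its randomized alternative can be sketched in Python. This is not the Survo setup of the demo: the test statistic (absolute difference of group means), the function names, and the example data are assumptions chosen for illustration.

```python
import random
from itertools import combinations

def mean(values):
    return sum(values) / len(values)

def exact_permutation_test(x, y):
    """Two-sided p-value from all ways of splitting the pooled data."""
    pooled = list(x) + list(y)
    observed = abs(mean(x) - mean(y))
    total_sum = sum(pooled)
    hits = total = 0
    for idx in combinations(range(len(pooled)), len(x)):
        sx = sum(pooled[i] for i in idx)
        stat = abs(sx / len(x) - (total_sum - sx) / len(y))
        hits += stat >= observed - 1e-12   # tolerance for rounding
        total += 1
    return hits / total

def randomized_permutation_test(x, y, rounds=10000, seed=1):
    """Same p-value estimated from random reshufflings only."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    observed = abs(mean(x) - mean(y))
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        stat = abs(mean(pooled[:len(x)]) - mean(pooled[len(x):]))
        hits += stat >= observed - 1e-12
    return hits / rounds

if __name__ == "__main__":
    a, b = [1, 2, 3], [4, 5, 6]
    print(exact_permutation_test(a, b))
    print(randomized_permutation_test(a, b))
```

    The exact version enumerates all splits of the pooled data, which grows quickly with the sample sizes; the randomized version keeps the work fixed at the chosen number of rounds.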

  • See also: Permutation tests

    HH - HT game (analysis)

    Example #45 (Start the demo by clicking the picture!)

    This demo in YouTube

    This is an abbreviated version of a Finnish teaching program made in 1998. It is shown how the winning probabilities and expected values can be derived simply by considering them conditionally with respect to the first pair of coin-flips. The probabilities are found by solving a system of four linear equations.

    This example is continued by a simulation experiment HH - HT game (simulation)

  • See also: MAT SOLVE command

    HH - HT game (simulation)

    Example #46 (Start the demo by clicking the picture!)

    This demo in YouTube

    This is an abbreviated version of a Finnish teaching program made in 1998 and continuation of HH - HT game (analysis).

  • See also: Special forms of VARSTAT

    Solving a Survo puzzle

    Example #47 (Start the demo by clicking the picture!)

    I presented Survo Puzzle in 2006. More information is available on the home page.

    The puzzle solved here has a moderate degree of difficulty (400). It was selected randomly by using a Java applet. By clicking the applet and entering the serial number #352-23824, this Survo puzzle can be solved more easily by the Swapping method. (This is shown in another demo.) But when using this technique the uniqueness of the solution cannot be confirmed.
    If your browser does not support Java, a corresponding Javascript version is available.

    Survo supports solving of these puzzles in many ways. Besides the COMB operation also Editorial computing and Touch mode are useful tools. The editorial interface of Survo is also suitable for general book-keeping during the solving process.


    Lines going through 3 points in a 9x9 grid

    Example #48 (Start the demo by clicking the picture!)

    This demo in YouTube

    This is an illustration related to my paper
    On lines through a given number of points in a rectangular grid of points.

    On the cover page of that paper a picture of all L(16,4)=548 lines connecting exactly four points in a 16x16 grid is presented.

    Although efficient computing of the numbers L(n,j), i.e. the number of lines going through exactly j points in an nxn grid of points, is not trivial, making a list of these lines is a still more demanding task for large n values. However, such a list for small n is easy to generate by brute force on current computers. Thus the Survo program module GRIDP simply starts from all pairs of points in the grid and sorts out the required lines.
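
    The brute-force idea can be sketched in Python. This is not the GRIDP module itself, only an illustration under the assumption that L(n,j) counts lines containing exactly j grid points; the function name and the line representation are made up for the example.

```python
from collections import defaultdict
from itertools import combinations
from math import gcd

def lines_through_exactly(n, j):
    """Count lines containing exactly j points of an n x n grid."""
    points = [(x, y) for x in range(n) for y in range(n)]
    on_line = defaultdict(set)
    for (x1, y1), (x2, y2) in combinations(points, 2):
        # Line a*x + b*y = c through both points, in a canonical form
        a, b = y2 - y1, x1 - x2
        g = gcd(a, b)
        a, b = a // g, b // g
        if a < 0 or (a == 0 and b < 0):
            a, b = -a, -b
        c = a * x1 + b * y1
        on_line[(a, b, c)].update([(x1, y1), (x2, y2)])
    return sum(1 for pts in on_line.values() if len(pts) == j)

if __name__ == "__main__":
    print(lines_through_exactly(9, 3))
```

    Every pair of points is mapped to a normalized triple (a,b,c), so pairs lying on the same line collect their points in the same set.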

    The graph is a typical example of how in Survo complicated pictures are compiled of several overlaying parts, here starting from a black background, then drawing the line segments, and finally setting the points as small 'hollow' circles.

    A related example

  • See also: Families of curves (using data values as varying parameters) and
    Combining Survo PostScript files

    Solving a Survo puzzle by the swapping method

    Example #49 (Start the demo by clicking the picture!)

    This demo in YouTube

    When a Survo puzzle is solved by the swapping method, there is no guarantee that the solution found is the only possible one. The same puzzle is solved systematically in another demo, which shows at the same time that the solution is unique.

    If this puzzle is given as an open Survo puzzle without any fixed numbers in the form

        A  B  C  D  E
     1  *  *  *  *  * 41
     2  *  *  *  *  * 28
     3  *  *  *  *  * 51
       34  8 13 28 37
    
    it also has another solution, which is obtained by three swaps. Try to find that solution by going to http://www.survo.fi/swap/puzzles, clicking the game board, and typing #352-23824 ENTER

    If your browser does not support Java, a corresponding Javascript version is available.

  • See also: Home page of Survo puzzles

    Prime numbers listed by a sucro

    Example #50 (Start the demo by clicking the picture!)

    The prime numbers are found by using a variation of the 'trial division' method:

    Given a number n, one divides n by all numbers m less than or equal to the square root of that number. If any of the divisions come out as an integer, then the original number is not a prime. Otherwise, it is a prime.

    After numbers 2, 3, and 5 are listed, both n values (Wnumber in PRIMES) and m values (Wfactor) are selected so that they are not divisible by 2 or 3.
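
    The same variation of trial division can be sketched in Python. This is not the sucro PRIMES, only an illustration of the selection rule: candidates and divisors advance by steps alternating between 2 and 4, which skips all multiples of 2 and 3. The function names are made up for the example.

```python
def is_prime(n):
    """Trial division by 2, 3 and then by 5, 7, 11, 13, 17, ..."""
    if n < 2:
        return False
    if n < 4:
        return True
    if n % 2 == 0 or n % 3 == 0:
        return False
    m, step = 5, 2
    while m * m <= n:
        if n % m == 0:
            return False
        m += step
        step = 6 - step        # step alternates 2, 4, 2, 4, ...
    return True

def primes_up_to(limit):
    """List primes <= limit; 2, 3 and 5 are listed first."""
    primes = [p for p in (2, 3, 5) if p <= limit]
    n, step = 7, 4
    while n <= limit:
        if is_prime(n):
            primes.append(n)
        n += step
        step = 6 - step        # step alternates 4, 2, 4, 2, ...
    return primes

if __name__ == "__main__":
    print(primes_up_to(100))
```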

    A sucro program cannot be efficient in purely numerical problems since all objects processed by a sucro are represented as strings of characters. The 'values' of variables are saved in a 'sucro memory' which is simply a string.
    For example, at the end of the current application this string is


    5000@5003@71@29@4@2@5041@
    giving values
    WN=5000
    Wnumber=5003 (first integer exceeding 5000: 4999+Wi)
    Wdivisor=71
    Wremainder=29 (last accepted prime 4999 mod 71 is 29)
    Wi=4 (oscillating between 2 and 4)
    Wj=2 (similarly)
    Wsquare=5041 (71^2=5041)

    It is evident that repeated conversions of strings to numerical values and vice versa slow down the computation.

    In typical applications of sucros, like teaching programs and demos (like these GIF animations) and combining several Survo operations, this feature is unimportant. Although Survo program modules are written in C, many system routines are sucros.

    A general description of the sucro language is given in User Guide (1992) chapter 12 (pages 399 - 443).

  • See also: Finding primes by the Sieve of Eratosthenes and Prime numbers

    Ulam spiral in color

    Example #51 (Start the demo by clicking the picture!)

    Background information about the Ulam spiral can be found in Wikipedia, for example.

    When making the spiral the main task is to map values of n to x,y coordinates.
    I derived the formulas
    x(n)=x(n-1)+sin(mod(int(sqrt(4*(n-2)+1)),4)*pi/2)
    y(n)=y(n-1)-cos(mod(int(sqrt(4*(n-2)+1)),4)*pi/2)
    by observing that the turning points of the spiral may be described in this way

               12 11 11 11 11 11 11
               12  8  7  7  7  7 10  .
               12  8  4  3  3  6 10  .
               12  8  4  1  2  6 10 14
               12  8  5  5  5  6 10 14
               12  9  9  9  9  9 10 14
               13 13 13 13 13 13 13 14
    
    giving an integer sequence

    1,2,3,3,4,4,5,5,5,6,6,6,7,7,7,7,8,8,8,8,9,9,9,9,9,...
    with a general term a(n)=int(sqrt(4n+1)), n=0,1,2,...

    This is verified easily by observing that a(n) grows exactly on values n=k^2 and n=k(k+1), k=0,1,2,... and then the gaps between growing points are 1,1,2,2,3,3,4,4,... leading to the sequence in question.
    The increments in x coordinates are 1,0,-1,-1,0,0,1,1,1,0,0,0, following the same pattern of value changes as a(n) but with cyclic variation 1,0,-1,0 of length 4. Therefore the increment in x values can be expressed as sin(mod(int(sqrt(4*(n-2)+1)),4)*pi/2),
    thus as a composition of four 'elementary' functions.
    This is a slight overuse of trigonometric functions but convenient for the VAR operation of Survo. The increments of the y coordinates follow the same pattern when sin is replaced by -cos.
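
    The recursion can be checked with a short Python sketch. It assumes the starting point x(1)=y(1)=0, which is not stated above, and replaces the sin and -cos expressions by their exact values at multiples of pi/2; the names are made up for the example.

```python
from math import isqrt

DX = (0, 1, 0, -1)    # sin(k*pi/2) for k = 0, 1, 2, 3
DY = (-1, 0, 1, 0)    # -cos(k*pi/2) for k = 0, 1, 2, 3

def ulam_coordinates(count):
    """Coordinates of n = 1..count on the spiral, with 1 at the origin."""
    x = y = 0
    coords = [(x, y)]
    for n in range(2, count + 1):
        k = isqrt(4 * (n - 2) + 1) % 4   # mod(int(sqrt(4*(n-2)+1)),4)
        x += DX[k]
        y += DY[k]
        coords.append((x, y))
    return coords

if __name__ == "__main__":
    for n, xy in enumerate(ulam_coordinates(12), start=1):
        print(n, xy)
```

    The x increments printed for n=2,...,12 are 1,0,-1,-1,0,0,1,1,1,0,0, in agreement with the pattern given above.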


    Testing the correlation coefficient

    Example #52 (Start the demo by clicking the picture!)

    This is a typical example of how by means of editorial computing and text processing a general computation scheme is created for a particular application.

    The Fisher zeta transformation makes the correlation coefficient approximately normally distributed with standard deviation 1/sqrt(n-3). The cumulative normal distribution function is available as a library function N.F(m,s^2,x).
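
    The computation can be sketched in Python as follows. This is not the Survo template of the demo: the two-sided test of a hypothetical value rho0, the function names, and the use of the error function in place of the library function N.F are assumptions of this example. Note that N.F takes the variance s^2 as its second argument, whereas the helper below takes the standard deviation.

```python
import math

def normal_cdf(x, mean=0.0, sd=1.0):
    """Cumulative normal distribution function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def fisher_z(r):
    """Fisher transformation 0.5*log((1+r)/(1-r))."""
    return math.atanh(r)

def correlation_test(r, n, rho0=0.0):
    """Standardized statistic and two-sided p-value for H0: rho = rho0."""
    z = (fisher_z(r) - fisher_z(rho0)) * math.sqrt(n - 3)
    p = 2.0 * (1.0 - normal_cdf(abs(z)))
    return z, p

if __name__ == "__main__":
    print(correlation_test(0.5, 28))
```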

    If you like to use this template in your own Survo,
    please copy/paste it from here.

  • See also: Functions defined in the edit field and Library functions

    Age pyramid (Finland 2009)

    Example #53 (Start the demo by clicking the picture!)

    The age pyramid (TYPE=PYRAMID) is one of the types of bar charts in Survo.

  • See also: Types of bar charts

    Letter frequencies in Shakespeare's Sonnets

    Example #54 (Start the demo by clicking the picture!)

    Shakespeare's 154 Sonnets were imported from
    http://www.shakespeares-sonnets.com/allsonn.htm.

    The same textual data is studied in demos Shakespeare's Sonnets as a Markov chain and
    The most common words in Shakespeare's Sonnets.

    It is shown how Survo can deal with literal and partially disorganized data. In many cases such data sets have to be scanned, filtered, and purified from unsystematic features. In this example, various forms of the LINEDEL command of Survo were useful.

    The letter frequencies in Shakespeare's Sonnets counted in this demo seem to deviate significantly from current English at least for some common letters like a,c,h,t as one can see in the following table.

    Letter  Sonnets English Difference
              %       %
    a       6.8     8.2    -1.4
    b       1.7     1.5     0.2
    c       1.8     2.8    -1.0
    d       3.8     4.3    -0.5
    e      12.5    12.7    -0.2
    f       2.3     2.2     0.1
    g       1.9     2.0    -0.1
    h       7.0     6.1     0.9
    i       6.4     7.0    -0.6
    j       0.1     0.2    -0.1
    k       0.8     0.8     0.0
    l       4.2     4.0     0.2
    m       2.9     2.4     0.5
    n       6.2     6.7    -0.5
    o       7.8     7.5     0.3
    p       1.4     1.9    -0.5
    q       0.1     0.1     0.0
    r       5.7     6.0    -0.3
    s       6.8     6.3     0.5
    t       9.9     9.1     0.8
    u       3.2     2.8     0.4
    v       1.3     1.0     0.3
    w       2.6     2.4     0.2
    x       0.1     0.2    -0.1
    y       2.7     2.0     0.7
    z       0.0     0.1    -0.1
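
    A percentage column of this kind can be computed for any text with a few lines of Python. This sketch is not the Survo procedure of the demo; it simply counts the letters a-z regardless of case and ignores all other characters, and the function name is made up for the example.

```python
import string
from collections import Counter

def letter_percentages(text):
    """Percentage of each letter a-z among all letters of the text."""
    counts = Counter(ch for ch in text.lower()
                     if ch in string.ascii_lowercase)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ch: 100 * counts[ch] / total for ch in string.ascii_lowercase}

if __name__ == "__main__":
    line = "Shall I compare thee to a summer's day?"
    for ch, pct in letter_percentages(line).items():
        if pct > 0:
            print(ch, round(pct, 1))
```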
    

    Shakespeare's Sonnets as a Markov chain

    Example #55 (Start the demo by clicking the picture!)

    Shakespeare's 154 Sonnets were imported from http://www.shakespeares-sonnets.com/allsonn.htm.
    The same textual data is studied in demos Letter frequencies in Shakespeare's Sonnets and
    The most common words in Shakespeare's Sonnets.

    The simulations were made according to a technique presented by Claude Shannon in
    A Mathematical Theory of Communication (1948).
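
    The principle of such a simulation can be sketched in Python. This is only an illustration in the spirit of Shannon's letter-level approximations, not the Survo procedure of the demo: the order of the chain, the function names, and the stopping rule at a dead end are assumptions of this example.

```python
import random
from collections import defaultdict

def build_chain(text, order=2):
    """Map each string of `order` characters to its observed followers."""
    chain = defaultdict(list)
    for i in range(len(text) - order):
        chain[text[i:i + order]].append(text[i + order])
    return chain

def generate(text, length=80, order=2, seed=0):
    """Random text in which every step follows a transition seen in `text`."""
    rng = random.Random(seed)
    chain = build_chain(text, order)
    out = text[:order]
    while len(out) < length:
        followers = chain.get(out[-order:])
        if not followers:          # state seen only at the very end
            break
        out += rng.choice(followers)
    return out

if __name__ == "__main__":
    source = ("shall i compare thee to a summer's day "
              "thou art more lovely and more temperate ")
    print(generate(source, length=60))
```

    Because followers are stored with repetition, drawing one at random reproduces the transition frequencies of the source text.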

  • See also: Operating with Markov chains in Survo and Markov chains in general


    Linear regression analysis by orthogonalization

    Example #56 (Start the demo by clicking the picture!)

    The matrix formulas used in this demo are derived, for example, in Survo User Guide pp. 377-378. The REGDIAG program of Survo uses the same algorithm based on orthogonalization of the regressor matrix.

    The automatic labelling of matrix rows and columns has been possible since the matrix interpreter of SURVO 76 (in 1977). It is important to notice the rules for labels in derived matrices. For example, labels are transposed not only when a matrix is transposed but also when it is inverted, etc. A simple label 'algebra' ensures that in the matrix of regression coefficients the names of regressors appear as row labels and the names of regressands as column labels.

  • See also: Matrix operations in Survo and The same regression analysis by LINREG


    Most common words in Shakespeare's Sonnets

    Example #57 (Start the demo by clicking the picture!)

    Shakespeare's 154 Sonnets were imported from http://www.shakespeares-sonnets.com/allsonn.htm.
    The same textual data is studied in demos Letter frequencies in Shakespeare's Sonnets and
    Shakespeare's Sonnets as a Markov chain.

    The most important tools were the WORDS, STAT, and SORT commands.

    The order of the most common words differs from that of common English for obvious reasons.


    Discriminant analysis of Iris flower data set

    Example #58 (Start the demo by clicking the picture!)

    Multiple discriminant analysis is performed in Survo by computing covariance structures (correlations, means and standard deviations) for each group of observations by the CORR operation. Then the actual analysis takes place using these results by the sucro command /DISCRI. The computations are made automatically by the MAT commands (matrix interpreter) of Survo. /DISCRI saves the results as matrix files and lists suitable commands in the edit field for retrieving them.

    One of those commands is
    MAT LOAD DISCRXR.M,END+2 / Correlations variables/discriminators
    giving in this case

    MATRIX DISCRXR.M
    Correlations_between_variables_and_discriminators
    ///        Discr1   Discr2
    Sepal_L  -0.79189 -0.21759
    Sepal_W   0.53076 -0.75799
    Petal_L  -0.98495 -0.04604
    Petal_W  -0.97281 -0.22290
    
    and shows that the dominant discriminator depends essentially on the petal size of the flower.

    The discriminant scores were computed by a LINCO command
    LINCO Iris,DISCRL.M(D1,D2)
    (as suggested by /DISCRI).

    The same data is studied in Cluster analysis of Iris flower data set by classifying the observations into three groups without any prior information about the species of flowers. It turns out that clustering according to Wilks' Lambda criterion is then identical to the classification obtained by reclassifying the original observations according to Mahalanobis distances after discriminant analysis.


    Cluster analysis of Iris flower data set

    Example #59 (Start the demo by clicking the picture!)

    Survo offers alternative means for making cluster analysis. In this case the best result was achieved by statistical clustering based on Wilks' Lambda criterion.

    Pekka Korhonen has presented an effective stepwise procedure for computation of lambda values in his doctoral thesis "A stepwise procedure for multivariate clustering", Computing Centre, University of Helsinki, Research Reports N:o 7 (1979).
    In Korhonen's research a pivot operation plays an essential part in a form presented earlier by Hannu Väliaho in his doctoral thesis "A synthetic approach to stepwise regression analysis", Comm.Phys.Math., vol.34, No.12, 91-132 (1969).

    In the CLUSTER program of Survo, the dual procedure of Korhonen's stepwise method is applied. I was the opponent at Korhonen's doctoral defence and then took on the task of checking his algorithms by implementing them in Survo.

    The same data is studied in Discriminant analysis of Iris flower data set


    F1 connections

    Example #60 (Start the demo by clicking the picture!)

    In gif-animated demos (and in particular in this one) the size of the 'window' limits possibilities of showing interplay between Survo and other programs. Only a hands-on approach can tell the whole truth. Recommended!

  • See also: Links and cross references in Survo and F1 connections


    "Rotated arrowheads"

    Example #61 (Start the demo by clicking the picture!)

    This demo in YouTube

    A new feature (valid in SURVO MM from ver. 3.21) for denoting points by arrowheads in scatter diagrams is presented. The new point types 21 (arrow) and 22 (filled arrow) are available, and the orientation of the arrows is selected by a variable, say A, giving the direction angle in degrees and determined by a code [rotation(A)] in the POINT specification.


    Influence curves for the correlation coefficient

    Example #62 (Start the demo by clicking the picture!)

    This demo in YouTube

    The plotting setup

    PLOT z(x,y)=abs(r*(1-w)+u*v)/w
      u=sqrt(n/(n*n-1))*(x-mx)/sx
      v=sqrt(n/(n*n-1))*(y-my)/sy
      w=sqrt((1+u*u)*(1+v*v))
    TYPE=CONTOUR SCREEN=NEG ZSCALING=20,0
    
    may be used for any two variables x,y in a Survo data (file).
    The formulas are derived in my note.
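
    The function z(x,y) can also be evaluated outside Survo. The Python sketch below assumes that sx and sy are sample standard deviations with divisor n-1; under that assumption z equals the absolute change in the correlation coefficient when one new observation (x,y) is added, which the example checks by direct recomputation. The function names and data are made up for the example.

```python
import math

def influence(x, y, n, mx, my, sx, sy, r):
    """z(x,y)=abs(r*(1-w)+u*v)/w as in the plotting setup."""
    k = math.sqrt(n / (n * n - 1))
    u = k * (x - mx) / sx
    v = k * (y - my) / sy
    w = math.sqrt((1 + u * u) * (1 + v * v))
    return abs(r * (1 - w) + u * v) / w

def summary(xs, ys):
    """n, means, standard deviations (divisor n-1) and correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)
    return n, mx, my, math.sqrt(sxx / (n - 1)), math.sqrt(syy / (n - 1)), r

if __name__ == "__main__":
    xs, ys = [1, 2, 3, 4, 5], [2, 1, 4, 3, 6]
    n, mx, my, sx, sy, r = summary(xs, ys)
    z = influence(10, 0, n, mx, my, sx, sy, r)
    r_new = summary(xs + [10], ys + [0])[5]
    print(z, abs(r_new - r))
```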

    I presented this example among others in my talk about Survo in Compstat 1992 (Neuchatel).
    This graph was used by the organizers of Compstat as a cover page (upside down!!)
    in the proceedings of the symposium. http://en.wikipedia.org/wiki/Computational_statistics

  • See also: Contour plots in Survo

    Colored texts in bar/pie charts

    Example #63 (Start the demo by clicking the picture!)

    In bar and pie charts, labels of variables can be written in the graph by a LABELS specification and values of variables by a VALUES specification. From the version 3.22 of SURVO MM these texts can be colored individually by using an extended form of the SHADING specification (here on line 8) referring to both fill colors and to text colors (separated by a slash /).

    The colors to be used are defined by COLOR(n) specifications telling the color components of each SHADING value n according to the CMYK color model.


    "Hello World!"

    Example #64 (Start the demo by clicking the picture!)

  • See also: Hello World in Wikipedia and Hello World collection


    Virtual keyboard

    Example #65 (Start the demo by clicking the picture!)


     

    Before SURVO MM (until year 2000) the mouse had practically no role while using Survo. Thereafter natural functions of the mouse were adopted. This example shows how almost everything may now be done without the keyboard, using only the mouse and a virtual keyboard. Needless to say, mouse-oriented use to such an extent is not very convenient.

    The user may edit soft buttons and also create new ones while using Survo. The default set of soft buttons is defined in the edit field
    <Survo>\U\SUR-SOFT.EDT specifying, for example, the main button line EXIT in the form:


     

    Page from SUR-SOFT.EDT

    Most of the soft buttons lead to activation of a Survo macro (sucro). For example, clicking the START button activates sucro /SURVO-START (see lines 60-61).
  • See also: Defining soft buttons

    Cycloid

    Example #66 (Start the demo by clicking the picture!)

    A cycloid is a curve defined by the path of a point on the edge of a circular wheel as the wheel rolls along a straight line.
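
    In the standard parametrization a wheel of radius r that has turned through the angle t carries the point to x=r*(t-sin(t)), y=r*(1-cos(t)). A minimal Python sketch (the function name is made up for the example):

```python
import math

def cycloid_point(t, r=1.0):
    """Point of the cycloid after the wheel has turned by angle t."""
    return r * (t - math.sin(t)), r * (1.0 - math.cos(t))

if __name__ == "__main__":
    for i in range(9):
        t = i * math.pi / 4
        print(cycloid_point(t))
```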

    The first Survo Editor (1979) was originally programmed for input and editing of musical manuscripts and for converting them into a printable form. The slurs (arched curves connecting a group of notes) were then plotted as slightly modified cycloids.

    See: The origin of Survo Editor

  • See also: Cycloid in Wikipedia

    Prime factors of numbers m^n-1

    Example #67 (Start the demo by clicking the picture!)

    This 67th demo is a tribute to Mr. Cole who found by numerical computations in 1903 that the Mersenne number M67=2^67-1 was not a prime number but a product of integers 193707721 and 761838257287. Finding of these factors was made easier e.g. by the fact that it was known beforehand that each potential factor has the form c*67+1 since 67 is a prime. In this case 193707721=2891160*67+1 and 761838257287=11370720258*67+1.

    It seems that, in general, numbers of type m^n-1 typically have many (and sometimes all) prime factors of the form c*n+1. By plain numerical calculations I have tried to study their abundance and found some systematic results reported in my paper. These results may have been proved already before. Thus if somebody knows about such proofs, please let me know.
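
    Small cases of this observation are easy to check in Python. The sketch below is not the sucro /MPN; it uses plain trial division, which is adequate only for small m and n, and the function names are made up for the example.

```python
def prime_factors(k):
    """Prime factors of k in increasing order, with repetition."""
    factors = []
    d = 2
    while d * d <= k:
        while k % d == 0:
            factors.append(d)
            k //= d
        d += 1
    if k > 1:
        factors.append(k)
    return factors

def factors_as_cn_plus_1(m, n):
    """Prime factors p of (m^n-1)/(m-1), each paired with c when
    p = c*n+1 and with None otherwise."""
    value = (m ** n - 1) // (m - 1)
    return [(p, (p - 1) // n if (p - 1) % n == 0 else None)
            for p in prime_factors(value)]

if __name__ == "__main__":
    # Cole's factorization of 2^67-1
    print(193707721 * 761838257287 == 2 ** 67 - 1)
    for m in range(2, 10):
        print(m, factors_as_cn_plus_1(m, 7))
```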

    Editorial computing in Survo makes inventing and testing of this kind of numerical hypotheses easy and comfortable according to the style used in this demo. Making of suitable sucros (Survo macros) is also helpful. The most important sucro /MPN used in this connection has the following listing in a Survo edit field:

    *TUTSAVE MPN
    / /MPN m_max,n                              / SM 4.12.2010
    / assuming that n is a prime number
    / computes the prime factors of numbers (m^n-1)/(m-1) for
    / m=2,3,...,m_max
    / and represents them in the form c*n+1.
    / If n divides m-1, the smallest factor is n.
    /
    / See: ../papers/MustonenPrimes.pdf
    /
    / def Wmax=W1 Wn=W2 Wm=W3 Wprod=W4 Wc=W5 Wfactor=W6 Wpow=W7
    /
    *{init}{tempo 0}{disp off}{Wm=2}{R}
    *int(exp(log(9000000000000000)/{print Wn}))={act}{l} {save word Wc}
    *{line start}{erase}{u}{disp on}{tempo 2}
    - if Wmax <= Wc then goto A
    *{Wmax=Wc}
    + A: {R}
    *({print Wm}^{print Wn}-1)/({print Wm}-1)={act}{l} {line end}
    *(10:factors)={act}
    / Remove text "(10:factors)=":
    *{l13}{del12}{r}{ref set 1}
    /
    *{save line Wprod}{erase}{R}{print Wprod}{line start}
    / Replace *'s by spaces:
    + B: {r}{save char Wc}
    - if Wc '=' {sp} then goto C
    - if Wc '<>' * then goto B
    / Replace * by a space:
    * {goto B}
    + C:
    / Each factor to a separate line:
    *{home}{u}{ins line}TRIM 1{act}{del line}{home}
    *{save char Wc}
    - if Wc '<>' {sp} then goto D
    *{del line}
    /
    + D: {save word Wfactor}{Wpow=1}
    - if Wfactor = 0 then goto G
    - if Wfactor > Wn then goto D1
    / Wfactor is n
    *{form}{goto F}
    + D1:
    / Search for ^
    + D2: {r}{save char Wc}
    - if Wc '=' ^ then goto D3
    - if Wc '<>' {sp} then goto D2 else goto D4
    + D3:  {save word Wpow}{home}{save word Wfactor}
    + D4: {line start}{erase}({print Wfactor}-1)/{print Wn}={act}
    /
    *{l} {save word Wc}{line start}{erase}({print Wc}*{print Wn}+1)
    /
    *{line start}{save word Wfactor}
    + F: {ref jump 1}{write Wfactor}
    - if Wpow = 1 then goto F2
    *^{print Wpow}
    + F2: *{ref set 1}{R}{del line}{goto D}
    + G: {ref jump 1}{l}{del}
    /
    - if Wm = Wmax then goto E
    *{Wm=Wm+1}{goto A}
    + E: {end}
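
    For readers without Survo, the idea of /MPN can be sketched in Python. This is not a translation of the sucro: the function names prime_factors and mpn are hypothetical, and plain trial division is assumed in place of Survo's (10:factors) function, so the sketch is only practical for small m and n.

```python
def prime_factors(x):
    """Prime factorization of x as a list of (prime, exponent) pairs (trial division)."""
    factors = []
    p = 2
    while p * p <= x:
        if x % p == 0:
            e = 0
            while x % p == 0:
                x //= p
                e += 1
            factors.append((p, e))
        p += 1 if p == 2 else 2
    if x > 1:
        factors.append((x, 1))
    return factors


def mpn(m_max, n):
    """For m = 2..m_max, factor (m^n - 1)/(m - 1) and write factors p > n as (c*n+1).

    n is assumed to be a prime number, as in the sucro.
    """
    rows = {}
    for m in range(2, m_max + 1):
        value = (m**n - 1) // (m - 1)
        terms = []
        for p, e in prime_factors(value):
            if p > n and (p - 1) % n == 0:
                term = f"({(p - 1) // n}*{n}+1)"
            else:
                term = str(p)  # the factor n itself (or anything not of the form c*n+1)
            if e > 1:
                term += f"^{e}"
            terms.append(term)
        rows[m] = "*".join(terms)
    return rows


if __name__ == "__main__":
    for m, text in mpn(6, 5).items():
        print(m, text)
```

    With n=5 the case m=6 shows the special role of n: since 5 divides m-1=5, the quotient 1555 has the factor 5, and the remaining factor 311 is of the form 62*5+1.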
    
  • See also: Making sucros and Cunningham number

    Some properties of Magic Squares

    Example #68 (Start the demo by clicking the picture!)

    This demo was created by Kimmo Vehkalahti.

    Functions of the Survo matrix interpreter are shown in connection with Magic Squares.

  • See also: Matrix operations in Survo and Magic Square in Wikipedia

    Some further properties of Magic Squares

    Example #69 (Start the demo by clicking the picture!)

    This demo was created by Kimmo Vehkalahti.

    Functions of the Survo matrix interpreter are shown in connection with Magic Squares.

  • See also: Matrix operations in Survo and Magic Square in Wikipedia

    Solving linear equations

    Example #70 (Start the demo by clicking the picture!)

    A system of linear equations

    X1+X2=27
    X2+X3=32
    X3+X4=32
    X1+X3=25
    
    is represented as a matrix equation A*X=B by saving matrices
    MATRIX A
    ///  X1 X2 X3 X4
    r12   1  1  0  0
    r23   0  1  1  0
    r34   0  0  1  1
    r13   1  0  1  0
    
    MATRIX B
    /// freq
    r12   27
    r23   32
    r34   32
    r13   25
    
    and solved by using the Survo matrix interpreter.
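
    As a cross-check outside Survo, the same system can be solved with a few lines of Python. The sketch below is not the Survo matrix interpreter; it assumes a plain Gauss-Jordan elimination with exact fractions, and the function name solve is hypothetical.

```python
from fractions import Fraction


def solve(a, b):
    """Solve A*X = B by Gauss-Jordan elimination with exact rational arithmetic.

    Raises StopIteration if A is singular (no nonzero pivot is found).
    """
    n = len(a)
    # Augmented matrix [A | B]
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[n] for row in m]


A = [[1, 1, 0, 0],   # r12: X1+X2
     [0, 1, 1, 0],   # r23: X2+X3
     [0, 0, 1, 1],   # r34: X3+X4
     [1, 0, 1, 0]]   # r13: X1+X3
B = [27, 32, 32, 25]

if __name__ == "__main__":
    print([int(x) for x in solve(A, B)])  # X1, X2, X3, X4
```

    For the matrices above this gives X1=10, X2=17, X3=15, X4=17, which can be verified by substitution into the four equations.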

  • See also: Matrix operations in Survo and System of linear equations

    Polynomial regression

    Example #71 (Start the demo by clicking the picture!)

    An essential tool for polynomial regression is the POWERS program. It computes powers of selected variables up to a given degree as new variables. Polynomial regression analysis is then carried out with standard tools like LINREG.

    In this example, polynomial regression is applied to determine the unknown coefficients of a certain polynomial of two variables from a sample of its values.

    Originally, this calculation was presented in my note (in Finnish) (2004) on computing the distribution of the city block distance D between two random points in a grid of N x N points.
    F(N,K)/N^4 is then the probability P[D=K] for K=1,2,...,N-1.
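
    The kind of sample such a regression needs can be produced by brute force. The Python sketch below assumes that F(N,K) counts ordered pairs of grid points at city block distance K (which is consistent with the divisor N^4). The function closed_form is a candidate polynomial derived for this sketch only, not quoted from the note; it is checked against the brute-force counts for small N.

```python
from itertools import product


def block_distance_counts(n):
    """counts[k] = number of ordered pairs of points of an n x n grid at city block distance k."""
    counts = [0] * (2 * (n - 1) + 1)
    points = list(product(range(n), repeat=2))
    for (x1, y1), (x2, y2) in product(points, repeat=2):
        counts[abs(x1 - x2) + abs(y1 - y2)] += 1
    return counts


def closed_form(n, k):
    """Candidate polynomial for 1 <= k <= n-1 (an assumption of this sketch)."""
    return 4 * k * n * (n - k) + 2 * k * (k * k - 1) // 3


if __name__ == "__main__":
    for n in range(2, 7):
        counts = block_distance_counts(n)
        assert sum(counts) == n**4          # every ordered pair is counted once
        for k in range(1, n):
            assert counts[k] == closed_form(n, k)
    print("closed form agrees with brute force for N = 2..6")
```

    Values obtained this way for several N and K are exactly the sort of data from which the coefficients of the polynomial can be estimated by regression on powers and products of N and K.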

  • See also: Products of powers and Linear regression analysis

    Marking columns by SHADOW SET

    Example #72 (Start the demo by clicking the picture!)

    Shadow characters play an essential role in the editorial interface of Survo. In fact, each line in the Survo Editor may have an optional line consisting of shadow characters. Their existence is indicated by various display effects. For example, '1' as a shadow character makes the corresponding (main) character red and when lines are printed, these red characters typically appear in boldface.

    Survo has certain tools for management of shadow lines. The SHADOW SET command is the most recent one (added at the end of 2011). It fills columns of tables with selected shadow characters, thus enhancing their appearance.

    In fact, this new SHADOW SET does exactly the same job for the shadow lines as the 'classic' SET command for ordinary edit lines.

  • See also: Shadow characters, SHADOW commands and SET command.

    Hunting quanta

    Example #73 (Start the demo by clicking the picture!)

    Consider a data set x_1, x_2,..., x_n where each observation is an approximate integral multiple of one of the positive numbers q_1, q_2,..., q_k where typically k=1 or another small integer.

    Our task is to estimate the values of quanta q_1, q_2,..., q_k on the condition that each of them exceeds a certain minimum value q_min.

    D.G.Kendall has in his paper Hunting Quanta (Philosophical Transactions of the Royal Society of London, Series A, Mathematical and Physical Sciences 276, 231-266) proposed using a "cosine quantogram" of the form

                             n
        phi(q) = sqrt(2/n)* SUM cos(2*pi*eps(i)/q)                (Kendall 1974)
                            i=1
    
    where 0<=eps(i)<q is the remainder when x_i is divided by q. The q-values of highest upward peaks of this function will be considered as candidates for quanta.
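
    The quantogram formula above is easy to evaluate directly. The following Python sketch is only an illustration and not the QUANTA module; the function names and the small data set are hypothetical.

```python
import math


def quantogram(xs, q):
    """Kendall's cosine quantogram phi(q) for observations xs (assumed nonnegative)."""
    n = len(xs)
    total = sum(math.cos(2.0 * math.pi * math.fmod(x, q) / q) for x in xs)
    return math.sqrt(2.0 / n) * total


def best_quantum(xs, q_min, q_max, steps):
    """Scan an equally spaced grid on [q_min, q_max] and return the q with the highest phi(q)."""
    grid = [q_min + (q_max - q_min) * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda q: quantogram(xs, q))


if __name__ == "__main__":
    # Hypothetical data: exact multiples of the quantum 2.5
    data = [2.5 * k for k in (1, 2, 3, 4, 6, 7, 9, 12)]
    print(quantogram(data, 2.5))           # equals sqrt(2*n) when every x is a multiple of q
    print(best_quantum(data, 2.0, 4.0, 200))
```

    Every fraction 2.5/2, 2.5/3, ... of the true quantum produces an equally high peak, which is one reason for the lower bound q_min.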

    My idea is that the quanta are estimated by a selective, conditional least squares method where the sum

                          n
       ss(q_1,...,q_k) = SUM min[g(x_i,q_1)^2,...,g(x_i,q_k)^2]       (SLS 2005)
                         i=1
    
    where g(x,q) is the least absolute remainder when x is divided by q, is to be minimized with respect to q_1,...,q_k on the condition that each q_i is at least q_min.
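
    The criterion ss can likewise be written down in a few lines. The Python sketch below evaluates it and minimizes it over a finite list of candidate values. The exhaustive search over candidates is an assumption made for brevity and says nothing about how the QUANTA module performs the minimization; all names and data are hypothetical.

```python
from itertools import combinations


def least_abs_remainder(x, q):
    """g(x,q): distance from x to the nearest integral multiple of q."""
    return abs(x - q * round(x / q))


def ss(xs, quanta):
    """Selective least squares criterion: each x contributes its smallest squared remainder."""
    return sum(min(least_abs_remainder(x, q) ** 2 for q in quanta) for x in xs)


def fit_quanta(xs, candidates, k):
    """Return the k-subset of candidates (all assumed >= q_min) minimizing ss."""
    return min(combinations(candidates, k), key=lambda qs: ss(xs, qs))


if __name__ == "__main__":
    # Hypothetical data: multiples of 3 mixed with multiples of 5
    data = [3, 6, 9, 12, 5, 10, 20, 25]
    candidates = [3.0, 3.5, 4.0, 4.5, 5.0]   # grid starting at q_min = 3
    print(fit_quanta(data, candidates, 2), ss(data, (3.0, 5.0)), ss(data, (3.0,)))
```

    Without the bound q_min the criterion would favour arbitrarily small quanta, since the remainder g(x,q) can never exceed q/2.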

    A more detailed description is found in my paper Hunting multiple quanta by selective least squares.

  • See also: QUANTA program module

    Survopoint display mode

    Example #74 (Start the demo by clicking the picture!)

    It was a rather simple task to implement this technique (available in SURVO MM from ver.3.35). The lines in the edit field of Survo are displayed by using pointer variables of the C language. Thus when switching edit lines, no line is actually moved; only their pointers are temporarily 'updated'.

    All Survopoint lines are indicated by the '~' (tilde) character in the control column. At the end of such a line a marking of type ~x must exist. x is any of the lowercase characters a,b,...,z. For any x, a line having x in its control column must exist in the same edit field (typically outside the Survo window) and this line tells how the corresponding Survopoint line is displayed.

    For example, in this demo the display mode of English proverbs (appearing in the latter part of the demo) is defined as follows:

    e 30 159 S
    *    The road to hell is paved with good intentions.
    *    He laughs best who laughs last.
    *    A smooth sea never made a skilled mariner.
    *    Truth is stranger than fiction.
    *    A friend to all is a friend to none.
    *    Be swift to hear, slow to speak.
    *    Knowledge in youth is wisdom in age.
     ...
    
    On the 'e' line, 30 indicates the rate of change (this Survopoint line is altered only once in every 30 consecutive refreshes of the display). 159 is the number of proverbs in the list and 'S' indicates a systematic change.

    Four-dimensional cube

    Example #75 (Start the demo by clicking the picture!)

    My lecture notes (1995) on multivariate statistical methods (in Finnish) include an appendix about multidimensional hyperspheres and hypercubes. The main purpose is to show properties of such abstract objects and give an idea how things become more complicated in higher dimensions but are still tractable.

    One of the illustrations is a graph of a 4-dimensional cube represented as 2-dimensional projections. This graph resembles a draftsman's plot (scatterplot matrix) of multivariate statistical data.

    Here this graph is generated by a series of Survo operations triggered by an /ACTIVATE sucro command. Below is a complete description (an extract from a Survo edit field) about how the graph has been created:


     65 *
     66 *The following sucro command activates all commands having a '+' in the
     67 *control column and thus the final graph will be automatically created:
     68 *
     69 */ACTIVATE +  (Activated commands are displayed here in red.)
     70 *It is possible to draw each 2-dimensional projection of a 4-dimensional
     71 *cube as a single line graph of edges since the degree of each vertex
     72 *is 4. Then there exists an Eulerian circuit where each edge is
     73 *traversed just once.
     74 *Consider the cube in a 4-dimensional space so that vertices have
     75 *coordinates (x_1,x_2,x_3,x_4) where each x_i is either 0 or 1.
     76 *Then the following matrix gives an Eulerian circuit in this
     77 *4-dimensional cube:
     78 *
     79 *MATRIX C4 ///
     80 *0 0 0 0
     81 *0 0 0 1
     82 *0 0 1 1
     83 *0 0 1 0
     84 *0 0 0 0
     85 *0 1 0 0
     86 *0 1 0 1
     87 *0 1 1 1
     88 *0 1 1 0
     89 *0 1 0 0
     90 *1 1 0 0
     91 *1 1 0 1
     92 *0 1 0 1
     93 *0 0 0 1
     94 *1 0 0 1
     95 *1 1 0 1
     96 *1 1 1 1
     97 *0 1 1 1
     98 *0 0 1 1
     99 *1 0 1 1
    100 *1 0 0 1
    101 *1 0 0 0
    102 *1 1 0 0
    103 *1 1 1 0
    104 *0 1 1 0
    105 *0 0 1 0
    106 *1 0 1 0
    107 *1 0 1 1
    108 *1 1 1 1
    109 *1 1 1 0
    110 *1 0 1 0
    111 *1 0 0 0
    112 *0 0 0 0
    113 *
    114 +MAT SAVE C4
    115 +MAT TRANSFORM C4 BY X#-0.5   / Centering (0,1) -> (-0.5,0.5)
    116 +MAT CLABELS "X" TO C4        / Column labels X1,X2,X3,X4
    117 *
    118 *The regular 2-dimensional projections of this hypercube are plain
    119 *squares and thus not very interesting.
    120 *
    121 *A better view is obtained by making an "arbitrary" 4-dimensional
    122 *rotation:
    123 *
    124 +MAT T=ZER(4,4)
    125 +MAT TRANSFORM T BY sin(31*I#*J#)  / "arbitrary" T
    126 *
    127 +MAT GRAM-SCHMIDT DECOMPOSITION OF T TO Q,R  / Orthogonalization of T
    128 +MAT K=C4*Q                  / Rotation of the hypercube by orthogonal Q
    129 +MAT CLABELS "dim" TO K      / Column labels dim1,dim2,dim3,dim4
    130 *
    131 *Combining the rotated and original cube into one matrix KB:
    132 *
    133 +MAT KB=ZER(33,8)
    134 +MAT KB(1,1)=K
    135 +MAT KB(1,5)=C4
    136 *
    137 *.......................................................................
    138 *Plotting all six 2-dimensional projections separately:
    139 *
    140 *SIZE=1000,1000 SCALE=-1,1 HEADER= XDIV=0,1,0 YDIV=0,1,0 FRAME=3
    141 *FRAMES=F F=0,0,1000,1000 PEN=[SwissB(30)]
    142 *XLABEL= YLABEL= LINE=[line_type(2)][line_width(0.2)],1 TEXTS=T
    143 *
    144 +PLOT KB.MAT,dim1,dim2 / DEVICE=PS,A12.PS T=1_-_2,750,50
    145 +PLOT KB.MAT,dim1,dim3 / DEVICE=PS,A13.PS T=1_-_3,750,50
    146 +PLOT KB.MAT,dim1,dim4 / DEVICE=PS,A14.PS T=1_-_4,750,50
    147 +PLOT KB.MAT,dim2,dim3 / DEVICE=PS,A23.PS T=2_-_3,750,50
    148 +PLOT KB.MAT,dim2,dim4 / DEVICE=PS,A24.PS T=2_-_4,750,50
    149 +PLOT KB.MAT,dim3,dim4 / DEVICE=PS,A34.PS T=3_-_4,750,50
    150 *
    151 *.......................................................................
    152 *Plotting two opposite 3-dimensional cubes in different colors (blue and
    153 *red):
    154 *
    155 *SIZE=1000,1000 SCALE=-1,1 HEADER= XDIV=0,1,0 YDIV=0,1,0 FRAME=0
    156 *XLABEL= YLABEL=
    157 *               *blue=[color(1,1,0,0)],1 *red=[color(0,1,1,0)],1
    158 +PLOT KB.MAT,dim1,dim2 / DEVICE=PS,B12.PS IND=X1,-0.5 LINE=*blue
    159 +PLOT KB.MAT,dim1,dim2 / DEVICE=PS,C12.PS IND=X1,0.5  LINE=*red
    160 +PLOT KB.MAT,dim1,dim3 / DEVICE=PS,B13.PS IND=X1,-0.5 LINE=*blue
    161 +PLOT KB.MAT,dim1,dim3 / DEVICE=PS,C13.PS IND=X1,0.5  LINE=*red
    162 +PLOT KB.MAT,dim1,dim4 / DEVICE=PS,B14.PS IND=X1,-0.5 LINE=*blue
    163 +PLOT KB.MAT,dim1,dim4 / DEVICE=PS,C14.PS IND=X1,0.5  LINE=*red
    164 +PLOT KB.MAT,dim2,dim3 / DEVICE=PS,B23.PS IND=X1,-0.5 LINE=*blue
    165 +PLOT KB.MAT,dim2,dim3 / DEVICE=PS,C23.PS IND=X1,0.5  LINE=*red
    166 +PLOT KB.MAT,dim2,dim4 / DEVICE=PS,B24.PS IND=X1,-0.5 LINE=*blue
    167 +PLOT KB.MAT,dim2,dim4 / DEVICE=PS,C24.PS IND=X1,0.5  LINE=*red
    168 +PLOT KB.MAT,dim3,dim4 / DEVICE=PS,B34.PS IND=X1,-0.5 LINE=*blue
    169 +PLOT KB.MAT,dim3,dim4 / DEVICE=PS,C34.PS IND=X1,0.5  LINE=*red
    170 *
    171 *Coloring the projections:
    172 +EPS JOIN K12,A12,B12,C12
    173 +EPS JOIN K13,A13,B13,C13
    174 +EPS JOIN K14,A14,B14,C14
    175 +EPS JOIN K23,A23,B23,C23
    176 +EPS JOIN K24,A24,B24,C24
    177 +EPS JOIN K34,A34,B34,C34
    178 *
    179 *Entering coordinates for projections in the final setup:
    180 *K12=K12,0,2000
    181 *K13=K13,0,1000   K23=K23,1000,1000
    182 *K14=K14          K24=K24,1000,0      K34=K34,2000,0
    183 *
    184 *Combining the parts:
    185 +EPS JOIN CUB4,K12,K13,K14,K23,K24,K34
    186 *
    187 *Creating the result
    188 *as a PostScript file:
    189 +PRINT CUR+1,X TO Cube4.PS
    190 % 1500
    191 - picture CUB4.PS,*,*,0.47,0.47
    192 X
    193 *Making and displaying
    194 *a PDF file Cube4.PDF:
    195 +/GS-PDF Cube4.PS
    196 *
    197 *The result is converted
    198 *and displayed here
    199 *as an EMF file.
    200 *
    

    The previous extract from an edit field is a typical example of templates created for advanced applications. It is one of the options Survo offers for self-documenting and literate programming, which have been essential features of the editorial approach since 1979.
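
    The claim that the rows of matrix C4 (lines 80-112 of the template) form an Eulerian circuit can be verified mechanically. The following Python sketch copies the 33 rows as bit strings; the function names are hypothetical.

```python
# Rows of MATRIX C4, in order, written as 4-bit strings
C4 = """
0000 0001 0011 0010 0000 0100 0101 0111 0110 0100 1100
1101 0101 0001 1001 1101 1111 0111 0011 1011 1001 1000
1100 1110 0110 0010 1010 1011 1111 1110 1010 1000 0000
""".split()


def hamming(u, v):
    """Number of coordinates in which the vertices u and v differ."""
    return sum(a != b for a, b in zip(u, v))


def is_eulerian_circuit(path, dim=4):
    """True if path is closed, moves along cube edges only, and uses every edge exactly once."""
    if path[0] != path[-1]:
        return False
    steps = list(zip(path, path[1:]))
    if any(hamming(u, v) != 1 for u, v in steps):
        return False
    edges = {frozenset(step) for step in steps}
    # A dim-dimensional cube has dim * 2^(dim-1) edges (32 when dim = 4).
    return len(edges) == len(steps) == dim * 2 ** (dim - 1)


if __name__ == "__main__":
    print(len(C4), is_eulerian_circuit(C4))
```

    The 33 rows give 32 steps, one for each edge of the 4-dimensional cube, so each projection can indeed be drawn as a single line graph.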


    In the appendix (pp. 181-) it is also shown e.g. that the number of m-cubes in an n-cube is K(n,m)=C(n,m)2^(n-m), m=0,1,2,...,n. Thus, for example, the number of edges in a cube is C(3,1)*2^(3-1)=12 and the number of cubes in a 4-dimensional cube is C(4,3)*2^(4-3)=8. These 8 cubes are shown below as 2-dimensional projections to first two coordinate axes.

    The first pair is the same as in the 1-2 plot defined on lines 157-159 in the template above and the remaining three pairs are obtained by changing X1 on lines 158 and 159 to X2,X3,X4, respectively.

    The generating function of K(n,n-m) numbers is f(s)=(s+2)^n in the same way as (s+1)^n is the generating function of the binomial coefficients C(n,m). The total number of "parts": vertices (m=0), edges (m=1), faces (m=2), cubes (m=3), etc. in an n-dimensional cube is then f(1)=(2+1)^n=3^n.
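
    These counts are quickly checked numerically. A minimal Python sketch (the function name k_cubes is hypothetical):

```python
from math import comb


def k_cubes(n, m):
    """Number of m-dimensional cubes in an n-dimensional cube: C(n,m)*2^(n-m)."""
    return comb(n, m) * 2 ** (n - m)


if __name__ == "__main__":
    print(k_cubes(3, 1))                          # edges of an ordinary cube
    print(k_cubes(4, 3))                          # 3-dimensional cubes in a 4-dimensional cube
    print([k_cubes(4, m) for m in range(5)])      # vertices, edges, faces, cubes, the 4-cube itself
    print(sum(k_cubes(4, m) for m in range(5)))   # total number of parts, 3^4
```

    For n=4 the parts are 16 vertices, 32 edges, 24 faces, 8 cubes and the 4-cube itself, 81 in total.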

    Another example related to hypercubes

    'Word' processing by mouse

    Example #76 (Start the demo by clicking the picture!)

    Copies of various items in the edit field can be made in various ways.

    Traditional means are the COPY command, the key alt-F4 for rectangular blocks, and the key alt-F2 for text.

    For 'words' (contiguous strings separated by blanks) the best method from version 3.37 onwards is based on two mouse-clicks:

    1. Click the word to be copied by the rightmost button,
    2. Select the place where to copy the word by the leftmost button.

    Immediately after the first copy, more copies can be made by the leftmost mouse button.
    If the mouse is pointing at a blank space between existing 'words', the copy is inserted between these words.
    If the mouse is pointing at a 'word' (a non-blank character), this 'word' is replaced by the copy.
    3. The copying process is terminated by the DEL key.

    - - - - - - - - - -

    When using Survo there is no absolute need for working with a mouse. For many people, operating with a classical mouse is a nuisance causing physical stress and pain.

    For many years, at least for me, a 'RollerMouse' (coupled with the splendid IBM PC/AT keyboard) has been much better as a pointing device. When using it there is no need to move hands or wrist away from the keyboard. All mouse functions can be executed by minimal moves of the fingertips.

    Edges and diagonals of a regular n-sided polygon

    Example #77 (Start the demo by clicking the picture!)

    By means of Survo and Mathematica I have found certain properties related to lengths of edges and diagonals of a regular n-sided polygon inscribed in a unit circle.
    In particular, in this demo it was shown experimentally that in the regular heptagon the squared lengths of the chords are roots of an algebraic equation
    X^3-7*X^2+14*X-7=0 (known already by Kepler).
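
    The heptagon result can be checked numerically in a few lines. The Python sketch below uses the standard fact that a chord spanning k edges of a regular n-gon inscribed in a unit circle has length 2*sin(k*pi/n); the function names are hypothetical.

```python
import math


def squared_chords(n):
    """Squared lengths of the distinct chords (edge and diagonals) of a regular n-gon in a unit circle."""
    return [4.0 * math.sin(k * math.pi / n) ** 2 for k in range(1, n // 2 + 1)]


def heptagon_polynomial(x):
    return x**3 - 7 * x**2 + 14 * x - 7


if __name__ == "__main__":
    for x in squared_chords(7):
        print(x, heptagon_polynomial(x))   # the second value should vanish up to rounding
```

    The three squared lengths also sum to 7 and have product 7, in agreement with the coefficients of the equation.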

    In general, I found experimentally that the squared lengths of the diagonals in a regular n-sided polygon are roots of an algebraic equation with coefficients as simple expressions of binomial coefficients.
    These results were formally proved in my paper with Pentti Haukkanen and Jorma Merikoski.

    See also
    http://www.survo.fi/papers/Polygons2013.pdf
    http://www.survo.fi/papers/Roots2013.pdf.
    and
    Mustonen, S., Haukkanen, P., Merikoski, J. (2014).
    Some polynomials associated with regular polygons.
    Acta Univ. Sapientiae, Mathematica, 6, 2, 178-193.
    http://www.acta.sapientia.ro/acta-math/

    More extensive demos about the same subject:
    Equation for the sum of chord lengths in a regular polygon
    Regular polygons: Solving riddle of q coefficients
    Regular polygons: Testing roots


    Examples from a presentation in 1987

    Example #78 (Start the demo by clicking the picture!)

    This demo in YouTube

    This is a partial reproduction of my talk "Editorial approach in statistical computing"
    at the Second International Tampere Conference in Statistics (1987).
    This demo gives two examples of the usage of Survo.
    In the original video many details related to these examples are difficult to see.

    The last example 'Estimation of a circle' of my talk is also available in YouTube.

    The final paper in conference proceedings does not include these examples.


    Thurstone's box problem

    Example #79 (Start the demo by clicking the picture!)

    This demo in YouTube
    The same in better resolution YouTube

    Thurstone's problem is presented in a more general form. Values of the derived variables have substantial 'measurement' errors.

    Thurstone's original experiment is described, for example, in Richard L. Gorsuch: Factor Analysis pp. 10-11.

  • See also: Factor analysis

    Finding recursive formula for number of grid lines

    Example #80 (Start the demo by clicking the picture!)

    This demo in YouTube and also as a flash demo.

    At first sight, one could assume that the number of lines going through at least two points in an n x n regular grid could be presented by a simple algebraic formula. However, this is not possible since the number of lines with a given slope u/v (u,v integers) depends essentially on the divisibility of integers v=2,3,...,n-1. Therefore Euler's totient function plays an essential role in 'residual terms' of the formulas.

    Although these recursive formulas are more complicated than the direct double sum formula (presented in the beginning) they give results much faster. For example, already when n=10^4 recursive formulas are over 1000 times faster than the double sum formula. Furthermore, the recursive formulas are applied iteratively so that results are obtained at the same time also for all integers less than n and it is efficient to continue iteration for greater n values step by step on the basis of values L(n,n), L(n-1,n), and R1(n).

    By these means I have computed L(n,n) values for all n <= 10^8 in 100 sequences of a million n values and it would take less than 3 hours on my current PC. The same task by using the double sum formula would last over 100'000 years!
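
    For small n the values of L(n,n) can be cross-checked by brute force. The Python sketch below is neither the double sum formula nor the recursive formulas of the demo; it simply assumes that each line is stored once in a normalized form a*x+b*y=c and counts the distinct lines. The function name count_lines is hypothetical.

```python
from itertools import combinations, product
from math import gcd


def count_lines(n):
    """Number of distinct lines through at least two points of an n x n grid (brute force)."""
    points = list(product(range(n), repeat=2))
    lines = set()
    for (x1, y1), (x2, y2) in combinations(points, 2):
        a, b = y2 - y1, x1 - x2            # normal vector of the line a*x + b*y = c
        g = gcd(a, b)
        a, b = a // g, b // g
        if a < 0 or (a == 0 and b < 0):    # fix the sign so each line has one representation
            a, b = -a, -b
        lines.add((a, b, a * x1 + b * y1))
    return len(lines)


if __name__ == "__main__":
    print([count_lines(n) for n in range(2, 6)])
```

    The work grows like n^4 pairs of points, so this approach is useful only as a check for very small grids (for example 6, 20 and 62 lines for n=2,3,4).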

    In 2015 I extended this calculation for all n <= 10^11.

    More information about this topic is given in Grid lines where the asymptotic behaviour of L(n,n) numbers is reported.

    See also

  • Euler's totient function.
  • Transformation of variables (VAR)

    Testing the correlation coefficient

    Example #81 (Start the demo by clicking the picture!)

    This demo in YouTube

    A typical Survo computation scheme (template) is presented for testing the correlation coefficient by the Fisher z transformation.

    This is one of the oldest examples (from 1981) used for demonstrating 'self-documenting' and 'literate programming' in the Survo Editor. These terms were unknown to me at the time, since apparently they were introduced later, e.g. by Donald Knuth.

    It should be noted that all information (formulas and data) needed for computation of test statistics is given within the text typed in the edit field.
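
    The computation itself is short. The following Python sketch is not the Survo template but the standard large-sample test based on the Fisher z transformation, in which atanh(r) is treated as approximately normal with standard deviation 1/sqrt(n-3). The function name and the numbers r=0.6, n=28 are hypothetical and not taken from the demo.

```python
import math
from statistics import NormalDist


def fisher_z_test(r, n, rho0=0.0):
    """Test H0: rho = rho0 from a sample correlation r of n observations.

    Returns the test statistic and the two-sided p-value of the normal approximation.
    """
    z = math.atanh(r)        # Fisher z transformation 0.5*log((1+r)/(1-r))
    z0 = math.atanh(rho0)
    statistic = (z - z0) * math.sqrt(n - 3)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(statistic)))
    return statistic, p_value


if __name__ == "__main__":
    print(fisher_z_test(0.6, 28))
```

    Here atanh(0.6) equals log(2) and sqrt(28-3)=5, so the test statistic is 5*log(2), about 3.47.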

  • See also: Statistical tests in Survo and
  • Fisher z transformation

    Matrix interpreter (regression analysis)

    Example #82 (Start the demo by clicking the picture!)

    This demo in YouTube

    This demo was a part of my talk in
    The Eighth International Workshop on Matrices and Statistics (1999)
    at the University of Tampere, Finland.


    The matrix interpreter is an essential tool for making extended calculations from the results given by statistical Survo operations, for example. As shown at the end of this demo, such operations often give their results also as matrix files.

    It is easy to convert a matrix file to a Survo data file by

    FILE SAVE MAT <matrix_file> TO <Survo_data_file>
    and conversely any Survo data (table or file) to a matrix file by
    MAT SAVE DATA <Survo_data> TO <matrix_file>
    but Survo matrix files (like COUNTRIES.MAT in this demo) can be used also as data in statistical operations.

    The matrix interpreter is also useful for teaching methods related to linear models, for example.

    The automatic labelling of matrix rows and columns has been possible ever since the matrix interpreter of SURVO 76 (in 1977). It is important to notice the rules for labels in derived matrices. For example, labels are transposed not only when a matrix is transposed but also when it is inverted, etc. A simple label 'algebra' ensures that in the matrix of regression coefficients the names of regressors appear as row labels and the names of regressands as column labels.

  • See also: Matrix operations in Survo

    "Origin of Species" 2

    Example #83 (Start the demo by clicking the picture!)

    This demo in YouTube

    This is a variation of an earlier demo, now displaying partially randomly selected forms. Each figure is a 2x2 setup of graphs from the family of curves defined by the following plotting scheme in the Survo edit field (this one represents the first example above):
    GPLOT x(t)=X0+R*(sin(t)+r*sin(A*t)+r^2*sin(B*t)),
          y(t)=Y0+R*(cos(t)+r*cos(A*t)+r^2*cos(B*t))
    
    r=(sqrt(5)-1)/2 s=1/(1+r+r^2) R=3*r/2
    t=0,2*pi,pi/1000 pi=3.141592653589793
    SCALE=-4,4 OUTFILE=A
    XDIV=0,1,0 YDIV=0,1,0 SIZE=381,381 HEADER= FRAME=3 MODE=381,381
    i=0(1)3
    a=5
    b=15 T1=5;15,170,170 TEXTS=T1
    PEN=[color(0.2,0.4,1,0)][SwissB(20)]
    FILL(-2)=0.9,0.6,0.3,0
    LINETYPE=[line_width(2)],1 SLOW=300
    A=x0*a+1  X0=2*x0  x0=int(sqrt(2)*(sin(pi/2*(i+0.5)))+0.5)
    B=y0*b+1  Y0=2*y0  y0=int(sqrt(2)*(cos(pi/2*(i+0.5)))+0.5)
    WSIZE=381,381 WHOME=800,0 WSTYLE=0 FRAMES=F F=0,0,381,381,-2
    
    The setup of graphs corresponds to values (here A=5, B=15) in this way:
          -A,B    A,B
    
          -A,-B   A,-B
    
    
    The parameters (a,b,PEN,FILL,LINETYPE, etc.) in the plotting scheme are controlled by a sucro (Survo macro) taking the values from a matrix file
    MATRIX AB
    ///  a  b C     M     Y     c     m     y     L  W   S
      1  5 15 0.200 0.400 1.000 0.900 0.600 0.300 1 40 300
      2  0  0 1.000 1.000 1.000 0.000 0.000 0.000 1  7  50
      3  1  1 1.000 1.000 1.000 0.000 0.000 0.000 1  4  20
      4  2  2 1.000 1.000 1.000 0.000 0.000 0.000 1  3  10
      5  3  3 1.000 1.000 1.000 0.000 0.000 0.000 1  2   5
    ... .. .. ..... ..... ..... ..... ..... ..... . .. ...
     95  9  7 0.298 0.498 1.000 0.872 0.605 0.204 1  1   0
     96  9 15 0.260 0.397 1.000 0.858 0.655 0.175 1  1   0
     97 11 32 0.323 0.374 1.000 0.917 0.711 0.211 1  1   0
     98 16 96 0.313 0.427 1.000 0.838 0.537 0.237 1  1   0
     99 45 50 0.170 0.401 1.000 0.918 0.640 0.245 1  8 100
    100 10 60 1.000 1.000 1.000 0.000 0.000 0.000 1 25 300
    


    Simulating multivariate normal distribution

    Example #84 (Start the demo by clicking the picture!)

    This demo in YouTube

    MNSIMUL in Survo is a general tool for generating observations from a multivariate normal distribution with a given matrix of correlations and vectors of expected values and standard deviations.

    This is a replica of my flash demo created in 2006. The most significant difference is that the computation times (including loading and saving), measured in Survo by TIME COUNT START - TIME COUNT END commands, were now only about a third of those obtained eight years earlier. There are no changes in the program code, but PCs have become somewhat faster.
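
    What such a generator has to do can be sketched in Python. The source does not describe the algorithm inside MNSIMUL, so the use of a Cholesky factor of the correlation matrix below is an assumption, and all function names and parameter values are hypothetical.

```python
import math
import random


def cholesky(a):
    """Lower triangular L with L*L' = a for a symmetric positive definite matrix a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def simulate(corr, means, sds, n, seed=None):
    """n observations from a multivariate normal distribution with the given parameters."""
    rng = random.Random(seed)
    low = cholesky(corr)
    p = len(means)
    sample = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]   # independent N(0,1) values
        sample.append([means[i] + sds[i] * sum(low[i][k] * z[k] for k in range(i + 1))
                       for i in range(p)])
    return sample


if __name__ == "__main__":
    corr = [[1.0, 0.8], [0.8, 1.0]]
    data = simulate(corr, means=[10.0, 20.0], sds=[2.0, 3.0], n=5, seed=1)
    for row in data:
        print(row)
```

    With a large n the sample correlations, means and standard deviations should be close to the given parameters.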

  • See also: MNSIMUL operation and "Srivastava: Methods of Multivariate Statistics (2002)"

    Genesis of multivariate normal distribution 1

    Example #85 (Start the demo by clicking the picture!)

    This demo in YouTube

    As a special case of the central limit theorem it is shown how the distribution of linear combinations of independent, uniformly distributed variables tends to multivariate normal distribution.

    Throughout the calculations the matrix interpreter of Survo is used.

    The simple formula for the correlation coefficient in the current case is derived as follows:

    It is also noteworthy that the value of the correlation coefficient depends on the parameters m and s only through their ratio m/s.

    According to my experience, in general, the most clear-cut way to define multivariate normal distribution is by a linear transformation of independent N(0,1) variables.

    A related demo: Genesis of multivariate normal distribution 2

  • See also: COMPARE operation and MNTEST operation

    Genesis of multivariate normal distribution 2

    Example #86 (Start the demo by clicking the picture!)

    This demo in YouTube

    Defining the multivariate normal distribution through a linear transformation of independent N(0,1) variables has been my favorite for a long time. This definition is much more comprehensible in teaching (at least on undergraduate level) than starting from the density or characteristic function.

    For example, from this standpoint one can readily understand that there can be only linear dependencies between component variables, or see that its marginal and conditional distributions are (multi)normal.
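
    The definition is easy to try out numerically. The Python sketch below generates independent N(0,1) values by applying the probit function (inverse normal distribution function) to uniform random numbers and then applies a linear transformation X=A*Z; the covariance matrix of X should then be close to A*A'. The matrix A and all function names are hypothetical.

```python
import random
from statistics import NormalDist


def standard_normals(n, p, seed=None):
    """n rows of p independent N(0,1) values obtained as probit(uniform)."""
    rng = random.Random(seed)
    probit = NormalDist().inv_cdf
    rows = []
    for _ in range(n):
        row = []
        while len(row) < p:
            u = rng.random()
            if 0.0 < u < 1.0:          # probit is defined only on the open interval (0,1)
                row.append(probit(u))
        rows.append(row)
    return rows


def transform(z_rows, a):
    """Apply x = A*z to every row z."""
    return [[sum(a_i[k] * z[k] for k in range(len(z))) for a_i in a] for z in z_rows]


def covariance(rows):
    """Sample covariance matrix of the columns of rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]


if __name__ == "__main__":
    A = [[2.0, 0.0], [1.5, 1.0]]                  # here A*A' = [[4, 3], [3, 3.25]]
    X = transform(standard_normals(20000, 2, seed=2014), A)
    print(covariance(X))
```

    Since every component of X is a linear function of the same independent variables, only linear dependencies between the components can arise.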

    In my lecture notes on multivariate statistical methods (in Finnish) almost everything is derived on this basis without a need for working with integrals, etc. The creation of the two-dimensional normal distribution is characterized there (p.16) in this way

    The main tool in calculations is here the matrix interpreter of Survo.

    Graphics were generated by Survo plotting schemes in the following way:
    (here the first figure Z of a sample of N2(0,I) distribution)

    *
    *Matrix Z of independent N(0,1) variables:
    *MAT Z=ZER(100000,2)              / Z is a data frame.
    *MAT #TRANSFORM Z BY #RAND(20140) / Fill with uniform[0,1] values,
    *MAT #TRANSFORM Z BY probit(X#)   / convert to independent N(0,1) values
    *
    A Common specifications:
    *WHOME=271,0 WSIZE=381,381 HEADER= XDIV=0,1,0 YDIV=0,1,0 XLABEL= YLABEL=
    *WSTYLE=0 FRAME=3 LINETYPE=[line_width(3)],1 POINT=11 MODE=1024,1024
    *SCALE=-5,5
    B
    *..................................
    *Coordinate axes: (common background for each graph)
    *GPLOT X(t)=c*t,Y(t)=(1-c)*t / SPECS=A,B
    *c=0,1,1 t=-9,9,9
    *OUTFILE=Z0
    *..................................
    *Scatter diagram of two independent N(0,1) variables, 100'000 cases
    *GPLOT Z.MAT,1,2 / SPECS=A,B CONTOUR=[RED],0.001,0.5
    *PEN=[BLACK][SwissB(100)] TEXTS=T T=Z,20,900
    *INFILE=Z0 OUTFILE=Z1
    *
    

    A related demo: Genesis of multivariate normal distribution 1

  • See also: CORR operation, MNTEST operation, MAT SVD operation, and Singular value decomposition

    Monthly temperature and rainfall in Helsinki

    Example #87 (Start the demo by clicking the picture!)

    This demo in YouTube

    FILE MEDIT is the most extensive program in SURVO MM for displaying and editing of data files.
    Data of many variables are displayed on a set of consecutive pages defined automatically or specified by the user.
    Each page may contain fields for variables and free-format textual comments. Comments may be conditional, depending on the observation at hand.
    Also derived fields containing functions of original variables may appear.
    Sound effects, voice comments and graphical displays can be inserted. General checking facilities for data integrity are provided. Also various search facilities are available.

    In the second half of this demo (displaying graphs), another Survo session is called to create graphics by PLOT commands. The instructions for the entire application are given in an edit field HELTERA3.EDT

    A related demo: Temperature in Helsinki

  • See also: FILE MEDIT operation

    Early sound experiment on Elliott 803 computer in 1962

    Example #88 (Start the demo by clicking the picture!)

    This demo in YouTube

    I started my work with computers in 1960 by programming statistical software for the Elliott 803 computer. One of the special devices in this computer was a tiny loudspeaker which received a pulse each time the program executed a jump instruction. Since computer programs always contain many jumps creating loops of fixed or variable lengths, each program generated a characteristic sequence of sounds.

    This was a useful feature because at that time a typical statistical analysis would take a long time, often more than ten minutes, and one learned to recognize the various stages of standard programs just by listening to the sound.

    During my summer holiday in 1962 I decided to create a program just for producing sound sequences created more or less randomly. The program included subroutines for random 'trills' and 'glissandos', for example. It was also able to make random 'variations' on a 'theme' given by the operator from the keyboard or by the program itself.

    It may have been the first program for both generating 'music' and playing it in real time. I had a rough estimate that it would take about 10^50 years before the program starts to repeat itself.

    The most serious drawback was that the highest tone was only about H=1135 Hz (caused by the shortest possible loop) and all other tones were H/2, H/3, H/4, ... Hz so that the scale of tones was primitive indeed.
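
    How coarse this scale is can be seen by listing the first few tones. The Python sketch below only evaluates H/k and the distance of each tone below H in cents (1200 cents to the octave); the function name is hypothetical.

```python
import math


def tone_scale(h, count):
    """(k, frequency h/k in Hz, distance below h in cents) for k = 1..count."""
    return [(k, h / k, 1200.0 * math.log2(k)) for k in range(1, count + 1)]


if __name__ == "__main__":
    for k, freq, cents in tone_scale(1135.0, 8):
        print(f"H/{k}: {freq:8.2f} Hz, {cents:7.1f} cents below H")
```

    The second tone is already a full octave below the highest one, and the steps become finer only far down in the bass.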

    This example contains short excerpts of the sound output generated by this program. The samples were recorded by Erkki Kurenniemi on a recorder of the Department of Music at the University of Helsinki in 1962. They are now available on the CD "On-Off" produced by Petri Kuljuntausta.

    See also Peter Onion's video about my program in YouTube. In this video the program code punched on a paper tape is fed into the ferrite core memory of Elliott 803 and the program starts immediately thereafter according to default settings.



    Tracing a sucro program

    Example #89 (Start the demo by clicking the picture!)

    This demo in YouTube

    Use that YouTube version if you want to pause and/or slow down. Then you have time for a more detailed study.

    In the snapshot above numbers Wfirst=-0.4 and Wnext=1.4 are to be compared by an if statement on line 36. Since now Wfirst is not greater than Wnext, the program will continue on line 37 and inserts -0.4 in the beginning of line 45.

    By a new 'echo' option it is possible to watch how a sucro program is working. This is a useful feature for debugging a sucro program and for teaching sucro programming.
    The 'echo' option is available on various levels, the most stringent one demonstrated here. On this level the sucro code has to be visible in the Survo main window when the sucro is running and the sucro has to be saved just before it is activated with the ECHO2 parameter without scrolling the main window.
    If this is not possible, the parameter ECHO has to be used and then only the information displayed on the bottom line, current command and values of (selected) variables, is available.

    The program code of /SORT with comments

    
    *TUTSAVE SORT
    / /SORT sorts numbers below the command line in ascending order.
    / Defining variables:
    / def Wfirst Wnext Wr Wc Wc2
    /
    / Initialization and setting maximum speed:
    *{init}{tempo 0}
    *
    / Making room for the sorted list:
    *{R}{ins line}
    /
    / Setting a 'wall' |:
    * |
    /
    / Setting a space in front of the original list:
    *{line start}{u}{ins} {ins}{l}{Wc=1}
    /
    / Finding next number and recording its location:
    + A: {next word}{ref set 1}{save cursor Wr,Wc2}
    /
    / If no next number found, going to End:
    - if  Wc2 = Wc then goto End
    /
    / Saving new number:
    *{Wc=Wc2}{save word Wfirst}{R}
    /
    / Finding next word on result line:
    + B: {next word}{save word Wnext}
    /
    / If it is the wall, inserting value of Wfirst at the end:
    - if Wnext '=' | then goto C
    /
    / Repeating from B if proper location for Wfirst not yet found:
    - if Wfirst > Wnext then goto B
    /
    / Writing newest number to its place in the sorted list:
    + C: {ins}{write Wfirst} {ins}{ref jump 1}{goto A}
    /
    / Finishing by removing the wall and the original list:
    + End: {R}{del}{line end}{l}{del}{line start}{u}{del line}
    *{end}
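
    The logic of /SORT can be restated outside Survo. The following is a minimal Python sketch, not the sucro itself: the function name sucro_sort and the list-based 'result line' are assumptions made for illustration, while the labels A, B, C and End in the comments refer to the sucro code above.

```python
WALL = "|"  # same sentinel as the 'wall' set by the sucro


def sucro_sort(numbers):
    """Insertion sort in the style of /SORT: scan the result line up to the wall."""
    result = [WALL]                      # result line initially holds only the wall
    for first in numbers:                # label A: take the next number (Wfirst)
        pos = 0
        # label B: step along the result line while Wfirst > Wnext,
        # stopping at once if the wall is met
        while result[pos] != WALL and first > result[pos]:
            pos += 1
        result.insert(pos, first)        # label C: write Wfirst in front of Wnext
    result.remove(WALL)                  # label End: remove the wall
    return result


print(sucro_sort([1.4, -0.4, 3, 2]))
```

    The wall plays the role of a sentinel: every scan is guaranteed to stop, so no separate end-of-line test is needed.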
    

  • See also: Sucros in Survo and another demo about tracing of sucro code

    Finding primes by the Sieve of Eratosthenes

    Example #90 (Start the demo by clicking the picture!)

    This demo in YouTube

    All prime numbers below 10^6 are found by means of a sucro /SIEVE using a simple sieve method.
    A crucial step in the elimination of composite numbers is application of a new extended form of the SET command where an extra parameter (default is 1) gives a gap between lines to be treated.

    That command appears on line 33 above in the form

    SET A+{write Wstart},A+{write Wmax},A,{write Wprime}
    
    and it takes actual forms
    SET A+4,A+1000000,A-1,2
    SET A+9,A+1000000,A-1,3
    SET A+25,A+1000000,A-1,5
    SET A+49,A+1000000,A-1,7
    
    in the first four rounds.
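
    The role of the gap parameter can be illustrated by a rough Python analogue, in which an extended slice with a step stands in for the SET command. This is only a sketch under that assumption; the names sieve_rounds and is_prime are hypothetical and do not appear in the sucro.

```python
def sieve_rounds(limit):
    """Sieve up to limit; also record (start, end, gap) of each elimination round."""
    is_prime = bytearray([1]) * (limit + 1)   # 1 = still a candidate
    is_prime[0:2] = b"\x00\x00"               # 0 and 1 are not primes
    rounds = []
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # Counterpart of SET A+p*p,A+limit,...,p: every p-th entry from p*p on
            rounds.append((p * p, limit, p))
            count = len(range(p * p, limit + 1, p))
            is_prime[p * p :: p] = bytes(count)   # mark the multiples as composite
        p += 1
    primes = [i for i in range(2, limit + 1) if is_prime[i]]
    return primes, rounds


primes, rounds = sieve_rounds(1000000)
print(len(primes), rounds[:4])
```

    The first four recorded rounds start at 4, 9, 25 and 49 with gaps 2, 3, 5 and 7, matching the actual SET commands listed above.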

    The speed of this process is much higher than in the older example, but even here the fact that integers are represented as character strings and converted to double precision floating point numbers slows down the computation dramatically.

    Even this approach is very slow when compared with the corresponding task carried out by a pure C program, presented below as a Survo command SIEVE. Finding all primes below a million and loading them into the edit field (see lines 80- in the C code below) takes only 0.245 seconds and is thus about 100 times faster.

    Thus the sucro technique is inefficient in pure numerical computations. Sucros are at their best when making tutorials like these demos, or when the task at hand is a sequence of standard Survo operations and the same job has to be repeated many times from various initial conditions. See, for example, Combining Survo operations by sucros.

    
        1 *SAVE SIEVE
        2 *
        3 a/* _sieve.c 8.4.2015/SM (8.4.2015) */
        4 *
        5 *#include <stdio.h>
        6 *#include <stdlib.h>
        7 *#include <malloc.h>
        8 *#include <math.h>
        9 *#include <survo.h>
       10 *#include <survoext.h>
       11 *
       12 *unsigned int n;
       13 *char *prime;
       14 *int j,h;
       15 *
       16 *void main(argc,argv)
       17 *int argc; char *argv[];
       18 *    {
       19 *    unsigned int i,n,max,p,count,output;
       20 *    if (argc==1) return;
       21 *    s_init(argv[1]); // initializing Survo environment for this program
       22 *    if (g<2)
       23 *        {
       24 *        sur_print("\nUsage: SIEVE N [output=0,1,2]");
       25 *        WAIT; return;
       26 *        }
       27 *    n=atoi(word[1]);
       28 *    prime=(char *)malloc(n+1); // reserving space for prime indicators
       29 *    for (i=0; i<n+1; ++i) prime[i]='1'; // at the start all are primes
       30 *    max=(int)sqrt((double)n);  // max number to be tested is sqrt(n)
       31 *    p=1; output=2;
       32 *    if (g>2) output=atoi(word[2]); // selecting scope of output
       33 *    while (p<max)
       34 *       {
       35 *       ++p;
       36 *       while (prime[p]=='0') ++p; // finding next prime p
       37 *       for (i=p*p; i<=n; i+=p) prime[i]='0'; // multiples of p composite
       38 *       } // all primes found
       39 *    count=0;
       40 *    j=r1+r; new_line();
       41 *    for (i=2; i<n+1; ++i)
       42 *       if (prime[i]=='1')
       43 *          {
       44 *          ++count; // counting number of primes
       45 *          if (output==2)
       46 *              {
       47 *              h+=sprintf(sbuf+h,"%u ",i); // collecting primes on a line
       48 *              if (h>c-7) // visible line length - 7
       49 *                  {
       50 *                  out();
       51 *                  new_line();
       52 *                  }
       53 *              }
       54 *          }
       55 *    if (output>0)
       56 *          {
       57 *          if (output==2) out();
       58 *          new_line();
       59 *          sprintf(sbuf,"Number of primes < %u is %u.",n,count);
       60 *          out();
       61 *          s_end(argv[1]); // output to be caught by the editor
       62 *          }
       63 *    return;
       64 *    }
       65 *
       66 *new_line()
       67 *    {
       68 *    ++j; h=0; *sbuf=EOS; // output=2
       69 *    return(1);
       70 *    }
       71 *
       72 *out()
       73 *    {
       74 *    edwrite(space,j,1);
       75 *    edwrite(sbuf,j,1);
       76 *    return(1);
       77 *    }
       78 A
       79 *
       80 *TIME COUNT START / Continuous activation by F2 ESC
       81 *SIEVE 1000000
       82 *TIME COUNT END   0.245
       83 *2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89
       84 *97 101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179
       85 *181 191 193 197 199 211 223 227 229 233 239 241 251 257 263 269 271
       86 *277 281 283 293 307 311 313 317 331 337 347 349 353 359 367 373 379
       87 *383 389 397 401 409 419 421 431 433 439 443 449 457 461 463 467 479
       88 *487 491 499 503 509 521 523 541 547 557 563 569 571 577 587 593 599
       89 *601 607 613 617 619 631 641 643 647 653 659 661 673 677 683 691 701
       90 *709 719 727 733 739 743 751 757 761 769 773 787 797 809 811 821 823
       91 *827 829 839 853 857 859 863 877 881 883 887 907 911 919 929 937 941
       92 *947 953 967 971 977 983 991 997 1009 1013 1019 1021 1031 1033 1039
       - - - - - - - - - - -
     7817 *999529 999541 999553 999563 999599 999611 999613 999623 999631 999653
     7818 *999667 999671 999683 999721 999727 999749 999763 999769 999773 999809
     7819 *999853 999863 999883 999907 999917 999931 999953 999959 999961 999979
     7820 *999983
     7821 *Number of primes < 1000000 is 78498.
     7822 *
    
  • See also: Sucros in Survo and SET command

    Combining Survo operations by sucros

    Example #91 (Start the demo by clicking the picture!)

    This demo in YouTube
    This as flash demo

    In many demanding Survo applications sucros play an essential role. When the user finds out that a certain task consisting of a series of Survo operations is encountered repeatedly, it is profitable to let Survo save all the actions belonging to that task in the tutorial mode in a sucro file. Usually such a sucro needs some editing and polishing. This is done easily by loading the sucro code into the edit field (TUTLOAD) and, after modifications, by saving the code back to a sucro file (TUTSAVE).

    Practically all demos in this collection have been created as sucros.

    For example, the entire sucro code of this demo is:

    
     11 *
     12 */M
     13 *TUTSAVE M
     14 *{tempo -1}{init}{jump 1,1,1,1}SCRATCH {act}{line start}
     15 /
     16 *COLX W20{act}{line start}{erase}{tempo 2}{wait 100}{tempo -1}
     17 *{R}
     18 *   {form7} Combining Survo operations by sucros {R}
     19 *{R}
     20 *Sucros are at their best when the task at hand is a sequence of{R}
     21 *standard Survo operations and the same job has to be repeated{R}
     22 *many times from various initial conditions.{R}
     23 *For example, sucros have been created for performing some multistage{R}
     24 *forms of statistical analysis.{R}
     25 *Here a sucro /FACTOR is presented. It carries out the standard steps{R}
     26 *of factor analysis:{R}
     27 *{R}
     28 *1. Computing correlations CORR.M{R}
     29 *2. Computing eigenvalues by spectral decomposition of CORR.M{R}
     30 *3. The number of factors f is determined as follows:{R}
     31 *   Eigenvalues e(1)>=e(2)>=e(3)>=...{R}
     32 *   Ratios      s(i)=e(i+1)/e(i), if e(i)>=0.9, s(i)=1 else{R}
     33 *   Let e(j)>=1 and e(j+1)=<1 and s(k)=min(s(j),s(j+1),s(j+2)){R}
     34 *   Then f=k.{R}
     35 *4. Computing the maximum likelihood solution FACT.M by FACTA{R}
     36 *5. Computing the rotated factor matrix AFACT.M by ROTATE{R}
     37 *{R}
     38 */FACTOR is now applied to the dataset DECA on the 48 best athletes{R}
     39 *of the world in 1973.
     40 /
     41 *{tempo 2}{90}{R}
     42 *{d5}{tempo 0}{d14}{u19}{tempo 2}{10}
     43 *
     44 *The 10 event{tempo 0} variables will be considered
     45 * and thus set active in DECA:{tempo 2}{20}{R}
     46 *FILE ACTIVATE DECA{keys 2}{act}
     47 /
     48 *--AAAAAAAAAA--{exit}
     49 /
     50 *{keys 0}{10}{R}
     51 *Sucro /FACTOR{tempo 0} is activated with dataset DECA:
     52 *{tempo 2}{10}{R}
     53 */FACTOR DECA{keys 2}{act}{keys 0}{30}
     54 *
     55 *{d14}{20}{d14}{20}{del line}{d21}{u18}{10}
     56 /
     57 *The rotated{tempo 0} factor matrix is made
     58 * more informative by another sucro:{tempo 2}{20}
     59 *{R}
     60 */LOADFACT{keys 2}{act}{keys 0}{30}
     61 /
     62 *{d17}{u2}
     63 *Typically{tempo 0} for Survo, the work has documented itself and it may be {R}
     64 *repeated.{tempo 2}{20}
     65 * Now e.g. {tempo 0}a four-factor solution can be obtained 'manually'
     66 *{R}
     67 *and another rotation technique can be adopted:
     68 *{tempo 2}{30}{home}{5}{u55}{5}{r13}{10}4{r}35   {keys 2}{act}{keys 0}
     69 /
     70 *{10}{d}{l6}4{r}49{erase}{5}   / ROTATION=ORTHO_CLF (by Jennrich)
     71 *{keys 2}{act}{keys 0}{10}{R}
     72 *{d28}{20}{d22}{u13}{10}SCRATCH{10}{act}{10}{R}
     73 *{u2}{keys 2}{act}{keys 0}{end}
     74 *
    
    The source code of each demo in this collection lives in its own folder, and therefore a fixed short name M is selected for the file (see lines 12,13 above). The first lines (14-19), except the header text, are common to all these demos.

    In sucros intended for teaching or demonstrating it is important to regulate timing of the process. This takes place by wait codes ( {wait 20} or simply {20} means a wait for two seconds ) and tempo codes ( {tempo 0} sets the fastest speed for 'writing' and {tempo 2} a normal speed ).

    An essential part of sucro programming is the 'cursor choreography' i.e. how the cursor is moved in the edit field. Also this sucro contains plenty of codes like {d14} (14 steps downwards) or {r6} (6 steps to the right).

    The keys codes control echoing of key strokes in the lower right corner of the Survo window. {keys 2} starts echoing and {keys 0} cancels it. This property is used on lines 46-50 when selecting active variables from DECA and in most activations of Survo commands.
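
    As a small illustration of these control codes, the following Python sketch extracts the codes in braces from a sucro line and adds up the explicit waits. It assumes, as the example {wait 20} = two seconds suggests, that wait values are given in tenths of a second, and it ignores any effect of the tempo setting. The function names are hypothetical.

```python
import re

# {wait N} or simply {N}: an explicit wait of N tenths of a second (assumed unit)
WAIT_CODE = re.compile(r"\{(?:wait )?(\d+)\}")


def control_codes(sucro_line):
    """List the contents of all {...} codes on a sucro line."""
    return re.findall(r"\{([^{}]*)\}", sucro_line)


def total_wait_seconds(sucro_line):
    """Sum of explicit waits on the line, in seconds."""
    tenths = sum(int(m.group(1)) for m in WAIT_CODE.finditer(sucro_line))
    return tenths / 10


line = "*COLX W20{act}{line start}{erase}{tempo 2}{wait 100}{tempo -1}"
print(control_codes(line), total_wait_seconds(line))
```

    Cursor codes such as {d14} and tempo codes such as {tempo 2} are not counted as waits, since their contents are not plain numbers.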



    Sucro /FACTOR, used here as a 'subroutine', is of a different nature: it is a tool for rapid automatic execution of the typical stages of factor analysis. It also contains conditional statements, for example for determining the number of factors.

    /FACTOR has the following code when loaded into the edit field:
    
     11 *
     12 *TUTLOAD <Survo>\S\FACTOR
     13 /    /FACTOR <data>                         / 10.6.1991/SM (13.5.1994)
     14 / or /FACTOR <data>,<number_of_factors>
     15 *{tempo -1}{init}{R}
     16 *SCRATCH {act}{home}
     17 - if W1 '=' ? then goto A
     18 - if W1 '<>' (empty) then goto S
     19 + A: /FACTOR <data>{R}
     20 *makes a factor analysis from active variables and observations of{R}
     21 *a Survo data <data>.{R}
     22 *The steps of analysis are:{R}
     23 *1. Computing correlations CORR.M{R}
     24 *2. Computing eigenvalues by spectral decomposition of CORR.M{R}
     25 *3. The number of factors f is determined as follows:{R}
     26 *   Eigenvalues e(1)>=e(2)>=e(3)>=...{R}
     27 *   Ratios      s(i)=e(i+1)/e(i), if e(i)>=0.9{R}
     28 *               s(i)=1            else.{R}
     29 *   Let e(j)>=1 and e(j+1)=<1 and s(k)=min(s(j),s(j+1),s(j+2)){R}
     30 *   Then f=k.{R}
     31 *4. Computing the maximum likelihood solution FACT.M by FACTA{R}
     32 *5. Computing the rotated factor matrix AFACT.M by ROTATE{R}
     33 *{R}
     34 *The user can also enter the number of factors f by activating{R}
     35 */FACTOR <data>,f{R}
     36 *{goto E}
     37 /
     38 + S: CORR {print W1}{act} / Correlation matrix saved as CORR.M{R}
     39 - if W2 > 0 then goto FAC
     40 *MAT SPECTRAL DECOMPOSITION OF CORR.M TO &S,&D{act}{R}
     41 *MAT DIM &S{act}{find =} {save word W3}
     42 - if W3 = 1 then goto F
     43 *{home}{erase}{ref}MAT LOAD &D,CUR+1{act}{R}
     44 *{d2}{W1=0}
     45 + Next_line: {R}
     46 *{W1=W1+1}{next word}{next word}{save word W2}
     47 - if W2 >= 1 then goto Next_line
     48 *{W4=W1-1}
     49 - if W4 < W3 then goto D
     50 + F: {R}
     51 *{ins line}Not a proper correlation matrix for factor analysis!
     52 *{goto E}
     53 + D: {}
     54 /
     55 / def We1=W3 We2=W4 We3=W5 We4=W6
     56 / def Wsmin=W7 Wf=W8 Ws=W9
     57 *{u}{save word We1}{d}{save word We2}{d}{save word We3}{d}
     58 *{save word We4}{Wsmin=We2/We1}{Wf=W1-1}
     59 - if We2 < 0.9 then goto C
     60 *{Ws=We3/We2}
     61 - if Ws > Wsmin then goto B
     62 *{Wsmin=Ws}{Wf=W1}
     63 + B: {}
     64 - if We3 < 0.9 then goto C
     65 *{Ws=We4/We3}
     66 - if Ws > Wsmin then goto C
     67 *{Wsmin=Ws}{Wf=W1+1}
     68 + C: {ref}{ref}{u}SCRATCH {act}{home}MAT &D=&D'{act}{home}{erase}MAT L
     69 *OAD &D,12.12,CUR{act}{home}{del line}{erase}MAT KILL &*{act}{home}
     70 *{erase}Eigenvalues of the correlation matrix CORR.M:{R}
     71 *{del9}{R}
     72 *{del9}{R}
     73 *{goto FAC2}
     74 /
     75 + FAC: {Wf=W2}
     76 + FAC2: {}
     77 - if Wf = 1 then goto F
     78 *FACTA CORR.M,{print Wf},END+2{act} / Factor matrix saved as FACT.M{R}
     79 *{ins line}{u}
     80 /
     81 *ROTATE FACT.M,{print Wf},END+2{act}
     82 /
     83 * / Rotated factor matrix saved as AFACT.M{R}
     84 + E: {tempo +1}{end}
     85 *
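
    The rule for the number of factors (step 3 in the description, lines 45-67 in the code) may be easier to follow in a conventional language. The following Python sketch is an assumption-laden restatement, not part of Survo: the function name is hypothetical, eigenvalues are assumed to be given as a plain list, and missing eigenvalues beyond the end of the list are treated as if their ratio were 1.

```python
def number_of_factors(eigenvalues):
    """Choose f = k minimizing s(k) among k = j, j+1, j+2 (rule of /FACTOR)."""
    e = sorted(eigenvalues, reverse=True)        # e[0] corresponds to e(1)
    j = sum(1 for value in e if value >= 1.0)    # e(j) >= 1 > e(j+1)
    if j == 0 or j == len(e):
        raise ValueError("no eigenvalue below 1 follows one that is at least 1")

    def s(i):
        """s(i) = e(i+1)/e(i) for 1-based i; 1 if e(i) < 0.9 or e(i+1) is missing."""
        if i >= len(e) or e[i - 1] < 0.9:
            return 1.0
        return e[i] / e[i - 1]

    k, s_min = j, s(j)
    for candidate in (j + 1, j + 2):
        if s(candidate) <= s_min:                # ties go to the later index
            k, s_min = candidate, s(candidate)
    return k


print(number_of_factors([3.5, 2.0, 1.2, 0.95, 0.5, 0.3, 0.25, 0.2, 0.1]))
```

    In this made-up example j=3, but the eigenvalue 0.95 is still above 0.9 and the drop after it is steeper than the drop before it, so four factors are selected.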
    

  • See also: Sucros in Survo and Factor analysis in Survo

    Tracing a sucro program (Finding prime numbers)

    Example #92 (Start the demo by clicking the picture!)

    This demo in YouTube


    In principle the same technique for finding prime numbers is used as in an earlier demo.
    Now, instead of forming a long column, the integers are listed compactly on edit lines, which gives a better view of the process.

    The sucro code with additional features (in red) used for tracing in the latter part of this demo can be studied here:



    In this two-window mode, sucro SIEVE is saved by a sucro command /TUTSAVE (on line 24 above) and with parameter ECHO2.

    It is possible to omit echoing for selected parts of the code by control codes {-} and {+} appearing here on lines 34 and 41.
    Thus the code on lines from 35 to 40 (used for finding next number after the newest prime in the list) is not echoed.

  • See also: Sucros in Survo and LST commands

    Multiple discriminant analysis in linguistic problems

    Example #93 (Start the demo by clicking the picture!)

    This demo in YouTube



    An old application presented in
    Mustonen, S. (1965). Multiple Discriminant Analysis in Linguistic Problems. Statistical Methods in Linguistics, 4, 37-44.

    is now revisited 50 years later by using a tenfold dataset.
    Systematic samples of 3000 words from each of the languages Finnish, Swedish, and English were collected from the word lists Finnish, Swedish, and English.

    When creating 43 numerical variables, some of them are based on (Finnish) hyphenation of a word. For this purpose a new key combination (F1 T) was introduced (in SURVO MM) for hyphenating the word touched by the cursor in the edit field.

    In the old experiment certain words were plotted in the two-dimensional discriminant space as presented in this picture

    and here is the result when the same words are plotted according to this new experiment

    In the latter picture the vertical axis had to be reversed for a proper comparison.


    My early (1965) experiment has been described recently (2013) by Steve Pepper.

  • See also: Multiple discriminant analysis

    Probability of Matching Column Drums (1/2)

    Example #94 (Start the demo by clicking the picture!)

    This demo in YouTube

    The main source of this demo is my joint paper with Jari Pakkanen.

    Certain probabilities related to the preserved columns of the ruined Temple of Zeus at Labraunda are obtained by using editorial computing in Survo.

  • See also: Editorial computing and X function in Survo

    Probability of Matching Column Drums (2/2)

    Example #95 (Start the demo by clicking the picture!)

    This demo in YouTube

    A study related to my joint paper with Jari Pakkanen is continued.

    The values of probabilities obtained in the previous demo are compared to empirical frequencies now obtained by simulation.

  • See also: Touch mode

    Distance distributions in networks 1

    Example #96 (Start the demo by clicking the picture!)

    This demo in YouTube

    This study originated from a practical research problem of determining the distribution of the distances travelled by postal parcels in Finland in the early 1960s. A random sample of postal traffic in Finland was collected for this and many other purposes.


    I was then working as an assistant to Professor Leo Törnqvist, and this practical problem gave us the idea for a more theoretical problem of deriving the distribution of the distance between two points chosen uniformly at random in a metric network. Törnqvist achieved the result by relying on his phenomenal intuition, without any formal derivation or proofs. My task was to formalize the problem, verify the results, and generalize them. This was done in my doctoral thesis in mathematics (1964).


    Now I have selected a more practical approach. Due to the enormous progress in computing speed and capacity over 50 years, the distributions can now be studied by plain simulations, which also give more possibilities for generalizing the original problem.


    Here a simple, regular network G(2) consisting of 2x2 unit squares is studied for finding the density function of the length of the shortest path (along the edges) between two random points selected according to uniform distribution over the entire network of length 12 units. The results were obtained by using results given in my dissertation. In particular, the expected value of the distance is 5/3=1.666...


    In the graph below the theoretical means for networks G(n), n=0,1,2,3,4, are given

    and it is obvious that generally the mean for G(n) is (2n+1)/3. When n grows, this mean divided by n approaches 2/3. This is validated by the fact that 2/3 is the expected value of the distance in the city metric between two random points inside a unit square, i.e. 2 times the mean distance 1/3 in G(0), as explained below.


    When we got interested in this topic (at the beginning of the 1960s), it was natural to start by calculating means for the simplest networks, like a single edge G(0) or a ring corresponding to G(1).
    In the latter case it is easy to see that the first point may be fixed, and thereafter it is seen immediately that the mean is one fourth of the total length. The fact that for an edge of length 1 the mean is just 1/3 requires more effort. My favourite (but a little heuristic) explanation was: "After selecting two random points on the edge, let's select a third random point. The probability that it falls between the two earlier ones is 1/3 by symmetry. Due to uniformity, the interval between the first two points thus covers 1/3 of the total length 'on average'."
    This statement can even be generalized: If n random points are selected from a unit interval, the expected value of the distance between the extreme points is (n-1)/(n+1). This is 'proved' in just the same way by selecting one more point.

    A simple strict proof that for G(0) the mean is 1/3 goes as follows: Let x be the mean. By splitting the unit edge into two equal parts of length 1/2, the probability of selecting the two points from the same half is 1/2 and from different halves also 1/2. The conditional means are x/2 in the first case and 1/2 in the second case. Then we get the equation x=1/2*x/2+1/2*1/2, from which x=1/3. (This was presented by Hannu Väliaho.)
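
    The generalization to n points can also be verified by a standard calculation on the extreme values. The following derivation is added here as a check and uses only the assumption that X_1,...,X_n are independent and uniform on (0,1):

```latex
\begin{align*}
P\bigl(\max_i X_i \le t\bigr) &= t^n, \qquad 0 \le t \le 1,\\
E\bigl[\max_i X_i\bigr] &= \int_0^1 t \cdot n t^{n-1}\,dt = \frac{n}{n+1},\\
E\bigl[\min_i X_i\bigr] &= 1 - \frac{n}{n+1} = \frac{1}{n+1}
  \quad\text{(by the symmetry } x \mapsto 1-x\text{)},\\
E\bigl[\max_i X_i - \min_i X_i\bigr] &= \frac{n}{n+1} - \frac{1}{n+1} = \frac{n-1}{n+1}.
\end{align*}
```

    For n=2 this gives the mean 1/3 of G(0).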


    When preparing my PhD thesis I made a program in Elliott Autocode for the Elliott 803B computer. That program created the exact density function, but required a lot of computer time. For example, the G(10) case took about 3 hours.
    Now, by using the GDIST operation, approximate results (10^6 simulations) are obtained in less than 10 seconds on my current (2015) PC. For example, I got the mean 6.995827 which is close enough to the exact value 7.


    The GDIST program calculates shortest distances between any points on the edges of the network as follows. At first the distance matrix between all end points is calculated by Dijkstra's algorithm, and then, after selecting random points, the various distances through the end points of the corresponding edges (4 alternatives) are considered and the minimal distance is selected. As a special case, the points may be selected on the same edge. Then, as the fifth alternative, the distance between the points along that edge must be considered, too. It should be noted that on 'curved' edges the last alternative is not necessarily the shortest one.
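
    The steps described above can be sketched in Python. This is not the GDIST program but a simplified illustration under stated assumptions: the network is connected, random points are selected with probability proportional to edge length only, and all function names are hypothetical.

```python
import heapq
import random


def grid_network(n):
    """Edges (u, v, length) of G(n), n >= 1: n x n unit squares on vertices (i, j)."""
    edges = []
    for i in range(n + 1):
        for j in range(n + 1):
            if i < n:
                edges.append(((i, j), (i + 1, j), 1.0))
            if j < n:
                edges.append(((i, j), (i, j + 1), 1.0))
    return edges


def vertex_distances(edges):
    """Shortest distances between all end points (Dijkstra from each vertex)."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist = {}
    for src in adj:
        d = {src: 0.0}
        heap = [(0.0, src)]
        while heap:
            du, u = heapq.heappop(heap)
            if du > d[u]:
                continue                      # outdated heap entry
            for v, w in adj[u]:
                if du + w < d.get(v, float("inf")):
                    d[v] = du + w
                    heapq.heappush(heap, (d[v], v))
        dist[src] = d
    return dist


def point_distance(p, q, edges, dist):
    """Distance between points p=(edge, offset) and q: 4 routes via end points,
    plus the direct route when both points lie on the same edge."""
    (e1, t1), (e2, t2) = p, q
    u1, v1, w1 = edges[e1]
    u2, v2, w2 = edges[e2]
    best = min(a_off + dist[a][b] + b_off
               for a, a_off in ((u1, t1), (v1, w1 - t1))
               for b, b_off in ((u2, t2), (v2, w2 - t2)))
    if e1 == e2:
        best = min(best, abs(t1 - t2))        # the fifth alternative
    return best


def mean_distance(edges, samples=20000, seed=1):
    """Monte Carlo estimate of the mean distance between two random points."""
    rng = random.Random(seed)
    dist = vertex_distances(edges)
    weights = [w for _, _, w in edges]

    def random_point():
        e = rng.choices(range(len(edges)), weights=weights)[0]
        return e, rng.uniform(0.0, edges[e][2])

    total = sum(point_distance(random_point(), random_point(), edges, dist)
                for _ in range(samples))
    return total / samples


print(mean_distance(grid_network(2)))   # should be close to 5/3
```

    With straight edges the fifth alternative is always the shortest one when both points are on the same edge; taking the minimum over all five covers the 'curved' case as well.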

    Distance distributions (Twin cities bridge problem)
    Distance distributions (n-dimensional cube)



    Distance distributions in networks 2

    Example #97 (Start the demo by clicking the picture!)

    This demo in YouTube

    As a more 'practical' example the new GDIST operation is used here for optimization of a traffic network. The mean distance is minimized by selecting the best place for a second bridge for connecting two parts of a city.


    In the A matrix

    
    MAT SAVE AS A
     1  2 10 1
     2  3  5 1
     3  4  5 1
     1  5  4 1
     2  6  4 1
     3  7  4 1
     4  8  4 1
     5  6 10 1
     6  7  5 1
     7  8  5 1
     7 12  6 0
     9 10  7 1
     -  -  - -
    
    the fourth element on each row tells the intensity of sending/receiving traffic so that the probability of selecting a random point in the network from a particular edge is proportional to the product of its length and intensity. Thus in this example no journey has an origin or a destination on the edge of a bridge (7 12 6 0). The intensities can be any non-negative numbers.
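
    The selection probabilities of the edges follow directly from this rule. Below is a minimal Python sketch using only the rows of A displayed above (the matrix is truncated in the listing, so the numbers are illustrative, not those of the full network); the names A_ROWS and edge_probabilities are hypothetical.

```python
# Rows of A as displayed above: (from, to, length, intensity)
A_ROWS = [
    (1, 2, 10, 1), (2, 3, 5, 1), (3, 4, 5, 1), (1, 5, 4, 1),
    (2, 6, 4, 1), (3, 7, 4, 1), (4, 8, 4, 1), (5, 6, 10, 1),
    (6, 7, 5, 1), (7, 8, 5, 1), (7, 12, 6, 0), (9, 10, 7, 1),
]


def edge_probabilities(rows):
    """Probability of each edge being selected: proportional to length * intensity."""
    weights = [length * intensity for _, _, length, intensity in rows]
    total = sum(weights)
    return [w / total for w in weights]


for row, prob in zip(A_ROWS, edge_probabilities(A_ROWS)):
    print(row, round(prob, 4))
```

    The bridge (7 12 6 0) gets probability 0, so it can be travelled but never contains an origin or a destination.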

    Another generalization (not used in this example) is the possibility to add a fifth element (1 or 2) on the A-rows, marking the edges denoted by 1 as sending regions and the edges denoted by 2 as receiving regions, so that only traffic between the two regions is considered.

    When applying this possibility to this "twin-cities" example so that only traffic between the upper and lower city is studied, in the original situation (no second bridge) the mean distance grows from 17.30 to 24.53 because the internal traffic in the two parts is excluded, but this modification does not change the optimal solution for the additional bridge.

    It would be easy to make modifications to the GDIST program so that a certain gravitation principle is adopted (the probability of taking a journey depends on the distance). Obviously this feature can be taken into account by a suitable transformation of the density function afterwards.


    Distance distributions (Background information)
    Distance distributions (n-dimensional cube)


    Distance distributions in networks 3

    Example #98 (Start the demo by clicking the picture!)

    This demo in YouTube

    This is so far the only example where the distance distribution is asymptotically normal. It is shown that in an n-dimensional unit cube the distance (along the edges) between randomly selected vertices has the binomial distribution Bin(n,1/2) with mean n/2. The distribution of the distance between random points on the edges has similar properties, but it is more complicated to handle.

    
    The exact expected value for the distance (along the edges) between
    two random points on the edges of the 4-dimensional cube can be computed
    as follows:
    Because this 'network' is symmetric for each edge, it is sufficient
    to assume that the first random point is selected from a given
    edge. Each element (vertex, edge, square, cube) of this hypercube
    can be denoted by a string of the form abcd where each a,b,c, and d
    can be 0, 1, or x, where x covers the range (0,1). Thus for example,
    0000 is the first vertex (origin), x000 is the edge from origin to
    (1,0,0,0), and xxx0 and x1xx are two of the eight cubes located in the
    hypercube.
    
    The mean distances between x000 and the other edges can be
    easily computed and they are presented in the following table.
    
    edge mean     edge mean   edge mean   edge mean
    x000  1/3     0x00  1     00x0  1     000x  1
    x100  5/3     1x00  1     10x0  1     100x  1
    x010  5/3     0x10  2     01x0  2     010x  2
    x110  8/3     1x10  2     11x0  2     110x  2
    x001  5/3     0x01  2     00x1  2     001x  2
    x101  8/3     1x01  2     10x1  2     101x  2
    x011  8/3     0x11  3     01x1  3     011x  3
    x111 11/3     1x11  3     11x1  3     111x  3
    
    The means in the first column are related to 'opposite' edges of
    x000. According to terminology used in my dissertation (1964)
    they are mirror point sets for x000.
    It is clear that the mean inside x000 is 1/3, but why is the mean
    distance between random points on opposite edges of a unit
    square 5/3?
         .----.
         |    |
         |    |
         .----.
    Let it be x. If two random points are selected from the edges of a
    unit square, the probability that they are selected from the same
    edge is 1/4, from opposite edges similarly 1/4, and from
    neighbouring edges 1/2. The means are 1/3, x, and 1 respectively.
    Since the total mean in unit square is 1, we have an equation
       1 = 1/4*1/3 + 1/4*x + 1/2*1
    giving x=5/3. The remaining means in the first column are thereafter
    easy to comprehend.
    
    The means in the remaining three columns are obvious, since they
    are either adjacent to x000 or adjacent through a route of a constant
    integer length.
    
    Now the total expected value of the distance in the entire
    4-dimensional cube is calculated as
    
    ((1*2+3*5+3*8+1*11-1)/3+2*3*(1*1+2*2+1*3))/(4*2^3)=2.03125
    
    The general structure becomes still clearer in the 5-dimensional case
    where the expected value has the form
    
    ((1*2+4*5+6*8+4*11+1*14-1)/3+2*4*(1*1+3*2+3*3+1*4))/(5*2^4)=2.5291666666667
    and 2.5291666666667(12:ratio)=607/240 (0.00000000000003)
    
    On the basis of these expressions it is obvious that
    in the n-dimensional case the expression for the mean distance
    can be presented in the form
    
    E(n)=((P(n+1)-1)/3+2*(n-1)*Q(n-2))/(n*2^(n-1))
    
    Both P(n) and Q(n) are 'weighted' sums of binomial coefficients.
    In fact P-sequence is an inverse binomial transform of an arithmetic
    sequence 2,5,8,11,14,... and Q-sequence a similar transform of natural
    numbers 1,2,3,4,5,...
    
    The P(n) values for n=2,...,7 are
    
     n  P(n)
     2  1*2=2
     3  1*2+1*5=7
     4  1*2+2*5+1*8=20
     5  1*2+3*5+3*8+1*11=52
     6  1*2+4*5+6*8+4*11+1*14=128
     7  1*2+5*5+10*8+10*11+5*14+1*17=304
    
    By consulting OEIS (The On-Line Encyclopedia of Integer Sequences)
    it is found that the sequence 2,7,20,52,128,304,... is
    A066373
    and P(n)=(3*n-2)*2^(n-3),  n=2,3,...
    
    The Q(n) values for n=0,1,...,5 are
    
     n  Q(n)
     0  1*1=1
     1  1*1+1*2=3
     2  1*1+2*2+1*3=8
     3  1*1+3*2+3*3+1*4=20
     4  1*1+4*2+6*3+4*4+1*5=48
     5  1*1+5*2+10*3+10*4+5*5+1*6=112
    
    and OEIS tells that 1,3,8,20,48,112,... is
    A001792
    and Q(n):=(n+2)*2^(n-1),  n=0,1,...
    
    By substituting the expressions of P(n+1) and Q(n-2) into
    the previous formula of E(n) we obtain
    E(n)=(((3*n+1)*2^(n-2)-1)/3+2*(n-1)*n*2^(n-3))/(n*2^(n-1))
    and this can be simplified into the form
    E(n)=n/2+(1-2^(2-n))/(6*n).
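
    The tables of P(n) and Q(n) and the final simplification can be checked with exact rational arithmetic. The following Python sketch is an independent verification, not part of the Survo work; the function names are hypothetical, and the sums are written out from the patterns shown in the tables above.

```python
from fractions import Fraction
from math import comb


def P(n):
    """Binomial-weighted sum of 2, 5, 8, ... with n-1 terms (n >= 2)."""
    return sum(comb(n - 2, k) * (2 + 3 * k) for k in range(n - 1))


def Q(n):
    """Binomial-weighted sum of 1, 2, 3, ... with n+1 terms (n >= 0)."""
    return sum(comb(n, k) * (k + 1) for k in range(n + 1))


def mean_by_sums(n):
    """E(n) from the expression with P(n+1) and Q(n-2), valid for n >= 2."""
    return (Fraction(P(n + 1) - 1, 3) + 2 * (n - 1) * Q(n - 2)) / (n * 2 ** (n - 1))


def mean_closed_form(n):
    """E(n) = n/2 + (1 - 2^(2-n)) / (6n)."""
    return Fraction(n, 2) + (1 - Fraction(2) ** (2 - n)) / (6 * n)


for n in range(2, 8):
    print(n, mean_by_sums(n), mean_closed_form(n))
```

    For n=4 and n=5 both functions return 65/32 and 607/240, the values computed above.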
    


    Distance distributions (Background information)
    Distance distributions (Twin cities bridge problem)
    Projections of a 4-dimensional cube

    Battle over Degrees of Freedom

    Example #99 (Start the demo by clicking the picture!)

    This demo in YouTube

    I originally created this example as a flash application in 2006. The weblink to Fisher's paper given there is no longer valid.
    Currently this paper "Bayes' theorem and the fourfold table" (1926) is available from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2984620/.


    Problem of minor chord in music

    Example #100 (Start the demo by clicking the picture!)

    This demo in YouTube

    It is shown that although the pure minor triad is theoretically more complicated than the corresponding major triad, this contrast is essentially diminished when these triads are compared in equal temperament. In the history of music the transition from meantone tuning (with pure thirds) towards more practical temperaments has a substantial role in the development of tonal western music, since most of the composers have used keyboard instruments with a fixed intonation like clavichord, harpsichord, organ or piano as tools in their work.
    In my opinion, this is a simpler solution to the 'problem of the minor chord' than the purely theoretical explanations presented by many musicologists.
    The oscillograms for pure triads were created by the following plotting schemes of Survo:
    HEADER=[Swiss(20)],Major_triad_(pure)
    HOME=0,175 SIZE=649,174
    XSCALE=0:_,10*pi:_ YSCALE=[SMALL],-3:_,3:_ pi=3.14159265
    X=0,10*pi,pi/60  XDIV=29,600,20  FRAME=0
    GPLOT Y(X)=sin(20*X)+sin(25*X)+sin(30*X)
                   20:25:30=4:5:6
    ........................................................
    HEADER=[Swiss(20)],Minor_triad_(pure)
    HOME=0,0 SIZE=649,174
    XSCALE=0:_,10*pi:_ YSCALE=[SMALL],-3:_,3:_ pi=3.14159265
    X=0,10*pi,pi/60  XDIV=29,600,20  FRAME=0
    GPLOT Y(X)=sin(20*X)+sin(24*X)+sin(30*X)
                   20:24:30=10:12:15
    ........................................................
    
    In the corresponding tempered triads the proportions 20:25:30 and
    20:24:30 were replaced by 20:20*2^(4/12):20*2^(7/12) and
    20:20*2^(3/12):20*2^(7/12).
    


    The sound file for the pure A Major triad was built in the standard Waveform Audio File (WAV) format as follows:
    At first a Survo data file TEST of 100000 observations for an integer
    variable X was created by
    FILE MAKE TEST,1,100000,X,2
    
    A fundamental tone of frequency F Hz at sampling rate RATE is described
    by a sine function of the form
    S(x):=sin(x*ORDER*2*pi*F/RATE)
    where ORDER is the index of an observation in the data file TEST.
    
    When F=440, RATE=44100, and pi=3.141592653589793
    S(1) will represent a sample of tone A=440 Hz
    and a sample of a pure A Major triad is saved in the data file TEST
    as variable X by the VAR operation
    VAR X=1000*(S(1)+S(5/4)+S(3/2)) TO TEST
    where 1000 gives the sound volume.
    
    Now this TEST file is converted to a WAV file MAJOR.WAV by the Survo
    operation
    PLAY DATA TEST,X / WAV=MAJOR
    and thereafter it can be played as a sound of length
    100000/RATE=2.267... seconds by activating
    PLAY SOUND MAJOR
    
    The other triads were obtained by replacing S(1)+S(5/4)+S(3/2) in the
    VAR operation above by
    S(1)+S(6/5)+S(3/2)             pure minor
    S(1)+S(2^(4/12))+S(2^(7/12))   tempered major
    S(1)+S(2^(3/12))+S(2^(7/12))   tempered minor
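
    The construction above can also be sketched outside Survo. The following Python fragment is only an illustration under stated assumptions: it uses the standard wave module and 16-bit mono samples, and the names tone, triad_samples, write_wav and TRIADS are hypothetical. It is not the code of the PLAY operation.

```python
import math
import struct
import wave

RATE = 44100      # sampling rate (Hz), as in the text
F = 440.0         # fundamental tone A
N = 100000        # number of observations, as in FILE MAKE TEST

def tone(order, ratio):
    """S(ratio) of the text: sample number `order` of a sine tone of frequency ratio*F."""
    return math.sin(ratio * order * 2 * math.pi * F / RATE)

def triad_samples(ratios, volume=1000, n=N):
    """Integer samples volume*(S(r1)+S(r2)+S(r3)) for ORDER=1,...,n."""
    return [round(volume * sum(tone(k, r) for r in ratios)) for k in range(1, n + 1)]

def write_wav(path, samples, rate=RATE):
    """Save the samples as a 16-bit mono WAV file (an assumed layout)."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))

# Frequency ratios of the four triads listed above
TRIADS = {
    "pure_major": (1, 5 / 4, 3 / 2),
    "pure_minor": (1, 6 / 5, 3 / 2),
    "tempered_major": (1, 2 ** (4 / 12), 2 ** (7 / 12)),
    "tempered_minor": (1, 2 ** (3 / 12), 2 ** (7 / 12)),
}

if __name__ == "__main__":
    write_wav("MAJOR.WAV", triad_samples(TRIADS["pure_major"]))
    print(round(N / RATE, 3), "seconds")
```

    Replacing "pure_major" by another key of TRIADS gives the other three sound files.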
    

  • See also: Sound files in Survo

    Dissonance functions

    Example #101 (Start the demo by clicking the picture!)

    This demo in YouTube

    In 1972 when reading the classical treatise "On the Sensations of Tone" by Helmholtz (1863)
    I noticed his graphical presentation



    on the consonance of various musical intervals based on practical experiments on violin.
    I formulated these results as a theoretical model as follows:


    
    Minimum points of the dissonance function
    
    "Accuracy of the ear"        c=5     c=4     c=3     c=2
    
    Unison                       1:1     1:1     1:1     1:1
    Just minor semitone         18:17
    Minor diatonic semitone     17:16
    Just diatonic semitone      16:15
    Septimal diatonic semitone  15:14
    Lesser tridecimal 2:3-tone  14:13
    Greater tridecimal 2:3-tone 13:12
    Neutral second              12:11
    Neutral second              11:10
    Major second                10:9    10:9
    Major second                 9:8     9:8
    Septimal major second        8:7     8:7
                                15:13
    Septimal minor third         7:6     7:6
                                13:11
    Just minor third             6:5     6:5     6:5
    Neutral third               11:9
    Major third                  5:4     5:4     5:4
    Diminished major third      14:11
    Septimal major third         9:7     9:7
    Diminished fourth           13:10
    Perfect fourth               4:3     4:3     4:3     4:3
                                15:11
    Eleventh harmonic           11:8
    Lesser septimal tritone      7:5     7:5
    Greater septimal tritone    10:7    10:7
                                13:9
    Inversion of 11th harmonic  16:11
    Perfect fifth                3:2     3:2     3:2     3:2
                                17:11
    Septimal minor sixth        14:9
    Undecimal minor sixth       11:7    11:7
    Just minor sixth             8:5     8:5
    Tridecimal neutral sixth    13:8
    Just major sixth             5:3     5:3     5:3
                                17:10
    Septimal major sixth        12:7
    Harmonic seventh             7:4     7:4     7:4
    Small just minor seventh    16:9
    Greater just minor seventh   9:5     9:5
    Undecimal neutral seventh   11:6    11:6
                                13:7
    Just major seventh          15:8
                                17:9
                                19:10
                                21:11
    Octave                       2:1     2:1     2:1     2:1
    


    My simple intuitive approach to this problem is conceptually different from results presented by
    Plomp, Levelt, and Terhardt, for example, based on psychoacoustics and auditory experiments.
    An excellent source for this topic is the book
    Tuning, Timbre, Spectrum, Scale (Second edition 2005) by William A. Sethares.


    In Survo the same algorithm is available also for Rational approximations of decimal numbers by 'listening'.
  • See also: Converting decimal numbers to fractions and Survo book (1992) p.63 and p.330.

    Random music with Slutzky-Yule effect

    Example #102 (Start the demo by clicking the picture!)

    This demo in YouTube

    In 2001 I programmed a PLAY operation for making various acoustic experiments in Survo.
    This demo shows how synthetic WAV files are generated by using PLAY and Survo operations related to data files (FILE MAKE, VAR, SER).
    For example, it was interesting to see how a sequence of random tones, when tamed by a simple moving average transformation, becomes more 'melodic' (corresponding to the Slutzky-Yule effect in economic time series).
  • See also: PLAY operation and SER operation
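
    The smoothing effect can be illustrated numerically. The sketch below is not the Survo setup of the demo: the uniform 'tones', the window length 5 and the seed are assumptions chosen only for illustration. It compares the lag-1 autocorrelation of a purely random sequence with that of its moving average.

```python
import random

def moving_average(values, window):
    """Simple moving average; the output is shorter by window-1 items."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

def lag1_autocorrelation(values):
    """Sample correlation between consecutive items of the sequence."""
    n = len(values)
    mean = sum(values) / n
    num = sum((values[i] - mean) * (values[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in values)
    return num / den

rng = random.Random(2001)                        # arbitrary seed
raw = [rng.uniform(0, 10) for _ in range(2000)]  # 'random tones'
smooth = moving_average(raw, 5)

# The raw sequence is nearly uncorrelated; the averaged one is strongly
# autocorrelated (theoretical lag-1 value 4/5 for a window of 5).
print(round(lag1_autocorrelation(raw), 2), round(lag1_autocorrelation(smooth), 2))
```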


    Synthetic bird song

    Example #103 (Start the demo by clicking the picture!)

    This demo in YouTube

    In Survo artificial sound files can be generated by defining waveforms by means of the VAR function. Combinations of trigonometric functions are suitable for such applications.
    The harmony of sounds is here distorted by jumps triggered by an int function (on edit lines 13,14) rounding numbers down to the nearest integer.
    A similar effect plays an important role visually in
    Lissajous curve variation (Knitting a carpet)
  • See also: PLAY operation


    Sounds of statistical data 1

    Example #104 (Start the demo by clicking the picture!)

    This demo in YouTube

    It is shown how the PLAY operation may be a useful tool for finding structural features and for detecting outliers in statistical data sets.
    I created this feature in a restricted form in 1987 for SURVO 84C.
    There it was working in connection with the FILE SHOW operation.

    Series of values of any numerical variable in a Survo data (list, table, or file) can be converted to musical tones and played by the PLAY DATA operation.
    The tone file TRIADS.TON has an essential role in this task. Each tone file is a standard text file with a specific structure. For example, the tone file TRIADS.TON was created in the Survo edit field in this way
    (by activating the SAVEP command on line 11):
    
     11 *SAVEP CUR+1,E,TRIADS.TON
     12 *1     / Type
     13 *44100 / Rate Hz
     14 *11    / # of tones
     15 *9     / # of partials
     16 * 0  110   1 S1:.9 S2:.3 S3:.3 S4:.2 S5:.2 S6:.1 S7:.1 S8:.05 S9:.05
     17 * 1  137   1 S1:.9 S2:.3 S3:.3 S4:.2 S5:.2 S6:.1 S7:.1 S8:.05 S9:.05
     18 * 2  165   1 S1:.9 S2:.3 S3:.3 S4:.2 S5:.2 S6:.1 S7:.1 S8:.05 S9:.05
     19 * 3  220   1 S1:.9 S2:.3 S3:.3 S4:.2 S5:.2 S6:.1 S7:.1 S8:.05 S9:.05
     20 * 4  275   1 S1:.9 S2:.3 S3:.3 S4:.2 S5:.2 S6:.1 S7:.1 S8:.05 S9:.05
     21 * 5  330   1 S1:.9 S2:.3 S3:.3 S4:.2 S5:.2 S6:.1 S7:.1 S8:.05 S9:.05
     22 * 6  440   1 S1:.9 S2:.3 S3:.3 S4:.2 S5:.2 S6:.1 S7:.1 S8:.05 S9:.05
     23 * 7  550   1 S1:.9 S2:.3 S3:.3 S4:.2 S5:.2 S6:.1 S7:.1 S8:.05 S9:.05
     24 * 8  660   1 S1:.9 S2:.3 S3:.3 S4:.2 S5:.2 S6:.1 S7:.1 S8:.05 S9:.05
     25 * 9  880   1 S1:.9 S2:.3 S3:.3 S4:.2 S5:.2 S6:.1 S7:.1 S8:.05 S9:.05
     26 *10 1100   1 S1:.9 S2:.3 S3:.3 S4:.2 S5:.2 S6:.1 S7:.1 S8:.05 S9:.05
     27 *
    
    The first four lines (12-15) define the general structure of the file.
    So far "Type" 1 is the only alternative. "Rate Hz" is the sampling rate used when creating Waveform File Format (WAV) audio files by PLAY DATA. "# of tones" is the number of tones defined in the tone file and "# of partials" gives the number of partials (fundamental tone + overtones) for each tone.
    Each tone is defined on a line of its own (lines 16-26 above) and has the form
  • index 0,1,2,...,
  • fundamental tone in Hz,
  • relative volume (1 for each tone in this case),
  • list of partials with their relative frequencies as multipliers of the fundamental tone and their relative powers.

    For example, above the tone 6 is 440 Hz with relative volume 1 and in the list of partials S3:.3 denotes the third partial (3 times the fundamental frequency) with a relative power 0.3.
    Here 'S' refers to a sinusoidal partial.
    In this case, all tones have the same composition of partials, but in general the partials may be different for each tone and they need not be integer multiples of the fundamental tone.
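
    The structure of a tone definition line can be made concrete by a small parser. This is a sketch only: the function names are hypothetical, the edit line number and control character of the Survo edit field (e.g. "22 *") are assumed to be removed already, and nothing here is taken from the actual implementation of PLAY DATA.

```python
def parse_tone_line(line):
    """Parse one tone definition such as ' 6  440   1 S1:.9 S2:.3 S3:.3'."""
    fields = line.split()
    index, frequency, volume = int(fields[0]), float(fields[1]), float(fields[2])
    partials = []
    for item in fields[3:]:
        kind = item[0]                          # 'S' = sinusoidal partial
        multiplier, power = item[1:].split(":")
        partials.append((kind, float(multiplier), float(power)))
    return index, frequency, volume, partials

def partial_frequencies(line):
    """Absolute frequencies (Hz) and relative powers of the partials of a tone."""
    _, frequency, _, partials = parse_tone_line(line)
    return [(frequency * m, p) for _, m, p in partials]

line = " 6  440   1 S1:.9 S2:.3 S3:.3 S4:.2 S5:.2 S6:.1 S7:.1 S8:.05 S9:.05"
print(partial_frequencies(line)[:3])
# prints [(440.0, 0.9), (880.0, 0.3), (1320.0, 0.3)]
```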


    Sounds of statistical data 2: "Cuckoos singing in the rain"

    Example #105 (Start the demo by clicking the picture!)

    This demo in YouTube

    The entire statistical time series of the monthly mean temperature and rainfall in Helsinki from 1829 to the end of 2015 (187 consecutive years) is here presented in 226 seconds as a sequence of two voices.
    Thus the time elapses here over 26 million times faster than the real time.

    The snapshot above tells all details how this demo was generated.
    For both voices the following simple "tone generator" TRIADS1.TON was used (see the line 10 above). Its properties are given at the end of these comments.
    The specifications Temp and Rain (lines 11-12) tell how voices related to Temp and Rain variables would appear.
    The parameters in these definitions are in general:
  • data variable creating the voice (Temp, Rain)
  • tone generator (index from TONES specification)
  • channel selection (0=left 1=right)
  • relative volume (1,1)
  • decay rate of volume <= 1 (default 1 = no decay)

    SCALING=1 (line 14) tells that the values of the data variables in use are linearly mapped to the indices (0,1,2,...) of the corresponding tone generator. If it is not given, the values of the variables must already be integers 0,1,... within the scale of the corresponding tone generator.
    The specifications on the line 15 determine observations of the data to be used (RECORDS), tempo of "music" (TEMPO), and general volume (VOLUME).
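
    A linear mapping of data values to tone indices can be sketched as follows. The exact rule used by SCALING=1 is not given here, so the min-to-0, max-to-last-index mapping with rounding is an assumption, the function name is hypothetical, and the twelve 'temperatures' are invented illustration values, not the Helsinki data.

```python
def scale_to_tone_indices(values, n_tones):
    """Map values linearly onto the integer tone indices 0,...,n_tones-1."""
    lo, hi = min(values), max(values)
    if hi == lo:                       # constant data: use the lowest tone
        return [0 for _ in values]
    return [round((v - lo) / (hi - lo) * (n_tones - 1)) for v in values]

# Invented monthly values for one year (illustration only)
temps = [-6.2, -5.0, -1.1, 4.3, 10.9, 15.2, 17.6, 16.0, 11.0, 5.9, 0.8, -3.5]
print(scale_to_tone_indices(temps, 11))   # indices for a generator of 11 tones
```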

    In this case I have wanted Survo to display the current year as a text during the fast flow of years.
    This takes place by calling another Survo session to play the sound by 'activating' the PLAY DATA command (line 16) by a special key sucro (modification of PRE M Z).
    Then the 'main' Survo is free to do anything during the play and in this case displays the years on line 20 by a simple sucro.
    To get the sound and the display of the year synchronized, the TEMPO specification has been adjusted to value 596 just by trial and error.

    Here are the properties of TRIADS1.TON:
    
     11 *SAVEP CUR+1,E,TRIADS1.TON
     12 *1     / Type
     13 *44100 / Rate Hz
     14 *11    / # of tones
     15 *3     / # of partials
     16 * 0  110   1 S1:1 S2:0.5 S3:0.25
     17 * 1  137   1 S1:1 S2:0.5 S3:0.25
     18 * 2  165   1 S1:1 S2:0.5 S3:0.25
     19 * 3  220   1 S1:1 S2:0.5 S3:0.25
     20 * 4  275   1 S1:1 S2:0.5 S3:0.25
     21 * 5  330   1 S1:1 S2:0.5 S3:0.25
     22 * 6  440   1 S1:1 S2:0.5 S3:0.25
     23 * 7  550   1 S1:1 S2:0.5 S3:0.25
     24 * 8  660   1 S1:1 S2:0.5 S3:0.25
     25 * 9  880   1 S1:1 S2:0.5 S3:0.25
     26 E10 1100   1 S1:1 S2:0.5 S3:0.25
     27 *
    


    The monthly statistics of the mean temperature and the rainfall are presented below as plots of time series for the last 20 years.

    Because the temperature varies rather systematically between low and high values during each year, it gives a clear cycle of 12 tones, whereas the tones from the rainfall are typically on a low level and high tones appear like random peaks (cuckoos singing in the rain).
  • See another graphical presentation of Temperature in Helsinki.
  • More demos related to the same data are ex33 and ex87.
  • The weather data are provided by Finnish Meteorological Institute.


    Resurrection of SURVO 66

    Example #106 (Start the demo by clicking the picture!)

    This demo in YouTube

    In this recreation of SURVO 66 its essential features are reproduced.
    The example is taken from Statistical programming language SURVO 66
    by T.Alanko, S.Mustonen, M.Tienari (1968). BIT, 8, 69-85.
    A more comprehensive document is
    Tilastollinen tietojenkäsittelyjärjestelmä SURVO 66
    (in Finnish) by S.Mustonen (1967).

    The SURVO66 operation is mainly a historical account about the origin of Survo. However, in cases where a great number of two-dimensional tables of frequencies, means, and standard deviations has to be made from large data, the SURVO66 operation is more efficient than the TAB operation since everything is obtained in one pass of the data set.

    SURVO66 does its job in three stages T1, T2, and T3:
    T1: Reading and storing the program. Storage space is allocated for each operation and the sum locations are cleared.
    T2: Reading the data so that only one observation is in the core memory at the same time. (This is typical in SURVO MM also.)
    The entire SURVO66 program is obeyed for each observation and 'sufficient' statistics are collected.
    T3: After all observations have been processed, the SURVO66 program is scanned once more. 'Sufficient' statistics are processed into final form of the output and saved as a text file SURVO66.TXT.
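
    The idea of collecting 'sufficient' statistics in one pass (stages T2 and T3) can be sketched in Python as follows. This is a simplified illustration with hypothetical names, not the SURVO66 code: each table cell keeps only a count, a sum and a sum of squares, and the means and standard deviations are formed after the data have been read.

```python
import math
from collections import defaultdict

def one_pass_table(observations):
    """observations yields (row_class, column_class, value), one at a time."""
    cells = defaultdict(lambda: [0, 0.0, 0.0])    # n, sum, sum of squares
    for row, col, x in observations:              # stage T2: a single pass
        cell = cells[(row, col)]
        cell[0] += 1
        cell[1] += x
        cell[2] += x * x
    table = {}
    for key, (n, s, ss) in cells.items():         # stage T3: final form
        mean = s / n
        # the standard deviation is undefined for a single observation
        sd = math.sqrt(max(ss - n * mean * mean, 0.0) / (n - 1)) if n > 1 else None
        table[key] = (n, mean, sd)
    return table

data = [("A", 63, 1.0), ("A", 63, 3.0), ("B", 64, 5.0)]   # toy data
print(one_pass_table(iter(data)))
```

    A cell with one observation gets no standard deviation, in the same way as a single-observation cell is shown as '-' in a table of standard deviations.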

    Functions of SURVO MM are used in certain operations of SURVO66, like output formatting in CORREL and computations and output of REGRAN (based on sufficient statistics from CORREL).

    The example presented in this demo does not contain any conditional operations (IF, EQUAL, LESS, BETWEEN, OR, AND, NOT) available also in the SURVO66 operation.

    See also Cross tabulations with SURVO66


    Sources and recordings related to Elliott 803 computer:
    Elliott 803
    Peter Onion's video
    Early sound experiment on Elliott 803 computer in 1962
    Using Elliott 803 (in Finnish) (Martti Tienari and Seppo Mustonen)

    The basic contents of this example:
    
    A data set from a statistical research by Dr. Knight on computer
    characteristics is used and it has the following contents
    as a text file KNIGHT.TXT (N=91):
    
    SHOW KNIGHT.TXT
    %
    % Date         Scientific      Commercial      Inverse unit
    % introduced   power (ops/sec) power (ops/sec) cost (sec/$)
    % month year
       4     63     21420.000       9079.000       44.54
       4     65    224374.000     118154.000       15.20
       7     63     67660.000      23420.000       23.98
       4     65      1768.000        990.500      230.90
       1     63     68690.000      58880.000       13.86
       4     65     68497.000      29571.000      103.90
      --     --    ----------     ----------      ------
    
    The main target in this example was to see how well Grosch's law
    P=kC^2 fits Knight's data by using a linear model
    
    ln(P) = a0 + a1*ln(C) + a2*T
    
    where P is in units of 1000 ops/sec, C is in $/hour, and T is the age in months.
    This law states that a1 should be close to 2.
    
    
    The program code (with comments in parentheses) is:
    SURVO66 KNIGHT.TXT
    Evolving Computer performance 1963-1967
    M@5                         (number of variables)
    CALL@X1 MONTH               (rename variables)
        @X2 YEAR
    DEF@MONTH L:1 U:12          (set limits)
       @YEAR L:63 U:67
    DIV@SPEED X3 1000           (SPEED=X3/1000)
    DIV@COST 3600 X5            (COST=3600/X5)
    SUB@Y1 68 YEAR              (Y1=68-YEAR)
    MULT@Y2 12 Y1               (Y2=12*Y1)
    SUB@AGE Y2 MONTH            (AGE= age in months)
    LOG@LSPEED SPEED            (LSPEED=log(SPEED))
       @LCOST COST
    CLASS@COSTCL                (classification COSTCL)
          CHEAP 30              (upper limit for CHEAP is 30)
          MODER 90
          EXPNS 500
    TABLE@YEAR -                (column variable, default classes 63-67)
          DEVEL COST COSTCL     (table DEVEL, row variable COST with COSTCL)
          T:SPEED               (tables of means and stddevs of SPEED)
    CORREL@LSPEED LCOST AGE N:CORR (correlations of variables LSPEED-AGE)
    REGRAN@LSPEED LCOST AGE N:CORR (regression analysis using CORR)
    END@
    
    The original names X1,X2,... of the variables are renamed in the code,
    and the code is activated by the SURVO66 command with the name of
    the data set (KNIGHT.TXT) as the only parameter.
    
    The results have been saved in a text file SURVO66.TXT.
    They are now loaded into the edit field by
    LOADP SURVO66.TXT
    
    Evolving Computer performance 1963-1967
    
    M@5                         (number of variables)
    CALL@X1 MONTH               (rename variables)
        @X2 YEAR
    DEF@MONTH L:1 U:12          (set limits)
       @YEAR L:63 U:67
    DIV@SPEED X3 1000           (SPEED=X3/1000)
    DIV@COST 3600 X5            (COST=3600/X5)
    SUB@Y1 68 YEAR              (Y1=68-YEAR)
    MULT@Y2 12 Y1               (Y2=12*Y1)
    SUB@AGE Y2 MONTH            (AGE= age in months)
    LOG@LSPEED SPEED            (LSPEED=log(SPEED))
       @LCOST COST
    CLASS@COSTCL                (classification COSTCL)
          CHEAP 30              (upper limit for CHEAP is 30)
          MODER 90
          EXPNS 500
    
    TABLE@YEAR -                (column variable, default classes 63-67)
          DEVEL COST COSTCL     (table DEVEL, row variable COST with COSTCL)
          T:SPEED               (tables of means and stddevs of SPEED)
    
    Table: DEVEL
    Column variable: YEAR
    Row    variable: COST
    Frequencies
                   63       64       65       66       67    Total
       CHEAP        6        4       10        7        4       31
       MODER        7       11        9        5        1       33
       EXPNS        6        6        6        6        3       27
       Total       19       21       25       18        8       91
    Chi2=6.0617 df=8 P=0.64032
    
    Means of SPEED
                   63       64       65       66       67    Total
       CHEAP 5.529187 2.078415 20.80887 1.658377 36.56175 13.14300
       MODER 13.25171 54.88778 50.50800 439.0866 154.8420 106.1023
       EXPNS 198.3025 1371.591 1123.889 1875.849 1419.726 1173.221
       Total 69.25011 421.0299 296.2399 747.8964 570.0332 391.0525
    
    Standard deviations of SPEED
                   63       64       65       66       67    Total
       CHEAP 9.650197 2.826062 28.63729 2.051873 55.15149 26.80980
       MODER 15.16012 59.95012 47.31911 484.0829        - 231.6778
       EXPNS 170.9234 2770.993 1412.064 2187.189 1567.665 1823.350
       Total 127.8364 1517.006 801.2234 1472.591 1095.507 1114.315
    
    CORREL@LSPEED LCOST AGE N:CORR (correlations of variables LSPEED-AGE)
    Means and standard deviations
    MATRIX CORR_M.M
    S66MSN
    ///          Mean   Stddev        N
    LSPEED    3.05430  3.11669 91.00000
    LCOST     3.90558  1.23418 91.00000
    AGE      32.97802 14.36120 91.00000
    Correlation coefficients
    MATRIX CORR_R.M
    S66CORR
    ///        LSPEED    LCOST      AGE
    LSPEED    1.00000  0.80688 -0.17920
    LCOST     0.80688  1.00000  0.05398
    AGE      -0.17920  0.05398  1.00000
    
    REGRAN@LSPEED LCOST AGE N:CORR (regression analysis using CORR)
    LINREG S66DATA>.M,CUR+1 / RESULTS=0
    Linear regression analysis: Data S66DATA>.M, Regressand LSPEED    N=91
    Variable Regr.coeff.    Std.dev.    t     beta
    AGE      -0.048483       0.012672 -3.826 -0.223
    LCOST     2.068079       0.147458  14.02  0.819
    constant -3.423891       0.716240 -4.780
    Variance of regressand LSPEED=9.713751390 df=90
    Residual variance=2.972162239 df=88
    R=0.8372 R^2=0.7008
    
    END@
    
    Grosch's law seems to fit Knight's data well: the estimated coefficient of LCOST (2.068) is close to 2.
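
    That the regression results follow from the sufficient statistics alone can be checked by hand. The Python sketch below uses only the means, standard deviations and correlations printed above and the standard formulas for a regression with two regressors; the variable names are chosen for this illustration and the code is not part of SURVO66.

```python
# Means, standard deviations and correlations copied from the CORREL output
means = {"LSPEED": 3.05430, "LCOST": 3.90558, "AGE": 32.97802}
sds = {"LSPEED": 3.11669, "LCOST": 1.23418, "AGE": 14.36120}
r_yc = 0.80688     # LSPEED, LCOST
r_ya = -0.17920    # LSPEED, AGE
r_ca = 0.05398     # LCOST, AGE

# Standardized coefficients (beta) from the 2x2 correlation system
det = 1 - r_ca ** 2
beta_cost = (r_yc - r_ca * r_ya) / det
beta_age = (r_ya - r_ca * r_yc) / det

# Back to the original scales
b_cost = beta_cost * sds["LSPEED"] / sds["LCOST"]
b_age = beta_age * sds["LSPEED"] / sds["AGE"]
constant = means["LSPEED"] - b_cost * means["LCOST"] - b_age * means["AGE"]
r_squared = beta_cost * r_yc + beta_age * r_ya

print(round(b_cost, 3), round(b_age, 4), round(constant, 3), round(r_squared, 4))
```

    Up to the rounding of the printed input values, these agree with the REGRAN output (2.068079, -0.048483, -3.423891, R^2=0.7008).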
    


    Cross tabulations with SURVO66

    Example #107 (Start the demo by clicking the picture!)

    This demo in YouTube

    It is shown that SURVO66 is substantially faster than the TAB operation of SURVO MM when many cross tabulations have to be made from a large data set.
    See also Resurrection of SURVO 66



    Printing a small document

    Example #108 (Start the demo by clicking the picture!)

    This demo in YouTube

    In the window above the PRINT command on the line 29 prints lines from CUR+1=30 to E=39 to a PostScript file DOC1.PS.
    The plain text on lines 30-33 is printed by using the default font [Times(12)] and with line spacing [line_spacing(12)].
    The graphs (histograms) are included on lines 34-36.
    % 500 on line 34 allocates vertical space of 500 Points (1 Point = 0.3528 mm) for these graphs.
    The graphs are included by '- picture' commands of the form
    - picture name_of_PS_file,x,y,scale_x,scale_y
    where x and y give coordinates of the left lower corner of the graph in units of 0.1 mm. A plain * indicates just the current position on the current page.
    For example, on the line 36 x is *+850 thus indicating that the second graph should be moved 850 units to the right so that the histogram of the rainfall will be positioned correctly without overlaying the histogram of the temperature.
    scale_x and scale_y are scaling coefficients of the graph. In this case the size of both graphs will be 70 per cent of the original size in both directions.

    Thus the following document has been created:



    and it can then be printed by means provided by Adobe Acrobat (Reader).

    The supporting freeware programs Ghostscript and Acrobat Reader do not belong to SURVO MM. They must be downloaded from the net. The latest version of Ghostscript (either a 32-bit or a 64-bit version) can be loaded as a self-extracting EXE file.
    When installing it, please use default settings.
    When /GS-PDF is activated for the first time, Survo locates Ghostscript (this search for the Ghostscript program may take several seconds) and saves the location of Gswin32.exe or Gswin64.exe as a text file <Survo>\U\SYS\GSPATH.SYS

    When making PostScript documents containing text and graphics the PRINT program module of Survo uses a 'driver' PS.DEV which is a standard text file located on the path <Survo>\U\SYS\ .
    Normally this file should not be altered. The main default settings (font type, font size, line_spacing, etc.) are set on the last two lines of this file.

    The user may override any setting by inserting new definitions. For example, if we insert a new line of the form
    - [ArialB(11)][line_spacing(14)]
    next after the PRINT line in the previous example, the result will be




    I made the first version of this PostScript driver in 1987. Before that drivers had been made for various printers like Epson dot matrix printers and Canon laser printers.
    In 1997 Kimmo Vehkalahti created a driver for making HTML pages,
    and in 2004 another driver similarly for making LaTeX documents.
    All these drivers can be used for controlling the same PRINT program module of Survo.

    Chapters 8. Graphics and 9. Printing of reports in my book


    tell all essential information about making graphs and multi-page documents when using Survo.
    This and many other documents about Survo and other topics have been created by using these capabilities of Survo.


    Circle estimation

    Example #109 (Start the demo by clicking the picture!)

    This demo in YouTube

    ESTIMATE is a powerful tool for linear and nonlinear regression analysis.
    Also fairly general problems of ML estimation can be solved by this operation.
    It permits the user to enter the model in normal mathematical notation.
    Before computations ESTIMATE analyzes the model function and evaluates its symbolic derivatives up to second order with respect to parameters to be estimated. The estimation and computation procedures are then selected according to this analysis and on the basis of the user's specifications.
    For example, if all derivatives of second order vanish, ESTIMATE 'knows' that the model is linear and selects Newton's method, which leads to the solution rapidly in as many steps as there are parameters to be estimated.
    In other cases the Davidon-Fletcher-Powell method is the default one.
    The user may override these selections by a METHOD specification.

    I created this demo during the Compstat 82 Conference (Toulouse 1982).
    It is shown how the location and radius of a circle can be estimated by the ESTIMATE operation from a data set having observations which are located approximately on the circumference of a circle.
    It is also shown how a bias due to an erroneous observation can be eliminated by using L1 estimation instead of the standard least squares (L2) method.

    This example is also available as a flash demo.


    Apparently the same problem has been treated in various ways by other statisticians. One example is the paper
    Finding the circle that best fits a set of points by Luc Maisonobe (2007).
    There one numerical example of 5 observations is given and solved by the Levenberg-Marquardt method.
    Below is the corresponding least squares solution by ESTIMATE using the Davidon-Fletcher-Powell method
    and starting from initial values X0=Y0=R=0.


    DATA CIRCLE
      X    Y
     30   68
     50   -6
    110  -20
     35   15
     45   97
    
    MODEL Cmodel
    sqrt((X-X0)^2+(Y-Y0)^2)=R
    
    ESTIMATE CIRCLE,Cmodel,CUR+1
    Estimated parameters of model Cmodel:
    X0=96.0759 (1.69426)
    Y0=48.1352 (1.11286)
    R=69.9602 (1.33064)
    n=5 rss=3.126753 nf=120
    Correlations:
                    X0     Y0      R
     X0          1.000  0.611  0.892
     Y0          0.611  1.000  0.675
     R           0.892  0.675  1.000
    

    The results are the same (to 3 decimal places) as those obtained by Maisonobe.
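
    The same least squares problem can be reproduced with a short stand-alone program. The sketch below does not use the Davidon-Fletcher-Powell method of ESTIMATE: as a swapped-in technique it starts from an algebraic (linear) circle fit and refines it by Gauss-Newton steps on the residuals sqrt((X-X0)^2+(Y-Y0)^2)-R. All function names are hypothetical.

```python
import math

POINTS = [(30, 68), (50, -6), (110, -20), (35, 15), (45, 97)]

def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))) / m[r][r]
    return x

def least_squares3(rows, rhs):
    """Linear least squares for 3 parameters via the normal equations."""
    jtj = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    jtr = [sum(r[i] * y for r, y in zip(rows, rhs)) for i in range(3)]
    return solve3(jtj, jtr)

def fit_circle(points, iterations=50):
    # Algebraic start: x^2+y^2 = 2*x0*x + 2*y0*y + c with c = R^2-x0^2-y0^2
    x0, y0, c = least_squares3([[2.0 * x, 2.0 * y, 1.0] for x, y in points],
                               [x * x + y * y for x, y in points])
    r = math.sqrt(c + x0 * x0 + y0 * y0)
    for _ in range(iterations):                   # Gauss-Newton refinement
        rows, res = [], []
        for x, y in points:
            d = math.hypot(x - x0, y - y0)
            rows.append([-(x - x0) / d, -(y - y0) / d, -1.0])  # gradient of d-r
            res.append(-(d - r))
        dx, dy, dr = least_squares3(rows, res)
        x0, y0, r = x0 + dx, y0 + dy, r + dr
    rss = sum((math.hypot(x - x0, y - y0) - r) ** 2 for x, y in points)
    return x0, y0, r, rss

print([round(v, 4) for v in fit_circle(POINTS)])
```

    The printed centre, radius and residual sum of squares can be compared with the ESTIMATE output above.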

    More information and another demo about ESTIMATE

  • See also: Estimation of nonlinear regression models

    Contour ellipses on a graph paper

    Example #110 (Start the demo by clicking the picture!)

    This demo in YouTube




    The first version of this graph was given on page 40 of my document about SURVO 84*
    in 1984 and indicates that I had found the formulas

    for contour ellipses on the confidence level P of a general bivariate normal distribution

    as well as the generalized Box-Müller formulas
    for generating observations of a general bivariate normal distribution with a correlation coefficient ρ from two independent Uniform(0,1) variables

    already then and thus earlier than told in my document
    Two formulas related to two-dimensional normal distribution.

    *Originally arcsin(ρ) had to be replaced by arctan(ρ/sqrt(1-ρ*ρ)) since arctan was the only inverse trigonometric function available in SURVO 84.
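
    The replacement rests on the identity arcsin(ρ) = arctan(ρ/sqrt(1-ρ²)), valid for -1 < ρ < 1. A quick numerical check in Python (the function name is hypothetical):

```python
import math

def arcsin_via_arctan(rho):
    """arcsin(rho) computed with arctan only; valid for -1 < rho < 1."""
    return math.atan(rho / math.sqrt(1 - rho * rho))

for rho in (-0.9, -0.5, 0.0, 0.3, 0.8):
    assert math.isclose(arcsin_via_arctan(rho), math.asin(rho), abs_tol=1e-12)
```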



    Sampling from a discrete uniform distribution

    Example #111 (Start the demo by clicking the picture!)

    This demo in YouTube

    Samples from distributions related to the discrete uniform distribution are shown as dynamic histograms generated step by step. In fact, these graphs are scatter plots of data sets of two variables. The X variable is a discrete random variate and the values of the Y variable are cumulative frequencies of distinct X values.
    This setup is generated, for example, for a discrete uniform variable with values 1,2,3,4,5,6 by the MAT commands

    MAT A=ZER(n,1) / n is the sample size
    MAT #TRANSFORM A BY int(6*rand(2017)+1)
    MAT B=ZER(n,2)
    MAT B(1,1)=A
    MAT #CUMFREQ(B)
    
    where the last command computes the cumulative sums of distinct values in the first column as elements of the second column.
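
    The same data structure can be sketched in Python. The function name is hypothetical and Python's random number generator is not that of Survo, so the seed 2017 does not reproduce the same sample; only the construction (X value, running count of that value) is illustrated.

```python
import random

def sample_with_running_counts(n, outcomes=6, seed=2017):
    """Return n pairs (x, y): x is uniform on 1..outcomes and y is the
    number of times the value x has occurred so far (including this one)."""
    rng = random.Random(seed)
    counts = {}
    rows = []
    for _ in range(n):
        x = int(outcomes * rng.random() + 1)
        counts[x] = counts.get(x, 0) + 1
        rows.append((x, counts[x]))
    return rows

rows = sample_with_running_counts(20)
print(rows)
```

    Plotted as a scatter diagram in the order of observations, the points pile up into the columns of a gradually growing histogram.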

    A new sucro /U_SAMPLE is used for creating scatter plots of such 'datasets' appearing as gradually growing histograms. The plotting process is slowed down by a SLOW specification in the plot setup. Thus SLOW=50, for example, makes the GPLOT operation plot each observation 50 times.
    This slowing feature has been used in earlier demos
    A closed curve,
    Lissajous curve variation,
    Color changing
    related to plotting families of curves. This feature is now available also in scatter plots on screen.
    The syntax of the /U_SAMPLE sucro command is
    /U_SAMPLE m,s,n,seed,slow
    where
    m is the number of outcomes U in a single trial,
    s is the number of independent outcomes U1,U2,...,Us giving U=U1+U2+...+Us,
    n is the sample size,
    seed is the seed value of the random number generator,
    slow is the value of the SLOW specification.


    Merits of slow plotting

    Example #112 (Start the demo by clicking the picture!)


    This demo in YouTube
    In the 1970s it was possible to create Survo graphics by the Wang 2272 digital drum plotter connected to a Wang 2200 minicomputer.

    Then drawing of a graph like that above took several minutes and there was indeed plenty of time to watch details of the plotting process and detect possible peculiarities.

    This advantage was lost when plotters were replaced by laser printers or graphic screens. On a laser printer, the entire plotting process takes place out of sight. On the screen everything happens too quickly.

    The main target of this demo is to point out that there may be cases where it is meaningful to slow down the plotting process (by using the SLOW specification) so that the user is able to see potential interactions between observations and/or variables.

    Other examples of slow plotting in Survo are given in
    Sampling from a discrete uniform distribution.



    Tuning roots of algebraic equations by "listening"

    Example #113 (Start the demo by clicking the picture!)

    This demo in YouTube

    Recently (in March 2017) I have improved the C code for root finding in the POL R=ROOT(P) operation by implementing Laguerre's method.
    The roots are found stepwise and after each step the degree of the polynomial is decreased by dividing it by (x-r) where r is the newest root.
    When the coefficients of the polynomial are integers, I check whether the value r (and in case of a complex number, its real and imaginary part separately) can be replaced by a 'nice' rational number which solves the equation at least as accurately as r.
    This method is described in
    Rational approximations by listening
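
    The overall scheme (Laguerre iteration, deflation, and replacing a root by a 'nice' rational value when this is at least as accurate) can be sketched in Python. This is not the C code of POL: the function names are hypothetical, and Fraction.limit_denominator with an assumed bound of 1000 is used here only as a stand-in for the 'listening' algorithm.

```python
import cmath
from fractions import Fraction

def evaluate(coeffs, x):
    """Horner evaluation of p, p' and p'' (coefficients in descending order)."""
    p = dp = half_ddp = 0
    for c in coeffs:
        half_ddp = half_ddp * x + dp
        dp = dp * x + p
        p = p * x + c
    return p, dp, 2 * half_ddp

def laguerre_root(coeffs, x=0j, iterations=100):
    """One root of the polynomial by Laguerre's method."""
    n = len(coeffs) - 1
    for _ in range(iterations):
        p, dp, ddp = evaluate(coeffs, x)
        if p == 0:
            break
        g = dp / p
        h = g * g - ddp / p
        root = cmath.sqrt((n - 1) * (n * h - g * g))
        denom = max(g + root, g - root, key=abs)
        if denom == 0:
            x += 1                      # leave a stationary point
            continue
        step = n / denom
        x -= step
        if abs(step) <= 1e-15 * max(1.0, abs(x)):
            break
    return x

def deflate(coeffs, r):
    """Divide the polynomial by (x - r), dropping the remainder."""
    out = [coeffs[0]]
    for c in coeffs[1:-1]:
        out.append(c + r * out[-1])
    return out

def nice(value, max_den=1000):
    """Nearest fraction with a small denominator (assumed bound), as a float."""
    return float(Fraction(value).limit_denominator(max_den))

def find_roots(int_coeffs):
    work = [complex(c) for c in int_coeffs]
    found = []
    while len(work) > 1:
        r = laguerre_root(work)
        candidate = complex(nice(r.real), nice(r.imag))
        # keep the 'nice' value only if it solves the original equation
        # at least as accurately as r
        if abs(evaluate(int_coeffs, candidate)[0]) <= abs(evaluate(int_coeffs, r)[0]):
            r = candidate
        found.append(r)
        work = deflate(work, r)
    return found

print(find_roots([1, -6, 11, -6]))      # x^3-6x^2+11x-6 = (x-1)(x-2)(x-3)
```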

    See also
  • Operations on polynomials
  • Matrix operations


    Equation for the sum of chord lengths in a regular polygon

    Example #114 (Start the demo by clicking the picture!)

    This demo in YouTube

    Since 2013 I have been interested in certain metric properties of regular polygons. The most important result in my experimental and expository studies is a conjecture that, for each such polygon, the total length of all edges and chords is the greatest root of an algebraic equation with coefficients depending on binomial coefficients and the other roots of that equation can be represented as linear combinations of the same entities with coefficients -1,0,1.
    Furthermore, if the number of vertices of the regular polygon is a prime or a power of 2, the coefficients are -1 or 1.
    I have also found an efficient algorithm for determining those coefficients.
    These results are presented in my paper Lengths of edges and diagonals and sums of them in regular polygons as roots of algebraic equation (2013)
    and some of these conjectures have been proved in my paper with Pentti Haukkanen and Jorma Merikoski Some polynomials associated with regular polygons (2014).

    After finding the essential results, I noticed (in March 2014) that the roots can also be given as simple expressions (see page 39 in my paper) and solving of an algebraic equation is avoided. However, it is still interesting to study these expressions (roots) as simple linear combinations of chord lengths leading also to certain trigonometric identities.

    For example, equations C11*E11=R11 (divided by 2*11) lead to formulas
    
    sin(5π/11)+sin(4π/11)+sin(3π/11)+sin(2π/11)+sin(1π/11) = cot(1π/22)/2
    sin(5π/11)-sin(4π/11)+sin(3π/11)+sin(2π/11)-sin(1π/11) = cot(3π/22)/2
    sin(5π/11)-sin(4π/11)+sin(3π/11)-sin(2π/11)+sin(1π/11) = cot(5π/22)/2
    sin(5π/11)+sin(4π/11)-sin(3π/11)-sin(2π/11)-sin(1π/11) = cot(7π/22)/2
    sin(5π/11)-sin(4π/11)-sin(3π/11)+sin(2π/11)+sin(1π/11) = cot(9π/22)/2
    
    and equations C15*E15=R15 (divided by 2*15) to formulas
    
     sin(7π/15)+sin(6π/15)+sin(5π/15)+sin(4π/15)+sin(3π/15)+sin(2π/15)+sin(1π/15) = cot(1π/30)/2
                sin(6π/15)                      +sin(3π/15)                       = cot(3π/30)/2
                           sin(5π/15)                                             = cot(5π/30)/2
     sin(7π/15)-sin(6π/15)+sin(5π/15)-sin(4π/15)+sin(3π/15)-sin(2π/15)+sin(1π/15) = cot(7π/30)/2
                sin(6π/15)                      -sin(3π/15)                       = cot(9π/30)/2
    -sin(7π/15)+sin(6π/15)-sin(5π/15)+sin(4π/15)+sin(3π/15)-sin(2π/15)+sin(1π/15) = cot(11π/30)/2
    -sin(7π/15)+sin(6π/15)+sin(5π/15)-sin(4π/15)-sin(3π/15)+sin(2π/15)+sin(1π/15) = cot(13π/30)/2
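
    Identities of this kind are easy to check numerically. The following Python sketch (with hypothetical names) evaluates both sides of the five formulas for n=11, taking the sign patterns directly from the lines above.

```python
import math

# Signs of sin(5π/11), sin(4π/11), sin(3π/11), sin(2π/11), sin(1π/11)
# on the five lines of the n=11 case
SIGNS_11 = [
    (+1, +1, +1, +1, +1),
    (+1, -1, +1, +1, -1),
    (+1, -1, +1, -1, +1),
    (+1, +1, -1, -1, -1),
    (+1, -1, -1, +1, +1),
]

def both_sides(n, signs):
    """Return (left, right) pairs of the identities for odd n."""
    half = (n - 1) // 2
    sines = [math.sin(k * math.pi / n) for k in range(half, 0, -1)]
    pairs = []
    for i, row in enumerate(signs, start=1):
        left = sum(c * s for c, s in zip(row, sines))
        right = 1 / math.tan((2 * i - 1) * math.pi / (2 * n)) / 2
        pairs.append((left, right))
    return pairs

for left, right in both_sides(11, SIGNS_11):
    print(round(left, 10), round(right, 10))
```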
    
    The elements of the vector En are lengths of edges and chords
    (multiplied by n)
    e_i = 2*sin(((n+1)/2-i)π/n), i=1,2,...,(n-1)/2
    for odd n and
    e_i = 2*sin((n/2+1-i)π/n), i=1,2,...,n/2
    for even n.
    In the latter case e_1=2 is replaced by e_1=1.
    
    The elements of the vector Rn are
    r_i = n*cot((2*i-1)π/(2n)), i=1,2,...,⌊n/2⌋
    and found originally as the square roots of the roots of
    equation
    
     (n-k)/2
       Σ    (-1)^i*C(n,2*i+k)*n^(n-2*i-k)*x^i=0
      i=0
    
    where k=0 if n is even and k=1 if n is odd.
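
    As a hedged illustration (plain Python, not a Survo operation), the equation can be evaluated at x=r_i^2 to see that the squares of the r_i values are its roots. The function names poly_value and relative_residuals are assumptions made for this sketch, and the residual is scaled by the sum of the absolute values of the terms because the coefficients grow very quickly with n.

```python
import math

def poly_value(n, x):
    """Evaluate sum_{i=0}^{(n-k)/2} (-1)^i * C(n,2i+k) * n^(n-2i-k) * x^i."""
    k = n % 2  # k=1 for odd n, k=0 for even n
    return sum(
        (-1) ** i * math.comb(n, 2 * i + k) * n ** (n - 2 * i - k) * x ** i
        for i in range((n - k) // 2 + 1)
    )

def relative_residuals(n):
    """Residuals of the equation at x = r_i^2, relative to the size of its terms."""
    k = n % 2
    residuals = []
    for i in range(1, n // 2 + 1):
        r = n / math.tan((2 * i - 1) * math.pi / (2 * n))  # r_i = n*cot((2i-1)π/(2n))
        x = r * r
        scale = sum(
            math.comb(n, 2 * j + k) * n ** (n - 2 * j - k) * x ** j
            for j in range((n - k) // 2 + 1)
        )
        residuals.append(abs(poly_value(n, x)) / scale)
    return residuals

for n in (4, 5, 11, 15):
    print(n, max(relative_residuals(n)))
```

    For n=5 the equation is x^2-250x+3125=0, and the square roots of its roots are 5*cot(π/10) and 5*cot(3π/10).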
    
    The essential tools for finding the Cn matrices are the MAT #ARFIND and MAT #SPREAD commands of SURVO MM.
    MAT #ARFIND,n,A finds the Cn coefficients for 'roots' unique for n as linear combinations of chord lengths with coefficients +1,-1. The general setup related to n is saved as a matrix file A.MAT and the coefficients of the linear combinations in a matrix file Cn.MAT.

    According to my conjecture the valid coefficients related to row i of Cn (in a 'unique' case) are to be selected from
    c_ij=±sgn(cos(q_ni*pi*(2*j-k)/(2*i-1))), i,j=1,2,...,⌊n/2⌋
    where k=1 for odd n and k=2 for even n and q_ni is a positive integer. The correct value of q_ni is selected from the alternatives q_ni=1,2,...,⌊n/2⌋ so that the linear combination of chord lengths with coefficients c_ij gives the i'th 'root'. The selected q value and its sign coefficient appear as the first two columns of A.MAT.
    Then according to this conjecture the correct coefficients are found in n/4 trials on average. Without relying on this conjecture, about 2^(n-2) alternatives should be tested, which is an essentially harder task.

    The undefined rows of Cn (related to factors of n) remain filled with zeroes and the origin of these rows is revealed by the 'factor' and 'index' columns of the matrix A.
    The details of MAT #ARFIND can be found in its current C code.
    By creating C matrices for factors given in matrix A by repeated applications of MAT #ARFIND, the 'empty' rows in the original Cn matrix can be filled by the MAT #SPREAD operation.

    Table of q_ni coefficients

    By defining for positive integers n,k, n>=k

    amod(n,k) = mod(n,k)      if mod(n,k)<=⌊k/2⌋
                k - mod(n,k)  otherwise

    I have concluded experimentally that q_ni values depend on n only through m=amod(n,i) values. Then it has been possible to create a table of q values of the following form

    i/m    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21
      2    1  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
      3    2  1  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
      4    3  2  1  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
      5    4  2 *3  1  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
      6    5  3  2  4  1  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
      7    6  3  2  5  4  1  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
      8    7  4 *3  2 *5 *6  1  .  .  .  .  .  .  .  .  .  .  .  .  .  .
      9    8  4  3  2  5  7  6  1  .  .  .  .  .  .  .  .  .  .  .  .  .
     10    9  5  3  7  2  8  4  6  1  .  .  .  .  .  .  .  .  .  .  .  .
     11   10  5 *3  8  2 *6 *7  4 *9  1  .  .  .  .  .  .  .  .  .  .  .
     12   11  6  4  3  7  2  5 10  9  8  1  .  .  .  .  .  .  .  .  .  .
     13   12  6  4  3 *5  2  9 11  7*10  8  1  .  .  .  .  .  .  .  .  .
     14   13  7 *3 10  8 *6  2  5 *9  4 11*12  1  .  .  .  .  .  .  .  .
     15   14  7  5 11  3 12  2  9  8 13  4  6 10  1  .  .  .  .  .  .  .
     16   15  8  5  4  3 13 11  2 12 14  7  9  6 10  1  .  .  .  .  .  .
     17   16  8 *3  4 10 *6  7  2 *9  5*11*12 14 13*15  1  .  .  .  .  .
     18   17  9  6 13 *5  3 *7 11  2*10  8 16  4*14*15 12  1  .  .  .  .
     19   18  9  6 14 11  3  8  7  2 13  5 17 10  4 16 15 12  1  .  .  .
     20   19 10 *3  5  4 *6 14 17 *9  2 16*12*13  7*15 11  8*18  1  .  .
     21   20 10  7  5  4 17  3 18 16  2 13 12 11 19 15  9  6  8 14  1  .
     22   21 11  7 16 13 18  3  8 12 15  2  9  5 20 10  4 19  6 17 14  1

    The row i in the table is a permutation of integers 1,2,...,i-1. The numbers preceded by *'s (being the same as column numbers) extend the row i to a permutation, but cannot appear as amod() values due to common factors with 2*i-1. Let's call them dummy values.
    The column 'q' in the A matrix obtained by MAT #ARFIND,n,A gives the pertinent q_ni values and the table may be extended by using them.

    The same table extended to row 75

    In April 2017 I found an efficient algorithmic solution for calculating the table of q_ni numbers and this approach is described in
    Regular polygons: Solving riddle of q coefficients

    Earlier demo on the same subject:
    Edges and diagonals of a regular n-sided polygon



    Regular polygons: Solving riddle of q coefficients

    Example #115 (Start the demo by clicking the picture!)

    This demo in YouTube

    Already in 2013 I tried to find an algorithm for computing the table
    of the q_ni numbers appearing in the coefficients
    c_ij=±sgn(cos(q_ni*pi*(2*j-k)/(2*i-1))), i,j=1,2,...,⌊n/2⌋
    of the linear combinations needed in the previous demo.
    
    When I noticed that the row i of the table of q's is a permutation
    of integers 1,2,...,i-1, it was natural to search for a direct rule
    for selecting the right permutations, and maybe that is still
    possible. For example, it is tempting to think that the rows
    could be related to residues (mod i) of some functions of i.
    I have not found such a direct formula.
    
    It is surprising that now an extension of the table by simple
    arithmetical tricks leads to these permutations and thus we seem
    to have an 'algorithmic formula'.
    
    The q_ni values depend on n only through m=amod(n,i) values as
    explained in the previous demo.
    If the sequence of integers in the column m of the table of q's
    is denoted by q(i,m), i=1,2,..., the recursive relation
    
    q(i,m)=2*q(i-m,m)-q(i-2*m,m), i=1,2,...
    
    seems to be generally valid and the table of q's can be completed
    in the following way by using that recursion (and readily available
    permutations until i=43):
    
    
    
    
    
    It is crucial to see that the row i (i=1,2,...) starting with a specific
    permutation of numbers 1,2,...,i-1 (in red) is followed by the same
    numbers in reversed order (in green), then followed by one dummy value
    and thereafter this scheme is repeated 'forever'.
    Dummies may also appear in permutations (typically as multiples of the
    'correct' number) but it is not harmful since they cannot appear
    as q coefficients. For simplicity, dummies can be replaced by 0's.
    
    It is also essential to notice that any column can be continued upwards
    by a still simpler recursion so that q(-i,m)=-q(i+1,m), i=0,1,2,...
    
    For example, for m=4 we have
    i       ... -4 -3 -2 -1  0  1  2  3  4  5 ...
    q(i,4)  ... -1 -1 -2 -1  0  0  1  2  1  1 ...
    
    Then it is obvious that the table of q's can be generated simply
    row by row using the recursive relation.
    For example, assume that we have rows down to 5 ready with upward
    'mirror' completions for the first 5 columns:
    
     i/m    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21
    -5
    -4     -4 -2  0 -1 -1!
    -3     -3 -2 -1 -1 -2
    -2     -2 -1 -1 -2! 0
    -1     -1 -1  0 -1 -1
     0      0  0  0! 0  0
     1      0  0  0  0  0! 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
     2      1  1  0  1! 1  0  1  1  0  1  1  0  1  1  0  1  1  0  1  1  0
     3      2  1  1! 2  0  2  1  1  2  0  2  1  1  2  0  2  1  1  2  0  2
     4      3! 2! 1  1  2  3  0  3  2  1  1  2  3  0  3  2  1  1  2  3  0
     5      4! 2  0  1  1  0  2  4  0  4  2  0  1  1 15  2  4  0  4  2  0
    
    Then the first 5 elements of the next row emerge
    recursively as (!'s after numbers used in recursion)
    
     6      5  3  2  4  1
    
    giving the permutation, and the row is completed by the rule described above:
    
     6      5  3  2  4  1  1  4  2  3  5  0  5  3  2  4  1  1  4  2  3  5
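
    The row-by-row construction described above can be sketched in Python as follows. This is not the C code of MAT #QFIND but a minimal illustration under stated assumptions: dummy values are stored as 0, row 1 is empty, and row 2 (the single value 1) is taken from the table as a seed rather than computed. The names entry and build_rows are hypothetical.

```python
def entry(rows, i, m):
    """Value q(i,m) for any integer i and column m >= 1, from the rows built so far."""
    if i <= 0:
        # upward mirror rule: q(-i,m) = -q(i+1,m), i=0,1,2,...
        return -entry(rows, 1 - i, m)
    row = rows[i]                      # the first i-1 elements of row i
    pos = (m - 1) % (2 * i - 1)        # the scheme repeats with period 2*i-1
    if pos < i - 1:
        return row[pos]                # the permutation itself
    if pos < 2 * (i - 1):
        return row[2 * (i - 1) - 1 - pos]  # the same numbers in reversed order
    return 0                           # one dummy value closes the period

def build_rows(max_row):
    """Return {i: first i-1 elements of row i} for i = 1..max_row."""
    rows = {1: [], 2: [1]}             # seed rows (assumption of this sketch)
    for i in range(3, max_row + 1):
        # recursion q(i,m) = 2*q(i-m,m) - q(i-2*m,m) for the first i-1 columns
        rows[i] = [2 * entry(rows, i - m, m) - entry(rows, i - 2 * m, m)
                   for m in range(1, i)]
    return rows

rows = build_rows(22)
print(rows[6])                                      # [5, 3, 2, 4, 1]
print([entry(rows, 6, m) for m in range(1, 22)])    # row 6 completed to 21 columns
```

    Each new row needs only rows above it, because for columns m<i both i-m and the mirrored index 2*m-i+1 are smaller than i.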
    
    On the basis of these findings it was possible to create an essentially
    faster algorithm for computing the Cn matrices demonstrated here.
    
    This new algorithm is now available in SURVO MM as the MAT #QFIND(n)
    operation, and computing the table of q numbers is at least ten
    times faster than before.
    The MAT #ARFIND operation is now replaced by the MAT #QRFIND operation,
    which works like MAT #ARFIND but does the job much faster by using
    a readily computed large table of q values. By using MAT #QRFIND
    I have computed the linear combinations (with coefficients ±1)
    for the roots of the equation (presented in the preceding demo) for all
    prime numbers n less than 10000.
    At the same time I have checked that coefficients really are either
    +1 or -1 and linear combinations give the true roots.
    It has also been verified that each row i in the table of q's
    gives a permutation of numbers 1,2,...,i-1
    (when each possible 0 is replaced by the column index m)
    and each permutation is of order 2 with i-1 as its first element
    and 1 as the last one.
    
    Although the table of q's was computed only once in this experiment,
    which takes a few seconds, the entire checking process took
    about 15 hours on my current PC (a lot of matrix manipulations).
    

    These new findings are reported in
    On the roots of an algebraic equation related to regular polygons (2017).

    The first part of this demo is
    Equation for the sum of chord lengths in a regular polygon
    The calculation process is described in
    Regular polygons: Testing roots

    C code for SURVO MM operations MAT #QFIND and MAT #QRFIND


    Regular polygons: Testing roots

    Example #116 (Start the demo by clicking the picture!)

    This demo in YouTube

    Here is the sucro code for numerical checking of the equation CE=R.

    10 *
    11 *TUTSAVE C_TEST
    12 / def Wn=W1 Wdivisor=W2 Wremainder=W3 Wsquare=W4 Wt=W5 Wt2=W6
    13 *{tempo 1}{R}
    14 /
    15 *{ref set 1}
    16 + S: n={print Wn} m=({print Wn}-1)/2{R}
    17 *MAT #QRFIND {print Wn},Q5000.TXT,A{act}{R}
    18 / Testing that coefficients Cn are 1 or -1
    19 *MAT A=C{print Wn}{act}{R}
    20 *MAT TRANSFORM A BY abs(X#){act}{R}
    21 *MAT B=CON(m,m){act}{R}
    22 *MAT A=A-B{act}{R}
    23 *MAT TRANSFORM A BY abs(X#){act}{R}
    24 *MAT A=SUM(SUM(A)'){R}
    25 *MAT_A(1,1)={act}{l} {save word Wt}{R}
    26 /
    27 *MAT R=ZER(m,1){act} pi=3.141592653589793{R}
    28 *MAT E=R{act}{R}
    29 *MAT TRANSFORM R BY cot((2*I#-1)*pi/(2*n)){act}{R}
    30 *MAT TRANSFORM E BY 2*sin(((n+1)/2-I#)*pi/n){act}{R}
    31 *MAT A=MTM(C{print Wn}*E-R){act}{R}
    32 *MAT_A(1,1)={act}{l} {save word Wt2}{R}
    33 *COPY CUR+1,CUR+1 TO C_TEST.TXT{R}
    34 *{erase}{print Wn} {print Wt} {print Wt2}{home}{u}{act}
    35 / Next prime number
    36 + A: {Wn=Wn+2}{Wdivisor=1}
    37 + B: {Wdivisor=Wdivisor+2}{Wremainder=Wn%Wdivisor}
    38 - if Wremainder = 0 then goto A
    39 *{Wsquare=Wdivisor*Wdivisor}
    40 - if Wsquare < Wn then goto B
    41 *{ref jump 1}{goto S}{end}
    42 *
    

    Before using this sucro the q coefficients have to be calculated as a text file Q5000.TXT by the command MAT #QFIND(5000). Thereafter
    /C_TEST 3
    starts from number 3 and scans consecutive prime numbers, checks for each of them the structure of C and CE=R, and saves the results in a text file C_TEST.TXT in the form

       3 0 4.9303806576313e-032
       5 0 2.0954117794933e-031
       7 0 6.1629758220392e-032
      11 0 3.5745259767827e-031
      13 0 3.4266145570538e-030
      17 0 2.3611901118188e-030
      19 0 5.3810482646179e-030
      23 0 1.4687141755897e-029
      29 0 1.4881468087286e-029
    ....
    9931 0 2.5192799756962e-022
    9941 0 1.2926419087749e-022
    9949 0 6.8344417909304e-022
    9967 0 1.2351984585185e-022
    9973 0 5.3219957169430e-022
    

    until interrupted by the user. Zeros after the prime number indicate that all elements in C are either 1 or −1, and the floating point number is the sum of squares of the elements of CE−R calculated in double precision. The largest sums were obtained for the last primes, as can be expected since the number of terms and the accumulated rounding error grow with n. The sums are close enough to zero to indicate the validity of CE=R.
    So the presentation of roots as linear combinations of edge and chord lengths was confirmed by this sucro for all primes less than 10000.
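
    The prime-stepping part of the sucro (labels A and B) is plain trial division by odd divisors. A rough Python equivalent is given below as a sketch; the function name next_prime is an assumption of this example, and it expects an odd starting value of at least 3, as in /C_TEST 3.

```python
def next_prime(n):
    """Smallest prime greater than n, for odd n >= 3."""
    while True:
        n += 2                          # label A: next odd candidate
        divisor = 1
        while True:
            divisor += 2                # label B: odd divisors 3, 5, 7, ...
            if n % divisor == 0:
                break                   # candidate is composite, back to A
            if divisor * divisor >= n:
                return n                # no divisor up to sqrt(n): prime

# Primes scanned by the sucro when started from 3 and stopped below 10000
primes = [3]
while True:
    p = next_prime(primes[-1])
    if p > 10000:
        break
    primes.append(p)
print(primes[:9], primes[-1])
```

    The first values agree with the first column of C_TEST.TXT (3, 5, 7, 11, ...) and the last one with its final row (9973).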

    Although the table of q's was computed only once in this experiment (by the MAT #QFIND(5000) command giving Q5000.TXT), which takes a few seconds, the entire checking process took about 15 hours on my current PC (a lot of matrix manipulations).

    I have also checked that the linear combinations of chord lengths with coefficients 1 or −1 are unique for primes ≤79. For this task a Survo operation CTEST has been made. For example, in case of n=79, 2^((n−1)/2−1)=274'877'906'944 (positive) combinations had to be tested and it took about 100 hours on my PC.
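
    For small n the uniqueness can be illustrated by brute force in Python. This is a sketch only and not the CTEST operation: it uses E=2*sin(...) and R=cot(...) as in the sucro above, tries all 2^m sign vectors for m=(n−1)/2 instead of only the positive combinations, and the name matching_sign_vectors and the tolerance 1e-9 are assumptions of this example.

```python
import itertools
import math

def matching_sign_vectors(n, tol=1e-9):
    """For odd n, list for each root i the ±1 vectors c with c.E = r_i."""
    m = (n - 1) // 2
    # chord lengths 2*sin(mπ/n), ..., 2*sin(π/n)
    chords = [2 * math.sin((m - j) * math.pi / n) for j in range(m)]
    # roots cot((2i-1)π/(2n)), i=1,...,m
    roots = [1 / math.tan((2 * i - 1) * math.pi / (2 * n)) for i in range(1, m + 1)]
    matches = {i: [] for i in range(1, m + 1)}
    for signs in itertools.product((1, -1), repeat=m):
        total = sum(c * e for c, e in zip(signs, chords))
        for i, r in enumerate(roots, start=1):
            if abs(total - r) < tol:
                matches[i].append(signs)
    return matches

matches_11 = matching_sign_vectors(11)
for i, found in matches_11.items():
    print(i, found)
```

    For n=11 each root is matched by exactly one sign vector, and these vectors are the rows of C11 seen in the earlier demo. The number of vectors doubles with every step in m, which is why n=79 already requires about 2.7*10^11 combinations.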

    The newest freeware version of SURVO MM can be downloaded from here.
    It includes all functions related to the current topic.

    Earlier demos on the same subject:
    Regular polygons: Solving riddle of q coefficients
    Equation for the sum of chord lengths in a regular polygon


    C code for SURVO MM operations MAT #QFIND and MAT #QRFIND
  • See also: Matrix operations in Survo

    Copyright © Survo Systems 2001-2017. All rights reserved.
    Updated 2017-06-02 by webmaster'at'survo.fi.