Solution of Survo puzzles by using the products of marginal sums
It has been seen that in practice Survo puzzles (without a real solver program) are treated in surprisingly different means. Most often the solution is found by pure logical reasoning but sometimes computational support offered by Survo (COMB operation, for example) may be helpful. The standard procedure is to fill empty cells in the table gradually so that in each stage (when reasoning is optimal) the selected number is the only possible one. However, many people find the solution at least partially by trial and error. Here a different approach is introduced. It is based on some previously presented ideas and it transforms the task into a form resembling tackling of typical chess problems. By this 'swapping method' many hard Survo puzzles may be solved in a few steps very easily, but on the other hand there are many puzzles where the method is inefficient. Another weakness is that by this method the uniqueness of the solution cannot be proved. The swapping method is related to the main idea of my original solver program where the table (originally filled by numbers in random order) is improved by swapping numbers step by step so that at the end the computed sums become identical to the given ones. and to the common observation that small numbers are located in crossings of small sums and large numbers in crossings of large ones or more precisely to remarks made e.g. by Joonas Kauppinen and Markku Verkasalo that the products of the marginal sums crudely indicate the positions of the correct numbers in the final solution. Usage of those products in this way has a natural statistical basis. If the table of the Survo puzzle is interpreted as a contingency table computed for two independent variables, the expected frequency in any cell is the product of marginal sums divided by the sample size. Thus those products as such are proportional to expected frequencies. By combining the principles given above, many of the hard (especially open) Survo puzzles can be solved easily according to the style of "Mate-in-N-move" problems in chess. Let us see how it happens in an open 4x4 problem I presented and solved in my expository paper (on pages 17-21) * * * * 14 * * * * 28 * * * * 44 * * * * 50 16 33 40 47 The table of the products of the marginal sums is then 224 462 560 658 14 448 924 1120 1316 28 704 1452 1760 2068 44 800 1650 2000 2350 50 16 33 40 47 and when the original table is filled by numbers 1-16 according to sizes of these products, the following preliminary solution is obtained: 1 3 4 5 14 2 8 9 10 28 6 11 13 15 44 7 12 14 16 50 16 33 40 47 It is not a correct solution. For example, the sum in the first row is 1+3+4+5=13 and not 14, but all the sums are at least close to the true ones as one can see in the following table. Sum OK D 1 3 4 5 13 14 -1 2 8 9 10 29 28 1 6 11 13 15 45 44 1 7 12 14 16 49 50 -1 Sum 16 34 40 46 OK 16 33 40 47 D 0 1 0 -1 W=6 Here row/column 'Sum' tell the computed sums, 'OK' the correct sums, and 'D' their differences (errors) D=Sum-OK. The sum of absolute values of errors is W=6. The task thereafter is trying to reduce the error W by swapping two numbers at a time. A swap (5,6) seems now to be good since it leads to a better setup Sum OK D 1 3 4 6 14 14 0 2 8 9 10 29 28 1 5 11 13 15 44 44 0 7 12 14 16 49 50 -1 Sum 15 34 40 47 OK 16 33 40 47 D -1 1 0 0 W=4 having a smaller total error W=4. Then it is really easy to see that a swap (7,8) leads to the correct solution Sum OK D 1 3 4 6 14 14 0 2 7 9 10 28 28 0 5 11 13 15 44 44 0 8 12 14 16 50 50 0 Sum 16 33 40 47 OK 16 33 40 47 D 0 0 0 0 W=0 Thus the problem is solved by two steps and the solution seems to be much shorter than my original solution. However, in the original solution also the uniqueness was proved. In most problems, more swaps are needed but often the solution can still be found without difficulties. For example, a harder Survo puzzle presented in the same paper (on page 24) A B C D 1 * * * * 51 2 * * * * 36 3 * * * * 32 4 * * * * 17 51 42 26 17 is solvable by the swapping method in five simple steps. The solution can be seen as a Flash demo. In order to find out how the swapping method works generally in all 5327 uniquely solvable, open 4x4 Survo puzzles, I have computed (by a special program) the shortest solution for them each. The numbers of swaps are distributed according to the table Swaps 0 1 2 3 4 5 6 7 8 9 frequency 85 262 522 977 1271 1222 696 255 36 1 cumulative 85 347 869 1846 3117 4339 5035 5290 5326 5327 % 1.6 6.5 16.3 34.7 58.5 81.5 94.5 99.3 100.0 The products of sums give the right solution immediately in 85 cases and 81.5 per cent of these puzzles can be solved by less than 6 swaps. I have tried to solve rather many puzzles of this type by the swapping method manually. For creating the original setup defined by the products of sums and for keeping record of swaps and their influence in the setup, a special sucro /SP_SWAP has been made. According to my experience, the solution is found rather easily at least when the number of swaps needed does not exceed 5. However, when starting to solve a problem, one cannot know in advance the number of swaps needed. Then the original W value predicts roughly this number. W varies between 0 and 22 (in these 4x4 cases) and it is related to the number of swaps as follows: W swaps<6 Frequencies of W values 0-4 100.0 % 467 6 99.5 % 417 8 99.0 % 609 10 93.6 % 879 12 85.8 % 896 14 70.6 % 929 16 56.6 % 681 18 49.6 % 312 20 46.8 % 126 22 54.5 % 11 all 81.5 % 5327 Thus, if the original sum of errors is 10 or less, this kind of puzzle is solvable by less than 6 swaps in 93.6 per cent of cases. The mean and the standard deviation of the number of swaps are about 4 and 1.6, respectively. If the swapping procedure is started from a randomly initiated table, the corresponding statistics are 12.6 and 1.3 which numbers are calculated here. So the mean is much higher. The success of the swapping method (at least in many 4x4 cases) is largely based also on the fact that if chi-square values are computed for all these 5327 tables, they are located in the interval (0.25,8.25). So even the greatest of them is smaller than the degrees of freedom 9 which is the expected value for independent variables. Thus, in these tables "the independence of variables" is exaggerated and therefore the products of the marginal sums predict well the correct solution. It is typical when using the swapping method that in each step the error sum W decreases and then a smart solver sees the best choices immediately. Sometimes, however, the situation is more complicated. For example, the starting value of W may be so small that this strategy will not work. Then one must make "sacrifices" (as in chess problems) and allow swaps where W grows temporarily. In such cases also other kind of logical reasoning must be empoyed as in this example. * * * For simplifying usage of the swapping technique, I have made a sucro /SP_SWAP which at first initiates the table according to the products of sums and lets thereafter the user repeatedly to swap numbers by using the mouse and X and Y keys. After each swap the sucro updates the setup and displays a list of swaps used so far. It is also possible to solve non-open Survo puzzles having some fixed numbers in the table. In these cases the initialization program (a small Survo module SPGUESS) used by /SP-SWAP employs a generalized products of sums technique. In these cases the numbers given readily in the table are displayed with a blue background and then the player knows to avoid swapping those "fixed points". /SP_SWAP requires the complete SURVO MM version. A special Java applet is also available for solving randomly created Survo puzzles by the swapping technique. * * * When using /SP_SWAP the Survo puzzle must be saved as a Survo matrix file. For example, the problem * * * * 14 * * * * 28 * * * * 44 * * * * 50 16 33 40 47 is saved in the form MAT SAVE AS A 0 0 0 0 14 0 0 0 0 28 0 0 0 0 44 0 0 0 0 50 16 33 40 47 0 and the sucro is activated by the command /SP_SWAP A Here are some Survo puzzles which can be solved by the swapping method. Problem S0: (no swaps needed) MAT SAVE AS A 0 0 0 0 15 0 0 0 0 32 0 0 0 0 41 0 0 0 0 48 14 33 38 51 0 Problem S1: (three swaps needed) MAT SAVE AS A 0 0 0 0 48 0 0 0 0 43 0 0 0 0 34 0 0 0 0 11 47 41 30 18 0 Problem S2: (four swaps needed) MAT SAVE AS A 0 0 0 0 55 0 0 0 0 44 0 0 0 0 23 0 0 0 0 14 45 39 29 23 0 Problem S3: MAT SAVE AS A 0 0 0 0 0 27 0 0 0 0 0 37 0 0 0 0 0 56 9 17 24 31 39 0 Problem S4: MAT SAVE AS A 0 0 0 0 57 0 0 0 0 41 0 0 0 0 27 0 0 0 0 11 44 36 31 25 0 Problem S5: MAT SAVE AS A 0 0 0 0 56 0 0 0 0 38 0 0 0 0 31 0 0 0 0 11 45 37 32 22 0 Distribution of swaps for a random permutation It is not reasonable to find this distribution by computing the number of swaps needed separately for each permutation since in the case of 4x4 Survo puzzles there are fact(16)=20922789888000 different permutations. Although one could calculate a million values in a second, the entire process would last about 240 days. It is now shown how this distribution can be found exactly and very quickly by using the matrix and polynomial interpreter of Survo. It is a good mathematical excercise to explain, why these calculations give the correct result. Let's study the product (x-1)*(x-2)*(x-3)*...*(x-15) and expand it to a polynomial s0+s1*x+s2*x^2+...+s15*x^15. Then the coefficients s0,s1,s2,s3,...,s15 are Stirling numbers of the first kind and their absolute values divided by 16! give the proba- bilities of the sought distribution in the reverse order. The essential calculations are done by Survo as follows. The coefficients of the binomials of in the product are saved as vectors (it would be easy to make a short sucro for the general case): MAT SAVE AS P -1 1 MAT SAVE AS P2 -2 1 MAT SAVE AS P3 -3 1 the same for values 4-14 MAT SAVE AS P15 -15 1 The product of binomials: POL P=P*P2 POL P=P*P3 POL P=P*P4 POL P=P*P5 POL P=P*P6 POL P=P*P7 POL P=P*P8 POL P=P*P9 POL P=P*P10 POL P=P*P11 POL P=P*P12 POL P=P*P13 POL P=P*P14 POL P=P*P15 MAT TRANSFORM P BY abs(X#) Now the vector P gives the absolute values of coefficients s0,s1,...,s15. These values are listed by MAT LOAD, their sum (equal to 16!) is checked by the touch mode computation of Survo and by dividing by this factorial we get the probabilities MAT LOAD P,12345678901245,CUR+1 MATRIX P Polynom /// real probability number ow swaps 0 1307674368000 0.06250000000000 15 1 4339163001600 0.20738931207681 14 2 6165817614720 0.29469385525189 13 3 5056995703824 0.24169796336407 12 4 2706813345600 0.12937153028299 11 5 1009672107080 0.04825704948933 10 6 272803210680 0.01303856761648 9 7 54631129553 0.00261108245341 8 8 8207628000 0.00039228171979 7 9 928095740 0.00004435812552 6 10 78558480 0.00000375468474 5 11 4899622 0.00000023417632 4 12 218400 0.00000001043838 3 13 6580 0.00000000031449 2 14 120 0.00000000000574 1 15 1 0.00000000000005 0 20922789888000 Again by using the touch mode it is easy to compute the mean and the standard deviation m=12.619271006771 (mean) S2=161.04238320212 (weighted sum of squares) s=sqrt(S2-m^2) s=1.3402919308079 (std.dev.) A problem requiring a sacrifice * * * * 49 MD=1700 * * * * 39 * * * * 33 * * * * 15 49 43 29 15 The initial setup of the swapping method is C1 C2 C3 C4 Sum OK D R1 16 15 11 7 49 49 0 R2 14 13 9 4 40 39 1 R3 12 10 8 3 33 33 0 R4 6 5 2 1 14 15 -1 Sum 48 43 30 15 OK 49 43 29 15 W=4 D -1 0 1 0 W is only 4, but the puzzle cannot be solved directly by one or two steps. Since both minimal sums are equal to 15, it is worthwhile to study their possible partitions. COMB P,CUR+1 / P=PARTITIONS,15,4 DISTINCT=1 Partitions 4 of 15: N[P]=6 1 2 3 9 1 2 4 8 1 2 5 7 1 3 4 7 1 3 5 6 2 3 4 6 Then the only permitted combinations of these partitions are 8 4 2 1 - 6 5 3 1 (1 as the common element) 7 5 1 2 - 6 4 3 2 (2 as the common element) and the first pair is closer to the initial setup. Therefore it is reasonable to start by swaps (2,3) and (7,8) and getting correct values for the sums of last row and column, but W grows to 6 (sacrifice!) C1 C2 C3 C4 Sum OK D R1 16 15 11 8 50 49 1 R2 14 13 9 4 40 39 1 R3 12 10 7 2 31 33 -2 R4 6 5 3 1 15 15 0 Sum 48 43 30 15 OK 49 43 29 15 W=6 D -1 0 1 0 Now it is easy to detect that swaps (12,13) and (10,11) lead directly to the solution C1 C2 C3 C4 Sum OK D R1 16 15 10 8 49 49 0 R2 14 12 9 4 39 39 0 R3 13 11 7 2 33 33 0 R4 6 5 3 1 15 15 0 Sum 49 43 29 15 OK 49 43 29 15 W=0 D 0 0 0 0 and the problem has been solved by 4 swaps.