Suppose you are a beginner on the piano. You hear a piece of music that catches your fancy and borrow the score from the library. If the music is beyond your capabilities, you’ll know soon enough: if not right away on turning the cover page, then before you get to the end of the first few bars.
In programming, it’s different. There you can be busy for days or weeks, even months, learning the ins and outs of, say, Java and its libraries. Flushed with the illusion of power that all this knowledge brings, you come across something that catches your fancy, say, write a program that solves Sudoku puzzles. Wouldn’t that be neat! Now that you know so much about this powerful and sophisticated programming language, what’s going to stop you? Even after weeks of thrashing around, you may not know what hit you.
Knowing the language is not enough
This was the state of a friend I wanted to help. How could I get the fact across that there is something to learn in programming beyond the language? Even independently of the language? In fact, I was trying to solve
“Programming = Language + X“.
I scanned the books on my shelves, mainly the detritus of many visits by publishers’ reps. I ended up not giving him any of these, because, every time I reached out, I realized that this was another book where the learning of some language was misleadingly intertwined with the X.
What are the other things you need to know? As a rule, programmers don’t know, just as a pianist has no idea of all the things she’s doing right. For something to change in this matter, something unusual needs to happen. In this case, the unusual thing was a great programmer accepting a job as a professor, being confronted with large classes of novices, and trying to teach them to program. One of the times this happened was in the 1960s and the professor was E.W. Dijkstra.
Dijkstra was not only an unusually effective programmer, but was unusual in trying to make explicit the problem-solving steps that he tended to race through unconsciously. He took some problems that were simple to explain, yet hard enough for his students to founder upon. For each of these he wrote down a complete route from problem statement to solution, trying not to skip a single step. A result of this exercise is “Notes on Structured Programming”.
An example picked by Dijkstra
As soon as I remembered that it was from this paper that I learned a nifty algorithm for generating prime numbers, I knew what my friend needed. So, out from under the gifts from the publishers’ reps, I pulled the book “Structured Programming” by O.-J. Dahl, E.W. Dijkstra, and C.A.R. Hoare; Academic Press 1972. This contains Dijkstra’s paper. It first came out as a report, which I obtained by submitting to Google the query dijkstra "eight queens".
I found to my surprise that Dijkstra’s paper also contains a treatment of the Eight-Queens problem. This is a simple search problem, simpler than the Sudoku-solving that my friend was struggling with. OK, I know: the Eighty-Queens problem is not a simple search problem, but here we’re only trying to run Eight-Queens problem on a modern computer.
So I e-mailed him the following problem statement taken verbatim from Dijkstra’s paper
“It is requested to make a program generating all configurations of eight queens on a chessboard of 8 by 8 squares such that no queen can take any of the others. This means that in the configurations sought, no two queens may be on the same row, on the same column, or on the same diagonal.”
I suggested that he try this instead of Sudoku. My plan was to give him a copy of Dijkstra’s story. Whatever degree of success my friend would have had, his attempt should have softened him up for Dijkstra’s description of a problem-solving process.
As it happened, I did not hear from my friend again. Still my interest was piqued: Dijkstra was great on generating prime numbers, so what would he be like with Eight Queens? I only knew it as a program beloved by Prolog enthusiasts because it is supposed to demonstrate the power of built-in backtracking.
On the bus back from work I started reading Dijkstra’s explanation. My mind kept wandering off. After a few stops I closed the book, and started to think. The bus is a good place to think; walking home afterwards is even better.
I have yet to get back to Dijkstra’s explanation. In the meantime I think I have found a better one. Dijkstra was working with Algol 60, so did not have the help of built-in backtracking. My choice of programming language was C, so I was in the same situation. For me the process was a revelation, because the problem seemed to solve itself; at no point was I conscious of “coding backtracking”. Anyway, backtracking is only a vague concept for me. I would have to work hard to even give a reasonably precise description of the process. But the step-by-step problem solving process that I followed leads to a simple program that does the job. Maybe it does backtracking; I don’t know.
But of course it didn’t matter that I could solve the problem. The issue here is to explain the mental operations involved to someone in whom these operations are not triggered automatically when confronted with the problem. Dijkstra followed a do-it-yourself approach to describing the operations, and so did I. Only when I was finished did something from the distant past come back.
Pólya on problem-solving
It turned out I had not only read, but still owned, a copy of the book “How To Solve It” by George Pólya, published in 1948.
Pólya was a mathematician and most of his examples in the book concern mathematical problems. But he also includes recreational puzzles and, indeed, the subtitle of the book is: “A System Of Thinking Which Can Help You Solve Any Problem”. This makes one wonder whether the system can also help solve the problem of writing a program for the Eight-Queens problem.
I’ll repeat here the very short outline of the book, so short that you’ll find it on the inside front cover.
- Understanding the problem
- What is the unknown? What are the data? What is the condition?
- Is it possible to satisfy the condition? Is the condition sufficient to determine the unknown? Or is it insufficient? Or redundant? Or contradictory?
- Draw a figure. Introduce suitable notation.
- Separate the various parts of the condition. Can you write them down?
- What is the unknown? What are the data? What is the condition?
- Devising a plan
- Have you seen it before? Or have you seen the same problem in a slightly different form?
- Do you know a related problem?
Do you know a theorem that could be useful? - Look at the unknown!
And try to think of a familiar problem having the same or a similar unknown. - Here is a problem related to yours and solved before. Could you use it?
Could you use its result? Could you use its method? Should you introduce some auxiliary element in order to make its use possible? - Could you restate the problem? Could you restate it still differently? Go back to definitions.
- If you cannot solve the proposed problem, try to solve first some related problem. Could you imagine a more accessible related problem? A more general problem? A more special problem? An analogous problem? Could you solve a part of the problem? Keep only part of the condition, drop the other; how far is the unknown determined, how can it vary? Could you derive something useful from the data? Could you think of other data appropriate to determine the unknown? Could you change the unknown or the data, or both if necessary, so the new unknown and the new data are nearer to each other?
- Did you use all the data? Did you use the whole condition? Have you taken into account all essential notions involved in the problem?
- Carrying out the plan Carrying out your plan of the solution check each step. Can you see clearly that the step is correct? Can you prove it is correct?
- Looking back
- Can you check the result? Can you check the argument?
- Can you derive the result differently? Can you see it at a glance?
- Can you use the result, or the method, for some other problem?
This is the complete text of Pólya’s outline. I was tempted to skip a lot, as it seemed to be slanted towards solving mathematical problems rather than the problem at hand. But it was good that I persevered, because more turned out to be relevant than seemed to be the case at first sight.
Object and Meta problem
An interesting feature of the problem at hand is that it is easy to confuse solving the Eight-Queens problem itself (the object problem) with the meta problem of writing a program that solves the Eight-Queens problem. One might think that we need not be concerned with methods for solving the object problem. But going over Pólya’s checklist shows that it gives hints about how to solve the object problem that help with the meta problem.
Using the checklist for the object problem
Item 1.1
The data are the board as specified by the rules of chess, by the rule for attack by a queen, and by the fact that there are eight queens on the board. The unknown is a set of eight squares for putting queens on. The condition is that, with a queen on each of these squares, no queen attacks any other.
Item 1.2
It’s funny that ours is apparently not a typical problem: these questions can only be answered by trying to solve the object problem, which we expect to leave to the program to be constructed. We happen to know that
- it is possible to satisfy the condition,
- the condition is not sufficient (there are 92 solutions)
- the conditions are not redundant
- the conditions are not contradictory (there is at least one solution).
Item 1.3
Not applicable here: a chessboard with eight tokens is even better than a figure.
Item 1.4
The condition that a queen attacks a piece can be separated as follows: the two have to be (1) in the same row, or (ii) in the same column, or (iii) on the same diagonal. The further condition that there cannot be any intervening piece is not relevant for this problem, as that intervening piece can only be a queen, which is then itself exposed to attack.
Item 2.5
Checklist item 2.5 is helpful. If a human is solving the object problem, then the existing formulation seems satisfactory. However, a program does not solve the object problem by handling chunks of plastic on a piece of cardboard. So restating the problem should be useful.
We can even restate the problem in such a way that the resulting problem is easier to solve by a computer program. It is fortunate that we came across 1.4 first, and separated the condition into three parts. The first part allows us to simplify the representation of a queen: as there can only be one queen in each row, and there are just enough rows for the queens, there has to be exactly one queen in each row. That suggests identifying each queen by the row in which it resides. For each row we only have to specify the column of the queen of that row. In this way the constraint of not being in the same row is automatically satisfied.
How would we now represent a solution? It is a sequence s of eight numbers, where s[i] is the column number of the queen in the i-th row. Two squares s[i] and s[j] with i not equal to j are on the same diagonal if and only if the absolute value of s[i]-s[j] is equal to the absolute value of i-j. So the figure of checklist item 1.3 comes in handy after all.
Even though we do not have to solve the object problem ourselves, it makes sense to know more about it and look for other hints that might be helpful. Item 2.6 suggests we solve part of the problem, perhaps in connection with dropping part of the condition.
The part to be dropped could be that we have to start solving the problem from scratch. We could consider the problem of completing a partial solution. where non-attacking queens have already been placed in the rows 0,…,i-1. That would leave us with the more promising problem of only placing the queen in row i in such a way that it is safe from the queens in rows 0,…,i-1. If this is not possible, then we have to look for another way of placing the first i queens. If it is possible, then we can continue completing the partial solution from a more advantageous starting point.
Using the checklist for the meta problem
Here we run into a difficulty right away with item 1.1: Understanding the Problem. The most interesting part of Pólya’s work is his claim that the methods apply to solving problems in all areas, not just in mathematics. But even within mathematics, he has to go out of his way to argue that the methods apply both to proving and to constructing. Our meta problem is clearly one of constructing: namely, we have to construct a computer program.
But, what can the data possibly be? Do data ever make sense with a construction problem? Yes, consider a typical one, where one is to construct the perpendicular through a point P on a line l. Here the P and l are the data, the unknown is a line p, and the condition is that p goes through P and is perpendicular to l.
In the current problem, the unknown is a program. The condition is that it prints the eight column numbers that represent a solution to the Eight-Queens problem. It doesn’t seem necessary to identify data. Just read on and see how the program gets written with help of some of the other hints by Pólya. It is OK to pick and choose from his list: it is a checklist, not an algorithm. Item 2.6 is promising, as it supposes we cannot solve the problem, and suggests we solve some other problem, an easier one, of course. That other problem had better be related, otherwise the solving would be an exercise in futility. But related in what way?
We noted that the Eight Queens problem can be solved by completing a partial solution that places mutually non-attacking queens in rows s[0],…,s[i-1]. For such a completion to be possible, two conditions need to be satisfied:
- It is possible to find a suitable position for the queen in row i.
- It is possible to complete the resulting partial solution s[0],…, s[i]. Of course, when i is equal to 7, then there is no doubt that completion is possible. Writing the these conditions more formally, we get: To complete the problem of placing n queens starting from a partial solution of i queens, we check whether i equals n, in which case we only need to print the solution. In case i is less than n, we try in turn all of 0,…,n-1 for a column number j in which the i-th queen is not attacked by any of the queens placed in rows 0,…,i-1 according to the existing partial solution. This is specific enough to translate to C. The program can take the form of a function with the following definition.
void complete(int n, int s[], int i) { // assume s[0],...,s[i-1] is a partial solution // assume i <= n if (i == n) { // partial solution is solution print(n,s); // print first n elements of array s return; } // i < n for (int j = 0; j < n; j++) { // try all columns j in i-th row if (nonattack(j, s, i)) { s[i] = j; complete(n, s, i+1); } } }
Here the function nonattack can be defined as:
int nonattack(int j, int s[], int i) { for (int k = 0; k < i; k++) if (j == s[k] || abs(j-s[k]) == i-k) return 0; return 1; }
All we have done here is to state the precisely the condition under which it is possible to complete a partial solution s[0],…,s[i-1]. How do we get a partial solution? We already have one if i equals 0. That suggests, if we don’t want to do any further work, to call this function with complete(8, s, 0). As result of this call, all 92 solutions are printed.
Apparently, by stating precisely in C whether it is possible to complete a solution, we have created a program that produces the solutions, if any.
This completes Pólya’s step 2.6, part of Devising a Plan. It seems we overshot the goal of a plan and instead ended up with a solution to our meta problem: writing a program that solves the Eight Queens problem. Accordingly, Pólya’s step 3, Carrying Out the Plan, does not apply. Step 4, Looking Back does apply.
Can we check the result? To a certain extent we can: run the program. This can expose blunders, but it cannot prove correctness. However, proving the program correct is nothing but checking the correctness of our conditions for being able to complete a partial solution, and to check the correctness of their transcription to C.
As for 4.3, there are many problems of similar type, all classified as search problems, such as the Knight’s Tour, the Fifteen Puzzle, Sudoku, Instant Insanity, and Rubik’s Cube. These are all microcosms that have features in common with industrial scheduling problems.
Concluding remarks
In the execution of the derived program it happens many times that the first i queens attack every square in the i-th row. In such a situation the current partial solution has to be discarded and the previous one re-instated. In general the first few column positions in the row just beyond that previous partial solution have already been tried (leading to a previous solution or not). Somewhere it needs to be recorded what the next column position is that needs to be tried.
Such a return to something previously tried is called “backtracking”. The program we found somehow does all these things. Yet in the derivation of it, nothing was further from my mind than the hoary issue of backtracking. Somehow the logic of the condition under which a partial solution can be completed took care of this.
Such magic is exactly what is claimed as the unique advantage of the logic programming language Prolog: one only needs to be concerned with the declarative aspects of the problem. Because the execution mechanism of Prolog is equivalent to a theorem-prover, the programmer is largely shielded from the operational aspects. The miraculous ease with which the Eight-Queens problem can be programmed in Prolog is usually attributed to the fact that the Prolog execution mechanism has backtracking built into it.
This would suggest that the Prolog version be shorter than our solution in C. Is this in fact the case? Let us compare it with the first Prolog version that comes to hand for many people: the one in the well-known text by Sterling and Shapiro (“The Art of Prolog”, 2nd edition, MIT Press, 1994):
queens(N,Qs) if range(1,N,Ns), queens(Ns,[],Qs). queens(UnplacedQs,SafeQs,Qs) if select(Q,UnplacedQs,UnplacedQs1), not attack(Q,SafeQs), queens(UnplacedQs1,[Q|SafeQs],Qs). queens([],Qs,Qs). range(M,N,[M|Ns]) if M lt N, M1 is M+1, range(M1,N,Ns). range(N,N,[N]). select(X,[X|Xs],Xs). select(X,[Y|Ys],[Y|Zs] if select(X,Ys,Zs). attack(X,Xs) if attack(X,1,Xs). attack(X,N,[Y|Ys]) if X is Y+N ; X is Y-N. attack(X,N,[Y|Ys]) if N1 is N+1, attack(X,N1,Ys).
This is hardly shorter, even though Prolog has the advantage of built-in backtracking.
When I was restating the Eight-Queens problem, you may have thought: “It’s a permutation, stupid!” Indeed I narrowly missed restating the problem as that of finding a permutation that satisfies the diagonal constraint. This leads to first writing a program that generates all permutations and then modifying it so that the permutation being built up is aborted as soon as it violates the diagonal constraint.
I think I am lucky I missed this, because the solution to the Eight-Queens problem can hardly have been easier than it was here. And generating all permutations is considered a worthy programming exercise in its own right by programming gurus, Dijkstra included. Indeed, if I would have to generate the permutations, I would follow the same line of reasoning, mutatis mutandi, and change the line
if (j == s[k] || abs(j-s[k]) == i-k) return false;
to
if (j == s[k]) return false;
to get a perfectly presentable permutations program, even though “nonattack” becomes a rather mystifying name for that function.
To summarize, I started by observing that beginners are likely to fall into the trap of believing that, during the considerable time they spend learning a “powerful” programming language, they also learn programming. Dijkstra illustrated some of what is still missing by means of a step-by-step development of a program for the Eight-Queens problem. By giving the name “general problem-solving” to the missing component, I was led to Póya’s book “How to Solve It”, which claims that its method applied to all problems. By trying Pólya’s precepts on a programming problem, and meeting with success, I feel I have substantiated this claim.
I conclude that
“Programming = Language + X”
is solved by X = Heuristic.
February 23, 2010 at 5:28 am |
One can also say that programming is using a generic language and creating a domain-specific part on top of it:
Programming = Language + Domain-Specific Adaptation
Or one can say that programming is putting in features and bugs, and removing some of the bugs afterward:
Programming = Putting a lot of bugs in + Removing some bugs afterward
Or one can say that programming is simply coding and testing:
Programming = Coding + Testing