History of “Structured Programming”

At its introduction around 1970, structured programming was controversial. It was soon universally accepted. In this essay I trace the history of structured programming and argue that it is worthwhile to re-open the controversy.

 

History of the goto statement

From their beginnings stored-program computers worked by going through the fetch-execute cycle. The cycle begins with fetching the content of the memory location of which the address is found in the CPU’s instruction counter. By default this counter is incremented so that the next cycle executes the content of the next memory location. It is essential that variations on this default are possible. Such variations are provided by instructions that write to the instruction counter, so that execution can continue at any memory location of the programmer’s choice.

In the first few years of stored-program computers, before the first high-level language, all programs were written in machine code or assembler code. Such code made it natural to sequence instructions by writing to the instruction counter. In high-level languages (several before the first Fortran, see [1]) the only memory locations that were legal to transfer to were the ones at which the compiler had placed the beginnings of statements. By placing a label L in the code, this memory location was made accessible to the programmer; by writing the statement goto L the programmer achieved the effect of writing to the instruction counter.

The availability of the goto statement made, for example, Fortran a curious mixture of high- and low-level language. This was even more the case with subsequent high-level languages. These aspired to a higher level by allowing a more abstract approach to program design, yet retained the goto statement. The first that I know of who was bothered by this dissonance was Heinz Zemanek who “expressed doubts about goto statements as early as 1959” [1a, page 5]. D.V. Schorre systematically avoided labels and goto statements in 1964. Around the same time Naur published remarks critical of the goto statement and Forsythe was purging them from submissions to the algorithms section of the Comm. ACM [1a, page 5].

Compared to what came later these critical remarks were mere murmurs. In 1968 the Communications of the ACM published a Letter to the Editor with title “Goto statement considered harmful” [2] in which E.W. Dijkstra not only elaborated on the title, but proposed that programmers abstain from its use and that it be left out of future programming languages. Dijkstra argued that code should be written instead with conditional, alternative, and repetitive clauses. This publication started a lively controversy. Knuth writes [1a] of Dijkstra telling him that he received a torrent of abusive letters.

1968 was also the year of the first of two NATO Conferences on “Software Engineering”. The programming pundits convened here voiced their alarm over the number of software projects that were delivered far over deadline, were far more costly than was anticipated, or were cancelled outright. It may be here that the term “software crisis” was born. Dijkstra [3] reports that the conference gave him the courage to publish the Letter to the Editor.

Though goto-less programming is often equated with it, the term “structured programming” does not occur in the Letter to the Editor. It may not even have existed in 1968. The earliest mention of “structured programming” I have been able to find is in EWD249 “Notes on Structured Programming” [4] dated August 1969. It is this term that created even more controversy than the Letter to the Editor. It rapidly gained support, especially in academia, where it happened that professors with no programming experience had to teach the introductory programming course. As they could not tell the students what to do, they welcomed the opportunity of telling them what not to do: use a goto statement. These theoretical types enjoyed the casuistry concerning whether and in what cases the rule might exceptionally be broken.

Fast forward now to 1996, when C.A.R. Hoare finds occasion to note [5] that the dire predictions of the 1968 NATO conference had not come to pass. Specifically, the prediction was that without anything less than a revolution, software development would be stuck in the sorry state noted at the conference. For Dijkstra this revolution would have to take the form of correctness proofs of all code. Hoare published a verification method soon after. Yet the title of Hoare’s 1996 paper is “How Did Software Get So Reliable Without Proof?”.

Hoare reviews several contributions to this welcome, yet unanticipated development. Successive sections are devoted to Management, Testing, Debugging, and Over-engineering. Most relevant to this article is the next section, Programming Methodology.

Research into programming methodology has already had dramatic effects on the way that people write programs today. One of the most spectacular successes occurred so long ago that it is now quite non-controversial. It is the almost universal adoption of the practice of structured programming, otherwise known as avoidance of jumps (or gotos). Millions of lines of code have now been written without them. But it was not always so. At one time, most programmers were proud of their skill in the use of jumps and labels. They regarded structured notations as unnatural and counter-intuitive, and took it as a challenge to write such complex networks of jumps that no structured notations could ever express them.

As we will see later, not everyone agreed on this characterization of structured programming. Hoare continues with an explanation of how this other revolution came about.

The decisive breakthrough in the adoption of structured programming by IBM was the publication of a simple result in pure programming theory, the Bohm-Jacopini theorem. This showed that an arbitrary program with jumps could be executed by an interpreter written without any jumps at all; so in principle any task whatsoever can be carried out by purely structured code. This theorem was needed to convince senior managers of the company that no harm would come from adopting structured programming as a company policy; and project managers needed it to protect themselves from having to show their programmers how to do it by rewriting every piece of complex spaghetti code that might be submitted. Instead the programmers were just instructed to find a way, secure in the knowledge that they always could. And after a while, they always did.

This completes what I consider the external history of structured programming: the history viewed from the outside. For the internal history I turn to what Dijkstra wrote about it in EWD1308 (dated June 2001) [3]: “I knew that in retrospect, the Algol implementation and the THE Multiprogramming System had only been agility exercises and that now I had to tackle the real problem of How to Do Difficult Things.” As a first step in this new direction he wrote EWD249 “Notes on Structured Programming”. It is interesting to compare what Hoare wrote about the significance of IBM and its adoption of structured programming with what Dijkstra wrote in 2001 [3]:

… IBM … stole the term “Structured Programming” and under its auspices Harlan D. Mills trivialized the original concept to the abolishment of the goto statement.

Indeed a far cry from “Learning How to Do Difficult Things.”

In 1969, with the completion of “Notes on Structured Programming”, Dijkstra took the next step with “Concern for Correctness as a Guiding Principle for Program Construction” [6]. He believed that the Software Crisis, the theme of the 1968 NATO conference, could only be overcome by verifying all code. In this paper he wrote

When correctness concerns come as an afterthought and correctness proofs have to be given once the program is already completed, the programmer can indeed expect severe troubles. If, however, he adheres to the discipline to produce the correctness proofs as he programs along [my italics], he will produce program and proof with less effort than programming alone would have taken.

In structured programming one cannot help writing the loop code first and then wondering what might be a fitting invariant. In the quoted passage Dijkstra implicitly repudiated structured programming. And rightly so, when verified code is the goal, structured programs are not necessarily helpful: they are valued because they are readable, which means they are plausible, rather than verified. If verification is the goal, then one has to start from scratch.

I get the impression that Dijkstra’s “Concern for Correctness …” was a bit of a lead balloon. As of this writing, Google Scholar gives 13 as the number of citations. Compare this with 1768 citations for “Goto Considered Harmful” and 2093 for “Structured Programming”, the book. This makes one wonder whether anything has happened since 1970 to find something more specific than “to produce the correctness proofs as he programs along”.

In 2009 R-J. Back wrote a paper [7] under the title “Invariant-Based Programming”, which solved Dijkstra’s conundrum by the observation that, given loop code, finding a fitting invariant can be hard, whereas given a suitable invariant, it is easy to find corresponding code. He attributes this method to Reynolds’s 1978 “Programming with Transition Diagrams” [8] and to van Emden’s 1979 “Programming with Verification Conditions” [9]. These papers introduce the same concept. Reynolds starts with the syntactic side (the diagrams), whereas van Emden starts with the semantic side (“verification conditions” in the sense of Floyd).

What is imperative programming?

On the subject of programming paradigms “imperative” is often contrasted with “declarative”. Imperative is characterized by being dominated by the concept of state, whereas state is supposed to play no role in declarative programming. Lamport argues [10] that this dichotomy is not helpful. A program in Scheme or in Prolog, however true to the language’s ideal style, specifies a computation and, as Lamport has pointed out, all computation is performed by a state machine of some sort. This is obviously true of imperative programs, but it is also true of programs in Scheme or Prolog, although typically less obviously so.

Still, structured programming is different from the kind of programming for which Scheme or Prolog is intended for. How can one characterize the difference? Not by whether a state machine is involved. The difference seems to lie in the structure of the state and in its visibility. In the kind of programming that Dijkstra is interested in, and that I loosely refer to as imperative programming, the structure of the state is simple: the state is a vector of primitive values indexed by variables. The state is visible: new values are created as values of expressions containing state components and decisions are taken on the basis of such expressions. In the kind of programming for which Scheme and Prolog are intended, the structure of the state is complex, containing stacks and pointers into them. These are not visible and are manipulated indirectly via the evaluation of expressions (Scheme) or via the elaboration of goal statements (Prolog).

Floyd’s verification method

To act on Dijkstra’s injunction to make a correctness proof the guiding principle in program design, I find that Floyd’s verification method [11] is the best starting point. So let us review this method, if only to establish terminology.

The program is given as a flowchart with the nodes identified by labels. Without loss of generality I consider flowcharts with two distinguished nodes: the start node (the one node without an incoming arc) and the halt node (the one node without an outgoing arc). For a correctness proof each label is associated with an assertion.

An assertion is a formula of first-order predicate logic. As such it is restricted to a given vocabulary, which consists of function and predicate symbols and of variables. The variables can be bound or free; only the latter concern us here. The set of assertions that belong to the same flowchart are restricted to the same vocabulary. Whether an assertion is true depends on an interpretation of the function and predicate symbols and on the state (the values of the variables).

To follow a computation by a flowchart one has to know what the current state is and at what node control has arrived. Thus a computation can be formalized as a sequence of pairs consisting of a label and a state. Such a sequence is a trace of the flowchart if <L0, s0> is followed by <L1, s1> whenever L0 is followed by L1 in the flowchart with no intervening label and if s0 is transformed to s1 by the intervening code. Infinite traces exist in some flowcharts. Finite traces end in the halt node. A finite trace is a computation of the flowchart if it starts in the start node.

The expression {p}C{q} has become known as “Hoare triple”, although Hoare introduced it as p{C}q to mean the same as the verification condition introduced by Floyd and written by him as VC(p;q). I adopt the notation of the Hoare triple and use Floyd’s name for it. The meaning of a verification condition {p}C{q} can be characterized by saying that it is true whenever truth of p in state s and s being transformed to state t by executing C together imply that q is true in t.

In Floyd’s original form, the command C is a fragment of a flowchart. Hoare restricted it to be a fragment of a structured program. Programming With Verification Conditions (as in [9]) is based on the observation that {p}C{q} is defined for any C of which the meaning is characterized by a binary relation. This gives a generalization with respect to flowcharts because a formula by itself can be regarded as a binary relation between states that is a subset of the identity relation on states. For example, {p} x>0 {p & x>0} is true for any assertion p. Think of x>0 as a partially defined command that fails in any state in which x <= 0 and that otherwise successfully terminates without changing the state.

Imagine now that we have a program with labels and that each label is associated with an assertion. Suppose that the semantics of goto L is a claim that assertion L is true of the state. If p and q are the assertions associated with the labels P and Q, then (P: C; goto Q) is executable code and {p}C{q} is its verification condition. Any set of verification conditions with common state variables becomes a program. All we have to do is to allow programs for which attempts at execution can have the outcomes that can be classified as follows: (a) there is a final state, and (b) there is no final state (“infinite loop”). In case (a) we can have that (a1) the final state is reached at the halt label (“success”), and (a2) the final state is reached at another label (“finite failure”). (a2) is not possible in flowcharts; it is a possibility that arises in the generalization of flowcharts to sets of verification conditions.

Matrix Code

Once we know that any set of verification conditions with a common vocabulary is in principle executable, any conceptual barriers that may have existed are removed; what remains is mere technicalities. What follows is a description of one route from a set of verification conditions to a program I can run on my computer.

A set of verification conditions, being a set, is not ordered. In a listing we can re-order them in any way without affecting the meaning of the program. The first thing to happen after [8] and [9] was the observation that a set of verification conditions {p}C{q} can be read as a matrix in which C is the entry in column P (the label of p) and in row Q (the label of q); the entries not thus specified contain the empty binary relation. Thus the typical set of verification conditions is a sparse matrix. Indeed, sparse matrices of reals are often specified by an unordered listing of triples consisting of a non-zero entry together with its row and column index. The unlisted entries are assumed zero.

In [12] the matrix format for verification conditions is called Matrix Code. An advantage of the matrix format is that behaviour of the program can be couched in terms of matrix multiplication. Apart from its application to computer programs this paper contains a number of theoretical results. Floyd’s characterization of the verification conditions for a flowchart is generalized and related to fixpoint semantics.

In [12] programs are left in the traditional two-dimensional format for matrices. Though this works reasonably well for small examples on a whiteboard (which is where Matrix Code was born), it helps to use a simple character-oriented editor such as emacs or vi. To bridge the gap between the program-as-set-of-verification-conditions and compilable code it helps to have an intermediate form. One will see all this worked out in [13].

Coda

Hoare’s retrospective [5] shows that after 25 years programming method was still stuck in structured programming. How could this happen? Dijkstra was dismayed to find that structured programming had been equated with programming without goto statements. He couldn’t have expected otherwise, what with his emphasis on source code structures (conditional, alternative, and repetitive statements). With his introduction of the guarded-command language Dijkstra reinforced this emphasis. People could not have guessed that structured programming was meant as a first step towards a method for “doing difficult things”.

I advocate Matrix Code as a further step towards such a method. Matrix Code requires translation to code with the goto statement as the only way of transferring control. In the Letter to the Editor “Goto statement considered harmful” [2] Dijkstra advocates never using the goto statement. There must be something in the letter that I disagree with. What is it?

Dijkstra opens the argument with

My first remark is that, although the programmer’s activity ends when he has constructed a correct program, the process taking place under control of his program is the true subject matter of his activity, for it is this process that has to accomplish the desired effect; it is this process that in its dynamic behaviour has to satisfy the desired specifications.

What I propose is to change “the true subject matter of his activity” from that process to the goal the process is to achieve. That goal is a static entity. It can be expressed as an assertion. The programmer’s activity should be to design achievable goals and to link their assertions with verification conditions. One can statically ascertain whether a set of verification conditions is sufficient for the task at hand.

To proceed thus is to work at a higher level than structured programming; if it needs a name, then “goal-directed programming” might do. From the point of view of [6] this is a step forward because it is assumed there that code has to be verified. Indeed goal-directed programming results in code that is verified in the sense of Floyd. But I see a more important advantage. Without knowing an algorithm to achieve the goal, the sub goals needed suggest themselves, together with verification conditions that link them to a complete set. As observed before, to get from a set of verification conditions to executable code is trivial mechanics. For examples of this miracle, see [13].

If you have a goal, then you know where you are going to and goto is exactly what you need.

Acknowledgements

I am indebted to Paul McJones for discussions and pointers to material.

References

[1] “The early development of programming languages.” by D.E. Knuth and L.T. Pardo. A history of computing in the twentieth century. 1980. 197-273.
[1a] “Structured Programming with go to Statements” by D.E. Knuth. ACM Computing Surveys 6.4 (1974): 261-301.
[2] “Letters to the editor: go to statement considered harmful.” by E.W. Dijkstra. Communications of the ACM 11.3 (1968): 147-148.
[3] “What led to ‘Notes on Structured Programming'” by E.W. Dijkstra. EWD1308, June 2001.
[4] “Notes on Structured Programming” by E.W. Dijkstra. EWD249, August 1969.
[5] “How did software get so reliable without proof?” by C.A.R. Hoare. International Symposium of Formal Methods Europe. Springer, 1996.
[6] “Concern for correctness as guiding principle in program construction” by E.W. Dijkstra. EWD288, July 1970.
[7] “Invariant based programming: basic approach and teaching experiences” by R-J. Back. Formal Aspects of Computing 21.3 (2009): 227-244.
[8] “Programming with transition diagrams” by J.C. Reynolds. Programming Methodology. Springer, 1978. 153-165.
[9] “Programming with verification conditions” by M.H. van Emden. IEEE Transactions on Software Engineering 2 (1979): 148-159.
[10] “Computation and state machines” by L. Lamport. Unpublished note (2008).
[11] “Assigning meanings to programs” by R.W. Floyd. Mathematical aspects of computer science 19.19-32 (1967): 1.
[12] “Matrix code” by M.H. van Emden. Science of Computer Programming 84 (2014): 3-21.
[13] “Beyond Structured Programming” by M.H. van Emden.
Posted on arXiv October 2018.

 

2 Responses to “History of “Structured Programming””

  1. Lúthien Says:

    Thanks, Maarten! I only knew about this in very vague terms (I remember hearing programming students in the 1980’s scoff about “Goto”) but this makes the whole issue come alive.

    Very insightful!

    All the best,
    Maike

  2. Yigal Rachman Says:

    Maarten: This article raises my understanding of state machines to the next level – thank you. And yes, ‘goto’ is very useful for this method of programming. Unfortunately, almost all programs I have seen that use ‘goto’ do not use it in this constructive way.

    Warmest regards,
    Yigal

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s


%d bloggers like this: