Dijkstra, Blaauw, and the origin of computer architecture

E.W. Dijkstra is known for several important contributions. It does not seem to be widely known that he played a role in the origin of computer architecture as a concept. In arguing that this is the case I draw attention to the passage in his 1972 Turing lecture where he recounts that the darkest week in his professional life was when he studied the specifications of a newly announced line of computers that he does not further specify than being “of the third generation”.

The “darkest week” surfaced again in the interview, many years later, by Philip Frana for the oral history project:

… ’64 was the year in which IBM announced the 360. I was extremely cross with Gerry Blaauw, with whom I had cooperated and who should have known better, because there were serious flaws built into the I/O organization of that machine.

Frana: Which you had discussed with him before?

Dijkstra: No I hadn’t, but he should have known of my Ph.D. thesis, and he should have known about the care that has to go into the design of such things, but that was clearly not a part of the IBM culture. In my Turing Lecture I have described the week that I studied the specifications of the 360, it was [laughter] the darkest week in my professional life.

In this article I explain why Dijkstra thought that Blaauw should have known better. But first a little detour back to the Turing award lecture, where Dijkstra says that a professional journal like the Communications of the ACM (CACM) should not only review newly published books, but also new computer designs.

And here I have a confession to make: in the early sixties I wrote such a review with the intention of submitting it to the CACM, but in spite of the fact that the few colleagues to whom the text was sent for their advice urged me all to do so, I did not dare to do it, fearing that the difficulties, either for myself or for the editorial board, would prove to be too great.

In the Dijkstra archive [1] I found EWD255, which is a critique of the IBM 360 I/O organization. That matches the review that Dijkstra mentioned in his Turing lecture. There are, however, two anomalies: EWD 255, itself undated, is sandwiched between EWD 253 (dated January 1969) and EWD 257 (dated August 1969). So that is rather later than “early sixties”.

The other anomaly is that by 1969 Dijkstra’s habit of writing in English was well established. For example, among the EWDs 200-299 I counted 29 in English and 28 in Dutch. The latter are for local consumption, administrative as well as technical. Yet EWD 255, where Dijkstra attacks the design (or lack of it) of the IBM 360 I/O organization, is in Dutch. As I could not find an English translation, I wrote one; see the Appendix to this article.

Finally back to the origin of computer architecture, in which Blaauw, mentioned in the above quote from the Dijkstra interview, played an important role. The stage is set in the beginnings of computing in continental Europe.

In 1952, the year that Dijkstra started his career in computing, there were two related threads of pioneering work in the US and the UK that had made a big impression in continental Europe. One was numerical analysis: the large-scale application of numerical computation to solve problems in applied mathematics. In the US the example was the Manhattan project; in the UK it was crystallography. In Los Alamos you had a roomful of people manually operating mechanical calculators and controlled by pre-planned paper forms for intermediate results. This led to the first stored-program electronic computers. The first to be completed was the EDSAC in England; next was the EDVAC in the US.

In the early fifties in continental Europe there was a desire to tackle all manner of important problems with numerical analysis. Even if electronic computers would have been on the market, it would have been difficult to get the various ministries of finance in war-torn continental Europe to give permission to make the necessary foreign currency available. The scarce supply of US dollars had to compete with the equally urgent demand for Camel cigarettes and nylon stockings. At the time the currency problem could not be sorted out by market forces: exchange rates were fixed; in this case 3.60 Dutch florins to the US dollar.

As a result just about every self-respecting country had a project to build a stored-program automatic computer. The Netherlands were no exception; in fact, it had two projects, one of which at the Mathematical Centre (MC) in Amsterdam. The MC was hard at work on solving applied mathematics problems with manually operated calculators. And it got funding (in Dutch florins, of course) to build ARRA, a stored-program computer. B. Loopstra and C. Scholten were designing the machine. In anticipation of its supposedly imminent completion, Dijkstra was hired as programmer. As he would later proudly say, the first such in the Netherlands. An excellent account of Dijkstra’s first years at the MC can be found in Chapter I of Gauthier van den Hove’s MSc thesis [7].

The MC team could not get the ARRA reliable [2]. Of course a demo had to be given, but the project only dared to demonstrate the generation of random numbers. Now A. van Wijngaarden, the boss of the computer project, had excellent international contacts and managed a prize hire. Gerrit Blaauw had graduated in 1946 from the engineering school in Delft, Holland, and got a scholarship to study in the US funded by Thomas J. Watson. After a year at an obscure college, Blaauw was accepted at Harvard and joined Howard Aiken’s group, where he got his PhD. Nobody could have found a better addition to the MC team, which Blaauw joined in 1952.

After examining the obstreperous ARRA it took Blaauw all of twenty minutes to convince the team that it was not just a matter of getting the last bugs out, but that there was no way the ARRA could be made to work at all [3]. There was no alternative but to start from scratch. Face was saved by pretending ARRA never existed: the new machine, completed in December 1953, was also named ARRA. By that time the team had evolved the method best described by quoting Dijkstra [2]:

I would discuss, we would design the instruction code, I would check whether this was an instruction code I could live with, and my hardware friends would check that this was an instruction code that they could build. And then I would write down the formal specification of the machine, and all three [sic] of us would sign it with our blood, so to speak. And then our ways parted and they would start, they knew what to build and I knew what I could construct.

Dijkstra described the method by introducing his 1953 report [4] with

In the following this machine is described, in so far as it concerns the user: it will be described what the machine does, not how the machine works. [Dijkstra’s underlinings]

This may well be, if not the invention, the first explicit formulation of the interface concept, pervasive in computer and software organization. Rather than to list examples of interfaces, let us pick up the story when Blaauw was designing machines in Amsterdam. In 1954 Arthur Samuel visited Amsterdam in one of his many roles with IBM, apparently in this case as salesman. He promised Blaauw a job at IBM [6]. Blaauw arrived there in 1955. After some initial orientation he was assigned to the team for the Stretch computer.

In 1961 IBM had a chaotic collection of production lines originating in different divisions. It was losing market share to the Seven Dwarfs (Burroughs, Sperry Rand, Control Data, GE, RCA, and NCR). IBM’s offerings were address-limited while customers demanded larger memories. IBM countered the crisis by deciding on a single product line for the entire company. Blaauw was assigned to this project. Because the product line ranged from small machines to the largest, it was essential that different hardware configurations present the same interface to software. Blaauw made two contributions.

Computer Architecture The interface concept developed earlier in Amsterdam by Dijkstra and the hardware team, of which Blaauw was part, jelled into the term computer architecture. It became fundamental to the concept of System/360, as IBM’s new product line was called. Blaauw’s contribution is emphasized by Brooks:

Architecture must be carefully distinguished from implementation. As Blaauw has said, “Where architecture tells what happens, implementation tells how it is made to happen”. [5]

System/360 was only the beginning of Computer Architecture, the application of the interface concept to computer organization. Computer Architecture was essential to the wave of revolutions that made processors orders of magnitude more powerful and cheaper during the last decades of the twentieth century. In spite of the implementation upheavals caused by paging, pipe-lining, and caching, the x86 architecture, as seen by compilers, remained the same. This was also the case for the PDP11 series, the Motorola 68000 series, and the PowerPC. Thus Computer Architecture, the basis of all devices based on computer processors, originated in a small group trying to build a computer that worked in an old school building in Amsterdam in the early 1950s.
Data format The System/360 project found it difficult to decide on the data format [8]. A stack architecture, successfully demonstrated by the Burroughs machines was considered and rejected for not being able to accommodate the smaller machines of the product line.

So we scrapped it and had a design competition, which was Gene Amdahl’s idea. I think there were 12 or 13 internal teams, mostly of three or four people. I said the decision of the judge will be final. Gene’s team and Jerry’s team came in with essentially the same concept, which was to use the base registers that Philco had introduced in a machine a year or so earlier. Base registers meant we could get by with short addresses. There was one very big difference. Gene’s machine was based on the existing 6-bit byte and multiples of that: 24-bit instructions and a 48-bit instruction or floating point. Jerry’s [Blaauw’s] machine was based on an 8-bit byte and 32-bit instructions, so 64-bit and 32-bit floating point. This is not a real happy choice. There are strong arguments each way, but you want your architecture to be consistent. You are not going to have an 8-bit byte and 48-bit instruction floating-point word.
[8].

It was a Thomas J. Watson scholarship that brought Gerrit Blaauw in 1947 to the US, when IBM was just tentatively orienting itself in the direction of electronic stored-program digital computers by engaging von Neumann as a consultant. By 1948 Blaauw had found his way to the world’s prime computer laboratory, and under the terms of the scholarship [3], returned to his native country. As luck would have it, he found his way to that country’s prime computer laboratory. After seminal work there, the long arm of IBM reached Amsterdam to retrieve Blaauw, who after further training, was at the right spot at a critical juncture for IBM. Thomas J. Watson Jr had bet the company, by winding down its chaotic offerings of independently conceived machines, and introducing a single line of compatible ones based on the novel concept of computer architecture.

Acknowledgment

I am greatly indebted to Edgar G. Daylight for allowing me to read the draft [6] of a chapter of his forthcoming book. This gave me not only the story, but also allowed me to find the citations used for this article.

References

[1] Dijkstra Archive
[2] Oral history interview
[3] Blaauw biography
[4] Functional description of ARRA
[5] The Mythical Man-Month by Frederick P. Brooks.
[6] Chapter “De kern van de zaak” in De Geest van de Computer (working title, to appear) by E.G. Daylight.
[7] Edsger Wybe Dijkstra: First Years in Computing Science by Gauthier van den Hove. MSc thesis, University of Namur, 2009.
[8] “Interview with Fred Brooks” by Len Shustek. Comm. ACM Nov. 2015, vol. 58, no.11, pp 38–40.

Appendix: Translation of EWD 255, “On the IBM360” by Edsger W. Dijkstra

A first series of objections concerns the functional specification of the organisation of channels. This is designed with the intention of the CPU submitting a chain of instructions to the channel for sequential execution. However, the hardware is poorly utilized because:

when a channel sends an interrupt to the CPU before an earlier interrupt from this channel has been processed, the previous interrupt is lost;
it is impossible to add reliably an instruction to a chain that is being executed by a channel: should the CPU attempt this, then there are circumstances under which it is not possible to determine reliably whether or not the channel has just completed execution or still has to do so. (This impossibility is a consequence of the fact that the length of the chain is indicated by marking its last element.)
if the execution of an instruction fails, then this is duly reported, but the channel continues after this failure with the next instruction in the chain as if no failure had occurred.

As a result of these shortcomings the instruction chain cannot be used in the intended manner and the operating system is faced with moral urgency situations [1] that would otherwise be avoidable. (To those not familiar with operating systems this probably seems to be a criticism of detail. I, on the contrary, feel that this defect is an alarming indication with respect to the competence of the design team: this is not a question of taste or style, things have been designed with sensible intent while simple reasoning demonstrates that the facilities offered are inadequate.)

A second series of objections consists of examples where the designers have left to the software the management of the machine’s components instead of allowing the software to specify the process at a more abstract level, delegating the management of specific components to the system/machine. Examples of this are the following.

Peripherals can only be controlled by coupling them to a channel and by subsequently issuing instructions to this channel. And this while the channel has no logical significance; the peripheral does have one.
The arithmetic unit has a large number of registers of which it is indicated explicitly in the program text which ones are to be used. This has unpleasant consequences:
1. that compilers are confronted with the problem of “register allocation”, which leads to an optimization process that is costly in terms of compilation time
2. that the kind of status change that occurs in subroutine calls and in multiprogramming becomes unavoidably costly because of obligations to save or restore register contents (whether these turn out to be necessary or not!). Here the design optimizes at the microscopic level, which must be paid for many times over at the macroscopic level:
  re 2.1: the fact that compilation is so time-consuming is responsible for the demand for “independently pre-compiled program components”, the combination of which has created the need for a so-called linkage editor;
  re 2.2: in case time becomes an issue, subroutine calls have to be replaced by an adapted copy of the subroutine’s text, which leads to extremely lengthy programs, so long, that their assembling becomes “a major processing task” (Asher Opler, IFIP 1965). IBM will, in that case, be happy to supply the required memory!
Instead of the program addressing information, it has to address memory, either in core or on disks, with the consequence that every program is responsible for its private organisation of “overlays” and transfers between slow and fast memory. This has rather disastrous consequences:
1. Standard programs have to exist in different versions, depending on their core requirements. The desire to use programs of others (APT for the 360, for example) can force one to install at least that amount of core. Moreover: enlarging core does not seamlessly lead to increased efficiency, at least not without some reprogramming.
2. The only way in which distinct programs can use common subroutines that have only a single occurrence in core is to have these subroutines permanently present. Because one has to be economic with this, the need arises for “system generation”, adapting the system to the problem mix, with all its consequent misery: it is a time-consuming process and, moreover, one does not want to do it, as the problem mix can change!
3. The fact that programs in execution occupy an immovable contiguous part of core memory makes scheduling harder to an unnecessary and disproportionate content; the resulting system remains inflexible in use.

Summarizing: details of embedding, which could have been abstracted from by slightly more refined hardware, now require their explicit representation in programs. On the one hand this unnecessary explicitness makes program structure more difficult; on the other hand it is the case that this premature fixation detracts from the suppleness with which the installation can be used. The unbelievable thing is that this machine is now being advertised with the “wonderful operating system” that relieves the programmer of so many tasks! This glosses over the facts that:

a large part of these tasks are necessitated by the hardware and would have been easier or non-existent in a better design;
this operating system implies an alarming overhead (but IBM is happy to supply a faster model from the family);
the tasks have not been obviated, but have been transferred to the computing centre’s management, which has to dimension and manage the machine
the creation of this operating system turned out to be a programming task beyond the capabilities of the manufacturer;
that the system has become such a baroque monstrosity that no user can venture to adapt it.

Finally, just as the 709-series always struck me as a desperate attempt to use tapes as back-up storage, the 360-series strikes me as disk units with accompanying electronics. It would not surprise me if the somewhat disappointing performance of the faster models comes down to the fact that the faster model is waiting, say, four times faster for a disk arm to move. How they are going extract themselves from this vicious circle is not clear to me: to address this discrepancy by multiprogramming is not attractive from the point of view of the CPU; the advent of “one-head-per-track-disks” might well undercut the raison-d’être for the current operating system.

Footnotes

[1] Translator’s note: I cannot make sense of this. It seems to me that “moral urgency situations” is the correct translation of the original’s “morele haastsituaties”. Is he talking about race conditions?

Postcript December 21, 2014.

Kevin Hely solved the mystery of “moral urgency situations” by pointing me to Dijkstra’s EWD1303 in the Dijkstra archive. I am gratified to note that Dijkstra had arrived at the same translation of the Dutch word. EWD1303 has a definition of the concept.

Three things make this document an improvement over EWD 255: (1) it is in English, (2) it contains the material of EWD255, but embedded in a brief historical exposition of how cooperating sequential processes turned out to be the right abstraction of practical problems in computer architecture, and (3) an explanation of Dijkstra’s reluctance to publish his criticism of IBM System/360 when it was announced.

This entry was posted on June 14, 2014 at 12:52 pm and is filed under Programming. You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site.

2 Responses to “Dijkstra, Blaauw, and the origin of computer architecture”

Kevin Hely Says:
December 20, 2014 at 2:19 pm | Reply
Dijkstra explains his term “moral urgency” here: http://www.cs.utexas.edu/users/EWD/transcriptions/EWD13xx/EWD1303.html
Paul Kimpel Says:
August 30, 2015 at 5:22 am | Reply
With the exception of the comments on channels and peripherals, EWD255 to me reads almost as a checklist of the ways in which the 360 did not employ the concepts of the Burroughs B5500. In particular, his comments on the presence of general registers, their cost of in terms saving state and compiler complexity, static memory allocation, lack of reentrancy for common routines, use of a link editor, the need for or tasks to do their own overlay memory management, the need for “system generation,” and the inefficiency of the operating system all sound like an indictment of the 360 against the B5500, which preceded it (in the guise of the B5000) to the market by a couple of years.

Then there’s that funny comment about “one-head-per-track-disks,” which as far as I know only Burroughs had at the time.

Of course, we know Dijkstra was familiar with the Burroughs Algol-based systems from early in their life, and in the 1970s worked for them as a consultant and research fellow.

A Programmers Place