Python slams into its exponential wall

My first Python sighting was around 1999, in the part of the building where the hackers hang out. Somebody had a poster on the door saying something like “If you would have used Python, then you would have been done by now.”

Next stop 2005. Since 1986 I had been a fan of “Structure and Interpretation of Computer Programs” by Harold Abelson and Gerald Sussman. At the University of Waterloo it was only in a fourth-year topics course that this book could make a brief appearance. I was envious of MIT, where masses of first-year students took EECS 6.001, in which Abelson and Sussman was used from day one. That was class. In 2005 I heard the sad news that, after two glorious decades, EECS 6.001 was closed down, replaced by a course where the text was … a book written for high-school students. Maybe EECS wouldn’t have picked that book if its language had been BASIC. Perhaps it had to do with the book’s choice of Python. It was from this news item that I learned that Python is a programming language.

Next item on my timeline is 2013, with an e-mail from my friend at the Aerospace Faculty of Delft University of Technology in Holland. As an external examiner of some of their PhD theses, I had marveled at three things: the considerable amount of programming done by these students, the fact that they had never taken a course in programming, and the fact that the code was all in Matlab even though there was, as far as I could see, nothing for which Matlab was needed. It dawned on me that Matlab was the only “programming language” they knew.

A brief introduction to the foreign world of the venerable engineering schools on the continent of Europe. There is no national body overseeing engineering education, such as the ones in North America that ensure that every first-year engineering student takes a programming course. At Delft there is no uniformity in this respect even within the university. The Aerospace Faculty, in its august autonomy, had always taken the position that programming was lab work. Students don’t need a course in oscilloscopes; they learn to use them in some lab. Students don’t need a course in programming; they learn that in some lab.

And what they learn is Matlab. For me Matlab is expensive software that you turn to in the rare situation that the algorithms in Numerical Recipes [1] don’t cut it. For me Matlab is the creation of Cleve Moler, a high priest of the Church of Numerics, who turned out to be a maverick by creating a successful commercial enterprise. For a computer scientist it was amazing to see researchers do all their programming in Matlab without needing the edge that you can get from state-of-the-art numerical algorithms.

A recent phenomenon around Aerospace in Delft is that of the start-up company. The first question on the job of the newly hired graduates: “Where is Matlab?” Answer: “Forget it. We can’t afford that. You’ll have to do with Python.” “You mean, Matlab costs money? And what’s Python, anyway?” Matlab, the ubiquitous and universal tool, had been taken for granted while under the capacious umbrella of the university-wide license.

Next news item from the Aerospace Faculty: professor Jacco Hoekstra teaches a course in … programming. And chooses Python as language. And writes a 137-page tutorial. His reason for choosing and recommending Python (page 84):

Python with Numpy and Scipy is more capable than Matlab. Python is better in handling strings, reading files, working with very large projects and with large datasets. Python can also be used in an object-oriented programming way. Both Spyder and the iPy Notebook provide a very user-friendly environment for scientists. A more general difference consists in the extra possibilities, which are provided by a full-featured general-purpose programming language like Python. And, often more importantly, Matlab is very expensive and many applications require extra toolboxes, which are in turn also very expensive. This also hinders sharing tools as well as quickly using source code from the internet community: often you can only use the downloaded bits after purchasing the required toolboxes.

Can somebody who has used or taught Algol 60, POP-2, APL, Prolog, Basic, Cobol, Fortran, Scheme, ML, C, C++, and Java get excited about yet another programming language? I am excited about Python and I’ll tell you why by showing you an example that only works in programming languages with a certain cachet. Consider a function that takes a real-valued function f of a real as argument and produces another such function that is the derivative of f. To avoid misunderstanding: the required function is not to compute the value of the derivative at a given point; it is to return the entire derivative as a function, a real-valued function of a real argument. This was not possible in Fortran, nor in any other language at the time of the early Lisps. At the present time you cannot do it in C. It has only been possible in C++ since 2011 [2].

The first programming language with cachet was Lisp, where one would write the derivative-producing function as

(define (derivative f dx)
        (lambda (x) (/ (- (f (+ x dx))
                          (f x))
                       dx)))

However, in early versions of Lisp this would not work. It turned out that the language needed to follow mathematical thinking more closely. Technically speaking those early Lisps had “dynamic scope” of non-local names, whereas “lexical scope” was needed [3]. Lisp users and implementers were reluctant to make a wholesale switch to lexical scope. The situation in Lisp remains messy to the present day. The first language that adopted lexical scope cleanly and wholly was the Lisp dialect Scheme in 1975, the language of “Structure and Interpretation of Computer Programs” mentioned earlier.
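What goes wrong under dynamic scope can be imitated in a few lines of Python. The following is only a toy model, not how any actual Lisp was implemented: the names env_stack, lookup and derivative_dynamic are made up for the illustration, and the stack of dictionaries stands in for the bindings that are live at the moment a name is used.

```python
# Toy model of dynamic scope. All names here are hypothetical and serve
# only as illustration; this is not how any real Lisp is implemented.
env_stack = [{}]  # innermost frame last; the first frame plays the role of globals


def lookup(name):
    """Find a name in the frames that are live at the time of the call."""
    for frame in reversed(env_stack):
        if name in frame:
            return frame[name]
    raise NameError(name)


def derivative_dynamic(f, dx):
    """Forward difference whose lambda resolves f and dx dynamically."""
    env_stack.append({"f": f, "dx": dx})  # bindings exist only during this call
    try:
        return lambda x: (lookup("f")(x + lookup("dx")) - lookup("f")(x)) / lookup("dx")
    finally:
        env_stack.pop()  # the frame is gone before the lambda is ever called


def cube(x):
    return x * x * x


d_cube = derivative_dynamic(cube, 1e-6)

try:
    d_cube(1.0)
    outcome = "worked"
except NameError:
    outcome = "f is unbound"  # the bindings died with derivative_dynamic's frame

# Worse: if the caller happens to have its own f and dx, those are used instead.
env_stack[0].update({"f": lambda x: x, "dx": 1e-6})
wrong = d_cube(1.0)  # differentiates the identity function, not cube
```

In this model the returned function either fails outright or silently differentiates whatever f the caller has lying around. Under lexical scope the lambda carries the f and dx it was created with.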

My reason for welcoming Python to my overcrowded zoo of programming languages is the first thing I tried, namely to write McCarthy’s derivative function in Python. Here it is.

 1 def derivative(f, dx):
 2   return lambda x: (f(x+dx) - f(x-dx))/(2*dx)
 3
 4 ddx = lambda f: derivative(f, 0.0000001)
 5
 6 cube = lambda x: x*x*x
 7
 8 f = lambda x: x
 9
10 print((ddx(cube))(1))
11 # 3.00000000009
12 print((ddx(ddx(cube)))(1)) # lousy algorithm, beautiful code
13 # 6.01463323591            # lousy accuracy

The Lisp function is rendered in lines 1 and 2, with a central difference in place of the Lisp version’s forward difference. In line 4 I take the liberty of defining a better version because it seems to me that the Lisp derivative should only have a single argument, namely the function of which the derivative is needed; ddx in line 4 is that function. In line 6 we have an example function to be differentiated. In line 8 there is a red herring to be dangled in front of the ddx in line 10 to see if Python can be tempted to pick the wrong f. In line 12 the second derivative is computed in a way that is a horror in the eyes of Cleve Moler and his colleagues, but a delight from a programming point of view.
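Why the second derivative comes out so poorly can be seen with a small experiment. For a cubic, the nested central difference has no truncation error in exact arithmetic, so what remains is rounding error, which grows roughly as the machine epsilon divided by dx squared. The sketch below is my illustration, not part of the original experiment; the helper second_derivative_error and the choice of step sizes are assumptions made for the occasion.

```python
def derivative(f, dx):
    # Central difference, as in the listing above
    return lambda x: (f(x + dx) - f(x - dx)) / (2 * dx)


def cube(x):
    return x * x * x


def second_derivative_error(dx):
    """Absolute error of the nested difference at x = 1, where the exact value is 6."""
    d2 = derivative(derivative(cube, dx), dx)
    return abs(d2(1.0) - 6.0)


# Step sizes chosen for illustration; 1e-7 is the one used in the listing.
errors = {dx: second_derivative_error(dx) for dx in (1e-3, 1e-4, 1e-7)}
for dx, err in errors.items():
    print(dx, err)
```

A larger step such as 1e-4 should land much closer to 6 than the 1e-7 used above, which suggests that the step size, not the closure-based code, is what makes the algorithm lousy.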

The early history of Python has been written by the person best qualified to do so: Guido van Rossum, the creator of the language. From van Rossum’s history we know the date of Python’s birth. This allows us to estimate its growth rate. I assume it has been growing exponentially from one user in 1989 to n users in 2013. If its doubling time is d years, then n is the result of (2013-1989)/d doublings. If n were 2^20, which it may well be by Mark Lutz’s estimate, then the doubling time would be 24/20 = 1.2 years.
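The arithmetic can be checked in a few lines. This is a sketch under the same assumptions as the text: one user in 1989, and a user count of 2^20 in 2013 that is itself only an estimate. The variable names are mine.

```python
import math

birth_year, end_year = 1989, 2013
users = 2 ** 20  # assumed user count in 2013, per the estimate in the text

doublings = math.log2(users)  # number of doublings needed to go from 1 to n
doubling_time = (end_year - birth_year) / doublings

print(doubling_time)  # 1.2 (years per doubling)
```

Changing users shows how insensitive the result is to the estimate: every factor of two in n only adds one doubling to the twenty.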

2013 is a good endpoint for this computation: arrival of Python at the Delft University of Technology indicates to me that Python has reached saturation and that the most recent doubling is the last we will see. Python has slammed into its inevitable exponential wall. May its many users have many years of productive and enjoyable Pythoning ahead of them!

Acknowledgements

Thanks to Paul McJones for several improvements to this article.

References

[1] Numerical Recipes in C by William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery. Cambridge University Press, 1992.

[2] But it is possible in Javascript.

[3] The Art of the Interpreter or, the Modularity Complex (Parts Zero, One, and Two) by Guy Lewis Steele, Jr and Gerald Jay Sussman. AI Memo No. 453, Massachusetts Institute of Technology Artificial Intelligence Laboratory, May 1978.

4 Responses to “Python slams into its exponential wall”

  1. Andre Vellino Says:

    On that recommendation, I’ll have to give Python another go. The perhaps trivial but nevertheless irritating design decision in the language that really put me off was the choice they made for indenting multi-line programs. (see: https://docs.python.org/2/reference/lexical_analysis.html#indentation) This was so terribly “I don’t care about the programmer” and so obviously “I want to make parsing my programs easier” that I came to the hasty conclusion that there wasn’t much that this language could offer me that other interpreted languages with dynamic typing and garbage collection don’t.

  2. pauljurczak Says:

    @Andre: Curiously, my reaction to Python indentation scheme was quite opposite to yours: “They do care about the programmer”. I’ve spent a few decades programming with C++. During these decades, I’ve typed many hundred thousands of semicolons and curly braces. At the same time, I was indenting my code, rendering majority of these semicolons and curly braces redundant. Languages like Python and F#, where indentation is syntactically significant, are a breath of fresh air to me.

  3. Paul Wormer Says:

    Maarten, you are not quite fair to Matlab. The strength of it is that it has under “one roof” all of Numerical Recipes and much more: extensive 3D plotting, solving of coupled systems of linear differential equations, and so on. Using an ordinary computing language, be it Python, Java, C, or Fortran, one always has to search for appropriate libraries that do the 3D plots or solve the differential equations for you.

    Indeed, Matlab is expensive. Fortunately there is an open source alternative: http://www.scilab.org/

  4. Jim Callahan Says:

    Well, you might want to look at Julia, Stan and R. Julia, unlike Python, is built from the ground up to be a scientific computing language. Stan is a simulation language that had to implement derivatives for performance reasons. And R is a re-implementation of the S statistical language developed at Bell Labs, built on the guts of a Scheme interpreter.
