← All Posts · ← Newer · Older →

The depth and breadth of Python

June 03, 2011 — originally posted on Neopythonic


As of late I'm noticing a trend: I'm spending more time having in-person in-depth conversations, and less time coding. While I regret the latter, I really enjoy the former. Certainly more than weekly meetings, code reviews, or bikeshedding email threads. (I'm not all that excited about blogging either, as you may have guessed; but some things just don't fit in 140 characters.)

Two conversations with visitors I particularly enjoyed this week were both with very happy Python users, and yet they couldn't be more different. This to me is a confirmation of Python's enduring depth and breadth: it is as far away of a one-trick language as you can imagine.

My first visitor was Annie Liu, a professor of computer science (with a tendency to theory :-) at Stony Brook University in New York State. During an animated conversation that lasted nearly three hours (and still she had more to say :-) she explained to me the gist of her research, which appears to be writing small Python programs that implement fundamental algorithms using set comprehensions, and then optimizing the heck out of it using an automated approach she summarized as the three I's: Iterate, incrementalize, and implement. While her academic colleagues laugh at her for her choice of such a non-theoretical language like Python, her students love it, and she seems to be having the last laugh, obtaining publication-worthy results that don't require advanced LaTeX skills, nor writing in a dead language like SETL (of which she is also a great fan, and which, via ABC, had some influence on Python -- see also below).

Annie told me an amusing anecdote about an inscrutable security standard produced by NiST a decade ago, with a fifty-page specification written in Z. She took a 12-page portion of it and translated it into a 120-line Python program, which was much more readable than the original, and in the process she uncovered some bugs in the spec!

Another anecdote she recounted had reached me before, but somehow I had forgotten about it until she reminded me. It concerns the origins of Python's use of indentation. The anecdote takes place long before Python was created. At an IFIP working group meeting in a hotel, one night the delegates could not agree about the best delimiters to use for code blocks. On the table were the venerable BEGIN ... END, the newcomers { ... }, and some oddities like IF ... FI and indentation. In desperation someone said the decision had to be made by a non-programmer. The only person available was apparently Robert Dewar's wife, who in those days traveled with her husband to these events. Despite the late hour, she was called down from her hotel room and asked for her independent judgement. Immediately she decided that structuring by pure indentation was the winner. Now, I've probably got all the details wrong here, but apparently Lambert Meertens was present, who went on to design Python's predecessor, ABC, though at the time he called it B (the italics meant that B was not the name of the language, but the name of the variable containing the name of the language). I checked my personal archives, and the first time I heard this was from Prof. Paul Hilfinger at Berkeley, who recounted a similar story. In his version, it was just Lambert Meertens and Robert Dewar, and Robert Dewar's wife chose indentation because she wanted to go to bed. Either way it is a charming and powerful story. (UPDATE: indeed the real story was quite different.)

Of course Annie had some requests as well. I'll probably go over these in more detail on python-ideas, but here's a quick rundown (of what I could remember):
The contrast with my visitor the next day couldn't be greater. Through a former colleague I got an introduction to Drew Houston, co-founder and CEO of the vastly successful start-up company Dropbox. Dropbox currently has 25 million users, stores petabytes of data on Amazon S3, is profitable, and is not for sale. Drew is an easygoing MIT graduate who is equally comfortable discussing custom memory allocators, the world of venture capitalism, and how to keep engineers happy; he likes hard problems and winning.

Python plays an important role in Dropbox's success: the Dropbox client, which runs on Windows, Mac and Linux (!), is written in Python. This is key to the portability: everything except the UI is cross-platform. (The UI uses a Python-ObjC bridge on Mac, and wxPython on the other platforms.) Performance has never been a problem -- understanding that a small number of critical pieces were written in C, including a custom memory allocator used for a certain type of objects whose pattern of allocation involves allocating 100,000s of them and then releasing all but a few. Before you jump in to open up the Dropbox distro and learn all about how it works, beware that the source code is not included and the bytecode is obfuscated. Drew's no fool. And he laughs at the poor competitors who are using Java.

Next Monday I'm having lunch with another high-tech enterpreneur, a Y-combinator start-up founder using (and contributing to) App Engine. Maybe I should just cancel all weekly meetings and sign off from all mailing lists and focus on two things: meeting Python users and coding. That's the life!

Comments (25)

Daniel LarssonJune 03, 2011

@Snagglepants
Apparently the Dropbox fan's have more in common with Apple users then previously believed.

Tacky? Dude, you might want to read up on netiquette and buy a dictionary.

Guido van RossumJune 03, 2011

One more flame about dropbox v. spideroak (or other competitors) and I'll turn off commenting and delete all comments about that topic.

TonyJune 03, 2011

Please dont turn commenting off: they are very interesting

AnonymousJune 03, 2011

Thought about any(): it might have been interesting to have it return either an empty tuple (if False) or a 1-tuple containing the value (if True). Too late for that now, though.

Regarding while/if: Perhaps it's time to revisit the idea of an 'as' clause on these conditionals (the tuple trick above could handle the cases where the condition and required value inside the clause don't match up)

As far as immutability goes, perhaps someone would care to take the freeze() PEP back up (PEP 351)

João BernardoJune 03, 2011

As we're talking about language requests:
Something that really could be done but probably won't are ranges in the form 1..10. :)

As Ellipsis took the triple dot syntax a ruby-like range can't be made.

BTW, I always tought it was inconsistent to have mylist[1:5] but 1:5 alone don't work as a range.

Just toughts... :P

CarlJune 03, 2011

As far as I can tell, wouldn't an optional second arg to any() do the job for SOME:

def any(iterable, fn = ident):
for x in iterable:
if fn(x):
return True
return False

NickJune 04, 2011

Sounds awesome.

Laurence RoweJune 04, 2011

@Carl
You can already use: any(ident(x) for x in iterable)

Ralph CorderoyJune 04, 2011

Hi Guido. This "while some x in xs has pred:", will x on each iteration be the first element of xs where pred(x) and xs needs changing each iteration? If so, we've only x, and not its index to be able to efficiently alter xs. Also, "some" doesn't emphasise first and is more suggestive of a x where pred(x) being plucked at random. Cheers, Ralph.

P.S. Isn't an upside-down E an E by another name? :-)

Guido van RossumJune 04, 2011

@Ralph Corderoy: xs is typically a set or dict so no index is needed. And the suggestion of "some" (or "any" for that matter) that it's an arbitrary element is intentional, since loops over sets/dicts use an arbitrary order.

Which reminds me, Annie also asked for an arb() function that is like Ian's first(). It's been discussed on python-dev in the past, but we could never agree on what it should return if the collection is empty. None feels wrong because the collection could very well contain None as a legitimate value.

Roger PateJune 04, 2011

Why not have arb() behave like next(iterator[, default])? (Throw by default, but a value can be specified.)

Zaheer SoebhanJune 04, 2011

Besides all the stories about some important decision making, it's really cool to see how some of the best ideas are coming from interacting with other people.
So I totally agree with "focus on two things: meeting Python users and coding. That's the life!"
I find programming less fun without interacting with other people. I guess it's all about sharing ideas and coming up with great solutions together. It probably makes us feel like a better programmer as well as a better human being.

Thank you for sharing your ideas about programming and life in general, Guido! Good luck!

PS: Robert Dewar's wife should be given an award for her input on the greatest programming language of all! ;)

YousufJune 06, 2011

Oh! Pattern Matching. Guido, please give us pattern matching. It makes life so much more beautiful in Erlang. With Python being a thing of beauty, it must must must have Pattern Matching.

James O'BeirneJune 07, 2011

Quick question: when you said you "checked your personal archives" per the indentation question, were you going back and reading journals that you'd kept or simply browsing over old mail?

I ask because I'm curious as to how you record, organize, and store your thoughts.

Guido van RossumJune 07, 2011

I wish I'd kept journals! No, I just searched my GMail for "Robert Dewar" and found a message from 2005 where I had summarized the story as I had heard it then and sent it off to an old colleague (who didn't reply).

DmitreyJune 10, 2011

Some proposition to yield "SOME x IN xs HAS pred" and "EACH x IN xs HAS pred" on existing Python language programming constructs, mentioned in answers, definitely will suffer by lack of performance. I have awful lot of "all" and "any" in my Python mathematical code (OpenOpt, FuncDesigner), and both Pyhon and numpy versions of "all" and "any" suffer the mentioned drawback of evaluating all elements before yielding result, while very often checking only several elements would be enough.

The thing I would like you to pay attention - it would be very essential to get the new Python constructs numpy-style compatible, where "all" and "any" often are evaluated along a chosen dimension of n-dimensional array (e.g. all(X==10, axis=1) or mere all(X==10, 1)).

Incompatibility between numpy and Python "min" and "max" styles already is a terrible headache, especially for newbies. It's very hard to find the bug. Python max(-1,0) is 0, while numpy treats max(-1,0) as max of array with single element "-1" along axis 0 and thus result is -1.

I hope it will be taken into account to reduce numpy-python incompatibility.

Guido van RossumJune 10, 2011

@Dmitrey: we're getting pretty far off-topic here, but I suggest that NumPy was in the fault by redefining min() and max() that way. This could be avoided by requiring a keyword for axis, e.g. max(a, axis=0). Perhaps NumPy can evolve in this direction?

DmitreyJune 10, 2011

Well, at first, it will bring some backward incompatibility to previous numpy versions, at second, I'm not a NumPy developer and thus cannot affect decisions they make. Thus here I mere want to prevent appearing of another one incompatibility. Checking predicates along axis of n-dimensional array will certainly be used very often in scientific/engineering Python software to speedup calculations instead of using "all"/"any" along the involved axis.

Sometimes I wonder what NumPy developers do (they often doesn't care even about bug reports I inform), but of course it's absolutely up to them. For example, one of most essential task numpy certainly require - possibility for downloading it from easy_install or linux channels (e.g. apt-get) with ACML (for AMD processors, free of charge and without royalites) or MKL (Intel processors, free-for-educational) (currently it requires lots of makefile changes and other efforts). But since Enthought (who sponsor numpy and thus affect their decisions) included MKL into their EPD, they seems to be not interested of it any longer - they prefer to force people buy their EPD with MKL than having it easily available with numpy download for free.

DmitreyJune 10, 2011

By the way, Python and numpy "sum" are also incompatible. E.g. Python:

>>> sum([1,2],-1)
2
Now suppose a programmer replaced in a Python file header "from numpy import func1, func2, ..." by "from numpy import *", or imported "sum" from numpy to perform his own operations somewhere in his part of the file. Then somewhere in other part of the file we'll get

>>> from numpy import *
>>> sum([1,2],-1)
3

It can be obtained in most inappropriate time and be very hard to find the error (or even worse, it can be not revealed at all).

As for the predicates check, I think it's better to handle them in more lower level (e.g. C language), and possibly to have it parallel computed (at least, by default or by an optional argument). On the other hand, maybe it's better to wait with all those parallel computations changes till Python 3.3 where LLVM-based JIT will be implemented (I hope it will be already done that time, isn't it?)

Guido van RossumJune 10, 2011

Sorry, Unladen Swallow is dead: http://qinsb.blogspot.com/2011/03/unladen-swallow-retrospective.html

ComposingJune 11, 2011

Another +1 for pattern matching arguments.

What about something like

def f([h,*t]) :

for cutting up lists? It's close enough to [H|T] and consistent with *args notation.

Empty lists would still be [].

ShuphaninJune 24, 2011

Seize the day Guido, just make sure everything you do in your life is pure enjoyment! (And I bet it is!)

I'll just quote this lyric from a song. (http://www.pubnub.com/blog/internet-heroes-song-crockford-dahl-wall-torvalds-resig-souders)

"Guido Van Rossum, thanks for the Python. More like Van Awesome, maybe Van Right On."

sunny1304July 02, 2011

I would like to thank Guido van Rossum for Python. I am a student of computer science and started python 3 weeks ago and now my world is changed. Honestly I never learned anything with such joy like Python. In python I feel like everything is possible to solve. .......

Thanks Guido van Rossum ...you are my favorite person.

sunny1304July 02, 2011

I would like to thank Guido van Rossum for Python. I am a student of computer science and started python 3 weeks ago and now my world is changed. Honestly I never learned anything with such joy like Python. In python I feel like everything is possible to solve. .......

Thanks Guido van Rossum ...you are my favorite person.

Martin BlechNovember 29, 2012

I think pattern matching would be a great addition to Python. I did a small experiment implementing it as an external library (here: http://github.com/martinblech/pyfpm) but the new syntax requires a parser of its own. Language-level support would be much better. Still, pyfpm lets you do stuff like:

unpacker('head :: tail') << (1, 2, 3)
unpacker.head # 1
unpacker.tail # (2, 3)

or

@match_args('head :: tail')
def match(head, tail):
return head, tail

match([1, 2, 3]) # (1, (2, 3))

← All Posts · ← Newer · Older →