There's a lot of talk these days about Go, the new programming language from the Plan 9 guys at Google. While I think it's great that a "systems" language is getting good concurrency support, I find it strange that they seem to ignore prior art. The concurrency model looks a lot like Erlang's, but with some differences in how channels and error handling are done.
So what about Python? Stackless Python has been shown to outperform the chaining example from slide 34 of the Go tech talk, using tasklets and simple channels. Let's see what we can do with the standard CPython implementation. Here's the chaining example, translated to Python:
    def main(n=100000):
        def runner(left, right):
            left.write(right.read() + 1)

        leftmost = left = Channel()
        for i in xrange(n):
            right = Channel()
            go(runner, left, right)
            left = right
        right.write(0)
        print leftmost.read()
Then we need an implementation of something that can act like goroutines and channels. For a long time I've personally been using greenlet from the PyPy project in various hobby projects. I'll provide a cut-down version of a Channel implementation which should do the job for this example.
Greenlet is a coroutine implementation, meaning threads/routines don't automatically yield to others; switching context must be done explicitly. While many may see this as a disadvantage, I see it as an advantage that forces you to program in a reactive way. In my experience this fits perfectly with channels and the Actor model.
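To make the explicit-switching idea concrete, here's a minimal sketch of how two greenlets hand control back and forth (assuming greenlet is installed; the names `child` and `main` are just for illustration):

```python
from greenlet import greenlet, getcurrent

def child(msg):
    print("child got:", msg)
    main.switch("reply")   # explicitly hand control back to the parent
    print("this line only runs if someone switches to child again")

main = getcurrent()
g = greenlet(child)
result = g.switch("hello")     # run child until it switches away
print("parent got:", result)   # -> parent got: reply
```

Nothing runs concurrently here; `switch` is the only way control moves, which is exactly the property the scheduler below relies on.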
    from greenlet import greenlet, getcurrent
    from functools import partial
    from collections import deque

    # This is just a job queue which we routinely pop to do more work. There's no
    # "switch thread after N time" mechanism, so each job needs to behave.
    queue = deque()

    def go(callable, *args, **vargs):
        "create a new coroutine for callable(*args, **vargs)"
        g = greenlet(partial(callable, *args, **vargs), scheduler) # scheduler must be parent
        queue.append(g.switch)

    def scheduler():
        while queue:
            queue.popleft()()
    scheduler = greenlet(scheduler)

    class Channel(object):
        "An asynchronous channel"
        def __init__(self):
            self.q = deque()
            self.waiting = []

        def write(self, msg):
            "write to the channel"
            self.q.append(msg)
            # notify everyone
            queue.extend(self.waiting)
            self.waiting = []

        def read(self):
            "read from the channel, blocking if it's empty"
            while not self.q:
                # block until we have data
                self.waiting.append(getcurrent().switch)
                scheduler.switch()
            return self.q.popleft()

    if __name__ == "__main__":
        main()
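To see the scheduler and channels in action on something smaller than the chaining benchmark, here's a self-contained ping-pong sketch (assuming greenlet is installed; it repeats the machinery above so it can run on its own, and the `player` coroutine is just an illustrative example):

```python
from greenlet import greenlet, getcurrent
from functools import partial
from collections import deque

queue = deque()

def go(callable, *args):
    # schedule a new coroutine, parented to the scheduler
    queue.append(greenlet(partial(callable, *args), scheduler).switch)

def scheduler():
    while queue:
        queue.popleft()()
scheduler = greenlet(scheduler)

class Channel(object):
    def __init__(self):
        self.q = deque()
        self.waiting = []
    def write(self, msg):
        self.q.append(msg)
        queue.extend(self.waiting)  # wake up blocked readers
        self.waiting = []
    def read(self):
        while not self.q:
            self.waiting.append(getcurrent().switch)
            scheduler.switch()
        return self.q.popleft()

# Two coroutines bounce an incrementing counter back and forth three times each.
ping, pong = Channel(), Channel()
log = []

def player(inbox, outbox, name):
    for _ in range(3):
        n = inbox.read()
        log.append((name, n))
        outbox.write(n + 1)

go(player, ping, pong, "ping")
go(player, pong, ping, "pong")
ping.write(0)       # kick things off
scheduler.switch()  # run until the job queue drains
print(log)
```

Each `read` on an empty channel parks the current greenlet on the channel's waiting list and yields to the scheduler; the matching `write` puts it back on the job queue, so the log comes out strictly alternating between the two players.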
Example run on my computer (13" MacBook Pro, 2.53 GHz):
chaining.go:

    % time sh -c "6g chaining.go && 6l chaining.6 && ./6.out"
    100000
    Time spent in user mode   (CPU seconds) : 0.760s
    Time spent in kernel mode (CPU seconds) : 0.534s
    Total time                              : 0:01.31s
    CPU utilisation (percentage)            : 98.4%
    Times the process was swapped           : 0
    Times of major page faults              : 0
    Times of minor page faults              : 0

chaining.py:

    % time python chaining.py
    100000
    Time spent in user mode   (CPU seconds) : 3.761s
    Time spent in kernel mode (CPU seconds) : 0.830s
    Total time                              : 0:04.61s
    CPU utilisation (percentage)            : 99.5%
    Times the process was swapped           : 0
    Times of major page faults              : 0
    Times of minor page faults              : 0
This is in no way a real benchmark (I even include the compilation time for Go), but it shows that the almost naively simple Python implementation is not completely off target. I kind of expected the difference to be much larger. Furthermore, I've added no special syntax and done no customization to the standard distribution; I've only added one C-extension module with a basic asynchronous Channel implementation.
CPython is still in the game!
Posted 2009-11-16, by Erik Gorset, Colombo, Sri Lanka.