June 11, 2008

Python-SpiderMonkey Resurrected

Yesterday I found John J. Lee’s old Python-SpiderMonkey code from 2003, which creates a bridge between the Python language and Mozilla’s SpiderMonkey JavaScript engine, the same engine that powers Firefox. Lee mentions on his website that it’s not currently maintained. After downloading it and trying to compile it, I found that SpiderMonkey had changed a bit since the code was written, so I made some fixes and, after discussing things with him, set up a new Google Code project for it at http://code.google.com/p/python-spidermonkey.

I also revamped the documentation, wrote a basic tutorial, and removed some dependencies for building it.

Interoperating between Python and C

Python-SpiderMonkey still isn’t particularly easy to build, though: while you no longer need to get Pyrex or compile SpiderMonkey yourself, you do need to download the XULRunner SDK and have a C compiler on your system so the C Python extension can be created. Ideally, this shouldn’t have to be done: I’ve written some example code that uses ctypes to dynamically load the SpiderMonkey shared library file at runtime, which is proof that all that should be required by Python-SpiderMonkey are Python and either Firefox or the XULRunner SDK.
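To give a rough idea of what the ctypes approach looks like, here is a minimal sketch (not the example code mentioned above). It assumes the SpiderMonkey shared library can be found under the name "mozjs", which varies by platform and release, and it only calls the JSAPI function JS_GetImplementationVersion. The helper names load_first_available and spidermonkey_version are made up for illustration.

```python
import ctypes
import ctypes.util


def load_first_available(names):
    """Return a ctypes.CDLL for the first library name that resolves, else None."""
    for name in names:
        path = ctypes.util.find_library(name)
        if path is None:
            continue
        try:
            return ctypes.CDLL(path)
        except OSError:
            # The file was found but could not be loaded; try the next name.
            continue
    return None


def spidermonkey_version(lib):
    """Return SpiderMonkey's version string, or None if the symbol is absent."""
    func = getattr(lib, "JS_GetImplementationVersion", None)
    if func is None:
        return None
    # The C function takes no arguments and returns a const char *.
    func.argtypes = []
    func.restype = ctypes.c_char_p
    return func().decode("ascii", "replace")


if __name__ == "__main__":
    # "mozjs" is an assumed library name; adjust it for your platform.
    lib = load_first_available(["mozjs"])
    if lib is None:
        print("No SpiderMonkey shared library found on this system.")
    else:
        print(spidermonkey_version(lib))
```

Nothing here needs a C compiler: the library is located and loaded at runtime, and the function's C signature is declared from Python.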

On the other hand, Python-SpiderMonkey currently uses Pyrex—a dialect of Python that understands C data types—to do all its communication with SpiderMonkey. This was my first time using Pyrex, and I’m quite impressed: without even reading a tutorial, it was easy for me to understand and change what Lee’s Pyrex code was doing. Having written “from scratch” C Python extensions, SWIG extensions, and ctypes code, I think that my (admittedly brief) encounter with Pyrex makes it my favorite of all the options from a coding standpoint: it was very easy for me to write (and read!) code that interoperated between Python and C. On top of that, tracebacks into Pyrex-generated code were eminently useful, which made debugging very easy—and almost as much fun as writing pure Python code, if it weren’t for the recompilation step (which makes me wonder if there’s an importer that transparently re-compiles Pyrex code on-the-fly).

The only downside of Pyrex, though, is that it compiles to a C Python extension, which makes the result difficult for other Python users to install: as noted above, they need a C compiler to build it. It’d be really cool to see an optional backend for Pyrex that compiled its extension down to ctypes-wielding Python code instead, but I’m not sure how feasible that is. I suspect that the issue of calling C macros from Python code would be an annoying one to get around.

Why did I do this?

To begin with, I should make it clear that I didn’t actually do much—this is still almost exclusively John J. Lee’s code. I just added some documentation, made the build process a bit easier, and got it to work on modern builds of SpiderMonkey.

One reason I did this was just for personal education: I wanted to see what Pyrex was like, and I also wanted to get my hands dirty with the rather well-documented SpiderMonkey C API. In doing that, I actually found a potential bug in SpiderMonkey that I submitted a test case for.

The other main reason I did this may be a bit controversial.

Since becoming more acquainted with web programming, I’ve started to really like JavaScript. Aside from the fact that it’s on almost everyone’s computer and its applications in web-content-space are incredibly easy to deploy, it’s just a fun and powerful language to use, once you get around some of its wrinkles. It’s in this web-content-space that the language is used by an enormous community of developers, and that new tools and better ways of doing things are invented every day.

The thing is, a lot of Firefox itself is implemented in JavaScript, and that’s not really web-content-space; Mozilla calls it chrome-space, because it can reach a variety of software components (via a rather obtuse and verbose Microsoft-inspired system called XPCOM) that provide access to the local system. Think file I/O, SQLite storage, the sorts of things that the Python standard library does.

The only problem is that, perhaps because chrome-content JavaScript and the associated XPCOM componentry are only really used by Mozilla, Firefox extension developers, and a handful of smaller projects, the development environment isn’t nearly as mature or as much fun to program in as those of web-content JavaScript and Python. There’s no Firebug, for instance, and there’s nothing like nose; there are very few development and testing tools in general, and they’re not terribly robust. Perhaps most importantly for newcomers to chrome-content like myself, and in stark contrast to the Python standard library and nearly every major Python library in existence, many XPCOM components don’t even have any documentation. Just accessing the filesystem to open a file can involve a nontrivial amount of XPCOM sorcery, unless you happen to find the right page on MDC or talk to the right person on IRC. Development tools often fall into disrepair, quickly becoming nonfunctional because the trunk changed, the original developer stopped maintaining the code, and no one was interested in picking it up.

This isn’t to say that it’s impossible to do things in chrome-content. Almost nothing is impossible for a programmer; it’s rather that things just aren’t fun, especially when compared to well-documented platforms with diverse, vibrant communities like web-content JavaScript and Python.

So, I’ve been spending some of my time trying to connect Python to JavaScript, so that, at least when it comes to writing unit and system tests, chrome-content JS can leverage some of Python’s mature, well-documented, easy-to-use tools. I don’t have any really concrete ideas just yet, but I think that building a bridge between Python and JavaScript is the first step. There’s also another somewhat neglected package called PyXPCOM that I’d like to make a little easier to use and possibly integrate with Python-SpiderMonkey—but that’s a topic for another post.

Aside from the world of chrome-content JavaScript, there are a few other places where I think this kind of bridge could be useful:

  • As I mentioned in my overview of Weave, because the client's data is decrypted client-side, almost all synchronization logic needs to be done there. This logic needs to be rock-solid in all client implementations, or else the user's data can become corrupted. So one way of creating a Python client may be to simply reuse the sync logic from the Weave extension, which is written in JavaScript 1.7.
  • Web applications often seem to need the same logic on both the client and the server side: for example, client-side code in a web application may want to check whether a user's credit card number is valid, and warn them before they submit the form. The server usually has to run the same check again. One way to avoid writing that logic twice, in two languages, is to let Python-based servers run some of the JavaScript code that's being sent to the client.
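
To make the credit card example concrete, here is a sketch of what the server-side half of that duplicated logic might look like in Python. It assumes "valid" means passing the standard Luhn checksum, which is my assumption for illustration and not something any particular application is known to use. The function name luhn_valid is likewise made up. The same rules would normally be written a second time in the JavaScript sent to the browser.

```python
def luhn_valid(number):
    """Return True if the card number string passes the Luhn checksum."""
    # Allow the spaces and dashes people commonly type between digit groups.
    cleaned = number.replace(" ", "").replace("-", "")
    if not (cleaned.isascii() and cleaned.isdigit()) or len(cleaned) < 2:
        return False

    total = 0
    # Walk the digits from right to left, doubling every second one.
    for index, char in enumerate(reversed(cleaned)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                # Equivalent to summing the two digits of the doubled value.
                digit -= 9
        total += digit
    return total % 10 == 0


if __name__ == "__main__":
    # 4111 1111 1111 1111 is a widely used test number, not a real card.
    print(luhn_valid("4111 1111 1111 1111"))
    print(luhn_valid("4111 1111 1111 1112"))
```

With a Python-to-JavaScript bridge, the server could instead evaluate the browser's own validation function, so the rules would live in one place.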

I’ve talked to Ian Bicking and Kumar McMillan about this too, and they had some pretty interesting ideas of things that could be done with Python-SpiderMonkey. I’m interested in hearing any other thoughts people have on this.

© Atul Varma 2017