PyMW: Summer-y of Code

Today is the “suggested pencils down” date for Summer of Code and I’m very happy to say that my proposal is complete and I feel the summer was a great success!

For those of you who don’t know, PyMW is a Master-Worker computing framework in Python. It wraps several other back ends, such as MPI, Condor, BOINC, or even just the cores of a single multi-core machine, and exposes them through a simple and elegant API.

The way it works is this: you create tasks and submit them to a Master, and the Master uses an interface/wrapper (BOINC, Condor, MPI, etc.) to process those tasks. The Master distributes the tasks to compute nodes (or processor cores), called Workers, and the results are sent back to the Master. This allows you to debug using the multi-core interface (a single machine) and then, by changing one switch on the command line, run the same code on thousands of machines using BOINC or any other supported interface.
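To make the idea concrete, here is a minimal sketch of a master that hands tasks to a swappable backend chosen by a command-line switch. This is not PyMW's actual API: the class names, the `--backend` flag, and the "serial" and "threads" backends are all made up for illustration, and worker threads stand in for real compute nodes.

```python
import argparse
from concurrent.futures import ThreadPoolExecutor


class SerialBackend:
    """Runs every task in the calling thread, which is handy for debugging."""

    def run(self, tasks):
        return [func(arg) for func, arg in tasks]


class ThreadBackend:
    """Hands tasks to a pool of worker threads (a stand-in for remote Workers)."""

    def __init__(self, workers=4):
        self.workers = workers

    def run(self, tasks):
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            futures = [pool.submit(func, arg) for func, arg in tasks]
            # Collect in submission order so results line up with tasks.
            return [future.result() for future in futures]


# Hypothetical backend names; they are not PyMW interface names.
BACKENDS = {"serial": SerialBackend, "threads": ThreadBackend}


class Master:
    """Collects tasks and delegates execution to the chosen backend."""

    def __init__(self, backend_name):
        self.backend = BACKENDS[backend_name]()
        self.tasks = []

    def submit(self, func, arg):
        self.tasks.append((func, arg))

    def results(self):
        return self.backend.run(self.tasks)


def square(x):
    return x * x


def run_job(backend_name):
    master = Master(backend_name)
    for n in range(5):
        master.submit(square, n)
    return master.results()


def main(argv):
    """Pick the backend from a command-line switch and run the same job."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--backend", choices=sorted(BACKENDS), default="serial")
    args = parser.parse_args(argv)
    return run_job(args.backend)
```

Calling `main(["--backend", "serial"])` and `main(["--backend", "threads"])` runs the same task code on both backends; only the switch changes.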

My proposal was to improve BOINC integration with PyMW by 1) eliminating the startup script and the need to compile C code; 2) adding pure-Python support for BOINC Assimilators and Validators; and 3) optionally, adding a new checkpointing mechanism for long-running jobs.

I completed (2) very quickly by virtue of some old Python code I found in BOINC, but (1) took a lot of sweat and tears: the existing BOINC interface was functional but in need of some serious work. It was running under Mac and Linux; however, the Windows client was crashing on every task. To get it working under Windows, I ended up creating a C++ launcher application to avoid using batch scripts.

In addition to my proposal, I also took on a few other tasks. When working with BOINC, the existing interface assumed that Python was already installed and on the system PATH environment variable. This is not a very safe assumption, so I also created a portable Python interpreter integrated with the BOINC API. PyMW will now install this interpreter as your BOINC application so that clients no longer need to have Python installed to run PyMW-BOINC compute jobs. Along the way I also created a new logo and graphic design for the PyMW web site and set up WordPress; check it out.

Sadly, goal (3) was not completed; however, it was originally proposed as an optional part of the project. I actively chose to pursue the BOINC-Python interpreter over (3) because I felt it was more important to the BOINC interface. In the end, checkpointing can always be done manually, but sending the Python interpreter is a considerably harder task.
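For anyone wondering what doing checkpointing manually might look like inside a task, here is a rough sketch using only the standard library. It is not part of PyMW or BOINC: the function names, the state layout, and the `stop_after` parameter (which simulates an interruption) are assumptions made for the example.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write state to a temp file, then rename it over the checkpoint file.

    The rename means an interrupted write never leaves a half-written checkpoint.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as handle:
        pickle.dump(state, handle)
    os.replace(tmp_path, path)


def load_checkpoint(path, default):
    """Return the saved state, or default if no checkpoint exists yet."""
    if not os.path.exists(path):
        return default
    with open(path, "rb") as handle:
        return pickle.load(handle)


def sum_of_squares(limit, path, stop_after=None):
    """Sum i*i for i in range(limit), resuming from a checkpoint if present.

    stop_after is a hypothetical knob that simulates the job being
    interrupted after that many iterations within this call.
    """
    state = load_checkpoint(path, {"next": 0, "total": 0})
    done = 0
    while state["next"] < limit:
        if stop_after is not None and done >= stop_after:
            return None  # interrupted; progress so far is on disk
        state["total"] += state["next"] ** 2
        state["next"] += 1
        done += 1
        save_checkpoint(path, state)
    return state["total"]
```

A real long-running job would likely save every few minutes rather than on every iteration, since writing the checkpoint has its own cost.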

I am still wrapping up a few odds and ends, but for the most part, I feel very happy with the state of the BOINC interface. If you are looking for a distributed or parallel processing framework for a future project, please consider PyMW.

If you have any questions or comments, I would love to hear them!

Posted on August 10, 2009 at 12:45 pm