PyBOINC Work Continues

I am still working on PyBOINC, the embedded Python interpreter with support for BOINC. The actual integration (exposing the BOINC API to Python) was easy; the cross-platform build is the difficult part.

The problem is that when running an application on BOINC, you have no guarantee of what libraries will be available, so you must either distribute the libraries you need or compile them statically. I chose the latter, which also caused lots of issues with the Python standard library. It’s mostly working now, with the exception of the sqlite module.

I moved the repo over to my Bitbucket account since Nicolás is busy with other projects. The latest code can be found here.

Posted on July 31, 2009 at 4:11 pm

Integrating Python & BOINC

I started working on an embedded Python interpreter for BOINC with Nicolás Alvarez. The interpreter will be the main executable for Python-based workunits and provides interop with the BOINC client API. It allows, for example, reporting percentage complete per workunit, which isn’t currently possible with PyMW alone.
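As a rough illustration of what progress reporting from a workunit script might look like, here is a minimal sketch. The `FakeBoinc` class and its `fraction_done` method are hypothetical stand-ins, loosely modeled on the `boinc_fraction_done` call in the BOINC C API; they are not PyBOINC's actual bindings.

```python
import random


class FakeBoinc:
    """Hypothetical stand-in for the module the interpreter would expose."""

    def __init__(self):
        self.reports = []

    def fraction_done(self, fraction):
        # The BOINC C API provides boinc_fraction_done(double); this stub only
        # records the values so the sketch runs without a BOINC client.
        self.reports.append(fraction)


def estimate_pi(samples, boinc, report_every=1000, seed=0):
    """Monte Carlo estimate of pi that reports progress as it goes."""
    rng = random.Random(seed)
    inside = 0
    for i in range(1, samples + 1):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
        if i % report_every == 0:
            # Tell the (stubbed) client how far along this workunit is.
            boinc.fraction_done(i / samples)
    return 4.0 * inside / samples
```

Calling `estimate_pi(10000, FakeBoinc())` runs the loop and records one progress report per thousand samples; in a real workunit the stub would be replaced by whatever module the embedded interpreter provides.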

The project will likely be incorporated into the BOINC trunk, but the current code is available on Bitbucket as PyBOINC if you are interested in seeing how it works.

The Python developers made embedding the interpreter incredibly easy from C/C++ and the BOINC API is readily available from C/C++ as well, so there really isn’t much code. However, packaging the Python standard library and compiling it so it runs on multiple platforms are still going to be challenging.

Posted on July 12, 2009 at 10:29 pm

BOINC: Bundling Files

I’ve added support to the BOINC interface for bundled data files; however, adding this new feature has exposed a new issue in the interface. I already knew that BOINC makes an odd assumption that files are immutable (any file ever seen by BOINC is expected to *never* change its contents), but when running a PyMW application, the executable (for example, “monte_pi”) is reused over and over despite any changes that may have occurred in the code.

This hasn’t been an issue up until now, mainly because I was running the PyMW example applications and not modifying them between executions. However, with the introduction of PyMW data bundles, this problem has become painfully obvious. Since data bundles are given a temporary file name and this file name is dynamically embedded into the body of the executable, the executable is now changing its contents on every run.

The fact that the file is changing and the file name remains the same means that BOINC keeps only one copy of the file (because of file name immutability/versioning). The end result is that when a work unit executes on a worker machine, it tries to open the first data bundle file name that was ever created because that first file was cached and never updated.

To fix this, I’ve added some code into the BOINC interface that deletes all work unit related files from the BOINC “download” directory on every execution. This has fixed the problem for now, but the interface should rename all files to a unique name before execution. This is one of my goals for the next iteration of the BOINC interface.
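One way the renaming mentioned above could work is to derive each file name from the file's contents, so that a changed file never reuses an old name. This is only a sketch under that assumption; `unique_name` is a hypothetical helper, not part of PyMW or BOINC.

```python
import hashlib
import os


def unique_name(path, digest_chars=12):
    """Return a file name that changes whenever the file's contents change."""
    with open(path, "rb") as handle:
        # Hash the contents; a short prefix of the digest is enough here.
        digest = hashlib.sha1(handle.read()).hexdigest()[:digest_chars]
    stem, ext = os.path.splitext(os.path.basename(path))
    return "%s_%s%s" % (stem, digest, ext)
```

With names like these, BOINC's immutable-file assumption holds: identical contents map to the same name, and any edit to the executable or data bundle produces a new name, so a stale cached copy is never picked up.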

Posted on July 4, 2009 at 10:19 am

BOINC: Failed WorkUnits

When work units fail in BOINC, it raises the question of how to handle the remaining work units still being processed. I’ve added code to minimally handle failures so that manual user intervention isn’t required; however, there is still a burden on the developer to understand this situation and decide how to recover from it.

This is really no different from handling exceptions in non-distributed code, but I know all too well how exceptions are normally handled (hint: they aren’t). If you are interested, you can read my full post on the PyMW blog.

Posted on July 2, 2009 at 10:34 am

PyMW Site Finished, for Now

I’ve finished converting the PyMW website over to WordPress and implemented the new design and logo. It still needs more work, but I’m going to switch back into core BOINC interface mode again for a while.

The interface is working well now, but BOINC work unit failures still require manual user intervention. I would like to automate as much failure recovery as possible, so I will be focusing on this for the next few days.

Posted on June 29, 2009 at 4:21 pm

PyMW Distributed Pacman Server

Pacman CTF
To create a real application for PyMW, I think I am going to create a distributed Pacman server.

Last semester I took an artificial intelligence class that used Pacman as a teaching tool. At the end of the semester, there was a tournament where each team could pit its Pacman AI client against the others in a game of Pacman-style capture the flag. We submitted our clients to a server and then waited 24 hours or so for the results to appear. If your client crashed, you had to wait another 24 hours to see your standings.

My idea is this:

  • Create a PyMW application that runs Pacman tournaments
  • Each job will be 3 matches between two clients
  • An animated GIF will be created for one of the 3 matches (one that agrees with the outcome)
  • The BOINC interface will be used so students can contribute compute time
  • The output of the PyMW application will be records in a MySQL database
  • Create a website for statistics
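The job layout in the list above could be sketched roughly as follows. The round-robin pairing and the function names (`make_jobs`, `job_outcome`, `match_for_gif`) are assumptions made for illustration; the plan only says that each job is 3 matches between two clients and that the GIF should come from a match that agrees with the outcome.

```python
from collections import Counter
from itertools import combinations


def make_jobs(clients, matches_per_job=3):
    """Pair every client with every other client once; one job per pairing."""
    return [
        {"clients": pair, "matches": matches_per_job}
        for pair in combinations(clients, 2)
    ]


def job_outcome(match_winners):
    """Return the client that won the most matches, or None on a tie."""
    counts = Counter(match_winners).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def match_for_gif(match_winners):
    """Index of the first match whose winner agrees with the job outcome."""
    winner = job_outcome(match_winners)
    if winner is None:
        return None
    return match_winners.index(winner)
```

Each dictionary returned by `make_jobs` would correspond to one PyMW task, and `match_for_gif` picks which of its matches to render.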

To test the tournament server, I am going to get the AI client code from last semester’s teams and then run it on the BOINC Alpha group. This should provide a solid test of the PyMW BOINC interface and my tournament server.

Posted on June 29, 2009 at 12:17 pm

PyMW: Logo and Layout

PyMW: Python + BOINC

Created a new logo and web layout/design for PyMW. This isn’t really part of the summer of code gig, but I was feeling inspired so I threw this together. It’s still a work in progress, but you can get a feel for the design.

I’m not much of a graphic artist when it comes to identity, but I tried to represent the idea of one framework joining many disparate models of computation as well as the general idea of master-worker computing.

Posted on June 10, 2009 at 12:08 am

PyMW: Week one

Today is the official end of my 7th day of working on the PyMW interface for BOINC for Google Summer of Code.

It took me three days to get my first PyMW app to run (monte_pi.py), which ran on 4 virtual nodes (4 tasks). More than four tasks caused problems; it turned out that there was a bug in the PyMW BOINC interface. Now that that’s fixed, I ran with 200 nodes yesterday and 800 today.

This morning, I tried running with 2 physical nodes (my laptop and my Ubuntu VM), which failed at the end of computation. Somehow the canonical results are not being recognized, which causes the BOINC interface to get lost in limbo and hang forever.

This week, I created a pure-Python assimilator for PyMW, which works pretty well, but is perhaps causing the error above.

I also rewrote a big swath of the BOINC interface to stop it from using a new thread for each task during task reclamation (getting data back from BOINC). Since it was using one thread per task, it was reaching the maximum number of scheduler threads. This in turn caused the execution thread to hang until some of the tasks completed. Now it reclaims tasks in a single thread and is able to queue all tasks in a single shot, greatly improving the throughput from PyMW -> BOINC.
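The single-thread approach can be sketched as follows. This is not the actual PyMW code; `reclaim_all` and the `poll_result` callable (which returns a task's result, or `None` if the task has not finished) are hypothetical names used only to show the shape of the idea.

```python
import queue
import threading
import time


def reclaim_all(task_ids, poll_result, interval=0.01):
    """Collect results for every task from one background thread.

    poll_result(task_id) is a hypothetical callable that returns the task's
    result, or None while the task is still running.
    """
    finished = queue.Queue()

    def worker():
        pending = set(task_ids)
        while pending:
            # One pass over all outstanding tasks instead of a thread per task.
            for task_id in list(pending):
                result = poll_result(task_id)
                if result is not None:
                    pending.discard(task_id)
                    finished.put((task_id, result))
            time.sleep(interval)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return finished, thread
```

The caller reads `(task_id, result)` pairs from the returned queue as they arrive, so the number of threads stays constant no matter how many tasks are submitted.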

Overall, it’s been really fun so far. The first few days were trying, since there was little documentation of how to get PyMW to play nice with BOINC. But seeing the first application run was great :)

Posted on June 5, 2009 at 12:01 pm

Google Summer of Code!

PyMW: Python + BOINC

My proposal has been accepted by the Python Software Foundation for Google Summer of Code 2009!

I will be working on the Python Master-Worker computing project (PyMW), a Python API that provides access to various distributed and parallel computing frameworks. In particular, I will be working on the Berkeley Open Infrastructure for Network Computing (BOINC) integration. BOINC is best known as the underlying system that enabled SETI@home, and I’m really excited to peek under the hood!

My goal is to make it easier for PyMW users to create BOINC applications by simplifying the setup process as well as adding new support for some important BOINC features. This will require some changes to PyMW as well as some changes to the BOINC server. If you are interested in seeing all the details, a public copy of my proposal is posted at the GSoC website.

Posted on April 23, 2009 at 12:33 am