Wednesday, 16 December 2009

DMA race checking

I am in a good mood this week because I (together with my co-authors Daniel Kroening and Philipp Ruemmer) just had a paper accepted to TACAS on checking DMA races in multicore software.

The paper is online here.

The inspiration for this work came via my experience at Codeplay Software Ltd., writing system-level code for the Cell processor. To achieve high performance on a Cell SPE, you need to write low-level DMA operations to pre-fetch data. Getting these operations correct is hard - it's easy to end up with two DMA get operations simultaneously writing to the same local buffer, and corrupting it.

Our paper first shows how bounded model checking can be used to automatically find bugs in the way DMA operations are used. Then it shows how a technique called k-induction (introduced in this paper) can be used to actually prove the absence of bugs for certain classes of programs, which is a lot harder to do.

We've made a tool, Scratch, which accepts Cell SPE programs involving DMA operations, and checks them to find, or prove absence of, DMA-related bugs. We used Scratch to find a bug in one of the examples that comes with the IBM Cell SDK (see report on Cell BE forum).

IBM also have a race check library that can be useful in finding DMA-related bugs, and finds them very quickly if they occur on a given test input. Scratch has the advantage that it uses symbolic execution, so can find more subtle bugs by considering program execution for all possible input values.

Scratch needs some more work before it will be of much use to Cell BE developers; in particular it doesn't yet handle all the vector language extensions which are used by SPE programmers.

However, it will in due course, so if you are interested in using Scratch please contact me and I'll prioritise development of particular features that you require to get started!

The bad news is that the Cell appears to be somewhat on a downer. However, similar issues arise when performing asynchronous memory copies in GPU programming, which is definitely on the up (especially with the OpenCL language), so I think there's a lot of cool stuff still to do in this area!

Tuesday, 1 December 2009

Having trouble getting the Murphi verifier to compile?

I've spent quite a bit of time today getting the Murphi verification tool to compile. I thought I'd write down how I did it in case this is of use to others.

First, I found that the version here did not compile due to flex/yacc errors. I don't know how to solve these, but luckily the version here (which has the same version number!) didn't present those problems.

Instead, it gave me this error:
error.C:58:20: error: stream.h: No such file or directory
In file included from mu.h:441,
from error.C:56:

It seems that stream.h dates from the days when C++ header files still ended in ".h". This header has been deprecated for some time, but it seems that with gcc 4.3.3, which I have, it is not even provided.

It turns out that GCC 4.2.2 comes with a folder called "backward", which contains "stream.h" and related header files. Armed with these files (which I provide online), I was able to get Murphi to compile OK by changing one line in the makefile from:

CFLAGS = -O4 # options for g++

to:

CFLAGS = -O4 -I /home/scratch/gcc4.2.2_backward/backward # options for g++

(this is where I put the "backward" files).

Happy Murphi-ing!

Wednesday, 25 November 2009

Northern Concurrency Workshop

I am just back from the Northern Concurrency Workshop, which was organised by Joey Coleman at Newcastle University.
It was a really, really fun event. The talks were largely dominated by separation logic, which I am only just getting into, so I would say I understood less than 30% of the speakers' words. Nevertheless, most of the other 70% sounded tantalisingly interesting even though it was over my head! I gave a more practical talk on applying model checking to multicore programs with DMA.
What I liked about the workshop is that despite some pretty big names in concurrency theory being present, the atmosphere was very relaxed. There was a lot of friendly discussion; it really did have the feel of a workshop, rather than just a series of talks.
The standout presentation for me was "Verifying malloc", given by John Wickerson from the University of Cambridge. It was really well explained, and so obviously of practical importance.
It is expected that there will be another one in Cambridge in September 2010, so I look forward to that!

Wednesday, 14 October 2009


This weekend I had a flying visit to Glasgow for a session with my band, and to see a musical put on by a company which my wife Chris used to be a member of. Apologies to friends of mine in Glasgow who I didn't get a chance to see -- next time hopefully I'll be up for longer.

It was wonderful to be back in Glasgow. Maybe it's just because I know it, but the city seems so fluid and easy to get around in. I drove to Byres road for a coffee with my mother, and effortlessly slipped into a space where you can park free for 30 minutes. I drove from there to Berkeley studios via the Clydeside expressway which is smooth and fast, and then when I needed to be in the west end a bit longer I easily found somewhere free to park just a few streets away. Maybe I could do this in Oxford if I knew it better, but my experience of driving in Oxford so far has been awful!

I met up with the Latonic guys for drinks after seeing the musical, and we went to a really cool basement bar on Bath Street, can't remember the name, where our mate Mehdi was DJing. I left the bar around 2, and then had the most amazing experience -- I wanted some food and was fully prepared to be ripped off at some awful chippy like Mr Chips, when I saw that Greggs was open. Yes, Greggs the baker, at 2am! So instead of paying £3 for a portion of under-cooked chips served to me by a weary chip shop owner, I paid £1.60 for two cheese and onion pasties, which are exactly what one wants after several drinks, and was served by a very friendly member of Greggs staff. It was amazing!

It seems like some other people have been talking about this too. I suppose it might be bad news for the chip shops, and maybe bad news for the Greggs staff, but it's OK by me!

Tuesday, 29 September 2009

Bash hates enthusiasm

I often feel good when I get to "svn commit" something. I want to type something like:

svn commit -m "Example ready to go!"

But this enthusiasm is not welcomed by the bash shell, which says:

bash: !": event not found

So I have to type:

svn commit -m "Example ready to go\!"

and reaching for the '\' key just doesn't seem spontaneous. Especially on a Mac, where it's in the wrong bloody place! (Single quotes would also do the trick, since bash doesn't perform history expansion inside them, and set +H switches history expansion off altogether.)

So I end up being firm, and typing the ploddingly unenthusiastic command:

svn commit -m "Example ready to go."

Monday, 14 September 2009

A parallel algorithm which has been intriguing me

When I was in 3rd year at the University of Glasgow, I learned about the Floyd-Steinberg error diffusion algorithm, a method for performing digital half-toning on images.  This is an extremely simple algorithm which produces a wonderful effect. It converts a greyscale image, with many shades of grey, into a black and white image which essentially looks the same; in particular, one which looks the same when viewed from a distance.

More generally the technique can be applied to a colour image using a large palette of colours, producing a similar image in a reduced palette.

The technique is pretty well-known, but I'll quickly summarise how it works for greyscale images: pixels are visited left to right, top to bottom.  A pixel with greyscale value p is replaced with 0 if p <= 128, and with 255 otherwise (assuming 0 is black, 255 is white and 128 is mid-grey).  This has the effect of finding the “best match” for p in the reduced palette.  But if p goes from say 100 to 0 this has darkened the pixel significantly, and if p goes from 130 to 255 this has lightened the pixel significantly.  So, to compensate for this, the error associated with the palette decision for the pixel (old value minus new value) is propagated to surrounding pixels in the following way:

For example, a pixel with value 22 is set to 0, since 22 is less than 128, leaving an error of 22. Then 22x7/16=10 is added to the pixel immediately to the right, 22x1/16=1 to the pixel to the bottom right, 22x5/16=7 to the pixel below, and 22x3/16=4 to the pixel to the bottom left.

(The Wikipedia entry for Floyd-Steinberg dithering gives some pseudo-code showing how the algorithm can be implemented.)

Parallelizing error diffusion

I'm no expert on this technique, and I don't know where the authors got these coefficients from, or why they work so well, but when I started working at Codeplay Software Ltd. in 2007 I was looking for interesting algorithms to try parallelising using their Sieve technology. I remembered this algorithm as being particularly cool, and wondered if it was parallelizable. I started from scratch, trying to work out by myself whether it could be parallelized, but I got stuck, so I did a bit of searching and came across this paper by Panagiotis T. Metaxas, which presents a parallel version of the method.

The paper makes the following observation: the top-left pixel in the image must be processed first, and the pixel to its right second. But once these two pixels have been processed, either the third pixel in the top row or the first pixel in the second row could be processed next. By continuing to reason in this way, we can number each pixel with an integer representing the time step at which it could be processed on a fully parallel machine.

Observe that there is a skewed diagonal of pixels which can be processed in parallel, and the length of this diagonal increases, reaching its peak as it crosses the middle of the image, then decreases. For a large image there would clearly be a fair bit of opportunity for parallelization.

When I read this paper I didn't feel too stupid for having got stuck with this problem: his algorithm is pretty clever, and apparently even Donald Knuth believed the Floyd-Steinberg algorithm to be inherently serial - see a quote in the paper. (It occurs to me that perhaps Knuth has been somewhat mis-quoted - Knuth states that the algorithm is inherently serial because the pixel in the bottom right hand corner depends on processing of all other pixels. While the point about dependences is true, Metaxas's approach shows that there is opportunity for parallelism. But maybe Knuth simply meant that there is no way to process every pixel in parallel.)

To be continued.

Thursday, 3 September 2009

Trouble getting MacBook Pro to connect to wireless networks

My mate Colin Riddell from Codeplay just helped me fix a problem with my beautiful new MacBook, which I thought I'd write down the solution to before I forget, and in case other people have the same problem (looking online, a lot of people seem to have trouble getting their Macs to connect to wireless networks).

Before that, I'll tell you something that really pisses me off - the whole "A is better than B" issue with computers and operating systems.  The reason I got a Mac is that I have never used one before (well, actually my secondary school was kitted out with Macintosh Classics, but I haven't used one within living memory) and people often big them up.  When I was recording an album with my band Latonic, the studio we worked in used a Mac and swore by it (even though it crashed all the time, but in my recording experience, music software seems to push any machine to the limit).  So I've got a MacBook Pro, and it is in many ways really lovely.  But it does crash.  People often say things like "My Mac has never crashed", or even "Macs don't crash".  But in my experience the former is not true, and the latter obviously cannot be true - of course one can write a program for the Mac which crashes!

Anyway, so far I really love the Mac, apart from its AirPort program, which is used to manage wireless networks, and which in my opinion is really crap.  In contrast I really like the Windows wireless network manager.

The problem I had over the last couple of days is that, having returned from holiday, I could not connect to my BT HomeHub wireless network using my Mac.  My PC laptop connects fine, and nothing has changed with the network.  I am running Mac OS X 10.5.7.

My problem was that AirPort would say:
Status:  On
AirPort does not have an IP address and cannot connect to the Internet
Network name: BTHomeHub2-XXXX

At the top, Location is set to Automatic.

No matter what I tried, I could not get the computer to connect to the network (even though this had been working fine before I went on holiday).

After a lot of messing around, this is the solution my mate Colin came up with:

  • Under the "Location" drop down (in Network preferences) click "Edit Locations"
  • Click "+" to add a new location
  • Give this a name (I used NewLocation) and click "Done"
  • Click on "AirPort" on the left hand side, and make sure that your new location is selected under "Location"
  • You should see the same error message as usual about no IP address.
  • Click "Advanced", which will open a new dialogue box, and select the TCP/IP tab.
  • On the right, click "Renew DHCP Lease"
  • Click "OK"
You should now get connected fine to the wireless network.

You might expect to be able to solve this problem by selecting the "Automatic" location and then clicking "Advanced->TCP/IP->Renew DHCP Lease", but the weird thing is that the "Renew..." option isn't there for this location.  You can make it appear by clicking "Using DHCP" again, but that didn't fix the problem for me.

I'm relieved this works, but I don't think it is a great fix to the problem - I still don't know what has gone wrong with the "Automatic" location, so I'd be interested to hear from any Mac experts who know about this.

How to plug in a monitor

I have frustrating problems with computers all the time, and am totally reliant on internet sources to fix my problems.  So I thought I'd start writing down some of the problems I've had and how I fixed them, in case this can be useful to others.

I'll start things on a high by telling you about how I couldn't work out how to plug in a monitor.

I just had a refreshing 2 week holiday away from Oxford, then a week of conference/summer school work, and got back to the Oxford Computing Laboratory to find that my office had moved.  This was good news - my office mates and I are now in a larger, nicer office in a more sociable part of our building.  The minor downside was that I had to set up my computer (which was ready set up when I arrived at Oxford).

For someone with a degree and PhD in Computing Science, plus 2.5 years of experience with a very practical software company, this shouldn't be a problem, but being a bit of a dufus when it comes to real computers I couldn't get the monitor to show anything.

To cut a long story short, what I found out was that my machine has multiple DVI sockets on the back - there's a white DVI socket which looks like the regular one you should plug the monitor into, and that's what I was trying.  But then there are also 2 red DVI sockets (which look a bit different; they look like they take more pins) at the bottom of the machine's case, on the back, which are apparently there because the machine has an external graphics card.

So, if you are like me and have this silly problem then look out for these mysterious additional DVI sockets!!  Of course you'll need a monitor to be reading this....