Wednesday, 16 December 2009

DMA race checking

I am in a good mood this week because I (together with my co-authors Daniel Kroening and Philipp Ruemmer) just had a paper accepted to TACAS on checking DMA races in multicore software.

The paper is online here.

The inspiration for this work came via my experience at Codeplay Software Ltd., writing system-level code for the Cell processor. To achieve high performance on a Cell SPE, you needs to write low-level DMA operations to pre-fetch data. Getting these operations correct is hard - it's easy to end up with two DMA get operations simultaneously writing to the same local buffer, and corrupting it.

Our paper first shows how bounded model checking can be used to automatically find bugs with the way DMA operations are used. Then it shows how a technique called k-induction (introduced in this paper) can be used to actually prove absence of bugs for certain classes of programs, which is a lot harder to do.

We've made a tool, Scratch, which accepts Cell SPE programs involving DMA operations, and checks them to find, or prove absence of, DMA-related bugs. We used Scratch to find a bug in one of the examples that comes with the IBM Cell SDK (see report on Cell BE forum).

IBM also have a race check library that can be useful in finding DMA-related bugs, and finds them very quickly if they occur on a given test input. Scratch has the advantage that it uses symbolic execution, so can find more subtle bugs by considering program execution for all possible input values.

Scratch needs some more work before it will be of much use to Cell BE developers; in particular it doesn't yet handle all the vector language extensions which are used by SPE programmers.

However, it will in due course, so if you are interested in using Scratch please contact me and I'll prioritise development of particular features that you require to get started!

The bad news is that the Cell appears to be somewhat on a downer. However, similar issues arise when performing asynchronous memory copies in GPU programming, which is definitely on the up (especially with the OpenCL language), so I think there's a lot of cool stuff still to do in this area!

No comments:

Post a Comment