In part 2 of this series, I described a new work stealing queue data structure used for work item management. This structure allows us to push elements onto, and pop them from, a thread-local work queue without heavy-handed synchronization. Moreover, it distributes much of the scheduling responsibility across the threads (and hence processors). The result is that, for recursively queued work items, scalability is improved and pressure on the typical bottleneck in a thread pool (i.e., the global lock) is alleviated.
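For a feel of what such a structure looks like, here is a drastically simplified sketch (a fixed-capacity illustration, not the code from the post; names like m_foreignLock are assumed, and array growth is omitted). The owner pushes and pops at the tail with no lock on the fast path; thieves steal from the head under a lock, which the owner takes only when both sides might be racing over the last remaining element:

```csharp
using System.Threading;

public class WorkStealingQueue<T> where T : class
{
    private readonly T[] m_array = new T[4096];   // fixed capacity keeps the sketch short
    private const int MASK = 4096 - 1;
    private volatile int m_headIndex;             // thieves steal from the head
    private volatile int m_tailIndex;             // the owner pushes and pops at the tail
    private readonly object m_foreignLock = new object();

    // Owner only: the fast path is a plain store plus a volatile index update.
    public void LocalPush(T obj)
    {
        m_array[m_tailIndex & MASK] = obj;
        m_tailIndex++;                            // volatile write publishes the element
    }

    // Owner only: LIFO pop from the tail, lock-free except when nearly empty.
    public bool LocalPop(out T obj)
    {
        int tail = m_tailIndex - 1;
        // The interlocked exchange is a full fence: either we see a thief's
        // head update, or the thief sees our new tail; never neither.
        Interlocked.Exchange(ref m_tailIndex, tail);
        if (m_headIndex <= tail)
        {
            obj = m_array[tail & MASK];
            return true;
        }
        lock (m_foreignLock)                      // possible race over the last element
        {
            if (m_headIndex <= tail)
            {
                obj = m_array[tail & MASK];
                return true;
            }
            m_tailIndex = tail + 1;               // lost the race (or empty); restore tail
            obj = null;
            return false;
        }
    }

    // Any other thread: FIFO steal from the head, always under the lock.
    public bool TrySteal(out T obj)
    {
        lock (m_foreignLock)
        {
            int head = m_headIndex;
            Interlocked.Exchange(ref m_headIndex, head + 1);  // full fence; reserves the slot
            if (head < m_tailIndex)
            {
                obj = m_array[head & MASK];
                return true;
            }
            m_headIndex = head;                   // nothing to steal; undo the reservation
            obj = null;
            return false;
        }
    }
}
```

The interlocked operations are the crux: they are the full fences that let the owner and a thief run unsynchronized everywhere except in the contended final-element case, where the lock arbitrates.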
Most programs are tangled webs of data and control dependencies. For sequential programs, this doesn’t matter much beyond constraining the legal optimizations available to a compiler. But it gets worse. Imperative programs today are also full of side-effect dependencies. Unlike data and control dependence, which most compilers can identify and reason about (aliasing aside), side-effect dependencies are hidden, and their semantics are entirely ad hoc: scribbling on shared memory, writing to the disk, printing to the console, and so on.
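To make that concrete, here is a tiny hypothetical example (mine, not the post's). Nothing a compiler can see relates StepOne and StepTwo, yet reordering or parallelizing them changes the shared counter, the console output, and the log file:

```csharp
using System;
using System.IO;

class HiddenDependencies
{
    static int s_sharedCounter;                     // scribbled-on shared memory

    static void StepOne()
    {
        s_sharedCounter++;                          // side effect: shared write
        Console.WriteLine("one");                   // side effect: output order matters
    }

    static void StepTwo()
    {
        s_sharedCounter *= 2;                       // silently depends on StepOne running first
        File.AppendAllText("log.txt", "two\n");     // side effect: disk write
    }

    static void Main()
    {
        // Nothing in these signatures says so, but StepTwo depends on StepOne:
        // swap the calls and s_sharedCounter ends up 1 instead of 2.
        StepOne();
        StepTwo();
        Console.WriteLine(s_sharedCounter);
    }
}
```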
Miguel de Icaza recently blogged about the addition of Parallel Extensions to the Mono family.
The primary reason a traditional thread pool doesn’t scale is that there’s a single work queue protected by a global lock. For obvious reasons, this can easily become a bottleneck. Two things largely determine whether the global lock becomes the limiting factor for a particular workload’s throughput:
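One way to see the bottleneck itself (a hypothetical micro-benchmark, not from the post): have every thread funnel its enqueues and dequeues through one lock, and watch the wall-clock time grow as threads are added, especially when the work items are tiny relative to the locking cost:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

class GlobalLockContention
{
    static readonly Queue<int> s_queue = new Queue<int>();
    static readonly object s_lock = new object();

    static void Main()
    {
        foreach (int threadCount in new[] { 1, 2, 4, 8 })
        {
            Stopwatch sw = Stopwatch.StartNew();
            Thread[] workers = new Thread[threadCount];
            for (int i = 0; i < threadCount; i++)
            {
                workers[i] = new Thread(() =>
                {
                    for (int j = 0; j < 1000000; j++)
                    {
                        lock (s_lock) s_queue.Enqueue(j);   // every producer hits this lock...
                        lock (s_lock) s_queue.Dequeue();    // ...and every consumer too
                    }
                });
                workers[i].Start();
            }
            foreach (Thread t in workers) t.Join();
            Console.WriteLine("{0} threads: {1} ms", threadCount, sw.ElapsedMilliseconds);
        }
    }
}
```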
This is the first part in a series I am going to do on building a custom thread pool. Not that I’m advocating you do such a thing, but I figured it could be interesting to explore the intricacies involved. We’ll start off really simple:
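As a rough preview of what "really simple" might mean, consider a sketch along these lines (an assumption on my part, not necessarily where the series starts): a fixed set of worker threads draining a single queue protected by one lock.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public class SimpleThreadPool
{
    private readonly Queue<Action> m_queue = new Queue<Action>();
    private readonly object m_lock = new object();

    public SimpleThreadPool(int workerCount)
    {
        for (int i = 0; i < workerCount; i++)
        {
            Thread t = new Thread(WorkerLoop);
            t.IsBackground = true;     // don't keep the process alive
            t.Start();
        }
    }

    public void QueueUserWorkItem(Action work)
    {
        lock (m_lock)
        {
            m_queue.Enqueue(work);
            Monitor.Pulse(m_lock);     // wake one sleeping worker
        }
    }

    private void WorkerLoop()
    {
        while (true)
        {
            Action work;
            lock (m_lock)
            {
                while (m_queue.Count == 0)
                    Monitor.Wait(m_lock);
                work = m_queue.Dequeue();
            }
            work();                    // run the item outside the lock
        }
    }
}
```

Note that this is exactly the single-queue, global-lock design whose scaling limits the later parts of the series set out to fix.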
Here’s a slightly more formal approach to demonstrating that the CLR memory model (MM) is improperly implemented, at least for the particular example I showed earlier.
The adjacent release/acquire problem is well known. As an example, given the program:
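The canonical illustration has this shape (a sketch; not necessarily the exact program the post goes on to show). Each thread performs a release write immediately followed by an acquire read of a different location, and nothing in release/acquire semantics forbids the adjacent pair from reordering, so both threads can observe 0:

```csharp
using System;
using System.Threading;

class AdjacentReleaseAcquire
{
    // In the CLR, a volatile write has release semantics and a volatile
    // read has acquire semantics.
    static volatile int x;
    static volatile int y;

    static void Main()
    {
        int r1 = 0, r2 = 0;
        Thread t1 = new Thread(() => { x = 1; r1 = y; });  // release write, then acquire read
        Thread t2 = new Thread(() => { y = 1; r2 = x; });  // release write, then acquire read
        t1.Start(); t2.Start();
        t1.Join(); t2.Join();
        // Release/acquire alone permits the surprising outcome r1 == 0 && r2 == 0:
        // the release constrains only earlier accesses, the acquire only later ones,
        // so the adjacent pair in each thread may pass one another.
        Console.WriteLine("r1={0}, r2={1}", r1, r2);
    }
}
```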
I just submitted the final manuscript for Concurrent Programming on Windows to Addison-Wesley.
We had an interesting debate at a Parallel Extensions design meeting yesterday, where I tried to convince everybody that a full fence on SpinLock exit is not a requirement. We currently offer an Exit(bool) overload that accepts a flushReleaseWrite argument. This merely changes the lock release from an ordinary store with release semantics to one that also carries a full fence.
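Roughly, the two exit paths look like this (a sketch with assumed names like m_state; not the actual Parallel Extensions source):

```csharp
using System.Threading;

public class SpinLockSketch
{
    private volatile int m_state;   // 0 == free, 1 == held (field name assumed)

    public void Exit(bool flushReleaseWrite)
    {
        if (flushReleaseWrite)
        {
            // Full fence: the releasing store cannot pass a later load, so other
            // threads observe the release before this thread's next read.
            Interlocked.Exchange(ref m_state, 0);
        }
        else
        {
            // Plain release-semantics store: cheaper on the exit path. The debate
            // is whether anything stronger is ever actually required here.
            m_state = 0;
        }
    }
}
```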
We sat down last week with Charles from Channel9 to discuss the new CTP. Both parts got posted today: