To gather meaningful performance metrics, it’s usually a good idea to run several iterations of the same test and average the numbers in some way, to eliminate noise from the results. This is true of sequential and fine-grained parallel performance analysis alike, but data locality can add enough noise to your parallel tests that you’ll want to do something extra about it. For example, if iteration #1 enjoys some form of temporal locality left over from iteration #0, then all but the first iteration receive an unfair advantage. This advantage isn’t usually present in the real world – most library code isn’t called over and over again in a tight loop – and could cause test results to appear rosier than what customers will actually experience. Therefore, we probably want to get rid of it.
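As a rough sketch of what that might look like in practice (the harness, the 64MB scratch buffer, and the cache-eviction trick are my own illustration, not a prescribed methodology):

```csharp
using System;
using System.Diagnostics;

static class Benchmark
{
    // Scratch buffer sized well beyond a typical last-level cache (the size
    // here is an arbitrary illustrative choice), used to evict the test's
    // working set between runs.
    private static readonly byte[] s_scratch = new byte[64 * 1024 * 1024];

    // Runs 'action' for 'iterations' timed runs, disturbing locality between
    // runs so later iterations don't inherit warm caches from earlier ones.
    public static double Measure(Action action, int iterations)
    {
        double totalMs = 0.0;
        for (int i = 0; i < iterations; i++)
        {
            EvictCaches();
            Stopwatch sw = Stopwatch.StartNew();
            action();
            sw.Stop();
            totalMs += sw.Elapsed.TotalMilliseconds;
        }
        return totalMs / iterations;   // average of the timed runs
    }

    private static void EvictCaches()
    {
        // Walking a large buffer pushes the previous iteration's data out of
        // the processor caches -- a crude approximation of a "cold" start.
        for (int i = 0; i < s_scratch.Length; i += 64)
            s_scratch[i]++;
    }
}
```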
I wrote an article that appears in the May 2007 issue of MSDN Magazine. It’s now online for your reading pleasure:
Late last summer, an interesting issue with traditional optimistic read-based software transactional memory (STM) systems surfaced. We termed this “privatization” and there has been a good deal of research on possible solutions since then. I won’t talk about solutions here, but I will give a quick overview of the problem and a pointer to recent work.
One of the motivations for doing a new reader/writer lock in Orcas (ReaderWriterLockSlim) was to do away with one particular scalability issue that customers commonly experienced with the old V1.1 reader/writer lock type (ReaderWriterLock). The basic issue stems from exactly how the lock decides when (or in this case, when not) to wake up waiting writers. Jeff Richter’s MSDN article from June of last year highlights this problem. This of course wasn’t the primary motivation, but it was another straw on the camel’s back.
The CLR commits the entire reserved stack for managed threads. This is 1MB per thread by default, though you can change the value with compiler settings, a PE file editor, or by changing the way you create threads. We’ve been having a fascinating internal discussion on the topic recently, and I’ve been surprised by how many people were unaware that the CLR engages in this practice. I figure there are bound to be plenty of customers in the real world who are also unaware.
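One of those knobs is the Thread constructor overload that takes a maximum stack size; here’s a small sketch (the 256KB figure and the Worker method are just illustrative choices):

```csharp
using System;
using System.Threading;

class Program
{
    static void Main()
    {
        // Ask for a 256KB stack instead of the default 1MB reservation.
        // Because the CLR commits the entire reserved stack for managed
        // threads, this reduces committed memory per thread as well.
        Thread t = new Thread(Worker, 256 * 1024);
        t.Start();
        t.Join();
    }

    static void Worker()
    {
        Console.WriteLine("Running with a smaller stack reservation.");
    }
}
```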
In 2.0 SP1, we changed the threadpool’s default maximum worker thread count from 25/CPU to 250/CPU.
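You can observe, or override, this maximum yourself. A quick sketch (restoring the old cap is shown purely as an example for code that relied on it as a throttle):

```csharp
using System;
using System.Threading;

class Program
{
    static void Main()
    {
        int workerMax, ioMax;
        ThreadPool.GetMaxThreads(out workerMax, out ioMax);
        // On 2.0 SP1 this should report roughly 250 * Environment.ProcessorCount
        // worker threads, versus 25 * ProcessorCount before the change.
        Console.WriteLine("Max worker threads: {0}, max I/O threads: {1}",
                          workerMax, ioMax);

        // Code that relied on the old, lower cap as a throttle can restore it
        // explicitly (this value is just the pre-SP1 default).
        ThreadPool.SetMaxThreads(25 * Environment.ProcessorCount, ioMax);
    }
}
```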
A reader asked for clarification on a past article of mine, regarding my claim that one particular variant of the double-checked locking pattern won’t work under the .NET 2.0 memory model. The confusion arose because my advice seems to contradict Vance’s MSDN article on the topic.
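For reference, this is the commonly cited form of the pattern in C#, using a volatile field; it is not necessarily the variant in question, just a reminder of the basic shape:

```csharp
public sealed class Singleton
{
    // 'volatile' keeps the publication of the reference from being reordered
    // with the writes that initialize the object, and keeps readers from
    // seeing a stale null.
    private static volatile Singleton s_instance;
    private static readonly object s_lock = new object();

    public static Singleton Instance
    {
        get
        {
            if (s_instance == null)            // first check, no lock taken
            {
                lock (s_lock)
                {
                    if (s_instance == null)    // second check, under the lock
                        s_instance = new Singleton();
                }
            }
            return s_instance;
        }
    }

    private Singleton() { }
}
```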
Somebody recently asked in a blog comment whether the new ReaderWriterLockSlim uses a full barrier (a.k.a. a two-way fence, CMPXCHG, etc.) on lock exit. It does, and I claimed that “it has to”. It turns out that my statement was actually too strong: doing so prevents a certain class of potentially surprising results, so it’s a matter of preference to the lock designer whether those results are surprising enough to justify the cost of a full barrier. Vance Morrison’s “low lock” article, for instance, shows a spin lock that doesn’t make this guarantee. And, FWIW, this is also left unspecified in the CLR 2.0 memory model. Java’s memory model permits non-barrier lock releases too, though I will also note that the JMM is substantially weaker in some areas than the CLR’s.
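To make the difference concrete, here is a toy spin lock sketch showing both flavors of exit; this is my own illustration and not how ReaderWriterLockSlim is actually implemented:

```csharp
using System.Threading;

// A toy spin lock, shown only to illustrate the two exit choices.
public sealed class ToySpinLock
{
    private int m_taken;   // 0 == free, 1 == held

    public void Enter()
    {
        // CMPXCHG: a full (two-way) barrier on the way in.
        while (Interlocked.CompareExchange(ref m_taken, 1, 0) != 0)
            Thread.SpinWait(1);
    }

    public void ExitWithFullBarrier()
    {
        // Interlocked.Exchange is itself a full fence, which rules out the
        // surprising reorderings at the cost of an interlocked operation.
        Interlocked.Exchange(ref m_taken, 0);
    }

    public void ExitWithoutFullBarrier()
    {
        // A volatile (release-only) write: earlier operations in the critical
        // section can't move below it, but it is not a two-way fence. This is
        // the style of the spin lock in Vance's "low lock" article.
        Volatile.Write(ref m_taken, 0);
    }
}
```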
In Orcas, we offer a new reader/writer lock: System.Threading.ReaderWriterLockSlim.
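Basic usage looks something like the following; the cache-style class and its fields are just a contrived example of a read-mostly data structure:

```csharp
using System.Collections.Generic;
using System.Threading;

// A contrived read-mostly cache, just to show the lock's basic API surface.
public sealed class Cache<TKey, TValue>
{
    private readonly Dictionary<TKey, TValue> m_map = new Dictionary<TKey, TValue>();
    private readonly ReaderWriterLockSlim m_lock = new ReaderWriterLockSlim();

    public bool TryGet(TKey key, out TValue value)
    {
        m_lock.EnterReadLock();          // many readers may hold this at once
        try
        {
            return m_map.TryGetValue(key, out value);
        }
        finally
        {
            m_lock.ExitReadLock();
        }
    }

    public void Set(TKey key, TValue value)
    {
        m_lock.EnterWriteLock();         // writers are exclusive
        try
        {
            m_map[key] = value;
        }
        finally
        {
            m_lock.ExitWriteLock();
        }
    }
}
```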
I previously mentioned the X86 JIT contains a “hack” to ensure that thread aborts can’t sneak in between a Monitor.Enter(o) and the subsequent try-block. This ensures that a lock won’t be leaked due to a thread abort occurring in the middle of a lock(o) { S1; } block. In the following example, that means an abort can’t be triggered at S0:
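A sketch of the expansion being described (the method wrapper is mine; S0 and S1 are the placeholders from the prose, and the original post’s example may differ in its exact shape):

```csharp
using System.Threading;

static class LockExpansion
{
    static void LockedRegion(object o)
    {
        Monitor.Enter(o);
        // S0: an abort delivered here (after Enter succeeded, but before the
        // try is entered) would skip the finally and leak the lock; the X86
        // JIT guarantees aborts can't be injected at this point.
        try
        {
            // S1: the body of the lock(o) { S1; } block goes here.
        }
        finally
        {
            Monitor.Exit(o);
        }
    }
}
```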