CLR 2.0 memory model

There are several documents out there that describe the CLR memory model, most notably this article.

When describing the model, one can use acquire/release, barrier/fence, or happens-before terminology.  They all achieve the same goal, so I will simply choose one, acquire/release: an acquire operation means no later loads or stores may move before it, and a release operation means no earlier loads or stores may move after it.  I can explain it in such simple terms because the CLR is homogeneous in the kinds of operations it permits or disallows to cross such a barrier, e.g. there’s never a case where loads may cross such a chasm but stores may not.
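To make the acquire/release vocabulary concrete, here is a minimal C# sketch of one thread handing a value to another. It relies only on the C# `volatile` keyword, whose writes are release and whose reads are acquire. The `Publication` class and its members are hypothetical names invented for this illustration.

```csharp
using System;
using System.Threading;

public static class Publication
{
    private static int _payload;          // ordinary field
    private static volatile bool _ready;  // volatile: writes are release, reads are acquire

    public static void Publish(int value)
    {
        _payload = value;  // ordinary store
        _ready = true;     // release: the store to _payload may not move after this store
    }

    public static bool TryConsume(out int value)
    {
        if (_ready)            // acquire: the load of _payload may not move before this load
        {
            value = _payload;  // guaranteed to observe the value written before _ready was set
            return true;
        }
        value = 0;
        return false;
    }

    // Starts a writer thread and spins on the reader side until the value is visible.
    public static int Run()
    {
        _payload = 0;
        _ready = false;
        Thread writer = new Thread(() => Publish(42));
        writer.Start();
        int seen;
        while (!TryConsume(out seen))
        {
            Thread.Sleep(0);   // yield while waiting for the writer
        }
        writer.Join();
        return seen;
    }
}
```

If either the release on the write side or the acquire on the read side were missing, the reader could in principle see `_ready == true` and still read a stale `_payload`.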

Despite the great article referenced above, I find that it’s still not entirely straightforward.  It is important to code to a well-understood abstract model when writing lock-free code.  For reference, here are the rules as I have come to understand them stated as simply as I can:

  • Rule 1: Data dependence among loads and stores is never violated.
  • Rule 2: All stores have release semantics, i.e. no earlier load or store may move after one.
  • Rule 3: All volatile loads are acquire, i.e. no later load or store may move before one.
  • Rule 4: No loads or stores may ever cross a full barrier (e.g. Thread.MemoryBarrier, lock acquire, Interlocked.Exchange, Interlocked.CompareExchange, etc.).
  • Rule 5: Loads and stores to the heap may never be introduced.
  • Rule 6: Loads and stores may only be deleted when coalescing adjacent loads and stores from/to the same location.
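One practical consequence of Rule 5, as I read it, is that copying a shared field into a local before testing it is safe: the JIT may not introduce a second load of the field between the null check and the use. The sketch below shows that pattern. The `Notifier` class and its members are hypothetical, and the guarantee belongs to the implemented model described here, not necessarily to the ECMA specification.

```csharp
using System;

public sealed class Notifier
{
    private Action _handler;  // shared field; another thread may set it to null at any time

    public void Subscribe(Action handler) { _handler = handler; }
    public void Unsubscribe() { _handler = null; }

    // Returns true if a handler was present and invoked.
    public bool Raise()
    {
        Action local = _handler;  // exactly one load from the heap
        if (local == null)
        {
            return false;
        }
        // Under Rule 5 this call uses the local copy; the field is not re-read here,
        // so a concurrent Unsubscribe cannot turn this into a null dereference.
        local();
        return true;
    }
}
```

Without Rule 5, a compiler would be free to drop `local` and read `_handler` twice, which reopens the race the local copy was meant to close.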

Note that by this definition, non-volatile loads are not required to have any sort of barrier associated with them.  So loads may be freely reordered, and writes may move after them (though not before, due to Rule 2).  With this model, the only case where you’d truly need the strength of a full barrier provided by Rule 4 is to prevent reordering when a store is followed by a volatile load.  Without the barrier, the load may move ahead of the store.
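The store-followed-by-volatile-load case is the one that shows up in Dekker-style handshakes, where each thread sets its own flag and then checks the other's. A minimal sketch, using hypothetical names (`StoreLoad`, `EnterA`, `EnterB`) and assuming only the documented behavior of `Thread.MemoryBarrier` as a full barrier:

```csharp
using System;
using System.Threading;

public static class StoreLoad
{
    private static volatile int _flagA;
    private static volatile int _flagB;

    // Sets this side's flag, then reports what it saw in the other side's flag.
    public static int EnterA()
    {
        _flagA = 1;              // store (release)
        Thread.MemoryBarrier();  // full barrier: the load below may not move above the store
        return _flagB;           // volatile load (acquire)
    }

    public static int EnterB()
    {
        _flagB = 1;
        Thread.MemoryBarrier();
        return _flagA;
    }

    // Runs both sides concurrently; true if at least one side saw the other's flag.
    public static bool RunOnce()
    {
        _flagA = 0;
        _flagB = 0;
        int sawB = -1, sawA = -1;
        Thread t1 = new Thread(() => sawB = EnterA());
        Thread t2 = new Thread(() => sawA = EnterB());
        t1.Start();
        t2.Start();
        t1.Join();
        t2.Join();
        return sawA == 1 || sawB == 1;
    }
}
```

With the barriers in place, the two threads cannot both read 0. Removing them would permit each load to move ahead of its preceding store, so both threads could conclude that the other has not yet entered.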

It is unfortunate that we’ve never gone to the level of detail and thoroughness the Java memory model folks have gone to.  We have constructed our model over years of informal work and design-by-example, but something about the JMM approach is far more attractive.  Lastly, what I’ve described applies to the implemented memory model, and not to what was specified in ECMA.  So this is apt to change from one implementation to the next.  I have no idea what Mono implements, for example.

6 thoughts on “CLR 2.0 memory model”

  1. Dmitry Zaslavsky

    Hi Joe, I remember discussing this very topic with you and a few other Microsoft people at the last PDC. You were saying that this model (as explained in Vance’s article) is the final model and will become the ECMA model. There will not be multiple memory models.

    What happened?

  2. Kevin M. Owen

    Do these rules apply to loads of initonly (‘readonly’ in C#) fields, or just "normal" fields? What about thread-local fields (those marked with System.ThreadStaticAttribute)?

  3. Joe Duffy

    Hi Dmitry,

    I remember this conversation, and it’s still something I am hoping for. There have not yet been any revisions or amendments to the CLI since 2.0. At the first opportunity, I would like to have this in there, even if it is only a recommendation and not required.


  4. Joe Duffy

    Kevin, these rules apply to all fields. Java provides similar guarantees around final fields, but weakens them around ordinary fields (such that e.g. double-checked locking won’t work). We have applied them consistently to all fields.

    Loads and stores of thread-local fields could in theory be reordered more aggressively, in addition to stack local data. There the standard compiler optimization constraints would apply: e.g. cannot violate sequential semantics, etc. The memory model rules are meant to apply to shared reads and writes.


  5. Usman ur Rehman Ahmed

    Very cleanly put rules. I had been banging my head against the wall while studying everything from memory barriers to volatile, until I found someone who can talk about Java’s completeness in this context. There is one thing I still couldn’t understand, though.

    What difference between the memory models (with respect to compiler optimization) causes double-checked locking to remain broken under Java but not in C#? That is to say, in Java a singleton object (with no input arguments) created in a multithreaded context remains broken when only a lock is acquired and is fixed only by volatile, but the same is not true for C#, where volatile is not required. Both are running with compiler optimization on a single-processor machine.

  6. Timwi

    The link to "this article" (the CLR memory model article) is broken. Is that article still available anywhere? I would love to read more about the CLR memory model.

