C# for Systems Programming

Update:

What was meant to be an innocent blog post to ease into some open community dialogue has turned into, umm, quite a bit more.

As is hopefully clear from my bio, the language I describe below is a research effort, nothing more, nothing less. Think of me as an MSR guy publishing a paper; it’s just on my blog instead of appearing in PLDI proceedings. I’m simply not talented enough to get such papers accepted.

I do expect to write more in the months ahead, but all in the spirit of opening up collaboration with the community, not because of any “deeper meaning” or “clues” that something might be afoot. Too much speculation!

I love to see the enthusiasm, so please keep the technical dialogue coming. The other speculation could die silently and I’d be a happy man.

My team has been designing and implementing a set of “systems programming” extensions to C# over the past 4 years. At long last, I’ll begin sharing our experiences in a series of blog posts.

The first question is, “Why a new language?” I will readily admit that the world already has a plethora of them.

I usually explain it as follows. If you were to draw a spectrum of popular languages, with axes being “Safety & Productivity” and “Performance,” you might draw it something like this:

(Please take this with a grain of salt. I understand that Safety != Productivity (though they certainly go hand-in-hand, given how much time and energy is typically spent on safety bugs, lint tools, etc.), that there are many kinds of safety, and so on.)

Well, I claim there are really two broad quadrants dominating our language community today.

In the upper-left, you’ve got garbage collected languages that place a premium on developer productivity. Over the past few years, JavaScript performance has improved dramatically, thanks to Google leading the way and showing what is possible. Recently, folks have done the same with PHP. It’s clear that there’s a whole family of dynamically typed languages that are now giving languages like C# and Java a run for their money. The choice is now less about performance, and more about whether you want a static type system.

This does mean that languages like C# are increasingly suffering from the Law of the Excluded Middle. The middle’s a bad place to be.

In the lower-right, you’ve got pedal-to-the-metal performance. Let’s be honest, most programmers wouldn’t place C# and Java in the same quadrant, and I agree. I’ve seen many people run away from garbage collection back to C++, with a sour taste permeating their mouths. (To be fair, this is only partly due to garbage collection itself; it’s largely due to poor design patterns, frameworks, and a lost opportunity to do better in the language.) Java is closer than C# thanks to the excellent work in HotSpot-like VMs which employ code pitching and stack allocation. But most hard-core systems programmers still choose C++ over C# and Java because of the performance advantages. Despite C++11 inching closer to languages like C# and Java in the areas of productivity and safety, it’s an explicit non-goal to add guaranteed type-safety to C++. You encounter the unsafety far less these days, but I am a firm believer that, as with pregnancy, “you can’t be half-safe.” Its presence means you must always plan for the worst case, and use tools to recover safety after-the-fact, rather than having it in the type system.

Our top-level goal was to explore whether you really have to choose between these quadrants. In other words, is there a sweet spot somewhere in the top-right? After multiple years of work, including applying this to an enormous codebase, I believe the answer is “Yes!”

The result should be seen more as a set of extensions to C# — with minimal breaking changes — than a completely new language.

The next question is, “Why base it on C#?” Type-safety is a non-negotiable aspect of our desired language, and C# represents a pretty darn good “modern type-safe C++” canvas on which to begin painting. It is closer to what we want than, say, Java, particularly because of the presence of modern features like lambdas and delegates. There are other candidate languages in this space, too, these days, most notably D, Rust, and Go. But when we began, these languages had either not surfaced yet, or had not yet invested significantly in our intended areas of focus. And hey, my team works at Microsoft, where there is ample C# talent and community just an arm’s length away, particularly in our customer-base. I am eager to collaborate with experts in these other language communities, of course, and have already shared ideas with some key people. The good news is that our lineage stems from similar origins in C, C++, Haskell, and deep type-systems work in the areas of regions, linearity, and the like.

Finally, you might wonder, “Why not base it on C++?” As we’ve progressed, I do have to admit that I often wonder whether we should have started with C++, and worked backwards to carve out a “safe subset” of the language. We often find ourselves “tossing C# and C++ in a blender to see what comes out,” and I will admit at times C# has held us back. Particularly when you start thinking about RAII, deterministic destruction, references, etc. Generics versus templates is a blog post of subtleties in its own right. I do expect to take our learnings and explore this avenue at some point, largely for two reasons: (1) it will ease portability for a larger number of developers (there’s a lot more C++ on Earth than C#), and (2) I dream of standardizing the ideas, so that the OSS community also does not need to make the difficult “safe/productive vs. performant” decision. But for the initial project goals, I am happy to have begun with C#, not the least reason for which is the rich .NET frameworks that we could use as a blueprint (noting that they needed to change pretty heavily to satisfy our goals).

I’ve given a few glimpses into this work over the years (see here and here, for example). In the months to come, I will start sharing more details. My goal is to eventually open source this thing, but before we can do that we need to button up a few aspects of the language and, more importantly, move to the Roslyn code-base so the C# relationship is more elegant. Hopefully in 2014.

At a high level, I classify the language features into six primary categories:

1) Lifetime understanding. C++ has RAII, deterministic destruction, and efficient allocation of objects. C# and Java both coax developers into relying too heavily on the GC heap, and offer only “loose” support for deterministic destruction via IDisposable. Part of what my team does is regularly convert C# programs to this new language, and it’s not uncommon for us to encounter 30-50% of time spent in GC. For servers, this kills throughput; for clients, it degrades the experience by injecting latency into the interaction. We’ve stolen a page from C++ — in areas like rvalue references, move semantics, destruction, and references/borrowing — yet retained the necessary elements of safety, and merged them with ideas from functional languages. This allows us to aggressively stack allocate objects, deterministically destruct, and more.
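To make the stakes concrete, here is a minimal sketch in today’s C# (all type and method names are invented for illustration): a class forces a GC-heap allocation on every call, while a struct stays on the stack and dies with the frame. The language described here aims to make the second shape safe and pervasive, not an occasional optimization.

```csharp
using System;

// Illustrative only: the same value, once as a heap object, once as a stack value.
public class HeapPoint
{
    public double X, Y;
    public HeapPoint(double x, double y) { X = x; Y = y; }
}

public struct StackPoint
{
    public double X, Y;
    public StackPoint(double x, double y) { X = x; Y = y; }
}

public static class LifetimeDemo
{
    // Every call allocates a HeapPoint that only the GC can reclaim.
    public static double DistanceHeap(double x, double y)
    {
        var p = new HeapPoint(x, y);
        return Math.Sqrt(p.X * p.X + p.Y * p.Y);
    }

    // No GC involvement: the struct is stack allocated and deterministically gone.
    public static double DistanceStack(double x, double y)
    {
        var p = new StackPoint(x, y);
        return Math.Sqrt(p.X * p.X + p.Y * p.Y);
    }

    public static void Main()
    {
        Console.WriteLine(DistanceHeap(3, 4));  // 5
        Console.WriteLine(DistanceStack(3, 4)); // 5
    }
}
```

Multiply the first pattern across a large codebase and the 30-50% GC figure above stops being surprising.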

2) Side-effects understanding. This is the evolution of what we published in OOPSLA 2012, giving you elements of C++ const (but again with safety), along with first class immutability and isolation.
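A hedged illustration of why first-class immutability goes beyond what C# offers today: `readonly` freezes only the reference, not the object graph behind it, so “deep” mutation still compiles. (The types here are invented; the transitive immutability and isolation checks described above are features of the research language, not of C#.)

```csharp
using System;
using System.Collections.Generic;

public class Config
{
    // 'readonly' pins the reference; the list it points at stays mutable.
    public readonly List<string> Servers = new List<string> { "a", "b" };
}

public static class EffectsDemo
{
    public static int Mutate()
    {
        var c = new Config();
        // c.Servers = new List<string>();  // compile error: readonly reference
        c.Servers.Add("c");                 // but deep mutation still compiles
        return c.Servers.Count;
    }

    public static void Main()
    {
        Console.WriteLine(Mutate()); // 3
    }
}
```

A type system with transitive immutability would reject the `Add` call above when the `Config` is held through an immutable permission.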

3) Async programming at scale. The community has been ’round and ’round on this one, namely whether to use continuation-passing or lightweight blocking coroutines. This includes C# but also pretty much every other language on the planet. The key innovation here is a composable type-system that is agnostic to the execution model, and can map efficiently to either one. It would be arrogant to claim we’ve got the one right way to expose this stuff, but having experience with many other approaches, I love where we landed.
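The two camps can be sketched in today’s C# (method names invented for illustration): continuation-passing makes the rest of the computation an explicit callback, while `await` reads like lightweight blocking and lets the compiler build the continuation. The claim above is that one type system can map to either execution model.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncDemo
{
    // Continuation-passing style: the follow-on work is an explicit callback.
    public static Task<int> AddOneCps(Task<int> t)
    {
        return t.ContinueWith(done => done.Result + 1);
    }

    // Lightweight-coroutine style: the compiler synthesizes the continuation.
    public static async Task<int> AddOneAwait(Task<int> t)
    {
        return await t + 1;
    }

    public static void Main()
    {
        Console.WriteLine(AddOneCps(Task.FromResult(41)).Result);   // 42
        Console.WriteLine(AddOneAwait(Task.FromResult(41)).Result); // 42
    }
}
```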

4) Type-safe systems programming. It’s commonly claimed that with type-safety comes an inherent loss of performance. It is true that bounds checking is non-negotiable, and that we prefer overflow checking by default. It’s surprising what a good optimizing compiler can do here, versus JIT compiling. (And one only needs to casually audit some recent security bulletins to see why these features have merit.) Other areas include allowing you to do more without allocating. Like having lambda-based APIs that can be called with zero allocations (rather than the usual two: one for the delegate, one for the display class that holds the captured state). And being able to easily carve out sub-arrays and sub-strings without allocating.
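One way to approximate a zero-allocation lambda in today’s C# — a sketch only, not the mechanism the new language uses — is to pass a struct “functor” through a generic parameter, and to slice with offset/count instead of copying. None of the names below are real APIs.

```csharp
using System;

// A "functor" passed as a struct via a generic parameter: the call is a direct
// (typically inlined) invocation, with no delegate or closure allocation.
public interface IFunc<T, R> { R Invoke(T x); }
public struct Square : IFunc<int, int> { public int Invoke(int x) { return x * x; } }

public static class ZeroAllocDemo
{
    // Offset/count parameters stand in for a non-allocating sub-array "slice".
    public static int Sum<F>(int[] data, int offset, int count, F f)
        where F : IFunc<int, int>
    {
        int sum = 0;
        for (int i = offset; i < offset + count; i++) sum += f.Invoke(data[i]);
        return sum;
    }

    public static void Main()
    {
        var data = new[] { 1, 2, 3, 4, 5 };
        // Sums the squares of the middle three elements, with zero allocations.
        Console.WriteLine(Sum(data, 1, 3, new Square())); // 29
    }
}
```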

5) Modern error model. This is another one that the community disagrees about. We have picked what I believe to be the sweet spot: contracts everywhere (preconditions, postconditions, invariants, assertions, etc), fail-fast as the default policy, exceptions for the rare dynamic failure (parsing, I/O, etc), and typed exceptions only when you absolutely need rich exceptions. All integrated into the type system in a 1st class way, so that you get all the proper subtyping behavior necessary to make it safe and sound.
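The split between bugs and dynamic failures can be approximated in today’s C# (a hedged sketch; the research language bakes contracts into the type system rather than using library calls): a violated precondition is a bug, so the process fails fast, while a parse failure is an expected dynamic outcome and is handled without aborting.

```csharp
using System;

public static class ErrorModelDemo
{
    // A precondition: a violation is a bug, so fail fast rather than limp along.
    public static int Divide(int a, int b)
    {
        if (b == 0) Environment.FailFast("contract violated: b != 0");
        return a / b;
    }

    // A dynamic failure (parsing): recoverable, so no process abort.
    public static int ParseOrDefault(string s)
    {
        int n;
        return int.TryParse(s, out n) ? n : 0;
    }

    public static void Main()
    {
        Console.WriteLine(Divide(10, 2));       // 5
        Console.WriteLine(ParseOrDefault("x")); // 0
    }
}
```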

6) Modern frameworks. This is a catch-all bucket that covers things like async LINQ, improved enumerator support that competes with C++ iterators in performance and doesn’t demand double-interface dispatch to extract elements, etc. To be entirely honest, this is the area where we have the biggest list of “designed but not yet implemented” features, spanning things like void-as-a-1st-class-type, non-null types, traits, 1st class effect typing, and more. I expect us to have a handful in our mid-2014 checkpoint, but not very many.
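The double-interface-dispatch cost can be seen in today’s C# (illustrative names): enumerating a `List<int>` through `IEnumerable<int>` boxes its struct enumerator and pays an interface call per `MoveNext`/`Current`, whereas enumerating the concrete type binds directly to the struct enumerator.

```csharp
using System;
using System.Collections.Generic;

public static class EnumeratorDemo
{
    // Through the interface: the struct enumerator is boxed, and each
    // MoveNext/Current is an interface dispatch.
    public static int SumViaInterface(IEnumerable<int> xs)
    {
        int sum = 0;
        foreach (var x in xs) sum += x;
        return sum;
    }

    // Against the concrete List<int>: its struct enumerator is used directly,
    // with no allocation and calls the compiler can inline.
    public static int SumDirect(List<int> xs)
    {
        int sum = 0;
        foreach (var x in xs) sum += x;
        return sum;
    }

    public static void Main()
    {
        var list = new List<int> { 1, 2, 3 };
        Console.WriteLine(SumViaInterface(list)); // 6
        Console.WriteLine(SumDirect(list));       // 6
    }
}
```

Same answer, very different generated code — which is the gap the improved enumerator support aims to close by default.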

Assuming there’s interest, I am eager to hear what you think, get feedback on the overall idea (as well as the specifics), and also find out what aspects folks would like to hear more about. I am excited to share, however the reality is that I won’t have a ton of time to write in the months ahead; we still have an enormous amount of work to do (oh, we’re hiring ;-) ). But I’d sure love for y’all to help me prioritize what to share and in what order. Ultimately, I eagerly await the day when we can share real code. In the meantime, Happy Hacking!

162 thoughts on “C# for Systems Programming”

  1. Banjobeni

    It is the time of presents, no?

    Assuming you’ll succeed with what I think you’re up to, it will take a lot of wind out of the sails of discussions like “nah, I don’t use those managed languages, they’re not fast enough anyway”. Personally, what I’m looking forward to the most is getting closer to the metal without trading simplicity for speed. I’m using C# mostly because it takes so much of the pain away compared to more unsafe languages. Keep it that way when squeezing out all that performance, and then I’ll happily look at the result more closely once there’s more to be seen.

    Please keep those details coming!

    Reply
  2. C

    Thanks for sharing, Joe. Great to see that MS is working on a new high level systems programming language that aims for Nirvana (General Purpose, Safe, Fast, Productive).

    In terms of performance, specifically, it’s certainly the case that the price for _execution time_ performance is paid for in currencies of language complexity (both conceptual and practical) and – something that often gets left out of the Nirvana programming language discussion – compilation time. For C++, these are definitely prices you pay when constructing non-trivial C++ programs that employ a healthy serving of generics (templates). For a language like C#, which was designed for (and concurrently _with_) a specific target framework (.NET), platform (Windows) and programming model (type-safe, garbage collected, productivity-first, high level object orientation, JITed, etc…), I wonder how a “Systems C#” will impact both conceptual complexity and compile time.

    I’d imagine your new language is not JITed (that would be kind of crazy for a systems programming language…). This means the current crop of managed developers will be waiting much longer for their programs to compile than they are used to, and they will be required to target specific machine architectures at build time – both of these experiences, though not too big a deal depending upon code size and complexity, will be unnatural to them. They’ll also need to understand that having precise control over object lifetimes means the days of “invoke it and forget it” are gone.

    How are you thinking about limiting conceptual and compositional complexity while adding new (foreign to the target audience) features from C++, while ensuring compilation time does not get out of hand for real projects? All of these should be factored into both the productivity and performance aspects of any new high level systems programming language. C++ is not very good in this realm: it’s relatively hard to understand and then use C++ effectively (though it’s getting better with each new release – still, as Bjarne says, it’s a language for experts), and it can take a really long time for real C++ programs (non-trivial) to link and compile.

    Do you employ a linker? When will you share more details?

    Reply
  3. joeduffy Post author

    Hi C,

    These questions are spot on.

    Minimizing complexity has been one of our laser focuses from Day One. I cannot promise that the resulting language is as simple as C#, but I do honestly believe it’s much simpler than C++. A key to achieving this simplicity, I think, has been *guaranteed* safety. If it compiles, you’ve gotten it right. Removing all the usual escape hatches around lifetime & side-effects means you can trust the compiler.

    And arguably, knowing that a certain method can’t escape an object reference, or mutate it, can reduce complexity of programming in the environment in its own way.

    Regarding compile-time throughput, you are absolutely correct. We do not JIT, and the ahead-of-time compilation time has been a constant focus and challenge for the team. Because of the expectations of the C# half of our target audience, we constantly benchmark our throughput against the stock C# compiler. There is always more room for improvement, but we’re now within the same order of magnitude.

    Of course, the real answer to these questions is in the pudding. I hate speaking of this stuff as though it’s vaporware, given how real it is, and will do my best to share more real details with examples ASAP.

    Thanks again for the feedback.

    —joe

    Reply
    1. John Mitas

      I’m intrigued to hear more about your 100% AOT compilation and removal of the JIT.

      I would love to hear how this compares to NGEN/JIT in current .NET, as well as the future NGEN (TRITON/MDIL/ProjectN, etc.) and the future JIT (RyuJIT).

      Also, assuming that everything can be statically compiled ahead of time with no need for JITing, how have you overcome the problem with dynamic, etc.? Isn’t there still a need to do some things just-in-time, or have you found a way to remove/replace all this in the new language?

      Reply
      1. joeduffy Post author

        The simple answer is, if it cannot be compiled ahead-of-time, it’s out. That includes reflection, and hence any ‘dynamic’ implementation that depends on it. Code generation can often be used for situations where you’d normally reach for reflection in .NET. It takes a bit more forethought, but also performs considerably better.

        Reply
        1. Tim

          Reflection is a significant factor in my productivity in C# (to the degree that I even rely heavily on my own reflection lib in C++). What reflection features will survive the chopping block? Is System.Type still a thing?

          Reply
          1. joeduffy Post author

            I understand completely, and appreciate the feedback. The loss of reflection was a biggie for me at the start too.

            Over time, I began to appreciate the sheer cost of reflection, and enjoyed seeing more of it go. Perhaps I can try to devote a blog post to this topic, and put some numbers behind it. I’ve been astonished as we chipped away at this how much metadata needs to be kept around to support reflection. This really was necessary in order to compete with C/C++ in the area of code quality (both speed and size).

            System.Type lives on, but it’s really just a metadata-less pointer. Think of it as a vtable that supports equality testing (and even that is controversial).

            You can imagine doing “efficient reflection” by splitting the metadata from the code, and loading it all on-demand. But in virtually every situation I’ve seen, one of two alternative approaches has performed better: (1) code generation, or (2) “low-tech” reflection, where you just keep around tidbits of metadata (strings and such) for the specific types where you need it (what you’d do in C++).
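            A brief, hedged sketch of what that “low-tech” option (2) can look like in today’s C# — the types here are invented; a real system would code-generate the table rather than write it by hand:

```csharp
using System;
using System.Collections.Generic;

// "Low-tech reflection": keep only the metadata you need, for only the types
// that need it, instead of carrying full runtime metadata for everything.
public class Widget { public int Id; public string Name; }

public static class WidgetMeta
{
    // Hand-written (or code-generated) property table for Widget alone.
    public static readonly Dictionary<string, Func<Widget, object>> Getters =
        new Dictionary<string, Func<Widget, object>>
        {
            { "Id",   w => w.Id },
            { "Name", w => w.Name },
        };
}

public static class LowTechDemo
{
    public static void Main()
    {
        var w = new Widget { Id = 7, Name = "gear" };
        Console.WriteLine(WidgetMeta.Getters["Name"](w)); // gear
    }
}
```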

            I understand your point was about productivity. I’d be interested to hear whether those approaches would be too cumbersome. A good code generation framework (think “compile-time reflection”) could help here.

          2. Tim

            Thanks Joe,

            I’ll list the features for which I leverage reflection, in some meaningful priority order. Perhaps you can shed light on how those problems would be solved with your team’s model. Bear in mind not every type needs reflection, but that doesn’t lessen the cost of adding reflection as a language feature.

            1. Serialization (to file, to the wire etc.)
            2. Serialization
            3. Serialization
            4. User editing of runtime state (property views)
            5. Visualizing runtime state when watch windows and breakpoints won’t cut it.

            I’m a game developer; you can’t meaningfully debug animation by stopping the game and reading variables (better to print/graph it on the screen).

          3. Jesper

            I’ve used reflection a lot to get around issues with C#’s type system.

            For example, let’s say I have a Shape abstract base class, exposed as part of my public API:

            public abstract class Shape<TInfo, TAreaInfo, TSurfaceInfo>
                where TInfo : IShapeInfo<TAreaInfo, TSurfaceInfo>, ...
            {
                public abstract TInfo FurtherInformation { get; }
            }

            with a number of derived classes available to the program at runtime,

            public class Circle : Shape<Circle.CircleInfo, Circle.AreaInfo, Circle.SurfaceInfo>
            {
                public class CircleInfo : IShapeInfo<AreaInfo, SurfaceInfo> { ... }
                public class AreaInfo { .. }
                public class SurfaceInfo { .. }

                static readonly CircleInfo _info = new CircleInfo();
                public override CircleInfo FurtherInformation { get { return _info; } }
            }

            This is modeled on a real example. In the real example I am not actually handling shapes, and AreaInfo and SurfaceInfo are both separate concerns from each other that are nevertheless very useful to package up with the shape since they are canonical.

            When handling this, I use reflection to get a generic method “Handle<TShape, TInfo, TAreaInfo, TSurfaceInfo>(TShape specificShape) where TShape : Shape<TInfo, TAreaInfo, TSurfaceInfo> [and so on for the rest of the constraints]” to be called knowing the right shape, such that the body of this method may know the type of specific properties. Dispatching to that method by simply calling it would use the least derived type used at the call site, while reflection allows me to build a definition where the actual type is used and call that.

            If the proposed language’s type system is more complete and allows this to work smoother, or even just allows generic lambdas that can be sufficiently bound to capture the right sort of information, I wouldn’t miss reflection.

            But like Tim says, I would miss reflection immensely for debugging and inspection, which isn’t constrained to “code not running in production”. Many of Microsoft’s developer tools themselves would struggle to do their job if a reflection-like capability wasn’t there. It’d be very interesting to see the use of optional separated metadata, as you say, and I’d add: interrogated using mirrors.

          1. joeduffy Post author

            For string literal regular expressions — by far the common case in my experience — you can still pre-compile using code-generation. It’s just done at compile-time, rather than runtime, which is arguably where it belongs.

        2. Daniel Earwicker

          What is it about reflection that cannot be compiled ahead of time? Reflection is just type metadata gathered at compile time and saved in a format that can be accessed at runtime – why is that opposed to any of the goals of your new extensions? What’s the conflict that means it has to be dropped?

          NB. Reflection is great but has been stuck at “version 1” ever since it was created so it could use some improvements itself. But I find it hard to see “it cannot be compiled ahead of time, so it’s out” as an improvement! :)

          Reply
          1. joeduffy Post author

            The metadata needed for reflection carries a rather hefty cost, and yet is seldom used. (As I mentioned above, I will devote a post to quantifying this in the near future.) A top-level goal of ours is to compete with native code quality (C/C++), and this metadata was a constant source of bloat. Particularly in resource constrained environments, like phone and embedded, it would be really unfortunate to carry around megabytes of strings when you just agonized over a few bytes here and a few bytes there, especially when most code doesn’t need them.

            As I mentioned, we considered “pay for play” reflection, but it turns out that using code-generation almost always wins. Offering reflection might be more convenient, but it also results in a more complicated runtime, and there’s generally a better solution awaiting discovery anyway. At least in my experience.

    1. Dj Gilcrease

      My thoughts exactly. Guess Go is becoming popular enough that MS decided they needed to get a language in the market to compete

      Reply
    2. joeduffy Post author

      It’s certainly true that our language has some things in common with Go.

      Once we start getting into details, however, I trust you will see it’s a very different beast. For example, we prove programs race-free, and have generics & inheritance.

      Furthermore, the basic idea at the core of our language is something I first explored in a ThinkWeek paper circa 2007, when most of the world hadn’t heard of Go. By the time Go was released to the public in 2009, we had already begun down many of the paths, many starting with annotations (C# custom attributes) with a separate static verification step.

      I love new languages, and Go, Rust, and D are at the top of my list. But once we get beyond high-level goals, you will see some interesting differences emerge.

      Reply
      1. Nicolas Grilly

        How do you implement generics?

        Let’s consider a generic list. Do you compile one version of the code for each type of item (which gives the compiler an opportunity to optimize the code, but also leads to code bloat and cache misses)? Or do you compile one generic version of the code with some kind of boxing/unboxing (no code bloat but slow execution)? Or do you use an intermediate strategy like the CLR, which compiles a specialized version for each value type and a generic version for all reference types?

        Reply
        1. joeduffy Post author

          Nicolas, you are right there’s a spectrum here. What we’ve found is there’s no one right answer, kind of like /O1 vs. /O2.

          We can in fact share more aggressively than .NET generics, e.g. structs with certain similarities. There is no boxing, but there can be indirections in the code. That’s why you can disable it where absolutely necessary (hence my analogy with /O1 vs. /O2). Our team is accustomed to staring at assembly code and trying to make it perfect :-)

          The real difficulties show up when you start talking about DLLs.

          Reply
          1. maninalift

            I don’t quite follow this.

            You seem to be implying that this is just a matter of optimisation.

            C# generics are a totally different beast to C++ templates.

            By instantiating the template with a concrete type, C++ allows the programmer to test properties of the type, perform arbitrary logic, and construct the implementation based on this logic. D does the same with more convenient syntax.

            By contrast, C# generics are closer to runtime polymorphism.

            When it comes to DLLs or modules for template-like things, I know this has been a difficult question for C++ for some time. Loading deep dependency trees of headers is slow; something Go has got right is that compile times really matter.

            I guess D must have some solutions on this matter. I know the folks at LLVM have also been working on making C++ more efficient in this regard.

            Finally, I have a (very vague) notion that querying lots of structured documents with no fixed schema, without the up-front cost of loading them all into a structured representation, is exactly what wavelet trees are good at. It seems like it is probably not a good match, but I have been meaning to explore it.

          2. maninalift

            P.S.

            Philip Wadler describes C++ templates as having no semantics; I would rather describe them as having programmable semantics, as a result of the possibility of arbitrary compile-time logic noted above.

      2. David

        Love the fact that generics — or, more importantly, the idea of generic programming — is maintained. However, other than interface support, I am not so sure maintaining object class inheritance is so important. From my experience it seems like most people have given up on deep object hierarchies and moved to object composition, with very limited use of very shallow class hierarchies — which, when you look at it in its entirety, can be seen as a cheap way of composing classes together without the need to implement a pass-through version of all the public methods.

        It’s the one thing I love most about Go: composition and interfaces, I believe, are a much better solution than inheritance. I would love to hear more about your thoughts on this.

        Reply
        1. MBR

          Yes, maybe add direct support in the language for delegation, to more easily support has-a vs. is-a relations without adding boilerplate forwarding code. E.g. (making up syntax; these are only options, off the top of my (tired) head):

          class Foo : IBar, IBaz {
              IBar barProxy;               // could be init’d from c’tor or dependency injection
              IBaz getBazProxy() { … }

              override IBar.Method1() { … }  // takes precedence over proxies

              forward IBar to barProxy;       // all other IBar methods are handled here – no need to change if methods are added or removed
              forward IBaz to getBazProxy();  // call a fn to find the proxy
          }
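          For contrast, here is a brief sketch in today’s C# (types invented for illustration) of the hand-written forwarding boilerplate that such a hypothetical `forward` clause would eliminate:

```csharp
using System;

public interface IBar { int Method1(); int Method2(); }

public class BarImpl : IBar
{
    public int Method1() { return 1; }
    public int Method2() { return 2; }
}

// has-a composition: every non-overridden member must be forwarded by hand,
// and this code must change whenever IBar gains or loses a method.
public class Foo : IBar
{
    private readonly IBar _barProxy = new BarImpl();

    public int Method1() { return 42; }                  // handled locally
    public int Method2() { return _barProxy.Method2(); } // forwarded by hand
}

public static class ForwardingDemo
{
    public static void Main()
    {
        IBar foo = new Foo();
        Console.WriteLine(foo.Method1()); // 42
        Console.WriteLine(foo.Method2()); // 2
    }
}
```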

          Reply
      3. Rob G

        Are your generics more like C++ templates? I truly hope so.

        For a systems language I’d really like to see.

        1. Templates.
        2. Get rid of the Simula mistake of confusing classes with types. Any class should be free to implement a type (in this context a bit more like a C++ concept – i.e. a compile-time interface), regardless of relation through inheritance.
        3. Related to (2), put multiple inheritance back in. At the very least, support for mixins (see Dart).

        Generally, keep type safety, but allow any class to provide an implementation of a type.

        “A goal was that any C# compiles in this new language” – that sounds horribly like the mistake made by C++. By the sound of it, using the language will be sufficiently different anyway – the libraries alone should be significantly different for optimal use, so give up the goal. That way the language can make radical decisions that break compatibility with the C# tradition.

        Generally good news though – for a long time I’ve been thinking there is a need for a new systems-level language without the deficiencies (preprocessor, module system, complexity) of C++. I hope you’re all brave enough to depart from the tradition where it’s the right thing in your new application area.

        Reply
      4. Øyvind Teig

        Proving freedom from race conditions vs. not having mechanisms that open the door to races are two interesting aspects of concurrent language design. Reaching anything but the in-between seems difficult in real-life language design, I assume. Since this comment is in a Go-lang context, I would say that Go’s channel has rather interesting non-race properties (since a channel is a key to holding and moving shared data, and also to synchronization). But the compiler does not do parallel-usage checking (as the occam compiler did), which might then introduce races – and Go-lang does open up some usage that I assume could lead to races. I am eager to learn about the concurrency properties of Microsoft’s new language, and could certainly absorb more glimpses into it!

        (Somewhat aside: I have been doing some blogging recently about concurrency, and now mentioned this blog note as an update in [1]. I am especially trying to discuss channel based versus event based types of concurrency.)

        (By the way, the joeduffy at acm.org does not seem to work)

        [1] http://www.teigfam.net/oyvind/home/technology/075-eventual-concurrency/

        “Aclassifier” Øyvind Teig Trondheim, Norway http://www.teigfam.net/oyvind/home/

        Reply
  4. kjc

    “As we’ve progressed, I do have to admit that I often wonder whether we should have started with C++, and worked backwards to carve out a “safe subset” of the language. ”

    I wish you had taken this approach. Because: 1) The C# familiarity/expectations that C# devs bring could actually mislead them. 2) To appeal to devs outside the MS sphere. And MS definitely needs a better story re open source technologies. I don’t mind buying MS licences for OS/Office/VS/apps etc. But I do want core technologies to be open source – runtimes, compilers, etc.

    Focus on building the best language. Take what’s good from C++/C#, etc.

    Reply
  5. Ben

    This new language sounds really interesting. It would be worthwhile looking at what the guys at JuliaLang (http://julialang.org/) have been doing. I believe they have managed to hit the sweet spot with a lot of language features. Even though the language is dynamic, it approaches the speed of C and C++.

    One big feature they have is being able to interop with C and Fortran libraries seamlessly, as well as Python. So if this new language you are working on can do the same, as well as interop seamlessly with C# or other .NET libraries, I think it will be a winner. Developers won’t have to re-write their libraries completely, but can choose to only re-write the parts that are slow.

    Contracts everywhere will be a really nice feature.

    Reply
  6. Jeroen Frijters

    Hi Joe,

    As always it is great to hear about your project. I’ve been looking enviously at C++11 and you seem to check almost all the boxes on my C# wish list.

    I’m eagerly looking forward to more detailed posts.

    Regards, Jeroen

    Reply
  7. bjz

    Have you been following the development of Rust? It seems to tackle many of the things that your language does. I’m just wondering what you’re doing differently. If you haven’t looked at it, I’d highly recommend you do so you avoid some of their early mistakes. Here is the tutorial, which should give you an idea of what kind of features it supports. The community is very open and friendly and I’m sure they would share some of their findings with you, and would in exchange love to hear yours. Most development chatter happens over on irc.mozilla.org #rust.

    Reply
      1. bjz

        Ahh I see now. It was so off-hand that I missed it. I really would love to hear from the author what differentiates his language from Rust in more detail. For example, what different design decisions they made and why. It really would be exciting to learn more.

        Reply
  8. Jeffrey Drake

    Your diagram would probably be justifiable if you didn’t have ‘safety’ on it. Having Javascript higher than C++ on a ‘safety’ axis churns my stomach a little.

    I do love the concept you have described here. My minimal exposure to C# was difficult when I finally realized that operator overloads didn’t quite work with generics.

    As far as error handling goes, have you considered supporting an Option/Maybe and an Either type, or, more specifically, opt-in nullability? In C++ with a reference parameter I can guarantee there is something there, but with a pointer there is no such guarantee.

    Reply
      1. MuiBienCarlota

        Non-nullable types (i.e. class instance references – ClassName!) were mentioned for .NET 1.2 but were unfortunately removed from the .NET 2.0 scope. Only Nullable survived, and only as an external library with a small bit of syntactic sugar (StructName?).

        A discussion took place to decide whether this should be implemented as a language extension or as an external library. I don’t know the real reason, but I think the external library was chosen to simplify multi-language implementation (C#/VB.NET and C++/CLI). Non-nullable types cannot be safely implemented without compiler and language support.

        The same applies to code contracts. They were very promising as a language extension, but they only appear as attributes plus a complex-to-use, unpolished VS extension.

        Reply
    1. bjz

      As far as error handling goes, have you considered supporting an Option/Maybe and an Either type, or more specifically opt-in to nullability.

      The nice thing in Rust is that pointers stored in Options optimise to nullable pointers, so there is no overhead. This is not hardwired for Options – any ADT in a Some/None form can be optimised in this way.

      Reply
    2. eilra

      Actually, in C++, using a reference parameter (T&amp; foo) does not guarantee that the reference is non-null. It is trivial to pass a null pointer (accidentally) to a method or function that has a reference parameter.

      Reply
      1. Jeffrey Drake

        In the C++11 standard 8.3.2/5:

        A reference shall be initialized to refer to a valid object or function. [ Note: in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the “object” obtained by dereferencing a null pointer, which causes undefined behavior. As described in 9.6, a reference cannot be bound directly to a bit-field. — end note ]

        This means it is undefined behaviour, which is something to be avoided anyways.

        Reply
        1. eilra

          Correct. However, nothing prevents this from occurring at runtime, and in my experience, it is a common occurrence.

          It is extremely common for developers to call methods using a dereference of a pointer, to pass the object to a & parameter. If the pointer is NULL, then the reference is NULL. Yes, this is undefined behavior (according to the spec). However, the point is that C++ only deals with this as a specification issue. Nothing in C++ compilers can detect this at compile time and prevent it. However, there *are* languages that can detect and prevent certain classes of NULL dereference at *compile* time. This is what I (and others) want — to be able to exclude the possibility of the undesirable behavior, not merely to have a specification that says “If you do X, then the behavior is undefined.”
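
          A minimal C++ sketch of the pattern described above (names are illustrative): the call through a null pointer compiles identically to the valid one, so the reference “guarantee” exists only in the specification.

```cpp
#include <cassert>

struct Widget { int id; };

// A reference parameter looks like a guarantee that "something is there".
int GetId(const Widget& w) { return w.id; }

void DemoReferenceCall() {
    Widget w{42};
    Widget* p = &w;
    assert(GetId(*p) == 42);  // fine: p points at a real object

    Widget* np = nullptr;
    (void)np;
    // GetId(*np);            // compiles with no diagnostic whatsoever;
                              // binding the reference through a null
                              // pointer is undefined behavior that the
                              // compiler never catches for you.
}
```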

          Reply
    3. Mark

      | Having Javascript higher than C++ on a ‘safety’ axis churns my stomach a little.

      Yes, me too. I think maybe it’s a result of looking through “system-colored glasses”. If you define “safety” as “protecting against access violations and the like” (the sort of thing that system architects worry about), then sure – javascript is “safer” than C++. But if you define “safety” as “more likely to give the correct answer” (which I think is what most users are actually looking for), then C++ is definitely safer than javascript.

      Reply
  9. Pop Catalin

    Wow, this is unexpected and highly welcome. Looking forward to more posts with new details. I’m mainly a .NET developer, but with interests in game programming, numerical computation, and scientific and financial applications.

    I know it was stated that the new language is mostly targeted at systems programming, but what I’m interested in is how much support it will have for SIMD instruction sets (auto-vectorization, intrinsics?) and/or GPGPU programming, if any. Also, does the compiler output standard CIL, a newer version of CIL, or native code?

    Regards, Catalin

    Reply
  10. Yaakov

    I hope you get rid of some of the inheritance & interface complexity. C# development has turned into a religious cult where programs become more & more complex, without any real underlying complexity.

    Over-abstraction, where a single concept is abstracted &amp; re-abstracted until no one can figure out its true meaning, is common.

    Over-inheritance spreads the logic of a code segment across many files and types, introducing huge maintenance costs.

    It’s time to kill some of this complexity. A good language should shield programs from cult trends.

    Reply
    1. Frank Hileman

      Yaakov: While I share your desires, it is impossible for a language to shield developers from cult-like influences and fads in society. These are social problems; any language can be used by foolish educators as a platform for spreading bad ideas. We are better off focusing energy on ways to get programming communities to objectively improve productivity, with evidence backing all claims.

      Reply
      1. Andrew Kinnear

        I have the opposite opinion. I mean, yes, fanaticism is bad in any form – some people just have blinkers on and see nothing else – but shielding a developer from doing something stupid means enforcing a specific approach or pattern, and any single approach is limiting… by definition… I think we need more options… we should have language extensibility and deterministic allocation, but also garbage collection and type safety… the evidence points to a wider variety, not a narrower one.

        Reply
  11. Andrew

    As a game developer I think this is a great idea. I love C#, but it has some performance issues (particularly when it comes to using SIMD, NEON, etc.). And at times things could be done better when working with native C libs in C#. But I will stress this: I will ONLY USE something like this if it’s OPEN SOURCE (MS seriously needs to get with the times on this, even with C#). In fact I would NOT be using C# if there wasn’t an open source version. And if C# didn’t exist I wouldn’t be using Visual Studio (however, I would still use Visual Studio if C# were open source – this is something the heads at MS seem to miss). No new language is going to go anywhere (especially if it’s from MS) if it’s not open source from the get-go… and none of this “community promise” bull… it needs to be open like .NET Micro is.

    So to keep it simple, I’m very interested in a project like this only if it’s open source under GPL/MIT or something.

    Reply
  12. Alois Kraus

    It is good to see you blogging again. Your posts have always been a source of mind-changing light-bulb moments. Personally I would love to hear more about the new one-pass error handling model for exceptions you mentioned quite some years ago. I guess this is what you added to the C# exception handling model. I also would like to know whether the operations folks hate you for it, since memory dumps will be much harder to analyze when the last-chance error handler is called after the stack is already unwound. Does your C#xxxx have something that allocates only an empty managed hull object while the real memory lives on an unmanaged heap? That way we could interoperate much more easily with unmanaged buffers, which no C# API can currently consume without copying the data into a managed array, since most APIs do not work with plain pointers. I also wonder how fast you can get with C# compared to C++11. The current JIT compiler lacks SIMD support, which can be as important as using threads for getting better scalability.

    Reply
  13. wild

    Is the new language expected to be cross-platform? And what about the existing .NET ecosystem – will it be ported to the new language?

    Reply
  14. Jesper

    From what I’ve seen to date, not starting from C++ is the best thing you could have done. C++’s foundation is formulating yourself for the benefit of the machine, not for the benefit of sound logic. Having four kinds of casts is a way of being specific with the machine efficiently, not about communicating your algorithms well.

    Real interest in this sort of language (this quadrant overall) is exactly what has been lacking and it couldn’t have fallen to a better team to look into it. I’m hoping all the people who have been naming other languages aren’t seriously suggesting you abandon five years of hard, large scale work by a talented team. An opportunity to iterate back and forth in peace and quiet over several years and in the context of an entire stack of software, including operating systems and frameworks from what I hear, doesn’t come along that often. I’m willing (and eager) to hear what this effort has to say before presuming that it’s going to be a weak copy of Rust or Go.

    Thanks so much for sharing this with us, Joe.

    Reply
  15. pablo

    This is really awesome!!

    I’ve been developing in C# for 8 years already and I always wished I had what you propose. I love C#, but I think the GC helps you do bad design. You don’t call malloc so easily, because you know what is going on, but you do create tons of objects because… they are so cheap! And then, of course, you end up with frequent garbage collections eating up 10–30% of the CPU.

    So, I’m eager to see this new stuff! :-)

    Reply
    1. MBR

      I don’t really agree here about “knowing what’s going on”. That may be true in simple cases, when it’s your own code that’s fresh in your mind, but in C++ calling copy/move/etc. constructors can have arbitrary cost, multiple copies are often made (and stored in collections as values) unnecessarily, and it’s not necessarily obvious when looking at the code. That’s why I’ve seen many C++ programs converted to C# not only become much more readable, but more performant, at least “in the large” – tiny hot spots can certainly be optimized for memory allocation and cache effects purely in C# or by calling out to native code. Instead of yet another new language, another way to go would be to take algorithms already expressed in high-level semantic forms in existing languages (F# reflected definitions, C# expression trees, etc.), with added metadata and constraints as needed, and then generate your system code with whatever set of features you need from there – we can already target everything from GPUs to JavaScript this way, and I’m not sure why you couldn’t have other back-ends that target the same architecture in different ways.

      Reply
      1. MBR

        More succinctly, improve your meta-programming story, and then work from there, allowing a class of languages to evolve that suits the needs of various users. (I’d like to see this anyway, even if not in the scope of a system’s language.) Which brings up the questions of tooling (compiler-as-service), macros (or not), etc., in this new language…

        Reply
  16. Clay Borkholm

    Yeah, I hear Go and Rust things in there… maybe a touch of Eiffel, too (with the code contracts). I’ve always felt that C# made expressing solutions very clean, and I have been a fan from the beginning. I’m looking forward to seeing Go’s “select” construct mapped into C#. It’s so smart, it almost hurts. Also, the code contracts stuff is encouraging… hoping that’s from the bottom up. I learned more from having to prove invariants than from any software innovation in the last twenty years. I learned what a hacker I truly was.

    Reply
  17. John Mitas

    Is this “new language” the rumored M#?

    And you have some conflicting statements in your post…

    On one hand you say, “The result should be seen more of a set of extensions to C# — with minimal breaking changes — than a completely new language”; the important bit being “…minimal breaking changes…”.

    And on the other hand you say, “…rich .NET frameworks that we could use as a blueprint (noting that they needed to change pretty heavily to satisfy our goals)”; the important bit being “…needed to change pretty heavily…”.

    So it sounds like there are “minimal breaking changes” to your existing C#/.NET code base only if you also “change it heavily”…

    This sounds very, very interesting to me, though I am honestly skeptical at the moment; I’ll wait to judge after I see code.

    Also, I’m quite confused about what you are actually proposing: 1. C# extensions that allow C#/.NET to use your new ideas; 2. M#, a new language that encapsulates your new ideas; 3. syncing the C# extensions and M# so that porting between both is possible at a dev’s discretion.

    Are you proposing all 3 of the above? Is one of your goals the ability to move code to and from C# and M#?

    Also very interested in the tooling story and timeline of what you are proposing.

    Reply
    1. joeduffy Post author

      Sorry if my explanation was unclear on this.

      Basically, a goal was that any C# compiles in this new language, and then there are a bunch of new features that are opt-in.

      This entailed some sacrifices in the area of defaults — and is something we constantly revisit.

      What I meant by needing to change the frameworks is that, in order to really take advantage of the language, the frameworks need to be designed a bit differently. The performance problems we see in .NET are as much due to the frameworks and their allocation-heavy designs as to anything else (e.g., it’s a minor thing, but see String.Split; layers of APIs get built atop something with O(N) allocations). In principle, I suppose you could do without this step, but you’d be leaving a lot on the table.

      It’s still really unclear where we will land here once all is said and done. I like that we’ve left a few doors open for ourselves.

      Reply
      1. Jon Harrop

        > String.Split; layers of APIs get built atop something with O(N) allocations

        That problem is exacerbated by those allocated objects surviving generations in violation of the generational hypothesis. Then there’s the question of whether or not you even want to copy substrings out from an immutable string when you could just return a struct containing the start index and length of the substring instead. There’s really no excuse for heap allocating anything.

        Given that C# is already perfectly capable of expressing such a solution, does this motivate the development of another language?
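
         As a rough sketch of the substring-as-struct idea above, using C++’s std::string_view to play the role of a hypothetical (start, length) slice type: a split that hands back views into the original buffer makes one vector allocation instead of one string copy per token.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Zero-copy split: each token is a (pointer, length) view into the
// original string, so no per-token heap allocation takes place.
std::vector<std::string_view> Split(std::string_view s, char sep) {
    std::vector<std::string_view> out;
    std::size_t start = 0;
    for (std::size_t i = 0; i <= s.size(); ++i) {
        if (i == s.size() || s[i] == sep) {
            out.push_back(s.substr(start, i - start));
            start = i + 1;
        }
    }
    return out;
}
```

The caveat, as noted in the thread, is that every view keeps the whole original buffer alive; a GC (or the VM-level copying Jon describes) would be needed to compact it when that matters.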

        Reply
        1. joeduffy Post author

          Indeed, heap allocation is just too damned easy.

          Even treating Gen0 as a scratch pad is not free, as it regularly wipes your L1 cache clean.

          C# is indeed capable of expressing some solutions here, but the problem – as I see it – is that the most natural thing to write tends to be the wrong thing. The language lacks simple expression of scoped object lifetime, which could otherwise be used for aggressive stack allocation, and the very basic primitives (arrays and strings) coax developers into the very heap allocation we wish to avoid. It’s subtle, but if you give developers primitives that reinforce the basic principles (fast, less heap allocation, etc), you get more of what you want, statistically speaking.

          Reply
          1. Jon Harrop

            > The language lacks simple expression of scoped object lifetime, which could otherwise be used for aggressive stack allocation

            I see a lot of problems with that. Firstly, using scoped object lifetimes introduces floating garbage as objects are kept alive until the end of their scope. Secondly, scoped object lifetimes inhibit tail calls (because it injects dealloc code before function returns), force exception handling to unwind the stack rather than longjmping (because destructors must be called) and introduce the funarg problem with lambdas (because closures can return references to locals). Thirdly, scoped object lifetimes implies malloc+free allocation patterns rather than the bulk-free that gen0 offers when it is swept (which can be faster). Fourthly, deep thread stacks wreak havoc with latency as they are traversed atomically by the GC and the same might apply to your stacks.

            My own measurements indicate that scoped-object lifetime can be 10% faster than a generational garbage collector: http://flyingfrogblog.blogspot.co.uk/2011/01/boosts-sharedptr-up-to-10-slower-than.html

            Another data point is the results for escape analysis on the JVM. They believed it would recover the performance gap between the JVM and the CLR’s value types but it was a failure: stack allocation was not significantly faster because the real benefit of value types is in changing the heap topology. A lot of GC research has gone into trying to stack allocate objects as well and, in general, it is not fruitful.

            I believe much bigger performance gains are to be had by integrating the basic collections (substring, subarray, stack, heap, queue, list, set, map) into the VM and by integrating the VM into the OS. For example, imagine if all .NET applications shared the same VM that had a gen0 per core: no more context switches or gen0 cache misses between .NET apps! Your String.Split example is a good one: the mutator could store a substring as (string,start,length) and the GC could copy out those immutable strings to reduce memory footprint only when it makes sense to do so.

            And what will you do about pointers between your stack-allocated and heap-allocated worlds? I assume the stacks can point into the heap but the heap cannot have pointers back to the stacks?

          2. joeduffy Post author

            Jon, great reply. And thank you for the link — very interesting experiment & results.

            The key here from a language perspective is the *logical* scoping. How this gets implemented is of course important, but the key improvement here has been freeing up a bunch of implementation options that weren’t previously possible (and which don’t necessarily depend on a shared resource, i.e. the heap managed by the garbage collector). Also note, scope doesn’t necessarily mean “lexical scope”; it just means “well defined lifetime”, versus escaped garbage collected references, where lifetime is not known at the language level.

            You note most of the options. Stack, side-stack, region, thread-local, etc.

            Destructors in our system are orthogonal, although you are correct that some choices of storage require stack unwind. Thankfully we have not chosen one of those, at least in the current incarnation.

            We have a lot of experience with the stack problem, as we utilize lightweight linked stacks rather than, say, .NET’s heap-based continuation passing async. This is a blog post in its own right. There are some “easy” things you can do to improve scanning times, one of which is the simple recognition that in a system with lots of asynchrony, most of those stacks won’t have made any progress since the last collection.

            Regarding escape analysis, this is actually my overall point! Trying to recover an approximation of lifetime after-the-fact, through compiler analysis, is always going to be best effort and give mixed results. We of course do all the standard stuff here. But by augmenting it with first class lifetime scoping in the language, you get *way* better results. I cannot overemphasize how game-changing this has been.

      2. bjz

        Basically, a goal was that any C# compiles in this new language, and then there are a bunch of new features that are opt-in.

        This is concerning. Are you still going down this route?

        Having to deal with C backwards compatibility is a huge burden on C++, from both a complexity and a safety standpoint. And even then it is still imperfect, and has gotten worse now that the languages are diverging. C# is already a very complex language, and there is also the factor that the defaults in C# might not be what you want to encourage in your language. For example, the initial thought when designing Rust was that managed boxes were going to be the most-used method of allocation. Interestingly, once unique boxes were added, the designers found that users tended to avoid managed boxes by default, only resorting to them in more complex cases. Other things that could be concerning in a language interested in safety are nullable pointers and mutability as the default.

        On the other hand, the good thing about C# compatibility is that it would allow you to bootstrap adoption much easier. What about, instead of aiming for a superset of C#, aim to have a really nice FFI? One of the reasons Rust has been so successful so far is that its C FFI is stellar, allowing it to leverage existing libraries. Seeing as the languages are both being developed by Microsoft, it would probably be reasonably easy for you to provide some nice tooling for auto-wrapping C# libs.

        Reply
        1. David Piepgrass

          This is concerning. Are you still going down this route?

          Having to deal with C backwards-compatibility is a huge burden on C++…

          Two things. One, if you don’t need or want a backward-compatible language, there is already a variety of cool new languages to choose from with various advantages (and occasionally disadvantages) over C#: D, Ceylon, Rust, Go, Nemerle, F#, the list goes on and on. Some of these languages have been available for years now, yet they don’t approach C#’s popularity. That is why the backward-compatible approach has value. Making yet another unique language would be a hard sell.

          Two, C is not that well-designed so I would agree that building on it was a mistake. C++ is not super well-designed either. Although C# retains some flaws from C/C++ (e.g. some of the parsing difficulties) it is a much better starting point than C was, and it sounds like Joe plans to address some of the flaws that C++ did not (e.g. void as a first-class type, non-nullable reference types).

          One of the reasons Rust has been so successful so far is that its C FFI is stellar, allowing it to leverage existing libraries

          Really? How does it consume C/C++ header files?

          C compatibility really is not enough in the long run though, it’s just a starting point. We should not continue putting C up on a pedestal; it should not be the case that the only way you can write a library to be consumed by any programming language is to write it in C, or to limit oneself to C ABIs (pointers, no built-in string type, no built-in collection types… it’s disgusting!). What we need is a cross-platform interoperability standard, something better than COM, something with a solid set of core data types and collection types, something that can allow garbage collection and reflection, but that (unlike .NET) allows native code and native performance.

          Reply
  18. Someone

    Great news indeed! Random thoughts to help this language succeed: get it into a REPL and a browser up front, e.g. tryfsharp-style tutorials, fiddler/plunker, VS Online and other web IDEs. Make it as accessible as possible. Work with Mono to get it cross-platform up front, onto iOS and Android. Build a to-JavaScript compiler (a la Dart/asm.js) to promote it for client-side browser dev. Easy interop with .NET &amp; WinRT. Native XAML + this-new-language apps in WinRT and WPF vNext. Basically: get it everywhere a dev could choose it and reduce barriers to migrating. The full weight of Microsoft dev div behind this would be a good way to make amends with the disillusioned .NET dev community.

    Reply
  19. MuiBienCarlota

    I’m really happy to see MS invest in language design after a too-long period of stagnation in C# evolution (where are the non-nullable types of .NET 1.2, where are the Spec#/Sing# innovations, why is Roslyn evolving so slowly?). This sounds very promising. A long time ago, I read a promising article by Mary Kirkland in MSJ, December 1997, about COM+ (or what COM+ should have been). Just after that, Microsoft decided to use Colusa’s OmniVM as the new heart of COM+, and this started the managed era. We are perhaps at the end of this era. I hope you are not a new Mary Kirkland.

    Reply
  20. YoMan

    So is it going to be Windows-only? What backend are you planning to release with it: Phoenix? It would be interesting if it were LLVM-based. Looking forward to the real details… though I think you guys are too late to the scene.

    Reply
  21. Pingback: A glimpse into a new general purpose programming language under development at Microsoft | Lambda the Ultimate

  22. Ram

    Joe,

    I have 3 broad questions: #1. Emscripten (Mozilla) suggests C++ &amp; JavaScript can be interchanged at the LLVM level with no performance benefit either way – so why develop another language? #2. What about functional programming features? Why not start with F# instead of C#? Why not start with Haskell &amp; C--? Erik Meijer &amp; Simon Peyton Jones are readily available; they transparently address scalability under the covers for the developer. #3. How is it going to address future hardware evolution (GPUs (Nvidia), FPGAs, Xeon Phi (Intel), Kaveri (AMD))? Thank you.

    Reply
  23. Lex L

    What about LINQ Expression.Compile() in M#? Will it have the runtime ability to construct custom expressions and turn them into fast code?

    Or are we headed back into a slow future again?

    Reply
  24. Pingback: Microsoft News | Microsoft Begins Talking About Their Next Generation Programming Language, C# For Systems Programming

  25. NegLewis

    MS needs to build a VM (VMware-type) with sandboxing and filters (dll – filter – dll). I should be able to run any app from any OS in any OS.

    THIS would free us from creating those everlasting compatibility nightmares. THIS would allow us to start over any time with anything. THIS would allow you guys to start over even with a new OS, new programming languages…

    THIS^…

    From what I see, C# was chosen not for its users, devs, or community, but for Visual Studio. VS (extensions, features++) is THE real hero here. I would create a UML/block type of “programming” with real-time debuggers/trace… UI.

    Reply
  26. leonardo

    Hi Joe ,

    Good news. But I suspect this will be “extensions for C#” and not a “new D#” that compiles the .NET library using ARC instead of the crappy GC (my imagination).

    My question: is C# approaching Delphi again? Delphi now has generics, anonymous methods, etc., but it is still a native language with a safe string type, unlike C++, and it now has ARC for mobile development…

    I considered, 3/4 years ago at the beginning of the project, that the closest language to the upper-right quadrant was Delphi (2010 at the time). I work with programmers who cannot deliver in C++ because it is very dangerous, and cannot deliver in C# because the user experience is poor (JIT, GC, whatever). So we use Delphi, where a simple memory leak is still a danger, the syntax is “questionable”, and I don’t like the support from Embarcadero (the current owner).

    My dream is a language that results from “tossing C# and C++ in a blender to see what comes out”.

    Reply
    1. herman van der blom

      Yes, the ‘for Systems’ phrase is not clear; better to position it as an extension of C#, like C## or C#+ – there are enough possibilities. Lots of names can be found; let a poll decide. What’s clear is that C# will be the base, because C# is more productive and that’s what people are used to. The goal is more performance, so C#>> could work, or other symbols that indicate performance (C#p).

      Reply
  27. DQ

    Hello.

    Finally! Aggressive optimization of lambdas and enumerators to reduce the number of allocations – I have been waiting for that for years, and I feel like it is the one optimization we really need. But I am more fearful about your desire to go further, as I feel the “allocate and forget” model should stay the default behavior, and I am afraid of seeing my code polluted by memory management annotations everywhere. This is why I hope your compiler can understand objects’ lifetimes often enough on its own that it is not necessary to introduce new language constructs for the few other cases where determinism would help. Or at least that your new constructs are not pervasive and do not pollute the language significantly more than the struct/class nouns do. Like everyone, I love micro-optimizations, but we should not have the associated verbiage everywhere; it’s rarely worth the trouble.

    Another thing that pleases me a lot is the “first-class immutability”. This would be a really nice addition, as this is typically one of the few scenarios where I feel held back by the C# type system (think of an immutable type with N fields where you want a shallow copy any time you modify one field: you need one method per field and maybe a private constructor). The other two scenarios where the type system is in my way are generics, with their harsh limitations (no one will be surprised), and fluent APIs: I would like a “this” return type, because if I inherit from StringBuilder, I want the inherited fluent methods to return an OverriddenStringBuilder, and the associated tooltips to tell me that those methods return “this”.
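
    To make the pain point concrete, here is a hypothetical C++ rendering of that boilerplate (type and method names are made up for illustration): every field gets its own copy-producing “with” method, which is exactly the tedium first-class immutability support would remove.

```cpp
#include <cassert>

// An immutable type: every "modification" is a shallow copy, and every
// field needs its own hand-written With* method.
struct Point {
    const int X;
    const int Y;
    Point WithX(int x) const { return Point{x, Y}; }
    Point WithY(int y) const { return Point{X, y}; }
};
```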

    PS: another source of allocation removal I would like to see: nested objects. For example, a node and its children collection. We usually do not need a separate children-collection object; very often it could be inlined in the parent object (the Node here).

    PPS: I would like the non-null checks to be made when accessing a field rather than when invoking a method. That way I can invoke an instance method ImmutableStack.Push over a null ImmutableStack. Right now in C# this is not possible and we need either a static method (extension methods work best), or special, empty, root instances.

    My apologies if this post sounds like a Christmas’ wishes list.

    Reply
  28. Petr Vones

    Speaking of async programming, there was a Microsoft Research project called Axum (Maestro) a few years ago. Do you plan to use some ideas from this project in the “updated” C# language or, better said, its common libraries?

    Reply
  29. xoofx

    This is great news and I’m eager to get my hands on this new language! It’s kind of a grail to succeed at having both safety/productivity &amp; performance, where none of the new kids out there have succeeded (Rust, D, etc.). If you can follow TypeScript’s way of making it open source, cross-platform-able, possibly ISO-able, it could become a major player in the language space.

    Reply
  30. Serge

    +1 to a cross-platform implementation. Not sure about ahead-of-time compilation; I always thought a JIT can squeeze more out of the hardware it is running on. But my major pain points were: 1) Performance (it felt like MS didn’t really care about it for a long time, unlike Java). 2) Interop with C/C++ (weird pinned pointers, fixed, unsafe, or even another language, C++/CLI). 3) GC pauses/limited control over memory allocation. 4) Cross-platform (Mono always lags behind and is only partially implemented). I really would like to learn more about the new language.

  31. ukram

    I’d like to see RAII pattern support. I’d like to use destructor functions in classes.

    cat c;

    instead of

    cat c = new cat();

    The destructor should be called automatically when object goes out of scope

    How hard can this be? How many years does this take?
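For context, the closest today's C# comes to this is `IDisposable` plus a `using` block, which gives deterministic, exception-safe cleanup at end of scope but must be requested explicitly at each use site. A minimal sketch:

```csharp
using System;

class Cat : IDisposable
{
    // Plays the role of a destructor: runs deterministically when the
    // enclosing "using" scope exits, even if an exception is thrown.
    public void Dispose() { /* release resources here */ }
}

class Program
{
    static void Main()
    {
        using (var c = new Cat())
        {
            // use c
        } // c.Dispose() is called here, RAII-style
    }
}
```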

    1. joeduffy Post author

      Yep, that’s one of the aspects of #1 in my list. I’ll devote a post to this in the near future. Well, probably a few of them, since it’s taken us about three go-rounds to get to where we are. It’s harder than it appears when you don’t have a ‘delete’ keyword :-)

  32. Rafel

    If it has to be fully compatible with C#… is it then just C# version 5 or 6? In each version they add new constructs… in the end almost nobody knows the complete language and what to use. Then someone will have to write the book ‘C#, the good parts’.

    1. David Piepgrass

      I, for one, use pretty much every feature of C# except the old features (e.g. delegate(…) {…} instead of (…) => {…}) and ‘dynamic’, so I’ve never been unhappy with the new features. Now the BCL is another matter, I use less than 10% of that.

  33. Jon Harrop

    > This does mean that languages like C# are increasingly suffering from the Law of the Excluded Middle

    Interesting. For me, C++ was excluded from the middle between OCaml/F# and assembler. Indeed, I’ve spent much of the past 9 years rewriting C++ solutions in OCaml and F# and I have often seen substantial performance gains from doing so (see http://stackoverflow.com/questions/4257659/c-sharp-versus-c-performance/19505716#19505716). That is self-selection though because I’m the guy they call in when their C++ is grinding to a halt. :-)

    > It’s clear that there’s a whole family of dynamically typed languages that are now giving languages like C# and Java a run for their money

    JavaScript is common because it is the de facto standard in the browser. I don’t know of any other dynamically typed language that comes close to C# or Java in the job market. For example, Python has only 6% market share (http://www.itjobswatch.co.uk/jobs/uk/python.do).

    > In the lower-right, you’ve got pedal-to-the-metal performance

    Assembler is pedal-to-the-metal performance. C++ abstracts away things like register allocation and calling convention which can make it substantially slower than hand-written assembler (e.g. see http://flyingfrogblog.blogspot.co.uk/2012/04/x86-code-generation-quality.html). Indeed, one of the largest potential performance gains that C# does not yet offer (on .NET) is SIMD which is, of course, closely related to register allocation.

    C++ is also missing some features that can help with performance. Lack of multiple return values tends to afflict the ABI that compilers adhere to (they often resort to s-ret form which is substantially slower than necessary). No guaranteed tail calls makes it difficult to implement extensible state machines efficiently, with implementors resorting to trampolines that are typically ~10x slower than tail calls. Destructors make exception handling slow because the stack must be unwound (C++ was ~6x slower than OCaml last I looked). The inability to crawl the stack portably makes it impossible to write an efficient tracing garbage collector which can be a fast solution for some kinds of problems (e.g. theorem provers). C++ lacks pattern matching and decent pattern match compilers (e.g. in OCaml) automate a huge amount of optimization that is usually too laborious to do by hand (consider rewriting some of the bigger pattern matches from the Coq source code into C++, for example). Last I looked, the performance of STL collections was often much worse than their .NET counterparts.

    > I’ve seen many people run away from garbage collection back to C++

    I have also seen many programmers run backwards. That doesn’t make me want to follow them. :-)

    Frankly, I used C++ for many years and I can say that it is by far the worst programming language I have ever used. If I were to draw inspiration from C++ it would be to learn the lesson that incidental complexity can be extremely damaging.

    I think better education for programmers is really important. Too many people struggle with garbage collectors because they are regarded as a voodoo black box filled entirely with magic. For example, how many .NET developers know how much slower it is to write a one-word reference into the heap than a one-word integer and why? How many know the performance cost of objects surviving a generation and what the cause is? How many know the performance cost of having a thread with a deep stack and why? None of the answers are particularly complicated but they are important and few people (in my experience) have any idea what the answers are and some even flee garbage collection rather than learn about it.

    That said, I’m keen to see your results! :-)

  34. Robert Sundström

    Great! I look forward to hearing more about this language soon!

    I do have some ideas…

    I imagine introducing ^ (hat) for reference-counted objects in a safe context and * (star) for unsafe pointers in an unsafe context, the default being that objects (of classes, structs, and derived types) without any notation are treated as scope-bound values, as in C/C++.

    Consider the case of this hypothetical extension method (omitting the class definition):

    public unsafe static T^ ShallowCopy(this T^ source)
    {
        // Allocate space in memory.
        T* mem = (T*)malloc(sizeof(T));

        // Dereference source and copy values to the allocated memory.
        (*mem) = (*source);

        // Wrap pointer in a reference of type T^ and return it.
        return RefPtr.Wrap(mem);
    }

  35. Michael Rutherford

    This is exciting news. Any chance of getting rid of the GC and using something like ARC in the Objective-C compiler? Looking forward to playing in a new sandbox with a focus on performance and simplicity :)

    1. Alan

      ARC cannot cope with reference cycles. If you accidentally create a reference cycle in your ARC enabled app, you can/will still leak memory forever as these constructs cannot be freed. ARC is a half step above manually calling retain/release, it is not a solution to memory management like a GC is.

  36. Andrew Kinnear

    Excellent work; it’s about time we got something like this. I’m a C# developer and I often wish I had more control over lifecycle and allocations, and I understand the need for fully contracted exceptions. In my opinion, good developers are constrained too much in certain aspects and not enough in others. We need options: the ability to make the appropriate call from a set of solid, tried and tested options. It’s about capability, and building a managed OS will make this apparent.

  37. Ms440

    Great job, as always, Joe!

    Do you think it is possible for Microsoft to really start using the results of your research? What about a possible timeframe? Another 4 years?

  38. pcsaba

    Just one comment regarding reflection. I’m coming from C++ and embedded projects, not from the “managed” world. It’s very common to emulate reflection in C++ via code generation, though this pattern is not understood by many; a good example is Google protocol buffers, which is basically code generation. The devil is always in the details, but I’d heavily favor code-generation-based, optional reflection over a mandatory language feature, which would make the language really bloated. It’s a good question whether it should be part of the language or can just be a library feature.

  39. James Mitchell

    Microsoft does such cool stuff — I wish they had a business case to do it for other platforms/OS’s as well so I wasn’t stuck with Java. :(

    Good luck getting it right — I’m thrilled to see more folks attacking this. Go is like Haskell — our way or the highway, and D is just so ad-hoc…

  40. Jo Blow

    Take a look at some of the features of D. Its templates, (string) mixins, and aliasing set it apart from the rest. It has some very good features that would take C# to a new high.

  41. David Piepgrass

    Wow, Joe, I envy your popularity. I, too, have been designing a backward-compatible successor to C# called Enhanced C# (part of the Loyc project at http://loyc.net/), but so far no one has commented on it (on the other hand, I haven’t publicized my ideas much either). I had a very long laundry list of things I want to add or change in C# and .NET…

    https://sourceforge.net/p/loyc/code/HEAD/tree/EC%23.cs

    I still want all those features, but after listing out all the things I would add to C#, I noticed that with all those features there was still no solution for this popular feature request: “Provide a way for INotifyPropertyChanged to be implemented for you automatically on classes” (436 votes on MS connect). That’s when I turned to LISP for inspiration: LISP-style “macros” can easily fulfill feature requests like these, as well as several of the features in the laundry list of things I wanted to add to C#. When I realized that, I redesigned Enhanced C# to have LISP-style macros as the cornerstone:

    https://sourceforge.net/p/loyc/code/388/tree/EC%23%20for%20language%20pundits.txt

    In fact, in this list of known possible features of C# 6:

    http://damieng.com/blog/2013/12/09/probable-c-6-0-features-illustrated

    My lexical macro system could support features 1, 2, 4, 7, and 9, albeit not necessarily with the same syntax. That is, these features would not have to be features of the language, but merely library features. Macros are a great way to empower the user community; no longer does MS have to implement every feature itself, nor is it the final judge and jury of which features deserve to exist. Macros would also help DQ have his desired “first-class immutability” feature, as macros would make it easy to write “an immutable type with N fields [...where...] you need one method per field and maybe a private constructor.”

    Currently, the best macro system I know of is in Nemerle, btw.

    There’s also a bunch of things I would like to see added to .NET itself, some of which a native-code compiler would be able to do: efficient (2-machine-word) slices, return-type covariance, thread-local variable inheritance, fixed-size arrays, an array at the end of a heap block with size chosen at allocation time (traditional arrays can be considered a special case of this), efficient binary comparison of value types, and SIMD vector types. Not to mention generics that support numeric processing. Details:

    http://loyc-etc.blogspot.ca/2010/05/new-features-net-framework-should-have.html

    I think it’s very important to use good ideas from other languages. For example, I’ve been studying Ceylon recently and I think its type system (specifically its “union” and “intersection” types) is fantastic. If you could somehow integrate this… well it’s probably difficult to use that type system in a language that is backward-compatible with C#, but it’s certainly worthwhile to look for opportunities to inject outside ideas. MS is traditionally known for an “NIH” attitude… which is bad. Stand on the shoulders of giants, there’s no shame in copying the best ideas!

    My current status is that I’m halfway done writing the parser for EC#. This parser will become the primary parser of my existing LL(k) parser generator for C#, LLLPG:

    http://www.codeproject.com/Articles/664785/A-New-Parser-Generator-for-Csharp

    Let me know if you want to join forces, Joe, or if MS offers any research funding for guys like me.

    P.S. I second what Dejan Lekic said, there are already languages that fit the area marked with “X”, such as D and lesser-known impure functional languages.

        1. David Piepgrass

          I’ve been meaning to. But the command-line interface of git also makes ME want to cry and I haven’t figured out how to migrate to git with commit history intact. I’m so focused on writing code, I hate to dedicate a large amount of time to learning a damn SCM. But you are right and I will do it eventually (although what about forums and wikis? Guess they’d have to stay on SF. And then there’s the pain of git submodules vs the seemingly much better SVN externals…)

    1. xoofx

      The main problem with extending a language like C# (or even starting from scratch) is having access to a good native compiler infrastructure. Currently, LLVM is the only framework that provides a bit of this, but it is still quite a pain to get stack maps working correctly with it (if you are trying to develop a language with an integrated GC alongside more “manual” memory management), optimization is not really good when using gcroots as the compiler disables most optimizations, and the worst thing is probably the lack of proper support for debugging on Windows. And unfortunately for us, Microsoft has nothing to offer publicly on this front, which is truly annoying for anyone trying to play with language design on Windows. I’m sure that Joe and his team had access to this compiler infrastructure internally at Microsoft (Redhawk/Bartok/Phoenix, I don’t know exactly which is part of it), so it made his job much “easier” to do. Of course, it is still a breakthrough if they have been able to hit the sweet spot of providing a language with high performance and productivity without making the syntax cumbersome. Having a new language from Microsoft will be very cool, but it would be much cooler if we got access to the whole compiler pipeline and not a packaged version of it for a particular language. Having Roslyn as a front end for C# is great, and it is also good to see that this M#/X language will hopefully switch to Roslyn… though Roslyn (at least the RC, which is one year old now) lacks the extension points to make it a viable solution for developing new languages/extensions.

      I have been playing with language design as you did a few months ago, and dug into meta-programming as well, integrating concepts of immutability (largely taken from “Uniqueness and Reference Immutability for Safe Parallelism”). It was great fun, but ultimately I stopped playing, because it was too frustrating to be stuck by the lack of a good compiler backend.

  42. Marc Sigrist

    It’s great that Microsoft is doing research to create better languages or improve existing ones, and I greatly respect your work re. concurrent programming.

    It baffles me that you considered only C# or Java as starting points regarding “safety and productivity” for a new language in the year 2014. Languages whose safety is already deeply compromised by the “billion dollar” null reference problem and mutability by default. Languages whose level of productivity is frustrating, due to missing type inference, no automatic generalization, no pattern matching, no value nesting, no unified function/action concept, no tail call optimization, and a syntax infected with unnecessary plumbing artifacts.

    The same department you are working for, Microsoft Research, has already created a fantastic new language, F#, which has all these features. Why not base the next new language on the most productive new language available next door? Even if you cannot/will not take F# as a starting point, the new language will ultimately look similar to F# or other ML-based languages if you include the essential productivity features I mentioned…

    1. MBR

      Hear, hear! Once you’ve programmed with DUs (including ’t Option) and pattern matching, it’s hard to imagine why you’d want to approach most problems without them. And I think it’s been shown that F# type providers aren’t just good for databases, but for accessing system services or other resources in a strongly-typed way as well; generalizing/extending this concept to allow for more compile-time support and generalized meta-programming, to facilitate internal DSLs and higher levels of abstraction for systems programming as well as other domains, would be great. I hope it’s not the religious (optional, but almost always used) significant whitespace that’s the issue (which I love, FWIW). I know there is the feeling out there that “real” or “systems” programming languages need to look like C and have curly braces…

  43. Stephen Channell

    It is interesting that you quote the stack allocation advantages of the Java HotSpot VM, because HotSpot specifically uses the bytecode VM to profile executing code to optimise code generation. Transient stack allocation of objects is a later optimisation that was needed for HotSpot to compete with the JRockit JIT, which was faster and more similar to the .NET JIT. If Microsoft were to embark on engineering a HotSpot-type VM, the objective surely should be to generate code that can run on a GPU/MIC rather than just native binaries for a particular CPU… but the basic idea is worthy.

    Doing a C#/CX as a copy of the C++/CX version of C++/CLI is a worthy idea, and doing it without changing from System to Platform namespaces is a good idea because C# programs cannot define C pre-processor macros to switch between “ref new” and “gcnew” like you can in C++. Using the C++ backend would get all the optimisations that Herb’s group has added.. and would be somewhat like the Objective-C front-end for Clang.

    Having seen the SQL-injection vulnerabilities that foolish developers put into web applications, I believe that while code generation is a better approach than reflection/interpretation, it is no replacement for safe MSIL code verification. Whatever you think you could code with dynamic compilation to binary code, nothing will go into production or up to a cloud that does not use a CLR-type engine for any dynamic code.

  44. David Hanson

    I’m wondering if this new language and compiler was used in any way to produce native XAML in Windows 8? One thing that has puzzled me is how Microsoft managed to produce a replica of the managed libraries. Rewriting from the ground up seems unlikely, as surely you would try to produce something better and more modern, not carry over all the old mistakes.

    Joe any thoughts?

  45. Sam

    Have you guys thought about some form of ARC? It eliminates a lot of the memory management stuff without the drawbacks of the GC.

    1. DQ

      Have you guys thought about some form of ARC? It eliminates a lot of the memory management stuff without the drawbacks of the GC.

      Performance-wise, a GC is typically more efficient than ARC: a GC that freezes your threads wastes fewer CPU cycles than ARC with its expensive atomic operations on every assignment (smart compilers help in that regard, but they help to the same extent with a GC anyway). The GC’s performance can degrade with complex object graphs (chained references, etc.), though, because of memory saturation.

      Also, ARC is not convenient when your object graph contains circular references: you need a special mechanism for weak references, plus a convention or algorithm to determine when to use strong or weak references, and which objects to strongly reference to keep the graph alive.

      Finally, the benefits of ARC are not on the CPU side, quite the contrary. It is deterministic (and the process does not need to be woken up to free unused memory – great for smartphones), it does not require the threads to be frozen, and it has lower memory consumption. Here lie its real strengths. But they come at a cost: it is less practical than a GC.

      But Apple did a great job to let developers around the world believe that ARC is the alpha and omega. You can’t fight marketing.
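The cycle problem mentioned above can be sketched with a simple parent/child pair. Under plain reference counting each object keeps the other's count above zero forever, which is why ARC requires the back-edge to be declared weak (the types here are illustrative; .NET's real `WeakReference<T>` is borrowed just to show the shape of the fix):

```csharp
using System;

// Illustrative parent/child pair. Under pure reference counting, the two
// strong references keep each other's count above zero forever, so the
// cycle is never freed; a tracing GC collects it without any annotation.
class Parent
{
    public Child Child;
}

class Child
{
    // The conventional ARC fix: make the back-edge weak so it does not
    // contribute to the reference count.
    public WeakReference<Parent> Owner;
}
```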

      1. Stephen Channell

        “Automatic Reference Counting” is pretty much what ComPtr does, prettified with T^ references, except: [1] references are released at end-of-scope irrespective of exceptions; [2] AddRef/Release is not called when passed by reference (and dereference operators are compiler-eliminated with const references); [3] it does not use virtual invocation/messaging; [4] it does not have any daft syntax that reminds me of PL/I :-)

        I’d guess that a C#/CX would use ComPtr for reference tracking with WinRT… so “some kind of ARC” must be pretty much correct

  46. Joe Mayo

    Happy to hear about including LINQ ‘and’ making it inherently async. Are you looking at communication capabilities in the HTTP and/or IoT areas? Any thoughts about a native JSON type?

    Thanks,

    Joe

  47. Jon

    @joeduffy: The language I’ve been waiting for has all the performance of C++ but none of the backwards compatibility constraints. The fact that M# is based on C# concerns me, because it implies there will still be areas where C++ outperforms M#. For example, does “safety” imply that M# will force bounds-checking for arrays? That’s expensive. I’d like to completely replace C++, but if M# makes performance sacrifices in the name of productivity or safety, it won’t be that replacement.

    1. pcsaba

      Exactly. We are waiting for a language that, like C++, makes no compromise on speed, but with an easier learning curve.

    2. joeduffy Post author

      An out of bounds array access must never compromise type-safety, otherwise you don’t have a safe language.

      Therefore, you either prove the access is safe at compile time and elide the check — through well-known compiler optimizations (e.g., ABCD analysis) or through first-class type systems (e.g., dependent types) — or you must perform the check at runtime.
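To make this concrete, here is the classic case, as a sketch (whether a particular compiler actually elides the check is implementation-specific):

```csharp
static class BoundsChecks
{
    // The loop bound is a.Length itself, so a compiler can prove every
    // access is in range and elide the per-element bounds check: a
    // well-known JIT optimization.
    static int SumElided(int[] a)
    {
        int sum = 0;
        for (int i = 0; i < a.Length; i++)
            sum += a[i];
        return sum;
    }

    // n is not provably <= a.Length, so safety demands a runtime check
    // (or at least a single hoisted guard before the loop).
    static int SumChecked(int[] a, int n)
    {
        int sum = 0;
        for (int i = 0; i < n; i++)
            sum += a[i];
        return sum;
    }
}
```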

      1. Jon

        I suspect that there are cases where a human can prove the code is safe, even if a compiler can’t determine this statically. In these cases C++ would outperform M# if M# is doing runtime checks. I’m hoping that you can opt out of these checks, and anything else that might be detrimental to performance, in this new language (eg by marking the region as unsafe).

  48. Marco

    @Joe: I do not understand your point here:

    “This does mean that languages like C# are increasingly suffering from the Law of the Excluded Middle. The middle’s a bad place to be.”

    Have you tried to build a non-trivial JavaScript application with more than 100,000 lines of code yet? ;) I do not consider JavaScript productive at all when you take into account the whole software lifecycle (maintenance, support, etc.). The likelihood that a software system developed in JavaScript has more in common with spaghetti code than with a well-designed system is high unless you have very good and experienced JS developers on your team.

    And concerning performance: your diagram depicts single-core performance; what about multi-threading and multi-core?

    And frankly, I have still not understood why you guys are designing a new language. At Microsoft you have, with C#/F#, two very strong and very good languages. Why not simply invest heavily in the runtimes of the existing languages to make them faster? Wouldn’t a new OS also benefit from being developed in a functional style – where it makes sense (e.g. the scheduler, the optimizer, and other heavily algorithm-driven parts)?

    And isn’t there a danger that two very similar languages (M# and C#) weaken the ecosystem (C#, .NET), since it would probably lead to discussions in the community about whether C# is being abandoned in the long run by Microsoft (whether or not those discussions make sense)? You probably know the Silverlight story…

    Just my 2 cents.

    1. joeduffy Post author

      Marco, totally agree about big JavaScript projects easily getting out of hand. I’m optimistic about approaches to add real module systems, like TypeScript’s, but there’s a long way to go.

      In the area of multi-core, the answer is “it depends.” Each language has a remarkably similar spin on some similar underlying concepts: workstealing/promises/futures/tasks/parallel-for-loops/etc. C++ certainly has a more vibrant community in the areas of GPU parallelism (CUDA, AMP, etc), and JavaScript has neither a story for this nor shared-memory parallelism, at least one that is used at-scale. But on the server, they are all suitable, thanks to common shared-nothing architectures — even for JavaScript (see node.js).

      Where we really tried to innovate was safety. It’s remarkably powerful to know that, if your code compiled, it’s data race-free. You just don’t get that with C++, Java, C#, JavaScript, etc.

      Regarding “why a new language,” I did try to answer that in the post, including mentioning that you can think of this as an evolution of C#, rather than something entirely new. For a variety of reasons, it’s likely to stand on its own, however.

      1. anthonyb

        “For a variety of reasons, it’s likely to stand on its own, however.”

        And that raises the question: if you bring out a whole new language that is an “evolution of C#” and our existing code will compile with it (with some exceptions), then what will be the fate of C#? Will it really be needed? Because the overlap I see between these two languages will be huge, so one must go…

  49. C++.Net

    C++.NET (or Managed C++, C++/CLI, etc.) provided this, did it not? It was a great language that offered productivity with control.

  50. Rafel

    All this interest is due to: 1) C and C++ are so old and out of fashion… we hate the .h files! If we have to start a new project, it is a pain in the ass to still use these tools from the prehistoric age. 2) A lot of people want an alternative to Java and C#, but without garbage collection and without so much overhead. D is very good, but unfortunately it does not get enough interest, surely because it does not come from a big company. Go has some bad design decisions and also, like D, it is not catching enough attention. Rust is coming, but very slowly… the fact that it comes from Mozilla is a very good point. Negative points: the syntax is old-fashioned (still curly braces; we will never get rid of them), no good IDE, and it runs only on Linux…

    1. Nicolas Grilly

      Rafel,

      I agree that header files are really inconvenient in C/C++. This is probably the thing I “hate” the most in C/C++ :)

      D is really interesting, but even if it is simpler than C++, it’s still a quite complex language. I think that a new language wanting to attract a large audience needs to keep its specification relatively simple and short. In my opinion, that explains why C is still used by a lot of projects.

      I disagree with you about Go not catching enough attention. I think it’s quite the contrary. This language gained a large following in just a few years. The fact that its spec is simple and short can explain this. You wrote about some bad design decisions: what are they according to you?

      About Rust, it’s another really interesting work. About the old fashioned syntax, what would you suggest instead?

  51. Kesav

    Please see if “delete” can be provided alongside Dispose in the next C# version. Dispose is good, but it leaves memory at the mercy of the GC. All we need is the ease of C# plus deterministic destruction and simple data structures. So if I say “delete”, it should also release the memory from the claws of the GC. To me, the GC should be little (yeah, only a little) better than a memory cache with some intelligent, dynamic limits. Also, please take a look at existing data structures like List. I do not use most of it, so why should I pay for it? Let us have simple data structures and build complex ones on top of them. As we already have List, please bring in System.Collections.Simple and System.Collections.Generics.Simple, with bare-minimum data structures and really bare-bones functionality.

    If you clean up C# that much, it will perform orders of magnitude better than the existing one.

    1. Andrew Kinnear

      Yeah, I think a simpler language would be better: leave the implementation specifics at the mercy of the compiler. You could perhaps add compiler directives to your code (attributes?) which could inform the compiler and the runtime to prefer X or Y; safety with performance with ease of use, a “law of the INCLUSIVE middle”. We need options as developers. So maybe something like this:

      [System.LifeCycle.ManagementType(System.LifeCycle.ManagementTypeEnum.ARC)]
      public class MyClass : IManagedObject
      {
          // excludes bounds checking and does aggressive loop optimisation
          [System.LifeCycle.Optimisation(System.LifeCycle.OptimisationTypeEnum.UnsafeFastest)]
          void MyMethod()
          {
          }

          // finalizer executed before releasing the object from the heap (either
          // by GC or by ARC depending on ManagementType; should be used to
          // finalise any un-managed resources)
          void Destructor()
          {
          }
      }

      Explicit allocation could be done by way of something like a Using statement (replacing de-allocated object references with a null reference).

      Not sure how this works currently (I’m not a compiler expert), but optimally, pre-compiled machine-level instructions could be included in the DLL to reduce JIT compilation times (targeted at specific instruction sets). Compiler optimisation will probably need to be the focus. Any compromises to the safety and fidelity of the execution should be guided by the developer, allowing the language (and ultimately the CLR) to cater for more scenarios: scenarios where optimisation and integration with low-level functions are needed, as is the case with things in an operating system.

      Anyway, that’s just my 2 cents.

    2. Stephen Channell

      You wouldn’t use ‘delete’ with a smart pointer in C++ (the smart pointer does the delete after the last reference is released), so you shouldn’t need it for a native C#… and objects allocated on the stack are released at the end of the scope.

      1. Kesav

        Thanks for the reply. But I am talking about developer control over objects allocated on the GC heap. A C++ smart pointer would ultimately do a “delete” on its internal object, which indeed deletes the object. So I am asking for that deterministic delete when needed. Provide a way to do that…

        class ResourceCriticalClass : IDisposable
        {
            private A a = new A(); // this is an object created on the GC heap

            public void Dispose()
            {
                Dispose(true);
                /*************************PROVIDE THIS OPTION********************/
                GC.Delete(this);
            }

            private void Dispose(bool isDisposing)
            {
                ...
            }
        }

  52. Edward S. Lowry

    In 1982 it was possible for people at DEC to enter a computation such as:

    6 = count every state where populatn of some city of it > 1000000

    Today no current language comes close to that simplicity of expression for that computation, and no one has been able to give a sound reason for the long-term use of any substantial language that does not come close. If M# or any new language doesn’t come close, I recommend a major rethink. See “Inexcusable Complexity for 40 years” at users.rcn.com/eslowry.

    1. MBR

      This capability certainly still exists, but natural language is not suitable for programming, and when conditions get complex, it doesn’t work at all. Certainly:

      SELECT city.name FROM cities WHERE city.population > 1000000

      or

      cities.filter(city => city.population > 1000000)

      or countless other ways of doing this are not only potentially shorter than your NLP example, but are more predictable, scalable and tool-able.
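For comparison, Lowry's example also maps directly onto C#'s LINQ, assuming hypothetical State/City types (none of this is in the original comment):

```csharp
using System.Collections.Generic;
using System.Linq;

class City  { public string Name; public int Population; }
class State { public string Name; public List<City> Cities; }

static class Query
{
    // Lowry's query, "count every state where populatn of some city
    // of it > 1000000", expressed with LINQ operators:
    static int CountBigCityStates(IEnumerable<State> states)
    {
        return states.Count(s => s.Cities.Any(c => c.Population > 1000000));
    }
}
```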

      1. Andrew Kinnear

        I think he’s referring to code which more resembles natural language, which would be useful for non-programmers; but more structure and less ambiguity is needed for more complex descriptions, and natural language is not very concise: it’s easier to scan properly formatted C# code to find something. Although it could be useful to have intelligent context resolution, such that you could use “it”, “them” or “they”; they may be able to add that to the LINQ grammars so you can say “from them select”. But if MS is already doing things like the LINQ grammar as a language extension, why make these grammars fixed or pre-defined? Why not allow the developer (or architect) to define additional grammars, bringing in the behavioural/functional side of things that way, pre-compiling the grammars into statements (as I’m sure the LINQ grammar does)? Like a domain-specific language, this could promote re-use at a procedural level, such that modelling could include these. Grammars could be created to match a set of keywords like “from” or “select” in a specific sequence, as we see in the LINQ grammars, coupled with place markers which match specific types of variables (or other grammars). Then on compilation the grammar is expanded into its C# form based on the most specific definition that matches.

        Reply
        1. MBR

          “Programming languages for non-programmers” has a long history, and is self-contradictory both descriptively and in practice – what we want is better languages for programmers. However, as I’ve stated in my previous posts in this thread, I’m all for better meta-programming and control over surface syntax as well – if someone really wants the syntax as shown, they should be able to enable it using a modular language extension that uses the language itself to enable it.

          The important piece is an open and flexible back-end which targets the semantics of the classes of languages you want to support — the choice of syntaxes and any micro-DSLs used should be up to the user to either include or define.

          I agree that LINQ syntax is cool, but it doesn’t offer enough over the fluent API syntax (actually, a lot less) to justify having been added without enabling generic language extension at the same time. Still, it probably served its purpose of easing the translation of simple SQL queries and removing nasty embedded SQL, and it is good proof that alternate, embedded syntaxes are useful.

          Reply
          1. Andrew Kinnear

            I think that defining these kinds of syntaxes could be useful: an additional way to abstract code to promote re-use at a procedural level, so as to facilitate construction, aid adaptation, and be intrinsically more robust and fault-tolerant to changes in the environment supporting the code. So I’m thinking of some form of abstraction of aspects. It doesn’t necessarily have to be as “natural” as the LINQ grammar, but being able to define even your own simple syntaxes would be useful. I want to be able to say something like this:

            public void MyMethod() {
                secure (Security.Type.AdministrationAccess) {
                    // secured code
                }
            }

            or

            public void MyMethod() {
                critical {
                    // secured code
                }
            }

            The mechanism used to wrap code in a “secure” or “critical” block could change if the code is applied in another context, based on a syntax or definition of some kind. So “secure” would wrap the encapsulated code with code that does the security check, and “critical” could log or notify of exceptions. Of course you could do this using lambdas and delegates; it’s just not as clean. To me this would satisfy the “aspect-oriented” side of things. For me, lifecycle, context, and abstraction of procedural aspects are important in a language.
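            As a rough illustration (not the author’s design), the lambda-based fallback mentioned above could look like the following sketch in today’s C#; the Aspects class, the IsAuthorized check, and all names here are hypothetical:

```csharp
using System;

static class Aspects
{
    // Hypothetical stand-in for a real authorization check.
    public static bool IsAuthorized(string accessType) =>
        accessType == "AdministrationAccess";

    // secure { ... }: runs the body only if the security check passes.
    public static void Secure(string accessType, Action body)
    {
        if (!IsAuthorized(accessType))
            throw new UnauthorizedAccessException(accessType);
        body();
    }

    // critical { ... }: logs any exception escaping the body, then rethrows.
    public static void Critical(Action body)
    {
        try { body(); }
        catch (Exception ex)
        {
            Console.Error.WriteLine("critical block failed: " + ex.Message);
            throw;
        }
    }
}

class Demo
{
    public void MyMethod()
    {
        Aspects.Secure("AdministrationAccess", () =>
        {
            // secured code
            Console.WriteLine("running with admin access");
        });
    }
}
```

            A user-defined `secure { … }` grammar would presumably compile down to roughly this shape, with the advantage that the wrapping code could be swapped per context without touching call sites.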

  53. ac

    There are certainly many cases where performance can suffer for the reasons already touched on. But in C# vs. C/C++, a large part of the overall performance gap can be the lack of a way to easily interoperate with existing optimized code; if you do that from C#, you incur a penalty. Or you waste time on some sort of C++/CLI layer, complicating the whole deal so much that it will rarely be bothered with, leaving the whole project full of code that could be a bit faster if it weren’t for the implementation effort and actual overhead of interop.

    Also, when it comes to Windows (at least Win32), there are just a ton of things that can really improve performance by using the Win32 API directly.

    I guess my point is, it would be nice to have a safe, high-performance language, but being able to use existing optimized code would be high on the priority list. Of course, interoperating with “unsafe” code could be discouraged by putting it behind a compiler switch, but there shouldn’t be a performance penalty.

    That’s sort of why I like the idea that you could make a C++ project and then, as with inline assembly, use C# inline, in such a way that modifying the C# part of the code would only cause the C# part to be recompiled, at C# compile speeds. It wouldn’t need to be C# with reflection and the CLR, etc., just C# the syntax, usable right in a C++ project, with the more convenient libraries accessible.

    Reply
    1. xoofx

      however in C# vs C/C++ large part of overall performance can be lack of way to easily interoperate with existing optimized code

      The whole purpose of this new language M# is to generate optimized code without sacrificing productivity or safety. In that case, we would no longer need to interop with C/C++ libraries, because the compiled code would be as efficient as code compiled from C/C++. It seems they have addressed in M# what was lacking in C#: mainly, deterministic and finer control over the performance of a program, by providing different allocation/destruction strategies (depending on usage and/or keywords).

      What is unclear in this post is how this language is going to interop with an existing OS like Win32, as it seems it was primarily developed for/with Midori in mind. There is no information in this post about how unsafe options/interop will be handled with native code (talking about safety and at the same time about these unsafe options would be an uncomfortable position). Questions like “do we still have [DllImport]?” and “is the managed/unmanaged transition still necessary in all cases?”, etc., are still to be unveiled.

      Concerning ASM code, if this language is able at some point to integrate SIMD types/intrinsics, it would be quite enough to tackle most of the last bits of performance required by some critical applications.

      Reply
  54. Someone

    This is really great news! We are using C# for productivity, but have to switch to C++ (performance/system) or Python (cross platform) every now and then.

    The C#/.NET GC has already been an issue in our daily work, but relying on it saves some time and makes C# easier to use for semi-professional programmers. A combination of the C++ and C# worlds may be a good solution for many memory-related problems. More control over memory and performance would also be a great feature in C#. The same goes for deterministic performance behavior.

    When it comes to type safety, safety goes first. Unfortunately, a programming language seemingly has to be foolproof in many ways, in the first place. But if you have found a way to assure safety while adding some candy, this is great news. Your error model and async approach sound very interesting; I’d really like to test this out. Of course, the support of modern frameworks is very important for programming productivity. But I don’t mind missing framework support in a preview release, as long as I know the plans for adding framework support in the future.

    I know your new language is far from being a product, but the following points, more political than technological, concern the biggest problem we see when using C#, so I cannot leave without noting this as early as possible. Without the Mono framework, we would not use C#. Mono removes the strict dependency on Microsoft (just in case…), and some of our components also have to run on Linux; we get great benefits from using one language with great tool support, from our website to our LOB projects and even down to an embedded device. Especially for the last part, enhanced performance would be great. We suffer from the (small) limitations of C#, but far more, and more frequently, from the gap between .NET and Mono. Because of this, I would like to see the new language be portable across platforms. At least Windows (including RT and Phone), the major Linux distributions (not just for x86/64, but also for ARM), and OS X should be supported right from the first release and out of the box, as far as technologically possible and without the need for community work and releases.

    Reply
  55. Ashkan

    I understand that this may be a little outside the concerns of the language, but you seem to be building a platform on top of it (or at least trying to modify the .NET libraries for that). What are your plans for concurrency between machines? I mean features like what Erlang has: being able to work with agents (in F# terminology) using mailboxes that live on different machines, transparently to you as a language/platform user. This is a big thing for sure. As I understand it, you are implementing fault tolerance using your safety features, and because of having immutable data structures, recreating agents/processes/fibers/… should be easy, so I’m interested to know more about these features:

    - Fault tolerance
    - Network-based concurrency support

    Reply
  56. Neg Lewis

    I wonder if this could be done: instead of “null” being just a null, it could be a static instance of the object type it refers to, and all its properties/methods/… would be valid but would return “null”. It would be like in Java/JavaScript: a non-nullable programming language.

    At compile time, the compiler could analyze the code, create the missing parameters/methods, and sustain multiple “instances” of the same object. For example, with Class1.x = 2; since 2 is an int, class Class1 should get a public/private field/property “int x”;

    but Class1.x = DateTime.Now should not produce an error: it should create a new object (Int-DateTime) that contains both values. The same goes for classes/DLLs.

    Let’s say I use an obsolete lib file that has no Class1. It should be created at compile time; maybe even at run time a wrapper could be created to isolate the problems. Maybe one could use a library from Java by creating a wrapper that sandboxes the security issues, by returning that “null”.

    That way I could use C# instead of JavaScript :) I could be safe inside .NET, and so on.

    :)

    Reply
    1. Andrew Kinnear

      Yeah, I was thinking about that myself. You could maybe make it such that any call to any member of a null reference just returns null (instead of throwing a NullReferenceException)…

      Reply
      1. MV

        You could have null.anything produce another null. You can also run with “ON ERROR RESUME NEXT” in VB, and it’s great for “avoiding errors”, but terrible for getting the right result.

        Reply
        1. Andrew Kinnear

          Well, that all depends; things like the LINQ select clause apply this logic in certain scenarios… It’s up to the geniuses at Microsoft to determine which is more appropriate as a default behaviour. But it could avoid code being littered with things like the following:

          domainObject.namePropertyX = entityObject.PropertyA != null ? entityObject.PropertyA.Name : null

          Field and property setters would still have to throw the null reference exception. Another alternative could be to mark things “Nullable”, perhaps, so as to check during compilation.
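          For what it’s worth, the repetitive null checks can at least be centralized in a small helper in today’s C#; this is just an illustrative sketch, and the NullSafe/With names are invented:

```csharp
using System;

static class NullSafe
{
    // Returns selector(source), or null when source itself is null,
    // mimicking "null.Member == null" for a single member access.
    public static TResult With<TSource, TResult>(
        TSource source, Func<TSource, TResult> selector)
        where TSource : class
        where TResult : class
    {
        return source == null ? null : selector(source);
    }
}

class PropertyHolder { public string Name; }
class Entity { public PropertyHolder PropertyA; }

class Demo
{
    static void Main()
    {
        var entityObject = new Entity(); // PropertyA is null here
        // Instead of:
        //   entityObject.PropertyA != null ? entityObject.PropertyA.Name : null
        string name = NullSafe.With(entityObject.PropertyA, p => p.Name);
        Console.WriteLine(name == null); // True
    }
}
```

          A language-level version of this (propagating null through member access by default, or via an explicit operator) is exactly the design question being debated above.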

          Reply
  57. Neg Lewis

    Another thing you guys could (MUST!) do is on the Mono side. Mono lets devs write apps for Linux. The WP SDK lets devs write apps for WP.

    Can you see where I am heading? Combine those two and let devs write C# (.NET) apps for Android. Any Android app would be a click away from being published on the MS Store. CPU: ARM/x86; Conf: Release/Debug; OS: Windows, WP, Android, … :) iOS?

    Reply
  58. Andrew

    I also wanted to comment that I’ve started looking into making a C# to C++/HLSL/GLSL/Cg converter for games. It uses NRefactory to generate the C# syntax tree, and in turn I will use it to convert C# code into C++ with an open-source GC under the BSD licence, such as: http://sourceforge.net/projects/gccpp.berlios/ http://developer.berlios.de/projects/gccpp/

    All system types such as “System.Console.Write(…);” would be converted to “System::Console::Write(…);”, for example, and “using System” would be converted to “#include "System.h"”. Of course, I would be remaking some of the C# standard library to map over to C++ as shown. C# to HLSL etc. would be much simpler than C# to C++, which I have already done using Reflection and Regex, but I decided to rewrite it using NRefactory as there were issues with that approach.

    The reason for this project would be to negate Mono/.NET licence issues on iOS/BB10/NaCl/etc., and of course to tackle the performance issue, as well as making shader writing portable. Although I hope a project like the one suggested in this article will negate my efforts for the C# to C++ part.

    Reply
    1. David Piepgrass

      Andrew, I have almost finished my C# parser which converts C# to a Loyc tree. Loyc trees are generic syntax trees designed to represent any programming language, and they are specifically designed to assist conversion between programming languages. You could consider writing a “printer” module that will take a Loyc tree as input and output it as C++, HLSL, etc. Let me know if you’re interested, – qwertie 256 @ gmail. com, without the spaces.

      Reply
    2. Virgile

      On a similar note, I am quite interested in starting an open-source project that would target this new language. I have already been toying with MSIL => LLVM generation (actually I was also planning to start with C# and add some performance-oriented extensions to the language, but since it’s already done by MS, better to start from that directly!). LLVM is quite performant and would allow targeting many platforms directly (iOS, JavaScript, etc.).

      I have also been working on porting the LLDB (LLVM debugger) stack to Windows/Visual Studio as a preliminary step for this future effort.

      Even though many of the language features are not known yet, it seems various aspects of the C# language are being reused, so that means part of the work can already be started without knowing everything beforehand (except knowing whether it works from IL or from source directly; any hint on that?).

      I would like to know if anybody would be interested in such a project (as contributors)?

      Reply
      1. David Piepgrass

        I believe Mono AOT (ahead-of-time) compilation can already do what you’re suggesting (CIL to LLVM), to some extent; see http://www.mono-project.com/Mono:Runtime:Documentation:LLVM . IIRC, Mono’s AOT was developed at first to target iOS, because Apple prohibits software that runs on a virtual machine.

        It sounds like Joe’s new project will not use CIL… that’s not a good way to make a native code compiler, as you’d be subject to limitations of CIL that native code doesn’t have, and unlike LLVM bitcode, CIL code isn’t designed specifically to be optimized.

        Reply
        1. Andrew Kinnear

          Just thinking about it: is there something like “process safety”, analogous to type safety, but ensuring at compile time that the state of an object is suitable for executing a specific method? Something along the lines of stating that “Method A” must execute before “Method B”, but not after “Method C”. I’m not sure how doable that is if you expand it out to every eventuality, but it would globally ensure Liskov substitution. Object states.
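          For what it’s worth, the “Method A before Method B” part can be approximated in today’s C# by giving each object state its own type, so illegal orderings simply fail to compile; a typestate-style sketch with invented ClosedFile/OpenFile names:

```csharp
using System;

// Each state is its own type, and each operation is only defined on
// the state that permits it, so "Write before Open" cannot compile.
sealed class ClosedFile
{
    public OpenFile Open() => new OpenFile();
}

sealed class OpenFile
{
    public OpenFile Write(string s)
    {
        Console.WriteLine("wrote: " + s);
        return this; // still open, so further writes are allowed
    }

    public ClosedFile Close() => new ClosedFile();
}

class Demo
{
    static void Main()
    {
        ClosedFile closed = new ClosedFile()
            .Open()           // must come first
            .Write("hello")   // only legal while open
            .Close();         // after this, Write is unavailable
        // closed.Write("x"); // compile error: ClosedFile has no Write
    }
}
```

          This only checks ordering, not aliasing; full typestate systems also have to prevent stale references to the old state, which needs the kind of linearity/ownership analysis discussed elsewhere in this thread.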

          Reply
  59. Hristo Pavlov

    It’s a great idea to bring C# and C++ together and create a new language. In the past years, I’ve found myself more and more often writing the high-level code in C# and then calling, via P/Invoke, all the performance-critical code (written in C++, compiled for the given platform, and invoked from C#). Looking at other systems-programming C# projects, this seems to be a practice used by many.

    It would be really nice if there was only one language that provides both ease of use and performance. However the success of such a language to my mind will largely depend on the framework around it. I would be interested to learn about the current plans for:

    - Frameworks for system UI development: would there be a WPF-like, XAML-based library for creating rich, responsive, modern UIs?
    - The ability to make system calls (P/Invoke with Windows libraries and other C++ libraries)
    - The ability to interop with existing C# libraries and COM objects

    We will surely want to reuse libraries that are already built and to use a familiar technology for building UI (XAML). Is there any certainty about whether any of this will be easily done?

    Reply
  60. Doug

    Just stumbled on this. I know you work for Microsoft. But if you want this language to succeed, please make the language run on all OSes, not just Windows. Model this like any non-Microsoft Open Source new language: open license like Apache or BSD, designed to be a first class language on all platforms (Mac, Linux, and Windows) from the beginning, build a community from all OSes (not just Windows), eventually have external committers and maintainers, work all releases out of the public repository and show all commits which means starting from the beginning (0.1 release) in the public repository, etc.

    Look at Go and Rust as examples of great languages with active communities. I think you will want that too from the beginning.

    Then you will have a horse in the race. If you don’t, it is dead out of the gate.

    Reply
  62. Ben Kloosterman

    Any updates? Lots of promised blog posts above. Joe, I wouldn’t mind implementing an RC Immix GC for this. That GC has minimized nearly all the GC disadvantages; it’s still not as good as regions, but great when regions can’t be formed automatically or explicitly. Can you provide an alpha ABI for GC hooks?

    Q: is the new runtime written in C, or is it self-hosting?

    I agree with the inheritance comments above: allow it for compat, but mark it as obsolete. Favour type classes and interfaces with composition for OO, though I’m dodgy on monads.

    Reply
  63. Sandro

    This sounds promising, but I’ll have to remain skeptical until I see a paper! I’m also curious where you’d put Ada on your chart. Should probably be above and slightly to the left of C++11.

    No extant systems programming language [1] I’m aware of can safely express my go-to systems programming examples:

    1. Safe tagged pointers, as used in ML to distinguish integers from pointers: this requires the language to support bit-level variant tags and understand alignment.
    2. Strong update: safely changing the type of a reference to another type, e.g. a memory allocator changing a byte[] into a strong object type like “String”. This requires linear typing or something like it, and some strong alias analysis to ensure disjointness.
    3. Unboxed arrays of any type with dynamic size: consider implementing an immutable hash array mapped trie with only a single allocation per level, i.e. each node is a dependent record, in pseudo-C#: struct Node { int bitmask; fixed T[popcount(this.bitmask)] items; }

    So can “system-C#” safely express these idioms?

    [1] ATS might, since it permits safe pointer arithmetic.

    Reply
  64. qwerty2501

    Hello Joe. This new language sounds great; I hope your project goes well. I want to be able to compile for “Any CPU” in this new language. Are you going to provide this feature? I think that to fully support it, you might need a pre-processing step for ahead-of-time compilation.

    Reply
  65. The recommender

    I recommend you all take a look at the c-up language (www.c-up.net). I came across it on the gamedev.net forums and think it strikes a good balance between safety and performance. It’s GC by default, but it allows you to write your own manual allocators and has intriguing stack allocation capabilities, with a novel “local reference” concept to make use of the stack safe. If you do want to use manual memory management, it can automatically check that you aren’t freeing still-referenced memory (it does this at runtime using the GC tracing capabilities). The performance comes from safe automatic task parallelisation and fully featured SIMD support similar to GPU shader languages. It’s bounds-checked, but this can be disabled as a compiler option, and in any case it doesn’t seem to be anything like the performance hit that people often assume. The memory allocators (manual or GC) do not use block headers, so you consume exactly as much memory as you allocate.

    Reply
  66. The Dude

    Really interested if there is any more news or updates on this. Can we expect to hear anything anytime soon?

    Reply
  67. Petter

    Would love to hear a lot more about this new language in coming posts (hint hint, nudge nudge).

    Especially, I’d love to hear more on tackling concurrency/determinism issues in memory management with your new language. When writing complex server software in C#, the number of design constraints on code that needs to be concurrent and somewhat responsive (minimizing duration/frequency of GC stalls) is insane. And in the end, none of the solutions are particularly satisfying and end up working around the language’s shortcomings by making horrible sacrifices.

    Reply
  68. Peter

    Compared with C#, C++ is unmanaged code, but it actually gives the developer more freedom and control, if he knows what he is doing.

    Reply
