Is the Ceylon type system sound?

Posted by    |       Ceylon

So I've been reading some folks demanding that work on Ceylon start with a formal proof of the soundness of the type system. And calling me all sorts of names because I don't have one yet. I'm a bit bemused by this, since it's the first time in history that this has been demanded of a language designed for use in practical computing :-)

Nevertheless, I think the objection is interesting and merits some kind of response. So here's my take.

First of all, Ceylon has a very simple, super-conventional core type system. At core, the system is just:

  • parametric polymorphism with variance annotations, plus
  • mixin inheritance.

Now, we already know this basic type system to be sound. Other folks have already demonstrated that. In fact, if I understand correctly, the Scala guys have demonstrated the soundness of a significantly more complex type system with kinds in addition to mixin inheritance and generics.

There's just maybe five additional things that are defined primitively in the language spec:

  • union types,
  • reified generics,
  • attributes,
  • nested classes, and
  • higher-order functions.

Almost everything else you see in Ceylon is just sugar over the top of this basic scheme of parametric polymorphism + variance annotations + mixin inheritance. By that I mean you can re-express the other constructs in the language in terms of these more primitive notions. And that is, more or less, how the spec defines them (occasionally that might require some reading-between-the-lines).

That's one reason why Ceylon takes the approach of specifying things like operators in terms of equivalent Ceylon code. We don't have any special holes or primitive special cases in the type system. There's no primitive numeric types, arrays, raw types, built-in type promotions, or a primitive null. We've even eliminated overloading, which does seem to make soundness proofs more difficult.

And in fact, even most of the things in the list above could potentially be defined in terms of more primitive notions if necessary:

  • union types of interfaces can be interpreted as implicit interfaces (but union types of classes are a problem)
  • reified generics really just means an additional implicit parameter to generic constructors and methods (in fact, that's even how you need to implement it on the JVM)
  • attributes can be seen as syntactic sugar over a field + a method (in fact, that's even how you need to implement it on the JVM)
  • nested classes can be seen as syntactic sugar over a toplevel class (in fact, that's even how you need to implement it on the JVM)
  • higher-order function support can be seen as syntactic sugar over specially generated classes (in fact, that's even how you need to implement it on the JVM)

Get the picture? Even though these things are defined in primitive terms in the spec, we already know that they can be re-expressed in terms of other primitive constructs because we already need to be able to do that in order to compile the language to Java bytecode.

Indeed, I believe it would be quite easy to show that almost any Ceylon program could be mechanically translated to a known-sound language with parametric polymorphism + variance annotations + mixin inheritance. That spec would suffice in and of itself as a proof of the soundness of Ceylon's type system!

There is one major doubt I have. I'm not sure if the type systems that have been previously studied have included union types. It would be an interesting exercise for a CS student to take some of the existing soundness proofs out there and try adding union types. Not sure if someone has already done that.

Anyway, the point I'm trying to make is that you shoudn't need a full formal proof from first principles of the soundness of your type system in order to be pretty confident that it is sound. If the core type system is already known to be sound, due to the work of others, and if you know that some other construct can be re-expressed in terms of the simpler, core constructs, you have have an informal proof of the soundness of that construct. Not quite good enough for an academic paper, perhaps, but good enough to start work on the compiler.

What do you think? Am I missing something here?

Back to top