This is the eleventh installment in a series of articles introducing the Ceylon language. Note that some features of the language may change before the final release.

This article was updated on 28/5/2011 to reflect refinements to the language specification and to add new material dealing with self references, outer instance references, and circular references, and again on 2/6/2011 to mention definite initialization of methods. The comment thread reflects information in the first version of the article.

Self references and outer instance references

Ceylon features the keywords this and super, which refer to the current instance of a class — the receiving instance of an operation (method invocation, member class instantiation, or attribute evaluation/assignment), within the body of the definition of the operation. The semantics are exactly the same as what you're used to in Java. In particular, a reference to a member of super always refers to a member of a superclass. There is currently no syntax defined for references to a concrete member of a superinterface.

In addition to this and super, Ceylon features the keyword outer, which refers to the parent instance of the current instance of a nested class.

class Parent(String name) {
    shared String name = name;
    shared class Child(String name) {
        shared String name = outer.name + "/" + name;
        shared Parent parent { return outer; }
    }
}
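
For readers coming from Java, the closest counterpart to outer is the qualified this expression (Parent.this) inside an inner class. The following is a rough Java sketch of the example above, not Ceylon code; the accessor parent() and the class OuterDemo are illustrative assumptions.

```java
// Java analogue of Ceylon's `outer`: an inner class reaches its enclosing
// instance through the qualified expression `Parent.this`.
class Parent {
    final String name;

    Parent(String name) {
        this.name = name;
    }

    // Non-static inner class: every Child belongs to exactly one Parent.
    class Child {
        final String name;

        Child(String name) {
            // Parent.this.name plays the role of outer.name
            this.name = Parent.this.name + "/" + name;
        }

        Parent parent() {
            return Parent.this;
        }
    }
}

class OuterDemo {
    public static void main(String[] args) {
        Parent p = new Parent("root");
        Parent.Child c = p.new Child("leaf");
        System.out.println(c.name);            // root/leaf
        System.out.println(c.parent() == p);   // true
    }
}
```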

There are some restrictions on the use of this, super, and outer, which we'll explore below.

Multiple inheritance and linearization

There's a good reason why super always refers to a superclass, and never to a superinterface.

Ceylon features a restricted kind of multiple inheritance often called mixin inheritance. Some languages with multiple inheritance or even mixin inheritance feature so-called depth-first member resolution or linearization where all supertypes of a class are arranged into a linear order. We believe that this model is arbitrary and fragile.

Ceylon doesn't perform any kind of linearization of supertypes. The order in which types appear in the satisfies clause is never significant. The only way one supertype can take precedence over another supertype is if the first supertype is a subtype of the second supertype. The only way a member of one supertype can take precedence over a member of another supertype is if the first member refines the second member.
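
Java's default methods (Java 8 and later) happen to follow a comparable rule, which makes for a convenient illustration. In this sketch, written in Java rather than Ceylon and using hypothetical names, the order of the types in the implements clause has no effect; the member of Loud wins only because Loud is declared as a subtype of Greeter and refines greet().

```java
interface Greeter {
    default String greet() { return "hello"; }
}

// Loud is explicitly a subtype of Greeter and refines greet().
interface Loud extends Greeter {
    @Override
    default String greet() { return "HELLO"; }
}

// The listing order differs, the result does not.
class First implements Greeter, Loud {}
class Second implements Loud, Greeter {}

class PrecedenceDemo {
    public static void main(String[] args) {
        System.out.println(new First().greet());   // HELLO
        System.out.println(new Second().greet());  // HELLO
    }
}
```

Had Greeter and Loud been unrelated interfaces that both defined greet(), javac would reject First until it overrode greet() itself, rather than picking a winner from the declaration order.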

In our view, there's no non-fragile basis for deciding that one type specializes another type unless the first type is explicitly defined to be a subtype of the second. There's no non-fragile basis for deciding that one operation is more specific than another operation unless the first operation is explicitly declared to refine the second.

For a similar reason, interfaces shouldn't be able to define initialization logic. There's no non-fragile way to define the ordering in which supertype initializers are executed in a multiple-inheritance model. This is the basic reason why interfaces are stateless in Ceylon.

(Note that these arguments are even stronger in the case of adapter introduction, where linearization or statefulness would be even more fragile.)

So Ceylon is more restrictive than some other languages here. But we think that this restriction makes a subtype less vulnerable to breakage due to changes in its supertypes.

Definite assignment and definite initialization

A really nice feature of Java is that the compiler checks that a local variable has definitely been assigned a value before allowing use of the local variable in an expression. So, for example, the following code compiles without error:

String greeting;
if (person==me) {
    greeting = "You're beautiful!";
}
else {
    greeting = "You're ugly!";
}
print(greeting);

But the following code results in an error at compile time:

String greeting;
if (person==me) {
    greeting = "You're beautiful!";
}
print(greeting);   //error: greeting not definitely initialized

Many (most?) languages don't perform this kind of static analysis, which means that use of an uninitialized variable results in an error at runtime instead of compile time.

Unfortunately, Java doesn't do this same kind of static analysis for instance variables, not even for final instance variables. Instead, an instance variable which is not assigned a value in the constructor is initialized to a default value (zero or null). Surprisingly, it's even possible to see this default value for a final instance variable that is eventually assigned a value by the constructor. Consider the following code:

//Java code that prints "null"
class Broken {
    final String greeting;
    
    Broken() {
        print();
        greeting = "Hello";
    }

    void print() {
        System.out.println(greeting);
    }

}
new Broken();

This behavior is bad enough in and of itself. But it would be even less acceptable in Ceylon, where most types don't have an acceptable default value. For example, consider the type Person. What would be an acceptable default value of this type? The value null certainly won't do, since it's not even an instance of Person. (It's an instance of Nothing, remember!) I suppose we could say that evaluation of an uninitialized instance variable always results in an immediate runtime exception, but this is really just our old friend NullPointerException creeping back in by the back door, and, well, it's Just Not How We Do Things Around Here.

Indeed, few object-oriented languages (i.e. none that I know of) perform the necessary static analysis to ensure definite initialization of instance variables, and I believe that this is perhaps one of the main reasons why object-oriented languages have never featured typesafe handling of null values.

Class bodies

In order to make it possible for the compiler to guarantee definite initialization of attributes, Ceylon imposes some restrictions on the body of a class. (Remember that Ceylon doesn't have constructors!) Actually, to be completely fair, they're not really restrictions at all, at least not from one point of view, since you're actually allowed extra flexibility in the body of a class that you're not allowed in the body of method or attribute declarations! But compared to Java, there are some things you're not allowed to do.

First, we need to know that the compiler automatically divides the body of the class into two sections:

  1. First comes the initializer section, which contains a mix of declarations, statements and control structures. The initializer is executed every time the class is instantiated.
  2. Then comes the declaration section, which consists purely of declarations, similar to the body of an interface.

Now we're going to introduce some rules that apply to code that appears in each section. The purpose of these rules is to guarantee that an instance variable has had a value specified or assigned before its value is used in an expression.

But you don't actually need to think about these rules explicitly when you write code. Only very rarely will you need to think about the initializer section and declaration section in explicit terms. The compiler will let you know when you break the rules, and force you to fix your code.

Initializer section

The initializer section is responsible for initializing the state of the new instance of the class, before a reference to the new instance is available to clients. The declaration section contains members of the class which are only called after the instance has been fully initialized.

Consider the following example:

class Hello(String? name) {
    
    //initializer section:

    String greetingForTime {
        if (morning) {
            return "Good morning";
        }
        else if (afternoon) {
            return "Good afternoon";
        }
        else if (evening) {
            return "Good evening";
        }
        else {
            return "Hi";
        }
    }
    
    String greeting;
    if (exists name) {
        greeting = greetingForTime + ", " + name;
    }
    else {
        greeting = greetingForTime;
    }
    
    //declaration section:
    
    shared void say() {
        print(greeting);
    }
    
    shared default void print(String message) {
        writeLine(message);
    }
    
}

To prevent a reference to a new instance of the class leaking before the new instance has been completely initialized, the language spec defines the following terminology:

Within a class initializer, a self reference to the instance being initialized is either:
  • the expression this, unless contained in a nested class declaration, or
  • the expression outer, contained in a directly nested class declaration.

Now, according to the language spec:

A statement or declaration that appears within the initializer of a class may not:
  • evaluate attributes, invoke methods, or instantiate member classes that are declared later in the body of the class upon the instance that is being initialized, including upon a self reference to the instance being initialized.
  • pass a self reference to the instance being initialized as an argument of an instantiation or method invocation or as the value of an attribute assignment or specification.
  • return a self reference to the instance being initialized.
  • evaluate attributes, invoke methods, or instantiate member classes declared in the declaration section of a superclass of the instance being initialized, including upon a self reference to the instance being initialized.
  • invoke or evaluate a formal member of the instance being initialized, including upon a self reference to the instance being initialized.
  • invoke or evaluate a default member of the instance that is being initialized, except via the special super self reference.
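
Java imposes no such restrictions, so the motivation for the last two rules can be sketched in Java (the class names here are hypothetical). A superclass constructor invokes an overridable method, and the subclass's override observes a final field before it has been assigned.

```java
class Base {
    final String seenDuringInit;

    Base() {
        // Calls an overridable member on the instance being initialized.
        seenDuringInit = describe();
    }

    String describe() {
        return "base";
    }
}

class Derived extends Base {
    private final String label;

    Derived(String label) {
        super();               // runs before label is assigned
        this.label = label;
    }

    @Override
    String describe() {
        return label;          // still null when called from Base()
    }
}

class InitLeakDemo {
    public static void main(String[] args) {
        Derived d = new Derived("ready");
        System.out.println(d.seenDuringInit);  // null
        System.out.println(d.describe());      // ready
    }
}
```

Under the rules quoted above, the equivalent call inside a Ceylon initializer would be rejected at compile time, because describe() corresponds to a default member of the instance being initialized.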

Declaration section

The declaration section contains the definition of members that don't hold state, and that are never called until the instance to which they belong has been completely initialized.

According to the language spec:

[The declaration section] may not contain:
  • a statement or control structure, unless it is nested inside a method, attribute, nested class, or nested interface declaration,
  • a declaration with a specifier or initializer, unless it is nested inside a method, attribute, nested class, or nested interface declaration,
  • an object declaration with a non-empty initializer section, or
  • a specification or initialization statement for a member of the instance being initialized.

However, the declarations in this second section may freely use this and super, and may invoke any method, evaluate any attribute, or instantiate any member class of the class or its superclasses. Furthermore, the usual restriction that a declaration may only be used by code that appears later in the block containing the declaration is relaxed.

Note that the rules governing the declaration section of a class body are essentially the same rules governing the body of an interface. That makes sense, because interfaces don't have initialization logic — what interfaces and declaration sections have in common is statelessness.

Circular references

Unfortunately, these rules make it a little tricky to set up circular references between two objects without resorting to variable attributes. This is a problem Ceylon has in common with functional languages, which also emphasize immutability. We can't write the following code in Ceylon:

abstract class Child(Parent p) {
    shared Parent parent = p;
}

class Parent() {
    shared Child child = Child(this); //compile error (this passed as argument in initializer section)
}

Eventually, Ceylon will probably need some specialized machinery for dealing with this problem, but for now, here is a partial solution:

abstract class Child() {
    shared formal Parent parent;
}

class Parent() {
    shared object child extends Child() {
        shared actual Parent parent {
            return outer;
        }
    }    
}
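
The same shape can be sketched in Java, where an anonymous inner class stands in for the object declaration and ParentNode.this stands in for outer. This is only an analogy with hypothetical names (ParentNode, ChildNode): the child declares no parent attribute of its own and answers parent() by way of its enclosing instance.

```java
abstract class ChildNode {
    abstract ParentNode parent();
}

class ParentNode {
    // The child declares no parent field; it reaches the parent through
    // the enclosing-instance reference.
    final ChildNode child = new ChildNode() {
        @Override
        ParentNode parent() {
            return ParentNode.this;
        }
    };
}

class CircularDemo {
    public static void main(String[] args) {
        ParentNode p = new ParentNode();
        System.out.println(p.child.parent() == p);  // true
    }
}
```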

Definite initialization of methods

Ceylon lets us separate the declaration of a method defined using a method reference from the actual specification statement that specifies the method reference.

Float x = ... ;
Float op(Float y);
switch (symbol)
case ("+") { op = x.plus; }
case ("-") { op = x.minus; }
case ("*") { op = x.times; }
case ("/") { op = x.divided; }

The rules for definite initialization of locals and attributes also apply to methods defined using a specification statement.
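
A rough Java analogue, offered only as a sketch, declares a function-typed local and lets each branch of a switch specify it. The names OperatorDemo and operation are assumptions, and each lambda stands in for the Ceylon method reference (x.plus becomes y -> x + y).

```java
import java.util.function.DoubleUnaryOperator;

class OperatorDemo {
    // Picks the operation once, then returns it as a function of y.
    static DoubleUnaryOperator operation(String symbol, double x) {
        final DoubleUnaryOperator op;   // declared, not yet specified
        switch (symbol) {
            case "+": op = y -> x + y; break;
            case "-": op = y -> x - y; break;
            case "*": op = y -> x * y; break;
            case "/": op = y -> x / y; break;
            default:
                // Without this branch javac reports that op might not
                // have been initialized.
                throw new IllegalArgumentException("unknown operator: " + symbol);
        }
        return op;
    }

    public static void main(String[] args) {
        System.out.println(operation("-", 10.0).applyAsDouble(4.0));  // 6.0
    }
}
```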

Definite return

While we're on the topic, it's worth noting that the Ceylon compiler, just like the Java compiler, also performs definite return checking, to ensure that a method or getter always has an explicitly specified return value. So, this code compiles without error:

String greeting {
    if (person==me) {
        return "You're beautiful!";
    }
    else {
        return "You're ugly!";
    }
}

But the following code results in an error at compile time:

String greeting {   //error: greeting does not definitely return
    if (person==me) {
        return "You're beautiful!";
    }
}

There's more...

In Part 12, we're going to discuss annotations, and take a little peek at using the metamodel to build framework code.

18 comments:
 
21. May 2011, 09:16 CET | Link
Andrey

From what's written above it seems that in Ceylon there's no way of having two immutable objects that refer to each other. I.e. in Java

class A {
  final B b;

  A() {
    this.b = new B(this);
  }
}

class B {
  final A a;
  
  B(A a) {
    this.a = a;
  }
}

This pattern is limited in Java: one class must create an instance of the other in its constructor, but at least it can be managed with factories etc. In Ceylon it looks impossible, because I cannot pass my this anywhere.

This is a significant limitation, given that you explicitly promote immutability.

 
21. May 2011, 09:33 CET | Link
From what's written above it seems that in Ceylon there's no way of having two immutable objects that refer to each other.

Right, nicely picked up on.

My proposed fix for this is

  • introducing a fragile annotation, which has the effect of applying the same restrictions to a parameter that the this reference has in a constructor, and
  • allowing this to be passed to fragile parameters.

So this would compile:

class A() {
    B b = B(this);
}

class B(fragile A a) {
    A a { return a; }
}

But this would be an error:

class A() {
    B b = B(this); //compile error: cannot pass this to non-fragile
}

class B(A a) {
    A a { return a; }
}

As would this:

class A() {
    B b = B(this);
}

class B(fragile A a) {
    writeLine(a.string); //compile error: cannot invoke fragile a
}

I believe that this completely solves the problem of parent references. (Actually parent might be a better name for the annotation than fragile.)

 
21. May 2011, 09:34 CET | Link
which has the effect of applying the same restrictions to a parameter that the this reference has in a constructor

I mean in the initializer, of course.

 
21. May 2011, 09:56 CET | Link
Andrey
Gavin King wrote on May 21, 2011 03:33:
So this would compile:
class A() {
    B b = B(this);
}

class B(fragile A a) {
    A a { return a; }
}
I believe that this completely solves the problem of parent references. (Actually parent might be a better name for the annotation than fragile.)

Well, that seems to be broken:

class B(fragile A a) {
    shared A a { return a; }
}

class A() {
  B b = B(this);
  Int x = b.a.i; // What value?!
  shared Int i = 10;
}

The fragile annotation cannot work: it has temporal properties, which need some severe modification to the type system to be handled properly.

 
21. May 2011, 10:00 CET | Link
Well, that seems to be broken

You're right. As defined, the proposed solution is broken. You need to also somehow prevent invocation of b until the constructor of A completes. Let me see if I can find a way to formalize that.

 
21. May 2011, 10:03 CET | Link
Let me see if I can find a way to formalize that.

Is it as easy as saying that the result of an instantiation with a fragile argument is also fragile? So the code looks like this:

class A() {
    fragile B b = B(this);
}

class B(fragile A a) {
    A a { return a; }
}

i.e. fragility propagates.

 
21. May 2011, 10:08 CET | Link

Actually I think my original code is OK. I think the compiler can fairly easily detect that your What value?! line of code is problematic using purely local static analysis. It can see that b was produced using a fragile argument, and could reason that it must therefore treat b as fragile. I think it works out.

 
21. May 2011, 10:18 CET | Link
Andrey
Gavin King wrote on May 21, 2011 04:08:
Actually I think my original code is OK. I think the compiler can fairly easily detect that your What value?! line of code is problematic using purely local static analysis. It can see that b was produced using a fragile argument, and could reason that it must therefore treat b as fragile. I think it works out.
class B(fragile A a) {
  shared A a { return a; }
}

class C(fragile B b) {
  B b { return b; }

  shared void foo() {
    // Can I use my b.a here? How do I know it's not fragile any more?
  }
}

Another question: how granular is fragile? Can I write

A(B|fragile C param)
?

 
21. May 2011, 10:27 CET | Link
Can I use my b.a here? How do I know it's not fragile any more?

Yes, you can, because foo() occurs in the declaration section of C. So we know it is never called until after C finishes initializing.

By the way there's something wrong with your code example. The only place you can possibly get a fragile reference to a B is from inside the initializer of B. I'm not sure who is supposed to be instantiating C here, or why they have a fragile B.

Another question: how granular is fragile? Can I write
A(B|fragile C param)

No, because even if you could in theory do the necessary static analysis to deal with that, Ceylon doesn't let you include annotations inside a union type. You would have to write:

A(fragile B|C param)
 
21. May 2011, 10:32 CET | Link
Andrey

One more thing: it seems that you cannot allow any closures passed to higher-order functions to capture fragile values, because you don't know when those closures are going to be executed.

 
21. May 2011, 10:44 CET | Link
Andrey wrote on May 21, 2011 04:32:
One more thing: it seems that you cannot allow any closures passed to higher-order functions to capture fragile values, because you don't know when those closures are going to be executed.

What do you mean by closures here? A method or attribute definition? That's already handled by the rules, right? Recasting the rules slightly, to account for fragile:

The following restrictions apply to statements and declarations that appear within the initializer of a class:
  • They may not evaluate attributes or invoke methods that are declared later in the body of the class upon the instance that is being initialized, [or any attributes or methods of a fragile reference].
  • They may not pass a reference to [any] instance that is being initialized [i.e. this or a fragile reference] as an argument of an instantiation or method invocation or as the value of an attribute assignment.
 
21. May 2011, 11:50 CET | Link

Here's another variation. We could have parent and child annotations for attributes.

  • Both child and parent are treated as though they were formal attributes.
  • A child specifier may instantiate an instance of a class whose parent is the containing type.
  • A parent may not contain a specifier.

class A() {
    shared child B foo = B("foo");
    shared child B bar = B("bar");
}

class B(String s) {
    shared parent A a;
}

This approach avoids needing to wrestle with the problem of a fragile reference.

 
23. May 2011, 15:41 CET | Link
John DeHope | johndehope3(AT)gmail.com

Quick question, in the name of familiarity, could you make the initialization section appear within a constructor? Something like...

class Foo {
  Foo ( args... ) {
    // constructor aka initializer code goes here
  }
}

I like the simplification of no constructors in the traditional sense, and the way a single parameter list allows you to reason better about how the class will operate, but I think putting code in the root body of a class definition is going too far down the path of "look ma, no hands!"

 
23. May 2011, 19:39 CET | Link
Quick question, in the name of familiarity, could you make the initialization section appear within a constructor?

But, as I argue here, that just ends up really verbose. Which is why your Java IDE has a feature to auto-generate the constructor from the fields of the class. Also, it looks like the syntax to instantiate Foo should be Foo.Foo(args), according to our rules about block structure.

 
30. May 2011, 01:25 CET | Link
Gabriel Létourneau | gabriel.letourneau(AT)gmail.com

Maybe I missed it, but: What delineates the frontier between the initialization and the declaration sections? Is it the first shared or default?

 
30. May 2011, 02:22 CET | Link
Gabriel Létourneau wrote on May 29, 2011 19:25:
Maybe I missed it, but: What delineates the frontier between the initialization and the declaration sections? Is it the first shared or default?

It is the last non-declaration. i.e. the last control structure, specifier, expression statement, or object with a non-empty initializer. Sorry, that's actually not very clear from the text.

 
30. May 2011, 15:47 CET | Link
Gabriel Létourneau | gabriel.letourneau(AT)gmail.com
Gavin King wrote on May 29, 2011 20:22:
Gabriel Létourneau wrote on May 29, 2011 19:25:
Maybe I missed it, but: What delineates the frontier between the initialization and the declaration sections? Is it the first shared or default?
It is the last non-declaration. i.e. the last control structure, specifier, expression statement, or object with a non-empty initializer. Sorry, that's actually not very clear from the text.

OK. So in other words : I cannot access the public persona of the current instance as long as I'm not done initializing it — the only exception being "public" members already well-defined at a given point. Seems sound. Baby in the womb can't touch her feet.

BTW, am I right to think of unshared default and formal members as Java protected methods?

As a class designer, given the restrictions you put on them, I'd be very often tempted to use functional parameters instead. You give me first-class functions, I'll use them.

 
30. May 2011, 19:06 CET | Link
OK. So in other words : I cannot access the public persona of the current instance as long as I'm not done initializing it — the only exception being "public" members already well-defined at a given point. Seems sound.

It's not public vs non-public that makes the difference. But yeah, I think you get the idea.

BTW, am I right to think of unshared default and formal members as Java protected methods?

Naw, currently the language spec says that this combination is simply illegal.

As a class designer, given the restrictions you put on them, I'd be very often tempted to use functional parameters instead.

There would be absolutely nothing wrong with that. A formal member can often be replaced by a functional initializer parameter. With the nice syntax we have for named argument instantiations, that's actually often much more convenient for clients.
