Red Hat

In Relation To Ceylon

Introduction to Ceylon Part 9

Tagged as Ceylon

This is the ninth installment in a series of articles introducing the Ceylon language. Note that some features of the language may change before the final release.

Named arguments

Consider the following method:

void printf(OutputStream to, String format, Object... values) { ... }

(Remember, the last parameter is a sequenced parameter which accepts multiple arguments, just like a Java varargs parameter.)

We've seen lots of examples of invoking a method or instantiating a class using a familiar C-style syntax where arguments are delimited by parentheses and separated by commas. Arguments are matched to parameters by their position in the list. Let's see just one more example, just in case:

printf(process, "Thanks, %s. You have been charged %.2f. Your confirmation number is %d.", 
        user.name, order.total, order.confirmationNumber);

This works fine, I suppose. However, Ceylon provides an alternative method invocation protocol that is usually easier to read when there are more than one or two arguments:

printf { 
    to = process;
    format = "Thanks, %s. You have been charged %.2f. Your confirmation number is %d."; 
    user.name, order.total, order.confirmationNumber
};

This invocation protocol is called a named argument list. We can recognize a named argument list by the use of braces as delimiters instead of parentheses. Notice that arguments are separated by semicolons, except for arguments to the sequenced parameter, which are separated by commas. We explicitly specify the name of each parameter, except for the sequenced parameter, whose arguments always appear at the end of the named argument list. Note that it's also acceptable to call this method like this, passing a sequence to the sequenced parameter values by name:

printf { 
    to = process;
    format = "Thanks, %s. You have been charged %.2f. Your confirmation number is %d."; 
    values = { user.name, order.total, order.confirmationNumber };
};

We usually format named argument invocations across multiple lines.

Declarative object instantiation syntax

Named arguments are very commonly used for building graphs of objects. Therefore, Ceylon provides a special abbreviated syntax that simplifies the declaration of an attribute getter, named parameter, or method that builds an object by specifying named arguments to the class initializer.

We're allowed to abbreviate an attribute definition of the following form:

Payment payment = Payment { 
    method = user.paymentMethod;
    currency = order.currency; 
    amount = order.total;
};

or a named argument specification of this form:

payment = Payment { 
    method = user.paymentMethod; 
    currency = order.currency; 
    amount = order.total;
};

to the following more declarative (and less redundant) style:

Payment payment { 
    method = user.paymentMethod; 
    currency = order.currency; 
    amount = order.total;
}

We're even allowed to write a method of the following form:

Payment createPayment(Order order) { 
    return Payment {
        method = user.paymentMethod; 
        currency = order.currency; 
        amount = order.total;
    };
}

using the following abbreviated syntax:

Payment createPayment(Order order) { 
    method = user.paymentMethod; 
    currency = order.currency; 
    amount = order.total;
}

Perhaps you're worried that this looks like a method that assigns the values of three attributes of the declaring class, rather than a shortcut syntax for a named argument instantiation of the Payment class. And that's a very fair point. To a Java developer, that is what it looks like. There are two things that should alert you to what's really going on. The above method:

  • has no return statement, but it's not declared void, and
  • contains a list of = specification statements instead of := assignment expressions.

Once you're used to Ceylon's more flexible syntax, these differences will usually stand out immediately.

More about named arguments

The following classes define a data structure for building tables:

class Table(String title, Natural rows, Border border, Column... columns) { ... } 
class Column(String heading, Natural width, String content(Natural row)) { ... } 
class Border(Natural padding, Natural weight) { ... }

Of course, we could build a Table using positional argument lists:

String x(Natural row) { return row.string; }
String xSquared(Natural row) { return (row**2).string; }
Table table = Table("Squares", 5, Border(2,1), Column("x",10, x), Column("x**2",12, xSquared));

However, it's far more common to use named arguments to build a complex graph of objects. In this section we're going to meet some new features of named argument lists that make it especially convenient to build object graphs.

First, note that the syntax we've already seen for specifying a named argument value looks exactly like the syntax for refining a formal attribute. If you think about it, taking into account that a method parameter may accept references to other methods, the whole problem of specifying values for named parameters starts to look a lot like the problem of refining abstract members. Therefore, Ceylon will let us reuse much of the member declaration syntax inside a named argument list. (But note that this has not yet been implemented in the compiler.)

It's legal to include the following constructs in a named argument list:

  • method declarations, which specify the value of a parameter that accepts a function,
  • object (anonymous class) declarations, which are most useful for specifying the value of a parameter whose type is an interface or abstract class, and
  • getter declarations, which let us compute the value of an argument inline.

This helps explain why named argument lists are delimited by braces: the fully general syntax for a named argument list is very, very close to the syntax for a class, method, or attribute body. Notice, again, how flexibility derives from language regularity.

So we could rewrite the code that builds a Table as follows:

Table table = Table { 
    title="Squares";
    rows=5;
    border = Border {
        padding=2;
        weight=1;
    };
    Column {
        heading="x";
        width=10;
        String content(Natural row) {
            return row.string;
        }
    }, 
    Column {
        heading="x**2";
        width=12;
        String content(Natural row) {
            return (row**2).string;
        }
    }
};

Notice that we've specified the value of the parameter named content using the usual syntax for declaring a method.

Even better, our example can be abbreviated like this:

Table table { 
    title="Squares";
    rows=5;
    Border border {
        padding=2;
        weight=1;
    } 
    Column {
        heading="x";
        width=10;
        String content(Natural row) {
            return row.string;
        }
    }, 
    Column {
        heading="x**2"; 
        width=12;
        String content(Natural row) {
            return (row**2).string;
        }
    }
}

Notice how we've transformed our code from a form which emphasized invocation into a form that emphasizes declaration of a hierarchical structure. Semantically, the two forms are equivalent. But in terms of readability, they are very different.

We could put the above totally declarative code in a file by itself and it would look like some kind of mini-language for defining tables. In fact, it's executable Ceylon code that may be validated for syntactic correctness by the Ceylon compiler and then compiled to Java bytecode. Even better, the Ceylon IDE (when it exists) will provide authoring support for our mini-language. In complete contrast to the DSL support in some dynamic languages, any Ceylon DSL is completely typesafe! You can think of the definition of the Table, Column and Border classes as defining the schema or grammar of the mini-language. (In fact, they are really defining the syntax tree for the mini-language.)
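For readers coming from Java, here is a rough point of comparison: the same table built with nested constructor calls, which is the positional style. This is only a sketch. The Java classes below are hypothetical stand-ins for Table, Column and Border, and IntFunction<String> plays the role of the content(Natural row) parameter.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.IntFunction;

class TableDemo {
    // Hypothetical Java counterparts of the Border, Column and Table classes.
    static final class Border {
        final int padding;
        final int weight;
        Border(int padding, int weight) {
            this.padding = padding;
            this.weight = weight;
        }
    }

    static final class Column {
        final String heading;
        final int width;
        final IntFunction<String> content; // stands in for content(Natural row)
        Column(String heading, int width, IntFunction<String> content) {
            this.heading = heading;
            this.width = width;
            this.content = content;
        }
    }

    static final class Table {
        final String title;
        final int rows;
        final Border border;
        final List<Column> columns;
        Table(String title, int rows, Border border, Column... columns) {
            this.title = title;
            this.rows = rows;
            this.border = border;
            this.columns = Arrays.asList(columns);
        }
    }

    // Positional construction: the reader has to remember what 5, 2, 1, 10 and 12 mean.
    static Table squares() {
        return new Table("Squares", 5, new Border(2, 1),
                new Column("x", 10, row -> Integer.toString(row)),
                new Column("x**2", 12, row -> Integer.toString(row * row)));
    }

    public static void main(String[] args) {
        Table table = squares();
        for (int row = 1; row <= table.rows; row++) {
            StringBuilder line = new StringBuilder();
            for (Column column : table.columns) {
                line.append(column.content.apply(row)).append(' ');
            }
            System.out.println(line.toString().trim());
        }
    }
}
```

Nothing in the call site says which integer is the padding and which is the width, and that is the readability gap that the named argument form closes.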

Now let's see an example of a named argument list with an inline getter declaration:

shared class Payment(PaymentMethod method, Currency currency, Float amount) { ... }
Payment payment { 
    method = user.paymentMethod; 
    currency = order.currency; 
    Float amount {
        variable Float total := 0.0; 
        for (Item item in order.items) {
            total += item.quantity * item.product.unitPrice; 
        }
        return total;
    }
}

Finally, here's an example of a named argument list with an inline object declaration:

shared interface Observable { 
    shared void addObserver(Observer<Bottom> observer) { ... }
}
shared interface Observer<in Event> { 
    shared formal void on(Event event);
}
observable.addObserver { 
    object observer satisfies Observer<UpdateEvent> {
        shared actual void on(UpdateEvent e) { 
            writeLine("Update:" + e.string);
        }
    }
};

Note that Observer<T> is assignable to Observer<Bottom> for any type T, since Observer<T> is contravariant in its type parameter T. If this doesn't make sense, please read these two sections again. ;-)

Of course, as we saw in Part 8, a better way to solve this problem might be to eliminate the Observer interface and pass a method directly:

shared interface Observable { 
    shared void addObserver<Event>(void on(Event event)) { ... }
}
observable.addObserver {
    void on(UpdateEvent e) { 
        writeLine("Update:" + e.string);
    }
};

A quick tangent here: note that we need the type parameter Event of the method addObserver() only because Ceylon inherits Java's limitation that function types are nonvariant in their parameter types. This is actually pretty unnatural. We should probably eventually come up with a workaround to make function types contravariant in their parameter types, allowing us to write:

shared interface Observable { 
    shared void addObserver(void on(Bottom event)) { ... }
}

Defining user interfaces

One of the first modules we're going to write for Ceylon will be a library for writing HTML templates in Ceylon. A fragment of static HTML would look something like this:

Html { 
    Head head {
        title = "Hello World"; 
        cssStyleSheet = 'hello.css';
    } 
    Body body {
        Div { 
            cssClass = "greeting"; 
            "Hello World"
        },
        Div {
            cssClass = "footer"; 
            "Powered by Ceylon"
        }
    }
}

A complete HTML template might look like this:

import ceylon.html { ... }

doc "A web page that displays a greeting" 
page '/hello.html' 
Html hello(Request request) {
    
    Head head { 
        title = "Hello World"; 
        cssStyleSheet = 'hello.css';
    }
    
    Body body { 
        Div {
            cssClass = "greeting"; 
            Hello( request.parameters["name"] ).greeting
        }, 
        Div {
            cssClass = "footer"; 
            "Powered by Ceylon"
        }
    }

}

There's more...

There are plenty of potential applications of this syntax aside from user interface definition. For example, Ceylon lets us use a named argument list to specify the arguments of a program element annotation. But we'll have to come back to the subject of annotations in a future installment. In Part 10 we're going to discuss some of the basic types from the language module, in particular numeric types, and introduce the idea of operator polymorphism.

Introduction to Ceylon Part 8

Tagged as Ceylon

This is the eighth installment in a series of articles introducing the Ceylon language. Note that some features of the language may change before the final release.

This article was updated on 31/5/2011 to add information about partial application of methods.

First class and higher order functions

Ceylon isn't a functional language: it has variable attributes and so methods can have side effects. But Ceylon does let you use functions as values, which in some people's eyes makes the language a kind of hybrid. I'm not so sure about that. There's actually nothing at all new about having functions-as-values in an object oriented language — for example, Smalltalk, one of the first and still one of the cleanest object oriented languages, was built around this idea. (To my eyes, true functional programming is more about what you can't do — mutate values — than what you can do.) Anyway, Ceylon, like Smalltalk and a number of other object oriented languages, lets you treat a function as an object and pass it around the system.

In this installment, we're going to discuss Ceylon's support for first class and higher order functions. First class function support means the ability to treat a function as a value. A higher order function is a function that accepts other functions as arguments, or returns another function. It's clear that these two ideas go hand-in-hand, so I'll just talk about higher order function support from now on.

A quick disclaimer: none of the things in this installment have actually been implemented in the compiler yet.

Representing the type of a function

Ceylon is a (very) statically typed language. So if we're going to treat a function as a value, the very first question that arises is: what is the type of the function? We need a way to encode the return type and parameter types of a function into the type system. Remember that Ceylon doesn't have primitive types. A strong design principle is that every type should be representable within the type system as a class or interface declaration.

I suppose Ceylon could have gone down the road of some functional languages, and represented all functions with multiple parameters in curried form. So

Natural sum(Natural x, Natural y) { ... }

would just be an abbreviation of

Natural sum(Natural x)(Natural y) { ... }

i.e. a function with one parameter that returns another function with one parameter. Then we could have represented the type of the function like this:

Function<Natural,Function<Natural,Natural>>

But we've decided not to go down this path.
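To make the curried representation concrete, here is a small Java sketch of the same idea (Java 8 or later assumed). The names CurryDemo and SUM are made up for illustration, and Integer stands in for Natural.

```java
import java.util.function.Function;

class CurryDemo {
    // Curried sum: a one-parameter function that returns another one-parameter function.
    static final Function<Integer, Function<Integer, Integer>> SUM = x -> y -> x + y;

    public static void main(String[] args) {
        // Supplying only the first argument yields a new function.
        Function<Integer, Integer> addTwo = SUM.apply(2);
        System.out.println(addTwo.apply(3));
        System.out.println(SUM.apply(10).apply(20));
    }
}
```

The nested type on SUM has the same shape as Function<Natural,Function<Natural,Natural>> above, and each extra parameter adds another level of nesting.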

Some other languages have chosen to have a separate type for each function arity. So there's F0<R>, F1<R,P1>, F2<R,P1,P2>, F3<R,P1,P2,P3>, etc. But this solution feels kinda .... lame. Worse, it doesn't allow us to abstract over all function types, building up abstractions like Method and Class, etc. We're going to need to be able to do that kind of thing when we get to discussing the typesafe metamodel.

In Ceylon, a single type Callable abstracts all functions. Its declaration is the following:

shared interface Callable<out Result, Argument...> {}

The syntax P... is called a sequenced type parameter. By analogy with a sequenced parameter, which accepts zero or more values as arguments, a sequenced type parameter accepts zero or more types as arguments. The type parameter Result represents the return type of the function. The sequenced type parameter Argument... represents the parameter types of the function.

So the type of sum in Ceylon is:

Callable<Natural, Natural, Natural>

What about void functions? Well, remember that way back in Part 1 we said that the return type of a void function is Void. So the type of a function like print() is:

Callable<Void,String>

Representing the type of a method

Here we've been discussing first class functions. But in Ceylon all named declarations are first class. That is to say, they all have a reified metamodel representable within the type system. For example, we could represent the type of a method like this:

shared interface Method<out Result, in Instance, Argument...>
    satisfies Callable<Callable<Result,Argument...>, Instance> {}

Where Instance is the type that declares the method. So the type of the method iterator() of Iterable<String> would be:

Method<Iterator<String>, Iterable<String>>

And the type of the method compare() of Comparable<Natural> would be:

Method<Comparison,Comparable<Natural>,Natural>

Notice that we've declared a method to be a function that accepts a receiver object and returns a function. As a consequence of this, an alternative method invocation protocol is the following:

Iterable<String>.iterator(strings)();
Comparable<Natural>.compare(0)(num);
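A loose Java analogue of this receiver-first view is the unbound method reference, where the receiver becomes the first argument of the resulting function. The sketch below is not how Ceylon's Method is defined. It only illustrates the shape, and all the names in it are made up for illustration.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;

class MethodRefDemo {
    // Unbound references: the receiver is passed as the first argument.
    static final Function<Iterable<String>, Iterator<String>> ITERATOR = Iterable::iterator;
    static final BiFunction<Integer, Integer, Integer> COMPARE = Integer::compareTo;

    // Receiver-first, curried view: receiver -> (argument -> result).
    static final Function<Integer, Function<Integer, Integer>> CURRIED_COMPARE =
            receiver -> other -> receiver.compareTo(other);

    public static void main(String[] args) {
        List<String> strings = Arrays.asList("a", "b");
        Iterator<String> iterator = ITERATOR.apply(strings);
        System.out.println(iterator.next());
        System.out.println(COMPARE.apply(0, 5) < 0);
        System.out.println(CURRIED_COMPARE.apply(0).apply(5) < 0);
    }
}
```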

Don't worry if you can't make sense of that right now. And actually I'm skipping over some details here: that's not quite exactly how Method is defined. But we'll come back to this in a future installment. Let's get back to today's topic.

Defining higher order functions

We now have enough machinery to be able to write higher order functions. For example, we could create a repeat() function that repeatedly executes a function.

void repeat(Natural times, Callable<Void,Natural> perform) {
    for (Natural i in 1..times) {
        perform(i);
    }
}

And call it like this:

void print(Natural n) { writeLine(n); }
repeat(10, print);

Which would print the numbers 1 to 10 to the console.

There's one problem with this. In Ceylon, as we'll see later, we often call functions using named arguments, but the Callable type does not encode the names of the function parameters. So Ceylon has an alternative, more elegant, syntax for declaring a parameter of type Callable:

void repeat(Natural times, void perform(Natural n)) {
    for (Natural i in 1..times) {
        perform(i);
    }
}

I find this version also slightly more readable and more regular. This is the preferred syntax for defining higher-order functions.
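For comparison, a Java 8 sketch of repeat() might look like the following. IntConsumer stands in for Callable<Void,Natural>, and the collect() helper is a hypothetical addition that just records the values passed to the function.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

class RepeatDemo {
    // Calls perform once for each i in 1..times.
    static void repeat(int times, IntConsumer perform) {
        for (int i = 1; i <= times; i++) {
            perform.accept(i);
        }
    }

    static void print(int n) {
        System.out.println(n);
    }

    // Hypothetical helper: records every value that repeat() hands to the function.
    static List<Integer> collect(int times) {
        List<Integer> seen = new ArrayList<>();
        repeat(times, i -> seen.add(i));
        return seen;
    }

    public static void main(String[] args) {
        repeat(10, RepeatDemo::print); // a method reference, like the function reference print
    }
}
```

Like Callable, the Java functional interface records only the parameter and return types, not the parameter name.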

Function references

When a name of a function appears without any arguments, like print does above, it's called a function reference. A function reference is the thing that really has the type Callable. In this case, print has the type Callable<Void,Natural>.

Now, remember how we said that Void is both the return type of a void method, and also the logical root of the type hierarchy? Well that's useful here, since it means that we can assign a function with a non-Void return type to any parameter which expects a void method:

Boolean attemptPrint(Natural n) { 
    try {
        writeLine(n);
        return true;
    }
    catch (Exception e) {
        return false;
    }
}
repeat(10, attemptPrint);
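Java reaches a similar result by a different rule: a reference to a method that returns a value may be used where a void functional interface is expected, and the result is simply discarded. A sketch, where the successes counter is a hypothetical addition that makes the effect observable:

```java
import java.util.function.IntConsumer;

class VoidCompatDemo {
    static int successes = 0;

    static void repeat(int times, IntConsumer perform) {
        for (int i = 1; i <= times; i++) {
            perform.accept(i);
        }
    }

    // Returns a boolean, yet can still be passed where a void function is expected.
    static boolean attemptPrint(int n) {
        try {
            System.out.println(n);
            successes++;
            return true;
        } catch (RuntimeException e) {
            return false;
        }
    }

    static int run(int times) {
        successes = 0;
        repeat(times, VoidCompatDemo::attemptPrint); // the boolean result is ignored
        return successes;
    }

    public static void main(String[] args) {
        System.out.println(run(3) + " successful calls");
    }
}
```

In Ceylon the assignment works because Void is the root of the type hierarchy. In Java it is a special case in the rules for method references.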

Another way we can produce a function reference is by partially applying a method to a receiver expression. For example, we could write the following:

class Hello(String name) {
    shared void say(Natural n) {
        writeLine("Hello, " name ", for the " n "th time!");
    }
}

repeat(10, Hello("Gavin").say);

Here the expression Hello("Gavin").say has the same type as print above. It is a Callable<Void,Natural>.
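The closest Java counterpart is a bound method reference, which likewise packages a receiver together with a method. In this sketch the log field is a hypothetical addition so that the calls can be inspected afterwards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

class BoundRefDemo {
    static final class Hello {
        final String name;
        final List<String> log = new ArrayList<>(); // hypothetical: keeps what was said

        Hello(String name) {
            this.name = name;
        }

        void say(int n) {
            String line = "Hello, " + name + ", for the " + n + "th time!";
            log.add(line);
            System.out.println(line);
        }
    }

    static void repeat(int times, IntConsumer perform) {
        for (int i = 1; i <= times; i++) {
            perform.accept(i);
        }
    }

    public static void main(String[] args) {
        // The receiver expression is evaluated once, then say is called on it three times.
        repeat(3, new Hello("Gavin")::say);
    }
}
```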

More about higher-order functions

Let's see a more practical example, which mixes both ways of representing a function type. Suppose we have some kind of user interface component which can be observed by other objects in the system. We could use something like Java's Observer/Observable pattern:

shared interface Observer { 
    shared formal void observe(Event event);
}
shared abstract class Component() {
    
    OpenList<Observer> observers = OpenList<Observer>();
    
    shared void addObserver(Observer o) { 
        observers.append(o); 
    }
    
    shared void fire(Event event) { 
        for (Observer o in observers) { 
            o.observe(event);
        } 
    }

}

But now all event observers have to implement the interface Observer, which has just one method. Why don't we cut out the interface, and let event observers just register a function object as their event listener? In the following code, we define the addObserver() method to accept a function as a parameter.

shared abstract class Component() {
    
    OpenList<Callable<Void,Event>> observers = OpenList<Callable<Void,Event>>();
    
    shared void addObserver(void observe(Event event)) { 
        observers.append(observe); 
    }
    
    shared void fire(Event event) { 
        for (void observe(Event event) in observers) { 
            observe(event);
        } 
    }

}

Here we see the difference between the two ways of specifying a function type:

  • void observe(Event event) is more readable in parameter lists, where observe is the name of the parameter, but
  • Callable<Void,Event> is useful as a generic type argument.
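The same design translates fairly directly to Java, with Consumer<Event> standing in for Callable<Void,Event>. This is a sketch under some assumptions: Component is concrete here rather than abstract, and the Event and Listener classes are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class ComponentDemo {
    static final class Event {
        final String name;
        Event(String name) {
            this.name = name;
        }
    }

    static class Component {
        // The function type appears as a generic type argument, as with Callable<Void,Event>.
        private final List<Consumer<Event>> observers = new ArrayList<>();

        void addObserver(Consumer<Event> observe) {
            observers.add(observe);
        }

        void fire(Event event) {
            for (Consumer<Event> observe : observers) {
                observe.accept(event);
            }
        }
    }

    static final class Listener {
        final List<String> received = new ArrayList<>();

        void onEvent(Event e) {
            received.add(e.name); // respond to the event
        }
    }

    public static void main(String[] args) {
        Component component = new Component();
        Listener listener = new Listener();
        component.addObserver(listener::onEvent); // no Observer interface needed
        component.fire(new Event("update"));
        System.out.println(listener.received);
    }
}
```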

Now, any event observer can just pass a reference to one of its own methods to addObserver():

shared class Listener(Component component) {

    void onEvent(Event e) { 
        //respond to the event 
        ...
    } 
    
    component.addObserver(onEvent); 
    
    ...

}

When the name of a method appears in an expression without a list of arguments after it, it is a reference to the method, not an invocation of the method. Here, the expression onEvent is an expression of type Callable<Void,Event> that refers to the method onEvent().

If onEvent() were shared, we could even wire together the Component and Listener from some other code, to eliminate the dependency of Listener on Component:

shared class Listener() {

    shared void onEvent(Event e) { 
        //respond to the event 
        ...
    } 
    
    ...

}
void listen(Component component, Listener listener) {
    component.addObserver(listener.onEvent);
}

Here, the syntax listener.onEvent is a kind of partial application of the method onEvent(). It doesn't cause the onEvent() method to be executed (because we haven't supplied all the parameters yet). Rather, it results in a function that packages together the method reference onEvent and the method receiver listener.

It's also possible to declare a method that returns a function. A method that returns a function has multiple parameter lists. Let's consider adding the ability to remove observers from a Component. We could use a Subscription interface:

shared interface Subscription {
    shared formal void cancel();
}
shared abstract class Component() {
    
    ...
    
    shared Subscription addObserver(void observe(Event event)) { 
        observers.append(observe); 
        object subscription satisfies Subscription {
            shared actual void cancel() {
                observers.remove(observe);
            }
        }
        return subscription;
    }
    
    ...

}

But a simpler solution might be to just eliminate the interface and return the cancel() method directly:

shared abstract class Component() {
    
    ...
    
    shared void addObserver(void observe(Event event))() { 
        observers.append(observe); 
        void cancel() {
            observers.remove(observe);
        }
        return cancel;
    }
    
    ...

}

Note the second parameter list of addObserver().

Here, we define a method cancel() inside the body of the addObserver() method, and return a reference to the inner method from the outer method. The inner method cancel() can't be called directly from outside the body of the addObserver() method, since it is a block local declaration. But the reference to cancel() returned by addObserver() can be called by any code that obtains the reference.

Oh, in case you're wondering, the type of the method addObserver() is Callable<Callable<Void>,Component,Callable<Void,Event>>.

Notice that cancel() is able to use the parameter observe of addObserver(). We say that the inner method receives a closure of the non-variable locals and parameters of the outer method — just like a method of a class receives a closure of the class initialization parameters and locals of the class initializer. In general, any inner class, method, or attribute declaration always receives the closure of the members of the class, method, or attribute declaration in which it is enclosed. This is an example of how regular the language is.
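Here is how the same trick might look in Java, as a sketch: addObserver() returns a Runnable that closes over its own parameter. Events are plain strings here to keep the example short, which is an assumption of the sketch and not part of the design above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class SubscriptionDemo {
    static final class Component {
        private final List<Consumer<String>> observers = new ArrayList<>();

        // Returns a function that cancels this particular registration.
        Runnable addObserver(Consumer<String> observe) {
            observers.add(observe);
            // The lambda captures both observe and the enclosing component.
            return () -> observers.remove(observe);
        }

        void fire(String event) {
            // Iterate over a copy so an observer may cancel itself while being notified.
            for (Consumer<String> observe : new ArrayList<>(observers)) {
                observe.accept(event);
            }
        }
    }

    public static void main(String[] args) {
        Component component = new Component();
        List<String> seen = new ArrayList<>();
        Runnable cancel = component.addObserver(seen::add);
        component.fire("first");
        cancel.run(); // invoke the returned reference
        component.fire("second");
        System.out.println(seen);
    }
}
```

The returned function is the only handle on the registration, so whoever holds it decides when the observer is removed.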

We could invoke our method like this:

addObserver(onEvent)();

But if we were planning to use the method in this way, there would be no good reason for giving it two parameter lists. It's much more likely that we're planning to store or pass the reference to the inner method somewhere before invoking it.

void cancel() = addObserver(onEvent);
...
cancel();

The first line demonstrates how a method can be defined using a = specification statement, just like a simple attribute definition. The last line simply invokes the returned reference to cancel().

We've already seen how an attribute can be defined using a block of code. Now we see that a method can be defined using a specifier. So, if you like, you can start thinking of a method as an attribute of type Callable, that is, an attribute with parameters. Or if you prefer, you can think of an attribute as a member with zero parameter lists, and of a method as a member with one or more parameter lists. Either kind of member can be defined by reference, using =, or directly, by specifying a block of code to be executed.

Cool, huh? That's more regularity.

There's more...

As you've probably noticed, all the functions we've defined so far have been declared with a name, using a traditional C-like syntax. We still need to see what Ceylon has instead of anonymous functions (sometimes called lambda expressions) for making it easy to take advantage of functions like repeat() which define specialized control structures. But I've hit my word limit already. Instead, you can find a discussion here.

If you're interested to know more about programming with higher-order functions, you can read more about currying, uncurrying, and function composition.

In Part 9, we're finally going to talk about Ceylon's syntax for named argument lists and for defining user interfaces and structured data.

Is the Ceylon type system sound?

Tagged as Ceylon

So I've been reading some folks demanding that work on Ceylon start with a formal proof of the soundness of the type system. And calling me all sorts of names because I don't have one yet. I'm a bit bemused by this, since it's the first time in history that this has been demanded of a language designed for use in practical computing :-)

Nevertheless, I think the objection is interesting and merits some kind of response. So here's my take.

First of all, Ceylon has a very simple, super-conventional core type system. At core, the system is just:

  • parametric polymorphism with variance annotations, plus
  • mixin inheritance.

Now, we already know this basic type system to be sound. Other folks have already demonstrated that. In fact, if I understand correctly, the Scala guys have demonstrated the soundness of a significantly more complex type system with kinds in addition to mixin inheritance and generics.

There's just maybe five additional things that are defined primitively in the language spec:

  • union types,
  • reified generics,
  • attributes,
  • nested classes, and
  • higher-order functions.

Almost everything else you see in Ceylon is just sugar over the top of this basic scheme of parametric polymorphism + variance annotations + mixin inheritance. By that I mean you can re-express the other constructs in the language in terms of these more primitive notions. And that is, more or less, how the spec defines them (occasionally that might require some reading-between-the-lines).

That's one reason why Ceylon takes the approach of specifying things like operators in terms of equivalent Ceylon code. We don't have any special holes or primitive special cases in the type system. There's no primitive numeric types, arrays, raw types, built-in type promotions, or a primitive null. We've even eliminated overloading, which does seem to make soundness proofs more difficult.

And in fact, even most of the things in the list above could potentially be defined in terms of more primitive notions if necessary:

  • union types of interfaces can be interpreted as implicit interfaces (but union types of classes are a problem)
  • reified generics really just means an additional implicit parameter to generic constructors and methods (in fact, that's even how you need to implement it on the JVM)
  • attributes can be seen as syntactic sugar over a field + a method (in fact, that's even how you need to implement it on the JVM)
  • nested classes can be seen as syntactic sugar over a toplevel class (in fact, that's even how you need to implement it on the JVM)
  • higher-order function support can be seen as syntactic sugar over specially generated classes (in fact, that's even how you need to implement it on the JVM)

Get the picture? Even though these things are defined in primitive terms in the spec, we already know that they can be re-expressed in terms of other primitive constructs because we already need to be able to do that in order to compile the language to Java bytecode.

Indeed, I believe it would be quite easy to show that almost any Ceylon program could be mechanically translated to a known-sound language with parametric polymorphism + variance annotations + mixin inheritance. That spec would suffice in and of itself as a proof of the soundness of Ceylon's type system!

There is one major doubt I have. I'm not sure if the type systems that have been previously studied have included union types. It would be an interesting exercise for a CS student to take some of the existing soundness proofs out there and try adding union types. Not sure if someone has already done that.

Anyway, the point I'm trying to make is that you shouldn't need a full formal proof from first principles of the soundness of your type system in order to be pretty confident that it is sound. If the core type system is already known to be sound, due to the work of others, and if you know that some other construct can be re-expressed in terms of the simpler, core constructs, you have an informal proof of the soundness of that construct. Not quite good enough for an academic paper, perhaps, but good enough to start work on the compiler.

What do you think? Am I missing something here?

Introduction to Ceylon Part 7

Tagged as Ceylon

This is the seventh installment in a series of articles introducing the Ceylon language. Note that some features of the language may change before the final release.

Attributes and locals

In Java, a field of a class is quite easily distinguished from a local constant or variable of a method or constructor. Ceylon doesn't really make this distinction very strongly. An attribute is really just a local that happens to be captured by some shared declaration.

Here, count is a local variable of the initializer of Counter:

class Counter() {
    variable Natural count := 0;
}

But in the following two examples, count is an attribute:

class Counter() {
    shared variable Natural count := 0;
}
class Counter() {
    variable Natural count := 0;
    shared Natural inc() {
        return ++count;
    }
}

This might seem a bit strange at first, but it's really just how closure works. The same behavior applies to locals inside a method. Methods can't declare shared members, but they can return an object that captures a local:

interface Counter {
    shared formal Natural inc();
}
Counter createCounter() {
    variable Natural count := 0;
    object counter satisfies Counter {
        shared actual Natural inc() {
            return ++count;
        }
    }
    return counter;
}
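Java developers may want to compare this with the equivalent Java, sketched below. Java only lets an inner class or lambda capture effectively final locals, so the mutable count has to live in a one-element array (or some other holder object). The CounterDemo wrapper class is just scaffolding for the example.

```java
class CounterDemo {
    interface Counter {
        int inc();
    }

    static Counter createCounter() {
        // The array reference is final; its single element is the mutable state.
        final int[] count = {0};
        return new Counter() {
            @Override
            public int inc() {
                return ++count[0];
            }
        };
    }

    public static void main(String[] args) {
        Counter counter = createCounter();
        counter.inc();
        counter.inc();
        System.out.println(counter.inc());
    }
}
```

Each call to createCounter() produces a fresh captured count, so separate counters don't interfere with each other.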

Even though we'll continue to use the words local and attribute, keep in mind that there's no really strong distinction between the terms. Any named value might be captured by some other declaration in the same containing scope. (I'm still searching for a really good word to collectively describe attributes and locals.)

Variables

Ceylon encourages you to use immutable attributes as much as possible. An immutable attribute has its value specified when the object is initialized, and is never reassigned.

class Reference<Value>(Value x) {
    shared Value value = x;
}

If we want to be able to assign a value to a simple attribute or local we need to annotate it variable:

class Reference<Value>(Value x) {
    shared variable Value value := x;
}

Notice the use of := instead of = here. This is important! In Ceylon, specification of an immutable value is done using =. Assignment to a variable attribute or local is considered a different kind of thing, always performed using the := operator.

The = specifier is not an operator, and can never appear inside an expression. It's just a punctuation character. The following code is not only wrong, but even fails to parse:

if (x=true) {   //compile error
    ...
}

Setters

If we want to make an attribute with a getter mutable, we need to define a matching setter. Usually this is only useful if you have some other internal attribute you're trying to set the value of indirectly.

Suppose our class has the following simple attributes, intended for internal consumption only, so un-shared:

variable String? firstName := null;
variable String? lastName := null;

(Remember, Ceylon never automatically initializes attributes to null.)

Then we can abstract the simple attributes using a second attribute defined as a getter/setter pair:

shared String fullName {
    return " ".join(coalesce(firstName,lastName));
}

shared assign fullName {
    Iterator<String> tokens = fullName.tokens();
    firstName := tokens.head;
    lastName := tokens.rest.head;
}

A setter is identified by the keyword assign in place of a type declaration. (The type of the matching getter determines the type of the attribute.)

Yes, this is a lot like a Java get/set method pair, though the syntax is significantly streamlined. But since Ceylon attributes are polymorphic, and since you can redefine a simple attribute as a getter or getter/setter pair without affecting clients that call the attribute, you don't need to write getters and setters unless you're doing something special with the value you're getting or setting.
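For comparison, here is roughly what the fullName abstraction looks like as a Java get/set method pair. It is a sketch under assumptions: the enclosing class name Person is hypothetical, and the splitting rule (whitespace-separated, first two tokens) is our guess at what tokens() does.

```java
import java.util.ArrayList;
import java.util.List;

// Java sketch of the fullName getter/setter pair; the class name is hypothetical.
class Person {
    private String firstName; // may be null
    private String lastName;  // may be null

    String getFullName() {
        // Join only the non-null parts, like coalesce() followed by join().
        List<String> parts = new ArrayList<>();
        if (firstName != null) parts.add(firstName);
        if (lastName != null) parts.add(lastName);
        return String.join(" ", parts);
    }

    void setFullName(String fullName) {
        // Split on whitespace; a missing token leaves that part null.
        String[] tokens = fullName.trim().split("\\s+");
        firstName = tokens.length > 0 && !tokens[0].isEmpty() ? tokens[0] : null;
        lastName = tokens.length > 1 ? tokens[1] : null;
    }
}
```

Clients of this class have to call getFullName() and setFullName() from day one, which is exactly the boilerplate that polymorphic attributes let you postpone.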

Control structures

Ceylon has five built-in control structures. There's nothing much new here for Java or C# developers, so I'll just give a few quick examples without much additional commentary. However, one thing to be aware of is that Ceylon doesn't allow you to omit the braces in a control structure. The following doesn't parse:

if (x>100) bigNumber();

You are required to write:

if (x>100) { bigNumber(); }

OK, so here are the examples. The if/else statement is totally traditional:

if (x>1000) {
    reallyBigNumber(x);
}
else if (x>100) {
    bigNumber(x);
}
else {
    littleNumber();
}

The switch/case statement eliminates C's much-criticized fall through behavior and irregular syntax:

switch (x<=>100)
case (smaller) { littleNumber(); }
case (equal) { oneHundred(); }
case (larger) { bigNumber(); }

The for loop has an optional fail block, which is executed when the loop completes normally, rather than via a return or break statement. There's no C-style for.

Boolean minors;
for (Person p in people) {
    if (p.age<18) {
        minors = true;
        break;
    }
}
fail {
    minors = false;
}
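Java has no counterpart to the fail block, so the same search is usually written with a default value that the loop body may overwrite. A minimal sketch, with the hypothetical names MinorsDemo and hasMinors, and plain integer ages standing in for Person:

```java
import java.util.Arrays;
import java.util.List;

class MinorsDemo {
    // The "loop finished without break" case is expressed by
    // initializing the flag before the loop starts.
    static boolean hasMinors(List<Integer> ages) {
        boolean minors = false; // plays the role of the fail block
        for (int age : ages) {
            if (age < 18) {
                minors = true;
                break;
            }
        }
        return minors;
    }

    public static void main(String[] args) {
        System.out.println(hasMinors(Arrays.asList(34, 17, 52)));
        System.out.println(hasMinors(Arrays.asList(34, 52)));
    }
}
```

The Java flag has to be a mutable local, whereas the Ceylon version specifies minors exactly once on every path.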

The while and do/while loops are traditional.

variable local it = names.iterator();
while (exists String name = it.head) {
    writeLine(name);
    it:=it.tail;
}

The try/catch/finally statement works like Java's:

try {
    message.send();
}
catch (ConnectionException|MessageException e) {
    tx.setRollbackOnly();
}

And try supports a resource expression similar to Java 7.

try (Transaction()) {
    try (Session s = Session()) {
        s.persist(person);
    }
}
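The Java 7 construct being referred to is try-with-resources. The sketch below uses a made-up Resource class (not a real transaction or session API) to show that resources are closed automatically, in the reverse of the order in which they were opened:

```java
import java.util.ArrayList;
import java.util.List;

class ResourceDemo {
    static final List<String> log = new ArrayList<>();

    // Hypothetical resource that records when it is opened and closed.
    static class Resource implements AutoCloseable {
        private final String name;
        Resource(String name) { this.name = name; log.add("open " + name); }
        @Override public void close() { log.add("close " + name); }
    }

    static List<String> run() {
        log.clear();
        // Both resources are closed when the block exits, last one first.
        try (Resource tx = new Resource("transaction");
             Resource session = new Resource("session")) {
            log.add("persist");
        }
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        for (String line : run()) {
            System.out.println(line);
        }
    }
}
```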

Sequenced parameters

A sequenced parameter of a method or class is declared using an ellipsis. There may be only one sequenced parameter for a method or class, and it must be the last parameter.

void print(String... strings) { ... }

Inside the method body, the parameter strings has type String[].

void print(String... strings) {
    for (String string in strings) {
        write(string);
    }
    writeLine();
}

A slightly more sophisticated example is the coalesce() method we saw above. coalesce() accepts X?[] and eliminates nulls, returning X[], for any type X. Its signature is:

shared Value[] coalesce<Value>(Value?... sequence) { ... }
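To make the behavior concrete, here is a Java varargs sketch of the same idea. It is not the Ceylon implementation: the class name CoalesceDemo is hypothetical, and Java cannot express the X? to X change in the types, so the null check is purely a runtime matter.

```java
import java.util.ArrayList;
import java.util.List;

class CoalesceDemo {
    // Varargs sketch of coalesce(): drops nulls and keeps the order.
    @SafeVarargs
    static <T> List<T> coalesce(T... values) {
        List<T> result = new ArrayList<>();
        for (T value : values) {
            if (value != null) {
                result.add(value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(coalesce("Gavin", null, "King"));
    }
}
```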

Sequenced parameters turn out to be especially interesting when used in named argument lists for defining user interfaces or structured data.

Packages and imports

There's no special package statement in Ceylon. The compiler determines the package and module to which a toplevel program element belongs by the location of the source file in which it is declared. A class named Hello in the package org.jboss.hello must be defined in the file org/jboss/hello/Hello.ceylon.

When a source file in one package refers to a toplevel program element in another package, it must explicitly import that program element. Ceylon, unlike Java, does not support the use of qualified names within the source file. We can't write org.jboss.hello.Hello in Ceylon.

The syntax of the import statement is slightly different to Java. To import a program element, we write:

import org.jboss.hello { Hello }

To import several program elements from the same package, we write:

import org.jboss.hello { Hello, defaultHello, PersonalizedHello }

To import all toplevel program elements of a package, we write:

import org.jboss.hello { ... }

To resolve a name conflict, we can rename an imported declaration:

import org.jboss.hello { local Hi = Hello, ... }

We think renaming is a much cleaner solution than the use of qualified names.

There's more...

Now that we've mopped up a few missing topics, we're ready to look at first class functions in Part 8, and the declarative object-tree-builder syntax for defining user interfaces and structured data in Part 9.

If you're interested, the Ceylon module system is described briefly here.

Introduction to Ceylon Part 6

Tagged as Ceylon

This is the sixth installment in a series of articles introducing the Ceylon language. Note that some features of the language may change before the final release.

Defining generic types

We've seen plenty of parameterized types in this series of articles, but now let's explore a few more details.

Programming with generic types is one of the most difficult parts of Java. That's still true, to some extent, in Ceylon. But because the Ceylon language and SDK were designed for generics from the ground up, Ceylon is able to alleviate the most painful aspects of Java's bolted-on-later model.

Just like in Java, only types and methods may declare type parameters. Also just like in Java, type parameters are listed before ordinary parameters, enclosed in angle brackets.

shared interface Iterator<out Element> { ... }
class Array<Element>(Element... elements) satisfies Sequence<Element> { ... }
shared Entries<Natural,Value> entries<Value>(Value... sequence) { ... }

As you can see, the convention in Ceylon is to use meaningful names for type parameters.

Unlike Java, we always do need to specify type arguments in a type declaration (there are no raw types in Ceylon). The following will not compile:

Iterator it = ...;   //error: missing type argument to parameter Element of Iterator

Instead, we have to write out the type argument:

Iterator<String> it = ...;

On the other hand, we shouldn't need to explicitly specify type arguments in most method invocations or class instantiations. In principle it's very often possible to infer the type arguments from the ordinary arguments. The following code should be possible, just like it is in Java:

Array<String> strings = Array("Hello", "World");
Entries<Natural,String> entries = entries(strings);

But we haven't yet figured out what exactly the type inference algorithm will be (probably something involving union types!) and so the Ceylon compiler currently requires that all type arguments be explicitly specified like this:

Array<String> strings = Array<String>("Hello", "World");
Entries<Natural,String> entries = entries<Natural,String>(strings);

On the other hand, the following code does already compile:

local strings = Array<String>("Hello", "World");
local entries = entries<Natural,String>(strings);

The root cause of very many problems when working with generic types in Java is type erasure. Generic type parameters and arguments are discarded by the compiler, and simply aren't available at runtime. So the following, perfectly sensible, code fragments just wouldn't compile in Java:

if (is List<Person> list) { ... }
if (is Element obj) { ... }

(Where Element is a generic type parameter.)
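A small Java sketch makes the effect of erasure visible. The class name ErasureDemo is just for illustration; the point is that two lists with different type arguments are indistinguishable at runtime.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Both lists share one runtime class: the type arguments are erased.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        // For a plain Object reference obj, javac rejects the test
        //     obj instanceof List<String>
        // and only accepts the unbounded form  obj instanceof List<?>.
    }
}
```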

A major goal of Ceylon's type system is support for reified generics. Like Java, the Ceylon compiler performs erasure, discarding type parameters from the schema of the generic type. But unlike Java, type arguments are supposed to be reified (available at runtime). Of course, generic type arguments won't be checked for typesafety by the underlying virtual machine at runtime, but type arguments are at least available at runtime to code that wants to make use of them explicitly. So the code fragments above are supposed to compile and function as expected. You will even be able to use reflection to discover the type arguments of an instance of a generic type.

The bad news is we haven't implemented this yet ;-)

Finally, Ceylon eliminates one of the bits of Java generics that's really hard to get your head around: wildcard types. Wildcard types were Java's solution to the problem of covariance in a generic type system. Let's first explore the idea of covariance, and then see how covariance in Ceylon works.

Covariance and contravariance

It all starts with the intuitive expectation that a collection of Geeks is a collection of Persons. That's a reasonable intuition, but especially in non-functional languages, where collections can be mutable, it turns out to be incorrect. Consider the following possible definition of Collection:

shared interface Collection<Element> { 
    shared formal Iterator<Element> iterator(); 
    shared formal void add(Element x);
}

And let's suppose that Geek is a subtype of Person. Reasonable.

The intuitive expectation is that the following code should work:

Collection<Geek> geeks = ... ; 
Collection<Person> people = geeks;    //compiler error 
for (Person person in people) { ... }

This code is, frankly, perfectly reasonable taken at face value. Yet in both Java and Ceylon, this code results in a compiler error at the second line, where the Collection<Geek> is assigned to a Collection<Person>. Why? Well, because if we let the assignment through, the following code would also compile:

Collection<Geek> geeks = ... ; 
Collection<Person> people = geeks;    //compiler error 
people.add( Person("Fonzie") );

We can't let that code by — Fonzie isn't a Geek!

Using big words, we say that Collection is nonvariant in Element. Or, when we're not trying to impress people with opaque terminology, we say that Collection both produces — via the iterator() method — and consumes — via the add() method — the type Element.

Here's where Java goes off and dives down a rabbit hole, successfully using wildcards to squeeze a covariant or contravariant type out of a nonvariant type, but also succeeding in thoroughly confusing everybody. We're not going to follow Java down the hole.

Instead, we're going to refactor Collection into a pure producer interface and a pure consumer interface:

shared interface Producer<out Output> { 
    shared formal Iterator<Output> iterator();
}
shared interface Consumer<in Input> { 
    shared formal void add(Input x);
}

Notice that we've annotated the type parameters of these interfaces.

  • The out annotation specifies that Producer is covariant in Output; that it produces instances of Output, but never consumes instances of Output.
  • The in annotation specifies that Consumer is contravariant in Input; that it consumes instances of Input, but never produces instances of Input.

The Ceylon compiler validates the schema of the type declaration to ensure that the variance annotations are satisfied. If you try to declare an add() method on Producer, a compilation error results. If you try to declare an iterator() method on Consumer, you get a similar compilation error.

Now, let's see what that buys us:

  • Since Producer is covariant in its type parameter Output, and since Geek is a subtype of Person, Ceylon lets you assign Producer<Geek> to Producer<Person>.
  • Furthermore, since Consumer is contravariant in its type parameter Input, and since Geek is a subtype of Person, Ceylon lets you assign Consumer<Person> to Consumer<Geek>.

We can define our Collection interface as a mixin of Producer with Consumer.

shared interface Collection<Element> 
        satisfies Producer<Element> & Consumer<Element> {}

Notice that Collection remains nonvariant in Element. If we tried to add a variance annotation to Element in Collection, a compile time error would result.

Now, the following code finally compiles:

Collection<Geek> geeks = ... ; 
Producer<Person> people = geeks; 
for (Person person in people) { ... }

Which matches our original intuition.

The following code also compiles:

Collection<Person> people = ... ; 
Consumer<Geek> geekConsumer = people; 
geekConsumer.add( Geek("James") );

Which is also intuitively correct — James is most certainly a Person!
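For readers who know the Java side of the story, the same two assignments can be written with use-site wildcards. This is a comparison sketch only: the Person and Geek classes and the method names are hypothetical stand-ins, and java.util.List plays the role of Collection.

```java
import java.util.ArrayList;
import java.util.List;

class Person {
    final String name;
    Person(String name) { this.name = name; }
}

class Geek extends Person {
    Geek(String name) { super(name); }
}

class WildcardDemo {
    // Covariant use: we only read Person values out of the list.
    static int countPeople(List<? extends Person> people) {
        int n = 0;
        for (Person p : people) {
            n++;
        }
        return n;
    }

    // Contravariant use: we only put Geek values into the list.
    static void addGeek(List<? super Geek> sink, String name) {
        sink.add(new Geek(name));
    }

    public static void main(String[] args) {
        List<Geek> geeks = new ArrayList<>();
        List<Person> people = new ArrayList<>();
        addGeek(geeks, "James");
        addGeek(people, "James"); // a List<Person> accepts Geeks
        System.out.println(countPeople(geeks) + countPeople(people));
    }
}
```

In Java the variance has to be restated at every use of the type, whereas the in and out annotations state it once, on the declaration.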

There are two additional things that follow from the definition of covariance and contravariance:

  • Producer<Void> is a supertype of Producer<T> for any type T, and
  • Consumer<Bottom> is a supertype of Consumer<T> for any type T.

These invariants can be very helpful if you need to abstract over all Producers or all Consumers. (Note, however, that if Producer declared upper bound type constraints on Output, then Producer<Void> would not be a legal type.)

You're unlikely to spend much time writing your own collection classes, since the Ceylon SDK has a powerful collections framework built in. But you'll still appreciate Ceylon's approach to covariance as a user of the built-in collection types. The collections framework defines two interfaces for each basic kind of collection. For example, there's an interface List<Element> which represents a read-only view of a list, and is covariant in Element, and OpenList<Element>, which represents a mutable list, and is nonvariant in Element.

Generic type constraints

Very commonly, when we write a parameterized type, we want to be able to invoke methods or evaluate attributes upon instances of the type parameter. For example, if we were writing a parameterized type Set<Element>, we would need to be able to compare instances of Element using == to see if a certain instance of Element is contained in the Set. Since == is only defined for expressions of type Equality, we need some way to assert that Element is a subtype of Equality. This is an example of a type constraint — in fact, it's an example of the most common kind of type constraint, an upper bound.

shared class Set<out Element>(Element... elements) 
        given Element satisfies Equality {
    ...

    shared Boolean contains(Object obj) { 
        if (is Element obj) {
            return obj in bucket(obj.hash);
        }
        else {
            return false;
        }
    }

}

A type argument to Element must be a subtype of Equality.

Set<String> set = Set("C", "Java", "Ceylon"); //ok
Set<String?> set = Set("C", "Java", "Ceylon", null); //compile error

In Ceylon, a generic type parameter is considered a proper type, so a type constraint looks a lot like a class or interface declaration. This is another way in which Ceylon is more regular than some other C-like languages.

An upper bound lets us call methods and attributes of the bound, but it doesn't let us instantiate new instances of Element. Once we implement reified generics, we'll be able to add a new kind of type constraint to Ceylon. An initialization parameter specification lets us actually instantiate the type parameter.

shared class Factory<out Result>() 
        given Result(String s) {

    shared Result produce(String string) { 
        return Result(string);
    }

}

A type argument to Result of Factory must be a class with a single initialization parameter of type String.

Factory<Hello> factory = Factory<PersonalizedHello>(); //ok
Factory<Hello> factory = Factory<DefaultHello>(); //compile error

A third kind of type constraint is an enumerated type bound, which constrains the type argument to be one of an enumerated list of types. It lets us write an exhaustive switch on the type parameter:

Value sqrt<Value>(Value x) 
        given Value of Float | Decimal {
    switch (Value)
    case (satisfies Float) {
        return sqrtFloat(x);
    } 
    case (satisfies Decimal) {
        return sqrtDecimal(x);
    }
}

This is one of the workarounds we mentioned earlier for Ceylon's lack of overloading.

Finally, the fourth kind of type constraint, which is much less common, and which most people find much more confusing, is a lower bound. A lower bound is the opposite of an upper bound. It says that a type parameter is a supertype of some other type. There's only really one situation where this is useful. Consider adding a union() operation to our Set interface. We might try the following:

shared class Set<out Element>(Element... elements) 
        given Element satisfies Equality {
    ...
    
    shared Set<Element> union(Set<Element> set) {   //compile error
        return ....
    }
    
}

This doesn't compile because we can't use the covariant type parameter Element in the type declaration of a method parameter. The following declaration would compile:

shared class Set<out Element>(Element... elements) 
        given Element satisfies Equality {
    ...
    
    shared Set<Object> union(Set<Object> set) { 
        return ....
    }
    
}

But, unfortunately, we get back a Set<Object> no matter what kind of set we pass in. A lower bound is the solution to our dilemma:

shared class Set<out Element>(Element... elements) 
        given Element satisfies Equality {
    ...
    
    shared Set<UnionElement> union(Set<UnionElement> set) 
            given UnionElement abstracts Element {
        return ...
    }
    
}

With type inference, the compiler chooses an appropriate type argument to UnionElement for the given argument to union():

Set<String> strings = Set("abc", "xyz") ; 
Set<String> moreStrings = Set("foo", "bar", "baz"); 
Set<String> allTheStrings = strings.union(moreStrings);
Set<Decimal> decimals = Set(1.2.decimal, 3.67.decimal) ; 
Set<Float> floats = Set(0.33, 22.0, 6.4); 
Set<Number> allTheNumbers = decimals.union(floats);
Set<Hello> hellos = Set( DefaultHello(), PersonalizedHello(name) ); 
Set<Object> objects = Set("Gavin", 12, true); 
Set<Object> allTheObjects = hellos.union(objects);
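Java has no lower bounds on method type parameters (a declaration like <U super E> is not allowed), so the closest Java rendering is a static method over two wildcard sets. The sketch below assumes java.util.Set in place of the article's Set class, and UnionDemo is a hypothetical name.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class UnionDemo {
    // The result type U is a common supertype of both element types;
    // the compiler infers it from the arguments and the target type.
    static <U> Set<U> union(Set<? extends U> left, Set<? extends U> right) {
        Set<U> result = new HashSet<U>(left);
        result.addAll(right);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> ints = new HashSet<>(Arrays.asList(1, 2));
        Set<Double> doubles = new HashSet<>(Arrays.asList(0.5, 2.0));
        Set<Number> numbers = union(ints, doubles); // U inferred as Number
        System.out.println(numbers.size());
    }
}
```

The cost of the workaround is that union() can no longer be an instance method of the set itself, which is what the abstracts constraint allows.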

There's more...

I was about to start talking about sequenced type parameters, the foundation of Ceylon's typesafe metamodel. But I realize I already hit my word limit. If you're really impatient, you can skip forward to Part 8.

In Part 7 we're going to back up a bit and cover a couple of topics that got kinda glossed over.

Introduction to Ceylon Part 5

Tagged as Ceylon

This is the fifth installment in a series of articles introducing the Ceylon language. Note that some features of the language may change before the final release.

Narrowing the type of an object reference

In any language with subtyping there is the hopefully occasional need to perform narrowing conversions. In most statically-typed languages, this is a two-part process. For example, in Java, we first test the type of the object using the instanceof operator, and then attempt to downcast it using a C-style typecast. This is quite curious, since there are virtually no good uses for instanceof that don't involve an immediate cast to the tested type, and typecasts without type tests are dangerously non-typesafe.

As you can imagine, Ceylon, with its emphasis upon static typing, does things differently. Ceylon doesn't have C-style typecasts. Instead, we must test and narrow the type of an object reference in one step, using the special if (is ... ) construct. This construct is very, very similar to if (exists ... ) and if (nonempty ... ), which we met earlier.

Object obj = ... ; 
if (is Hello obj) {
    obj.say();
}

The switch statement can be used in a similar way:

Object obj = ... ; 
switch(obj) 
case (is Hello) {
    obj.say();
} 
case (is Person) {
    stream.writeLine(obj.firstName);
} 
else {
    stream.writeLine("Some miscellaneous thing");
}

These constructs protect us from inadvertently writing code that would cause a ClassCastException in Java, just like if (exists ... ) protects us from writing code that would cause a NullPointerException.
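The two-step Java idiom being contrasted here looks like the following sketch. The Hello class and the NarrowDemo helper are hypothetical stand-ins, written in the classic instanceof-plus-cast style.

```java
class Hello {
    String say() { return "Hello, World!"; }
}

class NarrowDemo {
    // The classic Java two-step: test with instanceof, then cast.
    static String describe(Object obj) {
        if (obj instanceof Hello) {
            Hello hello = (Hello) obj; // the cast repeats the tested type
            return hello.say();
        }
        return "Some miscellaneous thing";
    }

    // Nothing stops a cast without a test; it fails only at runtime.
    static boolean uncheckedCastFails(Object obj) {
        try {
            ((Hello) obj).say();
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(new Hello()));
        System.out.println(uncheckedCastFails("not a Hello"));
    }
}
```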

More about union types

We've seen a few examples of how ad-hoc union types are used in Ceylon. Let's just revisit the notion to make sure we completely understand it. When I declare the type of something using a union type X|Y, I'm saying that only expressions of type X and expressions of type Y are assignable to it. The type X|Y is a supertype of both X and Y. The following code is well-typed:

void print(String|Natural|Integer val) { ... }

print("hello");
print(69);
print(-1);

But what operations does a type like String|Natural|Integer have? What are its supertypes? Well, the answer is pretty intuitive: T is a supertype of X|Y if and only if it is a supertype of both X and Y. The Ceylon compiler determines this automatically. So the following code is also well-typed:

Natural|Integer i = ... ;
Number num = i;
String|Natural|Integer val = i;
Object obj = val;

However, val is not assignable to num, since Number is not a supertype of String.

Of course, it's very common to narrow an expression of union type using a switch statement. Usually, the Ceylon compiler forces us to write an else clause in a switch, to remind us that there might be additional cases which we have not handled. But if we exhaust all cases of a union type, the compiler will let us leave off the else clause.

void print(String|Natural|Integer val) {
    switch (val)
    case (is String) { writeLine(val); }
    case (is Natural) { writeLine("Natural: " + val); }
    case (is Integer) { writeLine("Integer: " + val); }
}

Enumerated subtypes

Sometimes it's useful to be able to do the same kind of thing with the subtypes of an ordinary type. First, we need to explicitly enumerate the subtypes of the type using the of clause:

abstract class Hello() 
        of DefaultHello | PersonalizedHello { 
    ...
}

(This makes Hello into Ceylon's version of what the functional programming community calls an algebraic data type.)

Now the compiler won't let us declare additional subclasses of Hello, and so the union type DefaultHello|PersonalizedHello is exactly the same type as Hello. Therefore, we can write switch statements without an else clause:

Hello hello = ... ; 
switch (hello) 
case (is DefaultHello) {
    writeLine("What's your name?");
} 
case (is PersonalizedHello) {
    writeLine("Nice to hear from you again!");
}

Now, it's usually considered bad practice to write long switch statements that handle all subtypes of a type. It makes the code non-extensible. Adding a new subclass to Hello means breaking all the switch statements that exhaust its subtypes. In object-oriented code, we usually try to refactor constructs like this to use an abstract method of the superclass that is overridden as appropriate by subclasses.

However, there is a class of problems where this kind of refactoring isn't appropriate. In most object-oriented languages, these problems are usually solved using the visitor pattern.

Visitors

Let's consider the following tree visitor implementation:

abstract class Node() {
    shared formal void accept(Visitor v);
}
class Leaf(Object val) extends Node() {
    shared Object value = val;
    shared actual void accept(Visitor v) { 
        v.visitLeaf(this); 
    }
}
class Branch(Node left, Node right) extends Node() {
    shared Node leftChild = left;
    shared Node rightChild = right;
    shared actual void accept(Visitor v) { 
        v.visitBranch(this);
    }
}
interface Visitor {
    shared formal void visitLeaf(Leaf l);
    shared formal void visitBranch(Branch b);
}

We can create a method which prints out the tree by implementing the Visitor interface:

void print(Node node) {
    object printVisitor satisfies Visitor {
        shared actual void visitLeaf(Leaf l) {
            writeLine("Found a leaf: " l.value "!");
        }
        shared actual void visitBranch(Branch b) {
            b.leftChild.accept(this);
            b.rightChild.accept(this);
        }
    }
    node.accept(printVisitor);
}

Notice that the code of printVisitor looks just like a switch statement. It must explicitly enumerate all subtypes of Node. It breaks if we add a new subtype of Node to the Visitor interface. This is correct, and is the desired behavior. By break, I mean that the compiler lets us know that we have to update our code to handle the new subtype.

In Ceylon, we can achieve the same effect, with less verbosity, by enumerating the subtypes of Node in its definition, and using a switch:

abstract class Node() of Leaf | Branch {}
class Leaf(Object val) extends Node() {
    shared Object value = val;
}
class Branch(Node left, Node right) extends Node() {
    shared Node leftChild = left;
    shared Node rightChild = right;
}

Our print() method is now much simpler, but still has the desired behavior of breaking when a new subtype of Node is added.

void print(Node node) {
    switch (node)
    case (is Leaf) {
        writeLine("Found a leaf: " node.value "!");
    }
    case (is Branch) {
        print(node.leftChild);
        print(node.rightChild);
    }
}

Typesafe enumerations

Ceylon doesn't have anything exactly like Java's enum declaration. But we can emulate the effect using the of clause.

shared class Suit(String name) 
        of hearts | diamonds | clubs | spades 
        extends Case(name) {}
        
shared object hearts extends Suit("hearts") {} 
shared object diamonds extends Suit("diamonds") {} 
shared object clubs extends Suit("clubs") {} 
shared object spades extends Suit("spades") {}

We're allowed to use the names of object declarations in the of clause if they extend the language module class Case.

Now we can exhaust all cases of Suit in a switch:

void print(Suit suit) {
    switch (suit)
    case (hearts) { writeLine("Heartzes"); }
    case (diamonds) { writeLine("Diamondzes"); }
    case (clubs) { writeLine("Clidubs"); }
    case (spades) { writeLine("Spidades"); }
}

(Note that these cases are ordinary value cases, not case (is...) type cases.)

Yes, this is a bit more verbose than a Java enum, but it's also slightly more flexible.
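For reference, this is the Java enum being compared against, sketched with a hypothetical SuitDemo helper. Note that a classic Java switch statement is not checked for exhaustiveness, so the method still needs a fallback branch.

```java
// Java enum counterpart of the Suit enumeration, for comparison.
enum Suit { HEARTS, DIAMONDS, CLUBS, SPADES }

class SuitDemo {
    static String nickname(Suit suit) {
        switch (suit) {
            case HEARTS:   return "Heartzes";
            case DIAMONDS: return "Diamondzes";
            case CLUBS:    return "Clidubs";
            case SPADES:   return "Spidades";
            default:
                // Without this branch the method does not compile,
                // even though every constant is already handled.
                throw new IllegalArgumentException("unknown suit: " + suit);
        }
    }

    public static void main(String[] args) {
        System.out.println(nickname(Suit.CLUBS));
    }
}
```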

For a more practical example, let's see the definition of Boolean from the language module:

shared abstract class Boolean(String name) 
        of true | false 
        extends Case(name) {}
shared object false extends Boolean("false") {}
shared object true extends Boolean("true") {}

And here's how Comparable is defined. First, the typesafe enumeration Comparison:

doc "The result of a comparison between two
     Comparable objects."
shared abstract class Comparison(String name) 
        of larger | smaller | equal 
        extends Case(name) {}
doc "The receiving object is exactly equal 
     to the given object."
shared object equal extends Comparison("equal") {}
doc "The receiving object is smaller than 
     the given object."
shared object smaller extends Comparison("smaller") {}
doc "The receiving object is larger than 
     the given object."
shared object larger extends Comparison("larger") {}

Now, the Comparable interface itself:

shared interface Comparable<in Other> 
        satisfies Equality
        given Other satisfies Comparable<Other> {
    
    doc "The <=> operator."
    shared formal Comparison compare(Other other);
    
    doc "The > operator."
    shared Boolean largerThan(Other other) {
        return compare(other)==larger;
    }
    
    doc "The < operator."
    shared Boolean smallerThan(Other other) {
        return compare(other)==smaller;
    }
    
    doc "The >= operator."
    shared Boolean asLargeAs(Other other) {
        return compare(other)!=smaller;
    }
    
    doc "The <= operator."
    shared Boolean asSmallAs(Other other) {
        return compare(other)!=larger;
    }
    
}
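The way the four derived operations fall out of a single compare() method can be tried in Java with an enum and default methods. This is a sketch of the pattern only: Ordered and Version are hypothetical names, and the self-referential constraint on Other is left out for brevity.

```java
enum Comparison { LARGER, SMALLER, EQUAL }

// Hypothetical Java rendering of the Comparable interface above.
interface Ordered<T> {
    Comparison compare(T other);

    default boolean largerThan(T other)  { return compare(other) == Comparison.LARGER; }
    default boolean smallerThan(T other) { return compare(other) == Comparison.SMALLER; }
    default boolean asLargeAs(T other)   { return compare(other) != Comparison.SMALLER; }
    default boolean asSmallAs(T other)   { return compare(other) != Comparison.LARGER; }
}

class Version implements Ordered<Version> {
    final int number;
    Version(int number) { this.number = number; }

    @Override
    public Comparison compare(Version other) {
        if (number > other.number) return Comparison.LARGER;
        if (number < other.number) return Comparison.SMALLER;
        return Comparison.EQUAL;
    }
}
```

An implementor only writes compare(); the other four methods come for free, which is what lets the compiler map the <, >, <= and >= operators onto any Comparable type.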

Type inference

So far, we've always been explicitly specifying the type of every declaration. I think this generally makes code, especially example code, much easier to read and understand.

However, Ceylon does have the ability to infer the type of a local or the return type of a local method. Just place the keyword local in place of the type declaration.

local hello = DefaultHello();
local operators = { "+", "-", "*", "/" };
local add(Natural x, Natural y) { return x+y; }

There are some restrictions applying to this feature. You can't use local:

  • for declarations annotated shared,
  • for declarations annotated formal,
  • when the value is specified later in the block of statements,
  • for methods with multiple return statements, or
  • to declare a parameter.

These restrictions mean that Ceylon's type inference rules are quite simple. Type inference is purely right-to-left and top-to-bottom. The type of any expression is already known without needing to look to any types declared to the left of the = specifier, or further down the block of statements.

  • The inferred type of a local declared local is just the type of the expression assigned to it using = or :=.
  • The inferred type of a method declared local is just the type of the returned expression.

Type inference for sequence enumeration expressions

What about sequence enumeration expressions like this:

local sequence  = { DefaultHello(), "Hello", 12.0 };

What type is inferred for sequence? You might answer: Sequence<X> where X is the common superclass or super-interface of all the element types. But that can't be right, since there might be more than one common supertype.

The answer is that the inferred type is Sequence<X> where X is the union of all the element expression types. In this case, the type is Sequence<DefaultHello|String|Float>. Now, this works out nicely, because Sequence<T> is covariant in T. So the following code is well typed:

local sequence  = { DefaultHello(), "Hello", 12.0 }; //type Sequence<DefaultHello|String|Float>
Object[] objects = sequence; //type Empty|Sequence<Object>

As is the following code:

local nums = { 12.0, 1, -3 }; //type Sequence<Float|Natural|Integer>
Number[] numbers = nums; //type Empty|Sequence<Number>

What about sequences that contain null? Well, do you remember the type of null from Part 1 was Nothing?

local sequence = { null, "Hello", "World" }; //type Sequence<Nothing|String>
String?[] strings = sequence; //type Empty|Sequence<Nothing|String>
String? s = sequence[0]; //type Nothing|Nothing|String which is just Nothing|String

It's interesting just how useful union types turn out to be. Even if you only very rarely write an explicit union type declaration (and that's probably a good idea), they are still there, under the covers, helping the compiler solve some hairy, otherwise-ambiguous, typing problems.

There's more...

A more advanced example of an algebraic datatype is shown here.

In Part 6 we'll explore Ceylon's generic type system in more depth.

Introduction to Ceylon Part 4

Tagged as Ceylon

This is the fourth installment in a series of articles introducing the Ceylon language. Note that some features of the language may change before the final release.

Sequences

Some kind of array or list construct is a universal feature of all programming languages. The Ceylon language module defines support for sequence types. A sequence type is usually written X[] for some element type X. But this is really just an abbreviation for the union type Empty|Sequence<X>.

The interface Sequence represents a sequence with at least one element. The type Empty represents an empty sequence with no elements. Some operations of the type Sequence aren't defined by Empty, so you can't call them if all you have is X[]. Therefore, we need the if (nonempty ... ) construct to gain access to these operations.

void printBounds(String[] strings) {
    if (nonempty strings) {
        //strings is a Sequence<String>
        writeLine(strings.first + ".." + strings.last);
    }
    else {
        writeLine("Empty");
    }
}
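
For comparison, the nearest Java idiom is a guard on isEmpty() before asking for the first and last elements. This is a sketch in Java, not Ceylon, and the names are invented for illustration. The important difference is that the Java compiler does not enforce the guard the way if (nonempty ...) does: leaving it out still compiles and fails at runtime on an empty list.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical Java counterpart of printBounds(), for comparison only.
class BoundsDemo {

    // Returns "first..last" for a non-empty list, or "Empty" otherwise.
    static String bounds(List<String> strings) {
        if (!strings.isEmpty()) {
            // Safe only because of the guard above; the compiler does not check it.
            return strings.get(0) + ".." + strings.get(strings.size() - 1);
        }
        else {
            return "Empty";
        }
    }

    public static void main(String[] args) {
        System.out.println(bounds(Arrays.asList("a", "b", "c")));
        System.out.println(bounds(Collections.<String>emptyList()));
    }
}
```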

Note how this is just a continuation of the pattern established for null value handling.

Sequence syntax sugar

There's lots more syntactic sugar for sequences. We can use a bunch of familiar Java-like syntax:

String[] operators = { "+", "-", "*", "/" };
String? plus = operators[0];
String[] multiplicative = operators[2..3];

Oh, and the expression {} returns a value of type Empty.

However, unlike Java, all these syntactic constructs are pure abbreviations. The code above is exactly equivalent to the following de-sugared code:

Empty|Sequence<String> operators = Array("+", "-", "*", "/");
Nothing|String plus = operators.value(0);
Empty|Sequence<String> multiplicative = operators.range(2,3);
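
To see why operators[0] has the optional type String? rather than String, it can help to model the de-sugared calls in Java. The following is a comparison sketch only. The helper names value and range mirror the operations above, but the Java signatures are assumptions, and so is the choice to clamp an out-of-range bound rather than fail:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical helpers modelling operators[0] and operators[2..3]; not a Ceylon API.
class IndexDemo {

    // Like value(index): empty instead of an exception when the index is out of range.
    static <T> Optional<T> value(List<T> list, int index) {
        return index >= 0 && index < list.size()
                ? Optional.of(list.get(index))
                : Optional.empty();
    }

    // Like range(from, to), inclusive; clamping to the list bounds is an assumption.
    static <T> List<T> range(List<T> list, int from, int to) {
        int start = Math.max(from, 0);
        int end = Math.min(to, list.size() - 1);
        return start > end ? Collections.<T>emptyList() : list.subList(start, end + 1);
    }

    public static void main(String[] args) {
        List<String> operators = Arrays.asList("+", "-", "*", "/");
        System.out.println(value(operators, 0));    // Optional[+]
        System.out.println(value(operators, 9));    // Optional.empty
        System.out.println(range(operators, 2, 3)); // [*, /]
    }
}
```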

A Range is also a subtype of Sequence. The following:

Character[] uppercaseLetters = 'A'..'Z';
Natural[] countDown = 10..0;

Is just sugar for:

Empty|Sequence<Character> uppercaseLetters = Range('A','Z');
Empty|Sequence<Natural> countDown = Range(10,0);

In fact, this is just a sneak preview of the fact that almost all operators in Ceylon are just sugar for method calls upon a type. We'll come back to this later, when we talk about operator polymorphism.

Iterating sequences

The Sequence interface extends Iterable, so we can iterate a Sequence using a for loop:

for (String op in operators) {
    writeLine(op);
}

Ceylon doesn't need C-style for loops. Instead, combine for with the range operator (..):

variable Natural fac:=1;
for (Natural n in 1..100) {
    fac*=n;
    writeLine("Factorial " n "! = " fac "");
}

If, for any reason, we need to use the index of each element of a sequence we can use a special variation of the for loop that is designed for iterating instances of Entries:

for (Natural i -> String op in entries(operators)) {
    writeLine($i + ": " + op);
}

The entries() function returns an instance of Entries<Natural,String> containing the indexed elements of the sequence.

Sequence and its supertypes

It's probably a good time to see some more advanced Ceylon code. What better place to find some than in the language module itself?

Here's how the language module defines the type Sequence:

shared interface Sequence<out Element>
        satisfies Correspondence<Natural, Element> & 
                  Iterable<Element> & Sized {
    
    doc "The index of the last element of the sequence."
    shared formal Natural lastIndex;
    
    doc "The first element of the sequence."
    shared actual formal Element first;
    
    doc "The rest of the sequence, without the first
         element."
    shared formal Element[] rest;

    shared actual Boolean empty {
        return false;
    }
        
    shared actual default Natural size {
        return lastIndex+1;
    }
    
    doc "The last element of the sequence."
    shared default Element last {
        if (exists Element x = value(lastIndex)) {
            return x;
        }
        else {
            //actually never occurs if 
            //the subtype is well-behaved
            return first; 
        } 
    }

    shared actual default Iterator<Element> iterator() {
        class SequenceIterator(Natural from) 
                satisfies Iterator<Element> {
            shared actual Element? head { 
                return value(from);
            }
            shared actual Iterator<Element> tail {
                return SequenceIterator(from+1);
            }
        }
        return SequenceIterator(0);
    }
    
}

The most interesting operations are inherited from Correspondence, Iterable and Sized:

shared interface Correspondence<in Key, out Value>
        given Key satisfies Equality {
    
    doc "Return the value defined for the 
         given key."
    shared formal Value? value(Key key);
        
}
shared interface Iterable<out Element> 
        satisfies Container {
    
    doc "An iterator of values belonging
         to the container."
    shared formal Iterator<Element> iterator();
    
    shared actual default Boolean empty {
        return !(first exists);
    }
    
    doc "The first object."
    shared default Element? first {
        return iterator().head;
    }

}
shared interface Sized 
        satisfies Container {
        
    doc "The number of values or entries 
         belonging to the container."
    shared formal Natural size;
    
    shared actual default Boolean empty {
        return size==0;
    }
    
}
shared interface Container {
        
    shared formal Boolean empty;
    
}

Empty sequences and the Bottom type

Now let's see the definition of Empty:

object emptyIterator satisfies Iterator<Bottom> {
    
    shared actual Nothing head { 
        return null; 
    }
    shared actual Iterator<Bottom> tail { 
        return this; 
    }
    
}

shared interface Empty
           satisfies Correspondence<Natural, Bottom> & 
                     Iterable<Bottom> & Sized {
    
    shared actual Natural size { 
        return 0; 
    }
    shared actual Boolean empty { 
        return true; 
    }
    shared actual Iterator<Bottom> iterator() {
        return emptyIterator;
    }
    shared actual Nothing value(Natural key) {
        return null;
    }
    shared actual Nothing first {
        return null;
    }
    
}

The special type Bottom represents:

  • the empty set, or equivalently
  • the intersection of all types.

Since the empty set is a subset of all other sets, Bottom is assignable to all other types. Why is this useful here? Well, Correspondence<Natural,Element> and Iterable<Element> are both covariant in the type parameter Element. So Empty is assignable to Correspondence<Natural,T> and Iterable<T> for any type T. That's why Empty doesn't need a type parameter. The following code is well-typed:

void printAll(String[] strings) {
    variable Iterator<String> i := strings.iterator();
    while (exists String s = i.head) {
        writeLine(s);
        i := i.tail;
    }
}

Since both Empty and Sequence<String> are subtypes of Iterable<String>, the union type String[] is also a subtype of Iterable<String>.
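
The head/tail style of Iterator used by printAll() differs from Java's hasNext()/next(). A minimal Java model of it might look like the following. This is only a sketch for comparison: every name in it is invented, and Optional stands in for Ceylon's optional type.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical Java model of an immutable head/tail iterator; not a Ceylon API.
interface HeadTailIterator<T> {
    Optional<T> head();          // empty once the elements are exhausted
    HeadTailIterator<T> tail();  // iterator positioned at the next element
}

class ListIterators {

    // Wraps a list in a head/tail iterator starting at index from.
    static <T> HeadTailIterator<T> iterate(final List<T> list, final int from) {
        return new HeadTailIterator<T>() {
            public Optional<T> head() {
                return from < list.size() ? Optional.of(list.get(from)) : Optional.empty();
            }
            public HeadTailIterator<T> tail() {
                return iterate(list, from + 1);
            }
        };
    }

    // Mirrors printAll(): loop while head exists, then advance to tail.
    static String joinAll(List<String> strings) {
        StringBuilder out = new StringBuilder();
        HeadTailIterator<String> i = iterate(strings, 0);
        while (i.head().isPresent()) {
            out.append(i.head().get()).append('\n');
            i = i.tail();
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(joinAll(Arrays.asList("+", "-")));
    }
}
```

An empty list needs no special case here: its first head() is already empty, so the loop body never runs.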

Another cool thing to notice here is the return type of the first and value() operations of Empty. You might have been expecting to see Bottom? here, since they override supertype members of type T?. But as we saw in Part 1, Bottom? is just an abbreviation for Nothing|Bottom. And Bottom is the empty set, so the union Bottom|T of Bottom with any other type T is just T itself. In particular, Nothing|Bottom is just Nothing.

The Ceylon compiler is able to do all this reasoning automatically. So when it sees an Iterable<Bottom>, it knows that the operation first is of type Nothing, i.e. it is the value null.

Cool, huh?

Sequence gotchas for Java developers

Superficially, a sequence type looks a lot like a Java array, but really it's very, very different! First, of course, a sequence type Sequence<String> is an immutable interface, not a mutable concrete type like an array. We can't set the value of an element:

String[] operators = .... ; 
operators[0] := "**"; //compile error

Furthermore, the index operation operators[i] returns an optional type String?, which results in quite different code idioms. To begin with, we don't iterate sequences by index like in C or Java. The following code does not compile:

for (Natural i in 0..operators.size-1) { 
    String op = operators[i]; //compile error 
    ...
}

Here, operators[i] is a String?, which is not directly assignable to String.

Instead, if we need access to the index, we use the special form of for shown above.

for (Natural i -> String op in entries(operators)) { 
    ...
}

Likewise, we don't usually do an upfront check of an index against the sequence length:

if (i>operators.size-1) { 
    throw IndexOutOfBoundException();
} 
else {
    return operators[i]; //compile error
}

Instead, we do the check after accessing the sequence element:

if (exists String op = operators[i]) { 
    return op;
} 
else {
    throw IndexOutOfBoundException();
}

We especially don't ever need to write the following:

if (i>operators.size-1) { 
    return "";
} 
else {
    return operators[i]; //compile error
}

This is much cleaner:

return operators[i] ? "";

All this may take a little getting used to. But what's nice is that all the exact same idioms also apply to other kinds of Correspondence, including Entries and Maps.
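
Java developers may recognize the same "look up first, then decide" shape in java.util.Optional. The sketch below restates the two idioms above in Java, for comparison only. The class name and the helper at() are hypothetical:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical Java rendering of the idioms above, for comparison only.
class LookupDemo {

    // Out-of-range lookups yield an empty Optional rather than an exception.
    static Optional<String> at(List<String> list, int i) {
        return i >= 0 && i < list.size() ? Optional.of(list.get(i)) : Optional.empty();
    }

    // Counterpart of: return operators[i] ? "";
    static String orBlank(List<String> operators, int i) {
        return at(operators, i).orElse("");
    }

    // Counterpart of the exists/else form that throws when the element is missing.
    static String orThrow(List<String> operators, int i) {
        return at(operators, i).orElseThrow(IndexOutOfBoundsException::new);
    }

    public static void main(String[] args) {
        List<String> operators = Arrays.asList("+", "-", "*", "/");
        System.out.println(orBlank(operators, 1));
        System.out.println(orBlank(operators, 10).isEmpty());
    }
}
```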

There's more...

In Part 5 we'll talk about union types and algebraic data types, type switching, and type inference.

Introduction to Ceylon Part 3

This is the third installment in a series of articles introducing the Ceylon language. Note that some features of the language may change before the final release.

This article was updated on 28/5/2011 to reflect changes to the model for introductions and add material on ambiguities in mixin inheritance, and on 3/5/2011 to incorporate feedback on member class refinement. The comment thread reflects information in the first version of the article.

Inheritance and refinement

In object-oriented programming, we often replace conditionals (if, and especially switch) with subtyping. Indeed, according to some folks, this is what makes a program object-oriented. Let's try refactoring the Hello class from Part 2 into two classes, with two different implementations of greeting:

doc "A default greeting" 
class DefaultHello() {

    doc "The greeting" 
    shared default String greeting = "Hello, World!";
    
    doc "Print the greeting" 
    shared void say(OutputStream stream) {
        stream.writeLine(greeting);
    }
    
}

Notice that Ceylon forces us to declare attributes or methods that can be refined (overridden) by annotating them default.

Subclasses specify their superclass using the extends keyword, followed by the name of the superclass, followed by a list of arguments to be sent to the superclass initializer parameters. It looks just like an expression that instantiates the superclass:

doc "A personalized greeting" 
class PersonalizedHello(String name) 
        extends DefaultHello() {
    
    doc "The personalized greeting" 
    shared actual String greeting {
        return "Hello, " name "!";
    }

}

Ceylon also forces us to declare that an attribute or method refines (overrides) an attribute or method of a superclass by annotating it actual. All this annotating stuff costs a few extra keystrokes, but it helps the compiler detect errors. We can't inadvertently refine a member of the superclass, or inadvertently fail to refine it.
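
Java readers can compare this with @Override, which plays roughly the role of actual but is optional, and with final, which is the opt-out counterpart of Ceylon's opt-in default. Here is a small Java sketch of the same two classes, for comparison only. The J suffix on the class names is just to mark them as Java stand-ins:

```java
// Java comparison only: methods are overridable unless marked final,
// and @Override is optional, whereas Ceylon requires default and actual.
class DefaultHelloJ {

    String greeting() {   // overridable by default in Java
        return "Hello, World!";
    }

    final String say() {  // final: subclasses cannot change this
        return greeting();
    }
}

class PersonalizedHelloJ extends DefaultHelloJ {
    private final String name;

    PersonalizedHelloJ(String name) {
        this.name = name;
    }

    @Override             // rejected by the compiler if nothing is overridden
    String greeting() {
        return "Hello, " + name + "!";
    }
}

class RefinementDemo {
    public static void main(String[] args) {
        DefaultHelloJ hello = new PersonalizedHelloJ("Ceylon");
        System.out.println(hello.say());
    }
}
```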

Notice that Ceylon goes out of its way to repudiate the idea of duck typing or structural typing. If it walks() like a Duck, then it should be a subtype of Duck and must explicitly refine the definition of walk() in Duck. We don't believe that the name of a method or attribute alone is sufficient to identify its semantics.

Abstract classes

There's one problem with what we've just seen. A personalized greeting is not really a kind of default greeting. This is a case for introducing an abstract superclass:

doc "A greeting" 
abstract class Hello() {
    
    doc "The (abstract) greeting" 
    shared formal String greeting;
    
    doc "Print the greeting" 
    shared void say(OutputStream stream) {
        stream.writeLine(greeting);
    }
    
}

Ceylon requires us to annotate abstract classes abstract, just like Java. This annotation specifies that a class cannot be instantiated, and can define abstract members. Like Java, Ceylon also requires us to annotate abstract members that don't specify an implementation. However, in this case, the required annotation is formal. The reason for having two different annotations, as we'll see later, is that nested classes may be either abstract or formal, and abstract nested classes are slightly different to formal member classes — a formal member class may be instantiated; an abstract class may not be.

Note that an attribute that is never initialized is always a formal attribute — Ceylon doesn't initialize attributes to zero or null unless you explicitly tell it to!

One way to define an implementation for an inherited abstract attribute is to simply assign a value to it in the subclass.

doc "A default greeting" 
class DefaultHello() extends Hello() {
    greeting = "Hello, World!";
}

Of course, we can also define an implementation for an inherited abstract attribute by refining it.

doc "A personalized greeting" 
class PersonalizedHello(String name) 
        extends Hello() {
    
    doc "The personalized greeting" 
    shared actual String greeting {
        return "Hello, " name "!";
    }
    
}

Note that there's no way to prevent other code from extending a class in Ceylon. Since only members explicitly declared as supporting refinement using either formal or default can be refined, a subtype can never break the implementation of a supertype. Unless the supertype was explicitly designed to be extended, a subtype can add members, but never change the behavior of inherited members.

Interfaces and mixin inheritance

From time to time we come across a case where a class needs to inherit functionality from more than one supertype. Java's inheritance model doesn't support this, since an interface can never define a member with a concrete implementation. Interfaces in Ceylon are a little more flexible:

  • An interface may define concrete methods, attribute getters, and attribute setters.
  • It may not define simple attributes or initialization logic.

Notice that prohibiting simple attributes and initialization logic makes interfaces completely stateless. An interface can't hold references to other objects.

Let's take advantage of mixin inheritance to define a reusable Writer interface for Ceylon.

shared interface Writer { 

    shared formal Formatter formatter; 
    
    shared formal void write(String string);
    
    shared void writeLine(String string) { 
        write(string);
        write(process.newLine);
    }
    
    shared void writeFormattedLine(String formatString, Object... args) { 
        writeLine( formatter.format(formatString, args) );
    }
    
}

Note that we can't define a concrete value for the formatter attribute, since an interface may not define a simple attribute, and may not hold a reference to another object.

Note also that the call to writeLine() from writeFormattedLine() resolves to the instance method of Writer, which hides the toplevel method of the same name.

Now let's define a concrete implementation of this interface.

shared class ConsoleWriter() 
        satisfies Writer {
    
    formatter = StringFormatter();
    
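    shared actual void write(String string) { 
        //assumption: process offers write(); an unqualified writeLine()
        //here would resolve to Writer.writeLine() and call write() again
        process.write(string);
    }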
    
}

The satisfies keyword is used to specify that an interface extends another interface or that a class implements an interface. Unlike an extends declaration, a satisfies declaration does not specify arguments, since interfaces do not have parameters or initialization logic. Furthermore, the satisfies declaration can specify more than one interface.

Ceylon's approach to interfaces eliminates a common pattern in Java where a separate abstract class defines a default implementation of some of the members of an interface. In Ceylon, the default implementations can be specified by the interface itself. Even better, it's possible to add a new member to an interface without breaking existing implementations of the interface.
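
Later versions of Java (8 onward) gained default methods on interfaces, which makes it possible to sketch the same idea there. The following Java version of the Writer example is a comparison only. The names WriterJ, BufferWriter and text() are invented here, and the formatter attribute is left out for brevity:

```java
// Java 8+ comparison only; WriterJ, BufferWriter and text() are hypothetical names.
interface WriterJ {

    void write(String string);               // like a formal method

    default void writeLine(String string) {  // concrete method defined by the interface
        write(string);
        write(System.lineSeparator());
    }
}

// The implementation supplies write() and inherits writeLine().
class BufferWriter implements WriterJ {
    private final StringBuilder buffer = new StringBuilder();

    @Override
    public void write(String string) {
        buffer.append(string);
    }

    String text() {
        return buffer.toString();
    }
}

class MixinDemo {
    public static void main(String[] args) {
        BufferWriter writer = new BufferWriter();
        writer.writeLine("Hello");
        System.out.print(writer.text());
    }
}
```

As in Ceylon, the interface holds no state of its own: the buffer lives in the implementing class.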

Ambiguities in mixin inheritance

It's illegal for a type to inherit two members with the same name, unless the two members both (directly or indirectly) refine a common member of a common supertype, and the inheriting type itself also refines the member to eliminate any ambiguity. The following results in a compilation error:

interface Party {
    shared formal String legalName;
    shared default String name {
        return legalName;
    }
}

interface User {
    shared formal String userId;
    shared default String name {
        return userId;
    }
}

class Customer(String name, String email) 
        satisfies User & Party {
    legalName = name;
    userId = email;
    shared actual String name = name;    //error: refines two different members
}

To fix this code, we'll factor out a formal declaration of the attribute name to a common supertype. The following is legal:

interface Named {
    shared formal String name;
}

interface Party satisfies Named {
    shared formal String legalName;
    shared actual default String name {
        return legalName;
    }
}

interface User satisfies Named {
    shared formal String userId;
    shared actual default String name {
        return userId;
    }
}

class Customer(String name, String email) 
        satisfies User & Party {
    legalName = name;
    userId = email;
    shared actual String name = name;
}

Oh, of course, the following is illegal:

interface Named {
    shared formal String name;
}

interface Party satisfies Named {
    shared formal String legalName;
    shared actual String name {
        return legalName;
    }
}

interface User satisfies Named {
    shared formal String userId;
    shared actual String name {
        return userId;
    }
}

class Customer(String name, String email) 
        satisfies User & Party {    //error: inherits multiple definitions of name
    legalName = name;
    userId = email;
}

To fix this code, name must be declared default in both User and Party and explicitly refined in Customer.
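
Java applies a broadly similar rule to default methods, although it does not insist on the common supertype. The Java sketch below is for comparison only, with hypothetical names. It shows the legal shape, where the inheriting class refines the member itself:

```java
// Java 8+ comparison only; all names are hypothetical.
interface NamedJ {
    String name();
}

interface PartyJ extends NamedJ {
    String legalName();
    default String name() { return legalName(); }
}

interface UserJ extends NamedJ {
    String userId();
    default String name() { return userId(); }
}

// Without the explicit name() below, javac rejects CustomerJ because it
// inherits unrelated defaults for name() from UserJ and PartyJ.
class CustomerJ implements UserJ, PartyJ {
    private final String name;
    private final String email;

    CustomerJ(String name, String email) {
        this.name = name;
        this.email = email;
    }

    public String legalName() { return name; }
    public String userId() { return email; }

    @Override
    public String name() { return name; }  // resolves the ambiguity explicitly
}

class AmbiguityDemo {
    public static void main(String[] args) {
        NamedJ customer = new CustomerJ("Gavin", "gavin@example.org");
        System.out.println(customer.name());
    }
}
```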

Introduction

Sometimes, especially when we're working with code from modules we don't have control over, we would like to mix an interface into a type that has already been defined in another module. For example, we might like to introduce the Ceylon collections module type List into the language module type Sequence, so that all Sequences support all operations of List. But the language module shouldn't have a dependency on the collections module, so we can't specify that interface Sequence satisfies List in the declaration of Sequence in the language module.

Instead, we can introduce List to Sequence in the code which uses the collections and language modules. The collections module already defines an interface called SequenceList for this purpose. Well, it doesn't yet, since we have not yet either implemented introductions or written the collections module, but it will soon!

doc "Decorator that introduces List to Sequence."
see (List,Sequence)
shared interface SequenceList<Element> 
        adapts Sequence<Element>
        satisfies List<Element> {
    
    shared actual default List<Element> sortedElements() {
        //define the operation of List in
        //terms of operations on Sequence
        return asList(sortSequence(this));
    }
    
    ...
    
}

The adapts clause makes SequenceList a special kind of interface called an adapter (in the terminology used by this book). According to the language spec:

The interface may not:
  • declare or inherit a member that refines a member of any adapted type, or
  • declare or inherit a formal or non-default actual member unless the member is inherited from an adapted type.

The purpose of an adapter is to add a new supertype, called an introduced type, to an existing type, called the adapted type. The adapter doesn't change the original definition of the adapted type, and it doesn't affect the internal workings of an instance of the adapted type in any way. All it does is fill in the definitions of the missing operations. Here, the SequenceList interface provides concrete implementations of all methods of List that are not already implemented by Sequence.

Now, to introduce List to Sequence in a certain compilation unit, all we need to do is import the adapter:

import ceylon.collection { List, SequenceList }

...

//define a Sequence
Sequence<String> names = { "Gavin", "Emmanuel", "Andrew", "Ales" };

//call an operation of List on Sequence
List<String> sortedNames = names.sortedElements();

Note that the introduction is not visible outside the lexical scope of the import statement (the compilation unit). But within the compilation unit containing the import statement, every instance of the adapted type Sequence now has all the attributes and methods of the introduced type List, and is assignable to the introduced type.

Again, according to the spec:

If, in a certain compilation unit, multiple introductions of a certain adapted type declare or inherit a member that refines a common member of a common supertype then either:
  • there must be a unique member from the set of members, called the most refined member, that refines all the other members, or
  • the adapted type must declare or inherit a member that refines all the members.
At runtime, an operation (method invocation, member class instantiation, or attribute evaluation) upon any type that is a subtype of all the adapted types is dispatched according to the following rule:
  • If the runtime type of the instance of the adapted type declares or inherits a member defining the operation, the operation is dispatched to the runtime type of the instance.
  • Otherwise, the operation is dispatched to the introduction that has the most-refined member defining the operation.

Introduction compared to extension methods and implicit type conversions

Introduction is Ceylon's way of extending a type after it's been defined. It's interesting to compare introduction to the following features of other languages:

  • extension methods, and
  • user-defined implicit type conversions.

Introduction is really just a much more powerful cousin of extension methods. From our point of view, an extension method introduces a member to a type, without actually introducing a new supertype. Indeed, a Ceylon adapter with no satisfies clause is actually a package of extension methods!

shared interface StringSequenceExtensions 
        adapts Sequence<String> {
    
    shared String concatenated {
        variable String concat := "";
        for (String s in this) {
            concat+=s;
        }
        return concat;
    }
    
    shared String join(String separator=", ") {
        ...
    }
    
}

On the other hand, introductions are less powerful than implicit type conversions. This is by design! In this case, less powerful means safer, more disciplined. The power of implicit type conversions comes partly from their ability to work around some of the designed-in limitations of the type system. But these limitations have a purpose! I'm especially thinking of the prohibitions against:

  • inheriting the same generic type twice, with different type arguments (in most languages), and
  • overloading (in Ceylon).

Implicit type conversions are an end-run around these restrictions, reintroducing the ambiguities that these restrictions exist to solve.

Furthermore, it's extremely difficult to imagine a language with implicit type conversions that preserves the following important properties of the type system:

  • transitivity of the assignability relationship,
  • covariance of generic types,
  • the semantics of the identity == operator, and
  • the ability to infer generic type arguments of an invocation or instantiation.

Finally, implicit type conversions work by having the compiler introduce hidden invocations of arbitrary user-written procedural code, code that could potentially have side-effects or make use of temporal state. Thus, the observable behavior of the program can depend upon precisely where and how the compiler introduces these magic calls.

Introductions are a kind of elegant compromise: more powerful than plain extension methods, safer than implicit type conversions. We think the beauty of this model is a major advantage of Ceylon over similar languages.

Type aliases

It's often useful to provide a shorter or more semantic name to an existing class or interface type, especially if the class or interface is a parameterized type. For this, we use a type alias, for example:

interface People = Set<Person>;

A class alias must declare its formal parameters:

shared class People(Person... people) = ArrayList<Person>;

Member classes and member class refinement

You're probably used to the idea of an inner class in Java — a class declaration nested inside another class or method. Since Ceylon is a language with a recursive block structure, the idea of a nested class is more than natural. But in Ceylon, a non-abstract nested class is actually considered a member of the containing type. For example, BufferedReader defines the member class Buffer:

class BufferedReader(Reader reader) 
        satisfies Reader { 
    shared default class Buffer() 
            satisfies List<Character> { ... }
    ...
}

The member class Buffer is annotated shared, so we can instantiate it like this:

BufferedReader br = BufferedReader(reader); 
BufferedReader.Buffer b = br.Buffer();

Note that a nested type name must be qualified by the containing type name when used outside of the containing type.

The member class Buffer is also annotated default, so we can refine it in a subtype of BufferedReader:

shared class BufferedFileReader(File file) 
        extends BufferedReader(FileReader(file)) {
    shared actual class Buffer() 
            extends super.Buffer() { ... }
}

That's right: Ceylon lets us override a member class defined by a supertype!

Note that BufferedFileReader.Buffer is a subclass of BufferedReader.Buffer.

Now the instantiation br.Buffer() above is a polymorphic operation! It might return an instance of BufferedFileReader.Buffer or an instance of BufferedReader.Buffer, depending upon whether br refers to a plain BufferedReader or a BufferedFileReader. This is more than a cute trick. Polymorphic instantiation lets us eliminate the factory method pattern from our code.
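
For contrast, here is roughly what the factory method pattern looks like in Java when the choice of Buffer class has to follow the runtime type of the reader. This is a Java sketch for comparison only. The names, including newBuffer() and describe(), are hypothetical:

```java
// Java comparison only: the factory method pattern that polymorphic
// instantiation of a member class is meant to replace. Names are hypothetical.
class ReaderJ {

    class Buffer {
        String describe() { return "plain buffer"; }
    }

    // Factory method: subclasses override this to pick the Buffer subclass.
    Buffer newBuffer() {
        return new Buffer();
    }
}

class FileReaderJ extends ReaderJ {

    class FileBuffer extends Buffer {
        @Override
        String describe() { return "file buffer"; }
    }

    @Override
    Buffer newBuffer() {
        return new FileBuffer();
    }
}

class FactoryDemo {
    public static void main(String[] args) {
        ReaderJ br = new FileReaderJ();
        // The caller must remember to go through newBuffer(), not "new Buffer()".
        System.out.println(br.newBuffer().describe());
    }
}
```

In the Ceylon version, br.Buffer() does the same dispatch without the extra newBuffer() method.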

It's even possible to define a formal member class of an abstract class. A formal member class can declare formal members.

abstract class BufferedReader(Reader reader) 
        satisfies Reader { 
    shared formal class Buffer() {
        shared formal Byte read();
    }
    ...
}

In this case, a concrete subclass of the abstract class must refine the formal member class.

shared class BufferedFileReader(File file) 
        extends BufferedReader(FileReader(file)) {
    shared actual class Buffer() 
             extends super.Buffer() {
         shared actual Byte read() {
             ...
         }
    }
}

Notice the difference between an abstract class and a formal member class. An abstract nested class may not be instantiated, and need not be refined by concrete subclasses of the containing class. A formal member class may be instantiated, and must be refined by every concrete subclass of the containing class.

It's an interesting exercise to compare Ceylon's member class refinement with the functionality of Java dependency injection frameworks. Both mechanisms provide a means of abstracting the instantiation operation of a type. You can think of the subclass that refines a member type as filling the same role as a dependency configuration in a dependency injection framework.

Anonymous classes

If a class has no parameters, it's often possible to use a shortcut declaration which defines a named instance of the class, without providing any actual name for the class itself. This is usually most useful when we're extending an abstract class or implementing an interface.

doc "A default greeting" 
object defaultHello extends Hello() {
    greeting = "Hello, World!";
}
shared object consoleWriter satisfies Writer {
        	
    formatter = StringFormatter();
    
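    shared actual void write(String string) { 
        //assumption: process offers write(); an unqualified writeLine()
        //here would resolve to Writer.writeLine() and call write() again
        process.write(string);
    }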
    
}

The downside to an object declaration is that we can't write code that refers to the concrete type of defaultHello or consoleWriter, only to the named instances.

You might be tempted to think of object declarations as defining singletons, but that's not quite right:

  • A toplevel object declaration does define a singleton.
  • An object declaration nested inside a class defines an object per instance of the containing class.
  • An object declaration nested inside a method, getter, or setter results in a new object each time the method, getter, or setter is executed.

Let's see how this can be useful:

interface Subscription {
    shared formal void cancel();
}
shared Subscription register(Subscriber s) { 
    subscribers.append(s); 
    object subscription satisfies Subscription {
        shared actual void cancel() { 
            subscribers.remove(s);
        }
    } 
    return subscription;
}

Notice how this code example makes clever use of the fact that the nested object declaration receives a closure of the locals defined in the containing method declaration!
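
The closest Java equivalent is an anonymous inner class that captures the method's parameter. The sketch below is Java for comparison only. Registry, SubscriptionJ and size() are names invented for the example:

```java
import java.util.ArrayList;
import java.util.List;

// Java comparison only; Registry and its members are hypothetical names.
interface SubscriptionJ {
    void cancel();
}

class Registry<S> {
    private final List<S> subscribers = new ArrayList<>();

    // Returns an object that remembers (closes over) the subscriber s.
    SubscriptionJ register(final S s) {
        subscribers.add(s);
        return new SubscriptionJ() {
            @Override
            public void cancel() {
                subscribers.remove(s);
            }
        };
    }

    int size() {
        return subscribers.size();
    }
}

class SubscriptionDemo {
    public static void main(String[] args) {
        Registry<String> registry = new Registry<>();
        SubscriptionJ subscription = registry.register("listener");
        System.out.println(registry.size());
        subscription.cancel();
        System.out.println(registry.size());
    }
}
```

Each call to register() creates a fresh subscription object, which matches the rule above for object declarations nested inside a method.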

A different way to think about the difference between object and class is to think of a class as a parametrized object. (Of course, there's one big difference: a class declaration defines a named type that we can refer to in other parts of the program.) We'll see later that Ceylon also lets us think of a method as a parametrized attribute.

An object declaration can refine an attribute declared formal or default.

shared abstract class App() { 
    shared formal OutputStream stream; 
    ...
}
class ConsoleApp() extends App() { 
    shared actual object stream 
            satisfies OutputStream { ... } 
    ...
}

However, an object may not itself be declared formal or default.

There's more...

Member classes and member class refinement allow Ceylon to support type families.

If you're interested, here are some crazy ideas about how to generalize the notion of refinement to toplevel declarations.

In Part 4, we're going to meet sequences, Ceylon's take on the array type.

Introduction to Ceylon Part 2

This is the second installment in a series of articles introducing the Ceylon language. Note that some features of the language may change before the final release.

Creating your own classes

Ceylon is an object oriented language, so we usually write most of our code in classes. A class is a type that packages:

  • operations — called methods,
  • state — held by attributes,
  • logic to initialize the state when the object is first created, and,
  • sometimes, other nested types.

Types (interfaces, classes, and type parameters) have names that begin with uppercase letters. Members (methods and attributes) and locals have names that begin with lowercase letters. This is the rule you're used to from Java. Unlike Java, the Ceylon compiler enforces these rules. If you try to write class hello or String Name, you'll get a compilation error.

Our first version of the Hello class has a single attribute and a single method:

doc "A personalized greeting" 
class Hello(String? name) {
	
    doc "The greeting" 
    shared String greeting; 
    if (exists name) {
        greeting = "Hello, " name "!";
    } 
    else {
        greeting = "Hello, World!";
    }
    
    doc "Print the greeting" 
    shared void say(OutputStream stream) {
        stream.writeLine(greeting);
    }
    
}

To understand this code completely, we're going to need to first explore Ceylon's approach to program element accessibility — the shared annotation above, then meet the concept of an attribute, and finally discuss how object initialization works in Ceylon.

Hiding implementation details

In Java and C#, a class controls the accessibility of its members using visibility modifier annotations, allowing the class to hide its internal implementation from other code. The visibility modifiers select between pre-defined, definite visibility levels like public, protected, package private, and private. Ceylon provides just one annotation for access control. The key difference is that Ceylon's shared annotation does not represent a single definite scope. Rather, its meaning is contextual, relative to the program element at which it appears. The shared annotation is in some cases more flexible, in certain cases less flexible, but almost always simpler and easier to use than the approach taken by Java and C#. And it's a far better fit to a language like Ceylon with a regular, recursive block structure.

Members of a class are hidden from code outside the body of the class by default — only members explicitly annotated shared are visible to other toplevel types or methods, other compilation units, other packages, or other modules. A shared member is visible to any code to which the class itself is visible.

And, of course, a class itself may be hidden from other code. By default, a toplevel class is hidden from code outside the package in which the class is defined — only toplevel classes explicitly annotated shared are visible to other packages or modules. A shared toplevel class is visible to any code to which the package containing the class is visible.

Finally, a package may be hidden from packages in other modules. In fact, packages are hidden from code outside the module to which the package belongs by default — only explicitly shared packages are visible to other modules.

It's not possible to create a shared toplevel class with package-private members. Members of a shared toplevel class must be either shared, in which case they're visible outside the package containing the class, or unshared, in which case they're only visible to the class itself. Package-private functionality must be defined in unshared (package-private) toplevel classes or interfaces. Likewise, a shared package can't contain a module-private toplevel class. Module-private toplevel classes must belong to unshared (module-private) packages.

Ceylon doesn't have anything like Java's protected. The purpose of visibility rules and access control is to limit dependencies between code that is developed by different teams, maintained by different developers, or has different release cycles. From a software engineering point of view, any code element exposed to a different package or module is a dependency that must be managed. And a dependency is a dependency. It's not any less of a dependency if the code to which the program element is exposed is in a subclass of the type which contains the program element!

Abstracting state using attributes

The attribute greeting is a simple attribute, the closest thing Ceylon has to a Java field. Its value is specified immediately after it is declared. Usually we can declare and specify the value of an attribute in a single line of code.

shared String greeting = "Hello, " name "!";
shared Natural months = years * 12;

The Ceylon compiler forces us to specify a value for any simple attribute or local before making use of it in an expression. Ceylon will never automatically initialize an attribute to any kind of default value or let code observe the value of an uninitialized attribute. This code results in an error at compile time:

Natural count;
shared void inc() {
    count++;   //compile error
}
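
For contrast, here is a minimal Java sketch (the class name JavaCounter is made up for illustration). Java quietly gives an unassigned int field the default value 0, so the equivalent code compiles and runs, which is exactly the behavior Ceylon refuses to allow:

```java
// Hypothetical Java counterpart of the Ceylon fragment above.
class JavaCounter {
    int count; // never assigned: Java initializes it to 0 by default

    void inc() {
        count++; // compiles in Java, reading the implicit default value
    }

    public static void main(String[] args) {
        JavaCounter counter = new JavaCounter();
        counter.inc();
        System.out.println(counter.count); // prints 1
    }
}
```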

An attribute is a bit different to a Java field. It's an abstraction of the notion of a value. Some attributes are simple value holders like the one we've just seen; others are more like a getter method, or, sometimes, like a getter and setter method pair. Like methods, attributes are polymorphic—an attribute definition may be refined (overridden) by a subclass.

We could rewrite the attribute greeting as a getter:

shared String greeting { 
    if (exists name) {
        return "Hello, " name "!";
    } 
    else {
        return "Hello, World!";
    }
}

Notice that the syntax of a getter declaration looks a lot like a method declaration with no parameter list.

Clients of a class never need to know whether the attribute they access holds state directly, or is a getter that derives its value from other attributes of the same object or other objects. In Ceylon, you don't need to go around declaring all your attributes private and wrapping them in getter and setter methods. Get out of that habit right now!
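
In case the habit isn't familiar, it looks something like this in Java. This is only a sketch with made-up names (JavaPerson, getGreeting): the field is private and wrapped in a getter from day one, purely so that the class could later switch between stored state and a derived value without breaking its callers.

```java
// Hypothetical Java class showing the defensive getter idiom.
class JavaPerson {
    private final String name; // hidden behind an accessor

    JavaPerson(String name) {
        this.name = name;
    }

    // Callers must go through a method, even if the value
    // could just as well have been stored in a plain field.
    String getGreeting() {
        return name == null ? "Hello, World!" : "Hello, " + name + "!";
    }

    public static void main(String[] args) {
        System.out.println(new JavaPerson("Ceylon").getGreeting()); // prints Hello, Ceylon!
        System.out.println(new JavaPerson(null).getGreeting());     // prints Hello, World!
    }
}
```

In Ceylon, a client reads the greeting attribute the same way whether it is a simple attribute or a getter, so there is nothing to wrap in advance.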

Understanding object initialization

In Ceylon, classes don't have constructors. Instead:

  • the parameters needed to instantiate the class — the initializer parameters — are declared directly after the name of the class, and
  • code to initialize the new instance of the class — the class initializer — goes directly in the body of the class.

Take a close look at the following code fragment:

String greeting; 
if (exists name) {
    greeting = "Hello, " name "!";
}
else {
    greeting = "Hello, World!";
}

In Ceylon, this code could appear in the body of a class, where it would be declaring and specifying the value of an immutable attribute, or it could appear in the body of a method definition, where it would be declaring and specifying the value of an immutable local variable. That's not the case in Java, where initialization of fields looks very different to initialization of local variables! Thus the syntax of Ceylon is more regular than Java. Regularity makes a language easy to learn and easy to refactor.
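
To see the irregularity on the Java side, compare the two initializations in this sketch (JavaHello and localGreeting are hypothetical names). The local variable can be assigned by an if/else written straight after its declaration, but bare statements are not allowed directly in a Java class body, so the field's assignment has to move into a constructor (or an initializer block):

```java
// Hypothetical Java version of the fragment above, written twice.
class JavaHello {
    final String greeting; // field: declared here...

    JavaHello(String name) {
        // ...but assigned somewhere else, inside a constructor
        if (name != null) {
            greeting = "Hello, " + name + "!";
        } else {
            greeting = "Hello, World!";
        }
    }

    static String localGreeting(String name) {
        final String greeting; // local: declared and assigned in one place
        if (name != null) {
            greeting = "Hello, " + name + "!";
        } else {
            greeting = "Hello, World!";
        }
        return greeting;
    }

    public static void main(String[] args) {
        System.out.println(new JavaHello("Ceylon").greeting); // prints Hello, Ceylon!
        System.out.println(JavaHello.localGreeting(null));    // prints Hello, World!
    }
}
```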

Now let's turn our attention to a different possible implementation of greeting:

class Hello(String? name) {
    shared String greeting { 
        if (exists name) {
            return "Hello, " name "!";
        } 
        else {
            return "Hello, World!";
        }
    } 
    ...

}

You might be wondering why we're allowed to use the parameter name inside the body of the getter of greeting. Doesn't the parameter go out of scope as soon as the initializer terminates? Well, that's true, but Ceylon is a language with a very strict block structure, and the scope of declarations is governed by that block structure. In this case, the scope of name is the whole body of the class, and the definition of greeting sits inside that scope, so greeting is permitted to access name.

We've just met our first example of closure. We say that method and attribute definitions receive a closure of values defined in the class body to which they belong. That's just a fancy way of obfuscating the idea that greeting holds onto the value of name, even after the initializer completes.

In fact, one way to look at the whole notion of a class in Ceylon is to think of it as a function which returns a closure of its own local variables. This helps explain why the syntax of class declarations is so similar to the syntax of method declarations (a class declaration looks a lot like a method declaration where the return type and the name of the method are the same).
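
If it helps, the "function which returns a closure" view can be sketched in Java with a lambda. This is an analogy under my own assumptions, not how Ceylon is implemented: ClosureSketch and hello are made-up names, and a Supplier stands in for the greeting getter.

```java
import java.util.function.Supplier;

// Hypothetical sketch: a "class" modelled as a function returning a closure.
class ClosureSketch {
    // Plays the role of the initializer of Hello: it takes the
    // initializer parameter and returns something that still sees it.
    static Supplier<String> hello(String name) {
        // The lambda captures name, much as the getter greeting does.
        return () -> name == null ? "Hello, World!" : "Hello, " + name + "!";
    }

    public static void main(String[] args) {
        Supplier<String> greeting = hello("Ceylon");
        // hello() has already returned, yet name is still reachable.
        System.out.println(greeting.get()); // prints Hello, Ceylon!
    }
}
```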

Instantiating classes and overloading their initializer parameters

Oops, I got so excited about attributes and closure and null value handling that I forgot to show you the code that uses Hello!

doc "Print a personalized greeting" 
void hello() {
    Hello(process.args.first).say(process.output);
}

Our rewritten hello() method just creates a new instance of Hello, and invokes say(). Ceylon doesn't need a new keyword to know when you're instantiating a class. No, we don't know why Java needs it. You'll have to ask James.

I suppose you're worried that if Ceylon classes don't have constructors, then they also can't have multiple constructors. Does that mean we can't overload the initialization parameter list of a class?

I guess now's as good a time as any to break some more bad news: Ceylon doesn't support method overloading either! But, actually, this isn't as bad as it sounds. The sad truth is that overloading is the source of various problems in Java, especially when generics come into play. And in Ceylon, we can emulate most non-evil uses of constructor or method overloading using:

  • defaulted parameters, to emulate the effect of overloading a method or class by arity (the number of parameters),
  • sequenced parameters, i.e. varargs, and
  • union types or enumerated type constraints, to emulate the effect of overloading a method or class by parameter type.

We're not going to get into all the details of these workarounds right now, but here's a quick example of each of the three techniques:

//defaulted parameter 
void print(String string = "\n") {
    writeLine(string);
}
//sequenced parameter 
void print(String... strings) {
    for (String string in strings) { 
        writeLine(string);
    }
}
//union type 
void print(String|Named printable) {
    String string;
    switch (printable) 
    case (is String) {
        string = printable;
    } 
    case (is Named) {
        string = printable.name;
    }
    writeLine(string);
}

Don't worry if you don't completely understand the third example just yet. Just think of it as a completely typesafe version of how you would write an overloaded operation in a dynamic language like Smalltalk, Python, or Ruby. (If you're really impatient, skip forward to the discussion of generic type constraints.)

To be completely honest, there are some circumstances where this approach ends up slightly more awkward than Java-style overloading. But that's a small price to pay for a language with clearer semantics, without nasty corner cases, that is ultimately more powerful.
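
One well-known example of such a corner case, offered as an illustration rather than as the specific problem that motivated Ceylon's design: java.util.List overloads remove(int index) and remove(Object o). On a List<Integer>, the static type of the argument alone decides whether an element is removed by position or by value. The class and method names below are made up for the demo.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo of an overloading pitfall with generics in Java.
class OverloadPitfall {
    static List<Integer> removeByIndex() {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(10, 20, 30));
        numbers.remove(1); // picks remove(int index): drops the element at index 1
        return numbers;
    }

    static List<Integer> removeByValue() {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(10, 20, 30));
        numbers.remove(Integer.valueOf(10)); // picks remove(Object): drops the value 10
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(removeByIndex()); // prints [10, 30]
        System.out.println(removeByValue()); // prints [20, 30]
    }
}
```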

Let's overload Hello, and its say() method, using defaulted parameters:

doc "A command line greeting" 
class Hello(String? name = process.args.first) {
	...
    
    doc "Print the greeting" 
    shared void say(OutputStream stream = process.output) {
        stream.writeLine(greeting);
    }
    
}

Our hello() method is now looking really simple:

doc "Print a personalized greeting" 
void hello() {
    Hello().say();
}
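
For comparison, a Java class would typically reach the same call-site convenience by adding one overload per default value. The sketch below uses a hypothetical JavaGreeter class and simplifies one thing: Java has no global equivalent of process.args, so the default name here is simply null.

```java
import java.io.PrintStream;

// Hypothetical Java equivalent: each default value costs one extra overload.
class JavaGreeter {
    private final String greeting;

    JavaGreeter(String name) {
        greeting = name == null ? "Hello, World!" : "Hello, " + name + "!";
    }

    // Overload standing in for a defaulted name parameter
    // (assumption: null replaces process.args.first).
    JavaGreeter() {
        this(null);
    }

    void say(PrintStream stream) {
        stream.println(greeting);
    }

    // Overload standing in for a defaulted stream parameter.
    void say() {
        say(System.out);
    }

    public static void main(String[] args) {
        new JavaGreeter().say();         // prints Hello, World!
        new JavaGreeter("Ceylon").say(); // prints Hello, Ceylon!
    }
}
```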

There's more...

In Part 3, we'll explore inheritance and refinement (overriding).

Ceylon presentation video

Posted by    |       |    Tagged as Ceylon

InfoQ posted a video of my presentation in China. This being the first time I had ever tried to talk about the language in front of other people, I'm quite disfluent, and even say some stuff that isn't correct. I certainly don't do a great job of explaining some things. Well, I'm sure that with all the conference invitations that have suddenly started filling up my inbox, I'll be getting plenty of practice. :-/
