
This is the fourth installment in a series of articles introducing the Ceylon language. Note that some features of the language may change before the final release.

Sequences

Some kind of array or list construct is a universal feature of all programming languages. The Ceylon language module defines support for sequence types. A sequence type is usually written X[] for some element type X. But this is really just an abbreviation for the union type Empty|Sequence<X>.

The interface Sequence represents a sequence with at least one element. The type Empty represents an empty sequence with no elements. Some operations of the type Sequence aren't defined by Empty, so you can't call them if all you have is X[]. Therefore, we need the if (nonempty ... ) construct to gain access to these operations.

void printBounds(String[] strings) {
    if (nonempty strings) {
        //strings is a Sequence<String>
        writeLine(strings.first + ".." + strings.last);
    }
    else {
        writeLine("Empty");
    }
}

Note how this is just a continuation of the pattern established for null value handling.

Sequence syntax sugar

There's lots more syntactic sugar for sequences. We can use a bunch of familiar Java-like syntax:

String[] operators = { "+", "-", "*", "/" };
String? plus = operators[0];
String[] multiplicative = operators[2..3];

Oh, and the expression {} returns a value of type Empty.

However, unlike Java, all these syntactic constructs are pure abbreviations. The code above is exactly equivalent to the following de-sugared code:

Empty|Sequence<String> operators = Array("+", "-", "*", "/");
Nothing|String plus = operators.value(0);
Empty|Sequence<String> multiplicative = operators.range(2,3);

A Range is also a subtype of Sequence. The following:

Character[] uppercaseLetters = 'A'..'Z';
Natural[] countDown = 10..0;

is just sugar for:

Empty|Sequence<Character> uppercaseLetters = Range('A','Z');
Empty|Sequence<Natural> countDown = Range(10,0);

In fact, this is just a sneak preview of the fact that almost all operators in Ceylon are just sugar for method calls upon a type. We'll come back to this later, when we talk about operator polymorphism.

Iterating sequences

The Sequence interface extends Iterable, so we can iterate a Sequence using a for loop:

for (String op in operators) {
    writeLine(op);
}

Ceylon doesn't need C-style for loops. Instead, combine for with the range operator (..):

variable Natural fac:=1;
for (Natural n in 1..100) {
    fac*=n;
    writeLine("Factorial " n "! = " fac "");
}

If, for any reason, we need to use the index of each element of a sequence we can use a special variation of the for loop that is designed for iterating instances of Entries:

for (Natural i -> String op in entries(operators)) {
    writeLine($i + ": " + op);
}

The entries() function returns an instance of Entries<Natural,String> containing the indexed elements of the sequence.

Sequence and its supertypes

It's probably a good time to see some more advanced Ceylon code. What better place to find some than in the language module itself?

Here's how the language module defines the type Sequence:

shared interface Sequence<out Element>
        satisfies Correspondence<Natural, Element> & 
                  Iterable<Element> & Sized {
    
    doc "The index of the last element of the sequence."
    shared formal Natural lastIndex;
    
    doc "The first element of the sequence."
    shared actual formal Element first;
    
    doc "The rest of the sequence, without the first
         element."
    shared formal Element[] rest;

    shared actual Boolean empty {
        return false;
    }
        
    shared actual default Natural size {
        return lastIndex+1;
    }
    
    doc "The last element of the sequence."
    shared default Element last {
        if (exists Element x = value(lastIndex)) {
            return x;
        }
        else {
            //actually never occurs if 
            //the subtype is well-behaved
            return first; 
        } 
    }

    shared actual default Iterator<Element> iterator() {
        class SequenceIterator(Natural from) 
                satisfies Iterator<Element> {
            shared actual Element? head { 
                return value(from);
            }
            shared actual Iterator<Element> tail {
                return SequenceIterator(from+1);
            }
        }
        return SequenceIterator(0);
    }
    
}
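To see how first, rest and if (nonempty ... ) fit together, here is a small sketch written in the same pre-release syntax as the rest of this article. The function count() is a hypothetical helper, not part of the language module, and it uses only members shown in the definition above.

```ceylon
//Hypothetical helper, for illustration only.
//Counts the elements of a sequence by recursing over rest.
Natural count(String[] strings) {
    if (nonempty strings) {
        //strings is a Sequence<String> here, so rest is available
        return 1 + count(strings.rest);
    }
    else {
        //strings is Empty
        return 0;
    }
}
```

In real code you would simply ask for strings.size, since both Sequence and Empty satisfy Sized. The point of the sketch is only that rest is typed Element[], so each recursive step has to go through nonempty again.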

The most interesting operations are inherited from Correspondence, Iterable and Sized:

shared interface Correspondence<in Key, out Value>
        given Key satisfies Equality {
    
    doc "Return the value defined for the 
         given key."
    shared formal Value? value(Key key);
        
}

shared interface Iterable<out Element>
        satisfies Container {
    
    doc "An iterator of values belonging
         to the container."
    shared formal Iterator<Element> iterator();
    
    shared actual default Boolean empty {
        return !(first exists);
    }
    
    doc "The first object."
    shared default Element? first {
        return iterator().head;
    }

}

shared interface Sized
        satisfies Container {
        
    doc "The number of values or entries 
         belonging to the container."
    shared formal Natural size;
    
    shared actual default Boolean empty {
        return size==0;
    }
    
}

shared interface Container {
        
    shared formal Boolean empty;
    
}

Empty sequences and the Bottom type

Now let's see the definition of Empty:

object emptyIterator satisfies Iterator<Bottom> {
    
    shared actual Nothing head { 
        return null; 
    }
    shared actual Iterator<Bottom> tail { 
        return this; 
    }
    
}

shared interface Empty
           satisfies Correspondence<Natural, Bottom> & 
                     Iterable<Bottom> & Sized {
    
    shared actual Natural size { 
        return 0; 
    }
    shared actual Boolean empty { 
        return true; 
    }
    shared actual Iterator<Bottom> iterator() {
        return emptyIterator;
    }
    shared actual Nothing value(Natural key) {
        return null;
    }
    shared actual Nothing first {
        return null;
    }
    
}

The special type Bottom represents:

  • the empty set, or equivalently
  • the intersection of all types.

Since the empty set is a subset of all other sets, Bottom is assignable to all other types. Why is this useful here? Well, Correspondence<Natural,Element> and Iterable<Element> are both covariant in the type parameter Element. So Empty is assignable to Correspondence<Natural,T> and Iterable<T> for any type T. That's why Empty doesn't need a type parameter. The following code is well-typed:

void printAll(String[] strings) {
    variable Iterator<String> i := strings.iterator();
    while (exists String s = i.head) {
        writeLine(s);
        i := i.tail;
    }
}

Since both Empty and Sequence<String> are subtypes of Iterable<String>, the union type String[] is also a subtype of Iterable<String>.

Another cool thing to notice here is the return type of the first and value() operations of Empty. You might have been expecting to see Bottom? here, since they override supertype members of type T?. But as we saw in Part 1, Bottom? is just an abbreviation for Nothing|Bottom. And Bottom is the empty set, so the union Bottom|T of Bottom with any other type T is just T itself. In particular, Nothing|Bottom is just Nothing.

The Ceylon compiler is able to do all this reasoning automatically. So when it sees an Iterable<Bottom>, it knows that the operation first is of type Nothing, i.e. it is the value null.

Cool, huh?
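To make the three cases concrete, here is a sketch in the article's pre-release syntax. The method firsts() is hypothetical and exists only to show which type the compiler should assign to first in each case, assuming the definitions of Sequence, Empty and Iterable given above.

```ceylon
//Hypothetical method, for illustration only.
void firsts(Empty none, Sequence<String> some, String[] either) {
    //Empty refines first to Nothing, so this is always null
    Nothing n = none.first;
    //Sequence refines first to Element, so this is never null
    String s = some.first;
    //all we know about String[] is Iterable<String>, so first is optional
    String? e = either.first;
}
```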

Sequence gotchas for Java developers

Superficially, a sequence type looks a lot like a Java array, but really it's very, very different! First, of course, a sequence type Sequence<String> is an immutable interface, not a mutable concrete type like an array. We can't set the value of an element:

String[] operators = .... ; 
operators[0] := "**"; //compile error

Furthermore, the index operation operators[i] returns an optional type String?, which results in quite different code idioms. To begin with, we don't iterate sequences by index like in C or Java. The following code does not compile:

for (Natural i in 0..operators.size-1) { 
    String op = operators[i]; //compile error 
    ...
}

Here, operators[i] is a String?, which is not directly assignable to String.

Instead, if we need access to the index, we use the special form of for shown above.

for (Natural i -> String op in entries(operators)) { 
    ...
}

Likewise, we don't usually do an upfront check of an index against the sequence length:

if (i>operators.size-1) { 
    throw IndexOutOfBoundException();
} 
else {
    return operators[i]; //compile error
}

Instead, we do the check after accessing the sequence element:

if (exists String op = operators[i]) { 
    return op;
} 
else {
    throw IndexOutOfBoundException();
}

We especially don't ever need to write the following:

if (i>operators.size-1) { 
    return "";
} 
else {
    return operators[i]; //compile error
}

This is much cleaner:

return operators[i] ? "";

All this may take a little getting used to. But what's nice is that all the exact same idioms also apply to other kinds of Correspondence, including Entries and Maps.
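As a rough sketch of that last point, the helper below is written against Correspondence rather than against a sequence type. The name valueOrDefault() is hypothetical, and the code uses only the value() operation and the ? operator shown earlier, in the same pre-release syntax.

```ceylon
//Hypothetical helper, for illustration only.
//Accepts any Correspondence from Natural to String, not just a sequence.
String valueOrDefault(Correspondence<Natural,String> strings, Natural index) {
    //value() returns a String?, so we supply a default for the null case
    return strings.value(index) ? "";
}
```

A String[] can be passed as the first argument, since both Empty and Sequence<String> are assignable to Correspondence<Natural,String>.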

There's more...

In Part 5 we'll talk about union types and algebraic data types, type switching, and type inference.

16 comments:
 
28. Apr 2011, 15:09 CET | Link
david

Really exciting! Maybe you could explain details of generics / covariants / contravariants in one of the next installments...

 
28. Apr 2011, 16:32 CET | Link
david wrote on Apr 28, 2011 09:09:
Really exciting! Maybe you could explain details of generics / covariants / contravariants in one of the next installments...

Yes, it might be time for that. But I also need to go back and fill in a couple of things I've left out:

  • variable locals and attributes
  • packages, modules, and import statements
  • operators and operator polymorphism
  • type narrowing using if (is ... ) and case (is ... )
  • the useful types Object, IdentifiableObject, Entry<U,V>, and Entries<U,V>

We also haven't done numeric types yet.

I'm never sure whether it's better to first explain the scary (co|contra)variance stuff before getting into sequences and numeric types (which use this feature), or explain these familiar types first and gloss over what in and out means in their definitions. The truth is that variance annotations are really pretty non-scary once you get them, but it takes some effort the first time.

 
28. Apr 2011, 18:23 CET | Link

OK, I've decided what Part 5 will cover:

  • type narrowing with if (is ... ) and case (is ... )
  • revisit union types
  • algebraic data types (the of clause)
  • the visitor pattern
  • typesafe enums
  • type inference (the local keyword)
  • inferred type of sequence enumerations (expression like { x, y, z })

I think that material all fits together pretty well, since it's all related in some way to union types.

 
02. May 2011, 23:51 CET | Link
Hi,
So we have "exists" for null checks and "nonempty" for checking sequences. Are there "nonexists" and "empty" statements also?

If yes, wouldn't it be better to combine them? Like PHP empty()? One of the most commonly repeated lines I've seen in Java was checking a String against null and against being empty (""). Whereas C# has a library method for that and empty() in PHP is much more universal (nulls, empty values, empty arrays, False, 0, etc.). IMO it was pretty easy to use :)

What do you think?
 
03. May 2011, 00:07 CET | Link
Mateusz Mrozewski wrote on May 02, 2011 17:51:
Hi, So we have exists for null checks and nonempty for checking sequences. Are there nonexists and empty statements also?

There are exists and nonempty operators. (And is and in operators.) Not sure what you mean by statements in this context.

If yes, wouldn't it be better to combine them? Like PHP empty()? One of the most commonly repeated lines I've seen in Java was checking a String against null and against being empty (""). Whereas C# has a library method for that and empty() in PHP is much more universal (nulls, empty values, empty arrays, False, 0, etc.). IMO it was pretty easy to use :) What do you think?

I didn't mention it, but yes, if (nonempty ... ) does do an exists check, i.e. it can accept a T[]?, casting to Sequence<T>.

And yeah, nonempty was inspired by various scripting languages ;-)

 
03. May 2011, 00:15 CET | Link
There are exists and nonempty operators. (And is and in operators.) Not sure what you mean by statements in this context.

I meant operators ;) And my question in other words: what's the opposite of the exists and nonempty operators? How do I write an if without an else, where I specifically expect null or an empty sequence?

 
03. May 2011, 00:21 CET | Link
Mateusz Mrozewski wrote on May 02, 2011 18:15:
There are exists and nonempty operators. (And is and in operators.) Not sure what you mean by statements in this context. I meant operators ;) And my question in other words: what's the opposite of the exists and nonempty operators? How do I write an if without an else, where I specifically expect null or an empty sequence?

if (is Nothing foo) { ... }

and

if (is Empty foos) { ... }

are probably going to work.

Or, alternatively:

if (! foo exists) { ... }

and

if (! foos nonempty) { ... }

But I find these slightly less readable.

 
03. May 2011, 00:24 CET | Link

Note: the exists, nonempty, and is Type operators are postfix.

 
08. May 2011, 14:36 CET | Link

Hello,

I couldn't find it in the previous parts - a method that is actual default means that the method implements a method from an interface, and can be further refined (overridden)? If the method was only actual, subclasses couldn't override it?

Adam

 
09. May 2011, 02:31 CET | Link
Adam Warski wrote on May 08, 2011 08:36:
I couldn't find it in the previous parts - a method that is actual default means that the method implements a method from an interface, and can be further refined (overridden)? If the method was only actual, subclasses couldn't override it?

Right. In Ceylon, actual default is like override virtual in C#.

(I've never liked the word override here, because it's a verb, not an adjective, and so it reads as if the annotation or type declaration that follows it is the object of the verb. OTOH, I could have gone with virtual instead of default, but default seems so much more descriptive. To me, virtual would be a really good word for what we ended up calling formal.)

 
17. May 2011, 21:32 CET | Link

I'm curious about the syntax used here and what's behind it:

for (Natural i -> String op in entries(operators)) {
    writeLine($i + ": " + op);
}

What type is being returned per iteration of entries? Is it a tuple?

Does the -> then signify destructuring assignment of some sort? In which case, why the arrow syntax, it seems oddly out of place with the rest of the code I've seen, (I've read up to part 10, I was hoping you'd go into this at some point).

 
18. May 2011, 04:23 CET | Link
What type is being returned per iteration of entries? Is it a tuple?

It's an instance of Entry<Natural,String>. The -> symbol is actually an operator that constructs an Entry when it appears in an expression like this:

Entries<Natural,String> operators = Entries { 0->"+", 1->"-", 2->"*", 3->"/" };

That's just an expression that instantiates an Entries, using a named argument invocation. It desugars to

Entries<Natural,String> operators = Entries { Entry(0,"+"), Entry(1,"-"), Entry(2,"*"), Entry(3,"/") };

But in the example you're asking about, -> is not an operator; it's a special part of the syntax of for.

Does the -> then signify destructuring assignment of some sort?

Yeah, kinda. Ceylon doesn't support pattern matching or any other kind of general-purpose destructuring syntax. But it does have this special support for destructuring Entrys, since that is just such a common thing.

 
21. May 2011, 11:06 CET | Link
Yeah, kinda. Ceylon doesn't support pattern matching or any other kind of general-purpose destructuring syntax. But it does have this special support for destructuring Entrys, since that is just such a common thing.

What if you had general purpose destructuring everywhere? This would enable things like multiple return values, and -> in for would no longer be a special case.

Natural foo -> String bar = blah();
local foo -> bar = blah();
local head -> rest = operators.headrest();
 
23. May 2011, 19:19 CET | Link

From the article

shared interface Iterable<out X> 
       satisfies Container {
    
    doc "An iterator of values belonging
         to the container."
    shared formal Iterator<X> iterator();
    
    shared actual default Boolean empty {
        return !(first exists);
    }
    
    doc "The first object."
    shared default X? first {
        return iterator().head;
    }

}

shared interface Sized 
       satisfies Container {
        
    doc "The number of values or entries 
         belonging to the container."
    shared formal Natural size;
    
    shared actual default Boolean empty {
        return size==0;
    }
    
}

Both interfaces define the method/attribute spelled empty. Which implementation is chosen in the case that a class implements both these interfaces? For example:

shared class Set<out T>
       satisfies Iterable<T>, Sized {

       ...
}
 
23. May 2011, 19:34 CET | Link
Which implementation is chosen in the case that a class implements both these interfaces?

The compiler forces you to refine empty in Set, thus resolving the ambiguity.

Note that in this case, the two inherited empty implementations both refine empty from Container, so they're actually the same attribute.

If this were not the case, and you were just going to inherit two unrelated attributes that happened to share a name, the compiler would simply prevent you from defining Set as satisfying both interfaces. At least, that is what the language spec currently says. A future version of the language might allow you to rename inherited members in the import statement, a possibility we've already discussed in another comment thread.

 
24. May 2011, 00:35 CET | Link

Ooh. I just noticed the nice feature of covariant return types with respect to type unions. I spotted it here:

shared interface Iterable<out X> 
       satisfies Container {
    
    // ...

    shared default X? first {
        return iterator().head;
    }

}

shared interface Sequence<out X>
       satisfies Correspondence<Natural, X> & 
                 Iterable<X> & Sized  {
    // ...
    shared actual formal X first;
}

Notice how Iterable<X> defines first as returning type X? (aka Nothing|X), but Sequence<X> reduces this to plain-old-X.

I was trying to dream up an SDK/Collections library. I had emulated the provided interfaces in java, using aspectj to provide the interface's default implementations. I had to implement an Optional<X> class to emulate the X? syntax, but full on type unions--especially this reduction--I cannot do in java/aspectj.

I consider this a good thing--for Ceylon. It's something unique and useful coming to the OOPL table.

Keep it up!
