Anonymous functions or Smalltalk-style arguments?

In Introduction to Ceylon Part 8 we discussed Ceylon's support for defining higher order functions, in particular the two different ways to represent the type of a parameter which accepts a reference to a function. The following declarations are essentially equivalent:

X[] filter<X>(X[] sequence, Callable<Boolean,X> by) { ... }
X[] filter<X>(X[] sequence, Boolean by(X x)) { ... }

We've even seen how we can pass a reference to a method to such a higher-order function:

Boolean stringNonempty(String string) {
    return !string.empty;
}
String[] nonemptyStrings = filter(strings, stringNonempty);

Of course, almost all of the convenience of general-purpose higher order functions like filter() is lost if we have to declare a whole method every time we want to use the higher order function. Indeed, much of the appeal of higher order functions is the ability to eliminate verbosity by having more specialized versions of traditional control structures like for.

Most languages with higher order functions support anonymous functions (often called lambda expressions), where a function may be defined inline as part of the expression. My favored syntax for this in a C-like language would be the following:

(String string) { return !string.empty; }

This is an ordinary method declaration with the return type and name eliminated. Then we could call filter() as follows:

String[] nonemptyStrings = filter( strings, (String string) { return !string.empty; } );

Since it's extremely common for anonymous functions to consist of a single expression, I favor allowing the following abbreviation:

(String string) (!string.empty)

The parenthesized expression is understood to be the return value of the method. Then the invocation of filter() is a bit less noisy:

String[] nonemptyStrings = filter(strings, (String string) (!string.empty));

This works, and we could support this syntax in the Ceylon language.
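For readers who want something runnable to poke at, here is a rough analogue in Python rather than Ceylon. The names filter_by and string_nonempty and the sample strings are made up for illustration. The point is only to show the same higher-order function called once with a reference to a named function and once with an inline, single-expression anonymous function.

```python
from typing import Callable, List, Sequence, TypeVar

X = TypeVar("X")


def filter_by(sequence: Sequence[X], by: Callable[[X], bool]) -> List[X]:
    """Keep the elements of `sequence` for which `by` returns True."""
    return [x for x in sequence if by(x)]


def string_nonempty(string: str) -> bool:
    """Named predicate, analogous to stringNonempty() above."""
    return len(string) > 0


# Hypothetical sample data.
strings = ["hello", "", "world", ""]

# Passing a reference to a named function.
nonempty_named = filter_by(strings, string_nonempty)

# Passing an anonymous function made of a single expression.
nonempty_lambda = filter_by(strings, lambda string: len(string) > 0)
```

Both calls produce the same list. The second one simply avoids declaring a whole function for a one-line predicate.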

Let's look at some more examples of how we would use anonymous functions:

  • Assertion:
    assert ("x must be positive", () (x>0.0))
  • Conditionals:
    when (x>100.0, () (100.0), () (x))
  • Repetition:
    repeat(n, () { writeLine("Hello"); })
  • Tabulation:
    tabulateList(20, (Natural i) (i**3))
  • Comprehension:
    from (people, (Person p) (p.name), (Person p) (p.age>18))
  • Quantification:
    forAll (people, (Person p) (p.age>18))
  • Accumulation (folds):
    accumulate (items, 0.0, (Float sum, Item item) (sum+item.quantity*item.product.price))

The problem is that I don't find these code snippets especially readable. Too much nested punctuation. They certainly fall short of the readability of built-in control structures like for and if. And the problem gets worse for multi-line anonymous functions. Consider:

repeat (n, () {
    String greeting;
    if (exists name) {
        greeting = "Hello, " name "!";
    }
    else {
        greeting = "Hello, World!";
    }
    writeLine(greeting);
});

Definitely much uglier than a for loop!

One language where anonymous functions really work is Smalltalk - to the extent that Smalltalk doesn't need any built-in control structures at all. What is unique about Smalltalk is its funny method invocation protocol. Method arguments are listed positionally, like in C or Java, but they must be preceded by the parameter name, and aren't delimited by parentheses. Let's transliterate this idea to Ceylon.

String[] nonemptyStrings = filter(strings) by (String string) (!string.empty);

Note that we have not changed the syntax of the anonymous function here; we've just moved it outside the parentheses. If we were to adopt this syntax, we could make empty parameter lists optional, without introducing any syntactic ambiguity, allowing the following:

repeat (n) 
perform {
    String greeting;
    if (exists name) {
        greeting = "Hello, " name "!";
    }
    else {
        greeting = "Hello, World!";
    }
    writeLine(greeting);
};

This looks much more like a built-in control structure. Now let's see some of our other examples:

  • Assertion:
    assert ("x must be positive") that (x>0.0)
  • Conditionals:
    when (x>100.0) then (100.0) otherwise (x)
  • Repetition:
    repeat(n) perform { writeLine("Hello"); }
  • Tabulation:
    tabulateList(20) containing (Natural i) (i**3)
  • Comprehension:
    from (people) select (Person p) (p.name) where (Person p) (p.age>18)
  • Quantification:
    forAll (people) every (Person p) (p.age>18)
  • Accumulation (folds):
    accumulate (items, 0.0) using (Float sum, Item item) (sum+item.quantity*item.product.price)
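
Python's keyword arguments give a loose approximation of this labelled style, so here is a hedged sketch of how a few of these functions might be defined and called. The signatures, the zero-based index passed to tabulate_list, and the sample data are assumptions of the sketch, not anything fixed by Ceylon. Only the label names (perform, containing, every, using) come from the examples above.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

A = TypeVar("A")
T = TypeVar("T")


def repeat(n: int, perform: Callable[[], None]) -> None:
    """Call `perform` n times."""
    for _ in range(n):
        perform()


def tabulate_list(size: int, containing: Callable[[int], T]) -> List[T]:
    """Build a list of `size` elements from their indexes (assumed to start at 0)."""
    return [containing(i) for i in range(size)]


def for_all(items: Iterable[T], every: Callable[[T], bool]) -> bool:
    """True if `every` holds for each item."""
    return all(every(item) for item in items)


def accumulate(items: Iterable[T], initial: A, using: Callable[[A, T], A]) -> A:
    """Left fold: combine the items into one value, starting from `initial`."""
    result = initial
    for item in items:
        result = using(result, item)
    return result


# Hypothetical sample data: ages, and (quantity, price) pairs.
ages = [21, 35, 19]
items: List[Tuple[int, float]] = [(2, 1.5), (1, 4.0)]

greetings: List[str] = []
repeat(3, perform=lambda: greetings.append("Hello"))
cubes = tabulate_list(5, containing=lambda i: i ** 3)
all_adults = for_all(ages, every=lambda age: age > 18)
total = accumulate(items, 0.0, using=lambda total, item: total + item[0] * item[1])
```

The Ceylon versions above say the same thing without the lambda keyword or the equals signs, which is a large part of why they read like control structures.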

Well, I'm not sure about you, but I find all these examples more readable than what we had before. In fact, I like them so much better that I'm tempted not to support the more traditional lambda expression style at all.

On the other hand, this syntax is pretty exotic, and I'm sure lots of people will find it difficult to read at first.

Now, in theory, there's no reason why we can't support both variations. But we've worked really hard to create a language with a consistent style, where there is usually one obvious way to write something (obviously the choice between named and positional arguments is a big exception to this, but we have Good Reasons in that case). The trouble is that supporting many individually harmless syntactic variations has the potential to make a language harder to read overall, and results in annoying things like:

  • coding standards
  • arguments over coding standards
  • shitty tooling to enforce coding standards
  • arguments over which shitty tooling to use to enforce coding standards
  • empowerment of people who are more interested in arguing over coding standards and shitty tools that enforce coding standards over people who want to get work done

So this is definitely an issue we need lots of feedback on. Should we support:

  • traditional anonymous functions?
  • anonymous functions only as Smalltalk-style arguments?
  • both?

The answer just isn't crystal clear to us.

UPDATE: I realize that this post is an invitation for everyone to suggest their own favorite syntax for anonymous functions. I expect to see all these kinds of things:

#{String string -> !string.empty}
function (String string) { return !string.empty; }
\String string -> !string.empty

That's fine, but please keep in mind that I'm looking for something which:
  • is very regular with a normal C-style function declaration, and
  • is not impossible to parse in the context of the rest of the language.

Almost anything you can invent yourself or copy from some other language will fail one or both of these two requirements.

UPDATE 2: For completeness, I should mention that using a named argument invocation you can write the following:

String[] nonemptyStrings = filter {
    sequence = strings;
    local by(String string) {
        return !string.empty;
    }
};

Or even:

String[] nonemptyStrings = filter {
    sequence = strings;
    by = compose(not,String.empty);  //compose() is a higher-order function that composes two functions
};

I think this syntax makes sense in some places (for example, callbacks in a user interface definition), but isn't ideal for the problems we've been discussing here.
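
As a minimal sketch of what a compose() like the one above might do, here it is in Python rather than Ceylon. The argument order (outer function first, matching compose(not, String.empty)) is an assumption read off the example, and is_empty is a hypothetical stand-in for String.empty.

```python
import operator
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


def compose(outer: Callable[[B], C], inner: Callable[[A], B]) -> Callable[[A], C]:
    """Return a function that applies `inner` first, then `outer` to its result."""
    def composed(value: A) -> C:
        return outer(inner(value))
    return composed


def is_empty(string: str) -> bool:
    """Hypothetical stand-in for String.empty."""
    return len(string) == 0


# operator.not_ plays the role of `not` in the Ceylon example.
nonempty = compose(operator.not_, is_empty)

strings = ["hello", "", "world"]
nonempty_strings = [s for s in strings if nonempty(s)]
```

No anonymous function is needed at all here: the predicate is built entirely out of existing function references.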

