Today Oracle's Mark Reinhold published the requirements for comment a public draft of the Java Module system being proposed for inclusion into the Java SE 8 platform. The requirements as given are fairly high-level yet comprehensive, and many of these requirements align well with the goals and specifications of our own JBoss Modules system, which is not only a strong validation of our own design but also what I think is a good sign for the future of the Java platform.
The requirements as posted however contain some things that are specifically noteworthy and some things which in my view should definitely be changed, based on my own experience with implementing the JBoss Modules environment as well as applying that environment to JBoss AS 7 and 6 (which we did on an experimental basis a while back), as well as feedback from my colleagues responsible for JBoss OSGi, which is presently implemented atop JBoss Modules. Being very sizable and very different projects, these efforts have taught us a lot about the practical nature of modularity in Java.
Since this is a big document, I'm going to comment on it a bit at a time and see where we end up.
The Requirements
The document, which may be found here, is broken up into a number of sections, which I will attempt address approximately in order.
Fundamentals: Versions
The fundamentals paragraph items all indicate a strong sense that the Jigsaw project is on the right general track. However I found a couple sections noteworthy:
Resolution in all phases — Build-time, install-time, and run-time module resolution and linking must all be supported. UNIX has all three models and they are all still in use 40 years later.
coupled with the following paragraph:
Fidelity across all phases — The set of modules seen by a library or application must be computed by the same algorithm at build time, install time, and run time.
and this one:
Versioning — The module system must support common version-string schemes [...]. It must be possible to specify a range of allowable versions when declaring a module dependence.
These seemingly reasonable statements hide a particularly tricky little goblin; namely, build reproducibility. At build-time, an artifact (to use the Maven term) is identified in part by its version. And that identity has a specific meaning. If I check out version 1.2.3 of jboss-cobwobble from source control and build it, I should get an artifact which is equivalent to one which anyone else who checks out and builds the same version of the same project gets.
However, that means that such builds cannot use version ranges. By definition this creates a way for two parties building the same project to get different output, thus blurring the meaning of what a version is. In the first paragraph, Mark emphasises the similarities to the shared library model and recommends that similar practices be followed to that paradigm, however there is a key difference in that when one builds against a shared library the linkage is much less invasive. A change to a compiled library used by a project does not generally affect the compilation results of that project, whereas in Java this is much easier to do, since class files are a much richer format than a shared object symbol table.
Therefore it is very important to allow (better yet, require) builds to use specific dependency versions when building, and allow version ranges only at install and run time, where version ranges are a much better fit.
Another very important consideration is that after an artifact is produced, the versions of its dependencies that it is known to be compatible with evolves over time as new versions of various things are released. Thus dependency version ranges with a closed upper end cannot be a fixed part of the module's internal metadata, or there is a risk over time that the module will have to be repackaged to accomodate new version compatibilities. My personal recommendation in fact is that modules only ever support a lower bound to version ranges for package/run-time constraining.
Fundamentals: Target constraints on modules
I admit I do not understand the purpose of this paragraph, which reads as follows:
Target constraints on modules — On a particular target platform it must be possible to declare that only modules with specific properties can be installed, e.g., particular authors, publishers, or licenses. The resolution algorithm must ignore any module that does not satisfy the target platform’s module constraints.
If the purpose of this is security, which I can very much get behind, then I think the only sensible approach is to constrain module loading to signed modules. Otherwise, if the goal is simply to create a mechanism by which administrators can annoy each other, then I guess this fits the bill; the restriction can be bypassed by changing the metadata, which means that it provides neither security nor convenience.
Overall I think this mechanism also risks fragmentation of the module ecosystem that would (hopefully) arise around this enhancement. I think Perl for example benefits greatly from a single repository of modules which is yet open to contributors with relatively little restriction; yet operating system distributors (know any of those?) have established best practices for the distribution of these modules. And of course today's Java developers often use the Maven central repository and expect it to contain everything
in reasonable locations. However by creating these filtering mechanisms, the problem of many publishers is being solved at the wrong place, and it allows for some potentially very surprising behavior (I can see the FAQ now: I installed a module but Java says it's not found... what gives?
).
Fundamentals: Native code
This section basically says that modules need to support native code somehow, though it seems to imply that the native library has to live inside the module itself, which I think is probably unneccessary (though of course it would have to exist within the module packaging, else there would be no way to actually install it). In JBoss Modules, we simply require that native libraries exist in filesystem-backed resource loaders (as opposed to jar-backed resource loaders which one would normally use for classes). In addition the filesystem resource loader uses the name of the OS and hardware platform to allow for packaging modules which contain native bits for more than one platform (actually not unlike how Perl's DynaLoader does it, if I recall correctly).
Fundamentals: Package subsets and Shared class loaders
These sections I have a serious problem with:
Package subsets — In support of platform modularization, it must be possible for the types defined in a Java package to be provided by more than one module yet still be loaded by the same class loader at run time. This is required so that CDC-sized subsets of large legacy packages such as java.util can be defined.
Shared class loaders — In support of platform modularization, it must be possible to declare that the types defined in a specific set of modules must be loaded by the same class loader.
I am very much against this notion. One module, one class loader, period, end of story. If you are splitting a single class loader among many modules, you haven't created modules, you've created one module with a class path of several JARs, of which some stuff may be missing and others not, and a hell of a lot of useless complexity, all for the privilege of saying sure, we modularized this
. There is no isolation of any sort between them beyond your standard access controls provided by the Java language. Without a strong definition of a module as an encapsulating vehicle for a class loader, the whole idea is weakened - visibility is only sensibly enforced in terms of whole class loaders (yes, while it is probably possible to use a different unit of division like class, it will definitely have a negative performance impact).
Rather than thus weakening the definition of a module, I recommend that modules be allowed to be packaged in portions, which may be separately shipped and installed - perhaps a module/submodule concept could be introduced, where the contract of a submodule is clearly defined as sharing the parent module's class loader. In other words, the label should be clearly marked, so to speak. There is no reasonable expectation that a package may be split between modules, and doing so is definitely a violation of the Principle of Least WTF.
Security: Where art thou?
Though there are various paragraphs which allude to it, there is no specific section addressing security, which I think is a somewhat serious oversight. I would personally like to see some enhancements to the standard security provider mechanism to accomodate modules and module signers more conveniently than the mechanisms at our disposal today, and I commented earlier about module signing, to say nothing about modular security provider discovery (okay, this kind of thing is currently listed as a non-requirement to be fair - but it will always be in the back of my mind, and you can bet that it will show up in JBoss AS 7.x at some point if I have anything to say about it).
Summary, Part 1
That's about it for the first chapter. Overall the Fundamentals leave me with a sense that things are somewhat more under control in Jigsaw-land... in particular, it's good news that the brakes are apparently being gently applied to implementation in order to come up with real requirements, which are essential to ensuring that a project is firmly on the correct planetary surface. Just ask any of my colleagues, they'll tell you how I like a good requirements document!
Next time I'll try to cover a little more ground, as the subsequent chapters are mostly rather shorter than Fundamentals is.