I am currently the lead of the JBoss Application Server project. I am also a member of the JCP, and represent Red Hat on the future Java EE6 specification. During my tenure at JBoss, I have worked in many areas including the application server, clustering, web services, AOP, and security. My interests include concurrency, distributed computing, hardware and programming language design.
Recently the EE6 co-lead, Roberto Chinnici, blogged about the state of profiles in EE6, and requested feedback on which direction the group should take. This was also posted to InfoQ and TSS. Essentially we have a choice between two options:
- Rubberstamp Tomcat
- Provide a complete framework for web development.
The first option offers nothing new that doesn't already exist today. Right now you can download Tomcat and, unless you are still developing with just Servlets and JSPs, install the many different frameworks you need to build a modern web application. You are then left to discover the magic version combination that doesn't have dependency conflicts or integration issues.
What we really need is a standardized solution which addresses the three most common needs of modern web applications:
- A data persistence framework
- A component framework
- A rich presentation framework
The full Java EE platform has good solutions for all of these aspects today, and more work is being done to improve them (e.g Web Beans, ability to deploy EJBs in a war, etc). However, the full EE platform also contains many other standards, most of which are focused towards EIS (like CORBA, RMI, JMS, JCA, etc). EIS is, of course, not necessary for a significant portion of web applications. So, the central idea behind option 2, is to deliver a version of the platform that is truly focused towards web development.
Not only can we omit specifications that don't serve the above goals, we can also improve the ones that do to better meet our new-found focus. A good example of this, is EJB-Lite, which only requires support for local session components and JPA.
Taking all of this into account we end up with what I view is the ideal combination, which I have strongly advocated on the EG:
- JPA 2.0
- JTA 1.1
- EJB Lite 3.1
- Web Beans 1.0
- JSF 2.0
- Servlet 3.0 (and friends, jsp, el, jstl etc)
I should also mention that non-standard frameworks would of course still be usable on this profile. As an example, Web Beans will offer an SPI so that any web framework can take advantage of the improved/simplified component integration layer, should it choose to.
However, I think the goal should be to provide a good out-of-the-box solution for web development, and a
tomcat profile definitely falls short.
I stumbled upon an interesting post by David J. DeWitt and Michael Stonebraker entitled
For those not familiar with MapReduce, it is a programming model developed by Google for distributed computing on extremely large sets of data. You can read Google's original paper outlining the technique here.
The authors of this post make some excellent points, most of which are centered upon the importance of a well defined structure and abstraction to data. While I certainly agree to a number of them, it's hard to ignore the simplicity and effectiveness of MapReduce. After all, it is currently being used to process 20 petabytes of data a day.
While there are many things I like about the Java language, the lack of unsigned types has always bothered me.
According to Gosling, they were too complicated:
Quiz any C developer about unsigned, and pretty soon you discover that almost no C developers actually understand what goes on with unsigned, what unsigned arithmetic is. Things like that made C complex.
Ironically this kind of hand-holding tends to introduce other complexities that are often more difficult to deal with than the original solution. In this particular case, leaving out unsigned types doesn't stop the need to work with unsigned data. Instead, it forces the developer to work around the language limitation in various unusual ways.
The first major problem in this system is that byte is signed in Java. Out of all of the code I have ever written, I can only think of a select few situations where I needed a signed byte value. In almost all cases I wanted the unsigned version.
Let's look at a very simple example, initializing a byte to be 0xFF (or 255), the following will fail (note: this works in C#, since they made byte unsigned):
byte b = 0xFF;
Java will not narrow this for us because the value is outside the range of the signed byte type (>127). We can, however, work around it with a cast:
byte b = (byte) 0xFF;
If we are clever and know twos compliment we can use a negative equivalent of our simple unsigned value:
byte b = -1;
This is just the tip of the iceberg though. A common technique used in search and compression algorithms is to precompute a table based on the occurrence of a particular byte value. Since a byte can represent 256 values, this is typically done using an array with the byte value as an index, which is very efficient. Ok so you might think you can do the following:
byte b = (byte) 0xFF; int table = new int; table[b] = 1; // OOPS!
While this code will legally compile, it will result in a runtime exception. What happens is that the array index operator requires an integer. Since a byte is specified instead, Java converts the byte to an integer, and this results in sign extension. Again, 0xFF means -1 for a signed byte, so it gets converted to an integer with a value of -1. This, of course, is an invalid array index.
To solve the problem, we must use the bitwise-and operator to force the conversion to occur in the correct (yet unintuitive) way like so:
table[b & 0xFF] = 1;
This technique gets ugly quick. Take a look at composing an int from 4 bytes (Ugh!):
byte b1 = 1; byte b2 = (byte)0x82; byte b3 = (byte)0x83; byte b4 = (byte)0x84; int i = (b1 << 24) | ((b2 & 0xFF) << 16) | ((b3 & 0xFF) << 8) | (b4 & 0xFF);
These issues have in turn lead to odd API workarounds. For example, look at InputStream.read() , which according to its docs is supposed to return a byte, but instead returns an integer. Why? So it can do the & 0xFF for you.
We also have a DataOutput.writeShort() and DataOutput.writeByte() that take integers instead of the their respective types. Why? So that you can output unsigned values on the wire. On the reading side we end up with four methods DataInput.readShort(), DataInput.readUnsignedShort(), DataInput.readByte(), and DataInput.readUnsignedByte(). The
unsigned versions return converted integers instead of the described type names.
To add to the confusion, we also have 2 right shift operators in this signed only mess. The
unsigned right shift operator, which treats the type as if it were unsigned, and the normal right shift, which preserves the sign (essentially acts like divide by 2). If we want to get the most significant nibble value of an integer we need the unsigned version.
int i = 0xF0000000; System.out.printf("%x\n", i >> 28); // Returns ffffffff! System.out.printf("%x\n", i >>> 28); // Returns f, as desired
So I ask all of you, was all of this hassle worth leaving out the simple and well understood
unsigned keyword? I think not, and I hope anyone who considers doing this in another language they are designing learns from it. At least C# has.