Red Hat

Silliest persistence post. Ever.

Posted by Gavin King

Somehow, this silliness got linked by InfoQ. It's not really worth the effort of fisking this post, but I'm bored so I'll go ahead and do it anyway.

Active Record is a well known data persistence pattern. It has been adopted by Rails, Hibernate, and many other ORM tools.

Actually Hibernate does not implement the ActiveRecord pattern. If you really feel the need to classify it according to Fowler's taxonomy, Hibernate is a DataMapper.
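
To make the distinction concrete, here is a minimal Java sketch of the two patterns. The class names (`ActiveRecordPerson`, `Person`, `PersonMapper`) and the in-memory `Map` standing in for a database table are assumptions made for illustration; none of this is Hibernate's API.

```java
import java.util.HashMap;
import java.util.Map;

// Active Record: the domain class itself knows how to save and load its row.
class ActiveRecordPerson {
    // Hypothetical in-memory stand-in for a PERSON table, keyed by id.
    static final Map<Long, String> TABLE = new HashMap<>();

    final long id;
    String name;

    ActiveRecordPerson(long id, String name) {
        this.id = id;
        this.name = name;
    }

    void save() {
        TABLE.put(id, name);
    }

    static ActiveRecordPerson find(long id) {
        String name = TABLE.get(id);
        return name == null ? null : new ActiveRecordPerson(id, name);
    }
}

// Data Mapper: the domain class carries no persistence code at all...
class Person {
    final long id;
    String name;

    Person(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// ...and a separate mapper moves state between objects and the table.
class PersonMapper {
    private final Map<Long, String> table = new HashMap<>();

    void save(Person p) {
        table.put(p.id, p.name);
    }

    Person find(long id) {
        String name = table.get(id);
        return name == null ? null : new Person(id, name);
    }
}
```

In Hibernate the mapper's role is played, roughly speaking, by the session, which is why persistent classes need no save or find methods of their own.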

There is a 1:1 correspondence between tables and classes, columns and fields. (Or very nearly so). It is this 1:1 correspondence that bothers me. Indeed, it bothers me about all ORM tools. Why? Because this mapping presumes that tables and objects are isomorphic.

Well, this may be the case with ActiveRecord, but Hibernate and most other ORM solutions are a good deal more flexible in this regard. Indeed, one of the primary goals of any ORM solution is to allow objects and tables to be non-isomorphic to the extent that this makes sense.
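
One simple case of a non-isomorphic mapping is a fine-grained value type that has no table of its own. The sketch below is hand-written plain Java under assumed names (`Customer`, `Address`, `CustomerRowMapper`, and the column names); an ORM would derive the equivalent from mapping metadata rather than from code like this.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Fine-grained value type: a class in the object model, but not a table.
class Address {
    final String street;
    final String city;

    Address(String street, String city) {
        this.street = street;
        this.city = city;
    }
}

class Customer {
    final String name;
    final Address address;

    Customer(String name, Address address) {
        this.name = name;
        this.address = address;
    }
}

class CustomerRowMapper {
    // Two classes are flattened into the columns of one hypothetical CUSTOMER row.
    static Map<String, String> toRow(Customer c) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("NAME", c.name);
        row.put("ADDRESS_STREET", c.address.street);
        row.put("ADDRESS_CITY", c.address.city);
        return row;
    }

    // The reverse direction rebuilds both objects from the single row.
    static Customer fromRow(Map<String, String> row) {
        Address address = new Address(row.get("ADDRESS_STREET"), row.get("ADDRESS_CITY"));
        return new Customer(row.get("NAME"), address);
    }
}
```

Here two classes correspond to one table, so the 1:1 correspondence between tables and classes does not hold. Hibernate calls this kind of mapping a component.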

From the beginning of OO we learned that the data in an object should be hidden, and the public interface should be methods. In other words: objects export behavior, not data. An object has hidden data and exposed behavior.

Bizarre statement. Objects are stateful, and one of the main purposes of the methods of an object is to export that state (data) to the world.

Data structures, on the other hand, have exposed data, and no behavior. In languages like C++ and C# the struct keyword is used to describe a data structure with public fields. If there are any methods, they are typically navigational. They don’t contain business rules.

I don't believe that the term "data structure" implies anything at all about behavior. Certainly it does not imply the absence of behavior.

Thus, data structures and objects are diametrically opposed. They are virtual opposites. One exposes behavior and hides data, the other exposes data and has no behavior. But that’s not the only thing that is opposite about them.

Whoa! Diametrically opposed! Really?

On the contrary, most people I know would view an object as a package of:

  1. a data structure, and
  2. functionality that operates upon that data structure.

The data is an intrinsic part of the object; not its diametric opposite.

Algorithms that deal with objects have the luxury of not needing to know the kind of object they are dealing with.

This is not true.

The old example: shape.draw(); makes the point. The caller has no idea what kind of shape is being drawn.

Correct; the old example makes the point perfectly: the caller knows very precisely that the type of thing it is drawing is a Shape.
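
A small Java sketch of the point, with assumed names (`Shape`, `Circle`, `Square`, `Renderer`): the caller does not know which subtype it holds, but its own signature states exactly which type it requires.

```java
interface Shape {
    String draw();
}

class Circle implements Shape {
    public String draw() {
        return "circle";
    }
}

class Square implements Shape {
    public String draw() {
        return "square";
    }
}

class Renderer {
    // The parameter type records what the caller knows: the argument is a Shape.
    // Which Shape it is gets decided at run time by dynamic dispatch.
    static String render(Shape shape) {
        return "drawing " + shape.draw();
    }
}
```

Passing anything that is not a `Shape` to `render` is rejected by the compiler, which is precisely the knowledge the caller relies on.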

algorithms that employ objects are immune to the addition of new types ... objects are not immune to the addition of new functions ... Algorithms that use data structures are immune to the addition of new functions ... algorithms that employ data structures are not immune to the addition of new types ... Those portions of the system that are likely to be subject to new types, should be oriented around objects. On the other hand, any part of the system that is likely to need new functions ought to be oriented around data structures.

This is all predicated upon a false dichotomy between Objects and Data structures. In truth, there is no need to orient ourselves toward one view or the other in any particular part of the system.

Rather, we take advantage of subtyping (inheritance/polymorphism) when we design our object model. By careful design of the type hierarchy, we allow our system to accommodate the introduction of new types and new behaviors.
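
As a hedged illustration of one way a hierarchy can absorb both kinds of change, consider the following sketch. All names (`Employee`, `SalariedEmployee`, `HourlyEmployee`, `Payroll`) are assumptions, and it covers only the case where a new behavior can be written in terms of operations the supertype already declares.

```java
import java.util.List;

abstract class Employee {
    // The one operation each subtype must supply.
    abstract long monthlyPayCents();

    // A behavior added later, written once against the supertype.
    // Existing and future subtypes pick it up without modification.
    long annualPayCents() {
        return 12 * monthlyPayCents();
    }
}

class SalariedEmployee extends Employee {
    private final long salaryCents;

    SalariedEmployee(long salaryCents) {
        this.salaryCents = salaryCents;
    }

    @Override
    long monthlyPayCents() {
        return salaryCents;
    }
}

// A type added later; Payroll below does not change.
class HourlyEmployee extends Employee {
    private final long rateCents;
    private final int hoursPerMonth;

    HourlyEmployee(long rateCents, int hoursPerMonth) {
        this.rateCents = rateCents;
        this.hoursPerMonth = hoursPerMonth;
    }

    @Override
    long monthlyPayCents() {
        return rateCents * hoursPerMonth;
    }
}

class Payroll {
    // Written against Employee only, so it works for any subtype.
    static long totalAnnualCents(List<Employee> staff) {
        long total = 0;
        for (Employee e : staff) {
            total += e.annualPayCents();
        }
        return total;
    }
}
```

Adding `HourlyEmployee` required no change to `Payroll`, and adding `annualPayCents` required no change to either subtype.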

Again, note the almost diametrical opposition. Objects and Data structures convey nearly opposite immunities and vulnerabilities.

Repeatedly asserting a false dichotomy does not make it true.

The problem I have with Active Record is that it creates confusion about these two very different styles of programming.

There is only one style of programming.

A database table is a data structure. It has exposed data and no behavior.


But an Active Record appears to be an object.

Correct. An ActiveRecord is an object.

It has "hidden data", and exposed behavior. I put the word "hidden" in quotes because the data is, in fact, not hidden. Almost all ActiveRecord derivatives export the database columns through accessors and mutators.

There appears to be a deep confusion here about exactly what is supposed to be hidden. It is not state, nor data, that must be hidden; it is implementation details. These are not the same thing at all.

And products like Hibernate certainly do not require that database columns be exported via accessors and mutators. Persistent attributes may be private if appropriate. State (data) can be hidden or exposed depending upon what is required by client code.
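
The sketch below shows the idea in plain Java: a persistent attribute with no accessor or mutator, read directly from the field by a mapper. `Account` and `FieldAccessMapper` are hypothetical names, and the reflection code only approximates what a field-access strategy does; it is not Hibernate's implementation.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;

class Account {
    // Persistent state with no getter or setter.
    private long balanceCents;

    Account(long openingBalanceCents) {
        this.balanceCents = openingBalanceCents;
    }

    // Client code sees behavior, not the raw field.
    void deposit(long cents) {
        if (cents <= 0) {
            throw new IllegalArgumentException("deposit must be positive");
        }
        balanceCents += cents;
    }

    boolean canWithdraw(long cents) {
        return cents <= balanceCents;
    }
}

class FieldAccessMapper {
    // Hypothetical mapper: copies instance fields into a column map,
    // bypassing the class's public interface entirely.
    static Map<String, Object> snapshot(Object entity) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (Field f : entity.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                continue; // skip anything that is not ordinary instance state
            }
            try {
                f.setAccessible(true);
                row.put(f.getName(), f.get(entity));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("cannot read field " + f.getName(), e);
            }
        }
        return row;
    }
}
```

The persistence layer can reach `balanceCents`, while the rest of the application can only go through `deposit` and `canWithdraw`.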

Indeed, the Active Record is meant to be used like a data structure. On the other hand, many people put business rule methods in their Active Record classes; which makes them appear to be objects.

Again with the false dichotomy.

This leads to a dilemma. On which side of the line does the Active Record really fall? Is it an object? Or is it a data structure?

It is both. There is no line.

This dilemma is the basis for the oft-cited impedance mismatch between relational databases and object oriented languages. Tables are data structures, not classes. Objects are encapsulated behavior, not database rows.

The impedance mismatch is between the different modelling constructs available in the two paradigms.

The problem is that Active Records are data structures. Putting business rule methods in them doesn’t turn them into true objects.

On the contrary, a data structure, together with business rule methods is the very definition of an object.

In the end, the algorithms that employ Active Records are vulnerable to changes in schema, and changes in type. They are not immune to changes in type, the way algorithms that use objects are.

Any object-oriented code is vulnerable to changes in the type system. To argue that algorithms that use objects are invulnerable to changes in type is absurd in the extreme.

You can prove this to yourself by realizing how difficult it is to implement a polymorphic hierarchy in a relational database. It’s not impossible of course, but every trick for doing it is a hack. The end result is that few database schemae, and therefore few uses of Active Record, employ the kind of polymorphism that conveys the immunity of changes to type.

Nonsense. Traditional data modelling has powerful, elegant methodologies for modelling subtyping, and modern ORM solutions like Hibernate are easily able to map between the relational and object-oriented approaches to subtyping.
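
As one illustration, a class hierarchy can be stored in a single table with a discriminator column that records the subtype of each row. The following hand-written sketch uses assumed names (`Payment`, `CardPayment`, `BankTransfer`, `PaymentRowMapper`, and the column names); an ORM would generate the equivalent logic from mapping metadata.

```java
import java.util.Map;

abstract class Payment {
    final long amountCents;

    Payment(long amountCents) {
        this.amountCents = amountCents;
    }

    abstract String describe();
}

class CardPayment extends Payment {
    final String cardLast4;

    CardPayment(long amountCents, String cardLast4) {
        super(amountCents);
        this.cardLast4 = cardLast4;
    }

    @Override
    String describe() {
        return "card ending " + cardLast4;
    }
}

class BankTransfer extends Payment {
    final String iban;

    BankTransfer(long amountCents, String iban) {
        super(amountCents);
        this.iban = iban;
    }

    @Override
    String describe() {
        return "transfer from " + iban;
    }
}

class PaymentRowMapper {
    // Each map is one row of a hypothetical PAYMENT table.
    // PAYMENT_TYPE is the discriminator that selects the subclass.
    static Payment fromRow(Map<String, Object> row) {
        long amount = (Long) row.get("AMOUNT_CENTS");
        String type = (String) row.get("PAYMENT_TYPE");
        switch (type) {
            case "CARD":
                return new CardPayment(amount, (String) row.get("CARD_LAST4"));
            case "TRANSFER":
                return new BankTransfer(amount, (String) row.get("IBAN"));
            default:
                throw new IllegalArgumentException("unknown PAYMENT_TYPE: " + type);
        }
    }
}
```

Single-table mapping is only one option; a table per subclass joined to a parent table, or a table per concrete class, are the other common relational representations of the same hierarchy.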

So applications built around ActiveRecord are applications built around data structures. And applications that are built around data structures are procedural—they are not object oriented. The opportunity we miss when we structure our applications around Active Record is the opportunity to use object oriented design.

Persistent classes embody inheritance relationships and subtyping. There are many opportunities to take advantage of object oriented design principles. If your ORM solution doesn't support inheritance, get one that does.

Applications should be designed and structured around objects, not data structures. Those objects should expose business behaviors, and hide any vestige of the database. The fact that we have Employee tables in the database, does not mean that we must have Employee classes in the application proper.

The fact that we have employees in the business domain, and a notion of Employee in our business domain model, implies that we will almost certainly have an EMPLOYEE table and an Employee class in the relational and object-oriented realizations of this model, respectively.

We may have Active Records that hold Employee rows in the database interface layer, but by the time that information gets to the application, it may be in very different kinds of objects.

This is extremely unlikely. Both the relational schema and the object model are representations of the same entities that exist in the business domain. These entities exist in all layers of the application, from the user interface, all the way back to the database.

I am not recommending against the use of Active Record. As I said in the first part of this blog I think the pattern is very useful. What I am advocating is a separation between the application and Active Record. Active Record belongs in the layer that separates the database from the application. It makes a very convenient halfway-house between the hard data structures of database tables, and the behavior exposing objects in the application.

Oh wonderful. This stuff was all just an excuse for more useless layerism? Please, enough layers already!

So, in the end, I am not against the use of Active Record. I just don’t want Active Record to be the organizing principle of the application. It makes a fine transport mechanism between the database and the application; but I don’t want the application knowing about Active Records. I want the application oriented around objects that expose behavior and hide data. I generally want the application immune to type changes; and I want to structure the application so that new features can be added by adding new types.

The application should be oriented around the business domain model. Objects are an implementation detail.
