
Repository Pattern vs. Transparent Persistence

Posted by Christian Bauer

I've recently read the book Domain Driven Design, which apparently is now the new bible for some folks who totally think in objects. I even mentioned it in HiA (Hibernate in Action) and JPwH (Java Persistence with Hibernate) - unfortunately I hadn't read it at that point.

I now think that quite a few of the patterns promoted by this book are actually a step backwards. I also miss a list of Pros and Cons for each pattern (like in the GoF book). So readers have to guess why and when a pattern might be applicable, and when the Cons would outweigh the Pros.

One example would be the /Repository/ pattern. If I understand it correctly, this is how it works (the DDD author has confirmed that a similar summary is correct):

  • Developers don't want to decide if they should call order.getLineItems() or rather lineItemDAO.findAllForOrderId(order.getId()) if they need a bunch of line items from the data store. (I completely agree that this is a typical and common issue - however, I don't agree with the proposed solution.)
  • We can restrict that and take away that decision by forcing access to always go through a repository. For example, there is only an OrderRepository and the only way to get line items would be OrderRepository.getLineItems(order). This repository thing uses DAOs internally to actually get the data. So we wrap DAOs in repositories, creating an additional layer. You are not supposed to call the DAO anymore.
  • Now, we generalize the OrderRepository into an interface and put the DAO access into OrderRepositoryImpl.
  • And finally: We then inject this concrete OrderRepository instance into the Order class, so that when you call order.getLineItems(), it will internally access the repository. This requires interception of domain model instance creation but that is easy enough to do these days with containers/factories/services everywhere. Nobody creates a /new/ instance anymore, right?
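
The four steps above can be sketched roughly as follows. This is only an illustration of the pattern as I summarized it, not code from the book: LineItem, LineItemDAO.save(), OrderFactory and the in-memory map standing in for the data store are all assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical line item, just enough to have something to load.
class LineItem {
    final String product;
    LineItem(String product) { this.product = product; }
}

// DAO layer. An in-memory map stands in for the real data store (assumption).
class LineItemDAO {
    private final Map<Long, List<LineItem>> store = new HashMap<>();

    void save(long orderId, LineItem item) {
        store.computeIfAbsent(orderId, id -> new ArrayList<>()).add(item);
    }

    List<LineItem> findAllForOrderId(long orderId) {
        return store.getOrDefault(orderId, new ArrayList<>());
    }
}

// The repository is generalized into an interface...
interface OrderRepository {
    List<LineItem> getLineItems(Order order);
}

// ...and the DAO access moves into the implementation.
class OrderRepositoryImpl implements OrderRepository {
    private final LineItemDAO lineItemDAO;

    OrderRepositoryImpl(LineItemDAO lineItemDAO) { this.lineItemDAO = lineItemDAO; }

    @Override
    public List<LineItem> getLineItems(Order order) {
        return lineItemDAO.findAllForOrderId(order.getId());
    }
}

// The domain class gets the repository injected and delegates to it.
class Order {
    private final long id;
    private final OrderRepository repository;

    Order(long id, OrderRepository repository) {
        this.id = id;
        this.repository = repository;
    }

    long getId() { return id; }

    List<LineItem> getLineItems() { return repository.getLineItems(this); }
}

// Hypothetical factory standing in for the container that intercepts
// instance creation; nobody calls "new Order(...)" directly.
class OrderFactory {
    private final OrderRepository repository;

    OrderFactory(OrderRepository repository) { this.repository = repository; }

    Order create(long id) { return new Order(id, repository); }
}

class RepositorySketch {
    public static void main(String[] args) {
        LineItemDAO dao = new LineItemDAO();
        dao.save(1L, new LineItem("book"));
        dao.save(1L, new LineItem("pen"));

        Order order = new OrderFactory(new OrderRepositoryImpl(dao)).create(1L);
        System.out.println(order.getLineItems().size());
    }
}
```

Note that Order cannot even be constructed here without an OrderRepository, mock or real.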

The advantages seem to be that repositories speak the language of the domain (eh, what? This seems to be related to passing IDs as DAO finder arguments...) and that, well, you no longer have to make these difficult choices when you want to access a data store, because there is only one way to get the stuff.

I think this pattern is a step backwards and should not be recommended:

  1. If you do not want to have to decide between order.getLineItems() and lineItemDAO.findAllForOrderId(order.getId()) - kick out one of these methods! Nobody forces you to map a persistent lineItems collection on your Order class. You did that because you like the features you get when you load data that way (in Hibernate: transparent pre-fetching of other lineItems collections with batches or subselects, can't do that nicely in DAO finders). If you don't need that, don't create it. The same is true for any other association in your domain model. Or, vice versa, if there is no difference between a DAO finder method and a domain model traversal, remove the DAO method.
  2. If there really is a difference between order.getLineItems() and lineItemDAO.findAllForOrderId(order.getId()) - for example, one returns more associated stuff, like the product for each line item - keep them both. I do that all the time. I document it properly: on the finder method I write "this finder method does eager fetching of the product for each line item", and on the getter method I write "this returns line items, the products are unloaded proxies". And of course I need them both. Actually, I rename the DAO finder to lineItemDAO.findAllForOrderIdWithProducts(order.getId()) or, if I want to speak the language of the domain, nothing stops me from defining it as lineItemDAO.findAllForOrderWithProducts(order). Because I don't like repeating myself, I finally end up with lineItemDAO.findWithProducts(order). It's now quite easy to decide if I call this, or rather order.getLineItems() - they have a different fetch strategy and fetch plan.
  3. You could argue that it would be nice if everything /looked/ the same. That is, following this repository pattern I would end up with an API like order.getLineItems() and order.getLineItemsWithProducts(). But the price I'd have to pay is A) a completely new layer in my application that is at least as complex as the DAO layer (conceptually there really is not much difference between them) and B) coupling my domain model to these repositories. And because I think there is really no difference between a generic, interface-based DAO architecture and these repositories, it means injecting DAOs into my domain model instances at runtime. That means they don't run anymore without DAOs (mock or real). Wrapping the DAOs again and again does not change that.
  4. If you do not inject and call repositories in your domain model classes, you have gained nothing. You have created a layer on top of the DAOs that looks like the DAOs, albeit with funny new names.
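
To make point 2 concrete, here is a minimal sketch of keeping both access paths side by side, each documented with its fetch plan. It is plain Java with no Hibernate involved: the "unloaded proxy" is simulated by a null product reference, the product table is an in-memory map, and Product, saveProduct() and isProductLoaded() are names invented for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Product {
    final long id;
    final String name;
    Product(long id, String name) { this.id = id; this.name = name; }
}

class LineItem {
    final long productId;
    // null stands in for an unloaded proxy (assumption for this sketch)
    private Product product;

    LineItem(long productId) { this.productId = productId; }

    boolean isProductLoaded() { return product != null; }
    Product getProduct() { return product; }
    void setProduct(Product product) { this.product = product; }
}

// Plain domain class: no DAO or repository reference in here.
class Order {
    private final List<LineItem> lineItems = new ArrayList<>();

    void addLineItem(LineItem item) { lineItems.add(item); }

    /** Returns the line items; the products are NOT loaded. */
    List<LineItem> getLineItems() { return lineItems; }
}

class LineItemDAO {
    // In-memory stand-in for the product table (assumption).
    private final Map<Long, Product> productTable = new HashMap<>();

    void saveProduct(Product product) { productTable.put(product.id, product); }

    /**
     * Returns the line items of the given order with the product of each
     * line item eagerly fetched. Copies are returned only so that the two
     * access paths stay distinguishable in this sketch.
     */
    List<LineItem> findWithProducts(Order order) {
        List<LineItem> result = new ArrayList<>();
        for (LineItem item : order.getLineItems()) {
            LineItem loaded = new LineItem(item.productId);
            loaded.setProduct(productTable.get(item.productId));
            result.add(loaded);
        }
        return result;
    }
}

class FetchPlanSketch {
    public static void main(String[] args) {
        LineItemDAO dao = new LineItemDAO();
        dao.saveProduct(new Product(7L, "book"));

        Order order = new Order();
        order.addLineItem(new LineItem(7L));

        // Two access paths, two fetch plans.
        System.out.println(order.getLineItems().get(0).isProductLoaded());
        System.out.println(dao.findWithProducts(order).get(0).isProductLoaded());
    }
}
```

The caller decides by name and Javadoc which fetch plan is needed, and Order stays a class you can create with /new/.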

My personal best practices, when you have to decide how the persistence layer and services are integrated with the rest of the system, would be these:

  • Optimize access paths and simplify data access by carefully designing the domain model associations and persistence layer API (the latter following some good DAO pattern and related best practices - yes, there are books about this). Use the features of your transparent persistence service to make sure that all these access paths do the right thing, depending on what you need.
  • Document the access paths properly, and when each should be used. I adopt some simple conventions, for example: Never map a persistent collection unless it is absolutely necessary - this already makes things a lot easier.
  • Keep the domain model implementation clean, testable, and reusable: no security checks, transaction demarcation, or explicit data store access in any of these classes.
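
As a rough illustration of the last point, the sketch below keeps the domain classes free of any data access, so they can be tested with nothing but /new/, and puts the DAO call into a separate service. OrderDAO, OrderService, getTotal() and the price/quantity fields are hypothetical names chosen for this example only.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Domain model: plain classes, no DAO, no transactions, no security checks.
class LineItem {
    private final BigDecimal price;
    private final int quantity;

    LineItem(BigDecimal price, int quantity) {
        this.price = price;
        this.quantity = quantity;
    }

    BigDecimal getSubtotal() { return price.multiply(BigDecimal.valueOf(quantity)); }
}

class Order {
    private final List<LineItem> lineItems = new ArrayList<>();

    void addLineItem(LineItem item) { lineItems.add(item); }

    BigDecimal getTotal() {
        BigDecimal total = BigDecimal.ZERO;
        for (LineItem item : lineItems) {
            total = total.add(item.getSubtotal());
        }
        return total;
    }
}

// Persistence layer API (hypothetical, reduced to one finder).
interface OrderDAO {
    Order findById(long id);
}

// Data store access lives out here, not in the domain model.
class OrderService {
    private final OrderDAO orderDAO;

    OrderService(OrderDAO orderDAO) { this.orderDAO = orderDAO; }

    BigDecimal totalFor(long orderId) {
        // Transaction demarcation and security checks would wrap this call.
        return orderDAO.findById(orderId).getTotal();
    }
}

class CleanDomainSketch {
    public static void main(String[] args) {
        // The domain model runs without any DAO, mock or real.
        Order order = new Order();
        order.addLineItem(new LineItem(new BigDecimal("10.00"), 2));
        order.addLineItem(new LineItem(new BigDecimal("5.00"), 1));
        System.out.println(order.getTotal());
    }
}
```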