Monday, October 15, 2007

Domain Modeling with JPA - The Gotchas - Part 4 - JPA is a nice way to abstract Repository implementation

When we model a domain, we love to work at a higher level of abstraction through Intention Revealing Interfaces, where the focus is always on the domain artifacts. Whenever we start coding at the domain layer in terms of SQL and pulling result sets out of the database through a JDBC connection, we lose the domain focus. When we start dealing with the persistent hierarchy directly instead of the domain hierarchy, we make the domain model more vulnerable by exposing it to the lower layers of abstraction. By decoupling the domain model from the underlying persistent relational model, JPA provides us with an ideal framework for building higher levels of abstraction for data access. The domain model can now access data in terms of collections of domain entities and not in terms of the table structures into which these entities are deconstructed. And the artifact that provides us with a unified view of the underlying collection of entities is the Repository.

Why Repository ?

In the days of EJB 2.0, we had the DAO pattern. Data Access Objects also provided an abstraction of the underlying data model by defining queries and updates for every table of the model. However, the differences between DAOs and repositories are more semantic than technical. Repositories provide a higher level of abstraction and are a more natural inhabitant of the rich domain model. Repositories offer more controlled access to the domain model, only through Aggregate roots, implementing a set of APIs that follow the Ubiquitous Language of the domain. Programming at the JDBC level with the DAO pattern, we used to think in terms of individual tables and operations on them. However, in reality, the domain model is much more than that - it contains complex associations between entities and strong business rules that govern the integrity of the associations. Instead of having the domain model deal directly with individual DAOs, we had always felt the need for a higher level of abstraction that would shield the domain layer from the granularities of joins and projections. ORM frameworks like Hibernate gave us this ability, and specifications like JPA standardized it. Build your repository to get this higher level of abstraction on top of DAOs.

You build a repository at the level of the Aggregate Root, and provide access to all entities underneath the root through the unified interface. Here is an example ..


@Entity
class Order {
  private String orderNo;
  private Date orderDate;
  private Collection<LineItem> lineItems;
  //..

  //.. methods
}

@Entity
class LineItem {
  private Product product;
  private long quantity;
  //..

  //.. methods
}



The above snippet shows the example of an Aggregate, with Order as the root. For an Order Management System, all that the user needs is to manipulate Orders through intention revealing interfaces. He should not be given access to manipulate individual line items of an Order. This may lead to inconsistency of the domain model if the user does an operation on a LineItem which invalidates the invariant of the Order aggregate. While the Order entity encapsulates all domain logic related to manipulation of an Order aggregate, the OrderRepository is responsible for giving the user a single point of interface with the underlying persistent store.
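
As a rough illustration of how the root can guard the aggregate, here is a minimal, JPA-free sketch. The names addItem(), totalQuantity() and MAX_TOTAL_QUANTITY, the String product code, and the "no more than 100 units per order" rule are assumptions made up for this example; they are not part of the model above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Simplified stand-in for the LineItem entity; immutable, so it cannot be
// changed behind the back of the Order that owns it.
class LineItem {
  private final String productCode;
  private final long quantity;

  LineItem(String productCode, long quantity) {
    this.productCode = productCode;
    this.quantity = quantity;
  }

  String getProductCode() { return productCode; }
  long getQuantity() { return quantity; }
}

// Simplified stand-in for the Order aggregate root.
class Order {
  // hypothetical invariant: an order may hold at most 100 units in total
  static final long MAX_TOTAL_QUANTITY = 100;

  private final String orderNo;
  private final List<LineItem> lineItems = new ArrayList<LineItem>();

  Order(String orderNo) { this.orderNo = orderNo; }

  String getOrderNo() { return orderNo; }

  // line items are added only through the root, which checks the invariant
  void addItem(String productCode, long quantity) {
    if (quantity <= 0) {
      throw new IllegalArgumentException("quantity must be positive");
    }
    if (totalQuantity() + quantity > MAX_TOTAL_QUANTITY) {
      throw new IllegalStateException(
          "order " + orderNo + " would exceed " + MAX_TOTAL_QUANTITY + " units");
    }
    lineItems.add(new LineItem(productCode, quantity));
  }

  long totalQuantity() {
    long total = 0;
    for (LineItem item : lineItems) {
      total += item.getQuantity();
    }
    return total;
  }

  // callers get a read-only view, so they cannot bypass addItem()
  List<LineItem> getLineItems() {
    return Collections.unmodifiableList(lineItems);
  }
}
```

With this shape, client code can read the line items but any change has to go through Order, which is where the invariant lives.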


interface OrderRepository {
  //.. queries
  List<Order> getAll();
  List<Order> getByOrderNo(String orderNo);
  List<Order> getByOrderDate(Date orderDate);
  List<Order> getByProduct(Product product);

  //..
  void write(final Order order);
  //..
}



Now the domain services can use this repository to access orders from the underlying database. This is what Eric Evans calls reconstitution, as opposed to construction (which is typically the responsibility of the factory).

JPA to implement Repository

The nice thing about being able to program to a specification is the abstraction that you can enforce on your model. Repositories can be implemented using JPA and can nicely be abstracted away from the actual domain services. A Repository acts like a collection and gives the user the illusion of using memory-based data structures without exposing the internals of the interactions with the persistent store. Let us see a sample implementation of a method of the above repository ..
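
To make the "acts like a collection" idea concrete, here is a small sketch of the same kind of interface backed by a plain in-memory map, with no persistence at all. The class InMemoryOrderRepository, the trimmed-down Order and Product classes, and the containsProduct() helper are assumptions for illustration only; only a subset of the repository methods is shown.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Trimmed-down stand-ins for the domain classes (assumed shapes).
class Product {
  private final long id;
  Product(long id) { this.id = id; }
  long getId() { return id; }
}

class Order {
  private final String orderNo;
  private final List<Product> products = new ArrayList<Product>();

  Order(String orderNo, Product... items) {
    this.orderNo = orderNo;
    for (Product p : items) {
      products.add(p);
    }
  }

  String getOrderNo() { return orderNo; }

  // true if any line of this order refers to the given product id
  boolean containsProduct(Product product) {
    for (Product p : products) {
      if (p.getId() == product.getId()) {
        return true;
      }
    }
    return false;
  }
}

// Subset of the repository contract used in this sketch.
interface OrderRepository {
  List<Order> getAll();
  List<Order> getByOrderNo(String orderNo);
  List<Order> getByProduct(Product product);
  void write(Order order);
}

// Collection-backed implementation: callers cannot tell it from a database-backed one.
class InMemoryOrderRepository implements OrderRepository {
  private final Map<String, Order> store = new LinkedHashMap<String, Order>();

  public List<Order> getAll() {
    return new ArrayList<Order>(store.values());
  }

  public List<Order> getByOrderNo(String orderNo) {
    List<Order> result = new ArrayList<Order>();
    Order found = store.get(orderNo);
    if (found != null) {
      result.add(found);
    }
    return result;
  }

  public List<Order> getByProduct(Product product) {
    List<Order> result = new ArrayList<Order>();
    for (Order order : store.values()) {
      if (order.containsProduct(product)) {
        result.add(order);
      }
    }
    return result;
  }

  public void write(Order order) {
    store.put(order.getOrderNo(), order);
  }
}
```

An implementation like this can be handy in unit tests of domain services, since the service code sees only the OrderRepository interface either way.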


class OrderRepositoryImpl implements OrderRepository {
  //..
  //..

  public List<Order> getByProduct(Product product) {
    String query = "select o from Order o, IN (o.lineItems) li where li.product.id = ?1";
    Query qry = em.createQuery(query);
    qry.setParameter(1, product.getId());

    List<Order> res = qry.getResultList();
    return res;
  }
  //..
  //..
}



The good part is that we have used JPA to implement the Repository, but the actual domain services will not contain a single line of JPA code in them. All of the JPA dependencies can be localized within the Repository implementations. Have a look at the following OrderManagementService api ..


class OrderManagementService {
  //..
  // to be dependency injected
  private OrderRepository orderRepository;

  // apply a discount to all orders for a product
  public List<Order> markDown(final Product product, float discount) {
    List<Order> orders = orderRepository.getByProduct(product);
    for(Order order : orders) {
      order.setPrice(order.getPrice() * discount);
    }
    return orders;
  }
  //..
  //..
}



Note that the repository is injected through DI containers like Spring or Guice, so that the domain service remains completely independent of the implementation of the repository.
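
The same independence can be seen without any container. The sketch below wires the service by hand through a constructor and hands it a stub repository; the constructor, the StubOrderRepository class, and the simplified Order and Product classes are assumptions for this example, and a container such as Spring or Guice would do the wiring instead in a real application.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the domain classes (assumed shapes).
class Product {
  private final long id;
  Product(long id) { this.id = id; }
  long getId() { return id; }
}

class Order {
  private final Product product;
  private double price;

  Order(Product product, double price) {
    this.product = product;
    this.price = price;
  }

  Product getProduct() { return product; }
  double getPrice() { return price; }
  void setPrice(double price) { this.price = price; }
}

// The service sees only this interface, never a JPA type.
interface OrderRepository {
  List<Order> getByProduct(Product product);
}

class OrderManagementService {
  private final OrderRepository orderRepository;

  // the repository is handed in from outside (by a DI container or by hand)
  OrderManagementService(OrderRepository orderRepository) {
    this.orderRepository = orderRepository;
  }

  // multiplies the price of every order for the product by the given factor,
  // mirroring the markDown() method shown above
  public List<Order> markDown(final Product product, float discount) {
    List<Order> orders = orderRepository.getByProduct(product);
    for (Order order : orders) {
      order.setPrice(order.getPrice() * discount);
    }
    return orders;
  }
}

// Hypothetical stub that stands in for the JPA-backed implementation.
class StubOrderRepository implements OrderRepository {
  private final List<Order> orders;

  StubOrderRepository(List<Order> orders) {
    this.orders = orders;
  }

  public List<Order> getByProduct(Product product) {
    List<Order> result = new ArrayList<Order>();
    for (Order order : orders) {
      if (order.getProduct().getId() == product.getId()) {
        result.add(order);
      }
    }
    return result;
  }
}
```

Swapping StubOrderRepository for a JPA-backed implementation requires no change to OrderManagementService, which is the point of injecting the dependency.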

But OrderRepository is also a domain artifact !

Right .. and with proper encapsulation we can abstract away the JPA dependencies from OrderRepositoryImpl as well. I have blogged before on how to implement a generic repository service and make all domain repositories independent of the implementation.

14 comments:

vtrubnikov said...

Somehow, having "write" method in OrderRepository does not feel right... While OrderRepository is domain model artifact, "write" is not part of the domain language.
And how would you model an Order creation process?

Unknown said...

@valery_t:
write() is meant to create a new instance of Order in the database. If the name does not feel domain friendly, we can have an appropriate one which resembles the domain better. Suggestions ?
Order creation will be the responsibility of the Factory. We can have separate factories for each entity and value object. The factory needs to have the intelligence to create one complete valid object out of its parts. The instantiation model of the entity or the value object can be totally decoupled from the JPA instantiation model and can use sophisticated mechanisms using Builder or Flyweight patterns. I have discussed this decoupled instantiation model for Value Objects in Part 2 of this series. I plan to discuss this in more detail in a future post.

vtrubnikov said...

Thanks for your answer, Debasish.

Do you suggest that Service would instantiate and save order like below?

Repository.write(Factory.create(...order_parts...));


If so, why can't Factory.create() write Order into database at point of creation automatically?

Unknown said...

@valery_t:
>> Repository.write(Factory.create(...order_parts...));

Yes, this can be an approach using the Factory pattern. If you are using some MVC at the Web layer e.g. Spring MVC, then you can specify a command object while defining your controller class. The command object gets automatically populated by the MVC depending on the Java Beans based property matching. I usually prefer to specify the domain entity directly as the Command Object. The other option is to have another layer of objects called DTOs which goes into the Factory for creating the actual domain entity.

>> why can't Factory.create() write Order into database at point of creation automatically?

This is because there may be some business processing which populates additional fields of the entity between the creation process and the final persistence process. Eric Evans' book on DDD covers the details regarding the lifecycle of domain entities.

Mikhail.Khludnev said...

Debasish,

Do you have any ideas how we can inject repositories into services with EJB3 without Spring? e.g. obtain in a service two repositories of the same interface but linked to different DataSources (or EntityManagerFactories or SessionFactories)
Do I have any chance without Spring?

And why you ignored my bloody comment http://debasishg.blogspot.com/2007/10/domain-modeling-with-jpa-gotchas-part-3.html#c8248516419125202564 of your third "gotcha"?

2 valery_t
It seems that you use your own terminology and call a "Service which Checks DTO and Builds Order" a "Factory". So, you are right, but the more common name for this stuff is Service. Pay attention that Repository is a DDD pattern (strictly), and Factory is a GOF pattern (as a rule). So you've just mixed these patterns in your mind (and probably added Facade).

bge said...

@Debashish, valery_t:

I'd suggest "put()" as an alternate name for "write()" in the repository. Conceptually, a repository is a kind of map, except you can query using different criteria instead of a simple key. In that light, I think "put()" works well.

As far as the factory stuff goes, my own experience is that it's not really a good idea to have the repository create the objects, nor to have the factory persist it. I've always run into problems doing it that way. In the end, I always end up having a true Factory class for each domain object, as well as a repository, as suggested in Eric Evans' book. I first questioned the wisdom of that (since Repositories have to create objects through the "back door" after queries, why not have them create new objects as well?), but it turns out that Factory and Repository are truly separate responsibilities and should not be mixed.

Unknown said...

@mikhail:
Regarding EJB3, I have not much experience. I will try to look up. But currently I have a persistent mental mapping of Spring with DI. :-)

>> why you ignored my bloody comment

Sorry I missed it .. I have responded to it in the comments of Gotcha 3 post. Please find my views. Let me know what u think ..

Unknown said...

@bge: +1 to both of your suggestions. Regarding factory and repository as two separate artifacts, I am very much with you. Creating objects with Factory and Reconstituting (this is the term Eric Evans uses in his book) objects with Repository are the ways to go. In the section where Eric discusses Lifecycle of Entities, he also mentions this separation of concerns.

Matt Corey said...

@Debasish,

Couple of questions:

First, what are your thoughts on the implementation of 'write'? Would this simply call 'em.persist()', or do you see this as a method that should call 'em.merge()' in appropriate cases, where a record should be updated instead of inserted (similar to Hibernate's Session.saveOrUpdate() method)?

Second, do you expect that all access to JPA code would be hidden inside various Repositories, a la the typical DAO implementation? I've been re-reading parts of Chris Richardson's "POJO's in Action", and one of the principles of his Facade pattern is that the Facade is responsible for loading all 'lazy' data -- I agree with this principle, but I'm not convinced that the best way to implement it is through a Repository...

In other words, I would prefer to call out to a helper object that simply loads what is necessary for this use-case than to build these methods into the Repository, since they're not particularly domain-friendly... I expect some will scoff at this, however, since it requires accessing the EntityManager directly from the facade (or a helper class)...

Thoughts?

M

Unknown said...

@Matt:
regarding implementation of write(), my thought is that it will call em.persist(). Updates can be taken care of automatically through Hibernate dirty checking.

regarding lazy loading: I am travelling and do not have Chris Richardson's book at hand, but I will look it up whenever I have a chance. In the meantime, here are my thoughts.
My service methods operate in a connected mode - hence the first thing each of them does is call a merge on the objects it received as arguments. Every repository exposes a merge() api for this. And all lazy loading stuff is taken care of within the repository implementations.

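class OrderManagementService {
  // to be dependency injected
  private OrderRepository orderRepository;
  // implied by the merge() call below; not declared in the original snippet
  private ProductRepository productRepository;

  // apply a discount to all orders for a product
  public List<Order> markDown(final Product product, float discount) {
    // merge the (possibly detached) product into the current persistence context
    productRepository.merge(product);
    List<Order> orders = orderRepository.getByProduct(product);
    for(Order order : orders) {
      order.setPrice(order.getPrice() * discount);
    }
    return orders;
  }
  //..
}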

Let me know ur thoughts on this.

Matt Corey said...

@Debasish,

On the first point, my only concern with the 'merge()' (and the 'write()' method as well), is that I'm not sure it belongs in the Repository -- it feels like it 'muddies' the domain a bit, although to be honest, I could be convinced otherwise (after all, you have to call 'put' on a Map instance to insert objects into it, as one reader pointed out)...

On the lazy loading issue -- my concern with handling all of this in the Repository is that in large applications, you can't possibly account for all data-loading situations at this level... the end result turns out to be a Repository with methods like 'getByProduct', 'getByProductWithMerchant', 'getByProductWithAllMerchants', etc. -- terrible to look at, and worse to maintain...

This is why I like the idea in "POJO's in Action" that puts the lazy loading responsibility at the Facade level... so in the above example, you would call 'getByProduct' in your Service class, and then the Facade objects would be smart enough to know that the current use case also requires that the Merchant data be loaded... you will certainly lose some efficiency, however the benefits you gain in terms of a clear, concise domain, and clear separation of responsibilities makes up for it...

M

Unknown said...

@Matt:
"it feels like like it 'muddies' the domain a bit"

I agree with you. They sort of muddle the domain. Any suggestions?

On the lazy loading issue, I usually expose frequently-used use cases through services explicitly and handle the rest through another api that takes a Criteria. I admit, it is Hibernate specific, but hopefully it will be there in the next version of JPA.

Matt Corey said...

@Debasish,

>> I agree with you. They sort of muddle the domain. Any suggestions?

One suggestion would be to allow the use of the EntityManager directly from your service and/or facade objects -- this would keep your domain intact, but would be considered heresy by some :)

It does go along with the theme that "JPA/EJB3 Killed the DAO" (http://www.adam-bien.com/roller/abien/entry/jpa_ejb3_killed_the_dao)...
I'm not 100% sure I'm sold on it, as the Repository pattern is extremely useful for keeping all of your data loading together, but if the 'write()' and 'merge()' methods are always one-liners, then it does seem to make sense to allow for them to be called by other classes...

The biggest concern is that you're pretty tightly tied into JPA with this scenario, but JPA is itself an abstraction layer that is implemented by Hibernate, TopLink, etc... it seems to me that most of the implementation-specific code will be related to the mappings and the queries (esp. with Hibernate Criteria queries) -- well, if the Repository pattern is still used for loading data, then your migration efforts between different JPA vendors will be the same either way...

M

Unknown said...

@Matt:
>>One suggestion would be to allow the use of the EntityManager directly from your service and/or facade objects

Definitely that's an option, but once again u see there is this leaky abstraction syndrome: the persistence layer sort of creeps into the domain service layer, which I don't like.

>> regarding lazy loading and Richardson's POJOFacade approach:

I had a look - it definitely makes sense in a big application. In our case what we did was to keep the service itself at the granularity of the use cases, instead of having another facade at its front. Maybe it worked because the application was not a very huge one. Ultimately many of these methods were invoking the same method of the repository - they only differ in which component of the object graph you initialize for the presentation tier above. The added advantage of the facade is that your service layer remains more decoupled from the presentation tier - all such knowledge regarding which objects to initialize for the client remains localized within the facade. And u can have multiple facades for multiple clients, while ur service layer remains minimal.