Wednesday, February 27, 2008

10 important things using Hibernate/JPA

1) Become friends with the DBAs.

There is a tendency for some Java developers to marginalize the importance of DBAs. This is a huge mistake - a good working relationship with the keepers of the data is critical to success with any ORM technology. This is important for two reasons:

  • DBAs alone usually can’t make your Hibernate efforts successful, but they often can make them fail.
  • They usually have really good insight into the database itself, good modelling practices, and any pitfalls or shortcuts for you. I can think of several instances where a timely suggestion from a DBA saved us days and gave us an elegant solution.

In most cases, having good DBAs and having good relationships with them is highly critical to the success of your ORM efforts. Plus, database guys are usually pretty cool people :)

2) Use (and enforce) good naming standards from the beginning.

Who knew naming standards discussions could be so contentious? One of the things we try to accomplish with ORM tools is to make our data model more meaningful, which makes it easier for developers to use and helps avoid confusion. To this end, how we name entities and attributes is very important. I have naming standards I like and think are best, but I won’t try to push them on you here. The important thing is that you decide on something upfront and get everyone to use it. Actually, this should extend beyond just naming standards and should be inclusive of other standards as well (ie, Boolean vs “Y/N” or 0/1, primitive vs Object, Integer vs Long, etc).

3) Don’t try to map every attribute (and association).

We started out trying to use tools like Dali to map everything on a table quickly (some tables had several hundred columns!). This turned out to be a bad idea. Why? Since we were on a shared, legacy database, there were a ton of fields we didn’t care about and never used. Mapping them led to performance issues and confusion.

4) Let the database do the things it is good at.

We really wanted to have a good, clean data model, and so we avoided at all costs writing extra queries to fetch things pertinent to an object or using stored procedures or functions at all. This was also a mistake - databases are good for stuff other than just holding the data Hibernate creates and reads :) For example, we had one object that had a status associated with it. The status was used all over the app so it had to be performant, but we also didn’t want to have to make a separate query every time we needed it. The problem is, the status was derived based on some calculations that operated on several one-to-many relationships. Pulling back all that data as part of each load of the object would have been way to expensive. Working with one of our database guys (see #1), we found a sql function we could use to get the status very quickly. We mapped this to a status attribute using @Formula and had everything we wanted - it was part of the domain model still, and was very performant. Sometimes little compromises like this can yield big dividends.

5) Break up the database.

When we first started, I wanted to model the whole databse in Hibernate at the beginning. This turns out to have been really impractical for a few reasons:

  • a) It was a huge job and would have consumed weeks of time while the customer would have seen no actual work being done.
  • b) I was unlikely to get it right the first time, meaning developers would have to change it anyway as we got started.

There is a tendency to want to map the whole thing out before starting, but a lot of times you just have to work on it as you go. I did find it useful to break it up into pieces and try to do a whole piece at a time as we went - which really seems to have helped.

6) Watch out for triggers.

Watch out for database triggers for two reasons:

  • a) They can do sneaky things behind the scenes that can lead you to pulling your hair out, wondering what happened.
  • b) Sometimes they do stuff that you need to replicate on the Hibernate side. Several times before we fully learned this lesson, we missed some important things that triggers were doing and almost caused ourselves some real grief.

7) Shy away from tools to auto-generate your model.

Yeah, they can save time (although we found Dali to be terribly buggy at the time we used it), but you end up having to re-do a lot of it anyway. It doesn’t take that long to map them by hand, and it gives you a chance to familiarize yourself with the data more as you do it.

8) Use NamedQueries where you can.

It is easy to just pound a query out in-line, but in a lot of cases it is better to use NamedQueries. This helps to do two things:

  • a) It fosters more re-use, since the named queries are generally located in central places in the code.
  • b) Your queries get validated on startup, making it so errors in the queries (especially down the road when they can get messed up by model or table changes) discovered much more quickly.

It sometimes takes a bit of getting used to (and strong-arming!), but it is worth the effort.

9) Manage Expectations.

This is probably important to keep in mind for any new framework, technology, or even concept. For some reason, people tend to get sold on certain features that simply don’t exist, or are highly exaggerated. Sometimes it is small and understandable (ie, underestimating the actual work required to map stuff in Hibernate), and sometimes I have no idea how they came up with such ideas (ie, that Hibernate can somehow manage schema revisioning). At any rate, finding out what the expectations are and then managing them can be really important. If your team thinks that Hibernate will make DBAs completely obsolete (usually quite false) and fires them all, you will probably have a big problem on your hands.

10) Use rich domain modeling.

One of the sadder things I’ve seen with projects that use Hibernate (and in some cases, I’ve seen it because I’ve done it!) is when the domain objects become no more than simple containers of data. The goal of tools like Hibernate is to let us use data in an object-oriented fashion - simply mapping the data only gets us halfway there. As I’ve made conscious efforts to practice rich domain modeling, I’ve noticed that we reuse a lot more code, our other layers become less cluttered, and our code is more testable and easier to refactor.

Source: SpencerUresk