Sometimes a process needs to run several queries against the database
and then combine all of their results in a specific way. You can of course
run the queries sequentially, but to improve performance you may want to
run them in parallel.
At first this seems simple: run each query in a separate thread
using a Callable, and then wait on the returned Future objects to collect
the results (for a good article on threads and concurrency see: http://www.vogella.com/articles/JavaConcurrency/article.html).
As we will see, this approach has a number of problems and does
not work out of the box with Hibernate.
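The basic Callable and Future pattern described above can be sketched as follows. This is a minimal illustration that does not touch Hibernate: the class ParallelQueries, the method runAll and the two lambda "queries" are hypothetical stand-ins for real database calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelQueries {

    // Submits every task to a thread pool and returns the results in submission order.
    static <T> List<T> runAll(List<Callable<T>> tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, tasks.size()));
        try {
            List<Future<T>> futures = new ArrayList<>();
            for (Callable<T> task : tasks) {
                futures.add(pool.submit(task));
            }
            List<T> results = new ArrayList<>();
            for (Future<T> future : futures) {
                results.add(future.get()); // blocks until this task has finished
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a query", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a query failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-ins for two independent database queries.
        Callable<Integer> countUsers = () -> 40;
        Callable<Integer> countOrders = () -> 2;
        List<Integer> results = runAll(Arrays.asList(countUsers, countOrders));
        System.out.println(results.get(0) + results.get(1)); // prints 42
    }
}
```

The main thread only combines the results once every Future has returned, which is the "wait for all, then combine" step of the process.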
Session and Threads
The first problem is deciding what to pass to the thread and
what the thread should return. My first naive thought was to create
the Query object in the main thread (so that the worker can be generic and
need not know about the query or its parameters), and then pass the query
to each new thread. All the thread then needs to do is call
query.uniqueResult() or query.list() and return the result.
The problem with this approach is that a Hibernate Session is
not thread safe (http://www.javalobby.org/java/forums/t104442.html, http://pveentjer.wordpress.com/2007/02/19/sharing-hibernate-entities-between-threads/),
so you cannot share a session between threads. A Query object is tied to the
session that created it, so handing the query to another thread means sharing
that session as well. To simplify the implementation, Hibernate assumes that
all queries running on a thread use the same session, which means that you
are not allowed to pass the session (or anything bound to it) between threads.
So the next option is to pass the DAO to the thread and then
call a specific method on the DAO. This makes more sense from a design point
of view, since it matches the usual design of a service-layer application. In a
web app each request has its own thread, the calls to the DAO are all
stateless, and each thread gets its own session, created by Hibernate.
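The idea of one session per thread can be imitated with a plain ThreadLocal. The sketch below is only a simulation under stated assumptions: FakeSession and ThreadBoundDao are hypothetical classes, not Hibernate types, and the code does not show how Hibernate itself manages sessions. It only shows that a shared, stateless DAO sees a different "current session" on each thread that calls it.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a session; not a Hibernate class.
class FakeSession {
    private static final AtomicInteger COUNTER = new AtomicInteger();
    final int id = COUNTER.incrementAndGet();
}

// Hypothetical stateless DAO that looks up the session bound to the calling thread.
class ThreadBoundDao {
    private static final ThreadLocal<FakeSession> CURRENT =
            ThreadLocal.withInitial(FakeSession::new);

    int currentSessionId() {
        return CURRENT.get().id;
    }
}

class SessionPerThreadDemo {

    // Calls the same DAO from two different worker threads and returns the session ids they saw.
    static int[] sessionIdsOnTwoThreads(ThreadBoundDao dao) {
        ExecutorService first = Executors.newSingleThreadExecutor();
        ExecutorService second = Executors.newSingleThreadExecutor();
        try {
            int a = first.submit(dao::currentSessionId).get();
            int b = second.submit(dao::currentSessionId).get();
            return new int[] { a, b };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("DAO call failed", e.getCause());
        } finally {
            first.shutdown();
            second.shutdown();
        }
    }
}
```

Because the DAO holds no session of its own, the same instance can safely be handed to every worker thread.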
The downside of this solution is that there is no
generic way to run queries asynchronously. A possible solution is to pass the
thread a reference to the DAO together with an object that describes the method
and its parameters, and then within the thread use reflection to find the
method and call it with those parameters.
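A minimal sketch of the reflection idea, using only java.lang.reflect and java.util.concurrent. UserDao, DaoCall and ReflectiveDaoDemo are hypothetical names invented for this example, and the DAO method returns a canned value where a real one would run a query.

```java
import java.lang.reflect.Method;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical DAO, used only for illustration.
class UserDao {
    public String findNameById(Long id) {
        return "user-" + id; // a real DAO would query the database here
    }
}

// Describes one DAO call: target object, method name, parameter types and arguments.
class DaoCall implements Callable<Object> {
    private final Object dao;
    private final String methodName;
    private final Class<?>[] paramTypes;
    private final Object[] args;

    DaoCall(Object dao, String methodName, Class<?>[] paramTypes, Object... args) {
        this.dao = dao;
        this.methodName = methodName;
        this.paramTypes = paramTypes;
        this.args = args;
    }

    @Override
    public Object call() throws Exception {
        // Look the public method up by name and signature, then invoke it on the worker thread.
        Method method = dao.getClass().getMethod(methodName, paramTypes);
        return method.invoke(dao, args);
    }
}

class ReflectiveDaoDemo {

    // Runs the described call on a worker thread and waits for its result.
    static Object runAsync(DaoCall call) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            return pool.submit(call).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("DAO call failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

A call would then look like new DaoCall(dao, "findNameById", new Class<?>[] { Long.class }, 42L). Passing the parameter types explicitly avoids guessing them from the arguments, at the cost of losing compile-time checking of the method name and signature.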
Transactions
The next thing you need to think about is transactions.
By default each thread will have its own transaction (which needs to be defined
by annotations or XML). Having a transaction span threads is not simple and is
not considered good design practice. But if for whatever reason you
must have it, a global JTA transaction is the way to go.
Results
The next thing that needs to be taken into consideration is
the result returned from the thread. If we return the entity as is
and it has lazy associations, accessing them on the main thread will cause
a LazyInitializationException. When a lazy proxy is touched, Hibernate tries
to load the data from the database; because the call now comes from a
different thread, without the session that loaded the entity, Hibernate cannot
load the information and the exception is
thrown (for more information see http://blog.xebia.com/2009/02/07/hibernate-and-multi-threading/).
There are multiple ways to solve this issue. One is to use
DTOs: convert the entity to a POJO inside the worker thread, so that the result
is no longer associated with the Hibernate session. Another solution is to use
an open source library by Tikal (http://tikalk.com/chop-hibernate-lazy-associations)
that chops entities and replaces the proxies with null according to an
AOP pointcut.
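The DTO option can be sketched as below. This is a simulation under stated assumptions, not Hibernate code: OrderEntity, OrderDto and DtoExample are hypothetical, and the sessionOpen flag merely imitates a lazy association that can only be read while its session is open. The point is that the copy into the DTO happens inside the worker thread, before the result is returned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical entity; 'sessionOpen' imitates a lazy association bound to a session.
class OrderEntity {
    private final long id;
    private final List<String> items;
    private volatile boolean sessionOpen = true;

    OrderEntity(long id, List<String> items) {
        this.id = id;
        this.items = items;
    }

    long getId() {
        return id;
    }

    List<String> getItems() {
        if (!sessionOpen) {
            throw new IllegalStateException("lazy association accessed after session close");
        }
        return items;
    }

    void closeSession() {
        sessionOpen = false;
    }
}

// Plain DTO: holds copies of the data and has no tie to any session.
class OrderDto {
    final long id;
    final List<String> items;

    OrderDto(OrderEntity entity) {
        this.id = entity.getId();
        this.items = new ArrayList<>(entity.getItems());
    }
}

class DtoExample {

    // Loads the order on a worker thread and returns a detached DTO to the caller.
    static OrderDto loadOrder(long id) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<OrderDto> future = pool.submit(() -> {
                // Stands in for a DAO call that returns a managed entity.
                OrderEntity entity = new OrderEntity(id, Arrays.asList("book", "pen"));
                try {
                    return new OrderDto(entity); // copy while the "session" is still open
                } finally {
                    entity.closeSession(); // imitates the session ending with the worker's work
                }
            });
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("load failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The main thread can read every field of the returned OrderDto safely, whereas calling getItems() on the entity after the worker has finished would fail in this simulation.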
Summary
Running multiple queries in multiple threads seems like a simple idea
at first, and it is sometimes genuinely needed, but the
implementation must be done carefully and take into account all of the issues
raised above.