The problem:
Let’s say I have a class: Author that has children of
multiple books. I would like to retrieve the authors and books, but only want
to retrieve the first 10 authors.
The hibernate api for this is to use the setMaxResults
function.
The naïve thought would be that if I set the setMaxResults
to 10, I will get 10 authors, but this is not true. Since all hibernate
commands in the end are sql statements, what the setMaxResults does is to add a
LIMIT/TOP (depending on the database) to the sql statement. What then happens
is if you have a join in your query like we do, you will not get 10 authors,
but actually you will get 10 books (maybe only 5 authors).
Solution, Option One:
The simple solution for this is to create two sql
statements. One that will retrieve the first 10 authors id’s according to the criteria
you want (including on the books). Then add the id’s to a basic search and
fetch the books with the authors.
Solution, Option Two:
If you need to create the join and fetch already in the
query stage, but you only want to retrieve the first 10 records that have
unique authors. The next solution I propose for this is to use a cursor. Using
a standard query will return all records from the db to the application. Using
a cursor you can iterate over the records without bringing them all to the
client. Cursors lock the db for more time, but give the option to iterate over
the records without retrieving them all. To do this you call .scroll() on the
query object or criteria and get a ScrollableResults object. If you are only
iterating of the list use the ScrollMode.FORWARD_ONLY option to use less memory
and speed the process.
Note: you need to make sure that your database supports
this, since some databases like mysql fake it and bring the whole data to the
client (http://stackoverflow.com/questions/2826319/using-hibernates-scrollableresults-to-slowly-read-90-million-records)
Pagination:
Some queries use a pagination scheme to get data. The
standard way is to call setFirstResult.
In the first solution, you need to add the setFirstResult
to the query on the id’s, and not the query with the fetch.
In the second solution this won’t work since we are back
to the problem that we are counting the rows by ourselves so the setFirstResult
won’t work either in hibernate query or scroll.
Dialect solution:
Currently solutions to this issue is on the dialect of
the specific database. For example sql server has a row count feature where you
can define the field that the sql will add the row count field (a column that
has an integer of count according to another field- http://msdn.microsoft.com/en-us/library/ms186734.aspx).
Then on this field you can use the standard hibernate setMaxResults and setFirstResult.