I'm thinking about table layout for a Hibernate-managed class hierarchy, and certainly the table per subclass technique strikes me as the most appropriate in a general sense. However, thinking through the logic I have some concerns about its performance especially as the number of subclasses scale.
To give a very brief (and classic) example, let's say you have the following classes:
public abstract class Animal {
   int pkey;
   String name;
}
public class Dog extends Animal {
   long numSlippersChewed; // int is not large enough...
}
public class Cat extends Animal {
   short miceCaught; // ... but here int is far bigger than required :-)
}
(I'm eliding getters and setters and Hibernate mappings etc., just assume they're the basic obvious case).
The database tables for these entities make sense, you get nice denormalisation and so on. However, what query does Hibernate do in order to pull out an individual animal? I can think of at least two cases where this might happen:
pet field of a Human class.  This would store the pkey, so when Hibernate fetches a Human object it will need to fetch the corresponding Animal object too.  When given the pkey of the animal, what query(/ies) will Hibernate use to extract and unmarshall the actual Animal data, given that it could reside in the Cat or Dog tables?from Animal where name='Rex' (let's assume names are unique).  This is similar to the above in that it lets you identify a row in the superclass table but you don't know which subclass table to inspect for further details.  Does HQL even let you issue a query from an abstract class?  (Using subclass specific stuff works nicely though, e.g. from Cat where miceCaught > 5).I can think of two ways that this could be done in SQL and neither seems pretty.  One is to run an exists query on each subclass table for the given pkey and then load from the table that returned a hit. Alternatively Hibernate could perform some horrible union query joining in all the tables - essentially simulating the table-per-hierarchy scheme in that the result set would include attributes for all possible subclasses with the individual selects from the subclass tables returning null for the irrelevant arguments.  This latter case would probably even need to add a synthetic discriminator column so that Hibernate could know which subclass table actually returned the row and thus what Java class they should be parsed into.
Things get hairier too if you have subtypes of concrete types:
public class Greyhound extends Dog {
   float lifetimeRacingWinnings;
}
Now for a given animal pkey, there may be valid rows in the Dog and Greyhound tables, meaning that my first approach of manually checking the class that corresponds to a pkey gets a lot tougher.
The reason I'm so concerned is that I will be wanting to use this approach on a class hierarchy with about 70 classes with a maximum nesting chain of 4-5 levels, so performing a union query on all of that is likely to have horrible performance. Does Hibernate have any tricks up its sleeve to keep this relatively performant? Or is loading a reference to one of these classes by pkey going to take a long time?
Table Per Class strategy is the most logical inheritance solution because it mirrors the object model in the data model. In this pattern a table is defined for each class in the inheritance hierarchy to store only the local attributes of that class.
Hibernate/JPA Single Table Inheritance. The single table strategy maps all entities of the inheritance structure to the same database table. This approach makes polymorphic queries very efficient and provides the best performance.
In table per hierarchy mapping, single table is required to map the whole hierarchy, an extra column (known as discriminator column) is added to identify the class. But nullable values are stored in the table .
Table per Hierarchy is one of the inheritance strategies in hibernate. In this strategy, the entire hierarchy is mapped to a single table. All attributes of all the classes in the hierarchy are stored in a single table.
You'll find that Hibernate writes the query for an unknown animal type with a series of LEFT JOIN statements, one per subclass. So the query will slow as the number of subclasses increases, and will attempt to return an ever wider result set. So you are correct, it doesn't scale well with large class hierarchies.
With HQL, yes you can query the subclass directly, and access its properties. That will then be rendered with a single INNER JOIN.
I haven't tried this with multiple levels of inheritance. If the above hasn't put you off yet, suggest you try it and see - you can turn on SQL debug output to see what is being sent to the database, or simply profile your database.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With