I've been tasked with migrating data from a "legacy" web app that uses Spring Data JPA to a newer system. My initial thought was to use Spring Batch. I am using a JpaPagingItemReader<LegacyEntity> to read the legacy entities, a custom ItemProcessor<LegacyEntity, NewDataDto> to transform the entities, and an ItemWriter<NewDataDto> to post the data to the new system via an HTTP REST call.
The entities have a lot of one-to-many associations. LegacyEntity has a one-to-many relationship with entityA, which in turn has a one-to-many relationship with entityB.
My problem is that the JpaPagingItemReader is driven by JPQL. I want the reader to output each LegacyEntity exactly once, with all associations fully loaded. I looked into using a fetch join in JPQL, but it looks like it might not support nested associations, and it emits duplicates.
What's the best way to handle this? How would I handle this if I were using plain old JDBC?
Spring Batch readers and processors all focus on processing one record at a time and only use paging under the hood, so how would I normally read objects with to-many associations in a batch job?
Normally you'd use the driving query pattern. Your ItemReader reads only the ids (or the bare minimum of each entity). The ItemProcessor then enriches each item with whatever else is needed, so the ItemWriter has the full entity to write. You can read more about this pattern in the Spring Batch documentation here: https://docs.spring.io/spring-batch/trunk/reference/html/patterns.html#drivingQueryBasedItemReaders
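To make the flow of that pattern concrete, here is a minimal plain-Java sketch. It deliberately does not use Spring Batch itself: IdReader, FullEntity, run and the in-memory childTable are hypothetical stand-ins for an ItemReader that returns ids, the enriched item, the step loop, and a database lookup.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Plain-Java stand-in for the driving query pattern; in a real job these roles
// would be played by Spring Batch's ItemReader, ItemProcessor and ItemWriter.
final class DrivingQuerySketch {

    // Hypothetical fully loaded entity: a root id plus the names of its children.
    record FullEntity(long id, List<String> children) {}

    // "Reader": hands out one id at a time and returns null when exhausted,
    // which mirrors the ItemReader contract.
    static final class IdReader {
        private final Iterator<Long> ids;
        IdReader(List<Long> ids) { this.ids = ids.iterator(); }
        Long read() { return ids.hasNext() ? ids.next() : null; }
    }

    // Runs the read -> enrich -> collect loop for every id.
    static List<FullEntity> run(IdReader reader, Function<Long, FullEntity> processor) {
        List<FullEntity> written = new ArrayList<>();
        for (Long id = reader.read(); id != null; id = reader.read()) {
            written.add(processor.apply(id)); // the "processor" loads everything for this id
        }
        return written; // a real ItemWriter would send these to the target system
    }

    public static void main(String[] args) {
        // Hypothetical child data keyed by parent id, standing in for a database lookup.
        Map<Long, List<String>> childTable =
                Map.of(1L, List.of("a1", "a2"), 2L, List.of("b1"));
        List<FullEntity> out = run(
                new IdReader(List.of(1L, 2L, 3L)),
                id -> new FullEntity(id, childTable.getOrDefault(id, List.of())));
        System.out.println(out);
    }
}
```

The trade-off is that the enrichment step runs once per item, so with deep association trees it can mean several extra queries per id.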
I managed to solve this with JDBC instead of JPA:
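    import java.util.List;
    import java.util.stream.Collectors;
    import javax.sql.DataSource;
    import org.springframework.batch.item.database.JdbcPagingItemReader;
    import org.springframework.jdbc.core.namedparam.NamedParameterJdbcTemplate;

    public class LegacyEntityReader extends JdbcPagingItemReader<LegacyEntity> {
        private final NamedParameterJdbcTemplate jdbcTemplate;

        public LegacyEntityReader(DataSource dataSource, int pageSize) {
            this.jdbcTemplate = new NamedParameterJdbcTemplate(dataSource);
            setDataSource(dataSource);
            setPageSize(pageSize);
            // Set up the query provider and row mapper for loading LegacyEntity without associations here
        }

        @Override
        protected void doReadPage() {
            // Loads a page of root entities into a list exposed as the protected field "results"
            super.doReadPage();
            List<Long> resultIds = results.stream().map(LegacyEntity::getId).collect(Collectors.toList());
            // Run queries with jdbcTemplate here to load the associations where legacyEntity.id is in resultIds
            // (skip them when resultIds is empty), then attach the rows in memory to the entities in "results"
        }
    }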
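The "associate in memory" step left as comments above could look roughly like the following sketch. Root, Child and attach are hypothetical names, and the child rows are passed in as a plain list instead of coming from a real "WHERE parent_id IN (:ids)" query.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class PageAssociationSketch {

    // Hypothetical root entity with a mutable child list.
    static final class Root {
        final long id;
        final List<Child> children = new ArrayList<>();
        Root(long id) { this.id = id; }
    }

    // Hypothetical child row, as it might come back from "... WHERE parent_id IN (:ids)".
    record Child(long parentId, String name) {}

    // Groups the child rows by parent id and attaches each group to its root.
    static void attach(List<Root> page, List<Child> childRows) {
        Map<Long, List<Child>> byParent =
                childRows.stream().collect(Collectors.groupingBy(Child::parentId));
        for (Root root : page) {
            root.children.addAll(byParent.getOrDefault(root.id, List.of()));
        }
    }

    public static void main(String[] args) {
        List<Root> page = List.of(new Root(1), new Root(2));
        attach(page, List.of(new Child(1, "a"), new Child(1, "b"), new Child(2, "c")));
        page.forEach(r -> System.out.println(r.id + " -> " + r.children));
    }
}
```

For a nested association (entityA to entityB in the question), the same step would presumably be repeated one level down, using the ids of the loaded entityA rows. That keeps the number of queries per page fixed at one per association level.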
As far as I can tell, JPA cannot load entities together with their to-many associations in a single paged query. A fetch join multiplies the root rows, so the result inevitably contains duplicates, and the deduplication has to take place in memory over the entire result set, which can lead to out-of-memory errors.
This makes me sad :(.
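To illustrate why paging and joins clash, here is a small in-memory simulation. It is not JPA or SQL, only a sketch of row-level paging over a flattened join result; JoinedRow, join, page and distinctRoots are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class JoinPagingSketch {

    // One row of a flattened join result: a root id plus one child value.
    record JoinedRow(long rootId, String child) {}

    // Simulates "root JOIN child": one output row per child of each root.
    static List<JoinedRow> join(List<Long> rootIds, int childrenPerRoot) {
        List<JoinedRow> rows = new ArrayList<>();
        for (long id : rootIds) {
            for (int c = 0; c < childrenPerRoot; c++) {
                rows.add(new JoinedRow(id, "child-" + c));
            }
        }
        return rows;
    }

    // Simulates row-level paging (OFFSET/LIMIT) applied to the joined rows.
    static List<JoinedRow> page(List<JoinedRow> rows, int offset, int limit) {
        int from = Math.min(offset, rows.size());
        int to = Math.min(offset + limit, rows.size());
        return rows.subList(from, to);
    }

    // Counts the distinct root ids that appear in a page of joined rows.
    static long distinctRoots(List<JoinedRow> page) {
        return page.stream().map(JoinedRow::rootId).distinct().count();
    }

    public static void main(String[] args) {
        List<JoinedRow> rows = join(List.of(1L, 2L, 3L), 3);
        List<JoinedRow> firstPage = page(rows, 0, 4);
        System.out.println("rows in page: " + firstPage.size());
        System.out.println("distinct roots in page: " + distinctRoots(firstPage));
    }
}
```

A page of four joined rows does not correspond to four entities: it holds every child of the first root and only part of the second, which is why the paging has to be applied to the root rows alone.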