
How to use Spring Batch to read JPA entities with associations

I've been tasked with migrating data from a "legacy" web app that uses Spring Data JPA to a newer system. My initial thought was to use Spring Batch. I am using a JpaPagingItemReader<LegacyEntity> to read the legacy entities, a custom ItemProcessor<LegacyEntity, NewDataDto> to transform them, and an ItemWriter<NewDataDto> to post the data to the new system via an HTTP REST call.

The entities have a lot of one-to-many associations. LegacyEntity has a one-to-many relationship with entityA, which in turn has a one-to-many relationship with entityB.

My problem is that the JpaPagingItemReader is driven by JPQL. I want the reader to output each LegacyEntity exactly once, with all associations fully loaded. I looked into using a fetch join in JPQL, but it looks like it might not support nested associations, and it emits duplicates.

What's the best way to handle this? How would I handle it if I were using plain old JDBC?

Spring Batch readers and processors all focus on processing one record at a time, using paging only under the hood, so how would I normally read objects with to-many associations in a batch job?

Jacob Botuck asked Oct 16 '25 03:10


2 Answers

Normally you'd use the driving query pattern. Your ItemReader reads only the IDs (or the bare minimum of each entity). The ItemProcessor then enriches each item with whatever else is needed, so the ItemWriter receives the full entity to write. You can read more about this pattern in the Spring Batch documentation: https://docs.spring.io/spring-batch/trunk/reference/html/patterns.html#drivingQueryBasedItemReaders
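To make the division of work concrete, here is a minimal plain-Java sketch of the driving query idea. It deliberately uses no Spring Batch types: the `Iterator<Long>` stands in for an ItemReader that emits only IDs, the `Function` stands in for the enriching ItemProcessor, and the returned list is what the ItemWriter would receive. `DrivingQuerySketch`, `LoadedEntity`, `loadById` and the in-memory `CHILDREN` map are hypothetical names and made-up data, not part of the original question or of Spring Batch.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Plain-Java illustration of the driving query pattern (no Spring Batch types).
class DrivingQuerySketch {

    // Hypothetical fully loaded item: a root id plus the names of its children.
    record LoadedEntity(long id, List<String> children) {}

    // The "reader" supplies only keys; the "processor" turns each key into a
    // fully loaded item. The returned list is what the "writer" would receive.
    static List<LoadedEntity> run(Iterator<Long> idReader,
                                  Function<Long, LoadedEntity> enrichingProcessor) {
        List<LoadedEntity> written = new ArrayList<>();
        while (idReader.hasNext()) {
            written.add(enrichingProcessor.apply(idReader.next()));
        }
        return written;
    }

    // Stand-in for the child table, keyed by parent id (made-up data).
    static final Map<Long, List<String>> CHILDREN = Map.of(
            1L, List.of("a1", "a2"),
            2L, List.of("a3"));

    // Stand-in for a per-item lookup such as "select ... where parent_id = :id".
    static LoadedEntity loadById(long id) {
        return new LoadedEntity(id, CHILDREN.getOrDefault(id, List.of()));
    }

    public static void main(String[] args) {
        List<LoadedEntity> out =
                run(List.of(1L, 2L, 3L).iterator(), DrivingQuerySketch::loadById);
        out.forEach(System.out::println);
    }
}
```

The trade-off in this shape is one extra lookup per item, in exchange for a cheap, duplicate-free paged query over the root IDs.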

Michael Minella answered Oct 18 '25 18:10


I managed to solve this with JDBC instead of JPA.

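import java.util.List;
import java.util.stream.Collectors;
import javax.sql.DataSource;
import org.springframework.batch.item.database.JdbcPagingItemReader;
import org.springframework.jdbc.core.namedparam.NamedParameterJdbcTemplate;

public class LegacyEntityReader extends JdbcPagingItemReader<LegacyEntity> {
    private final NamedParameterJdbcTemplate jdbcTemplate;

    public LegacyEntityReader(DataSource dataSource, int pageSize) {
        this.jdbcTemplate = new NamedParameterJdbcTemplate(dataSource);
        setDataSource(dataSource);
        setPageSize(pageSize);
        // Set up the query provider and row mapper for loading LegacyEntity without associations here
    }

    @Override
    protected void doReadPage() {
        // Loads a page of root entities into a list exposed as the protected field "results"
        super.doReadPage();
        if (results.isEmpty()) {
            return; // nothing to enrich, and an empty IN list is invalid SQL in most databases
        }
        List<Long> resultIds = results.stream().map(LegacyEntity::getId).collect(Collectors.toList());

        // Run the queries that load the associations here, where legacyEntity.id IN resultIds
        // Then attach them in memory to the entities in the "results" field
    }
}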
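The "associate in memory" step in the reader above could look roughly like the following plain-Java sketch. It assumes the child rows for the whole page have already been fetched with a single query of the form `... WHERE parent_id IN (:resultIds)`; the `Parent` and `Child` types, their fields, and the `attach` helper are hypothetical stand-ins for LegacyEntity and entityA, not code from the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Plain-Java illustration of attaching child rows to a page of parents.
class PageAssociationSketch {

    // Hypothetical root entity, loaded without its associations.
    static class Parent {
        final long id;
        final List<Child> children = new ArrayList<>();

        Parent(long id) {
            this.id = id;
        }
    }

    // Hypothetical child row carrying its foreign key back to the parent.
    record Child(long id, long parentId) {}

    // Groups the child rows by foreign key once, then hands each parent its group.
    // Rows whose parent is not on the current page are ignored.
    static void attach(List<Parent> page, List<Child> childRows) {
        Map<Long, List<Child>> byParent = childRows.stream()
                .collect(Collectors.groupingBy(Child::parentId));
        for (Parent parent : page) {
            parent.children.addAll(byParent.getOrDefault(parent.id, List.of()));
        }
    }
}
```

For the nested level (entityB under entityA), the same step can be repeated with the IDs of the child rows just loaded, which keeps the number of queries per page equal to the number of association levels rather than the number of entities.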

As far as I can tell, JPA cannot load entities together with their to-many associations in a single paged JPA query. A fetch join returns one row per child, so the root entities come back duplicated, and the deduplication then has to take place in memory over the entire result set, which can lead to out-of-memory errors.
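The row-versus-entity mismatch can be seen in a small plain-Java simulation. This is only an illustration with made-up data: `JOINED` stands in for the rows a parent-to-child join would return, and `pageOfRows` mimics OFFSET/LIMIT being applied to those rows. None of the names come from JPA or Spring Batch.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Plain-Java simulation of paging over joined rows instead of root entities.
class JoinPagingSketch {

    // One row of a hypothetical "parent JOIN child" result.
    record Row(long parentId, String child) {}

    // Parent 1 has three children, parent 2 has one (made-up data).
    static final List<Row> JOINED = List.of(
            new Row(1, "a1"), new Row(1, "a2"), new Row(1, "a3"),
            new Row(2, "a4"));

    // Mimics OFFSET/LIMIT applied to the joined rows.
    static List<Row> pageOfRows(int pageIndex, int pageSize) {
        return JOINED.stream()
                .skip((long) pageIndex * pageSize)
                .limit(pageSize)
                .collect(Collectors.toList());
    }

    // The distinct parents that appear on a page of rows, in encounter order.
    static Set<Long> parentsIn(List<Row> rows) {
        return rows.stream()
                .map(Row::parentId)
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    public static void main(String[] args) {
        System.out.println(parentsIn(pageOfRows(0, 2))); // prints [1]
        System.out.println(parentsIn(pageOfRows(1, 2))); // prints [1, 2]
    }
}
```

With a page size of 2, the first page holds only part of parent 1 (child a3 is missing) and parent 1 shows up again on the second page. Avoiding that requires either paging over root IDs only or reading the whole joined result before deduplicating, which is the memory problem described above.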

This makes me sad :(.

Jacob Botuck answered Oct 18 '25 17:10


