[Spring Batch] Order of execution of the reader, the processor and the writer

It seems to me that the order in which the reader, the processor and the writer run inside a step and a chunk is not always clear to everybody (including me, when I started with Spring Batch).

So I made the following 2 tables that should help – I hope – clarify things.

1) First scenario: we read, process and write all the data at once. No commit interval is defined; the chunk is governed by a completion policy instead.

Here is an excerpt of a job with one step. The reader reads the data from a datasource such as a database.


<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns:b="http://www.springframework.org/schema/batch"
       xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/batch
                           http://www.springframework.org/schema/batch/spring-batch-2.1.xsd
                           http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans-3.0.xsd">
    ...

    <b:job id="doSomething" incrementer="idIncrementer">
        <b:step id="mainStep">
            <b:tasklet>
                <b:chunk reader="doSomethingReader" processor="doSomethingProcessor"
                         writer="doSomethingWriter" chunk-completion-policy="defaultResultCompletionPolicy" />
            </b:tasklet>
        </b:step>
    </b:job>

</beans>

If the total number of lines (items) returned from the database is 6, then here is the order in which Spring Batch will handle each item:

Order of execution with defaultResultCompletionPolicy

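| Execution order | Reader   | Processor | Writer                          | Transactions |
|-----------------|----------|-----------|---------------------------------|--------------|
| 1               | 1st item |           |                                 | T1           |
| 2               | 2nd item |           |                                 | T1           |
| 3               | 3rd item |           |                                 | T1           |
| 4               | 4th item |           |                                 | T1           |
| 5               | 5th item |           |                                 | T1           |
| 6               | 6th item |           |                                 | T1           |
| 7               |          | 1st item  |                                 | T1           |
| 8               |          | 2nd item  |                                 | T1           |
| 9               |          | 3rd item  |                                 | T1           |
| 10              |          | 4th item  |                                 | T1           |
| 11              |          | 5th item  |                                 | T1           |
| 12              |          | 6th item  |                                 | T1           |
| 13              |          |           | The 6 items at the same time    | T1           |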

In that configuration, there is a single transaction. The items are all written at once.
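
To make the first table concrete, here is a minimal plain-Java sketch of the same ordering. It does not use Spring Batch at all: the class name `SingleChunkOrderDemo`, the `run` method and the upper-casing "processor" are all hypothetical, and the loop only imitates, under the assumption that the chunk ends when the reader has nothing left to return, what the framework does for a single chunk.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical simulation, not Spring Batch code: it only mimics the call order.
class SingleChunkOrderDemo {

    /** Returns the ordered list of calls made for one chunk that spans the whole input. */
    static List<String> run(List<String> source) {
        List<String> trace = new ArrayList<>();
        Iterator<String> reader = source.iterator();

        // Phase 1: read until the reader is exhausted (no commit interval).
        List<String> items = new ArrayList<>();
        while (reader.hasNext()) {
            String item = reader.next();
            trace.add("read " + item);
            items.add(item);
        }

        // Phase 2: process every item that was read, in the same order.
        List<String> processed = new ArrayList<>();
        for (String item : items) {
            trace.add("process " + item);
            processed.add(item.toUpperCase()); // stand-in for real processing
        }

        // Phase 3: a single write call receives the whole list.
        trace.add("write " + processed.size() + " items");
        return trace;
    }

    public static void main(String[] args) {
        run(List.of("a", "b", "c", "d", "e", "f")).forEach(System.out::println);
    }
}
```

With 6 input items the trace has 13 entries (6 reads, 6 process calls, 1 write), which lines up with the 13 steps of the table.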

2) Second scenario: we set the chunk size to 4 items with commit-interval="4". That means there will be a commit every 4 items.


<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns:b="http://www.springframework.org/schema/batch"
       xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/batch
                           http://www.springframework.org/schema/batch/spring-batch-2.1.xsd
                           http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans-3.0.xsd">

    ...

    <b:job id="doSomething" incrementer="idIncrementer">
        <b:step id="mainStep">
            <b:tasklet>
                <b:chunk reader="doSomethingReader" processor="doSomethingProcessor"
                         writer="doSomethingWriter" commit-interval="4" />
            </b:tasklet>
        </b:step>
    </b:job>

</beans>

Chunk processing will then occur in the following order (supposing there are 6 items):

Order of execution with chunk processing

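| Execution order | Reader   | Processor | Writer                               | Transactions |
|-----------------|----------|-----------|--------------------------------------|--------------|
| 1               | 1st item |           |                                      | T1           |
| 2               | 2nd item |           |                                      | T1           |
| 3               | 3rd item |           |                                      | T1           |
| 4               | 4th item |           |                                      | T1           |
| 5               |          | 1st item  |                                      | T1           |
| 6               |          | 2nd item  |                                      | T1           |
| 7               |          | 3rd item  |                                      | T1           |
| 8               |          | 4th item  |                                      | T1           |
| 9               |          |           | The first 4 lines, at the same time  | T1           |
| 10              | 5th item |           |                                      | T2           |
| 11              | 6th item |           |                                      | T2           |
| 12              |          | 5th item  |                                      | T2           |
| 13              |          | 6th item  |                                      | T2           |
| 14              |          |           | The last 2 lines, at the same time   | T2           |

There are 2 transactions, one for each chunk.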
That means if there is a problem with the 6th item, the first 4 items (1st chunk) will already have been processed and committed.
A rollback will occur for all items of the 2nd chunk (items 5 and 6).
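
The commit and rollback behaviour described above can be imitated with a small plain-Java simulation. This is only a sketch: `ChunkRollbackDemo`, its `run` method and the `failingItem` parameter are hypothetical names, the "transaction" is just a list that is appended to on commit, and the failure is assumed to happen in the processor with no skip or retry configured.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation, not Spring Batch code: one "transaction" per chunk.
class ChunkRollbackDemo {

    /** Items that survived a commit; stands in for the target datasource. */
    final List<Integer> committed = new ArrayList<>();
    /** Ordered log of reader, processor, writer and transaction events. */
    final List<String> trace = new ArrayList<>();

    void run(List<Integer> source, int commitInterval, int failingItem) {
        int tx = 0;
        for (int start = 0; start < source.size(); start += commitInterval) {
            int end = Math.min(start + commitInterval, source.size());
            List<Integer> chunk = source.subList(start, end);
            tx++;
            trace.add("begin T" + tx);
            try {
                // Read the whole chunk first...
                for (Integer item : chunk) {
                    trace.add("read " + item);
                }
                // ...then process each item of the chunk...
                List<Integer> processed = new ArrayList<>();
                for (Integer item : chunk) {
                    if (item == failingItem) {
                        throw new IllegalStateException("cannot process " + item);
                    }
                    trace.add("process " + item);
                    processed.add(item);
                }
                // ...then write the chunk in one call and commit.
                trace.add("write " + processed);
                committed.addAll(processed);
                trace.add("commit T" + tx);
            } catch (IllegalStateException e) {
                // Nothing from this chunk reaches 'committed'; earlier chunks stay.
                trace.add("rollback T" + tx);
                return;
            }
        }
    }

    public static void main(String[] args) {
        ChunkRollbackDemo demo = new ChunkRollbackDemo();
        demo.run(List.of(1, 2, 3, 4, 5, 6), 4, 6); // item 6 fails
        demo.trace.forEach(System.out::println);
        System.out.println("committed = " + demo.committed);
    }
}
```

Running it with 6 items, a commit interval of 4 and a failure on item 6 leaves only items 1 to 4 in `committed`, and the trace ends with `rollback T2`. Passing a `failingItem` that is not in the source (for example `-1`) gives the two-commit sequence of the table.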