[Spring Batch] Order of execution of the reader, the processor and the writer

It seems to me that the order of execution of the reader, the processor and the writer, inside a step and a chunk, is not always clear in everybody’s mind (including mine when I started with Spring Batch).

So I made the following 2 tables that should help – I hope – clarify things.

1) First scenario: we read, process and write all the data at once. No commit interval is defined.

Here is an excerpt of a job with one step. The reader reads the data from a datasource such as a database.


<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns:b="http://www.springframework.org/schema/batch"
xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.springframework.org/schema/batch
http://www.springframework.org/schema/batch/spring-batch-2.1.xsd
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans-3.0.xsd">
...

    <b:job id="doSomething" incrementer="idIncrementer">
        <b:step id="mainStep">
            <b:tasklet>
                <b:chunk reader="doSomethingReader" processor="doSomethingProcessor"
                         writer="doSomethingWriter" chunk-completion-policy="defaultResultCompletionPolicy" />
            </b:tasklet>
        </b:step>
    </b:job>

</beans>

If the database returns a total of 6 lines (items), here is how Spring Batch will process each item:

Order of execution with defaultResultCompletionPolicy

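| Execution order | Reader   | Processor | Writer                       | Transaction |
|-----------------|----------|-----------|------------------------------|-------------|
| 1               | 1st item |           |                              | T1          |
| 2               | 2nd item |           |                              | T1          |
| 3               | 3rd item |           |                              | T1          |
| 4               | 4th item |           |                              | T1          |
| 5               | 5th item |           |                              | T1          |
| 6               | 6th item |           |                              | T1          |
| 7               |          | 1st item  |                              | T1          |
| 8               |          | 2nd item  |                              | T1          |
| 9               |          | 3rd item  |                              | T1          |
| 10              |          | 4th item  |                              | T1          |
| 11              |          | 5th item  |                              | T1          |
| 12              |          | 6th item  |                              | T1          |
| 13              |          |           | The 6 items at the same time | T1          |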

In that configuration, there is a single transaction. The items are all written at once.
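
The ordering in the table above can be reproduced with a small plain-Java simulation. This is only a sketch of the "read everything, process everything, write once" pattern, not Spring Batch's actual implementation: the class `SingleChunkDemo`, its `run` method and the upper-casing "processor" are hypothetical names and behaviour chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class SingleChunkDemo {

    // Simulates one chunk that only ends when the reader has no more data.
    // Returns the log of calls in the order they happen.
    static List<String> run(List<String> source) {
        List<String> log = new ArrayList<>();
        Iterator<String> reader = source.iterator();

        // 1. Read phase: every item is read before any processing starts.
        List<String> items = new ArrayList<>();
        while (reader.hasNext()) {
            String item = reader.next();
            log.add("read " + item);
            items.add(item);
        }

        // 2. Process phase: items are processed one by one.
        List<String> processed = new ArrayList<>();
        for (String item : items) {
            log.add("process " + item);
            processed.add(item.toUpperCase());
        }

        // 3. Write phase: the writer receives the whole list in a single call.
        log.add("write " + processed.size() + " items at once");
        return log;
    }

    public static void main(String[] args) {
        List<String> log = run(Arrays.asList(
                "item1", "item2", "item3", "item4", "item5", "item6"));
        for (int i = 0; i < log.size(); i++) {
            System.out.println((i + 1) + " " + log.get(i));
        }
    }
}
```

With 6 items the log has 13 entries: 6 reads, then 6 process calls, then a single write.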

2) Second scenario: we define a chunk size of 4 items, which means there will be a commit every 4 items.


<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns:b="http://www.springframework.org/schema/batch"
xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.springframework.org/schema/batch
http://www.springframework.org/schema/batch/spring-batch-2.1.xsd
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans-3.0.xsd">

...

    <b:job id="doSomething" incrementer="idIncrementer">
        <b:step id="mainStep">
            <b:tasklet>
                <b:chunk reader="doSomethingReader" processor="doSomethingProcessor"
                         writer="doSomethingWriter" commit-interval="4" />
            </b:tasklet>
        </b:step>
    </b:job>

</beans>

Chunk processing will then occur in the following order (assuming there are 6 items):

Order of execution with chunk processing

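| Execution order | Reader   | Processor | Writer                               | Transaction |
|-----------------|----------|-----------|--------------------------------------|-------------|
| 1               | 1st item |           |                                      | T1          |
| 2               | 2nd item |           |                                      | T1          |
| 3               | 3rd item |           |                                      | T1          |
| 4               | 4th item |           |                                      | T1          |
| 5               |          | 1st item  |                                      | T1          |
| 6               |          | 2nd item  |                                      | T1          |
| 7               |          | 3rd item  |                                      | T1          |
| 8               |          | 4th item  |                                      | T1          |
| 9               |          |           | The first 4 lines, at the same time  | T1          |
| 10              | 5th item |           |                                      | T2          |
| 11              | 6th item |           |                                      | T2          |
| 12              |          | 5th item  |                                      | T2          |
| 13              |          | 6th item  |                                      | T2          |
| 14              |          |           | The last 2 lines, at the same time   | T2          |

There are 2 transactions, one for each chunk.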
That means if there is a problem with the 6th item, the first 4 items (1st chunk) will already have been processed and committed.
A rollback will occur for all items of the 2nd chunk (items 5 and 6).
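
The commit and rollback behaviour described above can be sketched in plain Java. This is a simplified simulation under stated assumptions, not Spring Batch code: the class `ChunkRollbackDemo`, its `run` method and the `failingItem` parameter are hypothetical, the "transaction" is just a temporary list, and the step is assumed to stop at the first failure (no skip or retry configured).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChunkRollbackDemo {

    // Processes the source in chunks of chunkSize items.
    // failingItem (may be null) is the item on which the simulated processor throws.
    // Returns the items that were committed before the step stopped.
    static List<String> run(List<String> source, int chunkSize, String failingItem) {
        List<String> committed = new ArrayList<>();
        for (int start = 0; start < source.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, source.size());
            List<String> chunk = source.subList(start, end);   // read phase
            List<String> pending = new ArrayList<>();          // work done inside the transaction
            try {
                for (String item : chunk) {                    // process phase
                    if (item.equals(failingItem)) {
                        throw new IllegalStateException("cannot process " + item);
                    }
                    pending.add(item);
                }
                committed.addAll(pending);                     // write phase, then commit
            } catch (IllegalStateException e) {
                // Rollback: nothing from this chunk is kept, and the step stops.
                System.out.println("rollback of " + chunk + ": " + e.getMessage());
                break;
            }
        }
        return committed;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList(
                "item1", "item2", "item3", "item4", "item5", "item6");
        System.out.println("committed: " + run(items, 4, "item6"));
    }
}
```

When the 6th item fails, only the first chunk (items 1 to 4) ends up in the committed list; item 5, although it was processed without error, is discarded together with item 6.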

Asynchronously run a method with the async and await keywords

I wrote this small class to illustrate the use of the async and await keywords, which were added in C# 5.0 (.NET Framework 4.5).
It should quickly help anyone curious to understand this easier way of writing asynchronous programs in C#.


using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace ConsoleApplication3
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Synchronous method");
            Console.WriteLine("Before calling DisplaySum");
            DisplaySum();
            Console.WriteLine("After calling DisplaySum");
            Console.WriteLine("Press a key to continue.");
            Console.ReadKey();

            Console.WriteLine("************************");
            Console.WriteLine("Asynchronous method");

            Console.WriteLine("Before calling DisplaySumAsync");
            DisplaySumAsync();
            Console.WriteLine("After calling DisplaySumAsync");
            Console.WriteLine("Press a key to quit.");
            Console.ReadKey();
        }

        public static double Calculate()
        {
            Console.WriteLine("Method Calculate()");
            double x = 1;
            // Long-running method
            for (int i = 1; i < 100000000; i++)
            {
                x += Math.Tan(x) / i;
            }
            return x;
        }

        // Synchronous method
        private static void DisplaySum()
        {
            Console.WriteLine("Method DisplaySum()");
            double result = Calculate();
            Console.WriteLine("Method DisplaySum - result : " + result);
        }

        // ***************************************

        public static Task<double> CalculateAsync()
        {
            Console.WriteLine("Method CalculateAsync()");
            return Task.Run(() =>
            {
                double x = 1;
                // Long-running method
                for (int i = 1; i < 100000000; i++)
                {
                    x += Math.Tan(x) / i;
                }
                return x;
            });
        }

        // Asynchronous method
        private static async void DisplaySumAsync()
        {
            Console.WriteLine("Method DisplaySumAsync()");
            double result = await CalculateAsync();
            Console.WriteLine("Method DisplaySumAsync - result : " + result);
        }
    }
}

The output below (console window) clearly shows the difference between calling a synchronous method and calling an asynchronous one.
The message “After calling DisplaySumAsync” (line 25 of the listing) is printed as soon as DisplaySumAsync() returns control to Main, whether or not the calculation has finished.
The important thing to note is this: DisplaySumAsync() starts on the calling thread and runs there until it reaches await, while the long calculation runs on a thread-pool thread started by Task.Run. An asynchronous method does not block the calling function, so the caller's next lines are executed immediately.

[Screenshot: AsyncAwait console output]