MULE 4 Batch Job steps:
In this post, I want to share some brief details about Batch Processing and how to process messages as a batch in MuleSoft.
Batch Processing:
Batch processing handles large amounts of data. It can process data quickly, minimize or eliminate the need for user interaction, and improve the efficiency of job processing. In MuleSoft, when we need to process large amounts of data or process messages as a batch, we can use Batch Processing, which is implemented with the batch scope.
The batch scope in a Mule application has multiple phases in which it divides the input payload into individual records, performs actions on these individual records, and then sends the processed data to target systems.
A batch job is an asynchronous process: it runs asynchronously with respect to the main (calling) flow.
Each batch job contains three different phases:
Load and Dispatch:
This phase creates a batch job instance, converts the payload into a collection of records, and then splits the collection into individual records for processing, using DataWeave internally. In this phase, Mule internally uses a persistent queue for storing and processing records.
Process:
In this phase, Mule starts pulling records from the queue in blocks of the configured batch block size. Next, Mule sends the records to their corresponding batch step and processes them asynchronously. Each batch step processes multiple record blocks in parallel; however, the records inside each block are processed sequentially. After a batch step has processed the records, it sends them back to the queue, from where the next batch step can process them.
On Complete:
This last, optional phase creates a report or summary of the records processed for the particular batch job instance. It tells us how many records were processed successfully and how many failed.
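The three phases above can be sketched as a minimal Mule 4 configuration. This is only an illustrative skeleton, not a production setup: the names exampleBatchFlow, exampleBatchJob and logRecordStep are made up for this post, the block size of 100 is just an example value, and the flow has no event source, so it is assumed to be invoked from another flow with a collection as payload.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<mule xmlns="http://www.mulesoft.org/schema/mule/core"
      xmlns:batch="http://www.mulesoft.org/schema/mule/batch"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="
        http://www.mulesoft.org/schema/mule/core http://www.mulesoft.org/schema/mule/core/current/mule.xsd
        http://www.mulesoft.org/schema/mule/batch http://www.mulesoft.org/schema/mule/batch/current/mule-batch.xsd">

  <!-- Hypothetical flow: assumed to be called with a collection as payload -->
  <flow name="exampleBatchFlow">

    <!-- Load and Dispatch happens implicitly when the event reaches batch:job:
         the payload is split into records and the records are queued -->
    <batch:job jobName="exampleBatchJob" blockSize="100">

      <!-- Process: records are pulled from the queue in blocks of blockSize -->
      <batch:process-records>
        <batch:step name="logRecordStep">
          <!-- Inside a step, payload is a single record -->
          <logger level="INFO" message="#[payload]"/>
        </batch:step>
      </batch:process-records>

      <!-- On Complete (optional): payload is the result summary of this job instance -->
      <batch:on-complete>
        <logger level="INFO"
                message="#['Total: $(payload.totalRecords), succeeded: $(payload.successfulRecords), failed: $(payload.failedRecords)']"/>
      </batch:on-complete>
    </batch:job>
  </flow>
</mule>
```

Because the batch job is asynchronous, the calling flow continues as soon as the records have been loaded; the On Complete logger fires later, once every record has gone through the steps.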
Batch Components:
Batch Step:
The batch step is part of the Process phase. After Load and Dispatch, the batch job sends the records to the batch steps, and each batch step performs work on each record it receives. We can filter the records a step receives by adding an Accept Expression to the batch step.
Batch Aggregator:
The batch aggregator scope can exist only inside a batch step. It is used to accumulate the records from a batch step and send them to an external source or service in bulk. The batch aggregator operates on the payload, not on variables.
There are two types of batch aggregators: Default, where we define the size of the aggregator (the number of records to accumulate), and Streaming, where the aggregator streams all the records in the step and processes them, no matter how many there are.
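As a rough sketch, the two aggregator types could be configured as shown here. These are fragments meant to sit inside the batch:process-records element of a batch job; the step names and the size of 50 are assumptions chosen for illustration, and the loggers stand in for whatever bulk call (database insert, API request, and so on) we would really make.

```xml
<!-- Default aggregator: accumulates records and releases them in groups of 50 -->
<batch:step name="bulkSendStep">
  <!-- Per-record processing runs first -->
  <logger level="DEBUG" message="#[payload]"/>
  <batch:aggregator size="50">
    <!-- Here payload is an array of up to 50 records -->
    <logger level="INFO" message="#['Sending $(sizeOf(payload)) records in bulk']"/>
  </batch:aggregator>
</batch:step>

<!-- Streaming aggregator: no size; all records of the step are streamed through -->
<batch:step name="streamAllStep">
  <logger level="DEBUG" message="#[payload]"/>
  <batch:aggregator streaming="true">
    <logger level="INFO" message="Streaming all records of this step"/>
  </batch:aggregator>
</batch:step>
```

The size and streaming attributes are alternatives: an aggregator uses either a fixed size or streaming, not both.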
Batch Filters:
Batch filters can exist only on batch steps. We can apply one or more filters as attributes to any number of batch steps. There are two attributes available for filtering records.
- Accept Expression:
– The Accept Expression attribute makes the step process only the records for which the expression evaluates to true.
- Accept Policy:
– The Accept Policy attribute has three predefined values used to filter the records.
NO_FAILURES: The batch step processes only those records that succeeded in previous steps. This is the default.
ONLY_FAILURES: The batch step processes only those records that failed to process in previous steps.
ALL: The batch step processes all records, regardless of whether they failed or not in previous steps.
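To make the two filter attributes concrete, here is a hedged sketch of a batch job that uses both. In the XML, the policy values are written with underscores (for example ONLY_FAILURES). The job and step names and the age field on the record are hypothetical, and maxFailedRecords="-1" is set on the assumption that we want the job to keep going after failures so that the failure-handling step actually has records to receive.

```xml
<!-- Fragment: goes inside a flow; -1 means no limit on failed records -->
<batch:job jobName="filterExampleJob" maxFailedRecords="-1">
  <batch:process-records>

    <!-- Accept Expression: only records whose (hypothetical) age field is 18 or more -->
    <batch:step name="adultsOnlyStep" acceptExpression="#[payload.age >= 18]">
      <logger level="INFO" message="#[payload]"/>
    </batch:step>

    <!-- Accept Policy: only records that failed in an earlier step -->
    <batch:step name="handleFailuresStep" acceptPolicy="ONLY_FAILURES">
      <logger level="WARN" message="#[payload]"/>
    </batch:step>

    <!-- Both attributes can be combined on a single step -->
    <batch:step name="allAdultsStep" acceptPolicy="ALL" acceptExpression="#[payload.age >= 18]">
      <logger level="INFO" message="#[payload]"/>
    </batch:step>

  </batch:process-records>
</batch:job>
```

A step with no acceptPolicy attribute behaves as NO_FAILURES, so adultsOnlyStep above sees only records that have not failed so far.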
Flow Diagram of Batch Processing: