Thursday, May 26, 2022

Mule 4 Batch Job Steps:

In this post, I want to share some brief details about Batch Processing and how to process messages as a batch in MuleSoft. 

Batch Processing: 

Batch processing handles large amounts of data. It can process data quickly, minimize or eliminate the need for user interaction, and improve the efficiency of job processing. In MuleSoft, when we need to process large amounts of data or handle messages as a batch, we can use batch processing, which is achieved with the Batch Job scope.

The Batch Job scope in a Mule application has multiple phases: it divides the input payload into individual records, performs actions on these individual records, and then sends the processed data to target systems.

A batch job is an asynchronous process: it runs asynchronously with respect to the main or calling flow, which does not wait for the batch to finish.
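For illustration, here is a minimal sketch of a flow that contains a Batch Job scope (the flow name, job name, and the sample payload are placeholders, and namespace declarations are omitted):

<flow name="simpleBatchFlow">
    <scheduler>
        <scheduling-strategy>
            <fixed-frequency frequency="60" timeUnit="SECONDS"/>
        </scheduling-strategy>
    </scheduler>
    <!-- The payload must be a collection so it can be split into records -->
    <set-payload value="#[[1, 2, 3, 4, 5]]"/>
    <batch:job jobName="simpleBatch">
        <batch:process-records>
            <batch:step name="step1">
                <logger level="INFO" message="#['Processing record: ' ++ (payload as String)]"/>
            </batch:step>
        </batch:process-records>
        <batch:on-complete>
            <logger level="INFO" message="Batch instance finished"/>
        </batch:on-complete>
    </batch:job>
    <!-- Reached immediately: the batch instance runs asynchronously -->
    <logger level="INFO" message="Batch job instance dispatched"/>
</flow>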

Each batch job contains three different phases: 

Load and Dispatch: 

This phase creates the batch job instance, converts the payload into a collection of records, and then splits the collection into individual records using DataWeave (internally). In this phase, Mule uses a persistent queue to store the records for processing.

Process:

In this phase, Mule starts pulling records from the queue according to the configured batch block size. Mule then sends each block of records to its corresponding batch step and processes the blocks asynchronously: each batch step can work on multiple record blocks in parallel, but the records inside each block are processed sequentially. Once a block has been processed, its records are sent back to the queue, from where they can be picked up by the next batch step.
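The block size and the number of blocks processed in parallel can be tuned on the Batch Job scope itself; a minimal sketch (the values shown are illustrative, the default block size being 100):

<!-- Pull 200 records per block; process at most 4 blocks in parallel -->
<batch:job jobName="tunedBatch" blockSize="200" maxConcurrency="4">
    <!-- batch steps go here -->
</batch:job>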

On Complete: 

This last, optional phase can create a report or summary of the records processed by the particular batch job instance, for example how many records were processed successfully and how many failed.
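In this phase the payload is a BatchJobResult object; a sketch of logging a simple summary, assuming the standard result properties:

<batch:on-complete>
    <!-- payload here is a BatchJobResult with counters for the instance -->
    <logger level="INFO"
            message="#['Total: $(payload.totalRecords), Successful: $(payload.successfulRecords), Failed: $(payload.failedRecords)']"/>
</batch:on-complete>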

Batch Components: 

Batch Step: 

The batch step is part of the Process phase. After Load and Dispatch, the batch job sends all records through the batch steps, and each batch step performs work on each individual record. We can apply filters by adding Accept Expressions within each batch step.
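As a sketch, here is a batch step that marks each record as processed (the step name and the added field are made up; inside a step, the payload is the current record):

<batch:step name="enrichStep">
    <ee:transform>
        <ee:message>
            <ee:set-payload><![CDATA[%dw 2.0
output application/json
---
// Assumes each record is an object; adds a flag to it
payload ++ { processed: true }]]></ee:set-payload>
        </ee:message>
    </ee:transform>
</batch:step>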

Batch Aggregator: 

The batch aggregator scope can only exist inside a batch step. It is used to accumulate records from a batch step and send them to an external system or service in bulk. The batch aggregator operates on the payload, not on variables.

There are two types of batch aggregators, both sketched below: Default, where we define the size of the aggregator (the number of records to accumulate before it fires), and Streaming, where the aggregator streams all of the records for processing, no matter how many there are.
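A sketch of both variants (the sizes, step names, and inner loggers are illustrative):

<!-- Default aggregator: fires once 100 records have accumulated -->
<batch:step name="bulkStep">
    <batch:aggregator size="100">
        <!-- Here payload is an array of up to 100 records -->
        <logger level="INFO" message="#['Aggregated ' ++ (sizeOf(payload) as String) ++ ' records']"/>
    </batch:aggregator>
</batch:step>

<!-- Streaming aggregator: receives all of the records as a stream -->
<batch:step name="streamStep">
    <batch:aggregator streaming="true">
        <logger level="INFO" message="Streaming all records"/>
    </batch:aggregator>
</batch:step>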

Batch Filters: 

A batch filter can only exist inside a batch step. We can apply one or more filters as attributes to any number of batch steps. There are two attributes available for filtering records; a combined sketch follows the list below.

  1. Accept Expression:
    – The Accept Expression attribute processes only the records for which the expression evaluates to true.

  2. Accept Policy:
    – The Accept Policy attribute has three possible values for filtering records.

    NO FAILURES: The batch step processes only those records that succeeded in previous steps. 

    ONLY FAILURES: The batch step processes only those records that failed to process in previous steps. 

    ALL: The batch step processes all records, regardless of whether they failed in previous steps.
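As a sketch, here is a step combining both attributes, followed by a step that picks up only the failures (the expression assumes each record has an age field):

<!-- Only records with age > 21 that have not failed so far -->
<batch:step name="adultsOnlyStep" acceptExpression="#[payload.age > 21]" acceptPolicy="NO_FAILURES">
    <logger level="INFO" message="#[payload]"/>
</batch:step>

<!-- Picks up only the records that failed in earlier steps -->
<batch:step name="recoveryStep" acceptPolicy="ONLY_FAILURES">
    <logger level="WARN" message="#['Retrying failed record: ' ++ (payload as String)]"/>
</batch:step>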

Flow Diagram of Batch Processing: [diagram omitted]