Stream File Reader Pattern
Discover more about the Stream File Reader Pattern component and how to use it on the Digibee Integration Platform.
Stream File Reader Pattern reads a local text file in blocks of lines, according to the configured pattern, and triggers subpipelines to process each block. Use this component for large files.
Parameters
Take a look at the configuration parameters of the component. Parameters supported by Double Braces expressions are marked with (DB).
Parameter | Description | Default value | Data type |
---|---|---|---|
File Name | Name or full file path (e.g. tmp/processed/file.txt) of the local file. | N/A | String |
Tokenizer | Tokenization mode: XML, PAIR, or REGEX. With XML, you specify the name of an XML tag and the component sends each block enclosed by that tag. With PAIR, you configure a start token and an end token, and the component sends to the subflow all the lines between them. With REGEX, you provide a regular expression, and the component returns the blocks between its matches. | XML | String |
Token | Token used to search for the pattern in the specified file. | N/A | String |
End Token | End token. This parameter is available only when PAIR Tokenizer is selected. | N/A | String |
Include Tokens | When activated, the start and end tokens are included in each returned block. This parameter is available only when PAIR Tokenizer is selected. | False | Boolean |
Group | Integer value that determines how many matches of the defined pattern are grouped together in each result returned by the component. | N/A | String |
Element Identifier | Attribute to be sent in case of errors. | N/A | String |
Parallel Execution Of Each Iteration | When activated, the iterations of the loop are executed in parallel. | False | Boolean |
Fail On Error | When activated, this parameter suspends the pipeline execution only if a severe error occurs in the iteration structure and prevents it from completing. Activating Fail On Error has no connection with errors that occur in the components used to build the subpipelines (onProcess and onException). | False | Boolean |
Messages flow
Input
File Name substitutes the local pattern file.
Output
total: total number of processed lines.
success: total number of successfully processed lines.
failed: total number of lines whose processing failed.
To know if a line has been correctly processed, each processed line must return { "success": true }.
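As a rough illustration of how the three output fields relate to each other, the counting can be sketched in Python. This is not the component's implementation: `summarize` and `handler` are hypothetical names, and the sketch assumes that any result other than `{ "success": true }` (including a raised exception) counts as a failure.

```python
from typing import Callable, Dict, Iterable


def summarize(blocks: Iterable[str], on_process: Callable[[str], dict]) -> Dict[str, int]:
    """Run a handler for each block and tally total, success, and failed."""
    total = success = failed = 0
    for block in blocks:
        total += 1
        try:
            result = on_process(block)
        except Exception:
            # Assumption: an exception in the handler counts as a failed block.
            result = {"success": False}
        if result.get("success") is True:
            success += 1
        else:
            failed += 1
    return {"total": total, "success": success, "failed": failed}


def handler(block: str) -> dict:
    # Hypothetical handler: a block succeeds when it is not blank.
    return {"success": bool(block.strip())}


summary = summarize(["a", "", "c"], handler)
print(summary)
```

In this sketch `total` is always the sum of `success` and `failed`.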
The component throws an exception if the File Name doesn't exist or can't be read.
File manipulation inside a pipeline occurs in a protected way. All files can be accessed only through a temporary directory, where each pipeline key gives access to its own set of files.
Stream File Reader Pattern performs batch processing. To better understand the process, read the documentation.
Stream File Reader Pattern in Action
See below how the component behaves in specific situations and how it is configured in each one.
Using the XML Tokenizer to search for tag content that can span multiple lines
Given that the following XML file must be read:
file.xml
Configuring the component to return just the XML block of the order tag:
File Name: file.xml
Tokenizer: XML
Token: order
The result will be 2 subflows containing the values that are inside the order tag:
First:
Second:
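The idea behind the XML Tokenizer can be pictured with a small Python sketch that reads a stream line by line and emits one block per occurrence of the tag. It is only an approximation under stated assumptions: `xml_blocks` and `SAMPLE` are hypothetical, the sample data is invented for illustration and is not the file.xml from this example, and nested or self-closing tags are not handled.

```python
import io
import re
from typing import Iterator, List, TextIO


def xml_blocks(stream: TextIO, tag: str) -> Iterator[str]:
    """Yield each <tag>...</tag> block, even when it spans several lines."""
    open_re = re.compile(rf"<{re.escape(tag)}[\s>]")
    close = f"</{tag}>"
    buffer: List[str] = []
    inside = False
    for line in stream:
        if not inside:
            match = open_re.search(line)
            if match is None:
                continue
            inside = True
            line = line[match.start():]  # drop anything before the opening tag
        end = line.find(close)
        if end == -1:
            buffer.append(line)
        else:
            buffer.append(line[: end + len(close)])
            yield "".join(buffer)
            buffer, inside = [], False


# Invented sample data, used only to exercise the sketch.
SAMPLE = """<orders>
  <order>
    <id>1</id>
  </order>
  <order>
    <id>2</id>
  </order>
</orders>
"""

blocks = list(xml_blocks(io.StringIO(SAMPLE), "order"))
for block in blocks:
    print(block)
    print("---")
```

Because the stream is consumed one line at a time, only the block currently being assembled is held in memory, which is the property that matters for large files.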
Using the PAIR Tokenizer to read a file where there's a start token and an end token for each block
file.txt
File Name: file.txt
Tokenizer: PAIR
Token: ###
End Token: --###
Include Tokens: deactivated
The result will be 3 subflows containing the values that are inside the start (###) and end tokens (--###):
First:
Second:
Third:
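A minimal Python sketch of the PAIR behavior, including the effect of Include Tokens, could look like the following. `pair_blocks` and `SAMPLE` are hypothetical, the sample data is invented rather than taken from file.txt, and the sketch assumes that each token sits alone on its own line, which the component may not require.

```python
import io
from typing import Iterator, List, TextIO


def pair_blocks(
    stream: TextIO, start: str, end: str, include_tokens: bool = False
) -> Iterator[str]:
    """Yield the lines found between a start token line and an end token line."""
    buffer: List[str] = []
    inside = False
    for raw in stream:
        line = raw.rstrip("\n")
        if not inside:
            if line == start:
                inside = True
                buffer = [line] if include_tokens else []
        elif line == end:
            if include_tokens:
                buffer.append(line)
            yield "\n".join(buffer)
            inside = False
        else:
            buffer.append(line)


# Invented sample data, used only to exercise the sketch.
SAMPLE = """###
first
--###
###
second
block
--###
###
third
--###
"""

blocks = list(pair_blocks(io.StringIO(SAMPLE), "###", "--###"))
with_tokens = list(pair_blocks(io.StringIO(SAMPLE), "###", "--###", include_tokens=True))
print(blocks)
print(with_tokens[0])
```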
Using the REGEX Tokenizer to search all the lines between patterns
file.txt
The following pattern must be searched:
ID-\b[0-9a-f]{8}\b-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-\b[0-9a-f]{12}\b
File Name: file.txt
Tokenizer: REGEX
Token: ID-\b[0-9a-f]{8}\b-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-\b[0-9a-f]{12}\b
The result will be 2 subflows containing the values that match the specified REGEX pattern.
First:
Second:
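One way to picture the REGEX Tokenizer is as a splitter in which every line matching the pattern separates one block from the next. The Python sketch below follows that reading, which is an assumption and may differ from the component's exact behavior (for example, in whether the matched line itself is part of the block). `regex_blocks` and `SAMPLE` are hypothetical, and the sample data is invented.

```python
import io
import re
from typing import Iterator, List, TextIO

ID_PATTERN = r"ID-\b[0-9a-f]{8}\b-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-\b[0-9a-f]{12}\b"


def regex_blocks(stream: TextIO, pattern: str) -> Iterator[str]:
    """Yield the lines found between lines that match the pattern."""
    delimiter = re.compile(pattern)
    buffer: List[str] = []
    for line in stream:
        if delimiter.search(line):
            # A matching line closes the current block (assumption: it is not kept).
            if buffer:
                yield "".join(buffer).rstrip("\n")
            buffer = []
        else:
            buffer.append(line)
    if buffer:
        yield "".join(buffer).rstrip("\n")


# Invented sample data, used only to exercise the sketch.
SAMPLE = """ID-123e4567-e89b-12d3-a456-426614174000
alpha
beta
ID-123e4567-e89b-12d3-a456-426614174001
gamma
"""

blocks = list(regex_blocks(io.StringIO(SAMPLE), ID_PATTERN))
print(blocks)
```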
Using the REGEX Tokenizer to search all the lines between patterns and grouping every 2 results
file.txt
The following pattern must be searched:
ID-\b[0-9a-f]{8}\b-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-\b[0-9a-f]{12}\b
File Name: file.txt
Tokenizer: REGEX
Token: ID-\b[0-9a-f]{8}\b-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-\b[0-9a-f]{12}\b
Group: 2
The result will be 1 subflow containing the values that match the specified REGEX pattern.
When the REGEX Tokenizer is used with grouping, the matched pattern is shown in the output.
If the specified pattern isn't found in the file, the whole file is returned in a single execution. Be careful when specifying the REGEX.
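The effect of the Group parameter in the example above (2 results delivered as 1 subflow) can be sketched in Python as plain chunking. `grouped` is a hypothetical helper, and how the component actually joins the grouped results into one message is not specified here, so the sketch simply returns lists.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def grouped(blocks: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` blocks."""
    if size < 1:
        raise ValueError("size must be a positive integer")
    iterator = iter(blocks)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Two results grouped by 2 produce a single chunk, that is, one subflow.
one_subflow = list(grouped(["first result", "second result"], 2))
# An odd number of results leaves a smaller final chunk.
uneven = list(grouped(["a", "b", "c"], 2))
print(one_subflow)
print(uneven)
```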