Stream Excel
The Documentation Portal provides guides on all options for streaming on the Digibee iPaaS. This page explains how to use the Stream Excel component.
Stream Excel reads a local Excel file row by row, represents each row as a JSON structure, and triggers subpipelines to process each row. It is recommended for situations in which large files need to be processed.
Parameters
Take a look at the configuration options for the component. Parameters supported by Double Braces expressions are marked with (DB).
Parameter | Description | Default value | Data type |
---|---|---|---|
File Name | Determines the name or full file path (e.g., tmp/processed/file.txt) of the local file to be read. | file.xlsx | String |
Sheet Name | Name of the Excel sheet to be read. | Plan1 | String |
Sheet Index | Excel sheet index to be read. | N/A | Integer |
Use Sheet Index Instead Of Name | If activated, the sheet is selected by its index instead of its name. | False | Boolean |
Max Fractional Digits | Determines the precise number of fractional digits in a numeric cell when the Excel file is read. | 5 | Integer |
Read Specific Columns As String | Indicates which columns the component must read as strings instead of in their original format. | B,D,F | String |
Read All Columns As String | If selected, all columns are read as strings. | False | Boolean |
Column Identifier | In case of errors, this is the column that will be sent to the onException subpipeline. | A | String |
Parallel Execution Of Each Iteration | If selected, all lines of the file are read in parallel. | False | Boolean |
Fail On Error | When activated, this parameter suspends the pipeline execution only if a severe failure occurs in the iteration structure and prevents it from completing. Activating Fail On Error has no effect on errors that occur in the components used to build the subpipelines (onProcess and onException). | False | Boolean |
Advanced | When selected, the option requires the definition of advanced parameters. | False | Boolean |
Skip | Number of lines to be skipped before the file is read. | N/A | Integer |
Limit | Maximum number of lines to be read. | N/A | Integer |
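
The interaction between Skip and Limit can be pictured with a minimal Python sketch. It is not the component's implementation: the function name `apply_skip_and_limit`, the representation of lines as dictionaries, and the assumption that Skip is applied before Limit are all illustrative.

```python
from itertools import islice
from typing import Iterable, Iterator, Optional


def apply_skip_and_limit(
    rows: Iterable[dict],
    skip: Optional[int] = None,
    limit: Optional[int] = None,
) -> Iterator[dict]:
    """Return an iterator over the rows selected by Skip and Limit.

    Assumption: Skip is applied first, then Limit, and N/A (None)
    means "skip nothing" or "no maximum".
    """
    start = skip or 0
    stop = None if limit is None else start + limit
    return islice(rows, start, stop)


# Hypothetical sheet with 10 lines, each represented as a dictionary.
rows = [{"line": n} for n in range(1, 11)]
selected = list(apply_skip_and_limit(rows, skip=2, limit=3))
print(selected)
```

Under these assumptions, Skip 2 and Limit 3 would select the third, fourth, and fifth lines of the sheet.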
Stream Excel performs batch processing. To better understand the concept, read the Batch processing documentation.
Stream Excel can only read files in .xlsx format. Files in .xls format are not supported.
Messages flow
Input
The component accepts any input message, which can be referenced through Double Braces expressions.
Output
The component returns a JSON object with the total number of executions, the number of successful executions, and the number of executions with errors.
total: total number of processed lines.
success: total number of successfully processed lines.
failed: total number of lines whose processing failed.
For a line to be counted as correctly processed, the onProcess subpipeline must return { "success": true } for that line.
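The counting rule above can be sketched in Python. This is only an illustration under stated assumptions, not the component's actual code: `stream_rows`, `on_process`, and `on_exception` are hypothetical names, rows are assumed to be dictionaries keyed by column letter, and a line that does not return `{"success": True}` is assumed to be handed to the exception handler in the same way as a line that raises an error.

```python
from typing import Callable, Iterable


def stream_rows(
    rows: Iterable[dict],
    on_process: Callable[[dict], dict],
    on_exception: Callable[[object, Exception], None],
    column_identifier: str = "A",
) -> dict:
    """Process rows one by one and tally total, success, and failed."""
    summary = {"total": 0, "success": 0, "failed": 0}
    for row in rows:
        summary["total"] += 1
        try:
            result = on_process(row)
            # Assumption: anything other than {"success": True} is a failure.
            if not (isinstance(result, dict) and result.get("success") is True):
                raise ValueError("line did not return success: true")
            summary["success"] += 1
        except Exception as error:
            summary["failed"] += 1
            # Only the Column Identifier value is passed to the handler.
            on_exception(row.get(column_identifier), error)
    return summary


def on_process(row: dict) -> dict:
    # Hypothetical business rule: column B must be filled in.
    if row["B"] is None:
        raise ValueError("missing value in column B")
    return {"success": True}


failed_keys = []
rows = [{"A": 1, "B": "Ann"}, {"A": 2, "B": None}, {"A": 3, "B": "Bob"}]
summary = stream_rows(rows, on_process, lambda key, err: failed_keys.append(key))
print(summary, failed_keys)
```

In this sketch the second line fails, so the summary reports three lines in total, two successes, and one failure, and the handler receives the value of column A for the failed line.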
The component throws an exception if the file doesn't exist or can't be read. Otherwise, a message containing the exception that occurred is produced at the output.
You may also encounter an error if you upload an .xlsx file to Google Drive and then, in a pipeline, use the Google Drive component to download it and a Stream Excel component to read it.
When you do this, an unexpected behavior of Google Sheets modifies the .xlsx file. As a result, Stream Excel reads every row in the sheet (including blank rows) instead of reading only the rows with content. This behavior is not related to the Digibee Integration Platform.
As a workaround, you can copy the contents of the sheet and paste them into a new sheet tab in the same .xlsx file. Don't copy blank rows when you do this, or the same error will occur.
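To make the idea of a "blank row" concrete, here is a small Python sketch of how such rows could be detected and dropped when preparing the data. It is an assumption-laden illustration, not part of the platform: `is_blank_row` and `drop_blank_rows` are hypothetical helpers, and rows are assumed to be lists of cell values in which empty cells appear as `None` or as empty strings.

```python
from typing import Any, List, Sequence


def is_blank_row(cells: Sequence[Any]) -> bool:
    """Return True when every cell is None or contains only whitespace."""
    return all(cell is None or str(cell).strip() == "" for cell in cells)


def drop_blank_rows(rows: List[Sequence[Any]]) -> List[Sequence[Any]]:
    """Keep only the rows that have at least one cell with content."""
    return [row for row in rows if not is_blank_row(row)]


# Hypothetical sheet content with two blank rows mixed in.
sheet = [
    ["Name", "Country"],
    ["Ann", "BR"],
    [None, None],
    ["", "  "],
    ["Bob", "US"],
]
print(drop_blank_rows(sheet))
```

Note that a cell containing the number 0 is not treated as blank here, because only `None` and whitespace-only text count as empty.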
File manipulation inside a pipeline is protected. Files can be accessed only through a temporary directory, and each pipeline key gives access to its own set of files.
Stream Excel in action
See below how the component behaves in certain situations and how it must be configured in each case.
Read the Excel file and analyze the results
For this example, let's assume that we already have an Excel file in the pipeline flow that was downloaded through components such as Google Drive, OneDrive, or a similar one. This file is a sheet with the names of the 100 billionaires selected by Forbes.
The Stream Excel component is configured as follows:
File Name: file.xlsx
Sheet Name: Plan1
Use Sheet Index Instead of Name: deactivated
Max Fractional Digits: 5
Read Specific Columns As String: B,D,F
Read All Columns As String: deactivated
Column Identifier: A
Parallel Execution of Each Iteration: deactivated
Fail On Error: deactivated
Advanced: deactivated
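As a rough illustration of what the Read Specific Columns As String, Read All Columns As String, and Max Fractional Digits settings above imply for a single line, consider the following Python sketch. It rests on assumptions that the documentation does not confirm: each line is modeled as a dictionary keyed by column letter, fractional digits are assumed to be rounded rather than truncated, and `parse_columns` and `convert_row` are hypothetical helper names.

```python
from typing import Any


def parse_columns(spec: str) -> set[str]:
    """Turn a value such as "B,D,F" into the set {"B", "D", "F"}."""
    return {part.strip().upper() for part in spec.split(",") if part.strip()}


def convert_row(
    row: dict[str, Any],
    as_string: str = "B,D,F",
    read_all_as_string: bool = False,
    max_fractional_digits: int = 5,
) -> dict[str, Any]:
    """Apply the string and fractional-digit settings to one line."""
    string_columns = parse_columns(as_string)
    converted: dict[str, Any] = {}
    for column, value in row.items():
        if value is None:
            converted[column] = None  # empty cells stay empty
        elif read_all_as_string or column in string_columns:
            converted[column] = str(value)
        elif isinstance(value, float):
            # Assumption: extra fractional digits are rounded away.
            converted[column] = round(value, max_fractional_digits)
        else:
            converted[column] = value
    return converted


# Hypothetical line: column B holds a numeric code that should stay text.
row = {"A": 1, "B": 1020, "C": 3.14159265, "D": "x"}
converted = convert_row(row)
print(converted)
```

With the configuration listed above, column B would be delivered as text even though the cell holds a number, while column C would keep at most five fractional digits.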
Input
Output
Log results
To see this log, we use the Messages tab in the pipeline. In the figure below, all the sheet lines have been read individually by the component, including the names of the columns.
Read the Excel file and analyze a sheet that does not exist in the file
For this example, consider the same sheet as in the example above. However, we will select a sheet that does not exist.
The Stream Excel component will give the following error message (Fail On Error is deactivated):
Read an invalid Excel file
In this example, let's consider a non-existent file in the pipeline flow.
The Stream Excel component will return the following error message (Fail On Error is deactivated):