Parquet File Reader
Learn more about the Parquet File Reader connector and how to use it in the Digibee Integration Platform.
Parquet File Reader is a Pipeline Engine v2 exclusive connector.
The Parquet File Reader connector allows you to read Parquet files.
Parquet is a columnar file format designed for efficient data storage and retrieval. Further information can be found on the official Apache Parquet website.
Parameters
Take a look at the configuration parameters of the connector. Parameters supported by Double Braces expressions are marked with (DB).
General tab
Parameter | Description | Default value | Data type |
---|---|---|---|
File Name | The file name of the Parquet file to be read. | {{ message.fileName }} | String |
Check File Size | If the option is active, the file size is checked against the specified Maximum File Size. If the file is larger, an error is displayed. | False | Boolean |
Maximum File Size | Specifies the maximum size allowed (in bytes) of the file to be read. | N/A | Integer |
Fail On Error | If the option is active, the execution of the pipeline with an error will be interrupted. Otherwise, the pipeline execution proceeds, but the result will show a false value for the | False | Boolean |
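The size check described by Check File Size and Maximum File Size can be pictured with a minimal Python sketch. This is an illustration only, not the connector's implementation: the helper `size_check`, its result keys, and the temporary file are hypothetical names introduced here.

```python
import os
import tempfile


def size_check(path, check_file_size=False, maximum_file_size=None):
    """Compare a file's size in bytes with an optional limit.

    Hypothetical helper that mirrors the documented parameters; it does
    not reproduce the connector's real behavior or output format.
    """
    size = os.path.getsize(path)  # bytes, the same unit as Maximum File Size
    if check_file_size and maximum_file_size is not None and size > maximum_file_size:
        return {"ok": False, "size": size, "error": "file exceeds maximum size"}
    return {"ok": True, "size": size}


# A 10-byte temporary file stands in for file.parquet.
with tempfile.NamedTemporaryFile(delete=False, suffix=".parquet") as tmp:
    tmp.write(b"0123456789")
    sample_path = tmp.name

unchecked = size_check(sample_path)
too_large = size_check(sample_path, check_file_size=True, maximum_file_size=5)
within_limit = size_check(sample_path, check_file_size=True, maximum_file_size=5000000)
os.remove(sample_path)
```

With the check deactivated the size is never compared, so the limit only matters when both parameters are set.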
Documentation tab
Parameter | Description | Default value | Data type |
---|---|---|---|
Documentation | Section for documenting any necessary information about the connector configuration and business rules. | N/A | String |
Note that reading a compressed Parquet file generates JSON content that is larger than the file itself. Therefore, it’s important to check whether the pipeline has enough memory to handle the data, as the content is stored in the pipeline's memory.
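To get a feel for why the on-disk size understates the memory needed, the sketch below compares a JSON payload with a compressed copy of the same data. It is a rough analogy under stated assumptions: gzip stands in for Parquet's compression, and the sample rows are hypothetical, so the ratio it produces says nothing about any real Parquet file.

```python
import gzip
import json

# Hypothetical rows standing in for the records stored in a Parquet file.
rows = [{"id": i, "status": "active", "country": "BR"} for i in range(1000)]

# Size of the data once expanded to JSON, as it would sit in memory.
json_bytes = json.dumps(rows).encode("utf-8")

# gzip is only a stand-in for the compression applied to the file on disk.
compressed_bytes = gzip.compress(json_bytes)

expansion_ratio = len(json_bytes) / len(compressed_bytes)
print(f"JSON: {len(json_bytes)} bytes, compressed: {len(compressed_bytes)} bytes")
print(f"Expansion ratio: {expansion_ratio:.1f}x")
```

When sizing a pipeline, it is safer to budget for the expanded JSON size than for the size of the file on disk.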
Usage examples
Reading file
Reading a Parquet file without checking the file size:
File Name: file.parquet
Check File Size: deactivated
Output:
Reading file - Checking file size
Reading a Parquet file and checking whether its size exceeds the Maximum File Size:
File Name: file.parquet
Check File Size: activated
Maximum File Size: 5000000
Output: