# Parquet File Reader

The **Parquet File Reader** connector allows you to read Parquet files.

Parquet is a columnar file format designed for efficient data storage and retrieval. Further information can be found [on the official website](https://parquet.apache.org/).

## **Parameters**

Configure the connector using the parameters below. Fields that support [Double Braces expressions](https://docs.digibee.com/documentation/connectors-and-triggers/double-braces/overview) are marked in the **Supports DB** column.

{% tabs fullWidth="true" %}
{% tab title="General" %}

<table data-full-width="true"><thead><tr><th>Parameter</th><th>Description</th><th>Type</th><th>Supports DB</th><th>Default</th></tr></thead><tbody><tr><td><strong>Alias</strong></td><td>Name (alias) for this connector’s output, allowing you to reference it later in the flow using Double Braces expressions.</td><td>String</td><td>✅</td><td><code>parquet-file-reader-1</code></td></tr><tr><td><strong>File Name</strong></td><td>Name of the Parquet file to be read.</td><td>String</td><td>✅</td><td><code>{{ message.fileName }}</code></td></tr><tr><td><strong>Check File Size</strong></td><td>If enabled, the file size is checked against the specified <strong>Maximum File Size</strong>. If the file is larger, an error is raised.</td><td>Boolean</td><td>❌</td><td>False</td></tr><tr><td><strong>Convert Date Fields</strong></td><td>If enabled, <code>DATE</code> and <code>TIMESTAMP</code> fields from the file are converted to strings (e.g. <code>yyyy-MM-dd</code> for <code>DATE</code>, ISO-8601 for <code>TIMESTAMP</code>). If disabled (default), dates remain as numeric values (days or milliseconds since the epoch).</td><td>Boolean</td><td>❌</td><td>False</td></tr><tr><td><strong>Date Field Paths (optional)</strong></td><td>Manually identifies date fields when the schema does not declare the <code>DATE</code> logical type.</td><td>String</td><td>❌</td><td>N/A</td></tr><tr><td><strong>Decode Base64 Fields</strong></td><td>If enabled, the connector recursively scans the output JSON nodes. Any string identified as a valid Base64 sequence is automatically decoded to UTF-8 and replaced in place.</td><td>Boolean</td><td>❌</td><td>False</td></tr><tr><td><strong>Maximum File Size</strong></td><td>Maximum allowed size, in bytes, of the file to be read.</td><td>Integer</td><td>❌</td><td>N/A</td></tr><tr><td><strong>Fail On Error</strong></td><td>If enabled, the pipeline execution is interrupted when an error occurs. Otherwise, the pipeline execution proceeds, but the result shows <code>false</code> for the <code>"success"</code> property.</td><td>Boolean</td><td>❌</td><td>False</td></tr></tbody></table>
{% endtab %}

{% tab title="Documentation" %}

<table data-full-width="true"><thead><tr><th>Parameter</th><th>Description</th><th>Default value</th><th>Data type</th></tr></thead><tbody><tr><td><strong>Documentation</strong></td><td>Section for documenting any necessary information about the connector configuration and business rules.</td><td>N/A</td><td>String</td></tr></tbody></table>
{% endtab %}
{% endtabs %}
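
To make the effect of **Convert Date Fields** more concrete, the following Python sketch converts the numeric representations described above (days since the epoch for `DATE`, milliseconds since the epoch for `TIMESTAMP`) into strings. It is an illustration only, not the connector's implementation: the field names `birthDate` and `updatedAt` are hypothetical, and the exact ISO-8601 variant the connector emits (time zone, precision) is an assumption.

```python
from datetime import date, datetime, timedelta, timezone

EPOCH_DATE = date(1970, 1, 1)
EPOCH_DATETIME = datetime(1970, 1, 1, tzinfo=timezone.utc)


def days_to_date_string(days_since_epoch: int) -> str:
    """Render a DATE stored as days since 1970-01-01 as yyyy-MM-dd."""
    return (EPOCH_DATE + timedelta(days=days_since_epoch)).isoformat()


def millis_to_iso_string(millis_since_epoch: int) -> str:
    """Render a TIMESTAMP stored as epoch milliseconds as ISO-8601 (UTC assumed)."""
    moment = EPOCH_DATETIME + timedelta(milliseconds=millis_since_epoch)
    return moment.isoformat(timespec="milliseconds")


# Hypothetical record as it might look with Convert Date Fields disabled.
raw_record = {"birthDate": 19723, "updatedAt": 1704067200000}

# The same record with both fields rendered as strings.
converted_record = {
    "birthDate": days_to_date_string(raw_record["birthDate"]),
    "updatedAt": millis_to_iso_string(raw_record["updatedAt"]),
}
print(converted_record)
# {'birthDate': '2024-01-01', 'updatedAt': '2024-01-01T00:00:00.000+00:00'}
```

When the option is left disabled, a downstream step has to perform an equivalent conversion itself if it needs human-readable dates.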

{% hint style="info" %}
When a compressed Parquet file is read, the resulting JSON content is larger than the file itself. Because this content is stored in the pipeline's memory, make sure the pipeline has enough memory to handle the data.
{% endhint %}
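
The recursive scan described for the **Decode Base64 Fields** parameter can be pictured with the Python sketch below. It is a minimal approximation under stated assumptions: the helper names `try_decode_base64` and `decode_base64_fields` are hypothetical, and the rule used here to decide that a string is "valid Base64" (strict alphabet, length divisible by four, bytes that decode as UTF-8) is a guess at a reasonable heuristic, not the connector's documented logic.

```python
import base64


def try_decode_base64(text: str):
    """Return the decoded UTF-8 text if `text` is strict Base64, else None."""
    if not text or len(text) % 4 != 0:
        return None
    try:
        # validate=True rejects characters outside the Base64 alphabet.
        raw = base64.b64decode(text, validate=True)
        return raw.decode("utf-8")
    except ValueError:
        # Covers binascii.Error, UnicodeDecodeError and non-ASCII input.
        return None


def decode_base64_fields(node):
    """Walk dicts and lists recursively, replacing decodable strings in place."""
    if isinstance(node, dict):
        for key, value in node.items():
            node[key] = decode_base64_fields(value)
        return node
    if isinstance(node, list):
        for index, value in enumerate(node):
            node[index] = decode_base64_fields(value)
        return node
    if isinstance(node, str):
        decoded = try_decode_base64(node)
        return node if decoded is None else decoded
    return node  # numbers, booleans and null are left untouched


# Hypothetical record in which "details" and one tag are Base64-encoded.
record = {
    "name": "Aquiles",
    "details": "U29tZSBkZXRhaWxz",
    "score": 71.3,
    "tags": ["aGVsbG8=", "not base64!"],
}
print(decode_base64_fields(record))
```

Short plain-text values can happen to be valid Base64 as well, so any heuristic of this kind may decode a string that was never encoded. That is worth keeping in mind before enabling the option on free-text columns.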

## **Usage examples**

### **Reading file**

Reading a Parquet file without checking the file size:

* **File Name:** file.parquet
* **Check File Size:** deactivated

**Output:**

```json
{
  "data": [
    {
      "name": "Aquiles",
      "phoneNumbers": [
        "11 99999-9999",
        "11 93333-3333"
      ],
      "active": true,
      "address": "St. Example",
      "score": 71.3,
      "details": "Some details"
    }
  ],
  "fileName": "file.parquet",
  "total": 1
}

```
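
As an illustration of the output structure, the following Python sketch parses the JSON above and extracts the records under `data`. The function name `extract_rows` is hypothetical, and the handling of `"success"` is an assumption based on the **Fail On Error** description: the successful output shown here has no `"success"` key, so the sketch treats a missing key as success and only rejects an explicit `false`.

```python
import json

# Output copied from the example above.
output_text = """
{
  "data": [
    {
      "name": "Aquiles",
      "phoneNumbers": ["11 99999-9999", "11 93333-3333"],
      "active": true,
      "address": "St. Example",
      "score": 71.3,
      "details": "Some details"
    }
  ],
  "fileName": "file.parquet",
  "total": 1
}
"""


def extract_rows(result: dict) -> list:
    """Return the records under "data", refusing results flagged as failed."""
    # Assumption: a failed read with Fail On Error disabled carries
    # "success": false; the shape of that error result is not documented here.
    if result.get("success") is False:
        raise RuntimeError("Parquet File Reader reported a failure")
    return result.get("data", [])


result = json.loads(output_text)
rows = extract_rows(result)
print(result["fileName"], result["total"], [row["name"] for row in rows])
```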

### **Reading file - Checking file size**

Reading a Parquet file while checking whether its size exceeds the **Maximum File Size**:

* **File Name:** file.parquet
* **Check File Size:** activated
* **Maximum File Size:** 5000000

**Output:**

```json
{
  "data": [
    {
      "name": "Aquiles",
      "phoneNumbers": [
        "11 99999-9999",
        "11 93333-3333"
      ],
      "active": true,
      "address": "St. Example",
      "score": 71.3,
      "details": "Some details"
    }
  ],
  "fileName": "file.parquet",
  "total": 1
}

```
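
The size check in this example can be approximated outside the platform with the Python sketch below. It is not the connector's code: the function name `check_file_size` and the error message are hypothetical, and a temporary file stands in for `file.parquet`. The comparison follows the parameter description, so only a file strictly larger than **Maximum File Size** is rejected.

```python
import os
import tempfile


def check_file_size(path: str, maximum_file_size: int) -> int:
    """Return the file size in bytes, or raise if it exceeds the limit."""
    size = os.path.getsize(path)
    if size > maximum_file_size:
        raise ValueError(
            f"{path} is {size} bytes, above the limit of {maximum_file_size} bytes"
        )
    return size


# Demonstration with a small temporary file standing in for file.parquet.
with tempfile.NamedTemporaryFile(suffix=".parquet", delete=False) as handle:
    handle.write(b"x" * 1024)
    sample_path = handle.name

try:
    # Within the 5000000-byte limit used in the example above.
    print(check_file_size(sample_path, 5_000_000))
    try:
        # A deliberately small limit shows the rejection path.
        check_file_size(sample_path, 512)
    except ValueError as error:
        print(f"rejected: {error}")
finally:
    os.remove(sample_path)
```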
