# Avro File Writer

{% hint style="info" %}
**Avro File Writer** is a Pipeline Engine v2 exclusive connector.
{% endhint %}

The **Avro File Writer** connector allows you to write Avro files based on Avro schemas.

Avro is a popular data serialization framework used within the Hadoop Big Data ecosystem, known for its schema evolution support and compactness. For more information, [see the official website](https://avro.apache.org/).

## **Parameters**

Take a look at the configuration parameters of the connector. Parameters that support [Double Braces expressions](https://docs.digibee.com/documentation/connectors-and-triggers/double-braces/overview) are marked with `(DB)`.

### **General tab**

<table data-full-width="true"><thead><tr><th>Parameter</th><th>Description</th><th>Default value</th><th>Data type</th></tr></thead><tbody><tr><td><strong>File Name</strong> <code>(DB)</code></td><td>The name of the Avro file to be written.</td><td>file.avro</td><td>String</td></tr><tr><td><strong>Data</strong> <code>(DB)</code></td><td><p>The data to be written to the Avro file.</p><p>It only accepts JSON objects or arrays of objects.</p></td><td>{{ message.data }}</td><td>String</td></tr><tr><td><strong>Schema</strong> <code>(DB)</code></td><td><p>The Avro Schema to be used for writing the file.</p><p>It only accepts schemas whose root data type is <code>RECORD</code>.</p></td><td>N/A</td><td>String</td></tr><tr><td><strong>Data From File</strong></td><td>If active, the data to be written to the Avro file must come from other Avro files and not from a JSON payload.</td><td>False</td><td>Boolean</td></tr><tr><td><strong>Files</strong></td><td><p>Defines the Avro files containing the data to be written to the final Avro file.</p><p></p><p>This option is available only when <strong>Data From File</strong> is active.</p></td><td>N/A</td><td>Options of Data From File</td></tr><tr><td><strong>File Name</strong> <code>(DB)</code></td><td><p>The name of the Avro file that contains the data.</p><p></p><p>This option is available only when <strong>Data From File</strong> is active.</p></td><td>N/A</td><td>String</td></tr><tr><td><strong>Infer Schema From File</strong></td><td>If the option is active, the Avro Schema will be inferred from the first Avro file defined in the <strong>Files</strong> parameter.</td><td>False</td><td>Boolean</td></tr><tr><td><strong>File Exists Policy</strong></td><td><p>Defines the behavior to follow when a file with the same name (<strong>File Name</strong> parameter) already exists in the current pipeline execution.</p><p>You can select the following options: <strong>Append</strong> (append data to an existing file), 
<strong>Overwrite</strong> (overwrite the existing file), or <strong>Fail</strong> (interrupt the execution with an error if the file already exists).</p></td><td>Append</td><td>String</td></tr><tr><td><strong>Fail On Error</strong></td><td>If the option is active, the pipeline execution is interrupted when an error occurs. Otherwise, the pipeline execution proceeds, but the result will show a false value for the <code>"success"</code> property.</td><td>False</td><td>Boolean</td></tr></tbody></table>
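
The constraints on **Data** and **Schema** can be checked before the payload reaches the connector. The following is a minimal Python sketch of such a check, not the connector's implementation: the function name `check_inputs` is hypothetical, and both inputs are assumed to arrive as JSON strings.

```python
import json

def check_inputs(data_json: str, schema_json: str) -> list:
    """Return the records to write, or raise ValueError if a documented constraint is broken."""
    schema = json.loads(schema_json)
    # The Schema parameter only accepts a record as the root data type.
    if not isinstance(schema, dict) or schema.get("type") != "record":
        raise ValueError("schema root type must be 'record'")

    data = json.loads(data_json)
    # The Data parameter accepts a single JSON object or an array of objects.
    records = [data] if isinstance(data, dict) else data
    if not isinstance(records, list) or not all(isinstance(r, dict) for r in records):
        raise ValueError("data must be a JSON object or an array of objects")
    return records
```

A single object is wrapped in a list so that later steps can treat both accepted shapes the same way.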

### **Advanced tab**

<table data-full-width="true"><thead><tr><th>Parameter</th><th>Description</th><th>Default value</th><th>Data type</th></tr></thead><tbody><tr><td><strong>Compression Codec</strong></td><td><p>The compression codec to be used when compressing the Avro file.</p><p>Options:</p><ul><li><strong>Uncompressed</strong></li><li><strong>DEFLATE</strong></li><li><strong>BZIP2</strong></li></ul></td><td>Uncompressed</td><td>String</td></tr><tr><td><strong>Compression Level</strong></td><td><p>The level of compression to be applied when compressing the Avro file. Options: 1-9.</p><p>This option is only available when <strong>Compression Codec</strong> is set to <strong>DEFLATE</strong>.</p></td><td>1</td><td>Integer</td></tr></tbody></table>

### **Documentation tab**

<table data-full-width="true"><thead><tr><th>Parameter</th><th>Description</th><th>Default value</th><th>Data type</th></tr></thead><tbody><tr><td><strong>Documentation</strong></td><td>Section for documenting any necessary information about the connector configuration and business rules.</td><td>N/A</td><td>String</td></tr></tbody></table>

{% hint style="info" %}
Writing compressed Avro files can perform differently from writing uncompressed ones. Because compression consumes more memory and processing, validate the limits that the pipeline must support before applying it.
{% endhint %}
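
To get a rough feel for the size side of this trade-off, the codecs can be compared offline with Python's standard library. This is only an indicative sketch: the payload is hypothetical and serialized as JSON rather than Avro binary, so the numbers will differ from what the connector produces.

```python
import bz2
import json
import zlib

def raw_deflate(payload: bytes, level: int) -> bytes:
    # wbits=-15 asks zlib for a raw DEFLATE stream (no zlib header or checksum),
    # which is the form the Avro specification describes for its deflate codec.
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    return compressor.compress(payload) + compressor.flush()

# Hypothetical payload: 5,000 similar records, serialized as JSON for simplicity.
records = [{"name": f"user-{i}", "active": i % 2 == 0, "score": i / 10} for i in range(5000)]
payload = json.dumps(records).encode("utf-8")

sizes = {
    "Uncompressed": len(payload),
    "DEFLATE level 1": len(raw_deflate(payload, 1)),
    "DEFLATE level 9": len(raw_deflate(payload, 9)),
    "BZIP2": len(bz2.compress(payload)),
}
for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```

Timing the same calls on a payload that resembles production data gives a first estimate of the extra processing cost before enabling compression in a pipeline.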

## **Usage examples**

### **File from JSON object**

Writing an Avro File based on a JSON object payload:

* **File Name:** file.avro
* **Data:** {{ message.data }}
* **Schema:** {{ message.schema }}
* **File Exists Policy:** Overwrite

**Data example:**

```
{
  "data": {
    "name": "Aquiles",
    "phoneNumbers": [
      "11 99999-9999",
      "11 93333-3333"
    ],
    "active": true,
    "address": "St. Example",
    "score": 71.3,
    "details": "Some details"
  }
}

```

**Schema example:**

```
{
  "schema": {
    "type": "record",
    "name": "Record",
    "fields": [
        {
            "name": "name",
            "type": "string"
        },
        {
            "name": "phoneNumbers",
            "type": {
                "type": "array",
                "items": "string"
            }
        },
        {
            "name": "active",
            "type": "boolean"
        },
        {
            "name": "address",
            "type": "string"
        },
        {
            "name": "score",
            "type": "double"
        },
        {
            "name": "details",
            "type": [
                "string",
                "null"
            ]
        }
    ]
  }
}

```

**Output:**

```
{
  "success": true,
  "fileName": "file.avro"
}

```
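
To see why the data example fits the schema example, the sketch below checks a JSON value against the Avro types used above (`string`, `boolean`, `double`, `array`, `record`, and a union with `null`). It is an illustration only: the function name `matches` is hypothetical, other Avro types are not covered, and the connector's own validation may behave differently.

```python
def matches(value, avro_type) -> bool:
    """Check a JSON value against the subset of Avro types used in the example schema."""
    if isinstance(avro_type, list):  # union: any branch may match
        return any(matches(value, branch) for branch in avro_type)
    if isinstance(avro_type, dict):
        if avro_type["type"] == "array":
            return isinstance(value, list) and all(matches(v, avro_type["items"]) for v in value)
        if avro_type["type"] == "record":
            # Assumption of this sketch: a missing key is treated like null.
            return isinstance(value, dict) and all(
                matches(value.get(f["name"]), f["type"]) for f in avro_type["fields"]
            )
        return matches(value, avro_type["type"])
    if avro_type == "null":
        return value is None
    if avro_type == "string":
        return isinstance(value, str)
    if avro_type == "boolean":
        return isinstance(value, bool)
    if avro_type == "double":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    raise ValueError(f"type not covered by this sketch: {avro_type!r}")

schema = {
    "type": "record",
    "name": "Record",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "phoneNumbers", "type": {"type": "array", "items": "string"}},
        {"name": "active", "type": "boolean"},
        {"name": "address", "type": "string"},
        {"name": "score", "type": "double"},
        {"name": "details", "type": ["string", "null"]},
    ],
}
record = {
    "name": "Aquiles",
    "phoneNumbers": ["11 99999-9999", "11 93333-3333"],
    "active": True,
    "address": "St. Example",
    "score": 71.3,
    "details": "Some details",
}
print(matches(record, schema))  # prints True
```

Because `details` is declared as a union of `string` and `null`, it is the only field in this schema that may hold a null value.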

### **File from JSON array of objects**

Writing an Avro File based on a JSON array of objects payload:

* **File Name:** file.avro
* **Data:** {{ message.data }}
* **Schema:** {{ message.schema }}
* **File Exists Policy:** Overwrite

**Data example:**

```
{
  "data": [ 
    {
      "name": "Aquiles",
      "phoneNumbers": [
        "11 99999-9999",
        "11 93333-3333"
      ],
      "active": true,
      "address": "St. Example",
      "score": 71.3,
      "details": "Some details"
    },
    {
      "name": "Vitor",
      "phoneNumbers": [
        "11 97777-7777"
      ],
      "active": false,
      "address": "St. Example 2",
      "score": 80.0,
      "details": null
    }
  ]
}

```

**Schema example:**

```
{
  "schema": {
    "type": "record",
    "name": "Record",
    "fields": [
        {
            "name": "name",
            "type": "string"
        },
        {
            "name": "phoneNumbers",
            "type": {
                "type": "array",
                "items": "string"
            }
        },
        {
            "name": "active",
            "type": "boolean"
        },
        {
            "name": "address",
            "type": "string"
        },
        {
            "name": "score",
            "type": "double"
        },
        {
            "name": "details",
            "type": [
                "string",
                "null"
            ]
        }
    ]
  }
}

```

**Output:**

```
{
  "success": true,
  "fileName": "file.avro"
}

```

### **Uncompressed Avro file**

Writing an uncompressed Avro File:

* **File Name:** file.avro
* **Data:** {{ message.data }}
* **Schema:** {{ message.schema }}
* **File Exists Policy:** Overwrite
* **Compression Codec:** Uncompressed

**Output:**

```
{
  "success": true,
  "fileName": "file.avro"
}

```

### **Compressed Avro file**

Writing a compressed Avro File:

* **File Name:** file.avro
* **Data:** {{ message.data }}
* **Schema:** {{ message.schema }}
* **File Exists Policy:** Overwrite
* **Compression Codec:** BZIP2

**Output:**

```
{
  "success": true,
  "fileName": "file.avro"
}

```

### **File Exists Policy as Fail**

Writing an Avro File with the same name as an existing file in the pipeline file directory:

* **File Name:** file.avro
* **Data:** {{ message.data }}
* **Schema:** {{ message.schema }}
* **File Exists Policy:** Fail

**Output:**

{% code overflow="wrap" %}

```
{
  "success": false,
  "message": "Something went wrong while trying to execute the Avro Writer connector",
  "error": "com.digibee.pipelineengine.exception.PipelineEngineRuntimeException: Avro file file.avro already exists."
}

```

{% endcode %}
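
The three **File Exists Policy** options can be modeled as follows. This is a simplified sketch under stated assumptions, not the connector's code: the function `write_avro` is hypothetical, a plain dictionary stands in for the pipeline file directory, lists of records stand in for Avro files, and **Fail On Error** is assumed to be inactive, so a failure is reported in the result instead of raised.

```python
def write_avro(files: dict, file_name: str, records: list, policy: str = "Append") -> dict:
    """Model the File Exists Policy options; `files` stands in for the pipeline file directory."""
    if policy not in ("Append", "Overwrite", "Fail"):
        raise ValueError(f"unknown policy: {policy}")
    exists = file_name in files
    if exists and policy == "Fail":
        # Reported in the result rather than raised, as when Fail On Error is inactive.
        return {"success": False, "error": f"Avro file {file_name} already exists."}
    if exists and policy == "Append":
        files[file_name] = files[file_name] + list(records)
    else:  # Overwrite, or the file does not exist yet
        files[file_name] = list(records)
    return {"success": True, "fileName": file_name}

files = {}
write_avro(files, "file.avro", [{"name": "Aquiles"}], "Overwrite")
write_avro(files, "file.avro", [{"name": "Vitor"}], "Append")
result = write_avro(files, "file.avro", [{"name": "Ana"}], "Fail")
print(len(files["file.avro"]), result["success"])  # prints "2 False"
```

The third call leaves the existing file untouched, which matches the output shown above: the write is rejected and `success` is false.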

### **Writing file from another Avro file - Explicit schema**

Writing an Avro File whose data comes from other Avro files instead of a JSON payload, using an explicitly configured schema:

* **File Name:** file.avro
* **Data From File:** activated
* **Files:**
  * **File Name:** {{ message.existingAvroFile }}
* **Schema:** {{ message.schema }}
* **File Exists Policy:** Overwrite

**Output:**

```
{
  "success": true,
  "fileName": "file.avro"
}

```

### **Writing file from another Avro file - Infer Schema**

Writing an Avro File whose data comes from other Avro files instead of a JSON payload, with the schema inferred from the file:

* **File Name:** file.avro
* **Data From File:** activated
* **Files:**
  * **File Name:** {{ message.existingAvroFile }}
* **Infer Schema From File:** activated
* **File Exists Policy:** Overwrite

**Output:**

```
{
  "success": true,
  "fileName": "file.avro"
}

```
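
Schema inference from a file is possible because an Avro object container file stores its writer schema in the file header, under the `avro.schema` metadata key. The sketch below reads that header with Python's standard library. It illustrates the file format only: how the connector performs the inference is not documented here, and the function names are hypothetical.

```python
import io
import json

def read_long(stream) -> int:
    """Decode one Avro long: a little-endian base-128 varint holding a zigzag-encoded value."""
    shift, raw = 0, 0
    while True:
        byte = stream.read(1)[0]
        raw |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    return (raw >> 1) ^ -(raw & 1)

def read_header_schema(stream) -> dict:
    """Return the writer schema stored in the header of an Avro object container file."""
    if stream.read(4) != b"Obj\x01":
        raise ValueError("not an Avro object container file")
    metadata = {}
    while True:
        count = read_long(stream)
        if count == 0:  # a zero count ends the metadata map
            break
        if count < 0:  # negative count: the block's byte size follows and can be skipped
            count = -count
            read_long(stream)
        for _ in range(count):
            key = stream.read(read_long(stream)).decode("utf-8")
            metadata[key] = stream.read(read_long(stream))
    return json.loads(metadata["avro.schema"])

# Hand-built header for demonstration: magic bytes, one metadata entry,
# end of map, and a 16-byte sync marker placeholder.
schema_bytes = b'{"type": "record", "name": "Record", "fields": []}'
header = (
    b"Obj\x01"
    + bytes([2])  # block count 1, zigzag-encoded
    + bytes([2 * len(b"avro.schema")]) + b"avro.schema"
    + bytes([2 * len(schema_bytes)]) + schema_bytes
    + bytes([0])  # end of the metadata map
    + bytes(16)
)
print(read_header_schema(io.BytesIO(header))["name"])  # prints "Record"
```

The same function accepts any binary stream, so it can be pointed at a real Avro file opened in `"rb"` mode to inspect the schema that would be inferred.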
