Avro File Writer
Learn more about the Avro File Writer connector and how to use it in the Digibee Integration Platform.
Avro File Writer is a Pipeline Engine v2 exclusive connector.
The Avro File Writer connector allows you to write Avro files based on Avro schemas.
Avro is a popular data serialization framework used within the Hadoop Big Data ecosystem, known for its schema evolution support and compactness. For more information, see the official Apache Avro website.
Parameters
Take a look at the configuration parameters of the connector. Parameters that support Double Braces expressions are marked with (DB).
General tab
File Name (DB)
Description: The file name of the Avro file to be written.
Default value: file.avro
Data type: String
Data (DB)
Description: The data to be written to the Avro file. It only accepts JSON objects or arrays of objects.
Default value: {{ message.data }}
Data type: String
Schema (DB)
Description: The Avro Schema to be used for writing the file. It only accepts schemas with RECORD type as the root data type.
Default value: N/A
Data type: String
Data From File
Description: If active, the data to be written to the Avro file must come from other Avro files and not from a JSON payload.
Default value: False
Data type: Boolean
Files
Description: Defines the Avro files containing the data to be written to the final Avro file. This option is available only when Data From File is active.
Default value: N/A
Data type: Options of Data From File
File Name (DB)
Description: The name of the Avro file that contains the data. This option is available only when Data From File is active.
Default value: N/A
Data type: String
Infer Schema From File
Description: If active, the Avro Schema is inferred from the first Avro file defined in the Files parameter.
Default value: False
Data type: Boolean
File Exists Policy
Description: Defines the behavior to follow when a file with the same name (File Name parameter) already exists in the current pipeline execution. You can select one of the following options: Append (append data to the existing file), Overwrite (overwrite the existing file), or Fail (interrupt the execution with an error if the file already exists).
Default value: Append
Data type: String
Fail On Error
Description: If active, the pipeline execution is interrupted when an error occurs. Otherwise, the pipeline execution proceeds, but the result shows a false value for the "success" property.
Default value: False
Data type: Boolean
Advanced tab
Compression Codec
Description: The compression codec to be used when compressing the Avro file. Options: Uncompressed, DEFLATE, and BZIP2.
Default value: Uncompressed
Data type: String
Compression Level
Description: The level of compression to be applied when compressing the Avro file. Options: 1-9. This option is available only when Compression Codec is set to DEFLATE.
Default value: 1
Data type: Integer
Documentation tab
Documentation
Description: Section for documenting any necessary information about the connector configuration and business rules.
Default value: N/A
Data type: String
Note that performance can differ between writing compressed and uncompressed Avro files. Because compression consumes more memory and processing power, it’s important to validate the limits that the pipeline should support before applying it.
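To get a rough feel for this trade-off before configuring the connector, you can compare codecs outside the platform. The following is a minimal sketch, assuming Python's standard zlib (DEFLATE) and bz2 modules as stand-ins for the connector's codecs. It does not write Avro files and does not reflect how the connector is implemented; the compare_codecs helper and the sample payload are hypothetical.

```python
import bz2
import json
import time
import zlib


def compare_codecs(payload: bytes, deflate_level: int = 1) -> dict:
    """Return output size and elapsed time per codec for the given payload.

    zlib and bz2 are stand-ins for DEFLATE and BZIP2; this is not the
    connector's implementation.
    """
    if not 1 <= deflate_level <= 9:
        raise ValueError("deflate_level must be between 1 and 9")

    codecs = {
        "Uncompressed": lambda data: data,
        "DEFLATE": lambda data: zlib.compress(data, deflate_level),
        "BZIP2": lambda data: bz2.compress(data),
    }
    results = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        output = compress(payload)
        elapsed = time.perf_counter() - start
        results[name] = {"bytes": len(output), "seconds": elapsed}
    return results


# Hypothetical payload: 5,000 similar records serialized as JSON.
records = [{"id": i, "name": "user", "active": True} for i in range(5000)]
payload = json.dumps(records).encode("utf-8")

for codec, stats in compare_codecs(payload, deflate_level=9).items():
    print(f"{codec}: {stats['bytes']} bytes in {stats['seconds']:.4f}s")
```

Actual sizes and timings depend on the data and on the environment, so treat the numbers only as a relative comparison.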
Usage examples
File from JSON object
Writing an Avro File based on a JSON object payload:
File Name: file.avro
Data: {{ message.data }}
Schema: {{ message.schema }}
File Exists Policy: Overwrite
Data example:
Schema example:
Output:
File from JSON array of objects
Writing an Avro File based on a JSON array of objects payload:
File Name: file.avro
Data: {{ message.data }}
Schema: {{ message.schema }}
File Exists Policy: Overwrite
Data example:
Schema example:
Output:
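Both JSON examples above depend on two constraints from the Parameters section: Data must be a JSON object or an array of objects, and Schema must have RECORD as its root type. The following is a minimal sketch of a pre-check for those two constraints, assuming you validate the payload before it reaches the connector. The check_inputs helper, the Customer schema, and the sample data are hypothetical and are not taken from the connector.

```python
import json


def check_inputs(data, schema) -> list:
    """Return a list of problems; an empty list means both constraints hold."""
    problems = []
    # Constraint 1: the schema root must be a record.
    if not (isinstance(schema, dict) and schema.get("type") == "record"):
        problems.append("Schema root type must be 'record'")
    # Constraint 2: the data must be an object or an array of objects.
    is_object = isinstance(data, dict)
    is_object_array = isinstance(data, list) and all(
        isinstance(item, dict) for item in data
    )
    if not (is_object or is_object_array):
        problems.append("Data must be a JSON object or an array of objects")
    return problems


# Hypothetical schema and payloads, for illustration only.
schema = json.loads("""
{
  "type": "record",
  "name": "Customer",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "name", "type": "string"}
  ]
}
""")
single_object = json.loads('{"id": 1, "name": "Ada"}')
object_array = json.loads('[{"id": 1, "name": "Ada"}, {"id": 2, "name": "Alan"}]')

print(check_inputs(single_object, schema))        # prints []
print(check_inputs(object_array, schema))         # prints []
print(check_inputs([1, 2, 3], {"type": "array", "items": "long"}))
```

The last call reports both problems, because the data is an array of numbers and the schema root is an array rather than a record.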
Uncompressed Avro file
Writing an uncompressed Avro File:
File Name: file.avro
Data: {{ message.data }}
Schema: {{ message.schema }}
File Exists Policy: Overwrite
Compression Codec: Uncompressed
Output:
Compressed Avro file
Writing a compressed Avro File:
File Name: file.avro
Data: {{ message.data }}
Schema: {{ message.schema }}
File Exists Policy: Overwrite
Compression Codec: BZIP2
Output:
File Exists Policy as Fail
Writing an Avro File with the same name as an existing file in the pipeline file directory:
File Name: file.avro
Data: {{ message.data }}
Schema: {{ message.schema }}
File Exists Policy: Fail
Output:
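The interaction between File Exists Policy and Fail On Error can be sketched as follows. This is a simulation on a local temporary directory, assuming plain bytes in place of Avro content (appending to a real Avro file involves more than concatenating bytes). The write_with_policy helper and the "error" key in the result are hypothetical; only the "success" property comes from the parameter description.

```python
import os
import tempfile


def write_with_policy(path: str, payload: bytes, policy: str = "Append",
                      fail_on_error: bool = False) -> dict:
    """Simulate the Append, Overwrite, and Fail policies on a local file."""
    try:
        if policy not in ("Append", "Overwrite", "Fail"):
            raise ValueError(f"unknown policy: {policy}")
        if policy == "Fail" and os.path.exists(path):
            raise FileExistsError(f"{path} already exists")
        # Append adds to the existing file; the other policies start fresh.
        mode = "ab" if policy == "Append" else "wb"
        with open(path, mode) as handle:
            handle.write(payload)
        return {"success": True}
    except (ValueError, OSError) as error:
        if fail_on_error:
            raise  # Fail On Error active: the execution is interrupted
        # Fail On Error inactive: execution proceeds with success = false
        return {"success": False, "error": str(error)}


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "file.avro")
    print(write_with_policy(target, b"first", "Overwrite"))
    print(write_with_policy(target, b"second", "Append"))
    print(write_with_policy(target, b"third", "Fail"))
```

In this sketch the third call finds the file already in place, so it returns a result with "success" set to false instead of raising, because fail_on_error is left inactive.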
Writing file from another Avro file - Explicit schema
Writing an Avro File whose data comes from other Avro files instead of a JSON payload, using an explicitly configured schema:
File Name: file.avro
Data From File: activated
Files:
File Name: {{ message.existingAvroFile }}
Schema: {{ message.schema }}
File Exists Policy: Overwrite
Output:
Writing file from another Avro file - Infer Schema
Writing an Avro File whose data comes from other Avro files instead of a JSON payload, with the schema inferred from the file:
File Name: file.avro
Data From File: activated
Files:
File Name: {{ message.existingAvroFile }}
Infer Schema From File: activated
File Exists Policy: Overwrite
Output:
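Inferring a schema from a file is possible because, according to the Apache Avro specification, an Avro object container file stores its writer schema in the file header under the avro.schema metadata key. The following is a minimal sketch of reading that header with the Python standard library only. It illustrates the file format and is not the connector's inference logic; the function names and the Customer schema are hypothetical, and the header is built in memory purely for the demonstration.

```python
import io
import json


def read_long(stream) -> int:
    """Decode one Avro long (zigzag plus variable-length encoding)."""
    shift, accum = 0, 0
    while True:
        raw = stream.read(1)
        if not raw:
            raise EOFError("unexpected end of Avro header")
        accum |= (raw[0] & 0x7F) << shift
        if not raw[0] & 0x80:
            break
        shift += 7
    return (accum >> 1) ^ -(accum & 1)


def write_long(value: int) -> bytes:
    """Encode one Avro long; used here only to build the demo header."""
    zigzag = (value << 1) ^ (value >> 63)
    out = bytearray()
    while zigzag & ~0x7F:
        out.append((zigzag & 0x7F) | 0x80)
        zigzag >>= 7
    out.append(zigzag)
    return bytes(out)


def read_header_metadata(stream) -> dict:
    """Return the metadata map stored in an Avro object container header."""
    if stream.read(4) != b"Obj\x01":
        raise ValueError("not an Avro object container file")
    metadata = {}
    while True:
        count = read_long(stream)
        if count == 0:
            break  # a zero count ends the map
        if count < 0:
            read_long(stream)  # block size in bytes, not needed here
            count = -count
        for _ in range(count):
            key = stream.read(read_long(stream)).decode("utf-8")
            metadata[key] = stream.read(read_long(stream))
    return metadata


def infer_schema(stream) -> dict:
    """Parse the writer schema embedded under the 'avro.schema' key."""
    return json.loads(read_header_metadata(stream)["avro.schema"])


# Build a hypothetical header in memory: magic bytes, metadata map, sync marker.
schema = {"type": "record", "name": "Customer",
          "fields": [{"name": "id", "type": "long"}]}
entries = {"avro.schema": json.dumps(schema).encode("utf-8"),
           "avro.codec": b"null"}
header = b"Obj\x01" + write_long(len(entries))
for key, value in entries.items():
    key_bytes = key.encode("utf-8")
    header += write_long(len(key_bytes)) + key_bytes
    header += write_long(len(value)) + value
header += write_long(0) + bytes(16)  # end of map, then a dummy sync marker

print(infer_schema(io.BytesIO(header))["name"])  # prints Customer
```

To try the same functions on a real file, open it in binary mode and pass the file object to infer_schema.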