Stream DB V3

Discover more about the Stream DB V3 component and how to use it on the Digibee Integration Platform.

Stream DB V3 allows you to establish a connection with a service that supports the JDBC (Java Database Connectivity) protocol and to execute SQL (Structured Query Language) instructions. To see all the databases supported by this component, read the Supported databases documentation.

Unlike the DB V1 component, Stream DB V3 was designed to execute in batches: each return (resulting line or row) of the executed SQL instruction is individually handled through a subpipeline and can have its own processing flow. Read the Subpipelines article to learn more.
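For example, if the SQL Statement parameter held the query SELECT id, name FROM customers (a hypothetical table used only for illustration), each returned row would trigger the onProcess subpipeline with an individual message:

// Message received by the subpipeline for the first row
{
  "id": 1,
  "name": "John"
}

// Message received for the second row
{
  "id": 2,
  "name": "Mary"
}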

Parameters

Take a look at the configuration parameters of the component. Parameters supported by Double Braces expressions are marked with (DB).

When an Apache Hive database is used, the Updatecount data may be unavailable due to a system behavior. This information is available only if updated row control is enabled on the Apache Hive server. To learn more about Apache Hive support on the Digibee Integration Platform, refer to Supported databases.

Parameters additional information

Column Name

See the following example of a Column Name error message:

{
  "timestamp": 1600797662733,
  "error": "Error message",
  "code": 500,
  "processedId": "2"
}

Blob As File

If Blob As File is active, the blob-type field is stored as follows:

// "Blob As File" true
{
  "id": 12,
  "blob": "d3X8YK.file",
}

Otherwise, the blob-type field is stored as follows:

// "Blob As File" false
{
  "id": 12,
  "blob": "iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAIAAACQkWg2AAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAADsMAAA7DAcdvqGQAAAAeSURBVDhPY1Da6EMSYiBJNVDxqAZiQmw0lAZHKAEAaskfEED3lr0AAAAASUVORK5CYII="
}

Clob As File

If Clob As File is active, the clob-type field is stored as follows:

// "Clob As File" true
{
  "id": 15,
  "clob": "f7X9AS.file",
}

Otherwise, the clob-type field is stored as follows:

// "Clob As File" false
{
  "id": 15,
  "clob": "AAAAABBBBBCCCC”
}

Raw SQL Statement

To bring more flexibility when using Stream DB V3, you can activate the Raw SQL Statement option, configure a query beforehand, and reference it through Double Braces in the SQL Statement parameter, as follows:

  • Query previously defined via Template Transformer

  • Raw SQL Statement activation

  • Query referenced in the SQL Statement parameter
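As an illustration of these steps (the query, table, and field names below are assumptions, not taken from the original documentation), the Template Transformer step produces the query text, and the SQL Statement parameter then contains only a Double Braces reference to it:

// Query defined in a previous Template Transformer step (hypothetical table and column)
SELECT id, name FROM customers WHERE status = {{ message.status }}

// SQL Statement parameter of Stream DB V3, referencing the generated query (hypothetical field name)
{{ message.sqlQuery }}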

Important: as a good practice, we strongly recommend that, when the Raw SQL Statement option is activated, queries be defined beforehand using the Template Transformer component. Template Transformer allows you to validate parameters through FreeMarker technology and to declare parameters through Double Braces. These parameters are not resolved by Template Transformer but by Stream DB V3, which by default configures and validates the parameters of the SQL statement beforehand (PreparedStatement). By applying these security measures, you reduce the risk of attacks such as SQL Injection.

In the image below, we have an example of the recommended use of the component on the left (with Double Braces in the WHERE clause, highlighted in green) and an example of non-recommended use on the right (with FreeMarker in the WHERE clause, highlighted in red), which may pose risks to pipeline security:
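A sketch of both approaches (table and parameter names are illustrative):

// Recommended: Double Braces parameter in the WHERE clause, configured by Stream DB V3 as a PreparedStatement parameter
SELECT * FROM customers WHERE id = {{ message.id }}

// Not recommended: FreeMarker interpolation in the WHERE clause, resolved as plain text before execution (SQL Injection risk)
SELECT * FROM customers WHERE id = ${id}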

Technology

Authentication via Kerberos

To authenticate to a database via Kerberos, it's necessary to:

  • inform a KERBEROS-type account;

  • set a Kerberos principal;

  • set a keytab (which must be the base64 of the generated keytab file).
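A minimal sketch of what such an account could contain (the field names and values below are illustrative assumptions, not the Platform's actual account schema):

// Hypothetical KERBEROS account contents
{
  "principal": "svc-digibee@EXAMPLE.COM",
  "keytab": "BQIAAABNAAIAC0tFUkJFUk9T..."
}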

Messages flow

Message structure available in the onProcess subpipeline

Once the SQL instruction is executed, the subpipeline is triggered and receives the execution result through a message with the following structure:

{
    "column1": "data1",
    "column2": "data2",
    "column3": "data3"
}

Output with error

{
    "code": error_code,
    "error": "error message",
    "processedId": the_id_column_value
}

Output

After the component execution, a message is returned in the following structure:

{
    "total": 0,
    "success": 0,
    "failed": 0
}
  • total: total number of processed lines.

  • success: total number of successfully processed lines.

  • failed: total number of lines whose processing failed.

Important: to detect whether a line has been correctly processed, each onProcess subpipeline must respond with { "success": true } for each processed element.
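For example, if the SQL instruction returns 10 rows and 9 of the onProcess executions respond with { "success": true }, the component output would be:

{
    "total": 10,
    "success": 9,
    "failed": 1
}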

Stream DB V3 performs batch processing. Refer to the Batch processing article to learn more about this concept.

Connection pool

By default, we use a pool based on the configurations of the deployed pipeline. For a Small pipeline, the pool size is 10; for a Medium, 20; and for a Large, 40.

It's also possible to manage the pool size at deployment time. To do so, enable the Pool Size By Actual Consumers parameter in the component. With it enabled, whatever is manually configured on the deployment screen will be used.

In the example in the image above, a SMALL pipeline with 5 consumers was configured. If you want the pool of the database components (DB V2 and Stream DB V3) to use this size, you must enable the Pool Size By Actual Consumers parameter in all the existing components.

Be extra careful when configuring the pool size manually so there's no deadlock in concurrent calls to the same database.

Our pool is shared among the database components that access the same database inside the pipeline. If you need an exclusive pool for a particular component, enable the Exclusive DB Pool parameter.
