Best practices for alerts
Learn best practices for using alerts on the Digibee Integration Platform.
In this document, you will learn some best practices to help you organize, identify, and better understand your alerts on the Digibee Integration Platform.
Name your alerts correctly
The alert list displays information such as the name of the alert, the pipeline, and the status. However, when alerts are received, only the name is displayed. To identify the type of alert more quickly, include the following information in the alert name:
Name of the realm
Name of the pipeline (as one word)
Metric type (only the first letters, for example, “MIQ” for “Messages In Queue”)
Pipeline environment (test or prod)
Pipeline version
Severity of the alert (low, medium, or high)
Example
The alert name should follow this structure:
Structure: realm-pipeline-metric-environment-version-severity
Alert name: digibee-projectxyz-MIQ-prod-v1-medium
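If you create many alerts, a small helper can keep names consistent. The following is a minimal sketch in Python that assembles a name from the fields listed above. The function names `metric_abbreviation` and `build_alert_name` are hypothetical and are not part of the Digibee Integration Platform; the sketch only produces the text you would type into the alert name field.

```python
ENVIRONMENTS = {"test", "prod"}
SEVERITIES = {"low", "medium", "high"}


def metric_abbreviation(metric: str) -> str:
    """Return the first letter of each word, e.g. "Messages In Queue" -> "MIQ"."""
    return "".join(word[0].upper() for word in metric.split())


def build_alert_name(realm: str, pipeline: str, metric: str,
                     environment: str, version: int, severity: str) -> str:
    """Build a name shaped as realm-pipeline-metric-environment-version-severity."""
    if environment not in ENVIRONMENTS:
        raise ValueError(f"environment must be one of {sorted(ENVIRONMENTS)}")
    if severity not in SEVERITIES:
        raise ValueError(f"severity must be one of {sorted(SEVERITIES)}")

    # Write the pipeline name as one lowercase word.
    pipeline_word = "".join(pipeline.split()).lower()

    parts = [
        realm,
        pipeline_word,
        metric_abbreviation(metric),
        environment,
        f"v{version}",
        severity,
    ]
    return "-".join(parts)


name = build_alert_name("digibee", "Project XYZ", "Messages In Queue",
                        "prod", 1, "medium")
print(name)  # digibee-projectxyz-MIQ-prod-v1-medium
```

Rejecting unknown environment and severity values up front keeps typos out of alert names, so the names stay easy to filter and compare.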
Establish realistic thresholds
When setting up alerts, it’s essential to choose realistic thresholds to avoid excessive notifications, which can make it difficult to identify the truly important alerts. A threshold that is too low, for example, can generate constant alerts even when the pipeline is behaving as expected.
Use your data as a baseline: Review recent metric behavior before setting a threshold. For example, if the metric typically ranges between 8 and 12 seconds, setting an alert at 9 seconds will likely trigger frequent notifications without indicating a real problem.
Be aware of natural variations: Pipeline performance often has normal fluctuations. Ensure alerts are triggered only by significant deviations.
Avoid overly strict thresholds: Values too close to the average execution time tend to generate continuous alerts. Choose thresholds that reflect exceptional situations.
Review thresholds regularly: As workload, infrastructure, or integration behavior changes, thresholds may need to be adjusted accordingly.
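One way to apply these recommendations is to derive a candidate threshold from recent measurements and check how often it would have been exceeded. The sketch below is only an illustration: the sample values are made up, the function names are hypothetical, and the "mean plus three standard deviations" rule is a common heuristic chosen for this example, not a method prescribed by the Digibee Integration Platform.

```python
import statistics


def suggest_threshold(samples, k=3.0):
    """Suggest a threshold k standard deviations above the mean of the samples."""
    return statistics.mean(samples) + k * statistics.stdev(samples)


def count_breaches(samples, threshold):
    """Count how many samples are above the given threshold."""
    return sum(1 for value in samples if value > threshold)


# Illustrative execution times in seconds, ranging between 8 and 12.
recent_times = [8, 9, 10, 11, 12, 10, 9, 11, 10, 10]

strict = 9  # a threshold inside the normal range
suggested = suggest_threshold(recent_times)

print(count_breaches(recent_times, strict))     # 7
print(round(suggested, 2))                      # 13.46
print(count_breaches(recent_times, suggested))  # 0
```

With this data, a threshold of 9 seconds would have been exceeded by 7 of the 10 samples, while the suggested value sits above the normal range and is exceeded only by exceptional executions. Rerun the calculation with fresh data whenever you review your thresholds.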
Use alert messages to get more information quickly
A good alert message helps you quickly identify what is happening, where the problem is, and what criteria have been violated without the need for further investigation. Vague messages can delay diagnosis and increase incident response time, requiring access to logs or dashboards to figure out what actually happened.
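As an illustration, the sketch below composes a message that states what is happening, where, and which criterion was violated. The function `format_alert_message` and its parameters are hypothetical; the sketch only shows the kind of text a message could contain and does not represent a platform feature.

```python
def format_alert_message(pipeline, environment, metric, observed, threshold, unit):
    """Compose a message stating what happened, where, and which limit was exceeded."""
    return (
        f"{metric} on pipeline '{pipeline}' ({environment}) is {observed} {unit}, "
        f"above the configured threshold of {threshold} {unit}."
    )


message = format_alert_message(
    pipeline="projectxyz",
    environment="prod",
    metric="Messages In Queue",
    observed=250,
    threshold=100,
    unit="messages",
)
print(message)
# Messages In Queue on pipeline 'projectxyz' (prod) is 250 messages, above the configured threshold of 100 messages.
```

Compare this with a vague message such as "Pipeline alert triggered", which gives no indication of the metric, the pipeline, or the limit involved.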