
Data Quality

Overview

Data Streams can have automated validation checks enabled to monitor the quality of the data being received from the Asset.

These data quality validations run at the edge in real time, and any Application can subscribe to the results.

Note

The results are also saved to the Kelvin Cloud and will become accessible in a future release of Kelvin.

Several types of data quality validation algorithms are available to detect issues and maintain the integrity and reliability of data within the Kelvin Platform.

Data Quality Inputs

Applications can subscribe to specific Data Quality validations for selected Data Streams.

They can then see the validation results in real time and react to any data quality issues. The type of reaction depends on the developer's requirements, for example sending emails or Slack messages.

A number of built-in Data Quality validation options are available.
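As a rough illustration of reacting to results, the sketch below routes flagged results to a notifier chosen by validation name. The `QualityResult` class, its fields, and `route_results` are hypothetical names invented for this example; they are not Kelvin SDK types, and the real result payload may look different.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class QualityResult:
    """Hypothetical shape of one validation result (not a Kelvin SDK type)."""
    asset: str
    data_stream: str
    validation: str  # e.g. "kelvin_out_of_range_detection"
    flagged: bool    # True when the validation reports a problem


def route_results(
    results: Iterable[QualityResult],
    notifiers: Dict[str, Callable[[str], None]],
) -> List[str]:
    """Send a message for every flagged result that has a registered notifier."""
    sent = []
    for result in results:
        notify = notifiers.get(result.validation)
        if result.flagged and notify is not None:
            message = f"{result.validation} flagged {result.asset}/{result.data_stream}"
            notify(message)  # could wrap an email or Slack client
            sent.append(message)
    return sent
```

Each notifier is simply a callable that takes a message string, so an email sender, a Slack webhook wrapper, or a plain list's `append` method (handy in tests) can all be plugged in.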

| Validation | Description | Configurable Parameters |
| --- | --- | --- |
| kelvin_timestamp_anomaly | Detects anomalies or irregularities in the timestamp sequence | None |
| kelvin_duplicate_detection | Detects duplicate values within a defined window size | window_size (default: 5) |
| kelvin_out_of_range_detection | Validates whether values fall within an expected range | min_threshold, max_threshold |
| kelvin_outlier_detection | Uses statistical methods to detect outliers over a moving window | model, threshold (default: 3), window_size (default: 10) |
| kelvin_data_availability | Ensures the expected number of messages is received in a given time window | window_expected_number_msgs, window_time_interval_unit (second, minute, hour, day) |
| kelvin_data_availability_rate | Part of kelvin_data_availability | None |
| kelvin_data_availability_count | Part of kelvin_data_availability | None |
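To give a feel for what these parameters control, here is a minimal Python sketch of four of the checks. It is not Kelvin's implementation: the function names are hypothetical, the outlier check assumes a simple z-score model (the available values of `model` are not listed here), and the duplicate and availability semantics are assumptions based only on the descriptions in the table.

```python
from collections import deque
from statistics import mean, pstdev


def out_of_range(value, min_threshold, max_threshold):
    """True when value lies outside [min_threshold, max_threshold]."""
    return value < min_threshold or value > max_threshold


def find_duplicates(values, window_size=5):
    """Indices of values already seen in the preceding window_size samples."""
    recent = deque(maxlen=window_size)
    flagged = []
    for i, v in enumerate(values):
        if v in recent:
            flagged.append(i)
        recent.append(v)
    return flagged


def find_outliers(values, threshold=3.0, window_size=10):
    """Indices whose z-score against the preceding full window exceeds threshold.

    Assumption: a z-score model; other statistical models are possible.
    """
    window = deque(maxlen=window_size)
    flagged = []
    for i, v in enumerate(values):
        if len(window) == window_size:
            mu = mean(window)
            sigma = pstdev(window)
            # Skip the check when the window is constant (sigma == 0).
            if sigma > 0 and abs(v - mu) / sigma > threshold:
                flagged.append(i)
        window.append(v)
    return flagged


def availability_rate(timestamps, now, expected_msgs, window_seconds):
    """Messages received in the last window_seconds divided by expected_msgs."""
    received = sum(1 for t in timestamps if now - window_seconds < t <= now)
    return received / expected_msgs
```

In this sketch, a larger `window_size` makes the duplicate check look further back and gives the outlier check a more stable baseline, while a lower `threshold` makes the outlier check more sensitive.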

All the validations are also saved in the Cloud and historical data will be accessible through the Kelvin UI in future releases.

Data Quality Outputs

It is also possible to create your own custom validation Applications that will process incoming Data Streams and then produce data quality information that can be used by other Applications.

Other Applications consume the custom validations through the Data Quality input key in the app.yaml, not through a direct connection to the Application that performs the validation calculations.
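As an example of the kind of calculation a custom validation Application might perform, the sketch below flags a "stuck" sensor that keeps reporting the same reading and reduces the flags to a single score. The check itself, the function names, and the `min_run` parameter are hypothetical; how the result is published as a Data Quality output depends on the Kelvin SDK and is not shown.

```python
from typing import List, Sequence


def stuck_value_flags(values: Sequence[float], min_run: int = 5) -> List[bool]:
    """One flag per sample: True when it extends a run of min_run or more identical readings."""
    flags: List[bool] = []
    run = 0
    previous = None
    for v in values:
        # Extend the run on a repeat; restart it on the first sample or a change.
        run = run + 1 if flags and v == previous else 1
        previous = v
        flags.append(run >= min_run)
    return flags


def quality_score(flags: Sequence[bool]) -> float:
    """Fraction of samples that were not flagged (1.0 means no issues found)."""
    if not flags:
        return 1.0
    return 1.0 - sum(flags) / len(flags)
```

For instance, `stuck_value_flags([1, 2, 2, 2, 3], min_run=3)` flags only the fourth sample, the point at which the run of identical readings reaches three.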
