Create a Separate Run for Each Data Quality Pipeline

Created by Alexandru Sirbu, Modified on Wed, 18 Mar at 11:29 AM by Alexandru Sirbu

Problem Overview


A Run is HEDDA.IO's container for execution statistics. When multiple pipelines share a single Run, their results become entangled and hard to interpret. Give each pipeline its own Run so statistics stay clean and traceable.


Solution


In HEDDA.IO, a Run is not an execution itself: it is the named container that holds all the Executions triggered against a Knowledge Base from a particular context. Think of it as a folder: every time your Databricks pipeline or notebook calls HEDDA.IO, the results land in whichever Run that call references.
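The container relationship can be sketched in plain Python. The classes and the score calculation below are illustrative simplifications for this article, not the actual HEDDA.IO data model or API:

```python
from dataclasses import dataclass, field

@dataclass
class Execution:
    """One call from a pipeline against a Knowledge Base (simplified)."""
    rows: int
    invalid_rows: int

@dataclass
class Run:
    """Named container ('folder') collecting executions from one context."""
    name: str
    executions: list = field(default_factory=list)

    def record(self, execution: Execution) -> None:
        self.executions.append(execution)

    def score(self) -> float:
        # Illustrative score: fraction of valid rows across all executions
        total = sum(e.rows for e in self.executions)
        invalid = sum(e.invalid_rows for e in self.executions)
        return 1.0 if total == 0 else 1 - invalid / total

run = Run("Databricks - Customer Master")
run.record(Execution(rows=1000, invalid_rows=50))
run.record(Execution(rows=1200, invalid_rows=30))
print(round(run.score(), 3))  # -> 0.964
```

The point of the sketch: statistics like the score are aggregates over everything inside the Run, which is why mixing unrelated pipelines into one Run corrupts them.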


The Run's statistics panel shows you the score, rows per execution, invalid row counts, and breakdowns by Business Rule, Domain, and Data Quality Dimension. All of this is meaningful only if the executions inside a Run come from the same source. Mix in executions from unrelated datasets or pipelines and the statistics become uninterpretable noise.


Configuration


Creating a Run

  1. Open the Knowledge Base and navigate to the Runs tab.
  2. Click Edit Version in the top-right corner to switch to edit mode.
  3. Click Add Run at the top of the Browsing Panel on the left.
  4. Name the Run after its pipeline and dataset. Good examples: 'Databricks - Customer Master', 'Synapse - Product Catalog Daily', 'Notebook - HR Export Weekly'.
  5. Optionally assign a Default Mapping to the Run. This Mapping will be used automatically when no explicit Mapping is passed by the pipeline at execution time.
  6. Click Save.
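The fallback rule from step 5 can be illustrated in a few lines of Python. `resolve_mapping` is a hypothetical helper written for this article, not part of the HEDDA.IO SDK:

```python
def resolve_mapping(run_default_mapping, explicit_mapping=None):
    """Pick the Mapping for an execution: an explicit Mapping passed by
    the pipeline wins; otherwise fall back to the Run's Default Mapping."""
    if explicit_mapping is not None:
        return explicit_mapping
    if run_default_mapping is not None:
        return run_default_mapping
    raise ValueError(
        "No Mapping available: pass one explicitly or set a Default Mapping on the Run"
    )

# Pipeline passes no Mapping, so the Run's Default Mapping applies
print(resolve_mapping("CustomerMasterMapping"))            # -> CustomerMasterMapping
# An explicit Mapping always overrides the default
print(resolve_mapping("CustomerMasterMapping", "AdHoc"))   # -> AdHoc
```

Setting a Default Mapping keeps pipeline code shorter and ensures ad-hoc executions against the Run still map columns correctly.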


Use the Example button in the Run Details panel to get ready-made Python and .NET code snippets showing exactly how to execute against that specific Run. This is the fastest way to wire up a new notebook or pipeline.


Outcome


After executions have been recorded, the Runs Overview panel provides a score, trend charts, and three breakdown tabs: results per Business Rule, results per Domain, and results per Data Quality Dimension. These are most useful when the Run contains homogeneous executions: all from the same pipeline, the same dataset structure, and the same Mapping. Deviations in the charts then immediately signal that something changed in the source data or the pipeline.


The Runs Info Panel also shows a Tags section. If you have assigned a Tag to this Run, only Rulebooks carrying that same Tag (plus untagged Rulebooks) will be applied during execution. This is the connection point between Runs and the Tag-based Rulebook targeting system.
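The selection rule described above can be sketched as a small filter function. `applicable_rulebooks` is an illustrative helper for this article, not HEDDA.IO code; it only encodes the stated rule that untagged Rulebooks always apply, while tagged Rulebooks apply only when the Run carries the same Tag:

```python
def applicable_rulebooks(rulebooks, run_tag=None):
    """Return the Rulebooks applied during execution for a Run.

    rulebooks: list of (name, tag) pairs; tag is None for untagged Rulebooks.
    run_tag:   the Tag assigned to the Run, or None.
    """
    selected = []
    for name, tag in rulebooks:
        # Untagged Rulebooks always apply; tagged ones need a matching Run tag
        if tag is None or tag == run_tag:
            selected.append(name)
    return selected

books = [
    ("Base checks", None),
    ("Customer rules", "customer"),
    ("Product rules", "product"),
]
print(applicable_rulebooks(books, run_tag="customer"))  # -> ['Base checks', 'Customer rules']
```

Note the asymmetry: tagging a Run never disables untagged Rulebooks, but tagging a Rulebook restricts it to Runs carrying that Tag.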



If you have any further questions, please feel free to Contact Us.

You can also refer to the HEDDA.IO End User Documentation.

