dbt: Enforcing Model Contracts in Incremental dbt Models Using Jinja Macros and Pre-Hooks


✅ Enforcing Model Contracts in Incremental dbt Models Using Jinja Macros and Pre-Hooks

When using incremental models in dbt, enabling model contracts helps ensure that new rows conform to a defined schema. However, model contracts only validate new rows inserted during the run — they do not validate existing rows already in the table.

This limitation can lead to inconsistent tables: old rows may violate the contract while new rows are compliant.


🧠 The Solution: Pre-Hook Validation with a Jinja Macro

The most reliable way to enforce contracts across the full dataset — both old and new data — is to use:

  • A Jinja macro that generates validation SQL
  • A pre_hook in your incremental model to run that validation
  • Optional custom error messages or logging

This approach keeps the logic inside your dbt project and fully portable across environments.


🧱 Example: Validating a Contract with a Jinja Macro

1. ✅ Define a macro in macros/validate_contract.sql:

{% macro validate_existing_clients_contract(model_relation) %}
  SELECT
    CASE
      WHEN EXISTS (
        SELECT 1 FROM {{ model_relation }}
        WHERE client_id IS NULL OR email IS NULL
      )
      THEN RAISE_ERROR('Contract violation: NULL values detected in {{ model_relation }}')
  END;
{% endmacro %}

👉 This macro checks that required fields are not null in the existing table before new data is inserted.


2. ✅ Use the macro in your incremental model:

{{ config(
    materialized='incremental',
    unique_key='client_id',
    contract={'enforced': true},
    pre_hook=[ validate_existing_clients_contract(this) ]
) }}

SELECT
  client_id,
  name,
  email
FROM {{ source('crm', 'clients') }}
{% if is_incremental() %}
WHERE updated_at > (SELECT MAX(updated_at) FROM {{ this }})
{% endif %}

➡️ When this model runs, dbt will:

  1. Execute the validation query first (before inserting data)
  2. Abort the run with an error if existing data violates the contract
  3. Proceed only if the entire dataset is compliant

✅ Benefits of This Approach

FeatureBenefit
🔁 Validates entire tableNot just new rows — also checks existing records
🔒 Prevents silent driftEnsures contract compliance across time
⚙️ dbt-nativeFully implemented within dbt macros — no need for external stored procedures
🔍 Custom errorsProvides detailed error messages for debugging
🧪 PortableWorks across environments and warehouses (BigQuery, Snowflake, etc.)

🧠 Summary

While model contracts in dbt enforce schema expectations on new data, they don’t protect you from historical inconsistencies.

By using a Jinja macro + pre-hook, you can:

  • Validate the entire dataset (existing + new rows)
  • Stop execution when data violates your contract
  • Keep your data warehouse clean and reliable

This pattern is especially powerful for production models, where data integrity is non-negotiable.


Leave a Reply

Your email address will not be published. Required fields are marked *