๐ฏ What is schema: null
?
When you set schema: null
in a dbt model configuration, you’re explicitly telling dbt to use the default target schema from your connection profile, ignoring any other schema configurations.
๐ The Three Schema Configuration States
1. No Schema Configuration (Default Behavior)
-- models/customers.sql
-- No schema config = follows dbt's default schema logic
SELECT * FROM raw_data.customers
Result: Uses dbt’s schema generation logic (target schema + model folder structure)
2. Custom Schema Configuration
-- models/customers.sql
{{ config(schema='analytics') }}
SELECT * FROM raw_data.customers
Result: Creates table in my_database.analytics
schema
3. Explicit Null Schema (Our Focus)
-- models/customers.sql
{{ config(schema=null) }}
SELECT * FROM raw_data.customers
Result: Forces table into default target schema, ignoring all other schema rules
๐ Detailed Examples
Connection Profile Setup
# ~/.dbt/profiles.yml
my_project:
target: dev
outputs:
dev:
type: postgres
host: localhost
schema: dbt_dev # โ This is the DEFAULT TARGET SCHEMA
user: analyst
prod:
type: postgres
host: prod-server
schema: analytics # โ This is the DEFAULT TARGET SCHEMA for prod
user: dbt_user
Scenario 1: Project with Custom Schema Logic
# dbt_project.yml
name: 'ecommerce_project'
models:
ecommerce_project:
# Project-level schema configuration
marts:
+schema: 'business_marts' # All marts models go to 'business_marts'
staging:
+schema: 'staging_area' # All staging models go to 'staging_area'
Without schema: null
:
-- models/marts/customer_summary.sql
-- No explicit schema config = uses project config
SELECT customer_id, total_orders
FROM {{ ref('stg_customers') }}
Result: dbt_dev.business_marts.customer_summary
With schema: null
:
-- models/marts/customer_summary.sql
{{ config(schema=null) }} -- โ
Explicit override
SELECT customer_id, total_orders
FROM {{ ref('stg_customers') }}
Result: dbt_dev.customer_summary
(ignores project config!)
Scenario 2: Mixed Schema Strategy
-- models/marts/schema.yml
version: 2
models:
# Regular mart models use custom schema
- name: sales_summary
config:
schema: 'business_marts'
# Core models must go to default schema
- name: dim_customers
config:
schema: null # โ
Forces default schema
- name: fact_orders
config:
schema: null # โ
Forces default schema
Results:
dbt_dev.business_marts.sales_summary
dbt_dev.dim_customers
โ Default schemadbt_dev.fact_orders
โ Default schema
Scenario 3: Environment-Specific Behavior
-- models/core/customer_master.sql
{{ config(
schema=null, -- Always use default schema
materialized='table'
) }}
SELECT
customer_id,
customer_name,
email,
created_date
FROM {{ ref('stg_customers') }}
Dev Environment:
- Profile:
schema: dbt_dev
- Result:
my_database.dbt_dev.customer_master
Prod Environment:
- Profile:
schema: analytics
- Result:
my_database.analytics.customer_master
Staging Environment:
- Profile:
schema: staging
- Result:
my_database.staging.customer_master
๐ ๏ธ Practical Use Cases
Use Case 1: Core Dimension Tables
-- models/dimensions/dim_date.sql
{{ config(
schema=null, -- โ
Always in default schema
materialized='table',
pre_hook="DROP TABLE IF EXISTS {{ this }}"
) }}
WITH date_spine AS (
SELECT
DATE_ADD('2020-01-01', INTERVAL seq DAY) as date_day
FROM UNNEST(GENERATE_ARRAY(0, 1825)) as t(seq) -- 5 years
)
SELECT
date_day,
EXTRACT(YEAR FROM date_day) as year,
EXTRACT(MONTH FROM date_day) as month,
EXTRACT(DAYOFWEEK FROM date_day) as day_of_week,
CASE EXTRACT(DAYOFWEEK FROM date_day)
WHEN 1 THEN 'Sunday'
WHEN 2 THEN 'Monday'
-- etc...
END as day_name
FROM date_spine
Why schema: null
?
- Dimension tables are foundational – should be in default schema
- All environments need to reference the same location
- Simplifies cross-schema references
Use Case 2: Development Workflow Consistency
# dbt_project.yml
models:
my_project:
# Development models use custom schemas for isolation
experiments:
+schema: 'dev_experiments'
# But core models must be consistent across environments
core:
+schema: null # โ
Always default schema
-- models/core/customer_360.sql
{{ config(
schema=null, -- Consistent location across dev/staging/prod
materialized='table'
) }}
SELECT
c.customer_id,
c.customer_name,
o.total_orders,
o.total_revenue,
s.support_tickets
FROM {{ ref('dim_customers') }} c
LEFT JOIN {{ ref('customer_order_summary') }} o USING (customer_id)
LEFT JOIN {{ ref('customer_support_summary') }} s USING (customer_id)
Use Case 3: Overriding Macro-Generated Schemas
-- macros/generate_schema_name.sql
{% macro generate_schema_name(custom_schema_name, node) -%}
{%- set default_schema = target.schema -%}
{%- if custom_schema_name is none -%}
{{ default_schema }}
{%- else -%}
{{ default_schema }}_{{ custom_schema_name | trim }}
{%- endif -%}
{%- endmacro %}
Normal model with custom schema:
-- models/reports/monthly_sales.sql
{{ config(schema='reports') }}
SELECT * FROM {{ ref('fact_orders') }}
Result: dbt_dev_reports.monthly_sales
Model with schema: null
bypasses macro:
-- models/core/fact_orders.sql
{{ config(schema=null) }} -- โ
Bypasses generate_schema_name macro
SELECT * FROM {{ ref('stg_orders') }}
Result: dbt_dev.fact_orders
(pure default schema)
๐ Comparison Table
Configuration | Schema Result | When to Use |
---|---|---|
No schema config | Uses dbt logic + macros | Default behavior |
schema: 'custom' | target_schema_custom | Organize by domain |
schema: null | target_schema only | Force default schema |
๐ฏ Best Practices
โ
Use schema: null
When:
- Core dimension tables that should be centrally located
{{ config(schema=null) }} -- dim_customers, dim_products, etc.
- Cross-environment consistency is critical
{{ config(schema=null) }} -- Same location in dev/staging/prod
- Overriding project-level schema configurations
-- Project says marts go to 'business_marts', but this one goes to default
{{ config(schema=null) }}
- Simplifying references in downstream models
-- Easier to reference when in predictable default location
SELECT * FROM {{ target.schema }}.dim_customers -- Always works
โ Avoid schema: null
When:
- Domain separation is important
-- Better to keep separate
{{ config(schema='finance') }} -- Finance models
{{ config(schema='marketing') }} -- Marketing models
- Multiple teams working on same project
-- Better to isolate by team
{{ config(schema='team_a') }}
{{ config(schema='team_b') }}
๐ง Advanced Configuration Examples
Dynamic Schema with Null Override
-- models/core/universal_customer.sql
{{ config(
schema=var('force_default_schema', false) and null or 'customers',
materialized='table'
) }}
SELECT * FROM {{ ref('stg_customers') }}
Usage:
# Normal run - goes to custom schema
dbt run --select universal_customer
# Result: dbt_dev_customers.universal_customer
# Force default schema
dbt run --select universal_customer --vars '{"force_default_schema": true}'
# Result: dbt_dev.universal_customer
Conditional Schema Assignment
-- models/staging/stg_orders.sql
{{ config(
schema=target.name == 'prod' and null or 'staging'
) }}
SELECT * FROM {{ source('raw', 'orders') }}
Results:
- Dev/Staging:
dbt_dev_staging.stg_orders
- Production:
analytics.stg_orders
(default schema)
๐ Real-World Implementation
E-commerce Project Structure
# dbt_project.yml
models:
ecommerce:
# Raw staging - separate schema
staging:
+schema: 'staging'
# Core business logic - default schema for consistency
core:
+schema: null # โ
All core models in default schema
# Department-specific marts
marts:
finance:
+schema: 'finance'
marketing:
+schema: 'marketing'
operations:
+schema: 'operations'
# Shared utilities - default schema for easy access
utils:
+schema: null # โ
Utility models in default schema
Results Across Environments
Development (target.schema: dbt_john
):
dbt_john_staging.stg_orders
dbt_john.dim_customers
โ Core (schema: null)dbt_john_finance.revenue_summary
dbt_john.date_spine
โ Utility (schema: null)
Production (target.schema: analytics
):
analytics_staging.stg_orders
analytics.dim_customers
โ Core (schema: null)analytics_finance.revenue_summary
analytics.date_spine
โ Utility (schema: null)
๐ก Key Takeaways
schema: null
is explicit – it’s different from no schema config- Forces default schema – ignores project configs and macros
- Ensures consistency across environments
- Simplifies references to core/shared models
- Perfect for dimensions and utilities that should be centrally located
Bottom Line: Use schema: null
when you want predictable, consistent schema placement regardless of other configurations! ๐ฏ