dbt Integration¶
Tessera integrates deeply with dbt to automatically extract and publish contracts from your dbt project.
Overview¶
The integration:
- Parses your
manifest.jsonafterdbt compileordbt run - Creates assets for each model, source, seed, and snapshot
- Extracts column schemas from your YAML definitions
- Converts dbt tests into contract guarantees
- Publishes contracts with version tracking
- Detects breaking changes in CI/CD pipelines
- Auto-registers consumers from
ref()dependencies
Quick Start¶
# 1. Compile your dbt project
dbt compile
# 2. Upload manifest to Tessera
curl -X POST http://localhost:8000/api/v1/sync/dbt/upload \
-H "Content-Type: application/json" \
-d "{
\"manifest\": $(cat target/manifest.json),
\"owner_team_id\": \"your-team-uuid\",
\"auto_publish_contracts\": true
}"
Endpoints¶
Upload Manifest¶
POST /api/v1/sync/dbt/upload
The primary integration endpoint. Syncs assets from a dbt manifest with full automation options.
{
"manifest": { /* manifest.json contents */ },
"owner_team_id": "uuid",
"conflict_mode": "overwrite",
"auto_publish_contracts": true,
"auto_create_proposals": true,
"auto_register_consumers": true,
"infer_consumers_from_refs": true
}
| Parameter | Type | Default | Description |
|---|---|---|---|
manifest |
object | required | Contents of manifest.json |
owner_team_id |
UUID | required | Default team for assets without meta.tessera.owner_team |
conflict_mode |
string | "ignore" |
How to handle existing assets: ignore, overwrite, fail |
auto_publish_contracts |
boolean | false |
Automatically publish contracts for synced assets |
auto_create_proposals |
boolean | false |
Create proposals for breaking changes instead of failing |
auto_register_consumers |
boolean | false |
Register consumers from meta.tessera.consumers |
infer_consumers_from_refs |
boolean | false |
Infer consumer relationships from dbt ref() calls |
Response:
{
"status": "success",
"assets": {
"created": 10,
"updated": 5,
"skipped": 2
},
"contracts": {
"published": 8
},
"proposals": {
"created": 2,
"details": [
{
"proposal_id": "uuid",
"asset_fqn": "warehouse.analytics.dim_customers",
"change_type": "major",
"breaking_changes_count": 3
}
]
},
"registrations": {
"created": 15
},
"guarantees_extracted": 12
}
Impact Analysis¶
POST /api/v1/sync/dbt/impact
Check impact of dbt model changes against existing contracts without applying changes.
Response:
{
"status": "success",
"total_models": 15,
"models_with_contracts": 8,
"breaking_changes_count": 2,
"results": [
{
"fqn": "warehouse.analytics.dim_customers",
"node_id": "model.project.dim_customers",
"has_contract": true,
"safe_to_publish": false,
"change_type": "major",
"breaking_changes": [
{
"type": "property_removed",
"property": "email",
"message": "Required property 'email' was removed"
}
]
}
]
}
CI/CD Diff¶
POST /api/v1/sync/dbt/diff
Dry-run preview for CI/CD pipelines. Shows what would change without applying.
Response:
{
"status": "breaking_changes_detected",
"summary": {
"new": 3,
"modified": 5,
"unchanged": 7,
"breaking": 1
},
"blocking": true,
"models": [
{
"fqn": "warehouse.analytics.dim_customers",
"change_type": "modified",
"schema_change_type": "breaking",
"breaking_changes": ["Removed required field 'email'"]
}
]
}
Use blocking: true in CI to fail the build on breaking changes.
Tessera Metadata Tags¶
Configure Tessera behavior using the meta.tessera block in your dbt YAML files.
Full Example¶
models:
- name: dim_customers
description: Customer dimension table
meta:
tessera:
# Ownership
owner_team: data-platform
owner_user: alice@company.com
# Consumer declarations
consumers:
- team: ml-platform
purpose: Model training dataset
- team: analytics
purpose: Dashboard metrics
# SLA guarantees
freshness:
max_staleness_minutes: 60
measured_by: _loaded_at
volume:
min_rows: 10000
max_row_delta_pct: 50
# Contract settings
compatibility_mode: backward
columns:
- name: customer_id
data_type: integer
tests:
- not_null
- unique
- name: email
data_type: string
tests:
- not_null
Available Fields¶
| Field | Type | Description |
|---|---|---|
owner_team |
string | Team name that owns this asset (overrides owner_team_id) |
owner_user |
string | Email of the user who owns this asset |
consumers |
list | Array of consumer team declarations |
consumers[].team |
string | Team name to register as consumer |
consumers[].purpose |
string | Optional description of how the team uses this data |
freshness |
object | Data freshness SLA configuration |
freshness.max_staleness_minutes |
integer | Maximum allowed data staleness |
freshness.measured_by |
string | Column used to measure freshness |
volume |
object | Data volume guarantees |
volume.min_rows |
integer | Minimum expected row count |
volume.max_row_delta_pct |
integer | Maximum allowed row count change percentage |
compatibility_mode |
string | Contract compatibility: backward, forward, full, none |
Compatibility Modes¶
| Mode | Description |
|---|---|
backward |
New schema must be compatible with old (can add optional fields, can't remove) |
forward |
Old consumers must be able to read new data (can remove fields, can't add required) |
full |
Any schema change requires acknowledgment |
none |
No compatibility checking (notify only) |
What Gets Synced¶
Resource Types¶
| dbt Resource | Synced | Resource Type | Notes |
|---|---|---|---|
| Models | Yes | model |
Full schema and metadata |
| Sources | Yes | source |
External data sources |
| Seeds | Yes | seed |
CSV reference data |
| Snapshots | Yes | snapshot |
Type-2 slowly changing dimensions |
| Tests | Extracted | - | Converted to guarantees |
| Macros | No | - | Not applicable |
| Exposures | No | - | Not applicable |
Asset Metadata¶
For each synced resource, Tessera captures:
{
"dbt_node_id": "model.my_project.dim_customers",
"resource_type": "model",
"description": "Customer dimension table",
"tags": ["core", "pii"],
"dbt_fqn": ["my_project", "marts", "core", "dim_customers"],
"path": "models/marts/core/dim_customers.sql",
"depends_on": [
"warehouse.staging.stg_customers",
"warehouse.staging.stg_orders"
],
"columns": {
"customer_id": {
"data_type": "integer",
"description": "Primary key"
}
},
"guarantees": {
"nullability": {"customer_id": "never"},
"accepted_values": {}
},
"tessera_meta": {
"owner_team": "data-platform",
"consumers": [{"team": "analytics"}]
}
}
Test Extraction¶
Tessera automatically converts dbt tests into contract guarantees.
Standard Tests¶
| dbt Test | Guarantee Type | Example |
|---|---|---|
not_null |
nullability |
{"customer_id": "never"} |
accepted_values |
accepted_values |
{"status": ["active", "inactive"]} |
unique |
custom |
Stored with test config |
relationships |
custom |
Stored with foreign key info |
Third-Party Tests¶
Tests from dbt_expectations, dbt_utils, and other packages are stored as custom guarantees:
columns:
- name: email
tests:
- dbt_expectations.expect_column_values_to_match_regex:
regex: "^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+$"
Becomes:
{
"guarantees": {
"custom": [
{
"type": "dbt_expectations.expect_column_values_to_match_regex",
"column": "email",
"config": {
"regex": "^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+$"
}
}
]
}
}
Singular Tests¶
SQL-based singular tests are also captured:
Becomes:
{
"guarantees": {
"custom": [
{
"type": "singular",
"name": "assert_positive_revenue",
"sql": "select * from {{ ref('fct_orders') }} where total_revenue < 0"
}
]
}
}
Type Mapping¶
dbt column types are mapped to JSON Schema types:
| dbt Type | JSON Schema Type |
|---|---|
string, text, varchar, char |
string |
integer, int, bigint, smallint |
integer |
float, double, numeric, decimal |
number |
boolean, bool |
boolean |
date |
string (format: date) |
datetime, timestamp |
string (format: date-time) |
json, jsonb, variant, object |
object |
array |
array |
Consumer Auto-Registration¶
From Meta Tags¶
Declare consumers in your model YAML:
models:
- name: fct_orders
meta:
tessera:
consumers:
- team: analytics
purpose: Daily revenue dashboard
- team: ml-platform
purpose: Churn prediction model
With auto_register_consumers: true, these teams are automatically registered.
From Refs¶
With infer_consumers_from_refs: true, Tessera analyzes dbt ref() calls:
If fct_revenue is owned by team finance, they're registered as a consumer of fct_orders.
CI/CD Integration¶
GitHub Actions - Check Breaking Changes¶
name: Check Contracts
on:
pull_request:
paths:
- 'models/**'
- 'dbt_project.yml'
jobs:
contract-check:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup Python
uses: actions/setup-python@v5
with:
python-version: '3.11'
- name: Install dbt
run: pip install dbt-core dbt-postgres
- name: Compile dbt
run: dbt compile
env:
DBT_PROFILES_DIR: .
- name: Check contract impact
run: |
RESULT=$(curl -s -X POST ${{ secrets.TESSERA_URL }}/api/v1/sync/dbt/diff \
-H "Authorization: Bearer ${{ secrets.TESSERA_API_KEY }}" \
-H "Content-Type: application/json" \
-d "{\"manifest\": $(cat target/manifest.json), \"fail_on_breaking\": true}")
echo "$RESULT" | jq .
if [ "$(echo $RESULT | jq -r '.blocking')" = "true" ]; then
echo "::error::Breaking changes detected!"
exit 1
fi
GitHub Actions - Sync on Merge¶
name: Sync to Tessera
on:
push:
branches: [main]
paths:
- 'models/**'
jobs:
sync:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Compile dbt
run: dbt compile
- name: Sync to Tessera
run: |
curl -X POST ${{ secrets.TESSERA_URL }}/api/v1/sync/dbt/upload \
-H "Authorization: Bearer ${{ secrets.TESSERA_API_KEY }}" \
-H "Content-Type: application/json" \
-d "{
\"manifest\": $(cat target/manifest.json),
\"owner_team_id\": \"${{ secrets.TESSERA_TEAM_ID }}\",
\"conflict_mode\": \"overwrite\",
\"auto_publish_contracts\": true,
\"auto_register_consumers\": true,
\"infer_consumers_from_refs\": true
}"
Pre-commit Hook¶
# .pre-commit-config.yaml
repos:
- repo: local
hooks:
- id: tessera-check
name: Check Tessera contracts
entry: bash -c 'dbt compile && curl -s -X POST $TESSERA_URL/api/v1/sync/dbt/diff -d "{\"manifest\": $(cat target/manifest.json)}" | jq -e ".blocking == false"'
language: system
files: ^models/.*\.yml$
pass_filenames: false
Conflict Resolution¶
The conflict_mode parameter controls how existing assets are handled:
| Mode | Behavior |
|---|---|
ignore |
Skip assets that already exist (default, safe) |
overwrite |
Update existing assets with new metadata |
fail |
Return error if any asset already exists |
Troubleshooting¶
No columns extracted¶
Ensure your YAML has data_type definitions:
Model not synced¶
Check that the model:
- Is in the manifest (run dbt compile)
- Has a valid unique_id
- Matches any filter you've applied
Team not found¶
If you get owner_team 'xyz' not found:
- Create the team first via the API or UI
- Use the team name exactly as it appears in Tessera
- Or remove meta.tessera.owner_team to use the default owner_team_id
Breaking changes not detected¶
Ensure the asset has an existing contract:
- Check /api/v1/assets/{id} for current_contract_id
- Contracts are only published with auto_publish_contracts: true
Python SDK¶
Use the Tessera Python SDK for programmatic access:
from tessera_sdk import TesseraClient
import json
client = TesseraClient(base_url="http://localhost:8000")
# Load manifest
with open("target/manifest.json") as f:
manifest = json.load(f)
# Sync with options
result = client.sync.dbt_upload(
manifest=manifest,
owner_team_id="your-team-uuid",
auto_publish_contracts=True,
auto_register_consumers=True
)
print(f"Created {result.assets.created} assets")
print(f"Published {result.contracts.published} contracts")