How do dbt models relate to Dagster assets?
dbt models are assets: they produce data and can have dependencies. Because of these similarities, Dagster can translate each of your dbt models into a Dagster Software-defined Asset (SDA).
How can Dagster do this? Each component of a Dagster asset has an equivalent counterpart in a dbt model:
- The asset key for a dbt model is (by default) the name of the model
- The upstream dependencies of a dbt model are defined with ref or source calls within the model's definition
- The computation required to compute the asset from its upstream dependencies is the SQL within the model's definition
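The mapping in the list above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how Dagster does it (Dagster reads the compiled manifest.json rather than parsing SQL). The helper describe_model, the regular expressions, and the model and source names are all hypothetical, and the sketch only handles the single-argument form of ref.

```python
import re

# Hypothetical patterns for {{ ref('model') }} and {{ source('src', 'table') }}.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")
SOURCE_PATTERN = re.compile(
    r"\{\{\s*source\(\s*['\"]([^'\"]+)['\"]\s*,\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}"
)


def describe_model(name: str, sql: str) -> dict:
    """Return the three asset components for one dbt model (illustrative only)."""
    refs = REF_PATTERN.findall(sql)
    sources = [f"{src}.{table}" for src, table in SOURCE_PATTERN.findall(sql)]
    return {
        "asset_key": name,                    # by default, the model name
        "upstream": sorted(refs + sources),   # from ref / source calls
        "computation": sql.strip(),           # the model's SQL
    }


# A made-up model definition; 'orders_cleaned' is a hypothetical upstream model.
MODEL_SQL = """
select o.order_date, count(*) as num_orders
from {{ ref('orders_cleaned') }} as o
join {{ source('postgres', 'users') }} as u on o.user_id = u.id
group by 1
"""

summary = describe_model("daily_order_summary", MODEL_SQL)
print(summary["asset_key"], summary["upstream"])
```

Each key in the returned dictionary corresponds to one bullet above: the asset key, the upstream dependencies, and the computation.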
For example, Dagster can build a single asset graph that spans Fivetran, dbt, and Python from code like this:
from pathlib import Path

import dagster as dg
from dagster_dbt import DbtCliResource, dbt_assets, get_asset_key_for_model
from dagster_fivetran import build_fivetran_assets

# Load two tables from a Fivetran Postgres connector as Dagster assets
fivetran_assets = build_fivetran_assets(
    connector_id="postgres",
    destination_tables=["users", "orders"],
)


# Create one Dagster asset per dbt model found in the manifest
@dbt_assets(manifest=Path("manifest.json"))
def dbt_project_assets(context: dg.AssetExecutionContext, dbt: DbtCliResource):
    yield from dbt.cli(["build"], context=context).stream()


# A downstream asset that depends on the daily_order_summary dbt model
@dg.asset(
    kinds={"tensorflow"},
    deps=[get_asset_key_for_model([dbt_project_assets], "daily_order_summary")],
)
def predicted_orders():
    ...
Let's break down what's happening in this example:
- Using build_fivetran_assets, we load two tables (users, orders) from a Fivetran Postgres connector as Dagster assets
- Using @dbt_assets, Dagster reads from a dbt project's manifest.json and creates Dagster assets from the dbt models it finds
- Lastly, we create a Dagster @dg.asset named predicted_orders that has an upstream dependency on a dbt asset named daily_order_summary
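To make the manifest step more concrete, here is a minimal sketch of how model names and their upstream dependencies could be read out of a manifest-like dictionary. It assumes the manifest keeps models under a top-level nodes mapping with resource_type, name, and depends_on fields. The project name shop, the model orders_cleaned, and the helper model_dependencies are hypothetical, and this is not the dagster-dbt implementation.

```python
# A tiny, hand-written stand-in for manifest.json (a real manifest has many more fields).
MANIFEST = {
    "nodes": {
        "model.shop.orders_cleaned": {
            "resource_type": "model",
            "name": "orders_cleaned",
            "depends_on": {"nodes": ["source.shop.postgres.orders"]},
        },
        "model.shop.daily_order_summary": {
            "resource_type": "model",
            "name": "daily_order_summary",
            "depends_on": {
                "nodes": ["model.shop.orders_cleaned", "source.shop.postgres.users"]
            },
        },
    }
}


def model_dependencies(manifest: dict) -> dict:
    """Map each model name to the sorted names of its upstream nodes."""
    graph = {}
    for node in manifest["nodes"].values():
        if node["resource_type"] != "model":
            continue  # skip tests, seeds, snapshots, and other node types
        # The last dot-separated segment of a unique ID is the model or table name.
        upstream = [unique_id.split(".")[-1] for unique_id in node["depends_on"]["nodes"]]
        graph[node["name"]] = sorted(upstream)
    return graph


graph = model_dependencies(MANIFEST)
print(graph)
```

In this toy graph, daily_order_summary sits downstream of both a dbt model and a source table. That is the same shape as the example above, where the Fivetran tables feed the dbt models and predicted_orders depends on daily_order_summary.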