Dagster & dbt (Component)
The dagster-dbt library provides a DbtProjectComponent, which represents dbt models as assets in Dagster. Dagster assets understand dbt at the level of individual dbt models. This means that you can:
- Use Dagster's UI or APIs to run subsets of your dbt models, seeds, and snapshots.
- Track failures, logs, and run history for individual dbt models, seeds, and snapshots.
- Define dependencies between individual dbt models and other data assets. For example, put dbt models after the Fivetran-ingested table that they read from, or put a machine learning model after the dbt models that it's trained on.
DbtProjectComponent is a state-backed component, which compiles and caches your dbt project's manifest. For information on managing component state, see Configuring state-backed components.
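To make the model-level view concrete, here is a minimal sketch of how per-model dependencies can be read out of a dbt manifest. It is not how DbtProjectComponent is implemented. It assumes the `nodes`, `resource_type`, and `depends_on.nodes` fields of dbt's `manifest.json`, and the helper name `model_dependencies` and the sample data are hypothetical.

```python
from typing import Any

# Resource types that the text above says are tracked individually.
TRACKED_TYPES = {"model", "seed", "snapshot"}


def model_dependencies(manifest: dict[str, Any]) -> dict[str, list[str]]:
    """Map each model, seed, or snapshot name to the names of its upstream nodes."""
    result: dict[str, list[str]] = {}
    for node in manifest.get("nodes", {}).values():
        if node.get("resource_type") not in TRACKED_TYPES:
            continue  # skip tests and other node types
        upstream_ids = node.get("depends_on", {}).get("nodes", [])
        # Unique IDs look like "model.jaffle_shop.stg_orders"; keep the last part.
        result[node["name"]] = sorted(uid.split(".")[-1] for uid in upstream_ids)
    return result


# Hand-written stand-in for a small slice of target/manifest.json (illustrative only).
sample_manifest = {
    "nodes": {
        "seed.jaffle_shop.raw_orders": {
            "resource_type": "seed",
            "name": "raw_orders",
            "depends_on": {"nodes": []},
        },
        "model.jaffle_shop.stg_orders": {
            "resource_type": "model",
            "name": "stg_orders",
            "depends_on": {"nodes": ["seed.jaffle_shop.raw_orders"]},
        },
        "model.jaffle_shop.orders": {
            "resource_type": "model",
            "name": "orders",
            "depends_on": {
                "nodes": [
                    "model.jaffle_shop.stg_payments",
                    "model.jaffle_shop.stg_orders",
                ]
            },
        },
        "test.jaffle_shop.unique_orders_order_id": {
            "resource_type": "test",
            "name": "unique_orders_order_id",
            "depends_on": {"nodes": ["model.jaffle_shop.orders"]},
        },
    }
}

deps = model_dependencies(sample_manifest)
for name, upstream in deps.items():
    print(name, "<-", upstream)
```

Each key in the resulting mapping corresponds to one asset, and each list to that asset's upstream dependencies, which is the same shape that the Deps column of dg list defs shows later in this guide.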
Dagster supports dbt Fusion as of the 1.11.5 release and automatically detects which engine you have installed. If you're currently using dbt Core, migrate by uninstalling dbt-core and installing dbt Fusion. For more information, see the dbt docs.
This feature is still in preview pending dbt Fusion GA.
1. Prepare a Dagster project
To begin, you'll need a Dagster project. You can use an existing components-ready project or create a new one:
create-dagster project my-project && cd my-project
Activate the project virtual environment:
source .venv/bin/activate
Then, add the dagster-dbt library to the project, along with the DuckDB adapter.
With uv:
uv add dagster-dbt dbt-duckdb
With pip:
pip install dagster-dbt dbt-duckdb
2. Set up a dbt project
For this tutorial, we'll use the jaffle shop dbt project as an example. Clone it into your project:
git clone --depth=1 https://github.com/dbt-labs/jaffle_shop.git dbt && rm -rf dbt/.git
Next, create a profiles.yml file in the dbt directory to configure the project to use DuckDB:
jaffle_shop:
  target: dev
  outputs:
    dev:
      type: duckdb
      path: tutorial.duckdb
      threads: 24
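If you prefer to script this step, the following is a minimal sketch that writes the same profile with Python's standard library. The helper name `write_profiles` is hypothetical, and the demo writes into a temporary directory rather than your real project. In practice you would pass the path to the dbt directory you cloned above.

```python
import tempfile
from pathlib import Path

# Same content as the profiles.yml shown above.
PROFILES_YML = """\
jaffle_shop:
  target: dev
  outputs:
    dev:
      type: duckdb
      path: tutorial.duckdb
      threads: 24
"""


def write_profiles(dbt_dir: Path) -> Path:
    """Write profiles.yml into dbt_dir and return the path of the new file."""
    dbt_dir.mkdir(parents=True, exist_ok=True)
    profiles_path = dbt_dir / "profiles.yml"
    profiles_path.write_text(PROFILES_YML, encoding="utf-8")
    return profiles_path


# Demo against a throwaway directory so nothing in your project is touched.
with tempfile.TemporaryDirectory() as tmp:
    written = write_profiles(Path(tmp) / "dbt")
    content = written.read_text(encoding="utf-8")
    print(written.name)
```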
3. Scaffold a dbt component definition
Now that you have a Dagster project with a dbt project, you can scaffold a dbt component definition. You'll need to provide the path to your dbt project:
dg scaffold defs dagster_dbt.DbtProjectComponent dbt_ingest \
  --project-path "dbt"
Creating defs at /.../my-project/src/my_project/defs/dbt_ingest.
The dg scaffold defs call will generate a defs.yaml file in your project structure:
tree src/my_project
src/my_project
├── __init__.py
├── definitions.py
└── defs
    ├── __init__.py
    └── dbt_ingest
        └── defs.yaml
3 directories, 4 files
In its scaffolded form, the defs.yaml file contains the configuration for your dbt project:
type: dagster_dbt.DbtProjectComponent
attributes:
  project: '{{ project_root }}/dbt'
This is sufficient to load your dbt models as assets. You can use dg list defs to see the asset representation:
dg list defs
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section      ┃ Definitions                                                                                           ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets       │ ┏━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ │
│              │ ┃ Key           ┃ Group   ┃ Deps          ┃ Kinds  ┃ Description                                    ┃ │
│              │ ┡━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━  ━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ │
│              │ │ customers     │ default │ stg_customers │ dbt    │ This table has basic information about a       │ │
│              │ │               │         │ stg_orders    │ duckdb │ customer, as well as some derived facts based  │ │
│              │ │               │         │ stg_payments  │        │ on a custome…                                  │ │
│              │ ├───────────────┼─────────┼───────────────┼────────┼────────────────────────────────────────────────┤ │
│              │ │ orders        │ default │ stg_orders    │ dbt    │ This table has basic information about orders, │ │
│              │ │               │         │ stg_payments  │ duckdb │ as well as some derived facts based on         │ │
│              │ │               │         │               │        │ payments                                       │ │
│              │ │               │         │               │        │                                                │ │
│              │ │               │         │               │        │ #### …                                         │ │
│              │ ├───────────────┼─────────┼───────────────┼────────┼────────────────────────────────────────────────┤ │
│              │ │ raw_customers │ default │               │ dbt    │ dbt seed raw_customers                         │ │
│              │ │               │         │               │ duckdb │                                                │ │
│              │ │               │         │               │        │ #### Raw SQL:                                  │ │
│              │ │               │         │               │        │ ```sql                                         │ │
│              │ │               │         │               │        │                                                │ │