Dagster & Airbyte (Component)
The dagster-airbyte library provides an AirbyteWorkspaceComponent, which represents Airbyte connections as assets in Dagster.
1. Prepare a Dagster project
To begin, you'll need a Dagster project. You can use an existing components-ready project or create a new one:
create-dagster project my-project && cd my-project/src
Activate the project virtual environment:
source ../.venv/bin/activate
Finally, add the dagster-airbyte library to the project:
uv add dagster-airbyte
2. Scaffold an Airbyte component
Now that you have a Dagster project, you can scaffold an Airbyte component. You'll need to provide your Airbyte workspace ID and API credentials:
dg scaffold defs dagster_airbyte.AirbyteWorkspaceComponent airbyte_ingest \
  --workspace-id test_workspace --client-id "{{ env.AIRBYTE_CLIENT_ID }}" --client-secret "{{ env.AIRBYTE_CLIENT_SECRET }}"
Creating defs at /.../my-project/src/my_project/defs/airbyte_ingest.
The scaffold call will generate a defs.yaml file:
tree my_project/defs
my_project/defs
├── __init__.py
└── airbyte_ingest
    └── defs.yaml
2 directories, 2 files
In its scaffolded form, the defs.yaml file contains the configuration for your Airbyte workspace:
type: dagster_airbyte.AirbyteWorkspaceComponent
attributes:
  workspace:
    workspace_id: test_workspace
    client_id: '{{ env.AIRBYTE_CLIENT_ID }}'
    client_secret: '{{ env.AIRBYTE_CLIENT_SECRET }}'
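Because the scaffolded file reads its credentials from `{{ env.… }}` references, it can help to confirm that those variables are set before loading the component. The following is a minimal sketch using only the Python standard library; the helper name `missing_env_vars` and the regular expression are illustrative assumptions and are not part of dagster-airbyte or dg.

```python
import os
import re

# Matches template references such as {{ env.AIRBYTE_CLIENT_ID }}.
ENV_REF = re.compile(r"\{\{\s*env\.([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def missing_env_vars(yaml_text: str, environ=None) -> list[str]:
    """Return referenced environment variables that are unset or empty.

    Hypothetical helper for illustration; it only scans the text and does
    not reproduce how Dagster resolves templates.
    """
    environ = os.environ if environ is None else environ
    referenced = ENV_REF.findall(yaml_text)
    # dict.fromkeys drops duplicates while preserving first-seen order.
    return [name for name in dict.fromkeys(referenced) if not environ.get(name)]


DEFS_YAML = """\
type: dagster_airbyte.AirbyteWorkspaceComponent
attributes:
  workspace:
    workspace_id: test_workspace
    client_id: '{{ env.AIRBYTE_CLIENT_ID }}'
    client_secret: '{{ env.AIRBYTE_CLIENT_SECRET }}'
"""

if __name__ == "__main__":
    # Prints the names that still need to be exported in the current shell.
    print(missing_env_vars(DEFS_YAML))
```

An empty list means every variable referenced in the file is available in the current environment.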
You can check the configuration of your component:
dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions                                                ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets  │ ┏━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━┓ │
│         │ ┃ Key         ┃ Group   ┃ Deps ┃ Kinds     ┃ Description ┃ │
│         │ ┡━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━┩ │
│         │ │ account     │ default │      │ airbyte   │             │ │
│         │ │             │         │      │ snowflake │             │ │
│         │ ├─────────────┼─────────┼──────┼───────────┼─────────────┤ │
│         │ │ company     │ default │      │ airbyte   │             │ │
│         │ │             │         │      │ snowflake │             │ │
│         │ ├─────────────┼─────────┼──────┼───────────┼─────────────┤ │
│         │ │ contact     │ default │      │ airbyte   │             │ │
│         │ │             │         │      │ snowflake │             │ │
│         │ ├─────────────┼─────────┼──────┼───────────┼─────────────┤ │
│         │ │ opportunity │ default │      │ airbyte   │             │ │
│         │ │             │         │      │ snowflake │             │ │
│         │ ├─────────────┼─────────┼──────┼───────────┼─────────────┤ │
│         │ │ task        │ default │      │ airbyte   │             │ │
│         │ │             │         │      │ snowflake │             │ │
│         │ ├─────────────┼─────────┼──────┼───────────┼─────────────┤ │
│         │ │ user        │ default │      │ airbyte   │             │ │
│         │ │             │         │      │ snowflake │             │ │
│         │ └─────────────┴─────────┴──────┴───────────┴─────────────┘ │
└─────────┴────────────────────────────────────────────────────────────┘
3. Configuration for Airbyte OSS or Self-Managed Enterprise
To configure your Airbyte component for Airbyte OSS or Self-Managed Enterprise, provide both the REST API URL and the Configuration API URL.
The REST API URL endpoint is exposed at https://<airbyte-server-hostname>/api/public/v1 and the Configuration API URL endpoint is exposed at https://<airbyte-server-hostname>/api/v1.
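As a small illustration of the two URL patterns above, the sketch below derives both endpoints from a server hostname. The function name `airbyte_api_urls` and the `scheme` parameter are assumptions made for this example; the returned keys mirror the `rest_api_base_url` and `configuration_api_base_url` attributes used in the component configuration.

```python
def airbyte_api_urls(hostname: str, scheme: str = "https") -> dict[str, str]:
    """Build the REST and Configuration API URLs for a self-managed server.

    Hypothetical helper: it only formats strings according to the URL
    patterns described in the text and performs no network calls.
    """
    base = f"{scheme}://{hostname.strip('/')}"
    return {
        "rest_api_base_url": f"{base}/api/public/v1",
        "configuration_api_base_url": f"{base}/api/v1",
    }


if __name__ == "__main__":
    # A local deployment served over plain HTTP on port 8000.
    for key, url in airbyte_api_urls("localhost:8000", scheme="http").items():
        print(f"{key}: {url}")
```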
Airbyte OSS and Self-Managed Enterprise support several authentication methods. Please see Authentication in Self-Managed in the Airbyte API docs for more details. The workspace configuration for each method is shown below.
OAuth Client Credentials:
type: dagster_airbyte.AirbyteWorkspaceComponent
attributes:
  workspace:
    rest_api_base_url: http://localhost:8000/api/public/v1
    configuration_api_base_url: http://localhost:8000/api/v1
    workspace_id: test_workspace
    client_id: "{{ env.AIRBYTE_CLIENT_ID }}"
    client_secret: "{{ env.AIRBYTE_CLIENT_SECRET }}"
Basic Authentication:
type: dagster_airbyte.AirbyteWorkspaceComponent
attributes:
  workspace:
    rest_api_base_url: http://localhost:8000/api/public/v1
    configuration_api_base_url: http://localhost:8000/api/v1
    workspace_id: test_workspace
    username: "{{ env.AIRBYTE_USERNAME }}"
    password: "{{ env.AIRBYTE_PASSWORD }}"
No Authentication:
type: dagster_airbyte.AirbyteWorkspaceComponent
attributes:
  workspace:
    rest_api_base_url: http://localhost:8000/api/public/v1
    configuration_api_base_url: http://localhost:8000/api/v1
    workspace_id: test_workspace
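The three configurations differ only in which credential keys appear under `workspace`. The sketch below classifies a workspace mapping by those keys. It is an illustration, not the component's own validation: the helper name `detect_auth_method`, the returned labels, and the choice to reject mixed or partial credentials are all assumptions made for this example.

```python
def detect_auth_method(workspace: dict) -> str:
    """Classify a workspace mapping by the credential keys it contains.

    Hypothetical helper. Rejecting mixed or half-specified credentials is
    an assumption of this sketch, not documented library behavior.
    """
    oauth = [key in workspace for key in ("client_id", "client_secret")]
    basic = [key in workspace for key in ("username", "password")]

    if any(oauth) and any(basic):
        raise ValueError("choose either OAuth client credentials or basic auth")
    if any(oauth):
        if not all(oauth):
            raise ValueError("client_id and client_secret must be set together")
        return "oauth_client_credentials"
    if any(basic):
        if not all(basic):
            raise ValueError("username and password must be set together")
        return "basic"
    return "none"


if __name__ == "__main__":
    workspace = {
        "rest_api_base_url": "http://localhost:8000/api/public/v1",
        "configuration_api_base_url": "http://localhost:8000/api/v1",
        "workspace_id": "test_workspace",
        "username": "{{ env.AIRBYTE_USERNAME }}",
        "password": "{{ env.AIRBYTE_PASSWORD }}",
    }
    print(detect_auth_method(workspace))
```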
4. Select specific connections
You can select specific Airbyte connections to include in your component using the connection_selector key. This allows you to filter which connections are represented as assets:
type: dagster_airbyte.AirbyteWorkspaceComponent
attributes:
  workspace:
    workspace_id: test_workspace
    client_id: "{{ env.AIRBYTE_CLIENT_ID }}"
    client_secret: "{{ env.AIRBYTE_CLIENT_SECRET }}"
  connection_selector:
    by_name:
      - salesforce_to_snowflake
dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions                                                ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets  │ ┏━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━┓ │
│         │ ┃ Key         ┃ Group   ┃ Deps ┃ Kinds     ┃ Description ┃ │
│         │ ┡━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━┩ │
│         │ │ account     │ default │      │ airbyte   │             │ │
│         │ │             │         │      │ snowflake │             │ │
│         │ ├─────────────┼─────────┼──────┼───────────┼─────────────┤ │
│         │ │ opportunity │ default │      │ airbyte   │             │ │
│         │ │             │         │      │ snowflake │             │ │
│         │ ├─────────────┼─────────┼──────┼───────────┼─────────────┤ │
│         │ │ task        │ default │      │ airbyte   │             │ │
│         │ │             │         │      │ snowflake │             │ │
│         │ ├─────────────┼─────────┼──────┼───────────┼─────────────┤ │
│         │ │ user        │ default │      │ airbyte   │             │ │
│         │ │             │         │      │ snowflake │             │ │
│         │ └─────────────┴─────────┴──────┴───────────┴─────────────┘ │
└─────────┴────────────────────────────────────────────────────────────┘
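The effect of `by_name` on the asset list can be sketched in a few lines of Python. This is only an illustration of the filtering, not the component's implementation: the connection dictionaries, the `select_by_name` helper, and the second connection name `hubspot_to_snowflake` are hypothetical and chosen so the result matches the listing above.

```python
def select_by_name(connections: list[dict], names: list[str]) -> list[dict]:
    """Keep only the connections whose name is in the allow-list."""
    wanted = set(names)
    return [conn for conn in connections if conn["name"] in wanted]


# Hypothetical workspace contents; only salesforce_to_snowflake is named
# in the configuration, the other connection is invented for the example.
CONNECTIONS = [
    {"name": "salesforce_to_snowflake",
     "streams": ["account", "opportunity", "task", "user"]},
    {"name": "hubspot_to_snowflake",
     "streams": ["company", "contact"]},
]

if __name__ == "__main__":
    selected = select_by_name(CONNECTIONS, ["salesforce_to_snowflake"])
    # Each stream of a selected connection corresponds to one asset key.
    asset_keys = sorted(s for conn in selected for s in conn["streams"])
    print(asset_keys)
```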
5. Customize Airbyte assets
Properties of the assets emitted by each connection can be customized in the defs.yaml file using the translation key:
type: dagster_airbyte.AirbyteWorkspaceComponent
attributes:
  workspace:
    workspace_id: test_workspace
    client_id: "{{ env.AIRBYTE_CLIENT_ID }}"
    client_secret: "{{ env.AIRBYTE_CLIENT_SECRET }}"
  connection_selector:
    by_name:
      - salesforce_to_snowflake
  translation:
    group_name: airbyte_data
    description: "Loads data from Airbyte connection {{ props.connection_name }}"
dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions                                                                                                ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets  │ ┏━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ │
│         │ ┃ Key         ┃ Group        ┃ Deps ┃ Kinds     ┃ Description                                            ┃ │
│         │ ┡━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ │
│         │ │ account     │ airbyte_data │      │ airbyte   │ Loads data from Airbyte connection                     │ │
│         │ │             │              │      │ snowflake │ salesforce_to_snowflake                                │ │
│         │ ├─────────────┼──────────────┼──────┼───────────┼────────────────────────────────────────────────────────┤ │
│         │ │ opportunity │ airbyte_data │      │ airbyte   │ Loads data from Airbyte connection                     │ │
│         │ │             │              │      │ snowflake │ salesforce_to_snowflake                                │ │
│         │ ├─────────────┼──────────────┼──────┼───────────┼────────────────────────────────────────────────────────┤ │
│         │ │ task        │ airbyte_data │      │ airbyte   │ Loads data from Airbyte connection                     │ │
│         │ │             │              │      │ snowflake │ salesforce_to_snowflake                                │ │
│         │ ├─────────────┼──────────────┼──────┼───────────┼────────────────────────────────────────────────────────┤ │
│         │ │ user        │ airbyte_data │      │ airbyte   │ Loads data from Airbyte connection                     │ │
│         │ │             │              │      │ snowflake │ salesforce_to_snowflake                                │ │
│         │ └─────────────┴──────────────┴──────┴───────────┴────────────────────────────────────────────────────────┘ │
└─────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
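To make the `{{ props.connection_name }}` substitution concrete, the sketch below expands the description template for one connection. It uses a small regular expression rather than Dagster's actual template engine, so treat `render_description` and its behavior as assumptions made for illustration only.

```python
import re

# Matches references such as {{ props.connection_name }}.
PROPS_REF = re.compile(r"\{\{\s*props\.(\w+)\s*\}\}")


def render_description(template: str, props: dict) -> str:
    """Replace each {{ props.<name> }} reference with the matching value.

    Hypothetical stand-in for the real templating step; an unknown
    property name raises KeyError.
    """
    return PROPS_REF.sub(lambda match: str(props[match.group(1)]), template)


if __name__ == "__main__":
    template = "Loads data from Airbyte connection {{ props.connection_name }}"
    print(render_description(template, {"connection_name": "salesforce_to_snowflake"}))
```

Each asset from the same connection receives the same rendered description, which is why every row in the listing shows identical text.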