Post-processing components
It is often useful to make modifications to the definitions generated by a component without needing to modify the component logic. Dagster provides a generic mechanism for this called post-processing.
Post-processing is available on all components. To add post-processing to a
component instance, add a post_processing field in defs.yaml.
Currently, post-processing is supported only for assets, not for other definition types.
Setup
Let's look at a simple example using the DefsFolderComponent. DefsFolderComponent
simply loads all definitions from a specified folder.
Starting from a blank project, let's scaffold a DefsFolderComponent called
my_assets:
dg scaffold defs DefsFolderComponent my_assets
Creating defs at /.../my-project/src/my_project/defs/my_assets.
We now have a directory my_project/defs/my_assets with a single file,
defs.yaml:
type: dagster.DefsFolderComponent
attributes: {}
Let's add some assets. We'll create two files
my_project/defs/my_assets/foo.py and my_project/defs/my_assets/bar.py, each
containing a single asset:
foo.py:
import dagster as dg
@dg.asset
def foo():
    return None
bar.py:
import dagster as dg
@dg.asset
def bar():
    return None
Let's run dg list defs to see our assets:
dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions                                    ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets  │ ┏━━━━━┳━━━━━━━━━┳━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━┓ │
│         │ ┃ Key ┃ Group   ┃ Deps ┃ Kinds ┃ Description ┃ │
│         │ ┡━━━━━╇━━━━━━━━━╇━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━┩ │
│         │ │ bar │ default │      │       │             │ │
│         │ ├─────┼─────────┼──────┼───────┼─────────────┤ │
│         │ │ foo │ default │      │       │             │ │
│         │ └─────┴─────────┴──────┴───────┴─────────────┘ │
└─────────┴────────────────────────────────────────────────┘
Example 1: Adding kind tags to assets
Now suppose we want to add a compute kind to every asset defined in this folder.
We could do this by manually adding the kind on each asset declaration or by using
a factory. However, component post-processing provides a simpler solution. We
modify our defs.yaml to add a post_processing field that specifies the
kind:
type: dagster.DefsFolderComponent
attributes: {}
post_processing:
  assets:
    - attributes:
        kinds:
        - "some_kind"
Let's break down the structure of the value we set for post_processing. The
top-level key, assets, is currently the only supported key. assets holds a
list of asset post-processors. Each post-processor transforms a set of asset
attributes and applies to a subset of all of the assets generated by the
component. In this case, we have a single post-processor with no defined
subset, which means the specified transformation is applied to all assets.
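To make the structure concrete, here is a minimal pure-Python sketch of the behaviour described above. It is a toy model, not Dagster's implementation: the function name apply_post_processors is hypothetical, assets are represented as plain dictionaries, and a target is assumed to match only an exact asset key.

```python
from typing import Any, Optional

Asset = dict[str, Any]


def apply_post_processors(assets: list[Asset], post_processing: dict[str, Any]) -> list[Asset]:
    """Toy model: apply each entry under `assets` in order and return updated copies."""
    result = [dict(asset) for asset in assets]
    for processor in post_processing.get("assets", []):
        # A missing target means the post-processor applies to every asset.
        target: Optional[str] = processor.get("target")
        for asset in result:
            # Simplifying assumption: a target matches only an exact asset key.
            if target is None or asset["key"] == target:
                asset.update(processor.get("attributes", {}))
    return result


# Mirrors the defs.yaml above: one post-processor, no target.
assets = [
    {"key": "bar", "group_name": "default"},
    {"key": "foo", "group_name": "default"},
]
post_processing = {"assets": [{"attributes": {"kinds": ["some_kind"]}}]}

updated = apply_post_processors(assets, post_processing)
for asset in updated:
    print(asset["key"], asset["group_name"], asset["kinds"])
```

Because the single post-processor has no target, both toy assets pick up the kind, while the input list is left unchanged.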
Let's run dg list defs again to see the result:
dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions                                        ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets  │ ┏━━━━━┳━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━┓ │
│         │ ┃ Key ┃ Group   ┃ Deps ┃ Kinds     ┃ Description ┃ │
│         │ ┡━━━━━╇━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━┩ │
│         │ │ bar │ default │      │ some_kind │             │ │
│         │ ├─────┼─────────┼──────┼───────────┼─────────────┤ │
│         │ │ foo │ default │      │ some_kind │             │ │
│         │ └─────┴─────────┴──────┴───────────┴─────────────┘ │
└─────────┴────────────────────────────────────────────────────┘
You can see that both assets now have the kind we defined in our
post_processing field.
Example 2: Assigning assets to different groups
Adding a kind isn't the only thing we can do. The full schema for
attributes contains many other fields:
{
    "$defs": {
        "DailyPartitionsDefinitionModel": {
            "additionalProperties": false,
            "properties": {
                "type": {
                    "const": "daily",
                    "default": "daily",
                    "enum": [
                        "daily"
                    ],
                    "title": "Type",
                    "type": "string"
                },
                "start_date": {
                    "title": "Start Date",
                    "type": "string"
                },
                "end_date": {
                    "anyOf": [
                        {
                            "type": "string"
                        },
                        {
                            "type": "null"
                        }
                    ],
                    "default": null,
                    "title": "End Date"
                },
                "timezone": {
                    "anyOf": [
                        {
                            "type": "string"
                        },
                        {
                            "type": "null"
                        }
                    ],
                    "default": null,
                    "title": "Timezone"
                },
                "minute_offset": {
                    "default": 0,
                    "title": "Minute Offset",
                    "type": "integer"
                },
                "hour_offset": {
                    "default": 0,
                    "title": "Hour Offset",
                    "type": "integer"
                }
            },
            "required": [
                "start_date"
            ],
            "title": "DailyPartitionsDefinitionModel",
            "type": "object"
        },
        "HourlyPartitionsDefinitionModel": {
            "additionalProperties": false,
            "properties": {
                "type": {
                    "const": "hourly",
                    "default": "hourly",
                    "enum": [
                        "hourly"
                    ],
                    "title": "Type",
                    "type": "string"
                },
                "start_date": {
                    "title": "Start Date",
                    "type": "string"
                },
                "end_date": {
                    "anyOf": [
                        {
                            "type": "string"
                        },
                        {
                            "type": "null"
                        }
                    ],
                    "default": null,
                    "title": "End Date"
                },
                "timezone": {
                    "anyOf": [
                        {
                            "type": "string"
                        },
                        {
                            "type": "null"
                        }
                    ],
                    "default": null,
                    "title": "Timezone"
                },
                "minute_offset": {
                    "default": 0,
                    "title": "Minute Offset",
                    "type": "integer"
                }
            },
            "required": [
                "start_date"
            ],
            "title": "HourlyPartitionsDefinitionModel",
            "type": "object"
        },
        "StaticPartitionsDefinitionModel": {
            "additionalProperties": false,
            "properties": {
                "type": {
                    "const": "static",
                    "default": "static",
                    "enum": [
                        "static"
                    ],
                    "title": "Type",
                    "type": "string"
                },
                "partition_keys": {
                    "items": {
                        "type": "string"
                    },
                    "title": "Partition Keys",
                    "type": "array"
                }
            },
            "required": [
                "partition_keys"
            ],
            "title": "StaticPartitionsDefinitionModel",
            "type": "object"
        },
        "TimeWindowPartitionsDefinitionModel": {
            "additionalProperties": false,
            "properties": {
                "type": {
                    "const": "time_window",
                    "default": "time_window",
                    "enum": [
                        "time_window"
                    ],
                    "title": "Type",
                    "type": "string"
                },
                "start_date": {
                    "title": "Start Date",
                    "type": "string"
                },
                "end_date": {
                    "anyOf": [
                        {
                            "type": "string"
                        },
                        {
                            "type": "null"
                        }
                    ],
                    "default": null,
                    "title": "End Date"
                },
                "timezone": {
                    "anyOf": [
                        {
                            "type": "string"
                        },
                        {
                            "type": "null"
                        }
                    ],
                    "default": null,
                    "title": "Timezone"
                },
                "fmt": {
                    "title": "Fmt",
                    "type": "string"
                },
                "cron_schedule": {
                    "title": "Cron Schedule",
                    "type": "string"
                }
            },
            "required": [
                "start_date",
                "fmt",
                "cron_schedule"
            ],
            "title": "TimeWindowPartitionsDefinitionModel",
            "type": "object"
        },
        "WeeklyPartitionsDefinitionModel": {
            "additionalProperties": false,
            "properties": {
                "type": {
                    "const": "weekly",
                    "default": "weekly",
                    "enum": [
                        "weekly"
                    ],
                    "title": "Type",
                    "type": "string"
                },
                "start_date": {
                    "title": "Start Date",
                    "type": "string"
                },
                "end_date": {
                    "anyOf": [
                        {
                            "type": "string"
                        },
                        {
                            "type": "null"
                        }
                    ],
                    "default": null,
                    "title": "End Date"
                },
                "timezone": {
                    "anyOf": [
                        {
                            "type": "string"
                        },
                        {
                            "type": "null"
                        }
                    ],
                    "default": null,
                    "title": "Timezone"
                },
                "minute_offset": {
                    "default": 0,
                    "title": "Minute Offset",
                    "type": "integer"
                },
                "hour_offset": {
                    "default": 0,
                    "title": "Hour Offset",
                    "type": "integer"
                },
                "day_offset": {
                    "default": 0,
                    "title": "Day Offset",
                    "type": "integer"
                }
            },
            "required": [
                "start_date"
            ],
            "title": "WeeklyPartitionsDefinitionModel",
            "type": "object"
        }
    },
    "additionalProperties": false,
    "properties": {
        "deps": {
            "anyOf": [
                {
                    "items": {
                        "type": "string"
                    },
                    "type": "array"
                },
                {
                    "type": "string"
                },
                {
                    "type": "null"
                }
            ],
            "default": null,
            "description": "The asset keys for the upstream assets that this asset depends on.",
            "examples": [
                [
                    "my_database/my_schema/upstream_table"
                ]
            ],
            "title": "Deps"
        },
        "description": {
            "anyOf": [
                {
                    "type": "string"
                },
                {
                    "type": "null"
                }
            ],
            "default": null,
            "description": "Human-readable description of the asset.",
            "examples": [
                "Refined sales data"
            ],
            "title": "Description"
        },
        "metadata": {
            "anyOf": [
                {
                    "type": "object"
                },
                {
                    "type": "string"
                }
            ],
            "default": "__DAGSTER_UNSET_DEFAULT__",
            "description": "Additional metadata for the asset.",
            "title": "Metadata"
        },
        "group_name": {
            "anyOf": [
                {
                    "type": "string"
                },
                {
                    "type": "null"
                }
            ],
            "default": null,
            "description": "Used to organize assets into groups, defaults to 'default'.",
            "examples": [
                "staging"
            ],
            "title": "Group Name"
        },
        "skippable": {
            "anyOf": [
                {
                    "type": "boolean"
                },
                {
                    "type": "null"
                },
                {
                    "type": "string"
                }
            ],
            "default": null,
            "description": "Whether this asset can be omitted during materialization, causing downstream dependencies to skip.",
            "title": "Skippable"
        },
        "code_version": {
            "anyOf": [
                {
                    "type": "string"
                },
                {
                    "type": "null"
                }
            ],
            "default": null,
            "description": "A version representing the code that produced the asset. Increment this value when the code changes.",
            "examples": [
                "3"
            ],
            "title": "Code Version"
        },
        "owners": {
            "anyOf": [
                {
                    "items": {
                        "type": "string"
                    },
                    "type": "array"
                },
                {
                    "type": "null"
                },
                {
                    "type": "string"
                }
            ],
            "default": null,
            "description": "A list of strings representing owners of the asset. Each string can be a user's email address, or a team name prefixed with `team:`, e.g. `team:finops`.",
            "examples": [
                [
                    "team:analytics",
                    "nelson@hooli.com"
                ]
            ],
            "title": "Owners"
        },
        "tags": {
            "anyOf": [
                {
                    "additionalProperties": {
                        "type": "string"
                    },
                    "type": "object"
                },
                {
                    "type": "string"
                }
            ],
            "default": "__DAGSTER_UNSET_DEFAULT__",
            "description": "Tags for filtering and organizing.",
            "examples": [
                {
                    "team": "analytics",
                    "tier": "prod"
                }
            ],
            "title": "Tags"
        },
        "kinds": {
            "anyOf": [
                {
                    "items": {
                        "type": "string"
                    },
                    "type": "array"
                },
                {
                    "type": "string"
                }
            ],
            "default": "__DAGSTER_UNSET_DEFAULT__",
            "description": "A list of strings representing the kinds of the asset. These will be made visible in the Dagster UI.",
            "examples": [
                [
                    "snowflake"
                ]
            ],
            "title": "Kinds"
        },
        "automation_condition": {
            "anyOf": [
                {
                    "type": "string"
                },
                {
                    "type": "null"
                }
            ],
            "default": null,
            "description": "The condition under which the asset will be automatically materialized.",
            "title": "Automation Condition"
        },
        "partitions_def": {
            "anyOf": [
                {
                    "$ref": "#/$defs/HourlyPartitionsDefinitionModel"
                },
                {
                    "$ref": "#/$defs/DailyPartitionsDefinitionModel"
                },
                {
                    "$ref": "#/$defs/WeeklyPartitionsDefinitionModel"
                },
                {
                    "$ref": "#/$defs/TimeWindowPartitionsDefinitionModel"
                },
                {
                    "$ref": "#/$defs/StaticPartitionsDefinitionModel"
                },
                {
                    "type": "string"
                }
            ],
            "default": null,
            "description": "The partitions definition for the asset.",
            "title": "Partitions Def"
        }
    },
    "title": "SharedAssetKwargsModel",
    "type": "object"
}
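Since the schema sets additionalProperties to false, any key outside the listed properties is rejected. The sketch below shows one way to catch a misspelled key before editing defs.yaml. It is a hypothetical helper, not part of Dagster: the name unknown_attributes is an assumption, and the field names are copied by hand from the top-level properties of the schema above, so they may drift from the schema in your installed version.

```python
# Top-level property names copied from the schema above.
ALLOWED_ATTRIBUTES = frozenset({
    "deps",
    "description",
    "metadata",
    "group_name",
    "skippable",
    "code_version",
    "owners",
    "tags",
    "kinds",
    "automation_condition",
    "partitions_def",
})


def unknown_attributes(attributes: dict) -> list[str]:
    """Return the keys that are not top-level properties of the schema, sorted."""
    return sorted(set(attributes) - ALLOWED_ATTRIBUTES)


print(unknown_attributes({"group_name": "foo_group", "kinds": ["some_kind"]}))
print(unknown_attributes({"group": "foo_group"}))
```

The first call prints an empty list because both keys appear in the schema; the second reports group, which is not a schema property (the schema's field is group_name). This check covers key names only, not value types.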
Let's change our defs.yaml to assign foo and bar to different groups. To
do this, we'll add two more post-processors, each with a separate target
argument to specify which assets it applies to:
type: dagster.DefsFolderComponent
post_processing:
  assets:
    - attributes:
        kinds:
        - "some_kind"
    - target: foo
      attributes:
        group_name: "foo_group"
    - target: bar
      attributes:
        group_name: "bar_group"
The target field supports the full Dagster asset selection syntax.
Now if we run dg list defs again, we can see that the assets are in different
groups:
dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ━━━━━━━┓
┃ Section ┃ Definitions                                          ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets  │ ┏━━━━━┳━━━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━┓ │
│         │ ┃ Key ┃ Group     ┃ Deps ┃ Kinds     ┃ Description ┃ │
│         │ ┡━━━━━╇━━━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━┩ │
│         │ │ bar │ bar_group │      │ some_kind │             │ │
│         │ ├─────┼───────────┼──────┼───────────┼─────────────┤ │
│         │ │ foo │ foo_group │      │ some_kind │             │ │
│         │ └─────┴───────────┴──────┴───────────┴─────────────┘ │
└─────────┴──────────────────────────────────────────────────────┘
There are many other possibilities. Post-processors are a flexible and convenient way to specify the properties of large sets of assets.