SDF as a best-in-class transformation and authoring layer for Dagster Orchestration
Set up your environment
Install dagster and the sdf-cli package inside a Python virtualenv. The dagster-sdf library installs both sdf-cli and dagster as Python dependencies. If you're starting from scratch, this makes the sdf CLI available to you, which you can use to interact with your SDF workspace.

To validate that you've installed the packages correctly, check the installed version of each tool and confirm that the output matches:

sdf 0.10.8
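The validation step above can also be done from Python. The following is a minimal sketch, assuming the distributions are published under the names dagster, dagster-sdf and sdf-cli; the helper names installed_versions and sdf_on_path are hypothetical and not part of either library.

```python
import shutil
from importlib import metadata


def installed_versions(packages=("dagster", "dagster-sdf", "sdf-cli")):
    """Return {distribution name: version string or None} for each package."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


def sdf_on_path():
    """Return the resolved path of the sdf executable, or None if absent."""
    return shutil.which("sdf")


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
    print(f"sdf executable: {sdf_on_path() or 'not found on PATH'}")
```

Running the script inside the virtualenv lists each package with its version, so a missing or mismatched install shows up before you scaffold a project.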
Initialize your SDF Workspace with Dagster
Use the dagster-sdf CLI to create a new Dagster project that references an existing SDF workspace. The dagster-sdf CLI pre-generates a scaffolded Dagster project with the necessary configuration to interface with your SDF workspace.

Initializing your project takes a single dagster-sdf command, which accepts the following options:

--project-name - [required] The name of the Dagster project to be created. This will be the name of the directory containing the project files.
--sdf-workspace-dir - [optional] The path to an existing SDF workspace, which should contain a valid workspace.sdf.yml file. Optional if executing from the root of an SDF workspace.

This creates a directory with the project-name specified above, containing a set of files that define a Dagster project (e.g. assets.py, definitions.py) and configuration on how to interface with your SDF workspace.

Understanding your Dagster project
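To get oriented in the scaffolded directory, a short script can confirm that the files named above are present. This is a minimal sketch: only assets.py and definitions.py are taken from the text, the search is recursive because the exact directory layout is not assumed, and the helper names are hypothetical.

```python
from pathlib import Path

# File names mentioned in the text; the scaffold may contain others.
EXPECTED_FILES = ("assets.py", "definitions.py")


def find_project_files(project_dir, expected=EXPECTED_FILES):
    """Map each expected file name to the sorted paths where it was found."""
    root = Path(project_dir)
    return {name: sorted(root.rglob(name)) for name in expected}


def missing_files(project_dir, expected=EXPECTED_FILES):
    """Return the expected file names that were not found anywhere."""
    found = find_project_files(project_dir, expected)
    return [name for name, paths in found.items() if not paths]
```

Calling missing_files with the directory created by the scaffold should return an empty list if both files were generated.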
The scaffolded project relies on the dagster-sdf package installed in Step 1. We'll briefly break down the key components here:

The project uses the @sdf_assets decorator to materialize your SDF workspace by running a command with the sdf-cli. It can be used to run arbitrary SDF commands, but is tested to work with the compile, run and test commands.

Start the Dagster Web Server
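Dagster's local web server is started with the dagster dev command. The sketch below shows one way to launch it from Python, assuming the scaffolded project can be loaded by dagster dev from its root directory. The helper names, the host and the port are illustrative choices, not something the scaffold prescribes.

```python
import shutil
import subprocess


def dagster_dev_command(host="127.0.0.1", port=3000):
    """Build the argv for `dagster dev` with an explicit host and port."""
    return ["dagster", "dev", "--host", host, "--port", str(port)]


def start_web_server(project_dir, host="127.0.0.1", port=3000):
    """Launch `dagster dev` in project_dir and return the process handle."""
    if shutil.which("dagster") is None:
        raise FileNotFoundError("dagster executable not found on PATH")
    # cwd is an assumption: the project is expected to load from its root.
    return subprocess.Popen(dagster_dev_command(host, port), cwd=project_dir)
```

Running the same command directly in a terminal from the project directory is equivalent. The Python wrapper is mainly useful when the server is started as part of a larger script.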
Materialize your project in the Dagster UI
Click the Materialize all button on the asset lineage view. This will materialize all assets in the workspace, including your SDF assets.

Optional: Defining upstream dependencies
Suppose a table named upstream_dagster_table is not defined in SDF but is materialized in Dagster. You can declare upstream_dagster_table as a Dagster source in your SDF workspace by specifying a dagster-asset-key in the meta section of the table definition. This indicates that upstream_dagster_table is a Dagster asset that needs to be materialized before the SDF workspace can be run.

To query upstream_dagster_table once it has been materialized by Dagster, you will need to define an integration in your SDF workspace that can read from the source table. See Integration Docs for more information.
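The dependency rule described above can be illustrated with plain Python. This is a sketch of the idea only, not how dagster-sdf resolves dependencies. The TABLES structure is a hypothetical stand-in for parsed table definitions, downstream_sdf_table is an invented example name, and both helper functions are made up for illustration. Only the dagster-asset-key entry under meta comes from the text.

```python
# Hypothetical parsed table definitions; real ones live in the SDF workspace.
TABLES = [
    {
        "name": "upstream_dagster_table",
        "meta": {"dagster-asset-key": "upstream_dagster_table"},
    },
    {"name": "downstream_sdf_table", "meta": {}},
]


def dagster_sources(tables):
    """Map table name -> Dagster asset key for tables that declare one."""
    sources = {}
    for table in tables:
        key = table.get("meta", {}).get("dagster-asset-key")
        if key is not None:
            sources[table["name"]] = key
    return sources


def ready_to_run(tables, materialized_keys):
    """True when every declared Dagster source has been materialized."""
    return set(dagster_sources(tables).values()) <= set(materialized_keys)
```

With these definitions, ready_to_run(TABLES, []) is False because upstream_dagster_table has not been materialized yet, which mirrors the ordering that the dagster-asset-key entry expresses.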