SDF’s support for compilation and local execution of SQL dialects is not covered by this document. Please see the Features page for more information on supported dialects.

SDF integrates with external systems in three main categories:
- Databases (i.e. data warehouses). Think Snowflake, Redshift, BigQuery, etc.
- Data Sources. Think S3, GCS, and Azure Blob Storage.
- Metadata Sources. Think Iceberg or AWS Glue.
These categories are distinct, but they are not mutually exclusive: a database or data warehouse can also act as a data source and a metadata source. What makes databases unique is that queries authored with SDF can actually be executed on them.
## Databases

Databases are the most common type of integration. They are the primary target for queries authored with SDF, and they are unique in that they can act as all three integration types: database, data source, and metadata source. Currently, SDF supports the following databases:

| Feature | Snowflake | Redshift | BigQuery |
|---|---|---|---|
| Metadata Source | 🟢 | 🟢 | 🟢 |
| Data Source | 🟢 | 🔴 | 🟢 |
| Materialization | 🟢 | 🔴 | 🟢 |
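As a rough illustration, a database integration is typically declared in the workspace configuration file. The sketch below is a hypothetical example only: the `integrations` block, the `provider`/`type` fields, and the table patterns are assumptions, so consult the provider-specific guides for the exact schema.

```yaml
# workspace.sdf.yml -- illustrative sketch only; field names are assumptions,
# not the authoritative schema. See the Snowflake integration guide for details.
workspace:
  name: my_workspace
  edition: "1.3"
  includes:
    - path: models
  integrations:
    - provider: snowflake        # the warehouse SDF compiles against and materializes to
      type: database             # can also act as a data source and a metadata source
      sources:
        - pattern: analytics.raw.*       # hypothetical: remote tables SDF reads from
      targets:
        - pattern: analytics.staging.*   # hypothetical: tables SDF materializes
```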
## Data Sources

Data sources are used to read data into SDF from external sources. The most common use case for this is pulling data down for local execution with the SDF DB. Currently, SDF supports the following data sources:

- S3
- Snowflake
- BigQuery
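For illustration, an S3 data source might sit alongside the database integration so that remote files can be pulled down and queried locally. This is a hypothetical sketch; the field names and bucket pattern are assumptions rather than the documented schema.

```yaml
# workspace.sdf.yml -- illustrative sketch only; field names are assumptions.
workspace:
  name: my_workspace
  edition: "1.3"
  integrations:
    - provider: s3
      type: data                 # data source: remote files are read in for local execution
      sources:
        - pattern: s3://my-bucket/events/*.parquet   # hypothetical bucket and path
```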
## Metadata Sources

Metadata sources are used to pull in metadata about the tables in your data warehouse. This metadata powers SDF compilation by providing the table schemas needed for compilation and type checking. Currently, SDF supports the following metadata sources:

- Apache Iceberg
- AWS Glue
- Snowflake
- Redshift
- BigQuery
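As a sketch, a metadata source such as AWS Glue could be declared so that table schemas are available at compile time. The field names below are assumptions; see the Glue integration guide for the actual configuration.

```yaml
# workspace.sdf.yml -- illustrative sketch only; field names are assumptions.
workspace:
  name: my_workspace
  edition: "1.3"
  integrations:
    - provider: glue
      type: metadata             # metadata source: schemas fetched for compilation and type checking
      sources:
        - pattern: my_catalog.my_db.*   # hypothetical Glue catalog/database pattern
```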
## Others

Outside of the three main categories used to enable SDF query compilation and materialization, SDF supports the following integrations for bespoke use cases:

- GitHub - SDF offers an official open source GitHub Action for running SDF in CI/CD workflows (see the sketch after this list).
- DBT - SDF can provide static impact analysis, column-level lineage, data classification / governance, and more alongside DBT projects.
- Databricks - SDF can ingest and compile Spark Logical Plans from Databricks Spark clusters to power column-level lineage and data classification.
- Dagster - SDF workspaces can be orchestrated with Dagster for better scheduling, monitoring, and execution of data workflows.
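For instance, a CI workflow built on the GitHub Action might look roughly like the following. The action reference, input name, and command are assumptions for illustration; check the official SDF GitHub Action repository for the exact usage.

```yaml
# .github/workflows/sdf.yml -- hypothetical sketch; the action reference and its
# inputs are assumptions, not the documented interface of the official action.
name: sdf-ci
on:
  pull_request:

jobs:
  compile:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Compile the SDF workspace
        uses: sdf-labs/sdf-action@v0      # hypothetical action reference
        with:
          command: sdf compile            # hypothetical input name
```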