Overview
Do you have a need for speed? Is the Rust performance of SDF not enough for you? Do you want to compile your SDF project without running any queries against your database? If you answered yes to any of these questions, then you’re in the right place! This guide shows how to do exactly that. The key is hydrating your SDF workspace locally with schemas for the remote sources.
Architecture
The architecture of a locally compilable SDF workspace is similar to a standard SDF workspace. The primary difference is that the locally compilable workspace holds a local copy of the schemas for the remote sources, and SDF compiles against this copy instead of querying the database. We recommend storing these schemas in a directory called sources in the root of your workspace. For example, for the source table raw_customers under financials, in a schema public, the schema file would optimally sit at sources/financials/public/raw_customers.sdf.yml.
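The layout can be sketched with a few shell commands. This is a minimal illustration, assuming the commands run from the workspace root and that the path segments follow the financials/public/raw_customers.sdf.yml example used in this guide; the file is created empty here as a placeholder.

```shell
# Run from the root of the SDF workspace (the location is an assumption).
# Create nested directories for the financials source and its public schema.
mkdir -p sources/financials/public

# Create an empty placeholder for the raw_customers schema file.
touch sources/financials/public/raw_customers.sdf.yml

# List every file under sources/ to confirm the layout.
find sources -type f
```

Keeping one file per source table makes it easy to see which remote tables the workspace can compile against without a connection.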
Source Declarations
To compile locally, SDF needs the column names and datatypes of the tables our queries pull data from (i.e. sources). These can be declared as Local Schema Files: create one YML file per source table in the sources directory, containing the column names and datatypes for that table. (example below)
Let’s imagine we have a source table called raw_customers (as seen above) with columns like customerid and name. An example of a schema for this source table might look like:
financials/public/raw_customers.sdf.yml
- The origin field is set to remote to indicate that this is a remote source. This is critical, as SDF will still fetch the remote schema unless this attribute is set.
- The columns field contains a list of column objects, each with a name and datatype field. The datatype field must be a valid SQL datatype that corresponds to the column’s datatype in the remote source.
- Other metadata like classifiers and descriptions can be added alongside the datatype. These will propagate downstream using our column-level lineage.
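The fields described above could be written to disk as follows. This is a hedged sketch, not the authoritative file format: the origin, columns, name, and datatype fields come from this guide, while the top-level table key, the fully qualified table name, and the int and varchar datatypes are assumptions chosen for illustration. Check the SDF schema reference for the exact keys.

```shell
# Run from the workspace root; the directory layout follows the example path.
mkdir -p sources/financials/public

# Write the schema file. The top-level `table:` key, the fully qualified name,
# and the int/varchar datatypes are assumptions for illustration only.
cat > sources/financials/public/raw_customers.sdf.yml <<'EOF'
table:
  name: financials.public.raw_customers
  origin: remote   # marks the table as a remote source
  columns:
    - name: customerid
      datatype: int
    - name: name
      datatype: varchar
EOF

# Print the file to confirm what was written.
cat sources/financials/public/raw_customers.sdf.yml
```

The datatypes should be replaced with whatever the remote table actually uses, since they must correspond to the remote column types.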
Incremental Models and Snapshots
When SDF compiles incremental models and snapshots, the compilation results often depend on the incremental or snapshot mode. By default, SDF sets this mode by checking whether the model exists in the remote database. To compile incremental models and snapshots entirely locally, we therefore need to override this default behavior. This can be accomplished by passing the flag --prefer-local to the sdf compile command. This flag forces SDF to compile the model entirely locally, without checking the remote database.
The --prefer-local flag works by simply setting the incremental mode and snapshot mode variables to true during compilation, thereby replacing the need to check the remote database.
However, let’s say we wanted to compile these models locally but with incremental and snapshot mode off. We can do this by passing extra parameters to the sdf compile command, specifically --no-incremental-mode and --no-snapshot-mode.
Here are four examples and their expected results. In every case, no incremental model or snapshot requires a database request to compile:
- sdf compile --prefer-local: compiles with incremental mode set to true and snapshot mode set to true.
- sdf compile --prefer-local --no-incremental-mode: compiles with incremental mode set to false and snapshot mode set to true.
- sdf compile --prefer-local --no-snapshot-mode: compiles with incremental mode set to true and snapshot mode set to false.
- sdf compile --prefer-local --no-incremental-mode --no-snapshot-mode: compiles with incremental mode set to false and snapshot mode set to false.
--no-incremental-mode and --no-snapshot-mode will not work to compile locally without --prefer-local. --prefer-local is required to prevent a request to the remote database.
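The four invocations can be collected in a small script. The flags are the ones described in this guide; the run_sdf wrapper is a hypothetical helper added here so the script prints each command as a dry run when the sdf binary is not installed.

```shell
# Hypothetical wrapper: run sdf if it is on PATH, otherwise print the command.
run_sdf() {
  if command -v sdf >/dev/null 2>&1; then
    sdf "$@"
  else
    echo "would run: sdf $*"
  fi
}

# Incremental mode true, snapshot mode true.
run_sdf compile --prefer-local

# Incremental mode false, snapshot mode true.
run_sdf compile --prefer-local --no-incremental-mode

# Incremental mode true, snapshot mode false.
run_sdf compile --prefer-local --no-snapshot-mode

# Incremental mode false, snapshot mode false.
run_sdf compile --prefer-local --no-incremental-mode --no-snapshot-mode
```

Every call includes --prefer-local, since the two --no-* flags alone do not prevent the request to the remote database.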