Environments
SDF Environments were built for teams to help them collaborate, isolate, and maintain development, staging, and production environments.
Overview
Environments are a way to separate your workspace into different contexts, each with its own set of configurations, credentials, and dependencies. Imagine a local developer wants to materialize a table in their Snowflake warehouse, but they don’t want to overwrite the production version of the table they are updating. They can use an environment to materialize the table to a sandbox environment locally, then materialize to the production environment after deployment without changes to the code.
Prerequisites
Before beginning to understand the following examples, you should have an understanding of the following concepts:
These are all foundational concepts that are used in the examples below.
Example
Let’s start with an example of a basic development environment. A common pattern is to prefix developer versions of databases with sandbox_
.
For sake of simplicity, we’ll use a local provider in this guide, but the same principles apply to remote providers.
Setup Moms Flower Shop
First, let’s create a new workspace with the moms_flower_shop
sample.
Add an Environment
Next, let’s add a new environment to the workspace. This will use the SDF provider for local compute.
In this example, we’re creating a new environment called dev
. This environment will use the local
provider to hydrate tables from the prod_db
database.
We’re also specifying that any tables materialized to the prod_db
database should, if run with this environment, materialize to the sandbox_prod_db
.
Compile with the Environment
To compile a workspace with the dev
environment, you can run:
or
Notice the renamed tables in the output! Specifically, the (aka <renamed-table-name>
after each compiler output).
Run with the Environment
To run a workspace with the dev
environment, you can run:
or
The executed tables have been materialized to the renamed fully qualified name.
(Optional) Test on a Remote Provider
The same principles apply to remote providers as seen below:
In this example, we’re creating a new environment called dev
. This environment will use the snowflake
provider to hydrate tables from the prod_db
database.
We’re also specifying that any tables materialized to the prod_db
database should, if run with this environment, materialize to the sandbox_prod_db
.
We’re using the local_dev
credential to authenticate with Snowflake when sdf compile
or sdf run
is run with the dev
environment. This configuration allows for environments to have different credentials for different providers.
You’ll notice in the below example that the schema and table names are being passed through as Jinja variables.
This is because the rename-as
field is a Jinja template when used with *
. In this case, we’re renaming the
schema to sandbox_prod_db
and keeping the schema and table names the same using ${1}
and ${2}
respectively.
More about renaming in the next section.
Renaming
In the above example, we used a simple renaming pattern to rename the database. However, you can use more complex renaming patterns to rename the schema and table names. For example, let’s say we want to prepend the developer name to all schemas when running in the dev
environment. We can do this by using the following configuration:
Cool! But there’s a lot going on here. Let’s break it down:
*.*.*
is a wildcard pattern that matches any schema and table name.${1}
is the first match group from the source pattern. In this case, it’s the database name. Note that in the previous example, our first*
represented the schema name since it came after the database name like<database-name>.*
.sandbox_{{ env_var('developer_name') }}_${2}.${3}
is the new schema and table name. We’re prependingsandbox_
to the schema name, then using theenv_var
function to get the developer’s name from the environment variables. Finally, we’re appending the table name to the end with${3}
since that’s the third match group from the source pattern.
When using the env_var
jinja function for environments and other SDF yml use cases, a common patter is to use a .env
file to store environment variables. This way, you can keep sensitive information out of your codebase.
Overwriting
Environments allow you to overwrite any configurations from your workspace. For example, let’s say you want to use a different database when running in the dev
environment. You can do this by adding the following configuration:
In this example, we’re overwriting the default catalog
, schema
, and materialization
configurations in the workspace. When running in the dev
environment, SDF will use the sandbox_catalog
catalog, the sandbox_<developer_name>
schema, and materialize tables as views
.
The environment block is like a workspace block. If any defaults or includes blocks are overwritten, none of the defaults set on the workspace level will be preserved. There is no merging! If it is not redefined, it takes the value from the workspace block.
Default Environment
You can also set a default environment in your workspace. This is useful when you want to run SDF with a specific environment without having to specify it each time. To set a default environment, you can add the following configuration:
Next Steps
For a more active example, follow along with the materialization guide specific to your data warehouse: