SDF’s static analyis can supercharge CI/CD workflows for data teams. By compiling your SQL on each pull request, SDF can statically ensure new updates to your models are valid SQL and do not break anything downstream. Additionally, SDF supports the use of checks in CI/CD. This feature allows users to define their own static tests, further enhancing the robustness and reliability of your data models. Unlike other CI/CD data tools, SDF checks everything statically, preventing you from having to incur compute costs to validate SQL and run impact analysis.

This guide will walk you through how to integrate SDF into an example CI/CD workflow. Specifically, we’ll be using the official SDF GitHub Action to run sdf compile and sdf test on new pull requests and subsequent updates. This guide assumes you have a basic understanding of GitHub Actions and how to create a workflow file. If you’re new to GitHub Actions, we recommend checking out the official documentation.

If you’d like to skip ahead and see the final result, you can find the example repository here.

Prerequisites

There are two open source GitHub repositories we’ll be using today, to follow along, fork the starting repository and clone it to your local machine.

  • A GitHub account
  • A GitHub repository cloned locally that can be used for testing. See this guide for how to set this up.

Guide

1

Create an Example Workspace

To start off, we’ll create a new SDF workspace. We’ll skaffold our GitHub action around this workspace. Today, we’ll be using the pii_saas_platform example. You can create a new workspace by running the following command:

    sdf new --sample pii_saas_platform

This command should be run within the Git repository you’d like to test with. We recommend using a fresh repository for this example

2

Run Compile

Next, let’s confirm our workspace is working as expected by running sdf compile. This command will compile all of the SQL in your workspace and ensure it is valid. If you have any errors in your SQL, this command will fail.

    sdf compile
3

Add the GitHub Action

Now that we have a working workspace, let’s add a GitHub action to run sdf compile on each pull request. We’ll need to add a workflow file to our workspace that utilizes the SDF GitHub Action to do this.

Add the following yml to a file called sdf.yml in the .github/workflows directory of your repository. This file will run the trigger the SDF GitHub action on each pull request.

.github/workflows/sdf.yml
name: On Pull Request Run Compile

on:
    push:
        branches:
        - main
    pull_request:

jobs:
    compile_sdf:
        runs-on: ubuntu-latest
        name: Run sdf compile
        steps:
          - name: Checkout
            uses: actions/checkout@v4
          - name: sdf compile step
            uses: sdf-labs/sdf-action@v0
            id: sdf
            with:
                command: 'sdf compile'
                workspace_dir: '.' # Path to your SDF workspace
                # Add your SDF credentials if you intend to run `sdf push` 
                access_key: ${{ secrets.ACCESS_KEY }}
                secret_key: ${{ secrets.SECRET_KEY }}
                # (Optional) Add your compute provider credentials and other env vars here
                snowflake_account_id: ${{ secrets.SNOWFLAKE_ACCOUNT_ID }}
                snowflake_username: ${{ secrets.SNOWFLAKE_USERNAME }}
                snowflake_password: ${{ secrets.SNOWFLAKE_PASSWORD }}
                snowflake_role: ${{ secrets.SNOWFLAKE_ROLE }}
                snowflake_warehouse: ${{ secrets.SNOWFLAKE_WH }}
                # Whether or not to run `sdf dbt refresh` and other DBT-specific config
                is_dbt: "false"

          # Use the output from the `sdf` step
          - name: Display the sdf output
            if: always()
            run: |
                echo "### SDF Run Logs 🪵" >> $GITHUB_STEP_SUMMARY
                echo '```' >>$GITHUB_STEP_SUMMARY
                echo "${{ steps.sdf.outputs.log }}" >>$GITHUB_STEP_SUMMARY
                echo '```' >>$GITHUB_STEP_SUMMARY

As you can see, we’re specifying sdf-labs/sdf-action@v0 as the action to use. Version v0 gets the latest version under 0.*.*. If you’d like to use a specific version, you can specify it in the uses field by specifying the entire semantic version (e.g. sdf-labs/sdf-action@v0.1.0).

For documentation on each of the allowed inputs, see the README in the SDF GitHub Action repository.

The last step in the workflow is to pretty print the output from the SDF command. This is optional, but will result in the output seen in the screenshot below on the workflow summary page.

4

(Optional) Use the Action with DBT

If you’re not using DBT alongside SDF, you can skip step. Otherwise, you’ll need to ensure DBT compile is run before SDF compile. This is because SDF relies on the compiled DBT models to run its static analysis on a DBT project.

We recommend accomplishing the dbt compile using this GitHub Action.

Furthermore, if you don’t have the DBT profiles.yml configured in your DBT project (i.e. adjacent to your dbt_project.yml), you’ll need to write the profiles.yml file in the GitHub action. Use the modified sdf.yml workflow file (called sdf_dbt.yml) below to accomplish this. Note the example below is for a Snowflake configuration.

The snippet below for writing out a profiles.yml was contributed by Chris Hronek. Thanks Chris!

.github/workflows/sdf_dbt.yml
    name: On Pull Request Run Compile

    on:
        push:
            branches:
            - main
        pull_request:

    jobs:
        compile_dbt_then_sdf:
            runs-on: ubuntu-latest
            name: Run sdf compile
            steps:
              - name: Checkout
                uses: actions/checkout@v4
              - name: Generate dbt profile
                run: |
                echo "
                [target-name]:
                    outputs:
                    prod:
                        type: snowflake
                        account: ${{ secrets.SNOWFLAKE_ACCOUNT_ID }}
                        user: ${{ secrets.SNOWFLAKE_USERNAME }}
                        password: ${{ secrets.SNOWFLAKE_PASSWORD }}
                        role: ${{ secrets.SNOWFLAKE_ROLE }}
                        database: ${{ secrets.SNOWFLAKE_DB }} 
                        warehouse: ${{ secrets.SNOWFLAKE_WH }}
                        schema: ${{ secrets.SNOWFLAKE_SCHEMA }}
                        threads: 1
                        client_session_keep_alive: False
                        query_tag: GITHUB_ACTIONS
                    target: prod
                " > [relative-path-to-your-dbt-project]/profiles.yml  
              - name: dbt compile step
                uses: mwhitaker/dbt-action@master
                with:
                    dbt_command: "dbt compile --profiles-dir . --target prod"
                    dbt_project_folder: "my_dbt_project"
              - name: sdf compile step
                uses: sdf-labs/sdf-action@v0
                id: sdf
                with:
                    command: 'sdf compile'
                    is_dbt: true
                    workspace_dir: './my_dbt_project' # Path to your SDF workspace
                    # Add your SDF credentials if you intend to run `sdf push` 
                    access_key: ${{ secrets.ACCESS_KEY }}
                    secret_key: ${{ secrets.SECRET_KEY }}
                    # (Optional) Add your compute provider credentials and other env vars here
                    snowflake_account_id: ${{ secrets.SNOWFLAKE_ACCOUNT_ID }}
                    snowflake_username: ${{ secrets.SNOWFLAKE_USERNAME }}
                    snowflake_password: ${{ secrets.SNOWFLAKE_PASSWORD }}
                    snowflake_role: ${{ secrets.SNOWFLAKE_ROLE }}
                    snowflake_warehouse: ${{ secrets.SNOWFLAKE_WH }}

              # Use the output from the `sdf` step
              - name: Display the sdf output
                if: always()
                run: |
                    echo "### SDF Compile Logs 🪵" >> $GITHUB_STEP_SUMMARY
                    echo '```' >>$GITHUB_STEP_SUMMARY
                    echo "${{ steps.sdf.outputs.log }}" >>$GITHUB_STEP_SUMMARY
                    echo '```' >>$GITHUB_STEP_SUMMARY

Notice that, critically, is_dbt is set to true in the sdf compile step. Furthermore, we are specifying the workspace_dir to be the path to the DBT project, since the SDF workspace must be defined adjacent to the DBT project in order to function properly.

Under the hood, the GitHub action runs sdf dbt refresh and looks for models in the target/compiled/models/sdf directory for compilation when is_dbt is set to true.

5

Create a Pull Request on GitHub

Now, we’ll commit out changes to a new branch and push them to GitHub to see our new action in action. Let’s call this branch add-sdf-action.

git checkout -b add-sdf-action && \
git add . && \
git commit -m "Add SDF workspace and GitHub Action" && \
git push -u origin add-sdf-action

This will add all staged changes to the new branch. If there is anything you don’t want to commit, make sure to unstage or stash it before running git add .

Now that we have a new branch in our remote repository, let’s make a pull request on GitHub. Follow this official guide from GitHub to do so, and make sure to select our sdf-add-action branch as the branch to merge into main.

Once the pull request is created, you should see our new GitHub action start running and succeed. Congrats! You’ve successfully integrated SDF into your CI/CD workflow.

6

Test a Breaking Change (Optional)

If you’d like to see the action fail, you can make a breaking change to your SQL and push it to the add-sdf-action branch. This will trigger the GitHub action to run again, and it should fail.

To demonstrate SDF’s static impact analysis, let’s remove a column that’s being used in a downstream model. This will cause the sdf compile command to fail, and in a perfect world, would prevent us from merging this update into our main branch. To do this, remove line 7 from the file found at ddls/payment/public/invoices.sql. This will remove the payer_user_id column from the invoices DDL.

After committing the change, push it to the branch with an open pull request. You should see the GitHub action fail, and the pull request will not be mergeable if the repository is configured in this way.

Here’s an example job summary of an SDF compilation failure in CI/CD

Summary

In this guide, we walked through how to integrate SDF into a CI/CD workflow using GitHub Actions. We created a new SDF workspace, added a GitHub action to run sdf compile on each pull request, and tested the action by making a breaking change to our SQL. By integrating SDF into your CI/CD workflow, you can ensure that your SQL is always valid and downstream models are compliant with your new changes. This can help prevent costly errors and ensure that your data is always reliable.

For a completed example of this guide, check out the example repository