> ## Documentation Index
> Fetch the complete documentation index at: https://docs.sdf.com/llms.txt
> Use this file to discover all available pages before exploring further.

# GitHub

> SDF is a powerful tool on its own, but its utility is amplified when integrated into CI/CD workflows.

SDF's static analyis can supercharge CI/CD workflows for data teams. By compiling your SQL on each pull request, SDF can statically ensure new updates to your models are valid SQL and do not break anything downstream. Additionally, SDF supports the use of [checks](/guide/data-quality/checks) in CI/CD. This feature allows users to define their own static tests, further enhancing the robustness and reliability of your data models. Unlike other CI/CD data tools, SDF checks everything statically, preventing you from having to incur compute costs to validate SQL and run impact analysis.

This guide will walk you through how to integrate SDF into an example CI/CD workflow. Specifically, we'll be using the official [SDF GitHub Action](https://github.com/sdf-labs/sdf-action) to run `sdf compile` and `sdf test` on new pull requests and subsequent updates. This guide assumes you have a basic understanding of GitHub Actions and how to create a workflow file. If you're new to GitHub Actions, we recommend checking out the [official documentation](https://docs.github.com/en/actions).

If you'd like to skip ahead and see the final result, you can find the example repository [here](https://github.com/sdf-labs/github-action-example).

## Prerequisites

There are two open source GitHub repositories we'll be using today, to follow along, fork the starting repository and clone it to your local machine.

* A GitHub account
* A GitHub repository cloned locally that can be used for testing. See [this guide](https://docs.github.com/en/repositories/creating-and-managing-repositories/quickstart-for-repositories) for how to set this up.

## Guide

<Steps>
  <Step title="Create an Example Workspace">
    To start off, we'll create a new SDF workspace. We'll skaffold our GitHub action around this workspace. Today, we'll be using the `pii_saas_platform` example. You can create a new workspace by running the following command:

    ```shell theme={null}
        sdf new --sample pii_saas_platform
    ```

    <Note>
      This command should be run within the Git repository you'd like to test with. We recommend using a fresh repository for this example
    </Note>
  </Step>

  <Step title="Run Compile">
    Next, let's confirm our workspace is working as expected by running `sdf compile`. This command will compile all of the SQL in your workspace and ensure it is valid. If you have any errors in your SQL, this command will fail.

    ```shell theme={null}
        sdf compile
    ```
  </Step>

  <Step title="Add the GitHub Action">
    Now that we have a working workspace, let's add a GitHub action to run `sdf compile` on each pull request. We'll need to add a **workflow file** to our workspace that utilizes the SDF GitHub Action to do this.

    Add the following `yml` to a file called `sdf.yml` in the `.github/workflows` directory of your repository. This file will run the trigger the SDF GitHub action on each pull request.

    ````yaml .github/workflows/sdf.yml theme={null}
    name: On Pull Request Run Compile

    on:
        push:
            branches:
            - main
        pull_request:

    jobs:
        compile_sdf:
            runs-on: ubuntu-latest
            name: Run sdf compile
            steps:
              - name: Checkout
                uses: actions/checkout@v4
              - name: sdf compile step
                uses: sdf-labs/sdf-action@v0
                id: sdf
                with:
                    command: 'sdf compile'
                    workspace_dir: '.' # Path to your SDF workspace
                    # Add your SDF credentials if you intend to run `sdf push` 
                    access_key: ${{ secrets.ACCESS_KEY }}
                    secret_key: ${{ secrets.SECRET_KEY }}
                    # (Optional) Add your compute provider credentials and other env vars here
                    snowflake_account_id: ${{ secrets.SNOWFLAKE_ACCOUNT_ID }}
                    snowflake_username: ${{ secrets.SNOWFLAKE_USERNAME }}
                    snowflake_password: ${{ secrets.SNOWFLAKE_PASSWORD }}
                    snowflake_role: ${{ secrets.SNOWFLAKE_ROLE }}
                    snowflake_warehouse: ${{ secrets.SNOWFLAKE_WH }}
                    # Whether or not to run `sdf dbt refresh` and other DBT-specific config
                    is_dbt: "false"

              # Use the output from the `sdf` step
              - name: Display the sdf output
                if: always()
                run: |
                    echo "### SDF Run Logs 🪵" >> $GITHUB_STEP_SUMMARY
                    echo '```' >>$GITHUB_STEP_SUMMARY
                    echo "${{ steps.sdf.outputs.log }}" >>$GITHUB_STEP_SUMMARY
                    echo '```' >>$GITHUB_STEP_SUMMARY
    ````

    As you can see, we're specifying `sdf-labs/sdf-action@v0` as the action to use. Version `v0` gets the latest version under `0.*.*`. If you'd like to use a specific version, you can specify it in the `uses` field by specifying the entire semantic version (e.g. `sdf-labs/sdf-action@v0.1.0`).

    For documentation on each of the allowed inputs, see the README in the [SDF GitHub Action repository](https://github.com/sdf-labs/sdf-action).

    The last step in the workflow is to pretty print the output from the SDF command. This is optional, but will result in the output seen in the screenshot below on the workflow summary page.

    <img src="https://cdn.sdf.com/img/github-action-summary-output.png" />
  </Step>

  <Step title="(Optional) Use the Action with DBT">
    If you're not using DBT alongside SDF, you can skip step. Otherwise, you'll need to ensure DBT compile is run before SDF compile. This is because SDF relies on the compiled DBT models to run its static analysis on a DBT project.

    We recommend accomplishing the `dbt compile` using this [GitHub Action](https://github.com/marketplace/actions/dbt-action).

    Furthermore, if you don't have the DBT `profiles.yml` configured in your DBT project (i.e. adjacent to your `dbt_project.yml`), you'll need to write the `profiles.yml` file in the GitHub action. Use the modified `sdf.yml` workflow file (called `sdf_dbt.yml`) below to accomplish this.
    Note the example below is for a Snowflake configuration.

    <Check>
      The snippet below for writing out a `profiles.yml` was contributed by [Chris Hronek](https://github.com/chrishronek). Thanks Chris!
    </Check>

    ````yaml .github/workflows/sdf_dbt.yml theme={null}
        name: On Pull Request Run Compile

        on:
            push:
                branches:
                - main
            pull_request:

        jobs:
            compile_dbt_then_sdf:
                runs-on: ubuntu-latest
                name: Run sdf compile
                steps:
                  - name: Checkout
                    uses: actions/checkout@v4
                  - name: Generate dbt profile
                    run: |
                    echo "
                    [target-name]:
                        outputs:
                        prod:
                            type: snowflake
                            account: ${{ secrets.SNOWFLAKE_ACCOUNT_ID }}
                            user: ${{ secrets.SNOWFLAKE_USERNAME }}
                            password: ${{ secrets.SNOWFLAKE_PASSWORD }}
                            role: ${{ secrets.SNOWFLAKE_ROLE }}
                            database: ${{ secrets.SNOWFLAKE_DB }} 
                            warehouse: ${{ secrets.SNOWFLAKE_WH }}
                            schema: ${{ secrets.SNOWFLAKE_SCHEMA }}
                            threads: 1
                            client_session_keep_alive: False
                            query_tag: GITHUB_ACTIONS
                        target: prod
                    " > [relative-path-to-your-dbt-project]/profiles.yml  
                  - name: dbt compile step
                    uses: mwhitaker/dbt-action@master
                    with:
                        dbt_command: "dbt compile --profiles-dir . --target prod"
                        dbt_project_folder: "my_dbt_project"
                  - name: sdf compile step
                    uses: sdf-labs/sdf-action@v0
                    id: sdf
                    with:
                        command: 'sdf compile'
                        is_dbt: true
                        workspace_dir: './my_dbt_project' # Path to your SDF workspace
                        # Add your SDF credentials if you intend to run `sdf push` 
                        access_key: ${{ secrets.ACCESS_KEY }}
                        secret_key: ${{ secrets.SECRET_KEY }}
                        # (Optional) Add your compute provider credentials and other env vars here
                        snowflake_account_id: ${{ secrets.SNOWFLAKE_ACCOUNT_ID }}
                        snowflake_username: ${{ secrets.SNOWFLAKE_USERNAME }}
                        snowflake_password: ${{ secrets.SNOWFLAKE_PASSWORD }}
                        snowflake_role: ${{ secrets.SNOWFLAKE_ROLE }}
                        snowflake_warehouse: ${{ secrets.SNOWFLAKE_WH }}

                  # Use the output from the `sdf` step
                  - name: Display the sdf output
                    if: always()
                    run: |
                        echo "### SDF Compile Logs 🪵" >> $GITHUB_STEP_SUMMARY
                        echo '```' >>$GITHUB_STEP_SUMMARY
                        echo "${{ steps.sdf.outputs.log }}" >>$GITHUB_STEP_SUMMARY
                        echo '```' >>$GITHUB_STEP_SUMMARY
    ````

    Notice that, critically, `is_dbt` is set to `true` in the `sdf compile` step. Furthermore, we are specifying the `workspace_dir` to be the path to the DBT project, since the SDF workspace must be defined adjacent to the DBT project in order to function properly.

    <Note>
      Under the hood, the GitHub action runs `sdf dbt refresh` and looks for models in the `target/compiled/models/sdf` directory for compilation when `is_dbt` is set to `true`.
    </Note>
  </Step>

  <Step title="Create a Pull Request on GitHub">
    Now, we'll commit out changes to a new branch and push them to GitHub to see our new action *in action*. Let's call this branch `add-sdf-action`.

    ```shell theme={null}
    git checkout -b add-sdf-action && \
    git add . && \
    git commit -m "Add SDF workspace and GitHub Action" && \
    git push -u origin add-sdf-action
    ```

    <Warning>
      This will add *all staged changes* to the new branch. If there is anything you don't want to commit, make sure to unstage or stash it before running `git add .`
    </Warning>

    Now that we have a new branch in our remote repository, let's make a pull request on GitHub. Follow this [official guide](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request) from GitHub to do so, and make sure to select our `sdf-add-action` branch as the branch to merge into `main`.

    Once the pull request is created, you should see our new GitHub action start running and succeed. Congrats! You've successfully integrated SDF into your CI/CD workflow.

    <img src="https://cdn.sdf.com/img/sdf-github-action-pending.png" />

    <img src="https://cdn.sdf.com/img/sdf-github-action-success.png" />
  </Step>

  <Step title="Test a Breaking Change (Optional)">
    If you'd like to see the action fail, you can make a breaking change to your SQL and push it to the `add-sdf-action` branch. This will trigger the GitHub action to run again, and it should fail.

    To demonstrate SDF's static impact analysis, let's remove a column that's being used in a downstream model. This will cause the `sdf compile` command to fail, and in a perfect world, would prevent us from merging this update into our main branch.
    To do this, remove line `7` from the file found at `ddls/payment/public/invoices.sql`. This will remove the `payer_user_id` column from the `invoices` DDL.

    After committing the change, push it to the branch with an open pull request. You should see the GitHub action fail, and the pull request will not be mergeable if the repository is configured in this way.

    Here's an example job summary of an SDF compilation failure in CI/CD

    <img src="https://cdn.sdf.com/img/sdf-github-action-failure.png" />
  </Step>
</Steps>

## Summary

In this guide, we walked through how to integrate SDF into a CI/CD workflow using GitHub Actions. We created a new SDF workspace, added a GitHub action to run `sdf compile` on each pull request, and tested the action by making a breaking change to our SQL. By integrating SDF into your CI/CD workflow, you can ensure that your SQL is always valid and downstream models are compliant with your new changes. This can help prevent costly errors and ensure that your data is always reliable.

For a completed example of this guide, check out the [example repository](https://docs.github.com/en/repositories/creating-and-managing-repositories/quickstart-for-repositories)
