> ## Documentation Index
> Fetch the complete documentation index at: https://docs.sdf.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Checks

## Understanding Checks

Checks are an example of [**Metaprogramming**](https://en.wikipedia.org/wiki/Metaprogramming). SDF takes your SQL workspace, classifiers, and lineage and creates a rich information schema whenever you run `sdf compile`.

Checks are simply small SQL queries that run against this information schema and check for the existence of certain properties which you want to assure. Some examples of effective checks are:

* Ensure Data Privacy: Ensure all personally identifiable information (PII) is appropriately anonymized
* Ensure Data Ownership: Guarantee every table has an owner (a staple of GDPR)
* Ensure Data Quality: Prevent different currency types from combining in calculations (e.g. preventing £ + \$ )

## Writing Your First Check

Let's write and modify a check, starting with one of the sdf samples.

<Steps>
  <Step title="Create a Sample Workspace">
    In the terminal, run:

    ```shell theme={null}
    sdf new --sample hello_with_pii && cd hello_with_pii
    ```

    <div className="bg-[#0F1117] dark:bg-codeblock rounded-xl dark:ring-1 dark:ring-gray-800/50 relative">
      <pre style={{ fontFamily: 'monospace', backgroundColor: 'transparent' }} className="language-shell">
        <code className="language-shell">
              Created hello\_with\_pii/.gitignore
              Created hello\_with\_pii/checks/code\_check.sql
              Created hello\_with\_pii/models/main.sql
              Created hello\_with\_pii/workspace.sdf.yml
           
                  Welcome to your new SDF Workspace! To help you on your journey:
                  
                  💡 Join the SDF Community Slack -> [https://sdf.com/join](https://sdf.com/join) 
                  
                  📚 Read the Docs to Get Started -> [https://docs.sdf.com/](https://docs.sdf.com/)
                  
             Finished new in 0.312 secs
        </code>
      </pre>
    </div>

    This sample workspace contains a model, and a classifier, and a check.
  </Step>

  <Step title="Include Checks in Workspace">
    Add the checks' path to `woskspace.sdf.yml`.

    ```yml theme={null}
      workspace:
        name: hello
        edition: 1.2
        defaults:
          dialect: trino
        includes:
          - path: models
          # ADD THE BELOW #
          - path: checks
            type: check
    ```

    <Info>
      Both checks and reports must be written in the Trino SQL dialect.
    </Info>
  </Step>

  <Step title="Run the Check">
    To run the check, simply execute `sdf check`. During compile, SDF runs all its static analysis on the workspace, creating lineage, metadata, and ultimately executing the checks.

    Hurray! The check passes.

    <div className="bg-[#0F1117] dark:bg-codeblock rounded-xl dark:ring-1 dark:ring-gray-800/50 relative">
      <pre style={{ fontFamily: 'monospace', backgroundColor: 'transparent' }} className="language-shell">
        <code className="language-shell">
          Working set 1 model file, 1 .sdf file
            Compiling hello.pub.main (./models/main.sql)
          Working set 1 check file, 1 .sdf file
             Checking hello.pub.code\_check (./checks/code\_check.sql)
             Finished 1 model \[1 succeeded], 1 check \[1 passed] in 2.987 secs
          \[Pass] Check hello.pub.code\_check
        </code>
      </pre>
    </div>

    But, carefully read the check.

    ```sql theme={null}
    SELECT
        DISTINCT c.table_name as "table name",
        c.column_name as "column name",
        c.classifiers
    FROM
        sdf.information_schema.columns c
    WHERE
        CONTAINS_ARRAY_VARCHAR(c.classifiers, 'PII.name')
    ```

    It queries for any column that contains PII.name and asserts that this query should be empty. But we never added the classifier in the first place!
  </Step>

  <Step title="Add a Classifier">
    To add the classifier and change the behavior, uncomment the following classifier later in the workspace.

    ```yml theme={null}
    table:
      name: main
      columns:
        - name: column_2
          # UNCOMMENT THE BELOW
          classifiers:
            - PII.name
    ```

    Running `sdf check` again will **result in a failed test**. The output will look like this:

    <div className="bg-[#0F1117] dark:bg-codeblock rounded-xl dark:ring-1 dark:ring-gray-800/50 relative">
      <pre style={{ fontFamily: 'monospace', backgroundColor: 'transparent' }} className="language-shell">
        <code className="language-shell">
          Working set 1 model file, 1 .sdf file
            Compiling hello.pub.main (./models/main.sql)
          Working set 1 check file, 1 .sdf file
             Checking hello.pub.code\_check (./checks/code\_check.sql)
             Finished 1 model \[1 succeeded], 1 check \[1 failed] in 3.049 secs, for details see below.

          \[Fail] Check hello.pub.code\_check
          ┌────────────┬─────────────┬─────────────┐
          │ table\_name ┆ column name ┆ classifiers │
          ╞════════════╪═════════════╪═════════════╡
          │ main       ┆ column\_2    ┆ \[PII.name]  │
          └────────────┴─────────────┴─────────────┘
          1 rows.

          \-------
          Summary 1 model \[1 succeeded], 1 check \[1 failed] in 3.049 secs.
          \-------
        </code>
      </pre>
    </div>
  </Step>
</Steps>

## Modifying Classifiers

Let's fix this failing check! We can apply an `md5()` hash to column\_2 (the name column) to obfuscate it. This is called a **reclassification** since we are transforming one classifier into another.

<Steps>
  <Step title="Add a Reclassifying Function">
    Let's add a function block that attaches a reclassify statement to the built-in md5 function. In your `workspace.sdf.yml` create the following function block.

    ```yml theme={null}
    ---
    function:
      name: md5
      reclassify:
        - from: PII.name
    ```
  </Step>

  <Step title="Create a New Model">
    Now, create a second table that uses the md5 function.
    In the same directory as *main.sql*, create *sink.sql* with the line

    ```sql sink.sql theme={null}
    SELECT MD5(CAST (column_2 AS VARBINARY)) FROM main;
    ```
  </Step>

  <Step title="Update Check">
    Update the *code\_check.sql* with the following line `and c.table_name like '%.sink'`. It should now look like the check below.
    Now, the Check specifically checks the *sink* table for the PII.name classifier.

    ```sql code_check.sql theme={null}
    SELECT
        DISTINCT c.table_name as "table_name",
        c.column_name as "column name",
        c.classifiers
    FROM
        sdf.information_schema.columns c
    WHERE
        CONTAINS_ARRAY_VARCHAR(c.classifiers, 'PII.name')
        AND c.table_name like '%.sink'
    ```

    Run `sdf check` again and voila! All checks pass. The source table has PII.name, the sink table does not.

    <div className="bg-[#0F1117] dark:bg-codeblock rounded-xl dark:ring-1 dark:ring-gray-800/50 relative">
      <pre style={{ fontFamily: 'monospace', backgroundColor: 'transparent' }} className="language-shell">
        <code className="language-shell">
          Working set 2 model files, 1 .sdf file
            Compiling hello.pub.main (./models/main.sql)
            Compiling hello.pub.sink (./models/sink.sql)
          Working set 1 check file, 1 .sdf file
             Checking hello.pub.code\_check (./checks/code\_check.sql)
             Finished 2 models \[2 succeeded], 1 check \[1 passed] in 3.080 secs
          \[Pass] Check hello.pub.code\_check
        </code>
      </pre>
    </div>
  </Step>
</Steps>

## Advanced Checks

* To learn more about transforming classifiers across functions, see: [UDFs](/guide/advanced/udf)
* To learn more about the SDF internal information schema, see: [Information Schema](/reference/sdf-information-schema)
