Overview

Imagine your mom is trying to optimize marketing campaigns for mobile app installations, because she’s proudly running a tech-forward boutique flower shop. She wants to get some metrics per campaign to examine their performance.

To help your mom, you plan to create a new table in the analytics layer called dim_marketing_campaigns.

In short, creating a new model using SDF is as simple as two basic steps:

  1. Create a SQL file with the query of the model
  2. Run sdf run

Let’s check it out!

In this series of tutorials, we will execute our queries locally using SDF’s built-in infrastructure. To learn more about using other integrations see our integrations guides.

Prerequisites

Completion of the previous tutorial.

A complete version of the workspace (after the tutorials) is available in this GitHub project for you to follow along these guides if you wish.

Alternatively, create the folder locally by running the command sdf new --sample moms_flower_shop_completed

Guide

1

Create a New Analytics Layer Model

To create a new model in SDF, all you need to do is create a new SQL file. To follow best practices, we will create the model in the models/analytics directory.

Create a file called dim_marketing_campaigns.sql under models/analytics and copy the SQL below.

Note that SDF can infer dependencies and other information from SQL natively with no special adjustments needed from you. We do not require Jinja refs, sources definitions or additional configurations.

Just write your SQL as you would run it.

models/analytics/dim_marketing_campaigns.sql
SELECT 
    -- marketing campaigns dimensions
    m.campaign_id,
    m.campaign_name,
    -- metrics
    i.total_num_installs,
    total_campaign_spent / 
        NULLIF(i.total_num_installs, 0) AS avg_customer_acquisition_cost,
    campaign_duration / 
        NULLIF(i.total_num_installs, 0) AS install_duration_ratio
FROM staging.marketing_campaigns m
    LEFT OUTER JOIN staging.stg_installs_per_campaign i
    ON (m.campaign_id = i.campaign_id)
ORDER BY total_num_installs DESC NULLS LAST
2

Validate the Model Through Static Analysis

In just one command, SDF will analyze our entire data warehouse and provide immediate feedback. Through this static analysis we can fix our query or DAG before sending it to run, saving time and compute.

To run SDF commands your terminal should execute from the workspace’s root folder - moms_flower_shop.

If not already there, run cd moms_flower_shop in your terminal.

Let’s compile the workspace:

sdf compile

A successful run will show all our models in their order of execution.

SDF intelligently caches previous compiles. Since we compiled the source models under models/raw in the previous tutorial, those will not be re-compiled in this run. If you’d like to re-compile all files, you can clear SDF’s cache by running sdf clean.

It will look like this:

Working set 11 model files, 22 .sdf files   Compiling moms_flower_shop.raw.raw_inapp_events (./models/raw/raw_inapp_events.sql)   Compiling moms_flower_shop.raw.raw_customers (./models/raw/raw_customers.sql)   Compiling moms_flower_shop.raw.raw_marketing_campaign_events (./models/raw/raw_marketing_campaign_events.sql)   Compiling moms_flower_shop.raw.raw_addresses (./models/raw/raw_addresses.sql)   Compiling moms_flower_shop.staging.marketing_campaigns (./models/staging/marketing_campaigns.sql)   Compiling moms_flower_shop.staging.inapp_events (./models/staging/inapp_events.sql)   Compiling moms_flower_shop.staging.app_installs (./models/staging/app_installs.sql)   Compiling moms_flower_shop.staging.app_installs_v2 (./models/staging/app_installs_v2.sql)   Compiling moms_flower_shop.analytics.agg_installs_and_campaigns (./models/analytics/agg_installs_and_campaigns.sql)   Compiling moms_flower_shop.staging.customers (./models/staging/customers.sql)   Compiling moms_flower_shop.staging.stg_installs_per_campaign (./models/staging/stg_installs_per_campaign.sql)    Finished 11 models [11 succeeded] in 0.856 secs

We will see how it looks like when this command fails in later tutorials.

3

Run the Query

Now that we validated our query, let’s see its results.

All we need to do is execute sdf run in our terminal. To only run our table, we will indicate its name in our command.

sdf run analytics.dim_marketing_campaigns

Notice that to execute our query, SDF ran all the models it is dependent on.

Working set 12 model files, 15 .sdf files     Running moms_flower_shop.raw.raw_marketing_campaign_events (./models/raw/raw_marketing_campaign_events.sql)     Running moms_flower_shop.raw.raw_inapp_events (./models/raw/raw_inapp_events.sql)     Running moms_flower_shop.staging.inapp_events (./models/staging/inapp_events.sql)     Running moms_flower_shop.staging.marketing_campaigns (./models/staging/marketing_campaigns.sql)     Running moms_flower_shop.staging.app_installs (./models/staging/app_installs.sql)     Running moms_flower_shop.staging.stg_installs_per_campaign (./models/staging/stg_installs_per_campaign.sql)     Running moms_flower_shop.analytics.dim_marketing_campaigns (./models/analytics/dim_marketing_campaigns.sql)    Finished 7 models [7 succeeded] in 0.928 secs

Table moms_flower_shop.analytics.dim_marketing_campaigns ┌─────────────┬─────────────────────────────────┬────────────────────┬───────────────────────────────┬────────────────────────┐ │ campaign_id ┆ campaign_name                   ┆ total_num_installs ┆ avg_customer_acquisition_cost ┆ install_duration_ratio │ ╞═════════════╪═════════════════════════════════╪════════════════════╪═══════════════════════════════╪════════════════════════╡ │ 25          ┆ friends_referrals_25_f1a9344085 ┆ 12                 ┆ 174.903432075169              ┆ 1                      │ │ 18          ┆ instagram_ads_18_92b545ab18     ┆ 10                 ┆ 288.52496746162666            ┆ 2                      │ │ 61          ┆ friends_referrals_61_17d7bb85f3 ┆ 10                 ┆ 313.0172449950501             ┆ 2                      │ │ 51          ┆ instagram_ads_51_94f540ce54     ┆ 9                  ┆ 261.3704859860934             ┆ 2                      │ │ 26          ┆ google_search_26_bb6932ae6d     ┆ 9                  ┆ 301.34941277893944            ┆ 2                      │ │ 17          ┆ google_search_17_a400715565     ┆ 9                  ┆ 245.21392390373887            ┆ 1                      │ │ 49          ┆ friends_referrals_49_16b0b159fa ┆ 9                  ┆ 288.469560173737              ┆ 2                      │ │ 60          ┆ facebook_ads_60_c107c6b147      ┆ 8                  ┆ 345.1581886309037             ┆ 2                      │ │ 32          ┆ google_search_32_893397b50d     ┆ 8                  ┆ 369.19360069982207            ┆ 2                      │ │ 103         ┆ instagram_ads_103_cd5421fbd9    ┆ 8                  ┆ 223.14798839074024            ┆ 1                      │ └─────────────┴─────────────────────────────────┴────────────────────┴───────────────────────────────┴────────────────────────┘ 114 rows, showing only 10 rows.   Run with —limit 0 to show all rows.

Congratulations! You just created your first model using SDF!

Next Steps

Our work here isn’t done!

As you look at the results of the query you realize that the total_num_installs column just doesn’t make sense.

Of course, your query is perfect! So what could’ve possible gone wrong? Let’s debug together using SDF in our debugging tutorial.