Overview
Imagine your mom is trying to optimize marketing campaigns for mobile app installations, because she’s proudly running a tech-forward boutique flower shop. She wants to get some metrics per campaign to examine their performance. To help your mom, you plan to create a new table in the analytics layer calleddim_marketing_campaigns
.
In short, creating a new model using SDF is as simple as two basic steps:
- Create a SQL file with the query of the model
- Run
sdf run
In this series of tutorials, we will execute our queries locally using SDF’s built-in infrastructure.
To learn more about using other integrations see our integrations guides.
Prerequisites
Completion of the previous tutorial.A complete version of the workspace (after the tutorials) is available in this GitHub project for you
to follow along these guides if you wish.Alternatively, create the folder locally by running the command
sdf new --sample moms_flower_shop_completed
Guide
1
Create a New Analytics Layer Model
To create a new model in SDF, all you need to do is create a new SQL file.
To follow best practices, we will create the model in the
models/analytics
directory.Create a file called dim_marketing_campaigns.sql
under models/analytics
and copy the SQL below.Note that SDF can infer dependencies and other information from SQL natively
with no special adjustments needed from you. We do not require Jinja refs, sources
definitions or additional configurations.Just write your SQL as you would run it.
models/analytics/dim_marketing_campaigns.sql
2
Validate the Model Through Static Analysis
In just one command, SDF will analyze our entire data warehouse and provide immediate
feedback. Through this static analysis we can fix our query or dependency graph before sending it to
run, saving time and compute.Let’s compile the workspace:A successful run will show all our models in their order of execution.SDF intelligently caches previous compiles. Since we compiled the source models
under We will see how it looks like when this command fails in later tutorials.
To run SDF commands your terminal should execute from the workspace’s root folder -
moms_flower_shop
.If not already there, run cd moms_flower_shop
in your terminal.models/raw
in the previous tutorial,
those will not be re-compiled in this run. If you’d like to re-compile all files,
you can clear SDF’s cache by running sdf clean
.It will look like this:Working set 11 model files, 22 .sdf files
Compiling moms_flower_shop.raw.raw_inapp_events (./models/raw/raw_inapp_events.sql)
Compiling moms_flower_shop.raw.raw_customers (./models/raw/raw_customers.sql)
Compiling moms_flower_shop.raw.raw_marketing_campaign_events (./models/raw/raw_marketing_campaign_events.sql)
Compiling moms_flower_shop.raw.raw_addresses (./models/raw/raw_addresses.sql)
Compiling moms_flower_shop.staging.marketing_campaigns (./models/staging/marketing_campaigns.sql)
Compiling moms_flower_shop.staging.inapp_events (./models/staging/inapp_events.sql)
Compiling moms_flower_shop.staging.app_installs (./models/staging/app_installs.sql)
Compiling moms_flower_shop.staging.app_installs_v2 (./models/staging/app_installs_v2.sql)
Compiling moms_flower_shop.analytics.agg_installs_and_campaigns (./models/analytics/agg_installs_and_campaigns.sql)
Compiling moms_flower_shop.staging.customers (./models/staging/customers.sql)
Compiling moms_flower_shop.staging.stg_installs_per_campaign (./models/staging/stg_installs_per_campaign.sql)
Finished 11 models [11 succeeded] in 0.915 secs
3
Run the Query
Now that we validated our query, let’s see its results.All we need to do is execute Notice that to execute our query, SDF ran all the models it is dependent on.
sdf run
in our terminal. To only run our table,
we will indicate its name in our command.Working set 12 model files, 15 .sdf files
Running moms_flower_shop.raw.raw_inapp_events (./models/raw/raw_inapp_events.sql)
Running moms_flower_shop.raw.raw_marketing_campaign_events (./models/raw/raw_marketing_campaign_events.sql)
Running moms_flower_shop.staging.marketing_campaigns (./models/staging/marketing_campaigns.sql)
Running moms_flower_shop.staging.inapp_events (./models/staging/inapp_events.sql)
Running moms_flower_shop.staging.app_installs (./models/staging/app_installs.sql)
Running moms_flower_shop.staging.stg_installs_per_campaign (./models/staging/stg_installs_per_campaign.sql)
Running moms_flower_shop.analytics.dim_marketing_campaigns (./models/analytics/dim_marketing_campaigns.sql)
Finished 7 models [7 succeeded] in 0.999 secsTable moms_flower_shop.analytics.dim_marketing_campaigns
┌─────────────┬─────────────────────────────────┬────────────────────┬───────────────────────────────┬────────────────────────┐
│ campaign_id ┆ campaign_name ┆ total_num_installs ┆ avg_customer_acquisition_cost ┆ install_duration_ratio │
╞═════════════╪═════════════════════════════════╪════════════════════╪═══════════════════════════════╪════════════════════════╡
│ 25 ┆ friends_referrals_25_f1a9344085 ┆ 12 ┆ 174.903432075169 ┆ 1 │
│ 18 ┆ instagram_ads_18_92b545ab18 ┆ 10 ┆ 288.52496746162666 ┆ 2 │
│ 61 ┆ friends_referrals_61_17d7bb85f3 ┆ 10 ┆ 313.0172449950501 ┆ 2 │
│ 51 ┆ instagram_ads_51_94f540ce54 ┆ 9 ┆ 261.3704859860934 ┆ 2 │
│ 26 ┆ google_search_26_bb6932ae6d ┆ 9 ┆ 301.34941277893944 ┆ 2 │
│ 17 ┆ google_search_17_a400715565 ┆ 9 ┆ 245.21392390373887 ┆ 1 │
│ 49 ┆ friends_referrals_49_16b0b159fa ┆ 9 ┆ 288.469560173737 ┆ 2 │
│ 32 ┆ google_search_32_893397b50d ┆ 8 ┆ 369.19360069982207 ┆ 2 │
│ 4 ┆ friends_referrals_4_c554b77114 ┆ 8 ┆ 347.85997737922236 ┆ 2 │
│ 103 ┆ instagram_ads_103_cd5421fbd9 ┆ 8 ┆ 223.14798839074024 ┆ 1 │
└─────────────┴─────────────────────────────────┴────────────────────┴───────────────────────────────┴────────────────────────┘
114 rows, showing only 10 rows.
Run with —limit 0 to show all rows.
Next Steps
Our work here isn’t done! As you look at the results of the query you realize that thetotal_num_installs
column just doesn’t make sense.
Of course, your query is perfect! So what could’ve possible gone wrong?
Let’s debug together using SDF in our debugging tutorial.