Creating a model
Our first model in SDF, that’s exciting!
Overview
Imagine your mom is trying to optimize marketing campaigns for mobile app installations, because she’s proudly running a tech-forward boutique flower shop. She wants to get some metrics per campaign to examine their performance.
To help your mom, you plan to create a new table in the analytics layer called dim_marketing_campaigns
.
In short, creating a new model using SDF is as simple as two basic steps:
- Create a SQL file with the query of the model
- Run
sdf run
Let’s check it out!
In this series of tutorials, we will execute our queries locally using SDF’s built-in infrastructure. To learn more about using other integrations see our integrations guides.
Prerequisites
Completion of the previous tutorial.
A complete version of the workspace (after the tutorials) is available in this GitHub project for you to follow along these guides if you wish.
Alternatively, create the folder locally by running the command sdf new --sample moms_flower_shop_completed
Guide
Create a New Analytics Layer Model
To create a new model in SDF, all you need to do is create a new SQL file.
To follow best practices, we will create the model in the models/analytics
directory.
Create a file called dim_marketing_campaigns.sql
under models/analytics
and copy the SQL below.
Note that SDF can infer dependencies and other information from SQL natively with no special adjustments needed from you. We do not require Jinja refs, sources definitions or additional configurations.
Just write your SQL as you would run it.
Validate the Model Through Static Analysis
In just one command, SDF will analyze our entire data warehouse and provide immediate feedback. Through this static analysis we can fix our query or dependency graph before sending it to run, saving time and compute.
To run SDF commands your terminal should execute from the workspace’s root folder - moms_flower_shop
.
If not already there, run cd moms_flower_shop
in your terminal.
Let’s compile the workspace:
A successful run will show all our models in their order of execution.
SDF intelligently caches previous compiles. Since we compiled the source models
under models/raw
in the previous tutorial,
those will not be re-compiled in this run. If you’d like to re-compile all files,
you can clear SDF’s cache by running sdf clean
.
It will look like this:
Working set 11 model files, 22 .sdf files
Compiling moms_flower_shop.raw.raw_addresses (./models/raw/raw_addresses.sql)
Compiling moms_flower_shop.raw.raw_inapp_events (./models/raw/raw_inapp_events.sql)
Compiling moms_flower_shop.raw.raw_customers (./models/raw/raw_customers.sql)
Compiling moms_flower_shop.raw.raw_marketing_campaign_events (./models/raw/raw_marketing_campaign_events.sql)
Compiling moms_flower_shop.staging.inapp_events (./models/staging/inapp_events.sql)
Compiling moms_flower_shop.staging.marketing_campaigns (./models/staging/marketing_campaigns.sql)
Compiling moms_flower_shop.staging.app_installs (./models/staging/app_installs.sql)
Compiling moms_flower_shop.staging.app_installs_v2 (./models/staging/app_installs_v2.sql)
Compiling moms_flower_shop.analytics.agg_installs_and_campaigns (./models/analytics/agg_installs_and_campaigns.sql)
Compiling moms_flower_shop.staging.customers (./models/staging/customers.sql)
Compiling moms_flower_shop.staging.stg_installs_per_campaign (./models/staging/stg_installs_per_campaign.sql)
Finished 11 models [11 succeeded] in 0.907 secs
We will see how it looks like when this command fails in later tutorials.
Run the Query
Now that we validated our query, let’s see its results.
All we need to do is execute sdf run
in our terminal. To only run our table,
we will indicate its name in our command.
Notice that to execute our query, SDF ran all the models it is dependent on.
Working set 12 model files, 15 .sdf files
Running moms_flower_shop.raw.raw_marketing_campaign_events (./models/raw/raw_marketing_campaign_events.sql)
Running moms_flower_shop.raw.raw_inapp_events (./models/raw/raw_inapp_events.sql)
Running moms_flower_shop.staging.inapp_events (./models/staging/inapp_events.sql)
Running moms_flower_shop.staging.marketing_campaigns (./models/staging/marketing_campaigns.sql)
Running moms_flower_shop.staging.app_installs (./models/staging/app_installs.sql)
Running moms_flower_shop.staging.stg_installs_per_campaign (./models/staging/stg_installs_per_campaign.sql)
Running moms_flower_shop.analytics.dim_marketing_campaigns (./models/analytics/dim_marketing_campaigns.sql)
Finished 7 models [7 succeeded] in 0.968 secs
Table moms_flower_shop.analytics.dim_marketing_campaigns
┌─────────────┬─────────────────────────────────┬────────────────────┬───────────────────────────────┬────────────────────────┐
│ campaign_id ┆ campaign_name ┆ total_num_installs ┆ avg_customer_acquisition_cost ┆ install_duration_ratio │
╞═════════════╪═════════════════════════════════╪════════════════════╪═══════════════════════════════╪════════════════════════╡
│ 25 ┆ friends_referrals_25_f1a9344085 ┆ 12 ┆ 174.903432075169 ┆ 1 │
│ 18 ┆ instagram_ads_18_92b545ab18 ┆ 10 ┆ 288.52496746162666 ┆ 2 │
│ 61 ┆ friends_referrals_61_17d7bb85f3 ┆ 10 ┆ 313.0172449950501 ┆ 2 │
│ 51 ┆ instagram_ads_51_94f540ce54 ┆ 9 ┆ 261.3704859860934 ┆ 2 │
│ 26 ┆ google_search_26_bb6932ae6d ┆ 9 ┆ 301.34941277893944 ┆ 2 │
│ 17 ┆ google_search_17_a400715565 ┆ 9 ┆ 245.21392390373887 ┆ 1 │
│ 49 ┆ friends_referrals_49_16b0b159fa ┆ 9 ┆ 288.469560173737 ┆ 2 │
│ 32 ┆ google_search_32_893397b50d ┆ 8 ┆ 369.19360069982207 ┆ 2 │
│ 4 ┆ friends_referrals_4_c554b77114 ┆ 8 ┆ 347.85997737922236 ┆ 2 │
│ 103 ┆ instagram_ads_103_cd5421fbd9 ┆ 8 ┆ 223.14798839074024 ┆ 1 │
└─────────────┴─────────────────────────────────┴────────────────────┴───────────────────────────────┴────────────────────────┘
114 rows, showing only 10 rows.
Run with —limit 0 to show all rows.
Congratulations! You just created your first model using SDF!
Next Steps
Our work here isn’t done!
As you look at the results of the query you realize that the total_num_installs
column just doesn’t make sense.
Of course, your query is perfect! So what could’ve possible gone wrong? Let’s debug together using SDF in our debugging tutorial.