Seeds
Upload data files as tables
Overview
This guide describes the seeds mechanism that helps you upload raw data files (e.g. csv files) to Snowflake as tables
Prerequisites
This guide should be followed after completing the Getting Started with Snowflake and SDF guide and the Basic Materialization with Snowflake guide.
You’ll also need:
- A Snowflake account to connect to
- Valid credentials with read and write access to the SEEDS database used in this guide.
Guide
Create a new SDF Project from the Seeds Sample
Create a new SDF project using the Seeds sample. This will create a new project in your current working directory with the sample project files.
The newly created seeds
directory contains three files:
- The workspace file
workspace.sdf.yml
- The data file
seeds/raw_customers.csv
, and - The model
models/french_customers.sql
which consumes and filters the seed table
Contents of the workspace file
Let’s look into the workspace file:
Of note is the includes
section containing two paths:
- The
models
directory for all the model code, and - The
seeds
directory for the data files in CSV format.
The seeds path is declared with type: seed
. This instructs SDF to look for CSV files
in the specified directory.
Seed files
The seed flies must use CSV format and must include a header row. Here is seeds/raw_customers.csv
with some synthetic customer data:
Models derived from seeds
Seed files materialize as tables which can be used in the definition of other models just like any other table. Here is a model which filters the above seed table to a subset of rows whose country is France:
We can run this model as follows:
Working set 1 model file, 1 seed file, 1 .sdf file
Uploading ‘SEEDS.PUB.RAW_CUSTOMERS’
Running SEEDS.PUB.FRENCH_CUSTOMERS (./models/french_customers.sql)
Finished 2 models [2 succeeded] in 9.779 secs
Table SEEDS.PUB.FRENCH_CUSTOMERS
┌────────────┬─────────────────┬────────────────┬────────────────────────────────────────┬─────────────────────────────────┬─────────────────────────────┬───────────┬─────────┬─────────────────────────┬─────────────────────────┐
│ customerid ┆ name ┆ phone ┆ email ┆ address ┆ region ┆ postalzip ┆ country ┆ createdat ┆ updatedat │
╞════════════╪═════════════════╪════════════════╪════════════════════════════════════════╪═════════════════════════════════╪═════════════════════════════╪═══════════╪═════════╪═════════════════════════╪═════════════════════════╡
│ 10 ┆ Alfonso Mcbride ┆ 1-426-567-2754 ┆ ultricies.ligula.nullam@protonmail.net ┆ P.O. Box 120, 3949 Leo, Rd. ┆ North Island ┆ 4751-5554 ┆ France ┆ 2023-02-11T02:05:55.379 ┆ 2023-02-19T09:06:00.400 │
│ 11 ┆ Carol White ┆ 1-192-246-3106 ┆ porttitor.scelerisque.neque@icloud.edu ┆ Ap #345-4585 Luctus Rd. ┆ Saint Petersburg City ┆ 57565 ┆ France ┆ 2023-01-24T07:06:21.325 ┆ 2023-02-02T10:07:55.385 │
│ 44 ┆ Keely Mccarthy ┆ 1-557-880-9789 ┆ tristique.senectus@aol.org ┆ P.O. Box 432, 6899 Accumsan Rd. ┆ Luxemburg ┆ 7542 ┆ France ┆ 2023-03-08T05:05:59.346 ┆ 2023-03-19T03:07:29.433 │
│ 76 ┆ Samantha Booker ┆ 1-691-851-1158 ┆ faucibus.orci@google.ca ┆ 148-3403 Aenean Road ┆ Southwestern Tagalog Region ┆ 8533 ┆ France ┆ 2023-03-06T09:05:33.404 ┆ 2023-03-14T08:06:44.471 │
│ 104 ┆ Kitra Huber ┆ 1-322-815-5342 ┆ interdum.curabitur@hotmail.couk ┆ 910-6287 Nam Rd. ┆ Viken ┆ 45012 ┆ France ┆ 2023-02-22T17:06:44.334 ┆ 2023-03-01T08:08:24.417 │
│ 147 ┆ Kaseem Rowland ┆ 1-209-868-8767 ┆ sed.facilisis@google.edu ┆ 2536 Aliquam St. ┆ Cartago ┆ 8576-5132 ┆ France ┆ 2023-02-16T21:06:00.390 ┆ 2023-02-22T05:06:19.404 │
│ 229 ┆ Deirdre Clark ┆ 1-821-572-8987 ┆ augue.scelerisque.mollis@hotmail.couk ┆ 2734 Natoque Avenue ┆ North Chungcheong ┆ 858642 ┆ France ┆ 2023-02-08T11:06:13.389 ┆ 2023-02-12T07:07:30.436 │
│ 257 ┆ Marah Silva ┆ 1-577-860-9883 ┆ pellentesque.ut.ipsum@yahoo.org ┆ Ap #318-2518 Risus St. ┆ Lubelskie ┆ 4718 ┆ France ┆ 2022-12-29T06:05:53.314 ┆ 2023-01-07T03:07:15.331 │
│ 316 ┆ Berk Melendez ┆ 1-425-391-6429 ┆ vel@aol.ca ┆ Ap #540-6098 Semper, Road ┆ Paraná ┆ 1274 ┆ France ┆ 2023-01-24T13:06:14.320 ┆ 2023-02-02T07:07:19.398 │
│ 377 ┆ Adam Fitzgerald ┆ 1-477-951-6067 ┆ sem@google.edu ┆ Ap #426-4098 Vestibulum Rd. ┆ Nariño ┆ 53150 ┆ France ┆ 2023-01-13T03:06:22.406 ┆ 2023-01-25T23:06:40.441 │
└────────────┴─────────────────┴────────────────┴────────────────────────────────────────┴─────────────────────────────────┴─────────────────────────────┴───────────┴─────────┴─────────────────────────┴─────────────────────────┘
10 rows.
Ad-hoc queries against seeds
We can also run ad-hoc queries directly against seeds tables. Here is a query counting the number of countries
in raw_customers
:
Working set 2 model files, 1 seed file, 1 .sdf file
Uploading ‘SEEDS.PUB.RAW_CUSTOMERS’
Running seeds.pub.query (./sdftarget/test/query.sql)
Finished 2 models [2 succeeded] in 9.053 secs
Table seeds.pub.query
┌─────┐
│ cnt │
╞═════╡
│ 35 │
└─────┘
1 rows.