> ## Documentation Index
> Fetch the complete documentation index at: https://docs.sdf.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Getting Started

> SDF supports reading data and metadata from S3, and writing it the results back.

## Overview

This guide will walk you through the steps of creating an AWS S3 integration. By the end,
you will be able to reference remote S3 objects in SDF as if they were locally defined!

SDF will use remote S3 schema information to do type bindings, column level lineage,
and other static analysis checks on your workspace.

## Guide using an IAM access\_key and secret\_key

<Steps>
  <Step title="Collect Required Information">
    To connect to S3 via IAM credentials, you need a valid keypair with the minimum credentials to read objects from your S3 Object Storage.

    * `access_key` the access key of your IAM profile
    * `secret_access_key` the secret key of your IAM profile
    * `region` containing the S3 objects

    <Note>
      Run `sdf auth login aws --help` to see all login options.
    </Note>
  </Step>

  <Step title="Register the User in the AWS CLI">
    To begin, you need to register your user with the AWS CLI.
    You can optionally provide a profile name. As default region, please use the AWS region
    that your S3 bucket is located in.

    The example below configures a profile called *S3*

    ```shell theme={null}
    aws configure --profile S3  
    ```
  </Step>

  <Step title="Connect SDF to AWS">
    Connect SDF to AWS by telling SDF which of your AWS profiles to use.

    ```shell theme={null}
    sdf auth login aws --profile S3 --default-region <REGION>
    ```

    <Info>
      This will create a new [credential](/reference/sdf-yml#block-credential) in
      a `~/.sdf/` directory in the root of your system. This credential will be used
      to authenticate with AWS services. By default, the credential's name is default.
      As such, the credential does not need to be explicitly referenced in the integrations
      configuration below.
    </Info>

    To validate the connection, run:

    ```shell theme={null}
    sdf auth status
    ```
  </Step>

  <Step title="Add S3 Integration in your SDF Workspace">
    Once authenticated, add an integrations block in your `workspace.sdf.yml`
    workspace block. This tells SDF to pull data from the specified bucket,

    ```yml theme={null}
    ---
    workspace:
        ...
        integrations:
            - provider: S3
              type: data
              bucket: 
                - uri: <LOCATION>
                  region: <REGION>
    ```

    For more information on SDF integrations, see the [integration documentation](/guide/setup/integrations).
  </Step>

  <Step title="Create a Table that Pulls Data from S3">
    Next, we'll need to define an external table in SQL that references data in our S3 bucket. Let's imagine this table was pulling data from a CSV file, if so the definition would like like so:

    ```sql my_external_table.sql theme={null}
        CREATE TABLE database.schema.my_external_table WITH (
            FORMAT='CSV', 
            SKIP_HEADER_LINE_COUNT=1,
            LOCATION='s3://path/to/your.csv'
        );
    ```

    We recommend placing this file in the `models` directory, or whichever directory you have specified in your `workspace.sdf.yml`'s includes block.
  </Step>

  <Step title="Try it out!">
    Now that you're connected and you have a table defined, let's make sure SDF can pull the schema information it needs.

    Run `sdf compile -q "select * from database.schema.my_external_table" --show all`

    If the connection is successful, you will see the schema for the table you selected.
  </Step>
</Steps>

<Warning>
  The ARN you created also needs user access to the database. Depending on your
  DBs Access Policy, you might also have to give explicit access to
  this new user for this S3 bucket. You can do this through by adding
  the ARN user to the appropriate group.
</Warning>
