# Databricks Spark Listener

> Install SDF's Spark Listener in your Databricks Cluster

SDF does not support transformations on Databricks. The SDF Spark Listener is used only to capture lineage from your Databricks Cluster. This is unlike the Snowflake and Redshift integrations, where you can run transformations against your cloud compute with `sdf run`. Support for `sdf run` on Databricks with SparkSQL is coming soon.

The SDF Console can be configured to listen for Spark events on your Databricks Cluster and automatically ingest and analyze the lineage of your Spark Warehouse.

## Prerequisites

Ensure that you have the Databricks CLI installed and configured locally before beginning. For installation instructions, see the **[Databricks CLI Tutorial](https://docs.databricks.com/en/dev-tools/cli/tutorial.html)**.

## Installation

Follow the steps below to install the SDF Spark Listener on your Databricks Cluster.

Using the Databricks CLI, create a directory on your Databricks File System (DBFS) called `databricks/sdf-listener`.

```bash theme={null}
databricks fs mkdirs dbfs:/databricks/sdf-listener
```

Next, run the following to upload the latest version of the SDF Spark Listener to the DBFS directory you just created.

```bash theme={null}
databricks fs cp <(curl -L https://cdn.sdf.com/spark-listener/releases/download/sdf-spark-listener-latest.jar) dbfs:/databricks/sdf-listener/sdf-spark-listener-latest.jar
```

Using the Databricks CLI, create a directory in your Databricks workspace called `/Shared/sdf-listener`.

```bash theme={null}
databricks workspace mkdirs /Shared/sdf-listener
```

Log in to the Databricks Workspace UI and navigate to **Workspace** in the sidebar. Navigate to the `/Shared/sdf-listener` directory. Once there, click the **Add** dropdown and select **File**.
Name the file `init-script.sh` and paste the following into the file.

```bash init-script.sh theme={null}
#!/bin/bash

# DBFS directory (as mounted on the cluster) that holds the uploaded listener JAR
STAGE_DIR="/dbfs/databricks/sdf-listener"

echo "BEGIN: Upload Spark Listener JARs"
# Copy the listener JAR into the driver's JAR directory; abort if the copy fails
cp -f "$STAGE_DIR"/sdf-spark-listener*.jar /mnt/driver-daemon/jars || { echo "Error copying Spark Listener library file"; exit 1; }
echo "END: Upload Spark Listener JARs"

echo "BEGIN: Modify Spark config settings"
# Register the OpenLineage listener in the driver's Spark defaults
cat << 'EOF' > /databricks/driver/conf/openlineage-spark-driver-defaults.conf
[driver] {
  "spark.extraListeners" = "io.openlineage.spark.agent.OpenLineageSparkListener"
}
EOF
echo "END: Modify Spark config settings"
```

Navigate to [console.sdf.com](https://console.sdf.com) and log in. Go to **Settings > Integrations** and click **Connect Database**. Name your integration and select **Databricks** as the type. Click **Next** and follow the on-screen instructions to complete the integration.

You must be an Admin to create a Databricks integration.

In the Databricks UI, navigate to **Compute** in the sidebar. Select the cluster (or clusters) on which you want to enable the SDF Spark Listener. Alternatively, you can create a new compute cluster by clicking **Create compute** and following these [instructions](https://docs.databricks.com/en/clusters/configure.html).

On the **Cluster Configuration** page, click **Edit**. Note that if your cluster is currently running, you will need to restart it for the changes to apply.

Expand the section labeled **Advanced options**. In the section labeled **Spark**, add the Spark configuration generated in the SDF Console step above to the **Spark Config** field. Next, click the **Init Scripts** tab and select the init script located at `/Shared/sdf-listener/init-script.sh`.

If you know your workspace and project names as well as your SDF cluster endpoint, you can populate the template below instead. Remember to replace the placeholders with your workspace, project, and cluster endpoint values.
```config Spark Config theme={null}
spark.openlineage.version v1
spark.extraListeners io.openlineage.spark.agent.OpenLineageSparkListener
spark.openlineage.debugFacet enabled
spark.openlineage.transport.type http
spark.openlineage.transport.url
spark.openlineage.transport.endpoint /api/v1/open-lineage/lineage
spark.openlineage.transport.headers.workspace
spark.openlineage.transport.headers.project
spark.openlineage.transport.headers.cluster
spark.openlineage.transport.auth.type api_key
spark.openlineage.transport.auth.apiKey {{secrets/sdf-integration/events-access-key}}
```

Save your changes by clicking **Confirm** and then start your cluster by clicking **Start**.

In a Databricks Notebook, select the **Connect** button and select the compute cluster that you just configured. Run the following query to test that the SDF Spark Listener is working.

```python theme={null}
spark.createDataFrame([
    {'a': 1, 'b': 2},
    {'a': 3, 'b': 4}
]).write.mode("overwrite").saveAsTable("default.test_sdf_integration")
```
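If you are filling in the Spark Config template above by hand, a small helper can reduce copy-paste mistakes. The following is a minimal sketch, not part of SDF's tooling: the function name `render_spark_config` and its parameters are hypothetical, and the mapping of each parameter to a blank entry (for example, that `spark.openlineage.transport.url` takes your SDF cluster endpoint) is an assumption. The configuration generated by the SDF Console remains the authoritative source.

```python
def render_spark_config(endpoint_url: str, workspace: str, project: str, cluster: str) -> str:
    """Return Spark Config text with the template's blank entries filled in.

    All four parameters are hypothetical names for the values the template
    leaves blank; which value belongs to which key is an assumption.
    """
    settings = [
        ("spark.openlineage.version", "v1"),
        ("spark.extraListeners", "io.openlineage.spark.agent.OpenLineageSparkListener"),
        ("spark.openlineage.debugFacet", "enabled"),
        ("spark.openlineage.transport.type", "http"),
        ("spark.openlineage.transport.url", endpoint_url),  # assumed: SDF cluster endpoint
        ("spark.openlineage.transport.endpoint", "/api/v1/open-lineage/lineage"),
        ("spark.openlineage.transport.headers.workspace", workspace),
        ("spark.openlineage.transport.headers.project", project),
        ("spark.openlineage.transport.headers.cluster", cluster),
        ("spark.openlineage.transport.auth.type", "api_key"),
        # Databricks secret reference, copied unchanged from the template
        ("spark.openlineage.transport.auth.apiKey", "{{secrets/sdf-integration/events-access-key}}"),
    ]
    for key, value in settings:
        # Each Spark Config line is "key value", so a blank value or one
        # containing whitespace would produce a malformed line.
        if not value or any(ch.isspace() for ch in value):
            raise ValueError(f"invalid value for {key!r}: {value!r}")
    return "\n".join(f"{key} {value}" for key, value in settings)


if __name__ == "__main__":
    # Example values are placeholders, not real SDF endpoints or names.
    print(render_spark_config("https://sdf.example.invalid", "my-workspace", "my-project", "my-cluster"))
```

The printed text can be pasted into the **Spark Config** field. Rejecting empty or whitespace-containing values up front is a design choice that surfaces a missed value before the cluster is restarted.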