This document describes how the Databricks Connector is configured inside WOODY.IO, what authentication options are available, what permissions the authenticating identity must hold on the Databricks side, and how the optional Temporary Location feature works.
Creating a Databricks Connection
Navigate to Management > Connections > Add Connection. Set Connection Type to Databricks, then fill in the fields below.
| Field | Description |
|---|---|
| Cluster Type | Choose All Purpose Cluster (general workloads) or SQL Warehouse (SQL/BI workloads) |
| Host | HTTPS endpoint of your Databricks workspace, e.g. adb-<id>.azuredatabricks.net |
| Cluster ID | Unique identifier of the target cluster or SQL Warehouse |
| Schema | Optional. Format: <catalog>.<schema>. Overrides the workspace default catalog; an Entity-level schema takes precedence if also set. Speeds up entity loading when creating an Entity from the Connection |
| Temporary Location Name | Optional. Name of a Databricks External Location used as a staging area for temp tables |
In the Details section, toggle the Can Import, Can Persist, and Can Live Edit switches to control how the Connection may be used.
Authentication
Two authentication methods are available. Select one in the Authentication Type dropdown.
| Method | Details |
|---|---|
| Private Access Token | A Personal Access Token (PAT) generated in Databricks User Settings. Stored in WOODY.IO, or referenced via Azure Key Vault using @KeyVault(<Identifier>;<SecretName>) when a Key Vault is configured on the environment |
| Application Service Principal | Uses the Service Principal configured at Application level (Tenant ID, Client ID, Client Secret). Authenticates via Azure AD OAuth 2.0. Recommended for production |
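For example, with a hypothetical Key Vault identifier MyVault and a secret named databricks-pat, the token field would contain:

```
@KeyVault(MyVault;databricks-pat)
```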
Required Permissions
The identity used to authenticate (the PAT owner or the Service Principal) must hold the following permissions in Databricks.
| Compute Type | Required Permissions |
|---|---|
| All Purpose Cluster | CAN ATTACH TO or CAN RESTART on the target cluster |
| SQL Warehouse | CAN USE on the target SQL Warehouse |
Unity Catalog
For every catalog and schema that the Connection or the Entity Technical Names reference, grant the following:
| Grant | Purpose |
|---|---|
| USE CATALOG | Needed to access any object in the catalog |
| USE SCHEMA | Needed to access any object in the schema |
| SELECT | Needed to read source data during Import |
| MODIFY | Needed to insert, update, merge, or delete rows during Persist |
| CREATE TABLE | Needed to create temp tables in the schema; required only when Temporary Location Name is not configured |
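As a sketch, the grants for a single catalog and schema might look like the following Databricks SQL, using a hypothetical catalog main, schema sales, and service principal woody-sp:

```sql
-- Hypothetical principal and object names; substitute your own.
GRANT USE CATALOG ON CATALOG main TO `woody-sp`;
GRANT USE SCHEMA ON SCHEMA main.sales TO `woody-sp`;

-- Read source tables during Import.
GRANT SELECT ON SCHEMA main.sales TO `woody-sp`;

-- Write destination tables during Persist.
GRANT MODIFY ON SCHEMA main.sales TO `woody-sp`;

-- Only needed when no Temporary Location Name is configured.
GRANT CREATE TABLE ON SCHEMA main.sales TO `woody-sp`;
```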
Temporary Location Name
When configured, the Temporary Location Name field in the Connection form should point to a pre-configured Databricks External Location: a Unity Catalog object that grants Databricks access to a specific path in cloud storage (ADLS Gen2).
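For reference, an External Location is created in Databricks SQL roughly as follows; the location name, storage path, and credential name here are hypothetical, and the Storage Credential must already exist:

```sql
-- Hypothetical names; the Storage Credential must be created beforehand.
CREATE EXTERNAL LOCATION woody_tmp
URL 'abfss://tmp@mystorageaccount.dfs.core.windows.net/woody'
WITH (STORAGE CREDENTIAL my_storage_credential);
```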
Whether or not this field is set changes how WOODY.IO handles the temp tables it needs during Merge, Update, and Delete operations.
Without Temporary Location Name
WOODY.IO creates the temp table as a managed table directly in the Unity Catalog. Databricks writes the Delta files to the storage account backing that catalog (its own storage or the metastore root).
If the Storage Credential on that storage account only holds the Storage Blob Data Reader role, the CREATE TABLE call will fail even though SELECT queries continue to work: the Reader role permits reads but not writes.
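As an illustration (the table name is hypothetical; the exact statements WOODY.IO issues internally are not documented), a managed temp-table creation of this kind fails at the storage layer under a read-only credential:

```sql
-- Managed table: Delta files land in the catalog's backing storage.
-- This fails with a storage permission error if the Storage Credential
-- only holds Storage Blob Data Reader on that account.
CREATE TABLE main.sales.woody_tmp_merge AS
SELECT * FROM main.sales.source_table LIMIT 0;
```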
With Temporary Location Name configured
WOODY.IO creates the temp table as an unmanaged (external) table inside the External Location path, runs the required DML against the destination table, and finally issues DELETE and VACUUM to clean up the leftover Delta files.
Because the temp table is external (created with LOCATION), Databricks does not delete the underlying files automatically when the table is dropped, so WOODY.IO runs an explicit DELETE + VACUUM. This increases total import time and requires retentionDurationCheck to be disabled on the cluster.
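A rough sketch of that lifecycle in Databricks SQL, with hypothetical table, path, and source names:

```sql
-- External temp table inside the External Location path.
CREATE TABLE main.sales.woody_tmp_merge
LOCATION 'abfss://tmp@mystorageaccount.dfs.core.windows.net/woody/tmp_merge'
AS SELECT * FROM staged_rows;

-- ... MERGE / UPDATE / DELETE against the destination table ...

-- Explicit cleanup: dropping an external table leaves its files behind.
DELETE FROM main.sales.woody_tmp_merge;
VACUUM main.sales.woody_tmp_merge RETAIN 0 HOURS;  -- needs retentionDurationCheck disabled
DROP TABLE main.sales.woody_tmp_merge;
```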
Disable the check in the cluster Spark config:

```
spark.databricks.delta.retentionDurationCheck.enabled false
```

Which approach to use?
Configuring an External Location is the recommended approach. It isolates WOODY.IO's temp writes to a dedicated, controlled storage path and avoids needing to elevate permissions on the entire Unity Catalog storage account.
| Approach | Description |
|---|---|
| Without External Location | Temp tables go into the Unity Catalog managed storage. Requires Storage Blob Data Contributor on the catalog storage account. Simpler setup but broader permission scope |
| With External Location | Temp tables go into a dedicated External Location path. The Storage Credential on that path needs write access only to that path. Recommended for production environments |
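Under the External Location approach, the authenticating identity also needs privileges on the location itself. A minimal Databricks SQL sketch, reusing the hypothetical woody_tmp location and woody-sp principal from above (Databricks spells the table-creation privilege on External Locations CREATE EXTERNAL TABLE):

```sql
-- Read, write, and table creation scoped to the External Location path only.
GRANT READ FILES, WRITE FILES, CREATE EXTERNAL TABLE
ON EXTERNAL LOCATION woody_tmp TO `woody-sp`;
```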
Configuration and Permission Checklist
Whenever an Import fails while loading or persisting data, or connection validation fails, go through the following checklist and confirm that your Connection in WOODY.IO meets every requirement.
| Item | Notes |
|---|---|
| Auth identity is a member of the Databricks workspace | The PAT owner or the Service Principal |
| Cluster / Warehouse access granted | CAN ATTACH TO or CAN RESTART (All Purpose Cluster), or CAN USE (SQL Warehouse) |
| USE CATALOG granted | On every catalog used or in scope |
| USE SCHEMA granted | On every schema used or in scope |
| SELECT granted | On all source tables in the schema |
| MODIFY granted | On all destination tables in the schema |
| CREATE TABLE granted | Required only when Temporary Location Name is not set |
| READ FILES + WRITE FILES on External Location | Required only when Temporary Location Name is set |
| CREATE TABLE on External Location | Required only when Temporary Location Name is set |
| retentionDurationCheck disabled on cluster | Required only when Temporary Location Name is set |
| Schema field set (if using a non-default catalog) | Format: <catalog>.<schema>. Speeds up loading entities from the catalog/schema |
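To verify the Unity Catalog items from a Databricks SQL session, SHOW GRANTS can list what a principal holds; a minimal sketch with the hypothetical names used above:

```sql
-- Show what has been granted on the schema and the External Location.
SHOW GRANTS `woody-sp` ON SCHEMA main.sales;
SHOW GRANTS `woody-sp` ON EXTERNAL LOCATION woody_tmp;
```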
If you have any further questions, please feel free to Contact Us.