Databricks - Chalk

Chalk supports Databricks as a SQL source. You can configure Databricks-specific options with the arguments to DatabricksSource.__init__. Alternatively, you can configure the source through your dashboard.

Single Integration

If you have only one Databricks connection that you’d like to add to Chalk, you do not need to specify any arguments to construct the source in your code.

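from chalk import online
from chalk.sql import DatabricksSource

# No arguments are needed when only one Databricks integration is configured.
databricks = DatabricksSource()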

@online
def fn(...) -> ...:
    return databricks.query(...).first()

Multiple Integrations

Chalk injects environment variables to support data integrations. But what happens when you have two data sources of the same kind? When you create a new data source from your dashboard, you can give the integration a name and then reference that name directly in your code.

from chalk import online
from chalk.sql import DatabricksSource

# Each name matches the integration name set in the dashboard.
risk = DatabricksSource(name="RISK")
marketing = DatabricksSource(name="MARKETING")

@online
def risk_resolver(...) -> ...:
    return risk.query(...).first()

@online
def marketing_resolver(...) -> ...:
    return marketing.query(...).first()

Named integrations inject environment variables with the standard names prefixed by the integration name. For example, if your integration is called RISK, then the variable DATABRICKS_HOST will be injected as RISK_DATABRICKS_HOST. For the first integration of a given kind, Chalk also creates the un-prefixed variable, so both DATABRICKS_HOST and RISK_DATABRICKS_HOST are available.
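The naming rule can be sketched in plain Python. The helpers prefixed_env_name and injected_names are hypothetical and not part of the Chalk API. The sketch assumes the integration name is used verbatim as the prefix, and it takes the list of standard names from the Environment Variables example on this page.

```python
from typing import List

# Standard names as they appear in the Environment Variables example
# on this page (assumed here to be the full set).
STANDARD_NAMES = [
    "DATABRICKS_HOST",
    "DATABRICKS_HTTP_PATH",
    "DATABRICKS_TOKEN",
    "DATABRICKS_DATABASE",
    "DATABRICKS_PORT",
]


def prefixed_env_name(integration: str, variable: str) -> str:
    """Hypothetical helper: prefix a standard name with the integration name."""
    return f"{integration}_{variable}"


def injected_names(integration: str, is_first_of_kind: bool) -> List[str]:
    """List the variable names expected for one named integration."""
    names = [prefixed_env_name(integration, v) for v in STANDARD_NAMES]
    if is_first_of_kind:
        # The first integration of a kind also gets the un-prefixed names.
        names = STANDARD_NAMES + names
    return names
```

Under these assumptions, injected_names("RISK", True) contains both DATABRICKS_HOST and RISK_DATABRICKS_HOST, while injected_names("MARKETING", False) contains only the MARKETING_-prefixed names.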

Environment Variables

You can also configure the integration directly with environment variables, either set on your local machine or added through the generic environment variable support.

import os
from chalk import online
from chalk.sql import DatabricksSource

# Read each connection setting from the environment.
databricks = DatabricksSource(
    host=os.getenv("DATABRICKS_HOST"),
    http_path=os.getenv("DATABRICKS_HTTP_PATH"),
    access_token=os.getenv("DATABRICKS_TOKEN"),
    db=os.getenv("DATABRICKS_DATABASE"),
    port=os.getenv("DATABRICKS_PORT"),
)

@online
def resolver_fn(...) -> ...:
    return databricks.query(...).first()
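Because os.getenv returns None for an unset variable, a missing setting may only surface later as a connection error. One way to fail earlier is to check the environment before constructing the source. The sketch below uses only the standard library. The helper databricks_kwargs is hypothetical, and treating the host, HTTP path, and token as required is an assumption, not documented Chalk behavior.

```python
import os
from typing import Dict, Mapping

# Environment variable name -> DatabricksSource keyword argument,
# as used in the example above.
ENV_TO_KWARG = {
    "DATABRICKS_HOST": "host",
    "DATABRICKS_HTTP_PATH": "http_path",
    "DATABRICKS_TOKEN": "access_token",
    "DATABRICKS_DATABASE": "db",
    "DATABRICKS_PORT": "port",
}

# Assumption: these three are needed to open a connection.
REQUIRED = ("DATABRICKS_HOST", "DATABRICKS_HTTP_PATH", "DATABRICKS_TOKEN")


def databricks_kwargs(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Hypothetical helper: collect keyword arguments from the environment."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Skip optional variables that are unset or empty.
    return {kwarg: env[name] for name, kwarg in ENV_TO_KWARG.items() if env.get(name)}
```

The result could then be passed along as DatabricksSource(**databricks_kwargs()), which leaves unset optional arguments at their defaults.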

