Redshift - Chalk

Chalk supports Redshift as a SQL source. You can configure Redshift-specific options either through the RedshiftSource init args or through your dashboard, and then reference the source in your code.

Adding Redshift

On the dashboard, you can plug in the configuration for your Redshift database:

Add a Redshift integration. These parameters will also be available as environment variables.

As part of the Redshift configuration, you will also be prompted to provide an IAM role and an S3 bucket. Redshift assumes the IAM role to perform basic operations (get, put, and list objects) in the S3 bucket you provide. The S3 bucket does not require versioning, but you can set a retention policy on it. A retention period of 1 to 14 days balances storage efficiency against keeping enough history to be useful in debugging.
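As a rough illustration of the bucket setup described above, the sketch below builds two plain dictionaries: an IAM policy document that allows get, put, and list on a single bucket, and an S3 lifecycle rule that expires objects after a chosen number of days. The function names and the bucket name are hypothetical, and the exact permissions Chalk expects are not spelled out here, so treat this as a starting point under those assumptions rather than the required configuration.

```python
import json


def build_bucket_access_policy(bucket_name: str) -> dict:
    """Return an IAM policy document allowing get, put, and list on one bucket.

    Hypothetical helper: the action list mirrors the "get, put, list objects"
    operations mentioned above, not an official Chalk requirement.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level actions apply to keys inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
            {
                # Listing is a bucket-level action.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
        ],
    }


def build_expiration_rule(days: int) -> dict:
    """Return an S3 lifecycle configuration that expires objects after `days`.

    The 1 to 14 day check reflects the recommendation above; it is a guard
    chosen for this sketch, not a limit enforced by S3.
    """
    if not 1 <= days <= 14:
        raise ValueError("retention should be between 1 and 14 days")
    return {
        "Rules": [
            {
                "ID": f"expire-after-{days}-days",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: apply to every object
                "Expiration": {"Days": days},
            }
        ]
    }


if __name__ == "__main__":
    # "example-chalk-unload-bucket" is a placeholder name.
    print(json.dumps(build_bucket_access_policy("example-chalk-unload-bucket"), indent=2))
    print(json.dumps(build_expiration_rule(7), indent=2))
```

The lifecycle dictionary has the shape accepted by the S3 bucket lifecycle configuration API, so it could be passed to whatever AWS tooling you already use to manage the bucket.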

Single Integration

If you have only one Redshift connection that you’d like to add to Chalk, you do not need to specify any arguments to construct the source in your code.

from chalk import online
from chalk.sql import RedshiftSource

# With a single Redshift integration, no arguments are needed.
redshift = RedshiftSource()

@online
def fn(...) -> ...:
    return redshift.query(...).first()

Multiple Integrations

Chalk injects environment variables to support data integrations. But what happens when you have two data sources of the same kind? When you create a new data source from your dashboard, you have the option to name the integration. You can then reference this name directly in your code.

from chalk import online
from chalk.sql import RedshiftSource

# Each source is looked up by the integration name set in the dashboard.
risk = RedshiftSource(name="RISK")
marketing = RedshiftSource(name="MARKETING")

@online
def risk_resolver(...) -> ...:
    return risk.query(...).first()

@online
def marketing_resolver(...) -> ...:
    return marketing.query(...).first()

Named integrations inject environment variables with the standard names prefixed by the integration name. For example, if your integration is called RISK, then the variable REDSHIFT_DB will be injected as RISK_REDSHIFT_DB. The first integration of a given kind will also create the un-prefixed environment variable (i.e., both REDSHIFT_DB and RISK_REDSHIFT_DB).
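The naming rule above can be written out as a small function. This is only an illustration of the rule as described, not Chalk's own code; the helper name and its `is_first_of_kind` flag are hypothetical.

```python
from typing import List, Sequence


def injected_env_names(
    integration_name: str,
    base_names: Sequence[str],
    is_first_of_kind: bool = False,
) -> List[str]:
    """List the variable names a named integration would be expected to inject.

    Hypothetical helper that mirrors the rule in the text: every standard
    name is prefixed with the integration name, and the first integration
    of a kind also gets the un-prefixed names.
    """
    names = [f"{integration_name}_{base}" for base in base_names]
    if is_first_of_kind:
        names.extend(base_names)
    return names


if __name__ == "__main__":
    standard = ["REDSHIFT_HOST", "REDSHIFT_DB", "REDSHIFT_USER", "REDSHIFT_PASSWORD"]
    # RISK is the first Redshift integration, MARKETING the second.
    print(injected_env_names("RISK", standard, is_first_of_kind=True))
    print(injected_env_names("MARKETING", standard))
```

A practical consequence is that code reading the un-prefixed REDSHIFT_DB picks up the first integration's value, so with several integrations it is usually clearer to reference each one by name.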

Environment Variables

You can also configure the integration directly from environment variables, either those set on your local machine or those added through the generic environment variable support.

import os
from chalk import online
from chalk.sql import RedshiftSource

# Read the connection settings from the environment explicitly.
redshift = RedshiftSource(
    host=os.getenv("REDSHIFT_HOST"),
    db=os.getenv("REDSHIFT_DB"),
    user=os.getenv("REDSHIFT_USER"),
    password=os.getenv("REDSHIFT_PASSWORD"),
)

@online
def resolver_fn(...) -> ...:
    return redshift.query(...).first()
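Because os.getenv returns None for a variable that is not set, a missing setting may only surface later, when a query runs. One option is to check for missing variables before constructing the source. The helper below is a minimal sketch using only the standard library; its name and the list of required variables are assumptions based on the four variables used above.

```python
import os
from typing import List, Mapping, Optional

# Assumed to be the variables the example above relies on.
REQUIRED_VARS = ("REDSHIFT_HOST", "REDSHIFT_DB", "REDSHIFT_USER", "REDSHIFT_PASSWORD")


def missing_redshift_vars(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variable names that are unset or empty.

    Hypothetical helper: pass a mapping to check something other than
    the current process environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_redshift_vars()
    if missing:
        print("Missing Redshift settings: " + ", ".join(missing))
    else:
        print("All Redshift settings are present.")
```

Running the check at import time, before RedshiftSource is constructed, gives an early and specific message when running locally.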
