Chalk supports Redshift as a SQL source. You can either pass the Redshift-specific options directly as RedshiftSource init args, or configure the source in your dashboard and reference it by name in your code.

Adding Redshift

On the dashboard, you can plug in the configuration for your Redshift database:

[Screenshot: the "Add Redshift" dashboard form, scoped to an environment. Parameters entered here are also available as environment variables.]

As part of the Redshift configuration, you will also be prompted to provide an IAM role and an S3 bucket. Redshift assumes the IAM role to perform basic operations (get, put, and list objects) in the bucket you provide. The bucket does not require versioning, but you can set a retention policy on it. A retention period of 1 to 14 days balances storage efficiency against keeping enough history to be useful in debugging.
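As a rough illustration of the permissions and retention described above, the sketch below builds an IAM policy document and an S3 lifecycle configuration as plain Python dictionaries. The function names, the bucket name, and the rule ID are hypothetical, and implementing the retention policy as a lifecycle expiration rule is an assumption. The exact permissions your setup needs may differ, so treat this as a starting point rather than the required configuration.

```python
import json


def build_bucket_policy(bucket: str) -> dict:
    """Return an IAM policy document allowing get, put, and list on one bucket.

    Assumption: get/put/list map to s3:GetObject, s3:PutObject, and s3:ListBucket.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level actions apply to keys inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
            {
                # Listing is a bucket-level action.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
        ],
    }


def build_lifecycle_configuration(days: int) -> dict:
    """Return an S3 lifecycle configuration that expires objects after `days`."""
    if not 1 <= days <= 14:
        raise ValueError("retention should be between 1 and 14 days")
    return {
        "Rules": [
            {
                "ID": "expire-staged-objects",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: applies to every object
                "Expiration": {"Days": days},
            }
        ]
    }


if __name__ == "__main__":
    # "my-chalk-redshift-bucket" is a placeholder bucket name.
    print(json.dumps(build_bucket_policy("my-chalk-redshift-bucket"), indent=2))
    print(json.dumps(build_lifecycle_configuration(7), indent=2))
```

The two dictionaries can be serialized to JSON and attached to the role and the bucket with whatever tooling you already use for AWS configuration.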

Integrations Setup

After configuring your Redshift integration in the dashboard, define your data sources in Python:

from chalk.sql import RedshiftSource

risk = RedshiftSource(name="RISK")
marketing = RedshiftSource(name="MARKETING")

Then reference them in SQL file resolvers using the name parameter. For example, to query from the RISK source:

-- type: online
-- resolves: User
-- source: RISK
SELECT id, credit_score FROM users

And to query from the MARKETING source:

-- type: online
-- resolves: User
-- source: MARKETING
SELECT id, email, campaign_status FROM users

Named integrations inject environment variables whose standard names are prefixed with the integration name. For example, if your integration is called RISK, then the variable REDSHIFT_DB will be injected as RISK_REDSHIFT_DB.
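The prefixing rule can be sketched in a few lines of Python. The helper names below are hypothetical and not part of Chalk's API; only the RISK integration name and the REDSHIFT_DB variable come from the text above.

```python
import os
from typing import Optional


def prefixed_env_name(integration_name: str, variable: str) -> str:
    """Build the injected variable name, e.g. RISK + REDSHIFT_DB -> RISK_REDSHIFT_DB."""
    return f"{integration_name}_{variable}"


def read_integration_env(
    integration_name: str, variable: str, default: Optional[str] = None
) -> Optional[str]:
    """Look up the prefixed variable in the process environment."""
    return os.environ.get(prefixed_env_name(integration_name, variable), default)


if __name__ == "__main__":
    # Prints None unless RISK_REDSHIFT_DB is set in the current environment.
    print(read_integration_env("RISK", "REDSHIFT_DB"))
```

Reading the value through a helper like this keeps the integration name in one place if you later rename the integration.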