Integrations
Integrate with SQL data sources.
Chalk supports Redshift as a SQL source. You can configure the Redshift-specific options using the RedshiftSource init args, or configure the source through your dashboard and reference the source in your code.
On the dashboard, you can plug in the configuration for your Redshift database:
Add a Redshift integration. These parameters will also be available as environment variables.
As part of the Redshift configuration, you will also be prompted to provide an IAM role and an S3 bucket. Redshift assumes the IAM role to perform basic operations (get, put, and list objects) in that S3 bucket. The bucket does not require versioning, but you can set a retention policy on it. A retention period of 1 to 14 days balances storage efficiency with keeping enough history to be useful in debugging.
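The bucket requirements above can be written down as data. This is a minimal sketch, not a setup prescribed by Chalk or AWS: `redshift_bucket_config` and the bucket name are hypothetical, and the 7-day default is just one value inside the 1 to 14 day range. It builds an IAM policy document covering get, put, and list, plus a lifecycle rule in the shape accepted by S3's lifecycle configuration API (for example boto3's `put_bucket_lifecycle_configuration`), without calling AWS.

```python
import json
from typing import Any, Dict, Tuple


def redshift_bucket_config(
    bucket: str, retention_days: int = 7
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Hypothetical helper: build an IAM policy and an S3 lifecycle config."""
    if not 1 <= retention_days <= 14:
        raise ValueError("retention_days should be between 1 and 14")

    # ListBucket applies to the bucket itself; Get/PutObject apply to its objects.
    iam_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }

    # One rule that expires every object in the bucket after retention_days.
    lifecycle = {
        "Rules": [
            {
                "ID": "expire-redshift-staging-objects",  # assumed rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},
                "Expiration": {"Days": retention_days},
            }
        ]
    }
    return iam_policy, lifecycle


# "my-chalk-redshift-bucket" is a placeholder bucket name.
policy, lifecycle = redshift_bucket_config("my-chalk-redshift-bucket")
print(json.dumps(policy, indent=2))
print(json.dumps(lifecycle, indent=2))
```

The two dictionaries can be reviewed or checked into version control before being applied with whatever AWS tooling you already use.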
If you have only one Redshift connection that you’d like to add to Chalk, you do not need to specify any arguments to construct the source in your code.
from chalk import online
from chalk.sql import RedshiftSource

# With a single Redshift integration, no arguments are needed.
redshift = RedshiftSource()

@online
def fn(...) -> ...:
    return redshift.query(...).first()
If you have more than one Redshift connection, give each integration a name and pass that name when constructing the source in your code.
from chalk import online
from chalk.sql import RedshiftSource

# Each source is looked up by its integration name.
risk = RedshiftSource(name="RISK")
marketing = RedshiftSource(name="MARKETING")

@online
def risk_resolver(...) -> ...:
    return risk.query(...).first()

@online
def marketing_resolver(...) -> ...:
    return marketing.query(...).first()
Named integrations inject environment variables with the standard names prefixed by the integration name. For example, if your integration is called RISK, then the variable REDSHIFT_DB will be injected as RISK_REDSHIFT_DB. The first integration of a given kind will also create the un-prefixed environment variable (i.e. both REDSHIFT_DB and RISK_REDSHIFT_DB).
You can also configure the integration directly using environment variables on your local machine or from those added through the generic environment variable support.
import os

from chalk import online
from chalk.sql import RedshiftSource

# os.getenv returns None for any variable that is not set.
redshift = RedshiftSource(
    host=os.getenv("REDSHIFT_HOST"),
    db=os.getenv("REDSHIFT_DB"),
    user=os.getenv("REDSHIFT_USER"),
    password=os.getenv("REDSHIFT_PASSWORD"),
)

@online
def resolver_fn(...) -> ...:
    return redshift.query(...).first()
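Because `os.getenv` returns `None` for unset variables, it can help to check the environment before constructing a source. The sketch below also applies the prefixing rule for named integrations. `redshift_env` is a hypothetical helper, not part of Chalk, and it assumes that all four variable names shown on this page follow the same `NAME_` prefix pattern as `REDSHIFT_DB`.

```python
import os
from typing import Dict, Mapping, Optional

# Un-prefixed variable names that appear in this page's examples.
REDSHIFT_VARS = ("REDSHIFT_HOST", "REDSHIFT_DB", "REDSHIFT_USER", "REDSHIFT_PASSWORD")


def redshift_env(
    name: Optional[str] = None, env: Mapping[str, str] = os.environ
) -> Dict[str, str]:
    """Hypothetical helper: collect Redshift settings, failing fast on gaps."""
    # A named integration such as "RISK" reads RISK_REDSHIFT_DB, and so on.
    prefix = f"{name}_" if name else ""
    values: Dict[str, str] = {}
    missing = []
    for var in REDSHIFT_VARS:
        key = prefix + var
        if env.get(key):
            # Strip "REDSHIFT_" and lowercase, e.g. REDSHIFT_DB -> "db".
            values[var[len("REDSHIFT_"):].lower()] = env[key]
        else:
            missing.append(key)
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return values


# Placeholder values standing in for a real environment.
example_env = {
    "RISK_REDSHIFT_HOST": "example-host",
    "RISK_REDSHIFT_DB": "example-db",
    "RISK_REDSHIFT_USER": "example-user",
    "RISK_REDSHIFT_PASSWORD": "example-password",
}
settings = redshift_env("RISK", example_env)
print(sorted(settings))
```

The returned keys match the keyword arguments used in the example above (`host`, `db`, `user`, `password`), so the result could be passed along as `RedshiftSource(**settings)`.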