Chalk supports AWS Athena as a SQL source. This allows users to load data from AWS Glue and other AWS data sources (Hive, DocumentDB, Iceberg, etc.) directly into Chalk features.

Adding Athena

On the settings page, you’ll find a form for adding your Athena credentials. Note that Chalk unloads query results to the S3 staging directory you provide; these intermediate results appear under the chalk-unload folder.
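For orientation, Athena's UNLOAD statement wraps a SELECT and writes its results to an S3 location. The sketch below shows roughly what such a statement looks like. The helper name wrap_in_unload, the example bucket, the per-run subfolder, and the choice of Parquet are all assumptions made for illustration; the exact statement Chalk generates is not documented here.

```python
def wrap_in_unload(select_sql: str, staging_dir: str, run_id: str) -> str:
    """Wrap a SELECT in an Athena UNLOAD statement that writes Parquet to S3.

    Hypothetical helper for illustration only; not part of the Chalk API.
    """
    # UNLOAD takes the query in parentheses, so drop any trailing semicolon.
    query = select_sql.strip().rstrip(";")
    # Assumed layout: <staging_dir>/chalk-unload/<run_id>/
    # A per-run subfolder is used because Athena expects the UNLOAD
    # destination to contain no existing data.
    target = f"{staging_dir.rstrip('/')}/chalk-unload/{run_id}/"
    return f"UNLOAD ({query})\nTO '{target}'\nWITH (format = 'PARQUET')"


print(
    wrap_in_unload(
        "SELECT id, transaction_volume FROM transactions;",
        "s3://example-bucket/staging/",  # hypothetical staging directory
        "run-0001",  # hypothetical run identifier
    )
)
```

Because the results are intermediate, a lifecycle rule or periodic cleanup on the chalk-unload prefix may be worth considering, depending on how long you want them retained.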

Use the form to add an Athena integration. The parameters you enter will also be available as environment variables.

Integrations Setup

After configuring your Athena integration in the dashboard, define your data sources in Python:

from chalk.sql import AthenaSource

athena_source_txns = AthenaSource(name="ATHENA_TRANSACTIONS")
athena_source_marketing = AthenaSource(name="ATHENA_MARKETING")

Note that Chalk runs all Athena queries with UNLOAD so that it can handle larger-than-memory datasets.

Then reference each source in a SQL file resolver by the name you passed to AthenaSource. For example, to query the ATHENA_TRANSACTIONS source:

-- type: online
-- resolves: User
-- source: ATHENA_TRANSACTIONS
SELECT id, transaction_volume FROM transactions

And to query from the ATHENA_MARKETING source:

-- type: online
-- resolves: User
-- source: ATHENA_MARKETING
SELECT id, email, campaign_status FROM marketing_data
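Because each resolver refers to its source only by name, a typo in the source comment is easy to miss. The following is a minimal sketch of a pre-deploy check that reads the comment header of a SQL file resolver and confirms the source is one of the names defined in Python. The helpers parse_header and check_source are hypothetical and not part of Chalk, and treating the header as simple "key: value" lines is a simplifying assumption about the format.

```python
import re

# Names passed to AthenaSource(name=...) in the Python definitions above.
DEFINED_SOURCES = {"ATHENA_TRANSACTIONS", "ATHENA_MARKETING"}

# Matches comment lines of the form "-- key: value".
HEADER_LINE = re.compile(r"^--\s*(\w+)\s*:\s*(.+?)\s*$")


def parse_header(sql_text: str) -> dict:
    """Collect the leading '-- key: value' comment lines of a SQL file resolver."""
    header = {}
    for line in sql_text.strip().splitlines():
        match = HEADER_LINE.match(line.strip())
        if match is None:
            break  # the first non-header line ends the header
        header[match.group(1)] = match.group(2)
    return header


def check_source(sql_text: str) -> str:
    """Return the resolver's source name, or raise if it is not a defined source."""
    source = parse_header(sql_text).get("source")
    if source not in DEFINED_SOURCES:
        raise ValueError(f"unknown source: {source!r}")
    return source


resolver = """\
-- type: online
-- resolves: User
-- source: ATHENA_MARKETING
SELECT id, email, campaign_status FROM marketing_data
"""

print(parse_header(resolver))
print(check_source(resolver))
```

A check like this could run in CI over every SQL resolver file in the project, so a misspelled source name fails early rather than at query time.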