Testing
Snapshot testing with Chalk
Snapshot testing is an effective way of ensuring that your features are not dramatically changing between deployments. You can set up snapshot testing in Chalk by leveraging Datasets.
Typically, snapshot testing will be part of a CI (continuous integration) workflow. Pull requests will automatically trigger tests which verify that code changes have not caused feature distribution shifts.
In this section, we’ll walk through the process of creating snapshot tests using Pytest and GitHub Actions.
To set up snapshot testing, we’ll write a couple Pytest fixtures, a test, and a helper script. The fixtures will pull your current snapshot data and create a new snapshot based on your current code. The test will then compare the distributions of the features in your old and new datasets. The helper script will update the main snapshot and will be run when a PR is merged.
In this example we’ll be snapshot testing the features on the following feature class:
from chalk.features import features
import datetime as dt
@features
class User:
id: int
full_name: str
birthday: dt.datetime
age: int
In this example, we resolve the full_name
and birthday of our User’s from a Postgres database. The age feature is calculated using an online resolver:
from chalk import online
from chalk.features import Now
@online
def get_age_in_years(bday: User.birthday, now: Now) -> User.age:
return now.year - bday.year - ((now.month, now.day) < (bday.month, bday.day))
Though this is a rather trivial example, lets set some expectations for our feature distribution shifts between snapshots:
First we’ll write two fixtures: the first sets up a Chalk Client and the second creates your new snapshot (using the previous snapshot to make sure that the exact same User ids are being pulled).
from chalk.client import ChalkClient
from chalk.features import DataFrame
from src.feature_sets import Transaction, User
import datetime as dt
import pytest
import pandas as pd
@pytest.fixture(scope="session")
def client():
return ChalkClient(branch=True) # Uses your current git branch
@pytest.fixture(scope="session")
def snapshots(client: ChalkClient):
# The dataset name will be set to the latest commit on the branch
main_snapshot = client.get_dataset(dataset_name="user_snapshot")
main_snapshot_df = snapshot.to_pandas().set_index(User.id)
branch_snapshot = client.offline_query(
input={User.id: main_snapshot_df.index.to_list()},
recompute_features=True,
wait=True,
run_asynchronously=True
)
branch_snapshot_df = branch_snapshot.to_pandas().set_index(User.id)
return active_snapshot_df, branch_snapshot_df
These fixtures serve to output two Pandas DataFrames: the main_snapshot_df
(which is stored in
a dataset called user_snapshot
) and the branch_snapshot_df
(which is generated
on the fly with an offline query).
Next, we will use those snapshots in a test:
# Note, we skip this test locally since we only want to run it in CI
@pytest.mark.skipif(os.getenv("CI") is None, reason="Skipping on local machine")
def test_snapshot__user_feature_drift(snaphot_dfs: tuple[pd.DataFrame, pd.DataFrame]):
main_snapshot, branch_snapshot = snapshot_dfs
# Birthdays should not change
assert main_snapshot[User.birthday].equals(branch_snapshot[User.birthday])
# Full names should not change in more than 0.001% of users
assert (main_snapshot[User.full_name] != branch_snapshot[User.full_name]).mean() < 0.00001
# Age should not change by more than 2 years between runs
assert (main_snapshot[User.age] - branch_snapshot[User.age]).max() <= 2
In addition to the tests, we’ll create a script which will run when we merge pull requests—this
script will update the user_snapshot
dataset with the one generated with our latest code.
import os
from chalk.client import ChalkClient
branch = os.environ["GITHUB_REF_NAME"]
client = ChalkeClient(branch=branch)
dataset = client.get_dataset(dataset_name="user_snapshot")
dataset.recompute_features(branch=branch, wait=True, run_asynchrounously=True)
We will write two GitHub actions: one will run when we create a pull requests, the other will run when we merge code into our main branch.
The first GitHub Action sets up the environment and then runs pytest
. The
GitHub Action is set to run on pull
, which means it will run when we open a
pull request against main:
name: Chalk Integration Test
on:
pull_request:
branches: [main]
jobs:
chalk-test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup Python
uses: actions/setup-python@v4
with:
python-version: '3.10'
cache: 'pip'
- name: Install dependencies
run: pip install -r requirements.txt
- uses: chalk-ai/deploy-action@v2
with:
client-id: ${{secrets.CHALK_CLIENT_ID}}
client-secret: ${{secrets.CHALK_CLIENT_SECRET}}
# Deploys Chalk to a branch environment
branch: ${{ GITHUB_REF_NAME }}
# Waits for the deployment to succeed (Optional, default false)
await: true
- name: Runs the Snapshot Test
run: |
pytest -s ./tests
The second GitHub Action runs our update script to update the user_snapshot
dataset with our latest snapshot. It runs whenever we push code to the main branch.
on:
push:
branches: [main]
jobs:
update-snapshot:
steps:
- uses: actions/checkout@v4
- name: Setup Python
uses: actions/setup-python@v4
with:
python-version: '3.10'
cache: 'pip'
- name: Install dependencies
run: pip install -r requirements.txt
- uses: chalk-ai/cli-action@main
with:
client-id: ${{ secrets.CHALK_CLIENT_ID }}
client-secret: ${{ secrets.CHALK_CLIENT_SECRET }}
environment: ${{ secrets.CHALK_ENVIRONMENT }}
- name: Update snapshot
run: python3 scripts/update_user_snapshot.py
Before this works fully, we need to bootstrap the snapshot test. If you tried
to run the action above, you’d get an error when running client.get_dataset(...)
:
the error would indicate that the dataset was not found. To bootstrap your snapshot,
you’ll want to run an offline_query
to create a dataset named user_snapshot
.
from src.feature_sets import User
client.offline_query(
output=[User],
# This can be however many you want to include in your snapshot
max_samples=200_000,
recompute_features=True,
run_asynchrounously=True,
dataset_name="user_snapshot"
)
With your initial snapshot created, future pull requests will be able to automatically execute a snapshot test against your main deployment.