Snapshot Tests

Snapshot testing is an effective way of ensuring that your features are not dramatically changing between deployments. You can set up snapshot testing in Chalk by leveraging Datasets.

Typically, snapshot testing will be part of a CI (continuous integration) workflow. Pull requests will automatically trigger tests which verify that code changes have not caused feature distribution shifts.

In this section, we’ll walk through the process of creating snapshot tests using Pytest and GitHub Actions.

Overview

To set up snapshot testing, we’ll write a couple Pytest fixtures, a test, and a helper script. The fixtures will pull your current snapshot data and create a new snapshot based on your current code. The test will then compare the distributions of the features in your old and new datasets. The helper script will update the main snapshot and will be run when a PR is merged.

In this example we’ll be snapshot testing the features on the following feature class:

from chalk.features import features
import datetime as dt

@features
class User:
    id: int
    full_name: str
    birthday: dt.datetime
    age: int

In this example, we resolve the full_name and birthday of our User’s from a Postgres database. The age feature is calculated using an online resolver:

from chalk import online
from chalk.features import Now

@online
def get_age_in_years(bday: User.birthday, now: Now) -> User.age:
    return now.year - bday.year - ((now.month, now.day) < (bday.month, bday.day))

Though this is a rather trivial example, lets set some expectations for our feature distribution shifts between snapshots:

We expect a User’s birthday to never change,
We expect a User’s full_name to change in a maximum of 0.001% of our users (1/100,000),
We expect a User’s age to change by a maximum amount of 2 years (in reality, average age will increase by precisely the amount of time between invocations of our snapshot test).

Writing the Test and Helper Script

First we’ll write two fixtures: the first sets up a Chalk Client and the second creates your new snapshot (using the previous snapshot to make sure that the exact same User ids are being pulled).

from chalk.client import ChalkClient
from chalk.features import DataFrame
from src.feature_sets import Transaction, User
import datetime as dt
import pytest
import pandas as pd


@pytest.fixture(scope="session")
def client():
    return ChalkClient(branch=True) # Uses your current git branch


@pytest.fixture(scope="session")
def snapshots(client: ChalkClient):
    # The dataset name will be set to the latest commit on the branch
    main_snapshot = client.get_dataset(dataset_name="user_snapshot")
    main_snapshot_df = snapshot.to_pandas().set_index(User.id)

    branch_snapshot = client.offline_query(
        input={User.id: main_snapshot_df.index.to_list()},
        recompute_features=True,
        wait=True,
        run_asynchronously=True
    )
    branch_snapshot_df = branch_snapshot.to_pandas().set_index(User.id)

    return active_snapshot_df, branch_snapshot_df

These fixtures serve to output two Pandas DataFrames: the main_snapshot_df (which is stored in a dataset called user_snapshot) and the branch_snapshot_df (which is generated on the fly with an offline query).

Next, we will use those snapshots in a test:

# Note, we skip this test locally since we only want to run it in CI
@pytest.mark.skipif(os.getenv("CI") is None, reason="Skipping on local machine")
def test_snapshot__user_feature_drift(snaphot_dfs: tuple[pd.DataFrame, pd.DataFrame]):
    main_snapshot, branch_snapshot = snapshot_dfs

    # Birthdays should not change
    assert main_snapshot[User.birthday].equals(branch_snapshot[User.birthday])

    # Full names should not change in more than 0.001% of users
    assert (main_snapshot[User.full_name] != branch_snapshot[User.full_name]).mean() < 0.00001

    # Age should not change by more than 2 years between runs
    assert (main_snapshot[User.age] - branch_snapshot[User.age]).max() <= 2

In addition to the tests, we’ll create a script which will run when we merge pull requests—this script will update the user_snapshot dataset with the one generated with our latest code.

import os
from chalk.client import ChalkClient

branch = os.environ["GITHUB_REF_NAME"]
client = ChalkeClient(branch=branch)

dataset = client.get_dataset(dataset_name="user_snapshot")

dataset.recompute_features(branch=branch, wait=True, run_asynchrounously=True)

Writing the GitHub Action

We will write two GitHub actions: one will run when we create a pull requests, the other will run when we merge code into our main branch.

The first GitHub Action sets up the environment and then runs pytest. The GitHub Action is set to run on pull, which means it will run when we open a pull request against main:

name: Chalk Integration Test

on:
  pull_request:
    branches: [main]

jobs:
  chalk-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
          cache: 'pip'

      - name: Install dependencies
        run: pip install -r requirements.txt

      - uses: chalk-ai/deploy-action@v2
        with:
          client-id: ${{secrets.CHALK_CLIENT_ID}}
          client-secret: ${{secrets.CHALK_CLIENT_SECRET}}
          # Deploys Chalk to a branch environment
          branch: ${{ GITHUB_REF_NAME }}
          # Waits for the deployment to succeed (Optional, default false)
          await: true

      - name: Runs the Snapshot Test
        run: |
          pytest -s ./tests

The second GitHub Action runs our update script to update the user_snapshot dataset with our latest snapshot. It runs whenever we push code to the main branch.

on:
  push:
    branches: [main]
jobs:
  update-snapshot:
    steps:
      - uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
          cache: 'pip'

      - name: Install dependencies
        run: pip install -r requirements.txt

      - uses: chalk-ai/cli-action@main
        with:
          client-id: ${{ secrets.CHALK_CLIENT_ID }}
          client-secret: ${{ secrets.CHALK_CLIENT_SECRET }}
          environment: ${{ secrets.CHALK_ENVIRONMENT }}

      - name: Update snapshot
        run: python3 scripts/update_user_snapshot.py

Bootstrapping the GitHub Action

Before this works fully, we need to bootstrap the snapshot test. If you tried to run the action above, you’d get an error when running client.get_dataset(...): the error would indicate that the dataset was not found. To bootstrap your snapshot, you’ll want to run an offline_query to create a dataset named user_snapshot.

from src.feature_sets import User

client.offline_query(
    output=[User],
     # This can be however many you want to include in your snapshot
    max_samples=200_000,
    recompute_features=True,
    run_asynchrounously=True,
    dataset_name="user_snapshot"
)

With your initial snapshot created, future pull requests will be able to automatically execute a snapshot test against your main deployment.

​Overview

​Writing the Test and Helper Script

​Writing the GitHub Action

​Bootstrapping the GitHub Action

On this page

Overview

Writing the Test and Helper Script

Writing the GitHub Action

Bootstrapping the GitHub Action