Feature Engine
Define multiple versions of a windowed aggregation with independent expressions, time windows, and materialization configs.
Versioned windowed aggregations combine feature versioning with windowed features. They let you evolve a windowed aggregation’s expression, time windows, or materialization configuration across multiple versions without affecting existing consumers.
Like scalar versioned features, each version is independently addressable with the @ operator. Consumers that don’t specify a version continue to see the default (v1), while new consumers can opt in to a newer version explicitly.
Use versioned windowed aggregations when:
sum to mean), but existing consumers depend on the current behavior.Supply a versions dict to feature(), where each value is a windowed() call:
from chalk import Windowed, windowed
from chalk.features import features, feature, DataFrame, _
from datetime import datetime
@features
class Transaction:
id: int
user_id: "User.id"
amount: float
timestamp: datetime
@features
class User:
id: int
transactions: DataFrame[Transaction]
total_spend: Windowed[float] = feature(versions={
1: windowed(
"30d",
expression=_.transactions[
_.amount,
_.timestamp > _.chalk_window,
_.timestamp < _.chalk_now,
].sum(),
materialization={"bucket_duration": "1d"},
),
2: windowed(
"30d",
expression=_.transactions[
_.amount,
_.timestamp > _.chalk_window,
_.timestamp < _.chalk_now,
].mean(),
materialization={"bucket_duration": "1d"},
),
})Each version can independently specify its expression, set of time windows, materialization config, and other windowed() parameters.
Use the @ operator to request a specific version. Without it, the default version (v1 unless overridden with default_version) is returned.
from chalk.client import ChalkClient
from src.models import User
client = ChalkClient()
# Default version (v1)
result = client.query(
input={User.id: 42},
output=[User.total_spend["30d"]],
)
# Explicitly request version 2
result = client.query(
input={User.id: 42},
output=[User.total_spend["30d"] @ 2],
)In the CLI, windowed features use the FQN format <namespace>.<feature>__<seconds>__. Non-default versions get an @<version> suffix.
# Default version (v1) — no suffix
$ chalk query --in user.id=42 --out user.total_spend__2592000__
# Version 2 — @2 suffix
$ chalk query --in user.id=42 --out user.total_spend__2592000__@2See Referencing windowed features for how window durations map to their FQN representations.
Each version can define a different set of time windows. This is useful when a newer version of an aggregation needs to support a longer lookback or a different granularity:
@features
class User:
id: int
transactions: DataFrame[Transaction]
spend: Windowed[float] = feature(versions={
1: windowed(
"30d",
expression=_.transactions[
_.amount,
_.timestamp > _.chalk_window,
_.timestamp < _.chalk_now,
].sum(),
materialization={"bucket_duration": "1d"},
),
2: windowed(
"30d", "90d",
expression=_.transactions[
_.amount,
_.timestamp > _.chalk_window,
_.timestamp < _.chalk_now,
].sum(),
materialization={"bucket_duration": "1d"},
),
})Here v1 exposes only a 30d window, while v2 adds a 90d window. The new window is addressable as User.spend["90d"] @ 2 in Python, or user.spend__7776000__@2 in CLI/FQN form.
Once all consumers have migrated to a newer version, you can promote it to the default using default_version:
total_spend: Windowed[float] = feature(
default_version=2,
versions={
1: windowed(
"30d",
expression=_.transactions[
_.amount,
_.timestamp > _.chalk_window,
_.timestamp < _.chalk_now,
].sum(),
materialization={"bucket_duration": "1d"},
),
2: windowed(
"30d",
expression=_.transactions[
_.amount,
_.timestamp > _.chalk_window,
_.timestamp < _.chalk_now,
].mean(),
materialization={"bucket_duration": "1d"},
),
},
)It is best practice to leave default_version=1 until all consumers have been updated to the newer version.
Use chalk aggregate backfill with the @N version suffix to backfill a specific version’s materialized aggregation:
chalk aggregate backfill --feature user.total_spend@2Without the @N suffix, Chalk backfills the default version. See Materialized Windowed Aggregations for full backfill options.
versions dict must be windowed() calls. You cannot mix windowed() and feature() values in the same dict.@ operator selects the version, not the time window. To specify both, use User.feature["30d"] @ 2.