Caching

When a feature is expensive or slow to compute, you may wish to cache its value. Chalk uses the terminology “maximum staleness” to describe how recently a feature value needs to have been computed to be returned without re-running a resolver.
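To make the idea concrete, the freshness test implied by "maximum staleness" can be sketched in plain Python. This is only an illustration of the concept, not Chalk's implementation: the function name is_fresh is hypothetical, and treating a value whose age exactly equals the limit as still fresh is an assumption.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(computed_at: datetime, now: datetime, max_staleness: timedelta) -> bool:
    """Return True if a cached value is recent enough to serve without recomputing.

    Hypothetical helper: the inclusive boundary (age == max_staleness counts
    as fresh) is an assumption made for this sketch.
    """
    age = now - computed_at
    return age <= max_staleness


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
cached_at = now - timedelta(seconds=60)  # value was computed one minute ago

# A 90-second limit tolerates a 60-second-old value; a 30-second limit does not.
fresh_under_90s = is_fresh(cached_at, now, timedelta(seconds=90))
fresh_under_30s = is_fresh(cached_at, now, timedelta(seconds=30))
print(fresh_under_90s, fresh_under_30s)
```

When the check fails, the cached value is too old and the resolver would need to run again.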

You can specify the maximum staleness for a feature as follows:

from chalk.features import feature, features
from datetime import timedelta

@features
class UserFeatures:
    # Using text descriptors:
    expensive_fraud_score: float = feature(
        max_staleness="1m 30s"
    )

    # Alternatively, using timedelta (equivalent to the declaration above;
    # in real code, declare the feature only once):
    expensive_fraud_score: float = feature(
        max_staleness=timedelta(minutes=1, seconds=30)
    )

Max staleness durations can be given in natural language, such as “1m 30s”, or specified using datetime.timedelta. You can specify a max staleness of “infinity” to indicate that Chalk should cache computed feature values forever. This makes sense for data that never becomes invalid, or for data that you update explicitly using Streaming Updates or Reverse ETL.
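To show how the two forms relate, here is a rough sketch of turning a text descriptor into a timedelta. It is not Chalk's parser: the function parse_staleness, the set of unit suffixes, and the choice to represent “infinity” as timedelta.max are all assumptions made for illustration.

```python
import re
from datetime import timedelta

# Unit suffixes this sketch understands (an assumption, not Chalk's grammar).
_UNITS = {"w": "weeks", "d": "days", "h": "hours", "m": "minutes", "s": "seconds"}
_TOKEN = re.compile(r"(\d+)\s*([wdhms])")


def parse_staleness(text: str) -> timedelta:
    """Convert a descriptor such as "1m 30s" into a timedelta (hypothetical helper)."""
    text = text.strip().lower()
    if text == "infinity":
        # Assumption: model "cache forever" as the largest representable duration.
        return timedelta.max
    matches = _TOKEN.findall(text)
    # Reject input that has no tokens or has characters left over after the tokens.
    if not matches or _TOKEN.sub("", text).strip():
        raise ValueError(f"unrecognized duration: {text!r}")
    total = timedelta()
    for amount, unit in matches:
        total += timedelta(**{_UNITS[unit]: int(amount)})
    return total


# The text and timedelta forms from the example above describe the same duration.
same = parse_staleness("1m 30s") == timedelta(minutes=1, seconds=30)
print(same)
```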

Staleness can also be assigned to all features in a namespace:

from chalk.features import feature, features

@features(max_staleness="1d")
class User:
    fraud_score: float
    full_name: str
    email: str = feature(max_staleness="0s")
    ...

Here, User.fraud_score and User.full_name inherit the class-level max staleness of 1d. User.email, however, sets max staleness at the feature level, which takes precedence: its max staleness is 0s, forcing it to be recomputed on every request.

Default values

By default, features are not cached, and instead are recomputed for every online request. In effect, you can think of max_staleness as being 0 except where otherwise specified.
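The way a feature's staleness is resolved (feature-level setting first, then the class-level setting, then the default of 0) can be sketched as a small function. The name effective_max_staleness and the use of None for "not specified" are assumptions for illustration; this is not how Chalk represents these settings internally.

```python
from datetime import timedelta
from typing import Optional

# Default from the text: with no setting, a feature is recomputed every request.
NOT_CACHED = timedelta(0)


def effective_max_staleness(
    feature_level: Optional[timedelta],
    class_level: Optional[timedelta],
) -> timedelta:
    """Pick the staleness that applies to one feature (hypothetical helper).

    None means "not specified" at that level.
    """
    # Compare against None rather than truthiness: an explicit timedelta(0)
    # is falsy but is still a deliberate feature-level setting.
    if feature_level is not None:
        return feature_level
    if class_level is not None:
        return class_level
    return NOT_CACHED


one_day = timedelta(days=1)

fraud_score = effective_max_staleness(None, one_day)      # inherits the class value
email = effective_max_staleness(timedelta(0), one_day)    # explicit 0s wins
unconfigured = effective_max_staleness(None, None)        # falls back to the default
print(fraud_score, email, unconfigured)
```

Checking for None explicitly matters here: writing `feature_level or class_level` would silently discard a feature-level setting of 0s.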

In an offline environment, all feature values are taken from past runs or historical tables, so max_staleness does not apply.

Overriding default caching

The max_staleness values provided to the feature function may be overridden at the time of querying for features. See Overriding Default Caching for a detailed discussion.