When a feature is expensive or slow to compute, you may wish to cache its value. Chalk uses the terminology “maximum staleness” to describe how recently a feature value needs to have been computed to be returned without re-running a resolver.
You can specify the maximum staleness for a feature as follows:
    from chalk.features import feature, features
    from datetime import timedelta

    @features
    class UserFeatures:
        # Using text descriptors:
        expensive_fraud_score: float = feature(
            max_staleness="1m 30s"
        )
        # Alternatively, using timedelta:
        expensive_fraud_score: float = feature(
            max_staleness=timedelta(minutes=1, seconds=30)
        )
Max staleness durations can be given in natural language, or specified using datetime.timedelta. You can specify a max staleness of “infinity” to indicate that Chalk should cache computed feature values forever. This makes sense for data that never becomes invalid, or for data that you wish to explicitly update using Streaming Updates or Reverse ETL.
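To make the equivalence between the two forms concrete, here is a minimal sketch of how a text descriptor such as "1m 30s" could map onto a timedelta. The parse_duration helper, its unit table, and the use of timedelta.max to stand in for "infinity" are all assumptions made for illustration; this is not Chalk's parser, and Chalk may accept spellings this sketch rejects.

```python
import re
from datetime import timedelta

# Hypothetical unit table for this sketch only; Chalk may accept other spellings.
_UNIT_SECONDS = {"w": 604800, "d": 86400, "h": 3600, "m": 60, "s": 1}
_TOKEN = re.compile(r"(\d+)\s*([wdhms])")


def parse_duration(text: str) -> timedelta:
    """Convert a descriptor such as "1m 30s" into a timedelta (illustrative only)."""
    cleaned = text.strip().lower()
    if cleaned == "infinity":
        # Assumption: model "cache forever" as the largest representable timedelta.
        return timedelta.max
    # Reject anything that is not a sequence of <number><unit> tokens.
    if not re.fullmatch(r"(\d+\s*[wdhms]\s*)+", cleaned):
        raise ValueError(f"unrecognized duration: {text!r}")
    seconds = sum(int(n) * _UNIT_SECONDS[u] for n, u in _TOKEN.findall(cleaned))
    return timedelta(seconds=seconds)


ninety_seconds = parse_duration("1m 30s")
```

Under these assumptions, parse_duration("1m 30s") and timedelta(minutes=1, seconds=30) describe the same duration, which is why either form can be passed as max_staleness.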
Staleness can also be assigned to all features in a namespace:
    @features(max_staleness="1d")
    class User:
        fraud_score: float
        full_name: str
        email: str = feature(max_staleness="0s")
        ...
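Here, User.fraud_score and User.full_name, which do not specify their own value, assume the namespace-level max-staleness of 1d. User.email, which specifies max-staleness at the feature level, assumes the max-staleness of 0s, forcing it to be recomputed on every request.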
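The precedence described above can be modeled in a few lines of plain Python. This is a sketch of the rule only, not Chalk's implementation: effective_max_staleness and DEFAULT_MAX_STALENESS are hypothetical names, and the fallback of "0s" reflects the documented default of no caching.

```python
from typing import Optional

# Assumption: an unset max_staleness behaves like zero (no caching).
DEFAULT_MAX_STALENESS = "0s"


def effective_max_staleness(
    feature_level: Optional[str],
    namespace_level: Optional[str],
) -> str:
    """Pick the setting that applies to one feature (illustrative model)."""
    if feature_level is not None:
        # A value set on the feature itself wins.
        return feature_level
    if namespace_level is not None:
        # Otherwise the feature inherits the namespace-wide value.
        return namespace_level
    return DEFAULT_MAX_STALENESS


# Mirrors the User class: only email sets its own value.
namespace_setting = "1d"
feature_settings = {"fraud_score": None, "full_name": None, "email": "0s"}
resolved = {
    name: effective_max_staleness(setting, namespace_setting)
    for name, setting in feature_settings.items()
}
```

In this model, resolved maps fraud_score and full_name to "1d" and email to "0s", matching the behavior described for the User class.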
By default, features are not cached, and instead are recomputed for every online request. In effect, you can think of max_staleness as being 0 except where otherwise specified.
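One way to picture what a max_staleness of 0 means is as a freshness test applied to a cached value before it is served. The sketch below is an assumed model, not Chalk's code: can_serve_cached is a hypothetical helper, and the strict comparison is chosen so that a zero duration never yields a cache hit.

```python
from datetime import datetime, timedelta, timezone


def can_serve_cached(
    computed_at: datetime,
    now: datetime,
    max_staleness: timedelta,
) -> bool:
    """Return True if a value computed at computed_at is still fresh enough."""
    age = now - computed_at
    # Strict comparison (an assumption): with max_staleness of 0, nothing is fresh.
    return age < max_staleness


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)

# Computed 60 seconds ago with a 90-second budget: the cached value is usable.
fresh = can_serve_cached(now - timedelta(seconds=60), now, timedelta(seconds=90))

# Same value under the default of 0: the resolver has to run again.
default_case = can_serve_cached(now - timedelta(seconds=60), now, timedelta(0))
```

With a very large budget such as timedelta.max, the same check always passes, which corresponds to caching a value indefinitely.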
In an offline environment, all feature values are taken from past runs or historical tables, so max_staleness does not apply.
max_staleness values provided in feature definitions may be overridden at the time of querying for features.
See Overriding Default Caching for a detailed discussion.