Resolvers can depend on other features to compute their features. These dependencies are declared through the type signature of the arguments to the resolver function.
To depend on a feature from a Feature Set, you give your resolver an argument with that feature as the type. You can then use that argument in the body of your resolver to compute your output features. If you’re running our editor plugin, your editor will see the type of each variable as the type of the underlying scalar.
from chalk.features import features, online

@features
class User:
    id: str
    email: str
    email_domain: str

@online
def get_domain(email: User.email) -> User.email_domain:
    # type(email) == str
    # Take the part after the last '@' and lowercase it.
    return email.split('@')[-1].lower()
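To make the idea of "dependencies declared through the type signature" concrete, here is a minimal sketch in plain Python, with no Chalk imports. It assumes string annotations as stand-ins for feature references, and `declared_dependencies` is a hypothetical helper, not part of Chalk's API. It only illustrates that a function's signature carries enough information to recover its inputs and output.

```python
import inspect

# Hypothetical stand-in: string annotations take the place of feature
# references such as User.email. This is not how Chalk itself is implemented.
def get_domain(email: "User.email") -> "User.email_domain":
    # Take the part after the last '@' and normalize its case.
    return email.split("@")[-1].lower()

def declared_dependencies(fn):
    """Return (input annotations by argument name, output annotation)."""
    sig = inspect.signature(fn)
    inputs = {name: param.annotation for name, param in sig.parameters.items()}
    return inputs, sig.return_annotation

inputs, output = declared_dependencies(get_domain)
print(inputs)   # argument name mapped to the feature it depends on
print(output)   # the feature the function computes
print(get_domain("Jane@Example.COM"))
```

Because the resolver body is ordinary Python, its logic can be exercised directly with plain values, as the last line does.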
You can require multiple features in a resolver. However, all feature dependencies in a single resolver need to originate at the same root namespace:
Requiring features from the same root namespace
@online
def fn(a: User.email, b: User.name) -> User.email_name_match:
    return ...
Here, we incorrectly request features from two different root namespaces, User and Transfer:
Requiring features from different root namespaces
@online
def fn(email: User.email, memo: Transfer.memo) -> Transfer.email_in_memo:
    return email in memo
Requiring features from different root namespaces using a relationship
@online
def fn(email: Transfer.user.email, memo: Transfer.memo) -> Transfer.email_in_memo:
    return email in memo
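The single-root rule can be pictured as a check over the dotted paths of a resolver's inputs. The sketch below is an illustration only: it assumes features are written as strings like "Transfer.user.email", and `root_namespace` and `check_single_root` are hypothetical helpers, not Chalk functions.

```python
from typing import List

def root_namespace(feature_path: str) -> str:
    """Return the first segment of a dotted feature path."""
    return feature_path.split(".", 1)[0]

def check_single_root(dependencies: List[str]) -> str:
    """Return the shared root namespace, or raise if there is not exactly one."""
    roots = {root_namespace(path) for path in dependencies}
    if len(roots) != 1:
        raise ValueError(f"expected one root namespace, found: {sorted(roots)}")
    return roots.pop()

# Reaching the user's email through the transfer keeps a single root.
print(check_single_root(["Transfer.user.email", "Transfer.memo"]))

# Mixing User.* and Transfer.* inputs is rejected.
try:
    check_single_root(["User.email", "Transfer.memo"])
except ValueError as err:
    print("rejected:", err)
```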
You can also require features joined to a Feature Set through has-one relationships. For example, if users in your system have bank accounts, and you wanted to compare the name on the user’s bank account to the user’s name, you could require the user’s name and the account’s title through the user:
@online
def name_sim(title: User.account.title, name: User.full_name) -> ...:
    ...
You can also require all scalars on the user’s profile:
@online
def fn(profile: User.profile) -> ...:
    profile.signup_date
Chalk will materialize all scalar features on the profile before calling this function. If you want to pull only a few features from the profile, require each directly:
@online
def fn(signup_date: User.profile.signup_date, age: User.profile.age) -> ...:
    ...
Has-one relationships can also be declared as optional. You may also require features through optional relationships, but the types of all features required that way become optional. Consider the example below:
from chalk.features import features, has_one, online

@features
class Account:
    id: int
    user_id: int
    balance: float  # Non-optional balance

@features
class User:
    id: int
    # Optional relationship
    account: Account | None = has_one(lambda: Account.user_id == User.id)
    has_high_balance: bool

@online
def has_high_bal(balance: User.account.balance) -> User.has_high_balance:
    # Balance will be "float | None"
    if balance is None:
        return False
    return balance > 1000
The resolver in this example receives an optional balance, even though balance is not an optional field on Account. The optional is added because the user may not have an account, in which case the resolver will receive None for the balance.
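The effect of an optional relationship on a non-optional field can be sketched with plain dataclasses. The classes below are simplified stand-ins, not Chalk feature classes, and `balance_through_user` is a hypothetical helper that mimics what the resolver's input would look like.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Account:
    balance: float  # non-optional on the account itself

@dataclass
class User:
    account: Optional[Account] = None  # the user may have no account

def balance_through_user(user: User) -> Optional[float]:
    # Traversing the optional link makes the result optional.
    return user.account.balance if user.account is not None else None

def has_high_bal(balance: Optional[float]) -> bool:
    # Same logic as the resolver body: a missing balance is not "high".
    if balance is None:
        return False
    return balance > 1000

print(has_high_bal(balance_through_user(User())))
print(has_high_bal(balance_through_user(User(Account(balance=2500.0)))))
```

Handling the None case first keeps the comparison from ever running against a missing value.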
Consider a schema where users have a feature set of profile information, and the user’s profile has an identity feature set, which in turn has the age of the user’s email. You can require the email age feature as below:
@online
def fn(email_age: User.profile.identity.email_age) -> ...:
    ...
However, you cannot access nested relationships without explicitly asking for them.
Accessing a transitive relationship from a dependency.
@online
def fn(acct: User.account) -> ...:
    acct.balance           # Ok
    acct.institution.name  # Error!
Instead, you can require the nested relationship directly and access any of its scalar features.
Directly requiring the transitive relationship.
@online
def fn(ins: User.account.institution, acct: User.account) -> ...:
    acct.balance  # Ok
    ins.name      # Ok
The semantics of optional has-one dependencies carry over to nested has-one dependencies. If you traverse an optional relationship, then all downstream attributes will become optional.
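One way to picture how optionality propagates through nested relationships is a path traversal that stops at the first missing link. This is a plain-Python sketch under that assumption. `traverse` is a hypothetical helper and the objects are `SimpleNamespace` stand-ins, not Chalk feature sets.

```python
from types import SimpleNamespace
from typing import Any

def traverse(root: Any, path: str) -> Any:
    """Follow a dotted attribute path, yielding None once any link is None."""
    current = root
    for attr in path.split("."):
        if current is None:
            return None
        current = getattr(current, attr)
    return current

institution = SimpleNamespace(name="First Bank")
with_institution = SimpleNamespace(
    account=SimpleNamespace(balance=10.0, institution=institution)
)
without_institution = SimpleNamespace(
    account=SimpleNamespace(balance=10.0, institution=None)
)
no_account = SimpleNamespace(account=None)

print(traverse(with_institution, "account.institution.name"))
print(traverse(without_institution, "account.institution.name"))
print(traverse(no_account, "account.balance"))
```

Every attribute downstream of a missing link comes back as None, which is why the corresponding resolver arguments must be typed as optional.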
You can also require has-many relationships as inputs to your resolver:
@online
def fn(transfers: User.transfers) -> ...:
    ...
By default, Chalk will materialize all scalar features on the Transfer feature set before calling your resolver. As an optimization hint, you can specify which features from the transfers you’d like Chalk to materialize before calling the function. For example, if some features on the transfer are expensive to compute, you could scope the features to only the set you need:
@online
def fn(transfers: User.transfers[Transfer.amount, Transfer.memo]) -> ...:
    transfers[Transfer.amount].sum()      # Ok
    transfers[Transfer.from_institution]  # Error: filtered out above
The error above is surfaced statically by our editor plugin.
You can apply filters to the has-many inputs of resolvers:
@online
def fn(transfers: User.transfers[Transfer.amount > 100]) -> ...:
    ...

Filters can be combined with feature scoping in the same subscript:

@online
def fn(transfers: User.transfers[Transfer.amount > 100, Transfer.memo]) -> ...:
    ...
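The combined effect of filtering rows and scoping columns can be sketched on a list of dictionaries. This is only an analogy for the semantics described above. `scope` is a hypothetical helper, the sample rows are invented for illustration, and nothing here reflects how Chalk executes these expressions.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]

def scope(
    rows: List[Row],
    predicate: Optional[Callable[[Row], bool]] = None,
    columns: Optional[List[str]] = None,
) -> List[Row]:
    """Keep rows matching the predicate, then keep only the named columns."""
    kept = [row for row in rows if predicate is None or predicate(row)]
    if columns is None:
        return kept
    return [{col: row[col] for col in columns} for row in kept]

# Invented sample data standing in for a user's transfers.
transfers = [
    {"amount": 50.0, "memo": "coffee", "from_institution": "A"},
    {"amount": 250.0, "memo": "rent", "from_institution": "B"},
    {"amount": 120.0, "memo": "gift", "from_institution": "A"},
]

large = scope(
    transfers,
    predicate=lambda row: row["amount"] > 100,
    columns=["amount", "memo"],
)
total = sum(row["amount"] for row in large)
print(large)
print(total)
```

After scoping, the `from_institution` key is no longer present on the rows, which mirrors the "filtered out above" error in the resolver examples.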
Has-many relationships can be required through has-one relationships:
@online
def fn(transfers: User.account.transfers) -> ...:
    ...
As with scalar has-many dependencies, you can scope down the scalar features on the transfer to only those required:
@online
def fn(transfers: User.account.transfers[Transfer.amount]) -> ...:
    transfers[Transfer.amount].sum()      # Ok
    transfers[Transfer.from_institution]  # Error: filtered out above
Has-one relationships can be required through has-many relationships:

@online
def fn(ts: User.transfers[Transfer.account, Transfer.amount]) -> ...:
    ts[Transfer.account.balance].sum()  # Ok
    ts[Transfer.amount].sum()           # Ok
    ts[Transfer.memo]                   # Error! Filtered out
You can also refine the types pulled from the nested has-one:
@online
def fn(ts: User.transfers[Transfer.account.balance]) -> ...:
    ts[Transfer.account.balance].sum()  # Ok
    ts[Transfer.account.title]          # Error! Filtered out
Has-many relationships can be required through other has-many relationships. For example, consider the following feature definitions, where a user can have many accounts, each with many transactions.
from chalk.features import features, has_many, DataFrame

@features
class Transaction:
    id: str
    account_id: str
    amount: float

@features
class Account:
    id: str
    user_id: str
    transactions: DataFrame[Transaction] = has_many(
        lambda: Transaction.account_id == Account.id
    )

@features
class User:
    id: str
    total_spent: float
    accounts: DataFrame[Account] = has_many(lambda: Account.user_id == User.id)
We can resolve the total_spent feature on User by computing the sum of transaction amounts across all of a user’s accounts, as shown below.
@online
def get_total_spent(
    txns: User.accounts.transactions[Transaction.amount]
) -> User.total_spent:
    return txns.sum()
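Conceptually, requiring a has-many through another has-many flattens the nested collections into one set of rows before aggregating. The sketch below shows that flattening in plain Python. The dataclasses `Txn`, `Acct`, and `Person` are hypothetical stand-ins for the feature classes, and the amounts are invented sample values.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Txn:
    amount: float

@dataclass
class Acct:
    transactions: List[Txn] = field(default_factory=list)

@dataclass
class Person:
    accounts: List[Acct] = field(default_factory=list)

def total_spent(person: Person) -> float:
    # Flatten accounts -> transactions, then sum the amounts.
    return sum(txn.amount for acct in person.accounts for txn in acct.transactions)

person = Person(
    accounts=[
        Acct(transactions=[Txn(10.0), Txn(20.5)]),
        Acct(transactions=[Txn(5.0)]),
        Acct(),  # an account with no transactions contributes nothing
    ]
)
print(total_spent(person))
```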