Resolvers
Automate resolver runs.
You can schedule resolvers to run on a pre-determined schedule
via the cron
argument to resolver decorators.
@online(cron="90d")
def fn() -> DataFrame[User.id, User.name]:
return ...
The cron
keyword arguments takes a duration
to determine the schedule on which to run. For more fine-grained control,
you can alternatively specify a crontab.
@online(cron="*/5 * * * *")
def region_average(
houses: DataFrame[
House.city, House.rent_price
]
) -> DataFrame[Region.name, Region.average_price]:
return houses.group_by(
group={Region.name: House.city},
agg={
Region.average_price: op.mean(House.rent_price),
},
)
Scheduled resolvers can also take arguments. Chalk gives you control over which arguments should be provided for each run of the schedule.
Chalk calls the resolver with
all unique argument names that could have called the function.
Consider a scheduled resolver that takes in
Feature 1
and Feature 2
:
In this example, we compute two points to sample: (1, 5) and (1, 7). These two values will be given to the resolver on the specified cron schedule.
You can choose to filter down the set of all examples to run.
The cron
keyword argument alternatively takes a Cron
class.
By specifying a function from features to boolean,
you can tell Chalk which of the default examples
to run, and which to skip:
def filter_examples(bank_id: User.bank_account.balance) -> bool:
return balance > 100
@online(cron=Cron(schedule="1d", filter=filter_examples))
def fn(balance: User.bank_account.balance) -> ...:
The arguments to the filter function all need to be rooted in the same entity as the features in the scheduled resolver, but there is no requirement that the filter function take a subset of the scheduled resolver’s arguments:
def filter_examples(status: User.bank_account.status) -> bool:
return status == "opened"
@online(cron=Cron(schedule="1d", filter=filter_examples))
def fn(balance: User.bank_account.balance) -> ...:
Finally, you can pick exactly the examples that you’d like to run.
def pull_examples() -> DataFrame[User.id]:
return DataFrame.read_csv(...)
@online(cron=Cron(schedule="1d", samples=pull_examples))
def fn(uid: User.id) -> ...:
In the above example, we provide all arguments to the resolver function. However, you may also choose to provide only a subset of the arguments, and Chalk will sample the other arguments.
def pull_examples() -> DataFrame[User.id]:
return DataFrame.read_csv(...)
@online(cron=Cron(schedule="1d", samples=pull_examples))
def fn(uid: User.id, balance: User.account.balance) -> ...: