Features
Define one-to-many and many-to-many relationships between feature sets.
A feature set can be linked to many instances of a different feature set
via the `has_many` function, whose first argument is a function that
returns the join condition between the two tables.
In the example below, users have many transfers:
from chalk.features import DataFrame, features, has_many

@features
class Transfer:
    ...
    user_id: str
    amount: float

@features
class User:
    ...
    uid: str
    transfers: DataFrame[Transfer] = has_many(lambda: Transfer.user_id == User.uid)
Wrapping the join condition in a `lambda` defers its evaluation, so it can reference `User` before the class has finished being defined.
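The deferral can be seen in plain Python, without Chalk. The sketch below is only an illustration of the language behavior: the string attributes are hypothetical stand-ins for feature references, and `EagerUser` is a made-up class that shows what fails without the `lambda`.

```python
class Transfer:
    user_id = "Transfer.user_id"  # hypothetical stand-in for a feature reference


class User:
    uid = "User.uid"  # hypothetical stand-in for a feature reference
    # The lambda body is not evaluated while the class body runs, so naming
    # `User` here is fine even though the name is not bound yet.
    join = lambda: (Transfer.user_id, User.uid)


# Once both classes exist, calling the lambda resolves the names.
resolved = User.join()

# Without the lambda, the expression is evaluated immediately and fails.
eager_error = None
try:
    class EagerUser:
        uid = "EagerUser.uid"
        join = (Transfer.user_id, EagerUser.uid)  # `EagerUser` is not bound yet
except NameError as exc:
    eager_error = type(exc).__name__
```

How Chalk itself evaluates the function passed to `has_many` is not shown here; the point is only that a function body is looked up when called, not when defined.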
Now, you can reference the transfers for a user through the user.
The `has_many` function returns a `chalk.DataFrame`,
which supports many helpful aggregate operations:
# Number of transfers made by a user
User.transfers.count()

# Total amount of transfers made by the user
User.transfers[Transfer.amount].sum()

# Total amount of the transfers made by the user that were returned
User.transfers[
    Transfer.status == "returned",
    Transfer.amount
].sum()
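To make the meaning of these aggregates concrete, here is a rough plain-Python equivalent over in-memory rows. It does not use Chalk; `TransferRow` and the sample data are hypothetical, and the list comprehension stands in for the join `Transfer.user_id == User.uid`.

```python
from dataclasses import dataclass


@dataclass
class TransferRow:  # hypothetical row type, not a Chalk feature class
    user_id: str
    amount: float
    status: str


rows = [
    TransferRow("u1", 10.0, "settled"),
    TransferRow("u1", 5.0, "returned"),
    TransferRow("u2", 7.5, "returned"),
]

# Rows joined to user "u1" (the effect of Transfer.user_id == User.uid)
u1_transfers = [r for r in rows if r.user_id == "u1"]

# Counterpart of User.transfers.count()
transfer_count = len(u1_transfers)

# Counterpart of User.transfers[Transfer.amount].sum()
total_amount = sum(r.amount for r in u1_transfers)

# Counterpart of the filtered sum over returned transfers
returned_amount = sum(r.amount for r in u1_transfers if r.status == "returned")
```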
A one-to-many relationship is completed by a `has_one` relationship
on the other side of the relation.
However, you don’t have to call `has_one` explicitly.
Instead, the join condition is assumed to be symmetric and is copied over.
Building on the example above, all you need to do
to complete the one-to-many relationship is add a `User` attribute
to the `Transfer` class:
@features
class Transfer:
    ...
    user_id: str
    amount: float
    user: "User"

@features
class User:
    ...
    uid: str
    transfers: DataFrame[Transfer] = has_many(lambda: Transfer.user_id == User.uid)
Here, `User` must be quoted because it is a forward reference: the `User` class is defined after `Transfer`.
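The quoting follows standard Python annotation behavior, which the sketch below demonstrates without Chalk. The classes are bare stand-ins, and passing `localns` explicitly is a choice made here so the example does not depend on how the module is loaded; it says nothing about how Chalk resolves annotations internally.

```python
from typing import get_type_hints


class Transfer:
    user_id: str
    user: "User"  # string annotation: `User` does not exist yet


class User:
    uid: str


# The annotation is stored as a plain string until something resolves it.
raw = Transfer.__annotations__["user"]

# Resolution happens later, once `User` is available.
resolved = get_type_hints(Transfer, localns={"User": User})["user"]
```

An unquoted `user: User` in `Transfer` would typically raise `NameError` at class-definition time (unless annotation evaluation is postponed), since `User` has not been defined at that point.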
Alternatively, you could define the join condition on the `has_one`
side, and add the `DataFrame` to the `User` class:
@features
class Transfer:
    ...
    user_id: str
    amount: float
    user: "User" = has_one(lambda: Transfer.user_id == User.uid)

@features
class User:
    ...
    uid: str
    transfers: DataFrame[Transfer]
Again, you need to use quotes around `User` to deal with forward references.
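Whichever side declares it, a single join condition determines both directions of the relationship. The plain-Python sketch below illustrates that symmetry; `UserRow`, `TransferRow`, and `link` are hypothetical names, and this is not how Chalk implements joins.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UserRow:  # hypothetical row type
    uid: str
    transfers: List["TransferRow"] = field(default_factory=list)


@dataclass
class TransferRow:  # hypothetical row type
    user_id: str
    amount: float
    user: Optional[UserRow] = None


def link(users: List[UserRow], transfers: List[TransferRow]) -> None:
    """Populate both sides from the one condition transfer.user_id == user.uid."""
    by_uid = {u.uid: u for u in users}
    for t in transfers:
        owner = by_uid.get(t.user_id)
        if owner is not None:
            t.user = owner               # the has_one side
            owner.transfers.append(t)    # the has_many side


users = [UserRow("u1"), UserRow("u2")]
transfers = [TransferRow("u1", 10.0), TransferRow("u1", 5.0), TransferRow("u2", 7.5)]
link(users, transfers)
```

After `link` runs, each transfer points at exactly one user and each user holds the list of its transfers, which is why the condition only needs to be written once.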
A many-to-many relationship is defined by a `has_many` relationship
on both sides of the relation.
As before, you don’t need to specify the join condition a second time
so long as the join condition is symmetric.
@features
class Book:
    ...
    author_id: str
    uuid: str
    authors: "DataFrame[Author]"

@features
class Author:
    ...
    uuid: str
    books: DataFrame[Book] = has_many(lambda: Author.uuid == Book.author_id)
Here you need to use quotes around `DataFrame[Author]` to use a forward reference.