How to build a marketplace recommendation system with Chalk.
Whether you’re building a content platform or a multi-sided marketplace, serving your users the best possible recommendations in a timely fashion can make or break customer satisfaction and revenue.
In this tutorial, we’ll walk through setting up a marketplace recommendation system and dive into how Chalk abstracts away the complexity of deploying high-throughput, low-latency production infrastructure.
Want to dive right into the code? Check out this GitHub repo!
Let’s first set the stage by modeling some of our core entities and their relationships.
Our demo marketplace has a few core entities with many interdependent relationships (primarily through interactions between users and sellers over a particular item):
Entities are represented with a Pythonic, dataclass-like syntax. Each set of features is decorated with @features, a decorator imported from Chalk’s SDK.
Note: keyword arguments can be passed into the @features decorator to define metadata like feature ownership, tags, cache policies, etc.
@features(owner="ml-core@chalk.ai")
class User:
    id: str
    first_name: str
    last_name: str
    # computed after we've fetched first and last name
    full_name: str = _.first_name + " " + _.last_name
Whenever we request features for a User with a command like:
chalk query --in user.id=50 --out full_name
Chalk knows to first fetch first_name and last_name from the underlying data source before computing full_name using the provided Chalk expression:
full_name: str = _.first_name + " " + _.last_name
This is all handled automatically! Chalk, at its core, is a feature orchestration engine that dynamically builds query plans for fetching and computing features on demand.
In other words, requesting first_name instead of full_name with chalk query --in user.id=50 --out first_name produces a totally different query plan than when we requested just full_name.
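The same query can also be issued programmatically. Here’s a minimal sketch using Chalk’s Python client (assuming credentials are already configured in your environment):

from chalk.client import ChalkClient

client = ChalkClient()

result = client.query(
    input={User.id: "50"},    # primary key input, as in the CLI example
    output=[User.full_name],  # Chalk plans the first_name/last_name fetches for us
)
print(result.get_feature_value(User.full_name))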
Chalk knows where to fetch the data to hydrate this User class with SQL Resolvers.
These are ordinary SQL files that live in your project alongside your feature definitions.
-- resolves: User
-- source: postgres
select
    hid as id,
    created_at,
    first_name,
    last_name,
    email,
    birthday
from marketplace_users
Chalk automatically maps the SQL result to the User feature class and passes in any input predicates and filters as needed (e.g. user.id=50).
We annotate the SQL file with special comments to indicate which feature class it resolves and also which data source it should run against. Chalk can connect to an arbitrary number of data sources when fulfilling a request.
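The source: postgres annotation refers to a named data source configured in your Chalk project. Declared in Python, it might look like this minimal sketch (connection credentials are typically supplied through Chalk’s dashboard or environment variables rather than hard-coded):

from chalk.sql import PostgreSQLSource

# the name matches the "-- source: postgres" comment in the SQL resolver;
# credentials are resolved from the environment or Chalk's dashboard
postgres = PostgreSQLSource(name="postgres")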
Some customers end up fetching thousands of features from a heterogeneous mix of data sources in < 5ms for a single request (fraud detection tends to be very feature rich).
Let’s define another core entity, Reviews, to illustrate how we build relationships and connect feature classes.
@features
class Review:
    id: int
    created_at: datetime
    review_headline: str
    review_body: str
    star_rating: int = feature(min=1, max=5)
    normalized_rating: float
    # user_id: str
    user_id: User.id
    user: User
Instead of user_id being a simple str, we can explicitly point it to User.id, communicating to Chalk that this is a foreign-key relationship. This lets us easily establish a has-many relationship between Users and Reviews: a User can have many Reviews, and each Review belongs to a single User.
This enables us to easily compute complex features in the namespace of Review, like normalized_rating, which requires access to aggregations like average_rating_given defined in the User namespace, i.e. User.average_rating_given.
We can define complex features using Python Resolvers.
Python resolvers are ordinary Python functions that take in feature inputs and return computed feature outputs.
@online
def get_normalized_rating(
    review_rating: Review.star_rating,
    review_count_across_all_items: Review.item.total_reviews,
    average_rating_across_all_items: Review.item.average_rating,
) -> Review.normalized_rating:
    # blend this review's rating with the item's overall average rating,
    # weighted by the item's review count
    minimum_reviews: float = review_count_across_all_items / 10
    return (
        review_count_across_all_items
        / (review_count_across_all_items + minimum_reviews)
    ) * review_rating + (
        minimum_reviews / (review_count_across_all_items + minimum_reviews)
    ) * average_rating_across_all_items
We overload the type annotations of the function parameters to indicate explicitly which features we want passed into the function.
Notice how instead of typing average_rating_across_all_items as a float, we type hint it as Review.item.average_rating. The return type is also explicitly hinted as Review.normalized_rating.
Setting up the type hints this way gives us what we need to build the dependency graph of your features (which we leverage for query planning) while preserving as much developer experience as possible: we want to minimize boilerplate and the cognitive load required to express complicated feature logic.
If you manage your company’s Airflow DAGs, you know exactly how painful this can be.
In addition to referencing features across entity boundaries to compute new features, we can also query for them directly. In this case, we can easily fetch the User that wrote a particular Review:
chalk query --in review.id=241 --out user
The inverse relationship is also automatically established, enabling us to easily query (a has-many) for all the Reviews that belong to one particular User:
chalk query --in user.id=50 --out reviews
This sets the foundation for building aggregations over these has-many relationships, which we’ll model by leveraging the DataFrame type from Chalk’s SDK:
reviews: DataFrame[Review]
After setting up our reviews DataFrame, we can compute aggregations like average_rating_given and review_count by projecting columns and applying aggregation functions like sum, count, and mean.
@features(owner="ml-core@chalk.ai")
class User:
    id: str
    first_name: str
    last_name: str
    # computed after we've fetched first and last name
    full_name: str = _.first_name + " " + _.last_name
    # has-many relationship
    reviews: DataFrame[Review]
    average_rating_given: float = _.reviews.star_rating.mean()
    review_count: Windowed[int] = windowed(
        "7d",
        "30d",
        "60d",
        expression=_.reviews[_.created_at > _.chalk_window].count(),
    )
Chalk extends this functionality further by enabling windowed aggregations, which are essential for capturing temporal trends in user behavior. We can inject the chalk_window variable into our aggregation expression to dynamically adjust the time window for our aggregations.
Whenever we want to explicitly reference a specific window within a Python resolver, we can index the Windowed feature like so: User.review_count["30d"].
In our normalized_rating example from earlier, we could easily modify it to use a windowed review count instead of the all-time review count:
@online
def get_normalized_rating(
    review_rating: Review.star_rating,
    review_count_past_month: Review.item.review_count["30d"],
    average_rating_across_all_items: Review.item.average_rating,
) -> Review.normalized_rating:
    ...  # same weighting logic as before
Putting it all together, we have a powerful and flexible way to build semantic models that capture the complexity of real-world data. We can quickly express complex relationships and also compute features that span across not only multiple entities but also time windows.
Computing heavy aggregations at high QPS can place an enormous load on your underlying data infrastructure, especially since real-time use cases (by definition) necessitate fetching features directly from your raw data sources for maximum freshness.
Chalk offers robust caching and materialization strategies to help alleviate this load, while still ensuring that your features remain fresh and accurate.
Most notably, Chalk offers materialized feature views that let you cache windowed aggregations and other compute-heavy features with MaterializationWindowConfig and max_staleness.
@features
class User:
    id: str
    ...  # other features
    reviews: DataFrame[Review]
    review_count: Windowed[int] = windowed(
        "7d",
        "30d",
        "60d",
        expression=_.reviews[_.created_at > _.chalk_window].count(),
        materialization=MaterializationWindowConfig(
            bucket_duration="3d",
            backfill_schedule="0 0 * * *",
        ),
    )
More granular bucket durations can be configured per window to optimize for storage and compute costs:
bucket_durations={
    # 10-minute buckets for the 1d window
    "10m": "1d",
    # 5-day buckets for the 60d and 365d windows
    "5d": ["60d", "365d"],
}
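Slotted into the review_count definition from above, that might look like the following sketch (assuming bucket_durations can be passed in place of the single bucket_duration, with the windows adjusted to match):

review_count: Windowed[int] = windowed(
    "1d",
    "60d",
    "365d",
    expression=_.reviews[_.created_at > _.chalk_window].count(),
    materialization=MaterializationWindowConfig(
        bucket_durations={
            "10m": "1d",            # 10-minute buckets for the 1d window
            "5d": ["60d", "365d"],  # 5-day buckets for the longer windows
        },
        backfill_schedule="0 0 * * *",
    ),
)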
Traditional features can be cached with max_staleness, which is particularly useful for features that are expensive in terms of time (e.g. third-party API calls), compute, or even money (e.g. LLM-based features).
@features
class Review:
    id: int
    ...
    body: str
    sentiment_from_llm: str = feature(max_staleness="infinity")  # or "30d" for a credit score
Some features are static in nature and only ever need to be computed once (e.g. sentiment classification) while others may need to be recomputed periodically to capture shifts in market dynamics or user behavior (e.g. 30-day credit scores).
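For a feature like sentiment_from_llm, a resolver might look like the following sketch. The classify_sentiment helper is hypothetical (a stand-in for whatever LLM provider you call); Chalk’s role is simply to cache the result according to the feature’s max_staleness:

from chalk import online


def classify_sentiment(text: str) -> str:
    """Hypothetical wrapper around your LLM provider of choice (not part of Chalk)."""
    ...


@online
def get_review_sentiment(body: Review.body) -> Review.sentiment_from_llm:
    # with max_staleness="infinity", this resolver only runs on a cache miss,
    # so the expensive LLM call happens at most once per review
    return classify_sentiment(body)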
Now that we’ve covered the core data modeling concepts, let’s dive into building out a simple recommendation system.
We’ll showcase how to build out user-item affinity features, leveraging both traditional feature engineering techniques as well as LLM-powered features.
Let’s introduce a new entity, Item, to represent a product being sold in our marketplace. A typical production environment would also incorporate features from other entities, like the Seller responsible for posting the item, but for brevity we’ll focus on just User and Item pairs.
@features
class Item:
    id: str
    title: str
    genre_from_llm: str = feature(max_staleness="infinity")
    price: float
    prices: "DataFrame[ItemPrice]"
    times_purchased: int = _.prices[_.type == "purchase"].count()
    average_price: float | None = _.prices[_.value].mean()
    most_recent_price: float | None = _.prices[_.value].max_by(_.created_at)
    is_discounted: bool = _.price < _.average_price
We can model pricing dynamics with a separate ItemPrice feature class that tracks price changes over time across all sellers. We’ll leverage this to get a sense of whether the item is currently discounted, its most recent price (across all sellers), and how many times it’s been purchased (which we can easily make windowed).
Notice how we can filter the DataFrame with _.type == "purchase" before applying the aggregation function to guarantee we’re only counting actual purchase events.
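The ItemPrice class itself isn’t shown in this post; a minimal sketch consistent with the fields Item references (value, type, created_at, and a foreign key back to Item) might look like:

from datetime import datetime

from chalk.features import features


@features
class ItemPrice:
    id: str
    item_id: Item.id  # foreign key back to the Item namespace
    item: Item
    value: float      # the price at this point in time
    type: str         # e.g. "listing" or "purchase"
    created_at: datetime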
Our final piece of the puzzle is to compute user-item affinity features that capture the relationship between a particular user and item.
@features
class UserItem:
    id: str = _.user_id + "-" + _.item_id
    user_id: "User.id"
    user: "User"
    item_id: "Item.id"
    item: "Item"
    # import chalk.functions as F
    price_diff: float = F.abs(_.item.most_recent_price - _.user.average_purchase_price)
    price_diff_ratio: float = _.price_diff / _.user.average_purchase_price
    affordability_cap: float = F.sigmoid(_.price_diff_ratio)
    price_fit: float = 1 / (1 + _.price_diff_ratio**2)
Whenever we query for this user-item pair, we’ll pass in both a User.id and an Item.id and then compute pricing features that capture how well this item fits within the user’s typical purchasing behavior.
chalk query --in user_item.user_id=50 --in user_item.item_id=200
Using '--out=user_item'
Results
https://chalk.ai/environments/dvxenv/query-runs/cmgv4x37k0dly0u0ocwx0ck9d
Name Hit? Value
────────────────────────────────────────────────────────
user_item.affordability_cap 0.531067620635202
user_item.id "50-200"
user_item.item_id "200"
user_item.price_diff 3.4404545454545428
user_item.price_diff_ratio 0.12443078137072767
user_item.price_fit 0.9847530494774773
user_item.user_id "50"
This returns the computed UserItem features for user 50 and item 200, giving us insight into how affordable this item is for this user.
In a production recommendation system, we’d query in bulk for many user-item pairs and also include other features from the User and Item namespaces (in practice, many other namespaces as well) to power our ranking model.
bulk_query: BulkOnlineQueryResponse = client.query_bulk(
    input={
        UserItem.user_id: [50] * 10,
        UserItem.item_id: [201, 47, 39, ...],
    },
    output=[
        # Item details
        UserItem.item.title,
        # Joint features
        UserItem.price_diff,
        UserItem.price_diff_ratio,
        UserItem.affordability_cap,
        UserItem.price_fit,
        # Item pricing
        UserItem.item.times_purchased,
        UserItem.item.most_recent_price,
        UserItem.item.average_price,
        # Item ratings
        UserItem.item.average_rating,
        UserItem.item.total_reviews,
        # User details
        UserItem.user.id,
        UserItem.user.created_at,
    ],
)
This bulk query returns a matrix of features for each user-item pair that can be fed directly into your ranking model.
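From here, scoring is up to your ranking model. As a rough sketch (assuming the bulk response can be converted to a Pandas DataFrame, though the exact accessor may differ across SDK versions, and with ranking_model standing in for your own trained ranker):

# one row per user-item pair, one column per requested feature
features_df = bulk_query.results[0].to_pandas()  # assumption: accessor name may vary

# ranking_model is a stand-in for your own trained model (e.g. XGBoost, LightGBM)
features_df["score"] = ranking_model.predict(features_df)
ranked = features_df.sort_values("score", ascending=False)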
Chalk provides clients in various languages (Python, Go, Java, JS, etc.) to make it easy to integrate into your existing stack. This enables teams across the organization to share and reuse features without reimplementing them in multiple places, and it streamlines deploying serving infrastructure, since Chalk handles all the complexity of scaling and optimizing feature serving for you.
Features can also be computed dynamically, by passing in arbitrary expressions at query time.
This makes it easy to experiment with new features without having to redeploy any code and also enables advanced use-cases like user-defined features in your product.
result = client.query(
    input={
        User.id: 1,
    },
    output=[
        User.full_name,
        User.username,
        (
            F.jaccard_similarity(
                a=F.lower(_.full_name),
                b=F.lower(_.email),
            )
        ).alias("user.name_email_sim"),
    ],
)
We can leverage Chalk’s built-in functions to compute dynamic features on-the-fly.
Chalk offers hundreds of functions that can be composed together to build complex feature logic and that are guaranteed to run efficiently at scale.
Chalk offers a powerful and flexible way to build recommendation systems that can scale to millions of users and items.
Automatic query planning, flexible caching strategies, and support for both batch and real-time feature computation make it possible to serve thousands of features with sub-millisecond latency while maintaining data freshness.
With Chalk, teams can focus on building great models and products instead of worrying about the underlying infrastructure.
P.S. We also have extensive primitives for building out other use cases and workflows around Context Engineering, Search, Personalization, Fraud Detection, and more.
Check out our LLM Toolchain to learn about the full suite of tools we offer for building with LLMs.