Overview
How it all fits together.
Chalk’s online query serving platform is architected to get the right data from the right place at the right time.
Let’s examine how the pieces work together to compute and serve features.
Suppose you want to compute a set of features to make a decision about a request from a user.
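For instance, with hypothetical feature names for a fraud decision, the request might look like this minimal sketch using the Chalk Python client:

```python
from chalk.client import ChalkClient

# Credentials are read from the environment (CHALK_CLIENT_ID / CHALK_CLIENT_SECRET).
client = ChalkClient()

# "user.fraud_score" and "user.account_age_days" are hypothetical features;
# Chalk computes whatever subgraph of the feature DAG they require.
result = client.query(
    input={"user.id": "u_123"},
    output=["user.fraud_score", "user.account_age_days"],
)
print(result.get_feature_value("user.fraud_score"))
```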
This entire pipeline, from SQL queries and API calls to response, runs in less than 5ms, even with heterogeneous data sources and complex logic, thanks to the many latency-reduction techniques Chalk applies under the hood.
Chalk eliminates the complexity of orchestrating data and ETL pipelines by building a dependency graph (DAG) of your features, which are defined using Python. At inference time, Chalk dynamically builds query plans (subgraphs of your feature DAG) without manual configuration, based on the features you request.
Write feature definitions in Python, and Chalk automatically plans and executes the pipelines needed to compute them.
As a result, Chalk can serve as a drop-in replacement for orchestration tools like Dagster, Airflow, and Prefect while simultaneously providing purpose-built features for production ML workloads.
Declaratively defining features frees up data teams to focus on designing features instead of writing plumbing code. There’s no need to write custom glue code because Chalk interfaces directly with underlying data sources, managing all the connections and transformations behind the scenes.
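For example, here is a minimal sketch of a feature class and resolver:

```python
from chalk import online
from chalk.features import features

@features
class User:
    id: str
    email: str
    # Derived feature, computed by the resolver below.
    email_domain: str

# Chalk reads the dependency (User.email -> User.email_domain) from the type
# annotations and adds it as an edge in the feature DAG; the resolver runs
# automatically whenever a query requests User.email_domain.
@online
def get_email_domain(email: User.email) -> User.email_domain:
    return email.split("@")[1]
```

Because dependencies live in the type signatures, Chalk can assemble the exact subgraph needed for any set of requested outputs at query time.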
Note: Features can also be computed on a recurring basis with scheduled queries.
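As a sketch, assuming a cron argument on resolvers for scheduled runs (the FxRate feature here is hypothetical):

```python
from chalk import offline
from chalk.features import features

@features
class FxRate:
    id: str
    usd_rate: float

# Assumption: the cron parameter schedules this resolver to run daily, with
# the recomputed values ingested into the offline store.
@offline(cron="0 0 * * *")
def refresh_usd_rate(id: FxRate.id) -> FxRate.usd_rate:
    # Placeholder logic; a real resolver would call a rates API or database.
    return 1.0
```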
Chalk uses different storage technologies to support online and offline use cases.
The online store is optimized for serving the latest version of any given feature for any given entity with the minimum possible latency. Chalk can be configured to use Redis or Cloud Memorystore for smaller resident data sets with strict latency requirements, or DynamoDB when horizontal scalability is required.
The offline store is optimized for storing all historical feature values, serving point-in-time correct queries, and tracking provenance of features. Chalk supports a variety of storage backends depending on data scale and latency requirements. Typically, Chalk uses Snowflake, Delta Lake, BigQuery or Athena.
Chalk’s architecture also supports efficient batch point-in-time queries to construct model training sets or perform batch offline inference.
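A sketch of such a pull, assuming the offline_query and input_times parameters and hypothetical feature names:

```python
from datetime import datetime, timezone

from chalk.client import ChalkClient

client = ChalkClient()

# For each labeled example, fetch feature values as they existed at the
# label's timestamp, so no post-label information leaks into training.
label_time = datetime(2024, 6, 1, tzinfo=timezone.utc)
dataset = client.offline_query(
    input={"user.id": [1, 2, 3]},
    input_times=[label_time] * 3,
    output=["user.fraud_score", "user.account_age_days"],
)
df = dataset.get_data_as_pandas()  # materialize the training set
```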
Chalk integrates with your existing data providers (Snowflake, Delta Lake, or BigQuery) to ingest massive amounts of data from a variety of data sources and query it efficiently. Note that data ingested into the offline store can be trivially made available for use in an online querying context with Chalk’s Reverse ETL.
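For instance, a hedged sketch of the relevant feature-class options (assuming max_staleness and etl_offline_to_online):

```python
from chalk.features import features

# Assumption: max_staleness lets online queries serve a cached value up to a
# day old, and etl_offline_to_online automatically copies values ingested
# into the offline store up into the online store.
@features(max_staleness="1d", etl_offline_to_online=True)
class Merchant:
    id: str
    category: str
    chargeback_rate: float
```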
There’s an exhaustive list of supported ingestion sources in the Integrations section.
Under the hood, Chalk uses Velox, an open-source unified execution engine, to deliver high-throughput feature computation. We maintain a fork that’s been optimized for low-latency online inference.
You can think of Velox as a backend for query engines like Presto (which powers AWS Athena) and Spark; that is, you can’t point Velox at a database and hand it a SQL expression. Rather than forcing users to work directly with low-level execution primitives, Chalk provides an ergonomic interface, the Chalk Python SDK, for defining features, transformations, and pipelines.
This architecture allows us to expose the power of vectorized computation through clean APIs that feel natural to data scientists and engineers (like writing Pandas or Polars code). Users write simple Python decorators and SQL queries, while Velox handles the complex optimizations that make these computations blazingly fast.
We offer both a hosted model (“Chalk Cloud”) and a customer-hosted model (“Customer Cloud”).
Most companies choose to run Chalk in their own cloud (VPC) for data residency and compliance. Chalk is deployed with Terraform (sample config) and uses common cloud primitives, making it easy to integrate deployments with your existing infrastructure.
Compute nodes run on Kubernetes (typically EKS on AWS and GKE on GCP). If you have custom needs, we are happy to customize the deployment to fit with your service architecture.
The Management Plane is responsible for storing and serving non-customer data (like alert and RBAC configurations). Workloads in the Data Plane are managed through the Kubernetes API, enabling tasks such as scaling deployments and running batch jobs.
Most customers choose our Customer Cloud deployment; however, we also offer a self-hosted option for highly regulated environments (like FedRAMP) that require additional security controls.
In short, the Management Plane handles configuration and orchestration metadata, such as alert rules and RBAC settings. It does not have access to customer data.
The Data Plane encompasses the execution environment for feature pipelines along with the storage and serving infrastructure for both online and offline feature stores.
A single Data Plane can run many Chalk Environments.
Often, companies will have 2-3 environments (like qa, stage, and prod).
If running in a single data plane, these environments share resources, which helps with cost and ease of setup.
However, if you prefer to have stronger isolation between Chalk Environments, each Chalk Environment can run in a separate Data Plane. You would typically run only one Management Plane to orchestrate all Data Planes, and deploy the Management Plane to the most sensitive of the environments.
Native integrations with PagerDuty and Slack ensure teams are immediately alerted to any issues in their feature pipelines.
Beyond alerting, every Chalk query is fully instrumented with traces and detailed logs, enabling both broad system-wide monitoring and deep request-level debugging across every stage of computation, down to the root data source. With Chalk, data teams get end-to-end observability into their feature pipelines.
Build your own views, set up custom dashboards to visualize your metrics, and configure smart alerts with custom formulas that notify you instantly when thresholds are crossed or anomalies are detected.
This flexibility to configure and define your own metrics makes it easy to answer common questions such as how often certain features are computed, how long individual computations take, and what the average value for a feature is.
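As one sketch, assuming a charts-in-code interface along the lines of chalk.monitoring's Chart and Series (the names and signatures here are our assumption, not a confirmed API):

```python
from chalk.features import features
from chalk.monitoring import Chart, Series

@features
class User:
    id: str
    fraud_score: float

# Assumed API: raise an alert when more than 2% of fraud_score computations
# return null over any 5-minute window.
Chart(name="fraud_score null ratio", window_period="5m").with_trigger(
    Series.feature_null_ratio_metric().where(feature=User.fraud_score) > 0.02
)
```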
By both connecting to your data stores directly and computing features post-fetch, Chalk makes it trivial to integrate new data sources from other teams, dramatically increasing predictive accuracy and the context available to your models.
Your systems can also integrate bidirectionally with Chalk’s underlying infrastructure, which is built on widely-adopted technologies like Redis, DynamoDB, and Postgres and leverages open standards like Arrow, Parquet, and Iceberg, maximizing compatibility and unlocking downstream analytical workflows.
Together, these architectural choices enable enterprises to build future-proof ML and AI systems that scale with their needs, maintain interoperability, and integrate seamlessly with their existing technology stack.