Overview
How it all fits together.
Chalk’s online query serving platform is architected to get the right data from the right place at the right time.
Let’s examine how the pieces work together to compute and serve features.
Suppose that you want to compute a set of features for making a decision about a request from a user. Chalk fetches the underlying data with SQL queries and API calls, runs your feature transformations, and assembles the results into a response.

This entire pipeline, from SQL queries and API calls to response, runs in less than 5ms, even with heterogeneous data sources and complex logic. Chalk uses many techniques to reduce latency.
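As a concrete illustration, here is a minimal sketch of issuing an online query with Chalk’s Python client (the feature names like user.fraud_score are hypothetical):

```python
from chalk.client import ChalkClient

# Credentials are read from the environment or local Chalk config.
client = ChalkClient()

# Chalk plans the subgraph of the feature DAG needed for these outputs,
# fetches source data, runs resolvers, and returns the features.
result = client.query(
    input={"user.id": 1234},
    output=["user.fraud_score", "user.account_age_days"],
)

print(result.get_feature_value("user.fraud_score"))
```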
Chalk eliminates the complexity of orchestrating data and ETL pipelines by building a dependency graph (DAG) of your features, which are defined using Python. At inference time, Chalk dynamically builds query plans (subgraphs of your feature DAG) without manual configuration, based on the features you request.
Write feature definitions in Python, and Chalk automatically plans and orchestrates the computation needed to serve them.
As a result, Chalk can serve as a drop-in replacement for orchestration tools like Dagster, Airflow, and Prefect while simultaneously providing purpose-built features for production ML workloads.

Declaratively defining features frees up data teams to focus on designing features instead of writing plumbing code. There’s no need to write custom glue code because Chalk interfaces directly with underlying data sources, managing all the connections and transformations behind the scenes.
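For example, feature definitions and resolvers are plain Python; a minimal sketch (the User entity and resolver logic here are hypothetical):

```python
from chalk import online
from chalk.features import features


@features
class User:
    id: int
    email: str
    # Derived feature: computed by the resolver below.
    email_domain: str


# Chalk reads the type signature to place this resolver in the feature
# DAG: it depends on User.email and produces User.email_domain.
@online
def get_email_domain(email: User.email) -> User.email_domain:
    return email.split("@")[-1]
```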
Note: Features can also be computed on a recurring basis with scheduled queries.
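For instance, a resolver can be given a schedule; a sketch, assuming the cron argument shown here (see the scheduled queries docs for the exact interface):

```python
from chalk import offline
from chalk.features import features


@features
class Merchant:
    id: int
    risk_score: float


# Recompute merchant risk scores once a day rather than at query time.
# The "1d" schedule and scoring logic here are hypothetical.
@offline(cron="1d")
def nightly_risk_score(id: Merchant.id) -> Merchant.risk_score:
    return 0.0
```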
Chalk uses different storage technologies to support online and offline use cases.
The online store is optimized for serving the latest version of any given feature for any given entity with the minimum possible latency. Chalk can be configured to use Redis or Cloud Memorystore for smaller resident data sets with strict latency requirements, or DynamoDB when horizontal scalability is required.
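Caching behavior is controlled per feature; for instance, a max_staleness setting tells Chalk when a value in the online store is fresh enough to serve (a sketch with a hypothetical feature):

```python
from chalk.features import feature, features


@features
class User:
    id: int
    # Serve a cached value from the online store if it is fresher
    # than 30 minutes; otherwise recompute it at query time.
    credit_score: float = feature(max_staleness="30m")
```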
The offline store is optimized for storing all historical feature values, serving point-in-time correct queries, and tracking provenance of features. Chalk supports a variety of storage backends depending on data scale and latency requirements. Typically, Chalk uses Snowflake, Delta Lake, BigQuery, or Athena.
Chalk’s architecture also supports efficient batch point-in-time queries to construct model training sets or perform batch offline inference.
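A sketch of what a batch point-in-time query looks like with the Python client (the feature names and Dataset accessor here are assumptions):

```python
from datetime import datetime, timezone

from chalk.client import ChalkClient

client = ChalkClient()

# For each (user, timestamp) pair, Chalk returns the feature values as
# they would have been known at that moment, avoiding label leakage.
dataset = client.offline_query(
    input={"user.id": [1, 2, 3]},
    input_times=[
        datetime(2024, 1, 1, tzinfo=timezone.utc),
        datetime(2024, 2, 1, tzinfo=timezone.utc),
        datetime(2024, 3, 1, tzinfo=timezone.utc),
    ],
    output=["user.fraud_score", "user.account_age_days"],
)

df = dataset.get_data_as_pandas()
```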
Chalk integrates with your existing data providers (Snowflake, Delta Lake, or BigQuery) to ingest massive amounts of data from a variety of data sources and query it efficiently. Note that data ingested into the offline store can be trivially made available for use in an online querying context with Chalk’s Reverse ETL.
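For example, reverse ETL can be enabled on a per-feature basis (a sketch with a hypothetical feature):

```python
from chalk.features import feature, features


@features
class User:
    id: int
    # Values ingested into the offline store are replicated to the
    # online store, so batch-computed features can be served online.
    lifetime_value: float = feature(etl_offline_to_online=True)
```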
There’s an exhaustive list of supported ingestion sources in the Integrations section.
Under the hood, Chalk uses Velox, an open-source unified execution engine, to deliver high-throughput feature computation. We maintain a fork that’s been optimized for low-latency online inference.
You can think of Velox as an execution backend for query engines like Presto (which powers AWS Athena) and Spark; you can’t point Velox at a database and pass in a SQL expression. Rather than forcing users to work directly with low-level execution primitives, Chalk provides an ergonomic interface (the Chalk Python SDK) for defining features, transformations, and pipelines.

This architecture allows us to expose the power of vectorized computation through clean APIs that feel natural to data scientists and engineers, much like writing Pandas or Polars code. Users write simple Python decorators and SQL queries, while Velox handles the complex optimizations that make these computations blazingly fast.
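For example, an aggregation over a has-many relationship reads like DataFrame code while compiling down to a vectorized plan (a sketch; the entities and join here are hypothetical):

```python
from chalk import online
from chalk.features import DataFrame, features, has_many


@features
class Transaction:
    id: int
    user_id: int
    amount: float


@features
class User:
    id: int
    total_spend: float
    transactions: DataFrame[Transaction] = has_many(
        lambda: Transaction.user_id == User.id
    )


# Reads like Pandas or Polars, but executes as a vectorized plan.
@online
def get_total_spend(txns: User.transactions) -> User.total_spend:
    return txns[Transaction.amount].sum()
```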
We offer both a hosted model (“Chalk Cloud”) and a customer-hosted model (“Customer Cloud”).
Most companies choose to run Chalk in their own cloud (VPC) for data residency and compliance. Chalk is deployed with Terraform and uses common cloud primitives, making it easy to integrate deployments with your existing infrastructure.
Compute nodes run on Kubernetes (typically EKS on AWS, GKE on GCP, or AKS on Azure). If you have custom needs, we are happy to customize the deployment to fit your service architecture.
The Metadata Plane is responsible for storing and serving non-customer data (like alert and RBAC configurations). It can control many Data Planes, which it manages through the Kubernetes API, enabling tasks such as scaling deployments and running batch jobs.
In short, the Metadata Plane handles configuration, orchestration, and management of your Data Planes. It does not have access to customer data.
The Data Plane encompasses the execution environment for feature pipelines along with the storage and serving infrastructure for both online and offline feature stores.
A single Data Plane can run many Chalk Environments.
Often, companies will have 2-3 environments (like qa, stage, and prod).
If running in a single Data Plane, these environments share resources, which helps with cost and ease of setup.
However, if you prefer to have stronger isolation between Chalk Environments, each Chalk Environment can run in a separate Data Plane. You would typically run only one Metadata Plane to orchestrate all Data Planes, and deploy the Metadata Plane to the most sensitive of the environments.
Chalk offers several deployment options to provide you with the right level of infrastructure control.
In the Chalk-Hosted Deployment, both the Metadata Plane and Data Plane run in Chalk’s cloud account. Deployed in this manner, Chalk runs as a SaaS application. There is no infrastructure to manage, and no ability to see inside the cloud account running Chalk.
Most customers choose our Customer Cloud Deployment. In this model, the customer runs the Data Plane in its own cloud account, and Chalk runs the Metadata Plane in its cloud account.
This deployment model strikes a good balance between security and ease of maintenance. No one at Chalk will be able to access your data, but the Chalk team can handle upgrades to the underlying resources without your team’s involvement.
Chalk offers the option to self-host both the Data Plane and Metadata Plane. In the Customer Cloud Deployment, only the Data Plane runs in your cloud account, whereas in the Air-Gapped Deployment, the Metadata Plane joins the Data Plane in your cloud account.
There are two primary reasons customers choose to host the Metadata Plane themselves: compliance requirements that rule out any external service communicating with their environment, and resilience to outages outside their control.
In this configuration, no service hosted by Chalk needs to talk to your instance. Telemetry can be exported for billing purposes (over topics), but it is not essential to the uptime of your instance. In the event of a complete outage in Chalk’s cloud accounts, your instance would continue running indefinitely without disruption.
Native integrations with PagerDuty and Slack ensure teams are immediately alerted to any issues in their feature pipelines.
Beyond alerting, every Chalk query is fully instrumented with traces and detailed logs, enabling both broad system-wide monitoring and deep request-level debugging across every stage of computation, down to the root data source. With Chalk, data teams get end-to-end observability out of the box.
Easily build your own views, set up custom dashboards to visualize your metrics, and configure smart alerts with custom formulas that notify you instantly when thresholds are crossed or anomalies are detected.

This flexibility to configure and define your own metrics makes it easy to answer common questions such as how often certain features are computed, how long individual computations take, and what the average value for a feature is.
By both connecting to your data stores directly and computing features post-fetch, Chalk makes it trivial to integrate new data sources from other teams, dramatically increasing predictive accuracy and the context available to your models.
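For example, adding a new SQL source is a matter of declaring it and pointing a resolver at it; a sketch assuming a Postgres integration (the table, query-builder calls, and features here are hypothetical):

```python
from datetime import datetime

from chalk import online
from chalk.features import Features, features
from chalk.sql import PostgreSQLSource

# Connection details come from the Postgres integration configured
# in Chalk; no glue code or connection management is needed here.
pg = PostgreSQLSource()


@features
class User:
    id: int
    signup_ts: datetime


# Chalk pushes this query down to Postgres and maps the result row
# onto the declared feature.
@online
def get_signup_ts(id: User.id) -> Features[User.signup_ts]:
    return pg.query_string(
        "SELECT signup_ts FROM users WHERE id = :user_id",
        args={"user_id": id},
    ).first()
```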
Your systems can also bidirectionally integrate with Chalk’s underlying infrastructure, which is built on widely adopted technologies like Redis, DynamoDB, and Postgres and leverages open standards like Arrow, Parquet, and Iceberg, ultimately maximizing compatibility and unlocking downstream analytical workflows.
Together, these architectural choices enable enterprises to build future-proof ML and AI systems that scale with their needs, maintain interoperability, and seamlessly integrate with their existing technology stack.