Platform Architecture

Chalk’s online query serving platform is architected to

  • retrieve fresh features across heterogeneous sources with the minimum possible latency
  • orchestrate and execute complex transformations on structured and unstructured data
  • uphold enterprise-grade guarantees for security, observability, and reliability

In short, Chalk was built to get the right data from the right place at the right time.


Online serving architecture

Let’s examine how the pieces work together to compute and serve features.

Suppose that you want to compute a set of features for making a decision about a request from a user:

  1. Your application sends an HTTP request for features to Chalk’s serving API.
  2. Chalk’s query planner generates an optimized execution plan by analyzing feature dependencies and available data sources.
  3. Chalk’s compute engine
    • retrieves fresh values from underlying sources with Resolvers
    • pulls from Chalk’s low-latency online storage (e.g. tiled/materialized or cached features)
    • executes the generated query plan
  4. Chalk returns the computed features in its response.
  5. Newly computed feature values are logged for reuse and auditability.
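For example, requesting these features from an application might look like the following minimal sketch using Chalk’s Python SDK (the feature names are illustrative placeholders):

  from chalk.client import ChalkClient

  # Credentials are read from the environment by default.
  client = ChalkClient()

  # Request features for one entity; Chalk plans the query, mixes
  # cached online-store values with freshly resolved ones, and
  # returns the computed features in the response.
  result = client.query(
      input={"user.id": 1234},
      output=["user.name", "user.fraud_score"],
  )
  print(result.get_feature_value("user.fraud_score"))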

Chalk architecture diagram

This entire pipeline—from SQL queries and API calls to response—runs in less than 5ms, even with heterogeneous data sources and complex logic. Chalk uses many techniques to reduce latency, such as:

  • Automatic parallelism across pipeline stages
  • Vectorization of pipeline stages that are written using scalar syntax
  • Statistics-informed join planning
  • Low-latency key/value stores (like Redis, BigTable, or DynamoDB)
  • Transpilation of Python code into native code with Chalk’s symbolic Python interpreter

Data orchestration

Chalk eliminates the complexity of orchestrating data and ETL pipelines by building a dependency graph (DAG) of your features, which are defined using Python. At inference time, Chalk dynamically builds query plans (subgraphs of your feature DAG) without manual configuration, based on the features you request.

Write feature definitions in Python, and Chalk automatically handles dependency resolution, query planning, and execution.
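As a minimal sketch (the User class and resolver here are illustrative, not prescriptive), a feature definition and a resolver look like this:

  from chalk import online
  from chalk.features import features

  @features
  class User:
      id: int
      email: str
      email_domain: str

  # From the signature, Chalk infers that email_domain depends on
  # email and adds that edge to the feature DAG.
  @online
  def get_email_domain(email: User.email) -> User.email_domain:
      return email.split("@")[1]

At query time, requesting user.email_domain is enough for the planner to fetch user.email from its source and run this resolver; no pipeline wiring is declared anywhere.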

As a result, Chalk can serve as a drop-in replacement for orchestration tools like Dagster, Airflow, and Prefect while simultaneously providing purpose-built features for production ML workloads.

Chalk query plan

Declaratively defining features frees up data teams to focus on designing features instead of writing plumbing code. There’s no need to write custom glue code because Chalk interfaces directly with underlying data sources, managing all the connections and transformations behind the scenes.

Note: Features can also be computed on a recurring basis with scheduled queries.
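One way to do this is a cron resolver, sketched below reusing the illustrative User class from the sketch above (the exact schedule syntax accepted is an assumption):

  from chalk import online

  # Recompute email_domain every hour, independent of query traffic
  # (a cron resolver; exact schedule syntax is an assumption here).
  @online(cron="0 * * * *")
  def refresh_email_domain(email: User.email) -> User.email_domain:
      return email.split("@")[1]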

Data persistence and storage

Chalk uses different storage technologies to support online and offline use cases.

The online store is optimized for serving the latest version of any given feature for any given entity with the minimum possible latency. Chalk can be configured to use Redis or Cloud Memory Store for smaller resident data sets with strict latency requirements, or DynamoDB when horizontal scalability is required.
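For example, a feature can be served from the online store by declaring how stale a cached value may be (a sketch with an illustrative feature and duration):

  from chalk.features import feature, features

  @features
  class User:
      id: int
      # Serve cached values from the online store for up to one day
      # before recomputing from the underlying source.
      fraud_score: float = feature(max_staleness="1d")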

The offline store is optimized for storing all historical feature values, serving point-in-time correct queries, and tracking provenance of features. Chalk supports a variety of storage backends depending on data scale and latency requirements. Typically, Chalk uses Snowflake, Delta Lake, BigQuery, or Athena.


Offline computation and serving

Chalk’s architecture also supports efficient batch point-in-time queries to construct model training sets or perform batch offline inference.

  1. Submit a training data request from a notebook client like Jupyter using Chalk’s Python SDK.
  2. The query planner builds a point-in-time correct plan.
  3. The compute engine pulls point-in-time correct feature values from offline storage.
  4. Chalk returns a DataFrame of features to you.
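In the SDK, such a request might look like this minimal sketch (feature names and timestamps are illustrative, and the method for materializing the DataFrame may differ):

  from datetime import datetime, timezone
  from chalk.client import ChalkClient

  client = ChalkClient()

  # Ask for feature values as of each observation time; Chalk
  # answers from the offline store, point-in-time correct.
  dataset = client.offline_query(
      input={"user.id": [1, 2, 3]},
      input_times=[datetime(2024, 1, 1, tzinfo=timezone.utc)] * 3,
      output=["user.email_domain", "user.fraud_score"],
  )
  df = dataset.get_data_as_pandas()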

Chalk integrates with your existing data providers (Snowflake, Delta Lake, or BigQuery) to ingest massive amounts of data from a variety of data sources and query it efficiently. Note that data ingested into the offline store can be trivially made available for use in an online querying context with Chalk’s Reverse ETL.
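As a sketch of that Reverse ETL path, a feature can opt its offline values into online serving (the feature name is illustrative):

  from chalk.features import feature, features

  @features
  class User:
      id: int
      # Values landing in the offline store are copied into the
      # online store, so low-latency queries can serve them directly.
      lifetime_value: float = feature(etl_offline_to_online=True)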

There’s an exhaustive list of supported ingestion sources in the Integrations section.


High-performance execution engine

Under the hood, Chalk uses Velox, an open-source unified execution engine, to deliver high-throughput feature computation. We maintain a fork that’s been optimized for low-latency online inference.

You can think of Velox as a backend for query engines like Presto (which powers AWS Athena) and Spark; you can’t point Velox at a database and pass in a SQL expression. Rather than forcing users to work directly with low-level execution primitives, Chalk provides an ergonomic interface (the Chalk Python SDK) for defining features, transformations, and pipelines.

This architecture allows us to expose the power of vectorized computation through clean APIs that feel natural (like writing Pandas or Polars) to data scientists and engineers. Users write simple Python decorators and SQL queries, while Velox handles the complex optimizations that make these computations blazingly fast (a sketch follows the list below).

  • Velox’s columnar memory layout and vectorized expression evaluation deliver significant performance improvements, with benchmarks showing 6-7x speedups on real analytical workloads
  • Native support for structs, maps, arrays, and nested data types makes it ideal for feature engineering workflows
  • Features like filter reordering, dynamic filter pushdown, and adaptive column prefetching optimize query execution based on runtime statistics
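As a sketch of how this surfaces in the SDK (the classes are illustrative, and the underscore expression syntax may vary in detail):

  from chalk import _
  from chalk.features import DataFrame, features, has_many

  @features
  class Transaction:
      id: int
      user_id: "User.id"
      amount: float

  @features
  class User:
      id: int
      transactions: DataFrame[Transaction] = has_many(
          lambda: Transaction.user_id == User.id
      )
      # Declared as a column expression, so the engine can evaluate
      # it vectorized over the joined rows rather than row by row.
      total_spend: float = _.transactions[_.amount].sum()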

Service architecture

We offer both a hosted model (“Chalk Cloud”) and a customer-hosted model (“Customer Cloud”).

Most companies choose to run Chalk in their own cloud (VPC) for data residency and compliance. Chalk is deployed with Terraform (sample config) and uses common cloud primitives, making it easy to integrate deployments with your existing infrastructure.

Compute nodes run on Kubernetes (typically EKS on AWS and GKE on GCP). If you have custom needs, we are happy to customize the deployment to fit your service architecture.

Architecture diagram
  1. Creating secrets: The API server can be configured to have write-only access to the secret store.
  2. Reading secrets: Secret access can be restricted entirely to the data plane.
  3. Online store: Chalk supports several online feature stores, which are used for caching feature values. On AWS, Chalk supports DynamoDB, ElastiCache, and PostgreSQL.

Management Plane

The Management Plane is responsible for storing and serving non-customer data (like alert and RBAC configurations). Workloads in the Data Plane are managed through the Kubernetes API, enabling tasks such as scaling deployments and running batch jobs.

Most customers choose our Customer Cloud deployment; however, we also offer a self-hosted option for highly regulated environments (like FedRAMP) that require additional security controls.

In short, the Management Plane handles

  • Deployment orchestration
  • Access control
  • Monitoring and alerting

It does not have access to customer data.

Data Plane

The Data Plane encompasses the execution environment for feature pipelines along with the storage and serving infrastructure for both online and offline feature stores.

A single Data Plane can run many Chalk Environments. Often, companies will have 2-3 environments (like qa, stage, and prod). If running in a single Data Plane, these environments share resources, which helps with cost and ease of setup.

However, if you prefer to have stronger isolation between Chalk Environments, each Chalk Environment can run in a separate Data Plane. You would typically run only one Management Plane to orchestrate all Data Planes, and deploy the Management Plane to the most sensitive of the environments.


Monitoring

Native integrations with PagerDuty and Slack ensure teams are immediately alerted to any issues in their feature pipelines.

Beyond alerting, every Chalk query is fully instrumented with traces and detailed logs, enabling both broad system-wide monitoring and deep request-level debugging across every stage of computation—down to the root data source. With Chalk, data teams get

  • Comprehensive logging & debugging
    • Resolver execution details
    • Feature computations
    • Cache misses
    • Data source failures
    • Feature staleness
    • Feature ownership tracking
  • Metrics and performance monitoring
    • Features: request volumes, computation times, and error rates
    • Models: inference latency, prediction accuracy, and resource utilization
    • System: query throughput, system latency, and pipeline health
  • End-to-end request tracing
    • Track the inputs and outputs of every step in any run
    • Lineage tracking from data source to final predictions

Easily build your own views, set up custom dashboards to visualize your metrics, and configure smart alerts with custom formulas that notify you instantly when thresholds are crossed or anomalies are detected.

Metrics and chart creation

This flexibility to configure and define your own metrics makes it easy to answer common questions such as how often certain features are computed, how long individual computations take, and what the average value for a feature is.


Architecting maintainable and scalable systems for enterprise success

By both connecting to your data stores directly and computing features post-fetch, Chalk makes it trivial to integrate new data sources from other teams, dramatically increasing predictive accuracy and the context available to your models.

Your systems can also integrate bidirectionally with Chalk’s underlying infrastructure, which is built on widely adopted technologies like Redis, DynamoDB, and Postgres and leverages open standards like Arrow, Parquet, and Iceberg, ultimately maximizing compatibility and unlocking downstream analytical workflows.

Together, these architectural choices enable enterprises to build future-proof ML and AI systems that scale with their needs, maintain interoperability, and integrate seamlessly with their existing technology stack.