Chalk home page
Docs
SDK
CLI
  1. Getting Started
  2. File Structure

Overview

Your Chalk project’s configuration is shared across the following files:

  1. chalk.yaml (or chalk.yml): Configuration for your project’s deployment
  2. .chalkignore: Files to exclude from your project’s deployment
  3. A requirements file: Default Python requirements for Chalk to install via pip (can be overridden in chalk.yaml). This can either be requirements.txt or a pyproject.toml (via Poetry or uv).

Here’s our recommended repository structure:

company_chalk/
├── src/
│  ├── resolvers/
│  │  ├── ...
│  │  ├── __init__.py
│  │  └── pipelines.py
│  ├── __init__.py
│  ├── datasources.py
│  └── feature_sets.py
├── tests/
│  └── ...
├── notebooks/
│  └── ...
├── .chalkignore
├── chalk.yaml
├── README.md
└── requirements.txt

When you’re first getting started, we recommend putting all your features in a single file. Keeping the features in a single file makes circular references easier to reason about, as they can just be quoted.

If you do want to split your features across multiple files, you’ll need to use the if TYPE_CHECKING block from the typing module`.

Organizing Feature Definitions Across Files

In large projects, it’s common to split feature definitions across multiple Python modules. For unidirectional dependencies, this is straightforward. For example, if src/models/user.py imports src/models/profile.py, you can define the User and Profile features in separate files without issues. However, if you have circular dependencies, you may run into problems.

Chalk supports this, but circular imports can arise when features reference each other across files. To avoid these issues, use the if TYPE_CHECKING block from the typing module and quote your forward references.

Here’s an example of how to do this cleanly:

src/models/profile.py
# Imports `User` directly, because `src/models/user.py`
# doesn't import `src/models/profile.py`
from src.models.user import User

@features
class Profile:
    id: Primary[User.id]
    username: str
src/models/user.py
from chalk.features import features
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Imports `Profile` only when type checking
    # to avoid circular imports
    from src.models.profile import Profile

@features
class User:
    id: str
    # Profile must be quoted because it is imported
    # only when type checking
    profile: "Profile"

By quoting imports inside if TYPE_CHECKING, you avoid circular dependency errors while still maintaining type safety and feature linkage.

If the relationship to Profile is optional, you can use typing.Optional or the | None syntax, but the entire annotation should be quoted:

src/models/user.py
from chalk.features import features
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from src.models.profile import Profile

@features
class User:
    id: str
    # All of `"Profile | None"` must be quoted, not just the `Profile` part
    profile: "Profile | None"

A similar pattern should be used for DataFrame annotations:

src/models/user.py
from chalk.features import features
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from src.models.profile import Profile

@features
class User:
    id: str
    # All of `"DataFrame[Transaction]"` must be quoted, not just the `Transaction` part
    transactions: "DataFrame[Transaction]"

In a project with features living in different files, we recommend that the schema definition all live in a single folder, separate from the resolvers. In this case, your folder structure will look something like:

company_chalk/
├── src/
│  ├── models/
│  │  ├── user.py
│  │  ├── profile.py
│  │  ├── __init__.py
│  │  └── ...
│  ├── resolvers/
│  │  ├── ...
│  │  ├── __init__.py
│  │  └── pipelines.py
│  ├── __init__.py
│  ├── datasources.py
│  └── feature_sets.py
├── tests/
│  └── ...
├── notebooks/
│  └── ...
├── .chalkignore
├── chalk.yaml
├── README.md
└── requirements.txt

Keeping the schema definition all in one place (as you might with something like Protobuf or Avro files) helps you keep your schema definitions organized and makes importing them in your resolvers straightforward.


chalk.yaml

Use chalk.yaml to configure your project’s Docker environment, Python configuration, and metadata validation for your features and resolvers.

Schema

chalk.yaml
projectstr
The name of this Chalk project, which must match your project's name in the dashboard (case-sensitive).
environmentsdictionary
Per-environment overrides. See our Docker documentation for more details.
[environment_name]dictionary
The name of this environment, such as prod or qa. Find available environment names in your dashboard. Use default to configure default values to apply to all of this project's environments.
runtime"python310" | "python311"
The Python version for the Chalk project.
requirementsstr
Python requirements file to install with pip. This can either be a `requirements.txt` file or a `pyproject.toml` file.
dockerfilestr
Path to an alternative Dockerfile that will be used to build a base image for your Chalk deploys.
platform_versionstr
Reference a version of the Chalk runtime that you want to use for this environment. Can be either a version tag, like `v3.0.0`, or a release stream like `stable`. If not specified, defaults to `stable`. Releases are documented in the Platform Release Notes.
validationdictionary
Configuration for metadata validation for features and resolvers. See our validation documentation for more details.
featuredictionary
Validations for all features.
metadatalist
A list of dictionaries of metadata attributes and error severity levels when this metadata property is missing.
name"description" | "owner" | "tags"
The name of the metadata property.
missing"info" | "warning" | "error"
The level of severity to use when a feature is missing this metadata property. Deploys are permitted forinfo and warning levels, but disallowed for `error` level.
resolverdictionary
Validations for all resolvers.
metadatalist
A list of dictionaries of metadata attributes and error severity levels when this metadata property is missing.
name"description" | "owner" | "tags"
The name of the metadata property.
missing"info" | "warning" | "error"
The level of severity to use when a feature is missing this metadata property. Deploys are permitted forinfo and warning levels, but disallowed for `error` level.

Sample file

Here is a sample chalk.yaml file. In this file, we use a different Dockerfile in production.

project: my-project-id

environments:
  default:
    runtime: python311
    requirements: requirements.txt
  prod:
    dockerfile: ./DockerfileProd

validation:
  feature:
    metadata:
      - name: owner
        missing: error
      - name: description
        missing: warning
      - name: tags
        missing: info
  resolver:
    metadata:
      - name: owner
        missing: error

.chalkignore

Your .chalkignore file should include your scripts, notebooks, and tests. Anything that you are not actively using in your deployment should be added so that non-deployment code does not clutter or interfere with your deployment.


Specifying Python Requirements

Chalk will use install the Python requirements for your project as specified in the requirements parameter of your chalk.yaml file. This can either be a requirements.txt file or a pyproject.toml. Chalk supports both Poetry and uv for managing your Python dependencies. You can specify this file’s location and type in your chalk.yaml file like below:

project: my-project-id

environments:
  dev:
    runtime: python311
    requirements: requirements-dev.txt
  prod:
    requirements: pyproject.yaml