Getting Started
Configure your Chalk project.
Your Chalk project’s configuration is shared across the following files:
- chalk.yaml (or chalk.yml): Configuration for your project’s deployment.
- .chalkignore: Files to exclude from your project’s deployment.
- requirements.txt: Python dependencies to install with pip (can be overridden in chalk.yaml). This can be either requirements.txt or a pyproject.toml (via Poetry or uv).
Here’s our recommended repository structure:
company_chalk/
├── src/
│ ├── resolvers/
│ │ ├── ...
│ │ ├── __init__.py
│ │ └── pipelines.py
│ ├── __init__.py
│ ├── datasources.py
│ └── feature_sets.py
├── tests/
│ └── ...
├── notebooks/
│ └── ...
├── .chalkignore
├── chalk.yaml
├── README.md
└── requirements.txt
When you’re first getting started, we recommend putting all your features in a single file. Keeping the features in a single file makes circular references easier to reason about, as they can just be quoted.
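As a minimal sketch of why a single file keeps circular references simple, the example below uses plain Python classes as stand-ins for Chalk feature classes (the `@features` decorator is left out so the snippet runs without Chalk installed; that substitution is an assumption made for illustration). A name defined later in the file is quoted, and a name that already exists is used directly.

```python
from typing import get_type_hints


class User:
    id: str
    # `Profile` is defined further down the file, so the name is quoted.
    profile: "Profile"


class Profile:
    id: str
    # `User` already exists at this point, so no quotes are needed.
    user: User


# Once both classes exist, the quoted name can be resolved to the real class.
# The namespace is passed explicitly so the sketch does not depend on how it is run.
hints = get_type_hints(User, globalns={"Profile": Profile})
print(hints["profile"] is Profile)
```

Because both classes live in the same module, no import is involved at all, so there is nothing that can become circular.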
If you do want to split your features across multiple files, you’ll need to import TYPE_CHECKING from the typing module and place any circular imports inside an if TYPE_CHECKING block.
In large projects, it’s common to split feature definitions across multiple Python modules.
For unidirectional dependencies, this is straightforward. For example, if src/models/user.py imports src/models/profile.py, you can define the User and Profile features in separate files without issues.
Chalk also supports features that reference each other across files, but those references produce circular imports if both modules import each other directly.
To avoid these issues, place one side of the import inside an if TYPE_CHECKING block (TYPE_CHECKING comes from the typing module) and quote your forward references.
Here’s an example of how to do this cleanly:
# src/models/profile.py
from chalk.features import Primary, features

# Imports `User` directly, because `src/models/user.py`
# doesn't import `src/models/profile.py` at runtime
from src.models.user import User


@features
class Profile:
    id: Primary[User.id]
    username: str


# src/models/user.py
from typing import TYPE_CHECKING

from chalk.features import features

if TYPE_CHECKING:
    # Imports `Profile` only when type checking
    # to avoid circular imports
    from src.models.profile import Profile


@features
class User:
    id: str
    # Profile must be quoted because it is imported
    # only when type checking
    profile: "Profile"
By importing Profile only inside the if TYPE_CHECKING block and quoting the annotation that uses it, you avoid circular dependency errors while still maintaining type safety and feature linkage.
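To see why this pattern avoids the cycle, the sketch below writes two throwaway modules to a temporary directory and imports them. The module names `demo_user` and `demo_profile` are hypothetical, and plain classes stand in for `@features` classes so the snippet runs without Chalk. `typing.TYPE_CHECKING` is `False` at runtime, so the guarded import is never executed.

```python
import importlib
import sys
import tempfile
from pathlib import Path

USER_SRC = '''
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Skipped at runtime, so demo_profile is not imported from here.
    from demo_profile import Profile

class User:
    id: str
    profile: "Profile"
'''

PROFILE_SRC = '''
from demo_user import User

class Profile:
    id: str
    user: User
'''

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "demo_user.py").write_text(USER_SRC)
    Path(tmp, "demo_profile.py").write_text(PROFILE_SRC)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        demo_user = importlib.import_module("demo_user")
        # Importing demo_user alone must not pull in demo_profile.
        imported_early = "demo_profile" in sys.modules
        demo_profile = importlib.import_module("demo_profile")
    finally:
        sys.path.remove(tmp)

print(imported_early)
print(demo_user.User.__annotations__["profile"])
```

Both imports succeed because only `demo_profile` imports the other module at runtime; the quoted `"Profile"` annotation stays a string until something resolves it.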
If the relationship to Profile is optional, you can use typing.Optional or the | None syntax, but the entire annotation should be quoted:
from typing import TYPE_CHECKING

from chalk.features import features

if TYPE_CHECKING:
    from src.models.profile import Profile


@features
class User:
    id: str
    # All of `"Profile | None"` must be quoted, not just the `Profile` part
    profile: "Profile | None"
A similar pattern should be used for DataFrame annotations:
from typing import TYPE_CHECKING

from chalk.features import DataFrame, features

if TYPE_CHECKING:
    # Assumes `Transaction` is defined in `src/models/transaction.py`
    from src.models.transaction import Transaction


@features
class User:
    id: str
    # All of `"DataFrame[Transaction]"` must be quoted, not just the `Transaction` part
    transactions: "DataFrame[Transaction]"
In a project with features living in different files, we recommend keeping all of the schema definitions in a single folder, separate from the resolvers. In this case, your folder structure will look something like:
company_chalk/
├── src/
│ ├── models/
│ │ ├── user.py
│ │ ├── profile.py
│ │ ├── __init__.py
│ │ └── ...
│ ├── resolvers/
│ │ ├── ...
│ │ ├── __init__.py
│ │ └── pipelines.py
│ ├── __init__.py
│ ├── datasources.py
│ └── feature_sets.py
├── tests/
│ └── ...
├── notebooks/
│ └── ...
├── .chalkignore
├── chalk.yaml
├── README.md
└── requirements.txt
Keeping the schema definition all in one place (as you might with something like Protobuf or Avro files) helps you keep your schema definitions organized and makes importing them in your resolvers straightforward.
Use chalk.yaml to configure your project’s Docker environment, Python configuration, and metadata validation for your features and resolvers.
project (str)
environments (dictionary)
    [environment_name] (dictionary): For example, prod or qa. Find available environment names in your dashboard. Use default to configure default values to apply to all of this project's environments.
        runtime ("python310" | "python311")
        requirements (str): Installed with pip. This can be either a `requirements.txt` file or a `pyproject.toml` file.
        dockerfile (str)
        platform_version (str)
validation (dictionary)
    feature (dictionary)
        metadata (list)
            name ("description" | "owner" | "tags")
            missing ("info" | "warning" | "error"): Missing metadata is allowed at the info and warning levels, but disallowed at the `error` level.
    resolver (dictionary)
        metadata (list)
            name ("description" | "owner" | "tags")
            missing ("info" | "warning" | "error"): Missing metadata is allowed at the info and warning levels, but disallowed at the `error` level.
Here is a sample chalk.yaml file. In this file, we use a different Dockerfile in production.
project: my-project-id
environments:
  default:
    runtime: python311
    requirements: requirements.txt
  prod:
    dockerfile: ./DockerfileProd
validation:
  feature:
    metadata:
      - name: owner
        missing: error
      - name: description
        missing: warning
      - name: tags
        missing: info
  resolver:
    metadata:
      - name: owner
        missing: error
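One way to read the sample above is as a plain dictionary. The sketch below is not Chalk's implementation: the helper names `effective_environment` and `missing_metadata` are hypothetical, and the key-by-key layering of an environment over `default` is an assumption based on the description of `default`. It shows how the `prod` settings and the metadata rules might combine.

```python
from typing import Any

# The sample configuration, written as the dict a YAML parser would produce.
CONFIG: dict[str, Any] = {
    "project": "my-project-id",
    "environments": {
        "default": {"runtime": "python311", "requirements": "requirements.txt"},
        "prod": {"dockerfile": "./DockerfileProd"},
    },
    "validation": {
        "feature": {
            "metadata": [
                {"name": "owner", "missing": "error"},
                {"name": "description", "missing": "warning"},
                {"name": "tags", "missing": "info"},
            ]
        },
        "resolver": {"metadata": [{"name": "owner", "missing": "error"}]},
    },
}


def effective_environment(config: dict[str, Any], name: str) -> dict[str, Any]:
    """Layer one environment's settings over `default` (assumed semantics)."""
    environments = config.get("environments", {})
    return {**environments.get("default", {}), **environments.get(name, {})}


def missing_metadata(
    config: dict[str, Any], kind: str, metadata: dict[str, Any]
) -> list[tuple[str, str]]:
    """Return (name, level) for each configured field that is absent or empty."""
    rules = config.get("validation", {}).get(kind, {}).get("metadata", [])
    return [(r["name"], r["missing"]) for r in rules if not metadata.get(r["name"])]


prod = effective_environment(CONFIG, "prod")
# A hypothetical feature with a description but no owner or tags.
problems = missing_metadata(CONFIG, "feature", {"owner": "", "description": "User email"})
blocking = [name for name, level in problems if level == "error"]

print(prod)
print(problems)
print(blocking)
```

Under these assumptions, `prod` inherits the runtime and requirements from `default` and adds its own Dockerfile, and only the missing `owner` would be reported at the `error` level.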
Your .chalkignore file should include your scripts, notebooks, and tests. Anything that you are not actively using in your deployment should be added so that non-deployment code does not clutter or interfere with your deployment.
Chalk will install the Python requirements for your project as specified in the requirements parameter of your chalk.yaml file. This can be either a requirements.txt file or a pyproject.toml.
Chalk supports both Poetry and uv for managing your Python dependencies.
You can specify this file’s location and type in your chalk.yaml file like below:
project: my-project-id
environments:
  dev:
    runtime: python311
    requirements: requirements-dev.txt
  prod:
    requirements: pyproject.toml