Queries
Track and manage named queries in Chalk
With Chalk NamedQuery objects, you can define and version your common query patterns in code. This provides a few advantages:
To define a named query, add a NamedQuery object to your Chalk deployment:
from chalk import NamedQuery
from src.models import User

NamedQuery(
    name="fraud",
    input=[User.id],
    output=[
        User.email_age_days,
        User.denylisted,
        User.credit_report.flags,
    ],
    tags=["team:fraud"],
    owner="jodie@chalk.ai",
    description="Primary fraud model for signup",
)
Running chalk apply makes the named query available in your deployment.
Named queries can then be used through any of our clients by specifying the query_name parameter.
Using the Chalk CLI tool, this looks something like:
chalk query --in user.id=1 --query-name fraud
Because a named query has been specified, you don’t need to explicitly pass in the tags and outputs for your query. The above command is equivalent to running the more complicated:
chalk query \
    --in user.id=1 \
    --out user.email_age_days \
    --out user.denylisted \
    --out user.credit_report.flags \
    --tags team:fraud
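The equivalence above can be pictured with a small, self-contained sketch. It is not Chalk's implementation: the dictionary FRAUD_QUERY and the helper expand_to_cli are hypothetical names, used only to show how the outputs and tags stored on a named query map onto the explicit flags of the longer command.

```python
import shlex

# Hypothetical plain-dict stand-in for the "fraud" NamedQuery defined earlier.
FRAUD_QUERY = {
    "name": "fraud",
    "outputs": [
        "user.email_age_days",
        "user.denylisted",
        "user.credit_report.flags",
    ],
    "tags": ["team:fraud"],
}


def expand_to_cli(named_query: dict, inputs: dict) -> str:
    """Build the explicit `chalk query` command that a named query stands in for."""
    args = ["chalk", "query"]
    for feature, value in inputs.items():
        args += ["--in", f"{feature}={value}"]
    for feature in named_query["outputs"]:
        args += ["--out", feature]
    # Assumption of this sketch: one --tags flag per tag. The example in the
    # text only shows a single tag.
    for tag in named_query["tags"]:
        args += ["--tags", tag]
    return shlex.join(args)


print(expand_to_cli(FRAUD_QUERY, {"user.id": 1}))
```

Only the inputs vary per call; everything else comes from the definition, which is why the short form needs nothing beyond --in and --query-name.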
This feature is also accessible in all of our API clients through the query_name parameter.
For instance, in Python, you can run:
from chalk.client import ChalkClient

ChalkClient().query(
    input={"user.id": 1},
    query_name="fraud",
)
To see all the named queries you’ve defined in your current active deployment, you can run:
$ chalk named-query list
<example output>
If you want to create multiple versions of a similar query, you can use the version parameter of the NamedQuery object and the query_name_version parameter of our various clients.
Note that when executing a named query, both the query name and the query version must match. This means that if you’ve defined two named queries in your codebase:
from chalk import NamedQuery
from src.models import User

NamedQuery(
    name="fraud",
    input=[User.id],
    output=[User.denylisted],
)

NamedQuery(
    name="fraud",
    version="1.1.0",
    input=[User.id],
    output=[
        User.email_age_days,
        User.denylisted,
        User.credit_report.flags,
    ],
)
And you run the following query:
chalk query --in user.id=1 --query-name fraud
We will return User.denylisted, since the first named query has no version and no version was passed through query-name-version. To access a versioned named query, the version must be explicitly passed. For example:
chalk query --in user.id=1 --query-name fraud --query-name-version 1.1.0
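The matching rule can be modeled as a lookup keyed on the pair (name, version). The sketch below is illustrative only and does not reflect Chalk's internals: REGISTRY and resolve_outputs are hypothetical names, and raising KeyError on a miss is an assumption, since the text above does not say how an unmatched name and version pair is reported.

```python
from typing import List, Optional

# Hypothetical registry keyed by (name, version); None means "no version".
# It mirrors the two "fraud" definitions above and models the matching rule only.
REGISTRY = {
    ("fraud", None): ["user.denylisted"],
    ("fraud", "1.1.0"): [
        "user.email_age_days",
        "user.denylisted",
        "user.credit_report.flags",
    ],
}


def resolve_outputs(
    query_name: str, query_name_version: Optional[str] = None
) -> List[str]:
    """Return the outputs of the named query whose name AND version both match."""
    key = (query_name, query_name_version)
    if key not in REGISTRY:
        # Assumption: the real failure behavior is not described in the text.
        raise KeyError(
            f"no named query for name={query_name!r}, version={query_name_version!r}"
        )
    return REGISTRY[key]


print(resolve_outputs("fraud"))           # matches the unversioned definition
print(resolve_outputs("fraud", "1.1.0"))  # matches the 1.1.0 definition
```

Because the version is part of the key, omitting it never falls through to a versioned definition. In the API clients, the corresponding selection is made with the query_name_version parameter alongside query_name.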
Sometimes defining NamedQuery objects is not ergonomic or possible. For example, if you are a platform team serving multiple teams, you may not want to define a NamedQuery object for every query that your users run.
In this case, you can use these environment variables:
CHALK_STORE_ADHOC_QUERIES=true
CHALK_PLAN_ADHOC_QUERIES=3
The first environment variable caches ad-hoc query requests in the database. The second environment variable plans up to 3 of the most recent ad-hoc queries. These ad-hoc queries are re-planned at boot so that code or platform changes are reflected in the query plan.
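As a rough mental model of how the two variables interact, the sketch below picks which stored ad-hoc queries would be planned at boot. It is a simplified illustration under stated assumptions, not Chalk's code: the function queries_to_plan_at_boot, the record fields id and last_seen, and the idea that nothing is planned unless storing is enabled are all hypothetical.

```python
from typing import Dict, List, Mapping


def queries_to_plan_at_boot(
    stored: List[Dict], env: Mapping[str, str]
) -> List[Dict]:
    """Pick up to CHALK_PLAN_ADHOC_QUERIES of the most recently seen stored queries."""
    # Assumption: with storing disabled there is nothing cached to plan.
    if env.get("CHALK_STORE_ADHOC_QUERIES", "").lower() != "true":
        return []
    limit = int(env.get("CHALK_PLAN_ADHOC_QUERIES", "0"))
    # Hypothetical "last_seen" field: larger means more recent.
    newest_first = sorted(stored, key=lambda q: q["last_seen"], reverse=True)
    return newest_first[:limit]


# Hypothetical stored ad-hoc requests, standing in for rows in the database.
stored_requests = [
    {"id": "a", "last_seen": 1},
    {"id": "b", "last_seen": 5},
    {"id": "c", "last_seen": 3},
    {"id": "d", "last_seen": 4},
]
settings = {"CHALK_STORE_ADHOC_QUERIES": "true", "CHALK_PLAN_ADHOC_QUERIES": "3"}

print([q["id"] for q in queries_to_plan_at_boot(stored_requests, settings)])
```

Re-running this selection on every boot is what lets the plans pick up code or platform changes made since the queries were first stored.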