Queries
Track and manage named queries in Chalk
With Chalk NamedQuery objects, you can define and version your common query patterns in code. This provides a few advantages:
To define a named query, add a NamedQuery
object to your Chalk deployment:
from chalk import NamedQuery
from src.feature_sets import Book, Author
NamedQuery(
name="book_key_information",
input=[Book.id],
output=[
Book.id,
Book.title,
Book.author.name,
Book.year,
Book.short_description
],
tags=["team:analytics"],
staleness={
Book.short_description: "0s"
},
owner="mary.shelley@aol.com",
description=(
"Return a condensed view of a book, including its title, author, "
"year, and short description."
)
)
Running chalk apply
makes the named query available in your deployment.
Named queries can then be leveraged through any of our clients by specifying the query_name
parameter.
Using the Chalk CLI tool, this looks something like:
chalk query --in book.id=1 --query-name book_key_information
Because a named query has been specified, you don’t need to explicitly pass in the tags and outputs for your query. The above command is equivalent to running the more complicated:
chalk query --in book.id=1 \
--out book.id \
--out book.title \
--out book.author \
--out book.year \
--out book.author.name \
--out book.short_description \
--staleness book.short_description=0s \
--tags team:analytics
This feature is also accessible in all of our API clients through the query_name
parameter.
For instance, in Python, you can run:
from chalk.client import ChalkClient
client = ChalkClient()
client.query(
input={"book.id": 1},
query_name="book_key_information",
)
To see all the named queries you’ve defined in your current active deployment, you can run:
$ chalk named-query list
<example output>
If you want to create multiple versions of a similar query, you can use the version parameter of the NamedQuery object and the query_name_version parameter of our various clients.
Note, when executing a named query both the query name and the query version must match. This means that if you’ve defined four named queries in your codebase:
from chalk import NamedQuery
from src.feature_sets import Book
first_nq = NamedQuery(name="book_id", input=[Book.id], output=[Book.id])
second_nq = NamedQuery(name="book_id", input=[Book.id], version=1, output=[Book.name])
third_nq = NamedQuery(name="book_id", input=[Book.id], version=2, output=[Book.author])
fourth_nq = NamedQuery(name="book_id", input=[Book.id], version=3, output=[Book.version])
And you run the following query:
chalk query --in book.id=1 --query-name book_id
We will return Book.id
since the first named query has no version and no version was passed
through query-name-version
. To access a version named query, the version must be
explicitly passed. For example:
chalk query --in book.id=1 --query-name book_id --query-name-version 1