Compute
Let an agent write and run Chalk queries on demand, fetching exactly the features it reasons it needs.
The usual way to give an LLM access to your data is to gather it up front: run a fixed query, serialize the rows into the prompt, and hope you fetched the right columns. That is a guess made before the model has reasoned about anything.
A better pattern, and the one Chalk Compute is built for, is to hand the agent a tool that issues Chalk queries and let it decide what to pull. Because the agent runs inside a Chalk Function — co-located with the feature engine, inside your own cloud — each tool call is a real online query against the feature store. The agent reads the fraud score, decides it needs the account age, fetches that too, and rules. Nothing is pre-staged, and no data leaves your VPC.
This is what we mean by data-native agents: the feature store is the agent’s context window, queried on demand.
# Inside the agent loop, a tool call becomes a live Chalk query:
ctx = chalk_client.query(
    input={"user.id": user_id},
    output=requested_features,  # ← chosen by the model at inference time
)

Three pieces turn an LLM into a data-native agent:
- A tool, get_chalk_features, through which the model names the features it wants.
- A handler that runs each tool call through ChalkClient.query(...).
- @chalkcompute.function wrapping the whole loop so it runs next to the
  feature engine with credentials injected.

Here is a refund-abuse investigator built exactly this way. The agent is given a
get_chalk_features tool, and each time it calls that tool we run an online
query for the features it asked for and feed the results back:
import json

import chalkcompute


@chalkcompute.function(secrets=[...], image=..., tracing=True)
def investigate_refund(user_id: int, reason: str) -> str:
    from chalk.client import ChalkClient
    from openai import OpenAI

    chalk_client = ChalkClient()
    # Assumption: the model API key is among the injected secrets.
    openai_client = OpenAI()
    messages = [...]  # system prompt + the refund claim

    while True:
        response = openai_client.chat.completions.create(
            model="gpt-5.5",
            messages=messages,
            tools=[{
                "type": "function",
                "function": {
                    "name": "get_chalk_features",
                    "parameters": {  # the model names whichever features it wants
                        "type": "object",
                        "properties": {
                            "user_id": {"type": "integer"},
                            "features": {
                                "type": "array",
                                "items": {"type": "string"},
                            },
                        },
                    },
                },
            }],
        )
        msg = response.choices[0].message
        if not msg.tool_calls:
            return msg.content or ""  # final verdict

        # Keep the assistant turn so the tool results can refer back to it.
        messages.append(msg.model_dump(exclude_none=True))
        for tc in msg.tool_calls:
            args = json.loads(tc.function.arguments)
            # Each tool call becomes a live online query against the feature store.
            with chalkcompute.span("tool.get_chalk_features"):
                result = chalk_client.query(
                    input={"user.id": args["user_id"]},
                    output=args["features"],  # ← the query the agent composed
                )
            rendered = "\n".join(f"{r.field}: {r.value}" for r in result.data)
            messages.append({"role": "tool", "tool_call_id": tc.id, "content": rendered})

The model never sees a Chalk SDK or a feature definition: it names the features
it wants as plain strings and emits a JSON array. The query is composed by the
agent at inference time: output=args["features"] is whatever subset the model
decided it needed on this turn. That is a Chalk query the agent wrote.
The `output` argument accepts **feature name strings** like `"user.is_fraud"`, not just `User.is_fraud` attributes. That is exactly what makes this work — the model emits strings, and they map straight onto a query.
See Online Queries for the full client API.
Notice the tool lets the model ask for any feature name — there is no hardcoded allowlist in the schema. That is deliberate. Pinning the permitted features into a tool schema is brittle: the list drifts the moment someone adds a feature, lives somewhere other than the feature definitions, and is only as trustworthy as the code that assembles the prompt. At best it is a hint to the model, never a security boundary.
The boundary belongs in Chalk instead. The agent queries under a service token, and tokens carry tag-based feature permissions — an allowlist or blocklist of feature tags that Chalk enforces server-side at query time. Tag your features once, at definition:
from chalk.features import features, feature


@features(tags=["agent-readable"])
class User:
    id: int
    total_spend: float
    is_fraud: bool
    # SSN can feed other features but never reach the agent:
    ssn: str = feature(tags=["pii"])

Issue the agent a token that allowlists agent-readable and blocklists pii.
Every query the agent composes is checked against that scope, and a request for
an out-of-scope feature comes back Unauthorized — even if the model
hallucinated the name. Chalk also keeps separate permissions for computed-only
versus output features, so a feature can participate in deriving others without
ever being returned to the agent.
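Because a denied feature is an expected outcome rather than a crash, the tool handler can report it back to the model so the agent carries on with what it is allowed to see. The sketch below is a minimal illustration, not Chalk's API: `run_feature_tool` and the `query_fn` callable are hypothetical names, and it assumes the denial surfaces as a raised exception. Check how your client version reports permission errors (a raised exception versus an errors field on the result) before relying on this shape.

```python
from typing import Callable, Mapping, Sequence

# Hypothetical signature: query_fn(user_id, features) -> {feature_name: value}.
# In the agent above, this would wrap chalk_client.query(...).
QueryFn = Callable[[int, Sequence[str]], Mapping[str, object]]


def run_feature_tool(query_fn: QueryFn, user_id: int, features: Sequence[str]) -> str:
    """Run one tool call and always return text the model can read."""
    try:
        values = query_fn(user_id, features)
    except Exception as exc:  # assumption: a denied feature raises
        # Report the failure instead of crashing the loop, so the model can
        # drop the feature or ask for a different one on its next turn.
        return f"error: could not fetch {list(features)}: {exc}"
    return "\n".join(f"{name}: {value}" for name, value in values.items())
```

The returned string goes into the `tool` message as usual, so a refusal becomes one more observation for the agent to reason about.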
Don't keep sensitive features out of an agent's reach with a list in a prompt — the model can ignore it and the list rots. Give the agent the widest data surface its job needs and let tag permissions on its token decide what is off-limits. That boundary is enforced by the engine, not by the model's good behavior.
Prompt the agent to pull a few features, reason, then pull more — rather than requesting everything at once. Cheaper queries, smaller prompts, and a tool-call trace that reads like an investigation: checked fraud score → looked up account age → ruled. It also lets the agent stop early when it has seen enough.
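One way to encourage this is to say so in the system prompt and back it with a soft cap in the tool handler. The sketch below is illustrative only: `SYSTEM_PROMPT`, `MAX_FEATURES_PER_CALL`, and `check_batch_size` are hypothetical names, and the cap of 5 is an arbitrary choice. The cap is a cost nudge, not a security control.

```python
from typing import Optional, Sequence

# Hypothetical prompt text; tune the wording for your own agent.
SYSTEM_PROMPT = (
    "You investigate refund claims. Use get_chalk_features to fetch a few "
    "features at a time, reason about what you learned, and fetch more only "
    "if you still need it. Stop as soon as you can reach a verdict."
)

MAX_FEATURES_PER_CALL = 5  # arbitrary illustrative limit


def check_batch_size(features: Sequence[str]) -> Optional[str]:
    """Return a message for the model if the batch is too large, else None."""
    if len(features) > MAX_FEATURES_PER_CALL:
        return (
            f"error: requested {len(features)} features; request at most "
            f"{MAX_FEATURES_PER_CALL} per call and narrow down as you go."
        )
    return None
```

When `check_batch_size` returns a message, send it back as the tool result instead of running the query.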
Wrap the agent in @chalkcompute.function and inject
Chalk and model credentials with Secret.
The function gives you autoscaling, retries, and concurrency limits for free,
and keeps the ChalkClient co-located with the engine. Call it from anywhere
with RemoteFunction.from_name(...):
from chalkcompute import RemoteFunction

investigate = RemoteFunction.from_name("investigate_refund")
verdict = investigate.remote(user_id=1, reason="item not received")

Wrap each tool call in chalkcompute.span(...)
and set tracing=True on the function. You get a span per query showing which
features the agent fetched and when — the audit trail for why an agent
decided what it did. Indispensable for debugging agent behavior and reviewing
high-stakes decisions.
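Alongside spans, the message history itself records what was fetched and in what order. The helper below is a small sketch, assuming messages are stored as dicts in the OpenAI chat format used in the agent loop above; `fetch_trace` is a hypothetical name, and it complements tracing rather than replacing it.

```python
import json
from typing import Any, Dict, List


def fetch_trace(messages: List[Dict[str, Any]]) -> List[List[str]]:
    """List the features requested by each get_chalk_features call, in order."""
    trace: List[List[str]] = []
    for message in messages:
        # Only assistant turns carry tool calls; other roles are skipped.
        for call in message.get("tool_calls") or []:
            fn = call["function"]
            if fn["name"] == "get_chalk_features":
                trace.append(json.loads(fn["arguments"])["features"])
    return trace
```

Logging this list next to the final verdict gives reviewers the sequence of reads without opening the trace viewer.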
If an agent can tolerate slightly stale values, pass staleness to accept
cached features and skip recomputation — faster and cheaper. Reserve fresh
computation for the signals where it matters.
result = chalk_client.query(
    input={"user.id": user_id},
    output=["user.total_spend"],
    staleness={"user.total_spend": "1h"},  # cached value up to 1h old is fine
)

Read-only feature fetches are safe to let the agent fire at will. Tools that act (banning a user, issuing a refund, writing to a table) deserve more care: require human approval, restrict them to specific verdicts, or log them prominently. Keep the agent’s ability to read data wide and its ability to change things narrow.
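That split can be made explicit in the tool dispatcher. The following is a minimal sketch with hypothetical names throughout (`READ_TOOLS`, `ACTION_TOOLS`, `dispatch`, `approve`); it shows one possible policy, requiring an approval callback before any action tool runs, and is not part of Chalk's API.

```python
from typing import Callable, Dict

Tool = Callable[[dict], str]

# Hypothetical registries: read-only tools run freely, action tools are gated.
READ_TOOLS: Dict[str, Tool] = {}
ACTION_TOOLS: Dict[str, Tool] = {}


def dispatch(name: str, args: dict, approve: Callable[[str, dict], bool]) -> str:
    """Run a tool call, asking `approve` first if the tool changes state."""
    if name in READ_TOOLS:
        return READ_TOOLS[name](args)
    if name in ACTION_TOOLS:
        if not approve(name, args):
            return f"denied: {name} requires human approval"
        return ACTION_TOOLS[name](args)
    return f"error: unknown tool {name}"
```

Keeping the two registries separate means a new action tool is gated by default; it only runs unreviewed if someone deliberately registers it as read-only.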
To backtest an agent or reproduce a past decision, run the same feature reads as
an offline query with input_times. Chalk computes each
feature as of that timestamp, so the agent sees the data it would have seen
then — no leakage from the future. The agent logic is unchanged; only the query
mode differs.
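A replay helper might look like the sketch below. It assumes the client exposes `offline_query` accepting `input`, `input_times`, and `output`, and that string feature names are accepted as input keys the way they are online; `replay_feature_reads` is a hypothetical name. Verify the exact signature against the offline query reference before using it.

```python
from datetime import datetime
from typing import Any, Sequence


def replay_feature_reads(
    client: Any, user_id: int, features: Sequence[str], as_of: datetime
) -> Any:
    """Re-run one tool call's feature read as of a past timestamp."""
    # Assumption: offline_query takes column-oriented inputs, with one
    # input_times entry per input row.
    return client.offline_query(
        input={"user.id": [user_id]},
        input_times=[as_of],
        output=list(features),
    )
```

Swapping this in for the online call inside the tool handler lets the same agent loop run against historical data.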
┌─────────────┐   tool call: get_chalk_features(["user.is_fraud"])
│     LLM     │ ───────────────────────────────────────────────┐
│ (the agent) │                                                ▼
└─────────────┘                               ┌──────────────────────────┐
       ▲                                      │ @chalkcompute.function   │
       │      features as messages            │ ChalkClient().query(...) │
       └───────────────────────────────────── │ → Chalk feature engine   │
                                              └──────────────────────────┘
                                               (same cluster, your cloud)
The model decides what to fetch; the function executes it as a real Chalk query against the engine next door. The agent’s reasoning and your feature store operate as one system — which is the whole point of running agents on Chalk Compute.