# Dynamic Chalk Queries
source: https://docs.chalk.ai/docs/compute/dynamic-chalk-queries

## Let an agent write and run Chalk queries on demand, fetching exactly the features it reasons it needs.

### Overview

The usual way to give an LLM access to your data is to gather it up front: run a
fixed query, serialize the rows into the prompt, and hope you fetched the right
columns. That is a guess made before the model has reasoned about anything.

A better pattern, and the one Chalk Compute is built for, is to hand the agent a
tool that issues Chalk queries and let it decide what to pull. Because the
agent runs inside a Chalk Function — co-located with
the feature engine, inside your own cloud — each tool call is a real online
query against the feature store. The agent reads the fraud score, decides it
needs the account age, fetches that too, and rules. Nothing is pre-staged, and
no data leaves your VPC.

This is what we mean by data-native agents: the feature store is the
agent's context window, queried on demand.

```
# Inside the agent loop, a tool call becomes a live Chalk query:
ctx = chalk_client.query(
    input={"user.id": user_id},
    output=requested_features,   # ← chosen by the model at inference time
)
```

### The pattern

Three pieces turn an LLM into a data-native agent:

- A feature-fetch tool exposed to the model, whose parameters name the
features it is allowed to request.
- A handler that turns each tool call into a ChalkClient.query(...).
- A @chalkcompute.function wrapping the whole loop so it runs next to the
feature engine with credentials injected.

Here is a refund-abuse investigator built exactly this way. The agent is given a
get_chalk_features tool, and each time it calls that tool we run an online
query for the features it asked for and feed the results back:

```
import json
import chalkcompute


@chalkcompute.function(secrets=[...], image=..., tracing=True)
def investigate_refund(user_id: int, reason: str) -> str:
    from chalk.client import ChalkClient

    chalk_client = ChalkClient()
    messages = [...]  # system prompt + the refund claim

    while True:
        response = openai_client.chat.completions.create(
            model="gpt-5.5",
            messages=messages,
            tools=[{
                "type": "function",
                "function": {
                    "name": "get_chalk_features",
                    "parameters": {  # the model names whichever features it wants
                        "type": "object",
                        "properties": {
                            "user_id": {"type": "integer"},
                            "features": {
                                "type": "array",
                                "items": {"type": "string"},
                            },
                        },
                    },
                },
            }],
        )

        msg = response.choices[0].message
        if not msg.tool_calls:
            return msg.content or ""          # final verdict
        messages.append(msg.model_dump(exclude_none=True))

        for tc in msg.tool_calls:
            args = json.loads(tc.function.arguments)
            # Each tool call becomes a live online query against the feature store.
            with chalkcompute.span("tool.get_chalk_features"):
                result = chalk_client.query(
                    input={"user.id": args["user_id"]},
                    output=args["features"],   # ← the query the agent composed
                )
            rendered = "\n".join(f"{r.field}: {r.value}" for r in result.data)
            messages.append({"role": "tool", "tool_call_id": tc.id, "content": rendered})
```

The model never sees a Chalk SDK or a feature definition — it names the features
it wants as plain strings and emits a JSON array. The query is composed by the
agent at inference time: output=args["features"] is whatever subset the model
decided it needed on this turn. That is a Chalk query the agent wrote.

The output argument accepts feature name strings like "user.is_fraud",
not just User.is_fraud attributes. That is exactly what makes this work — the
model emits strings, and they map straight onto a query.

See Online Queries for the full client API.

### Best Practices

### Scope the agent's data surface with token permissions

Notice the tool lets the model ask for any feature name — there is no
hardcoded allowlist in the schema. That is deliberate. Pinning the permitted
features into a tool schema is brittle: the list drifts the moment someone adds
a feature, lives somewhere other than the feature definitions, and is only as
trustworthy as the code that assembles the prompt. At best it is a hint to the
model, never a security boundary.

The boundary belongs in Chalk instead. The agent queries under a
service token, and tokens carry
tag-based feature permissions — an allowlist or blocklist of feature tags
that Chalk enforces server-side at query time. Tag your features once, at
definition:

```
from chalk.features import features, feature

@features(tags=["agent-readable"])
class User:
    id: int
    total_spend: float
    is_fraud: bool
    # SSN can feed other features but never reach the agent:
    ssn: str = feature(tags=["pii"])
```

Issue the agent a token that allowlists agent-readable and blocklists pii.
Every query the agent composes is checked against that scope, and a request for
an out-of-scope feature comes back Unauthorized — even if the model
hallucinated the name. Chalk also keeps separate permissions for computed-only
versus output features, so a feature can participate in deriving others without
ever being returned to the agent.

Don't keep sensitive features out of an agent's reach with a list in a prompt —
the model can ignore it and the list rots. Give the agent the widest data
surface its job needs and let tag permissions on its token decide what is
off-limits. That boundary is enforced by the engine, not by the model's good
behavior.

### Fetch incrementally, in small batches

Prompt the agent to pull a few features, reason, then pull more — rather than
requesting everything at once. Cheaper queries, smaller prompts, and a tool-call
trace that reads like an investigation: checked fraud score → looked up account
age → ruled. It also lets the agent stop early when it has seen enough.

### Run the loop inside a Chalk Function

Wrap the agent in @chalkcompute.function and inject
Chalk and model credentials with Secret.
The function gives you autoscaling, retries, and concurrency limits for free,
and keeps the ChalkClient co-located with the engine. Call it from anywhere
with RemoteFunction.from_name(...):

```
from chalkcompute import RemoteFunction

investigate = RemoteFunction.from_name("investigate_refund")
verdict = investigate.remote(user_id=1, reason="item not received")
```

### Trace every query

Wrap each tool call in chalkcompute.span(...)
and set tracing=True on the function. You get a span per query showing which
features the agent fetched and when — the audit trail for why an agent
decided what it did. Indispensable for debugging agent behavior and reviewing
high-stakes decisions.

### Control freshness with staleness

If an agent can tolerate slightly stale values, pass staleness to accept
cached features and skip recomputation — faster and cheaper. Reserve fresh
computation for the signals where it matters.

```
result = chalk_client.query(
    input={"user.id": user_id},
    output=["user.total_spend"],
    staleness={"user.total_spend": "1h"},  # cached value up to 1h old is fine
)
```

### Gate side-effecting tools

Read-only feature fetches are safe to let the agent fire at will. Tools that
act — banning a user, issuing a refund, writing to a table — deserve more
care: require human approval, restrict them to specific verdicts, or log them
prominently. Keep the agent's ability to read data wide and its ability to
change things narrow.

### Replay decisions with point-in-time correctness

To backtest an agent or reproduce a past decision, run the same feature reads as
an offline query with input_times. Chalk computes each
feature as of that timestamp, so the agent sees the data it would have seen
then — no leakage from the future. The agent logic is unchanged; only the query
mode differs.

### How it fits together

```
┌─────────────┐   tool call: get_chalk_features(["user.is_fraud"])
│     LLM     │ ───────────────────────────────────────────────┐
│  (the agent)│                                                 ▼
└─────────────┘                                    ┌──────────────────────────┐
       ▲                                           │  @chalkcompute.function  │
       │  features as messages                     │  ChalkClient().query(...)│
       └───────────────────────────────────────── │   → Chalk feature engine  │
                                                   └──────────────────────────┘
                                                      (same cluster, your cloud)
```

The model decides what to fetch; the function executes it as a real Chalk
query against the engine next door. The agent's reasoning and your feature store
operate as one system — which is the whole point of running agents on Chalk
Compute.





