Local SSDs for spilling and scan caching (AWS)

Overview

Two Velox features benefit from fast local NVMe SSDs (LSSDs) attached to your async offline-query workers:

Spilling writes per-query intermediate state to disk when a query exceeds its memory limit, so large offline queries complete instead of crashing with an out-of-memory error.
Table-scan SSD cache keeps a process-wide on-disk cache of reusable scan ranges from external table sources. The cache survives query completion and engine restarts, so repeated reads of the same partitions skip the round trip to object storage.

Both features can share a single LSSD-backed mount on the node. This page walks through the end-to-end setup.

**Scope:** this guide covers **AWS EKS clusters using Karpenter**. The infrastructure steps (EC2NodeClass, NodePool, instance families) are AWS-specific. The Chalk-side configuration (resource group, Job Queue Consumer, environment variables, client routing) applies regardless of cloud, but the Karpenter-specific UI fields and shell commands on this page will not apply verbatim to GCP GKE or Azure AKS deployments. For GKE local-SSD guidance, see the short note in [Kubernetes Resources Overview](/docs/kube-resources-drilldown#local-ssds-for-temporary-storage) or contact Chalk support.

This setup applies to **async offline queries** (`run_asynchronously=True`), which run on the [job queue](/docs/job-queue). Synchronous offline queries and online queries don't go through the job queue and aren't affected by the configuration on this page.

When this is useful

You’ll see the biggest impact from LSSD-backed workers when:

Async offline queries OOM or take many minutes longer than expected because they’re spilling to slow remote EBS.
Your offline store is backed by Iceberg and queries repeatedly read the same partitions or backfill date ranges — the scan cache turns repeat reads into local disk hits.
You want spill-heavy and cache-heavy workloads isolated from latency-sensitive online queries on a dedicated nodepool.

If your offline store is BigQuery, Snowflake, Redshift, or Databricks, the scan cache won’t help: those backends execute SQL on the warehouse, and results come back through warehouse drivers rather than through Velox’s scan path. Spilling still helps if those queries exceed their memory limit in the engine, but the scan-cache section below applies only to the Iceberg path.

Setup overview

1. Verify the LSSD EC2NodeClass exists in your cluster.
2. Create a dedicated NodePool from the Chalk dashboard.
3. Add a resource group with a Job Queue Consumer that targets the new NodePool.
4. Configure resource requests and environment variables.
5. Route async offline queries to the new resource group from your client code.
6. Verify the setup after the first job.

Step 1: Verify the LSSD EC2NodeClass exists

Karpenter’s EC2NodeClass is an AWS-only resource — these steps don’t apply to GKE or AKS clusters. Chalk’s standard AWS Terraform provisions an EC2NodeClass named al2023-offline-lssd with spec.instanceStorePolicy: RAID0 automatically. Check whether yours is present:

kubectl get ec2nodeclass al2023-offline-lssd

If the resource exists, continue to Step 2.
If it returns NotFound, your cluster is either on an older infrastructure setup or you’re managing the EKS cluster yourself outside Chalk’s Terraform module. Contact Chalk support to have the EC2NodeClass provisioned — it requires cluster-specific IAM and networking values that vary across deployments, so it’s not safe to apply a generic manifest. Once support confirms it’s been created, re-run the kubectl get above and continue to Step 2.

instanceStorePolicy: RAID0 is the critical field on the EC2NodeClass — it makes Karpenter mount the instance’s local NVMe array as the node’s ephemeral storage, so the container overlay and any writes to non-volume paths land on local SSD.
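
The check in this step can also be scripted. The sketch below is one possible way to do it, assuming `kubectl` is on your `PATH` and already pointed at the right cluster. The helper names `fetch_nodeclass` and `uses_raid0_instance_store` are illustrative and not part of any Chalk tooling.

```python
import json
import subprocess
from typing import Optional

NODECLASS_NAME = "al2023-offline-lssd"  # EC2NodeClass name used in this guide


def uses_raid0_instance_store(manifest: dict) -> bool:
    """Return True if the manifest sets spec.instanceStorePolicy to RAID0."""
    return manifest.get("spec", {}).get("instanceStorePolicy") == "RAID0"


def fetch_nodeclass(name: str = NODECLASS_NAME) -> Optional[dict]:
    """Fetch the EC2NodeClass as a dict via kubectl.

    Returns None when kubectl exits with an error (for example NotFound).
    """
    result = subprocess.run(
        ["kubectl", "get", "ec2nodeclass", name, "-o", "json"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return json.loads(result.stdout)
```

Calling `fetch_nodeclass()` and passing the result to `uses_raid0_instance_store` confirms both that the resource exists and that the critical field is set. A `None` result corresponds to the NotFound case described above.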

Step 2: Create the NodePool

In the Chalk dashboard, go to Infrastructure → Nodepools and click + Add New Nodepool. Use these settings:

Field	Value
Nodepool Name	`offline-lssd` (or similar)
EC2NodeClass	`al2023-offline-lssd`
Kubernetes Cluster	your cluster
CPU Limit	`512` (cap total CPU the pool can provision)
Capacity type	`on-demand`
Instance categories	`m`, `c`, `r`
Instance generations	`> 5`
Instance sizes	not in `[nano, micro, small, medium, large]`
Architecture	`amd64`
Zones	your cluster’s availability zones
Isolate this nodepool	✓ checked
Restrict to Chalk workloads only	✓ checked
Nodepool Workload Type	`Default` (leave alone)

Because the al2023-offline-lssd EC2NodeClass sets instanceStorePolicy: RAID0, Karpenter will only provision instance types that have local NVMe storage — no extra constraint is required to filter out non-LSSD families. If no LSSD instance is available in the requested categories or zones, pods will stay Pending rather than fall back to EBS.

Do **not** set Nodepool Workload Type to `Offline`. The dropdown option adds a `chalk.ai/workload-type=offline:NoSchedule` taint that no Chalk pod currently tolerates, which would make the pool repel every workload. Leave it as `Default`.

The two isolation checkboxes generate the taints that exclude unrelated workloads:

chalk.ai/nodepool=offline-lssd:NoSchedule (from “Isolate this nodepool”)
chalk.ai/managed-by=chalk:NoSchedule (from “Restrict to Chalk workloads only”)

Chalk auto-adds matching tolerations to pods that target this pool via the Resource Configuration form in Step 3.
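
To see why these taints isolate the pool, and why an extra `chalk.ai/workload-type=offline:NoSchedule` taint would repel every pod, here is a simplified model of Kubernetes taint and toleration matching. It is a sketch only: the real scheduler handles more cases (such as empty-key tolerations and other effects), and the toleration values shown are an assumption about what Chalk adds, inferred from the taints above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str


@dataclass(frozen=True)
class Toleration:
    key: str
    value: str = ""
    operator: str = "Equal"  # "Equal" or "Exists"
    effect: str = ""         # empty string tolerates any effect


def tolerates(toleration: Toleration, taint: Taint) -> bool:
    """Simplified version of the Kubernetes matching rule."""
    if toleration.effect and toleration.effect != taint.effect:
        return False
    if toleration.key != taint.key:
        return False
    if toleration.operator == "Exists":
        return True
    return toleration.value == taint.value


def can_schedule(tolerations, taints) -> bool:
    """A pod fits only if every NoSchedule taint on the node is tolerated."""
    return all(
        any(tolerates(tol, taint) for tol in tolerations)
        for taint in taints
        if taint.effect == "NoSchedule"
    )


# Taints generated by the two isolation checkboxes.
POOL_TAINTS = [
    Taint("chalk.ai/nodepool", "offline-lssd", "NoSchedule"),
    Taint("chalk.ai/managed-by", "chalk", "NoSchedule"),
]

# Assumed shape of the tolerations Chalk adds to Consumer pods.
CONSUMER_TOLERATIONS = [
    Toleration("chalk.ai/nodepool", "offline-lssd", effect="NoSchedule"),
    Toleration("chalk.ai/managed-by", "chalk", effect="NoSchedule"),
]

# Extra taint added when Nodepool Workload Type is set to Offline.
OFFLINE_TAINT = Taint("chalk.ai/workload-type", "offline", "NoSchedule")
```

Under this model, `can_schedule(CONSUMER_TOLERATIONS, POOL_TAINTS)` is true, a pod with no tolerations is rejected, and adding `OFFLINE_TAINT` to the pool rejects the Consumer pods as well.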

Step 3: Add a resource group with a Job Queue Consumer

Go to Infrastructure → Resource Configuration. At the bottom of the resource groups tree, click + Add Resource Group. Give it a name like offline-lssd.

Under the new resource group, add a Job Queue Consumer service. You do not need to add a separate Job Queue Manager — there is one environment-wide Manager that polls jobs across all resource groups and spawns the per-group Consumer Deployments on demand.

On the Job Queue Consumer page:

Nodepool: select offline-lssd.
Instance Type: leave as None so Karpenter picks from the pool’s allowed instance types.

Step 4: Configure resources and environment variables

Resource requests

Set requests on the Requests panel. Two starting profiles follow; pick one based on the size of your typical async offline query:

Standard (lands on a 2xlarge LSSD instance)

Setting	Value
CPU	`7`
Memory	`50Gi`
Ephemeral Storage	`350Gi`

These requests force Karpenter to pick a 2xlarge LSSD instance (e.g. `r6id.2xlarge`: 8 vCPU, 64 GiB RAM, ~474 GB NVMe).

Heavier (for very large async offline queries)

Setting	Value
CPU	`15`
Memory	`100Gi`
Ephemeral Storage	`600Gi`

These requests force a 4xlarge LSSD instance (e.g. `r6id.4xlarge`: 16 vCPU, 128 GiB RAM, ~950 GB NVMe).

Leave the Limits panel blank so spill writes can use whatever the LSSD provides without an artificial cap. Set Min Instances to 0 to scale to zero when idle, and Max Instances to 2 or 3 to cap concurrent LSSD nodes.
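
The link between the two request profiles and the instance sizes can be illustrated with a small fit check. This is a rough sketch, not Karpenter's actual selection logic: it considers only the two `r6id` shapes quoted above, ignores kubelet and system reservations, and the `smallest_fit` helper is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

GIB = 1024**3  # Kubernetes "Gi" quantities are binary
GB = 1000**3   # NVMe capacities above are quoted in decimal GB


@dataclass(frozen=True)
class Shape:
    name: str
    vcpu: int
    memory_gib: int
    nvme_gb: int


# Shapes quoted in this section, smallest first.
SHAPES = [
    Shape("r6id.2xlarge", 8, 64, 474),
    Shape("r6id.4xlarge", 16, 128, 950),
]


def smallest_fit(cpu: int, memory_gib: int, storage_gib: int) -> Optional[str]:
    """Return the first listed shape whose raw capacity covers the requests."""
    for shape in SHAPES:
        if (
            cpu <= shape.vcpu
            and memory_gib <= shape.memory_gib
            and storage_gib * GIB <= shape.nvme_gb * GB
        ):
            return shape.name
    return None


standard = smallest_fit(cpu=7, memory_gib=50, storage_gib=350)
heavier = smallest_fit(cpu=15, memory_gib=100, storage_gib=600)
```

In this model the Standard profile fits the 2xlarge shape, while the Heavier profile's 15-CPU request rules out the 2xlarge and lands on the 4xlarge.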

The scan cache is per-pod — each Consumer pod has its own cache on its own node’s LSSD, and libchalk uses an exclusive lock so caches are never shared across pods. Setting Min Instances to 1 keeps one warm cache alive, not a pool-wide warm cache. If the workload bursts above one pod, the additional pods start cold and warm their own caches independently. Only raise Min Instances above 0 if the workload re-reads the same data consistently enough that paying for one always-on LSSD instance (~$15/day for an r6id.2xlarge) is worth it.

Environment variables — required for all backends

Add these under Environment Variable Overrides:

Variable	Value	Purpose
`CHALK_VELOX_QUERY_SPILLING_MODE`	`Always`	Enable spilling for queries that exceed their memory limit

In Always mode, libchalk uses the configured spill directory, or falls back to /tmp/spill if no spill directory is set. With instanceStorePolicy: RAID0 in effect, the container’s writable overlay sits on the local NVMe array, so no separate spill path is required for the standard setup.

By default, libchalk sets the process-wide Velox memory cap to 80% of the container’s cgroup memory limit, then sets the per-query spill threshold to 75% of that process cap. With Memory=50Gi, that’s ~30 GiB of in-memory work before spill kicks in; with 100Gi, it’s ~60 GiB.
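
The arithmetic behind those figures is short enough to write out. The sketch below only reproduces the two default percentages stated above. The function name is illustrative, and this is not libchalk's own code.

```python
PROCESS_CAP_PERCENT = 80      # share of the container memory limit given to Velox
SPILL_THRESHOLD_PERCENT = 75  # share of the process cap a query may use before spilling


def spill_threshold_gib(container_memory_gib: float) -> float:
    """Approximate in-memory budget (GiB) before a query starts to spill."""
    process_cap = container_memory_gib * PROCESS_CAP_PERCENT / 100
    return process_cap * SPILL_THRESHOLD_PERCENT / 100


standard_threshold = spill_threshold_gib(50)   # Standard profile
heavier_threshold = spill_threshold_gib(100)   # Heavier profile
```

Because the two percentages multiply to 60%, a quick rule of thumb is that spilling starts at roughly 0.6 times the configured memory.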

Environment variables — optional customization

Set this only if you want the spill files under a specific path:

Variable	Value	Purpose
`CHALK_VELOX_SPILL_DIRECTORY`	`/chalk-lssd-spill`	Custom per-query spill scratch path on local NVMe

/chalk-lssd-spill does not need to be mounted explicitly. With instanceStorePolicy: RAID0, the engine can create the directory inside the container’s writable overlay and the writes still land on LSSD.

Environment variables — Iceberg only (optional)

If your offline store is backed by Iceberg (or you read Parquet/Delta tables directly through Velox via static resolvers), you can also add a scan cache. When enabling the scan cache, set the custom spill directory above so the cache has an explicit LSSD-backed location, then add:

Variable	Value	Purpose
`LIBCHALK_VELOX_TABLE_SCAN_SSD_CACHE_BYTES`	`214748364800`	200 GiB persistent on-disk scan cache, shares the spill mount

The cache directory defaults to CHALK_VELOX_SPILL_DIRECTORY/table_scan_cache when not otherwise configured, so no extra cache path setup is needed.
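
If you keep environment variable overrides in code or templates, a small helper can derive the byte value from a size in GiB and show where the cache will live. This is a sketch under the assumptions in this section: the variable names come from the tables above, while the helper names `lssd_env_overrides` and `default_scan_cache_dir` are made up for illustration.

```python
import posixpath

GIB = 1024**3


def lssd_env_overrides(
    spill_dir: str = "/chalk-lssd-spill", scan_cache_gib: int = 0
) -> dict:
    """Build the overrides described in this step.

    The scan-cache variable is included only when a positive size is given.
    """
    env = {
        "CHALK_VELOX_QUERY_SPILLING_MODE": "Always",
        "CHALK_VELOX_SPILL_DIRECTORY": spill_dir,
    }
    if scan_cache_gib > 0:
        env["LIBCHALK_VELOX_TABLE_SCAN_SSD_CACHE_BYTES"] = str(scan_cache_gib * GIB)
    return env


def default_scan_cache_dir(spill_dir: str) -> str:
    """Default cache location as described above: <spill dir>/table_scan_cache."""
    return posixpath.join(spill_dir, "table_scan_cache")


iceberg_env = lssd_env_overrides(scan_cache_gib=200)
cache_dir = default_scan_cache_dir(iceberg_env["CHALK_VELOX_SPILL_DIRECTORY"])
```

With `scan_cache_gib=200`, the cache variable is set to `214748364800`, matching the table above, and the cache directory resolves to `/chalk-lssd-spill/table_scan_cache`.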

Skip this variable if your offline store is **BigQuery, Snowflake, Redshift, or Databricks**. Those backends execute SQL on the warehouse and never go through Velox's table-scan operators, so the cache would be initialized but never see any reads — wasting LSSD capacity that could otherwise hold spill files.

Sizing the scan cache

214748364800 (200 GiB) is a stock starting value, not a universal default. The right size is roughly the working set of distinct external-table partitions your async offline queries repeatedly touch:

Small / focused workloads (e.g. daily backfills over the same date range): 10-50 GiB is usually enough.
Broad ad-hoc analytics that read many partitions: 100-300 GiB or more.

The cache also has to fit on the LSSD alongside spill scratch. On a 2xlarge LSSD instance (~474 GB usable), a 200 GiB cache (about 215 GB) leaves roughly 260 GB for spill files and the container overlay, which is comfortable. On smaller shapes, scale down. If startup logs warn that the cache directory’s available space is below the configured cache size, either shrink this value or increase the ephemeral-storage request so Karpenter picks a larger LSSD instance.
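
When trying other cache sizes or instance shapes, it helps to do the subtraction in a single unit. The cache is configured in bytes (binary GiB) while the disk capacity is quoted in decimal GB, so the leftover is a little smaller than subtracting the two headline numbers suggests. The helper below is an illustrative sketch, and its name is not part of any Chalk API.

```python
GIB = 1024**3
GB = 1000**3


def remaining_after_cache_gb(disk_gb: float, cache_bytes: int) -> float:
    """Decimal GB left for spill files and the container overlay."""
    return (disk_gb * GB - cache_bytes) / GB


# 2xlarge LSSD shape (~474 GB) with the 200 GiB starting cache size.
left_2xlarge = remaining_after_cache_gb(474, 200 * GIB)

# Same shape with a 50 GiB cache for a small, focused workload.
left_small_cache = remaining_after_cache_gb(474, 50 * GIB)
```

A result that is small or negative means the cache size should shrink or the ephemeral-storage request should grow.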

Step 5: Route queries from your client

The default resource group for offline queries is "default". To send a specific async offline query to the new LSSD-backed resource group, pass ResourceRequests(resource_group=...):

from chalk.client import ChalkClient, ResourceRequests

client = ChalkClient()
client.offline_query(
    input={'user.id': range(1_000_000)},
    output=['user.name'],
    run_asynchronously=True,
    resources=ResourceRequests(
        resource_group="offline-lssd",
    ),
)

Only queries that explicitly opt in via resource_group= will land on the new pool. Existing queries continue to use the default resource group and its existing nodepool, so you can roll out LSSD gradually for the queries that benefit most.

For scheduled queries, the equivalent kwarg lives directly on ScheduledQuery:

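from chalk import ScheduledQuery

ScheduledQuery(
    name="weekly-aggregations",
    schedule="0 0 * * 0",  # cron expression: every Sunday at 00:00
    output=[User.historical_aggregates],  # User is your own feature class, defined elsewhere
    resource_group="offline-lssd",  # route to the LSSD-backed resource group
)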

Step 6: Verify the setup

After running an async offline query against the new resource group, confirm spilling actually happened by checking the query’s performance summary in the Chalk dashboard for spill_enabled=true and a nonzero spilled_bytes value. If neither field appears, the query didn’t exceed its memory limit and didn’t need to spill — small queries that fit in memory won’t trigger it.
