Overview

Chalk sandboxes are lightweight, isolated execution environments for running arbitrary code — agent workloads, model inference, data pipelines, or any container-based task. Each sandbox runs inside a gVisor-hardened container with its own filesystem, network namespace, and resource limits.

Sandboxes are designed around three principles:

  1. Strong isolation. Workloads from different tenants (and different workloads from the same tenant) cannot observe or interfere with each other.
  2. Flexible deployment. The same sandbox abstraction runs on managed serverless infrastructure or on your own Kubernetes clusters.
  3. First-class GPU support. GPU-accelerated workloads use the same API as CPU workloads.

Isolation with gVisor

Every sandbox runs under gVisor, a container runtime that intercepts application system calls through a user-space kernel. Unlike traditional containers that share the host kernel directly, gVisor interposes a second layer of defense:

┌────────────────────────┐
│   Application process  │
├────────────────────────┤
│   gVisor (Sentry)      │  ← intercepts syscalls
├────────────────────────┤
│   Host kernel          │
└────────────────────────┘

This means a kernel exploit in one sandbox cannot compromise the host or other sandboxes. gVisor also restricts the set of available syscalls, reducing the attack surface exposed to untrusted code — particularly important for agent workloads that execute LLM-generated commands.

Each sandbox additionally receives its own:

  • Filesystem namespace. The root filesystem is ephemeral and private. Persistent storage is available through Volumes.
  • Network namespace. Sandboxes have isolated network stacks. Egress can be further restricted with Network Policies.
  • PID namespace. Processes inside a sandbox cannot see or signal processes in other sandboxes.
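These boundaries can be observed directly from client code. The sketch below uses the `Container` API shown later on this page; the exact `exec` semantics are an assumption based on the lifecycle example. It writes a file into one sandbox's private root filesystem and shows it is invisible from a second sandbox:

```python
from chalkcompute import Container, Image

# Start two sandboxes from the same base image.
a = Container(image=Image.debian_slim()).run()
b = Container(image=Image.debian_slim()).run()

# Write a file into sandbox A's ephemeral, private root filesystem.
a.exec("sh", "-c", "echo token > /tmp/token")

# Sandbox B has its own filesystem namespace: the file does not exist there,
# so cat produces no stdout (the error goes to stderr).
result = b.exec("cat", "/tmp/token")
print(result.stdout_text)

a.stop()
b.stop()
```

The same reasoning applies to the PID namespace: a `ps` inside one sandbox lists only that sandbox's processes.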

Deployment models

Managed serverless

Chalk operates a fleet of bare-metal nodes optimized for sandbox workloads. When you create a sandbox without additional configuration, it runs on this managed infrastructure:

from chalkcompute import Container, Image

c = Container(image=Image.debian_slim()).run()

Managed serverless handles provisioning, scaling, and node maintenance. Sandboxes are scheduled across availability zones and can cold-start in under two seconds for common base images.
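Cold-start latency can be measured from the client side. A rough sketch, assuming (as the lifecycle example below suggests) that `run()` blocks until the sandbox is ready:

```python
import time

from chalkcompute import Container, Image

start = time.monotonic()
c = Container(image=Image.debian_slim()).run()  # assumption: blocks until ready
print(f"cold start: {time.monotonic() - start:.2f}s")
c.stop()
```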

Self-hosted Kubernetes (EKS / GKE)

For workloads that must run within your cloud account — for compliance, data residency, or proximity to other infrastructure — Chalk can deploy sandboxes into your existing EKS or GKE clusters.

In this model, you install the Chalk node agent as a DaemonSet. The agent manages gVisor runtime configuration, volume mounts, and GPU device plugin integration. The control plane remains managed by Chalk; your cluster provides the compute.

# Install the Chalk node agent into your cluster
chalk compute install --cluster arn:aws:eks:us-east-1:123456789:cluster/my-cluster

The same Container and SandboxClient APIs work regardless of where the sandbox is scheduled — your application code doesn’t change between managed and self-hosted.
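How a particular sandbox is directed at a self-hosted cluster rather than managed infrastructure is deployment-specific; the snippet below is an illustration only, assuming a hypothetical `cluster` parameter on `Container` (the real selection mechanism may instead live in environment-level configuration):

```python
from chalkcompute import Container, Image

c = Container(
    image=Image.debian_slim(),
    cluster="my-cluster",  # hypothetical parameter, for illustration only
).run()

# Application code is identical to the managed-serverless case.
c.exec("echo", "hello")
c.stop()
```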


Resource allocation

CPU and memory

Specify resource requests when creating a sandbox:

c = Container(
    image=Image.debian_slim(),
    cpu="4",
    memory="16Gi",
)

Resources are guaranteed (requests equal limits), so sandboxes are not subject to noisy-neighbor throttling.
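Because requests equal limits, the allocation you specify is what the workload sees. A quick check from inside the sandbox, assuming `exec` behaves as in the lifecycle example below:

```python
from chalkcompute import Container, Image

c = Container(
    image=Image.debian_slim(),
    cpu="4",
    memory="16Gi",
).run()

# CPU count and memory visible to the workload should match the
# guaranteed allocation above.
print(c.exec("nproc").stdout_text)
print(c.exec("sh", "-c", "grep MemTotal /proc/meminfo").stdout_text)

c.stop()
```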

GPU

GPU-accelerated workloads request a GPU type at creation time:

c = Container(
    image="nvcr.io/nvidia/pytorch:24.01-py3",
    gpu="A100",
    cpu="8",
    memory="64Gi",
)

The Chalk scheduler matches the request to a node with the appropriate GPU hardware and configures the NVIDIA device plugin and driver mounts automatically. GPU workloads run under the same gVisor isolation as CPU workloads.
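Once scheduled, the GPU should be visible inside the sandbox through the standard NVIDIA tooling bundled with the NGC image. A hedged sketch, again assuming the `exec` semantics from the lifecycle example:

```python
from chalkcompute import Container

c = Container(
    image="nvcr.io/nvidia/pytorch:24.01-py3",
    gpu="A100",
    cpu="8",
    memory="64Gi",
).run()

# The driver mounts configured by the scheduler make nvidia-smi work as usual.
print(c.exec("nvidia-smi", "-L").stdout_text)

# PyTorch in the NGC image reaches the device through the normal CUDA APIs.
result = c.exec(
    "python", "-c", "import torch; print(torch.cuda.get_device_name(0))"
)
print(result.stdout_text)

c.stop()
```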


Multi-tenancy

Chalk enforces tenant isolation at every layer of the stack:

  Layer        Mechanism
  Runtime      gVisor user-space kernel syscall interception per sandbox
  Network      Separate network namespace per sandbox; no shared listening sockets
  Storage      Volumes are scoped to the owning environment; cross-tenant access is impossible
  Scheduling   Workloads from different tenants are placed on separate host nodes by default
  Identity     Each sandbox receives a unique workload identity; no shared credentials

For deployments with strict regulatory requirements, dedicated node pools can be configured so that a tenant’s workloads never share physical hardware with any other tenant.


Lifecycle

Sandboxes follow a straightforward lifecycle:

  1. Image resolution. If the image is an Image spec (e.g. Image.debian_slim().pip_install([...])), it is built and cached. Pre-built OCI images are used directly.
  2. Scheduling. The sandbox is placed on a node with sufficient resources. gVisor initializes the isolation boundary.
  3. Running. The sandbox accepts exec calls. Volumes are mounted and accessible.
  4. Termination. The sandbox is stopped explicitly or after its lifetime expires. Ephemeral filesystem state is discarded; volume data persists if synced.

The full lifecycle in code:

from chalkcompute import Container, Image

c = Container(
    image=Image.debian_slim().pip_install(["numpy"]),
    cpu="2",
    memory="4Gi",
    lifetime="1h",
).run()

result = c.exec("python", "-c", "import numpy; print(numpy.__version__)")
print(result.stdout_text)

c.stop()