Overview

Every sandbox and container runs from an OCI image. You can reference a pre-built image by tag, or build one declaratively using the Image API. Images are content-addressed: the build service hashes the full specification (base image, installed packages, file contents) and skips any build whose hash already exists in the cache. Rebuilds after a minor dependency change typically complete in seconds.


Building images

The Image class exposes a fluent, method-chaining API. Each method returns a new Image instance, so intermediate images can be shared and extended without mutation.

from chalkcompute import Image

img = (
    Image.debian_slim("3.12")
    .pip_install(["requests", "pandas>=2.0"])
    .run_commands("apt-get update && apt-get install -y git curl")
    .workdir("/app")
    .env({"PROJECT_ROOT": "/app"})
)
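
Because each method returns a new Image rather than mutating the receiver, an intermediate image can serve as a shared base for several variants. A small example using the same builder methods shown above (package choices are illustrative):

```python
from chalkcompute import Image

# A shared base; never mutated by later chaining.
base = Image.debian_slim("3.12").pip_install(["requests"])

# Two independent variants branched from the same base.
web = base.pip_install(["fastapi", "uvicorn"]).workdir("/srv")
jobs = base.pip_install(["celery"]).env({"ROLE": "worker"})
```

Each variant carries the full chain of steps from its base, so both benefit from the same cached layers for the shared prefix.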

Base images

  • Image.debian_slim(python_version) — python:<version>-slim-bookworm; a lightweight default for Python workloads
  • Image.base(image) — an arbitrary OCI image reference (e.g. "node:22-slim", "ghcr.io/org/image:tag")
  • Image.from_dockerfile(path) — parse an existing Dockerfile and continue chaining on top
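
For instance, an existing Dockerfile can be extended with additional builder steps (the Dockerfile path and packages here are illustrative):

```python
from chalkcompute import Image

# Parse an existing Dockerfile, then keep chaining on top of it.
img = (
    Image.from_dockerfile("./Dockerfile")
    .pip_install(["httpx"])
    .env({"LOG_LEVEL": "info"})
)
```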

Installing dependencies

Python packages — use pip_install with a list of requirement specifiers, or point at an existing requirements file:

img = Image.debian_slim().pip_install(["torch==2.3.0", "transformers"])

# Or from a requirements file
img = Image.debian_slim().pip_install_from_requirements("requirements.txt")

System packages — use run_commands for apt, apk, or any other shell command that should execute during the build:

img = Image.debian_slim().run_commands(
    "apt-get update && apt-get install -y ffmpeg libsndfile1",
)

Raw Dockerfile instructions — for anything not covered by the builder methods:

img = Image.debian_slim().dockerfile_commands([
    "EXPOSE 8080",
    "RUN useradd -m worker",
    "USER worker",
])

Adding local files

Local files can be included in the image via copy or volume mount:

# Volume mount (default) — file is uploaded at sandbox start, not baked into the image.
# Faster iteration when the file changes frequently.
img = Image.debian_slim().add_local_file("app.py", "/app/app.py")

# Copy — inlines the file contents into the image spec.
# Better for small, stable files that should be part of the cached image.
img = Image.debian_slim().add_local_file(
    "app.py", "/app/app.py", strategy="copy"
)

# Entire directories work the same way
img = Image.debian_slim().add_local_dir("./src", "/app/src")

Content-addressed caching

Image builds are content-addressed. The build service computes a deterministic hash from the full image specification — base image digest, build steps, package lists, inlined file contents, and environment variables. If that hash already exists in the image registry, the build is skipped entirely and the existing image is returned.

This means:

  • Identical specifications across different sandboxes or users share the same cached image.
  • Adding a single pip package to an existing image only rebuilds the final layer, not the entire image.
  • Re-running a sandbox definition with no changes incurs zero build time.
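
The hashing scheme itself is internal to the build service, but the idea can be sketched in a few lines of plain Python: serialize the specification canonically, hash it, and use the digest as a cache key. (Field names here are hypothetical; this is not the real implementation.)

```python
import hashlib
import json

def spec_hash(spec: dict) -> str:
    # Canonical serialization: sorted keys, no insignificant whitespace,
    # so logically identical specs always produce identical bytes.
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

a = spec_hash({"base": "python:3.12-slim-bookworm", "pip": ["requests"]})
b = spec_hash({"pip": ["requests"], "base": "python:3.12-slim-bookworm"})
c = spec_hash({"base": "python:3.12-slim-bookworm", "pip": ["requests", "pandas"]})

assert a == b  # key order is irrelevant: same spec, same hash, cache hit
assert a != c  # any change to the spec yields a new hash, forcing a rebuild
```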

GPU images

For GPU-accelerated workloads, start from a base image that already includes the CUDA toolkit rather than installing it during the build. NVIDIA publishes pre-built images with CUDA, cuDNN, and popular ML frameworks:

# PyTorch with CUDA — starts from NVIDIA's pre-built image
img = (
    Image.base("nvcr.io/nvidia/pytorch:24.01-py3")
    .pip_install(["transformers", "accelerate"])
)

# TensorFlow
img = Image.base("nvcr.io/nvidia/tensorflow:24.01-tf2-py3")

Installing CUDA from scratch during an image build is slow and produces large layers that defeat caching. Starting from a pre-built image also helps ensure compatibility with the GPU driver on the host.


Image pull performance

Built images are stored in a regional OCI registry close to the compute fleet. To minimize cold-start latency, Chalk maintains an on-node image cache on each host:

  • Warm start. If the image (by content hash) is already cached on the node, the sandbox starts immediately — no pull required.
  • Cold start. The first time a node sees a particular image, it pulls from the registry. For large images (multi-GB GPU images, for example) this pull may take 30–60 seconds. Subsequent sandboxes on the same node using the same image start instantly.

Common base images (debian_slim, popular NVIDIA containers) are pre-warmed across the fleet. If startup latency is critical for a custom image, you can trigger a pre-warm explicitly:

chalk compute image warm --image <image-hash>

This distributes the image to nodes ahead of sandbox creation.