Overview

Volumes provide persistent file storage that can be shared across sandboxes and containers. Under the hood, volumes are backed by a Rust-based FUSE driver that mounts directly into the container’s filesystem. This is file-level storage, not block storage — workloads interact with volumes through normal filesystem operations, and the driver handles tiered caching and replication to object storage transparently.

Volumes can be managed through the Python SDK, the CLI, or the web dashboard.


Creating and using volumes

CLI

# Create a volume
chalk volume create my-data

# Upload files
chalk volume put my-data ./local-model.bin models/latest.bin

# List contents
chalk volume ls my-data

# Download a file
chalk volume get my-data models/latest.bin ./downloaded-model.bin

Python SDK

from chalkcompute import Volume

vol = Volume("my-data")

# Upload files
vol.put_file("models/latest.bin", model_bytes)

# Batch upload for efficiency
with vol.batch_upload() as batch:
    batch.put("data/train.csv", train_csv)
    batch.put("data/eval.csv", eval_csv)

# Read files
data = vol.read_file("models/latest.bin")

# List files
for f in vol.listdir("data/"):
    print(f"{f.path}  {f.size} bytes")

Mounting into sandboxes

Pass volumes when creating a sandbox to mount them into the container filesystem:

from chalkcompute import SandboxClient, Image, Volume

vol = Volume("my-data")

client = SandboxClient()
sandbox = client.create(
    image=Image.debian_slim(),
    volumes=[vol],
)

# The volume is mounted and accessible as a normal directory
sandbox.exec("ls", "/volumes/my-data")

Web browser

Volumes are also browsable from the Chalk dashboard. You can navigate the file tree, preview file contents, and upload or download files directly from the browser.


Architecture

FUSE driver

The volume mount is implemented as a Rust-based FUSE (Filesystem in Userspace) driver that runs inside the container. The driver exposes the volume as a standard POSIX directory, so workloads don’t need special libraries or APIs to read and write files — any tool that works with the filesystem works with a volume.

The FUSE driver handles:

  • Tiered caching. Frequently accessed files are cached on the container’s local disk. Cold data is fetched from object storage on demand.
  • Transparent replication. All persisted data is durably stored in object storage (S3 or GCS, depending on your environment). Local cache is ephemeral and rebuilt automatically.
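The read path these two points describe can be sketched with a toy model — plain Python, not the actual Rust driver: reads check the local cache first, fall back to object storage on a miss, and populate the cache so subsequent reads are served locally. The class and names here are illustrative.

```python
class TieredVolumeReader:
    """Toy model of the driver's read path: an ephemeral local-disk
    cache in front of durable object storage. Illustrative only."""

    def __init__(self, object_store: dict):
        self.object_store = object_store  # durable tier (stands in for S3/GCS)
        self.cache = {}                   # ephemeral local-disk tier
        self.cache_hits = 0
        self.fetches = 0

    def read(self, path: str) -> bytes:
        if path in self.cache:            # hot: serve from local disk
            self.cache_hits += 1
            return self.cache[path]
        self.fetches += 1                 # cold: fetch from object storage
        data = self.object_store[path]
        self.cache[path] = data           # populate cache for next read
        return data


reader = TieredVolumeReader({"models/latest.bin": b"\x00" * 4})
reader.read("models/latest.bin")              # cold read: one fetch
reader.read("models/latest.bin")              # hot read: served from cache
print(reader.fetches, reader.cache_hits)      # 1 1
```

Because the cache is a pure acceleration layer, losing it (as happens when a container restarts) costs only re-fetch latency, never data.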

Copy-on-write semantics

Volumes use batch copy-on-write. When a container writes to a mounted volume, the writes are buffered locally on the container’s filesystem. These local writes are not visible to other containers and are not persisted to the backing store until sync() is called on the volume:

# Writes inside the container are local until sync
sandbox.exec("cp", "output.parquet", "/volumes/my-data/output.parquet")

# Flush writes to the backing store
vol.sync()

This design avoids partial-write visibility — other consumers of the volume see a consistent snapshot, not a stream of in-progress file mutations. If the container terminates before sync, uncommitted writes are discarded.
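The visibility rule above can be made concrete with a small model — illustrative Python, not the real driver or SDK: each mount buffers its writes privately, and only sync() publishes them to the shared backing store in one batch.

```python
class CowVolume:
    """Toy model of batch copy-on-write. Each container gets a mount
    that buffers writes locally; sync() publishes them all at once."""

    def __init__(self):
        self.backing = {}  # what every consumer of the volume sees

    def mount(self):
        return CowMount(self)


class CowMount:
    def __init__(self, volume):
        self.volume = volume
        self.buffer = {}  # local, uncommitted writes

    def write(self, path, data):
        self.buffer[path] = data  # invisible to other mounts

    def read(self, path):
        # local writes shadow the backing store within this container
        return self.buffer.get(path, self.volume.backing.get(path))

    def sync(self):
        self.volume.backing.update(self.buffer)  # publish as one batch
        self.buffer.clear()

    def discard(self):
        self.buffer.clear()  # container terminated before sync


vol = CowVolume()
writer, other = vol.mount(), vol.mount()
writer.write("output.parquet", b"rows")
print(other.read("output.parquet"))  # None: unsynced writes are invisible
writer.sync()
print(other.read("output.parquet"))  # b'rows': visible after sync
```

The key property is that `other` can never observe a half-written batch: before sync() it sees nothing, after sync() it sees everything.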


Versioning and fork semantics

Every sync creates an immutable snapshot of the volume’s state. Past versions are retained and can be retrieved by version ID:

# List available versions
versions = vol.versions()

# Open a previous version (read-only)
old = vol.at_version(versions[-2].id)
data = old.read_file("models/latest.bin")

Because prior versions are immutable and cheaply addressable, volumes support fork semantics. You can spawn multiple sandboxes from the same volume version, let them diverge independently, and sync their results into separate version lineages — without copying the underlying data.

This is particularly useful for coding agents that need to explore multiple solution paths in parallel:

base_version = vol.latest_version()

# Fork two sandboxes from the same starting state
sandbox_a = client.create(
    image=Image.debian_slim(),
    volumes=[vol.at_version(base_version.id)],
)
sandbox_b = client.create(
    image=Image.debian_slim(),
    volumes=[vol.at_version(base_version.id)],
)

# Each sandbox writes independently — no interference
sandbox_a.exec("python", "approach_a.py")
sandbox_b.exec("python", "approach_b.py")

Each sandbox’s writes are isolated until explicitly synced, and the original version remains available regardless of what the forks produce.
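Putting versioning and forking together, a compact model — illustrative, not the actual SDK — shows why forks are cheap: every commit is an immutable snapshot, and a fork simply starts a new lineage from an existing snapshot rather than copying data.

```python
from types import MappingProxyType


class VersionedVolume:
    """Toy model of immutable snapshots and fork semantics."""

    def __init__(self):
        self.versions = []  # append-only list of immutable snapshots

    def commit(self, files: dict) -> int:
        # MappingProxyType makes the stored snapshot read-only
        self.versions.append(MappingProxyType(dict(files)))
        return len(self.versions) - 1  # version id

    def at_version(self, vid: int):
        return self.versions[vid]


vol = VersionedVolume()
base = vol.commit({"main.py": "print('hi')"})

# Two forks diverge from the same snapshot; the snapshot itself is shared
fork_a = dict(vol.at_version(base)); fork_a["main.py"] = "approach A"
fork_b = dict(vol.at_version(base)); fork_b["main.py"] = "approach B"

# Each fork syncs into its own version lineage
vid_a = vol.commit(fork_a)
vid_b = vol.commit(fork_b)

# The base snapshot is untouched by either fork
print(vol.at_version(base)["main.py"])  # print('hi')
```

A production implementation would deduplicate file content between snapshots (content addressing) instead of storing full copies, but the external semantics are the same: old versions never change, and lineages diverge freely.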