Background Persistence Installation Via the UI
Chalk uses background writers, hosted in the “Customer Cloud” Kubernetes cluster, to write information about queries to various storage locations.
In order to install Chalk persistence writers, you need to have the following:
If using Kafka:
If using Pubsub:
Navigate to the Settings/Team/Shared Resources/Background Persistence page in the Chalk UI to view the background persistence configuration. If no background persistence is configured, you will see a message indicating that no background persistence is currently present, and the first save and apply will create the background persistence writers.
Chalk supports different types of background persistence writers, each designed for specific data flow and storage purposes:
COPY INTO operations.
bigquery-streaming-write-loader is typically used instead.
Each writer type requires specific subscription IDs and topics to be configured in the common persistence specifications.
When using Pub/Sub, topics and subscriptions are two separate entities, but for Kafka, the same topic is used for both publishing and subscribing. Additionally, Kafka requires authentication credentials to be provided, whereas Pub/Sub authenticates with its Google identity.
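The difference between the two backends can be sketched as follows. This is an illustrative helper, not part of Chalk: the function name `bus_ids`, the `"PUBSUB"` backend value, and the `-sub` suffix for Pub/Sub subscription names are assumptions made for the example. Only the Kafka behavior (one name for both publishing and subscribing) comes from the text above.

```python
def bus_ids(backend: str, topic: str) -> dict:
    """Return the topic and subscription ID to configure for one bus.

    Hypothetical helper for illustration only.
    """
    backend = backend.upper()
    if backend == "KAFKA":
        # Kafka: the same topic name is used to publish and to subscribe.
        return {"topic": topic, "subscription_id": topic}
    if backend == "PUBSUB":
        # Pub/Sub: the subscription is a separate entity from the topic.
        # The "-sub" suffix is an assumed naming convention.
        return {"topic": topic, "subscription_id": f"{topic}-sub"}
    raise ValueError(f"unsupported bus backend: {backend!r}")
```

For Kafka, `bus_ids("KAFKA", "metrics-bus-1")` yields the same name for both fields, which matches how a Kafka configuration sets `metrics_bus_topic_id` and `metrics_bus_subscription_id` to the same value.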
In the JSON format, these fields are in the common_specs field.
bus_backend (string)
namespace (string)
service_account_name (string)
secret_client (string)
kafka_dlq_topic (string)
api_server_host (string)
kafka_sasl_secret (string)
kafka_bootstrap_servers (string)
kafka_security_protocol (string)
kafka_sasl_mechanism (string)
redis_is_clustered (string)
snowflake_storage_integration_name (string)
metadata_provider (string)
name (string)
bus_subscriber_type (string)
request (object)
limit (object)
version (string)
default_replica_count (int)
In the JSON format, these fields are in the common_specs field but are not necessarily required.
Each writer requires an image and some, but not all, of the subscription and topic IDs.
Each writer’s specification form asks for the fields and images that writer requires.
bus_writer_image_go (string)
bus_writer_image_python (string)
bus_writer_image_bswl (string)
bigquery_parquet_upload_subscription_id (string)
bigquery_streaming_write_subscription_id (string)
bigquery_streaming_write_topic (string)
bigquery_upload_bucket (string)
bigquery_upload_topic (string)
metrics_bus_subscription_id (string)
metrics_bus_topic_id (string)
result_bus_metrics_subscription_id (string)
result_bus_offline_store_subscription_id (string)
result_bus_online_store_subscription_id (string)
The following is an example configuration for background persistence writers:
{
  "common_persistence_specs": {
    "bus_backend": "KAFKA",
    "bus_writer_image_go": "<go bus writer image>",
    "bus_writer_image_python": "<python bus writer image>",
    "bus_writer_image_bswl": "<bswl bus writer image>",
    "namespace": "background-persistence",
    "service_account_name": "background-persistence-sa",
    "secret_client": "AWS",
    "bigquery_parquet_upload_subscription_id": "offline-store-bulk-insert-bus-1",
    "bigquery_streaming_write_subscription_id": "offline-store-streaming-insert-bus-1",
    "bigquery_streaming_write_topic": "offline-store-streaming-insert-bus-1",
    "bigquery_upload_bucket": "s3://<your data bucket>",
    "bigquery_upload_topic": "offline-store-bulk-insert-bus-1",
    "metrics_bus_subscription_id": "metrics-bus-1",
    "metrics_bus_topic_id": "metrics-bus-1",
    "result_bus_metrics_subscription_id": "result-bus-1",
    "result_bus_offline_store_subscription_id": "result-bus-1",
    "result_bus_online_store_subscription_id": "result-bus-1",
    "kafka_dlq_topic": "dlq-1",
    "operation_subscription_id": "operation-bus-1"
  },
  "api_server_host": "<your api server here>",
  "kafka_sasl_secret": "<your aws kafka auth secret here>",
  "kafka_bootstrap_servers": "<bootstrap server1>:<port>, <bootstrap server2>:<port>, ...",
  "kafka_security_protocol": "SASL_SSL",
  "kafka_sasl_mechanism": "SCRAM-SHA-512",
  "redis_is_clustered": "1",
  "snowflake_storage_integration_name": "<snowflake integration name>",
  "metadata_provider": "GRPC_SERVER",
  "writers": [
    {
      "name": "go-metrics-bus-writer",
      "bus_subscriber_type": "GO_METRICS_BUS_WRITER",
      "request": {
        "cpu": "200m",
        "memory": "512Mi"
      },
      "limit": {
        "cpu": "1",
        "memory": "512Mi"
      },
      "version": "1.0",
      "default_replica_count": 1
    },
    {
      "name": "go-result-bus-metrics-writer",
      "bus_subscriber_type": "GO_RESULT_BUS_METRICS_WRITER",
      "request": {
        "cpu": "400m",
        "memory": "1024Mi"
      },
      "limit": {
        "cpu": "1",
        "memory": "1024Mi"
      },
      "version": "1.0",
      "default_replica_count": 1
    }
  ]
}
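Before pasting a configuration like this into the UI, it may help to sanity-check it locally. The following is a minimal sketch, not Chalk's own validation: the function name `check_config` and the specific checks are assumptions chosen for illustration, and the field layout simply follows the example above (Kafka settings at the top level, `bus_backend` under `common_persistence_specs`).

```python
# Kafka-specific fields that appear in the example configuration.
KAFKA_FIELDS = (
    "kafka_sasl_secret",
    "kafka_bootstrap_servers",
    "kafka_security_protocol",
    "kafka_sasl_mechanism",
)


def check_config(config: dict) -> list:
    """Return a list of human-readable problems found in a parsed config.

    Hypothetical pre-flight check; the rules here are assumptions,
    not requirements documented by Chalk.
    """
    problems = []

    # Assumed rule: a Kafka backend needs every Kafka field to be non-empty.
    common = config.get("common_persistence_specs", {})
    if common.get("bus_backend") == "KAFKA":
        for field in KAFKA_FIELDS:
            if not config.get(field):
                problems.append(f"missing Kafka field: {field}")

    # Assumed rule: each writer sets cpu and memory for request and limit.
    for writer in config.get("writers", []):
        name = writer.get("name")
        for section in ("request", "limit"):
            resources = writer.get(section, {})
            for key in ("cpu", "memory"):
                if key not in resources:
                    problems.append(f"writer {name!r}: {section}.{key} is not set")

    return problems
```

The function takes the configuration as a dictionary (for example, the result of `json.loads` on the text above) and returns an empty list when none of the assumed checks fail.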