Deploy Chroma to GCP (Production Grade)

Jul 7, 2023


Simple GCP Deployment

⚠️ Chroma and its underlying database need at least 2 GB of RAM, which means it won't fit on the f1-micro instances provided as part of the GCP Free Tier. This template uses a g1-small Compute Engine instance, which costs about two cents an hour, or roughly $15 for a full month. If you follow these instructions, GCP will bill you accordingly.

⚠️ This basic stack doesn't support any kind of authentication; anyone who knows your server IP will be able to add and query for embeddings. To secure this endpoint, you'll need to put it behind GCP Cloud Endpoints or add your own authenticating proxy.

⚠️ By default, this template saves all data on a single volume. If you delete or replace that volume, the data is lost. For serious production use (with high availability, backups, etc.), please read and understand the Deployment Manager template and use it as a basis for what you need, or reach out to the Chroma team for assistance.

Step 1: Get a GCP Account

You will need a Google Cloud Platform Account. You can use one you already have, or create a new one.

Step 2: Get credentials

For this example, we will use the GCP command-line interface (gcloud). There are several ways to configure gcloud, but these examples assume that you have created a Service Account and downloaded its JSON key file.

To authenticate using the JSON key file, run:

gcloud auth activate-service-account --key-file=PATH_TO_YOUR_JSON_KEY_FILE

You can also configure GCP to use a region of your choice using the `gcloud config set compute/region` command:

gcloud config set compute/region us-central1

Step 3: Deploy with Deployment Manager

Chroma publishes Deployment Manager templates for each release. To launch the template, save it as a .yaml file (the command below assumes the name chroma.yaml), then run the following command:

gcloud deployment-manager deployments create my-chroma-stack --config chroma.yaml

Replace `my-chroma-stack` with a different deployment name, if you wish.

Wait a few minutes for the server to boot up, and Chroma will be available! You can get the public IP address of your new Chroma server using the GCP console, or using the following command:

gcloud compute instances describe my-chroma-instance --format='get(networkInterfaces[0].accessConfigs[0].natIP)'
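The "wait a few minutes" step can be automated by polling the server until it responds. The following is a minimal sketch using only the Python standard library. The function name `wait_for_chroma`, the timeout values, and the `/api/v1/heartbeat` path are assumptions rather than part of the template; check the heartbeat path against the Chroma release you deployed.

```python
import json
import time
import urllib.error
import urllib.request


def wait_for_chroma(base_url, path="/api/v1/heartbeat", timeout_s=300.0, interval_s=5.0):
    """Poll base_url + path until it answers with HTTP 200 or timeout_s elapses.

    Returns the decoded JSON body on success and raises TimeoutError otherwise.
    The default heartbeat path is an assumption; adjust it for your release.
    """
    deadline = time.monotonic() + timeout_s
    url = base_url.rstrip("/") + path
    last_error = None
    while True:
        try:
            with urllib.request.urlopen(url, timeout=10) as response:
                if response.status == 200:
                    return json.loads(response.read().decode("utf-8"))
                last_error = f"HTTP {response.status}"
        except (urllib.error.URLError, OSError) as exc:
            # Connection refused, timed out, or an HTTP error status:
            # the VM is probably still booting, so remember the error and retry.
            last_error = exc
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{url} not ready after {timeout_s}s: {last_error}")
        time.sleep(interval_s)
```

Calling `wait_for_chroma("http://<server IP address>:8000")` with the IP address returned by the command above blocks until the server answers, or raises `TimeoutError` after five minutes.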

Step 4: Customize the Deployment (optional)

The Deployment Manager template allows you to pass particular key/value pairs to override aspects of the deployment. Available keys are:

- machineType - the GCP machine type to run (default: g1-small)

- zone - the GCP zone to run in (default: us-central1-a)

To set these parameters, modify the .yaml file before running the `gcloud deployment-manager deployments create` command.
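If you script your deployments, the edit to the .yaml file can be automated. The sketch below rewrites `key: value` lines in the file's text. It assumes that each key (`machineType`, `zone`) appears on its own line in that form, which you should confirm against the template you downloaded. The function name `override_properties` and the sample fragment are hypothetical.

```python
import re


def override_properties(config_text, overrides):
    """Return config_text with `key: value` lines rewritten for each override.

    Assumes every key sits on its own line as `key: value`. Raises KeyError
    when a key is not found, so a typo does not pass silently.
    """
    for key, value in overrides.items():
        # [ \t]* (not \s*) keeps the match on a single line.
        pattern = re.compile(
            rf"^([ \t]*{re.escape(key)}:[ \t]*).*$", flags=re.MULTILINE
        )
        config_text, count = pattern.subn(
            lambda match, value=value: match.group(1) + value, config_text
        )
        if count == 0:
            raise KeyError(key)
    return config_text


# Hypothetical fragment; the real template may be laid out differently.
sample = "properties:\n  machineType: g1-small\n  zone: us-central1-a\n"
print(override_properties(sample, {"machineType": "e2-small", "zone": "europe-west1-b"}))
```

To apply it to a real file, read chroma.yaml into a string, pass it through `override_properties`, and write the result back before running `gcloud deployment-manager deployments create`.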

Step 5: Configure the Chroma Library

When you launch the Chroma client library to actually use Chroma, all you need to do is configure it to use the server's IP address and port 8000. You can do this in two ways:

Using Environment Variables

export CHROMA_API_IMPL=rest
export CHROMA_SERVER_HOST=<server IP address>
export CHROMA_SERVER_HTTP_PORT=8000

In Code

import chromadb

# Replace the placeholder with your server's public IP address.
chroma = chromadb.HttpClient(host="<server IP address>", port=8000)
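Once the client is configured, a short round trip confirms that the server accepts writes and answers queries. The following is a minimal sketch, not an official test. The function name `smoke_test`, the collection name, and the two-dimensional sample embeddings are made up for illustration. It passes explicit embeddings so that the check does not depend on any embedding model.

```python
def smoke_test(client, collection_name="gcp_smoke_test"):
    """Round-trip two tiny embeddings through the server; return the nearest id."""
    # heartbeat() fails if the server cannot be reached.
    client.heartbeat()
    collection = client.get_or_create_collection(name=collection_name)
    # upsert (rather than add) keeps the check safe to run more than once.
    collection.upsert(
        ids=["east", "north"],
        embeddings=[[1.0, 0.0], [0.0, 1.0]],
        documents=["points east", "points north"],
    )
    result = collection.query(query_embeddings=[[0.9, 0.1]], n_results=1)
    # query() returns one list of ids per query embedding.
    return result["ids"][0][0]


# Usage against the deployed server (requires the chromadb package):
#   import chromadb
#   client = chromadb.HttpClient(host="<server IP address>", port=8000)
#   print(smoke_test(client))
```

The query vector lies closest to the "east" embedding, so that id should come back. Delete the test collection afterwards if you don't want it to linger on the server.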

Step 6: Clean Up (optional)

To destroy the deployment and remove all GCP resources, use the `gcloud deployment-manager deployments delete` command.

gcloud deployment-manager deployments delete my-chroma-stack

⚠️ This will destroy all the data in your Chroma database, unless you've taken a snapshot or otherwise backed it up.
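One way to take the snapshot mentioned above is `gcloud compute disks snapshot`, run before you delete the deployment. The sketch below wraps that command in Python so it can sit in the same script as the rest of your tooling. The function names are hypothetical, and the disk name, zone, and snapshot name are placeholders: look up the actual disk that the template created (for example in the GCP console) before relying on it.

```python
import subprocess


def snapshot_command(disk_name, zone, snapshot_name):
    """Build the gcloud argv that snapshots one persistent disk."""
    return [
        "gcloud", "compute", "disks", "snapshot", disk_name,
        f"--zone={zone}",
        f"--snapshot-names={snapshot_name}",
    ]


def snapshot_disk(disk_name, zone, snapshot_name):
    """Run the snapshot; raises CalledProcessError if gcloud reports a failure."""
    subprocess.run(snapshot_command(disk_name, zone, snapshot_name), check=True)
```

Keeping the argument list in its own function makes it easy to print and review the exact command before anything runs.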

Troubleshooting

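If you encounter any errors during deployment, check the deployment's details in the Deployment Manager section of the GCP console, or run `gcloud deployment-manager deployments describe my-chroma-stack`, for specific error messages.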