Deploy Chroma to GCP (Production Grade)
Jul 7, 2023
Simple GCP Deployment
⚠️ Chroma and its underlying database need at least 2 GB of RAM, which means it won't fit on the f1-micro instances provided as part of the GCP Free Tier. This template uses a g1-small Compute Engine instance, which costs about two cents an hour, or $15 for a full month. If you follow these instructions, GCP will bill you accordingly.
⚠️ This basic stack doesn't support any kind of authentication; anyone who knows your server IP will be able to add and query for embeddings. To secure this endpoint, you'll need to put it behind GCP Cloud Endpoints or add your own authenticating proxy.
⚠️ By default, this template saves all data on a single volume. When you delete or replace that volume, the data is lost. For serious production use (with high availability, backups, etc.), please read and understand the Deployment Manager template and use it as a basis for what you need, or reach out to the Chroma team for assistance.
Step 1: Get a GCP Account
You will need a Google Cloud Platform Account. You can use one you already have, or create a new one.
Step 2: Get credentials
For this example, we will be using the GCP command-line interface (gcloud). There are several ways to configure gcloud, but for the purposes of these examples, we will presume that you have created a Service Account and downloaded its JSON key file.
To authenticate using the JSON key file, run:
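A minimal sketch of the authentication step. The key file path is a placeholder, not a real location; substitute the path where you saved your Service Account's JSON key.

```shell
# Placeholder path (an assumption for illustration): replace with your own key file.
KEY_FILE="/path/to/your/keyfile.json"

# Activate the Service Account so later gcloud commands run with its credentials.
gcloud auth activate-service-account --key-file="$KEY_FILE"
```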
You can also set a default region for gcloud using the `gcloud config set compute/region` command:
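For example, to make `us-central1` the default region (the region name is only an illustration, chosen to match the template's default zone of `us-central1-a`):

```shell
# Example region (an assumption): pick whichever region suits your deployment.
REGION="us-central1"

# Store the region in the active gcloud configuration as the compute default.
gcloud config set compute/region "$REGION"
```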
Step 3: Deploy with Deployment Manager
Chroma publishes Deployment Manager templates for each release. To launch the template, save it as a .yaml file, then run the following command:
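The launch step can be sketched as follows. The config path is a placeholder for wherever you saved the template, and `my-chroma-stack` is the deployment name.

```shell
# Placeholder path (an assumption): the .yaml file you saved the template to.
CONFIG_FILE="/path/to/your/chroma-template.yaml"

# Create a deployment named my-chroma-stack from the saved template.
gcloud deployment-manager deployments create my-chroma-stack --config "$CONFIG_FILE"
```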
Replace `my-chroma-stack` with a different deployment name, if you wish.
Wait a few minutes for the server to boot up, and Chroma will be available! You can get the public IP address of your new Chroma server using the GCP console, or using the following command:
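A sketch of the lookup, assuming gcloud is already pointed at the project that contains the deployment:

```shell
# List the Compute Engine instances in the current project.
# The server's public address appears in the EXTERNAL_IP column.
gcloud compute instances list
```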
Step 4: Customize the Deployment (optional)
The Deployment Manager template allows you to pass particular key/value pairs to override aspects of the deployment. Available keys are:
- machineType - the GCP machine type to run (default: g1-small)
- zone - the GCP zone to run in (default: us-central1-a)
To set these parameters, modify the .yaml file before running the `gcloud deployment-manager deployments create` command.
Step 5: Configure the Chroma Library
To use your deployment from the Chroma client library, configure the client with the server's IP address and port 8000. You can do this in two ways:
Using Environment Variables
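A sketch of the environment-variable route. The variable names are an assumption: they mirror the `chroma_api_impl`, `chroma_server_host`, and `chroma_server_http_port` client settings in Chroma releases from around the time of writing, so check them against the version you have installed. The IP address is a placeholder from a documentation-only range.

```shell
# Placeholder address (an assumption): replace with your server's public IP.
export CHROMA_SERVER_HOST="203.0.113.10"
# Port the Chroma server listens on.
export CHROMA_SERVER_HTTP_PORT=8000
# Tell the client to talk to a remote server over REST instead of running locally.
export CHROMA_API_IMPL=rest
```

Export these in the shell that starts your application so the client library can read them at startup.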
In Code
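A sketch of the in-code route, wrapped in a shell here-document so it can be pasted into the same terminal as the other commands. It assumes the `chromadb` Python package is installed and that its `Settings` class accepts `chroma_api_impl`, `chroma_server_host`, and `chroma_server_http_port`, as in Chroma releases from around the time of writing; newer releases may offer a different constructor. The IP address is a placeholder.

```shell
python3 - <<'EOF'
import chromadb
from chromadb.config import Settings

# Placeholder address (an assumption): replace with your server's public IP.
client = chromadb.Client(Settings(
    chroma_api_impl="rest",
    chroma_server_host="203.0.113.10",
    chroma_server_http_port="8000",
))

# heartbeat() makes a round trip to the server, confirming the connection works.
print(client.heartbeat())
EOF
```

In an application, the Python between the `EOF` markers would go directly into your own module.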
Step 6: Clean Up (optional)
To destroy the deployment and remove all GCP resources, use the `gcloud deployment-manager deployments delete` command.
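For example, assuming the deployment was created under the name `my-chroma-stack`:

```shell
# Name used when the deployment was created; change it if you chose a different one.
DEPLOYMENT_NAME="my-chroma-stack"

# Delete the deployment and the resources it created.
gcloud deployment-manager deployments delete "$DEPLOYMENT_NAME"
```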
⚠️ This will destroy all the data in your Chroma database, unless you've taken a snapshot or otherwise backed it up.
Troubleshooting
If you encounter any errors during deployment, check your Deployment Manager logs for specific error messages.
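Two gcloud commands that may help as a starting point, sketched under the assumption that the deployment is named `my-chroma-stack`:

```shell
# Name used when the deployment was created (an assumption; adjust as needed).
DEPLOYMENT_NAME="my-chroma-stack"

# Show the deployment's current state and the resources it manages.
gcloud deployment-manager deployments describe "$DEPLOYMENT_NAME"

# List recent Deployment Manager operations in the project with their status.
gcloud deployment-manager operations list
```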