March 2026
Chroma Sync: S3, GitHub, and Web
March 4, 2026 · Chroma Cloud
Chroma Sync is serverless data ingestion for Chroma Cloud. Connect your S3 buckets, GitHub repositories, and websites. Chroma handles parsing, chunking, and embedding so you can start searching in minutes. It's all available in the Chroma Cloud dashboard or via APIs.
Sync supports three data sources:
- Amazon S3: Sync files from object storage. Supports text, code, PDFs, images, and many more file types. Set up Auto-sync to automatically import files.
- GitHub: Sync code into Chroma. Uses Collection Forking to index new versions efficiently. Supports custom branding with your own GitHub app.
- Web: Scrape public websites. Uses an intelligent crawler with configurable paths and depths. Renders JavaScript.
Whether you're syncing a handful of files or millions of documents, Sync runs the same pipeline: a queue-based system with retries, rate-limit awareness, and automatic error recovery.
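To make the idea of a queue with retries and rate-limit awareness concrete, here is a minimal sketch in Python. It is an illustration of the general pattern, not Chroma's implementation: `run_queue`, `RateLimited`, and every parameter name are hypothetical. Failed jobs go back on the queue with exponential backoff, a rate-limit signal is honored by waiting for the requested interval, and jobs that exhaust their attempts are set aside instead of blocking the rest.

```python
import collections
import time


class RateLimited(Exception):
    """Hypothetical signal that a source asked the worker to slow down."""

    def __init__(self, retry_after):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


def run_queue(jobs, handler, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Process jobs in FIFO order, requeueing failures until max_attempts."""
    queue = collections.deque((job, 1) for job in jobs)
    done, failed = [], []
    while queue:
        job, attempt = queue.popleft()
        try:
            handler(job)
            done.append(job)
        except Exception as exc:
            if attempt >= max_attempts:
                # Give up on this job but keep draining the queue.
                failed.append(job)
                continue
            if isinstance(exc, RateLimited):
                # Rate-limit awareness: wait as long as the source asked.
                delay = exc.retry_after
            else:
                # Exponential backoff: base_delay, 2x, 4x, ...
                delay = base_delay * 2 ** (attempt - 1)
            sleep(delay)
            queue.append((job, attempt + 1))
    return done, failed
```

Calling `run_queue(paths, ingest_file)` with a hypothetical `ingest_file` handler returns the jobs that succeeded and the ones that ran out of attempts, so a caller can report or reschedule the latter.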
Pricing is simple and usage-based, just like the rest of Chroma Cloud, starting at $0.04 / GiB processed.
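As a rough worked example of usage-based pricing, the sketch below multiplies the volume processed by the $0.04 / GiB starting rate quoted above. The function name is hypothetical, and the result is only a lower-bound estimate: it assumes the starting rate applies to the whole volume and that you already know how many bytes will be processed.

```python
GIB = 1024 ** 3  # bytes in one GiB


def estimate_sync_cost(bytes_processed, usd_per_gib=0.04):
    """Estimate cost in USD at a flat per-GiB rate (assumed starting rate)."""
    return bytes_processed / GIB * usd_per_gib


# 250 GiB at the starting rate: 250 * 0.04
print(f"${estimate_sync_cost(250 * GIB):.2f}")  # prints "$10.00"
```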
Learn more about Sync.