Open-source search infrastructure for AI

Fast, serverless, and scalable infrastructure supporting vector, full-text, regex, and metadata search. Built on object storage and trusted by millions of developers. Open source under the Apache 2.0 license.


15M+ monthly downloads

Apache 2.0
27k GitHub stars

Low latency search

Fast queries over billions of multi-tenant indexes.

Up to 10x cheaper

Built on object storage with automatic data tiering.

No engineering ops

Scales with your data and traffic. SOC 2 Type II.

Features

Sparse vector search
Lexical search (BM25, SPLADE)

Vector search
Semantic similarity search

Full-text search
Trigram and regex search

Metadata search
Filtering and faceted search

Forking
Dataset versioning, A/B testing, and roll-outs

CLI
Command-line tools for development
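
The full-text feature above pairs trigram and regex search. One common way to combine the two is to index every three-character substring of each document, use that index to narrow the candidate set, and only then run the regex. The sketch below illustrates that general technique; it is not Chroma's implementation, and `TrigramIndex`, `trigrams`, and the `literal` argument are hypothetical names introduced for illustration.

```javascript
// Hypothetical illustration of trigram prefiltering; not Chroma's internals.

// Split text into its set of lowercase three-character substrings.
function trigrams(text) {
  const s = text.toLowerCase();
  const out = new Set();
  for (let i = 0; i + 3 <= s.length; i++) out.add(s.slice(i, i + 3));
  return out;
}

class TrigramIndex {
  constructor() {
    this.postings = new Map(); // trigram -> Set of document ids
    this.docs = new Map();     // document id -> original text
  }

  add(id, text) {
    this.docs.set(id, text);
    for (const g of trigrams(text)) {
      if (!this.postings.has(g)) this.postings.set(g, new Set());
      this.postings.get(g).add(id);
    }
  }

  // `literal` is a substring that every regex match must contain.
  // It narrows the candidates before the regex is evaluated.
  search(literal, regex) {
    let candidates = null;
    for (const g of trigrams(literal)) {
      const ids = this.postings.get(g) ?? new Set();
      candidates = candidates === null
        ? new Set(ids)
        : new Set([...candidates].filter((id) => ids.has(id)));
    }
    // Literals shorter than three characters yield no trigrams: scan all docs.
    const pool = candidates === null ? [...this.docs.keys()] : [...candidates];
    return pool.filter((id) => {
      regex.lastIndex = 0; // keep global or sticky regexes stateless between docs
      return regex.test(this.docs.get(id));
    });
  }
}

const index = new TrigramIndex();
index.add("id1", "Document about databases");
index.add("id2", "ML tutorial");
console.log(index.search("data", /data\w+/)); // [ 'id1' ]
```

The prefilter can only return a superset of the true matches, so the final regex pass is still required for correctness.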

import { Search, K, Knn } from "chromadb";

// Configure client and collection for sparse embeddings (BM25, SPLADE)

// Add documents with sparse embeddings (BM25)
await collection.add({
  ids: ["id1", "id2"],
  documents: ["Document about databases", "ML tutorial"]
})

// Query with sparse vector
const sparseRank = Knn({ query: "ML", key: "sparse_embedding" });

// Build and execute search
const search = new Search()
  .rank(sparseRank)
  .limit(10)
  .select(K.DOCUMENT, K.SCORE);

const results = await collection.search(search);

Terminal Output

$ node sparse-search.js
Connecting to Chroma...
✓ Connected successfully
Creating collection 'my_collection'...
✓ Collection created

Adding documents with sparse embeddings (BM25)...
✓ Added 2 documents

Querying with sparse vector...
✓ Query completed in 18ms

Results (ranked by BM25 score):
[
  {
    id: "id1",
    document: "Document about databases",
    score: 0.87,
    metadata: {}
  },
  {
    id: "id2",
    document: "ML tutorial",
    score: 0.45,
    metadata: {}
  }
]

Performance

Fast search over billions of multi-tenant indexes

Chroma's indexes are built and optimized for object storage, offering unparalleled cost and performance. They support state-of-the-art vector, full-text, and regex search.

Latency

Query latency (384-dimensional vectors, 100k vectors)

        Warm    Cold
p50     20ms    650ms
p90     27ms    1.2s
p99     57ms    1.5s

Technical specs

Write throughput (per collection): 30 MB/s (2000+ QPS)
Concurrent reads (per collection): 10 (200+ QPS)
Collections per database: 1M
Records per collection: 5M
Recall: 90-100%

Zero-ops infra

┌───────────────────────────────┐
│ Query Layer                   │
│   Fast memory cache (hot)     │
│   SSD cache (warm)            │
└───────────────────────────────┘

↕ Intelligent tiering

┌───────────────────────────────┐
│ Storage Layer                 │
│   S3 / GCS (cold)             │
│     • All vectors             │
│     • All metadata            │
│     • All indexes             │
└───────────────────────────────┘
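
The diagram above can be read as a read-through cache: a query checks memory first, then SSD, and falls back to object storage, promoting whatever it finds toward the hot tier. The following is a minimal sketch of that idea under simplifying assumptions, not Chroma's actual tiering logic. `TieredStore` and its methods are hypothetical, and plain `Map` objects stand in for RAM, SSD, and S3/GCS.

```javascript
// Hypothetical read-through tiering sketch; Maps stand in for real storage.
class TieredStore {
  constructor(memoryCapacity) {
    this.memoryCapacity = memoryCapacity;
    this.memory = new Map();      // hot tier; insertion order doubles as LRU order
    this.ssd = new Map();         // warm tier
    this.objectStore = new Map(); // cold tier; holds all data
  }

  // Writes always land in the durable cold tier.
  put(key, value) {
    this.objectStore.set(key, value);
  }

  // Returns { value, tier }, where tier names the layer that served the read.
  get(key) {
    if (this.memory.has(key)) {
      const value = this.memory.get(key);
      this.promote(key, value); // refresh recency
      return { value, tier: "memory" };
    }
    if (this.ssd.has(key)) {
      const value = this.ssd.get(key);
      this.promote(key, value);
      return { value, tier: "ssd" };
    }
    if (this.objectStore.has(key)) {
      const value = this.objectStore.get(key);
      this.ssd.set(key, value); // cache on SSD as well as in memory
      this.promote(key, value);
      return { value, tier: "object-storage" };
    }
    return { value: undefined, tier: "miss" };
  }

  // Move an entry to the most recent slot in memory, evicting the least
  // recently used entry to SSD when the hot tier is over capacity.
  promote(key, value) {
    this.memory.delete(key);
    this.memory.set(key, value);
    if (this.memory.size > this.memoryCapacity) {
      const [oldKey, oldValue] = this.memory.entries().next().value;
      this.memory.delete(oldKey);
      this.ssd.set(oldKey, oldValue);
    }
  }
}

const store = new TieredStore(1);
store.put("a", "vector-a");
store.put("b", "vector-b");
console.log(store.get("a").tier); // object-storage (cold read)
console.log(store.get("a").tier); // memory (warm read)
```

The first read of a key pays the cold-tier cost and later reads are served from cache, which is the same warm/cold distinction the latency figures on this page draw.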

Unlike legacy search systems, Chroma is a database you'll want to be on-call for.

✓ Auto-scales with usage

✓ No manual tuning

✓ Serverless pricing

Chroma takes full advantage of object storage with automatic query-aware data tiering and caching.

✓ Vectors are large: 1GB text → 15GB of vectors

✓ Memory is expensive: $5/GB/mo

✓ Object storage is not: $0.02/GB/mo
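
To make the three figures above concrete, the sketch below works through the arithmetic for a given amount of source text. The constants come straight from the list above. The function name `monthlyStorageCost` is a hypothetical helper, and the calculation covers raw storage price only, not request or compute costs.

```javascript
// Figures from the list above; real prices vary by provider and region.
const VECTOR_GB_PER_TEXT_GB = 15;
const MEMORY_USD_PER_GB_MONTH = 5;
const OBJECT_STORAGE_USD_PER_GB_MONTH = 0.02;

// Hypothetical helper: monthly cost of holding the vectors for `textGb` of text.
function monthlyStorageCost(textGb) {
  const vectorGb = textGb * VECTOR_GB_PER_TEXT_GB;
  return {
    vectorGb,
    memoryUsd: vectorGb * MEMORY_USD_PER_GB_MONTH,
    objectStorageUsd: vectorGb * OBJECT_STORAGE_USD_PER_GB_MONTH,
  };
}

const cost = monthlyStorageCost(1);
console.log(`${cost.vectorGb} GB of vectors`);
console.log(`memory: $${cost.memoryUsd.toFixed(2)}/mo`);
console.log(`object storage: $${cost.objectStorageUsd.toFixed(2)}/mo`);
```

For 1 GB of text this works out to 15 GB of vectors, about $75 per month in memory versus about $0.30 per month in object storage, a 250x difference in raw storage price.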

Enterprise

Chroma provides the security, compliance, education, and operational model that enterprises need, built on our Apache 2.0 architecture.

BYOC in your VPC, multi-cloud/multi-region replication, and point-in-time recovery ensure a resilient and scalable search system with the same 0-ops story as Cloud.

 ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓
 ▓░                                         ░▓
 ▓░  ┌──────────── YOUR VPC ─────────────┐  ░▓
 ▓░  │                                   │  ░▓
 ▓░  │   █ DATA PLANE █                  │  ░▓
 ▓░  │                                   │  ░▓
 ▓░  │   Your data, your cloud           │  ░▓
 ▓░  │                                   │  ░▓
 ▓░  │                                   │  ░▓
 ▓░  └───────────────────────────────────┘  ░▓
 ▓░                    │                    ░▓
 ▓░                    │                    ░▓
 ▓░                    ▼                    ░▓
 ▓░  ═════════════════════════════════════  ░▓
 ▓░  ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░  ░▓
 ▓░                                         ░▓
 ▓░  ┌────────── CHROMA VPC ─────────────┐  ░▓
 ▓░  │                                   │  ░▓
 ▓░  │   █ CONTROL PLANE █               │  ░▓
 ▓░  │                                   │  ░▓
 ▓░  │   Managed by Chroma               │  ░▓
 ▓░  │   Monitoring, backups, ops        │  ░▓
 ▓░  │                                   │  ░▓
 ▓░  └───────────────────────────────────┘  ░▓
 ▓░                                         ░▓
 ▓░  ✓ BYOC in your VPC                     ░▓
 ▓░  ✓ Multi-region replication             ░▓
 ▓░  ✓ 0-ops management                     ░▓
 ▓░                                         ░▓
 ▓░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░▓
 ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒

Videos

Deep dive: Using Reranking to improve search results