Sparse Vector support in Chroma#
Chroma now supports sparse vector search, with first-class support for BM25 and SPLADE embeddings. Sparse vectors capture lexical signals like term frequency and rarity; combined with dense vector similarity, they enable high-performance hybrid search.
We've exposed these features through a powerful new Search() API.
BM25#
BM25 (BM meaning “best matching”) is a ranking function used to estimate the relevance of a document to a search query. BM25 is a “bag of words” retrieval function, which treats documents as collections of individual terms rather than sequences of text.
BM25 works by scoring documents based on a few core factors:
- Term frequency (TF): how often a query term appears in a document
- Inverse document frequency (IDF): how rare or distinctive that term is across the whole collection
- Document length normalization: prevents longer documents from getting unfairly high scores just because they contain more words
Each document in a collection is given a score; the higher the BM25 score, the more relevant the document is considered to the query. Because BM25 relies on exact term matches and weighting, it is especially effective for precise keyword search.
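To make these factors concrete, here is a minimal Python sketch of the classic Okapi BM25 scoring formula with typical `k1` and `b` parameters. It illustrates the math described above; it is not Chroma's internal implementation.

```python
import math
from collections import Counter

def bm25_scores(query_terms, documents, k1=1.5, b=0.75):
    """Score each tokenized document against the query with classic BM25."""
    doc_term_counts = [Counter(doc) for doc in documents]
    doc_lengths = [len(doc) for doc in documents]
    avg_doc_length = sum(doc_lengths) / len(documents)
    n_docs = len(documents)

    scores = []
    for counts, doc_len in zip(doc_term_counts, doc_lengths):
        score = 0.0
        for term in query_terms:
            tf = counts[term]  # term frequency: occurrences of the term in this document
            df = sum(1 for c in doc_term_counts if term in c)  # documents containing the term
            # inverse document frequency: rarer terms contribute more weight
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)
            # TF saturation with document length normalization
            score += idf * (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * doc_len / avg_doc_length))
        scores.append(score)
    return scores

docs = [
    "sparse vectors capture lexical signals".split(),
    "dense vectors capture semantic similarity".split(),
]
print(bm25_scores("sparse lexical".split(), docs))
```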
SPLADE#
SPLADE (Sparse Lexical and Expansion Model for Information Retrieval) is a retrieval method that uses transformer models to generate sparse vector representations of text. Unlike dense embeddings, SPLADE outputs vectors that are aligned with a vocabulary, similar to traditional bag-of-words models, but with learned weights that reflect semantic and contextual information.
SPLADE works by expanding and weighting the terms in a document or query. It can assign importance not only to the exact words present but also to related terms inferred by the model. The resulting vector is sparse (most values are zero), which makes it efficient to store and index while preserving strong lexical signals.
This gives SPLADE two key advantages:
- It performs like a lexical model, enabling fast, keyword-based matching.
- It leverages context from transformers, improving recall and relevance compared to purely term-frequency-based methods like BM25.
In practice, SPLADE combines the precision of keyword search with the contextual awareness of neural models.
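As a toy illustration of what a SPLADE-style representation looks like, the sketch below uses made-up term weights; a real SPLADE model produces them from a transformer's output over the full vocabulary. Relevance is simply a dot product over the few terms both vectors share, which is why the expanded query term "notebook" can match a document that never says "laptop".

```python
# Hypothetical SPLADE-style sparse vectors: term -> learned weight.
# The weights are invented for illustration only.
query_vec = {"laptop": 1.8, "notebook": 0.9, "computer": 0.6}   # expanded query
doc_vec   = {"notebook": 1.4, "battery": 1.1, "portable": 0.7}  # expanded document

def sparse_dot(q, d):
    # Only terms present in both vectors contribute, so scoring stays cheap.
    return sum(w * d[t] for t, w in q.items() if t in d)

print(sparse_dot(query_vec, doc_vec))  # 0.9 * 1.4 = 1.26
```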
Hybrid Search#
Hybrid search combines lexical and semantic retrieval methods to get the strengths of both in a single ranking function.
By combining results from both lexical and semantic retrieval, hybrid search can:
- Match exact keywords when they matter most
- Generalize to semantically related terms when exact matches aren’t present
- Balance precision and recall without maintaining separate systems
The lexical and semantic components can be weighted differently to optimize for a specific dataset.
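As a sketch of what such a weighting can look like, one common approach is to min-max normalize each retriever's scores and blend them with a tunable weight. This is illustrative only; the names `hybrid_scores` and `alpha` are assumptions for the example, not part of Chroma's Search() API.

```python
def hybrid_scores(lexical, dense, alpha=0.5):
    """Blend lexical (e.g. BM25/SPLADE) and dense similarity scores per document.

    Scores are min-max normalized so the two scales are comparable;
    alpha controls the lexical vs. semantic trade-off."""
    def normalize(scores):
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {doc_id: (s - lo) / span for doc_id, s in scores.items()}

    lex, den = normalize(lexical), normalize(dense)
    doc_ids = set(lex) | set(den)
    return {
        doc_id: alpha * lex.get(doc_id, 0.0) + (1 - alpha) * den.get(doc_id, 0.0)
        for doc_id in doc_ids
    }

# Hypothetical per-document scores from the two retrievers.
lexical = {"doc1": 12.3, "doc2": 4.1, "doc3": 0.5}
dense   = {"doc1": 0.62, "doc2": 0.88, "doc3": 0.31}
print(sorted(hybrid_scores(lexical, dense, alpha=0.4).items(), key=lambda x: -x[1]))
```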
At Chroma, we care deeply about developer experience. We've enabled sparse embeddings with no breaking changes for existing users, and there's no explicit opt-in. You can add sparse vectors to existing collections without re-indexing dense vectors.
To learn more about implementing sparse vector search and hybrid search in your application, read our docs.