Research
Our research spans basic and applied work in search, retrieval, agents, and context engineering.
Chroma Technical Reports

Chroma Context-1: Training a Self-Editing Search Agent
Retrieval pipelines typically operate in a single pass, which poses a problem when the information required to answer a question is spread across multiple documents or requires intermediate reasoni...

Context Rot: How Increasing Input Tokens Impacts LLM Performance
Large Language Models (LLMs) are typically presumed to process context uniformly—that is, the model should handle the 10,000th token just as reliably as the 100th. However, in practice, this assump...

Generative Benchmarking
In traditional software systems, evaluation typically relies on deterministic logic: given a fixed input, the output is known and reproducible. In contrast, AI systems produce probabilistic results...

Evaluating Chunking Strategies for Retrieval
Despite document chunking being virtually ubiquitous as a pre-processing step, little work has been done to investigate its impact on retrieval performance. This is partially due to the structure o...

Embedding Adapters
Retrieval accuracy is an important determinant of AI application performance. However, many approaches to improving retrieval accuracy require large labeled corpora, which are often not available t...
Join the Chroma research team
We are hiring talented researchers interested in the intersection of indexing, information retrieval, reinforcement learning, and model harnesses.