Chroma raises $18M seed round

building the AI-native open-source embedding database

Today is an exciting day. Chroma has raised an $18M seed round led by Astasia Myers from Quiet Capital. Joining the round are angels including Naval Ravikant, Max and Jack Altman, Jordan Tigani (MotherDuck), Guillermo Rauch (Vercel), Akshay Kothari (Notion), Amjad Masad (Replit), Spencer Kimball (CockroachDB), and other founders and leaders from ScienceIO, Gumroad, MongoDB, Scale, Hugging Face, Jasper and more.

We are also announcing our pre-seed round from May 2022, led by Anthony Goldbloom (Kaggle) from AIX Ventures, James Cham from Bloomberg Beta, and Nat Friedman and Daniel Gross (AI Grant).

Why now?

Over the past few months, we have seen the massive rise of developers adopting the tools of generative AI. We believe this represents a fundamentally new stack in computing that will make intelligence too cheap to measure and permeate every product and facet of our lives.

That new stack is:

  • LLM application logic: LangChain, LlamaIndex - enable developers to write business logic around their use case
  • LLM/Embedding providers: OpenAI, Anthropic, Cohere, and the OSS community (e.g. Llama) - the raw CPU/horsepower
  • Embedding Databases: Chroma - enables LLM applications to have long-term memory

What is Chroma?

Chroma is the AI-native open-source embedding database. Using embeddings, Chroma lets developers add state and memory to their AI-enabled applications.

Developers use Chroma to give LLMs pluggable knowledge about their data, facts, and tools, and to prevent hallucinations. Many developers have said they want "ChatGPT but for my data" - and Chroma provides the "for my data" bridge through embedding-based document retrieval.

Chroma comes 'batteries included' with everything developers need to store, embed, and query data. Powerful features like filtering are built in, and more features like automatic clustering and query relevance are coming soon.
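
To make the store, embed, and query loop concrete, here is a minimal sketch in plain Python. It is not Chroma's implementation or API: `ToyCollection`, `embed`, and `cosine` are hypothetical names made up for illustration, and the "embedding" is just a bag-of-words count vector standing in for a real embedding model. It only shows the general shape of embedding-based retrieval with a metadata filter.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): a bag-of-words count vector, not a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyCollection:
    """Hypothetical in-memory store; illustrates the idea, not Chroma itself."""

    def __init__(self):
        self._records = []  # each record: (id, document, metadata, embedding)

    def add(self, ids, documents, metadatas):
        # Store: embed each document once, at insert time.
        for doc_id, doc, meta in zip(ids, documents, metadatas):
            self._records.append((doc_id, doc, meta, embed(doc)))

    def query(self, query_text, n_results=1, where=None):
        # Filter: keep only records whose metadata matches every key in `where`.
        where = where or {}
        candidates = [
            r for r in self._records
            if all(r[2].get(k) == v for k, v in where.items())
        ]
        # Query: rank the remaining records by similarity to the query embedding.
        q = embed(query_text)
        ranked = sorted(candidates, key=lambda r: cosine(q, r[3]), reverse=True)
        return [(r[0], r[1]) for r in ranked[:n_results]]


collection = ToyCollection()
collection.add(
    ids=["a", "b", "c"],
    documents=[
        "Refunds are processed within five business days.",
        "The API rate limit is 100 requests per minute.",
        "Refunds for annual plans are prorated.",
    ],
    metadatas=[{"source": "faq"}, {"source": "docs"}, {"source": "billing"}],
)

# Retrieve the most similar document, restricted to the "faq" source.
hits = collection.query("How long do refunds take?", n_results=1, where={"source": "faq"})
```

In a "ChatGPT but for my data" application, the retrieved documents would then be placed into the LLM's prompt as context. A real embedding model captures meaning rather than word overlap, which is what makes the retrieval useful beyond keyword search.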

It's been amazing to see all the ways that developers have picked up Chroma in the 5 weeks since launch, crossing 35k Python downloads in the past month.

“Chroma’s vector search is as easy as getting started with SQLite - easy to start, open source, scales as you need. For Prefect, I recommend it to our customers for the same reason - they can start flexibly on their own terms as they figure out what they need, and then have the confidence of commercial support down the road.”

  • Jeremiah Lowin, CEO and cofounder of Prefect, the popular open-source data workflow orchestration software

Why did we build Chroma?

Chroma was founded on the principle that models can be understood through interpretability of their latent spaces. While we were experimenting with that, we needed an open-source vector database that was powerful and easy to use. We evaluated the existing products, but found they were difficult to use and fundamentally built for a different use case (web-scale semantic search). We built Chroma for ourselves, because it was the product we needed and wanted.

Anton's and my deep interpretability experience carries through to today; we believe you can’t just hand app developers a ‘vector database’ - you have to support the entire lifecycle from experimentation to scaling.

Why open source?

We are committed to building open source software because we believe in the flourishing of humanity that will be unlocked through the democratization of robust, safe, and aligned AI systems. These tools need to be available to a new developer just starting in ML as well as the organizations that scale ML to millions (and billions) of users. Open source is about expanding the horizon of what’s possible.

What's next?

In the short term, we are working hard on a few projects that the community has helped us prioritize:

  • New features like query relevance will help developers know whether the retrieved embeddings are relevant to their query.
  • We are working on an open source distributed system to replace the current database for client/server Chroma. This will enable us to offer a hosted product with serverless storage and retrieval that scales up with demand and down to zero. It will launch as a free technical preview, with fair pricing to follow.

We are also especially grateful to the entire Chroma community. Over the longer term, Chroma and the Chroma community will help define how this new wave of AI software will be built.

Join us

-Jeff, Anton, and the Chroma team


