What happened?
I created a collection in Python with the following code:
chroma_client = chromadb.CloudClient(
    api_key=os.getenv("CHROMA_API_KEY"),
    tenant=os.getenv("CHROMA_TENANT"),
    database=os.getenv("CHROMA_DATABASE")
)
schema = Schema()
# Dense vector index backed by OpenAI embeddings
schema.create_index(
    config=VectorIndexConfig(
        space="cosine",
        embedding_function=OpenAIEmbeddingFunction(
            model_name="text-embedding-3-small"
        )
    )
)
# Sparse BM25 index over the document text, stored under SPARSE_KEY
schema.create_index(
    config=SparseVectorIndexConfig(
        source_key=K.DOCUMENT,
        embedding_function=ChromaBm25EmbeddingFunction(),
        bm25=True
    ),
    key=SPARSE_KEY
)
collection = chroma_client.get_or_create_collection(
    name=collection_name,
    schema=schema
)
If I then mirror the schema in TypeScript:
const chromaClient = new CloudClient({
  apiKey: process.env.CHROMA_API_KEY,
  tenant: process.env.CHROMA_TENANT,
  database: process.env.CHROMA_DATABASE,
});
const chromaSchema = new Schema();
// Dense vector index, mirroring the Python schema
chromaSchema.createIndex(
  new VectorIndexConfig({
    space: "cosine",
    embeddingFunction: new OpenAIEmbeddingFunction({
      modelName: "text-embedding-3-small",
    }),
  })
);
// Sparse BM25 index over the document text, stored under the "bm25" key
chromaSchema.createIndex(
  new SparseVectorIndexConfig({
    sourceKey: "#document",
    embeddingFunction: new ChromaBm25EmbeddingFunction(),
    bm25: true,
  }),
  'bm25'
);
const chromaCollection = await chromaClient.getOrCreateCollection({
  name: "national_parks_11",
  schema: chromaSchema,
});
When running the TypeScript file, I get:
ChromaConnectionError: Unable to connect to the chromadb server (status: 500). Please try again later.
  cause: undefined,
    at <anonymous> (/Users/tjkrusinski/Software/data/agents/mastra/node_modules/chromadb/dist/chromadb.mjs:3907:9)
    at async <anonymous> (/Users/tjkrusinski/Software/data/agents/mastra/node_modules/chromadb/dist/chromadb.mjs:334:32)
    at async getOrCreateCollection (/Users/tjkrusinski/Software/data/agents/mastra/node_modules/chromadb/dist/chromadb.mjs:4318:43)
If I remove the schema from the getOrCreateCollection, I then get the error:
error: Cannot find module '@chroma-core/default-embed' from '/Users/tjkrusinski/Software/data/agents/mastra/node_modules/chromadb/dist/chromadb.mjs'
1415 |     if (!knownEmbeddingFunctions.has(new DefaultEmbeddingFunction().name)) {
1416 |       registerEmbeddingFunction("default", DefaultEmbeddingFunction);
1417 |     }
1418 |   } catch (e) {
1419 |     console.error(e);
1420 |     throw new Error(
               ^
error: Cannot instantiate a collection with the DefaultEmbeddingFunction. Please install @chroma-core/default-embed, or provide a different embedding function
    at <anonymous> (/Users/tjkrusinski/Software/data/agents/mastra/node_modules/chromadb/dist/chromadb.mjs:1420:15)
    at async <anonymous> (/Users/tjkrusinski/Software/data/agents/mastra/node_modules/chromadb/dist/chromadb.mjs:1447:44)
    at async getOrCreateCollection (/Users/tjkrusinski/Software/data/agents/mastra/node_modules/chromadb/dist/chromadb.mjs:4313:36)
This is my search code:
const search = async (query: string): Promise<Array<{ excerpt: string }>> => {
  // Dense ranking over the default embedding key
  const denseRank = Knn({
    query,
    key: "#embedding",
    returnRank: true,
    limit: 20
  });
  // Sparse ranking over the "bm25" key defined in the schema
  const sparseRank = Knn({
    query,
    key: "bm25",
    returnRank: true,
    limit: 20
  });
  // Combine both rankings with weighted reciprocal rank fusion
  const rrf = Rrf({
    ranks: [denseRank, sparseRank],
    weights: [0.7, 0.3],
    k: 60
  });
  const search = new Search().rank(rrf);
  const results = await chromaCollection.search(search);
  console.log(results)
  return [];
};
While it may not be super common to ingest and query from separate languages, I'm stuck as to what to do to resolve the issue.
If I supply an embedding function at the getOrCreateCollection call, I don't think the chromaCollection.search() will ignore the embedding function I provided earlier.
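For context on what the search code above is asking for, here is a minimal, self-contained sketch of weighted reciprocal rank fusion using the textbook formula, score(d) = sum over rankings of weight_i / (k + rank_i(d)). The `weightedRrf` helper and the sample document ids are hypothetical and are not part of the chromadb client; Chroma's own `Rrf` may use a different sign convention or handle documents missing from one ranking differently.

```typescript
// Hypothetical helper for illustration only; not a chromadb API.
type Ranking = string[]; // document ids, best match first

function weightedRrf(
  rankings: Ranking[],
  weights: number[],
  k: number = 60
): Map<string, number> {
  if (rankings.length !== weights.length) {
    throw new Error("rankings and weights must have the same length");
  }
  const scores = new Map<string, number>();
  rankings.forEach((ranking, i) => {
    ranking.forEach((id, position) => {
      const rank = position + 1; // ranks are 1-based
      const contribution = weights[i] / (k + rank);
      // A document absent from a ranking simply gets no contribution from it.
      scores.set(id, (scores.get(id) ?? 0) + contribution);
    });
  });
  return scores;
}

// Fuse a dense and a sparse ranking with the 0.7 / 0.3 weights and k = 60
// used in the search code; the ids below are made up.
const dense: Ranking = ["a", "b", "c"];
const sparse: Ranking = ["b", "c", "a"];
const fused = Array.from(weightedRrf([dense, sparse], [0.7, 0.3], 60).entries())
  .sort((x, y) => y[1] - x[1]) // higher fused score first
  .map(([id]) => id);
console.log(fused);
```

With these weights the dense ranking dominates, so a document ranked first by the dense index can stay on top even when the sparse index ranks it last.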
Versions
Chroma 1.2.1
All latest clients.