I have written LangChain code that uses Chroma DB to vector-store data from a website URL. It currently fetches the data from the URL, stores it in the project folder, and uses that data to respond to a user prompt. I figured out how to make the data persist after the run, but I can't figure out how to load it back for future prompts. The goal is that when a user input is received, the program uses the OpenAI LLM to generate a response from the existing database files, rather than having to create/write those database files on every run. How can this be done?
I tried the following, which seemed like the ideal solution:
vectordb = Chroma(persist_directory=persist_directory, embedding_function=embeddings)
qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", vectorstore=vectordb)
But the from_chain_type() function doesn't take a vectorstore as an argument, so this doesn't work.
You need to define a retriever and pass that to the chain; the retriever will then use your previously persisted DB to answer queries.
vectordb = Chroma(persist_directory=persist_directory, embedding_function=embeddings)
retriever = vectordb.as_retriever()
qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=retriever)
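Putting it together, here is a minimal end-to-end sketch of the load-and-query flow. It assumes the classic langchain package layout, OpenAI embeddings, and a persist_directory (the folder name "db" is a placeholder) that already contains a persisted Chroma collection:

from langchain.chains import RetrievalQA
from langchain.embeddings import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.vectorstores import Chroma

persist_directory = "db"  # placeholder: folder where the collection was persisted
embeddings = OpenAIEmbeddings()

# Load the persisted collection from disk instead of rebuilding it
vectordb = Chroma(persist_directory=persist_directory, embedding_function=embeddings)
retriever = vectordb.as_retriever()

qa = RetrievalQA.from_chain_type(llm=OpenAI(), chain_type="stuff", retriever=retriever)
print(qa.run("What does the site say about pricing?"))  # example prompt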
All the answers I have seen are missing one crucial step: calling persist() on the DB. As a complete solution, you need to perform the following steps.
To create the DB the first time and persist it, use the lines below.
vectordb = Chroma.from_documents(data, embedding=embeddings, persist_directory = persist_directory)
vectordb.persist()
On subsequent runs, the DB can then be loaded with the line below.
vectordb = Chroma(persist_directory=persist_directory, embedding_function=embeddings)
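If you want a single script that builds the DB on the first run and loads it on every run after that, one common pattern is to check whether the persist directory already has content. This is a sketch under a few assumptions: WebBaseLoader as the URL loader, RecursiveCharacterTextSplitter for chunking, and placeholder values for the URL and directory:

import os

from langchain.document_loaders import WebBaseLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma

persist_directory = "db"     # placeholder folder
url = "https://example.com"  # placeholder source URL
embeddings = OpenAIEmbeddings()

if os.path.isdir(persist_directory) and os.listdir(persist_directory):
    # Subsequent runs: load the existing collection from disk
    vectordb = Chroma(persist_directory=persist_directory, embedding_function=embeddings)
else:
    # First run: scrape, split, embed, and persist
    docs = WebBaseLoader(url).load()
    chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)
    vectordb = Chroma.from_documents(chunks, embedding=embeddings, persist_directory=persist_directory)
    vectordb.persist()

Either branch leaves you with a vectordb you can wrap in a retriever and pass to RetrievalQA as shown above.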