We provide an integration with LangChain 🔗 to use enVector as a vector store. This allows you to leverage enVector's capabilities for storing and retrieving vector embeddings within AI applications.
## Setup

You first need to install the `langchain-envector` package:

```bash
pip install langchain-envector
```
## Initialization

The key dataclasses live in `langchain_envector.config`:

- `ConnectionConfig`: address or host/port for enVector.
- `KeyConfig`: key path, key ID, and evaluation mode for encrypted operations.
- `IndexSettings`: index name, dimension (32–4096), query encryption mode, and optional output fields and fetch parameters.
- `EnvectorConfig`: wraps the above and enables index auto-creation via `create_if_missing`.
Example usage:

```python
from langchain_envector.config import ConnectionConfig, EnvectorConfig, IndexSettings, KeyConfig
from langchain_envector.vectorstore import Envector

cfg = EnvectorConfig(
    connection=ConnectionConfig(
        address="localhost:50050",  # your enVector address
        # access_token="...",       # if needed
    ),
    key=KeyConfig(
        key_path="./keys",   # your key path
        key_id="my_key_id",  # your key ID
        eval_mode="rmp",     # evaluation mode
    ),
    index=IndexSettings(
        index_name="my_index",     # your index name
        dim=1536,                  # your embedding vector dimension
        query_encryption="plain",  # query encryption mode (plain | cipher)
    ),
)
```
To use the vector store, you also need to provide an embeddings model. The following example uses OpenAI embeddings, which require the `OPENAI_API_KEY` environment variable to be set:
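```python
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
```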
Now you can initialize the `Envector` vector store with the configuration and embeddings:
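```python
store = Envector(config=cfg, embeddings=embeddings)
```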
## Manage vector store

Once you have initialized your vector store, you can interact with it by adding items.
### Add items to vector store

Use the `add_documents` or `add_texts` method to add content:
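```python
from langchain_core.documents import Document

docs = [
    Document(
        page_content="enVector is a vector search engine that lets you search directly on encrypted data.",
        metadata={"source": "document.pdf", "page": 1, "chunk": 0},
    ),
    Document(
        page_content="LangChain is an open-source framework for developing applications powered by language models.",
        metadata={"source": "document.pdf", "page": 1, "chunk": 1},
    ),
]

store.add_documents(docs)
```

Since `Envector` extends the LangChain `VectorStore` interface, `add_texts` should accept raw strings with optional per-item metadata, e.g. `store.add_texts(["some text"], metadatas=[{"source": "notes"}])`.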
## Query vector store

Once the vector store has been populated, you can query it with a text query or a query vector.
### Similarity search

A similarity search that restricts output to the top-k results can be performed as follows:
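```python
query = "What is enVector?"  # example query

results = store.similarity_search_with_score(query, k=1)
for doc, score in results:
    print(f"* [SCORE={score:.4f}] {doc.page_content} [{doc.metadata}]")
```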
The `similarity_search` and `similarity_search_with_vector` methods are also available; the latter takes a precomputed query vector, for example from `embeddings.embed_query()`. A minimal sketch, assuming the same top-`k` keyword as above:
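```python
# Search with a raw text query
docs = store.similarity_search(query, k=2)

# Search with a precomputed query vector
query_vector = embeddings.embed_query(query)
docs = store.similarity_search_with_vector(query_vector, k=2)
```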
## Usage for Retrieval-Augmented Generation

In RAG, the query is a question to be answered by an LLM, and the LLM must answer it based on the information retrieved from the vector store. The following example shows how to set up a retrieval chain using an OpenAI LLM and the enVector vector store:
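```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

# LLM (set OPENAI_API_KEY)
llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.0,
)

# Create a prompt template
prompt = ChatPromptTemplate.from_template(
    """Answer the question using the following context:
{context}
Question: {question}
"""
)

# Create a retrieval chain
chain = (
    {
        "context": store.as_retriever(),
        "question": RunnablePassthrough(),
    }
    | prompt
    | llm
    | StrOutputParser()
)

# Invoke the chain
chain.invoke("What is enVector?")
```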
## Troubleshooting

- Connection issues: verify the enVector address and registered keys.
- Embeddings mismatch: ensure the embedding dimension equals `index.dim` when supplying vectors (see the sketch after this list).
- Unexpected raw strings: confirm that inserts used the JSON envelope.
- Key issues: check the key's metadata and make sure it is in sync with the registered key.
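A quick sanity check for the dimension mismatch case (a minimal sketch; `cfg.index.dim` assumes the `IndexSettings` field from the configuration above):

```python
vector = embeddings.embed_query("dimension check")
assert len(vector) == cfg.index.dim, (
    f"embedding dimension {len(vector)} != index dim {cfg.index.dim}"
)
```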
## API reference

The main class is `langchain_envector.vectorstore.Envector`, which extends `langchain.vectorstores.VectorStore`. It provides the following methods:

- `add_texts`: Add a list of texts with optional metadata to the vector store.
- `add_documents`: Add a list of LangChain `Document` objects to the vector store.
- `similarity_search`: Perform a similarity search for the given query and return the top-k matching documents.
- `similarity_search_with_score`: Perform a similarity search for the given query and return the top-k matching documents along with their similarity scores.
- `similarity_search_with_vector`: Perform a similarity search using the provided query vector and return the top-k matching documents.