We provide an integration with LangChain 🔗 to use enVector as a vector store. This allows you to leverage enVector's capabilities for storing and retrieving vector embeddings within AI applications.
## Setup

You first need to install the `langchain-envector` package:

```bash
pip install langchain-envector
```
## Initialization

The key dataclasses live in `langchain_envector.config`:

- `ConnectionConfig`: address or host/port for enVector.
- `KeyConfig`: key path, key ID, and evaluation mode for encrypted operations.
- `IndexSettings`: index name, dimension (32–4096), query encryption mode, and optional output fields and fetch parameters.
- `EnvectorConfig`: wraps the above and enables index auto-creation via `create_if_missing`.
Example usage:

```python
from langchain_envector.config import ConnectionConfig, EnvectorConfig, IndexSettings, KeyConfig
from langchain_envector.vectorstore import Envector

cfg = EnvectorConfig(
    connection=ConnectionConfig(
        address="localhost:50050",  # your enVector address
        # access_token="...",       # if needed
    ),
    key=KeyConfig(
        key_path="./keys",   # your key path
        key_id="my_key_id",  # your key ID
        eval_mode="rmp",     # evaluation mode
    ),
    index=IndexSettings(
        index_name="my_index",     # your index name
        dim=1536,                  # your embedding vector dimension
        query_encryption="plain",  # query encryption mode (plain | cipher)
    ),
)
```
To use the vector store, you also need to provide an embeddings model. The following example uses OpenAI embeddings, which require the `OPENAI_API_KEY` environment variable to be set:
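```python
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
```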
Now you can initialize the `Envector` vector store with the configuration and embeddings:
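```python
store = Envector(config=cfg, embeddings=embeddings)
```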
## Manage vector store

Once you have initialized your vector store, you can interact with it by adding items.
### Add items to vector store

Use the `add_documents` or `add_texts` method to add content:
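```python
from langchain_core.documents import Document

docs = [
    Document(
        page_content="enVector is a vector search engine that lets you search directly on encrypted data.",
        metadata={"source": "document.pdf", "page": 1, "chunk": 0},
    ),
    Document(
        page_content="LangChain is an open-source framework for developing applications powered by language models.",
        metadata={"source": "document.pdf", "page": 1, "chunk": 1},
    ),
]

store.add_documents(docs)
```

Since `Envector` extends the LangChain `VectorStore` interface, `add_texts` should accept raw strings with optional per-item metadata, e.g. `store.add_texts(["some text"], metadatas=[{"source": "notes"}])`.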
## Query vector store

Once the vector store has been populated, you can query it with a text query or a query vector.
### Similarity search

A similarity search that restricts output to the top-k results can be performed as follows:
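```python
query = "What is enVector?"  # example query

results = store.similarity_search_with_score(query, k=1)
for doc, score in results:
    print(f"* [SCORE={score:.4f}] {doc.page_content} [{doc.metadata}]")
```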
The `similarity_search` and `similarity_search_with_vector` methods are also available; the latter takes a precomputed query vector, for example from `embeddings.embed_query()`. A minimal sketch, assuming the same top-`k` keyword as above:
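```python
# Search with a raw text query
docs = store.similarity_search(query, k=2)

# Search with a precomputed query vector
query_vector = embeddings.embed_query(query)
docs = store.similarity_search_with_vector(query_vector, k=2)
```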
## Usage for Retrieval-Augmented Generation

In RAG, the query is a question to be answered by an LLM, and the LLM must answer it based on the information retrieved from the vector store. The following example shows how to set up a retrieval chain using an OpenAI LLM and the enVector vector store:
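```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

# LLM (set OPENAI_API_KEY)
llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.0,
)

# Create a prompt template
prompt = ChatPromptTemplate.from_template(
    """Answer the question using the following context:
{context}
Question: {question}
"""
)

# Create a retrieval chain
chain = (
    {
        "context": store.as_retriever(),
        "question": RunnablePassthrough(),
    }
    | prompt
    | llm
    | StrOutputParser()
)

# Invoke the chain
chain.invoke("What is enVector?")
```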
## Troubleshooting

- Connection issues: verify the enVector address and registered keys.
- Embeddings mismatch: ensure the embedding dimension equals `index.dim` when supplying vectors (see the sketch after this list).
- Unexpected raw strings: confirm that inserts used the JSON envelope.
- Key issues: check the key's metadata and make sure it is in sync with the registered key.
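A quick sanity check for the dimension mismatch case (a minimal sketch; `cfg.index.dim` assumes the `IndexSettings` field from the configuration above):

```python
vector = embeddings.embed_query("dimension check")
assert len(vector) == cfg.index.dim, (
    f"embedding dimension {len(vector)} != index dim {cfg.index.dim}"
)
```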
## API reference

The main class is `langchain_envector.vectorstore.Envector`, which extends `langchain.vectorstores.VectorStore`. It provides the following methods:

- `add_texts`: Add a list of texts with optional metadata to the vector store.
- `add_documents`: Add a list of LangChain `Document` objects to the vector store.
- `similarity_search`: Perform a similarity search for the given query and return the top-k matching documents.
- `similarity_search_with_score`: Perform a similarity search for the given query and return the top-k matching documents along with their similarity scores.
- `similarity_search_with_vector`: Perform a similarity search using the provided query vector and return the top-k matching documents.