RAG with LangChain-enVector

This example demonstrates the complete workflow of the enVector Python SDK, showcasing its capabilities for Encrypted Retrieval-Augmented Generation (Encrypted RAG) using fully homomorphic encryption (FHE). In this example, we'll see:

  • How text data is stored and encrypted in the index for RAG

  • How the encrypted similarity search is performed with FHE

  • How the LLM (gpt-oss served locally via Ollama) leverages the retrieved context, with search results staying encrypted until client-side decryption

Prerequisites

  • enVector server reachable from this notebook environment

  • Registered key path and key ID for the target index

  • pyenvector, langchain, langchain-community, langchain-text-splitters, and sentence-transformers packages installed

  • A PDF document accessible from the working directory

# !pip install langchain-envector==0.1.3 --force-reinstall
# !pip install langchain-community --force-reinstall

Import langchain-envector

Import langchain_envector to use enVector with the LangChain framework.

import langchain_envector

First, load a sample document to search.

In this example, we use a NIST report. This report evaluates how accurate and reliable common empirical formulas are when used to predict fire behavior in various scenarios. For more details, see the NIST report page and download the PDF from the link provided there.

Load the PDF and split into chunks

We rely on LangChain community loaders and text splitters to turn the PDF pages into retrieval-friendly passages.
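A minimal sketch of this step, assuming the report was saved as `report.pdf` (a placeholder filename) and typical chunking parameters; adjust the sizes to taste:

```python
def load_and_split(pdf_path, chunk_size=1000, chunk_overlap=150):
    """Load a PDF page by page and split the pages into retrieval-friendly chunks.

    Imports are deferred so the helper can be defined without the optional
    dependencies installed.
    """
    from langchain_community.document_loaders import PyPDFLoader
    from langchain_text_splitters import RecursiveCharacterTextSplitter

    pages = PyPDFLoader(pdf_path).load()  # one Document per page, page number in metadata
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size,
        chunk_overlap=chunk_overlap,
    )
    return splitter.split_documents(pages)

# chunks = load_and_split("report.pdf")  # placeholder path
```

The overlap keeps sentences that straddle a chunk boundary retrievable from either side.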

Prepare text and metadata payloads

enVector expects parallel lists of texts and metadata dictionaries. Here we keep track of the original page number for traceability.
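As a concrete sketch (using a tiny stand-in for LangChain's Document class so the example is self-contained):

```python
from dataclasses import dataclass, field

@dataclass
class Chunk:
    """Stand-in for a LangChain Document: page_content plus a metadata dict."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def to_payloads(chunks):
    """Build the parallel lists enVector expects: texts and metadata dicts.

    The source page number is carried through for traceability.
    """
    texts = [c.page_content for c in chunks]
    metadatas = [{"page": c.metadata.get("page")} for c in chunks]
    return texts, metadatas

chunks = [
    Chunk("Fire model accuracy summary.", {"page": 3}),
    Chunk("Empirical correlation results.", {"page": 4}),
]
texts, metadatas = to_payloads(chunks)
# texts     -> ["Fire model accuracy summary.", "Empirical correlation results."]
# metadatas -> [{"page": 3}, {"page": 4}]
```

With real chunks from the splitter, the same `metadata.get("page")` lookup recovers the page index that the PDF loader records.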

Set embedding model

We'll use HuggingFace embeddings to convert our text chunks into numerical vectors that can be encrypted and searched. The embeddings model will transform each text chunk into a high-dimensional vector that captures semantic meaning. These vectors will be encrypted before being stored in the enVector index.
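A sketch of the setup; the specific model is our choice (a lightweight SentenceTransformers model producing 384-dimensional vectors), not something enVector mandates:

```python
from langchain_community.embeddings import HuggingFaceEmbeddings

# all-MiniLM-L6-v2 yields 384-dimensional vectors; any SentenceTransformers
# model can be substituted here.
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
```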

Initialize the enVector store

Configure the encrypted vector index and instantiate the LangChain-compatible store. The embedding model derives the vector dimension automatically.

Initialization includes three steps:

  1. ConnectionConfig: establishing a connection to the enVector server,

  2. IndexSettings: configuring index settings necessary for vector search, including query and metadata encryption, and

  3. KeyConfig: registering evaluation keys to enable the enVector server to perform secure operations.
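Putting the three steps together might look like the sketch below. The three config class names come from the list above, but their import paths, constructor parameters, and the store class name are illustrative assumptions; check the pyenvector and langchain-envector references for the exact signatures.

```python
# All names below except ConnectionConfig / IndexSettings / KeyConfig are
# hypothetical; consult the SDK reference for the real signatures.
from pyenvector import ConnectionConfig, IndexSettings, KeyConfig  # assumed import path
from langchain_envector import EnVector                            # hypothetical store class

# 1. Connection to the enVector server (host/port are placeholders).
conn = ConnectionConfig(host="localhost", port=8080)

# 2. Index settings, with query and metadata encryption enabled.
index_settings = IndexSettings(
    index_name="nist-fire-report",
    encrypt_query=True,
    encrypt_metadata=True,
)

# 3. Evaluation keys so the server can compute on ciphertexts
#    (the key path and key ID were registered beforehand; see Prerequisites).
key_config = KeyConfig(key_path="./keys", key_id="demo-key")

store = EnVector(
    embedding=embeddings,  # the vector dimension is derived from this model
    connection=conn,
    index_settings=index_settings,
    key_config=key_config,
)
```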

Insert chunks (batched)
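Large corpora are easier on the server when inserted in fixed-size batches. A self-contained sketch, assuming the store follows LangChain's usual `add_texts` convention (the batch size is our choice):

```python
def batched(items, batch_size):
    """Yield fixed-size slices so the whole corpus isn't sent in one request."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

def insert_chunks(store, texts, metadatas, batch_size=64):
    """Insert texts and metadata into the encrypted index, batch by batch.

    `add_texts` is the standard LangChain vector-store entry point; per the
    workflow above, vectors are encrypted before they are stored in the index.
    """
    for text_batch, meta_batch in zip(
        batched(texts, batch_size), batched(metadatas, batch_size)
    ):
        store.add_texts(texts=text_batch, metadatas=meta_batch)

# insert_chunks(store, texts, metadatas)
```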

Encrypted search on the index

Let's perform an encrypted similarity search using LangChain-enVector.

The enVector vectorstore provides a simple interface through LangChain to perform similarity search on encrypted data. Under the hood, enVector handles all the encryption, decryption, and secure search operations automatically. When we call similarity_search(), the query is encrypted, the secure similarity search is performed on the encrypted vectors, and the results are decrypted before being returned.

The store.similarity_search() method returns the top-k most relevant documents along with their similarity scores, making it easy to build secure RAG applications without having to manage encryption directly.
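A small sketch of the call, wrapped in a helper so it can be reused; the default `k` of 4 is our choice:

```python
def ask(store, question, k=4):
    """Encrypted top-k retrieval through the LangChain interface.

    The query is encrypted client-side, matched against the encrypted
    vectors on the server, and the hits are decrypted before returning.
    """
    return store.similarity_search(question, k=k)

# docs = ask(store, "How accurate are the empirical fire models?")
# for d in docs:
#     print(d.metadata.get("page"), d.page_content[:80])
```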

Generate answers with retrieval-augmented context

Once the decrypted documents are retrieved, we can use an LLM (e.g. OpenAI's GPT) to generate answers based on the retrieved documents.

In this example, we use the gpt-oss model running locally with Ollama.
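A sketch of the generation step: stuff the decrypted chunks into a grounded prompt and send it to the local model. The prompt template is our own, and `ChatOllama` is imported lazily so the helpers can be defined without langchain-community installed:

```python
def build_prompt(question, docs):
    """Assemble a prompt grounded in the retrieved (decrypted) chunks."""
    context = "\n\n".join(d.page_content for d in docs)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

def answer(question, docs, model="gpt-oss"):
    """Generate an answer with a local Ollama model (deferred import)."""
    from langchain_community.chat_models import ChatOllama

    llm = ChatOllama(model=model)
    return llm.invoke(build_prompt(question, docs)).content

# print(answer("How accurate are the empirical fire models?", docs))
```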
