In this tutorial, we walk through the steps to use the enVector SDK for Encrypted Retrieval-Augmented Generation (Encrypted RAG) based on fully homomorphic encryption (FHE).
Import SDK
First, install and import the pyenvector package to use the enVector Python APIs. Before installing, make sure Python 3 is available on your system and that you are working inside a virtual environment.
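A minimal setup might look like the following. This sketch assumes the package is distributed under the name pyenvector; the ev alias is also an assumption, chosen to match the ev.init(...) call used below.

# Create and activate a virtual environment, then install the SDK
# (assuming the package is distributed as "pyenvector"):
#   python3 -m venv .venv
#   source .venv/bin/activate
#   pip install pyenvector

import pyenvector as ev  # assumed alias, consistent with ev.init(...) below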
To use the enVector service, an initialization step is required. Initialization establishes a connection to the enVector server and configures the cryptographic settings necessary for encrypted vector search.
ev.init(
    address="localhost:50050",
    # access_token="...",  # if needed
    key_path="./keys",
    key_id="rag_key_id",
)
<pyenvector.client.client.EnvectorClient at 0x7ff4fdd8a7b0>
Prepare Data
Prepare Plaintext Vectors
To perform RAG, we first prepare plaintext text-embedding vectors.
Note that these vectors should be normalized, since cosine similarity is used as the similarity metric. The code below is just one example of text embedding, using fastembed with a sentence-transformers model; you can also use your own embedding model to generate vectors from your text dataset.

from typing import List, Union
from fastembed import TextEmbedding
import numpy as np

# 1. Load a pretrained text embedding model
model = TextEmbedding("sentence-transformers/all-MiniLM-L6-v2")
dim = model.embedding_size

# 2. Calculate embeddings by calling model.embed()
def get_embedding(texts: Union[str, List[str]]) -> np.ndarray:
    BATCH_SIZE = 128
    if isinstance(texts, str):
        texts = [texts]
    embeddings = np.empty((0, model.embedding_size))
    for i in range(0, len(texts), BATCH_SIZE):
        batch_texts = texts[i : i + BATCH_SIZE]
        batch_embeddings = list(model.embed(batch_texts))
        embeddings = np.vstack([embeddings, batch_embeddings])
    embeddings = np.squeeze(embeddings)
    return embeddings

# Prepare vectors to be indexed
texts = [
    "The capital of USA is Washington, D.C.",
    "The capital of South Korea is Seoul.",
    "The capital of France is Paris.",
    "The capital of Germany is Berlin.",
    "The capital of Italy is Rome.",
    "The capital of Canada is Ottawa.",
]

# Get embeddings
vectors = get_embedding(texts)
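If you are not sure whether your embedding model already returns unit-length vectors, you can normalize them yourself with NumPy:

# Enforce unit norms so cosine similarity is well-defined;
# this is a no-op if the embeddings are already normalized.
vectors = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)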
Create Index
For encrypted similarity search, we first prepare a vector index, called Index, to store encrypted vectors and their metadata in the enVector system.
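As a rough sketch only: the function name ev.create_index and its parameters below are assumptions for illustration, not confirmed pyenvector API, so consult the SDK reference for the actual signature.

# Hypothetical sketch of index creation; names are assumed, not confirmed API.
index = ev.create_index(
    index_name="capitals",  # assumed parameter
    dim=dim,                # embedding dimension from the model above
    metric="cosine",        # assumed parameter; matches the normalized vectors
)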
Once the index is ready, you can encrypt and insert data into it. This step first encrypts the vectors using the generated encryption keys and then inserts them into the created index. The inserted data can take the form of vectors plus associated metadata that provides additional context for RAG.
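Continuing the sketch, encryption and insertion might look like the following; index.insert and its parameters are likewise assumed names rather than confirmed pyenvector API.

# Hypothetical sketch: vectors are encrypted client-side, then uploaded
# together with per-item metadata (here, the source text).
index.insert(
    vectors=vectors,                        # assumed parameter
    metadata=[{"text": t} for t in texts],  # assumed parameter
)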
Perform Encrypted Search
Let's perform an encrypted similarity search for encrypted RAG.
Once the encrypted vector index and an encrypted query vector are ready, we can perform a similarity search on the encrypted data without decrypting it. The enVector server computes similarity scores homomorphically and returns them in encrypted form; the client, whose index object holds the decryption key, decrypts these scores to obtain the top-k relevant results along with their indices. After identifying the indices through decryption and top-k selection, we retrieve the corresponding encrypted documents and decrypt them to obtain the plaintext.
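To make the flow concrete, here is a sketch under the same caveat: index.search, top_k, and the result fields are assumed names for illustration, not confirmed pyenvector API.

# Hypothetical sketch of the encrypted search flow: the query is embedded
# and encrypted, the server returns encrypted scores, and the client
# decrypts them to select the top-k matches and recover their documents.
query_vector = get_embedding("What is the capital of France?")

results = index.search(query_vector, top_k=2)     # assumed method and parameter
for hit in results:
    print(hit["score"], hit["metadata"]["text"])  # assumed result fields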