🔎ANN (IVF_FLAT)

Approximate Nearest Neighbor (ANN) indexing accelerates similarity search by trading a small amount of recall for large performance gains. In enVector, you enable ANN by creating an index with index_params.index_type = "ivf_flat". Internally, IVF (Inverted File) partitions the vector space into nlist clusters, each represented by a coarse centroid, and at query time scans only the nprobe clusters whose centroids are closest to the query.
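The partition-and-probe idea can be sketched in a few lines of NumPy. This illustrates IVF in general, not enVector's internal implementation: the function names build_inverted_lists and ivf_search are hypothetical, inner product is assumed as the score, and the centroids are simply sampled from the data.

```python
import numpy as np

def build_inverted_lists(vectors, centroids):
    """Assign each vector to its best-scoring centroid (inner product)."""
    assignments = np.argmax(vectors @ centroids.T, axis=1)
    return [np.flatnonzero(assignments == c) for c in range(len(centroids))]

def ivf_search(query, vectors, centroids, lists, nprobe, k):
    """Scan only the nprobe lists whose centroids score highest for the query."""
    probe = np.argsort(-(centroids @ query))[:nprobe]
    candidates = np.concatenate([lists[c] for c in probe])
    if candidates.size == 0:
        return candidates
    scores = vectors[candidates] @ query
    return candidates[np.argsort(-scores)[:k]]

# Toy data: 1000 L2-normalized vectors, 8 centroids sampled from the data.
rng = np.random.default_rng(0)
vectors = rng.standard_normal((1000, 16))
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)
centroids = vectors[rng.choice(len(vectors), size=8, replace=False)]
lists = build_inverted_lists(vectors, centroids)

query = vectors[0]
approx = ivf_search(query, vectors, centroids, lists, nprobe=2, k=5)  # scans 2 of 8 lists
exact = ivf_search(query, vectors, centroids, lists, nprobe=8, k=5)   # scans every list
```

With nprobe equal to nlist, every list is scanned and the result matches an exact search; smaller nprobe values scan fewer candidates and may miss some true neighbors.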


When to use

  • Large datasets where exact scan is too slow.

  • Latency-sensitive applications needing fast top-k results.

  • Workloads that can trade some recall for speed, with the balance tuned via nprobe.


Key parameters

  • index_type (str): Set to "ivf_flat" to enable ANN.

  • nlist (int): Number of coarse clusters (lists). Larger values → finer partitioning but larger index and build cost.

  • default_nprobe (int): Default number of clusters to scan during search. Larger values → higher recall, higher latency.

  • centroids (optional): Precomputed nlist centroids. Accepted types:

    • 2D NumPy ndarray with shape (nlist, dim)

    • list[np.ndarray]

    • list[list[float]]

    If omitted, the client generates random centroids and sends them to the server.

Notes:

  • Dimensions must match the index dim (e.g., 32–4096). L2 normalization of vectors is recommended for stable Inner Product scoring (cosine equivalence).

  • If you provide centroids, ensure len(centroids) == nlist and that nlist ≤ number_of_vectors used to fit centroids.
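A minimal sketch of checking these constraints before creating the index. The helper check_centroids is hypothetical and not part of the enVector client; it only relies on NumPy converting each of the three accepted input types to one 2D array.

```python
import numpy as np

def check_centroids(centroids, nlist, dim):
    """Convert any accepted centroid type to a (nlist, dim) float32 array."""
    arr = np.asarray(centroids, dtype=np.float32)
    if arr.shape != (nlist, dim):
        raise ValueError(
            f"expected centroids with shape ({nlist}, {dim}), got {arr.shape}"
        )
    return arr

# The three accepted input types all normalize to the same array shape.
as_ndarray = np.zeros((4, 32))
as_list_of_arrays = [np.zeros(32) for _ in range(4)]
as_nested_lists = [[0.0] * 32 for _ in range(4)]

checked = [
    check_centroids(c, nlist=4, dim=32)
    for c in (as_ndarray, as_list_of_arrays, as_nested_lists)
]
```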


Providing centroids fitted on your data (e.g., KMeans) typically yields better recall/latency trade-offs than random centroids.
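As a sketch of that workflow, the following fits centroids with scikit-learn's KMeans. The data here is random and stands in for your own vectors; the helper fit_centroids is hypothetical, and the index_params dictionary is an assumed layout built from the parameter names on this page, not a confirmed client call signature.

```python
import numpy as np
from sklearn.cluster import KMeans

def fit_centroids(vectors, nlist, seed=0):
    """Fit nlist coarse centroids on the given vectors with KMeans."""
    if nlist > len(vectors):
        raise ValueError("nlist must not exceed the number of vectors")
    kmeans = KMeans(n_clusters=nlist, n_init=10, random_state=seed).fit(vectors)
    return kmeans.cluster_centers_.astype(np.float32)

# Placeholder data: 2000 L2-normalized vectors of dimension 32.
rng = np.random.default_rng(0)
vectors = rng.standard_normal((2000, 32)).astype(np.float32)
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)

nlist = 16
centroids = fit_centroids(vectors, nlist)

# Assumed layout; check the enVector client reference for the exact form.
index_params = {
    "index_type": "ivf_flat",
    "nlist": nlist,
    "default_nprobe": 4,
    "centroids": centroids,  # 2D ndarray with shape (nlist, dim)
}
```

Fit on a sample that is representative of the vectors you will actually index; centroids fitted on unrelated data may partition the space poorly.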


Client-generated random centroids (quick start)

For quick experiments, you may skip KMeans and let the client initialize random centroids and pass them to the server. This reduces setup time but may underperform compared to data-fitted centroids.
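If you want to reproduce this behaviour yourself, for example to compare random and fitted centroids on the same data, random centroids can be drawn as follows. This is only an illustration: how the enVector client actually samples its centroids is not specified here, and the Gaussian draw plus L2 normalization in the hypothetical helper random_centroids is an assumption.

```python
import numpy as np

def random_centroids(nlist, dim, seed=None):
    """Draw nlist random unit-length centroids of dimension dim."""
    rng = np.random.default_rng(seed)
    centroids = rng.standard_normal((nlist, dim)).astype(np.float32)
    # Normalize each row so centroids are comparable under inner product.
    centroids /= np.linalg.norm(centroids, axis=1, keepdims=True)
    return centroids

centroids = random_centroids(nlist=64, dim=128, seed=0)
```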


Tuning tips

  • Choose nlist to reflect dataset size (common heuristic: √N, then validate).

  • Increase nprobe to improve recall; decrease it to improve latency. If the client supports it, override default_nprobe per search.

  • Keep vectors L2-normalized if you use Inner Product scoring (cosine-equivalent with L2-normalized inputs).

  • Ensure nlist ≤ number_of_vectors when fitting centroids with KMeans.
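The √N heuristic and the normalization tip can be sketched as below. Both helpers, suggest_nlist and l2_normalize, are hypothetical; treat the suggested nlist as a starting point to validate against your own recall and latency measurements.

```python
import math
import numpy as np

def suggest_nlist(num_vectors):
    """Starting value for nlist: about sqrt(N), never more than N."""
    if num_vectors < 1:
        raise ValueError("num_vectors must be positive")
    return max(1, min(num_vectors, round(math.sqrt(num_vectors))))

def l2_normalize(x, eps=1e-12):
    """Scale vectors to unit L2 norm along the last axis."""
    x = np.asarray(x, dtype=np.float32)
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)

nlist = suggest_nlist(1_000_000)

# After L2 normalization, inner product equals cosine similarity.
a = np.array([3.0, 4.0])
b = np.array([1.0, 0.0])
cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
inner_product = float(l2_normalize(a) @ l2_normalize(b))
```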


Troubleshooting

  • Poor recall or unstable latency: Fit centroids on representative data and increase nprobe.

  • Import errors for KMeans: Install scikit-learn in your environment (pip install scikit-learn).
