Vector quantization

By default, SmartObserve supports the indexing and querying of vectors of type float, in which each dimension of the vector occupies 4 bytes of memory. For use cases that require ingestion at a large scale, keeping float vectors can be expensive because SmartObserve needs to construct, load, save, and search graphs (for the native faiss and nmslib [deprecated] engines). To reduce the memory footprint, you can use vector quantization.

SmartObserve supports several varieties of quantization. In general, the level of quantization presents a trade-off between the accuracy of the nearest neighbor search and the memory footprint consumed by the vector search.
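As a rough, hypothetical illustration of that trade-off, the following sketch estimates the raw vector storage for one million 768-dimensional vectors at different quantization levels. The corpus size and dimension are assumptions, and graph and other index overhead are ignored:

```python
# Back-of-the-envelope memory estimate for raw vector storage.
# Figures are illustrative only; they ignore graph and index overhead.

NUM_VECTORS = 1_000_000   # hypothetical corpus size
DIMENSION = 768           # hypothetical vector dimension

def storage_gib(bytes_per_dimension: float) -> float:
    """Raw storage for all vectors, in GiB."""
    return NUM_VECTORS * DIMENSION * bytes_per_dimension / 1024**3

print(f"float  (4 bytes/dim): {storage_gib(4):.2f} GiB")      # ~2.86 GiB
print(f"byte   (1 byte/dim):  {storage_gib(1):.2f} GiB")      # ~0.72 GiB
print(f"binary (1 bit/dim):   {storage_gib(1 / 8):.2f} GiB")  # ~0.09 GiB
```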

Quantize vectors outside of SmartObserve

Quantize vectors outside of SmartObserve before ingesting them into a SmartObserve index. A combined sketch of both approaches follows the topics below.

Byte vectors

Quantize vectors into byte vectors

Binary vectors

Quantize vectors into binary vectors
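As a minimal sketch of what such client-side quantization can look like, assuming numpy and two common schemes (min-max scaling to signed bytes and sign-based binarization), the following hypothetical helpers compress float vectors before ingestion. The exact value ranges and bit layout your index expects are described on the linked pages, so treat this purely as an illustration of the techniques:

```python
import numpy as np

def quantize_to_bytes(vectors: np.ndarray) -> np.ndarray:
    """Min-max scalar quantization: each float dimension becomes one signed byte."""
    lo, hi = vectors.min(), vectors.max()
    # Scale every value into [-128, 127], then round to the nearest integer.
    scaled = (vectors - lo) / (hi - lo) * 255.0 - 128.0
    return np.clip(np.round(scaled), -128, 127).astype(np.int8)

def quantize_to_binary(vectors: np.ndarray) -> np.ndarray:
    """Sign-based binarization: each float dimension becomes one bit, packed 8 per byte."""
    return np.packbits((vectors > 0).astype(np.uint8), axis=-1)

vectors = np.random.default_rng(0).normal(size=(4, 16)).astype(np.float32)
print(quantize_to_bytes(vectors).nbytes)   # 64 bytes: 1 byte per dimension
print(quantize_to_binary(vectors).nbytes)  # 8 bytes: 1 bit per dimension
```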

Quantize vectors within SmartObserve

Use SmartObserve's built-in quantization to quantize vectors.

Lucene scalar quantization

Use built-in scalar quantization for the Lucene engine

Faiss 16-bit scalar quantization

Use built-in scalar quantization for the Faiss engine

Faiss product quantization

Use built-in product quantization for the Faiss engine (an illustrative sketch of the technique appears at the end of this page)

Binary quantization

Use built-in binary quantization for the Faiss engine
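The product quantization sketch referenced above, assuming numpy and scikit-learn, shows the technique Faiss product quantization is built on: splitting each vector into subvectors and encoding each subvector against a learned codebook. It is not SmartObserve's API; the built-in version is configured through index settings and requires no client-side code:

```python
import numpy as np
from sklearn.cluster import KMeans

RNG = np.random.default_rng(0)
DIM, SUBSPACES, CENTROIDS = 64, 8, 256   # illustrative parameters
SUB_DIM = DIM // SUBSPACES               # dimensions per subvector

train = RNG.normal(size=(5000, DIM)).astype(np.float32)

# Train one codebook per subspace: each subvector is replaced by the
# index of its nearest centroid, so a vector of 64 float32 values
# (256 bytes) compresses to 8 one-byte codes.
codebooks, codes = [], []
for s in range(SUBSPACES):
    sub = train[:, s * SUB_DIM : (s + 1) * SUB_DIM]
    km = KMeans(n_clusters=CENTROIDS, n_init=1, random_state=0).fit(sub)
    codebooks.append(km.cluster_centers_)
    codes.append(km.labels_.astype(np.uint8))

codes = np.stack(codes, axis=1)  # shape: (num_vectors, SUBSPACES)

# Reconstruct an approximation of vector 0 from its 8-byte code.
approx = np.concatenate(
    [codebooks[s][codes[0, s]] for s in range(SUBSPACES)]
)
print("compressed bytes per vector:", codes.shape[1])              # 8
print("reconstruction error:", np.linalg.norm(train[0] - approx))
```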