pgvector 0.8.0 Released — HNSW Improvements and Sparse Vectors
pgvector 0.8.0 has been released, bringing a significant set of improvements for AI and vector similarity workloads running inside PostgreSQL.
What’s new
Sparse vector type (sparsevec)
Version 0.8.0 introduces a new sparsevec type for storing sparse vectors, that is,
vectors where most dimensions are zero. This is particularly useful for high-dimensional
sparse representations such as SPLADE embeddings or BM25 term weights. Because only
the non-zero elements are stored, sparse vectors consume significantly less storage
than their dense equivalents and can be queried more efficiently.
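To make the storage idea concrete, here is a minimal Python sketch that turns a dense list into the text form pgvector accepts for sparsevec values, `{index:value,...}/dimensions`, with indices starting at 1. The helper name `to_sparsevec_literal` is hypothetical and not part of pgvector; it only illustrates the format.

```python
def to_sparsevec_literal(dense):
    """Format a dense list of numbers as a sparsevec text literal."""
    # Keep only the non-zero entries; sparsevec indices start at 1, not 0.
    pairs = [(i + 1, v) for i, v in enumerate(dense) if v != 0]
    # 9 significant digits is enough to round-trip a 32-bit float.
    body = ",".join(f"{i}:{v:.9g}" for i, v in pairs)
    # The trailing "/n" records the total number of dimensions.
    return "{" + body + "}/" + str(len(dense))


print(to_sparsevec_literal([1, 0, 2, 0, 3]))  # {1:1,3:2,5:3}/5
```

The resulting string can be passed as a query parameter for a sparsevec column, which avoids sending the zero entries at all.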
HNSW build performance
The HNSW index build algorithm has been optimised for better parallelism. Build times on large datasets are reduced by up to 30% compared to 0.7.x, and peak memory usage during index construction is lower.
Iterative index scans
A new iterative scan mode for HNSW indexes improves recall on filtered queries. Instead of scanning a fixed number of candidates and then applying the filter, PostgreSQL can now keep pulling additional candidates from the HNSW index until enough rows pass the filter condition to return the requested number of results. This largely avoids the recall loss that occurs when a selective WHERE clause discards most of the index results.
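The difference between the two strategies can be illustrated with a small Python simulation. This is a conceptual sketch only, not pgvector's implementation: the candidate list stands in for what an index would return nearest-first, and the names `fixed_scan`, `iterative_scan`, `ef_search` and `max_scan` are hypothetical parameters chosen for the example.

```python
def fixed_scan(candidates, passes_filter, k, ef_search):
    """Take the first ef_search candidates, then apply the filter."""
    kept = [c for c in candidates[:ef_search] if passes_filter(c)]
    return kept[:k]


def iterative_scan(candidates, passes_filter, k, max_scan):
    """Keep pulling candidates until k pass the filter or max_scan is hit."""
    kept = []
    for scanned, c in enumerate(candidates, start=1):
        if passes_filter(c):
            kept.append(c)
        if len(kept) == k or scanned == max_scan:
            break
    return kept


def is_every_tenth(candidate):
    """A selective filter: only every 10th row id qualifies."""
    return candidate[0] % 10 == 0


# Hypothetical data: (row id, distance) pairs, already ordered by distance.
candidates = [(row_id, row_id * 0.1) for row_id in range(1, 101)]

print(len(fixed_scan(candidates, is_every_tenth, k=5, ef_search=40)))      # 4
print(len(iterative_scan(candidates, is_every_tenth, k=5, max_scan=1000)))  # 5
```

With the fixed strategy the query asks for five rows but gets four, because only four of the first 40 candidates survive the filter. The iterative strategy continues past that point and returns all five, at the cost of scanning more candidates, which is why a scan limit is still useful.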
Distance function additions
hamming_distance() for binary vectors
jaccard_distance() for binary vectors
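For readers unfamiliar with these two metrics, the following Python sketch computes them on bit strings. The functions are plain-Python stand-ins that share names with the SQL functions for readability; they are not the pgvector implementation, and the handling of two all-zero inputs in the Jaccard case is an assumption made for this sketch.

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def jaccard_distance(a: str, b: str) -> float:
    """1 minus (bits set in both) / (bits set in either)."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    both = sum(x == "1" and y == "1" for x, y in zip(a, b))
    either = sum(x == "1" or y == "1" for x, y in zip(a, b))
    if either == 0:
        # Assumption for this sketch: treat two all-zero inputs as maximally distant.
        return 1.0
    return 1 - both / either


print(hamming_distance("1010", "0110"))            # 2
print(round(jaccard_distance("1010", "0110"), 4))  # 0.6667
```

Hamming distance counts every differing bit, while Jaccard distance looks only at set bits, so it is the better fit when a 1 means "feature present" and shared zeros carry no information.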
Supported PostgreSQL versions
pgvector 0.8.0 supports PostgreSQL 13 through 17.
Installation and upgrade instructions are available on the pgvector GitHub repository.