https://simonwillison.net/atom/everything, the day before yesterday at 04:40
Challenges and Practices of Running the pgvector Extension at Scale

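This article looks closely at the challenges of running pgvector, the PostgreSQL vector indexing extension, at scale in real applications. It focuses on the difficulty of maintaining a large index under close-to-realtime updates with the IVFFlat or HNSW index types. The discussion of pre-filtering versus post-filtering is especially important, because that choice strongly affects both query performance and result accuracy. The article also cites the Discourse team's experience running pgvector in large-scale production, including how quantization (16-bit float storage and binary vector indexes) keeps storage cost and performance manageable, and how pgvector powers Related Topics, tag suggestions, augmented search, and RAG for uploaded files.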

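💡 **Challenges of maintaining large indexes**: Keeping a large pgvector index current under close-to-realtime updates is hard with either the IVFFlat or HNSW index type, especially at high data volumes, where index efficiency and consistency are critical.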

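🚀 **Pre-filtering vs. post-filtering**: For vector searches over rows that carry metadata, choosing whether to filter on metadata first (pre-filter) or to run the vector search first and then filter (post-filter) has a decisive effect on query response time and result accuracy. The article describes the gap as queries that take 50ms versus queries that take 5 seconds.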

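📊 **Quantization for storage and performance**: The Discourse team uses quantization extensively, with 16-bit floats (halfvec) for storage and binary vectors (bit) for indexes. This keeps storage cost and ongoing performance good enough for them to enable pgvector across all of their hosting.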

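📚 **pgvector in production at Discourse**: pgvector plays a key role in Discourse's production environment and is used in most of the page views they serve. It powers Related Topics recommendations, tag and category suggestions for new topics, augmented search, and retrieval-augmented generation (RAG) for uploaded files.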

The case against pgvector: I wasn't keen on the title of this piece, but the content is great. Alex Jacobs talks through lessons learned trying to run the popular pgvector PostgreSQL vector indexing extension at scale, in particular the challenges involved in maintaining a large index with close-to-realtime updates using the IVFFlat or HNSW index types.

The section on pre- versus post-filtering is particularly useful:

Okay but let's say you solve your index and insert problems. Now you have a document search system with millions of vectors. Documents have metadata---maybe they're marked as draft, published, or archived. A user searches for something, and you only want to return published documents.

[...] should Postgres filter on status first (pre-filter) or do the vector search first and then filter (post-filter)?

This seems like an implementation detail. It’s not. It’s the difference between queries that take 50ms and queries that take 5 seconds. It’s also the difference between returning the most relevant results and… not.
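
To make the accuracy half of that trade-off concrete, here is a minimal Python sketch. It is not pgvector: a brute-force scan over a toy in-memory list stands in for the index, and the `DOCS` data and both function names are hypothetical. It only illustrates how post-filtering can come back short when the nearest neighbours fail the metadata filter.

```python
import math

# Hypothetical toy corpus: (id, status, embedding). The values are made up.
DOCS = [
    (1, "draft",     (1.00, 0.00)),
    (2, "draft",     (0.99, 0.10)),
    (3, "archived",  (0.98, 0.15)),
    (4, "published", (0.90, 0.40)),
    (5, "published", (0.10, 0.99)),
    (6, "published", (0.00, 1.00)),
]

def pre_filter_search(query, status, k):
    """Filter on metadata first, then rank only the matching rows."""
    candidates = [d for d in DOCS if d[1] == status]
    candidates.sort(key=lambda d: math.dist(query, d[2]))
    return [d[0] for d in candidates[:k]]

def post_filter_search(query, status, k):
    """Take the k nearest rows first, then drop those that fail the filter."""
    nearest = sorted(DOCS, key=lambda d: math.dist(query, d[2]))[:k]
    return [d[0] for d in nearest if d[1] == status]

query = (1.0, 0.0)
pre = pre_filter_search(query, "published", k=3)
post = post_filter_search(query, "published", k=3)
print(pre)   # [4, 5, 6]
print(post)  # [] because the 3 nearest rows are all draft or archived
```

The sketch leaves out the cost side: with a real index, pre-filtering may mean ranking a very large filtered set without help from the vector index, which is where the latency difference in the quote would come from.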

The Hacker News thread for this article attracted a robust discussion, including some fascinating comments by Discourse developer Rafael dos Santos Silva (xfalcox) about how they are using pgvector at scale:

We [run pgvector in production] at Discourse, in thousands of databases, and it's leveraged in most of the billions of page views we serve. [...]

Also worth mentioning that we use quantization extensively:

    - halfvec (16bit float) for storage
    - bit (binary vectors) for indexes

Which makes the storage cost and on-going performance good enough that we could enable this in all our hosting. [...]
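
A rough sketch of why this helps, in pure Python rather than pgvector itself. The 1536-dimension figure is an assumed example size, not a number from the comment. The sign-based thresholding in `to_bits` is likewise an assumption about how a binary vector might be derived, and the byte counts ignore any per-row overhead.

```python
import struct

def storage_bytes(dims):
    """Raw payload bytes per vector, ignoring any per-row overhead."""
    return {
        "float32": dims * 4,        # 32-bit float per dimension
        "float16": dims * 2,        # 16-bit float per dimension
        "binary": (dims + 7) // 8,  # 1 bit per dimension, rounded up to bytes
    }

def to_half(vec):
    """Round-trip each value through IEEE 754 half precision."""
    packed = struct.pack(f"<{len(vec)}e", *vec)
    return list(struct.unpack(f"<{len(vec)}e", packed))

def to_bits(vec):
    """Binary-quantize: one bit per dimension, set when the value is positive."""
    bits = 0
    for x in vec:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two quantized vectors."""
    return bin(a ^ b).count("1")

sizes = storage_bytes(1536)  # 1536 dimensions is an assumed example size
print(sizes)  # {'float32': 6144, 'float16': 3072, 'binary': 192}

v = [0.5, -0.25, 0.125, -1.0]
w = [0.4, 0.3, -0.2, -0.9]
print(to_half(v))                       # these values survive half precision exactly
print(hamming(to_bits(v), to_bits(w)))  # 2
```

Half precision halves the stored payload, while the binary form is 32 times smaller than float32 and can be compared with cheap bit operations, at the price of a much coarser distance.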

In Discourse embeddings power:

    - Related Topics, a list of topics to read next, which uses embeddings of the current topic as the key to search for similar ones
    - Suggesting tags and categories when composing a new topic
    - Augmented search
    - RAG for uploaded files
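
The Related Topics idea, using the current topic's embedding as the search key, can be sketched in a few lines of Python. This is an illustration only, not Discourse's implementation: the `TOPIC_EMBEDDINGS` table, its values, and the choice of cosine similarity are all assumptions.

```python
import math

# Hypothetical topic embeddings keyed by topic id. The values are made up.
TOPIC_EMBEDDINGS = {
    101: (0.9, 0.1, 0.0),
    102: (0.8, 0.2, 0.1),
    103: (0.0, 0.1, 0.9),
    104: (0.7, 0.0, 0.3),
}

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def related_topics(topic_id, limit):
    """Rank every other topic by similarity to the current topic's embedding."""
    key = TOPIC_EMBEDDINGS[topic_id]
    scored = [
        (other_id, cosine_similarity(key, embedding))
        for other_id, embedding in TOPIC_EMBEDDINGS.items()
        if other_id != topic_id  # never recommend the topic being read
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [other_id for other_id, _ in scored[:limit]]

related = related_topics(101, limit=2)
print(related)  # [102, 104]
```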
