From Static to Dynamic: A Streaming RAG Approach to Real-time Knowledge Base

cs.AI updates on arXiv.org, August 11

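The paper proposes Streaming RAG, a unified framework for retrieval over dynamic data. It maintains an optimized prototype set through multi-vector cosine screening, mini-batch clustering, and counter-based filtering, which improves retrieval quality and yields significant performance gains on real-time streams.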

arXiv:2508.05662v1 Announce Type: cross Abstract: Dynamic streams from news feeds, social media, sensor networks, and financial markets challenge static RAG frameworks. Full-scale indices incur high memory costs; periodic rebuilds introduce latency that undermines data freshness; naive sampling sacrifices semantic coverage. We present Streaming RAG, a unified pipeline that combines multi-vector cosine screening, mini-batch clustering, and a counter-based heavy-hitter filter to maintain a compact prototype set. We further prove an approximation bound $E[R(K_t)] \ge R^* - L \Delta$ linking retrieval quality to clustering variance. An incremental index upsert mechanism refreshes prototypes without interrupting queries. Experiments on eight real-time streams show statistically significant gains in Recall@10 (up to 3 points, p
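The abstract names the pipeline stages without giving their internals, so the following is only a minimal sketch of two of them under stated assumptions, not the authors' implementation. It assumes that "multi-vector cosine screening" means keeping an incoming embedding when it is close enough to any of several anchor vectors, and it substitutes the classic Misra-Gries summary for the unspecified "counter-based heavy-hitter filter". The names `passes_screen`, `anchors`, `threshold`, and `MisraGries` are hypothetical.

```python
import math
from typing import Dict, Hashable, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def passes_screen(
    vec: Sequence[float],
    anchors: Sequence[Sequence[float]],
    threshold: float = 0.5,
) -> bool:
    """Assumed screening rule: keep vec if it is close to ANY anchor."""
    return any(cosine(vec, a) >= threshold for a in anchors)


class MisraGries:
    """Misra-Gries heavy-hitter summary holding at most k - 1 counters.

    Stand-in for the paper's counter-based filter, whose exact form
    is not given in the abstract.
    """

    def __init__(self, k: int) -> None:
        if k < 2:
            raise ValueError("k must be at least 2")
        self.k = k
        self.counts: Dict[Hashable, int] = {}

    def add(self, item: Hashable) -> None:
        if item in self.counts:
            self.counts[item] += 1
        elif len(self.counts) < self.k - 1:
            self.counts[item] = 1
        else:
            # No free counter: decrement all, dropping any that hit zero.
            for key in list(self.counts):
                self.counts[key] -= 1
                if self.counts[key] == 0:
                    del self.counts[key]

    def candidates(self) -> List[Hashable]:
        """Tracked items, most frequent (by retained count) first."""
        return sorted(self.counts, key=lambda key: self.counts[key], reverse=True)
```

In a streaming setting, one plausible use is to feed the cluster ID of each screened embedding into `MisraGries.add` and keep prototypes only for the IDs returned by `candidates()`, so that memory stays bounded by `k - 1` counters regardless of stream length.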

Related tags

Dynamic data retrieval, RAG framework, performance improvement
